As of April 7th 2015, there are no IBM PC emulators in the world that can run the demo properly. Unless you have the exact hardware required (see below), this demo won’t run properly; in fact, it hangs or crashes emulators before it is finished. To see what 8088 MPH looks like, I direct you to the video+audio capture of the demo running on real hardware.
Impressive.
If I remember correctly, Thom already posted a link to an article like this one:
http://hackaday.com/2014/06/21/better-full-motion-video-on-the-firs…
Nope, that’s by the same team, but this is real-time, as opposed to “video” playback in the link you provided.
Personally, I wasn’t that impressed. It’s tweaking composite output a little further than before, and makes the PC look like a C64 or the like (but without decent sound, I might add), but that’s it.
If that was all there was to this demo, it never would have won against the strong competition from C64, Amiga and Gameboy Color demos.
It was voted first by demosceners, people who have a good idea of what the hardware can and cannot do.
The 8088 at 4.77 MHz is barely faster than the 1 MHz 6510 in a C64, and the CGA chip is little more than a primitive framebuffer, where the VIC-II has 8 hardware sprites, scanline counter, raster interrupts, multiple vram banks, custom character sets etc, which make a lot of effects a lot less CPU-intensive than on the PC.
The DeLorean sprite against a scrolling background is a good example of something that is easy to do on C64, but has been deemed impossible on PC until now.
I have put the IRC log of the #revision channel at the time our demo was played on my blog: https://scalibq.wordpress.com/2015/04/06/just-keeping-it-real-at-rev…
I think the responses speak for themselves…
Edited 2015-04-09 10:14 UTC
I was there (at Revision), I’ve coded for the 8088 (and CGA) back in the day, and though it is a nice feat overall, I’m still not impressed. That other people are doesn’t need to change my opinion, now does it?
EDIT: Ah, right, you are *the* Scali I assume. In that case, commenting on me not being impressed is even less appropriate.
Edited 2015-04-09 10:19 UTC
If you really knew about 8088+CGA, you’d understand just how tight the code has to be in some places.
If you aren’t impressed by e.g. the mod player doing 4ch at 16 kHz on a 4.77 MHz 8088, on a PC speaker, well, you simply don’t understand enough about how much processing is required for that, and how little processing power the 8088 really has.
Hiding behind ‘my opinion’ is nonsense, because there are clear and objective measurements here about how many samples to process per second, and how much bandwidth and processing cycles the 8088 has available. If you work out the math, you have to draw the conclusion that the CPU is stressed to the max.
Most mod players would barely reach 6 kHz on PC speaker, on an 8+ MHz 286, which is a LOT faster.
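To put rough numbers on the "work out the math" argument, here is a minimal sketch of the cycle budget. It assumes the CPU clock is 14.31818 MHz / 3 (about 4.77 MHz) and takes the channel count and sample rate from the figures quoted above; the function name is made up for illustration. It says nothing about how the actual player spends those cycles.

```python
# Rough cycle budget for software mixing on a 4.77 MHz 8088.
# Assumption: the CPU clock is 14.31818 MHz / 3. The sample rate and the
# channel count are the figures quoted in the thread.
CPU_HZ = 14_318_180 / 3  # about 4.77 MHz


def cycles_per_sample(sample_rate_hz: float, channels: int = 1) -> float:
    """CPU cycles available per output sample, split evenly across channels."""
    return CPU_HZ / sample_rate_hz / channels


per_output = cycles_per_sample(16_000)
per_channel = cycles_per_sample(16_000, channels=4)
print(f"{per_output:.0f} cycles per output sample")
print(f"{per_channel:.0f} cycles per channel per sample")
```

That works out to roughly 298 cycles per output sample, or about 75 per channel, before any cycles are spent on instruction fetches, memory refresh or screen updates.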
You’re just showing your ignorance and your arrogance (and no, I don’t buy that you actually know how to code 8088, nor that you actually were at Revision… but prove me wrong.. if you are so hot, I expect you to come up with a better 8088 demo next year and beat everyone else).
Edited 2015-04-09 10:38 UTC
Look, you obviously are very emotionally attached to what you created, and you have every right to be. And again, the work that went into it is very impressive. But that doesn’t mean that anyone knowing exactly what you did should by definition be impressed by the end result, or they are arrogant – it is in fact you who are so arrogant that you seem to *demand* that everyone is ecstatic.
Edited 2015-04-09 11:13 UTC
Whatever, I didn’t even make the mod player, that’s reenigne’s code… I just think it’s the most impressive mod player ever written for 8088. Anyone who doesn’t, simply doesn’t realize what it is and does exactly.
And yes, it certainly *is* arrogance on your behalf.
Even if you sincerely weren’t impressed (which I can’t possibly imagine), there would be no need to express that in public.
So a) you can’t imagine that someone has an opinion that differs from what you think it should be and b) if that unimaginable thing happens anyway, you think the ones involved should shut up. And you call *me* arrogant. Wow… just wow… I’m glad I didn’t bump into you at Revision.
Edited 2015-04-09 11:25 UTC
Uhh, no, see… I’m not arrogant, so out of respect, I rarely comment on other people’s demos just to say how much I didn’t like them. I generally only give positive comments, or constructive criticism based on my own experience.
If I were arrogant, I’d tell everyone I’m not impressed with their winning demos, while not having anything to show for it myself.
Edited 2015-04-09 11:25 UTC
You obviously don’t know what “arrogant” means. You think it means “saying something’s bad without being able to do it yourself” instead of its actual meaning, “exaggerating or disposed to exaggerate one’s own worth or importance often by an overbearing manner” / “showing an offensive attitude of superiority” (from merriam-webster.com). I could not have pulled off what you guys did, for a number of reasons. But that doesn’t leave me unable to have opinions about the end result of what you did.
This seems like an excellent description of your behaviour… You shouting out that you are not impressed implies that you could do better, yet:
Hence exaggeration and superiority.
I said “Personally, I wasn’t that impressed. It’s tweaking composite output a little further than before, and makes the PC look like a C64 or the like (but without decent sound, I might add), but that’s it.”
That’s not “shouting out”. Being not impressed doesn’t “imply you could do better”. But yeah, I’ve spent enough time on this already, whatever dude. Congrats on the 1st place, I guess.
And this is what I responded to. This is your arrogance and ignorance showing, dismissing most of the demo… the 1k colours were only a small part of it (especially your ‘without decent sound’ remark… erm, hello!? there’s a 16 kHz 4ch mod player in there!).
There is a LOT more to this demo than just the 1k trick, as I already said. I was just trying to prevent you from making a fool out of yourself. But that obviously failed.
Edited 2015-04-09 12:09 UTC
Wow, that escalated quickly
Hi there; I loved your demo, but you should be more realistic and not put yourselves in the same league as the C64 demoscene.
1.- An 8088 at ~5 MHz is much faster than a 6510 at 1 MHz. I have optimized code in 6510, Z80 and 8086 in the past. I have checked 8088 timings, and we are talking about two to three times the speed of the 6510.
Any significant operation on the 6510 takes 4 cycles, usually 5. Yes, there are a few faster ones, but “inx” or zero page operations don’t do much.
An equivalent 8088 op. will easily take 12 cycles, but you still have 5 times the clock speed!
2.- In terms of algorithms (i.e. the rotozoom) the C64 guys are just way ahead of you. That’s life. Take it as a challenge, which is the way it is supposed to be!
I don’t want to be confrontational, I am just posting this because of my strong respect for the C64 scene, which never ceases to astound me.
I have coded both 6510 and 8088, and the instruction timings on 8088 are a common misconception.
These are best-case instruction timings for the 8086, with a 16-bit bus and a 6-byte prefetch buffer. They assume that instructions are executed from the prefetch buffer. 6502/6510 timings on the other hand are actual instruction timings, as there is no prefetch.
The 8088 has only an 8-bit bus, and only a 4-byte prefetch buffer.
As a result, you are rarely executing from prefetch. Each byte costs 4 cycles to read, and most instructions on 8088 are 2 bytes or more. So this adds up quickly in practice, and you really don’t get much more performance out of the 8088 at 4.77 MHz than a 6510 at 1 MHz.
During development we used routines to measure the actual cycles that pieces of code took on a real 8088. The Intel instruction timings are VERY inflated.
The C64 has a completely different memory layout for their graphics modes, which is far more efficient in most cases. You can rewrite your character set and colorram. This means you only have to update 2k of memory for a full screen. We only have raw bitmapped modes, which take 16k for a full screen.
There is no way we can compete directly with C64, we are at a serious disadvantage in virtually every way. But we can beat them in demo compos.
I think you totally got the wrong message. I grew up with C64 myself, and am still amazed by C64 demos, and I do still code C64 myself from time to time. In fact, my experience with C64 and Amiga influenced the design of various routines in this demo, trying to adapt tricks from those platforms to the PC.
The PC SUXX! is not really a joke, as far as I’m concerned. I really do feel that the platform is designed horribly in many ways. However, it also poses a challenge, which is how this demo came about.
Edited 2015-04-09 21:05 UTC
Ok, maybe we should compare some code. I still think it should be at least 2x faster than a 6510.
Let’s say we have a common unrolled table effect:
lda $addr,x
ora $addr,x
ora $addr,x
ora $addr,x
sta $addr
That would take 4*4 to 4*5 (best case, worst case) cycles plus 4 cycles (sta): 20 to 24 cycles, or at most about 24 microseconds at 1 MHz.
What would an equivalent 8088 snippet of code look like, and what would its estimated cost be? (My 8086 and 8085 are really rusty, so I will accept your honest answer.)
About memory updating, top C64 effects usually need much more than 1 or 2 KB of memory updates, but it’s correct that it’s not as much as 16 KB. However, the machine has its own penalties. For example, the 4×4 16-color modes require an interrupt every 4 lines, which takes away lots of CPU time and forces “bad lines”, which slow down the CPU even more.
If I am a bit annoyed about those minor points, it’s because many programmers from other scenes (ZX, Amstrad) say their CPUs have the same speed as a 6510. That’s scoring “cool” points for free, because they really have double to triple the speed and their effects are simply not as optimized as the C64 ones.
You’d do something like:
lodsb
or al, [addr]
or al, [addr]
or al, [addr]
stosb
Now, lodsb and stosb are just 1-byte instructions.
or al will be a big instruction of 3 bytes.
So that means we already spend 11*4 cycles on just reading the instructions.
Then the lodsb takes another 12 cycles or so, the stosb takes 11 cycles, and the ors take 9 cycles + EA… which would be 6 cycles for a direct offset.
So in total you’d be looking at something like (11*4) + 12 + 11 + 3*(9+6) = 44 + 12 + 11 + 45 = 112 cycles.
And poof, gone is your 5*clockspeed.
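The tally above can be reproduced with a small sketch. It only replays the estimates quoted in this exchange, under the same assumptions: 4 cycles per instruction byte fetched on the 8088 with no help from the prefetch queue, a 4.77 MHz 8088, and a 1 MHz 6510 taking 20 to 24 cycles for its version. These are not measurements from real hardware, and the helper names are made up for illustration.

```python
# Tally of the cycle estimates quoted in the thread (not measured values).
# Assumptions: 4 cycles per instruction byte fetched on the 8088, a
# 4.77 MHz 8088 clock and a 1 MHz 6510 clock.
FETCH_CYCLES_PER_BYTE = 4

# (mnemonic, size in bytes, execution cycles as quoted in the thread)
SNIPPET_8088 = [
    ("lodsb", 1, 12),
    ("or al, [addr]", 3, 9 + 6),  # 9 cycles + 6 for the direct-offset EA
    ("or al, [addr]", 3, 9 + 6),
    ("or al, [addr]", 3, 9 + 6),
    ("stosb", 1, 11),
]


def total_cycles_8088(snippet):
    """Fetch cost for every instruction byte plus the quoted execution cost."""
    fetch = sum(size for _, size, _ in snippet) * FETCH_CYCLES_PER_BYTE
    execute = sum(cycles for _, _, cycles in snippet)
    return fetch + execute


def microseconds(cycles, clock_mhz):
    """Convert a cycle count to microseconds at the given clock in MHz."""
    return cycles / clock_mhz


cycles_8088 = total_cycles_8088(SNIPPET_8088)
time_8088 = microseconds(cycles_8088, 4.77)
time_6510_best = microseconds(20, 1.0)
time_6510_worst = microseconds(24, 1.0)

print(f"8088: {cycles_8088} cycles, {time_8088:.1f} us")
print(f"6510: {time_6510_best:.0f} to {time_6510_worst:.0f} us")
```

Under these assumptions the 8088 version takes about 23.5 microseconds, which lands inside the 20 to 24 microsecond range estimated for the 6510 version.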
Yes, but the 6502 is stupid fast at interrupts. An 8088 has a whole chunk of 16-bit registers to push on the stack, which is very costly.
Also, CGA memory has waitstates, so you get a penalty for EVERY byte you want to write. Which is worse than bad lines.
And we don’t have ‘free’ drawing with sprites either.
Jesus in a pogo stick, are those numbers real? I didn’t get the feeling of the PC XT being as slow as a C64 when I was young…
Ok, cool points for your team then. But I still think the C64 sceners have better skillz.
Are you sure?
Even getting a directory listing is slow on these machines. Perhaps you had one of those Turbo XT clones at ~10 MHz? But even then they were barely faster than C64 in most games. Not to mention the horrible CGA colours and beeper sounds.
VGA also makes most things a lot faster than CGA, because it doesn’t have the horrible waitstates, and the chunky byte-oriented memory format makes pixel access a lot faster.
I don’t think skill is related to platform, but whatever.
Oh that’s “it”?
Given the huge difference in audio & video hardware capability between an IBM 5150 and a C64, dismissing the effort as “but that’s it” is a rather short-sighted thing to say.
You are right “that wasn’t it”, as I perhaps too easily dismissed the MOD player. But it didn’t play *during* the demo, which focussed on the graphics. And again, though it’s a nice feat they pulled off, I wasn’t that impressed.
As far as we are concerned, it did.
The Revision 2015 rules stated a maximum running time of 8 minutes. We specifically designed the demo so that the endpart with tune could be heard during the viewing.
It was our last effect, and as such, very much part of the demo.
Well, yeah, it was part of it *technically*, but the end sequence (understandably) suffered rather in complexity, so the demo was mostly nice oldskool gfx with an extremely annoying (imho) PC beep fest, and a final part with decent music but lousy gfx.
We like to think of it as the focus shifting from video to audio.
Thing is, you don’t have enough CPU power to do both. The mod player is designed to take 99% CPU, and leaves just enough room for one small screen update (either adding a character or scrolling up one line) each iteration (while taking the exact same number of CPU cycles for every possible path to avoid any jitter in the audio).
So the modplayer is one of the most complex effects in the demo. It’s just an effect you need to listen to, rather than watch (much like the Vicious SID demos on C64 for example… we actually had the idea of calling this end part Vicious PIT).
Edited 2015-04-09 12:53 UTC
Even if we ignore the MOD player, the sprite emulation (on a CGA adaptor!) was incredibly technically impressive.
Though I can’t claim to be able to reproduce it, I found it rather dull. Moving pseudo-sprites has always been a part of CGA games.
Yes, small sprites, at low framerates, with flicker.
This ran at 60 fps, no flicker, and smooth-scrolling background.
But yes, it’s all the same.
In fact, a lot of stuff ran at 60 fps in our demo. Considering that you only have about 170 kb/s write speed to CGA memory, meaning you can’t even clear the screen at more than about 10 fps, getting anything at 60 fps on this machine is a chore.
And no, there is no memory for a backbuffer, so avoiding flicker means racing the beam.
(This should also put the vectorbobs and polygons into perspective).
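A back-of-the-envelope sketch of that bandwidth argument follows. It assumes "170 kb/s" means 170 * 1024 bytes per second and that a full screen is 16 * 1024 bytes, as stated earlier in the thread; the function names are made up for illustration.

```python
# Back-of-the-envelope CGA write budget, using the figures from the thread.
# Assumptions: "170 kb/s" is read as 170 * 1024 bytes per second and a
# full bitmapped screen is 16 * 1024 bytes.
CGA_WRITE_BYTES_PER_SEC = 170 * 1024
FULL_SCREEN_BYTES = 16 * 1024


def max_full_screen_fps(bytes_per_sec, screen_bytes):
    """Upper bound on full-screen rewrites per second."""
    return bytes_per_sec / screen_bytes


def bytes_per_frame(bytes_per_sec, fps):
    """How many bytes can be written to video memory within one frame."""
    return bytes_per_sec / fps


clear_fps = max_full_screen_fps(CGA_WRITE_BYTES_PER_SEC, FULL_SCREEN_BYTES)
budget_60 = bytes_per_frame(CGA_WRITE_BYTES_PER_SEC, 60)
fraction_60 = budget_60 / FULL_SCREEN_BYTES

print(f"full-screen rewrites: about {clear_fps:.1f} per second")
print(f"budget at 60 fps: about {budget_60:.0f} bytes per frame")
print(f"that is about {fraction_60:.0%} of the screen")
```

With these numbers, an effect running at 60 fps can touch only about 2.9 KB per frame, well under a fifth of the screen, which is why every effect has to be selective about which bytes it rewrites.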
As I already said, your ignorance is showing.
Edited 2015-04-09 13:51 UTC
Let it go, Scali. Can’t you see how silly this discussion has become?
If 8088/CGA were top-of-the-line hardware and you demoed 1024 colours on it, the general population would be impressed, but not today, and you know why.
Doing what you did will impress only a handful of people who are into that. You got the proof. Your demo won; what else do you need?
PS. To be honest, I enjoyed the materials about the demo more than the demo itself. Keep up the good work!
I just can’t stand Dunning-Kruger.
There’s a huge difference between not understanding/not being impressed and actually trying to win a technical discussion with the makers of a demo, while having no idea about the hardware or software involved.
I mean, I just can’t get over the arrogance of this guy… This demo won the most prestigious demo compo of the year, gets rave reviews all over the internet, from tons of people who have at least some basic knowledge about the technology…
Yet this guy thinks all these people must be dumb, and he knows better, and he is ‘not impressed’.
Edited 2015-04-09 14:24 UTC
Yeah, you keep saying that. And also how impressive you are. We’ve heard that. The fact is though, that “racing the beam” is *not* impressive, as it has been done countless times on other platforms. It’s nice you pulled it off on a plain old CGA, but sorry, I’m still not impressed.
No, you were claiming it was not impressive at all. I’m not claiming our stuff is super-impressive (well I said I thought the mod-player is super-impressive, but that is not my code), I’m just giving some background info as to why the things we’re doing are not exactly super-trivial.
And your ignorance is showing yet again.
Do you know why it has never been done before? Because the IBM platform is way different from all others. It has no ‘helper’ hardware like scanline counters or raster interrupts, let alone an ANTIC/copper to help synchronize code with the beam.
Add to that the unpredictability of the platform because the memory is refreshed on a timer that is not synchronized to the display at all… and the prefetch buffer in the CPU, which makes it even more difficult to predict the execution time of each instruction… and perhaps, just perhaps, you’d get an inkling of why racing the beam has never been done before on this platform, and why people are rather excited about it happening for the first time.
Edited 2015-04-09 14:42 UTC
You are really fond of dissing shit beyond your own skills, aren’t you?
Vanders gave thumbs up for it, and so do I. That means a lot in here. (Well, not me, but Vanders that is :p )
I admit, being really weak at math, rotating toruses are beyond my skills. And I guess that saying I don’t think something is special is “dissing”. So be it. But unless someone tells me what is special about a rotating torus, I have no reason to think they are special (I guess it’s that it’s (probably) running at 60fps. Trixter alludes to the update speed difficulties in his description).
Here is an in-depth explanation on how the mod player works: http://www.reenigne.org/blog/8088-pc-speaker-mod-player-how-its-don…
I’m really interested in what you mean by “real time” vs a movie.
I’m certain that you aren’t computing the 3D shape and movement of the ring, that has to be pre-rendered, right?
I’m sure there are some complex calculations to figure out what to display when, but there must be some outside computational power that was thrown at this demo to make life easier on the poor 8088, right?
Edit: FWIW, Color me impressed. There has to be a lot of crazy ass hacks in there.
Edited 2015-04-09 17:24 UTC
Well, we don’t want to give away ALL our secrets… But just figure that the entire demo, including all graphics and music, fits on a single 360k disk, and this capture is actually running from a floppy disk in realtime (loading in the background between effects).
Well, given that a game like Elite runs on a C64 (and of course ran on an 8088 PC with CGA), a torus shouldn’t pose too much of a problem (although no doubt Scali will tell us it was crazily difficult and we need to be very impressed).
Comparing apples to oranges by shape only?
Quaternions, full face, z-buffer, light source, dithering, only to name a few
But yeah, Elite with its wireframe rendering does compare, it was even ported to the Z80.
Even Virus (Braben’s game) was…
Well, badly rendered apples with pseudo-light-sourced ones, I guess.
Being mathematically challenged, I don’t know what quaternions have to do with toruses, but back in the days when I dabbled with 3D (which, admittedly, wasn’t made public beyond the simple rotating squares of the Babytro), I’m pretty sure we didn’t make use of them. As for the other features, yeah, they are nice.
On the 8088 PC, Elite was rendered with filled polygons. I don’t know whether it was playable with it, as my XT had a 10MHz turbo mode and a CGA clone, which don’t directly compare to the original PC and CGA.
Well no, a game like Elite ran fine on the original PC (and also, though without filled polygons, on the C64). Although it of course didn’t run at 60fps, because of the slowness of the CGA memory.
I guess the 3D objects *could*’ve been prerendered and stored as a series of update instructions, but I think that would cost way too much space.
EDIT: Mmmm, don’t know why I replied twice. Sorry for that.
Edited 2015-04-09 20:40 UTC
By pre determining the rotations, you can include look up tables for the otherwise time consuming math. Not quite video playback, not quite free rendered. Or maybe that’s just plain obvious. I’ll admit I’m out of my area of knowledge here.
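As an illustration of the lookup-table idea in general (not of how this demo does it, which the authors have not disclosed here), a minimal sketch of a precomputed fixed-point sine table might look like the following. The table size, the 8.8 fixed-point scale and all names are hypothetical choices.

```python
import math

# Hypothetical choices: one full turn is split into 256 steps, and values
# are stored in 8.8 fixed point, so 256 represents 1.0.
TABLE_SIZE = 256
SCALE = 256

# Computed once up front; the per-point work below is lookups and
# integer arithmetic only.
SIN_TABLE = [
    round(math.sin(2 * math.pi * i / TABLE_SIZE) * SCALE)
    for i in range(TABLE_SIZE)
]


def sin_fixed(angle):
    """Sine of an angle given in table steps, in 8.8 fixed point."""
    return SIN_TABLE[angle & (TABLE_SIZE - 1)]


def cos_fixed(angle):
    """Cosine via the same table, offset by a quarter turn."""
    return SIN_TABLE[(angle + TABLE_SIZE // 4) & (TABLE_SIZE - 1)]


def rotate(x, y, angle):
    """Rotate an integer point around the origin using table lookups.

    The final floor division by SCALE rounds towards negative infinity.
    """
    s = sin_fixed(angle)
    c = cos_fixed(angle)
    return ((x * c - y * s) // SCALE, (x * s + y * c) // SCALE)


print(rotate(100, 0, 64))  # a quarter turn
```

The trade-off is memory for time: the table costs a few hundred bytes, and in exchange the inner loop avoids evaluating any trigonometric function at run time.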
I did basic programs on the 8088/80286 and c64, but I never tried pushing the boundaries like this. My 3D rendering experience was done for fun from scratch with only a knowledge of the higher level math involved. Horribly inefficient I’m sure.
Ditto. The only optimizations I ever did were some vector multiplication tricks, now long forgotten.
Here is an article that describes the technical details of getting the 1k colour mode set up on CGA:
http://www.reenigne.org/blog/1k-colours-on-cga-how-its-done/
No, it’s not just a simple artifacting trick, there is a lot more to it.
That was so much greater than Karateka! And Karateka was like totally awesome! Too bad C64 trolls pester you so much in the comments. Enjoy your achievement. You guys are the best!
I didn’t see any trolls, just a single C64 coder who wanted to be informed about 8088 timing, which Scali answered to his satisfaction (and astonishment).
Yeah, these timings are slow as molasses.
Just for the record, I don’t consider myself a C64 coder. I coded a bit of assembly in the 80s and then tried my luck with some effects 10 years ago (didn’t release anything, though).
As a scener I was active releasing prods for Amiga in the 90s. The C64 scene is just something I enjoy following, taking a peek at the code from time to time.
I would have voted for “Planet Rocklobster” as no. 1, but I respect your prod. and result.
Yes, the MHz myth existed even back then.
The short answer is that the DRAM modules used in a PC are basically the same speed as in a C64 and various other 8-bit machines of that era.
The 8088 is really bottlenecked by this, since it is actually a 16-bit CPU on an 8-bit bus, and is starved just trying to fetch instructions.
The 6502 is a true 8-bit CPU and a very efficient design.
The 8086 is the same CPU as the 8088 basically, except on a 16-bit bus, as it was intended. It is much faster in practice. The 8088 was mainly interesting because it could be used with a much cheaper 8-bit chipset and motherboard (similar to the later 386SX CPU, which could run on a 16-bit board with a 286 chipset).
Which is why IBM decided on that.