IBM has won a bid to build a supercomputer called Roadrunner that will include not just conventional Opteron chips but also the Cell processor used in the Sony PlayStation. The supercomputer, for the Los Alamos National Laboratory, will be the world’s fastest machine and is designed to sustain a performance level of a ‘petaflop’, or 1 quadrillion calculations per second, said US Senator Pete Domenici earlier this year. I’d like to play Solitaire on that.
And then there’s the Stanford professor who claimed the big chip makers were pushing slow chips for their own benefit. He emphasized ‘streaming processors’.
http://www.primidi.com/2003/12/01.html
http://csl.stanford.edu/~billd/
Right. And IBM is using these “slow chips” to handle the logic computations in this supercomputer, because a “fast chip”, like the Cell, isn’t any good at it.
Seriously, stream processors are for very niche applications. They might be ideal for accelerating the inner loops of some calculation, but for the stuff most of the market uses, massively out-of-order (OOO) general-purpose cores rule.
Indeed, comparing the chips made by the “big chip makers” to stream processors is really pointless. It’s comparing a Mack truck to a Lotus Elise. They’re designed completely differently, because they’re designed for completely different jobs. That’s why you have something like this supercomputer — a hybrid that leverages the strengths of each chip.
I don’t think many people even think about how complex and inefficient today’s CPUs have become. Intel, IBM and AMD know best (or we suppose they do).
It’s funny, though, that a Senator would be announcing this. Since when did any Senator actually have any real expertise in anything?
Well, Bill Dally and others (myself included) see things very differently from the industry mainstream.
It really is the memory, stupid: the Memory Wall.
Since 1986, DRAMs have gone from 120ns RAS cycles to today’s 60ns, although with the MMU and full TLB misses that can all too often go back out to 300ns. Meanwhile DRAM densities have gone from 64Kb to 1Gb, and the I/O pins now toggle an order of magnitude faster: 1986 delivered maybe 100Mb/s per pin, whereas DDR2 rates are now up to 1Gb/s, so every access delivers one whole big chunk of data to fill ever larger cache lines. Still, if you only want 1 word at truly random locations, no amount of superscalar and out-of-order execution can cover this. When locality is high, it sort of works.
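To put rough numbers on that, here is a back-of-envelope sketch in Python. The 60ns and 300ns latencies and the 1Gb/s pin rate are the figures quoted above; the 64-pin data bus and the 8-byte word are assumptions made only for illustration, and the model ignores banking, burst lengths and caches entirely.

```python
# Back-of-envelope comparison of streaming vs. random-access DRAM throughput.
# Latency and per-pin figures come from the post above; the bus width and
# word size are ASSUMPTIONS chosen only to make the arithmetic concrete.

PIN_RATE_BITS_PER_S = 1e9   # DDR2 per-pin rate quoted above (1 Gb/s)
BUS_WIDTH_PINS = 64         # ASSUMPTION: 64 data pins
WORD_BYTES = 8              # ASSUMPTION: one 64-bit word per random access


def streaming_bytes_per_s(pin_rate=PIN_RATE_BITS_PER_S, pins=BUS_WIDTH_PINS):
    """Peak bytes/s if every pin toggles usefully on every transfer."""
    return pin_rate * pins / 8


def random_bytes_per_s(latency_ns, word_bytes=WORD_BYTES):
    """Useful bytes/s when each access pays the full latency for one word."""
    return word_bytes / (latency_ns * 1e-9)


def slowdown(latency_ns):
    """How many times slower random single-word access is than streaming."""
    return streaming_bytes_per_s() / random_bytes_per_s(latency_ns)


if __name__ == "__main__":
    for latency in (60, 300):  # RAS cycle, and the TLB-miss case from the post
        print(f"{latency} ns latency: random access is about "
              f"{slowdown(latency):.0f}x slower than streaming")
```

Under these assumptions the gap is about 60x at 60ns and about 300x at 300ns, which is the sense in which faster pins do nothing for a single word fetched from a random address.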
The Memory Wall kills HPC apps as well as desktop PCs. The proportion of memory that can be faked as being fast, i.e. cached, is also getting smaller: DRAMs can get bigger, but SRAM caches are really stuck near 1-2Mbytes per chip.
What Dally suggests is one way to tackle that.
I have proposed an alternative that swaps the Memory Wall for an explicit Thread Wall: about 40 threads running on a relatively simple CPU design, paired with a special-purpose threaded RLDRAM (off the shelf) that can start a true random access anywhere in the DRAM space every 2.5ns. That can effectively cut the Memory Wall back down to a few CPU cycles for Ld/St as well as FP codes.
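The arithmetic behind that trade can be sketched as follows. Only the 40 threads and the 2.5ns issue interval come from the post; the strict round-robin schedule, the one-outstanding-access-per-thread rule and the latency values passed in are assumptions for illustration, not a description of the actual design.

```python
import math

# Figures from the post above.
ISSUE_INTERVAL_NS = 2.5  # a new random access can start every 2.5 ns
THREADS = 40             # "about 40 threads"

# ASSUMPTIONS: threads take turns in strict round-robin order, and each
# thread has at most one memory access in flight at a time.


def latency_covered_ns(threads=THREADS, issue_interval_ns=ISSUE_INTERVAL_NS):
    """Time between one thread's consecutive issue slots.

    Any memory latency up to this value is hidden from the thread, because
    its data has arrived by the time its turn comes round again.
    """
    return threads * issue_interval_ns


def threads_needed(memory_latency_ns, issue_interval_ns=ISSUE_INTERVAL_NS):
    """Minimum thread count that keeps the memory busy at a given latency."""
    return math.ceil(memory_latency_ns / issue_interval_ns)


if __name__ == "__main__":
    print("latency covered by 40 threads (ns):", latency_covered_ns())
    for latency in (60, 100):  # hypothetical latencies, not device specs
        print(f"threads needed to hide {latency} ns:", threads_needed(latency))
```

In this simplified model 40 threads cover 100ns of latency, so each thread sees memory as only a few of its own cycles away. The cost is that a single thread runs at a fraction of the machine's speed, which is the Thread Wall.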
But no, the silence is deafening!
This is JUST what the world needed.
¡Thanks IBM! ¡Thanks, Los Alamos!
Does it run Linux?
Wouldn’t it be best used as a BitTorrent seeder?
Let’s see some Quake 4 benchmarks!
Maybe it’ll run AmigaDOS!
How fast can it rip & encode MP3s?
Will it have more than 640k of RAM?
Who will set the IRQs on it?
What kind of motherboard from CompUSA’s 1337 section will it use?
to add that nice machine to my Folding@Home collection. I’m sure it has a few CPU cycles to spare! 😉
An Opteron and a Cell is an interesting mix. I imagine the Opteron will handle control and I/O and Cell will be left to do the heavy FP work.
On other occasions, if the task requires a lot of random memory access, the Opteron will be doing the majority of the work.
Eventually we’ll see individual chips with both functions. Cell already has a control processor but it’s not the fastest thing on the planet so I imagine it’ll be beefed up at some point.
AMD, on the other hand, have good general-purpose cores but don’t have SIMD cores yet. They definitely have a Cell-like processor in the works, though. Intel have planned something similar as well.