Intel is drumming up support for its latest 50-core Knights Corner and Xeon E5 server chips, which are key elements in the company’s plans to scale performance while reducing power consumption moving toward an exascale supercomputer by 2018.
I’m not a hardware guy but I gotta say I love reading about this stuff. Cool!
How about dual or quad AMD Opteron 6282 SEs? 16 cores at 3 GHz for only $1020. The Bulldozer architecture doesn’t get a fair shake on Windows; on Linux, though, it fares far better.
http://www.cpu-world.com/CPUs/Bulldozer/AMD-Opteron%206282%…
http://openbenchmarking.org/s/4%20x%20AMD%20Opteron~*~@…
http://arstechnica.com/business/news/2011/11/amds-16-core-bulldozer…
High-end CPUs are behind GPUs here.
The HD6970 already pushes out 2.7 teraflops, and the GTX590 is not far behind. Double that in CrossFire or SLI. So this kind of power is already commercially available.
But it does require optimised software, consumes more power than a CPU, and takes up about 30cm of your case, so it’s not a fair comparison.
Edited 2011-11-16 21:46 UTC
Well, if their C++ compiler holds up to its promises and if they were able at some point to put this thing inside desktop/server CPUs, with the amount of standardization and open documentation that current CPUs get, this could mean an amount of computational power that’s comparable to a GPU, but without having to deal with buggy and non-standard drivers + silly GPU-specific APIs.
The impact could be huge. But that’s a lot of ifs.
Edited 2011-11-16 22:01 UTC
But that is single-precision performance. In double precision (the 1TF claim for KC is for double precision) the performance plummets to 675GF. That’s still pretty good, no question about it. But it is also peak performance. With a GPU it is very difficult to achieve proper utilization. Even 80% would be a huge success, and then only possible on a very small number of tasks.
Due to its architecture, I’d expect Knights Corner (what a stupid name!) to be able to perform much closer to its peak performance. And who is to say that it won’t be capable of dual-board configurations?
I’m very excited about this chip. It’s a pity Intel wouldn’t tell us more about it.
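To put rough numbers on the peak-versus-sustained point, here is a minimal sketch in Python. The peak figures are the ones quoted in this thread; the function name and the break-even calculation are illustrative only, and nothing here is a measured result.

```python
def sustained_gflops(peak_gflops, utilization):
    """Sustained throughput for a given peak rate and utilization in [0, 1]."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    return peak_gflops * utilization


# Peak figures as quoted in the thread (not measurements).
GPU_DP_PEAK_GFLOPS = 675.0   # HD6970, double precision
KC_DP_PEAK_GFLOPS = 1000.0   # Knights Corner, double precision claim

# Best case named above for the GPU: 80% utilization.
gpu_best_case = sustained_gflops(GPU_DP_PEAK_GFLOPS, 0.80)

# Utilization Knights Corner would need just to match that best case.
kc_break_even = gpu_best_case / KC_DP_PEAK_GFLOPS

print(f"GPU at 80% of DP peak: {gpu_best_case:.0f} GFLOPS")
print(f"KC utilization needed to match: {kc_break_even:.0%}")
```

In other words, if the 1TF claim holds, Knights Corner would only need to sustain a bit over half of its peak to match a GPU running at a utilization that is already hard to reach.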
In addition, tasks that are not trivially parallel kill the performance. In short, if it’s not a linear algorithm, the GPGPU will struggle to perform.
At least that is the case with CUDA. AMD’s Stream processing might be a bit different.
Add to that the overhead of transferring data from main RAM to GPU RAM (even if it’s physically the same thing).
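A crude way to see when the transfer overhead dominates is to add copy time to compute time and compare against staying on the host. The sketch below is a simplified model, not a benchmark: all function names are made up for illustration, the 100 GFLOPS host rate is an assumption, the 8 GB/s link is one direction of a PCI-E 2.0 16x slot, and overlap of transfers with compute is ignored.

```python
def offload_time_s(flops, bytes_moved, device_gflops, link_gbytes_per_s):
    """Time to copy the data over the link plus time to compute on the device."""
    transfer = bytes_moved / (link_gbytes_per_s * 1e9)
    compute = flops / (device_gflops * 1e9)
    return transfer + compute


def host_time_s(flops, host_gflops):
    """Time to do the same work on the host, with no copies at all."""
    return flops / (host_gflops * 1e9)


def offload_pays_off(flops, bytes_moved, device_gflops=540.0,
                     host_gflops=100.0, link_gbytes_per_s=8.0):
    """True when copy + device compute beats the host. Defaults are assumptions."""
    return (offload_time_s(flops, bytes_moved, device_gflops, link_gbytes_per_s)
            < host_time_s(flops, host_gflops))


# Vector add of n doubles: 1 flop per element, 24 bytes moved per element
# (two inputs copied in, one result copied back).
n = 10_000_000
vector_add = offload_pays_off(flops=n, bytes_moved=24 * n)

# Dense matrix multiply of m x m doubles: 2*m^3 flops, three matrices moved.
m = 4096
matmul = offload_pays_off(flops=2 * m**3, bytes_moved=3 * m * m * 8)

print("vector add worth offloading:", vector_add)
print("matrix multiply worth offloading:", matmul)
```

Under these assumptions the vector add loses badly because it does almost no work per byte copied, while the matrix multiply does enough arithmetic per byte to hide the copy.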
From the article:
“The Knights Corner chip mixes standard x86 CPU cores with specialized cores and works as an accelerator alongside the CPU to boost parallel application performance.”
So maybe these are GPU(-like) or even more specialized.
GPU-like refers to streaming or SIMD programming models.
These Intel boards can be programmed under either model, but also support MIMD, even via message passing. That means a large amount of code can be ported to them fairly straightforwardly. Or at least the porting process is much, much easier than porting to a GPU would be.
If Intel is really getting that DP performance out of that chip, this could put NVIDIA in trouble in the HPC arena, given that Intel’s programming tools are light years ahead. Of course, it all depends on the price point for these chips.
PCI-E is still dog slow. This is very hard to get excited about.
PCIe 3 is slated to come around with the next generation of GPUs (HD7000/GTX600) and mobo chipsets.
https://secure.wikimedia.org/wikipedia/en/wiki/PCI_Express#PCI_Expre…
Your life must be so exciting then…
Huh?
Compared to what exactly?
Which application do you use that manages to saturate a PCI-E 2.0 16x link running at ~16GB/s full duplex *? (Let alone a soon-to-be-released ~32GB/s PCI-E 3.0 16x link.)
– Gilboa
* Keep in mind that a *single* PCI-E 2.0 8x slot can handle a dual 10GbE link at full duplex without breaking a sweat.
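For anyone who wants to check those figures, here is a small Python sketch of the arithmetic. It uses the per-lane signalling rates and line encodings of PCI-E 2.0 (5 GT/s, 8b/10b) and 3.0 (8 GT/s, 128b/130b); the function names are made up for illustration, and packet and protocol overhead is ignored, so real throughput will be somewhat lower.

```python
# Per-lane signalling rate in GT/s and line-encoding efficiency per generation.
PCIE_GENERATIONS = {
    "2.0": (5.0, 8 / 10),      # 8b/10b encoding
    "3.0": (8.0, 128 / 130),   # 128b/130b encoding
}


def pcie_gbytes_per_s(generation, lanes, full_duplex=False):
    """Raw link bandwidth in GB/s, before packet and protocol overhead."""
    rate_gt, efficiency = PCIE_GENERATIONS[generation]
    per_direction = rate_gt * efficiency * lanes / 8  # bits to bytes
    return per_direction * (2 if full_duplex else 1)


def ethernet_gbytes_per_s(gbits_per_link, links):
    """Raw Ethernet bandwidth in GB/s, one direction."""
    return gbits_per_link * links / 8


print(f"PCI-E 2.0 16x, full duplex: {pcie_gbytes_per_s('2.0', 16, True):.1f} GB/s")
print(f"PCI-E 3.0 16x, full duplex: {pcie_gbytes_per_s('3.0', 16, True):.1f} GB/s")
print(f"PCI-E 2.0 8x, per direction: {pcie_gbytes_per_s('2.0', 8):.1f} GB/s")
print(f"Dual 10GbE, per direction:  {ethernet_gbytes_per_s(10, 2):.1f} GB/s")
```

The 8x slot has 4 GB/s each way against the 2.5 GB/s each way that dual 10GbE needs, which is why it has headroom to spare.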
for OpenCL. Not buggy, non-portable blobs.