SGI hopes by 2018 to build supercomputers 500 times faster than the most powerful systems today, using specially designed accelerator chips made by Intel: highly parallel processors based on Intel's MIC (Many Integrated Cores) architecture.
Currently they only have four systems in the top fifty supercomputers, and their highest is number seven: http://www.top500.org/list/2011/06/100. I'll be curious to see how well they compete with the other architectures. Interestingly, the new number one is based on SPARC64, but it runs Linux.
Yes, these HPC supercomputers mostly run Linux. Such a supercomputer is basically a cluster on a fast switch. It is similar to Google's network; Google has something like 10,000 PCs on a network. Just add a node and boot Linux, and you have increased the performance of the cluster. Linux scales well horizontally (on large networks with thousands of machines). For instance, SGI has an Altix server with as many as 1,024 cores; it is a bunch of blades in a rack connected by a fast switch.
The opposite is a single large SMP server, weighing 1,000 kg or so. They are built totally differently. For instance, IBM has released its new and mighty P795 Unix server with as many as 32 POWER7 CPUs. IBM's largest mainframe has as many as 24 z196 CPUs. Oracle has a Solaris server, the M9000, with as many as 64 CPUs. Some years ago, Sun sold a Solaris server with as many as 144 CPUs. This is vertical scaling; it is not a network with a bunch of blade PCs.
Linux scales badly vertically. Linux has problems going beyond 32 cores; 48 cores are not handled well. This can be seen, for instance, in SAP benchmarks on 48 cores, where Linux ran on faster AMD CPUs with faster DRAM than the Solaris server, yet achieved only 87% CPU utilization whereas Solaris achieved 99%. That is why Solaris was faster even on slower hardware (see the rough numbers below). To scale well vertically, you need a mature enterprise Unix; Linux cannot do it yet, because it takes decades of experience and tuning. Until recently, Linux still had the Big Kernel Lock!
Ted Ts'o, the ext4 creator, recently explained that until now, 32 cores was considered exotic and expensive hardware to Linux developers, but that is changing, which is why he is now working on scaling Linux up to as many as 32 cores. Solaris/AIX/HP-UX/etc. kernel devs, by contrast, have had access to large servers with many CPUs for decades. Linux devs have only recently got access to 32 cores; not 32 CPUs, but 32 cores. After a decade(?), Linux might handle 32 CPUs too.
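A back-of-the-envelope sketch of the utilization claim above, with hypothetical per-core speeds (the 87% and 99% figures come from the comment; everything else is assumed, not the actual SAP benchmark data): effective throughput scales roughly as cores × per-core speed × utilization, so a small per-core advantage can be erased by a larger utilization gap.

```python
# Hypothetical numbers; only meant to show how utilization can outweigh
# a modest raw-speed advantage on a 48-core box.
cores = 48

linux_per_core = 1.05        # assume the AMD CPUs are ~5% faster per core
linux_utilization = 0.87     # 87% utilization, as claimed above

solaris_per_core = 1.00      # baseline per-core speed
solaris_utilization = 0.99   # 99% utilization, as claimed above

linux_throughput = cores * linux_per_core * linux_utilization        # ~43.8
solaris_throughput = cores * solaris_per_core * solaris_utilization  # ~47.5

print(f"Linux: {linux_throughput:.1f}  Solaris: {solaris_throughput:.1f}")
```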
Strange… Linux on Z works perfectly well on 60 CPUs assigned to it. And with high utilisation…
n00b question then: so what does that mean for even AMD's next lineup of server CPUs? They're going to be 16 cores per socket, commonly up to 4 sockets per board, i.e. 64 cores in a single box.
So Linux won't be able to make full use of a single box any more, then?
Some famous computer scientist pointed out something interesting about supercomputers not long ago. We're already at the point where the majority of the energy used in parallel computing goes into communication between processors, and the relative proportion of compute-related energy is declining rapidly.
Maybe Intel’s idea of using optical interconnects, down to inside CPUs, could help, then. Also, if communication becomes a critical concern, those distributed designs could be put to work as well.
Myself, I’ve always wondered why motherboard buses still use copper wires. AFAIK, in high-performance applications, CPU buses are already a major bottleneck. Aren’t integrated photonics already ready for the job?
At some point it has to be converted back to electrical signals, and I thought therein lie additional expenses in terminating and/or conversion components.
Well, it’ll certainly eat more power due to the energy conversions each time a laser or photodetector is involved, but in HPC the goal is raw performance and not so much performance/watt, right?
I mean, I understand that there could be laser/photodetector size problems for Intel’s on-die photonics idea, but motherboard buses aren’t much miniaturized, or are they?
Yes, the key problem with supercomputing for a long time now has been NUMA: http://en.wikipedia.org/wiki/Non-Uniform_Memory_Access
Unfortunately, supercomputing has long been a case of diminishing returns (what isn’t, I guess?)… additional processors simply sit starved because of NUMA (the simplest and worst solution being that every processor waits, every time, for the slowest possible memory access; see the toy numbers below).
The hope is that with multicore chips hitting desktops in recent years, there will be much more money/research going into ways to mitigate this.
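A toy calculation of the worst-case policy described above, with made-up latencies (real local/remote costs vary per machine): charging every access the slowest remote latency versus NUMA-aware placement that keeps most accesses local.

```python
# Made-up latencies, purely to illustrate the worst-case-wait policy above.
local_ns = 80       # hypothetical latency to a core's own memory node
remote_ns = 240     # hypothetical latency to the farthest remote node

# Worst case: every access is charged the slowest possible latency.
naive_avg = remote_ns

# NUMA-aware placement: assume 90% of accesses can stay local.
local_fraction = 0.9
aware_avg = local_fraction * local_ns + (1 - local_fraction) * remote_ns

print(f"worst-case: {naive_avg} ns/access, NUMA-aware: {aware_avg:.0f} ns/access")
# -> worst-case: 240 ns/access, NUMA-aware: 96 ns/access in this toy example
```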
I wouldn’t be as optimistic. Why should the emergence of multicore chips in desktops lead to more research funding in the area? Lots of common desktop tasks already worked well on the Pentium 3, and the only non-niche high-performance desktop task I can think of is gaming.
If anything, I’d say that the PS3 and the Xbox 360 could have done more for the improvement of HPC than multicore desktops. I mean… Lots of CPU cores struggling to access the same tiny RAM chip, outdated GPUs, and hardware that must be pushed to its limits before it’s replaced… Now you have room for research funding.
The NUMA problems that you are referring to assume that you only want to run a single application on the whole system. NUMA issues can be greatly reduced when you use something like CPUSETS to confine an application to a set of CPUs/cores and memory. With a larger number of CPUs/cores and CPUSETS, you can run a fair number of applications (depending upon the required number of CPUs/cores) efficiently and with predictable completion times. Consequently, NUMA isn’t a huge stumbling block. It all depends upon how you use the system.
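Cpusets themselves are set up through the kernel's cpuset (cgroup) filesystem; as a rough, Linux-only illustration of the same idea of confining a job to a subset of CPUs, here is a Python sketch using os.sched_setaffinity (the choice of cores 0-7 is arbitrary):

```python
import os

# Linux-only: restrict the current process (pid 0 = self) to cores 0-7,
# so its threads stay on one part of the machine. Real cpusets go further
# and also pin which memory nodes the job may allocate from.
allowed_cpus = set(range(8))           # arbitrary example subset
os.sched_setaffinity(0, allowed_cpus)

print("now restricted to CPUs:", sorted(os.sched_getaffinity(0)))
```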
Each time I see one of these articles about higher performance, more speed, more whatnot, my first thoughts are “where will it stop?” and “is it needed?”
It will stop when we have photorealistic 3D and sexbots with convincing AI. I think that’s what Kurzweil means by “the singularity” – everyone will be single and connected to some virtual reality world as much as possible.
Photorealism? Nah, no point in spending hours of human work and computing power on a 3D mesh if it’s not better than the original photo.
Convincing sexbots, on the other hand, could be used in an interesting way… http://xkcd.com/632/
Just ask scientists working in meteorology (weather forecast), or astrophysics (modelling supernovae, string landscapes), or particle physics (LHC collisions), or cancer research, or neurology (brain modelling).
Or numerical atomic bomb simulations
Just noting that 2018 is roughly six years away, and Moore’s Law suggests that CPU power doubles every 18 months.
So by 2018, the other competitors’ own progress will have taken them some of the way there… there may still be about a 30-fold advantage if all goes to plan (rough arithmetic below).
Just trying to get things in proportion, that’s all.
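The rough arithmetic behind that 30-fold figure, assuming a clean doubling every 18 months (only a rule of thumb, of course):

```python
# Sanity check of the ~30-fold figure, assuming Moore's-law doublings.
years = 6.0                  # roughly 2011/2012 to 2018
doubling_period = 1.5        # years per doubling (the 18-month rule of thumb)
target_speedup = 500         # SGI's stated goal versus today's fastest systems

competitor_gain = 2 ** (years / doubling_period)        # 2**4 = 16x
remaining_advantage = target_speedup / competitor_gain  # 500 / 16 ~= 31x

print(f"competitors gain ~{competitor_gain:.0f}x, "
      f"leaving roughly a {remaining_advantage:.0f}-fold advantage")
```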