SGI Unveils Systems Running on Single Linux Kernel Up To 256 CPUs

In an industry first, Silicon Graphics today announced the worldwide availability of SGI Altix systems running up to 256 Intel Itanium 2 processors within a single instance of the Linux operating system. The record-breaking accomplishment is the end result of an international Altix beta program originally targeted to achieve just half the scalability of today’s 256-processor milestone.
Yes, that would be Microsoft, who has been stepping up to play in the same space as the big Unixes. While M$ technology may still be behind the major Unix players, they are catching up and catching on.
Why? Because they own the desktop and the Windows interface and branding are familiar. If Linux makes itself felt strongly on the desktop, that may well secure the position of the high-end Unixes as, cosmetically, they can be made to look like Linux or vice-versa, thereby flattening out the learning curve.
To quote from the paper:
“The peak system bandwidth is determined by the snoop rate of 150 MHz x 64-byte cache line width = 9.6 GBps. The system-level data interconnect is a 4×4, 5×5 or 10×10 32-byte-wide crossbar, which is bit-sliced across the switch boards.
9.4.2 Multiple snooping coherence domains. Each snooping coherence domain consists of one CPU/memory board and one I/O assembly, which are connected together by a switch board called an expander board. This board provides the system-level Address Repeater and Data Switch logic required of any snooping coherence domain. In addition, there is SSM address logic and data logic to implement transfers between the snooping coherence domains.
When all transfers are to locations inside the same snooping coherence domain, the peak data bandwidth is 9.6 GBps per snooping coherence domain.”
It all depends on the coherency domain. All Sun Fire 3800-6800 systems have one coherency domain, giving you a peak bandwidth of 9.6GB/sec. I can’t seem to find any papers on the SF15K, so we can’t really extrapolate these numbers to that system since it has multiple coherency domains.
Ahh, the important part is this though:
When all transfers are to locations inside the same snooping coherence
domain, the peak data bandwidth is 9.6 GBps per snooping coherence domain.
This means within a board (4 CPU packages), the maximum local memory bandwidth is constrained to 9.6GB/s due to that being the maximum capacity of the snoop bus.
When you want to access remote memory, the interconnect has a maximum bandwidth of 4.8GB/s or 150MHz x 32-bytes.
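Both figures follow from the same 150 MHz system clock multiplied by the width of the path involved:

$$150\ \text{MHz} \times 64\ \text{bytes} = 9.6\ \text{GB/s (within a snooping coherence domain)}$$

$$150\ \text{MHz} \times 32\ \text{bytes} = 4.8\ \text{GB/s (per direction across the interconnect)}$$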
Actually the vibe I’m gathering is that Sun more often than not is winning on price/performance over Itanium and even Xeon, and not just on raw numbers. The Sybase IQ benchmark is a good example:
Well I’m surprised. Sun looks alright in some of those tests. Don’t look at the Sun site of course, but the TPC site – they don’t look quite so good there, but not terrible.
As for my spelling, well, translating GNOME into Irish and posting on OSNews at the same time can be hard on the auld brain cells :)
>>Since its January 2003 introduction, SGI says it has shipped 10,000 processors to 150 Altix 3000 customers, including a 512-processor machine used by NASA Ames Research Center.
http://www.newsforge.com/hardware/04/01/12/1559251.shtml?tid=68&tid…
As for zealots, my opinion is that they make up a fixed percentage of any group (be they Linux users, *BSD users, Christians, etc.).
Now it’s generally regarded that there are more Linux users out there than *BSD users (I could be wrong though); as a result, when you work out percentages you will end up with more Linux zealots (numerically).
The R18k has been cancelled :( It was scheduled for release last year. One of the main contributors at Nekochan (the guy who ported GNOME Office) had a conversation with the chief scientist @ SGI, who confirmed that it had been cancelled and that they were concentrating on Itanium/Linux.
No doubt they will bring out a core shrink (from .11 to .09) of the R10k family (the R16k being the .11 core) and rebrand it as R18k, but the original R18k (N0) was a totally new design.
Hiya
>Now it’s generally regarded that there are more Linux users
>out there than *BSD users (I could be wrong though)
Apple claim to have over 9 million Mac OS X users (with 10 million expected towards the end of this year), so arguably there are more *BSD-ers than Linux-ers :)
—
Jon
When you want to access remote memory, the interconnect has a maximum bandwidth of 4.8GB/s or 150MHz x 32-bytes
Like I said, on the SF3800-6800 the whole box is one single snoop coherence domain. An SF6800 has 24 CPUs, meaning 6 boards. Read slide 12 with all the pretty pictures.
OK, this discussion is going in a different direction. What Bascule was pointing out is that the SGI box doesn’t have as much bandwidth as the SF15k.
This is the only info I could find on the Altix:
Dual plane configuration (Altix 3700) doubles the bandwidth of the NUMAlink-3 fabric to take advantage of SHUB memory bandwidth
2 x 3.2GB/sec total bandwidth between C-bricks
Single plane NUMAlink-4 (Altix 3300)
1 x 6.4GB/sec per link
There is no info on MPI overhead or sustained bandwidth anywhere. Until you can prove that the SGI has multiple links between each brick (even though the SGI site claims otherwise), this discussion is meaningless.
Hi
“Apple claim to have over 9 million Mac OS X users (with 10 million expected towards the end of this year), so arguably there are more *BSD-ers than Linux-ers :)
—
Jon ”
goose. apple is not bsd. keep telling that to your kids
regards
Jess
Hiya
> goose. apple is not bsd. keep telling that to your kids
Correct, Apple are a computer company.
From Apple’s web site:
Q. What is Darwin?
A. Darwin is a version of the BSD UNIX operating system that offers advanced networking, services such as the Apache web server, and support for both Macintosh and UNIX file systems. [snip]
Q. How does Darwin relate to Mac OS X?
A. Darwin is the core of Mac OS X. [snip]
See for yourself:
http://developer.apple.com/darwin/projects/darwin/faq.html
—
Jon
Are we ready to admit that proprietary Unixes are replaceable by Linux? Scalability is no longer an argument.
I’m not going to pretend I have any experience with this, so I don’t know, but I’ve heard rumors of Linux having issues on MP systems with over a gig of RAM. So my question is: what’s stability like, and how much RAM is in these things?
“..over a gig of RAM..”
I’ve got 1.5GB with SMP at home (Linux 2.6.3)
I don’t have a wealth of experience with high-end Linux systems myself, as I’m a Solaris man at heart; however, my main desktop at home is Gentoo Linux running on a dual Athlon with 2GB of RAM, and I’ve never had any stability issues.
That’s just my own experience with one box, so you can’t really draw any conclusions other than that I’m very satisfied.
The 1GB limit is only a problem on 32-bit systems.
I figured it was fine, as I didn’t see where the stability issues would come into play; I know Linux is common on dual-proc systems and some four-ways, and I figure no such system would keep itself stuck at a gig of RAM. I forget where I heard this (it could have been a single instance), but the guy made it sound as if people were plagued by this issue.

I personally have been sticking with single proc. Even when I built this newer system I contemplated dual or Opteron, but in the end went with a single AMD 3200+ and a gig of RAM; I guess I just don’t feel I need dual processors right now. I can certainly see why certain data centers or render farms would want such hardware, but even for a small business I think I’d be prone to using a single-processor system as my server.

I had much success using Linux on my 486 with 32MB of RAM as a server for the past 6 years or so. Only recently did I change my server over to my old desktop, and the only thing I’m running now that I didn’t run before is MySQL; the 486 still handled a fair amount of filesharing, e-mail, and web serving for me and a few of my friends quite nicely.
Actually it’s 4GB (4096MB)…
“The 1GB limit, is only a problem on 32bit systems. ”
Eeerrr, no.
“Are we ready to admit that proprietary Unixes are replaceable by Linux? Scalability is no longer an argument.”
Well, there is still real hot swapping of kernels, fault tolerance, partitioning, decent threading, development tools, etc. So there is still space for proprietary Unixen. The right tool for the right job; if you have a hammer (Linux), every problem seems to be a nail, or so it seems for some Linux zealots.
Hi
”
Well, there is still real hot swapping of kernels, fault tolerance, partitioning, decent threading, development tools, etc. So there is still space for proprietary Unixen. The right tool for the right job; if you have a hammer (Linux), every problem seems to be a nail, or so it seems for some Linux zealots.”
does any operating system have hot swapping of kernels?
partitioning is already fixed with lvm2
threading has been fixed with nptl.
what development tools do you need?
gcc, kgdb, lkcd are already there.
try being more specific next time
Jess
Well, there is still real hot swapping of kernels, fault tolerance, partitioning, decent threading, development tools, etc. So there is still space for proprietary Unixen. The right tool for the right job; if you have a hammer (Linux), every problem seems to be a nail, or so it seems for some Linux zealots.
Starting with 2.6 Linux uses NPTL, so no need to worry about threading. Who needs fault tolerance when you have cheap and redundant hardware? Development tools? Which ones are you referring to? What is the problem in Linux with partitioning?
The “1GB limit” is a manifestation of the limitations of the 4GB address space on 32-bit systems. Linux normally likes to map all memory into kernel space. That means all physical RAM, plus all memory mapped devices (like 256MB graphics cards). However, the kernel address space is only 1GB; 3GB is reserved for userspace, for memory mapped files, anonymous memory mappings, stacks, etc. When you’ve got > 1GB of RAM, the kernel cannot map all of RAM at once. Instead, it uses a mechanism called highmem to map parts of main memory into the kernel address space as required.
Highmem isn’t exactly a huge problem. It requires some book-keeping overhead in that every time memory greater than 1GB needs to be accessed by the kernel, it has to be mapped in to a location below 1GB. Also, doing I/O from memory above 1GB often requires the data to be first copied to a “bounce buffer” residing below 1GB. For machines doing a lot of I/O, this can become a big bottleneck.
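To make that concrete, here is a minimal illustrative sketch (generic 2.6-era kernel API, nothing Altix-specific; the helper name is made up) of how kernel code touches a page that may live in highmem:

```c
/* Illustrative sketch only (generic 2.6-era kernel API, nothing
 * Altix-specific; zero_any_page is a made-up helper name).
 * kmap() returns a kernel virtual address for the page: for lowmem
 * pages it is essentially free, for highmem pages it installs a
 * temporary mapping that kunmap() tears down again. */
#include <linux/highmem.h>
#include <linux/mm.h>
#include <linux/string.h>

static void zero_any_page(struct page *page)
{
        void *vaddr = kmap(page);   /* map page into kernel space */

        memset(vaddr, 0, PAGE_SIZE);

        kunmap(page);               /* release the temporary mapping */
}
```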
Actually, Solaris has a feature called “Hot Patches” that lets you apply fixes to a running kernel. I presume other UNIXes have something similar.
Hi
Yes, I have heard of that. However, does any operating system allow you to swap a running kernel for a new one? I think that would be a very hard thing to do. I am pretty sure there were some discussions related to that on lkml. A microkernel could possibly swap several subsystems. Linux might be able to incorporate something like hot swap. I wonder whether this feature, if incorporated, would actually be a requirement in enterprise systems; it’s definitely an interesting feature to have.
Is there other such stuff that Linux doesn’t offer in competition with proprietary systems? Please specify it in detail.
thanks
Jess
I’ve seen a lot of kneejerk reactions (“right tool for the right job”, “everything looks like a nail [to Linux zealots]”).
Linux has improved its scalability with every major kernel release. It seems to me that about 8 years ago, Linux was only running on standard uniprocessor hardware. A few years later, it was making its way into dual-proc boxen, and reaching for quad processor. About that time a lot of work was going into making it ready for “workstation” hardware (PA-RISC and the like). The 2.6 release was /almost/ exclusively about scalability – more RAM, more processors, better threading, better scheduling.
I see this as just spreading the new kernel’s wings – seeing what it will do, and how it scales up on big iron. You can’t really say “right tool for the job” until you’re educated on all the tools available to you.
It’s a pity they cancelled the R18k (and “N1” as well) :( At least they’ll be maintaining IRIX for the next couple of years, but we won’t see much in the way of improvements (other than bug fixes).
Expect to see an Itanium/Linux alternative for Onyx4 in the next year or so.
>>Key to enabling support for 256-processor Altix systems is the recently released SGI Advanced Linux™ Environment with SGI ProPack™ 2.4.
That sounds really proprietary to me … which is how SGI is going to get $$$. Offering something that no-one else has.
The future of corporations and Linux is going to be much like the relationship Apple has with BSD/Darwin, where they are happy to give back improvements made to code taken from the open source pool, but all the really cool powertoys will remain proprietary.
(Not that I don’t think this is perfectly fair.)
Hi
The proprietary stuff is not a requirement; it just adds some SGI-specific userland tools. Many companies that support Linux in commercial environments adopt this policy, where their base stuff is open while what gives them an advantage is proprietary. You are right about that.
Jess
From the article:
Anticipating demand for even more powerful Altix systems, SGI also announced year-end plans to scale Linux to 512 processors in an SSI
Now I know what I want for my birthday!
The ProPack is a mix of open source and proprietary stuff. It has all the patches to the kernel that they apply (a lot of which are now in 2.6), as well as stuff like XSCSI (the SCSI layer from IRIX), which exists above the kernel and allows for a lot better access speeds etc.
Kexec,
http://lwn.net/Articles/15468/
Seems like there are patches for 2.6 which make hot swapping of kernels a reality…
Are we ready to admit that proprietary Unixes are replaceable by Linux? Scalability is no longer an argument.
The number of processors supported within a single system image speaks more of the kernel design than of that kernel’s ability to scale to a large number of processors, especially when that kernel (or modifications made to it) are targeting a single architecture. The throughput of the NUMA architecture utilized in the Altix (6.4GB/s) is already significantly less than the throughput of, say, higher end Sun offerings with FirePlane interconnect (9.6GB/s, with an aggregate bandwidth of 172.8GB/s for a SunFire 15k), and Linux, being a notoriously latency optimized (i.e. desktop oriented), low throughput kernel in comparison to offerings like Solaris, is highly unlikely to scale well to enterprise tasks.
Yes, some of the low latency features are compile time options which can be disabled. But Solaris has been proactively optimized for high throughput enterprise computing tasks, to the point that it’s painfully unresponsive to use as a desktop. Linux has not been proactively optimized for enterprise tasks… instead it has been retroactively optimized for such tasks by IBM, HP, and SGI, while others attempt to fine tune the kernel for latency. Latency and throughput are always a tradeoff, and with desktop users trying to tune the kernel for latency at the same time IBM, HP, and SGI are pushing for throughput, it can never approach the degree of fine tuning Solaris has seen for enterprise tasks.
The continual, tired argument of Linux zealots is that Linux will eventually replace all proprietary systems. I have never seen this comment made by anyone with enterprise computing experience. Yes, scores of Linux clusters built from low end commodity x86 components have replaced large scientific computing systems. Yes, Linux server farms can provide a web portal for large companies. But what Fortune 500 company farms out edge mail to Linux systems? What companies run their central database and CORBA services on a Linux system? The Altix is the only Linux system that can come close to being enterprise-worthy, and it’s the product of a company on its way out, SGI.
SGI has tried to make Linux enterprise worthy, porting over a number of high availability features, resource management features, and debugging features from Irix. But keep in mind these are SGI patches only, and not in the Linux kernel mainline. Certainly you can apply these patches to your own kernel tree (since they are covered by the GPL) but I don’t believe you’ll find them particularly useful as they are targeted towards the Altix architecture.
But yes, believe it or not Linux, with considerable work from an enterprise systems vendor, can be made enterprise ready. The question is… who really cares about SGI? Their market cap has dipped $200 million since the beginning of the year to ~$575 million, a tiny fraction of their $2.17 billion market cap in the beginning of 2000. Who really feels safe that SGI will be around another 3-5 years to handle their service contract? When you’re buying a > $1 million computer, why would you buy it from a company which appears to be less than a year away from Chapter 11?
SGI has lost their market. Macs and Windows machines are the new graphics workstations. Single system image supercomputers for scientific computing are losing ground to cheap x86 MPI/PVM clusters. NASA bought an Altix system, and Oracle has pledged support for Altix, but is there any other news from a sales perspective?
When the age of enterprise Linux comes, and we see Linux running on massively scalable enterprise caliber systems, it will be brought by the likes of IBM and HP, not by SGI.
Why is SGI stuck with the Itanium 2 when AMD offers cheaper alternatives? It’s quite an accomplishment to keep on dumping money on a chip that many customers avoid like the plague.
Well, regarding sales: from what I remember reading both on realworldtech.com and nekochan.net (IRIX community site), SGI have picked up over 110 customers for the Altrix since it was launched in 2003, comprising over 10,000 Itanium2 processors. I think the most recent Altrix sale was to Skoda (part of Volkswagen), but I’d need to go through the SGI site news.
Oracle is actually supported on the Altrix, but given that the Altrix runs either an SGI Linux (based off Red Hat) or SuSE, this isn’t too surprising.
AMD64 wasn’t around when SGI started designing the Altrix; that, and when you’re talking about machines like these, the price of the processors is quite small against the overall price of the system itself. Also, high-end Opterons (the 8-processor series) aren’t particularly cheap, the last time I looked.
As regards SGI patches, well, I check Linus’s BitKeeper repository every day, and I remember seeing a fair number of Altrix-related patches being applied about 3-4 weeks ago. That, and kgdb looking like it will be included in 2.6, has seriously reduced the number of external system patches that SGI have to maintain outside the kernel.
They are testing 2.6 on the Altrix at the moment and planning on using it production-wise in the next 3-6 months.
As for Linux zealots @ Bascule: you have to admit there are zealots of all shapes and colours out there; if anything I find the FreeBSD ones a lot more vocal on this site, but then that’s just my opinion.
Back to eBay and buying SGI kit with me :)
Slightly off-topic post, but given that it’s an SGI thread, here’s a screenshot of Firefox built with gtk2 and XFT on IRIX:
http://frink.nuigalway.ie/~dubhthach/IRIX/osnews.png
Hi
“But yes, believe it or not Linux, with considerable work from an enterprise systems vendor, can be made enterprise ready. The question is… who really cares about SGI? Their market cap has dipped $200 million since the beginning of the year to ~$575 million, a tiny fraction of their $2.17 billion market cap in the beginning of 2000. Who really feels safe that SGI will be around another 3-5 years to handle their service contract? When you’re buying a > $1 million computer, why would you buy it from a company which appears to be less than a year away from Chapter 11? ”
you are not a financial analyst, are you?
sgi did some mismanagement with nt systems but now have picked up pace with linux systems. take up any financial measure and you will see the growth rate has recently increased. so they are not filing a chapter 11 in a year or so. sgi patches are available in different trees and many of them are actively being pushed into mainline. architecture specific stuff won’t get into mainline without significant changes. that’s its nature.
latency and throughput are balanced measures and we will always have multiple competing coders and systems inside linux. that’s its growth path and it’s proven very strong
regards
Jess
Cheap and redundant hardware… breaks down, sometimes catastrophically (I define catastrophically as “taking damn near everything else with it”). Sometimes a whole line of products is irreparably flawed and you’ve bought 30 of them and they all DIE. Software fault tolerance features can buy you some insurance against losing everything all at once, and sometimes, whatever you lost is too important to wait for restoring from backups.
–JM
“As for Linux zealots @ Bascule: you have to admit there are zealots of all shapes and colours out there…”
Definitely, although the zealots are probably a large part responsible for driving OS forward, are they not? You can’t have a religion with only a bunch of priests and no religious followers.
It is very nice to see SGI pushing the envelope of scalability with Linux, too bad SGI bet on the wrong horse again (Itanic), which appears to be tanking big time (even Intel and HP are losing faith). I wonder how far it is going to take SGI, maybe they should have kept on developing MIPS after all…
Bascule sez:
“SGI has lost their market. Macs and Windows machines are the new graphics workstations.”
Good points about SGI.
Apple could kill SGI’s workstation market with 3 64 bit compiles: OS X, Shake, and Final Cut Pro. Call it the “G5 Special Edition”.
The fact of the matter is that SGI charged too damn much for their hardware for years. Granted, in the mid 1990s, nobody could touch what an SGI machine could do.
But now? A business would want to buy a Fuel or an Octane or a Tezro why? A maxxed out dual G5 offers a considerably better cost to performance ratio.
SGI’s switching to Intel to supply processors is a good idea, but probably too little, too late.
> But now? A business would want to buy a Fuel or an Octane or a Tezro why? A maxxed out dual G5 offers a considerably better cost to performance ratio.
You would buy an Octane/Fuel/Tezro because Irix can offer you extreme scalability without recompiles, from a single CPU to potentially hundreds of CPUs, without giving you any problems. Even though Apple has a very good value proposition for strictly desktop solutions, it is not a good fit for massive graphics/visualization jobs. SGI and its gear are here to stay.
dubhthach (IP: —.bas1.mvw.galway.eircom.net)
Well, regarding sales: from what I remember reading both on realworldtech.com and nekochan.net (IRIX community site), SGI have picked up over 110 customers for the Altrix since it was launched in 2003, comprising over 10,000 Itanium2 processors.
If you can find a definitive link on this it would be nice. I’m afraid I’ll have to assume the figures you’re stating here are slightly dubious… I’m afraid I doubt your memory as you misspell the name of the Altix throughout your post.
As for Linux zealots @ Bascule: you have to admit there are zealots of all shapes and colours out there; if anything I find the FreeBSD ones a lot more vocal on this site, but then that’s just my opinion.
Yes, but I think you’ll find the average Solaris zealot a bit more familiar with enterprise computing than your average Linux zealot. I don’t mind zealotry so much as people posting on issues which are clearly out of their league… and attempting to apply the rationale of a home owner of a small number of computers and the roles of those systems to enterprise computing tasks.
someone (IP: —.pgcc.edu)
Cheap and redundant hardware… breaks down, sometimes catastrophically (I define catastrophically as “taking damn near everything else with it”). Sometimes a whole line of products is irreparably flawed and you’ve bought 30 of them and they all DIE. Software fault tolerance features can buy you some insurance against losing everything all at once, and sometimes, whatever you lost is too important to wait for restoring from backups.
Fault tolerant software increases software complexity and thus propensity towards software failure. Can you point to any enterprise caliber database software (i.e. not Google’s periodically updated but largely static database, which only supports extremely limited keyword-based querying) that’s meant to be run in the way you seem to imply, i.e. distributed across a number of clustered low-end commodity systems? I’m afraid there’s simply nothing available… database replication only adds scalability for queries which do not alter the database. An enterprise computing task will typically process an enormous volume of queries which modify database contents, meaning that a single system image is required in order to provide all threads on all processors equal access to database contents, both in memory and disk storage. And frankly, from a throughput standpoint, gigabit Ethernet, at ~125MB/s, cannot compete with Sun’s FirePlane Interconnect at 9.6GB/s.
Jess a.k.a. Anonymous (IP: 61.95.184.—)
you are not a financial analyst, are you?
No, but I doubt you are looking at SGI’s financials with anything but a layman’s perspective either, so why mention it?
sgi did some mismanagement with nt systems but now have picked up pace with linux systems.
SGI’s market cap has dropped over $125 million since February 19th, from ~$700 million to ~$575 million (for those of you who can’t do math, that’s $6 million a day). The price of a share of stock is up from late 2003 because they offered a one-for-two reverse split on their stock when the share price was in danger of dropping below $1, which would have resulted in their stock being delisted from the NYSE.
take up any financial measure and you will see the growth rate has recently increased.
If you look at share value alone, yes, it’s up, because they now only have half the shares they used to. You have to look at market capitalization in order to gauge the actual company’s value, and SGI’s market capitalization has been and is continuing to plummet.
so they are not filing a chapter 11 in a year or so.
You’re right, at the current rate their market cap is diminishing their stock will be worthless in about 3 months.
latency and throughput are balanced measures and we will always have multiple competing coders and systems inside linux. that’s its growth path and it’s proven very strong
And this is why Linux will not fit into an enterprise role without extensive vendor tuning. We’re only beginning to see IBM offering Linux running natively on zSeries mainframes, rather than inside the z/VM on top of z/OS.
Anonymous (IP: 144.140.2.—)
It is very nice to see SGI pushing the envelope of scalability with Linux, too bad SGI bet on the wrong horse again (Itanic), which appears to be tanking big time (even Intel and HP are losing faith).
There’s nothing wrong with Itanium except the lingering inability of compilers to fully utilize the architecture. Pricewise an Altix system is much cheaper than a comparable Sun Fire, albeit with a slower I/O architecture and less featureful, scalable, and robust kernel, at least when looking at enterprise features like resource partitioning, using multiple scheduling models (e.g. time share vs fair share) within a single system image, etc. If anything is wrong with the Altix it’s Linux’s inability to fit an enterprise role, not the processors.
I wonder how far it is going to take SGI, maybe they should have kept on developing MIPS after all…
SGI has kept developing MIPS. They released the R16000 over a year ago and will be releasing the R18000 later this year. Certainly these are all incremental improvements over the R10000 design, but the R18000 should be a dual core design much like Sun’s recently released UltraSPARC IV.
You would buy an Octane/Fuel/Tezro because Irix can offer you extreme scalability without recompiles, from a single CPU to potentially hundreds of CPUs, without giving you any problems. Even though Apple has a very good value proposition for strictly desktop solutions, it is not a good fit for massive graphics/visualization jobs. SGI and its gear are here to stay.
Visualization, much like scientific computing, is another area where cluster computing has enormous advantages over a single system image.
Last I heard (i.e. hearsay and conjecture, not definitive) Pixar is in the process of converting its Xeon-based RenderMan cluster to XServe G5s. Pixar, one of the largest 3D powerhouses in the world, never relied on SGI (perhaps because they’re Steve Jobs’ baby) but has made the transition from Sun to Intel and may be in the process of moving to Mac.
The enterprise is where SGI is making their last stand, and judging from their rapidly decreasing market capitalization they seem to be failing.
Yes…yes…SGI is dying! Linux isn’t enterprise ready. And nothing beats Solaris’ scalability and Mac’s G5 price/performance ratio. :rollseyes:
The number of processors supported within a single system image speaks more of the kernel design than of that kernel’s ability to scale to a large number of processors, especially when that kernel (or modifications made to it) are targeting a single architecture.
What is this supposed to mean exactly?
The throughput of the NUMA architecture utilized in the Altix (6.4GB/s) is already significantly less than the throughput of, say, higher end Sun offerings with FirePlane interconnect (9.6GB/s, with an aggregate bandwidth of 172.8GB/s for a SunFire 15k),
No you’re wrong. The figure you are parroting is Sun’s intra node bandwidth. The FirePlane interconnect has a bidirectional link bandwidth of only 4.8GB/s. What is more, Altix has multiple links per node, while Sun only has one link per node. Sun’s on board memory latency is also more than 3 times worse than Altix’s off node latency for its nearest neighbours.
Also Sun’s switch, being a crossbar type is basically a dumb brute force approach that is very expensive to scale, which is why they are stuck at 18 nodes (this 256 processor Altix has 128 nodes).
I suggest you read Sun’s product info page a bit harder next time.
and Linux, being a notoriously latency optimized (i.e. desktop oriented), low throughput kernel in comparison to offerings like Solaris, is highly unlikely to scale well to enterprise tasks.
So you think SGI just thought it might be a cool idea to sell NASA a 512 processor SSI Linux system just in the off chance that it might scale well? Don’t know how you can still be claiming how much better Solaris scales.
And why do you say Linux is notoriously latency optimized? Where is your data? Benchmarks?
Yes, some of the low latency features are compile time options which can be disabled. But Solaris has been proactively optimized for high throughput enterprise computing tasks, to the point that it’s painfully unresponsive to use as a desktop. Linux has not been proactively optimized for enterprise tasks… instead it has been retroactively optimized for such tasks by IBM, HP, and SGI, while others attempt to fine tune the kernel for latency. Latency and throughput are always a tradeoff, and with desktop users trying to tune the kernel for latency at the same time IBM, HP, and SGI are pushing for throughput, it can never approach the degree of fine tuning Solaris has seen for enterprise tasks.
You must be from the marketing department, eh? Of course you wouldn’t care to explain and back up any of your claims but I’ll let it slide because I thought that was funny.
[snip]
The Altix is the only Linux system that can come close to being enterprise-worthy, and it’s the product of a company on its way out, SGI.
Funny you think SGI is on its way out and yet Sun is the greatest thing since sliced bread. Oh, IBM also sells Linux on their POWER4 systems.
Oh, both the POWER4 and Itanium2 processors are far, far better than the US in case you forgot. You’d actually probably need 100 UltraSparcs in order to match 64 Itanium2s. Heh.
SGI has tried to make Linux enterprise worthy, porting over a number of high availability features, resource management features, and debugging features from Irix. But keep in mind these are SGI patches only, and not in the Linux kernel mainline. Certainly you can apply these patches to your own kernel tree (since they are covered by the GPL) but I don’t believe you’ll find them particularly useful as they are targeted towards the Altix architecture.
This is for the 2.4 kernel, mind you. So no, you wouldn’t find many of them particularly useful for the 2.6 kernel.
[SGI is crap]
Funny how you think SGI is just about to go down the shitter, but Sun is the best thing since sliced bread.
When the age of enterprise Linux comes, and we see Linux running on massively scalable enterprise caliber systems, it will be brought by the likes of IBM and HP, not by SGI.
Interesting theory. So what do you think is really running on the Altix then? The biggest single system IBM has is 128 processors (or threads I should say – POWER5), HP is 128. SGI, 512.
> Oh, both the POWER4 and Itanium2 processors are far, far better than the US in case you forgot. You’d actually probably need 100 UltraSparcs in order to match 64 Itanium2s. Heh.
It is probably the other way around: 100 Itanics for 64 UltraSparc III (32 UltraSparc IV)
No you’re wrong. The figure you are parroting is Sun’s intra node bandwidth. The FirePlane interconnect has a bidirectional link bandwidth of only 4.8GB/s. What is more, Altix has multiple links per node, while Sun only has one link per node. Sun’s on board memory latency is also more than 3 times worse than Altix’s off node latency for its nearest neighbours.
The SF15K’s aggregate bandwidth is 9.6GB/sec (4.8GB/sec bidirectional); NUMAlink with protocol 3 is only 6.4GB/sec aggregate (3.2GB/sec bidirectional). First, the SF15k is a big machine made up of 18 boards, not nodes.
Anonymous (IP: 203.173.2.—)
The number of processors supported within a single system image speaks more of the kernel design than of that kernel’s ability to scale to a large number of processors, especially when that kernel (or modifications made to it) are targeting a single architecture.
What is this supposed to mean exactly?
That the number of processors supported by a system is not an indicator of the scalability of that system as a whole. Only with constant time thread selection, a long touted feature of every enterprise operating system, has Linux even been worth consideration for massively multiprocessor systems at all.
No you’re wrong. The figure you are parroting is Sun’s intra node bandwidth. The FirePlane interconnect has a bidirectional link bandwidth of only 4.8GB/s.
Why are you applying a NUMAlink nomenclature to the SunFire architecture? First, there aren’t “nodes”, there are boards, with boards communicating with each other through 18 channels interconnecting 18 slots. FirePlane is a switching, packetized protocol… these are not point to point links between slots, so any board in any of the 18 slots may send a packet to any of the 17 other boards in the system through any of 18 different channels. Yes, the 9.6GB/s figure does come from the fact that there are two 4.8GB/s channels (32 byte datapath * 150MHz = 4.8GB/s), however the aggregate bandwidth of a bidirectional FirePlane channel is 9.6GB/s. Multiply this figure by the 18 channels along the SunFire 15k’s backplane and you get the 172.8GB/s aggregate system bandwidth. I’ve just done all the math for you, do you get it now? Sun’s numbers actually do add up…
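Written out, the multiplication chain is:

$$32\ \text{bytes} \times 150\ \text{MHz} = 4.8\ \text{GB/s per direction}$$

$$4.8\ \text{GB/s} \times 2\ \text{directions} = 9.6\ \text{GB/s per channel}$$

$$9.6\ \text{GB/s} \times 18\ \text{channels} = 172.8\ \text{GB/s aggregate}$$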
What is more, Altix has multiple links per node, while Sun only has one link per node.
Again, any board in a SunFire 15k may use any of the 18 FirePlane channels to communicate with any other board. That’s eighteen channels per board. Is the concept of a “crossbar” escaping you?
Sun’s on board memory latency is also more than 3 times worse than Altix’s off node latency for its nearest neighbours.
Perhaps, but Sun’s throughput is much higher, as I’ve just demonstrated. And for enterprise tasks, throughput is much more important than latency, as was my argument in the previous post.
Also Sun’s switch, being a crossbar type is basically a dumb brute force approach
…an approach that’s extremely beneficial for I/O, as I’ve shown above. Just what is your problem with a crossbar? I don’t see how it could be portrayed in anything except a positive light.
that is very expensive to scale, which is why they are stuck at 18 nodes
Yes, and their solution for the time being has been to double the number of cores on each processor in the UltraSPARC IV-based SunFire 25k, which does result in less per-processor I/O, so those considering the purchase of a SunFire 25k will need to determine whether the tasks they’re throwing at its 144 cores consume more than the FirePlane crossbar’s aggregate 172.8GB/s.
Funny you think SGI is on its way out and yet Sun is the greatest thing since sliced bread.
Let’s review:
Sun’s market capitalization: $14,298,010,000
SGI’s market capitalization: $575,000,000
I’d say SGI is ripe for a hostile takeover…
Oh, IBM also sells Linux on their POWER4 systems.
Yes, their lower end POWER servers can run Linux. But for large enterprise tasks, a role filled by their zSeries mainframes, Linux really isn’t of much use.
Putting aside the consideration that IBM only recently managed to get Linux running natively on zSeries (i.e. not inside the z/VM), and that purchasers of enterprise systems are not willing to wager the millions of dollars they just spent on zSeries mainframe on something as untested in the enterprise as Linux, the only way to partition Linux on zSeries mainframes is through zSeries LPARs, the dearth of which (15 on a z900) was the cause for the creation of the z/VM in the first place, as z/OS partitioning is also limited by LPARs. Until a suitable replacement partitioning system can be implemented on Linux/zSeries, there won’t be much use for Linux on zSeries mainframes.
If you can show that more than a handful of zSeries mainframes outside of IBM are running Linux (outside the z/VM), I’ll change my mind.
> It is probably the other way around: 100 Itanics for 64
> UltraSparc III (32 UltraSparc IV)
Umm, are you joking or did you just pull that figure out from where the Sun shines impossibly brightly?
Take the well respected STREAM metric for interconnect performance for example: http://www.cs.virginia.edu/stream/top20/Bandwidth.html
Sun’s E25K with 72 US-IV processors comes out a bit above the 32-way POWER4, but significantly below a 64-processor IA64. But this is not so much processor power, but interconnect performance. The guy who was trying to rubbish Altix’s interconnect will notice a 512 CPU Altix system has more than 12 times the interconnect performance with only 7 times the number of CPUs.
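For those who haven’t seen it, STREAM’s kernels are just long array loops whose sustained rate is bound by memory bandwidth rather than CPU speed; a rough sketch of its “triad” kernel (my own simplification, not the official harness, which also times copy/scale/add and verifies results):

```c
/* Rough sketch of STREAM's "triad" kernel only -- the official
 * benchmark (the URL above) also times copy, scale and add, repeats
 * each kernel several times, and verifies the results. */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N 20000000L             /* arrays far larger than any cache */

int main(void)
{
    double *a = malloc(N * sizeof *a);
    double *b = malloc(N * sizeof *b);
    double *c = malloc(N * sizeof *c);
    if (!a || !b || !c)
        return 1;

    for (long i = 0; i < N; i++) {
        b[i] = 1.0;
        c[i] = 2.0;
    }

    clock_t t0 = clock();
    for (long i = 0; i < N; i++)
        a[i] = b[i] + 3.0 * c[i];       /* the triad */
    double secs = (double)(clock() - t0) / CLOCKS_PER_SEC;

    /* each iteration moves 3 doubles (2 reads + 1 write) = 24 bytes */
    printf("triad: %.2f GB/s (a[0]=%g)\n", 24.0 * N / secs / 1e9, a[0]);

    free(a); free(b); free(c);
    return 0;
}
```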
Or we could take SPEC results: http://www.spec.org/cpu2000/results/res2004q1/
The best UltraSparc single processor SPECint is 704 for a 1.28GHz USIIIi. A comparable Itanium2 (1.3GHz) gets 1132, and 1404 for a 1.5GHz version. POWER4+ gets 1158. On to SPECfp, and the Sun gets 1063, POWER4+ 1776, Itanium2 2161.
So yeah, the Itanium2 is literally twice as fast as the UltraSparc. Doesn’t matter if you dislike Intel or whatever, it doesn’t change simple facts.
> The best UltraSparc single processor SPECint is 704 for a 1.28GHz USIIIi. A comparable Itanium2 (1.3GHz) gets 1132, and 1404 for a 1.5GHz version. POWER4+ gets 1158. On to SPECfp, and the Sun gets 1063, POWER4+ 1776, Itanium2 2161.
Synthetic benchmarks like SPEC don’t mean a whole lot in the real world besides creating artificial hype. UltraSparc may appear to be lagging behind Itanic, but in reality UltraSparc-powered machines are still handily beating Itanic- and Power4-powered boxen in real world applications — this is why there are many more new UltraSparc deployments than Itanic or Power. Oh, and by the way, 99% of enterprise apps don’t even exist for Itanic and god knows if they ever will, considering the lackluster “success” of IA-64. There are only 12 ISVs for Itanic, for crying out loud!
The number of processors supported within a single system image speaks more of the kernel design than of that kernel’s ability to scale to a large number of processors, especially when that kernel (or modifications made to it) are targeting a single architecture.
What is this supposed to mean exactly?
That the number of processors supported by a system is not an indicator of the scalability of that system as a whole. Only with constant time thread selection, a long touted feature of every enterprise operating system, has Linux even been worth consideration for massively multiprocessor systems at all.
So the fact that SGI chose Linux for a 512-processor system doesn’t arouse the slightest suspicion that Linux might be pretty scalable?
Why are you applying a NUMAlink nomenclature to the SunFire architecture? First, there aren’t “nodes”, there are
Do you know what NUMA is? Non-Uniform Memory Architecture. That is where the term node comes from, not NUMAlink. Are you trying to claim that Sun’s architecture isn’t NUMA? Anyway, call it a board if you like.
Again, any board in a SunFire 15k may use any of the 18 FirePlane channels to communicate with any other board. That’s eighteen channels per board. Is the concept of a “crossbar” escaping you?
They may use any one at a time, yes.
Perhaps, but Sun’s throughput is much higher, as I’ve just demonstrated. And for enterprise tasks, throughput is much more important than latency, as was my argument in the previous post.
Latency is usually more important.
…an approach that’s extremely beneficial for I/O, as I’ve shown above. Just what is your problem with a crossbar? I don’t see how it could be portrayed in anything except a positive light.
An 18×18 crossbar has 324 links, 18 of which can be used at a time.
Yes, and their solution for the time being has been to double the number of cores on each processor in the UltraSPARC IV-based SunFire 25k, which does result in less per-processor I/O, so those considering the purchase of a SunFire 25k will need to determine if the tasks they’re throwing at the SunFire 25k’s 144 cores consumes more than the FirePlane crossbar’s aggregate 172.8GB/s.
Well, their two cores roughly match one of Altix’s CPU cores for speed. Oh, remember that each of Altix’s interconnects is shared between 2 CPUs, while on the 25K they will be shared between 8 CPUs.
Funny you think SGI is on its way out and yet Sun is the greatest thing since sliced bread.
Let’s review:
Sun’s market capitalization: $14,298,010,000
SGI’s market capitalization: $575,000,000
I’d say SGI is ripe for a hostile takeover…
Why would you say that? You seem quite obsessed with market capitalization. Future viability is much more important.
Yes, their lower end POWER servers can run Linux. But for large enterprise tasks, a role filled by their zSeries mainframes, Linux really isn’t of much use.
All their POWER servers can run Linux.
Putting aside the consideration that IBM only recently managed to get Linux running natively on zSeries (i.e. not inside the z/VM), and that purchasers of enterprise systems are not willing to wager the millions of dollars they just spent on zSeries mainframe on something as untested in the enterprise as Linux, the only way to partition Linux on zSeries mainframes is through zSeries LPARs, the dearth of which (15 on a z900) was the cause for the creation of the z/VM in the first place, as z/OS partitioning is also limited by LPARs. Until a suitable replacement partitioning system can be implemented on Linux/zSeries, there won’t be much use for Linux on zSeries mainframes.
If you can show that more than a handful of zSeries mainframes outside of IBM are running Linux (outside the z/VM), I’ll change my mind.
So if Linux doesn’t run on more than a handful of zSeries outside IBM, then according to you it is not ready for the enterprise? How about I don’t show you anything and you make up your own mind. I really don’t care to change it.
Synthetic benchmarks like SPEC don’t mean a whole lot in the real world besides creating artificial hype. UltraSparc may appear to be lagging behind Itanic, but in reality UltraSparc-powered machines are still handily beating Itanic- and Power4-powered boxen in real world applications — this is why there are many more new UltraSparc deployments than Itanic or Power. Oh, and by the way, 99% of enterprise apps don’t even exist for Itanic and god knows if they ever will, considering the lackluster “success” of IA-64. There are only 12 ISVs for Itanic, for crying out loud!
Bullshit. The final SPEC number is based on a combination of the performance of a lot of different “real world” applications. Still pretty arbitrary, but to say that Itanium *doubling* US means nothing is just sticking your head in the sand. But you seem to know of many real world applications where UltraSparc machines are beating Itanium and POWER4, so show me some. Show me just *one*.
Why are you applying a NUMAlink nomenclature to the SunFire architecture? First, there aren’t “nodes”, there are boards, with boards communicating with each other through 18 channels interconnecting 18 slots. FirePlane is a switching, packetized protocol… these are not point to point links between slots, so any board in any of the 18 slots may send a packet to any of the 17 other boards in the system through any of 18 different channels. Yes, the 9.6GB/s figure does come from the fact that there are two 4.8GB/s channels (32 byte datapath * 150MHz = 4.8GB/s), however the aggregate bandwidth of a bidirectional FirePlane channel is 9.6GB/s. Multiply this figure by the 18 channels along the SunFire 15k’s backplane and you get the 172.8GB/s aggregate system bandwidth. I’ve just done all the math for you, do you get it now? Sun’s numbers actually do add up…
Oh, one other thing, you are WRONG. The FirePlane switch connects 18 bidirectional 4.8GB/s channels. That’s why it is called 18×18.
The 172.8 number you keep getting is what they quote for total memory bandwidth. This is when all CPUs access their board (node) local memory. The FirePlane interconnect bandwidth is much lower. So get your facts straight next time.
“Last I heard (i.e. hearsay and conjecture, not definitive) Pixar is in the process of converting its Xeon-based RenderMan cluster to XServe G5s. Pixar, one of the largest 3D powerhouses in the world, never relied on SGI (perhaps because they’re Steve Jobs’ baby) but has made the transition from Sun to Intel and may be in the process of moving to Mac.”
Funny… That’s not what I thought when I saw the Finding Nemo ending credits.
> But you seem to know of many real world applications where UltraSparc machines are beating Itanium and POWER4, show me some. Show me just *one*.
Why one? I can give you a whole bunch. A few of the most recent ones:
-Sun Fire 20K World Record SAP R/3 Enterprise
-Sun Fire E4900/6900 World Record Manugistics Fulfillment
-Sun Fire 6800 Server World Record with PeopleSoft 8 Payroll
-Sun Fire 15K Server World Record on Spec OMPL2001 benchmark
-Sun Fire servers are leaders for 100GB, 300GB and 1000GB Sybase IQ benchmark pretty much across the board
Is that real-world enough for you? And those are just records; I’m not even talking head-to-head competition. Sun servers are cheaper and better for most enterprise apps than any of the Itanic or Power stuff. And by the way, the Altix should not even be compared to Sun servers, since it is not even targeted at the enterprise market, being for the most part an HPC machine for technical and scientific computing.
Why one? I can give you a whole bunch. A few of the most recent ones:
-Sun Fire 20K World Record SAP R/3 Enterprise
-Sun Fire E4900/6900 World Record Manugistics Fulfillment
-Sun Fire 6800 Server World Record with PeopleSoft 8 Payroll
-Sun Fire 15K Server World Record on Spec OMPL2001 benchmark
-Sun Fire servers are leaders for 100GB, 300GB and 1000GB Sybase IQ benchmark pretty much across the board
Is that real-world enough for you? And those are just records; I’m not even talking head-to-head competition.
No, that is not real-world enough for me, because getting a record doesn’t mean you’re better than anyone at all. Talk about head-to-head comparisons, please.
And by the way, the Altix should not even be compared to Sun servers, since it is not even targeted at the enterprise market, being for the most part an HPC machine for technical and scientific computing.
Oh yeah, my calculator is the most powerful supercomputer in the world. Just don’t compare it to anything faster than it.
Seriously, why?
>Take the well respected STREAM metric for interconnect performance for example:
>http://www.cs.virginia.edu/stream/top20/Bandwidth.html
>Sun’s E25K with 72 US-IV processors comes out a bit above the 32-way POWER4, but significantly below a 64-processor IA64. But this is not so much processor power, but interconnect performance. The guy who was trying to rubbish Altix’s interconnect will notice a 512 CPU Altix system has more than 12 times the interconnect performance with only 7 times the number of CPUs.
What’s really fascinating about these STREAM results is the performance of the vector processing “dinosaurs” from Cray and NEC. At 32 procs the NEC SX-7 does almost as well as the Altix, and at 4 (!) processors the Cray X-1 is not terribly far behind. As I recall, the X-1 has 48 GB/s of memory bandwidth to each processing element, which probably means they have around 30 Rambus channels per PE (just a guess). The PEs are mounted 4 to an IBM MCM and the machine runs a Cray Unix variant. The Altix is cool, but in my estimation, this is probably the most advanced computing device on the planet right now.
Point is, you can argue about I/O bandwidth, but what must be kept in mind is the workload. The Fire 25K is often deployed against a workload that isn’t awfully memory intensive, whereas the Altix, if used for technical computing, is often going to be depending on memory bandwidth. In which case its ass will often be kicked next to the Cray, since (remember) it’s only got 6.4 GB/s to RAM to divide between each set of 2 Itaniums (2 sets per board). In this case, things depend entirely on how parallelizable the workload is.
The advantage of >95% CPU utilization per PE that the Cray gets on most correctly coded workloads will probably never be matched by a single-core Itanium.
As for Linux, well, I just don’t get the appeal to an enterprise customer buying big iron right now. Linux as of today will never beat a proprietary Unix on TCO, and why one would wish to dump a decade of RAS improvements added to proprietary Unices for a system which is unproven in the enterprise, when millions are at stake just for the hardware and downtime could cost tens of thousands per hour, is beyond me. Unless, of course, you feel sorry for IBM, SGI, and HP and want to help them cut OS development R&D costs.
> No that is not real world enough for me because getting a record doesn’t mean you’re better than anyone at all.
LOL. For your information “world record” by definition means that you’re better than anyone else in the world.
Wow Bascule, looks like everyone’s on your ass :)
You said – “But Solaris has been proactively optimized for high throughput enterprise computing tasks, to the point that it’s painfully unresponsive to use as a desktop.”
Well, I installed Solaris 9 on my new SparcStation 5 (170 MHz TurboSparc, 64 megs RAM, 2.1 gig HD) and while not blazingly fast, it’s not “painfully unresponsive.” Apps take a bit to load, but the mouse never stutters like it used to on Linux or FreeBSD, and I could click on something in CDE with reasonable response time. It didn’t feel that clunky at all, but sure, that was with CDE. GNOME or KDE would be a different story I’m sure, but I’m pleasantly surprised with how it feels as a desktop so far.
This is off-topic, but where can I go to get help with a dead Sun Blade 60? Sigh, I hope I didn’t kill it; it won’t boot.
LOL. For your information “world record” by definition means that you’re better than anyone else in the world.
Get a grip and just think for yourself for a minute. It means the configuration that has been tested beats all other configurations that have been tested.
I don’t care if Sun can pour millions into building a system to get a “world record”. Show me UltraSparc vs Itanium in a fair comparison, that’s all I want.
Oh, one other thing, you are WRONG. The FirePlane switch connects 18 bidirectional 4.8GB/s channels. That’s why it is called 18×18.
The 172.8 number you keep getting is what they quote for total memory bandwidth. This is when all CPUs access their board (node) local memory. The FirePlane interconnect bandwidth is much lower. So get your facts straight next time.
You are wrong. The bandwidth of the Fireplane bus is 9.6 GB/s.
http://www.sc2001.org/papers/pap.pap150.pdf page 2.
The relevant Sun Fire column from the table on page 2:
Year: 2001
Processor: UltraSPARC-III
Processors: >64
Processor clock: ≥750 MHz
System clock: 150 MHz
Address topology: Broadcast + point-to-point
Snoop rate: 150 million/sec
Max data bandwidth per address bus: 9.6 GBps
Address buses: 18
Max total data bandwidth: 172 GBps
Data path width: 32 bytes
Data topology: Switches
>And by the way, the Altix should not even be compared to Sun servers, since it is not even targeted at the enterprise market, being for the most part an HPC machine for technical and scientific computing.
>Oh yeah, my calculator is the most powerful supercomputer in the world. Just don’t compare it to anything faster than it.
>Seriously, why?
Because there is more than one kind of “fast”. Throughput, single-thread performance, memory bandwidth, IO capabilities, etc. Oftentimes tradeoffs have to be made, and thus the “faster” machine depends on the workload.
At any rate, the Altix and the Fire are most definitely NOT targeted towards the same market.
> I don’t care if Sun can pour millions into building a system to get a “world record”. Show me UltraSparc vs Itanium in a fair comparison, that’s all I want.
Actually the vibe I’m gathering is that Sun more often than not is winning on price/performance over Itanium and even Xeon, and not just on raw numbers. The Sybase IQ benchmark is a good example:
http://www.sun.com/smi/Press/sunflash/2003-06/sunflash.20030624.1.h…
http://www.sc2001.org/papers/pap.pap150.pdf page 2.
From that paper, I quote “The Board Data Switch is a 3×3 crossbar that connects the two halves of the board to the off-board 4.8GB/s Fireplane switch port.” on page 11. Also, see the schematic on page 10.
The 9.6GB/s figure on page 2 is “Max data bandwidth per address bus”. I think this is just misleading wording that you have tripped on. This is just the peak local memory bandwidth per board, as there is one interconnect address bus per board. This figure is quoted because it is this address bus that has to carry all the snoop traffic for that memory.