Reduce your operating costs for complex environments by creating efficient and flexible virtualisation capabilities. Nigel Griffiths describes the benefits of the IBM POWER5 servers and provides examples of how to set up the environment for pSeries, p5, and eServer OpenPower systems.
No, this isn’t the same thing as Solaris zones.
I thought Solaris zones are VMs, but they don't use their own kernel, so they are *like* VMs but aren't?
I vaguely recall hearing of low performance from the IBM virtualisation when compared to a standard PC running VMware or Virtual Server.
Does anyone with experience with the IBM kit have input regarding the performance of this sort of virtualisation setup?
> I vaguely recall hearing of low performance from the IBM virtualisation when compared to a standard PC running VMware or Virtual Server. Does anyone with experience with the IBM kit have input regarding the performance of this sort of virtualisation setup?
Yes, the hypervisor is quite wasteful if you've got many micropartitions per CPU with Power5. According to IBM's documentation you will lose about 40% of the CPU to hypervisor overhead and contention. If you need more information on this, check out IBM's documentation:
http://www.redbooks.ibm.com/redpieces/pdfs/sg245768.pdf, page 116.
IBM obviously oversells Power5; it is the best thing in computing since sliced bread if you listen to IBM, but in reality it is not that good.
Hmmm, right after you posted, the link went down. I smell a conspiracy… 40% in the hypervisor is horrible. IBM should be ashamed of themselves.
Do you think IBM removed the performance numbers? Witness another similar doc…
http://www.redbooks.ibm.com/redbooks/pdfs/sg246478.pdf
Look on page 176 (192 of 744 in Acrobat): there are “lparstat -h” and “lparstat -H” commands that will show the percentage of time spent in the hypervisor. It would be interesting to see this number on a POWER5 machine with 1 physical CPU (2 logical cores, if I understand the architecture properly).
Even with SMT I would imagine that the hypervisor overhead will be quite high.
If you have a true 2-CPU machine (4 logical cores, if I understand the architecture properly), then the hypervisor can run on one of the CPUs without incurring such a high overhead. Of course, IBM never publishes numbers…
Another interesting note: one can turn SMT on and off dynamically using the smtctl command.
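Those hypervisor-time figures are easy to pull out of `lparstat -h` output programmatically. A minimal sketch in Python; the sample output layout and its numbers below are illustrative placeholders, not measurements from a real POWER5 box:

```python
# Parse the %hypv column (percentage of time spent in the hypervisor)
# out of sample `lparstat -h` output. The sample text is illustrative.
SAMPLE = """\
%user  %sys  %wait  %idle physc  %entc  lbusy   app  vcsw phint  %hypv hcalls
-----  ----  -----  ----- -----  -----  -----  ----  ---- -----  ----- ------
  0.1   0.3    0.0   99.6  0.01    1.1    0.3  1.98   291     0    0.5     18
"""

def hypervisor_pct(output: str) -> float:
    lines = [l for l in output.splitlines() if l.strip()]
    header = lines[0].split()      # column names from the header row
    data = lines[-1].split()       # most recent sample row
    return float(data[header.index("%hypv")])

print(hypervisor_pct(SAMPLE))  # 0.5
```

On a real box you would feed it the captured command output instead of the hard-coded sample; the column-name lookup avoids depending on exact column positions.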
Has anyone done any real-world tests (forget SPEC) using the new POWER5 processors? Are they really worth that much money? Or is IBM just milking ignorant customers (think US govt, which needs 20-yr contracts)?
> Has anyone done any real world tests (forget SPEC) using the new POWER5 processors? Are they really worth that much money?
The SPEC numbers published by IBM are pretty inflated, and they used a number of dirty tricks to artificially inflate the benchmark numbers. One of the classic tricks in IBM's book is to use Power MCMs (multi-chip modules), turn off three out of four cores on the die, and give all of the available cache to just one core, thus artificially giving more cache to a processor than it would otherwise have. You can safely reduce IBM's SPEC numbers by about 10 to 20 percent to get the real-world performance. And oh yeah, they are damn expensive, especially if you outfit the system with IBM DDR2 RAM; speaking of breaking the bank…
The Power5 gets the highest TPC-C numbers, and the test configuration uses all the processor cores.
The IA-64 machine with the same number of cores gets 1/3 of the performance.
> The Power5 gets the highest TPC-C numbers, and the test configuration uses all the processor cores.
TPC-C is the single most exploited database benchmark and cannot really be used as a measure of real-world performance. TPC-C has been so badly abused by vendors that you can safely disregard it. Even the TPC council has acknowledged that TPC-C is mostly garbage, and it is soon to be replaced by TPC-E (by the end of this year). A better, more realistic measure of database performance is TPC-H. Besides all this, database performance is not only about processor performance. Actually, a fast and symmetrical interconnect and memory bandwidth are much bigger contributors to good performance on database workloads. And IBM systems do not perform so well on either of the above-mentioned factors. This is why IBM screams a lot about stupid SPEC numbers but tells you very little about interconnect performance.
> The IA-64 machine with the same number of cores gets 1/3 of the performance.
IBM will consistently tell you that you can get 2 times better performance from p5 than from IA64 or SPARC, only they will always compare their latest-generation p5 processors to previous-gen SPARC or Itanic to dupe ignorant suckers into believing that p5 is some sort of quantum leap, and to coax $$$ for it.
> only they will always compare their latest generation p5 processors to previous gen SPARC or Itanic
so where is the *current* generation of SPARC or Itanic that gets the same performance level as the p5??
> so where is the *current* generation of SPARC or Itanic that gets the same performance level as the p5??
It depends on the workload you're talking about; plus, what exactly do you mean, performance or price/performance? In non-clustered configurations SPARC is consistently better than p5 on TPC-H price/performance and sometimes even raw performance (the 3000 GB bench, for instance). Power5 is a very good chip, no question about it, but IBM goes way overboard with its claims about the so-called “jaw dropping” performance. Is it fast? Yes. Is it that good? Not really.
I’m amazed at how much work goes into setting this up for Linux. I have played around with it on a zSeries before, but was not involved in the install. Most shops these days that have a mainframe are being pushed into upgrading to a zSeries. The interesting part is that most shops upgrade because of huge discounts if they install and run SuSE in at least 1 LPAR. So of course customers agree and just do it to get the discount. Maybe that’s how IBM gets the numbers up on the install base :) The hypervisor definitely adds a lot of overhead to your system and is a common reason for customers not to use mainframes for databases like Oracle or SAP, because of the latency issues. The other thing to keep in mind is that the slicing and dicing of resources was optimized for batch-processing workloads. So it’s great for MVS/OS390 and VSE/VM… not so great for AIX and Oracle.
Here’s a run down of the virtualization technology from other big vendors:
Sun/Fujitsu: Domaining for hardware-level partitioning. This consists of logging into an SC and assigning a system board, I/O cage/slot, and power grid to a domain. The SC talks to the SunFire backplane, which is a large switch fabric (think of the HDS 9970 backplane), connects the components to their own channel, and there you go ;) That takes about 20 minutes to set up before you can start jumpstarting a domain. You can do DR (dynamic reallocation) of components while the domains are running. The big limitation on the Sun version of this is that you cannot break up a system board among domains, so you have a minimum of 2-4 CPU sockets in each domain. This is not the case on the Fujitsu PrimePower line, which uses a similar technology but with more flexibility. No middle man involved in virtualizing the hardware. Now you can do software virtualization with Solaris 10, using Zones. You can have 8192 zones on each CPU; the only limit is memory ;) With Resource Management, you can slice and dice your resources inside a zone or domain. Fujitsu definitely beats the pants off of Sun on performance, which is why they are joining forces on the APL line to replace the high-end SunFire line. Check this out:
http://www.networkworld.com/newsletters/servers/2005/0516server2.ht…
http://www.fujitsu-siemens.com/products/unix_servers/benchmarks.htm…
http://www.sun.com/aboutsun/media/presskits/benchmarks/
HP: With the current line of products you can do nPars for hardware partitioning. Pretty similar in concept to what other vendors do (group some CPUs, memory, and I/O cages). vPars allow you to take an nPar and divide it out another level, and enable you to run multiple HP-UX instances inside one nPar. This is at the firmware level inside an nPar, so it's not like a zone in Solaris. The big disadvantage here is that even though you can dynamically reallocate components among your vPars, you are limited to what was connected when the nPar was booted. With HP's Process Resource Manager (PRM), you can do resource management of your vPars.
> 2-CPU machine (4 logical cores if I understand the architecture properly)
In IBM-speak, 2 physical Power5 chips is 4 cores, which is 4 CPUs.
> Yes, the hypervisor is quite wasteful if you've got many micropartitions per CPU with Power5. According to IBM's documentation you will lose about 40% of the CPU to hypervisor overhead and contention. If you need more information on this, check out IBM's documentation:
I know virtualization is the standout feature, but you don't have to use it to an extreme. On the flip side, you can always run with 1 giant LPAR. Then almost all of your hcalls happen during initialization and you pretty much stay out of the hypervisor. Also, you can take the Power4 route of dedicating adapters and disks to your LPARs.
PHype was first used on the Regattas and has slowly made its way down to the smaller machines. VMware started out on small x86 boxes. When have you ever seen an x86 box with the potential to have 240 I/O adapters? When you compare the two products, you have to take scalability into account too, not just performance.
> only they will always compare their latest-generation p5 processors to previous-gen SPARC or Itanic
So Sun’s Niagara and Rock, and Intel’s Montecito, will come out and beat the current Power5s. Guess what: IBM will have POWER5+ for Niagara and Eclipz for Rock. The backers of Itanium and SPARC pull the same crap.
> I’m amazed at how much work goes into setting this up for Linux. I have played around with it on a zSeries before, but was not involved in the install. Most shops these days that have a mainframe are being pushed into upgrading to a zSeries. The interesting part is that most shops upgrade because of huge discounts if they install and run SuSE in at least 1 LPAR. So of course customers agree and just do it to get the discount. Maybe that’s how IBM gets the numbers up on the install base :) The hypervisor definitely adds a lot of overhead to your system and is a common reason for customers not to use mainframes for databases like Oracle or SAP, because of the latency issues. The other thing to keep in mind is that the slicing and dicing of resources was optimized for batch-processing workloads. So it’s great for MVS/OS390 and VSE/VM… not so great for AIX and Oracle.
There are lots of differences between the zSeries hypervisor and the pSeries hypervisor. Not using the pSeries hypervisor would be like not using millicode or LPAR on tsaur. On a Power5 box, you have to use the hypervisor. This is different from Power4, where the hypervisor was optional and you could just boot AIX in SMP mode on bare metal.
> So Sun’s Niagara and Rock, and Montecito will come out and beat the current Power5s. Guess what: IBM will have POWER5+ for Niagara and Eclipz for Rock
Dream on, brother, dream on… First off, Niagara is coming out this year and will be at least a few times faster than Power5 on network-facing workloads (not databases, etc.). SPARC64 already has performance comparable to Power5 on data-facing workloads. As for Rock, it is a long way off, so it is hard to draw any projections whatsoever. Sun is actually taking an interesting strategy of developing specialty processors targeting specific workloads. Niagara, for instance, targeted at throughput-oriented network-type workloads, will beat the pants off the Power5 while being a few times cheaper and more power-efficient. For data-facing workloads (databases, etc.) SPARC64/APL will do the job for the time being and will then be replaced by Rock.
It was IBM’s choice to come up with a “one processor fits all” strategy, and they may well pay for it. Chances are that specialty-type processors will be more efficient/cheaper/faster at their target workloads than general-purpose ones.
As for Montecito and the rest of the Itanic garbage, I think it is pretty hopeless already and will either die or be relegated to a legacy segment within a few years.
> Dream on, brother, dream on… First off, Niagara is coming out this year and will be at least a few times faster than Power5 on network-facing workloads (not databases, etc.). SPARC64 already has performance comparable to Power5 on data-facing workloads.
prove it
> prove it
Prove what? That Niagara is going to be faster than p5 at parallel throughput workloads? It is very simple, really: according to Sun’s very well publicised releases, the first-gen Niagara will be about 15 times faster than the older UltraSPARC III. Assuming that the latest p5 is faster than the older USIII by a factor of about 2 or 3, then doing the math you can see that Niagara will be at least a few times faster than Power5 at network-facing workloads. BTW, according to Niagara engineers posting at blogs.sun.com, Niagara-based servers are already being tested by big customers such as eBay. It is coming soon!
> It is very simple, really: according to Sun’s very well publicised releases, the first-gen Niagara will be about 15 times faster than the older UltraSPARC III.
Andy Ingram, vice president of _marketing_ for Sun Microsystems Inc.’s Scalable Systems Group, said
“Niagara will offer 15 times the throughput of UltraSparc IIIi.”
which I read here
http://www.eweek.com/article2/0%2C1759%2C1790939%2C00.a…
I’d believe you if I could only find the same number from a reputable source. I think Andy’s position at Sun and his engineering credentials sum up his credibility:
“Andy holds a B.S. in business from the University of Colorado, and earned an M.B.A. from the Graduate School of Management at the University of California, Los Angeles. He is also a Certified Public Accountant with career experience at Deloitte Haskins and Sells.”
from
http://www.sun.com/aboutsun/media/bios/bios-ingram.html
And what type of degree do you have, and from where?