“It’s almost mind-boggling that every 18 months or so, regular as clockwork, you can get twice as much computing power. In fact, it is mind-boggling, and these high-octane servers can cause some hassles in the data center. In this tip, I explore the standard fixes applied when server shoppers go overboard. These include server consolidation and grid computing. Then, I explore another option: virtualization with Xen.”
Not really much of an exploration. I’d hoped to see more info on the new version of Xen; I hear it’ll start to get the same capabilities as VMware’s ESX Server.
What features in ESX would you like to see in Xen?
> What features in ESX would you like to see in Xen?
I’m not the original poster, but…
I still need to have a look at what’s in the latest version of Xen, but the following would be nice:
-Content-based sharing of memory pages.
-Resizing of memory allocations
-Multi-CPU per domain support
-Device drivers built into the VMM (IIRC Xen uses a separate IO domain).
-additional resource controls
> -Content-based sharing of memory pages.
Is being looked at by various people, including myself. Adding it is not a simple win in all cases, though.
> -Resizing of memory allocations
That’s been possible for a pretty long time now, using the “balloon driver” (similar to the one VMware uses).
> -Multi-CPU per domain support
Xen 3.0 supports something like 32 virtual processors per guest, I believe. You can hot-plug virtual CPUs to control the level of parallelism a guest will try to use (e.g. hot plug some more CPUs after live-migrating a guest to a larger machine).
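For anyone who wants to try the hot-plug bit, it’s just a management operation from dom0. A minimal sketch (Python driving the xm tool), assuming a Xen 3.x host; the domain name and vCPU count here are made up for illustration:

import subprocess

def set_vcpus(domain: str, count: int) -> None:
    # Hot-(un)plug virtual CPUs in a running guest by shelling out to xm.
    subprocess.run(["xm", "vcpu-set", domain, str(count)], check=True)

# e.g. after live-migrating "webserver" to a bigger box, let it spread over 8 vCPUs
set_vcpus("webserver", 8)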
> Device drivers built into the VMM (IIRC Xen uses a separate
> IO domain).
Now that’s not going to change :-) Device drivers in the VMM were a Xen 1.x feature. Moving them into a virtual machine in 2.x allows us to support most of the random hardware Linux does (Xen even works pretty well on laptops, now) and offers the possibility of isolating driver faults in restartable virtual machines (an experimental setup was able to achieve a 150-300ms outage from a network driver crash, then restore normal service).
Performance is still strong, so I’m not sure why you’d want drivers in the hypervisor…?
> -additional resource controls
Since the Xen 2.0 device model, you can use standard Linux traffic control for networking (there’s a small sketch below). There’s now also another CPU scheduler, which can be used to give different kinds of guarantees on CPU time. I guess there’s always more controls that could be done, though :-)
As an aside, you can do all sorts of crazy things like Ethernet bonding, RAID disks, etc., transparently to the guest (which just sees a “virtual net/block” device).
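To make the traffic-control point concrete, here’s a minimal sketch of rate-limiting one guest’s virtual NIC from dom0 with the standard Linux tc tool. It assumes the usual Xen backend naming (vif<domid>.<devid>); the interface, rate and burst values are just examples, not anything from this thread:

import subprocess

def limit_guest_net(domid: int, devid: int = 0, rate: str = "10mbit") -> None:
    # Attach a token-bucket filter to the guest's backend interface in dom0.
    dev = f"vif{domid}.{devid}"      # assumed backend naming scheme
    subprocess.run(
        ["tc", "qdisc", "add", "dev", dev, "root",
         "tbf", "rate", rate, "burst", "10kb", "latency", "50ms"],
        check=True,
    )

limit_guest_net(domid=3)   # cap domain 3's traffic at roughly 10 Mbit/s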
The idea I like most (I haven’t actually used either ESX or Xen 2/3.0, just looked through the documentation) is the ability to move virtual servers around a cluster without having to take the virtual server down. I’ve seen mention of this on the XenSource site, but I’ve not seen more info on it elsewhere.
This is something I want to use in my home network, and I might be able to convince my boss to use it instead of MS Virtual Server.
Yeah, it’s pretty cool :-) That was originally implemented by Ian Pratt, our project leader and XenMaster. He’s usually pretty busy managing the project and designing stuff, but the code he does write is highly cunning.
The migration code transfers the guest’s memory before stopping it; it basically “races” intelligently with the guest (which is still dirtying memory). When it decides there’s no benefit to further precopying, the VM is stopped and remaining dirty state is transferred *very* quickly.
The downtime for the guest during the migration was something < 300ms for a busy Apache server, and about 60ms (!) for an in-use Quake server (the grad students using it at the time didn’t know it had happened ;-). The goal is that you can shift people’s servers around without them caring.
Migration has apparently been tested using runs of thousands of migrations between hosts, so it’s pretty reliable.
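If the pre-copy description sounds abstract, here’s a toy model of the race in plain Python (nothing to do with the real Xen code, and all the rates are invented): each round retransmits the pages the guest dirtied during the previous round, and we stop-and-copy once the dirty set is small.

COPY_RATE = 30_000      # pages/second we can push over the network (assumption)
DIRTY_RATE = 4_000      # pages/second the running guest dirties (assumption)
GUEST_PAGES = 250_000   # total guest memory in pages (assumption)
STOP_THRESHOLD = 2_000  # small enough to send during a brief pause

def precopy_migrate():
    to_send = GUEST_PAGES                    # round 1: send everything
    round_no = 0
    while to_send > STOP_THRESHOLD:
        round_no += 1
        secs = to_send / COPY_RATE           # time spent copying this round
        dirtied = int(DIRTY_RATE * secs)     # pages the guest dirtied meanwhile
        print(f"round {round_no}: sent {to_send}, guest dirtied {dirtied}")
        if dirtied >= to_send:               # not converging any more; give up early
            to_send = dirtied
            break
        to_send = dirtied
    downtime_ms = 1000 * to_send / COPY_RATE
    print(f"stop-and-copy of {to_send} pages -> ~{downtime_ms:.0f} ms downtime")

precopy_migrate()

With these made-up numbers it converges in three rounds to a pause of roughly 20ms, which is the same shape of result as the Apache/Quake figures above.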
Oh, I should mention that ESX has this feature too. Obviously, I can’t see the code but I’d imagine it works in a reasonably similar way.
Good article by Linux Magazine (USA):
http://www.linux-mag.com/content/view/2264/
That is a nice article…. You should have submitted it for posting!!!
>> -Multi-CPU per domain support
> Xen 3.0 supports something like 32 virtual processors per guest, I believe. You can hot-plug virtual CPUs to control the level of parallelism a guest will try to use (e.g. hot plug some more CPUs after live-migrating a guest to a larger machine).
Very impressive! IIRC ESX server doesn’t even support this many per guest.
> Performance is still strong, so I’m not sure why you’d want drivers in the hypervisor…?
Strong is a relative term – a recent paper (can’t recall the exact title; at least one of the authors was from HP Labs) found that there was significantly higher networking overhead in Xen compared to native Linux, and the overhead was even higher using a separate IO domain. I do realise that having an IO domain provides you with support for a much wider range of devices, so it does make a lot of sense. ESX server supports fewer devices than Workstation, but performance is higher because the drivers are included in the VMM.
> I guess there’s always more controls that could be done, though :-)
Disk bandwidth?
>> -Resizing of memory allocations
> That’s been possible for a pretty long time now, using the “balloon driver” (similar to the one VMware uses).
Nice. Not quite LPAR dynamic reallocation of memory (OS still “sees” memory allocation as constant and this quantity is a ceiling for reallocation), but IIRC supporting that is more of an OS implementation issue which is probably out of your hands anyway. I don’t think ESX server even supports LPAR-like reallocation.
>>> -Multi-CPU per domain support
> Very impressive! IIRC ESX server doesn’t even
> support this many per guest.
I think ESX does 4 per guest at the moment. I don’t have any performance numbers on how well a 32-way guest would work, but it’s certainly possible. SMP virtual machines have interesting scheduling issues – what if you deschedule a vCPU whilst it holds a spinlock? Even if you schedule other vCPUs, they might not make progress without that lock. Interestingly, initial experiments suggest that’s not so much of a problem. There are some other things (e.g. not doing spin locks if somebody hot-unplugs all but one of your CPUs) that would be useful in native Linux too.
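The lock-holder preemption problem is easy to demonstrate outside a hypervisor. A toy illustration in plain Python (nothing Xen-specific): the thread standing in for the lock holder is “descheduled” for 50ms, and the other “vCPU” burns that whole time spinning.

import threading, time

lock_held = True             # "vCPU 0" holds the guest spinlock...
DESCHEDULED_FOR = 0.05       # ...and the hypervisor takes it off the CPU for 50ms

def spinning_vcpu(name):
    # A real guest would spin on a word in shared memory; here we spin on a flag.
    spins = 0
    start = time.monotonic()
    while lock_held:
        spins += 1
    waited = time.monotonic() - start
    print(f"{name}: spun {spins} times, wasted ~{waited * 1000:.0f} ms of CPU")

def reschedule_holder():
    global lock_held
    time.sleep(DESCHEDULED_FOR)   # the holder is off the physical CPU this long
    lock_held = False             # once it runs again, it drops the lock

t = threading.Thread(target=spinning_vcpu, args=("vCPU 1",))
t.start()
reschedule_holder()
t.join()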
>> Performance is still strong, so I’m not sure why
>> you’d want drivers in the hypervisor…?
> Strong is a relative term – a recent paper (can’t
> recall the exact title; at least one of the authors was
> from HP Labs) found that there was significantly
> higher networking overhead in Xen compared to native
> Linux, and the overhead was even higher using a
> separate IO domain. I do realise that having an IO
> domain provides you with support for a much wider
> range of devices, so it does make a lot of sense.
> ESX server supports fewer devices than Workstation,
> but performance is higher because the drivers are
> included in the VMM.
Well yes, although Workstation is another architecture again with its own issues. It’s very different to the Xen model, though. I’ve seen the HP paper – it’s all correct in terms of where the IO model hurts but it focuses on some relatively bad cases. For comparison the Xen 2.0 IO architecture papers show you can get performance comparable to the Xen 1.x model – the CPU utilisation is higher though, so if your workload is really demanding you’ll need a relatively big machine.
For most workloads this isn’t a problem; our test of small packets at line rate on gigabit Ethernet is pretty much the pessimal case. People who have this kind of workload are probably doing something very scary :-) For people who really need it, various people are working on supporting InfiniBand and other intelligent NICs – these’ll give direct Ethernet to the guest.
If you *trust* your guests, you should just be able to “give” them direct access to a PCI ethernet card and let them handle IO for themselves. This will give good performance without needing an intelligent NIC.
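For what it’s worth, handing a device to a trusted guest is (roughly) a one-line affair in the guest’s xm config, which is itself just Python. This is only a sketch: the exact option name and PCI address format vary between Xen versions, dom0 has to be told to relinquish the device first, and the address below is made up.

# Fragment of an xm guest config file (these are evaluated as Python).
name   = "trusted-guest"
memory = 512
pci    = ['00:19.0']    # give this guest direct access to the PCI NIC at this address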
>> I guess there’s always more controls that could
>> be done, though :-)
> Disk bandwidth?
Yep, it’s been talked about but there’s not been so much demand / interest for this yet. The only trouble is that you’re still at the mercy of the Linux IO scheduler – I think the new pluggable IO schedulers stuff would help here, though.
>> -Resizing of memory allocations
> Nice. Not quite LPAR dynamic reallocation of memory
> (OS still “sees” memory allocation as constant and
> this quantity is a ceiling for reallocation)
Well, not really. You can get around this, though: we pass a parameter to Linux saying “allocate a large enough memory map for 4 gig” (the most 32-bit non-PAE x86 can handle), and then you can subsequently make it as large or small as you want, within the constraints of available RAM. The strictly nice way is for Linux to dynamically resize the memory map – we’ll probably end up using hotplug memory support, once it’s fully baked.
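For completeness, resizing from dom0 is just a management command against the balloon driver. A minimal sketch (Python driving the xm tool), assuming a Xen host; the domain name and sizes are illustrative, and growing only works up to the maximum the guest’s memory map was built for, as described above.

import subprocess

def set_memory(domain: str, megabytes: int) -> None:
    # Ask the balloon driver in the guest to grow/shrink to this target size (in MB).
    subprocess.run(["xm", "mem-set", domain, str(megabytes)], check=True)

set_memory("webserver", 256)    # shrink the guest to 256 MB...
set_memory("webserver", 1024)   # ...and later give it 1 GB back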
As a heavy ESX user, I would like to know what differentiates Xen from ESX? Or, put another way, why should I spend time learning another virtualization product?
Also, assuming that I decide to review Xen, as a newbie, which version should I work with – 2.x or 3.x?
Xen supports a paravirtualised interface that avoids most of the performance drains of virtualisation on x86. For ported OSes this gives you *really close to* native performance for most workloads. See http://www.cl.cam.ac.uk/Research/SRG/netos/xen/performance.html (no comparisons against ESX because that’s against the EULA – you’d have to do your own private benchmarking to compare against it).
Hardware support, wide SMP guests, some nifty robustness / physical partitioning features, and the fact it’s free are other advantages. You’d need to read some of the literature to get a full list…
*However*, if you want to run unported guests on current hardware, you should just buy VMware. When the VT-x / SVM hardware is available from Intel / AMD, you’ll be able to run Windows on Xen. Without those hardware assists (which should be pretty standard in a year or so) you can only run ported guests.
Side note: VMware are also working on a paravirt interface for OSes that are modified to support it. The proposed MS Longhorn hypervisor would support a paravirt interface *but* still require virtualisation-aware hardware, whether or not that interface is used.