Unikernels have demonstrated enormous advantages over Linux in many important domains, causing some to propose that the days of Linux’s dominance may be coming to an end. On the contrary, we believe that unikernels’ advantages represent the next natural evolution for Linux, as it can adopt the best ideas from the unikernel approach and, along with its battle-tested codebase and large open source community, continue to dominate. In this paper, we posit that an up-streamable unikernel target is achievable from the Linux kernel, and, through an early Linux unikernel prototype, demonstrate that some simple changes can bring dramatic performance advantages.
A scientific paper on the subject.
This seems to be based on a “serverless” study using bare containers:
https://dl.acm.org/citation.cfm?id=3103008
However, the application is too small to give a proper comparison. Granted, most services do not need all the intricacies of the whole Linux kernel (such as a filesystem inside a container), but as the work gets more complex, you need to include more and more of the full-fledged OS.
For example, if you are hosting any kind of web service (gRPC/REST/JSON), you need a proper TCP/IP stack, OpenSSL (or a variant), threading/fibers, buffering, and most likely some security isolation. Once you factor these components in, the “lite” kernel is no longer lite enough, especially compared to the examples used.
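To make the dependency point concrete, here is a minimal sketch (my own illustration, not code from the paper or the benchmarks): even the most trivial TCP “hello” service already pulls in the socket API, address handling, and a connection-accept loop, none of which a hello-world-to-stdout benchmark exercises; layering TLS on top would then pull in OpenSSL as well.

    /* hello_tcp.c -- a minimal TCP "hello" service (illustrative sketch).
     * Build: gcc -o hello_tcp hello_tcp.c */
    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <stdio.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int main(void) {
        int srv = socket(AF_INET, SOCK_STREAM, 0);
        struct sockaddr_in addr = { 0 };
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        addr.sin_port = htons(8080);            /* arbitrary example port */
        if (srv < 0 || bind(srv, (struct sockaddr *)&addr, sizeof addr) < 0) {
            perror("socket/bind");
            return 1;
        }
        listen(srv, 16);
        for (;;) {                              /* trivial accept loop */
            int c = accept(srv, NULL, NULL);
            if (c < 0)
                continue;
            static const char msg[] = "hello\n";
            write(c, msg, sizeof msg - 1);
            close(c);
        }
    }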
From that previous article:
“To give the Linux container the best performance possible, we create an optimized container by reducing it to a microcontainer [25] with no libc. In other words, rather than including an entire Linux distribution, the container only includes a statically-linked binary that does not even link libc. The binary is a small application written in inline assembly that directly invokes a system call to output “Hello” on stdout. The container is invoked with runc using a standard config.json and a precreated layered rootfs using overlayfs. We did not set up network interfaces by default.”
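For reference, a no-libc “Hello” along those lines can be written in a few lines of C with inline assembly; this is a sketch of the idea for x86-64 Linux, not the paper’s actual source:

    /* hello.c -- no-libc "Hello" via direct syscalls (illustrative sketch).
     * Build: gcc -static -nostdlib -o hello hello.c */
    void _start(void) {
        static const char msg[] = "Hello\n";
        /* write(1, msg, 6): syscall number 1 on x86-64 */
        register long        rax asm("rax") = 1;
        register long        rdi asm("rdi") = 1;
        register const char *rsi asm("rsi") = msg;
        register long        rdx asm("rdx") = sizeof msg - 1;
        asm volatile("syscall"
                     : "+r"(rax)
                     : "r"(rdi), "r"(rsi), "r"(rdx)
                     : "rcx", "r11", "memory");
        /* exit(0): syscall number 60 on x86-64 */
        rax = 60;
        rdi = 0;
        asm volatile("syscall" : "+r"(rax) : "r"(rdi) : "rcx", "r11", "memory");
    }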
“The comparison unikernel, referred to as ukvm in the experiments, runs a Solo5/ukvm unikernel that prints “Hello” onto the serial console then exits [7]. It is invoked with ukvm-bin, a specialized unikernel monitor generated to run the unikernel [26]. ukvm-bin is modular, and a “hello world” unikernel only needs a serial console to output a string, so that is the only device (i.e. module) set up by the monitor. ukvm-bin runs as a Linux process and loads the unikernel in a new VCPU context via the Linux KVM system call interface.”
[ Slides: https://www.serverlesscomputing.org/wosc2/presentations/s3-serverless-workshop-slides.pdf ]
sukru,
Isn’t one of the intended architectures to use a full-fledged host kernel to provide those features (networking, concurrency), and to virtualize the unikernels so that they don’t have to implement all that themselves?
So instead of having a kernel API that’s called by processes (and multiple processes); you have a hypervisor API that’s called by virtual machines (and multiple virtual machines); and after many years of idiotic “unikernel wankery” you finally realize you’ve done nothing but change the terminology used in the documentation without changing any code?
Brendan,
Well, no, because getting a kernel like Linux down to a unikernel is not “doing nothing”, and it certainly cannot be done without changing some code.
You’ve overlooked my “eventually”.
Think of it as a progression: it starts with “full system emulation”, where the host system emulates a lot of devices and the guest unikernel has full-blown device drivers; that (over time, to improve performance) evolves towards “virtual IO” interfaces; then (over more time, to improve performance further) those “virtual IO” interfaces get higher level and more “kernel API like”; until eventually you end up at “the unikernel is a thin wrapper library that just calls the host kernel”.
Currently we’re still evolving towards the “virtual IO” interfaces; so for now you need a stripped-down kernel and performance is worse than it could be. Eventually, the wrapper becomes so thin that it’s nothing more than cgroups in the host.
Note that what I’m saying here is that (in the long run) unikernels only make sense when they’re running on bare hardware; because (in the long run) anything else ends up being a re-invention of “processes” (which always were virtual machines, with virtual address spaces, threads as virtual CPUs, and a kernel API providing virtual devices via a high-level/abstracted IO model).
In other words, for the only case where unikernels make sense: if you are hosting any kind of web service (gRPC/REST/JSON), you’d need a proper TCP/IP stack, OpenSSL (or a variant), threading/fibers, buffering, and most likely some security isolation in the unikernel itself.
Brendan,
By distributing a program as a unikernel, you give people the option of running that program on bare hardware or on a virtualized machine, without change. You can’t take the same a.out/.exe and run it both on a virtualized machine and on bare hardware; so it’s not the same as a process. It would even be possible for a system to move an application between virtualization and bare hardware on the fly. You cannot do that with a process; you’d have to have the same supported OS and dependencies on all the hardware to move a process between a virtualized and a non-virtualized machine.
A unikernel also focuses the implementation of a program onto a much smaller set of capabilities, instead of exposing the full-blown OS API and then requiring admins to set up the cgroups/namespaces/containers properly (and requiring the program to correctly handle security exceptions). E.g., a program distributed as a unikernel with no networking capability doesn’t require the people who install it to do a full security audit, because the program is built without the vulnerability in the first place.
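To illustrate that contrast (a minimal sketch of my own, not from the paper): with an ordinary Linux process, “no networking” is something the launcher has to arrange at run time, for example by dropping the child into an empty network namespace with unshare(2), whereas in the unikernel the network stack is simply never compiled in.

    /* nonet.c -- run a program with no network access (illustrative sketch).
     * Build: gcc -o nonet nonet.c ; needs root (or a user namespace). */
    #define _GNU_SOURCE
    #include <sched.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(int argc, char *argv[]) {
        if (argc < 2) {
            fprintf(stderr, "usage: %s <program> [args...]\n", argv[0]);
            return 1;
        }
        /* A fresh network namespace contains only a downed loopback
         * device, so the child cannot reach anything. */
        if (unshare(CLONE_NEWNET) != 0) {
            perror("unshare(CLONE_NEWNET)");
            return 1;
        }
        execvp(argv[1], &argv[1]);
        perror("execvp");
        return 1;
    }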
Interestingly, this is essentially what HPC (supercomputing) architectures and operating systems did for years. Both older Cray systems and even newer Blue Gene systems ran very small purpose-built OSs that only did the very few things needed for HPC: no multiuser support, no shared-library support, no virtual memory, etc. So you were *almost* running on bare metal. Cray eventually transitioned to running Linux on the compute nodes (circa 2007), using a purpose-configured Linux kernel with the minimal set of options needed. They successfully demonstrated scaling equivalent to the custom Catamount kernel, with only a ~10% performance impact, while vastly improving maintainability. IBM’s Blue Gene systems, on the other hand, also used a purpose-built OS, but eventually ended up a dead architecture, now replaced by large POWER9/NVIDIA GPU clusters.
My point is that making a tiny purpose-built kernel has been a common approach for a very long time, and it seems to eventually lead back to using a general-purpose OS, mostly for the sake of development and maintenance costs. The performance gains of such specialized environments in real-world applications tend to be fairly small.
CodeMonkey,
Interesting about the HPC world, but yeah, HPC is a very specialized area.
But unikernels, especially ones based on Linux, would have a wider area of application, e.g. on embedded devices: Raspberry Pi runs their website on Raspberry Pis. A unikernel based on Linux would see greater reuse because it wouldn’t be limited to HPC environments. In fact, a unikernel based on Linux would make it easier to bridge applications between HPC and small embedded systems.