The just-released version 13.08 of Genode marks the 5th anniversary of the framework for building component-based OSes with an emphasis on microkernel-based systems. The new version makes Qt version 5.1 available on all supported kernels, adds tracing capabilities, and vastly improves multi-processor support.
The ability to execute Qt-based applications natively on a range of different kernels is one of the most visible features that attracts users to the framework. Even though Genode’s developers closely follow the development of Qt5 and greatly appreciate the direction Qt is heading, the latest Qt version supported by Genode used to be 4.8.4. With the new release, Genode finally made the switch to Qt version 5.1. The port of Qt5 was no walk in the park because Qt’s platform interfaces underwent significant changes between the versions. In particular, the Qt Window System (QWS), on which Genode formerly relied, was replaced by the new Qt Platform Abstraction (QPA) interface. This created a gap between the functional requirements of QPA and Genode’s GUI server called nitpicker. The gap was bridged quite cleverly by using Genode’s component-composition techniques. An outline of the idea and the resulting solution can be found in the release documentation.
The second focus of the current release is multi-processor support, in particular when using the NOVA microhypervisor as kernel. This topic has been under discussion for several years now. The basic problem can be stated as follows: Genode relies on synchronous inter-process communication (IPC) for letting components interact with each other and for delegating access rights. NOVA provides such an IPC mechanism, but it works only if both communication partners reside on the same CPU. Communication between CPUs is possible solely via shared memory and semaphores. This restriction is not arbitrary; the NOVA developers have good reasons for their design, scalability and IPC performance being the most prominent ones. As a consequence, programs running on top of NOVA need to be aware of the assignment of threads to physical CPUs in order to comply with the restriction. The Genode API, on the other hand, imposes no such restriction on the application programmer: it allows for the creation of threads, and threads are expected to communicate with each other over synchronous IPC. Which thread happens to execute on which CPU does not and should not matter at the level of Genode’s API. After a years-long discussion of both schools of thought, a series of experiments was kicked off, which ultimately yielded a surprisingly simple design. As a result, Genode is now able to leverage multiple CPUs on NOVA in a way that preserves both NOVA’s design rationale and Genode’s existing API.
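To illustrate the constraint with a toy example: a call that crosses CPUs cannot use synchronous IPC and must instead be relayed through shared memory and a semaphore, for instance to a proxy thread running on the callee’s CPU. The following C++ sketch uses standard-library primitives (C++20) in place of kernel objects; all names are made up, and this is not NOVA’s or Genode’s actual code.

  /*
   * Conceptual sketch of NOVA's cross-CPU restriction, expressed with
   * standard C++ primitives standing in for kernel objects.
   */
  #include <cstdio>
  #include <semaphore>
  #include <thread>

  struct Message { int opcode; long arg; long result; };

  /* memory shared between the two CPUs */
  Message               shared_msg;
  std::binary_semaphore request_pending(0);  /* posted by the client */
  std::binary_semaphore reply_ready(0);      /* posted by the proxy  */

  /* stand-in for a same-CPU synchronous IPC to the actual server */
  long local_ipc(int opcode, long arg) { return opcode + arg; }

  /* client on CPU 0: cannot IPC to a thread on CPU 1 directly, so it
   * relays the request via shared memory and semaphores */
  long cross_cpu_call(int opcode, long arg)
  {
      shared_msg = { opcode, arg, 0 };
      request_pending.release();  /* cross-CPU semaphore signal    */
      reply_ready.acquire();      /* block until the reply arrives */
      return shared_msg.result;
  }

  /* proxy thread, imagined as pinned to CPU 1: performs the local
   * synchronous IPC on behalf of remote clients */
  void proxy()
  {
      request_pending.acquire();
      shared_msg.result = local_ipc(shared_msg.opcode, shared_msg.arg);
      reply_ready.release();
  }

  int main()
  {
      std::thread t(proxy);
      printf("result: %ld\n", cross_cpu_call(1, 41));
      t.join();
  }

The actual solution described in the release notes keeps such plumbing beneath the Genode API, so components remain oblivious to CPU boundaries.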
Regardless of the particular kernel Genode is used with, all users of Genode share one problem: complex system scenarios with many components are hard to analyze, which makes it extremely challenging to identify the cause of performance problems. Hence, the process of optimizing application performance, more often than not, turns into a series of hit-and-miss experiments. To make it easier to pinpoint performance bottlenecks, the framework has gained a new tracing facility that is deeply built in and always enabled. It allows for the monitoring of inter-component communication, whereas the policy of which information to capture can be defined at runtime, similar to dtrace. The design is the result of more than one year of exploration and experimental work, with the goal of leveraging Genode’s architecture for approaching the problem rather than porting an existing solution. The effort paid off: compared to existing tracing toolkits, which were designed with monolithic OS kernels in mind, Genode’s tracing facility is strikingly simple yet powerful.
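As a rough illustration of how such always-on tracing can stay cheap, consider per-thread ring buffers that trace points write to without locking, while a separate monitor consumes them at its own pace. This is a simplified sketch with invented names, not Genode’s actual trace API.

  /* Lock-free per-thread trace buffer, roughly in the spirit of the
   * design described above; all names are made up. */
  #include <atomic>

  struct Trace_buffer {
      static constexpr unsigned SIZE = 4096;

      std::atomic<unsigned> head { 0 };  /* advanced by traced thread */
      char                  data[SIZE]; /* shared with the monitor   */

      /* called from the traced thread at a trace point */
      void log(char const *event, unsigned len)
      {
          unsigned const pos = head.load(std::memory_order_relaxed);
          for (unsigned i = 0; i < len; i++)
              data[(pos + i) % SIZE] = event[i];
          head.store((pos + len) % SIZE, std::memory_order_release);
      }
  };

  int main()
  {
      Trace_buffer tb;
      tb.log("rpc_call nitpicker", 18);  /* emitted at a trace point */
  }

The monitor would map each buffer read-only and consume entries behind 'head' without ever blocking the traced thread; which trace points actually emit events is decided by a policy defined at runtime, analogous to how dtrace scripts select probes.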
Further improvements of the new version are added device drivers for the Exynos-5 SoC, updated kernels, improved virtualization support for x86 on NOVA, and new integrity checks for third-party source code ported to Genode.
All those topics are covered along with lots of background information in the release documentation of version 13.08.
Genode just got a whole lot better.
A question about a mention in the release notes: what makes NOVA more advanced than the other kernels for x86?
I always thought NOVA and Fiasco.OC were about on par for this architecture.
Thank you for the feedback!
Your question can be answered in three ways: the feature set, the design, and the implementation.
When comparing both kernels feature-wise (I am solely referring to x86), NOVA is more advanced because it has IOMMU support, more complete support for virtualization, and a deterministic way of passing processing time between threads that communicate synchronously.
The design of NOVA is more modern because it was designed from scratch in 2006 (I think), whereas Fiasco.OC is the successor of the L4/Fiasco kernel, which predates NOVA by almost a decade. Of course, Fiasco.OC’s kernel API has been largely modernized (i.e., use of capabilities), but it still sticks to certain aspects of L4/Fiasco’s original design that were discarded by NOVA. For example, Fiasco.OC uses one kernel thread per user thread, whereas NOVA uses a single-stack model for the kernel. This relieves the kernel from holding large state (a stack per user thread) and makes the kernel easier to reason about. Another example is that Fiasco.OC still uses identity mappings of physical memory for roottask, whereas NOVA starts roottask with an empty address space that can be populated in arbitrary ways. A third example is scheduling: whereas Fiasco.OC simply schedules threads, NOVA decouples threads from CPU scheduling contexts, which allows for deterministic flows of execution and the inheritance of priorities across IPC boundaries.
Comparing the implementations is a bit subjective, though. Personally, having worked with both kernels, I highly appreciate NOVA’s concise code. I see it as a real masterpiece of software engineering. But you have to judge this aspect for yourself.
I also appreciate the concise code of NOVA. Comparing the two codebases, Fiasco.OC+L4Re seems a bit convoluted (which I believe is the result of many people working on the code and lots of experimentation and design changes over the years).
Do you think the NOVA API would fit well on the ARM architecture?
I see you already have the base-hw platform for ARM. I don’t think switching at this point would be feasible, but if you were starting over today, do you think it would make sense to port NOVA to ARM for your ARM support rather than creating base-hw?
Conceptually, I see nothing that would speak against using NOVA’s design on ARM. That said, there is not much incentive for the NOVA developers to bring the kernel to ARM. NOVA is developed at Intel, after all.
As of now, our base-hw platform serves us primarily as an experimentation ground. For example, we use it to explore ARM TrustZone, or to enable new ARM platforms quickly. At the current stage, it is not as complete as Fiasco.OC or NOVA: it does not support MP, nor does it provide protection for capability-based security (yet).
nfeske,
“For example, Fiasco.OC uses one kernel thread per user thread, whereas NOVA uses a single-stack model for the kernel. This relieves the kernel from holding large state (a stack per user thread) and makes the kernel easier to reason about.”
I’m a big fan of this asynchronous, state-machine kind of approach versus having a thread per request! Does this asynchronous style carry over into the userspace API?
This brings up memories for me because I wrote an asynchronous library for Linux; it’s a subject I’m fairly passionate about! I’m disappointed with the state of async on Linux. It doesn’t support async file I/O at all: it ignores O_NONBLOCK and causes processes to block nevertheless. This is bad because it makes most network daemons become serialized around the file system, blocking continually on disk platter rotations and on network file systems for data that isn’t cached.
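To make the point concrete, here is a minimal sketch (the file path is just a placeholder): O_NONBLOCK is accepted for a regular file, yet the read may still sleep on disk I/O rather than return EAGAIN as it would for a socket or pipe.

  /* Demonstrates the behaviour described above: on Linux, O_NONBLOCK
   * has no effect on regular-file reads. */
  #include <fcntl.h>
  #include <unistd.h>
  #include <cstdio>

  int main()
  {
      /* placeholder path; any large, uncached file will do */
      int fd = open("/tmp/some_large_file", O_RDONLY | O_NONBLOCK);
      if (fd < 0) { perror("open"); return 1; }

      char buf[4096];
      /* for uncached data, this read sleeps until the disk delivers
       * it - the O_NONBLOCK flag is silently ignored */
      ssize_t n = read(fd, buf, sizeof(buf));
      printf("read returned %zd\n", n);
      close(fd);
      return 0;
  }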
True async libraries such as mine and POSIX AIO are forced to spawn userspace threads serving no purpose other than to avoid blocking in the kernel. My library uses async Linux socket I/O, since it abstracts network connections differently than files, but the POSIX AIO library treats all descriptors generically and thus uses blocking threads for all I/O, including network sockets, negating the original purpose of async code: eliminating the need for blocked threads. Another technical problem with threads is how notoriously difficult they are to cancel. But I’m quite off topic by now; getting back to Genode…
I really appreciate the analysis of SMP support in your release notes. It’s a very interesting technical problem in its own right. I’ve found that a lot of programmers are not really aware of how expensive SMP synchronization primitives can be. I/O-bound code rarely benefits from SMP. Highly multithreaded designs often have hidden bottlenecks due to the implicit serialization required by CPUs maintaining cache coherency, which is the basis for synchronization primitives.
SMP is great for running unrelated workloads on different cores because those really do run in parallel. The question is how a scheduler should distribute the threads. Threads with high IPC traffic should be on the same core. If a CPU is at 100%, one should probably try to identify the threads with the least IPC and move them to other cores. But then you need accounting for such things, which, as you’ve mentioned, is much more complicated than a static approach. A dynamic processor-affinity solution without IPC accounting (whether implemented in the kernel or in userspace) basically has to guess which threads should run on which core; it can never be optimal without accounting. But then, how much complexity and overhead can be justified to optimize processor affinity?
Anyways, fascinating stuff! I really enjoy technical discussions.
For the reasons you stated, Genode’s interfaces for I/O (the session interfaces for networking, block access, and file-system access) are designed to work asynchronously. This way, it is possible to issue multiple file operations at once and receive a signal on the completion of requests.
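A toy model of such an interface might look as follows; the names are illustrative, not the literal Genode session API. Requests are queued without blocking, and a handler plays the role of the completion signal.

  #include <cstdint>
  #include <deque>
  #include <functional>

  /* Generic model of an asynchronous block-session interface. */
  struct Block_session {
      struct Request { uint64_t block_number; char *buffer; };

      std::deque<Request>   queue;       /* submitted, not yet done */
      std::function<void()> ack_handler; /* the "completion signal" */

      void submit(Request const &r) { queue.push_back(r); } /* never blocks */
      void sigh(std::function<void()> h) { ack_handler = h; }

      /* driver side: complete the oldest request, notify the client */
      void complete_one()
      {
          if (queue.empty()) return;
          /* ... perform the actual device I/O for queue.front() ... */
          queue.pop_front();
          if (ack_handler) ack_handler();
      }
  };

  /* client: issue several operations at once, then continue working;
   * the handler runs once the driver signals completion */
  int main()
  {
      Block_session session;
      session.sigh([] { /* process acknowledged requests here */ });

      char buf[8][512];
      for (unsigned i = 0; i < 8; i++)
          session.submit({ i, buf[i] });   /* eight reads in flight */
  }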
Also, in general, Genode tries to move away from blocking RPC calls entirely. For example, the original timer-session interface had a blocking ‘sleep’ call. On the server side (in the timer driver), this required either complicated out-of-order RPC dispatching or one thread per client. By turning it into an asynchronous interface some months ago, we could greatly simplify the timer and reduce resource usage (by avoiding threads). Another example is the NIC bridge component, which we reworked for the current release. Here, the change to modelling the component as a mere state machine improved performance measurably.
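The gist of the timer rework could be modelled like this (a hypothetical sketch, not the actual driver code): a single server loop maintains a deadline queue instead of blocking one thread per sleeping client.

  #include <cstdint>
  #include <functional>
  #include <queue>
  #include <vector>

  /* Toy model of an asynchronous timer server. */
  struct Timeout {
      uint64_t              deadline;
      std::function<void()> signal;   /* notifies the waiting client */
      bool operator>(Timeout const &o) const { return deadline > o.deadline; }
  };

  struct Timer_server {
      std::priority_queue<Timeout, std::vector<Timeout>,
                          std::greater<Timeout>> pending;

      /* client-facing call: returns immediately, no blocked thread */
      void trigger_once(uint64_t deadline, std::function<void()> sig) {
          pending.push({ deadline, std::move(sig) });
      }

      /* single server loop: fire all expired timeouts */
      void handle_tick(uint64_t now) {
          while (!pending.empty() && pending.top().deadline <= now) {
              pending.top().signal();
              pending.pop();
          }
      }
  };

  int main()
  {
      Timer_server ts;
      ts.trigger_once(100, [] { /* wake client A */ });
      ts.trigger_once(50,  [] { /* wake client B */ });
      ts.handle_tick(75);  /* fires only client B's timeout */
  }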
There still exist a few blocking interfaces from the early days, but those will eventually be changed to operate asynchronously too.
However, even though the interfaces are well prepared for a fully asynchronous software stack, not all server implementations operate this way yet. For example, the part_blk partition manager dispatches one measly block request at a time. This needs to be fixed.
Your comment about SMP and I/O-bound workloads is spot-on!
Managing affinities dynamically at runtime is certainly an interesting project in its own right. In the current release, we have laid the groundwork to pursue such ideas. What’s missing are good measurement instruments for thread behaviour. It would be useful to gather per-thread statistics about how much CPU time was actually consumed, how many attempts were made to perform (costly) IPC to a remote CPU, or how often lock contention occurred, maybe even somehow capturing the access profile of local vs. remote memory. This information could then be fed into an optimization algorithm that tries to minimize a cost model. These are tempting topics for further research.
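To sketch what such a cost model could look like (entirely hypothetical, just to make the idea concrete): per-thread IPC counts and CPU times feed a function that an optimizer would minimize over candidate placements.

  #include <cstdint>
  #include <cstdio>

  /* Toy cost model for thread placement: penalize cross-CPU IPC and
   * load imbalance. Made-up numbers and names throughout. */
  constexpr unsigned THREADS = 4, CPUS = 2;

  uint64_t ipc_count[THREADS][THREADS]; /* measured IPC between pairs */
  uint64_t cpu_time[THREADS];           /* measured per-thread CPU time */

  uint64_t cost(unsigned const placement[THREADS])
  {
      uint64_t c = 0;

      /* every IPC between threads on different CPUs is expensive */
      for (unsigned i = 0; i < THREADS; i++)
          for (unsigned j = 0; j < THREADS; j++)
              if (placement[i] != placement[j])
                  c += ipc_count[i][j];

      /* penalize load imbalance across CPUs */
      uint64_t load[CPUS] = { 0 };
      for (unsigned i = 0; i < THREADS; i++)
          load[placement[i]] += cpu_time[i];
      for (unsigned k = 0; k < CPUS; k++)
          for (unsigned l = 0; l < CPUS; l++)
              c += (load[k] > load[l]) ? load[k] - load[l] : 0;

      return c;
  }

  int main()
  {
      unsigned placement[THREADS] = { 0, 0, 1, 1 };  /* threads -> CPUs */
      printf("cost: %llu\n", (unsigned long long)cost(placement));
  }

An optimizer would then search placements (greedily, or by something like simulated annealing) and migrate threads toward the minimum.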
This project just keeps improving!
I haven’t had a chance to read the release documentation yet, but I love simple, elegant solutions to complex problems, so I’m looking forward to it. ;^)
As usual, keep up the good work!
OSnews.com news that makes me feel at home again!
Thanks!
This sounds great, but I would like to see some examples of real-world use.
The framework is not yet widely adopted in real-world products, so I am unable to cite tangible showcases. As of today, it is primarily used as a platform for carrying out OS research. For instance, our company uses it as an experimentation ground in research and consulting projects.
Sir,
do you have an installer or an .iso for your newest release?
I could not find one while looking on your website.
Thanks!
Genode is not currently distributed as an end-user operating system, and there is no ISO for the latest release; the most recent ISO available is just a tech demo. If you wish to try Genode, you will have to compile it from source.
Genode will not replace your current Linux distribution, Windows, or OS X for now.
One more question,
Did you port Java to your framework?
No, Java has not been ported to Genode. All ported libraries and applications can be found in the libports and ports directories.