This article is a quick overview of scheduling in operating systems and presents Bossa, a research project on operating system schedulers and domain-specific languages.
Overview
The recent activity in Linux kernel development caused by the introduction of a new scheduler by Ingo Molnar has emphasized for ordinary Linux users the importance of schedulers in modern operating systems. In this article, we give you a glimpse of what scheduling development is like by letting you implement your own Linux scheduler thanks to Bossa, a framework for scheduler development.
Bossa, a framework for scheduler development
Scheduling
Most commodity computers can only run one process at a time, and yet the user will have the impression that many programs run in parallel. This illusion is created by the operating system scheduler. To do so, the scheduler causes the system to switch quickly between processes. Each process runs for a certain time, and when a process times out the scheduler chooses which process to execute next on the CPU. Therefore, the behavior of a system mainly depends on the scheduler. Fairness, efficiency, latency; the scheduler must balance these properties to offer the appropriate behavior. Hence while a scheduler aimed to desktop applications should emphasize the interactivity, server-oriented schedulers should emphasize efficiency and respond correctly to scalability issues. When implementing a scheduling algorithm the developer must take into account that each property has an impact on the others.
Today’s common operating systems are general-purpose and while they can run on different architectures (server, desktop, embedded devices), they generally implement a unique scheduler, which consequently needs to perform well in all these environments. This is exactly what has been done with the new Linux scheduler. The introduction of O(1) algorithms and the re-writing of the multi-processor scheduling code permit better scalability for Linux 2.6 servers. On the other hand the new scheduler offers mechanisms to estimate the interactivity of processes which improve system feel on desktop utilization.
In parallel with the industrial world, research on schedulers has been a hot topic since the first days of multi-tasking environments and it has witnessed a revival with the growth of multimedia applications. Although many new algorithms have been developed, there has been little change in the schedulers used by commercial operating systems. The needs are clear: multimedia, real-time, resource control, etc, and yet the existing scheduling algorithms are not widely adopted. Surprising? No. One of the main obstacles that scheduler developers face is the implementation of a scheduling policy in a standard OS: developing and implementing a scheduling policy require two different kinds of expertise. Therefore a scheduler developer needs both to master kernel hacking and to be knowledgeable in the scheduling research field.
Bossa
Bossa is a framework to facilitate the implementation and integration of new scheduling policies. This framework has been developed at the Ecole des Mines de Nantes and the University of Copenhagen. The main idea behind Bossa is: let people do their job. If one’s job is to develop scheduling algorithms, then one shouldn’t have to hack kernel code.
The current framework is based on Linux and replaces scheduling code scattered throughout the kernel by a fixed interface of scheduling events. Integration of a new policy amounts to linking a module defining handlers for these events with the kernel. The re-engineering of the kernel is done only once (for each OS) and then it simplifies the integration of schedulers.
For the scheduler developer, Bossa includes a domain-specific language (DSL) that provides high-level scheduling abstractions. The developer uses this language to write scheduling policies which is more suitable for this job than C. Then, a dedicated compiler checks the Bossa DSL code for compatibility with the target kernel and translates the code into C, which is then compiled into a Linux kernel module.
Apart from simplifying the implementation of scheduling policies, the use of a DSL provides better safety guarantees. The Bossa DSL is more constrained than C or C++ in that, for example, pointers (known for being the cause of many bugs) are absent and infinite loops can’t be defined. Moreover, the Bossa compiler performs verifications of the policy.
The Bossa framework also brings hierarchical scheduling to the Linux kernel. This concept consists of building a tree of schedulers, that allow different classes of applications to run under different scheduling policies. The leaves of the tree are schedulers of processes and nodes are schedulers of schedulers (a.k.a. virtual schedulers).
The diagram shows an example of a hierarchy of schedulers. In this example there are two classes of applications: normal and real-time. While the normal applications will be scheduled with the default Linux scheduler, the real-time applications will be scheduled by a real-time scheduler. Finally these two schedulers of processes are scheduled by a virtual scheduler (scheduler of schedulers).
scheduling tree
Bossa is free and you can access material at http://www.emn.fr/x-info/bossa/.
Here are direct links to download :
Round Robin Scheduling
To illustrate the implementation of a scheduling policy using Bossa we consider one of the simplest yet most widely used algorithms: round robin. This algorithm assigns each process a time slice, called a quantum, which represents the time the process is allowed to run. If the process is still running after a period of time equal to its quantum then it is preempted by the kernel and the CPU is given to another process. Processes waiting for execution are stored in a queue and the first process enqueued is the first to run. Once preempted, a process is placed at the end of the queue, implying that it has to wait for all of the other processes to execute. The figure below shows the execution of four processes executed under a round-robin algorithm.
Process execution under round-robin with the quantum set to 3 ms.
In the following, we will explain how to write a round robin scheduler using Bossa. We will also show how to load this scheduler into a Bossa-ified Linux kernel and run some test programs to show that it is working.
Implementing the round robin policy using the Bossa language
The Bossa framework introduces a new programming language to write scheduling policies.
In this section we present an overview of the Bossa language by using it to
implement the round robin algorithm.
Basically a Bossa policy is divided into five parts:
- The process structure definition
- The declaration of the different states
- The ordering criteria used to select the next process to run
- The handling of the scheduling events
- The interface allowing user-level programs to control some aspects of the
policy
We present each of these parts below.
Process structure definition
As our round-robin policy is a scheduler of processes, we need to declare the process attributes that will be used by the policy.
process = {
int time_slice;
}
This declaration means that each process has a time_slice
attribute, which will be used to store the time remaining. When manipulating a process p
we can access its time slice using p.time_slice
.
States
Steps in the lifetime of a process
The steps in the lifetime of a process can be summarized as the states running, ready, blocked and terminated. In the Bossa framework, two levels of states are found:
- State classes:
RUNNING
,READY
,
BLOCKED
,TERMINATED
. These are the basic states in the lifetime of a process from the point of view of the kernel. - States: these are states declared by the developer and used in a scheduling policy. A state is always associated with a state class.
Let’s see how states are declared with Bossa:
states = {
RUNNING running : process;
READY active : sorted fifo queue;
BLOCKED blocked : queue;
TERMINATED terminated;
}
In the code above, we declare four states: running
, active
, blocked
, and terminated
. The state running
is associated with the state class RUNNING
and references a unique process. The state active
is associated with the state class READY
and references a queue of processes which is maintained in fifo
(first-in first-out) order (the annotation sorted
will be discussed in the next section).
running
references the process currently running while active
references a queue containing all the processes ready to be executed, blocked
references the blocked processes. terminated
does not reference any process; instead, it is analogous to /dev/null
as changing the state of a process to terminated
drops any reference to the process.
Selection
Selection is probably the main part of all schedulers, as the behavior (fairness, interactivity, etc) of a scheduling policy depends on how it selects the next process to run. The Bossa language provides features that try to simplify the specification of the selection process. The language allows the user to define criteria that will be used to select a process from a set of processes. The developer uses the annotation sorted
to indicate the state (and data structure) to which the selection criteria should be applied. In the previous section, we declared the queue containing the processes in the state active
as sorted
. Selection will thus be done from the processes in this state.
Criteria are defined using the keyword ordering_criteria
. For instance, if the next process to run must be the process with the highest priority, it should be defined as:
ordering_criteria = { highest priority }
In the case that there are several processes with the highest priority, the next process to run is picked according to the ordering of the queue.
Back to our example. The round robin scheduler just gets the next process from the queue of processes so we don’t need to define any ordering criteria.
ordering_criteria = { }
Event handling
The Bossa framework replaces scheduling code scattered throughout the kernel by a fixed interface of scheduling events. In this section we will see which events are available and how to handle them. As this article is just an overview of what is Bossa, we will just introduce the main events: new
, block
, unblock
, clocktick
, schedule
and end
.
First let’s see how the set of event handlers is declared:
handler(event e) {On event_name_1 {
/* statements */
}On event_name_2 {
e.target.attribute = X;
//e.target is the process targeted by the event e
e.source.attribute = Y;
//e.source is the process which generated the event e
}
}
new
The new
event is the first step in the life of a process. The Linux kernel allocates the process structure and other resources. The policy only needs to initialize the scheduling data and state of the process.
On process.new {
e.target.time_slice = 30;
e.target => active;
}
This code initializes the process’s time_slice
value to 30 units and places the process in the state active
. The process is now eligible to run.
block
This event is quite frequent in the life of I/O based applications. It occurs when a process blocks, for instance to wait for user input.
On block.* { // we can use the wild card * to handle all block events
e.target => blocked;
}
For all of the block events we change the process state to the state blocked
, making the process ineligible to run.
unblock
As a counterpart to the block.*
event, the unblock.*
event happens when a process unblocks, which may for example occur when the user types on the keyboard.
On unblock.* {
if (e.target in blocked) { // if e.target is in the blocked state
e.target => active;
}
}
This handler checks whether the process is in the state blocked
. If so, we change its state from blocked
to active
. The =>
operation automatically removes the process from the data structure (queue, process variable, list, tree, …) associated with its current state.
clocktick
The clocktick
event is triggered at each CPU clock tick. Basically, a policy can use the processor clocktick as a unit of time, which is how the round-robin policy manages the time_slice
process attribute. Back in the new
event handler we set the time_slice
attribute to 30
meaning that a process can run on the CPU for as long as 30 clockticks. Here is how we handle this event in the round robin scheduler:
On system.clocktick {
running.time_slice–; // process just used some time
if (running.time_slice <= 0) { // the process timeslice is over
running.time_slice = 30; // we reload its timeslice
running => active; // and preempt it
}
}
If the process has used up its time slice, it is preempted.
schedule
We have seen that a process can be preempted during a block and a clocktick. When the running process has been preempted, the scheduler needs to elect a new process to run. The round robin algorithm just selects the next process in the state active
. Thus, the implementation of the schedule
event handler is very simple:
On bossa.schedule {
select() => running;
}
select()
returns a process from the sorted
queue, selected according to the ordering_criteria
and any extra information about the ordering of this queue (here fifo
ordering). By changing the state of this process to running
(using =>
) we declare that the newly elected process is to be executed (we change its state from active
to running
).
end
Finally, the end
event indicates process termination. We generally don’t care about terminated processes so this event handler will often just contain:
On process.end {
e.target => terminated;
}
This event handler is typically more complex in the case of policies that have an acceptance criteria based on the current load on the scheduler (for example, EDF1). These policies have to know when processes depart, thus reducing the load.
Stop the theory, show me it works
The above description gave a quick overview of how to write a scheduling policy with Bossa. In this section, we will test the round-robin scheduler. The complete implementation and some test programs can be found in the archive rr.tar.gz. Of course, for these programs to run correctly the first step is to boot and run a Bossa kernel.
- Untar, compile, install and boot the Bossa kernel (Note: you have to use
make install
to install the kernel and the Bossa tools) or boot the Bossa CD - Untar the archive:
$ tar zxf RR.tar.gz
$ cd RR
$ ls
loop10.c Proportion.bossa RR30.bossa testRR.sh
loop30.c RR10.bossa setup_test.sh
$
RR10.bossa and RR30.bossa are two round-robin implementations. In RR10 the
time_slice
of each process is 10 while in RR30 it is 30. - As root, launch the
setup_test.sh
script. This script compiles the two round-robin policies as Linux modules using thebossa_install
tool included with the Bossa kernel. Then the script loads the policy modules in the kernel and configures the scheduling tree as shown by the diagram.
$ ./setup_test.sh
- Run the test script:
$ ./testRR.sh
The script runs three infinite loops in parallel, with each loop printing a different number. The source code can be found in
loop10.c
etloop30.c
. When launched, the loop program attaches itself to the round-robin scheduler and then loops.
We run three loops (1, 2, 3) in parallel. They are
scheduled by the round-robin scheduler with time_slice=30.
1111111111111111111122222222222222222233333333333333333311111111111111111122222
2222222222222333333333333333333331111111111111111111122222222222222222233333333
3333333333111111111111111111222222222222222222223333333333333333333311111111111
11111112222222222222222223333333333333333331111111111111111112222The round robin scheduling effect can clearly be seen.
We run three loops (1, 2, 3) in parallel. They are scheduled by the round-robin
scheduler with time_slice=10.
11111122222233333311111122222233333311111122222233333311111122222233333311111122
22223333331111112222223333331111112222223333331111112222223333331111112222223333
33111111222222333333111111222222333333111111222222333333111111222222333333111111
22222233333311111122222233333311111122222233333311111122The round robin scheduling effect can clearly be seen.
Indeed, this trace shows the behavior of a round robin scheduler.
scheduling tree
Conclusion
In this article we have presented the Bossa framework for developing scheduling policies. You should now have an initial understanding of the development of scheduling policy and the advantages of using a domain-specific language. Bossa makes development easier, and most scheduling algorithms can be written with a few hundred lines of code.
Also, as Bossa simplifies the development and integration of schedulers in an operating system kernel, it can be used to teach scheduling to CS undergraduate students. By experimenting with scheduling policies using Bossa, students can learn in a practical way how a scheduler works.
About the author:
Christophe Augier is a student in Computer Science at the Ecole des Mines de Nantes (France) where he is focusing on Operating Systems and Networking. For his master’s thesis he is currently developing a dedicated language for writing network packet schedulers.
Many thanks to Julia Lawall and Gilles Muller for supporting me in the writing of this article.
References
1 The Earliest-Deadline First scheduling policy:
- The Bossa version of the EDF policy: EDF.bossa
- A research paper introducing EDF:
Scheduling Algorithms for Multiprogramming in a Hard-Real-Time Environment,
C. L. Liu and James W. Layland
Journal of the ACM (JACM)
Volume 20, Issue 1 (January 1973)
If you would like to see your thoughts or experiences with technology published, please consider writing an article for OSNews.
The link to “University of Copenhagen” is actually a link to the department of computer science at the university.
Why do we need yet another schedular implementation?
Not knocking the effort these people have put in, but kernel options can be confussing at the best of times… now we are going to be spoilt for choice in schedulars.
..this is sort of a “plug-n-play” interface for schedulers? Hmmm. Interesting I guess. I know very little about the linux kernel, so I shouldn’t even comment, but how does the kernel continue to interface with process loaders (memory swapping, etc)? Or does it really simply manage a list of processes (not swapped out I guess) based on some algorithm you hand it, communicating with a loader to pull the process into execution when its turn is ready?
Heh. I am confusing myself now.
Mike
very nice work! I love abstraction. And this is just an area one can use such abstraction.
Very nice article as well. I’d love to go play with this if only I had the time… 😉
Joop
..this is sort of a “plug-n-play” interface for schedulers? Hmmm. Interesting I guess. I know very little about the linux kernel, so I shouldn’t even comment, but how does the kernel continue to interface with process loaders (memory swapping, etc)? Or does it really simply manage a list of processes (not swapped out I guess) based on some algorithm you hand it, communicating with a loader to pull the process into execution when its turn is ready?
Heh. I am confusing myself now.
Mike
========
RESPONSE
========
The scheduler is seperate from the memory management. The scheduler is responsible for picking which process will be run next. It does not matter if the program is currently in memory or not. The scheduler most likely looks at the process table for a list of potential processes to run.
Gotcha! It’s a pick-list with smarts. But more, since it is the core of kernel really, no? It’s more like a director than a pick-list. Abstracting it then provides a consistent interface for any kernel implementing it.
Why do we need more scheduler options? Because there’s more than one way to do things. Have a look at this paper on Lottery Scheduling for an interesting concept:
http://www-csag.ucsd.edu/teaching/cse225s04/Reading%20List/lott…
Whatever makes it easier to experiment and try out new ideas is welcomed by me.
allowing an administrator to adjust or alternate scheduling algorithms during run-time would be very useful. this remove the need to boot different kernels or even OSes depending on your current workload.
This should be ‘defensively’ patented. After sucessful patenting, transfer all rights to the Free Software Foundation. We’ll ammase our own patent portfolio.
I also think that we’ll see more more of this, in the VM and Caching sections of code.
i just read the article properly … what an excellent article!
thanks.
if you try the RR scheduler test and have difficulties or whatever, you may get feedback on the ml of Bossa: [email protected] (i forgot to link it)
I’ve finally learned what a scheduler is
By the way, the link to the “The Bossa version of the EDF policy” at the end of the article doesn’t work for me, and in the fourth step of the fourth page, where it states “The source code can be found in loop10.c et loop30.c” it should be “and”, not “et”