Well, this is quite interesting. This is one of those items where I have to make sure everybody realises I’m no developer, so as not to make myself look like an idiot. Having said that – LinSched. It’s a user-space program that hosts the Linux kernel scheduler, so you can create and test scheduling policies on arbitrary hardware topologies – without actually having to work with the real hardware.
Originally developed at the University of North Carolina, it is being used by Google to test scheduler policies. “At Google we have used Linsched as a testing tool to validate the behavior of the kernel scheduler,” details Google’s Ranjit Manomohan. “We could cut down our validation time from several days to a couple of hours due to the ability to run multiple tests on several different types of hardware models.”
This has already led to several improvements in the kernel scheduler, thanks to the high degree of code sharing between LinSched and the ‘real’ scheduler. “Due to the high degree of code sharing between LinSched and the Linux scheduler, porting LinSched code to Linux is reasonably straightforward,” Manomohan explains. “LinSched may be especially useful to those who are new to Linux scheduler development.”
“Since Linsched allows arbitrary hardware topologies to be modeled, it enables testing of scheduler changes on hardware that may not be easily accessible to the developer,” he adds, “For example, most developers don’t have access to a quad-core quad-socket box, but they can use LinSched to see how their changes affect the scheduler on such boxes.”
The code is out there.
This is definitely something to check out if you are interested in making a new Linux scheduler.
You could be the next Con Kolivas!
There is a book I have that dissects the 2.4 kernel. The section about the scheduler has some commentary to the effect of: “This section is well optimized; do not bother trying to make any improvements here, it’s a waste of your time.”
Then Con came along, and at first I was a little miffed that I had taken the book’s advice and hadn’t played with it more. Then Linus replied, and I was sort of glad I didn’t. But hey, getting personally slammed by Linus would be an email to pass down through the generations. So, I think I will mess with it.
Yes I totally agree! I would also like to be called a “masturbating monkey” or idiot or whatever! Praise the lord Linus! If he tells me to eat shit, I surely would!
Hallelujah!
Not really what I was going for. More like the Gandhi quote:
Con sort of won, in a messed-up kind of way. Ridicule is step #2.
There’s one more step:
First they ignore you
Then they ridicule you
Then they fight you
Then you win.
Then they sue you for patent infringement
Then next genius of scheduling comes along, all spotlights turn on him, and it’s time for step #3: “then they ignore you” 🙂
Hi,
The problem with the Linux scheduler is (mostly) not the Linux scheduler.
The problem is that there’s no effective process priorities (unless the process happens to be running as a real-time process, which requires “root”, which we all know is bad).
Because there’s no effective process priorities, the scheduler can’t know that things like Apache or X or KDE/Gnome or Mozilla or Quake are more important than things you want done in the background (like compiling the kernel); and therefore if there’s CPU load in the “background” you end up with bad latency and/or a less responsive desktop.
In the past there were misguided attempts to work around the problem, where processes that stop often are assumed to be “interactive” and are given a priority boost. Like all assumptions, it’s often wrong. Something like a game that’s struggling to process a heap of data to update the video (and therefore not voluntarily stopping while this processing is being done) doesn’t get a priority boost; meanwhile something like compiling the kernel (which most people would want to happen in the background while they’re playing the game) is a large number of different processes that are all stopping (waiting for disk I/O or each other), and they do get a priority boost. You can end up with the reverse of what you want: the game you’re trying to play ends up running in the background, while the kernel you’re trying to compile in the background is “interactive”.
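As a toy illustration of why that sleep-based heuristic backfires (entirely made-up numbers and names, not the kernel’s actual code), consider a bonus computed purely from how often a task sleeps:

```python
# Toy model of the "sleeper boost" heuristic described above. Illustrative
# only: the real O(1)-era scheduler tracked a per-task sleep average in
# kernel code; this just shows the shape of the problem.

def interactivity_bonus(sleep_ratio, max_bonus=10):
    """Map the fraction of time a task spends sleeping (0.0 to 1.0)
    to a priority boost. Tasks that sleep a lot look 'interactive'."""
    return int(sleep_ratio * max_bonus)

# A kernel-build process that is blocked on disk I/O most of the time
# looks very "interactive" and gets a large boost:
compile_boost = interactivity_bonus(0.9)
# A game busy crunching data between frames, rarely sleeping, gets
# almost no boost -- the reverse of what the user wants:
game_boost = interactivity_bonus(0.1)

print(compile_boost, game_boost)
```

Any heuristic of this shape rewards the compile job and punishes the game, exactly the inversion described above.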
Throwing more CPUs at it helps hide the problem (it increases the chance of important processes getting CPU time despite unimportant processes tying up other CPUs); and if you do implement effective process priorities it won’t make any difference until processes actually use them (which will take a long time, as most processes are designed to be portable rather than specifically for Linux, and because POSIX thread priorities suck for portability). This means that the problem will be well hidden (due to the increasing number of cores) before it can be solved.
– Brendan
There are some ways around this. For example, using renice on running processes so they aren’t as likely to interrupt what you’re doing. Or using a tool like cpulimit to keep your compile job at a low limit. …
Or, I suppose, scheduling your compile jobs so as to not run at the same time as your video game. Not all solutions need be technical.
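For illustration, the renice approach can be scripted. A minimal sketch (the `sleep` command stands in for a real background job; raising your own process’s niceness needs no root):

```python
import os
import subprocess

def run_niced(cmd, niceness=19):
    """Start cmd, then immediately raise its niceness.
    Equivalent to: renice -n <niceness> -p <pid>."""
    job = subprocess.Popen(cmd)
    os.setpriority(os.PRIO_PROCESS, job.pid, niceness)
    return job

# "sleep" stands in for a long-running compile job here.
job = run_niced(["sleep", "30"])
print(os.getpriority(os.PRIO_PROCESS, job.pid))  # 19, if started from niceness 0
job.terminate()
job.wait()
```

Note that unprivileged users can only raise niceness this way; lowering it back down requires root (or a suitable RLIMIT_NICE).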
The problem with these solutions is that they all ask the user to do the operating system’s job. When I’m busy using my computer, I have neither the time nor the will to fix the OS designer’s incompetence.
Take my Windows 7 box as an example: when a regularly scheduled backup is running, if I try to play music with WMP, I get choppy playback. And, worse, every time music playback hangs, the rest of the UI hangs with it.
Does that mean that I, as a user, should give up on scheduled backups, or on music playback in the background?
I still don’t get why we still have these problems. I give my OS 6 GB of memory, four CPU cores at 3 GHz, one SSD, and four 2 TB hard disks. And still, something like watching a 100 MB video file becomes extremely choppy when I do large moves from one hard disk to another.
Is there no genius who can solve this? Or do we let SSDs solve the problem like we did with quad cores?
The problem is not hardware resources; it’s the way those resources are distributed. With better scheduling, we would get a smooth experience on anything that’s not totally I/O-bound.
Windows users don’t know why this happens; it comes down to two factors. 1) The backup simply runs at maximum priority; with some third-party backup tools the effect will not be as bad. So adding SSD drives will not fix this: I/O is not your only issue. 2) The second factor is the drive read speed.
I think it’s not drive speed that’s problematic. On a slower computer running Ubuntu 9.04, I never got such UI lock-ups.
So yeah, priorities have to be defined better.
Given current operating systems, this could probably be done in user space, either with some daemon handling coarse-grained priorities, or by sitting down for fifteen minutes, figuring out which tasks get which priorities, and then wrapping the commands with ‘nice’. I don’t think you need to reinvent the scheduler.
I’ve heard so many stories about music players skipping but I’ve never seen it (heard it?) happen in practice except through virtualization.
The nice command has an issue that cgroups fixes: a service or application spawns subprocesses, and cgroups applies the limitation to the service as a whole, whereas nice applies the limitation piece by piece.
Basically, cgroups stops a service with 1000 PIDs from suffocating a process with 1 PID.
Yes, the technology to handle it right exists. The upper levels of the Linux stack are just not using it fully yet.
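As a sketch of the cgroup idea: the file names below are the standard cgroup v2 control files, but the group name and PID are made up, and actually writing these files requires root. The point is that a handful of writes confines a whole build, child processes included, to a small CPU weight:

```python
# Sketch of confining an entire service/build to one cgroup (cgroup v2).
# Unlike nice, the weight applies to the *group*, however many PIDs it
# spawns. The group name "background-build" and PID 4242 are made up.

CGROUP_ROOT = "/sys/fs/cgroup"  # typical cgroup v2 mount point

def cgroup_plan(name, cpu_weight, pids):
    """Return the (file, value) writes that put pids in a low-weight group."""
    base = f"{CGROUP_ROOT}/{name}"
    writes = [(f"{base}/cpu.weight", str(cpu_weight))]
    # Adding the top-level PID is enough for a service manager that forks
    # children into the same group; here we list each PID explicitly.
    writes += [(f"{base}/cgroup.procs", str(pid)) for pid in pids]
    return writes

# Put a kernel build (every make/gcc subprocess) in one low-weight group:
for path, value in cgroup_plan("background-build", 20, [4242]):
    print(f"echo {value} > {path}")
```

The default `cpu.weight` is 100, so a weight of 20 makes the whole group yield to ordinary processes under contention while still using idle CPU freely.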
Then how does the Real-Time Kernel Patch work?
Kernel locking has been a major problem. Still is in some areas.
Con Kolivas thought that tweaking the scheduler would cure this. He rejects the idea of cgroups, which is a very big mistake: cgroups allows background services to be kept from interfering with the desktop user without having to recode the applications.
“The problem is that there’s no effective process priorities” This is incorrect. Cgroups provides one level of effective process priorities. The issue is the management of process priorities, i.e. the Linux kernel already provides enough.
systemd will bring cgroup control to services, thereby providing a cure for most of the background problems caused by non-user-run processes and the scheduler.
Even Con Kolivas’s so-called great scheduler still trips over background services.
“Waiting for disk I/O or each other” is a locking issue. Newer disk I/O scheduling in Linux is working on queue jumping, i.e. a higher-priority process gets to jump to the head of the I/O queue, and this is cgroup-controlled. The user’s interactive session is a cgroup of its own.
Of course, a user failing to declare that the big code build they are doing should be in the background is still going to cause lag.
MS Windows’ method is quite simple: the window manager informs the scheduler which process currently has the active window, and that process gets the speed-up. Note that this is not a kernel solution but a window-manager one; server versions of Windows lack it.
Really, with a little bit of effort Linux could be just as interactive as Windows. There is no low-level fault preventing it.
“unless the process happens to be running as a real-time process, which requires ‘root’, which we all know is bad”
It requires root only for setting the process priority. Once that is set for RT, the process can drop root privileges without losing the RT priority.
Or, it can be set to SUID, check that the real UID is not root before proceeding, and then elevate the priority before dropping privileges back to those for the real UID. Then any user can run it, and it gets the added safety bonus of ignoring the LD_* env vars.
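The root requirement can be seen directly. This sketch (Linux-only, illustrative) tries to enter SCHED_FIFO and reports the PermissionError an unprivileged process gets; the suid-then-drop-privileges pattern described above exists precisely to get past this check:

```python
import os

# Requesting a real-time class needs root / CAP_SYS_NICE, as the comment
# above says. An unprivileged process gets PermissionError instead.
def try_realtime(prio=10):
    try:
        # pid 0 means "this process"; SCHED_FIFO is a real-time policy.
        os.sched_setscheduler(0, os.SCHED_FIFO, os.sched_param(prio))
        return "realtime"
    except PermissionError:
        return "denied"

print(try_realtime())
```

Run as a normal user this prints “denied”; run as root (or with CAP_SYS_NICE, or under a suitable RLIMIT_RTPRIO) it prints “realtime”, after which the process could drop its privileges while keeping the RT priority.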
A very interesting tool.
Thanks for the link
I rather wish someone would implement something like FreeBSD’s idprio for Linux. A CPU hog keeps hogging its CPU even if set to nice +20 and SCHED_IDLEPRIO.
Have you looked into “ionice”? I don’t know how idprio works, so that may be a red herring for you.
Yes, I’m familiar with ionice; it’s for the I/O scheduler, not what I’m looking for.
In the FreeBSD scheduler the idle prio class only receives CPU if no other process in a higher class (realtime, normal) requires CPU.
Practical example:
On my dual-CPU machine I run VirtualBox; the process uses ~180% CPU, with the system being ~10% idle (or in wait state, same thing to me).
Now I run mencoder with nice +20 and SCHED_IDLEPRIO (and ionice -c 3 if you like), and I can see that VirtualBox’s CPU usage is throttled down to ~120-160% while mencoder receives 30-60% CPU.
If I do the same on FreeBSD and run mencoder with idprio 31, then it only uses the ~10% the system was idle before. VirtualBox runs undisturbed.
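For reference, Linux’s nearest analogue to idprio is the SCHED_IDLE policy, which a process can enter without root. A minimal sketch (Linux-only); note that mainline SCHED_IDLE still gives the task a tiny nonzero CPU weight rather than the strict “only when idle” rule of FreeBSD’s idprio, which matches the throttled-but-not-idle behaviour described above:

```python
import os

# Put this process in the idle scheduling class. Unlike nice +20, an
# SCHED_IDLE task runs at the lowest possible weight; unlike FreeBSD's
# idprio, it can still receive a small CPU share under full load.
# No root is needed to *enter* the idle class; priority must be 0.
os.sched_setscheduler(0, os.SCHED_IDLE, os.sched_param(0))

print(os.sched_getscheduler(0) == os.SCHED_IDLE)  # True
```

A background encoder started this way stays far gentler on a busy VirtualBox than plain nice, though not perfectly invisible the way idprio 31 is.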