It has recently been a year since I started working on my pet OS project, and I often find myself looking back at what I have done, wondering what made things difficult in the beginning. One of my conclusions is that while there’s a lot of documentation on OS development from a technical point of view, more should be written about the project-management side of it: namely, how to go from a blurry “I want to code an OS” vision to either a precise vision of what you want to achieve, or the decision to stop following this path before you hit a wall. This article series aims at putting those interested in hobby OS development on the right track, while keeping this aspect of things in mind.
The article you’re currently reading aims at helping you answer a rather simple but very important question: are you really ready to get into OS development? Few hobbies are as demanding, time-consuming and slow to reward all at once. Most hobby OS projects end up reaching a dead end and being slowly abandoned by their creators, either because they didn’t understand well enough what they were getting into and wrote ill-designed, unmaintainable code, or because they were overwhelmed and depressed by the sheer amount of time it takes to get a mere malloc() working.
Some bad reasons for trying hobby OS development
- “I’ll bring the next revolution of computing”: Simply put: no. Long answer: think of yourself as a small research team. You can make something very neat, sure, but only at a small scale. Without massive backing, your research won’t make it to mass distribution. And no, you can’t emulate an existing OS’ software in order to appeal to a wider audience.
- “I want to become proficient with a programming language, and need some challenging exercises”: Some people try to learn a new programming language through OS development, thinking that they need something difficult to do in order to really understand the language. That’s not a good idea either.
- First, because it makes things even harder than they already are for the rest of the OS-deving world, and you shouldn’t overestimate your ability to withstand frustration and strain.
- Second, because being new to the language, you’ll make more mistakes, and those mistakes are much harder to debug when made at a low level than they are at the user application level, where you have neat tools like GDB and Valgrind to help you.
- Third, because you won’t actually learn the language itself, but only a subset of it. The whole standard library will be unavailable, and except for a few languages designed for the job, like C, most programming languages have features which require runtime support that you won’t be able to provide for some time. A few examples: C++ exceptions and vtables, C# and Java’s garbage collectors and real-time pointer checks, Object Pascal’s dynamic arrays and nice strings… More details here and there.
In short, you’ll have a hard time coding things. When you do, you’ll write poor code and you won’t actually learn the language. Don’t do it.
- “Implementing a new kernel is horribly complicated, but there’s a much easier path! I’ll just take the Linux sources and tweak them till I get an operating system which fits my needs; it shouldn’t be so hard”: May I be the first to introduce you to what Linux’s sources actually look like by showing you its main.c file. Your first mission will be to locate where the “actual” main function is. Your second mission will be to understand where each mentioned function is located and what it’s supposed to do. Your third mission will be to estimate the time it’ll take you to understand all of that.
As you should have understood by now, modifying an existing, old codebase without a good knowledge of what you’re getting into is really, really hard. Code reuse is a very noble goal and it has many advantages, but you must understand now that it won’t make your life simpler but rather make it much worse because you’ll not only have to learn kernel development as a whole but also all the specifics of the codebase you’re looking at.
- “I don’t like detail X in Linux/Windows/Mac OS X, these should totally be re-done”: All modern operating systems have their quirks, especially the desktop ones, no question about that. On the other hand, if your gripes against them remain relatively minor, you really should consider contributing patches which fix them to the relevant projects, and only rewrite the offending component(s) if this approach fails.
- “I’m looking for an exciting job”: cf. “I’ll bring the next revolution of computing”. Unless you’re in a research lab which works on operating systems, you’ll hardly get any money from your OS as long as it has no practical uses (and it can remain that way for a very long time).
Some good reasons for trying hobby OS development
- “I want to experiment with new OS design”: You have your own idea of what an operating system should be like and you can’t find it in the existing OS market? Hobby OS development can be a way to try designing and implementing what you’ve envisioned and see if it actually works in practice.
- “I’m now quite experienced with a language and looking for a programming challenge, or want to know how my computer works at the lowest levels”: OS development certainly provides both, and to a great extent. You’ll learn how to debug software with hardly anything more than text output facilities, how to do without a standard library of any sort, why so many people complain about this x86 architecture which works so well on your computers, what bad documentation from the vendor truly is…
- “I’ve been working on an operating system for some time, know its codebase well, and have many gripes with it”: If addressing these gripes would imply major changes, the kind which makes software incompatible, you’ll probably have no choice but to fork. While forking is not described in this article, it is a totally valid path for OS development (and one which is certainly more rewarding than writing a new OS from scratch).
- “The set of changes I want to bring to an OS is too big for simply patching or forking part of it”: There sure is a point where starting a new project becomes the best option. If you think you have reached it, then welcome to the club!
- “I’m looking for an exciting hobby”: Well, it depends a lot on what your definition of “exciting” is, of course, but OS development can be one of the most exciting hobbies on Earth if you’re sufficiently prepared. It’s somewhat akin to raising a child or growing plants: sure, it’s slow, and sometimes it gets on your nerves and feels highly unrewarding. But this just makes success feel even better. Plus it’s incredibly fun to be in full control of what’s happening and to have absolute power over your hardware.
So, are you ready?
Aside from the questioning phase, here are a few things which you must know or learn quickly in order to take this road successfully:
- General computer science: You must know well and be able to fluently manipulate binary and hexadecimal numbers, boolean logic, data structures (arrays, linked lists, hash tables, and friends), and sorting algorithms.
- Computer architecture: You should know the global internal organisation of a desktop computer. The words and expressions ALU, interrupts, memory and PCI buses, and DMA should not sound mysterious to you. Knowledge of more automated mechanisms like the TLB, multilevel caching, or branch prediction would be a very valuable asset in the long run, once you try to optimize code for performance.
- Compiler and executable file internals: You must roughly know what happens during the preprocessing, compilation, and linking steps of binary generation, and be able to find everything you need to control your toolchain in its manual. You should also learn about linker scripting and the internal structure of executable files.
- Assembly: You must know what Assembly is and how it works, and you must be able to code simple Assembly programs (playing with the stack and the registers, adding and subtracting numbers, implementing if-like branching…). You can write most of your operating system in other programming languages, but at some point you will need Assembly snippets to do some tasks.
- C or C++: Even if you plan to use another programming language to write your OS in, you’ll need to know C or C++ in order to understand the code you’ll find on the web when looking at other operating systems’ source or other OS-deving tutorials.
- Your system programming language: No matter whether you plan to write your OS in C#, Java, Assembly, BASIC, C, C++, Pascal, Erlang or REBOL, you should be very familiar with it. In fact, you should even have access to a description of its internals. As an example, you should know how to call a function written in your language of choice from Assembly code, and what kind of runtime support your language requires and how it can be implemented (a minimal sketch of that last point follows this list).
(List based on this page of the OSdev wiki and tweaked based on personal experience.)
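To make that last point a bit more concrete, here is a minimal sketch (assuming C and GCC): even in freestanding mode, the compiler expects a few support routines such as memset and memcpy to exist and may emit calls to them on its own, so your kernel has to provide them very early on.

    #include <stddef.h>

    /* Naive versions are fine to start with; optimize later if profiling says so. */
    void *memset(void *dest, int value, size_t count)
    {
        unsigned char *p = dest;
        while (count--)
            *p++ = (unsigned char)value;
        return dest;
    }

    void *memcpy(void *dest, const void *src, size_t count)
    {
        unsigned char *d = dest;
        const unsigned char *s = src;
        while (count--)
            *d++ = *s++;
        return dest;
    }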
This ends the “Are you ready?” part of this OS-deving tutorial. If you have reached this point, congratulations! You should have the right mindset and sufficient knowledge to begin the hobby OS development adventure. The next article will explain to you how to set goals and expectations for your OS project, a truly vital step if you actually want it to get somewhere in the end.
Very nice article.
Protip – if you have a knack for kernels, dive into Linux. You will immediately become a sought-after developer in the global job market, and can probably secure a premium salary wherever you go.
If you mention your own “revolutionary” kernel project in your CV, you are likely to be deemed slightly delusional and potentially dangerous ;-).
Absolutely.
Look at the plan9 project, where they fixed many of the technical issues with Linux interfaces.
Do they have a good reason to write a better OS?
Yes.
Does anybody care about plan9?
No.
Given that nobody cares, was the effort justified?
Maybe. It’s nice to fix a broken interface. In the end though, those of us who have to deal with the outside world will still have to deal with the broken interfaces despite the fact that something better exists.
Building an OS is a noble personal goal, but it should not be undertaken under the impression that it will change the world. The market is too saturated to care.
BTW, building a malloc implementation is really not that difficult. I’ve built one which significantly outperforms GNU’s malloc in MT processes by using lock-free primitives.
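For anyone curious what a first kernel allocator even looks like, here is a deliberately naive sketch (nothing like the lock-free one I mentioned, and with no free() at all): a watermark allocator that hands out consecutive chunks of a heap region which the kernel’s memory-map code is assumed to have set up. The hard part early on is having a mapped heap region at all, not the algorithm.

    #include <stddef.h>
    #include <stdint.h>

    static uintptr_t heap_current;   /* next free byte; set to the heap's start at boot */
    static uintptr_t heap_end;       /* first byte past the end of the heap             */

    void *kmalloc(size_t size)
    {
        uintptr_t aligned = (heap_current + 15) & ~(uintptr_t)15;  /* 16-byte alignment */
        if (aligned > heap_end || size > heap_end - aligned)
            return NULL;                                           /* early heap exhausted */
        heap_current = aligned + size;
        return (void *)aligned;
    }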
Huh? Are you sure you were looking at Plan 9?
Well first the GP made the classic Linux/Unix mistake: Plan9 started in the 80s whereas Linux started in 1991.
But you made the mistake of thinking that interfaces are only GUIs, which isn’t the case, so he wasn’t incorrect here.
Your point being? Linux started as a Unix clone and never really got past that. It didn’t bring anything new other than the fact that it was open source. Indeed it’s a very advanced and well-written clone, and a very useful one. Plan9 provided the next step before Linux even started, but nobody embraced it.
Linux advanced by adding (new technologies, good stuff, well written, but still adding). Plan9 was a fundamental change in design.
The point is that Plan 9 has nothing to do with Linux. So the implication that Plan 9’s goal was to fix the issues with an OS which did not exist when the project was started is kind of silly.
“The point is that Plan 9 has nothing to do with Linux. So the implication that Plan 9’s goal was to fix the issues with an OS which did not exist when the project was started is kind of silly.”
Renox called me on my error when I used “Linux” instead of the more generic term “*nix”.
However, my point about other (and even better) alternatives existing is still correct.
You need to learn more about Plan9 to see how it is better. Obviously Plan9 devs had the benefit of hindsight and could develop better interfaces. So did Linus, for that matter, but he chose to do a Unix clone rather than try his hand at improving the model.
If it weren’t Linux today, who knows what would have taken its place? All we know is that there were viable alternatives.
Microsoft’s very existence is proof that connections and timing can outweigh technical merit. Sad, but ultimately true.
And you made the mistake of thinking that somehow my post implied that I thought that interfaces are only GUI. I have absolutely no clue how you were able to jump to that conclusion. Does that make you double mistaken?
Because you said ‘looking at’, and it is well-known that Plan9 has a poor GUI even though its design is considered by many as superior, so I thought that there was a confusion.
So obviously you don’t agree that Plan9’s design is superior to *nix: could you explain what you don’t like about Plan9?
Brilliant. More please!
Ah, this reminds me of when I tried to develop my “operating system” when I was 14, with Turbo Pascal on my 286 and later a Pentium 60 MHz. Actually it was a big DOS program with all its “applications” hardcoded inside; I planned to make it load external programs later because I had no idea how to do it!
My OS had a GUI – in VGA 640x480x16 only – with overlapping but unmovable windows and several 3D controls roughly copied from UNIX pictures I had often seen in computer magazines, and it supported a mouse – via the DOS mouse driver, of course.
It was not multitasking; that was a planned feature (again, I had no idea how to do it), but a design decision was to have each app take over the entire screen, so this was not a real problem: you could switch to another app with a button at the top of the screen (a sort of taskbar). When you switched apps, the running app saved your work to a temporary file, so when you switched back to it, it reloaded the file and let you resume where you left off.
The apps included a file manager, a text file editor, a hex file editor, a bitmap viewer/editor which supported 1- and 4-bit color depths (standard palettes only), and I was working on an MS Works-like database that never worked and a spreadsheet which got far enough to let you enter data in cells (but not formulas) and plot different kinds of 2D charts! The text editor and the hex editor were even able to print on a character printer connected to a parallel port!
The text editor supported only a monospaced font – of course I was planning to support the non-monospaced fonts which were included in Turbo Pascal – but let you select colors and underline (it couldn’t print either of them, of course).
Some utilities like a calculator, a post-it app, a unit converter, a calendar and a minesweeper were always available from a Utilities menu.
The best part is that I actually earned some money from it, because the father of a friend of mine bought two copies and used them in his company’s office! He used it to write documents and letters because he said that Windows and Office were too bloated and he wanted something simpler. Years later, I helped him move his stuff to BeOS when I showed it to him.
I was so sad when I realized that I lost the source code 🙁
Just think, had we been born just a decade earlier, it would be our operating systems everyone would be running today.
You keep using that word OS as in “Operating System” which does not mean what you think it does. 😉
I think you meant to say PS, as in “Productivity Suite.”
My goal was to make an OS 😀
Now I know how an OS works, but sixteen years ago I started with the applications because that was all I was able to do. Hey, I was 14, all my literature was computer magazines and old programming books from the local public library, and in 1994 the Internet was mostly a curious word!
Well, my goal was to become king of the world. And yet people still don’t refer to me as your majesty 🙁
Probably you didn’t follow this list: http://www.eviloverlord.com/lists/overlord.html
Turbo Pascal FTW!
But does it run Linux?
Yes, the Linux kernel is pretty darn complex. That is granted. There are books written on the subject that do a good job of explaining the basics of the kernel. It’s been a while since I looked at any of them. I had one for 2.4.
Although, I’m sort of kicking myself for taking its advice. I bought it with the idea of getting into hacking the scheduler. The section on the scheduler had some big bold text that said essentially “THIS SECTION IS WELL OPTIMISED, DO NOT TRY HACKING HERE, YOU WILL NOT COME UP WITH ANYTHING BETTER, EVER”. And yes, scheduler algorithms can be complex, but it has obviously been improved since then. I wish I had ignored that part.
It sounds more like it meant to say: _you_ will not come up with anything better. Because it has already had so many people look at it, and from just reading the code it will not be clear why things are the way they are.
Although it can definitely use improvement on the interactive side. People have been doing a lot of work on that lately though.
I just wish a distribution would come out ‘soon’ with 2.6.39 when it is ready. I’ve seen so many good changelog entries from 2.6.37 and 2.6.38, and promises for 2.6.39.
Because I think Linux has a lot of potential as a desktop and I keep hoping it will deliver what people want. It seems to be improving every time, but progress feels slow.
I don’t really understand your comment. The book on the 2.4 version of the Linux kernel had some stern language warning the reader not to try to improve the scheduler. Obviously, it’s been improved. I wish I had ignored it and spent more time trying to understand schedulers.
I think it’s written like that because too many people try to suggest a “new and improved” scheduler. And because of that, the kernel devs (the book I read is by one of the kernel devs) got tired and just wrote something like that.
Yeah, the scheduler has been improved. However, it has not been improved by people who were having a first go at understanding the internals of an Operating System.
You have to learn how to walk before you can think about running a marathon.
While it’s highly unlikely that any given individual would produce a “revolution in computing”, it certainly isn’t impossible. Linux, of course, was initially a single person’s effort, and Unix was initially a two-person effort. FORTH, of course, had a large impact on computing (and was essentially an OS as well as a programming language) and that was a single person’s effort, as were CP/M and QDOS (AKA MS-DOS); admittedly, CP/M cribbed a fair amount from DEC’s RT-11 OS.
And while emulating an existing OS API might not be the path to success, providing POSIX compatibility will certainly expedite porting applications (shells, compilers, etc.).
As for capital, should someone come up with something beneficially revolutionary, venture capitalists might be persuaded to make a monetary contribution in exchange for part ownership.
Of course, perhaps only one such effort in 1000 is likely to have that sort of success.
Also, as for a gateway to an exciting job, one might be able to parlay such an effort into a PhD topic, and a PhD would certainly increase the job opportunities.
“While it’s highly unlikely that any given individual would produce a ‘revolution in computing’, it certainly isn’t impossible. Linux, of course, was initially a single person’s effort, and Unix was initially a two-person effort…”
That’s just the point: small/individual efforts succeeded back then because the market was empty. Many of us are capable of doing what Linus did with Linux, but it doesn’t matter any more. Efforts today are in vain; being “better” is not really as significant as being first or having the stronger marketing force.
I’m not trying to downplay Linus’ achievement in the least, but it is likely his pet project would be totally irrelevant if he started in today’s market.
Linux was not the first by a long shot. There were plenty of open source unix-like OS by the time he started writing a single line of code: Minix, and BSDs for example.
There are plenty of opportunities for new stuff to come out of someone’s pet project. In fact most interesting stuff usually comes from “pet projects”, because once a product/project is established it tends to gather such inertia that it becomes pigeon-holed or develops a certain level of tunnel vision, thus missing some of the interesting stuff in the periphery that those “pet projects” have more freedom to explore.
When Linus started writing code for Linux, Minix cost $69 and was not yet freely distributable (not until 2000!), and BSD was tied up in a lawsuit with AT&T. Hurd was intended as the kernel for the GNU system, but was not yet (and still isn’t) complete.
Linux was totally a success due to being at the right place, at the right time.
“Linux was not the first by a long shot. There were plenty of open source unix-like OS by the time he started writing a single line of code: Minix, and BSDs for example.”
Exactly! If Linux had started a few years later, FreeBSD (or another variant) would have “won” and would be grabbing all the attention instead of Linux.
Same can be said for Microsoft/DOS. Timing is everything.
I agree that it is possible (albeit improbable), but I’ve written this for a few reasons:
1/Due to it being so unlikely, I think it’s nice as a dream (“I would like to bring the next revolution”) but not as a goal (“I will bring the next revolution”). I’d love my OS to have some impact in many, many years, but if it has none (which is likely) I’m ready to admit that it’s a success anyway as long as I am happy with it and its hypothetical future users are too.
2/Trying to write something revolutionary is one of the paths to feature bloat. Once you want to impress your user base, it’s easy to fall for shiny features, start to add lots and lots more, and end up eating up 13GB of HDD space for something which is not more useful than a calculator, a notepad, a primitive word processor, a web browser and a file explorer.
3/Matching an existing operating system, given where they are today, is a matter of years. To stay motivated for that long, I think it’s best to feel the thing is rewarding as it is, not as it’s supposed to be in a very long time.
Yes, but is it a good idea to begin with the idea that you’ll make your OS POSIX-compatible? Emulating POSIX is a big task, and since hobby OS developers are generally one or two per project, they must choose their goals very carefully. This one costs a lot of energy while in the end you get nothing but a “Linux clone” (in the view of its potential users; I know that Linux != POSIX), which has yet to prove that it’s better than the original.
I mentioned research teams as a way to monetize OS development; a PhD is one of the ways to get in there.
Not in France and not by any account I’ve heard of/read about and not by my experience. Having a PhD has done nothing as far as landing me an interesting or rewarding (note that I didn’t write “exciting”) job. And God knows I have looked in every direction. Out of spite (and to be honest, because the opportunity appeared on my radar), after five months in my current position, I am turning my head and heart towards working for myself. Unfortunately, it means the project I was working on will go commercial, at least in the beginning.
This was an excellent article. Very well informed and hit the issues right on the head. I was tickled to see C# mentioned as a system language since I was one of the SharpOS developers 😀
I look forward to the rest of the series!
I enjoy articles like these, they differ so much from the regular stuff here on OSNews and are more like the stuff people actually expect to see here. It would be great if we got more some day
As for the topic at hand (coding my own OS): I have always wanted to start coding an OS of my own, but I am really bad at actually starting something. I have no delusions of it ever reaching more than 1 user or something like that, the only reason why I’d want to code an OS of my own is simply to learn. Nothing beats learning kernel internals, memory handling and all that like actually coding it all yourself from scratch!
Actually, it was the same for me for a long time. My OS is my first successful long-term personal project so far, and I have two failed attempts behind me (the main reason for their failure will be explained at the beginning of article #2).
This is an absolutely brilliant article, so much so I had to create an account and leave a comment.
I’ve probably spent a year on malloc and still haven’t finished, because of all the downtime and the fact that you have to get kprintf working with variable arguments and so forth. I copied a tutorial to get it working, and now I’m in the process of spending months learning what interrupts are and playing with Assembly retrospectively. I understand what basic assembly, stacks and registers are, but you have to learn as you go along.
God it’s a long road but I’m not giving up! If I spend 5 hours playing with basic assembly and don’t touch the source code of my OS…I still consider that time spent _working_ towards my OS.
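In case it helps anyone at the same stage, the kprintf-with-variable-arguments step boils down to something like this sketch (vga_putchar() is a made-up name for whatever low-level character output you already have; only %c, %s and %x are handled):

    #include <stdarg.h>

    void vga_putchar(char c);   /* assumed to be provided elsewhere in the kernel */

    static void put_hex(unsigned int value)
    {
        static const char digits[] = "0123456789abcdef";
        for (int shift = 28; shift >= 0; shift -= 4)
            vga_putchar(digits[(value >> shift) & 0xF]);
    }

    void kprintf(const char *fmt, ...)
    {
        va_list args;
        va_start(args, fmt);
        while (*fmt) {
            if (*fmt != '%') {
                vga_putchar(*fmt++);
                continue;
            }
            fmt++;                          /* skip the '%' */
            if (*fmt == '\0')
                break;
            switch (*fmt++) {
            case 'c': vga_putchar((char)va_arg(args, int)); break;
            case 'x': put_hex(va_arg(args, unsigned int));  break;
            case 's': {
                const char *s = va_arg(args, const char *);
                while (*s)
                    vga_putchar(*s++);
                break;
            }
            default:  vga_putchar('%');     /* unknown specifier: just print the '%' back */
            }
        }
        va_end(args);
    }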
Neo, what were your thoughts when deciding between using machine language (like MenuetOS uses) or C to code the OS?
It depends. If you want to write portable code, you should avoid Assembly like the plague due to its highly machine-specific nature. I also find it much harder to write well-organized, easy-to-debug ASM code, but that might be due to my relative lack of experience with it. If you feel this way too, you should probably be using a higher-level language instead: it’s very important to keep your codebase as tidy as possible.
Otherwise, as you say yourself, MenuetOS itself shows that it’s possible to get some interesting results with Assembly.
I’d strongly advise against assembly: unless you have years and years of experience with it, you’ll sooner or later lose your overview of the whole thing simply due to the sheer amount of text you’ll be writing. Not to mention how incredibly tedious it is.
Of course it’s a way of learning assembly, yes, but there are plenty of better ways of going about that. If you plan to also learn kernel programming at the same time, then you’re faced with the famous chicken-and-egg problem: you need to learn assembly to do kernel coding, and you need kernel coding to learn assembly.
Of course opinions are opinions, but I’d much rather suggest to set out to learn one thing at a time: either assembly, or kernel programming, not both. If you wish to do kernel programming then choose a language you’re more familiar with.
About keeping track of hardware changes.
The main reason SkyOS became dormant is the rapid development of hardware
http://www.skyos.org/?q=node/647
I was wondering whether a hardware driver in machine language is easier to add to your kernel and more universal than, say, a driver written in a higher-level language. But maybe then you would have less functionality?
Not necessarily. You can create wrappers in order to use the arch-specific assembly functions you need in your language of choice, and keep all the logic written in this language.
As an example, if I need to output bytes on one of the ports of my x86 CPU, all I have to do is to create a “C version” of the OUTB assembly instruction, using macros and inline assembly, and after that I can use it in the middle of my C code at native speed.
With GCC’s inline assembly, it’d look something like this:
    #define outb(value, port)               \
        __asm__ volatile (                  \
            "outb %b0, %w1"                 \
            :: "a" (value), "Nd" (port)     \
        )
Yes, it’s ugly, but you only have to do it once.
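For completeness, a small usage sketch (assuming standard VGA text mode, with 0x3D4 as the CRTC index port and 0x3D5 as the data port): moving the hardware cursor with nothing but that wrapper.

    /* Move the VGA hardware cursor to character cell 'pos' (column + 80 * row). */
    void vga_move_cursor(unsigned short pos)
    {
        outb(0x0F, 0x3D4);                 /* select the "cursor location low" register  */
        outb(pos & 0xFF, 0x3D5);
        outb(0x0E, 0x3D4);                 /* select the "cursor location high" register */
        outb((pos >> 8) & 0xFF, 0x3D5);
    }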
The majority should definitely be coded in a high level language.
In the days of DOS TSRs, we wrote in assembly because the code had to be small and efficient.
Even to this day it’s often easy to beat a compiler’s output, simply because it is constrained by a fixed calling convention. In x86 assembly language, I am free to stuff values where I please. A variable can be stuffed into segment registers, and a function returning a boolean can use the “zero flag”. This eliminates the need to do a “cmp” in the calling function.
In my own assembly, I can keep variables intact across function calls without touching the stack. To my knowledge, all C compilers use unoptimized calling conventions by design so that separately compiled object files link correctly.
Of course above I’m assuming that function call overhead is significant, YMMV.
However, all this optimization aside, looking back on TSRs I wrote, it takes a long time to familiarize oneself with code paths again. This is true of HLL too, but even more so with assembly. It is crucial to comment everything in assembly, and it is equally crucial that the comments be accurate.
Low-level assembly is just not suitable for code that is meant to be developed over time by many people. Besides, it’s not portable, and cannot benefit from new architectures with a simple recompile.
Off the top of my head, the bootloader and protected mode task management are the only things which really require any significant assembly language.
Even to this day it’s often easy to beat a compiler’s output
I quite doubt that. There’s been plenty of discussion of this, and the general consensus nowadays is that it’s really, really hard to beat at least GCC’s optimizations anymore. Though, I admit I personally have never even tried to.
I have tried it, just for fun. I even used some example assembly provided by AMD, optimised for my processor, for parts of it, but could only get up to about 80% of the GCC performance.
Just remember kids, premature optimization is the root of all evil, similar to certain other premature…mishappenings!
Hmm… Theoretically speaking, I think premature optimization is worse.
I’d spontaneously believe that people can find a benefit in the latter in the end, once they get past the cultural barrier around it: it forces them to diversify their body language much earlier and much more than most other kids would, which is beneficial to everyone in the end.
On the other hand, premature optimization is always a waste. I just can’t think of a way to extract a benefit from it.
“On the other hand, premature optimization is always a waste. I just can’t think of a way to extract a benefit from it.”
I am just being a contrarian here for the sake of it, but all too often a business will ship code without any consideration of the efficiency of the contained algorithms. The users inevitably come back complaining about performance, and the ultimate solution ends up with buying new hardware since nobody wants to touch the code.
Arguably, a good programmer, who keeps an eye open for places to optimize, should do so immediately if there is no loss in readability. Programmers who get used to this practice will train themselves to write more efficient code from the start without spending too much time thinking about it.
On the other hand, I found myself in an argument with a CTO at an ex-employer about removing all string concatenation in .NET web services in favor of string builders. He was so adamant about this that it became a company-wide policy to do all concatenation with a string builder. Presumably he had read somewhere that string builders were better, and he over-generalized the point to the extreme.
I submitted a test suite to measure his claims and, lo and behold, for the vast majority of concatenations he was having us remove, the change actually made things (marginally) worse. So there he is, convincing clients we’re optimizing the product… He didn’t want to lose credibility, so his policy stuck, but we didn’t get along after that. What a mess.
I hope someone kicks me off my high horse if I ever become so blind.
The problem is that although experience can help sometimes, more often than not we don’t actually know what is slow in a program before we see it running.
The first time code is written, the primary priority of a developer should not be speed, but cleanness, maintainability, and getting it to work.
Once code is running and working, we can profile it and see where it spends its time. Afterwards, we can optimize accordingly. This saves a lot of time, and helps keep the vast majority of the code clean while only putting dirty optimizing tricks in the parts which actually need to be optimized.
“The first time code is written, the primary priority of a developer should not be speed, but cleanness, maintainability, and getting it to work.”
I don’t disagree with the order of priorities here. But I do believe efficiency should have a larger role up front than you’re suggesting.
Once the whole thing works, the “cement” has already begun to dry, so to speak.
The way I interpret this “early optimization is the root of all evil” quote, it means that one should not prioritize optimization above more important concerns the first time the code is written.
If there’s no compromise to make, then I agree that it’s possible to optimize right from the first time, or even at the design stage.
But what I was trying to point out is that it’s relatively easy to optimize well-written code, as long as it is sufficiently flexible by design (i.e. we can optimize without breaking function calls or things like that). So one should not worry too much about optimization too early.
Good cement dries slowly, so to speak.
You shouldn’t do *anything* up-front except writing a prototype to get feedback. Then you’ll see what you did wrong, including performance issues. If “drying cement” keeps you from fixing problems, your project will have much more serious issues than performance.
“You shouldn’t do *anything* up-front except writing a prototype to get feedback. Then you’ll see what you did wrong, including performance issues.”
I don’t think anybody here was referring to the prototype. Nevertheless, if your prototype is functional enough that it can be used for performance analysis, then it seems to me that you’ve already put in a significant investment, no?
In any case, if you believe in building a separate prototype and using that for performance analysis, then you really ought to agree with me that planning for an efficient design up front is important since that’s essentially what a prototype is. This way you can analyze what works and performs well BEFORE you get too far into development of the production code.
“If ‘drying cement’ keeps you from fixing problems, your project will have much more serious issues than performance.”
What is that supposed to mean?
I think I’ve made a strong case on why planning for efficiency early on is important, sometimes more so than optimizing later on. It doesn’t really matter if I’m right or not because it flies against the tide of authoritative people who say the exact opposite. In the end, no matter how much merit my argument actually holds, I know that I’ve lost.
I think I’ve made a strong case on why planning for efficiency early on is important, sometimes more so than optimizing later on.
I kind of feel you guys are arguing semantics, but I guess it’s more of a personal taste. Anyways, the way I see “Premature optimization is the root of all evil” is that there is no point in trying to optimize your code before you even have all the required features working bug-free.
Having a clear plan of what components interact with what other components and how to avoid bottle-necks etc. has nothing to do with code optimization, ie. it isn’t covered by the slogan.
A good coder plans ahead and tries to produce clean, readable and functional code without devoting much effort to optimization; then, when all the requirements are fulfilled, he will analyze his code while it is running. Why? More often than not there are only a few bottlenecks that cause 90% of the negative performance impact, and that’s where optimization should be concentrated.
Depends on the case, but yes, the first prototype *can* be a major investment. Point is, if you invest in anything else, your investment is most likely lost. What the actual requirements of a project are, and what you think they are, usually differ so much that you’d be starting in a totally wrong direction. A prototype then often shows you the right direction.
Remember, a correct and complete requirements specification only happens in fairy tales, and when you ask your users on a purely theoretical basis, i.e. without a prototype, they won’t give you reliable information.
No. The prototype isn’t for performance analysis. You build it to see if your whole project is heading in the right direction, because if it isn’t, any performance optimization is wasted time.
If the direction is right, the prototype next shows you not how to make the design efficient, but whether efficient design is actually needed. See, for example, the above-mentioned Minix mkfs example. In real projects, developers will have it much harder to “guess” how often functionality is used than with mkfs.
It means that if you detect performance problems late, you’d still fix them. Just as if you detect other fundamental design issues late, you fix them.
It means that if at a later point there are any we-won’t-touch-that-part-anymore pieces in the project, and they are an obstacle to core functionality, you’re doomed. So you don’t build such pieces; you build pieces that can be changed even at a late point in the development cycle.
EDIT: I might add that this is becoming a bit off-topic. A hobby OS doesn’t have users, it does not have requirements, and maybe you optimize performance just for the fun of it. It does not make sense to ask whether performance is “sufficient” when nobody is actually using your OS for anything, so you rather optimize until you’re satisfied with it.
It depends! If you plan to become king of the OS world, you also have to get the fastest OS, don’t you?
(Myself, looking at the current kings of the hill, I think it’s definitely not the case, but well…)
“Depends on the case, but yes, the first prototype *can* be a major investment. Point is, if you invest in anything else, your investment is most likely lost.”
This is a generalization, what do you really mean here?
“A prototype then often shows you the right direction.”
Of course.
“No. The prototype isn’t for performance analysis. You build it to see if your whole project is heading in the right direction, because if it isn’t, any performance optimization is wasted time.”
Again, you are the one who brought the prototype into the discussion, no one here has said anything about needing to optimize the prototype.
What I did say was that it is important to analyze and predict potential efficiency problems with the design before getting so far into development that it’s hard to change. As far as I can tell, your statements like the following seem to agree with my reasoning.
“If the direction is right, the prototype next shows you not how to make the design efficient, but whether efficient design is actually needed.”
“See, for example, the above-mentioned Minix mkfs example. In real projects, developers will have it much harder to “guess” how often functionality is used than with mkfs.”
I’d wager a guess that the Minix mkfs probably worked within one week and that the student was optimizing it afterwards for six months. If that is the case, then it’s not really relevant to a discussion on early optimization. Either way, we agree this situation was ridiculous.
“It means that if you detect performance problems late, you’d still fix them. Just as if you detect other fundamental design issues late, you fix them.”
But fixing them late can be orders of magnitude harder, to the point where many projects simply give up on the notion of fixing them; please see my earlier examples.
“So you don’t build such pieces; you build pieces that can be changed even at a late point in the development cycle.”
Whether we like it or not, there comes a time in many projects where we pass a point of no practical return. For example, the choice between threading, poll/select(), multi-process, or AIO needs to be decided early on. No matter how good your final design is, it requires a great deal of effort to change models – every single IO point is potentially affected.
Sure it can be done, but those of us with optimization experience can avoid many performance problems so that we won’t have to cross such bridges later on. Again, see my earlier examples.
I’m not talking about profiling each function or anything like that, is that what you’re thinking? I’m really referring to optimizing the overall design for performance at a higher level.
“A hobby OS doesn’t have users, it does not have requirements, and maybe you optimize performance just for the fun of it. It does not make sense to ask whether performance is ‘sufficient’ when nobody is actually using your OS for anything.”
True, but sometimes those pet projects become famous and for better or worse the future has to deal with legacy decisions whether or not they were well thought out.
I’ll concede the industry consensus is on your side.
Professors preach how premature optimization is bad.
Client contracts generally don’t spec out terms for efficiency up front.
Employers never ask about my ability to generate efficient code.
With Moore’s law, who cares if code is less efficient? RAM is cheap, and so are more cores.
This collective mindset has led to code bloat of unimaginable proportions. A desktop with productivity apps requires insane amounts of RAM and CPU for no good reason.
I guess I’m just one of the few dinosaurs left who appreciates the elegance of tight code, and I just lash out in vain at what I view to be the cause of its downfall: the trivialization of optimal code and efficiency.
That’s not what I said, but you must have encountered one of the earlier, less clear edits of my post.
I do not advocate the absolute lack of software optimization which we have nowadays. Though I think feature bloat and poor design are more to blame there, as an aside.
What I advocate is only not optimizing too much too early.
There are several reasons for this.
First, it’s a great way to lose time you could have spent on optimizing something more important. Tanenbaum tells us an interesting story in Modern Operating Systems, for that matter: one of his students, who worked on MINIX, spent 6 months optimizing the “mkfs” program, which writes a filesystem on a freshly formatted disk, and more months debugging the optimized version in order to make it work.
This program is generally called exactly once in the life of the operating system, so was it really worth the effort? Shouldn’t he have cut his optimizer’s teeth on something more important, like, say, boot times?
The second reason why early optimization is bad is that, as I mentioned earlier, there’s a degree of optimization past which code becomes dirtier and harder to debug. Caching is a good example. Clean code is very, very important, because past a certain degree of dirtiness one can do nothing with code. Not even optimize it. So this is a tough decision, one that is not trivial and which should only be made after profiling has shown that the code is actually too slow as is.
“Second reason why early optimization is bad is that, as I mentioned earlier, there’s a degree of optimization past which code becomes dirtier and harder to debug.”
Re-writing code in assembly (for example) is usually a bad idea even after everything is working, surely it’s even worse to do before. But then this isn’t the sort of optimization I’m referring to at all.
Blanket statements like “premature optimization is the root of all evil” put people in the mindset that it’s ok to defer consideration of efficiency in the initial design. The important factors proposed are ease of use, manageability, etc. Optimization and efficiency should only be tackled on at the end.
However, some designs are inherently more optimal than others, and switching designs midstream in order to address efficiency issues can involve a great deal more difficulty than if the issues had been addressed up front.
For a realistic example, see how many Unix client/server apps start by forking for each client (sketched below). This design, while easy to implement up front, tends to perform rather poorly. So now we have to add incremental optimizations such as preforking and adding IPC, then we have to support multiple clients per process, etc.
After all this work, the simple app plus optimizations ends up being more convoluted than a more “complicated” solution would have been in the first place.
The Apache project is a great example of where this has happened.
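For concreteness, the fork-per-client shape I mean is the classic accept loop below; handle_client() is a made-up name for the per-connection logic, and error handling and child reaping are omitted.

    #include <sys/socket.h>
    #include <unistd.h>

    void handle_client(int fd);            /* assumed to exist */

    void serve_forever(int listen_fd)
    {
        for (;;) {
            int client = accept(listen_fd, NULL, NULL);
            if (client < 0)
                continue;
            if (fork() == 0) {             /* child: one whole process per connection */
                close(listen_fd);
                handle_client(client);
                close(client);
                _exit(0);
            }
            close(client);                 /* parent keeps only the listening socket */
        }
    }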
The Linux kernel has also made some choices up front which have made optimization extremely difficult. One such choice has been the dependence on kernel threads in the filesystem IO layer. The cement has long dried on this one. Every single file IO request requires a kernel thread to block for the duration of the IO. Not only has this design been responsible for numerous lock-ups for network file systems, due to it being very difficult to cancel threads safely, but it has impeded the development of efficient asynchronous IO in user space.
Had I been involved in the development of the Linux IO subsystem in the beginning, the kernel would have used async IO internally from the get go. We cannot get there from here today without rewriting all the filesystem drivers.
The point being, sometimes it is better to go with a slightly more complicated model up front in order to head off complicated optimizations at the end.
Actually, I think this whole fork() thing started as a memory usage optimization. On systems with a few kilobytes of memory like the ones which UNIX was designed to support, being able to have two processes using basically the same binary image was a very valuable asset, no matter the cost in other areas.
Nowadays, however, even cellphones have enough RAM for the fork() system to be a waste of CPU time and coding effort (hence the decision of the Symbian team not to include it a while ago). But due to legacy reasons and its mathematical elegance, it still remains.
All hail the tragedy of legacy code which sees its original design decisions becoming irrelevant as time passes (the way I see it).
I think that should we look deeper, we’d find legacy reasons too. After all, as Linux was designed as a clone of ye olde UNIX, maybe it had to behave in the same way on the inside for optimal application compatibility too? Asynchronous IO is, like microkernels, a relatively recent trend, which has only been made possible by computers becoming sufficiently powerful to largely afford the extra IPC cost.
Aw, wonderful Neolander! A new topic, I was getting tired of listening to myself go on about premature optimization.
“I think that should we look deeper, we’d find legacy reasons too. After all, as Linux was designed as a clone of ye olde UNIX, maybe it had to behave in the same way on the inside for optimal application compatibility too ?”
An asynchronous kernel can easily map to synchronous or asynchronous userspace.
A synchronous kernel easily maps to synchronous user space.
However, a synchronous kernel does not map well to asynchronous user space, which is the scenario we find ourselves in under Linux.
Some AIO background:
For those who aren’t aware, setting the O_ASYNC flag is not supported for file IO. POSIX defined a few new functions to enable explicit asynchronous access to arbitrary descriptors.
In the case of POSIX AIO support in Linux, it’s built directly on top of pthreads, which is a major disappointment to anyone looking to use AIO. Each request creates a new thread which blocks synchronously inside the kernel waiting for a response. When done, the kernel returns to the user thread, which is configured to either raise a signal or call a user-space function.
The problem here is that the whole point of AIO is to avoid the unnecessary overhead of a new thread and stack for each IO request. Ideally, we’d just tell the kernel what we need to read/write and it will notify us when it’s done. With true AIO, there is no need to block or use threads.
More recently, kernel developers have created a new syscall to expose a real AIO interface to userspace. You can install “libaio” to access this interface. It’s not 100% POSIX compatible, and sadly it still uses threads within the kernel (due to legacy design decisions in the kernel), but at least userspace is completely free to replace the synchronous file interface.
There’s more bad news though: the kernel-level AIO has only been implemented for block devices. Therefore sockets, pipes, etc. won’t work. This is why POSIX AIO on Linux continues to use pthreads.
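To make this concrete, here is roughly what the POSIX AIO interface looks like from user space (a sketch with error handling left out; on glibc it is backed by the thread pool described above rather than by true kernel AIO):

    #include <aio.h>
    #include <errno.h>
    #include <fcntl.h>
    #include <string.h>

    int read_async(const char *path, char *buf, size_t len)
    {
        struct aiocb cb;
        memset(&cb, 0, sizeof cb);
        cb.aio_fildes = open(path, O_RDONLY);
        cb.aio_buf    = buf;
        cb.aio_nbytes = len;
        cb.aio_offset = 0;

        if (aio_read(&cb) < 0)             /* queues the request and returns at once */
            return -1;

        while (aio_error(&cb) == EINPROGRESS) {
            /* the process is free to dispatch more I/O or do real work here;
               aio_suspend() or a completion notification avoids this polling */
        }
        return (int)aio_return(&cb);       /* bytes read, or -1 */
    }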
“Asynchronous IO is, like microkernels, a relatively recent trend, which has only be made possible due to computers becoming sufficiently powerful to largely afford the extra IPC cost.”
Actually, AIO interfaces reflect the underlying hardware behavior more naturally and efficiently than blocking alternatives. AIO has its roots in interrupt-driven programming, which is inherently asynchronous.
For whatever reason, operating systems began exposing a blocking interface on top of asynchronous hardware, such that multi-process/multi-threaded programming became the norm.
I seem to be trailing off topic though…
But in this case, what would happen when process B sends an I/O request to the kernel while it’s already processing the I/O request of process A? I think that if RAM is cheap or stacks are small, spawning a new thread actually makes more sense there.
You’re a bit preaching to the converted there.
But we’re living in an algorithmic world where programs are supposed to follow a linear track of instructions without interrupts brutally changing the execution flow. If we don’t want to drop algorithms altogether (that would probably be a bad idea, as most people think in terms of algorithms), we must somehow make sure that interrupts do not affect the current flow of the program. Hence the idea of pop-up threads, in my opinion one of the best concepts for interrupt management so far: don’t create a thread just to have it block for I/O right away, but rather spawn a new thread when the I/O is completed in order to handle the consequences.
Well, we’re talking about OS design, right?
“But in this case, what would happen when process B sends an I/O request to the kernel while it’s already processing the I/O request of process A ?”
Assuming the kernel is written asynchronously (which linux is not), then userspace AIO requests map directly to the kernel ones.
Process A:
start read for sector 20, callback &cba
Kernel:
start read for sector 20, callback (Process A, function &cba)
Process A:
idle
Process B:
start read for sector 50, callback &cbb
Kernel:
start read for sector 50, callback (Process B, function &cbb)
Process B:
idle
At this point the kernel has two requests queued and there are no blocked threads doing synchronous IO. Each can continue dispatching more IO if they need to.
Kernel:
Block 20 is read, notify Process A to call &cba
Process A:
Call &cba somewhere in the event loop
Kernel:
Block 50 is read, notify Process B to call &cbb
Process B:
Call &cbb somewhere in its event loop
Voila!
Writing asynchronously designed software is very different for people who’ve accustomed themselves to synchronous designs.
Obviously Linux uses signals for kernel callbacks, which are rather inelegant, but there are workarounds which allow programmers to create an efficient and safe event loop out of them (see signalfd).
Once you are used to them, async designs can provide very nice solutions compared to blocking threads. The actual API can vary a lot (POSIX AIO is so-so). This way a process never blocks waiting for IO. An IO-bound process spends all its time dispatching requests directly to the kernel without involving intermediary mechanisms such as threads. AIO also does away with the need for locking primitives such as semaphores, and with them a whole class of race conditions.
“I think that if RAM is cheap or stacks are small, spawning a new thread actually makes more sense there.”
If a developer chooses to use an AIO callback design in his app, then do you agree that it makes no sense to introduce blocking threads in the aio library itself which the app will not use?
I’d assert that the primary benefit of threads is to handle CPU-bound processes; spawning threads to serve IO requests doesn’t make sense.
It’s one thing to say that threads make IO programming easier, but I truly believe that most programmers would prefer AIO designs once they had sufficient experience with it.
“but rather spawn a new thread when I/O is completed in order to handle the consequences.”
POSIX AIO supports this today; in my opinion though, this should only be used if you’re going to do some CPU-bound calculations. If, after receiving an IO completion callback, you are only going to send out another IO request, then why bother with the new thread at all?
“But we’re living in an algorithmic world where programs are supposed to follow a linear track of instruction without interrupts brutally changing the execution flow.”
Without threads, you do have to break up execution into callbacks or have language support for “yield()” functionality like .NET. But it sounds like you are assuming that code can be interrupted at any point in time, which would be awful.
However, AIO can be handled within a relatively simple event loop reading from a queue of events. This event loop can be extremely small and generic by simply invoking callbacks.
I highly recommend everyone study the new linux mechanisms: signalfd, timerfd, eventfd, pollfd.
Once you use them I think you’ll agree they’re awesome, and they go hand in hand with AIO designs.
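As a small taste, here is a minimal signalfd sketch (Linux-specific): the signals are blocked as asynchronous events and read back as ordinary data, so they can sit in the same poll()/epoll loop as the rest of your I/O.

    #include <sys/signalfd.h>
    #include <signal.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        sigset_t mask;
        sigemptyset(&mask);
        sigaddset(&mask, SIGINT);
        sigaddset(&mask, SIGTERM);
        sigprocmask(SIG_BLOCK, &mask, NULL);   /* stop normal asynchronous delivery  */

        int sfd = signalfd(-1, &mask, 0);      /* ...and receive the signals as data */

        struct signalfd_siginfo info;
        while (read(sfd, &info, sizeof info) == sizeof info) {
            printf("got signal %u\n", (unsigned)info.ssi_signo);
            if (info.ssi_signo == SIGTERM)
                break;
        }
        close(sfd);
        return 0;
    }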
This is the longest thread I’ve ever read on OSnews. Granted, I didn’t grow a long tooth around here but it’s still impressive.
“Even to this day it’s often easy to beat a compiler’s output”
“I quite doubt that. There’s been plenty of discussion of this and the general consensus nowadays is that it’s really, really hard to beat atleast GCC’s optimizations anymore.”
I feel you’ve copied my phrase entirely out of context. I want to re-emphasize my point that C compilers are constrained by strict calling conventions, which imply shifting more variables between registers and the stack than would be necessary by hand.
I’m not blaming GCC or any other compiler for this, after all calling conventions are very important for both static and dynamic linking. However it can result in code which performs worse than if done by hand.
As for your last sentence, isn’t the consensus that GCC output performs poorly compared to other commercial compilers such as intel’s?
I would be interested in seeing a fair comparison.
As for your last sentence, isn’t the consensus that GCC output performs poorly compared to other commercial compilers such as intel’s?
I would be interested in seeing a fair comparison.
Apparently this is true. I googled a bit and found three benchmarks:
http://macles.blogspot.com/2010/08/intel-atom-icc-gcc-clang.html
http://multimedia.cx/eggs/intel-beats-up-gcc/
http://www.luxrender.net/forum/viewtopic.php?f=21&t=603
They’re all from 2009 or 2010, and in all of them icc beats GCC by quite a large margin, not to mention icc is much faster at doing the actual compiling, too. Quite surprising. What could the reason be, then? Why does an open-source compiler fare so poorly against a commercial one?
“What could the reason be then, why does an open-source compiler fare so poorly against a commercial one?”
Without looking at the assembly, it’s just speculation on my part.
I’ve read that GCC is rather ignorant of code & data locality and CPU cache lines; therefore binary code placement is arbitrary rather than optimal. In theory, this could make a huge difference.
Function inlining is usually good, but only until the cache lines are full; further inlining is detrimental. This may be a weakness for GCC.
The compilers inject prefetch hints into the code, maybe GCC predicts the branches less accurately at compile time?
GLIBC is notoriously bloated. I don’t know if Intel links in its own streamlined C library? That might make a difference.
As for compilation time, ICC has an “unfair” advantage. If GCC had been compiled under ICC, then GCC itself might perform much better – though I’m not sure the GCC folks would want to admit to that.
Thinking about it further… the GCC bottleneck may not be the compiler at all but just the malloc implementation.
In an earlier post, I had mentioned that I made my own malloc which performs much better than GNU’s malloc in multithreaded apps.
I think my implementation fits somewhere between ptmalloc and Hoard on the following chart.
http://developers.sun.com/solaris/articles/multiproc/multiproc.html
I developed mine from scratch, so I have no idea why GNU’s malloc is slow, but I’m baffled as to why GNU continues to ship a slow implementation.
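For readers wondering what “between ptmalloc and Hoard” means in practice, the common theme of those allocators is per-thread arenas, so that the hot path never touches a shared lock. A very rough sketch of that idea (my own illustration, not the poster’s code; free lists, size classes and releasing memory back to the OS are all left out, and my_malloc is a hypothetical name rather than a drop-in malloc) could look like this:

/* Rough sketch of the per-thread-arena idea behind allocators like ptmalloc
 * and Hoard: the common case never touches a shared lock. */
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

#define ARENA_SIZE (1 << 20)               /* one 1 MiB bump arena per thread */

struct arena {
    char  *base;
    size_t used;
};

static pthread_key_t  arena_key;
static pthread_once_t arena_once = PTHREAD_ONCE_INIT;

static void arena_make_key(void) { pthread_key_create(&arena_key, NULL); }

static struct arena *arena_get(void)
{
    pthread_once(&arena_once, arena_make_key);
    struct arena *a = pthread_getspecific(arena_key);
    if (!a) {                              /* first allocation on this thread */
        a = malloc(sizeof *a);             /* bootstrap from the system allocator */
        a->base = malloc(ARENA_SIZE);
        a->used = 0;
        pthread_setspecific(arena_key, a);
    }
    return a;
}

void *my_malloc(size_t size)               /* hypothetical name, for illustration only */
{
    struct arena *a = arena_get();
    size = (size + 15) & ~(size_t)15;      /* keep 16-byte alignment */
    if (a->used + size > ARENA_SIZE)
        return malloc(size);               /* overflow: fall back to the system allocator */
    void *p = a->base + a->used;           /* bump allocation: no lock, no contention */
    a->used += size;
    return p;
}

The design point is simply that contention, not raw allocation speed, is what hurts a global-lock malloc under many threads.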
It’s all about money 😉
Well, *I* am not surprised: remember that, very recently, distributions switched their JPEG rendering libraries and got a 20% performance improvement.
You can see this in two ways:
– the optimistic view: nice, a 20% improvement!
– the realistic view: JPEG is very old, and the improved library uses ISA extensions which are very old too, so why are we only getting this 20% improvement now?
My view is this: open-source developers like to have very flexible software combinations, so GCC compiles many languages on many architectures, but from a performance point of view the situation isn’t very good.
1) Because GCC is *old* in terms of computer software. It has decades of cruft to take along for the ride.
2) GCC is portable. Its optimizations can only go so far without breaking compatibility with, say, the 68040.
3) Exactly because it is open source. That model means gradual evolution, almost never rewriting/reinvention.
No. As neolander said, you can wrap machine language instructions in C functions (or Java methods, or anything else that models a procedural construct).
The only area where this is impossible is when the function call itself interferes with the purpose of the instruction, and this occurs in very, very few places: certainly not hardware drivers, but rather user-to-kernel switching, interrupt entry points, etc.
As a side note, the NT kernel is based on an even more low-level core called “HAL” (hardware abstraction layer) that IIRC encapsulates those few places where you actually NEED machine language.
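To illustrate what such wrappers look like in practice, here is a minimal sketch using GCC inline assembly for x86 port I/O and interrupt masking (names like outb/inb follow common hobby-OS convention, not any particular codebase):

/* Minimal wrappers around single machine instructions, using GCC inline
 * assembly on x86: port I/O and interrupt masking as examples. */
#include <stdint.h>

static inline void outb(uint16_t port, uint8_t value)
{
    __asm__ volatile ("outb %0, %1" : : "a"(value), "Nd"(port));
}

static inline uint8_t inb(uint16_t port)
{
    uint8_t value;
    __asm__ volatile ("inb %1, %0" : "=a"(value) : "Nd"(port));
    return value;
}

static inline void cli(void) { __asm__ volatile ("cli" ::: "memory"); }  /* disable interrupts */
static inline void sti(void) { __asm__ volatile ("sti" ::: "memory"); }  /* enable interrupts */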
I’d also add that in some cases, there’s so much assembly in functions that devs are better off writing the whole function in assembly instead of using such wrappers.
x86 examples:
-Checking if CPUID is available
-Switching to long mode (64-bit mode) or to its “compatibility” 32-bit subset.
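For the curious, here is roughly what the CPUID-availability check does, sketched in C with GCC inline assembly for 32-bit x86 (in a real kernel this would typically live in a pure assembly file, as noted above):

/* Detect CPUID support by trying to toggle the ID bit (bit 21) of EFLAGS;
 * if the bit can be flipped, CPUID exists. 32-bit x86 only. */
#include <stdbool.h>
#include <stdint.h>

static bool cpuid_available(void)
{
    uint32_t before, after;
    __asm__ volatile (
        "pushfl\n\t"                 /* save original EFLAGS */
        "pushfl\n\t"
        "popl %0\n\t"                /* before = EFLAGS */
        "movl %0, %1\n\t"
        "xorl $0x200000, %1\n\t"     /* flip the ID bit */
        "pushl %1\n\t"
        "popfl\n\t"                  /* try to write it back */
        "pushfl\n\t"
        "popl %1\n\t"                /* after = EFLAGS as the CPU kept it */
        "popfl"                      /* restore original EFLAGS */
        : "=&r"(before), "=&r"(after) : : "cc");
    return ((before ^ after) & 0x200000) != 0;
}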
Edited 2011-01-29 22:33 UTC
Some attempts have been made to bring assembly up to a higher level, such as HLA:
http://webster.cs.ucr.edu/AsmTools/HLA/hla2/0_hla2.html
Give it a try and figure out where MASM has stalled…
Kochise
I started OS deving when I was 12, back in 2000. It started as a DOS .com file, written in assembly, with much of the necessary code based off of SHAWN OS (the GDT, IDT, etc.). It eventually turned into a monster of a single 4000-line file. Then I reformatted my HD and lost it. So I started over, this time with more experience under my belt, in C. After about 2 or 3 years of work it had a GUI with a mouse and was reading the HD and CD serials using the ATA/ATAPI identify command (anyone remember that?). Then I started from scratch again, because I realized it would take too much time to rework the monster source tree to make it load drivers as modules rather than linking them in at compile time. Worked on that for another 3 years. That was a more solid base: it could read/write the HD properly, had a filesystem (though it was somewhat buggy), a bootloader with multiboot options, loadable drivers, and a command line interface (I decided to focus on the internals rather than going for the fancy GUI right away this time). Then I stopped working on it because I grew up.
All in all, it was an amazing experience. I spent all my teenage years working on my OS or other programming things, and thanks to that I’m a pretty good programmer now, if I may say so, and have started my own company doing iPhone development for now.
If anyone wants to check out its source code, it’s still on SourceForge… http://cefarix.cvs.sourceforge.net/cefarix/
Edited 2011-01-29 21:08 UTC
I started working on my OS without having any prior knowledge of operating system- or close-to-hardware programming of any kind. The reason for trying it out was, as people here said before me, to learn how everything actually works because you can theorize as much as you want but knowing how it actually works gives a lot more weight to your arguments.
The thing that really struck me was the amount of time it took to go from nothing to something. The major part of the work I’ve done consists of just reading tons and tons of documentation, looking at examples, looking at other people’s code, and actually learning the tools (because it’s not as simple as gcc kernel.c -o kernel). The actual coding is only a small part of the time you invest.
This article amused me because it reminds me of the steady stream of posts on osdev’s forum from people trying to create an operating system that is going to be the next big thing, yet who don’t even get the most fundamental things working. Often they try to get other people to do it for them. This is actually more common than you might think.
It is not very hard to get your OS to boot and print some text on the screen, but going from there to having all the other parts done can truly be a pain in the buttocks. If you are not careful, it is easy to overlook some part that you should have thought about in the beginning, which then turns out to be a freakin’ nightmare.
Anyway, good luck to all enthusiasts out there!
Edited 2011-01-29 23:09 UTC
…is to abstract the hardware (CPU, devices, etc) and target a fictitious architecture that suits your needs.
I’ve done that for a hobby project and it became much easier to write a small hobby O/S.
My main interest was in-process component isolation; I managed to add component isolation through the page table of the CPU: each page belonged to a process’ component, and the component could only touch pages of the same component, or call code in pages that were ‘public’ for other components. My pseudo-assembly had relative addressing, as well as absolute addressing, in order to allow me to move components inside each process without too much fuss.
I believe this approach is viable, especially on 64-bit architectures: all programs share the same address space, and each program cannot touch another program’s private memory, only the public pages. It would make component co-operation much easier than it is today… and much faster ;-).
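As a sketch of the bookkeeping such a scheme implies (my own illustration with hypothetical names, not the poster’s actual design), each page would carry its owning component and a public flag, and an access would be allowed only when ownership or publicness permits it:

/* Hypothetical per-component page ownership inside a process: a page is
 * accessible to its owner, or to anyone if it is marked public. */
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t component_id;

struct page_info {
    component_id owner;      /* component the page belongs to */
    bool         is_public;  /* entry points / data visible to other components */
};

static bool component_may_access(const struct page_info *page, component_id caller)
{
    return page->is_public || page->owner == caller;
}

A page fault handler would consult something like this before deciding whether to map the page for the faulting component.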
Having a self-defeating attitude achieves nothing. Perseverance & tenacity will take you further than ‘being realistic’ ever will.
“Even to this day it’s often easy to beat a compiler’s output”
This is true. I do it every day.
Just in case you guys don’t believe me: I compiled and disassembled a small segment of code that happened to be on my screen, under GCC 4.5.2:
x^=(x<<13), x^=(x>>17), x^=(x<<5)
Which resulted in:
8B45F0 mov eax,[rbp-0x10]
C1E00D shl eax,0xd
3145F0 xor [rbp-0x10],eax
8B45F0 mov eax,[rbp-0x10]
C1E811 shr eax,0x11
3145F0 xor [rbp-0x10],eax
8B45F0 mov eax,[rbp-0x10]
C1E005 shl eax,0x5
3145F0 xor [rbp-0x10],eax
Ouch. 6 memory references. It runs at an average of 20 cycles on my AMD Phenom 8650. The obvious two-memory-reference replacement runs at an average of 8 cycles, more than twice as fast.
This is basic stuff that even a neophyte ASM programmer would not miss.
Edited 2011-01-31 02:35 UTC
And what were the compiler parameters for GCC then?
Replying to myself: I got the same code _without any compiler parameters at all_, i.e. you are comparing hand-optimized code against completely unoptimized i486-compatible code. The reason why you get such code is quite obvious…
Here is what I get, with and without -O3 in GCC 4.4.1.
(I hope this output doesn’t get clobbered)
Edit: they did get clobbered; I had to fix them manually.
They are both pretty bad; I am actually quite surprised at how poorly GCC handled it. But for the record, I never doubted your claims about being able to do better than the compiler. Does someone have ICC on hand to see its output?
GCC with -O3:
08048410 <func>:
8048410: push %ebp
8048411: mov %esp,%ebp
8048413: mov 0x8(%ebp),%edx
8048416: pop %ebp
8048417: mov %edx,%eax
8048419: shl $0xd,%eax
804841c: xor %edx,%eax
804841e: mov %eax,%edx
8048420: shr $0x11,%edx
8048423: xor %eax,%edx
8048425: mov %edx,%eax
8048427: shl $0x5,%eax
804842a: xor %edx,%eax
804842c: ret
Without optimization flags:
080483e4 <func>:
80483e4: push %ebp
80483e5: mov %esp,%ebp
80483e7: mov 0x8(%ebp),%eax
80483ea: shl $0xd,%eax
80483ed: xor %eax,0x8(%ebp)
80483f0: mov 0x8(%ebp),%eax
80483f3: shr $0x11,%eax
80483f6: xor %eax,0x8(%ebp)
80483f9: mov 0x8(%ebp),%eax
80483fc: shl $0x5,%eax
80483ff: xor %eax,0x8(%ebp)
8048402: mov 0x8(%ebp),%eax
8048405: pop %ebp
8048406: ret
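For anyone who wants to reproduce these listings: judging by the disassembly, the function being compiled was presumably something along these lines (one round of 32-bit xorshift):

/* One round of 32-bit xorshift; compile with and without -O3 to reproduce
 * the two listings above. */
#include <stdint.h>

uint32_t func(uint32_t x)
{
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    return x;
}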
Edited 2011-01-31 03:57 UTC
Innominandum,
I see your disassembly is in Intel x86 syntax; how did you generate that? All the GNU tools at my disposal generate AT&T syntax, which I find very annoying.
Check the objdump parameters, especially --disassembler-options (short form -M) with the value intel-mnemonic.
I remember wanting to write my own Operating System years ago, and bought a book called “Developing your own 32-bit Operating System”. It sounds sad, but I had never been so excited about a book and thought it was really good for step-by-step learning.
I then started to read a book on the x86 architecture and protected mode, but unfortunately only got as far as writing a boot loader (like so many) that switched the machine into protected mode, and then wrote my own code to output some text to the screen.
It took me many months to get to that stage, so much so that I hit a wall with it and gave up. Yet I had originally had ambitions to write my own scheduler, memory management, and file system code.
I wasn’t cut out for OS development, so really admire those who managed to write their own hobby OS – it takes a lot of your time and dedication.
“I remember wanting to write my own Operating System years ago, and bought a book called ‘Developing your own 32-bit Operating System’. It sounds sad, but I had never been so excited about a book and thought it was really good for step-by-step learning.”
It seems to be a phase that geeks go through at that age. Does anyone know if today’s youth has the same aspirations?
Doing it forces us to learn a great deal more than can be taught in any class. But as much as I loved being able to write my own bootloader/OS to study computer architectures in detail, it’s a shame that those skills are so unappreciated in today’s job market.
Speaking of which, is anyone hiring in Suffolk County NY? I’m woefully underemployed.
Well, I’m in my early 20s; I’m not sure if that qualifies me as a ‘youth’ (then again, everything is relative). I do have fond memories of my GCSE IT class and of being the only person to do a programming project for my coursework instead of a database.
When our teacher introduced the module on programming, he asked if any of us had used VB6. The class was a sea of blank faces; I answered no, but said I was okay at C++ and was trying to learn assembly and C. Our teacher (of the can’t-do-can’t-teach-either variety) seemed to take offence and asked if I would like to teach the class about variables, since I was obviously such an ‘expert’.
I still consider it a brave moment when I walked to the front of the class, copied a diagram I remembered from my beginner’s C++ book, and got everyone to understand the concepts of data types, variables and memory addresses; most could even get a basic calculator working by the end of the class. (The look on the old teacher’s face was priceless.)
Fuelled by this (undeserved) ego boost, I decided I would write my own OS for my coursework (bad move!). It never worked, but the theoretical knowledge I got from just trying was worth it, and my documentation was pretty good, so I still got a B for the module (maybe that says something about the difficulty of GCSEs).
What is very sad, though, is that I knew people who got A* results for the course and still didn’t really understand what a simple program, let alone an operating system, consisted of at the basic level. Not because they were stupid or didn’t care, but because IT, like maths, is simply not taught properly in schools these days. We had a week on programming and low-level stuff, and the rest of the year was spent learning how to mail merge in Office and make charts in Excel *sigh*.
“Well, I’m in my early 20s; I’m not sure if that qualifies me as a ‘youth’”
I was thinking younger but there’s no reason to be discriminating.
“What is very sad, though, is that I knew people who got A* results for the course and still didn’t really understand what a simple program, let alone an operating system, consisted of at the basic level.”
I found this to be often the case.
I had one particular professor for many upper-level CS electives who refused to accept “original” solutions to his class problems. He would only accept solutions which were near-verbatim copies of what had been done in class. This meant that people who merely memorized the material did much better than those of us who were able to derive solutions.
After getting a failing grade on an exam (the Operating Systems class, of all things), I confronted him about it: despite the fact that none of my answers were wrong, they weren’t what he was expecting. Obviously he didn’t care whether an answer was right, only that it matched his. He justified this by saying that he had been a professor for 20 years and wasn’t about to change for me. I told him genuine industry experts would be unable to pass his exams; he didn’t care.
If you mean that I came across as discriminatory, apologies; that wasn’t my intention. I really need to be less colloquial when I write online; things have a habit of getting lost in translation.
There does seem to be a big problem, in the UK at least, with the IT curriculum. Obviously no school can teach its students about every operating system and software package, but I think they could try to teach more theory and rely less on Microsoft for examples. After all, if there’s one place you shouldn’t be blinkered, it’s at school.
And maybe a bit more OS theory would spark the imagination of the next Bill Gates… or maybe not Bill: the next Linus?