A Guide to Writing Efficient C Programs

Eugenia Loli 2004-02-06 General Development 25 Comments

This guide focuses mainly on the key issues involved in writing professional code: code that is efficient in both space and time, and that can be ported to other systems with small changes.

About The Author

Eugenia Loli

Ex-programmer, ex-editor in chief at OSNews.com, now a visual artist/filmmaker.


25 Comments

2004-02-06 11:52 pm
Anonymous
Does anybody else get these horrible rendering glitches like list bullets cut off at the left border?
I’m using mozilla 1.6 under epiphany 1.0.7 (debian/unstable).
2004-02-07 12:10 am
Anonymous
I am also using Debian Sid with Mozilla 1.6 (I also browse with Epiphany 1.0.7); no glitches whatsoever.
2004-02-07 12:20 am
Anonymous
Looks fine in Safari 1.2 as well.
2004-02-07 1:13 am
Anonymous
Er, use size-specific types instead of generic ordinals? Wtf?
The guy demonstrates what would rapidly turn into a huge main() function and then later harps on about the size of functions!
When to use gotos?
Personally, I’ve been writing C/C++ for about ten years, and been doing it professionally for around six, and I’ve yet to write a single line of code which required the use of a goto; and there’s no reason why anybody else’s should either. If you think you need a goto, you need to refactor your functions.
Sure, there’s some good stuff in there – like letting the compiler optimize for you, and keeping functions below a page, but the examples suck and there’s some really bad points in there:
-32767..to 32767 isn’t 32 bits, for a start. It’s 17 bits, which is nonsense in pretty much every computer system.
“Use of the proper suffixes after constants” isn’t explained at all.
“Only integer or pointer type should be used as Boolean” – I know a pointer should be treated as “NULL or something else”, but that doesn’t make it a boolean type!
Conditional compilation belongs _solely_ in abstraction layers.
Comparing floats: the “if ((float_expr_1 - float_expr_2) < DELTA)” test needs an fabs() call if it’s to work; otherwise it also passes whenever float_expr_2 is larger than float_expr_1, so on average you’ll get the wrong answer 50% of the time.
“of declaration referencing in used be not should keyword”… isn’t even English.
Most UNIX systems specify a minimum of 14 unique characters in a filename, not 31.
The author would do well to read the Single UNIX Specification, as well as the ANSI C99 specification (which even most embedded systems conform to these days!)
In short.. nice idea, but really sucky way of putting it down.
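The float-comparison point above can be sketched as follows. This is only an illustration: DELTA and both function names are made up for the example, and a fixed absolute tolerance is an assumption that will not suit every range of values.

```c
#include <math.h>
#include <stdbool.h>

/* Illustrative tolerance only; choose one that suits your data. */
#define DELTA 1e-6

/* True when a and b differ by less than DELTA in either direction. */
bool nearly_equal(double a, double b)
{
    return fabs(a - b) < DELTA;
}

/* The test without fabs(): it is also true whenever a is far below b,
 * because (a - b) is then a large negative number. */
bool nearly_equal_broken(double a, double b)
{
    return (a - b) < DELTA;
}
```

With these definitions, nearly_equal(1.0, 2.0) is false, while nearly_equal_broken(1.0, 2.0) is true, which is the failure the comment describes.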
2004-02-07 1:36 am
Anonymous
Actually, every time I re-read this article, I find more things wrong with it (e.g., malloc is not a system call, and you do not return from a function to its caller using exit!).
Here are my top tips:
– Ignore non-ANSI compilers. K&R C is dead. Get over it.
– Steer clear of conditionals all over your code. Write a support library if you have to.
– Stick to natural types such as ‘int’ if you can help it. Keep specific-sized types confined to structures and over-the-wire/on-disk streams.
– If it doesn’t need to be exposed, don’t expose it. Make everything that doesn’t need to be public static. Use a naming convention which makes clear the difference between public and private functions.
– static globals are initialised to zero, so don’t explicitly initialise them (because if you do, they’ll take up space in the .data section which is part of your executable, instead of the .bss section which isn’t).
– Don’t try and second-guess the compiler.
– Stick to ANSI, and if you can’t stick to ANSI, stick to POSIX.
– For preprocessor statements, the # should always sit in column 1. Use spaces after the # to indent if you wish to.
– Order of evaluation is well-defined by the standards. Read them and learn them.
– String literals should always be treated as constants (that is, read only). Never ever assume they can be written to.
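A few of the tips above (static for private helpers, no explicit zero initialiser on statics, string literals treated as read-only) might look like this in practice. All names here are hypothetical and chosen only for the illustration.

```c
#include <stddef.h>

/* Zero-initialised by the language; no explicit "= 0" is written. */
static size_t calls_made;

/* Private helper: 'static' keeps the symbol out of the public namespace. */
static size_t count_char(const char *s, char c)
{
    size_t n = 0;
    for (; *s != '\0'; ++s) {
        if (*s == c)
            ++n;
    }
    return n;
}

/* Public function (hypothetical name): counts the 'l's in a fixed greeting. */
size_t greeting_count_l(void)
{
    /* Pointer to const: the literal must never be written through it. */
    const char *greeting = "hello, world";
    ++calls_made;
    return count_char(greeting, 'l');
}

/* Public accessor for the private counter. */
size_t greeting_calls(void)
{
    return calls_made;
}
```

Only greeting_count_l and greeting_calls are visible to other translation units; count_char and calls_made stay private to this file.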
2004-02-07 2:05 am
Anonymous
> Personally, I’ve been writing C/C++ for about ten years, and been doing it professionally for around six, and I’ve yet to write a single line of code which required the use of a goto; and there’s no reason why anybody else’s should either. If you think you need a goto, you need to refactor your functions.
I disagree. gotos are beautiful in the error-handling parts of a function, when you want an efficient fast path and don’t want to wade through levels of ifs.
And please don’t bring up C++’s exception handling.
2004-02-07 2:06 am
Anonymous
You use goto statements to simulate exception handling. This is required in many mixed-language projects where you’re making callbacks from C to a language with exception handling, or vice versa. Otherwise, you’re right, there isn’t much practical use for the language feature.
K&R is not dead; a lot of what they talked about is still rather fundamental.
The only hole in their C programming book is the problem of buffer overruns. Clearly they didn’t foresee any of that happening.
I don’t agree with the author’s first bullet. More lines of code does mean more chances for errors; however, obfuscated and kludgy code takes some people too much time to debug, especially when many, many eyes are looking over it, as in an open source project. Concise code, yes, but sometimes it’s better to be expressive with your code. The compiler takes care of the optimizations anyway, so don’t try to be a smart ass.
The other bullets seem all right. Who cares if he can’t tell the difference between a system call and a standard C library function?
2004-02-07 2:18 am
Anonymous
Ah, goto.. the enjoyment of spaghetti code
2004-02-07 2:20 am
Anonymous
> Personally, I’ve been writing C/C++ for about ten years, and been doing it professionally for around six, and I’ve yet to write a single line of code which required the use of a goto; and there’s no reason why anybody else’s should either. If you think you need a goto, you need to refactor your functions.
Goto is never required, but it can certainly make error handling code far easier to manage. I don’t use it often, but the rare time I do is usually a case where I’m reading in a data file, and progressively allocating more resources as I go. I put one copy of the error handling code at the end of the function, with labels for the different stages of deallocation. In the main part of the function, if I detect an error, I goto the label corresponding to the level of allocation I’ve reached. That way, I don’t have to repeatedly duplicate all the error handling code. It makes things much easier to change, as there is only one place the code needs to be modified.
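A minimal sketch of that staged-cleanup pattern, assuming a hypothetical function load_prefix that opens a file and reads part of it into a freshly allocated buffer. The function name, its parameters and the label names are illustrative only, not taken from the article.

```c
#include <stdio.h>
#include <stdlib.h>

/* Reads up to 'cap' bytes of 'path' into a newly allocated buffer.
 * On success stores the buffer in *out (the caller frees it) and returns
 * the number of bytes read; on any failure returns -1 and leaves *out alone. */
long load_prefix(const char *path, size_t cap, char **out)
{
    long result = -1;
    FILE *fp;
    char *buf;
    size_t got;

    fp = fopen(path, "rb");
    if (fp == NULL)
        goto done;              /* nothing acquired yet */

    buf = malloc(cap);
    if (buf == NULL)
        goto close_file;        /* only the file is held */

    got = fread(buf, 1, cap, fp);
    if (ferror(fp))
        goto release_buf;       /* file and buffer are both held */

    *out = buf;                 /* success: ownership passes to the caller */
    result = (long)got;
    goto close_file;            /* the file still has to be closed */

release_buf:
    free(buf);
close_file:
    fclose(fp);
done:
    return result;
}
```

Each label releases one more resource than the label below it, so an error at any stage jumps to exactly the cleanup it needs, and the cleanup code exists in one place.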
> -32767..to 32767 isn’t 32 bits, for a start. It’s 17 bits, which is nonsense in pretty much every computer system.
No, that’s 16 bits. The lower end is actually -32768 though.
> – Order of evaluation is well-defined by the standards. Read them and learn them.
Not when ++ and -- are concerned. In the example in the article, the evaluation is well defined. But if you try to do something such as using i and i++ within expressions on opposite sides of a ==, the result isn’t defined. Take something like this:
if (text[i++] == text[i - 1])
The compiler is free to evaluate the ++ any time after i is used in the first array dereference and before the parentheses are closed. That means the second i can have the original value of i, or the original value of i plus 1. The compiler is not required to be consistent in when it decides to update the value of i. In other words, if the next line of code was the same except for text[i] on the right side of the ==, it would be perfectly valid for both if’s to fail.
A side note… the designers of Java realized that problem in C and clearly defined the behavior so that the ++ or -- is done immediately after the value is used.
The interesting things you learn in a compilers class…
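One way to sidestep the problem is to give the increment its own statement, so the reads of i and the update of i can never fall in the same expression. The sketch below assumes the intent was to compare each character with the one before it; count_adjacent_repeats is a hypothetical name.

```c
#include <stddef.h>

/* Counts positions where a character equals the one just before it. */
size_t count_adjacent_repeats(const char *text)
{
    size_t count = 0;
    size_t i = 1;

    if (text[0] == '\0')
        return 0;               /* empty string: nothing to compare */

    while (text[i] != '\0') {
        char current = text[i];
        char previous = text[i - 1];
        ++i;                    /* side effect isolated in its own statement */
        if (current == previous)
            ++count;
    }
    return count;
}
```

Because i is modified only in the ++i statement, the result no longer depends on when the compiler chooses to apply the increment.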
2004-02-07 2:58 am
Anonymous
> Does anybody else get these horrible rendering glitches like
> list bullets cut off at the left border?
Looks fine under Epiphany 1.0.6 (Mozilla 1.5) – FreeBSD 5.2. Are you using non-default fonts?
2004-02-07 4:35 am
Anonymous
Exception handling is not the only reason to use a goto. It’s also used in hotspots where you need to get out of a deeply nested loop.
People treat “Go To Statement Considered Harmful” as gospel. You have to take it in context. When Dijkstra wrote that in 1968, many people were writing spaghetti code and overusing goto. That doesn’t mean you should never use goto. Knuth thinks there are uses for goto.
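The nested-loop case mentioned above might look like the following sketch. find_cell and its parameters are hypothetical; in a function this small a plain return would do just as well, so treat it as an illustration of the jump itself rather than a recommendation.

```c
#include <stdbool.h>
#include <stddef.h>

/* Searches a rows x cols grid (stored row by row) for 'target'.
 * Returns true and fills *r and *c with the first match, else false. */
bool find_cell(const int *grid, size_t rows, size_t cols,
               int target, size_t *r, size_t *c)
{
    bool found = false;
    size_t i, j;

    for (i = 0; i < rows; ++i) {
        for (j = 0; j < cols; ++j) {
            if (grid[i * cols + j] == target) {
                *r = i;
                *c = j;
                found = true;
                goto done;      /* leaves both loops in one step */
            }
        }
    }
done:
    return found;
}
```

Without the goto, the inner break would need a flag that the outer loop tests on every iteration.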
2004-02-07 4:52 am
Anonymous
I’m glad to see a lot of pro-goto sentiment here. But to answer the anti-goto sentiment…
> Personally, I’ve been writing C/C++ for about ten years, and been doing it professionally for around six, and I’ve yet to write a single line of code which required the use of a goto; and there’s no reason why anybody else’s should either. If you think you need a goto, you need to refactor your functions.
Let me quote Richard Stevens here…
“Read Structured Programming with go to Statements by Knuth in the ACM Computing Surveys, Vol. 6, No. 4, Dec. 1974 issue. (In fact, this entire issue of Computing Surveys is a classic.) My challenge to the goto-less programmer is to recode tcp_input() (Chapters 27 and 28 of TCP/IPIv2) without any gotos … without any loss of efficiency (there has to be a catch).”
I’ve run into a few cases where I could either use goto, or add an additional branch and an additional block of code. Personally I think the former (with goto) is a cleaner implementation.
I often use goto when I initialize a resource within a function and need to return it to the system in the event of an error, such as opening a socket/file descriptor. goto is especially useful if you perform a number of such initializations within a function and need to deconstruct a complex state object which references multiple resources in the event of an error, as you can simply generate a “stack” of gotos which deconstruct the partially initialized object.
So, as if there’s not enough pro-goto sentiment here already, that’s my take…
2004-02-07 4:55 am
Anonymous
goto really only belongs in assembly, and in most cases a Call works better since it returns to where it was once the call is complete.
Spaghetti code is only found in blocked memory. Flat memory rules…….. so much.
2004-02-07 5:49 am
Anonymous
> goto really only belongs in assembly, and in most cases a Call works better since it returns to where it was once the call is complete.
How do you propose doing an if/else type of operation without using gotos? Or how about loops?
When writing assembly code, you use a lot more gotos than calls.
2004-02-07 6:09 am
Anonymous
Gotos are appropriate for very specific things, like kernel code. You should read Torvalds’s comments on goto and why he uses them in the kernel and why that’s a very good thing(tm).
2004-02-07 7:26 am
Anonymous
You’ll also find people (usually professor-types) who say there should be only one exit point from a routine for the sake of clarity. If you follow that rule to an extreme you usually end up with a bunch of flags and a bunch of ugly nested ifs. The same thing goes for breaks. The whole point is clarity. If using a goto, or a break, or multiple returns is going to make the code clearer then by all means do it.
2004-02-07 9:33 am
Anonymous
An interesting use of goto in older Fortran (66/77) is the computed goto. It’s written as:
goto ( 100, 200, 300, 400 ) foo
Each of the numbers is a statement label to which the goto jumps based upon the value of foo. So, if foo==1, we jump to statement 100. If foo==4, we jump to 400.
In context, this wasn’t a bad thing, as the “if…then…else” structure did not make it into Fortran until 77.
I actually work with code that was written like this. Goto has its place, but in this code there are occasions where I have 5 or 6 loops that are obscured by the use of a goto statement at the end of the loop, with no indication of such logic at the loop’s “beginning”. Very frustrating, and in context, Dijkstra was right about the possibilities of spaghetti code.
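For readers who know C better than Fortran, the nearest equivalent of the computed goto above is a switch. The sketch below uses a hypothetical function, dispatch, that simply returns the label number the Fortran statement would have jumped to, with 0 standing in for "no jump".

```c
/* Mirrors: goto ( 100, 200, 300, 400 ) foo
 * Returns the statement label selected by foo, or 0 when foo is out of
 * range (in which case Fortran just carries on with the next statement). */
int dispatch(int foo)
{
    switch (foo) {
    case 1:  return 100;
    case 2:  return 200;
    case 3:  return 300;
    case 4:  return 400;
    default: return 0;
    }
}
```

Unlike the Fortran form, the switch keeps each target next to its selector value, which avoids the "hidden loop" problem described above.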
2004-02-07 12:10 pm
Anonymous
Reminds me of BBC BASIC
10 PRINT "CHOCOLATE CHEESE CAKE IS NICE!"
20 GOTO 10
In Amos you could do:
[MENU]
PRINT "HELLO WELCOME TO THE MENU"
PRINT "1 FOOBAH"
INPUT A$
IF A$ = "1" THEN GOTO FOOBAH
IF A$ <> "1" THEN GOTO MENU
[FOOBAH]
PRINT "HELLO AND WELCOME TO FOOBAH"
2004-02-07 1:54 pm
Anonymous
RE: A Guide to Writing Efficient C Programs
By Mo (IP: —.cmlx.co.uk) – Posted on 2004-02-07 01:36:00
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Why don’t you write an article outlining the very same concepts, but done the “Right” way?
2004-02-07 2:07 pm
Anonymous
Goto’s can be used in real time applications and in rare cases for optimal efficiency.
2004-02-07 9:39 pm
Anonymous
> – Stick to natural types such as ‘int’ if you can help it. Keep specific-sized types confined to structures and over-the-wire/on-disk streams.
I disagree with your statement here. The size of “int” depends on the architecture of the CPU: it is typically 16 bits on a 16-bit CPU, 32 bits on a 32-bit CPU, and so on.
I much prefer never to use natural types, and instead to typedef precise aliases like:
U8, S8, U16, S16, U32, S32.
I use a plain int only for (known) small loop counters, just for optimization’s sake.
2004-02-07 11:19 pm
Anonymous
That also reminds me, BBC BASIC also had
ON A GOTO 100,200,300
and
ON A GOSUB 100,200,300
Maybe we should have a GOTO survey 😉
2004-02-08 12:19 am
Anonymous
I have been programming for a long time too, and have come across some really beautiful code which uses goto, but wisely!
I personally prefer not to use goto: it represents a different kind of control flow from all the other forms, and so to me it feels like an inconsistency.
But I still remember that learning to use goto and computed goto in BASIC and Fortran helped me appreciate assembly language very well.
Once I learnt C, and later C++, I almost forgot Fortran, but now I have to read and understand lots of numerical libraries written in Fortran.
It turns out Fortran was really a good language.
I do appreciate the presence of goto in any language, but its use should be restricted to really extreme cases.
=========
The article was good except for a few inconsistencies here and there, like saying that malloc is a system call.
One issue that might benefit from more attention is padding by compilers.
Compilers usually provide speed optimisations, so it is up to us to make sure that space is used properly.
But then, I wouldn’t bank on a compiler providing or not providing padding: an end user might compile my code with any flags!
I’d rather write code that does not depend on any knowledge of what the compiler is going to do.
More examples of undefined behaviour would also have been great.
On the whole, a good article.
cheers
ram
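The padding point can be explored without hard-coding any layout by asking the compiler with sizeof and offsetof. In the sketch below both struct definitions and the function names are hypothetical; the actual numbers depend on the target ABI and compiler flags, which is exactly why code should query them rather than assume them.

```c
#include <stddef.h>

/* Small members on either side of a large one: padding is likely. */
struct loose {
    char tag;
    long value;
    char flag;
};

/* Same members, largest first: usually less padding is needed. */
struct tight {
    long value;
    char tag;
    char flag;
};

size_t loose_size(void)         { return sizeof(struct loose); }
size_t tight_size(void)         { return sizeof(struct tight); }
size_t loose_value_offset(void) { return offsetof(struct loose, value); }
```

On common ABIs tight_size() comes out smaller than loose_size(), but nothing in the language guarantees a particular figure, so serialisation code should copy members one by one instead of writing the whole struct.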
2004-02-08 12:28 am
Anonymous
> A developer must not reinvent the wheel. It is better to use ctype macros wherever possible instead of writing code. This is true because ctype macros are standardized and remain unchanged on almost all types of libraries.
I disagree. I read the article to that point. I have written code for a long time, and I find that I take extreme measures to optimize C code before I have to write it in assembly. The knowledge of assembly is the number one tool to optimize C code. C started as a glorified macro compiler of assembly. C++ continued to be even more glorified with class types and auto-inline functions. Yet, C remains the closest tool to use to get a higher-level language to produce expected assembly code. The use of standardized macros defeats the purpose of expected assembly code.
I looked at pieces of the Linux kernel. I started to write code well before Torvalds started to write Linux. I’m amazed at the use of the macros. Certainly, the macros make the process of coding easier, but it is best to write the code out fully for optimization. If I see code that is written to work well with a certain macro, then I see code written by an author who forgot about the importance of how the compiler rewrites it. I saw some code in the kernel that I know how to improve.
Macros are always great for generating repetitious routines. I try to localize the declaration to make it easier to read:
#define _(_1) \
{ \
    extern void module_##_1(void) ; \
    module_##_1() ; \
}
_( stepone ) ;
_( steptwo ) ;
_( andetc ) ;
Then to update such code I only need to change one area.
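A self-contained variant of the same idea is sketched below. RUN_MODULE, the two module functions and the counter are hypothetical stand-ins; the do { } while (0) wrapper is a substitution for the bare braces above so that the macro behaves like a single statement after an if.

```c
/* Counter so the effect of each module call can be observed. */
static int steps_run;

/* Stand-ins for functions that would normally live in other files. */
void module_stepone(void) { ++steps_run; }
void module_steptwo(void) { ++steps_run; }

/* Declares and calls module_<name>; ## pastes the argument onto the prefix. */
#define RUN_MODULE(name)                 \
    do {                                 \
        extern void module_##name(void); \
        module_##name();                 \
    } while (0)

/* Runs every module in order and returns how many have run so far. */
int run_all(void)
{
    RUN_MODULE(stepone);
    RUN_MODULE(steptwo);
    return steps_run;
}
```

Adding a module then means adding one RUN_MODULE line, and changing how modules are declared or called means editing only the macro.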
Keyword “goto” not required? Hmmm… I got business to do.
2004-02-09 3:48 am
Anonymous
I’m surprised nobody has said it yet. Gotos are present in most programs in edulcorated forms: break, continue, return, throw, exit… are gotos, in the sense that was criticized by Dijkstra: different ways of exiting a block.
There are cases in which these constructs are the very best solution, especially for clarity. For instance, it’s much better to write “if somecondition then return” at the beginning of a long function than to wrap the whole function inside an “if opposite then …” conditional.
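The early-return style described above might look like this in C. sum_values is a hypothetical function used only to show the shape of a guard clause.

```c
#include <stddef.h>

/* Sums n integers; returns 0 when there is nothing valid to sum. */
long sum_values(const int *values, size_t n)
{
    long total = 0;
    size_t i;

    if (values == NULL || n == 0)
        return 0;               /* guard clause: leave before the real work */

    for (i = 0; i < n; ++i)
        total += values[i];

    return total;
}
```

The alternative is to wrap the loop in an if (values != NULL && n != 0) block, which pushes the main path one level deeper for no gain in clarity.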