Given the nature of open source software, many Linux applications are distributed in a “tarball” containing source code that you must build before you can run the application. Larger applications can take several hours to build. This article shows how you can use the distributed C compiler, distcc, to speed up the compilation of these sources so you can start using them sooner.
In January 2004, UnixReview published an article on this topic:
http://www.unixreview.com/documents/s=1350/ur0401i/
There are quite a few people who swear by distcc for Gentoo builds (I didn’t, because of my l33t one-box setup). There is even a guide:
http://www.gentoo.org/doc/en/distcc.xml
If C didn’t have header files, C compilers would be far, far faster than they are today. A large portion of their time is spent reparsing include files that are included multiple times. A C compiler’s view of the world is a massive dance of text manipulation and streaming ASCII characters. How sad.
Real programming languages long ago did away with textual substitution. It’s such a shame that C lives on.
“If C didn’t have header files, C compilers would be far, far faster than they are today. A large portion of their time is spent reparsing include files that are included multiple times.”
Nearly every project I’ve ever worked in or had a look at uses #ifndef guards to ensure each header is included only once. The problem you describe is quite rare in C (though it still pops up in Obj-C, unfortunately, due to Apple’s support of #import).
Nearly every project I’ve ever worked in or had a look at uses #ifndef guards…
But the preprocessor still has to scan, scan, scan until it finds #endif. I think his point was that, even though you would only include the header files once, you still have to re-parse them for every compile, instead of just linking against a binary image.
Most modern C compilers have an option to precompile the headers. This avoids all that and greatly speeds compile time. The downside is that if you change the headers, you have to precompile them again. Because of that, most folks precompile the system headers separately from the project headers. You should look into it.
Yes, since each source file is compiled as a separate unit, you have to reparse the headers for every source file (ignoring multiple inclusions for a single file). This is a problem, but it also allows you to separate interface from implementation. Something which is woefully lacking in other (newer) languages like Java, C#, etc.
“Most modern C compilers have an option to precompile the headers. This avoids all that and greatly speeds compile time. The downside is that if you change the headers, you have to precompile them again. Because of that, most folks precompile the system headers separately from the project headers. You should look into it.”
Hi, I’d like to point out that in nearly all compilers that support precompiled headers, they are not turned on by default. The fact remains that the behavior of textual substitution must be emulated in a precompiled-header system, leading to lots of redundant work in the compilation process (i.e. even if the text is not necessarily reparsed, the ASTs must be revisited for each file).
When you have 2 gigabytes of source code to compile (e.g. the entire Solaris code base) that references a few hundred header files, it hurts, big time. Couple that with a make system that compiles each C source file separately, and bye-bye, see you tomorrow morning.
In Java, I can rebuild even the largest of my projects (> 200,000 lines of code) in well under a minute–within a graphical IDE, itself written in Java!
First, go read John Lakos’ book “Large-Scale C++ Software Design”:
http://www.amazon.com/exec/obidos/tg/detail/-/0201633620/qid=108802…
In a nutshell, place include guards around your file’s #include statements. This is easier to do if you use a convention for the include guard name.
Example:
foo.h
#ifndef HDR_FOO_H
#define HDR_FOO_H
…
#endif
in your file:
#ifndef HDR_FOO_H
#include "foo.h"
#endif
This form of guarding should also be used inside of foo.h for the files that it pulls in.
To avoid tedious typing, this approach works best if your source adopts a convention where foo.h itself includes all of the other .h files that it requires to compile.
Two other practices that help:
#1 Create two .h files for your modules: the public form, “foo.h”, and the internal form, “foo_int.h”. You will need many other pieces to implement your foo module, but users of foo shouldn’t have to include the .h files of everything that foo.c made use of.
#2 In your foo.h file, make use of the fact that a pointer to a type doesn’t require that type’s full definition. Use “struct OtherStruct;” or “class OtherClass;” to declare them, and then your prototypes/methods can use those types without having to actually include their full definitions.
Now, if someone knows a good technique to make the link process faster, I’m all ears. On Windows it’s such crap: touch a source file and you must rebuild your .lib and .dll, and then you are required to relink every single one of your other .dll (and .lib) files.
“This is a problem, but it also allows you to separate interface from implementation. Something which is woefully lacking in other (newer) languages like Java, C#, etc.”
Is it? Last time I checked, you had proper interfaces, abstract classes, and classes that implement them. Just look at how all of J2EE (e.g., servlets) is done: Sun provides interfaces and products implement them.
J
Brian, this form of ugly obfuscation is totally unnecessary. The guy who wrote the book must have used second-rate compilers. It won’t help your GCC compilations at all.
…I really love distcc; it’s simply cool. But at home I have an old quad SS20, a pair of FreeBSD and Linux PCs, an iBook, and my old SGI Octane… it’s hard as hell to build cross compilers for all that arch/OS mixture.
Are there plans for the gcc team to better support cross compiling? If there are, then distcc will be cross too. It would be the best distributed cross-compiling tool… (or at least the only one)
BSDero
The restriction in Java that prevents multiple inheritance often means that code reuse must come via cut and paste.
But the interface/implementation separation I’m talking about is for developers. Doing a large project in Java is like poking a hot stick in your eye. Granted, doing a large project in C++ is like poking a hot stick in your ear…so it’s not much better, but it’s better.