Manipulating C++ Graph Data Structures with the Boost Graph Library

Eugenia Loli 2006-12-03 General Development 31 Comments

Quickly becoming a de facto standard C++ library, the Boost library includes a powerful graph data structure that’s also easy to use. Jeff Cogswell discusses some interesting theory behind graphs, and explains the Boost library’s graph structures.

About The Author

Eugenia Loli

Ex-programmer, ex-editor in chief at OSNews.com, now a visual artist/filmmaker.

Follow me on Twitter @EugeniaLoli

31 Comments

2006-12-04 12:26 am
Manuma
Informit articles are useless, why keep them linking them here?
2006-12-04 1:31 am
phgt
I used the BGL on a project recently and it was a nightmare. It is very hard to use for any problem that goes beyond the “six degrees of Kavin Becon”. Specifically defining the graph with the properties that matter for your problem is difficult: you usually end up with a dozen pages of compiler errors that show templates within templates within templates… Impossible to to see what is wrong. The documentation doesn’t help. The only solution I found was to start from one of the trivial examples that come with BGL and try to adapt it gradually until it does what I want.
For me BGL is the perfect example of overuse of templates.

2006-12-04 1:59 am
smitty_one_each
This may be of the no-help help variety, but one might consider trying the python bindings
http://www.osl.iu.edu/~dgregor/bgl-python/
Keep a python object list, and don’t use BGL for more than the essential management of a graph of object ids.
2006-12-04 4:02 am
asabjorn
I have also been frustrated when using the Boost Graph Library (BGL) as both the online documenation and the BGL book is very bad. Like you I find it very difficult to explore these data structures in order to construct the graph I want and manipulate it according to my needs. I also know many researchers that decided to not use BGL because of this and that is sad since it is a lost opportunity for research collaboration.
When it comes down to the cryptic error messages from erroneous template instantiations I believe that this is a problem which should be fixed by compilers or tools debugging template use. Some C++ frontends, like EDG, supports the creation of ASTs for erroneous codes and it would be interesting to see if analysing this AST could be a step towards sensible error messages.
2006-12-04 12:15 pm
CrazyDude0
I couldn’t agree more. I usually follow this rule:
1. <T> – Stop, Think and Go
2. <T> in <T> – Go with Exterme Caution
3. <T> in <T> in <T> – Avoid like poison
4. <T> in <T> in <T> in <T> – motherf–king C++ whores…kill them assholes…
All this and some more C++ crap has completely turned me off of C++. It is becoming the next Perl with so much syntactical mess that it is really easy to write code which is a nightmare for others to maintain.
Before people ask what other syntactical crap, here are two things that i can think of right away:
1. Operator overloading – Wow now you don’t know what exactly a + b does
2. multiple inheritance
Edited 2006-12-04 12:19

2006-12-04 3:34 pm
evangs
Operator overloading – Wow now you don’t know what exactly a + b does
int rocknrollpewpew(int a, int b) {
return a * b;
}
int c = rocknrollpewpew(a,b);
Oh dear, the meaning of functions isn’t immediately obvious. Functions are a load of syntactic crap, I suggest we remove them .

2006-12-04 6:49 pm
CrazyDude0
When you are debugging at least you know they are a function call.
How the hell do you know if + is a function call unless ofcourse you plan to make all operators overloaded?

2006-12-04 7:45 pm
h times nue equals e
Correct me if I’m wrong, but :
Standard C++ regulates, that the compiler can
only generate the copy constructor, the default constructor and the assignment operator, if they are not implemented. In any other case, including the operator+(), one has to implement the operator explicitly, like you would have to with functions.
If you have something a la
foo a;
bar b;
foobar c( a + b );
in your code, you have either to look for
foobar foo::operator+( bar const& rhs );
or
foobar operator+( foo const& lhs, bar const& rhs);
in your header files. Besides the inconvenience to look potentially for this two different notations, instead of one in case of a hypothetical plus() function, I can see little problems here.
Assuming one uses operator overloading only where it is justified (e.g. to translate arithmetic or gramatical operations like concatination into code), it is imho supperior to functions. I write numerical code for scientific applications, and at least I find
foobar d( a * b + c );
easier to read/maintain/understand than for example
foobar d( plus( prod( a , b ) , c );
Programmers, who abuse operator overloading should be treated with the proverbial clue bat, agreed. But operator overloading can, just like expression template metaprogramming, give you a very handy tool, that saves time and effort und is a joy to use.
Like every tool, it can be misused, though. but just because some developers choosed to obfuscuate the meaning of their programs by using operators, I don’t see a reason to bash operator overloading.
Regards
EDIT: foobar c() -> foobar d() in the second and third example, sorry 🙂
Edited 2006-12-04 19:55
2006-12-04 7:49 pm
Chicken Blood
Agreed. Any programming construct can be abused.
Operator overloading is one that can add clarity and elegance when used in the right context.
and no-one _forces_ you to use it when that context is not apparent.
2006-12-04 8:19 pm
rayiner
Sure. However, abusing C++ language features is becoming ingrained practice in the C++ community. Books are being written about how to abuse C++ in particularly heinous ways, and C++ 0x is codifying these acts of violence into the standard.
2006-12-04 8:41 pm
Chicken Blood
@Rayiner
I take your point and somewhat agree. I was just making the point that this ‘OPERATOR OVERLOADING IS EVIL PERIOD.’ stuff that people sometimes spout is disingenuous.
Edited 2006-12-04 20:42
2006-12-04 8:24 pm
CrazyDude0
First i didn’t mean to write debugger, i meant when i am reading the code.
Operator overloading makes it harder for other people to understand your code.
The example you gave above could simply be:
x = calcMultiply(a, b);
x = CalcSum(x, c);
d(x)
What is wrong here, it is more clear and when you are reading the code, you immediately know that multiply is not simply an arithmatic multiplication.
I like to write code which the programmer can understand when they are reading it. Ambiguity is something that i hate and C++ has many constructs that are ambiguous.
2006-12-04 8:51 pm
h times nue equals e
Well beauty is in the eye of the beholder. If I code a formula, I find it easier, if the formula in the code actually looks like the formula on the paper. Since the applications I write are mainly numerical, my background may have something to do with this, though.
But, as I have stated in my previous comment, operator overloading is esp. suitable for such situations, where operation is connected very closely to the operator in everyday life, not for every problem.
>What is wrong here, it is more clear and when you are
>reading the code, you immediately know that multiply
>is not simply an arithmatic multiplication.
Well, I assumend that it should become clear from the context of my post, that I intended to describe a situation where “+” is indeed an arithmetic operation or a grammatic operation, that is somehow logically connected to the operator. My FORTRAN programming colleages have little problems to understand
std::string a;
std::string b;
std::string c( a + b);
, but not everybody immediatly grasps, what is meant by
std::string c( returnConnectedString( a , b ) );
It boils down to find a meaningful name (and not all programmers speak English, or the language you choose to code in). If a symbol is easier to understand than a name, then I’m happy if a language allows me to use it.
Furthermore, I see no immediate problems with your code, but please consider, that there are situations, where your solution is probably not optimal, just one example:
If your code is performance critical, eliminating temporaries (x in your code) can increase the performance substantial. Expression TEmplates in combination with operator overloading allow to write code, that reads (for me) very simple and natural and completly avoids temporaries.
Regards
2006-12-04 9:45 pm
CrazyDude0
I disagree with the performance point you made, the temporary are always generated even if you do operator overloading using
a * b + c
x = foo()
foofoo(x)
will have exactly same performance as
foofoo(foo())
in an optimized build.
I agree to some extent that if the operator choice is done carefully, it ends up making more sense but this also has to do with documentation.
STL has some operator overloaded but it is done so carefully that they are never obtrusive or have hidden meaning.
STL avoid overloading operator when possible. As an example, many String templates namely MFC CString has an operator (LPCSTR) which convers the CString object automatically to C style string i.e. char*. BTW did you know you could overload typecasts using operator overloading?
But STL did not chose it and they instead have a function .c_str() and after using it, i think it was an intelligent choice.
If you look at the whole STL, it has very little operator overloading however where ever it uses overloading, it just makes sense.
Sadly good use of operator overloading is quite less in my experience as compared to its misuse hence my default stand is to never use it.
Ofcourse you will find exceptions like STL but because C++ gives such facilities which gets easily abused, i compared it to Perl and i rather stick to C because it does the job you want it to and it does what you read it does not what it is doing behind the ambiguous syntax.
Instead of overloading, c++ could have defined new syntax which would make sense and be clear to reader so they could use
a @* b @+ c or something similar and i wouldn’t have complained because @* in this case will always be a function and is explicit but will still maintain the feel of arithmatic operation that you want.
2006-12-04 10:01 pm
h times nue equals e
Yes, Operator overloading alone does not eliminate the problem of generating temporaries. Expression templates + operator overloading together can do the trick, see[1].
I have not encountered such crude abdominations like you, so perhaps I have indeed been lucky 🙂 (or it is really related to my profession, don’t know). OTOH, I have seen criminal abuses of the OO language features in Java (esp. from my fellow FORTRAN 77 colleages), pointer based suicide missions in C and don’t get me started on some Visual Basic “programms” I had the joy to correct for some of my peers *shuder*.
I hope I made my point clear, that I’m no strict, fundamental “every class needs all possible operators overloaded” advocate, and that abuses of operator overloading can lead to terrible code. Still, I fear that uneducated design descisions are invited by many
high-level languages, and that using tools outside of their domain of application usually is a shortcut into the desaster.
Regards
[1]http://osl.iu.edu/~tveldhui/papers/Expression-Templates/exprtmpl.ht…
EDIT: Corrected on sentence, fixed a typo
Edited 2006-12-04 22:05
2006-12-04 10:30 pm
CrazyDude0
Yeah i guess it all depends upon each of our experience that make us form our opinion.
In my experience i have seen such a horrible use of templates inside templates inside templates that trying to visualize those your data structres in your mind just drives you crazy.
Even though they work, they become hard for a not so C++ expert programmer. Like i mostly work on kernel mode components and they are written in C so even though i had good enough knowledge of C++, the nested templates became a nightmare to understand and maintain.
On top of that the programmer used auto_ptr which tries to implement something which doesn’t always work and when it fails it is a nightmare to debug.
The second worst use of C++ i saw was operator overloading and then multiple inheritance.
In my experience, using C++ with no templates and no operator overloading provides code which is easier to understand for not so C++ experts but still provide many benefits of OOP. Also limiting multiple inheritance or avoiding it helps write more maintainable code too.
2006-12-04 11:03 pm
h times nue equals e
>In my experience i have seen such a horrible use of
>templates inside templates inside templates that
>trying to visualize those your data structres in your
>mind just drives you crazy.
>Even though they work, they become hard for a not so
>C++ expert programmer. Like i mostly work on kernel
>mode components and they are written in C so even
>though i had good enough knowledge of C++, the nested
>templates became a nightmare to understand and
>maintain.
>On top of that the programmer used auto_ptr which
>tries to implement something which doesn’t always work
>and when it fails it is a nightmare to debug.
Ok, I can see who “helped” you arriving at your opinion, and within this context your position is more understandable. I sometimes *need* convoluted expression template trees, because the programms I work with take our hardware to the limit (currently up to weeks runtime on 16-24 nodes of our High-Performance cluster) and if there is a valid option to increase the run-time performance, we usually seize it.
But:
A good (C++) programm/lib should imho include a as-complete-as-possible documentation, including reasons for design decisions taken (e.g. why use an auto_ptr here and not a smart pointer? why using oo instead of a function, etc.) and insights how to use it and (most important) also when and when not to use it.
Which brings me fullcircle back to the starting point ouf this discussion, namely BOOST’s suboptimal documentation.
Nice discussion btw., although it went a bit off-topic.
Regards
EDIT: Corrected some missing words, too tired to write 🙂
Edited 2006-12-04 23:05
2006-12-04 7:46 pm
Chicken Blood
The debugger should show the function ‘operator+()’ on the stack.
So what is the problem?

2006-12-04 2:48 am
rajj
Graphviz is much easier to use in my experience. It has language bindings for C++, Python, Perl and others. It probably doesn’t do exactly the same thing as BGL, but it probably does do what most people want (draw graphs from a declarative specification).
Edit: Guess I should have checked out BGL first. It’s not for drawing graphs. It does, on the other hand, take graphviz definitions as input :-P.
Edited 2006-12-04 02:55

2006-12-04 4:09 am
asabjorn
The aim of BGL is to be a graph manipulation tool so they define graph operations which can be used in performing analysis. Graphviz is a set of simple graph visualization tools and stores these graphs in the dot format, and although the scope of graphviz is limited it is excellent at what it does. BGL supports the dot format, but also supports more complex graphs plus an extensice set of graph operations so Graphviz can not be compared to BGL.

2006-12-04 5:40 am
rayiner
Templates are a horrible way to implement something like this, though with C++, you don’t get anything better. Look at all the nested template crap on page 5 just to define more than one vertex property. Look at the macro I hacked up (and didn’t test!) at http://www.prism.gatech.edu/~gtg990h/graphs.lisp
That code is the basic implementation of a graph-defining macro, letting you define properties for edges and nodes, and creating the appropriate graph, edge, and vertex classes. Using it is ridiculously simple (and very readable, even if you don’t know Lisp):
Here is how you define a graph called “flowgraph” with two vertex properties, a and b.
(defgraph flowgraph ((vtxattr a) (vtxattr b)))
And the best part is, the macro language isn’t some hacked-up template mechanism, but the same language you use for everything else. That means you can use loops or whatever you want in your macro (no template nesting, which C++ template libraries use because they have nothing better). The error messages are readable, because the macro isn’t “special” template code, just regular Lisp code which can use the regular error messages. And since the macro is regular code, it can even check for common errors, just like regular functions. A user who makes a typo like ‘vxtattr a’ can get a regular error message “invalid property type vxtattr”, instead of pages of template expansions. Also, a quick macroexpand-1 is all you need to figure out what code got generated by a given call.
For example, the above example expands to:
(PROGN
(DEFCLASS FLOWGRAPH (GRAPH) NIL)
(DEFCLASS FLOWGRAPHVERTEX (EDGE) ((A) (B)))
(DEFCLASS FLOWGRAPHEDGE (EDGE) NIL))
Which just defines three classes with the appropriate instance variables.
Edited 2006-12-04 05:43

2006-12-04 8:25 am
agentj
boost is good library, except it’s documentation. Boost documentation sucks ass ! I wish Microsoft had made documentation for boost.

2006-12-04 8:40 am
rayiner
Boost is a good library, in general. However, Boost.Lambda, Boost.PPP, and BGL are real abuses of the template mechanism. Good code doesn’t make the language do things it doesn’t want to, and C++’s template mechanism is a way of specifying generic data structures and algorithms, not a general macro language.
2006-12-04 5:10 pm
agentj
As usual some f–k-ups are modding normal comments down, like I care

2006-12-04 5:37 pm
Chicken Blood
Yeah, don’t you dare suggest that Microsoft can do something well 🙂

2006-12-04 9:19 am
Cloudy
Actually, Eugenia, the article didn’t discuss any graph theory. It outlined, badly, a few of the well known problems in graph theory without describing the theory involved at all.
Oh, and it didn’t describe any data structures.
Or algorithms.
10 “pages” and not a single mention of digraphs.
It was, in fact, content free.
Thanks for pointing out yet another author whose books aren’t worth buying.

2006-12-04 11:33 am
agentj
Yeah, it’s neither tutorial for programmers (explain classes, methods and how all this stuff connects together, examples included) nor article for non-programmers. It mostly says what CAN be done with graphs – nothing more. Programmers will usually already know what graph is and how to work with them – they’d just need good API documentation.
Edited 2006-12-04 11:35

2006-12-04 4:33 pm
lkanies
Shawn Garbett has made a Ruby version of this library:
http://gratr.rubyforge.org/
This was initially a fork of simple Ruby bindings, but he has since converted the library to be entirely Ruby. I’m using it in Puppet (http://reductivelabs.com/projects/puppet), and it’s very nice.
2006-12-04 10:44 pm
pauls101
Boost is one of those great ideas that fails by missing the point utterly, the point being that code, no matter how elegant, is worthless if it can’t be read, maintained, extended, or debugged. Most boost code I’ve seen is 0 for 4; the docs I’ve seen weren’t all that bad, though.
Maybe it makes sense: they want to become part of the standard library, which is controlled by the standards committee, which won’t approve anything unless it can be shown to use at least as many templates as the most ridiculous example that currently exists (apologies to Aldous Huxley.)
I keep trying to find a use for boost, I really do… then I spend a few hours trying to build something, then it doesn’t work and all I have is page after page of useless template errors. Then I look on USENET and find horror stories about 2 hour compiles and thank my lucky stars I quit when I did.
I’ve not used the BGL, but based on others’ comments I suspect that once again I can write my own version faster than I can get theirs to work, and have something that fits my needs or can be fixed when it doesn’t.
2006-12-05 8:20 am
acobar
Anyway what is the point to show code that can define graphs?
Clearly this is not the main concern of BGL. If you want just to define a simple graph you can use even a matrix to hold the properties and I doubt anything can be more simple (that is what a lot of researches do on fortran many times). BGL is not there for that. Everyone that already worked on something useful (not just trivial examples) that needed anything graph related knows what makes it a “hard” area: the complexity associated to find/creation of correct and efficient solutions. Many problems are even hard to define on mathematical form. Fact is, useful graph algorithms are hard to develop and reuse and I bet this is just what BGL is trying to address, code reuse.

2006-12-05 3:53 pm
rayiner
Yes, of course, that’s what BGL is trying to do. The complex template API isn’t there for kicks, its there to allow code reuse. However, when a small mistake will get you pages of template expansions as an error message, then people aren’t going to want to reuse that code.