A library that I work on often these days, meshoptimizer, has changed over time to use fewer and fewer C++ library features, up until the current state where the code closely resembles C even though it uses some C++ features. There have been many reasons behind the changes – dropping the C++11 requirement allowed me to make sure anybody can compile the library on any platform, removing std::vector substantially improved the performance of unoptimized builds, and removing <algorithm> includes sped up compilation. However, I’ve never quite taken the leap all the way to C with this codebase. Today we’ll explore the gamut of possible C++ implementations for one specific algorithm, the mesh simplifier, henceforth known as simplifier.cpp, and see if going all the way to C is worthwhile.
If a library is too slow in Debug mode, then build it in Release with debug symbols. That works well enough for tracking callback sources and doing profiling. The app using the library doesn’t need to debug it. If the library isn’t working, write a test for that and debug the library itself.
As for MSVC: yes, it is incredibly slow with C++ in Debug mode. First, because it doesn’t optimize out and inline the many tiny functions. Second, because the MSVC C++ library includes extensive debugging features that can catch things like invalidated iterators, even if they still point into the same std::vector.
The C version of the code is faster in the scenarios not many people care about. If you want speed use optimized code. If you want to debug the library then you want the MSVC code because it really is good at catching mistakes.
A Debug build’s purpose is, as the name says, to debug the library and verify the system’s behaviour, not performance measurement or product release. I prefer good stack traces, perfect line-level debugging, the ability to view the full program’s dynamic state, integrity checking, etc. for the development/QA phase. Compilation time is completely irrelevant – just get a faster development machine with more CPU cores and RAM, and/or use incremental builds, distributed builds, or just ccache, which is supported by every normal IDE/build system. A Release build with debug symbols is also usually fine, as long as you can get a correct stack trace. As for performance – I prefer to optimize the algorithm itself, not make the code crappier with pseudo-optimizations that make the source less readable, in the old Unix developer tradition of “you are not expected to understand this”. Hand optimization is the last phase, and it needs careful profiling and testing for possible side effects (with a high likelihood of analysing assembly code).
The problem with optimizing the algorithm itself is that doing it properly really does require dropping ‘features’ of C++ in some cases. For example, you can optimize a string parsing algorithm all you like, but nothing you do is going to get you as much of a performance boost as minimizing (or eliminating) usage of std::string, unless you have a horrible algorithm to begin with. The same goes for almost anything else in the C++ standard library that deals with immutable types, because all such things treat memory management like it’s completely free when it’s one of the most expensive things on almost any platform.
Nothing in the C++ standard library deals with immutable types. Everything is mutable, you can reuse all you like, and it is very conservative with memory. Just look at all the in-place algorithms that encourage people to reuse their memory.
jpelczar,
Obviously compilation time and runtime performance are completely different problems (at least for static languages). Personally I’m nitpicky about software efficiency in general, and slow builds are one of the banes of software development. I hesitate to accept the “just upgrade your hardware” solution because developers spoiled by high-end hardware often just end up sweeping bloat and overhead under the rug. Many of your users might even thank you for downgrading during development to increase the motivation for making a much better performing end product.
I believe that compilation time should be a non-issue with today’s hardware – even a huge project should compile from scratch in no time – but part of the problem is with the languages themselves. The pre-processor in C and C++ is an ugly remnant from the beginning that makes parsing a context-sensitive operation requiring multiple passes. Templates are technically Turing complete, but at great expense to what could otherwise be a much simpler compilation process. I’m stuck with these languages as much as ever, but I resent them more as the years pass, and I feel they’ve held our industry back immeasurably in terms of security and productivity. Clearly there are a lot of people who agree, as evidenced by all the new languages that we can’t even keep up with, but alas I don’t feel that any of the challengers has enough clout to dethrone the king as the de facto standard.
At least runtime performance is decent so long as we make careful use of its features.
> Debug build’s purpose is as the name says to debug the library and verify system’s behaviour, not performance measurement
Thing is, on large projects Debug builds quickly become unusable. Especially in video games, testing at 3fps is not feasible, prevents reproducing bugs, and, worse, it can completely hide bugs due to different timings between the different multi-threaded modules of the software. That’s one of the causes of release-only bugs.
Is C++ fast? It’s basically the same as the question “can C++ run with tiny amounts of RAM?”, and it basically has the same answer: yes, if you pretend it’s C and avoid most of the C++ features and libraries, ESPECIALLY iostream. I did a C++ Tic Tac Toe demo on the Sega Genesis showing how you would do C++ homebrew on the Genesis with my toolchain (gcc). The speed isn’t as much of an issue these days as the memory usage. Quite a few C++ libraries assume you’re on a PC with near-infinite memory.
JLF65,
I agree it can be a problem. While C++ adds much-needed functionality and features that were missing in C, it also makes it easier to hide badly optimized code patterns. Sometimes C++ code can be less efficient and more difficult to write, so when I code in C++ I tend to straddle both styles rather than embracing the newer C++ idioms for everything.
Another thing is that I often require syscalls that aren’t part of the C++ standard library, and I find it better and cleaner to use the C interfaces directly rather than muck around trying to extend C++ IO.
https://www.youtube.com/watch?v=zBkNBP00wJE
People like you are doing it wrong and refusing to learn how those actual features work. This guy recreated a popular game on the C64 using fancy new C++17 that cost absolutely nothing on the hardware, some of which could even be considered negative cost.
kwan_e,
People like me? Please, I’ve done it both ways and I use the features I find useful, but like JLF65 I find C++ streams suffer from overengineering, and for that I still prefer using straight C. In practice C++ libraries are often incomplete anyway, and it can take a lot more boilerplate code and effort to convert and debug everything in C++ than just using C libraries directly. Just because we have some disagreements doesn’t make us wrong; it just means we have a difference of opinion.
That’s a long video, I don’t have time to watch most of it. I agree that C++ can be powerful and when used carefully can produce efficient code. I certainly prefer to have C++ over C in my projects as C++ offers more tools, but it doesn’t mean I have to use them all the time 🙂
Alfman,
That’s too bad, because what he does in the video is far from “used carefully”. He’s just using those features normally, with no special tricks like you would need in C. No security risks. No undefined behaviour tricks. If I remember correctly, he even uses virtual dispatch (instead of hand-rolling yet another vtable-like implementation, which C programmers seem to love to do), which compiled down to nothing because of devirtualization.
iostreams are there when you want to write quick one-off tools, and no one claims they are the best to use in all situations. If I want a quick tool that processes a file line by line, I am sure as hell not going to use cstdio and deal with buffer management just to read a line. No sane person uses iostreams for network stuff, for example.
What I’m calling wrong is this perpetual myth that you have to go low-level (often with tricks) to get performance, even in C++*. So if someone gives advice that to use C++ for efficiency you have to do away with C++ specific features, then that is demonstrably, empirically, wrong. As in “put on godbolt and look at the assembly” wrong.
If someone can use standard high level features in standard ways and produce a program that runs on a C64, you can literally have no different opinion on the matter, because any opinion that disagrees with empirical, demonstrable, fact is a falsehood. I wasn’t really going to comment on this article at first, but people keep repeating myths that have been busted.
* Not to mention the fact that Rust (and Ada and D and…) also has many high level features that when used normally, also compile down to nothing.
kwan_e,
Good to know that you, JLF65 and I all agree on that!
Ok, but I have no idea who/what these statements are in response to? Jpelczar is the only person who mentioned assembly, are you referring to him?
I assume this is in response to “At least runtime performance is decent so long as we make careful use of its features.”. I stand by that. C++ abstractions can mask allocation overhead that would be more readily obvious in C, simply because C doesn’t have those abstractions. Sure, abstractions can be a productivity boost, but naive use of high-level objects can hurt performance compared to C code, where we have a better idea of what’s happening.
Take a C++ “string”, one of the most useful abstractions missing from C. Will it allocate on the stack? On the heap? If I copy a string, does it duplicate or reuse the memory? How many characters can I append to it without reallocating? If I put a bunch of strings in a class, will that result in multiple memory allocations and frees? Is the implementation thread safe? The answers to such questions are implementation specific and have changed over time. C++11 helps answer some of these things by requiring semantics that disallow copy-on-write, but there’s still a lot of confusion about what the implementations are doing.
https://social.msdn.microsoft.com/Forums/vstudio/en-US/06367898-d42a-4c0d-a83f-1042c44b1b4e/strings-allocate-in-stack-or-heap
https://stackoverflow.com/questions/42049778/where-is-a-stdstring-allocated-in-memory
https://stackoverflow.com/questions/783944/how-do-i-allocate-a-stdstring-on-the-stack-using-glibcs-string-implementation
https://stackoverflow.com/questions/1594803/is-stdstring-thead-safe-with-gcc-4-3
https://stackoverflow.com/questions/17298164/copy-on-write-support-in-stl
https://stackoverflow.com/questions/12199710/legality-of-cow-stdstring-implementation-in-c11
One of the benefits of C++’s powerful abstractions is that programmers shouldn’t have to concern themselves with what goes on under the hood. However, when it comes to performance, implementation details and trade-offs do matter. Indiscriminate use of C++ strings, rather than the character arrays C developers use, can affect both runtime performance and memory usage. Does this imply bad performance in C++? No, and please don’t misinterpret my words to mean that. But, as I said before: runtime performance is decent so long as we make careful use of its features. If anything, I’d expect you to agree with that, not only for C++ but for any language.
Are you referring to me? None of what I am saying should be controversial to you.
Alfman,
Well, my original reply was quoting JLF65, so I was only responding to his comment. As you can see, I am now also doing what you do and addressing people by name, since the thread linking feature seems to be missing. Sorry for the confusion.
No, I was responding to the comment from JLF65 that I quoted originally. I highly disagree with his advice of “if you pretend it’s C…”.
kwan_e,
Haha, I do find it confusing. We’ll have to cope somehow. I don’t think longer discussions will work well in this format, which is why I sent osnews a script to toggle chronological comments and I was hoping they’d install it, but I don’t know what’s happening with that.
It’s pretty cool though if you didn’t see it earlier:
http://vocabit.com/osnews/sort_comments_2.html