LLVM 1.7 has been released. “This release contains a completely rewritten llvm-gcc (based on GCC 4.0.1), a brand-new SPARC backend, support for GCC-style generic vectors, SSE and AltiVec intrinsics, and Objective-C/C++; the X86 backend generates much better code and can produce scalar SSE code; the release also has initial DWARF debugging support, includes a new llvm-config utility, has initial support for GCC-style inline assembly, and includes many target-independent code generator and optimizer improvements.”
…what LLVM is (from Wikipedia: http://en.wikipedia.org/wiki/LLVM ):
“Low Level Virtual Machine, generally known as LLVM, is a compiler infrastructure designed for compile-time, link-time, run-time, and “idle-time” optimization of programs written in arbitrary programming languages.”
It sounds very interesting. Is it as nice as it sounds?
Any experienced coder out there want to share some pointers (eh)?
> It sounds very interesting. Is it as nice as it sounds?
Yes
> Any experienced coder out there want to share some pointers (eh )?
Do you have any specific questions?
-Chris
> Do you have any specific questions?
Great job, Chris! I really think your LLVM is revolutionary.
I have 2 questions:
1. How fast is the JIT compared to standard GCC?
2. Ehm… shall we see it in Leopard? :-)
> I have 2 questions:
> 1. How fast is the JIT compared to standard GCC?
It is much, MUCH faster than recompiling the source with GCC. The LLVM JIT works from a precompiled bytecode representation of the source files, where all of the parsing and most of the heavy-duty optimizations (e.g. interprocedural optimization) have already been done. Note that LLVM doesn’t give you “portable binaries” for free, though: things like “#ifdef LINUX” and other features of the C language make it impractical for general C code.
> 2. Ehm… shall we see it in Leopard? :-)
As you might guess, I can’t comment about that.
-Chris
From the website:
“A compilation strategy designed to enable effective program optimization across the entire lifetime of a program. LLVM supports effective optimization at compile time, link-time (particularly interprocedural), run-time and offline (i.e., after software is installed), while remaining transparent to developers and maintaining compatibility with existing build scripts”
Are there any real-world before-and-after LLVM benchmarks showing improvements in any of the above areas? (compile time performance, link speedups, and runtime improvements?).
For example:
1. why doesn’t KDE use this to speed up compile time for Gentoo users?
2. why doesn’t Gnome use this to speed up link-time and eliminate useless vtable references (if possible) in their binaries?
3. how about PovRay using this to improve runtime performance?
– I’m honestly curious, it sounds great, but I’m not familiar with it!
The reason it’s not used is that your entire build system (including the compiler) needs to be converted to use LLVM. It’s very, very new and very, very different.
They did try to get the GCC people to switch over to using LLVM as the representation consumed by the code generators (all GNU compilers compile to an initial intermediate representation), but this was not accepted. Lots of people have argued about why; some say politics, others pragmatism (LLVM has not demonstrated a clear performance advantage yet).
It’s definitely something to look out for though.
> The reason it’s not used is your entire build system (including compiler)
> needs to be converted to use the LLVM.
This amounts to setting “CC=llvm-gcc” in most cases, which can usually be specified when you configure or make a project.
> It’s very, very new and very, very different.
Quite true.
> They did try to get the GCC people to switch over to using LLVM as
> a representation to be used by the code generators (all GNU
> compilers compile to an initial representation) but this was not
> accepted. Lots of people have argued why, some say politics, others
> say pragmatism (LLVM has not demonstrated a clear performance
> advantage yet).
Heh, I must have missed that decision. I thought things were still up in the air! Work on LLVM is continuing at a blistering pace and the GCC community is still considering using LLVM for interprocedural optimization support.
At this point, use of the LLVM *code generators* is unlikely, primarily because LLVM doesn’t support all of the targets that GCC does. However, there is a lot more to LLVM than just code generators. My understanding is that (currently) the most likely design would replace the “tree-ssa” components with LLVM, which would give IPO for free, then hook LLVM up to the existing GCC RTL backend.
-Chris
Yup, I’d love to see LLVM go into gcc, too, in particular since Tom Tromey has been hacking on a gcj JIT using llvm recently and liked it quite a bit.
cheers,
dalibor topic
> Are there any real-world before-and-after LLVM benchmarks showing
> improvements in any of the above areas? (compile time performance,
> link speedups, and runtime improvements?).
I don’t know what you consider to be “real world”. We certainly have shown real-world 20-30% speedups on code, e.g. AltiVec-intensive apps and other things. We have good SPEC CPU2000 numbers, etc.
> For example:
> 1. why doesn’t KDE use this to speed up compile time for
> Gentoo users?
I’m not sure I follow what you mean here. Using interprocedural optimization generally gives you faster executables at the cost of longer compile times.
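As a rough sketch of what interprocedural optimization buys (the file and function names here are hypothetical, not taken from any real project): with traditional separate compilation the call below is an opaque function call, but an optimizer that sees both files at once can inline it and fold the multiply down to a shift.
/* util.c (hypothetical) */
int get_scale(void) { return 4; }
/* main.c (hypothetical) */
int get_scale(void);
int scaled(int x) {
    return x * get_scale();   /* after cross-file inlining, this can become x << 2 */
}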
> 2. why doesn’t Gnome use this to speed up link-time and eliminate
> useless vtable references (if possible) in their binaries?
> 3. how about PovRay using this to improve runtime performance?
There are a couple of possibilities:
1. Because no one has tried before; this is the most likely.
2. Because there is some feature that isn’t supported by LLVM yet.
In any case, we’d welcome more people using LLVM and building applications. If you run into bugs or problems, report them and we will fix them.
-Chris
>> For example:
>> 1. why doesn’t KDE use this to speed up compile time
>> for Gentoo users?
> I’m not sure I follow what you mean here. Using
> interprocedural optimization generally gives you faster
> executables at the cost of longer compile times.
I think what he meant was that it would be possible to compile C/C++ source to an intermediate representation. This representation could then be distributed in order to translate it to machine code on the end user’s machine. (Note: in the Gentoo Linux distribution, users compile applications from source.)
Will this be possible with LLVM? I guess it would, if you can compile to LLVM bytecode with optimization info.
> I think what he meant was that it would be possible to compile C/C++ source to an intermediate representation. This representation could then be distributed in order to translate it to machine code on the end user’s machine.
Hmm, this gives me an even better idea. Let’s compile source code to machine code, and then distribute the machine code to the users. And maybe the utility that installs this machine code could print some messages like “optimizing for your machine”, “Gentoo: it’s like a spoiler for your computer”, and “reticulating splines”…
> I think what he meant was that it would be possible to compile C/C++
> source to an intermediate representation.
Yes, absolutely. If you are on Mac OS X, we support all of C/C++/Objective-C/Objective-C++ (other platforms don’t support Objective-C/C++ well yet [patches welcome]). They can all be linked together and optimized as a unit.
> This representation could then be distributed in order to translate it
> to machine code on the end user’s machine.
Yes you can do this. However, beware what I mentioned in a previous comment:
“Note that LLVM doesn’t give you ‘portable binaries’ for free though: things like “#ifdef LINUX” and other features of the C language make it impractical for general C code. ”
This means that you can’t compile an arbitrary C application to LLVM, then compile it on different architectures or even different operating systems on the same processor. Once the C preprocessor has been run, information is lost.
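For example, a sketch like this (assuming the build system defines LINUX on Linux; everything here is illustrative, not from a real project) shows how the decision is baked in before LLVM ever sees the code, and how target facts like sizeof(long) are already folded into the bytecode:
#include <stdio.h>
int main(void) {
#ifdef LINUX
    printf("Linux build, sizeof(long) = %u\n", (unsigned)sizeof(long));
#else
    printf("non-Linux build, sizeof(long) = %u\n", (unsigned)sizeof(long));
#endif
    return 0;
}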
Gentoo users may love LLVM (better optimization, more control, etc), but (realistically) they will still have to start from source.
-Chris
> Gentoo users may love LLVM (better optimization, more control, etc), but (realistically) they will still have to start from source.
That said, it’d be interesting to use it for distributing applications to API-compatible but not binary-compatible systems. Since a lot of the things that break binary compatibility (BC) happen at either compile time (structure layout) or link time (symbol insertion games), it might just work across systems that are similar enough.
Of course, some stuff happens too early (e.g. C++ class layout) for that to be feasible for certain types of BC.
> This means that you can’t compile an arbitrary C application to
> LLVM, then compile it on different architectures or even different
> operating systems on the same processor. Once the C
> preprocessor has been run, information is lost.
I came to the same conclusion after thinking about it :-) C really isn’t designed for this.
> Gentoo users may love LLVM (better optimization, more control,
> etc), but (realistically) they will still have to start from source.
We surely will :-). Come to think of it, it is actually the binary distributions that could gain from this. They could let the user do the final optimizations for the specific CPU (not just the platform).
This makes me wonder how long the various translations from plain C source down to bytecode take. Also, how long does it take to do the CPU-specific optimizations and produce the final binary?
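To make the CPU-specific angle concrete (the function name is hypothetical, and whether any given compiler vectorizes it is an assumption, not a promise): a loop like the one below can only be turned into SSE or AltiVec code if the final code generator knows which CPU it is targeting, which is exactly the step this model would defer to the end user’s machine.
/* scale.c (hypothetical) */
void scale(float *dst, const float *src, int n, float k) {
    int i;
    for (i = 0; i < n; i++)
        dst[i] = src[i] * k;   /* a target-aware code generator could use SSE/AltiVec here */
}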
The problem with a “generic” virtual machine is that it doesn’t use platform-specific hardware features.
I haven’t read any LLVM info, so I may be wrong,
but how about a “count leading zeros” instruction, for example, or 128-bit hardware floating point?
For sure, you don’t have these bytecodes yet =]
From the reference manual :-)
‘llvm.ctlz.*’ Intrinsic
Syntax:
declare ubyte %llvm.ctlz.i8 (ubyte <src>)
declare ushort %llvm.ctlz.i16(ushort <src>)
declare uint %llvm.ctlz.i32(uint <src>)
declare ulong %llvm.ctlz.i64(ulong <src>)
Overview:
The ‘llvm.ctlz’ family of intrinsic functions counts the number of leading zeros in a variable.
Arguments:
The only argument is the value to be counted. The argument may be of any unsigned integer type. The return type must match the argument type.
Semantics:
The ‘llvm.ctlz’ intrinsic counts the leading (most significant) zeros in a variable. If src == 0, then the result is the size in bits of the type of src. For example, llvm.ctlz(int 2) = 30.
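As a hedged example of how you would reach this intrinsic from C (assuming llvm-gcc lowers GCC’s __builtin_clz builtin to llvm.ctlz, which is an assumption on my part):
#include <stdio.h>
int main(void) {
    unsigned int x = 2;                 /* binary ...00010 */
    printf("%d\n", __builtin_clz(x));   /* prints 30 for a 32-bit unsigned, matching llvm.ctlz(int 2) = 30 */
    return 0;
}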
LLVM is distributed under a BSD-friendly license, is it not? If so, it would seem that it could easily supplant gcc as the default system compiler for the BSD family of operating systems. Is it only the gcc front ends that use GPL’d code, or are there other components as well?
Would it be possible to replace the gcc-based front ends one at a time with BSD-licensed equivalents (starting with the C front end) to gradually wean it off the gcc dependencies? Perhaps this could be a good Google Summer of Code project for someone.
Another option could be sponsored development, which has become very popular in the FreeBSD community as of late. At least three developers have been sponsored for around nine months of full-time work in the last couple of years. Maybe this would be a good community to appeal to if one of the goals is to provide a completely BSD-licensed solution.
I for one would like to see another full open source compiler suite. Too much division of effort is wasteful, but having some competition can be a very healthy thing.
Please don’t get sucked into GCC and go away!