Javolution 3.0 – A Java Revolution?

Submitted by Jean-Marie Dautelle 2005-02-23 Java 45 Comments

The new Javolution 3.0 (open source library) allows your objects to be preallocated at start-up and transparently recycled during execution. No dynamic object creation or garbage collection ever (and a significant increase in the execution speed of your Java programs).
Also, noteworthy in the new version:
– Text manipulation in O(Log(n)) instead of O(n) for standard String/StringBuffer.
– FastMap, a map whose capacity increases smoothly with no resize/rehash ever.
– Enhanced Struct/Union classes for easier interoperability with C/C++ struct/union.

Javolution runs on all Java platforms (from CLDC 1.0 to J2EE 1.5) or natively when compiled with GCJ (GNU Compiler for Java).

About The Author

Eugenia Loli

Ex-programmer, ex-editor in chief at OSNews.com, now a visual artist/filmmaker.


45 Comments

2005-02-24 11:13 am
Anonymous
And how do you reuse immutable objects like Integer?
2005-02-24 11:43 am
Anonymous
And how do you reuse mutable objects whose state is completely unknown at the time they are scheduled for reuse – considering that there might be no failsafe sequence to bring them into known state?
I really don’t understand why you would reuse an object if what you really want is to reuse the memory that the object occupies for a new object. The difference is only calling the constructor again for the same piece of memory, bringing the new object into known state, instead of re-initializing the object manually.
2005-02-24 11:59 am
Anonymous
[quote]I really don’t understand why you would reuse an object if what you really want is to reuse the memory that the object occupies for a new object. The difference is only calling the constructor again for the same piece of memory, bringing the new object into known state, instead of re-initializing the object manually. [/quote]
Well, simple. Creating a new object is up to 10 times as slow as re-initing it. Read more about what this lib can do, and why. The ideas are actually very good!
2005-02-24 12:28 pm
Anonymous
It’s all a way to get around the design flaws of java. Java (as opposed to .NET) does not have user-defined value types, so everything is allocated on the heap with the performance penalty associated with this, especially for small objects like boxed primitives.
For a long time, the best way to get around this limitation was to use object pools. This library, as far as I understand it, does nothing more than generalize the concept of object pools and provide some kind of stack allocation by having thread-local object pools.
In .NET you would just use a value type if you want to avoid gc. But these days garbage collectors are so good that it is not necessary to avoid gc for large objects. If you have millions of small objects (like e.g. boxed ints aka Integers), then bypassing the gc pays off really well.
But that is why .NET has user-defined value types.
I used to be a big java fan. I liked the minimalistic approach of the language. But the absence of value types drove me to .NET. I don’t care about all the syntax candy like delegates, properties, operator overloading etc. But value types are essential, and java does not have them.
I would like to see a “cleaned up” java that keeps the language simple. Like so:
whenever
-a class is final
-all its members are final
=> the class is immutable
-the class is less than e.g. 20 bytes
it is created on the stack and behaves like a value type. The language would not have to be changed for this. Just the VM.
That would improve many things by an order of magnitude. For example an Integer[] would be a contiguous block of memory and not a block of pointers and small objects blown all over the heap. (The original designers of java probably had something like this in mind when they made all the primitive wrappers immutable).
Like I said the language would not have to be changed for this since with immutable objects there is no difference between value semantics and reference semantics.
But it seems that sun does not want to touch the jvm, so all you java programmers will have to live with hacks like this or bad performance.
Too bad. Java could have been so cool.
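As a rough illustration of the thread-local object pool idea described in the comment above, here is a minimal sketch, assuming Java 8 or later. The names LocalPool, PoolDemo and Vec2 are made up for this example and are not Javolution's API; the library's real mechanism may differ considerably.

```java
import java.util.ArrayDeque;
import java.util.function.Supplier;

/** Hypothetical per-thread object pool; not Javolution's actual API. */
final class LocalPool<T> {
    // One free list per thread, so acquire/release need no locking.
    private final ThreadLocal<ArrayDeque<T>> free =
            ThreadLocal.withInitial(ArrayDeque::new);
    private final Supplier<T> factory;

    LocalPool(Supplier<T> factory) {
        this.factory = factory;
    }

    /** Returns a recycled instance if one is available, otherwise a new one. */
    T acquire() {
        T obj = free.get().pollFirst();
        return obj != null ? obj : factory.get();
    }

    /** Hands an instance back; the caller must not use it afterwards. */
    void release(T obj) {
        free.get().addFirst(obj);
    }
}

final class PoolDemo {
    /** A small mutable object that is re-initialised instead of re-created. */
    static final class Vec2 {
        float x, y;

        Vec2 set(float x, float y) {
            this.x = x;
            this.y = y;
            return this;
        }
    }

    /** True when the second acquire() hands back the released instance. */
    static boolean reusesInstance() {
        LocalPool<Vec2> pool = new LocalPool<>(Vec2::new);
        Vec2 a = pool.acquire().set(1f, 2f);
        pool.release(a);
        Vec2 b = pool.acquire().set(3f, 4f); // same object, new state
        return a == b;
    }

    public static void main(String[] args) {
        System.out.println("reused: " + reusesInstance());
    }
}
```

The sketch also shows the weakness raised earlier in the thread: nothing stops a caller from holding on to an object after release(), and the pool only works for mutable types that can be reset to a known state.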
2005-02-24 12:47 pm
Anonymous
<troll>
Help me understand how one allocates prior to allocation.
</troll>
2005-02-24 12:56 pm
Anonymous
Stack or heap allocation is a low level consideration, I think that for a language like Java, it must be the VM or the compiler problem to pick the best method depending on the memory architecture of the running platform.
Same thing for object reuse.
2005-02-24 1:26 pm
Anonymous
I don’t really see the problem with Java at the moment. Sure, objects are all allocated on the heap but there are still primitive types. Instead of Integer[], what’s wrong with int[]?
The only time when boxing primitive types in their corresponding class will bite you in the behind is when using collections like ArrayList, Vector, etc. with primitives in performance critical code. Which is why, when performance is a concern you will use libraries like Colt and Visual numerics which provide collections that do not have the problems associated with boxing as they work on primitives.
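To make the int[] versus boxed-collection contrast concrete, here is a small sketch using only the standard library. The class name BoxingDemo is made up for illustration, and no timings are claimed; it only shows where the Integer objects appear.

```java
import java.util.ArrayList;
import java.util.List;

final class BoxingDemo {
    /** Sums values held in a primitive array: no wrapper objects involved. */
    static long sumArray(int[] values) {
        long sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    /** Sums values held in a standard collection: every element is an Integer object. */
    static long sumList(List<Integer> values) {
        long sum = 0;
        for (Integer v : values) {
            sum += v; // unboxed on every iteration
        }
        return sum;
    }

    public static void main(String[] args) {
        int n = 1000;
        int[] array = new int[n];
        List<Integer> list = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            array[i] = i;
            list.add(i); // autoboxed to Integer on insertion
        }
        System.out.println(sumArray(array) == sumList(list));
    }
}
```

Both methods compute the same result; the difference is that the list stores references to Integer objects while the array stores the int values themselves.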
2005-02-24 1:38 pm
Anonymous
Why should you not be able to use primitives in collections with decent performance? Using third party collections just to store a few integers efficiently is excessive.
What if you want user-defined value types? For example:
final class PointF {
final float x;
final float y;
…
}
In any decent language something like this should be handled as a value type. But not in java. In java a PointF[] would be an array of pointers with the actual data splattered all over the heap. The difference is extremely important since modern processors need contiguous memory access for the cache prefetch to work. And of course storing all these small structures as separate objects with pointers has a significant memory overhead.
Note that you would not even have to change the language for this, since there is no difference between reference semantics and value semantics for immutable types. You would have to modify the VM and the java bytecode though, and that is something sun refuses to do.
Take a look at swing and the amount of temporary PointF and RectangleF objects that are created. Supporting value semantics for small objects would tremendously speed up swing.
This issue is the core performance problem of java, and I really can’t understand why sun does not do something about this. I talked to a sun engineer once at linuxworld, and he agreed with me.
2005-02-24 1:38 pm
Anonymous
User-defined value types would certainly address some performance issues with Java primitive wrappers. I don’t think you need to modify the VM specification. This could be done by the bytecode compiler. For example: Instances of Integer or Integer[] implemented as int, int[] (no pointer).
I am surprised this has not been done yet!
2005-02-24 1:54 pm
Anonymous
If this can be done without modifying the bytecode, somebody should do it ASAP. I’m on .NET now, but competition is always good. A faster java would mean that microsoft would have to address performance issues with .NET sooner.
The interpretation of the java bytecodes dealing with object references such as aload would have to be changed.
For example “aload 1” would jit to loading the object reference if the object is mutable or large, and to loading the object on the stack by value if the object is immutable and small.
I am still not sure whether the java bytecode would permit doing this. But the designers of java definitely had something like this in mind when they made the primitive wrappers immutable.
But I guess it’s no use speculating about this. Sun had 10 years to address this issue, and they did not do it.
2005-02-24 2:21 pm
Anonymous
No, the reason was, only 5% of the developers need to use primitives in collections extensively, and 1% of them do not want to use arrays or primitive collections for that, and 0.01% of them think it is a reason to switch from java. So, Sun does not want to please 1% by displeasing 99%, breaking backward compatibility or complicating java. The same is true for operator overloading etc.
2005-02-24 2:31 pm
Anonymous
Honestly, in any garbage-collected language, creation time should be almost zero (especially since java features a compacting allocator). If objects die quickly, the GC time should also be free (at least for the better GC algorithms). So why is this that much of a deal with a modern Java implementation?
2005-02-24 3:08 pm
Anonymous
“No, the reason was, only 5% of the developers need to use primitives in collections extensively, and 1% of them do not want to use arrays or primitive collections for that, and 0.01% of them think it is a reason to switch from java”
If this is such a small issue, then why are there so many strange workarounds (like the trove collection library and the javolution library)? You can’t really claim that allocating every single object using a static factory method like in the javolution library is natural. It’s a hack.
“Same is true for operator overloading etc etc etc..”
There is a huge difference between syntax candy like operators, properties, foreach etc. and semantic differences like the absence of user-defined value types. Of course you can emulate immutable value types using immutable reference types, but the performance will be horrible if the jit does not support this.
2005-02-24 3:10 pm
Anonymous
Depends on the demographics of the developers you talk to :-). If they’re largely in the scientific community, you can be sure they will be falling over themselves for primitive collections, operator overloading and stack allocated objects (i.e. value instead of reference objects).
Well, maybe value type objects aren’t such a big deal. How much faster is .NET compared to java when using value type objects? Anyone got any benchmarks?
2005-02-24 3:21 pm
Anonymous
Value type objects are a big deal, especially if you work with arrays of them. Take the hypothetical point class in the last example.
struct/class Point
{
float x;
float y;
}
Storing this struct by reference has an overhead of 4 references in the current java VM. 3 references for the heap management and one for storing the reference itself. So a Point[] with 1000000 entries will consume 24MB on a 32bit machine or 40MB on a 64bit machine. An array of structs where the points are stored in-place will consume 8MB regardless of the architecture.
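The figures above can be reproduced with a few lines of arithmetic. The sketch below simply encodes the comment's own assumption of four reference-sized words of overhead per object (three for heap management plus the reference itself); real JVM object headers vary by implementation, and the class name PointMemory is made up for illustration.

```java
final class PointMemory {
    static final int FLOAT_BYTES = 4;
    static final int FIELDS = 2;        // x and y
    static final int OVERHEAD_REFS = 4; // the comment's assumption: 3 for heap management + 1 reference

    /** Bytes used when each point is a separate heap object reached through a reference. */
    static long bytesByReference(long count, int refBytes) {
        return count * ((long) OVERHEAD_REFS * refBytes + FIELDS * FLOAT_BYTES);
    }

    /** Bytes used when the two floats of each point are stored in place. */
    static long bytesInPlace(long count) {
        return count * FIELDS * FLOAT_BYTES;
    }

    public static void main(String[] args) {
        long n = 1_000_000L;
        System.out.println("32-bit, by reference: " + bytesByReference(n, 4) + " bytes");
        System.out.println("64-bit, by reference: " + bytesByReference(n, 8) + " bytes");
        System.out.println("in place:             " + bytesInPlace(n) + " bytes");
    }
}
```

Per point this is 4 x 4 + 8 = 24 bytes with 32-bit references, 4 x 8 + 8 = 40 bytes with 64-bit references, and 8 bytes in place, which matches the 24MB, 40MB and 8MB quoted for one million entries.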
The performance difference will be a factor of 5-10. But of course aaa will say it is a “Microbenchmark” and therefore irrelevant. Run them yourself if you are really interested.
< http://www.lambda-computing.com/publications/rants/csvsjava/javagen… >
< http://www.lambda-computing.com/publications/rants/csvsjava/csgener… >
2005-02-24 3:39 pm
Anonymous
In any decent language something like this should be handled as a value type.
Actually, any decent compiler should allow the programmer to treat the above construct as a full object, and handle it internally as an unboxed object. Especially for this particular example, where no subclasses are possible, it’s fairly easy for a good compiler to elide the box. Unfortunately, there are no really good Java compilers.
2005-02-24 3:52 pm
Anonymous
I am a big Java fan. I really love it because of its design. I love the entire OOD that you design your application in absolute OO way. This is necessary for almost every piece of business software and sometimes games and other applications.
I have seen people complaining on a forum about how slow Java is; they post their code and others laugh at them because the code is not even up to junior level, not to mention optimized.
.NET has big problem because of storing data on memory and not recycling(talking about ASP at least). On .NET platform, when you create an object, the object will not be recycled until server round trip or restart, that’s why .NET is fast but many reports and benchmarks show that .NET is using 3 times more memory than Java.
Javolution is definitely a way around for someone who doesn’t do optimization in their programs. It’s just like someone who uses pointers in C/C++ to build a generic binary tree, while good programmers may use an array to implement the binary tree for fast access.
I believe nowadays, it is not about startup speed. I asked some of my friends, and they would rather have startup be slower, even up to 100% slower than normal, as long as the program runs smoothly without crashing.
Javolution has some overhead when it allocates extra memory for pre-caching, but this is also an advantage of fast access.
It’s a tradeoff, you can’t escape from it. Even OOC/C++ have disadvantage of breaking the OOD(void* and function*), their void* and function* are just advantages for programs to have fast trick on some stupid problems.
If you really think Java is very slow, think twice. Look at Eclipse and NetBeans or even some other pure Java programs: they are ONLY partially optimized. And usually they are under heavy development. If one part is not finished in the next plan, it won’t be optimized too much.
Anyway, if you have problems optimizing, go search the internet, there are many tips and hints. I suggest this site which has all Java news, hints and tips, and resources:
http://www.javarss.com
🙂
2005-02-24 3:52 pm
Anonymous
yes it is a stupid micro (actually nano) benchmark. it means nothing.
For a better comparison (even if it still is not a good benchmark), check here:
http://www.jot.fm/issues/issue_2004_09/column8
Quote:
“For integer types (primitive type in Java and value type in C#), generic Java was 2.29 slower than generic C#. That is significant. For reference types, generic Java was 1.20 times slower. That is not significant and is in part due to the overall efficiency of Java JIT compared with the C# JIT (other experiments have suggested that the C# JIT is slightly more efficient than the Java 1.5 JIT).”
The only thing C# developers say is “We do not have generics, but in the future when we will use collections with same syntax as java in our uber scientific calculations it will be 2 times faster in a for loop…” you are fun.
2005-02-24 3:57 pm
Anonymous
“Actually, any decent compiler should allow the programmer to treat the above construct as a full object, and handle it internally as an unboxed object. Especially for this particular example, where no subclasses are possible, it’s fairly easy for a good compiler to elide the box.”
That is the sad thing. If the object is final and immutable, nothing prevents the jit from treating it as a value type transparently. The creators of the java class library have anticipated this optimization. That is probably why they made the primitive wrappers immutable. The people of JavaGrande proposed this solution in 1998. And this is not exactly rocket science: other languages have done this for years.
“Unfortunately, there are no really good Java compilers ”
The hotspot compilers are quite good when optimizing code using the builtin primitives. They just persistently ignore this issue.
2005-02-24 3:57 pm
Anonymous
The benchmarks are that way for the simple reason that Java VM doesn’t have generic types. Generics are full boxed objects, except the compiler does the casting nastiness for you under the hood. The .NET CLR (2.0), on the other hand, does have true generics, and primitives are stored as primitives and not as objects. Even reference types are handled more efficiently, because the compiler can elide some type-checks.
Thus, Java is slow on the primitive test, because the objects are getting boxed, and slower on the reference test, partly because it has to do extra type-checking that the C# version doesn’t.
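The point about Java generics being erased and boxed can be checked directly with standard library calls. This is only a small demonstration; the class name ErasureDemo is made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class ErasureDemo {
    /** After erasure, List<Integer> and List<String> share one runtime class. */
    static boolean sameRuntimeClass() {
        List<Integer> ints = new ArrayList<>();
        List<String> strings = new ArrayList<>();
        return ints.getClass() == strings.getClass();
    }

    /** The element stored for an int is a boxed Integer object, not a primitive. */
    static boolean storesBoxedObject() {
        List<Integer> ints = new ArrayList<>();
        ints.add(42); // autoboxed to Integer
        Object first = ((List<?>) ints).get(0);
        return first instanceof Integer;
    }

    public static void main(String[] args) {
        System.out.println("same runtime class: " + sameRuntimeClass());
        System.out.println("stores boxed object: " + storesBoxedObject());
    }
}
```

Both methods return true: the type parameter exists only at compile time, so the collection cannot specialise its storage for int the way the comment describes the CLR doing.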
2005-02-24 4:00 pm
Anonymous
“NET has big problem because of storing data on memory and not recycling(talking about ASP at least). On .NET platform, when you create an object, the object will not be recycled until server round trip or restart”
What??
The .NET CLR has garbage collection like most languages out there, doesn’t it?
“For reference types, generic Java was 1.20 times slower. That is not significant”
Not??
Ahum, I’d say if a Java JIT after TEN YEARS (yes, dammit!) is slower than a brand-new CLR-VM, then that makes me think!
2005-02-24 4:06 pm
Anonymous
@aaa
If it is actually 2x faster, that is very significant.
@Rüdiger Klaehn
I’ll take your word for it that the C# version is much faster. I’m running PPC Linux and OS X at the moment. Neither of them have a version of Java or C# that supports generics.
2005-02-24 4:15 pm
Anonymous
i know the issue about primitives in collections. My point is, it is actually not a big deal. but C# developers think this is the most important thing in the world, and it is the only thing they say, without knowing the reasoning behind it.
Performance measurement is a difficult task. Sadly microsoft used young and clueless developers when .Net was first introduced, with claims like “.Net is 8 times faster than java..”. seriously i still see those people around. Or they tried to create slogans like “Java is the next cobol”, not noticing C# is hell similar to it.
Or when mono was first introduced they gave performance as the reason not to choose java; now shamefully they have removed those words from their site. maybe playing to the winning horse was the idea, but hey, i still do not see .Net as a winning horse, no matter how much it is pushed.
i am just smiling to them. Of course time will tell.
Anyway, i am very happy with Java, and its performance. i know it is not the uber perfect technology (there is no such thing), but in the mean time, it is the best for my problem domains.. i say good luck to those who use an OS lock-down technology.
2005-02-24 4:15 pm
Anonymous
@Viro:
If you compile mono from source you can get generics on OSX and PPC Linux. But I don’t know whether the code generation for PPC is as optimized as the one for x86. It should be, because I suppose generating code for the clean PPC ISA should be much easier than for the baroque x86 ISA.
You don’t have to run a benchmark to see that the memory overhead for storing tiny objects on the heap will be huge.
@aaa:
In my projects I got much more than a factor of 2. But even if it were only a factor of 2, this would be very significant. A simulation running at 20fps is interactive, at 10fps it’s more like a slideshow.
If you love java, instead of denying that the problem exists you should lobby sun to address this issue.
2005-02-24 4:19 pm
Anonymous
the point is, the usage of value types in collections is very very limited. When was the last time you needed a HashSet for integer values? Most of the time arrays are enough and do a better job in terms of performance (matrices, FFT calculations etc. use pre-determined sizes). Plus, as i noted before, you are not without a choice in java: you can use primitive collections if you wish to use a Vector or linked list for primitives. Thus, in overall performance, i do believe java is better.
2005-02-24 4:41 pm
Anonymous
The “Java mentality” is that SUN is always right. SUN knows best. When Java prevents creative coders from expressing their solutions, it is because the creative coder is wrong and SUN is right.
If it wasn’t for .NET stealing Java’s thunder, would Java have had autoboxing, or *gasp* generics? And after a decade of development, why does the bloody JVM still generate freaking sluggish and bloated code?
Watch out for the response. “It’s not Java’s fault, it’s yours!”
2005-02-24 4:42 pm
Anonymous
” A simulation running at 20fps is interactive, at 10fps its more like a slideshow. ”
Your simulation is based on a for loop with value objects and collections? i can achieve 30fps by using an array then.. what a reasoning. I am very happy with the performance of java and it gives Microsoft a run for its money.
2005-02-24 4:46 pm
Anonymous
Ahum, I’d say if a Java JIT after TEN YEARS (yes, dammit!) is slower than a brand-new CLR-VM, then that makes me think!
There are two things to note. First, the CLR traces its lineage back to Colusa’s OmniVM, which is well over a decade old. Since Microsoft bought Colusa, I assume they got a lot of the experience of the people who implemented OmniVM. Though the current CLR is based on new code, the design is fairly mature. Second, Microsoft has a lot of experience with code generation on x86. Microsoft C and Visual C have been available for close to two decades. Sun has much less experience generating code for x86.
2005-02-24 4:48 pm
Anonymous
Generics were planned long before: JSR-14 was approved on 11 May 1999. Why do you think C#’s generics are almost identical to java’s? As for autoboxing, i do not even like it.
2005-02-24 7:16 pm
Anonymous
Microsoft C and Visual C have been available for close to two decades. Sun has much less experience generating code for x86.
Yet somehow HotSpot optimizes better for IA32/SSE2 than it does for SPARC…
2005-02-24 7:47 pm
Anonymous
“That is the sad thing. If the object is final and immutable, nothing prevents the jit from treating it as a value type transparently.”
There are a couple wrinkles…
1. “java.lang.Integer” references can be “null”. The problem here is that “null” was allowed to creep into *every* reference type. Using “null” should have been more explicit and should have been allowed on primitive types.
http://nice.sourceforge.net/manual.html#optionTypes
2. People often convert “Byte[]” to “Object[]”, which costs nothing. Converting a “byte[]” to “Object[]” is expensive. Though the JIT could detect such situations and use boxed values when necessary, this would result in unintuitive performance degradation. Though an API designed with this in mind could work well, the current Java API would probably match up well.
Sadly, both of these problems are caused by arbitrary design decisions in Java. Had parametric polymorphism been taken into account in the beginning, I’m sure these issues would have been completely avoided. But now, it’s much harder (not insurmountable, but not trivial, either).
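The two wrinkles listed above can be seen in plain Java. The sketch below uses only standard language features; the class name WrapperWrinkles and its method names are made up for illustration.

```java
final class WrapperWrinkles {
    /** Wrinkle 1: a wrapper reference can be null, which a primitive int cannot express. */
    static boolean wrapperCanBeNull() {
        Integer missing = null;
        return missing == null;
    }

    /** Wrinkle 2a: a Byte[] already is an Object[], so the conversion copies nothing. */
    static boolean sameArray() {
        Byte[] boxed = { Byte.valueOf((byte) 1), Byte.valueOf((byte) 2) };
        Object[] view = boxed; // array covariance: same array, different static type
        return view == boxed;
    }

    /** Wrinkle 2b: a byte[] is not an Object[]; every element has to be boxed into a new array. */
    static Object[] box(byte[] raw) {
        Object[] out = new Object[raw.length];
        for (int i = 0; i < raw.length; i++) {
            out[i] = raw[i]; // boxes each byte to a Byte
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(wrapperCanBeNull());
        System.out.println(sameArray());
        System.out.println(box(new byte[] { 1, 2 }).length);
    }
}
```

A JIT that stored Byte[] as raw bytes behind the scenes would have to fall back to something like box() whenever the array escapes as an Object[], which is the hidden cost the comment warns about.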
2005-02-24 7:47 pm
Anonymous
“Yet somehow HotSpot optimizes better for IA32/SSE2 than it does for SPARC…”
which one? HotSpot for server or for desktop?
i wanna know. thank you
2005-02-24 8:34 pm
Anonymous
Yet somehow HotSpot optimizes better for IA32/SSE2 than it does for SPARC…
How is that relevant? I’m saying that part of the reason Microsoft’s CLR has a performance edge over Sun’s JVM is that Microsoft has a lot more experience than Sun in x86 code generation. Now, that doesn’t imply that the JVM would optimize better for SPARC than for IA32. SPARC is very hard to optimize for, being an in-order processor, while current x86 processors, being out-of-order processors, are easier to optimize for. What it does imply, however, is that JVM on SPARC would have a code-generation edge over CLR on SPARC.
2005-02-24 11:17 pm
Anonymous
Rüdiger Klaehn wrote:
> If the object is final and immutable, nothing prevents the
> jit from treating it as a value type transparently. The
> creators of the java class library have anticipated this
> optimization. That is probably why they made the primitive
> wrappers immutable.
I don’t think so. I think they made the wrappers immutable so that you don’t have to clone them all the time. E.g. if you want to do something like “if (isOK(x)) use(x);” you need to be sure that x doesn’t change or you’ll get a race condition and a confused deputy.
2005-02-24 11:45 pm
Anonymous
@Marcus Sundman: Rudiger Klaehn is right. In languages that treat primitives as full-blown objects, many primitives are immutable, and sometimes final. The immutability guarantees to the compiler that nobody can modify the object via a reference. This allows the compiler to store the value in a register, and eliminates the heap allocation (and the pointer dereferencing). Preventing subclassing makes it easier to do unboxing analysis with a less sophisticated type inference engine.
@Rudiger Klaehn: I was being a bit facetious in my comment about the quality of the java compiler. My main beef with it is that it lacks a lot of optimizations that are fairly run-of-the-mill in other compilers. For example, it can’t stack allocate objects with non-escaping references. It can’t elide boxes, it doesn’t attempt any sort of type inference to make dispatch faster, etc.
2005-02-25 12:41 am
Anonymous
Rayiner Hashem wrote
> In languages that treat primitives as full-blown objects,
> many primitives are immutable, and sometimes final. The
> immutability guarantees to the compiler that nobody can
> […]
I never said otherwise. I just said that I don’t think that is why they made the wrappers immutable.
2005-02-25 1:46 am
Anonymous
My point is that the Java team has some pretty competent language designers. I’m sure they are familiar enough with the properties of unboxing analyses that they consciously designed the wrappers to make implementation of such a feature easier.
2005-02-25 3:31 am
Anonymous
The fact that Java requires so many code patterns (as opposed to proper design patterns) and factories to do anything remotely useful really annoys me. This just underlines very fundamental flaws in Java’s design and architecture.
Design patterns (the proper kind, not the bastardized ones common in Java, which are actually code and class patterns) are good for abstracting the problem domain and code separation, but when you try to force them into a pattern for forcing code and classes, it just screams something very wrong. It just adds a million useless abstraction layers that complicate and convolute your solution. It’s also one of the main reasons OO gets a bad rep. Java does too many things just for the sake of making a class “more OO”, but ends up shooting itself in the foot and makes your system less OO instead and adds no value.
I use Java and C#. I work with Java in my job. I make use of lots of design patterns, but in C#, I have not needed even once to resort to a code pattern. i.e. I’ll have singletons but used absolutely 0 singleton code patterns.
Java in its current state should really be relegated to academics. But that’s just me.
2005-02-25 3:49 am
Anonymous
Rayiner Hashem wrote
> My point is that the Java team has some pretty competent
> language designers. I’m sure they are familiar enough with
> the properties of unboxing analyses that they consciously
> designed the wrappers to make implementation of such a
> feature easier.
I, too, am sure they are familiar with primitive-boxing, but I believe that if they had chosen immutability because of that then they would have made primitives look like objects in the first place (which I think they should’ve done).
2005-02-25 3:52 am
Anonymous
Anonymous wrote:
> I make use of lots of design patterns, but in C#, I have
> not needed even once to resort to a code pattern. i.e.
> I’ll have singletons but used absolutely 0 singleton code
> patterns.
Could you clarify this a bit, perhaps with an example, please?
2005-02-25 12:05 pm
Anonymous
What’s the difference between a code pattern and a design pattern?
2005-02-25 3:34 pm
Anonymous
The following page contains an example of the kind of performance hit Java suffers due to the lack of value types
http://www.spinellis.gr/blog/20050210/
2005-02-25 7:45 pm
Anonymous
That is a very synthetic example that is written only to show Java’s flaws in performance. The reason that Java is slow is because of boxing/unboxing, *and* the toString() method call whenever you try to display a RInteger.
However, in that example, you could easily replace TreeSet with an array of int[], and you’d not get that performance hit. Or you could just use one of the primitive collections that are available on sourceforge.
So yes, Java is slow when working with the built-in collections, but there are alternatives available, and like aaa has mentioned, for most numerically intensive code, arrays work best.
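As a sketch of the "use an array instead of TreeSet" suggestion, the class below keeps distinct int values in a sorted int[] and answers membership queries with a binary search. It is not the code from the linked page, which is not reproduced here; the name SortedIntSet is made up, and the class is read-only after construction, unlike a real TreeSet.

```java
import java.util.Arrays;

/** Hypothetical read-only set of ints backed by a sorted primitive array. */
final class SortedIntSet {
    private final int[] values; // sorted ascending, duplicates removed

    SortedIntSet(int[] input) {
        int[] sorted = input.clone();
        Arrays.sort(sorted);
        // Compact runs of equal values in place; n counts the distinct values kept.
        int n = 0;
        for (int i = 0; i < sorted.length; i++) {
            if (i == 0 || sorted[i] != sorted[i - 1]) {
                sorted[n++] = sorted[i];
            }
        }
        values = Arrays.copyOf(sorted, n);
    }

    /** Membership test by binary search; no Integer objects are created. */
    boolean contains(int v) {
        return Arrays.binarySearch(values, v) >= 0;
    }

    int size() {
        return values.length;
    }

    public static void main(String[] args) {
        SortedIntSet set = new SortedIntSet(new int[] { 5, 3, 5, 1 });
        System.out.println(set.size() + " distinct values, contains 3: " + set.contains(3));
    }
}
```

The trade-off is the one raised in the reply that follows: the array avoids boxing, but the caller gives up the general-purpose Set abstraction and cheap insertion.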
2005-02-25 10:13 pm
Anonymous
The toString() method is only called 10 times, so it has no effect on performance. Using an array of int might be possible in this particular case, but in a real application one would lose the abstraction offered by a class.
2005-02-25 11:10 pm
Anonymous
Sorry, mistook that part about the toString() methods. Was rushing home from the office, so I didn’t pay attention.
Sure, using an array is the best thing in this toy example. If you are worried about abstraction, you’d use a class. Like I mentioned in the previous post, there are loads of primitive collections around. Have a look at http://pcj.sourceforge.net/ since it’s compatible with the Java Collections Framework. I don’t use it personally since I tend to use Colt if I need primitive collections, but I imagine it should perform just fine.