Now, this is an interesting development in the ongoing war against Android. Oracle didn’t just sue Google for allegedly infringing its Java patents; it also claimed copyright infringement. Oracle has amended its complaint, and, fair is fair, they’ve got the code to prove it: indeed, Android contains code that appears to be copied verbatim from Java – mind you, appears. However, the code in question comes straight from Apache’s Harmony project, which raises the question – would a respected and long-established cornerstone of the open source world really accept tainted code in the first place?
The amended complaint reiterates the patent claims, but also goes into more detail about the copyright infringement part of the lawsuit. Oracle trots out some code comparisons of the ‘PolicyNodeImpl.java’ class library, and compares Oracle’s version with that of Android – the parts look identical (but might not be – more below). Interestingly enough, this isn’t really Google’s code; it’s code from Apache’s Harmony project.
“In at least several instances, Android computer program code also was directly copied from copyrighted Oracle America code,” Oracle states in the amended complaint. “For example, as may be readily seen in Exhibit J, the source code in Android’s ‘PolicyNodeImpl.java’ class is nearly identical to ‘PolicyNodeImpl.java’ in Oracle America’s Java, not just in name, but in the source code on a line-for-line basis.”
Since this code comes from the Harmony project, which is part of Apache, I’m really wondering just how much merit Oracle’s complaint has. Oracle is being disingenuous by claiming Google copied code from Oracle, while it’s quite clear the code has actually been taken from Harmony. Of course, if Harmony has no right to use this code, then Google is also liable. However, the reverse is also true – if Google is found to have violated Oracle’s copyright with this piece of code, then so did the Harmony project.
It gets weirder, though. While the code in the Android source tree states it’s Harmony code, this class cannot actually be found in the current Harmony source.
Then there’s the fact that this piece of code actually seems to be testing code, as in, code that doesn’t actually ship on devices (can anyone please confirm this?), which, I think, is also an important distinction, as Oracle claims we’re talking about code shipped to customers on devices.
But there’s more. The code indeed looks copied, but is it, really? Carlo Daffara generated a diff of the two files, which paints a slightly different picture. According to a comment over at Groklaw (I know, I know, but he makes a good point), any similarities can easily be explained by “using the same naming convention for variables and the widespread use of automatic code generation in the Java community”.
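For anyone who wants to run this kind of comparison themselves, here is a minimal sketch using Python's standard `difflib`. The two snippets are made-up stand-ins, not the actual Oracle or Android source, and the function name `similarity_report` is hypothetical; a real check would read the two versions of the file from disk.

```python
import difflib

def similarity_report(a_text, b_text, a_name="original", b_name="suspect"):
    """Return a line-level similarity ratio and a unified diff of two sources."""
    a_lines = a_text.splitlines()
    b_lines = b_text.splitlines()
    # ratio() is 2 * (matching lines) / (total lines in both inputs)
    ratio = difflib.SequenceMatcher(None, a_lines, b_lines).ratio()
    diff = list(difflib.unified_diff(a_lines, b_lines,
                                     fromfile=a_name, tofile=b_name,
                                     lineterm=""))
    return ratio, diff

# Hypothetical stand-in snippets (NOT the real files from the complaint).
ORIGINAL = """\
private boolean criticalityIndicator;
private Set expectedPolicies;
boolean isImmutable = false;
"""
SUSPECT = """\
private boolean criticalityIndicator;
private Set expectedPolicies;
boolean flag1 = false;
"""

ratio, diff = similarity_report(ORIGINAL, SUSPECT)
print(f"line-level similarity: {ratio:.2f}")
print("\n".join(diff))
```

A high ratio only says the files are textually close. It cannot tell copying apart from a shared specification, shared conventions, or decompilation, which is exactly the question being argued here.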
This is not as clear-cut a matter as it seems, but I’m sure the usual suspects will rail on Google anyway. Surely more to follow.
Apache probably trusted whomever the contributor was and their reputation w/o question.
It won’t be hard to backtrack the commits.
I hear a ghost whisper telling me that Java and MySQL are the next Oracle victims.
It’s a rip off.
http://www.binplay.com/2010/10/look-at-copied-oracle-code.html
I agree.
Why are the methods in the same order?
Same private method names – what are the chances of that??
probably decompiled and then re-created. tsk tsk.
Absolutely – everything about those comparisons screams “decompiler”. All the names that would appear in the bytecode (class name, field names) are identical; all local variables have been given obviously machine-assigned names based on their type (set1, flag1, etc).
Looks awfully damning… if this code really came from Harmony, the Apache guys have been *really* careless about the code they accept.
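The naming pattern described above can be checked mechanically. Below is a rough sketch in Python, assuming decompiler-assigned locals look like a lowercase type-like stem plus a number (`set1`, `flag1`); the stem list and the sample source are made up for illustration, and a hit is only a hint, not proof of decompilation.

```python
import re

# Assumed pattern: a type-like stem followed by digits, e.g. set1, flag1.
# The list of stems is a guess for illustration, not taken from any real decompiler.
MACHINE_NAME = re.compile(r"\b(?:set|flag|obj|list|map|iterator|string|i|j)\d+\b")

def machine_style_names(source):
    """Return the distinct identifiers that look machine-assigned, sorted."""
    return sorted(set(MACHINE_NAME.findall(source)))

# Hypothetical Java-like sample, not taken from either code base.
SAMPLE = """
boolean flag1 = false;
Set set1 = new HashSet();
Iterator iterator1 = set1.iterator();
int depth = 0;
"""

print(machine_style_names(SAMPLE))  # ['flag1', 'iterator1', 'set1']
```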
I couldn’t find anything in the Harmony-project SVN which made me think this came from there.
Not in incubator either.
So the license at the top is a mistake, at least.
Maybe someone ran a tool to place such a license at the top of files which didn’t have anything yet…?
The code from Google and Oracle/Sun looks very similar; there is definitely a common origin. The question, obviously, is what that origin is. It could have been generated based on some specification for the X.509 certificates that this deals with.
But the creator (or person that imported the source into the Android git repository) should be able to point out where it came from.
Or completely the other way around: there might be a test suite which tried to use this API, and they created a similar API because it was needed to pass some tests.
Edited 2010-10-28 23:35 UTC
Maybe the API-documentation was the source for Android from which they derived the function list and the source for the API-documentation was the source from Oracle/Sun.
Maybe that is allowed by the copyright law, I don’t know.
Although, I have to say, I can’t find an API document that proves this.
Edited 2010-10-28 23:44 UTC
Why would it have to be decompiled? It’s also present in OpenJDK.
Replying to myself now that I understand the matter a little more clearly.
The code in question is not from Harmony, it’s from OpenJDK. That’s what I knew. For some reason, I thought that the parent was unaware that Sun Java is OpenJDK (minus a few parts no one really cares that much about). Many people on many discussion boards have made that mistake, so that’s what I assumed the parent was implying.
However, in my rush and with my caffeine-deprived mind, I neglected to imagine the more likely case: Google needed the code, but didn’t want to copy the GPL’d OpenJDK directly into their BSD code, as that would not work license-wise. So they tried to clean-room it by decompiling the binary (my guess, anyway). That makes sense to me. Someone did the same thing and came up with the exact same code as Google has in Android (very similar, with a few odd quirks).
Now does this either condemn Google or Exonerate it? I have no idea. I am not a lawyer, nor a psychic.
I love one of the comments there
“Well, I guess that’s what happens when you sponsor a ‘Summer of Code…'”
It really doesn’t matter what someone thinks. The lineage of that code will need to be established, it will depend on how that code is licensed and if it was a Sun/Oracle employee that dropped that code in……..
And yet, reading Oracle’s code once, I can write the code that is very similar to the one found in Harmony. That, btw, would not constitute a copyright infringement. (Since it follows common industry conventions and has to conform to a published specification)
But then again, I am a Java dev for 10 years now. And Java has common naming conventions.
EDIT: Interesting point, the code is in the repository, but it’s a test…
Edited 2010-10-29 23:11 UTC
In how many ways can you declare 10 variables? The answer is 10! = 3,628,800, or about 3.6 million ways.
The probability that the Google programmer chose exactly the same ordering as Oracle would be about 1 in 3.6 million. Highly unlikely.
Chances are more than 99.9999% the code is a copy.
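The arithmetic is easy to check. This is a small sketch, assuming (as the comment does) that every ordering of the 10 declarations is equally likely, which is a strong assumption about how programmers actually write code.

```python
import math

n_declarations = 10

# Number of distinct orderings of 10 declarations: 10!
orderings = math.factorial(n_declarations)

# Chance of matching one specific ordering IF all orderings are equally likely.
p_same_order = 1 / orderings

print(orderings)                            # 3628800
print(f"{p_same_order:.2e}")                # 2.76e-07
print(f"{(1 - p_same_order) * 100:.5f}%")   # 99.99997%
```

So the figure is roughly 1 in 3.6 million, and the complement is about 99.99997%. That number is only the chance of not matching by luck under the uniform model; it says nothing about orderings dictated by a specification or by convention.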
That is assuming that the probability distribution is uniform. I’m not sure that this assumption holds when considering the human brain.
I strongly disagree. I read through the sample so-called “copied code” and it’s a lot less similar than code SCO claimed was ‘copied’ into Linux, and where it is similar it’s pretty often due to implementing a known interface.
If you tell ten developers to write a quicksort function in C (for example) you’ll get ten results which, while not identical, will be very similar and will be more similar to each other than these examples, right down to identical variable declaration order. Even if there are a billion ways you *could* do the same thing, chances are culture, tradition and best practices will reduce that set down to a relatively small handful of similar implementations.
Oracle is building a case that will be tried by a jury, all of whom will almost certainly have no idea what Java is.
Ultimately, this type of clear duplication of code will be very persuasive in support of Oracle’s case.
Also, this argument that Google is not responsible because they used Apache code is just silly. The reality is that using open source does not absolve you of copyright or patent laws.
I think you’re missing the point mate. It’s not that Google is somehow less guilty because they’re using Open Source… It’s that they would not be the ones who breached the copyright but rather Apache foundation might be.
In the event of a guilty verdict it doesn’t change much, I figure, as whether the module gets removed from Harmony or from Google directly doesn’t really change that it would be unavailable, I suppose. IANAL (Duh!) but I’m reasonably confident there would be a difference in financial and business repercussions, though.
If the code is a copy, and not a re-implementation, then compensation would come down to damages and culpability.
If it was the Harmony project who copied code (rather than re-implemented it), then Google can’t be held to be culpable.
It might boil down to a simple cure for Google to re-write bits of Android’s code.
You are, in fact, correct that the simple remedy is re-implementing the code in a clean room context.
You are not correct that Google is not culpable. If I publish an encyclopedia from some content I found on-line marked as ‘public domain’ and it is later found to be the property of another party I would, at a minimum, have to pay damages to the other party equal to some percentage of revenue I might have derived from that content.
All that said, this is just a side show for Oracle to show that Google is being disingenuous. A side show to the main event if you will.
And since Google does not sell the code in question, how can the damages be calculated?
Probably removed from a dark, dark place.
No, copyright law does not work like that. If you copy someone’s copy you are culpable. Otherwise, everyone would circumvent copyright through a shell entity that did not have the resources to pay remedies.
While not commenting in any way on the allegations, I will say that this is really hard to track in practice.
How would you know if it was a copy unless you saw the original?
I can say from experience working on these kinds of projects, I often go out of my way never to see the original.
Perhaps automated mechanisms might be put in place if you were doing a really high-profile project like Harmony. Apache might have set up some diff against the JDK to double-check that no one brought in suspicious code (not nearly as simple to do as it sounds…); but was the JDK OSS when they started Harmony, though? I don’t think it was…
At the end of the day, it is just hard to know if the code you are getting has been taken from some other code, somewhere else you don’t know about.
Short version is that it is hard enough to know all your own code; knowing all of everyone else’s code is plain impossible.
Edited 2010-10-28 21:16 UTC
You can easily do the diff after the code has been released.
Or use a bytecode decompiler, as seems to be the case here. Actually, this might be bad publicity for Java, as releasing Java “binaries” is almost equivalent to releasing the source code (you often hear this used as an argument in favor of Java against Python, JS and others, where the source is often zipped).
I should add, in my view the difficulty involved doesn’t get Apache off the hook.
Once made aware, if it is true, they should correct it. The point is that even in good faith such a thing could easily happen.
The only person to blame morally (again, if this is true) is the person that knowingly took someone else’s code.
Maybe they googled it…
Like this, you mean?
http://www.google.com/codesearch?hl=en&lr=&q=%22PolicyNodeImpl+…
Edited 2010-10-28 23:26 UTC
…copies feature for feature, and line for line. I know this site ardently defends Android, but sooner or later people have to take their heads out of the sand.
You didn’t read. Android didn’t copy it. Harmony did (if true). Google only copied it from Harmony, probably assuming the code was clean because it’s Apache’s. We might want to wait for a response from Apache.
Without checking the code, did it occur to anyone that it might have been the other way around, with the JDK copying the code from Harmony?
However, *with* looking at the code, it’s obvious what way copying (or again, decompiling) happened.
That is the wrong (strawman) argument now, focus on other matters Thom presented.
This issue isn’t really worth discussing until someone spends some time (20 minutes?) with Harmony VCS and tracks this down.
Edited 2010-10-28 21:34 UTC
If only Google had access to some sort of search device that searched through open-source projects. A “Google Codesearch,” if you will.
But who could invent such a crazy moon gadget?
Edited 2010-10-29 17:43 UTC
I don’t defend Android. I defend the right of coders to use the right tool for the job. Software patents are stupid. Patenting concepts is as stupid as snorting uranium or cranial intrusion. What’s the point of charging someone for using… say… sorting algorithms?
“Method for sorting data according to explicit numeric value”.
There… No more sorting algorithms. If you want to sort stuff… well… too f–king bad. Pay up or shut up. Because that’s what it is about.
And yes, I know sorting algorithms aren’t patented, but other algorithms are. Stuff like image and video encoding comes to mind. How about patenting some DSP algorithms or FFT? Why not? There’s money to be made! f–k the industry, f–k technology, f–k science. This is capitalism, baby!
You seem to be taking a stand for software patents and how “Google is clearly ripping off Apple/Microsoft because OMG Android UI is so f–king similar to iOS or Windows Mobile and they must be sued because of it hurd durr”.
Yet I bet you’re posting this comment on an OS which has a GUI. GUIs were invented by Xerox. Apple ripped off Xerox when they first developed MacOS System ? (I dunno)… and MS ripped off Apple. Where would we be now if they hadn’t done so? Snail much?
Why are MS and Apple the only ones entitled to rip people off? Are they God-made corporations?
Addendum: I find hilarious that neither MS nor Apple even dare to look funny at IBM. You know why? Because IBM is the Big Daddy of computing patents. Every single one of their products infringes IBM patents. Yet Big Blue doesn’t do shit about it. Know why? Because they are old enough to know how to adapt their business model to the market. They don’t try to coerce the market into doing their bidding. They play ball. And that’s the way it should be.
That is all.
Edited 2010-10-29 05:43 UTC
Except that in this case we’re talking about copyright infringement, which any open source aficionado such as myself will tell you is always a Bad Thing (TM).
Basically, someone needed an implementation of this class and decided to just use automated tools to extract (read: steal) source code from one of the official Java implementations. And the evidence is clear that this is what happened.
Who really did it is the question we need to ask.
Copyrights and patents are not capitalism. They are protectionism, corporatism, fascism, or, if you have to use the word capitalism, it’s crony capitalism.
True capitalism would not allow for one company to have a monopoly over another through coercion of the state.
True capitalism is a beautiful thing. The word is just misused today, the U.S. hasn’t seen true capitalism in a long long time if ever.
Sorry for the rant, I just get tired of people calling corporatism capitalism. They are not even close. The customer is king with capitalism. It’s completely opposite with corporatism.
…it was a Java (Sun/Oracle) employee who dropped that into Harmony.
And this is why I never follow coding conventions. J/K!
lol, that’s an excuse I haven’t heard before.
What if Harmony was the originator, and Oracle copied it from them?
Either way, I’d get a chuckle.
I am shocked. Oracle is still suing Google because of Android.
Jonathan Schwartz congratulated Google in 2007:
http://blogs.sun.com/jonathan/entry/congratulations_google
and now Oracle sues them.
Oracle has itself pushed away OpenSolaris; they will lose OpenOffice.org (it’s now LibreOffice), they will lose MySQL (it’s now MariaDB) and they will lose Java.
Oracle is going the way the SCO Group went before. And they will end up at the same point.
Sadly, they are smashing such nice products against the wall.
Edited 2010-10-28 22:14 UTC
Oracle is suing Google.
Neither Sun, nor Jonathan Schwartz, are suing Google.
Jonathan Schwartz has left Sun, which has been swallowed by Oracle.
So Jonathan Schwartz congratulated Google on Android. The Oracle decisionmakers couldn’t care less.
Edited 2010-10-29 16:45 UTC
The original file from Oracle (Sun) states it’s licensed under the terms of GPLv2 (even with the Classpath exception). So what’s the problem? Does Google violate the GPL by licensing Dalvik under an incompatible license? So Oracle actually defends the GPL and hence free/open source software? Then Oracle would be the good guy here.
Can anyone tell me what’s going on, please?
Sun released Java under the GPL (as OpenJDK). But Oracle’s lawyers are apparently unaware of this.
There are two issues.
One is that the file is re-published under a different license. (Remember the discussion where BSD code was found in Linux kernel code? It would have been OK to copy the code, even relicense it under the GPL, if only they had attributed it to the original author. But an employee wanted to take the credit.)
The second one is with patents (the subject of the other lawsuit). Even if you publish your code under the GPL, there are some limitations due to patents (this is why GPLv3 changed those provisions).
Here is the GPL version:
http://cvs.savannah.gnu.org/viewvc/classpath/gnu/java/security/x509…
Note the variable and method order in all 3 files.
The file in question is not from the GPL implementation.
Correction: the file is not from Classpath. The OP is referring to Sun’s version of this class which is also GPL licensed. The Classpath file is from 2004, Sun’s variant from 2006.
They aren’t talking about GNU Classpath, they are talking about OpenJDK, which is not the same product.
Dalvik is licensed under the Apache License 2.0, which is not compatible with GPLv2.
Edited 2010-10-28 23:14 UTC
Can someone please explain how this could possibly be a copyright violation if Java was released by Sun under the GPL? The only way I could see that it could be a copyright violation would be if Android and/or Harmony are not released under licenses compatible with the GPL. I’m missing something here, I guess.
I wonder the same thing. And yes, dalvik is Apache License (BSD like) and not GPL, so this is a breach of the GPL. So Oracle is actually an advocate for the GPL and this is a GPL violation!?
If it was truly Oracle’s code, then the copyright was removed, and the license changed, which is a HUGE no-no under copyright law… In that case, Oracle has every right to sue.
However, since the code was already FOSS, seems to me that Oracle is simply looking for more ammunition to hammer Google with, and this just happened to be one of the items they were able to locate in their search.
It is Oracle’s code, but it is also released under the GPL.
GPL allows for de-compiling and for re-implementation. In fact, it encourages it.
http://www.gnu.org/philosophy/free-sw.html
De-compiling the program comes under freedom 1. Every recipient of GPL software is granted unconditional permission (under freedom 1) to study the code.
De-compilation, yes. But redistribution of your decompilation under a different licence? Of course not.
The GPL allows anyone to study the code, and to use it internally. Only redistribution of the GPL code invokes the conditions within the GPL; permission to do anything else is granted unconditionally.
I would presume this means that it is perfectly permissible to de-compile GPL code, and then re-implement it so that the re-implementation is not a copy of the original.
Names of API entry points and the like are not protectable under copyright law, for reasons of interoperability.
You could end up with very similar-looking code that was not a violation of copyright, given the provisions of the GPL license.
Edited 2010-10-29 02:26 UTC
Harmony is under Apache 2.0, which is not GPL-compatible. But the GPL is strong exactly because of copyright, not in spite of it.
The real world and the law are far messier than most computer geeks realize. On the other hand, Oracle will eventually make all these issues go away for a few hundred (million) bucks.
… a good thing for Google to do would be to drop Java, creating a (slightly) different language that still targets their JVM? That would be an opportunity for a simpler syntax. Oracle can’t sue over code resemblance… or can they? Just see how much Java looks like C++.
Google would create a converter, that would take Java code, output Google’s X language code. C++ to Java and Java to C++ converters exist. So that shouldn’t pose any problems either.
They would provide said language X in the SDK IDE if there’s an IDE in the SDK and ask developers to switch languages and it’ll all be done. If Apple could convert devs to Objective-C, Google too can succeed in doing that.
Not saying the lawsuit would suddenly stop, but it’s a way out of this morass that Google should not have put themselves in by creating a different VM, especially with OpenJDK available.
In the end, I’m just wondering why and how Google felt the need for Dalvik… weren’t they calling for trouble?
drop Java? All those lovely apps need to be rewritten…. Devs would simply swap platforms (blackberry, webOS, W7p)
The apps would need to be converted, hence the converter I hinted at. Run a batch program over your code tree, wait a few seconds/minutes and you’re done. No fiddling with the code, no algorithm change, no manual editing required. If the conversion is seamless, it won’t be a thorn in any developer’s foot even if they’ll have to learn that new language. It’s not like language cheatsheets are a rare thing.
I’m talking about idempotent conversions such as the C++ syntax proposed in http://www.csse.monash.edu.au/~damian/papers/HTML/ModestProposal.ht…, where the syntax is so different that no C++ compiler would accept it.
Google can do something of that kind for Java syntax. If they are not officially supporting Java but rather that differently-branded resyntax, what weight would Oracle’s claims have? Not much, I think: a different VM is targeted, the byte code is different and the language would be different because of a different syntax. As I’m writing this, .NET and C# come to mind… How different is C# from C++ and Java? I don’t know, as I’ve never written anything for .NET, whether in C# or another .NET-suitable language, but I guess not much more than what I was proposing.
What languages are used to write apps on those other platforms? Is it universally Java? If not, then moving to another platform because of the Android app language changing would be hard to understand/justify, all the more if Google offers the tools to make the change seamless. Which is not an easy task (and I can only guess) because 1- third-party libraries also need conversion and 2- the pace the smartphone market and technologies are evolving may be too fast to not be a serious hurdle.
To sum up, I don’t see a problem in “cross-compiling” java source code to another syntax that’s semantically equivalent and I don’t see why Google didn’t just go with OpenJDK which, from what I’ve read so far, is by nature immune to the current lawsuit. What were they thinking using the Java syntax but producing byte code that’s incompatible with any existing JVM? That was a bad decision the consequences of which are now biting them in the rear end because if I am not mistaken, doing so is forbidden by some license, agreement, terms of usage, etc. somewhere. Isn’t it?
Experienced C++ and Java programmers can pick up C# and though your idea of a mass conversion is interesting and technically possible (with tweaking) you would run into a human problem. A lot of Java programmers have supported Android because it uses Java. Many of them are very anti MS/.NET and would rather drink paint than learn C#.
Their intention was to discourage apps from being ported to other platforms. It was also done during a time when Sun was obsessed with looking open source friendly so they were not worried about a lawsuit.
What they should do is make a deal with Oracle and add in Java ME support. Let developers decide which one is better.
Edited 2010-10-29 21:41 UTC
I don’t think the Oracle board is stupid.
If they think they have a case with merit, they sue Google. A company with more money than it knows what to do with. A little bit of bad press, but everyone sues Google.
Apache is a different kettle of fish. Suing them is, in essence, suing ‘open source’. It doesn’t take a genius to see that that would be a PR disaster. Once they win the case, they would simply send a letter to Apache saying there was precedent and let them remove the code without going to court.
People seem pretty confused over this including me. Here’s what I understand, correct me if I’m wrong.
Harmony was created as a Java SE implementation, validated by the Java TCK, which only allows for desktop implementations of Java. Oracle, formerly Sun, does not allow implementations for mobile devices to be validated by the Java TCK. So Google did an end run for Dalvik by using a validated open source implementation as its base, namely Harmony. Now, Google doesn’t claim that Dalvik is Java, and it isn’t, since it only includes a subset of the specification. But since the code was originally a part of project Harmony, which was validated by the Java TCK, Oracle cannot sue Harmony, because they didn’t violate the license, while Google did with its mobile implementation, sort of. It really isn’t that clear. I believe the whole code theft thing is really a red herring considering Java is open source (or am I missing something?). The real issue is whether or not it can be transferred to Harmony and then on to Google’s Dalvik in such a fashion as it has been without violating the license.
Can anyone clarify?
I can’t really clarify but Java is not open source, at least not in the sense of Firefox or TeX or Linux or anything that has its roots in the open source movement, whether as seen by the FSF or predating it (TeX -and other works of Knuth?- for instance).
Second, open source doesn’t equate “take it and do whatever you want with it”; open source comes with a license, with agreement terms, conditions, etc. Those terms have to be explicitly licensed under a “this is public domain, I did it for all to own” license with no restriction to be truly free of any constraint. In short, all licenses have license terms. That’s including a public domain implicit license term that you may not sue others for having the same benefit that you have.
The problem here seems to be that Google violates some of the terms.
Edited 2010-10-29 12:52 UTC
For one thing, code theft is still very much an issue with open source software. Open source software still has a software license that must be abided, and in this case, open source Java is GPL, which is one of the more restrictive open source licenses (technically we call it “free software”).
Furthermore it is possible that this version of the code was lifted from a non-GPL implementation of Java. Even if it was the same code that was GPLed later, the terms of the decompiled copy still apply.
No, because this code was not transferred in the typical way for source code. Instead it was generated from the machine code (well, Java byte code), which bypasses the need to have the source code to make a modified copy. Certainly if this class had been under a compatible license, it would not have been decompiled at all. Instead this is an attempt to get around whatever license applied at the time using freshman-level CS plagiarism.
I’ve contacted the Apache Software Foundation. Hopefully they can offer some clarification in this matter, which is sorely needed.
Nice one. I’m not interested in people comparing diffs and lines of code because it tells us nothing. What’s more interesting is where this code came from and how it is licensed.
I’ve had some interesting exchanges. They’re currently investigating the claims, and will get back to me.
See https://blogs.apache.org/foundation/entry/read_beyond_the_headers
and released Geronimo?
http://www.theserverside.com/news/thread.tss?thread_id=22337#101301
Evidence is in Apache Incubator and Geronimo mail archives of that time. No doubt about it.
Here:
http://marc.info/?l=geronimo-dev&m=106875482022176&w=2
So this Oracle crap is just for blackmailing google to support Java ME and/or pay licensing fees. I hope Google defends successfully in court, they are rich enough to push Android through the legal minefield set up by the competition.
Also a _lot_ of java is GPLed, which the evil Oracle doesn’t seem to respect in this case.
See https://blogs.apache.org/foundation/entry/read_beyond_the_headers
If Google GPL’d Android, would this solve the problem? Because as I see it, the problem is being presented as Google having taken GPL’d code and put it under an Apache license.
Is that correct?