“Just over two months ago, Chrome sponsored the Pwnium browser hacking competition. We had two fantastic submissions, and successfully blocked both exploits within 24 hours of their unveiling. Today, we’d like to offer an inside look into the exploit submitted by Pinkie Pie.” A work of pure art, this. Also, this is not the same person as the other PinkiePie. Also also, you didn’t think I’d let a story with a headline like this go by unnoticed, did you?
Red paint, girlscout, etc?
More seriously, the GPU is *again* the weak link. That is cause for concern for the security of modern browsers: is it manageable when they have so much code touching so many pieces of hardware and software?
Actually C and C++ are the weakest links, not the GPU, as the exploits take advantage of the pointer tricks so dear to C and C++ developers.
If ComputeMaxResults() had been written in a saner language, this exploit wouldn’t have been possible, short of someone rewriting it in assembly.
Did you actually read the functions? It is a calculation logic error. There is no language alive that prevents logic errors. The logic error results in an invalid buffer access for a GPU-related task. No “sane” language has yet been extended to use GPUs in a way that does not rely on creating buffers directly at some point in its execution.
You do understand that if a managed language were required to access the GPU, it would also need to do manual memory management under the covers, don’t you?
Yes, I’ve read the functions. ComputeMaxResults() and ComputeSize() are the standard way to manipulate blocks of memory/arrays in C, and they are related to the way arrays decay into pointers.
Safe programming languages != GC != Managed.
Ada, Modula-2, Delphi, Turbo Pascal are safe programming languages with manual memory management, compiling nicely to native code as well, just as an example.
And do they prevent you from making the LOGIC ERROR that was explained about the functions?
YES! Because the LOGIC ERROR is about a MEMORY ACCESS ALGORITHM known to ANY C PROGRAMMER.
Really.
static uint32 ComputeMaxResults(size_t size_of_buffer) {
  return (size_of_buffer - sizeof(uint32)) / sizeof(T);
}
So Delphi, say, would prevent someone from making a mistake in a subtraction followed by a division, knowing what the calculation would be used for, would it?
size_of_buffer is an integer.
So is sizeof(uint32).
So is sizeof(T).
Are you seriously telling me there are programming languages out there that would actually tell the programmer “hey, did you know size_of_buffer you passed in was smaller than the size of uint32, and I checked every usage of ComputeMaxResults and I’ve noticed these unsafe uses?”
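For anyone following along at home, here is a minimal, self-contained sketch of what actually goes wrong (the template wrapper and main() are mine, only the formula comes from the quoted code): when size_of_buffer is smaller than sizeof(uint32), the unsigned subtraction wraps around, so the computed “maximum number of results” becomes enormous instead of zero.

#include <cstddef>
#include <cstdint>
#include <iostream>

// Stand-in for the quoted helper; T is the per-result element type.
template <typename T>
static uint32_t ComputeMaxResults(size_t size_of_buffer) {
  return static_cast<uint32_t>((size_of_buffer - sizeof(uint32_t)) / sizeof(T));
}

int main() {
  // A 2-byte buffer is smaller than sizeof(uint32_t), so the size_t
  // subtraction wraps to 0xFFFFFFFFFFFFFFFE on a 64-bit platform...
  std::cout << ComputeMaxResults<uint32_t>(2) << "\n";  // ...and this prints 4294967295
}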
Yes, because if you really cared to read everything, you would see that the outcome of such functions is used for buffer manipulation tricks.
A safer language would raise a runtime error when such a situation is detected.
The logic error, as you call it, is only needed because they have to calculate specific values for pointer math. Without pointer math there is no need for logic errors that turn into buffer exploits.
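For illustration only (this is my sketch, not the actual Chrome patch), a guarded variant in plain C++ shows what that check would look like: validate the precondition before the unsigned subtraction ever gets a chance to wrap.

#include <cstddef>
#include <cstdint>

// Hypothetical guarded variant; the early return stands in for the runtime
// error a bounds-checked language would raise when the buffer is too small.
template <typename T>
static uint32_t GuardedComputeMaxResults(size_t size_of_buffer) {
  if (size_of_buffer < sizeof(uint32_t))
    return 0;  // too small to even hold the result count: report zero results
  return static_cast<uint32_t>((size_of_buffer - sizeof(uint32_t)) / sizeof(T));
}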
Let’s forget for a moment that C++ has both the STL and Boost, which demonstrate how to use C++ without needing pointer math, as if they themselves couldn’t be used…
Really, so how would a language like Ada, which you mentioned, handle buffers (arrays) without “pointer math”? And, pray tell, how do you propose a “safe” language like Ada communicate with a GPU without passing it raw buffer instructions?
You’re seriously trying to tell me that a “safe” language can read a programmer’s mind and work out how to translate its memory model into something the GPU knows?
Here’s a hint, any “safe” language requiring such functionality will need to have it written for it, which does nothing to prevent a similar bug from being introduced in that manner.
My Quantum Dot! What kind of people do they churn out of CS courses these days?
Library != Language
With normal indexes, coupled with bounds-checked access.
Same to you, end of conversation. Bye.
Newsflash: C++ was designed in such a way that libraries were supposed to do most of the heavy lifting. This comes from Bjarne Stroustrup’s book and many of his writings. For C++, the library is a major part of the language.
Of course. Ada can magically write things to memory without pointer math under the covers. Pointer math that, incidentally, gets written by compiler writers.
More pointed is how you managed to evade the question of how a language is supposed to interface with a GPU, as GPUs are currently designed, without manual manipulation of buffers in their rawest form.
Love it or hate it, I consider anyone who comes out of a CS degree not understanding the design principles of a language like C++ to be deficient. Like it or not, for C++, library == language – INTENTIONALLY.
You have no business claiming to understand software development if you don’t bother understanding the full extent of the tools you use.
Long crazy argument summary:
If you use a safe-er language, you’d be safe-er. Or if you used your existing language safe-er, you’d also be safe-er. But, you can never be completely safe, due to the trust issue:
http://cm.bell-labs.com/who/ken/trust.html
Ahh, but it’s “You can’t trust code that you did not totally create yourself.” (from the article, emphasis mine)
Since the issue also encompasses microcode (and potentially sabotaged chips in general)… http://members.iinet.net.au/~daveb/simplex/ringhome.html
Yes, the resulting machine would be… limited. But that’s OK: since it would also force the software to stay basic, it would require a mindset of frugality when creating, say, a Contiki-like stack, making the task possible for one human.
I don’t know about Delphi specifically, but for most of the languages that are considered “safe” in this respect, you’re not allowed to pass around raw blocks of memory without any type safety/control. As a consequence, the language’s runtime always knows and tracks the size of any currently allocated arrays. Given this knowledge, the runtime effectively converts any instance of array[x] to something analogous to:
if (x < 0 || x >= arraySize)
    raise_runtime_error();
else
    return array[x];
Obviously this incurs some overhead compared to the raw C version, but it does indeed prevent these kinds of problems. So yes, it really is possible to handle in a safe/high level language, if the logic error in question is confined to the level of the language. However, if the logic error is actually in the instructions passed to the GPU itself, and the latter is able to access arbitrary areas of memory, then that’s a different story.
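For what it’s worth, C++ itself can opt into the same behaviour. Here is a small self-contained sketch (mine, purely illustrative) using std::vector::at(), which performs the kind of bounds test described above and throws instead of reading out of bounds:

#include <iostream>
#include <stdexcept>
#include <vector>

int main() {
  std::vector<int> buffer(4, 0);
  try {
    // operator[] would be unchecked; at() does the bounds test and throws on failure.
    int value = buffer.at(10);
    std::cout << value << "\n";
  } catch (const std::out_of_range& e) {
    std::cout << "runtime error: " << e.what() << "\n";
  }
}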
Yes, it is a different story. That is the whole context of this discussion. This SPECIFIC problem with raw GPU access won’t be fixed by using a different language.
But on a more general note: I mentioned before the C++ libraries which provide automatic memory management and which ARE intended to be considered part of the language. I highly recommend using STL and Boost containers when programming in C++, especially now that compilers have started updating their STL implementations to use move semantics, which makes them an order of magnitude faster.
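As a rough sketch of what that looks like in practice (my example, not code from Chrome), a result buffer owned by a container needs no manual size bookkeeping or hand-rolled pointer math, and is moved rather than deep-copied when returned:

#include <cstddef>
#include <cstdint>
#include <vector>

// Sketch only: the container tracks its own size, so there is no
// (size - header) / element_size arithmetic to get wrong.
struct QueryResults {
  uint32_t count = 0;
  std::vector<uint32_t> values;
};

QueryResults MakeResults(size_t n) {
  QueryResults r;
  r.values.resize(n);                                  // bounds known to the vector
  r.count = static_cast<uint32_t>(r.values.size());
  return r;                                            // moved (or elided), never deep-copied
}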
**ALL** languages talk to the hardware the same way!
Memory pointers are a hardware feature. They just happen to be ‘exposed’ in C/C++. I can use assembly to fool a program written in any language into believing I am friendly and of the same breed, and interject myself using hardware features.
Security like you are speaking of would require a massive hardware change where memory is addressed in locked, relative-memory segments. That is, every process could create restricted areas of memory which can only be accessed by a compile-time-validated code path, something a few steps beyond NX bits.
Even then, you would only need to find a way to inject yourself into that code path by altering the binary… but that would be easier to catch and prevent. You would then need to rely on OS/hardware bugs… but you would still be able to get ‘in’ in some manner… nothing is safe short of not permitting execution at all…
Which language was used simply doesn’t mean squat. A program written in Delphi still boils down to approximately the same code as one written in C. I’m still using movl %eax, %ecx, call, jmp, etc…
–The loon
You say there is no known language where this calculation would return the right result? Obviously you don’t know Python or Ruby. These languages have variable-length integers, which means you never have an integer overflow/underflow.
Yes, the result is then a negative number. But given the definition of the function and its parameters, the result is “correct”. And in Ruby/Python you don’t have any buffers through which you can access arbitrary memory anyway.
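To make that concrete in the thread’s own language (my sketch, not Chrome code): widening the arithmetic to a signed 64-bit type in C++ preserves the plain negative intermediate value that Python and Ruby give for free, so the “buffer too small” case is trivially detectable before it ever becomes a bogus count.

#include <cstddef>
#include <cstdint>
#include <iostream>

// Hypothetical helper: returns -1 when the buffer cannot even hold the
// uint32 result count, mirroring the negative value a Python version would see.
int64_t MaxResultsOrNegative(size_t size_of_buffer, size_t element_size) {
  int64_t usable = static_cast<int64_t>(size_of_buffer) -
                   static_cast<int64_t>(sizeof(uint32_t));
  if (usable < 0) return -1;  // invalid buffer, caller must reject it
  return usable / static_cast<int64_t>(element_size);
}

int main() {
  std::cout << MaxResultsOrNegative(2, 4) << "\n";   // -1: buffer too small
  std::cout << MaxResultsOrNegative(20, 4) << "\n";  // 4: (20 - 4) / 4
}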
I’ve programmed in Python. I love Python. How would you suggest Python be able to directly instruct the GPU? I’ll give you a hint: you write the extension in C.
This is the cause of my earlier lament. People like you treat it as though languages like Python and Ruby magically spawn out of nowhere without having anything to do with C.
But given the PURPOSE of the function, the “correct” answer is wrong. And you’ll end up with the same problem of incorrectly addressing the buffer’s contents.
So it is a LOGIC error. The formula is WRONG. Languages cannot fix wrong formulae, which is the heart of the problem with the function.
Unless you write an extension in C, which you pretty much have to do if you want it to talk to the GPU.
This is why one of the earlier commenters was right. This is about the GPU. Not the language.
And it’s only a start: when I read about Firefox’s developers working on WebGL, I immediately thought: this feature has a lot of potential security issues…
Maybe not the same guys, but they obviously got the inspiration from the same place (see http://arstechnica.com/business/2012/03/googles-chrome-browser-on-f… ). Which is all right, because Pinkie Pie is awesome.
If the phones and game consoles I bought came with root access I’d be against exploits. But as it stands, exploits are what allows me to root these things.