This looks really strange, but it seems there’s now a project to convert x86-64 processors into working as 32-bit! The author of this post argues that programs on x86-64 claim too much memory and could work more efficiently by going back to 32-bit.
Let’s just convert it all back to 8-bit. Imagine the gains!
Nah – 12 bits rule ok.
WTF?
The article is talking about using a different data model (ILP32 instead of LP64).
http://en.wikipedia.org/wiki/LP64#Specific_C-language_data_models
Basically, all 64-bit operating systems use 64-bit pointers, which, as the author notes, is usually unnecessary: 32-bit pointers would suffice for most programs.
But the x86_64 architecture has design improvements over the older x86_32, which he lists. He wants those benefits without the added memory consumption.
I, myself, use many programs that need more than 4 GB of memory, so I will not be using/testing this version of Linux.
Edit:
Silly me, I forgot to explain why it’s confusing!
It’s confusing because, of course, you can already run x86_32 programs on x86_64 CPUs. However, you then have to live without the good features of x86_64 (the increased registers and whatnot). So the summary “convert x86-64 processors into working as 32-bit!” is confusing: they already work with 32-bit programs and operating systems.
Edited 2011-01-20 20:19 UTC
Yeah, the headlines are a bit confusing there, but the article is clear enough. It’s basically about getting the benefit of the x86_64 additions to x86, without the added overhead of 64-bit pointers for programs that don’t need them.
Edited 2011-01-20 20:34 UTC
This is just an excuse to write non-portable software. If you can’t deal properly with other data models, you should fix your software.
All the world is not an i386.
I was under the impression that this is actually modifying the compiler to support it.
Use relative addressing.
Look at quite a few non-x86 64-bit systems and their binaries… quite a lot of them run most of the userland as 32-bit, because the architecture doesn’t stick 32-bit code with limitations the way x86-64 normally does.
(In fact, on 64-bit SPARCs, there’s a mode much like this – the 64-bit extensions, but in 32-bit mode – called sparcv8plus. sparcv8 being the 32-bit SPARC, sparcv9 being 64-bit SPARC… so v8plus is v8 with all of the v9 updates.)
What I’d really like to see out of a project like this is some performance metrics.
I’ve always suspected that being forced to use 64-bit addresses when it’s completely unnecessary means needlessly passing twice as many bits around. Can this be done in a way that avoids that while still utilizing the additional registers provided by x86-64?
I hope to see some follow-up metrics from this project.
Not necessarily. If the bus and registers are at least 64 bits wide, there’s no loss from using 64-bit addresses. From what I recall from my courses, there is no gain in speed either from limiting addresses to 32 bits, as the components still operate at their nominal width.
Making your program use 64-bit pointers increases its memory footprint. I’m not sure by how much, but some people seem to be bothered by it. Increasing the memory footprint has the effect of decreasing cache efficiency; by that I mean that because of the less compact memory representation, cache misses increase, which increases the average memory latency. On a modern x86 processor, L1 hits are like 3 cycles, L2 hits are on the order of 40, and last-level misses are hundreds of cycles. These aggressive out-of-order architectures can often completely absorb L1 misses by continuing to execute instructions that don’t depend on an outstanding memory read, but last-level misses invariably lead to a lengthy stall. In a talk, one Sun engineer characterized modern OOO processors as a race between LLC misses, with the LLC misses being the dominating factor in runtime.
Regarding cache efficiency, here’s an experiment to try. Let’s say you have a lookup table, and the numbers are small enough to fit in 8 bits. You can use a char array to store it. But due to various overheads, x86 is more efficient at accessing 32-bit words than 8-bit words, so as long as the table is small enough, it’s actually faster to use ints. Now, enlarge that table to be a few times the size of the L1 cache. Now, the L1 misses start to dominate. Switching from 32-bit words to 8-bit words decreases the cache footprint and thereby speeds up your program (depending on how sensitive your algorithm is to L2 latency).
Switching from absolute 64-bit addressing to relative 32-bit addressing will decrease the cache pressure of your programs, resulting in a speed increase. (Albeit to a much lesser degree.)
I don’t think the memory footprint is increased much by using 64-bit pointers, except maybe in very large code bases (e.g. Qt, the Linux kernel…)
If you look at the Nokia Qt SDK, which is one of the biggest binary downloads I can think of right now…
http://www.forum.nokia.com/info/sw.nokia.com/id/e920da1a-5b18-42df-…
“Linux 32 : 591 MB
Linux 64 : 596 MB”
The 64-bit version of the SDK is five megabytes larger. Not such a big deal on a download weighing hundreds of MB. For smaller software, the difference will probably be barely noticeable, if noticeable at all.
Edited 2011-01-21 15:43 UTC
How much of that package is binary code, and how much is data files and other stuff?
I’ve chosen the Qt SDK because, contrary to, say, a game, it shouldn’t include huge media files weighing hundreds of MB.
Except help files, maybe?
Oh, and is that archive compressed? Compression is going to reduce much of the overhead.
I loved this: “Perhaps AMD didn’t evaluate this correctly, or perhaps its marketing side won over technical merit.” Isn’t it always the case?