The processor in the Game Boy Advance, the ARM7TDMI, has a weird characteristic where the carry flag is set to a “meaningless value” after a multiplication operation.
What this means is that software cannot and should not rely on the value of the carry flag after multiplication executes. It can be set to anything. Any value. 0, 1, a horse, whatever. This has been a source of memes in the emulator development community for a few years – people would frequently joke about how the implementation of the carry flag may as well be
cpu.flags.c = rand() & 1;
. And they had a point – the carry flag seemed to defy all patterns; nobody understood why it behaves the way it does. But the one thing we did know, was that the carry flag seemed to be deterministic. That is, under the same set of inputs to a multiply instruction, the flag would be set to the same value. This was big news, because it meant that understanding the carry flag could give us key insight into how this CPU performs multiplication.And just to get this out of the way, the carry flag’s behavior after multiplication isn’t an important detail to emulate at all. Software doesn’t rely on it. And if software did rely on it, then screw the developers who wrote that software. But the carry flag is a meme, and it’s a really tough puzzle, and that was motivation enough for me to give it a go. Little did I know it’d take 3 years of on and off work.
↫ bean machine
Please don’t make me understand any of this.
I only skimmed through the link. It’s such a detailed analysis. Good job 🙂
I was eyeing the use of signed multiplication…
In binary, representing negative numbers in two’s compliment usually takes care of any negative computations very naturally with no need to explicitly know or care that a number is negative. This can be a surprising yet ingenious byproduct of twos compliment arithmetic.
For example:
What do you think the output will be? Turns out that the operation is mathematically identical regardless of which sign you tell C to use.
Sign extension on the other hand does care about integers being signed or unsigned, however it makes me wonder if it might be optimized away better in the algorithm. I’ve implemented an arbitrary precision library and most of the time these types of cases could be optimized out. It’s just a silly problem and I haven’t really taken the time to go through the code in serious detail, but some of the quirks about the algorithm might be resolved with more refinement.
Quickly read the article, got bored and concluded the carry bit is just something that is used when adding the intermediate results of the “sub” multiplications.
The way you do when manually adding two large numbers: you put them on seperate lines and go from right to left adding just the two digits. Sometimes the result is more than 9 and you ‘carry’ the one on to the two digits to the left.
That was more or less obvious from the day one. That’s how carry is produced in all other instructions: it’s either result of addition or result of the use of barrel shifter.
The real trick was to understand what exactly do you need to add or shift to get that strange semi-random result.
It’s similar to how on 8086 rep prefix changes the sign of division result.
zde,
I find it interesting how much having this one bit of output lets us infer about the underlying silicon implementation by ruling out implementations that couldn’t produce it.