Solving the mystery of ARM7TDMI multiply carry flag

The processor in the Game Boy Advance, the ARM7TDMI, has a weird characteristic where the carry flag is set to a “meaningless value” after a multiplication operation.

What this means is that software cannot and should not rely on the value of the carry flag after multiplication executes. It can be set to anything. Any value. 0, 1, a horse, whatever. This has been a source of memes in the emulator development community for a few years – people would frequently joke about how the implementation of the carry flag may as well be cpu.flags.c = rand() & 1;. And they had a point – the carry flag seemed to defy all patterns; nobody understood why it behaves the way it does. But the one thing we did know, was that the carry flag seemed to be deterministic. That is, under the same set of inputs to a multiply instruction, the flag would be set to the same value. This was big news, because it meant that understanding the carry flag could give us key insight into how this CPU performs multiplication.
And just to get this out of the way, the carry flag’s behavior after multiplication isn’t an important detail to emulate at all. Software doesn’t rely on it. And if software did rely on it, then screw the developers who wrote that software. But the carry flag is a meme, and it’s a really tough puzzle, and that was motivation enough for me to give it a go. Little did I know it’d take 3 years of on and off work.
↫ bean machine

Please don’t make me understand any of this.

4 Comments

2024-10-23 9:57 am
Alfman verbose=1
I only skimmed through the link. It’s such a detailed analysis. Good job 🙂
I was eyeing the use of signed multiplication…
Handling Signed Multiplication
Turns out this algorithm doesn’t support signed multiplication yet either. To implement this, we need to take a closer look at the CSA.
…
In order to implement signed multiplication, we need to sign-extend all 3 of these numbers to the full 65 bits.
In binary, representing negative numbers in two’s compliment usually takes care of any negative computations very naturally with no need to explicitly know or care that a number is negative. This can be a surprising yet ingenious byproduct of twos compliment arithmetic.
For example:
int a1=-13;
int b1=17;
int c1=a1*b1;
printf(“signed mul: %d * %d = %d\n”,a1,b1,c1);
unsigned int a2=-13;
unsigned int b2=17;
unsigned int c2=a2*b2;
printf(“unsigned mul: %u * %u = %u\n”,a2,b2,c2);
printf(“cast mul: %d * %d = %d\n”,a2,b2,c2);
What do you think the output will be? Turns out that the operation is mathematically identical regardless of which sign you tell C to use.
signed mul: -13 * 17 = -221
unsigned mul: 4294967283 * 17 = 4294967075
cast mul: -13 * 17 = -221
Sign extension on the other hand does care about integers being signed or unsigned, however it makes me wonder if it might be optimized away better in the algorithm. I’ve implemented an arbitrary precision library and most of the time these types of cases could be optimized out. It’s just a silly problem and I haven’t really taken the time to go through the code in serious detail, but some of the quirks about the algorithm might be resolved with more refinement.
2024-10-24 2:49 am
Rrups
Quickly read the article, got bored and concluded the carry bit is just something that is used when adding the intermediate results of the “sub” multiplications.
The way you do when manually adding two large numbers: you put them on seperate lines and go from right to left adding just the two digits. Sometimes the result is more than 9 and you ‘carry’ the one on to the two digits to the left.

2024-10-25 7:06 am
zde
That was more or less obvious from the day one. That’s how carry is produced in all other instructions: it’s either result of addition or result of the use of barrel shifter.
The real trick was to understand what exactly do you need to add or shift to get that strange semi-random result.
It’s similar to how on 8086 rep prefix changes the sign of division result.

2024-10-25 7:44 am
Alfman verbose=1
zde,
The real trick was to understand what exactly do you need to add or shift to get that strange semi-random result.
I find it interesting how much having this one bit of output lets us infer about the underlying silicon implementation by ruling out implementations that couldn’t produce it.

Solving the mystery of ARM7TDMI multiply carry flag

About The Author

Thom Holwerda

4 Comments