CORDIC QMUL/QDIV for signed multiply/divide

TonyB_ · 2025-12-28 13:46

QMUL does a 32 x 32 multiply with 64-bit unsigned result. However, there is a simple method to convert this to a signed result (which I think has not been mentioned on the forum before) as follows: if one operand is negative then subtract the other operand from the high half of the result. If both negative subtract both. (The low half is same for unsigned and signed.)

In my view this method is better than multiplying absolute values and negating the result if only one was negative, especially for larger bit widths, e.g. 64 x 64. If the value that must be subtracted (the subtrahend) is calculated whilst waiting for the first GETQX/Y result then the conversion from unsigned to signed adds only two cycles for 32 x 32 and only four for 64 x 64.

A couple of code examples are below. Online unsigned/signed 64 x 64 multiply with user entry, showing subtraction if signed (can be used for 32 x 32 if low bits are zeroed):
https://www.cse.scu.edu/~dlewis/book3/tools/64x64Multiply.shtml

muls_32x32

'start 32x32 unsigned multiply

                qmul    a,b

'create 32-bit subtrahend in s
'needed only for signed result

                mov     s,b             wc
                testb   a,#31           wz
        if_nz   mov     s,#0
        if_c    add     s,a

'read product

                getqy   a
                getqx   b

'unsigned 64-bit result in {a,b}
'subtract s from a for signed result

        _ret_   sub     a,s

'operands
a               long    $BABE_FACE
b               long    $DEAD_BEEF

'subtrahend
s               long    0

{{
BABEFACE*DEADBEEF =
A2705BD6,37A70A52 unsigned
0903A219,37A70A52 signed
}}

muls_64x64
'
'64x64 multiply
'{a,b,c,d} = {a,b} * {c,d}
'
'      ab
'x     cd        getqy,getgx
'--------        variables
'      BD = b*d        c,d
'     AD  = a*d     b,c1   +
'     BC  = b*c    b2,c2   +
'    AC   = a*c  a,b3      +
'                ---------

'start four 32x32 unsigned multiplies

                qmul    b,d
                waitx   #4

                qmul    a,d
                waitx   #4

                qmul    b,c
                waitx   #4

                qmul    a,c

'create 64-bit subtrahend in {s,t}
'needed only for signed result

                mov     t,#0
                mov     s,#0
                testb   a,#31           wz
        if_z    mov     t,d
        if_z    mov     s,c
                testb   c,#31           wz
        if_z    add     t,b             wc
        if_z    addx    s,a

'read and sum four 64-bit partial products

                getqx   d
                getqy   c
                waitx   #2

                getqx   c1
                getqy   b
                add     c,c1            wc
                modz    _c              wz

                getqx   c2
                getqy   b2
                add     c,c2            wc
                addx    b,b2            wc

                getqx   b3
                getqy   a
                addx    a,#0
                modc    _z              wc
                addx    b,b3
                addx    a,#0

'unsigned 128-bit result in {a,b,c,d}
'subtract {s,t} from {a,b} for signed result

                sub     b,t             wc
        _ret_   subx    a,s

'operands
a               long    $FFFEFDFC
b               long    $FBFAF9F8
c               long    $F7F6F5F4
d               long    $F3F2F1F0

'subtrahend
s               long    0
t               long    0

'scratchpad
b2
b3              long    0
c1
c2              long    0

{{
FFFEFDFC,FBFAF9F8*F7F6F5F4,F3F2F1F0 =
F7F5FC0B,244878B4,213F4D4A,350CD080 unsigned
00000819,345A8CCC,213F4D4A,350CD080 signed
}}

EDIT:
32 x 32 subtrahend code now shorter and quicker.

mwroberts · 2025-12-28 14:39

Great tip!
This should be in the P2 PASM manual where they talk about signed/unsigned 32bit, 64 bit and 96 bit operations...

Wuerfel_21 · 2025-12-28 14:59

https://p2docs.github.io/idiom.html#signed-qmul
etc, etc

(Though it really lacks an explanation as to why it works like that...)

TonyB_ · 2025-12-28 16:40

Just saved one long and two cycles in the 32 x 32 subtrahend calculation.

Replace

                mov     s,#0
                testb   a,#31           wz
        if_z    mov     s,b     
                testb   b,#31           wz
        if_z    add     s,a

with

                mov     s,b             wc
                testb   a,#31           wz
        if_nz   mov     s,#0
        if_c    add     s,a

TonyB_ · 2025-12-28 17:16

@Wuerfel_21 said:
https://p2docs.github.io/idiom.html#signed-qmul
etc, etc

(Though it really lacks an explanation as to why it works like that...)

Why it works:

An N-bit negative number a in two's complement notation = 2^N - |a|. Multiplying by positive number b, the product = (2^N - |a|).b = 2^N.b - |a|.b, which is too large by 2^N.b therefore that must be subtracted to give the correct result. 2^N.b is b shifted left by N bits, i.e. into the high half of the result.

The case of two negative numbers is left as an exercise for the reader.

Wuerfel_21 · 2025-12-28 17:40

@TonyB_ said:

@Wuerfel_21 said:
https://p2docs.github.io/idiom.html#signed-qmul
etc, etc

(Though it really lacks an explanation as to why it works like that...)

Firstly, others should know your doc was different before I told you about the subtrahend method (probably could do with a better name to be honest). Why it works:

?? I don't remember ever changing it and git blame tells me that section is from the initial commit of the idioms file.
So people should not know that because it's not true?

An N-bit negative number a in two's complement notation = 2^N - |a|. If multiplying that by positive number b, the product = (2^N - |a|).b = 2^N.b - |a|.b, which is too large by 2^N.b therefore that must be subtracted to give the correct result. 2^N.b is b shifted left by N bits, i.e. into the high half of the result.

The case of two negative numbers is left as an exercise for the reader.

That's one way to put it.
The way I like to think of it is that you must imagine each of the 32 bit operands being sign-extended to 64 bit. The upper half is then either all zeroes in the positive case (does nothing), or all ones in the negative case, which if we factor that into the multiplication is equivalent to adding -1 times the other operand into the high half of the result.

Also note that you can move where the negation happens: a NEG is as cheap as a MOV and you can then ADD the adjustment instead of SUB-in it, which may in some cases be useful.

TonyB_ · 2025-12-28 19:35

@Wuerfel_21 said:

@TonyB_ said:

@Wuerfel_21 said:
https://p2docs.github.io/idiom.html#signed-qmul
etc, etc

(Though it really lacks an explanation as to why it works like that...)

Firstly, others should know your doc was different before I told you about the subtrahend method (probably could do with a better name to be honest). Why it works:

?? I don't remember ever changing it and git blame tells me that section is from the initial commit of the idioms file.
So people should not know that because it's not true?

I jumped to the wrong conclusion, please forgive me.

Also note that you can move where the negation happens: a NEG is as cheap as a MOV and you can then ADD the adjustment instead of SUB-in it, which may in some cases be useful.

Further note that the 32 x 32 optimization I found this afternoon (post #4) only works if a subtrahend is subtracted. I couldn't save any code in the 64 x 64 routine applying the same idea, so that could add an addend if desired. EDIT: adding does not work always - see post #9 below.

Wuerfel_21 · 2025-12-28 20:32

@TonyB_ said:
I jumped to the wrong conclusion, please forgive me.

Is fine, happens to the best of us.

Also note that you can move where the negation happens: a NEG is as cheap as a MOV and you can then ADD the adjustment instead of SUB-in it, which may in some cases be useful.

Further note that the 32 x 32 optimization I found this afternoon (post #4) only works if a subtrahend is subtracted. I couldn't save any code in the 64 x 64 routine applying the same idea, so that could add an addend if desired.

I'm pretty sure that does work exactly the same if you swap the first MOV to a NEG and the ADD to a SUB (and I think invert the IF_C condition to IF_NC?). Point is, you can exactly mirror this.

TonyB_ · 2025-12-28 21:54

@Wuerfel_21 said:
I'm pretty sure that does work exactly the same if you swap the first MOV to a NEG and the ADD to a SUB (and I think invert the IF_C condition to IF_NC?). Point is, you can exactly mirror this.

I tested adding and the C condition is inverted compared to subtracting. If b = $8000_0000 then mov s,b wc ... and subtracting produce the right result. If b = $8000_0000 then neg s,b wc ... and adding produce the wrong result.

Wuerfel_21 · 2025-12-28 22:43

@TonyB_ said:

@Wuerfel_21 said:
I'm pretty sure that does work exactly the same if you swap the first MOV to a NEG and the ADD to a SUB (and I think invert the IF_C condition to IF_NC?). Point is, you can exactly mirror this.

I tested adding and the C condition is inverted compared to subtracting. If b = $8000_0000 then mov s,b wc ... and subtracting produce the right result. If b = $8000_0000 then neg s,b wc ... and adding produce the wrong result.

Yeah C from NEG is "MSB of result", so it doesn't work properly in the extreme case. afuu~
I think I was thinking of the P1 version of NEG which does give the same C flag as MOV would.

TonyB_ · 2025-12-28 23:01

@Wuerfel_21 said:

@TonyB_ said:

@Wuerfel_21 said:
I'm pretty sure that does work exactly the same if you swap the first MOV to a NEG and the ADD to a SUB (and I think invert the IF_C condition to IF_NC?). Point is, you can exactly mirror this.

I tested adding and the C condition is inverted compared to subtracting. If b = $8000_0000 then mov s,b wc ... and subtracting produce the right result. If b = $8000_0000 then neg s,b wc ... and adding produce the wrong result.

Yeah C from NEG is "MSB of result", so it doesn't work properly in the extreme case. afuu~
I think I was thinking of the P1 version of NEG which does give the same C flag as MOV would.

Problem is MSB doesn't invert for NEG D,S when S = $8000_0000 but does for all other S and C condition is inverted for all S.

TonyB_ · 2025-12-29 12:39

@Wuerfel_21 said:
Yeah C from NEG is "MSB of result", so it doesn't work properly in the extreme case. afuu~

An alternative to neg d,s wc that writes c correctly for any s including $8000_0000 is mov d,#0 then subs d,s wc

Christof Eb. · 2025-12-31 05:58

@"Stephen Moraco"
This kind of info here would make your manual great.

Stephen Moraco · 2026-01-03 01:36

@"Christof Eb." said:
@"Stephen Moraco"
This kind of info here would make your manual great.

Noted! Thank you. @Wuerfel_21 has a number of these collected. Do we know of other sources that I can use as well?

bob_g4bby · 2026-01-14 19:20

It begs the question, how about converting the unsigned QDIV to signed? Can a similar technique be used?

Wuerfel_21 · 2026-01-14 19:36

For QDIV you really do need to use ABS/NEGC type framing around the divide. (and it gets nasty if you want to support negative divisor)

ManAtWork · 2026-01-15 08:20

There is the TESTB D,#31 xorc instruction. By testing the sign of both the divisor and the divident you can store the sign of the result in a single C flag.

bob_g4bby · 2026-01-15 09:37

That sounds the simplest

Wuerfel_21 · 2026-01-15 12:49

But XORC is not available on ABS and you need the absolute value, no fancy sign equivalence tricks allowed with division.

TonyB_ · 2026-01-15 14:18

Here is a 32-bit signed division routine:

'Before: a = dividend, b = divisor
'After:  a = quotient, b = remainder

'Algorithm copied from x86 IDIV instruction

                abs     b               wc      'b = |divisor|, c if divisor < 0
                modz    _c              wz      'z if divisor < 0
                abs     a               wc      'a = |dividend|, c if dividend < 0, 
                qdiv    a,b                     'start 32-bit/32-bit division
                modz    _c_ne_z         wz      'z if dividend/divisor < 0

                getqx   a                       'a = |quotient|
                getqy   b                       'b = |remainder|

                negc    b                       'remainder < 0 if dividend < 0
                negz    a               wc      'quotient < 0 if dividend/divisor < 0, c if quotient < 0

'see truth table below

        _ret_   modc    _c_ne_z                 'c if quotient out of range

{{
dividend/
divisor < 0     |quotient|[31]  quotient[31]
z                               c

0               0               0       OK
1               0               1       OK
0               1               1       Out of range!
1               1               0       Out of range!
1               1               1       OK, |qoutient| = quotient = $8000_0000
}}

bob_g4bby · 2026-01-15 15:01

Nice one, @TonyB_

Wuerfel_21 · 2026-01-15 15:26

I think you need to move the QDIV below the ABS

TonyB_ · 2026-01-15 16:03

@Wuerfel_21 said:

I think you need to move the QDIV below the ABS

Corrected, thank you. I was trying to put some instructions between QDIV and GETQX and I got carried away!

EDIT: It's possible to do ABS, ABS, QDIV at the start then two instructions before GETQX

TonyB_ · 2026-01-15 19:14

I've tested the 32-bit signed division at https://forums.parallax.com/discussion/comment/1571208/#Comment_1571208 and error flag C is correct:

C = 0 for $7FFFFFFF / $00000001
C = 0 for $7FFFFFFF / $FFFFFFFF

C = 0 for $80000000 / $00000001
C = 1 for $80000000 / $FFFFFFFF

C = 0 for $80000001 / $00000001
C = 0 for $80000001 / $FFFFFFFF

C = 1 for $xxxxxxxx / $00000000

CORDIC QMUL/QDIV for signed multiply/divide

Comments