CORDIC QMUL for signed multiply? Easy!

TonyB_ · 2025-12-28 13:46

QMUL does a 32 x 32 multiply with a 64-bit unsigned result. However, there is a simple method to convert this to a signed result (which I think has not been mentioned on the forum before): if one operand is negative, subtract the other operand from the high half of the result. If both are negative, subtract both. (The low half of the unsigned and signed results is identical.)
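
One way to see why this works (a sketch, not taken from the P2 documentation): a negative 32-bit operand read as unsigned is 2^32 too large, so the unsigned product picks up an extra 2^32 * b when a is negative and an extra 2^32 * a when b is negative. Both extras land entirely in the high half, and the 2^64 cross term falls off the top of the 64-bit result. The Python below checks that arithmetic against a direct signed multiply. The names to_signed and muls_32x32 are made up for illustration, and Python's integer multiply stands in for QMUL.

```python
MASK32 = 0xFFFF_FFFF
MASK64 = 0xFFFF_FFFF_FFFF_FFFF

def to_signed(x, bits):
    """Interpret the low `bits` bits of x as a two's-complement value."""
    x &= (1 << bits) - 1
    return x - (1 << bits) if x >> (bits - 1) else x

def muls_32x32(ua, ub):
    """Signed 64-bit product (as a bit pattern) built from an unsigned multiply."""
    product = ua * ub                  # stand-in for QMUL: full unsigned product
    high, low = product >> 32, product & MASK32
    if ua >> 31:                       # a negative: subtract b from the high half
        high -= ub
    if ub >> 31:                       # b negative: subtract a from the high half
        high -= ua
    return ((high & MASK32) << 32) | low

# Compare against a direct signed multiply for a few operand pairs.
samples = [(0xBABE_FACE, 0xDEAD_BEEF), (0x8000_0000, 0x8000_0000),
           (0xFFFF_FFFF, 1), (5, 7), (0x7FFF_FFFF, 0xFFFF_FFFF)]
for ua, ub in samples:
    expected = (to_signed(ua, 32) * to_signed(ub, 32)) & MASK64
    assert muls_32x32(ua, ub) == expected
```

The same argument holds for any operand width, which is why the 64 x 64 case only needs a 64-bit subtrahend.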

In my view this method is better than multiplying absolute values and negating the result if only one was negative, especially for larger bit widths, e.g. 64 x 64. If the value that must be subtracted (the subtrahend) is calculated whilst waiting for the first GETQX/Y result then the conversion from unsigned to signed adds only two cycles for 32 x 32 and only four for 64 x 64.

Two code examples are below. There is also an online unsigned/signed 64 x 64 multiply calculator with user entry that shows the subtraction for the signed case (it can be used for 32 x 32 if the low bits are zeroed):
https://www.cse.scu.edu/~dlewis/book3/tools/64x64Multiply.shtml

muls_32x32

'start 32x32 unsigned multiply

                qmul    a,b

'create 32-bit subtrahend in s
'needed only for signed result

                mov     s,b             wc
                testb   a,#31           wz
        if_nz   mov     s,#0
        if_c    add     s,a

'read product

                getqy   a
                getqx   b

'unsigned 64-bit result in {a,b}
'subtract s from a for signed result

        _ret_   sub     a,s

'operands
a               long    $BABE_FACE
b               long    $DEAD_BEEF

'subtrahend
s               long    0

{{
BABEFACE*DEADBEEF =
A2705BD6,37A70A52 unsigned
0903A219,37A70A52 signed
}}

muls_64x64
'
'64x64 multiply
'{a,b,c,d} = {a,b} * {c,d}
'
'      ab
'x     cd        getqy,getqx
'--------        variables
'      BD = b*d        c,d
'     AD  = a*d     b,c1   +
'     CB  = c*b    b2,c2   +
'    AC   = a*c  a,b3      +
'                ---------

'start four 32x32 unsigned multiplies

                qmul    b,d
                waitx   #4

                qmul    a,d
                waitx   #4

                qmul    c,b
                waitx   #4

                qmul    a,c

'create 64-bit subtrahend in {s,t}
'needed only for signed result

                mov     t,#0
                mov     s,#0
                testb   a,#31           wz
        if_z    mov     t,d
        if_z    mov     s,c
                testb   c,#31           wz
        if_z    add     t,b             wc
        if_z    addx    s,a

'read and sum four 64-bit partial products

                getqx   d
                getqy   c
                waitx   #2

                getqx   c1
                getqy   b
                add     c,c1            wc
                modz    _c              wz

                getqx   c2
                getqy   b2
                add     c,c2            wc
                addx    b,b2            wc

                getqx   b3
                getqy   a
                addx    a,#0
                modc    _z              wc
                addx    b,b3            wc
                addx    a,#0

'unsigned 128-bit result in {a,b,c,d}
'subtract {s,t} from {a,b} for signed result

                sub     b,t             wc
        _ret_   subx    a,s

'operands
a               long    $FFFEFDFC
b               long    $FBFAF9F8
c               long    $F7F6F5F4
d               long    $F3F2F1F0

'subtrahend
s               long    0
t               long    0

'scratchpad (b2/b3 share one long, as do c1/c2)
b2
b3              long    0
c1
c2              long    0

{{
FFFEFDFC,FBFAF9F8*F7F6F5F4,F3F2F1F0 =
F7F5FC0B,244878B4,213F4D4A,350CD080 unsigned
00000819,345A8CCC,213F4D4A,350CD080 signed
}}
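
As a cross-check of the 64 x 64 routine above, here is a rough Python model of the same arithmetic: four unsigned 32 x 32 partial products summed by column, then the 64-bit subtrahend taken off the top two longs. It mirrors the structure of the PASM but not its register reuse or timing. The names qmul and mul_64x64 are made up for illustration, and Python's integer multiply stands in for the CORDIC.

```python
MASK32 = 0xFFFF_FFFF
MASK64 = 0xFFFF_FFFF_FFFF_FFFF

def qmul(x, y):
    """Model of QMUL then GETQX/GETQY: (low, high) longs of the unsigned product."""
    p = x * y
    return p & MASK32, p >> 32

def mul_64x64(a, b, c, d, signed=True):
    """{a,b} * {c,d} as four 32-bit longs, most significant first."""
    bd_lo, bd_hi = qmul(b, d)
    ad_lo, ad_hi = qmul(a, d)
    cb_lo, cb_hi = qmul(c, b)
    ac_lo, ac_hi = qmul(a, c)

    # Sum the partial products column by column, carrying into the next long.
    r0 = bd_lo
    col1 = bd_hi + ad_lo + cb_lo
    r1 = col1 & MASK32
    col2 = ad_hi + cb_hi + ac_lo + (col1 >> 32)
    r2 = col2 & MASK32
    r3 = (ac_hi + (col2 >> 32)) & MASK32

    if signed:
        # 64-bit subtrahend: for each negative operand, subtract the other one.
        sub = 0
        if a >> 31:
            sub += (c << 32) | d
        if c >> 31:
            sub += (a << 32) | b
        high = (((r3 << 32) | r2) - sub) & MASK64
        r3, r2 = high >> 32, high & MASK32
    return r3, r2, r1, r0

ops = (0xFFFEFDFC, 0xFBFAF9F8, 0xF7F6F5F4, 0xF3F2F1F0)
print(",".join(f"{w:08X}" for w in mul_64x64(*ops, signed=False)), "unsigned")
print(",".join(f"{w:08X}" for w in mul_64x64(*ops, signed=True)), "signed")
```

Run as is, this should print the unsigned and signed results quoted in the comment block above.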

EDIT:
32 x 32 subtrahend code now shorter and quicker.

mwroberts · 2025-12-28 14:39

Great tip!
This should be in the P2 PASM manual where they talk about signed/unsigned 32-bit, 64-bit and 96-bit operations...

Wuerfel_21 · 2025-12-28 14:59

https://p2docs.github.io/idiom.html#signed-qmul
etc, etc

(Though it really lacks an explanation as to why it works like that...)

TonyB_ · 2025-12-28 16:40

Just saved one long and two cycles in the 32 x 32 subtrahend calculation, making this method just as quick and short as multiplying absolutes and negating result if necessary.

Replace

                mov     s,#0
                testb   a,#31           wz
        if_z    mov     s,b     
                testb   b,#31           wz
        if_z    add     s,a

with this

                mov     s,b             wc
                testb   a,#31           wz
        if_nz   mov     s,#0
        if_c    add     s,a
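
A quick way to convince yourself that the two sequences build the same subtrahend is to model each in Python and compare them over all four sign combinations. This is only a sketch: it assumes MOV with WC copies bit 31 of the source into C and TESTB with WZ copies the tested bit into Z, and the function names are made up for illustration.

```python
MASK32 = 0xFFFF_FFFF

def subtrahend_old(a, b):
    """Five-instruction version: clear s, then add an operand per negative sign."""
    s = 0                        # mov s,#0
    if a >> 31:                  # testb a,#31 wz / if_z mov s,b
        s = b
    if b >> 31:                  # testb b,#31 wz / if_z add s,a
        s = (s + a) & MASK32
    return s

def subtrahend_new(a, b):
    """Four-instruction version: load b up front and capture its sign in C."""
    s = b                        # mov s,b wc
    c = b >> 31                  #   (assumed) C = bit 31 of b
    z = a >> 31                  # testb a,#31 wz, so Z = bit 31 of a
    if not z:                    # if_nz mov s,#0
        s = 0
    if c:                        # if_c add s,a
        s = (s + a) & MASK32
    return s

# One positive and one negative value per operand covers every sign combination.
for a in (0x1234_5678, 0xBABE_FACE):
    for b in (0x0BAD_F00D, 0xDEAD_BEEF):
        assert subtrahend_old(a, b) == subtrahend_new(a, b)
```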
