Shop OBEX P1 Docs P2 Docs Learn Events
CORDIC QMUL for signed multiply? Easy! — Parallax Forums

CORDIC QMUL for signed multiply? Easy!

TonyB_TonyB_ Posts: 2,252
edited 2025-12-28 17:24 in Propeller 2

QMUL does a 32 x 32 multiply with 64-bit unsigned result. However, there is a simple method to convert this to a signed result (which I think has not been mentioned on the forum before) as follows: if one operand is negative then subtract the other operand from the high half of the result. If both negative subtract both. (The low half of an unsigned and signed result is identical.)

In my view this method is better than multiplying absolute values and negating the result if only one was negative, especially for larger bit widths, e.g. 64 x 64. If the value that must be subtracted (the subtrahend) is calculated whilst waiting for the first GETQX/Y result then the conversion from unsigned to signed adds only two cycles for 32 x 32 and only four for 64 x 64.

A couple of code examples are below. Online unsigned/signed 64 x 64 multiply with user entry, showing subtraction if signed (can be used for 32 x 32 if low bits are zeroed):
https://www.cse.scu.edu/~dlewis/book3/tools/64x64Multiply.shtml

muls_32x32

'start 32x32 unsigned multiply

                qmul    a,b

'create 32-bit subtrahend in s
'needed only for signed result

                mov     s,b             wc
                testb   a,#31           wz
        if_nz   mov     s,#0
        if_c    add     s,a

'read product

                getqy   a
                getqx   b

'unsigned 64-bit result in {a,b}
'subtract s from a for signed result

        _ret_   sub     a,s

'operands
a               long    $BABE_FACE
b               long    $DEAD_BEEF

'subtrahend
s               long    0

{{
BABEFACE*DEADBEEF =
A2705BD6,37A70A52 unsigned
0903A219,37A70A52 signed
}}

muls_64x64
'
'64x64 multiply
'{a,b,c,d} = {a,b} * {c,d}
'
'      ab
'x     cd        getqy,getgx
'--------        variables
'      BD = b*d        c,d
'     AD  = a*d     b,c1   +
'     CB  = c*b    b2,c2   +
'    AC   = a*c  a,b3      +
'                ---------

'start four 32x32 unsigned multiplies

                qmul    b,d
                waitx   #4

                qmul    a,d
                waitx   #4

                qmul    c,b
                waitx   #4

                qmul    a,c

'create 64-bit subtrahend in {s,t}
'needed only for signed result

                mov     t,#0
                mov     s,#0
                testb   a,#31           wz
        if_z    mov     t,d
        if_z    mov     s,c
                testb   c,#31           wz
        if_z    add     t,b             wc
        if_z    addx    s,a

'read and sum four 64-bit partial products

                getqx   d
                getqy   c
                waitx   #2

                getqx   c1
                getqy   b
                add     c,c1            wc
                modz    _c              wz

                getqx   c2
                getqy   b2
                add     c,c2            wc
                addx    b,b2            wc

                getqx   b3
                getqy   a
                addx    a,#0
                modc    _z              wc
                addx    b,b3
                addx    a,#0

'unsigned 128-bit result in {a,b,c,d}
'subtract {s,t} from {a,b} for signed result

                sub     b,t             wc
        _ret_   subx    a,s

'operands
a               long    $FFFEFDFC
b               long    $FBFAF9F8
c               long    $F7F6F5F4
d               long    $F3F2F1F0

'subtrahend
s               long    0
t               long    0

'scratchpad
b2
b3              long    0
c1
c2              long    0

{{
FFFEFDFC,FBFAF9F8*F7F6F5F4,F3F2F1F0 =
F7F5FC0B,244878B4,213F4D4A,350CD080 unsigned
00000819,345A8CCC,213F4D4A,350CD080 signed
}}

EDIT:
32 x 32 subtrahend code now shorter and quicker.

Comments

  • Great tip!
    This should be in the P2 PASM manual where they talk about signed/unsigned 32bit, 64 bit and 96 bit operations...

  • https://p2docs.github.io/idiom.html#signed-qmul
    etc, etc

    (Though it really lacks an explanation as to why it works like that...)

  • TonyB_TonyB_ Posts: 2,252
    edited 2025-12-28 20:29

    Just saved one long and two cycles in the 32 x 32 subtrahend calculation.

    Replace

                    mov     s,#0
                    testb   a,#31           wz
            if_z    mov     s,b     
                    testb   b,#31           wz
            if_z    add     s,a
    

    with

                    mov     s,b             wc
                    testb   a,#31           wz
            if_nz   mov     s,#0
            if_c    add     s,a
    
  • TonyB_TonyB_ Posts: 2,252
    edited 2025-12-28 19:36

    @Wuerfel_21 said:
    https://p2docs.github.io/idiom.html#signed-qmul
    etc, etc

    (Though it really lacks an explanation as to why it works like that...)

    Why it works:

    An N-bit negative number a in two's complement notation = 2^N - |a|. Multiplying by positive number b, the product = (2^N - |a|).b = 2^N.b - |a|.b, which is too large by 2^N.b therefore that must be subtracted to give the correct result. 2^N.b is b shifted left by N bits, i.e. into the high half of the result.

    The case of two negative numbers is left as an exercise for the reader.

  • @TonyB_ said:

    @Wuerfel_21 said:
    https://p2docs.github.io/idiom.html#signed-qmul
    etc, etc

    (Though it really lacks an explanation as to why it works like that...)

    Firstly, others should know your doc was different before I told you about the subtrahend method (probably could do with a better name to be honest). Why it works:

    ?? I don't remember ever changing it and git blame tells me that section is from the initial commit of the idioms file.
    So people should not know that because it's not true?

    An N-bit negative number a in two's complement notation = 2^N - |a|. If multiplying that by positive number b, the product = (2^N - |a|).b = 2^N.b - |a|.b, which is too large by 2^N.b therefore that must be subtracted to give the correct result. 2^N.b is b shifted left by N bits, i.e. into the high half of the result.

    The case of two negative numbers is left as an exercise for the reader.

    That's one way to put it.
    The way I like to think of it is that you must imagine each of the 32 bit operands being sign-extended to 64 bit. The upper half is then either all zeroes in the positive case (does nothing), or all ones in the negative case, which if we factor that into the multiplication is equivalent to adding -1 times the other operand into the high half of the result.

    Also note that you can move where the negation happens: a NEG is as cheap as a MOV and you can then ADD the adjustment instead of SUB-in it, which may in some cases be useful.

  • TonyB_TonyB_ Posts: 2,252
    edited 2025-12-28 22:02

    @Wuerfel_21 said:

    @TonyB_ said:

    @Wuerfel_21 said:
    https://p2docs.github.io/idiom.html#signed-qmul
    etc, etc

    (Though it really lacks an explanation as to why it works like that...)

    Firstly, others should know your doc was different before I told you about the subtrahend method (probably could do with a better name to be honest). Why it works:

    ?? I don't remember ever changing it and git blame tells me that section is from the initial commit of the idioms file.
    So people should not know that because it's not true?

    I jumped to the wrong conclusion, please forgive me.

    Also note that you can move where the negation happens: a NEG is as cheap as a MOV and you can then ADD the adjustment instead of SUB-in it, which may in some cases be useful.

    Further note that the 32 x 32 optimization I found this afternoon (post #4) only works if a subtrahend is subtracted. I couldn't save any code in the 64 x 64 routine applying the same idea, so that could add an addend if desired. EDIT: adding does not work always - see post #9 below.

  • @TonyB_ said:
    I jumped to the wrong conclusion, please forgive me.

    Is fine, happens to the best of us.

    Also note that you can move where the negation happens: a NEG is as cheap as a MOV and you can then ADD the adjustment instead of SUB-in it, which may in some cases be useful.

    Further note that the 32 x 32 optimization I found this afternoon (post #4) only works if a subtrahend is subtracted. I couldn't save any code in the 64 x 64 routine applying the same idea, so that could add an addend if desired.

    I'm pretty sure that does work exactly the same if you swap the first MOV to a NEG and the ADD to a SUB (and I think invert the IF_C condition to IF_NC?). Point is, you can exactly mirror this.

  • TonyB_TonyB_ Posts: 2,252
    edited 2025-12-28 21:56

    @Wuerfel_21 said:
    I'm pretty sure that does work exactly the same if you swap the first MOV to a NEG and the ADD to a SUB (and I think invert the IF_C condition to IF_NC?). Point is, you can exactly mirror this.

    I tested adding and the C condition is inverted compared to subtracting. If b = $8000_0000 then mov s,b wc ... and subtracting produce the right result. If b = $8000_0000 then neg s,b wc ... and adding produce the wrong result.

  • @TonyB_ said:

    @Wuerfel_21 said:
    I'm pretty sure that does work exactly the same if you swap the first MOV to a NEG and the ADD to a SUB (and I think invert the IF_C condition to IF_NC?). Point is, you can exactly mirror this.

    I tested adding and the C condition is inverted compared to subtracting. If b = $8000_0000 then mov s,b wc ... and subtracting produce the right result. If b = $8000_0000 then neg s,b wc ... and adding produce the wrong result.

    Yeah C from NEG is "MSB of result", so it doesn't work properly in the extreme case. afuu~
    I think I was thinking of the P1 version of NEG which does give the same C flag as MOV would.

Sign In or Register to comment.