CORDIC QMUL for signed multiply? Easy!
QMUL does a 32 x 32 multiply with 64-bit unsigned result. However, there is a simple method to convert this to a signed result (which I think has not been mentioned on the forum before) as follows: if one operand is negative then subtract the other operand from the high half of the result. If both negative subtract both. (The low half of an unsigned and signed result is identical.)
In my view this method is better than multiplying absolute values and negating the result if only one was negative, especially for larger bit widths, e.g. 64 x 64. If the value that must be subtracted (the subtrahend) is calculated whilst waiting for the first GETQX/Y result then the conversion from unsigned to signed adds only two cycles for 32 x 32 and only four for 64 x 64.
A couple of code examples are below. Online unsigned/signed 64 x 64 multiply with user entry, showing subtraction if signed (can be used for 32 x 32 if low bits are zeroed):
https://www.cse.scu.edu/~dlewis/book3/tools/64x64Multiply.shtml
muls_32x32
'start 32x32 unsigned multiply
qmul a,b
'create 32-bit subtrahend in s
'needed only for signed result
mov s,b wc
testb a,#31 wz
if_nz mov s,#0
if_c add s,a
'read product
getqy a
getqx b
'unsigned 64-bit result in {a,b}
'subtract s from a for signed result
_ret_ sub a,s
'operands
a long $BABE_FACE
b long $DEAD_BEEF
'subtrahend
s long 0
{{
BABEFACE*DEADBEEF =
A2705BD6,37A70A52 unsigned
0903A219,37A70A52 signed
}}
muls_64x64
'
'64x64 multiply
'{a,b,c,d} = {a,b} * {c,d}
'
' ab
'x cd getqy,getgx
'-------- variables
' BD = b*d c,d
' AD = a*d b,c1 +
' CB = c*b b2,c2 +
' AC = a*c a,b3 +
' ---------
'start four 32x32 unsigned multiplies
qmul b,d
waitx #4
qmul a,d
waitx #4
qmul c,b
waitx #4
qmul a,c
'create 64-bit subtrahend in {s,t}
'needed only for signed result
mov t,#0
mov s,#0
testb a,#31 wz
if_z mov t,d
if_z mov s,c
testb c,#31 wz
if_z add t,b wc
if_z addx s,a
'read and sum four 64-bit partial products
getqx d
getqy c
waitx #2
getqx c1
getqy b
add c,c1 wc
modz _c wz
getqx c2
getqy b2
add c,c2 wc
addx b,b2 wc
getqx b3
getqy a
addx a,#0
modc _z wc
addx b,b3
addx a,#0
'unsigned 128-bit result in {a,b,c,d}
'subtract {s,t} from {a,b} for signed result
sub b,t wc
_ret_ subx a,s
'operands
a long $FFFEFDFC
b long $FBFAF9F8
c long $F7F6F5F4
d long $F3F2F1F0
'subtrahend
s long 0
t long 0
'scratchpad
b2
b3 long 0
c1
c2 long 0
{{
FFFEFDFC,FBFAF9F8*F7F6F5F4,F3F2F1F0 =
F7F5FC0B,244878B4,213F4D4A,350CD080 unsigned
00000819,345A8CCC,213F4D4A,350CD080 signed
}}
EDIT:
32 x 32 subtrahend code now shorter and quicker.

Comments
Great tip!
This should be in the P2 PASM manual where they talk about signed/unsigned 32bit, 64 bit and 96 bit operations...
https://p2docs.github.io/idiom.html#signed-qmul
etc, etc
(Though it really lacks an explanation as to why it works like that...)
Just saved one long and two cycles in the 32 x 32 subtrahend calculation.
Replace
mov s,#0 testb a,#31 wz if_z mov s,b testb b,#31 wz if_z add s,awith
mov s,b wc testb a,#31 wz if_nz mov s,#0 if_c add s,aWhy it works:
An N-bit negative number a in two's complement notation = 2^N - |a|. Multiplying by positive number b, the product = (2^N - |a|).b = 2^N.b - |a|.b, which is too large by 2^N.b therefore that must be subtracted to give the correct result. 2^N.b is b shifted left by N bits, i.e. into the high half of the result.
The case of two negative numbers is left as an exercise for the reader.
?? I don't remember ever changing it and git blame tells me that section is from the initial commit of the idioms file.
So people should not know that because it's not true?
That's one way to put it.
The way I like to think of it is that you must imagine each of the 32 bit operands being sign-extended to 64 bit. The upper half is then either all zeroes in the positive case (does nothing), or all ones in the negative case, which if we factor that into the multiplication is equivalent to adding -1 times the other operand into the high half of the result.
Also note that you can move where the negation happens: a NEG is as cheap as a MOV and you can then ADD the adjustment instead of SUB-in it, which may in some cases be useful.
I jumped to the wrong conclusion, please forgive me.
Further note that the 32 x 32 optimization I found this afternoon (post #4) only works if a subtrahend is subtracted. I couldn't save any code in the 64 x 64 routine applying the same idea, so that could add an addend if desired. EDIT: adding does not work always - see post #9 below.
Is fine, happens to the best of us.
I'm pretty sure that does work exactly the same if you swap the first MOV to a NEG and the ADD to a SUB (and I think invert the IF_C condition to IF_NC?). Point is, you can exactly mirror this.
I tested adding and the C condition is inverted compared to subtracting. If b = $8000_0000 then
mov s,b wc... and subtracting produce the right result. If b = $8000_0000 thenneg s,b wc... and adding produce the wrong result.Yeah C from NEG is "MSB of result", so it doesn't work properly in the extreme case. afuu~
I think I was thinking of the P1 version of NEG which does give the same C flag as MOV would.