Rayman
Posts: **8,897**

Wish I would have looked more closely at this earlier, might have asked for a cordic option...

Looks like Q16.16 is going to make the most sense for a lot of applications.

I think this is how you do multiplication on P2:

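```
        qmul    f2,f1       'start CORDIC unsigned 32x32 multiply
        getqx   f2          'lower long of the 64-bit product
        getqy   f1          'upper long of the 64-bit product
        shr     f2,#16      'lower bytes
        shl     f1,#16      'make 16.16
        add     f1,f2
```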
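The intent of that sequence, keeping the middle 32 bits of the 64-bit product, can be sketched in C. This is only an illustration: `q16_mul_unsigned`, `lo` and `hi` are made-up names, with `lo` and `hi` standing in for the lower and upper longs that GETQX and GETQY return. It handles unsigned operands only and does no rounding or overflow checking.

```c
#include <stdint.h>

/* Unsigned Q16.16 multiply (hypothetical helper, not P2 library code).
 * lo and hi model the lower and upper longs of the 64-bit product. */
static uint32_t q16_mul_unsigned(uint32_t a, uint32_t b)
{
    uint64_t product = (uint64_t)a * b;        /* 32x32 -> 64-bit product */
    uint32_t lo = (uint32_t)product;           /* lower long (GETQX) */
    uint32_t hi = (uint32_t)(product >> 32);   /* upper long (GETQY) */

    /* Keep bits 16..47 of the product: that is the Q16.16 result. */
    return (hi << 16) + (lo >> 16);
}
```

For example, 1.5 is `0x00018000` and 2.0 is `0x00020000` in Q16.16, and the function returns `0x00030000`, which is 3.0.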

Might have been nicer if there was a QMUL2 that did the SHR, SHL and ADD. Would that have added a lot?

It is nice to have the 32x32 multiply... Puts it close to ARM Cortex M3, apparently...

It's looking to me like both have to be positive for this to work right (or at least the same sign)...
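QMUL is an unsigned multiply, so a negative two's-complement operand gets treated as a large positive number and the upper long of the product comes out wrong. One common workaround, sketched here in C as an assumption rather than anything from the P2 docs, is to multiply the magnitudes and put the sign back afterwards. `q16_mul_signed` is a hypothetical name; the sketch truncates toward zero and does no overflow detection.

```c
#include <stdint.h>

/* Signed Q16.16 multiply built on an unsigned 32x32 -> 64-bit product
 * (hypothetical helper; no rounding, no overflow detection). */
static int32_t q16_mul_signed(int32_t a, int32_t b)
{
    /* The result is negative only when the operand signs differ. */
    int negative = (a < 0) != (b < 0);

    /* Magnitudes, computed in unsigned arithmetic so INT32_MIN is safe. */
    uint32_t ua = (a < 0) ? 0u - (uint32_t)a : (uint32_t)a;
    uint32_t ub = (b < 0) ? 0u - (uint32_t)b : (uint32_t)b;

    uint64_t product = (uint64_t)ua * ub;       /* what an unsigned multiply gives */
    uint32_t mag = (uint32_t)(product >> 16);   /* middle 32 bits */

    /* Reapply the sign; the final cast assumes two's-complement wraparound. */
    uint32_t bits = negative ? 0u - mag : mag;
    return (int32_t)bits;
}
```

With this, -1.5 times 2.0 (`-98304` and `131072` as raw Q16.16 values) gives `-196608`, which is -3.0.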

I found "libfixmath" that does some clever stuff with the leftover bits... The upper bits can be used for overflow detection and the lower bits for rounding:

```c
/* 64-bit implementation for fix16_mul. Fastest version for e.g. ARM Cortex M3.
 * Performs a 32*32 -> 64bit multiplication. The middle 32 bits are the result,
 * bottom 16 bits are used for rounding, and upper 16 bits are used for overflow
 * detection.
 */
#if !defined(FIXMATH_NO_64BIT) && !defined(FIXMATH_OPTIMIZE_8BIT)
fix16_t fix16_mul(fix16_t inArg0, fix16_t inArg1)
{
    int64_t product = (int64_t)inArg0 * inArg1;

    #ifndef FIXMATH_NO_OVERFLOW
    // The upper 17 bits should all be the same (the sign).
    uint32_t upper = (product >> 47);
    #endif

    if (product < 0)
    {
        #ifndef FIXMATH_NO_OVERFLOW
        if (~upper)
            return fix16_overflow;
        #endif

        #ifndef FIXMATH_NO_ROUNDING
        // This adjustment is required in order to round -1/2 correctly
        product--;
        #endif
    }
    else
    {
        #ifndef FIXMATH_NO_OVERFLOW
        if (upper)
            return fix16_overflow;
        #endif
    }

    #ifdef FIXMATH_NO_ROUNDING
    return product >> 16;
    #else
    fix16_t result = product >> 16;
    result += (product & 0x8000) >> 15;

    return result;
    #endif
}
#endif
```

Prop Info and Apps: http://www.rayslogic.com/

## Comments

Yes, although I think I may change it to a routine that uses a series of 16x16 MUL instructions -- I did a test a while back and doing 3 MULs plus the appropriate shifts and adds is slightly faster than QMUL.
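The comment doesn't say which three products are meant. One reading, and it is only an assumption, is an ordinary 32x32 multiply that keeps the low 32 bits of the result: the high-times-high partial product lands entirely above bit 31, so three 16x16 multiplies are enough. A minimal C sketch of that decomposition, with `mul32_from_16x16` as a hypothetical name:

```c
#include <stdint.h>

/* Low 32 bits of a 32x32 multiply from three 16x16 -> 32 partial products
 * (hypothetical helper; one possible reading of the "3 MULs" routine). */
static uint32_t mul32_from_16x16(uint32_t a, uint32_t b)
{
    uint32_t al = a & 0xFFFFu, ah = a >> 16;   /* 16-bit halves of a */
    uint32_t bl = b & 0xFFFFu, bh = b >> 16;   /* 16-bit halves of b */

    uint32_t low   = al * bl;             /* MUL 1: low x low */
    uint32_t cross = ah * bl + al * bh;   /* MUL 2 and 3; may wrap mod 2^32 */

    /* ah*bh would be shifted left by 32, so it never reaches the low long.
     * Only the low 16 bits of cross survive the shift, so the wrap is harmless. */
    return low + (cross << 16);
}
```

A Q16.16 multiply would need different terms (the high-times-high product matters there), so this sketch should not be read as a drop-in replacement for the fixed-point sequence in the original post.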

Still, cordic has this pipeline thing that might speed things up for me, we'll see...

IIRC CORDIC is kind of slow for multiplies (I think 16 cycles?? Plus 2 cycles to issue the qmul and 4 cycles to fetch the results). But it's been a while since I ran the tests.

No, and that would potentially improve performance at the cost of a more complicated compiler (and assuming there are useful things to do between the QMUL and GETQX).

Only time this fails is if it's a simple recursion that has to be serially processed.