Hi to all who need more COGs

cessnapilot · 2009-08-27 20:08

A small and complete Fixed-point arithmetic package (http://obex.parallax.com/objects/501/) with the necessary Trig in SPIN has just appeared in··Obex.

Enjoy,

Istvan····

Timmoore · 2009-08-31 17:20

Thanks, makes it much easier to convert some of my code to int and speed it up.
One comment, it would be worth adding a comment to the deg_atan2 routine that the parameters must be < 256. That caught me for a few minutes - you do a square of the 2 parameters and they overflow. And/or do a range check on the input parameters.

cessnapilot · 2009-09-01 05:58

Hi Timmoore,

Thanks for the bug report. You are right. The ugly head of Fixed-point overflow rose again. The range is checked before the int atan2 for the product of x*y, but is not checked before the (later appended) Newton-Raphson refinement for x*x and y*y, separately. My fault, I am sorry.

In your code, as a quick treatment of the symptoms, you can Qdiv x and y with 100.0, for example, before the atan2, and take ten times the calculated distance (if needed).

The disease will be cured in the driver by downscaling·· x and y parallel for both ABS(x) and ABS(y) to be less that 128.0 (Qvalue). This will· affect the precision of the result only a very little. The debugged code will be uploaded within a week.

Numerically much better solution will be a QRadius(x, y) procedure, where, like in the Qmuldiv, 64-bit arithmetic is used for the (x*x+y*y) sum, and a 64-bit SQR is done. However, I do not know yet how to make 64-bit SQR in SPIN quickly. Maybe someone out there can help us.

Cheers,

Istvan

Post Edited (cessnapilot) : 9/1/2009 6:05:34 AM GMT

HollyMinkowski · 2009-09-01 06:21

Years ago I got a peculiar book as a Bat Mitzvah gift.
It was written by a mathematician while locked up
in a concentration camp. It is supposed to be a method of
doing advanced math quickly and all in your head.

I have been wondering if those methods might be useful
translated into pasm for the prop...I never actually read the
book, I have it here somewhere in all this mess.

The guys name who wrote that book was Trachtenberg.

Might be worth looking into.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
- Some mornings I wake up cranky.....but usually I just let him sleep -

Timmoore · 2009-09-01 06:24

The case I ran across it was x is ~10 and y ~400 i.e. shallow angles. The change I did was to check if x or y > qval(128) and div in that case but not otherwise.
The code I am converting runs ~ 2x faster from floatmath to fixed int (i.e. the spin version of float), that contains 2 atan2, 10 mul and 6 div. With the 2x atan2 taking as much as the mul and divs.

Post Edited (Timmoore) : 9/1/2009 6:31:06 AM GMT

Humanoido · 2009-09-01 06:30

It's the same way you can use an algorithm to do square roots in your head. My Advanced Lab partner could do these faster than a calculator. Any of these algorithms can fit the Prop. The book and idea are good suggestions. It's likely many of these techniques will show up on a web search.

www.geocities.com/cnowlen/Cathy/Emat4680/Squareroot.htm
www.homeschoolmath.net/teaching/square-root-algorithm.php

humanoido

cessnapilot · 2009-09-01 16:29

Thanks to all for the responses. They were useful and inspiring. I will look after that book and I found interesting ideas in the links. In the promised upgrade· (< 1 week) there will be a QRadius(X,Y) procedure (and a corrected ATAN2), since we can do 64-bit SQR in SPIN as follows

Place the 64-bit result of the operations

·· (X*X) + (Y*Y)
··
in the [noparse][[/noparse]Hi] and [noparse][[/noparse]Lo] 32-bit registers:

IF [noparse][[/noparse]Hi] is zero

·· RETURN :=·Qsqr([noparse][[/noparse]Lo])·· 'As a valid and precise Qvalue result of SQR

ELSE

· ·Take 32-bit SPIN SQR of [noparse][[/noparse]Hi] to get a 1st approximation of the integer part of SQR

·· [noparse][[/noparse]1st_Approx] := ^^[noparse][[/noparse]Hi]

···1st_Approx contains at least 1 (or usually more) decimal digit of the correct result , I guess. This is· an integer approximation of SQR(something x 2^32), so shift it to be a Qvalue Fixed-point number

·· [noparse][[/noparse]1st_Approx] := [noparse][[/noparse]1st_Approx << 16]
··
Unrolled REPEAT 3 follows

···Double the number of the correct digits by a· 64-bit / 32-bit division (similar to that in Qmuldiv) + averaging,

·· [noparse]/noparse]2nd_Approx] := ([noparse][[/noparse][noparse][[/noparse]Hi][noparse][[/noparse]Lo / [noparse][[/noparse]1st_Approx] + [noparse][[/noparse]1st_Approx]) / 2

·· Now we have at least 2 (or usually more) correct decimal digits. Double the number of the correct decimal digits again

·· [noparse]/noparse]3rd_Approx] := ([noparse][[/noparse][noparse][[/noparse]Hi][noparse][[/noparse]Lo / [noparse][[/noparse]2st_Approx] + [noparse][[/noparse]2st_Approx]) / 2

·· And again(to be sure of =>4 digit precision for both the integer and the fractional part of··the RESULT)

Precise ·64-bit Fixed-point SQR done. (As a first trial.)

To avoid unnecessary 64-bit divisions· new approximations will be tested to be close enough to the previous ones. If so Quit REPEAT loop

Cheers,

Istvan

Post Edited (cessnapilot) : 9/1/2009 4:39:40 PM GMT

Timmoore · 2009-09-01 17:19

Couple of other comments about atan2. I did an implementation of atan2 using floatmath a while ago and was using that. It uses the standard tayor series. Comparing with your implementation I see 2 things
1. They return 90degree different. I believe I compared mine with the float32full implementation and got the same results though it was a while ago and I could be mistaken. You might want to check.
2. Looking at the timing the other maths stuff mul/div I have used from your fixed implementation is 3-5 times faster than using floatmath. The atan2 implementations are the same speed both take ~20ms to run (on 80Mhz system). I am attaching my float implementation in case there is anything interesting

Timmoore · 2009-09-02 02:07

Another bug QMul doesn't have teh sign code at the end, its sets s correctly at teh start but doesn't neg result if s is not 0
i added

if s
-result

at end of qmul

Couple of other comments, you can speed up atan2 (> 2x faster) with 2 changes
1.

'One-shot Newton-Raphson refinement of angle
c := QMul(COS_Deg(qV),r)              'Approx. Cosine
s := QMul(SIN_Deg(qV),r)              'Approx. Sine
IF (ix =< iy)
  cr := -Qdiv(ix - c, s)
ELSE
  cr := Qdiv(iy - s, c)

The above code does the same thing with 2xmultiplies and 1div rather than 3x divides
2.
The RadtoDeg call in atan2 is only working with small degrees so you can use a qmul rather than the 64bit multiply in radtodeg since you will not get overflow in this case.

Post Edited (Timmoore) : 9/2/2009 4:45:28 AM GMT

cessnapilot · 2009-09-02 18:41

Hi Timmoore,

Thanks for good tip for that Newton-Raphson step. Your version is not only faster, but more accurate, as well. SPIN_TrigPack is running with it now.

The missing 'sign restore' in Qmul is not a bug, or at least a harmless one, as we do not need sign check/restore at all with SPIN * and ** operators. I have left an unnecessary check before the operations. It's only the laziness of the coder (me). The results were good, so I didn't care enough to clean up. So, the Qmul now looks like

PUB Qmul(arg1, arg2) : qV | s, h, l, r
'-------------------------------------------------------------------------
'----------------------------------&#9484;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9488;-------------------------------
'----------------------------------&#9474; Qmul &#9474;-------------------------------
'----------------------------------&#9492;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9496;-------------------------------
'-------------------------------------------------------------------------
''     Action: Multiplies Qs15_16 Fixed-point numbers
'' Parameters: Multiplicand and Multiplier in Qs15_16 Fixed-point format                               
''     Result: Product in Qs15_16 Fixed-point format                   
''+Reads/Uses: - _32K
''             - _MUL, _OVERFLOW                 
''    +Writes: e_orig, e_kind                                    
''      Calls: SPIN_TrigPack_Error
''       Note: - Fixed-point addition and subtraction goes directly with
''               the + - operators of SPIN. 
''             - Intermediate results are in Qs31_32 double precision
''               Fixed-point format in (h, l), notated as dQvalue 
'-------------------------------------------------------------------------
'Multiply (Upper 32-bit) 
h := arg1 ** arg2
 
IF (||h > _32K)          'Check for overflow when ABS(dQvalue)>32K
  e_orig := _MUL
  e_kind := _OVERFLOW
  SPIN_TrigPack_Error    'This will decide what to do: CONT/NOTIFY/ABORT
  RETURN 

'Multiply (Lower 32-bit)
l := arg1 * arg2
 
'Convert Qs31_32 double precision Fixed-point dQvalue in (h, l) into
'Qs15_16 Qvalue number
h := h << 16 
r := (l & (|<15)) >> 15
l := l >> 16
qV :=  h + l + r
 
RETURN qV
'-------------------------------------------------------------------------

·The Qradius(X,Y) procedure goes as follows

PUB Qradius(qX, qY) : qR | hx, lx, hy, ly, h, l, cf, ap
'-------------------------------------------------------------------------
'--------------------------------&#9484;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9488;------------------------------
'--------------------------------&#9474; QRadius &#9474;------------------------------
'--------------------------------&#9492;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9496;------------------------------
'-------------------------------------------------------------------------
''     Action: Calculates Distance of point (X,Y) from the origo  
'' Parameters: X, Y Descartes coordinates in Qs15_16 Fixed-point format                                
''     Result: Distance from the origo in Qs15_16 Fixed-point format                   
''+Reads/Uses: None                   
''    +Writes: None                                    
''      Calls: Add64, Div64
''       Note: QAngle(qX, qY) is encrypted as Deg_ATAN2(qX, qY)
''       ToDo: Swap ap, qR for clarity
'-------------------------------------------------------------------------
hx := qX ** qX
lx := qX * qX
hy := qY ** qY
ly := qY * qY
 
Add64(hx, lx, hy, ly, @dQval)           'LONG dQval[noparse][[/noparse]2] global array
hx := dQval[noparse][[/noparse]0]
lx := dQval[noparse][[/noparse]1]
 
'Check for zero hx
IF (hx == 0)
  RETURN (Qsqr(lx) >> 8)  
ELSE
  'Prepare SQR iteration loop
  ap := (^^hx) << 16
  'Do iteration 3 times
  REPEAT 3
    h := hx
    l := lx
    'Perform 64-bit division
    qR := Div64(hx, lx, ap)
    'Calculate next approximation as ap
    ap := (qR + ap) >> 1
    IF (qR == ap)
      QUIT
 
RETURN ap  
'-------------------------------------------------------------------------

PRI Add64(ah, al, bh, bl, res_) | la, lb 
'-------------------------------------------------------------------------
'---------------------------------&#9484;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9488;-------------------------------
'---------------------------------&#9474; Add64 &#9474;-------------------------------
'---------------------------------&#9492;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9496;-------------------------------
'-------------------------------------------------------------------------
'     Action: Adds 64-bit numbers
' Parameters: - Hi and Lo part of Arg1
'             - Hi and Lo part of Arg2
'             - Addr of 64-bit dQval                                                                     
'     Result: None                                                                    
'+Reads/Uses: None                    
'    +Writes: 64-bit result into LONG[noparse][[/noparse]address]                                    
'      Calls: None
'       Note: Result is passed by reference
'-------------------------------------------------------------------------
LONG[noparse][[/noparse]res_][noparse][[/noparse]0] := ah + bh
LONG[noparse][[/noparse]res_][noparse][[/noparse]1] := al + bl
IF ((al < 0) AND (bl < 0))
  LONG[noparse][[/noparse]res_][noparse][[/noparse]0]++
  RETURN
ELSEIF ((al => 0) AND (bl => 0))
  RETURN
ELSE
  la := al &  $7FFF_FFFF
  lb := bl &  $7FFF_FFFF
  IF ((la + lb) < 0)
    LONG[noparse][[/noparse]res_][noparse][[/noparse]0]++
RETURN
'-------------------------------------------------------------------------

PRI Div64(dh, dl, dr) : qQ | cf
'-------------------------------------------------------------------------
'---------------------------------&#9484;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9488;-------------------------------
'---------------------------------&#9474; Div64 &#9474;-------------------------------
'---------------------------------&#9492;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9496;-------------------------------
'-------------------------------------------------------------------------
'     Action: Divides a 64-bit dQvalue with q 32-bit Qvalue
' Parameters: - Hi and Lo part of dividend
'             - Divisor                                                                                     
'     Result: Quotient in Qs15_16 Fixed-point Qvalue format                                                                    
'+Reads/Uses: None                    
'    +Writes: None                                    
'      Calls: None
'       Note: None
'       ToDo: Some optimization can be done for speed here
'-------------------------------------------------------------------------
qQ~
REPEAT 32
  cf := dh < 0
  dh := (dh << 1) + (dl >> 31)
  dl <<= 1
  qQ <<= 1
  IF ((dh > dr) OR cf)
    ++qQ
    dh -= dr
RETURN qQ    
'-------------------------------------------------------------------------

Qradius·gives for arguments

X = 10000.0
Y = 10000.0

>R = 14142.1356

Note the 9 digits of precision.

If someone has a better way to do 64-bit addition in SPIN, let us know. I feel my· method a little bit clumsy.

SPIN_TrigPack upgrade· on OBEX < 3 days.

Cheers,

Istvan

Timmoore · 2009-09-02 19:20

Thanks I will try it out. The QMul did cause problems because of the || on the args. So the ** and * didn't get the sign. Your new version should work.

Post Edited (Timmoore) : 9/2/2009 7:56:03 PM GMT

cessnapilot · 2009-09-02 20:38

Well, Timmoore, you were right. What you saw was really a bug, a nasty hiding one. We need those sign checks and ||s. I just noticed that small negative fractions will cause overflow in many sections of the code if·those tests·are missing. Do not waste your time to·check that scrap Qmul in my previous post.
Cheers,

Istvan

SPIN_TrigPack upgrade < 2 days

Timmoore · 2009-09-02 21:43

OK, thanks. Another minor bug in QvalToIangle(qV)

s is not set except if qV is -ve, if it happened to be 1 in memory then the result is -ve.

cessnapilot · 2009-09-04 22:59

Hi All,

SPIN_TrigPack upgrade has been placed on OBEX. You can try the first Fixed-point package there. As a bonus this object now contains the first True Random Generator with the Propeller microcontroller using only SPIN. Sequences will repeat themselves only after the end of Times. Thanks to all who helped in the development.

Cheers,

Istvan

Hi to all who need more COGs

Comments