Fast, Faster, Fastest Code: SPIN is COG'd up and TRIGgered
cessnapilot
Posts: 182
Hi All,
Sometimes it's enough, that the code can be written, quickly and conveniently. The title of this thread comes with this context. To waste 2 COGs for a bloody Atan2 in an application, where there are other COG hungry objects crowding, is not good. Besides, we have finely granulated sine and cosine tables in the main ROM, that can be accessed easily from SPIN, and we paid for them, to say the least. SIN, COS, ATAN2, LOG and EXP can be done in SPIN and they can be coupled with a high-precision number format. A small library of usually 'Enough and Necessary' SPIN trig functions, based upon the ROM tables and on some effective integer trig algorithms is about to be ready for Obex. This small collection of fixed-point baseMath(+-*/), trig functions and conversion utilities will be a good platform to test and launch (if not busted) other SPIN based integer code for trig and exponential functions. This post shows simple conversion utilities, in pursue for better ideas, of course, or to help the reader in the programming in SPIN. This new object is based on ROM tables. However, the access to these tables and the utilization of the result in integer arithmetic, is hampered by the different data formats, that you have to use in a mixed manner.
iAngle
======
The interval of (Pi/2) is stretched over 2K registers, so Pi equals 4046 in 'ROM register address' units. I shall call these as iAngle units (for integer Angle).·For other angles in degrees you get the iAngle equivalent as
iAngle := (Degrees * 1024) / 45
The resolution of these iAngle values is good enough for most robot or for many, common-sense designed navigation applications. The sine, cosine values are returned from the ROM tables in 17-bit·signed·format, where -1 is -65536 and +1 is 65535. Another format again, namely the Qs16.
This Qs16 format is not precise enough to deal with coordinates efficiently. When the arena of events, or a journey, is so large, that hundreds of miles can be spanned within, the precision of Qs16 format will underperform those of the GPS. For the general calculations, we should use the full 32-bit register length of the Propeller, as we paid for every bit of it, too. The 32-bit Qs16_17-bit fixed-point has enough precision to deal with coordinates about an inch accuracy over hundreds of miles. From now on, I will refer to this format as iValue. Qs16 fits exatly into the 'fractional' part of Qs15_16, without loosing a single bit of precision.
OK. Stop. How many formats do we have until now?
LONG······ of SPIN
iAngle····· for ROM tables
Qs16····· ·for trig outcomes
Qs15_16 ·iValue for precise calculations··· (proposed)
(the cryptic formats of the Log/Antilog tables worth of mentioning here, too)
There are too many formats, to handle them separately in each case, when they come up in a SPIN program. And I am proposing a new one.(?) (You might think, that cessnapilot apparently lost his mind.) I found, however, that transforming other values to/from the precise iValues, with small utilities, coding becames easier, and you can access those ROM table resources conveniently.
It seems a good idea to use the precise iValue format for angels in degrees, too. However, you got into trouble soon when you use this format for angles.· To convert, for example, 111.55 deg as an iValue ($6F_8CCC) to iAngle in SPIN, you will bump into the 32-bit limit. According to Chuck Yeager, -never wait for trouble- , or at least prepare for them (with conversion utilities). When you just calculate
(111*1024)/45 = 2525,
or, with a better approximation
(112*1024)/45 = 2548
you are far from the correct ROM address. So, the brute-force integer arithmetic does loose precision. You may use a new format for the degrees (100x, or so), but that is again a plus load on the clarity of code. The following Ia_Conv_IvD2Ia conversion routine, however, will give you the correct 2538 iAngle value.
An here is the back transformation
····
Repeating them back and forth many tines, you get back the same values, so rounding errors does not integrate here. With these conversions the iAngle format can be hidden in the trig routines. With function calls like
result [noparse][[/noparse]iValue] := Sin_Deg(angle [noparse][[/noparse]iValue] in degs) or
result [noparse][[/noparse]iValue] := Sin_Rad(angle [noparse][[/noparse]iValue] in rads),
formulas can be easily programmed, and the main lines of the program will be more readable then messing around with scale factors in the main code. To convert a LONG data pair into iValue goes as
The package·contains other iValue conversions, including to/from ASCII ones, too, that are trickier than LONG data handling.
Ideas, better codes, or just a single Obex ID of an existing fixed-point full-featured SPIN trig object, are welcome. Thanks.
Cheers,
Istvan
Sometimes it's enough, that the code can be written, quickly and conveniently. The title of this thread comes with this context. To waste 2 COGs for a bloody Atan2 in an application, where there are other COG hungry objects crowding, is not good. Besides, we have finely granulated sine and cosine tables in the main ROM, that can be accessed easily from SPIN, and we paid for them, to say the least. SIN, COS, ATAN2, LOG and EXP can be done in SPIN and they can be coupled with a high-precision number format. A small library of usually 'Enough and Necessary' SPIN trig functions, based upon the ROM tables and on some effective integer trig algorithms is about to be ready for Obex. This small collection of fixed-point baseMath(+-*/), trig functions and conversion utilities will be a good platform to test and launch (if not busted) other SPIN based integer code for trig and exponential functions. This post shows simple conversion utilities, in pursue for better ideas, of course, or to help the reader in the programming in SPIN. This new object is based on ROM tables. However, the access to these tables and the utilization of the result in integer arithmetic, is hampered by the different data formats, that you have to use in a mixed manner.
iAngle
======
The interval of (Pi/2) is stretched over 2K registers, so Pi equals 4046 in 'ROM register address' units. I shall call these as iAngle units (for integer Angle).·For other angles in degrees you get the iAngle equivalent as
iAngle := (Degrees * 1024) / 45
The resolution of these iAngle values is good enough for most robot or for many, common-sense designed navigation applications. The sine, cosine values are returned from the ROM tables in 17-bit·signed·format, where -1 is -65536 and +1 is 65535. Another format again, namely the Qs16.
This Qs16 format is not precise enough to deal with coordinates efficiently. When the arena of events, or a journey, is so large, that hundreds of miles can be spanned within, the precision of Qs16 format will underperform those of the GPS. For the general calculations, we should use the full 32-bit register length of the Propeller, as we paid for every bit of it, too. The 32-bit Qs16_17-bit fixed-point has enough precision to deal with coordinates about an inch accuracy over hundreds of miles. From now on, I will refer to this format as iValue. Qs16 fits exatly into the 'fractional' part of Qs15_16, without loosing a single bit of precision.
OK. Stop. How many formats do we have until now?
LONG······ of SPIN
iAngle····· for ROM tables
Qs16····· ·for trig outcomes
Qs15_16 ·iValue for precise calculations··· (proposed)
(the cryptic formats of the Log/Antilog tables worth of mentioning here, too)
There are too many formats, to handle them separately in each case, when they come up in a SPIN program. And I am proposing a new one.(?) (You might think, that cessnapilot apparently lost his mind.) I found, however, that transforming other values to/from the precise iValues, with small utilities, coding becames easier, and you can access those ROM table resources conveniently.
It seems a good idea to use the precise iValue format for angels in degrees, too. However, you got into trouble soon when you use this format for angles.· To convert, for example, 111.55 deg as an iValue ($6F_8CCC) to iAngle in SPIN, you will bump into the 32-bit limit. According to Chuck Yeager, -never wait for trouble- , or at least prepare for them (with conversion utilities). When you just calculate
(111*1024)/45 = 2525,
or, with a better approximation
(112*1024)/45 = 2548
you are far from the correct ROM address. So, the brute-force integer arithmetic does loose precision. You may use a new format for the degrees (100x, or so), but that is again a plus load on the clarity of code. The following Ia_Conv_IvD2Ia conversion routine, however, will give you the correct 2538 iAngle value.
PUB Ia_Conv_IvD2Ia(iV) | s, ip, fp, ia 'This converts iValue [noparse][[/noparse]deg] to iAngle. It takes care of fractions s:= FALSE IF (iV < 0) -iV s := TRUE 'Check magnitude of input to be < 1440 degrees IF (iV=>_94388224) e_orig := _IA_C_IVD2IA e_kind := _OVERFLOW SPIN_TrigPack_Error 'Or correct it with repeated subt. of 360*64K RETURN 'Scale up integer part of iValue. ip := iV >> 7 'This includes now a hefty part of the orig. fraction 'Multiply up this scaled-up integer part ia := ip * 2912 'Get fraction of the product fp := ia & $0000_FFFF 'Calculate integer part of the product(this is iAngle 1st approx.) ia := ia >> 16 'Round iAngle IF (fp => $8000) ia := ia + 1 'Set sign IF (s==TRUE) -ia RETURN ia
An here is the back transformation
PUB Iv_Conv_Ia2IvD(iA) | s, ip, fp, iv 'This converts iAngle into iValue [noparse][[/noparse]deg] s:= FALSE IF (iA < 0) -iA s := TRUE 'Multiply back and shift + roundup iv := ((iA * 45) << 6) + 1440 IF s -iv RETURN iv
····
Repeating them back and forth many tines, you get back the same values, so rounding errors does not integrate here. With these conversions the iAngle format can be hidden in the trig routines. With function calls like
result [noparse][[/noparse]iValue] := Sin_Deg(angle [noparse][[/noparse]iValue] in degs) or
result [noparse][[/noparse]iValue] := Sin_Rad(angle [noparse][[/noparse]iValue] in rads),
formulas can be easily programmed, and the main lines of the program will be more readable then messing around with scale factors in the main code. To convert a LONG data pair into iValue goes as
PUB Iv_Conv_IpFp2Iv(intP, fracP) | s, iv 'This merges separate integer+fraction LONG registers into iValue register s:= FALSE 'Check sign IF (intP < 0) -intP s := TRUE 'Check input and signal error IF (intP>_32K) OR (fracP>_64K) e_orig := _IV_C_IPFP2IV e_kind := _OVERFLOW SPIN_TrigPack_Error RETURN 'Shift integer part and add fractional part iv := (intP << 16) + fracP 'Restore negative sign IF s -iv RETURN iv
The package·contains other iValue conversions, including to/from ASCII ones, too, that are trickier than LONG data handling.
Ideas, better codes, or just a single Obex ID of an existing fixed-point full-featured SPIN trig object, are welcome. Thanks.
Cheers,
Istvan
Comments
Here is the Obex quality integer ATAN2 from the SPIN_TrigPack driver. Its average precision is about 0.1 degrees. This is more than enough for most robotic or simple navigation applications. This precision can be expanded to 10^(-5) degrees with a Newton-Raphson iteration step (not shown here), to satisfy higher demands.·
As you can see, the code does not use ROM tables, but the SPIN_TrigPack uses them heavily. The preliminary code for SIN, COS and TAN shows, how smoothly the Qs16 format of the ROM table output connects to the Qs15_16 iValues.
A 100 dollar bill goes to the first applicant, who can make a simpler ATAN2 approximation (fewer parameters) or at least a twice as fast with the same precision on the random circle, as the now presented. All features of SPIN's integer math (except PASM codes or directly driven FPU ...) are accepted. For example rollout CORDIC may be competitive. (Oops, I might lose that note.) Deadline is 01.10.2009.
Cheers,
Istvan
Post Edited (cessnapilot) : 8/11/2009 11:23:12 PM GMT
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Quicker answers in the #propeller chat channel on freenode.net. Don't know squat about IRC? Download Pigin! So easy a caveman could do it...
http://folding.stanford.edu/ - Donating some CPU/GPU downtime just might lead to a cure for cancer! My team stats.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Quicker answers in the #propeller chat channel on freenode.net. Don't know squat about IRC? Download Pigin! So easy a caveman could do it...
http://folding.stanford.edu/ - Donating some CPU/GPU downtime just might lead to a cure for cancer! My team stats.
Tan, then was pretty much then a one liner.
You shouldn't need to worry about the cases of tan breaking since spin will just return 0 if dividing by 0.
As for Atan2. I was thinking that you could take the x and y points that woulkd be produced by using the cos and sin functions (scaled by a radius) and then using a bit of flags to remember the quadrant you could fold the x and y points back onto of the sin table.
You then could preform a fast lookup since the values are ordered and return the index which would be the degree. Then you could un mirror the lookuped up value using he flags and you would be done.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Nyamekye,
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Nyamekye,
That's not math, that's experience. Somewhere an ATC recorder captured the following conversation between a captain and the co-pilot
-Captain. How is that you never make· mistakes?
-That's experience.
-And how did you get that experience?
-By making mistakes.
That's it.
Why do you need a radius to get a point on the unit circle?
I think I should worry about, when an algorithm tries to divide by zero. Never wait for trouble, you know. I read on the internet, that Chuck Norris can divide by zero and the result is the last digit of Pi. I don't know that digit, and I do not dare to ask. So, I use a large number instead as default.
You know, at the end of a long day of programming, everything boils down to performance. Time is a very prestigious measure when you fix the range of precision,· you can always take the CNT difference from before and after a procedure. And then let's time decide. Clean, fair, useful.
As for ATAN2, I have two (almost) identical 100 dollar bills in my banknote collection, so one of it I can give away, if you see what I mean. The final version of ATAN2, of course, does not contain such lines as 90*_64K, or that prescaling is done with precalculated x*y, etc.... The code has only 5 or 6 multiplications,·2 divisions. With the Qs15_16 numbers, coordinates from 1 tenth of an inch to thousand of miles can be handled with tenth of a degree avr. precision. This, or a better code, will be usefull for others, too.
Well, it's past 9 in the evening, so I had better get going to bed in my pyjamas with Chuck on it.
Cheers and Good night,
Istvan
Post Edited (cessnapilot) : 8/11/2009 11:05:51 PM GMT
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Nyamekye,
Massimo
I really like your clean coding style, and I know that you can be easily offended. So, I have just a small note to that: You use far too many comments, according to my taste.
Have a nice Summer Holiday! And a last advice before that: Explaining programming, you can put girls easily to asleep.
Cheers,
Istvan
ATAN2, by definition, does not. It won't be difficult, as the x*x and y*y products are calculated inside, and you have to add them up, backshift with sh, to obtain the R squared. But, you loose precision that way, since they dissipate zeroes and ones by shifting left, then you fill back only zeroes.
However, when you calculate, and store the products (for later use) from the original inputs(x*x-->x2, y*y-->y2) and give back the sum by reference, you only make 1 addition and 2 shifts more, than the original code. That addition had to done elsewhere, so the real burden is only 2 shifts. But, you spared 2 multiplications with this! So, the full answer is : It does not, but it should.
I recommend for this modification to give back R squared (the sum), as there are algorithms that use only that, instead of R. When the calling code needs R let it calculate that, presumably for a global variable. So, something like a 'De-luxe'··
ATAN2(X,Y,@r2):angle
can do what you need. It is a good idea, so trivial, that I didn't even noticed that. Speeds up things a lot at critical points. Thanks for that.
Cheers,
Istvan
i have posted YAATAN2 (yet another atan2()) in SPIN and assembler at
http://forums.parallax.com/forums/default.aspx?f=25&m=385233&p=1
its using cordic algorithm, and at least in SPIN its a quite clean code, i think...
Regards
Markus Greim