Fast Trigonometry with Spin
Duane Degn
Posts: 10,588
Over in the thread about speeding up Spin, it was mentioned scaled integer math is much faster than floating point math.
If I have basic add, subtract, multiply and divide types of equations, I always use scaled integers myself. When I need to perform trigonometric equations I usually turn to F32 for floating point math.
The HMC5883L compass module links to a program which calculates headings using integer trig functions.
The object's name is "SL32_INTEngine_2."
Here's part of the heading of the object:
I decided to compare the integer trig functions with similar calculations using F32.
My test program generated 80 test points. These test points were laid out in a 20 by 20 square. The corners of the square were -10, 10; 10, 10; 10, -10; -10, -10.
I had the program calculate the distance to these points from the origin. The integer math section was:
The floating point section was:
F32 doesn't include a distance method. Here's the method I wrote to compute the distance.
Integer math won this round. It took the integer math equation about 6ms to compute the 80 distances while the floating point math calculations took about 14ms. Technically, this isn't really a trig problem. I wasn't really surprised integer math won this round.
The first of three trig problems was to compute the angle these points made with the x-axis with respect to the origin.
The integer math method was "polarAngle."
The equivelent F32 method is "ATan2."
The ATan2 method is a very useful method with program a hexapod. The IK calculations use it and it's also used to calculate the direction of travel using the x and y feedback from the joystick.
The integer math method required 59.7ms while F32 was able to compute the 80 angles in 5.4ms.
Not only was F32 able to calculate the angles in less than a tenth of the time as the integer math equations, it also provide a much more precise answer.
While scaling the integers allowed the distance to the points to be accurately calculated, I'm not aware of a way of scaling the integers such as to produce a scaled angle. I think the "polarAngle" method would require extensive rewriting in order to have it return a scaled angle.
Not only were the angles returned from "polarAngle" rounded to the nearest full integer angle, they weren't always rounded correctly (30.964 was rounded to 30).
This lack a precision with angles is something I don't know how to get around when using integer math.
The remaining comparisons are between the calculations used to recompute the x and y coordinates based on the distance and angles previously calculated.
The integer trig object has a nice method for calculating coordinates based on the angle and distance. Here the scaled distance can enhance the accuracy of having to use integer angles.
The same calculation requires to F32 calls. One to compute the cosine of the angle and another to multiply the result by the distance.
The same type of calculations were made to compute the y coordinates, this time using the sine.
Once again, F32 was faster. The integer math calculation used to find the x coordinates took 15.7ms while F32 took 7.9ms. The "find y" times were 14.4ms and 7.9ms with F32 again computing the coordinates the fastest.
In general I think it's safe to say scaled integer math calculations are faster than floating point calculations but based on what I've seen, if there's a lot of trig in the calculations then floating point math appears to be the best option.
I've attached my test program.
Here's the very long output from the program. The output did not occur at the time of the calculations. The numbers were saved to arrays to be displayed at the end of the program.
If there are faster ways of performing trig calculations with integers, I'd like to know about it. I'm particularly interested in a fast and accurate ATan2 method.
The reason I'm so interested in fast trig calculations is so I can increase the capability of my hexapod. (If you haven't done so, make sure and watch the video embedded to post #73.)
(Don't worry, it won't always have puppy dog eyes.)
So far the hexapod moves with its body level to the ground. A level body doesn't require as many calculations to figure out where the feet should be as the calculations required by a body tilted with respect to the ground. I'd like to add the complexity of allowing the robot to walk while tilted but I'm concerned about the time all these calculations will take. My current plan is to use multiple instances of F32 with multiple cogs performing the needed calculations.
If I have basic add, subtract, multiply and divide types of equations, I always use scaled integers myself. When I need to perform trigonometric equations I usually turn to F32 for floating point math.
The HMC5883L compass module links to a program which calculates headings using integer trig functions.
The object's name is "SL32_INTEngine_2."
Here's part of the heading of the object:
// SL32 Integer Engine // // Author: Kwabena W. Agyeman // Updated: 8/28/2010 // Designed For: P8X32A // Version: 1.0
I decided to compare the integer trig functions with similar calculations using F32.
My test program generated 80 test points. These test points were laid out in a 20 by 20 square. The corners of the square were -10, 10; 10, 10; 10, -10; -10, -10.
I had the program calculate the distance to these points from the origin. The integer math section was:
repeat localIndex from 0 to MAX_POINT_INDEX intDistance[localIndex] := Math.polarRadius(y[localIndex], x[localIndex])
The floating point section was:
repeat localIndex from 0 to MAX_POINT_INDEX floatDistance[localIndex] := ComputeDistance2D(fY[localIndex], fX[localIndex])
F32 doesn't include a distance method. Here's the method I wrote to compute the distance.
PUB ComputeDistance2D(fLocalX, fLocalY) result := F32.FSqr(F32.FAdd(F32.FMul(fLocalX, fLocalX), F32.FMul(fLocalY, fLocalY)))
Integer math won this round. It took the integer math equation about 6ms to compute the 80 distances while the floating point math calculations took about 14ms. Technically, this isn't really a trig problem. I wasn't really surprised integer math won this round.
The first of three trig problems was to compute the angle these points made with the x-axis with respect to the origin.
The integer math method was "polarAngle."
repeat localIndex from 0 to MAX_POINT_INDEX intAngle[localIndex] := Math.polarAngle(y[localIndex], x[localIndex])
The equivelent F32 method is "ATan2."
repeat localIndex from 0 to MAX_POINT_INDEX floatAngle[localIndex] := F32.ATan2(fY[localIndex], fX[localIndex])
The ATan2 method is a very useful method with program a hexapod. The IK calculations use it and it's also used to calculate the direction of travel using the x and y feedback from the joystick.
The integer math method required 59.7ms while F32 was able to compute the 80 angles in 5.4ms.
Not only was F32 able to calculate the angles in less than a tenth of the time as the integer math equations, it also provide a much more precise answer.
While scaling the integers allowed the distance to the points to be accurately calculated, I'm not aware of a way of scaling the integers such as to produce a scaled angle. I think the "polarAngle" method would require extensive rewriting in order to have it return a scaled angle.
Not only were the angles returned from "polarAngle" rounded to the nearest full integer angle, they weren't always rounded correctly (30.964 was rounded to 30).
This lack a precision with angles is something I don't know how to get around when using integer math.
The remaining comparisons are between the calculations used to recompute the x and y coordinates based on the distance and angles previously calculated.
The integer trig object has a nice method for calculating coordinates based on the angle and distance. Here the scaled distance can enhance the accuracy of having to use integer angles.
repeat localIndex from 0 to MAX_POINT_INDEX xNew[localIndex] := Math.cos(intAngle[localIndex], intDistance[localIndex])
The same calculation requires to F32 calls. One to compute the cosine of the angle and another to multiply the result by the distance.
repeat localIndex from 0 to MAX_POINT_INDEX fXNew[localIndex] := F32.FMul(floatDistance[localIndex], F32.cos(floatAngle[localIndex]))
The same type of calculations were made to compute the y coordinates, this time using the sine.
Once again, F32 was faster. The integer math calculation used to find the x coordinates took 15.7ms while F32 took 7.9ms. The "find y" times were 14.4ms and 7.9ms with F32 again computing the coordinates the fastest.
In general I think it's safe to say scaled integer math calculations are faster than floating point calculations but based on what I've seen, if there's a lot of trig in the calculations then floating point math appears to be the best option.
I've attached my test program.
Here's the very long output from the program. The output did not occur at the time of the calculations. The numbers were saved to arrays to be displayed at the end of the program.
****** Scroll down to see times. ****** exact | int | int | recalculated |float | float | recalculated (x, y) | dist |angle| integer x, y | dist | angle | float x, y -10, 10|14.142| 135| -9.999, 9.999|14.142| 135.000|-10.000, 10.000 -9, 10|13.453| 132| -8.996, 10.002|13.454| 131.987| -9.000, 10.000 -8, 10|12.806| 129| -8.055, 9.955|12.806| 128.660| -8.000, 10.000 -7, 10|12.206| 125| -6.997, 10.000|12.207| 124.992| -7.000, 10.000 -6, 10|11.661| 121| -6.002, 9.997|11.662| 120.964| -6.000, 10.000 -5, 10|11.180| 117| -5.072, 9.962|11.180| 116.565| -5.000, 10.000 -4, 10|10.770| 112| -4.029, 9.987|10.770| 111.801| -4.000, 10.000 -3, 10|10.440| 107| -3.045, 9.985|10.440| 106.699| -3.000, 10.000 -2, 10|10.198| 102| -2.119, 9.975|10.198| 101.310| -2.000, 10.000 -1, 10|10.049| 96| -1.046, 9.994|10.050| 95.711| -1.000, 10.000 0, 10|10.000| 90| 0.000, 10.000|10.000| 90.000| 0.000, 10.000 1, 10|10.049| 84| 1.054, 9.993|10.050| 84.289| 1.000, 10.000 2, 10|10.198| 78| 2.127, 9.973|10.198| 78.690| 2.000, 10.000 3, 10|10.440| 73| 3.053, 9.983|10.440| 73.301| 3.000, 10.000 4, 10|10.770| 68| 4.037, 9.984|10.770| 68.199| 4.000, 10.000 5, 10|11.180| 63| 5.080, 9.959|11.180| 63.435| 5.000, 10.000 6, 10|11.661| 59| 6.010, 9.992|11.662| 59.036| 6.000, 10.000 7, 10|12.206| 55| 7.005, 9.995|12.207| 55.008| 7.000, 10.000 8, 10|12.806| 51| 8.063, 9.948|12.806| 51.340| 8.000, 10.000 9, 10|13.453| 48| 9.003, 9.995|13.454| 48.013| 9.000, 10.000 10, 10|14.142| 45| 9.999, 9.999|14.142| 45.000| 10.000, 10.000 10, 9|13.453| 42| 10.002, 8.996|13.454| 41.987| 10.000, 9.000 10, 8|12.806| 38| 10.095, 7.878|12.806| 38.660| 10.000, 8.000 10, 7|12.206| 35| 10.000, 6.997|12.207| 34.992| 10.000, 7.000 10, 6|11.661| 30| 10.101, 5.825|11.662| 30.964| 10.000, 6.000 10, 5|11.180| 26| 10.050, 4.896|11.180| 26.565| 10.000, 5.000 10, 4|10.770| 21| 10.057, 3.852|10.770| 21.801| 10.000, 4.000 10, 3|10.440| 16| 10.035, 2.877|10.440| 16.699| 10.000, 3.000 10, 2|10.198| 11| 10.011, 1.943|10.198| 11.310| 10.000, 2.000 10, 1|10.049| 5| 10.011, 0.869|10.050| 5.711| 10.000, 1.000 10, 0|10.000| 0| 10.000, 0.000|10.000| 0.000| 10.000, 0.000 10, -1|10.049| -5| 10.010, -0.869|10.050| -5.711| 10.000, -1.000 10, -2|10.198| -11| 10.009, -1.943|10.198| -11.310| 10.000, -2.000 10, -3|10.440| -16| 10.033, -2.877|10.440| -16.699| 10.000, -3.000 10, -4|10.770| -21| 10.054, -3.852|10.770| -21.801| 10.000, -4.000 10, -5|11.180| -26| 10.047, -4.896|11.180| -26.565| 10.000, -5.000 10, -6|11.661| -30| 10.097, -5.825|11.662| -30.964| 10.000, -6.000 10, -7|12.206| -35| 9.995, -6.997|12.207| -34.992| 10.000, -7.000 10, -8|12.806| -38| 10.089, -7.878|12.806| -38.660| 10.000, -8.000 10, -9|13.453| -42| 9.995, -8.996|13.454| -41.987| 10.000, -9.000 10,-10|14.142| -45| 9.999, -9.999|14.142| -45.000| 10.000,-10.000 9,-10|13.453| -48| 8.996, -9.995|13.454| -48.013| 9.000,-10.000 8,-10|12.806| -51| 8.055, -9.948|12.806| -51.340| 8.000,-10.000 7,-10|12.206| -55| 6.997, -9.995|12.207| -55.008| 7.000,-10.000 6,-10|11.661| -59| 6.002, -9.992|11.662| -59.036| 6.000,-10.000 5,-10|11.180| -63| 5.072, -9.959|11.180| -63.435| 5.000,-10.000 4,-10|10.770| -68| 4.029, -9.984|10.770| -68.199| 4.000,-10.000 3,-10|10.440| -73| 3.045, -9.983|10.440| -73.301| 3.000,-10.000 2,-10|10.198| -78| 2.119, -9.973|10.198| -78.690| 2.000,-10.000 1,-10|10.049| -84| 1.046, -9.993|10.050| -84.289| 1.000,-10.000 0,-10|10.000| -90| 0.000,-10.000|10.000| -90.000| 0.000,-10.000 -1,-10|10.049| -96| -1.046, -9.994|10.050| -95.711| -1.000,-10.000 -2,-10|10.198| -102| -2.119, -9.975|10.198|-101.310| -2.000,-10.000 -3,-10|10.440| -107| -3.045, -9.985|10.440|-106.699| -3.000,-10.000 -4,-10|10.770| -112| -4.029, -9.987|10.770|-111.801| -4.000,-10.000 -5,-10|11.180| -117| -5.072, -9.962|11.180|-116.565| -5.000,-10.000 -6,-10|11.661| -121| -6.002, -9.997|11.662|-120.964| -6.000,-10.000 -7,-10|12.206| -125| -6.997,-10.000|12.207|-124.992| -7.000,-10.000 -8,-10|12.806| -129| -8.055, -9.955|12.806|-128.660| -8.000,-10.000 -9,-10|13.453| -132| -8.996,-10.002|13.454|-131.987| -9.000,-10.000 -10,-10|14.142| -135| -9.999, -9.999|14.142|-135.000|-10.000,-10.000 -10, -9|13.453| -138| -9.995, -9.003|13.454|-138.013|-10.000, -9.000 -10, -8|12.806| -142|-10.089, -7.886|12.806|-141.340|-10.000, -8.000 -10, -7|12.206| -145| -9.995, -7.005|12.207|-145.008|-10.000, -7.000 -10, -6|11.661| -150|-10.097, -5.833|11.662|-149.036|-10.000, -6.000 -10, -5|11.180| -154|-10.047, -4.903|11.180|-153.435|-10.000, -5.000 -10, -4|10.770| -159|-10.054, -3.860|10.770|-158.199|-10.000, -4.000 -10, -3|10.440| -164|-10.033, -2.884|10.440|-163.301|-10.000, -3.000 -10, -2|10.198| -169|-10.009, -1.951|10.198|-168.690|-10.000, -2.000 -10, -1|10.049| -175|-10.010, -0.877|10.050|-174.289|-10.000, -1.000 -10, 0|10.000| 180|-10.000, 0.000|10.000| 180.000|-10.000, 0.000 -10, 1|10.049| 175|-10.010, 0.877|10.050| 174.289|-10.000, 1.000 -10, 2|10.198| 169|-10.009, 1.951|10.198| 168.690|-10.000, 2.000 -10, 3|10.440| 164|-10.033, 2.884|10.440| 163.301|-10.000, 3.000 -10, 4|10.770| 159|-10.054, 3.860|10.770| 158.199|-10.000, 4.000 -10, 5|11.180| 154|-10.047, 4.903|11.180| 153.435|-10.000, 5.000 -10, 6|11.661| 150|-10.097, 5.833|11.662| 149.036|-10.000, 6.000 -10, 7|12.206| 145| -9.995, 7.005|12.207| 145.008|-10.000, 7.000 -10, 8|12.806| 142|-10.089, 7.886|12.806| 141.340|-10.000, 8.000 -10, 9|13.453| 138| -9.995, 9.003|13.454| 138.013|-10.000, 9.000 Integer Math: atan2 time = 59.691 ms, distance time = 6.061 ms, find x time = 15.662 ms, find y time = 14.366 ms Float Math: atan2 time = 5.362 ms, distance time = 13.932 ms, find x time = 7.868 ms, find y time = 7.851 ms
If there are faster ways of performing trig calculations with integers, I'd like to know about it. I'm particularly interested in a fast and accurate ATan2 method.
The reason I'm so interested in fast trig calculations is so I can increase the capability of my hexapod. (If you haven't done so, make sure and watch the video embedded to post #73.)
(Don't worry, it won't always have puppy dog eyes.)
So far the hexapod moves with its body level to the ground. A level body doesn't require as many calculations to figure out where the feet should be as the calculations required by a body tilted with respect to the ground. I'd like to add the complexity of allowing the robot to walk while tilted but I'm concerned about the time all these calculations will take. My current plan is to use multiple instances of F32 with multiple cogs performing the needed calculations.
Comments
Very interesting post. Thanks for sharing.
Bruce
You might give the following atan2 method a try. It's a modification of Chip's cordic algo:
-Phil
Very good. It's scaled to tenths of a degree. It should be useful if fast enough. I'll report back how it compares.
@Bruce. You're welcome.
The extra resolution came at a price. Chip's algorithm was slower than the one Kye used in his integer math object.
The other integer math atan2 method took 59.691 ms to complete the 80 calculations.
Edit: I've attached the new test code to this post.
That doesn't jibe with my experience. When I run the following method, I get a printout of 2210, or 2.2 ms per execution of atan2:
Edit: Okay, upon rereading your post, I'm a bit confused. Are the times for one execution or 80?
-Phil
The times listed are for the full 80 points (calculations).
It does make a bit more sense when the times are displayed under the list of the 80 data points but I should have clarified it better when I used the times without the list of data.
So yes, the execution time for a single atan2 calculation was 2.2ms. A single calculation of F32's atan2 took 0.067ms.
That is a comparison of sweet peppers to jalape
F32 does seem like the best bet unless you absolutely need the cog for something else.
I'm kind of surprised that someone hasn't made a CORDIC fixed point navigation engine in pasm, but then again, I don't really know if it would be much smoother or more efficient. Chip had at one point in time promised CORDIC hardware to be build into the P2, a pet interest of his. I've always been curious how that was or is going to be implemented, but I wait and see.
Knowing my coding skills it is broken for sure, but I attach it.
There are some examples of assembly integer trig functions, but they are probably hard to find..
Massimo
Also look at the "FME" library linked in my signature. It's a spin only floating point library with a full set of trig and exponential functions. It's way slower than F32, but should be fine for less time critical tasks and it frees up a cog.
Both F32 and FME, spend significant amounts of time packing and unpacking floating point numbers. I think it'd be faster and more precise to have them work from a stack of un-packed floating point numbers instead. (i.e. like a RPN calculator) My guess is that a stack/RPN based FME would probably get 2x faster, while F32 would get ~30% faster.
Marty
This has crossed my mind a few times. I remember Jonathan mentioning equation scripting would need to be left out in order to fit F32 into a single cog. I don't use F32's log functions and there are probably other functions I could remove from F32 to make room for my own scripts. Much of F32 is beyond my present understanding my I think I understand it enough to add some sort of scripting feature.
I'm trying to keep all the calculations within the 20ms (50Hz) servo refresh interval. The present IK calculations are fast enough complete at 50Hz but I'm concerned adding the additional calculations required to allow the hexapod to walk while in a tilted attitude will exceed this 20ms interval. I think creating some sort of script in PASM would likely allow the additional calculations to be completed in time. Thanks for the suggestion. I'll probably do this.