Floating point math and the propeller

bprager · 2010-10-24 12:39

I'm looking at an application that needs to do a lot of high-precision trig calculations.
I know there are math objects that will help with this, but I was also wondering about the FPU chip that is offered in the Parallax store.

Does this chip help any with the propeller, or is it only useful with the stamp chipset?

And any suggestions as to the best way to use the propeller for this kind of application?

Ignoring the FPU issue, my initial thought was to dedicate a propeller chip to handling only the trig calculations to maximize available code space and speed.

Any thoughts on this would be gratefully accepted.

Thanks!

Mike Green · 2010-10-24 12:50

You can use an external FPU with the Propeller, but it takes time to transfer the data back and forth and the Propeller can do its own floating point quite quickly. An addition or subtraction takes less than 5us. Multiplication takes less than 11us and division takes less than 14us. The external FPU that Parallax carries has a minimum time for transferring a single byte of almost 2us. That makes the Prop and the FPU very comparable in speed depending on the mix of operations you need to do.

Remember that both the FPU and the Prop's FP library are limited to single precision floating point. You're limited to about 7 significant digits. If you need greater precision, you may have to use some kind of fixed point (scaled) arithmetic and you may need to use CORDIC algorithms for your transcendentals. I think there's a CORDIC object in the Object Exchange that you might look at.

bprager · 2010-10-24 13:00

Thanks Mike. The calcs in question are primarily inverse trig functions: Atan, Asin, etc.

These are to translate coordinates between different coordinate systems.

There is of course, some basic addition/division etc, however, the bulk of the calcs are trig, with the results of those being added or multiplied together at the end.

Also, there are some calcs doing basic multiplication, but the numbers need to be held at high precision due to the use of large multipliers in parts of the formulas. If I would round, that rounding would cause a significant answer change.

So between the inverse trig and the precision calcs, the question popped up: would the use of the FPU gain us speed, precision, or additional code space?

It sounds like speed would be similar, what are your thoughts on precision and/or code space?

Thanks!

I wasn't sure if the FPU would either speed up the use of inverse trig functions, or offload the propeller enough to be worth using.

Mike Green · 2010-10-24 13:06

You'll have to read the documentation for the FPU and the Prop floating point library. They have information about the speed and precision of the various operations provided. That's what I'd have to do to answer your question. The Prop assembly FP library includes about 4K of routines that get loaded into 2 cogs, then are not needed again. That space can be used for other things after initializing the cogs. It's not convenient or automatic, but can be done if you're tight on space.

localroger · 2010-10-24 13:16

bprager, the trig tables in the ROM are supplied to simplify doing those calculations without floating point math. By combining linear interpolation with the ROM trig tables it's possible to get cartesian-polar conversions accurate to better than 1 part in 10,000 in most cases with purely integer math. There are very few real world applications (OK, GPS is one) that need better accuracy than that.

In my experience, floating point math is almost never really necessary with proper scaling and an understanding of the ** (return high order long of integer multiply) function. In fact, integer math with add and subtract 2's complement wraparound works much better than float for angular functions when you normalize 1 360-degree rotation to 2^32 bits.
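The wraparound idea above can be sketched as follows. This is an illustration in Python rather than Spin, and the helper names (deg_to_bam, bam_add, bam_diff) are made up for the example. One full turn is mapped to 2^32 counts, so a 32-bit add or subtract wraps at exactly 360 degrees and one count is roughly 0.0003 arcsecond.

```python
# Angles as 32-bit "binary angles": one full turn = 2**32 counts.
# Masking with MASK32 imitates the wraparound of a 32-bit register.
MASK32 = 0xFFFFFFFF
TURN = 1 << 32

def deg_to_bam(deg):
    """Convert degrees to a 32-bit binary angle (wraps modulo one turn)."""
    return round(deg / 360.0 * TURN) & MASK32

def bam_to_deg(bam):
    """Convert a binary angle back to degrees in the range 0 <= d < 360."""
    return (bam & MASK32) * 360.0 / TURN

def bam_add(a, b):
    """Add two angles; overflow past one turn simply wraps."""
    return (a + b) & MASK32

def bam_diff(a, b):
    """Signed shortest difference a - b, in counts, in [-2**31, 2**31)."""
    d = (a - b) & MASK32
    return d - TURN if d >= 0x80000000 else d

# 350 degrees plus 20 degrees wraps to about 10 degrees, with no range check.
heading = bam_add(deg_to_bam(350.0), deg_to_bam(20.0))
print(bam_to_deg(heading))  # close to 10.0
```

On the Propeller itself the masking is unnecessary, since a 32-bit long wraps on its own; it appears here only because Python integers do not overflow.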

bprager · 2010-10-24 13:27

My accuracy needs are +- 1 arcsec accuracy in the trig functions which is close to your 1 part in 10,000, though it might be enough.

This is a new application, where the prior was implemented in a chip where FP math and trig functions were easily available as was space, so there was no issue.

I haven't had an opportunity to use the ** function and will have to investigate it to see how we might be able to use it to yield both the accuracy and the speed we need, while keeping our space requirements to a minimum.

My understanding of the trig tables in ROM, is that they do not support the inverse functions, for those you have to load the trig math object.
I have used linear interpolation before and in fact, one of our routines relies on it in order to calculate positions to the required accuracy from speciality tables.

I will have to investigate your suggestion of the ** function along with wraparound to see if we can bypass much if not all the need for FP math.

Thank you for the suggestions.

Bruce

max72 · 2010-10-24 14:20

I second the suggestion of combining integer math with cordic.
I use this coupled with planar cartesian projection and it works well.
I started writing a page on the propeller wikispaces, but I had no time to finish the page yet.
On obex there are other integer implementations of navigation routines.
It takes a little bit to move from float to integer (because it sounds odd), but at the end it is worth the effort.
Massimo

localroger · 2010-10-24 14:46

bprager, you can do inverse functions with the trig tables by implementing a binary search. (Start at 1/2, test high or low, go 1/2 in that direction, etc. until you have homed in on your value). Not as fast as direct lookup but still faster than FP math.
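A minimal sketch of that binary search, in Python for brevity. The table here is a stand-in generated with math.sin, not the ROM table itself; its size (2049 entries covering 0 to 90 degrees) and 16-bit scaling are assumptions made for the illustration, and asin_index is a hypothetical name.

```python
import math

# Stand-in for a quarter-wave sine table: STEPS + 1 entries for 0..90 degrees,
# values scaled so that sin = 1.0 maps to 0xFFFF (assumed layout).
STEPS = 2048
TABLE = [round(math.sin(i * math.pi / 2 / STEPS) * 0xFFFF)
         for i in range(STEPS + 1)]

def asin_index(value):
    """Largest table index whose entry is <= value (0..0xFFFF), by bisection."""
    lo, hi = 0, STEPS
    while lo < hi:
        mid = (lo + hi + 1) // 2      # round up so the loop always makes progress
        if TABLE[mid] <= value:
            lo = mid                  # target is at mid or above
        else:
            hi = mid - 1              # target is below mid
    return lo

# Inverse sine of about 0.5: the index converts to an angle near 30 degrees.
idx = asin_index(0x8000)
print(idx * 90.0 / STEPS)
```

The search takes about 11 probes for a 2049-entry table. Without interpolation the result is only as fine as one table step, which is 90/2048 of a degree here.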

The value of ** is that it lets you multiply a number by a fraction between 0 and 1 (scaled from 0 to 2^32) very finely, with a single integer multiply and no shifts. If you are using an assembly language multiply (I use the one from the PASM interpreter) you'll get this result automatically along with the lower 32 bits. Since the Prop has no hardware multiply you still have a multiply routine, but you avoid all the shifting and positioning and normalization of float math so it's still much faster.
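To illustrate the multiply-high idea, here is a short Python sketch. It models the operation as an unsigned 32 x 32 multiply that keeps the upper 32 bits of the 64-bit product; whether a particular multiply routine treats its operands as signed is worth checking before relying on fractions above one half. The names mul_high and frac32 are hypothetical.

```python
MASK32 = 0xFFFFFFFF

def mul_high(a, b):
    """Upper 32 bits of the unsigned 64-bit product a * b."""
    return ((a & MASK32) * (b & MASK32)) >> 32

def frac32(x):
    """Encode a fraction 0 <= x < 1 as a 32-bit multiplier (x * 2**32)."""
    return int(x * (1 << 32)) & MASK32

# The constant is computed once, ahead of time; at run time, scaling a value
# by 0.3 is then a single integer multiply with no shifts or normalization.
SCALE = frac32(0.3)
scaled = mul_high(1_000_000, SCALE)
print(scaled)  # within one count of 300000 (the result is truncated, not rounded)
```

Because the result is truncated, it can land one count below the exact answer; adding half a count before the multiply is one way to round instead.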

bprager · 2011-01-12 16:50

Sorry for the delay in responding, medical issues intervened.
Could you possibly give me an example of how I would do an inverse sine using the method you recommend?
If this is outside the scope of this forum, you could reply to bprager@att.net and we could take this offline.

I truly appreciate your time and knowledge and thank you very much.

Tracy Allen · 2011-01-12 22:44

Thanks for picking up the thread where you left off. Welcome back, and good health!

I'm attaching a demo of binary inverse search in the hub sine table, to find the inverse sine. The demo does not implement the interpolation that localroger mentioned, but that is not hard to add in order to improve angular resolution.

Heater. · 2011-01-12 22:50

bprager,

Should you decide you still need to use some floating point I should mention that there is now a new floating point object by Lonesock in the object exchange. This version puts all the functions of the older version into a single COG and speeds them up a bit as well. I think it's called "F32".

Ray0665 · 2011-01-13 07:46

I have used the um-FPU V3.1 chip quite successfully on a robot project.
Attached is a Spin file I used while learning about this chip. It needs two other objects that can be found in the OBEX.

bprager · 2011-01-13 08:03

Wow! You are all fantastic. I ask a question and I get every solution I could want! Thanks to all of you, you gave me a lot of info I have to digest, but I think you also gave me my answer (and then some!). I was wondering about the FPU having used such chips before, but with the prop, it was a bit of an unknown as to if it really helped or not. Looks like I'll have to try each of the ideas presented to see which matches best to my calcs.
Heater - thanks for the OBEX reference. I didn't know about the newer object and COGS are at a bit of a premium so that helps a lot.

Tracy, I assume the interpolation I would use would be the standard math interpolation formula for estimating between data points. I am using that now to interpolate between two hourly values to derive the value at the minute requested.
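For reference, the standard two-point formula is y = y0 + (y1 - y0) * (x - x0) / (x1 - x0). Applied to a sine table with integer math only, it might look like the sketch below. The table is a Python stand-in built with math.sin (2049 entries for 0 to 90 degrees, 16-bit values, both assumed), and FRAC_BITS and sin_interp are names invented for the example.

```python
import math

# Stand-in quarter-wave sine table (assumed layout: 2049 entries, 16-bit values).
STEPS = 2048
TABLE = [round(math.sin(i * math.pi / 2 / STEPS) * 0xFFFF)
         for i in range(STEPS + 1)]

FRAC_BITS = 8   # extra angle bits below one table step (arbitrary choice)

def sin_interp(angle):
    """Sine for 0..90 degrees; angle is in units of 1/256 of a table step."""
    idx = angle >> FRAC_BITS                  # which table entry
    frac = angle & ((1 << FRAC_BITS) - 1)     # position between idx and idx + 1
    if idx >= STEPS:
        return TABLE[STEPS]                   # clamp at 90 degrees
    y0, y1 = TABLE[idx], TABLE[idx + 1]
    # y0 + (y1 - y0) * frac / 256, done with an integer multiply and shift
    return y0 + (((y1 - y0) * frac) >> FRAC_BITS)

# 30 degrees expressed in the finer angle units; the result is near 0.5 * 0xFFFF.
angle_30 = round(30.0 / 90.0 * STEPS * (1 << FRAC_BITS))
print(sin_interp(angle_30))
```

The same formula runs in the other direction for an inverse lookup: once a search has bracketed the target value between two entries, the fractional angle is (value - y0) / (y1 - y0) of one table step.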

Again, thanks to all for a ton of information. I hope I can contribute back.

Bruce

Tracy Allen · 2011-01-13 09:01

With regard to the µFPU from micromega, it is like a programmable calculator, in that you can enter long series of calculations in advance into its flash or eeprom memory, up to 64 different functions. At run time, you load the necessary registers and then call the desired function; the chip crunches it and returns the result(s). That can be a real time saver if you have substantial functions, say, calculation of sunrise/sunset or moonrise as a function of date, latitude and longitude, or things like evapo-transpiration that are complicated formulas based on temperature, humidity, windspeed, radiation and cloud cover. The µFPU has its own IDE that helps to create the chained functions. Of course, to use it you have to learn it, but then again, you would have a lot of work cut out to implement those formulas on the Prop. If your project involves a GPS, it can be attached directly to the serial port on the µFPU, which can parse the NMEA sentences directly into internal variables that can be used in calculation. The trig functions on the µFPU will be true 24 bit accurate in the mantissa. The Prop sine table is 11 bit accurate in angle and 16 bit accurate in value returned, and while linear interpolation might be good for most purposes, it will fall down in stiff calculations (small differences in denominators). Better trig accuracy on the Prop can be had with CORDIC. I'd say that for real heavy lifting, the µFPU would still be a contender.

By the way, I don't work for micromega, and like others, I do 99.9% of my math in the integer domain. After the BASIC Stamp, the Prop's 32 bit signed integers are close to infinity.

bprager · 2011-01-13 09:23

Actually, I am doing a lot of astronomical calcs so the FPU might be exactly what is needed. Some of the calcs get really involved and also require significant precision due to the size of the numbers.
You mention loading it at runtime. Can the FPU be "loaded" dynamically, i.e., the user wants to know a location, so I get the current coordinates, load the required calcs into the FPU and get back the answer?
That would be ideal for what we want to do, as sunrise/sunset calcs etc are effectively what we are doing, in some cases, even more complex (planetary coordinates based on current time etc)

Thanks for this info, I had never heard of this chip and in the future, we intend to add a GPS system so it might match up well.

Bruce

Tracy Allen · 2011-01-13 11:11

The FPU has both flash and eeprom. Formulas stored in the flash memory need to be thought out in advance and burned into the chip, whereas formulas stored in eeprom can be created at run time. So, things like the Naval Observatory standard formulas could go in flash, and ad hoc formulas in eeprom. The micromega IDE makes it relatively painless to enter formulas in a familiar syntax. Check out the application notes on the micromega web site. By the way, Cam Thomson, the principal guy at micromega, wrote the original FP library for the propeller.

lonesock · 2011-01-13 12:00

Are there specs anywhere for the numerical accuracy of the uFPU? I didn't see anything in the data sheet. I'd also be interested in seeing the algorithms used, especially for the trig functions. On the propeller, Cam implemented the forward sin/cos using the table lookup and interpolation, and the arc* stuff used some pretty horrendous math (2 sqrts, a 6th-order polynomial approximation, and the ATan2 used a division to convert to a standard ATan). In F32 the arc* functions now use a CORDIC ATan2 routine, resulting in much better accuracy.
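For readers unfamiliar with the technique, a generic integer CORDIC arctangent can be sketched as below. This is a textbook vectoring-mode loop written in Python, not the code used in F32 or the µFPU; the iteration count, the 32-bit binary-angle output format and the name cordic_atan2 are all choices made for the illustration.

```python
import math

ITER = 28
# atan(2**-i) for each step, as 32-bit binary angles (2**32 counts per turn).
# On a microcontroller this table would be precomputed constants.
ATAN_TABLE = [round(math.atan(2.0 ** -i) / (2 * math.pi) * (1 << 32))
              for i in range(ITER)]

def cordic_atan2(y, x):
    """Angle of the vector (x, y) as a binary angle in 0..2**32 - 1.

    Uses only shifts, adds and a small table. Inputs should be scaled up
    (for example to about 2**28) so the shifted terms keep useful bits;
    on 32-bit hardware they also need headroom for the CORDIC gain (~1.65).
    """
    angle = 0
    if x < 0:                      # rotate half a turn into the right half-plane
        x, y = -x, -y
        angle = 1 << 31
    for i in range(ITER):
        if y > 0:                  # vector above the x axis: rotate it down
            x, y = x + (y >> i), y - (x >> i)
            angle += ATAN_TABLE[i]
        else:                      # vector on or below the x axis: rotate it up
            x, y = x - (y >> i), y + (x >> i)
            angle -= ATAN_TABLE[i]
    return angle & 0xFFFFFFFF

# A vector at 45 degrees.
print(cordic_atan2(1 << 28, 1 << 28) * 360.0 / (1 << 32))  # close to 45
```

Each iteration adds roughly one bit of angular resolution, so the iteration count trades speed against accuracy. The vector length grows by a fixed gain during the loop, which does not affect the angle.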

Jonathan

Heater. · 2011-01-13 12:23

bprager,

... FPU might be exactly what is needed... and also require significant precision due to the size of the numbers

This does not exactly make sense.

Working with floating point allows you to have huge numbers of course but the precision is limited to the 24 bits of mantissa. Straight 32 bit ints limits your number range to "only" plus/minus 2 billion but then you have a superior precision of 32 bits. So working with suitably scaled ints may give a better precision over the required range than floats.
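The 24-bit limit is easy to see by rounding values through single precision. The sketch below uses Python's struct module to do that; the Julian-date style number is a hypothetical example of a large astronomical quantity, not a value taken from this thread, and to_float32 is a helper written for the illustration.

```python
import struct

def to_float32(x):
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

# Above 2**24, single precision can no longer represent every integer.
print(to_float32(16_777_217.0))   # 16777216.0

# A day count around 2.4 million with a fractional part: at this magnitude
# adjacent single-precision values are 0.25 apart, so the fraction is coarse.
print(to_float32(2_455_494.3))    # 2455494.25

# Scaled-integer alternative: a full circle in thousandths of an arcsecond
# still fits in a signed 32-bit long, with every count exactly representable.
FULL_CIRCLE_MAS = 360 * 3600 * 1000
print(FULL_CIRCLE_MAS, FULL_CIRCLE_MAS < 2**31)   # 1296000000 True
```

The spacing between adjacent single-precision values doubles at every power of two, so the loss shows up in exactly the large-magnitude terms where it hurts most.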

Of course using a ready made float system, hardware or software, allows you to get on with development more quickly. One just needs to be aware of its limitations.

Sounds like you have an interesting project on your hands.
