Problem with Floating Point Routine

mynet43 · 2007-09-23 00:18

I've been programming a routine that converts from barometric pressure to altitude.

It first converts an adc count to the pressure in kPa and then calculates the altitude in feet from the pressure. The formula used is:

Altitude = (10^(log10(Pressure/101.325) / 5.2558797) - 1.0) / (-6.8755856e-6)

The code seemed to be working fine, but then began to give strange results for some input values.

In particular, it works fine for adc counts of 3352 and 3360 but it gives very bad results for a count of 3356. I've boiled the code down to a fairly small routine that displays all of the intermediate calculations.

I've also attached a spreadsheet that does the same calculations and gets the right answer all the time.

Based on the display, in comparison with the spreadsheet, there appears to be a bug in the Float32 routine Pow(base, exp). I took a quick look at the assembly language code, which I can follow. However, it calls other routines like log( ) and exp( ), so it gets messy to debug.

If possible, would someone please take a look at this and see if you can spot my error or help me get the bug in Pow ( ) fixed.

Here's the test code I used:

{ Test Altitude Calculations -- 9/22/07
  1st and 3rd tests are correct, 2nd is bad.
  See Excel Spreadsheet (AltCalc.exe) for comparison.
}

CON
  _clkmode = xtal1 + pll16x
  _xinfreq = 5_000_000

' VGA and Keyboard info for vga.start
  BasePin  = 16                 ' base pin for video output (0, 8, 16, 24) - 16 on proto board
  
  kbd_dpin = 26                 ' keyboard dpin number - pin 26 on proto board
  kbd_cpin = 27                 ' keyboard cpin number - pin 27 on proto board
  kbd_lock = %0_001_000         ' keyboard locks: disable scroll lock key, NumLock, CapsLock, ScrollLock all off
  kbd_auto = %01_01000          ' keyboard auto repeat: 0.5 sec delay, 15 chars/sec repeat (?)
        
  kbdflag  = 0                  ' set this if a keyboard is connected/needed
  
DAT
'Runtime variables
        bADC            word 0  'barometric adc data
        bAlt            long 0.0'ft
        bPres           long 0.0 'kpa
        Period          long 500           
'Calibration
        PRES_ALT_CONST  long -6.8755856e-6

OBJ

'  TV    :       "VGA_1024"      ' use for my VGA display (ask if you want it) 
  TV    :       "PC_Text"        ' use for generic
'  fl    :       "Float32Full"
  fl    :       "Float32"        ' same results with Float32Full
  fts   :       "FloatString"

PUB Main

  waitcnt(clkfreq*1 + cnt)
                                                                 
'  TV.start(BasePin, kbdflag, kbd_dpin, kbd_cpin, kbd_lock, kbd_auto)   ' for my VGA            
        
  if TV.start(30)
    waitcnt(clkfreq*1 + cnt)
    TV.str(string("TV Cog started."))
    TV.out($0D)
  else
    TV.str(string("TV Cog failed to start!"))
  TV.out($0D)
        

  if fl.Start
    waitcnt(clkfreq*1 + cnt)
    TV.str(string("Floating point Cog started."))
  else
    TV.str(string("Floating point Cog failed to start!"))
  TV.out($0D)

  bADC := 3352              ' set test value for altitude calculation
  do_pres_calcs(Period)

  bADC := 3356              ' set test value for altitude calculation
  do_pres_calcs(Period)

  bADC := 3360              ' set test value for altitude calculation
  do_pres_calcs(Period)

  repeat                    ' keep cog running

Pub do_pres_calcs(timechange) | bfADC,ftimechange
  bfADC:=fl.fFloat(bADC) 
    TV.out($0D)
    TV.out($0D)
    TV.out($0D)
    TV.str(string("ADC: "))

    TV.str(FTS.FloatToString(bfADC))
    TV.str(string(" -> "))

  ftimechange:=fl.fFloat(timechange) 
  'change time from ms to s for calcs
  ftimechange:=fl.fDiv(ftimechange,1000.0) 
   
  'Pressure=((ADC/4096)+0.095)/0.009
  bPres:=fl.fDiv(bfADC,4096.0)
  bPres:=fl.fAdd(bPres,0.095) 
  bPres:=fl.fDiv(bPres,0.009)

    TV.str(string("Pressure = "))
    TV.str(FTS.FloatToString(bPres))
    TV.out($0D)
    TV.out($0D)
    TV.out($0D)

    TV.str(string("Altitude = (10^(log10(Pressure/101.325) / 5.2558797) - 1.0) / (-6.8755856e-6)"))
    TV.out($0D)
    TV.out($0D)

  bAlt:=fl.fDiv(bPres,101.325) 
    TV.str(string("  p/101.325 = "))
    TV.str(FTS.FloatToString(bAlt))
    TV.out($0D)
    TV.out($0D)

  bAlt:=fl.Log10(bAlt)
    TV.str(string("  log10(p/101.325) = "))
    TV.str(FTS.FloatToString(bAlt))
    TV.out($0D)
    TV.out($0D)

  bAlt:=fl.fDiv(bAlt,5.2558797) 
    TV.str(string("  log10( )/5.2558797 = "))
    TV.str(FTS.FloatToString(bAlt))
    TV.out($0D)
    TV.out($0D)

  bAlt:=fl.Pow(10.0,bAlt) 
    TV.str(string("  10^(log10( )/5.2558797) = "))
    TV.str(FTS.FloatToString(bAlt))
    TV.out($0D)
    TV.out($0D)

  bAlt:=fl.fSub(bAlt,1.0) 
    TV.str(string("  10^(  ) - 1.0 = "))
    TV.str(FTS.FloatToString(bAlt))
    TV.out($0D)
    TV.out($0D)

  bAlt:=fl.fDiv(bAlt,-6.8755856e-6) 'alt in feet
    TV.str(string("Altitude = (10^( )-1.0)/(-6.8755856e-6) = "))
    TV.str(FTS.FloatToString(bAlt))
    TV.out($0D)
    TV.out($0D)

Thanks for your help.

Jim

Tracy Allen · 2007-09-23 08:32

Did you happen to try it also with exp10(bAlt) instead of pow(10.0, bAlt)? The asm code is streamlined for exp10(), because it doesn't have to compute the log of the base and do other gyrations. That might narrow it down a bit.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Tracy Allen
www.emesystems.com

deSilva · 2007-09-23 10:58

Mynet,
sorry to answer that late... There are not many here interested in Floating Points, though....

(A) Note that the precision is generally more limited than we people - used to EXCEL's double precision arithmetic - think:
5 digits is the limit! That can give causes to many numerical instabilities.

(B) The exponential functiom (or EXP10 or POWER) uses the build-in look-up tables to interpolate the results.
What you now have detected is a constant error involved of the (constant) order of 0.006 which is 100 times larger that the general rounding errors.

I have prepared a plot showing the typical behaviour. I am sure one can definitely do better, but have not yet looked deeper into it.

Edit: I expanded the plot a little bit..
Note that the results are precise for x > 0.00103

Post Edited (deSilva) : 9/23/2007 11:31:06 AM GMT

Ale · 2007-09-23 11:19

That seems to show the limits of the built-in table/algorithm. A workaround could be to scale the argument of log/exp so they are not that small. You can also always use better precision calculating via an expanded polynomial series exp. The poly is quite straightforward, do not forget to use more than 10 digits to get half of them exact.
Chapter 20 of "Algorithms for programmers" @ www.jjj.de/fxt/#fxtbook have a nice overview

. BCD implementations also exist (sadly I am not that far with nine [noparse]:)[/noparse]

deSilva · 2007-09-23 12:10

No limits! There is just a hidden bug

The interpolation works fine for larger values (> 0.00103). As the e-function is somewhat "linear" in this range, it should even work better...

mynet43 · 2007-09-23 14:23

Wow! Thanks everyone for the great responses!

deSilva,

I love your chart! The three values I had going into the function were: 1.3e-4, 2.2e-4 and 3.1e-4. For the second one, I got an answer of -532 ft. instead of the correct answer of -73 ft.

If I plot these on your chart, it's obvious I was just lucky to get any of them right. The 1st and third just happen to be where the function intersects the excel results and the third one wasn't so lucky, it's about half way up the error ladder. It could have been worse [noparse]:([/noparse]

Actually, what I'm showing is just the tip of the iceberg. There are actually bigger errors at other adc counts. I was trying to keep the example simple.

If I were doing it, I'd make the interpolation at least as accurate as the math. The speed doesn't help if the answer's really wrong!

Would someone like to take a look at fixing this? I'd really appreciate it. Otherwise, I may do it when I get a little free time. It can't be that hard [noparse]:)[/noparse]

Thanks again for the help!

Jim

P.S. Does anyone know if Parallax (or the author) is aware of this issue?

Tracy Allen · 2007-09-23 19:01

This is very strange. Cam put a note in the v1.3 file (1-Apr-07) about a fix in the sinus interpolation code, but no mention of the log and exp interpolations. Those have their own routines. I'll email Cam a question about it. I can sort of follow the interpolation code, but 'm lost in the packing and unpacking.

I have to believe the empirical result in deSilva's graph. It came right from the Prop output as it scanned through the x values, right? Must be a bug. One thing that strikes me about it is that it doesn't look like a proper interpolation error. By that I mean that the values in the ROM table are accurate to 16 bits at both end points of the interpolation, so the error should come within 1/65536 = 0.000015 at those two end points, and the error curve should take on a form close to a parabola in between similar to the attached (theoretical) graph. Instead it looks like a sawtooth. That would be characteristic of picking values out of a course table, and using them without interpolation. The error increases quasi-linear up until it hits the next value picked from the table. What is more, the error on the sawtooth graph is roughly 1/2048 = 0.005, 11 bits. Interpolation on 2^x with x in {0,1} should be much better than that, 14 or 15 bits I should think.

This is floating point, and the main part of the calculation is done on the fractional part of the exponent, after it is converted to a power of 2. So the interpolation is always done on the interval {0,1}, and the exponent is tracked separately. The accuracy should scale to both small and large values.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Tracy Allen
www.emesystems.com

Fred Hawkins · 2007-09-23 21:55

An alternative formula Z = --26216 x ln( P(kPa) / 101.304 ). From http://www.tfs.net/~petek/rockets/altimeter/begin.html

Head of his pages regarding 'cheap altimeter': http://www.tfs.net/~petek/rockets/altimeter/

Basil · 2007-09-23 22:27

Fred Hawkins said...
An alternative formula Z = --26216 x ln( P(kPa) / 101.304 ). From http://www.tfs.net/~petek/rockets/altimeter/begin.html

Head of his pages regarding 'cheap altimeter': http://www.tfs.net/~petek/rockets/altimeter/

Hi Guys,

Here is a little graph showing the relationship between the various Altitude calcs.

Pink line =(10^(LOG(Pres/101.325)/5.2558797)-1)/(-6.8755856*(10^-6))
Blue line (hidden behind pink line =((1-(Pres/101.325)^0.19026)*288.15)/0.00198122
Yellow line = -26216*LN(Pres/101.304 )

The pink and blue lines are obviously the most accurate, but even the blue line uses POW which seems to cause troubles too?

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
-Alec

My our page

deSilva · 2007-09-23 22:29

@Tracy
I gave up debugging after half an hour

It looks as if the factor used for the interpolation - derived from the "remainder" - is much to high for small values (around line 770 in LoadTable)

Yes, I copied the values directly from a very simple program (attached)

Fred Hawkins · 2007-09-23 23:19

Alec,
What happens when you plug the 101.325 into the yellow line formula?
Fred

Basil · 2007-09-23 23:37

Fred Hawkins said...
Alec,
What happens when you plug the 101.325 into the yellow line formula?
Fred

You get this [noparse]:)[/noparse] A little better, but still a 5000ft off at higher alts!

EDIT: Ok, so you cant tell its better by the graphs, but the data looks closer [noparse]:D[/noparse]

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
-Alec

My our page

Post Edited (Basil) : 9/23/2007 11:57:15 PM GMT

Fred Hawkins · 2007-09-23 23:44

Yup, internet + lump of salt = good to go.

mynet43 · 2007-09-24 03:43

OK, I started this mess, so let me add to it [noparse]:)[/noparse]

Out of frustration, I took it upon myself to try to debug Float32.

My conclusion: There's something very strange going on it that module.

Here's what I did:

1. I came to the same conclusion as deSilva, that the slope of the correction in the routine 'loadTable' was much too high. So I decided to add an instruction 'shr t1, #1' to reduce the product resulting in the correction for the interpolation.

When I compiled this, I immediately got the error 'Origin exceeds $1F0'. I assume this means it's out of memory space, which seems strange since I only added one instruction. When I removed the 'shr t1, #1', it compiled and worked just like before.

2. I then tried to make some room in the routine, by commenting out all but the first and last instructions in the '_Tan' routine, which isn't called by anyone for this test. I then ran it again with only this change.

Believe it or not, this time I got a different set of errors. In this run, the first altitude was bad and the second and third values were correct. Originally, the first and third were correct and the second was wrong. In addition, at least part of the error this time was caused by a bad answer from the log10( ) routine: 9.6e-4 instead of 6.8e-4.

Both of these conditions are completely repeatable and reversible when I change the code back.

I hesitated a long time before posting something. I don't mind making occasional stupid mistakes but this sounds unbelievable.

So I thought, what will these guys say is causing the problem. Aha, the most obvious thing is that I have bad memory or something like that peculiar to my board.

Fortunately, I have three completely different propeller boards, so I tried it on all three, including a Parallax demo board.

Same results on all three boards, very repeatable. The only difference between what I posted above and what I'm running is that I'm using the VGA output.

At this point, I can't decide whether to keep looking or to take up heavy drinking.

I'd really like to hear from the author or Parallax.

If someone would try my code it would be really appreciated, to see if I made some terrible error. The Excel file in the same post gives the right answer every time.

Thanks everyone.

Jim

Fred Hawkins · 2007-09-24 13:05

Has the float to string routine been ruled out? (In other words the arithmetic is ok, but its visual representation is flawed)

Wiki suggests a possible legitimate source of the error:

IEEE-754-compliant hardware allows one to set the rounding mode to any of the following:

round to nearest (the default; by far the most common mode)
round up (toward +∞; negative results round toward zero)
round down (toward −∞; negative results round away from zero)
round toward zero (sometimes called "chop" mode; it is similar to the common behavior of float-to-integer conversions, which convert −3.9 to −3)

In the default rounding mode the IEEE 754 standard mandates the round-to-nearest behavior described above for all fundamental algebraic operations, including square root. ("Library" functions such as cosine and log are not mandated.) This means that IEEE-compliant hardware's behavior is completely determined in all 32 or 64 bits.· http://en.wikipedia.org/wiki/Floating_point

____

·And

why does nobody use BCD arithmetic ?

By vincent himpe
Posted Sunday 13th August 2006 14:56·GMT

The IEEE floating point format beeing limited in precision is a well known problem.

if you need absolute precision : use BCD format . Every cpu in the world that is not half braindead can perform BCD calculations in hardware. And on the occasion that your RISC cpu can not do it in hardware there are software libraries to do it.

a number is stored in a compressed string. 2 digits per byte. ( one nibble per number )

0000 to 1001 represent numbers 0 to 9 ( direct bit translation )

1111 represents the E ( exponent

1110 represents the . ( decimal point )

1101 represents the - (sign bit)

1101 0001 : 0xD1

1001 1110 : 0x9E

0000 0001 : 0x01

1111 1101 : 0xFD

1001 1001 : 0x99

1001 1001 : 0x99

1001 1001 : 0x99

would represent -19.01E-999999 with only 7 bytes of storage.

Try stuffing that one in IEEE format.

conversion from text to BCD and back is a snap and can be done with maybe 30 lines of assembly code .

Such a simple well known system that is even implemented in hardware on most computers. Yet virtually no compilers support it.

http://www.regdeveloper.co.uk/2006/08/12/floating_point_approximation/comments/#c_351

Post Edited (Fred Hawkins) : 9/24/2007 1:49:05 PM GMT

mynet43 · 2007-09-24 14:39

I no longer believe it's a precision problem.

The mantissa of the IEEE numbers used is 23 bits plus one implied leading '1' bit, which gives 24 usable bits.

If you divide this into groups of 4, it translates into 6 hex digits, which is more than 6 decimal digit accuracy.

If I'm correct, then this is enough for these calculations.

I'd really like to hear from the author. Has someone contacted Cam?

Some of his bit shifting is hard to follow when you try to compare it to the anti-log description on p. 422 of the Propeller manual.

Has anyone else tried to figure this out? Help!

Jim

mynet43 · 2007-09-25 15:14

I just now heard back from Cam Thompson, the author. He was away for a couple of days.

He's looked at the post and acknowledged that there seems to be an error in the interpolation routine.

He said he would get back to me and post a note here when he's looked at the code.

Thanks for all the interest.

Jim

Mike Green · 2007-09-25 15:33

Fred,
BCD floating point, in the absence of specialized support hardware, is much slower to implement than binary floating point. There's no particular benefit in terms of numeric accuracy. When the result of calculations is provided for human consumption, decimal arithmetic looks better because we're used to seeing decimal values that stop at a particular point rather than continue as infinite series. We also work with exact quantities rather than approximations. The same problems show up with decimal floating point as will binary floating point, just with different values and slightly different circumstances. For example, successively dividing 1 by powers of 2 comes out exactly in a short binary floating point representation, but requires increasing lengths of strings of decimal digits to represent exactly in decimal.

Tracy Allen · 2007-09-25 17:15

Jim,

I heard back from Cam, too. Let me put in a plug for the uFPU, which is a highly evolved product that Cam developed through his company, Micromega, and also sells through Parallax. It has a lot of features that would be good for your project. For example, you can program in a macro mode for a series of computations as you have in your program, so you could just pass it the ADC reading and out would pop the altitude. I know there might be good reasons for doing it all on the Prop, but there you have it as a very viable alternative.

Note that the Prop routine in FLOAT32 for sinus, log and exp will not give you 24 bits. The tables in HUB rom are given to 16 bits, and that is the limit of what you can expect from interpolation. And I emphasize again that the 16 bits is the calculation of the mantissa. The percentage of error should scale the entire range of floating point. Nevertheless, given all the approximations involved in the altitude formula, 16 bits should be totally adequate.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Tracy Allen
www.emesystems.com

IAN STROME · 2007-09-25 17:55

Yep,
looked the specs for this floating point machine,looks good
over an SPI interface bit slow over I2C there was an old thread
about interfacing a 68882 yes it can be done but not very feasible

:-Have you ever noticed wrong numbers are never engaged.

mynet43 · 2007-09-25 18:26

Tracy,

Good point about the 16 bits in the table, actually it's 17 with the implied 1 for the 1st bit.

Like you said, it should still give good results. 17 bits holds over 131,000 which is still more than 5 digits accuracy.

Jim

deSilva · 2007-09-25 19:06

@mynet:
You loose a little bit of accuracy through the interpolation.. And 100,000 ist not really MUCH more than 5 digits

@Tracy: I checked the uFPU some weeks ago, but did not much as the raw performance is limited: The Propeller is faster...
I think a posted a small note...

On the other hand it has a remarkable set of features, the code for which will never fit into 8 COGs

On the third hand you can buy 2 Props for its price and do your own programming

camt · 2007-09-25 21:43

Hi Everyone,

The exp10 problem Jim encountered took a little while to find, but ended up being a simple typo. As some of you discovered, once you started to add debug code the behavior changed. It turned out to be a classic assembler error. In the loadTable routine used by the log and exp functions, two assembler instructions had an errant # before a variable name, so the address was being used instead of the value. I've enclosed a new version of Float32 in the attached zip file. I've also enclosed an Excel spreadsheet comparing Float32 and Excel values for Exp10. The lines on the chart now completely overlay each other, in fact, the difference first occurs in the 6th digit after the decimal point. Jim, let me know once you've tested this version and I'll also post it to the object exchange.

I'd also like to respond to some other things that came up in the thread:

1) The accuracy of a 32-bit floating point number is slightly more than 7 digits.

2) The uM-FPU V3.1 chip is faster than the Propeller in most applications. If you could run all your floating point code in assembler, in a cog, it would be as fast as user-defined functions on the uM-FPU, but since most floating point application code is at the Spin level, the uM-FPU V3.1 chip is faster. There's also built-in support for GPS parsing, FFT operations, matrix operations, and other advanced features. If you have an application that requires lots of floating point, it's worth a look at the uM-FPU. If you only need to do a small amount of floating point, the Propeller code should do the job.

3) The bug spurred a lot of accuracy concerns, much of which was based on the inaccurate results caused by this bug. I think most of the concerns will go away once you re-evaluate with the correct data produced by the new version.

Regards,
Cam

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Cam Thompson
Micromega Corporation

Tracy Allen · 2007-09-25 21:56

Cam, thanks very much for your efforts on this.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Tracy Allen
www.emesystems.com

deSilva · 2007-09-25 22:04

Thank you Cam!
When I compared the uFPU v3 to the Propeller I was thinking of using more than one COG for arithmetic

I implemented a small matrix multiplication with four COGs but is was not worth it... I changed back to "fixed point" arithmetic...
I fully appreciate the amount of features in the uFPU - I think I was quite clear about that.
There is even a two channel A/D converter

mynet43 · 2007-09-26 00:11

Cam,

I ran quite a few tests. From -102 ft to 30,000+ ft altitude.

Tracy, how about running your sawtooth test, just to be sure.

I haven't been able to make it fail! It's always within one ft of the excel results for the values I've tried.

Good job!

Thanks so much for the quick response. As far as I'm concerned, you can post it as an update.

Jim

Basil · 2007-09-26 00:27

Agreed [noparse]:)[/noparse]

I ran it through my program too and its spot on [noparse]:)[/noparse]

Thanks Cam

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
-Alec

My our page

mynet43 · 2007-09-26 01:39

Minor misstatement. I said it was Tracy's sawtooth program. Actually it was deSilva's.

Sorry about that.

Jim

deSilva · 2007-09-26 06:08

No need to check - the issue is obvious from "prima vista" at the code

Tracy Allen · 2007-09-26 16:15

Yes, indeed, there is no longer need to run deSilva's program to check out of concerns for accuracy. It looks spot on.

I have been curious though about the values that are stored in the hub rom tables. I wrote the little spin object attached in order to dump the tables to my desktop so as to look at the values in Excel.

A xls sheet for the exponential is attached. There are 2048 word integer entries dumped from the table from $D000 to $DFFE, and next to each is the same function 2^x as calculated by Excel (times the 65536 scale factor), and the error. One question that I laid to rest was whether the hub rom tables are truncated or rounded. They are rounded, so that each 16 bit integer is either above or below the true value and within 1/2 lsb. The sheet also does an integer interpolation at the halfway point and compares that with the excel precise value at that point. It assumes that the interpolation is done at more than 16 bits so that it can represent the 0.5. The error in that case is still bounded within +- 1/2 lsb. One interesting thing that I just realized is that the error at the end points of the interpolation is often on opposite sides of the true curve, so the interpolated value intersects the true curve and the error becomes very small. That is, with 24 bit accuracy in the interpolation, the curve can at points come within 1/2 lsb of 1/(2^24). The error curve for interpolation, while not a sawtooth, is not a series of regular parabolas either. It is a smooth curve that follows the numerics.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Tracy Allen
www.emesystems.com

mynet43 · 2007-09-26 20:17

Tracy,

Thanks for doing this. I was going to do something similar but hadn't got around to it.

Good info and confirmation of the accuracy.

Interesting about the interpolated points. It's almost like they planned it that way [noparse]:)[/noparse]

Jim

Problem with Floating Point Routine

Comments