# Floating point bugs in Spin2

ersmith (Posts: **5,193**) in Propeller 2

Besides the misparsing of "-1.0" as "-(1.0)" (which could be argued is a feature, but it's certainly a surprise) there is also a pervasive confusion in Spin2 between NaN (not a number) and infinity.

The IEEE encoding specifies that infinity is encoded as $7f80_0000, and that numbers $7f80_0001 - $7fff_ffff are NaN. Spin2 seems only to have a single NaN ($7fff_ffff) and also returns this for many cases that should yield infinity.

The following program illustrates some of these problems:

    con
      _clkfreq = 180_000_000
      debug_baud = 2_000_000

    pub main() | minus_zero, bignum, x, y, inf1, real_inf
      minus_zero := $8000_0000
      bignum := $7f7f_ffff ' largest representable IEEE single
      real_inf := $7f80_0000 ' IEEE infinity
      inf1 := bignum /. 0.0
      DEBUG("float test")
      if inf1 <> real_inf
        DEBUG("ERROR:", fdec(inf1), " and ", fdec(real_inf), " should both be infinity")
      x := bignum *. 2.0
      if x <> real_inf
        DEBUG("ERROR: ", fdec_(bignum), " *. 2.0 should be infinite but got ", fdec(x), " : ", uhex_(x))
      x := 1.0 /. real_inf
      if x <>. 0.0
        DEBUG("ERROR: 1.0 /. infinity should give 0.0 got ", fdec(x), " : ", uhex_(x))
      DEBUG("done test")
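For reference, the encoding rule described above (all-ones exponent, zero fraction for infinity, non-zero fraction for NaN) can be sketched on raw 32-bit patterns. This is a minimal host-side illustration in Python, not Spin2 code; the function name `classify_single` is made up for the example.

```python
def classify_single(bits: int) -> str:
    """Classify a 32-bit IEEE 754 single-precision bit pattern."""
    exponent = (bits >> 23) & 0xFF   # 8-bit biased exponent
    fraction = bits & 0x7F_FFFF      # 23-bit fraction field
    sign = "-" if (bits >> 31) & 1 else "+"
    if exponent == 0xFF:
        # All-ones exponent: infinity if the fraction is zero, NaN otherwise.
        return f"{sign}infinity" if fraction == 0 else "NaN"
    if exponent == 0:
        return f"{sign}zero" if fraction == 0 else f"{sign}subnormal"
    return f"{sign}normal"


for pattern in (0x7F7F_FFFF, 0x7F80_0000, 0x7F80_0001, 0x7FFF_FFFF):
    print(f"${pattern:08x}: {classify_single(pattern)}")
```

Under this rule $7fff_ffff is only one of many NaN patterns, and $7f80_0000 is not a NaN at all.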

## Comments

709Please keep in mind we all ask because we care. I made it a goal to try out the spin 2 floating point over the holiday and it was a pretty major bugbear. And I got some pretty confusing feedback yesterday.

It's a situation of "everything you know is wrong". Handling of floating point really doesn't have to be unique, because the uniqueness isn't solving a particular problem.

I'm making an effort to tone down my passionate language. Hopefully Parallax will do the right thing.

13,690Well, please tell me what the right thing is, because I don't know. I'm kind of dense. You won't hurt my feelings.

I see that infinity could be handled, but what else is wrong?

709Var2 := Var2+.-.210.0 just makes my head explode.

Or gosh, Var1-.-.1.0

Where the const or literal type is known, the compiler should remember it.

If I type 120 or -33 or similar, it's an integer.

If I type 12.0 or 6e3 or -5.2e3, it can be inferred this is a float.

The compiler can remember this. Or how about we just have a yes/no decision: this is or isn't an integer.

It's understood that var1 + var2 is going to be indeterminate. But the compiler should be just smart enough to see 10.0 + 120.0 and say, wait a minute, why would you want to do that? (Float, integer op, float)

Or even 0.0 - 120.0. Why would you want to do that? Or var1 + 12.8, again why would you want to do that? (Assume integer, integer add, float)

Or even with a const, var1 + const1, could be okay if const1 is an integer, and you just have to trust the person and let it go.

But var1 + const2 is most definitely wrong if const2 is known to be a float because it was directly set from an expression that provided a float.

As it stands, it's very easy to trip up and get a NaN that cascades down the calculations.

I can mentally process needing to do +. and /. as shorthand operations.

But it's incomprehensible from an outside observer why a negative floating point number on a literal has to need an awkward extra dot. Come up for air with fresh eyes and see this from a high level.

Option 2,

Maybe just remove the floating point negation entirely, and require that -. be strictly a subtraction operation that requires two operands?

Compiler would still require the intelligence to handle the - as either an integer subtraction, integer negation, or part of a negative literal (integer or float).

The answer to "why doesn't var1 := -.var2 compile?" could be simple to support.

Well, do this:

var1 := 0.0 -. var2

5,193@whicker I wouldn't go too far overboard -- trying to implicitly add types leads down quite a rabbit hole. But I definitely agree that "-1.0" should be parsed not as an operator and a number but as a single (negative) floating point number. The only case that's a bit problematic is a string like "-.5": is this a number (the same as "-0.5") or an operator "-." applied to "5"? I'd favor the former interpretation, but throwing an error would work too. The user can always type "-0.5" for the number, and "-. 5" (with a space after the operator) to get the negative float operator applied to the integer 5 (hard to imagine a use case for this though).
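To make the three cases concrete, here is a rough sketch of a literal reader that folds a leading "-" into the number, treats "-.5" as the float -0.5, and only sees the "-." operator when no digit follows the dot. The regular expressions and the name `read_literal` are hypothetical; this is not how the Spin2 compiler is actually written, and it ignores the question of whether the "-" is a binary subtraction in context.

```python
import re

# Hypothetical token patterns, for illustration only.
FLOAT_RE = re.compile(
    r"-?(\d[\d_]*)?\.\d[\d_]*(e[+-]?\d+)?|-?\d[\d_]*e[+-]?\d+",
    re.IGNORECASE,
)
INT_RE = re.compile(r"-?\d[\d_]*")


def read_literal(text: str):
    """Return (kind, value) for the token at the start of `text`, or None."""
    match = FLOAT_RE.match(text)
    if match:
        # The sign is part of the literal, so "-1.0" is one negative float.
        return ("float", float(match.group(0).replace("_", "")))
    if text.startswith("-."):
        # No digit directly after the dot: this is the float-negate operator.
        return ("operator", "-.")
    match = INT_RE.match(text)
    if match:
        return ("int", int(match.group(0).replace("_", "")))
    return None


for source in ("-1.0", "-.5", "-. 5", "-33", "6e3"):
    print(repr(source), "->", read_literal(source))
```

Throwing an error for "-.5" instead would be a one-line change in the first branch.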

5,193For completeness while I'm reporting float bugs, besides the NaN / Infinity problems, there are also:

(1) some rounding errors (e.g. $4000_0000 +. $3400_0000 gives $4000_0001, but is supposed to give $4000_0000 according to IEEE)

(2) FSQRT($8000_0000) is supposed to return 0 according to IEEE (that's an odd one, so probably not high priority).
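Both expected values can be cross-checked on a host with IEEE 754 floats. The sketch below uses Python's `struct` module to move between bit patterns and values; it assumes the host converts double to single with the default round-to-nearest-even mode, which is the usual case. The helper names are made up for the example.

```python
import math
import struct


def bits_to_float(bits: int) -> float:
    """Interpret a 32-bit pattern as an IEEE single."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def float_to_bits(value: float) -> int:
    """Round a Python float (a double) to single and return its bit pattern."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


# (1) 2.0 + 2**-23 is exact in double precision and lies exactly halfway
# between two singles, so the tie goes to the even mantissa.
total = bits_to_float(0x4000_0000) + bits_to_float(0x3400_0000)
print(hex(float_to_bits(total)))   # 0x40000000

# (2) The square root of -0.0 is -0.0, which compares equal to 0.
root = math.sqrt(bits_to_float(0x8000_0000))
print(hex(float_to_bits(root)))    # 0x80000000
```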

13,690Thanks, Eric and Whicker.

I am going to make negative float constants require just '-', instead of '-.'.

I'll address the rounding and infinity issues, too. Also, that FSQRT($8000_0000) problem will get fixed.

13,690Neither the compiler nor the Spin2 interpreter were handling FSQRT(-0.0) properly. The compiler was also miscomputing FSQRT(0.0). All those bugs are now fixed.

The next thing I will do is address the rounding issue. I need to find the rule. I believe that you only add a half-lsb when the power-of-two exponent is odd? Is that right? I will search the forum.

5,193The rule is to round the mantissa to the nearest even value. In theory this means you compute the infinitely precise result, and if the result is exactly between two possible floating point numbers you pick the one with the even mantissa, otherwise just round to the nearest one.

In practice you only have to keep a few extra bits of precision, as long as you have a "sticky" bit which is the logical OR of all the bits to the right of where the values you're actually using end (hence the name: as bits are shifted right they logically stick in the sticky bit). There's some discussion of it at https://stackoverflow.com/questions/8981913/how-to-perform-round-to-even-with-floating-point-numbers ; the original question is a bit off, but the answers correct it. I think the IEEE standard itself also describes things.
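A minimal sketch of that rule on an integer mantissa, with the first dropped bit acting as the "half" (guard) bit and everything below it ORed into the sticky bit. The function name and argument layout are assumptions for illustration; a real implementation would also have to renormalize if the increment carries out of the mantissa.

```python
def round_nearest_even(value: int, extra_bits: int) -> int:
    """Drop `extra_bits` low bits of `value`, rounding to nearest, ties to even."""
    if extra_bits == 0:
        return value
    kept = value >> extra_bits
    guard = (value >> (extra_bits - 1)) & 1                # first dropped bit
    sticky = (value & ((1 << (extra_bits - 1)) - 1)) != 0  # OR of the rest
    if guard and (sticky or (kept & 1)):
        # More than half, or exactly half with an odd kept value: round up.
        kept += 1
    return kept


# 2.0 + 2**-23 from the bug report: 24-bit mantissa $800000 plus one extra
# bit holding exactly half an LSB. The tie goes to the even mantissa.
print(hex(round_nearest_even((0x80_0000 << 1) | 1, 1)))   # 0x800000
```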

13,690Ok, so this is how it would go at the LSBs of the mantissa?

0.5000000 becomes 0

1.5000000 becomes 2

2.5000000 becomes 2

3.5000000 becomes 4

4.5000000 becomes 4
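The table can be checked quickly on a host: Python 3's built-in `round()` happens to use the same round-half-to-even rule for exact ties, so it reproduces the five rows above. This only illustrates the rule; it says nothing about the Spin2 implementation.

```python
# Python 3's round() sends exact halfway cases to the nearest even integer.
halfway_values = [0.5, 1.5, 2.5, 3.5, 4.5]
rounded = [round(v) for v in halfway_values]

for value, result in zip(halfway_values, rounded):
    print(f"{value} becomes {result}")
```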

12,230Chip,

Yes, that's correct as I understand it.

I worked out a method for division using the Cordic that seems pretty good. My method doesn't calculate the extra mantissa bits as recommended in the linked questions. See https://web.archive.org/web/20130124022505/http://pages.cs.wisc.edu/~cs354-1/cs354/karen.notes/flpt.apprec.html

Or more accurately, I wanted to keep all 32 bits of the Cordic result.

What I've done is use the pipeline to give me two answers and use the result of one to decide if the other should be used. See https://forums.parallax.com/discussion/comment/1529194/#Comment_1529194

Of course this is specific to division. I've not considered other cases. I guess square-root is one ...

13,690That's neat, Evan. Can you please remind me, again, why this is important to do? I know it has to do with result integrity, but I can't remember what the issue was, exactly.

13,690I've got both the compiler and interpreter properly rounding mantissas now, so that $4000_0000 +. $3400_0000 gives $4000_0000.

Next, I guess I will address the infinity vs NaN issue.

Thanks, Evan, for posting that link to the floating-point page. Good information there:

1/0 = infinity

any positive value / 0 = positive infinity

any negative value / 0 = negative infinity

infinity * x = infinity

1/infinity = 0

0/0 = NaN

0 * infinity = NaN

infinity * infinity = NaN

infinity - infinity = NaN

infinity/infinity = NaN
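These rules are easy to try on a host with IEEE 754 floats. The sketch below uses Python doubles; the special-value behavior is the same as for singles. Two caveats: Python raises `ZeroDivisionError` for float division by zero instead of returning an infinity, so those rows are left out, and on an IEEE host infinity * infinity comes out as infinity rather than NaN, which differs from the list as quoted.

```python
import math

inf = math.inf

results = {
    "inf * 2": inf * 2.0,
    "1 / inf": 1.0 / inf,
    "inf * inf": inf * inf,
    "0 * inf": 0.0 * inf,
    "inf - inf": inf - inf,
    "inf / inf": inf / inf,
}

for expression, value in results.items():
    print(f"{expression} = {value}")
```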

12,230Eric explained it a few posts before my solution. It's one way, as selected by IEEE, to evenly spread the rounding outcome of 0.5 results, evening out the statistical distribution.

13,690I'm thinking this infinity stuff is nonsense. If your numbers run out of range, you've lost all determinacy. These theoretical infinity rules can be applied that make egghead sense, but in the real world, I think they are completely useless. NaN responses are the only thing that makes practical sense.

I think NaN should be represented by $7FFF_FFFF and $FFFF_FFFF. All other values are valid numbers.

12,230Hehe, yep. Eric mostly wants this for consistency between the compilers but maybe also for trust and integrity for presenting to those new arrivals that will assume/expect full IEEE-754 compliance.

Of course, that then immediately leads to calls for double floats. Every new grad wants all the shiny bits.

5,193It is in fact useful to be able to distinguish between NaN (your answer makes no sense, e.g. 0/0) and Infinity (your answer overflowed).

It's also nice to stick to the standard. If you don't want IEEE floats, then perhaps we should do something radically different. But for interchange with other micros (if nothing else) the IEEE format is useful. And in the IEEE format $7f80_0000 is +infinity, and $7f80_0001 - $7fff_ffff are NaN.

Having 23 bits available in the lower part of the NaN representation also makes for some very interesting possibilities. For example, you can store a pointer to a string saying how the NaN arose ("division by 0", or "sqrt of negative number"). Some interpreters also use IEEE NaN's constructively: they store all objects as IEEE floats, and use NaN's for things like strings, objects, and such that are, well, not a number, with the bottom bits being used to distinguish between them.
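A rough sketch of the "cause of death" idea on raw single-precision patterns. It assumes the common convention of keeping the top fraction bit set as the quiet-NaN marker, which leaves 22 of the 23 bits for the payload. The cause codes and all names are invented for the example.

```python
QUIET_NAN = 0x7FC0_0000      # exponent all ones, top fraction bit set
PAYLOAD_MASK = 0x003F_FFFF   # the remaining 22 fraction bits

# Hypothetical cause codes; a real system might store a string pointer instead.
CAUSES = {1: "division by 0", 2: "sqrt of negative number"}


def make_nan(payload: int) -> int:
    """Build a quiet NaN bit pattern carrying `payload` in its low bits."""
    if not 0 <= payload <= PAYLOAD_MASK:
        raise ValueError("payload does not fit in 22 bits")
    return QUIET_NAN | payload


def is_nan(bits: int) -> bool:
    """True for any pattern with an all-ones exponent and non-zero fraction."""
    return (bits & 0x7F80_0000) == 0x7F80_0000 and (bits & 0x007F_FFFF) != 0


def nan_payload(bits: int) -> int:
    """Extract the payload stored by make_nan()."""
    return bits & PAYLOAD_MASK


tagged = make_nan(2)
print(hex(tagged), is_nan(tagged), CAUSES[nan_payload(tagged)])
```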

13,690Oh, that's a neat idea, storing pointers and cause of death.

I think an Infinity is just one operation away from becoming a NaN. Recording what went wrong is useful, but having Infinity able to resolve back to 0 is reckless.

5,193Infinities are sticky -- infinity + x = infinity, infinity * x = infinity, and so on. I don't think there's any way to get back to 0 from an infinity. I don't know all the details, but I do know the IEEE standard was designed by professionals in numerical analysis, so I'm sure there are good reasons for all of the design choices.

709...I think we're being a bit theoretical again.

To be clear, ersmith's last paragraph (edit: about the 23 bits of a NaN) isn't part of the standard, other than it being allowed.

As I'm reading the IEEE-754 standard (it's only 70 pages) it's surprising what isn't in it.

However, section 7 does ask that five exceptions be signaled when they arise:

Invalid Operation

Division by Zero

Overflow

Underflow

Inexact

And it is a bit freeing to know how to handle division by zero:

The "Division by Zero" exception should be signaled if and only if an exact infinite result is defined for an operation on finite operands. (in other words don't worry about the edge case weird stuff).

"For division, when the divisor is zero and the dividend is a finite non-zero number, the sign of the infinity is the exclusive OR of the operands' signs".

Following a standard, there's moments like this where no deep level of thinking is required.
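The quoted rule is small enough to write down directly on bit patterns. This sketch assumes a finite dividend and a divisor that is +0 or -0, and returns both the result and the name of the exception that would be signaled; the function and constant names are made up for the example.

```python
SIGN_BIT = 0x8000_0000
MAGNITUDE_MASK = 0x7FFF_FFFF
POS_INFINITY = 0x7F80_0000
QUIET_NAN = 0x7FC0_0000


def divide_by_zero_result(dividend: int, divisor: int):
    """Result bits and exception name for (finite dividend) / (+0 or -0)."""
    if divisor & MAGNITUDE_MASK != 0:
        raise ValueError("divisor is not a zero")
    if dividend & MAGNITUDE_MASK == 0:
        # 0 / 0 has no defined infinite result: invalid operation, NaN.
        return QUIET_NAN, "invalid operation"
    # Sign of the infinity is the exclusive OR of the operands' signs.
    sign = (dividend ^ divisor) & SIGN_BIT
    return POS_INFINITY | sign, "division by zero"


one, minus_one, minus_zero = 0x3F80_0000, 0xBF80_0000, 0x8000_0000
for a, b in ((one, 0), (minus_one, 0), (minus_one, minus_zero), (0, 0)):
    bits, flag = divide_by_zero_result(a, b)
    print(f"${a:08x} / ${b:08x} -> ${bits:08x} ({flag})")
```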

12,230Eric's last statement is a simple logic case. Each is relatively easy; they just require checks to encode.

1,270IMHO, it would be tolerable to sacrifice strict bit-for-bit equality with official IEEE results if it can be justified with a noticeable performance gain AND if any divergences are well documented. It is important to be able to check for errors. But I agree with Chip that subtle special cases like the distinction between NaN and infinity are not really important.

Support of floating point numbers in the Spin2 compiler is a real advantage. It makes it much easier to write libraries without worrying about number ranges and overflows, for example for digital filters. They just work no matter what the scaling of the input data is. There could still be overflows with floating point numbers. But in any real world scenario where the data comes from sensor hardware the worst case ranges are limited and never overflow a floating point number except when there's a major design flaw like dividing by a number that can get close to or equal zero.

The main applications for the P2 are instrumentation, games, motion control etc. Real time performance is more important than precision to the least significant bit, I think.

709But most coders are not exactly precise in their coding.

There is still a need to see those 5 flags, probably during debug, along with the standardized result.

You can see here where this individual basically just gave up because of the NaNs:

https://forums.parallax.com/discussion/174215/tracking-angles-and-the-nan-problem

What operation first caused the NaN? No idea.
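One cheap debugging aid along these lines is to remember the first operation that produced a NaN, so the cascade can be traced to its source. The class below is a hypothetical host-side sketch in Python, not something Spin2 provides; each intermediate result is passed through `check()` with a label.

```python
import math


class NanTracer:
    """Remember the label of the first checked value that was a NaN."""

    def __init__(self):
        self.first_nan = None

    def check(self, label: str, value: float) -> float:
        if self.first_nan is None and math.isnan(value):
            self.first_nan = label
        return value


tracer = NanTracer()
angle = tracer.check("scale", 2.0 * 3.0)
delta = tracer.check("inf - inf", math.inf - math.inf)   # NaN starts here
total = tracer.check("angle + delta", angle + delta)     # NaN propagates
print("first NaN came from:", tracer.first_nan)
```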

5,193Well, I agree with you in part -- for example, 1 bit rounding errors for the "exactly between two floats" case are probably not a big deal. However, the difference between NaN and infinity is huge. They're completely different things: NaN is an error, whereas infinity is a maximum value. Also, they're encoded completely differently, and keeping compatibility with the IEEE binary encoding is pretty important, I think.

As for performance, it shouldn't be impacted by treating NaN and Infinity properly -- both are special cases that are detected the same way (via the exponent being $FF) and can be handled "out of line". The main code should be optimized for the common case where 0 < exponent < $FF.
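As a sketch of what "out of line" could look like for multiplication: one exponent test per operand decides whether to leave the fast path, and only then are NaN and infinity told apart. The function name and the choice of $7fc0_0000 as the returned NaN are assumptions for the example, not the Spin2 interpreter's actual code.

```python
EXP_MASK = 0x7F80_0000
FRAC_MASK = 0x007F_FFFF
MAGNITUDE_MASK = 0x7FFF_FFFF
SIGN_BIT = 0x8000_0000
QUIET_NAN = 0x7FC0_0000


def fmul_special(a: int, b: int):
    """Return result bits if a or b is NaN/infinity, else None (fast path)."""
    a_special = (a & EXP_MASK) == EXP_MASK
    b_special = (b & EXP_MASK) == EXP_MASK
    if not (a_special or b_special):
        return None            # common case: both exponents below $FF
    if (a_special and a & FRAC_MASK) or (b_special and b & FRAC_MASK):
        return QUIET_NAN       # a NaN operand gives a NaN result
    if (a & MAGNITUDE_MASK) == 0 or (b & MAGNITUDE_MASK) == 0:
        return QUIET_NAN       # infinity * zero is invalid
    # Infinity times anything non-zero is infinity, with the XOR of the signs.
    return EXP_MASK | ((a ^ b) & SIGN_BIT)


print(fmul_special(0x3F80_0000, 0x4000_0000))        # None, take the fast path
print(hex(fmul_special(0x7F80_0000, 0xC000_0000)))   # infinity * -2.0
```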