
Anyone have any practical experience with POSIT math?

Any experience out there with POSIT math, as opposed to FLOAT math?

Comments

  • cgracey Posts: 14,153
    Thanks, Guys.
  • Just my personal opinion and not necessarily a scientific statement....

    I think the precision advantage of Posits is vastly overestimated. It might be useful to mathematicians who have to handle formulas as abstract theory and need a best fit for any possible case. However, in the real world the error of measurement is almost always worse than the error of calculation (if done right!).

    If you are clever you can often make assumptions about where to expect the best accuracy and which cases should be avoided. Example:
    #include <cmath>  // std::hypot, std::asin, std::acos, std::fabs
    
    void CartToPolar (double x, double y, double &w, double &r)
    {
        r = std::hypot (x, y);  // robust sqrt(x*x + y*y)
        if (r != 0.0) {
            if (std::fabs (x) > std::fabs (y))
            {
                w = std::asin (y / r); // near the X axis asin is better conditioned
            }
            else
            {
                w = std::acos (x / r); // near the Y axis acos is better conditioned
            }
        }
        else w = 0.0;
        // (sign/quadrant fix-ups omitted; the point is picking the
        // better-conditioned inverse function in each region)
    }
    
    It is much better to completely avoid cases where the numbers become very big or very small instead of relying on the number format to handle them.

    The example with the scalar product of a = (3.2e7...) and b = (4.0e7...) is just plain wrong (as Heater already found out). 32-bit IEEE floating point math gives the correct result. Precision for even much bigger numbers could be maintained by re-ordering the calculations and doing the subtraction of the large numbers first, before adding the small ones.
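    A sketch of that re-ordering in C, using the values from Gustafson's often-quoted version of the example (my assumption as to which one is meant). Whether the naive order already succeeds depends on the platform: where C evaluates float expressions in wider precision (FLT_EVAL_METHOD != 0) both come out as 2, while with strict 32-bit evaluation the re-ordering is what saves the small terms:
    #include <stdio.h>
    
    int main(void) {
        // a = (3.2e7, 1, -1, 8.0e7), b = (4.0e7, 1, -1, -1.6e7), exact a.b = 2
        float p0 = 3.2e7f * 4.0e7f;   //  1.28e15, rounded to float
        float p3 = 8.0e7f * -1.6e7f;  // -1.28e15, rounded the same way
    
        // Naive left-to-right: the 1s are absorbed by the huge partial sum.
        float naive = p0 + 1.0f + 1.0f + p3;
    
        // Re-ordered: cancel the large products first, then add the small terms.
        float ordered = (p0 + p3) + 1.0f + 1.0f;
    
        printf("naive = %g, ordered = %g (exact answer is 2)\n", naive, ordered);
        return 0;
    }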

    And I doubt that the variable bit sizes of exponent and mantissa are very efficient. They might be for hardware implementations, but they are surely not very helpful for software emulation. Most of the complexity of the IEEE format comes from the special cases: denormalized numbers, NaN and so on. Getting rid of those and instead using a single "error" flag would make things much simpler.

    But I agree that the support of "inline" floating point math (software emulated for the P2 and maybe in hardware for a future P3) would be a big advantage, no matter if it'd be IEEE or Posits. Although almost anything can be done with fixed point math, floats make the code much simpler, easier to read and maintain, and can avoid most of the bugs (overflows...).
  • evanh Posts: 15,916
    ManAtWork wrote: »
    I think the precision advantage of Posits is vastly overestimated. It might be useful to mathematicians who have to handle formulas as abstract theory and need a best fit for any possible case. However, in the real world the error of measurement is almost always worse than the error of calculation (if done right!).
    That equally applies to IEEE floats. Posits aren't aimed at you if you're not already using floats.

    Without hardware support for either, as on the Propellers, the question then is whether a Prop1/Prop2 can do Posits quicker than IEEE floats. I'd guess that's what Chip is wanting to know.

  • POSITs might look technically interesting, but you might want to consider whether it is worth giving up compatibility with IEEE floats if you choose an alternative representation. What if, at some future time, SPIN2 code needed to share data with some floating point library written in another language which uses the usual IEEE format? Will this still be possible, or do you then need to build in extra conversion steps on all data being passed back and forth? Doing that could be slow for large arrays etc.
  • Perhaps having them implemented in hardware is the way to go...

    https://eee.hku.hk/~hso/Publications/jaiswal_date_2018.pdf

  • As usual, not everyone will be in agreement, but, as said by Nelson Falcão Rodrigues (Brazilian writer, deceased): "Toda unanimidade é burra." ("All unanimity is stupid."; courtesy: Google), so there are other POVs and, literally, lots to study and learn:

    https://hal.inria.fr/hal-01959581v4/document
  • JRoark Posts: 1,215
    edited 2020-12-02 00:17
    ManAtWork wrote: »
    in the real world the error of measurement is almost always worse than the error of calculation (if done right!).

    Maybe you can give me some guidance here. If so I'd certainly appreciate it!

    I'm currently fighting precision issues dealing with GPS coordinates in FlexBASIC. Single-precision IEEE floats just won't cut it, and the implementation of IEEE double-precision math in any of the Propeller compilers is still way out on the horizon.

    Real example: the GPS module I'm dealing with outputs a longitude of 118.123456. That coordinate reflects an actual, repeatable point on the ground. It's not just some "funny number". (In actuality this ZED-F9 GPS module does *centimeter* RTK accuracy, but I'm not using it that way.) So the GPS output is easily accurate to 1/1,000,000 of a degree (i.e., about 4 inches). By the time I run this position through single-precision math, that 1/1,000,000 of a degree gets devalued to about 36 feet CPE (circular probable error) just in the conversion process. Run a few trig calcs and the cumulative error grows to almost 100 feet in the worst case.

    Not to hijack this thread, but how would you deal with this? It seems like either Posits or IEEE double-precision or DEC64/fixed-point would be a better answer, but maybe I'm wrong?


  • evanh Posts: 15,916
    That appears to need something like 30 bits of precision. 32-bit (Single) floats won't cut it since the mantissa is only 24-bit. I doubt a 32-bit Posit will make the grade either. It's actually a good example of exactly what ManAtWork is saying. Absolute precision is best done with fixed-point or integers.
  • evanh Posts: 15,916
    edited 2020-12-02 07:31
    Yanomani wrote: »
    https://hal.inria.fr/hal-01959581v4/document

    I've now manually installed their test package and run the plot demo (sin, cos and atan) from that PDF. There are instructions on the last page. It's the first time I've knowingly used Python. :)

    Their instructions are for Ubuntu 18.04 but I'm using 20.04, so not all the listed components had to be newly installed. For example, Python 3 was already installed; I only had to add matplotlib and libmpfr-dev. The slow part was downloading their "artifact" source code from their webpage. Very slow responses.

  • evanh Posts: 15,916
    edited 2020-12-02 07:57
    Starting to dig around, and the first bit of simple code I come across has already got me confused.
    posit32_t ui32_to_p32( uint32_t a ) {
    	int_fast8_t k, log2 = 31; // index of the highest set bit (at most 31, since we only have 32 bits)
    	union ui32_p32 uZ;
    	uint_fast32_t uiA;
    	uint_fast32_t expA, mask = 0x80000000, fracA;
    
    	if ( a > 4294966271)
    		uiA = 0x7FC00000; // anything above this rounds to the posit for 2^32 (4294967296)
    	else if ( a < 0x2 )
    		uiA = (a << 30); // 0 and 1 are encoded directly
    	else {
    		fracA = a;
    		while ( !(fracA & mask) ) { // normalize: locate the leading 1 bit
    			log2--;
    			fracA <<= 1;
    		}
    		k = (log2 >> 2); // regime value (es = 2, so the regime steps every factor of 2^4)
    		expA = (log2 & 0x3) << (27 - k); // the two remaining exponent bits
    		fracA = (fracA ^ mask); // drop the hidden leading 1
    		uiA = (0x7FFFFFFF ^ (0x3FFFFFFF >> k)) | expA | fracA>>(k+4); // sign|regime|exponent|fraction
    
    		mask = 0x8 << k;  // bitNPlusOne: the first bit that gets rounded away
    
    		if (mask & fracA) // round to nearest, ties to even
    			if (((mask - 1) & fracA) | ((mask << 1) & fracA)) uiA++;
    
    	}
    	uZ.ui = uiA;
    	return uZ.p;
    }
    

    My C is obviously rusty because it doesn't make sense to assign a value into an auto variable just before returning. Namely, the "uZ.ui = uiA;"

    Here's the union definition:
    	union ui32_p32 { uint32_t ui; posit32_t p; };
    
  • evanh Posts: 15,916
    And what the heck is a "fast" data type as opposed to regular? eg: uint_fast32_t

  • evanh wrote: »
    And what the heck is a "fast" data type as opposed to regular? eg: uint_fast32_t

    The fastest unsigned type that is at least 32 bits wide - i.e. it may be wider than 32 bits, so unlike uint32_t it isn't guaranteed to wrap at exactly 2^32.
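    A quick way to see the difference on your own machine (a hypothetical check; the sizes are platform-dependent - on many 64-bit Linux systems the fast type is 8 bytes):
    #include <stdio.h>
    #include <stdint.h>
    
    int main(void) {
        printf("uint32_t:      %zu bytes\n", sizeof(uint32_t));      // always exactly 32 bits
        printf("uint_fast32_t: %zu bytes\n", sizeof(uint_fast32_t)); // at least 32 bits, whatever is fastest
        return 0;
    }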
  • evanh Posts: 15,916
    edited 2020-12-02 09:18
    I didn't know there was any defined under/overflow behaviour for integers, other than natural roll-over. EDIT: Ah, hadn't thought about overflowing multiplies. I've never tried to handle that other than avoiding it. EDIT2: I imagine most hardware will just truncate the upper bits. Kind of like a remainder.
  • JRoark wrote: »
    this ZED-F9 GPS module does *centimeter* RTK accuracy, but I'm not using it that way). So the GPS output is easily accurate to 1/1,000,000 of a degree (i.e., about 4 inches). By the time I run this position through single-precision math, that 1/1,000,000 of a degree gets devalued to about 36 feet CPE just in the conversion process.

    The circumference of the earth is ~40,000km, so 32 bit fixed point math would give you ~1cm resolution. As Evan already said, it won't work with single precision IEEE floats because they only have a 24-bit mantissa. Posits may give you some more bits, but definitely fewer than 32.

    By clever use of the formulas, re-ordering the operands and so on, you can avoid unnecessarily losing precision. But you can't beat the best case. Even the best number format can't pack more information into the available bits.

    Scaling polar coordinates as fixed point numbers in such a way that 2^32 is one full circle would have the advantage that you can add angles without having to care about overflows, BTW.
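    A minimal sketch of that scaling in C (the helper names are made up; 1 LSB is 360/2^32 degrees, about 8.4e-8 of a degree):
    #include <stdint.h>
    #include <stdio.h>
    
    typedef uint32_t angle32; // the full 32-bit range is one full circle
    
    static angle32 deg_to_angle32(double deg) {
        return (angle32)((deg / 360.0) * 4294967296.0);
    }
    
    static double angle32_to_deg(angle32 a) {
        return (double)a * (360.0 / 4294967296.0);
    }
    
    int main(void) {
        // 350 + 20 degrees: unsigned wraparound IS the modulo-360 behaviour,
        // so no range reduction or overflow checks are needed.
        angle32 sum = deg_to_angle32(350.0) + deg_to_angle32(20.0);
        printf("%f\n", angle32_to_deg(sum)); // ~10.000000
        return 0;
    }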
  • ManAtWork wrote: »
    Scaling polar coordinates as fixed point numbers in such a way that 2^32 is one full circle would have the advantage that you can add angles without having to care about overflows, BTW.

    Using floats (worse: radians as floats!) for angles is just deeply flawed in most cases.
  • JRoark Posts: 1,215
    edited 2020-12-02 17:11
    ManAtWork wrote: »
    Scaling polar coordinates as fixed point numbers in such a way that 2^32 is one full circle would have the advantage that you can add angles without having to care about overflows, BTW.

    I had never considered that. Then for short distance/vector calcs, you could avoid most of the trig functions entirely and just do a "run over rise" type of math using a local grid (Pythagorean style). It wouldn't be pretty, but on a small scale it would probably work just fine. Food for thought there. (A rough sketch of that idea is at the end of this post.)
    Wuerfel_21 wrote: »
    Using floats (worse: radians as floats!) for angles is just deeply flawed in most cases.

    So true. Unfortunately, I'm sort of stuck with them since they are the natural units for all of my spherical trig functions. I just don't have the math background needed to do much in the way of workarounds. There are some games that I might play, but the essential flaw remains. At one point I even tried to substitute a scaled Taylor expansion for a couple of the primitive functions, but that has its own set of problems (convergence) and it's really slow if you want any precision (which was the entire point!).

    I believe @ersmith has been pondering implementing a fixed-point math option in the Flex suite. If/when that becomes a reality, it'll be a very happy day for me.

    Sorry to have hijacked this thread, but I really appreciate your responses.
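    A rough sketch of that local-grid idea (illustrative C with doubles for clarity; the function name and constants are made up, nothing FlexBASIC-specific - the same idea works scaled to fixed point): scale the longitude difference by cos(latitude), then it's plain Pythagoras.
    #include <math.h>
    #include <stdio.h>
    
    // Short-baseline "flat earth" distance: fine for a few km, no spherical trig.
    static double local_dist_m(double lat1, double lon1, double lat2, double lon2) {
        const double R = 6371000.0;  // mean Earth radius in metres
        const double d2r = 3.14159265358979323846 / 180.0;
        double dy = (lat2 - lat1) * d2r * R;                    // northing
        double dx = (lon2 - lon1) * d2r * R * cos(lat1 * d2r);  // easting
        return hypot(dx, dy);
    }
    
    int main(void) {
        // Two points 0.001 degrees of latitude apart: about 111 m.
        printf("%.2f m\n", local_dist_m(34.0, -118.123456, 34.001, -118.123456));
        return 0;
    }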
  • evanh wrote: »
    I didn't know there was any defined under/overflow behaviour for integers, other than natural roll-over. EDIT: Ah, hadn't thought about overflowing multiplies. I've never tried to handle that other than avoiding it. EDIT2: I imagine most hardware will just truncate the upper bits. Kind of like a remainder.

    In C only unsigned integers have a defined overflow behavior, which is the "natural roll-over" you describe. Signed integers can do anything at all upon overflow. Taking only the bottom 32 bits is a common behavior (common enough that many people think it's what will always happen), but implementations are free to use saturating arithmetic (overflow gives MAXINT), to throw an exception, or indeed to return a random number upon overflow.
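    A small illustration of the difference (what the signed case actually does depends entirely on the compiler and flags):
    #include <stdint.h>
    #include <inttypes.h>
    #include <limits.h>
    #include <stdio.h>
    
    int main(void) {
        uint32_t u = UINT32_MAX;
        printf("%" PRIu32 "\n", (uint32_t)(u + 1u)); // defined: unsigned arithmetic wraps, prints 0
    
        int i = INT_MAX;
        (void)i;
        // i + 1 would be undefined behaviour: gcc/clang with
        // -fsanitize=undefined will trap it, and optimizers are
        // allowed to assume it never happens.
        return 0;
    }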
  • JRoark wrote: »
    I had never considered that. Then for short distance/vector calcs, you could avoid most of the trig functions entirely and just do a "run over rise" type of math using a local grid (Pythagorean style).

    Another alternative is to use two floats to represent the numbers, a high and a low part. You can google for "double-double arithmetic". This is most commonly used as a "poor man's quad precision" with two doubles, but the same principle can be used with two floats to implement a kind of double.
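    The basic building block, if anyone wants to experiment, is the error-free sum. A minimal sketch (this follows the standard Knuth TwoSum; the function name is mine, and it assumes round-to-nearest with no excess precision in intermediates):
    #include <stdio.h>
    
    // TwoSum: returns s = fl(a+b) and err such that a + b == s + err exactly.
    static void two_sum(float a, float b, float *s, float *err) {
        *s = a + b;
        float bv = *s - a;
        *err = (a - (*s - bv)) + (b - bv);
    }
    
    int main(void) {
        float s, err;
        two_sum(1.0e8f, 1.0f, &s, &err);
        // The 1.0 that a plain float add would lose is captured in err:
        printf("s = %g, err = %g\n", s, err);
        return 0;
    }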
  • evanh Posts: 15,916
    It's dawned on me what that union is all about. And therefore why they're assigning to it before returning. I hadn't paid enough attention to what was being returned - the other component of the union.

    It's a form of casting between datatypes. So the value being assigned to integer uZ.ui is then immediately used as return value posit uZ.p.

  • evanh Posts: 15,916
    ersmith wrote: »
    In C only unsigned integers have a defined overflow ...
    Thanks Eric. I think I've got my confidence back again. :)

  • BTW, that union trick is also called "type punning" and is well-defined only in C. In C++, at least as of C++14, it's undefined behavior - most compilers out there take the liberty of making it work, since UB doesn't preclude implementation-defined behavior. There is (or at least was) a good amount of, eh, discussion about whether type punning should be allowed in the standard or not.
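    For the record, a sketch of the portable alternative - copying the bytes instead of punning through the union (the posit32_t stand-in below is made up, not SoftPosit's actual definition). A memcpy of this kind is well-defined in both C and C++, and compilers optimize it to a plain register move:
    #include <stdint.h>
    #include <string.h>
    
    typedef struct { uint32_t v; } posit32_t; // stand-in for the real SoftPosit type
    
    static posit32_t ui32_bits_to_p32(uint32_t bits) {
        posit32_t p;
        memcpy(&p, &bits, sizeof p); // byte copy instead of union type punning
        return p;
    }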
  • evanh Posts: 15,916
    Thanks Kuba. A new word for my vocabulary.
