Will P2 have an FPU?



  • jmg Posts: 14,539
    ersmith wrote: »
    Indeed, who knows? But the devil is in the details, and I don't see any practical (reasonable performance) way forward without lookup tables, and lookup tables for 32 bits will be *huge*. IEEE floating point has its flaws, but at least it's clear how to implement it.

    Having said that, I hope he is able to come up with a solution. But right now it's definitely not ready for prime time.

    The broad idea of elastic precision is good, and I can see that FPGAs could get some way to a practical solution, with their 'spare' memory bits.
    It seems the extra information does need wider memory (even if the storage needed is sometimes less, that helps little in engineering design).

    e.g. a 36b-wide memory in an FPGA has 4 spare tag bits alongside a normal 32b number.
    That's a lot, and allows a simple precision tag like
    0000b -> simplest int_32
    0001b -> simplest int_64

    or there is even room for a sign bit, to allow mixing signed and unsigned:

    0000b -> simplest uint_32
    0001b -> simplest int_32
    0010b -> uint_64
    0011b -> int_64
    .. some spares..
    1000b -> float_32
    1001b -> float_40
    1010b -> float_48
    1011b -> float_56
    1100b -> float_64
    1101b -> float_96
    1110b -> float_128
    1111b -> float_UP

    or these MSBs could simply tag the bits of precision, as multiples of 8, starting at 32, for 32..160b ?
    That would be useful for Float libs using polynomial expansions, as they then know when to stop iterating.

    Whatever the final mapping, hardware can instantly dispatch the number to the correct ALU, and even promote on overflow.
    The many float tags assume there is some speed benefit possible at the intermediate sizes, and float_UP is user precision, which has its precision set in another following 4b field.

    The cost of 'making numbers easier' is certainly felt in silicon, so at what point does this become worthwhile ?

    Vendors could include ROM FPU routines, so that formats not supported by their FPU silicon get SW calls instead, and that makes handling the threshold invisible to users.

    Looks like Cortex-M7s have a double-precision FPU, and are sub $10, so maybe somewhere above this, you can get benefits.
    Do you get enough users tho ?

    Scripts that try to allow 32b BIT level ops and default Float, do have problems with float_32, but they can 'fix' that with a move to float_64, and float_64 already exists now in upper end MCUs, so who does that leave ?

    What can be packed into one P2 COG, in terms of Float support ?
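
The 4-bit tag table in the post above can be sketched as a simple lookup. This is only an illustration of the dispatch idea: the names `TAG_FORMATS`, `split_word36` and `decode_tag` are made up here, and placing the tag in the top 4 bits of a 36-bit word is an assumption, not something the post specifies.

```python
# Hypothetical sketch of the 4-bit precision tag proposed in the post.
# The tag values are copied from the post's table; everything else is assumed.
TAG_FORMATS = {
    0b0000: ("uint", 32),
    0b0001: ("int", 32),
    0b0010: ("uint", 64),
    0b0011: ("int", 64),
    # 0b0100..0b0111 are the "some spares" in the post
    0b1000: ("float", 32),
    0b1001: ("float", 40),
    0b1010: ("float", 48),
    0b1011: ("float", 56),
    0b1100: ("float", 64),
    0b1101: ("float", 96),
    0b1110: ("float", 128),
    0b1111: ("float", None),  # float_UP: width comes from a following field
}


def split_word36(word):
    """Split a 36-bit word into (tag, 32-bit payload).

    Assumes the tag sits in the top 4 bits; the post does not say where.
    """
    return (word >> 32) & 0xF, word & 0xFFFFFFFF


def decode_tag(tag):
    """Return (kind, width_in_bits) for a tag; width None means float_UP."""
    if tag not in TAG_FORMATS:
        raise ValueError(f"tag {tag:04b} is a spare (unassigned) value")
    return TAG_FORMATS[tag]


tag, payload = split_word36((0b0001 << 32) | 0xFFFFFFFF)
print(decode_tag(tag), hex(payload))
```

Formats wider than 32 bits would of course need more than one memory word; the sketch only shows how hardware or a ROM routine could pick the right ALU path from the tag.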
  • Tor Posts: 2,001
    Scaled integers can do a lot of what is usually thought of as needing floating point. On desktop PCs fp is apparently as fast as int processing now, otherwise I would have used scaled ints for 95% of what I use fp for.
    Then there are tables.. at least for 16-bit precision (w/32-bit output, sometimes) the table sizes aren't unmanageable. Both methods are described here (Garth Wilson's site): http://wilsonminesco.com/16bitMathTables/index.html, with a focus on the tables but with an intro to scaled integer math.
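
For anyone who has not used scaled integers, here is a minimal sketch of the idea in a 16.16 format. The format choice and the helper names (`to_fixed`, `fx_mul`, `fx_div`) are assumptions for illustration, not taken from the linked site. Python integers never overflow, so on a real 32-bit MCU the multiply would need a 64-bit intermediate.

```python
# Minimal 16.16 scaled-integer (fixed-point) sketch.
# FRAC_BITS and all helper names are illustrative assumptions.
FRAC_BITS = 16
ONE = 1 << FRAC_BITS  # the integer that represents 1.0


def to_fixed(x):
    """Convert a number to 16.16, rounding to the nearest step."""
    return int(round(x * ONE))


def to_float(q):
    """Convert a 16.16 value back to a float for display."""
    return q / ONE


def fx_mul(a, b):
    """Multiply two 16.16 values; the raw product has 32 fraction bits,
    so shift right to get back to 16 (this floors the result)."""
    return (a * b) >> FRAC_BITS


def fx_div(a, b):
    """Divide two 16.16 values; pre-scale the dividend to keep 16
    fraction bits in the quotient (floor division)."""
    return (a << FRAC_BITS) // b


a = to_fixed(1.5)
b = to_fixed(2.25)
print(to_float(fx_mul(a, b)))  # 1.5 * 2.25
print(to_float(a + b))         # add and subtract need no scaling
```

Addition and subtraction are plain integer operations; only multiply and divide need the extra shift.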
  • Back in the day when I was working on display software for a 3-D phased array radar http://www.radartutorial.eu/19.kartei/karte112.en.html one of our team had his code fail review because he had used floats in the code. Our processors did not have float hardware but our Coral 66 compiler supported fixed point arithmetic. The project manager said:

    "If you think you need floating point maths to solve the problem you don't understand the problem"

    A corollary to that could be:

    "If you actually do need floating point to solve the problem, now you have a problem you don't understand"

    Certainly use of floats gets programmers into all kinds of difficulties they don't expect.

    See here for why the corollary may be true: https://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html It's a long hard read...
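
Two of the classic surprises, as a small sketch using ordinary IEEE 754 doubles (Python's `float`). The variable names are just for illustration.

```python
import math

# 0.1, 0.2 and 0.3 have no exact binary representation, so the rounded
# sum of the first two is not the same double as the rounded 0.3.
total = 0.1 + 0.2
exactly_equal = (total == 0.3)
close_enough = math.isclose(total, 0.3)

# Addition is not associative: near 1e16 adjacent doubles are 2 apart,
# so adding 1.0 first is lost to rounding.
left = (1e16 + 1.0) - 1e16
right = 1.0 + (1e16 - 1e16)

print(exactly_equal, close_enough)
print(left, right)
```

The first pair shows why equality tests on floats need a tolerance; the second shows why the order of operations can change the answer.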

  • Heater. Posts: 21,233
    edited 2017-10-28 - 09:49:08
    There have been some very interesting developments in the UNUM idea. Basically, since this thread started Gustafson has had some brainwaves and developed what looks like a very practical way to implement a variant of his UNUMs. He calls them POSITs. These POSITs do not require variable size operands. They are very simple to implement in hardware or software emulation. Perhaps it's best to just quote Dr Gustafson's own summary:

    "A new data type called a "posit" is designed for direct drop-in replacement for IEEE Standard 754 floats. Unlike unum arithmetic, posits do not require interval-type mathematics or variable size operands, and they round if an answer is inexact, much the way floats do. However, they provide compelling advantages over floats, including simpler hardware implementation that scales from as few as two-bit operands to thousands of bits. For any bit width, they have a larger dynamic range, higher accuracy, better closure under arithmetic operations, and simpler exception-handling. For example, posits never overflow to infinity or underflow to zero, and there is no "Not-a-Number" (NaN) value. Posits should take up less space to implement in silicon than an IEEE float of the same size. With fewer gate delays per operation as well as lower silicon footprint, the posit operations per second (POPS) supported by a chip can be significantly higher than the FLOPs using similar hardware resources. GPU accelerators, in particular, could do more arithmetic per watt and per dollar yet deliver superior answer quality."

    I'm kind of sold on this idea. One amazing feature of POSITs is that you can do an FFT and then run the results through the reverse FFT and get back, bit for bit, exactly the input you started with. Try doing that with floats.

    Seems that people are already designing POSITs into actual silicon and there is a lot of interest from the likes of Intel, Nvidia and others. Also the IEEE has already approached Gustafson asking if POSITs can be made into an IEEE standard. So far he has resisted this idea as he, quite rightly, does not want the whole idea made overcomplicated and compromised by the "design by committee" that would happen during such a standardization process.

    There is a great introduction to POSITs by Gustafson in this video:

    There is a POSIT library in C++ here: https://github.com/libcg/bfp. Looking at that code, a POSIT implementation does seem to be very simple.

    POSITs would be great to have in a future P3.

    In the meanwhile we should get to work on a POSIT implementation in PASM for the P1 and P2....
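
To give a feel for how little there is to a posit, here is a rough decoder sketch following the layout in Gustafson's description: sign bit, a run-length "regime", up to `es` exponent bits, then the fraction. It is not the API of the bfp library linked above; the function name `decode_posit` and its parameters are made up for illustration, and it returns an exact `Fraction` rather than doing any arithmetic or rounding.

```python
from fractions import Fraction


def decode_posit(bits, nbits=8, es=0):
    """Decode an nbits-wide posit bit pattern to an exact Fraction.

    Returns None for the single exception pattern 100...0.
    Illustrative sketch only; name and signature are assumptions.
    """
    mask = (1 << nbits) - 1
    bits &= mask
    if bits == 0:
        return Fraction(0)
    if bits == 1 << (nbits - 1):
        return None  # the one non-real value
    sign = 1
    if bits >> (nbits - 1):
        sign = -1
        bits = (-bits) & mask  # negative posits are stored as two's complement
    # Bits after the sign, most significant first.
    body = [(bits >> i) & 1 for i in range(nbits - 2, -1, -1)]
    # Regime: a run of identical bits, ended by the opposite bit (if any).
    first = body[0]
    run = 1
    while run < len(body) and body[run] == first:
        run += 1
    k = run - 1 if first == 1 else -run
    rest = body[run + 1:]  # skip the terminating bit
    exp_bits, frac_bits = rest[:es], rest[es:]
    e = 0
    for b in exp_bits:
        e = (e << 1) | b
    e <<= es - len(exp_bits)  # exponent bits cut off by the regime count as 0
    frac = sum(Fraction(b, 1 << i) for i, b in enumerate(frac_bits, start=1))
    # value = sign * useed**k * 2**e * (1 + frac), with useed = 2**(2**es)
    return sign * Fraction(2) ** (k * (1 << es) + e) * (1 + frac)


for pattern in (0x40, 0x50, 0x60, 0xC0):
    print(f"{pattern:08b}", decode_posit(pattern))
```

The regime is what gives posits their tapered precision: short regimes near 1.0 leave more bits for the fraction, long regimes trade fraction bits for range.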

  • Excellent, Heater. I will watch that.

    This seems like the direction things need to go in.
  • One amazing feature of POSITs is that you can do an FFT and then run the results through the reverse FFT and get back, bit for bit, exactly the input you started with. Try doing that with floats.

    I'm sold. I hate floats, because they are a mess. Obviously, I use them, but always with a jaded, guarded eye.

    These will, I suspect, quickly replace them in things like CAD, where such accuracy would eliminate a bazillion fudges needed to make geometry representations actually work.

  • I just realised that unum is probably borrowing a little from the ancient ulaw - https://en.wikipedia.org/wiki/Μ-law_algorithm

  • evenh the necromancer, raising 3 year old threads from the dead.
  • Hehe, true. I bumped into it while searching for other old posts. I've been pondering filters of late, they tend to produce big numbers quickly.
