Indeed, who knows? But the devil is in the details, and I don't see any practical (reasonable performance) way forward without lookup tables, and lookup tables for 32 bits will be *huge*. IEEE floating point has its flaws, but at least it's clear how to implement it.

Having said that, I hope he is able to come up with a solution. But right now it's definitely not ready for prime time.

Scaled integers can do a lot of what is usually thought of as needing floating point. On desktop PCs fp is apparently as fast as int processing now, otherwise I would have used scaled ints for 95% of what I use fp for.
Then there are tables. At least for 16-bit precision (with 32-bit output, sometimes) the table sizes aren't unmanageable. Both methods are described here (Garth Wilson's site): http://wilsonminesco.com/16bitMathTables/index.html, with a focus on the tables but with an intro to scaled integer math.
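To make the two ideas concrete, here is a minimal sketch in Python (my choice of language, nothing to do with the linked site). It assumes a 16.16 scaled-integer format and a 256-entry sine table with linear interpolation; the names `to_fixed`, `fx_mul` and `fx_sin` and the table size are made up for illustration and are not the layout used in Garth Wilson's tables.

```python
import math

FRAC_BITS = 16            # assumed format: 16 integer bits, 16 fraction bits
ONE = 1 << FRAC_BITS      # the scaled integer that represents 1.0


def to_fixed(x):
    """Convert a real number to a scaled integer (done once, e.g. at build time)."""
    return int(round(x * ONE))


def to_float(q):
    """Convert a scaled integer back to a float, for display only."""
    return q / ONE


def fx_mul(a, b):
    """Multiply two scaled integers; the shift removes the doubled scale factor."""
    return (a * b) >> FRAC_BITS


# Hypothetical table: one full turn of sine in 256 steps, stored as scaled integers.
TABLE_SIZE = 256
SINE_TABLE = [to_fixed(math.sin(2 * math.pi * i / TABLE_SIZE))
              for i in range(TABLE_SIZE)]


def fx_sin(angle16):
    """Sine of a 16-bit angle (0..65535 is one full turn), integer math only."""
    index = (angle16 >> 8) & 0xFF      # top 8 bits select the table entry
    frac = angle16 & 0xFF              # bottom 8 bits interpolate to the next entry
    a = SINE_TABLE[index]
    b = SINE_TABLE[(index + 1) % TABLE_SIZE]
    return a + (((b - a) * frac) >> 8)


print(to_float(fx_mul(to_fixed(1.5), to_fixed(2.25))))
print(to_float(fx_sin(0x2000)))        # one eighth of a turn, i.e. 45 degrees
```

Once the table is built, everything at run time is integer adds, multiplies and shifts, which is the whole attraction on a processor with no float hardware.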

Back in the day, when I was working on display software for a 3-D phased array radar (http://www.radartutorial.eu/19.kartei/karte112.en.html), one of our team had his code fail review because he had used floats in the code. Our processors did not have float hardware, but our Coral 66 compiler supported fixed point arithmetic. The project manager said:

"If you think you need floating point maths to solve the problem you don't understand the problem"

A corollary to that could be:

"If you actually do need floating point to solve the problem, now you have a problem you don't understand"

Certainly use of floats gets programmers into all kinds of difficulties they don't expect.
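A couple of the classic surprises, as a small Python sketch (any language using IEEE doubles behaves the same way). The comparison against exact fractions and scaled integers is my own illustration, not something from the radar project.

```python
from fractions import Fraction

# 0.1, 0.2 and 0.3 have no exact binary representation, so the sum is slightly off.
float_equal = (0.1 + 0.2 == 0.3)
print("0.1 + 0.2 == 0.3 ?", float_equal)

# Adding 0.1 ten times accumulates the representation error.
float_sum = sum([0.1] * 10)
print("ten times 0.1 == 1.0 ?", float_sum == 1.0)

# The same sums done exactly: with rationals, and with integers scaled by 10.
exact_sum = sum([Fraction(1, 10)] * 10)
scaled_sum = sum([1] * 10)             # each unit stands for 0.1
print("exact fractions give 1 ?", exact_sum == 1)
print("scaled integers give 10 tenths ?", scaled_sum == 10)
```

Neither float result is a bug. It is the format doing what it is specified to do, which is exactly why it catches people out.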

There have been some very interesting developments in the UNUM idea. Basically, since this thread started Gustafson has had some brainwaves and developed what looks like a very practical way to implement a variant of his UNUMs. He calls them POSITs. These POSITs do not require variable size operands. They are very simple to implement in hardware or software emulation. Perhaps it's best to just quote Dr Gustafson's own summary:

"A new data type called a "posit" is designed for direct drop-in replacement for IEEE Standard 754 floats. Unlike unum arithmetic, posits do not require interval-type mathematics or variable size operands, and they round if an answer is inexact, much the way floats do. However, they provide compelling advantages over floats, including simpler hardware implementation that scales from as few as two-bit operands to thousands of bits. For any bit width, they have a larger dynamic range, higher accuracy, better closure under arithmetic operations, and simpler exception-handling. For example, posits never overflow to infinity or underflow to zero, and there is no "Not-a-Number" (NaN) value. Posits should take up less space to implement in silicon than an IEEE float of the same size. With fewer gate delays per operation as well as lower silicon footprint, the posit operations per second (POPS) supported by a chip can be significantly higher than the FLOPs using similar hardware resources. GPU accelerators, in particular, could do more arithmetic per watt and per dollar yet deliver superior answer quality."

I'm kind of sold on this idea. One amazing feature of POSITs is that you can do an FFT and then run the results through the reverse FFT and get back, bit for bit, exactly the input you started with. Try doing that with floats.
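For anyone who wants to try the float half of that challenge, here is a minimal Python sketch. It runs a plain recursive radix-2 FFT and its inverse in IEEE doubles and reports how far the round trip drifts from the input. The test signal is made up, the input length must be a power of two, and it says nothing about how posits behave; it only measures the float error.

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    sign = 1 if inverse else -1
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + w
        out[k + n // 2] = even[k] - w
    return out


def round_trip(samples):
    """Forward FFT followed by the inverse FFT, scaled back by 1/n."""
    n = len(samples)
    spectrum = fft([complex(s) for s in samples])
    return [v.real / n for v in fft(spectrum, inverse=True)]


# Arbitrary made-up test signal of 64 samples in the range 0..1.
samples = [((i * 37) % 101) / 101.0 for i in range(64)]
restored = round_trip(samples)

max_err = max(abs(a - b) for a, b in zip(samples, restored))
exact = sum(a == b for a, b in zip(samples, restored))
print(f"max round-trip error: {max_err:.3e}")
print(f"{exact} of {len(samples)} samples restored bit for bit")
```

The error is tiny, but the interesting number is the count of samples that come back exactly.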

Seems that people are already designing POSITs into actual silicon and there is a lot of interest from the likes of Intel, Nvidia and others. Also the IEEE has already approached Gustafson asking if POSITs can be made into an IEEE standard. So far he has resisted this idea as he, quite rightly, does not want the whole idea made overcomplicated and compromised by the "design by committee" that would happen during such a standardization process.

There is a great introduction to POSITs by Gustafson in this video:

There is a POSIT library in C++ here: https://github.com/libcg/bfp. Looking at that code, a POSIT implementation does seem to be very simple.
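To get a feel for how little there is to the format, here is a rough Python sketch of decoding a posit bit pattern into an ordinary number. It is not taken from the libcg/bfp code. It follows the sign / regime / exponent / fraction layout as I understand it from Gustafson's description, and the function name `decode_posit`, the parameter names, and the choice to return `None` for the single exception pattern 100...0 are my own assumptions.

```python
def decode_posit(bits, n=8, es=0):
    """Decode an n-bit posit with es exponent bits into a Python float.

    Returns None for the one exception pattern (a 1 followed by zeros).
    """
    mask = (1 << n) - 1
    bits &= mask
    if bits == 0:
        return 0.0
    if bits == 1 << (n - 1):
        return None                      # the single exception value
    sign = bits >> (n - 1)
    if sign:
        bits = (-bits) & mask            # negative posits are two's complement
    body = bits & ((1 << (n - 1)) - 1)   # everything after the sign bit

    # Regime: a run of identical bits, ended by the opposite bit (or the end).
    first = (body >> (n - 2)) & 1
    run = 0
    i = n - 2
    while i >= 0 and ((body >> i) & 1) == first:
        run += 1
        i -= 1
    k = run - 1 if first else -run
    i -= 1                               # skip the terminating bit, if present

    remaining = max(i + 1, 0)            # bits left for exponent and fraction
    rest = body & ((1 << remaining) - 1)
    if remaining >= es:
        frac_bits = remaining - es
        exp = rest >> frac_bits
        frac = rest & ((1 << frac_bits) - 1)
    else:
        exp = rest << (es - remaining)   # truncated exponent bits read as zero
        frac_bits = 0
        frac = 0

    value = (1 + frac / (1 << frac_bits)) * 2.0 ** (k * (1 << es) + exp)
    return -value if sign else value


for pattern in (0x01, 0x20, 0x40, 0x50, 0x60, 0x7F, 0xC0):
    print(f"{pattern:#04x} -> {decode_posit(pattern)}")
```

With 8 bits and es = 0 the largest value is 64 and the smallest positive value is 1/64, and precision is densest around 1.0 because that is where the regime field is shortest.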

POSITs would be great to have in a future P3.

In the meantime we should get to work on a POSIT implementation in PASM for the P1 and P2...

"One amazing feature of POSITs is that you can do an FFT and then run the results through the reverse FFT and get back, bit for bit, exactly the input you started with. Try doing that with floats."

I'm sold. I hate floats, because they are a mess. Obviously, I use them, but always with a jaded, guarded eye.

These will, I suspect, quickly replace them in things like CAD, where such accuracy would eliminate a bazillion fudges to make geometry representations actually work.

## Comments

The broad idea of elastic precision is good, and I can see that FPGA's could get some ways to a practical solution, with their 'spare' memory bits.

It seems the extra information does need to have larger memory (even if sometimes that might be less, that helps little in engineering design)

eg a 36b wide memory in FPGA, has 4 spare tag bits for normal numbers.

That's a lot and allows a simple precision tag like

0000b -> simplest int_32
0001b -> simplest int_64

or is there even room for a signed bit, to make mixing

0000b -> simplest uint_32
0001b -> simplest int_32
0010b -> uint_64
0011b -> int_64
.. some spares..
1000b -> float_32
1001b -> float_40
1010b -> float_48
1011b -> float_56
1100b -> float_64
1101b -> float_96
1110b -> float_128
1111b -> float_UP

or these MSB's could simply tag the bits of precision, as multiples of 8, starting at 32, for 32..160b ?

That would be useful for Float libs using polynomial expansions, as they then know when to stop iterating.
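A quick sketch of that arithmetic in Python, under my own assumptions: the tag is a plain 4-bit count, and the series is the textbook Taylor expansion of exp(x) evaluated in exact fractions (the names `tag_to_bits` and `exp_series` are made up). Note that 16 tag codes in steps of 8 starting at 32 top out at 152 bits; reaching 160 would need a 17th code or a different step.

```python
from fractions import Fraction


def tag_to_bits(tag):
    """Bits of precision named by a 4-bit tag: 32, 40, ... in steps of 8."""
    if not 0 <= tag <= 0b1111:
        raise ValueError("tag must fit in 4 bits")
    return 32 + 8 * tag


def exp_series(x, tag):
    """Sum the series for exp(x) until a term drops below 2**-bits.

    Returns the sum as an exact Fraction and the number of terms used.
    """
    eps = Fraction(1, 2 ** tag_to_bits(tag))
    x = Fraction(x)
    term = Fraction(1)
    total = Fraction(0)
    n = 0
    while abs(term) >= eps:
        total += term
        n += 1
        term = term * x / n            # next term: x**n / n!
    return total, n


for tag in (0b0000, 0b0100, 0b1111):
    value, terms = exp_series(1, tag)
    print(f"tag {tag:04b}: {tag_to_bits(tag)} bits, {terms} terms, e ~ {float(value)}")
```

The library never has to guess how far to iterate, because the tag travelling with the number tells it.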

Whatever the final mapping, hardware can instantly dispatch the number to the correct ALU, and even promote on overflow.

The many floats tagged assume there is some intermediate speed benefits possible, and float_UP is user precision, which has precision set in another following 4b field.
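In software terms the dispatch-and-promote idea could look something like the Python sketch below. The tag values are the ones from the second table above; everything else (the `Tag` enum, the `PROMOTE` rule that widens 32-bit integers to the 64-bit type of the same signedness, and the decision to leave the float tags unhandled) is a hypothetical illustration, not a description of any real hardware.

```python
from enum import IntEnum


class Tag(IntEnum):
    UINT_32 = 0b0000
    INT_32 = 0b0001
    UINT_64 = 0b0010
    INT_64 = 0b0011
    FLOAT_32 = 0b1000
    FLOAT_40 = 0b1001
    FLOAT_48 = 0b1010
    FLOAT_56 = 0b1011
    FLOAT_64 = 0b1100
    FLOAT_96 = 0b1101
    FLOAT_128 = 0b1110
    FLOAT_UP = 0b1111      # user precision, set in a following 4b field


# Value range each integer "ALU" can hold.
RANGES = {
    Tag.UINT_32: (0, 2**32 - 1),
    Tag.INT_32: (-2**31, 2**31 - 1),
    Tag.UINT_64: (0, 2**64 - 1),
    Tag.INT_64: (-2**63, 2**63 - 1),
}

# Assumed promotion rule: widen to the 64-bit type with the same signedness.
PROMOTE = {Tag.UINT_32: Tag.UINT_64, Tag.INT_32: Tag.INT_64}


def add_tagged(a, b):
    """Add two (tag, value) pairs that share an integer tag, promoting on overflow."""
    tag, x = a
    tag_b, y = b
    if tag != tag_b or tag not in RANGES:
        raise NotImplementedError("sketch only handles matching integer tags")
    result = x + y
    low, high = RANGES[tag]
    while not low <= result <= high:
        if tag not in PROMOTE:
            raise OverflowError("no wider integer type to promote to")
        tag = PROMOTE[tag]
        low, high = RANGES[tag]
    return tag, result


tag, value = add_tagged((Tag.INT_32, 2**31 - 1), (Tag.INT_32, 1))
print(tag.name, value)
```

The caller only ever sees a number with a tag; whether the result stayed in the 32-bit path or was widened is invisible unless you look at the tag.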

The cost of 'making numbers easier' is certainly felt in silicon, so at what point does this become worthwhile ?

Vendors could include ROM FPU routines, so that values not in their FPU silicon, get SW calls, and that makes handling the threshold invisible to users.

Looks like Cortex M7's have Double FPU, and are sub $10, so maybe somewhere above this, you can get benefits.

Do you get enough users tho ?

Scripts that try to allow 32b BIT level ops and default Float, do have problems with float_32, but they can 'fix' that with a move to float_64, and float_64 already exists now in upper end MCUs, so who does that leave ?

What can be packed into one P2 COG, in terms of Float support ?

See here for why the corollary may be true: https://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html It's a long hard read...

This seems like the direction things need to go in.
