Can sloppy math processors speed things up... a lot?

ElectricAye · 2011-01-05 08:07

I guess it works well enough for the brain, so why not computers...

http://www.physorg.com/news/2011-01-sloppy-arithmetic.html

Heater. · 2011-01-05 09:30

Why not indeed?

In my Fourier Transform for the Prop I used fixed-point arithmetic such that 0.0 to 1.0 is represented as 0 to 4095. So all calculations are done to only 13 bits of accuracy.
That means the multiplies can go faster because they have fewer bits to shift and add.
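To make the "fewer bits to shift and add" point concrete, here is a minimal Python sketch of a shift-and-add fixed-point multiply. It is not the Prop code: the scale of 4096 for 1.0, and the names `to_fixed`, `shift_add_mul` and `fixed_mul`, are assumptions made for illustration.

```python
FRAC_BITS = 12  # assumed scale: 1.0 is stored as 1 << 12 = 4096


def to_fixed(x, frac_bits=FRAC_BITS):
    """Convert a float in [0.0, 1.0] to an unsigned fixed-point integer."""
    return int(round(x * (1 << frac_bits)))


def shift_add_mul(a, b, bits):
    """Multiply non-negative ints by examining `bits` bits of b.

    Returns (product, steps); the step count grows with the bit width,
    which is why a narrower operand multiplies faster in software.
    """
    product = 0
    steps = 0
    for i in range(bits):
        if (b >> i) & 1:
            product += a << i
        steps += 1
    return product, steps


def fixed_mul(a, b, frac_bits=FRAC_BITS):
    """Fixed-point multiply: full product, then drop the extra fraction bits."""
    # frac_bits + 1 bits are enough to hold the value 1.0 itself
    raw, steps = shift_add_mul(a, b, frac_bits + 1)
    return raw >> frac_bits, steps
```

With these assumptions, `fixed_mul(to_fixed(0.5), to_fixed(0.5))` takes 13 shift-and-add steps, while the same multiply at 8 fraction bits takes 9.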

Even with this modest precision, it still works out that if you input a single frequency, all the output frequency bins are exactly zero except the correct one. I suspect that one could reduce the precision even further, say to 10 or 8 bits, and the result would just be a little noise in the output. That would almost double the speed of the multiplies, and the little noise might be quite acceptable in many applications. I also suspect that the Fourier Transform algorithm has a tendency to average out noise, so that the results don't accumulate errors.
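One way to test that suspicion is a small experiment like the Python sketch below. It is a naive DFT, not the FFT used on the Prop, and the way it models reduced precision (rounding the inputs and twiddle factors to `bits` fraction bits, then truncating each product) is an assumption. All function names are made up for illustration.

```python
import math


def quantized_dft_magnitudes(samples, bits):
    """Naive DFT keeping only `bits` fraction bits in twiddles and products."""
    n = len(samples)
    scale = 1 << bits
    mags = []
    for k in range(n):
        re = im = 0
        for t, s in enumerate(samples):
            angle = 2.0 * math.pi * k * t / n
            c = round(math.cos(angle) * scale)
            sn = round(math.sin(angle) * scale)
            re += (s * c) >> bits   # truncate the product back to `bits`
            im -= (s * sn) >> bits
        mags.append(math.hypot(re, im) / scale)
    return mags


def single_tone(n, freq_bin, bits):
    """Integer samples of a cosine that lands exactly on one frequency bin."""
    scale = 1 << bits
    return [round(math.cos(2.0 * math.pi * freq_bin * t / n) * scale)
            for t in range(n)]


def peak_and_leakage(n=32, freq_bin=5, bits=12):
    """Return (index of the largest bin, largest other bin / peak)."""
    mags = quantized_dft_magnitudes(single_tone(n, freq_bin, bits), bits)
    half = mags[: n // 2]  # a real input mirrors into the upper half
    peak = max(range(len(half)), key=lambda k: half[k])
    leakage = max(m for k, m in enumerate(half) if k != peak)
    return peak, leakage / half[peak]
```

Calling `peak_and_leakage(bits=12)` and `peak_and_leakage(bits=8)` and comparing the second value gives a rough feel for how much extra noise the narrower arithmetic adds in this model.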

As another example: if you are controlling something with a PID loop, say an actuator position, there is no point in using arithmetic of extreme precision if your input measurement data is only accurate to 8 or 10 bits anyway.

On the other hand, I'm sure there are other algorithms that iterate over the data a lot and accumulate error as they go. The results might be rendered useless. For example, what happens if you put a one percent error into every multiply of a Mandelbrot set calculation? I have a feeling that the resulting image would change dramatically.
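That Mandelbrot experiment is easy to try. The Python sketch below is one way to set it up; it assumes the "one percent error" is a fixed scale factor applied to every multiply result (a random error would be another reasonable reading), and the function names and grid bounds are made up for illustration.

```python
def escape_count(cr, ci, max_iter=200, rel_err=0.0):
    """Iterate z -> z*z + c, scaling every multiply result by (1 + rel_err)."""
    f = 1.0 + rel_err
    zr = zi = 0.0
    for n in range(max_iter):
        zr2 = zr * zr * f
        zi2 = zi * zi * f
        if zr2 + zi2 > 4.0:
            return n
        zri = zr * zi * f
        zr, zi = zr2 - zi2 + cr, 2.0 * zri + ci
    return max_iter


def changed_fraction(size=40, rel_err=0.01, max_iter=200):
    """Fraction of grid points whose escape count differs once the error is added."""
    changed = 0
    for iy in range(size):
        for ix in range(size):
            cr = -2.0 + 2.5 * ix / (size - 1)   # real axis from -2.0 to 0.5
            ci = -1.25 + 2.5 * iy / (size - 1)  # imaginary axis from -1.25 to 1.25
            exact = escape_count(cr, ci, max_iter)
            sloppy = escape_count(cr, ci, max_iter, rel_err)
            if exact != sloppy:
                changed += 1
    return changed / (size * size)
```

`changed_fraction()` reports how many points of a coarse grid end up with a different iteration count, which is a crude stand-in for "how much the image changes".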

As it happens, I'm nowadays working with systems using fuzzy logic. No need for 128 bit floating point ops there.

Bobb Fwed · 2011-01-05 14:24

Prop 3 anyone? Some sloppy single-cycle division would be nice. Or single-cog parallel single-cycle sloppy multiplication (not that I could see how that possibly would work ... Chip and Beau will figure it out)!

Martin_H · 2011-01-05 18:33

Fast approximations are very useful. Here's a well-known one for a fast inverse square root:

http://ilab.usc.edu/wiki/index.php/Fast_Square_Root#Method_using_The_Quake_3_Walsh_Method
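For anyone who doesn't want to follow the link, the trick can be sketched in Python as below. This is a transliteration of the widely circulated Quake 3 routine, using `struct` to reinterpret the 32-bit float as an integer; the function name is made up, and Python is only used to show the idea (it will not be fast here).

```python
import struct


def fast_inv_sqrt(x):
    """Approximate 1/sqrt(x) for a positive 32-bit float."""
    # Reinterpret the float's bits as an unsigned 32-bit integer
    i = struct.unpack("<I", struct.pack("<f", x))[0]
    # Magic constant minus half the bit pattern gives a rough first guess
    i = 0x5F3759DF - (i >> 1)
    y = struct.unpack("<f", struct.pack("<I", i))[0]
    # One Newton-Raphson step tightens the guess
    return y * (1.5 - 0.5 * x * y * y)
```

For example, `fast_inv_sqrt(4.0)` comes out close to 0.5, but not exactly 0.5, which is the whole point: a small, bounded error in exchange for skipping a divide and a square root.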

Heater. · 2011-01-05 22:12

One might put one's finger in the air and say that the number of transistors required for a processor goes up with the square of the data width. (Chip designers here please comment on that statement).

So instead of having a single 64 bit core we could have had four 32 bit cores. That would give four times the performance on day to day PC work. I could argue that the race to 64 bits of precision has held up performance gains we could have had with only 32 bits.

kwinn · 2011-01-06 10:25

Heater. wrote: »

One might put one's finger in the air and say that the number of transistors required for a processor goes up with the square of the data width. (Chip designers here please comment on that statement).

So instead of having a single 64 bit core we could have had four 32 bit cores. That would give four times the performance on day to day PC work. I could argue that the race to 64 bits of precision has held up performance gains we could have had with only 32 bits.

The ratio is less than the square of the data width, but a bit more than linear. Doubling the data width requires doubling memory, register, and gating circuitry sizes, and more than doubling gates that perform arithmetic and logic functions.

pharseid · 2011-01-06 13:06

I would also think it would depend on the exact execution hardware the processor contains. Flash multipliers are big to start with and they get bigger at slightly worse than the square of the data width. For adders, you could get away with a linear ratio, but it would get progressively slower without additional carry look-ahead logic. I would think the control logic wouldn't grow much at all.

phar
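The adder-versus-multiplier scaling discussed above can be put into rough numbers with a small Python sketch. It uses simple textbook cell counts (a ripple-carry adder and an unoptimized array multiplier), not figures from any real processor, and it ignores wiring, carry look-ahead and control logic. The function names are made up for illustration.

```python
def ripple_adder_cells(n):
    """Full adders in an n-bit ripple-carry adder: one per bit."""
    return n


def array_multiplier_cells(n):
    """Cell counts for a simple n x n array multiplier (textbook model)."""
    return {
        "and_gates": n * n,          # one per partial-product bit
        "half_adders": n,
        "full_adders": n * (n - 2),
    }


def multiplier_total(n):
    """Total cell count for the array multiplier model above."""
    return sum(array_multiplier_cells(n).values())


def growth(cost, narrow=32, wide=64):
    """How many times more cells the wide version needs than the narrow one."""
    return cost(wide) / cost(narrow)
```

In this model `growth(ripple_adder_cells)` is exactly 2, while `growth(multiplier_total)` comes out a little above 4 when going from 32 to 64 bits.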
