cessnapilot

07-20-2009, 04:08 AM

Hi,

64-bit Double precision float arithmetic can be done with pairs of 32-bit Single precision floats, using the Propeller's floating point software package. Each 64-bit Double operand is stored as the unevaluated sum of two 32-bit IEEE 754 Singles: the first holds the leading digits of the value and the second holds the trailing digits. The exponent range is almost the same as Single's. I show here, in Spin-like pseudocode, how Pi can be represented and how double precision addition is carried out with (Single-Single) pairs on the Propeller. Every arithmetic operation (11 altogether) and every intermediate result in the second listing is, of course, a 32-bit single precision float operation or value:

'-------------------------------------------------------------
PUB DS_Pi(dsA_)
'
'This returns Pi to Double precision
'
'Result dsA is given by reference
'
'Notation:
'           dsA[0] represents high-order Single,
'           dsA[1] represents low-order Single
'
'dsA[0] is Pi rounded to the nearest Single ($40490FDB);
'dsA[1] is the remainder Pi - dsA[0], rounded to Single ($B3BBBD2E)
dsA[0] :=  3.1415927E+00
dsA[1] := -8.742278E-08
'-------------------------------------------------------------
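For readers who want to check the two constants: a (Single-Single) pair can be derived from a reference value by rounding it to Single and then rounding the remainder to Single. The sketch below does this in Python on a PC, not on the Propeller. The helper names `f32` and `split` are made up for illustration, and Single rounding is emulated with the standard `struct` module. The point to watch is that the low-order word must be the remainder relative to the value actually stored in the high-order word, not relative to a rounded decimal literal.

```python
import math
import struct

def f32(x):
    """Round a 64-bit Python float to the nearest 32-bit IEEE 754 Single."""
    return struct.unpack("f", struct.pack("f", x))[0]

def split(x):
    """Return (hi, lo): hi is x rounded to Single, lo is the rounded remainder."""
    hi = f32(x)
    lo = f32(x - hi)  # remainder, computed in 64-bit arithmetic, then rounded
    return hi, lo

hi, lo = split(math.pi)
print(hi, lo)                    # high-order and low-order Singles for Pi
print(abs(hi + lo - math.pi))    # what the pair still misses
```

For Pi this gives a high word near 3.1415927 and a low word near -8.74E-08. Together the two words carry roughly 48 significand bits, which is about 14 decimal digits.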

'-------------------------------------------------------------
PUB DS_Add(dsA_, dsB_, dsC_) | t1, t2, e
'
'Computes (Single-Single) = (Single-Single) + (Single-Single)
'
'               dsC       =       dsA       +       dsB
'
'Parameters dsA, dsB and result dsC are given by reference
'
'Notation:
'           dsA[0] represents high-order Single,
'           dsA[1] represents low-order Single
'
'Order of operations, defined by the brackets, counts here
t1 := dsA[0] + dsB[0]
e  := t1 - dsA[0]
t2 := ((dsB[0] - e) + (dsA[0] - (t1 - e))) + dsA[1] + dsB[1]
'The result is t1 + t2, after normalization
dsC[0] := t1 + t2
dsC[1] := t2 - (dsC[0] - t1)
'-------------------------------------------------------------
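The addition listing can be tried out on a PC before committing it to Spin. The following is a minimal sketch in Python, assuming that every Single operation is emulated by rounding a 64-bit result back to 32 bits with the standard `struct` module. The names `f32`, `add`, `sub`, `split` and `ds_add` are illustrative and are not part of any Propeller object.

```python
import math
import struct

def f32(x):
    """Round a 64-bit Python float to the nearest 32-bit IEEE 754 Single."""
    return struct.unpack("f", struct.pack("f", x))[0]

def add(a, b):
    """Single + Single, rounded to Single."""
    return f32(a + b)

def sub(a, b):
    """Single - Single, rounded to Single."""
    return f32(a - b)

def split(x):
    """Turn a 64-bit value into a (hi, lo) pair of Singles."""
    hi = f32(x)
    return hi, f32(x - hi)

def ds_add(A, B):
    """(Single-Single) + (Single-Single); A and B are (hi, lo) tuples."""
    t1 = add(A[0], B[0])
    e = sub(t1, A[0])
    # Rounding error of t1, recovered exactly, then the two low-order words
    t2 = add(sub(B[0], e), sub(A[0], sub(t1, e)))
    t2 = add(add(t2, A[1]), B[1])
    # Normalization: c0 takes the leading digits, c1 the rest
    c0 = add(t1, t2)
    c1 = sub(t2, sub(c0, t1))
    return c0, c1

c0, c1 = ds_add(split(math.pi), split(math.e))
exact = math.pi + math.e                      # 64-bit reference value
err_pair = abs((c0 + c1) - exact)             # error of the (Single-Single) sum
err_single = abs(add(f32(math.pi), f32(math.e)) - exact)  # plain Single sum
print(err_pair, err_single)
```

In this test the plain Single sum is off in the seventh digit, while the (Single-Single) sum agrees with the 64-bit reference to better than 1E-12.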

The idea of improving floating point precision this way goes back to the 1960s. Nowadays, dedicated hardware, such as the GPUs on graphics cards or the Cell processor in the PS3, runs Single precision float operations so fast that it offers great potential for medical imaging, aerospace and defense. The Cell is hardware-optimized for vectorized Single precision floating point computation, with a peak Single precision performance of 204 GFLOPS. Scientific, CAD and defense computing, however, often need more than the roughly 7 significant digits that Single precision provides. So efficient software implementations of Double or Quad precision arithmetic on this cheap but capable hardware are of current interest. Some Cell variants can do Double precision calculations in hardware, but with an order-of-magnitude performance penalty.

Software implementations can compete with this by eliminating most of the performance penalty through clever programming. Here that means using Single precision math wherever possible, especially in the most compute-intensive parts of the code, and falling back to Double precision only when necessary. For many basic algorithms of high performance computing, this can be done without sacrificing the Double precision accuracy of the results.
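One well-known instance of this mixed-precision pattern is iterative refinement of a linear system: solve cheaply in Single, evaluate only the residual in Double, and use a further Single solve to correct the answer. The post does not say which algorithms are meant, so the choice of example, the 2x2 test system and all function names below are assumptions. This is a small Python sketch with emulated Single rounding, not Propeller code.

```python
import struct

def f32(x):
    """Round a 64-bit Python float to the nearest 32-bit IEEE 754 Single."""
    return struct.unpack("f", struct.pack("f", x))[0]

def solve2_single(A, rhs):
    """Solve a 2x2 system by Cramer's rule, rounding every step to Single."""
    a, b = f32(A[0][0]), f32(A[0][1])
    c, d = f32(A[1][0]), f32(A[1][1])
    r0, r1 = f32(rhs[0]), f32(rhs[1])
    det = f32(f32(a * d) - f32(b * c))
    x0 = f32(f32(f32(r0 * d) - f32(b * r1)) / det)
    x1 = f32(f32(f32(a * r1) - f32(r0 * c)) / det)
    return [x0, x1]

def residual_double(A, x, rhs):
    """rhs - A*x, evaluated in 64-bit arithmetic."""
    return [rhs[i] - (A[i][0] * x[0] + A[i][1] * x[1]) for i in range(2)]

A = [[4.0, 1.0], [1.0, 3.0]]
rhs = [1.0, 2.0]
exact = [1.0 / 11.0, 7.0 / 11.0]

x = solve2_single(A, rhs)                  # cheap Single precision solve
err_single = max(abs(x[i] - exact[i]) for i in range(2))

for _ in range(2):                         # two refinement steps
    r = residual_double(A, x, rhs)         # the only Double precision work
    dx = solve2_single(A, r)               # correction, again in Single
    x = [x[i] + dx[i] for i in range(2)]   # running solution kept in Double

err_refined = max(abs(x[i] - exact[i]) for i in range(2))
print(err_single, err_refined)
```

The expensive part, the solve, stays in Single throughout. Only the residual and the running solution are held in Double, which is where a (Single-Single) package would be used on the Propeller.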

To take advantage of this mixed Single/Double approach systematically, the code changes have to be made by hand, since restructuring an algorithm in this way is beyond today's compiler technology. A software library that contains 32/64-bit twin-float procedures makes these changes accessible and convenient for the programmer.

To allow for similar software tricks on the Propeller, I am coding these 'Single-Single' algorithms to enhance the Prop's capabilities in software-implemented floating point calculations. A Propeller object using (Single-Single) Doubles is in preparation. It will be placed on OBEX if there is any interest on the forum.

My questions to the Forum members are:

Do embedded applications with the Prop need Double precision (15 digits) at all?

(IBM recently came out with a version of the Cell that has hardware Double precision float support, aimed at embedded applications, in cooperation with other firms...)

Does anyone know of a downloadable, ready-made, bug-free and better solution for doing DP float math on the Propeller?

(A free, OBEX-quality SPIN file that compiles and works correctly will do...)

Will the four basic operations be enough in Double precision?

(Maybe for a lite DP package...)

Which functions are sufficient and necessary in Double precision for an enhanced, but basic math package?

(SQRT, SIN, LOG, ..., ?)

Is it worth sacrificing one, two or more COGs to make it fast?

(?)

What about making a software-based but high-speed, versatile Single/Double/Quad precision FPU from a single(!) Propeller with a large EEPROM for accurate tables?

(With an SPI interface, in a SixBladeProp, or so...?)

Should the solution be somewhat 'Navigation' oriented?

(ATAN2, WGS84/ECEF,... ?)

And,

Will anyone also use such code written for the Propeller/uM-FPU combination?

(Only one COG consumed, three to five times the speed, much less HUB RAM needed...?)

Cheers,

Istvan

Post Edited (cessnapilot) : 7/19/2009 9:20:24 PM GMT
