Rayman

01-10-2008, 10:57 PM

Anybody seen optimized multiply and divide routines in PASM here?

View Full Version : Assembly multiply and divide routines ?

Rayman

01-10-2008, 10:57 PM

Anybody seen optimized multiply and divide routines in PASM here?

hippy

01-10-2008, 11:04 PM

Not sure if optimised enough for your needs, but from my SpinVM ....

' | $F4 MPY lhs := lhs * rhs

' | $F5 MPY_BIG lhs := lhs ** rhs

' lhsTop:lhsBot := || lhs

' rhsBot := || rhs

' accTop:accBot := 0

'

' while rhs

' if rhs & 1

' accTop:accBot += lhsTop:lhsBot

' lhsTop:lhsBot := lhsTop:lhsBot << 1

' rhs := rhs >> 1

OvrMPY abs lhs,lhs WC ' Sets C if was negative

muxc :signBit,#1

abs rhs,rhs WC ' Sets C if was negative

IF_C xor :signBit,#1

mov :lhsTop,#0

mov :accBot,#0

mov :accTop,#0

:Loop shr rhs,#1 WC, WZ ' if rhs & 1

IF_C add :accTop,:lhsTop ' accTop:accBot += lhsTop:lhsBot

IF_C add :accBot,:lhsBot WC

IF_C add :accTop,#1

shl :lhsBot,#1 WC ' lhsTop:lhsBot := lhsTop:lhsBot << 1

rcl :lhsTop,#1

IF_NZ jmp #:Loop

shr :signBit,#1 WC, NR

IF_C neg :accBot,:accBot WZ

IF_C neg :accTop,:accTop

IF_C_AND_NZ sub :accTop,#1

shr opc,#1 WC, NR

IF_NC mov lhs,:accBot ' MPY

IF_C mov lhs,:accTop ' MPY_BIG

jmp #CommonReturn

' | $F6 DIV lhs := lhs / rhs

' | $F7 MOD lhs := lhs // rhs

OvrDIV abs rhs,rhs WC, WZ ' Sets C if was negative

IF_Z mov lhs,#0

IF_Z jmp #CommonReturn

muxc :signBit,#1

abs lhs,lhs WC ' Sets C if was negative

IF_C xor :signBit,#1

mov acc,#0

mov :count,#1

:ShlLoop and rhs,:k_4000_0000 WZ,NR

IF_Z shl rhs,#1

IF_Z add :count,#1

IF_Z jmp #:ShlLoop

:SubLoop shl acc,#1

cmps lhs,rhs WC, WZ, NR

IF_AE sub lhs,rhs

IF_AE or acc,#1

shr rhs,#1

djnz :count,#:SubLoop

' acc = result

' lhs = remainder

shr opc,#1 WC, NR

IF_NC mov lhs,acc ' DIV

' If sign bit was set we need to negate back

shr :signBit,#1 WC, NR

IF_C neg lhs,lhs

jmp #CommonReturn

:k_4000_0000 long $4000_0000

The multiply is 32x32=64 and can be easily cut-back to 32x32=32 and becomes much simpler ( remove lhs:top and acc:top ). It doesn't have deterministic timing.

Post Edited (hippy) : 1/10/2008 3:09:13 PM GMT

' | $F4 MPY lhs := lhs * rhs

' | $F5 MPY_BIG lhs := lhs ** rhs

' lhsTop:lhsBot := || lhs

' rhsBot := || rhs

' accTop:accBot := 0

'

' while rhs

' if rhs & 1

' accTop:accBot += lhsTop:lhsBot

' lhsTop:lhsBot := lhsTop:lhsBot << 1

' rhs := rhs >> 1

OvrMPY abs lhs,lhs WC ' Sets C if was negative

muxc :signBit,#1

abs rhs,rhs WC ' Sets C if was negative

IF_C xor :signBit,#1

mov :lhsTop,#0

mov :accBot,#0

mov :accTop,#0

:Loop shr rhs,#1 WC, WZ ' if rhs & 1

IF_C add :accTop,:lhsTop ' accTop:accBot += lhsTop:lhsBot

IF_C add :accBot,:lhsBot WC

IF_C add :accTop,#1

shl :lhsBot,#1 WC ' lhsTop:lhsBot := lhsTop:lhsBot << 1

rcl :lhsTop,#1

IF_NZ jmp #:Loop

shr :signBit,#1 WC, NR

IF_C neg :accBot,:accBot WZ

IF_C neg :accTop,:accTop

IF_C_AND_NZ sub :accTop,#1

shr opc,#1 WC, NR

IF_NC mov lhs,:accBot ' MPY

IF_C mov lhs,:accTop ' MPY_BIG

jmp #CommonReturn

' | $F6 DIV lhs := lhs / rhs

' | $F7 MOD lhs := lhs // rhs

OvrDIV abs rhs,rhs WC, WZ ' Sets C if was negative

IF_Z mov lhs,#0

IF_Z jmp #CommonReturn

muxc :signBit,#1

abs lhs,lhs WC ' Sets C if was negative

IF_C xor :signBit,#1

mov acc,#0

mov :count,#1

:ShlLoop and rhs,:k_4000_0000 WZ,NR

IF_Z shl rhs,#1

IF_Z add :count,#1

IF_Z jmp #:ShlLoop

:SubLoop shl acc,#1

cmps lhs,rhs WC, WZ, NR

IF_AE sub lhs,rhs

IF_AE or acc,#1

shr rhs,#1

djnz :count,#:SubLoop

' acc = result

' lhs = remainder

shr opc,#1 WC, NR

IF_NC mov lhs,acc ' DIV

' If sign bit was set we need to negate back

shr :signBit,#1 WC, NR

IF_C neg lhs,lhs

jmp #CommonReturn

:k_4000_0000 long $4000_0000

The multiply is 32x32=64 and can be easily cut-back to 32x32=32 and becomes much simpler ( remove lhs:top and acc:top ). It doesn't have deterministic timing.

Post Edited (hippy) : 1/10/2008 3:09:13 PM GMT

hippy

01-10-2008, 11:45 PM

If this is for your MP3 decoding it may be worthwhile looking at a lookup table for the

multiplication. It doesn't have to be 2^64 entries ( although that's the fastest ) but

could be done by 'quarter squares'. That reduces the size of lookup quite considerably

but I cannot remember by how much; 2^33 ?

I cannot remember the details but it involves turning "ab = a(bČ-aČ)/(b+a) + aČ" into

"ab = qs(a+b) - qs(a-b)" or something like that; a+b would give the lookup table size

of 2^33 as above. qs(n) = nČ/4

Added : details here ... mathworld.wolfram.com/QuarterSquaresRule.html (http://mathworld.wolfram.com/QuarterSquaresRule.html)

You might get some mileage out of dropping down to 16x16, even lower. Although you'll

lose quality at least you may be able to go forward.

Post Edited (hippy) : 1/10/2008 4:02:18 PM GMT

multiplication. It doesn't have to be 2^64 entries ( although that's the fastest ) but

could be done by 'quarter squares'. That reduces the size of lookup quite considerably

but I cannot remember by how much; 2^33 ?

I cannot remember the details but it involves turning "ab = a(bČ-aČ)/(b+a) + aČ" into

"ab = qs(a+b) - qs(a-b)" or something like that; a+b would give the lookup table size

of 2^33 as above. qs(n) = nČ/4

Added : details here ... mathworld.wolfram.com/QuarterSquaresRule.html (http://mathworld.wolfram.com/QuarterSquaresRule.html)

You might get some mileage out of dropping down to 16x16, even lower. Although you'll

lose quality at least you may be able to go forward.

Post Edited (hippy) : 1/10/2008 4:02:18 PM GMT

Paul Baker

01-11-2008, 01:25 AM

There is also some routines in Propeller Guts.pdf: http://forums.parallax.com/attachment.php?attachmentid=40588·(this document served as the manual before there was a manual).

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔

Paul Baker (mailto:pbaker@parallax.com)

Propeller Applications Engineer

[/url][url=http://www.parallax.com] (http://www.parallax.com)

Parallax, Inc. (http://www.parallax.com)

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔

Paul Baker (mailto:pbaker@parallax.com)

Propeller Applications Engineer

[/url][url=http://www.parallax.com] (http://www.parallax.com)

Parallax, Inc. (http://www.parallax.com)

deSilva

01-11-2008, 02:16 AM

Of course also published in deSilva's Machine Language tutorial ..if anyone would read it ... *sigh*

mpark

01-11-2008, 05:28 AM

Maybe more people would read your tutorial if you posted a link every time you mentioned it.

Like so: http://forums.parallax.com/showthread.php?p=668559

See also: Lighting a candle v. cursing the dark.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔

Michael Park

PS, BTW, and FYI:

To search the forum, use search.parallax.com (http://search.parallax.com) (do not use the Search button).

Check out the Propeller Wiki: propeller.wikispaces.com/ (http://propeller.wikispaces.com/)

Like so: http://forums.parallax.com/showthread.php?p=668559

See also: Lighting a candle v. cursing the dark.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔

Michael Park

PS, BTW, and FYI:

To search the forum, use search.parallax.com (http://search.parallax.com) (do not use the Search button).

Check out the Propeller Wiki: propeller.wikispaces.com/ (http://propeller.wikispaces.com/)

bambino

01-11-2008, 05:51 AM

deSilva,

I have read your tutorial and I apologize for having not commented on how informative it is. In fact, I am in the process of rereading it now, and I shall keep it in favorites!

Maybe you could post a link to it in your signature. That way it will get publicity even when your wrtting about something totally unrelated!

I have read your tutorial and I apologize for having not commented on how informative it is. In fact, I am in the process of rereading it now, and I shall keep it in favorites!

Maybe you could post a link to it in your signature. That way it will get publicity even when your wrtting about something totally unrelated!

deSilva

01-11-2008, 06:18 AM

Thanks, Bambino! And a good hint, as I MOSTLY write something totally unrelated http://forums.parallax.com/images/smilies/smile.gif

Though assembly code should not be overemphasized. Alas, the full wisdom of this forum is not easy to tap. Reading between some lines, I have the impression that even the much adversized work of our WIKI writers is hardly known. When I read in it from time to time I am always astonished about its comprehensive and to the point information. Great work there!

Though assembly code should not be overemphasized. Alas, the full wisdom of this forum is not easy to tap. Reading between some lines, I have the impression that even the much adversized work of our WIKI writers is hardly known. When I read in it from time to time I am always astonished about its comprehensive and to the point information. Great work there!