Shop OBEX P1 Docs P2 Docs Learn Events
Assembly multiply and divide routines ? — Parallax Forums

Assembly multiply and divide routines ?

RaymanRayman Posts: 14,801
edited 2008-01-10 22:18 in Propeller 1
Anybody seen optimized multiply and divide routines in PASM here?

Comments

  • hippyhippy Posts: 1,981
    edited 2008-01-10 15:04
    Not sure if optimised enough for your needs, but from my SpinVM ....

    ' |     $F4     MPY     lhs := lhs * rhs
    ' |     $F5     MPY_BIG lhs := lhs ** rhs
    
    '               lhsTop:lhsBot := || lhs
    '                      rhsBot := || rhs
    '               accTop:accBot := 0
    '
    '               while rhs
    '                 if rhs & 1
    '                   accTop:accBot += lhsTop:lhsBot
    '                   lhsTop:lhsBot := lhsTop:lhsBot << 1
    '                 rhs := rhs >> 1
               
    OvrMPY          abs     lhs,lhs WC                      ' Sets C if was negative
                    muxc    :signBit,#1
    
                    abs     rhs,rhs WC                      ' Sets C if was negative
            IF_C    xor     :signBit,#1
    
                    mov     :lhsTop,#0
                    mov     :accBot,#0
                    mov     :accTop,#0
                    
    :Loop           shr     rhs,#1 WC, WZ                   ' if rhs & 1
    
            IF_C    add     :accTop,:lhsTop                 '   accTop:accBot += lhsTop:lhsBot
            IF_C    add     :accBot,:lhsBot WC
            IF_C    add     :accTop,#1
    
                    shl     :lhsBot,#1 WC                   '   lhsTop:lhsBot := lhsTop:lhsBot << 1
                    rcl     :lhsTop,#1
           
            IF_NZ   jmp     #:Loop
    
                    shr     :signBit,#1 WC, NR
        IF_C        neg     :accBot,:accBot WZ
        IF_C        neg     :accTop,:accTop
        IF_C_AND_NZ sub     :accTop,#1
                    
                    shr     opc,#1 WC, NR
            IF_NC   mov     lhs,:accBot                     ' MPY
            IF_C    mov     lhs,:accTop                     ' MPY_BIG
    
                   jmp     #CommonReturn
    
    
    



    ' |     $F6     DIV     lhs := lhs / rhs
    ' |     $F7     MOD     lhs := lhs // rhs
    
    OvrDIV          abs     rhs,rhs WC, WZ                  ' Sets C if was negative
            IF_Z    mov     lhs,#0
            IF_Z    jmp     #CommonReturn
                    muxc    :signBit,#1
    
                    abs     lhs,lhs WC                      ' Sets C if was negative
            IF_C    xor     :signBit,#1
    
                    mov     acc,#0
                    
                    mov     :count,#1
    
    :ShlLoop        and     rhs,:k_4000_0000 WZ,NR
            IF_Z    shl     rhs,#1
            IF_Z    add     :count,#1
            IF_Z    jmp     #:ShlLoop
    
    :SubLoop        shl     acc,#1
                    cmps    lhs,rhs WC, WZ, NR
            IF_AE   sub     lhs,rhs
            IF_AE   or      acc,#1
                    shr     rhs,#1
                    djnz    :count,#:SubLoop
    
    '               acc   = result
    '               lhs   = remainder
    
                    shr     opc,#1 WC, NR
            IF_NC   mov     lhs,acc                         ' DIV
    
    '               If sign bit was set we need to negate back
    
                    shr     :signBit,#1 WC, NR
            IF_C    neg     lhs,lhs
    
                    jmp     #CommonReturn
    
    :k_4000_0000    long    $4000_0000
    
    
    



    The multiply is 32x32=64 and can be easily cut-back to 32x32=32 and becomes much simpler ( remove lhs:top and acc:top ). It doesn't have deterministic timing.

    Post Edited (hippy) : 1/10/2008 3:09:13 PM GMT
  • hippyhippy Posts: 1,981
    edited 2008-01-10 15:45
    If this is for your MP3 decoding it may be worthwhile looking at a lookup table for the
    multiplication. It doesn't have to be 2^64 entries ( although that's the fastest ) but
    could be done by 'quarter squares'. That reduces the size of lookup quite considerably
    but I cannot remember by how much; 2^33 ?

    I cannot remember the details but it involves turning "ab = a(b²-a²)/(b+a) + a²" into
    "ab = qs(a+b) - qs(a-b)" or something like that; a+b would give the lookup table size
    of 2^33 as above. qs(n) = n²/4

    Added : details here ... mathworld.wolfram.com/QuarterSquaresRule.html

    You might get some mileage out of dropping down to 16x16, even lower. Although you'll
    lose quality at least you may be able to go forward.

    Post Edited (hippy) : 1/10/2008 4:02:18 PM GMT
  • Paul BakerPaul Baker Posts: 6,351
    edited 2008-01-10 17:25
    There is also some routines in Propeller Guts.pdf: http://forums.parallax.com/attachment.php?attachmentid=40588·(this document served as the manual before there was a manual).

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Paul Baker
    Propeller Applications Engineer

    Parallax, Inc.
  • deSilvadeSilva Posts: 2,967
    edited 2008-01-10 18:16
    Of course also published in deSilva's Machine Language tutorial ..if anyone would read it ... *sigh*
  • mparkmpark Posts: 1,305
    edited 2008-01-10 21:28
    Maybe more people would read your tutorial if you posted a link every time you mentioned it.

    Like so: http://forums.parallax.com/showthread.php?p=668559

    See also: Lighting a candle v. cursing the dark.





    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Michael Park

    PS, BTW, and FYI:
    To search the forum, use search.parallax.com (do not use the Search button).
    Check out the Propeller Wiki: propeller.wikispaces.com/
  • bambinobambino Posts: 789
    edited 2008-01-10 21:51
    deSilva,
    I have read your tutorial and I apologize for having not commented on how informative it is. In fact, I am in the process of rereading it now, and I shall keep it in favorites!
    Maybe you could post a link to it in your signature. That way it will get publicity even when your wrtting about something totally unrelated!
  • deSilvadeSilva Posts: 2,967
    edited 2008-01-10 22:18
    Thanks, Bambino! And a good hint, as I MOSTLY write something totally unrelated smile.gif
    Though assembly code should not be overemphasized. Alas, the full wisdom of this forum is not easy to tap. Reading between some lines, I have the impression that even the much adversized work of our WIKI writers is hardly known. When I read in it from time to time I am always astonished about its comprehensive and to the point information. Great work there!
Sign In or Register to comment.