Multiply 8-bit by 8-bit unsigned, loop and unrolled — Parallax Forums

heater Posts: 3,370
edited 2009-06-16 11:30 in Propeller 1
I need an 8 bit by 8 bit unsigned multiply with 16 bit result for the MoCog emulator.

I'm sure this has been discussed a thousand times here already but I can never find anything in these forums so this is what I came up with.

The idea is the first rdword reads the bytes to be multiplied.

I'm using #ifdefs to compile in a fast in-line multiply or a slower, but smaller looping multiply.

BUT the loop multiply bails out early if it runs out of bits to multiply by (in the alu operand) so it will be faster than the in-line code for multiplying by smaller numbers.

Now that I'm pooped out and have a headache from thinking about all this, can anyone point out any flaws in it?

Never mind the junk at the end which just sets the 6809 flags.

EDIT: Embarrassingly wrong code deleted here. See next post.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
For me, the past is not over yet.

Post Edited (heater) : 6/16/2009 5:45:45 AM GMT

Comments

  • heater Posts: 3,370
    edited 2009-06-15 21:59
    Yes, yes, the flaw in the plan is that it does not work. When I thought I was testing the loop I still had the in-line code compiled in. I have to sleep....

    Edit: The traditional approach seems to work:

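    mul                     rdword  alu, d_reg              ' read both 8-bit operands as one 16-bit word
                            mov     data_16, alu
                            andn    data_16, #$FF           ' keep only the high byte (the multiplicand)
                            shl     data_16, #15            ' move it from bits 15..8 up to bits 30..23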
    #ifndef FAST_MUL
    #warn Using slow multiply
                            mov     data_8, #8
    
    :loop                   shr     alu, #1 wc              ' next multiplier bit into C
                  if_c      add     alu, data_16            ' add the multiplicand if that bit was set
                            djnz    data_8, #:loop
    #else
    #warn Using fast multiply
                            shr     alu, #1 wc
                  if_c      add     alu, data_16
                            shr     alu, #1 wc
                  if_c      add     alu, data_16
                            shr     alu, #1 wc
                  if_c      add     alu, data_16
                            shr     alu, #1 wc
                  if_c      add     alu, data_16
                            shr     alu, #1 wc
                  if_c      add     alu, data_16
                            shr     alu, #1 wc
                  if_c      add     alu, data_16
                            shr     alu, #1 wc
                  if_c      add     alu, data_16
                            shr     alu, #1 wc
                  if_c      add     alu, data_16
    #endif
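                            shr     alu, #16 wz             ' drop the low 16 bits; product is now in bits 15..0
                  if_z      or      cc, #zero_flag          ' set Z if the product is zero (never clears Z here)
                            test    alu, #%10000000 wz
                            muxnz   cc, #carry_flag         ' C = bit 7 of the product
                            wrword  alu, d_reg              ' write the 16-bit product back
                            jmp     #fetch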
    
    


    Post Edited (heater) : 6/15/2009 10:08:28 PM GMT
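
As a cross-check of the routine above, here is a minimal Python model of its register arithmetic. It is a sketch and not code from the thread: the names `mul8x8` and `mul_flags` are made up for illustration, and it assumes the multiplicand is the high byte and the multiplier the low byte of the word that `rdword` fetches. Registers are modelled as 32-bit values.

```python
MASK32 = 0xFFFFFFFF  # PASM registers are 32 bits wide


def mul8x8(word):
    """Model of the posted loop; `word` is the 16-bit value read by rdword."""
    alu = word & 0xFFFF
    # andn data_16, #$FF / shl data_16, #15: high byte ends up in bits 30..23
    data_16 = ((alu & ~0xFF) << 15) & MASK32
    for _ in range(8):
        carry = alu & 1                      # shr alu, #1 wc
        alu >>= 1
        if carry:                            # if_c add alu, data_16
            alu = (alu + data_16) & MASK32
    return alu >> 16                         # shr alu, #16


def mul_flags(product):
    """Flags as the posted tail computes them: Z on zero, C from bit 7."""
    zero = product == 0
    carry = bool(product & 0x80)
    return zero, carry


if __name__ == "__main__":
    # Exhaustive check against ordinary multiplication.
    for a in range(256):
        for b in range(256):
            assert mul8x8((a << 8) | b) == a * b
    print(mul8x8(0x0C0A))  # 12 * 10
```

The leftover copy of the high byte drifts down into the low 16 bits during the eight shifts and is discarded by the final shift, which is why the routine does not need to clear it first.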
  • Mark Swann Posts: 124
    edited 2009-06-15 22:30
    In the first, shorter loop you want to exit when the least significant 8 bits of alu are zero, but that is not what you are testing.

    Even if you can exit early, you still eventually need to shift a full 8 bits so that the product has the correct alignment for the final shr #16.

    You could try keeping a bit counter, exiting early, and then shifting right by the number of bits left in the counter to align the product properly. That would add an extra instruction after the loop, however:
    :loop                   shr     alu, #1 wc wz
                  if_c      add     alu, data_16
                  if_nz     djnz    data_8, #:loop
     
                            shr     alu, data_8
    
    

    The above is not perfect because it is still not testing the least significant 8 bits of alu, but perhaps you get the idea. If it exits early, data_8 contains the number of bits that remain to be shifted.

    I hope this helps.

    Mark

    PS, I haven't fully thought this through, so if someone can catch a mistake or can add more to the solution, please do.




    Post Edited (Mark Swann) : 6/15/2009 10:35:57 PM GMT
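
The early-exit idea can be sketched as a Python model, under the same assumptions as before (multiplicand in the high byte, multiplier in the low byte, 32-bit registers). This is an illustration rather than the poster's code: `mul8x8_early_exit` is a made-up name, and the shrinking `mask` is one possible way to test only the multiplier bits not yet consumed, since the rest of alu holds the multiplicand and partial product. Unlike the PASM fragment above, the model decrements its counter on every pass, so the counter is exactly the number of shifts still owed.

```python
MASK32 = 0xFFFFFFFF  # PASM registers are 32 bits wide


def mul8x8_early_exit(word):
    """Return (product, loop passes used) for the 16-bit operand word."""
    alu = word & 0xFFFF
    data_16 = ((alu & 0xFF00) << 15) & MASK32   # multiplicand in bits 30..23
    mask = 0xFF        # covers the multiplier bits still to be examined
    remaining = 8      # shifts still owed before the final shr #16
    while remaining and (alu & mask):
        carry = alu & 1
        alu >>= 1
        if carry:
            alu = (alu + data_16) & MASK32
        mask >>= 1
        remaining -= 1
    alu >>= remaining  # make up the skipped shifts so the product stays aligned
    return alu >> 16, 8 - remaining


if __name__ == "__main__":
    for a in range(256):
        for b in range(256):
            product, passes = mul8x8_early_exit((a << 8) | b)
            assert product == a * b
```

The pass count depends only on the position of the highest set bit in the multiplier. The extra mask test and mask shift on every pass are the sort of overhead that would have to be weighed against the saved iterations in real PASM.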
  • Cluso99 Posts: 18,069
    edited 2009-06-16 04:05
    I placed a link under the MoCog thread for the Spin Interpreter code. I don't think my version is any different from Chip's in the multiply section. You will have to trim for 8 bits.

    I haven't had time to look at your code above.
  • heater Posts: 3,370
    edited 2009-06-16 05:47
    Mark, yes, I was getting tired and optimistic. I can't see a way to bail out early that doesn't make the thing complicated enough not to bother with.
  • RvnPhnx Posts: 36
    edited 2009-06-16 11:30
    I've contemplated writing up a Booth multiply in PASM, but I have to admit that I haven't gotten around to it. In theory it shouldn't be that difficult, just a little messy.

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    --RvnPhnx
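
For reference, a radix-2 Booth recoding of an unsigned 8-bit multiplier might look like the sketch below. It is written in Python rather than PASM, it is not from the thread, and `booth_mul8x8` is a hypothetical name. The multiplier is treated as 9 bits with a zero on top so that an unsigned value with bit 7 set is not read as negative.

```python
def booth_mul8x8(a, b):
    """Unsigned 8x8 multiply using radix-2 Booth recoding of b."""
    a &= 0xFF
    b &= 0xFF
    product = 0
    prev = 0                      # the implicit bit to the right of bit 0
    for i in range(9):            # bits 0..7 plus a zero extension bit
        cur = (b >> i) & 1
        if cur and not prev:
            product -= a << i     # a run of ones starts: subtract
        elif prev and not cur:
            product += a << i     # a run of ones ends: add
        prev = cur
    return product


if __name__ == "__main__":
    for a in range(256):
        for b in range(256):
            assert booth_mul8x8(a, b) == a * b
```

Booth recoding only reduces the work when the multiplier contains long runs of ones. For alternating bit patterns it performs more add and subtract steps than the plain shift-and-add loop.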