Multiply 8-bit by 8-bit unsigned, loop and unrolled

heater · 2009-06-15 21:41

I need an 8 bit by 8 bit unsigned multiply with 16 bit result for the MoCog emulator.

I'm sure this has been discussed a thousand times here already but I can never find anything in these forums so this is what I came up with.

The idea is the first rdword reads the bytes to be multiplied.

I'm using #ifdefs to compile in a fast in-line multiply or a slower, but smaller looping multiply.

However, the looping multiply bails out early once it runs out of set bits to multiply by (in the alu operand), so it will be faster than the in-line code when multiplying by small numbers.

Now that I'm pooped out and have a headache from thinking about all this, can anyone point out any flaws in this?

Never mind the junk at the end which just sets the 6809 flags.

EDIT: Embarrassingly wrong code deleted here. See next post.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
For me, the past is not over yet.

Post Edited (heater) : 6/16/2009 5:45:45 AM GMT

heater · 2009-06-15 21:59

Yes, yes, the flaw in the plan is that it does not work. When I thought I was testing the loop I still had the in-line code compiled in. I have to sleep....

Edit: The traditional approach seems to work:

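mul                     rdword  alu, d_reg              ' Read both 8-bit operands as one 16-bit word
                        mov     data_16, alu
                        andn    data_16, #$FF           ' Keep only the upper byte (bits 8..15)
                        shl     data_16, #15            ' Move it up to bit 23, clear of the multiplier bits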
#ifndef FAST_MUL
#warn Using slow multiply
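                        mov     data_8, #8              ' Loop counter: one pass per multiplier bit

:loop                   shr     alu, #1 wc              ' Next multiplier bit (from the low byte) into C
              if_c      add     alu, data_16            ' Bit was set: add the aligned upper byte
                        djnz    data_8, #:loop          ' Always runs all 8 passes, no early exit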
#else
#warn Using fast multiply
                        shr     alu, #1 wc
              if_c      add     alu, data_16
                        shr     alu, #1 wc
              if_c      add     alu, data_16
                        shr     alu, #1 wc
              if_c      add     alu, data_16
                        shr     alu, #1 wc
              if_c      add     alu, data_16
                        shr     alu, #1 wc
              if_c      add     alu, data_16
                        shr     alu, #1 wc
              if_c      add     alu, data_16
                        shr     alu, #1 wc
              if_c      add     alu, data_16
                        shr     alu, #1 wc
              if_c      add     alu, data_16
#endif
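                        shr     alu, #16 wz             ' Product sits in bits 16..31; bring it down, Z if zero
                        muxz    cc, #zero_flag          ' Zero flag follows the result (set if zero, else cleared)
                        test    alu, #%10000000 wz      ' Test bit 7 of the product
                        muxnz   cc, #carry_flag         ' Carry flag = bit 7 of the product
                        wrword  alu, d_reg              ' Write the 16-bit product back
                        jmp     #fetch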
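
To see why the shift by 15 (upper byte ending up at bit 23) gives the right alignment, here is a small Python model of the multiply above. It is only a sketch for checking the arithmetic, not PASM: the function name mul8x8_model and the 32-bit mask are assumptions made for the illustration, and the 6809 flag handling is left out. Each add lands at bit 23, the remaining passes shift it right, and the final shift by 16 leaves the product in the low 16 bits.

```python
MASK32 = 0xFFFFFFFF  # Propeller registers are 32 bits wide


def mul8x8_model(d_word):
    """Model the shift-and-add multiply for a 16-bit word holding both operands."""
    alu = d_word & 0xFFFF
    # andn data_16, #$FF then shl data_16, #15: upper byte moved up to bit 23
    data_16 = ((alu & ~0xFF) << 15) & MASK32
    for _ in range(8):            # the unrolled version repeats this body 8 times
        carry = alu & 1           # shr alu, #1 wc
        alu >>= 1
        if carry:                 # if_c add alu, data_16
            alu = (alu + data_16) & MASK32
    return alu >> 16              # shr alu, #16


if __name__ == "__main__":
    # Exhaustive comparison against ordinary multiplication.
    for hi in range(256):
        for lo in range(256):
            assert mul8x8_model((hi << 8) | lo) == hi * lo
    print("all 65536 operand pairs match")
```

The upper operand byte that is still sitting in alu ends up in bits 0..7 after the eight shifts and is discarded by the final shift, so it never disturbs the product.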


Post Edited (heater) : 6/15/2009 10:08:28 PM GMT

Mark Swann · 2009-06-15 22:30

In the first shorter loop, you want to exit when the least significant 8 bits in alu are zero, but that is not what you are testing.

Even if you can exit early, you still want to eventually shift a full 8 bits so that the product will have the correct alignment for the final shr #16.

You could try keeping a bit counter, exit early, then shr enough bits per the bit counter to align the product properly. That would add an extra instruction after the loop, however.

:loop                   shr     alu, #1 wc wz
              if_c      add     alu, data_16
              if_nz     djnz    data_8, #:loop
 
                        shr     alu, data_8

The above is not perfect because it is still not testing the least 8 bits of alu, but perhaps you get the idea. If it exits early, data_8 contains the bits that remain to be shifted.
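
The bit-counter idea can be checked with a short Python model. This is a sketch under stated assumptions, not tested PASM: the name mul8x8_early_exit is made up for the illustration, the upper operand byte is cleared out of alu first (an extra step that is not in the code above) so that the low byte holds nothing but unused multiplier bits, and the exit test looks only at that low byte, which in PASM would presumably cost an extra instruction per pass. The counter is decremented before the exit test so that it holds exactly the number of shifts still owed; as far as I can tell, skipping the djnz on exit would leave the count one too high.

```python
MASK32 = 0xFFFFFFFF  # Propeller registers are 32 bits wide


def mul8x8_early_exit(d_word):
    """Return (product, passes) for a shift-and-add multiply that exits early."""
    alu = d_word & 0xFFFF
    data_16 = ((alu & ~0xFF) << 15) & MASK32  # upper byte moved up to bit 23
    alu &= 0xFF                  # assumption: keep only the multiplier in alu
    remaining = 8                # shifts still owed (plays the role of data_8)
    passes = 0
    while remaining:
        carry = alu & 1          # shr alu, #1 wc
        alu >>= 1
        remaining -= 1
        passes += 1
        if carry:                # if_c add alu, data_16
            alu = (alu + data_16) & MASK32
        if (alu & 0xFF) == 0:    # no multiplier bits left: stop early
            break
    alu >>= remaining            # make up the skipped shifts (shr alu, data_8)
    return alu >> 16, passes


if __name__ == "__main__":
    for hi in range(256):
        for lo in range(256):
            product, _ = mul8x8_early_exit((hi << 8) | lo)
            assert product == hi * lo
    print("all 65536 operand pairs match")
```

In this model a multiplier of 1 or 3 finishes in one or two passes, while any multiplier with bit 7 set still needs all eight, so the saving depends entirely on the operand values.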

I hope this helps.

Mark

PS, I haven't fully thought this through so if someone can catch a mistake or can add more to the solution, please do.

Post Edited (Mark Swann) : 6/15/2009 10:35:57 PM GMT

Cluso99 · 2009-06-16 04:05

I placed a link under the MoCog thread for the Spin Interpreter code. I don't think my version is any different from Chip's in the multiply section. You will have to trim for 8 bits.

I haven't had time to look at your code above.


heater · 2009-06-16 05:47

Mark, yes, I was getting tired and optimistic. I can't see a way to bail out early that doesn't make the thing complicated enough not to bother with.


RvnPhnx · 2009-06-16 11:30

I've contemplated writing up a Booth multiply in PASM, but I have to admit that I haven't gotten around to it. In theory it shouldn't be that difficult, just a little messy.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔

--RvnPhnx
