Multiply 8-bit by 8-bit unsigned, loop and unrolled
I need an 8-bit by 8-bit unsigned multiply with a 16-bit result for the MoCog emulator.
I'm sure this has been discussed a thousand times here already, but I can never find anything in these forums, so this is what I came up with.
The idea is that the first rdword reads the two bytes to be multiplied.
I'm using #ifdefs to compile in either a fast in-line multiply or a slower but smaller looping multiply.
However, the looping multiply bails out early if it runs out of bits to multiply by (in the alu operand), so it will be faster than the in-line code when multiplying by small numbers.
Now that I'm pooped out and have a headache from thinking about all this, can anyone point out any flaws in it?
Never mind the junk at the end, which just sets the 6809 flags.
EDIT: Embarrassingly wrong code deleted here. See next post.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
For me, the past is not over yet.
Post Edited (heater) : 6/16/2009 5:45:45 AM GMT

Comments
Edit: The traditional approach seems to work:
mul             rdword  alu, d_reg
                mov     data_16, alu
                andn    data_16, #$FF
                shl     data_16, #15
#ifndef FAST_MUL
#warn Using slow multiply
                mov     data_8, #8
:loop           shr     alu, #1 wc
        if_c    add     alu, data_16
                djnz    data_8, #:loop
#else
#warn Using fast multiply
                shr     alu, #1 wc
        if_c    add     alu, data_16
                shr     alu, #1 wc
        if_c    add     alu, data_16
                shr     alu, #1 wc
        if_c    add     alu, data_16
                shr     alu, #1 wc
        if_c    add     alu, data_16
                shr     alu, #1 wc
        if_c    add     alu, data_16
                shr     alu, #1 wc
        if_c    add     alu, data_16
                shr     alu, #1 wc
        if_c    add     alu, data_16
                shr     alu, #1 wc
        if_c    add     alu, data_16
#endif
                shr     alu, #16 wz
        if_z    or      cc, #zero_flag
                test    alu, #%10000000 wz
                muxnz   cc, #carry_flag
                wrword  alu, d_reg
                jmp     #fetch
Post Edited (heater) : 6/15/2009 10:08:28 PM GMT
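For anyone who wants to check the arithmetic without a Propeller to hand, here is a rough Python model of the shift-and-add sequence above. It is only a sketch: the name mul8x8, the parameter word (standing in for the 16-bit value that rdword fetches) and the explicit 32-bit masking are assumptions made for illustration, and the flag handling and wrword are left out.

```python
MASK32 = 0xFFFFFFFF  # model a 32-bit cog register


def mul8x8(word):
    """Multiply the high byte of `word` by its low byte, PASM-style."""
    alu = word & 0xFFFF                        # rdword alu, d_reg
    data_16 = ((alu & ~0xFF) << 15) & MASK32   # mov / andn #$FF / shl #15
    for _ in range(8):                         # looped or unrolled, 8 passes
        carry = alu & 1                        # shr alu, #1 wc
        alu >>= 1
        if carry:                              # if_c add alu, data_16
            alu = (alu + data_16) & MASK32
    return alu >> 16                           # shr alu, #16


print(mul8x8(0x0C0A))  # 12 * 10 -> 120
print(mul8x8(0xFFFF))  # 255 * 255 -> 65025
```

The high byte that is left sitting in alu ends up in bits 0..7 after the eight shifts, so the final shift by 16 discards it and only the product remains.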
Even if you can exit early, you still want to eventually shift a full 8 bits so that the product will have the correct alignment for the final shr #16.
You could try keeping a bit counter, exiting early, and then shifting right by the number of bits left in the counter to align the product properly. That would add an extra instruction after the loop, however:
:loop           shr     alu, #1 wc wz
        if_c    add     alu, data_16
        if_nz   djnz    data_8, #:loop
                shr     alu, data_8
The above is not perfect because it still does not test just the least significant 8 bits of alu, but perhaps you get the idea. If it exits early, data_8 contains the number of bits that remain to be shifted.
I hope this helps.
Mark
PS, I haven't fully thought this through, so if someone can catch a mistake or can add more to the solution, please do.
Post Edited (Mark Swann) : 6/15/2009 10:35:57 PM GMT
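To make the counter idea concrete, here is a small Python model of an early exit followed by a realigning shift. It is a sketch of the idea rather than a translation of the PASM above: the function name, the variable remaining (playing the role of data_8) and the masked test of only the unconsumed multiplier bits are all assumptions for illustration, and how that test would best be expressed in PASM is left open.

```python
MASK32 = 0xFFFFFFFF  # model a 32-bit cog register


def mul8x8_early_exit(word):
    """Return (product, passes taken) for high byte * low byte of `word`."""
    alu = word & 0xFFFF
    data_16 = ((alu & ~0xFF) << 15) & MASK32
    remaining = 8                              # plays the role of data_8
    while remaining:
        # Unconsumed multiplier bits sit in the low `remaining` bits of alu.
        if alu & ((1 << remaining) - 1) == 0:
            break                              # nothing left to multiply by
        carry = alu & 1                        # shr alu, #1 wc
        alu >>= 1
        if carry:                              # if_c add alu, data_16
            alu = (alu + data_16) & MASK32
        remaining -= 1
    alu >>= remaining                          # realign: total shift is 8
    return alu >> 16, 8 - remaining


print(mul8x8_early_exit(0x0301))  # 3 * 1 -> (3, 1)
```

In this model the number of passes equals the position of the highest set bit in the multiplier, so small multipliers finish in one or two passes while anything with bit 7 set still takes all eight.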
I haven't had time to look at your code above.