Multiply 8-bit by 8-bit unsigned, loop and unrolled
heater
Posts: 3,370
I need an 8-bit by 8-bit unsigned multiply with a 16-bit result for the MoCog emulator.
I'm sure this has been discussed a thousand times here already, but I can never find anything in these forums, so this is what I came up with.
The idea is that the first rdword reads the bytes to be multiplied.
I'm using #ifdefs to compile in either a fast in-line multiply or a slower but smaller looping multiply.
BUT the looping multiply bails out early once it runs out of bits to multiply by (in the alu operand), so it will be faster than the in-line code when multiplying by small numbers.
Now that I'm pooped out and have a headache from thinking about all this, can anyone point out any flaws in it?
Never mind the junk at the end which just sets the 6809 flags.
EDIT: Embarrassingly wrong code deleted here. See next post.
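For readers following along without the listing, here is a rough C sketch of the two variants described above: an unrolled multiply and a looping one that stops once the multiplier has no bits left. It is not the deleted Propeller code. The names mul8x8 and MUL_UNROLLED are made up for illustration, and the rdword fetch and the 6809 flag handling are left out.

```c
#include <stdint.h>

/* Hypothetical illustration only: the original was Propeller assembly.
   Define MUL_UNROLLED at compile time to select the in-line version. */

#ifdef MUL_UNROLLED
/* In-line version: fixed cost, one test-and-add per multiplier bit. */
static uint16_t mul8x8(uint8_t a8, uint8_t b)
{
    uint16_t a = a8;
    uint16_t product = 0;
    product += (b & 0x01) ? (a << 0) : 0;
    product += (b & 0x02) ? (a << 1) : 0;
    product += (b & 0x04) ? (a << 2) : 0;
    product += (b & 0x08) ? (a << 3) : 0;
    product += (b & 0x10) ? (a << 4) : 0;
    product += (b & 0x20) ? (a << 5) : 0;
    product += (b & 0x40) ? (a << 6) : 0;
    product += (b & 0x80) ? (a << 7) : 0;
    return product;
}
#else
/* Looping version: smaller, and it exits as soon as the multiplier
   is exhausted, so small multipliers take fewer iterations. */
static uint16_t mul8x8(uint8_t a8, uint8_t b)
{
    uint16_t a = a8;          /* multiplicand, shifted left each step */
    uint16_t product = 0;
    while (b != 0) {          /* bail out once no multiplier bits remain */
        if (b & 1u) {
            product += a;
        }
        a <<= 1;
        b >>= 1;
    }
    return product;
}
#endif
```

Because the multiplicand is shifted left here, the early exit needs no alignment fix-up; that only becomes an issue when the product is shifted right as it is built.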
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
For me, the past is not over yet.
Post Edited (heater) : 6/16/2009 5:45:45 AM GMT
Comments
Edit: The traditional approach seems to work:
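The listing that went with this post is not reproduced here. As a stand-in, this is a small C model of one common form of shift-and-add multiply, where the multiplier sits in the low bits of a register and the product is shifted right into place over exactly 8 steps. Whether this matches the code that was posted is an assumption; the function name is hypothetical.

```c
#include <stdint.h>

/* Hypothetical C model of an add-and-shift-right multiply.
   Not the original Propeller code from this post. */
static uint16_t mul8x8_shift_right(uint8_t a, uint8_t b)
{
    uint32_t x = (uint32_t)a << 8;  /* multiplicand, aligned above the multiplier bits */
    uint32_t y = b;                 /* multiplier in the low 8 bits; ends up as the product */

    for (int i = 0; i < 8; i++) {
        if (y & 1u) {
            y += x;                 /* low 8 bits of x are zero, so the multiplier bits are untouched */
        }
        y >>= 1;                    /* consume one multiplier bit, move the partial product down */
    }
    return (uint16_t)y;             /* after exactly 8 shifts y holds a * b */
}
```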
Post Edited (heater) : 6/15/2009 10:08:28 PM GMT
Even if you can exit early, you still want to eventually shift a full 8 bits so that the product will have the correct alignment for the final shr #16.
You could try keeping a bit counter, exiting early, and then using shr to shift by the number of bits left in the counter so the product is aligned properly. That would add an extra instruction after the loop, however.
The above is not perfect because it is still not testing the least significant 8 bits of alu, but perhaps you get the idea. If it exits early, data_8 contains the bits that remain to be shifted.
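The bit-counter idea can be sketched in C along these lines. This is only a model of the technique, not the PASM under discussion: the names are hypothetical, and unlike the original it keeps the multiplier and the partial product in separate variables so the exit test is simple.

```c
#include <stdint.h>

/* Hypothetical C model: early exit plus a single alignment shift afterwards. */
static uint16_t mul8x8_early_exit(uint8_t a, uint8_t b)
{
    uint32_t x = (uint32_t)a << 8;  /* multiplicand aligned 8 bits up */
    uint32_t acc = 0;               /* partial product, shifted right as we go */
    uint32_t m = b;                 /* multiplier bits still to be examined */
    int remaining = 8;              /* bit counter: shifts still owed */

    while (m != 0) {                /* exit as soon as no multiplier bits are left */
        if (m & 1u) {
            acc += x;
        }
        acc >>= 1;
        m >>= 1;
        remaining--;
    }

    acc >>= remaining;              /* the one extra step: finish the full 8-bit shift */
    return (uint16_t)acc;
}
```

For example, with b = 5 the loop runs three times and the final statement shifts by the remaining 5 bits, so the product ends up aligned exactly as if all 8 iterations had run.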
I hope this helps.
Mark
PS, I haven't fully thought this through, so if someone can catch a mistake or can add more to the solution, please do.
Post Edited (Mark Swann) : 6/15/2009 10:35:57 PM GMT
I haven't had time to look at your code above.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Links to other interesting threads:
· Home of the MultiBladeProps: TriBladeProp, SixBladeProp, website (Multiple propeller pcbs)
· Single Board Computer: 3 Propeller ICs and a TriBladeProp board (ZiCog Z80 Emulator)
· Prop Tools under Development or Completed (Index)
· Emulators: Micros eg Altair, and Terminals eg VT100 (Index)
· Search the Propeller forums (via Google)
My cruising website is: www.bluemagic.biz  MultiBladeProp is: www.bluemagic.biz/cluso.htm
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
--RvnPhnx