[POC] Countiplier, The

Disclaimer: No counters were harmed during this exercise.
Well, I was missing the technical discussion/discoveries in this forum, so I sat down and implemented an idea I've had for some time now: multiplication with counter support. The code is still pretty rough, and a number of optimisations can be applied depending on usage patterns. The important bit is that it multiplies (correctly) at 4 cycles/bit.
The multiplier cog itself shortcuts execution depending on how many bits are used, but only at byte granularity (8/16/24/32 bits). The gate cog lazily sends out all bits of the operand, so its recovery time may work against running multiplications back to back.
The example code is set to perform $89AB * $CDEF = $6EBE75A5.
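For anyone who wants to check the numbers without a Propeller at hand, here is a rough software model in Python. It is only a sketch of the arithmetic, not the actual cog code: it assumes the gate presents the operand bits LSB first, that the accumulator (standing in for phsa) picks up the current value of the doubled operand (standing in for frqa) whenever the presented bit is set, and that the byte-granular shortcut is decided by the gated operand. The function and variable names are made up for illustration.

```python
def countiplier_model(a, b):
    """Shift-and-add model; returns (product mod 2**32, bit steps used)."""
    # Byte granularity: process 8, 16, 24 or 32 bits of b,
    # whichever is the smallest width that covers it (assumption).
    bits = 8
    while bits < 32 and b >> bits:
        bits += 8

    frq = a & 0xFFFFFFFF  # stands in for frqa, doubled on every step
    phs = 0               # stands in for phsa, the accumulator
    for i in range(bits):
        if (b >> i) & 1:                    # bit presented by the gate
            phs = (phs + frq) & 0xFFFFFFFF  # accumulate while gated
        frq = (frq + frq) & 0xFFFFFFFF      # models add frqa, frqa
    return phs, bits


product, steps = countiplier_model(0x89AB, 0xCDEF)
print(f"${product:08X} in {steps} bit steps")  # $6EBE75A5 in 16 bit steps
```

With 16-bit operands the model stops after 16 steps, which is the kind of early exit the byte granularity is meant to allow.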
Enjoy!
Comments
Your post makes me feel like Winnie the Pooh. I suspect that if I think very hard, I might just be able to appreciate that you've done something remarkably clever. Alas, it's late and my brain isn't up to it just now. But congratulations anyway.
I never considered using an instruction like this before:
add_frqa_frqa = $80BFF5FA ' add frqa, frqa
...
long add_frqa_frqa[7] ' adjust operand
I suppose that specifies 7 additions. Toolbox item.
Nice work.
Neither did I until now, but it greatly helps with readability, especially when you have lots of them (and are more interested in the code structure). The number in brackets is the repeat count (explained with an example in the manual around page 102, Declaring Repeating Data).
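In case the magic number looks arbitrary, it can be rebuilt from the instruction fields. The Python sketch below packs a Propeller instruction long from opcode, ZCRI flags, condition, destination and source, using frqa at register address $1FA and the add opcode %100000 with only the write-result flag set. The helper name encode_pasm is made up for this illustration, and the list multiplication merely mimics what the [7] repeat count does in a DAT section.

```python
def encode_pasm(opcode, zcri, cond, dest, src):
    """Pack a 32-bit instruction: opcode(6) zcri(4) cond(4) dest(9) src(9)."""
    return (opcode << 26) | (zcri << 22) | (cond << 18) | (dest << 9) | src


FRQA = 0x1FA         # register address of frqa
ADD = 0b100000       # add opcode
WRITE_RESULT = 0b0010  # Z=0, C=0, R=1, I=0 (source is a register, not a literal)
ALWAYS = 0b1111      # unconditional execution

add_frqa_frqa = encode_pasm(ADD, WRITE_RESULT, ALWAYS, FRQA, FRQA)
block = [add_frqa_frqa] * 7  # same effect as: long add_frqa_frqa[7]

print(f"${add_frqa_frqa:08X} x {len(block)}")  # $80BFF5FA x 7
```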
Admittedly, the multiplier itself uses too much code in its current shape (the culprit being the mov phsa, #0). But ATM it has to be in exactly this place, as the gate cog works directly with outa, which causes a number of undefined states until the time is right. There is a solution for this which costs 4 cycles but would save 38 longs.
Also, both cogs have to run with IDs N and (N+1) mod 8, as I need them 2 cycles apart when coming out of a hub op.
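The ID constraint follows from how the hub hands out access: each cog gets its window once every 16 system clocks, and neighbouring cog IDs are served 2 clocks apart. A small Python sketch of that round-robin timing is below; the function names are made up, and it only models the hub rotation, not instruction timing after the hub op.

```python
HUB_PERIOD = 16      # system clocks for one full hub rotation
CLOCKS_PER_SLOT = 2  # clocks between windows of neighbouring cog IDs


def hub_slot_offset(cog_id):
    """Clock offset of a cog's hub window within one rotation."""
    return (cog_id % 8) * CLOCKS_PER_SLOT


def clocks_behind(first, second):
    """How many clocks after `first` the cog `second` gets its window."""
    return (hub_slot_offset(second) - hub_slot_offset(first)) % HUB_PERIOD


for n in range(8):
    partner = (n + 1) % 8
    print(n, partner, clocks_behind(n, partner))  # third column is always 2
```

The mod 8 wrap matters for the pair 7 and 0, which still end up 2 clocks apart.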
Thanks!
Kuroneko, congratulations for escalating the Forum to a higher level of technical discussion and discovery!
Humanoido