Faster PASM Multiplication?
I'm learning PASM. I'm reading deSilva's guide and it goes over a simple multiplication routine. It seems as though all it's doing is recursion; adding a, b times. Isn't there a faster way of multiplying and dividing? It seems that a simple multiplication routine takes longer, the bigger the numbers...
Comments
http://en.wikipedia.org/wiki/Multiplication_algorithm
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Paul Baker
Propeller Applications Engineer
Parallax, Inc.
What I had in mind, was a combo of shifting and adding(for multiplication). Let's say you had 3 * 9. What you could do is 3<<3(3*2*2*2), then add one more three. Granted, it would take some clock cycles to determine how many x2's you'd have to do, but for large numbers, it seems the overhead would be worth it.
Another example, 6*45 = 6*32 + 6*8 + 6*4 + 6 = 6<<5 + 6<<3 + 6<<2 + 6... My thinking is that if you can use a few extra clock cycles to figure out how many shifts you need to do, you might make those cycles up when you don't have to use iteration.
I'm certainly not saying that there isn't a faster way, but I haven't seen it. One possibility is instead of actually doing 16 full iterations, stopping as soon as there are no more 1 bits left in y. This would require a final shift instruction to align the result, but would be much faster for multiplying small numbers (i.e. with < 16 bits of input) as it could end early. The trick is finding a way to short-circuit the loop without taking more than 3 instructions, as your worst case scenario would be much worse. For example, I could do this in 4 instructions, yielding a wort case scenario of 33% slower than the current way. I have not found a way to do this in only 3 instructions yet, of course [noparse][[/noparse]8^)
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
lonesock
Piranha are people too.
Post Edited (lonesock) : 6/23/2008 10:08:29 PM GMT
int mul(int op1, int op2)
{
int res = 0;
while (op1)
{
if ((op1 & 1) != 0)
res += op2;
op2 <<= 1;
op1 >>= 1;
}
return res;
}
For constant multiplication, a compiler may generate the shift-add/sub sequence in line, as you can shift by multiple bits if there is a sequence of zeroes.
// richard
I've been greatly amused by the miscommunication in this thread. The "inefficient algorthm" that Philldapill started off the thread wondering about was a*b = a+a+...+a+a (i.e. a added to itself b times) — clearly suboptimal and not the method anyone in their right mind would use. Of course, the algorithm that everyone uses and is mentioned multiple times in this thread, and which Philldapill "rediscovered" in his second post, is the venerable shift-and-add routine. I think everyone is on the same page here, but I'm not so sure everyone knows that.
-Phil