Does the propeller do multiply/divide in hardware?
webmasterpdx
Posts: 39
Does the propeller have a hardware multiply and divide in the processors?
Thanks
-D
Comments
.... that was easy ;o)
How about a barrel shifter to shift any number of bits in one instruction time? That could speed up a multiply algorithm.
Yes.
www.parallax.com/Portals/0/Downloads/docs/prod/prop/WebPM-v1.1.pdf
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
It's not particularly silly, is it?
Here's a download page containing documents that will answer most questions you may have about the Propeller: www.parallax.com/tabid/442/Default.aspx. If, after reading them, you have further questions, please do not hesitate to pose them here.
Thanks,
-Phil
There are lots of fast multiplies in PASM for the Prop floating around here: 8, 16, and 32 bit.
For example, Cheetah Integer Division: http://forums.parallax.com/showthread.php?p=828437
and Kenyan Multiplication: http://forums.parallax.com/showthread.php?p=828096
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
For me, the past is not over yet.
Got the answers I need, plus some interesting algorithms I wasn't familiar with.....I love playing around with that kind of stuff.
Will have to check out the fast multiply/divides presented above.
Thanks again.
-Donald
Are you posting here just to drop your SPAM-links?
Or does your Prop have Cellulite?
Nick
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Never use force, just go for a bigger hammer!
The DIY Digital-Readout for mills, lathes etc.:
YADRO
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Nyamekye,
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Has anyone used the huge log table to do a fast multiply? If the table is 2K, that's a 12-bit multiply and divide converted into an add and a subtract, respectively.
Multiply should be:
lx = LOG2(x)
ly = LOG2(y)
lxy = lx + ly
xy = ALOG2(lxy)
The only issue is that to get the log and antilog you'd have to wait for the hub to get to that COG.
Question: how many instruction (4-clock) cycles does it take from one hub appearance to the next for a specific COG? Also, how many instructions does the hub remain with a COG before moving on?
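As a rough illustration of the LOG2/ALOG2 steps above, here is a minimal Python sketch. It uses floating-point `math.log2` and exponentiation as stand-ins for the Propeller's log and antilog tables, which is an assumption made purely for illustration. The function names `log_multiply` and `log_divide` are hypothetical, and nothing here models the table lookup or the hub wait.

```python
import math

def log_multiply(x, y):
    """Multiply two positive numbers by adding their base-2 logs."""
    lx = math.log2(x)          # lx = LOG2(x)
    ly = math.log2(y)          # ly = LOG2(y)
    lxy = lx + ly              # the multiply becomes an add
    return 2.0 ** lxy          # xy = ALOG2(lxy)

def log_divide(x, y):
    """Divide two positive numbers by subtracting their base-2 logs."""
    return 2.0 ** (math.log2(x) - math.log2(y))

if __name__ == "__main__":
    # Results are floats and generally need rounding to recover an integer.
    print(round(log_multiply(12, 10)))
    print(round(log_divide(144, 12)))
```

The result comes back as an approximation, so an integer product has to be rounded. With a finite lookup table the approximation is coarser than it is with floating point.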
Thx
-D
Page 24
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
What I was very pleasantly surprised by was that the Propeller allows one COG to begin its HUB operation while another COG can begin its HUB operation 2 cycles later (regardless of whether it takes 7 cycles to complete the first COG's operation). So basically, it's all pipelined, and every COG gets access to the hub to execute a hub instruction every 16 clocks, or 4 instruction cycles. It can also run 2 local instructions between each hub instruction without missing a beat. Pretty efficient. This means that almost no time is spent waiting on the hub.
OK, on to the multiply. Unfortunately, using logs won't work for a quick integer multiply as you have to take account of decimal exponents. It's OK for fixed point math, but for byte multiplies you need exact results typically.....so it's not the best way to proceed.
There are some ways to speed up multiply. E.g., you could do 4-bit multiplies using a 256-byte table. Then a byte multiply can be done quickly by combining nibbles together and using the result to look up that table, shifting the values left 4 bits and adding these numbers together, etc...
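Here is a minimal Python sketch of the nibble-table idea above, just to check the arithmetic. The names `NIBBLE_PRODUCTS`, `nibble_mul`, and `byte_mul` are hypothetical, and it says nothing about how many PASM instructions an equivalent routine would take.

```python
# 256-entry table of 4-bit x 4-bit products. The largest is 15 * 15 = 225,
# so every entry fits in one byte. Index = (a << 4) | b.
NIBBLE_PRODUCTS = bytes(a * b for a in range(16) for b in range(16))

def nibble_mul(a, b):
    """Look up the product of two 4-bit values."""
    return NIBBLE_PRODUCTS[(a << 4) | b]

def byte_mul(x, y):
    """Multiply two 8-bit values using four table lookups, shifts, and adds."""
    xh, xl = x >> 4, x & 0xF
    yh, yl = y >> 4, y & 0xF
    high = nibble_mul(xh, yh) << 8                        # weight 256
    mid = (nibble_mul(xh, yl) + nibble_mul(xl, yh)) << 4  # weight 16
    low = nibble_mul(xl, yl)                              # weight 1
    return high + mid + low

if __name__ == "__main__":
    print(byte_mul(200, 150))
```

The result is exact because x*y = (xh*yh)<<8 + (xh*yl + xl*yh)<<4 + xl*yl. The cost is four lookups per byte multiply.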
I'll need to think on this. I like the new algorithms, but we can always use a faster algorithm....
Thx
-D
Leon
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Amateur radio callsign: G1HSM
Suzuki SV1000S motorcycle
-Phil
I know from past experience that if you have a multiply, you can use it to calculate a divide quickly, so just a hardware multiply would have been nice. Maybe in the next rev.
I'd like to see a byte multiply in under 5 instructions using a combination of lookups and algorithm, or a 16-bit multiply in under 12 instructions.
I need to look more at the log table to see if it could be used more efficiently....
-D
To get log2 of a 32-bit integer, you first get the location of the MS bit and that gives you a value from 0-31 which is the integer part of the log2 function (fits in 5 bits). This is placed in bits 20-16 of the result (high 16 bits). Then you take the next 11 bits below the MS bit and index this into the log table. The 16-bit value returned is placed in bits 15-0 (low 16 bits) of the result. And you are done! That's the log2(x).
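The procedure just described can be sketched in Python as follows. This is only an illustration under stated assumptions: `LOG_TABLE` is a stand-in computed with `math.log2` (2048 entries of 16-bit fractions), not the actual ROM contents, and `fixed_log2` is a hypothetical name.

```python
import math

# Stand-in table (assumption): entry i holds the fractional part of
# log2(1 + i/2048), scaled to 16 bits. The real ROM table may differ.
LOG_TABLE = [round(math.log2(1 + i / 2048) * 65536) for i in range(2048)]

def fixed_log2(x):
    """Return log2(x) with the integer part in bits 20-16, fraction in 15-0."""
    if not 0 < x < (1 << 32):
        raise ValueError("x must be a nonzero 32-bit unsigned integer")
    msb = x.bit_length() - 1                 # position of the MS set bit, 0-31
    justified = (x << (31 - msb)) & 0xFFFFFFFF  # move the MS bit up to bit 31
    index = (justified >> 20) & 0x7FF        # the 11 bits just below the MS bit
    return (msb << 16) | LOG_TABLE[index]

if __name__ == "__main__":
    value = fixed_log2(3)
    print(value / 65536, math.log2(3))       # table result next to the float log
```

Any bits below the 11 index bits are simply dropped in this sketch, so inputs with more than 12 significant bits lose precision.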
The example they give seems to be way too long for that.
If there is an assembler instruction that tells you the MS set bit, or something like that, it could give you the integer part. The fractional part (to the right of the binary point) can be got simply by taking the 11 bits below the MS bit and indexing into the log table.
I suspect that by using this, and by finding a quick way to do the calculations above, this way of doing multiply will give you accurate values for byte calculations and possibly up to 11 or 12 bits with 100% accuracy.
The trick is doing this calculation very quickly.
Be interested in seeing:
1. How many bits it can do the calculation without any loss.
2. How many cycles (minimum) can it be done in.
-D
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Nyamekye,
-Phil
A digit reverse is where each pair of bits is taken as a digit and the address is reversed with that in mind (for radix 4 FFTs).
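As a small illustration of that definition, here is a Python sketch of a radix-4 digit reverse. The function name `digit_reverse_radix4` and the `num_digits` parameter (the number of 2-bit digits in the address) are hypothetical.

```python
def digit_reverse_radix4(index, num_digits):
    """Reverse the order of the 2-bit digits in an address."""
    result = 0
    for _ in range(num_digits):
        result = (result << 2) | (index & 0b11)  # append the lowest digit
        index >>= 2                              # drop that digit
    return result

if __name__ == "__main__":
    # Digits (00, 01, 10) become (10, 01, 00).
    print(bin(digit_reverse_radix4(0b000110, 3)))
```

Applying the function twice with the same `num_digits` returns the original address, the same as with an ordinary bit reverse.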
-D
Thx
-D
The best chart of assembly instructions is in Chip's Propeller Guts (pdf) paper. The chart in the manual doesn't say what the instructions do; the one in Chip's paper does. ('Not sure why they left that out of the one in the manual. I refer to Chip's version constantly.)
-Phil