Does the propeller do multiply/divide in hardware? — Parallax Forums

webmasterpdx Posts: 39
edited 2009-09-02 05:31 in Propeller 1
Does the propeller have a hardware multiply and divide in the processors?

Thanks
-D

Comments

  • MagIO2 Posts: 2,243
    edited 2009-09-01 08:28
    No



    .... that was easy ;o)
  • webmasterpdx Posts: 39
    edited 2009-09-01 08:35
    That sucks. Any hardware assist?
    How about a barrel shifter to shift any number of bits in one instruction time? That could speed up a multiply algorithm.
  • BradC Posts: 2,601
    edited 2009-09-01 08:39
    webmasterpdx said...
    That sucks. Any hardware assist?
    How about a barrel shifter to shift any number of bits in one instruction time? That could speed up a multiply algorithm.

    Yes.

    www.parallax.com/Portals/0/Downloads/docs/prod/prop/WebPM-v1.1.pdf

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    It's not particularly silly, is it?
  • Phil Pilgrim (PhiPi) Posts: 23,514
    edited 2009-09-01 08:45
    webmasterpdx,

    Here's a download page containing documents that will answer most questions you may have about the Propeller: www.parallax.com/tabid/442/Default.aspx. If, after reading them, you have further questions, please do not hesitate to pose them here.

    Thanks,
    -Phil
  • heater Posts: 3,370
    edited 2009-09-01 08:55
    Didn't we already answer this one? Whatever.

    There are lots of fast multiplies in PASM for the Prop floating around here: 8, 16, and 32 bit.

    For example, Cheetah Integer Division: http://forums.parallax.com/showthread.php?p=828437
    and Kenyan Multiplication: http://forums.parallax.com/showthread.php?p=828096

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    For me, the past is not over yet.
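
For readers who have not met the shift-and-add approach the posts above point to, here is a minimal Python sketch. It is not PASM and is not taken from the linked threads. It assumes "Kenyan multiplication" refers to the halve-and-double method, and the function name `shift_add_multiply` and the 32-bit default for `bits` are illustrative choices only.

```python
def shift_add_multiply(a: int, b: int, bits: int = 32) -> int:
    """Multiply two unsigned integers using only shifts, adds and bit tests.

    The result is truncated to `bits` bits, mimicking a fixed-width register.
    """
    mask = (1 << bits) - 1
    product = 0
    while b:
        if b & 1:
            # Lowest bit of the multiplier is set: add the current multiplicand.
            product = (product + a) & mask
        a = (a << 1) & mask  # double the multiplicand
        b >>= 1              # halve the multiplier
    return product


if __name__ == "__main__":
    print(shift_add_multiply(13, 11))
```

The loop runs once per significant bit of the multiplier, which is why single-cycle shifts matter so much for a software multiply.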
  • webmasterpdx Posts: 39
    edited 2009-09-01 09:02
    Thanks to all who have replied....
    Got the answers I need, plus some interesting algorithms I wasn't familiar with.....I love playing around with that kind of stuff.

    Will have to check out the fast multiply/divides presented above.

    Thanks again.
    -Donald
  • heater Posts: 3,370
    edited 2009-09-01 09:22
    Yeah. That Kenyan thing really got me when it turned up here recently. Never heard of such a thing before.

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    For me, the past is not over yet.
  • Nick Mueller Posts: 815
    edited 2009-09-01 10:09
    @thoufiq987:
    Are you posting here just to drop your SPAM-links?

    Or does your Prop have Cellulite?


    Nick

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Never use force, just go for a bigger hammer!

    The DIY Digital-Readout for mills, lathes etc.:
    YADRO
  • Kye Posts: 2,200
    edited 2009-09-01 11:15
    Spam

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Nyamekye,
  • heater Posts: 3,370
    edited 2009-09-01 11:16
    Machine Intelligence just got something working!!!

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    For me, the past is not over yet.
  • Agent420 Posts: 439
    edited 2009-09-01 11:21
    ^ I don't care who you are, that there's funny.

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
  • webmasterpdx Posts: 39
    edited 2009-09-01 15:30
    I'm not sure who the tape recorder is....but I have another question about multiply/divide.

    Has anyone used the huge log table to do fast multiply? If the table is 2K, that's a 12-bit multiply and divide converted into an add and subtract respectively.

    Multiply should be:

    lx = LOG2(x)
    ly = LOG2(y)
    lxy = lx + ly
    xy = ALOG2(lxy)

    The only issue is to get the log and antilog you'd have to wait for the hub to get to that COG.

    Question: how many INSTRUCTION (4x) cycles does it take from one hub appearance to the next for a specific COG? Also, how many instructions does the HUB remain with a COG before moving on?

    Thx
    -D
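
The log/antilog idea in the post above can be tried out in a few lines of Python. This is only a sketch under stated assumptions: it uses `math.log2` in place of any on-chip table, and the 16 fractional bits (`FRAC_BITS`) and the names `log2_fixed`, `alog2_fixed` and `log_multiply` are hypothetical choices made for illustration.

```python
import math

FRAC_BITS = 16  # assumed fixed-point precision of the stored logarithm


def log2_fixed(x: int) -> int:
    """log2(x) as a fixed-point integer with FRAC_BITS fractional bits."""
    return round(math.log2(x) * (1 << FRAC_BITS))


def alog2_fixed(l: int) -> int:
    """Inverse of log2_fixed: 2 raised to a fixed-point exponent, rounded."""
    return round(2.0 ** (l / (1 << FRAC_BITS)))


def log_multiply(x: int, y: int) -> int:
    """Multiply by adding logarithms, following the LOG2/ALOG2 steps above."""
    lx = log2_fixed(x)
    ly = log2_fixed(y)
    lxy = lx + ly
    return alog2_fixed(lxy)


if __name__ == "__main__":
    for x, y in [(8, 16), (1000, 3000)]:
        print(x, y, log_multiply(x, y), x * y)
```

Powers of two come out exact, but other inputs are in general only close to the true product, because each logarithm is rounded before the addition.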
  • Agent420 Posts: 439
    edited 2009-09-01 15:59
    Page 382

    Page 24

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
  • webmasterpdx Posts: 39
    edited 2009-09-01 17:58
    A couple of good and bad points based on the manual. I had trouble downloading that manual. Eventually, I had to right click and download the link instead of trying to view it in my browser. I think there is a Vista problem with wireless and big files, in that in some situations it seems to hang when downloading files. Anyway, once I downloaded it I was able to store a local copy of the manual, which seems to run fine now.

    What I was very pleasantly surprised by was that the Propeller allows one COG to begin its HUB operation while another COG begins its HUB operation 2 cycles later (regardless of whether it takes 7 cycles to complete the first COG's operation). So basically it's all pipelined, and every COG gets access to the hub to execute a hub instruction every 16 clocks, or 4 instruction cycles. It can also run 2 local instructions between each hub instruction without missing a beat. Pretty efficient. This means that almost no time is spent waiting on the hub.

    OK, on to the multiply. Unfortunately, using logs won't work for a quick integer multiply as you have to take account of decimal exponents. It's OK for fixed point math, but for byte multiplies you need exact results typically.....so it's not the best way to proceed.

    There are some ways to speed up multiply. E.g., you could do 4-bit multiplies using a 256-byte table. Then a byte multiply can be done quickly by combining nibbles together and using the result to look up that table, shifting values left 4 bits and adding these numbers together, etc.
    I'll need to think on this. I like the new algorithms, but we can always use a faster algorithm....

    Thx
    -D
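
The nibble-table idea in the post above can be sketched in Python as follows. The table layout (index formed from two 4-bit operands) and the names `NIBBLE_PRODUCTS` and `byte_multiply` are assumptions made for illustration; the post does not specify them.

```python
# 256-entry table: index = (a_nibble << 4) | b_nibble, value = a_nibble * b_nibble.
# The largest entry is 15 * 15 = 225, so every product fits in one byte.
NIBBLE_PRODUCTS = bytes(a * b for a in range(16) for b in range(16))


def byte_multiply(x: int, y: int) -> int:
    """Multiply two 8-bit values using four 4x4-bit table lookups."""
    xh, xl = x >> 4, x & 0xF
    yh, yl = y >> 4, y & 0xF
    ll = NIBBLE_PRODUCTS[(xl << 4) | yl]  # low  * low
    lh = NIBBLE_PRODUCTS[(xl << 4) | yh]  # low  * high
    hl = NIBBLE_PRODUCTS[(xh << 4) | yl]  # high * low
    hh = NIBBLE_PRODUCTS[(xh << 4) | yh]  # high * high
    # x*y = 256*hh + 16*(lh + hl) + ll
    return ll + ((lh + hl) << 4) + (hh << 8)


if __name__ == "__main__":
    print(byte_multiply(200, 123))
```

Unlike the log approach, this is exact for every pair of bytes, at the cost of four lookups plus the shifts and adds that combine them.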
  • Leon Posts: 7,620
    edited 2009-09-01 19:34
    An 18 pin 40 MIPS PIC24 could be used as a maths co-processor. It'll do a 17x17 bit multiply in one clock, and has fast divide hardware. They are cheap.

    Leon

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Amateur radio callsign: G1HSM
    Suzuki SV1000S motorcycle
  • Phil Pilgrim (PhiPi) Posts: 23,514
    edited 2009-09-01 19:45
    The problem with a coprocessor, of course, is the time required to get the arguments in and the results out, especially with a limited number of pins. My guess is that, compared with software multiply/divide, it's a push.

    -Phil
  • Leon Posts: 7,620
    edited 2009-09-01 20:19
    The bigger 28 pin version has a full 16-bit port available with DMA. The main problem is how many pins can be made available on the Propeller. Some of the larger chips have an 8-bit bidirectional port intended for very fast inter-chip data transfers.

    Leon

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Amateur radio callsign: G1HSM
    Suzuki SV1000S motorcycle
  • webmasterpdx Posts: 39
    edited 2009-09-01 21:40
    The whole idea though is to be able to do all this in the one chip.
    I know from my past that if you have a multiply you can use this to calculate divide quickly, so just a hardware multiply would have been nice. Maybe in the next rev :)
    I'd like to see a byte multiply in under 5 instructions using a combination of lookups and algorithm, or a 16-bit multiply in under 12 instructions.
    I need to look more at the log table to see if it could be used more efficiently....
    -D
  • webmasterpdx Posts: 39
    edited 2009-09-01 22:03
    OK, here's an idea. I don't know the Propeller well enough yet to tell how to do this, but I suspect the exponent example on page 323 is not optimized at all.

    To get log2 of a 32-bit integer, you first get the location of the MS bit, and that gives you a value from 0-31 which is the integer part of the log2 function (fits in 5 bits). This is placed in bits 20-16 of the result (high 16 bits). Then you take the next 11 bits below the MS bit and use them as an index into the log table. The 16-bit value returned is placed in bits 15-0 (low 16 bits) of the result. And you are done! That's the log2(x).
    The example they give seems to be way too long for that.
    An assembler instruction that tells you the MS set bit, or something like that, would give you that integer part. The fractional part (to the right of the binary point) can be obtained simply by taking the 11 bits below the MS bit and indexing into the log table.

    I suspect that with this, and a quick way to do the calculations above, this way of doing multiply will give you accurate values for byte calculations, and possibly up to 11 or 12 bits with 100% accuracy.
    The trick is doing this calculation very quickly.

    I'd be interested in seeing:
    1. How many bits the calculation can handle without any loss.
    2. The minimum number of cycles it can be done in.

    -D
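
The log2 procedure described in the post above (MS bit position as the integer part, the next 11 bits as a table index) can be sketched in Python. The table here is only an assumed stand-in built with `math.log2`; it is not a copy of the Propeller's ROM table, and the names `LOG_TABLE` and `log2_from_table` are hypothetical.

```python
import math

# Assumed stand-in for a 2048-entry log table:
# entry i holds the 16-bit fraction of log2(1 + i/2048).
LOG_TABLE = [round(math.log2(1 + i / 2048) * 65536) for i in range(2048)]


def log2_from_table(x: int) -> int:
    """Return log2(x) with the integer part in bits 20-16 and a 16-bit fraction in bits 15-0."""
    if not 0 < x < (1 << 32):
        raise ValueError("x must be a non-zero 32-bit unsigned integer")
    msb = x.bit_length() - 1                  # position of the MS set bit, 0..31
    aligned = (x << (31 - msb)) & 0xFFFFFFFF  # move the MS bit up to bit 31
    index = (aligned >> 20) & 0x7FF           # the 11 bits just below the MS bit
    return (msb << 16) | LOG_TABLE[index]


if __name__ == "__main__":
    for value in (1, 3, 8, 1000):
        print(value, log2_from_table(value) / 65536)
```

Bits below the 11 used for the index are simply dropped in this sketch, so inputs with more than 12 significant bits lose precision.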
  • Kye Posts: 2,200
    edited 2009-09-01 22:43
    Ask cnesspilot (author of the Kenyan integer multiplication stuff). He's the one who was interested in very fast multiply code.

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Nyamekye,
  • Phil Pilgrim (PhiPi) Posts: 23,514
    edited 2009-09-01 22:44
    The Prop I does not have an encode instruction (although, as with multiply, there's a place reserved in the instruction set for one). The Prop II will have hardware multiply, but I'm not sure about the encode instruction.

    -Phil
  • webmasterpdx Posts: 39
    edited 2009-09-02 03:49
    I'm not sure who to ask, but a bit reverse operation and digit reverse might be very useful for fast addressing in FFT calculations (just something I remember from my old supercomputer days).
    A digit reverse is where each pair of bits is taken as a digit and the address is reversed with that in mind (for radix 4 FFTs).

    -D
  • Mike Green Posts: 23,101
    edited 2009-09-02 03:54
    The Prop does have a bit reverse instruction that reverses a specified number of low order bits in the destination operand.
  • webmasterpdx Posts: 39
    edited 2009-09-02 05:09
    cool. I couldn't find that in the manual.

    Thx
    -D
  • Mike Green Posts: 23,101
    edited 2009-09-02 05:13
    The instruction is REV and the equivalent Spin operator is "><".
  • webmasterpdx Posts: 39
    edited 2009-09-02 05:16
    Page 163 under Bitwise Reverse.
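
To make the difference between bit reversal and the radix-4 digit reversal described above concrete, here is a small Python sketch. It models only the general idea of reversing a given number of low-order bits or 2-bit digits; it does not reproduce the REV instruction's operand encoding, and the names `bit_reverse` and `digit_reverse4` are hypothetical.

```python
def bit_reverse(value: int, nbits: int) -> int:
    """Reverse the low `nbits` bits of `value`; higher bits are discarded."""
    result = 0
    for _ in range(nbits):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


def digit_reverse4(value: int, ndigits: int) -> int:
    """Reverse the low `ndigits` base-4 digits (bit pairs) of `value`."""
    result = 0
    for _ in range(ndigits):
        result = (result << 2) | (value & 0b11)  # move one bit pair across intact
        value >>= 2
    return result


if __name__ == "__main__":
    print(bin(bit_reverse(0b0011, 4)))       # 0b1100
    print(bin(digit_reverse4(0b000111, 3)))  # 0b110100
```

In the digit-reversed case each bit pair keeps its internal order, so `0b000111` (digits 00, 01, 11) becomes `0b110100` (digits 11, 01, 00), which differs from the plain bit reversal `0b111000`.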
  • Phil Pilgrim (PhiPi) Posts: 23,514
    edited 2009-09-02 05:31
    Donald,

    The best chart of assembly instructions is in Chip's Propeller Guts (pdf) paper. The chart in the manual doesn't say what the instructions do; the one in Chip's paper does. (Not sure why they left that out of the one in the manual. I refer to Chip's version constantly.)

    -Phil