Propeller Performance - Page 2 — Parallax Forums

Propeller Performance


Comments

  • Paul Baker Posts: 6,351
    edited 2006-03-21 14:08
    No, there is no direct ratio; it's like trying to compare CISC with RISC. In Spin, some commands take a relatively short time compared to other, more complicated operations (such as * and /).

    Chip has said that an average ratio is about 500 to 1 as far as speed of asm to spin.

    To give you an idea, 19.2k baud RS232 is not possible in Spin. Similar asynchronous communication in assembler can easily run in the MHz range.

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    ·1+1=10

    Post Edited (Paul Baker) : 3/21/2006 2:12:45 PM GMT
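
To put the baud-rate remark in numbers, here is a rough sketch of how many operations fit into one bit period. It assumes the approximate figures quoted later in this thread (about 20 MIPS for cog assembly and about 80K operations per second for Spin); the function names are made up for illustration, and real Spin statements vary widely in cost.

```python
# Rough per-bit budget for a bit-banged serial link.
# Assumed figures (taken from later posts in this thread, not measured here):
ASM_OPS_PER_SEC = 20_000_000   # ~20 MIPS for cog assembly
SPIN_OPS_PER_SEC = 80_000      # ~80K operations/s for interpreted Spin


def bit_time_us(baud):
    """Duration of one bit in microseconds at the given baud rate."""
    return 1_000_000 / baud


def ops_per_bit(ops_per_sec, baud):
    """How many operations fit into a single bit period."""
    return ops_per_sec / baud


for baud in (2_400, 19_200):
    print(
        f"{baud} baud: bit time {bit_time_us(baud):.1f} us, "
        f"Spin ops/bit {ops_per_bit(SPIN_OPS_PER_SEC, baud):.1f}, "
        f"asm ops/bit {ops_per_bit(ASM_OPS_PER_SEC, baud):.0f}"
    )
```

Under these assumptions Spin gets only about four operations per bit at 19.2k baud, which is too few to sample, shift and loop, while assembly has over a thousand.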
  • Martin Hebel Posts: 1,239
    edited 2006-03-21 14:17
    An important point to remember is that when using assembler, it resides in the cog itself, so unless the code needs to access the common 32K RAM, the hub cycle time doesn't even come into play.

    -Martin


  • Kaos Kidd Posts: 614
    edited 2006-03-21 15:22
    Ummm, just 2 bits worth of performance:
    Measuring performance for a new chip, one that has no equal in the market, is a truly hard task. Such things as the tidbit from Paul about 19.2K RS232 not being possible in Spin would be a measuring point, but not the telltale of the story. I know my C64 can't do 19.2k RS232 either, but it can do video and access megs of RAM fast, things this "gem" wasn't designed for.

    For me, and the bulk of the junior hobbyists, the performance issue is viewed as "What's the fastest ...", "Can this do ...", in which case one could say, "In Spin, 14.4 is the fastest RS232, but if you do it in assembler, you can easily reach the MHz range", and this is what is going to "sell" and move the chip: knowing its limits at a higher level. The average hobbyist isn't going to care about hub cycles and xfer rates unless he has a project demanding that level of performance, at which time the actual, low-level specs come into play.

    I know when I started looking at the Stamp, I was impressed, and when I purchased my BS2p40, I believed I made a good decision. (Still do, as a matter of fact.) For me, like many before and those after me, the speed (there are faster Stamps) wasn't the issue. Now, in all fairness, if I had known the Propeller was due to be released, I would have waited (it's got more of the things I need / want).

    For the newbies, wannabes, and the junior hobbyists (I'm in that mix someplace..), the performance level is going to be measured by the "Can I do.." questions. If you look through the Stamp forum, how many times has the question been posted, "Can the Stamp do..." or "Which Stamp do I need to do..."? This is the type of performance spec that needs to be determined, and changed as the ones who know more than I (and that would be most everyone) start thinking "out of the box" with this chip and come up with better ways of doing things.



    Ok, thanks for reading...

    :)

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Just tossing my two bits worth into the bit bucket


    KK
  • WaldoDTD Posts: 142
    edited 2006-03-21 23:34
    Actually I am working on a semi self aware vision system. I am almost done writing the first research paper on the project.
  • rockin_rick Posts: 32
    edited 2006-03-22 00:13
    Gadgetman said...
    16 clock cycles is enough for 4 assembly instructions, and that is enough for a compact loop. If the HUB suddenly serviced the COG at 12 or 10 cycles, the loop would not match that timing, would 'miss' its slots and be forced to wait for the next time the HUB services that COG. (And suddenly, instead of doing a copy every 16 cycles, it does one every 20 or 24 cycles. Not good... )

    I'm not certain this is the case...

    In the "Early propeller chip doc v0.1" it states that you can only fit in (2) 4-clock instructions without an increase in execution time. (if I understand correctly) While the hub comes around every 16, you have to deduct the time for the hub instruction to finish executing (while the hub is on to the next cog, and the next, ...)

    "The HUB runs at half the COGs' clock frequency, serving each of the eight COGs, in turn, with each subsequent clock. From the COGs' perspectives, this is once every 16 clocks. Because the HUB runs steadily, dedicating a clock cycle to each COG, and because the COGs run independently, taking various numbers of clocks for different instructions, each COG must re-sync to the HUB whenever it executes a HUB instruction. This results in execution times ranging from 7 to 22 clocks for each hub instruction. Once a HUB instruction has executed, there will be 9 free clocks before another HUB instruction could execute and take the minimal 7 clocks. Nine clocks is enough time for two 4-clock instructions to execute before another HUB instruction would take 8 clocks. So, to minimize clock waste, you can insert two 4-clock instructions between any two otherwise-contiguous HUB instructions without any increase in execution time. Beware that HUB instructions can inject jitter into your execution schedule – particularly the first one in a sequence. For deterministic timing, you might want to place them outside of time-critical code sequences in which the HUB-sync is unknown."


    Did I misunderstand?
    Rick
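
The timing described in the quoted passage can be checked with a small model. This is only a sketch under the figures given in the quote (a hub window every 16 clocks, a 7-clock minimum for a hub instruction, 4-clock ordinary instructions); the function names are hypothetical and the model ignores anything the quote does not mention.

```python
# Simplified model of hub-instruction timing, using only the figures
# from the quoted "Early propeller chip doc v0.1" passage.
HUB_PERIOD = 16        # clocks between one cog's hub windows
HUB_MIN_CLOCKS = 7     # cost of a hub instruction that starts perfectly in sync
COG_INSTR_CLOCKS = 4   # cost of an ordinary cog instruction


def hub_cost(phase):
    """Clocks taken by a hub instruction that starts `phase` clocks
    after the ideal (zero-wait) start point."""
    wait = (-phase) % HUB_PERIOD   # clocks until the next ideal start point
    return HUB_MIN_CLOCKS + wait


def loop_period(n_instr):
    """Steady-state clocks per iteration of a loop containing one hub
    instruction followed by `n_instr` ordinary 4-clock instructions."""
    # A hub instruction always finishes HUB_MIN_CLOCKS after an ideal start,
    # so the next one begins at this phase:
    phase = (HUB_MIN_CLOCKS + n_instr * COG_INSTR_CLOCKS) % HUB_PERIOD
    return n_instr * COG_INSTR_CLOCKS + hub_cost(phase)


for n in range(5):
    print(n, "extra instructions ->", loop_period(n), "clocks per iteration")
```

In this model zero, one or two extra instructions all give a 16-clock loop, while a third pushes the loop to 32 clocks, which agrees with the quoted passage's figure of two instructions rather than four.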
  • rockin_rick Posts: 32
    edited 2006-03-22 00:20
    Paul Baker said...
    Chip has said that an average ratio is about 500 to 1 as far as speed of asm to spin.

    He said 250 to 1 in this thread's second post. -

    http://forums.parallax.com/forums/default.aspx?f=25&m=114494


    Maybe I missed a later correction by him?

    (not trying to be a know-it-all, just want the facts straight...)

    Rick
  • CJ Posts: 470
    edited 2006-03-22 02:18
    looks like 250 to me, 20MIPS to 80K-IPS

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Who says you have to have knowledge to use it?

    I've killed a fly with my bare mind.
  • Paul Baker Posts: 6,351
    edited 2006-03-22 02:57
    You're right, I misrecollected.

    Only two assembly instructions can fit between hub accesses, because the hub access takes 5 clock cycles itself, leaving 3 cycles of waiting.

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    ·1+1=10
  • Phil Pilgrim (PhiPi) Posts: 23,514
    edited 2006-03-22 05:13
    For really simple stuff the ratio isn't quite so pronounced. I measured about 77:1 for

    repeat
      outa := 1
      outa := 0

    (11.5 µSec / cycle), versus

    loop    mov  outa,#1
            mov  outa,#0
            jmp  #loop

    (150 nSec / cycle).

    -Phil
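
As a sanity check on the figures in this thread, the arithmetic can be sketched as follows. The 80 MHz system clock is an assumption (it is not stated in the posts, but it is consistent with 20 MIPS at 4 clocks per instruction); the helper name is made up for illustration.

```python
# Cross-checking the speed ratios quoted in the thread.
CLOCK_HZ = 80_000_000   # assumed system clock; implied by 20 MIPS at 4 clocks/instruction


def asm_loop_ns(n_instr, clocks_per_instr=4, clock_hz=CLOCK_HZ):
    """Duration in nanoseconds of a loop of n ordinary cog instructions."""
    return n_instr * clocks_per_instr * 1e9 / clock_hz


spin_loop_ns = 11_500                       # 11.5 microseconds, as measured above
asm_ns = asm_loop_ns(3)                     # mov, mov, jmp
simple_ratio = spin_loop_ns / asm_ns        # the measured toggle loop
general_ratio = 20_000_000 / 80_000         # 20 MIPS versus 80K operations/s

print(f"asm loop: {asm_ns:.0f} ns")
print(f"toggle-loop ratio: about {simple_ratio:.0f}:1")
print(f"general ratio: {general_ratio:.0f}:1")
```

Three 4-clock instructions at the assumed 80 MHz come to 150 ns, matching the measured value, and 11.5 µs against 150 ns gives roughly 77:1, compared with 250:1 from the 20 MIPS and 80K-IPS figures.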
  • Paul Baker Posts: 6,351
    edited 2006-03-22 06:02
    One of the things on my to-do list is getting a hex dump of the interpreter and seeing if I can disassemble it to figure out exactly how long each operation takes and look for tricks Chip used to accomplish things. Man, that list is growing longer by the day.

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    ·1+1=10