Hub instructions — Parallax Forums

Hub instructions

AleAle Posts: 2,363
edited 2007-07-18 21:48 in Propeller 1
I have several questions regarding Hub instructions and their inner workings :)

From the Propeller P8X32A Preliminary Datasheet Rev 0.2 4/2/2007 (page 6)

1. When does the synchronization between the Hub and a Hub instruction occur?

Suppose that an instruction has 4 stages and thus takes (at minimum) 4 cycles: Fetch, Decode, Execute, Writeback.

In my opinion the synchronization should occur at the Execute stage. Or is it signaled at the Fetch stage (and how does it wait then)?

1.5. Where does the instruction get stalled when the HUB is not synchronized?

2. When does the access to a HUB resource, be it memory or special registers, occur during these 3 extra cycles?

3. Now looking at the graph at the bottom of the page: what happens if cog 0 is executing an in-sync HUB instruction (finishing at 3.5 sysclocks), cog 2 wants to execute a hub instruction (HI) at 2, and cog 4 also wants one at 4? Until when is cog 4 going to wait (2 full turns? That is a bit more than a grand total of 22 cycles!)? And what happens if we stage the cog accesses from cog 0 to cog 7? (That is not clear from the manual and/or the datasheet...)

Am I missing something ?

Thanks

Comments

  • ericballericball Posts: 774
    edited 2007-07-17 19:53
    Yes, you've missed something. COG to HUB access is round-robin. So COG0 gets a HUB timeslice, then COG1, then COG2 and so on. If a COG isn't executing a HUB instruction during its timeslice then nothing happens. When a COG executes a HUB instruction, it waits until its next HUB timeslice.
  • Mike GreenMike Green Posts: 23,101
    edited 2007-07-17 20:00
    If a cog misses its timeslice, it simply waits for the next one ... a maximum of 15 clock cycles.
  • Paul BakerPaul Baker Posts: 6,351
    edited 2007-07-17 20:06
    Ale, so you know, the instruction pipeline is SDIR (source, destination, instruction, result), where the instruction fetch is for the next instruction to be executed. The current instruction was therefore fetched before the result of the previous instruction was written, which is the reason self-modifying code must have an intervening instruction. Eric is correct: hub access is round-robin and is clocked at 1/2 cog speed, so each cog gains access every 16 cycles. The number of cycles it takes a cog to perform a hub access varies depending on when the instruction is executed and when its access time occurs. The number is between 7 and 22 clock cycles, and this concept is explained in the manual starting on page 24.
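
    A minimal Python sketch of the round-robin schedule described above, for anyone modelling it (for example in a simulator). It assumes, purely for illustration, that cog 0's window opens at clock 0 and that cog n's window opens 2*n clocks later; the names NUM_COGS, CLOCKS_PER_SLOT and next_window are made up here and do not come from the manual or datasheet.

```python
NUM_COGS = 8
CLOCKS_PER_SLOT = 2                      # hub runs at half the cog clock (per the post above)
ROTATION = NUM_COGS * CLOCKS_PER_SLOT    # 16 clocks for one full turn of the hub


def next_window(cog, clock):
    """Return the first clock >= `clock` at which `cog`'s hub window opens.

    Assumption (hypothetical): cog 0's window opens at clock 0 and
    cog n's window opens CLOCKS_PER_SLOT * n clocks after it.
    """
    offset = cog * CLOCKS_PER_SLOT
    return clock + (offset - clock) % ROTATION


if __name__ == "__main__":
    # Cogs 0, 2 and 4 all asking at clock 0 each get their own slot.
    for cog in (0, 2, 4):
        print(cog, next_window(cog, 0))
    # The wait depends only on the cog's own slot, never on other cogs.
    print(max(next_window(c, t) - t for c in range(NUM_COGS) for t in range(64)))
```

    In this model a cog that just missed its window waits 15 clocks at most, whatever the other cogs are doing.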

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Paul Baker
    Propeller Applications Engineer

    Parallax, Inc.
  • AleAle Posts: 2,363
    edited 2007-07-17 21:08
    @ericball:
    @Mike Green: Maybe I missed what is not written? Still, q1 and q1.5 are not answered :-(

    @Paul: Noted the fetch before writeback. I think I read it somewhere, but it remained in the back of my mind.
    Your explanation still does not answer question 2.

    So, it seems that every cog has an opportunity every hub clock cycle for a hub instruction. Every hub access performed by a cog takes no more than 1 hub clock cycle. So in 8 hub clock cycles the 8 cogs can each start (if synchronized) a HUB instruction. Where, again, is that written in the manual? (Don't forget I'm writing my own Propeller simulator, and these ambiguities do not help. And no, Gear does not cut it for me: too many bugs, and it only runs on winblows.)

    Well, questions 1 and 1.5 are still not answered. Unless Paul you are implying something with regard to the pipeline that is still cloudy.

    Thanks.
  • Graham StablerGraham Stabler Posts: 2,507
    edited 2007-07-17 22:24
    For the record I don't even understand questions 1 and 1.5
  • AribaAriba Posts: 2,685
    edited 2007-07-17 22:48
  • Paul BakerPaul Baker Posts: 6,351
    edited 2007-07-17 23:50
    Ale, question 2 is irrelevant. It's round robin, so the cog gets hub access when it gets hub access. Any data written to hub memory by cog 0 will be available to cog 1 right afterwards, and cog 1 cannot perform any further processing of data obtained from the hub operation until after it is completed. So what precisely happens during those 3 cycles has no material consequence on the operation of the chip or how one would program it.

    You are making this much more complicated than it needs to be. Hub access occurs 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 ad infinitum. If a cog has a hub operation scheduled, it occurs; if it doesn't, the hub is idle for that period. The only synchronization that occurs is between any one cog and its specified hub window, which is a fixed thing that happens exactly every 16 clock cycles with no deviation whatsoever. A cog can completely ignore whether or not any other cog is accessing the hub; as far as it's concerned, it is a single processor that gets to access a specific portion of memory every 16 clock cycles. It needs 1 cycle to alert the hub that it needs access and 2 cycles for the access itself; that's 3 cycles. The hub instruction must be fully executed before it can alert the hub, which takes 4 cycles; that's 7 if it happened to complete its execution precisely in the time window in which it needs to notify the hub. If it just missed, then it has to wait an additional 15 clock cycles before its next alert window opens up; that's 22 cycles. So depending on where it happens to execute, a hub instruction takes between 7 and 22 cycles. Thereafter, if you have regular accesses, you can execute 2 assembly instructions between each hub access and not miss a single hub access time slot.
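
    The arithmetic in the paragraph above can be written out as a small Python sketch. The constants simply restate the figures quoted there (4 clocks to execute, 1 to alert the hub, 2 for the access, plus 0 to 15 clocks of waiting); the names are hypothetical and this is a model of the description, not of the silicon.

```python
EXECUTE_CLOCKS = 4    # instruction must be fully executed first (per the post)
ALERT_CLOCKS = 1      # one clock to alert the hub
ACCESS_CLOCKS = 2     # two clocks for the access itself
MAX_WAIT = 15         # worst case: the window was just missed


def hub_instruction_clocks(wait):
    """Total clocks for a hub instruction that waits `wait` clocks for its window."""
    if not 0 <= wait <= MAX_WAIT:
        raise ValueError("wait must be between 0 and 15 clocks")
    return EXECUTE_CLOCKS + ALERT_CLOCKS + ACCESS_CLOCKS + wait


if __name__ == "__main__":
    print(hub_instruction_clocks(0))          # best case
    print(hub_instruction_clocks(MAX_WAIT))   # worst case
```

    With no wait this gives the 7-clock best case, and with the full 15-clock wait it gives the 22-clock worst case.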

  • AleAle Posts: 2,363
    edited 2007-07-18 05:33
    Thanks Ar(r)iba, that answers q1 and 1.5. (Are you sure it is not Arriba? You know, in Spanish an r sounds different from an rr, and Speedy used to say arriba arriba.)

    Thanks Paul, that was a complicated yes for my "So, it seems". Sorry, but from the text in the manual, it was not clear (to me).
  • David BDavid B Posts: 591
    edited 2007-07-18 20:54
    I've been working on some Propeller assembly code with tight timing requirements, and had similar questions about the timing of hub operations. So I wrote some test code.

    The tests showed that the actual, total elapsed clock time used by hub operations was not 7 to 15 or 22 clocks, but instead was exactly one of 8, 12, 16 or 20 clocks. (Do those odd numbers refer to some internal command synchronization states, and not to the full command execution?)

    I tested simple commands like "NOP" and measured an elapsed time of 4 clocks, and tested conditional jumps like "TJZ" and measured 4 or 8 clocks, just like the documentation states. But the hub operations repeatedly showed numbers other than the 7-15 or 22 clocks that the documentation reports.

    My test code is included. The tests are simple enough, yet seem to report timing numbers that, as far as I know, have never been provided or discussed in any documentation or forum text. Am I missing something stupid? Have I made a mistake in my test code? Are my timing tests correct?

    If my tests are right, then it seems to me that it would be more useful for actual timing design if the assembly command documentation showed the full clock requirements of hub operations as (8, 12, 16, 20) and not the partial clock usages of 7-15, 22.

    David
  • Paul BakerPaul Baker Posts: 6,351
    edited 2007-07-18 21:06
    The reason you are seeing those discrete values is that assembly instructions take 4 clock cycles to execute, so you are seeing the 4-cycle quanta in the region of 7 to 22. But the alignment of the pipeline with the system clock can be arbitrary; in other words, cnt // 4 can equal 0, 1, 2 or 3 when the first stage executes. Using a waitcnt where the last two bits are set to a specific value can achieve "phase alignment". This granularity is what accounts for the full range of 7-22. But once a cog performs a hub instruction, it snaps back to the 4-cycle quanta that a "hub aligned" cog will occupy.
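
    One way to see both sets of numbers is the following Python sketch. It rests on two assumptions that are not spelled out in the documentation: that a hub instruction always finishes a whole number of 16-clock hub turns after the cog's previous hub instruction finished, and that it can never take fewer than the 7 clocks quoted in this thread. The function name and constants are hypothetical.

```python
ROTATION = 16     # clocks between a cog's hub windows
MIN_CLOCKS = 7    # shortest hub instruction, per the figures in this thread


def hub_clocks_from_offset(offset):
    """Clocks a hub instruction takes when it starts `offset` clocks after
    the cog's previous hub instruction finished (assumed model, see above)."""
    clocks = (-offset) % ROTATION    # distance to the next finish point
    if clocks < MIN_CLOCKS:          # too close: wait one more full turn
        clocks += ROTATION
    return clocks


if __name__ == "__main__":
    # Hub-aligned cog: only whole 4-clock instructions since the last hub op.
    print(sorted({hub_clocks_from_offset(4 * k) for k in range(4)}))
    # Arbitrary phase (for example after a waitcnt): every single-clock offset.
    print(sorted({hub_clocks_from_offset(o) for o in range(ROTATION)}))
```

    Offsets that are multiples of 4 yield only 8, 12, 16 and 20, while single-clock offsets cover every value from 7 to 22. Two 4-clock instructions between hub operations (offset 8) give the 8-clock case.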


  • David BDavid B Posts: 591
    edited 2007-07-18 21:39
    I think I get it. So a hub operation forces hub alignment, but any of the waitxxx ops can force a dealignment by individual clock counts, whereupon another hub op may experience those odd alignment counts as it realigns the cog to the hub...

    I added another section to that test program that seems to demo that effect, showing integer increments in clock counts to execute this code:

    rdlong  dummy, addr0    ' (align with hub)
    mov     count0, cnt     ' 0 reference count
    mov     dummy, cnt      ' 4
    add     dummy, #11      ' 4 10 -> $13 9->$12 8 -> $....4698
    waitcnt dummy, #20      ' ?
    mov     count1, cnt     ' 4

    That helps! Thanks, Paul.
  • Paul BakerPaul Baker Posts: 6,351
    edited 2007-07-18 21:48
    No problem, glad to help.
