cognew synchronous or asynchronous? — Parallax Forums

cognew synchronous or asynchronous?

Hi All,

I have not been able to find this information in the documentation, although I am sure it is there somewhere. When cognew copies the 512-word program into the new cog, who is doing that copying? Is it the cog that calls cognew (i.e. the code is being pushed from hub memory onto the target cog)? Or is it some microcode on the target cog that is pulling the code from hub memory?

I am asking because I want to know whether I can expect the target cog to be running as soon as cognew returns, or only some thousands of instructions later.


All the best,

Tom

Comments

  • ErNa Posts: 1,752
    edited 2016-02-23 16:17
    Before cognew I set a flag in hub memory. The ASM program running in the new cog resets the flag. So if the flag is reset, I know the cog is up and running.
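The flag handshake described above can be modelled on a PC as a rough sketch. This is not Propeller code: a Python thread stands in for the new cog, a `threading.Event` stands in for the hub flag, and the names `launch_and_wait`, `cog_program`, `load_delay_s` and `timeout_s` are all hypothetical.

```python
import threading
import time


def launch_and_wait(load_delay_s=0.01, timeout_s=1.0):
    """Model of the flag handshake.

    Returns (flag_at_return, started):
      flag_at_return - was the flag still set right after the launch call?
      started        - did the "cog" reset the flag before the timeout?
    """
    # Hypothetical stand-in for a long in hub memory: set means "not running yet".
    not_ready = threading.Event()
    not_ready.set()                      # set the flag BEFORE launching

    def cog_program():
        time.sleep(load_delay_s)         # stands in for the time the cog image load takes
        not_ready.clear()                # first action of the new cog: reset the flag

    worker = threading.Thread(target=cog_program)
    worker.start()                       # like cognew: returns without waiting for the load
    flag_at_return = not_ready.is_set()  # normally still set at this point

    # Poll the flag until it is reset or the timeout expires.
    deadline = time.monotonic() + timeout_s
    while not_ready.is_set() and time.monotonic() < deadline:
        time.sleep(0.001)
    started = not not_ready.is_set()

    worker.join()
    return flag_at_return, started
```

The point of the model is only the ordering: the flag is set first, the launch call returns while the flag is still set, and the caller treats a reset flag as proof that the started code is executing.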
  • It takes time to load the COG. Imagine the COG using its HUB access repeatedly to fill its memory before starting. I don't recall the number of cycles offhand, but it's pretty easy to watch the counter to determine it.

    The COG fills up, then program execution starts at address 0.
  • The hub logic does the copying, using the memory access slot normally used by the cog being started. The COGNEW/COGINIT instruction initiates the operation, which then proceeds independently. The "target cog" begins running shortly after 512 hub cycles (16 system clocks each).

    Several programs that use multiple synchronized cogs use the system clock for synchronization. The initiating program reads the clock, adds 16*512+"fudge factor" to it and leaves the result where the target cog(s) can get it. The target cog(s) all do a WAITCNT on this value. When the WAITCNT ends, these cogs are synchronized. The initiating program can also do a WAITCNT to synchronize with the target cog(s), or wait for a flag to be reset as ErNa mentioned. Look at the higher resolution VGA text drivers in the Object Exchange for examples of this technique.
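The arithmetic behind the 16*512+"fudge factor" value can be worked through in a short sketch. It is a host-side calculation in Python, not Propeller code. The function names are hypothetical, the 80 MHz clock is an assumed example frequency, and the size of the fudge factor is left to the caller because the post does not give one.

```python
HUB_CYCLE_CLOCKS = 16        # system clocks per hub cycle (the "16" in 16*512)
IMAGE_LONGS = 512            # longs copied into the cog at start-up
COUNTER_MASK = 0xFFFFFFFF    # the system counter is 32 bits wide and wraps around


def load_clocks():
    """System clocks needed for 512 hub cycles."""
    return HUB_CYCLE_CLOCKS * IMAGE_LONGS


def sync_target(cnt_now, fudge_clocks):
    """Counter value every cog should WAITCNT on, with 32-bit wraparound."""
    return (cnt_now + load_clocks() + fudge_clocks) & COUNTER_MASK


def clocks_to_microseconds(clocks, clock_hz=80_000_000):
    """Convert system clocks to microseconds; 80 MHz is an assumed example clock."""
    return clocks * 1_000_000 / clock_hz
```

With these numbers the load takes 8192 system clocks, which is 102.4 microseconds at the assumed 80 MHz. The mask matters because the counter may wrap between reading it and the moment the cogs wake up.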
  • Heater. Posts: 21,230
    In short, asynchronous.

    cognew/coginit returns immediately and the calling cog continues execution. As it does so the cog being started is loading its code from HUB and starts to execute it some time later.

    I thought the timing of all this was described in the manual some place. If not it should be. But it looks like others have outlined the process above.

  • Hi Heater,
    Heater. wrote: »
        In short, asynchronous.

        cognew/coginit returns immediately and the calling cog continues execution. As it does so the cog being started is loading its code from HUB and starts to execute it some time later.

        I thought the timing of all this was described in the manual some place. If not it should be. But it looks like others have outlined the process above.

    This is indeed what I was wondering. I knew it would take 512, give or take, hub memory reads to get the program into the cog but it was indeed unclear who was doing that work. Either the source or target cog seemed equally likely.

    After re-reading Mike's post 3 times I now understand that it is neither. It is actually more like DMA with an external agent doing the job and then signaling the target.

    One related question: is each cog image always exactly 512 words (padded out as needed), or is there a length stored somewhere that tells how many words long the image actually is? From Mike's description it sounds like the image is always 512 words irrespective of how much code is actually in there.

    Is this the case?

    Thanks,

    Tom
  • Heater. Posts: 21,230
    Yep, cogs always load 512 longs of HUB. Even if your PASM code + data is only a few longs. They have no way to know you are using less.

    Actually, isn't it 496, given the special locations in the COG address space? I forget. Makes no odds really.

    By the way, when I said "cognew/coginit returns immediately" that is true but remember that these are "hub ops" which use more clocks than normal cog instructions. See manual for details.




  • Ok, thanks a lot for all your help. This makes it all clear enough.

    Best regards,

    Tom
  • Cluso99 Posts: 18,069
    It is actually 496 longs (512-16 for the special registers). The special registers are all cleared to 0.
    The Verilog code for the P1V is available here on the forum (somewhere).
  • I could be wrong, but I thought that 512 longs get transferred with the last 16 longs transferred to the "shadow RAM" memory locations while the special registers that overlay "shadow RAM" are cleared to 0.
  • Cluso99 Posts: 18,069
    I just took a quick look at the P1V but couldn't find the appropriate section. I did find the part where the 16 registers are cleared.
    I am fairly certain that the shadow registers are cleared because I use them in my debugger.
    However, I think you may be correct (Mike) that 512 longs are read from hub because I seem to recall that the pc (program counter) wrap causes the coginit sequence to complete.
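The 512 versus 496 figures in the posts above come down to how the cog address space is split. A small sketch of that split, as Python arithmetic rather than Propeller code (the names `region` and `image_bytes` are hypothetical):

```python
COG_LONGS = 512           # size of the cog address space in longs
SPECIAL_REGISTERS = 16    # special registers at the top of the address space
BYTES_PER_LONG = 4

GENERAL_LONGS = COG_LONGS - SPECIAL_REGISTERS   # 496 longs for code and data
FIRST_SPECIAL = GENERAL_LONGS                   # first special-register address, 0x1F0


def region(cog_address):
    """Classify a cog address as 'general' RAM or a 'special' register."""
    if not 0 <= cog_address < COG_LONGS:
        raise ValueError("cog address out of range")
    return "special" if cog_address >= FIRST_SPECIAL else "general"


def image_bytes(longs=COG_LONGS):
    """Hub bytes covered by an image of the given length in longs."""
    return longs * BYTES_PER_LONG
```

A full 512-long image covers 2048 bytes of hub memory, the "2k" mentioned in the thread, while the 496 general-purpose longs cover 1984 bytes. Whether the last 16 longs are actually read from hub is the point the posters leave open, so the sketch does not take a side on it.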
  • Thanks for following up on this information regarding the image size. I am interested because part of what I was doing was using the cog image space for data after the cog was started. I start each cog only once so it seemed silly to have these 2k blocks of memory sitting around doing nothing. But of course to make that work, I need to know how much stuff I can write in there before I get into trouble.

  • You don't have to waste 2k. Just make sure the code on a cog doesn't depend on values beyond its high-water mark; then the extra stuff that gets loaded doesn't matter, and you can use that hub memory for other cogs and stuff (I think that's the default behaviour with the tools anyway).
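The high-water-mark idea can be put into numbers with a short sketch, again as Python arithmetic and not Propeller code. The function names are hypothetical, and `loaded_longs` is a parameter because the thread does not settle whether 496 or 512 longs are read from hub.

```python
BYTES_PER_LONG = 4
GENERAL_LONGS = 496   # cog longs available for code and data


def code_bytes(code_longs):
    """Hub bytes that the program's own code and data occupy."""
    if not 0 <= code_longs <= GENERAL_LONGS:
        raise ValueError("a cog program must fit in 496 longs")
    return code_longs * BYTES_PER_LONG


def trailing_longs(code_longs, loaded_longs=512):
    """Longs the loader reads past the program's high-water mark.

    These come from whatever happens to follow the program in hub memory,
    so the program must not depend on their values.
    """
    if not 0 <= code_longs <= GENERAL_LONGS:
        raise ValueError("a cog program must fit in 496 longs")
    return loaded_longs - code_longs
```

For a hypothetical 100-long program, only 400 bytes of hub memory belong to it. The other 412 longs copied into the cog are simply whatever follows in hub memory, so that space does not need to be reserved for this cog.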