Cog Launch Procedure and Timing
Ken Peterson
Posts: 806
I just want to make sure I fully understand how cogs are launched in the Prop. Here's my take:
For the sake of simplicity, let's leave SPIN out of it. Assume I'm launching a PASM routine into a cog from another PASM routine.
1. The PASM coginit instruction takes 7-22 clocks, like any hub instruction, and returns immediately.
2. The hub then loads the cog with 496 longs, each long transferred during that cog's hub window.
3. The cog starts executing at the first instruction in cog memory.
If this is true, then according to my calculations it should take 7936 (496*16) system clock cycles to load the cog, plus 7-22 to execute the coginit instruction, and at 80MHz this should take about 100 microseconds. Is this accurate?
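To make the arithmetic in that estimate easy to check, here is a minimal Python sketch. It only encodes the assumption stated in the post (one long transferred per 16-clock hub window, 496 longs in total); it is not a confirmed description of what the hardware does, and the function name and parameters are made up for illustration.

```python
HUB_WINDOW_CLOCKS = 16   # assumed: one hub access window per cog every 16 system clocks
COG_LONGS = 496          # longs loaded into cog RAM, locations $000-$1EF


def cog_load_estimate(clock_hz=80_000_000, coginit_clocks=22):
    """Return (clocks, microseconds) for the estimate in the post.

    Assumes one long is copied per hub window, plus the cost of the
    coginit instruction itself (7..22 clocks).
    """
    clocks = COG_LONGS * HUB_WINDOW_CLOCKS + coginit_clocks
    return clocks, clocks / clock_hz * 1e6


best = cog_load_estimate(coginit_clocks=7)    # coginit hits its hub window right away
worst = cog_load_estimate(coginit_clocks=22)  # coginit just missed its hub window
print("best case :", best)
print("worst case:", worst)
```

Under these assumptions the total is 7943 to 7958 clocks, which is where the "about 100 microseconds" at 80MHz comes from.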
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Post Edited (Ken Peterson) : 6/27/2008 4:28:16 PM GMT
Comments
Just hoping for someone to offer some guesses before this falls off page 1. I didn't find anything in the docs that clearly describes what happens and how long it takes.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Regards,
Ken
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Reference:
www.parallax.com/Portals/0/Downloads/docs/prod/prop/PropellerDatasheet-v1.0.pdf
Item 1:
Hub instructions, the Propeller Assembly instructions that access mutually exclusive resources, require 7 cycles to execute but they first need to be synchronized to the start of the Hub Access Window. It takes up to 15 cycles (16 minus 1, if we just missed it) to synchronize to the Hub Access Window plus 7 cycles to execute the hub instruction, so hub instructions take from 7 to 22 cycles to complete.
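Propeller Assembly Instruction Table (COGINIT row):
Encoding: 000011 0001 1111 ddddddddd, source field ending in 010
Instruction: COGINIT D
Description: Initialize a cog according to D
Z result: Result = 0
C result: No cog free
R result: 0
Clocks: 7..22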
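The 7..22 range follows directly from the datasheet text in Item 1: up to 15 clocks of waiting for the hub access window, plus 7 clocks to execute. A small Python sketch of that rule, with a hypothetical function name, enumerates every possible wait:

```python
HUB_EXECUTE_CLOCKS = 7   # clocks to execute a hub instruction once synchronized
MAX_SYNC_WAIT = 15       # worst-case wait for the hub access window (16 minus 1)


def hub_instruction_clocks(clocks_until_window):
    """Total clocks for a hub instruction given the wait for the hub window."""
    if not 0 <= clocks_until_window <= MAX_SYNC_WAIT:
        raise ValueError("wait must be between 0 and 15 clocks")
    return clocks_until_window + HUB_EXECUTE_CLOCKS


totals = [hub_instruction_clocks(wait) for wait in range(MAX_SYNC_WAIT + 1)]
print(min(totals), max(totals))  # prints "7 22"
```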
========================================================================================================================================
Item 2. - Page 15:
When a cog is booted up, locations 0 ($000) through 495($1EF) are loaded sequentially from Main RAM / ROM and its special purpose locations, 496 ($1F0) through 511($1FF), are cleared to zero.
Each Special Purpose register may be accessed via its physical address, its predefined name, or indirectly in Spin via a register array variable SPR with an index of 0 to 15, the last four bits of the register's address.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Aka: CosmicBob
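mov val1,CNT ' in the launching cog: snapshot the system counter
wrlong val1,ptr_1st_long ' store it in hub RAM
and, in the launched cog:
mov val1,CNT ' first instructions of the new cog: snapshot the counter again
wrlong val1,ptr_2nd_long ' store it in a second hub long
The difference between the two values, read back later, should tell you how long it takes. But you knew that. So how much?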
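Once the two CNT snapshots have been read back from hub RAM, the subtraction has to allow for the 32-bit counter rolling over between them. A minimal Python sketch of that post-processing follows; the function names are hypothetical, and the counter values are invented to show the wraparound, not measurements from real hardware.

```python
def elapsed_clocks(cnt_start, cnt_end):
    """Clocks elapsed between two 32-bit CNT snapshots, allowing one rollover."""
    return (cnt_end - cnt_start) & 0xFFFFFFFF


def clocks_to_us(clocks, clock_hz=80_000_000):
    """Convert system clocks to microseconds at the given clock frequency."""
    return clocks / clock_hz * 1e6


# Invented example values: the counter wraps past $FFFFFFFF between the two reads.
first = 0xFFFFF000
second = 0x00000F40
delta = elapsed_clocks(first, second)
print(delta, "clocks =", clocks_to_us(delta), "us at 80MHz")
```

Both snapshots include the cost of the wrlong that follows them, so the difference is only an approximation of the launch time.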
@Bob: I saw the part you posted in the manual, but it doesn't explicitly describe the mechanism whereby the 496 longs get copied. I assume it happens during normal hub access windows to avoid screwing up the timing for the rest of the cogs, but that's only a guess on my part.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
From Cogs (Processors) : the cog begins executing instructions, starting at location 0 of Cog RAM.
"When a cog is booted up, locations 0 ($000) through 495 ($1EF) are loaded sequentially from Main RAM / ROM and its special purpose locations, 496 ($1F0) through 511 ($1FF) are cleared to zero. After loading, the cog begins executing instructions, starting at location 0 of Cog RAM. It will continue to execute code until it is stopped or rebooted by either itself or another cog, or a reset occurs."
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Aka: CosmicBob
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
I suppose where I'm going with this is that I'm wondering if it makes sense to separate a large assembly program into chunks and page it via subroutine calls rather than using LMM. When calling a subroutine, you would save all of your important data on a "stack" in hub memory, then call the subroutine by launching it in the same cog. I'm just not sure about an easy mechanism for getting the main program back into the cog and picking up at the execution point right after where the call was made. Perhaps with a jump table at the beginning...
Anyways, instead of having a 5X to 8X performance hit like with LMM, you would have a subroutine call overhead of about 8200 clocks, with another hit of 8200 clocks upon return.
200us doesn't seem like a bad price to pay for many applications.
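To put rough numbers on that trade-off, here is a small Python sketch. It takes the figures from this post at face value (about 8200 clocks per reload, one reload on call and one on return, LMM running 5X to 8X slower than native code) and ignores the cost of saving and restoring state in hub memory. The break-even formula is my own simplification, not something measured.

```python
RELOAD_CLOCKS = 8200   # assumed cost of reloading the cog once (figure from the post)


def paging_overhead_us(clock_hz=80_000_000):
    """Call plus return overhead in microseconds: two full cog reloads."""
    return 2 * RELOAD_CLOCKS / clock_hz * 1e6


def break_even_native_clocks(lmm_slowdown):
    """Native clocks of subroutine work at which paging and LMM cost the same.

    LMM cost:    lmm_slowdown * n
    Paging cost: n + 2 * RELOAD_CLOCKS
    Setting them equal gives n = 2 * RELOAD_CLOCKS / (lmm_slowdown - 1).
    """
    if lmm_slowdown <= 1:
        raise ValueError("LMM slowdown must be greater than 1")
    return 2 * RELOAD_CLOCKS / (lmm_slowdown - 1)


print("paging overhead:", paging_overhead_us(), "us")
for slowdown in (5, 8):
    print("LMM", slowdown, "X: break-even at",
          round(break_even_native_clocks(slowdown)), "native clocks")
```

On these assumptions, paging only wins for subroutines that do a few thousand clocks of work per call; shorter ones would still be cheaper under LMM.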
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Post Edited (Ken Peterson) : 6/30/2008 3:29:49 PM GMT
you only have to copy what you need and you have persistent variables and I/O pins ready to
go without having to worry about the stack and loading / saving them.
LMM doesn't have to be single instruction at a time, there's no reason paging cannot also be
used. In fact it's probably better for any LMM code which loops or needs to be high speed. Put
a "call #overlaythis" at the start of the LMM code which is paged and LMM code sections
can easily be changed from instruction at a time to paged operation.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔