SPIN Speed
Dan Moline
Posts: 3
Hi, everyone. Can someone help me understand the Propeller just a bit better?
My understanding is this:
If you have only a single SPIN program, that program resides in RAM and is interpreted by COG 0.
But COG 0 only has access to the HUB 1/8 of the time.
Does that mean that a Propeller with a single COG would run this program 8X faster than the actual Propeller, since it wouldn't need to cycle through 7 other COGs that have no programs associated with them?
In other words, as the Propeller spins, empty COGs tend to slow down the execution of programs loaded into the other COGs. Am I making sense?
I'm not dissing the Propeller. I think it's extraordinary. I'm just trying to test my understanding.
Thanks,
Dan
Comments
Spin programs are interpreted and reside in the main (hub) RAM. The interpreter is written so that some of its main RAM accesses are synchronized: once the first access has paid its variable delay, the following ones land on their hub slots without further waiting. In addition, the interpreter executes other useful instructions between the main RAM accesses, so no extra time is spent waiting for the hub slot to become available.
My understanding is that the interpreter resides in the COG and the SPIN program resides in HUB RAM. Does that mean that program interpretation only occurs 1 out of every 16 clock cycles?
I did a very simple experiment. I compared a loop in SPIN with a loop in assembly. (Not knowing what else to do, I just put a dummy statement, "temp := 1", inside the loop for no real reason.) In the PASM version, "value" starts at 1,000,000.
SPIN
  repeat i from 0 to 1_000_000
    temp := 1
PASM
:loop   add     temp, #1
        djnz    value, #:loop
The difference in speed (timed with a watch) was about 120:1. Is that difference because the assembly code sits in the COG and runs 100% of the time, whereas the SPIN program is waiting for access to the HUB every 16 clock cycles?
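As a rough check on that 120:1 figure, here is the arithmetic written out. Python is used only as a calculator. The sketch assumes that each of the two PASM instructions in the loop takes 4 clocks, and it treats the stopwatch ratio as exact, which it certainly is not.

```python
# Back-of-envelope cost per loop pass, in system clocks.
CLOCKS_PER_PASM_INSTRUCTION = 4  # assumed: add and a taken djnz each take 4 clocks
PASM_INSTRUCTIONS_PER_PASS = 2   # add + djnz
MEASURED_RATIO = 120             # stopwatch estimate from this post, not a precise figure
HUB_PERIOD = 16                  # clocks between one COG's hub access slots

pasm_clocks_per_pass = CLOCKS_PER_PASM_INSTRUCTION * PASM_INSTRUCTIONS_PER_PASS
spin_clocks_per_pass = pasm_clocks_per_pass * MEASURED_RATIO
hub_slots_per_pass = spin_clocks_per_pass // HUB_PERIOD

print(pasm_clocks_per_pass, spin_clocks_per_pass, hub_slots_per_pass)  # 8 960 60
```

Under those assumptions one pass of the SPIN loop spans roughly 60 hub slots, so a single 16-clock wait per pass would not by itself account for the measured difference.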
Thanks for your patience. My background is RF ... not microcontrollers.
Regards,
Dan
This example copies data from one array of 32-bit values to another, adding a fixed value to each array element. It uses every hub access slot and takes 32 clock cycles per loop. The only access for which the program has to wait more than one clock cycle is the first read, the first time through the loop.
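The timing described here can be checked with a small model. This is a sketch in Python rather than Propeller code, and it rests on simplifying assumptions: a hub instruction costs at least 8 clocks and completes only on the cog's hub slot, the slot comes around every 16 clocks, and every other instruction costs 4 clocks. The loop shape (read, two ordinary instructions, write, two ordinary instructions) is an assumed layout for such a copy loop.

```python
HUB_PERIOD = 16     # clocks between one cog's hub slots
HUB_MIN_CLOCKS = 8  # assumed minimum cost of a hub instruction
COG_CLOCKS = 4      # assumed cost of an ordinary cog instruction


def simulate(program, passes, phase=0):
    """Return (extra wait of each hub access, clocks used by each pass).

    `phase` is where the cog's hub slot falls within the 16-clock rotation.
    """
    clock = 0
    waits, lengths = [], []
    for _ in range(passes):
        start = clock
        for kind in program:
            if kind == "hub":
                earliest = clock + HUB_MIN_CLOCKS
                # Stall until the next hub slot at or after `earliest`.
                wait = (phase - earliest) % HUB_PERIOD
                waits.append(wait)
                clock = earliest + wait
            else:
                clock += COG_CLOCKS
        lengths.append(clock - start)
    return waits, lengths


# Assumed loop shape: read, add, bump pointer, write, bump pointer, loop back.
copy_loop = ["hub", "cog", "cog", "hub", "cog", "cog"]
waits, lengths = simulate(copy_loop, passes=3)
print(waits)    # [8, 0, 0, 0, 0, 0]
print(lengths)  # [40, 32, 32]
```

In this model only the very first read stalls, and every later pass takes 32 clocks, because two 4-clock instructions exactly fill the gap between consecutive hub slots.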
Each Cog runs at full speed but has to wait for its cyclic access to the Hub memory (where Spin programs and data are stored). It gets that access at the same rate no matter what any other Cog is doing, so Spin execution is no slower when 8 Cogs are in use than when just one is.
The rotating Hub offers each Cog in turn access to Hub memory whether it wants it or not. The rotation doesn't care whether the access is taken; it offers it every time.
If there were a single-Cog version and it didn't use such a rotating Hub access scheme (as it wouldn't need one), access to Hub memory would always be available, so some code running in the Cog would be quicker, but only the code that accesses Hub memory. The Spin Interpreter makes a fair number of accesses to Hub memory, so it might be slightly faster, but not necessarily. A Cog isn't slowed down waiting for Hub access if it asks for access at the right time.
A single-Cog version running a Spin Interpreter would not run at 8 times the speed. It may be a bit faster, but probably not much.
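To put rough numbers on that, here is a sketch that compares the rotating Hub with a purely hypothetical single-Cog chip whose Hub is always available. It is written in Python as a timing model, not as Propeller code. The assumptions are a 16-clock Hub rotation, a hub instruction that costs at least 8 clocks, 4 clocks for any other instruction, and three made-up loop shapes.

```python
HUB_PERIOD = 16     # clocks between one Cog's hub slots
HUB_MIN_CLOCKS = 8  # assumed minimum cost of a hub instruction
COG_CLOCKS = 4      # assumed cost of an ordinary Cog instruction


def steady_clocks_per_pass(program, rotating_hub, passes=50):
    """Clocks used by one loop pass once the timing has settled."""
    clock = 0
    marks = []
    for _ in range(passes):
        for kind in program:
            if kind == "hub":
                clock += HUB_MIN_CLOCKS
                if rotating_hub:
                    # Stall until the Cog's next slot (slots fall on multiples of 16).
                    clock += (-clock) % HUB_PERIOD
            else:
                clock += COG_CLOCKS
        marks.append(clock)
    return marks[-1] - marks[-2]


loops = {
    "well timed": ["hub", "cog", "cog", "hub", "cog", "cog"],
    "one extra instruction": ["hub", "cog", "cog", "cog", "hub", "cog", "cog"],
    "hub access only": ["hub"],
}
results = {
    name: (steady_clocks_per_pass(p, True), steady_clocks_per_pass(p, False))
    for name, p in loops.items()
}
for name, (rotating, always_available) in results.items():
    print(name, rotating, always_available)
```

In this model the well-timed loop costs 32 clocks either way, the loop with one extra instruction costs 48 against 36, and even a loop made of nothing but hub accesses only goes from 16 clocks to 8. That is a factor of 2 at worst under these assumptions, nowhere near 8.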