COGINIT not working if code not reloaded? - Fixed
bob_g4bby
I wanted to measure the time difference between
1. starting a cog with cog ram load
2. restarting the cog, no cog ram load
The code below does (1) but (2) fails. What have I done wrong, please, I'm going round in circles?
DAT     ORG
        getct   startcycle
        getct   cycles
        sub     cycles, startcycle      ' cycles ends up = 2
        cogid   cog_id
        debug("Measurement overhead in ", udec_long(cycles))
loadandstart
        getct   startcycle
        setq    ptra_val
        coginit #COGEXEC+1, #@cog1start
        waitatn
        getct   cycles
        sub     cycles, startcycle
        debug("Time to load and start cog 1 in ", udec_long(cycles))
juststart
        getct   startcycle
        setq    ptra_val
        coginit #%1_0_0001, #@cog1start ' E=1 don't reload cogram; N=0 target cog is V; V=1 cog1 is target
        waitatn
        getct   cycles
        sub     cycles, startcycle
        debug("Time to just restart cog 1 in ", udec_long(cycles))
        cogstop #0
cog1start
        cogatn  #1
        cogstop #1
startcycle      res     1
cycles          res     1
cog_id          res     1
ptra_val        res     1

Comments
I didn't try to solve exactly which part was at fault, I just did a general cleanup.
Output when compiled with Flexspin:
CON
        _xtlfreq = 20_000_000
        _clkfreq = 100_000_000

DAT     ORG
        asmclk
        getct   startcycle
        getct   cycles
        sub     cycles, startcycle      ' cycles ends up = 2
        cogid   cog_id
        decod   ptra_val, cog_id
        debug("Measurement overhead in ", udec_long(cycles))
loadandstart
        getct   startcycle
        setq    ptra_val
        coginit #COGEXEC_NEW, #@cogruntest
        waitatn
        getct   cycles
        sub     cycles, startcycle
        debug("Time to load and start cog 1 in ", udec_long(cycles))
juststart
        getct   startcycle
        setq    ptra_val
        coginit #HUBEXEC_NEW, #@cogruntest
        waitatn
        getct   cycles
        sub     cycles, startcycle
        debug("Time to just restart cog 1 in ", udec_long(cycles))
        cogstop #0
cogruntest
        cogatn  ptra
        cogid   pa
        cogstop pa
startcycle      res     1
cycles          res     1
cog_id          res     1
ptra_val        res     1

Problem with the measuring is Debug adds a massive overhead to each cog's startup time.
@bob_g4bby, please check the code from the thread "Questions about PASM2 and register allocation". It measures the startup time of a cog without the debugger. It should be easy to modify it to measure what you want.
Bob,
Remember to turn off debugging to eliminate its overhead.
I think the problem is that (2) starts the cog in HUBEXEC mode, but the start address is below $400, so it actually runs in cog/LUT exec mode. Since the previous start loaded uninitialized hub content, it executes random instructions.
The fix should be
        coginit #%1_0_0001, #0          ' E=1 don't reload cogram; N=0 target cog is V; V=1 cog1 is target
The start address 0 restarts the Cog with whatever the ram holds at the time.
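As a rough sketch of how the first COGINIT operand is put together, assuming the built-in launch-mode symbols have the values listed in the comments (check your compiler's documentation) and using a hypothetical label worker:

```spin2
' Assumed values of the built-in symbols (bit 5 = E, bit 4 = N, bits 3..0 = V):
'   COGEXEC     = %00_0000   load cog RAM from hub address S, then run from cog address 0
'   COGEXEC_NEW = %01_0000   same, but on a free cog
'   HUBEXEC     = %10_0000   no load, start executing at address S
'   HUBEXEC_NEW = %11_0000   same, but on a free cog

        coginit #COGEXEC + 1, #@worker  ' %00_0001: load cog 1 from hub, run from cog address 0
        coginit #HUBEXEC + 1, #0        ' %10_0001: no load; S below $400 is a cog/LUT address
```

Written this way, the second form is the same operand as #%1_0_0001 in the fix above.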
This is the debug output with the modification:
The difference is exactly 512, the clock cycles needed to load the full Cog ram.
I have added a counter to the Cog1 code to see whether the HUBEXEC start reuses the cog RAM, and it looks correct:
This is the modified code:
DAT     org
        getct   startcycle
        getct   cycles
        sub     cycles, startcycle      ' cycles ends up = 2
        cogid   cog_id
        debug("Measurement overhead in ", udec_long(cycles))
loadandstart
        getct   startcycle
        setq    ptra_val
        coginit #COGEXEC + 1, #@cog1start
        waitatn
        getct   cycles
        sub     cycles, startcycle
        debug("Time to load and start cog 1 in ", udec_long(cycles))
juststart
        getct   startcycle
        setq    ptra_val
        coginit #HUBEXEC + 1, #0        ' E=1 don't reload cogram; N=0 target cog is V; V=1 cog1 is target
        waitatn
        getct   cycles
        sub     cycles, startcycle
        debug("Time to just restart cog 1 in ", udec_long(cycles))
        cogstop #0
cog1start
        debug(udec(counter))
        add     counter, #1
        cogatn  #1
        cogstop #1
counter         long    0
startcycle      res     1
cycles          res     1
cog_id          res     1
ptra_val        res     1

Hope this is correct.
Thanks everyone, instant explanations, how good is that?! I now appreciate the addressing mistake I made in the 2nd COGINIT. The code starting at cog1start is loaded into cog RAM at 0, so the next COGINIT needs to restart at cog RAM address 0.
I was expecting 'load and start' to take a bit of time, but would expect 'just start' to be quite quick, which it isn't in my debug dependent code. So like @Kaio says, you can't make a true measurement with debug active - I'll look at the code you've recommended.
Huh, yeah, mine wasn't working correctly either.
Updated output with corrected code placement:
Hubexec is 490 ticks quicker.
CON
        _xtlfreq = 20_000_000
        _clkfreq = 100_000_000

DAT     ORG     0
        asmclk
        getct   startcycle
        getct   cycles
        sub     cycles, startcycle      ' cycles ends up = 2
        cogid   cog_id
        decod   ptra_val, cog_id
        debug("Measurement overhead in ", udec_long(cycles))
loadandstart
        getct   startcycle
        setq    ptra_val
        coginit #COGEXEC_NEW, ##@cogruntest
        waitatn
        getct   cycles
        sub     cycles, startcycle
        debug("Time to load and start cog 1 in ", udec_long(cycles))
juststart
        getct   startcycle
        setq    ptra_val
        coginit #HUBEXEC_NEW, ##@cogruntest
        waitatn
        getct   cycles
        sub     cycles, startcycle
        debug("Time to just restart cog 1 in ", udec_long(cycles))
        cogstop #0
startcycle      res     1
cycles          res     1
cog_id          res     1
ptra_val        res     1

DAT     ORGH    $400
        ORG     0
cogruntest
        cogatn  ptra
        cogid   pa
        cogstop pa

Still floundering around with WAITATN / COGATN - can you spot the mistake in the SPIN started version please?
This all assembly code version works with or without debug - the LED on p56 blinks on briefly. With debug, the INIT statements make sense:-
DAT     ORG
        coginit #COGEXEC+1, #@startcog1
        cogstop #0
startcog1
        drvl    #56
        getct   startcycle
        getct   cycles
        sub     cycles, startcycle      ' cycles ends up = 2
        debug("Measurement overhead in ", udec_long(cycles))
loadandstart
        getct   startcycle
        setq    ptra_val
        coginit #COGEXEC+2, #@cog2start ' remember - cog2start is the address to load from, but it will load to address 0
        waitatn
        getct   cycles
        sub     cycles, startcycle
        debug("Time to load and start cog 2 in ", udec_long(cycles))
juststart
        getct   startcycle
        setq    ptra_val
        coginit #%1_0_0010, #0          ' E=1 don't reload cogram; N=0 target cog is V; V=2 cog2 is target - and we start from address 0
        waitatn
        getct   cycles
        sub     cycles, startcycle
        debug("Time to just restart cog 2 in ", udec_long(cycles))
        waitx   ##10000000
        drvh    #56
        cogstop #1
cog2start
        cogatn  #2
        cogstop #2
startcycle      res     1
cycles          res     1
ptra_val        res     1

Debug displays:-
Cog0 INIT $0000_0000 $0000_0000 load
Cog1 INIT $0000_0008 $0000_0000 load
Cog1 Measurement overhead in cycles = 2
Cog2 INIT $0000_0064 $0000_0000 load
Cog1 Time to load and start cog 2 in cycles = 7_083
Cog2 INIT $0000_0000 $0000_0000 jump
Cog1 Time to just restart cog 2 in cycles = 6_555
The following code starts with cog0 running SPIN and starting cog1 in assembly code. Cog1 should start cog2 twice, but instead gets stuck at the first waitatn. The LED on p56 comes on and stays on. Running with debug, the INIT statements don't make sense - cog0 is reported as starting a cog twice, which Main() clearly doesn't:-
'' File .......... Load and run cog versus re-run cog.spin2
'' Version........ 1
'' Purpose........ Measure the time to load and run a cog and also rerunning the same cog
'' Author......... Bob Edwards
'' Email.......... bob.edwards50@yahoo.com
'' Started........ / /
'' Latest update.. / /
''=============================================================
'' Cog0 runs a small spin program to start cog1. Cog1 loads and starts cog2, then just restarts cog2
'' The time taken to do these two things is recorded by cog1 and cog0 displays the results in cycles

CON {timing constants}
  _xinfreq = 20_000_000
  _clkfreq = 200_000_000

CON {fixed IO pins}

CON {application IO pins}

CON {Enumerated constants}
  #0, cycles1, cycles2, PARAM_COUNT     ' names of global variables

CON {Data Structure definitions}

OBJ {Child Objects}

VAR {Variable Declarations}
  long PARAM[PARAM_COUNT]               ' this array of global vars is accessible to SPIN and PASM

{Public / Private methods}
pub main()
  coginit(1, @loadandstart, @PARAM)     ' set up cog 1 with the assembly code

DAT {Symbols and Data}

DAT {Data Pointers}

DAT {Cog-exec assembly language}
        ORG
loadandstart
        drvl    #56
        setq    ptra
        coginit #COGEXEC+2, #@cog2start ' remember - cog2start is the address to load from, but it will load to address 0 in cog ram
        waitatn                         ' wait until cog2 running
juststart
        setq    ptra
        coginit #%1_0_0010, #0          ' E=1 don't reload cogram; N=0 target cog is V; V=2 cog2 is target - and it starts from address 0
        waitatn                         ' wait until cog2 running
        waitx   ##10000000
        drvh    #56
        cogstop #1
cog2start
        cogatn  #2                      ' tell cog1 that cog2 is running
        cogstop #2                      ' stop cog2

Debug displays:-
Cog0 INIT $0000_0000 $0000_0000 load
Cog0 INIT $0000_0F64 $0000_1898 jump
Cog1 INIT $0000_184C $0000_1890 load
Cog2 INIT $0000_0034 $0000_1890 load
Cheers, Bob
The problem is the address used to start the cog: it doesn't point to the correct location in Spin mode.
One hint is that the parameter doesn't throw a compile error. It should, because the real location is certainly above 511 and won't fit in a 9-bit immediate; in that case you must use the augmented (AUGS) syntax with ##.
Another hint is the debug line itself:
Cog2 INIT $0000_0034 $0000_1890 load
It tells you that the code is loaded from location $34, which is part of the Spin2 interpreter.
In Spin mode, the address operator in PASM blocks returns the object-relative address. (It is actually the same in PASM-only mode, but there the interpreter and child objects aren't prepended, so the two addresses are equivalent.) You need to use the loc instruction to calculate the actual hub address:
        loc     pa, #@cog2start - @loadandstart
        add     pa, ptrb
        setq    ptra
        coginit #COGEXEC + 2, pa
I find loc a bit of a mystery; I don't fully understand it.
If you are using Spin Tools or flexspin you can use the special @@@ absolute hub address operator, more intuitive (but not supported by PNut):
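A minimal sketch of that form, reusing the cog2start label from the code above and assuming a compiler that accepts @@@ (Spin Tools or flexspin, not PNut):

```spin2
        setq    ptra
        coginit #COGEXEC + 2, ##@@@cog2start    ' @@@ gives the absolute hub address; ## because it won't fit in 9 bits
```

This replaces the loc / add pa, ptrb pair, since the absolute address is resolved at compile time.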
I strongly suggest using a tool that generates a human-readable listing so you can see where the code is placed in memory.
Spin Tools can generate one: it tells you the hub address, the object-relative address and the cog-relative address of each instruction.
Many thanks, @macca , that's clear now. I mostly use Spin Tools with PNUT as backup. So I could use ##@@@cog2start (good grief!) and the listing feature, which I hadn't looked at closely enough.
If you look around all the official documentation, all this addressing stuff is spread very thin / isn't mentioned, so a helping hand from a master-coder is very welcome. I can carry on with my Christmas spare-time experiments again.
Happy Christmas, Bob
If measuring coginit timing with debug enabled doesn't work, because debug takes quite a bit of time reporting every coginit event, what to do?
Instead of displaying the timings using the debug window, I wrote a SPIN program to do the display, based on jonnymac's full duplex serial code from the obex - see the attached file. I use Spin Tools to compile it, because I need the built-in terminal window to display the results.
Coginit set to load the cog ram and then run took 556 cycles
Coginit set to just re-run the same code took 30 cycles
So that's good. If I write a library of dsp functions, they can be loaded into cog or lut ram the once and thereafter the code is not reloaded every time a function is required, it takes around 30 cycles just to restart the cogs and call the function.
I just noticed that, by using a CON symbol DEBUG_COGS, debug functionality can be selectively enabled on a per cog basis. Which means it should be easy to both use debug for reporting the coginit() launch times while at the same time not incurring debug overhead on the measuring cog.
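A hedged sketch of what that could look like, assuming DEBUG_COGS is a bitmask in which bit n enables debug on cog n, and assuming the cog numbering from the test earlier in the thread (cog1 measures and reports, cog2 is the one being launched):

```spin2
CON
  DEBUG_COGS = %0000_0010       ' assumption: bit n enables debug for cog n; here only cog 1
```

With a mask like this, cog1 could still report through debug() while cog2 should start without the debug overhead. This is untested here, so check the measured cycle counts against a non-debug build.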
Thanks for this hint, @evanh. I didn't know this symbol and I'm wondering when it was introduced. A search of the forum does not show any hits besides this thread.
It's always been there, just not highlighted. Looking back through stashed PDFs, I can even see that it changed name: it had been called DEBUG_ENABLE in v34 back in 2020.