CogInit dark magic — Parallax Forums

CogInit dark magic

Spoon-feeding myself P2 knowledge every day, I have now stumbled across the COGINIT instruction.

Reading the documentation, COGINIT can find an unused cog, copy $1F8 longs from HUB RAM to the COG RAM and then jump to COG address $0 of the newly initialized cog. No problems understanding that.

BUT:
In the Google Sheets documentation of the P2 instructions, COGINIT seems to achieve all this in 2..9 clock cycles. What kind of dark magic is built into the P2 to copy $1F8 longs this fast?
Or is the instruction table simply wrong here?

Comments

  • cgracey Posts: 14,151

    It takes 2..9 cycles to execute the instruction. The instruction initiates the self-loading procedure in the target cog.

  • Ahh, so there is a self-loading + jump to $0 built into the target cog.

    So the calling cog uses 2..9 clocks and can then proceed. But it then takes some time for the new cog to come to life, so the calling cog has to wait some time before it can communicate with or signal the new cog.

    How long does the self loading mechanism take?

  • evanh Posts: 15,912
    edited 2021-04-10 07:50

    For hubexec, very little, but most examples are cogexec. For cogexec, it'll be something like $1f8 longwords (clocks) plus a few extra, so somewhere around 512 clocks.

    It'll be quite easy to measure by using GETCT in both cogs, and comparing.

  • evanh Posts: 15,912
    edited 2021-04-10 08:41

    529 to 536 ticks, not including the COGINIT.
    531 to 542 ticks, including the COGINIT (revised from 536 to 543).

            waitx   #1
            getct   pb
            coginit #2, launchptr
    '       coginit #$22, launchptr
    '       getct   pb
    
            waitatn
            rdlong  pa, #0
            sub     pa, pb
            call    #itod
            call    #putnl
    
            jmp     #$
    
    
    launchptr   long    launch
    
    
    orgh  $400
            long    0[7]
    launch
            getct   pa
            wrlong  pa, #0
            cogatn  #1      'cog0
    
            cogid   pa
            cogstop pa
    
  • evanh Posts: 15,912
    edited 2021-04-10 08:53

    28 to 35 ticks for hubexec, not including the COGINIT.
    30 to 44 ticks for hubexec, including the COGINIT (revised from 31 to 39).

            waitx   #7
            getct   pb
    '       coginit #2, launchptr
            coginit #$22, launchptr
    '       getct   pb
    
            waitatn
            rdlong  pa, #0
            sub     pa, pb
            call    #itod
            call    #putnl
    
            jmp     #$
    
    
    launchptr   long    launch
    
    
    orgh  $400
            long    0[8]
    launch
            getct   pa
            wrlong  pa, #0
            cogatn  #1      'cog0
    
            cogid   pa
            cogstop pa
    
  • evanh Posts: 15,912
    edited 2021-04-10 08:59

    Hmm, I've still missed 3 ticks from the 542. It should be 545. In hindsight, the simple way is to measure the post-COGINIT times, then just add the execute range of COGINIT to them.

  • Surac Posts: 176
    edited 2021-04-10 09:46

    Using this program:

    con _clkfreq = 10_000_000
    
    dat org
        asmclk
        dirh #0
        coginit #16,##$400
    
        rep #1,#0
        drvnot #0   
    
    dat
        orgh $400
        org
    
        rep #1,#0       
        drvnot #2
    

    I measure 534 clocks with my logic analyzer.
    This kind of knowledge seems to belong in the documentation, I guess. I will add some later.

  • Cluso99 Posts: 18,069

    And the way to sync, i.e. to know the new cog is running, is to clear/set a hub location.
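
    A minimal sketch of such a handshake, assuming a free hubRAM long at $1000 (a hypothetical address, named `FLAG_ADDR` here only for illustration) and cog 1 as the target. It has not been run on hardware:

    ```
    con
            FLAG_ADDR = $1000               ' hypothetical: any otherwise unused hubRAM long

    dat
            org
    start   wrlong  #0, ##FLAG_ADDR         ' clear the flag before launching
            coginit #1, ##@child            ' start cog 1 (cogexec) from child's hubRAM address
    .wait   rdlong  pa, ##FLAG_ADDR  wz     ' Z stays set while the flag still reads zero
      if_z  jmp     #.wait                  ' keep polling until the new cog has written it
            jmp     #$                      ' from here on the new cog is known to be running

            org
    child   wrlong  #1, ##FLAG_ADDR         ' first thing the new cog does: set the flag
            jmp     #$
    ```

    The measurement code earlier in the thread gets the same effect with COGATN in the new cog and WAITATN in the launcher.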

  • Surac Posts: 176
    edited 2021-04-10 10:02

    What baffles me with COGINIT is that the self-loading of the started cog seems to be able to copy its data OUT of the COG and HUB RAM of the cog that initiates the COGINIT instruction.
    It is clearly mentioned in the documentation that the target cog copies its data from the hub.

    con _clkfreq = 10_000_000
    
    dat org
        asmclk
        coginit #$1,##@cog2 'do not use label to make copy+run from 0 sure
    
        rep #1,#0
        drvnot #0   
    
    dat org
    
    cog2    rep #1,#0       'blink pin 57 takes 53,4 us
        drvnot #2
    

    This works perfectly well, even though the cog2 label is clearly NOT in the HUB RAM but in the COG RAM of cog 0.

    • So does cog1 copy out of cog0's COG RAM here? How is it working? Is there a hidden third port on the COG RAM?
    • How much does it copy?
    • Does it also copy bytes out of the HUB RAM of cog0? Or does it stop at $1f8?
    • And btw, what does ##@ do?

  • evanh Posts: 15,912
    edited 2021-04-10 10:08

    @Surac said:
    using this program
    ...
    I measure 534 clocks with my logic analyzer.

    In that case there is a range of 8 ticks, depending on the hubRAM fetch alignment of the launched cog. So if you, say, change the ORGH from $400 to $404, then the data transfer is delayed by a tick, making it 535. That increases up to 536, then wraps back to 529 ticks with ORGH $40c. The alignment phase differs from cog to cog, for both launcher and launchee.

  • evanh Posts: 15,912

    @Surac said:
    this works perfectly well, even as the cog2 lable is clearly NOT in the HUB ram but in the COG ram of cog 0.

    Ah, another illusion. Even cog0 is copied from hubRAM. So all program code is sitting in hubRAM and is repeatedly copied to each cog as directed by the COGINITs.

  • evanh Posts: 15,912
    edited 2021-04-10 10:12

    @Surac said:

    • and btw. what does ##@ do?

    The ## means it's a 32-bit immediate data value embedded in the program code, i.e. a hidden AUGx instruction is prefixed.
    The @ means a program label is forced to hubRAM byte-scaled addressing.
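
    A small sketch of the difference, assuming a hypothetical label `child` for the code to launch (the cog numbers are arbitrary):

    ```
    dat
            org
            coginit #1, #@child     ' plain # is a 9-bit immediate: only reaches hubRAM $000..$1FF
            coginit #2, ##@child    ' ## adds an AUGS prefix long, so any hubRAM address fits
            jmp     #$

            org
    child   jmp     #$              ' placeholder program for the launched cogs
    ```

    In a program this small either form reaches `child`; the ## form simply costs one extra long.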

  • evanh Posts: 15,912
    edited 2021-04-10 10:23

    @evanh said:
    ... So all program code is sitting in hubRAM and is repeatedly copied to each cog as directed by the COGINITs.

    A detail: Once a cogexec cog is launched, the cog runs with the content of cogRAM. It no longer needs the code in hubRAM unless there is a need to launch further copies later on. So that space in hubRAM can be reused after launch.

  • No.

    Have a look here:

    00000 000             | dat org
    00000 000 04 80 80 FF 
    00004 001 00 30 67 FD |     hubset  ##clkmode_ & !%11
    00008 002 86 01 80 FF 
    0000c 003 1F 80 66 FD |     waitx   ##20_000_000/100
    00010 004 04 80 80 FF 
    00014 005 00 36 67 FD |     hubset  ##clkmode_
    00018 006             | 
    00018 006 24 02 EC FC |     coginit #$1,#$24    'do not use label to make copy+run from 0 sure
    0001c 007             |     
    0001c 007 00 02 DC FC |     rep #1,#0
    00020 008 5F 00 64 FD |     drvnot #0   
    00024 009             | 
    00024 009             | 
    00024 009             | cog2    rep #1,#0       'blink pin 57 takes 53,4 us
    00024 009 00 02 DC FC 
    00028 00a 5F 04 64 FD |     drvnot #2
    

    The coginit instruction clearly points to byte address $24, which is inside the cog RAM but counted as bytes. How on earth would coginit know where to look for this "shadow copy of cog0" you mentioned?

  • evanh Posts: 15,912
    edited 2021-04-10 10:36

    @evanh said:
    The @ means a program label is forced to hubRAM byte scaled addressing.

    This is useful for something like a COGINIT code reference because sometimes the referenced code is not specified as living in hubRAM.

    EDIT: Especially if the referenced code is meant to be cogexec. The @ then provides the hubRAM address of the code rather than the assembled ORGin.

  • We are not talking about the cogexec spin instruction but the coginit assembler instruction

  • evanh Posts: 15,912
    edited 2021-04-10 10:57

    E.g. this doesn't work:

    dat
    org
            coginit #1, #cogexec_reference
            jmp #$
    
    orgh
    hubexec_reference
    org
    cogexec_reference
            dirh    #0
    .loop
            outnot  #0
            jmp #.loop
    

    This works:

    dat
    org
            coginit #1, #hubexec_reference
            jmp #$
    
    orgh
    hubexec_reference
    org
    cogexec_reference
            dirh    #0
    .loop
            outnot  #0
            jmp #.loop
    

    And so does this:

    dat
    org
            coginit #1, #@cogexec_reference
            jmp #$
    
    orgh
    hubexec_reference
    org
    cogexec_reference
            dirh    #0
    .loop
            outnot  #0
            jmp #.loop
    

    EDIT: Doh! Forgot the jmp #$ ... and the dat ... and the first case doesn't work. Pays to test these things :)

  • evanh Posts: 15,912

    @Surac said:
    We are not talking about the cogexec spin instruction but the coginit assembler instruction

    Correct, I don't write Spin at all. Never got the hang of it.

  • evanh Posts: 15,912

    @Surac said:
    We are not talking about the cogexec spin instruction but the coginit assembler instruction

    Oh, yeah, the @ has a similar but different meaning in Spin. I haven't tried to work it out.

  • evanh Posts: 15,912
    edited 2021-04-10 11:19

    @Surac said:

    ...
    00024 009             | cog2  rep #1,#0       'blink pin 57 takes 53,4 us
    00024 009 00 02 DC FC 
    00028 00a 5F 04 64 FD |   drvnot #2
    

    the coginit instruction clearly points to byte address $24, that is inside the cog ram but counted as bytes, how on earth would coginit know where to look for this "shadow copy of cog0" you mentioned?

    See the addresses on the very left. That's the hubRAM, byte scaled, addresses where the loader will place the program in hubRAM before launching cog0. The second column is the cogRAM addresses for ORG'd code.

    COGINIT only accepts a hubRAM address. So that $24 is hubRAM address $24. So the program code is copied from a hubRAM range of $24 to $24+$1f8*4-1 or $803.
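
    Spelled out as constants (the names are purely illustrative and not part of the program above):

    ```
    con
            SRC_START = $24                         ' hubRAM address handed to COGINIT
            SRC_LONGS = $1f8                        ' number of longs copied into cogRAM
            SRC_END   = SRC_START + SRC_LONGS*4 - 1 ' = $24 + $7E0 - 1 = $803, last byte copied
    ```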

  • evanh Posts: 15,912

    PS: hubRAM below $400 exists as regular addressable RAM. It's only accessible as data, code execution doesn't happen there, but it's very much still there.

  • Rayman Posts: 14,634

    So I guess there is some hidden code somewhere that does the dark magic of copying the data from hubram to the new cog's RAM for cogexec mode.

    If this hidden code could be treated as hubexec, seems to me that the new cog could load and jump to cog address #0 itself and save the cog that did the coginit several hundred clocks...

    Could one write their own version of coginit that does this? I.e., start the new cog in hubexec mode, pointed to some new code that loads the new cog's local RAM and then jumps to address #0 to switch to cogexec mode?

  • evanh Posts: 15,912

    @Rayman said:
    If this hidden code could be treated as hubexec, seems to me that the new cog could load and jump to cog address #0 itself and save the cog that did the coginit several hundred clocks...

    That's how it works already. The COGINIT itself only takes a few clocks. We're measuring how long the newly launched cog takes.

  • evanh Posts: 15,912
    edited 2021-04-10 11:48

    The second-to-last page of the Prop2 hardware doc has the Verilog hard-wired code for a newly coginit'd cog. The comments say:

    mov outa,#0    (clear port shadow registers)
    mov outb,#0
    mov ina,#$1f8  (point ina/ijmp0 to cog's initial int0 handler)
    setq #$1f7     (if !hubs, load $1f8 longs from ptrb)
    rdlong 0,ptrb
    jmp dirb/ptrb  (if !hubs, jump to $000 (dirb=0), else ptrb)
    

    The last three instructions have a dynamic encoding that changes depending on "hubs", i.e. whether the launch is hubexec or not.
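
    For comparison, an untested sketch of the same load-and-jump done by hand: the new cog is started in hubexec mode on a small stub that copies the image into its own cogRAM and then jumps to $000. The labels `stub` and `image` are hypothetical, and this gains nothing over a plain cogexec COGINIT, which already does the same thing in hardware.

    ```
    dat
            org
            coginit #$21, ##@stub       ' cog 1 with bit 5 set: hubexec, runs stub straight from hubRAM
            jmp     #$

            orgh    $400                ' hubexec code has to sit at $400 or above
    stub    mov     ptrb, ##@image      ' hubRAM address of the code to load
            setq    #$1f7               ' block transfer length: $1f8 longs
            rdlong  0, ptrb             ' hubRAM -> cog registers $000..$1f7
            jmp     #\0                 ' absolute jump to cogRAM $000, now running cogexec

            org
    image   dirh    #0
    .loop   outnot  #0
            jmp     #.loop
    ```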

  • Sorry @evanh, nothing you wrote here helps me to understand.

    I think I'll wait till @cgracey clears this up for us.

  • evanh Posts: 15,912

    Surac,
    Here's the .lst from the middle example just above. I've changed the final JMP to an absolute encoded branch so you can see it is jumping to address $1 in cogRAM.

    00000                 | dat
    00000 000             | org
    00000 000 08 02 EC FC |         coginit #1, #hubexec_reference
    00004 001 FC FF 9F FD |         jmp #$
    00008 002             | 
    00008                 | orgh
    00008                 | hubexec_reference
    00008 000             | org
    00008 000             | cogexec_reference
    00008 000 41 00 64 FD |         dirh    #0
    0000c 001             | .loop
    0000c 001 4F 00 64 FD |         outnot  #0
    00010 002 01 00 80 FD |         jmp #\.loop
    

    It shows the cog addresses in the second column. You can see those addresses reset back to $0 at the second ORG. With the ORG I'm informing the assembler of my intention to use that section of code as if it were located at address $0 (in cogRAM). And the COGINIT causes the newly launched cog to copy that code into cogRAM at destination address $0.

    But it is copied from hubRAM. And in hubRAM it is located at address $8. So the COGINIT needs to be told to pass the source address of $8 to the newly launched cog.

  • evanh Posts: 15,912

    When Pnut or Loadp2 or Proptool or any other loader loads a pure pasm2 program into the Prop2 it places it in hubRAM from address $0. From there the loader, which itself will be running in cog0, will typically perform a COGINIT #0,#0 to load cog0 with the first $1f8 longwords from hubRAM into cog0 and restart from $0.

    Spin programs load differently.

  • @evanh said:
    When Pnut or Loadp2 or Proptool or any other loader loads a pure pasm2 program into the Prop2 it places it in hubRAM from address $0. From there the loader, which itself will be running in cog0, will typically perform a COGINIT #0,#0 to load cog0 with the first $1f8 longwords from hubRAM into cog0 and restart from $0.

    Spin programs load differently.

    OK, if it works this way, this information will help.

  • evanh Posts: 15,912
    edited 2021-04-10 14:09

    Err, I was slightly inaccurate. That's how the initial program must load after a reset. It must be machine code, and it is always deposited at hubRAM address $0 and launched with COGINIT #0,#0. For a pure pasm2 program that may be all there is.

    An actual staged loader might provide more flexible options.

  • evanh Posts: 15,912

    And of course the full story has more to it. :) A cold boot, hard reset, actually loads 16 kB of code from mask ROM into hubRAM and then performs a COGINIT #0,#0. That ROM code then performs a number of actions including attempting to load code from EEPROM and SD card. It is sort of a collection of programs that has a simple command over serial mechanism, the Taqoz interpreter and a monitor/debugger program as well.

    But the end result is, when just plain booting, it also does a COGINIT #0,#0 or loads the next stage into cogRAM of cog0 and JMP #0. So still equivalent to above declaration.
