Yes, all zeros is a special "NOP" case that is trapped by the silicon rather than being decoded as a ROR instruction. If it weren't trapped, it would be a
_RET_ ROR 0,0
which would perform a _RET_ after the ROR 0,0, which itself would have no effect. Wouldn't want that.
But on the P1, anything with CCCC = 0 became a NOP, so it was often used to hold a smaller variable or to convert an instruction into a NOP.
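To see why an all-zero long would otherwise read as _RET_ ROR 0,0, here is a minimal host-side Python sketch that splits a 32-bit P2 instruction long into its fields. The function name and dictionary keys are made up for illustration, and the layout is assumed from the EEEE / 7-bit opcode / CZI / D / S encodings in the instruction summary further down this thread.

```python
def decode_fields(word):
    """Split a 32-bit instruction long into its bit fields."""
    word &= 0xFFFFFFFF
    return {
        "eeee": (word >> 28) & 0xF,     # condition field; 0000 is _RET_ on the P2
        "opcode": (word >> 21) & 0x7F,  # 0000000 is ROR
        "czi": (word >> 18) & 0x7,      # C, Z and I bits
        "d": (word >> 9) & 0x1FF,       # destination register
        "s": word & 0x1FF,              # source register or immediate
    }

# Every field of an all-zero long is zero: condition _RET_, opcode ROR, D = 0, S = 0.
print(decode_fields(0x00000000))
```

Every field comes out as zero, which is exactly the _RET_ ROR 0,0 combination that the silicon traps as NOP.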
It really comes down to COGINIT requiring both a load address of zero and execution starting at zero. ORG has to line up with this; if it doesn't, you make a mess.
That's why PASM variables in all Obex programs are at the end.
PS: You guys are messing up this topic with chatter!
I like to reuse the top area of the COG, where I normally place my startup code, as RES variables that are needed later, and to store the parameters I got from whoever started my COG.
On the P2 one can do this very efficiently.
DAT
cog     long    0

PUB start(param1, param2, param3, param4)
  stop
  cog := cogstart(@entry, @param1) + 1

PUB stop
  if cog
    cogstop(cog-1)

DAT
entry
        ORG     0
par1    rdlong  par1, ptra++    ' each load overwrites its own instruction
par2    rdlong  par2, ptra++
par3    rdlong  par3, ptra++
par4    rdlong  par4, ptra++
Now each parameter occupies the same cog register as the instruction that loaded it.
setq #2-1
rdlong a, PTRA
add PTRA, #8
' is significantly faster than
rdlong a, PTRA++
rdlong a+1, PTRA++
The issue is that you only get synchronous access to the hub with block reads/writes, so the latter version has to wait 7 cycles because it misses the boat on the second read. Block reads and writes run at one long per clock cycle, i.e. two per instruction time.
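A rough way to picture "missing the boat" is a toy model of the rotating hub window. The sketch below is a host-side Python model only. It assumes 8 hub slices, that consecutive longs live in consecutive slices, and that a cog's window advances by one slice per clock. The function name is hypothetical and the model ignores all pipeline overhead, so treat the numbers as illustrative rather than cycle-exact.

```python
SLICES = 8  # assumption: one hub slice per cog on an 8-cog P2

def wait_for_slice(slices_ahead, clocks_elapsed):
    """Clocks to wait until the wanted slice lines up with the cog's window."""
    return (slices_ahead - clocks_elapsed) % SLICES

# Block transfer: the next long is 1 slice ahead and is taken 1 clock later.
print(wait_for_slice(1, 1))
# Separate RDLONGs: the next long is still 1 slice ahead, but one instruction
# time (2 clocks) has passed, so the window has just gone by.
print(wait_for_slice(1, 2))
```

In this model the block transfer waits 0 clocks and the back-to-back RDLONG waits 7, which lines up with the figure quoted above.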
A trap I ran across: the "altr" setting for writing to the result register (and presumably similar settings like "altd" and "alts") is destroyed by the "aug" prefix. So for example:
altr 0, #A
add B, #1
puts B+1 into A, but:
altr 0, #A
add B, ##1
puts B+1 into B (!). I thought of breaking the "add B, ##1" apart into an explicit aug+add and putting the altr between them, but since the altr uses an immediate that probably will fail too.
There are work-arounds, but it's something to watch out for.
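One way to reason about this trap is that "##" makes the assembler emit an AUGS prefix ahead of the ADD, while ALTR only alters the instruction that immediately follows it. The Python sketch below is a toy model of that reading, not a description of the silicon. The tuple format and the `expand` and `run` helpers are invented for illustration, and the model assumes the ALTR redirection is simply consumed by whatever instruction comes next.

```python
def expand(program):
    """Expand '##' immediates into an AUGS prefix plus the instruction."""
    out = []
    for op, dst, src in program:
        if isinstance(src, str) and src.startswith("##"):
            out.append(("augs", None, None))
            src = "#" + src[2:]
        out.append((op, dst, src))
    return out

def run(program, regs):
    """Execute a tiny ALTR/AUGS/ADD subset on a dict of registers."""
    redirect = None                      # result register selected by ALTR
    for op, dst, src in expand(program):
        if op == "altr":
            redirect = src.lstrip("#")   # applies to the next instruction only
            continue
        target, redirect = (redirect or dst), None
        if op == "add":
            regs[target] = regs[dst] + int(src.lstrip("#"))
        # 'augs' writes no result, so a pending redirect is simply lost
    return regs

print(run([("altr", "0", "#A"), ("add", "B", "#1")], {"A": 0, "B": 5}))
print(run([("altr", "0", "#A"), ("add", "B", "##1")], {"A": 0, "B": 5}))
```

The first run leaves B+1 in A, the second leaves it in B, matching the two cases described above.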
A small defect with system clock setting got past the P2ES testing undetected. For the clock config word, %0000_xxxE_DDDD_DDMM_MMMM_MMMM_PPPP_CCSS, if a PLL mode is configured with %PPPP = %1111 (_XDIVP = 1), there is a possibility of crashing when attempting to readjust the clock without knowing the exact config it is currently in.
One solution, "mailbox hand-off", for reliable operation into the future is to maintain a universal reserved hubRAM location containing a copy of the most recently issued config. That location is currently set as byte addresses $18 to $1b. Little-endian.
Here is sample code for basic startup that includes the ability for a boot-loader to specify what config it has already set.
CON
XTALFREQ = 20_000_000 'PLL stage 0: crystal frequency
XDIV = 2 'PLL stage 1: crystal divider
XMUL = 25 'PLL stage 2: crystal / div * mul
XDIVP = 1 'PLL stage 3: crystal / div * mul / divp (1,2,4..30)
XOSC = %10 'OSC ' %00=OFF, %01=OSC, %10=15pF, %11=30pF
XSEL = %11 'XI+PLL ' %00=rcfast(20+MHz), %01=rcslow(~20KHz), %10=XI(5ms), %11=XI+PLL(10ms)
XPPPP = ((XDIVP>>1) + 15) & $F ' 1->15, 2->0, 4->1, 6->2...30->14
CLOCKFREQ = XTALFREQ / XDIV * XMUL / XDIVP
SETFREQ = 1<<24 + (XDIV-1)<<18 + (XMUL-1)<<8 + XPPPP<<4 + XOSC<<2
ENAFREQ = SETFREQ + XSEL ' %0000_000e_dddddd_mmmmmmmmmm_pppp_cc_ss ' enable oscillator
DAT 'not Spin code
ORGH 0 'loaded to hubram at address 0
ORG 'longword addressing at 0
jmp #_init
long 0,0,0
'--------------------------------------------------------
'*** Boot-loader can fill all four of the following ***
'--------------------------------------------------------
spare1 long 0 'hubRAM addr $010 - compatible reserved for system variable
clk_freq long CLOCKFREQ 'hubRAM addr $014 - sysclock frequency, integer frequency in hertz
clk_mode long 0 'hubRAM addr $018 - clock mode config word, used directly in HUBSET
asyn_baud long BAUDRATE 'hubRAM addr $01c - comport baud rate, integer baud in hertz
_init
andn clk_mode, #%11 'clear the two select bits to force RCFAST selection
hubset clk_mode 'switch to RCFAST using known prior mode
mov clk_mode, ##SETFREQ 'replace old with new
hubset clk_mode 'setup for new mode, still RCFAST
waitx ##25_000_000/100 '~10ms for crystal/PLL to settle
hubset ##ENAFREQ 'engage
...
...
A second solution, "RCFAST hand-off", is to adopt the convention of always reverting to RCFAST before launching the loaded program. It is largely compatible with the mailbox hand-off above, and the two can happily co-exist.
'RCFAST hand-off method in target (loaded) program
_init
hubset ##SETFREQ 'setup for new mode, still RCFAST
waitx ##25_000_000/100 '~10ms for crystal/PLL to settle
hubset ##ENAFREQ 'engage, select PLL as clock source
...
...
'RCFAST hand-off method in loader program
...
andn clk_mode, #3 'disable XI+PLL mode
hubset clk_mode 'switch to RCFAST mode
hubset #0 'turn off all crystal/PLL modes
waitx ##25_000_000/100 'wait 10 ms for crystal shutdown
coginit #0, address 'launch cog 0 from address
Loadp2 has adopted RCFAST hand-off as its default. Mailbox hand-off needs the -PATCH option; FlexGUI uses -PATCH.
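The constant arithmetic in the CON block of the mailbox sample can be checked on a PC before it goes anywhere near a chip. The Python sketch below mirrors the XPPPP, SETFREQ, ENAFREQ and CLOCKFREQ formulas. The function name and argument names are made up for illustration, and the shifts are parenthesised explicitly because Python's + binds tighter than <<.

```python
def clock_config(xtal_hz, xdiv, xmul, xdivp, xosc=0b10, xsel=0b11):
    """Return (clock_hz, setfreq, enafreq) using the formulas from the CON block."""
    pppp = ((xdivp >> 1) + 15) & 0xF   # 1->15, 2->0, 4->1, ... 30->14
    setfreq = ((1 << 24) + ((xdiv - 1) << 18) + ((xmul - 1) << 8)
               + (pppp << 4) + (xosc << 2))
    enafreq = setfreq + xsel           # same word with the clock select bits set
    clock_hz = xtal_hz // xdiv * xmul // xdivp
    return clock_hz, setfreq, enafreq

# Values from the sample: 20 MHz crystal, divide by 2, multiply by 25, XDIVP = 1.
hz, setfreq, enafreq = clock_config(20_000_000, 2, 25, 1)
print(hz, hex(setfreq), hex(enafreq))
```

With the sample's values this gives 250 MHz, and the %PPPP field comes out as %1111, which is the configuration the defect above applies to.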
If using COGATN for synchronisation it's useful to know that the sender cog completes the COGATN instruction two sysclocks earlier than the WAITATN in the receiving cogs.
I do not believe I am the only follower of this thread who had not seen (or perhaps forgot about) this description of all the P2 Instructions. click here
Chip said...
It's just for the next instruction (interrupts suspended to read the lower 32bits following reading the top 32bits, and no intervening instructions).
I think the top 32 bits are two clocks ahead of the lower 32 bits. That's how it winds up time-aligned.
Here is a summary of the P2 CALL/JMP instructions for V33 RevB silicon
Encoding notes:
#S = immediate (I=1). S = register.
#D = immediate (L=1). D = register.
* Z = (result == 0).
** If #S and cogex, PC += signed(S). If #S and hubex, PC += signed(S*4). If S, PC = register S.
EEEE 1011001 CZI DDDDDDDDD SSSSSSSSS CALLD D,{#}S {WC/WZ/WCZ} Call to S** by writing {C, Z, 10'b0, PC[19:0]} to D. C = S[31], Z = S[30].
EEEE 1011010 0LI DDDDDDDDD SSSSSSSSS CALLPA {#}D,{#}S Call to S** by pushing {C, Z, 10'b0, PC[19:0]} onto stack, copy D to PA.
EEEE 1011010 1LI DDDDDDDDD SSSSSSSSS CALLPB {#}D,{#}S Call to S** by pushing {C, Z, 10'b0, PC[19:0]} onto stack, copy D to PB.
EEEE 1101011 CZ0 DDDDDDDDD 000101101 CALL D {WC/WZ/WCZ} Call to D by pushing {C, Z, 10'b0, PC[19:0]} onto stack. C = D[31], Z = D[30], PC = D[19:0].
EEEE 1101011 CZ0 DDDDDDDDD 000101110 CALLA D {WC/WZ/WCZ} Call to D by writing {C, Z, 10'b0, PC[19:0]} to hub long at PTRA++. C = D[31], Z = D[30], PC = D[19:0].
EEEE 1101011 CZ0 DDDDDDDDD 000101111 CALLB D {WC/WZ/WCZ} Call to D by writing {C, Z, 10'b0, PC[19:0]} to hub long at PTRB++. C = D[31], Z = D[30], PC = D[19:0].
EEEE 1101101 RAA AAAAAAAAA AAAAAAAAA CALL #A Call to A by pushing {C, Z, 10'b0, PC[19:0]} onto stack. If R = 1, PC += A, else PC = A.
EEEE 1101110 RAA AAAAAAAAA AAAAAAAAA CALLA #A Call to A by writing {C, Z, 10'b0, PC[19:0]} to hub long at PTRA++. If R = 1, PC += A, else PC = A.
EEEE 1101111 RAA AAAAAAAAA AAAAAAAAA CALLB #A Call to A by writing {C, Z, 10'b0, PC[19:0]} to hub long at PTRB++. If R = 1, PC += A, else PC = A.
EEEE 11100WW RAA AAAAAAAAA AAAAAAAAA CALLD PA/PB/PTRA/PTRB,#A Call to A by writing {C, Z, 10'b0, PC[19:0]} to PA/PB/PTRA/PTRB (per W). If R = 1, PC += A, else PC = A.
0000 ------- --- --------- --------- _RET_ <inst> <ops> Execute <inst> always and return if no branch. If <inst> is not branching then return by popping stack[19:0] into PC.
EEEE 1101011 CZ1 000000000 000101101 RET {WC/WZ/WCZ} Return by popping stack (K). C = K[31], Z = K[30], PC = K[19:0].
EEEE 1101011 CZ1 000000000 000101110 RETA {WC/WZ/WCZ} Return by reading hub long (L) at --PTRA. C = L[31], Z = L[30], PC = L[19:0].
EEEE 1101011 CZ1 000000000 000101111 RETB {WC/WZ/WCZ} Return by reading hub long (L) at --PTRB. C = L[31], Z = L[30], PC = L[19:0].
EEEE 1011001 110 111111111 111110001 RETI3 Return from INT3. (CALLD $1FF,$1F1 WC,WZ)
EEEE 1011001 110 111111111 111110011 RETI2 Return from INT2. (CALLD $1FF,$1F3 WC,WZ)
EEEE 1011001 110 111111111 111110101 RETI1 Return from INT1. (CALLD $1FF,$1F5 WC,WZ)
EEEE 1011001 110 111111111 111111111 RETI0 Return from INT0. (CALLD $1FF,$1FF WC,WZ)
EEEE 1011001 110 111110000 111110001 RESI3 Resume from INT3. (CALLD $1F0,$1F1 WC,WZ)
EEEE 1011001 110 111110010 111110011 RESI2 Resume from INT2. (CALLD $1F2,$1F3 WC,WZ)
EEEE 1011001 110 111110100 111110101 RESI1 Resume from INT1. (CALLD $1F4,$1F5 WC,WZ)
EEEE 1011001 110 111111110 111111111 RESI0 Resume from INT0. (CALLD $1FE,$1FF WC,WZ)
EEEE 1101011 CZ0 DDDDDDDDD 000101100 JMP D {WC/WZ/WCZ} Jump to D. C = D[31], Z = D[30], PC = D[19:0].
EEEE 1101011 00L DDDDDDDDD 000110000 JMPREL {#}D Jump ahead/back by D instructions. For cogex, PC += D[19:0]. For hubex, PC += D[17:0] << 2.
EEEE 1101100 RAA AAAAAAAAA AAAAAAAAA JMP #A Jump to A. If R = 1, PC += A, else PC = A.
EEEE 1011011 00I DDDDDDDDD SSSSSSSSS DJZ D,{#}S Decrement D and jump to S** if result is zero.
EEEE 1011011 01I DDDDDDDDD SSSSSSSSS DJNZ D,{#}S Decrement D and jump to S** if result is not zero.
EEEE 1011011 10I DDDDDDDDD SSSSSSSSS DJF D,{#}S Decrement D and jump to S** if result is $FFFF_FFFF.
EEEE 1011011 11I DDDDDDDDD SSSSSSSSS DJNF D,{#}S Decrement D and jump to S** if result is not $FFFF_FFFF.
EEEE 1011100 00I DDDDDDDDD SSSSSSSSS IJZ D,{#}S Increment D and jump to S** if result is zero.
EEEE 1011100 01I DDDDDDDDD SSSSSSSSS IJNZ D,{#}S Increment D and jump to S** if result is not zero.
EEEE 1011100 10I DDDDDDDDD SSSSSSSSS TJZ D,{#}S Test D and jump to S** if D is zero.
EEEE 1011100 11I DDDDDDDDD SSSSSSSSS TJNZ D,{#}S Test D and jump to S** if D is not zero.
EEEE 1011101 00I DDDDDDDDD SSSSSSSSS TJF D,{#}S Test D and jump to S** if D is full (D = $FFFF_FFFF).
EEEE 1011101 01I DDDDDDDDD SSSSSSSSS TJNF D,{#}S Test D and jump to S** if D is not full (D != $FFFF_FFFF).
EEEE 1011101 10I DDDDDDDDD SSSSSSSSS TJS D,{#}S Test D and jump to S** if D is signed (D[31] = 1).
EEEE 1011101 11I DDDDDDDDD SSSSSSSSS TJNS D,{#}S Test D and jump to S** if D is not signed (D[31] = 0).
EEEE 1011110 00I DDDDDDDDD SSSSSSSSS TJV D,{#}S Test D and jump to S** if D overflowed (D[31] != C, C = 'correct sign' from last addition/subtraction).
EEEE 1011110 01I 000000000 SSSSSSSSS JINT {#}S Jump to S** if INT event flag is set.
EEEE 1011110 01I 000000001 SSSSSSSSS JCT1 {#}S Jump to S** if CT1 event flag is set.
EEEE 1011110 01I 000000010 SSSSSSSSS JCT2 {#}S Jump to S** if CT2 event flag is set.
EEEE 1011110 01I 000000011 SSSSSSSSS JCT3 {#}S Jump to S** if CT3 event flag is set.
EEEE 1011110 01I 000000100 SSSSSSSSS JSE1 {#}S Jump to S** if SE1 event flag is set.
EEEE 1011110 01I 000000101 SSSSSSSSS JSE2 {#}S Jump to S** if SE2 event flag is set.
EEEE 1011110 01I 000000110 SSSSSSSSS JSE3 {#}S Jump to S** if SE3 event flag is set.
EEEE 1011110 01I 000000111 SSSSSSSSS JSE4 {#}S Jump to S** if SE4 event flag is set.
EEEE 1011110 01I 000001000 SSSSSSSSS JPAT {#}S Jump to S** if PAT event flag is set.
EEEE 1011110 01I 000001001 SSSSSSSSS JFBW {#}S Jump to S** if FBW event flag is set.
EEEE 1011110 01I 000001010 SSSSSSSSS JXMT {#}S Jump to S** if XMT event flag is set.
EEEE 1011110 01I 000001011 SSSSSSSSS JXFI {#}S Jump to S** if XFI event flag is set.
EEEE 1011110 01I 000001100 SSSSSSSSS JXRO {#}S Jump to S** if XRO event flag is set.
EEEE 1011110 01I 000001101 SSSSSSSSS JXRL {#}S Jump to S** if XRL event flag is set.
EEEE 1011110 01I 000001110 SSSSSSSSS JATN {#}S Jump to S** if ATN event flag is set.
EEEE 1011110 01I 000001111 SSSSSSSSS JQMT {#}S Jump to S** if QMT event flag is set.
EEEE 1011110 01I 000010000 SSSSSSSSS JNINT {#}S Jump to S** if INT event flag is clear.
EEEE 1011110 01I 000010001 SSSSSSSSS JNCT1 {#}S Jump to S** if CT1 event flag is clear.
EEEE 1011110 01I 000010010 SSSSSSSSS JNCT2 {#}S Jump to S** if CT2 event flag is clear.
EEEE 1011110 01I 000010011 SSSSSSSSS JNCT3 {#}S Jump to S** if CT3 event flag is clear.
EEEE 1011110 01I 000010100 SSSSSSSSS JNSE1 {#}S Jump to S** if SE1 event flag is clear.
EEEE 1011110 01I 000010101 SSSSSSSSS JNSE2 {#}S Jump to S** if SE2 event flag is clear.
EEEE 1011110 01I 000010110 SSSSSSSSS JNSE3 {#}S Jump to S** if SE3 event flag is clear.
EEEE 1011110 01I 000010111 SSSSSSSSS JNSE4 {#}S Jump to S** if SE4 event flag is clear.
EEEE 1011110 01I 000011000 SSSSSSSSS JNPAT {#}S Jump to S** if PAT event flag is clear.
EEEE 1011110 01I 000011001 SSSSSSSSS JNFBW {#}S Jump to S** if FBW event flag is clear.
EEEE 1011110 01I 000011010 SSSSSSSSS JNXMT {#}S Jump to S** if XMT event flag is clear.
EEEE 1011110 01I 000011011 SSSSSSSSS JNXFI {#}S Jump to S** if XFI event flag is clear.
EEEE 1011110 01I 000011100 SSSSSSSSS JNXRO {#}S Jump to S** if XRO event flag is clear.
EEEE 1011110 01I 000011101 SSSSSSSSS JNXRL {#}S Jump to S** if XRL event flag is clear.
EEEE 1011110 01I 000011110 SSSSSSSSS JNATN {#}S Jump to S** if ATN event flag is clear.
EEEE 1011110 01I 000011111 SSSSSSSSS JNQMT {#}S Jump to S** if QMT event flag is clear.
EEEE 1101011 00L DDDDDDDDD 000110001 SKIP {#}D Skip instructions per D. Subsequent instructions 0..31 get cancelled for each '1' bit in D[0]..D[31].
EEEE 1101011 00L DDDDDDDDD 000110010 SKIPF {#}D Skip cog/LUT instructions fast per D. Like SKIP, but instead of cancelling instructions, the PC leaps over them.
EEEE 1101011 00L DDDDDDDDD 000110011 EXECF {#}D Jump to D[9:0] in cog/LUT and set SKIPF pattern to D[31:10]. PC = {10'b0, D[9:0]}.
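The SKIP pattern in the table is easy to misread, so here is a small host-side Python model of which instructions survive a given pattern. It only models the "bit n cancels instruction n" rule from the SKIP row. The function name is made up for illustration, and the model says nothing about timing or the SKIPF/EXECF differences.

```python
def skip_filter(pattern, instructions):
    """Return the instructions that still execute after SKIP with this pattern."""
    return [ins for i, ins in enumerate(instructions)
            if i >= 32 or not (pattern >> i) & 1]

seq = ["mov", "add", "sub", "shl", "ret"]
# Bits 1 and 3 are set, so the second and fourth instructions are cancelled.
print(skip_filter(0b01010, seq))
```

Bit 0 of the pattern corresponds to the first instruction after the SKIP, so patterns read right to left against the listing.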
Within SKIPF sequences where CALL/CALLPA/CALLPB are used to execute subroutines in which skipping will be suspended until after RET, all CALL/CALLPA/CALLPB immediate branch addresses must be absolute in cases where the instruction after the CALL/CALLPA/CALLPB might be skipped. This is not possible for CALLPA/CALLPB but CALL can use '#\address' syntax to achieve absolute immediate addressing. CALL/CALLPA/CALLPB can all use registers as branch addresses, since they are absolute.
For non-CALL/CALLPA/CALLPB branches within SKIPF sequences, SKIPF will work through all immediate-relative branches, which are the default for immediate branches within cog/LUT memory. If an absolute-address branch is being used (#\label, register, or RET, for example), you must not skip the first instruction after the branch. This is not a problem with immediate-relative branches, however, since the variable PC stepping works to advantage by landing the PC at the first instruction of interest at, or beyond, the branch address.
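The return value written by the CALL family, {C, Z, 10'b0, PC[19:0]}, can be modelled in a few lines. The Python sketch below packs and unpacks that 32-bit value as described in the CALL and RET rows of the summary. The function names are hypothetical and this is a host-side illustration only.

```python
def pack_return(c, z, pc):
    """Build {C, Z, 10'b0, PC[19:0]} as a 32-bit value."""
    return ((c & 1) << 31) | ((z & 1) << 30) | (pc & 0xFFFFF)

def unpack_return(word):
    """Recover (C, Z, PC): C = K[31], Z = K[30], PC = K[19:0]."""
    return (word >> 31) & 1, (word >> 30) & 1, word & 0xFFFFF

word = pack_return(1, 0, 0x00404)
print(hex(word), unpack_return(word))
```

Bits 29..20 are always written as zero, so the flags and the 20-bit PC round-trip without interfering with each other.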
'
' SPI receiving with a smartpin works very cleanly as the number of I/O buffer stages does not affect timing.
' This is so because the SPI clock follows the same number of stages as the data does.
'
' SPI sending, on the other hand, has a problem with responding to a SPI clock input in a timely manner.
' The number of input stages and output stages both stack to make a four sysclock lag
' from SPI clock input to SPI tx data output.
'
' Here is a demo of using a full period leading SPI clock to compensate for the four sysclock lag.
' Note that this relies on the sysclock to SPI clock ratio being set to 4:1. Unsuitable for others.
' Makes use of idle-high SPI clocking.
'
' Use a digital storage oscilloscope to view the timings.
'
Comments
On the P2 one can do this very efficiently.
Now each parameter occupies the same cog register as the instruction that loaded it.
You can't do it any shorter than that.
Enjoy!
Mike
Using the counter for timeouts
Here is how to set up your own timeout, and code to test whether the timeout has been exceeded. Note the use of the CMPM D,{#}S {WC/WZ/WCZ} instruction!
C = MSB of the result of D-S
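To show why the MSB of D-S is the right thing to test, here is a host-side Python model of a wrap-safe timeout check. The C flag rule is the one stated above; the `timed_out` helper and the way the deadline is formed are assumptions for illustration, and the scheme only holds for intervals shorter than 2^31 counts.

```python
MASK32 = 0xFFFFFFFF

def cmpm_c(d, s):
    """C flag of CMPM D,S: the MSB of the 32-bit result of D - S."""
    return ((d - s) & MASK32) >> 31

def timed_out(now, deadline):
    """True once the free-running 32-bit counter has reached the deadline."""
    return cmpm_c(now, deadline) == 0

start = 0xFFFFFF00                      # counter is about to wrap
deadline = (start + 0x200) & MASK32     # wraps round to 0x100
print(timed_out(start, deadline))       # still waiting
print(timed_out(0x00000180, deadline))  # counter has wrapped and passed the deadline
```

A plain unsigned compare would report a timeout immediately here, because 0xFFFFFF00 is numerically larger than the wrapped deadline of 0x100.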
[ this is fixed in spin2gui release 1.3.9 and later ]
That was a bug; I accidentally commented some error detection code out while debugging. It'll be fixed in the next version.
I can compile a sub-object without errors, but compiling the main object then throws errors in the sub-object.
Maybe you eliminate unused functions too early?
Enjoy!
Mike
EDIT: Ditched all the underscores in the constants to avoid Pnut symbol naming conflict
(10-3-2019) Improved comments of system variables
(24-6-2019) Link to original topic - https://forums.parallax.com/discussion/169838/p2-reset-possible-problem/p1
(18-7-2019) Added ANDN masking for RCFAST clock selection - from Chip's Spin2 interpreter
(20-11-2019) Belatedly added the simpler RCFAST hand-off option. First suggested by Chip - https://forums.parallax.com/discussion/comment/1474672/#Comment_1474672 and some details at https://forums.parallax.com/discussion/170405/clockfreq-clockmode-and-others-agreement-for-what-and-where
But PNut throws the error "_CLKFREQ/_XINFREQ specified without _CLKMODE". So it looks like these three words may be reserved.
should be _CLOCKFREQ
From the manual... Refer here forums.parallax.com/discussion/comment/1478399/#Comment_1478399 for further discussion (please keep this thread free from discussion)
(using the Rom source code)
From the docs
Be aware that this feature cannot cross the PinA/PinB boundary and wraps within the 32-pin group.
For example, the following code configures pins 30,31,0,1, not 30,31,32,33.
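The wrap-around can be modelled on a PC. The Python sketch below lists which pins a base pin plus a number of additional pins would cover if the low five bits of the pin number wrap while the group bit stays fixed. The function name is hypothetical, and the model is inferred only from the behaviour described here, not from the original listing.

```python
def pins_in_field(base, extra):
    """Pins covered by `base` plus `extra` additional pins, wrapping in the 32-pin group."""
    group = base & 0x20   # 0 for pins 0..31, 32 for pins 32..63
    return [group | ((base + i) & 0x1F) for i in range(extra + 1)]

print(pins_in_field(30, 3))   # base pin 30 plus 3 more
print(pins_in_field(62, 3))   # same thing in the upper group
```

Starting at pin 30 the model gives 30, 31, 0, 1, and starting at pin 62 it gives 62, 63, 32, 33, so the field never spills into the other group.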