An Issue with REP blocks, Single step interrupts and HUB execution

Hi Chip
I have been playing with the single step interrupt mechanism in P2 and it works great!
However I am experiencing one little issue when stepping into a REP block in HUB exec mode.
Here is the demo REP block that I used for COG,LUT and HUB exec testing.
rep_block       mov     delta,#0
                rep     @.rb_end,#32
                add     alpha,#1
                sub     bravo,#2
                mul     charlie,#2
.rb_end
                bmask   delta,#19
                ret
Here are the results from my P2 debugger (work in progress) for COG exec mode.
COG address advances from $007 to $00B as expected.
-----------------------------------------------------------------------------------------(P2 Debugger)
STACK : --_00000 --_00000 --_00000 --_00000 --_00000 --_00000 --_00000 --_00000
COG | $00040: "Delta" $00000000 %00000000_00000000_00000000_00000000 #0
00007: FCDC0420 REP #$002,#$020
00008: F1047A01 ADD $03D,#$001
00009: F1847C02 SUB $03E,#$002
0000A: FA047E02 MUL $03F,#$002
Flags (CZ) = _Z
>*
-----------------------------------------------------------------------------------------(P2 Debugger)
STACK : --_00000 --_00000 --_00000 --_00000 --_00000 --_00000 --_00000 --_00000
COG | $00040: "Delta" $00000000 %00000000_00000000_00000000_00000000 #0
0000B: F99C8013 BMASK $040,#$013
Flags (CZ) = _Z
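As a cross-check on the disassembly above, the REP operands can be pulled straight out of the instruction word. This is a minimal Python sketch, assuming the D field sits in bits 17..9 and the S field in bits 8..0, which agrees with the debugger showing FCDC0420 as REP #$002,#$020. The helper name decode_rep is made up for illustration and is not part of the debugger.

```python
def decode_rep(word):
    """Split a 32-bit P2 instruction word into its D and S operand fields.

    Assumption: D occupies bits 17..9 and S occupies bits 8..0.
    """
    d = (word >> 9) & 0x1FF   # 9-bit D field
    s = word & 0x1FF          # 9-bit S field
    return d, s


d, s = decode_rep(0xFCDC0420)
print(f"REP #${d:03X},#${s:03X}")  # REP #$002,#$020
```

Note that D reads 2 for a three-instruction block in this encoding.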
Here are the results for LUT exec mode.
Address advances from $283 to $287 as expected.
-----------------------------------------------------------------------------------------(P2 Debugger)
STACK : -Z_0000E --_00000 --_00000 --_00000 --_00000 --_00000 --_00000 --_00000
COG | $00040: "Delta" $00000000 %00000000_00000000_00000000_00000000 #0
00283: FCDC0420 REP #$002,#$020
00284: F1047A01 ADD $03D,#$001
00285: F1847C02 SUB $03E,#$002
00286: FA047E02 MUL $03F,#$002
Flags (CZ) = _Z
>*
-----------------------------------------------------------------------------------------(P2 Debugger)
STACK : -Z_0000E --_00000 --_00000 --_00000 --_00000 --_00000 --_00000 --_00000
COG | $00040: "Delta" $00000000 %00000000_00000000_00000000_00000000 #0
00287: F99C8013 BMASK $040,#$013
Flags (CZ) = _Z
And finally, here are the HUB exec mode results.
Note that the address advances past the expected address of $7014 to $7018.
The expected "BMASK" instruction has executed as proven by the cog register "delta" contents.
-----------------------------------------------------------------------------------------(P2 Debugger)
STACK : -Z_00289 -Z_0000E --_00000 --_00000 --_00000 --_00000 --_00000 --_00000
COG | $00040: "Delta" $00000000 %00000000_00000000_00000000_00000000 #0
07004: FCDC0420 REP #$002,#$020
07008: F1047A01 ADD $03D,#$001
0700C: F1847C02 SUB $03E,#$002
07010: FA047E02 MUL $03F,#$002
Flags (CZ) = _Z
>*
-----------------------------------------------------------------------------------------(P2 Debugger)
STACK : -Z_00289 -Z_0000E --_00000 --_00000 --_00000 --_00000 --_00000 --_00000
COG | $00040: "Delta" $000FFFFF %00000000_00001111_11111111_11111111 #1048575
07018: FD600031 RET
Flags (CZ) = _Z
If the instruction following the REP block is a two-long variant (e.g. AUGx, ALTx), it is executed and skipped in the same manner.
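The "expected" addresses in the three runs above follow from simple arithmetic: the first instruction after the block sits one REP plus three block instructions past the REP itself. Here is a small Python sketch of that calculation, assuming cog/LUT addresses count longs and hub addresses count bytes (4 per instruction). The function name next_after_rep is hypothetical.

```python
def next_after_rep(rep_addr, block_len, hub_exec):
    """Address of the first instruction following a REP block.

    rep_addr  -- address of the REP instruction itself
    block_len -- number of instructions inside the block
    hub_exec  -- True for hub exec (byte addresses), False for cog/LUT exec
    """
    step = 4 if hub_exec else 1          # assumed address increment per instruction
    return rep_addr + (1 + block_len) * step


print(hex(next_after_rep(0x007, 3, False)))   # 0xb
print(hex(next_after_rep(0x283, 3, False)))   # 0x287
print(hex(next_after_rep(0x7004, 3, True)))   # 0x7014
```

The single step in hub exec lands on $7018 instead, one instruction (4 bytes) past the computed $7014.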
Hope I made sense...
Cheers
Brian
Comments
Isn't REP one of them?
It's just the single step that has an issue.
PS: I haven't read the opening post.
For REP in hub-exec mode, things are handled a bit differently than they are in cog-exec mode. I have an idea of what it could be, but I need to relearn this chunk of Verilog. Working on it....
Thought it would go off the rails if rep block didn't happen to be all in the fifo...
The FIFO is not used. On any branch, including a REP loop, the pipeline is reloaded from memory.
I'm looking this over, still. I'm wondering if there are any differences when the body of the REP block is only one instruction. Any chance you could try out one-instruction cases?
Same goes for a single instruction REP block.
In this case the instruction at $700C is executed too.
-----------------------------------------------------------------------------------------(P2 Debugger)
COG | $00040: "Delta" $00000000 %00000000_00000000_00000000_00000000 #0
07004: FCDC0020 REP #$000,#$020
07008: F1048001 ADD $040,#$001
Flags (CZ) = _Z
>*
-----------------------------------------------------------------------------------------(P2 Debugger)
COG | $00040: "Delta" $000FFFFF %00000000_00001111_11111111_11111111 #1048575
07010: FD600031 RET
Flags (CZ) = _Z
>
How about the original test, but with a rep count of 1? It should allow you to step through the block, as there is, actually, no REP taking place.
-----------------------------------------------------------------------------------------(P2 Debugger)
COG | $00040: "Delta" $00000000 %00000000_00000000_00000000_00000000 #0
07004: FCDC0401 REP #$002,#$001
07008: F1047A01 ADD $03D,#$001
0700C: F1847C02 SUB $03E,#$002
07010: FA047E02 MUL $03F,#$002
Flags (CZ) = _Z
>*
-----------------------------------------------------------------------------------------(P2 Debugger)
COG | $00040: "Delta" $000FFFFF %00000000_00001111_11111111_11111111 #1048575
07018: FD600031 RET
Flags (CZ) = _Z
>
What about cog-exec? I assume that was hub-exec.
-----------------------------------------------------------------------------------------(P2 Debugger)
COG | $00040: "Delta" $00000000 %00000000_00000000_00000000_00000000 #0
00007: FCDC0401 REP #$002,#$001
00008: F1047A01 ADD $03D,#$001
00009: F1847C02 SUB $03E,#$002
0000A: FA047E02 MUL $03F,#$002
Flags (CZ) = _Z
>*
-----------------------------------------------------------------------------------------(P2 Debugger)
COG | $00040: "Delta" $00000000 %00000000_00000000_00000000_00000000 #0
0000B: F99C8013 BMASK $040,#$013
Flags (CZ) = _Z
>
What about in cog-exec mode? Will it stop on the next REP, or execute that whole block, too?
-----------------------------------------------------------------------------------------(P2 Debugger)
00007: FCDC0401 REP #$002,#$001
00008: F1048201 ADD $041,#$001
00009: F1848402 SUB $042,#$002
0000A: FA048602 MUL $043,#$002
Flags (CZ) = _Z
>*
-----------------------------------------------------------------------------------------(P2 Debugger)
0000B: FCDC0401 REP #$002,#$001
0000C: F1048201 ADD $041,#$001
0000D: F1848402 SUB $042,#$002
0000E: FA048602 MUL $043,#$002
Flags (CZ) = _Z
>*
Thanks for checking that. I'm still looking the code over, reconstructing how this works in my head.
///////////
//       //
//  REP  //
//       //
///////////

reg [11:0] repdc;
reg [9:0] repic, repi;
reg [32:0] repb;
reg repa;

wire rep_si  = !hubx && ~|d[8:0];   // rep, single instruction if cog exec and d[8:0] = 0
wire rep_sb  = s == 1'b1;           // rep, single block if s = 1
wire repiz   = ~|repi;              // instruction counter zero?
wire repmore = |repb[32:1];         // infinite or more blocks?

`regscan (repdc, 12'b0,                                 // p delta capture
          go && rep,
          hubx ? {{1'b0, d[8:0]} + 1'b1, 2'b0}          // for hub exec, must jump back by {d[8:0] + 1} * 4
               : d[8:0])                                // for cog exec, must subtract d[8:0] from p

`regscan (repic, 10'b0,                                 // instruction count capture
          go && rep,
          d[8:0] + hubx)                                // cog exec uses d[8:0], hub exec uses d[8:0] + 1

`regscan (repi, 10'b0,                                  // instruction counter
          go && (rep || repa),
          rep   ? rep_si ? 9'b0 : d[8:0] - !hubx        // rep, load zero if single instr, else load instr count - 1
        : repiz ? repic                                 // if instruction counter empty, reload
                : repi - 1'b1)                          // else, decrement

`regscan (repb, 33'b0,                                  // block counter
          go && (rep || repa && repiz),
          rep ? {~|s, rep_si ? s - 1'b1 : s}            // rep, capture infinite flag and block count
              : {repb[32], repb[31:0] - 1'b1})          // preserve infinite flag, decrement block count

`regscan (repa, 1'b0,                                   // rep active flag
          !ena || go && (rep || repa && repiz),
          !ena ? 1'b0
        : rep  ? !rep_si || !rep_sb                     // rep, remain active if not single instruction or not single block
               : repmore)                               // after rep, remain active if infinite or more blocks

wire rept = rep ? rep_si && !rep_sb                     // rept on rep if cog exec and single instruction and multiple blocks
                : repa && repiz && repmore;             // rept after rep when active and instruction counter zero and more blocks

wire [11:0] repd = rep ? d[8:0]                         // p delta on rep, cog exec only
                       : repdc;                         // p delta after rep
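To make the cog-exec versus hub-exec difference in the Verilog above easier to see, here is a rough Python transcription of just the three values loaded when a REP executes (repdc, repic and the initial repi). It models only those load expressions, not the pipeline, the 'go' timing, or when 'repa' drops, so it does not by itself explain the single-step behaviour. The function name rep_capture is made up for illustration.

```python
def rep_capture(d, hubx):
    """Values captured at the moment a REP executes, per the load expressions.

    d    -- 9-bit D field of the REP instruction
    hubx -- True when executing from hub, False for cog/LUT exec
    """
    d &= 0x1FF
    hub = 1 if hubx else 0
    rep_si = (not hubx) and d == 0            # single instruction, cog exec only
    repdc = ((d + 1) << 2) if hubx else d     # jump-back delta: bytes in hub, longs in cog
    repic = d + hub                           # reload value for the instruction counter
    repi = 0 if rep_si else d - (1 - hub)     # initial instruction counter
    return {"repdc": repdc, "repic": repic, "repi": repi}


print(rep_capture(2, False))  # {'repdc': 2, 'repic': 2, 'repi': 1}
print(rep_capture(2, True))   # {'repdc': 12, 'repic': 3, 'repi': 2}
```

For the same D = 2, the hub-exec counters start one higher than the cog-exec ones.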
It usually takes me a few hours to get back into something like this and fully remember how it works.
Yes. The interrupt is inhibited if a REP instruction is executing or the register 'repa' is high.
What happens in hub-exec seems normal. I'm trying to figure out why cog-exec seems to work as one would expect. I think in cog-exec, 'repa' goes low one instruction early. I don't understand why, yet.
Okay. I see why it does that now. The register 'repb' doesn't go down to one until after the last loop back. Meanwhile, 'repa' is high. Only at the end of the last block does 'repa' go low. I think it's happening one instruction too late in the case of hub-exec.
I'm compiling a test version of the Prop2 now, so that I can look at the PC and JMP bits during REP, both in cog-exec and hub-exec. I need to relearn what it's doing. It's like my brain unspooled some details over the last two months.
I'm watching the signals on a logic analyzer, which shows me what I need to see, for now.
Does your debugger drive a monitor or a serial terminal? I'd like to use it, but I don't know what I'd need in order to run it. I assume it's pretty simple, right? If you post it, I will try it out.
I'm preparing it for posting now....