I'm going to bed now, but I'll use this tomorrow to test the single-step after hub-exec REP. I think I've figured out the trouble now. I need to raise 'repa' one instruction early in hub-exec.
Chip - have you considered taking a little time and setting up a Modelsim simulation environment? It doesn't seem like it should be that hard, and for cases like this it sounds like it would save you time. I don't know if there's someone trusted who could help you out - e.g. if they had access to your source code, then they could work on it for you. The simulation could just load a custom ROM with a simple test case, so you wouldn't need to go through the lengthy boot process.
Today was spent recoding a big mux that selects the next PC value. It's much cleaner now and the compiler has more freedom to optimize it. This was all done in preparation for solving the single-step problem that Ozpropdev discovered. Tomorrow I'll be back on the single-step issue.
I don't know anything about Modelsim. It's my feeling that the setup I have is quite decent. I just need to set up test cases, occasionally, that sometimes require shifting into granny gear. Shifting gears is the hard part.
That sounds like a nice cleanup. Probably the sort of thing that only gets done on major optimisation drives in bigger teams.
Cheers to Oz too, for doing the work to find the bug. Presumably it affects all interrupt sources, and I wonder if more than just the REP instruction might be caught up in it too. At the very least, the bug adds more jitter to interrupt timings.
Worst case, one could just insert a NOP after all hubexec REP blocks, right?
If you really wanted to single step, that is.
Single step sounds great for seeing what the instructions really do.
But once we have good docs, I don't think it's the most important feature, so it doesn't have to be perfect in all cases...
I'm pretty certain this is not just a debug concern, that's just where the bug has made itself obvious. Besides, we're getting a probable tiny improvement in active power draw from Chip's re-engineering effort. Each one of these tweaks adds up.
Okay! I think I solved the problem. Here is a new set of files for the Prop123-A9 board. This FPGA image has just two cogs and no CORDIC, but it will be fine for determining if the bug is gone:
Ozpropdev, I just got to the point where things check out on the logic analyzer. I haven't tested it with your debugger, yet. I think it will work fine, though. I'll try the debugger out in the morning. Thanks!
An important note about changes to the 'REP D/#,S/#' instruction:
I changed the D/# value to be the actual number of instructions to repeat (not # minus 1). If 0 is used for D/#, nothing will be repeated, regardless of the S/# value. PNut.exe has been updated to properly compute #D when @label is used.
The S/# value conveys the number of iterations. An S/# value of 1 will execute the subsequent block once, while 2 will execute it twice. If S/# is 0, the block will be repeated indefinitely.
If a branch occurs in the block of instructions that is being repeated, the REP will be canceled.
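For anyone trying to picture the rules above, here is a rough behavioural sketch in Python. It is not Chip's Verilog and not anything PNut produces; the function name `rep_trace`, the `max_steps` cap and the convention that any instruction string starting with "JMP" counts as a branch are all assumptions made for illustration. It only models which instructions run and in what order: D instructions repeated S times, S=0 meaning forever, D=0 meaning nothing is repeated, and a branch cancelling the REP.

```python
from typing import List


def rep_trace(d: int, s: int, following: List[str], max_steps: int = 16) -> List[str]:
    """Return the order in which the instructions after a REP would execute.

    d         -- number of instructions in the repeat block (0 = nothing repeated)
    s         -- number of iterations (0 = repeat indefinitely)
    following -- instruction names that come right after the REP
    max_steps -- hypothetical cap so an infinite repeat still returns a trace
    """
    if d == 0:
        # Nothing is repeated, regardless of s: each instruction runs once.
        return list(following)[:max_steps]

    block, rest = following[:d], following[d:]
    if not block:
        # No instructions available to repeat.
        return []

    trace: List[str] = []
    iteration = 0
    while s == 0 or iteration < s:
        for ins in block:
            if len(trace) >= max_steps:
                return trace
            trace.append(ins)
            if ins.startswith("JMP"):
                # A branch inside the block cancels the REP.
                # The branch destination is not modelled here.
                return trace
        iteration += 1

    # Block finished: fall through to the instructions after it.
    trace.extend(rest)
    return trace[:max_steps]


if __name__ == "__main__":
    print(rep_trace(2, 3, ["ADD", "SUB", "NOP"]))
    print(rep_trace(0, 5, ["ADD", "NOP"]))
    print(rep_trace(1, 0, ["ADD", "NOP"], max_steps=4))
    print(rep_trace(2, 0, ["ADD", "JMP target", "NOP"]))
```

The `max_steps` cap exists only so the S=0 case can be printed; real hardware would keep repeating until a branch or interrupt takes it out of the block.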
All these tests were CogExec only. Only the first test cases of S=0 failed. S=0 OK; the debugger needs modification for 0-length REP blocks.
I will test LutExec and HubExec tomorrow.
Cancelling with a JMP works as expected in CogExec, and single step shows the instruction at the destination address as the next instruction.
I'm running all the tests again now with a tweaked debugger.
I will post a complete set of results soon.
Today I tested in Cog, Lut and Hub exec for all test cases above.
All work correctly except for the case of D=1 and a JMP instruction (Cancel).
In cog and lut exec, jumps to cog or lut addresses land short by 1 address.
JMPs to hub addresses are OK.
Probably not a big deal as a REP block with a length of 1 and a JMP is illogical anyway.
So looking good Chip!
Boy, that case of D=1 and JMP in cog/lut-exec is weird. I half-way know why that's happening: a repeat is occurring, already, on the JMP instruction, as REP executes. The PC is not in a natural state at that time and the JMP is relative.
Getting back into that REP instruction has been a chore.
I made a new version today which turns REP into a NOP if D[8:0] is zero. This means the next instruction can be single-stepped, as you'd expect.
Does it work for D>1 and the last instruction in the REP block is a JMP? (I can't look at your test code at the moment.)
I had tested for D>1 which worked OK, but in all my tests I hadn't tried JMP as the last instruction in the block.
You're onto something there, Seairth. It seems that if the last instruction is a JMP, the issue raises its head again.
I'll keep digging....
Comments
Basic steps to get started
F11 code into P123 board from Pnut
Start PST @ 1M baud
Wait for prompt ">"
Type STK+
Type SWL+
Repeated pressing of "*" will step through the code and show flags, stack and register values.
Pressing "?" will show a basic help screen.
I'm still generating docs for this work in progress.
Hope it helps.
Try Version 1.2a instead; I was a little hasty posting the code.
I fixed a bug in the debugger!
Here's the updated version 1.2b
https://drive.google.com/a/parallax.com/file/d/0B9NbgkdrupkHbnVRb0hqTWVqOTg/view?usp=sharing
I will try it out.
Looks good.
Super! Thanks, Ozpropdev.
The corner cases for REP are:
D=0 and S=0
D=0 and S=1
D=0 and S>1
D=1 and S=0
D=1 and S=1
D=1 and S>1
D>1 and S=0
D>1 and S=1
D>1 and S>1
Also, how does it return to single-step after a branch within the repeat block?
These are all things that need testing.
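The nine combinations above can be generated rather than typed out by hand, which makes it harder to miss one when re-running the tests in Cog, Lut and Hub exec. This is only a sketch: the representative values (3 standing in for ">1"), the helper names and the wording of the expected outcomes are assumptions, with the outcomes taken from the REP rules described earlier in the thread.

```python
from itertools import product

# Representative values for each class; 3 is an arbitrary stand-in for ">1".
D_CLASSES = {"D=0": 0, "D=1": 1, "D>1": 3}
S_CLASSES = {"S=0": 0, "S=1": 1, "S>1": 3}


def expected(d: int, s: int) -> str:
    """Expected behaviour according to the REP rules stated in the thread."""
    if d == 0:
        return "REP has no effect; nothing is repeated"
    if s == 0:
        return "block repeats indefinitely"
    return f"block of {d} instruction(s) runs {s} time(s)"


def corner_cases():
    """Return (D label, S label, expected outcome) for all nine combinations."""
    return [
        (d_name, s_name, expected(d, s))
        for (d_name, d), (s_name, s) in product(D_CLASSES.items(), S_CLASSES.items())
    ]


if __name__ == "__main__":
    for d_name, s_name, outcome in corner_cases():
        print(f"{d_name} and {s_name}: {outcome}")
```

The list comes out in the same order as the cases written above, so the output can be checked off line by line against the debugger results.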
Here's the first test for
D=0 and S=0
D=0 and S=1
D=0 and S>1
And here's the result:
Note in all three cases the ADD instruction is skipped.
Works OK, debugger needs modification to allow for zero length blocks.
D=1 and S=0 'never returns to the debugger because of the infinite repeat block, as expected
D=1 and S=1
D=1 and S>1
These test OK
D>1 and S=1
D>1 and S>1
Test OK
To clarify, when S==0 the block is skipped over?
Never mind, I see S==0 means infinite loop...
Is allowing a branch inside REP new?
REP with D==0 is just like NOP then?
Ozpropdev, thanks for testing.
I can fix the problem of skipping one instruction when REP executes with D=0.
Cancelling a REP via a branch is new (and necessary). Prop2-Hot did that.
Yes, I will make REP with D=0 into a NOP.