Normal interrupts don't appear to be a problem. For some reason, it is considered too risky to allow interrupts during skipping because "they would introduce hidden state data (skip bits waiting in the wings) during debug interrupts."
I assume "hidden" means not readable directly. Skip bits are hidden but not unknown, and isn't Q also hidden?
It would help if the potential problems during debug interrupts were explained. They are essentially the same as CALLs in terms of what happens to the skip bits. To be honest, I don't know why we have to mess about and can't have skip interrupts all the time.
How does XBYTE debugging with interrupts compare to that without?
My big concern was that it may be too complex for me to figure out how to allow interrupts during skipping, not just that there would be hidden state data.
I thought I would do an experiment and take the obvious and simple path, which might be too simplistic, but it would let me see if the functionality was even in the ball park. So, I added a simple skip disabler that just inhibited skipping if an interrupt service routine was executing. Very simple. To my surprise, it seems to be all that was needed! Skipping is working with all interrupts now. You can single-step through it. I'm really glad about that, because I hated that single stepping was opaque, not to mention causing interrupt latencies to grow.
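If it helps to picture the experiment, here is a tiny Python model of that skip-disabler idea (purely illustrative, not the actual logic): skip bits are only consumed while no ISR is active, so ISR code is never skipped and the frozen pattern resumes untouched on return.

```python
# Illustrative model of the skip disabler: a skip buffer that is simply
# ignored (and left untouched) while an interrupt service routine runs.
class SkipModel:
    def __init__(self, pattern, length):
        self.bits = pattern      # LSB = skip bit for the next instruction
        self.n = length          # number of pattern bits remaining
        self.in_isr = False      # set on interrupt entry, cleared on return

    def step(self):
        """Return True if the next instruction executes, False if skipped."""
        if self.in_isr or self.n == 0:
            return True          # ISR code is never skipped; bits are frozen
        skip = self.bits & 1
        self.bits >>= 1          # consume one skip bit
        self.n -= 1
        return not skip

m = SkipModel(0b0110, 4)
print([m.step() for _ in range(4)])   # [True, False, False, True]
isr = SkipModel(0b0110, 4)
isr.in_isr = True
print(isr.step())                     # True: skipping inhibited inside ISR
```

The point of the sketch is the single `in_isr` test: one gate on the consume path is all the "disabler" amounts to.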
So, skipping now works with interrupts.
One thing this won't work with is if you decide to implement a time-slice scheduler that switches tasks on a timer interrupt. In that case you'll be returning to different code after the interrupt. I suppose that isn't likely on the Propeller though since independent tasks are likely to be run on separate COGs.
Yes, as long as XBYTE/SKIP/SKIPF/EXECF were not part of the top-level cog code, you could switch tasks okay.
I'm sorry to ask for a feature, no matter how small, but would it be difficult to add a way to get the current frozen skip mask from inside an interrupt service routine? That would allow a scheduler to properly restore the skip mask.
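To show what the request would buy, here is a purely hypothetical Python sketch of a time-slice context switch, assuming such a read-back existed. The names `read_skip_mask` and `load_skip_mask` are invented stand-ins; no such instructions exist yet.

```python
# Hypothetical scheduler sketch: if the frozen skip mask were readable
# inside the ISR, a task switch could save and restore it like any
# other piece of per-task state. All names here are invented.
class Task:
    def __init__(self, pc):
        self.pc = pc
        self.skip_mask = 0       # saved skip state for this task

def context_switch(current, nxt, read_skip_mask, load_skip_mask):
    current.skip_mask = read_skip_mask()   # save the frozen skip state
    load_skip_mask(nxt.skip_mask)          # restore the next task's state
    return nxt
```

With that in place, returning to different code after the timer interrupt would no longer corrupt a skip sequence in flight.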
Does that mean DRVH #$021 does not execute, so it does correctly skip, but it takes what would then be a dummy cycle?
Yes, it doesn't execute, and so a "dummy" step is needed.
So this is operating as programmed, but the Debug could show a NOP (or skNOP?) to make that operation clearer?
Is it predictable when it inserts that NOP action?
How does the relative timing compare, in Step-Debug vs non-step?
Wasn't there some comment, a while back, about SKIPF reverting to SKIP in some cases?
i.e. it does the fast 'zero-cycle' in some cases, but can revert to a NOP equivalent?
That would be tolerable? - as single-step is not real-time anyway, and as long as the skip decisions are valid, you can debug this way, but run faster.
I tried lots of things to clean up that initial address hiccup, but I wasn't able to eradicate it.
This version has some things that may help drive your debugger, though.
When you do a GETINT instruction, D[31] will be set if there are any '1' skip bits buffered. D[30:22] will contain the next nine upcoming SKIP bits, with the LSB showing the very next one to be used.
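For reference, a quick Python sketch of how a debugger could unpack those two fields. The bit positions are taken from the description above; GETINT returns other status bits as well, which this sketch ignores.

```python
# Unpack the skip-status fields described above from a GETINT result:
# D[31]    = 1 if any '1' skip bits are buffered
# D[30:22] = the next nine skip bits, bit 22 being the very next one
def decode_skip_status(d):
    pending = bool((d >> 31) & 1)
    next9 = (d >> 22) & 0x1FF
    bits = [(next9 >> i) & 1 for i in range(9)]   # bits[0] = next skip bit
    return pending, bits

d = (1 << 31) | (0b000000101 << 22)   # pending; next bits: skip, run, skip...
pending, bits = decode_skip_status(d)
print(pending, bits[:3])              # True [1, 0, 1]
```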
In cog memory, SKIPF will jump ahead by multiple instructions. If SKIPF is used in hub memory, it will behave just like SKIP, cancelling instructions instead of jumping over them.
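A toy comparison of the two behaviours just described, as Python traces: SKIP cancels each skipped instruction (it still occupies a slot), while SKIPF in cog memory jumps over whole runs of skipped instructions.

```python
# Toy traces of SKIP vs SKIPF over a skip pattern (LSB = first
# instruction after the skip setup). Illustrative model only.
def trace_skip(pattern, count):
    """One entry per slot; None marks a cancelled (skipped) instruction."""
    return [pc if not (pattern >> pc) & 1 else None for pc in range(count)]

def trace_skipf(pattern, count):
    """Only executed PCs appear; skipped instructions are jumped over."""
    return [pc for pc in range(count) if not (pattern >> pc) & 1]

print(trace_skip(0b0110, 4))    # [0, None, None, 3]
print(trace_skipf(0b0110, 4))   # [0, 3]
```

In hub memory, per the description above, SKIPF would produce the `trace_skip` behaviour as well.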
I'm getting the same results as the previous FPGA image.
In terms of what works/does not work, is this the present matrix?:
SKIP : Works in Real-time Run & in Step Debug (in all memory locations?)
SKIPF : Works in Real-time Run, and works in Step Debug, but has one (or more?) NOP-equivalent SKIP artifacts that can appear?
Do those artifacts have no net effect on operation? Is there always only one, or can more occur with differing SKIP patterns?
(e.g. if the first 2 SKIPF, what is the result then?)
I realized that if the first skip bit (LSB) is 1, things blow up, probably because the interrupt CALLD gets cancelled, instead of the next instruction. Looks like I just remedied that by not allowing any interrupt when the SKIP circuit is cancelling the next instruction. This will have some effect on single-stepping.
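A sketch of that hazard, as an illustrative Python model: if an interrupt is taken while the skip circuit is about to cancel the next instruction, the pending cancel hits the interrupt's CALLD instead of the intended instruction. The remedy modelled here is exactly the one described above, holding the interrupt off whenever the next skip bit is 1.

```python
# Illustrative model of the CALLD hazard and its fix. skip_bits LSB is
# the bit queued for the very next instruction.
def take_interrupt(skip_bits, fix_enabled=True):
    if skip_bits & 1:
        if fix_enabled:
            return "interrupt held off"   # wait until the cancel has landed
        return "CALLD cancelled"          # the blow-up: entry gets skipped
    return "interrupt taken"

print(take_interrupt(0b1, fix_enabled=False))   # CALLD cancelled
print(take_interrupt(0b1))                      # interrupt held off
print(take_interrupt(0b0))                      # interrupt taken
```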
One new thing: On XBYTE, C and Z are set to the bytecode index bits 1 and 0.
Bits 1 and 0 of LUT address? What's the thinking here?
The latest image DOES support the CALL/RET with skip freezing. If the CALL doesn't execute, either due to being skipped or exec-condition-false, the skip freezing does not occur. It works like one would expect.
I made XBYTE write the bytecode index LSBs to C and Z, so that if you had a block of, say, four instructions that used the same bytecode routine, you would have 4 different C/Z patterns going into the bytecode routine. I use this in the interpreter to distinguish between RETURN and ABORT, since skipping was not a sufficient mechanism to differentiate the two instructions, due to looping within the routine:
'
'
' ABORT result a C=0
' ABORT x b C=0
' RETURN result c C=1
' RETURN x d C=1
'
' (pop result)
' pop return
' pop dbase
' pop vbase
' pop pbase | flags
' pop <old top>
'
retn mov ptra,dbase 'a b c d ptra points to dbase
setq #6-1 'a | c | pop z/pbase/vbase/dbase/y/x (1st pass only)
setq #5-1 '| b | d pop z/pbase/vbase/dbase/y
rdlong z,--ptra[5] 'a c ptra points to top of stack after pop
if_nc bitl pbase,#1 wcz ' if abort and !try, return again
if_nc jmp #retn
bitl pbase,#0 wcz ' result?
if_nc mov x,z ' if not, get z (top of stack) into x
if_c wrlong x,++ptra ' if so, x holds 'result', push x
_ret_ rdfast #0,y ' start new bytecode read
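To make the C/Z trick concrete, here is a small Python sketch. The index assignments a..d are hypothetical, chosen only to mirror the comment block in the routine above: four consecutive bytecode table entries share one routine, and the routine tells them apart from C/Z.

```python
# XBYTE sets C and Z from bits 1 and 0 of the bytecode index, so four
# consecutive bytecodes sharing one routine arrive with distinct C/Z.
def xbyte_flags(bytecode_index):
    c = (bytecode_index >> 1) & 1
    z = bytecode_index & 1
    return c, z

# Hypothetical indices 0..3 standing in for the a..d columns above:
names = ["ABORT result", "ABORT x", "RETURN result", "RETURN x"]
for idx, name in enumerate(names):
    print(name, xbyte_flags(idx))
```

Running it shows C=0 for both ABORT entries and C=1 for both RETURN entries, matching the C=0/C=1 annotations in the listing.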
Thanks for the explanation.
If XBYTE now always writes the bytecode index LSBs to C and Z, then my XBYTE example will be completely ruined. Very often C and Z must be preserved between one bytecode read and the next, the first being one of several different prefix bytes.
The bytecode is stored at $1F6. Couldn't the appropriate bit(s) of that be tested at the start of the skip sequence?
I made XBYTE write the bytecode index LSBs to C and Z, so that if you had a block of, say, four instructions that used the same bytecode routine, you would have 4 different C/Z patterns going into the bytecode routine.
My first thought was: this makes it clear that using the C and Z flags as long-term state in a bytecode interpreter is not possible:
C and Z are not part of a bytecoded program's state, but a local element of a single bytecode's implementation.
This is probably because it would not be easy to keep track of flags' states when programming a bytecode interpreter.
I don't mind, seems OK to me.
Was this your intention, Chip?
If so, then putting a possibly useful value in C and Z (instead of just clearing them for each XBYTE) is very nice.
I like that least significant two bits in the bytecode can be mentioned (in the right order) in prefixes like "IF_00", "IF_01", "IF_NOT_00", "IF_NOT_10", etc.
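A small Python sketch of how those prefixes would read (the evaluation semantics here are my assumption from the names, not a spec): C holds bit 1 of the bytecode, Z holds bit 0, and IF_xx / IF_NOT_xx test the pair.

```python
# Assumed semantics of the IF_xx prefixes: 'IF_10' means C=1 and Z=0;
# 'IF_NOT_10' is its negation. C = bytecode bit 1, Z = bytecode bit 0.
def condition_met(name, c, z):
    neg = name.startswith("IF_NOT_")
    bits = name.rsplit("_", 1)[1]          # the two-digit pattern
    want = (c, z) == (int(bits[0]), int(bits[1]))
    return want != neg

print(condition_met("IF_01", 0, 1))        # True
print(condition_met("IF_NOT_10", 1, 0))    # False
```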
By the way, was the flags saving and restoring in all types of CALL... and RET... clarified completely?
I vaguely remember a discussion a few weeks ago, it seemed that there were some corner cases not working in the most convenient or memorable way.
I disagree 100%, C and Z should last as long as the programmer wants them to last.
I think the intention was to improve the Spin2 interpreter and the dire consequences elsewhere were not foreseen.
Two cogs would be fine.
Invert the problem, and it's simple!
Cool beans.
https://drive.google.com/file/d/0B9NbgkdrupkHb0lPbm1FWklDc1k/view?usp=sharing
There's 256KB of hub RAM and smart pins on 63/62/33/32/7..0.
Here's what I found so far.
Using the code snippet below with debug interrupt (single step)
Here are the results for the SKIP version.
Code executes as expected.
Note: Currently when my debugger sees a SKIP/SKIPF/EXECF it shows the expected execution steps.
and here's the same sequence using a SKIPF instruction.
Code executes as expected, but the address information indicates an incorrect address sequence.
Address 0006 is not expected to appear in the above example.
This is a BIG step forward though (pardon the pun).
P.S. In the SKIP example above a means of knowing the next instruction's "SKIP" status would be useful.
i.e. "RDSKIP WC" or "RDSKIP myreg" where C or bit 0 of D is next SKIP bit.
Is that an artifact of the interrupt? Though it seems DRVH #$022 & DRVH #$02F correctly straddle 2 skipped lines?
I don't know why it does that, but I'm really surprised that it works, at all.
I'm thinking about this...
Here's another Prop123-A9 image:
https://drive.google.com/file/d/0B9NbgkdrupkHRzFVZkRfT0J4UkU/view?usp=sharing
I did make use of the new GETINT skip pattern bits, works nicely.
Just the few skip debug tests so far show that single-stepping is vital. ozpropdev, does the "SKIPNOP" also occur in real time?
A few more questions:
1. Does the latest test image support skip CALL/RET?
2. If yes to 1, are CALLs that cannot work disabled and in effect NOPs, or is it up to the programmer not to use them?
3. Bits 1 and 0 of LUT address? What's the thinking here?
I'm not at my P123-A9 setup at the moment. Will be back on air later tonight.
https://drive.google.com/file/d/0B9NbgkdrupkHRHVabWhWVnJkb3M/view?usp=sharing
I don't mind in this case, even if I see the value in having C and Z flags as long-term state.
That's why I commented, to understand where it's going.
I would not say "dire consequences".
If Chip did not intend the result, I'm sure he'll find a better design. He always did eventually.
Here's what I've found so far.
Using FPGA image -> Prop123_A9_Prop2_v19skip3.rbf
I added some GETINT support to the debugger to shed some more light on the operation.
In real time mode:
SKIP and SKIPF function correctly with predictable clock counts.
This applies to COG/LUT and HUB exec variants.
In single step mode:
The first step after a SKIP/SKIPF instruction always advances to PC+2, not PC+1.
Although PC+1 was missed by the single step, the PC+1 instruction is still executed or skipped as defined by the skip pattern.
In both SKIP and SKIPF the correct sequence of instructions is processed.
Here's an example of SKIP with the GETINT data included.
and here's the same code with SKIPF used.
In both variants the PC steps from address $0004 to $0006 (PC+2) on the first step after SKIP(F).
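The observation above can be captured in a toy Python trace: the debugger's first reported stop after the SKIP lands at PC+2, even though the PC+1 instruction was still processed under the hood.

```python
# Toy single-step trace matching the observed behaviour: the first stop
# after the SKIP at skip_pc reports skip_pc + 2, not skip_pc + 1.
def single_step_trace(skip_pc, steps):
    pcs, pc = [], skip_pc
    for _ in range(steps):
        pc += 2 if pc == skip_pc else 1   # hidden extra advance after SKIP
        pcs.append(pc)
    return pcs

print(single_step_trace(0x004, 3))   # [6, 7, 8]
```

This is only a description of the symptom, not the cause; it reproduces the $0004 to $0006 step seen in the logs.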
I'll try and run some more tests tomorrow and zip all of the source and results too.