In thinking about this, we DO want to keep the address-compare breakpoint, because that enables a certain cog to stop on what might be a lot of public-access hub exec code.
Yes, I was thinking you meant both, not one or other.
I'm thinking that maybe we need some external-pin debug breakpoint, just to wake up some cog without requiring another cog to do the job, in case they are all busy. What do you think?
How would you configure that? A debug register with 6 bits?
Could another bit specify a mode choice of Break or Hold? I'm thinking another use could be to simply pause the COG, allowing cross-COG pauses for both Debug and RUN uses.
These changes should open up debugging to where it needs to be. Now, I need to get some code written to quickly prove everything.
Here is what the debug ISR should be able to do, in order:
1) dump all COG and LUT registers
2) report PC, flags, and status data from GETINT
3) receive next command to single-step, run to address, or run until an async breakpoint gets asserted from another cog
4) restore all registers and resume execution
If it can do that, we'll have a nice basis for debugging.
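The four steps above can be sketched as a small host-side simulation. This is a toy model in Python, not PASM2 and not Chip's actual ISR: the `ToyCog` class, its field names, the register counts, and the command strings are all hypothetical, chosen only to show the order of operations. The status data from GETINT is not modeled.

```python
from dataclasses import dataclass, field

@dataclass
class ToyCog:
    """Hypothetical stand-in for a cog; sizes and names are illustrative only."""
    cog_ram: list = field(default_factory=lambda: [0] * 8)
    lut_ram: list = field(default_factory=lambda: [0] * 8)
    pc: int = 0
    flags: tuple = (0, 0)  # (C, Z)

def debug_isr(cog, next_command):
    """Walk the four steps in order and return what a host would see."""
    # 1) dump all COG and LUT registers (the snapshot doubles as the save area)
    saved = (list(cog.cog_ram), list(cog.lut_ram))
    # 2) report PC and flags (GETINT status data is not modeled here)
    report = {"pc": cog.pc, "c": cog.flags[0], "z": cog.flags[1],
              "cog": saved[0], "lut": saved[1]}
    # A real ISR would clobber some registers while it runs; model that here.
    cog.cog_ram[0] = 0xDEADBEEF
    # 3) receive the next command from the host
    command = next_command()
    if command not in ("step", "run_to_address", "run_until_async_break"):
        raise ValueError(f"unknown command: {command}")
    # 4) restore all registers, then resume
    cog.cog_ram[:] = saved[0]
    cog.lut_ram[:] = saved[1]
    return report, command
```

The point of the deliberate clobber in the middle is that step 4 only works if step 1 captured everything first, which is why the order matters.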
Chip
I already have a debugger running under V31 that includes the above and more.
It would only require updating for V32's changes to be up and running.
That might save you some of your valuable time.
I know you've got one running and it could save me time, but I want to know firsthand that it works like I think it's supposed to work. There are some subtleties about this thing that I want to be sure of. And I don't think I'll go to the extent you did. I just want to see that the mechanics are in proper order.
I've added a software BRK instruction ($00000001). BRK is like NOP ($00000000), but causes a debug interrupt when enabled by the debug scheme. Within the debug ISR, the return address will point to the next instruction after BRK. So, it's like a software-insertable breakpoint.
For address-match breakpoint and the BRK instruction, there were some bugs I fixed about not triggering when the address or BRK instruction is getting skipped within a SKIP sequence.
I really want to get this stuff straight, so that we don't have shortcomings later.
I'm compiling a silicon-equivalent Prop123-A9 image right now. It drives P0..P63, but doesn't control the LEDs, since it's the hardcore ASIC version of Verilog.
I know I've asked and you've told me before, but I'm not remembering... Are you using a Prop123-A9 board?
If so, I'll post this image and update the Google Doc, and maybe you can run it through its paces with your debugger?
Not really. Throughout this project, designing things together has brought us a lot of good features.
Honestly, doing it this way helps validate a lot of behavior that would take a lot to make work with all the features, COGS, in play.
We are close. There is time to get it golden. At this point, doing that makes all the sense in the world.
If I were Chip, I would not ship it without really thinking through stuff like this. We have had a ton of, "needs more debug" discussions. Interestingly, a bunch of us employ techniques that don't lean on debug.
Also interestingly, those of us who are advocates for this functionality have seen the world move to a more self-explanatory world. Learn by doing.
Most of our discussion centered on exposing the guts completely. How to do that varied too, but mostly centered on various ways and means out there already.
Now, at this stage, the need for robust and transparent debug, think learning tools, has revealed how to do this in ways that people can grok quickly and easily.
Good.
That's a design goal from early on. Make it robust, powerful, interactive, etc...
Recently, the topic of "how in the world will docs get done?" got a lot of thought. It's a valid concern.
I submit, if we do this well, the documentation tasks will change. Core, necessary functional docs will be needed. No change there. But, a document, video, example code, sample projects ready to go and learn from, will get right at the heart of the matter:
Helping people get to a safe place where they can learn how to learn, and where they can try things, get results, explore.
P1 had this property. Tons of us jumped on the early documentation, wrote code, shared, and made P1 do a lot of cool stuff long before documentation was arguably sufficient for that to happen.
P2 will also have it, and making dang sure we have the basics needed to make test beds, and environments that run, despite mistakes, and that communicate what is happening, will have a serious influence on early adoption.
We need that.
Over time, a ton will get produced. Early on, there will be users figuring out how things happen, why they happen, and what it all might mean.
Totally worth it.
People need to be able to jump in and play, and they won't have the benefit we have, which is all the context working together on this brings.
I also submit we will have some of the same discovery found in P1. Remember LMM, running WAITVID backwards, all the crazy counter tricks?
This chip is even more of a playground. More is needed for that reason as well as the clear trend toward learn by doing going on.
I don't know about you guys, but the last few years have been profound. Many people won't make the time investment needed to absorb pages of docs. Some will. More will do that eventually too. Just as we all did.
What they want is some success right away. Do something, get success, do another related thing, build on earlier success, wash rinse repeat.
Ever see the people live coding? Their chatter, plus watching them do something others find compelling, speaks volumes. And the best part is everyone sees a bit they need to see. Or they gain enough shared experience to pick something up and run with it.
Thinking ahead now makes sense. It's more than do we have the needed features and functions. It's how it all might flow.
Doing that is what made P1 the device it is. P1 + Prop Tool is pretty awesome, if a bit dated today. Blocky gets at that, and people are learning what they need and want to. Who cares how?
Very robust multimedia capabilities are part of this design. That is coupled with a lot of DAC/ADC and/or Smart Pin capability.
Roll that up, and it's gonna be possible to use a P2 to learn about developing P2.
Hi Chip
Since interrupts are ignored during REP looping, does that mean a BRK will not operate at all (or, better said, will be treated as a NOP) when placed within a REP block?
P.S. Along the same line of thought: if at some point I try to use SETBRK to interrogate a COG to see what it is doing, and the COG to be broken happens to be executing a REP loop, will the break not happen at all?
REP cannot be interrupted. To make it interruptible would require tons more flops and some complicated reentry mechanism. It's not worth it, just for debugging. REP exploits some hardware subtlety in how the cog RAM works. The hub exec version is not efficient, but exists for code compatibility. I wish it were more reasonable to make REP interruptible, but it's not.
On the other hand, REP functionality could be spoofed by a debugger which manually controls the process.
Neat ideas, Potatohead. I wish there were some ways to graphically convey what the mnemonics do. Assembly Language represents a complicated set of concepts which don't easily lend themselves to simple graphical representations.
Couldn't a BRK, placed within a REP block, be made to act as an "open door for SETBRK", being substituted for a BRANCH, thus prematurely exiting the REP loop, in the event another COG uses SETBRK to interrogate the state of the REP-running COG?
In fact, I was not thinking of reentering the REP loop, but of breaking out of the loop on the fly, under another COG's control, to see where it is (and what it is doing) at a certain moment in time.
Sure, I could move the REP loop to the HUB, and have another COG selectively inserting a branch into the block, to obtain the same effect.
Normally I'm happy/enthusiastic with the additions but, in this case, I'm with Cluso. Although maybe not for the same reason.
Soft debuggers have to be accommodated and they never achieve their aim of being invisible. Just not the right path, imho. I'd happily ditch the whole debugger support system, including the protected HubRAM.
printf() solves all.
That is how I have always worked, too.
I do see value, though, in making the innards very visible and accessible, mainly for the purpose of enabling learning.
A debugger lets you single-step code and check values at breakpoints. This enables quick learning so you can get to writing real-time code that does stuff a simulator can't.
A proper debugger is vital to many use cases. The fact that you don't have JTAG debugging will stop a lot of potential users from even considering the chip. Abandoning what little debugging features you have would be another mistake that costs users.
printf ABSOLUTELY does not solve all. In fact, it's almost worthless in a great deal of situations, and can cause more problems than it helps. It changes the code and memory state more than the current debugger stuff Chip has in there.
Everyone I have ever talked to who thinks all they need is printf has never used proper debugging facilities, and often has never worked on code more than a few thousand lines long (often not even more than a few hundred lines).
Yes, the debugger will be handy for learning, but it's way more valuable to the veteran coder working on large/complex projects.
My biggest complaint, by far, about the P1 is the lack of any real proper debugging ability. I have used the "attempts" that are out there, and they are very lacking compared to what should be there.
Nah, the only people who need debuggers are C++ programmers. The language they are trying to use has become so complex that they have no idea what their programs will do ahead of time and have to single step their way through it to find out.
I suppose you could just place a BRK in lieu of a REP and have the debugger fake the REP.
A good idea, but that only works for basic flow checking, as the speed difference will be so great as to make 'real time' no longer true.
The closest match with the least timing impact will likely be a DJNZ variant of REP, which IIRC adds one opcode time to the loop?
It would be smart to offer debugger users a couple of choices for 'REP debug' emulation.
That may be tolerable enough in many cases, at least to establish the code 'works as expected', and the final real-time check is a separate development step.
Yup, and any chip that targets the hard real time edge, cannot use 'printf' - when even sub 30c MCUs have decent debug included, you really DO need a serious debug action, or P2 will founder amongst the very customers needed to ensure its success.
Great, a software BRK will be quite important.
Break within a SKIP, where the break-line is actually skipped, could generate some debate.
If the code-line is not executed, some might say that should not generate a break, until that line is executed.
Is that how BRK and SKIP interact now ?
If a BRK sits at a location that is getting skipped, it doesn't execute, but gets skipped, as expected. Same with the address-match breakpoint.
I got rid of the dedicated BRK ($00000001) and renamed SETBRK to BRK. This way, when BRK is used outside of the debug ISR (in user code) it can generate a breakpoint AND pass an 8-bit value to the debug ISR (BRK #/D, D[7:0] is passed). This should allow for some flexibility in how debugging can be implemented.
Everything seems to be working fine. BRK breakpoints and address-match breakpoints always stop AFTER the instruction executes, landing at the next instruction.
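The three behaviors described here (a skipped BRK does not fire, BRK passes the low 8 bits of its operand, and the break lands on the next instruction) can be checked against a toy model. This is a Python sketch, not the hardware logic: the tuple-based program format, the `run_until_break` name, and the assumption that bit i of the skip mask covers instruction i are all illustrative.

```python
def run_until_break(program, skip_mask=0, debug_enabled=True):
    """Toy model: execute until an unskipped BRK fires.

    program   : list of (mnemonic, operand) tuples; only "BRK" is special
    skip_mask : bit i set means instruction i is skipped (assumed bit order)
    Returns (return_pc, code) for the first break taken, or None.
    """
    for pc, (mnemonic, operand) in enumerate(program):
        if (skip_mask >> pc) & 1:
            continue  # a skipped BRK does not execute, so it cannot break
        if mnemonic == "BRK" and debug_enabled:
            # The break lands AFTER the BRK: the return address is pc + 1,
            # and only the low 8 bits of the operand reach the debug ISR.
            return pc + 1, operand & 0xFF
    return None
```

For example, with a program of `[("NOP", 0), ("BRK", 0x1A5), ("NOP", 0), ("BRK", 7)]`, the first break returns address 2 with code `0xA5`; masking out instruction 1 moves the break to the second BRK.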
It's neat to single-step SKIPF code and see the PC advance as expected and update all 32 SKIP bits, which you can read back now. This is going to look really nice in a debugger that shows what upcoming instructions are going to get skipped, maybe dimming their backgrounds, or something.
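A debugger view along those lines could start from something like the following Python sketch. It assumes bit 0 of the 32-bit skip pattern applies to the instruction at the current PC, that a 1 bit means "skip", and that addresses advance by 1 (cog exec); the function names and the `~` marker standing in for a dimmed background are hypothetical.

```python
def preview_skips(skip_bits, pc, count=8):
    """Return (address, will_skip) pairs for the next `count` instructions.

    Assumes bit 0 of skip_bits applies to the instruction at pc and that a
    1 bit means "skip"; verify both against real hardware before relying on it.
    """
    skip_bits &= 0xFFFFFFFF  # the pattern is 32 bits wide
    return [(pc + i, bool((skip_bits >> i) & 1)) for i in range(min(count, 32))]

def render(rows):
    """Mark skipped lines with '~' as a text stand-in for a dimmed background."""
    return [f"{'~' if skipped else ' '} {addr:05X}" for addr, skipped in rows]
```

Calling `render(preview_skips(0b0101, 0x100, count=4))` marks addresses `00100` and `00102` as skipped and leaves the other two plain.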
I think I've got all the logic right. I got way into that part today, revisiting why I wrote it the way I did and making diagrams in the Verilog code to show state sequences and what happens when. I noticed the equivalent of a few 'comb-overs' which I cleaned up with better code and comments.
Right now, I'm compiling an 8-cog 512KB silicon-equivalent BeMicro-A9 image for Cluso99 and Peter to try out. I'll need to update the Google Doc to cover the changes in the debug scheme.
I am thinking that having a background debugger that can always be put to work without any special setup or consideration is going to be quite nice. I wish I could get REP(eat) more debug-friendly. I'll look into it, but it will be a bear to change.
I don't have the Google Doc modified, yet, but I'll attach a screenshot of the Verilog code, which should tell you everything you need to know about the debugging differences.
Remember: SETBRK is renamed to BRK. That is the only mnemonic change.
GETINT D/# 'generate async break in cog D/#, must be enabled in target cog
GETINT D WC/WZ/WCZ 'read various debug-related status data into D
I also attached the .spin2 program that I was using to test out the debug features.
I'll get the proper documentation done tomorrow, hopefully, but this should apprise you of everything. Just a different approach than usual.
Comments
Maybe you have solved it, although it appears you intended to do a joke, at least to me.
Once enabled under software control, a pulse into the RESn pin, within some specified timing constraints, could fire it:
Shorter than the minimum = glitch;
Between limits = fire breakpoint;
Longer than maximum = true reset.
Naturally, a reset signal, once accepted as being true, clears all configs, so you must then restart the whole thing again.
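The three-way decision could be expressed like this. It is a Python sketch of the classification only; the limits `min_us` and `max_us` are hypothetical parameters, since the post gives no actual timing figures, and treating both limits as inclusive is an assumption.

```python
def classify_pulse(width_us, min_us, max_us):
    """Classify a pulse on the reset pin by its width.

    min_us and max_us are hypothetical limits; the post gives no numbers.
    """
    if min_us >= max_us:
        raise ValueError("min_us must be less than max_us")
    if width_us < min_us:
        return "glitch"        # too short: ignore
    if width_us <= max_us:
        return "breakpoint"    # within limits: fire the debug breakpoint
    return "reset"             # too long: treat as a true reset
```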
Only a thought.
Henrique
P.S. My bad. Embedded memory failure (RSTn x RESn)! -_-
Some small MCUs that use RST as the debug Link, do exactly that - the narrow signals open the debug channel.
We're both using BeMicro CV-A9 boards
Good.
Good point. Yes, it is educational for sure. Maybe that's really what everyone is wanting anyway.
https://drive.google.com/file/d/1cz4baqVB9tIKYa3yVI0hWTP3qcI997CT/view?usp=sharing
The new PNut.exe is attached.
Thanks.