Looks Ok Chip except that a GETINT D doesn't seem to set D[31] when an active skip pattern is present now.
SKIP PATTERN (from GETINT D[31:22]) = 0_000011010
That's right. I forgot to mention that now D[31] is just the next SKIP bit up from D[30]. I got rid of the circuit that tracked whether any SKIP bits were left, because that thing had become a critical path. So, we can see only the next 10 SKIP bits.
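As a rough illustration of the field described above, here is a minimal Python sketch that pulls the 10 visible skip bits out of a GETINT result. The function name is made up for this example, and it assumes D[22] holds the bit for the next instruction, which the thread implies but does not state outright.

```python
def skip_bits_from_getint(d):
    """Return the 10 skip bits reported in D[31:22] of a GETINT result.

    Assumption: D[22] is the skip bit for the next instruction and D[31]
    is the tenth one ahead, so the field can be right-justified as-is.
    """
    return (d >> 22) & 0x3FF


# The pattern quoted above, 0_000011010, placed in D[31:22].
d = 0b0000011010 << 22
visible = skip_bits_from_getint(d)
```

Any skip bits beyond these 10 are simply not visible through GETINT.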
Using SETQ or SETQ2 with Q[9]=1 would seem to be the sensible way to enable CZ writing in XBYTE. This is a bit of a dummy's question: if the D in SETQ/SETQ2 is a 9-bit immediate, are the top 23 bits written to Q all zero, ensuring that XBYTE will not modify CZ?
Is CZ writing in XBYTE optional now, or is this yet to be done?
From a previous reply, a RET to $1F8-$1FF always starts a new XBYTE, even if there are '1' bits remaining in the skip pattern. Sometimes this early termination must occur inside a called subroutine. Assuming no nesting, is there any reason why the following shouldn't work?
' skipsub is routine called from SKIPF sequence
skipsub         ...
' discard return address and start next XBYTE
        _ret_   pop     temp
I've been thinking, too, that SETQ D[9] could convey whether or not the C/Z bits are written with the two LSBs of the index. That's the last thing I need to implement before I can make another release.
To answer your XBYTE question: As long as the top of the stack was $1F8..$1FF, a new XBYTE would begin on a _RET_/RET, replacing any current skip pattern. If you were to do a '_RET_ POP temp', with the $1F8..$1FF on top of the stack, it would begin an XBYTE, but also pop the stack, causing any future _RET_/RET to return to the new top stack address, unless it was also $1F8..$1FF, in which case another XBYTE would commence.
Roger Wilco Chip!
Still running some tests, all looks good so far.
Ok. Good. Thanks.
I think it's a really good thing that we've got interrupts operating with SKIP instructions. This is something I wouldn't have thought to pursue, but with your guys' encouragement, it turned out not to be too big of a deal. I feel a lot better about the reduction in interrupt latency this affords, not to mention the single-stepping aspect.
Yes, skipping is very good now. Supporting CALLs was the first step and allowing interrupts was the obvious next step. Who suggested both of these, I wonder? Anyway, thanks Chip for getting them to work.
The GETINT change saves an instruction (more on that later), and seeing one more skip bit helps, but being able to see only the next 10 SKIP bits is the issue here. There are three ways that a routine called from a skip sequence could handle the remaining skip bits:
(1) Leave them alone, so that they take effect after the routine (normal option).
(2) Overwrite them, with a SKIPF say, in which case the return address would probably be popped and discarded.
(3) Save them, execute a new skip sequence with SKIPF, then restore the original skip bits with another SKIPF before returning.
I've found uses for all three options and I'm sure others will, too. Skipping is so good that nesting as in (3) should be supported, but it will only work sometimes because only 10 skip bits can be saved. Currently the code looks something like this:
' skipsub is routine called from SKIPF/EXECF/XBYTE sequence
skipsub         getint  oldskip         ' get next 10 skip bits [31:22]
                shr     oldskip,#22     ' shift bits
                                        ' (masking bit 31 not needed now)
                skipf   newskip         ' start new skip sequence
                ...
        _ret_   skipf   oldskip         ' restore skip bits and return
                                        ' *** not guaranteed to work ***
Ideally it would be:
skipsub         rdskip  oldskip         ' get all remaining skip bits
                skipf   newskip
                ...
        _ret_   skipf   oldskip         ' restore skip bits and return
                                        ' *** always works ***
This does mean having a new RDSKIP instruction but I can't see any way to avoid that. GETINT would revert to its original behaviour, with the high 10 bits zero. Being able to read any of the skip bits is useful but GETINT is rather half-baked. To help offset the new RDSKIP logic, a 22-bit mux could be eliminated elsewhere.
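To make the "only works sometimes" point concrete, here is a small Python model of the save step. It is only a sketch: the remaining skip pattern is modelled as a plain integer with bit 0 belonging to the next instruction, and the function names are invented for this example.

```python
GETINT_SKIP_BITS = 10  # number of skip bits visible through GETINT


def save_via_getint(remaining):
    """Model GETINT followed by SHR #22: only the next 10 skip bits survive."""
    return remaining & ((1 << GETINT_SKIP_BITS) - 1)


def restore_is_exact(remaining):
    """True when restoring the saved bits with SKIPF reproduces the pattern."""
    return save_via_getint(remaining) == remaining


short_pattern = 0b0000011010             # fits in 10 bits, survives the round trip
long_pattern = (1 << 12) | 0b0000011010  # has a skip bit beyond the visible window
```

Under this model the save and restore pair is lossless only when no '1' bits remain above the tenth position, which is why a full-width read such as the proposed RDSKIP would be needed to make it always work.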
The beauty of the XBYTE mechanism is small, fast code snippets.
When you start adding nested SKIPF/CALLs to the mix, it seems to defeat its purpose.
The overhead of the extra baggage (CALL, GETINT, SHR, SKIPF, RET), roughly 12 clocks, kills off VM speed.
Straight-line code snippets in hub would probably be a better alternative and easier to follow/debug.
I agree, somewhat.
There were instances in the interpreter where a single, but variable, instruction would read or write a register/LUT/hub variable within a skip sequence, via ALTI. To introduce bit fields into the mix, a whole separate subroutine was required. The CALL allowance worked beautifully for this. I can now call short routines which handle the bit field reading or writing, and involve them as if they were single instructions within the skip sequence. It makes things nice as can be.
I was thinking that this skip concept, to be fully realized within an architecture, should be integrated such that when a CALL takes place, the current skip pattern should be pushed onto a hardware stack, along with the return address, and a new skip pattern should be started. That way, you have integrated skipping at every subroutine level - or, at least, the provision for it. It could make for some dense coding.
I just looked into all this. The problem is that there is more state information to save and restore than just 32 bits of skip data. There is the call depth counter and the skip-type flag. These things would require multiple instructions to perform read and set. I think it's not a good idea to go there. What we have now can be observed, somewhat, but to make it reorganizable within interrupts would take a lot more.
I think being able to read all the remaining skip bits would be handy, if nothing else. I forgot that SKIPF clears the call counter.
Is this the end of the road for skip nesting or is some sort of hardware stack solution still possible? There are only 22 skip bits for EXECF/XBYTE, probably the most important application, so there would be room for other stuff in one long.
If three bits are needed for the call depth counter and one for the skip-type flag, that would leave 28 skip bits in one long, more than is possible for EXECF/XBYTE and a small but acceptable reduction for SKIP/SKIPF.
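The bit budget above can be sketched in Python as a pack/unpack pair. The field positions are purely hypothetical (the thread only gives the widths: 28 skip bits, a 3-bit call depth counter and a 1-bit skip-type flag), and the function names are made up for illustration.

```python
SKIP_BITS = 28        # bits 27..0: remaining skip pattern (assumed position)
CALL_DEPTH_BITS = 3   # bits 30..28: call depth counter (assumed position)
SKIP_TYPE_SHIFT = 31  # bit 31: skip-type flag (assumed position)


def pack_skip_state(skip_bits, call_depth, skip_type):
    """Pack the three pieces of skip state into one 32-bit long."""
    if not 0 <= skip_bits < (1 << SKIP_BITS):
        raise ValueError("skip pattern does not fit in 28 bits")
    if not 0 <= call_depth < (1 << CALL_DEPTH_BITS):
        raise ValueError("call depth does not fit in 3 bits")
    if skip_type not in (0, 1):
        raise ValueError("skip type must be 0 or 1")
    return (skip_type << SKIP_TYPE_SHIFT) | (call_depth << SKIP_BITS) | skip_bits


def unpack_skip_state(word):
    """Split a packed long back into (skip_bits, call_depth, skip_type)."""
    skip_bits = word & ((1 << SKIP_BITS) - 1)
    call_depth = (word >> SKIP_BITS) & ((1 << CALL_DEPTH_BITS) - 1)
    skip_type = (word >> SKIP_TYPE_SHIFT) & 1
    return skip_bits, call_depth, skip_type


# A full 22-bit EXECF/XBYTE pattern still fits with room to spare.
packed = pack_skip_state(0x3FFFFF, 5, 1)
```

This only shows that the fields fit in one long; it says nothing about the extra instructions or logic needed to read and set the real hardware state.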
True, but to grow the hardware stack width from 20 to 32 bits would increase the number of flops by 12 flops * 8 levels * 16 cogs = 1536. Then, we'd want to be able to push and pop 32 bits of data, plus those two flag bits (how to handle them?), and so on. A lot of these things just open up new frontiers of optimization. It may never end. I think even reporting a single bit via GETINT to show whether just the next instruction will be skipped, or not, may be 90% of the value of reporting anything. My gut feeling is that XBYTE is a special add-on in this design and doesn't warrant full expansion. If we were starting anew, then something like this could be expanded all the way. For now, it does everything that seems reasonable, I think.
Your and Ozpropdev's input have already made this feature pretty awesome. My only concern is whether the world is too frenetic, with too short an attention span these days, to be able to engage all these niceties.
I'd agree XBYTE is a nifty but niche feature, important for hand-crafted packing of code into a COG for things like byte-code engines.
Having it work with interrupts and calls seems all the refinement needed.
Better to now focus on broad features, like confirming USB, confirming HyperRAM refresh, and a clean-up of SmartPin modes...
I've been watching the development of SKIP, SKIPF, EXECF and XBYTE, and they certainly provide unique features. I'm a little concerned whether the benefits from these instructions will be worth the real-estate that they use. Hopefully, they won't use much silicon. My feeling is that the P2 will mostly be used to execute from hub RAM with small bits of time-critical code running from cog/lut RAM.
Chip, have you done a power analysis lately on the P2? I wonder how the current design compares with the P2-Hot.
Skip patterns shift left and assembler must add trailing zeroes to short patterns. 99.9% or more of P2 programmers won't mind because they won't program the P2 until the chip actually exists.
I think this change would increase ease of use massively. English is read or written from left to right and top to bottom and I for one find it hard to work backwards. Here's an excerpt from the doc:
Once he or she knows what 0 and 1 mean, I bet that almost every newcomer would think that the 1st, 3rd and 4th instructions will be skipped. It's so unnatural to read from right to left that people will forget the reverse order quite often and bugs will result. (It's a good job single-stepping works!) Nobody would forget a left-to-right order.
This is not about assembler or hardware design; fundamentally it's about usability. Maybe some non-programmers should be consulted for their views. I'm just worried, after all the effort that has been put into skipping, that it will be more difficult to use than it should be.
Chip, have you done a power analysis lately on the P2? I wonder how the current design compares with the P2-Hot.
Please don't ask
If we didn't know P2HOT was hot, we would have been using silicon for a few years now. Sure there would be restrictions, but most users would never run enough to actually reach "HOT".
IMHO, the current P2 has probably been "HOT" since the egg-beater with hub-exec was added. When you run worst case figures, accessing the hub ram block on every clock for a full 16 Cogs running flat out with hub-exec, I expect this will be "HOT". But realistically that's going to be rare, if ever, so it might require a warning.
Time will tell, but hopefully it won't stop the P2 from going to silicon.
I cannot see any problems with the skip field starting with bit 0, which is, after all, the least significant bit. Often PASM loops are formed with counters or flags, and they mostly start from the LSB end.
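For readers weighing the two reading orders, this short Python sketch decodes a skip pattern the way the hardware consumes it, starting at bit 0. The function name is invented for the example; the only assumption is that bit 0 applies to the first instruction after the SKIP and that a '1' means "skip".

```python
def skipped_positions(pattern):
    """Return the 1-based positions of the instructions a pattern skips.

    Bit 0 applies to the first instruction after the SKIP, bit 1 to the
    second, and so on; a '1' bit means the instruction is skipped.
    """
    positions = []
    index = 1
    while pattern:
        if pattern & 1:
            positions.append(index)
        pattern >>= 1
        index += 1
    return positions


# %10110 read LSB first: skip the 2nd, 3rd and 5th instructions.
example = skipped_positions(0b10110)
```

Reading the same literal left to right would give a different answer unless the pattern happens to be symmetric, which is the crux of the disagreement.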
I'm not really seeing an issue here?
This is just an MSB/LSB convention, and that's been around since dot.
PASM has a naturally limited-range immediate, so I'd expect all immediate opcodes to fill from the right (i.e. fill LSB first).
As to the order of the skip bits within any string, more serious support for XBYTE would automate this step anyway.
I.e. a smarter assembler would avoid tedious and error-prone manual bit-editing entirely, and allow an automatic scan of some annotated column to extract the mask.
Adding/removing lines would then self-adjust, and the (new) user never worries about MSB/LSB conventions at all.
If we take Chip's coding approach, where he manually column-collects, that exact source can be automated:
sha_mod_skiptag                        '        m n a b c d e f g h i   Column tag
sha_mod                mov   y,x       '        x x a b c d e | | h i   a: >>
sgn_mod                not   y,x       '        x x | | | | | f g | |   b: <<
                       alti  rd        'rd      m n | | | | | | | | |   c: SAR
                       popa  x         'rd,op   m n a b c d e f g h i   d: ROR
                       rev   x         'REV     x x | | | | | f | | |   e: ROL
                       shr   x,y       '>>      x x a | | | | f | | |   f: REV
                       shl   x,y       '<<      x x | b | | | | g | |   g: SIGNX
                       sar   x,y       'SAR     x x | | c | | | g | |   h: +
                       ror   x,y       'ROR     x x | | | d | | | | |   i: -
                       rol   x,y       'ROL     x x | | | | e | | | |
                       add   x,y       '+       x x | | | | | | | h |
                       sub   x,y       '-       x x | | | | | | | | i
                       alti  wr        'wr      m n | | | | | | | | |
                       ret             'wr,op   m n a b c d e f g h i   m: var ?= exp (isolated)
                 _ret_ popa  x         'iso     m |                     n: var ?= exp (push)
                 _ret_ zerox x,sz      'push      n                     x: use a..i
' this can be automated, like:
        SKIP    #SkipCol(sha_mod_skiptag,"a")   '>>    collect skip sequence for Column 'a' in source code (Col 53 here)
        SKIP    #SkipCol(sha_mod_skiptag,"g")   'SIGNX collect skip sequence for Column 'g' in source code (col 65 here)
Here the parser is given a label tag and starts searching after the comment character for the first matching column character. It then scans vertically for '|' or the tag character and builds the mask.
It exits on a space, and reports an error on any character that is neither '|' nor the tag character, or if the mask is too long for the opcode.
I'm not clear on what x, m and n are doing, but it looks like a multiple-column rule would support that.
It could use x as look-right, but it is not easy to sense when to stop looking right, so it is probably easiest to allow dual tags and simply OR from there (x becomes an alias for |).
Either of:
        SKIP    #SkipCol(sha_mod_skiptag,"m","a")       ' merge m & a
        SKIP    #SkipCol(sha_mod_skiptag,"g") + SkipCol(sha_mod_skiptag,"m")    ' merge m and g, no x needed
Minor: instead of 'x', maybe '>' is more clearly a look-right hint, and an extraction that finds '>' could raise an error if no explicit second column is given.
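The single-column scan described in this post could be prototyped along these lines in Python. This is only a sketch of the idea, not an existing assembler feature: `skip_col` and the sample listing are made up, and it assumes that '|' means "skip" (bit = 1), the tag character means "execute" (bit = 0), and the first row below the tag line maps to bit 0. The multi-column m/n/x merging is left out.

```python
def skip_col(lines, tag_label, tag_char, max_bits=32):
    """Build a skip mask from one annotated comment column (hypothetical helper).

    Assumptions: '|' marks a skipped instruction (bit = 1), the tag character
    marks an executed one (bit = 0), and the first row maps to bit 0.
    """
    # Locate the header line whose first word is the tag label.
    start = next(
        (i for i, line in enumerate(lines) if line.split()[:1] == [tag_label]),
        None,
    )
    if start is None:
        raise ValueError(f"label {tag_label!r} not found")
    header = lines[start]
    # Search only after the comment character for the tag's column.
    column = header.index(tag_char, header.index("'") + 1)

    mask = 0
    bit = 0
    for line in lines[start + 1:]:
        ch = line[column] if column < len(line) else " "
        if ch == " ":
            break  # a blank in the column ends the scan
        if bit >= max_bits:
            raise ValueError("mask too long for opcode")
        if ch == "|":
            mask |= 1 << bit
        elif ch != tag_char:
            raise ValueError(f"unexpected {ch!r} in column {tag_char!r}")
        bit += 1
    return mask


# Made-up two-column listing, laid out like the annotated source above.
LISTING = [
    "op_tag          '      a b",
    "op_a    shr x,y '      a |",
    "op_b    shl x,y '      | b",
    "        add x,y '      a b",
    "        ret     '      a b",
    "",
]

mask_a = skip_col(LISTING, "op_tag", "a")  # skips only the shl row
mask_b = skip_col(LISTING, "op_tag", "b")  # skips only the shr row
```

Because the mask is rebuilt from the listing each time, inserting or deleting an instruction row changes the pattern automatically, which is the main attraction of the proposal.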
LSB-first is what the hardware design dictates but MSB-first would be easier to program. This is certainly an issue for me because I have to create 512 skip patterns and I don't fancy doing them all back-to-front!
Luckily I have found a solution that would make me very happy and perhaps other people too. It's so simple, it's beautiful. The design stays the same and a small enhancement to the assembler is all that is needed. I really hope there is no syntactical reason why it could not be implemented.
Assembler change:
The binary symbol % can be a suffix as well as a prefix. When % is a suffix the binary string is reversed and must be unreversed by the assembler. In both cases the MSB is next to %.
The following simple example
skip #%10110 'skip 2nd, 3rd, 5th instructions
could be written as
skip #01101% 'skip 2nd, 3rd, 5th instructions
And this XBYTE example
' Bytecode routines
'
r4              rfbyte  pa              'get byte offset  or
                rfword  pa              'get word offset  or
                rflong  pa              'get long offset
                add     pb,pa           'add offset       or
                sub     pb,pa           'sub offset
        _ret_   rdfast  #0,pb           'init fifo read at new address
'
' Bytecode EXECF table that gets moved into lut
'
bytetable       long    r4 | %0_10_110 << 10    'forward byte branch
                long    r4 | %0_10_101 << 10    'forward word branch
                long    r4 | %0_10_011 << 10    'forward long branch
                long    r4 | %0_01_110 << 10    'reverse byte branch
                long    r4 | %0_01_101 << 10    'reverse word branch
                long    r4 | %0_01_011 << 10    'reverse long branch
could also be written as
'
' Bytecode routines
'
r4              rfbyte  pa              'get byte offset  or
                rfword  pa              'get word offset  or
                rflong  pa              'get long offset
                add     pb,pa           'add offset       or
                sub     pb,pa           'sub offset
        _ret_   rdfast  #0,pb           'init fifo read at new address
'
' Bytecode EXECF table that gets moved into lut
'
bytetable       long    r4 | 011_01_0% << 10    'forward byte branch
                long    r4 | 101_01_0% << 10    'forward word branch
                long    r4 | 110_01_0% << 10    'forward long branch
                long    r4 | 011_10_0% << 10    'reverse byte branch
                long    r4 | 101_10_0% << 10    'reverse word branch
                long    r4 | 110_10_0% << 10    'reverse long branch
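A quick way to check that the two notations really describe the same values is to model the proposed rule in Python. This is a sketch of the suggested assembler behaviour only, not of any existing tool, and `parse_binary` is a name invented for the example.

```python
def parse_binary(token):
    """Parse a binary literal written with '%' as a prefix or as a suffix.

    Prefix form (%10110): digits are MSB first, as usual.
    Suffix form (01101%): digits are LSB first, so they are reversed
    before conversion. Underscores are digit separators in both forms.
    """
    text = token.replace("_", "")
    if text.startswith("%"):
        digits = text[1:]
    elif text.endswith("%"):
        digits = text[:-1][::-1]
    else:
        raise ValueError(f"not a binary literal: {token!r}")
    if not digits or set(digits) - {"0", "1"}:
        raise ValueError(f"not a binary literal: {token!r}")
    return int(digits, 2)


# The simple SKIP example from the post, in both notations.
same_value = parse_binary("%10110") == parse_binary("01101%")
```

In both notations the MSB ends up next to the '%', so the value is unchanged and only the order in which the programmer reads the bits differs.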
If you really need 512 skip patterns, it sounds like a script is needed to manage that, so you never enter binary-strings...
Look at fasmg, it is an Assembler with very powerful scripting, that can do almost anything...
The broad idea is good, but maybe this is too subtle, and not easy to read for new users.
E.g. I'm already used to assemblers that allow this
#0010_0010b for binary strings,
i.e. a trailing bit indicator syntax is already common, so the variant #0010_0010% looks very much the same at a glance, and it is not clear that a very special LSB-first order is used.
Better might be an explicit string reversal operator in the assembler, that allows either boolean string.
Do you also want hex strings to be LSB first?
TonyB_, LSB first makes more sense to me than MSB first. I see no confusion in using LSB first. I find your suggestion for using % as a suffix confusing. Rather than creating a syntax that may be confusing to others it might be better to have a compile-time function, such as REVERSE() that would generate a bit-reversed version of a constant. So if you don't like writing "skip #%10110" you could write "skip #REVERSE(%01101)" instead.
Comments
FPGA Image "Prop123_A9_Prop2_v19skip6.rbf"
Single step of SKIP,SKIPF,EXECF and XBYTE are all working as expected.
Thanks a bunch for doing that.
There is one final thing I'd like to say about skipping. I mentioned yesterday that a 22-bit mux could be saved and it's in EXECF.
Before:
After:
Exactly. SKIPs must begin with the LSB for this reason.
Users could choose whichever option they prefer.
Just do something like this if you want to use reverse patterns
Edit: The new "MUXQ" could be used here too.
SKIPF #%011010110 >< 9
I just realized that in that example it wouldn't matter, since the pattern reads the same in both directions.
I had to look twice. :-), not the best example.
But it solves the original question of writing the bits the other way around for readability...
Enjoy!
Mike
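For anyone who wants to check reversed patterns off-line, the effect of a fixed-width reversal such as `>< 9` can be modelled in Python. This sketch assumes the operator reverses the low N bits and clears the rest, as the Spin bitwise-reverse operator does; the function name is made up for the example.

```python
def reverse_low_bits(value, width):
    """Reverse the low `width` bits of value; higher bits are discarded."""
    result = 0
    for _ in range(width):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


# The 9-bit pattern from the example is symmetric, so reversal changes nothing.
palindrome = reverse_low_bits(0b011010110, 9)
# An asymmetric 5-bit pattern does change: %01101 becomes %10110.
reversed_5 = reverse_low_bits(0b01101, 5)
```

The explicit width matters: reversing %01101 over 5 bits and over 9 bits gives different results, so any REVERSE-style helper needs to know how many bits the pattern occupies.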
Honestly, I use Excel for this kind of stuff all the time.
Then there is experience. After a few, your brain will switch and you won't think about it, just do it.
My .02