This performs the save of both C and Z by shifting them in while preserving any previous saves.
At the same time, you can also optionally set C and/or Z from the lost bits at the opposite end of D.
So effectively you can save C and Z at one end of D while setting C and Z from the other end.
Alternatively, you can just save C and Z at either end of D while preserving a previously saved C and Z, provided it is on the same end.
Postedit: Best to keep CZ in the same orientation as CALL saves them.
Chip,
Wouldn't this be better than the SETCZ that we have now, or what has been suggested?
This achieves almost all suggestions in two instructions and retires one instruction.
It's logical and consistent.
Does it add much logic?
Also did you read my comments about CALLx D {wc,wz} vs CALLx #A ?
Yes, thanks for the reminder about the CALLx D dilemma. It takes me several minutes just to remember what the problem is. We need to come up with a policy for those instructions and then implement it into the assembler.
Could you make a simple text formula to show what SHLCZ and SHRCZ do, exactly? I assume something like this, but I'm not sure:
For SHLCZ D, D = {D[29:0], C, Z} and new {C,Z} are old D[31:30].
For SHRCZ D, D = {C,Z,D[31:2]} and new {C,Z} are old D[1:0].
SHIFT is probably a poor name choice, as these are really just extended RCR,RCL opcodes.
RCR/RCL make a 33-bit string with just Carry; RCZR/RCZL make a 34-bit string with both C and Z.
Again, here {wc} makes little sense, as clearly you always want to update C in this opcode.
That bit in the opcode can add Z, giving:
For RCZL D, D = {D[29:0], C, Z} and new {C,Z} are old D[31:30].
For RCZR D, D = {C,Z,D[31:2]} and new {C,Z} are old D[1:0].
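The two formulas above can be sketched as a small Python model. This is only an illustration of the text formulas, not the actual hardware logic; the function names `rczl` and `rczr` and the tuple return shape are assumptions made for the example.

```python
MASK32 = 0xFFFFFFFF

def rczl(d, c, z):
    """Model RCZL: D = {D[29:0], C, Z}; new {C, Z} = old D[31:30]."""
    new_c = (d >> 31) & 1
    new_z = (d >> 30) & 1
    new_d = ((d << 2) | (c << 1) | z) & MASK32
    return new_d, new_c, new_z

def rczr(d, c, z):
    """Model RCZR: D = {C, Z, D[31:2]}; new {C, Z} = old D[1:0]."""
    new_c = (d >> 1) & 1
    new_z = d & 1
    new_d = (c << 31) | (z << 30) | (d >> 2)
    return new_d, new_c, new_z

# Push two flag pairs with RCZL, then pop them in reverse order with RCZR.
d, c, z = rczl(0, 1, 0)   # save C=1, Z=0
d, c, z = rczl(d, 0, 1)   # save C=0, Z=1 on top of it
d, c1, z1 = rczr(d, 0, 0) # pops C=0, Z=1
d, c2, z2 = rczr(d, 0, 0) # pops C=1, Z=0
```

Because the flags and D together behave as one 34-bit string, an RCZL followed by an RCZR (feeding the returned flags back in) restores both D and the flags.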
The bit extractor is a tall order because it involves a 32-bit shifter, which is large. Also, it would be kind of nice to have the complement, too, where you could insert some number of S' LSBs at an offset into D. If we had some pre-settable register that contained the shift metrics, then it would be neat to have one instruction to extract the field from within S into D (LSB-justified) and another instruction to insert the field from S' LSBs into D. We'd need two 'D,S' slots for that (maybe SETFLD/GETFLD).
I'm not surprised that the extractor is a tall order. Something to think about for P3 I guess
So, your instructions are super easy to implement, of course, but what about SETLE and SETGT (less-than or equal, or greater-than), as well? Those would take all six slots, then.
Well, you can always reverse the order of the preceding compare (using CMPR instead of CMP) to turn SETNC and SETC into SETLE and SETGT (a <= b is the same as NOT (b<a)). So SETLE and SETGT are definitely low priority for a compiler, although they might be useful in PASM. If you do create them then we'd probably want some option (wc? or a global setting?) for all of the SETs to output -1 instead of 1 for true; that way they'd be useful for languages like Spin that use -1 as the truth value. That's only really an issue for SETLE and SETGT, since SETC with a -1 output can be achieved by MUXC and a predefined register with -1 in it.
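A quick way to convince yourself of the compare-reversal trick is to model it. The sketch below assumes that CMP D,S leaves C set when D < S (unsigned borrow) and that CMPR simply swaps the operands; both helper functions are hypothetical stand-ins for the instructions, not their real implementation.

```python
def cmp_carry(d, s):
    """Assumed CMP model: C = 1 when D < S (unsigned borrow)."""
    return 1 if (d & 0xFFFFFFFF) < (s & 0xFFFFFFFF) else 0

def cmpr_carry(d, s):
    """Assumed CMPR model: operands reversed, so C = 1 when S < D."""
    return cmp_carry(s, d)

def set_le(a, b):
    # CMPR a,b then SETNC: NOT (b < a) is the same as a <= b
    return 1 - cmpr_carry(a, b)

def set_gt(a, b):
    # CMPR a,b then SETC: (b < a) is the same as a > b
    return cmpr_carry(a, b)

# Exhaustive check over a small range of unsigned values.
for a in range(4):
    for b in range(4):
        assert set_le(a, b) == int(a <= b)
        assert set_gt(a, b) == int(a > b)
```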
For SETCZ, how about if C/Z were stored in D[2:1] and the new C/Z were extracted from D[1:0]?
I guess my preferences are for them to be in the same place they get extracted from, or else in the upper bits (like the SHIFT CZ instruction suggested above). To be honest I don't have an immediate need for these, just thinking that if we have a SETCZ we should have a corresponding GETCZ, especially since the SETCZ operation is actually easier to implement for the user (1 shift instruction) than GETCZ (two BITS/MUX instructions). OTOH putting them anywhere other than where they're taken from means we need to do a shift to restore them, and we end up with two instructions anyway, so maybe it doesn't matter. In fact I guess we could argue that SETCZ is redundant and should just be removed to simplify things.
I'd rather not use WC in the MUXxx instructions to change their behavior. The rules for WC and WZ are very consistent, still. I think, anyway.
Fair enough. Consistency is important, and I guess the fewer changes at this late stage the better.
Could you make a simple text formula to show what SHLCZ and SHRCZ do, exactly? I assume something like this, but I'm not sure:
For SHLCZ D, D = {D[29:0], C, Z} and new {C,Z} are old D[31:30].
For SHRCZ D, D = {C,Z,D[31:2]} and new {C,Z} are old D[1:0].
Is that correct?
Yes, precisely that Chip
Then we don't need SETCZ.
re CALLx D
I cannot see any reason that CALLx D could not just be the same as CALLx #A, ie C & Z are always saved, and remove the {wc,wz} option from CALLx D. IMHO it's a logic decision, not the assembler.
Is there anything else beyond ANDC/ORC/XORC? I think NOTC would be good, too.
Yes, NOTC is useful.
The ideal is to allow boolean expression evaluation using only BIT and BIT,C opcodes, and being able to include Pins in that.
Here are the 8051 BIT & C related opcodes, I think you now have a superset of these, with none missing ?
(maybe ANL C, /bit ORL C, /bit are missing ?)
Reach of the 8051 opcodes is 256 bits.
CLR C
CLR bit
SETB C
SETB bit
CPL C
CPL bit
ANL C, bit
ANL C, /bit
ORL C, bit
ORL C, /bit
MOV C, bit
MOV bit, C
JC rel8
JNC rel8
JB bit, rel8
JNB bit, rel8
JBC bit, rel8
RL A
RLC A
RR A
RRC A
I just reread your prior post. So, we can cover the 0/-1 output with a MUXC/NC/Z/NZ and a $FFFFFFFF mask. And we can reverse the compare order to get around needing WRLE/WRGT. Perfect. We'll have WRC/NC/Z/NZ output 0/1, only, then.
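As a rough illustration of the 0/1 versus 0/-1 point, here is a Python model. It assumes MUXC sets the bits of D selected by mask S to the value of C; `wrc`, `muxc` and `to_signed32` are names invented for the sketch.

```python
MASK32 = 0xFFFFFFFF

def wrc(c):
    """WRC as described: write 0 or 1 to D from the C flag."""
    return c & 1

def muxc(d, s, c):
    """Assumed MUXC model: bits of D selected by mask S are set to C."""
    return ((d & ~s) | (s if c else 0)) & MASK32

def to_signed32(x):
    """Interpret a 32-bit value as a signed integer."""
    return x - (1 << 32) if x & 0x80000000 else x

true_value = muxc(0x12345678, MASK32, 1)   # every bit follows C=1
false_value = muxc(0x12345678, MASK32, 0)  # every bit follows C=0
```

With a $FFFFFFFF mask the previous contents of D do not matter, so the result is -1 for true and 0 for false, while WRC stays at 1 and 0.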
The smart pins do have quadrature encoding, but only use A,B. What's the Z pin for? Home position sensing?
Yes, some systems have a Zero channel, a single narrow impulse.
Z can come from a 3rd sensor in a rotary encoder (which limits you to one revolution) or a limit switch, on a larger system.
Choices there are to clear the Quad Counter, or Capture the value (leading Z edge) and then subtract that in SW.
Can the smart pins capture from a 3rd pin, when Quad counting from 2 others ?
I just reread your prior post. So, we can cover the 0/-1 output with a MUXC/NC/Z/NZ and a $FFFFFFFF mask. And we can reverse the compare order to get around needing WRLE/WRGT. Perfect. We'll have WRC/NC/Z/NZ output 0/1, only, then.
Yes, I think Eric needed 0/1 (not 0,-1) for exact emulation of other cores.
Not sure if WRZ, WRNZ were needed for that, but sounds like they come for free (or almost free?)
Can the smart pins capture from a 3rd pin, when Quad counting from 2 others ?
No. The quadrature encoder can work either periodically, reporting the count delta every X clocks, or work in totalizer mode. In its current design, you would have to either watch for the 'zero' pin or use it to generate an interrupt. It would be easy to reset the counter in totalizer mode when 'zero' went high. It seems like this would be troublesome in cases of bidirectional rotation because of hysteresis on the 'zero' detector.
That's probably tolerable, as interrupt can be quite quick, in machine terms.
Most Z designs I've seen have deliberately narrow Z, so it 'fits inside' a single state, and so 'hysteresis' on direction change is not an issue.
I just looked at the smart pin and each mode is limited to two inputs and one output. For PWM in SMPS mode, it can output PWM and simultaneously monitor I and V digital inputs. For quadrature modes, there's no output, but the two inputs get used as A and B.
Most, but not all.
Chip hit the nail on the head before. Sometimes I've seen a bolthead and a proximity sensor. Then you would have to clear on the rising edge only, and an interrupt can do that.
If you have a "wide zero" and need 'any direction' operation, you need to flip the interrupt polarity based on direction of travel.
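The polarity flip can be sketched in software terms. This is a simplified simulation, not smart pin code: the detector range (counts 100 to 110), the function names, and the absence of sensor hysteresis are all assumptions for illustration.

```python
def zero_sensor(pos, lo=100, hi=110):
    """Hypothetical wide 'zero' detector: high across a range of counts."""
    return lo <= pos <= hi

def home_positions(path):
    """Return the position latched at each selected zero edge.

    Moving forward, the rising edge is used; moving in reverse, the falling
    edge is used, so both directions reference the same physical edge.
    """
    latched = []
    for prev, cur in zip(path, path[1:]):
        was, now = zero_sensor(prev), zero_sensor(cur)
        forward = cur > prev
        if forward and not was and now:
            latched.append(cur)    # rising edge, forward travel
        elif not forward and was and not now:
            latched.append(prev)   # falling edge, reverse travel
    return latched

forward_sweep = list(range(90, 121))
reverse_sweep = list(range(120, 89, -1))
assert home_positions(forward_sweep) == [100]
assert home_positions(reverse_sweep) == [100]
```

Using the rising edge in both directions would instead latch count 110 on the reverse sweep, which is the width of the detector away from the forward reference.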
All CALLx instructions save C & Z flags at D[21:20] (if they are saved)
ie Currently the save is done by D = {10'b0, C, Z, PC[19:0]} presuming both C & Z are saved.
This has been done since the internal stack is only 22 bits wide (C, Z, and a 20-bit return address).
RETx {wc,wz} optionally restores the C & Z flags.
JMP D {wc,wz} optionally restores the C & Z flags.
_RET_ prefix does not (cannot) restore C & Z flags.
SETCZ {#}D {wc,wz} sets flags according to D. C = D[1], Z = D[0]. If D is a register then D = {D[1:0], D[31:2]} (ie then ROR D,#2).
I have proposed these SHLCZ & SHRCZ instructions replace SETCZ, although I admit that they do not precisely solve the SETCZ #%bb {wc,wz} usage.
If D is a cog register (ie not in the stack, not in LUT, and not in hub) then it is possible to change the return condition by presetting the C & Z bits held in D[21:20] using two MUXxx or two BITxx instructions. MUXxx requires a pair of long masks as well whereas BITxx can use an immediate #S[4:0] to specify the bit location.
We actually have 5 pairs of 0's being D[31:22] unused when the C/Z/PC are saved in a cog register D.
If we utilised this mechanism for push/pop/replace/set the C & Z flags into just D[31:20] we would have 6 pairs of C & Z flags to utilise, while at the same time preserving D[19:0]. So we could use this mechanism with both normal register usage, and the specific case of a saved return address with C & Z flags.
So what do we need?
* Set C and/or Z from #D[1:0] by using {wc,wz}
* Set C and/or Z from D[21] and D[20] by using {wc,wz}
* Set C and/or Z from D[21] and D[20] by using {wc,wz} and then ROR D[31:20],#2 (ie pop): D = {D[21:20], D[31:22], D[19:0]} (ROR or SHR ???)
* Save C and/or Z to D[21] and D[20] by using {wc,wz}
* Save C and/or Z to D[21] and D[20] by using {wc,wz} and then SHL D[31:20],#2 (ie push): D = {D[29:20], wc ? C : D[21], wz ? Z : D[20], D[19:0]} syntax???
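To make the push/pop idea concrete, here is a Python sketch of a C/Z stack held in D[31:20] that leaves D[19:0] alone. None of these operations exist as instructions; `call_save`, `push_cz` and `pop_cz` are hypothetical names, and the sketch assumes both flags are always written (as if both wc and wz were given).

```python
PC_MASK = 0xFFFFF   # D[19:0], the return address, is left untouched

def call_save(c, z, pc):
    """CALLx-style save as described: D = {10'b0, C, Z, PC[19:0]}."""
    return (c << 21) | (z << 20) | (pc & PC_MASK)

def push_cz(d, c, z):
    """Hypothetical push: shift D[31:20] left by 2, insert C,Z at D[21:20]."""
    upper = (d >> 20) & 0xFFF
    upper = ((upper << 2) | (c << 1) | z) & 0xFFF
    return (upper << 20) | (d & PC_MASK)

def pop_cz(d):
    """Hypothetical pop: read C,Z from D[21:20], then rotate D[31:20] right by 2."""
    upper = (d >> 20) & 0xFFF
    c, z = (upper >> 1) & 1, upper & 1
    upper = ((upper >> 2) | ((upper & 3) << 10)) & 0xFFF
    return (upper << 20) | (d & PC_MASK), c, z

d = call_save(1, 0, 0x12345)   # saved return address with C=1, Z=0
d = push_cz(d, 0, 1)           # push a second pair on top
d, c, z = pop_cz(d)            # pops C=0, Z=1; the CALL pair is back at D[21:20]
```

The rotate on pop follows the ROR variant in the list; an SHR variant would drop the popped pair instead of moving it to D[31:30].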
Good question. hopefully not, I guess we need to see the encoding of the opcodes to be sure.
I was thinking of AND.OR.XOR versions of the BIT opcodes, rather than ones needing 32b masks, which are costly.
BITC D,{#}S Set bit S[4:0] of D to C. D = D & !(1 << S[4:0]) | ( C << S[4:0]). C/!Z = bit S[4:0] of D.
(etc)
add
BITOR D,{#}S C = OR bit S[4:0] of D with C. == ORC D,{#}S
BITAND D,{#}S C = AND bit S[4:0] of D with C. == ANDC D,{#}S
BITXOR D,{#}S C = XOR bit S[4:0] of D with C. == XORC D,{#}S
Those can address any bit without needing a mask, and so allow compact named Booleans,
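The proposed operations can be modelled in a few lines of Python. This follows the one-line descriptions above and nothing more; the helper `bit_of` and the choice to return the new C from the logic functions are assumptions of the sketch.

```python
MASK32 = 0xFFFFFFFF

def bitc(d, s, c):
    """BITC: set bit S[4:0] of D to C."""
    n = s & 31
    return ((d & ~(1 << n)) | (c << n)) & MASK32

def bit_of(d, s):
    """Read bit S[4:0] of D."""
    return (d >> (s & 31)) & 1

def bitor(d, s, c):
    """Proposed BITOR: C = C OR bit S[4:0] of D."""
    return c | bit_of(d, s)

def bitand(d, s, c):
    """Proposed BITAND: C = C AND bit S[4:0] of D."""
    return c & bit_of(d, s)

def bitxor(d, s, c):
    """Proposed BITXOR: C = C XOR bit S[4:0] of D."""
    return c ^ bit_of(d, s)
```

Only the 5-bit index is needed to reach any bit of D, which is why no 32-bit mask register has to be reserved.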
I think someone was proposing logic instructions that work directly on the C flag. How necessary is such a thing?
Yes, that was me.
This is a natural extension of the BIT opcodes (see above), since P2 can already SET or CLEAR or CPL bits or move C,!C,Z,!Z to BITs
- being able to do bit-level AND/OR/XOR then allows those boolean variables to be collected in an expression like
M = (N AND O) OR (P XOR Q). Here M, N, O, P and Q each need just one bit of memory. No wide masks needed, or wasted memory.
and for languages that have Boolean Expression results, you can code
M = (N AND O) OR (int32==i32) etc
PLC style code loves named booleans. Makes things very easy to read.
yes, and once those opcodes are there, that can be further improved with an assembler that can do this
ie assembler allocates a name to myflags, bit7 etc..
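As a sketch of how named booleans packed into one register might be evaluated through C, here is a Python walk-through of M = (N AND O) OR (P XOR Q). The bit assignments are hypothetical (as if an assembler had allocated them inside a "myflags" register), and each step is commented with the bit opcode it stands in for.

```python
# Hypothetical bit positions within one 32-bit "myflags" register.
M, N, O, P, Q = 7, 0, 1, 2, 3

def bit(d, n):
    """Read bit n of d."""
    return (d >> n) & 1

def bitc(d, n, c):
    """Write c into bit n of d."""
    return (d & ~(1 << n)) | (c << n)

def evaluate(flags):
    c = bit(flags, P)           # C = P
    c ^= bit(flags, Q)          # BITXOR: C = C XOR Q
    flags = bitc(flags, M, c)   # BITC: park (P XOR Q) in M
    c = bit(flags, N)           # C = N
    c &= bit(flags, O)          # BITAND: C = C AND O
    c |= bit(flags, M)          # BITOR: C = C OR M
    return bitc(flags, M, c)    # BITC: M = result

# Check every combination of the four input bits.
for flags in range(16):
    n, o, p, q = (bit(flags, i) for i in (N, O, P, Q))
    assert bit(evaluate(flags), M) == ((n & o) | (p ^ q))
```

The whole expression uses one register and no masks; M doubles as the temporary for the right-hand term.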
There is need for source and destination to be any of C, Z, and a register bit. You'd want to be able to make a register bit the destination, as well as a flag, and have operations between flags. I gotta' think about what that would look like.
I realise that this isn't a load store architecture, and maybe it's my 65C02 experience colouring my view, but when I see STZ I think of Store Zero. Would it offend anyone to suggest instead?
Is BITOR essentially the same as?
if_c BITC D,{#}S
After all, the state of the bit won't change if the carry is clear.
On a personal note, I'd avoid any mnemonic that contains the string "ORC".
Comments
It would be good to have something like:
No WC would be needed for these, since WC on existing bit opcodes could enable these functions.
The WRxx instructions write 0 or 1 to D, based on flag status.
Fair enough. Consistency is important, and I guess the fewer changes at this late stage the better.
Eric
Would it be a problem for C if all these WRC/NC/Z/NZ instructions wrote 0 or -1, instead of 0 or 1? It would be better for Spin, of course.
If we don't need WRLE/WRGT, then we could use those slots as boolean promoters:
BOOL D - promote non-0 value to -1, 0 value stays 0
BOOLN D - make D = -1 if D was 0, else make D = 0
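The two promoters are simple enough to state as a Python model. This only restates the one-line descriptions above; the names `bool_promote` and `bool_promote_n` are made up for the sketch, and -1 is represented as $FFFFFFFF.

```python
MASK32 = 0xFFFFFFFF

def bool_promote(d):
    """BOOL as proposed: a non-zero value becomes -1 ($FFFFFFFF), 0 stays 0."""
    return MASK32 if d & MASK32 else 0

def bool_promote_n(d):
    """BOOLN as proposed: 0 becomes -1 ($FFFFFFFF), anything else becomes 0."""
    return 0 if d & MASK32 else MASK32
```

Applied to a 0/1 result from WRC, `bool_promote` would give the 0/-1 truth value that Spin expects.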
Opcode names anyone???
I've got those RCZR and RCZL opcodes already implemented, but it would be easy to change which bits do what.
Here is part of the Verilog that does the job. Note the first two bits of each "u" term are the new C/Z states:
I also made new CLC/STC-type instructions to easily affect the flags:
Isn't that the same as?