Wow! It sounds like you've really nailed what needs to be done. Good job! This is going to be something really valuable.
Thanks Chip.
I think I have the rest figured out - just need to reset the unstuff counter if I receive a valid bit.
A question..
Do we have room for the instruction RxUSB D,S/# ?
Alternately, we could use a fixed location such as $1F0. Your thoughts?
RxUSB D, S/# WZ,WC
where
S/# is the PinPair# and Poly bits
  S[31..9] = unused
  S[8..7]  = 00 = CRC5 USB
             01 = CRC16 USB
             10 = CRC16 CCITT
             11 = undefined
  S[6..0]  = D-/D+ Pin Pair #0..127
The pin pair is always a pair of pins mod 2, i.e. nnnnnnx where x=0 and x=1 for the pair.
If the pin pair is even (S[0]=0) then J is the lowest pin and K is the higher pin of the consecutive pair.
If the pin pair is odd (S[0]=1) then K is the lowest pin and J is the higher pin of the consecutive pair.
This arrangement allows for simple LS and FS by making the pin pair even or odd.
D is the cog register storing a 32 bit field...
  D[31..16] = crc16
  D[15]     = K new pin value
  D[14]     = J new pin value
  D[13..11] = unstuff counter 3 bits
  D[10..8]  = bit counter 3 bits
  D[7..0]   = data byte accumulation
Z = data byte ready (8 bits)
C = SE0/SE1
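To make the proposed field layout easier to check, here is a rough Python sketch that unpacks S and D as described above. It is only a model of the bit positions in this post; the names decode_s, decode_d, RxUsbSource and RxUsbState are made up for illustration and are not part of any P2 tool.

```python
from typing import NamedTuple

# Poly selector values as listed for S[8..7] in the proposal.
POLY_NAMES = {0b00: "CRC5 USB", 0b01: "CRC16 USB", 0b10: "CRC16 CCITT", 0b11: "undefined"}


class RxUsbSource(NamedTuple):
    poly: str
    pin_pair: int
    j_pin: int
    k_pin: int


class RxUsbState(NamedTuple):
    crc16: int
    k_new: int
    j_new: int
    unstuff_count: int
    bit_count: int
    data_byte: int


def decode_s(s: int) -> RxUsbSource:
    """Unpack the S operand: S[8..7] = poly, S[6..0] = pin pair number."""
    pin = s & 0x7F
    poly = (s >> 7) & 0b11
    low, high = pin & ~1, (pin & ~1) + 1
    # Even pin number: J is the lower pin. Odd pin number: K is the lower pin.
    if pin & 1 == 0:
        j_pin, k_pin = low, high
    else:
        j_pin, k_pin = high, low
    return RxUsbSource(POLY_NAMES[poly], pin, j_pin, k_pin)


def decode_d(d: int) -> RxUsbState:
    """Unpack the 32-bit D register into the fields listed in the proposal."""
    return RxUsbState(
        crc16=(d >> 16) & 0xFFFF,
        k_new=(d >> 15) & 1,
        j_new=(d >> 14) & 1,
        unstuff_count=(d >> 11) & 0b111,
        bit_count=(d >> 8) & 0b111,
        data_byte=d & 0xFF,
    )
```

For example, decode_s((0b01 << 7) | 5) selects CRC16 USB on the pair 4/5 with K on pin 4 and J on pin 5, since the pin number is odd.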
We'll make room, don't worry. Looks fantastic, like you've identified everything it must do.
Thanks Chip & Bill.
Think I might have all the logic done ready for testing except for syntax problems. I am sure it can be written simpler, but as long as I can convey what needs to be done I'll be happy.
Chip, while I think of it, the GETZC instruction.
A while ago I suggested adding an #0-31 parameter (default 0) which rotates the D value right by 0..31 places before setting ZC=D[1:0]. (D remains unchanged as in NR)
This permits full 4 case decoding simply with the Z & C flags.
It will be particularly useful now we have Z&C being saved in various positions with CALLs etc.
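A small Python model of that suggestion, just to show the 4-case decoding. It assumes Z takes D[1] and C takes D[0] after the rotate (reading "ZC=D[1:0]" literally); the function name getzc and its signature are only for illustration.

```python
from typing import Tuple


def getzc(d: int, offset: int = 0) -> Tuple[int, int]:
    """Rotate the 32-bit value d right by offset (0..31), then return (Z, C) = (bit 1, bit 0).

    d itself is not modified, as with NR.
    """
    d &= 0xFFFFFFFF
    offset &= 31
    rotated = ((d >> offset) | (d << (32 - offset))) & 0xFFFFFFFF
    return (rotated >> 1) & 1, rotated & 1
```

With offset 30, for instance, the pair D[31:30] lands in the flags, so any two adjacent bits can drive a four-way branch on Z and C.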
See also the USB thread.
Once you have encapsulated RxUSB in Verilog, it is a small step to pace it on a timer, and it becomes WAITUSB, which is then read once per byte (or exits earlier on flags).
This also allows the timer to have an edge-snap mode, to resync on the USB edges.
The counter needs a reload (baud) mode, with an optional phased edge reset.
Chip,
Do the background instructions that take multiple clocks (such as the cordic, big multiplier, etc) consume large amounts of silicon (apart from the actual processing logic)?
What I am getting at, does it use a lot of silicon to set say the USB receiver off running in the background, and permit the cog to continue on with other processing while a byte (or more) is grabbed?
I have been resisting this path because it lacks general purpose use, but wondered if it is worth pursuing.
A USB state machine running in the background, reading and writing AUX would not be very big, at all. I don't think there's anything about USB that lends itself to general-purpose reuse, so it might as well be USB-only. If we could build a complete USB state machine, it might be an ideal approach.
Chip,
Do the Counters have a reload mode, ideally with an external edge reset?
A simple mode like that, would allow a Counter to be set to clock the USB state engine, and probably 1 level of Data storage is needed to give more time tolerance.
I think this could be emulated in SW @ 1.5MHz to prove the details.
The counters can count the frequency of edges and the durations of states, but they don't have a reload mode like you are asking about.
A special circuit can be made for the USB handler, though. In many instances, it's not the guts of a circuit that take lots of space, but all the conduit to make it breathe. If we encapsulated it, it might be the best way to go.
Good point. It's not a very large timer: the lowest divide would be /4 (48MHz down to 12MHz), and the highest, to reach 1.5MHz from a 200MHz core, is /133.33. So an 8-bit reload/reset counter ('baud divider') is all that is needed to cover practical clock ranges, with a 50% compare for the sample point for both even and odd divide values.
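The divider arithmetic above can be checked with a few lines of Python. This is only a sketch: rounding the reload value to the nearest integer and putting the sample point at half the reload (rounded down) are assumptions, and baud_divider is a made-up name.

```python
def baud_divider(core_hz: int, bit_hz: int):
    """Return (exact ratio, integer reload value, mid-bit sample count)."""
    ratio = core_hz / bit_hz
    reload = round(ratio)        # integer divide value loaded into the counter
    sample_point = reload // 2   # 50% compare, rounded down for odd values
    return ratio, reload, sample_point


# Full speed from a 48MHz clock, and low speed from a 200MHz core.
fs = baud_divider(48_000_000, 12_000_000)
ls = baud_divider(200_000_000, 1_500_000)
```

Both reload values (4 and 133) fit in 8 bits. The fractional /133.33 case is where an edge-resync mode would matter, since the integer divide drifts slightly against the real bit rate.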
Can anyone think of a better name than TCHECK? There should be some single word that could make up most of the name. I used "check" like one would check a basketball with an opposition teammate before starting play again. The word "register" applies, too, but it's too long and means, well... registers.
Chip,
As soon as you have the instruction summary (the section at the end of the docs) would you like to post it so I can put it into the spreadsheet and repost.
Thanks.
I just noticed that you got rid of the wr bit. Since there are still some unary/nullary instruction slots left, can you add an instruction that makes the next instruction's dest not get written? It would be just as fast and use one less long of cog RAM than copying the dest to a temporary register and doing your operation on the temporary register:
old way:
mov t1, x
<whatever> t1, y wz, wc
t1 res 1
new way:
nr ' new instruction
<whatever> x, y wz, wc
Thanks,
electrodude
EDIT: nr would only work inside a tlock or in a single-tasked program, making the mov method easier for multitasking. My nr instruction is still easier for singletasking.
Nice idea!
AUGNR 'next instruction does not write out the result, only affects Z & C flags according to WZ & WC
This could work, per task, but could you give me an instruction sequence that would benefit from this feature? I was just looking at the instruction set and I don't see any obvious cases where blocking the result writing would gain anything. If you can convince me, I'll do it. And better yet, we could have an instruction that actually redirects the result writing to another register 'NEXTWR D/#'.
That NEXTWR seems useful as it gives you the ability to do any instruction of the form "D2 = D1 OP S" and D1 is not destroyed in the process. Could be very useful if D1 is needed multiple times after its initial use and you don't want to make multiple copies of it along the way. However you still need to put in the extra NEXTWR instruction each time you need this. It would need to be stored per task state too. If $1F1 is not used in the code, you could put the dummy result there each time using NEXTWR #$1f1.
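Here is a toy Python model of the "D2 = D1 OP S" idea, to show how a one-shot write redirect would behave. It is a single-task sketch only; the Cog class, its method names, and the dictionary register file are all hypothetical, and real hardware would have to keep the redirect state per task as noted above.

```python
import operator


class Cog:
    """Minimal register-file model with a one-shot write redirect."""

    def __init__(self, regs):
        self.regs = dict(regs)
        self.redirect = None  # set by nextwr, consumed by the next ALU op

    def nextwr(self, dest):
        """Redirect the result of the next instruction to dest."""
        self.redirect = dest

    def alu(self, op, d, s):
        """Compute regs[d] OP regs[s]; write to d unless a redirect is pending."""
        result = op(self.regs[d], self.regs[s]) & 0xFFFFFFFF
        target = d if self.redirect is None else self.redirect
        self.redirect = None
        self.regs[target] = result
        return result


cog = Cog({"a": 6, "b": 7, "c": 0})
cog.nextwr("c")
cog.alu(operator.mul, "a", "b")   # result goes to c, a is preserved
cog.alu(operator.add, "a", "b")   # no redirect pending, so a is overwritten
```

After the first pair of calls c holds the product while a still holds its original value, which is the case where D1 is needed again later.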
Chip
Do SETMASK, WIDEBM, WIDEWM and WIDELM still exist? See here
Only these:
WRWIDE S/PTRA/PTRB
WRWIDEM D/#,S/PTRA/PTRB
WRWIDEM lets you specify a 32-bit mask in D/# where 1's inhibit individual byte writes within the 32-byte WIDE. So, by making it atomic, we don't need SETMASK.
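A quick Python sketch of the masked write as described. It assumes mask bit i controls byte i of the WIDE, which is not stated in the post; the function name wrwidem and the bytearray standing in for hub memory are illustrative only.

```python
def wrwidem(memory: bytearray, addr: int, wide: bytes, mask: int) -> None:
    """Write a 32-byte WIDE to memory, skipping bytes whose mask bit is 1."""
    if len(wide) != 32:
        raise ValueError("a WIDE is exactly 32 bytes")
    for i in range(32):
        if not (mask >> i) & 1:      # a 1 bit inhibits the write of byte i
            memory[addr + i] = wide[i]


hub = bytearray(64)
# Inhibit the upper 16 bytes: only bytes 0..15 of the WIDE are written.
wrwidem(hub, 32, bytes(range(1, 33)), 0xFFFF0000)
```

Because the mask travels with the write, there is no separate mask state that another task could change between a SETMASK and the write itself.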
Comments
Your USB instruction looks REALLY good.
Going to get some sleep now
I haven't been keeping up with all the latest developments with the P2 but I remember someone mentioning a general purpose CRC instruction a while back and was wondering if that was still being considered for release? Even a bit-wise crc instruction could come in handy.
Cheers,
David
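For reference, a bit-wise CRC step is small enough to model in a few lines of Python. The sketch below shifts one bit at a time, LSB first as USB sends data, using the standard USB CRC16 parameters (polynomial 0x8005 in reflected form 0xA001, initial value 0xFFFF, result inverted). The function names are made up, and this says nothing about how a P2 instruction would actually encode it.

```python
def crc_bit(crc: int, bit: int, poly_reflected: int) -> int:
    """Shift one data bit into a reflected (LSB-first) CRC register."""
    feedback = (crc ^ bit) & 1
    crc >>= 1
    if feedback:
        crc ^= poly_reflected
    return crc


def crc16_usb(data: bytes) -> int:
    """USB data-packet CRC16 computed one bit at a time."""
    crc = 0xFFFF
    for byte in data:
        for i in range(8):                      # LSB of each byte goes first
            crc = crc_bit(crc, (byte >> i) & 1, 0xA001)
    return crc ^ 0xFFFF


check = crc16_usb(b"123456789")
```

The single-bit step is what a per-bit receive instruction would apply each time a valid (non-stuffed) bit arrives.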
Yes, quite an impressive number
Don't get too excited, because probably at least 0.4 million views were just me trying to figure out what you guys were talking about. LOL
WAITREG? Or WAITRNZ?
Computer Engineering, VLSI Lab, USB2.0 Protocol Engine Project
Figure 6 – Configured Final State Machine
http://engineering.biu.ac.il/files/engineering/shared/PE_project_book_0.pdf
As soon as you have the instruction summary (the section at the end of the docs) would you like to post it so I can put it into the spreadsheet and repost.
Thanks.
Here it is:
Prop2_Instructions_2014_03_12.txt
VGA and PS2 keyboard/mouse was not general purpose, but P1 was wonderful because of them.
I think P2 would be awesome if you can make USB as easy as current serial transceivers.
AUGNR 'next instruction does not write out the result, only affects Z & C flags according to WZ & WC
Works just like AUGS & AUGD.
This could work, per task, but could you give me an instruction sequence that would benefit from this feature? I was just looking at the instruction set and I don't see any obvious cases where blocking the result writing would gain anything. If you can convince me, I'll do it. And better yet, we could have an instruction that actually redirects the result writing to another register 'NEXTWR D/#'.
Interesting idea! Would remove the need for temp registers in a lot of code.
Edit: One use for that could be something like:
On P1 I often used instructions like
SHR/SHL/ROR/ROL/RLC/RCR/AND/XOR <reg>,#<nnn> NR WZ,WC 'check bit(s)
We can now check a bit using TESTB but it would mostly use a reg. There are cases where the NR was quite useful. There are of course ways around this, using a temp register.
IMHO a single AUGNR would be most useful. It would work just like AUGS and AUGD do. This way, all the standard instructions such as the above, plus things like NEG, ADD, SUB, etc could be made to work using the AUGNR preceding the instruction. Would this be difficult?
Another instruction I mentioned a little earlier was to put D[1:0 + offset] into Z & C using wz and wc. By rotating D to the right #offset times, we could use any bit pair to set Z & C flags. This permits full 4 way decoding quickly. I thought that SETZC could be extended to be SETZC D/#,#0..31 using say
ZCL- 1111111 ZC L CCCC DDDDDDDDD 1111nnnnn SETZC D/#,#nnnnn 'where nnnnn shifts D right 0..31 places (default is 30). Note D is not written.
Postedit:
NEXTWR D/#
Could be extremely useful for giving a separate result from 2 inputs (D & S). Maybe we could utilise the special register $1F1 to mean NR.
IMHO this would be more versatile than AUGNR. Maybe AUGDEST might be a better name (since it is like AUGS and AUGD).
RESD #C
MUL A,B
...writes A*B to C
RESD #1
LINK #address16
...writes the return address to $001 instead of $000
RESD A
...indirection for writes
Do you have the opcode for that?