I don't think it should be too tough to have start/stop bit generation and detection be an option for the hardware.
Agreed. Most of the work is in finding register space, and mapping the control and flag bits.
Also easy should be a CLKOUT option bit, even for 'async', as that (with addition of fully variable length control) then allows the P2 to use the FTDI High Speed Serial, which can manage 50MHz clock speeds.
Streaming at 50Mbd over a standard VCP into a P2 has real appeal, and it is likely to be about the pin limit.
jmg,
I gave this some more thought, and I think (as long as pins can toggle at clkfreq, which Chip can tell us if it is possible)
Most uC cannot toggle pins above 100MHz, and many limit their Serial specs to somewhat under that, to allow for Ts/Th delay effects through the relatively large and slow Pin buffers.
assume an external clock (ECLK) is used to shift N bits - let's say the shift is on the falling edge of ECLK
once N bits are shifted, the rising edge of ECLK latches the shift register into the "Async Latch" and sets the internal "BITSAVAIL" flip flop
on the next P2 clock edge (rising or falling), the "Async Latch" is latched into the "P2 Receive Latch", which clears "BITSAVAIL"
...
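For anyone who wants to poke at the handoff described in the steps above, here is a rough behavioural model in Python. It is only a sketch of the idea as quoted, not of any actual P2 logic: the class, method and attribute names are made up for illustration, and real hardware would also need proper clock-domain synchronisation, which this ignores.

```python
class ExternallyClockedReceiver:
    """Toy model of the quoted receive path (all names are illustrative)."""

    def __init__(self, nbits):
        self.nbits = nbits
        self.shift_reg = 0        # clocked by ECLK
        self.count = 0            # bits shifted since the last latch
        self.async_latch = 0      # loaded on the rising ECLK edge after N bits
        self.bits_avail = False   # the "BITSAVAIL" flip flop
        self.rx_latch = None      # the "P2 Receive Latch", in the P2 clock domain

    def eclk_falling(self, bit):
        """Shift one data bit in on the falling edge of ECLK."""
        mask = (1 << self.nbits) - 1
        self.shift_reg = ((self.shift_reg << 1) | (bit & 1)) & mask
        self.count += 1

    def eclk_rising(self):
        """Once N bits have been shifted, copy them to the async latch."""
        if self.count == self.nbits:
            self.async_latch = self.shift_reg
            self.bits_avail = True
            self.count = 0

    def p2_clock_edge(self):
        """On a P2 clock edge, move a pending word into the receive latch."""
        if not self.bits_avail:
            return False
        self.rx_latch = self.async_latch
        self.bits_avail = False
        return True


def clock_in(rx, value):
    """Drive rx.nbits bits of value, MSB first, one ECLK cycle per bit."""
    for i in range(rx.nbits - 1, -1, -1):
        rx.eclk_falling((value >> i) & 1)
        rx.eclk_rising()


rx = ExternallyClockedReceiver(8)
clock_in(rx, 0xA5)
rx.p2_clock_edge()
print(hex(rx.rx_latch))
```

The model has one word of buffering: if a second word finishes before the P2 side takes the first, the async latch is simply overwritten.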
Sure, but you have now duplicated state machines for master and slave modes, as well as buffer registers.
Testing is a lot more complex, and you have to hope you sliced those clock domains in the correct places.
All of this effort is to try to reach /1 on shift clocks, but I do not recall seeing any uC that offers /1 shifter clocks. /2 is common in master mode, and even that for sub-50MHz devices. I think some allow /2 in slave mode, if the two chips have sync'd clocks.
Even a 'still coming' part like the 240MHz Analog Devices M4 shows a max of 50MHz spec'd for any pins, and a max of (fSCLK/4) MHz is mentioned for external clock modes (that is still 60MHz at the pins).
I found an NXP device where early data mentioned 80Mbd, but 2013 data now says 52Mbd on its SPI ports. The real world always conspires to lower practical clock speeds.
IMHO, for SPI (and other high speed serial) a 32-bit wide version of the 74HC195 (or 74HC194) chip extended with an output latch, or a 32-bit version of the 74HC299, would do wonders for speed. Nicer still would be some muxes for the clock source, which edge to clock on, where the pins connect, etc. If the shift register included a counter to (optionally) latch the register every 8, 16 or 32 clocks and set a flag, it should deal with asynchronous bit clocks too?
I have just been "playing" with some of your new instructions.
I just modified my VGA driver in Invaders to utilize the new non-polled WAITVID instruction as well as the new ESWAP8 Endian reversal.
This made a huge impact on my driver's size and performance. Its size was reduced from 57 longs to 41 longs!
Using ESWAP8 not only reduced the driver size but dramatically improved performance too.
I can "nearly" get away running it at 1/16 time slot but 2/16 works great. Sure the 20MHz clock increase helped too!
I think this shows that the real silicon @ 160MHz could easily do similar tasks in just 1/16 time slot.
I like your new SETTASK format too.
Very nice enhancements!
Cheers
Brian
Glad it's all working. I ran the 80MHz version on my DE0-Nano and I watched it on and off for hours while I worked. It's good that it has an 'attract' mode.
That SETTASK idea was Tubular's, I believe. It's a really nice solution to variable-length task loops.
You could use SERA/SERB to get your serial going with less overhead, too.
Is there any status register on SERA/SERB, or could the read set the Zero/Carry flag, so that code can step over it when no character is waiting?
We can make it do almost anything, so yes.
I've been working on the RDSTACK/WRSTACK instructions today and I think I'll have it done tomorrow. Hopefully, I'll glean some clear direction from all this synchronous serial discussion by then and we'll have a nice and tidy solution.
I've been working on the RDTASK/WRTASK instructions today and I think I'll have it done tomorrow. Hopefully, I'll glean some clear direction from all this synchronous serial discussion by then and we'll have a nice and tidy solution.
What if P2 only supported SPI master mode? I suspect that master mode will be used the majority of the time. And we would still have the UART for chip-to-chip communication. This might simplify the implementation:
Set phase, polarity, bit order, MISO, MOSI, CLK, divider
Allow only 8-bit frames (larger can be handled in software)
SEROUTx outputs without start bit, ID, or stop bit
SERINx reads buffered input (captured on the appropriate clock edge during SEROUTx)
CS is handled in software (i.e. SPI hardware has no knowledge of CS)
The biggest difference I see here is that the receiver is now tied to the transmitter in a way that the UART isn't.
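To make that coupling concrete, here is a small Python model of the master-only idea: 8-bit frames, selectable bit order, and input that is only captured while an output frame is being clocked. It is a sketch under assumptions, not the proposed hardware: `SpiMasterModel`, `ShiftRegisterSlave`, `serout` and `serin` are invented names standing in for SEROUTx/SERINx, and phase/polarity are left out because they only choose which clock edge is used, not which bits are exchanged.

```python
class ShiftRegisterSlave:
    """Hypothetical 8-bit SPI peripheral: one shift register, clocked by the master."""

    def __init__(self, preload=0, msb_first=True):
        self.reg = preload & 0xFF
        self.msb_first = msb_first

    def clock(self, mosi):
        """One SPI clock: present a bit on MISO and shift the MOSI bit in."""
        if self.msb_first:
            miso = (self.reg >> 7) & 1
            self.reg = ((self.reg << 1) | mosi) & 0xFF
        else:
            miso = self.reg & 1
            self.reg = (self.reg >> 1) | (mosi << 7)
        return miso


class SpiMasterModel:
    """Master side: the receive buffer is only filled while a frame is sent."""

    def __init__(self, slave, msb_first=True):
        self.slave = slave
        self.msb_first = msb_first
        self.rx_buffer = None

    def serout(self, byte):
        """Clock out one 8-bit frame (no start bit, ID or stop bit)."""
        rx = 0
        for n in range(8):
            i = 7 - n if self.msb_first else n
            mosi = (byte >> i) & 1
            miso = self.slave.clock(mosi)   # both sides shift on the same clock
            rx |= miso << i
        self.rx_buffer = rx

    def serin(self):
        """Return the byte captured during the most recent serout."""
        return self.rx_buffer


slave = ShiftRegisterSlave(preload=0x5A)
master = SpiMasterModel(slave)
master.serout(0xA5)
print(hex(master.serin()), hex(slave.reg))
```

CS is not modelled at all, which matches the suggestion that software handles it. Reading anything from the slave requires sending a (possibly dummy) frame first, which is exactly the receiver-tied-to-transmitter point.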
Master only does sound like a decent compromise. After all, the main reason for wanting SPI is to connect small pin count peripheral chips. I believe raw USB or Ethernet has been shot down on the basis of needing complex framing in the hardware, right?
I totally agree that it is not worth a lot of pain to reach clkfreq/1 on serial. It looks like my thought experiments underestimated the level of effort required to implement and test it.
/2 would still be nice if the level of effort required was reasonable; however, your examples (the 240MHz M4 spec'd at 50MHz on its pins, the NXP SPI ports at 52Mbd) suggest it too may require too much effort.
FYI, there are more opcodes that have changed than are listed in Chip's message at the head of this thread. I wrote a program to compare Chip's opcode table with the one in PropGCC and found numerous other differences.
What if P2 only supported SPI master mode? I suspect that master mode will be used the majority of the time.
I have to strongly disagree with that assumption.
The Prop, and the Prop II especially, can make excellent peripheral chips for larger systems. Say an ARM board running Linux. These are very common systems nowadays.
As such SPI slave is very important as it is far easier to operate the host end as a master. Nobody is going to want to start writing Linux device drivers to use the SPI interface as a slave device.
Master only does sound like a decent compromise. After all, the main reason for wanting SPI is to connect small pin count peripheral chips.
Bill,
Fast SPI master is far more important than fast slave,
Again, I have to say: slave is important. Consider that you want to connect a Prop to an ARM machine or some bigger system. The Prop is the peripheral chip. Fast SPI slave would be a great way to sell the Prop as a peripheral device.
If it's just not practical to do then so be it. I just think SLAVE can be more important than MASTER.
It's not ideal, but SPI slave can still be done entirely in software. At the end of the day, I'd rather have fast hardware SPI master support than no hardware SPI support at all. Also, many chips (including ARM, Microchip, AVR) support TTL UARTs that should be able to work with the existing P2 UART functionality.
You are right, I had my prop-centric hat on. Fast slave is also very important, as fast as is consistent with not requiring too great an effort.
Just for yucks, let's figure out the fastest possible externally clocked software SPI slave code - I'll start here:
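getspibyte                              ' wait for rising spiclk, sample miso
        waitpeq clkpin, clkpin          ' wait until spiclk is high
        shr     pinsa, #miso    wc,nr   ' intended to put the data pin state into C (nr: result not written)
        waitpne clkpin, clkpin          ' wait until spiclk is low again
        rlc     spibyte, #1             ' rotate C into the low bit of spibyte
        ' 7 more copies of above, with loop unrolled we avoid loop overhead
getspibyte_ret
        ret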
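For checking the logic off-chip, the same receive loop can be mimicked in Python. This is only a model of the sampling scheme (rising clock edge, MSB first, rotate each bit into the byte), not cycle-accurate PASM; `receive_byte` and `waveform` are made-up helper names, and the pin states are supplied as a list of (clk, data) samples rather than read from real pins.

```python
def receive_byte(samples):
    """Decode one byte from (clk, data) pin samples, sampling data on each rising clock edge."""
    value = 0
    bits = 0
    prev_clk = 1  # treat the clock as already high so a first high sample is not counted as an edge
    for clk, data in samples:
        if clk and not prev_clk:                      # rising edge seen (the waitpeq step)
            value = ((value << 1) | (data & 1)) & 0xFF  # the rlc step
            bits += 1
            if bits == 8:
                return value
        prev_clk = clk
    raise ValueError("fewer than 8 rising clock edges in the samples")


def waveform(byte):
    """Build an idealised sample list: clock low then high for each bit, MSB first."""
    samples = []
    for i in range(7, -1, -1):
        bit = (byte >> i) & 1
        samples.append((0, bit))
        samples.append((1, bit))
    return samples


print(hex(receive_byte(waveform(0xC3))))
```

Like the assembly version, this assumes the data line is stable around the rising edge and that nothing polls slower than the clock toggles.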
Prop2_Docs.txt does not give the minimum number of cycles for waitpeq.
1 cycle wait*: 4 clocks per bit in ideal circumstances (which are not likely), and invalid results if spiclk is too fast.
2 cycle wait*: 6 clocks per bit
general formula: W*2+2 clocks minimum for one SPI clock, where W is the minimum wait* time
external clock software receive:
26.7Mbps @ 160MHz - assuming W=2 is feasible
20.0Mbps @ 160MHz - assuming W=3 is feasible
16.0Mbps @ 160MHz - assuming W=4 is feasible
13.3Mbps @ 160MHz - assuming W=5
external clock hardware receive:
80.0Mbps @ 160MHz, if clkfreq/2 is possible
53.3Mbps @ 160MHz, at clkfreq/3
40.0Mbps @ 160MHz, at clkfreq/4
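The figures above follow directly from the W*2+2 formula, so they are easy to recompute for other clock rates. A quick Python sketch, where the function names are just for illustration and the two non-wait instructions are assumed to take one clock each (as the 4-clocks-per-bit case implies):

```python
def software_clocks_per_bit(w):
    """Two wait* instructions of w clocks each, plus shr and rlc at an assumed 1 clock each."""
    return 2 * w + 2


def mbps(clkfreq_hz, clocks_per_bit):
    """Bit rate in Mbps when each bit costs clocks_per_bit system clocks."""
    return clkfreq_hz / clocks_per_bit / 1e6


CLKFREQ = 160_000_000

for w in (2, 3, 4, 5):
    rate = mbps(CLKFREQ, software_clocks_per_bit(w))
    print(f"software receive, W={w}: {rate:.1f} Mbps")

for divider in (2, 3, 4):
    print(f"hardware receive, clkfreq/{divider}: {mbps(CLKFREQ, divider):.1f} Mbps")
```

Changing `CLKFREQ` gives the same comparison for an FPGA build running slower than the 160MHz target.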
Conclusions:
- hardware externally clocked SPI slave is much faster than software based externally clocked SPI slave
- hardware SPI slave can run as a task in a cog
- software SPI slave needs a whole cog
Ok heater, you got me - much nicer to be a fast hardware peripheral to an external processor (than a slow one)
- hardware SPI slave can run as a task in a cog
- software SPI slave needs a whole cog
I think this is a key observation. This means that these simple peripherals like UART, I2C, SPI, etc don't have to consume an entire COG. They can be combined with some higher level logic that runs in another thread. I think this makes the Propeller far more useful. You can use all 8 COGs to partition your application without giving up any to handle the tasks that are supported by custom hardware on other microcontrollers.
No doubt true. But...
There are many systems out there that can act as MASTER to any slave peripherals. Prop II as a SLAVE peripheral would be perfect. It's not always possible to create slave-side software on the MASTER end.
Two examples:
1) Any Linux based system. That's a lot. Linux has no SLAVE side driver. I am certainly not up to creating one. It would be great to be able to attach a Prop II as a peripheral to a Beagle Bone, Raspberry Pi and many others.
2) The Espruino has SPI MASTER support for attaching peripheral chips. Again, does anyone want to create that SLAVE side driver?
Philosophically the Prop is the slave in these situations so it makes sense to have SPI SLAVE support.
Anyway as I said, if HOST only is easier and SLAVE is impractical, so be it.
We have to stop messing with the PII design at some point:)
We need to keep that in mind.
I agree that SPI slave opens up some really nice use cases, but at some point we are going to ask for one thing too many and end up getting less than we could have had...
I can see Chip saying he has Slave SPI implemented and then someone will chime in with a request for yet another protocol to be supported and...
Marty
Thanks.
That possibility would make the SER functions behave much like a standard serial port.
Call it --- test for a character --- if there is none, step ahead --- if one is arriving, wait for it and return
I have been wondering whether the Z flag could be used for "character ready to read" and the C flag for "overflow", meaning an incoming character in the buffer has been overwritten by the next one (loss of a character in the stream).
Chip, That was next on my list to play with.....
Cheers
Brian
This might be better with 2 possible phases : Call them Skip and Wait.
Skip jumps over, knowing it will get back again inside 1 char time. Good for small loops
Wait, knows/expects another Char is due, and does nothing else until it arrives. Use mostly during data streams ?
The HW needs to have 1 char of buffering, to give practical handshake times, and workable with packed data streams.
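Here is how Skip and Wait might look from the software side, as a Python model with the one-character buffer and the Z/C flag idea from earlier in the thread. Everything in it is an assumption for discussion: `SerialReceiverModel`, `read_skip` and `read_wait` are invented names, Z is taken to mean "character ready" and C "overflow", and none of this reflects what SERINA/SERINB actually do.

```python
class SerialReceiverModel:
    """Toy receiver with a single character of hardware buffering."""

    def __init__(self):
        self.buffer = None      # the one buffered character, or None when empty
        self.overflow = False   # set when an unread character gets overwritten

    def hardware_receives(self, ch):
        """Called by the (simulated) hardware when a full character has arrived."""
        if self.buffer is not None:
            self.overflow = True
        self.buffer = ch

    def read_skip(self):
        """Skip phase: never blocks. Returns (z, c, char), char is None when nothing is ready."""
        z = self.buffer is not None
        c = self.overflow
        ch = self.buffer
        self.buffer = None
        self.overflow = False
        return z, c, ch

    def read_wait(self, line):
        """Wait phase: do nothing else until a character arrives from the iterator `line`."""
        while self.buffer is None:
            self.hardware_receives(next(line))
        return self.read_skip()


rx = SerialReceiverModel()
print(rx.read_skip())                 # nothing waiting: caller steps over
rx.hardware_receives(0x41)
print(rx.read_skip())                 # one character ready
print(rx.read_wait(iter([0x42])))     # blocks until the next character shows up
```

A small loop would call `read_skip` once per pass and rely on getting back within one character time; a streaming routine would sit in `read_wait`. With only one character of buffering, being late by more than one character time shows up as C set.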
Here is an update to the Instruction Set for the new info from Chip.
''  +--------------------------------------------------------------------------+
''  |   Propeller II Instruction Set v0.xx                                     |
''  +--------------------------------------------------------------------------+
''  |  Prepared by Cluso99 from information provided by Chip Gracey (Parallax) |
''  +--------------------------------------------------------------------------+
''  RR20130517  first version
''  RR20130518  add INDA/INDB & PTRA/PTRB information
''  RR20130530  fix r s/be i for 000011 additional instructions, mul/scl i s/be r=1
''  RR20131002  modify for new rev by Chip

DAT
-----------------------------------
Conditional Assembly commands .....
cccc  condition       alternate
-----------------------------------
0000  if_never        nop
0001  if_nz_and_nc    if_a
0010  if_z_and_nc
0011  if_nc           if_ae
0100  if_nz_and_c
0101  if_nz           if_ne
0110  if_z_ne_c
0111  if_nz_or_nc
1000  if_z_and_c
1001  if_z_eq_c
1010  if_z            if_e
1011  if_z_or_nc
1100  if_c            if_b
1101  if_nz_or_c
1110  if_z_or_c       if_be
1111  if_always       (default)
-----------------------------------

-------------------------------------------------------------------------------------------------------------------------------
INDA & INDB (Cog addresses $1F6 & $1F7)... .....      (Note: Instructions using indirection ALWAYS execute )
opcode zcri cccc --dest--- -source--                  ( Therefore, cccc specify none/postinc/postdec/preinc)
-------------------------------------------------------------------------------------------------------------------------------
------ ---- 00xx 111110110 ---------    Dest = INDA
------ ---- 01xx 111110110 ---------    Dest = INDA++
------ ---- 10xx 111110110 ---------    Dest = INDA--
------ ---- 11xx 111110110 ---------    Dest = ++INDA
------ ---- 00xx 111110111 ---------    Dest = INDB
------ ---- 01xx 111110111 ---------    Dest = INDB++
------ ---- 10xx 111110111 ---------    Dest = INDB--
------ ---- 11xx 111110111 ---------    Dest = ++INDB
------ ---0 xx00 --------- 111110110    Srce = INDA
------ ---0 xx01 --------- 111110110    Srce = INDA++
------ ---0 xx10 --------- 111110110    Srce = INDA--
------ ---0 xx11 --------- 111110110    Srce = ++INDA
------ ---0 xx00 --------- 111110111    Srce = INDB
------ ---0 xx01 --------- 111110111    Srce = INDB++
------ ---0 xx10 --------- 111110111    Srce = INDB--
------ ---0 xx11 --------- 111110111    Srce = ++INDB
-------------------------------------------------------------------------------------------------------------------------------
opcode zcri cccc --dest--- -source--
-------------------------------------------------------------------------------------------------------------------------------

-------------------------------------------------------------------------------------------------------------------------------
PTR (hub address) & RD/WR-QUAD/LONG/WORD/BYTE-C Instructions (i=1 uses PTRx) .....
-------------------------------------------------------------------------------------------------------------------------------
                           ┌─────────── 0 = PTRA                1 = PTRB
                           │┌────────── 0 = PTRx no update      1 = PTRx updated
                           ││┌───────── 0 = PTRx+Index*Scale    1 = PTRx (post modify)
                           │││   ┌───── Index -32..+31 (nnnnnn = -Index, NNNNNN = +Index)
                           │││   │      Scale is 1=Byte, 2=Word, 4=Long, 16=Quad
                           │││┌──┴─┐
                           supnnnnnn
opcode zcri cccc --dest--- -source--  Expression     Uses  Initial value Final value
-------------------------------------------------------------------------------------------------------------------------------
------ ---1 ---- --------- 000000000  PTRA           PTRA
------ ---1 ---- --------- 011000000  PTRA++         PTRA                PTRA += Scale
------ ---1 ---- --------- 011111111  PTRA--         PTRA                PTRA -= Scale
------ ---1 ---- --------- 010000001  ++PTRA         PTRA + Scale        PTRA += Scale        (ie Index = +1)
------ ---1 ---- --------- 010111111  --PTRA         PTRA - Scale        PTRA -= Scale        (ie Index = -1)
------ ---1 ---- --------- 000NNNNNN  PTRA[Index]    PTRA + Index*Scale                       (ie Index = +NNNNNN)
------ ---1 ---- --------- 011NNNNNN  PTRA++[Index]  PTRA                PTRA += Index*Scale  (ie Index = +NNNNNN)
------ ---1 ---- --------- 011nnnnnn  PTRA--[Index]  PTRA                PTRA -= Index*Scale  (ie Index = -nnnnnn)
------ ---1 ---- --------- 010NNNNNN  ++PTRA[Index]  PTRA + Index*Scale  PTRA += Index*Scale  (ie Index = +NNNNNN)
------ ---1 ---- --------- 010nnnnnn  --PTRA[Index]  PTRA - Index*Scale  PTRA -= Index*Scale  (ie Index = -nnnnnn)
------ ---1 ---- --------- 100000000  PTRB           PTRB
------ ---1 ---- --------- 111000000  PTRB++         PTRB                PTRB += Scale
------ ---1 ---- --------- 111111111  PTRB--         PTRB                PTRB -= Scale
------ ---1 ---- --------- 110000001  ++PTRB         PTRB + Scale        PTRB += Scale        (ie Index = +1)
------ ---1 ---- --------- 110111111  --PTRB         PTRB - Scale        PTRB -= Scale        (ie Index = -1)
------ ---1 ---- --------- 100NNNNNN  PTRB[Index]    PTRB + Index*Scale                       (ie Index = +NNNNNN)
------ ---1 ---- --------- 111NNNNNN  PTRB++[Index]  PTRB                PTRB += Index*Scale  (ie Index = +NNNNNN)
------ ---1 ---- --------- 111nnnnnn  PTRB--[Index]  PTRB                PTRB -= Index*Scale  (ie Index = -nnnnnn)
------ ---1 ---- --------- 110NNNNNN  ++PTRB[Index]  PTRB + Index*Scale  PTRB += Index*Scale  (ie Index = +NNNNNN)
------ ---1 ---- --------- 110nnnnnn  --PTRB[Index]  PTRB - Index*Scale  PTRB -= Index*Scale  (ie Index = -nnnnnn)
-------------------------------------------------------------------------------------------------------------------------------

----------------------------------------------------------------------------------------------
Instruction set for Propeller II .....
-opc--  opcode-  operand   zcri cccc   opcode-  operand   zcri cccc  opcode-  operand   zcri cccc
----------------------------------------------------------------------------------------------
000000  wrbyte   D,S/PTR   000p ----   rdbyte   D,S/PTR   -01p ----  rdbytec  D,S/PTR   -11p ----
000001  wrword   D,S/PTR   000p ----   rdword   D,S/PTR   -01p ----  rdwordc  D,S/PTR   -11p ----
000010  wrlong   D,S/PTR   000p ----   rdlong   D,S/PTR   -01p ----  rdlongc  D,S/PTR   -11p ----
000011  coginit  D,S       ---0 ---- ───── *** see table for opcode 000011 i=1 below ***
000100  mul      D,[#]S    --1- ---- ───────────────────────────┐
000101  scl      D,[#]S    --1- ---- ─────────────────────────┐ │  ┌ setacca  D,[#]S    000- ----
000110  enc      D,[#]S    ---- ----                          │ └──┤ setaccb  D,[#]S    010- ----
000111  jmpret   D,[#]S    ---- ---- call/ret                 │    │ maca     D,[#]S    100- ----
001000  ror      D,[#]S    ---- ----                          │    └ macb     D,[#]S    110- ----
001001  rol      D,[#]S    ---- ----                          │
001010  shr      D,[#]S    ---- ----                          │    ┌ movf     D,[#]S    000- ----
001011  shl      D,[#]S    ---- ----                          └────┤ qsincos  D,[#]S    010- ----
001100  rcr      D,[#]S    ---- ----                               │ qarctan  D,[#]S    100- ----
001101  rcl      D,[#]S    ---- ----                               └ qrotate  D,[#]S    110- ----
001110  sar      D,[#]S    ---- ----
001111  rev      D,[#]S    ---- ----
010000  mins     D,[#]S    ---- ----
010001  maxs     D,[#]S    ---- ----
010010  min      D,[#]S    ---- ----
010011  max      D,[#]S    ---- ----
010100  movs     D,[#]S    ---- ----
010101  movd     D,[#]S    ---- ----
010110  movi     D,[#]S    ---- ----
010111  jmpretd  D,[#]S    ---- ---- calld/retd
011000  and      D,[#]S    ---- ---- test
011001  andn     D,[#]S    ---- ---- testn
011010  or       D,[#]S    ---- ----                               ┌ setinda  #S        0000 0001
011011  xor      D,[#]S    ---- ----                               │ setinda  S         0000 0011
011100  muxc     D,[#]S    ---- ----                               │ setindb  #D        0000 0100
011101  muxnc    D,[#]S    ---- ----                               │ setindb  D         0000 1100
011110  muxz     D,[#]S    ---- ----                               │ setinds  #D,#S     0000 0101
011111  muxnz    D,[#]S    ---- ----                               │ setinds  #D,S      0000 0111
100000  add      D,[#]S    ---- ----                ┌──────────────┤ setinds  D,#S      0000 1101
100001  sub      D,[#]S    ---- ---- cmp            │              └ setinds  D,S       0000 1111
100010  addabs   D,[#]S    ---- ----                │
100011  subabs   D,[#]S    ---- ----                │              ┌ fixinda  #D,#S     0000 0001
100100  sumc     D,[#]S    ---- ----                │ ┌────────────┤ fixindb  #D,#S     0000 0100
100101  sumnc    D,[#]S    ---- ----                │ │            └ fixinds  #D,#S     0000 0101
100110  sumz     D,[#]S    ---- ----                │ │
100111  sumnz    D,[#]S    ---- ----                │ │ ┌──────────  cfgpins  D,[#]S    000- ----
101000  mov      D,[#]S    ---- ----                │ │ │
101001  neg      D,[#]S    ---- ----                │ │ │ ┌────────  waitvid  D,[#]S    000- ----
101010  abs      D,[#]S    ---- ----                │ │ │ │
101011  absneg   D,[#]S    ---- ----                │ │ │ │        ┌ ijz      D,[#]S    00-- ---
101100  negc     D,[#]S    ---- ----                │ │ │ │        │ ijzd     D,[#]S    01-- ---
101101  negnc    D,[#]S    ---- ----                │ │ │ │ ┌──────┤ ijnz     D,[#]S    10-- ---
101110  negz     D,[#]S    ---- ----                │ │ │ │ │      └ ijnzd    D,[#]S    11-- ---
101111  negnz    D,[#]S    ---- ----                │ │ │ │ │
110000  cmps     D,[#]S    ---- ----                │ │ │ │ │      ┌ djz      D,[#]S    00-- ---
110001  cmpsx    D,[#]S    ---- ----                │ │ │ │ │      │ djzd     D,[#]S    01-- ---
110010  addx     D,[#]S    ---- ----                │ │ │ │ │ ┌────┤ djnz     D,[#]S    10-- ---
110011  subx     D,[#]S    ---- ----                │ │ │ │ │ │    └ djnzd    D,[#]S    11-- ---
110100  adds     D,[#]S    ---- ----                │ │ │ │ │ │
110101  subs     D,[#]S    ---- ----                │ │ │ │ │ │    ┌ tjz      D,[#]S    000- ---
110110  addsx    D,[#]S    ---- ----                │ │ │ │ │ │    │ tjzd     D,[#]S    010- ---
110111  subsx    D,[#]S    ---- ----                │ │ │ │ │ │    │ tjnz     D,[#]S    100- ---
111000  subr     D,[#]S    --1- ---- ─ cmpr ────────┘ │ │ │ │ │    │ tjnzd    D,[#]S    110- ---
111001  cmpsub   D,[#]S    --1- ---- ─────────────────┘ │ │ │ │    │ jp       D,[#]S    001- ---
111010  incmod   D,[#]S    --1- ---- ───────────────────┘ │ │ │    │ jpd      D,[#]S    011- ---
111011  decmod   D,[#]S    --1- ---- ─────────────────────┘ │ │ ┌──┤ jnp      D,[#]S    101- ---
111100  ij***    D,[#]S    **-- ---- ───────────────────────┘ │ │  └ jnpd     D,[#]S    111- ---
111101  dj***    D,[#]S    **-- ---- ─────────────────────────┘ │
111110  tj***    D,[#]S    ***- ---- ───────────────────────────┘
111111  waitcnt  D,[#]S    0--- ----   waitpeq  D,[#]S    1-0- ----  waitpne  D,[#]S    1-1- ----
----------------------------------------------------------------------------------------------
-opc--  opcode-  operand   zcri cccc   opcode-  operand   zcri cccc  opcode-  operand   zcri cccc
----------------------------------------------------------------------------------------------

-------------------------------------------------------------------------------------------------------------------------------
Additional instructions for opcode=000011 with i=1 (coginit has i=0) .....
-source--  opcode-   operand  --dest---  zcri cccc   opcode-   operand  --dest---  zcri cccc   opcode-   operand  --dest---  zcri cccc
-------------------------------------------------------------------------------------------------------------------------------
000000000  clkset    D        ---------  ---1 ----
000000001  cogid     D        ---------  ---1 ----
000000010  (coginit) D        ---------  ---1 ----
000000011  cogstop   D        ---------  ---1 ----
000000100  locknew   D        ---------  ---1 ----
000000101  lockret   D        ---------  ---1 ----
000000110  lockset   D        ---------  ---1 ----
000000111  lockclr   D        ---------  ---1 ----
000001000  serina    D        ---------  ---1 ----
000001001  serinb    D        ---------  ---1 ----
000001010  pushzc    D        ---------  ---1 ----
000001011  popzc     D        ---------  ---1 ----
000001100  subcnt    D        ---------  --11 ----   cmpcnt    D        ---------  --01 ----
000001101  getcnt    D        ---------  --11 ----   passcnt   D        ---------  --01 ----
000001110  getacca   D        ---------  ---1 ----
000001111  getaccb   D        ---------  ---1 ----
000010000  getlfsr   D        ---------  ---1 ----
000010001  gettops   D        ---------  --11 ----   polvid             ---------  -101 ----
000010010  getptra   D        ---------  --11 ----   polctra            ---------  -101 ----
000010011  getptrb   D        ---------  --11 ----   polctrb            ---------  -101 ----
000010100  getpix    D        ---------  --11 ----
000010101  getspd    D        ---------  --11 ----   chkspd             000000000  --01 ----
000010110  getspa    D        ---------  --11 ----   chkspa             000000000  --01 ----
000010111  getspb    D        ---------  --11 ----   chkspb             000000000  --01 ----
000011000  popar     D        ---------  --11 ----
000011001  popbr     D        ---------  --11 ----
000011010  popa      D        ---------  --11 ----
000011011  popb      D        ---------  --11 ----
000011100  reta               000000000  --01 ----
000011101  retb               000000000  --01 ----
000011110  retad              000000000  --01 ----
000011111  retbd              000000000  --01 ----
000100000  decod2    D        ---------  ---1 ----
000100001  decod3    D        ---------  ---1 ----
000100010  decod4    D        ---------  ---1 ----
000100011  decod5    D        ---------  ---1 ----
000100100  blmask    D        ---------  ---1 ----
000100101  not       D        ---------  ---1 ----
000100110  onecnt    D        ---------  ---1 ----
000100111  zercnt    D        ---------  ---1 ----
000101000  incpat    D        ---------  ---1 ----
000101001  decpat    D        ---------  ---1 ----
000101010  bingry    D        ---------  ---1 ----
000101011  grybin    D        ---------  ---1 ----
000101100  mergew    D        ---------  ---1 ----
000101101  splitw    D        ---------  ---1 ----
000101110  eswap4    D        ---------  ---1 ----
000101111  eswap8    D        ---------  ---1 ----
000110000  getmull   D        ---------  ---1 ----
000110001  getmulh   D        ---------  ---1 ----
000110010  getdivq   D        ---------  ---1 ----
000110011  getdivr   D        ---------  ---1 ----
000110100  getsqrt   D        ---------  ---1 ----
000110101  getqx     D        ---------  ---1 ----
000110110  getqy     D        ---------  ---1 ----
000110111  getqz     D        ---------  ---1 ----
000111000  getphsa   D        ---------  ---1 ----
000111001  getphza   D        ---------  ---1 ----
000111010  getcosa   D        ---------  ---1 ----
000111011  getsina   D        ---------  ---1 ----
000111100  getphsb   D        ---------  ---1 ----
000111101  getphzb   D        ---------  ---1 ----
000111110  getcosb   D        ---------  ---1 ----
000111111  getsinb   D        ---------  ---1 ----
-------------------------------------------------------------------------------------------------------------------------------
001iiiiii  repd      #i       111111111  -001 ----   repd      D/#n,#i  nnnnnnnnn  -0#1 ----   reps      #n,#i    nnnnnnnnn  n111 nnnn
-------------------------------------------------------------------------------------------------------------------------------
01000tttt  jmptask   D/#n,#t  nnnnnnnnn  --#1 ----
0100100--  -spare-            nnnnnnnnn  --#1 ----
010010100  settask   D/#n     nnnnnnnnn  --#1 ----
010010101  clracca            000000000  --01 ----
010010110  clraccb            000000000  --01 ----
010010111  clraccs            000000000  --01 ----
010011000  cachex             000000000  --01 ----
010011001  saracca   D/#n     ---nnnnnn  --#1 ----
010011010  saraccb   D/#n     ---nnnnnn  --#1 ----
010011011  saraccs   D/#n     ---nnnnnn  --#1 ----
010011100  serouta   D/#n     nnnnnnnnn  --#1 ----
010011101  seroutb   D/#n     nnnnnnnnn  --#1 ----
010011110  setxch    D/#n     nnnnnnnnn  --#1 ----
010011111  setxfr    D/#n     nnnnnnnnn  --#1 ----
-------------------------------------------------------------------------------------------------------------------------------
010100000  nopx      D/#n     nnnnnnnnn  --#1 ----
010100001  setzc     D/#n     nnnnnnnnn  --#1 ----
010100010  setspa    D/#n     -nnnnnnnn  --#1 ----
010100011  setspb    D/#n     -nnnnnnnn  --#1 ----
010100100  addspa    D/#n     -nnnnnnnn  --#1 ----
010100101  addspb    D/#n     -nnnnnnnn  --#1 ----
010100110  subspa    D/#n     -nnnnnnnn  --#1 ----
010100111  subspb    D/#n     -nnnnnnnn  --#1 ----
010101000  pushar    D/#n     nnnnnnnnn  --#1 ----
010101001  pushbr    D/#n     nnnnnnnnn  --#1 ----
010101010  pusha     D/#n     nnnnnnnnn  --#1 ----
010101011  pushb     D/#n     nnnnnnnnn  --#1 ----
010101100  calla     D/#n     nnnnnnnnn  --#1 ----
010101101  callb     D/#n     nnnnnnnnn  --#1 ----
010101110  callad    D/#n     nnnnnnnnn  --#1 ----
010101111  callbd    D/#n     nnnnnnnnn  --#1 ----
010110000  wrquad    D/PTR    supiiiiii  --p1 ----
010110001  rdquad    D/PTR    supiiiiii  -0p1 ----   rdquadc   D/PTR    supiiiiii  -1p1 ----
010110010  setptra   D/#n     nnnnnnnnn  --#1 ----
010110011  setptrb   D/#n     nnnnnnnnn  --#1 ----
010110100  addptra   D/#n     nnnnnnnnn  --#1 ----
010110101  addptrb   D/#n     nnnnnnnnn  --#1 ----
010110110  subptra   D/#n     nnnnnnnnn  --#1 ----
010110111  subptrb   D/#n     nnnnnnnnn  --#1 ----
010111000  setpix    D/#n     nnnnnnnnn  --#1 ----
010111001  setpixu   D/#n     nnnnnnnnn  --#1 ----
010111010  setpixv   D/#n     nnnnnnnnn  --#1 ----
010111011  setpixz   D/#n     nnnnnnnnn  --#1 ----
010111100  setpixa   D/#n     nnnnnnnnn  --#1 ----
010111101  setpixr   D/#n     nnnnnnnnn  --#1 ----
010111110  setpixg   D/#n     nnnnnnnnn  --#1 ----
010111111  setpixb   D/#n     nnnnnnnnn  --#1 ----
011000000  setmula   D/#n     nnnnnnnnn  -1#1 ----   setmulu   D/#n     nnnnnnnnn  -0#1 ----
011000001  setmulb   D/#n     nnnnnnnnn  --#1 ----
011000010  setdiva   D/#n     nnnnnnnnn  -1#1 ----   setdivu   D/#n     nnnnnnnnn  -0#1 ----
011000011  setdivb   D/#n     nnnnnnnnn  --#1 ----
011000100  setsqrh   D/#n     nnnnnnnnn  --#1 ----
011000101  setsqrl   D/#n     nnnnnnnnn  --#1 ----
011000110  setqi     D/#n     nnnnnnnnn  --#1 ----
011000111  setqz     D/#n     nnnnnnnnn  --#1 ----
011001000  qlog      D/#n     nnnnnnnnn  --#1 ----
011001001  qexp      D/#n     nnnnnnnnn  --#1 ----
011001010  setf      D/#n     nnnnnnnnn  --#1 ----
011001011  setskip   D/#n     nnnnnnnnn  --#1 ----
011001100  cfgdac0   D/#n     -------nn  --#1 ----
011001101  cfgdac1   D/#n     -------nn  --#1 ----
011001110  cfgdac2   D/#n     -------nn  --#1 ----
011001111  cfgdac3   D/#n     -------nn  --#1 ----
011010000  setdac0   D/#n     nnnnnnnnn  --#1 ----
011010001  setdac1   D/#n     nnnnnnnnn  --#1 ----
011010010  setdac2   D/#n     nnnnnnnnn  --#1 ----
011010011  setdac3   D/#n     nnnnnnnnn  --#1 ----
011010100  cfgdacs   D/#n     -nnnnnnnn  --#1 ----
011010101  setdacs   D/#n     nnnnnnnnn  --#1 ----
011010110  getp      D/#n     --nnnnnnn  --#1 ----
011010111  getnp     D/#n     --nnnnnnn  --#1 ----
011011000  offp      D/#n     --nnnnnnn  --#1 ----
011011001  notp      D/#n     --nnnnnnn  --#1 ----
011011010  clrp      D/#n     --nnnnnnn  --#1 ----
011011011  setp      D/#n     --nnnnnnn  --#1 ----
011011100  setpc     D/#n     --nnnnnnn  --#1 ----
011011101  setpnc    D/#n     --nnnnnnn  --#1 ----
011011110  setpz     D/#n     --nnnnnnn  --#1 ----
011011111  setpnz    D/#n     --nnnnnnn  --#1 ----
011100000  setcog    D/#n     -----nnnn  --#1 ----
011100001  setmap    D/#n     ---nnnnnn  --#1 ----
011100010  setquad   D/#n     nnnnnnnnn  -0#1 ----   setquaz   D/#n     nnnnnnnnn  -1#1 ----
011100011  setport   D/#n     --nn-----  --#1 ----
011100100  setpora   D/#n     --nn-----  --#1 ----
011100101  setporb   D/#n     --nn-----  --#1 ----
011100110  setporc   D/#n     --nn-----  --#1 ----
011100111  setpord   D/#n     --nn-----  --#1 ----
011101000  setpera   D/#n     nnnnnnnnn  --#1 ----
011101001  setsera   D/#n     nnnnnnnnn  --#1 ----
011101010  setperb   D/#n     nnnnnnnnn  --#1 ----
011101011  setserb   D/#n     nnnnnnnnn  --#1 ----
011101100  setvid    D/#n     nnnnnnnnn  --#1 ----
011101101  setvidy   D/#n     nnnnnnnnn  --#1 ----
011101110  setvidi   D/#n     nnnnnnnnn  --#1 ----
011101111  setvidq   D/#n     nnnnnnnnn  --#1 ----
011110000  setctra   D/#n     nnnnnnnnn  --#1 ----
011110001  setwava   D/#n     nnnnnnnnn  --#1 ----
011110010  setfrqa   D/#n     nnnnnnnnn  --#1 ----
011110011  setphsa   D/#n     nnnnnnnnn  --#1 ----
011110100  addphsa   D/#n     nnnnnnnnn  --#1 ----
011110101  subphsa   D/#n     nnnnnnnnn  --#1 ----
011110110  synctra            nnnnnnnnn  --01 ----
011110111  capctra            nnnnnnnnn  --01 ----
011111000  setctrb   D/#n     nnnnnnnnn  --#1 ----
011111001  setwavb   D/#n     nnnnnnnnn  --#1 ----
011111010  setfrqb   D/#n     nnnnnnnnn  --#1 ----
011111011  setphsb   D/#n     nnnnnnnnn  --#1 ----
011111100  addphsb   D/#n     nnnnnnnnn  --#1 ----
011111101  subphsb   D/#n     nnnnnnnnn  --#1 ----
011111110  synctrb            nnnnnnnnn  --01 ----
011111111  capctrb            nnnnnnnnn  --01 ----
-------------------------------------------------------------------------------------------------------------------------------
1000bbbbb  isob      D,#b     ---------  ---1 ----
1001bbbbb  notb      D,#b     ---------  ---1 ----
1010bbbbb  clrb      D,#b     ---------  ---1 ----
1011bbbbb  setb      D,#b     ---------  ---1 ----
1100bbbbb  setbc     D,#b     ---------  ---1 ----
1101bbbbb  setbnc    D,#b     ---------  ---1 ----
1110bbbbb  setbz     D,#b     ---------  ---1 ----
1111bbbbb  setbnz    D,#b     ---------  ---1 ----
-------------------------------------------------------------------------------------------------------------------------------
-source--  opcode-   operand  --dest---  zcri cccc   opcode-   operand  --dest---  zcri cccc   opcode-   operand  --dest---  zcri cccc
-------------------------------------------------------------------------------------------------------------------------------
P2_Instructions_20131002a.spin
Note: replaced with "a" version.
D/#n --nn----- --#1 ---- 011101000 setpera D/#n nnnnnnnnn --#1 ---- 011101001 setsera D/#n nnnnnnnnn --#1 ---- 011101010 setperb D/#n nnnnnnnnn --#1 ---- 011101011 setserb D/#n nnnnnnnnn --#1 ---- 011101100 setvid D/#n nnnnnnnnn --#1 ---- 011101101 setvidy D/#n nnnnnnnnn --#1 ---- 011101110 setvidi D/#n nnnnnnnnn --#1 ---- 011101111 setvidq D/#n nnnnnnnnn --#1 ---- 011110000 setctra D/#n nnnnnnnnn --#1 ---- 011110001 setwava D/#n nnnnnnnnn --#1 ---- 011110010 setfrqa D/#n nnnnnnnnn --#1 ---- 011110011 setphsa D/#n nnnnnnnnn --#1 ---- 011110100 addphsa D/#n nnnnnnnnn --#1 ---- 011110101 subphsa D/#n nnnnnnnnn --#1 ---- 011110110 synctra nnnnnnnnn --01 ---- 011110111 capctra nnnnnnnnn --01 ---- 011111000 setctrb D/#n nnnnnnnnn --#1 ---- 011111001 setwavb D/#n nnnnnnnnn --#1 ---- 011111010 setfrqb D/#n nnnnnnnnn --#1 ---- 011111011 setphsb D/#n nnnnnnnnn --#1 ---- 011111100 addphsb D/#n nnnnnnnnn --#1 ---- 011111101 subphsb D/#n nnnnnnnnn --#1 ---- 011111110 synctrb nnnnnnnnn --01 ---- 011111111 capctrb nnnnnnnnn --01 ---- ------------------------------------------------------------------------------------------------------------------------------- 1000bbbbb isob D,#b --------- ---1 ---- 1001bbbbb notb D,#b --------- ---1 ---- 1010bbbbb clrb D,#b --------- ---1 ---- 1011bbbbb setb D,#b --------- ---1 ---- 1100bbbbb setbc D,#b --------- ---1 ---- 1101bbbbb setbnc D,#b --------- ---1 ---- 1110bbbbb setbz D,#b --------- ---1 ---- 1111bbbbb setbnz D,#b --------- ---1 ---- ------------------------------------------------------------------------------------------------------------------------------- -source-- opcode- operand --dest--- zcri cccc opcode- operand --dest--- zcri cccc opcode- operand --dest--- zcri cccc -------------------------------------------------------------------------------------------------------------------------------P2_Instructions_20131002a.spinNote replaced with "a" version.
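For readers who want to poke at the listing, here is a small Python sketch of how the bit patterns in the leftmost (-source--) column could be matched against a 9-bit value. It is only an illustration under my own assumptions: that '0' and '1' are fixed bits, that '-' means "don't care", and that letters such as t or b mark bits of an embedded field. The ROWS list, match() and decode() are hypothetical names, and only four rows are copied from the listing.

```python
# Hypothetical decoder for a few rows copied from the listing.
# Assumption: in the -source-- column, '0'/'1' are fixed bits, '-' is
# "don't care", and any letter marks a bit of an embedded operand field.
ROWS = [
    ("01000tttt", "jmptask"),
    ("0100100--", "-spare-"),
    ("010011100", "serouta"),
    ("1011bbbbb", "setb"),
]

def match(pattern, value):
    """Return a dict of field values if value fits pattern, else None."""
    fields = {}
    for i, ch in enumerate(pattern):
        # Walk the pattern left to right, i.e. from the most significant bit.
        bit = (value >> (len(pattern) - 1 - i)) & 1
        if ch in "01":
            if bit != int(ch):
                return None
        elif ch != "-":
            fields[ch] = (fields.get(ch, 0) << 1) | bit
    return fields

def decode(value):
    """Return (mnemonic, fields) for the first matching row, else None."""
    for pattern, name in ROWS:
        fields = match(pattern, value)
        if fields is not None:
            return name, fields
    return None

if __name__ == "__main__":
    for v in (0b101100101, 0b010000011, 0b010010001):
        print(format(v, "09b"), decode(v))
```

A full decoder would also have to look at the zcri bits to tell apart pairs like rdquad/rdquadc, which this sketch ignores.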
The biggest difference I see here is that the receiver is now tied to the transmitter in a way that the UART isn't.
Have you made those changes to the Debugger?
Master only does sound like a decent compromise. After all, the main reason for wanting SPI is to connect small pin count peripheral chips. I believe raw USB or Ethernet has been shot down on the basis of needing complex framing in the hardware, right?
C.W.
50Mbaud is very attractive for communicating with USB hosts with an FTDI - thanks for pointing it out, I had not run across that before.
Real world limits can be a pain.
I totally agree that it is not worth a lot of pain to reach clkfreq/1 on serial; it looks like my thought experiments underestimated the level of effort required to implement and test it.
/2 would still be nice if the level of effort required was reasonable, however your examples below suggest it too may require too much effort.
Fast SPI master is far more important than fast slave, so if it is easier to only support fast master, it really is a good compromise.
I have to strongly disagree with that assumption.
The Prop, and the Prop II especially, can make excellent peripheral chips for larger systems, say an ARM board running Linux. Such systems are very common nowadays.
As such, SPI slave is very important, as it is far easier to operate the host end as a master. Nobody is going to want to start writing Linux device drivers to use the SPI interface as a slave device.
Again, I have to say: slave is important. Consider that you want to connect a Prop to an ARM machine or some bigger system. The Prop is the peripheral chip. Fast SPI slave would be a great way to sell the Prop as a peripheral device.
If it's just not practical to do then so be it. I just think SLAVE can be more important than MASTER.
It's not ideal, but SPI slave can still be done entirely in software. At the end of the day, I'd rather have fast hardware SPI master support than no hardware SPI support at all. Also, many chips (including ARM, Microchip, AVR) support TTL UARTs that should be able to work with the existing P2 UART functionality.
You are right, I had my prop-centric hat on. Fast slave is also very important, as fast as is consistent with not requiring too great an effort.
Just for yucks, let's figure out the fastest possible externally clocked software SPI slave code - I'll start here:
getspibyte
        ' wait for rising spiclk, sample miso
        waitpeq clkpin, clkpin
        shr     pinsa,#miso wc,nr
        waitpne clkpin, clkpin
        rlc     spibyte, #1
        ' 7 more copies of above, with loop unrolled we avoid loop overhead
getspibyte_ret  ret

Prop2_Docs.txt does not give the minimum number of cycles for waitpeq.
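For anyone who wants to check the logic of a receive loop like this without hardware, here is a rough Python model. It is not P2 code and says nothing about timing: it just walks a list of (clk, data) pin readings, takes one data bit each time the clock is seen high, waits for the clock to go low again, and rotates the bit into the byte MSB first. The names receive_byte and clock_out are made up for this sketch.

```python
def clock_out(byte, oversample=1):
    """Build (clk, data) readings for one byte, MSB first.

    Each bit is presented while clk is low and held while clk is high.
    oversample repeats every reading to mimic a receiver polling faster
    than the clock toggles.
    """
    samples = []
    for i in range(7, -1, -1):
        bit = (byte >> i) & 1
        samples += [(0, bit)] * oversample + [(1, bit)] * oversample
    return samples

def receive_byte(samples):
    """Return the byte shifted in over 8 clock-high events, or None."""
    value = 0
    bits = 0
    waiting_for_high = True
    for clk, data in samples:
        if waiting_for_high and clk:
            # Clock seen high (the waitpeq step): take the data bit and
            # rotate it into the low end of the byte (the rlc step).
            value = ((value << 1) | (data & 1)) & 0xFF
            bits += 1
            if bits == 8:
                return value
            waiting_for_high = False
        elif not waiting_for_high and not clk:
            # Clock back low (the waitpne step): ready for the next bit.
            waiting_for_high = True
    return None

if __name__ == "__main__":
    print(hex(receive_byte(clock_out(0xA5, oversample=3))))
```

The model merges the sample and rotate steps into one, which is fine for checking bit order but hides exactly the instruction-count question that limits the real loop.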
1 cycle wait*: 4 clocks per bit in ideal circumstances (which are not likely), and invalid results if spiclk is too fast.
2 cycle wait*: 6 clocks per bit
general formula: W*2+2 clocks minimum for one SPI clock, where W is the minimum wait* time
external clock software receive:
26.7Mbps @ 160MHz - assuming W=2 is feasible
20.0Mbps @ 160MHz - assuming W=3 is feasible
16.0Mbps @ 160MHz - assuming W=4 is feasible
13.3Mbps @ 160MHz - assuming W=5
external clock hardware receive:
80.0Mbps @ 160MHz, if clkfreq/2 is possible
53.3Mbps @ 160MHz, at clkfreq/3
40.0Mbps @ 160MHz, at clkfreq/4
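The figures above can be reproduced with a few lines of Python, in case anyone wants to try other clock frequencies or wait times. This is just the W*2+2 formula and the clkfreq/N division from this post; the function names are my own, and W remains an unknown until the real minimum wait* time is documented.

```python
def software_mbps(clk_mhz, wait_clocks):
    """Software receive rate: one bit costs 2*W + 2 system clocks."""
    clocks_per_bit = 2 * wait_clocks + 2
    return clk_mhz / clocks_per_bit

def hardware_mbps(clk_mhz, divisor):
    """Hardware receive rate if the shifter can run at clkfreq/divisor."""
    return clk_mhz / divisor

if __name__ == "__main__":
    clk = 160  # MHz, the clock frequency assumed in the post
    for w in (2, 3, 4, 5):
        print(f"software, W={w}: {software_mbps(clk, w):.1f} Mbps")
    for n in (2, 3, 4):
        print(f"hardware, clkfreq/{n}: {hardware_mbps(clk, n):.1f} Mbps")
```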
Conclusions:
- hardware externally clocked SPI slave is much faster than software based externally clocked SPI slave
- hardware SPI slave can run as a task in a cog
- software SPI slave needs a whole cog
Ok heater, you got me - much nicer to be a fast hardware peripheral to an external processor (than a slow one)
There are many systems out there that can act as MASTER to any slave peripherals. Prop II as a SLAVE peripheral would be perfect. It's not always possible to create that software on the MASTER end.
Two examples:
1) Any Linux based system. That's a lot. Linux has no SLAVE side driver. I am certainly not up to creating one. It would be great to be able to attach a Prop II as a peripheral to a Beagle Bone, Raspberry Pi and many others.
2) The Espruino has SPI MASTER support for attaching peripheral chips. Again, does anyone want to create that SLAVE side driver?
Philosophically the Prop is the slave in these situations so it makes sense to have SPI SLAVE support.
Anyway as I said, if HOST only is easier and SLAVE is impractical, so be it.
We have to stop messing with the PII design at some point:)
We need to keep that in mind.
I agree that SPI slave opens up some really nice use cases, but at some point we are going to ask for one thing too many and end up getting less than we could have had...
I can see Chip saying he has Slave SPI implemented and then someone will chime in with a request for yet another protocol to be supported and...
C.W.