Here's a method that uses two pins in async mode: one pin is set up with a clock pattern at double the data rate, and when the data is written, the clock pattern is written to the clock pin at the same time.
Of course, the pattern can be changed to suit, as can the clock polarity.
EDIT: this works because it doesn't matter that there is a redundant "start bit" in the data, since it is the clock that validates the data.
55 PIN 40,000,000 18 -bit TXD --- configure clock pin to double rate and 18 data
$15554 ' TX 1- COG! --- set the 8-bit clock pattern
54 PIN 20,000,000 TXD --- setup the data pin and effective clock rate
$55 TX --- send a byte
TX is a code word
clkpat long $15554 ' default clock pattern
' TX ( data -- )
_TX mov x,pinreg ' assume clock pin is pin+1
add x,#1
wypin a,pinreg ' write data
wypin clkpat,x ' write clock pattern to pin+1
jmp #DROP
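The clock-pattern trick can be checked numerically. Here's a quick Python sanity check (my own sketch, not part of the code above), assuming the clock pin runs at twice the data rate, both smartpins start together, and the async framing puts one start bit (0) ahead of the LSB-first pattern:

```python
# Expand the $15554 clock pattern as the async transmitter would shift it:
# one start bit (0), then the 17 pattern bits LSB first, at 2x the data rate.
clkpat = 0x15554
clk_bits = [0] + [(clkpat >> i) & 1 for i in range(17)]  # 18 clock-pin bit periods

# Count rising edges (0 -> 1) - one per data bit is expected.
rising = sum(1 for a, b in zip(clk_bits, clk_bits[1:]) if a == 0 and b == 1)
print(rising)  # 8 rising edges, one for each of the 8 data bits

# Each data-pin bit period spans two clock-pin bit periods, so every rising
# edge lands inside a data-bit cell. The start-bit cell (clock bits 0-1)
# carries no edge, which is why the redundant start bit does no harm.
```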
Larry,
Your WAITX's aren't long enough. 16 x 256 = 4096 sysclocks per byte shifted.
I'll look at it again in the morning, but I did have ##5000 delays with no difference in the overall outcome. Did you try it?
So, having looked at this again after some well-earned sleep: my stated delay of ##5000 had actually delayed the first and second bytes sent, and the last byte was never clocked out. I didn't see any difference between delays of ##3000 and ##4096, but I will update my snippet to use a delay of 4096.
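For reference, the 4096 figure is just the shift timing: two clock transitions per bit, so 16 per byte, times the 256-sysclock base period used in that test. A throwaway Python helper (`sysclocks_per_byte` is my own name, not anything in the code) makes the arithmetic explicit:

```python
def sysclocks_per_byte(base_period, bits_per_byte=8):
    """Sysclocks needed to shift one byte: each bit needs two clock
    transitions, and each transition takes one base period."""
    return bits_per_byte * 2 * base_period

print(sysclocks_per_byte(256))  # 4096, the quoted WAITX figure
print(sysclocks_per_byte(10))   # 160 for a base period of 10
```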
Unlike the async serial mode, with sync serial you can't examine whether shifting has finished or not.
Therefore, knowing the timing of buffer empty is important for deciding on DIR toggling, or maybe X[5] toggling. This is best done with interrupts. I'm a little worried there might be race conditions.
Also, start-stop mode (X[5]=1) is definitely doing something weird; I think it's flawed. It may be limited to single transfers.
This hasn't shown up in earlier discussions because no one is using the smartpins for this. Most cases are bit-bashed; Chip used the streamer instead.
I have used the smartpins before, but only at sysclk/10 or faster. At such speeds the buffer is always empty, so it doesn't need any checks.
Well, I've managed to get something going. It's using continuous mode and is split into three subroutines, kind of simulating what would happen if using interrupts.
EDIT: Shifted the init code into its own subroutine.
EDIT2: Doesn't need to know if second byte is valid, removed the check.
EDIT3: Reversed an earlier optimisation concerning when the clock is started.
EDIT4: Modified to 16-bit shifter size instead of 8-bit. Provides more processing leeway.
con
cksout = 20 ' Pin P20
txsout = 21 ' Pin P21
dat org
call #ssend_init
loc ptra, #test_dat
call #ssend_dat
.loop
call #ssend_next
jmp #.loop
'------------------------------------------------
ssend_init
dirl #cksout
wrpin spi_clock_mode, #cksout 'Set pin P20 as transition-output mode
wxpin #10, #cksout 'Set base period of each step
dirh #cksout 'Enable P20
dirl #txsout
wrpin spi_tx_mode, #txsout 'setup sync tx mode for pin 21
wxpin #%0_01111, #txsout 'setup continuous mode, X[5] = 0, 16 bits at a time
_ret_ dirh #txsout 'enable smartpin
'SPI clock mode CPHA=0, CPOL=1
'spi_tx_mode long %0000_0111_000_0000000000000_01_11100_0 'B pin = current pin-1
'spi_clock_mode long %0000_0000_000_0000001000000_01_00101_0 'Transition-mode setup
'SPI clock mode CPHA=0, CPOL=0
spi_tx_mode long %0000_1111_000_0000000000000_01_11100_0 'B pin = current pin-1
spi_clock_mode long %0000_0000_000_0000000000000_01_00101_0 'Transition-mode setup
'------------------------------------------------
ssend_dat
rdword pa, ptra++ wz 'get length of data to send, this controls the exact number of bytes sent
if_nz dirl #txsout 'prep for continuous mode
if_nz rdlong pb, ptra++ 'first data to transmit
if_nz wypin pb, #txsout 'feed first shortword (16 bits) to smartpin shifter
if_nz dirh #txsout 'shifter primed
if_nz shr pb, #16
if_nz wypin pb, #txsout 'feed second shortword (16 bits) to smartpin buffer
if_nz shl pa, #4 'x16 for number of SPI clock steps per byte (8 bits)
if_nz wypin pa, #cksout 'start the SPI clock
ret wcz
'------------------------------------------------
ssend_next
rdword pb, ptra++ '16-bit data to transmit
.waitp
testp #txsout wc 'check for buffer empty
if_c wypin pb, #txsout 'feed the smartpin buffer
if_c ret wcz
jmp #.waitp
'------------------------------------------------
orgh
test_dat word test_dat_e - test_dat_s
test_dat_s byte 1,$85,4,$58,0,"one way bridge"
test_dat_e
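The two magic numbers in the snippet can be derived rather than memorised. A hedged Python sketch (`sync_tx_x` and `clock_steps` are hypothetical helper names of mine; the assumed field layout is X[4:0] = word size minus 1 and X[5] = start-stop select, and the step count mirrors the `shl pa, #4`):

```python
def sync_tx_x(word_bits, start_stop=False):
    """Build the sync serial TX X value: X[4:0] = word size - 1,
    X[5] = 1 for start-stop mode, 0 for continuous."""
    assert 1 <= word_bits <= 32
    return (int(start_stop) << 5) | (word_bits - 1)

def clock_steps(num_bytes):
    """Transition-mode steps to clock out num_bytes: 8 bits x 2
    transitions per bit = 16 steps per byte (the shl pa, #4)."""
    return num_bytes << 4

print(bin(sync_tx_x(16)))  # 0b1111 -> the %0_01111 written with wxpin
print(clock_steps(20))     # 320 clock steps for a 20-byte payload
```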
While my experimental "mucking about" mode isn't using the smartpin sync serial, and instead uses async with 8 clocks for the data bits, it is nonetheless easy to work with and undemanding. So I can feed individual bytes one at a time, or transmit the whole buffer. While I have tested it at a 50 MHz clock, sending 512 bytes in 100 us, I am also checking out 100 and 150 MHz clocks and making some adjustments.
What I'm confused about is: what is the main problem with sync mode at present? Before smartpins were devised, we just wanted some super-simple way of clocking data in and out, something even the cheapest MCUs do well. We ended up with far more, and yet we still have trouble clocking data.
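For what it's worth, the 512-bytes-in-100us figure is consistent with async framing at a 50 MHz bit rate: each byte costs roughly ten bit periods (start + 8 data + stop). A quick Python check (assuming exactly 10 bits per frame, sent back to back):

```python
def async_burst_time_us(num_bytes, bit_rate_hz, bits_per_frame=10):
    """Time to send num_bytes as back-to-back async frames
    (start + 8 data + stop = 10 bit periods per byte assumed)."""
    return num_bytes * bits_per_frame / bit_rate_hz * 1e6

print(round(async_burst_time_us(512, 50_000_000), 1))  # 102.4 us, ~100 us
```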
Peter,
Jon is trying to learn the sync serial mode so he can write a tutorial on using it.
There is definitely something undocumented about how the start-stop mode works ... or doesn't work. It doesn't seem to raise IN in a consistent manner when the buffer empties. It's fine if you blindly write data at the correct rate, but that's not a general purpose use then.
Peter,
Jon is trying to learn the sync serial mode so he can write a tutorial on using it.
I've seen that too, but while there is a tutorial on "sync serial mode" which addresses a way to handle SPI, it is only one way. For instance, when discussing SPI in general sometimes it is easier just to bit-bash stuff and other times we want the highest performance. That would all be surely mentioned. But just because a smartpin mode has been labeled "async" shouldn't preclude it from being mentioned when handling SPI data. Anyway, I don't want to derail this thread any further. Thanks.
I decided not to reply, but then I got to thinking. The whole reason smartpins are being documented is so we know enough to use them. What do we use sync mode for? SPI, of course. If SPI isn't even being addressed, or at least tested, then the documentation is incomplete, which is what happens when someone documents a feature without taking its context and usage into account. This is a problem with most documentation in general.
Just my 2 cents worth on that matter.
I'm helping myself as much as I'm helping Jon. What I want to do, but probably will never finish, is make a logic schematic of each smartpin mode. To achieve that I first have to know the effect of each config bit of that mode.
Yes, I really appreciate your work on that diagram and understand the nature of the academic side of this documentation effort too.
Here's a demo of the effects of the clocking stages within the prop2 I/O. Using the above code and adjusting the SPI clock smartpin's "base period" you can see the effect this has on the phase relationship as the period is reduced.
This first one is an SPI clock at sysclock/2000. It looks as expected.
Second one is with SPI clock at sysclock/200. It doesn't look all that different but if you look carefully you'll see the data edges are ever so slightly lagging the clock falling edges.
Third one is with SPI clock at sysclock/40. Now you can see the data edges are moving away from the clock falling and towards the clock rising.
Fourth one, at sysclock/20, takes this further and now the data edges are half way between the clock edges.
And finally, at sysclock/10, the data edges are critically close to landing on the clock rising edges.
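That drift is consistent with a fixed output-registration latency: if the data path lags the clock path by a constant few sysclocks, the lag becomes a larger fraction of the SPI period as the divider shrinks. A rough Python model, assuming (purely hypothetically) a fixed lag of 5 sysclocks, reproduces the trend seen on the scope:

```python
FIXED_LAG = 5  # sysclocks - an assumed pipeline latency, not a measured value

for divider in (2000, 200, 40, 20, 10):
    half_period = divider // 2          # sysclocks between adjacent clock edges
    fraction = FIXED_LAG / half_period  # how far the data edge drifts past a clock edge
    print(f"sysclock/{divider}: data edge lands {fraction:.1%} of the way to the next clock edge")
```

With this assumed lag, at sysclock/20 the model puts the data edge half way between clock edges, and at sysclock/10 right on the following (rising) edge, matching the captures above.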
The scope time tags seem to not change? Does that mean you reduced the sysclk to keep the SPI data rate constant as the divide ratio reduced?
Yep, for the first three cases. By that point I was down to 2 MHz sysclock so changed tack and adjusted the scope for the final two.
EDIT: Ah, a point, the code I'm using is not the exact one posted above. I'm using my diagnostic wrapper code. With that I can specify a PLL clock mode.
Does that mean that at higher sysclk speeds and low dividers there will be more pin-transit-time considerations too, as have been seen in the HyperRAM code developments?
I guess SPI speeds above 50 MHz are uncommon, but they do occur.