Prop123_A9_Prop2_v7z Intermediate Release - 25 March 2016 - USB, improved serial

cgracey · 2016-04-03 09:43

ozpropdev wrote: »

cgracey wrote: »

I see two simple solutions:

1) Document current behavior and live with it - don't poll IN until 5 clocks after PINACK.

What do you think?

Thinking about it further the most common combination of instructions is likely to be a PINACK followed by a PINSETY.
As PINSETY takes 10 clocks, that more than covers the 5-clock delay of PINACK.
Just document the delay and we all should "smartpin" happily ever after.

Done! I think that's the best approach, too. I'm sure glad that you noticed this!

cgracey · 2016-04-03 10:14

dMajo wrote: »

cgracey wrote: »

jmg wrote: »

dMajo wrote: »

Since the smart cell has control/relations with adjacent pins +/-3, perhaps a hardware handshake for serial receive and transmission (RTS, CTS) would be useful, even if it is implemented on fixed pins (not user selectable).

I agree. With the pin cells light on buffering to keep the logic down, it is not clear whether SW handshake alone is enough to keep up with the highest data rates.

Most of the high-end UARTs have handshake included. Many do the full old modem set, but it is probably enough to do a single wire in each direction in HW, and slower status signaling can be done in SW.

Each smart pin reads inputs from relative pins -3..+3, but can only output on its own pin. So, to support hardware handshaking, it would require two pins for the transmit case. This may be a little much, at the moment. For what it's worth, though, could you please post a link to a document that explains how simple one-wire-each-direction handshaking is used by some modern ICs?

@cgracey,

I understand the +/-3 detail as real pin inputs to each smart cell. That means 7 inputs if you count 0 as its own pin. All real pin states. Right?
I also understand that each smart cell has only one output, its own pin, because there are no paths/routes to adjacent pins, but I seem to remember a discussion of daisy-chaining smart cells by selecting an adjacent cell's output as the input for the current cell.
Is this done through the +/-3 real pin state, forcing the adjacent cell to output to the real world (and potentially wasting a pin), or is there an internal path to the adjacent cell's internal output, the one that is eventually routed to the external pin?

Let me explain better.
If smart pin 1 is set as an input and can't output on pin 2, could pin 2's smart cell take the internal output of smart cell 1 and output it on its own pin (2)?
If not, is there an application reason why you chose a 7-input span (+3..0..-3) for each cell? Perhaps +2..0..-2, so 5 inputs, would be enough, and the 2 freed channels could be wired to the internal outputs of the -1 and +1 smart cells, regardless of whether they are routed to an external pin. I think it takes the same logic, as the number of paths is the same; only the source changes for two routes.
In this way smart cells could be daisy-chained internally (without the need to output to the real world) and more complex function blocks could be obtained, including future ones that can't be foreseen today.
Perhaps in this way the async serial receivers/transmitters could route HW handshake to adjacent pins?

I hope that my English is clear enough; I didn't mean to bother you with such a long post.

Right now, you can't input another pin's output into a smart pin. There are potential "timing loops" that can result from crossing inputs and outputs. More than that, though, I don't think that useful functions could be realized by chaining the current smart pins together. The reason is that smart pin modes are much more like complete meals than collections of ingredients.

I tried to make a flexible programmable mode in the first smart pins release, but it was actually quite narrow in scope. To make really flexible, chainable pins, we'd need to develop a system of sub-circuits that could be configured in many different ways. This would be more involved than we have time for right now. I would love to know what the ingredients need to be, though.

evanh · 2016-04-03 11:55

Chip,
I've just been pondering using the ADC as a filter for a noisy digital input. What's the planned input config for selecting the ADC bitstream? The ADC is not documented as part of the A/B input selectors and doesn't appear to have encoding space spare for it.

EDIT: Maybe the encoding for "Inverted Cog OUT"?

evanh · 2016-04-03 12:27

On the topic of filtering, I presume a small counter exists for those four options for 3-of-3 input filtering? On that note, instead of the small group of three samples making a result, I'm thinking accumulate every true over the entire duration.

Ie: While the input combo is false the counter is reset. While the input combo is true the counter is incremented. While the terminal count is not reached the result is false. When the terminal count is reached then the count halts and the result becomes true.
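
For what it's worth, here is a minimal Python sketch of that counter behaviour. It is only a behavioural model, not Verilog, and the names `SaturatingFilter`, `terminal_count` and `run_filter` are made up for illustration.

```python
class SaturatingFilter:
    """Hypothetical model: the result goes true only after the input has
    been true for `terminal_count` consecutive clocks."""

    def __init__(self, terminal_count):
        self.terminal_count = terminal_count
        self.count = 0

    def clock(self, sample):
        """Advance one clock with the given input and return the filtered result."""
        if not sample:
            self.count = 0                       # input false: counter is reset
        elif self.count < self.terminal_count:
            self.count += 1                      # input true: count up, halt at terminal count
        return self.count == self.terminal_count


def run_filter(samples, terminal_count):
    """Feed a sequence of 0/1 samples through a fresh filter."""
    f = SaturatingFilter(terminal_count)
    return [f.clock(bool(s)) for s in samples]


if __name__ == "__main__":
    print(run_filter([1, 1, 0, 1, 1, 1, 1, 0], terminal_count=3))
```

With a terminal count of 3, a single false sample restarts the count, so the result only goes true on the third consecutive true sample and drops again on the first false one.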

evanh · 2016-04-03 12:33

It can still just be the four config options: 100 = 3-of-3 clocks, 101 = 24-of-24 clocks, 110 = 192-of-192 clocks, 111 = 1536-of-1536 clocks. Or, a power of two in clocks will presumably be less logic to implement. I guess 4, 32, 256 and 2048 would be the nearest.

Or maybe, to achieve the desired minimum 3 samples, combine both with two historic counts plus the live sample. Next up would then be 16 on the counter + 1 live makes for 17 samples. And 128+1=129 samples, and 1024+1=1025 samples.

[Lots of edits for detail, sorry]

Rayman · 2016-04-03 20:24

I just put together (I think) a USB sniffer circuit, in case I get time to play with USB...

Used a board from my Smorgasboard (anybody remember that?). It has two USB/PS2 connectors on it; I installed USB jacks on them, jumpered them together, and am sending D+/D- and GND over to the P2 with servo wire, to P0 and P1.

Did just discover one flaw in that the default P2 boot code seems to use P0 or P1, making the USB device go berserk. But I was able to move that over to the other side, so I'm now using P32 and P33.

So far, I've plugged in a wireless mouse and it still works with all the jumper wires, so may soon be able to play with this...

cgracey · 2016-04-03 20:45

Rayman wrote: »

I just put together (I think) a USB sniffer circuit, in case I get time to play with USB...

Used a board from my Smorgasboard (anybody remember that?). It has two USB/PS2 connectors on it; I installed USB jacks on them, jumpered them together, and am sending D+/D- and GND over to the P2 with servo wire, to P0 and P1.

Did just discover one flaw in that the default P2 boot code seems to use P0 or P1, making the USB device go berserk. But I was able to move that over to the other side, so I'm now using P32 and P33.

So far, I've plugged in a wireless mouse and it still works with all the jumper wires, so may soon be able to play with this...

Great!

Tubular · 2016-04-03 21:47

Ha, the smorgasboard lives on

Publison · 2016-04-03 22:59

Tubular wrote: »

Ha, the smorgasboard lives on

And thank you very much for that endeavour. That was a cool panel!

Rayman · 2016-04-03 23:37

Man, take a little break and you get all lost here...
Seems pinsetm and pinsetx are all different now.
Trying to get the async serial modes working again...

Rayman · 2016-04-03 23:55

Ok, think I have serial working again.
But, my string functions aren't working yet.
Did anything change with "##@" syntax?

Doesn't seem to be any mention of ##@ in the docs...

Rayman · 2016-04-04 00:42

Ok, that wasn't the problem.

Problem seems to be with sending multiple sequential bytes in async serial mode.

It used to need about a 300 clock wait between bytes. Now, it looks like it needs a much bigger wait between bytes.

send_char
                pinsety tx_char,#TX_PIN         'send character
                waitx   ##120000                'wait between bytes (was 300)
                ret

Sure I'm missing something new, but haven't found it yet...

jmg · 2016-04-04 01:07

Rayman wrote: »

It used to need about a 300 clock wait between bytes. Now, it looks like it needs a much bigger wait between bytes.

Did you check the baud rate?

Ozpropdev had streaming working in Sync mode, and IIRC there is a Buffer-ready flag that allows packed Txmits.

ozpropdev · 2016-04-04 01:09

Rayman
See here

Rayman · 2016-04-04 01:13

I think I have all the pinsets done correctly.
Not using NCO baud, but regular one...

Here are the settings:

pm_tx          'long    %1_11_11101_0000_0000_00_0_0000000000000        'async tx long mode, dir high
                long    %0000_0000_00_0_0000000000000_01_11110_1        'async tx byte mode, dir high
pm_rx          'long    %1_10_11111_0000_0111_00_0_0000000000000        'async rx long mode, dir low, inputs pin 0
                long    %0000_0000_00_0_0000000000000_00_11111_1        'async rx byte mode, dir low, inputs pin 0
bitper          long    (SYS_CLK/BAUD_RATE)<<16+7                       'number of clocks per bit period, 3..65536
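
As a cross-check of the `bitper` expression, here is a small Python sketch of the same packing. It assumes an 80 MHz system clock (that figure is not stated with these settings) and that the assembler applies `<<` before `+`, as Spin's precedence rules do. Python does the opposite, so the parentheses are explicit. The `+7` is copied as-is, since the settings above do not say what the low field means.

```python
SYS_CLK = 80_000_000    # assumption: 80 MHz system clock
LOW_FIELD = 7           # the "+7" from the settings, copied without interpretation


def bitper(sys_clk, baud_rate, low_field=LOW_FIELD):
    """Pack clocks-per-bit into bits 31..16 and add `low_field` in the low bits."""
    clocks_per_bit = sys_clk // baud_rate        # integer divide, like an assembler constant
    if not 3 <= clocks_per_bit <= 0xFFFF:
        # this sketch only handles values that fit directly in 16 bits
        raise ValueError("clocks per bit out of range: %d" % clocks_per_bit)
    return (clocks_per_bit << 16) + low_field    # explicit parentheses: Python's + binds tighter than <<


if __name__ == "__main__":
    for baud in (3_000_000, 115_200):
        print(baud, hex(bitper(SYS_CLK, baud)))
```

Under those assumptions, 3M baud gives 26 clocks per bit and 115200 baud gives 694.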

Rayman · 2016-04-04 01:21

Ok, I think I see the problem now... (me)

I dropped the baud from 3M to 115200 for testing.
Obviously (now), you need to wait longer before sending bytes to the transmitter...
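
To put rough numbers on that, here is a short Python sketch of the time one byte occupies on the wire, in system clocks. It assumes an 80 MHz clock and a 10-bit frame (1 START, 8 data, 1 STOP). Neither is stated with the code above, and `clocks_per_byte` is just an illustrative helper.

```python
SYS_CLK = 80_000_000     # assumption: 80 MHz system clock
BITS_PER_FRAME = 10      # assumption: 1 START + 8 data + 1 STOP


def clocks_per_byte(baud_rate, sys_clk=SYS_CLK, bits=BITS_PER_FRAME):
    """Smallest whole number of system clocks that covers one serial frame."""
    return -(-bits * sys_clk // baud_rate)       # ceiling division on integers


if __name__ == "__main__":
    print(clocks_per_byte(3_000_000))   # 267
    print(clocks_per_byte(115_200))     # 6945
```

If those assumptions hold, the old wait of about 300 clocks just covered one frame at 3M baud, while 115200 baud needs roughly 7000 clocks per byte, so a fixed `waitx` has to be rescaled whenever the baud changes.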

Maybe this exists already, but it would be nice if there was some wait or other instruction that could be used after pinsety to wait until the transmit buffer is ready. That way, the code wouldn't depend on the baud...

ozpropdev · 2016-04-04 01:26

Rayman
Use PINGETZ to get the transmitter's busy flag (see link above)

jmg · 2016-04-04 01:29

Rayman wrote: »

Maybe this exists already, but it would be nice if there was some wait or other instruction that could be used after pinsety to wait until the transmit buffer is ready. That way, the code wouldn't depend on the baud...

I think Chip has mentioned there are ways to check each of these
* Buffer is Ready for another byte (typical load-next case)
and
* TxShifting is done (useful for direction-change/half duplex RS485 etc)

I found this in another thread, about feeding the Tx buffer & handshake :
I guess Async is the same flow sequence.

cgracey wrote:

To get the current release's synchronous output mode working, you must do an initial PINSETY before releasing reset (DIR high). Once DIR is high, you can immediately give it another word with PINSETY. After that, do another PINSETY after each IN raise and PINACK. That will keep the output buffer full at all times.

ozpropdev · 2016-04-04 01:59

From the docs for async tx.

PINGETZ can always be used to read the ‘busy’ flag into D[31]/C.

jmg · 2016-04-04 02:33

ozpropdev wrote: »
From the docs for async tx.
PINGETZ can always be used to read the ‘busy’ flag into D[31]/C.

Thanks. More complete docs...

Here is the internal state sequence:

1) Wait for an output word to be buffered via PINSETY, then set the ‘buffer-full’ and ‘busy’ flags.
2) Move the word into the shifter, clear the ‘buffer-full’ flag and raise IN.
3) Output a high for one bit period (guarantees a whole STOP bit).
4) Output a low for one bit period (the START bit).
5) Output the LSB of the shifter for one bit period, shift right, and repeat until all data bits are sent.
6) Output a high (begins the STOP bit).
7) If the ‘buffer-full’ flag is set due to an intervening PINSETY, loop to (2). Otherwise, clear the ‘busy’ flag and loop to (1).

PINGETZ can always be used to read the ‘busy’ flag into D[31]/C.
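
The transmit sequence above can be sketched in Python as a list of line levels, one entry per bit period. This is only a reading of the documented steps, not the actual logic. `tx_levels` is a made-up name, the buffer is assumed to be refilled in time for every word, and the final high (step 6) is shown as a single entry because its duration is not given.

```python
def tx_levels(words, data_bits=8):
    """Return the TX line level for each bit period while sending `words` back to back."""
    levels = []
    for word in words:
        levels.append(1)                 # step 3: one full bit period high (whole STOP bit)
        levels.append(0)                 # step 4: START bit
        for _ in range(data_bits):       # step 5: LSB first, shifting right
            levels.append(word & 1)
            word >>= 1
    levels.append(1)                     # step 6: line goes high again after the last word
    return levels


if __name__ == "__main__":
    print(tx_levels([0x55]))
```

Note that in this reading the guaranteed STOP period is sent ahead of each word rather than after it.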

So that 'busy' flag is almost TxDone.
However, the timing order chosen above does not work too well with half duplex, as ideally you wait for the STOP bit to finish before you change direction from Tx to Rx.
With a leading-stop design, doing that is not so simple, and another baud-dependent delay would be needed.

I wonder if that can be fixed ?

Tubular · 2016-04-04 02:42

Would that TX to RX transition cause any problem though? In effect you're listening for RX data earlier

jmg · 2016-04-04 02:53

Tubular wrote: »

Would that TX to RX transition cause any problem though? In effect you're listening for RX data earlier

That depends on the driver topology I guess, it becomes more of a lottery.
If the physical layer needs TX=Hi drive, then it would be an issue to remove Tx drive early.
If it can float high, then everyone seeing a legal stop bit relies on no Rx being too quick.
If Rx triggers on the mid-stop, that could be an issue, and could need a turn-around delay in the replying slave.

jmg · 2016-04-04 03:19

and I see this description on RX

Here is the internal state sequence:

1) Wait for a high (idle state).
2) Wait for a low (START bit edge).
3) Delay for 1.5 bit periods.
4) Right-shift the A input into the shifter and delay for one bit period, repeat until all data bits are received.
5) Capture the shifter into the Z register and raise IN.
6) Loop to (1).

PINGETZ is used to read the received words.
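
Here is a minimal Python model of the receive sequence as described, working on a list of per-clock line samples. The function name `rx_words` and its parameters are illustrative only, and the model assumes a whole number of clocks per bit.

```python
def rx_words(samples, clocks_per_bit, data_bits=8):
    """Decode words from per-clock 0/1 samples by following the documented steps."""
    words = []
    i, n = 0, len(samples)
    while True:
        while i < n and samples[i] == 0:     # step 1: wait for a high (idle)
            i += 1
        while i < n and samples[i] == 1:     # step 2: wait for a low (START edge)
            i += 1
        if i >= n:
            return words                     # no further START edge in the samples
        i += (3 * clocks_per_bit) // 2       # step 3: 1.5 bit periods, to mid first data bit
        shifter = 0
        for _ in range(data_bits):           # step 4: sample, shift right, wait one bit
            if i >= n:
                return words                 # samples ran out in the middle of a word
            shifter = (shifter >> 1) | (samples[i] << (data_bits - 1))
            i += clocks_per_bit
        words.append(shifter)                # step 5: capture (Z register in the real pin)


if __name__ == "__main__":
    bits = [1, 0, 1, 0, 1, 0, 1, 0]          # 0x55, LSB first
    line = [1] * 4 + [0] * 4                 # idle, then START, at 4 clocks per bit
    for b in bits:
        line += [b] * 4
    line += [1] * 4                          # STOP
    print([hex(w) for w in rx_words(line, clocks_per_bit=4)])
```

Each data bit is taken from a single sample near the middle of the bit, and the START bit is only looked at on its falling edge.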

However, that has no noise rejection: a very narrow start pulse will start a full byte Rx.
More usual is to confirm that the START bit is still low at 0.5 bit times, and otherwise ignore it as noise.
There is also no mention above of any stop-bit check; my reading has the IN flag raised at the middle of the last Rx bit?

Some systems send a Break Character as frame signaling, and most UARTS can flag that.
It is not clear how the above sequence can detect a break ?

cgracey · 2016-04-04 04:58

I need to get the next full update out to unify all docs and code again.

I think adding the start bit re-check at 0.5 bits in will take a single Verilog statement. You can also enable the 3-of-3 filtering for noise immunity.

I will see about break detection. That is probably simple, too. Break= 0 received and stop bit low.
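
A rough Python sketch of how those two checks might sit in a receiver model follows. It is not the Verilog: `receive_frame` and its return values are hypothetical, and the sample list is assumed to be long enough to cover a whole frame.

```python
def receive_frame(samples, start, clocks_per_bit, data_bits=8):
    """Classify one frame, where samples[start] is the first low sample of a
    candidate START bit. Returns ("noise", None), ("break", None) or ("data", word)."""
    mid_start = start + clocks_per_bit // 2
    if samples[mid_start] != 0:
        return ("noise", None)               # START no longer low at 0.5 bits in: ignore it
    word = 0
    for k in range(data_bits):
        # sample each data bit mid-bit, LSB first
        word |= samples[mid_start + (k + 1) * clocks_per_bit] << k
    stop = samples[mid_start + (data_bits + 1) * clocks_per_bit]
    if word == 0 and stop == 0:
        return ("break", None)               # break = 0 received and stop bit low
    return ("data", word)


if __name__ == "__main__":
    glitch = [1] * 4 + [0] + [1] * 60        # one-clock low pulse
    held_low = [1] * 4 + [0] * 60            # line held low
    print(receive_frame(glitch, 4, 4))
    print(receive_frame(held_low, 4, 4))
```

With 4 clocks per bit, the one-clock glitch is rejected at the half-bit re-check, and a line held low through the stop bit is reported as a break.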

I'm thinking about shorter pin messaging to speed up USB, as we're barely able to respond in 7.5 bits now, at 80MHz.

rogloh · 2016-04-04 05:14

cgracey wrote: »

I'm thinking about shorter pin messaging to speed up USB, as we're barely able to respond in 7.5 bits now, at 80MHz.

I was wondering about this recently too given some of the latencies mentioned with PINGETZ etc. We would need to understand the total overhead of waiting for a received USB character, acknowledging and reading it and also checking for possible errors or EOPs, copying the data to hub, incrementing a pointer, decrementing and testing a counter for oversize and doing the CRC via software LUT to see how it fits in the loop time available per byte. That is just for the streaming loop, there probably will be some other bottlenecks or pinch points between packets no doubt that could push the code hard too.

jmg · 2016-04-04 05:32

cgracey wrote: »

I need to get the next full update out to unify all docs and code again.

I think adding the start bit re-check at 0.5 bits in will take a single Verilog statement. You can also enable the 3-of-3 filtering for noise immunity.

I will see about break detection. That is probably simple, too. Break= 0 received and stop bit low.

Good, that will make it more robust, and flexible.
Is that 3-of-3 SysCLK, or BaudCLK, based ?
Older UARTs with fixed /16 dividers used 2-of-3 sampling at 16x baud, but that's not as easy on faster baud designs.

cgracey wrote: »

I'm thinking about shorter pin messaging to speed up USB, as we're barely able to respond in 7.5 bits now, at 80MHz.

Sounds worthwhile, but it should be possible to test at LS USB speeds, with what is there now ?

cgracey · 2016-04-04 05:51

jmg wrote: »

cgracey wrote: »

I need to get the next full update out to unify all docs and code again.

I think adding the start bit re-check at 0.5 bits in will take a single Verilog statement. You can also enable the 3-of-3 filtering for noise immunity.

I will see about break detection. That is probably simple, too. Break= 0 received and stop bit low.

Good, that will make it more robust, and flexible.
Is that 3-of-3 SysCLK, or BaudCLK, based ?
Older UARTs with fixed /16 dividers used 2-of-3 sampling at 16x baud, but that's not as easy on faster baud designs.

cgracey wrote: »

I'm thinking about shorter pin messaging to speed up USB, as we're barely able to respond in 7.5 bits now, at 80MHz.

Sounds worthwhile, but it should be possible to test at LS USB speeds, with what is there now ?

Yes. That would work.

evanh · 2016-04-04 10:21

jmg wrote: »

Is that 3-of-3 SysCLK, or BaudCLK, based ?

SysClk. EDIT: Well, the 3-of-3 filter has its own selectable divider but for this case the divider is bypassed.
