I like, and have used, the <space> $20 character to autobaud for PS2 keyboards in order to connect them to the P1 using 1pin. (see obex for 1pin TV and Kbd)
Space char is good, and it's been used for the same purpose in other systems before.
See the added test example for $20 under the $3f one above.
$3f has better capture precision, and if terminals are the focus here, I like the idea of a visible symbol ("?") over a blank ( )
The reason the "AT" command sequence was used is because it permits the start bit to be timed, and verified by the next/first (LSB) bit.
If you wanted timing, it would have been smarter to have a 101 edge at the start of every byte, but RS232 never did do that...
Any of the above waveforms can be Autobauded; the (significant) downside of a single bit-time capture is that it lacks precision.
0x55 or "U" is another common Baud character, but there AutoBaud usually counts along the toggling bits, to get better timebase precision.
That isn't actually correct. If you get an alternating 0-1-0 (start, b0, b1...) then you know with reasonable certainty what baud is being used. If the sequence is a longer sequence, in particular the <space> $20, then you don't really know if you have received the start-0-0-0-0-0-1 or a slower baud 0-1 sequence.
The idea is that you time the start "0", shift 1/2 bit and continue sampling to see if the character conforms to what you are looking for. Ideally the following b0=1, so you may also time that, or the whole character, or a section of the character. But it is the start bit that is easiest to measure, provided bit0=1.
I have done this for the AT modem firmware without the aid of a UART for our NetComm and Apple modems. Previously I had done it on a micro to sync to ~53Kb specialised mini-computer serial, where I had to resync on each clock which was every 4.75 instructions.
That isn't actually correct. If you get an alternating 0-1-0 (start, b0, b1...) then you know with reasonable certainty what baud is being used. If the sequence is a longer sequence, in particular the <space> $20, then you don't really know if you have received the start-0-0-0-0-0-1 or a slower baud 0-1 sequence.
I'm not quite sure what point you are trying to make, as your single bit example will also fail, with the wrong byte sent.
Focusing back on the topic, which is P2 booting, the main downside of multi-edge characters is the P2 might exit reset part-way through, so that needs to be covered. I think a Twin Smart Pin Capture can do that and get good precision on one-char in.
Most Autobaud designs I've seen just sync once, and do not expect short term drift, but Chip has allowed for resync.
With Twin Smart Pin Capture, that should still be possible, but if there is also NCO TX, that may need more smart pins.
Although I'm not following P2 stuff at all, autobaud detection does interest me. Here's what I would do:
1. Decide on a fixed length for a data block, e.g. 64 bytes, 256 bytes, etc.
2. Start each data block with two or more NULs: $00 ... $00. This will give the autobaud detector nine low bits in a row to measure (start bit and 8 data bits -- stop bits can vary too much to be included in any measurement).
3. Within the block, if a zero byte is sent as data, precede it with a DLE ($10 $00).
4. Within the block, if $10 is sent as data, it's sent as two DLEs ($10 $10).
That way, anytime two or more NULs are received in a row, the device will know that it can't be data and must be the beginning of another data block. Moreover, it can resynchronize its baudrate clock to the new 9-bit low pulses on every new data block.
You can begin the transmission with as many zero bytes as you think you need to ensure initial synchronization. Or you can send a long BREAK to signal the beginning of a transmission, followed by the NULs, with even greater certainty.
-Phil
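A minimal host-side sketch of the NUL/DLE framing described above, in Python. The names `encode_block` and `decode_block` are hypothetical, the fixed block length from step 1 is not modelled, and treating an unescaped NUL inside the body as an error is an assumption of this sketch rather than part of the proposal.

```python
NUL = 0x00  # sync byte: nine low bit-times (start bit + 8 data bits)
DLE = 0x10  # escape byte

def encode_block(payload: bytes, sync_nuls: int = 2) -> bytes:
    """Prefix the block with sync NULs and escape NUL/DLE inside the data."""
    if sync_nuls < 2:
        raise ValueError("need at least two NULs to mark a block start")
    out = bytearray([NUL] * sync_nuls)
    for b in payload:
        if b in (NUL, DLE):
            out.append(DLE)      # $00 -> $10 $00, $10 -> $10 $10
        out.append(b)
    return bytes(out)

def decode_block(frame: bytes) -> bytes:
    """Strip the leading NULs and undo the DLE escaping."""
    i = 0
    while i < len(frame) and frame[i] == NUL:
        i += 1                   # skip the sync run
    out = bytearray()
    while i < len(frame):
        b = frame[i]
        if b == DLE:
            i += 1
            if i >= len(frame):
                raise ValueError("dangling DLE at end of block")
            b = frame[i]         # escaped byte is taken literally
        elif b == NUL:
            raise ValueError("unescaped NUL: start of another block")
        out.append(b)
        i += 1
    return bytes(out)

payload = bytes([0x41, 0x00, 0x10, 0x00, 0x00, 0x42])
frame = encode_block(payload)
print(frame.hex(" "))
print(decode_block(frame) == payload)
```

Because every data $00 is preceded by a $10, two NULs in a row can only appear at a block boundary, which is what lets the receiver re-measure the 9-bit low pulse there.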
Me thinks you have a synchronous communications background
That isn't actually correct. If you get an alternating 0-1-0 (start, b0, b1...) then you know with reasonable certainty what baud is being used. If the sequence is a longer sequence, in particular the <space> $20, then you don't really know if you have received the start-0-0-0-0-0-1 or a slower baud 0-1 sequence.
I'm not quite sure what point you are trying to make, as your single bit example will also fail, with the wrong byte sent.
????? Of course the code would only be looking for a single character, or a group of similar detectable characters. eg like "A" and "a".
Focusing back on the topic, which is P2 booting, the main downside of multi-edge characters is the P2 might exit reset part-way through, so that needs to be covered. I think a Twin Smart Pin Capture can do that and get good precision on one-char in.
Surely, if the P2 resets, everything gets reset ???
Most Autobaud designs I've seen just sync once, and do not expect short term drift, but Chip has allowed for resync.
"AT" command sets resync on each "AT" sequence.
The older async rates were always 300*2^n. But now rates can also be even numbers like multiples of an even xtal such as 5/10/16/20MHz etc. In the old days we used xtals like 4.9152MHz, 7.9872MHz, etc so we could divide by 2^n to get the baud.
So the accuracy is a little more important as we cannot just assume 114kb would be 115,200. However, serial is supposed to be within 1%, and re-syncing (not autobauding) should be done on each character because it's possible to have anything >= 1 stop-bit.
With Twin Smart Pin Capture, that should still be possible, but if there is also NCO TX, that may need more smart pins.
I cannot see why we need to be so accurate. Certainly a lot of micros have internal oscillators that are 1% nowadays.
It would be GREAT to get this going a lot faster, of course, so that loading could be a single step.
I think 'a lot faster' is looking practical.
Using 2 Smart Pins for rise-to-rise (/ to /) and fall-to-fall (\ to \) timing, a '?' should capture 8b and 7b, which sum to a 15b time axis.
... To go faster, we need different principles. I was thinking about strings of '?' ($3F) characters, which cause two low bits periodically. That's not much to go on, I know. It would be great if we could use $C0's or something, so that we get low-rate transitions which can be counted better. I think if I used TWO smart pins to count the states, one for high and one for low, I might be able to double the top speed. Remember, also, we don't just have to use whole clock counts. Both smart pin TX and RX have an NCO mode. If we could just get good NCO values, we could go way higher.
To give a target value, I've added numbers to the other MCU One Pin thread, where I pushed up the BAUD until the Tx stop bits started to increase, from the selected 2 Stop bits.
I also changed the B64 to a faster encoded variant, but still has '?' (& many others) free for Autobaud work.
Result is 24.5MHz/2 can sustain 1.225MBd, from MCU
1.531250MBd needs 1 more stop bit, for an effective 1.441176 MBd sustained equiv speed.
At /1, double those to 2.45MBd, & 2.88MBd
3MBd is common upper limit for cheaper USB-UARTS, with a virtual Baud Clock of 12MHz
Faster USB-UARTS can go to 12MBd with a 24MHz virtual Baud Clock.
I cannot see why we need to be so accurate. Certainly a lot of micros have internal oscillators that are 1% nowadays.
Yup, but the P2 is not one of those, not during boot, so it does need good quality AutoBAUD.
Pushing up the precision certainly does help, as you want to avoid the measurement noise affecting the calculations, and you want to centre the errors.
It can further help, if you report the Calibrate value back to the host, as it can then select a better baud choice.
Or NCO RX can help improve granularity at these elevated speeds.
Some target speeds, that a lowly 8 bit MCU can achieve, are in the reply above.
I've got TWO smart pins taking measurements on the RX pin now. One measures lows and one measures highs. There's only need to interrupt on low measurements, as there will be an implied high measurement already available. So far, it seems to have doubled the speed of continuous downloading, but if characters are paced apart, even 1M baud works.
Still looking at how to make it faster. I think we need to go to NCO mode to get sufficient resolution, but that means doing a divide, which is very expensive in terms of time. Also, smaller FPGA implementations don't have CORDIC, so it would have to be a software divide.
Should CALLPA/CALLPB be doc'd as being in the S/#rel9 group, as PNut throws a "Relative address out of range" error?
My apologies if I'm misunderstanding -- again...
Thanks!
I've got TWO smart pins taking measurements on the RX pin now. One measures lows and one measures highs. There's only need to interrupt on low measurements, as there will be an implied high measurement already available.
Sounds good, can that measure _/= to _/= , and =\_ to =\_ , or was that what you meant ?
TWO smart pins is great, because you can catch the wrong-phase effect with a simple > test.
Still looking at how to make it faster. I think we need to go to NCO mode to get sufficient resolution, but that means doing a divide, which is very expensive in terms of time. Also, smaller FPGA implementations don't have CORDIC, so it would have to be a software divide.
Good point about divide - what speed is Multiply ?
First, I think we should push up the Baud rate with 2 Stop Bits, and then look into NCO, if the granularity is becoming an issue.
I do have a faster 64B scheme, for example.
For bit-time calculations, I've come up with an approx /15, which assumes you Two SmartPin capture & add (_/= to _/=) , and (=\_ to =\_) on '?' to get a 15b time measurement. Do > check for skew-error, then calculate.
Ideal 900.5*15 = 13507.5
RD15 : round(/15) using multiply and shift
(2^15+13507.5*4369) >>16 = 900
(2^15+13507.7*4369) >>16 = 900
(2^15+13507.8*4369) >>16 = 901 slight movement in step point, due to appx nature.
It is appx, because *4369/2^16 has a ~15ppm error, I think tolerable for this task.
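The multiply-and-shift /15 above can be checked on a host with a few lines of Python. This is only a sketch with integer capture sums (the fractional 13507.x figures above are illustrative); the name `rd15` follows the "RD15" label, and the comparison against exact rounding is an addition for checking, not part of the proposal.

```python
def rd15(total: int) -> int:
    """Approximate round(total / 15) as (total * 4369 + 2^15) >> 16.

    4369 = 65535 / 15, so 4369 / 65536 is 1/15 scaled by 65535/65536
    (about 15 ppm low). Adding 2^15 before the shift rounds to nearest.
    """
    return (total * 4369 + (1 << 15)) >> 16

def exact_round15(total: int) -> int:
    """Exact round-to-nearest of total / 15 (no ties occur, as 15 is odd)."""
    return (total + 7) // 15

for total in (13507, 13508):
    print(total, rd15(total), exact_round15(total))

# The ~15 ppm shortfall only shifts a rounding step once the sum grows
# past 32768 counts; below that the two functions agree everywhere.
mismatches = [t for t in range(32768) if rd15(t) != exact_round15(t)]
print(len(mismatches))
```

For sums that fit in 15 bits the approximation gives the same bit time as an exact rounded divide, so the approximation error only matters for longer captures.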
By paced apart, I mean typed on a keyboard, so many stop bits are there.
The smart pin measurements are for high-period and low-period. Once we get an interrupt from the low-period measurement, we can be sure a high-period measurement is also ready, simplifying the autobaud ISR and cutting the interrupt frequency in half.
16x16 multiply is two clocks.
The autobaud can work at high speed for cases of many stop bits. I don't know if two stop bits would make any difference, though, as there are plenty of fast transitions within data bytes.
Right now, I'm looking at making the smart pin's reporting size variable, according to how many bits are needed. Cutting down the RDPIN time could give us a huge boost:
The smart pin measurements are for high-period and low-period. Once we get an interrupt from the low-period measurement, we can be sure a high-period measurement is also ready, simplifying the autobaud ISR and cutting the interrupt frequency in half.
Yes, but if you measure edge to edge, you increase the T-Axis & simplify the possible readings.
A false-reading (bad phase) is detected with a simple > test.
Q: The smart pins can do a Time Capture and Clear on choice of _/= or =\_ Edge, right ?
The autobaud can work at high speed for cases of many stop bits. I don't know if two stop bits would make any difference, though, as there are plenty of fast transitions within data bytes.
Two stop bits matters mostly for Software RX, but it does give a little more time from leading edge of Start to possible next char start bit.
If the AutoBaud char echos a single char, as I think it needs to, most systems would wait for that, which gives rather more time from AutoBAUD to command start. 10~15b times on MCU, and 250us~1ms on USB-UART
Right now, I'm looking at making the smart pin's reporting size variable, according to how many bits are needed. Cutting down the RDPIN time could give us a huge boost:
Going from long to byte (at fast baud rates) would be equivalent to cutting a lot of instructions from the autobaud ISR.
Does AutoBaud even need an ISR now ?
At First char P2 is waiting, and if you want resync on AutoBaud, you expect Rx to be valid, so can (early) test for '?' and simply grab the waiting Capture pair. On all other Chars, capture is simply ignored. (so it needs to reflect most recent)
If "?" always echos ( eg "."==NOP) then a One-Pin design needs to pause after "?", practical could be a refresh every 500ms say ?
Of course, the faster this works, the less need there is for resync of Baud rates to cover dT drift
Can you test that RC Osc temp drift on the sample wafers ?
Q: Can the smart-pin uart, accept a Baud-Divider change 'live' - ie while receiving characters.
I got the pin messaging reduced and that has enabled auto-baud detection of $20's at 2M baud. The problem is, it only works if the bytes are spread way out. In a high-change-rate byte, like $55, which is received as ...10101010101..., the auto-baud fires every 2 bits, or at 1MHz, and trips all over itself.
Here are the current auto-baud and receiver ISR:
'
'
' Autobaud ISR - $20 -> 10000001001 -> ..1, 6x 0, 1x 1, 2x 0, 1..
'
autobaud_isr    akpin   #rx_ms0                 'acknowledge low-time measurement, high-time also ready
                mov     buf2,buf0               'get [-2] low time (6x if $20)
                rdpin   buf1,#rx_ms1            'get [-1] high time (1x if $20)
                rdpin   buf0,#rx_ms0            'get [-0] low time (2x if $20)
                mov     limh,buf2               'make comparison window from [-2] low (6x if $20)
                shr     limh,#4
                neg     liml,limh
                add     limh,buf2
                add     liml,buf2
                mul     buf1,#6                 'normalize [-1] high to 6x (1x if $20)
                cmpr    buf1,limh       wc      'check if within window
        if_nc   cmp     buf1,liml       wc
                mov     buf1,buf0               'normalize [-0] low to 6x (2x if $20)
                mul     buf1,#3
        if_nc   cmpr    buf1,limh       wc      'check if also within window
        if_nc   cmp     buf1,liml       wc
        if_c    reti1                           'if not $20, exit
                add     buf2,buf0               '$20 (space), add 6x 0 and 2x low times to get 8x time
                shl     buf2,#16-3              'shift up to get clock count for 1x time
                or      buf2,#7                 'set 8 bits
                wxpin   buf2,#rx_rcv            'set rx pin baud
                dirl    #rx_rcv                 'reset receiver pin
                dirh    #rx_rcv                 '(re)enable receiver pin to (re)register frame
                mov     baud,buf2               'save baud for transmit
                mov     rxbyte,#$120            'signal receiver ISR to ignore pin, enter space
                trgint2                         'trigger serial receiver ISR in case it wasn't, already (<50k baud)
                reti1                           'exit, serial receiver ISR executes next
'
'
' Serial receiver ISR
'
receive_isr     clrb    rxbyte,#8       wc      'triggered by autobaud? if so, rxbyte = $20 (space)
        if_nc   akpin   #rx_rcv                 'triggered by receive, acknowledge rx byte
        if_nc   rdpin   rxbyte,#rx_rcv          'triggered by receive, get rx byte
                wrlut   rxbyte,head             'write byte to circular buffer in lut
                incmod  head,#lut_btop          'increment buffer head
                reti2
Interesting idea, jmg, about tracking fall-to-fall and rise-to-rise times, instead of state times. It looks like it would cut down on interrupts, but still permit some ambiguity. Take the $20 case:
It's outside of the protocol to have $02 and $03 data present, so this is probably not a problem.
I will try this out. The neat thing about this is that it only needs two samples, not three, and those two samples arrive in one interrupt!
Yes, a single valid reading of a pair per Autobaud char, is all that is needed.
I'm not sure an interrupt is required, if you can wait on capture, for the first Autobaud, and then read-last-capture
on any Serial INT.
That requires the last capture is always the most-recent (ie new capture overwrites old).
Is that how capture works ?
There are also other one-shot use capture cases, where you might prefer first-capture is kept.
You can make good use of a Pair of captures
* Sum to get more x-Axis
* Check tRR relative to tFF to catch rejects
* Check ratio, to discriminate between two valid AutoBaud chars.
I'm running through candidate Chars, for AutoBaud, the ideal outcome is
* Highest Sum of tRR and tFF, as that gives lowest 'maths jitter'
* A means to reject false captures, eg Reset exits mid-char
eg 0x3e ">" fails this, as the balanced nature of this char, means no > or < test can resolve.
Your suggestion of '?' looks good, as that gives 15b Sum
There is also " " and "@", but with a lower sum of 10b,
I think AutoBaud needs to echo a ACK char, and there needs to be two AutoBaud chars,
one to Set-Baud and define Two-Pin
another to Set-Baud and define One-Pin
I got the pin messaging reduced and that has enabled auto-baud detection of $20's at 2M baud.
Good results, but I thought you already had the pin-messaging reduced ?
Or was that opcode related, and this is content related ?
Automated Content based compression is nice, but could introduce a jitter source into code ?
It's outside of the protocol to have $02 and $03 data present, so this is probably not a problem.
I will try this out. The neat thing about this is that it only needs two samples, not three, and those two sample arrive in one interrupt!
Yes, a single valid reading of a pair per Autobaud char, is all that is needed.
I'm not sure an interrupt is required, if you can wait on capture, for the first Autobaud, and then read-last-capture
on any Serial INT.
That requires the last capture is always the most-recent (ie new capture overwrites old).
Is that how capture works ?
Wow! These are great ideas. And, yes, the captures are cyclical.
We COULD just do an initial auto-baud and then check the sample pair in each serial interrupt, as the last thing to happen was the stop bit, making a fresh rise-to-rise measurement. We are already 1/2 into the stop bit when the serial ISR is triggered, so we have less than half a bit period to grab the last fall-to-fall sample before a new fall-to-fall is registered, in case a (low) start bit is next. This could get us to 1M baud. It's all about getting into the ISR and reading the fall-to-fall measurement ASAP.
I got the pin messaging reduced and that has enabled auto-baud detection of $20's at 2M baud.
Good results, but I thought you already had the pin-messaging reduced ?
Or was that opcode related, and this is content related ?
Automated Content based compression is nice, but could introduce a jitter source into code ?
Size/time is now content-based, but remember that there is jitter, anyway, from not knowing the phase of the message.
Wow! These are great ideas. And, yes, the captures are cyclical.
We COULD just do an initial auto-baud and then check the sample pair in each serial interrupt, as the last thing to happen was the stop bit, making a fresh rise-to-rise measurement. We are already 1/2 into the stop bit when the serial ISR is triggered, so we have less than half a bit period to grab the last fall-to-fall sample before a new fall-to-fall is registered, in case a (low) start bit is next. This could get us to 1M baud. It's all about getting into the ISR and reading the fall-to-fall measurement ASAP.
This is an example of where 2 Stop bits can buy a higher baud rate.
That 0.5 bit, becomes 1.5 bits.
Right. If the sender's baud is over ~500k, then he needs to use TWO stop bits, not one. That's simple enough.
I really like the idea of getting the baud updater down to once per byte, and really only on $20's, as we're only looking for fine adjustments.
How do you efficiently compare two values for being within 1/16th of each other?
This has turned into a brain bender for me. I have this code:
'
'
' Autobaud ISR
'
' $20 -> 10000001001 -> fall-to-fall = 7 bits then rise-to-rise = 3 bits
'
autobaud_isr    rdpin   buf1,#rx_ms1            'get old fall-to-fall time (7x if $20)
                rdpin   buf0,#rx_ms0            'get new rise-to-rise time (3x if $20)
                akpin   #rx_ms0                 'acknowledge rise-to-rise measurement (ISR trigger)
                mul     buf1,#3                 'normalize both samples to 21x for comparison
                mul     buf0,#7
                sub     buf0,buf1               'subtract one from the other
                abs     buf0                    'get absolute difference
                topone  buf2,buf1               'get magnitude of one original
                sub     buf2,#4                 'divide by 16
                shr     buf0,buf2       wz      'shift down by magnitude minus 4, checking for 0
        if_nz   reti1
                '(received $20)
I need to run it through mental scenarios before I'm confident. It seems to work just fine, though.
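For running scenarios off-chip, here is a small Python model of the two "within 1/16" tests in this thread: the explicit window from the earlier ISR and the TOPONE/shift test just above. It is a sketch only; the function names are made up, counts are plain integers, and clamping the shift at zero for very small counts is an assumption (the listing does not show what happens there).

```python
def window_check(sample: int, ref: int) -> bool:
    """Earlier ISR style: is sample inside ref - ref/16 .. ref + ref/16?"""
    tol = ref >> 4                      # shr limh,#4
    return ref - tol <= sample <= ref + tol

def magnitude_check(t_ff: int, t_rr: int) -> bool:
    """TOPONE/shift style for $20: fall-to-fall = 7 bits, rise-to-rise = 3 bits."""
    a = t_ff * 3                        # mul buf1,#3 -> 21x
    b = t_rr * 7                        # mul buf0,#7 -> 21x
    diff = abs(b - a)
    top = a.bit_length() - 1            # position of the top set bit (a > 0 assumed)
    shift = max(top - 4, 0)             # sub buf2,#4; clamped here as an assumption
    return (diff >> shift) == 0         # passes only if diff < 2^(top-4)

# 100 clocks per bit, ideal $20
print(magnitude_check(700, 300))        # equal after normalizing
print(magnitude_check(700, 302))        # small jitter
print(magnitude_check(700, 320))        # about 6.7% off
print(window_check(6 * 100, 600), window_check(6 * 110, 600))
```

One thing the model makes visible: because the shift uses the top set bit of the normalized value rather than the value itself, the effective tolerance floats between roughly 1/32 and 1/16 of that value, depending on where it sits between powers of two.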
You can make good use of a Pair of captures
* Sum to get more x-Axis
* Check tRR relative to tFF to catch rejects
* Check ratio, to discriminate between two valid AutoBaud chars.
I'm not understanding these ideas, yet.
The idea is to combine the two capture readings, to get the most information.
eg 0x3f will capture tRR = 8b tFF = 7b, which you can add to get 15b times.
You can also compare them, and if (tRR < tFF), you ignore & continue to wait. (means RST exit was mid-char)
To resolve which one of 0x3f or 0x1f was sent, you compare the difference with tRR, and slice ~ 1.5/8b
I'm favouring 0x3f "?" and 0x1f, as they have higher sums, (tho the last one does go a little against your Kbd Test wish )
0x3f "?" : on tRR = 8 tFF = 7 Sum=15b
===\_s_/=0=.=1=.=2=.=3=.=4=.=5=\_6_._7_/=P==T=\_s_/=0=.=1=.=2=.=3=.=4=.=5=\_6_._7_/=P==T=\_s_/=0=.=1=.=2=.=3=.=4=.=5=\_6_._7_/=P=
tRR | 8 | 2+T | 8 | 2+T | 8 | //
tFF 7 | 3+T | 7 | 3+T | 1 | =\_/=
f r OK tRR:8b,tFF:7b -^ ^- Err:tRR:2b+T,tFF:3b+T ^OK ^Err
Margins: +1b,-1b = usable
Need a Second Char, for AutoBaud One-Pin command
0x1f -> OK tRR:8b,tFF:6b Sum=14b Err:tRR:2b+T,tFF:4b+T
===\_s_/=0=.=1=.=2=.=3=.=4=\_5_._6_._7_/=P==T=\_s_/=0=.=1=.=2=.=3=.=4=\_5_._6_._7_/=P==T=\_
Margins: +2b,-2b = usable, CAN pair with 0x3f
also possible are 0x20 " " and 0x40 "@", but they have smaller sums.
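The tRR/tFF figures above can be reproduced with a short Python sketch that walks one 8N1 frame (start, 8 data bits LSB first, stop) and records the edges. Everything here is illustrative: the function names are hypothetical, times are in bit units, `idle_bits` stands for the extra idle time T between characters, and `classify` is just one reading of the "tRR > tFF, then slice at ~1.5/8 of tRR" idea.

```python
def edges(byte: int):
    """Return (falls, rises) edge times in bit units for one 8N1 frame."""
    levels = [0] + [(byte >> i) & 1 for i in range(8)] + [1]
    falls, rises, prev = [], [], 1      # line idles high before the start bit
    for t, level in enumerate(levels):
        if prev == 1 and level == 0:
            falls.append(t)
        elif prev == 0 and level == 1:
            rises.append(t)
        prev = level
    return falls, rises

def capture(byte: int):
    """In-phase capture pair (tRR, tFF); needs exactly two edges of each kind."""
    falls, rises = edges(byte)
    if len(falls) != 2 or len(rises) != 2:
        raise ValueError("not a two-pulse character")
    return rises[1] - rises[0], falls[1] - falls[0]

def wrong_phase(byte: int, idle_bits: int = 0):
    """Pair seen across the character gap when chars repeat back to back."""
    t_rr, t_ff = capture(byte)
    period = 10 + idle_bits
    return period - t_rr, period - t_ff

def classify(t_rr: int, t_ff: int):
    """Reject bad phase, then split 0x3f from 0x1f at 1.5/8 of tRR."""
    if t_rr <= t_ff:
        return None                     # reset exit was mid-character, keep waiting
    return 0x3F if (t_rr - t_ff) * 16 < 3 * t_rr else 0x1F

for ch in (0x3F, 0x1F, 0x20, 0x40, 0x3E):
    t_rr, t_ff = capture(ch)
    print(f"0x{ch:02x} tRR={t_rr} tFF={t_ff} sum={t_rr + t_ff}")
```

For 0x3f this gives tRR=8, tFF=7, and for the gap-straddling pair 2+T and 3+T, so the sign of tRR - tFF separates the two phases regardless of T. 0x3e comes out as 7 and 7, which is why no > or < test can resolve it.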
How do you efficiently compare two values for being within 1/16th of each other?
This has turned into a brain bender for me. I have this code:
I'm not quite sure what you are asking, but here are my 10b examples
I add, then multiply before shifting to get the rounding effects carefully centred.
How do you efficiently compare two values for being within 1/16th of each other?
I think you are trying to ratio check 7b vs 3b, as a simple value sanity check ?
I was more of the mind that the AutoBaud char had better be right, and the echo char can confirm immediately.
No need to get too fancy in the software.
I was planning on just checking for (eg)
Which of 3b:7b ie " " , or 2b:8b, "@" and here the difference is either -4 or -6, so a slice test of 5b is ok.
( I also test for tRR > tFF, as that represents a phasing fail. That certainly is possible. Next Char should always work)
A purist might also check for >3 and <7, to bound the expected values a little more, but what to do, if the AutoBaud fails the test ?
Anything echoed is likely to be mangled anyway.
Users would always be advised to 'approach from below' in any AutoBaud testing.
For echo, I think the "~" 0x7e is good, as that is most baud sensitive.
P2 running too fast, will receive as ">" instead of "~"
Reporting the Capture Sum, I think is good for testing, and good for users wanting to push Baud speeds.
A spare command char can ask for the sum, anytime.
There is room to report this as packed-hex (ie one 16ch ASCII block, not the classic 0..9,A-F)
Here are the tRR, tFF results for the Space and other candidates.
With +4,+6,+3 signaling the error, and -4,-6,-3 for Valid phase, a simple (tRR < tFF) test is enough.
Test cases :
The MCU code I have repeats every ~ 500us waiting for an Autobaud ack reply, but for systems that cannot easily pace their sending, or are more 'one way' in nature, another 'acid test' scenario is
The Boot loader needs to tolerate a long stream of AutoBAUD chars, with reset exit of any Phase.
Something like a RF link, may choose to
* Issue a Reset Pulse
* Stream AutoBaud (eg 300ms) This time is > worst case Reset Exit.
* Stream Data
or, using your simpler Keyboard mode, a user can hold down the AutoBAUD Char, while they manually reset P2.
Comments
This is why ASCII characters %0xxx_xx01 are better for timing purposes.
What does 'paced apart' mean ? Is that 2 stop bits, or something more ?
2 Stop bits is easy to specify, for above some moderate speed.
Good point about divide - what speed is Multiply ?
First, I think we should push up the Baud rate with 2 Stop Bits, and then look into NCO, if the granularity is becoming an issue.
I do have a faster 64B scheme, for example.
For bit-time calculations, I've come up with an approx /15, which assumes you use a two-Smart-Pin capture and add (_/= to _/=) and (=\_ to =\_) on '?' to get a 15b time measurement. Do a > check for skew error, then calculate.
Ideal 900.5*15 = 13507.5
RD15 : round(/15) using multiply and shift
(2^15+13507.5*4369) >>16 = 900
(2^15+13507.7*4369) >>16 = 900
(2^15+13507.8*4369) >>16 = 901 slight movement in the step point, due to the approximate nature.
It is approximate because *4369/2^16 has a ~15ppm error, which I think is tolerable for this task.
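The RD15 idea above can be tried out in a few lines. This is only a sketch in Python, not P2 code: the names rd15 and exact_round_div15 are made up for illustration, and it assumes the 15b capture sum is an integer that fits a 16x16 multiply.

```python
RECIP_15 = 4369    # 65535 / 15, standing in for 2^16 / 15
HALF = 1 << 15     # adds 0.5 before the shift, so the result rounds


def rd15(capture_sum):
    """Approximate round(capture_sum / 15) with one multiply and one shift."""
    return (HALF + capture_sum * RECIP_15) >> 16


def exact_round_div15(capture_sum):
    """Reference: round-half-up of capture_sum / 15, in integer maths."""
    return (2 * capture_sum + 15) // 30


if __name__ == "__main__":
    for s in (13507, 13508):
        print(s, rd15(s), exact_round_div15(s))
```

With integer inputs the step lands between 13507 (gives 900) and 13508 (gives 901), the same place as the exact rounding. The ~15ppm multiplier error only starts to move results once the sum gets well above the values discussed here.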
Baud capture values
You're right. I didn't make a proper opcode description in instructions.txt. Here's how those two lines should read:
I'm glad you noticed this. I fixed it for the next release.
By paced apart, I mean typed on a keyboard, so many stop bits are there.
The smart pin measurements are for high-period and low-period. Once we get an interrupt from the low-period measurement, we can be sure a high-period measurement is also ready, simplifying the autobaud ISR and cutting the interrupt frequency in half.
16x16 multiply is two clocks.
The autobaud can work at high speed for cases of many stop bits. I don't know if two stop bits would make any difference, though, as there are plenty of fast transitions within data bytes.
Right now, I'm looking at making the smart pin's reporting size variable, according to how many bits are needed. Cutting down the RDPIN time could give us a huge boost:
Going from long to byte (at fast baud rates) would be equivalent to cutting a lot of instructions from the autobaud ISR.
Yes, but if you measure edge to edge, you increase the T-Axis & simplify the possible readings.
A false-reading (bad phase) is detected with a simple > test.
Q: The smart pins can do a Time Capture and Clear on choice of _/= or =\_ Edge, right ?
That's 16x16 -> 32 ? That should work well with the rounded /15 above.
Two stop bits matters mostly for Software RX, but it does give a little more time from leading edge of Start to possible next char start bit.
If the AutoBaud char echoes a single char, as I think it needs to, most systems would wait for that, which gives rather more time from AutoBAUD to command start: 10~15b times on an MCU, and 250us~1ms on a USB-UART.
Does AutoBaud even need an ISR now ?
At the first char the P2 is waiting, and if you want to resync on AutoBaud, you expect Rx to be valid, so it can test (early) for '?' and simply grab the waiting capture pair. On all other chars, the capture is simply ignored (so it needs to reflect the most recent edges).
If "?" always echoes (eg "."==NOP), then a One-Pin design needs to pause after "?"; a practical choice could be a refresh every 500ms, say?
Of course, the faster this works, the less need there is for resync of Baud rates to cover dT drift
Can you test that RC Osc temp drift on the sample wafers ?
Q: Can the smart-pin UART accept a Baud-Divider change 'live', ie while receiving characters?
Here are the current auto-baud and receiver ISR:
Interesting idea, jmg, about tracking fall-to-fall and rise-to-rise times, instead of state times. It looks like it would cut down on interrupts, but still permit some ambiguity. Take the $20 case:
It's outside of the protocol to have $02 and $03 data present, so this is probably not a problem.
I will try this out. The neat thing about this is that it only needs two samples, not three, and those two sample arrive in one interrupt!
Yes, a single valid reading of a pair per Autobaud char, is all that is needed.
I'm not sure an interrupt is required, if you can wait on capture for the first Autobaud, and then read-last-capture on any Serial INT.
That requires the last capture is always the most-recent (ie new capture overwrites old).
Is that how capture works ?
There are also other one-shot use capture cases, where you might prefer first-capture is kept.
You can make good use of a Pair of captures
* Sum to get more x-Axis
* Check tRR relative to tFF to catch rejects
* Check ratio, to discriminate between two valid AutoBaud chars.
I'm running through candidate Chars, for AutoBaud, the ideal outcome is
* Highest Sum of tRR and tFF, as that gives lowest 'maths jitter'
* A means to reject false captures, eg Reset exits mid-char
eg 0x3e ">" fails this: the char is balanced (tRR = tFF = 7b), so no > or < test can resolve the phase.
Your suggestion of '?' looks good, as that gives a 15b sum.
There are also " " and "@", but with a lower sum of 10b.
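The sums quoted above can be checked with a small model of the serial frame. This is only a sketch: it assumes an 8N1 frame sent LSB first, a single isolated character on an idle-high line, and ideal edges. The names frame_levels and capture_pair are made up for illustration.

```python
def frame_levels(byte):
    """Line level per bit time: start bit (0), 8 data bits LSB first, stop bit (1)."""
    return [0] + [(byte >> i) & 1 for i in range(8)] + [1]


def capture_pair(byte):
    """Return (tRR, tFF) in bit times, using the last two edges of each kind.

    Returns None if the character has fewer than two rising or falling edges.
    """
    levels = [1] + frame_levels(byte) + [1]   # idle-high before and after
    rises = [i for i in range(1, len(levels))
             if levels[i - 1] == 0 and levels[i] == 1]
    falls = [i for i in range(1, len(levels))
             if levels[i - 1] == 1 and levels[i] == 0]
    if len(rises) < 2 or len(falls) < 2:
        return None
    return rises[-1] - rises[-2], falls[-1] - falls[-2]


if __name__ == "__main__":
    for ch in (0x3F, 0x3E, 0x1F, 0x20, 0x40):
        t_rr, t_ff = capture_pair(ch)
        print(hex(ch), "tRR =", t_rr, "tFF =", t_ff, "sum =", t_rr + t_ff)
```

Under those assumptions '?' gives 8b + 7b = 15b, " " gives 3b + 7b and "@" gives 2b + 8b (both 10b), and ">" gives 7b + 7b, which is why it cannot be phase-checked.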
I think AutoBaud needs to echo an ACK char, and there need to be two AutoBaud chars,
one to Set-Baud and define Two-Pin
another to Set-Baud and define One-Pin
Or was that opcode related, and this is content related ?
Automated Content based compression is nice, but could introduce a jitter source into code ?
Wow! These are great ideas. And, yes, the captures are cyclical.
We COULD just do an initial auto-baud and then check the sample pair in each serial interrupt, as the last thing to happen was the stop bit, making a fresh rise-to-rise measurement. We are already 1/2 into the stop bit when the serial ISR is triggered, so we have less than half a bit period to grab the last fall-to-fall sample before a new fall-to-fall is registered, in case a (low) start bit is next. This could get us to 1M baud. It's all about getting into the ISR and reading the fall-to-fall measurement ASAP.
I'm not understanding these ideas, yet.
Size/time is now content-based, but remember that there is jitter, anyway, from not knowing the phase of the message.
This is an example of where 2 Stop bits can buy a higher baud rate.
That 0.5 bit, becomes 1.5 bits.
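To put rough numbers on that window, here is a small sketch. It assumes, as described above, that the serial ISR fires half a bit into the first stop bit and that the next start bit can follow immediately after the last stop bit. The function name read_window_ns is made up for illustration.

```python
def read_window_ns(baud, stop_bits):
    """Time from ISR trigger (mid first stop bit) to the earliest next start edge."""
    bit_ns = 1e9 / baud
    return (stop_bits - 0.5) * bit_ns


if __name__ == "__main__":
    for stop_bits in (1, 2):
        print(stop_bits, "stop bit(s) at 1MBd:", read_window_ns(1_000_000, stop_bits), "ns")
```

At 1MBd this works out to 500ns with one stop bit and 1500ns with two, which is the 0.5b to 1.5b gain mentioned above.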
I've also been thinking about how the smart pin could solve this baud problem.
Right. If the sender's baud is over ~500k, then he needs to use TWO stop bits, not one. That's simple enough.
I really like the idea of getting the baud updater down to once per byte, and really only on $20's, as we're only looking for fine adjustments.
How do you efficiently compare two values for being within 1/16th of each other?
This has turned into a brain bender for me. I have this code:
I need to run it through mental scenarios before I'm confident. It seems to work just fine, though.
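One possible way to do the 1/16th comparison, offered only as a sketch and not as the code referred to above: take the absolute difference and compare it with the larger value shifted right by 4. The name within_sixteenth is made up, and taking "1/16th" relative to the larger of the two values is an assumption.

```python
def within_sixteenth(a, b):
    """True if a and b differ by no more than 1/16 of the larger one."""
    hi, lo = (a, b) if a >= b else (b, a)
    return (hi - lo) <= (hi >> 4)     # one subtract, one shift, one compare


if __name__ == "__main__":
    print(within_sixteenth(1600, 1500))
    print(within_sixteenth(1600, 1499))
```

This needs no multiply or divide, so it should map onto a handful of instructions.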
The idea is to combine the two capture readings, to get the most information.
eg 0x3f will capture tRR = 8b tFF = 7b, which you can add to get 15b times.
You can also compare them, and if (tRR < tFF), you ignore & continue to wait. (means RST exit was mid-char)
To resolve which one of 0x3f or 0x1f was sent, you compare the difference (tRR - tFF) with tRR, and slice at ~1.5/8 of tRR.
I'm favouring 0x3f "?" and 0x1f, as they have higher sums, (tho the last one does go a little against your Kbd Test wish )
also possible are 0x20 " " and 0x40 "@", but they have smaller sums.
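The checks described for 0x3f and 0x1f can be put together roughly as below. This is only a sketch: classify_3f_1f is a made-up name, captures are assumed to be integer clock counts, rejecting tRR == tFF as well is an added assumption, and the bit time uses plain rounded integer division rather than any multiply-and-shift trick.

```python
def classify_3f_1f(t_rr, t_ff):
    """Return (char, bit_time) for a valid capture pair, or None to keep waiting."""
    if t_rr <= t_ff:
        return None                       # wrong phase, eg reset exit mid-char
    diff = t_rr - t_ff                    # 1b for 0x3f, 2b for 0x1f
    slice_level = (t_rr * 3) >> 4         # 1.5/8 of tRR, ie about 1.5b
    total = t_rr + t_ff
    if diff < slice_level:
        return 0x3F, (total + 7) // 15    # 8b + 7b = 15b
    return 0x1F, (total + 7) // 14        # 8b + 6b = 14b


if __name__ == "__main__":
    print(classify_3f_1f(800, 700))   # '?' at 100 clocks per bit
    print(classify_3f_1f(800, 600))   # 0x1f at 100 clocks per bit
    print(classify_3f_1f(700, 800))   # wrong phase
```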
I'm not quite sure what you are asking, but here are my 10b examples
I add, then multiply before shifting to get the rounding effects carefully centred.
and a few likely Baud Higher speeds, this is not yet NCO based, just /N - The appeal of 15b "?", over 10b " ", is it will give better NCO outcomes.
I was more of the mind that the AutoBaud char had better be right, and the echo char can confirm immediately.
No need to get too fancy in the software.
I was planning on just checking for (eg)
Which of 3b:7b (ie " ") or 2b:8b (ie "@"); here the difference is either -4b or -6b, so a slice test at 5b is ok.
( I also test for tRR > tFF, as that represents a phasing fail. That certainly is possible. Next Char should always work)
A purist might also check for >3 and <7, to bound the expected values a little more, but what to do, if the AutoBaud fails the test ?
Anything echoed is likely to be mangled anyway.
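The " " versus "@" test could look something like this. It is only a sketch: classify_10b is a made-up name, captures are assumed to be integer clock counts, and the 5b slice level is taken as half of the 10b sum so no separate bit-time estimate is needed first.

```python
def classify_10b(t_rr, t_ff):
    """Return (char, bit_time) for " " or "@", or None on a phasing fail."""
    if t_rr > t_ff:
        return None                      # phasing fail, wait for the next char
    total = t_rr + t_ff                  # 10b for both chars
    diff = t_ff - t_rr                   # 4b for " ", 6b for "@"
    char = 0x40 if diff > total // 2 else 0x20   # slice at 5b
    return char, (total + 5) // 10


if __name__ == "__main__":
    print(classify_10b(300, 700))   # " " at 100 clocks per bit
    print(classify_10b(200, 800))   # "@" at 100 clocks per bit
    print(classify_10b(700, 300))   # wrong phase
```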
Users would always be advised to 'approach from below' in any AutoBaud testing.
For echo, I think the "~" 0x7e is good, as that is most baud sensitive.
P2 running too fast, will receive as ">" instead of "~"
Reporting the Capture Sum, I think is good for testing, and good for users wanting to push Baud speeds.
A spare command char can ask for the sum, anytime.
There is room to report this as packed-hex (ie one 16ch ASCII block, not the classic 0..9,A-F)
With +4,+6,+3 signaling the error, and -4,-6,-3 for Valid phase, a simple (tRR < tFF) test is enough.
The MCU code I have repeats every ~ 500us waiting for an Autobaud ack reply, but for systems that cannot easily pace their sending, or are more 'one way' in nature, another 'acid test' scenario is
The Boot loader needs to tolerate a long stream of AutoBAUD chars, with reset exit of any Phase.
Something like a RF link, may choose to
* Issue a Reset Pulse
* Stream AutoBaud (eg 300ms) This time is > worst case Reset Exit.
* Stream Data
or, using your simpler Keyboard mode, a user can hold down the AutoBAUD Char, while they manually reset P2.