I was thinking about this matter that dMajo brought up about parts killing themselves by driving against opposing states (or power rails). I think that the foundry design rules actually prevent this from happening.
I agree death is unlikely to be instantaneous, but bus contention still needs a little care.
One pin left driving (and losing) might add 150mW of power loss, a fair bit on a tiny package, and likely to elevate die temperatures. More than one pin, and the numbers become significant.
Next, there are the current spikes from contention - on good ground planes and short leads, that might be ok.
Less strict layout, or breadboards/test benches, are not so tolerant, and a contention spike could fire a reset threshold, or give a spurious clock.
Right. And these kinds of problems will become evident during software development and get corrected. Contention pitfalls exist in most systems.
Chip,
I still think combined DI/DO is a bad idea - see my post on the other thread.
If you want few pins then allow I2C boot too. Minimal 2 pins which can be shared in many circumstances - I do it on P1 where we are restricted to 32 pins.
If you make more P2 variants, particularly a P1 pin compatible with 32 pins, it would fit nicely into existing designs.
BTW, on SD SPI, I found that it is a requirement to output a byte of $FF (with CSn=0) before sending any commands. It was something I missed during my code tidy up. It is certainly required on a SanDisk 8GB SDHC Class10 (now ~A$6 one-off).
Made great headway last night. We should be able to share the routines with FLASH SPI too.
It takes a few clocks to realize a pin change after OUTL/OUTH, and then the path from the pin into the cog is a few clocks. So, that TESTIN is from a clock or two before the OUTH took effect.
What is the exact Tsu, Th for the Data sampling instant ie once all those delays are summed ?
Can those delays ever change ?
They don't change. I just discovered through experimentation what worked best in this case, affording the greatest setup time.
The greatest setup time is probably less than ideal, as that implies the least hold time ? ie it samples on the update edge.
To allow for varying delays, and parts, and ESD series components, some skew tolerance is nice to have.
ie the sample point should be 1-2 sysclks clear of any data change, and having it also clear of any clock edge helps reduce crosstalk effects.
I don't really want to support i2c EEPROMs, in addition to SPI, because it introduces another wrinkle into the tools and support. Long data retention can be had in SPI EEPROMs and FRAM. 3-pin SPI is much faster, generally cheaper, and sufficiently compact.
There's at least one clock of hold time. I inserted a NOP before the TESTIN and it would work at 80MHz, but not at 20MHz. At 80MHz, things were blurring sympathetically.
A NOP adds 2 SysCLKs, right ?
If that changes to a marginal sampling point, then you have 2 sysclk of pipeline overlap with adjacent out and testin, for a Th of ~ 100ns, at ~20MHz, which should be fine for most skew scenarios.
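For reference, the clocks-to-time conversion behind that ~100ns figure can be sketched in a few lines of Python. This is only arithmetic on the numbers quoted above; the function name is made up for illustration and is not part of any P2 tool.

```python
def clocks_to_ns(clocks: int, sysclk_hz: float) -> float:
    """Convert a count of system clocks to nanoseconds."""
    return clocks / sysclk_hz * 1e9

# Two sysclks of hold margin at the two clock rates mentioned in the thread.
for f in (20e6, 80e6):
    print(f"{f / 1e6:.0f} MHz: 2 sysclks = {clocks_to_ns(2, f):.1f} ns")
```

The same 2-sysclk margin shrinks to a quarter of that at 80MHz, which is why skew tolerance is worth checking at the fastest clock as well as the slowest.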
If boot code could listen for MISO on both P58 and P59 then P2 could support both 3-pin and 4-pin SPI without any resistors.
Yes, but I'm worried, though, about P58 floating and being parasitically coupled to P59 enough to transition along with it. Maybe a minimum-width low pulse on P59 could get around this.
If boot code could listen for MISO on both P58 and P59 then P2 could support both 3-pin and 4-pin SPI without any resistors.
I think I covered 3P then 4P with a pullup, but if you mean some joint boolean combination of P58 & P59, that gets tricky.
A fully floating unused pin would not be static at 1 or 0, so any logic extract would fail.
You might be able to apply two pullups, but then you have disturbed P58.
If a user has something like a clock applied to P58, I think you are in big trouble, with any logic-extract.
I think for minimal pin impact, you need sequential testing - first try Find-Flash in 3 pin, then try in 4 Pin.
If it finds in 3 Pin, you avoid any 4P action.
A single pullup on DI should be enough to avoid the parasitic effects Chip is concerned with.
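The sequential test described above could be sketched like this. It is a host-side Python illustration only: `probe_3pin` and `probe_4pin` are hypothetical stand-ins for whatever the boot code does to attempt a flash read in each wiring mode, not real boot ROM routines.

```python
from typing import Callable, Optional

def find_flash(probe_3pin: Callable[[], bool],
               probe_4pin: Callable[[], bool]) -> Optional[str]:
    """Try 3-pin SPI first; only touch the extra pin if that fails."""
    if probe_3pin():
        return "3-pin"      # found: the 4-pin probe is never run
    if probe_4pin():
        return "4-pin"
    return None             # no flash answered in either mode

# Example with stub probes: the flash only answers in 4-pin mode.
print(find_flash(lambda: False, lambda: True))
```

The point of the ordering is that a board wired for 3-pin never sees any activity on the fourth pin.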
Here is the code from the booter that reads the SPI flash.
Can you post the code for Serial Byte Rx/Tx, & AutoBaud handling ?
I think it is worthwhile looking there to see if Autobaud can be squeezed to its full potential.
Some simple ideas :
a) Use a 4-bit-time AutoBaud Char - ideal is 0xF8. just 2 edges, and variation tolerant.
P2 can capture over 4 bit times, and make better rounding decisions,
b) Unroll Rx from a single REP 8 to a dual delay REP 4 - if the two delays are identical, you have /Baud.0, if they differ by 1, you have /Baud.5 ie this adds a fractional baud, that halves step size granularity ( == 40MHz SysCLK)
c) Add an ENQ variant, that reports the AutoBaud capture value. (maybe a simple 2 byte binary reply is fine ?)
This will report numbers like
Baud      Capture.4bT   Precision
2400      33333         30ppm
9600      8333          120ppm
115200    694           0.144%
460800    174           0.576%
691200    116           0.864%
This also allows the host to measure the P2 Osc very easily.
The host can also check the Baud decision, to verify the rounding choices are centered.
(it is quite common to find poor rounding in AutoBaud code )
Using a USB bridge with a crystal, (FT232H, FT2232H etc) would give good P2 Osc capture precision.
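The table values above are consistent with counting sysclks over 4 bit times at a nominal 20MHz, with precision taken as one count out of the capture. A rough Python sketch of that arithmetic, plus the host-side clock estimate it enables, follows. The function names and the 20MHz default are assumptions for illustration.

```python
SYSCLK_HZ = 20_000_000  # assumed nominal P2 clock, matching the table values

def capture_4bt(baud: float, sysclk_hz: float = SYSCLK_HZ) -> float:
    """Sysclk counts spanned by four bit times at the given baud rate."""
    return 4 * sysclk_hz / baud

def precision(baud: float, sysclk_hz: float = SYSCLK_HZ) -> float:
    """One count of resolution, as a fraction of the capture value."""
    return 1 / capture_4bt(baud, sysclk_hz)

def sysclk_from_capture(capture: int, baud: float) -> float:
    """Host-side estimate of the P2 clock from a reported capture value."""
    return capture * baud / 4

for baud in (2400, 9600, 115200, 460800, 691200):
    print(f"{baud:>7} {round(capture_4bt(baud)):>6} {precision(baud) * 1e6:>6.0f} ppm")
```

A host that knows its own baud rate accurately (crystal-based bridge) can feed the reported capture into `sysclk_from_capture` to see how far the P2 oscillator is from nominal.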
Improving Baud matters, as in a production environment, simplest one-step boot download of 512k image, over 115k serial, would need over 60 seconds.
It's trivial to have your download process include a tiny initial download at 115kbps, then jump to something much higher from there to load the full 512kB+.
One issue I am sure Chip is concerned about is having the startup be reliable across the entire range of working temps and voltage levels (as with the P1). Sure you can get higher bit rates in ideal conditions, but what about in a hot or cold environment, or when running at 3v instead of 3.3v?
The space character ($20) is unique, in that its state/time signature cannot be mistaken among likely ASCII text characters. It looks like this, in transmission:
...10000001001...
That first 0 is the start bit. Here are the state periods:
6 bit periods of 0
1 bit period of 1
2 bit periods of 0
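Those state periods can be reproduced for any candidate character with a short Python sketch, assuming 8N1 framing (start bit, 8 data bits LSB first, one stop bit) and one idle-high bit ahead of the frame. The helper names are illustrative only.

```python
from itertools import groupby

def uart_frame(byte: int) -> list[int]:
    """8N1 line levels for one byte: start bit, 8 data bits LSB first, stop bit."""
    return [0] + [(byte >> i) & 1 for i in range(8)] + [1]

def run_lengths(byte: int) -> list[tuple[int, int]]:
    """(level, bit periods) runs, with one idle-high bit before the frame."""
    levels = [1] + uart_frame(byte)
    return [(level, len(list(group))) for level, group in groupby(levels)]

print(run_lengths(0x20))  # [(1, 1), (0, 6), (1, 1), (0, 2), (1, 1)]
```

The same helper shows the signatures of the other characters discussed in this thread, e.g. `run_lengths(0xF8)` gives a single 4-bit low period and `run_lengths(0x3F)` gives a 1-bit low, 6-bit high, 2-bit low pattern.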
The current auto-baud routine looks for that 6,1,2 timing relationship for states 0,1,0. Here is the code:
'
'
' Autobaud ISR
'
autobaud_isr    akpin   #rx_msr                 'acknowledge rx state change
                rdpin   buf2,#rx_msr    wc      'get sample, measure ($20 -> 10000001001 -> ..1, 6x 0, 1x 1, 2x 0, 1..)
                clrb    buf2,#31                'clear msb in case 1 sample
        if_c    jmp     #.scroll                'if 1 sample, just scroll
                mov     limh,buf0               '0 sample,
                shr     limh,#4                 '..make window from 1st 0 (6x if $20)
                neg     liml,limh
                add     limh,buf0
                add     liml,buf0
                mov     comp,buf1               '0 sample,
                mul     comp,#6                 '..normalize last 1 (1x if $20) to 6x
                cmpr    comp,limh       wc      '..check if last 1 within window
        if_nc   cmp     comp,liml       wc
        if_nc   mov     comp,buf2               '0 sample and last 1 within window,
        if_nc   mul     comp,#3                 '..normalize last 0 (2x if $20) to 6x
        if_nc   cmpr    comp,limh       wc      '..check if last 0 within window
        if_nc   cmp     comp,liml       wc
        if_c    jmp     #.scroll                'if not $20, just scroll
                add     buf0,buf2               '$20 (space),
                shl     buf0,#16-3              '..compute bit period from 6x 0 and 2x 0
                or      buf0,#7                 '..set 8 bits
                wxpin   buf0,#rx_rcv            '..set rx pin baud
                dirl    #rx_rcv                 '..reset receiver pin
                dirh    #rx_rcv                 '..(re)enable receiver pin to (re)register frame
                mov     baud,buf0               '..save baud for transmit
                mov     rxbyte,#$120            '..signal receiver ISR to ignore pin, enter space
                trgint2                         '..trigger serial receiver ISR in case it wasn't, already (<50k baud)
.scroll         mov     buf0,buf1               'scroll sample buffer
                mov     buf1,buf2
                reti1                           'if $20 (space), serial receiver ISR executes next
'
'
' Serial receiver ISR
'
receive_isr     clrb    rxbyte,#8       wc      'triggered by autobaud? if so, rxbyte = $20 (space)
        if_nc   akpin   #rx_rcv                 'triggered by receive, acknowledge rx byte
        if_nc   rdpin   rxbyte,#rx_rcv          'triggered by receive, get rx byte
                wrlut   rxbyte,head             'write byte to circular buffer in lut
                incmod  head,#lut_btop          'increment buffer head
                reti2
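For anyone wanting to experiment with the window logic on a PC, here is a rough Python model of the 6,1,2 check in the autobaud ISR above. It is a reading of the listing, not a verified equivalent: it returns the bit period as a plain number of sysclks rather than the shifted value written to the pin, and it ignores register width and multiplier limits. The function names are made up.

```python
def in_window(value: int, ref: int) -> bool:
    """True if value lies within ref +/- ref/16, like the limh/liml test."""
    tol = ref >> 4                      # shr limh,#4
    return ref - tol <= value <= ref + tol

def detect_space(buf0: int, buf1: int, buf2: int):
    """buf0, buf1, buf2: sysclk durations of the last low, high, low states.

    Returns the estimated clocks per bit if the 6:1:2 pattern of $20
    matches, otherwise None.
    """
    if not in_window(buf1 * 6, buf0):   # high state scaled from 1x to 6x
        return None
    if not in_window(buf2 * 3, buf0):   # low state scaled from 2x to 6x
        return None
    return (buf0 + buf2) / 8            # 6 + 2 = 8 low bit periods in total

# 115200 baud at an assumed 20MHz sysclk is about 173.6 clocks per bit.
print(detect_space(1042, 174, 347))
```

Feeding it state durations that are not in the 6:1:2 ratio returns None, which corresponds to the "just scroll" path.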
I wanted something that does not need a special character first, but a common text character. MSB-set values are not good, I think, because some O.S.'s tend to ignore and not pass them. I wanted something that would be conversant with a serial terminal that someone is typing into. It makes stepping in easy.
It would be GREAT to get this going a lot faster, of course, so that loading could be a single step.
It's trivial to have your download process include a tiny initial download at 115kbps, then jump to something much higher from there to load the full 512kB+.
That comes with caveats - you now have two code pieces to manage, and this all assumes you have working Crystal & PLL.
Suppose the P2 PLL has issues ?
One issue I am sure Chip is concerned about is having the startup be reliable across the entire range of working temps and voltage levels (as with the P1). Sure you can get higher bit rates in ideal conditions, but what about in a hot or cold environment, or when running at 3v instead of 3.3v?
Yup, that is what AutoBaud manages.
Chip even has the ability to autobaud live, (not sure how he does that), but that will track drift even during download.
Drift during download I'd expect to be slight, as Process and Voltage are not changing, only temperature.
My suggestions allow users to track & measure even that change.
What I was getting at is that if you push the timings to the limits, then it may not be able to "autobaud" to the speed the computer/host is sending at when in extreme conditions.
Of course, Autobaud has limits, and increasing granularity at higher bauds - that's a self-evident given.
What I'm looking at, is ways to 'squeeze the last drop', to push the limits well above 115200.
MSB-set values are not good, I think, because some O.S.'s tend to ignore and not pass them. I wanted something that would be conversant with a serial terminal that someone is typing into. It makes stepping in easy.
I've never heard of that ? Any OS examples ?
The serial Terminals I'm used to, allow a $hh hex entry, so any string can be pasted, and they all have HEX modes, if you wanted to check what was being received.
Terminals are only used for early testing, most active use code will be PC/MCU operated.
Here is an example capture from my terminal, of the ENQ-ACK code running.
MCU sends ENQ (0xfe here) every 500us until a not-me-echo is seen, then it sends the ROM command and P2 image, rounded to 256 bytes. (0xff is overflow, == blank Flash)
It would be GREAT to get this going a lot faster, of course, so that loading could be a single step.
Agreed, that's what I'm looking into.
Do you have the full RxCode ? & a time budget for stop-bit house-keeping.
I think at higher rates, there could be benefit in 2 stop bits - a slight drop in rate, for a boost in achievable baud.
Scanning that AutoBaud, this line jumps out..
shl buf0,#16-3 '..compute bit period from 6x 0 and 2x 0
Ideally, in Autobaud you want to centre the choice, which means a simple shl is not quite right.
The remainder should be checked to make the round-up or round-down choice, so the chosen integer is always closest to the ideal.
That means (eg) at 400kbaud, you have +/- 1%, instead of -0/+2%
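A small Python sketch of the difference, assuming the receiver ends up with an integer number of sysclks per bit and the capture spans 8 bit times. The names are illustrative, and 407 is just an example capture value.

```python
def divisor_truncated(clocks_8bits: int) -> int:
    """Plain shift: always rounds down."""
    return clocks_8bits >> 3

def divisor_rounded(clocks_8bits: int) -> int:
    """Add half the divisor before shifting: rounds to nearest."""
    return (clocks_8bits + 4) >> 3

def period_error(divisor: int, ideal_clocks_per_bit: float) -> float:
    """Fractional error of the chosen bit period against the ideal."""
    return divisor / ideal_clocks_per_bit - 1.0

capture = 407                 # example: 8 bit times measured as 407 sysclks
ideal = capture / 8           # 50.875 clocks per bit
for name, fn in (("truncate", divisor_truncated), ("round", divisor_rounded)):
    d = fn(capture)
    print(f"{name:>8}: {d} clocks/bit, error {period_error(d, ideal) * 100:+.2f}%")
```

With truncation the error is always on one side and can approach a full step; with rounding it is centred and at most half a step either way.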
Once Baud maths is optimized, you can push further if you know the P2 Clock, via the BaudValue readback.
This may push a small MCU, but is certainly no problem for a PC Host.
Above show 3 'magic values', 2 with +10% P2 deviation, of 612.5k and 816.667k Baud, and one -8% = 612.5k
Errors are well under 1%, it just needs readback knowledge of the P2 Osc.
FT232H maths is similar, as the Virtual Baud clock there is 24MHz, instead of 24.5MHz in EFM8BB1
Jmg, I don't have any specific examples of OS's which don't like serial characters >$7F, as there likely aren't any, but it's been my experience that what a user is sometimes left to work with on a system is a poorly-written serial application that barely handles the basics. Then, there's the keyboard issue. I want it to work with characters which can be typed from a keyboard using the symbols printed on the keys (no resorting to something like ALT-2-5-5). And no caring about PC/Mac/Unix CR/LF issues. The tab character has even been rendered unusable in most every GUI. We need to work with the most basic subset of boneheaded characters and conventions. A human should be able to talk to it over a keyboard and terminal program. Sure, it can be sped up by automated comms, but I want the human element to be there, so that people can ease their way into the whole thing.
What I think is mainly limiting the auto-baud high end is the time it takes to run that 'autobaud_isr' routine. I think it's as fast as can be. To go faster, we need different principles. I was thinking about strings of '?' ($3F) characters, which cause two low bits periodically. That's not much to go on, I know. It would be great if we could use $C0's or something, so that we get low-rate transitions which can be counted better. I think if I used TWO smart pins to count the states, one for high and one for low, I might be able to double the top speed. Remember, also, we don't just have to use whole clock counts. Both smart pin TX and RX have an NCO mode. If we could just get good NCO values, we could go way higher. That's why I was thinking maybe a string of '??????????????????????' could be used to get a good average on those 2-bit high occurrences.
I'm still trying to absorb the rest of what you wrote.
.... I want it to work with characters which can be typed from a keyboard using the symbols printed on the keys (no resorting to something like ALT-2-5-5). And no caring about PC/Mac/Unix CR/LF issues. The tab character has even been rendered unusable in most every GUI. We need to work with the most basic subset of boneheaded characters and conventions.
ok, tho here the rapid-boot characters of ENQ & ACK, can still be > $7f, as they are not likely to be keyboard issues.
ie they are used to ensure both host/P2 are out of reset, and ready to proceed.
Someone on a keyboard is not worried about that last millisecond
You are compromising the system with that keyboard usage caveat. Are you sure it is valid & worthwhile ?
What P2 users will not have an any-hex-capable terminal ?
An appeal of 0xf8 etc, is the P2 can exit reset in the middle of a Char, and not trip over.
...To go faster, we need different principles. I was thinking about strings of '?' ($3F) characters, which cause two low bits periodically. That's not much to go on, I know. It would be great if we could use $C0's or something, so that we get low-rate transitions which can be counted better. I think if I used TWO smart pins to count the states, one for high and one for low, I might be able to double the top speed. Remember, also, we don't just have to use whole clock counts. Both smart pin TX and RX have an NCO mode. If we could just get good NCO values, we could go way higher.
I've refrained from suggesting smart pins, as one appeal of Software-centric is it covers the "what if smart pins have a flaw" risk, but I guess if you have a Test-wafer-run proof, plus other boot modes, that risk is covered elsewhere.
Adding i2c could help that risk-coverage a little too.
0x3f has appeal that it is 8 bit times from _/= to _/=, but a down side is if the P2 comes out of reset out of phase, can it recover?
(see below)
Could a smart pin be set for time interval capture, _/= to _/= ?
These are the capture precisions, at nominal 20MHz
That's why I was thinking maybe a string of '??????????????????????' could be used to get a good average on those 2-bit high occurrences.
Interesting idea, - we have used up to 50 character strings, to get very high measurement precision on RC osc, down to 15ppm LSB,
but the 'user at keyboard' requirement makes this harder to send.
I think those capture precisions are (just) good enough at 8b, to not need multiple chars.
That '?' also has reset-exit issues mid-char to ponder... maybe a SmartPin twin capture /../ =8 and \..\=7 is enough ?
A system really pushing the envelope to SW-set rates, could do this
Rst-P2, Send ? @ 115200, ENQ_P2Osc, use that .072% reading to pick a 'sweet spot value', as above, and
Rst-P2, Send ? @ 1M+, ENQ_P2Osc,
addit: More detailed Split case analysis (reset exits mid-char, splitting capture )
a) Try : SmartPin Twin Interval / to / & PulseWidthLow, on 0x3f '?' char
===\_s_/=0=.=1=.=2=.=3=.=4=.=5=\_6_._7_/=P==T=\_s_/=0=.=1=.=2=.=3=.=4=.=5=\_6_._7_/=P==T=\_s_/=0=.=1=.=2=.=3=.=4=.=5=\_6_._7_/=P=
/ to / | 8 | 2+T | 8 | 2+T | 8 |
PWL | 2 | | 1 | | 2 | | 1 |
OK Err OK Err
Every // checks PWL, if PWL ~ (// DIV 4) = OK, Split is 2b+T vs 1b,
a1) Try : SmartPin Twin Interval / to / & \ to \ on 0x3f "?" \\ = 7 // = 8
===\_s_/=0=.=1=.=2=.=3=.=4=.=5=\_6_._7_/=P==T=\_s_/=0=.=1=.=2=.=3=.=4=.=5=\_6_._7_/=P==T=\_s_/=0=.=1=.=2=.=3=.=4=.=5=\_6_._7_/=P=
| 8 | 2+T | 8 | 2+T | 8 | //
7 | 3+T | 7 | 3+T | 1 | =\_/=
f r OK:8,7 Err:2b+T,3b+T OK Err
Valid is \\ = 7/8 of // and invalid is \\ > // ;
Could use both capture values as : 7b*8+8b*7 = 112b so /112 -> b, or (7b+8b)/15 -> b, any better than 8b/8?
and try similar rules on 0x20, space char
b) SmartPin Twin Interval / to / & \ to \ on 0x20 " " \\=7 //=3
=======\_s_._0_._1_._2_._3_._4_/=5=\_6_._7_/=P==T=\_s_._0_._1_._2_._3_._4_/=5=\_6_._7_/=P=
/ to / | 3 | 7+T | 3 |
\ to \ | 7 | 3+T | 7 | 3+T |
f r OK:7,3 Err:7,7+T OK
a) & b) assume an Interval (PW / to / ) measurement auto-rearms, and gives the next time on the next edge. Nothing (zero?) on the first edge.
Usually, Char capture is ok, and OK exits, but we do need to cover the second pulse arriving first.
I think a reset exit mid-way would result in a first reading of // = T+2b, PWL = 1b, so it fails the test of ( // DIV PWL ~ 4 ) for a)
for b) 7b,3b is OK, 7b,7b+T is a fail; similar test, slight variant on capture.
This test may be enough, allowing 9600 to be kept as a valid lower baud rate.
To avoid repeating '?', valid Autobaud-done should echo a single char, maybe '.' - this makes '?' the MCU ENQ, and '.' the ACK
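The a1) validity rule for '?' (falling-to-falling should be about 7/8 of rising-to-rising, and a split capture shows falling-to-falling greater than rising-to-rising) can be tried out with a Python sketch like the one below. The 5% tolerance and the function name are assumptions for illustration; the combined (7b+8b)/15 estimate is one of the options floated above, not a settled choice.

```python
def check_question_mark(rise_to_rise: float, fall_to_fall: float,
                        tol: float = 0.05):
    """Return estimated clocks per bit, or None if the capture looks split.

    rise_to_rise: interval between rising edges (8 bit times if valid)
    fall_to_fall: interval between falling edges (7 bit times if valid)
    tol: assumed tolerance on the 7/8 ratio
    """
    if fall_to_fall > rise_to_rise:
        return None                     # split case: 3b+T versus 2b+T
    if abs(fall_to_fall / rise_to_rise - 7 / 8) > tol:
        return None
    return (rise_to_rise + fall_to_fall) / 15   # (7b + 8b) / 15

bit = 173.6                             # about 115200 baud at an assumed 20MHz
print(check_question_mark(8 * bit, 7 * bit))    # valid capture
print(check_question_mark(2 * bit, 3 * bit))    # reset exit mid-char
```

A capture that straddles two characters is rejected regardless of the idle gap T, because the gap adds equally to both intervals and keeps falling-to-falling the larger of the two.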
I don't have any specific examples of OS's which don't like serial characters >$7F
I have found when using serial ports in Visual Studio I need to configure the port using the following encoding otherwise only "text" and some control characters are transmitted.
I have had problems, even with PST, with characters other than $1F<char<$7F. Note that $7F is a problem with some terminals!
Remember the old "AT" and "at" command set... Not only autobauding, but case and parity were able to be determined from these two characters.
I like, and have used, the <space> $20 character to autobaud for PS2 keyboards in order to connect them to the P1 using 1pin. (see obex for 1pin TV and Kbd)
I don't have any specific examples of OS's which don't like serial characters >$7F
I have found when using serial ports in Visual Studio I need to configure the port using the following encoding otherwise only "text" and some control characters are transmitted.
That's the kind of madness that I'd like to avoid. Computers used to work a lot better. I don't trust them, anymore, to do even the simplest things sensibly.
In this, I'm in the conservative camp as well - sometimes you have to work with whatever's available, and no time to figure out how to send hex chars from whatever it is (something I haven't actually done with any terminal tool I've ever used, actually - so I wouldn't be able to do that without reading up, even with e.g. minicom, which I've used for decades. Or Seyon. The latter I used for years when on the road. Never needed hex chars, so I don't even know if it can do it.)
Space char is good, and it's been used for the same purpose in other systems before.
Comments
If boot code could listen for MISO on both P58 and P59 then P2 could support both 3-pin and 4-pin SPI without any resistors.
Yes, but I'm worried, though, about P58 floating and being parasitically coupled to P59 enough to transition along with it. Maybe a minimum-width low pulse on P59 could get around this.
Wouldn't this solve that?
A single pullup on DI should be enough to avoid the parasitic effects Chip is concerned with.
Yup, a pullup on the P2-DO pin. If no data reply appears on P2-DO, you then try 4P mode (ie reply on separate pin)
Pullups above 10k should still allow 1k DI-DO connection, should users wish to do 3P that way.
Here is an example capture from my terminal, of the ENQ-ACK code running.
Non-ascii chars display as <hh>
MCU sends ENQ (0xfe here) every 500us until a not-me-echo is seen, then it sends the ROM command and P2 image, rounded to 256 bytes. (0xff is overflow, == blank Flash)
I have found when using serial ports in Visual Studio I need to configure the port using the following encoding otherwise only "text" and some control characters are transmitted.
This had me pulling my hair out for a while.