I really like the idea of reset and then immediate serial blasting, looking for responses. That simplifies the host software in some ways (reliability) and minimizes the overall download time.
To me, this aspect is quite important.
The smart pin X-Periods mode works nicely for 0x55 Baud-trim, but the Coarse baud still has some wrinkles I worry about....
The best-rejection-precision character time measurement does seem to be 5:7
">" has that, but the tHH side of this has a false positive at 5 Stop bits. 5 Stop bits are rare, but not impossible.
"0" is the tLL equivalent of ">", and it avoids Stop bit positives, but "0" is rather useful elsewhere...
One possible rule would be to avoid needing "0" during the Coarse Autobaud + response polling phase ?
Another approach is to impose a rule of gaps in the 'reset and then immediate serial blasting, looking for responses' eg 1 or 2 or 3ms min of 'no edges' ?
For small strings, most USB/UART buffers can packet-group characters and so avoid OS-gaps within packets.
Any OS-Gap between packets is tolerated.
Q: What is the smallest string sequence that can Coarse AutoBaud, and obtain an echo ?
Ideally, a smart-pin mode would exist that acts like a High-pass Monostable - it interrupts when no-pin-changes are seen for X-ms
Q: Is there such a smart pin mode ?
I guess you could emulate this, with code that reloads a timer on Pin-edges, and eventual timer-int means no-pin-edges for X-ms.
Using this, you now know message phase, no matter what reset-exit cases exist, and the first char can be 0x55, for the X-Periods mode.
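A minimal sketch of the software emulation described above, modelled in Python rather than P2 code: every pin edge reloads a timer, and the timer expiring means no edges were seen for the chosen interval. The function name, the microsecond units and the list-of-timestamps input are assumptions for illustration only.

```python
def first_idle_gap(edge_times_us, min_gap_us):
    """Return the time at which a retriggerable timer would expire, i.e. the
    first moment when no pin edge has been seen for min_gap_us.

    edge_times_us: ascending timestamps (microseconds) of pin edges.
    Returns None if there were no edges at all (the timer is never started).
    """
    if not edge_times_us:
        return None
    # Each edge reloads the timer; it only expires if the next edge is late.
    for prev, nxt in zip(edge_times_us, edge_times_us[1:]):
        if nxt - prev >= min_gap_us:
            return prev + min_gap_us
    # No more edges after the last one, so the timer expires after it.
    return edge_times_us[-1] + min_gap_us


# Edges 10us apart, then a 2ms pause before the next packet starts.
print(first_idle_gap([0, 10, 20, 30, 2030, 2040], 1000))
```

Once the expiry fires, the receiver knows the next edge starts a new packet, so the first character after the gap can be taken as the 0x55.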
Avoiding the rejection tests of the 5:7, I calculate, moves the Autobaud ceiling from ~3.57MBd to over 5MBd (+/-1.6% measure point)
Test results on a new CP2102N-EK (the N suffix matters here)
(CP2102N is $1.1615 /1k at Digikey)
Data specs 3MBd, but tests show you can actually request and set the next two values above that, 3.428571M and 4M.
4.8M gives a setting error message & all values <=4M will round to nearest physical Baud value.
There is no sign of added-stop bits, even up to 4MBd, in Tx only tests.
I also tested code of the general REPEAT with Delay form
This gives an alternative to a Coarse AutoBaud character and allows "U" to be used for both, as the P2 can wait for a pause in edges, before looking for "U"
BaudDelay_ms = Len(AutoBaud_Enq_String)*10*1000/BaudRate // minimum time to transmit the string (10 bits per char).
TO = 0
REPEAT
  Tx(AutoBaud_Enq_String) // eg "U00?" or "U?"
  Delay(ms1)              // Some delay (>= BaudDelay_ms) to ensure gaps between packets
  TO = TO+1
UNTIL (RxCount > 0) OR (TO > TimeOut)
In Windows, this wobbles about a little, but you can see a 1ms quanta from USB FS UART frames, and can use a calculated BaudDelay_ms to give a floor delay value. At high baud rates, short strings can space to 1ms or sometimes 2ms +
I'm guessing on a RaspPi, the results are similar, as USB-UARTS (FS) will all have the USB 1ms Frame influence.
On a Small Host MCU, with inbuilt UART, you can easily control to well under 1ms pauses, if needed.
With unknown Reset-Exit delays in the hundreds of milliseconds region, a 1-2ms polling rate seems fine ?
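The REPEAT loop and the BaudDelay_ms floor can be sketched in Python as below. This is only an illustration: `tx`, `rx_count` and `sleep` are hypothetical callables supplied by the caller (not any real serial library API), and adding the transmit time to the gap assumes `tx` returns before the bytes have left the wire.

```python
import time


def baud_delay_ms(enq_string, baud_rate, bits_per_char=10):
    """Minimum time in ms to transmit enq_string (start + 8 data + 1 stop bit)."""
    return len(enq_string) * bits_per_char * 1000.0 / baud_rate


def poll_for_reply(tx, rx_count, enq_string, baud_rate,
                   gap_ms=1.0, timeout_polls=500, sleep=time.sleep):
    """Send enq_string, pause, and repeat until something has been received.

    Returns the number of polls used, or None if timeout_polls is reached.
    """
    # Floor delay: time on the wire plus the wanted edge-free gap.
    delay_s = (baud_delay_ms(enq_string, baud_rate) + gap_ms) / 1000.0
    for poll in range(1, timeout_polls + 1):
        tx(enq_string)
        sleep(delay_s)
        if rx_count() > 0:
            return poll
    return None
```

At 9600 baud `baud_delay_ms("U00?", 9600)` is a little over 4ms, which is why the lowest targeted baud rate sets the size of the delay.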
For the purposes of reset exit, the size of the delay is likely mainly set by lowest targeted Baud rate - which is what ?
eg at 9600 baud, edge spacing in an AutoBaud_Enq_String will be something under 1ms
I doubt anyone would develop boot a P2 at 9600, but maybe these new long range RF links need checking, as they will have lower-area rates ?
Google finds 12500 Baud mentioned for LoRa ? - so that makes 9600 a practical low-end ?
Jmg, I just went to review your postings and I'm remembering that we can go 2.25M baud, based on how long it takes the code to run to parse and store the data bytes. The only way we are going to get any higher than that is to up the RS osc frequency. This makes me wonder if it's worth changing anything. I mean, even if we can autobaud at higher rates, we don't have the MIPS to handle the data. What do you think?
Is that 2.25MBd based on 1 or 2 stop bits, and is that with a -30% margin on the 20MHz ?
(ie if 20MHz Min can be spec'd, does that 2.25MBd become > 3MBd ? (3.375) )
2 Stop bits is also practical to specify, if we take that 2.25MBd as margined, you get 3.375 1 stop bit or 3.7125MBd 2 stop bits, for > 20MHz, and that can then accept the practical 3.428571MBd value in the table above.
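A worked check of those figures, as a sketch only. It assumes, as the question does, that 2.25MBd is the margined figure for a clock at 2/3 of 20MHz (so a guaranteed 20MHz scales it by 1.5), and that the limit is per character, so 11 bits per character (2 stop bits) instead of 10 raises the usable baud by 11/10. The function name is made up for the example.

```python
def scaled_baud(base_mbd, clock_ratio, bits_per_char=10):
    """Scale a per-character-limited baud rate by clock ratio and frame length."""
    return base_mbd * clock_ratio * bits_per_char / 10


one_stop = scaled_baud(2.25, 1.5)        # 10-bit frames
two_stop = scaled_baud(2.25, 1.5, 11)    # 11-bit frames (2 stop bits)
print(one_stop, two_stop)

# 3.428571MBd sits between the two results, so under these assumptions
# it only fits when 2 stop bits are specified.
print(one_stop < 3.428571 < two_stop)
```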
How many cycles in the RxINT is that value based on ?
I think it is worth a change to 20MHz min, - did you check what the Test chips actually give you now yet ?
As for Code changes, if there is a parse and store limit, then Autobaud really only needs to be reliable, to just above that.
The tests above, show the practical increments in Baud Value, for low-cost, modern USB-UARTS.
Message issues :
The ">" does have a false window, with 5 stop bits, but maybe that is rare enough to ignore.
The CP2102N just tested has no real 5 stop bit risk, nor does the Exar and modern FTDI parts, and older parts vary that effect with Baud, so you can suggest in the DOCs users start with a lower value on older/simpler UART links, and increase when confident.
Other changes: I would avoid sending ">>", as that does have reset-exit-phase issues, but the second 'dummy' char after ">" only really needs two edges, so a simple change to another (2 x =\_) char, at the host end, is fine there.
I would like to see a defined shortest string for P2 reset exit-polling, maybe by assuming defaults for the masks.
eg ">" then Dummy then "?", with One-Pin working in this mode gives something along the lines of
"> ?" Two Pin poll string
or
"> *?" One Pin poll string
Reply is a single ".", when correct phase & autobaud done, no reply when still waiting on P2 reset exit.
Addit: I can see only a one-way "*" for change from default to one-pin, adding a two-way command, could help testing ?
Checking the FTDI FT231X, reveals differing behavior
Baud choices are much more coarse, with 3M, then 1.5MBd, then 3M/(2N+k/8)
ie unlike the CP2102N, which has 7 choices between 3M and 1.5M, the FTDI part only has a 24MHz virtual BaudClock, below 1.5MBd
Strangely, unlike the CP2102N, which will cleanly round to the nearest baud, the FTDI driver does not round, but jumps to 115200 for any Baud it does not recognize. That requires user care in testing.
ie 3M/(2+3/16) gave 115200. (as does 1600000, or 2000000 etc )
I'd call that bordering on the lazy. SiLabs round properly.
Duplex checks show neither SiLabs nor FTDI can manage 3MBd full duplex, but the P2 is essentially half-duplex in operation, so that is unlikely to be an issue.
The EXAR part, looks to manage Duplex with no dropped characters, to 4MBd, so it looks the best FS USB Bridge part.
The FT232H/2232H can also do high MBd duplex, but they are more costly and HS USB.
Jmg, how would a two-way command help testing if two-way is the default, already?
We can do 2.25M baud at 20MHz. At 2/3 of 20MHz (13.33MHz), we can do 1.5M baud. So, if the RC osc is not made to go at least 20MHz, we will have to specify 1.5M baud to accommodate the lower 13.33MHz possibility.
I was thinking you could switch to one-Pin to confirm that works, then switch back to two-pin for the other boot command tests, without needing a reset.
I guess you could do some tests in Two-Pin, and then be careful to check one-pin at the end
There may be cases where a user requests one pin, then if they fail to see an ack, they switch back to 2 pin.
I can foresee an MCU-Host being coded that way, to cover both PCB layouts, and it does not really want to reset a P2 if it finds there is no One-Pin wiring.
ie fastest is to AutoBaud & request One-Pin, then check 2 pin, if no one-pin is seen, then exit with PCB-Wiring known.
A single code library can then manage either PCB layout.
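A rough sketch of how such a host library might probe the wiring, using the Prop_Chk strings and "Prop_Ver" reply quoted elsewhere in this thread. The `exchange` callable is hypothetical: it stands for "send this string, then listen on the one-pin or two-pin path, and return whatever came back". The sketch also assumes an unanswered one-pin request leaves the P2 able to accept a following two-pin request without a reset.

```python
ONE_PIN_ENQ = "> Prop_Chk*0 0 0 0 "   # "*" requests a one-pin reply
TWO_PIN_ENQ = "> Prop_Chk 0 0 0 0 "   # no "*": default two-pin reply


def detect_wiring(exchange):
    """Try one-pin first, then two-pin; return which one answered, else None.

    exchange(enq, one_pin) -> reply string, '' if nothing was heard.
    """
    if exchange(ONE_PIN_ENQ, True).startswith("Prop_Ver"):
        return "one-pin"
    if exchange(TWO_PIN_ENQ, False).startswith("Prop_Ver"):
        return "two-pin"
    return None


# Example with a fake board that is wired for two-pin only.
def fake_two_pin_board(enq, one_pin):
    return "" if one_pin else "Prop_Ver Au"


print(detect_wiring(fake_two_pin_board))
```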
So that means ~ 97 SysClks per Byte overhead ?
I get ~14 SysClks in the smaller RxINT I gave above, and ~ 38+18 SysClks in the Base64-decode, plus Hub align effects.
In my MCU code, I fetch 3 bytes then send 4, (interleaved a little for better speed) to keep the encode simpler, - you may get higher average speed, if you receive and check for 32b, and then Wrlong to HUB, rather than wrbyte 4x as frequently ? (on average, a wrlong is needed every 5.333' Chars in ~ 488 SysClk rate )
Jmg, the "*" character is checked for on each command. If it's not present, that command will be two-wire. So, to stay in single-wire mode, every command must contain the "*".
I tuned up the Base64 (Prop_Txt) code so it ran just as fast as the hex (Prop_Hex) code. If I made the Base64 faster, I would also have to make the Prop_Hex code faster, which seems harder to do, as there are fewer opportunities for improving it, compared to what I was able to do for the Base64 code. I believe things are pretty sufficient, already. We could not get enough improvement, anyway, to get us to the next major baud (from 1.5M to 2M). We could only squeeze maybe 10% of the time out.
OK, that's workable.
What is the shortest string that can AutoBaud Coarse and then get an echo, in one-Pin and 2 Pin modes ?
OK. Did you look at the wider-catch range of 0x55 ?
There are other baud values above 1.5MBd (unless it is FT231X which jumps to 3MBd), but the biggest gain seems to be from getting >= 20MHz
CP2102N allows
2400000
2181818.18
2000000
1846153.84
1714285.71
1600000
1500000
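The list above follows the 48M/(2*N) pattern noted with the CP2102N test data, i.e. 24MHz divided by an integer. A small Python model of the rounding seen in the tests (1550000 went up to 1.6M, 1540000 went down to 1.5M) is sketched below. It assumes the divider is simply rounded to the nearest integer and that anything above 4M is rejected; this models the observed behaviour and is not the SiLabs driver's actual algorithm.

```python
BAUD_CLOCK = 24_000_000   # 48M/2, the "virtual baud clock"
N_MIN = 6                 # 24M/6 = 4M, the highest rate that was accepted


def cp2102n_rates(n_max=16):
    """Available rates from 4M down to 24M/n_max."""
    return [BAUD_CLOCK / n for n in range(N_MIN, n_max + 1)]


def cp2102n_nearest(requested):
    """Rate the part would settle on, under the rounding assumption above."""
    if requested > 4_000_000:
        raise ValueError("above 4M: reported as an invalid baud rate in the tests")
    n = max(N_MIN, int(BAUD_CLOCK / requested + 0.5))
    return BAUD_CLOCK / n


between = [r for r in cp2102n_rates() if 1_500_000 < r < 3_000_000]
print(len(between), [round(r, 2) for r in between])
print(cp2102n_nearest(1_550_000), cp2102n_nearest(1_540_000))
```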
It would take 19 characters for one-pin mode, too:
"> Prop_Chk*0 0 0 0 "
About the wider capture 0x55, I started looking at your code earlier today, but got sidetracked with FPGA compilation issues. I'll go back to it soon. For now, I'm preparing an update with the current ">"-only autobaud technique.
Isn't there a "?" query, that gives a shorter echo than Prop_Chk ? ( "?" -> "." or "!" reply )
You have to give those four INA/INB masks and values before a chip will respond, since there could be different chips connected. The Prop_Chk command ends when a non-hex character follows the last hex value. At that point, the selected chip will send out its "Prop_Ver Au" string and begin responding to "?" characters with "." characters. It waits 10ms before outputting, to allow turn-around time for the host, in case it's needed. So, the first thing you'd get out would be the "Prop_Ver". You might as well wait for that string, rather than harass it with "?" characters.
It waits 10ms before outputting, to allow turn-around time for the host, in case it's needed. So, the first thing you'd get out would be the "Prop_Ver".
That's quite a lot of characters, and the 10ms seems quite a long turn around time ?
All of this adds to the reset-to-run time, when using a Local MCU.
Well, that 10ms would be waited for only once. I could certainly shorten it to 1ms. I just wanted to give a host system time to turn around, in case it was operating the link in a half-duplex mode and running from a slow language in a cumbersome operating system. One ms would probably not accommodate that scenario.
Also, I figure that the connect/handshake strings should be long enough that they don't get mistaken from what might be incidental data flowing. I mean, keying off a single character would be kind of reckless in some scenarios.
I guess the String length issue can be mitigated by upping the baud rates. (1ms delay is ~230k Baud)
The 10ms wait depends on the phase of reset exit, so the host MCU has to repeat an Enq and listen pause, which means it could double, to 20ms added delay.
Addit: one idea could be to have the Host specify the Delay ?
Since you already have fields for multiple P2 support, one more for Turn-Around define is not much to add, and it allows systems that are local and fast, to skip a long delay and also allows slower systems to adjust what they need. It might be 10ms is not enough on some future OS design, whilst << 1ms is fine for a local MCU host.
This is the proposed patch/change to the Base64 table generate.
' Table Option if ">" is used for Coarse Autobaud & "U" for trim.
mov x,#10 '"0".."9" --> $0..9
callpa #"0",#fill
mov x,#20 '"A".."T" --> $0A..$1D
callpa #"A",#fill
mov x,#34 '"V".."w" --> $1e..$3F
callpa #"V",#fill
This has no effect on P2 read speed, as that is table lookup, but this simpler Base64 encode (fewer tests) does give significant gains in speed and size on the small MCU host end. Specifically, it halves the size and means it can in-line, rather than call, for even more savings.
The improved Base64 means the cheapest MCU candidate can keep up with its highest Baud in an unrolled Base64 code block.
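A host-side sketch of the proposed mapping in Python, to show why the encode needs fewer tests: "0".."9", "A".."T" and "V".."w" are three contiguous ASCII runs, so each 6-bit value needs at most two compares and one add. The function names are made up for this example, and packing three bytes MSB-first into four characters (as conventional Base64 does) is an assumption, not something the patch itself specifies.

```python
def encode6(v):
    """Map a 6-bit value to its character using two compares and one add."""
    if v < 10:
        return chr(v + 0x30)     # $00..$09 -> "0".."9"
    if v < 0x1E:
        return chr(v + 0x37)     # $0A..$1D -> "A".."T"
    return chr(v + 0x38)         # $1E..$3F -> "V".."w" ("U" is skipped)


ALPHABET = "".join(encode6(v) for v in range(64))
DECODE = {ch: v for v, ch in enumerate(ALPHABET)}   # table lookup, as on the P2 side


def encode3(b0, b1, b2):
    """Pack three bytes MSB-first into four characters (assumed bit order)."""
    word = (b0 << 16) | (b1 << 8) | b2
    return "".join(encode6((word >> shift) & 0x3F) for shift in (18, 12, 6, 0))


def decode4(chars):
    """Inverse of encode3: four characters back to three bytes."""
    word = 0
    for ch in chars:
        word = (word << 6) | DECODE[ch]
    return (word >> 16) & 0xFF, (word >> 8) & 0xFF, word & 0xFF


print(ALPHABET)
print(encode3(0x12, 0x34, 0x56), decode4(encode3(0x12, 0x34, 0x56)))
```

Because "U" never appears in the encoded data, 0x55 stays free for the baud-trim character.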
One detail that emerged from this is, it would be nice to have Smart Pins able to trigger on > Some Width.
It is not clear if any of the many modes do that ?
To operate, this would need some preset-value, and then a trigger/gate condition needs to reset the counter when inactive.
ie reset with pin is LOW or HIGH or on Edge.
I can see some modes do capture and smart-preload, so that is close - but lacks the threshold/overflow interrupt.
I think all the pieces are in a Smart Pin Cell, but less clear is if you can configure this way.
Uses:
This would allow Smart Pin HW only sense of BREAK signals on UARTS, and also sense Pauses/gaps in streaming data, and be used as a general watchdog.
Interrupts+SW reload can do this at low speeds, but at higher rates the interrupt can consume most of a COG.
That could be done with some modifications to the smart pin guts. I will look into it today.
Also, some kind of one-shot generator that could run autonomously would be useful. The SMPS PWM modes might be able to do this, but I need to review.
Sounds good. I think the asynchronous serial receive mode has no means to sense a Break in HW, so that means Smart Pins should be able to be configured to cover that.
If it works in either polarity, that supports both classic break (extended Low time past Stop Bit) and a Data-Pause.
With most UARTS these days either having significant buffers, or being interrupt serviced on small MCUs, it is common to have naturally packed data messages.
One thing about BRK is that terminal programs never seem to support it. It's a low-level matter, only.
What's the max baud rate it'll do without breaking conventional base64?
Currently, 2.25M baud. The change that Jmg is suggesting would speed things up on the HOST side, since fewer computations would be needed to generate the Base64 data. If a table is used, it doesn't matter. I kind of like what we have now, as it's more standard and there are websites that produce Base64 strings from hex data you paste in. That makes it easier to bring up the know-how.
Yes, it is something of a pain to generate on a PC, but it is common in LIN & other controller interfaces.
You can kludge a Break a couple of ways on a PC
* Use a force parity, in which case you need to use 2 stop bits for normal data, & this is a modest break, not detectable in some systems.
* Change baud rate, but that needs care, as the elasticity of buffers means you might change before all data is sent, and a PC check for TxBuffer=0 only reads the Operating system buffers, not the hardware USB bridge buffers.
Safest seems to be to time the transmit, based on baud and length.
I see the EXAR parts have some HW support for Break - they have a wide mode that flags Rx Break (but strangely lacks Tx-Break equivalent) but does have a register for low-level break commands.
Sorry I guess I was asking jmg what impact keeping conventional base64 compatibility (no inserted "U") would have on his efm8 micro baudrate.
Yes, keeping to the standard would be good if possible. There used to be a tool called 'uudecode' we used to use with usenet groups in the 90's, I think it's the same encoding
The MCU I'm testing needs up to 31 SysClks for the older scheme, and up to 23 sysclks for the new one.
On one variant, that extra overhead is enough to force the next-lower baud, which drops from 1.3824MBd to 0.6912MBd ie it halves the possible baud rate.
The new one also frees 0x55, which locking into the "conventional base64' cannot do.
0x55 gives a far greater trim range to the AutoBaud
This is a clear case, where 'conventional' means significantly compromised performance.
Currently, 2.25M baud. The change that Jmg is suggesting would speed things up on the HOST side, since fewer computations would be needed to generate the Base64 data.
Yes. The MCU side can be significantly smaller and faster with a smarter base64
If a table is used, it doesn't matter. I kind of like what we have now, as it's more standard and there are websites that produce Base64 strings from hex data you paste in. That makes it easier to bring up the know-how.
Is anyone really going to copy from a website during P2 development ?!
You simply document the rules, and Map
"0".."9" <= 0..9
"A".."w" <= '0xa..0x3f' (Skip U)
is quite easy to explain, and is very similar (superset) of intel hex mapping.
less words and rules than this ( & Table generate is smaller)
Further to boot choices, I see this OctaFLASH, which is basically 8bit SPI - Similar to HyperRAM, but with a 1b SPI fallback.
Comes in 128M/256M/512M/1G-bit memories, with 300mil SO16 as an option.
From what I remember of the SPI BOOT, this should work ?
If the WRITE commands pass thru from the PC side, for PGM, the ROM only needs to reset and issue 03H ?
It sounds like it should work. And, yes, the ROM only needs to reset and read ($03) the device.
A bit more news on Octaflash, I see this from Silabs newsletter - their new GG11 have native Octal FLASH memory support.
Octal/Quad-SPI Flash Memory Interface
Supports 3 V and 1.8 V memories
1/2/4/8-bit data bus
Quad-SPI Execute In Place (XIP)
Up to 30 MHz SDR/DDR at 3 V
Even played with it in synchronous mode @ 40Mbaud a while back.
CP2102N & 24MHz virtual baud clock UARTS ( N=5;;N=N+1;48M/(2*N))
Tested CP2102N-EK, Simplex (Tx only) - sending 100,000 "U" from test file, one stop bit - Frequency Counter is Baud/2.
4800000      Term Reports : Invalid baudrate
4000000      fc_U = 2.0017     1-4M/(2.00085M*2) = 0.042%
3428571.428  fc_U = 1.71508    48M/(171508*2)/2/10 = 6.99675
3000000      fc_U = 1.50067
2666666.66   fc_U = 1.33400
2400000      fc_U = 1.20057
2181818.18   fc_U =
2000000      fc_U = 1.00040
1846153.84   fc_U =
1714285.71   fc_U = 857460     1-1714285/(857460*2) = 0.037%
1600000      fc_U = 800310     1-1.6M/(800310*2) = 0.0387%
1500000      fc_U = 750390
1411764.70   fc_U =
1333333.33   fc_U =
1263157.89   fc_U =
1200000      fc_U = 600.23
1142857.14   fc_U =
1090909.09   fc_U =
1043478.26   fc_U = 521197*2 = 1042394   1-ans/1043478 = 0.103%
1000000      fc_U = 500.13
960000       fc_U =
115200       57713
Check requests not == 48M/2N
1550000      fc_U = 800310     rounds up
1540000      fc_U = 750250     rounds down
This gives an alternative to a Coarse AutoBaud character and allows "U" to be used for both, as the P2 can wait for a pause in edges, before looking for "U"
In Windows, this wobbles about a little, but you can see a 1ms quanta from USB FS UART frames, and can use a calculated BaudDelay_ms to give a floor delay value. At high baud rates, short strings can space to 1ms or sometimes 2ms +
I'm guessing on a RaspPi, the results are similar, as USB-UARTS (FS) will all have the USB 1ms Frame influence.
On a Small Host MCU, with inbuilt UART, you can easily control to well under 1ms pauses, if needed.
With unknown Reset-Exit delays in the hundreds of milliseconds region, a 1-2ms polling rate seems fine ?
For the purposes of reset exit, the size of the delay is likely mainly set by lowest targeted Baud rate - which is what ?
eg at 9600 baud, edge spacing in an AutoBaud_Enq_String will be something under 1ms
I doubt anyone would develop boot a P2 at 9600, but maybe these new long range RF links need checking, as they will have lower-area rates ?
Google finds 12500 Baud mentioned for LoRa ? - so that makes 9600 a practical low-end ?
Is that 2.25MBd based on 1 or 2 stop bits, and is that with a -30% margin on the 20MHz ?
(ie if 20MHz Min can be spec'd, does that 2.25MBd becomes > 3MBd ? (3.375) )
2 Stop bits is also practical to specify, if we take that 2.25MBd as margined, you get 3.375 1 stop bit or 3.7125MBd 2 stop bits, for > 20MHz, and that can then accept the practical 3.428571MBd value in the table above.
How many cycles in the RxINT is that value based on ?
I think it is worth a change to 20MHz min, - did you check what the Test chips actually give you now yet ?
As for Code changes, if there is a parse and store limit, then Autobaud really only needs to be reliable, to just above that.
The tests above, show the practical increments in Baud Value, for low-cost, modern USB-UARTS.
Message issues :
The ">" does have a false window, with 5 stop bits, but maybe that is rare enough to ignore.
The CP2102N just tested has no real 5 stop bit risk, nor does the Exar and modern FTDI parts, and older parts vary that effect with Baud, so you can suggest in the DOCs users start with a lower value on older/simpler UART links, and increase when confidant.
Other changes: I would avoid sending ">>", as that does have reset-exit-phase issues, but the second 'dummy' char after ">" only really needs two edges, so a simple change to another (2 x =\_) char, at the host end, is fine there.
I would like to see a defined shortest string for P2 reset exit-polling, maybe by assuming defaults for the masks.
eg ">" then Dummy then "?", with One-Pin working in this mode gives something along the lines of
"> ?" Two Pin poll string
or
"> *?" One Pin poll string
Reply is a single ".", when correct phase & autobaud done, no reply when still waiting on P2 reset exit.
Addit: I can see only a one-way "*" for change from default to one-pin, adding a two-way command, could help testing ?
Checking the FTDI FT231X, reveals differing behavior
Baud choices are much more coarse, with 3M, then 1.5MBd, then 3M/(2N+k/8)
ie unlike the CP2102N, which has 7 choices between 3M and 1.5M, the FTDI part only has a 24MHz virtual BaudClock, below 1.5MBd
Strangely, unlike C2102N, which will cleanly round to the nearest baud, the FTDI driver does not round, but jumps to 115200 for any Baud it does not recognize. That requires user care in testing.
ie 3M/(2+3/16) gave 115200. (as does 1600000, or 2000000 etc )
I'd call that bordering on the lazy. SiLabs round properly.
Duplex checks show neither SiLabs nor FTDI can manage 3MBd full duplex, but the P2 is essentially half-duplex in operation, so that is unlikely to be an issue.
The EXAR part, looks to manage Duplex with no dropped characters, to 4MBd, so it looks the best FS USB Bridge part.
The FT232H/2232H can also do high MBd duplex, but they are more costly and HS USB.
We can do 2.25M baud at 20MHz. At 2/3 of 20MHz (13.33Mhz), we can do 1.5M baud. So, if the RC osc is not made to go at least 20MHz, we will have to specify 1.5M baud to accommodate the lower 13.33MHz possibility.
I was thinking you could switch to one-Pin for confirm that works, then switch back to two-pin for the other boot command tests, without needing a reset.
I guess you could do some tests in Two-Pin, and then be careful to check one-pin at the end
There may be cases where a user requests one pin, then if they fail to see an ack, they switch back to 2 pin.
I can forsee a MCU-Host could be coded that way, to cover both PCB layouts, and it does not really want to reset a P2, if it finds there is no One-Pin wiring.
ie fastest is to AutoBaud & request One-Pin, then check 2 pin, if no one-pin is seen, then exit with PCB-Wiring known.
A single code library can then manage either PCB layout.
So that means ~ 97 SysClks per Byte overhead ?
I get ~14 SysClks in the smaller RxINT I gave above, and ~ 38+18 SysClks in the Base64-decode, plus Hub align effects.
In my MCU code, I fetch 3 bytes then send 4, (interleaved a little for better speed) to keep the encode simpler, - you may get higher average speed, if you receive and check for 32b, and then Wrlong to HUB, rather than wrbyte 4x as frequently ? (on average, a wrlong is needed every 5.333' Chars in ~ 488 SysClk rate )
I tuned up the Base64 (Prop_Txt) code so it ran just as fast as the hex (Prop_Hex) code. If I made the Base64 faster, I would also have to make the Prop_Hex code faster, which seems harder to do, as there are fewer opportunities for improving it, compared to what I was able to do for the Base64 code. I believe things are pretty sufficient, already. We could not get enough improvement, anyway, to get us to the next major baud (from 1.5M to 2M). We could only squeeze maybe 10% of the time out.
OK, that's workable.
What is the shortest string that can AutoBaud Coarse and then get an echo, in one-Pin and 2 Pin modes ?
OK. Did you look at the wider-catch range of 0x55 ?
There are other baud values above 1.5MBd (unless it is FT231X which jumps to 3MBd), but the biggest gain seems to be from getting >= 20MHz
CP2102N allows
2400000
2181818.18
2000000
1846153.84
1714285.71
1600000
1500000
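The values listed are all consistent with a 24MHz clock divided by an integer (24M/10 down to 24M/16). That is an inference from the numbers, not a datasheet statement, and the divisor range below only covers the values discussed in this thread. A small sketch that regenerates the list and models the round-to-nearest behavior:

```python
# Assumption: achievable bauds are 24 MHz / N for integer N.
# Inferred from the listed values, not taken from the CP2102N datasheet.
BAUD_CLOCK_HZ = 24_000_000

def cp2102n_candidates(n_min=6, n_max=16):
    """Candidate baud rates for divisors n_min..n_max (4M down to 1.5M)."""
    return [BAUD_CLOCK_HZ / n for n in range(n_min, n_max + 1)]

def nearest_baud(request, candidates=None):
    """Model 'round to nearest physical baud' over the candidate list."""
    if candidates is None:
        candidates = cp2102n_candidates()
    return min(candidates, key=lambda b: abs(b - request))

if __name__ == "__main__":
    for n in range(10, 17):
        print(n, round(BAUD_CLOCK_HZ / n, 2))
```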
"> Prop_Chk 0 0 0 0 "
That's 19 characters for two-pin mode.
It would take 19 characters for one-pin mode, too:
"> Prop_Chk*0 0 0 0 "
About the wider capture 0x55, I started looking at your code earlier today, but got sidetracked with FPGA compilation issues. I'll go back to it soon. For now, I'm preparing an update with the current ">"-only autobaud technique.
Isn't there a "?" query, that gives a shorter echo than Prop_Chk ? ( "?" -> "." or "!" reply )
You have to give those four INA/INB masks and values before a chip will respond, since there could be different chips connected. The Prop_Chk command ends when a non-hex character follows the last hex value. At that point, the selected chip will send out its "Prop_Ver Au" string and begin responding to "?" characters with "." characters. It waits 10ms before outputting, to allow turn-around time for the host, in case it's needed. So, the first thing you'd get out would be the "Prop_Ver". You might as well wait for that string, rather than harass it with "?" characters.
That's quite a lot of characters, and the 10ms seems quite a long turn around time ?
All of this adds to the reset-to-run time, when using a Local MCU.
Well, that 10ms would be waited for only once. I could certainly shorten it to 1ms. I just wanted to give a host system time to turn around, in case it was operating the link in a half-duplex mode and running from a slow language in a cumbersome operating system. One ms would probably not accommodate that scenario.
Also, I figure that the connect/handshake strings should be long enough that they don't get mistaken from what might be incidental data flowing. I mean, keying off a single character would be kind of reckless in some scenarios.
I guess the String length issue can be mitigated by upping the baud rates. (1ms delay is ~230k Baud)
The 10ms wait depends on the phase of reset exit, so the host MCU has to repeat an Enq and listen pause, which means it could double, to 20ms added delay.
Addit: one idea could be to have the Host specify the Delay ?
Since you already have fields for multiple P2 support, one more for Turn-Around define is not much to add, and it allows systems that are local and fast, to skip a long delay and also allows slower systems to adjust what they need. It might be 10ms is not enough on some future OS design, whilst << 1ms is fine for a local MCU host.
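For a feel of the numbers, here is a rough budget sketch. It assumes 8N1 framing (10 bits per character), ignores the time taken by the reply string itself, and models the reset-phase miss as a simple repeat of the whole attempt. The function names and the `attempts` parameter are mine, for illustration only.

```python
HANDSHAKE = "> Prop_Chk 0 0 0 0 "   # the 19-character two-pin string

def wire_time_s(n_chars, baud, bits_per_char=10):
    """Time on the wire for n_chars, assuming 8N1 framing."""
    return n_chars * bits_per_char / baud

def handshake_budget_s(baud, turnaround_s, attempts=1):
    """Send time plus turn-around wait, repeated if the first try is missed."""
    return attempts * (wire_time_s(len(HANDSHAKE), baud) + turnaround_s)

if __name__ == "__main__":
    for baud in (115_200, 1_000_000, 2_000_000):
        for delay in (0.010, 0.001):
            t = handshake_budget_s(baud, delay, attempts=2)
            print(f"{baud:>9} Bd, {delay * 1e3:4.0f} ms wait: {t * 1e3:.2f} ms worst case")
```

At the higher bauds the string time is small and the turn-around delay dominates, which is the case for letting the host specify it.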
This has no effect on P2 read speed, as that is table lookup, but this simpler Base64 encode (fewer tests) does give significant gains in speed and size on the small MCU host end. Specifically, it halves the size and means it can in-line, rather than call, for even more savings.
The improved Base64 means the cheapest MCU candidate can keep up with its highest Baud in an unrolled Base64 code block.
It is not clear if any of the many modes do that ?
To operate, this would need some preset-value, and then a trigger/gate condition needs to reset the counter when inactive.
ie reset with pin is LOW or HIGH or on Edge.
I can see some modes do capture and smart-preload, so that is close - but lacks the threshold/overflow interrupt.
I think all the pieces are in a Smart Pin Cell, but less clear is if you can configure this way.
Uses:
This would allow Smart Pin HW only sense of BREAK signals on UARTS, and also sense Pauses/gaps in streaming data, and be used as a general watchdog.
Interrupts+SW reload can do this at low speeds, but at higher rates the interrupt can consume most of a COG.
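The interrupt + software-reload version can be modelled in a few lines. This is only a host-side simulation over a list of edge timestamps, not P2 code, and it assumes the timer is first armed by the first edge seen. The names are hypothetical.

```python
def first_idle_gap(edge_times_us, gap_us):
    """Simulate a retriggerable monostable over sorted edge timestamps.

    Every edge reloads the timer; the function returns the time at which
    the timer first expires (no edges for gap_us), or None if no edge
    ever armed it.
    """
    deadline = None
    for t in edge_times_us:
        if deadline is not None and t >= deadline:
            return deadline          # timer expired before this edge arrived
        deadline = t + gap_us        # edge seen: reload the timer
    return deadline                  # expires after the final edge

if __name__ == "__main__":
    edges = [0, 10, 20, 30, 5000, 5010]    # a burst, a pause, another burst
    print(first_idle_gap(edges, 1000))
```

In hardware the same idea would need a preset value, reload-on-edge and a threshold interrupt, which is what the Uses above rely on.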
That could be done with some modifications to the smart pin guts. I will look into it today.
Also, some kind of one-shot generator that could run autonomously would be useful. The SMPS PWM modes might be able to do this, but I need to review.
Sounds good. I think the asynchronous serial receive mode has no means to sense a Break in HW, so that means Smart Pins should be able to be configured to cover that.
If it works in either polarity, that supports both classic break (extended Low time past Stop Bit) and a Data-Pause.
With most UARTS these days either having significant buffers, or being interrupt serviced on small MCUs, it is common to have naturally packed data messages.
One thing about BRK is that terminal programs never seem to support it. It's a low-level matter, only.
What's the max baud rate it'll do without breaking conventional base64?
Currently, 2.25M baud. The change that Jmg is suggesting would speed things up on the HOST side, since fewer computations would be needed to generate the Base64 data. If a table is used, it doesn't matter. I kind of like what we have now, as it's more standard and there are websites that produce Base64 strings from hex data you paste in. That makes it easier to bring up the know-how.
You can kludge a Break a couple of ways on a PC
* Use a force parity, in which case you need to use 2 stop bits for normal data, & this is a modest break, not detectable in some systems.
* Change baud rate, but that needs care, as the elasticity of buffers means you might change before all data is sent, and a PC check for TxBuffer=0 only reads the Operating system buffers, not the hardware USB bridge buffers.
Safest seems to be to time the transmit, based on baud and length.
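A minimal sketch of timing the transmit before touching the baud setting. The `set_baud` callback and the `bridge_latency_s` allowance for the USB bridge buffers are placeholders of mine; a real value for the latter would have to be measured per bridge part.

```python
import time

def drain_time_s(n_chars, baud, bits_per_char=10, bridge_latency_s=0.002):
    """Estimated time until the last stop bit has left the bridge.

    bridge_latency_s is a guessed allowance for USB/bridge buffering.
    """
    return n_chars * bits_per_char / baud + bridge_latency_s

def change_baud_after_drain(n_chars, baud, set_baud, new_baud, sleep=time.sleep):
    """Wait out the calculated transmit time, then switch baud."""
    sleep(drain_time_s(n_chars, baud))
    set_baud(new_baud)               # hypothetical callback into the port driver
```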
I see the EXAR parts have some HW support for Break - they have a wide mode that flags Rx Break (but strangely lacks Tx-Break equivalent) but does have a register for low-level break commands.
Sorry I guess I was asking jmg what impact keeping conventional base64 compatibility (no inserted "U") would have on his efm8 micro baudrate.
Yes, keeping to the standard would be good if possible. There used to be a tool called 'uudecode' we used to use with usenet groups in the 90's, I think it's the same encoding
The MCU I'm testing needs up to 31 SysClks for the older scheme, and up to 23 sysclks for the new one.
On one variant, that extra overhead is enough to force the next-lower baud, which drops from 1.3824MBd to 0.6912MBd
ie it halves the possible baud rate.
The new one also frees 0x55, which locking into the "conventional base64" cannot do.
0x55 gives far greater trim range to the AutoBaud
This is a clear case, where 'conventional' means significantly compromised performance.
Yes. The MCU side can be significantly smaller and faster with a smarter base64
Is anyone really going to copy from a website during P2 development ?!
You simply document the rules, and Map
"0".."9" <= 0..9
"A".."w" <= '0xa..0x3f' (Skip U)
is quite easy to explain, and is very similar (superset) of intel hex mapping.
less words and rules than this ( & Table generate is smaller)
"A".."Z" <= $00..$19
"a".."z" <= $1A..$33
"0".."9" <= $34..$3D
"+" <= $3E
"/" <= $3F
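As a sketch of the two-rule mapping, assuming "A".."w" means the contiguous ASCII run 0x41..0x77 with "U" (0x55) stepped over: that gives exactly 54 symbols for 0x0a..0x3f, but it does pull in the punctuation between "Z" and "a". The MSB-first packing of 3 bytes into 4 characters is my assumption, borrowed from standard Base64.

```python
def enc6(v):
    """Map a 6-bit value to one character of the proposed alphabet."""
    if v < 10:
        return chr(0x30 + v)                  # "0".."9"
    c = 0x41 + (v - 10)                       # "A" upward, contiguous ASCII
    return chr(c + 1 if c >= 0x55 else c)     # step over "U" (0x55)

def dec6(ch):
    """Inverse of enc6; raises ValueError for characters outside the map."""
    c = ord(ch)
    if 0x30 <= c <= 0x39:
        return c - 0x30
    if 0x41 <= c <= 0x77 and c != 0x55:
        return c - 0x41 + 10 - (1 if c > 0x55 else 0)
    raise ValueError(f"not in alphabet: {ch!r}")

def encode3(b0, b1, b2):
    """Pack 3 bytes into 4 characters, most significant 6 bits first."""
    word = (b0 << 16) | (b1 << 8) | b2
    return "".join(enc6((word >> s) & 0x3F) for s in (18, 12, 6, 0))

if __name__ == "__main__":
    print("".join(enc6(v) for v in range(64)))
```

The first 16 symbols are the usual hex digits, and 0x55 never appears in the data stream, so it stays free for the AutoBaud trim.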
Comes in 128M/256M/512M/1G-bit memories, with 300mil SO16 as an option.
http://www.macronix.com/en-us/products/NOR-Flash/Pages/OctaFlash.aspx#3V
http://www.macronix.com/Lists/Datasheet/Attachments/6254/MX25LM51245G, 3V, 512Mb, v1.0.pdf
has a SPI Reset pair of 66H then 99H, sent as 1 byte per CS
READ3B (normal read) 03 (hex)
FAST_READ3B (fast read data) 0B (hex)
PP3B (page program) 02 (hex) 256 Bytes
SE3B (sector erase) 20 (hex)
BE3B (block erase 64KB) D8 (hex)
CE (chip erase) 60 or C7 (hex)
From what I remember of the SPI BOOT, this should work ?
If the WRITE commands pass thru from the PC side, for PGM, the ROM only needs to reset and issue 03H ?
It sounds like it should work. And, yes, the ROM only needs to reset and read ($03) the device.
A bit more news on Octaflash, I see this from Silabs newsletter - their new GG11 have native Octal FLASH memory support.
Octal/Quad-SPI Flash Memory Interface
Supports 3 V and 1.8 V memories
1/2/4/8-bit data bus
Quad-SPI Execute In Place (XIP)
Up to 30 MHz SDR/DDR at 3 V