Dynamic system clock setting and comport baud setting

evanh · 2019-08-11 02:17

Here's a semi-general routine for changing the system clock frequency. Only the XMUL component is a variable but all constants are still obeyed. Also adjusts the comport clock divider at the same time. It's from my library-like working code that I've been tweaking lots.

I'm partly wondering if anyone is interested and am open to input on redesign.

'===============================================
'Set new sysclock frequency from a dynamic XMUL(xtalmul)
'and also adjusts the diag comport to suit
'     Note:  Uses the CORDIC.  This means that any already running operations will be lost
'  input:  xtalmul, asyn_baud
' result:  clk_mode, clk_freq
'scratch:  pa, pb, temp1

setclkfrq

'recalculate sysclock hertz using cordic
		qmul	xtalmul, ##(XTALFREQ / XDIV / XDIVP)	'integer component of pre-divided crystal frequency
		mov	pa, xtalmul
		mul	pa, #round((float(XDIV*XDIVP)+0.5) * (float(XTALFREQ)/float(XDIV)/float(XDIVP) - float(XTALFREQ/XDIV/XDIVP)))
		qdiv	pa, #(XDIV * XDIVP)		'fractional component of pre-divided crystal frequency
		getqx	clk_freq			'result of integer component
		getqx	pa				'result of fractional component
		add	clk_freq, pa			'de-error the integer rounding

{ 'this section disabled, more compact alternative below
'recalculate baud divider (clk_freq / asyn_baud) of diag comport using cordic
'  low bauds won't operate at high sysclocks, the divider only has 16-bit reach

'apply max of *64 to copy of clk_freq to achieve 16.6 format
		encod	temp1, clk_freq			'bit position of msb
		subr	temp1, #31			'distance from bit position 31
		fle	temp1, #6			'cap at distance 6
		mov	pa, clk_freq
		shl	pa, temp1			'shift significant bits up to bit position 31

'apply remaining of /64 to copy of asyn_baud
		subr	temp1, #6			'remaining distance
		mov	pb, asyn_baud
		shr	pb, temp1			'shift down to make up distance 6 if needed, truncates baud

'comport divider (clk_freq / asyn_baud) in 16.6 format
		qdiv	pa, pb
		getqx	pa
		shl	pa, #10				'WXPIN format = %DDDDDDDDDDDDDDDD.DDDDDD_xxxxx_FFFFF
		sets	pa, #7				'comport 8N1 framing
}

'recalculate baud divider (clk_freq / asyn_baud) of diag comport using cordic
'  low bauds won't operate at high sysclocks, the divider only has 16-bit reach
		qdiv	clk_freq, asyn_baud		'comport divider
		qfrac	#1, asyn_baud			'remainder scale factor, 2**32 / baud
		getqx	pa				'comport divider
		getqy	pb				'divider remainder, for .6 fraction
		getqx	temp1				'scale factor
		qmul	pb, temp1			'use scale factor on remainder to provide a "big" fraction
		getqx	pb				'fractional component of comport divider
		rolword	pa, pb, #1			'16.16 comport divider
		sets	pa, #7				'comport 8N1 framing (bottom 10 bits should be replaced but 9's enough)

'make sure not transmitting on comport before adjusting hardware
.txwait		rqpin	inb, #DIAGTXPIN	wc		'transmitting? (C high == yes)
	if_c	jmp	#.txwait

		wxpin	pa, #DIAGTXPIN			'set tx baud and framing (divider format is 16.0 if divider >= 1024.0)
		wxpin	pa, #DIAGRXPIN			'set rx baud and framing (divider format is 10.6 if divider < 1024.0)


'adjust hardware to new XMUL sysclock frequency
		andn	clk_mode, #%11			'clear the two select bits to force RCFAST selection
		hubset	clk_mode			'**IMPORTANT**  Switches to RCFAST using known prior mode

		mov	clk_mode, xtalmul		'replace old with new ...
		sub	clk_mode, #1			'range 1-1024
		shl	clk_mode, #8
		or	clk_mode, ##(1<<24 + (XDIV-1)<<18 + XPPPP<<4 + XOSC<<2)
		hubset	clk_mode			'setup PLL mode for new frequency (still operating at RCFAST)

		or	clk_mode, #XSEL			'add PLL as the clock source select
		waitx	##22_000_000/100		'~10ms (at RCFAST) for PLL to stabilise
		hubset	clk_mode			'engage!  Switch back to newly set PLL
		ret			wcz
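
The clock switch at the end of the routine writes three HUBSET words in a row: the old mode with the select bits cleared (RCFAST), the new PLL settings with RCFAST still selected, and finally the same settings with the PLL selected. A minimal Python sketch of how those words are put together, for checking values on a PC. The function name and the example values for XPPPP, XOSC and XSEL are assumptions for illustration, not taken from the listing.

```python
def hubset_sequence(old_mode, xmul, xdiv, xpppp, xosc, xsel):
    """Return the three HUBSET words in the order the routine writes them.

    Bit positions follow the OR constant in the listing:
    PLL enable at bit 24, XDIV-1 at bit 18, XMUL-1 at bit 8,
    XPPPP at bit 4, XOSC at bit 2, XSEL in bits 1..0.
    """
    # ANDN clk_mode, #%11 : keep the known prior settings, select RCFAST
    rcfast = old_mode & ~0b11
    # new PLL settings, select bits still zero (still running from RCFAST)
    pll = (1 << 24) | ((xdiv - 1) << 18) | ((xmul - 1) << 8) | (xpppp << 4) | (xosc << 2)
    # OR clk_mode, #XSEL : switch over to the PLL after the ~10 ms wait
    final = pll | xsel
    return rcfast, pll, final


# Hypothetical example values: XDIV=2, XPPPP=15, XOSC=2, XSEL=3, XMUL going to 20
rcfast, pll, final = hubset_sequence(0x010409FB, xmul=20, xdiv=2, xpppp=15, xosc=2, xsel=3)
print(f"{rcfast:08x} {pll:08x} {final:08x}")
```

The sketch only models the values; the wait between the second and third write is what gives the PLL time to settle.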

jmg · 2019-08-11 05:41

evanh wrote: »

Here's a semi-general routine for changing the system clock frequency. Only the XMUL component is a variable but all constants are still obeyed. Also adjusts the comport clock divider at the same time. It's from my library-like working code that I've been tweaking lots.

I'm partly wondering if anyone is interested and am open to input on redesign.

That's cool, always nice to see a full HW range being used

Does that do a round() on the baud result so it centres the baud value on the error band ?
I see rounding mentioned on the SysCLK calc, but not on the baud Calc ?
I've found some commercial USB Bridge chips are a tad lazy in their rounding code, and they can have larger errors of Given_Baud / Requested_Baud as a result.

cheezus · 2019-08-11 05:45

Very nice! This will come in handy once I'm changing clock frequency as well as baud in the Ymodem program!! Very interested to see how this develops.

evanh · 2019-08-11 06:55

jmg wrote: »

Does that do a round() on the baud result so it centres the baud value on the error band ?

The divides in the combined constant of the initial QMUL generate a large rounding error, as in a lot of resolution is lost there. So the rounding error compensation is about reinstating the lost bits. The closest I come to normal rounding is the +0.5 in that really long macro for the MUL instruction.
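
The compensation described here can be followed on a PC with a small Python model of the sysclock calculation. It is only a sketch: the 20 MHz crystal is an assumption (it matches the 460.8 kHz figure quoted later in the thread for XDIV = 31, XDIVP = 28, XMUL = 20), the function name is made up, and Python evaluates the constant in double precision whereas the assembler's float functions may round differently.

```python
def sysclock_hz(xtalfreq, xdiv, xdivp, xmul):
    """Model of the two-part sysclock calculation in setclkfrq."""
    d = xdiv * xdivp
    base = xtalfreq // d                          # integer constant given to QMUL
    # constant given to MUL: roughly the remainder of xtalfreq / d
    k = round((d + 0.5) * (xtalfreq / d - base))
    coarse = xmul * base                          # QMUL result (integer component)
    fix = (xmul * k) // d                         # QDIV result (fractional component)
    return coarse + fix                           # ADD clk_freq, pa


XTALFREQ = 20_000_000   # assumed crystal frequency
print(sysclock_hz(XTALFREQ, 31, 28, 20))
print(20 * (XTALFREQ // (31 * 28)))   # integer component alone, for comparison
```

With these inputs the integer component alone comes out a few hertz short, and the second term puts the missing hertz back.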

I see rounding mentioned on the SysCLK calc, but not on the baud Calc ?

That's because the baud divider calculation requires six fractional bits in the smartpin parameter and so the method used is in effect a 64-bit result: 32.32 bits which are truncated to 16.6 bits.
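
A Python sketch of that baud divider calculation, mirroring the QDIV / QFRAC / QMUL / ROLWORD steps of the compact version in the opening post. The function names are made up for illustration, and the model truncates exactly as the integer instructions do, so it does not round to nearest.

```python
MASK32 = 0xFFFFFFFF


def baud_divider_16_16(clk_freq, baud):
    """clk_freq / baud as a 16.16 fixed-point word."""
    quotient, remainder = divmod(clk_freq, baud)   # QDIV, then GETQX / GETQY
    scale = (1 << 32) // baud                      # QFRAC #1, baud
    fraction = (remainder * scale) & MASK32        # QMUL, low 32 bits via GETQX
    # ROLWORD pa, pb, #1 : integer part into the top word, top word of fraction below it
    return ((quotient << 16) | (fraction >> 16)) & MASK32


def wxpin_value(clk_freq, baud, data_bits=8):
    """Divider word with the frame size in the bottom bits (SETS pa, #7 for 8 data bits)."""
    word = baud_divider_16_16(clk_freq, baud)
    return (word & ~0x1FF & MASK32) | (data_bits - 1)


def divider_16_6(word):
    """The part of the word used as the divider: 16 integer bits plus 6 fraction bits."""
    return (word >> 10) / 64


print(hex(wxpin_value(460_800, 115_200)))
print(divider_16_6(wxpin_value(690_000, 115_200)), 690_000 / 115_200)
```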

I've found some commercial USB Bridge chips are a tad lazy in their rounding code, and they can have larger errors of Given_Baud / Requested_Baud as a result.

I'm not too surprised given how robust I had to be sans a FPU. It came as a slight shock actually.

evanh · 2019-08-11 08:33

Below 1 MHz sysclock, the comport becomes unreliable for 115200 baud. That used to be more like 4 MHz. Best I can get is a crystal divider of 29 for 690 kHz sysclock, which is close to 6x the baud, then it still works.

Cluso99 · 2019-08-11 08:51

Does the new silicon have this bug or has it been fixed?

It’s good for those who want to change frequencies while running. Not sure how many that will be. Worth putting in the tricks and traps thread with a note for dynamically switching clock frequency.

evanh · 2019-08-11 08:57

Cluso99 wrote: »

Does the new silicon have this bug or has it been fixed?

Yes, it still has it. The discovery was quite late and, I think, Chip had already run out of custom changes without causing an extra cost.

Although, that part of the routine is tiny compared to the general handling of fractions and rounding errors.

It’s good for those who want to change frequencies while running. Not sure how many that will be. Worth putting in the tricks and traps thread with a note for dynamically switching clock frequency.

Yeah, given the size of the routine, I thought it would be good for people to try it out and make comments first.

Cluso99 · 2019-08-11 09:02

Thanks Evan. Put it in the tricks and traps when it’s ready. You can always update it there if need be.

jmg · 2019-08-11 09:26

evanh wrote: »

Below 1 MHz sysclock, the comport becomes unreliable for 115200 baud. That used to be more like 4 MHz. Best I can get is a crystal divider of 29 for 690 kHz sysclock, which is close to 6x the baud, then it still works.

Do Baud requests of
113386
and
116582
resolve to the same /29 /6 ?
How does the fractional baud value change with those variations ?

evanh · 2019-08-11 10:14

Ah, I'm just using the PC USB comport to test against for reliability. The USB comport won't be able to adjust by small steps.

The test wasn't intended to fit snugly, just comparing against what I had before is all.

evanh · 2019-08-11 12:05

Oops, it seems I have some unreliable data ... looks like I've stumbled on another bug with my test program that still has its own testing engaged. I was using its results as the tx data.

Setting up the scope to look at the transmitted data, the tx bit timing is damn accurate. I've now chosen XDIV = 31, XMUL = 20 and XDIVP = 28 and this nets me an almost clean 460.8 kHz sysclock. Perfect for 4x the baud.

And the scope is telling me the result is spot on accurate. Time to lock down what data I'm sending ...

EDIT: Yep, all good at 460.8 kHz sysclock. Doh! Ruins all my earlier results.

EDIT2: 230.4 kHz sysclock works too. And seems to be fine above this too. That's amazing. Even 115.2 kHz works but nothing in between does.
EDIT3: Just tested again without the recently added improved rounding and fractional calculations and this way still has the bad spots above 1 MHz sysclock like I had said. So the buggy testing wasn't the cause of that. Minor correction: the errors seem to peter out just above 2 MHz sysclock, so the earlier estimate of below 4 MHz was a tad over the mark. So, all good for the opening post routine.

evanh · 2019-08-11 12:24

Here's the terminal output using XDIV = 62, XDIVP = 28 and XMUL beginning at 10 and incrementing. It shows the blackout for sysclock in between 115200 and 230400 and then all good from there on up.

Loading init-diag-testing.binary - 2272 bytes
Checksum validated
init-diag-testing.binary loaded
( Entering terminal mode.  Press Ctrl-] to exit. )

Total smartpins = 64   1111111111111111111111111111111111111111111111111111111111111111
System clock frequency set to 0.1152 MHz

  Testing SPI using smartpins
=========================================
 XMUL   |--------- Received data ---------------|
  10    ba 20 20 54 65 73 74 69 6e 67 20 53 50 49
  ...   [unreadable garbage bytes received until XMUL reaches 20]
  20    ba 20 20 54 65 73 74 69 6e 67 20 53 50 49
  21    ba 20 20 54 65 73 74 69 6e 67 20 53 50 49
  22    ba 20 20 54 65 73 74 69 6e 67 20 53 50 49
  23    ba 20 20 54 65 73 74 69 6e 67 20 53 50 49
  24    ba 20 20 54 65 73 74 69 6e 67 20 53 50 49
  25    ba 20 20 54 65 73 74 69 6e 67 20 53 50 49
  26    ba 20 20 54 65 73 74 69 6e 67 20 53 50 49
  27    ba 20 20 54 65 73 74 69 6e 67 20 53 50 49
  28    ba 20 20 54 65 73 74 69 6e 67 20 53 50 49
  29    ba 20 20 54 65 73 74 69 6e 67 20 53 50 49
  30    ba 20 20 54 65 73 74 69 6e 67 20 53 50 49
  31    ba 20 20 54 65 73 74 69 6e 67 20 53 50 49
  32    ba 20 20 54 65 73 74 69 6e 67 20 53 50 49
  33    ba 20 20 54 65 73 74 69 6e 67 20 53 50 49
  34    ba 20 20 54 65 73 74 69 6e 67 20 53 50 49
  35    ba 20 20 54 65 73 74 69 6e 67 20 53 50 49
  36    ba 20 20 54 65 73 74 69 6e 67 20 53 50 49
  37    ba 20 20 54 65 73 74 69 6e 67 20 53 50 49
  38    ba 20 20 54 65 73 74 69 6e 67 20 53 50 49
  39    ba 20 20 54 65 73 74 69 6e 67 20 53 50 49
  40    ba 20 20 54 65 73 74 69 6e 67 20 53 50 49
  41    ba 20 20 54 65 73 74 69 6e 67 20 53 50 49
  42    ba 20 20 54 65 73 74 69 6e 67 20 53 50 49
  43    ba 20 20 54 65 73 74 69 6e 67 20 53 50 49
  44    ba 20 20 54 65 73 74 69 6e 67 20 53 50 49
  45    ba 20 20 54 65 73 74 69 6e 67 20 53 50 49
  ...    ...
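
The relation between XMUL and the baud in this run can be tabulated with a few lines of Python. This is a sketch assuming a 20 MHz crystal and 115200 baud (neither is stated alongside the log, but both fit the figures quoted in the thread); the function name is made up.

```python
XTALFREQ = 20_000_000   # assumed crystal frequency
BAUD = 115_200          # assumed terminal baud


def sysclock_ratio(xmul, xdiv=62, xdivp=28):
    """Return (sysclock in Hz, sysclock / baud) for one XMUL setting."""
    sysclock = XTALFREQ * xmul / (xdiv * xdivp)
    return sysclock, sysclock / BAUD


for xmul in (10, 15, 19, 20, 45):
    hz, ratio = sysclock_ratio(xmul)
    print(f"XMUL {xmul:3d}  sysclock {hz / 1000:8.1f} kHz  {ratio:5.2f} x baud")
```

Under those assumptions XMUL = 10 is almost exactly 1x the baud and XMUL = 20 almost exactly 2x, with the unreadable stretch falling in between.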

jmg · 2019-08-11 20:04

evanh wrote: »

And the scope is telling me the result is spot on accurate. Time to lock down what data I'm sending ...

EDIT: Yep, all good at 460.8 kHz sysclock. Doh! Ruins all my earlier results.
EDIT2: 230.4 kHz sysclock works too. And seems to be fine above this too. That's amazing. Even 115.2 kHz works but nothing in between does.
EDIT3: Just tested again without the recently added improved rounding and fractional calculations and this way still has the bad spots above 1 MHz sysclock like I had said. So the buggy testing wasn't the cause of that. Minor correction: the errors seem to peter out just above 2 MHz sysclock, so the earlier estimate of below 4 MHz was a tad over the mark. So, all good for the opening post routine.

Are those P2.Tx or P2.Rx tests ?

It makes sense that at very low divides, you need a correct clock multiple, as it's not clear how the fractional adjust applies at very low divides.
I'm also not clear on how the P2's .6 fractional bits are actually applied, as a single byte is 10 bit slots that can be adjusted, so 3 bits of fraction could usefully apply there ?

Did you scope that to see how the fractional bits apply ?

The EFM8UB3 has a /2 then /N and you can set a Txmit speed of 24MHz if you really want to, but receive cannot detect start-edges at 24M.
The highest reliable/practical baud for receive on UB3 I've found is 8MBd.8.n.2, and the 2 stop bits I think are needed because at such low divides, there are not enough baudclks to move from mid-stop-bit, to start-bit-edge modes. The added stop bit buys 3 more baudclks, which also helps with small baud offsets and continual send packets.

evanh wrote: »

The USB comport won't be able to adjust by small steps.

That practical step size depends on the device somewhat.
I just checked a FT231X, and it looks to have the same 24MHz virtual baud clock (24M/N) that the EFM8UB3 does and the FT232H does.

ie that means I can dial up these 24M/N candidates, (0.481% steps at 115200)
115384
116505
115942 and they all measure within ~ 80ppm of the expected 24M/N value. (FT232H will be closer, as that uses a crystal)
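
The 24M/N candidates and the step size can be reproduced with a short Python sketch. The function name is made up; the 24 MHz virtual baud clock is the figure given in the post.

```python
def neighbours(target_baud, base_clock=24_000_000, count=3):
    """List (N, base_clock / N) for the divisor nearest the target and the next ones down."""
    n = round(base_clock / target_baud)
    return [(d, base_clock / d) for d in range(n, n - count, -1)]


for divisor, baud in neighbours(115_200):
    print(f"24M/{divisor} = {baud:.1f} baud")

# Relative step between adjacent divisors near 115200 baud, in percent
nearest = neighbours(115_200)[0][0]
print(f"step is about {100 / nearest:.3f} %")
```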

evanh · 2019-08-11 22:02

Tx only. It'll be a NCO dither on the internal bit clocking for fractional. The data length won't affect that. The fractions definitely work, that EDIT3 is what happens without.

Not that it matters, and not that I've ever tried, since the proof is done without oddball bauds, but I'm limited to what the Linux driver allows, not the hardware limits, on the PC side.

jmg · 2019-08-11 23:13

evanh wrote: »

... but I'm limited to what the Linux driver allows, not the hardware limits, on the PC side.

Surely Linux does not impose its own baud quanta, but rather simply feeds the number to the vendor's driver ?
The numbers I gave above work on Windows - do they not work on Linux ?
Do you have a linux driver/device for Silabs CP210x series ? I'm curious now to check Linux allows a baud value of 3 to be passed to the driver.

jmg · 2019-08-11 23:23

evanh wrote: »

Tx only. It'll be a NCO dither on the internal bit clocking for fractional. The data length won't affect that.

Yes and no.
If you run a NCO correction over 64 baudclk, you give the appearance of being able to resolve to an average baud of Baudclk/64, but the actual bit sampling applies per data bit, relative to the start bit.
Any 'extra/bonus' averaging across multiple bytes is not relevant, unless you are using the baud-rate for some long term timing, which is very rare.

It's probably better to analyse this from a peak timing deviation on any single bit edge, than an average-timing-value.
If the NCO correction carries over between TX bytes, that means any Autobaud result will vary.

cgracey · 2019-08-12 05:32

What's this talk of a bug in the smart pin serial baud generator? I read a bunch of posts, but I don't understand.

jmg · 2019-08-12 05:51

cgracey wrote: »

What's this talk of a bug in the smart pin serial baud generator? I read a bunch of posts, but I don't understand.

Where ? I do not see any mention of a bug in the serial baud generator ?
Just discussions around rounding, and jitter, and how to specify Baud errors ?
I don't see anything evanh has written that suggests 'not working as designed' in his tests ?
evanh is dropping the sysclk down to extreme levels, so the division from sysclk to baud is quite low, and seeing where it breaks. (on TX)
That's useful, as some users will want to drop sysclk as low as they can, for power reasons, but keep reasonable baud rates...
It's also important that both ends of any system choose the nearest valid baud value to what the user requested.

Some small MCUs now offer 9600 Baud Rx, when operating from 32.768kHz - that's a division of just 3.41333'

cgracey · 2019-08-12 06:07

jmg wrote: »

cgracey wrote: »

What's this talk of a bug in the smart pin serial baud generator? I read a bunch of posts, but I don't understand.

Where ? I do not see any mention of a bug in the serial baud generator ?
Just discussions around rounding, and jitter, and how to specify Baud errors ?
I don't see anything evanh has written that suggests 'not working as designed' in his tests ?
evanh is dropping the sysclk down to extreme levels, so the division from sysclk to baud is quite low, and seeing where it breaks. (on TX)
That's useful, as some users will want to drop sysclk as low as they can, for power reasons, but keep reasonable baud rates...
It's also important that both ends of any system choose the nearest valid baud value to what the user requested.

Some small MCUs now offer 9600 Baud Rx, when operating from 32.768kHz - that's a division of just 3.41333'

Oh. Okay. No problem, then.

evanh · 2019-08-12 08:50

Chip,
I had unexpected variations in my data source that weren't anything to do with the serial comport. It caused me confusion until I verified that the comport bit timings were spot on. Only then did I actually look for the source of the anomaly. Once eliminated all was good again.

I'm very impressed with the outcome. It looks like smartpins can do async serial fine from a sysclock of 2x the baud and up. On the tx side at least.

jmg · 2019-08-12 09:02

evanh wrote: »

I'm very impressed with the outcome. It looks like we can operate fine from a sysclock of 2x the baud and up. On the tx side at least.

This may be similar to the EFM8UB3.
There, my tests show UB3 can run faster in Tx than Rx, and Rx seems stable at up to baudclk/3 (SysCLK/3 on P2) provided I use 2 stop bits.
I think there, the hand-over from mid-stop bit to start-bit-edge needs a little cushion, and an extra clock or two here also helps with baud creep effects, if sending long messages.
I've sent 1762 byte bursts with no added gaps in my tests, as that's the USB-Buffer-pipeline limit.

Be interesting to see the results of your tests on Rx side with P2, with SysClk divide and stop bit counts.

evanh · 2019-08-12 09:39

I wasn't really trying to test the serial capabilities. It was more about verifying the precision of those calculations.

jmg · 2019-08-12 19:42

cgracey wrote: »

Oh. Okay. No problem, then.

Nothing that needs errata or fixing

There was an open question around the fractional NCO handling.
Q: Does that reset every character, or does it roll-over across multiple characters ?
The former makes Autobaud more predictable, and a fix could be applied if users know the exact sysclks between their autobaud edges. Jitter is only inside characters & fixed.

jmg · 2019-08-12 19:58

evanh wrote: »

It looks like smartpins can do async serial fine from a sysclock of 2x the baud and up. On the tx side at least.

You reported above it could do 1x sysclk, TX only ?
all good at 460.8 kHz
230.4 kHz works too. And seems to be fine above this too.
115.2 kHz works but nothing in between does.

That makes sense, as exact divides will be needed at lowest sysclks, as any NCO adder at lowest divides is a massive time jump.
Rx side is likely to start working reliably around 3x sysclks

cgracey · 2019-08-12 20:27

jmg wrote: »

cgracey wrote: »

Oh. Okay. No problem, then.

Nothing that needs errata or fixing
There was an open question around the fractional NCO handling.
Q: Does that reset every character, or does it roll-over across multiple characters ?
The former makes Autobaud more predictable, and a fix could be applied if users know the exact sysclks between their autobaud edges. Jitter is only inside characters & fixed.

The baud timer resets on every character.

jmg · 2019-08-12 20:32

cgracey wrote: »

The baud timer resets on every character.

Good, thanks.
That means the practical/available fractional correction will vary with character length.
It will be 5 bits for a 32 length character (not common) and 3 bits for the 8 data bits in a byte-char.

evanh · 2019-08-12 20:52

I didn't count 1x because that requires a careful clock divider setting to match the baud.

jmg · 2019-08-12 21:30

evanh wrote: »

I didn't count 1x because that requires a careful clock divider setting to match the baud.

True, but baud granularity is such, that baud rate will dictate clock choices even > 1x. Users will expect that.
It is not until you are over 4x that the granularity step drops to 3%, so all values under 4 are going to need some user care and checking.

evanh · 2019-08-13 07:22

Everything from 2x and up worked.

cgracey · 2019-08-13 07:24

evanh wrote: »

Everything from 2x and up worked.

What are the limitations?

evanh · 2019-08-13 07:29

What sort of limitations?

I wasn't trying to carefully measure the limits of the comms. It was a "are the calculated dividers good enough for a borderline setup?" The answer was yes, better than expected.
