SPI boot code and new CALLPA/CALLPB instructions

jmg · 2016-10-05 02:50

Cluso99 wrote: »

I like, and have used, the <space> $20 character to autobaud for PS2 keyboards in order to connect them to the P1 using 1pin. (see obex for 1pin TV and Kbd)

Tor wrote: »

Space char is good, and it's been used for the same purpose in other systems before.

See the added test example for $20 under the $3f one above.
$3f has better capture precision, and if terminals are the focus here, I like the idea of a visible symbol ("?") over a blank ( )

Cluso99 · 2016-10-05 03:49

The reason the "AT" command sequence was used is because it permits the start bit to be timed, and verified by the next/first (LSB) bit.

This is why ASCII characters %0xxx_xx01 are better for timing purposes.

jmg · 2016-10-05 04:15

Cluso99 wrote: »

The reason the "AT" command sequence was used is because it permits the start bit to be timed, and verified by the next/first (LSB) bit.

If you wanted timing, it would have been smarter to have a 101 edge at the start of every byte, but RS232 never did do that...

Any of the above waveforms can be Autobauded, the (significant) down side of a single bit-time capture, is it lacks precision.
0x55 or "U" is another common Baud character, but usually there, AutoBaud counts along the toggling bits, to get a better timebase precision.

Cluso99 · 2016-10-05 04:27

jmg wrote: »

Cluso99 wrote: »

The reason the "AT" command sequence was used is because it permits the start bit to be timed, and verified by the next/first (LSB) bit.

If you wanted timing, it would have been smarter to have a 101 edge at the start of every byte, but RS232 never did do that...

Any of the above waveforms can be Autobauded, the (significant) down side of a single bit-time capture, is it lacks precision.
0x55 or "U" is another common Baud character, but usually there, AutoBaud counts along the toggling bits, to get a better timebase precision.

That isn't actually correct. If you get an alternating 0-1-0 (start, b0, b1...) then you know with reasonable certainty what baud is being used. If the sequence is a longer sequence, in particular the <space> $20, then you don't really know if you have received the start-0-0-0-0-0-1 or a slower baud 0-1 sequence.

The idea is that you time the start "0", shift 1/2 bit and continue sampling to see if the character conforms to what you are looking for. Ideally the following b0=1 and so you may also time that, or the whole character, or section of the character. But it is the start that is easiest to measure providing the bit0=1.

I have done this for the AT modem firmware without the aid of a UART for our NetComm and Apple modems. Previously I had done it on a micro to sync to ~53Kb specialised mini-computer serial, where I had to resync on each clock which was every 4.75 instructions.
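
To make the ambiguity concrete, here is a minimal Python sketch (not from the thread) that lists the baud rates consistent with one measured low pulse. The function name, the 20 MHz clock and the 1000-clock pulse are illustrative assumptions.

```python
def candidate_bauds(low_clocks, clk_hz, max_run=9):
    """Map each possible length (in bit times) of a measured low pulse
    to the baud rate that would explain it.

    On an 8N1 line a low run is the start bit plus 0..8 zero data bits,
    so it can be 1..9 bit times long.
    """
    return {run: run * clk_hz / low_clocks for run in range(1, max_run + 1)}


# Hypothetical measurement: the line was low for 1000 clocks of a 20 MHz clock.
bauds = candidate_bauds(1000, 20_000_000)
for run in (1, 2, 6):
    print(f"{run} bit time(s) low -> {bauds[run]:.0f} baud")
```

A lone start bit (a character whose bit 0 is 1, such as "A") and the six-bit low run of $20 produce the same pulse at baud rates a factor of six apart, which is why a single low-time measurement needs a second check on the following bits.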

jmg · 2016-10-05 04:41

Cluso99 wrote: »

That isn't actually correct. If you get an alternating 0-1-0 (start, b0, b1...) then you know with reasonable certainty what baud is being used. If the sequence is a longer sequence, in particular the <space> $20, then you don't really know if you have received the start-0-0-0-0-0-1 or a slower baud 0-1 sequence.

I'm not quite sure what point you are trying to make, as your single bit example will also fail, with the wrong byte sent.

Focusing back on the topic, which is P2 booting, the main downside of multi-edge characters is the P2 might exit reset part-way through, so that needs to be covered. I think a Twin Smart Pin Capture can do that and get good precision on one-char in.

Most Autobaud designs I've seen just sync once, and do not expect short term drift, but Chip has allowed for resync.

With Twin Smart Pin Capture, that should still be possible, but if there is also NCO TX, that may need more smart pins.

Phil Pilgrim (PhiPi) · 2016-10-05 06:04

Although I'm not following P2 stuff at all, autobaud detection does interest me. Here's what I would do:

1. Decide on a fixed length for a data block, e.g. 64 bytes, 256 bytes, etc.

2. Start each data block with two or more NULs: $00 ... $00. This will give the autobaud detector nine low bits in a row to measure (start bit and 8 data bits -- stop bits can vary too much to be included in any measurement).

3. Within the block, if a zero byte is sent as data, precede it with a DLE ($10 $00).

4. Within the block, if $10 is sent as data, it's sent as two DLEs ($10 $10).

That way, anytime two or more NULs are received in a row, the device will know that it can't be data and must be the beginning of another data block. Moreover, it can resynchronize its baudrate clock to the new 9-bit low pulses on every new data block.

You can begin the transmission with as many zero bytes as you think you need to ensure initial synchronization. Or you can send a long BREAK to signal the beginning of a transmission, followed by the NULs, with even greater certainty.

-Phil
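
The framing rules above can be sketched in Python as follows. This is only an illustration of the escaping scheme as described in the post; the function names are made up, and the decoder relies on the NUL NUL marker alone rather than on a fixed block length.

```python
NUL, DLE = 0x00, 0x10
SYNC = bytes([NUL, NUL])  # two NULs mark the start of a block


def encode_block(payload: bytes) -> bytes:
    """Prefix the payload with NUL NUL and escape $00 and $10 with a DLE."""
    out = bytearray(SYNC)
    for b in payload:
        if b in (NUL, DLE):
            out.append(DLE)  # $00 -> $10 $00, $10 -> $10 $10
        out.append(b)
    return bytes(out)


def decode_stream(stream: bytes) -> list:
    """Split a stream into payloads at every unescaped run of two or more NULs."""
    blocks, current, i = [], None, 0
    while i < len(stream):
        b = stream[i]
        if b == DLE and i + 1 < len(stream):
            if current is not None:
                current.append(stream[i + 1])  # escaped byte is taken literally
            i += 2
        elif b == NUL:
            run = 0
            while i < len(stream) and stream[i] == NUL:
                run += 1
                i += 1
            if run >= 2:  # block boundary, also the autobaud resync point
                if current is not None:
                    blocks.append(bytes(current))
                current = bytearray()
            # a lone unescaped NUL is not valid in this scheme and is dropped
        else:
            if current is not None:
                current.append(b)
            i += 1
    if current is not None:
        blocks.append(bytes(current))
    return blocks


payload = bytes([1, 0, 0x10, 0, 0, 255])
wire = encode_block(payload) + encode_block(b"\x10\x00")
print(wire.hex(" "))
print(decode_stream(wire))
```

Because every data $00 is preceded by a DLE, two adjacent unescaped NULs can only be a block header, which is the property the post relies on for resynchronizing.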

Cluso99 · 2016-10-05 06:48

Phil Pilgrim (PhiPi) wrote: »

Although I'm not following P2 stuff at all, autobaud detection does interest me. Here's what I would do:

1. Decide on a fixed length for a data block, e.g. 64 bytes, 256 bytes, etc.

2. Start each data block with two or more NULs: $00 ... $00. This will give the autobaud detector nine low bits in a row to measure (start bit and 8 data bits -- stop bits can vary too much to be included in any measurement).

3. Within the block, if a zero byte is sent as data, precede it with a DLE ($10 $00).

4. Within the block, if $10 is sent as data, it's sent as two DLEs ($10 $10).

That way, anytime two or more NULs are received in a row, the device will know that it can't be data and must be the beginning of another data block. Moreover, it can resynchronize its baudrate clock to the new 9-bit low pulses on every new data block.

You can begin the transmission with as many zero bytes as you think you need to ensure initial synchronization. Or you can send a long BREAK to signal the beginning of a transmission, followed by the NULs, with even greater certainty.

-Phil

Me thinks you have a synchronous communications background

Cluso99 · 2016-10-05 07:03

jmg wrote: »

Cluso99 wrote: »

That isn't actually correct. If you get an alternating 0-1-0 (start, b0, b1...) then you know with reasonable certainty what baud is being used. If the sequence is a longer sequence, in particular the <space> $20, then you don't really know if you have received the start-0-0-0-0-0-1 or a slower baud 0-1 sequence.

I'm not quite sure what point you are trying to make, as your single bit example will also fail, with the wrong byte sent.

????? Of course the code would only be looking for a single character, or a group of similar detectable characters. eg like "A" and "a".

Focusing back on the topic, which is P2 booting, the main downside of multi-edge characters is the P2 might exit reset part-way through, so that needs to be covered. I think a Twin Smart Pin Capture can do that and get good precision on one-char in.

Surely, if the P2 resets, everything gets reset ???

Most Autobaud designs I've seen just sync once, and do not expect short term drift, but Chip has allowed for resync.

"AT" command sets resync on each "AT" sequence.

The older async rates were always 300*2^n. But now rates can also be even numbers like multiples of an even xtal such as 5/10/16/20MHz etc. In the old days we used xtals like 4.9152MHz, 7.9872MHz, etc so we could divide by 2^n to get the baud.
So the accuracy is a little more important as we cannot just assume 114kb would be 115,200. However, serial is supposed to be within 1%, and re-syncing (not autobauding) should be done on each character because it's possible to have anything >= 1 stop-bit.

With Twin Smart Pin Capture, that should still be possible, but if there is also NCO TX, that may need more smart pins.

I cannot see why we need to be so accurate. Certainly a lot of micros have internal oscillators that are 1% nowadays.

jmg · 2016-10-05 09:04

cgracey wrote: »

It would be GREAT to get this going a lot faster, of course, so that loading could be a single step.

I think 'a lot faster' is looking practical.
Using 2 Smart Pins as / to / and \ to \ timing, and with '?' it should capture 8b and 7b, which can sum to 15b axis.

cgracey wrote: »

... To go faster, we need different principles. I was thinking about strings of '?' ($3F) characters, which cause two low bits periodically. That's not much to go on, I know. It would be great if we could use $C0's or something, so that we get low-rate transitions which can be counted better. I think if I used TWO smart pins to count the states, one for high and one for low, I might be able to double the top speed. Remember, also, we don't just have to use whole clock counts. Both smart pin TX and RX have an NCO mode. If we could just get good NCO values, we could go way higher.

To give a target value, I've added numbers to the other MCU One Pin thread, where I pushed up the BAUD until the Tx stop bits started to increase, from the selected 2 Stop bits.

I also changed the B64 to a faster encoded variant, but still has '?' (& many others) free for Autobaud work.

Result is 24.5MHz/2 can sustain 1.225MBd, from MCU

1.531250MBd needs 1 more stop bit, for an effective 1.441176 MBd sustained equiv speed.
At /1, double those to 2.45MBd, & 2.88MBd

3MBd is common upper limit for cheaper USB-UARTS, with a virtual Baud Clock of 12MHz
Faster USB-UARTS can go to 12MBd with a 24MHz virtual Baud Clock.

jmg · 2016-10-05 09:10

Cluso99 wrote: »

I cannot see why we need to be so accurate. Certainly a lot of micros have internal oscillators that are 1% nowadays.

Yup, but the P2 is not one of those, not during boot, so it does need good quality AutoBAUD.

Pushing up the precision certainly does help, as you want to avoid the measurement noise affecting the calculations, and you want to centre the errors.

It can further help, if you report the Calibrate value back to the host, as it can then select a better baud choice.
Or NCO RX can help improve granularity at these elevated speeds.

Some target speeds, that a lowly 8 bit MCU can achieve, are in the reply above.

cgracey · 2016-10-05 18:24

I've got TWO smart pins taking measurements on the RX pin now. One measures lows and one measures highs. There's only need to interrupt on low measurements, as there will be an implied high measurement already available. So far, it seems to have doubled the speed of continuous downloading, but if characters are paced apart, even 1M baud works.

Still looking at how to make it faster. I think we need to go to NCO mode to get sufficient resolution, but that means doing a divide, which is very expensive in terms of time. Also, smaller FPGA implementations don't have CORDIC, so it would have to be a software divide.

garryj · 2016-10-05 19:21

Should CALLPA/CALLPB be doc'd as being in the S/#rel9 group, as PNut throws a "Relative address out of range" error?
My apologies if I'm misunderstanding -- again...

Thanks!

CCCC 1011110 0LI DDDDDDDDD SSSSSSSSS    *   CALLPA  D/#,S/#
CCCC 1011110 1LI DDDDDDDDD SSSSSSSSS    *   CALLPB  D/#,S/#

jmg · 2016-10-05 19:57

cgracey wrote: »

I've got TWO smart pins taking measurements on the RX pin now. One measures lows and one measures highs. There's only need to interrupt on low measurements, as there will be an implied high measurement already available.

Sounds good, can that measure _/= to _/= , and =\_ to =\_ , or was that what you meant ?
TWO smart pins is great, because you can catch the wrong-phase effect with a simple > test.

cgracey wrote: »

So far, it seems to have doubled the speed of continuous downloading, but if characters are paced apart, even 1M baud works.

What does 'paced apart' mean ? Is that 2 stop bits, or something more ?
2 Stop bits is easy to specify, for above some moderate speed.

cgracey wrote: »

Still looking at how to make it faster. I think we need to go to NCO mode to get sufficient resolution, but that means doing a divide, which is very expensive in terms of time. Also, smaller FPGA implementations don't have CORDIC, so it would have to be a software divide.

Good point about divide - what speed is Multiply ?

First, I think we should push up the Baud rate with 2 Stop Bits, and then look into NCO, if the granularity is becoming an issue.
I do have a faster 64B scheme, for example.

For bit-time calculations, I've come up with an approx /15, which assumes you Two SmartPin capture & add (_/= to _/=) , and (=\_ to =\_) on '?' to get a 15b time measurement. Do > check for skew-error, then calculate.

Ideal 900.5*15 = 13507.5

RD15 : round(/15) using multiply and shift
(2^15+13507.5*4369) >>16 = 900
(2^15+13507.7*4369) >>16 = 900
(2^15+13507.8*4369) >>16 = 901 slight movement in step point, due to appx nature.
It is appx, because *4369/2^16 has a ~15ppm error, I think tolerable for this task.

Baud capture values

; Baud   Capture.15bT  P2_Osc Precision LSB (15=8b+7b)
; 9600    31250        32ppm    
; 19200   15625        64ppm
; 28800   10417        96ppm
; 115200  2604        384ppm
; 460800   651        0.153%
; 691200   434        0.230%
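
As a cross-check of the numbers above, here is a small Python sketch of the multiply-and-shift divide and the 15-bit capture counts. The 20 MHz clock is inferred from the table (31250 clocks for 15 bit times at 9600 baud); the function names are illustrative only.

```python
CLK_HZ = 20_000_000  # assumed clock, inferred from the capture table


def div15(total_clocks: int) -> int:
    """Rounded total_clocks / 15 using one multiply and one shift.

    4369 is 2**16 // 15, and adding 2**15 rounds to nearest.
    """
    return (2**15 + total_clocks * 4369) >> 16


def capture_15b(baud: int, clk_hz: int = CLK_HZ) -> int:
    """Expected clock count for 15 bit times (tRR + tFF of '?')."""
    return round(15 * clk_hz / baud)


for baud in (9600, 19200, 28800, 115200, 460800, 691200):
    count = capture_15b(baud)
    print(baud, count, div15(count), f"{1e6 / count:.0f} ppm per count")
```

With integer inputs the step from 900 to 901 falls between 13507 and 13508, matching the rounding point worked out by hand above.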

cgracey · 2016-10-05 20:28

If you have the number of clocks in

garryj wrote: »
Should CALLPA/CALLPB be doc'd as being in the S/#rel9 group, as PNut throws a "Relative address out of range" error?
My apologies if I'm misunderstanding -- again...
Thanks!
CCCC 1011110 0LI DDDDDDDDD SSSSSSSSS    *   CALLPA  D/#,S/#
CCCC 1011110 1LI DDDDDDDDD SSSSSSSSS    *   CALLPB  D/#,S/#

You're right. I didn't make a proper opcode description in instructions.txt. Here's how those two lines should read:

CCCC 1011110 0LI DDDDDDDDD SSSSSSSSS    *   CALLPA  D/#,S/#rel9
CCCC 1011110 1LI DDDDDDDDD SSSSSSSSS    *   CALLPB  D/#,S/#rel9

I'm glad you noticed this. I fixed it for the next release.

cgracey · 2016-10-05 20:45

Jmg,

By paced apart, I mean typed on a keyboard, so many stop bits are there.

The smart pin measurements are for high-period and low-period. Once we get an interrupt from the low-period measurement, we can be sure a high-period measurement is also ready, simplifying the autobaud ISR and cutting the interrupt frequency in half.

16x16 multiply is two clocks.

The autobaud can work at high speed for cases of many stop bits. I don't know if two stop bits would make any difference, though, as there are plenty of fast transitions within data bytes.

Right now, I'm looking at making the smart pin's reporting size variable, according to how many bits are needed. Cutting down the RDPIN time could give us a huge boost:

	$000000xx = byte, 4..6 clocks
	$0000xxxx = word, 6..10 clocks
	$xxxxxxxx = long, 10..18 clocks

Going from long to byte (at fast baud rates) would be equivalent to cutting a lot of instructions from the autobaud ISR.

jmg · 2016-10-05 21:49

cgracey wrote: »

By paced apart, I mean typed on a keyboard, so many stop bits are there.

hehe, yes many stop bits...

cgracey wrote: »

The smart pin measurements are for high-period and low-period. Once we get an interrupt from the low-period measurement, we can be sure a high-period measurement is also ready, simplifying the autobaud ISR and cutting the interrupt frequency in half.

Yes, but if you measure edge to edge, you increase the T-Axis & simplify the possible readings.
A false-reading (bad phase) is detected with a simple > test.

Q: The smart pins can do a Time Capture and Clear on choice of _/= or =\_ Edge, right ?

cgracey wrote: »

16x16 multiply is two clocks.

That's 16x16 -> 32 ? That should work well with the rounded /15 above.

cgracey wrote: »

The autobaud can work at high speed for cases of many stop bits. I don't know if two stop bits would make any difference, though, as there are plenty of fast transitions within data bytes.

Two stop bits matters mostly for Software RX, but it does give a little more time from leading edge of Start to possible next char start bit.
If the AutoBaud char echos a single char, as I think it needs to, most systems would wait for that, which gives rather more time from AutoBAUD to command start. 10~15b times on MCU, and 250us~1ms on USB-UART

cgracey wrote: »
Right now, I'm looking at making the smart pin's reporting size variable, according to how many bits are needed. Cutting down the RDPIN time could give us a huge boost:
	$000000xx = byte, 4..6 clocks
	$0000xxxx = word, 6..10 clocks
	$xxxxxxxx = long, 10..18 clocks
Going from long to byte (at fast baud rates) would be equivalent to cutting a lot of instructions from the autobaud ISR.

Does AutoBaud even need an ISR now ?
At First char P2 is waiting, and if you want resync on AutoBaud, you expect Rx to be valid, so can (early) test for '?' and simply grab the waiting Capture pair. On all other Chars, capture is simply ignored. (so it needs to reflect most recent)

If "?" always echos ( eg "."==NOP) then a One-Pin design needs to pause after "?", practical could be a refresh every 500ms say ?

Of course, the faster this works, the less need there is for resync of Baud rates to cover dT drift

Can you test that RC Osc temp drift on the sample wafers ?

Q: Can the smart-pin uart, accept a Baud-Divider change 'live' - ie while receiving characters.

cgracey · 2016-10-06 19:10

I got the pin messaging reduced and that has enabled auto-baud detection of $20's at 2M baud. The problem is, it only works if the bytes are spread way out. In a high-change-rate byte, like $55, which is received as ...10101010101..., the auto-baud fires every 2 bits, or at 1MHz, and trips all over itself.

Here are the current auto-baud and receiver ISR:

'
'
' Autobaud ISR - $20 -> 10000001001 -> ..1, 6x 0, 1x 1, 2x 0, 1..
'
autobaud_isr	akpin	#rx_ms0			'acknowledge low-time measurement, high-time also ready

		mov	buf2,buf0		'get [-2] low time	(6x if $20)
		rdpin	buf1,#rx_ms1		'get [-1] high time	(1x if $20)
		rdpin	buf0,#rx_ms0		'get [-0] low time	(2x if $20)

		mov	limh,buf2		'make comparison window from [-2] low (6x if $20)
		shr	limh,#4
		neg	liml,limh
		add	limh,buf2
		add	liml,buf2

		mul	buf1,#6			'normalize [-1] high to 6x (1x if $20)
		cmpr	buf1,limh	wc	'check if within window
	if_nc	cmp	buf1,liml	wc

		mov	buf1,buf0		'normalize [-0] low to 6x (2x if $20)
		mul	buf1,#3
	if_nc	cmpr	buf1,limh	wc	'check if also within window
	if_nc	cmp	buf1,liml	wc

	if_c	reti1				'if not $20, exit

		add	buf2,buf0		'$20 (space), add 6x 0 and 2x low times to get 8x time
		shl	buf2,#16-3		'shift up to get clock count for 1x time
		or	buf2,#7			'set 8 bits
		wxpin	buf2,#rx_rcv		'set rx pin baud
		dirl	#rx_rcv			'reset receiver pin
		dirh	#rx_rcv			'(re)enable receiver pin to (re)register frame

		mov	baud,buf2		'save baud for transmit

		mov	rxbyte,#$120		'signal receiver ISR to ignore pin, enter space
		trgint2				'trigger serial receiver ISR in case it wasn't, already (<50k baud)

		reti1				'exit, serial receiver ISR executes next
'
'
' Serial receiver ISR
'
receive_isr	clrb	rxbyte,#8	wc	'triggered by autobaud? if so, rxbyte = $20 (space)

	if_nc	akpin	#rx_rcv			'triggered by receive, acknowledge rx byte
	if_nc	rdpin	rxbyte,#rx_rcv		'triggered by receive, get rx byte

		wrlut	rxbyte,head		'write byte to circular buffer in lut
		incmod	head,#lut_btop		'increment buffer head

		reti2

Interesting idea, jmg, about tracking fall-to-fall and rise-to-rise times, instead of state times. It looks like it would cut down on interrupts, but still permit some ambiguity. Take the $20 case:

anticipated:

    F------F = 7
...10000001001... = $20
          R--R = 3

misfire:

       F------F = 7
...10x10000001001xxxxxx1... = %0000001x immediately followed by %xxxxxx10
             R--R = 3

It's outside of the protocol to have $02 and $03 data present, so this is probably not a problem.

I will try this out. The neat thing about this is that it only needs two samples, not three, and those two samples arrive in one interrupt!

jmg · 2016-10-06 20:44

cgracey wrote: »

It's outside of the protocol to have $02 and $03 data present, so this is probably not a problem.

I will try this out. The neat thing about this is that it only needs two samples, not three, and those two samples arrive in one interrupt!

Yes, a single valid reading of a pair per Autobaud char, is all that is needed.

I'm not sure an interrupt is required, if you can wait on capture, for the first Autobaud, and then read-last-capture
on any Serial INT.

That requires the last capture is always the most-recent (ie new capture overwrites old).
Is that how capture works ?

There are also other one-shot use capture cases, where you might prefer first-capture is kept.

You can make good use of a Pair of captures
* Sum to get more x-Axis
* Check tRR relative to tFF to catch rejects
* Check ratio, to discriminate between two valid AutoBaud chars.

I'm running through candidate Chars, for AutoBaud, the ideal outcome is
* Highest Sum of tRR and tFF, as that gives lowest 'maths jitter'
* A means to reject false captures, eg Reset exits mid-char
eg 0x3e ">" fails this, as the balanced nature of this char, means no > or < test can resolve.

Your suggestion of '?' looks good, as that gives 15b Sum
There is also " " and "@", but with a lower sum of 10b,

I think AutoBaud needs to echo a ACK char, and there needs to be two AutoBaud chars,
one to Set-Baud and define Two-Pin
another to Set-Baud and define One-Pin

jmg · 2016-10-06 21:18

cgracey wrote: »

I got the pin messaging reduced and that has enabled auto-baud detection of $20's at 2M baud.

Good results, but I thought you already had the pin-messaging reduced ?
Or was that opcode related, and this is content related ?
Automated Content based compression is nice, but could introduce a jitter source into code ?

cgracey · 2016-10-06 23:03

jmg wrote: »

cgracey wrote: »

It's outside of the protocol to have $02 and $03 data present, so this is probably not a problem.

I will try this out. The neat thing about this is that it only needs two samples, not three, and those two samples arrive in one interrupt!

Yes, a single valid reading of a pair per Autobaud char, is all that is needed.

I'm not sure an interrupt is required, if you can wait on capture, for the first Autobaud, and then read-last-capture
on any Serial INT.

That requires the last capture is always the most-recent (ie new capture overwrites old).
Is that how capture works ?

Wow! These are great ideas. And, yes, the captures are cyclical.

We COULD just do an initial auto-baud and then check the sample pair in each serial interrupt, as the last thing to happen was the stop bit, making a fresh rise-to-rise measurement. We are already 1/2 into the stop bit when the serial ISR is triggered, so we have less than half a bit period to grab the last fall-to-fall sample before a new fall-to-fall is registered, in case a (low) start bit is next. This could get us to 1M baud. It's all about getting into the ISR and reading the fall-to-fall measurement ASAP.

There are also other one-shot use capture cases, where you might prefer first-capture is kept.

You can make good use of a Pair of captures
* Sum to get more x-Axis
* Check tRR relative to tFF to catch rejects
* Check ratio, to discriminate between two valid AutoBaud chars.

I'm running through candidate Chars, for AutoBaud, the ideal outcome is
* Highest Sum of tRR and tFF, as that gives lowest 'maths jitter'
* A means to reject false captures, eg Reset exits mid-char
eg 0x3e ">" fails this, as the balanced nature of this char, means no > or < test can resolve.

Your suggestion of '?' looks good, as that gives 15b Sum
There is also " " and "@", but with a lower sum of 10b,

I think AutoBaud needs to echo a ACK char, and there needs to be two AutoBaud chars,
one to Set-Baud and define Two-Pin
another to Set-Baud and define One-Pin

I'm not understanding these ideas, yet.

cgracey · 2016-10-06 23:04

jmg wrote: »

cgracey wrote: »

I got the pin messaging reduced and that has enabled auto-baud detection of $20's at 2M baud.

Good results, but I thought you already had the pin-messaging reduced ?
Or was that opcode related, and this is content related ?
Automated Content based compression is nice, but could introduce a jitter source into code ?

Size/time is now content-based, but remember that there is jitter, anyway, from not knowing the phase of the message.

jmg · 2016-10-06 23:16

cgracey wrote: »

Wow! These are great ideas. And, yes, the captures are cyclical.

We COULD just do an initial auto-baud and then check the sample pair in each serial interrupt, as the last thing to happen was the stop bit, making a fresh rise-to-rise measurement. We are already 1/2 into the stop bit when the serial ISR is triggered, so we have less than half a bit period to grab the last fall-to-fall sample before a new fall-to-fall is registered, in case a (low) start bit is next. This could get us to 1M baud. It's all about getting into the ISR and reading the fall-to-fall measurement ASAP.

This is an example of where 2 Stop bits can buy a higher baud rate.
That 0.5 bit, becomes 1.5 bits.

cgracey · 2016-10-06 23:17

Our problem now will be to quickly compute an NCO baud rate, in case the baud is high enough.

I've also been thinking about how the smart pin could solve this baud problem.

cgracey · 2016-10-06 23:19

jmg wrote: »

cgracey wrote: »

Wow! These are great ideas. And, yes, the captures are cyclical.

We COULD just do an initial auto-baud and then check the sample pair in each serial interrupt, as the last thing to happen was the stop bit, making a fresh rise-to-rise measurement. We are already 1/2 into the stop bit when the serial ISR is triggered, so we have less than half a bit period to grab the last fall-to-fall sample before a new fall-to-fall is registered, in case a (low) start bit is next. This could get us to 1M baud. It's all about getting into the ISR and reading the fall-to-fall measurement ASAP.

This is an example of where 2 Stop bits can buy a higher baud rate.
That 0.5 bit, becomes 1.5 bits.

Right. If the sender's baud is over ~500k, then he needs to use TWO stop bits, not one. That's simple enough.

I really like the idea of getting the baud updater down to once per byte, and really only on $20's, as we're only looking for fine adjustments.
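
The stop-bit argument is simple arithmetic, sketched here in Python. The 80 MHz system clock is a hypothetical figure chosen for illustration, not one given in the thread, and the function name is made up.

```python
def read_window_clocks(baud: float, clk_hz: float, stop_bits: int) -> float:
    """Clocks between the serial interrupt (assumed to fire in the middle of
    the first stop bit) and the earliest possible next start-bit edge."""
    bit_clocks = clk_hz / baud
    return (stop_bits - 0.5) * bit_clocks


CLK_HZ = 80_000_000  # hypothetical system clock
for baud in (500_000, 1_000_000):
    for stop_bits in (1, 2):
        window = read_window_clocks(baud, CLK_HZ, stop_bits)
        print(f"{baud} baud, {stop_bits} stop bit(s): {window:.0f} clocks")
```

Under that assumption a second stop bit triples the time available to read the fall-to-fall sample, from 0.5 to 1.5 bit times.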

cgracey · 2016-10-06 23:23

Here's a question:

How do you efficiently compare two values for being within 1/16th of each other?

This has turned into a brain bender for me. I have this code:

'
'
' Autobaud ISR
'
' $20 -> 10000001001 -> fall-to-fall = 7 bits then rise-to-rise = 3 bits
'
autobaud_isr	rdpin	buf1,#rx_ms1		'get old fall-to-fall time	(7x if $20)
		rdpin	buf0,#rx_ms0		'get new rise-to-rise time	(3x if $20)

		akpin	#rx_ms0			'acknowledge rise-to-rise measurement (ISR trigger)

		mul	buf1,#3			'normalize both samples to 21x for comparison
		mul	buf0,#7

		sub	buf0,buf1		'subtract one from the other
		abs	buf0			'get absolute difference
		topone	buf2,buf1		'get magnitude of one original
		sub	buf2,#4			'divide by 16
		shr	buf0,buf2	wz	'shift down by magnitude minus 4, checking for 0

	if_nz	reti1

		'(received $20)

I need to run it through mental scenarios before I'm confident. It seems to work just fine, though.
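
One way to run those scenarios is a small Python model of the comparison. It assumes TOPONE returns the index of the highest set bit (the listing only calls it the "magnitude"), and the function names are illustrative.

```python
def topone(x: int) -> int:
    """Index of the highest set bit (assumed behaviour of TOPONE)."""
    return x.bit_length() - 1


def looks_like_space(t_ff: int, t_rr: int) -> bool:
    """Model of the ISR test: scale the 7-bit and 3-bit times to 21 bit times,
    then require |difference| < 2**(topone(reference) - 4)."""
    ref = t_ff * 3
    other = t_rr * 7
    shift = topone(ref) - 4
    if shift < 0:
        return False  # sketch-only guard for very small counts
    return (abs(other - ref) >> shift) == 0


def within_sixteenth_exact(a: int, b: int) -> bool:
    """Reference check: |a - b| < b / 16, in integer arithmetic."""
    return 16 * abs(a - b) < b


print(looks_like_space(700, 300))          # ideal $20 at 100 clocks per bit
print(looks_like_space(700, 320))          # rise-to-rise about 6.7% long
print(looks_like_space(1300, 580))         # rejected by the shift test...
print(within_sixteenth_exact(4060, 3900))  # ...although within 1/16 exactly
```

Because the tolerance is a power of two taken from the top bit of the reference, the effective window in this model varies between 1/32 and 1/16 of the reference value. That is tighter than a true 1/16 test but never looser.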

jmg · 2016-10-06 23:39

cgracey wrote: »

You can make good use of a Pair of captures
* Sum to get more x-Axis
* Check tRR relative to tFF to catch rejects
* Check ratio, to discriminate between two valid AutoBaud chars.

I'm not understanding these ideas, yet.

The idea is to combine the two capture readings, to get the most information.
eg 0x3f will capture tRR = 8b tFF = 7b, which you can add to get 15b times.
You can also compare them, and if (tRR < tFF), you ignore & continue to wait. (means RST exit was mid-char)

To resolve which one of 0x3f or 0x1f was sent, you compare the difference with tRR, and slice ~ 1.5/8b

I'm favouring 0x3f "?" and 0x1f, as they have higher sums, (tho the last one does go a little against your Kbd Test wish )

0x3f "?" : tRR = 8, tFF = 7 Sum=15b
===\_s_/=0=.=1=.=2=.=3=.=4=.=5=\_6_._7_/=P==T=\_s_/=0=.=1=.=2=.=3=.=4=.=5=\_6_._7_/=P==T=\_s_/=0=.=1=.=2=.=3=.=4=.=5=\_6_._7_/=P=
tRR    |            8                  |   2+T    |            8                  |   2+T    |       8                       |   //
tFF                     7      |        3+T   |                7           |     3+T     | 1 |                                =\_/= 
    f  r             OK tRR:8b,tFF:7b -^          ^- Err:tRR:2b+T,tFF:3b+T        ^OK        ^Err 
Margins: +1b,-1b = usable

Need a Second Char, for AutoBaud One-Pin command 

0x1f -> OK tRR:8b,tFF:6b Sum=14b   Err:tRR:2b+T,tFF:4b+T 
===\_s_/=0=.=1=.=2=.=3=.=4=\_5_._6_._7_/=P==T=\_s_/=0=.=1=.=2=.=3=.=4=\_5_._6_._7_/=P==T=\_
Margins: +2b,-2b = usable, CAN pair with 0x3f

also possible are 0x20 " " and 0x40 "@", but they have smaller sums.
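
A minimal Python sketch of the decision described above, assuming a capture pair (tRR, tFF) in clock counts. The function names and the 100-clocks-per-bit test values are illustrative; the slice point is the ~1.5/8 of tRR mentioned in the post.

```python
def div_round(total: int, bits: int) -> int:
    """Rounded total / bits via multiply and shift, using 2**16 // bits."""
    return (2**15 + total * (2**16 // bits)) >> 16


def classify_capture(t_rr: int, t_ff: int):
    """Return (char, clocks_per_bit) for 0x3f or 0x1f, or None for a
    bad-phase pair (reset exit mid-character)."""
    if t_rr < t_ff:
        return None                  # ignore and wait for the next pair
    diff = t_rr - t_ff               # 1 bit time for 0x3f, 2 for 0x1f
    if 16 * diff < 3 * t_rr:         # slice at 1.5/8 of tRR
        return 0x3F, div_round(t_rr + t_ff, 15)
    return 0x1F, div_round(t_rr + t_ff, 14)


print(classify_capture(800, 700))  # '?' at 100 clocks per bit
print(classify_capture(800, 600))  # 0x1f at 100 clocks per bit
print(classify_capture(250, 350))  # wrong phase: tRR=2b+T, tFF=3b+T
```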

jmg · 2016-10-06 23:58

cgracey wrote: »

Here's a question:

How do you efficiently compare two values for being within 1/16th of each other?

This has turned into a brain bender for me. I have this code:

I'm not quite sure what you are asking, but here are my 10b examples
I add, then multiply before shifting to get the rounding effects carefully centred.

Examples of /15, /14, /13, /10 at 9600     tRR+tFF
 2^16/15 = 4369.06    (15/9600)/(1/20M) = 31250      (2^15+31250*4369) >> 16 = 2083
 2^16/14 = 4681.1428  (14/9600)/(1/20M) = 29166.666 (2^15+29167*4681) >> 16 = 2083
 2^16/13 = 5041.230   (13/9600)/(1/20M) = 27083.33   (2^15+27083*5041) >> 16 = 2083
 2^16/10 = 6553.6     (10/9600)/(1/20M) = 20833.333  (2^15+20833*6554) >> 16 = 2083  <- same 20M/N result, 9601.536 Baud

and a few likely Baud Higher speeds, this is not yet NCO based, just /N - The appeal of 15b "?", over 10b " ", is it will give better NCO outcomes.

EFM8BB1 examples (expanded)

 TB=24.5M/14 = 1750000     CT=(10/TB)/(1/20M) = 114.285  P2B = 20M/((2^15+round(CT)*6554) >> 16) = 1818181.818 100*(1-P2B/TB) = -3.89%
 TB=24.5M/16 = 1531250     CT=(10/TB)/(1/20M) = 130.612  P2B = 20M/((2^15+round(CT)*6554) >> 16) = 1538461.538 100*(1-P2B/TB) = -0.470%
 TB=24.5M/18 = 1361111.11  CT=(10/TB)/(1/20M) = 146.938  P2B = 20M/((2^15+round(CT)*6554) >> 16) = 1333333.333 100*(1-P2B/TB) = 2.04%
 TB=24.5M/20 = 1225000     CT=(10/TB)/(1/20M) = 163.265  P2B = 20M/((2^15+round(CT)*6554) >> 16) = 1250000     100*(1-P2B/TB) = -2.04%
 TB=24.5M/22 = 1113636.36  CT=(10/TB)/(1/20M) = 179.591  P2B = 20M/((2^15+round(CT)*6554) >> 16) = 1111111.111 100*(1-P2B/TB) = 0.226%
 TB=24.5M/24 = 1020833.333 CT=(10/TB)/(1/20M) = 195.918  P2B = 20M/((2^15+round(CT)*6554) >> 16) = 1000000     100*(1-P2B/TB) = 2.040%
 TB=24.5M/26 = 942307.692  CT=(10/TB)/(1/20M) = 212.244  P2B = 20M/((2^15+round(CT)*6554) >> 16) = 952380.952  100*(1-P2B/TB) = -1.069%
 TB=24.5M/28 = 875000      CT=(10/TB)/(1/20M) = 228.571  P2B = 20M/((2^15+round(CT)*6554) >> 16) = 869565.217  100*(1-P2B/TB) = 0.621%
 TB=24.5M/30 = 816666.666  CT=(10/TB)/(1/20M) = 244.897  P2B = 20M/((2^15+round(CT)*6554) >> 16) = 800000      100*(1-P2B/TB) = 2.040%
 TB=24.5M/32 = 765625      CT=(10/TB)/(1/20M) = 261.224  P2B = 20M/((2^15+round(CT)*6554) >> 16) = 769230.769  100*(1-P2B/TB) = -0.471%
 TB=24.5M/34 = 720588.235  CT=(10/TB)/(1/20M) = 277.551  P2B = 20M/((2^15+round(CT)*6554) >> 16) = 714285.714  100*(1-P2B/TB) = 0.874%
 TB=24.5M/36 = 680555.555  CT=(10/TB)/(1/20M) = 293.877  P2B = 20M/((2^15+round(CT)*6554) >> 16) = 689655.172  100*(1-P2B/TB) = -1.337%
 TB=24.5M/38 = 644736.842  CT=(10/TB)/(1/20M) = 310.204  P2B = 20M/((2^15+round(CT)*6554) >> 16) = 645161.290  100*(1-P2B/TB) = -0.0658%
 TB=24.5M/40 = 612500      CT=(10/TB)/(1/20M) = 326.530  P2B = 20M/((2^15+round(CT)*6554) >> 16) = 606060.606  100*(1-P2B/TB) = 1.05%
 TB=24.5M/42 = 583333.333  CT=(10/TB)/(1/20M) = 342.857  P2B = 20M/((2^15+round(CT)*6554) >> 16) = 588235.294  100*(1-P2B/TB) = -0.84%
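
The rows of this table can be regenerated with a few lines of Python that follow the same formulas (20 MHz capture clock, 10-bit-time capture, 6554 as the rounded 2^16/10). The function name is illustrative only.

```python
CLK_HZ = 20_000_000   # the 20M capture clock the table assumes
EFM8_HZ = 24_500_000  # EFM8BB1 oscillator behind the target baud rates


def p2_baud_for(target_baud: float):
    """Return (capture_count, divider, resulting_baud, error_percent)
    for a 10-bit-time capture, following the table's formulas."""
    ct = int(10 * CLK_HZ / target_baud + 0.5)  # round(CT)
    divider = (2**15 + ct * 6554) >> 16        # rounded CT / 10
    p2b = CLK_HZ / divider
    return ct, divider, p2b, 100 * (1 - p2b / target_baud)


for n in range(14, 44, 2):
    ct, divider, p2b, err = p2_baud_for(EFM8_HZ / n)
    print(f"24.5M/{n}: CT={ct} div={divider} P2B={p2b:.3f} err={err:+.3f}%")
```

The error column swings by about plus or minus 2% at the top rates because the divider is a small integer (11 to 34 here), which is the granularity problem that NCO mode is meant to address.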

jmg · 2016-10-07 00:12

cgracey wrote: »

How do you efficiently compare two values for being within 1/16th of each other?

I think you are trying to ratio check 7b vs 3b, as a simple value sanity check ?

I was more of the mind that the AutoBaud char had better be right, and the echo char can confirm immediately.
No need to get too fancy in the software.

I was planning on just checking for (eg)
Which of 3b:7b ie " " , or 2b:8b, "@" and here the difference is either -4 or -6, so a slice test of 5b is ok.
( I also test for tRR > tFF, as that represents a phasing fail. That certainly is possible. Next Char should always work)

A purist might also check for >3 and <7, to bound the expected values a little more, but what to do, if the AutoBaud fails the test ?
Anything echoed is likely to be mangled anyway.
Users would always be advised to 'approach from below' in any AutoBaud testing.

For echo, I think the "~" 0x7e is good, as that is most baud sensitive.
P2 running too fast, will receive as ">" instead of "~"

Reporting the Capture Sum, I think is good for testing, and good for users wanting to push Baud speeds.
A spare command char, can ask for the sum, anytime.

There is room to report this as packed-hex (ie one 16ch ASCII block, not the classic 0..9,A-F)

jmg · 2016-10-07 00:20

Here are the tRR, tFF results for the Space and other candidates.
With +4,+6,+3 signaling the error, and -4,-6,-3 for Valid phase, a simple (tRR < tFF) test is enough.

0x20 " " : tRR = 3, tFF =7 Sum=10b
=======\_s_._0_._1_._2_._3_._4_/=5=\_6_._7_/=P==T=\_s_._0_._1_._2_._3_._4_/=5=\_6_._7_/=P=
tRR                            |   3       |                       7+T    |    3      |   
tFF    |         7                 |     3+T      |        7                  |  3+T     |   
       f                 OK tRR:3b,tFF:7b -^                              ^- Err:tRR:7b+T,tFF:3b+T
Margins: -4b,+4b = usable

0x40 "@" : tRR = 2, tFF =8 Sum=10b
=======\_s_._0_._1_._2_._3_._4_._5_/=6=\_7_/=P==T=\_s_._0_._1_._2_._3_._4_._5_/=6=\_7_/=P=
tRR                                |  2    |                      8+T         |  2    |   
tFF    |         8                     |  2+T     |        8                      | 2+T   |   
       f                 OK tRR:2b,tFF:8b -^                                  ^- Err:tRR:8b+T,tFF:2b+T
Margins: -6b,+6b = usable


0x30 "0" : tRR = 4, tFF =7 Sum=11b
=======\_s_._0_._1_._2_._3_/=4=.=5=\_6_._7_/=P==T=\_s_._0_._1_._2_._3_/=4=.=5=\_6_._7_/=P=
tRR                       |       4        |                   6+T    |    4          |   
tFF    |         7                 |     3+T      |        7                  |  3+T     |   
       f                 OK tRR:4b,tFF:7b -^                          ^- Err:tRR:6b+T,tFF:3b+T
Margins: -3b,+3b = usable


0x38 "8" : tRR = 5, tFF =7 Sum=12b
=======\_s_._0_._1_._2_/=3=.=4=.=5=\_6_._7_/=P==T=\_s_._0_._1_._2_/=3=.=4=.=5=\_6_._7_/=P=
tRR                    |          5        |          5+T         |    5              |   
tFF    |         7                 |     3+T      |        7                  |  3+T     |   
       f                 OK tRR:5b,tFF:7b -^                      ^- Err:tRR:5b+T,tFF:3b+T
Margins: -2b,+2b = usable
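
The tRR and tFF values in these diagrams can be derived mechanically from the 8N1 bit pattern. The Python sketch below does that for the in-phase case only. The function names are illustrative, and it handles only characters that give exactly two rising and two falling edges per frame.

```python
def frame_levels(char: int) -> list:
    """Line levels for one 8N1 frame: start (0), 8 data bits LSB first, stop (1)."""
    return [0] + [(char >> i) & 1 for i in range(8)] + [1]


def edge_intervals(char: int):
    """Return (tRR, tFF) in bit times, starting from an idle-high line."""
    levels = [1] + frame_levels(char)  # idle high before the start bit
    rises = [i for i in range(1, len(levels))
             if levels[i - 1] == 0 and levels[i] == 1]
    falls = [i for i in range(1, len(levels))
             if levels[i - 1] == 1 and levels[i] == 0]
    if len(rises) != 2 or len(falls) != 2:
        raise ValueError("character does not give exactly two rises and two falls")
    return rises[1] - rises[0], falls[1] - falls[0]


for ch in (0x20, 0x40, 0x30, 0x38, 0x3F, 0x1F):
    t_rr, t_ff = edge_intervals(ch)
    print(f"0x{ch:02x}: tRR={t_rr} tFF={t_ff} "
          f"sum={t_rr + t_ff} margin={t_rr - t_ff:+d}")
```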

jmg · 2016-10-07 00:42

Test cases :
The MCU code I have repeats every ~ 500us waiting for an Autobaud ack reply, but for systems that cannot easily pace their sending, or are more 'one way' in nature, another 'acid test' scenario is

The Boot loader needs to tolerate a long stream of AutoBAUD chars, with reset exit of any Phase.

Something like a RF link, may choose to
* Issue a Reset Pulse
* Stream AutoBaud (eg 300ms) This time is > worst case Reset Exit.
* Stream Data

or, using your simpler Keyboard mode, a user can hold down the AutoBAUD Char, while they manually reset P2.
