Prop2 ROM code

jmg · 2016-10-21 21:03

cgracey wrote: »

I've got it working at 2M baud, with a few more innovations, but I've encountered what seems like a cog hardware bug. I'm totally stumped right now.

That's one of the side benefits of pushing the envelope a little - it can uncover subtle HW issues, and much better to find those now

Does it look like an Opcode, Timing, Dual Interrupt, Smart pin interface, or smart pin cell issue ?

cgracey · 2016-10-21 21:59

I found the problem. I was using the SCL instruction, instead of SUB+ABS+SHR+CMP, and the values were overflowing $4000 (1.0), due to dead time between characters. It was causing frequent failures that I couldn't see from the LEDs. I thought to slow things down and then I could see some unexpected flickering. Got that fixed, but it cost two MAX instructions, which puts it back to the same size as before, but this method is much safer. These overflow problems might have been giving me subtle trouble all along.

cgracey · 2016-10-21 22:37

It's running very well now at 2M baud. It even runs at 3M baud if you pace the characters.

'
'
' Autobaud ISR - detects "> "
'
'	      falls |--7---|
'	  $3E -> ..10011111001..10000001001..
'	        highs |-5-|
'
autobaud_isr	rdpin	a0,#rx_tne		'2	get fall-to-fall time	(7x if $3E)
		rdpin	a1,#rx_ths		'2	get high time		(5x if $3E)

		cmpr	a0,limit	wc	'2	make sure both measurements are within limit
	if_nc	cmpr	a1,limit	wc	'2

		scl	a0,norm0		'2	if they are within 1/64th of each other, $3E
	if_nc	cmpr	a1,0		wc	'2
		scl	a1,norm1		'2
	if_nc	cmpr	a0,0		wc	'2

	if_c	reti1				'2/4	if not $3E, exit

		resi1				'4	got $3E, resume on next interrupt

		akpin	#rx_tne			'2	acknowledge pin
		dirl	#rx_rcv			'2	reset receiver
		mul	a0,baud0		'2	compute baud rate
		setbyte	a0,#7,#0		'2	set 8 bit word size
		wxpin	a0,#rx_rcv		'2	set receiver baud rate and word size

		resi1				'4	resume on next interrupt

		akpin	#rx_tne			'2	acknowledge pin
		dirh	#rx_rcv			'2	enable receiver before next start bit
		mov	t0,a0			'2	save baud rate for transmitter
		mov	ijmp1,#autobaud_isr	'2	point back to initial ISR

		reti1				'4	exit


limit		long	$5A04			'count limit ($5A04 = 1.4065, keeps SCL within $7FFF)

norm0		long	$4100*5/7		'fall-to-fall normalization factor ($4100 = 1.015625)
norm1		long	$4100*7/5		'high-time normalization factor

baud0		long	$1_0000/7		'fall-to-fall baud computation factor
'
'
' Receiver ISR
'
receiver_ISR	rdpin	a2,#rx_rcv		'2	get received chr
		shr	a2,#32-8		'2

		wrlut	a2,head			'2	enter chr into receiver buffer
		incmod	head,#lut_btop		'2

		reti2

The SCL instruction does a signed multiply on the low words of the operands and returns a signed result whose LSB is bit 14 of the product, unlike bit 16 for SCLU (scale, unsigned). No result is written, but the result is substituted into the next instruction's S operand. This allows accumulation, comparison, etc, without requiring an extra register. Ariba came up with this scheme, which is pretty ingenious, I think. It works well here. We just need to limit the values so they remain positive through multiplication.
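
The ratio test in autobaud_isr can be tried out off-chip to see how the limit and the two normalization factors interact. What follows is a rough Python model, not the ROM code. It assumes SCL behaves as described above (signed multiply of the low words, result taken from bit 14 of the product) and that each CMPR sets C, meaning "reject", when its S operand is below D. The function names are made up for illustration.

```python
def s16(v):
    """Interpret the low 16 bits of v as a signed word."""
    v &= 0xFFFF
    return v - 0x10000 if v & 0x8000 else v

def scl(d, s):
    """Model of SCL as described in the post: signed multiply of the low
    words, with bit 14 of the product as the result's LSB ($4000 = 1.0)."""
    return (s16(d) * s16(s)) >> 14

LIMIT = 0x5A04            # count limit from the listing
NORM0 = 0x4100 * 5 // 7   # fall-to-fall normalization factor
NORM1 = 0x4100 * 7 // 5   # high-time normalization factor

def looks_like_3e(a0, a1):
    """a0 = fall-to-fall count (7 bit times), a1 = high count (5 bit times).
    Assumption: each CMPR sets C (reject) when its S operand is below D."""
    if a0 > LIMIT or a1 > LIMIT:     # cmpr a0,limit / cmpr a1,limit
        return False
    if scl(a0, NORM0) < a1:          # a0 scaled by about 5/7 must reach a1
        return False
    if scl(a1, NORM1) < a0:          # a1 scaled by about 7/5 must reach a0
        return False
    return True

print(looks_like_3e(700, 500))   # counts in a 7:5 ratio
print(looks_like_3e(800, 500))   # counts in an 8:5 ratio
print(scl(LIMIT, NORM1))         # largest scaled value the limit allows
```

In this model the last line stays at or below $7FFF, which matches the comment on the limit constant.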

cgracey · 2016-10-21 23:41

jmg wrote: »

cgracey wrote: »

I've got it working at 2M baud, with a few more innovations, but I've encountered what seems like a cog hardware bug. I'm totally stumped right now.

That's one of the side benefits of pushing the envelope a little - it can uncover subtle HW issues, and much better to find those now

Does it look like an Opcode, Timing, Dual Interrupt, Smart pin interface, or smart pin cell issue ?

It was me, fortunately. The hardware is fine.

jmg · 2016-10-21 23:46

cgracey wrote: »

It's running very well now at 2M baud. It even runs at 3M baud if you pace the characters.

Good speed numbers, but does this still require a double-Autobaud before RX starts ? (which has sync issues, as you cannot guarantee pairs are always seen by P2 )

The ">" gives you 3 bit times at 1 Stop and 4 bit times at 2 Stop - seems enough ? - what Baud rate is possible, if you drop the double-char dictate ?

What UART do you use to test ?
The 2MBd may be on the lucky side, as I calculate better than one part in 64 measure, needs <= 1.5625MBd at 20MHz SysCLK
Does it need to be 1/64 for 5:7 ?
I think it can be <= 1/32 ( which can reach 3MBd for a better than one part in 32 measurement ?)

Also, 20 SysCLKS is 1us, which consumes 100% of CPU when receiving 2MBd 0x55, and proportionally less on fewer =\_ chars.

(or 50% of CPU at 1MBd, which shows how much possible SHA time this super-jump capable AutoBaud is consuming.)
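
These figures can be reproduced with a little arithmetic. The sketch below assumes the 5-bit high time is the shorter of the two measurements and so sets the resolution, that a 0x55 stream produces one falling edge every 2 bit times, and that each edge costs about 20 SysCLKs of ISR time (the figure quoted above). The function names are illustrative only.

```python
SYSCLK_HZ = 20_000_000

def max_baud_for_resolution(parts, bits_measured=5, sysclk_hz=SYSCLK_HZ):
    """Highest baud at which bits_measured bit times span at least
    `parts` SysCLKs, i.e. a one-part-in-`parts` measurement."""
    return sysclk_hz * bits_measured / parts

def isr_load(baud, isr_clocks=20, bits_per_edge=2, sysclk_hz=SYSCLK_HZ):
    """Fraction of CPU time spent in the ISR when one edge interrupt
    arrives every bits_per_edge bit times (2 for a 0x55 stream)."""
    clocks_between_edges = bits_per_edge * sysclk_hz / baud
    return isr_clocks / clocks_between_edges

print(max_baud_for_resolution(64))   # 1/64 resolution
print(max_baud_for_resolution(32))   # 1/32 resolution
print(isr_load(2_000_000))           # 0x55 stream at 2 MBd
print(isr_load(1_000_000))           # 0x55 stream at 1 MBd
```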

A test string should have at least one 0x55, and ideally more, maybe 3 in a row ?

Another means to avoid this saturate effect, would be to skip using 0x55 in the Base64.
I already did that, expecting to use 0x55 as the tracking character.

cgracey · 2016-10-24 04:29

Jmg, you are correct. Autobauding continuously is a huge bandwidth eater. I'm going back to the initial-then-maintenance technique. It is most practical that a host asserts reset, then communicates. I think I will add a 60-second terminal timeout, in case nothing happens, so that the chip will power down.

jmg · 2016-10-24 05:50

cgracey wrote: »

Jmg, you are correct. Autobauding continuously is a huge bandwidth eater. I'm going back to the initial-then-maintenance technique. It is most practical that a host asserts reset, then communicates.

Sounds good. I could not think of any low-overhead way to do Full-Range-AutoBaud continually, but tracking-trim via 0x55 should be low overhead, and able to track rather larger than the standard BAUD error bands. (allowing longer pauses in any download)

cgracey wrote: »

I think I will add a 60-second terminal timeout, in case nothing happens, so that the chip will power down.

OK. You could nudge that timeout up a little, as some 'modern networks & systems' can take quite some time to get their ducks in a row.
(eg someone may do a power-cycle, on something like a RaspPi, and it needs to boot, then start the link.)
Maybe 10 mins ?, or someone may have other numbers they can offer.. ?

Tor · 2016-10-24 06:08

fwiw, the pi boots in just seconds (at least the pi3, with a modern sd card).

jmg · 2016-10-26 02:01

cgracey wrote: »

Jmg, you are correct. Autobauding continuously is a huge bandwidth eater. I'm going back to the initial-then-maintenance technique. It is most practical that a host asserts reset, then communicates....

What are the current command codes for Half-duplex (one-Pin) and ENQ for OK or ERR signaling (is the reply now one char?)

My code has assumed two AutoBaud chars, one for AutoBaud_Set_Duplex, and another for AutoBaud_Set_HalfDuplex, as that simplifies the polling.
a)
repeat
Tx(AutoBaudChar) // Assumes AutoBaud echos when command checked and decoded.
until (Rx=Ack)

However, there are other solutions too :

You can reserve one char for First(raw) AutoBaud like '@', and test carefully for just that one value. Here tLL:7, tRR:8 = Valid.

Then, you need some echo to confirm, and I think you mentioned some single char ENQ commands are now there ?

The Simplex command can stand-alone, or it can alias with an ENQ, so a Simplex command always echos something as confirmation.
A complement SetDuplex char could make testing faster, and can give 2 similar commands.
b)
repeat
Tx(AutoBaudChar) // No echo
Tx(SetSimplexChar) // included Echo is simpler
until (Rx=Ack_Simplex)

or
c)
repeat
Tx(AutoBaudChar) // No echo
Tx(SetSimplexChar) // No echo
Tx(EnqChar) // if SetSimplex does not include echo
until (Rx=Ack)

b) is simpler than c)

This 2 & 3 char pattern, dictates that those command chars are carefully chosen to avoid any possible false triggers of AutoBaud Char, at any phase of repeat reset exit.

This leads to

Possible Simplex and Duplex commands : MUST avoid tLL:7, tRR:8 result ratios
; 02D "-" 0b 0010 1101 : 
; =======\_s_/=0=.=1=\_2_/=3=\_4_/=5=\_6_._7_/=P==T=\_s_._0_._1_._2_._3_._4_._5_/=6=\_7_/=P==T=\_s_
; tFF               3|r     2|r     2|r          3+T|r                             8|r     2+T |
; tLL               1|      1|      1|            2 |                              7|         1|
                     ^x      ^x      ^x             ^x                              ^V         ^x
; 03D "=" 0b 0011 1101 : 
; =======\_s_/=0=.=1=\_2_/=3=.=4=.=5=\_6_._7_/=P==T=\_s_._0_._1_._2_._3_._4_._5_/=6=\_7_/=P==T=\_s_
; tFF               3|r             4|r          3+T|r                             8|r     2+T |  
; tLL               1|              1|            2 |                              7|         1|  
                     ^x              ^x             ^x                              ^V         ^x
; Both look ok for <> 7/8 in any phase.

These are easy to remember symbolically, for one and two wires...

"-" is then Set-Simplex & echo -> Set link as Simplex & report ok.err since last inquiry.
"=" is then Set-Duplex & echo -> Set link as Duplex, & report ok.err since last inquiry.
"@" is AutoBaud raw char, and a simple NOP when AutoBaud is in tracking/maintenance mode, to tolerate link latencies.
"U" (0x55) is tracking/maintenance Baud char

This is compatible with my smaller Base64 set of "0"-"9", then "A"-"w", skipping "U"

A test for this, is to send a long repeating string of "@-" or "@=" to a P2 coming out of reset.
Echo should be some lesser number of ok.Acks.
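
As a quick sanity check on the character budget, the reduced Base64 set can be counted in a few lines of Python. This is only a sketch: it reads "A"-"w" as the contiguous ASCII range from "A" through "w" (which includes a few punctuation codes between "Z" and "a"), an interpretation that is assumed here because it yields exactly 64 symbols.

```python
import string

def build_alphabet():
    """Digits, then every ASCII code from "A" through "w", minus "U"."""
    letters = [chr(c) for c in range(ord("A"), ord("w") + 1) if chr(c) != "U"]
    return list(string.digits) + letters

alphabet = build_alphabet()
reserved = ["-", "=", "@", "U"]      # command and tracking chars named above

print(len(alphabet))
print([c for c in reserved if c in alphabet])
```

Under that reading, none of the four reserved characters falls inside the data alphabet.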

jmg · 2016-10-26 02:23

Tor wrote: »

fwiw, the pi boots in just seconds (at least the pi3, with a modern sd card).

Good to know.

The chatter in this thread from 2012, gives values from ~10s to 2m 27s

https://www.raspberrypi.org/forums/viewtopic.php?f=63&t=6212

Does look like 60s is a bit light, for all possible (P2+ RaspPi combinations), from Power-On/Cycle

cgracey · 2016-10-26 05:50

I've got everything running at 1.75Mbaud now, at 20MHz. I hope to get it a little faster, but derating for real-world RC variance means we are still solid at 1Mbaud.

I'm using ">" as the autobaud character. It needs an initial ">>" and periodic ">" characters to maintain baud. It times out at 60 seconds, fastest case, and then powers down as much as possible. So, now, a reset is needed to get its attention, which would have always been a practical requirement, anyway.

Realizing a need for fast bit tables, I added a new instruction: ALTB. It's like ALTD, but instead of adding D[8:0] and S/#[8:0] to get a D field for the next instruction, it adds D[13:5] and S/#[8:0], so that D can be a bit number. This works nicely with the bit instructions to make randomly-accessible bit fields which can span the entire cog register memory:

	ALTB	bitnum,#bitbase		'uses bitnum[13:5] as long index
	TESTB	0,bitnum	WC	'uses bitnum[4:0] as bit index
	(use C)

bitbase	res	8	'field of 256 bits

In the booter ROM, this is useful for quickly checking if we have 1 of 7 whitespace characters.

Note: To make room for ALTB, I got rid of SETBYTS which set all D bytes to S/#[7:0].
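
As a rough off-chip model of the addressing described above (a sketch, not the hardware definition): ALTB selects the long from bits 13:5 of the bit number, and the bit instruction that follows uses bits 4:0. The Python helper names are invented for illustration, and the 8-long field mirrors the "res 8" example.

```python
def altb_split(bitnum):
    """Split a bit number the way the ALTB/TESTB pair is described:
    bits 13:5 select the long, bits 4:0 select the bit within it."""
    return (bitnum >> 5) & 0x1FF, bitnum & 0x1F

def set_bit(field, bitnum):
    index, bit = altb_split(bitnum)
    field[index] |= 1 << bit

def bit_is_set(field, bitnum):
    index, bit = altb_split(bitnum)
    return (field[index] >> bit) & 1 == 1

bitbase = [0] * 8          # field of 256 bits, like "bitbase res 8"
set_bit(bitbase, 200)

print(altb_split(200))
print(bit_is_set(bitbase, 200), bit_is_set(bitbase, 199))
```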

jmg · 2016-10-26 06:23

cgracey wrote: »

I've got everything running at 1.75Mbaud now, at 20MHz. I hope to get it a little faster, but derating for real-world RC variance means we are still solid at 1Mbaud.

I'm using ">" as the autobaud character. It needs an initial ">>" and periodic ">" characters to maintain baud. It times out at 60 seconds, fastest case, and then powers down as much as possible. So, now, a reset is needed to get its attention, which would have always been a practical requirement, anyway.

Some issues with that :

* ">" has more hi-time than "@" and has less sampling time, so lower precision.

* Double chars are a problem to manage - eg how does the host know when the P2 has come out of reset ?

* The "@" has tLL = 7, tFF =8, and gives a result on end of 6th bit - is that not enough time to use a single Char AutoBaud ?

Did you look at 0x55 as the periodic char to maintain baud, as the checking overhead on that is way less, than on ">" or "@" ?
ie it gives you the most cycles for SHA work.

cgracey · 2016-10-26 06:53

Jmg, here is the current code. Do you see any improvements possible here? It works just fine, but could maybe be faster.

'
'
' Autobaud ISR - detects initial "> "
'
'	      falls |--7---|
'	  $3E -> ..10011111001..10000001001..
'	        highs |-5-|
'
autobaud_isr	rdpin	a0,#rx_tne		'2	get fall-to-fall time	(7x if $3E)
		rdpin	a1,#rx_ths		'2	get high time		(5x if $3E)

		cmpr	a0,limit	wc	'2	make sure both measurements are within limit
	if_nc	cmpr	a1,limit	wc	'2

		scl	a0,norm0		'2	if they are within 1/35th of each other, $3E
	if_nc	cmpr	a1,0		wc	'2
		scl	a1,norm1		'2
	if_nc	cmpr	a0,0		wc	'2

	if_c	reti1				'2/4	if not $3E, exit

		resi1				'4	got $3E, resume on next interrupt

		akpin	#rx_tne			'2	acknowledge pin
		mul	a0,baud0		'2	compute baud rate
		setbyte	a0,#7,#0		'2	set word size to 8 bits
		wxpin	a0,#rx_rcv		'2	set receiver baud rate and word size

		resi1				'4	resume on next interrupt

		dirh	#rx_rcv			'2	enable receiver before next start bit
		wrpin	mtpe,#rx_tne		'2	change rx_tne to measure positive edges
		setse1	#%110<<6+rx_rcv		'2	set se1 to trigger on rx_rcv high
		mov	t0,a0			'2	save baud rate for transmitter

		resi1				'4	resume on next interrupt
'
'
' Receiver ISR - detects maintenance ">" chrs
'
'	        rises |--7---|
'	  $3E -> ..10011111001..
'
		rdpin	a0,#rx_tne		'2	get rise-to-rise time	(7x if $3E)

		rdpin	a1,#rx_rcv	wc	'2	get received chr
	if_c	reti1				'2/4	ignore if msb set

		shr	a1,#32-8		'2	shift to lsb justify

		cmp	a1,#">"		wz	'2	autobaud chr?

	if_nz	wrlut	a1,head			'2	enter chr into receiver buffer
	if_nz	incmod	head,#lut_btop		'2	increment buffer head

	if_nz	reti1				'4	exit


.baud		mul	a0,baud0		'2	autobaud chr, compute baud rate
		setbyte	a0,#7,#0		'2	set word size to 8 bits
		wxpin	a0,#rx_rcv		'2	set receiver baud rate and word size
		mov	t0,a0			'2	save baud rate for transmitter

		reti1				'4	exit


limit		long	$58E4			'count limit ($58E4 = 1.3889, keeps SCL within $7FFF)

norm0		long	$41D4*5/7		'fall-to-fall normalization factor ($41D4 = 1.0 + 1/(7*5))
norm1		long	$41D4*7/5		'high-time normalization factor

baud0		long	$1_0000/7		'7x baud computation factor
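
To see what the mul/setbyte pair in the listing produces, here is a small Python model. It assumes MUL is an unsigned 16x16 multiply, that the upper 16 bits of the product end up as whole SysCLKs per bit with a fraction below them (since baud0 is 65536/7), and that SETBYTE replaces byte 0 with 7 for an 8-bit word, as the listing's comments suggest. The exact field layout WXPIN expects is not stated in the post, so treat that part as an assumption.

```python
BAUD0 = 0x1_0000 // 7     # 7x baud computation factor from the listing

def baud_word(a0):
    """Model of: mul a0,baud0 / setbyte a0,#7,#0.
    a0 is the measured SysCLK count across 7 bit times."""
    product = ((a0 & 0xFFFF) * BAUD0) & 0xFFFFFFFF   # assumed unsigned 16x16 multiply
    return (product & ~0xFF) | 7                      # byte 0 := 7 (8-bit word size)

def clocks_per_bit(word):
    """Read the upper bits back as a 16.16 fixed-point bit period,
    ignoring the word-size byte."""
    return (word & ~0xFF) / 0x1_0000

w = baud_word(70)          # 7 bit times at 10 SysCLKs per bit (2 Mbaud at 20 MHz)
print(hex(w), clocks_per_bit(w))
```

Because 65536/7 is truncated to an integer, the computed bit period comes out slightly below the true value in this model.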

jmg · 2016-10-26 09:41

cgracey wrote: »

Jmg, here is the current code. Do you see any improvements possible here? It works just fine, but could maybe be faster.

1) I think there is just enough time, to AutoBaud from a single char, inside the first interrupt ? On ">" and maybe on "@"

ie by some code order shuffling like

' Autobaud  Raw ISR - detects initial single AutoBaud Char  "> "
' Another valid command may be right behind this, so timing matters to enable RX, do RX side asap
'
'	      falls |--7---|
'	  $3E -> ..10011111001..10000001001..
'	        highs |-5-|
'
autobaud_raw_isr	rdpin	a0,#rx_tne		'2	get fall-to-fall time	(7x if $3E)
		rdpin	a1,#rx_ths		'2	get high time		(5x if $3E)

		cmpr	a0,limit	wc	'2	make sure both measurements are within limit
	if_nc	cmpr	a1,limit	wc	'2

		scl	a0,norm0		'2	if they are within 1/35th of each other, $3E
	if_nc	cmpr	a1,0		wc	'2
		scl	a1,norm1		'2
	if_nc	cmpr	a0,0		wc	'2

	if_c	reti1				'2/4	if not $3E, exit
' Passes Baud Char ratio tests, so extract time information, and apply to RX
		mul	a0,baud0		'2	compute baud rate
		setbyte	a0,#7,#0		'2	set word size to 8 bits
		wxpin	a0,#rx_rcv		'2	set receiver baud rate and word size
		dirh	#rx_rcv			'2	enable receiver before next start bit
' Rx done first, now can complete other housekeeping ~ 28 SysCLKs, to start RX engine
' budgets:  2.5 bits, at 1.75Mbd is 28.57 SysCLKs
' ">" edges have  just under 3 or 4 bits of timing margin, for 1 or 2 Stop Bits.
		wrpin	mtpe,#rx_f5e		'2	change rx_tne mode to measure 0x55 falling edges
' etc
		mov	t0,a0			'2	save baud rate for transmitter
		reti1				'4	exit

RX setup gets pushed up earlier, as far as possible, and any Capture Mode changes can run after the RxPin is enabled.
ie This looks to have margin at your 1.75Mbd & ">" to do this in a single Char.

2) If you test for a valid ">" before applying AutoBaud-trim, that is a much narrower catch range, as you must be inside valid-baud limits.

In contrast a 0x55 test needs only pass the Highest Frequency test, which can be something over 10%, so you can tolerate much longer pauses / drift.
I get ~15% minus Baud Tolerance, and no imposed limit on Baud Increased.

The code is very similar to what you have above, just a SmartPin cell mode change to X=5 edges-Time-capture.
Total flight time is less in re-trim, which matters as the RxINT may have been late

' Receiver ISR - detects maintenance "U" chrs, does not RxBuffer  a "U"
'
'	        Falls 1 2  3  4  5  
'	  $55 -> ..10101010101  Capture for 5 edges, 
'
		rdpin	a0,#rx_tne		'2	get X=5 Cycles time
		wrpin	??,#??		'2	opcode to re-arm the next PinCell X=5 capture,  < 5 will read 0,  >= 5 is dT
		cmp	a0,MaintTrim	wz	'2	MaintTrim is time for 5 edges * ~90%, next slowest is > 20% longer
' branch here, nz is RX, z is update Baud
' Need prompt Baud update, as 0x55 can be followed by valid RX, and RI may be late due to drift.  Use 2 Stop bits for highest Baud rates.

Rx:
		rdpin	a1,#rx_rcv	wc	'2	get received chr
		shr	a1,#32-8		'2	shift to lsb justify
...
	        wrlut	 a1,head			'2	enter chr into receiver buffer
	        incmod	head,#lut_btop		'2	increment buffer head


		reti1				'4	exit

.baud		mul	a0,baud0		'2	autobaud chr, compute baud rate, based on 5=\_ = 8b times
		setbyte	a0,#7,#0		'2	set word size to 8 bits
		wxpin	a0,#rx_rcv		'2	set receiver baud rate and word size
		mov	t0,a0			'2	save baud rate for transmitter
		reti1				'4	exit

Tightest times look to be on drift slower, so maybe a Smart Pin X=5 mode can interrupt, which only occurs on 0x55, as RxInt re-arms the counter. (all other possible chars are 4 or less, so no int).

This buys 1.5 more bit times for the maintenance-detect case, as the baud INT comes from the last data bit fall, not mid-stop bit

ie, it then codes something like this, which looks smaller and even faster on Rx-chars


' Maint ISR - detects maintenance "U" chrs, does not RxBuffer  a "U"
'
'	        Falls 1 2  3  4  5  
'	  $55 -> ..10101010101  Capture for 5 edges, 
'
Maint_ISR:    ' At start of last Data bit, 1.5 bit times before Mid-Stop
		rdpin	a0,#rx_t5f		'2	get X=5 Cycles time
' no value test needed as the 5 edges within RIs is enough, every RI re-arms the X=5 capture.
.baud		mul	a0,baud0		'2	autobaud chr, compute baud rate, based on 5x
		setbyte	a0,#7,#0		'2	set word size to 8 bits
		wxpin	a0,#rx_rcv		'2	set receiver baud rate and word size << Re-Prime RX needs to be ASAP, for late RI cases.
		mov	t0,a0			'2	save baud rate for transmitter
		reti1				'4	exit



' Receiver ISR - Mid Stop Bit , any 0x55 is trapped above, this code can purely receive.
Rx_ISR:
		wrpin	??,#??		'2	opcode to re-arm the next PinCell X=5 capture,  < 5 will read 0,  >= 5 is dT
		rdpin	a1,#rx_rcv	wc	'2	get received chr
		shr	a1,#32-8		'2	shift to lsb justify
...
	        wrlut	 a1,head			'2	enter chr into receiver buffer
	        incmod	head,#lut_btop		'2	increment buffer head
		reti1				'4	exit

This also shifts the catch range, now, any step Up in Baud (down in SysCLK), is fine, as 0x55 occurs before RI int.
Drifts down in Baud (Up in SysCLK) need to react & reset before the Start bit, and may get a bit tangled if Maint_ISR and RI drift into the same time-space.
If Maint_ISR completes before RI, (and I guess, removes it, by reset of Rx state machine via new Baud) then it needs only that Maint_ISR enters first.
The margin for that is ~ +15.79% in SysCLK

TonyWaite · 2016-10-26 10:39

Re the Raspberry Pi, bootup time using the 'Ultibo' baremetal kernel is less than one second.
There's also a healthy overlap between the Ultibo and P2 communities.

evanh · 2016-10-26 11:00

cgracey wrote: »

Realizing a need for fast bit tables, I added a new instruction: ALTB. ...

That's cheating!

kwinn · 2016-10-26 11:31

evanh wrote: »

cgracey wrote: »

Realizing a need for fast bit tables, I added a new instruction: ALTB. ...

That's cheating!

Then he really is trying hard enough.

cgracey · 2016-10-26 12:37

evanh wrote: »

cgracey wrote: »

Realizing a need for fast bit tables, I added a new instruction: ALTB. ...

That's cheating!

Look at how it got rid of a string of seven 'IF_NZ CMP X,#whitechr WZ' instructions that was slowing things down:

'
' Get chr after any whitespace
'
get_chr		call	#get_rx			'get byte into x

		altb	x,#whitespace		'whitespace?
		testb	0,x		wz

	if_nz	jmp	#get_chr		'if whitespace, get another byte

		ret


whitespace	long	%00000000_00000000_00100110_00000000		'cr, lf, tab
		long	%00100000_00000000_01000000_00000011		'"=", ".", "!", space
		long	%00000000_00000000_00000000_00000000
		long	%00000000_00000000_00000000_00000000

Now, all seven whitespace characters are detected in just two instructions. The code is even one long smaller.
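
The two non-zero table longs can be cross-checked against the character list in the comments. The Python sketch below rebuilds the table from the seven characters, using the same split as the ALTB/TESTB pair (bits 13:5 pick the long, bits 4:0 pick the bit). The helper names are hypothetical, and the table only covers codes 0 to 127.

```python
WHITESPACE = ["\r", "\n", "\t", "=", ".", "!", " "]

def build_bit_table(chars, longs=4):
    """Set one bit per character code in a field of `longs` 32-bit words."""
    table = [0] * longs
    for ch in chars:
        code = ord(ch)
        table[code >> 5] |= 1 << (code & 0x1F)
    return table

def in_table(table, code):
    """Look up a code (0..127 here) the way ALTB + TESTB would."""
    return (table[(code >> 5) & 0x1FF] >> (code & 0x1F)) & 1 == 1

table = build_bit_table(WHITESPACE)
for value in table:
    print(f"%{value:032b}")
```

The first two printed longs should match the constants in the listing, and the remaining two stay zero.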

cgracey · 2016-10-26 12:45

I also added jump instructions for all the pollable/waitable events:

CCCC 1011111 01I 000000000 SSSSSSSSS    **  JINT    S/#
CCCC 1011111 01I 000000001 SSSSSSSSS    **  JCT1    S/#
CCCC 1011111 01I 000000010 SSSSSSSSS    **  JCT2    S/#
CCCC 1011111 01I 000000011 SSSSSSSSS    **  JCT3    S/#
CCCC 1011111 01I 000000100 SSSSSSSSS    **  JSE1    S/#
CCCC 1011111 01I 000000101 SSSSSSSSS    **  JSE2    S/#
CCCC 1011111 01I 000000110 SSSSSSSSS    **  JSE3    S/#
CCCC 1011111 01I 000000111 SSSSSSSSS    **  JSE4    S/#
CCCC 1011111 01I 000001000 SSSSSSSSS    **  JPAT    S/#
CCCC 1011111 01I 000001001 SSSSSSSSS    **  JFBW    S/#
CCCC 1011111 01I 000001010 SSSSSSSSS    **  JXMT    S/#
CCCC 1011111 01I 000001011 SSSSSSSSS    **  JXFI    S/#
CCCC 1011111 01I 000001100 SSSSSSSSS    **  JXRO    S/#
CCCC 1011111 01I 000001101 SSSSSSSSS    **  JXRL    S/#
CCCC 1011111 01I 000001110 SSSSSSSSS    **  JATN    S/#
CCCC 1011111 01I 000001111 SSSSSSSSS    **  JQMT    S/#

CCCC 1011111 01I 000010000 SSSSSSSSS    **  JNINT   S/#
CCCC 1011111 01I 000010001 SSSSSSSSS    **  JNCT1   S/#
CCCC 1011111 01I 000010010 SSSSSSSSS    **  JNCT2   S/#
CCCC 1011111 01I 000010011 SSSSSSSSS    **  JNCT3   S/#
CCCC 1011111 01I 000010100 SSSSSSSSS    **  JNSE1   S/#
CCCC 1011111 01I 000010101 SSSSSSSSS    **  JNSE2   S/#
CCCC 1011111 01I 000010110 SSSSSSSSS    **  JNSE3   S/#
CCCC 1011111 01I 000010111 SSSSSSSSS    **  JNSE4   S/#
CCCC 1011111 01I 000011000 SSSSSSSSS    **  JNPAT   S/#
CCCC 1011111 01I 000011001 SSSSSSSSS    **  JNFBW   S/#
CCCC 1011111 01I 000011010 SSSSSSSSS    **  JNXMT   S/#
CCCC 1011111 01I 000011011 SSSSSSSSS    **  JNXFI   S/#
CCCC 1011111 01I 000011100 SSSSSSSSS    **  JNXRO   S/#
CCCC 1011111 01I 000011101 SSSSSSSSS    **  JNXRL   S/#
CCCC 1011111 01I 000011110 SSSSSSSSS    **  JNATN   S/#
CCCC 1011111 01I 000011111 SSSSSSSSS    **  JNQMT   S/#

CCCC 1011111 1LI DDDDDDDDD SSSSSSSSS    **  SETPAT  D/#,S/#

SETPEQ/SETPNE were replaced by SETPAT which uses the Z flag to select equal/not-equal and C to select INA/INB.

cgracey · 2016-10-26 12:51

Jmg, I'll need to get some sleep before I'm going to understand fully what you posted. Thanks a lot for thinking about this.

jmg · 2016-10-26 18:54

cgracey wrote: »

Jmg, I'll need to get some sleep before I'm going to understand fully what you posted. Thanks a lot for thinking about this.

Checking the ranges, I get this for 0x55 "Baud maintenance", two interrupts, 't5 Monostable' design.

* a -15.79% catch range in apparent Baud slow down (User lowers Baud, or SysCLK drifts up)
* an almost unlimited +% catch in apparent Baud increase. (you would likely have some upper sanity check ceiling ~ 2MBd? )

Inside the -15.79% drift, the earlier 0x55 INT resets the UART, so strips any pending RI (double INTs are avoided.)
Outside the -15.79% drift, in this extreme unworkable case, RI hits before 0x55, (& is certainly corrupt data), and it will reset the 0x55 path before it is read. ( I think this also avoids double INTs ) This is an outside-spec zone, but should avoid lockouts.

Note this is much wider catch range than your original Check Rx for ">", and it removes the check code from default Rx INT path.
I doubt the Osc drift, from a recent Reset/RawAutoBaud to next BaudTrim maintenance, will be anywhere near +/-15.79%, but it does mean longer pauses, or higher slews, are tolerated.
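
One plausible reading of the 15.79% figure, reconstructed here rather than taken from the post: in a 0x55 frame the fifth falling edge lands 8 bit times after the start edge, while the receive interrupt fires mid-stop-bit at 9.5 bit times. The fifth fall still arrives first as long as the sender has slowed by less than 1 - 8/9.5. The Python sketch below checks the edge positions, confirms that 0x55 is the only byte with five falls in a frame, and computes that ratio. It assumes 8-N-1 framing, LSB first.

```python
def falling_edges(byte):
    """Bit-time positions of falling edges in one 8-N-1 frame, LSB first,
    with the line idle high before the start bit at time 0."""
    levels = [0] + [(byte >> i) & 1 for i in range(8)] + [1]
    edges, previous = [], 1
    for t, level in enumerate(levels):
        if previous == 1 and level == 0:
            edges.append(t)
        previous = level
    return edges

falls = falling_edges(0x55)
five_fall_chars = [c for c in range(256) if len(falling_edges(c)) >= 5]

MID_STOP = 9.5                       # receive interrupt, in receiver bit times
catch_range = 1 - falls[-1] / MID_STOP

print(falls, [hex(c) for c in five_fall_chars])
print(f"{catch_range:.2%}")
```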

( eg I can imagine one use of many P2's in a T&M cycling chamber, capturing things like oscillator module frequencies. Ideally, tray measure does not include a re-boot, but it is possible a new-tray load, could need a new-code load, and that would be done on a high temp slew )

One practical use of a nice wide Up-Speed ability, is you can work to the PVT 20MHz corner, sync the P2, then read the actual RCOsc
(via a command to read your t0 above) .

Anyone on a system with good baud granularity, can then adjust Baud to remove the fat safety margin, and send a 0x55 then data. P2 will change gears on the fly.

It also allows a system wide bootup, and 'who is connected' check, at some lower clock speed on all parts, then when the 'board test' part is done, you change UART in the Download section, 10x+ (eg 115200 -> 1.5MBd) without needing a P2 reset.

Have OnSemi given you the spec margins for an uncalibrated Oscillator ? Was that 30% meaning +/- 30% ?
(I guess some of this will depend on how tightly you spec Vcc +/- 10% is common, but would give worse spans than +/- 5% )
You may need to spec two spreads, if you want to offer a similar T,V range to P1 ?

Addit: the 0x55 capture, is based on this Smart Pin mode
%10011 = for X periods, count time
which I have taken as capable of capturing time, over X-Whole-Periods. (here X=5)

As a period measure, it needs to Arm, wait for the next (=\_) edge then start timing, and counting periods (+1 on each =\_).
After 5 whole periods, it stores the 5P time, Clears time, and signals Done, & interrupts if enabled.
It does not re-arm, at least until read.
Assumes some means exists to re-set the monostable nature of this, eg Reset of Mode, clears all totals, and rearms.

Comment: It is unclear if RDPIN alone is enough to re-arm a Smart-Pin capture, but there could be use for an option to read with/without re-arm ?
It is certainly useful to configure many pin-cells, then arm all of them in the same SysCLK.

Slight diversion :
FWIR, the related mode, %10100 = For X periods, count states actually means collect tHH over X whole periods.
A precision duty cycle would then use two pin cells, with the common Arm mentioned above.
eg
Enable PinCell to X-tFF mode (%10011)
Enable PinCell to X-tHH mode (%10100)
Issue atomic Arm command to both, they then both wait for the same-next-edge to start*
Wait for interrupt.
Precision Duty = tHH/tFF

* to avoid partial-gate & phase errors, both tHH & tFF need to wait for the same specified (next) arming-edge

jmg · 2016-10-26 19:04

TonyWaite wrote: »

Re the Raspberry Pi, bootup time using the 'Ultibo' baremetal kernel is less than one second.
There's also a healthy overlap between the Ultibo and P2 communities.

That's another good spec-point.

I was mostly thinking about the 'any old/slow/non optimised Pi-like systems', such as may be lying about in classrooms & Labs.
There, a 60s timeout inside P2 could be a little short, based on the forum discussions.

During this time, P2 is running no code, just waiting on interrupts - the Raw AutoBaud one, or the Timeout.
Icc should be modest, at the RC osc into one COG ?
I see 32b at 20MHz is just over 3 1/2 minutes - seems a simple solution ?
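
The 32-bit figure is easy to verify. A minimal Python check, assuming a free-running counter clocked at exactly 20 MHz (the real RC oscillator will vary):

```python
SYSCLK_HZ = 20_000_000

def rollover_seconds(bits, clock_hz=SYSCLK_HZ):
    """Time for a free-running counter of `bits` width to wrap."""
    return (1 << bits) / clock_hz

seconds = rollover_seconds(32)
print(f"{seconds:.1f} s = {seconds / 60:.2f} min")
```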

Electrodude · 2016-10-26 19:34

Why does the timeout have to be more than a few seconds at the most? Shouldn't whatever's programming it reset it first anyway, in case it's already running something and not waiting for a program?

jmg · 2016-10-26 20:08

Electrodude wrote: »

Why does the timeout have to be more than a few seconds at the most? Shouldn't whatever's programming it reset it first anyway, in case it's already running something and not waiting for a program?

Sure, but it is also nice to have a System-Power outage and recovery, be able to boot normally, and that can occur with no additional reset lines, if you take simple care on the timing.
Given nothing is happening anyway, there is little down side to a longer timeout. The power impact looks modest, and only for a short time after a power cycle (which should be rare).
If someone does manage the reset, they are not affected.

jmg · 2016-10-26 23:05

cgracey wrote: »

I'm using ">" as the autobaud character.

Minor detail I notice, but a possible failure window, is that ">" has an alias case with stop-bits = 5.
That is rare, but not impossible, especially at higher baud rates.

There is a 'mirror' version of ">" in "0", which has same tFF=7, and uses tLL=5 instead of the tHH=5
By measuring tLL, you avoid the stop-bit alias case.
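
The edge timings quoted for ">", "0" and "@" can be checked with a short Python sketch. It assumes 8-N-1 framing, LSB first, and only looks inside a single frame; the helper names are made up for illustration.

```python
def frame_levels(byte):
    """Line levels per bit time: start bit, 8 data bits LSB first, stop bit."""
    return [0] + [(byte >> i) & 1 for i in range(8)] + [1]

def fall_to_fall(byte):
    """Bit times between the start-bit fall and the next fall in the frame."""
    levels = frame_levels(byte)
    for t in range(1, len(levels)):
        if levels[t - 1] == 1 and levels[t] == 0:
            return t
    return None

def longest_run(byte, level):
    """Longest run of `level` inside the frame (start bit through stop bit)."""
    best = current = 0
    for bit in frame_levels(byte):
        current = current + 1 if bit == level else 0
        best = max(best, current)
    return best

for ch in (">", "0", "@"):
    code = ord(ch)
    print(ch, fall_to_fall(code), longest_run(code, 1), longest_run(code, 0))
```

Since the line idles high, a low run is always bounded inside one frame, whereas a high run can be stretched by extra stop bits or idle time. That is consistent with the point that measuring tLL sidesteps the stop-bit alias.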

Seairth · 2016-10-26 23:30

cgracey wrote: »
Look at how it got rid of a string of seven 'IF_NZ CMP X,#whitechr WZ' instructions that was slowing things down:
'
' Get chr after any whitespace
'
get_chr		call	#get_rx			'get byte into x

		altb	x,#whitespace		'whitespace?
		testb	0,x		wz

	if_nz	jmp	#get_chr		'if whitespace, get another byte

		ret


whitespace	long	%00000000_00000000_00100110_00000000		'cr, lf, tab
		long	%00100000_00000000_01000000_00000011		'"=", ".", "!", space
		long	%00000000_00000000_00000000_00000000
		long	%00000000_00000000_00000000_00000000
Now, all seven whitespace characters are detected in just two instructions. The code is even one long smaller.

Clever!

Does this mean you settled on 7-bit serial? I lost track of the conversation...

cgracey · 2016-10-26 23:32

Seairth wrote: »
cgracey wrote: »
Look at how it got rid of a string of seven 'IF_NZ CMP X,#whitechr WZ' instructions that was slowing things down:
'
' Get chr after any whitespace
'
get_chr		call	#get_rx			'get byte into x

		altb	x,#whitespace		'whitespace?
		testb	0,x		wz

	if_nz	jmp	#get_chr		'if whitespace, get another byte

		ret


whitespace	long	%00000000_00000000_00100110_00000000		'cr, lf, tab
		long	%00100000_00000000_01000000_00000011		'"=", ".", "!", space
		long	%00000000_00000000_00000000_00000000
		long	%00000000_00000000_00000000_00000000
Now, all seven whitespace characters are detected in just two instructions. The code is even one long smaller.
Clever!

Does this mean you settled on 7-bit serial? I lost track of the conversation...

It's 8-bit serial, but any characters with bit 7 set are ignored at the ISR level, so they don't get through.

Seairth · 2016-10-26 23:34

cgracey wrote: »

I also added jump instructions for all the pollable/waitable events:

Does this mean that the JUMP variants can wait too? Or would you still use a WAITxxx, followed by a JMP?

cgracey · 2016-10-26 23:36

jmg wrote: »

cgracey wrote: »

I'm using ">" as the autobaud character.

Minor detail I notice, but a possible failure window, is that ">" has an alias case with stop-bits = 5.
That is rare, but not impossible, especially at higher baud rates.

There is a 'mirror' version of ">" in "0", which has same tFF=7, and uses tLL=5 instead of the tHH=5
By measuring tLL, you avoid the stop-bit alias case.

Good eye. This could only blow up during initial autobaud, right? Once we sync, there's no possibility of alignment problems, assuming the baud rate doesn't change too fast.

jmg · 2016-10-26 23:51

cgracey wrote: »

Good eye. This could only blow up during initial autobaud, right?

Correct, it is a rare, but possible combination, only on the Raw Baud step.
One of my pencil tests is to consider the P2 coming out of reset in any bit-slot, and with any number of stop bits.

See my post above about repeating autobaud loops, where you would send a pair of chars.
The AutoBaud filter rejects the second char, and waits until the correct phase Raw Baud char.
Then, the second char passes into RX, and sets one/two pin modes, and echoes, allowing the host to sense exactly when
the P2 has come out of reset.

cgracey wrote: »

Once we sync, there's no possibility of alignment problems, assuming the baud rate doesn't change too fast.

Agreed. With 0x55 the baud rate can do a quite large step-up (>10x), but step downs need to be < -15.79% decrements to track.
