Prop2 FPGA files!!! - Updated 2 June 2018 - Final Version 32i

cgracey · 2017-10-24 11:04

I discovered something crazy. The 2.5V FPGA pins drop 1/3 when you float them. This has been our problem, not some race condition in the Prop2.

Here are the 3.3V pins being driven low, high, low, high, then floated:

[Image: 2017-10-24 03.57.10.jpg]

And here are the 2.5V pins being driven low, high, low, high, then floated:

[Image: 2017-10-24 03.50.39.jpg]

So, this was causing us some trouble. Nothing I see we can do about this.

I did change the TX smart pin mode from STOP+START+DATA+STOP-transition to START+DATA+STOP, so we won't be turning off the TX drive prematurely, anymore. It was a good thing you guys noticed all these details!!!

ozpropdev · 2017-10-24 11:25

Ok. Good to get some explanation of the "weirdness".

jmg · 2017-10-24 18:58

cgracey wrote: »

I discovered something crazy. The 2.5V FPGA pins drop 1/3 when you float them. This has been our problem, not some race condition in the Prop2.

The images posted by ozpropdev do not show that step effect, so they must have used 3v3 pins ?
With no RC-effect showing on the 2.5V, that looks like some deliberate termination is being included ? - is that DDR memory default ?

It seems the default pullup on USB-UARTs is ~ 74k on FTDI and > 100k on SiLabs parts.
Enough to keep a pin HI, but not low enough to give a fast rise time at high baud rates.

cgracey wrote: »

I did change the TX smart pin mode from STOP+START+DATA+STOP-transition to START+DATA+STOP, so we won't be turning off the TX drive prematurely, anymore. It was a good thing you guys noticed all these details!!!

That sounds good, and is more conventional/expected for all those designers coming from 'other UARTs'

It will make RS485 designs easier across wide baud rates.

Q:
What is currently in the ROM, and what size is it ?
Is One-wire serial boot still there ?
Did you add a variant load to skip the ID and MASKs ?
Can a loader MCU change the PLL settings, and so check the Xtal as part of Boot/POST ? ( IEC 60730 standards, etc...)
(plus load larger images faster, - the choice of more speed is always good)

See https://www.digikey.com/en/articles/techzone/2011/sep/microcontrollers-for-safety-critical-applications

cgracey · 2017-10-30 18:34

I just added a new command to the loader and it's working:

Prop_Clk 0 0 0 0 xx

...where xx is the new clock setting.

It's necessary, afterwards, for the sender to wait 10ms before giving a new "> Prop_???" command, so that the Prop2 has time to adapt to its new setting. Crystals take a few ms to rev up and a potential PLL setting needs some time, too, to settle.
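A minimal host-side sketch of the sequence described above: send the Prop_Clk line, then hold off for 10ms before the next "> Prop_???" command. The function names, the `write` callable, the lowercase hex formatting of the clock value, and the carriage-return terminator are assumptions for illustration, not the actual loader protocol definition.

```python
import time


def build_prop_clk(clock_setting: int) -> bytes:
    """Format a Prop_Clk command line (hex formatting and CR are assumed)."""
    return f"> Prop_Clk 0 0 0 0 {clock_setting:x}\r".encode("ascii")


def send_prop_clk(write, clock_setting: int, settle_s: float = 0.010,
                  sleep=time.sleep) -> bytes:
    """Send Prop_Clk through `write`, then wait before any further command.

    `write` is a hypothetical callable that pushes bytes to the serial port.
    """
    cmd = build_prop_clk(clock_setting)
    write(cmd)
    sleep(settle_s)  # give the crystal and PLL time to settle
    return cmd
```

With a real serial port object, `write` would be its write method; the sketch keeps it abstract so it runs without hardware.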

Next, I need to add a feature where a non-0 mask in the mask+match data (those four 0's) causes each Prop2 to only drive its TX line during a message transmission, so that for multiple Prop2's, the TX line can be common to all Props (as is the RX line). In such an arrangement, a pull-up resistor on TX would be required.

Here is the Prop_Clk command code:

'
'
' Command - clock setup
'
command_clk	call	#match_device		'receive and check INA/INB filter values

		call	#get_hex		'get clock setting
	if_nc	jmp	#get_command		'if not hex, error, wait for another command

		zerox	x,#25			'clear non-clock bits

		mov	y,x			'switch to partial setting, but in RC fast mode
		andn	y,#%11
		clkset	y

		waitx	##rc_max/200		'wait 5ms

		clkset	x			'switch to full setting

		jmp	#reset_serial		'restart serial at new setting, get next command
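The two CLKSET values in the listing above can be modelled on the host as a sketch. It assumes only what the listing shows: `zerox x,#25` keeps bits 25..0, and `andn y,#%11` clears the two LSBs for the interim RC fast stage. The function and constant names are illustrative.

```python
CLOCK_BITS_MASK = (1 << 26) - 1  # zerox x,#25 zero-extends above bit 25


def clk_stages(x: int) -> tuple[int, int]:
    """Return (partial, full) clock settings as applied by command_clk."""
    full = x & CLOCK_BITS_MASK   # clear non-clock bits
    partial = full & ~0b11       # same setting with the two LSBs cleared
    return partial, full
```

The partial value is applied first, then the full value after the 5ms wait.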

jmg · 2017-10-30 18:50

cgracey wrote: »

I just added a new command to the loader and it's working:

Prop_Clk 0 0 0 0 xx

...where xx is the new clock setting.

Sounds good.
Can the P2 give a single char ack echo just before it applies the PLL change ?
That means code can send, wait for echo, then time, then send a new > autobaud

cgracey wrote: »

It's necessary, afterwards, for the sender to wait 10ms before giving a new "> Prop_???" command, so that the Prop2 has time to adapt to its new setting. Crystals take a few ms to rev up and a potential PLL setting needs some time, too, to settle.

What determines the 10ms ? is that RCSLOW paced, RCFAST paced, or some other delay ?
If the host sends a continual "> Prop_???> Prop_???> Prop_???", what happens ? does that reliably echo on the first 'pll delay exit' ?

cgracey wrote: »

Next, I need to add a feature where a non-0 mask in the mask+match data (those four 0's) causes each Prop2 to only drive its TX line during a message transmission, so that for multiple Prop2's, the TX line can also be common to all Props (as is the RX line, already).

Or you could use open drain for multiple P2s ? - 470 ohms is tau of 22ns with 47pF ?
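The time constant quoted above is simply R times C. A quick check of the arithmetic, with the 10%-90% rise time (about 2.2 tau for a single-pole RC) added as a standard rule of thumb rather than anything stated in the thread:

```python
def rc_time_constant(r_ohms: float, c_farads: float) -> float:
    """Time constant of a pull-up resistor charging a line capacitance."""
    return r_ohms * c_farads


tau = rc_time_constant(470.0, 47e-12)  # 470 ohm pull-up, 47pF load
rise_10_90 = 2.2 * tau                 # single-pole RC rule of thumb

print(f"tau = {tau * 1e9:.2f} ns, 10-90% rise = {rise_10_90 * 1e9:.1f} ns")
```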

cgracey · 2017-10-30 20:23

The need for a delay is why I kind of cringe about having any sort of clock-changing command. It interrupts the stasis of things.

Rather than spit out an acknowledgement character, which feels like emitting pollution, how about the sender just sends enough data to cover the 10ms, given he knows his baud rate. That would be simpler, I think.
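As a rough sketch of that idea, the number of filler bytes needed to span the 10ms follows from the baud rate. The 10 bits per byte (start + 8 data + stop) framing and the function name are assumptions.

```python
def filler_bytes(baud: int, cover_ms: int = 10, bits_per_byte: int = 10) -> int:
    """Smallest number of back-to-back bytes that spans cover_ms at baud."""
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-baud * cover_ms // (1000 * bits_per_byte))
```

For example, `filler_bytes(115200)` gives 116 bytes, which is why the count has to be recomputed for every baud rate the host uses.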

The 10ms is safety time. The Prop2 waits 5ms for the new setting to settle in, before switching over to it, so 10ms is more than is needed, but allows for some slack.

We wouldn't need less than 10k of pull-up, because remember that when the serial transmitter turns on, it drives the TX line, giving a whole stop bit at the end of the last byte. Oh, I see what you were getting at - open drain. Yes, that would require some stiff pull-up.

jmg · 2017-10-30 21:00

cgracey wrote: »

The need for a delay is why I kind of cringe about having any sort of clock-changing command. It interrupts the stasis of things.

True, but clock change is not a common command, - very useful on testers and faster MCU loading, but not mandatory to use.

cgracey wrote: »

Rather than spit out an acknowledgement character, which feels like emitting pollution, how about the sender just sends enough data to cover the 10ms, given he knows his baud rate. That would be simpler, I think.

A problem there is you do not want to waste time, and the host does likely need to change Baud rate (a typical reason for clock change), so they need to know when to safely do that.
Many systems have buffers along the chain, of unknown size and latency, and you want to avoid a new-baud flushing buffers.....
So I think an echo confirms the P2 has a full command set, and is changing speed...

cgracey wrote: »

The 10ms is safety time. The Prop2 waits 5ms for the new setting to settle in, before switching over to it, so 10ms is more than is needed, but allows for some slack.

So that 5ms is internally timed ? - does the P2 clock pause during that time, freezing the code, or does it run at previous clock setting ?

If you send a continual "> Prop_???> Prop_???> Prop_???", from the time you get the ACK, at the new baud speed, what happens ?

cgracey wrote: »

We wouldn't need less than 10k of pull-up, because remember that when the serial transmitter turns on, it drives the TX line, giving a whole stop bit at the end of the last byte.
Oh, I see what you were getting at - open drain. Yes, that would require some stiff pull-up.

Open Drain avoids any contention possibility during run, and allows users to do CAN-like arbitration over multiple TX units.
ie after boot, users will likely want to also talk to their 'many P2s', and there may be some debug of the 'over-hang' times needed.

cgracey · 2017-10-31 03:42

Jmg,

After thinking about it, I really prefer the open drain method for TX. It makes other things simpler, too.

And, I reluctantly agree about the acknowledgement character after Prop_Clk is received.

By the way, during the 5ms clock change-over, the RC fast oscillator is used. So, we can count on about 20MHz.

jmg · 2017-10-31 05:05

cgracey wrote: »

...
By the way, during the 5ms clock change-over, the RC fast oscillator is used. So, we can count on about 20MHz.

So that is some binary counter from the RCFAST ? (2^16, 2^17?)

Is that RCFAST+counter used in all change cases, so eg PLL.100MHz to PLL.120MHz, will actually run for ~5ms at 20MHz

Seems like it could be a challenge to know/prove via the link, when you have a faster clock speed ?

I have wondered about a variant of AutoBAUD that echoes the Autobaud capture value ?
eg maybe ">?" recals & echos, and ">" recals, no echo ?

That lets you easily read SysCLK to baud-osc precisions.
On Locked USB designs that's usually ~ 0.25%, and on Crystal USB designs, that can be better than 50ppm.

cgracey · 2017-10-31 17:04

jmg wrote: »

cgracey wrote: »

...
By the way, during the 5ms clock change-over, the RC fast oscillator is used. So, we can count on about 20MHz.

So that is some binary counter from the RCFAST ? (2^16, 2^17?)

Is that RCFAST+counter used in all change cases, so eg PLL.100MHz to PLL.120MHz, will actually run for ~5ms at 20MHz

Seems like it could be a challenge to know/prove via the link, when you have a faster clock speed ?

I have wondered about a variant of AutoBAUD that echoes the Autobaud capture value ?
eg maybe ">?" recals & echos, and ">" recals, no echo ?

That lets you easily read SysCLK to baud-osc precisions.
On Locked USB designs that's usually ~ 0.25%, and on Crystal USB designs, that can be better than 50ppm.

On any Prop_Clk command, the oscillator reverts to RC fast and a 'WAITX ##rc_max/200' executes to get the 5ms delay. The other oscillator bits, though, have enabled the crystal oscillator and PLL, possibly. While those settle in, the 5ms delay happens. All serial activity has been shut down during this time, and the receive buffer is reset. After the 5ms, the final two LSBs of the clock config are made according to the setting, causing the clock switch-over to occur. Then, serial is restarted from scratch and a new "> " is needed for recalibration.
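Since the delay is a fixed cycle count, its real duration depends on how fast the RC oscillator actually runs. A small sketch of that relationship; the value used for `rc_max` here is hypothetical (the thread does not give it), and the few extra cycles of instruction overhead are ignored.

```python
def waitx_delay_s(rc_max_hz: int, actual_hz: float) -> float:
    """Duration of 'WAITX ##rc_max/200' at a given actual RC fast frequency."""
    cycles = rc_max_hz // 200   # cycle count baked into the instruction
    return cycles / actual_hz   # a slower oscillator stretches the delay


# Hypothetical rc_max of 24MHz: exactly 5ms at rc_max, longer at ~20MHz.
print(waitx_delay_s(24_000_000, 24_000_000))
print(waitx_delay_s(24_000_000, 20_000_000))
```

Under that assumption the delay is never shorter than 5ms, which fits the 10ms host-side wait leaving some slack.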

ozpropdev · 2017-11-02 05:52

Chip
Out of curiosity I hooked up a Nano,Be-A2 and a Be-A9 to a prop plug (without a pullup on tx) with P30/P31 as ID pins and got the following responses.

> Prop_Chk c0000000 c0000000 0 0
Prop_Ver Du
> Prop_Chk c0000000 0 0 0
Prop_Ver Fu
> Prop_Chk c0000000 80000000 0 0
Prop_Ver Cu

Works nicely.
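A sketch of how the mask+match filter in those commands could select one board, assuming the four values are INA mask, INA match, INB mask, INB match in that order and that a device answers only when its masked pin states equal the match values. The function name and the board pin states below are illustrative.

```python
def device_matches(ina: int, inb: int,
                   ina_mask: int, ina_match: int,
                   inb_mask: int, inb_match: int) -> bool:
    """True if the masked pin states equal the requested match values."""
    return (ina & ina_mask) == ina_match and (inb & inb_mask) == inb_match


# Hypothetical boards distinguished by P31:P30 used as ID pins.
boards = {"both high": 0xC0000000, "both low": 0x00000000, "P31 only": 0x80000000}

# Equivalent of "> Prop_Chk c0000000 80000000 0 0"
responders = [name for name, ina in boards.items()
              if device_matches(ina, 0, 0xC0000000, 0x80000000, 0, 0)]
print(responders)
```

With all four values at 0, every device matches, which is the single-board case.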

cgracey · 2017-11-02 06:05

ozpropdev wrote: »
Chip
Out of curiosity I hooked up a Nano,Be-A2 and a Be-A9 to a prop plug (without a pullup on tx) with P30/P31 as ID pins and got the following responses.
> Prop_Chk c0000000 c0000000 0 0
Prop_Ver Du
> Prop_Chk c0000000 0 0 0
Prop_Ver Fu
> Prop_Chk c0000000 80000000 0 0
Prop_Ver Cu
Works nicely.

Those TX signals are all driven. On the coming v25 release, TX will become open-drain when either of the mask values is non-0.

ozpropdev · 2017-11-02 06:58

cgracey wrote: »

Those TX signals are all driven. On the coming v25 release, TX will become open-drain when either of the mask values is non-0.

It was my understanding that V24 went open drain at the end of all messages.
See scope pictures here

cgracey · 2017-11-02 07:51

ozpropdev wrote: »

cgracey wrote: »

Those TX signals are all driven. On the coming v25 release, TX will become open-drain when either of the mask values is non-0.

It was my understanding that V24 went open drain at the end of all messages.
See scope pictures here

Ah, sorry. Yes, it did. It only drove TX for the duration of the message, up to the start of the last stop bit.

cgracey · 2017-11-02 08:39

A new v25 is posted at the top of this thread. Please try it out.

ozpropdev · 2017-11-02 10:34

Chip
Both Nano variants respond to Prop_Chk with "Prop_Ver C" but Pnut doesn't recognize them.

Edit: DE2-115 Prop_Chk Ok, Pnut no go.
Edit2: BeMicro-A2 ditto.
Edit3: BeMicro-A9 and P123-A7 ditto.

....Pnut V25 only recognizes P123-A9 board.

Peter Jakacki · 2017-11-02 14:49

I seem to be having some trouble detecting the Prop on a CVA9 with V25 (FPGA and PNut). I can > Prop_Chk 0 0 0 0<cr> and get the Prop_Ver F response for a quick sanity check.

Checked a bit more and it all seems to point toward PNut V25 not detecting the Prop on a CVA9 (BeMicro).

cgracey · 2017-11-02 16:43

Sorry, Guys. I think I might have made the timing a little too aggressive on the PC side. I'll get this fixed this morning, hopefully.

potatohead · 2017-11-02 16:57

I am concerned the pressure for fastest possible boot will make P2 glitchy.

cgracey · 2017-11-02 17:59

Here is a new PNut_v25a.exe in a .zip file.

I just saw problems, too, on the DE0-Nano, so I added a 20ms delay back into the download. It seems to work fine now:

https://drive.google.com/file/d/0B9NbgkdrupkHbGtoOXFLRFRpVms/view?usp=sharing

jmg · 2017-11-02 18:39

cgracey wrote: »

Here is a new PNut_v25a.exe in a .zip file.

I just saw problems, too, on the DE0-Nano, so I added a 20ms delay back into the download. It seems to work fine now:

Where exactly is that delay inserted, and should it be a user-option ? ie set to some default, but command line variable ?

cgracey · 2017-11-02 18:55

jmg wrote: »

cgracey wrote: »

Here is a new PNut_v25a.exe in a .zip file.

I just saw problems, too, on the DE0-Nano, so I added a 20ms delay back into the download. It seems to work fine now:

Where exactly is that delay inserted, and should it be a user-option ? ie set to some default, but command line variable ?

It's right after the DTR raise.

I've found that these API calls are a flakey nightmare. Sometimes, it's necessary to pause before executing them, as if they're internally out of order and they need time to get their own status straightened out. One thing I found that is kooky is that the RX_PURGE doesn't work, so you have to keep requesting received data (that might have arrived between serial port initialization and the DTR-induced reset), until there's no more. But there's more... You need to wait about 20ms and ask again, as some will be late getting through the data pipe. It's like data diarrhea that you must check on a few times. If you don't do this, you'll get garbage data from before the Prop2's actual response occurred. It's foundations like this upon which the modern pile of techno-Smile has been built. It helps me to understand why it's sometimes possible to watch a video online, yet be unable to navigate the same web page.
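The drain-and-recheck routine described above might look roughly like this on the host. `read_available` is a hypothetical callable returning whatever bytes are pending (possibly none); it is not a real serial API call, and the two-quiet-polls rule is an assumption about how many rechecks are enough.

```python
import time


def drain_rx(read_available, settle_s: float = 0.020, quiet_polls: int = 2,
             sleep=time.sleep) -> bytes:
    """Discard stale received data until the port stays empty across rechecks."""
    discarded = bytearray()
    quiet = 0
    while quiet < quiet_polls:
        chunk = read_available()
        if chunk:
            discarded += chunk   # stale data from before the reset
            quiet = 0            # start counting quiet polls again
        else:
            quiet += 1
            if quiet < quiet_polls:
                sleep(settle_s)  # late bytes may still be in the pipe
    return bytes(discarded)
```

Returning the discarded bytes makes it easy to log what was thrown away when chasing intermittent failures.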

jmg · 2017-11-02 19:06

cgracey wrote: »

It's right after the DTR raise.

Yes, reset-exit will widely vary, depending on the CAP+transistor design.
(one reason I suggest a default, but with a user choice too)

cgracey wrote: »

I've found that these API calls are a flakey nightmare. Sometimes, it's necessary to pause before executing them, as if they're internally out of order and they need time to get their own status straightened out. One thing I found that is kooky is that the RX_PURGE doesn't work, so you have to keep requesting received data (that might have arrived between serial port initialization and the DTR-induced reset), until there's no more. But there's more... You need to wait about 20ms and ask again, as some will be late getting through the data pipe. It's like data diarrhea that you must check on a few times. If you don't do this, you'll get garbage data from before the Prop2's actual response occurred. It's foundations like this upon which the modern pile of techno-Smile has been built. It helps me to understand why it's sometimes possible to watch a video online, yet be unable to navigate the same web page.

Yup. Shiploads of elasticity and jitter....
We also found that handshake signals are not guaranteed to stay in phase with data, and that doing a dummy read handshake can ensure the outgoing is applied...

With care, you can stream data without breaks, as the buffers do seem to be designed to allow that.
Hardware buffers that are larger, tolerate more OS jitter...

cgracey · 2017-11-02 19:09

I don't like my fix for the DE0-Nano. For it to work with a 20ms delay that the Prop123-A9 does not require indicates something else is wrong. It could have to do with how I connected the pin in the top-level AHDL file or maybe that the Prop123 has weird 2.5V pins that go almost mid-level when floated.

I was seeing some really flakey, intermittent serial errors on the loader, within the autobaud code. They made no sense, and they are hard to reproduce. I think they might be related to the fact that the pin data comes in asynchronously and gets routed through a lot of logic, and then goes to a unique set of 64 flops per each set of 4 cogs. I think there was some problem in all that, as it was possible for unsynchronized data to propagate with possibly different states after registering. So, I just added 64 flops that immediately register all incoming pin data from the FPGA (or what will be the pad frame). This should unify all behavior and I'm anxious to see if the problem will go away (if I can even reproduce it). This has the effect of adding one clock to each incoming pin state, though. I need to consider if this will have any ramifications to the smart pins.

jmg · 2017-11-02 19:11

cgracey wrote: »

A new v25 is posted at the top of this thread. Please try it out.

This seems to have lost the Single-pin-UART connection boot choice ? - that was the lowest possible pin usage boot.
Since that was first included, the choices of very low cost MCUs, suitable for Boot + WDOG, has expanded, not declined.

It also looks to now pulse P61 (SPI_CS) as part of the sense process ?
If an external pull-up resistor is sensed on P61 (SPI_CS)
- which could easily be a user-pin ?

cgracey · 2017-11-02 19:36

jmg wrote: »

cgracey wrote: »

A new v25 is posted at the top of this thread. Please try it out.

This seems to have lost the Single-pin-UART connection boot choice ? - that was the lowest possible pin usage boot.
Since that was first included, the choices of very low cost MCUs, suitable for Boot + WDOG, has expanded, not declined.

It also looks to now pulse P61 (SPI_CS) as part of the sense process ?
If an external pull-up resistor is sensed on P61 (SPI_CS)
- which could easily be a user-pin ?

With everything else going on, half-duplex was just frying my brain, so I took it out. Maybe I'll put it back in when I feel more confident about it.

Yes, P61 (SPI_CS) is checked for a pull-up to indicate that a SPI flash is present. This has been the case for a while. What would work better?

jmg · 2017-11-02 20:11

cgracey wrote: »

Yes, P61 (SPI_CS) is checked for a pull-up to indicate that a SPI flash is present. This has been the case for a while. What would work better?

So that means you drive P61 low, then release, & read, in order to check for pull-up ?
It's mainly that pin disturbance that bothers me - a system with no SPI, could be using that pin ?

See also the other thread, the lowest - pin impact sense I can come up with, (assuming there is always a UART connected), is a resistor from RXD to SPI_CS. Another series R handles the very brief drive contention to possible UART/MCU
SPI_CS is now read-only during early boot-tests, and is checked for 'follows RXD' - ie every edge out on RXD, should appear on SPI_CS if the resistor is fitted.
RXD Serial-only loaders have no visible impact on any other pins, totally freeing them up. One pin loader, impacts just ONE pin.

It's 2 components, but does free up a P2 pin.
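To make the 'follows RXD' test concrete, here is a sketch over sampled pin states. It is only a model of the proposal: the sample lists, the one-sample lag allowance, and the function name are assumptions, and the real check would live in the boot ROM rather than on a host.

```python
def follows_rxd(rxd: list[int], cs: list[int], max_lag: int = 1) -> bool:
    """True if every RXD edge appears on SPI_CS within max_lag samples."""
    if len(rxd) != len(cs):
        raise ValueError("sample lists must be the same length")
    edges = 0
    for i in range(1, len(rxd)):
        if rxd[i] != rxd[i - 1]:             # an edge on RXD
            edges += 1
            window = cs[i:i + max_lag + 1]   # SPI_CS samples right after it
            if rxd[i] not in window:
                return False                 # SPI_CS did not follow
    return edges > 0                         # no edges means no evidence
```

A pin held high by a plain pull-up never follows a falling RXD edge, so it fails the check, which is what separates the two cases.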

Q: Where you say "If a program successfully loads serially within 60 seconds:"

I presume that really means 'there is a timeout of 60s on any serial gaps', not that a load that takes 61 seconds overall, fails ?

Q: There seems no host control of SPI via UART, with just a booter & cable - so perhaps a couple of commands that
a) Scans SPI and confirms sum is "Prop" ($706F7250) with std ack/nack reply
or
b) Reads SPI and boots

maybe those are part of the proposed but tbf Monitor?
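For reference, $706F7250 is the ASCII text "Prop" read as a little-endian long. A sketch of a sum check along those lines, assuming the image is summed as 32-bit little-endian longs modulo 2^32; which bytes the ROM actually covers is not stated in the thread, so treat the scope as hypothetical.

```python
import struct

PROP_SUM = 0x706F7250  # b"Prop" as a little-endian 32-bit value


def sum_longs(data: bytes) -> int:
    """Sum the data as little-endian 32-bit longs, modulo 2**32."""
    if len(data) % 4:
        raise ValueError("length must be a multiple of 4 bytes")
    return sum(struct.unpack(f"<{len(data) // 4}I", data)) & 0xFFFFFFFF


def checksum_long(data: bytes) -> int:
    """Long to append so that the whole image sums to PROP_SUM."""
    return (PROP_SUM - sum_longs(data)) & 0xFFFFFFFF
```

Appending `struct.pack("<I", checksum_long(image))` to an image makes `sum_longs` of the result equal `PROP_SUM`.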

Cluso99 · 2017-11-02 21:03

With the current 64 I/O pins, we are not short on pins!

Please don't mess with nasties to save pins like we had to do with P1.

If and when a smaller P2 comes out, we will have had the time to try out some methods to save pins if necessary.

And if saving pins is really necessary, then I2C EEPROM/FLASH could be added as a boot option.

I have posted a pic of what pullups are required, and can be tested by the boot ROM code here
forums.parallax.com/discussion/flag/comment/1424608/53023/cluso99/L2Rpc2N1c3Npb24vY29tbWVudC8xNDI0NjA4LyNDb21tZW50XzE0MjQ2MDg-

jmg · 2017-11-02 21:25

Cluso99 wrote: »

And if saving pins is really necessary, then I2C EEPROM/FLASH could be added as a boot option.

Using single Pin UART saves the most pins..., but yes, some may like a SOT23-5 memory & shared pins that i2c allows.

Cluso99 wrote: »

Please don't mess with nasties to save pins like we had to do with P1.

?? What nasties ? Unwanted/unexpected pulses on user-pins, is nasty...

- 'resistors to pins' for config is quite universal, and not to be feared.
It is not locked in stone that one end must be Vdd or Gnd, & many i2c devices extract additional addresses via pin-to-pin straps.

cgracey · 2017-11-02 21:37

I registered all the incoming pin signals and I can't get a download failure, anymore. This should have been done from day one, but I was worried about adding clock delays. One of the first rules in synchronous design is to register all inputs. I wonder what intermittent flakiness we've all experienced that may now be gone. Not registering those incoming pins was really reckless, in hindsight. Now, everything is registered, coming and going, as it should be.

I'm going to recompile everything and pay attention to how each TX pin is implemented in AHDL, as that seems to be problematic. I need to make them all tristate-able, just like the real I/O's.
