Where is Spud?
potatohead
Posts: 10,261
in Propeller 2
Hey guys! I've been reading along. Itching to jump back into P2 land, but I've been in P1 land for the last couple months.
...at times, wishing it was a P2, but that will come, and I can wait, and P1 is plenty. Easy too. We all know that, and it's awful fun to have those, "but you can't do that on...." moments. Then they see a little block of code doing spiffy stuff and ask for the datasheet. Total win.
Finally, after a long time doing this, I ended up with a work related project I can tackle. All that stuff we've done here together really pays off! Thanks guys. I know stuff. I think, for this project, I know enough stuff too. Should that not be true, you know I'll ask.
But, I lied. These last 4 weeks, I've not read anything. Where are we at? Is the FPGA image sorted? Are we in full test mode?
Tomorrow, I will seek to catch up and update my FPGA. See if my stuff still runs. Any thread pointers to "stuff Spud should know, because Spud was gone for a while...." info will be appreciated!
Hope all of you are happy, healthy, and having a lot of fun!
Comments
The ROM first checks for a serial connection, then it checks for an image in SPI flash. It loads whatever is available from serial or flash, and then authenticates the loaded $1F8-long program with its 8-long HMAC signature using the first 128 fuses as a key. If the loader authenticates, it is executed. If not, the clock is slowed to 20kHz and cog0 stops, in order to go comatose until another reset.
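For anyone who wants to picture the authentication step, here is a rough Python sketch. The sizes ($1F8 longs of program, 8 longs of signature, a 128-bit key) come from the post above. The hash function is NOT stated there: SHA-256 is assumed only because its 32-byte digest matches 8 longs. The image layout (signature appended after the program) and the names `authenticate` and `sign` are likewise made up for illustration.

```python
import hashlib
import hmac

PROGRAM_LONGS = 0x1F8      # loader size in longs (from the post)
SIGNATURE_LONGS = 8        # signature size in longs (from the post)
KEY_BYTES = 128 // 8       # first 128 fuse bits used as the key (from the post)


def sign(program: bytes, fuse_key: bytes) -> bytes:
    """Append an HMAC to the program. SHA-256 is an assumption."""
    return program + hmac.new(fuse_key, program, hashlib.sha256).digest()


def authenticate(image: bytes, fuse_key: bytes) -> bool:
    """Return True if the trailing signature matches HMAC(key, program)."""
    program_len = PROGRAM_LONGS * 4
    sig_len = SIGNATURE_LONGS * 4
    if len(image) != program_len + sig_len or len(fuse_key) != KEY_BYTES:
        return False
    program, signature = image[:program_len], image[program_len:]
    expected = hmac.new(fuse_key, program, hashlib.sha256).digest()
    # Constant-time comparison, so timing does not leak matching bytes.
    return hmac.compare_digest(expected, signature)
```

Under these assumptions the signed image is exactly $200 longs, and flipping any bit of the program or using a different key makes `authenticate` return False.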
I'll make loaders which can load RAM (as we currently have) or SPI flash.
I decided NOT to put a monitor program into ROM. The ROM is now only practically accessible at boot-up and is not mapped into hub memory, so a monitor wouldn't be usable at any time other than start-up. It wouldn't otherwise be available as an instantiable asset. Of course, a monitor could be used to load code from a generic terminal program, but I wonder if there'd be much interest in making tools that don't use the quick and tidy loader protocol. Well, what do you guys think about this? Maybe I'm a little punch drunk and the idea of ONE WAY seems a respite.
That said, I would use the "squirt it in" mode we had in the Hot design. I liked that one could bootstrap onto the chip from anything. But, what is worth what?
If we set that aside, a nice set of tools could be the first thing someone loads.
At some point, I would love to see a loadable assembler, editor, monitor, maybe.... small SPIN compiler. Just the core keywords, no special things, use in-line assembly for those.
That mini-environment can be "finished" and used very long term, just like P1 SPIN + PASM is. Call it the baseline, "works no matter what" target environment.
Everything else is bigger, changes, whatever.
my .02
1) must be >1 second after Prop2 reset
2) send space (Prop2 recalibrates serial timing from each space/$20 received)
3) send non-hex/whitespace chr to reset loader (white space = space/tab/cr/lf)
4) send hex value to set load length (ie "FFFC0")
5) send data bytes (ie "12 34 56 78 9A B C D E F 0 a b cc dd ee ff...")
6) send 16-bit checksum from data bytes (ie "abcd")
7) if checksum okay, Prop2 does "coginit #0,#0", else it waits for more characters
This would allow the Prop2 to be loaded from any machine with a serial port. The only problem with having this is that the chip would never go to low-power mode when it didn't have a program, as it would always be waiting for serial data. With 1mA of quiescent current anyway, that may not matter, as this program might take only 1mA at 20MHz and 1 cog.
It could time out, say after one minute too, if a power compromise is warranted. I'm not sure it is either.
Doing it this way, as an option, allows for unknown integration and development platforms. And it can remain OS / tool / computer independent. Someone may home brew something and want to put a Prop 2 to use. This way, they can with basically zero dependencies. Their tools, their machine, etc...
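To show how little the host would need, here is a rough Python sketch that builds the text stream for the checksum-based steps listed earlier. Several details are assumptions, since the posts don't pin them down: the load length is taken to be a byte count, the checksum is taken to be the sum of the data bytes modulo 65536, and ">" is an arbitrary pick for the non-hex, non-whitespace reset character. `build_text_load` is a made-up name.

```python
RESET_CHAR = ">"  # hypothetical: any non-hex, non-whitespace character


def build_text_load(data: bytes) -> str:
    """Build the text a host would send after waiting >1 second from reset."""
    length = f"{len(data):X}"                 # assumed: length counts bytes
    body = " ".join(f"{b:X}" for b in data)   # hex bytes separated by spaces
    checksum = sum(data) & 0xFFFF             # assumed: 16-bit additive checksum
    # Leading space lets the chip calibrate its serial timing first;
    # trailing space terminates the checksum token.
    return f" {RESET_CHAR}{length} {body} {checksum:04X} "
```

Because the output is plain text, it could be saved to a file and sent with any terminal program.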
a) Prop2 recalibrates serial timing from each space/$20 received
b) white space = space/tab/cr/lf
c) unexpected characters cause the text loader to start over
d) text loader times out after one minute and chip enters coma
From the perspective of the PC/host:
1) must be > 1 second and < 1 minute after Prop2 reset
2) send special string to signify start of data (ie " Propeller 2" - note the leading space)
3) send hex data bytes separated by white spaces to load into hub 0, 1, 2, 3...
4) send "G", then cog does 'COGINIT #0,#0'
That's really simple and does away with the checksum. The checksum could be implemented by the program at $00000. This way, a simple, editable text file just needs to be transmitted to the chip for it to load.
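The rules and host steps above can be sketched from the chip's side as a small parser. This is only a Python model of the behavior described, not ROM code: timing, auto-baud, and the one-minute timeout are left out, "start over" is modeled as looking for the start string again, and treating a hex token longer than two digits as an unexpected character is an assumption. `parse_text_load` is a hypothetical name.

```python
SIGNATURE = " Propeller 2"   # start-of-data string, note the leading space
WHITESPACE = " \t\r\n"
HEX_DIGITS = "0123456789abcdefABCDEF"


def parse_text_load(stream: str):
    """Return the bytes to place at hub 0, 1, 2... or None if no 'G' arrives."""
    pos = 0
    while True:
        start = stream.find(SIGNATURE, pos)
        if start < 0:
            return None                       # never saw (another) start string
        hub = bytearray()
        token = ""
        for i in range(start + len(SIGNATURE), len(stream)):
            ch = stream[i]
            if ch in HEX_DIGITS and len(token) < 2:
                token += ch                   # accumulate one or two hex digits
            elif ch in WHITESPACE or ch == "G":
                if token:
                    hub.append(int(token, 16))
                    token = ""
                if ch == "G":
                    return bytes(hub)         # real chip would COGINIT #0,#0 here
            else:
                pos = i                       # unexpected character: start over
                break
        else:
            return None                       # input ended before 'G'
```

For example, `parse_text_load(" Propeller 2 12 34 a b FF G")` yields the five bytes $12, $34, $0A, $0B, $FF.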
This would be excellent because it does not require any special interface.
Just to be sure we are talking to the correct device, that it is alive, and that communication is OK.
Yes, it would be good if we got some kind of prompt so that we also know when to send, perhaps even just echoing everything back.
I thought about that too, though I think it should be sent between steps 2 and 3. This way, the baud rate has been established, and the P2 doesn't output anything unless it recognizes the signature in step 2.
I'm thinking the loader should be redone to make it always text-based, with simple queries and responses. When the Prop2 resets, it will wait for 100ms for a serial query to come in from a host. If nothing comes, it will try to load and verify from SPI flash. If that fails, it goes back to waiting for a serial query.
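The boot order described here (a 100ms serial window, then SPI flash, then back to waiting on serial) can be written down as a tiny Python model. The two callables are hypothetical stand-ins for hardware: `serial_query(timeout)` returns a verified image or None, with `timeout=None` meaning wait without a deadline, and `load_flash()` returns a verified image or None.

```python
def boot(serial_query, load_flash, window_s=0.1):
    """Model of the proposed boot order; returns the image that gets run."""
    image = serial_query(window_s)      # 1) listen for a host for 100 ms
    if image is None:
        image = load_flash()            # 2) otherwise try SPI flash
    while image is None:
        image = serial_query(None)      # 3) otherwise keep waiting on serial
    return image
```

A host that answers inside the window always wins over flash, which is the point of checking serial first.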
I've been thinking about upshifting the baud rate after initial connection, but that needs to be made optional, so that a dumb transmitter doesn't have to do anything special, while the Propeller chip just keeps auto-bauding without interruption. That way, the host can start and finish without any adjustments - it just sends the data file.
Also, by making the HMAC signature envelope the whole application, not just the loader, we get around data integrity problems.
Here is what I have, so far:
In the simplest usage, the host just sends " Propeller ! 0 0 0 0 <bytecount> <bytes> <hmac>" at any baud rate from 9600 to 115.2k. The Propeller responds with "PASS" or "FAIL". If PASS, it does a 'COGINIT #0,#0'. If fail, it attempts to load from SPI flash, and then returns to waiting for a serial command while running from the 20MHz RC oscillator (same as power-up).
To step up the speed before a download, the host does the " Propeller ?..." command with a clkcode and baudcode, followed by the " Propeller !..." command to download.
If the host just wants to know if any Propeller is there, it can send " Propeller ? 0 0 0 0 0 0" at any baud rate from 9600 to 115.2k. The Propeller responds with "Propeller..." which details the pins, the hub size, number of cogs, and whether or not the chip contains a CORDIC unit. Those metrics cover every possible family member.
Edit: Duh. I guess what you mean is that the pins need to match the xdata values after being masked by the xmask values. This is to allow the chip type to be determined by using pins as a selector?
Say you have 16 chips loading different programs from the same serial signal and they are not using SPI flash chips. You can differentiate them from one another by, say, tying the normally-used SPI flash pins in different high/low combinations. The amask/bmask values would be 00000000/3C000000, while the adata/bdata values would be 00000000/xx000000. The 'xx' represents the four pins that could be tied in different states. You could load each chip with a different program over one wire. No need to plug a PropPlug into any of them, ever.
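A rough Python sketch of that mask-and-compare idea, using the 3C000000 mask from the example. The function names are made up, and masking the data value as well as the pins is an assumption, made so that stray bits outside the mask can't cause a mismatch.

```python
BMASK = 0x3C000000   # the four strapped pins from the example (bits 26..29)


def bdata_for_chip(index: int) -> int:
    """Strap pattern for chip 0..15, placed on the four masked pins."""
    return (index & 0xF) << 26


def chip_selected(pins_a, pins_b, amask, adata, bmask, bdata) -> bool:
    """True if the pin states match the data values under the masks."""
    return ((pins_a & amask) == (adata & amask)
            and (pins_b & bmask) == (bdata & bmask))
```

With an amask of 00000000, the first port never affects the result, so only the four strapped pins decide which chip accepts a given download.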
You may also need to use this flow to program flash chips that are connected.
That means those dual-purpose pins need to be re-defined to valid states for QuadSPI during pgm, but use light pullup/pulldown during Serial ID check.
Should be possible, but each piece of SW needs to tolerate the other operation?
Is there any reason the Auto-baud is limited to 115200 here?
20MHz should be ok to > 1MBd, with a NCO/fractional Baud clock?
[insert favorite "simplicity" quote here]
Thinking about identification of the chip from an internal point of view, a GETVER instruction would make code aware of available cogs, pins, hub size, CORDIC, etc.
This would be useful for OBEX objects.
I just don't get this need for the loader to have to query this information. You (in general) are going to know exactly what version of the chip you are targeting.
I think the idea here is to confirm you are actually running, on what you hope to be on.
If P2 variants become common, this will be a common question.
It is a good idea generally for loaders to identify the device.
Currently I am testing code on a P123-A9 board and a DE2-115 board.
There are considerable differences between the two platforms such as number of cogs, smartpin count, hub size and pin assignment of peripherals.
Not having the luxury of conditional compilation in Pnut means hardware detection was the way to go.
Knowing what platform it is running on, my code adjusts to the available hardware.
Now I have one version of my code that runs on both platforms without modification.
A lot cleaner and simpler.
Auto baud detection involves measuring the RX pin's states. Software must be able to do a full computation in one bit period. I'll put the code here soon. I got it working today. At 20MHz, 115.2k baud is about 173 clocks, or 86 instructions, per bit. We might be able to do 230.4k, but once you shift up to crystal and PLL, you are deterministic, as well as fast, so 20MHz should be possible.
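The per-bit budget is simple arithmetic, sketched here in Python. The 2 clocks per instruction figure is inferred from the numbers in the post (173 clocks, 86 instructions) and is otherwise an assumption, as are the function names.

```python
def clocks_per_bit(clock_hz: float, baud: float) -> float:
    """System clocks available during one serial bit period."""
    return clock_hz / baud


def instructions_per_bit(clock_hz: float, baud: float,
                         clocks_per_instruction: int = 2) -> int:
    """Whole instructions that fit in one bit period (assumed 2 clocks each)."""
    return int(clocks_per_bit(clock_hz, baud) // clocks_per_instruction)


if __name__ == "__main__":
    for baud in (9_600, 115_200):
        print(baud,
              round(clocks_per_bit(20_000_000, baud), 1),
              instructions_per_bit(20_000_000, baud))
```

At 20MHz and 115,200 baud this gives about 173.6 clocks and 86 instructions per bit, matching the figures quoted.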
You need the 128-bit key to come up with a valid signature for whatever you download, before it will run.
You can freely choose which char to use for AutoBaud, so that gives some flexibility
(you are not locked to that one bit period)
We look for a space ($20). It has a distinctive 10000001001 pattern that can be detected with certainty among text characters.
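That bit pattern can be reproduced with a few lines of Python, assuming standard 8N1 framing sent LSB first, with one idle bit shown before the start bit. The function names are made up for illustration.

```python
from itertools import groupby


def uart_line_pattern(byte: int) -> str:
    """Line states for one character: idle, start, 8 data bits (LSB first), stop."""
    bits = [1, 0]                                   # idle high, then start bit low
    bits += [(byte >> i) & 1 for i in range(8)]     # data bits, LSB first
    bits.append(1)                                  # stop bit high
    return "".join(str(b) for b in bits)


def run_lengths(pattern: str):
    """Collapse a bit string into (level, duration-in-bits) pairs."""
    return [(level, len(list(group))) for level, group in groupby(pattern)]
```

For $20 the line goes low for six bit times, high for one, then low for two, so the measured low and high durations give the bit period directly.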