Monitor in P2 ROM
msrobots
I still think the monitor of P2-HOT was a very good idea.
It does not need to be a full blown TACHYON kernel, just the monitor back.
I am attaching a PDF from @potatohead describing some of it; it is for the P2-HOT and dates from 2013.
Enjoy!
Mike
Comments
On the contrary, it would allow instant access to the chip.
It could even be batchable over serial, which makes it easy to handle for any host and nice for testing the chip after production.
@Publison - I am sadly not able to find the thread from 2013(?) about the monitor, could you help me there, please?
Enjoy!
Mike
This is a good idea, and now that there is more room in ROM, Parallax should be thinking about how P2 can work in a Module / Education situation too.
I've already suggested that the Loader allow a command to configure the PLL (one key reason is to let a loader MCU confirm the P2 xtal is OK), and it's a tiny step from there to general Memory : Value R/W parameters, aka a monitor.
Users could 'play live' with Smart pins, with a simple GUI, and use another smart pin to test.
Some suggest < favourite language > in ROM, but I'm not sure language kernels should be in ROM. There are better uses for ROM, and better ways to manage language features.
If the ROM can read designated/protected Flash areas, (and OTP areas), then having "a FORTH kernel in the P2 ROM" becomes less important.
You could tell the boot loader to launch Forth via the Flash 'nano-FAT'. To the end user, that appears exactly like 'they booted Forth'.
It could also launch any bytecode language, including Spin 2. Parallax just needs to provide a "P2-Languages-flash-file" image, compatible with the ROM, and users can update that.
e.g. Winbond parts look to have 4K top/bottom protect, and the datasheet says the Complement Protect bit (CMP) is a non-volatile read/write bit in the status register (S14).
I think that allows a new 'FAT' to be written in the 'other' 4k, and then once verified, that can flip. CMP thus points to which 4K is active, ROM just needs to read that, then look at the info.
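A rough sketch of that update-then-flip idea, assuming the flip really does work the way the post supposes. The class, the slot layout and the `active` flag (standing in for the CMP bit) are all hypothetical and model nothing about a real Winbond part beyond "two 4K areas and one selector bit":

```python
import zlib

SLOT_SIZE = 4096  # one 4K table area

class DualSlotFlash:
    """Hypothetical model: two 4K table slots plus a one-bit selector."""

    def __init__(self):
        self.slots = [bytearray(SLOT_SIZE), bytearray(SLOT_SIZE)]
        self.active = 0  # stand-in for the CMP bit

    def read_active(self) -> bytes:
        # What the ROM would do: read the selector, then read that slot.
        return bytes(self.slots[self.active])

    def update(self, table: bytes) -> bool:
        if len(table) > SLOT_SIZE:
            raise ValueError("table larger than one slot")
        spare = 1 - self.active
        image = table.ljust(SLOT_SIZE, b"\xff")
        self.slots[spare][:] = image            # write the 'other' 4K
        # Verify by reading back before committing.
        if zlib.crc32(bytes(self.slots[spare])) != zlib.crc32(image):
            return False
        self.active = spare                     # flip only once verified
        return True
```

The old table stays intact until the selector flips, so a failed or interrupted write never leaves the ROM without a valid table to read.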
These Flash parts also have Unique IDs and I think that can be used to load-many P2's, without Pin-Addr-Cost, with some simple ROM enhancements.
Avnet shows W25Q16JVSNIQ-TR for $0.1840/10k
I've seen you make this assertion a number of times now, but I don't understand how the Pin-Addr-Cost can be avoided if the ROM is the same in each P2.
I could see a load order chain costing an additional 2 pins per P2: On boot, one pin is driven while the other is sensed. First in the chain has the sense pin tied externally to the appropriate state to start loading. As each P2 completes loading it drives the driven pin to the alternate state to start the next in the chain.
Please elaborate on your scheme.
Or, you can build a ring, that chains Tx-Rx in a circle, and define some number of bytes per node.
Addressing is implicit by place in chain, and this works best with 9-bit UART modes, which are sadly not well supported in USB-UART land.
We've done this with MCUs, but probably less ideal for OS hosts.
Or you can wire-OR the TXDs and use wired-OR arbitration (like CAN bus does), using the key feature of the Unique FLASH ID.
The host sends a CheckID, and all nodes start to reply each with their Unique FLASH ID, in close sync (they Autobauded, and all did a WAIT on Start bit)
To keep pace, and allow higher baud speeds, a NextID/NOP repeat char could be sent, so TX Echo is always just one-byte sync'd.
eg a Winbond part could send something like
SPI_CMD+04bh+(NOP)*4 + (NextID) *16
As each P2 replies, it checks that TX_Pin == Tx_Data; when they differ, it releases the send and waits for the next enquiry.
The P2 that makes it to end, flags itself as done, and goes quiet.
Repeated CheckIDs reveal all serial numbers present, like a roll call. This needs to be done once only for each cluster of P2's connected, as then the host has collected all the Unique IDs.
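The roll call can be simulated in a few lines. This is only a sketch under stated assumptions: the bus is modelled as CAN-style with 0 as the dominant level, IDs are compared bit by bit with the most significant bit first (chosen for simplicity, a real UART shifts LSB first), and the function names are made up for illustration.

```python
def check_id(nodes, id_bits=64):
    """One CheckID round; returns the ID that survives arbitration, or None."""
    contenders = [n for n in nodes if not n["done"]]
    if not contenders:
        return None
    for bit in range(id_bits - 1, -1, -1):
        sent = [(n, (n["id"] >> bit) & 1) for n in contenders]
        bus = min(b for _, b in sent)      # any node driving 0 wins the wire
        # Nodes whose own bit differs from the bus release and wait.
        contenders = [n for n, b in sent if b == bus]
    winner = contenders[0]                 # unique IDs leave exactly one node
    winner["done"] = True                  # flags itself done and goes quiet
    return winner["id"]

def roll_call(ids, id_bits=64):
    """Repeat CheckID until nobody answers; returns every ID found."""
    nodes = [{"id": i, "done": False} for i in ids]
    found = []
    while True:
        winner = check_id(nodes, id_bits)
        if winner is None:
            return found
        found.append(winner)

print([hex(i) for i in roll_call([0x1234, 0x00FF, 0xABCD], id_bits=16)])
```

With these conventions the numerically lowest ID wins each round, so the host sees the IDs in ascending order.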
With a 20MHz RC osc, I'd expect 100~200kBaud is practical, and if a PLL config is added to ROM, then maybe 1MBd.
So each P2 has its own external Flash storage including Unique ID and the only shared connection is to the host?
The Flash part only provides the Unique ID during development to control which code blob is loaded into each P2, but can be used for standalone (simultaneous) program load in a "finished" system.
The shared connection in a programmed system is then only used for a HMI terminal, if desired?
Perhaps the boot process could include a check of the Flash content for a "locked" load image and then if found require a successful challenge response before allowing access to the monitor or loading code from the host? The correct response would also need to be stored on the Flash chip if it is not a universal response code.
I recognise that this is not providing security from code theft as there is nothing stopping an attacker snooping the bus between Flash and P2, but it will prevent accidental access to the code, and might prevent code tampering.
I see the Prop2 resources and their needed monitor interactions as follows:
1) Hub RAM: read/write and launch cogs from
2) Cogs: start, stop, and poll
3) Smart pins: configure and poll
If a monitor just gave you access to the above, the chip could be used right away in applications where just a bunch of smart pins were needed. Add some cog code and you've got real-time processing. Build apps and you've got stand-alone products.
A monitor would be about 3KB of code. It could be built-in, for greatest ease of use, but remember that a 4KB squirt of data could load that same monitor using Prop_Txt via the serial boot protocol. That squirt is a copy/paste-able chunk of text that could easily be administered by a larger system. And, it could be upgraded, as needed, then.
Here is what a 3KB download would look like (X represents the base64 characters):
Then, there's total flexibility. On the other hand, a nice little monitor in ROM could become a mainstay, and once people want more, they can squirt a better one in.
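The 3KB-to-4KB figure is just the base64 expansion of 4 characters per 3 bytes, which is easy to check. The image below is a placeholder of zeros, not a real monitor, and any Prop_Txt framing around the payload is left out:

```python
import base64

monitor_image = bytes(3 * 1024)            # placeholder: 3 KB of zeros
encoded = base64.b64encode(monitor_image)  # 4 output characters per 3 input bytes

print(len(monitor_image), len(encoded))    # prints: 3072 4096
```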
I think there would be value in having a monitor built into ROM. For early sales, this might be more important than code protection.
There's almost nothing to learn with a monitor, if you just want to use smart pins.
1. run "p2load monitor" (or "p2load forth", etc)
2. p2load repeatedly sends Prop_Chk until it gets a response.
3. p2load then sends your monitor using Prop_Hex.
4. p2load exits when done.
To get things going, you would first start p2load, then force a reboot on your P2 (cycle power). Then just fire up your favorite serial terminal and go to town.
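The steps above could look roughly like this. It is a sketch, not a real tool: `p2load` is the hypothetical loader from the list, `FakeP2` is a made-up stand-in for a serial port so the flow can be run without hardware, and the exact command strings and the reply text are assumptions about the boot protocol rather than a reference.

```python
class FakeP2:
    """Hypothetical port: the chip only answers after a few polls (power cycle)."""

    def __init__(self, polls_before_boot=3):
        self.polls_left = polls_before_boot
        self.loaded = b""
        self._reply = b""

    def write(self, data: bytes):
        if data.startswith(b"> Prop_Chk"):
            if self.polls_left > 0:
                self.polls_left -= 1               # chip not up yet, stay silent
            else:
                self._reply = b"\r\nProp_Ver\r\n"  # assumed reply text
        elif data.startswith(b"> Prop_Hex"):
            fields = data.split()[6:-1]            # skip header fields, drop '~'
            self.loaded = bytes(int(f.decode(), 16) for f in fields)

    def read(self) -> bytes:
        reply, self._reply = self._reply, b""
        return reply


def p2load(port, image: bytes, max_polls=100) -> bool:
    # Step 2: poll with Prop_Chk until the chip responds.
    for _ in range(max_polls):
        port.write(b"> Prop_Chk 0 0 0 0\r")
        if b"Prop_Ver" in port.read():
            break
    else:
        return False                               # gave up, no chip seen
    # Step 3: send the monitor image as hex bytes.
    hex_bytes = " ".join(f"{b:02X}" for b in image)
    port.write(f"> Prop_Hex 0 0 0 0 {hex_bytes} ~\r".encode("ascii"))
    return True                                    # step 4: done, exit
```

On real hardware the port object would be a serial connection with a read timeout, and the polling loop would need a short delay between attempts.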
Edit: and, of course, on more full-featured IDEs, this could all be built in and automated.
The thing about having something in ROM, though, is that it can become a mainstay just because it doesn't change. There is value in something NOT changing over a long period of time. More know-how gets accumulated out there.
Guess what this bit of code would do from scratch.
That looks good to me.
Yes, but you have only one shot to get it right, both in terms of features and reliability.
Then I redirect all output to SERIAL on pin 26 and FOR 10 times I do a CR (carriage return) and PRINT' a literal string looping as we do with NEXT until finally redirecting output back to the console CON since we are doing this interactively.
But the words themselves could be more verbose for sure or I could make each word more specialized so that TXD for instance took a whole string of arguments but that just complicates it when we could just keep it simple, clean, and compact.
I could set up 10 serial ports printing Hello World! with this too:
Or we could monitor and debug serial traffic continually printing serial data as hex bytes like this: Using PIN 27 as a serial RXD @ 115,200 BAUD we BEGIN to RX a character and print the BYTE with a trailing SPACE UNTIL we hit a console KEY.
This is how it works with Tachyon on the P2 now.
Does RX block? Won't that mean that we can't escape from the loop unless at least one character comes in on pin 27?
A monitor in ROM is good for debugging assembly code but Forth is excellent for debugging hardware and trying out what-ifs instantly.
Run the exit command, and it's Forth.
Yes, having the monitor there is a great learning tool. In terms of Parallax education, that feature could have some value in that the very low level means and methods can be taught.
IMHO, they should be. The one in HOT was excellent. The only thing it was missing then was a hook so additional code could be uploaded to extend what it can do.
Today, Chip has identified these things:
I would add one command, and a jump vector pointer. Once the monitor is in there, we get much more flexible uploads too.
See the doc I made, page 21.
An upload could include the jump vector needed, and it would literally be paste and go. User has new command available.
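To make the jump vector idea concrete, here is a toy model. Everything in it is hypothetical: the `r`/`w`/`d` command letters are invented for the example and are not the P2-HOT monitor's commands, and a Python callable stands in for the address that uploaded code would patch into the vector.

```python
class Monitor:
    """Toy monitor: two built-in commands plus one user jump vector."""

    def __init__(self):
        self.hub = bytearray(1024)  # stand-in for hub RAM
        self.user_vector = None     # the jump vector pointer, empty at reset

    def execute(self, line: str) -> str:
        parts = line.split()
        if not parts:
            return ""
        cmd, args = parts[0].lower(), parts[1:]
        if cmd == "r":              # r <addr>: read one byte
            return f"{self.hub[int(args[0], 16)]:02X}"
        if cmd == "w":              # w <addr> <value>: write one byte
            self.hub[int(args[0], 16)] = int(args[1], 16)
            return "ok"
        if self.user_vector is not None:
            # Unknown command: hand it to whatever the upload installed.
            return self.user_vector(self, cmd, args)
        return "?"


def dump_extension(mon, cmd, args):
    """An 'uploaded' command: d <addr> <count> dumps bytes as hex."""
    if cmd != "d":
        return "?"
    start, count = int(args[0], 16), int(args[1], 16)
    return " ".join(f"{b:02X}" for b in mon.hub[start:start + count])


mon = Monitor()
mon.execute("w 10 AB")
print(mon.execute("d 10 2"))        # prints: ?
mon.user_vector = dump_extension    # the upload sets the vector
print(mon.execute("d 10 2"))        # prints: AB 00
```

The built-in commands never change, which is the point of having them in ROM; only the vector target is replaceable.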
Interpreters, assemblers, editors all follow.
The on-chip kit, and we really do need one that is simple and "done" in the sense the P1 stuff is "done", would be the baseline for drivers and all sorts of other basics.
That body of code would be universal, as everything else can be a super-set of it and/or offer translators / filters for it.
We get a P1 style OBEX, and we get code that will compile on something unchanged 20 years from now, just like most of the code for P1 works today.
Extremely high value here. We need to do it. Please.
If we do choose to put Tachyon in there, and I'm totally good with that, the appeal there has three main attractions:
1. Monitor, interactive assembler, disassembler. (this is what the Apple 2 had in its ROM, BTW) I think Peter can pack it in there. The guy has done amazing stuff in tiny memory spaces.
2. For people who grok this stuff, bootstrapping new chips is cake. Nothing out there could touch it.
3. Robust boot support possible. A whole lot of people want this. Why not deliver it? We can, you know.
The downsides are few:
1. There would be more of a memory map needed. Not too much, but it's not as lean as the monitor can be. I personally don't think much of this, as it can all be ignored by simply booting earlier, ignoring this stuff, and/or with an upload and jump that wipes it clean straight away.
2. It's bigger and more clunky, and that can mean mistakes kill it off more easily. (minor)
And dang it, I'm gonna ask one more time:
Can we write-inhibit the lower 16K? Do it with a little state machine. Use the COGID instruction, and just require steps to toggle it, and require they all be sequential, maybe. Or within X clocks.
P2 chip starts up, just as it does now. If the states are performed or triggered, then writes to the lower 16K are ignored. Run through the states, or a different state again to return to default.
The chances of rogue instructions doing this are extremely slim, and it allows the thing to be useful past a bonehead mistake, which people will make.
And I'm only asking this if it's not gonna break timing, or some other crazy thing. Would be a shame to have not done this, particularly with something useful in the chip.
Of course, a quick reset returns the thing to default state too...
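A small model of what such a latch might look like, purely as a sketch. The key values, the window length and the `key()` entry point (standing in for whatever instruction would trigger each step) are invented; only the "ordered steps within X clocks toggle write protection of the lower 16K, reset returns to default" behaviour comes from the request above.

```python
class WriteLatch:
    UNLOCK_SEQ = (0x5, 0xA, 0x3)   # made-up step values
    WINDOW = 16                    # made-up maximum clocks between steps
    PROTECTED_TOP = 16 * 1024      # lower 16K

    def __init__(self):
        self.reset()

    def reset(self):
        # Chip starts up as it does now: lower 16K writable.
        self.protected = False
        self._step = 0
        self._last_clock = 0

    def key(self, value, clock):
        in_time = self._step == 0 or clock - self._last_clock <= self.WINDOW
        if in_time and value == self.UNLOCK_SEQ[self._step]:
            self._step += 1
            self._last_clock = clock
            if self._step == len(self.UNLOCK_SEQ):
                self.protected = not self.protected   # full sequence toggles
                self._step = 0
        else:
            self._step = 0          # wrong value or too slow: start over

    def write_allowed(self, addr):
        return not (self.protected and addr < self.PROTECTED_TOP)


latch = WriteLatch()
for clock, value in enumerate(WriteLatch.UNLOCK_SEQ):
    latch.key(value, clock)
print(latch.write_allowed(0x0100), latch.write_allowed(0x4000))  # prints: False True
```

A stray write or a partial sequence leaves the latch where it was, which is what makes an accidental toggle unlikely.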
All I know is when I was developing on the Apple, in RAM, having the ROM be stable was high value. Made a lot of bonehead errors. Later, on a Color Computer 3, it would move its ROM to RAM on boot. I stomped it pretty regularly when learning how things are done. A write latch would have made it golden.
Again, only if it's a no-brainer. If it's an issue at all, ignore my request. I don't make it lightly. In fact, I have held to the "don't do it" for some time now. But this one goes way back to when we decided to make the ROM a serial thing, streamed in to RAM at the right time.