P2 Mailbox and Parameters - where to place and what is needed?
Cluso99
From discussions in other threads...
Where to place the crystal and clock parameters, baud, and serial I/O pins?
* At hub $0+
* At hub $4+, with a JMP at hub $0 to jump over the parameters
* At top of hub
What parameters are needed?
Having previously done this in P1 for my OS alone since I/we could not get any traction, let's have another go.
Here is a list of some parameters that IMHO would be good to have...
* Current clock frequency (long) - same as hub $0 (long) in P1
* Current pll etc setting (long) - similar but expanded from hub $4 (byte) in P1
* RCFAST frequency (long) - initially say 25MHz (Chip to advise best value), and can be changed if determined by software
* HUBFREE pointer (long) - allows allocation of hub for other cogs
- eg using top of hub, then $FFF00 would mean $00000-$FFEFF is usable by program
- eg using bottom of hub, then $00100 would mean $00100-$FFFFF is usable by program
* Serial: cog, mode, TX & RX pins (long)
* Serial: baud (long)
* StdOut: TX mailbox (long)
* StdIn : RX mailbox (long)
* AuxCfg: LF/rows/columns (long) - typical terminal, 0-127 rows of 0-255 columns
* AuxPar: undefined (long)
* AuxOut: TX mailbox (long)
* AuxIn : RX mailbox (long)
* SD: mailbox (2*longs) - typically CMD/buffer/reply/sector
* Cogs stay resident: (long) - (word free) 1=don't reset on reboot
* Date/Time (long) - 6:4:5:5:6:6 for yy:mm:dd:hh:mm:ss
The mailbox concept allows bytes to be passed with byte parameters. If an extended mailbox is required, then a flag is set here together with a pointer to where in hub the bigger mailbox is located. The larger mailbox gets its space using/updating the HUBFREE pointer.
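To make the proposed Date/Time long concrete, here is a rough host-side sketch in Python (not P2 code) of the 6:4:5:5:6:6 packing. The field widths come from the list above; putting the year in the top bits and counting years from 2000 are assumptions, and the function names are made up for illustration.

```python
# Field widths from the proposal: yy:mm:dd:hh:mm:ss = 6:4:5:5:6:6 bits (32 total).
# ASSUMPTIONS: year occupies the top bits, and years are stored as an offset from 2000.
FIELDS = (("year", 6), ("month", 4), ("day", 5),
          ("hour", 5), ("minute", 6), ("second", 6))
BASE_YEAR = 2000  # hypothetical base year


def pack_datetime(year, month, day, hour, minute, second):
    """Pack the six fields into one 32-bit long, year in the most significant bits."""
    values = (year - BASE_YEAR, month, day, hour, minute, second)
    packed = 0
    for (name, width), value in zip(FIELDS, values):
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        packed = (packed << width) | value
    return packed


def unpack_datetime(packed):
    """Split a packed long back into its fields."""
    out = {}
    for name, width in reversed(FIELDS):
        out[name] = packed & ((1 << width) - 1)
        packed >>= width
    out["year"] += BASE_YEAR
    return out


stamp = pack_datetime(2019, 3, 14, 15, 9, 26)
print(hex(stamp), unpack_datetime(stamp))
```

With the year in the top bits, two stamps compare in chronological order as plain integers. Incrementing one still needs per-field carry logic.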
Comments
The date/time should be a simple 32-bit or 64-bit second counter from some epoch, instead of being something that's tricky to increment and compare. Hard to display but easy to operate on is better than easy to display but hard to operate on. Alternatively, you could just store a 64-bit epoch there that is set at boot and only changes if there are clock adjustments and then compare this epoch to the current 64-bit system counter so that you don't have to make sure that exactly one cog updates the clock every second.
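A rough Python sketch of the second alternative (a stored epoch compared against the free-running counter). This is a host-side illustration, not P2 code. It assumes the clock frequency stays constant between the epoch snapshot and the read, and the function and parameter names are hypothetical.

```python
MASK64 = 0xFFFF_FFFF_FFFF_FFFF  # the system counter is treated as 64 bits wide


def current_time(epoch_s, counter_at_epoch, counter_now, clock_hz):
    """Seconds since the epoch, derived from the counter instead of a ticking field.

    epoch_s          -- wall-clock seconds stored at boot (or at the last adjustment)
    counter_at_epoch -- counter value captured when epoch_s was stored
    counter_now      -- current counter value
    clock_hz         -- counter ticks per second (ASSUMED constant since the snapshot)
    """
    elapsed_ticks = (counter_now - counter_at_epoch) & MASK64  # tolerate wrap-around
    return epoch_s + elapsed_ticks // clock_hz


# Three seconds and a few ticks after the snapshot, at an assumed 180 MHz.
print(current_time(1_546_300_800, 1000, 1000 + 3 * 180_000_000 + 5, 180_000_000))
```

No cog has to update anything once per second; only a clock adjustment (or a frequency change) needs a fresh epoch and counter snapshot.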
How does HUBFREE work? Is it the head of a freelist, or what? It sounds like it would need a lock in case multiple cogs want it at once - should there be a standard lock number for guarding it? Or should there be one HUBFREE pointer per cog, where each cog has a separate pool, to eliminate locking?
* Current clock frequency (long) - same as hub $0 (long) in P1
* Current pll etc setting (long) - similar but expanded from hub $4 (byte) in P1
* RCFAST frequency (long) - initially say 25MHz (Chip to advise best value), and can be changed if determined by software
Not sure what you want with HUBFREE, since this depends on the application and on which cogs use how much RAM. Impossible to coordinate.
* Serial: cog, mode, TX & RX pins (long)
* Serial: baud (long)
* StdOut: TX mailbox (long)
* StdIn : RX mailbox (long)
* AuxCfg: LF/rows/columns (long) - typical terminal, 0-127 rows of 0-255 columns
* AuxPar: undefined (long)
* AuxOut: TX mailbox (long)
* AuxIn : RX mailbox (long)
* SD: mailbox (2*longs) - typically CMD/buffer/reply/sector
* Cogs stay resident: (long) - (word free) 1=don't reset on reboot
* Date/Time (long) - 6:4:5:5:6:6 for yy:mm:dd:hh:mm:ss
This should be part of your OS, not of any program one wants to write.
My Larson scanner does not need an AuxOut or StdOut at all and might not even use any serial ports; same goes for SD.
Sure, some values should be saved, but not too much.
Enjoy!
Mike
PASM has its small stack in the COG; everything else has to take care of itself. So is your default stack that of fastspin, that of propgcc or that of the still upcoming SPIN2 interpreter?
Even if programs written in different languages would use the same stack area, how would they coordinate? Where would the common tail and head pointers go?
I followed all the discussions of coordinating mailboxes on the P1 and I think it is just moot. What we need is, like in the P1, fixed locations for CLOCKFREQ and CLOCKMODE so that a loaded object can adjust its timing to the clock freq, and the convention that if somebody changes the clock they write the new values back.
And putting those 2 or 3 longs at the bottom of RAM would allow a booter to read them out of the file, or to skip that and use the current settings instead. And YES, we will lose 2 or 3 immediate values, but if it is just 2 or 3 it's fine; it worked with the P1 too.
Just keep it simple and short.
Mike
On the subject of keeping simple: Since all finished programs, including obex modules, get pasted together as source code, there is no particular need for any reserved locations of any sort. Things like clock frequency can all be predefined constants in the source code.
Think of P1 when you use an object to output serial you use fdx.tx(char) while in video you use tv.out(char). Here we have the chance to make these all the same... stdout(char).
HUBFREE can use a lock, or you could load an object and run it. It allocates the space it requires and then updates HUBFREE, then sets its mailbox flag to indicate it's done. Now the OS (or loader or whatever) can run the next object, and so on. These reserved longs provide a simple method. The implementation is still up to the programmer, or agreed to by us.
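As a rough illustration of the lock variant, here is a host-side Python model (not P2 code) of HUBFREE for the "top of hub" case, where everything below the pointer is free and each allocation moves the pointer down. The class name, the rounding to whole longs, and the use of `threading.Lock` as a stand-in for a hardware lock are all assumptions.

```python
import threading


class HubFree:
    """Model of a HUBFREE pointer: $00000..hubfree-1 is still usable by the program."""

    def __init__(self, hubfree=0xFFF00):
        self.hubfree = hubfree
        self._lock = threading.Lock()  # stands in for an agreed hardware lock number

    def alloc(self, nbytes):
        """Reserve nbytes just below the current pointer and return the block address."""
        size = (nbytes + 3) & ~3  # ASSUMPTION: round up to whole longs
        with self._lock:          # only one cog may update HUBFREE at a time
            if size > self.hubfree:
                raise MemoryError("hub RAM exhausted")
            self.hubfree -= size
            return self.hubfree


hub = HubFree()
serial_buf = hub.alloc(10)    # rounded up to 12 bytes
sd_mailbox = hub.alloc(256)
print(hex(serial_buf), hex(sd_mailbox), hex(hub.hubfree))
```

In the load-one-object-at-a-time variant, the lock is unnecessary because only one object touches the pointer before signalling that it is done.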
This is an extremely short window of opportunity. Derail this if you will - it's been done every time previously when a group of us have tried, so really I don't expect any progress
The stack does not *have* to grow upwards, and in fact in p2gcc it grows downwards (but p2gcc doesn't use ptra yet, so maybe this is moot). Different compilers will place the stack at different places. But typically it will be placed after the code and data, growing up towards the end of HUB RAM, which means it will be in different places for different applications. So I don't think I'd try to specify a default stack location.
My feeling is that we should agree on a common minimum set of features (like maybe the 3 I listed) with a bunch of memory reserved for future expansion. I like some of @Cluso99's suggestions, but disagree with a few details -- I suspect a lot of people will have different ideas about what should go in the area. This can be hashed out in the coming months, as long as we have a fixed area reserved.
I like the idea of reserving, say, 256 bytes at the top of HUB RAM. But this requires buy-in from all the boot ROM owners. If we can't get agreement, my fallback position is to use the current TAQOZ configuration space at the bottom of RAM (since it exists already).
The CALLD instruction has two encodings -- one with a 9-bit address field, and the other with a 20-bit field. The second encoding only has 2 bits for the destination address, which allow for PA, PB, PTRA or PTRB. Due to this restriction I use PA as the link register to hold the return address.
When GCC accesses a stack variable it will compute the address once and keep it in a register for future use. So there probably isn't much advantage to using PTRA for addressing stack variables. When GCC calls a function it has to save several registers on the stack. On the P2 this can be done efficiently by using SETQ prior to WRLONG, which does a block transfer.
I forgot we have write protection for the top 16KB of hub. In the 512KB version this is duplicated.
What if we had say 4 longs there (at the top) for the clock settings, and a pointer to where in hub the mailboxes and parameters are located. This way, the ROM can preload the default clock parameters, and they can be protected.
TAQOZ will also live in the bottom 64KB if it’s being used.
BTW I have also been playing with 1920x1080 VGA (would be same for HDMI). At 1bpp, this is just under 256KB (~253KB) and gives 2 colours. 2bpp would be just under 512KB, which gives 4 colours.
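The framebuffer arithmetic is easy to check on a host machine. A small Python sketch follows; the function name is made up, and it assumes pixels are packed with no padding between lines.

```python
def framebuffer_bytes(width, height, bpp):
    """Bytes needed for a packed framebuffer (ASSUMES no per-line padding)."""
    return width * height * bpp // 8


HUB_512K = 512 * 1024

for bpp in (1, 2):
    size = framebuffer_bytes(1920, 1080, bpp)
    print(f"{bpp}bpp: {2 ** bpp} colours, {size} bytes "
          f"({size / 1024:.3f} KB), {HUB_512K - size} bytes left of 512KB")
```

So 1bpp works out to 259,200 bytes (about 253KB) and 2bpp to 518,400 bytes (about 506KB), which leaves under 6KB of a 512KB hub for everything else.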
Both these give us a nice display if we use a pair of P2’s, with one being a dedicated display plus perhaps keyboard and audio. I don’t think there are many single chip 1920x1080 around with 512KB of internal RAM plus cpus with their own ram.
This could make the basis for a nice 2 cog variant with 1MB and say 8 smart pins and 1 set of DACs in the future. This may just fit in the current silicon size.
I like the idea of ROM plus vectors at the top the best. It can be write inhibited. P2 as a system makes doing that compelling, but the ROM authors need every byte! (Always goes that way, lol)
I also like the idea of just starting programs at $400 leaving that bottom non HUBEXEC RAM open too. That one could be well established right now as tools get made. Think future expansion on the system front, and or buffers, 6502 “Zero Page” style.
I am fine with what gets decided, as I am catching up having only had a small amount of time on P2.
Just sharing my preferences. There is nothing said here one cannot live with.
Some of this discussion treats the topic from a system on chip point of view. That is interesting.
On that note, should a system software, small OS like 8bit days, appear? It would need to gain traction. As it does, programs will be modified and or written accordingly, yes?
I see that like a PC or Apple 2. A raw bootable program can do anything!
The minimum useful things for programs that take the hardware over, not treating the P2 like a system, are basically clock parameters. All else is up to the developer. And those things are not needed, just nice.
If it were me, that is what I would settle on.
The ROM authors are thinking a bit differently, more system on chip. If it were me, they decide how the pieces work, with systems, patching, extending in mind where it makes sense.
The rest will likely come as someone builds a system where programs resemble applications more than we have seen in the past. Maybe leaving $000 to $400 paves the way for that to happen.
My .02
The problem with protecting them is that programs will want to change them sometimes. If the program does a clkset() to change frequency, it should update the frequency and mode; similarly for changes in baud rate on the default serial port.
A pointer in the ROM to the top of HUB memory is a good idea though (it'll let applications figure out how much memory they have -- 512K on current devices, up to 1MB on future ones).
Right, but the current chip only has 512KB of memory. If people start writing new clock settings at 512KB - N, then that'll not work on chips that have 1 MB of memory. If they stick to the 16KB dual-mapped block and always use 1MB - N that would work. But it also means we'd be updating stuff in the mapped ROM, which means we'd have to take the protection off. I could live with that, not sure if others would be happy though.
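To spell out the addressing argument, a small Python sketch (host-side, not P2 code). The 1MB map and the 16KB top block are as described in this thread; the function names and the example offset are hypothetical.

```python
HUB_MAP_SIZE = 0x100000  # 1MB hub address map
TOP_BLOCK = 0x4000       # 16KB block at the top of the map


def addr_below_map_top(n):
    """Address N bytes below the top of the 1MB map (same on every chip)."""
    return HUB_MAP_SIZE - n


def addr_below_ram_top(n, ram_size):
    """Address N bytes below the top of the RAM actually fitted (varies per chip)."""
    return ram_size - n


def in_top_block(addr):
    """True if addr falls inside the top 16KB block of the 1MB map."""
    return HUB_MAP_SIZE - TOP_BLOCK <= addr < HUB_MAP_SIZE


N = 16  # hypothetical offset of a clock-settings long
for ram in (0x80000, 0x100000):  # 512KB and 1MB parts
    print(hex(ram), hex(addr_below_ram_top(N, ram)), hex(addr_below_map_top(N)))
```

Code that uses `512KB - N` lands at $7FFF0, which is ordinary RAM on a 1MB part. Code that uses `1MB - N` gets $FFFF0 on both sizes, at the cost of always living in the block that may be write protected.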
The mailboxes will need to be in writable hub.
Currently I am not sure whether the protection needs to be set or not.
The write-protected 16K of memory may have some interesting uses. Maybe it could be used to hold a small kernel and system data that can't be scribbled on by other programs. There could be a cog dedicated to run the kernel code, and it could take control of the system if other cogs hang and don't reset their watch dogs. It might be something that an OS could be built around.
There have been a few attempts to standardize mailboxes and system tables, but I don't see much need for it. Instead of mixing code from different languages it's probably better to write everything in one language. Eric's spin2cpp does a great job of converting Spin to C or C++. I have a utility called cspin that converts C to Spin, though it does have some limitations on the C that it supports.
It's tempting to want to take advantage of drivers that are in the ROM code, but I think it's better to just develop drivers in whatever language people write their code in. Once the silicon is produced the ROM drivers are frozen in time. Software is changeable, and new and improved drivers will be developed over time.
In a single-program application, the code knows where the system information is located. Any cogs that share the same code will also know this. Cogs that don't share the same code can be told where the system parms are by passing a pointer when doing a coginit. Or a predefined rendezvous area can be used like we have done with OS'es on the P1.
And yes, I am also thinking about that ROM write lock.
As far as I understand, it is a one-way solution, so you can NOT switch back. And even code in ROM is not able to write ROM. So all of @Cluso99's HUB exec routines and debugger will have a hard time using the booter-rom area as input buffer for serial and parameters when write protect is enabled.
So write protecting the ROM only makes sense for your own code that overwrites the ROM area before write protecting.
I see some merit in providing standard mailboxes, but the lower ram area is unique in providing direct access, and so I think one should really use as few longs as possible.
Having them on top of the ROM would also be OK, but on top of the RAM is not useful for versions with different RAM sizes, and Parallax/Chip talked about extending the P2 to a 'family' of devices with different COG count and RAM size.
A pointer in ROM stating the current RAM size should be available, but it seems that the ROM is closing soon, so maybe this discussion is already too late.
And I strongly disagree with not writing programs in different languages; fastspin provides this flawlessly. I can use spin2 objects from Basic or C, C objects from Spin and Basic and Basic objects from C and Spin.
This is very useful for writing a program in your most favorite language and use objects written in other languages.
And - hmm - that might be possible with GCC too, it is a compiler collection, so technically one can compile fortran with propgcc and C and link it together ...
Enjoy!
Mike
On the use of multiple languages, I think what Eric has done in fastspin is amazing. I'm guessing that Eric accomplished this multi-language integration by compiling each language to a common assembly environment. That is, they all use the same register configuration and rules, and they use the same stack.
We're in the early stage of software tool development for the P2. At this point it might make sense to integrate C and Spin code. There are tools like fastspin and spin2cpp that enable this. However, as time goes on I think that each language will have similar drivers written in their own native language. So there will be less of a need to integrate languages in the future.
Well, if your program never has any need to set or read any of those variables it won't care. In practice I think most programs read at least the clock frequency in order to determine how many cycles to wait for various delays.
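The delay calculation is just a scaling by the stored clock frequency. A tiny Python sketch, with hypothetical function names and example frequencies:

```python
def ms_to_cycles(clkfreq_hz, ms):
    """Clock cycles to wait for a delay of `ms` milliseconds."""
    return clkfreq_hz * ms // 1000


def us_to_cycles(clkfreq_hz, us):
    """Clock cycles to wait for a delay of `us` microseconds."""
    return clkfreq_hz * us // 1_000_000


# The same 10 ms delay at two assumed clock frequencies.
for clkfreq in (80_000_000, 160_000_000):
    print(clkfreq, ms_to_cycles(clkfreq, 10))
```

This only works if whoever changes the clock also writes the new frequency back to the shared location; a stale value silently scales every delay.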
But WAY more interesting is the tidbit that debug interrupt routines can write into a write protected rom. Where did you find that? That indeed would open up ring 0 system level and a ring 3 user level.
How cool would that be.
My thought on this was that code running from ROM location should be able to write ROM locations, even if write protected, but I never dared to ask for that.
But repurposing the debug interrupt could work. I am currently stuck with my serial cog based buffered driver; somehow my tests don't do what I expect, though the driver itself seems to be ok.
I am waiting to see if someone here finds my brainfart in the test suite, or I need to go over this in a couple of days with fresh eyes. Maybe I will play around with rom locking tonight...
Enjoy!
Mike
For purposes of this discussion all of the fastspin "dialects" (C, Basic, Spin) should be considered as the same language. The major "high level" languages available now for the P2 are Tachyon Forth, p2gcc, and fastspin. All 3 of these have configuration areas located in various places and with various different things in them. It'd be nice to standardize on some of these things.
I think that since TAQOZ exists already and is in the ROM it makes sense to use it as a model (the other languages, being not burned in ROM, can more easily change to accommodate TAQOZ than vice-versa). If I've read it correctly, TAQOZ's config area is from $0 to $34, followed by some TAQOZ specific vectors. The config area starts with a JMP, and the next 8 bytes are an ID.
fastspin only needs 3 of the configuration longs that TAQOZ has defined: the _CPUHZ at $14 (final clock frequency), _CLKCFG at $18 (clock configuration mode), and baud rate at $1c. Conveniently for me, they are together. Presently fastspin has those at $10, $14, and a variable address (not exported to other programs). I will move them to match TAQOZ unless someone can convince me otherwise.
p2gcc seems to also have 3 configuration longs: clock frequency (TAQOZ $14), user baud (TAQOZ $1c), and bitcycles (not presently stored in TAQOZ, but can be calculated from the previous 2 values). If I'm reading the source correctly these are stored starting at $400. @"Dave Hein" : it looks like it shouldn't be too bad to move those down to start at $14, or am I missing something in the p2gcc init code?
So: I propose that we standardize for now on TAQOZ's use of $10 to $20 as _XIN, _CPUHZ, _CLKCFG, and _BAUD, with $0-$F reserved for a jump instruction and anything else the language runtime wants to use to initialize. _XIN is not likely to change at run time, so I'd be fine with moving that to the ROM area, but the other 3 values are pretty likely to be set by the runtime environment, so I think putting them in RAM makes sense.
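A rough host-side Python sketch of that layout, treating a `bytearray` as a stand-in for hub RAM. The offsets are the ones proposed above ($10, $14, $18, $1C); the helper names and the example values are made up, and the bitcycles calculation follows the clock-frequency-over-baud idea mentioned for p2gcc.

```python
import struct

# Proposed offsets of the shared configuration longs at the bottom of hub RAM.
CONFIG_OFFSETS = {"_XIN": 0x10, "_CPUHZ": 0x14, "_CLKCFG": 0x18, "_BAUD": 0x1C}


def write_config(hub, name, value):
    """Store one 32-bit little-endian long at its reserved offset."""
    struct.pack_into("<I", hub, CONFIG_OFFSETS[name], value)


def read_config(hub):
    """Read all four configuration longs into a dict."""
    return {name: struct.unpack_from("<I", hub, off)[0]
            for name, off in CONFIG_OFFSETS.items()}


def bitcycles(cfg):
    """Clock cycles per serial bit, derived from the other two values."""
    return cfg["_CPUHZ"] // cfg["_BAUD"]


hub = bytearray(0x400)                    # model of the first 1KB of hub RAM
write_config(hub, "_CPUHZ", 160_000_000)  # example values only
write_config(hub, "_BAUD", 115_200)
cfg = read_config(hub)
print(cfg, bitcycles(cfg))
```

A runtime that changes the clock or baud would call the equivalent of `write_config` afterwards, so that any other cog reading these longs sees the current settings.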
I'd also like to propose that we set aside the "top" 1K of memory (just below the ROM shadow copy) as reserved for future mailboxes and inter-COG communication. We can hash out the details later.
Good. I've recently changed to using -SINGLE for the loader since I've always been setting the clock at runtime anyway. So I have it all as predefined constants in the source.
They shouldn't be in a protected area because user code will want to change them.
Yes, I agree ultimately we should overwrite the ROM area in hub ram to contain very "usable" routines. IMHO the Monitor & TAQOZ are there for getting up and running quickly without requiring other code, downloadable or from Flash/SD. It's an excellent starting point though.
I disagree that all programs should be in the same language tho. Having fast efficient pasm code for drivers makes a lot more sense to me. In my OS, the loaded drivers (serial and SD and possibly video/keyboard) are preloaded and stay resident. Other programs are loaded and run from SD, all while the resident OS stays resident. The programs are pre-compiled binary spin programs, but they could be anything as long as they are pre-compiled binaries, and know about the mailboxes and entry points.
As long as we have a standard pointer to the mailbox area, etc, we are covered.
IMHO they should be in a protected area. You can unlock (enable) to change and then protect again. When we talk about the ROM, we really mean the top 16KB of a 1MB hub ram model, which can be write protected and unprotected by instructions. This 16KB is also windowed to the top 16KB of whatever is the maximum hub ram in this chip. The serial ROM is auto-loaded into this area on power up/reset, but it's not write protected by default.