Spin modules as separate relocatable modules (in OS)
Cluso99
As I understand it, a spin program is relocatable.
When the compiler builds a complete program, it takes each spin PUB and PRI routine and builds them as blocks within the binary.
So, if I build an object, I should be able to load it, right? I should also be able to relocate it too, right?
Previously I have only managed to load cogs with pasm code. They are able to stay resident (or running) when I load a main program module. Now I want to be able to keep 1 cog running spin driver code as well.
OK, so take my example. I want to compile Kye's SD-MMC_FATEngine.spin (without the pasm section, because I will load that into the cog separately so that I can reclaim cog space) and keep it running in upper hub ram. If I convert it to pasm (as mpark did with fsrw and Sphinx) then this can be done. But I want to use spin as its speed is fine, it's done, and its footprint is (relatively) small.
I want to be able to locate this spin code, at say hub address $7000. I need to do this so that I can retain the spin code as resident and callable routines to other "reloadable" program modules. These reloadable modules will be compiled as separate modules, with just an _def object listing all the user calls into the resident routines.
Has anyone done anything like this before? What is the calling mechanism, so that I can create an object (an _def for inclusion in my program/modules) that gives me the call address/parameters for my program/modules?
Hopefully I have explained this adequately. I want to use this in my SD card OS. BTW I have overcome the problem of loading a binary without completely clearing out hub (the top reserved hub ram) and I know that I can leave some cogs running as required.
Comments
What you really want to do is to write your own object linker that would take, say, the output of the Sphinx compiler which already compiles the objects one by one into separate files, then links them into a single binary file. You could have a linker that would accept commands specifying which objects were provided already in memory and it would create the object table tree appropriately. It could even provide a fixup table that would give addresses for the object table blocks that might need load-time relocation to simplify that process. The dynamically supplied objects would not be included in the memory allocation of the binary file.
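To make the fixup table idea concrete, here is a minimal sketch in Python (only as a model, this is not Spin and not anything Sphinx produces). The image layout, the `apply_fixups` helper and the fixup list are all hypothetical: I am assuming each fixup is the byte offset of a 16-bit little-endian word that holds an address relative to the base the module was linked at.

```python
import struct
from typing import Sequence

def apply_fixups(image: bytes, fixups: Sequence[int],
                 link_base: int, load_base: int) -> bytes:
    """Return a copy of image with every word listed in fixups rebased.

    Hypothetical format: each fixup is the byte offset of a 16-bit
    little-endian address that was computed for link_base.
    """
    out = bytearray(image)
    delta = load_base - link_base
    for offset in fixups:
        (value,) = struct.unpack_from("<H", out, offset)
        # Hub addresses are 16 bits wide, so wrap the result to 16 bits.
        struct.pack_into("<H", out, offset, (value + delta) & 0xFFFF)
    return bytes(out)

# Made-up 8-byte module linked at $0000; the words at offsets 2 and 6
# are addresses, the other two words are plain data and must not change.
module = struct.pack("<4H", 0x1234, 0x0010, 0xBEEF, 0x0020)
relocated = apply_fixups(module, [2, 6], link_base=0x0000, load_base=0x7000)
print([hex(word) for word in struct.unpack("<4H", relocated)])
```

The point of keeping the fixups in a separate table is that the loader does not need to understand the module at all; it only adds the load offset to the words the linker flagged.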
What a brilliant suggestion! Thank you Mike.
I do partially understand the tree structure, and the three memory bases. So I know once loaded and running the spin cog and interpreter would continue through the code ok.
I can also get (and I have looked at) the listings that bst and homespun produce.
But the best part is to use the Sphinx compiler to produce what I am after!
Postedit: I just checked the output from a bst listing and I can get the table listing of the methods. This will simplify my work significantly.
Yes, it is possible to have relocatable spin objects and load them at runtime. However, calling them is a little complicated. You have to patch the object table in every object that calls the loaded object. The other problem is you need some way of not including objects that will be loaded at runtime in the main binary. Both of these are difficult to do without compiler support. I started to modify mpark's compiler but could only get half of what I wanted to do. If you have a look at the thread here http://forums.parallax.com/showthread.php?100406-DOL-Dynamic-Object-Loader you will find some code I wrote a couple of years ago and the problems I ran into.
I'm away from my home computer at the moment but can post more if you want tomorrow night (Australia time)
Can you take that further? Relocate that to address $9000?
Ah - $9000 is more than propeller ram. So - the spin code resides in external memory, in a flat 32 bit memory space. Load in the program as needed. Are the internal jumps valid (i.e. all relative rather than absolute)? Do you need a special "exit" command to jump back to hub spin? Would you add external memory from 32k up? Could you store less frequently used routines in this external memory?
For a simplistic example, the spin code has no object references in it. Maybe it is just a simple PUB that adds two numbers and returns the result. Could this be possible?
That would require a new spin interpreter. There are also limits in the current spin byte code that restrict how far away you could jump, how far away variables can be, etc.
Good points. Worth brainstorming.
Ok, keep all variables in a PUB local. They now are not too far away for that PUB. Further, if you call an "external memory" pub, instead of calling it directly, you call another PUB which then handles the jump to external ram.
Oh, and you probably want to know the length of a PUB if it is going to be relocated somewhere. Can a PUB report its own length?
Another technique I tried was to provide an object that contained stub methods that exactly matched the methods in the OS kernel. A user program could be built using the stub object. The stub methods would call a "remote call" method with an additional parameter that indicated the method number. The "remote call" method would then look up the method entry in the kernel's method table and call it. This worked OK, but the main drawback is that the stub object and the kernel object must contain matching method tables. If I made a significant change to the kernel I would have to rebuild all the apps with a new version of the stub object. Also, this technique only works with Spin programs and would make it difficult for PASM and C programs to call kernel functions.
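The stub idea can be modelled in a few lines of Python. This is only a sketch of the calling pattern, not the original Spin code: `Kernel`, `KernelStub`, `_remote_call` and the two example methods are names invented for illustration.

```python
class Kernel:
    """Stands in for the resident kernel object (hypothetical)."""

    def __init__(self):
        # Method table: the list index is the method number.
        # The order here must match the constants in KernelStub.
        self.method_table = [self.add, self.strlen]

    def add(self, a, b):
        return a + b

    def strlen(self, text):
        return len(text)


class KernelStub:
    """What a user program would be built against instead of the kernel."""

    ADD, STRLEN = 0, 1  # method numbers, must mirror Kernel.method_table

    def __init__(self, kernel):
        self._kernel = kernel

    def _remote_call(self, method_number, *args):
        # Look the method up by number in the kernel's table and call it.
        return self._kernel.method_table[method_number](*args)

    def add(self, a, b):
        return self._remote_call(self.ADD, a, b)

    def strlen(self, text):
        return self._remote_call(self.STRLEN, text)


stub = KernelStub(Kernel())
print(stub.add(2, 3), stub.strlen("spin"))
```

The drawback described above shows up directly: if the kernel's table is reordered, the stub's constants silently point at the wrong methods until every app is rebuilt against a new stub.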
A third approach would be to use a table of method pointers. I posted a MethodPtr object in the OBEX about a year ago that could be used for this purpose. A table of method pointers could be compiled for the kernel and put in a known location. User applications would then call a kernel function using the pointer in the table. This technique would also be limited to Spin applications, and would be difficult for PASM and C programs to use.
I can extract the method tables from bst and homespun via the listings. Then I can put these into an object as constants. Therefore, any modules will know where they are located.
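As an alternative to scraping the listings, the method table could be read straight from the compiled binary. The sketch below is Python and makes an assumption about the object layout that should be checked against a real bst or homespun listing: that the top object starts at $0010, that its header is a word (object size), a byte (number of method table longs, including the header long) and a byte (number of sub-objects), and that each method entry is a word offset from the object base followed by a word giving the size of the locals. `read_method_table` is a hypothetical helper name.

```python
import struct

def read_method_table(image, pbase=0x0010):
    """List (method_index, hub_address, locals_size) for one object.

    Assumed layout at pbase (verify against a compiler listing):
      word  object size in bytes
      byte  number of longs in the method table (header long included)
      byte  number of sub-objects
      then one long per method: word code offset, word locals size
    """
    _size, table_longs, _subobjects = struct.unpack_from("<HBB", image, pbase)
    methods = []
    for index in range(1, table_longs):
        code_offset, locals_size = struct.unpack_from("<HH", image, pbase + 4 * index)
        methods.append((index, pbase + code_offset, locals_size))
    return methods

# Synthetic image for illustration only: one object with two methods.
image = bytearray(0x30)
struct.pack_into("<HBB", image, 0x10, 0x20, 3, 0)   # size, table longs, sub-objects
struct.pack_into("<HH", image, 0x14, 0x0C, 0)       # method 1: offset $0C, no locals
struct.pack_into("<HH", image, 0x18, 0x14, 8)       # method 2: offset $14, 8 bytes of locals
table = read_method_table(bytes(image))
for index, address, locals_size in table:
    print(f"method {index}: ${address:04X}, locals {locals_size}")
```

The resulting list could then be written out as constants for the _def object, so the modules and the resident code are generated from the same build.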
Drac: Modules can in fact be loaded from anywhere - it is only necessary to modify the code Kye uses to do a reboot. I have (untested) modified it to selectively stop only the cogs required and leave the required driver cogs (and spin cogs) running. I have already modified it to limit hub clearing. The reload of lower hub is currently done from sd card file, but no reason why it could not come from sram (would be much faster!).
Dave: I have already struck the issue of having to recompile all modules if something changes. Ugh! I indeed thought about using a method number and a mailbox and tend to agree this is probably the best. Problem if the methods change, but guess I can live with that.
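For the method number plus mailbox approach, here is a small Python model of the handshake. Everything in it is an assumption for illustration: the mailbox address `MAILBOX`, the four-long layout, and the two example commands. On real hardware the resident cog would sit in a loop polling the command long; here the caller drives `service_once` itself so the example runs in one thread.

```python
import struct

hub = bytearray(32 * 1024)              # model of the 32 KB hub RAM
MAILBOX = 0x7FF0                        # hypothetical fixed mailbox address
CMD, ARG0, ARG1, RESULT = 0, 4, 8, 12   # byte offsets of the four longs

def rdlong(addr):
    return struct.unpack_from("<i", hub, addr)[0]

def wrlong(addr, value):
    struct.pack_into("<i", hub, addr, value)

# Method numbers understood by the resident code (0 means "mailbox idle").
HANDLERS = {1: lambda a, b: a + b,
            2: lambda a, b: a * b}

def service_once():
    """One polling pass of the resident cog: run a pending command, if any."""
    cmd = rdlong(MAILBOX + CMD)
    if cmd == 0:
        return False
    result = HANDLERS[cmd](rdlong(MAILBOX + ARG0), rdlong(MAILBOX + ARG1))
    wrlong(MAILBOX + RESULT, result)
    wrlong(MAILBOX + CMD, 0)            # clearing CMD signals completion
    return True

def call(cmd, a, b):
    """Caller side: write the arguments first and the command last."""
    wrlong(MAILBOX + ARG0, a)
    wrlong(MAILBOX + ARG1, b)
    wrlong(MAILBOX + CMD, cmd)
    while rdlong(MAILBOX + CMD) != 0:
        service_once()                  # stands in for the other cog running
    return rdlong(MAILBOX + RESULT)

print(call(1, 2, 3), call(2, 4, 5))
```

Because the interface is just longs at a known hub address, anything that can read and write hub ram could use it, which is the attraction for pasm and C callers. The method numbers still have to stay stable between the resident code and the modules.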
So, I thought I would ignore all the problems and get it working, see the issues, and then perhaps post the example and again ask for ideas/comments.
I do want all languages to run under this: spin, pasm, catalina c, prop basic. I will leave gcc for someone else, but I want it to run. It's out of my league, and there seems to be no interest from this group in my XMM, so I am giving them a miss atm.
It's the whole gcc project. Sorry, but that's the way I feel.
Since this thread started, Microchip have released their serial sram chip 23LCV1024. This is large enough to store 4x the program size that is in the propeller and only uses 4 pins.
So I am thinking of a simple board with 4 pins for this sram chip, 4 pins for an SD card, (say) 8 pins for VGA and 2 for the keyboard.
Some spin code is going to be 'resident' and some can be moved in and out of Hub. Things that would be resident would be the keyboard driver, probably most of the display driver and the core of the SD driver. Relocatable things might be string and math routines.
Grab a keyboard object, VGA object, SD object and SRAM object from the obex and combine them into a program.
Let's try to save some memory. First step would be to strip out all the cog code from the above and compile them separately and then preload all the cogs, then reboot the propeller with a new spin program. Rebooting is easy to do with code, and I think Cluso said above there are ways to keep the cogs ticking over. So we could save up to 14k of hub ram that way.
So this would be the core of the program, and maybe it compiles to 20k, leaving the top 12k free. In that space we might want to bring in temporary PUB routines. To avoid the complexities of object tables, define a generic PUB wrapper that takes a fixed number of variables. Maybe make it 3 or 4. If you want more, one of those variables can be a pointer to an array. So now all the calls look the same - a PUB call with 4 variables. Some of those won't be used but that won't matter.
Now recompile the entire program with that extra PUB added. Then remove that PUB, replace it with the next one and recompile. Take the resulting binary files and remove the lower 20k; what is left of each file is a block of binary data that should run when it is copied to memory location 20k.
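The "remove the lower 20k" step could be scripted along these lines. This is a Python sketch with made-up byte strings rather than real Propeller binaries, and `split_overlay` and `differing_offsets` are hypothetical helper names. It also checks something the recipe quietly relies on, namely that the resident part really is the same in every build.

```python
def split_overlay(image, resident_size):
    """Split a full build into (resident part, overlay part)."""
    return image[:resident_size], image[resident_size:]

def differing_offsets(a, b):
    """Offsets at which two resident parts disagree."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]

# Two made-up builds with an 8-byte "resident" part and different overlays.
RESIDENT_SIZE = 8
build_a = bytes([1, 2, 3, 4, 5, 6, 7, 8]) + b"\xAA\xBB"
build_b = bytes([1, 2, 3, 4, 5, 9, 7, 8]) + b"\xCC\xDD\xEE"

resident_a, overlay_a = split_overlay(build_a, RESIDENT_SIZE)
resident_b, overlay_b = split_overlay(build_b, RESIDENT_SIZE)
changed = differing_offsets(resident_a, resident_b)

print("overlay A:", overlay_a.hex(), "overlay B:", overlay_b.hex())
print("resident bytes that differ between builds:", changed)
```

With real binaries I would expect at least a few bytes near the start of the file to differ between builds (the header checksum, for one), so the list of differing offsets is worth looking at before trusting that an overlay built against one image will run on top of another.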
Maybe this next step is not necessary, but copy all those binary files from SD card to SRAM. This might be faster, maybe not. It also might mean there is a way to remove the SD driver code and free up some more space. Load them in as needed.
Off to study Overlay Programming... http://en.wikipedia.org/wiki/Overlay_(programming)
and more on the concept http://www.ojodepez-fanzine.net/network/qbdl/16-mb-exes.html