VOL - The Missing SPIN Keyword
Brian Fairchild
Posts: 549
(for more background see my 'frustrations' topic)
At run time I want the following...
8 COGs each running their own PASM code
32k of data in HUB RAM
Which is quite achievable by the silicon. But not by the tools it appears.
The problem is DAT and its 'multi-purpose' nature. It is used to flag...
data-tables
run-time workspace
PASM block
The first two are fixed and need to remain; the last one might or might not need to be fixed depending on if you want to reload your COGs. At the moment all of them are fixed into the 32k boot memory image.
This means that the best I can achieve is 32,768 - ( 8cogs * 496longs * 4bytes per long) = 16,896 bytes of data. It's actually smaller than this due to some system overheads.
If SPIN had a new keyword VOL(atile) to flag areas that are only needed to load COGs then the remaining space could be used by the tools for additional VAR space.
There would need to be stated restrictions on how long things stayed valid but as most COGs get loaded at system start I don't see that as a huge problem.
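The budget above can be checked with a few lines of arithmetic. This is only a sketch in Python (not Spin); the constant and function names are made up for illustration, and it ignores the system overheads the post mentions.

```python
# Hub RAM budget when all eight full-size cog images stay in the boot image.
HUB_RAM_BYTES = 32 * 1024        # 32,768 bytes of hub RAM
COGS = 8
LONGS_PER_COG_IMAGE = 496        # loadable longs per cog
BYTES_PER_LONG = 4


def bytes_left_for_data(cogs=COGS, longs_per_image=LONGS_PER_COG_IMAGE):
    """Return the hub bytes left once every cog image is stored in hub RAM."""
    pasm_bytes = cogs * longs_per_image * BYTES_PER_LONG
    return HUB_RAM_BYTES - pasm_bytes


print(bytes_left_for_data())     # 16896
```

With a VOL-style keyword the second term would, ideally, stop counting against run-time data once the cogs are loaded.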
Comments
A slightly different approach would be to use the VOL option to put the PASM block at the end of the Spin Stack space.
Anyway, a simple work around is:
Then, after you launch the cog, you can use my_array as if you had declared it as
Assuming (as in your case) that the PASM block is the full size.
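The idea of reusing a PASM image as an array once the cog is running can be modeled outside the Propeller. The sketch below is Python, not Spin, and every name in it (`hub`, `load_cog`, `image_start`) is hypothetical; it only illustrates that a cog load copies the image, so the hub copy is free for other use once the load has finished.

```python
# Toy model: hub RAM as a list of longs, a cog load as a copy out of it.
HUB_LONGS = 8192                 # 32 KB of hub RAM, counted in longs
COG_IMAGE_LONGS = 496            # full-size PASM block, as in the post


def load_cog(hub, image_start):
    """Copy one cog image out of hub RAM, standing in for a cog launch."""
    return list(hub[image_start:image_start + COG_IMAGE_LONGS])


hub = [0] * HUB_LONGS
image_start = 1000               # hypothetical hub address of the PASM block
for i in range(COG_IMAGE_LONGS):
    hub[image_start + i] = 0xA0000000 + i   # stand-in for PASM instructions

cog_ram = load_cog(hub, image_start)        # the cog now holds its own copy

# Only after the load has completed: treat the same hub longs as my_array.
for i in range(COG_IMAGE_LONGS):
    hub[image_start + i] = 0

cog_still_intact = cog_ram[5] == 0xA0000005
hub_reused = hub[image_start + 5] == 0
print(cog_still_intact, hub_reused)         # True True
```

The cost is that the cog cannot be reloaded from that block later, because the image in hub RAM is gone.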
Since hub ram is continuous, I don't see where the issue of hard coding your data tables is. Do you really need the compiler to assign the addresses?
Perhaps you should compile your program under bst or homespun and enable the listing so you can see what is compiled.
Isn't that why we use HLLs? I don't really care where my variables are. I just want to be able to access them.
I've made a simple test for the worst case scenario...
8 'cog' files each containing a start method which just issues a cognew followed by 496 longs of PASM. Total of 500 longs for each file.
1 'main' file which calls the 8 object.start methods and allocates a pile of buffers in a VAR block.
If I allocate 4096 longs of VARS the tool reports 4,022 longs of program and 4,096 longs of variables leaving me 70 longs of stack/free. Try to allocate any more than 54 of those remaining longs and the tool complains.
So in this extreme case, at run time, I have 4,000 longs that I can't directly use.
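The figures from that test can be tallied as follows. This is a Python sketch using only the numbers quoted in the post; the variable names are illustrative.

```python
# Tally of the worst-case test, in longs.
TOTAL_HUB_LONGS = 32 * 1024 // 4     # 8,192 longs of hub RAM
cog_files = 8
longs_per_file = 500                 # 496 longs of PASM plus the start method
program_longs = 4022                 # as reported by the tool
variable_longs = 4096                # as reported by the tool

pasm_file_longs = cog_files * longs_per_file
remaining = TOTAL_HUB_LONGS - program_longs - variable_longs

print(pasm_file_longs)               # 4000
print(remaining)                     # 74
```

The plain subtraction gives 74 longs where the tool reported 70; the thread does not say what accounts for the four-long difference. Either way, roughly 4,000 longs are tied up in cog images that are not needed after start-up.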
I might do that although, TBH, I'm thinking that SPIN is just not going to cut it for me.
DAT has a double life. As a Spin construct it is used for data definitions that are shared among all instances of an object. Note that each instance gets its own copy of the VAR areas.
It just so happens that PASM code is considered to be data shared among its object instances, which is reasonable, so it has been shoved into the DAT section.
This gives rise to the situation that in a "normal" program having loaded a COG that PASM space is never used again and is considered a waste of space, which it is.
Recycling the space is not so straightforward: 1) you may want to reload the cog with different codes, so trampling on the PASM is a bad idea; 2) you may want multiple instances of the object, all expecting to load the same code, so trampling on it is a bad idea.
Personally I would have preferred that PASM code go into an "ASM" section. After all, PASM code is fundamentally different from the rest of the Spin language. Its content cannot be accessed from Spin once it is loaded to a COG; it uses a different address space.
"ASM" sections would be named as PUB routines are, defining a code block for a single cog load.
After that all we need is a way to specify that this "ASM" section is only going to be loaded once at start up, after that it can be overwritten. The compiler could then put such sections at the end of the memory in spin stack space.
Also there is PZST, which is open source and would allow you to entirely do your own thing. Don't let the debate herein bog you down.
IDE forks are not just tolerated, they are openly welcomed.
http://www.microcodes.info/pzst-an-open-source-propeller-ide-in-development-12470.html
I'm liking that idea.
As a default, putting ASM at the top of the 32k image is a good place to start for many applications.
For 'power' users there needs to be a way to claw even more of it back though. You almost need another type of VAR, or at least a qualifier, which says 'This is a variable but should be overlaid on top of the ASM space and I will make sure my code doesn't access it until I've finished with the ASM block.'
Do you want to use a HLL (Spin) or code all in PASM ? I can see the problem you describe if you mix Spin and PASM, then you get scattered DAT and Spin blocks in the hub memory.
But if you code all in PASM this simplifies it dramatically (for the compiler, not for you). You no longer have a HLL. You have a short Spin starter PUB and then only DAT sections (all the other sections relate to Spin).
These DAT sections can contain initialized data arrays or PASM code. It's now up to you to order them so that the DAT blocks you will not overwrite come first, and the reusable PASM DAT blocks come later.
In PASM most variables are registers in cog ram, you only access hubram for big arrays, buffers, initialized tables, shared variables between cogs and so on. It can be a bit tricky to define the addresses for that, but I have shown a way in your other thread.
Andy
Taking my own topic off-topic...
Hi Andy,
I bounce between all 8 COGs running PASM and 7 COGs running PASM plus one running a SPIN debug monitor and doing its own low-speed serial interface.
The application lends itself so well to the Prop's architecture...
1 (2?) COGs receiving multiple DMX streams
1 COG transmitting multiple DMX streams
4/5/6 COGs processing the data
?1 COG as a supervisor?
The processing is all about reading data from 512 bytes arrays, shuffling it about a bit using various array indices and intermediate arrays and then putting it all in a number of 512 byte output buffers.
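A minimal sketch of that kind of index-driven shuffle, in Python rather than PASM. The function name `patch_universe` and the reversing patch table are invented for illustration and are not taken from the actual product.

```python
# Route one 512-byte DMX input buffer to an output buffer via an index table.
DMX_CHANNELS = 512


def patch_universe(source, patch):
    """Build an output buffer where output[i] = source[patch[i]]."""
    if len(source) != DMX_CHANNELS or len(patch) != DMX_CHANNELS:
        raise ValueError("buffers must hold exactly 512 entries")
    return bytes(source[index] for index in patch)


source = bytes(i % 256 for i in range(DMX_CHANNELS))   # dummy input levels
reverse_patch = list(range(DMX_CHANNELS - 1, -1, -1))  # hypothetical patch table
out = patch_universe(source, reverse_patch)

print(len(out), out[0], out[511])                      # 512 255 0
```

Each output buffer depends only on its inputs and its own patch table, which is why the work splits cleanly across several processing COGs.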
The processor COGs can be true parallel processors.
I've written an object that can transmit 8 DMX streams in a single COG - what 44-pin commodity uC has 8 USARTs !?
I'm pretty much resigned to having to do my own variable address management across multiple files.
Yes, it would. The variable would not be usable (writable) for a certain amount of time after startup. There is no previously: both the variable and the PASM block are using the same memory location at startup, which means that code has to wait before it can write to a variable.
This solution might be useful if the Propeller supported dynamic memory (such as C's "new" keyword), but that's not the case. All variables on the Propeller are static: they're defined at compile time.
Maybe not full, but I use 300 longs of cog RAM for a color buffer in my VGA driver. A short peek/poke (names from the Basic language...) PASM procedure can get a value from hub RAM or put it there. These procedures are about 20 longs in size, so you can keep about 470 longs in one unused cog.
BTW, Spin does support dynamic memory allocation. You just need to use one of the Spin memory managers in the OBEX. The CLIB object has a malloc method that works just like the C version.
But is that not exactly what we have now ?
You define your PASM and 'waste' the space if you need to reload it OR just double define that space, for buffers or variables as you wish. Any DAT section is accessible by Spin and PASM and the compiler takes care of the translation between label and hub address?
And if you do so you can start using them after starting the PASM cog, as SRLM shows in post #2.
what am I missing?
Enjoy!
Mike
@SRLM - #2 - thanks for the tip
@SRLM - #14 - I agree but I think having to not use an area of variables until after COGs have initialised would not be huge task. Most code I've looked at seems to set up the COGs as one of the first tasks.
@msrobots - #17 - unless I'm missing something the compiler doesn't quite 'take care' of the address translation as the issue of scope across objects comes into play.
@Cluso99 - #18 - I think I'm more likely to use an EEPROM to store the COG code. Not being able to use a system because someone had lost the SD card might cause my customers a problem.
Once again - thanks guys.
Given an addition to a spin compiler like the open source version created by Roy and a capable loader, VOL sections could be gathered into one blob at the end of code. The blob can then be written to the upper 32KB of an EEPROM by the loader (this is possible today).
We can implement Brian's VOL in Spin, but it takes work and it is a language extension. Why does the concept of VOL make sense? 1) The program defines COG services - no program reboot required. 2) We don't have to define some so far elusive operational standard. 3) It shouldn't be that hard to add as an extension to Roy's open source Spin compiler.
We implemented a model like VOL in Propeller-GCC called ECOG because it was the most flexible and easiest thing to do (no-brainer for the user and easy for the developer). The down side is that it requires an I2C loader COG to allow reading code from the EEPROM (I2C buffer space and COG recyclable).
The so called "OS" approach (come on, Linux is an OS) solves this problem without requiring changes in the language of course, but that's not so easy either. Part of the problem is that people seem to insist that applications need to run on a Propeller "OS" like on a PC. I don't buy that.
So to have an effective use of an "OS" (system loader) for applications, one must have some predefined things and extra baggage. Such predefined things have been discussed ad nauseam with no agreement even among pals. A standardized system loader model is possible of course, it's just elusive at best. I'm not necessarily against a system loader approach, but it doesn't need to be overburdened by the requirement to be an "OS" either.
The advantage to a system loader approach is in saving program space as long as support is not onerous. As long as things can be made easy it's fine. I'm tired of Propeller onion peeling.
-Phil
For some reason I got the idea that Brian wanted to have 16KB or so of COG code, and that the rest of the program could be used for SPIN.
"Which is quite achievable by the silicon. But not by the tools it appears."
It's possible to reset the stack pointer to the beginning of program 2 by modifying the header values in program 1, and then restarting cog 0 with the new header. When program 1 runs the second time it would use a flag to skip the initialization code that was executed in the first step. This technique requires that program 2 is the last object in the executable image. I think this can be done by creating a dummy object that contains the FILE directive, and referencing that as the last object listed in the top object.
It's just a simple matter of programming.
EDIT: We may not even need to split the application into 2 programs if we can ensure that all of the cog startup objects are grouped together. All of the startup objects could be referenced by one object, which could return the object start address back to the top object.
Other missing Keywords are:
Stack ORG: ---- Explains itself.
HUB ORG:
For placing COG DAT parts anywhere in HUB
@Brian, I know what you are saying - I've also been frustrated that so much precious ram gets wasted using the "standard" software solution.
A few workarounds
1) "De-objectify" your code. Take all the objects you are including and copy and paste the code into your main program. First thing is it probably won't compile due to common variable names (that are separate when contained within an object) but you can fix that with a search and replace. Once you have done this, you have control over all the DAT sections and you can do things like group them all together to make one big block. Then recycle that space for arrays or a screen buffer when the cogs have loaded.
2) Don't use Spin. When Pullmoll put an entire Z80 computer in the Propeller, one of the messages that came up in the bootup sequence was "Goodbye Spin!". The first time I saw that, I thought - "he can't do that". But of course you can. You just need to define some rendezvous points for all the cog PASM, and make sure you pass contiguous blocks of data to cogs so you can point to them with just one long. Of course, this is not how objects in the OBEX work, so you have to code on your own.
3) Use other languages. GCC for instance can have the program on an SD card and cache it through some of the hub ram. That means the program can be megabytes in size.
4) Use a different screen. Up until recently I always wanted to use as much as possible of hub ram for a screen buffer - more memory = better resolution. I have been much happier since I found touchscreen modules for $17 with an SD card socket as a bonus, and all the memory for a 240x320 full color display is on the touchscreen. That frees up lots of hub memory, plus no need for video drivers either.
5) Use external memory. Many serial and parallel solutions out there.
6) Chain programs. This is where I am at now - I have a Desktop program with a bunch of icons on the screen. Touch an icon and it loads a brand new program off the SD card (eg a Calculator or a Picture Viewer). Kye added this in his SD driver code and the ability for one program to Chain another is extremely useful. I've also added a "warm boot" feature which reloads the original Desktop out of external ram, so the delay for running a program or going back to the desktop is only about a second. I also store bitmaps and strings in external memory.
7) Compact programs by saving space using BST instead of the proptool, which eliminates unused methods.
8) Pull existing objects to bits and only use the code you need (with acknowledgements to the original author).
9) Split up Obex code into the high level language driver and the pasm part, and put the pasm part in an eeprom or sd card and load all cogs through one common 496 long area. I did this for about 10 objects but there are some that are too complicated to do.
I guess all the above are "advanced" programming techniques. Though #6 is the one I've settled on because it uses the standard prop tool and most programs are only using about half the available memory even with an SD driver and I don't have to think so hard as when using some of the other techniques *grin*.
Thanks for the thoughts.
It's likely to be 3) + 5) for me.
Having poked around for the last couple of days I can completely see where SPIN 'is' and accept it for what it is. I'm of the very firm opinion that compilers etc are tools that do what they do. If I buy a hammer and find that it's not right for the job in hand I buy a different hammer rather than try to modify the one I have. There are after all loads of different hammers to choose from and there's bound to be one I like. And you can never have too many hammers.
"To a man with a hammer every problem looks like a nail."
First step is to port the existing product's code over to C + ASM. It's due an annual 'polish' so it's a good time to start.
Thanks again.
Brian
Electrodude