Making PropGCC programs more OS friendly
Dave Hein
Posts: 6,347
There was a thread a few weeks ago about booting a PropGCC binary on a cog other than cog 0 ( http://forums.parallax.com/showthread.php?137945-Booting-and-Cog-selection ). That issue was resolved, but I've encountered another issue that I'd like to raise. crt0_lmm.s compares the value of PAR to 0x8000 to determine whether it is starting up a program or starting a thread in another cog. When starting a C program from an SD file, it would be useful to specify an address other than 0x8000 for the stack. This would leave space for the argv parameters and other OS information, and would allow Spin programs to run in high memory at the same time.
I'd like to request that we change crt0_lmm.s so that it tests for a non-zero lock value at C_LOCK_PTR to determine whether a thread is starting or not. Unfortunately, the lock number at boot up will be lock 0, so we would need to set another bit at C_LOCK_PTR to ensure that it's non-zero after allocating the lock. If we remove the 0x00008000 constant at r14, we could add an "or __TMP0, #0x100" instruction after the locknew instruction. This bit should not interfere with the code that uses the lock at __CMPSWAPSI. If need be, we could change the rdlong to a rdbyte in that routine so it doesn't pick up the extra 1 at bit 8.
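As a rough illustration of the flag-bit idea, here is a small C model of the value that would be stored at C_LOCK_PTR. The helper and macro names are made up for this sketch; only the 0x100 flag and the idea of reading just the low byte come from the post. The real change would be a couple of PASM instructions in crt0_lmm.s, not C.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for illustration; only the 0x100 value is from the post. */
#define LOCK_INIT_FLAG 0x100u   /* bit 8, OR'ed in after locknew            */
#define LOCK_ID_MASK   0xFFu    /* what a rdbyte of the stored word returns */

/* Word to store at C_LOCK_PTR once lock `lock_id` has been allocated. */
static uint32_t make_lock_word(unsigned lock_id)
{
    assert(lock_id < 8);        /* the Propeller has 8 hardware locks */
    return (uint32_t)lock_id | LOCK_INIT_FLAG;
}

/* A non-zero word means the runtime is already up: this is a thread start. */
static int is_thread_start(uint32_t lock_word)
{
    return lock_word != 0;
}

/* Recover the lock number, ignoring the flag bit. */
static unsigned lock_id_of(uint32_t lock_word)
{
    return lock_word & LOCK_ID_MASK;
}
```

With this encoding, lock 0 is stored as 0x100, so the word is non-zero even though the lock number itself is zero.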
I tested this idea, and I am able to run a stand-alone C program with a stack value different from 0x8000. My next step is to add argv information. I was happily surprised to see that the startup code for a PropGCC program already supports argc and argv. All it needs is an array of pointers that's terminated by a NULL pointer.
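For reference, a NULL-terminated argument list can be walked as in the sketch below. The helper count_args is hypothetical and is not taken from the PropGCC startup code; it only shows how argc can be derived from such an array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: count the entries in a NULL-terminated argument list. */
static int count_args(char *const argv[])
{
    int argc = 0;

    assert(argv != NULL);
    while (argv[argc] != NULL)  /* stop at the terminating NULL pointer */
        argc++;
    return argc;
}
```

An OS loader would build the array of pointers in hub memory above the program's stack and hand its address to the startup code.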
Comments
There are three places where I patch the C program's memory image before I start it. The first patch replaces the instruction at cog location 2 so that it tests for a non-zero lock value instead of testing the PAR register for the value $8000, which is used for the stack. This allows me to specify a stack value other than $8000.
The second area I patched was to put the address of the arg list in the startup code. This is stored in the third long after the entry address. The third patch was to add an address to the exit code. There is an existing hook for this, and the address is located at the first long after the exit address. I use two different exit routines depending on whether the program is run as a spinix app, or run in the stand-alone mode. In the stand-alone mode I reboot the Prop on exit, which reloads spinix from the EEPROM.
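The three patches could be modeled roughly as follows. This is only a sketch: the struct, its field names, and the assumption that image[0] corresponds to cog location 0 are mine, and the actual instruction encoding and addresses would have to come from the loader.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* All fields are hypothetical inputs that the loader would have to supply. */
struct patch_info {
    size_t   entry_index;    /* index of the long at the entry address     */
    size_t   exit_index;     /* index of the long at the exit address      */
    uint32_t test_insn;      /* replacement instruction for cog location 2 */
    uint32_t arg_list_addr;  /* hub address of the NULL-terminated argv    */
    uint32_t exit_addr;      /* hub address of the exit routine            */
};

/* Apply the three patches to an image held as an array of longs. */
static void patch_image(uint32_t *image, size_t nlongs,
                        const struct patch_info *p)
{
    assert(nlongs > 2);
    assert(p->entry_index + 3 < nlongs);
    assert(p->exit_index + 1 < nlongs);

    image[2] = p->test_insn;                       /* patch 1: startup test */
    image[p->entry_index + 3] = p->arg_list_addr;  /* patch 2: arg list     */
    image[p->exit_index + 1] = p->exit_addr;       /* patch 3: exit hook    */
}
```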
I'll post an update to spinix in the spinix thread once I clean up the code a bit, and add a few more features. It's pretty cool to be able to run a C app just like one of the Spin apps. The only restriction is that only one C app can run at a time, but one or more Spin apps can be running at the same time as the C program.
That's great Dave!
I have an application that can benefit from this.
Thanks.
--Steve
Thanks.
We will look at this. Meanwhile, please use Dave Hein's method.
Can you post your version for evaluation?
Thanks.
Yes, sure. Please find the patch attached. As I mentioned before, I based the stack pointer on the __hub_end value.
EDIT: I just realized a better way to patch the startup code is just to set the value of R14 to match the stack address I pass in PAR when starting up a C app. I was patching the cmp instruction at R2, but patching R14 is more straight-forward.
Yes, I define it at link time. Each new image has its own linker script with specific settings for the hub memory region, so the script is unique to that image.
I'd like to see a linker flag that lets the stack address be specified.
For now I can edit crt0_lmm.s
C.W.
Would something like this work?
propeller-elf-gcc -mlmm Hello.o -Xlinker --defsym -Xlinker __hub_end=0x1234 -o a.out
This sounds like a great suggestion for per program stack.
Changing the standard crt0_lmm.s file is still under consideration.
Thanks.
I assume that goes along with using your version of crt0_lmm.s. I'll give it a try.
C.W.
In the OS case that is true, but in my specific case the GCC LMM program is the boot application for the prop and I want to reserve space above the stack for mailboxes for Cog to Cog communication.
Once the C program loads up the other cogs it will be wiped out to allow the HUB ram below the mailboxes to be used as "RAM" for the system that the prop will be emulating.
It would really be nice in cases like this to be able to set the stack in code with a pragma or something so that others compiling the code don't need to worry about it in compiler and linker settings.
C.W.
I found your attachment with the patched file, but where does it go in my propgcc installation? I can't find a crt0_lmm.s file.
C.W.
Hi.
I see where you want to go with this now. Hope you have success creating a service point database for the communications end-points or whatever approach you choose to take.
Wish I had seen this earlier today, then I could help generate a linker script file that will set aside memory at the top of hub space for you.
One of the great advantages of GCC is the linker tool. It allows you to map memory the way you want it to be without needing to ask some compiler guy for a change. I'm out of time just now and will be unavailable almost all day tomorrow because of travel and meeting time.
If someone else doesn't get around to helping create a linker file for you for this purpose then I will on Thursday.
The crt0_lmm.s file is a startup file. It can be over-ridden, but I can't remember how just now.
Thanks,
--Steve
Thanks Steve. No big hurry, I only get a few hours here and there to work on this right now and this isn't a show stopper.
Just knowing that it can be done is enough to let me move on with other parts of the program.
This 1802 emulator is just one of those mountains I've decided to climb and it gives me a good reason to work with propgcc.
Thanks again,
C.W.
The startup files are in the propgcc/gcc/gcc/config/propeller/ directory, and get built along with the compiler. For testing a change in them, though, rebuilding the whole compiler is definitely overkill! You can just compile crt0_lmm.s on its own by doing something like:
You then need to copy the new _crt0.o to $PROPGCC/lib/gcc/propeller-elf/4.6.1/ and $PROPGCC/lib/gcc/propeller-elf/4.6.1/short-doubles, where "$PROPGCC" is where you've installed the compiler binaries.
Eric
Another alternative if you don't want to mess up your distribution: You can use the -nostartfiles option to propeller-elf-gcc and propeller-elf-g++. If you use this then you need to include all the startup files that the normal linker would (_crt0.o, _crtbegin.o, hubstart_xmm.o, _crtend.o) and perhaps in the correct order. But this allows you to pick and choose between local files and the files from the distribution.
C.W.
It looks like the changes suggested by d2rk would work for my needs, but since they ignore the value passed in PAR, the ability to run non-primary cogs is lost.
It seems like the changes as suggested by Dave Hein along with an ability to set a stack address of other than 0x8000 in the loader would solve both the "OS Friendly" issue, and the issue of a stack below 0x8000 for LMM apps loaded by the standard loader.
C.W.
C.W. Attached is a demo that lets you use $7F00, $7F20, $7F40, etc... symbolically for mailboxes using a linker script.
The linker mapping can be further changed for your needs - look at the bottom of the file.
Just in case it's not clear at first glance what the program is doing: The end result is it toggles pins 12, 13 at different rates.
The way it works is by starting the same COG C program on 2 different COGS using mailbox1 and mailbox2 for communications.
This and similar programs should be used with some method of setting the stack start under the mailbox area.
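To make the constraint concrete, here is a small sketch of the address arithmetic. The $7F00 base and $20 spacing are read off the example addresses above; the macro and function names are hypothetical, and the check assumes the stack grows downward from its starting address, as it does from the default 0x8000.

```c
#include <assert.h>
#include <stdint.h>

#define MAILBOX_BASE   0x7F00u  /* first mailbox, from the example addresses */
#define MAILBOX_STRIDE 0x20u    /* spacing between $7F00, $7F20, $7F40, ...  */
#define HUB_RAM_END    0x8000u  /* end of the 32 KB hub RAM                  */

/* Hub address of mailbox n (hypothetical helper). */
static uint32_t mailbox_addr(unsigned n)
{
    uint32_t addr = MAILBOX_BASE + n * MAILBOX_STRIDE;

    assert(addr + MAILBOX_STRIDE <= HUB_RAM_END);  /* must fit in hub RAM */
    return addr;
}

/* A downward-growing stack is clear of the mailboxes only if it starts
   at or below the mailbox base. */
static int stack_top_is_safe(uint32_t stack_top)
{
    return stack_top <= MAILBOX_BASE;
}
```

With the default stack start of 0x8000, stack_top_is_safe returns 0, which is the overlap problem this thread is trying to solve.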
--Steve
Hmmm, come to think of it the existing LMM code can be used as is to launch a cog at a different stack address -- just make sure that PAR is pointing at (stacktop-12), that (stacktop-12) contains the default entry point (the symbol "entry"), (stacktop-4) contains the correct default value for __TLS as found in misc/thread.c, and that lock 0 has been allocated and is available for use by the program.
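The layout described above can be sketched in C against a simulated hub memory. This is a model only: build_start_frame is a hypothetical helper, hub RAM is represented as an array of longs indexed by address/4, and the long at (stacktop-8) is left untouched because the post does not say what belongs there.

```c
#include <assert.h>
#include <stdint.h>

/* Fill in the words the existing LMM startup code expects and return the
   value to pass in PAR. `entry` stands for the address of the "entry"
   symbol and `tls` for the default __TLS value from misc/thread.c. */
static uint32_t build_start_frame(uint32_t *hub, uint32_t stacktop,
                                  uint32_t entry, uint32_t tls)
{
    assert(stacktop % 4 == 0 && stacktop >= 12);  /* long-aligned frame */

    hub[(stacktop - 12) / 4] = entry;  /* (stacktop-12): default entry point */
    hub[(stacktop - 4) / 4]  = tls;    /* (stacktop-4):  default __TLS value */
    return stacktop - 12;              /* PAR must point at (stacktop-12)    */
}
```

Lock 0 still has to be allocated before the cog is started; that step is outside this sketch.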
I was thinking about this while coming down stairs this morning. Yes, the stack will overwrite the mailboxes at some point.
So, what's the right solution?
People in Propeller land want to use things like in the demo I attached for better or worse.
Your hosted environment is a fine generic solution to me.
The "right solution" I'm seeking has to do with GCC in stand-alone modes like C.W. needs.
For the program I posted, either crt0_lmm.s will need to change to set the stack, or the loader will need to patch something.
Can't we just read __hub_end from the linker script to set the stack for this case?
I'm not inclined to see any change to crt0_lmm.s at this point unless it is a good general solution.
As for me, that would be a totally fine solution.
This may be a way, but it makes my head hurt
How do you feel about changing crt0_*.s by default to read __hub_end from the linker rather than hard-coding to 0x8000? Did you say this is not possible?
This is a little OT .... In a general sense we need a very simple way to do this for a COG without threading. This was not very obvious until I got a chance to port some ICC code. It's not multi-threaded of course, and ICC did the reverse of what we do: it used COGNEW to launch an LMM program and COGNEW_NATIVE to launch a PASM program. The actual meaning of COGNEW doesn't seem to matter too much, and using it to just launch PASM is fine. I guess it would be nice if we had an LMMNEW or something that just took the start address of a function and a stack for running LMM code non-threaded. It may just be another onion layer though - yikes. Any thoughts?