Making PropGCC programs more OS friendly

Dave Hein · 2012-03-07 13:07

There was a thread a few weeks ago about booting a PropGCC binary on a cog other than cog 0 ( http://forums.parallax.com/showthread.php?137945-Booting-and-Cog-selection ). That issue was resolved, but I've encountered another issue that I'd like to raise. crt0_lmm.s compares the value of PAR to 0x8000 to determine if it is starting up a program or starting a thread in another cog. It would be useful when starting a C program from an SD file to specify an address other than 0x8000 for the stack. This would leave space above the stack for the argv parameters and other OS information, and would allow Spin programs to run in high memory at the same time.

I'd like to request that we change crt0_lmm.s so that it tests for a non-zero lock value at C_LOCK_PTR to determine whether a thread is starting or not. Unfortunately, the lock number at boot up will be lock 0, so we would need to set another bit at C_LOCK_PTR to ensure that it's non-zero after allocating the lock. If we remove the 0x00008000 constant at r14 we could add an "or __TMP0, #0x100" instruction after the locknew instruction. This bit should not interfere with the code that uses the lock at __CMPSWAPSI. If need be, we could change the rdlong to a rdbyte in this routine so it doesn't get the extra 1 at bit 8.
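As a rough illustration of the flag-bit idea, here is a sketch in plain C rather than the actual crt0_lmm.s assembly. The name LOCK_VALID_FLAG and the three helper functions are made up for this example. Propeller lock IDs run from 0 to 7, so OR-ing in bit 8 keeps the stored word non-zero even for lock 0, and masking with 0xff (the effect of using rdbyte instead of rdlong) recovers the ID.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical name for bit 8, matching the "or __TMP0, #0x100" suggestion. */
#define LOCK_VALID_FLAG 0x100u

/* Word that would be stored at the lock pointer after locknew succeeds. */
static uint32_t encode_lock_word(uint32_t lock_id)
{
    assert(lock_id < 8);            /* the Propeller has 8 hardware locks */
    return lock_id | LOCK_VALID_FLAG;
}

/* A non-zero word means a lock was already allocated, i.e. a thread start. */
static int lock_is_allocated(uint32_t word)
{
    return word != 0;
}

/* Same result as reading only the low byte of the stored word. */
static uint32_t lock_id_of(uint32_t word)
{
    return word & 0xffu;
}
```

With this encoding, lock 0 is stored as 0x100, which tests as allocated and still decodes to ID 0, while a freshly loaded image holding 0 tests as not allocated.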

I tested this idea, and I am able to run a stand-alone C program with a stack value different than 0x8000. My next step is to add argv information. I was happily surprised to see that the startup code for a PropGCC program already supports argc and argv. All it needs is an array of pointers that's terminated by a NULL pointer.

Dave Hein · 2012-03-12 10:49

I can now run LMM C programs from an SD file under spinix just like it is able to run Spin programs. I can pass command-line arguments to the C program using the standard argc and argv variables, and the program terminates correctly by using the exit hook facility to call a termination routine. My C runner code is shown below. It is written in Spin, and resides in the OS kernel at high memory.

There are three places where I patch the C program's memory image before I start it. The first patch replaces the instruction at cog location 2 so that it tests for a non-zero lock value instead of testing the PAR register for the value $8000, which is used for the stack. This allows me to specify a stack value other than $8000.
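The first patch boils down to splicing the 9-bit source field of one instruction into another. A minimal sketch in C, assuming the kernel image is viewed as an array of 32-bit longs; the function names are invented for this example, and the instruction values used to exercise it would be arbitrary placeholders, not real opcodes.

```c
#include <assert.h>
#include <stdint.h>

/* The source operand of a Propeller 1 instruction is its low 9 bits. */
#define SRC_FIELD_MASK 0x1ffu

/* Combine a template instruction (empty source field) with the source
   field borrowed from a donor instruction. */
static uint32_t splice_source(uint32_t template_insn, uint32_t donor_insn)
{
    assert((template_insn & SRC_FIELD_MASK) == 0);
    return template_insn | (donor_insn & SRC_FIELD_MASK);
}

/* Apply the splice to a kernel image: overwrite cog location 2, taking
   the source address from the instruction at cog location 5. */
static void patch_lock_test(uint32_t *kernel, uint32_t template_insn)
{
    kernel[2] = splice_source(template_insn, kernel[5]);
}
```

The donor instruction is only read, so the rest of the kernel is left unchanged.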

The second area I patched was to put the address of the arg list in the startup code. This is stored in the third long after the entry address. The third patch was to add an address to the exit code. There is an existing hook for this, and the address is located at the first long after the exit address. I use two different exit routines depending on whether the program is run as a spinix app, or run in the stand-alone mode. In the stand-alone mode I reboot the Prop on exit, which reloads spinix from the EEPROM.
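The second and third patches can be sketched in C as follows. This is only an illustration of the offsets described above, assuming the loaded image is addressed as an array of longs starting at hub address 0. START_ADDR and the register numbers 15 (lr) and 17 (pc) follow the runner's constants; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define START_ADDR 0x20u  /* hub address where the LMM kernel image begins */
#define REG_LR     15u    /* kernel register holding the exit routine address */
#define REG_PC     17u    /* kernel register holding the entry address */

/* Read the initial value of a kernel register from the image. */
static uint32_t kernel_long(const uint32_t *image, uint32_t reg)
{
    return image[START_ADDR / 4 + reg];
}

/* Store the arg list address in the third long after the entry address,
   and the exit hook address in the first long after the exit address. */
static void patch_hooks(uint32_t *image, uint32_t arg_list, uint32_t exit_hook)
{
    uint32_t entry     = kernel_long(image, REG_PC);
    uint32_t exit_addr = kernel_long(image, REG_LR);

    assert(entry % 4 == 0 && exit_addr % 4 == 0);
    image[entry / 4 + 3]     = arg_list;
    image[exit_addr / 4 + 1] = exit_hook;
}
```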

I'll post an update to spinix in the spinix thread once I clean up the code a bit, and add a few more features. It's pretty cool to be able to run a C app just like one of the Spin apps. The only restriction is that only one C app can run at a time, but one or more Spin apps can be running at the same time as the C program.

CON
  SEEK_SET       = 0
  SEEK_END       = 2
  KILL_COMMAND   = 25
  STACK_SIZE     = 800
  START_ADDR     = $20
  FILE_NOT_EXIST = -3
  OUT_OF_MEMORY  = -2

  ' LMM Kernel Registers
  r0 = 0
  r1 = 1
  r2 = 2
  r3 = 3
  r4 = 4
  r5 = 5
  lr = 15
  pc = 17

OBJ
  c   : "cfileiosim"
  mem : "cmalloc"
  sys : "sysdefs"

' This routine loads and runs a C program from a file  
PUB run(argc, argv, mode) | cognum, infile, size, i, size1, argsize, argv1, ptr

  ' Open the program file
  infile := c.fopen(long[argv], string("r"))
  ifnot infile
    return FILE_NOT_EXIST

  ' Determine the size of the arg list
  argsize := 4 ' Initialize argsize to 4 bytes for NULL argv terminator
  repeat i from 0 to argc - 1
    argsize += 4 + strsize(long[argv][i]) + 1
  argsize := ((argsize + 3) >> 2) << 2 ' Round up to a multiple of 4 bytes

  ' Determine the file size
  c.fseek(infile, 0, SEEK_END)
  size := c.ftell(infile)
  c.fseek(infile, 0, SEEK_SET)
  size := ((size + 3) >> 2) << 2 ' Round up to a multiple of 4 bytes

  ' Kill the calling program if requested to free up memory
  if (mode & sys#run_kill_caller)
    argv := c.KillCaller(argc, argv, argsize, mode)

  ' Allocate memory at location $0 for the app plus space for the stack and the arg list
  size1 := mem.malloc0(size + STACK_SIZE + argsize)
  ifnot size1
    c.fclose(infile)
    return OUT_OF_MEMORY

  ' Skip the first 6 bytes and read the app file into memory
  c.fread($6, 1, 6, infile)
  c.fread($6, 1, size - 6, infile)
  c.fclose(infile)

  ' Patch cog location $2 to test if lock has been allocated
  ' instead of testing for stack address PAR not equal to $8000
  long[START_ADDR][r2] := patch | (long[START_ADDR][r5] & $1ff)

  ' Copy arg list to app after the stack
  argv1 := size + STACK_SIZE
  c.copyarglist(argc, argv, argv1)
  if (mode & sys#run_kill_caller)
    mem.free(argv)

  ' Patch the address of the arg list into the app
  ' The pc contains the entry address, and the third long after
  ' the entry address points to the arg list
  ptr := long[START_ADDR][pc]
  long[ptr][3] := argv1

  ' Patch in the address of the exit code
  ' The lr register contains the address of the exit routine, and
  ' the first long after the start of the exit routine is the address
  ' of the exit hook routine.
  if (mode & sys#run_stand_alone)
    argv := @stand_alone_exit
  else
    argv := @spinix_app_exit
  ptr := long[START_ADDR][lr]
  long[ptr][1] := argv

  ' If running stand-alone stop all cogs, clear and return locks and
  ' start the app in cog 0
  if (mode & sys#run_stand_alone)
    repeat i from 0 to 7
      if i <> cogid
        cogstop(i)
    repeat i from 0 to 7
      lockclr(i)
      lockret(i)
    coginit(0, START_ADDR, size + STACK_SIZE)
    cogstop(cogid)

  ' Otherwise, start the app in the next available cog and add an entry
  ' to the cog table
  else
    cognum := cognew(START_ADDR, size + STACK_SIZE)
    i := sys#proc_type_capp
    c.AddToCogTable(cognum, i, long[argv1], 0, size1)

  return cognum

DAT
' This is used to replace an instruction in the LMM kernel so that we can
' use a stack address other than $8000.  The source address for __C_LOCK_PTR
' will be copied from the instruction at cog location 5.
patch   rdlong r1, 0 wz

' This exit routine is used to terminate stand-alone programs by rebooting
stand_alone_exit 
        mov    r1, #128
        clkset r1

' This exit routine is used to terminate spinix apps by issuing a kill command
spinix_app_exit
        ' Get the file lock number and set the lock
        rdlong  r3, pc
        long    sys#filelock
        rdlong  r3, r3
        sub     r3, #1
        lockset r3  wc
  if_c  sub     pc, #8

        ' Get the cog number and save it in hub RAM
        cogid   r0
        mov     r1, pc
        long    0
        wrlong  r0, r1
        
        ' Save the hub RAM address in the file parm location
        rdlong  r5, pc
        long    sys#fileparm
        wrlong  r1, r5

        ' Combine the cog number with the spinix kill command
        shl     r0, #16
        or      r0, #KILL_COMMAND

        ' Write the command to the file command location
        rdlong  r4, pc
        long    sys#filecmd
        wrlong  r0, r4

        ' Spin in a tight loop waiting for the process to terminate
        sub     pc, #4
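
For readers more at home in C than Spin, the arg list sizing at the top of the runner can be sketched like this. It is an illustration only: the function name is made up, and the 4-byte pointer size is hard-coded to match the Propeller rather than taken from sizeof.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Bytes needed for the arg list: one 4-byte pointer plus the string and
   its terminator for each argument, one 4-byte NULL pointer to end the
   array, all rounded up to a multiple of 4. */
static size_t arg_block_size(int argc, char *const argv[])
{
    size_t size = 4;                     /* NULL terminator of argv[] */

    assert(argc >= 0 && argv != NULL);
    for (int i = 0; i < argc; i++)
        size += 4 + strlen(argv[i]) + 1; /* pointer + string + '\0' */

    return (size + 3) & ~(size_t)3;      /* round up to a long boundary */
}
```

For example, the two arguments "prog" and "-v" need 4 + 9 + 7 = 20 bytes, which is already a multiple of 4.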

jazzed · 2012-03-12 11:23

Dave Hein wrote: »

I can now run LMM C programs from an SD file under spinix just like it is able to run Spin programs. ....

That's great Dave!

I have an application that can benefit from this.

Thanks.
--Steve

d2rk · 2012-04-19 08:00

Maybe I missed something, but will crt0_lmm.s be changed to support an address other than 0x8000 for the stack?

Thanks.

jazzed · 2012-04-19 11:33

d2rk wrote: »

Maybe I missed something, but will crt0_lmm.s be changed to support an address other than 0x8000 for the stack?

Thanks.

We will look at this. Meanwhile, please use Dave Hein's method.

d2rk · 2012-04-19 11:41

OK. I already have my own version of crt0_lmm.s for these purposes; I was just asking about the official version. I will look at Dave Hein's method in more detail.

jazzed · 2012-04-19 14:56

d2rk wrote: »

OK. I already have my own version of crt0_lmm.s for these purposes; I was just asking about the official version. I will look at Dave Hein's method in more detail.

Can you post your version for evaluation?
Thanks.

d2rk · 2012-04-20 07:46

jazzed wrote: »

Can you post your version for evaluation?
Thanks.

Yes, sure. Please find the patch attached. As I mentioned before, I based the stack pointer on the value of __hub_end.

Dave Hein · 2012-04-20 10:08

How is __hub_end defined? Do you define it at link time? Ideally, we would want to define the stack address at run time.

EDIT: I just realized a better way to patch the startup code is just to set the value of R14 to match the stack address I pass in PAR when starting up a C app. I was patching the cmp instruction at R2, but patching R14 is more straightforward.

d2rk · 2012-04-20 11:19

Dave Hein wrote: »

How is __hub_end defined? Do you define it at link time? Ideally, we would want to define the stack address at run time.

Yes, I define it at link time. For each new image there is a special linker script with specific settings of hub memory region and thus

__hub_end = ADDR(.hub) + SIZEOF(.hub)

is unique for each new image.

ctwardell · 2012-04-23 02:50

jazzed wrote: »

We will look at this. Meanwhile, please use Dave Hein's method.

I'd like to see a linker flag that lets the stack address be specified.

For now I can edit crt0_lmm.s

C.W.

denominator · 2012-04-23 09:04

ctwardell wrote: »

I'd like to see a linker flag that lets the stack address be specified.

Would something like this work?

propeller-elf-gcc -mlmm Hello.o -Xlinker --defsym -Xlinker __hub_end=0x1234 -o a.out

jazzed · 2012-04-23 11:44

denominator wrote: »

Would something like this work?

propeller-elf-gcc -mlmm Hello.o -Xlinker --defsym -Xlinker __hub_end=0x1234 -o a.out

This sounds like a great suggestion for per program stack.
Changing the standard crt0_lmm.s file is still under consideration.

ctwardell · 2012-04-23 12:13

denominator wrote: »

Would something like this work?

propeller-elf-gcc -mlmm Hello.o -Xlinker --defsym -Xlinker __hub_end=0x1234 -o a.out

Thanks.

I assume that goes along with using your version of crt0_lmm.s. I'll give it a try.

C.W.

jazzed · 2012-04-23 14:13

The problem with __hub_end is the user needs to specify the stack. Of course the tools mentioned in this thread make that possible. However, the OS really should be setting up the application work-space as Dave's code seems to do.

ctwardell · 2012-04-23 14:22

jazzed wrote: »

The problem with __hub_end is the user needs to specify the stack. Of course the tools mentioned in this thread make that possible. However, the OS really should be setting up the application work-space as Dave's code seems to do.

In the OS case that is true, but in my specific case the GCC LMM program is the boot application for the prop and I want to reserve space above the stack for mailboxes for Cog to Cog communication.

Once the C program loads up the other cogs it will be wiped out to allow the HUB ram below the mailboxes to be used as "RAM" for the system that the prop will be emulating.

It would really be nice in cases like this to be able to set the stack in code with a pragma or something so that others compiling the code don't need to worry about it in compiler and linker settings.

C.W.

ctwardell · 2012-04-24 18:32

d2rk wrote: »

OK. I already have my own version of crt0_lmm.s for these purposes; I was just asking about the official version. I will look at Dave Hein's method in more detail.

I found your attachment with the patched file, but where does it go in my propgcc installation? I can't find a crt0_lmm.s file.

C.W.

jazzed · 2012-04-24 21:08

ctwardell wrote: »

In the OS case that is true, but in my specific case the GCC LMM program is the boot application for the prop and I want to reserve space above the stack for mailboxes for Cog to Cog communication.

Once the C program loads up the other cogs it will be wiped out to allow the HUB ram below the mailboxes to be used as "RAM" for the system that the prop will be emulating.

It would really be nice in cases like this to be able to set the stack in code with a pragma or something so that others compiling the code don't need to worry about it in compiler and linker settings.

C.W.

Hi.

I see where you want to go with this now. Hope you have success creating a service point database for the communications end-points or whatever approach you choose to take.

Wish I had seen this earlier today, then I could help generate a linker script file that will set aside memory at the top of hub space for you.

One of the great advantages of GCC is the linker tool. It allows you to map memory the way you want it to be without needing to ask some compiler guy for a change. I'm out of time just now and will be unavailable almost all day tomorrow because of travel and meeting time.

If someone else doesn't get around to helping create a linker file for you for this purpose then I will on Thursday.

The crt0_lmm.s file is a startup file. It can be overridden, but I can't remember how just now.

Thanks,
--Steve

ctwardell · 2012-04-24 21:29

jazzed wrote: »

If someone else doesn't get around to helping create a linker file for you for this purpose then I will on Thursday.

Thanks Steve. No big hurry, I only get a few hours here and there to work on this right now and this isn't a show stopper.

Just knowing that it can be done is enough to let me move on with other parts of the program.

This 1802 emulator is just one of those mountains I've decided to climb and it gives me a good reason to work with propgcc.

Thanks again,

C.W.

ersmith · 2012-04-25 19:33

ctwardell wrote: »

I found your attachment with the patched file, but where does it go in my propgcc installation? I can't find a crt0_lmm.s file.

C.W.

The startup files are in the propgcc/gcc/gcc/config/propeller/ directory, and get built along with the compiler. For testing a change in them, though, rebuilding the whole compiler is definitely overkill! You can just compile crt0_lmm.s on its own by doing something like:

propeller-elf-gcc -c -o _crt0.o crt0_lmm.s

You then need to copy the new _crt0.o to $PROPGCC/lib/gcc/propeller-elf/4.6.1/ and $PROPGCC/lib/gcc/propeller-elf/4.6.1/short-doubles, where "$PROPGCC" is where you've installed the compiler binaries.

Eric

denominator · 2012-04-25 23:27

ersmith wrote: »

You then need to copy the new _crt0.o to $PROPGCC/lib/gcc/propeller-elf/4.6.1/ and $PROPGCC/lib/gcc/propeller-elf/4.6.1/short-doubles, where "$PROPGCC" is where you've installed the compiler binaries.

Another alternative if you don't want to mess up your distribution: You can use the -nostartfiles option to propeller-elf-gcc and propeller-elf-g++. If you use this then you need to include all the startup files that the normal linker would (_crt0.o, _crtbegin.o, hubstart_xmm.o, _crtend.o) and perhaps in the correct order. But this allows you to pick and choose between local files and the files from the distribution.

ctwardell · 2012-04-26 04:44

Thanks Eric and denominator, I'll try these out this weekend.

C.W.

ctwardell · 2012-04-26 18:46

I looked at this a little more and dug into the crt0_lmm.s code.

It looks like the changes suggested by d2rk would work for my needs, but since they ignore the value passed in PAR, the ability to run non-primary cogs is lost.

It seems like the changes as suggested by Dave Hein along with an ability to set a stack address of other than 0x8000 in the loader would solve both the "OS Friendly" issue, and the issue of a stack below 0x8000 for LMM apps loaded by the standard loader.

C.W.

jazzed · 2012-04-26 23:56

ctwardell wrote: »

I looked at this a little more and dug into the crt0_lmm.s code.

It looks like the changes suggested by d2rk would work for my needs, but since they ignore the value passed in PAR, the ability to run non-primary cogs is lost.

C.W. Attached is a demo that lets you use $7F00, $7F20, $7F40, etc... symbolically for mailboxes using a linker script.
The linker mapping can be further changed for your needs - look at the bottom of the file.

Just in case it's not clear at first glance what the program is doing: The end result is it toggles pins 12, 13 at different rates.
The way it works is by starting the same COG C program on 2 different COGS using mailbox1 and mailbox2 for communications.

This and similar programs should be used with some method of setting the stack start under the mailbox area.

--Steve

ersmith · 2012-04-27 02:35

The initial stack is actually set to whatever the PAR register contains, so to set a stack address other than 0x8000 it would only be necessary to change the tiny spin loading code in spinboot.s. Unfortunately this is currently loading PAR using a spin "constant mask 14" instruction, rather than reading it from memory, so changing it in the linker isn't possible. On the other hand, if you're writing your own loader you're probably setting PAR yourself and bypassing the default spin boot code (the first 32 bytes of the file).

Hmmm, come to think of it the existing LMM code can be used as is to launch a cog at a different stack address -- just make sure that PAR is pointing at (stacktop-12), that (stacktop-12) contains the default entry point (the symbol "entry"), (stacktop-4) contains the correct default value for __TLS as found in misc/thread.c, and that lock 0 has been allocated and is available for use by the program.
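A small C sketch of that frame layout, written against a simulated hub RAM array rather than real hardware. The function name is invented for the example, and the long at (stacktop-8) is left untouched because its contents are not described above. Allocating lock 0 is a separate step that the sketch does not cover.

```c
#include <assert.h>
#include <stdint.h>

/* Fill in the startup frame just below stack_top and return the value
   that PAR should hold. hub[] is indexed in longs, so a byte address
   is divided by 4 to get the index. */
static uint32_t build_start_frame(uint32_t *hub, uint32_t stack_top,
                                  uint32_t entry, uint32_t tls)
{
    assert(stack_top % 4 == 0 && stack_top >= 12);

    uint32_t par = stack_top - 12;
    hub[par / 4]             = entry;  /* (stacktop-12): default entry point */
    /* (stacktop-8) is not written here; see the note above. */
    hub[(stack_top - 4) / 4] = tls;    /* (stacktop-4): default __TLS value */
    return par;
}
```

For a stack top of $7F00 this returns $7EF4 for PAR, with the entry point stored at $7EF4 and the __TLS value at $7EFC.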

ersmith · 2012-04-27 02:41

Steve: perhaps I'm misunderstanding, but it looks like your demo links the blink and mbox code together, and the main mbox code still uses 0x8000 as its stack. To use a different stack location with the default loader would require changes to the spin boot code, wouldn't it?

jazzed · 2012-04-27 06:12

ersmith wrote: »

Steve: perhaps I'm misunderstanding, but it looks like your demo links the blink and mbox code together, and the main mbox code still uses 0x8000 as its stack. To use a different stack location with the default loader would require changes to the spin boot code, wouldn't it?

I was thinking about this while coming down stairs this morning. Yes, the stack will overwrite the mailboxes at some point.

So, what's the right solution?

People in Propeller land want to use things like in the demo I attached for better or worse.

Dave Hein · 2012-04-27 09:49

Spinix has mailboxes, the kernel and other apps running while I run a PropGCC C app. I also provide a different exit routine and an argv list. This is all placed above the C stack. Spinix doesn't use the Spin startup code that's part of the C binary. It uses a different Spin routine to patch the C binary so I can define an arbitrary stack address.

jazzed · 2012-04-27 10:20

Dave Hein wrote: »

Spinix has mailboxes, the kernel and other apps running while I run a PropGCC C app. I also provide a different exit routine and an argv list. This is all placed above the C stack. Spinix doesn't use the Spin startup code that's part of the C binary. It uses a different Spin routine to patch the C binary so I can define an arbitrary stack address.

Your hosted environment is a fine generic solution to me.

The "right solution" I'm seeking has to do with GCC in stand-alone modes like C.W. needs.

For the program I posted, either crt0_lmm.s will need to change to set the stack or the loader will need to patch something.
Can't we just read __hub_end from the linker script to set the stack for this case?

I'm not inclined to see any change to crt0_lmm.s at this point unless it is a good general solution.

d2rk · 2012-04-27 10:25

jazzed wrote: »

Can't we just read __hub_end from the linker script to set the stack for this case?

As for me, that would be a totally fine solution.

jazzed · 2012-04-27 11:19

ersmith wrote: »

Hmmm, come to think of it the existing LMM code can be used as is to launch a cog at a different stack address -- just make sure that PAR is pointing at (stacktop-12), that (stacktop-12) contains the default entry point (the symbol "entry"), (stacktop-4) contains the correct default value for __TLS as found in misc/thread.c, and that lock 0 has been allocated and is available for use by the program.

This may be a way, but it makes my head hurt.

How do you feel about changing crt0_*.s by default to read __hub_end from the linker rather than hard-coding to 0x8000? Did you say this is not possible?

This is a little OT .... In a general sense we need a very simple way to do this for a COG without threading. This was not very obvious until I got a chance to port some ICC code. It's not multi-threaded of course, and ICC did the reverse of what we do: it used COGNEW to launch an LMM program and COGNEW_NATIVE to launch a PASM program. The actual meaning of COGNEW doesn't seem to matter too much, and using it to just launch PASM is fine. I guess it would be nice if we had an LMMNEW or something that just used the start address of a function and stack for running LMM code non-threaded. It may just be another onion layer though - yikes. Any thoughts?
