p2load: A Loader for the Propeller II
David Betz
Here is an early version of p2load, a loader for the Propeller II.
This loader should be able to load .BIN and .obj files created with PNut.exe. I've included a sample program (my "Hello, Propeller II!" program).
Usage:
p2load - a loader for the propeller 2 - version 0.008, 2013-04-23
usage: p2load
         [ -b baud ]               baud rate (default is 230400)
         [ -c addr[:param] ]       load a free COG with image at addr and parameter param
         [ -c n,addr[:param] ]     load COG n with image at addr and parameter param
         [ -f file,faddr ]         write contents of file to flash at faddr
         [ -f faddr,haddr,count ]  write count bytes of hub data at haddr to flash at faddr
         [ -h ]                    cog image is at $1000 instead of $0e80
         [ -m ]                    start the ROM monitor instead of the program
         [ -n ]                    set stack top to $8000 for the DE0-Nano
         [ -p port ]               serial port (default is to auto-detect the port)
         [ -P ]                    list available serial ports
         [ -r addr[:param] ]       run program in COG 0 from addr with parameter param
         [ -r n, ]                 run program in COG n
         [ -r n,addr[:param] ]     run program in COG n from addr with parameter param
         [ -s ]                    strip $e80 bytes from start of the file before loading
         [ -t ]                    enter terminal mode after running the program
         [ -T ]                    enter PST-compatible terminal mode
         [ -v ]                    verbose output
         [ -w ]                    write a bootable image to flash
         [ -? ]                    display a usage message and exit
         file[,addr]...            files to load
You should be able to load it like this:
p2load -v p2-hello.obj -t
The -v just gets you some additional information that might be useful to me if you have problems. The -t says to enter a simple terminal emulator after completing the load.
To load more than one file at a time you must specify the load address of all but one of the files. If no load address is specified, the file will load at 0xe80. For example, to load Bagger's image display program along with an image to display, do this:
p2load testcard.bin,8000 BaggersNPotatoheads_16_bit_graphics_driver_NTSC.bin
This will load "testcard.bin" at 0x8000 and then load the image display program at 0xe80.
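The file[,addr] argument form can be sketched in a few lines of Python. This is only an illustration of the rule described above, not p2load's actual parsing code: the helper name parse_file_arg is hypothetical, and treating the address as hex without a prefix is inferred from the "testcard.bin,8000" example.

```python
DEFAULT_LOAD_ADDR = 0xE80  # default load address stated in the post


def parse_file_arg(arg):
    """Split a 'file[,addr]' argument into (filename, load_address).

    Assumption: addr is hexadecimal without a prefix, as in 'testcard.bin,8000'.
    Assumption: file names do not themselves contain a comma.
    """
    name, sep, addr = arg.partition(",")
    if sep:
        return name, int(addr, 16)
    return name, DEFAULT_LOAD_ADDR


if __name__ == "__main__":
    for arg in ["testcard.bin,8000", "display.bin"]:
        name, addr = parse_file_arg(arg)
        print(f"{name} loads at 0x{addr:x}")
```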
It is also possible to start an image in a COG and pass it a parameter. The idea behind this is that you could have drivers that are built separately from your main program and load them separately. This could allow you to reuse the hub space normally occupied by these drivers for your main program. Later, I'll add a way to do this when you boot from SPI flash.
The sequence would be:
1) load data into the area between 0xe80 and 0xfff containing already initialized mailboxes for each driver you want to load.
2) load the driver images to 0x1000 or above
3) start each driver
4) load your main program starting at 0x1000
You start a driver using the following syntax:
p2load mailboxes.bin drivers.bin,1000 -c 1000:f00 -c 1800:f10 prog.bin,1000
This loads the mailbox images at 0xe80 (the default load address) and the driver images starting at 0x1000. It then starts the driver image at 0x1000, passing it 0xf00 as a parameter (its mailbox), and starts the driver image at 0x1800, passing it 0xf10. Finally, it loads the main program at 0x1000, overwriting the driver images, which aren't needed anymore since the COGs are already running.
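To make the -c syntax concrete, here is a rough Python sketch that splits a "[n,]addr[:param]" value into its parts. It is not taken from the p2load source. The function name parse_cog_spec is hypothetical, and it assumes that addr and param are hex, that the COG number n is decimal, and that a missing param means 0.

```python
def parse_cog_spec(spec):
    """Parse '[n,]addr[:param]' into (cog, addr, param).

    cog is None when no COG id is given (the loader picks a free COG).
    Assumptions: addr/param are hex, n is decimal, missing param is 0.
    """
    cog = None
    if "," in spec:
        n, spec = spec.split(",", 1)
        cog = int(n)
    addr, sep, param = spec.partition(":")
    return cog, int(addr, 16), int(param, 16) if sep else 0


if __name__ == "__main__":
    # The two -c options from the example command line above.
    for spec in ["1000:f00", "1800:f10"]:
        cog, addr, param = parse_cog_spec(spec)
        print(f"start image at 0x{addr:x} with parameter 0x{param:x} (cog={cog})")
```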
This is a command-line tool. I might consider making a GUI version of it using Qt if there is a need.
The code for p2load is hosted on Google Code: http://code.google.com/p/propgcc/
Here is a zip file containing a Windows version compiled with MinGW: p2load.zip
This has been tested on the DE0-Nano FPGA board running Chip's P2 configuration file. It has not been tested on the DE2-115 FPGA board.
Update: 2012-12-05
Added the -s option to strip off the first $0e80 bytes of the .obj file before loading. This allows you to put "byte 0[$0e80]" at the start of your DAT section to make labels have their correct hub addresses without having to add the $0e80 offset.
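The effect of -s amounts to dropping a fixed-size prefix from the file. A minimal Python sketch of that idea follows. The name strip_padding is hypothetical, and the length check is my own addition; whether p2load itself validates the file size is not stated here.

```python
PAD_BYTES = 0x0E80  # size of the "byte 0[$0e80]" padding described above


def strip_padding(image: bytes) -> bytes:
    """Return the image without its first $0e80 bytes of padding."""
    if len(image) < PAD_BYTES:
        # Assumption: a file shorter than the padding is treated as an error.
        raise ValueError("file is shorter than the $0e80-byte padding")
    return image[PAD_BYTES:]


if __name__ == "__main__":
    padded = bytes(PAD_BYTES) + b"\x01\x02\x03\x04"
    print(strip_padding(padded).hex())
```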
Update: 2012-12-06
Replaced the Cygwin version of p2load.exe with a MinGW version that doesn't require any extra DLLs.
Update: 2013-03-30
Added the ability to load multiple files at a time and to specify the address at which a file is to be loaded.
Also added a way to start a COG passing it a parameter.
Update: 2013-03-30 (second attempt)
Added the -m option to start the ROM monitor rather than the user program after loading.
Fixed a bug in parameter parsing that prevented the -b option from being used to change the baud rate.
Update: 2013-03-30 (third attempt)
Added the -r option to allow the start address and parameter to be set for the main program.
Update: 2013-03-30 (fourth attempt)
Added the ability to specify the COG id to the -c option.
Changed the default baud rate to 230400.
Update: 2013-03-31
Fixed a bug that could cause the second-stage loader to crash after processing -c.
Update: 2013-04-01, Version 0.003
Made the loader handle "-c 0,xxx:yyy" the same as "-r xxx:yyy". This is because COG 0 always has to be the last COG loaded before the loader exits.
Update: 2013-04-01, Version 0.004
Fixed a bug in the -h option.
Update: 2013-04-03, Version 0.005
Added a -T option to enter a PST-compatible terminal mode where \r is translated to \r\n on output.
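The \r to \r\n translation can be illustrated with a one-line byte replacement in Python. This is just a sketch of the translation rule; the name pst_translate is hypothetical, and the real terminal code in p2load may handle the stream differently (for example, character by character).

```python
def pst_translate(data: bytes) -> bytes:
    """Translate each carriage return to CR+LF, as -T is described to do."""
    return data.replace(b"\r", b"\r\n")


if __name__ == "__main__":
    print(pst_translate(b"Hello, Propeller II!\r"))
```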
Update: 2013-04-21, Version 0.007
Added the -f option to write to flash and updated -r to allow the main program to be started in any COG.
Update: 2013-04-23, Version 0.008
Fixed some bugs in the -f option. Added a -w option to write a bootable image to flash.
Comments
When the ROM loader starts, it fires up a COG with the HMAC/SHA-256 engine. Once it has completed the authentication/verification, it shuts the COG back down. However, it seems to me that the second-stage loader is going to want to do the same authentication/verification of the remaining code that it loads. Obviously, it could just start the engine back up again, but I am wondering if the engine could just be left running by the first-stage loader. This would provide a small start-up performance boost and save a few instructions in the second-stage loader. The ROM loader would still be the same size because the COGSTOP would likely be moved to the section where authentication failed.
Strangely enough I get the same result now when using Pnut to compile and load this code. Is Pnut also loading into hub at $0E80??
Nico Hattink
The first example above sets all three baud rates to 230400.
The second example sets the first-stage and second-stage loader baud rates to 230400 and the terminal baud rate to 115200.
The third example sets the terminal baud rate to 230400.
The fourth example is a template showing where each of the baud rates go in the option syntax.
If not specified, the default baud rate is 115200.
I have tested this and changing both loader baud rates to the same value seems to work but having a different baud rate for the first-stage and second-stage loaders fails. I'll work on debugging that.
Thanks.
Will test in some time
pnut.exe generates an error when building something larger than 32K, just FYI.
byte $f0[$FeFF] ---> failed with, "Object exceeds 32K"
This brings up another question. We talked a while ago about using the space between $0e80 and $1000 for mailboxes and other globals. Has anyone given any more thought to that? My current version of propgcc for the P2 leaves that space unallocated except for CLKFREQ. Shall we reopen the discussion of what to use that space for? One possibility would be to have something similar to Ross's registry there or at least a pointer to it. Any other ideas?
IMHO, starting the program at $1000 seems a great idea to me. Ideally, there is a common set of drivers and LMM / XMM kernels that all need to communicate in a structured way. Opening that discussion up makes sense, but I also think it's really early too, so there might not be much to discuss right now. Settling in at $1000 for start of program does beg those "what to put in that space?" questions, and that's a good thing.
I'm about to work up an example or two where pieces get uploaded and a program is written to communicate with those pieces. I didn't think of CLKFREQ yet, but that's a no brainer. I'll start assuming it's at $E80, mailboxes above that, program at $1000. If that really needs to change, great! I can amend what I'm working on once others here weigh in.
Edit: I'm about finding the center of gravity really. It's early, so we will do stuff and I believe we will find a good center as a part of that process.
Thinking about that space:
1. Core system values similar to CLKFREQ, and perhaps that's the only one! Pointers to #2, 3, 4 strike me as a good idea though. I'm thinking of a user program that gets built with one set of drivers and *MM kernel, and it has its own mailboxes for inter-cog comms. (Barring port D) Say it gets built with another one. Populating those pointers might help as the programmer could work relative to a pointer for a lot of compatibility among competing schemes and there will be competing schemes.
2. Memory kernel mailboxes
3. Various driver mailboxes
4. User mailboxes.
Thinking ahead to GOLD type work means just enough structure for competing things to compete to add value, not get in the way of one another or require programmers to twiddle with too many things.
I have just discovered that pnut.exe only loads the first 512-8 longs into $0E80.165F which does not allow me to properly test my interpreter.
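The address range quoted above follows from simple arithmetic, which a short Python check makes explicit. The variable names are only for illustration.

```python
BASE = 0x0E80        # hub address where the image is loaded
LONGS = 512 - 8      # number of longs pnut.exe loads, per the post
BYTES_PER_LONG = 4

# Address of the last byte loaded.
last = BASE + LONGS * BYTES_PER_LONG - 1

print(f"${BASE:04X}..${last:04X}")  # prints $0E80..$165F
```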
I agree we should keep $E80.EFF free for mailboxes, pointers etc. While we need a discussion about the use of the 128B between E80.EFF, I think we are all too busy playing with DEx emulations to properly discuss this yet.
Food for thought.... 4 longs per cog = 4 longs * 4 bytes * 8 cogs = 128 bytes. We could use the last long of each cog for other things/pointers, or we could try to convince Chip to free up the last 16 bytes ($E70.E7F) of ROM, which just store the message "== END OF ROM ==" ???
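One way to picture the 4-longs-per-cog idea is to compute where each cog's block would sit in $E80.EFF. The Python sketch below does only that. The layout itself is purely hypothetical (nothing has been agreed in this thread, and it ignores where CLKFREQ would go), and the name mailbox_addr is made up for the example.

```python
BASE = 0x0E80         # start of the free area discussed above
COGS = 8
LONGS_PER_COG = 4     # hypothetical allocation: 4 longs per cog
BYTES_PER_LONG = 4

BLOCK = LONGS_PER_COG * BYTES_PER_LONG   # 16 bytes per cog
TOTAL = BLOCK * COGS                     # 128 bytes in all


def mailbox_addr(cog):
    """Hub address of the first long of a cog's (hypothetical) mailbox block."""
    if not 0 <= cog < COGS:
        raise ValueError("cog must be 0..7")
    return BASE + cog * BLOCK


if __name__ == "__main__":
    for cog in range(COGS):
        start = mailbox_addr(cog)
        print(f"cog {cog}: ${start:04X}..${start + BLOCK - 1:04X}")
```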
Right, Chip's PNut.exe IDE can only load 512 longs at the moment. I think he said he had plans to add a second-stage loader but I don't know if he's had time yet.
If you're going to use p2load you'll need to know what switches to use:
-v just produces slightly verbose informational output. It is optional.
-s strips the first $0e80 bytes from the file before loading. This is because I've taken to adding $0e80 bytes of padding at the start of my PASM files so that the hub addresses come out right. If your PASM code starts at hub address 0 then you should leave off the -s option.
-h assumes that the COG image starts at $1000. If not specified, the COG image starts at $0e80.
-t just enters a simple terminal mode after loading.
-n is for loading C programs into the DE0-Nano. It places the top of the C stack at $8000 instead of $20000 since the DE0-Nano only supports 32k of hub memory.
I have my code arranged as follows...
$0E80.0FFF = simple pasm boot code that then does a coginit to run my P2 Interpreter.
$1000.17FF = P2 Interpreter.
$1800.1BFF = Vector Table for P2 Interpreter
$1C00.1FFF = filled with "?"
$2000.xxxx = Spin code object being tested (compiled with PropTool with a base of $0000)
I have had to fill the gaps to ensure the code is contiguous.
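Filling the gaps so that several pieces form one contiguous image can be sketched in Python as below. This is not the tool actually used for the layout above, only an illustration; the name build_image is hypothetical, and the "?" fill byte and $0E80 base are taken from the memory map in this post.

```python
def build_image(segments, base=0x0E80, fill=b"?"):
    """Combine (hub_address, data) segments into one contiguous image.

    The image starts at `base`; any gap between segments is padded with `fill`.
    """
    image = bytearray()
    for addr, data in sorted(segments, key=lambda seg: seg[0]):
        offset = addr - base
        if offset < len(image):
            raise ValueError(f"segment at ${addr:04X} overlaps earlier data")
        image += fill * (offset - len(image))  # pad the gap, if any
        image += data
    return bytes(image)


if __name__ == "__main__":
    # Tiny stand-ins for the boot code and the interpreter.
    img = build_image([(0x0E80, b"BOOT"), (0x1000, b"INTERP")])
    print(len(img), img[:8], img[-8:])
```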
I'm glad you're having good luck with p2load. I've recently updated it to know how to load .elf files produced by propgcc. Let me know if there are any features that I could add that would help your work.
How is your Spin interpreter coming?
Parallax released an FPGA binary of the Propeller 2, and anybody who wants can download it and test with it. Cluso has a thread with all the resources: http://forums.parallax.com/showthread.php?144199-Propeller-II-Emulation-of-the-P2-on-DE0-NANO-amp-DE2-115-FPGA-boards
No, really, Chip has released a binary that runs on 2 FPGA boards. See http://forums.parallax.com/forumdisplay.php?65-Propeller-Chip/page2&order=desc
David: I am just getting the interpreter to begin execution, controlled so I can ensure it is following the instructions correctly. I have verified that it can decode the bytecodes and fetch the vectors for decoding. I need to verify that it is obtaining the correct program/variable/stack/pc/dcurr first - I have had to relocate each of these by $2000.
This is my current spin test code (entered manually after compiling with homespun)
It's great that you're making progress on getting Spin running on P2!
It is really fantastic that we can test a single cog on a $79 platform. I think this has to be a first by any company, with or without NDA.
I need to add some LEDs to my DE0 so that I can test the I/O port with Spin. It is simple enough - 8 superbright LEDs and a SIL resistor network will make this a simple job.