I'm talking about Spin on the P2 not the P1. There you might be able to tolerate the lower code density to gain much better performance. And if better code density is really needed, I guess you could implement CMM on the P2.
Sorry! I missed the "P2" in your message (and it looks like I'm not the only one). Yes, I guess it would be reasonable to make that trade-off on P2. That's an interesting argument: even with 512K, memory will be tight on the P2, but it's certainly much better than on P1.
Since there is no need for an LMM kernel on the P2, would it be possible to create a CMM kernel that could run at the same time as hubexec? That would allow transitions between native PASM code and CMM on the same COG. Code that isn't time-critical could be compiled in CMM mode and code that requires better performance could be compiled as native PASM to run under hubexec mode. Is that feasible?
That's an interesting idea, and I think it should be possible. There could be three tiers of code speed within the same program: hub interpreted code < hubexec code < cog code. We'd need a CMM style kernel for P2 that could run in the COG and interpret the hub code; or maybe the CMM kernel could even be run as hubexec, since the speed critical parts will be in PASM.
I don't think I have the time to tackle that right now, but it sounds like a good approach you're proposing, David!
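To make the tiered idea a bit more concrete, here is a rough host-side sketch in C of an interpreter loop that runs compact bytecode and hands off to native routines for the speed-critical parts. Everything in it is an assumption for illustration: the opcodes, `run_compact`, and `native_table` are made up, this is not the real CMM encoding, and there is no bounds checking on the toy stack.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical opcodes for a toy compact-code interpreter (not real CMM). */
enum { OP_PUSH, OP_ADD, OP_NATIVE, OP_HALT };

typedef int32_t (*native_fn)(int32_t);

/* Stand-in for a speed-critical routine that would be compiled to native code. */
static int32_t native_double(int32_t x) { return 2 * x; }

/* Table of native entry points the bytecode can call by index. */
static const native_fn native_table[] = { native_double };

/* Interpret bytecode until OP_HALT and return the top of the stack.
   Returns -1 on an unknown opcode. No stack bounds checks (sketch only). */
int32_t run_compact(const uint8_t *code)
{
    int32_t stack[16];
    size_t sp = 0;
    for (;;) {
        switch (*code++) {
        case OP_PUSH:   /* push a signed 8-bit immediate */
            stack[sp++] = (int8_t)*code++;
            break;
        case OP_ADD:    /* pop two values, push their sum */
            sp--;
            stack[sp - 1] += stack[sp];
            break;
        case OP_NATIVE: /* leave the interpreter for one native call */
            stack[sp - 1] = native_table[*code++](stack[sp - 1]);
            break;
        case OP_HALT:
            return stack[sp - 1];
        default:
            return -1;
        }
    }
}
```

A program such as `{ OP_PUSH, 3, OP_PUSH, 4, OP_ADD, OP_NATIVE, 0, OP_HALT }` adds in the interpreter and doubles in the "native" routine. On real hardware the native side would be hubexec or cog code rather than a C function pointer.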
Parallax should consider adopting FastSpin as the official Spin compiler for the P2!
Well, except that fastspin output (being LMM code) is about 4x bigger than Spin output. So there's a tradeoff there. I think it's nice that both openspin and fastspin exist, they complement each other.
I'm talking about Spin on the P2 not the P1. There you might be able to tolerate the lower code density to gain much better performance. And if better code density is really needed, I guess you could implement CMM on the P2.
That makes a lot more sense... and I quite like it
One might argue though that if you need CMM from Spin, just use the original spin2cpp combined with whatever the C++ compiler ends up being. Then FastSpin is complete as it stands today! We just need a C++ compiler for the P2
There was a minor bug in 3.1.0 (some functions were being marked for inlining that should not have been, causing an "internal error" to be printed). I've released a new 3.1.1 version which fixes this, and also reduces the COG register footprint significantly by sharing the same variable locations for leaf functions.
spin2cpp 3.2.0 is now available as a binary release (as well as the usual github source code). It has an optional common subexpression eliminator/loop optimizer for the PASM output path, as well as many bugfixes in the Spin parsing code and some minor improvements in C/C++ output.
spin2cpp has been updated to version 3.2.1. This version has a number of bug fixes to the C/C++ code generation, including a change to allow coginit/cognew to work properly with the old PropGCC library that's included with SimpleIDE. With this version of spin2cpp I was able to convert a simple Scribbler S3 program to C++ and run it on the robot (thanks to Parallax for sending me an S3 to test with!)
Wow! Congratulations! Does that mean you got the Scribbler code to fit in CMM mode? How did you do that?
The Scribbler control code itself has always fit -- the problem is fitting all of the demos at once. The sample program I used only had one demo.
Okay, thanks for clarifying. Have you thought at all about how to get a single image built that mirrors the capability of the Spin code that ships with the S3? Is it even possible with spin2cpp? I was wondering if it could be done by putting each demo in an overlay read from EEPROM.
My impression is that the goal of the S3 conversion project is to allow people to write S3 programs in C. That doesn't mean duplicating the functionality of one specific program (the one with 8 demos built in). I could be wrong of course. Anyway, my focus is just on getting spin2cpp to work correctly; trying to create overlays for multiple demos is out of my jurisdiction.
Yeah, I don't think Parallax has made the exact goals clear. Are you working on what is going wrong with the motor control?
From what I can tell the C code for the 8 demos should fit in memory. The block_wrapper should also fit, which is the interface that is used by the Blockly code. There is some question whether there is enough room for user code plus the block_wrapper, but I think there is enough. It would be good to have a semi-complicated Blockly program to test to see if the C code fits.
The main problem is the test code. This does not fit along with the demo code. It doesn't fit with the block_wrapper code, but it's too large in Spin also. The solution is to treat the test code as a completely separate program which can be downloaded when needed, or to store it in the upper half of the EEPROM, and to copy it into RAM when needed.
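For what it's worth, the "store it in the upper half of the EEPROM and copy it into RAM" option could look roughly like the C sketch below. It is a host-side simulation only: the 64 KB EEPROM size, the 0x8000 offset, the 2 KB overlay buffer and the name `load_overlay` are all assumptions, and a real version would read the part over I2C instead of indexing an array.

```c
#include <stdint.h>
#include <string.h>

#define EEPROM_SIZE 0x10000u  /* assumed 64 KB part */
#define UPPER_HALF  0x8000u   /* assumed start of the upper 32 KB */
#define OVERLAY_MAX 2048u     /* hypothetical RAM reserved for one overlay */

static uint8_t eeprom[EEPROM_SIZE];      /* host-side stand-in for the EEPROM */
static uint8_t overlay_ram[OVERLAY_MAX]; /* buffer the overlay is copied into */

/* Copy `len` bytes found at `offset` within the upper half into RAM.
   Returns 0 on success, -1 if the request does not fit. */
int load_overlay(uint32_t offset, uint32_t len)
{
    /* len is checked first, so the subtraction below cannot wrap */
    if (len > OVERLAY_MAX || offset > EEPROM_SIZE - UPPER_HALF - len)
        return -1;
    memcpy(overlay_ram, &eeprom[UPPER_HALF + offset], len);
    return 0;
}
```

The caller would then jump to (or start a cog on) the code in `overlay_ram`, which this sketch leaves out.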
I think the test code comprises a number of tests though. Can't each fit by itself sort of like Eric fit one already? What comes programmed into the S3 from the factory?
I believe the S3 comes programmed with scribbler_default, which is the 8 demos plus the test mode. The C version fits with the 8 demos if the references to scribbler_test are stubbed out. The demos run stand-alone, and are selected by pushing a button at bootup. The test mode works with the serial interface, and is a good candidate for downloading and running when necessary.
If the test mode requires a serial interface you're right that there is really no reason for it to be in EEPROM. So it sounds like if Eric figures out why the motors aren't working then this project is basically done?
Are you working on what is going wrong with the motor control?
Yes, I found it -- a subtle (and nasty) bug in assigning labels. I've published a new release 3.2.2. With this one I can build a little bp_test program that turns on the LEDs, drives the robot forward a little bit, then changes the LED color, and it works in both C and C++ mode. -mcmm is required, though; I think there's not quite enough room in LMM mode (the program fits, but it fails to run properly, probably because of a stack overflow).
Congratulations! I wouldn't worry about LMM mode. CMM mode is still probably faster than Spin.
Just a heads up for those of you using spin2cpp: I'm working on a big change to the --gas flag that will allow users to more easily edit the output (so the C/C++ code will actually be somewhat usable on its own). The main improvements are that DAT labels will be directly visible to the C code, and that preprocessor defines will be usable in the inline asm (instead of us having two sets of defines for C and asm, as we do with the current GAS output). I've also re-arranged things to allow for multiple .org directives in one DAT section, which has always worked for the default binary blob DAT form but didn't work for GAS dat. This does, unfortunately, make the generated code a bit uglier.
Here's an example of some Spin code:
''
'' Simple program to blink an LED using a COG
''
CON
  HZ = 80_000_000
  _clkmode = xtal1 + pll16x
  _clkfreq = HZ
  PIN = 15
  PINMASK = |<PIN
OBJ
  ser : "FullDuplexSerial.spin"
DAT
msg     byte "hello, world!", 13, 10
PUB main
  '' start up the serial port
  ser.start(31, 30, 0, 115200)
  '' say hello
  ser.str(@msg)
  '' set the blink rate
  rateptr := @rate
  rate := 2*HZ
  cognew(@blinkcog, 0)
  repeat
DAT
        org 0
blinkcog
        mov DIRA, mask
        mov OUTA, mask
        mov now, CNT
:loop
        rdlong rate, rateptr
        add now, rate
        waitcnt now, #0
        xor OUTA, mask
        jmp #:loop
mask
        long PINMASK
rate
        long 0
rateptr
        long 0
now
        long 0
Here's what the output of spin2cpp --gas will look like with this change:
//
// Simple program to blink an LED using a COG
//
#include <propeller.h>
#undef clkset
#undef cogid
#undef cogstop
#undef locknew
#undef lockret
#undef lockclr
#undef lockset
#undef waitcnt
#undef waitpeq
#undef waitpne
#define _waitcnt(x) __builtin_propeller_waitcnt((x), 0)
#include "blinky.h"
#define Yield__() __asm__ volatile( "" ::: "memory" )
extern uint8_t _dat_blinky_[] __asm__("..dat_start");
#define _tostr__(...) #__VA_ARGS__
#define _tostr_(...) _tostr__(__VA_ARGS__)
#define _dat_(...) __asm__(_tostr_(__VA_ARGS__) "\n")
extern uint8_t msg[] __asm__("msg");
extern int32_t blinkcog[] __asm__("blinkcog");
extern int32_t rate[] __asm__("rate");
extern int32_t rateptr[] __asm__("rateptr");
_dat_( .section .blinky.dat,"ax" );
_dat_( .compress off );
_dat_( ..dat_start: );
_dat_( msg: );
_dat_( .ascii "hello, world!\r\n" );
_dat_(..org0002_base = . + 0x0 );
_dat_( .balign 4 );
_dat_( blinkcog: );
_dat_( mov dira, (mask-..org0002) );
_dat_( mov outa, (mask-..org0002) );
_dat_( mov (now-..org0002), cnt );
_dat_( Blinkcog_loop: );
_dat_( rdlong (rate-..org0002), (rateptr-..org0002) );
_dat_( add (now-..org0002), (rate-..org0002) );
_dat_( waitcnt (now-..org0002), #0 );
_dat_( xor outa, (mask-..org0002) );
_dat_( jmp #(Blinkcog_loop-..org0002) );
_dat_( mask: );
_dat_( .long BLINKY_PINMASK );
_dat_( rate: );
_dat_( .long 0 );
_dat_( rateptr: );
_dat_( .long 0 );
_dat_( now: );
_dat_( .long 0 );
//
// due to a gas bug, we need the .org constants to be unknown during the first pass
// so they have to be defined here, after all asm is done
//
_dat_(.equ ..org0002, ..org0002_base );
_dat_( .compress default );
_dat_( .text );
void blinky::main(void)
{
  // start up the serial port
  ser.start(31, 30, 0, 115200);
  // say hello
  ser.str((int32_t)(msg));
  // set the blink rate
  rateptr[0] = (int32_t)(rate);
  rate[0] = 2 * BLINKY_HZ;
  cognew((int32_t)(blinkcog), 0);
  while (1) {
    Yield__();
  }
}
Notice that the C++ code is much more readable, using symbols like "msg" instead of "&_load_start_blinky_cog[0]" for DAT section objects. The offsets into the DAT are no longer hard coded. Strings are output more nicely. Also notice that the DAT section no longer has to redefine the constants, which was a maintenance headache. The downside is due to the change to allow multiple orgs: the labels used in the DAT section have to be adjusted in ASM code. Perhaps we can hide this with some clever macros or something.
This only affects --gas; for now the default will remain the same (dat sections output as a binary blob).
A preview of the changes is available on github in the "new_dat" branch. It passes the internal tests, but there's still a problem with the Scribbler and --gas: the LEDs light up and motor comes on, but it never shuts off again. I'm still debugging that.
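In case it isn't obvious why the DAT section can now use the C defines directly: the `_tostr_` / `_tostr__` pair in the generated output is the usual two-level stringification trick, so a macro such as BLINKY_PINMASK is expanded by the preprocessor before the text is handed to the assembler. A small stand-alone C sketch of just that mechanism follows; the value 0x8000 and the helper names `raw_text` and `expanded_text` are placeholders for this example.

```c
#define BLINKY_PINMASK 0x8000   /* stand-in value for this sketch */

/* Same two-level stringification as in the generated output. */
#define _tostr__(...) #__VA_ARGS__
#define _tostr_(...)  _tostr__(__VA_ARGS__)

/* One level only: the argument is stringized before macros are expanded,
   so the assembler would see the bare name BLINKY_PINMASK. */
const char *raw_text(void)
{
    return _tostr__(.long BLINKY_PINMASK);
}

/* Two levels: BLINKY_PINMASK is expanded first and then stringized,
   so the assembler sees the numeric value. */
const char *expanded_text(void)
{
    return _tostr_(.long BLINKY_PINMASK);
}
```

Here `raw_text()` yields `.long BLINKY_PINMASK`, while `expanded_text()` yields `.long 0x8000`, which is the form the `_dat_` macro relies on.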
I've posted a new version (3.4.0) to https://github.com/totalspectrum/spin2cpp/releases. This one implements the --gas switch changes so that the inline assembly is much more readable (and usable). I'd encourage everyone to give it a try. The default handling of DAT is still to create a "binary blob" (array of bytes), but eventually I think --gas should become the default.
The generated output is looking great. I see that spin2cpp generates a warning about the alignment now. It looks like the generated C code still uses a cast to (int32_t *), which means that the compiler will still generate 2 word accesses. So the intent is that the Spin source must be changed to eliminate the warning and to ensure proper operation. Is that correct?
No, when spin2cpp emits the warning about alignment, it also adds an __attribute__((aligned(4))) decoration to the label array, so the compiler will assume that the label is 32 bit aligned and be able to do a wrlong/rdlong. It's a bit of a hack, but it seems to work.
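As a rough illustration of what that decoration buys you, here is a small C sketch (GCC/Clang attribute syntax). The array name `label_bytes` and both helper functions are made up for the example, and the long access goes through memcpy only to keep a host compiler happy about aliasing; it is not meant to mirror the exact code spin2cpp emits.

```c
#include <stdint.h>
#include <string.h>

/* A byte-typed label forced onto a 4-byte boundary. */
static uint8_t label_bytes[8] __attribute__((aligned(4)));

/* Returns 1 if the label's address is a multiple of 4, i.e. it can be
   treated as a 32-bit long. */
int label_is_long_aligned(void)
{
    return ((uintptr_t)label_bytes % 4u) == 0;
}

/* Store a 32-bit value at the label and read it back. */
int32_t store_and_reload(int32_t value)
{
    int32_t out;
    memcpy(label_bytes, &value, sizeof value);
    memcpy(&out, label_bytes, sizeof out);
    return out;
}
```

Without the attribute a `uint8_t` array is only guaranteed byte alignment, which is why the compiler would otherwise have to be conservative about long accesses.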
I've published an updated spin2cpp / fastspin (version 3.5.0). This one has a primitive type inference engine, so the generated C code looks more natural. The compiler tries to guess "natural" types for variables rather than making everything a 32 bit integer. For example, in a function like:
PUB mystrlen(p) : r
  repeat while byte[p] <> 0
    r++
the compiler sees that "p" is only ever used to access bytes, so it gives it the type "char *" in the generated C code:
int32_t foo_mystrlen(char *p)
{
  int32_t r = 0;
  while (p[0] != 0) {
    (r++);
  }
  return r;
}
If for some reason spin2cpp can't figure out a type, or gets conflicting ideas, it punts and makes it "int32_t" (as before).
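The "guess a type, punt on conflict" behaviour can be pictured with a sketch like the one below. This is only a guess at the general shape of such an engine, not spin2cpp's actual implementation: the `InferredType` enum, `merge_use`, `infer`, and the mapping of word accesses to `uint16_t *` are all assumptions for illustration.

```c
/* Hypothetical set of types that can be inferred for one Spin variable. */
typedef enum { T_UNKNOWN, T_BYTE_PTR, T_WORD_PTR, T_LONG_PTR, T_INT32 } InferredType;

/* Merge what is known so far with the type suggested by one more use. */
InferredType merge_use(InferredType known, InferredType use)
{
    if (known == T_UNKNOWN) return use;
    if (use == T_UNKNOWN || use == known) return known;
    return T_INT32;  /* conflicting evidence: fall back to int32_t */
}

/* Fold all observed uses; a variable with no informative uses stays int32_t. */
InferredType infer(const InferredType *uses, int n)
{
    InferredType t = T_UNKNOWN;
    for (int i = 0; i < n; i++)
        t = merge_use(t, uses[i]);
    return t == T_UNKNOWN ? T_INT32 : t;
}

/* C spelling used when emitting a declaration for the inferred type. */
const char *c_type_name(InferredType t)
{
    switch (t) {
    case T_BYTE_PTR: return "char *";
    case T_WORD_PTR: return "uint16_t *";
    case T_LONG_PTR: return "int32_t *";
    default:         return "int32_t";
    }
}
```

For the mystrlen example, every use of `p` is a byte access, so `infer` would settle on `T_BYTE_PTR`; mixing a byte access with a long access would land on `T_INT32`.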
There's a new spin2cpp release, version 3.6.0. Besides various bug fixes, it has an implementation of the proposed object pointer syntax for Spin2. You can pass pointers to objects, and cast those pointers to "dummy" objects declared with = instead of :.
OBJ
  fds: "FullDuplexSerial" ' a real, concrete object
  ser="FullDuplexSerial" ' defines a type
'' print a hex number to a serial port
PUB printx(s, n)
  ser[s].hex(n, 8)
'' print a hex number to fds
PUB printx_default(n)
  printx(@fds, n) '' ends up working like fds.hex(n, 8)
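For readers who think in C, the object pointer idea above is roughly the following: an untyped value is passed around and reinterpreted as a pointer to an object of a known type at the point of use. This is only an analogy, not what spin2cpp generates. The `Serial` struct is a made-up stand-in that records output in a buffer instead of driving a serial port, and `intptr_t` is used so the sketch also works on a 64-bit host.

```c
#include <stdint.h>

/* Hypothetical, minimal stand-in for a serial object: it just records output. */
typedef struct {
    char buf[32];
    int  len;
} Serial;

/* Append the low `digits` hex digits of `value` to the object's buffer. */
void Serial_hex(Serial *self, uint32_t value, int digits)
{
    static const char hexdigits[] = "0123456789ABCDEF";
    if (digits > 8) digits = 8;  /* a 32-bit value has at most 8 hex digits */
    for (int i = digits - 1; i >= 0 && self->len < (int)sizeof self->buf - 1; i--)
        self->buf[self->len++] = hexdigits[(value >> (4 * i)) & 0xF];
    self->buf[self->len] = '\0';
}

/* Analogue of: PUB printx(s, n)  ser[s].hex(n, 8)
   `s` is an untyped value that is reinterpreted as an object pointer. */
void printx(intptr_t s, uint32_t n)
{
    Serial_hex((Serial *)s, n, 8);
}

static Serial fds;  /* the "real, concrete object" */

/* Analogue of: PUB printx_default(n)  printx(@fds, n) */
void printx_default(uint32_t n)
{
    printx((intptr_t)&fds, n);
}
```

Calling `printx_default(0xBEEF)` appends the eight digits `0000BEEF` to `fds.buf`, the same effect as calling the method on `fds` directly.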
Note that although Propeller2 support was up to date at one point, the instruction set has changed again and so spin2cpp --p2 may or may not work for you. I don't really want to chase a moving target, so if you want to try the P2 support you're on your own. Most likely the PASM output for P2 will work (the instructions that spin2cpp uses still have the same names) but binary output probably won't work.
Hmm, yes.
Note that David Betz has neatly solved this 'following binary changes' issue: he has written a program that parses the P2 instruction spreadsheet to generate the tables that drive the assembler.
David's program is pretty neat, but the format it prints is not the one I use in spin2cpp, so either way I would have to make code changes. And I really am tired of keeping up with instruction set changes. I don't think anyone uses spin2cpp for P2, or at least I've had no feedback about it, so there's really no need to keep it up to date. If/when there's silicon, I may support P2 in fastspin/spin2cpp, but for now I'm just going to sit this one out.
Comments
Eric
The source has this method:
spin2cpp says "SD-MMC_FATEngine.spin:1797: error: symbol addressDIREntry on left hand side of assignment" and writes that code:
It is easily reproducible without the original file. I'm not sure when it was introduced; I had the same code converted some time ago and apparently it worked (to be honest, I'm not sure whether I changed the code by hand).
Note that David Betz has neatly solved this 'following binary changes' issue: he has written a program that parses the P2 instruction spreadsheet to generate the tables that drive the assembler.
http://forums.parallax.com/discussion/comment/1404918/#Comment_1404918
You could likely get a copy of that?