Compiling Spin

I've been distracted by other things lately and only just got around to reading the "Why oh why in this day and age" thread, which wandered off-topic into the subject of Spin compilers. It appears there's some confusion about spin2cpp and its role in compiling Spin, so I wanted to point out a few things:
(1) spin2cpp started off as a converter from Spin to C++, but now it's a full compiler. It parses Spin code into an intermediate tree, and then from that tree can produce C, C++, PASM (P1 or P2), or binary output. The name is misleading now. Actually there is a different frontend for spin2cpp (called "fastspin") that mimics the interface of openspin, so perhaps we should start using that name for the compiler suite.
(2) The spin2cpp parser is written in YACC, so it is in some sense a formal definition of Spin grammar (at least, the Spin grammar accepted by spin2cpp; but so far I haven't found any programs that openspin takes but spin2cpp doesn't). There are actually two phases to parsing: lexical analysis (converting strings into tokens) and then the grammar (figuring out how the tokens go together to make a program). The grammar is in YACC; the lexical analysis is hand-written in C just to make some things easier (e.g. for indenting we emit INDENT and OUTDENT tokens).
(3) fastspin/spin2cpp does do optimization, both on the intermediate tree (so language independent optimizations) and on the PASM output. Examples of some of the optimizations it can perform are: dead code elimination, function inlining (any sufficiently small function is inlined, as are all functions that are called only once), common sub-expression elimination, and loop strength reduction.
(4) fastspin/spin2cpp can produce both LMM and plain COG programs. There are options to place data in either COG or HUB memory, and similarly for code to go in COG or HUB memory. On the P1, code in HUB is interpreted via the usual LMM mechanism, and also FCACHE. FCACHE means that small loops are compiled as plain COG code and loaded at run time into COG memory, so the whole loop runs at full speed (rather than doing the LMM read/execute one instruction at a time over and over).
(5) fastspin supports inline assembly. In fact many of the Spin primitives like COGNEW are implemented internally as plain Spin functions that use inline assemlby; these are then inlined automatically by the optimizer.
(6) The P2 output routines are a bit out of date, they need to be brought up to the most recent P2 definition.
(7) Performance of the PASM binaries produced by fastspin/spin2cpp is generally decent. PropGCC can usually do better (it has a more sophisticated optimizer) but I think fastspin can beat all of the other languages/compilers out there (other than pure PASM of course) and on some specific tasks like pin manipulation it can even beat PropGCC. As a representative example, here are the results from Heater's fft benchmark with recent builds of fastspin and PropGCC:
(1) spin2cpp started off as a converter from Spin to C++, but now it's a full compiler. It parses Spin code into an intermediate tree, and then from that tree can produce C, C++, PASM (P1 or P2), or binary output. The name is misleading now. Actually there is a different frontend for spin2cpp (called "fastspin") that mimics the interface of openspin, so perhaps we should start using that name for the compiler suite.
(2) The spin2cpp parser is written in YACC, so it is in some sense a formal definition of Spin grammar (at least, the Spin grammar accepted by spin2cpp; but so far I haven't found any programs that openspin takes but spin2cpp doesn't). There are actually two phases to parsing: lexical analysis (converting strings into tokens) and then the grammar (figuring out how the tokens go together to make a program). The grammar is in YACC; the lexical analysis is hand-written in C just to make some things easier (e.g. for indenting we emit INDENT and OUTDENT tokens).
(3) fastspin/spin2cpp does do optimization, both on the intermediate tree (so language independent optimizations) and on the PASM output. Examples of some of the optimizations it can perform are: dead code elimination, function inlining (any sufficiently small function is inlined, as are all functions that are called only once), common sub-expression elimination, and loop strength reduction.
(4) fastspin/spin2cpp can produce both LMM and plain COG programs. There are options to place data in either COG or HUB memory, and similarly for code to go in COG or HUB memory. On the P1, code in HUB is interpreted via the usual LMM mechanism, and also FCACHE. FCACHE means that small loops are compiled as plain COG code and loaded at run time into COG memory, so the whole loop runs at full speed (rather than doing the LMM read/execute one instruction at a time over and over).
(5) fastspin supports inline assembly. In fact many of the Spin primitives like COGNEW are implemented internally as plain Spin functions that use inline assemlby; these are then inlined automatically by the optimizer.
(6) The P2 output routines are a bit out of date, they need to be brought up to the most recent P2 definition.
(7) Performance of the PASM binaries produced by fastspin/spin2cpp is generally decent. PropGCC can usually do better (it has a more sophisticated optimizer) but I think fastspin can beat all of the other languages/compilers out there (other than pure PASM of course) and on some specific tasks like pin manipulation it can even beat PropGCC. As a representative example, here are the results from Heater's fft benchmark with recent builds of fastspin and PropGCC:
gcc LMM -O2: 65247 us
gcc LMM -Os: 138096 us
fastspin 3.2.0 -O: 170822 us
fastspin 3.2.0: 171422 us
Catalina LMM: ~348000 us
gcc CMM -O2: 520296 us
gcc CMM -Os: 567318 us
PropBasic LMM(*): 690842 us
JDForth: ~1200000 us
OpenSpin: 1465321 us
(*): PropBasic produced incorrect results, so should be taken with a grain of salt
Catalina and JDForth results are from a slightly older benchmark version, as
posted on forums
Comments
CON _clkmode = xtal1 + pll16x _clkfreq = 80_000_000 DO = 0 'pins CLK = 1 CS = 2 VAR byte array[512] pub Main | i outa[CS] := 1 dira[CS] := 1 dira[DO] := 1 dira[CLK] := 1 repeat i from 0 to 511 array[i] := i & 255 outa[CS] := 0 repeat i from 0 to 511 spiout(array[i]) outa[CS] := 1 pub spiout(value) value ><= 8 'MSB first repeat 8 '8 bits outa[DO] := value outa[CLK] := 1 value >>= 1 outa[CLK] := 0
Here's the results from spin2cpp. The exact command line is:
spin2cpp --asm --binary --code=cog --data=hub --cse spibase.spin
which produces spibase.pasm (the assembly listing) and spibase.binary (the compiled binary). Here's spibase.pasm:CON _clkmode = 1032 _clkfreq = 80000000 DO = 0 CLK = 1 CS = 2 PUB main coginit(0, @entry, 0) DAT org 0 entry mov arg1, par wz call #_Main cogexit cogid arg1 cogstop arg1 _Main or OUTA, #4 or DIRA, #4 or DIRA, #1 or DIRA, #2 mov _Main__cse__0008, objptr mov _Main_i, #0 L__0017 cmps _Main_i, #511 wc,wz if_a jmp #L__0019 mov _Main__cse__0007, _Main_i and _Main__cse__0007, #255 wrbyte _Main__cse__0007, _Main__cse__0008 add _Main_i, #1 add _Main__cse__0008, #1 jmp #L__0017 L__0019 andn OUTA, #4 mov _Main__cse__0010, objptr mov _Main_i, #0 L__0020 cmps _Main_i, #511 wc,wz if_a jmp #L__0022 rdbyte arg1, _Main__cse__0010 call #_spiout add _Main_i, #1 add _Main__cse__0010, #1 jmp #L__0020 L__0022 or OUTA, #4 _Main_ret ret _spiout mov _var_02, #1 rev arg1, #24 mov _var_03, #8 L__0023 shr arg1, #1 wc muxc OUTA, _var_02 or OUTA, #2 andn OUTA, #2 djnz _var_03, #L__0023 _spiout_ret ret objptr long @@@objmem result1 long 0 COG_BSS_START fit 496 hubexit jmp #cogexit objmem long 0[128] org COG_BSS_START _Main__cse__0007 res 1 _Main__cse__0008 res 1 _Main__cse__0010 res 1 _Main_i res 1 _var_02 res 1 _var_03 res 1 arg1 res 1 arg2 res 1 arg3 res 1 arg4 res 1 fit 496
The inner loop in the spiout function is pretty good, I think. I do see that the optimizer missed the chance to replace _var_02 with the constant 1, but otherwise I think it's what most PASM programmers would write by hand. It's also slightly better than the code GCC produces for a similar function; in this case fastspin/spin2cpp's knowledge of the Prop architecture was able to beat GCC's better generic optimizers.
Excuse me whist I transcribe it into Javascript....
Since Chip asked recently about formal grammars, would it be possible to convert Spin's into readable form (e.g. BNF)?
-Phil
Even better, how hard would it be to make fastspin able to output code in multiple output formats in the same binary? You could have all initialization code (since you have to boot into PNUT anyway) and cogs that don't need to be fast running PNUT to save RAM, and cogs that need to be fast running native PASM compiled from Spin. To specify what target should be used, I would suggest a special builtin function that takes a function and an output format and returns a pointer that can be passed to COGNEW. Obviously, if a single function needs to be compiled to multiple output formats, the compiler should be smart enough to do that.
It would be really neat if you made it so it could do things like compile arithmetic-only functions to floating point command stream scripts, similar to what JasonDorie did for the ELEV-8 firmware in C. This is why I think the special builtin above would be better than a special COGNEW: you might not want to pass the pointer to COGNEW - you might want to pass it to your floating point cog instead.
EDIT: reworded stuff
This should be a fantastic PASM learning tool for me in Linux.
Haven't got past FTDI drivers yet, "sudo shy".
I'll get over it as soon as I learn to make scripts, something like batch files that I have done before. Still learning the Linux landscape.
Quite happy with PropellerIDE and BST. Over thought that one, not knowing BST was part of the basic install.
Just curious since I have been using Linux for many years now, but what do you mean "haven't got past FTDI drivers yet"? I know I never have to worry about the drivers as Linux Mint works out of the box with this sort of thing.
Peter, I'm going by what I read on the FTDI site, I don't have my Linux machine up at the moment, but Parallax diverted me to the FTDI site for installation instructions. I couldn't find a .deb package to do an express install. I am going wrong direction, right?
EDIT: When I can get USB running, I will have the best of all worlds.
I think the Linux advice is well well out of date, many serial USB devices just work with Linux these days, and have done so for many years. This includes the popular FTDI drivers too of course.
Ignore what you saw and just plug it in and it comes up as ttyUSB0 etc. I use minicom for a serial terminal.
peter@peter-workstation ~ $ ls /dev/ttyU* /dev/ttyUSB0 /dev/ttyUSB1 /dev/ttyUSB2 peter@peter-workstation ~ $ lsusb Bus 002 Device 002: ID 8087:0024 Intel Corp. Integrated Rate Matching Hub Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub Bus 001 Device 005: ID 2109:2812 VIA Labs, Inc. VL812 Hub Bus 001 Device 004: ID 0a12:0001 Cambridge Silicon Radio, Ltd Bluetooth Dongle (HCI mode) Bus 001 Device 003: ID 058f:6366 Alcor Micro Corp. Multi Flash Reader Bus 001 Device 002: ID 8087:0024 Intel Corp. Integrated Rate Matching Hub Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub Bus 004 Device 002: ID 0480:a00c Toshiba America Inc Bus 004 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub Bus 003 Device 004: ID 045e:0745 Microsoft Corp. Nano Transceiver v1.0 for Bluetooth Bus 003 Device 020: ID 0403:6001 Future Technology Devices International, Ltd FT232 USB-Serial (UART) IC Bus 003 Device 012: ID 0403:6001 Future Technology Devices International, Ltd FT232 USB-Serial (UART) IC Bus 003 Device 006: ID 05e3:0608 Genesys Logic, Inc. Hub Bus 003 Device 021: ID 0403:6001 Future Technology Devices International, Ltd FT232 USB-Serial (UART) IC Bus 003 Device 002: ID 05e3:0608 Genesys Logic, Inc. Hub Bus 003 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub peter@peter-workstation ~ $
This is distribution dependent, so it may not apply to your Linux installation. Debian and its derivatives (and possibly RedHat/Fedora derivates as well) require that you either run as root or add user to the "dialout" group. Instructions here: http://askubuntu.com/questions/112568/how-do-i-allow-a-non-default-user-to-use-serial-device-ttyusb0
sudo adduser second_user dialout
You will need to log out and back in for the change to take affect.The intermediate form is an abstract syntax tree (AST) which is a binary tree. But unfortunately there's no documentation of the exact details of the nodes, beyond what's in ast.h. Something to put on my todo list
Below is a vaguely BNF like description that's generated by bison -v, edited a bit by hand to make the nonterminals more obvious. Note that this isn't exactly the same as the language OpenSpin accepts; there are some extensions like inline assembly and IF/THEN/ELSE expressions.
Spin Grammar 0 $accept: input $end 1 input: rest 2 | conblock rest 3 rest: topelement 4 | topelement rest 5 emptyline: <END_OF_LINE> 6 emptylines: 7 | emptylines emptyline 8 topelement: "CON" conblock 9 | "DAT" datblock 10 | "DAT" annotation datblock 11 | "VAR" varblock 12 | "OBJ" objblock 13 | "PUB" funcdef funcbody 14 | "PRI" funcdef funcbody 15 | "PUB" annotation funcdef funcbody 16 | "PRI" annotation funcdef funcbody 17 | annotation emptylines 18 funcdef: identifier optparamlist <END_OF_LINE> 19 | identifier optparamlist localvars <END_OF_LINE> 20 | identifier optparamlist resultname localvars <END_OF_LINE> 21 | identifier optparamlist resultname <END_OF_LINE> 22 optparamlist: 23 | identlist 24 | '(' identlist ')' 25 resultname: ':' identifier 26 localvars: '|' identlist 27 funcbody: 28 | stmtlist 29 stmtlist: stmt 30 | stmtlist stmt 31 stmt: basicstmt 32 | compoundstmt 33 | <END_OF_LINE> 34 | error <END_OF_LINE> 35 basicstmt: "RETURN" <END_OF_LINE> 36 | "RETURN" expr <END_OF_LINE> 37 | "ABORT" <END_OF_LINE> 38 | "ABORT" expr <END_OF_LINE> 39 | expr <END_OF_LINE> 40 | "QUIT" <END_OF_LINE> 41 | "NEXT" <END_OF_LINE> 42 compoundstmt: ifstmt 43 | repeatstmt 44 | stmtblock 45 | casestmt 46 stmtblock: <INDENT> stmtlist <OUTDENT> 47 | <INDENT> <OUTDENT> 48 ifstmt: "IF" expr <END_OF_LINE> elseblock 49 | "IFNOT" expr <END_OF_LINE> elseblock 50 elseblock: stmtblock 51 | stmtblock "ELSE" <END_OF_LINE> stmtblock 52 | stmtblock "ELSEIF" expr <END_OF_LINE> elseblock 53 | stmtblock "ELSEIFNOT" expr <END_OF_LINE> elseblock 54 casestmt: "CASE" expr <END_OF_LINE> <INDENT> casematchlist <OUTDENT> 55 casematchlist: casematchitem 56 | casematchlist casematchitem 57 casematchitem: casematch <END_OF_LINE> stmtblock 58 casematch: matchexprlist ':' 59 matchexprlist: matchexpritem 60 | matchexprlist ',' matchexpritem 61 matchexpritem: "OTHER" 62 | expr ".." expr 63 | expr 64 rangeexpritem: expr 65 | expr ".." expr 66 rangeexprlist: rangeexpritem 67 | rangeexprlist ',' rangeexpritem 68 repeatstmt: "REPEAT" <END_OF_LINE> stmtblock 69 | "REPEAT" <END_OF_LINE> stmtblock "WHILE" expr <END_OF_LINE> 70 | "REPEAT" <END_OF_LINE> stmtblock "UNTIL" expr <END_OF_LINE> 71 | "REPEAT" "WHILE" expr <END_OF_LINE> stmtblock 72 | "REPEAT" "UNTIL" expr <END_OF_LINE> stmtblock 73 | "REPEAT" identifier "FROM" expr "TO" expr "STEP" expr <END_OF_LINE> stmtblock 74 | "REPEAT" identifier "FROM" expr "TO" expr <END_OF_LINE> stmtblock 75 | "REPEAT" expr <END_OF_LINE> stmtblock 76 | "ASM" datblock "ENDASM" 77 | "CCODE" 78 lookupexpr: "LOOKUPZ" '(' expr ':' rangeexprlist ')' 79 | "LOOKUP" '(' expr ':' rangeexprlist ')' 80 lookdownexpr: "LOOKDOWNZ" '(' expr ':' rangeexprlist ')' 81 | "LOOKDOWN" '(' expr ':' rangeexprlist ')' 82 conblock: conline 83 | conblock conline 84 conline: enumlist <END_OF_LINE> 85 | <END_OF_LINE> 86 | error <END_OF_LINE> 87 enumlist: enumitem 88 | enumlist ',' enumitem 89 enumitem: identifier '=' expr 90 | identifier 91 | identifier '[' expr ']' 92 | '#' expr 93 datblock: datline 94 | datblock datline 95 datline: basedatline 96 | identifier basedatline 97 basedatline: <END_OF_LINE> 98 | error <END_OF_LINE> 99 | "BYTE" <END_OF_LINE> 100 | "BYTE" exprlist <END_OF_LINE> 101 | "WORD" <END_OF_LINE> 102 | "WORD" exprlist <END_OF_LINE> 103 | "LONG" <END_OF_LINE> 104 | "LONG" exprlist <END_OF_LINE> 105 | instruction <END_OF_LINE> 106 | instruction operandlist <END_OF_LINE> 107 | instruction modifierlist <END_OF_LINE> 108 | instruction operandlist modifierlist <END_OF_LINE> 109 | "ORG" <END_OF_LINE> 110 | "ORG" expr <END_OF_LINE> 111 | "ORGH" <END_OF_LINE> 112 | "ORGH" expr <END_OF_LINE> 113 | "RES" expr <END_OF_LINE> 114 | "FIT" expr <END_OF_LINE> 115 | "FIT" <END_OF_LINE> 116 | "FILE" string <END_OF_LINE> 117 objblock: objline 118 | objblock objline 119 objline: <END_OF_LINE> 120 | error <END_OF_LINE> 121 | identdecl ':' string 122 varblock: varline 123 | varblock varline 124 varline: "BYTE" identlist <END_OF_LINE> 125 | "WORD" identlist <END_OF_LINE> 126 | "LONG" identlist <END_OF_LINE> 127 | <END_OF_LINE> 128 | error <END_OF_LINE> 129 identlist: identdecl 130 | annotation identdecl 131 | identlist ',' identdecl 132 identdecl: identifier 133 | identifier '[' expr ']' 134 expr: integer 135 | float 136 | string 137 | "STRING" '(' exprlist ')' 138 | lhs 139 | '@' lhs 140 | "@@" lhs 141 | "@@@" lhs 142 | lhs ":=" expr 143 | identifier '#' identifier 144 | expr '+' expr 145 | expr '-' expr 146 | expr '*' expr 147 | expr '/' expr 148 | expr '&' expr 149 | expr '|' expr 150 | expr '^' expr 151 | expr '>' expr 152 | expr '<' expr 153 | expr "=>" expr 154 | expr "=<" expr 155 | expr "<>" expr 156 | expr "==" expr 157 | expr "//" expr 158 | expr "**" expr 159 | expr "#>" expr 160 | expr "<#" expr 161 | expr "><" expr 162 | expr "<-" expr 163 | expr "->" expr 164 | expr "<<" expr 165 | expr ">>" expr 166 | expr "~>" expr 167 | expr "OR" expr 168 | expr "AND" expr 169 | expr '+' '=' expr 170 | expr '-' '=' expr 171 | expr '/' '=' expr 172 | expr '*' '=' expr 173 | expr '&' '=' expr 174 | expr '|' '=' expr 175 | expr '^' '=' expr 176 | expr "//" '=' expr 177 | expr "**" '=' expr 178 | expr "#>" '=' expr 179 | expr "<#" '=' expr 180 | expr "><" '=' expr 181 | expr "<-" '=' expr 182 | expr "->" '=' expr 183 | expr "<<" '=' expr 184 | expr ">>" '=' expr 185 | expr "~>" '=' expr 186 | expr "AND" '=' expr 187 | expr "OR" '=' expr 188 | expr '<' '=' expr 189 | expr '>' '=' expr 190 | expr "=<" '=' expr 191 | expr "=>" '=' expr 192 | '(' expr ')' 193 | '\\' funccall 194 | '\\' identifier 195 | funccall 196 | '-' expr 197 | '!' expr 198 | '~' expr 199 | "~~" expr 200 | "NOT" expr 201 | "||" expr 202 | "^^" expr 203 | "|<" expr 204 | ">|" expr 205 | "$" 206 | lhs "++" 207 | lhs "--" 208 | "++" lhs 209 | "--" lhs 210 | lhs '?' 211 | '?' lhs 212 | lhs '~' 213 | lhs "~~" 214 | <CONSTANT> '(' expr ')' 215 | <FLOAT> '(' expr ')' 216 | "ROUND" '(' expr ')' 217 | "TRUNC" '(' expr ')' 218 | lookupexpr 219 | lookdownexpr 220 | "IF" expr "THEN" expr "ELSE" expr 221 lhs: identifier 222 | identifier '[' expr ']' 223 | hwreg 224 | hwreg '[' range ']' 225 | memref '[' expr ']' 226 | memref 227 | "SPR" '[' expr ']' 228 memref: "BYTE" '[' expr ']' 229 | "WORD" '[' expr ']' 230 | "LONG" '[' expr ']' 231 | identifier '.' "BYTE" 232 | identifier '.' "WORD" 233 | identifier '.' "LONG" 234 funccall: identifier '(' exprlist ')' 235 | "COGINIT" '(' exprlist ')' 236 | "COGNEW" '(' exprlist ')' 237 | identifier '.' identifier '(' exprlist ')' 238 | identifier '.' identifier 239 | identifier '[' expr ']' '.' identifier '(' exprlist ')' 240 | identifier '[' expr ']' '.' identifier 241 expritem: expr 242 | integer '[' expr ']' 243 | float '[' expr ']' 244 | '-' integer '[' expr ']' 245 | '-' float '[' expr ']' 246 | string '[' expr ']' 247 exprlist: expritem 248 | exprlist ',' expritem 249 operand: expr 250 | '#' expr 251 | '#' '#' expr 252 operandlist: operand 253 | operandlist ',' operand 254 range: expr 255 | expr ".." expr 256 integer: <NUMBER> 257 float: <FLOAT> 258 string: <STRING> 259 identifier: <IDENTIFIER> 260 | "RESULT" 261 annotation: T_ANNOTATION 262 hwreg: <HARDWARE_REGISTER_NAME> 263 instruction: <INSTRUCTION_SYMBOL> 264 | instrmodifier instruction 265 instrmodifier: <INSTRUCTION_MODIFIER_SYMBOL> 266 modifierlist: instrmodifier 267 | modifierlist instrmodifier 268 | modifierlist ',' instrmodifier
Thanks David, My distribution is Linux Mint (Sarah), and I should expand on what I said, I should say that: when I plugged in the Propeller, the USB device wasn't recognized. I know I installed an express card USB 3.0, and I want to be able to take advantage of that. I will look into this a little later.
Note that BST does have an optimizer already.
Mike Y.
Let's say you have the following code:
PUB fibo(i) : a | b, c b := 1 repeat i c := a+b b := a a := c
The only reason variable "c" exists is to save the value that will get written to "a" so it doesn't get clobbered by "b := a". A store and a load can be shaved off by just leaving the result of "a+b" on the stack until after the "b := a", and then assigning it to "a". Furthermore, the "b := " of "b := a" can have its pop flag cleared, to save having to load "a" twice. BST's optimizer did not do this (in fact, it does very little).
This is sort of a bad example, since the above optimizations can be performed in plain Spin as shown below. However, if any control structures were in between "c := a+b" and "a := c", it would not be possible to leave the values of the partially evaluated expressions on the stack in plain Spin.
PUB fibo(i) : a | b b := 1 repeat i a := b + (b := a)
Thanks,
-Phil
Might be best to open a new thread for this
I will if I can't figure it out, certainly don't want to derail the OP.