Why...O...why...in this day and age...

DavidZemon · 2017-01-06 19:43

cgracey wrote: »

JasonDorie wrote: »

Toggle, or set H or L depending on the instruction? (I'd expect the latter to be more useful)

And yes, you probably understand my meaning. When writing the firmware for the Elev8 in Spin, there were cases where I needed more than one of a given type of object in the project, but also wanted to call functions in them from multiple places in the code, that's hard to do in Spin without duplicating code. I needed two serial port objects (USB and XBee) and wanted to send output to them from multiple sub-objects in my code. Much simpler in C/C++.

I also found the lack of optimization in Spin to be huge. Function level linking, dead code elimination, auto-inlines, and constant folding make a massive difference in code size and performance, and meant that my C/C++ version of the Elev8 code is slightly smaller than the Spin version, and 2x faster, even though the bytecode representation for CMM code is 50% or more larger than Spin for simple benchmarks.

Ultimately, it would be great if we could get optimizing compilation working for Spin. It's a lifetime of development effort to achieve that, it seems. Do you think lessons could be efficiently learned from how CLANG, GNUC, and others work? I imagine there's some degree of abstraction at which it's all the same. I wonder how many KB of compiler source code define those functions.

This is why I would argue that Spin2cpp is all you need. Spin -> C/C++ can be un-optimized, then rely on the fantastic work of the GCC and PropGCC teams to do the optimizations that have taken a lifetime to develop (and will continue developing for another lifetime or ten probably).

JasonDorie · 2017-01-06 20:27

That introduces another layer of potential errors though, unless it's perfect. The Spin opcodes are more compact than CMM compiled C. If a native Spin compiler could produce even some of the same optimizations as GCC it would improve the performance and size of Spin.

Optimization is something that can be done piecemeal over time, too. I've written two different compilers (though without loops or branching, so they're somewhat simpler to deal with). In both cases I started with a very direct input -> output code generator, then added optimizations over time, starting with the lowest hanging fruit and working my way up.

Starting with LLVM or CLang would be better than GCC, as the infrastructure is much more modular & modern. To support Spin you'd need to write the parser and generator stages, but you'd use their internal representation for the intermediate stages, which gets you most of their existing optimization work. Assembly level optimization (processor specific features) would still be on you.

jmg · 2017-01-06 20:36

JasonDorie wrote: »

I've always thought these kinds of "look how simple it is!" tests are a bit silly, because the triviality of the example means that it's simple to implement in just about any language.

Yes and no.
The real value comes, when those examples include both ASM listings, and a toggle speed value.
Numbers help qualify everything.

In the case of token-languages, those stats should include the loaded token engine.

JasonDorie · 2017-01-06 20:45

I agree they have value, but simple cases also muddy the waters sometimes. For example, for a simple toggle example, Spin is going to win for size, period, because the representation is compact and the interpreter is in ROM. CMM would include the size of the interpreter and standard setup libs, so it would be large for a trivial example. For a larger project though, C will likely win because optimizations will overtake that initial disadvantage.

On the other hand, that in itself is probably useful to know.

Regardless, we're kind of going off the rails here from the OP.

jmg · 2017-01-06 21:01

DavidZemon wrote: »

cgracey wrote: »

Ultimately, it would be great if we could get optimizing compilation working for Spin. It's a lifetime of development effort to achieve that, it seems. Do you think lessons could be efficiently learned from how CLANG, GNUC, and others work? I imagine there's some degree of abstraction at which it's all the same. I wonder how many KB of compiler source code define those functions.

This is why I would argue that Spin2cpp is all you need. Spin -> C/C++ can be un-optimized, then rely on the fantastic work of the GCC and PropGCC teams to do the optimizations that have taken a lifetime to develop (and will continue developing for another lifetime or ten probably).

Yup.

Certainly before Chip starts into any Spin on P2, he should look very carefully at what Spin2cpp already does.
Even that name is somewhat obsolete, as latest versions should be called
Spin2cpp_or_Asm_for_P1_or_P2

Some info is spread a little, but try these posts
http://forums.parallax.com/discussion/comment/1374983/#Comment_1374983
https://forums.parallax.com/discussion/comment/1389266/#Comment_1389266
https://forums.parallax.com/discussion/comment/1390346/#Comment_1390346
https://forums.parallax.com/discussion/comment/1390721/#Comment_1390721

and here is a code comment I found here

'' simple hello world that works on PC or Propeller
''
'' to compile for PC:
''   spin2cpp --cc=gcc -DPC --elf -o Hello HelloWorld.spin
'' to compile for Propeller
''   spin2cpp --asm --binary -Os -o hello.binary HelloWorld.spin
'' to compile for Prop2:
''   spin2cpp --p2 --asm -o hello.p2asm --code=hub HelloWorld.spin

Note that spin2cpp can output generic C for a PC, or even ASM for P1 or P2, all from a single source file, just by varying the command line.

JasonDorie wrote: »

That introduces another layer of potential errors though, unless it's perfect...

Certainly, it needs to be well-tested, but I think that can be done, as Parallax already have a strong commitment to C, and that means they will support and improve C flows.

Also note that the newest Spin2cpp does not actually have to use a C compiler at all - see the example above of direct ASM generation.

jmg · 2017-01-06 21:26

cgracey wrote: »

My only personal desire is to make Spin for the Prop2. It could compile to byte code or straight PASM, or either.

I recognize the value of snatching those Hipster dollars from cyberspace, but I'm actually pretty limited, in terms of what I have the stomach for and the will to do. Spin is going to be the first thing I work on.

This 'finite-resource' aspect is why I'd suggest looking first at what Spin2cpp can do already.
( once P2 has hardware sign off, of course )

It is often much easier to work 'on top of other work', than be a pioneer

In mining that vein, I also earlier looked into fasmg, to try to find a quick way to get a P2/P1 macro assembler.

See the thread and reports here on that testing:
https://board.flatassembler.net/topic.php?t=19443&postdays=0&postorder=asc&start=0
http://forums.parallax.com/discussion/165101/macro-assembler-for-p1-p2

The wizard author of fasmg, Tomasz Grysztar, also applied some extensions to bring fasmg in line with modern conventions, such as tolerating underscores in binary/hex strings, and he helped with compact macros, as fasmg is not really an 'assembler' so much as a powerful macro engine (coded in itself, of course).

The end outcome is fasmg can parse Prop ASM, with some cleanups
(eg fasmg assembler uses the more conventional ; as an asm comment )

I did not cover all opcodes, but I tried to test all prefix/suffix conditionals, which are the Prop-Specials on ASM code.

David Betz · 2017-01-06 21:55

JasonDorie wrote: »

That introduces another layer of potential errors though, unless it's perfect. The Spin opcodes are more compact than CMM compiled C. If a native Spin compiler could produce even some of the same optimizations as GCC it would improve the performance and size of Spin.

Eric has a version of spin2cpp that generates PASM for either P1 or P2 so there doesn't have to be an added layer of translation. I'm sure he could also add a code generator for a bytecode VM if one were defined for the P2.

JasonDorie · 2017-01-06 22:16

His original suggestion was that Spin2Cpp allowed us to gain the optimization stages offered by gcc. Adding a PASM or PNUT output stage would negate that - my original point was that using clang or llvm and adding the generator to those would get you that optimization, and you could still target the output for PNUT, PASM, CMM, or other flavors of runtime.

David Betz · 2017-01-06 22:24

JasonDorie wrote: »

His original suggestion was that Spin2Cpp allowed us to gain the optimization stages offered by gcc. Adding a PASM or PNUT output stage would negate that - my original point was that using clang or llvm and adding the generator to those would get you that optimization, and you could still target the output for PNUT, PASM, CMM, or other flavors of runtime.

I'd like to see LLVM used. Anyone here know it well enough to attempt a P2 code generator?

David Betz · 2017-01-06 22:32

cgracey wrote: »

JasonDorie wrote: »

Toggle, or set H or L depending on the instruction? (I'd expect the latter to be more useful)

And yes, you probably understand my meaning. When writing the firmware for the Elev8 in Spin, there were cases where I needed more than one of a given type of object in the project, but also wanted to call functions in them from multiple places in the code, that's hard to do in Spin without duplicating code. I needed two serial port objects (USB and XBee) and wanted to send output to them from multiple sub-objects in my code. Much simpler in C/C++.

I also found the lack of optimization in Spin to be huge. Function level linking, dead code elimination, auto-inlines, and constant folding make a massive difference in code size and performance, and meant that my C/C++ version of the Elev8 code is slightly smaller than the Spin version, and 2x faster, even though the bytecode representation for CMM code is 50% or more larger than Spin for simple benchmarks.

Ultimately, it would be great if we could get optimizing compilation working for Spin. It's a lifetime of development effort to achieve that, it seems. Do you think lessons could be efficiently learned from how CLANG, GNUC, and others work? I imagine there's some degree of abstraction at which it's all the same. I wonder how many KB of compiler source code define those functions.

I think the learning curve is pretty steep with GCC and I suspect with LLVM as well. I'm sure you could learn a lot by plowing through the code given enough time but I suspect it would take a *lot* of time. Also, optimization of Spin might be a bit tricky if you want to avoid having to decorate variable declarations with something like "volatile". I think one of the big optimizations is keeping frequently referenced variables in registers and that doesn't work if they are hardware registers or variables shared with other COGs.

DavidZemon · 2017-01-06 22:34

David Betz wrote: »

JasonDorie wrote: »

His original suggestion was that Spin2Cpp allowed us to gain the optimization stages offered by gcc. Adding a PASM or PNUT output stage would negate that - my original point was that using clang or llvm and adding the generator to those would get you that optimization, and you could still target the output for PNUT, PASM, CMM, or other flavors of runtime.

I'd like to see LLVM used. Anyone here know it well enough to attempt a P2 code generator?

If Parallax hires me, I'll learn it

Last time I inquired though, they didn't do remote hires

David Betz · 2017-01-06 22:47

DavidZemon wrote: »

David Betz wrote: »

JasonDorie wrote: »

His original suggestion was that Spin2Cpp allowed us to gain the optimization stages offered by gcc. Adding a PASM or PNUT output stage would negate that - my original point was that using clang or llvm and adding the generator to those would get you that optimization, and you could still target the output for PNUT, PASM, CMM, or other flavors of runtime.

I'd like to see LLVM used. Anyone here know it well enough to attempt a P2 code generator?

If Parallax hires me, I'll learn it. Last time I inquired though, they didn't do remote hires.

You could always move to Rocklin!

jmg · 2017-01-06 23:19

JasonDorie wrote: »

His original suggestion was that Spin2Cpp allowed us to gain the optimization stages offered by gcc. Adding a PASM or PNUT output stage would negate that - my original point was that using clang or llvm and adding the generator to those would get you that optimization, and you could still target the output for PNUT, PASM, CMM, or other flavors of runtime.

You also said this, above

JasonDorie wrote:

I also found the lack of optimization in Spin to be huge. Function level linking, dead code elimination, auto-inlines, and constant folding make a massive difference in code size and performance, and meant that my C/C++ version of the Elev8 code is slightly smaller than the Spin version, and 2x faster, even though the bytecode representation for CMM code is 50% or more larger than Spin for simple benchmarks.

...

Optimization is something that can be done piecemeal over time, too. I've written two different compilers (though without loops or branching, so they're somewhat simpler to deal with). In both cases I started with a very direct input -> output code generator, then added optimizations over time, starting with the lowest hanging fruit and working my way up.

There are many 'optimizations', and as you say here, many are not even done in the back end, but can be managed earlier.
Adding PASM/PNUT generation (as Spin2Cpp does already) does not negate those earlier optimisations, and it even allows later ones to be added, over time.

JasonDorie · 2017-01-06 23:37

Spin2Cpp does optimizations internally? If not, then yes, adding a PASM output to it means that we'd lose the optimizations performed by GCC when parsing the Spin2Cpp output. Did I misunderstand something here?

My understanding:
Spin2Cpp -> cpp files -> GCC (includes optimizers) -> optimized runtime code
Spin2Cpp -> PASM/PNUT files (bypasses optimizer in gcc)

The_Master · 2017-01-06 23:38

Lots of good ideas here, got me thinking. Maybe adding Linux to my Elev-8

Does anyone know if there is enough weight capacity on the Elev-8 for a hard drive? Maybe a networking card too?

JasonDorie · 2017-01-06 23:40

I can't tell if you're serious, but in case you are, Kyle has lifted a 5lb weight with his, so an SSD would be no problem.

Publison · 2017-01-06 23:46

Yea I would not want a spinning hard drive to upset the Gyro.

Chris Savage · 2017-01-06 23:47

A Raspberry Pi should suffice, no?

David Betz · 2017-01-06 23:50

JasonDorie wrote: »

Spin2Cpp does optimizations internally? If not, then yes, adding a PASM output to it means that we'd lose the optimizations performed by GCC when parsing the Spin2Cpp output. Did I misunderstand something here?

My understanding:
Spin2Cpp -> cpp files -> GCC (includes optimizers) -> optimized runtime code
Spin2Cpp -> PASM/PNUT files (bypasses optimizer in gcc)

I think Eric has added some simple optimizations to spin2cpp itself, maybe only when it generates PASM output. I'm not sure of the details. However, yes, you'll probably get better optimization if you go through GCC.

jmg · 2017-01-06 23:50

JasonDorie wrote: »

Spin2Cpp does optimizations internally? If not, then yes, adding a PASM output to it means that we'd lose the optimizations performed by GCC when parsing the Spin2Cpp output. Did I misunderstand something here?

My understanding:
Spin2Cpp -> cpp files -> GCC (includes optimizers) -> optimized runtime code
Spin2Cpp -> PASM/PNUT files (bypasses optimizer in gcc)

Yes, but note that
Spin2Cpp -> PASM/PNUT files (bypasses optimizer in gcc)
does not exclude doing any of your list of Function level linking, dead code elimination, auto-inlines, and constant folding within Spin2Cpp. (Eric may already do some of this)

The point I'm making is that gcc is not the only place where optimization can happen here.

I've even seen assemblers manage dead code elimination.
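As a rough illustration of how function-level dead code elimination can live outside gcc, here is a minimal Python sketch. It assumes the front end has already built a call graph; the names `call_graph`, `reachable_functions`, and `strip_dead` are hypothetical and are not taken from spin2cpp, gcc, or any assembler.

```python
def reachable_functions(call_graph, entry):
    """Return the set of function names reachable from `entry`.

    `call_graph` maps each function name to the names it calls
    (a hypothetical structure, assumed to come from the parser).
    """
    seen = set()
    stack = [entry]
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        # Follow every callee; unknown names are treated as leaf functions.
        stack.extend(call_graph.get(name, ()))
    return seen


def strip_dead(call_graph, entry):
    """Keep only the functions that can actually be reached from `entry`."""
    live = reachable_functions(call_graph, entry)
    return {name: callees for name, callees in call_graph.items() if name in live}


if __name__ == "__main__":
    graph = {
        "main": ["tx"],
        "tx": ["putc"],
        "putc": [],
        "unused": ["putc"],  # never called, so it should be dropped
    }
    print(sorted(strip_dead(graph, "main")))
```

Anything not reached from the entry point is simply never emitted, which is why this kind of pass can sit in a translator or even an assembler rather than a full optimizing back end.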

JasonDorie · 2017-01-06 23:52

Chris Savage wrote: »

A Raspberry Pi should suffice, no?

You'd think, but honestly the multitasking / timeslice nature of the OS on it means that latency can be an issue. It would be great for additional on-board processing, but probably not the thing you want handling all the realtime tasks that the Prop currently does. If it timeslices even at 1ms you might completely miss an input R/C signal. (I will add the caveat that I haven't used one, but have read this in a few different places)

JasonDorie · 2017-01-06 23:58

Jmg - I never said it couldn't be done, I was saying that by excluding gcc from the chain you were losing the optimizations that they've done in that tool, and would therefore have to implement them somewhere else. It's not trivial, but it's obviously possible.

jmg · 2017-01-07 00:03

JasonDorie wrote: »

Chris Savage wrote: »

A Raspberry Pi should suffice, no?

You'd think, but honestly the multitasking / timeslice nature of the OS on it means that latency can be an issue. It would be great for additional on-board processing, but probably not the thing you want handling all the realtime tasks that the Prop currently does. If it timeslices even at 1ms you might completely miss an input R/C signal. (I will add the caveat that I haven't used one, but have read this in a few different places)

Certainly it would be a brave/pioneering type to use a RaspberryPi as the only controller, but I'd never say impossible, as there are interesting things talked about with 'bare metal' pi's and the Pi-3 does have a QuadCore, so maybe someone can run Linux on one core, and do bare-metal real time on some of the other 3 cores ?
Most designers I know would choose the simpler path of adding a separate real-time controller.

JasonDorie · 2017-01-07 00:07

Yeah, bypassing the OS completely would be the way to go. The raw chip should certainly have the power to handle everything, particularly the Pi-3. I've had the itch to add a Cortex-M4 chip to mine as the primary flight controller, and have it use the Prop for all the I/O tasks, but free time and flying weather are both at a premium right now.

jmg · 2017-01-07 00:07

David Betz wrote: »

I think Eric has added some simple optimizations to spin2cpp itself, maybe only when it generates PASM output. I'm not sure of the details. However, yes, you'll probably get better optimization if you go through GCC.

A search for optimize in the Spin2cpp github has 15 hits and the changelog.txt has 5 mentions of optimize.

I see some things are not tagged as optimize, such as the auto-inlines Jason mentioned.

Version 3.1.0 Spin2cpp
 - Added (preliminary) --p2 support for Propeller 2.
 - Added fcache support for Propeller 1.
 - If a function is called only once, and we can eliminate it by inlining it, do so.
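To make the "called only once" rule in that changelog entry concrete, here is a small Python sketch of how such candidates might be found. This is an assumption about the general technique, not a description of how spin2cpp implements it; `call_sites` and `inline_candidates` are hypothetical names.

```python
from collections import Counter


def inline_candidates(call_sites, entry="main"):
    """Return functions that are called from exactly one call site.

    `call_sites` maps each function name to a list of callee names,
    with one list entry per call site (so duplicates are meaningful).
    The entry point and directly recursive functions are excluded.
    """
    counts = Counter(
        callee for callees in call_sites.values() for callee in callees
    )
    return sorted(
        name
        for name in call_sites
        if name != entry
        and counts[name] == 1
        and name not in call_sites[name]
    )


if __name__ == "__main__":
    program = {
        "main": ["init", "log", "log"],  # log is called twice here
        "init": ["log"],
        "log": [],
    }
    print(inline_candidates(program))
```

Once a function has a single call site, inlining it removes the call overhead and the separate copy of the function body, so it cannot increase code size.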

Phil Pilgrim (PhiPi) · 2017-01-07 00:10

Chip,

I'd avoid the spin2cpp route. That's just shoehorning Spin into a language that's an uncomfortable fit. I think you'll find that once you get into it, optimization is not that hard. Take constant folding, for example. During code generation, if the top three items on your parse stack are two constants and an operation, just perform the operation on the constants and push the result back onto the stack instead of emitting the code for it. Simple. Aaand ... it eliminates the need for that pesky constant pseudo-function.
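The constant-folding idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the expression arrives as a postfix token list, only three binary operators are handled, and the names `OPS` and `emit_folded` are made up for the example rather than taken from any Spin compiler.

```python
import operator

# Hypothetical operator table for illustration; real Spin has many more.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def emit_folded(tokens):
    """Fold constant-only operations in a postfix token stream.

    `tokens` holds ints (constants), variable names (str), and operator
    symbols. When an operator arrives and the two most recent outputs are
    constants, the result is computed now and pushed back instead of
    emitting code for the operation.
    """
    out = []
    for tok in tokens:
        if (
            tok in OPS
            and len(out) >= 2
            and isinstance(out[-1], int)
            and isinstance(out[-2], int)
        ):
            rhs = out.pop()
            lhs = out.pop()
            out.append(OPS[tok](lhs, rhs))
        else:
            out.append(tok)
    return out


if __name__ == "__main__":
    # x + 2 * 3 in postfix form
    print(emit_folded(["x", 2, 3, "*", "+"]))
```

Because folded results go back on the same stack, chains such as `1 2 + 3 *` collapse to a single constant without any extra pass.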

I do hope that you'll start with a formal grammar for Spin this time, though. That's an essential part of defining the language. Moreover, working from such a grammar makes writing a recursive-descent parser that much easier.

-Phil

jmg · 2017-01-07 00:17

Phil Pilgrim (PhiPi) wrote: »

Chip,

I'd avoid the spin2cpp route. That's just shoehorning Spin into a language that's an uncomfortable fit.

Did you miss that Spin2cpp can also emit both P1 or P2 ASM ?

Phil Pilgrim (PhiPi) · 2017-01-07 00:20

jmg wrote:

Did you miss that Spin2cpp can also emit both P1 or P2 ASM ?

Skipping the C++ intermediary? So, then it's not spin2cpp but a Spin compiler, right?

-Phil

jmg · 2017-01-07 00:26

Phil Pilgrim (PhiPi) wrote: »

jmg wrote:

Did you miss that Spin2cpp can also emit both P1 or P2 ASM ?

Skipping the C++ intermediary? So, then it's not spin2cpp but a Spin compiler, right?

Correct - see my comment above about how the 'spin2cpp' name is not quite accurate anymore..

Phil Pilgrim (PhiPi) · 2017-01-07 00:45

I really hope that Chip and Jeff are able to produce a Spin dev environment entirely in-house, like they did with the Propeller Tool. Without citing examples, I feel that the latter has maintained an integrity that some of the various outsourced tools have not -- and cannot. It's just the nature of outsourcing that contractors get busy with other stuff and can't always respond to Parallax's needs when called upon or hew to Parallax's priorities. I'm one of those people, so I understand the problem.

-Phil
