Shop OBEX P1 Docs P2 Docs Learn Events
OSX/Win32/Linux Spin Compiler-0.15.3 - Page 3 — Parallax Forums

OSX/Win32/Linux Spin Compiler-0.15.3

1356

Comments

  • JasonDorieJasonDorie Posts: 1,930
    edited 2008-10-22 22:02
    Does anyone writing a SPIN compiler have the source publically available? I'm curious to see how hard it would be to add optimizations like this to the compiler. I'm half surprised that they aren't there already (and half not, because if you're using SPIN instead of PASM, maybe speed wasn't your biggest concern to begin with).

    Regarding the example of the PUB function that simply returns a member variable, you said that the compiler doesn't know the layout or addresses of everything until the final packaging is done. True, but isn't that what linking is for? It was also said that the compiler doesn't know how complex the function is, but I disagree - The compiler just compiled it - It knows EXACTLY how complex the function is, and should be able to tag a function as 'just returns a member' so that it could potentially be inlined later.

    You could go even further and potentially determine that the function call overhead is larger than the body of the function being called, and just insert the function body inline instead. Most C/C++ compilers do this, but include the pipeline cost of the jump/return in the decision.

    Keep in mind that I have no idea what kinds of limitations are in place because of the workings of the interpreter, so I may just be highlighting my own ignorance here. [noparse]:)[/noparse] Are there things about the VM that prevent (or complicate) these kinds of optimizations?

    Jason
  • Forest GodfreyForest Godfrey Posts: 38
    edited 2008-10-23 03:41
    The problem with optimization is that while some of them (constant folding, common subexpression computation, and good basic block scheduling) can reduce code size, many others actually increase code size. Function inlining, loop unrolling, software pipelining, vectorization (ok, I work on Crays during the day [noparse]:)[/noparse] and various interprocedural analyses increase code size, sometimes significantly. If you want to see that in action, compile a moderate sized program with various gcc optimizations on and off.

    All of those optimizations could be applied to Spin with various degrees of difficulty. The problem on the prop is that code size is a major concern. Inlining in gcc will happen not just in cases where the compiler thinks the function call overhead is too large, it'll happen almost all the time. I've had it inline >1000 line functions several times in the same file. While I won't rule out someone wanting that in Spin, I'm betting most people don't [noparse]:)[/noparse] Also, the branch overhead of the Prop is nowhere near what it is on a deeply pipelined superscalar processor.

    What I'd like to see is something like the C "inline" attribute that would have it under the control of the programmer. One could envision a "INL" keyword to go with "PUB" and "PRI", that would be a private inline. However, I basically will have that with bstc because, after a short shell script, I can just use the C preprocessor with #define's to inline the stuff I want inlined.
  • BradCBradC Posts: 2,601
    edited 2008-10-23 04:27
    JasonDorie said...
    Does anyone writing a SPIN compiler have the source publically available? I'm curious to see how hard it would be to add optimizations like this to the compiler. I'm half surprised that they aren't there already (and half not, because if you're using SPIN instead of PASM, maybe speed wasn't your biggest concern to begin with).

    Actually, I wrote mine as I was tired of having to use Windows to program the Propeller. Compatibility was my biggest concern.
    JasonDorie said...

    Regarding the example of the PUB function that simply returns a member variable, you said that the compiler doesn't know the layout or addresses of everything until the final packaging is done. True, but isn't that what linking is for? It was also said that the compiler doesn't know how complex the function is, but I disagree - The compiler just compiled it - It knows EXACTLY how complex the function is, and should be able to tag a function as 'just returns a member' so that it could potentially be inlined later.

    Your assumption is true if an object only has a single instance inside another object, but this assumption falls down when you use multiple instances of the same object
    fred[noparse][[/noparse] 4 ] : "freds_object".

    Yes, with the way I have the symbol table for each object configured I could tag a method that only returns or sets a value, then I could look up that variable offset at link time and put it into the calling method. It's an awful lot of complex code to compensate for a questionable design choice in your source.
    JasonDorie said...

    You could go even further and potentially determine that the function call overhead is larger than the body of the function being called, and just insert the function body inline instead. Most C/C++ compilers do this, but include the pipeline cost of the jump/return in the decision.

    Which is all very nice in theory, except that spin objects don't work that way. A variable is referenced by an offset into the variable space from that object and not available outside that particular instance of the object without quite some convolution. Inlining a method from another object is just not easily achievable.
    JasonDorie said...

    Keep in mind that I have no idea what kinds of limitations are in place because of the workings of the interpreter, so I may just be highlighting my own ignorance here. [noparse]:)[/noparse] Are there things about the VM that prevent (or complicate) these kinds of optimizations?

    Yes. The architecture of spin limits what you are asking to a degree. It could be worked around, but then you could just as easily write faster/smaller code.

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Pull my finger!
  • BradCBradC Posts: 2,601
    edited 2008-11-01 11:09
    Revision 0.09 is up. (See top post for details)

    A couple of bugfixes and a speed improvement.
    Thanks to the wonder of Valgrind, I picked some low hanging fruit in the tokeniser for optimisation and managed an average speedup across 1500 source files of x8.
    One particular compilation sped up over 48 times. Not so noticeable on fast machines, but on a slow old PPC G4 it makes a _huge_ difference. My regression test went from over 3 hours down to 48 minutes on that machine.

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Pull my finger!
  • mparkmpark Posts: 1,305
    edited 2008-11-01 14:07
    What kind of low hanging fruit?
  • BradCBradC Posts: 2,601
    edited 2008-11-01 14:26
    mpark said...
    What kind of low hanging fruit?

    I'm a bit embarrassed to say really..

    When I tokenise the source I build up 5 chains of tokens (Spin/Var/Dat/Obj/Con), and each time I see a block change I change the chain I'm pushing tokens onto.
    I was maintaining 5 separate chain heads and each time I pushed a token I was starting at the head and traversing to the tail before I connected the next token (My token chains are doubly linked lists). This is of course totally bletcherous behaviour and a complete performance killer.

    Suffice to say that fixing this particular little quirk in the tokeniser gives me a significant performance boost (like source files that used to take 12 seconds to compile now take 0.25).

    I bought the Hydra book, and some of the game sources there are real swines for giving the compiler a good work out.

    I did say I was a lousy code architect [noparse];)[/noparse]

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Pull my finger!
  • heaterheater Posts: 3,370
    edited 2008-11-01 16:53
    Just been testing bstc on my 8080 emulator project and came up with a little bug.

    It does not like "object_base := @@0" producing the error: "ERROR : bstc_bug - Error at (5,20) Expected Unique Name or BYTE, WORD, LONG

    I am using this to discover the base as address of the emulator object so that I can add it to addresses in a jump table that point to LMM functions.

    To be honest I did wonder at the time why the hell this works at all (Taking the address of a constant, weird) but someone recommended it to me as a way to solve my problem, it worked and I have not thought about it since.

    I imagine that now I could use @@@ in my jump table directly, if I understand correctly, which would do away for the need to add the object offset and speed up the emulator a tiny bit. But we'll come to that.

    Attached is a small spin to show the problem.

    Don't worry we've all made bletcherous code. Although most are not brave enough to say so !

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    For me, the past is not over yet.
  • BradCBradC Posts: 2,601
    edited 2008-11-01 17:04
    heater said...
    Just been testing bstc on my 8080 emulator project and came up with a little bug.

    It does not like "object_base := @@0" producing the error: "ERROR : bstc_bug - Error at (5,20) Expected Unique Name or BYTE, WORD, LONG

    I am using this to discover the base as address of the emulator object so that I can add it to addresses in a jump table that point to LMM functions.

    Ouch! that shows there are plenty of other very ugly uses of the @@ operator. I'm gonna have to ruminate on this for a couple of days as it requires a bit of a re-design of my memory allocator. Thanks heater!

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Pull my finger!
  • heaterheater Posts: 3,370
    edited 2008-11-01 17:12
    Sorry, I should have done this test ages ago when the @, @@, @@@ discussion was going on.

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    For me, the past is not over yet.
  • mparkmpark Posts: 1,305
    edited 2008-11-01 17:27
    heater said...
    To be honest I did wonder at the time why the hell this works at all (Taking the address of a constant, weird)...

    Don't think of @@ as taking the address of anything, think of it more as converting addresses from relative to absolute. It simply adds the object base address to its operand.
  • mparkmpark Posts: 1,305
    edited 2008-11-01 17:29
    Brad, thanks for your reply. So bstc is a one-pass compiler (in the sense that it only reads the source file once)? Homespun too, but I've been wondering if maybe two passes would make things simpler. Idle speculation at this point.
  • BradCBradC Posts: 2,601
    edited 2008-11-01 17:31
    mpark said...
    heater said...
    To be honest I did wonder at the time why the hell this works at all (Taking the address of a constant, weird)...

    Don't think of @@ as taking the address of anything, think of it more as converting addresses from relative to absolute. It simply adds the object base address to its operand.

    Indeed precisely what it does. Its really just a unary operator that adds PBASE to whatever comes next.
    In any case, I moved 2 lines of code and it's fixed. I'm running my regression tests overnight and I'll have an updated release out in the morning.

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Pull my finger!
  • BradCBradC Posts: 2,601
    edited 2008-11-01 17:42
    mpark said...
    Brad, thanks for your reply. So bstc is a one-pass compiler (in the sense that it only reads the source file once)? Homespun too, but I've been wondering if maybe two passes would make things simpler. Idle speculation at this point.

    Well, yes and no.. yes it reads the source file once and converts it into tokens, but then it may use those tokens multiple times. So in essence it's doing the slow bit (text to tokens) once, but then it runs over the list of tokens as many times as it needs to. The DAT section assembler is a 2 pass assembler and I have a neat trick for parsing the CON sections that allows recursive resolving of constants that are not declared strictly in order. Of course then when it comes to redundant method removal and absolute address resolution I re-compile bits of the token chains completely. It's not uncommon for the chain to be parsed or at least inspected more than once.

    I can't see where reading the source file again would possibly be useful except perhaps not having to store the lines required for the list file and copying them out at generation time.

    bstc allocates and releases incredible amounts of memory while compiling files as each token is a dynamically allocated memory block. Each operation generates yet more tokens and each token has 2 dynamically allocated strings attached to it. The heap activity while doing a compile is quite furious. It's now gone from being mildly quick to really quite fast, but on a 400MHZ G4 PPC it's still like molasses in winter.

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Pull my finger!
  • heaterheater Posts: 3,370
    edited 2008-11-01 20:09
    I see what you mean about @@. But it's still weird enough that applying @@ to constants is not documented.

    I was in the mind set of things like FORTRAN where literals did actually get compiled into a literal pool in memory. Which causes some surprises when you pass them by address to procedures that then modify their parameters....

    Anyway I assumed that when mpark said two passes he really meant passes through the tokens. I though that was the usual way to handle forward references.

    RE: "recursive resolving of constants" That seems to work only so far in Parallax Spin e.g.

    This works:
    c1 = c2
    c2 = 1.1

    This does not:
    c1 = c2
    c2 = c3
    c3 = 1.1

    Do they only do two passes and give up I wonder ?

    The latter also does not work in bstc (all be it with the error indicating c1 unresolved rather than c2)
    So where is the recursive resolving ?

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    For me, the past is not over yet.
  • heaterheater Posts: 3,370
    edited 2008-11-01 20:33
    A couple of weeny bugs:

    In Parallax spin the following gives an error "Register is not allowed here." but bstc compiles it clean:

    CON
    rr = dira

    The following for parallax results in "Expected a unique constant name or #" but bstc says "Fatal error in CON loop"

    CON
    mov = dira

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    For me, the past is not over yet.
  • BradCBradC Posts: 2,601
    edited 2008-11-02 07:24
    Brilliant catches.
    The first one is a bit odd really and I'm not sure if I should fix it or not. It's treating "dira" as a constant equal to the cog register address $1fX(whatever it is).
    I guess if the Parallax compiler errors out on it I should to.

    The second one is most certainly a bug and I'll get it fixed. I was planning a new release today for your @@ bug, but I'll knock these on the head before I get it out (and any other bugs you find!).

    Thanks heaps!

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Pull my finger!
  • BradCBradC Posts: 2,601
    edited 2008-11-02 07:26
    heater said...

    The latter also does not work in bstc (all be it with the error indicating c1 unresolved rather than c2)
    So where is the recursive resolving ?

    I manually limit it to 2 passes to be error for error compatible with the Parallax compiler.
    I did not want people to be able to write code on the bst stuff that would not compile on the Parallax compiler.

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Pull my finger!
  • heaterheater Posts: 3,370
    edited 2008-11-02 11:28
    Error compatible with Parallax is very wise. Although I did wonder about dira and friends myself.

    Not so brilliant catches. A while ago I was entertaining myself writing a prop assembler that would be syntax compatible with Parallax, even the same error messages, everything except the spin code. So I was trying lots of experiments with the prop tool syntax and asking dumb questions about it here.

    Then you guys turn up with your complete, all singing all dancing compilers and I lost heart for my assembler.

    Now the reason I was writing an assembler was to assemble the output of a simple Pascal like compiler I was working on that produces LMM code. I wanted it to be cross platform and Parallax compatible.

    I probably won't continue with that compiler either as I'm not sure my skills are up to making a good job of it. Just now it only supports the LONG type for example and can only just support procedures.

    Just wondering now how about fixing up your spin compilers to generate LMM code, big but speedy?

    I have run out of spin to run through you compilers so that may be it for bug reports for a while.

    Great work by the way, you and mpark.

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    For me, the past is not over yet.
  • BradCBradC Posts: 2,601
    edited 2008-11-02 13:02
    heater said...
    Error compatible with Parallax is very wise. Although I did wonder about dira and friends myself.

    Not so brilliant catches. A while ago I was entertaining myself writing a prop assembler that would be syntax compatible with Parallax, even the same error messages, everything except the spin code. So I was trying lots of experiments with the prop tool syntax and asking dumb questions about it here.

    Then you guys turn up with your complete, all singing all dancing compilers and I lost heart for my assembler.

    I felt nearly the same when mpark popped up with his Compiler. I'd spent months working on one and I was ->" "<- that close to having it finished. Then I remembered that I was writing it for *me* as I wanted to be able to develop on my Linux machine in addition to my old Macs, so I pitched back in and finished it off.
    heater said...

    Now the reason I was writing an assembler was to assemble the output of a simple Pascal like compiler I was working on that produces LMM code. I wanted it to be cross platform and Parallax compatible.

    I probably won't continue with that compiler either as I'm not sure my skills are up to making a good job of it. Just now it only supports the LONG type for example and can only just support procedures.

    What help do you need to get the pascal compiler working? I'd love to support a cross platform Pascal or C (I prefer Pascal) effort. I'm more than willing to pitch in with code if required and will help with an LMM assembler if you want to go down that route.

    All my stuff is written in Pascal, and I've got quite a bit of experience in the language from many angles. If you want to do a Pascal to LMM compiler I'll certainly help.

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Pull my finger!
  • BradCBradC Posts: 2,601
    edited 2008-11-02 14:15
    Version 0.10 up

    Bugfixes as per changelog in top post.

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Pull my finger!
  • heaterheater Posts: 3,370
    edited 2008-11-03 11:22
    Just managed to compile all of my 8080 emulator with v0.10, including the original @@ problem, successfully. For sure produces the same eeprom file as the prop tool.

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    For me, the past is not over yet.
  • heaterheater Posts: 3,370
    edited 2008-11-03 14:19
    Thanks for your offer of help BradC. You may not be so keen when you hear the following:

    1) My compiler is not a Pascal compiler, I did say Pascal "like". The intention was to have a simple block structured language in the style Pascal but incorporating some/all the operators from spin and perhaps some other features. That is to tailor the language to the Propeller instruction set/architecture.

    2) As I know almost nothing about compilers and getting my head around, for example, the dragon book just does not work it is all based on Jack Crenshaws "Lets Build A Compiler" series compilers.iecc.com/crenshaw/. He defines two languages as he develops ideas in the series "TINY" and "KISS". TINY is pretty much a toy and gets almost full development if you can piece together bits of Jacks code. He keeps changing tack in the series. KISS is more like a real language but sadly Jack gave up on the series before fully defining/exploring it. My compiler is at the TINY level just now.

    3) It follows Jacks philosophy in the series of "keep it simple" so there is no thought to optimization.

    4) I have written my compiler in Ada !. As is my assembler attempt. Crazy perhaps but I wanted to get some practice in with Ada at the same time. That's a throwback to my days working with aircraft flight controls and such.

    Some couple of years ago I did develop Jacks KISS compiler, as far as I could follow his specification, in C (his is in Pascal) producing code for x86 (Jack targets 68000). It includes signed/unsigned bytes words and longs, looping constructs, procedures etc. Trouble is that code and it's backups may have been lost in the three house moves I've done since then. If I'm lucky I can recover that and redo the code generator for LMM.

    I'll have to dig all this out again and give it some thought. Given that I like to work in Linux having yours and mparks compilers around now may spur me on again.

    Of course what really canned al this was the arrival of ImageCrafts C LMM compiler. Is there any point to continue unless the resulting language/compiler can offer some unique and useful features ?

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    For me, the past is not over yet.

    Post Edited (heater) : 11/3/2008 2:29:36 PM GMT
  • BradCBradC Posts: 2,601
    edited 2008-11-03 14:30
    heater said...
    Is the any point to continue unless we can offer some unique and useful features ?

    <snipped lots of intersting stuff>

    This last point is a biggie. It's worth continuing if you find it fun to do. I do it coz I enjoy it [noparse]:)[/noparse]

    Personally I find spin quite pascalish in parts, and I was pretty much immediately at home, but if you think there is another way to entice beginners to get into programming (which is what it's all about afterall) then lets see where it leads.

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Pull my finger!
  • heaterheater Posts: 3,370
    edited 2008-11-03 15:12
    "I do it coz I enjoy it" Yep, most stuff I dink around with at home has no practical use whatsoever, assuming it even gets finished [noparse]:)[/noparse]

    "another way to entice beginners to get into programming" Yep but I'm not sure if we can pull it off. In the early days of microcomputers you could just turn the thing on, it would come up to an editor/command line immediately. You would type a few simple BASIC statements and "run", And it did. Very simple and instant gratification/feedback. That simple thing hooked thousands of kids into exploring what they could do next and how far they could push the machine. This story is repeated by many famous and not so famous programmers today.

    Now we have wonderful things like the Prop and other micros with the possibility of I/O to the real world of LEDs, relays, motors, sensors, whole robots etc. We don't have to be limited to simple text or graphics on a screen. Great, but to get these to jump is a million times more complicated than those micros and BASIC. Generally you need to hook it up to a PC, install some development system, learn some complicated language, C, Spin whatever. And that's even before tackling the problems of hooking up those real world devices.

    So I'd like to see a stand alone system, with an ultra simple language, and some ready made I/O which could be switched on and programmed, at the simplest level, in an instant.

    The Prop could do this, with its TV/VGA output and a keyboard. So I'd like my compiler to be simple enough to be self hosting.

    I think Chip is already heading in that direction, at least for the Prop II.

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    For me, the past is not over yet.
  • heaterheater Posts: 3,370
    edited 2008-11-03 15:41
    Actually thinking about it, it's appalling that modern day games boxes, Wii, xbox, playstation etc don't come with even the rudimentary programming capability of an old C64. Of course if they did kids might be busy creating there own stuff rather than spending out on games, oh dear we can't have that can we.

    A games console, with a programming system and a bunch of I/O ports is what I have in mind I guess.

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    For me, the past is not over yet.
  • BradCBradC Posts: 2,601
    edited 2008-11-18 16:16
    Version 0.11 up. Details in top post.

    Speed improvements (another 40%) and bug fixes.

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Pull my finger!
  • BradCBradC Posts: 2,601
    edited 2008-12-22 16:22
    Version 0.12 up. Details in top post.

    More major speed improvements and bug fixes. Now up to date with the compiler in bst and much faster than 0.11.

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Cardinal Fang! Fetch the comfy chair.
  • hinvhinv Posts: 1,255
    edited 2009-02-05 20:26
    @heater; I agree with you whole heartily. My first computer was a Commodore 128, but before that I played with Atari 400 and 800 computers at school in 7th grade. I was hooked, just like you said, by instant gratification.
    Well, I got a commodore 128 for my children to catch the same bug, but found out they were spoiled. They like the C64 in a joystick better because they don't have to type, and hook up all that stuff. Now my son found my little PC110 palmtop and thinks it is really neat and wants to learn to program. The question is: what do I put on it? I was thinking TCL, but I really don't know. Maybe I would be better off with a protoboard with keyboard, mouse & composite and FemtoBasic. What would you suggest?

    Doug
  • hinvhinv Posts: 1,255
    edited 2009-02-05 20:32
    Hi Brad,

    It seems the IDE problem I am having is in the compiler. Here is a copy and paste from an attempt to compile:
    maniac:~/mcu/propeller> ./bstc.linux -L /home/doug/mcu/propeller/proptool Display.spin
    Brads SpinTool Compiler v0.12 - Copyright 2008, All rights reserved
    Compiled for i386 Linux at 20:19:19 on 2008/12/22
    Found a USB Serial Device
    Loading Object Display
    Loading Object Numbers
    An unhandled exception occurred at $080514DE :
    EInOutError : Access denied
    $080514DE
    $0804FD55
    $0804FFEE
    $0804D781
    $080481D4

    maniac:~/mcu/propeller> ls -l /home/doug/mcu/propeller/proptool/Numbers*
    -rw-r--r-- 1 doug users 90162 2008-01-29 13:31 /home/doug/mcu/propeller/proptool/Numbers.spin


    Any Idea what is wrong?
  • Luis DigitalLuis Digital Posts: 371
    edited 2009-02-05 20:40
    "EInOutError : Access denied"

    It seems that you do not have permission to use the port USB.

    Treat as the administrator or add his user to the group USB.
Sign In or Register to comment.