DeSpin - Another Spin bytecode and PASM disassembler — Parallax Forums

CardboardGuru Posts: 443
edited 2007-12-28 01:08 in Propeller 1
Here's a project I was working on back in the summer. It's a decompiler for Spin and disassembler for PASM, written in Python.

It's not complete, but it does disassemble most of the small .binary files I threw at it. It will crash, though, if you feed it anything that's not a valid Spin binary file.

Assuming you have Python installed and on your path, you run it like this:
PYTHON DESPIN.PY SPINPROGRAM.BINARY

It might be interesting to Hippy as it represents a completely independent project to solve the same problem as his disassembler.

I use a different approach to output. I had in mind to also produce a Spin bytecode assembler, with the output of the disassembler being valid input for the assembler, which is perfect for testing by means of round-trip integrity. So the actual data is over on the right, behind comment marks. On the left is the symbolic stuff that could be used as input to an assembler.
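As a rough illustration of that layout, here is a minimal Python sketch. It is not DeSpin's actual code: the function names, the column width, the `PUSH_BYTE` mnemonic and the byte values are all made up for the example, and it assumes the Spin-style apostrophe as the comment mark.

```python
def format_line(symbolic, raw, width=32):
    """Symbolic text on the left, raw bytes on the right behind a ' comment mark.

    Hypothetical helper: the column width and hex layout are assumptions.
    """
    hex_bytes = " ".join(f"{b:02X}" for b in raw)
    return f"{symbolic:<{width}}' {hex_bytes}"


def parse_line(line):
    """Split a formatted line back into (symbolic text, raw bytes).

    Assumes the symbolic part never contains an apostrophe itself.
    """
    symbolic, _, comment = line.partition("'")
    return symbolic.rstrip(), bytes.fromhex(comment)


# Illustrative values only, not a claim about real Spin opcode names.
line = format_line("PUSH_BYTE 4", bytes([0x38, 0x04]))
print(line)
```

A round-trip test would then assemble the left-hand column and compare the result against the bytes recovered from the right-hand comment.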

No doubt I took a different approach to labeling - it's a complex issue as labels function as both markers in Hub RAM and possibly in Cog RAM as well.

I think I may have changed some of the names of Spin Opcodes from the ones used by GEAR, as there was no official standard.

Hope someone finds it interesting. It was an excellent exercise to write it, from the point of view of gaining great insight into how Spin works and how to optimise Spin programs.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Help to build the Propeller wiki - propeller.wikispaces.com
Play Defender - Propeller version of the classic game
Prop Room Robotics - my web store for Roomba spare parts in the UK

Post Edited (CardboardGuru) : 12/27/2007 10:02:48 PM GMT

Comments

  • hippy Posts: 1,981
    edited 2007-12-27 23:35
    Looks very interesting. I haven't used Python but looked through the source code and when I get some free time I will give it a whirl.

    There are some missing decoded values for the REFERENCE ( what I called USING ) opcodes, held in "do_dict = {...}". For the 'effect' byte, the msb indicates whether there is a resultant push; $40 to $FF map to opcodes $E0..$FF ( "dict = {...}" ), and those from $00 to $3F are the special cases.

    I'm not sure your 16-bit negative address offset is right in "elif operand_type==OP_BRANCH" and your "if operand_type==OP_PACKED_LITERAL" is different to what I use, but I'm not sure who's right, so I need to do some testing there.

    I recall GEAR may have got it wrong with respect to $1xxi00xx opcodes ( what I call "op MEM[].size" and "op MEM[][].size" ).

    That $3C opcode seems the elusive one; I haven't managed to generate any code in which it appears!

    If I've read your code right you seem to be traversing the method/object tables which is by far a better way of doing it than walking the program bytecode as I did it initially. I hadn't understood the significance of RES and overlaying of Cog and Hub and my code / decompilation suffers for that.

    Anyway it's great to see more bytecode tools arriving and well done.
  • mirror Posts: 322
    edited 2007-12-28 00:16
    I made a couple of minor changes to GEAR quite some time ago, but have not had the time to make the more serious changes I would like to. Unfortunately my unfamiliarity with the resources available within C# makes code development very inefficient at this time. I'm at a bit of a disadvantage in that my most familiar language is Delphi, which doesn't sit well with the Mac and Linux users on this group. I like the look of Python (more so than C#) but just don't have the spare time to learn a new "major" language.

    I think what GEAR brought to me was the emulator / simulator (more so than the bytecode decompiler) - and it has taught me more about the Propeller than any other non-Parallax resource.

    I have made minor changes to GEAR - breakpoints (1 per cog), fixing the emulation of some assembly instructions. I also added a set of shorter opcodes, so that the code may be decompiled with either short or long opcodes. As an example, the replacement of the opcodes is in the following vein:

       PUSH_INDEXED_MAINMEM_BYTE         byte Mem[]@
       POP_INDEXED_MAINMEM_BYTE          byte Mem[]!
       EFFECT_INDEXED_MAINMEM_BYTE       byte Mem[] with
       REFERENCE_INDEXED_MAINMEM_BYTE    & byte Mem[]


    Where
    • byte and word sized memory is qualified (long being seen as native).
    • The four memory areas are Mem, Var, Loc and Obj
    • @ is fetch, ! is store - from Forth
    • & is address of - from C
    • [] indicates indexing
    • with - for a side effect instruction
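The naming rules in the list above can be sketched as a small Python function. This is only an illustration built from those rules, not code from GEAR: the function name, its parameters and the operation keywords ("push", "pop", "effect", "reference") are assumptions, as is the guess that a long-sized operand simply omits the size qualifier.

```python
AREAS = ("Mem", "Var", "Loc", "Obj")   # the four memory areas from the list
SIZES = ("byte", "word", "long")


def short_name(op, area, size="long", indexed=False):
    """Compose a short opcode name from its parts (hypothetical helper)."""
    if area not in AREAS:
        raise ValueError(f"unknown memory area: {area}")
    if size not in SIZES:
        raise ValueError(f"unknown size: {size}")
    operand = area + ("[]" if indexed else "")
    if size != "long":                  # assumption: long is native, so unqualified
        operand = f"{size} {operand}"
    if op == "push":
        return operand + "@"            # @ is fetch
    if op == "pop":
        return operand + "!"            # ! is store
    if op == "effect":
        return operand + " with"        # side-effect instruction
    if op == "reference":
        return "& " + operand           # & is address of
    raise ValueError(f"unknown operation: {op}")


print(short_name("push", "Mem", "byte", indexed=True))
```

With these inputs the four indexed byte examples in the table come out as written there.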

    I found that this shorter set of opcodes made the resulting code easier to read.

    Attached is the copy of Gear that I currently use that allows switching between the two sets of opcodes.
  • CardboardGuru Posts: 443
    edited 2007-12-28 00:23
    Python is brilliant. After years of using C++, I'd stopped getting any fun from programming. Python put the fun back. (Then I got a Hydra which upped the fun to a new level)

    I probably won't take this disassembler any further. The intention was for it to be a second step towards a Spin compiler usable from the command line and portable to Mac and Linux. (The first step was the proof-of-concept Propeller uploader working on the Mac.) However, my personal view is that the proposal to put a development system into the ROM of the Prop2 will probably provide the preferred solution for portability, and given all the feedback Parallax have had, I suspect they'll release a command-line compiler at the same time. So a third-party Spin compiler doesn't have quite as good a future as I'd originally envisaged. As the disassembler has now been dead for 6 months and I haven't changed my mind on that, I thought it might as well serve as a point of interest to others, and I've uploaded it as is.

  • CardboardGuru Posts: 443
    edited 2007-12-28 01:08
    hippy said...
    If I've read your code right you seem to be traversing the method/object tables which is by far a better way of doing it than walking the program bytecode as I did it initially. I hadn't understood the significance of RES and overlaying of Cog and Hub and my code / decompilation suffers for that.

    Yes, it's a classic two pass disassembler. The two passes are exactly the same, with only output disabled on the first pass. The first pass serves to fill in addresses in the symbol table. The second pass then knows what all the forward references are.
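The two-pass idea can be sketched on a toy instruction set. Everything here is invented for illustration and is not DeSpin's code or the real Spin bytecode: the three opcodes (NOP, JMP with a one-byte absolute target, RET), their encodings, and the names `run_pass` and `disassemble`. The sketch also assumes well-formed input.

```python
NOP, JMP, RET = 0x01, 0x02, 0x03          # made-up toy encodings
NAMES = {NOP: "NOP", RET: "RET"}


def run_pass(code, symbols, emit):
    """Walk the code once. Both passes run this same function;
    only the emit flag differs."""
    lines = []
    pc = 0
    while pc < len(code):
        op = code[pc]
        if op == JMP:
            target = code[pc + 1]
            # Record the target so the labelled line can be marked.
            symbols.setdefault(target, f"L{target:02X}")
            text, size = f"JMP {symbols[target]}", 2
        else:
            text, size = NAMES.get(op, f"BYTE ${op:02X}"), 1
        if emit:
            label = symbols.get(pc, "")
            lines.append(f"{label:<6}{text}")
        pc += size
    return lines


def disassemble(code):
    symbols = {}
    run_pass(code, symbols, emit=False)    # pass 1: fill the symbol table
    return run_pass(code, symbols, emit=True)   # pass 2: produce output


listing = disassemble(bytes([NOP, JMP, 0x00, JMP, 0x06, NOP, RET]))
print("\n".join(listing))
```

Because the symbol table is complete before any line is emitted, a line can carry its label no matter where in the program the reference to it appears.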

    Yes, the issue of addresses for labels is greatly complicated by the fact that a data label may refer to a Hub RAM address or a Cog RAM address. And it's not necessarily clear where Cog RAM addresses start. And RES means there can be tricky overlaps. I think the answer is to create a separate Cog RAM symbol table for every COGINIT or COGNEW that is encountered (assuming the 99.9% case that the AsmAddress is a constant). Each is, after all, effectively setting up its own memory map. Then, when disassembling assembler, memory references apply to the symbol table that is current at that point.
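A minimal Python sketch of that proposal follows, with one Cog RAM symbol table per launch address. It is an assumption-laden illustration, not an implementation from DeSpin: the names `cog_tables`, `add_label` and `lookup` are hypothetical, the Cog address is taken to be the long offset from AsmAddress, and RES (which would shift Cog addresses without using Hub space) is ignored.

```python
COG_LONGS = 512        # Cog RAM is addressed in longs, 512 in total

# One symbol table per COGINIT/COGNEW, keyed by its constant AsmAddress.
cog_tables = {}        # asm_address -> {cog_address: label}


def add_label(asm_address, hub_address, label):
    """Record a label in the table belonging to one cog launch.

    Assumes no RES before the label, so Hub and Cog offsets stay in step.
    """
    offset = hub_address - asm_address
    if offset < 0 or offset % 4:
        raise ValueError("label is not on a long boundary after AsmAddress")
    cog_address = offset // 4
    if cog_address >= COG_LONGS:
        raise ValueError("label falls outside Cog RAM")
    cog_tables.setdefault(asm_address, {})[cog_address] = label
    return cog_address


def lookup(asm_address, cog_address):
    """Resolve a Cog address using the table current for this launch."""
    return cog_tables.get(asm_address, {}).get(cog_address)


# The same Hub location maps to different Cog addresses per launch.
print(add_label(0x0020, 0x0038, "delay"))
print(add_label(0x0030, 0x0038, "delay"))
```

Keeping the tables separate means a PASM operand is only ever resolved against the memory map of the launch it belongs to.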
