Spin Bytecode Disassembler — Parallax Forums

hippy Posts: 1,981
edited 2008-09-08 15:11 in Propeller 1
Further to my possible long-term plans ( a Spin compiler for Windows non-XP, an alternative language compiler / translator to Spin / Bytecode / Assembler ), the first step has been to produce a Spin Bytecode Disassembler, mainly to get to grips with what's going on under the hood. Many thanks go to Robert Bryon Vandiver ( Gear ) and Cliff L Biffle for the groundwork they did in identifying a considerable chunk of the bytecode and image format.

The disassembler traverses the image, identifies Spin and Assembler, and then generates output in a form which a ( yet to exist ) Spin Assembler can convert back into an image. This will be the back-end code generator for the future projects. It's a very early beta and a lot more work needs to be done, especially on variables, and it does not yet support Spin programs which include objects.

As there's no official definition of Spin out there, I simply designed my own mnemonics. They probably aren't to everyone's taste :)

A zipped executable is attached for anyone who feels like playing with it ( VB6 runtime required ), and an example of my Spin test code and the bytecode disassembly is in the PDF.

Comments

  • mpark Posts: 1,305
    edited 2007-08-02 03:20
    Very cool. A bit beyond my level, but I can see the possibilities. Keep going!
  • Ale Posts: 2,363
    edited 2007-08-02 06:52
    Hallo Hippy,

    Very interesting stuff. If we could only make a Spin compiler, maybe a new era of (super) fast Spin could begin!

    Is there a way you can make it GPL or some other open source license? After all, Gear is under the GPL v2. (Maybe C or Java would be a better choice if cross-platform support beyond winblows is also intended :-D)

    Keep up the good work!

    Gru
  • hippy Posts: 1,981
    edited 2007-08-02 10:53
    Thanks for the encouragement.

    On Windows I am a VB6 programmer, so that's the source people will get. It will be open sourced and I will also deliver a proper install package. I generally try to make sure my VB6 will import into VB 2005 Express and run with no changes, and the GUI will be split from the main code to hopefully make porting easier if anyone wants to do that. I have no Linux / Mac programming experience and use Windows 98, so I am primarily working towards creating a Spin IDE / compiler which runs under 98.

    So far, Spin bytecode looks to be very efficient. The only scope for speed improvement would be constant and array index folding, while dead code removal would recover some space. The bytecode lacks single instructions to move source variables or constants to destination variables which would improve speed but it is understandable why they do not exist.

    I can see plenty of scope for where all that I am learning could take me, but first I'll try and get my foundations secured :)
  • evanh Posts: 16,101
    edited 2007-08-02 14:49
    Ambitious! And could be something viable for me also.


    Evan waves :)
  • APStech-Attila Posts: 38
    edited 2007-08-02 14:59
    Hello hippy!

    Nice work!

    Unfortunately mscomm32.ocx is missing on my system. Some others may find this a difficulty. Perhaps you could add the OCX to the zip package.

    Thanks.

    Attila
  • hippy Posts: 1,981
    edited 2007-08-02 18:49
    New Version 0.1 executable created ( now attached to first post ) which doesn't require MScomm32 - sorry about that.

    Better Assembler traversing and disassembly. Various cosmetic changes. Still a long way to go !

    Waves back at Evan :)
  • hippy Posts: 1,981
    edited 2007-08-07 04:00
    New Version 0.02 executable created ( now attached to first post ).

    Not there yet, but a lot further on now. Supports Objects, there's better program traversing and Spin bytecode decoding has improved. I think I've got the decoding almost finalised.

    Variables are not yet named nor placed so no 'source file' can be produced which would be re-compiled to create the same bytecode. That will be the next big effort to do, then an actual bytecode compiler.

    The .ZIP now includes the executable and Spin source and Eeprom files along with the .PDF which illustrates the decompilation of those Spin files.
  • simonl Posts: 866
    edited 2007-08-07 13:11
    Woot! Lookin' good hippy. KUTGW.

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Cheers,

    Simon
    www.norfolkhelicopterclub.co.uk
    You'll always have as many take-offs as landings, the trick is to be sure you can take-off again ;-)
    BTW: I type as I'm thinking, so please don't take any offense at my writing style :)
  • Lawson Posts: 870
    edited 2007-08-07 16:04
    This is exciting stuff! Would it be possible to keep the memory requirements of this low? (say 2-4K bytes of RAM and a 'disk cache' on an SD card) Gotta keep those options open :D

    Marty

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Lunch cures all problems! have you had lunch?
  • hippy Posts: 1,981
    edited 2007-08-07 16:39
    Lawson said...
    This is exciting stuff! Would it be possible to keep the memory requirements of this low? (say 2-4K bytes of RAM and a 'disk cache' on an SD card) Gotta keep those options open :D
    I'm not sure I completely follow what you're after here, but I doubt it. I use two 32K x 32-bit arrays to mark what each bytecode byte is while traversing the program, and they'll have to expand to support the Propeller Mk II. It seems to be a bit of an esoteric situation you envision so my preference would be to get what I have working, open source it, and leave others to determine how to implement it once they can see how to.
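The per-byte marking described in the post above can be sketched roughly as follows. This is only an illustration in Python, not the VB6 code of the actual disassembler: it uses a single one-byte-per-address tag map rather than two 32K x 32-bit arrays, and the `traverse` function, the tag names and the `decode` callback signature are all hypothetical.

```python
from typing import Callable, Iterable, List, Tuple

# Tag values for each image byte (names are illustrative only).
UNKNOWN, SPIN, ASM, DATA = 0, 1, 2, 3

# Hypothetical decoder: given the image and an address, return
# (instruction length, branch/call target addresses, falls_through).
Decoder = Callable[[bytes, int], Tuple[int, Iterable[int], bool]]


def traverse(image: bytes, entry: int, decode: Decoder) -> bytearray:
    """Follow code from `entry`, tagging every byte that is reached as SPIN."""
    tags = bytearray(len(image))      # one tag per image byte, all UNKNOWN
    work: List[int] = [entry]         # addresses still to be explored
    while work:
        addr = work.pop()
        # Stop at the image edge or when we reach bytes already tagged.
        while 0 <= addr < len(image) and tags[addr] == UNKNOWN:
            length, targets, falls_through = decode(image, addr)
            length = max(1, length)   # guard against a zero-length decode
            for i in range(addr, min(addr + length, len(image))):
                tags[i] = SPIN
            work.extend(targets)      # explore jump / call targets later
            if not falls_through:     # e.g. a return or unconditional jump
                break
            addr += length
    return tags
```

Bytes that stay `UNKNOWN` after the traversal are candidates for data or assembler. A tag map like this costs one byte per image byte, which is the kind of memory footprint Lawson's question is about.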

    The important thing I see from this is not so much the program itself but the understanding of the bytecode, layout and file format. With that revealed, everyone is free to write their own software to utilise or generate the bytecode. It also means it can be documented better than it is now.

    There are still a few things which are a mystery to me, but once I get a Spin Compiler running it should be a simple case of comparing what I can generate against what the Propeller Tool does and whittle away at the differences. Still a way to go until then !

    There's an endian bug in the 0.02 version for Spin literal constants. It's fixed and ready for the next version worth uploading.
  • Lawson Posts: 870
    edited 2007-08-07 18:18
    I'm just thinking of the possibilities of a Proto Board development station. A Proto Board with an SD card slot and the accessory kit theoretically has everything required to be a mobile development station for the Propeller and maybe even the BASIC Stamp line. Sorry, just a vaporware project I've been thinking about. (Two 32K x 32-bit arrays would be just fine as long as access is fairly linear. I've assumed that this project would need a virtual memory manager of some kind.)

    Don't mind my distraction,
    Marty
  • hippy Posts: 1,981
    edited 2007-08-09 05:11
    Not far from a 1.00 release ... Latest 0.03 version attached to first post.

    Fingers crossed when I say; I've got variable placement and naming finished, including a mechanism for local / stack variables. Source code can now be inlined with the decompilation so it's easier to see what the bytecode relates to, but can get out of kilter with uncalled/unused methods. Various bug fixes and cosmetic changes.

    Now it's mainly a case of decompiling as many images as I can find, looking for what breaks things and fixing them, while trying to identify bytecode sequences I haven't encountered and working out what they actually mean.
  • hippy Posts: 1,981
    edited 2007-08-11 16:26
    Finally, Version 1.00 has arrived, attached to first post.

    The criterion for release was being able to disassemble all .eeprom files which come with the PropTool installation without any crashes or unexpected surprises and with reasonable results. The executable is now called PropView.exe, with PropLoad.exe reverting to the Windows non-XP Propeller downloader it was originally intended to be ( not written yet ). Apologies for the confusion and change caused through 'getting carried away'.

    I am not sure about the benefits of releasing the source code at present. It is a complete mess with the hacking used to make it work. It would really benefit from a complete rewrite using the knowledge learned, as would most proof of concept code written in a hurry. That is not currently planned. My belief is that it would be far more useful to document the Spin bytecode, which is underway.

    The goal was to determine what is generated by the Propeller Tool in order to generate bytecode from something else. I believe I have achieved that and it would be a near impossible task to decompile all arbitrary code, especially when it contains self-modifying code.

    I cannot guarantee disassembly is perfect, but what I am seeing makes sense to me. The $3C opcode is a 'known unknown' because I have not encountered its use so far. There may be other esoteric issues I have missed. I do not expect any help from Parallax ( their position is quite clear and understandable ), but if anyone can identify where I am getting anything wrong I would be very grateful for enlightenment.

    Post Edited (hippy) : 8/11/2007 4:31:57 PM GMT
  • Fred Hawkins Posts: 997
    edited 2007-08-13 15:35
    I get this:

    Run-time error '339'

    Component 'RICHTX32.OCX' or one of its dependencies not correctly registered: a file is missing or invalid

    when I double clicked on PropView.exe.
  • hippy Posts: 1,981
    edited 2007-08-13 16:00
    @ Fred Hawkins: That's the Rich Text Box Control. That should come as a standard part of the VB6 run-time which needs to be installed - www.microsoft.com/downloads/details.aspx?familyid=7b9ba261-7a9c-43e7-9117-f673077ffb3c&displaylang=en
  • Ale Posts: 2,363
    edited 2007-08-13 16:02
    hippy: As soon as you have some disassembler rules drafted, it would be useful if you could post them, so others (me included) can benefit from your hard work :) I know that comes soon.
  • Fred Hawkins Posts: 997
    edited 2007-08-13 17:19
    Hippy, That's where google sent me for vb6 runtime. Maybe I have to reset for its installation to take effect.

    Well, that didn't help at all. Frigging microsoft.

    Does anyone have any advice about removing VB6runtime?

    Post Edited (Fred Hawkins) : 8/13/2007 6:18:03 PM GMT
  • hippy Posts: 1,981
    edited 2007-08-13 19:04
    @ Fred : Sorry the link wasn't useful, and I'm not sure what to suggest.

    @ Ale : I've put up a rough draft of the bytecode description as an attachment to the first post but there's a lot of text to still write.

    With other things going on I've not been able to get my head into a clear thinking mode. Some of the bytecode sequences are quite complex to describe, particularly Repeat-From-To and Select-Case. Easy enough for me to see what it is doing but not to put into words.

    It needs turning into a far more readable form and there are two more documents which need writing to consolidate everything together -

    (1) What bytecode to generate to achieve the equivalent of a Spin command which makes the use of the opcodes become much clearer and their function more understandable

    (2) How the memory is divided up, how the "LINK" vector table works at the start of the main program, and how sub-objects are nested within each other. This would also be the architectural description of the Spin Virtualised Processor.
  • Fred Hawkins Posts: 997
    edited 2007-08-14 00:12
    Hippy, one difference found: your link is service pack 6, my link and load was from service pack 5.

    2nd difference: service pack 6 (VB6.0-KB290887-X86.exe) has a eula that claims that the update is only for people authorized to use it or something like that. When you accept the eula and run it, you get vbrun60sp6.exe, which hangs. God I hate Microsoft.

    Post Edited (Fred Hawkins) : 8/14/2007 12:31:26 AM GMT
  • hippy Posts: 1,981
    edited 2007-08-14 00:48
    @ Fred : I feel your pain. Luckily ( touch wood ) I've never had any problems installing the run time on any systems I've had to use. That doesn't help your predicament though.

    @ Ale : I've attached MainSpec.txt to the first post. This is a draft overview of how objects and sub-objects fit together. Basically, the start of that "Spin Architecture" document.

    If there are any specific questions on anything I'll do my best to try and answer them, but please bear in mind I don't have a Propeller Chip yet, and only started looking at any of this a couple of weeks ago so I'm still stumbling around in the dark trying to make educated guesses and make sense of what there is.
  • Ale Posts: 2,363
    edited 2007-08-14 08:29
    Hippy: That info is very interesting indeed! I grasped just the surface when I made the loader for the DAT section (in pPropellerTool); hopefully now I'd be able to load any DAT section at will :) Thanks!

    Another release of pPropellerSim comes... :)
  • Fred Hawkins Posts: 997
    edited 2007-08-14 12:17
    Hippy, I deleted it all but the pdf and texts. But maybe I can persuade you to run your program past Biffle's forth.
  • hippy Posts: 1,981
    edited 2007-08-14 15:22
    Sorry you didn't have any luck :(

    I had little joy with disassembling the Forth interpreter, although it showed up a couple of bugs in my own code.

    The first problem was that the code re-starts Cog 0 with an assembler routine. Knowing the code will never return, the usual Spin Return or Repeat which would terminate the PUB method isn't there, so the disassembler goes crashing through the rest of the image thinking it's Spin when it isn't. I assume the .binary file is not created by the Propeller Tool or was patched-up post creation.

    Even with the disassembler forced to start from the first Cog instruction it doesn't get very far as additional Cogs are launched on-the-fly, code for which cannot be determined automatically.

    The disassembler isn't designed for this type of code. A better approach would be to emulate Spin and Assembler as it executes and identify data, Spin and Assembler locations that way, but that's a lot of work and well beyond what I am planning to do.

    I've included what I did manage to extract, but it's not very exciting -
  • Paul Baker Posts: 6,351
    edited 2007-08-14 18:00
    Cliff wrote his own assembler, which he used to compile the Forth. He claimed his assembler "freed" him from Spin, when in actuality what his assembler does is inject the Spin byte code equivalent of coginit(0,pc+1,0).

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Paul Baker
    Propeller Applications Engineer

    Parallax, Inc.
  • deSilva Posts: 2,967
    edited 2007-08-14 18:06
    Paul Baker (Parallax) said...
    ... when in actuality what his assembler does is inject the Spin byte code equivalent of coginit(0,pc+1,0)
    :) But this is not his fault - it's yours!
  • Mike Green Posts: 23,101
    edited 2007-08-14 18:07
    The boot loader always starts a Spin interpreter in cog #0 when it's done loading from a PC or EEPROM. Chip posted a 32 byte preamble that does a coginit(0,$20,0) to start up an assembly program beginning at location $20. That's probably what you're running into. This preamble is:
    Minimal Spin bootstrap code for assembly language launch
    --------------------------------------------------------
    $0000: HZ HZ HZ HZ CR CS 10 00 LL LL 18 00 18 00 10 00
    $0010: FF FF F9 FF FF FF F9 FF 35 37 04 35 2C -- -- --
    $0020: your assembly code starts here - loaded into COG #0
    
    elaboration:
    $0000: HZ HZ HZ HZ - internal clock frequency in Hz (long)
    $0004: CR          - value to be written to clock register (byte)
    $0005: CS          - checksum so that all RAM bytes will sum to 0 (modulus 256)
    $0006: 10 00       - 'pbase' (word) must be $0010
    $0008: LL LL       - 'vbase' (word) number of longs loaded times 4
    $000A: 18 00       - 'dbase' (word) above where $FFF9FFFF's get placed
    $000C: 18 00       - 'pcurr' (word) points to Spin code
    $000E: 10 00       - 'dcurr' (word) points to local stack
    $0010: FF FF F9 FF - below local stack, must be $FFF9FFFF
    $0014: FF FF F9 FF - below local stack, must be $FFF9FFFF
    $0018: 35          - push #0   (long written to $0010)
    $0019: 37 04       - push #$20 (long written to $0014)
    $001B: 35          - push #0   (long written to $0018)
    $001C: 2C          - COGINIT(0, $20, 0) - load asm code from $20+ into same COG #0
    $001D: -- -- --    - filler
    $0020: XX XX XX XX - 1st long of asm program to be loaded into COG #0
    $0024: XX XX XX XX - 2nd long of asm program to be loaded into COG #0
    $0028:             - rest of data
    
    Note: 'vbase' is the total number of bytes loaded.  If valid, the IDE
    will load the binary file and transfer it to a Propeller.
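As a rough illustration of the layout listed above, the 16-byte header can be unpacked and the checksum rule checked in a few lines of Python. This is a sketch under assumptions: the `ImageHeader`, `parse_header`, `checksum_ok` and `checksum_for` names are made up for the example, multi-byte fields are read low byte first (as the `10 00` encoding of $0010 in the listing suggests), and the bytes passed in are assumed to be everything that counts toward the "sum to 0" rule.

```python
import struct
from typing import NamedTuple


class ImageHeader(NamedTuple):
    clock_hz: int    # $0000 long: internal clock frequency in Hz
    clock_reg: int   # $0004 byte: value written to the clock register
    checksum: int    # $0005 byte: makes all bytes sum to 0 (mod 256)
    pbase: int       # $0006 word
    vbase: int       # $0008 word
    dbase: int       # $000A word
    pcurr: int       # $000C word
    dcurr: int       # $000E word


def parse_header(image: bytes) -> ImageHeader:
    """Unpack the first 16 bytes of an image, assuming little-endian fields."""
    if len(image) < 16:
        raise ValueError("image is shorter than the 16-byte header")
    return ImageHeader(*struct.unpack_from("<IBBHHHHH", image, 0))


def checksum_ok(image: bytes) -> bool:
    """True if all supplied bytes sum to 0 modulo 256."""
    return sum(image) % 256 == 0


def checksum_for(image: bytes) -> int:
    """Value for byte $0005 that makes the supplied bytes sum to 0 mod 256."""
    others = sum(image) - image[5]
    return (-others) % 256
```

For the preamble above, `parse_header` should report `pbase == 0x10`, `dbase == 0x18` and `pcurr == 0x18`, with `vbase` equal to the number of bytes in the file.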
    
    
  • hippy Posts: 1,981
    edited 2007-08-14 19:46
    @ Paul, @ Mike : That is similar to what I'm seeing. The COGINIT appears to have been embedded into a PUB method rather than using Chip's preamble.

    My disassembler would have caught the COGINIT link into Assembler and identified the code as such if it had been a "PUSH #OBJ+$10.LONG" not a "PUSH #$20", but I didn't think of everything :)

    As it looks like I'll be pulling similar tricks myself I'll add special case tests for Chip's preamble and restarting Cog 0 in the first PUB method, and improve my Assembler detection for COGINIT's used with absolute addresses.
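A special-case test for Chip's preamble, as posted by Mike Green earlier in the thread, could look something like the Python sketch below. It is not the check used in the real disassembler; the `is_chip_preamble` and `asm_entry` names are invented here, and the test simply compares the fixed bytes at $0010 to $001C plus the `pbase` and `pcurr` words against the values in that listing.

```python
from typing import Optional

# Fixed bytes at $0010..$001C of the preamble: two $FFF9FFFF longs,
# then push #0, push #$20, push #0, COGINIT (values from the listing).
PREAMBLE_BYTES = bytes.fromhex("FFFFF9FF" "FFFFF9FF" "35" "3704" "35" "2C")
ASM_START = 0x20  # assembly code begins here according to the listing


def is_chip_preamble(image: bytes) -> bool:
    """Heuristic: does the image start with the minimal Spin bootstrap?"""
    if len(image) < ASM_START:
        return False
    pbase = int.from_bytes(image[0x06:0x08], "little")
    pcurr = int.from_bytes(image[0x0C:0x0E], "little")
    return (
        pbase == 0x10
        and pcurr == 0x18
        and image[0x10:0x1D] == PREAMBLE_BYTES
    )


def asm_entry(image: bytes) -> Optional[int]:
    """Return the assembler start address if the preamble is recognised."""
    return ASM_START if is_chip_preamble(image) else None
```

When the test matches, everything from $0020 onward can be tagged as Assembler without having to decode the absolute address in the `PUSH #$20`.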
  • The Wangster Posts: 13
    edited 2007-09-10 13:02
    Sorry if this is a stupid question, but what's stopping Chip from releasing his bytecode specs? Why do we have to go this long route?
  • hippy Posts: 1,981
    edited 2007-09-10 13:43
    @ The Wangster : From Paul Baker (Parallax) in http://forums.parallax.com/showthread.php?p=580679 - "they feel releasing the bytecode would open a can of worms from a support perspective that they don't want to deal with."
  • Damien Allen Posts: 103
    edited 2007-11-26 23:37
    Just wondering if you have made any progress in this area? I've learned a lot from what you've posted so far....need more information.....

    Cheers