Spin Bytecode Disassembler
hippy
Posts: 1,981
Further to my long term possible plans ( a Spin compiler for Windows non-XP, an alternative language compiler / translator to Spin / Bytecode / Assembler ), the first step has been to produce a Spin Bytecode Disassembler, mainly to get to grips with what's going on under the hood - Many thanks go to Robert Bryon Vandiver ( Gear ) and Cliff L Biffle for the groundwork they did in identifying a considerable chunk of the bytecode and image format.
The disassembler traverses the image, identifies Spin and Assembler, and then generates output in a form which a ( yet to exist ) Spin Assembler can convert back to an image. This will be the back-end code generator for the future projects. It's a very early beta and a lot more work needs to be done, especially on variables, and it does not yet support Spin programs which include objects.
As there's no official definition of Spin bytecode out there, I simply designed my own mnemonics. They probably aren't to everyone's taste.
A zipped executable is attached for anyone who feels like playing with it ( VB6 runtime required ), and an example of my Spin test code and the bytecode disassembly is in the PDF.
Comments
Very interesting stuff. If we could only make a Spin compiler, maybe a new era of (super) fast Spin can begin!
Is there a way you can make it GPL or some other open-source license? After all, Gear is under the GPL v2. (Maybe C or Java would be a better choice if cross-platform beyond winblows is also intended :-D)
Keep up the good work!
Gru
On Windows I am a VB6 programmer, so that's the source people will get. It will be open sourced and I will also deliver a proper install package. I generally try to make sure my VB6 will import into VB 2005 Express and run with no changes, and the GUI will be split from the main code to hopefully make porting easier if anyone wants to do that. I have no Linux / Mac programming experience and use Windows 98, so I am primarily working towards creating a Spin IDE / compiler which runs under 98.
So far, Spin bytecode looks to be very efficient. The only scope for speed improvement would be constant and array index folding, while dead code removal would recover some space. The bytecode lacks single instructions to move source variables or constants to destination variables which would improve speed but it is understandable why they do not exist.
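To illustrate the constant folding mentioned above, here is a minimal sketch on a toy stack machine. The mnemonics (PUSH, PUSH_VAR, ADD, SUB, MUL, POP_VAR) are hypothetical and chosen only for illustration; they are not real Spin opcodes, and a real optimiser would also have to respect branch targets.

```python
import operator

# Hypothetical mnemonics for illustration only; these are NOT real Spin opcodes.
BINARY_OPS = {"ADD": operator.add, "SUB": operator.sub, "MUL": operator.mul}

def fold_constants(code):
    """Replace 'PUSH a, PUSH b, OP' with a single 'PUSH (a OP b)'.

    Only constant pushes ("PUSH") are folded; variable pushes ("PUSH_VAR")
    are left alone. Branch targets are ignored in this toy version.
    """
    out = []
    for instr in code:
        op = instr[0]
        if (op in BINARY_OPS and len(out) >= 2
                and out[-1][0] == "PUSH" and out[-2][0] == "PUSH"):
            b = out.pop()[1]
            a = out.pop()[1]
            out.append(("PUSH", BINARY_OPS[op](a, b)))
        else:
            out.append(instr)
    return out

# x := (2 + 3) * 4 expressed as stack code
program = [("PUSH", 2), ("PUSH", 3), ("ADD",), ("PUSH", 4), ("MUL",), ("POP_VAR", "x")]
folded = fold_constants(program)
print(folded)
```

Six instructions collapse to two here, which is the kind of saving folding can give when the compiler has not already done it.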
I can see plenty of scope for where all that I am learning could take me, but first I'll try and get my foundations secured.
Evan waves :)
Nice work!
Unfortunately mscomm32.ocx is missing on my system. Some others may find this a difficulty. Probably you could add the OCX to the zip package.
Thanks.
Attila
Better Assembler traversing and disassembly. Various cosmetic changes. Still a long way to go !
Waves back at Evan :)
Not there yet, but a lot further on now. Supports Objects, there's better program traversing and Spin bytecode decoding has improved. I think I've got the decoding almost finalised.
Variables are not yet named or placed, so no 'source file' can be produced which could be re-compiled to create the same bytecode. That will be the next big effort, then an actual bytecode compiler.
The .ZIP now includes the executable and Spin source and Eeprom files along with the .PDF which illustrates the decompilation of those Spin files.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Cheers,
Simon
www.norfolkhelicopterclub.co.uk
You'll always have as many take-offs as landings, the trick is to be sure you can take-off again ;-)
BTW: I type as I'm thinking, so please don't take any offense at my writing style
Marty
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Lunch cures all problems! have you had lunch?
The important thing I see from this is not so much the program itself but the understanding of the bytecode, layout and file format. With that revealed, everyone is free to write their own software to utilise or generate the bytecode. It also means it can be documented better than it is now.
There are still a few things which are a mystery to me, but once I get a Spin Compiler running it should be a simple case of comparing what I can generate against what the Propeller Tool does and whittle away at the differences. Still a way to go until then !
There's an endian bug in the 0.02 version for Spin literal constants. It's fixed and ready for the next version worth uploading.
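As a small illustration of how this kind of endian bug shows up, the sketch below reads the same bytes as a multi-byte constant in both byte orders. The helper name `read_constant` is hypothetical, and the byte order is deliberately left as a parameter because the thread does not state which order the bytecode uses.

```python
def read_constant(image, offset, size, byteorder):
    """Decode a 'size'-byte unsigned constant starting at 'offset'.

    'byteorder' is "big" or "little"; which one is right for a given
    image format is an assumption the caller has to supply.
    """
    return int.from_bytes(image[offset:offset + size], byteorder)

raw = bytes([0x12, 0x34])
as_big = read_constant(raw, 0, 2, "big")
as_little = read_constant(raw, 0, 2, "little")
print(hex(as_big), hex(as_little))  # 0x1234 0x3412
```

Single-byte constants decode identically either way, so a bug like this only appears once a test program uses larger literals.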
Don't mind my distraction,
Marty
Fingers crossed when I say; I've got variable placement and naming finished, including a mechanism for local / stack variables. Source code can now be inlined with the decompilation so it's easier to see what the bytecode relates to, but can get out of kilter with uncalled/unused methods. Various bug fixes and cosmetic changes.
Now it's mainly a case of decompiling as many images as I can find, looking for what breaks things and fixing them, while trying to identify bytecode sequences I haven't encountered and working out what they actually mean.
The criterion for release was being able to disassemble all the .eeprom files which come with the PropTool installation without any crashes or unexpected surprises and with reasonable results. The executable is now called PropView.exe, with PropLoad.exe reverting to the Windows non-XP Propeller downloader it was originally intended to be ( not written yet ). Apologies for the confusion and change caused through 'getting carried away'.
I am not sure about the benefits of releasing the source code at present. It is a complete mess with the hacking used to make it work. It would really benefit from a complete rewrite using the knowledge learned, as would most proof of concept code written in a hurry. That is not currently planned. My belief is that it would be far more useful to document the Spin bytecode, which is underway.
The goal was to determine what is generated by the Propeller Tool in order to generate bytecode from something else. I believe I have achieved that and it would be a near impossible task to decompile all arbitrary code, especially when it contains self-modifying code.
I cannot guarantee disassembly is perfect, but what I am seeing makes sense to me. The $3C opcode is a 'known unknown' because I have not encountered its use so far. There may be other esoteric issues I have missed. I do not expect any help from Parallax ( their position is quite clear and understandable ), but if anyone can identify where I am getting anything wrong I would be very grateful for enlightenment.
Post Edited (hippy) : 8/11/2007 4:31:57 PM GMT
Run-time error '339'
Component 'RICHTX32.OCX' or one of its dependencies not correctly registered: a file is missing or invalid
when I double clicked on PropView.exe.
Well, that didn't help at all. Frigging microsoft.
Does anyone have any advice about removing VB6runtime?
Post Edited (Fred Hawkins) : 8/13/2007 6:18:03 PM GMT
@ Ale : I've put up a rough draft of the bytecode description as an attachment to the first post but there's a lot of text to still write.
With other things going on I've not been able to get my head into a clear thinking mode. Some of the bytecode sequences are quite complex to describe, particularly Repeat-From-To and Select-Case. Easy enough for me to see what it is doing but not to put into words.
It needs turning into a far more readable form and there are two more documents which need writing to consolidate everything together:
(1) What bytecode to generate to achieve the equivalent of each Spin command, which makes the use of the opcodes much clearer and their function more understandable.
(2) How the memory is divided up, how the "LINK" vector table works at the start of the main program, and how sub-objects are nested within each other. This would also be the architectural description of the Spin Virtualised Processor.
2nd difference: service pack 6 (VB6.0-KB290887-X86.exe) has a eula that claims that the update is only for people authorized to use it or something like that. When you accept the eula and run it, you get vbrun60sp6.exe, which hangs. God I hate Microsoft.
Post Edited (Fred Hawkins) : 8/14/2007 12:31:26 AM GMT
@ Ale : I've attached MainSpec.txt to the first post. This is a draft overview of how objects and sub-objects fit together. Basically, the start of that "Spin Architecture" document.
If there are any specific questions on anything I'll do my best to try and answer them, but please bear in mind I don't have a Propeller Chip yet, and only started looking at any of this a couple of weeks ago so I'm still stumbling around in the dark trying to make educated guesses and make sense of what there is.
Another release of pPropellerSim comes...
I had little joy with disassembling the Forth interpreter, although it showed up a couple of bugs in my own code.
The first problem was that the code re-starts Cog 0 with an assembler routine and, knowing the code will never return, leaves out the usual Spin Return or Repeat which would terminate the PUB method. The disassembler therefore goes crashing through the rest of the image thinking it's Spin when it isn't. I assume the .binary file was not created by the Propeller Tool or was patched up after creation.
Even with the disassembler forced to start from the first Cog instruction it doesn't get very far as additional Cogs are launched on-the-fly, code for which cannot be determined automatically.
The disassembler isn't designed for this type of code. A better approach would be to emulate Spin and Assembler as it executes and identify data, Spin and Assembler locations that way, but that's a lot of work and well beyond what I am planning to do.
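The limitation described above can be shown with a rough sketch of a static, worklist-style traversal. This is not PropView's actual logic: the four opcodes and their encodings are a hypothetical toy instruction set, and `find_code` is an illustrative name. The walk only follows targets it can read directly from operands, so anything launched through a computed address stays invisible.

```python
# Hypothetical toy encoding for illustration; NOT the real Spin bytecode.
OP_NOP, OP_JMP, OP_BRZ, OP_RET = 0x00, 0x01, 0x02, 0x03
SIZES = {OP_NOP: 1, OP_JMP: 2, OP_BRZ: 2, OP_RET: 1}  # opcode + operand bytes

def find_code(image, entry_points):
    """Return the set of byte offsets reachable as code from the entry points."""
    code = set()
    pending = list(entry_points)
    while pending:
        pc = pending.pop()
        while 0 <= pc < len(image) and pc not in code:
            op = image[pc]
            size = SIZES.get(op)
            if size is None or pc + size > len(image):
                break                      # unknown opcode: stop rather than guess
            code.update(range(pc, pc + size))
            if op == OP_RET:
                break                      # a return ends this path
            if op == OP_JMP:
                pc = image[pc + 1]         # unconditional: follow the target only
                continue
            if op == OP_BRZ:
                pending.append(image[pc + 1])  # conditional: queue the target too
            pc += size
    return code

# BRZ 4 / NOP / RET / JMP 7 / <data> / RET / <data>
image = bytes([0x02, 0x04, 0x00, 0x03, 0x01, 0x07, 0xFF, 0x03, 0xAA])
reachable = find_code(image, [0])
print(sorted(reachable))  # [0, 1, 2, 3, 4, 5, 7]
```

Offsets 6 and 8 are never reached and so are treated as data. If the RET at offset 3 were missing, the walk would run straight on into whatever follows, which is the failure mode seen with the Forth image.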
I've included what I did manage to extract, but it's not very exciting -
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Paul Baker
Propeller Applications Engineer
Parallax, Inc.
My disassembler would have caught the COGINIT link into Assembler and identified the code as such if it had been a "PUSH #OBJ+$10.LONG" rather than a "PUSH #$20", but I didn't think of everything.
As it looks like I'll be pulling similar tricks myself, I'll add special case tests for Chip's preamble and for restarting Cog 0 in the first PUB method, and improve my Assembler detection for COGINITs used with absolute addresses.
Cheers