Shop OBEX P1 Docs P2 Docs Learn Events
SPIN Interpreter Internals (link included) — Parallax Forums

SPIN Interpreter Internals (link included)

asterickasterick Posts: 158
edited 2007-01-07 21:47 in Propeller 1
Given light of some interest (one person) in my SPIN interpreter research, I went ahead and decided to do a write up of everything I know on the inner workings.

www.sublab.net/spin.html

This covers all the operation codes I could get the compiler to produce. Anything that is not known is marked.

EDIT: Added opcode 0x07 (Call indexed object)

Post Edited (asterick) : 1/5/2007 10:55:51 PM GMT
«1

Comments

  • rokickirokicki Posts: 1,000
    edited 2007-01-03 19:25
    The "unknown" second word in the object linkage table is, I believe, the "var" offset for that object with respect to the current object.
    (You can test this by adding and removing variables and watch it change.)

    Also, your python code fails for me; I'm trying to understand why. (All I get is numbers, no op codes.)

    Based on these results you might share with people what makes the most compact code. (I.e., the first seven (arguments + locals)
    are more concisely accessed than the rest, also the first 8 long vars are more concisely accessed; also longs are faster than
    either bytes or words in general).

    It looks to me like the "PC relatives" (branch offsets and the like) use a two-byte form frequently when a one-byte form would
    suffice.

    Maybe I'll just write a binary to binary optimizer that does the obvious peephole improvements.
  • asterickasterick Posts: 158
    edited 2007-01-03 19:34
    The python code fails because I forgot to remove a "return" in the DumpHeader function. I was testing the checksum code, and the disassembly got in the way.

    The relative branches are 7-bit signed, so they can only be between -64 to 63 (the MSB means that it's a 16-bit branch)

    Local 0 is the RESULT or return value. Local 1-7 are the call arguements, followed by local variables.

    Yes, longs are more efficient, so long as they take up the first 8 entries in the VAR tables. Also interleaving BYTEs and LONGs results in alignment and will steal bytes from you.

    As far as your "var" offset, I don't know, it seems pretty ambiguous. 1st and 2nd tier objects have a zero (1st tier is ignored), where as the 3rd tier tend to have wildly large values (0x3000+)

    I also discovered another opcode, which leaves only 2 unknown in the main instruction set.

    I've included an update disassembler, which dumps all the crazyness.
  • asterickasterick Posts: 158
    edited 2007-01-03 19:41
    Something interesting that might be tried, I don't know if it's been attempted before... but someone might try doing this:

    VAR
      LONG Interp[noparse][[/noparse]512]
    
    PUB start | index
      repeat index from -496 to 15 step 1
        Interp[noparse][[/noparse]index+496] := SPR[noparse][[/noparse]index]
    
      ' TODO: Output to COM
    
    



    to see if we can get a dump for an emulator. I don't actually own a propeller board yet, so I can't do it myself. I'm almost certain that the people at parallax thought to do boundary checking on that function, but you never know. They had the foresight to encrypt the rom after all.
  • rokickirokicki Posts: 1,000
    edited 2007-01-03 19:45
    Hmm, would you share the binary you are decompiling that shows strange offsets?

    My interpretation of the second word of the object is as follows.

    Given an object with a var ptr of v.

    This object has two subobjects, o1 and o2. The var offset of o1 is v1, and the var
    offset of o2 is v2.

    When o1 or o2 is invoked, the var ptr should be set to v + v1 or v + v2, respectively.

    Essentially, it's an offset from the var ptr of the enclosing object to get to the var
    pointer of the current object.

    (Try including the same object multiple times; you'll get multiple subobject entries,
    each with a different var offset, reflecting that they are different objects, yet they
    share the same code.)

    If you have an example that shows this not to be the case, I'm interested in it.
  • rokickirokicki Posts: 1,000
    edited 2007-01-03 19:47
    The SPR trick with values less than 0 or greater than 15 does not work, at least not for me.

    I was hoping to get a debugger working by modifying the main-memory code in-place, but
    have not yet figured out everything I need. (I was going to modify the main-memory code
    to put a branch-back-to-itself, or maybe a wait-pne or something).

    What we should consider doing is coming up with a symbol table format, since clearly that's
    lacking and useful.
  • mike101videomike101video Posts: 43
    edited 2007-01-03 19:50
    This looks like the basis for a C compiler, that compiles into spin Bytecodes???? Or a Basic that uses the spin bytecode?? A whole world opens up.

    Mike

    PS - I'm interested as well.. so make it TWO people....
  • asterickasterick Posts: 158
    edited 2007-01-03 19:51
    Aaah, looking at it, you are correct.


    The WORD after the child object offset is the VAR block offset relative to the parent object's VAR block offset (nrr)

    I didn't really mess with custom objects, I usually just messed with the library objects, which makes sense that they would go crazy, since they can have massive variable blocks.
  • asterickasterick Posts: 158
    edited 2007-01-03 20:01
    Haha, good to see there is some interest. I updated the page to include your info rokicki (with props given)

    I'm going to be cleaning it up after a bit, so it's easier to read. Right now it's a bit slapped together. I also need to describe what the codes themselves do, and how they affect the stack.

    Those voids in the instruction set are glaring at me though. [noparse]:)[/noparse]
  • rokickirokicki Posts: 1,000
    edited 2007-01-03 20:09
    Oh yeah, I'm currently looking at using these bytecodes as a target for BASIC. I'm also considering building a JIT BASIC using multiple cogs. smile.gif
    (The JIT BASIC would just be a stunt; only the smallest and simplest loops would stay in assembly---but they would run fast!)

    Using the bytecodes as a BASIC target would ease the memory pressure on the microcomputer I'm building because I would reuse the
    existing ROM code rather than taking up space for my own code. And for simplicity I'd use as few opcodes as possible to keep the "compiler"
    simple.

    Other tricks you can do pretty easily with this information:

    dynamic loading and execution of SPIN binaries

    loading and executing a SPIN binary with assembly just to load up and start a COG, and then free that memory back up. This way you
    could load a *bunch* of functionality into some cogs (with assembly language) and then free up all that shared RAM for other uses.

    debuggers/tracers

    passing an object by reference (this is a tiny bit tricky)
  • asterickasterick Posts: 158
    edited 2007-01-03 20:15
    Objects by reference is POSSIBLE, but painful, yes. Allocating new objects would be a feat and a half, to say the least.

    It wouldn't be a huge challenge if you used a custom compiler (since you could embed the VAR block inside the object itself)

    I don't think the system write protects code space.
  • rokickirokicki Posts: 1,000
    edited 2007-01-03 20:28
    I've tested if the system write protects code space---it does not.

    If you ignore multithreading, objects by reference is not that hard; you simply need to make
    sure the method signatures match. Everything else falls into place pretty easily.

    With multithreading, it's harder, since the code space is shared and you only have a byte
    index to use for a trampoline. But it should still be reasonably possible.
  • asterickasterick Posts: 158
    edited 2007-01-03 21:06
    Ok, I cleaned up the spec a little bit, it's not bearable.
  • Ym2413aYm2413a Posts: 630
    edited 2007-01-03 21:48
    bytecodes BASIC or C would rock!

    I like the spin bytecodes since they almost act as a form of compression when used correctly.
    Complex tasks can be done with a few spin opcodes which would normally take many 32bit ASM opcodes to do!

    I've once wrote complex code in ASM for the prop only to later write it in SPIN to save space! (at the cost of speed)
    smilewinkgrin.gif
  • asterickasterick Posts: 158
    edited 2007-01-03 21:59
    Yes, there are major benefits to code density if you use SPIN. It's designed to be as compact as possible, and even then there is potential for some squeezing. If you REALLY didn't care about code execution speed, you could easily do compression hooks on functions so they unpack themselves as they are executed
  • Ym2413aYm2413a Posts: 630
    edited 2007-01-04 18:31
    asterick said...
    Yes, there are major benefits to code density if you use SPIN. It's designed to be as compact as possible, and even then there is potential for some squeezing. If you REALLY didn't care about code execution speed, you could easily do compression hooks on functions so they unpack themselves as they are executed

    A perfect example of spin's compression is doing some multiplication.
    How many 32bit ASM ops would we need? 12 maybe?
    How many Spin ops would we need? I know it's less then 12. (lol)

    Correct me if I'm wrong, but I think of the spin ops as a bunch of useful little routines. You just call the routines as you need them.

    I mean the same could be done in ASM (make a whole bunch of routines for complex functions used a lot in your program and call them), but Parallex already has done this for us.

    Not to mention that these "routines" are already stored in ROM and don't cost us any memory!
    Maybe I see things from the wrong angle, But this is how I view the spin opcodes. lol.gif
  • kuismakuisma Posts: 134
    edited 2007-01-04 19:07
    Ym2413a said...

    A perfect example of spin's compression is doing some multiplication.
    How many 32bit ASM ops would we need? 12 maybe?
    mult:                 mov prod,#0
    :loop                 shr foo,#1 wz,wc
                if_c      add prod,bar
                          shl bar,#1
                if_nz     jmp #:loop
    
    
    


    5 instructions. wink.gif

    - Mikael
  • asterickasterick Posts: 158
    edited 2007-01-04 20:20
    And technically, you need a minimum of 2 spin operations to do a multiplication

    PUB mul( x, y )
    x *= y

    translates down to

    PUSH_LOCALMEM_LONG_2
    EFFECT_LOCALMEM_LONG_1 POP MULTIPLY

    which turns into 3 bytes.

    Still better than 20 needed for assembly. [noparse]:)[/noparse]
  • Ym2413aYm2413a Posts: 630
    edited 2007-01-04 20:25
    Mikael you are the men.

    Here's some old Z80 code for the task I pulled up on the net.
    Still needs more opcodes by far!

    INPUT: THE VALUES IN REGISTER B EN C
    ; OUTPUT: HL = B * C
    ; CHANGES: AF,DE,HL,B
    ;
    LD HL,0
    LD A,B
    OR A
    RET Z
    LD D,0
    LD E,C
    LOOP: ADD HL,DE
    DJNZ LOOP
    RET
  • asterickasterick Posts: 158
    edited 2007-01-04 20:28
    Welp, I've started on my rewrite of the interpreter and bootloader,

    I've made a bootloader that does everything except the serial comms / detection.

    Someone mind giving me some heads up on how it works?
  • Ym2413aYm2413a Posts: 630
    edited 2007-01-04 20:28
    asterick said...
    And technically, you need a minimum of 2 spin operations to do a multiplication

    PUB mul( x, y )
    x *= y

    translates down to

    PUSH_LOCALMEM_LONG_2
    EFFECT_LOCALMEM_LONG_1 POP MULTIPLY

    which turns into 3 bytes.

    Still better than 20 needed for assembly. [noparse]:)[/noparse]

    Yeah, now that's what I call compression. [noparse]:)[/noparse]
    I had a feeling multiplication would be a good example of the spin ops.

    Hey look, I wrote Multiplication in one asm op. (lol)
    "shl foo,bar"
    Only thing is you have to multiply by a power of two. *laughs* tongue.gif
  • asterickasterick Posts: 158
    edited 2007-01-04 20:29
    Aww YM, that's hardly fair! Besides, the z80 is Smile. 6502 all the way.

    Here's some ARM! Mwahaha.

    MUL r0, r1, r3
  • Ym2413aYm2413a Posts: 630
    edited 2007-01-04 20:33
    asterick said...
    Welp, I've started on my rewrite of the interpreter and bootloader,

    I've made a bootloader that does everything except the serial comms / detection.

    Someone mind giving me some heads up on how it works?

    Good news. Someone from parallex left some example code up here for that!
    Let me see if I can find it.

    (p.s) I like the z80. with it's 150,000 some instructions.(lol) MSX!!!

    Post Edited (Ym2413a) : 1/4/2007 8:40:30 PM GMT
  • asterickasterick Posts: 158
    edited 2007-01-04 20:38
    Pfft, it only had ~1792 instructions. I still bow down to the Hu6580.
  • Ym2413aYm2413a Posts: 630
    edited 2007-01-04 20:44
    Here you go buddy. This is one of them. ^^
    Something relating to a serial loader. Not sure if you've seen this or not.

    http://forums.parallax.com/showthread.php?p=622354
  • asterickasterick Posts: 158
    edited 2007-01-04 20:57
    I'll take it! I just need to know how the puppy works. I'm assuming it's something simple, but you never know.
  • GadgetmanGadgetman Posts: 2,436
    edited 2007-01-04 21:34
    asterick said...
    Aww YM, that's hardly fair! Besides, the z80 is Smile. 6502 all the way.

    Yeah, if you keep the program in the first 256Bytes(Zero-page mode) you may even get it to run faster than on the Z80...
    (That is, if they ran at the same clock-speed...)

    Those 16bit ADD and SUB, commands and the BIT commands, who needs them?

    Not to mention those pesky IX and IY registers?
    I mean, seriously, who would ever do anything as daft as work with databases, or even linked lists on an 8bit CPU?

    But where the Z80 really suffers is the Interrupts...
    It can barely bit-bang a 9600bps full-duplex serial link in the background, because the shadow-registers is such a clumsy solution to use during tight interrupts...
    And the ability to handle up to 128 different interrupts, well, that is just plain confusing, isn't it?

    Ooops...

    Left my brain in Irony-mode...

    Anyway, that piece of Z80 code among is the worst I've ever seen.
    (One of my hobbies is disassembling ROMs from assorted computers)
    Do you know what it does?
    To multiply B with C it copies C into the DE(16bit register), then ADDs it B times to the HL(16bit) register.
    if the B register contains 255, it'll loop 255 times, and the DJNZ command isn't nice to such loops.
    (The DJNZ command on the Propeller is much nicer such loops. )

    This one takes a bit more space, but it executes a heck of a lot faster:
    http://map.tni.nl/sources/external/z80bits.html#1.1

    Or read this page to learn why the Z80 was considered THE 8bit number-cruncher:
    http://home.arcor.de/andreadrian/oldcpu/Z80_number_cruncher.html

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Don't visit my new website...
  • Ym2413aYm2413a Posts: 630
    edited 2007-01-04 22:31
    Wow, I feel like I'm the only person in the world who still likes the z80 anymore.
    That sucks for me. [noparse]:([/noparse]

    I learned ASM on the 8bit z80. cry.gif

    Then again some people have called the propeller's opcodes weak and pointless and joke about its lack of interrupts, but I like the propeller too and like its ops as well.
    So I guess it's what your used too.

    Post Edited (Ym2413a) : 1/4/2007 10:41:09 PM GMT
  • LawsonLawson Posts: 870
    edited 2007-01-04 22:37
    DUDE! targeting the spin byte-code with another computer language. Make it stop! this darn Prop chip just keeps getting nicer and nicer. (wonder if GCC could be easily altered to target spin byte-code?)

    exciting stuff,
    Marty
  • asterickasterick Posts: 158
    edited 2007-01-04 22:52
    I just dislike the Z80 from a "Oh god how many instructions do they need" standpoint. I like the 6502 for it's simplicity, not it's MIPS. [noparse];)[/noparse] You really can't argue with any system that's been crammed in so many home computers and game consoles.
  • GadgetmanGadgetman Posts: 2,436
    edited 2007-01-05 00:57
    Do you have a CP/M machine somewhere?

    A Sinclair machine?
    (Timex Sinclair in the USA)

    A Gameboy?

    Anyway, I think it's time to revive the 'Old School Hacker' thread in the Sandbox...
    http://forums.parallax.com/forums/default.aspx?f=15&p=001&m=387

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Don't visit my new website...
Sign In or Register to comment.