SPIN Interpreter Internals (link included)
asterick
In light of some interest (one person) in my SPIN interpreter research, I went ahead and wrote up everything I know about the inner workings.
www.sublab.net/spin.html
This covers all the operation codes I could get the compiler to produce. Anything that is not known is marked.
EDIT: Added opcode 0x07 (Call indexed object)
Comments
(You can test this by adding and removing variables and watching it change.)
Also, your Python code fails for me; I'm trying to understand why. (All I get is numbers, no opcodes.)
Based on these results you might share with people what makes the most compact code. (I.e., the first seven arguments + locals are accessed more concisely than the rest; the first 8 long VARs are also accessed more concisely; and longs are generally faster than either bytes or words.)
It looks to me like the "PC relatives" (branch offsets and the like) use a two-byte form frequently when a one-byte form would
suffice.
Maybe I'll just write a binary-to-binary optimizer that does the obvious peephole improvements.
The relative branches are 7-bit signed, so they can only be between -64 and 63 (the MSB set means it's a 16-bit branch).
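Here's a minimal Python sketch of a decoder for that offset format, going purely by the description above; the exact bit packing of the two-byte form is my guess, so check it against real compiler output before trusting it.

def decode_branch_offset(data, i):
    # One-byte form: MSB clear, low 7 bits are a signed offset (-64..63).
    first = data[i]
    if first & 0x80 == 0:
        offset = first - 0x80 if first & 0x40 else first
        return offset, i + 1
    # Two-byte form: MSB set.  Combining the remaining 7 bits with the
    # next byte and sign-extending from bit 14 is an assumption here.
    value = ((first & 0x7F) << 8) | data[i + 1]
    if value & 0x4000:
        value -= 0x8000
    return value, i + 2

# e.g. a single byte 0x7F decodes to -1, and 0x3F decodes to +63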
Local 0 is the RESULT (return value). Locals 1-7 are the call arguments, followed by the local variables.
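To make that concrete, here's a tiny Python helper; the names and the one-long-per-slot view are my own, based on the layout described above and the LONG-sized local opcodes later in this thread.

def local_slot(index, n_args):
    # Slot 0 is RESULT; slots 1..n_args are the call arguments;
    # declared locals follow.  Each slot is a long.
    if index == 0:
        return "RESULT"
    if index <= n_args:
        return f"arg{index}"
    return f"local{index - n_args}"

# e.g. for PUB mul(x, y): slot 0 = RESULT, slot 1 = x, slot 2 = y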
Yes, longs are more efficient, as long as they take up the first 8 entries in the VAR table. Also, interleaving BYTEs and LONGs results in alignment padding that will steal bytes from you.
As for your "var" offset, I don't know; it seems pretty ambiguous. 1st- and 2nd-tier objects have a zero (the 1st tier is ignored), whereas 3rd-tier objects tend to have wildly large values (0x3000+).
I also discovered another opcode, which leaves only 2 unknowns in the main instruction set.
I've included an updated disassembler, which dumps all the craziness.
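If anyone wants to poke at binaries alongside the disassembler, here's a quick Python sketch of reading the standard binary header. The field layout is my reading of the usual .binary format, and 'program.binary' is just a placeholder filename, so verify it against your own dumps.

import struct

def read_binary_header(image):
    clkfreq, clkmode, checksum, pbase, vbase, dbase, pcurr, dcurr = \
        struct.unpack_from('<IBBHHHHH', image, 0)
    return {
        'clkfreq': clkfreq,   # system clock frequency in Hz
        'clkmode': clkmode,   # clock mode bits
        'checksum': checksum,
        'pbase': pbase,       # start of the top object
        'vbase': vbase,       # start of the VAR space
        'dbase': dbase,       # start of the stack/free space
        'pcurr': pcurr,       # initial program counter
        'dcurr': dcurr,       # initial stack pointer
    }

with open('program.binary', 'rb') as f:   # placeholder filename
    print(read_binary_header(f.read()))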
to see if we can get a dump for an emulator. I don't actually own a Propeller board yet, so I can't do it myself. I'm almost certain that the people at Parallax thought to do boundary checking on that function, but you never know. They had the foresight to encrypt the ROM, after all.
My interpretation of the second word of the object is as follows.
Given an object with a VAR ptr of v: this object has two sub-objects, o1 and o2. The VAR offset of o1 is v1, and the VAR offset of o2 is v2.
When o1 or o2 is invoked, the VAR ptr should be set to v + v1 or v + v2, respectively.
Essentially, it's an offset from the VAR ptr of the enclosing object that gets you to the VAR pointer of the current object.
(Try including the same object multiple times; you'll get multiple subobject entries,
each with a different var offset, reflecting that they are different objects, yet they
share the same code.)
If you have an example that shows this not to be the case, I'm interested in it.
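Put in code form, here's a small Python sketch of that rule. The two-WORDs-per-entry packing follows what's described in this thread, but exactly where the sub-object table starts inside the object header is an assumption, so 'table_off' is left as a parameter you fill in yourself.

import struct

def read_subobject_entries(image, obj_base, table_off, count):
    # Each entry (per this thread): a WORD giving the sub-object's offset
    # from the parent object, then a WORD giving its VAR offset from the
    # parent's VAR pointer.
    entries = []
    for n in range(count):
        code_off, var_off = struct.unpack_from(
            '<HH', image, obj_base + table_off + 4 * n)
        entries.append((obj_base + code_off, var_off))
    return entries

def child_var_ptr(parent_var_ptr, var_off):
    # When the sub-object is invoked, its VAR pointer is the enclosing
    # object's VAR pointer plus the stored VAR offset.
    return parent_var_ptr + var_off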
I was hoping to get a debugger working by modifying the main-memory code in place, but I have not yet figured out everything I need. (I was going to modify the main-memory code to insert a branch-back-to-itself, or maybe a WAITPNE or something.)
What we should consider doing is coming up with a symbol table format, since that's clearly lacking and would be useful.
Mike
PS - I'm interested as well.. so make it TWO people....
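On the symbol-table idea: here's a purely hypothetical strawman record, written in Python only for readability; every field name is my own invention, not anything the compiler emits.

from dataclasses import dataclass

@dataclass
class SpinSymbol:
    name: str          # identifier as written in the source
    kind: str          # 'method', 'var', 'dat', 'local', 'con', ...
    object_index: int  # which object in the image it belongs to
    offset: int        # byte offset within that object's block

A flat list of these, serialized next to the binary, would probably be enough for a debugger or tracer to map addresses back to names.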
The WORD after the child object offset is the VAR block offset relative to the parent object's VAR block offset (nrr)
I didn't really mess with custom objects; I usually just messed with the library objects, so it makes sense that they would go crazy, since they can have massive variable blocks.
I'm going to be cleaning it up after a bit, so it's easier to read. Right now it's a bit slapped together. I also need to describe what the codes themselves do, and how they affect the stack.
Those voids in the instruction set are glaring at me, though. :)
(The JIT BASIC would just be a stunt; only the smallest and simplest loops would stay in assembly, but they would run fast!)
Using the bytecodes as a BASIC target would ease the memory pressure on the microcomputer I'm building because I would reuse the
existing ROM code rather than taking up space for my own code. And for simplicity I'd use as few opcodes as possible to keep the "compiler"
simple.
Other tricks you can do pretty easily with this information:
- dynamic loading and execution of SPIN binaries
- loading and executing a SPIN binary with assembly just to load up and start a COG, and then free that memory back up. This way you could load a *bunch* of functionality into some cogs (with assembly language) and then free up all that shared RAM for other uses.
- debuggers/tracers
- passing an object by reference (this is a tiny bit tricky)
It wouldn't be a huge challenge if you used a custom compiler (since you could embed the VAR block inside the object itself)
I don't think the system write protects code space.
If you ignore multithreading, passing objects by reference is not that hard; you simply need to make sure the method signatures match. Everything else falls into place pretty easily.
With multithreading, it's harder, since the code space is shared and you only have a byte
index to use for a trampoline. But it should still be reasonably possible.
I like the Spin bytecodes since they almost act as a form of compression when used correctly.
Complex tasks can be done with a few Spin opcodes that would normally take many 32-bit ASM instructions!
I once wrote complex code in ASM for the Prop, only to later rewrite it in SPIN to save space! (At the cost of speed.)
A perfect example of Spin's compression is doing some multiplication.
How many 32-bit ASM ops would we need? Maybe 12?
How many Spin ops would we need? I know it's less than 12. (lol)
Correct me if I'm wrong, but I think of the Spin ops as a bunch of useful little routines. You just call the routines as you need them.
I mean, the same could be done in ASM (make a whole bunch of routines for complex functions used a lot in your program and call them), but Parallax has already done this for us.
Not to mention that these "routines" are already stored in ROM and don't cost us any memory!
Maybe I see things from the wrong angle, but this is how I view the Spin opcodes.
5 instructions.
- Mikael
PUB mul( x, y )
  x *= y
translates down to
PUSH_LOCALMEM_LONG_2
EFFECT_LOCALMEM_LONG_1 POP MULTIPLY
which turns into 3 bytes.
Still better than 20 needed for assembly. :)
Here's some old Z80 code for the task that I pulled up on the net.
It still needs far more opcodes!
; INPUT: THE VALUES IN REGISTERS B AND C
; OUTPUT: HL = B * C
; CHANGES: AF,DE,HL,B
;
      LD   HL,0       ; clear the running total
      LD   A,B
      OR   A          ; test B for zero
      RET  Z          ; B = 0, so the product is 0
      LD   D,0
      LD   E,C        ; DE = C
LOOP: ADD  HL,DE      ; HL += C...
      DJNZ LOOP       ; ...repeated B times
      RET
I've made a bootloader that does everything except the serial comms / detection.
Would someone mind giving me a heads-up on how it works?
Yeah, now that's what I call compression. :)
I had a feeling multiplication would be a good example of the Spin ops.
Hey look, I wrote multiplication in one ASM op. (lol)
"shl foo,bar"
The only thing is you have to multiply by a power of two. *laughs*
Here's some ARM! Mwahaha.
MUL r0, r1, r3
Good news. Someone from Parallax left some example code up here for that!
Let me see if I can find it.
(P.S.) I like the Z80, with its 150,000-some instructions. (lol) MSX!!!
Something relating to a serial loader. Not sure if you've seen this or not.
http://forums.parallax.com/showthread.php?p=622354
Yeah, if you keep the program in the first 256 bytes (zero-page mode), you may even get it to run faster than on the Z80...
(That is, if they ran at the same clock speed...)
Those 16-bit ADD and SUB commands, and the BIT commands... who needs them?
Not to mention those pesky IX and IY registers?
I mean, seriously, who would ever do anything as daft as work with databases, or even linked lists, on an 8-bit CPU?
But where the Z80 really suffers is the interrupts...
It can barely bit-bang a 9600bps full-duplex serial link in the background, because the shadow registers are such a clumsy solution to use during tight interrupts...
And the ability to handle up to 128 different interrupts? Well, that is just plain confusing, isn't it?
Ooops...
Left my brain in Irony-mode...
Anyway, that piece of Z80 code is among the worst I've ever seen.
(One of my hobbies is disassembling ROMs from assorted computers)
Do you know what it does?
To multiply B by C, it copies C into DE (a 16-bit register pair), then ADDs it to HL (also 16-bit) B times.
If the B register contains 255, it'll loop 255 times, and the DJNZ instruction isn't kind to such loops.
(The DJNZ instruction on the Propeller is much nicer for such loops.)
This one takes a bit more space, but it executes a heck of a lot faster:
http://map.tni.nl/sources/external/z80bits.html#1.1
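Just to show the difference in rough Python terms (a model, not cycle-accurate): the routine above adds C to HL up to 255 times, while a shift-and-add routine like the one linked always makes 8 passes for an 8-bit multiplier.

def mul_repeated_add(b, c):
    # What the routine above does: HL += DE, B times.
    hl = 0
    for _ in range(b):
        hl += c
    return hl

def mul_shift_add(b, c):
    # Shift-and-add: one pass per bit of the 8-bit multiplier.
    hl = 0
    for _ in range(8):
        hl <<= 1
        if b & 0x80:
            hl += c
        b = (b << 1) & 0xFF
    return hl

for b, c in [(0, 7), (1, 9), (3, 5), (200, 250), (255, 255)]:
    assert mul_repeated_add(b, c) == mul_shift_add(b, c) == b * c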
Or read this page to learn why the Z80 was considered THE 8bit number-cruncher:
http://home.arcor.de/andreadrian/oldcpu/Z80_number_cruncher.html
That sucks for me. :(
I learned ASM on the 8-bit Z80.
Then again, some people have called the Propeller's opcodes weak and pointless and joke about its lack of interrupts, but I like the Propeller too, and I like its ops as well.
So I guess it's what you're used to.
exciting stuff,
Marty
A Sinclair machine?
(Timex Sinclair in the USA)
A Gameboy?
Anyway, I think it's time to revive the 'Old School Hacker' thread in the Sandbox...
http://forums.parallax.com/forums/default.aspx?f=15&p=001&m=387