Spin Interpreter - Faster???

hippy · 2008-08-12 14:04

The way I did validation on my interpreter was to test common cases and boundary conditions, then try real programs to see if they worked. Exhaustive testing and validation isn't something I'm particularly into. I tended to focus on proving on paper that small sections of the interpreter code did the job, confirmed that with the small test set I used, and assumed it would work for all other cases and that I'd catch any bugs as they arrived.

I tested incrementally, so multiply results were compared against iterative add results, square root was tested by multiplying a number by itself and seeing if the root was the same as the original number, bit-wise invert was compared against xor, etc. If I had two results the same, either it worked or I had a bug in two different bits of code, which I deemed unlikely. There wasn't really a validation suite, just a set of test programs I tried, then moved on.
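The cross-checking idea above can be sketched in a few lines. This is only an illustration on a host PC, not hippy's actual test programs: Python's `*`, `math.isqrt` and `~` stand in for the interpreter's multiply, square root and invert, and the names `mul_by_repeated_add`, `cross_check` and `CASES` are made up for the example.

```python
import math

MASK32 = 0xFFFFFFFF  # Spin values are 32-bit longs


def mul_by_repeated_add(a: int, b: int) -> int:
    """Reference multiply: add `a` to an accumulator `b` times, wrapping at 32 bits."""
    acc = 0
    for _ in range(b):
        acc = (acc + a) & MASK32
    return acc


def cross_check(cases):
    """Return every input on which two independent routes to a result disagree."""
    failures = []
    for a, b in cases:
        # multiply (stand-in for the routine under test) vs iterative add
        if ((a * b) & MASK32) != mul_by_repeated_add(a, b):
            failures.append(("mul", a, b))
        # square a number, take the root, expect the original number back
        if math.isqrt(a * a) != a:
            failures.append(("sqrt", a))
        # bit-wise invert vs xor with all ones
        if (~a & MASK32) != (a ^ MASK32):
            failures.append(("not", a))
    return failures


# Common cases plus boundary conditions, as described in the post.
CASES = [(0, 0), (1, 1), (7, 9), (0xFFFF, 0xFFFF), (0x7FFFFFFF, 2), (0xFFFFFFFF, 3)]
print(cross_check(CASES))
```

An empty list means the two routes agreed on every case, which is the "either it works or I have two matching bugs" argument in code form.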

On re-purposing Spin bytecode, I'd reserve $3C for 'breakpoint'; it's conveniently byte-sized and can therefore be dropped on any other Spin bytecode. I have my suspicions that is what it may have been used for during Chip's own development. Once that's repurposed there isn't a convenient 'breakpoint' to be had and I'm sure it will be desirable in the future.
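A rough sketch of why a byte-sized breakpoint is convenient: it can overwrite the first byte of any bytecode and be undone by putting one byte back. This assumes a modified interpreter that traps $3C (the stock one does not implement it); `BreakpointTable` is a hypothetical helper, and the four sample bytes are the ones shown at pc $01E7 in the debug trace later in this thread.

```python
BREAKPOINT = 0x3C  # opcode hippy proposes reserving; NOT handled by the stock interpreter


class BreakpointTable:
    """Hypothetical helper: patches a bytecode image held in a bytearray."""

    def __init__(self, image: bytearray) -> None:
        self.image = image
        self.saved = {}  # address -> original opcode byte

    def set(self, addr: int) -> None:
        # Remember the original byte once, then drop the breakpoint on top of it.
        if addr not in self.saved:
            self.saved[addr] = self.image[addr]
            self.image[addr] = BREAKPOINT

    def clear(self, addr: int) -> None:
        # Put the original opcode byte back.
        if addr in self.saved:
            self.image[addr] = self.saved.pop(addr)


image = bytearray([0x37, 0x00, 0x65, 0x38])  # bytes from the trace at pc $01E7
table = BreakpointTable(image)
table.set(2)            # offset 2 is the start of the second bytecode ($65)
patched = bytes(image)
table.clear(2)
restored = bytes(image)
```

A multi-byte sequence such as '$37 $FF' could not be patched this way over a one-byte bytecode without clobbering whatever follows it.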

The next best option IMO is repurposing '$37 $8x' to '$37 $FF', load packed number, where the MSB is not used by Spin. There's also '$05 $00', call to method zero, '$38 $00', load byte constant zero, etc. if you just want shortish extra opcodes.

Once you start digging there are many bytecode sequences which aren't used by spin which can be repurposed. But if you want usefulness for people using PropTool/Propellent you have to use Spin commands they can specify and use arguments which wouldn't be valid to perform particular tasks, eg CogNew(0,@pasmCode) to launch overlaid PASM etc.

jazzed · 2008-08-12 17:14

"I have my suspicions ..."

Chip debug? ROFLMAO :) So heretical :) I'm certain it would have taken less than 8 years without it :)
I'm sorry, I just could not resist.

Seriously though ... Cluso99

So on this DAT style object, you need a way to reference the variable data labels from your interpreter for some debug/test output? At some point you will need to have a source line to byte code position marker to achieve a similar goal ... any idea how that will look? Object VAR variables may not be very well understood (you certainly can't build mixed type data structures with VAR). The rest of the spin binary format is pretty clear. Perhaps you can post a spin source example to translate to DAT style?

Given a Spin debugger, I assume you will have some type of memory access interface and one or more breakpoint mechanisms. Being able to see variable address/data at minimum is useful. An IDE that would allow displaying such data, like in VisualBasic or Eclipse, would be less tedious of course. VB has a break on variable state change in addition to line breaks.

Having some way to bring some PASM in-line would be excellent. Have you done it yet?

hippy · 2008-08-12 20:13

jazzed said...
"I have my suspicions ..."

Chip debug? ROFLMAO :) So heretical :) I'm certain it would have taken less than 8 years without it :)
I'm sorry, I just could not resist.

Obviously ... because $3C is not implemented in the interpreter ... Chip never had to resort to using it

Cluso99 · 2008-08-12 21:50

Marvellous what a night's sleep (or lack thereof) can do!

@Hippy: You have almost made a Spin debugger. Can you send/receive serial easily (presume so since you are probably in VB)? Can you highlight an opcode line (with color)? (yes, I know we are supposed to spell it colour in our countries)

I have a version of the interpreter which has a debug section controlled by LMM. I can output anything in hub or cog RAM easily via fdx (FullDuplexSerial) running in another cog. The version just saves bytes for this feature without anything missing - it's almost original except hub decoding, no overlays, and works perfectly. I can send it to you direct - message me at the contact on www.bluemagic.biz (I hate listing my address for fear of spam). It is easy for me to send you the details at each Spin bytecode (master) fetch and the stack, pc etc. details, which you could display in a frame. To step to the next bytecode, send me a character and I will advance. This will give us F8 single step. We could also step slowly by the same method or stop on a particular pc fetch. All easy to me.

Because you use the tv object and I use fdx, I am going to add fdx.out == fdx.tx to my object. Then I only need to change the xtal, tv.start and object, and I can use your code.

I don't want to post my interpreter until I have it basically working when all will be revealed.

@Jazzed: I'll have to answer you separately.

I have worked out my idea to test but it will have to wait - I have a plane to catch.

Updated: (plane is delayed due to fog)

@Jazzed: I have decided to use LMM-style code (in DAT) and copy it into the Spin code in hub RAM before launching the interpreter. With my debugger I can then see (and log for comparison) the execution of each bytecode group. I can also automate this so that I can grow it and use it for continual verification if I change something. The only problem is that I know the maths section will take 16 days to validate, so I will have to accept that not all values can be tested. Something like...

DAT
  byte push_#1
  byte memop_xxx_xxx
  byte mathop_rol

Anyway, this is what I am thinking at the moment.

Post Edited (Cluso99) : 8/12/2008 11:21:15 PM GMT

Cluso99 · 2008-08-27 22:44

I have seen the fantastic joint work on other threads, particularly the sqrt one, so I thought I would throw the following open (unfinished) debuggers into the forum.

Attached is a version of the Ram Interpreter which has inbuilt debugging.

The code is pretty much Chip's version with a debugger for spin.

There are a few caveats:
1. The Interpreter code is pretty much standard, with only a few modifications to save enough space to add the debugger.
2. The debugger uses hub ram $7000 onwards without allocation (this will change later)
3. I went back to old code, so I won't guarantee it, although I believe it works correctly.
4. I will update the debugger code via the thread "Debug SPIN and ASM - LMM style" and it will include operation of the debugger.
5. The debugger outputs to PST (Parallax/Propeller Serial Terminal)

See Debug SPIN and ASM - LMM style: http://forums.parallax.com/showthread.php?p=743062

The faster and much modified code will be posted later (soon I hope).

Post Edited (Cluso99) : 8/27/2008 10:57:04 PM GMT

Cluso99 · 2008-09-01 06:50

Here is my current version of Chip's PNut Interpreter (Spin ROM Interpreter).

NOTES:

1. It is untested, although the maths section has been checked. The lower decode section was working, but I have pruned more code from it since, including the removal of the pop... sections to use a vector as well.

2. There are sections with comments for me to improve next. The sqrt needs to be replaced with Chip's new code from the forum as well.

3. My work is a derivative of Chip's, who has graciously allowed me to publish it.

4. There is plenty of spare code space and the variables x...adr are not overlaying the startup code (for simplifying my debugger).

5. I have published the code just for viewing. It will compile, but not run - it is not yet ready for anyone to test.

6. You will note that I use an external hub-based set of vectors for decoding (uses 256 longs of hub). This improves the execution speed in decoding as well as reducing the code footprint.

7. Once the testing has been done, the extra code space can be used for either adding new features or speeding up the most used routines by inlining some routines.
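For readers who have not seen the hub vector table from note 6, here is a minimal host-side model of the lookup. It is a sketch, not the interpreter's code: hub RAM is modelled as a `bytearray`, the function names are invented, and the table base $2C60 and the entry $0C4 for opcode $37 are simply the values visible in the debug trace posted later in this thread (shl by 2, add the base, rdlong, jump). Whether the real vectors pack extra fields into each long is not stated, so the model treats a long as a bare cog address.

```python
import struct

HUB_SIZE = 0x8000  # 32 KB of hub RAM


def vector_address(table_base: int, opcode: int) -> int:
    """Hub address of the long that holds the handler for `opcode` (256 longs in all)."""
    return table_base + ((opcode & 0xFF) << 2)


def write_vector(hub: bytearray, table_base: int, opcode: int, cog_addr: int) -> None:
    """Store a handler address as a little-endian long in the table."""
    struct.pack_into("<I", hub, vector_address(table_base, opcode), cog_addr)


def fetch_vector(hub: bytearray, table_base: int, opcode: int) -> int:
    """Model of the rdlong: one hub read replaces the in-cog decode logic."""
    (value,) = struct.unpack_from("<I", hub, vector_address(table_base, opcode))
    return value


hub = bytearray(HUB_SIZE)
TABLE_BASE = 0x2C60                          # assumed, taken from the trace
write_vector(hub, TABLE_BASE, 0x37, 0x0C4)   # opcode $37 -> cog address $0C4 (from the trace)
print(hex(vector_address(TABLE_BASE, 0x37)), hex(fetch_vector(hub, TABLE_BASE, 0x37)))
```

The trade-off is one extra hub access per bytecode in exchange for removing the decode code from the cog, which is where the spare code space in note 4 comes from.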

Enjoy

Cluso99 · 2008-09-07 00:53

For those interested, a quick update...

I have 3 versions of the Interpreter now running simultaneously (executing the same spin code)

Cog 5 runs the Rom Interpreter

Cog 6 runs the Ram Interpreter (basically the Rom Interpreter in Ram with a couple of mods)

Cog 7 runs the Cluso Interpreter (the faster modified version under test in Ram)

This allows me to compare the results and debug the ClusoInterpreter. Without the debugger, I will be able to check the speed differences between the Rom and Cluso versions.
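The comparison itself can be pictured as finding the first step at which two logged traces differ. The sketch below assumes each interpreter logs a (pc, sp) snapshot per bytecode; that log format and the function name `first_divergence` are assumptions for illustration, and the sample values are the pc/sp pairs from the debug trace posted later in this thread.

```python
from itertools import zip_longest


def first_divergence(reference, candidate):
    """Return (step, reference_entry, candidate_entry) at the first mismatch, or None.

    A trace that ends early shows up as a mismatch against None.
    """
    for step, (ref, cand) in enumerate(zip_longest(reference, candidate)):
        if ref != cand:
            return step, ref, cand
    return None


# Hypothetical (pc, sp) snapshots, one per bytecode fetch.
rom_trace = [(0x01E7, 0x346C), (0x01E9, 0x3470)]
cluso_trace = [(0x01E7, 0x346C), (0x01E9, 0x3470)]
broken_trace = [(0x01E7, 0x346C), (0x01E9, 0x3474)]

print(first_divergence(rom_trace, cluso_trace))   # traces agree
print(first_divergence(rom_trace, broken_trace))  # sp differs at step 1
```

Stopping at the first mismatch matters because everything after it is usually a knock-on effect of the same fault.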

Cluso99 · 2008-09-11 18:00

Below is a sample debug trace of both Spin and assembler from my RamInterpreter.

adr  conds        op       wc wz nr dst   src  : cz  dst_data  src_data  : new_dst
------------------------------------------------------------------------------------
$000 .  .  .  .   mov      .  .  .  $000,#$005 : --  $A0FC0005      $005 : $00000005
$001 .  .  .  .   mov      .  .  .  $001, $1F0 : --  $A0BC03F0 $08FFE4CC : $00003454
$002 .  .  .  .   add      .  .  .  $001,#$002 : --  $00003454      $002 : $00003456
$003 .  .  .  .   rdword   .  .  .  $1EB, $001 : --  $0000001D $00003456 : $00000010
$004 .  .  .  .   add      .  .  .  $003,#$100 : --  $04BFD601      $100 : $04BFD701
$005 .  .  .  .   add      .  .  .  $003,#$100 : --  $04BFD701      $100 : $04BFD801
$006 .  .  .  .   djnz     .  .  .  $000,#$002 : --  $00000005      $002 : $00000004
$002 .  .  .  .   add      .  .  .  $001,#$002 : --  $00003456      $002 : $00003458
$003 .  .  .  .   rdword   .  .  .  $1EC, $001 : --  $0000003C $00003458 : $00003320
$004 .  .  .  .   add      .  .  .  $003,#$100 : --  $04BFD801      $100 : $04BFD901
$005 .  .  .  .   add      .  .  .  $003,#$100 : --  $04BFD901      $100 : $04BFDA01
$006 .  .  .  .   djnz     .  .  .  $000,#$002 : --  $00000004      $002 : $00000003
$002 .  .  .  .   add      .  .  .  $001,#$002 : --  $00003458      $002 : $0000345A
$003 .  .  .  .   rdword   .  .  .  $1ED, $001 : --  $0000002B $0000345A : $00003450
$004 .  .  .  .   add      .  .  .  $003,#$100 : --  $04BFDA01      $100 : $04BFDB01
$005 .  .  .  .   add      .  .  .  $003,#$100 : --  $04BFDB01      $100 : $04BFDC01
$006 .  .  .  .   djnz     .  .  .  $000,#$002 : --  $00000003      $002 : $00000002
$002 .  .  .  .   add      .  .  .  $001,#$002 : --  $0000345A      $002 : $0000345C
$003 .  .  .  .   rdword   .  .  .  $1EE, $001 : --  $0000002B $0000345C : $000001E7
$004 .  .  .  .   add      .  .  .  $003,#$100 : --  $04BFDC01      $100 : $04BFDD01
$005 .  .  .  .   add      .  .  .  $003,#$100 : --  $04BFDD01      $100 : $04BFDE01
$006 .  .  .  .   djnz     .  .  .  $000,#$002 : --  $00000002      $002 : $00000001
$002 .  .  .  .   add      .  .  .  $001,#$002 : --  $0000345C      $002 : $0000345E
$003 .  .  .  .   rdword   .  .  .  $1EF, $001 : --  $000057CF $0000345E : $0000346C
$004 .  .  .  .   add      .  .  .  $003,#$100 : --  $04BFDE01      $100 : $04BFDF01
$005 .  .  .  .   add      .  .  .  $003,#$100 : --  $04BFDF01      $100 : $04BFE001
$006 .  .  .  .   djnz     .  .  .  $000,#$002 : --  $00000001      $002 : $00000000
$007 .  .  .  .   cogid    .  .  .  $1E9,#$001 : --  $0000001D      $001 : $00000007
Cog 7, Par 3454: 0010 3320 3450 01E7 346C
spin: pc(01E7) 37 00 65 38 sp(346C) 0000 0000 <0000> 0000 0000  (1,29)
$008 .  .  .  .   mov      .  .  .  $000,#$000 : --  $00000000      $000 :
$009 .  .  .  .   rdbyte   .  .  .  $005, $1EE : --  $80FC0700 $000001E7 : $00000037
$00A .  .  .  .   add      .  .  .  $1EE,#$001 : --  $000001E7      $001 : $000001E8
$00B .  .  .  .   mov      .  .  .  $002, $005 : --  $80FC0202 $00000037 : $00000037
$00C .  .  .  .   and      .  wz nr $005,#$001 : --  $00000037      $001 :
$00D .  .  .  .   and      wc .  nr $005,#$002 : --  $00000037      $002 :
$00E .  .  .  .   mov      .  .  .  $01C, $005 : C-  $00000000 $00000037 : $00000037
$00F .  .  .  .   shl      .  .  .  $01C,#$002 : C-  $00000037      $002 : $000000DC
$010 .  .  .  .   add      .  .  .  $01C, $01B : C-  $000000DC $00002C60 : $00002D3C
$011 .  .  .  .   rdlong   .  .  .  $01C, $01C : C-  $00002D3C $00002D3C : $000000C4
$012 .  .  .  .   jmpret   .  .  .  $1D1, $01C : C-  $5C7C0000 $000000C4 : $5C7C0013
$0C4 .  .  .  .   mov      .  .  .  $000, $005 : C-  $00000000 $00000037 : $00000037
$0C5 .  .  .  .   sub      .  .  .  $000,#$035 : C-  $00000037      $035 : $00000002
$0C6 if_nc_or_z   jmp      .  .  nr $000,#$146 : C-  $00000002      $146 :
$0C7 .  .  .  .   rdbyte   .  .  .  $001, $1EE : C-  $0000345E $000001E8 : $00000000
$0C8 .  .  .  .   add      .  .  .  $1EE,#$001 : C-  $000001E8      $001 : $000001E9
$0C9 .  .  .  .   rol      .  .  .  $000, $001 : C-  $00000002 $00000000 :
$0CA .  .  .  .   and      wc .  nr $001,#$020 : C-  $00000000      $020 :
$0CB if_c         sub      .  .  .  $000,#$001 : --  $00000002      $001 :
$0CC .  .  .  .   and      wc .  nr $001,#$040 : --  $00000000      $040 :
$0CD if_c         xor      .  .  .  $000, $1E5 : --  $00000002 $FFFFFFFF :
$0CE .  .  .  .   jmp      .  .  nr $000,#$146 : --  $00000002      $146 :
$146 .  .  .  .   wrlong   .  .  nr $000, $1EF : --  $00000002 $0000346C :
$147 .  .  .  .   add      .  .  .  $1EF,#$004 : --  $0000346C      $004 : $00003470
$148 .  .  .  .   and      wc .  nr $006,#$040 : --  $E4FC0002      $040 :
$149 .  .  .  .   jmp      .  .  nr $000,#$008 : --  $00000002      $008 :
spin: pc(01E9) 65 38 57 69 sp(3470) 0000 0000 <0000> 0000 0000  (2,55)
$008 .  .  .  .   mov      .  .  .  $000,#$000 : --  $00000002      $000 : $00000000

Chip's Interpreter executes 4114 PASM instructions for 100 Spin instructions. My RamInterpreter executes 3889 PASM instructions for the same code. The test code is not representative of typical programs, so I don't know how this is skewed.
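Putting those two counts side by side, as plain arithmetic on the figures quoted above (the variable names are just for the example):

```python
ROM_PASM = 4114   # PASM instructions, Chip's interpreter, 100 Spin bytecodes
RAM_PASM = 3889   # PASM instructions, RamInterpreter, same 100 bytecodes
BYTECODES = 100

rom_per_bytecode = ROM_PASM / BYTECODES          # average PASM per bytecode
ram_per_bytecode = RAM_PASM / BYTECODES
saved = ROM_PASM - RAM_PASM                      # instructions saved over the run
saving_percent = 100 * saved / ROM_PASM          # reduction relative to Chip's count

print(rom_per_bytecode, ram_per_bytecode, saved, round(saving_percent, 2))
```

That works out to roughly 41.1 versus 38.9 PASM instructions per bytecode, a saving of 225 instructions or about 5.5%. Instruction count is not the same as elapsed time, since hub instructions take longer than ordinary ones.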

Cluso99 · 2008-09-15 12:34

Posted is the latest version of the ClusoInterpreter (v260C_007F).

Testing so far indicates this is a working version. I am trying to get a set of tests to prove all bytecodes.

Once I tidy up the testing and debugging code I will post this (next few days).

Enjoy

Cluso99 · 2008-09-23 10:07

Here is a trace of the spin code which executes when a coginit is issued to start a spin program.

The "RUN" bytecode calls the ROM, which builds the stack with parameters for the Spin Interpreter, then the Spin bytecode is issued. I thought this was interesting.

jazzed · 2009-01-07 05:03

@Cluso99

Would you be kind enough to put together a TV_Text type demo that uses your interpreter and describe how to interact with the debugger to start, stop, break, inspect data, and continue a Spin program? If it doesn't work that way, perhaps you can explain how it does work?
Thanks.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
--Steve

Oldbitcollector (Jeff) · 2009-01-07 05:15

Second that! This looks like a piece of the grail from this direction.

OBC

Cluso99 · 2009-01-08 12:26

@jazzed & OBC:
I'll put together a set of instructions and how to compile it. I am on the other pc at the moment - just arrived back from xmas hols.

Basically, it uses PST to communicate the debug information. So, on starting, a list of commands is displayed on the PST screen. I must confess, it's a little complex and confusing because I haven't worked out a way to get the Interpreter to run (launch) in just one cog. It is necessary to go through convolutions to get the RamInterpreter running in the first place, so I have to restart the Prop to achieve this.

I have started a version that steps through the homespun listing as it executes, but haven't got very far. It is not possible (at least at the moment) to substitute instructions on the fly. It just steps/runs the Spin code as loaded. You can examine and change memory (hub and cog) on the fly.

This would be better in the thread:
PASM and SPIN debug with Zero Footprint: http://forums.parallax.com/showthread.php?p=748420

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Prop Tools under Development or Completed (Index)
http://forums.parallax.com/showthread.php?p=753439

Post Edited (Cluso99) : 1/8/2009 12:33:13 PM GMT

Cluso99 · 2010-02-04 12:04

jazzed said... (on this thread http://forums.parallax.com/showthread.php?p=878563)
I've been toying with 32 bit words for the Spin interpreter key variables (dcurr, pbase, vbase, etc..) and have something that works right so far within the HUB RAM. Another question beyond the subject is: can an interface to fast XMM memory be squeezed into the interpreter? TBD

1. Is it possible to just copy everything from HUB RAM to XMM and flip a switch?
2. Given that the ROM occupies $8000-$FFFF, and stack grows up, a big hack would be needed for stack.
3. Is it possible to just make the stack always start above $FFFF to push the stack over the "ROM Hole" code?
4. Maybe a big dummy array would serve to fix the "ROM Hole" instead of asking for a special compiler mode?

I understand that Homespun allows for images > 32K. Is there any limit at all in Homespun today that would prevent a "Spin32" from working?

I thought I would say what I have learnt about the Spin interpreter based on what I did.

Firstly, my code is probably easier to understand than Chip's because I have unravelled the code as I moved the decode table into hub, thereby freeing up cog space. How I went about doing this was to improve the decode by using a hub table. Then I took each bytecode block and unravelled it and sped it up. Now, when I did this I did it without understanding how objects really work within Spin.

@jazzed:
Changing to 32 bits for dcurr, pbase, vbase should be fairly simple as IIRC it is actually 32 bit within the cog now. However, when these are pushed and pulled from the stack, a long would need to be pushed & popped.

1. Yes, but only for 1 modified interpreter. This could be done with only the rdxxxx & wrxxxx instructions changed to access XMM. Only 1 interpreter could be run this way, or else contention between interpreters must be managed using a lock or something similar.
2. I don't agree. It just depends on where to place the stack to ensure it has enough space.
3. I believe so, but see my answer to MagIO below.
4. Yes.

Homespun - Michael will have to answer this.

MagIO said...
Currently you don't have real access to the stack because there is no support for changing stack-pointers. What SPIN is doing with it is its own secret. There is no push or pop. If you want to keep it as a secret, you can have completely different stack concepts. For example each SPIN-COG has its own stack starting at $0000_0000 and it can even grow into ROM address-space if these addresses are mapped into some real XMM memory.

There is actually a push and a pop, just that it is disguised somewhat.

However, the main problem I see with the stack being anywhere but hub is that of speed. If it was in SRAM (which can have similar access speeds with correct hardware, BUT NOT with multi-cog access), then the main problem is that of contention. In hub, that is taken care of by the hub controller. The stack is the most heavily used part of Spin's hub accesses. Relocating it would not only be a nightmare, but have a huge impact on performance.

Post Edited (Cluso99) : 2/4/2010 12:13:08 PM GMT
