Hacking Spin Interpreter Cog Ram

hippy · 2008-07-27 22:12

A bit technical this and perhaps not of much interest to anyone who isn't toying with hacking the Spin Interpreter and object code, but for those who are, this could be a useful discovery.

Apologies if anyone has worked this out and announced it already but if they have I've missed it !

There are three Spin bytecodes which have had me intrigued for a long while ...

3F 80+n     PUSH    spr
3F A0+n     POP     spr
3F C0+n     USING   spr

When n is $10 to $1F these allow Spin to access the Special Purpose Registers at $1F0 to $1FF ( INA, OUTA, DIRA etc ). When n is $00 to $0F they perform another function; for example, this pushes the current Cog ID to the stack ...

3F 89       PUSH    $09

I knew about the '$3F 9x' opcodes, but '$3F 8x' remained a mystery. Working on the latest bytecode disassembler I set about trying to fill in the gaps.

Looking at the Spin Interpreter source; what do we find at $1E9 ? "id", which is set early on in the interpreter initialisation and never otherwise used within the interpreter ( I had wondered why that was ).

Obvious with hindsight and when the light comes on; +n is +$00 to +$1F accessing Cog Ram $1E0 to $1FF which just happens to include the SPR. A typical Chip optimisation ( I recall I did say getting into Chip's head was a big key to cracking and understanding the interpreter ).

Now, what do we see after "id" ? The variables the interpreter itself uses; program counter, stack pointer, object base, variable base, etc.
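
The operand mapping described above can be sketched on the host side. This is a minimal Python sketch, not the interpreter's own code. The function name `decode_3f_operand` is hypothetical, and the register labels are limited to the three identified in this thread ( $1E9 as "id", $1EE as PC, $1EF as SP ); everything else is left unlabeled.

```python
# Sketch: decode the operand byte that follows a $3F bytecode.
# Assumption: the top three bits select the operation and the low five
# bits select a cog register counted from $1E0, as described in the post.

OPERATIONS = {0x80: "PUSH", 0xA0: "POP", 0xC0: "USING"}

# Only the registers named in this thread; the rest stay unlabeled.
KNOWN_REGISTERS = {0x1E9: "id", 0x1EE: "PC", 0x1EF: "SP"}


def decode_3f_operand(operand):
    """Return (operation, cog_address, label) for a $3F operand byte."""
    operation = OPERATIONS.get(operand & 0xE0)
    if operation is None:
        raise ValueError(f"not a register operand: ${operand:02X}")
    cog_address = 0x1E0 + (operand & 0x1F)
    return operation, cog_address, KNOWN_REGISTERS.get(cog_address, "")


if __name__ == "__main__":
    for byte in (0x89, 0x94, 0xCF):
        op, addr, label = decode_3f_operand(byte)
        print(f"3F {byte:02X}  {op:<5} ${addr:03X} {label}")
```

Operand bytes outside the three groups raise an error here rather than guessing at what the interpreter does with them.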

So a few things ...

1) For people writing their own Spin Interpreters, it's crucial that either Cog Ram at $1E9-$1EF is laid out exactly as in the ROM Interpreter, or the '$3F xx' opcodes are abstracted so they behave the same when those variables are placed elsewhere. That explains why attempts to move or remove any of those variables lead to a crash of a hub-loaded interpreter.

2) Reading Cog Ram from $1E0 to $1EF should be possible ( assuming we can put the required bytecode into the image ) so it's possible to determine current stack base, current stack pointer, current PC etc without jumping through convoluted hoops in Spin.

3) If we can read Cog Ram we can write it, which opens the door to some potentially clever tricks with altering stack and PC on the fly, all of which could be useful for DOL and other 'bootload object', and 'object overlay' hackery.

4) As we can write to $1E0 through $1EF, there's a good chance we could inject code into the interpreter and force it to be executed, for example by overwriting part of the 'range' routine. That would allow injected code to dump the entire interpreter from Cog Ram ( irrelevant now, and a catch 22 ! ), but it is an attack vector Chip will probably want to close for the Prop II.

5) More importantly with code injection, it should be possible to switch a running interpreter to an LMM interpreter and back again, or even switch it to running in-line PASM !

I haven't had a chance to try this yet but looking at the interpreter source it appears to hold up as a valid analysis.

hippy · 2008-07-28 00:21

Still not tested on real hardware but it's looking promising. This is genuine Spin bytecode. The POP MEM[][].LONG is writing 0 to LONG[SP-4]. Something else which had never made sense in previous disassemblies ...

====                          ; PRI test_Cog_Spin
====                          ;   CogInit(0, test_Cog_Spin, 1)

0629         37 02            PUSH     #8
062B         36               PUSH     #1
062C         15               MARK
062D         35               PUSH     #0
062E         3F 8F            PUSH     COG_SP.LONG
0630         37 61            PUSH     #$FFFFFFFC
0632         D1               POP      MEM[][].LONG
0633         2C               COGISUB

0629         37 02            PUSH     #8
062B         36               PUSH     #1
062C         15               MARK
062D         35 3F 8F 37      LET      MEM[COG_SP.LONG][$FFFFFFFC].LONG, #0
0631         61 D1
0633         2C               COGISUB
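
The two `37 xx` constants in the listing are packed values rather than literals. The Python sketch below is an assumption about that packing, not something stated in the thread: the low five bits rotate the value 2 left, bit 5 subtracts one, and bit 6 inverts the result. It does reproduce both constants shown ( #8 and #$FFFFFFFC ). The function names are hypothetical.

```python
# Sketch: unpack the operand byte of a '37 xx' packed constant.
# Assumed encoding (not confirmed by the thread):
#   bits 4..0  rotate count applied to the value 2
#   bit 5      subtract 1 from the result
#   bit 6      invert all 32 bits

MASK32 = 0xFFFFFFFF


def rotate_left32(value, count):
    """Rotate a 32-bit value left by count bits."""
    count &= 31
    return ((value << count) | (value >> (32 - count))) & MASK32


def unpack_constant(operand):
    """Return the 32-bit constant encoded by a '37 xx' operand byte."""
    value = rotate_left32(2, operand & 0x1F)
    if operand & 0x20:
        value = (value - 1) & MASK32
    if operand & 0x40:
        value ^= MASK32
    return value


if __name__ == "__main__":
    for byte in (0x02, 0x61):
        print(f"37 {byte:02X}  PUSH #${unpack_constant(byte):08X}")
```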

Mike Huselton · 2008-07-28 00:41

Good Sleuthing! Hippy, you always awe me, as well as a handful of others on this forum...

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
JMH

hippy · 2008-07-28 03:30

Whoot !

Tested on real hardware - Dynamically changing the stack pointer on the fly. This is just a raw proof of concept hack and it would be much better in an object abstracted with PUB IncStack(n), PUB DecStack(n) methods.

CON

  _CLKMODE      = XTAL1 + PLL16x
  _XINFREQ      = 7_372_800

OBJ
  tv : "TV_Text"

DAT

' ***** WARNING : DO NOT EDIT ANYTHING IN THE FOLLOWING PUB MAIN !

PUB Main | oBase, pInit, pcInc, pcDec

  tv.Start( 12 )        ' Start the TV Display

  ' Determine where the bytes are in the bytecode which we need to
  ' change to turn OUTB into SP

  oBase := word[ $0006 ]
  pInit := word[ oBase+4 ] + oBase
  pcInc := pInit + $2A
  pcDec := pInit + $3A

  ShowStackTop          ' Show where stack top is
  
  byte[ pcInc ] := $CF  ' Convert OUTB in following assignment to SP

  OUTB += $1500-4       ' SP += $1500
              
  ShowStackTop          ' Show where stack top is now

  byte[ pcDec ] := $CF  ' Convert OUTB in following assignment to SP

  OUTB -= $1500+4       ' SP -= $1500

  ShowStackTop          ' Show where stack top is now
 
PRI ShowStackTop
  tv.Hex( @result,8 )   ' Show where result is now on stack
  tv.Out( $0D)
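
The address arithmetic at the top of Main can be mirrored on a PC against a hub image. This is a minimal Python sketch under the same assumptions the Spin code makes: the word at $0006 is the object base, and the word at oBase+4 is the offset of the first method. It also assumes the image is laid out as it sits in hub RAM from address $0000. The function names and the synthetic demo image are hypothetical.

```python
# Sketch: locate the first method's bytecode in a hub image, mirroring
#   oBase := word[ $0006 ]
#   pInit := word[ oBase+4 ] + oBase
import struct


def read_word(image, address):
    """Read a little-endian 16-bit word from the image."""
    (value,) = struct.unpack_from("<H", image, address)
    return value


def first_method_address(image):
    """Return the hub address where the first method's bytecode starts."""
    object_base = read_word(image, 0x0006)
    return object_base + read_word(image, object_base + 4)


def patch_addresses(image, offsets):
    """Turn offsets from the start of the first method into hub addresses."""
    base = first_method_address(image)
    return [base + offset for offset in offsets]


def build_demo_image():
    """Build a tiny synthetic image (not real compiler output)."""
    image = bytearray(0x80)
    struct.pack_into("<H", image, 0x0006, 0x0010)  # object base
    struct.pack_into("<H", image, 0x0014, 0x000C)  # first method offset
    return image


if __name__ == "__main__":
    demo = build_demo_image()
    for address in patch_addresses(demo, (0x2A, 0x3A)):
        print(f"patch at ${address:04X}")
```

The offsets $2A and $3A are the ones used in the program above; they are only valid for that exact source, which is why Main must not be edited.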

hippy · 2008-07-28 03:55

Double Whoop !

Dynamically changing PC ...

CON

  _CLKMODE      = XTAL1 + PLL16x
  _XINFREQ      = 7_372_800

OBJ
  tv : "TV_Text"

DAT

' ***** WARNING : DO NOT EDIT ANYTHING IN THE FOLLOWING PUB MAIN !

PUB Main | oBase, pInit, pLoop, pPoke

  tv.Start( 12 )        ' Start the TV Display

  oBase := word[ $0006 ]
  pInit := word[ oBase+4 ] + oBase
  pLoop := $36
  pPoke := pInit + $3C

  { pLoop Here }

  tv.Hex( result++, 8 )
  tv.Out( $0D)
  
  WaitCnt( CLKFREQ / 1000 * 500 + CNT )
      
  byte[ pPoke ] := $AE  ' Convert OUTB in following assignment to PC

  OUTB := pLoop         ' PC := pLoop
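
The magic bytes poked in these two programs ( $CF and $AE ) follow from the 80+n / A0+n / C0+n scheme in the first post. A minimal Python sketch of the reverse mapping is below. The function name is hypothetical, and the pairing of USING with `+=` / `-=` and POP with `:=` is only what the patches in this thread imply.

```python
# Sketch: build the operand byte to poke after a $3F bytecode so that it
# targets a chosen cog register in $1E0..$1FF.

OPERATION_BASE = {"PUSH": 0x80, "POP": 0xA0, "USING": 0xC0}


def encode_3f_operand(operation, cog_address):
    """Return the operand byte for the given operation and cog address."""
    if not 0x1E0 <= cog_address <= 0x1FF:
        raise ValueError(f"cog address out of range: ${cog_address:03X}")
    return OPERATION_BASE[operation] + (cog_address - 0x1E0)


if __name__ == "__main__":
    print(f"SP += ...  needs ${encode_3f_operand('USING', 0x1EF):02X}")
    print(f"PC := ...  needs ${encode_3f_operand('POP', 0x1EE):02X}")
```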

Cluso99 · 2008-07-28 07:02

@Hippy - great work!!

I have a 6 long debugger which allows me full access to the spin code in cog. I have saved 6 longs in the code (more up to about $20 but not yet consolidated, and is much faster) which allows me to run an LMM debugger. This is the debugger I spoke about.

I have simplified the codes $00-3F and am reasonably sure they work (not totally validated like I am doing with the $E0-$FF bytecodes)

Last night I set about a routine to validate the math codes $E0-$EF but my laptop didn't see action and went to sleep (shutdown) so I didn't validate all parameters. I am comparing to the spin code for verification of all possible x and y values.

I've left the codes at $1E3 or was it $1E5 (hate 2 laptops - it's always on the other) in the same location.

I only now have to understand the calling of mathop from mrop. Then I can place it in the new RamInterpreter.

During my validation I discovered that the spin interpreter has a couple of quirks with the repeat loop

  repeat 0 to $FF   only uses a byte and therefore left of the displayed long (endian issue)

  repeat $0 to $FFFF_FFFF stops at $8000_0000 because of the sign

so I have to use 2 repeats

  repeat $0 to $7FFF_FFFF
    ...
  repeat $8000_0000 to $FFFF_FFFF
    ...
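
One way to see why the range has to be split: Spin longs are signed 32-bit values, so $8000_0000 and above are negative. The Python sketch below ( helper names hypothetical ) only reinterprets the loop limits as signed values; it does not model the interpreter's loop code.

```python
# Sketch: view 32-bit loop limits the way a signed comparison sees them.


def to_signed32(value):
    """Reinterpret an unsigned 32-bit value as a signed 32-bit value."""
    value &= 0xFFFFFFFF
    return value - 0x1_0000_0000 if value & 0x8000_0000 else value


def is_ascending(start, end):
    """True when start <= end under signed 32-bit comparison."""
    return to_signed32(start) <= to_signed32(end)


if __name__ == "__main__":
    ranges = (
        (0x0000_0000, 0xFFFF_FFFF),
        (0x0000_0000, 0x7FFF_FFFF),
        (0x8000_0000, 0xFFFF_FFFF),
    )
    for start, end in ranges:
        print(f"${start:08X} to ${end:08X} ascending: {is_ascending(start, end)}")
```

Each half of the split ascends when viewed as signed, while the full range does not.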


hippy · 2008-07-28 14:08

Aroogah!

Sucking ROM Interpreter code out of the Cog's RAM bypassing all encryption mechanisms.

I'll be the first to put my hands up to acknowledge this wouldn't have been easy without knowing the interpreter code to start with but it is possible without knowing it. Whether it would have been worthwhile to try this approach which would have been 'stabbing in the dark' is another question.

CON

  _CLKMODE      = XTAL1 + PLL16x
  _XINFREQ      = 7_372_800

OBJ

  tv : "TV_Text"

PUB Main

  FixPeekCog
  ShowInterpreter

PRI PeekCog( peekAdr ) | safe1, safe2

  result := $083C01E3 + ( peekAdr << 9 )
  
  safe1 := OUTA
  safe2 := OUTB
  
  OUTA := result    ' $1E2 : wrlong $xxx,$1E3
  OUTB := $7FFC     ' $1E3

  result := LookUp( 1 : 1..3 ) ' Run code at $1E2
  
  OUTA := safe1
  OUTB := safe2
  
  result := long[ $7FFC ]

PRI FixPeekCog

  byte[$3B] := $82              ' := [$1E2]
  byte[$3E] := $83              ' := [$1E3]

  byte[$42] := $A2              ' [$1E2] :=
  byte[$47] := $A3              ' [$1E3] :=

  byte[$54] := $A2              ' [$1E2] :=
  byte[$57] := $A3              ' [$1E3] :=
  
PRI ShowInterpreter | pc

  tv.Start( 12 )

  repeat pc from $000 to $00F
    tv.Hex( pc, 3 )
    tv.Out( " " )
    tv.Out( ":" )
    tv.Out( " " )
    result := PeekCog(pc)
    tv.Hex( result, 2 )
    tv.Out( " " )
    tv.Hex( result >> 8, 2 )
    tv.Out( " " )
    tv.Hex( result >> 16, 2 )
    tv.Out( " " )
    tv.Hex( result >> 24, 2 )
    tv.Out( $0D )

Adr  OK   Read            Expected        Should be ...

000       01 00 00 00     05 00 FC A0     A7          MOV      V1, #$05
001       01 00 00 00     F0 03 BC A0                 MOV      V2, PAR
002       03 00 00 00     02 02 FC 80     A8          ADD      V2, #$02
003       01 00 00 00     01 D6 BF 04                 RDWORD   $1EB, V2
004       01 00 00 00     00 07 FC 80                 ADD      V4, #$100
005       22 00 00 00     00 07 FC 80                 ADD      V4, #$100
006       03 00 00 00     02 00 FC E4                 DJNZ     V1, #A8
007       74 0C 00 00     01 D2 FF 0C                 COGID    V5

008  YES  00 00 FC A0     00 00 FC A0     A9          MOV      V1, #$00
009  YES  EE 0B BC 00     EE 0B BC 00                 RDBYTE   V6, V7
00A  YES  01 DC FF 80     01 DC FF 80                 ADD      V7, #$01
00B  YES  40 0A 7C 85     40 0A 7C 85                 CMP      V6, #$40 WC
00C  YES  EE 00 4C 5C     EE 00 4C 5C         IF_NC   JMP      #A12
00D  YES  05 06 BC A0     05 06 BC A0                 MOV      V4, V6
00E  YES  04 06 FC 20     04 06 FC 20                 ROR      V4, #$04
00F  YES  1A 06 FC 80     1A 06 FC 80                 ADD      V4, #$1A

The trick is to subvert the interpreter code which is used to do range checking, then force that code to execute ( LookUp in this case ), then restore it so other bytecode using that still works.
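
The constant `$083C01E3 + ( peekAdr << 9 )` in PeekCog is a PASM instruction word built by hand. A minimal Python sketch of the same construction, plus a field decoder, is below. It assumes the usual Propeller 1 instruction layout ( 6-bit opcode, 4 ZCRI bits, 4 condition bits, 9-bit destination, 9-bit source ); the function names are hypothetical. The table above lists bytes low byte first, so the sketch also converts that form.

```python
# Sketch: build and pick apart the instruction word used by PeekCog.


def make_wrlong(peek_address):
    """Return 'wrlong peek_address, $1E3' as a 32-bit instruction word."""
    return (0x083C01E3 + (peek_address << 9)) & 0xFFFFFFFF


def decode_fields(instruction):
    """Split an instruction word into its bit fields."""
    return {
        "opcode": instruction >> 26,
        "zcri": (instruction >> 22) & 0xF,
        "cond": (instruction >> 18) & 0xF,
        "dest": (instruction >> 9) & 0x1FF,
        "src": instruction & 0x1FF,
    }


def from_table_bytes(text):
    """Convert a 'Read' / 'Expected' table entry (low byte first) to a word."""
    return int.from_bytes(bytes.fromhex(text), "little")


if __name__ == "__main__":
    word = make_wrlong(0x00F)
    print(f"${word:08X}", decode_fields(word))
    word = from_table_bytes("05 00 FC A0")
    print(f"${word:08X}", decode_fields(word))
```

Adding `peekAdr << 9` only changes the destination field, which is why one template instruction can read any cog address.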

This is all rather academic given that we have the Interpreter source code and can use a RAM Interpreter such as the one Cluso99 is working on. This trick won't work on the Prop II if Chip adds code to prevent it, and I expect he will, although I hope he leaves the PC, SP etc data accessible, as that cannot be used to suck out the interpreter code.

It's been good fun though and I'm easily amused by sticking crowbars in cracks and seeing what happens

Cluso99 · 2008-08-12 05:43

Just looking at the code in detail. The first 8 longs in the cog are used as variables x,y,a,t1,t2,op,op2,adr after the initial execution of the code which is initialising the pbase..dcurr fields. That is why the first 8 longs don't compare. (Just for others looking as I am sure Hippy knows this)

Cluso99 · 2008-08-12 11:02

Here is a version of Hippy's hack that uses FullDuplexSerial instead of TV. Just use PST - there is a 5 second delay before output.

Thanks Hippy - I learnt a few more things getting this to work. It has given me a few more ideas for debugging my Interpreter.

Sleazy - G · 2008-08-21 00:55

Wow, I don't know if Chip is either flattered or wants to cut you to pieces, but bravo on the hax. You are 1337 H4X0R. All your base belong to us!

On a related note, Can you remember that long dual-port ram discussion we tagged on a few months ago I believe? I just watched a few parts of the 7 part Chip Gracey interview on youtube. It sounds like he's quad-porting the ram, or so he says. Pipeline.

hippy · 2008-08-21 01:31

Yes, I remember that discussion. I haven't had a chance to sit down and watch the Chip videos yet but intend to. I think we're all going to be quite impressed when Prop II does arrive.

It's quite a pleasure to see Parallax so laid back about the digging myself and others have done. I'm sure Chip is quite fascinated watching what people do pull out the bag - a bit like a proud parent pushing a child out into the world and finding out what it can do they never expected !

I expect Chip and others ( myself included ) would be rather upset if anyone set about deliberately undermining the Propeller or Parallax but it seems to me he's happy to see anything which isn't damaging as positive. I don't expect a cease and desist notice any time soon

QuattroRS4 · 2008-08-21 09:12

Hippy - a 'cease and desist' notice would probably push this underground and make you and others push that little bit harder because you know that you have 'touched a nerve' .. Yip - A 'Cease and desist' would make you (edit: -more) dangerous ! Lol

Regards,
John Twomey

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
'Necessity is the mother of invention'

Paul Baker · 2008-08-21 23:15

Hey John, just saw your page in the '09 catalog, very nice.

Sorry for the OT comment.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Paul Baker
Propeller Applications Engineer

Parallax, Inc.

QuattroRS4 · 2008-08-22 02:47

Cheers Paul..

Regards,
John

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
'Necessity is the mother of invention'

Cluso99 · 2008-09-17 06:20

Has anyone found out what the compiler directive is to use cog memory (registers) $1E0-$1EF versus $1F0-$1FF?

I want to use spin to output these bytecodes...

3F 80+n     PUSH    spr
3F A0+n     POP     spr
3F C0+n     USING   spr

I know I can fudge it as Hippy has done (above).

hippy · 2008-09-17 13:30

There is none I know of as there's no reason Spin would be designed to allow that, it's just good fortune it worked. The only way I know of is to patch the bytecode at run-time or poke data into the .binary/.eeprom file after compilation.
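
Poking the file after compilation could look something like the Python sketch below. It is a minimal sketch with a hypothetical function name: it changes one byte at a known offset and refuses to do so unless the byte currently there is the one expected. By the mapping in the first post, $D5 is USING OUTB ( $C0 + $15 ) and $CF is USING $1EF. Note that a loader may verify an image checksum, which this sketch does not recompute.

```python
# Sketch: patch one operand byte in a compiled image, with a sanity check.


def patch_byte(image, address, expected, replacement):
    """Return a copy of image with image[address] changed to replacement.

    Raises ValueError if the byte at address is not the expected one,
    which guards against patching the wrong build.
    """
    if image[address] != expected:
        raise ValueError(
            f"at ${address:04X}: found ${image[address]:02X}, "
            f"expected ${expected:02X}"
        )
    patched = bytearray(image)
    patched[address] = replacement
    return bytes(patched)


if __name__ == "__main__":
    # Synthetic fragment, not real compiler output: '3F D5' at offset 1.
    fragment = bytes([0x35, 0x3F, 0xD5, 0x4C])
    print(patch_byte(fragment, 2, 0xD5, 0xCF).hex(" "))
```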

Cluso99 · 2008-09-17 14:58

Hippy - thought that may be the case, so am coding using a similar method to what you have done. Looking for the bytecode and modifying it on the fly, so to speak.

BradC · 2008-09-17 15:17

Cluso99 said...
Hippy - thought that may be the case, so am coding using a similar method to what you have done. Looking for the bytecode and modifying it on the fly, so to speak.

That is a case that would be very easy to add to a compiler as an additional instruction.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Pull my finger!

Cluso99 · 2008-09-17 16:08

@BradC: It might be best to only enable by an option because of the consequences of using it. Another feature that would be nice is being able to compile pasm into $1F0-1FF. PropTool blocks it, but it would be nice to just be a warning. Maybe both these could generate warnings???

I am sorry that I haven't had time to try the compiler yet. I am trying to reduce the complexity of the spin debugger, so it is simple for others. Otherwise it's working well - I can switch from spin down to pasm and back on the fly, so I can just trace a particular bytecode. It also displays instruction counts (spin and pasm).

BradC · 2008-09-17 16:36

Cluso99 said...
Another feature that would be nice is being able to compile pasm into $1F0-1FF. PropTool blocks it,

Really?
Oh, I see.. it compiles code just fine, it won't let you jump to a label after $1F0 but you can jump to an immediate value just fine.

     org $1F0
aa  mov outa,ina
     mov dira,outa
     jmp #aa     ' <-- fails
     jmp #$1F0   ' <-- works

Yeah, that'll be easy to fix in a new compiler.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Pull my finger!

hippy · 2008-09-17 16:54

Don't forget that anything compiled to $1F0-$1FF won't actually get loaded there ! The CogInit zeroes the SPR.

A RAM Spin Interpreter could be made to do that using the same zero-footprint debug trick as in Spin. It would run into problems though if two Cogs launched the same code at the same time as what's in hub would need to be replaced by a micro-bootloader and restored in the Cog later.

BradC · 2008-09-17 17:50

Cluso99 said...
@BradC: It might be best to only enable by an option because of the consequences of using it.

I was thinking

X1 := REG[$1EF]

Where REG[XX] works precisely as any other register operation as long as the XX is between $1E0 and $1FF
You could even do REG[$1ED][3..6]~~

It would be incredibly easy to implement and allow you to access anything from $1E0-$1FF with all the features of a normal register access.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Pull my finger!

Cluso99 · 2008-09-17 18:17

BradC said...
I was thinking

X1 := REG[$1EF]

Where REG[XX] works precisely as any other register operation as long as the XX is between $1E0 and $1FF
You could even do REG[$1ED][3..6]~~

It would be incredibly easy to implement and allow you to access anything from $1E0-$1FF with all the features of a normal register access.

Your solution would be great

The same applies to REG[$1EF] := X1

In the Spin Interpreter, it is curious that only the $3F bytecode allows access to the registers (cog memory) from $1E0-$1FF whereas the SPR[reg] only allows access to $1F0-$1FF (because the bytecodes $24-$26 OR the register with $10 and subsequently OR with $1E0 in the $3F section).
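
The difference between the two paths can be sketched in a few lines of Python. The function names are hypothetical, and masking the index to four bits before the ORs is an assumption made here so the sketch is well defined; the post only mentions the OR with $10 and the OR with $1E0.

```python
# Sketch: cog address reached via SPR[index] versus a direct $3F operand.


def spr_cog_address(index):
    """Address reached through SPR[index] (bytecodes $24-$26).

    Assumption: the index is masked to four bits before the ORs.
    """
    return 0x1E0 | 0x10 | (index & 0x0F)


def direct_cog_address(operand):
    """Address reached by a '$3F xx' operand, using its low five bits."""
    return 0x1E0 | (operand & 0x1F)


if __name__ == "__main__":
    for n in (0x09, 0x0F, 0x15):
        print(
            f"n=${n:02X}  SPR[] -> ${spr_cog_address(n):03X}"
            f"  3F operand -> ${direct_cog_address(n):03X}"
        )
```

With the $10 bit forced on, SPR[] can never land below $1F0, which matches the behaviour described above.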

BradC · 2008-09-17 19:15

Cluso99 said...

In the Spin Interpreter, it is curious that only the $3F bytecode allows access to the registers (cog memory) from $1E0-$1FF whereas the SPR[reg] only allows access to $1F0-$1FF (because the bytecodes $24-$26 OR the register with $10 and subsequently OR with $1E0 in the $3F section).

Curious ? Yes and no.. more no than yes..

If you look at it, the compiler is specifically geared not to allow you access to $1E0 -> $1EF, although it does itself for COGID. So, logically it needs the ability to do that, but it does not want you to do that. Now look at SPR[] which allows you to poke any value in there you want.. it needs to range check that in the interpreter to ensure you don't go where it does not want you to go.

Makes sense to me anyway.
Having the compiler source code though, means you can add anything you like to exploit the entire feature set of the available code.
I'm looking at ways of using PCURR to allow easy runtime patching of the Spin Bytecode..

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Pull my finger!

Cluso99 · 2008-09-18 02:22

@BradC: I am playing with modifying the hub bytecode on the fly. I need this to test all bytecodes in the Interpreter and I don't have a spin program to do this. I want to verify all options so that I can be sure the Interpreter is the same in all ways unless specifically stated.
