Hacking Spin Interpreter Cog Ram — Parallax Forums

Hacking Spin Interpreter Cog Ram

hippy Posts: 1,981
edited 2008-09-18 02:22 in Propeller 1
A bit technical this and perhaps not of much interest to anyone who isn't toying with hacking the Spin Interpreter and object code, but for those who are, this could be a useful discovery.

Apologies if anyone has worked this out and announced it already but if they have I've missed it !

There are three Spin bytecodes which have had me intrigued for a long while ...

3F 80+n     PUSH    spr
3F A0+n     POP     spr
3F C0+n     USING   spr





When n is $10 to $1F these allow Spin to access the Special Purpose Registers at $1F0 to $1FF ( INA, OUTA, DIRA etc ). When n is $00 to $0F they perform another function; for example, this pushes the current Cog ID to the stack ...

3F 89       PUSH    $09





I knew about the '$3F 9x' opcodes, but '$3F 8x' remained a mystery. Working on the latest bytecode disassembler I set about trying to fill in the gaps.

Looking at the Spin Interpreter source; what do we find at $1E9 ? "id", which is set early on in the interpreter initialisation and never otherwise used within the interpreter ( I had wondered why that was ).

Obvious with hindsight and when the light comes on; +n is +$00 to +$1F accessing Cog Ram $1E0 to $1FF which just happens to include the SPR. A typical Chip optimisation ( I recall I did say getting into Chip's head was a big key to cracking and understanding the interpreter ).
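
A minimal Python sketch of that decode, for anyone who wants to check operand bytes by hand. It only models what is described above: the top three bits of the byte after $3F pick PUSH / POP / USING and the low five bits are +n from $1E0. The function and constant names are made up for illustration, and other operand values are simply rejected because the post does not cover them.

```python
# Sketch only: models the '$3F xx' operand decode described in the post.
COG_WINDOW_BASE = 0x1E0   # +n with n = $00..$1F reaches $1E0..$1FF
SPR_BASE = 0x1F0          # Special Purpose Registers start here

# Top three bits of the operand select the operation (from the table above).
OPERATIONS = {0x80: "PUSH", 0xA0: "POP", 0xC0: "USING"}


def decode_3f_operand(operand):
    """Return (operation, cog_address, is_spr) for the byte following $3F."""
    operation = OPERATIONS.get(operand & 0xE0)
    if operation is None:
        raise ValueError(f"operand ${operand:02X} is not covered by the post")
    cog_address = COG_WINDOW_BASE + (operand & 0x1F)
    return operation, cog_address, cog_address >= SPR_BASE


if __name__ == "__main__":
    for operand in (0x89, 0x94, 0xA2):
        op, addr, is_spr = decode_3f_operand(operand)
        where = "SPR" if is_spr else "interpreter RAM"
        print(f"3F {operand:02X}  {op:<5} ${addr:03X}  {where}")
```

With this, '3F 89' comes out as a PUSH of $1E9, which is where "id" lives.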

Now, what do we see after "id" ? The variables the interpreter itself uses; program counter, stack pointer, object base, variable base, etc.

So a few things ...

1) For people writing their own Spin Interpreters, it's crucial that either Cog Ram at $1E9-$1EF is used exactly as in the ROM Interpreter, or the '$3F xx' opcodes are abstracted so they behave the same when those variables are placed elsewhere. That explains why attempts to move or remove any of those variables lead to a crash of a hub-loaded interpreter.

2) Reading Cog Ram from $1E0 to $1EF should be possible ( assuming we can put the required bytecode into the image ) so it's possible to determine current stack base, current stack pointer, current PC etc without jumping through convoluted hoops in Spin.

3) If we can read Cog Ram we can write it, which opens the door to some potentially clever tricks with altering stack and PC on the fly, all of which could be useful for DOL and other 'bootload object', and 'object overlay' hackery.

4) As we can write to $1E0 through $1EF, there's a good chance we could inject code into the interpreter which we can force to be executed; we can overwrite part of the 'range' routine. That would allow injected code to dump the entire interpreter from Cog Ram ( irrelevant now, and a catch 22 ! ), but it is an attack vector Chip will probably want to close for the Prop II.

5) More importantly with code injection, it should be possible to switch a running interpreter to an LMM interpreter and back again, or even switch it to running in-line PASM !

I haven't had a chance to try this yet but looking at the interpreter source it appears to hold up as a valid analysis.

Comments

  • hippy Posts: 1,981
    edited 2008-07-28 00:21
    Still not tested on real hardware but it's looking promising. This is genuine Spin bytecode. The POP MEM[][].LONG is writing 0 to LONG[SP-4]. Something else which had never made sense in previous disassemblies ...

    ====                          ; PRI test_Cog_Spin
    ====                          ;   CogInit(0, test_Cog_Spin, 1)
    
    0629         37 02            PUSH     #8
    062B         36               PUSH     #1
    062C         15               MARK
    062D         35               PUSH     #0
    062E         3F 8F            PUSH     COG_SP.LONG
    0630         37 61            PUSH     #$FFFFFFFC
    0632         D1               POP      MEM[][].LONG
    0633         2C               COGISUB
    
    
    


    0629         37 02            PUSH     #8
    062B         36               PUSH     #1
    062C         15               MARK
    062D         35 3F 8F 37      LET      MEM[COG_SP.LONG][$FFFFFFFC].LONG, #0
    0631         61 D1
    0633         2C               COGISUB
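
The two '37 xx' lines in the listing ( '37 02' as #8 and '37 61' as #$FFFFFFFC ) can be reproduced with a short Python sketch. The rule used here is an assumption, not something stated in the thread: the low five bits rotate the value 2 left, bit 5 then decrements, and bit 6 then inverts. It has only been checked against the two operands shown above, and the function names are illustrative.

```python
# Sketch only: an assumed decode for the '$37 xx' packed constant.
# Verified against just two data points from the disassembly above.
MASK32 = 0xFFFFFFFF


def rol32(value, count):
    """Rotate a 32-bit value left by count bits."""
    count &= 31
    return ((value << count) | (value >> (32 - count))) & MASK32


def decode_mask_constant(operand):
    """Return the 32-bit constant for the byte following $37 (assumed rule)."""
    value = rol32(2, operand & 0x1F)   # start from 2 rotated left
    if operand & 0x20:                 # assumed 'decrement' flag
        value = (value - 1) & MASK32
    if operand & 0x40:                 # assumed 'invert' flag
        value ^= MASK32
    return value


if __name__ == "__main__":
    for operand in (0x02, 0x61):
        print(f"37 {operand:02X}  PUSH  #${decode_mask_constant(operand):X}")
```

If the rule is right, $FFFFFFFC is simply -4 as a long, which fits the LONG[SP-4] reading of the write.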
    
    
    
  • Mike Huselton Posts: 746
    edited 2008-07-28 00:41
    Good Sleuthing! Hippy, you always awe me, as well as a handful of others on this forum...

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    JMH
  • hippy Posts: 1,981
    edited 2008-07-28 03:30
    Whoot !

    Tested on real hardware - Dynamically changing the stack pointer on the fly. This is just a raw proof of concept hack and it would be much better in an object abstracted with PUB IncStack(n), PUB DecStack(n) methods.

    CON
    
      _CLKMODE      = XTAL1 + PLL16x
      _XINFREQ      = 7_372_800
    
    OBJ
      tv : "TV_Text"
    
    DAT
    
    ' ***** WARNING : DO NOT EDIT ANYTHING IN THE FOLLOWING PUB MAIN !
    
    PUB Main | oBase, pInit, pcInc, pcDec
    
      tv.Start( 12 )        ' Start the TV Display
    
      ' Determine where the bytes are in the bytecode which we need to
      ' change to turn OUTB into SP
       
      oBase := word[ $0006 ]
      pInit := word[ oBase+4 ] + oBase
      pcInc := pInit + $2A
      pcDec := pInit + $3A
    
      ShowStackTop          ' Show where stack top is
      
      byte[ pcInc ] := $CF  ' Convert OUTB in following assignment to SP
    
      OUTB += $1500-4       ' SP += $1500
                  
      ShowStackTop          ' Show where stack top is now
    
      byte[ pcDec ] := $CF  ' Convert OUTB in following assignment to SP
    
      OUTB -= $1500+4       ' SP -= $1500
    
      ShowStackTop          ' Show where stack top is now
     
    PRI ShowStackTop
      tv.Hex( @result,8 )   ' Show where result is now on stack
      tv.Out( $0D)
    
    
    
  • hippy Posts: 1,981
    edited 2008-07-28 03:55
    Double Whoop !

    Dynamically changing PC ...

    CON
    
      _CLKMODE      = XTAL1 + PLL16x
      _XINFREQ      = 7_372_800
    
    OBJ
      tv : "TV_Text"
    
    DAT
    
    ' ***** WARNING : DO NOT EDIT ANYTHING IN THE FOLLOWING PUB MAIN !
    
    PUB Main | oBase, pInit, pLoop, pPoke
    
      tv.Start( 12 )        ' Start the TV Display
    
      oBase := word[ $0006 ]
      pInit := word[ oBase+4 ] + oBase
      pLoop := $36
      pPoke := pInit + $3C
    
      { pLoop Here }
    
      tv.Hex( result++, 8 )
      tv.Out( $0D)
      
      WaitCnt( CLKFREQ / 1000 * 500 + CNT )
          
      byte[ pPoke ] := $AE  ' Convert OUTB in following assignment to PC
    
      OUTB := pLoop         ' PC := pLoop
    
    
    
  • Cluso99 Posts: 18,069
    edited 2008-07-28 07:02
    @Hippy - great work!!

    I have a 6 long debugger which allows me full access to the spin code in cog. I have saved 6 longs in the code (more up to about $20 but not yet consolidated, and is much faster) which allows me to run an LMM debugger. This is the debugger I spoke about.

    I have simplified the codes $00-3F and am reasonably sure they work (not totally validated like I am doing with the $E0-$FF bytecodes)

    Last night I set about a routine to validate the math codes $E0-$EF but my laptop didn't see action and went to sleep (shutdown) so I didn't validate all parameters. I am comparing to the spin code for verification of all possible x and y values.

    I've left the codes at $1E3 or was it $1E5 (hate 2 laptops - it's always on the other) in the same location.

    I only now have to understand the calling of mathop from mrop. Then I can place it in the new RamInterpreter.

    During my validation I discovered that the spin interpreter has a couple of quirks with the repeat loop

      repeat 0 to $FF    only uses a byte and therefore left of the displayed long (endian issue)

      repeat $0 to $FFFF_FFFF stops at $8000_0000 because of the sign

    so I have to use 2 repeats

      repeat $0 to $7FFF_FFFF
        ...
      repeat $8000_0000 to $FFFF_FFFF
        ...
  • hippy Posts: 1,981
    edited 2008-07-28 14:08
    Aroogah!

    Sucking ROM Interpreter code out of the Cog's RAM bypassing all encryption mechanisms.

    I'll be the first to put my hands up to acknowledge this wouldn't have been easy without knowing the interpreter code to start with but it is possible without knowing it. Whether it would have been worthwhile to try this approach which would have been 'stabbing in the dark' is another question.

    CON
    
      _CLKMODE      = XTAL1 + PLL16x
      _XINFREQ      = 7_372_800
    
    OBJ
    
      tv : "TV_Text"
    
    PUB Main
    
      FixPeekCog
      ShowInterpreter
    
    PRI PeekCog( peekAdr ) | safe1, safe2
    
      result := $083C01E3 + ( peekAdr << 9 )
      
      safe1 := OUTA
      safe2 := OUTB
      
      OUTA := result    ' $1E2 : wrlong $xxx,$1E3
      OUTB := $7FFC     ' $1E3
    
      result := LookUp( 1 : 1..3 ) ' Run code at $1E2
      
      OUTA := safe1
      OUTB := safe2
      
      result := long[ $7FFC ]
    
    PRI FixPeekCog
    
      byte[$3B] := $82              ' := [$1E2]
      byte[$3E] := $83              ' := [$1E3]
    
      byte[$42] := $A2              ' [$1E2] :=
      byte[$47] := $A3              ' [$1E3] :=
    
      byte[$54] := $A2              ' [$1E2] :=
      byte[$57] := $A3              ' [$1E3] :=
      
    PRI ShowInterpreter | pc
    
      tv.Start( 12 )
    
      repeat pc from $000 to $00F
        tv.Hex( pc, 3 )
        tv.Out( " " )
        tv.Out( ":" )
        tv.Out( " " )
        result := PeekCog(pc)
        tv.Hex( result, 2 )
        tv.Out( " " )
        tv.Hex( result >> 8, 2 )
        tv.Out( " " )
        tv.Hex( result >> 16, 2 )
        tv.Out( " " )
        tv.Hex( result >> 24, 2 )
        tv.Out( $0D )
    
    
    



    Adr  OK   Read            Expected        Should be ...
    
    000       01 00 00 00     05 00 FC A0     A7          MOV      V1, #$05
    001       01 00 00 00     F0 03 BC A0                 MOV      V2, PAR
    002       03 00 00 00     02 02 FC 80     A8          ADD      V2, #$02
    003       01 00 00 00     01 D6 BF 04                 RDWORD   $1EB, V2
    004       01 00 00 00     00 07 FC 80                 ADD      V4, #$100
    005       22 00 00 00     00 07 FC 80                 ADD      V4, #$100
    006       03 00 00 00     02 00 FC E4                 DJNZ     V1, #A8
    007       74 0C 00 00     01 D2 FF 0C                 COGID    V5
    
    008  YES  00 00 FC A0     00 00 FC A0     A9          MOV      V1, #$00
    009  YES  EE 0B BC 00     EE 0B BC 00                 RDBYTE   V6, V7
    00A  YES  01 DC FF 80     01 DC FF 80                 ADD      V7, #$01
    00B  YES  40 0A 7C 85     40 0A 7C 85                 CMP      V6, #$40 WC
    00C  YES  EE 00 4C 5C     EE 00 4C 5C         IF_NC   JMP      #A12
    00D  YES  05 06 BC A0     05 06 BC A0                 MOV      V4, V6
    00E  YES  04 06 FC 20     04 06 FC 20                 ROR      V4, #$04
    00F  YES  1A 06 FC 80     1A 06 FC 80                 ADD      V4, #$1A
    
    
    



    The trick is to subvert the interpreter code which is used to do range checking, then force that code to execute ( LookUp in this case ), then restore it so other bytecode using that still works.

    This is all rather academic given that we have the Interpreter source code and can use a RAM Interpreter such as that Cluso99 is working on, and this trick won't work on the Prop II if Chip adds the code to prevent it and I expect he will, although I hope he leaves the PC, SP etc data accessible which cannot be used to suck out the interpreter code.

    It's been good fun though and I'm easily amused by sticking crowbars in cracks and seeing what happens :)
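
For readers puzzling over the `$083C01E3 + ( peekAdr << 9 )` constant in PeekCog, here is a small Python sketch that splits the resulting long into the standard Propeller 1 instruction fields (6-bit opcode, ZCRI flags, 4-bit condition, 9-bit destination, 9-bit source). It is only an illustration of the arithmetic; the helper names are invented, and the claim that the result is a `wrlong peekAdr, $1E3` comes from the comment in the code above.

```python
# Sketch only: builds and unpacks the long that PeekCog writes to $1E2.
WRLONG_VIA_1E3 = 0x083C01E3   # constant taken from PeekCog above


def peek_instruction(peek_adr):
    """Return the instruction long for a given cog address to peek."""
    if not 0 <= peek_adr <= 0x1FF:
        raise ValueError("cog addresses are 9 bits wide")
    return WRLONG_VIA_1E3 + (peek_adr << 9)   # drop address into dest field


def instruction_fields(instruction):
    """Split a Propeller 1 instruction long into its bit fields."""
    return {
        "opcode": (instruction >> 26) & 0x3F,
        "zcri": (instruction >> 22) & 0xF,
        "cond": (instruction >> 18) & 0xF,
        "dest": (instruction >> 9) & 0x1FF,
        "src": instruction & 0x1FF,
    }


if __name__ == "__main__":
    fields = instruction_fields(peek_instruction(0x008))
    print({name: hex(value) for name, value in fields.items()})
```

The source field stays at $1E3 ( which holds $7FFC ), the R bit is clear so the access is a write to hub, and only the destination field changes with the address being peeked.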
  • Cluso99 Posts: 18,069
    edited 2008-08-12 05:43
    Just looking at the code in detail. The first 8 longs in the cog are used as variables x,y,a,t1,t2,op,op2,adr after the initial execution of the code which is initialising the pbase..dcurr fields. That is why the first 8 longs don't compare. (Just for others looking as I am sure Hippy knows this) :)
  • Cluso99 Posts: 18,069
    edited 2008-08-12 11:02
    Here is a version of Hippy's hack that uses FullDuplexSerial instead of TV. Just use PST - there is a 5 second delay before output.

    Thanks Hippy - I learnt a few more things getting this to work. It has given me a few more ideas for debugging my Interpreter.
  • Sleazy - G Posts: 79
    edited 2008-08-21 00:55
    Wow, I don't know if Chip is either flattered or wants to cut you to pieces, but bravo on the hax. You are 1337 H4X0R. All your base belong to us!

    On a related note, Can you remember that long dual-port ram discussion we tagged on a few months ago I believe? I just watched a few parts of the 7 part Chip Gracey interview on youtube. It sounds like he's quad-porting the ram, or so he says. Pipeline.
  • hippy Posts: 1,981
    edited 2008-08-21 01:31
    Yes, I remember that discussion. I haven't had a chance to sit down and watch the Chip videos yet but intend to. I think we're all going to be quite impressed when Prop II does arrive.

    It's quite a pleasure to see Parallax so laid back about the digging myself and others have done. I'm sure Chip is quite fascinated watching what people do pull out of the bag - a bit like a proud parent pushing a child out into the world and finding out what it can do they never expected !

    I expect Chip and others ( myself included ) would be rather upset if anyone set about deliberately undermining the Propeller or Parallax but it seems to me he's happy to see anything which isn't damaging as positive. I don't expect a cease and desist notice any time soon :)
  • QuattroRS4 Posts: 916
    edited 2008-08-21 09:12
    Hippy - a 'cease and desist' notice would probably push this underground and make you and others push that little bit harder because you know that you have 'touched a nerve' .. Yip - A 'Cease and desist' would make you (edit: -more) dangerous ! Lol

    Regards,
    John Twomey

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    'Necessity is the mother of invention'
  • Paul Baker Posts: 6,351
    edited 2008-08-21 23:15
    Hey John, just saw your page in the '09 catalog, very nice.

    Sorry for the OT comment.

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Paul Baker
    Propeller Applications Engineer

    Parallax, Inc.
  • QuattroRS4 Posts: 916
    edited 2008-08-22 02:47
    Cheers Paul..

    Regards,
    John

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    'Necessity is the mother of invention'
  • Cluso99 Posts: 18,069
    edited 2008-09-17 06:20
    Has anyone found out what the compiler directive is to use cog memory (registers) $1E0-$1EF versus $1F0-$1FF?

    I want to use spin to output these bytecodes...

    3F 80+n     PUSH    spr
    3F A0+n     POP     spr
    3F C0+n     USING   spr
    
    

    I know I can fudge it as Hippy has done (above).
  • hippy Posts: 1,981
    edited 2008-09-17 13:30
    There is none I know of as there's no reason Spin would be designed to allow that, it's just good fortune it worked. The only way I know of is to patch the bytecode at run-time or poke data into the .binary/.eeprom file after compilation.
  • Cluso99 Posts: 18,069
    edited 2008-09-17 14:58
    Hippy - thought that may be the case, so am coding using a similar method to what you have done. Looking for the bytecode and modifying it on the fly, so to speak.
  • BradC Posts: 2,601
    edited 2008-09-17 15:17
    Cluso99 said...
    Hippy - thought that may be the case, so am coding using a similar method to what you have done. Looking for the bytecode and modifying it on the fly, so to speak.
    That is a case that would be very easy to add to a compiler as an additional instruction.

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Pull my finger!
  • Cluso99 Posts: 18,069
    edited 2008-09-17 16:08
    @BradC: It might be best to only enable by an option because of the consequences of using it. Another feature that would be nice is being able to compile pasm into $1F0-1FF. PropTool blocks it, but it would be nice to just be a warning. Maybe both these could generate warnings???

    I am sorry that I haven't had time to try the compiler yet. I am trying to reduce the complexity of the spin debugger, so it is simple for others. Otherwise it's working well - I can switch from spin down to pasm and back on the fly, so I can just trace a particular bytecode. It also displays instruction counts (spin and pasm).
  • BradC Posts: 2,601
    edited 2008-09-17 16:36
    Cluso99 said...
    Another feature that would be nice is being able to compile pasm into $1F0-1FF. PropTool blocks it,

    Really?
    Oh, I see.. it compiles code just fine, it won't let you jump to a label after $1F0 but you can jump to an immediate value just fine.

            org     $1F0
    aa      mov     outa,ina
            mov     dira,outa
            jmp     #aa             ' <-- fails
            jmp     #$1F0           ' <-- works
    
    



    Yeah, that'll be easy to fix in a new compiler.

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Pull my finger!
  • hippy Posts: 1,981
    edited 2008-09-17 16:54
    Don't forget that anything compiled to $1F0-$1FF won't actually get loaded there ! The CogInit zeroes the SPR.

    A RAM Spin Interpreter could be made to do that using the same zero-footprint debug trick as in Spin. It would run into problems though if two Cogs launched the same code at the same time as what's in hub would need to be replaced by a micro-bootloader and restored in the Cog later.
  • BradC Posts: 2,601
    edited 2008-09-17 17:50
    Cluso99 said...
    @BradC: It might be best to only enable by an option because of the consequences of using it.

    I was thinking

    X1 := REG[$1EF]

    Where REG[XX] works precisely as any other register operation as long as the XX is between $1E0 and $1FF
    You could even do REG[$1ED][3..6]~~

    It would be incredibly easy to implement and allow you to access anything from $1E0-$1FF with all the features of a normal register access.

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Pull my finger!
  • Cluso99 Posts: 18,069
    edited 2008-09-17 18:17
    BradC said...
    I was thinking

    X1 := REG[$1EF]

    Where REG[XX] works precisely as any other register operation as long as the XX is between $1E0 and $1FF
    You could even do REG[$1ED][3..6]~~

    It would be incredibly easy to implement and allow you to access anything from $1E0-$1FF with all the features of a normal register access.
    Your solution would be great. The same applies to REG[$1EF] := X1

    In the Spin Interpreter, it is curious that only the $3F bytecode allows access to the registers (cog memory) from $1E0-$1FF whereas the SPR[reg] only allows access to $1F0-$1FF (because the bytecodes $24-$26 OR the register with $10 and subsequently OR with $1E0 in the $3F section).
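
Taking that description literally, a tiny Python sketch shows why SPR[reg] can never land below $1F0: OR-ing with $10 and then with $1E0 forces every address bit above the low four. Treating reg as a 9-bit register index is an assumption made for the sketch, as is the function name; whether the interpreter masks reg first is not stated in the thread.

```python
# Sketch only: models "OR the register with $10, then OR with $1E0".
def spr_cog_address(reg):
    """Return the cog address SPR[reg] would reach under that description."""
    if not 0 <= reg <= 0x1FF:
        raise ValueError("sketch assumes a 9-bit register index")
    return 0x1E0 | 0x10 | reg


if __name__ == "__main__":
    for reg in (0x00, 0x04, 0x0F, 0x1E9):
        print(f"SPR[${reg:03X}] -> ${spr_cog_address(reg):03X}")
```

Even an index of $1E9 comes out as $1F9 here, so the interpreter's own variables stay out of reach through SPR[], while a raw '$3F 80+n' operand has no such forced bit.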
  • BradC Posts: 2,601
    edited 2008-09-17 19:15
    Cluso99 said...

    In the Spin Interpreter, it is curious that only the $3F bytecode allows access to the registers (cog memory) from $1E0-$1FF whereas the SPR[reg] only allows access to $1F0-$1FF (because the bytecodes $24-$26 OR the register with $10 and subsequently OR with $1E0 in the $3F section).

    Curious ? Yes and no.. more no than yes..

    If you look at it, the compiler is specifically geared not to allow you access to $1E0 -> $1EF, although it does itself for COGID. So, logically it needs the ability to do that, but it does not want you to do that. Now look at SPR[] which allows you to poke any value in there you want.. it needs to range check that in the interpreter to ensure you don't go where it does not want you to go.

    Makes sense to me anyway.
    Having the compiler source code though, means you can add anything you like to exploit the entire feature set of the available code.
    I'm looking at ways of using PCURR to allow easy runtime patching of the Spin Bytecode..

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Pull my finger!
  • Cluso99 Posts: 18,069
    edited 2008-09-18 02:22
    @BradC: I am playing with modifying the hub bytecode on the fly. I need this to test all bytecodes in the Interpreter and I don't have a spin program to do this. I want to verify all options so that I can be sure the Interpreter is the same in all ways unless specifically stated.