Code protect?

deSilva · 2008-02-23 14:58

Hope that this is not misunderstood:
Hippy, I think you did a tremendous piece of work considering your inexperience with cryptography. However, there was no code to break... As you describe it very well: "..but after that it was really just a coding exercise."
Terms such as "XOR-ing" or "bit-swapping" are not in the repertoire of a professional cryptanalyst... they are child's play.

This is also a fantastic example of a Potemkin village. Someone says: "It is encrypted!" and everybody accepts that - even deSilva.

Thank you, Hippy!

hippy · 2008-02-23 15:56

@ deSilva : I take no exception to what you write. I'm no cryptanalyst, no mathematical whizkid, and what others are doing and achieving with statistical analysis is far beyond my skills. I was "just lucky": right place, right time, some good guesses and intuition which turned out right.

Going back through the thread, I also made some guesses which kept me encouraged which turned out to be completely wrong.

I hadn't heard of a Potemkin village and had to look that up. I did hear tales of the USSR using a similar ploy with flat cut-outs and kerosene heaters where the engine tail pipes would be, to make it look to spy satellites that they had more jet fighters than they did, and Britain invented and ran an entirely imaginary army in WW2 for purposes of distraction. On a more recent front, the past and ongoing 'assessments' of two Middle Eastern countries show how people can be easily played by information which isn't accurate, and getting it wrong can lead to terrible tragedy.

I was thinking along the same lines however; if Parallax had claimed triple level, military strength encryption, how many would simply have not bothered to even try the challenge ? I wouldn't have. Expecting it to be harder than it was is probably why no one made any serious attempts at decoding the ROM before. Plus lack of 'evil intent' with most here undoubtedly respecting Parallax's right to their IP.

As has been said here and elsewhere, an encryption method only has to outlast the time and effort those attempting to crack it are willing to put in. And for the potential cracker, the gain has to outweigh the pain. Pragmatically, while it has exposed Parallax IP, has it actually done any damage ? It has the potential to, but I think the risk is low ( that's the topic of "threat versus risk" which is dear to my heart in the current political climate but not for this forum ).

Phil Pilgrim (PhiPi) · 2008-02-23 18:09

Hippy,

Another thought about the missing RETs: One RET can easily serve multiple subroutines, so long as they aren't nested. A profusion of direct or indirect jumps to the same location would be a tip-off that this was being done. In the case of indirect jumps, the jump target doesn't have to be another JMP, either: it could just be a temporary variable location.

-Phil

deSilva · 2008-02-23 21:45

Possibly this thread has aroused interest in a fascinating topic among some readers. Books about it are galore, but I have a special link for (mostly) German readers here:
www.staff.uni-mainz.de/pommeren/Kryptologie/
Masses of links as well, many of them English of course, and some quite entertaining.

Harley · 2008-02-23 23:31

Hippy,

I'm curious; what tool(s) did you use to obtain the information for your 'partial.txt' file? I don't recall any reference to what it takes to 'read' the Prop ROM and manipulate the bits.

This forum thread has been a very interesting 'read' to follow. If I weren't occupied with other projects, it would have been interesting to have gotten involved.

Super job. Many thanks for the effort and documentation provided.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Harley Shanko

stevenmess2004 · 2008-02-24 01:21

Harley, you can do it in either spin or assembler. For spin you just use something like
temp := long[$F004]
or in assembler you would use the rdlong instruction

Beanie2k · 2008-02-24 02:40

@Hippy

Are you planning on finishing the decoding, or are you stopping now? IMHO the cat is now out of the bag, save for the tail. Might as well get that part out too. Either way, great job everyone!!

deSilva · 2008-02-24 02:41

stevenmess2004 said...
Harley, you can do it in either spin or assembler. For spin you just use something like
temp := long[$F004]
or in assembler you would use the rdlong instruction

A most excellent reply!

potatohead · 2008-02-24 02:48

I missed this!

So the contents of $f004 would be put into temp then?

deSilva · 2008-02-24 02:51

Well, I could write an appendix to deSilva's Tutorial for this, but I would concentrate on the assembly part...

stevenmess2004 · 2008-02-24 02:53

Yeah, you can read all the contents of the rom like that. My propAsm (I need to change the name of that, when I named it I forgot about the other one) code can even disassemble the code and show you what it looks like as assembly code doing that. (it doesn't decrypt it although it would be interesting to add that, swapping the bits around wouldn't be hard)

Beanie2k: I expect that hippy may be catching up on some much needed sleep

Beanie2k · 2008-02-24 03:02

Ah yes! I forgot he is almost halfway around the world from me. Oh the wonders of the internet !!!

hippy · 2008-02-24 03:25

@ Beanie2k : I was exercising that other part of my life and went out for a beer

That got the neurons moving again. What is it they say about the 80:20 rule ? Anyway, finally cracked it, and I reckon I have all 32 bit-mappings worked out ...

Instruction Bit : 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16
Encoded ROM Bit :  3  7 21 12  6 19  4 17 20 15  8 11  0 14 30  1

Instruction Bit : 15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
Encoded ROM Bit : 23 31 16  5  9 18 25  2 28 22 13 27 29 24 26 10

Edit oops : Bit-reversal corrected !
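
For anyone wanting to try the table above on a ROM dump, here is a minimal sketch in Python rather than Spin. It assumes the table reads as "instruction bit i is taken from encoded ROM bit m(i)"; the names SOURCE_BIT, decode_long and encode_long are made up for this illustration and are not from hippy's decoder.

```python
# Encoded ROM bit that supplies each instruction bit, listed for
# instruction bits 31 down to 0 exactly as in the table above.
ROM_BIT_FOR_INSTR_BIT_31_TO_0 = [
    3, 7, 21, 12, 6, 19, 4, 17, 20, 15, 8, 11, 0, 14, 30, 1,
    23, 31, 16, 5, 9, 18, 25, 2, 28, 22, 13, 27, 29, 24, 26, 10,
]

# Re-index so that SOURCE_BIT[i] is the ROM bit feeding instruction bit i.
SOURCE_BIT = ROM_BIT_FOR_INSTR_BIT_31_TO_0[::-1]


def decode_long(encoded):
    """Assumed direction: instruction bit i comes from ROM bit SOURCE_BIT[i]."""
    decoded = 0
    for instr_bit, rom_bit in enumerate(SOURCE_BIT):
        decoded |= ((encoded >> rom_bit) & 1) << instr_bit
    return decoded


def encode_long(instruction):
    """Inverse permutation, handy for checking a round trip."""
    encoded = 0
    for instr_bit, rom_bit in enumerate(SOURCE_BIT):
        encoded |= ((instruction >> instr_bit) & 1) << rom_bit
    return encoded
```

If the table is meant the other way round, swapping the two function names gives the opposite direction; the round trip holds either way because the mapping is a plain permutation of the 32 bit positions.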

There definitely were some errors in the earlier register bit-mappings. I'm starting to see Chip's clever trickery at work now and one mystery - That "007 : COGID $1E9", I cannot see anywhere it ever gets used, but it has to be there for a reason, or I've still got a mis-mapping in the opcode part. Everything else passes the 'looks sensible' visual test though, but I've not combed through it extensively.

Post Edited (hippy) : 2/24/2008 3:31:44 AM GMT

stevenmess2004 · 2008-02-24 03:30

Have you figured out what the $3C opcode does? If you haven't then I'll have another go.

hippy · 2008-02-24 04:04

I've taken a quick look for $3C and it looks to me ( you'd best confirm ) ...

1) $3C loaded into $005 ( instruction register )
2) Falls through the IF_NC test to 00D
3) The $3C in $005 indexes $01A-$01E as a jump table ( 4 x byte addr per long )
4) Code jumps to $073 with Z=1 C=0
5) That Pops 3 longs ... then I'm not sure
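
Step 3 describes a table holding four byte-sized addresses per long. A rough Python sketch of that kind of lookup follows; the byte order inside each long and the table contents are assumptions made for illustration, not taken from the ROM, and how the interpreter derives the index from the opcode is not modelled here.

```python
def jump_target(table_longs, index):
    """Return byte entry `index` from a table packed four bytes per long.

    Assumption: entry 0 sits in the least significant byte of the first long.
    """
    long_value = table_longs[index >> 2]   # which long holds the entry
    shift = (index & 3) * 8                # which byte lane inside that long
    return (long_value >> shift) & 0xFF


# Made-up table contents, purely for illustration.
example_table = [0x44332211, 0x88776655]
```

Getting the byte order or the index wrong by one lane lands on a different target, which is worth keeping in mind when two people arrive at different jump addresses for the same opcode.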

The problem is perhaps not so much how $3C works, but what it's there for. We will see the logic of what it does but its functionality only really makes sense in terms of its appearance in the bytecode.

I'll leave you to investigate further as I think I do need that sleep now

cgracey · 2008-02-24 05:10

Good job, Hippy! (And the rest of you, too!)

I was really stunned that you figured out those first two instructions so quickly. I knew it would soon be over, after that.

The original source code is attached in this posting. I'm sure you guys will find some ways to improve it.

Regarding the interpreter code: My only regret was that I didn't use an external RDBYTE table to fetch those 16 different branch addresses in the main loop. This would have saved several longs and sped things up a bit. Those extra longs could have been used to increase the interpreter's functionality. It took me almost two days towards the end of development to free up just one long, so that I could fix a bug. This code was really fun to write. You'll see that to avoid spending longs on conditional JMPs, separate sequences were often given complementary conditions and then placed in-line. This is not how things started out, but it became necessary as functionality began to trump speed. The functional limits of the interpreter kept growing until I absolutely couldn't make it do any more. That was the only way I knew it was finished. Towards the end, it was a ten-steps-forward/nine-steps-back scenario that went on for weeks.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔

Chip Gracey
Parallax, Inc.

Post Edited (Chip Gracey (Parallax)) : 2/24/2008 9:12:49 AM GMT

ModEdit: 2017.06.20.
Link here to working download files:
http://forums.parallax.com/discussion/comment/1359516/#Comment_1359516

cgracey · 2008-02-24 05:22

One other thing:

Beyond the defined instructions and data, all background longs were filled with randomly-chosen longs that existed within the interpreter. This was done to make the boundaries of the application appear unclear from a ROM dump. (So much for that!)

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔

Chip Gracey
Parallax, Inc.

stevenmess2004 · 2008-02-24 05:24

Thanks a heap for posting this Chip

Also a nice trick to reuse the names for the first few registers.

hippy, opcode $3C is listed as unused

I haven't figured out what it actually does yet though.

stevenmess2004 · 2008-02-24 05:38

Chip said...
My only regret was that I didn't use an external RDBYTE table to fetch those 16 different branch addresses in the main loop. This would have saved several longs and sped things up a bit.

Does this mean that we may have some new features in the next prop? Also if there is a multiplier then maybe some more?

cgracey · 2008-02-24 05:44

stevenmess2004 said...

Chip said...
My only regret was that I didn't use an external RDBYTE table to fetch those 16 different branch addresses in the main loop. This would have saved several longs and sped things up a bit.

Does this mean that we may have some new features in the next prop? Also if there is a multiplier then maybe some more?

There are some fundamental differences with the next Prop, as it is shaping up to have 256KB of RAM and 256KB of ROM. This means the interpreter must break out of its current 64KB-map limit. Also, if we can read 8 longs per hub access, caching in special interpreter code becomes much more viable. It's going to be different, for sure. Spin shouldn't change much, though.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔

Chip Gracey
Parallax, Inc.

stevenmess2004 · 2008-02-24 07:33

hippy said...
1) $3C loaded into $005 ( instruction register )
2) Falls through the IF_NC test to 00D
3) The $3C in $005 indexes $01A-$01E as a jump table ( 4 x byte addr per long )
4) Code jumps to $073 with Z=1 C=0
5) That Pops 3 longs ... then I'm not sure

Hippy, for 4 I get $096, which happens to be a cogstop x. If that's right then it will cause all sorts of bad things to happen.

Beanie2k · 2008-02-24 09:13

Mike Green said...
parsko,
[...] I suspect that the hardware for the COGINIT instruction uses the decryption logic whenever it starts a cog from ROM since the bootloader is also stored in encrypted form.

I wonder if this same "encryption" method is used for the bootloader?

stevenmess2004 · 2008-02-24 09:19

Chip said...
Whenever a cog loads from ROM, these muxes rearrange the data bits.

So it would seem that the encryption is the same. Maybe hippy can get us the code

hippy · 2008-02-24 13:14

Thanks for posting the code Chip, it really is interesting to read, and a lot easier as annotated source code.

Compared with my own attempt, a lot of similarities are appearing, so I think I was on the right track. But I really missed out big time on the optimisation tricks you used, and I missed a few bytecode mappings which would have optimised things.

That 'use Z and C as a case statement index' is very efficient. It would have saved me around 100 longs for handling opcodes below $40. I knew there'd be some very fundamental trick I'd missed. I think it's likely true to say that any future VM's I write for the Propeller are going to 'borrow' that idea.

One very useful addition to the compiler would be IF_00, IF_01, IF_10 and IF_11 conditionals which translate to IF_(N)C_(N)Z to make coding those easier and less error prone. IF_0, IF_1, IF_2 and IF_3 would probably be welcome as well.
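
To make the proposed aliases concrete, here is a small Python sketch of how such names could expand to the existing PASM condition names. The alias syntax is only the suggestion above, not anything the compiler supports, and the digit order (first digit C, second digit Z) is an assumption read from "IF_(N)C_(N)Z"; expand_alias is a made-up helper.

```python
# Existing PASM condition names for the four C/Z combinations.
CONDITION_FOR_FLAGS = {
    (0, 0): "IF_NC_AND_NZ",
    (0, 1): "IF_NC_AND_Z",
    (1, 0): "IF_C_AND_NZ",
    (1, 1): "IF_C_AND_Z",
}


def expand_alias(alias):
    """Expand a proposed alias such as IF_10 or IF_2 (hypothetical syntax)."""
    digits = alias[len("IF_"):]
    # Two digits are read as binary (IF_10), one digit as a number 0..3 (IF_2).
    value = int(digits, 2) if len(digits) == 2 else int(digits)
    c, z = (value >> 1) & 1, value & 1   # assumed order: C first, then Z
    return CONDITION_FOR_FLAGS[(c, z)]
```

With that reading, IF_10 and IF_2 both expand to IF_C_AND_NZ.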

Folding the Case, LookDown and LookUp handlers into a single, small routine is impressive. Those really had me struggling to implement. This sure is a candy box of surprises. Thanks.

@ Beanie2K : Yes, the Bootloader seems to use the same encryption. ROM dump of loader and a not very annotated decoding attached.

CardboardGuru · 2008-02-24 14:56

I second Hippy's comments. Thanks Chip for the source code. It's very interesting to read, and I'm sure good things will follow from the community knowing exactly what's going on in the Spin VM.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Help to build the Propeller wiki - propeller.wikispaces.com
Play Defender - Propeller version of the classic game
Prop Room Robotics - my web store for Roomba spare parts in the UK

hippy · 2008-02-24 16:11

While waxing lyrical on using the C and Z flags to create a one-of-four case statement in PASM, I am certain that someone else mentioned doing this in the forum a while ago but I cannot find that post so cannot give fair credit.

Reflecting on it, even without having realised the cleverness of TEST with WC, it is still possible to create the C and Z flags for the bottom two bits of a byte using -

AND opc,#2 WZ, NR
SHR opc,#1 WC, NR

Quite simply, despite it being obvious with hindsight and even being handed the 'how-to' on use, it's a trick I missed. I'm sure the C bit used for 'parity' in the bit-wise and MUX operators also has some good uses in the right places.
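
As a sanity check on the AND/SHR pair, here is a small Python model of the flag results. It assumes the documented Propeller 1 behaviour (WZ sets Z when the result is zero; SHR with WC sets C to bit 0 of the original value); the function names are invented for this sketch.

```python
def flags_from_low_bits(opc):
    """Model the AND/SHR pair acting on a bytecode value.

    AND opc,#2 with WZ sets Z when (opc AND 2) is zero.
    SHR opc,#1 with WC sets C to bit 0 of opc before the shift.
    NR discards both results, so opc itself is unchanged.
    """
    z = 1 if (opc & 2) == 0 else 0
    c = opc & 1
    return c, z


def group_flags(opc):
    """Return (C, Z) for the four bytecodes sharing opc's upper bits."""
    return [flags_from_low_bits((opc & ~3) | low) for low in range(4)]
```

Each member of a group of four bytecodes ends up with a different (C, Z) pair, which is what lets four conditional instruction sequences sit in-line without any JMPs between them.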

Also in hindsight, similar bytecodes in groups of four should have been a hint.

Regarding the CogInit issues of trampling on the stack identified recently, that seems to be the 'run' bytecode ( which I called MARK ) at label 'j5' which copies bytes to somewhere, presumably the base of the new stack to be, and then runs bytecode held in ROM at $FFFC. This is something I was never sure of in the bytecode disassembler so it's going to be interesting to decode what is going on there.

There's only one improvement I'd make to the bytecode so far, and it probably isn't there for the lack of two longs; bit7 in the Packed Number Load is not used ( at least that's now confirmed - label 'jD' ). I'd have used that bit7 to set the lsb of the base number ( currently '2' ) being shifted which allows a lot more numbers to be packed into a single byte.

I haven't looked in any depth at the Loader but it would be interesting to know what the limits for programming baud rates are and what the actual timeouts are, which have thrown up a few problems for people writing their own downloaders.

Harley · 2008-02-24 17:35

stevenmess2004 said...
Harley, you can do it in either spin or assembler. For spin you just use something like
temp := long[$F004]
or in assembler you would use the rdlong instruction

I should have worded my question differently, I now see.

What I was wondering was whether Hippy already had some tool or source, debugged and working, prior to the 'challenge'. Just 'looking' at a dump on screen would be quite a chore, so you would dump to a file, print it out, then try to figure out what's going on. That means a ROM dumper, a disassembler, and 'something' to switch bits to see if that helps. I'd be working for days to get my 'tools' to work before being able to begin on the challenge. Apparently Hippy and others had something already working to begin with. It seemed Hippy was ready when the 'challenge' gauntlet was thrown down by Chip; he seemed to take off immediately to work on this.

Years ago I'd written crude (well, not so crude) disassemblers and at least one assembler (for NSC's 8073 'micro'). So I know what a chore those tasks are. Fun stuff for some of us.

Thanks to everyone commenting on this ROM reverse engineer game.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Harley Shanko

deSilva · 2008-02-24 17:55

hippy said...
AND opc,#2 WZ, NR
SHR opc,#1 WC, NR

Maybe we really should compile some well known practices...
Restoring flags is generally done in this way:

SHR flags,#1 WC, WZ, NR

---
Edit:
Whilst saving them has been figured out to work best in this way...

MUXNZ flags,#2
MUXC  flags,#1
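
A quick Python model shows why this save/restore pair round-trips. It assumes the flags register holds nothing above bit 1, and that MUXNZ writes NOT Z while MUXC writes C into the masked bits; save_flags and restore_flags are names made up for the sketch.

```python
def save_flags(c, z, flags=0):
    """Model MUXNZ flags,#2 followed by MUXC flags,#1."""
    flags = (flags & ~2) | (0 if z else 2)   # MUXNZ: bit 1 := NOT Z
    flags = (flags & ~1) | (1 if c else 0)   # MUXC:  bit 0 := C
    return flags


def restore_flags(flags):
    """Model SHR flags,#1 WC, WZ, NR (the register itself is left unchanged)."""
    c = flags & 1                            # C := bit 0 before the shift
    z = 1 if (flags >> 1) == 0 else 0        # Z := shifted result is zero
    return c, z
```

Storing NOT Z rather than Z is what makes the single SHR work: the shifted result is zero exactly when Z was set at save time.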

Post Edited (deSilva) : 2/24/2008 6:08:17 PM GMT

hippy · 2008-02-24 18:53

Harley said...
What I was wondering was whether Hippy already had some tool or source, debugged and working, prior to the 'challenge'. Just 'looking' at a dump on screen would be quite a chore, so you would dump to a file, print it out, then try to figure out what's going on. That means a ROM dumper, a disassembler, and 'something' to switch bits to see if that helps. I'd be working for days to get my 'tools' to work before being able to begin on the challenge. Apparently Hippy and others had something already working to begin with. It seemed Hippy was ready when the 'challenge' gauntlet was thrown down by Chip; he seemed to take off immediately to work on this.

I was completely unprepared when Chip threw open the challenge, but I did have some information and knowledge I could call on. That was useful after I'd picked myself off the floor.

I have a Spin bytecode disassembler written based on work done by Robert Bryon Vandiver ( Gear ) and Cliff L Biffle. That helped with quickly turning the decoded bits into readable source code. A quick cut 'n' paste and tweak job.

http://forums.parallax.com/showthread.php?p=665019

In terms of getting the dump, I simply took the file provided by Peter Jakacki. That saved having to do anything to extract the ROM data myself.

http://forums.parallax.com/showthread.php?p=676627

Everything else between that file and the decoded output was written in the past few days.

An editor extracted the required image data from the .HEX file and a short program turned that into a long per line ( CogImage.Txt ). A rewrite got the endian issue sorted.
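
The endian issue is easy to trip over, so here is a hedged Python sketch of the "long per line" step. It assumes the image data is already available as raw bytes in address order; the output format (8 hex digits per line) is a guess, not the actual CogImage.Txt layout, and bytes_to_long_lines is a made-up name.

```python
import struct


def bytes_to_long_lines(image):
    """Format a byte image as one 32-bit long per line, 8 hex digits each.

    The Propeller stores longs least significant byte first, so the bytes
    11 22 33 44 in address order form the long $44332211.
    """
    if len(image) % 4:
        raise ValueError("image length must be a multiple of 4 bytes")
    count = len(image) // 4
    longs = struct.unpack("<%dI" % count, image)   # "<" = little-endian
    return ["%08X" % value for value in longs]
```

Reading the same bytes big-endian would scramble every long before any bit-swapping is even attempted, which makes the permutation look far more complicated than it is.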

The main decoding program only came after I had guessed it was bit-swapping and 'proven' that on paper. Two rewrites there: first, the change to map output bits from input bits rather than the other way round, and then a complete rewrite from the ground up to solve the error I'd introduced in address mapping.

The final version reads two files, "Known.Txt" which is what the decoder itself outputs and feeds back in in the hope of coercing more solutions, and a snapshot of that as "Guessed.Txt" which I could hand modify to force known instructions and addresses with. If you look in Guessed.Txt you'll see the address bits I could determine from looking at the output. Not that many required to force complete decoding. DJNZ, MOVI, MOVD and MOVS were the biggest help. The snapshot came from the earlier decoding because I was convinced it was only the addressing bits which were wrong. The program itself got tweaked to put in some annotations as they appeared.

Beanie2k · 2008-02-24 20:03

hippy said...

[...]
I haven't looked in any depth at the Loader but it would be interesting to know what the limits for programming baud rates are and what the actual timeouts are, which have thrown up a few problems for people writing their own downloaders.

I think before we mess around with the loader code we ought to get Chip's OK on it. The challenge was for the bytecode interpreter only. I just raised the question about whether the loader was encrypted the same way just as a natural follow-on. The last thing I want to do is hurt our good relationship with the folks at Parallax.
