Code protect?

hippy · 2008-02-21 14:16

Another useful attack vector - the RET instruction.

Assuming the tool Chip used sets src and dst to zero, all RET should have the same scrambled values except for any conditional execution part. Ignoring IF_NEVER, all RET instructions have between 6 and 9 bits set, the rest cleared, 9 bits set for an IF_ALWAYS RET.

Applying a bit-count puts a valid 'IF_ALWAYS RET' at 1E4, just before the constant which looks like -1 ... coincidence ? Or that -1 would be the first constant chosen to appear after code ?

On the counter-side, there are not as many scrambled longs which have the same value which suggests a lack of IF_ALWAYS RET, which seems unlikely.

Another target is IF_ALWAYS JMP #? which should have between 10 and 17 bits set. Any other instructions which should have a narrow band of bits set ?
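
The bit counts quoted in this post can be checked with a short Python sketch. It assumes the usual Propeller 1 instruction layout (6-bit opcode, ZCRI flags, 4-bit condition, 9-bit dst, 9-bit src), that RET is a JMPRET with R clear and I set, and, as the post does, that src and dst are left at zero. The names `encode` and `popcount` are illustrative helpers, not part of any tool mentioned in the thread.

```python
def encode(opcode, zcri, cond, dst=0, src=0):
    """Pack the five instruction fields into one 32-bit long."""
    return (opcode << 26) | (zcri << 22) | (cond << 18) | (dst << 9) | src

def popcount(value):
    """Number of bits set in value."""
    return bin(value).count("1")

JMPRET = 0b010111
NR_IMM = 0b0001      # ZCRI: no result write, immediate src
IF_ALWAYS = 0b1111

# RET under every condition except IF_NEVER (cond = 0)
ret_counts = {popcount(encode(JMPRET, NR_IMM, cond)) for cond in range(1, 16)}
print(sorted(ret_counts))                          # [6, 7, 8, 9]
print(popcount(encode(JMPRET, NR_IMM, IF_ALWAYS))) # 9

# IF_ALWAYS JMP #addr, leaving out the targets $000 and $1FF
jmp_counts = {popcount(encode(JMPRET, NR_IMM, IF_ALWAYS, src=addr))
              for addr in range(1, 0x1FF)}
print(min(jmp_counts), max(jmp_counts))            # 10 17
```

Because a pure bit permutation never changes the number of set bits, these counts carry over unchanged to the scrambled longs, which is what makes the bit-count attack usable at all.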

Another interesting thing is that there are no zeroed longs in the Cog as one would expect for uninitialised variables ( zeroed longs exist in the loader part ). I suspect this is because run-time variables are overlaid on Cog Code once that has been executed and becomes redundant. The most obvious, executed and re-usable are at $000 up, so that may prove useful with src and dst registers mainly being zero bits elsewhere.

One final thought for now - Most of us are, in general, attacking the interpreter code; would a better approach be to attack the 'simpler' bootloader or even start with the constants loaded to the Cog ?

CardboardGuru · 2008-02-21 14:18

Yes I'm pretty sure the bit swapping permutations are given by 32! which is 2.6e+35. That's far more secure than a 32 bit constant XORed into all the words.

Checking one per second, that would take 8.3e+27 years to do brute force. Even getting the check down to milliseconds per permutation won't practically help much.
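
For anyone who wants to re-run the arithmetic, here is a minimal Python sketch. The year length of 365.25 days is an assumption; the variable names are illustrative.

```python
import math

permutations = math.factorial(32)          # ways to reorder 32 bit positions
seconds_per_year = 365.25 * 24 * 3600      # assumed year length

years_at_one_per_second = permutations / seconds_per_year
years_at_one_per_ms = years_at_one_per_second / 1000

print(f"{permutations:.2e}")               # 2.63e+35
print(f"{years_at_one_per_second:.1e}")    # 8.3e+27
print(f"{years_at_one_per_ms:.1e}")        # 8.3e+24
```

Speeding the check up a thousandfold only takes three zeros off the exponent, which is why blind brute force is not the way in.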

It reminds me of the eternity puzzle. www.google.co.uk/search?q=eternity+puzzle&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-a A jigsaw variant that offered £1,000,000 to the first person to solve it. The permutations were so huge that no foreseeable supercomputer could brute force the solution given millions of years. Yet the problem was understood and a way of whittling down the permutations was found which resulted in a solution being found by brute force in just a few months.

The key to that was one of parity. As I recall the pieces were conceptually made of several right angled triangles. Some left handed triangles, some right handed. So a given piece might have more right handed triangles in it, or more left handed triangles, or the same number. This gave a "parity" for each piece. It so happened that when a program was doing a search of permutations, you could check the parity of the pieces you have left against the parity of the space left on the board to fill, and rule out a whole branch of the search tree.
Your suggestion of counting the number of ones and zeros in a long is parity too. If this is a bit swap situation, that does seem like a very promising angle to pursue.

I take that back. There was much talk of using parity to help solve eternity, but I've just read up the method that the winners used, and they didn't use parity, and suggest parity is useless for that puzzle. But that doesn't mean it won't be useful here.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Help to build the Propeller wiki - propeller.wikispaces.com
Play Defender - Propeller version of the classic game
Prop Room Robotics - my web store for Roomba spare parts in the UK

Post Edited (CardboardGuru) : 2/21/2008 2:46:27 PM GMT

lnielsen · 2008-02-21 14:56

I have had a fun morning reading this thread. I REALLY enjoy this forum but I am not enough of a propeller head to help with the search. I think it is interesting that Chip says there might be a solution to code protection in PropII and then sends us on this quest. I have to assume the new PropII feature is making this "utility" available to us. If that is true and we are able to break it then the "utility" would be useless in the future PropII.

That said, I don't think this "utility" can be based on static scrambling code; if it was then once you knew the trick all would be revealed. There has to be a key to make this work for the future and that key must be truly hidden in the scrambled image. I would think that with the "trick" understood you could find the key if you had both the source and scrambled images. Obviously you can extract the source if you have the scrambled and the key but cracking the scrambled without the key must be a serious effort even when you know the trick.

To make this a practical solution for the PropII you would need to prove that you couldn't find the key in someone's object based on numerous experiments where you have both the source and scrambled versions of your own code.

Have fun and I will continue to watch. I think the PropII will be well worth the wait.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
BioProp: Robotics - Powered by Bioloids and controlled by the Propeller

hippy · 2008-02-21 15:38

Whoohoo ... maybe ....

$000 : MOV $000,#5
$001 : MOV $001,PAR

There are the right number of bits in the first long and in the second for each of those instructions, plus the right number of changes ( bit sets and bit clears ) between the two. It's also what one would expect as there are five words which are in the info block the interpreter needs to use. Following should be some sort of copy/load from Hub code.

If correct, then it narrows down the potential bit swapping possibilities; one of three bits cleared maps to the I (immediate) bit, the other two to bits 0 and 2 (src), the six bits set map to bits 4 through 9 ( src, and lsb of dst ). The set and unchanged bits map to bit 29 and 31 (opc) and the other five to R and the four Condition bits.
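
The set/clear bookkeeping between the two guessed instructions can be reproduced on the unscrambled encodings. This is a sketch that assumes the usual Propeller 1 instruction layout and PAR at register $1F0; `encode` and `bits` are illustrative helpers.

```python
def encode(opcode, zcri, cond, dst, src):
    """Pack the five instruction fields into one 32-bit long."""
    return (opcode << 26) | (zcri << 22) | (cond << 18) | (dst << 9) | src

def bits(value):
    """Positions of the set bits, lowest first."""
    return [b for b in range(32) if value >> b & 1]

MOV, IF_ALWAYS, PAR = 0b101000, 0b1111, 0x1F0

first = encode(MOV, 0b0011, IF_ALWAYS, 0x000, 5)     # MOV $000,#5
second = encode(MOV, 0b0010, IF_ALWAYS, 0x001, PAR)  # MOV $001,PAR

cleared = bits(first & ~second)       # set in the first long only
newly_set = bits(second & ~first)     # set in the second long only
unchanged_set = bits(first & second)  # set in both

print(len(bits(first)), len(bits(second)))  # 10 13
print(cleared)                              # [0, 2, 22]
print(newly_set)                            # [4, 5, 6, 7, 8, 9]
print(unchanged_set)                        # [18, 19, 20, 21, 23, 29, 31]
```

A bit permutation preserves all three counts (3 cleared, 6 set, 7 unchanged), so a pair of scrambled longs showing the same counts is a candidate for this instruction pair.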

Applying that to the constant at 1E8 would be interesting.

Post Edited (hippy) : 2/21/2008 4:08:01 PM GMT

Phil Pilgrim (PhiPi) · 2008-02-21 18:52

Hippy,

I think you might be reading your ROM in big-endian form. You quoted the following data:

1E5 FFFFFFFF
1E6 08000000
1E7 00001000
1E8 C1C93008
1E9 08000000

When I dump this section of ROM in LONG form, I get (with hub addresses):

F000: 02CAFFFF 0130CD09 1A70690D 0610C909 C29C4D23 0214C90D 0214C90D 0438C989
F020: C298CD63 0030C909 37506B05 C094CF2B 024ACB09 3548F0C1 03344D09 0334C901
F040: 2E14C909 06155D81 063CC901 063CCD41 2C304B09 04304F41 3D70EF81 031159A1
F060: 0220CF91 0622CB81 0008D9C1 6F094B95 E6639992 1AB2072A D99CF09B 36706FC5
F080: 37486F05 C694CB2B F5C86F27 C694CB2B F7C86F27 C694CB2B F5CC6D27 F5F46D2F
F0A0: C694CB2B 2848D9C5 C294492B C0924F2B 2D08B4C1 79DD9AE6 37506905 C094CF2B
F0C0: 02100808 0638C941 36506D0D 02104D41 00304D09 0A30C941 2008BDD0 F6F06B2F
F0E0: F4DC6927 F1CC6F27 C690CB2B C2944F2B F4F4662F C094422B 2008D9C1 79DDDBE7
F100: 3359DDE5 0018C408 000ACD09 C594432A 2230DF91 284891C5 2008D8C0 2D4894C1
F120: E39C8A2A 284898C4 3359DDF5 7CDDD2E7 00344409 78DD9BE6 E29CCB2B 37546D45
F140: 00064D89 00305B09 04105649 04100F08 00185649 00100F08 FDD8DFC7 040A0509
F160: 06144909 021C4D09 020E0009 03344009 02148D08 0320CB91 37442545 E394CB2B
F180: 7CDD92E7 2C08B5C1 0448D9C1 79DD93E7 C29C022B F5F0602E E8B4622E 2008D0C1
F1A0: 7CDDDAE6 00344909 0038CD09 06144901 0014CD09 336C85C9 00104E00 0210CC08
F1C0: 033C48C0 2324E8CC 2128EADC 2478E989 78DDDBE7 310898C0 202CC9C9 0322CB91
F1E0: 0630CF81 02340509 020A4808 04300A08 00188E08 01380E40 03100808 01100808
F200: 0030DF09 01384F41 0638CF41 01105DE3 0230CFC1 01145FC3 BFC52624 02144C40
F220: 03104808 02044941 01104909 153CC989 2008D9C1 0026CD81 0320CB91 00201589
F240: 00285C88 2008D9C1 7CDD92E7 00088041 02008041 01008001 79DDDBE6 0408C440
F260: 01088540 003098C8 2008D9C1 7CDD9AE6 00281CC8 200898C0 79DDDBE7 0830C9C1
F280: 0822C991 3C48D9C1 3508B9C0 78DD93E7 28342185 083C8141 2A302185 06388141
F2A0: 063001C1 201881C1 003005C1 041A8141 011AC840 00388D08 1C08F9C1 79DDDBE7
F2C0: 050A8141 050ACC40 11707D85 0320CB91 0148D9C1 31502545 79DDDAE6 F3F46F2F
F2E0: C69CCB2B F5DC6F27 C69CCB2B F7D86F27 C69CCB2B F5D86F27 C69CCB2B F7DC6D27
F300: C6A6C0A3 1D08F0C1 C2A4CDB3 EAB469A7 0148D9C1 01304D09 0918ED09 2848D1C5
F320: 37506905 C094CF2B 00384D01 0222E981 00188D08 0262C981 11782DC4 2848D9C5
F340: 0F18EF09 35546905 C094CF2B 2038C941 043049C1 1A78CB89 2848D9C5 35506905
F360: C094CF2B 02304B09 0330CF41 1070E9C5 42905983 429C5983 409059A3 2848BCC4
F380: 79DDD2E7 02304009 7CDD9BE6 2D30CD81 2F30CD81 02344B09 021E4F09 073C4F49
F3A0: 021CCF09 C0B0442B C0B0092A D77D7FA7 57FC7DA7 2330CBC1 2848FDC5 120ACB09
F3C0: 29489DC4 124AEB09 0448B9C4 01344D19 2D34CD81 44B14B21 44BB4B21 79DDD2E7
F3E0: 7CDD9BE6 003845C1 023841C1 003845C1 0234C909 0824C991 2C08F4C5 2508B1C5
F400: 2026C981 0508BDC4 0126C981 0008C019 117874C5 0208C019 137874C5 0908F4C5
F420: 0030EB09 00324949 2134D981 02324D59 013C89C0 0426C981 2C0898C4 0032CD41
F440: 02160D08 0236DD01 0032DD01 2538CF8D 0124C991 063C4C08 00384818 021CCC08
F460: 0024CD91 04304C08 2848D9C5 0232CD51 0234DD01 2C38CA8C 04324D89 0238DD01
F480: 0234CD41 2938CF8D 2026C981 00305909 0126C981 0024CD91 00301509 2848D9C5
F4A0: 00024D99 01308109 0430C808 00308D08 04384991 11707DC5 2848D9C5 250CCD19
F4C0: 02380509 06344909 2334C981 04144D09 2C0EC909 091CC009 043CCD41 0054CD09
F4E0: C61159E1 BBC52224 00304D01 2848D9C5 2026C981 0448D8C4 0024CD91 00384409
F500: 0018C408 00300D48 2848D9C5 0424C991 244899C4 0030A109 023A8541 0178848D
F520: 0030CC08 00384C40 2848D9C5 0030C909 08308309 023A8541 023C9501 023A8541
F540: 023C9501 04388141 003085C1 02360189 04308141 00389501 2178838D 000A4C08
F560: 11787C84 35406D45 C394CB2B 0066CB81 2008D9C1 FFAF9F79 0630C909 03344F09
F580: 2B34CB81 0220EB91 0C48F9C5 03304D09 0330CD41 0630CD81 0A22CB81 79DD9BE6
F5A0: 00380D40 2222CB81 0320CB91 C39C822B 37542645 48D5DDC2 3759D9E4 2222CB81
F5C0: 07344B08 02144B09 3654660C 3354230C 33542E0C 0638CD41 C0115DE3 0230CDC1
F5E0: 40945FC3 0630CF81 0F19DDE1 114894C5 79DDD2E6 1D08F4C4 05300C08 11300C80
F600: 284898C4 1E59DDE5 2D64EB91 79DD93E7 1908B1C5 53DDDDE5 0024EB91 05340908
F620: 5EDDBDE0 C39CCB2B 1908B9C4 0826CB81 3408FCC4 0124CB91 1108B9C4 2026CB81
F640: 3C089DC4 4699DFA3 78DDDBE7 4299DBA3 C394832B 02348509 3359DDE5 00064D89
F660: 04105D09 FDD8DFC7 C594422B 1D08F9C5 0010CDC1 0230E909 0D34CD09 0034CC00
F680: 04224981 00309501 0038DC00 3F3ACD8D 1C08FDC5 2026CB81 28388041 28309041
F6A0: 0838C040 0830D040 11783DC4 1C08FDC5 0010DD09 0126CB81 0424CB91 05381441
F6C0: 05381441 3D70E480 11300580 2026CB81 35406445 1026CB81 C3948B2A 6295D9A1
F6E0: 2322CB91 05000541 200891C1 003CC409 053C5441 307C6045 117C64C5 05385441
F700: 05385441 30786045 3D746485 043040C1 C2B54B2B 2008D9C1 2322CB91 05100541
F720: 284891C5 3D706D0D 30706045 05385441 05385441 2848D9C5 48D5FDC3 35546B05
F740: C094CF2B 1026CB81 283CCF41 2834DF41 35502B04 C0948F2A 203C8B40 01340BC0
F760: 2808D9C1 C39CCB2B 35546D45 C39CCB2B 37506D45 C39CCB2B 35506D45 0008D9C1
F780: 003C0DC0 063809C0 003C0DC0 00024D89 00064089 0008D9C1 FFFFFFFF 00000008
F7A0: 00100000 0830C9C1 00000008 35502B04 35406445 03344D09 050ACC40 2848D9C5
F7C0: C29C022B 0024CD91 00301509 C2B54B2B 021E4F09 2178838D 063809C0 2134D981
F7E0: 1D08F0C1 7CDD92E7 043049C1 124AEB09 05100541 00100000 2026CB81 FDD8DFC7

Note that your 08000000 comes out 00000008.

As I've pointed out before, given the number of times certain longs are repeated, there's almost no chance that any kind of cumulative enciphering is afoot. This means that the same transformation is being applied to each long.
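
The repetition argument is easy to check on a slice of the dump. The sketch below counts repeated longs in three rows copied from the listing (hub $F320 to $F37C); the choice of rows is arbitrary and only for illustration.

```python
from collections import Counter

# Three consecutive rows copied from the dump (hub $F320-$F37C).
rows = """
37506905 C094CF2B 00384D01 0222E981 00188D08 0262C981 11782DC4 2848D9C5
0F18EF09 35546905 C094CF2B 2038C941 043049C1 1A78CB89 2848D9C5 35506905
C094CF2B 02304B09 0330CF41 1070E9C5 42905983 429C5983 409059A3 2848BCC4
"""

longs = rows.split()
counts = Counter(longs)
repeats = [(value, n) for value, n in counts.most_common() if n > 1]
print(repeats)  # [('C094CF2B', 3), ('2848D9C5', 2)]
```

With a chained or address-dependent cipher, identical source longs at different addresses would almost never scramble to identical values, so repeats this frequent point to one fixed transformation per long.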

I've done some analysis on consecutive groups of 6 bits, using all possible rotations of the long and all possible 6-bit XOR patterns, in both forward and reverse order. What I was looking for was a reasonable instruction distribution (i.e. no MUL, MULS, ENC, or ONES; few, if any, WAITxx; lots of JMPRETs (same as JMPs)). So far, nothing stands out. Here's a sample of the program's output:

ROT = 26   XOR = $027
sumnz   : 146
sumz    : 52
negnz   : 27
negnc   : 22
jmpret  : 21
abs     : 18
djnz    : 16
subabs  : 15
sumnc   : 13
movi    : 13
negc    : 13
or      : 12
absneg  : 8
add     : 7
cmpsub  : 7
mov     : 5
subsx   : 5
sub     : 5
negz    : 5
sumc    : 4
addsx   : 3
muxnz   : 3
and     : 3
xor     : 2
muxnc   : 2
subx    : 1
min     : 1
addabs  : 1
muxz    : 1
waitpeq_: 1
subs    : 1
rol     : 1
addx    : 1
maxs    : 1
waitvid_: 1
cmps    : 1
ror     : 1

This, of course, is not a real candidate, since there are no hub instructions. Obviously, I need to hone my filter somewhat.
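
For readers who want to try the same kind of sweep, here is a rough Python sketch of the idea: rotate each long, take the top six bits as the opcode field, optionally reverse them, XOR with a 6-bit pattern, and tally the result. It is not Phil's program; the order of the reverse and XOR steps, and every name below, are assumptions. The `accept` predicate is supplied by the caller.

```python
from collections import Counter

def rotl32(value, amount):
    """Rotate a 32-bit value left by amount bits."""
    amount %= 32
    return ((value << amount) | (value >> (32 - amount))) & 0xFFFFFFFF

def reverse6(field):
    """Reverse the bit order of a 6-bit field."""
    return int(f"{field:06b}"[::-1], 2)

def opcode_candidates(longs, rot, xor, reverse=False):
    """Histogram of 6-bit opcode values for one rotation/XOR setting."""
    counts = Counter()
    for value in longs:
        field = rotl32(value, rot) >> 26
        if reverse:
            field = reverse6(field)
        counts[field ^ xor] += 1
    return counts

def search(longs, accept):
    """Yield every (rot, xor, reverse) whose histogram passes accept()."""
    for rot in range(32):
        for xor in range(64):
            for reverse in (False, True):
                counts = opcode_candidates(longs, rot, xor, reverse)
                if accept(counts):
                    yield rot, xor, reverse, counts
```

The sweep covers only 32 x 64 x 2 = 4096 settings, so the cost is dominated by how strict the acceptance test is, not by the search itself.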

It's still possible that the target of a MOVI, for example, could be encoded as a MUL, say, to throw people off the scent. I would have to look at the bytecode list to determine if there's any correspondence between their bit patterns and those of the actual ASM instructions they represent. If so, a MOVI in the interpreter would make perfect sense. In any event, I suspect, as you do, that some bit-swapping/nybble-swapping/byte-swapping is taking place. It would be easiest to analyze if the swapping were local, but that 00000008 suggests otherwise and is what originally put me on the trail of rotations. The FFFFFFFF that precedes it suggests that there is no XORing taking place at all. So my next thrust will be looking at likely bit permutations.

-Phil

hippy · 2008-02-21 20:18

Yes, my endianness is probably wrong ( I just pulled the data as is out of the SPIN.HEX file posted elsewhere ). That shouldn't matter if bit-swapping is the key, but it would help if we all use the same raw data and agree which bit is which.

On your dump, don't forget that $F000-$F002 are part of the trig tables, and the Interpreter starts at $F004. I don't see any reason why the Cog wouldn't load 496 longs from $F004 onwards, thus no lsb-to-lsb mapping of ROM to Cog address ( for loader at $F800 there would be ). What's at 'unused' $F002..$F003; no idea ( is that the $02CA or $FFFF - unprogrammed ? )

I'm doing my decoding by hand based on counts of bits-set and guessing what the instruction could be. That at $003 could be a "MOV $003,#0", although there are other dst/src possibilities. If so it looks like bit 25 ( my endian ), bit 8 ( your endian ) is the I, immediate, bit. After that it gets more complicated; RDWORD, JMP, CALL ?

I've been working on how many bits are set or cleared compared with the suspected "MOV $000,#5" and which possible instruction those changes could give. That's still a lot of possibilities for each instruction, meaning scouring for changed-by-N bits instructions doesn't seem worthwhile yet. It is however possible to say what instructions something cannot be through too few bit changes. If the I bit mapping has been identified, that allows encrypted values of "MOV $000,#$000" and "MOV $000,$000" to be determined, a reasonable basis to base further bit-changes from.

Looking at where the bit changes are if it is as I've guessed for $000 and $001, it's 'random' bit swapping although some pattern could emerge. Handily consecutive bits, byte or nibble, unfortunately I don't think so.

Unfortunately, I haven't come up with a strategy for attempted decoding which can be automated yet.

I looked at Spin bytecode function to PASM opcode mappings which would make them easier to decode and handle using MOVI within the interpreter and I didn't see any obvious one-to-one mappings.

hippy · 2008-02-21 20:26

The ROM dump ( same endianness as Phil's ), Cog address and one long per line ...

Phil Pilgrim (PhiPi) · 2008-02-21 20:43

I don't use the value at F000. It's just there to keep the columns lined up. The FFFF is part of the sine table. 'No idea what the 02CA is for. One thing I forgot to mention is that when I do my analysis, I ignore that part of the code which comes near the end, since it's probably data and not instructions.

A peculiar thought occurred to me: Suppose part of the trig table, when interpreted as code, was actually able to write something to hub memory. One could then start a cog beginning in the trig table that would include part of the interpreter code. It would be a huge coincidence, but if just one or two longs from the interpreter code got written out to hub memory as a result, that could provide a huge clue as to how the bits are scrambled. Update: Never mind. There's nothing in the 512 longs preceding F004 that starts with 0 in the upper nybble.

Also, even though there are 32! bit permutations, one wouldn't have to examine nearly that many. For the i-field, there are "only" 32! / 26! possibilities -- still a large number, but accessible for exhaustive analysis on a fast computer. By partitioning opcodes into groups for frequency analysis, that number could be reduced even further. Once the opcode bits have been determined, any pattern extant in their selection might provide a clue for the other 26 bits.
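
The size of that reduced search is the number of ordered ways to choose source positions for the field out of 32 bits. A small Python sketch, with a four-bit field added purely as a hypothetical comparison (`math.perm` needs Python 3.8 or later):

```python
import math

six_bit_field = math.perm(32, 6)    # 32! / 26!
four_bit_field = math.perm(32, 4)   # 32! / 28!, hypothetical narrower field

print(f"{six_bit_field:,}")         # 652,458,240
print(f"{four_bit_field:,}")        # 863,040
```

Each bit dropped from the field divides the search by the number of positions still unassigned, so even a modest narrowing pays off quickly.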

-Phil

Post Edited (Phil Pilgrim (PhiPi)) : 2/21/2008 9:24:38 PM GMT

VIRAND · 2008-02-21 22:41

If there is no XOR and only permutations in the scrambling, then checksums can find candidates for certain instructions.
Then guessing a couple of instructions correctly ... quickly decreases the number of possible permutations,
and rapid successive approximation may happen as testing remaining permutations reveals more good instructions.

Another totally different idea:
Maybe the spin interpreter is a double interpreter,
so it squeezes more instructions in 496 longs.
Also, does the bootloader need all the memory reserved for it?
(Doesn't it have to be Spin bytecode to be able to run at all outside a cog? Maybe I missed something.)

cgracey · 2008-02-21 22:54

If one of you guys was to make an amazing determination, would you want me to confirm it, or stay silent?

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔

Chip Gracey
Parallax, Inc.

CJ · 2008-02-21 23:06

I'm sure a confirmation would be greatly appreciated and boost enthusiasm

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Parallax Forums - If you're ready to learn, we're ready to help.

Phil Pilgrim (PhiPi) · 2008-02-21 23:33

Chip,

I assume, since you asked, that perhaps one of us has determined something. As much as I'd prefer not to know just yet what it might be, I'll happily defer to the majority, although it may have to wait until morning for Hippy in the UK. (I know you're just dying to tell us!)

-Phil

Phil Pilgrim (PhiPi) · 2008-02-22 00:01

The Propeller instructions divide nicely into 16 groups of four related opcodes which differ among themselves in their two LSBs. Therefore it's possible to perform a reasonable opcode frequency analysis on these 16 groups, rather than on all 64 opcodes. This reduces the permutation search space from 652,458,240 (32! / 26!) to 863,040 (32! / 28!) instances.

I've run some analyses over the latter space, hoping to find candidate permutations bearing some sort of pattern. My criteria were the following, relating to the number of instructions found:

    MUL group = 0
    WAIT group < 5
    MOVS group (includes JMPRET, JMP, RET) > 20
    MOV group > 20
    RDBYTE group > 20
    Any single group < 128
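
Written as a predicate, the criteria look roughly like the Python sketch below. The dictionary keys are hypothetical labels for the opcode groups; only the thresholds come from the list.

```python
def accept(group_counts):
    """Return True when a histogram of opcode groups passes every criterion.

    group_counts maps a group label (hypothetical names) to the number
    of longs that decoded into that group.
    """
    def count(name):
        return group_counts.get(name, 0)

    return (
        count("mul") == 0
        and count("wait") < 5
        and count("movs") > 20      # includes JMPRET, JMP, RET
        and count("mov") > 20
        and count("rdbyte") > 20
        and max(group_counts.values(), default=0) < 128
    )

plausible = {"mul": 0, "wait": 2, "movs": 40, "mov": 30, "rdbyte": 25, "add": 100}
print(accept(plausible))                    # True
print(accept({**plausible, "mul": 3}))      # False
```

A filter like this can only rank candidates; data words mixed in with the code can push a correct permutation outside the thresholds.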

I found plenty of candidates, but no permutations bearing an obvious or simple pattern. Assuming that permutation is the sole obfuscation method used, this could be due to a couple factors:

1. Chip's chosen permutation employs a non-regular pattern.
2. There is data interspersed among the instructions that code into the MUL group, thereby disqualifying the permutation that produced them.

-Phil

CJ · 2008-02-22 00:01

I just put the first one into big endian and ended up with

IF_C_EQ_Z RDLONG $14C, #$001 WC

reading the global clock speed?

cgracey · 2008-02-22 00:05

Phil Pilgrim (PhiPi) said...
Chip,

I assume, since you asked, that perhaps one of us has determined something. As much as I'd prefer not to know just yet what it might be, I'll happily defer to the majority, although it may have to wait until morning for Hippy in the UK. (I know you're just dying to tell us!)

-Phil

Well, I don't want you guys to suffer burnout. I'd hate to see you run too far down the wrong road and get all fatigued and give up. Then, I'd never have the pleasure of showing you my code. Aaaaaaah.... I guess we can all wait.

Should I quit with the psyops? Is it helping or hurting?

tpw_man · 2008-02-22 00:58

Does the interpreter use up every long in a cog, or is there room for a loop to copy its (decrypted) self into hub memory for dumping.

I believe I would need four longs to do it.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
I am 1010, so be surprised!

CJ · 2008-02-22 01:03

I was just playing with changing the listing to big endian and disassembling it, starts with

IF_C_EQ_Z RDLONG $14C #$001 WC, reads the global clock speed
IF_Z HUBOP $0B8 #$01A WC NR , equates to a nop without Z set
then a rdlong of long 1 which includes the global clock register,

this is sounding like a good start for an interpreter that needs to know how fast it is going

Phil Pilgrim (PhiPi) · 2008-02-22 01:15

Chip,

The wrong turns are part of the fun. But please feel free to taunt us all you want!

I'm curious, though: why the sudden decision to release the interpreter code? I always assumed it would be locked in the vault forever with the BASIC Stamp interpreters.

-Phil

CJ · 2008-02-22 01:22

Phil: I suspect that the openness is from the fact that without the propeller chip itself the interpreter is useless

hippy · 2008-02-22 01:24

Looking promising but a guesstimate here rather than every instruction confirmed - it's a real brain drainer doing this by hand !

000 : MOV    $000,#5      count   mov     count,#5      ' How many words
001 : MOV    $001,PAR     ptr     mov     ptr,PAR       ' Where from minus 2
002 : ADD    $001,#2      Loop    add     ptr,#2        ' Point to first/next
003 : RDWORD $???,$001    LoopDst rdword  ?,ptr         ' Read word
004 : ADD    $003,#$100           add     LoopDst,#$100 ' Update dest
005 : ADD    $003,#$100           add     LoopDst,#$100
006 : DJNZ   $000,#$002           djnz    count,#Loop   ' Repeat 5 times
007 : CALL   #?
008 : MOV    $000,$000

Not at all confident about 007 and onwards.
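
One sanity check on the guesses for $000 to $006 is that each guessed instruction must have exactly as many set bits as the scrambled long at that address. The Python sketch below does that comparison. It assumes the usual Propeller 1 instruction layout, takes the scrambled values from Phil's dump, and assumes Cog $000 is the long at hub $F004; $003 is left out because its destination is unknown. All names are illustrative.

```python
def encode(opcode, zcri, cond, dst, src):
    """Pack the five instruction fields into one 32-bit long."""
    return (opcode << 26) | (zcri << 22) | (cond << 18) | (dst << 9) | src

def popcount(value):
    return bin(value).count("1")

IF_ALWAYS, PAR = 0b1111, 0x1F0
MOV, ADD, DJNZ = 0b101000, 0b100000, 0b111001
WR_IMM, WR_REG = 0b0011, 0b0010   # ZCRI: write result, immediate / register src

checks = [
    ("$000", 0x0130CD09, encode(MOV, WR_IMM, IF_ALWAYS, 0x000, 5)),
    ("$001", 0x1A70690D, encode(MOV, WR_REG, IF_ALWAYS, 0x001, PAR)),
    ("$002", 0x0610C909, encode(ADD, WR_IMM, IF_ALWAYS, 0x001, 2)),
    ("$004", 0x0214C90D, encode(ADD, WR_IMM, IF_ALWAYS, 0x003, 0x100)),
    ("$005", 0x0214C90D, encode(ADD, WR_IMM, IF_ALWAYS, 0x003, 0x100)),
    ("$006", 0x0438C989, encode(DJNZ, WR_IMM, IF_ALWAYS, 0x000, 0x002)),
]

results = [(addr, popcount(scrambled), popcount(guess))
           for addr, scrambled, guess in checks]
for addr, scrambled_bits, guess_bits in results:
    print(addr, scrambled_bits, guess_bits)
```

All six pairs agree (10, 13, 9, 10, 10 and 11 bits). Matching counts do not prove a guess, but a mismatch would rule it out under a pure permutation.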

@ Chip : I don't think it's burn-out you should worry about, more damage to nations' entire economies while engrossed

I think we'd all hate to have the secret revealed if we thought we were on the verge of cracking it, but I don't think a "you're wildly off-track / have missed something major" would go amiss to save too much wasted time ( after a reasonable enough period to rescue ourselves ).

@ tpw_man : The hard part is getting the dump loop into the Cog running the interpreter and then getting it to run.

@ CJ : Maybe, but I cannot really imagine Chip having written code other than straight-forward, and the Interpreter doesn't really need to know how fast it's running.

Post Edited (hippy) : 2/22/2008 1:31:06 AM GMT

Paul Baker · 2008-02-22 01:48

He hasn't said he'll release the code, I think he was referring to someone cracking it to obtain a copy.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Paul Baker
Propeller Applications Engineer

Parallax, Inc.

Phil Pilgrim (PhiPi) · 2008-02-22 02:21

I dunno, Paul; this seems pretty clear-cut to me:

Chip Gracey said...
... but if someone posts the correct binary, I'll post the original source code.

Anyway, the quest continues...

-Phil

cgracey · 2008-02-22 03:27

Phil Pilgrim (PhiPi) said...
Chip,

The wrong turns are part of the fun. But please feel free to taunt us all you want!

I'm curious, though: why the sudden decision to release the interpreter code? I always assumed it would be locked in the vault forever with the BASIC Stamp interpreters.

-Phil

Well, if the code CAN be extracted, I'd like to know about it sooner than later, because I'd like to gauge the effectiveness of this current scheme. Whether or not you guys crack this ROM, we are all learning some interesting things here. And there's nothing like experience to shatter or cement our convictions.

You guys are certainly exhibiting a lot more intuition than I would have, myself, or I would have imagined from others. So, it's confirmed to me that in a case like this, it takes a fractional amount of effort to design the obfuscation, compared to what it takes to reverse it. I'm also learning how an attacker might think, and that's very useful to know, as some simple, subtle departures from likelihood could really derail their efforts.

Post Edited (Chip Gracey (Parallax)) : 2/22/2008 3:33:25 AM GMT

Phil Pilgrim (PhiPi) · 2008-02-22 03:42

Chip,

Ah, so. This is all about the Prop II, isn't it?

One summer I worked on a salmon troller in Alaska. The skipper was really fussy about how the food on board was handled. In particular, he had a rule about bread: never take the heel; always leave it on the end of the loaf. I made the comment: "So, we sacrifice the end piece and let it go stale to save the rest of the loaf?" "Yes," he replied, "and so it is in life." Shall I infer, then, that the Prop I interpreter code is merely the heel here and that the upcoming Prop II code is the next slice in the loaf?

-Phil

Bob Lawrence (VE1RLL) · 2008-02-22 03:57

You may be right Phil but that's a fairly crummy analogy you gave.

VIRAND · 2008-02-22 04:04

I'm going to make a wild guess that the ROM doesn't even have 100 longs of PASM in it.

hippy · 2008-02-22 04:04

I don't know if I am on the right track but the earlier code passes verification and I'm now partially automated !

If it is correct, I've identified 11 distinct bit mappings, but most importantly 4 of the 6 opcode bit mappings which has narrowed down each instruction to being one of four potentials.

The technique I'm using is to take the encrypted long and determine where each bit could not be in what I think the result is. By removing those from the list of possibilities, what's left are those bits which it could be, and when there's only one it could be the mapping is determined. I am surprised that just 5 unique opcodes revealed 35% of the total mapping, and 66% of the opcode mapping.
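
The elimination idea can be sketched in Python as follows. The sample set is an assumption about which guesses were fed in: the scrambled longs for Cog $000 to $006 from Phil's dump (taking $000 as hub $F004), paired with the guessed instructions from the earlier listing, with the RDWORD destination field marked as unknown. The encodings assume the usual Propeller 1 instruction layout, and all names are illustrative.

```python
def encode(opcode, zcri, cond, dst, src):
    return (opcode << 26) | (zcri << 22) | (cond << 18) | (dst << 9) | src

ALL_BITS = 0xFFFFFFFF
DST_FIELD = 0x1FF << 9
IF_ALWAYS, PAR = 0b1111, 0x1F0
MOV, ADD, DJNZ, RDWORD = 0b101000, 0b100000, 0b111001, 0b000001

# (scrambled long, guessed instruction, mask of guessed bits we trust)
samples = [
    (0x0130CD09, encode(MOV, 0b0011, IF_ALWAYS, 0, 5), ALL_BITS),
    (0x1A70690D, encode(MOV, 0b0010, IF_ALWAYS, 1, PAR), ALL_BITS),
    (0x0610C909, encode(ADD, 0b0011, IF_ALWAYS, 1, 2), ALL_BITS),
    (0x0214C90D, encode(ADD, 0b0011, IF_ALWAYS, 3, 0x100), ALL_BITS),
    (0x0438C989, encode(DJNZ, 0b0011, IF_ALWAYS, 0, 2), ALL_BITS),
    (0xC29C4D23, encode(RDWORD, 0b0010, IF_ALWAYS, 0, 1), ALL_BITS & ~DST_FIELD),
]

def candidates(samples):
    """For each encrypted bit, keep the instruction bits it could still be."""
    result = {}
    for e in range(32):
        result[e] = {
            p for p in range(32)
            if all(not (known >> p & 1) or (enc >> e & 1) == (plain >> p & 1)
                   for enc, plain, known in samples)
        }
    return result

cand = candidates(samples)
resolved = {e: next(iter(ps)) for e, ps in cand.items() if len(ps) == 1}
print(len(resolved), resolved)
```

Under these assumptions the sketch resolves 11 of the 32 encrypted bits, four of them opcode bits, which lines up with the 35% and 66% figures quoted in the post.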

Unfortunately I've run into a design problem and need a code rewrite. I'm mapping encrypted bits to where they should be in the result, what I should be doing is setting each result bit from the encrypted bits. Even if a result bit could come from any number of encrypted bits, as long as all those are the same ( 1 or 0 ) the result bit is still known and that should provide a lot more information to guess at what each instruction is, maybe even fully reveal many.

I'm half expecting this to fall flat on its face at some point but it's looking good so far.

Bit mappings I think there are so far ...

Encrypted Bit / Actual Instruction Bit

26    1
25    9
24    2
21    29
19    26
18    10
15    22
10    0
7     30
3     31
2     8
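
Applying the eleven resolved pairs to a scrambled long gives a partial decode in which every other bit is still unknown. A Python sketch, assuming the scrambled value is read in the same endianness as Phil's dump; `partial_decode` and `show` are illustrative names.

```python
# Encrypted bit -> actual instruction bit, copied from the table.
MAPPING = {26: 1, 25: 9, 24: 2, 21: 29, 19: 26, 18: 10,
           15: 22, 10: 0, 7: 30, 3: 31, 2: 8}

def partial_decode(encrypted, mapping=MAPPING):
    """Return (value, known): the decoded bits and a mask of which are known."""
    value = 0
    known = 0
    for enc_bit, plain_bit in mapping.items():
        known |= 1 << plain_bit
        if encrypted >> enc_bit & 1:
            value |= 1 << plain_bit
    return value, known

def show(value, known):
    """Render bit 31 down to bit 0, with '?' for bits not yet mapped."""
    return "".join(str(value >> b & 1) if known >> b & 1 else "?"
                   for b in range(31, -1, -1))

value, known = partial_decode(0x0438C989)   # scrambled long at Cog $006
print(show(value, known))
print(f"{value:08X} {known:08X}")           # E4400002 E4400707
```

For the long at $006 the known opcode bits (31, 30, 29 and 26) all come out as 1 and the I bit is set, which is consistent with the DJNZ guess (opcode 111001) for that address.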

The full Encrypted Bit to where it could be in Actual Instruction mapping -

 31           ------ ---- ---- rqponml-- ---------
 30           ------ ---- ---- rqponml-- ---------
 29           ---CB- zy-- ---- rqponml-- -----d---
 28           ------ ---- ---- --------- -hgfe----
 27           ------ ---- ---- --------- -hgfe----
 26           ------ ---- ---- --------- -------b-
 25           ------ ---- ---- --------j ---------
 24           ------ ---- ---- --------- ------c--
 23           ------ ---- ---- rqponml-- ---------
 22           ------ ---- ---- --------- -hgfe----
 21           --D--- ---- ---- --------- ---------
 20           ------ --x- vuts --------- ---------
 19           -----A ---- ---- --------- ---------
 18           ------ ---- ---- -------k- ---------
 17           ---CB- zy-- ---- rqponml-- -----d---
 16           ---CB- zy-- ---- rqponml-- -----d---
 15           ------ ---w ---- --------- ---------
 14           ------ --x- vuts --------- ---------
 13           ------ ---- ---- --------- -hgfe----
 12           ---CB- zy-- ---- rqponml-- -----d---
 11           ------ --x- vuts --------- ---------
 10           ------ ---- ---- --------- --------a
 9            ---CB- zy-- ---- rqponml-- -----d---
 8            ------ --x- vuts --------- ---------
 7            -E---- ---- ---- --------- ---------
 6            ---CB- zy-- ---- rqponml-- -----d---
 5            ------ ---- ---- rqponml-- ---------
 4            ---CB- zy-- ---- rqponml-- -----d---
 3            F----- ---- ---- --------- ---------
 2            ------ ---- ---- --------- i--------
 1            ------ ---- ---- rqponml-- ---------
 0            ------ --x- vuts --------- ---------

Phil Pilgrim (PhiPi) · 2008-02-22 04:17

Hippy,

You're still up? It's after four a.m. there!

I'm in the middle of a new permutation analysis. I ran the floating point code through a frequency analysis (4 MSBs of i-field), and came up with the following (each mnemonic being the first in its 4-tuple):

    movs: 38.6%
     mov: 23.6%
     and:  9.9%
     add:  9.3%
     ror:  7.9%
  wrbyte:  2.0%
     rcr:  1.8%
    cmps:  1.3%
    mins:  1.1%
  cmpsub:  0.9%
    negc:  0.7%
    sumc:  0.2%

This changes my acceptance filter considerably, although the FP code may be peculiar as far as frequencies go...

-Phil

cgracey · 2008-02-22 04:20

Phil Pilgrim (PhiPi) said...
Chip,

Ah, so. This is all about the Prop II, isn't it?

One summer I worked on a salmon troller in Alaska. The skipper was really fussy about how the food on board was handled. In particular, he had a rule about bread: never take the heel; always leave it on the end of the loaf. I made the comment: "So, we sacrifice the end piece and let it go stale to save the rest of the loaf?" "Yes," he replied, "and so it is in life." Shall I infer, then, that the Prop I interpreter code is merely the heel here and that the upcoming Prop II code is the next slice in the loaf?

-Phil

I like that bread story, Phil. That guy sounds pretty wise. I was feeling pretty wise until about an hour ago, even talking like him.

It's now looking like my protection gizmo wasn't very effective, so I'll need to enhance it for the next Propeller.

hippy · 2008-02-22 04:33

Chip Gracey (Parallax) said...
I'm also learning how an attacker might think, and that's very useful to know, as some simple, subtle departures from likelihood could really derail their efforts.

If there were bit inversions, I'd probably throw in the towel. If bit-swapping or bit inversions were related to ROM address I almost certainly would. TBH, it's only because you revealed it was tending towards simplistic that I even gave it a go.

I have my towel ready in case I'm already entirely on the wrong track.

If there were a 'secret flag' in the Cog/Decoder set when CogInit activated and that enabled decryption for the interpreter load and returned garbage or zeroes when $F004-$FFFF were read otherwise ( by Spin or PASM ), we'd not have anything at all to even start with.
