just thinking out loud on your comment.. "Just keep looking for something to come out that looks statistically like PASM instructions or Spin byte codes." ....
What if the hashed code result in the EEPROM after encryption IS/ARE valid PASM instructions? Just not the instructions that were originally intended.
What I mean... suppose the encryption ONLY scatters the instruction mnemonic of the opcode, so that visually it may look like valid PASM code, but if you were to follow the flow of the code, it would not make any sense.
This is almost certainly worse than encrypting all of the code, because it gives the attackers more clues -- they can see the source and destination of all the instructions, and based on that can make some good guesses about what the instructions are doing. It also drastically reduces the search space for brute force attacks; now instead of having to work with all 32 bits of the instruction the attacker just has to go after the 6 bits of the instruction field.
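To make the point about exposed fields concrete, here is a minimal sketch. It assumes the Propeller 1 instruction layout (6-bit instruction field in bits 31..26, 9-bit destination in bits 17..9, 9-bit source in bits 8..0) and uses a made-up scrambler, `scramble_opcode_only`, that XORs only the instruction field with a hypothetical 6-bit key. It is not the scheme anyone in this thread actually proposed, just an illustration of what stays visible.

```python
# Assumed Propeller 1 instruction layout (32 bits):
#   [31:26] instruction  [25:22] ZCRI  [21:18] condition  [17:9] dest  [8:0] src

def encode(instr, zcri, cond, dest, src):
    """Pack the five fields into one 32-bit long."""
    return (instr << 26) | (zcri << 22) | (cond << 18) | (dest << 9) | src

def scramble_opcode_only(word, key6):
    """Hypothetical scheme: XOR only the 6-bit instruction field with a key."""
    return (word ^ ((key6 & 0x3F) << 26)) & 0xFFFFFFFF

def fields(word):
    """Split a long back into the fields an attacker would look at."""
    return {
        "instr": (word >> 26) & 0x3F,
        "dest": (word >> 9) & 0x1FF,
        "src": word & 0x1FF,
    }

# mov dira, #1  (mov = %101000, dira = $1F6, immediate source)
plain = encode(0b101000, 0b0011, 0b1111, 0x1F6, 1)
scrambled = scramble_opcode_only(plain, 0x15)

# Destination and source survive untouched, so "something writes to dira"
# is still readable; only 2**6 = 64 keys need trying per instruction field.
print(fields(plain))
print(fields(scrambled))
print("keys to try:", 2 ** 6, "instead of", 2 ** 32)
```

The destination register (here `dira`) is still in plain sight after scrambling, which is exactly the kind of clue described above.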
Actually, it would be about the same, assuming that each manipulated bit in the instruction represented a valid instruction and there were no holes. The fact that it would 'look' like a valid instruction would add to the non-deterministic complexity of trying to decipher it, because the hash produces valid code commands, each branch you follow in the code would send you on a goose chase.
Except a serious cryptographer would be looking at statistics of the decrypted code first. Generally speaking, references to code (self modification) will involve jumps or movi,movd,movs instructions, references to outa and dira will involve mov, or, or xor, references to ina will involve waitpxx or mov, and so on. Certain instructions will follow others much more often, and some instructions are common (mov, add) and others are rare (absneg, the lock instructions, and so on). All of this information is grist for the cryptographer's mill. Remember, the German Enigma code was not cracked because it was a bad algorithm per se: it was cracked because the way it was used "leaked" information about the messages, allowing the Allied cryptographers to drastically reduce the search space for keys.
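A rough sketch of the statistical approach described above: count how often each instruction field value occurs in a candidate decryption and prefer candidates dominated by opcodes that are common in real code. The opcode values (`mov` = %101000, `add` = %100000) are assumed from the Propeller 1 encoding, and the two candidate lists are invented for illustration. A real attack would weigh many more features, such as which registers each instruction touches and which instructions tend to follow which.

```python
from collections import Counter

INSTR_SHIFT = 26

def opcode_histogram(words):
    """Count how often each 6-bit instruction field value occurs."""
    return Counter((w >> INSTR_SHIFT) & 0x3F for w in words)

def plausibility(words, common_opcodes):
    """Fraction of longs whose instruction field is an opcode expected to be
    common in real code. Higher means 'looks more like code'."""
    if not words:
        return 0.0
    hist = opcode_histogram(words)
    return sum(hist[op] for op in common_opcodes) / len(words)

# Assumed Propeller 1 opcode values for two very common instructions.
COMMON = {0b101000, 0b100000}  # mov, add

# Two invented candidate decryptions (only the instruction field is filled in).
candidate_a = [0b101000 << 26, 0b100000 << 26, 0b101000 << 26, 0b010111 << 26]
candidate_b = [0b000111 << 26, 0b110011 << 26, 0b101000 << 26, 0b011110 << 26]

best = max([candidate_a, candidate_b], key=lambda ws: plausibility(ws, COMMON))
print(plausibility(candidate_a, COMMON), plausibility(candidate_b, COMMON))
```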
This thread is way over my head, but as for faking valid-looking PASM code to throw off inspection, couldn't such inspection still be performed automatically by computer analysis, such that human inspection was not required? I mean, while such a step would add another layer of protection, couldn't such a layer be defeated by statistical analysis of the generated output to detect a ruse? And while such output might get by one particular statistical scanning technique, it might not get by a scan using a different statistical technique. Statistical techniques could be designed or "weighted" in many ways. That's the problem, it seems, with trying to hide something: it only hides it from the angles the hider has considered, but, when looked at from another angle, such hiding might not work. So, one has to consider that when considering whether a hiding technique is worth the effort. Nevertheless, the above technique would be more than enough to throw the likes of me off, as I'd rather spend my time inventing something myself than figuring out a statistical method to break/steal someone else's effort (unfortunately, not everyone feels the same way).
I'll have to agree with some of the earlier comments in this thread: cryptography is hard! It's easy to come up with an algorithm which can withstand a casual attack, but coming up with an algorithm which can withstand a serious attack is very hard. You need experience and training to do this.
Moreover, if you're trying to sell into a professional market, you not only need to have a good algorithm, you need to convince your customers that you have a good algorithm. I know what to expect if you're using AES, SHA, or RSA, and I can have great confidence in these. To a lesser degree I can have confidence in algorithms like XXTEA and RC-4 which have been out there for a while and withstood attacks. But some custom algorithm? As a customer I'm going to have to basically assume that it is worthless. This seems harsh -- maybe you really have come up with a great algorithm. But virtually all homegrown algorithms have weaknesses. Unless I'm a cryptographer and have time and inclination to do the analysis myself, I have to assume the worst.
I think Chip is working out some last details before a shuttle run, he popped in to make sure he's headed in the right direction and doesn't make a bunch of $5,000 keychains.
All, I was just thinking out loud, sometimes that can be good, other times not. I am definitely not a cryptographer,
"Seems like any time Chip and Beau both show up on the forums a lot all at the same time it must mean that they have some spare time." --
@jazzed ... after hours :-) ... seriously 12 to 14 hour days since Mid November (Chip sometimes never sleeps) ... we gotta come up for air a little bit. ...and it's Saturday for goodness sake. :-)
Actually the process of LVSing and DRCing eats up CPU time where I am dead in the water on that particular machine during an LVS/DRC run. Fortunately I have two other machines that I can utilize my time on.... sometimes other layout that needs to be done, sometimes code writing, and sometimes answering forum questions... whatever needs to get done.
...but if you must ask. I am crossing the T's and dotting the I's so to speak.... ALL of the layout blocks are complete, there were a few that needed some aspect ratio adjustments so that they would fit in place, so there was a little re-work but not much. All of the wiring is complete at the top level of layout with exception of the core. The core is a synthesized block that Chip can work on until virtually the last minute. My next task is to generate what is called a frame view of the core. The frame view consists of a border or frame, and a piece of metal with associated text for ALL the nets (several thousand) brought right up to the edge of the frame view border. This essentially creates a "black box" where all of the core circuitry will connect to, and vice versa, where all of the circuitry external to the core will meet up with. While the core is completed (about a 2 to 4 week window), I will be LVSing and DRCing the entire chip minus the core.
Thanks for the update and all the hard work by the team! We really appreciate these updates, even though you're busy. Come up for air as often as needed, and occasionally even if not needed (just for play like a dolphin leaping out of the water just because it can).
Was the use of a capital 'C' for the word "chip" in "the entire Chip minus the core" an indication that the chip has taken on God-like complexity, meriting uppercase usage, or just an "artifact" due to one of the chip's creators (Creators?) being named "Chip"? Okay, it's a rhetorical question (I think), but I'll start to really wonder if you start typing it as "CHIP" in a bold font and at a higher point size.
"Was the use of a capital 'C' for the word "chip" in "the entire Chip minus the core" an indication that the chip has taken on God-like complexity," ... lol... Technically Chip is just as much part of the chip as the chip is part of Chip. How's that for rhetorical :-) ... it was just a Freudian typo
Dave,
It's not inconceivable to have a farm of GPUs work on the problem. With the latest AMD GPU you have 2048 stream processors running at about 1 GHz (925 MHz default; many cards will push that above 1 GHz). So 5 of those roughly match the 1000 10 GHz cores, which you said would take about 5 years. 100 of them would do it in 3 months. At $500 apiece, that's only $50,000 for the GPU cards. It's not practical, but it's well within reach of many.
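The arithmetic in that estimate can be checked with a few lines. This is only a sketch: it takes the figures from the post at face value and makes the same crude assumption the post does, namely that one stream processor at 1 GHz does as much key-search work as 1 GHz of a conventional core.

```python
# Figures quoted in the post above.
STREAM_PROCESSORS_PER_GPU = 2048
GPU_CLOCK_GHZ = 1.0
BASELINE_CORES = 1000
BASELINE_CLOCK_GHZ = 10.0
BASELINE_YEARS = 5.0
PRICE_PER_GPU_USD = 500

def years_to_finish(num_gpus):
    """Scale the 5-year baseline by the ratio of raw core-GHz."""
    baseline_core_ghz = BASELINE_CORES * BASELINE_CLOCK_GHZ
    farm_core_ghz = num_gpus * STREAM_PROCESSORS_PER_GPU * GPU_CLOCK_GHZ
    return BASELINE_YEARS * baseline_core_ghz / farm_core_ghz

def farm_cost_usd(num_gpus):
    """Cost of the cards alone; no hosts, power or cooling."""
    return num_gpus * PRICE_PER_GPU_USD

for n in (5, 100):
    months = years_to_finish(n) * 12
    print(f"{n} GPUs: {months:.1f} months, ${farm_cost_usd(n):,}")
```

Five cards come out slightly ahead of the 1000-core baseline, and 100 cards land just under three months, so the numbers in the post are self-consistent.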
$50,000 would be cheap if someone was reverse engineering an i7 or similar. But IMHO there are not likely to be many prop projects where someone will invest those dollars to get at the code, at least from a commercial point of view. Of course, getting the unencrypted code is only the beginning. There is no source and no comments. There would still be heaps of work and many man-hours!
I don't think we need to jump to 128bit, going to 72 or 80 would be sufficient for the next 10 years or so.
Don't forget to look back at the progress over the last 10 years. I would be careful of making a statement like this. Particularly on this forum where impossible usually only means a few months
When I worked on 10MB disk drives (worth $16,000 and the size of washing machines) in the mid 70's there is no possible way I would have believed any nut who told me that in 35 years we would have pocket sized 1TB drives or 64GB microSD drives, irrespective of price!
Anyway, back to reality. Hippy decoded the spin instructions and interpreter quite well using a form of statistical analysis. However, most of this work was done after Chip threw down the gauntlet to work out the ROM encryption. So, IMHO, most likely the code would be cracked by a hacker with too much time on their hands. Encryption will be great for some commercial products to stop someone from blatantly copying their product.
Anyway, IMHO with encryption (as long as the basic algorithm is sound), and particularly if we used out of order and overlapping blocks, with real code embedded as fake code, and the whole hub filled, the P2 would prove to be a hard nut to crack
Beau: Thanks for the update. My thoughts about Chip's question were that he is coding the ROM. I don't think any of these suggestions require hardware changes unless Chip decides to add a decryption instruction. Not only do you both have to come up for air, there are times when you must take a break to be able to refocus. I know we are all impatient, but I can see it getting closer.
"I know we are all impatient, but I can see it getting closer. " - Yup!! I've had almost 20 years of dealing with all of you guys/gals in the Parallax Forums and I wouldn't change a thing :-).... and YES I can see light through the trees (...err well during the day at least) , we are getting much closer.
Just for fun though, I would like to see if someone could decipher the code I posted in #750.... c'mon it's only 13 lines of PASM and there are plenty of hints in there as to what "should" be there that probably wouldn't be there if we really did something like this. <- Again I was just thinking out loud and playing "what would it look like if..."
$50,000 would be cheap if someone was reverse engineering an i7 or...
Don't forget:
1) With the progress we have seen in technology in our lifetimes that
might be expected to get much cheaper quite quickly.
2) Quite likely you may soon be able to rent time on such machines (if you
can't already) as and when required, so the investment required may not be so
huge.
3) Given the increasing sophistication of organized "cybercrime" one might expect
such cracking services to be on offer. Perhaps they are already.
4) The "hackers" will have arrays of fast cheap Prop II's to help with the
job!!
Don't forget to look back at the progress over the last 10 years
Exactly.
@Beau and co.
Eric has a very good point:
Moreover, if you're trying to sell into a professional market, you not only need
to have a good algorithm, you need to convince your customers that you have a
good algorithm. I know what to expect if you're using AES, SHA, or RSA, and I
can have great confidence in these. To a lesser degree I can have confidence in
algorithms like XXTEA and RC-4 which have been out there for a while and
withstood attacks. But some custom algorithm? As a customer I'm going to have to
basically assume that it is worthless. This seems harsh -- maybe you really have
come up with a great algorithm. But virtually all homegrown algorithms have
weaknesses. Unless I'm a cryptographer and have time and inclination to do the
analysis myself, I have to assume the worst.
Chip described a cipher of his own creation which may indeed be as good as
anything else. Problem is to be taken seriously one has to demonstrate the
strength of the thing in a way that can convince the crypto and user
community. This requires analysis by many experts and a good time out in the
field to see if anyone spots the flaws in it. Until then it cannot be taken
seriously.
I'll have to be honest, I'm not sure I understood you correctly, since I used the structure and probable meaning to derive the instructions, not finding the key.
My notion was the key to be the instruction offset in longs, since the last instruction is 13 (12 based 0) instructions from the beginning and the key was 12.
Though this notion didn't hold up when trying to translate the others, also the key wasn't held constant, like most XOR based obfuscation.
It's late and I haven't figured out what permutation you have used to generate the key yet, I've gotta go to bed so I'm not too tired to go to church in the morning.
DAT
PASM    org
        mov     dira, #%1
        mov     :T1, :P1
        add     :T1, cnt
        mov     :C1, #16
:L1     waitcnt :T1, :P1
        xor     outa, #%1
        djnz    :C1, #:L1
        mov     :C1, #16
:L2     waitcnt :T1, :P2
        xor     outa, #%1
        djnz    :C1, #:L2
        mov     :C1, #16
        jmp     #:L1
:T1     long    0
:P1     long    298
:P2     long    322
:C1     long    0
Cluso99,
Distributed.net is working on cracking RC5-72 (72bit key), they expect it to take about 200 years to exhaust the key set given the current rate. They have GPU based versions of their client ramping up that will probably reduce that a bunch, but it won't bring it down by enough to make it practical for at least another few GPU/CPU generations (proper new architectures in both are typically 2-3 years apart). That is part of what I was basing my statement about 72 or 80bit keys on.
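For a feel of what those numbers imply, here is a small sketch that works backwards from the "about 200 years for 72 bits" figure to an implied key-testing rate, then applies that same rate to other key sizes. The rate is derived from the post, not measured, and it is held constant, which ignores exactly the hardware progress being argued about in this thread.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600

def implied_rate(key_bits, years):
    """Keys per second needed to exhaust a key space in the given time."""
    return 2 ** key_bits / (years * SECONDS_PER_YEAR)

def years_to_exhaust(key_bits, keys_per_second):
    """Time to try every key of the given size at a fixed rate."""
    return 2 ** key_bits / (keys_per_second * SECONDS_PER_YEAR)

# Rate implied by "72-bit key, about 200 years" in the post above.
rate = implied_rate(72, 200)
print(f"implied rate: {rate:.2e} keys/s")

# Every extra key bit doubles the work at a fixed rate.
for bits in (64, 72, 80, 128):
    print(f"{bits}-bit key: {years_to_exhaust(bits, rate):.3g} years")
```

At a fixed rate, going from 72 to 80 bits multiplies the time by 2^8 = 256, and each further bit doubles it again.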
Also, having the key doesn't just give you the source, it gives you the ability to place your own code on the hardware. That alone might be enough. However, once you have the decoded binary, disassembly into spin/pasm that is reasonably readable and editable is not hard. Not having comments or symbol names is only a minor hindrance, in my opinion.
Roy: IIRC it was the late 90's when they said that based on Moore's Law, it would take 150 years to map the human genome unless they came up with some new algorithms. Guess they found them, because if I am not mistaken it only took about 8 years and all had been mapped. Anyway, I am sure if someone wanted to crack something, and had enough support, the internet and a program (like the SETI project) could get there.
But I am going to agree with heater and others, that an existing and proven encryption method will be much better received by the business community than an in-house designed one. We all know Chip's capabilities, so I don't doubt for one minute his encryption capabilities. But it will be the opposition that will say it's not secure. And it is the commercial designers Parallax is after!
2) Quite likely you may soon be able to rent time on such machines (if you
can't already) as and when required, so the investment required may not be so
huge.
3) Given the increasing sophistication of organized "cybercrime" one might expect
such cracking services to be on offer. Perhaps they are already.
1) Amazon has compute farms and GPU farms you can rent.
2) There are also private GPU farms if you know who to speak to (I do password cracking occasionally for fun and profit so get offers all the time).
3) It is cheaper to rent botnets than either of the above options last I checked.
I guess that is why we have such things as the DMCA. Performing such circumvention of copy protection mechanisms is outlawed in the USA and other parts. So at least in those places just having a copy protection scheme, no matter how poor, gives protection under the law.
That may not let you sleep at night though, given the nature of our global economy, the rather lax laws and attitudes in some parts of the world and the "dark net". Or the fact that unless you are a big player you probably don't have the resources to set the lawyers loose on your attackers.
Copyright holding/infringing is recognised world wide. I don't think the DMCA does anything useful other than to make it easy to dish out accusations and takedowns with no proof at all.
AES has already been implemented on the prop. Any reason to not use it?
I guess a reasonable design would be to have 416 bits of fuses, 128 for an AES key to load, 128 for user-config data (secondary key?), and 160 for a SHA1 hash of the bootloader. There should be a software-settable flag to hide access to the fuses, which is only reset on reboot. Add a mask of I/O pins that can't be changed once that flag is set, and every situation I can think of will be covered.
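The fuse budget in that proposal adds up as follows. This sketch only lays the three fields out back to back and checks the total; the field names and their ordering are made up for illustration, and nothing here reflects an actual Propeller 2 fuse map.

```python
import hashlib

# Field widths from the post above; names and order are hypothetical.
FUSE_FIELDS = [
    ("aes_key", 128),          # AES-128 key
    ("user_config", 128),      # user-config data / secondary key
    ("bootloader_sha1", 160),  # SHA-1 hash of the bootloader
]

def fuse_layout(fields):
    """Assign each field a bit offset, packed back to back from bit 0."""
    layout = {}
    offset = 0
    for name, width in fields:
        layout[name] = (offset, width)
        offset += width
    return layout, offset

LAYOUT, TOTAL_BITS = fuse_layout(FUSE_FIELDS)

# A SHA-1 digest is 20 bytes, which is where the 160 comes from.
SHA1_BITS = hashlib.sha1(b"").digest_size * 8

print(LAYOUT)
print("total fuse bits:", TOTAL_BITS)
```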
@jazzed ... after hours :-) ... seriously 12 to 14 hour days since Mid November (Chip sometimes never sleeps) ... we gotta come up for air a little bit. ...and it's Saturday for goodness sake. :-)
Not a criticism by any means at all. It was an observation that you guys are probably so close to the end of the tunnel that you could get a sunburn ....
Perhaps we should also be thinking of possible side or covert channel attacks on any code protection scheme. Such implementation failures have doomed systems despite their using strong algorithms.
Let's say, for example, the key fuse bits are read once during start up and that they are read one bit at a time. Unlikely I know, but it's only an example. Further let's assume that reading the key bits from fuses is a significantly different operation from normal COG/HUB reads and that the instantaneous power consumption of the chip is different when reading a fuse set to 1 or 0.
BINGO! In that scenario an attacker knows exactly the moment when each key bit is read, from reading the ROM code and knowing the clock rate. He only has to measure the small changes in power consumption to get the key out.
Ah, you say, but the key bits will all be read in one go, or in two 32 bit reads. Well perhaps, but if the instantaneous power consumption was dependent on the fuse being set or not that would still give an indication of how many fuse bits are set to 1 and hence dramatically reduce the key space to search.
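How much that leak is worth can be estimated by counting the keys that have a given number of 1 bits. The sketch below assumes a 64-bit key (two 32-bit reads, as in the example above) and that the attacker learns the exact count of set bits, either for the whole key or for each 32-bit read separately. Both are simplifying assumptions; real power measurements would be noisier.

```python
from math import comb, log2

def candidates(bits, weight):
    """Number of keys of the given size with exactly `weight` bits set."""
    return comb(bits, weight)

def effective_bits(bits, weight):
    """Remaining search space, expressed as an equivalent key length."""
    return log2(candidates(bits, weight))

KEY_BITS = 64  # assumed: two 32-bit fuse reads

# Whole-key weight known: a balanced key versus a lopsided one.
for weight in (32, 8):
    print(f"weight {weight}: about 2^{effective_bits(KEY_BITS, weight):.1f} keys")

# Weight of each 32-bit read known separately (16 ones in each half).
per_read = candidates(32, 16) ** 2
print(f"per-read weights 16+16: about 2^{log2(per_read):.1f} keys")
```

Under these assumptions the saving depends heavily on what is observed: a balanced key gives up only a few bits of search, while a lopsided key with few fuses set loses a large part of its strength.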
These may not be practical attacks, I just put them up as an example of what could potentially, inadvertently render the whole scheme useless.
Heater... hmmm interesting, that's essentially how RFID is done, by modulating the current representing each bit, and this is done with fractional amounts of current difference. Only 50mV or less above the noise floor can contain valid data.
@pedward, great job. I too tried this puzzle. In the past I'd done lots of hand-disassembly. But ran out of time; sleepy-time took over the goal of solving the puzzle.
In the 1980s I worked for a large electronics company in their research
and development labs. As we were developing military gear we were using military
grade chips that were not supposed to fail so easily. So, here and there around
the labs were yellow "post boxes" on the walls into which we were supposed to
put any faulty chips. Periodically these chips were collected up and analyzed
for their failure mode.
I knew one of the young guys doing that job. Basically chips would get depotted
and he would look at them under an electron microscope. Naturally most failures
were just us engineers letting out the blue smoke or being careless with static
discharge which could easily be found under the microscope.
Now I guess that fuse bits are big things not in the regular mould of all
the other gates and transistors on the chip and can be found under an
electron microscope.
That guy could probably have read the key from the fuse bits within a few
lunchtimes and for the price of a few beers:)
I suspect such possibilities are even more wide spread in the world now.
All this makes worries about the strength of the crypto algorithm moot.
And with _red_'s tools at hand most any encrypted data is brute force crackable, for a price. So, what is reasonable? I'm personally happy with no code protection at all.
One possible way to protect against that attack is to use thin strips of aluminum for the fuse bits, so that if the chip gets depotted, the aluminum should oxidize pretty quickly in a form that hides whether the fuse was open or closed, preventing them from being physically read.
One way to protect against power-based side-channel attacks would be to have a high-power boot mode that activates all logic elements whether they're being used or not, for the duration of decryption, or until the user program sets a "regular power use" flag.
Thanks!
Yep, there we go, job done. Game over.
LOL! I guess there ain't too many scruples in the business ... though, it's prolly one of the least intrusive activities of bot-nets. O_o
For example, Twofish, an AES finalist:
http://www.schneier.com/twofish.html
Source code available in various languages and assembly versions at the link above and it is patent free.
lol, well that proves that... I'm definitely not a cryptographer.
What you deciphered is correct, except the first line should be an or rather than a mov
The code was a simple FSK generator on P0 between 124kHz and 134kHz
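Those two frequencies can be cross-checked against the delay constants 298 and 322 in the deciphered listing. The sketch assumes an 80 MHz system clock, which is not stated anywhere in the thread, and that the pin toggles once per `waitcnt` interval, so one full output cycle takes two intervals.

```python
CLKFREQ_HZ = 80_000_000  # assumed system clock; not stated in the thread

def toggle_frequency_hz(ticks_per_toggle, clkfreq_hz=CLKFREQ_HZ):
    """Output frequency when the pin is toggled every `ticks_per_toggle`
    clock ticks: two toggles make one full cycle."""
    return clkfreq_hz / (2 * ticks_per_toggle)

for name, ticks in (("P1", 298), ("P2", 322)):
    print(f"{name} = {ticks} ticks: {toggle_frequency_hz(ticks) / 1000:.1f} kHz")
```

With that clock the two constants come out at roughly 134 kHz and 124 kHz, which matches the description of the FSK generator.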
No idea what the constraints are but this has to fit in a small ROM space and perhaps even has to execute within a single COG.
Well I didn't ask directly. I certainly do appreciate your understanding the depth of the statement though.