Chip & Beau: Wow, 160 fuse bits would be fantastic! That is 128 for the key, plus 1 to enforce decryption and prevent further fuses from being blown, which still leaves 31 user bits.
Now we could use 3 of those user bits to permit a specific download (download via P30-31, EEPROM, SD card), which would speed up the boot process and also limit what the user can do.
As far as a second key, why not use the 128 bit key and the resulting hash after the decrypt code, and then hash that with the remaining 31-32 bits? These 31-32 bits could be expanded with the decryption code to widen them in a semi-secured way to 128 bits.
Anyway, this sounds fantastic to me.
The easiest way to crack this seems to be an X-ray or similar attack on the prop chip itself. And of course, this has to be done for every secured prop2 product. For a company that wants to produce a competing product, that may be a worthwhile proposition. However, for the person who just wants to crack a product for their own use, it is most likely not an option. Why would you buy a product just so you can make your own? Way easier to buy it.
FYI, I found a good article on Wikipedia which describes the authentication method I laid out. It's called HMAC, or Hash-based Message Authentication Code.
I used this recently at work to implement Basic LTI v1.0 support for our product. The Provider gives the Consumer a secret key, which the Consumer uses to generate a hash; the Provider then uses the shared key to reproduce the hash and authenticate the request.
So, the original premise of simply appending the key is flawed. What you need is 2 keys, and the 160 bits could provide that.
In a revised approach, you do this:
1) Take the bootloader and append the 128-bit key to it
2) Hash the combined code/key
3) Append the second, smaller key (k2) to the hash from step 2
4) Hash the combined k2 and hash from step 2
5) Compare the hash value at address $1F0 to the newly computed hash
6) Reject the code if the hash comparison fails.
This process is relatively minor and uses 2 keys for authentication and 1 key for encryption. The encryption key is further obscured from hash analysis by the second round of hashing.
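As a rough illustration of the six steps above, here is a minimal Python sketch. It assumes SHA-256 as the hash (the thread only mentions SHA-256 later on), models the hash stored at $1F0 as a plain argument, and the function names `two_round_digest` and `accept` are made up for this example. It mirrors the append-then-hash steps as listed, which is not the same construction as standard HMAC.

```python
import hashlib
import hmac


def two_round_digest(code: bytes, key128: bytes, key2: bytes) -> bytes:
    # Steps 1-2: append the 128-bit key to the bootloader image and hash it.
    inner = hashlib.sha256(code + key128).digest()
    # Steps 3-4: append the second, smaller key to that hash and hash again.
    return hashlib.sha256(inner + key2).digest()


def accept(code: bytes, key128: bytes, key2: bytes, stored: bytes) -> bool:
    # Steps 5-6: compare against the stored hash and reject on mismatch.
    # compare_digest avoids leaking the position of the first differing byte.
    return hmac.compare_digest(two_round_digest(code, key128, key2), stored)
```

For the standard HMAC construction, Python's own `hmac.new(key, msg, hashlib.sha256)` could be used instead of the hand-rolled two rounds.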
Ideally, you would REALLY want 162 bits, 128 bits for the encryption key, 32 bits for the HMAC outer loop, and 2 bits for "encrypted" and "write protect".
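To make the 162-bit budget concrete, here is a small Python sketch that splits a fuse word into those fields. The bit ordering (key in the low bits, then the HMAC key, then the two flags) and the name `split_fuses` are assumptions for illustration only; nothing in the thread fixes an actual layout.

```python
# Field widths taken from the 162-bit wish list above.
KEY_BITS, HMAC_BITS, FLAG_BITS = 128, 32, 2
TOTAL_BITS = KEY_BITS + HMAC_BITS + FLAG_BITS  # 128 + 32 + 2 = 162


def split_fuses(fuses: int) -> dict:
    # Assumed ordering: encryption key in the low 128 bits,
    # then the 32-bit HMAC key, then the two flag bits on top.
    key = fuses & ((1 << KEY_BITS) - 1)
    hmac_key = (fuses >> KEY_BITS) & ((1 << HMAC_BITS) - 1)
    flags = (fuses >> (KEY_BITS + HMAC_BITS)) & ((1 << FLAG_BITS) - 1)
    return {
        "key": key,
        "hmac_key": hmac_key,
        "encrypted": bool(flags & 0b01),
        "write_protect": bool(flags & 0b10),
    }
```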
After researching the above, I'm confident in using the same key in authentication and encryption, since the key is completely obscured by the second HMAC key.
They are big because each fuse has to have a huge 3.3V PMOS device to blow it. Its channel width is around 300um.
I am presuming the fuses are something like the UV EPROMs were, but without the window? Anyway, it does not matter. 160+ fuses will be fantastic.
I thought there was going to be 192KB of hub ram. Did I miss something?
Does the ROM need to be in the hub ram space? Surely the whole 128KB or 192KB could be made available. The ROM is now only going to execute once (per reset), so hopefully it can be put somewhere else in unused hub space, so we can get to the whole hub ram and not waste that precious 2KB. You can guarantee we will want it, and more!
UV EPROMs have actual transistors. The charge on the gate is the memory. The UV knocks the charge out to erase. From what is mentioned here, PropII fuses are actual conductors that get vapourized permanently.
I should probably say first that I haven't worked with fuses in any of the few chips I've designed (HV process for research only), but if transistor size is the limiting factor, wouldn't it make more sense to use NMOS transistors only? That is assuming, of course, that the threshold drop across the pull-up network doesn't make the minimum required NMOS size greater than 300um.
I'm probably just spewing nonsense,
David
PMOS devices were used to blow each fuse, since the other side of the fuse could be tied to GND, enabling the internal 1.8V supply to pull up on it to get a reading. Otherwise, if NMOS devices had been used, one side of each fuse would have had to tie to the 3.3V supply, complicating subsequent reading at 1.8V. It's possible some huge PMOS series device could have supplied the whole group with 3.3V, so that individual (and smaller!) NMOS's could have blown the fuses, but it would have complicated reading them via the 1.8V supply. Because we've never done this before, I did it in the most straightforward way I could to ensure reliability.
Oh I see, couldn't you get the same results and higher density by arranging the fuses in a grid and using both the massive PMOS gates and equivalent NMOS gates to act as row and column selects to blow fuses? You could then use only the NMOS network with your 1.8V logic to read out the fuse values.
I'll stop spewing now. Love the Propeller chip, can't wait till this one comes out!
I thought about that, then thought some more and realized why that wouldn't work. It has to do with fuses not being diodes.
If my understanding is right, Chip is going the "tried and true" way that his fab recommended. If he had more resources, I'm sure he could try a shuttle of various methods to find a more suitable solution. A shuttle run costs $80k for the process they are using. At that price you are talking about a purely academic exercise, since it's unlikely it would result in recouping the $80k in production.
I just can't believe how close we are to a shuttle run of the actual processor!! Throw in 160 fuses and the latest compilation of cog logic, and rock and roll! Simple.
As most electron microscopes require the layer of interest to be exposed and metalized, it doesn't matter if an electron microscope can see the damage as the attacker can already physically probe the fuses.
Perhaps so. It's just that I remember being awestruck by an episode of the Horizon science show back in 1978 where they showed what looked like an electron microscope view of a microprocessor actually running. You could actually see the transistors switching!
I emailed Bruce Schneier about doing a quick design review of our security plan using SHA-256 and AES-128. He wrote me back and said that he'd like to help, but he's neither a software nor a hardware programmer.
I'm not sure a fuse matrix would be worth the effort. As circuitsoft pointed out, fuses are wires, not diodes like in most matrix circuits. To blow one fuse you would have to supply enough current to blow that single fuse, which sits in parallel with every other set of series fuses in the matrix, without blowing them too. Reading values back would be tricky to say the least, but probably still doable.
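To put rough numbers on the sneak-path problem, here is a small Python sketch. It assumes an idealized matrix in which every intact fuse has the same resistance, the unselected rows and columns float, and the drivers and wiring have zero resistance. The function name `sneak_currents` and the 13x13 size are chosen only for illustration.

```python
def sneak_currents(rows: int, cols: int, r_fuse: float = 1.0, v: float = 1.0):
    """Currents when v is applied across one fuse of a rows x cols matrix.

    By symmetry, all unselected columns sit at one potential and all
    unselected rows at another, so the sneak path is three parallel
    groups of fuses in series.
    """
    i_target = v / r_fuse
    r_sneak = (
        r_fuse / (cols - 1)                    # other fuses on the selected row
        + r_fuse / ((rows - 1) * (cols - 1))   # fuses joining unselected lines
        + r_fuse / (rows - 1)                  # other fuses on the selected column
    )
    i_sneak_total = v / r_sneak
    i_row_fuse = i_sneak_total / (cols - 1)    # per fuse sharing the selected row
    i_col_fuse = i_sneak_total / (rows - 1)    # per fuse sharing the selected column
    return i_target, i_row_fuse, i_col_fuse


i_target, i_row, i_col = sneak_currents(13, 13)
print(f"row-mate fuse carries {i_row / i_target:.0%} of the target current")
```

Under these assumptions, each fuse that shares a row or column with the target in a 13x13 matrix carries about 48% of the target's current, which leaves little margin for blowing one fuse without stressing its neighbours.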
It was mentioned... "what if the fuse WAS the diode"....
There are a few problems here also.... Once you blow the 'diode' there is no guarantee that it won't become a short, or even become an open, thus creating a problem similar to not having a diode at all.
Second, in order to create a diode you typically tie the gate to the Source... You can't use an NMOS transistor because all of the Cathodes are tied to the substrate... A PMOS might work, but if the NWELL gets damaged in the process it also shorts to the substrate.
Also... To create just a PN diode, the P and N are actually the same material, just doped differently during the manufacturing process (an example would be the NWELL used for a PMOS transistor... the junction between the NWELL and the substrate forms a diode with the Anode being the substrate)... Applying excess current to 'fuse' the PN junction barrier does not guarantee an open circuit after the process. Instead, the PN region randomly becomes PPN or NNP and eventually just won't function as a diode, resulting in an unpredictable condition that may or may not conduct.
As it currently stands, we are routing-limited on the number of FUSES.... In terms of layout there is enough room for 210 FUSES, however in terms of getting those signals into the proper location there is only room for 172 FUSES.... Any multiplexing scheme ends up, no matter how you slice it, being a balancing act between the number of fuses and the applied multiplexing method (the multiplexers take up space and add a fair amount of routing congestion on their own). The space for the 38 lost fuses would be better used for decoupling capacitors than for fighting that balancing act.
Bruce Schneier is well known for breaking security issues down into explanations normal people can understand. He may be more willing to look if you make it clear that all discussions are currently in plain English.
Other options would be Daniel J Bernstein (website is http://cr.yp.to) or David A Wheeler.
Also consider that state universities exist, in part, to assist businesses in the states they serve. Surely, there is someone in the UC system that's qualified to help.
A friend asked me today, what does this encryption get you that potting the prop and flash in epoxy doesn't? After a bit of thought, I realized I didn't have a good answer.
Circuitsoft,
Potting circuit boards is a messy, expensive business that adds weight, bulk and expense to your products. It also may not do what you want. Let's say you encase your Prop/EPROM in such a way that no one can read the PROM without breaking it. How would you do upgrades of your code?
To probe a signal with an E-beam probe, the layer of metal in which you want to probe needs to be exposed, further adding to the complexity. I suppose you could X-ray the state of the fuse, but you will also have to trace back into the "core" the association of the fuse to the particular bit that was blown. Since there are 172 Fuse Bits, that translates to 344 wires that are just over one quarter of a micron wide. And even then the "core" is nothing but a mass of logic gates. The fuse bits themselves are not in any particular visual order that would correspond to the bit that was blown. So in order to even determine the correct mapping that correlates to the selected bits that were blown, you would need to probe 172 Propeller chips each with only one of their bits blown.
The Wikipedia article on HMAC: http://en.wikipedia.org/wiki/HMAC
I have always wondered what happens when that condenses out over the lifetime of the chip.
Oh, Chip, could the fuses be diodes?
haha right... guess I should have thought it through a bit further...
I think you'll find that Horizon episode on YouTube here: http://www.youtube.com/watch?v=XdGD8ZS62yw
Thanks for that link! It's easy to become jaded by the technology we enjoy today and take it for granted, including the Propeller. Videos like the one cited help to rekindle the wonder that existed when the revolution began and the simplest microprocessors were bleeding-edge devices.
-Phil
The trouble is that so much current is needed to blow a fuse that you cannot practically suffer any series devices or extra wire and vias. At least, I didn't want to risk the fuses not working. I figured you'd pretty much need something that looks like one of these:
VDD -> PMOS -> fuse -> GND
-or-
VDD -> fuse -> NMOS -> GND
The actual fuse has maybe 9 vias per side, with a minimum-width poly wire between them.
I suppose a matrix arrangement would be possible, but it would have to be huge to keep the resistances down, with very large drivers for row and column. Something to try next time, for sure, if we need more fuses.
I'd like to recommend Dr. Ravishankar from UC Riverside: http://www.cs.ucr.edu/~ravi/
The more eyes the merrier...
Well as a micro-chip is already mostly covered in metal you can skip the metalization step. In fact, I think Chip (well Parallax anyway) already has an electron microscope that can watch a micro-chip as it's functioning.
Lawson
Not with an optical microscope. It might be possible with some of the more exotic (and much more expensive) instruments that are available.