It was the wire that was making poor contact. The swap to the Analog Discovery forced me to swap the wire going to P62.
The white wire could be broken inside the heat-shrink. This is a good time to work out what is wrong with the wire so it doesn't cause future issues.
Wires can also break inside their own insulation, particularly with Teflon or other high-temperature insulation. During soldering, the solder wicks a millimeter or two up the copper strands under the insulation, creating a solid copper/solder wire that any vibration can eventually break at the point where the solder ends. The resulting break can be a clean one that is relatively easy to find, or an elusive one that looks just fine whenever a probe is applied to the joint.
I used a hefty USB charger module (my own creation), with a capacity of 2A per port, to supply power to the board. The current now goes entirely through the "P2 USB" port. The dropped Vbus voltage seems to allow a condition where the "PC USB" port power switch (U502) is not fully off, but partly on. Vbus now reads 4.89V and is within spec.
However, the test still fails, as expected. I didn't think that Vbus was a factor, but I managed to rule it out anyway. Maybe the clock generator is not strong enough to drive the pin through a 1K resistor?
Kind regards, Samuel Lourenço
Check the voltage out of that USB charger. I have found that many supposedly correct chargers still give me the lightning bolt up in the corner on my Raspberry Pi, indicating too low a voltage. Or perhaps add a really fat capacitor across the output.
Hi Michael,
The charger module in question presents no voltage drop. Plus, it has some nice niobium caps, one per output. Mind that I've measured the voltage on the P2 Eval side, right on its power port. I'm using quality cables that show little voltage drop as well. On the other hand, the USB host ports on my PC have some drop issues, because the designers chose to use weak PPTC fuses to protect each pair of ports.
Regarding your Raspberry Pi, mind that the cable supplied with it is a weak one. It shows an appreciable voltage drop. Many mini or micro USB cables are prone to that, especially those of a thin gauge.
Anyway, the issue here had nothing to do with supply voltages, as the P2 is powered by 3.3V and 1.8V, and those rails come from regulators (or DC-DC converters) and are thus little influenced by the 5V supply lines (these rails measured as they should during the test). The failure was caused by a faulty wire that should have been supplying the 1MHz clock to the test, and the issue is now solved.
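For anyone wanting to put rough numbers on the cable drop mentioned above, here is a minimal sketch using Ohm's law over the round trip (supply wire plus return wire). The resistance-per-metre figures are approximate nominal values for copper at room temperature, and the 1 A load and 1 m length are hypothetical; none of these are measurements from this thread.

```python
# Rough USB cable voltage-drop estimate (a sketch, not a measurement).
# Approximate nominal resistance of copper wire at about 20 degC, in ohms
# per metre. These are assumed textbook figures, not values from the thread.
OHMS_PER_M = {28: 0.213, 24: 0.0842, 20: 0.0333}


def cable_drop_v(current_a, length_m, awg):
    """Voltage lost in the cable: I * R over both the 5V and GND conductors."""
    round_trip_ohms = 2 * length_m * OHMS_PER_M[awg]
    return current_a * round_trip_ohms


if __name__ == "__main__":
    # Hypothetical load: 1 A through a 1 m cable, for three wire gauges.
    for awg in (28, 24, 20):
        drop = cable_drop_v(1.0, 1.0, awg)
        print(f"AWG {awg}: drop {drop:.3f} V, leaving {5.0 - drop:.3f} V from a 5.0 V source")
```

Under these assumptions the thin-gauge cable loses several times more voltage than the thick one at the same current, which fits the remark about thin mini and micro USB cables.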
We've had a scare with the new chip on the wafer prober at ON Semi.
We fried a pin on the wafer prober card. In one picture you can see a shortened prober pin that is discolored and in the other you can see a melted blob on a VIO die pad that was that prober pin's tip.
This prober card is pretty delicate and not meant for lots of current flow. It seems that we were getting latch-up, due to high core VDD and GND currents, which were causing sufficient ground disparities between core GNDs and I/O block GNDs. It was a VIO pin that blew out. Each I/O pin can only conduct 50mA, and a VIO pin feeds four of them. Instead of saturating at 200mA, 2A were conducted, which is the limit of the tester. The probe pin turned briefly into a lightbulb filament.
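The figures in that paragraph can be worked through as a quick sketch. The 50mA, four-pin and 2A numbers come from the post; the heating estimate assumes a constant probe-pin resistance, which is a simplification of mine, since the resistance of a glowing pin will certainly rise.

```python
# Worked arithmetic for the VIO latch-up figures quoted in the post.
IO_PIN_LIMIT_A = 0.050   # each I/O pin can conduct 50 mA (from the post)
PINS_PER_VIO = 4         # one VIO pin feeds four I/O pins (from the post)
TESTER_LIMIT_A = 2.0     # tester current limit (from the post)


def vio_budget_a(pin_limit_a=IO_PIN_LIMIT_A, pins=PINS_PER_VIO):
    """Expected saturation current through one VIO pin."""
    return pin_limit_a * pins


def overload(fault_a=TESTER_LIMIT_A):
    """Return (current ratio, heating ratio) for a fault current.

    Heating uses P = I^2 * R with R assumed constant (an assumption,
    not something stated in the thread).
    """
    ratio = fault_a / vio_budget_a()
    return ratio, ratio ** 2


if __name__ == "__main__":
    ratio, heating = overload()
    print(f"VIO budget: {vio_budget_a() * 1000:.0f} mA")
    print(f"Fault current is {ratio:.0f}x the budget, roughly {heating:.0f}x the heating")
```

So the probe pin saw about ten times the current it would see with four saturated I/O pins, and on the constant-resistance assumption about a hundred times the dissipation.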
Our custom I/O test program was failing, while ON Semi's digital verification programs were passing. The main difference between the two is that our I/O test program runs 8 cogs at 180MHz, while the ON tests run slowly and only the memory built-in-self-tests were running at 180MHz. So, our 10x current demand was triggering huge problems.
Fortunately, when we did our last round of I/O test program development, I made an extra trial version that ran at only 20MHz, instead of 180MHz. It tested out fine. So, we can use this lower-current program on the wafer prober with the hair-like pins to test the I/O's at the wafer level, then use the full-power program for testing the packaged chips.
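As a rough sketch of why the 20MHz version draws so much less: to first order, CMOS dynamic current follows I = C_eff * V * f, so with the same switched capacitance and supply voltage the current scales with clock frequency. This ignores static leakage and assumes the same code activity at both speeds; the 1 A reference current below is hypothetical, not a measured P2 figure.

```python
# First-order CMOS dynamic-current scaling (a sketch under stated assumptions).
def dynamic_current_ratio(f_high_hz, f_low_hz):
    """Ratio of dynamic currents at two clock rates, same C_eff and V."""
    if f_low_hz <= 0:
        raise ValueError("frequency must be positive")
    return f_high_hz / f_low_hz


def scaled_current_a(i_ref_a, f_ref_hz, f_new_hz):
    """Scale a reference dynamic current linearly to a new clock rate."""
    return i_ref_a * f_new_hz / f_ref_hz


if __name__ == "__main__":
    ratio = dynamic_current_ratio(180e6, 20e6)
    print(f"180 MHz draws about {ratio:.0f}x the dynamic current of 20 MHz")
    # Hypothetical reference: 1 A of dynamic current at 180 MHz.
    print(f"1 A at 180 MHz scales to about {scaled_current_a(1.0, 180e6, 20e6):.2f} A at 20 MHz")
```

On this simple model the 20MHz program should pull about one ninth of the dynamic current of the 180MHz one, which is the point of using it on the hair-like probe pins.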
Wendy told me that it costs $1800 to repair a single probe card pin!
We should receive 10 glob-top prototype chips tomorrow from ON Semi. I'm going down to Parallax and we will solder one onto an original P2 Eval board to check it out. The pick-and-place line will be busy through the end of the week making some robot boards.
Ouch! I gather the analogue testing errors were happening independent of the fried probe, correct?
The VIO pins were like triggered SCRs to GND, in latch-up. The heavy core current draw triggered the situation. Of course, in that circumstance, the pins were untestable.
Fortunately, when we did our last round of I/O test program development, I made an extra trial version that ran at only 20MHz, instead of 180MHz. It tested out fine...
Great, there is a workaround.
Certainly, with the poor GND nature of probes, the lowest possible MHz makes sense.
Can they also lower the current limits, to avoid damage to the probes if there are transitory peaks?
... So, we can use this lower-current program on the wafer prober with the hair-like pins to test the I/O's at the wafer level, then use the full-power program for testing the packaged chips.
Even with packaged chips, you might still want to keep currents modest to extend the life of the ZIF sockets they will use? Though hopefully here, the paths are more parallel in nature.
They have settable current limits, with a background limit of 2A.
I think the ZIF sockets and normal tester head are pretty tough. They can handle that current. The delicate probe pins can't take it, though.
I just heard from ON Semi that both of their probe cards now have blown pins. One was VIO4447 and the other was VIO0407. Random, senseless latch-up violence.
I'm thinking the latch-ups are not due to core speed but more, as you first surmised, because of the separation of the two grounds. Without the common rail of the thermal pad, there is too much opportunity for floating voltages to cause latch-up.
That's $3,600 in repairs needed.
Is that since the change in test MHz, or did they both get damaged running the earlier tests?
If 2 probes are damaged, maybe others are stressed and have not (yet) failed totally, but have been heated to extreme temperatures.
Does this also have the potential to damage the P2 die (so they cannot bond to that pad)?
I'm thinking the latch-ups are not due to core speed but more, as you first surmised, because of the separation of the two grounds. Without the common rail of the thermal pad, there is too much opportunity for floating voltages to cause latch-up.
High frequency means high current.
There are back-to-back diodes between core GND and I/O GNDs in each I/O power and GND pad. The diodes for each VIO pin sum to 200um in length, but that's not enough clamping to inhibit latch-up, apparently.
Yes, there could be a lot of fatigue, already.
I think once the test current comes down, there is reduced risk to everything, going forward.
We should receive 10 glob-top prototype chips tomorrow from ON Semi. I'm going down to Parallax and we will solder one onto an original P2 Eval board to check it out. The pick-and-place line will be busy through the end of the week making some robot boards.
Fingers and toes crossed for you Chip once you get to try it out. Nice and methodically in a quiet room without interruption with any luck. LOL.
Fortunately, these chips have already passed the digital tests and our IO test, which is a downloaded program.
Gosh, Chip, you need to leave home and drive down to Rocklin? Can't somebody bring those chips to the walnut farm?
I hope everything will work out...
Enjoy!
Mike
Thanks, Mike. Everything to do with building circuit boards is in Rocklin, so that's where I'm headed, early.
By reading the thread, I've now realized that the above 2A consumption could be caused by a latch-up condition. It would be odd for the current consumption to rise to 2A when testing at 180MHz. That would make for another hot P2, especially considering that this revision should show a lower current consumption.
The blown probe pins are an expensive mishap, but I'm glad that all tests can pass so far. Let's see how P2 behaves when fully packaged and with a nice ground.
I've had my own experience with latch-up. I once designed a power supply, but forgot to add a resistor going from the supply output to the input of the error-correction op-amp. When the power supply suffered a spike from a heavy inductive load, the diode protection on the output prevented destruction, but was not enough to prevent a latch-up condition on the op-amp. Luckily, the power supply was limited to 250mA, and the amplifier held. After power down, that condition would disappear, but I'm convinced that, if there had been no current limit, the op-amp would have been destroyed. Yup, typical thyristor behavior.
At $1800, you'd think they would put some PTC fuses on them...
PTC fuses are susceptible to fatigue, and can go permanently into a kind of open state. I don't know if that would degrade the performance when probing the chip.
Fingers crossed here!
Any words are better than no word at all.
I hope the problem is just something on one wafer. I don't know if they've figured anything out, yet. It might just be a meeting about how they're going to proceed to investigate the cause of the problem, and in what way they'll permit me to help.
Comments
Sounds like you have given this testing a lot of thought. We look forward to probe melting silicon soon!
That color tempering along the pin tells you what temperature those sections of metal were at just after things 'blew'.
https://en.wikipedia.org/wiki/Tempering_(metallurgy)
I suppose the probe lengths will be a factor too. Synchronous switching edges might be all that counts ...
I hope they've identified the problem and it's nothing requiring another fab turn.