You mean they don't limit current during the test!? Who here would do that given the cost of the test fixture?
I think they do limit, but at a fairly high level (IIRC 2A?), and it seems that limit is above what causes damage should it all flow thru a single probe.
A simple BUS connect of VDD makes sense, as they are internally bonded, but the BUS connect of VIO is not so easy to understand, as those are designed to be separate and if they were separated the current set point could be far lower.
Customer pays for that kind of repair at the test fixture. US$ 1800 for each tiny metal finger contact, blown away, just because too many coulombs decided to pass thru at the same time. That's the deal.
I don't think they would charge us for such a thing, especially due to a manufacturing defect on their end.
From what I understand about the relationship with Parallax, I’m sure OnSemi will do the right thing. They want to get to the bottom of the problem so that they can prevent it from happening again. And now they are getting their best engineers on the job.
There are still many things to be learnt, even for an old process like 180nm.
Unfortunately for us bystanders, OnSemi (for political reasons and trade secrets) are unlikely to tell us the full story, but they will find out what went wrong and put measures in place to stop/minimise it in future.
They may even need to take the dies (failed/bad/good) and polish them down, examining them on the way down. PCB manufacturers used to do this in the early ’80s when they had problems. Not sure whether they still do, though, as the process is now well understood.
The Russians used to do that during the cold war. Vax included a message on one of their chips, in Russian, that said "VAX--when you care enough to steal the very best."
So, one good wafer, and one bad one. Four yet to be tested. Let's hope the remaining four are all good.
Kind regards, Samuel Lourenço
If some are good and some are bad it's a manufacturing problem. Once they get that sorted out the original process of validation can continue.
Yup, that's what I think. Anyway, any design issues should be ironed out by now. We already had a successful iteration.
Just one wafer is 100+ chips (I think it is around 140, give or take a dozen, according to my calculations). If an entire wafer is ruined, I'm worried. Imagine the losses that Parallax will have. I hope a few chips can be saved on that wafer, but my hopes are not high.
So, the IR camera is an obvious first step... Never heard of this in microscope form, but makes sense... Wonder if the FLIR chips can zoom in....
Yeah, I don't know that lenses can pass IR, but it has sufficient resolution to see things of interest.
some valuable information can be found here https://edmundoptics.de/c/ir-lenses/655/# . With the right equipment the spectrum from left to right can be seen clearly, including extreme regions ;-)
Normal glass and plastic lenses pass IR, but the focal point may not be exactly the same as it is for visible light. This is usually only a problem for optically fast lenses which have a narrow depth of field. Where you get into really exotic materials is UV, particularly shortwave UV, to which normal glass is opaque.
I wonder if we could detect future failures by any metrics.
How many chips failed out of how many tested?
As for the possibility of detecting future failures by any metrics, all the information I was able to gather on that subject in the last 24 hours indicates that you need to know the position of each and every defective die on the finished wafer.
With that information in hand, and also knowing the cause of failure for each individual die, the defective ones could be sorted and the maps for each individual wafer completed.
By analyzing defect distribution, the cause(s) for each kind of defect become immediately evident.
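To make the wafer-map idea concrete, here is a minimal sketch of one such metric: what fraction of the failing dies sit near the rim of the wafer. Everything in it is an assumption for illustration only: the function name, the 0.8 edge threshold, the use of the centroid of the die grid as the wafer centre, and the made-up 5x5 map. It is not ON's sort data or method.

```python
import math

def edge_fail_fraction(die_results, edge_ratio=0.8):
    """Fraction of failing dies lying at or beyond edge_ratio * max radius.

    die_results maps (col, row) grid positions to True (pass) or False (fail).
    The wafer centre is approximated by the centroid of all die positions.
    """
    cx = sum(col for col, _ in die_results) / len(die_results)
    cy = sum(row for _, row in die_results) / len(die_results)
    radius = {pos: math.hypot(pos[0] - cx, pos[1] - cy) for pos in die_results}
    r_max = max(radius.values())

    fails = [pos for pos, ok in die_results.items() if not ok]
    if not fails:
        return 0.0
    at_edge = [pos for pos in fails if radius[pos] >= edge_ratio * r_max]
    return len(at_edge) / len(fails)

# Hypothetical 5x5 map: two fails in the corners, one near the centre.
wafer = {(col, row): True for col in range(5) for row in range(5)}
wafer[(0, 0)] = False
wafer[(4, 4)] = False
wafer[(2, 3)] = False

print(edge_fail_fraction(wafer))
```

A value close to 1 would suggest the fails cluster at the edge; a value near the share of dies that lie in the edge band anyway would suggest no particular pattern.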
The order in which the chips are tested would be of interest here too. How many chips would be failed before a damaged test fixture is detected?
Based on what is known about the test procedure, all of this could have been caused by a few defective chips. One vaporized probe tip could splatter metal over the wafer, potentially causing more failures.
I like the idea of a resistance check and very conservative current limits. To the maximum extent possible, a defective chip should not be able to damage the tester!
$1800*132 pins = $237,600 ! There's got to be a lot more costs involved in creating a chip than the test pins. I'd expect labor to be a large part of it and a possible nuisance punitive fee.
I think the probe card is maybe $20k. I'm not sure. Repairs are just expensive. They have two probe cards and both got damaged. One has been repaired.
It seems to me that it was kind of an oversight to not have power supply current checking and allow the tester's default 2 amps to flow.
They seem quite dedicated now to discovering the cause of the problem. I wish they would concurrently just modify the test procedure to incorporate current checks and limits, so that wafer sort could continue. We need packaged parts. The failure analysis could be an independent matter. They got through one whole wafer without incident before both probe cards' pins got fried on the second wafer.
It seems like the thinking got tangled up:
1) The first wafer was sorted without incident.
2) It was reported that the probe card pin blew during the custom Parallax' I/O test on the second wafer.
3) Both probe cards wound up damaged.
4) Parallax supposes that something about its test must have caused latch-up.
5) Parallax makes a lower-current version of its I/O test.
6) Consensus among all is that Parallax test caused problems.
7) It is discovered that dies on the second wafer have VIO-to-GND shorts.
8) ON must investigate cause of shorts.
9) No more testing until cause of shorts is determined.
I would love to hear some acknowledgment that Parallax' test had nothing to do with the probe card getting damaged. It feels like the perception remains.
ON was very accommodating to let us run a custom pattern on their tester, and it took a long time for me to get it straightened out, due to initial lack of thermal settling time allowance. I got it working on the final opportunity they granted, as it had become a real sore point. Then, the probe card damage occurs and our test program takes the full blame. Now, new facts show it was not the cause, but we're still in trouble in the court of opinion.
Keep calm. The first wafer was OK, the second failed, and they stopped testing, which was the right thinking. First put a current limit on the tester, then go on with the only working tester and test wafer 3 of 6.
Sure, one always fears the blame game, and fears that the big company will cheat, but my guess is that the fear is not needed.
They want to see the P2 running too, and they know that Parallax will be a recurring customer for years to come, even decades. Some of them might even like it just because it is slightly different from the 'normal daytime' chip design work.
What is causing it is speculation, but reasonably, running 2 amps thru an I/O pin and burning the tester does not sound like a software error in the self test written by Parallax.
I don't have top down experience here but talking about it with them is usually the best way forward. I've been guilty of not talking enough myself at work. It doesn't help staying quiet.
I wonder if the VIO VSS shorts can be related to Antenna effects during the manufacturing process.
Usually this is a problem damaging the gate oxide from charge accumulation (static electricity) during the manufacturing process when you have a long run of metal connecting to a transistor gate but it can also cause breakdown within other structures. Different processes have different rules for this but generally, if I remember correctly for 180nm, if the metal run has an area ratio 200:1 compared to the gate, then you should implement design accommodations to prevent or minimize charge accumulation. A general DRC check will not necessarily test for antenna rules, you must run a separate ANT rule check.
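As a rough illustration of what an antenna ratio check boils down to, here is a minimal sketch. The 200:1 threshold is only the figure recalled above, not a confirmed rule for ON's 180nm process; real antenna rules are foundry-specific and usually applied per metal layer. The function names and example areas are hypothetical.

```python
def antenna_ratio(metal_area_um2, gate_area_um2):
    """Ratio of connected metal area to the gate area it drives."""
    return metal_area_um2 / gate_area_um2

def needs_antenna_fix(metal_area_um2, gate_area_um2, max_ratio=200.0):
    """True when the ratio exceeds max_ratio (assumed 200:1, from memory)."""
    return antenna_ratio(metal_area_um2, gate_area_um2) > max_ratio

# Hypothetical nets: 500 um^2 and 300 um^2 of metal on a 2 um^2 gate.
print(needs_antenna_fix(500.0, 2.0))  # ratio 250
print(needs_antenna_fix(300.0, 2.0))  # ratio 150
```

A net that trips the check would be a candidate for one of the usual accommodations, such as a diode near the gate or a jumper to a higher metal layer.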
I put a voltmeter in parallel to measure the test voltage. It has a 10M ohm input, but I have compensated for that in the resistance measurements below.
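For anyone repeating the measurement, this is a minimal sketch of one way to back out a 10M ohm meter sitting in parallel with the pin under test. It assumes the simple parallel-resistor model; the post does not say how the compensation was actually done, and the function name and the 5M example reading are made up.

```python
def compensate_parallel_meter(r_indicated, r_meter=10e6):
    """Remove a parallel meter resistance from an indicated resistance.

    r_indicated is the apparent resistance with the meter connected:
        1/r_indicated = 1/r_dut + 1/r_meter
    so  r_dut = r_indicated * r_meter / (r_meter - r_indicated).
    """
    if r_indicated >= r_meter:
        raise ValueError("indicated resistance must be below the meter's input resistance")
    return r_indicated * r_meter / (r_meter - r_indicated)

# Hypothetical reading: 5M indicated with a 10M meter across the pin.
print(compensate_parallel_meter(5e6))
```

The correction grows quickly as the reading approaches the meter's own input resistance, so readings of several megohms are sensitive to small errors in the 10M figure.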
I've sent several emails to the high-up engineer/boss at ON. I hope he gives some indication tomorrow that he read them. It would be good if wafer sort can continue with current limits in place. We need to get dies over to Amkor for packaging.
It would be easier to protect against any damage, if they could isolate each VIO and apply a lower protection limit to each one, but that may be more time to modify the fixture ?
Thanks, Saucy. Nothing unusual about V4855, by design. Not sure if that's significant, or not.
Von and I both discovered variations on this behaviour. A later test will give different outcome. I was getting it over 4 volts with 500uA constant current. It kind of figures, it is the power rail after all.
Yes, that would be very disruptive to how things are flowing and register as a huge expense. It would be ideal, but a group check on resistance would be quite sufficient.
I hope I have some dialogue with ON about that in the next day or two.
... about the VIO VSS short and my mentioning of charge accumulation.
The logical thinking is that the reverse diode between VIO VSS may be damaged. I don't think this is the case at all. One of the techniques to combat Antenna issues is to place a reverse biased diode near the transistor gate. Another method is to create a metal jumper. A third method is to construct an ESD "comb" (sorta like the concept of a lightning rod.)
Anyway, if the diode between VIO and VSS is constructed in a way that it actually doesn't make a connection until the last layers of metal are applied in the manufacturing process, then it would provide little if any protection during the manufacturing process. In that case, any transistor within the I/O drive itself could suffer.
How long is the wire run from the core logic to any of the I/O's? best case length? and worst case length? ... and what is the wire width?
I don't have an explanation for how one wafer is good and the other is not, but the humidity level during the manufacturing process and even during exposed die testing plays a significant part. Humidity is usually regulated and it is a fine balancing act. On one hand you don't want any moisture to work its way into the manufacturing process, and on the other hand less moisture increases the risk of static-related failures.
Ponder this, it takes roughly 30kV to arc across a 1cm gap with rounded probes. That's only 3V per um.
The big boss at ON Semi gave word to the test engineer to start working with me on the current-limiting procedure for the start of the test. The engineer and I talked and we've got a plan of attack:
1) Hold TESn, RESn, and P[63:0] at GND.
2) Set VDD as a 1mA source, clamped to 1.8V. (typical leakage is ~100uA)
3) Set VIO as a 100uA source, clamped to 3.3V. (typical leakage is ~10uA)
4) Allow 3ms for any attached bypass caps to charge.
5) Verify that VDD and VIO currents are not clamped at limits, else fail.
6) Set sufficient/safe VDD and VIO limits.
7) Proceed to regular test suite.
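The pass/fail decision in step 5 can be sketched in a few lines. This is a toy model, not ON's tester program: each rail is treated as a plain resistance to GND, the class and function names are made up, and steps 1 and 4 (holding the pins at GND and the 3ms settle) are not modelled. Only the clamp voltages, source limits and typical leakage figures come from the plan above.

```python
from dataclasses import dataclass

@dataclass
class RailCheck:
    name: str
    clamp_voltage: float  # volts
    source_limit: float   # amps

# Limits taken from steps 2 and 3 of the plan.
PRECHECK = [
    RailCheck("VDD", 1.8, 1e-3),
    RailCheck("VIO", 3.3, 100e-6),
]

def rail_is_clamped(check, load_ohms):
    """True when the load would draw the full source limit at the clamp voltage.

    Assumption: the rail looks like a simple resistance to GND.
    """
    demanded = check.clamp_voltage / load_ohms
    return demanded >= check.source_limit

def precheck_die(load_ohms_by_rail):
    """Step 5: fail the die if any rail sits at its current limit."""
    return all(not rail_is_clamped(c, load_ohms_by_rail[c.name]) for c in PRECHECK)

# Typical leakage from the plan: ~100uA on VDD (18k), ~10uA on VIO (330k).
good_die = {"VDD": 1.8 / 100e-6, "VIO": 3.3 / 10e-6}
# Hypothetical shorted die: 10 ohms from VIO to GND.
shorted_die = {"VDD": 1.8 / 100e-6, "VIO": 10.0}

print(precheck_die(good_die), precheck_die(shorted_die))
```

With these limits a short can pull at most 100uA through the VIO probes before the die is rejected, instead of the tester's default 2 amps.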
This will detect shorted dies right off the bat and preserve delicate test fixtures. We can get the known-good die sorted for Amkor packaging, then.
He said these shorted die were located near the edge of the wafer, by the way, where yield typically drops off.
Comments
The Russians used to do that during the cold war. Vax included a message on one of their chips, in Russian, that said "VAX--when you care enough to steal the very best."
https://www.zdnet.com/pictures/photos-inside-these-chips-art-awaits/9/
I think the digital part was done via computer, so probably nothing there....
Not even the analog part is designed without any form of CAD, nowadays.
Kind regards, Samuel Lourenço
V0815 = ? ohms
V1623 = ? ohms 21M
V2431 = ? ohms
V3239 = ? ohms 39.89 M
V4047 = ? ohms
V4855 = ? ohms 38.6 M
V5663 = ? ohms
EDDDM525D
Thanks, Pilot.
so wait and see.
Mike
I wonder if the VIO VSS shorts can be related to Antenna effects during the manufacturing process.
Usually this is a problem damaging the gate oxide from charge accumulation (static electricity) during the manufacturing process when you have a long run of metal connecting to a transistor gate but it can also cause breakdown within other structures. Different processes have different rules for this but generally, if I remember correctly for 180nm, if the metal run has an area ratio 200:1 compared to the gate, then you should implement design accommodations to prevent or minimize charge accumulation. A general DRC check will not necessarily test for antenna rules, you must run a separate ANT rule check.
Reference:
https://www.eetimes.com/document.asp?doc_id=1216897#
Thanks, Beau. I will ask them about this. I think this came up, but maybe it didn't get sufficiently addressed.
I don't think that the thinking got tangled up. It is just the natural progression of hypothesis being ruled out.
To be fair, at the beginning of this I didn't have the information I have now. I didn't know a whole wafer had passed the tests before the incident.
Kind regards, Samuel Lourenço
V0007 = 4.94M ohms, 0.446V
V0815 = 4.31M ohms, 0.415V
V1623 = 5.21M ohms, 0.458V
V2431 = 5.01M ohms, 0.449V
V3239 = 4.28M ohms, 0.413V
V4047 = 4.77M ohms, 0.438V
V4855 = 8.13M ohms, 0.557V, I checked this repeatedly, it's always higher.
V5663 = 4.61M ohms, 0.430V
All Vio = 3.70M ohms, 0.379V
Thanks, Saucy. Nothing unusual about V4855, by design. Not sure if that's significant, or not.