circuit design considerations: how many years will a propeller chip run?

cwyuzik · 2010-11-26 23:36

Hi,
I recently consulted here on a circuit design for an auto stop and restart circuit for the Prop chip and got some great advice. But I was wondering whether it's necessary to stop and restart the chip in the first place.

For example, we've got one of the first digital alarm clocks here (circa 1981) that uses one of the early TI microprocessors, and it hasn't skipped a beat in close to 30 years and is still running strong.

Could the Propeller reliably run a program that long? I know these have only been available for around 4 years and this may be speculation, but would anyone here have any insight as to how long this chip might run?

I'm using a Q44 that will write to its SMT EEPROM approximately 12 times a day.

Thanks.

Ale · 2010-11-27 00:30

EEPROM endurance tends to be around 20 years, depending on conditions...

MagIO2 · 2010-11-27 01:55

For this case you could use FRAM. It supports far more write cycles and directly replaces the EEPROM.
Here the design of the Propeller is an advantage. If the EEPROM/FRAM is broken, you simply replace it. Other microcontrollers have the Flash/EEPROM built in, so if it's broken you have to replace the controller.

I think it's no problem for the propeller to run as long as that TI. You could even think about concepts to make it failsafe if electromagnetic fields or solar flares flip a bit in RAM because you have several COGs which can double check.

zoopydogsit · 2010-11-27 01:59

Technically, in an ideal environment, with clean power, constant temperature, controlled humidity, no vibration, and a properly designed and built circuit, then IMHO I guess the Propeller could last almost forever.

You'd most likely find problems external to the Propeller chip.

As long as you load it correctly (don't draw more than the spec'd power per pin from the device), then in reality the main causes of failure would be:
- Power supply issues (stress to the power supply and other components).
- Circuit design issues (overloading pins, or interfacing incorrectly to the real world by introducing ground loops, ESD, etc.), leading to intermittent behaviour and failures.
- Premature failure from ESD damage pre-construction (mishandled devices damaged by ESD and failing after a short period).
- Aging components, mostly electrolytic capacitors drying out (mostly in power supplies). These can lead to increased noise, power ripple and decreasing predictability.
- Mechanical issues: wear on buttons etc., and strain on cables (an unsupported solder joint with movement has a limited life).
- Cycling the thermal curve (heating up and cooling down), causing physical expansion and contraction of components on the board (cracks in solder joints and PCB traces, fatigue lines on pins, etc.).
- Humidity (should be between 40% and 50%). Below 40% you get an increased risk of ESD failure; above 50% you increase the risk of corrosion.
- Vibration, with similar effects to cycling the thermal curve.

There are engineering and manufacturing disciplines that can be used to minimize/manage these issues.

I read somewhere that an LED has an MTBF of 3x10^8 years, though they are likely to lose some brightness well before that. I'm guessing that most etched silicon would be similar in ideal circumstances (except for fusible-link fuses, which leave debris).

I did read somewhere about silicon crystal growth on substrates over time. I don't know how fast it is, but I'd suspect it would happen long after most projects are in landfill. Electrolytic material mismatch, from putting the wrong types of metals together, can lead to corrosion and other issues (google Zinc Whiskers).

If you have a critical design you should try to minimize the above and build in a watchdog circuit to reset the Prop (e.g. with an NE555 fed by pulses from a Prop pin, so that failure of the pulses causes it to reset the Prop). Example circuit http://schematics.dapj.com/2004/10/555-watchdog-for-uc-and-up-systems.html

Your most likely cause of failure will be the mechanical connections of the construction (soldering). I'd suspect those long before I'd suspect the silicon of going bad (if you've designed and built it right).

Writing to an EEPROM may cause issues at some point, as there is a limit to the number of writes before you start having problems. You'll need to read the manufacturer's specs for the brand and specific device. There may be other technical solutions (e.g. battery-backed SRAM). I've seen other discussions in this forum of EEPROMs eventually failing after seriously heavy use. Search the current and previous Propeller Forum.

From the experimenting I've done I've been surprised by the reliability and stability of the Prop. I'm sure others can comment on what they've experienced.

localroger · 2010-11-27 08:10

The EEPROM will wear out after a million or so write cycles. It's not part of the Propeller chip though and, if it weren't SMT, it could be replaced.

I have seen EEPROM wear out from excessive use. One of my competitors decided to store the count of trucks passing a highway weigh station in EEPROM. At 1,200 trucks or so a day on this busy interstate highway it would stop counting every six months or so. The original vendors were calling this a random failure and charging them $800 for a new main board when this happened. I figured out it was the EEPROM and saved them a lot of money until they finally got a new system.
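To put the numbers in this thread side by side, here is a rough back-of-the-envelope sketch. It assumes localroger's "a million or so" write cycles as the rated endurance, that every write lands on the same EEPROM location, and one write per truck; none of that is confirmed for any particular part, and the function names are made up for illustration.

```python
DAYS_PER_YEAR = 365.25

def years_until_wearout(writes_per_day, endurance_cycles=1_000_000):
    """Years before one EEPROM location reaches its assumed write endurance."""
    return endurance_cycles / writes_per_day / DAYS_PER_YEAR

def writes_in(days, writes_per_day):
    """Total writes accumulated over a number of days."""
    return days * writes_per_day

# Original poster's case: about 12 writes a day.
print(round(years_until_wearout(12)))        # 228

# Weigh-station case: about 1,200 trucks a day, one write per truck assumed.
print(round(years_until_wearout(1200), 1))   # 2.3

# Writes accumulated in the roughly six months (182 days) before it stopped counting.
print(writes_in(182, 1200))                  # 218400
```

On these assumptions 12 writes a day is nowhere near a problem. The weigh station stopping after roughly 218,000 counts, well short of a million, suggests either a lower real endurance for that part or more than one write per truck; the thread doesn't say which.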

cwyuzik · 2010-11-27 23:27

Thank you all so much. Very helpful, especially about the watchdog circuit Zoopydogsit.

zoopydogsit · 2011-01-19 17:27

Hi Cwyuzik,

In regards to your watchdog circuit, as mentioned, the Propeller will need to output a pulse stream that keeps resetting the watchdog's timer, to keep it from resetting the Prop. That is the whole idea of how a watchdog works. However, you will need to think carefully about your expected fail condition for its implementation.

In a standard, single-core implementation, you'd have regular points in your code that toggle a pin, thus providing the watchdog reset condition (preventing it from timing out and resetting the processor). The intent is that if the processor/application becomes lost and fails to toggle the pin, then in reality it's not doing its job, so you can reset it.
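The single-core idea can be modelled in a few lines. This is only a simulation sketch, not Propeller code: the `Watchdog` class, the tick-based timing and the `first_reset_tick` helper are all hypothetical names standing in for the external timer and the pin toggle.

```python
class Watchdog:
    """Software model of an external watchdog timer (hypothetical, for illustration)."""

    def __init__(self, timeout_ticks):
        self.timeout_ticks = timeout_ticks
        self.ticks_since_kick = 0

    def kick(self):
        """Stands in for the application toggling the pin at a regular check point."""
        self.ticks_since_kick = 0

    def tick(self):
        """Advance time by one tick; return True if the timer expired (a reset)."""
        self.ticks_since_kick += 1
        if self.ticks_since_kick >= self.timeout_ticks:
            self.ticks_since_kick = 0  # the timer starts over after the reset
            return True
        return False


def first_reset_tick(hang_at, total_ticks, timeout_ticks=5):
    """The application kicks every tick until it hangs at tick `hang_at`.

    Returns the tick on which the watchdog first fires, or None if it never does.
    """
    wd = Watchdog(timeout_ticks)
    for t in range(total_ticks):
        if t < hang_at:
            wd.kick()
        if wd.tick():
            return t
    return None
```

With a timeout of 5 ticks, an application that hangs at tick 10 gets reset at tick 13, and one that never hangs is never reset.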

A simple application-agnostic design might just look at data bus bit 0, as that is likely to change state while the processor is running and, presumably, the "solution" is working. In the most catastrophic failure, where the CPU hangs and can't move data on the data bus, the watchdog then triggers a reset. However, this doesn't take application issues into consideration, for example being buried in interrupts. (I've done this in the past with a Z80 and an NMI on AC power zero crossing: my interrupt handler didn't complete before the next interrupt. DOH!) In these cases the watchdog could work on the application toggling some other line (a high unused address bit, a latched data bus bit, etc.), so that when the application comes unstuck the watchdog does its job.

Really good watchdog circuits in this scenario had some FIFO memory to capture the last thousand or so states of the address and data bus and lock it for later problem analysis, giving you a crash log and a way of determining why the application failed in the way that it did. In the Z80 example above, seeing it fail to complete the interrupt handler code over and over is more helpful for fixing the problem than staring at a hung box trying to figure it out.

In the case of the Propeller this is more interesting, as there are lots of ways of implementing a watchdog that wouldn't be indicative of an application fail condition. Examples:
- Hardware counter. Each COG has 2 hardware counters, and one of these could be used to drive the toggling of the pin for the watchdog. However, the pulses would then stop only on a complete Propeller failure, presumably a failure of the clock circuitry.
- One COG toggling a pin (e.g. the LED flashing demo). However, this would only be indicative of that COG failing, and I assume it would again take something really catastrophic, like the clock circuitry failing.

Probably the smartest way would be to have regular points in the most critical parts of your application that toggle the pin and reset the watchdog. Again, this can be a challenge on the Prop: with 8 COGs, it could get complex if more than one COG is involved in delivering critical aspects of your application. Although I didn't catch your original watchdog discussion, this could be the reason why you didn't get the watchdog solution you were expecting.
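One way to handle several critical COGs, sketched here as plain logic rather than real Propeller code, is to have each critical task check in and let a supervisor toggle the watchdog pin only once all of them have. The `HeartbeatGate` class and the task names are hypothetical; on real hardware the check-ins would be flags in shared hub RAM and the supervisor could be one COG's loop.

```python
class HeartbeatGate:
    """Lets the watchdog pulse through only when every critical task has checked in."""

    def __init__(self, critical_tasks):
        self.critical_tasks = frozenset(critical_tasks)
        self.seen = set()

    def check_in(self, task):
        """Called by each critical task (e.g. one per COG) at a safe point in its loop."""
        if task in self.critical_tasks:
            self.seen.add(task)

    def service(self):
        """Called periodically by the supervisor.

        Returns True if the watchdog pin should be toggled now, and starts a new
        round of check-ins. Returns False while any critical task is still missing.
        """
        if self.seen == self.critical_tasks:
            self.seen.clear()
            return True
        return False


gate = HeartbeatGate({"sensor", "logger"})
gate.check_in("sensor")
print(gate.service())   # False: "logger" has not checked in yet
gate.check_in("logger")
print(gate.service())   # True: both tasks are alive, toggle the pin
print(gate.service())   # False: a new round has started
```

If any one critical task hangs, `service()` keeps returning False, the pulses stop, and the external watchdog times out. The supervisor itself is still a single point of failure, but if it hangs the pulses stop as well.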

Cluso99 · 2011-01-19 22:30

The Voyager spacecraft have been running for 33 years. And how long were the computers (very basic ones at that) tested on the ground before that! http://story.malaysiasun.com/index.php/ct/9/cid/89d96798a39564bd/id/42178065/ht/Voyager-spacecraft-going-strong-at-age-33/

Now, EPROMs (the erasable ones) also had a problem, where even some labels let enough UV in to enable the fuses to regrow! I had a product that had a battery-backed RAM in the chip. It had a 10 year life expectancy, but after 5 years they failed one by one. Not as you would think, though. They had been left powered on most of the time, but when they had a power fail they lost their code. It turned out we had a catastrophic failure with them all at about 5 years. They were all over Australia, but fortunately the EPROM boot code worked, and we were able to download them all remotely (they were connected via leased lines and stat muxes, so it was easy to reload them when they failed).

So, to your question: no one really knows, because there are unforeseen issues that could crop up. And that is apart from the other issues the others have already mentioned. However, there is probably no reason it will not function for 30+ years. But I bet it's obsolete before then. BTW, MTBF can be calculated by formulae, if you believe in it.
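For what the MTBF formulae are worth, the usual textbook model assumes a constant failure rate, which gives a survival probability of exp(-t/MTBF) and lets the failure rates of parts that must all work be added together. The sketch below shows only that arithmetic. The constant-rate assumption ignores wear-out effects like the EEPROM and electrolytic capacitor issues discussed above, and the function names are made up for illustration.

```python
import math

def survival_probability(years, mtbf_years):
    """Chance of running `years` without failure, assuming a constant failure rate."""
    return math.exp(-years / mtbf_years)

def series_mtbf(mtbfs):
    """Combined MTBF of parts that must all work: failure rates (1/MTBF) add."""
    return 1.0 / sum(1.0 / m for m in mtbfs)

# A part whose MTBF equals the mission time survives it only about 37% of the time.
print(round(survival_probability(30, 30), 3))   # 0.368

# Two parts of 100 years MTBF each, both needed, give 50 years combined.
print(round(series_mtbf([100.0, 100.0]), 6))    # 50.0
```

So, in this model, an MTBF of 30 years does not mean the device can be expected to last 30 years, which is one reason to treat such figures with caution.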

Ale · 2011-01-20 06:07

There is also something else to consider. The term "obsolete" tends to be used to mean "it has to be replaced now because the newer, shinier, still-hot product needs to be bought and this product will fail as soon as yesterday". That some piece of equipment can be replaced with something newer does not mean that it will be replaced, or that it will suddenly stop working or perform worse than it did yesterday, just because there is something new on the market. Look at Cluso's example... You can get something new every 6 months... but you do not have to.
