Reliability - How do I stop intermittent bugs? — Parallax Forums

Kirk Fraser Posts: 364
edited 2013-05-17 13:08 in Propeller 1
I've been working on a hydraulic valve controller using a Propeller, position sensor, and PWM output. This week I transferred the circuitry from my custom PCB to a Propeller Project Board USB and got no response on the serial terminal from the position sensor. A few hours later I tried again, changing nothing, just plugging it in and running the exact same software with F10, and it worked! How do I track down the intermittent mistake and make it work reliably? I don't know if it's in a component, solder joint, software, or my own faith in it working. It has to be reliable or it won't even pass a 2-month function test.

In a past thread I learned to reserve P27 for an LED to show the Propeller is working, since the Propeller Tool can report success programming a dead Propeller chip. It seems I should try to do that for every component, or at least the major ones, then somehow write software that detects when something goes wrong.

I also notice the Propeller Project Board USB contains many more components than the simple Propeller circuit in the educational documentation: at least 2 unidentified chips, many SMDs which could be capacitors or resistors, and other parts I can't identify. Is there a new kind of recommended design strategy which improves reliability over the simple circuit?

Please share any ideas. Thanks.

Comments

  • Mike Green Posts: 23,101
    edited 2013-05-17 11:10
    In the process of transferring circuitry from one board to another, all sorts of problems and mistakes can occur, everything from wiring errors to cold solder joints. There's no magic in trying to debug the hardware. You have to painstakingly go over everything, possibly reheat connections, etc. There's a unique skill in going over software or hardware that you've designed. You have to pretend that you've never seen the stuff before, leave all of your assumptions behind, and look with fresh eyes.

    The Propeller Project Board USB is much more complex than the original designs for the programming interface and power supply. It uses a switching regulator for the 3.3V supply. While more complex than a linear regulator, it's more efficient, produces much less heat, and tolerates a higher input voltage (because of the lower heat dissipation). The USB interface is built in (rather than separate as with the PropPlug).

    For your information, the Propeller Tool will not report success programming a dead propeller chip. If the chip is dead, the Propeller Tool will not find a Propeller on-line. Some portions of the Propeller need not work to be able to program it. For example, the programming process doesn't need a crystal or external clock. The Propeller's PLL can be damaged and it will still respond correctly to the Propeller Tool. Any or all I/O pins except for 28-31 can be damaged and the Propeller will still program its EEPROM. These are all known and are simply a consequence of the limited Propeller features needed for use with the Propeller Tool. The Propeller Tool can be used to load and execute a more complete test program. That, along with a hardware test rig, is how Parallax tests Propeller boards.

    Reliability is a complex issue. There are many ways to improve it. Often simple circuits are best. There's less to go wrong, fewer connections to have to be concerned about. On the other hand, sometimes careful use of complexity can help. Switching regulators are one example. It used to be that these were considered to be very complex. Now there's a variety of manufacturers making chips that are easy to use and they provide reference designs and lots of documentation. There are only a few parts required beyond what might be needed for a linear regulator and a lot of advantages in using a switching regulator. Similarly, there are a variety of USB to serial chips available, sometimes 2nd or 3rd generation. They require very few external components and many of the oddities of the earlier generations have been fixed.
  • LoopyByteloose Posts: 12,537
    edited 2013-05-17 11:58
    I wish I could offer a quick fix, but one just gains a lot of exotic skills with years of experience to avoid mysterious bugs.

    One of my most frequent problems is tiny solder bridges. These cannot be seen with the naked eye, maybe the size of a cat's whisker... but they short out power or bridge logic I/O over to the adjacent pins. These days, after I solder up a project board, I run an Xacto knife blade between nearby solder joints to cut all the unseen bridges I can before I go looking for them. It is faster and easier than looking over the board under a 20x jeweller's loupe.

    And if you are using a new board, you just have to learn everything there is to know about it. Sometimes it requires tracing with a VOM, even after an in-depth study of the schematic.

    I can understand your frustration and feel your pain, but we all go through this from time to time.
  • Heater. Posts: 21,230
    edited 2013-05-17 12:06
    You need to divide and conquer. Make sure each little part works and then start to integrate working parts into the whole.

    Hardware: Start off by checking and double checking everything. I'd suggest producing a little test program for each hardware feature and exercising those features one at a time until you are confident they work reliably. Then see what happens when you start exercising features together, building up to a full all-singing, all-dancing hardware test.

    Software: Again, strip the thing down to a single function and get that running reliably. If your program is written in a nice modular style you can exercise all the parts on their own. You will probably want to create some little "test harness" code to wrap around and exercise each module.

    When all the modules work independently you can start to combine them into the complete program, one at a time, testing each new build.
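As a rough illustration of the "test harness" idea above, here is a minimal host-side sketch in Python (not Propeller code). The module under test, `scale_sensor`, and its 12-bit input range are hypothetical stand-ins for one function of the valve controller. The harness just runs a table of known inputs and collects any mismatches.

```python
# Hypothetical module under test: convert a raw 12-bit sensor reading
# (0..4095) to a position in percent. The name and range are assumptions.
def scale_sensor(raw):
    if not 0 <= raw <= 4095:
        raise ValueError("raw reading out of range")
    return raw * 100 // 4095


def run_harness(func, cases):
    """Run func on each (input, expected) pair and return the failures."""
    failures = []
    for value, expected in cases:
        try:
            actual = func(value)
        except Exception as exc:  # a crash counts as a failure, not a stop
            actual = repr(exc)
        if actual != expected:
            failures.append((value, expected, actual))
    return failures


# Known-good input/output pairs for the hypothetical module.
CASES = [(0, 0), (4095, 100), (2048, 50)]

if __name__ == "__main__":
    for value, expected, actual in run_harness(scale_sensor, CASES):
        print(f"FAIL: input {value}: expected {expected}, got {actual}")
```

An empty failure list means the module passed every case. Keeping the cases in a table makes it cheap to add one each time a new bug turns up.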

    By this point you should have a working system or found where your issue may be.

    Except of course for those really pesky issues, like that cold solder joint that looks fine but is intermittent.

    At the software end, intermittent problems are often caused by things like two tasks that share data accessing it at the same time, corrupting it or getting half new and half old data. Check your stack sizes too. And a bunch of other things that have been driving us all mad over the years :)

    Be methodical and don't panic.
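The "half new and half old data" problem mentioned above can be shown on a PC with a minimal Python sketch. This is an illustration only: on the Propeller the equivalent would be written in Spin or C, typically using the chip's hardware locks. The `SharedReading` class and its two fields are hypothetical. The writer updates both fields under a lock and the reader copies both under the same lock, so it should never see a mismatched pair.

```python
import threading


class SharedReading:
    """Two values that must always be read as a matching pair."""

    def __init__(self):
        self._lock = threading.Lock()
        self._position = 0
        self._negated = 0  # hypothetical companion field: always -position

    def write(self, position):
        with self._lock:  # both fields change together or not at all
            self._position = position
            self._negated = -position

    def read(self):
        with self._lock:  # take a consistent snapshot of both fields
            return self._position, self._negated


def count_torn_reads(shared, writes=20000):
    """Run a writer thread while reading; return how many pairs mismatched."""
    def writer():
        for i in range(writes):
            shared.write(i)

    thread = threading.Thread(target=writer)
    thread.start()
    torn = 0
    while thread.is_alive():
        position, negated = shared.read()
        if position + negated != 0:  # the pair no longer matches
            torn += 1
    thread.join()
    return torn
```

With the lock in place `count_torn_reads(SharedReading())` should return 0. Without it, a reader can run between the two assignments in `write` and pick up one new value and one old one, and because that depends on timing it only shows up now and then.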
  • SRLM Posts: 5,045
    edited 2013-05-17 13:08
    I like the book Debugging by David J Agans. It has lots of useful debugging ideas, and it's a fun read.

    Other than that, I don't have anything new to recommend. Just challenge all assumptions, go over everything again, divide the problem space so you can binary search down to the problem, and don't try to do too much at once.
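The "binary search down to the problem" advice can be sketched in a few lines of Python on a PC. This assumes the build-up steps (or changes) can be put in a list where everything before some point works and everything from that point on fails. `first_bad` and the `is_good` callback are hypothetical names, not part of any tool.

```python
def first_bad(steps, is_good):
    """Return the index of the first step for which is_good is False.

    Assumes steps[0] is known good and steps[-1] is known bad, and that
    once a step is bad every later step is bad too.
    """
    lo, hi = 0, len(steps) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_good(steps[mid]):
            lo = mid  # the fault was introduced after mid
        else:
            hi = mid  # the fault was introduced at or before mid
    return hi
```

For example, with the steps `["power only", "add sensor", "add serial", "add PWM", "full program"]` and a fault that first appears at "add serial", the function tests "add serial" and then "add sensor" and returns index 2. With an intermittent fault, `is_good` would need to repeat its check enough times to be trusted before each decision.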