Best practices for finding bugs in a Propeller Project

Mag748 · 2012-11-07 11:42

Hello,

I have been spending a lot of time recently working on a project. Right now, the product works fairly well, and I am fairly happy.

The only problem is that the Propeller will consistently freeze at some point during its use. I am having difficulty pinpointing the cause of the problem.

Are there any recommendations for creating logs within the Prop to determine what could be causing things to go wrong? What do other people do when working on large-scale projects where many things are happening at once?

Thanks for any input.

-Marcus

Martin_H · 2012-11-07 11:46

What language is it written in?

I try to keep my background tasks fairly simple and debug them first. I then have a main loop in the main cog which handles most of the complexity and delegates the details to the other cogs. For Spin I find a serial terminal is a big help, and I use carefully placed output messages to monitor program state during execution.

DynamoBen · 2012-11-07 11:56

There are no best practices in testing; you have to adapt your approach based on the context of the problem you are seeing. Some things I've done in the past include serial debugging, code reviews/having someone else try the code, and bypassing questionable or suspected code. Throwing serial statements in to give you a clue where in your code things are going wrong can help a lot if you know roughly where the failure is happening. Having someone else do a code review or run your code can highlight potential problems, but they can't review everything, so you would need to have an idea where it might be failing. You might also try bypassing code you think might be an issue; if you bypass the offending code, it won't freeze anymore.

Beyond code debugging, a "freeze" could be hardware. All it takes is a pin that is floating that shouldn't be and things will act strangely or stop.

Rayman · 2012-11-07 12:11

I often use FullDuplexSerial to send debug info over a USB serial link to a PC running the Parallax Serial Terminal...
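A minimal sketch of that kind of serial debug output, assuming the FullDuplexSerial object from the Propeller Tool library, a 5 MHz crystal, and the programming port on pins 31/30. The variable `state` is a made-up stand-in for whatever you want to watch.

```spin
CON
  _clkmode = xtal1 + pll16x          ' assumes a 5 MHz crystal (80 MHz system clock)
  _xinfreq = 5_000_000

OBJ
  ser : "FullDuplexSerial"           ' standard Propeller Tool library object

VAR
  long state                         ' hypothetical variable being monitored

PUB Main
  ser.start(31, 30, 0, 115_200)      ' rx pin, tx pin, mode, baud rate
  waitcnt(clkfreq + cnt)             ' give the terminal a second to connect

  repeat
    state++                          ' stand-in for real work
    ser.str(string("state="))
    ser.dec(state)
    ser.tx(13)                       ' carriage return starts a new line
    waitcnt(clkfreq / 2 + cnt)
```

Set the Parallax Serial Terminal to the same baud rate as the `start` call.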

If you have a really big program, one thing that people often run into is running out of stack space.
You also need to reserve stack space if you launch a new cog.
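For reference, reserving stack for a new Spin cog looks roughly like this. The size of 32 longs and the pin number are assumptions for illustration; the right size depends on how deeply the launched method nests its calls.

```spin
VAR
  long blinkStack[32]                ' assumed size; measure real usage and adjust

PUB Main
  cognew(Blink(16, clkfreq / 4), @blinkStack)   ' Spin method plus its own stack
  repeat                             ' main cog carries on here

PRI Blink(pin, ticks)
  dira[pin]~~                        ' pin is an output in this cog
  repeat
    !outa[pin]
    waitcnt(ticks + cnt)
```

If the stack array is too small, the new cog writes past the end of it into whatever variables follow.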

BTW: If you're using a Quickstart, don't touch the bottom of the board with your fingers or it may freeze...

Mag748 · 2012-11-07 12:52

I think one of my problems is that my "Main Loop" has gotten out of hand. Since I've been adding things, I haven't taken the time to break out sections of the code into simple, tested methods.

I think I have to tone down the FDS usage and figure out a more efficient and helpful way of getting the important information. Most of the time it's just spitting out repetitive information that becomes miles long after running for extended periods of time.

Also, I've been in mad-scientist mode for most of this project, so I'm afraid to say I'm a little scared of a lot of the code, and quite surprised sometimes when it does work at all.

I really hope it's not a hardware issue, since that would be very hard for me to debug. I am using the Protoboard, but it's mounted in an enclosure so I can't touch it while it's running.

I did just check how much stack space I had available, and it was only 89 longs. I have been running stress tests on the program, which is filling it to the brim, but I removed a few things and have over 400 now, so I will continue testing with that.

From what I gather, things to keep in mind are:

* Segment large "main loops" into smaller methods, and test each one of those methods until you are sure there are no issues at that level.
* Isolate questionable code and try to reproduce the issue
* Have someone else review and run code that you think may be buggy.
* Make sure you have enough Stack Space
* Use FDS to display pertinent information during runtime.
* Ensure there are no hardware issues that would cause floating pins.

Thanks,
Marcus

msrobots · 2012-11-07 14:01

I use JDcogSerial (OBEX) for debug because it is very easy to use from Spin and from other PASM cogs. In PASM this is very helpful.

I also usually have a second Spin cog running a simple watchdog loop where I can put tests like "output 'whatever' on serial when variable X changes".

So I do not need to spread the debugging info all around my main program-code.
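A rough sketch of such a watchdog cog. It uses FullDuplexSerial rather than JDcogSerial only because that is the interface I can write from memory; `watched`, the stack size, and the pin numbers are assumptions.

```spin
CON
  _clkmode = xtal1 + pll16x          ' assumes a 5 MHz crystal (80 MHz system clock)
  _xinfreq = 5_000_000

OBJ
  ser : "FullDuplexSerial"

VAR
  long watched                       ' hypothetical variable of interest
  long dogStack[64]                  ' assumed size for the watchdog cog

PUB Main
  cognew(WatchDog, @dogStack)
  repeat
    watched := cnt >> 26             ' stand-in for the real program changing it

PRI WatchDog | last
  ser.start(31, 30, 0, 115_200)      ' serial is started and used only from this cog
  last := watched
  repeat
    if watched <> last               ' report only on change, not on every pass
      last := watched
      ser.str(string("watched="))
      ser.dec(last)
      ser.tx(13)
```

Because output happens only when the value changes, the log stays short even over long runs.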

A beautiful example for a watchdog like this is the new monitor of Propeller II.

Enjoy!

Mike

Heater. · 2012-11-07 14:12

Best practice for finding bugs? There are none. Every situation is different. But:

1) Divide and conquer - split your program into small parts that can each be tested in isolation. Test each part in isolation; you will need a little test harness to exercise each part. (It may not be so little: a test of all possible inputs to a piece of code can easily be bigger than the code itself.)

2) Test that the component does not do what it should not do. Does it write to memory that it should not, for example?

3) Perturbation testing. Sometimes I have had weird timing issues in programs to do with interrupts or two tasks accessing the same data structure without suitable exclusion in place. By increasing or just changing the execution time of different parts one can home in on the culprit.

4) The Linus Torvalds approach. Study all the code, without running it, until you understand what is going on. Linus hates debuggers and such crutches.

prof_braino · 2012-11-08 18:10

Exactly like Heater said, except there are many, and they are universal. Best practices for finding bugs are centered around common sense. These always apply.

* keep it simple
* think about the design, if you don't get the design right, you WON'T get the code right.
* if you have problems with the code, check the design
* make sure the little stuff works
* divide and conquer
* break each giant impossible problem into a bunch of small easy problems. Solve each one until there's none left (you'd be surprised how many folks try to skip this, and complain that stuff doesn't work).
* after you get it working, check the design, and check the code. If you say "what idiot wrote this?" change it. Repeat. (this is my favorite step).
* don't optimize until later. When you feel the need to optimize, don't optimize till later. When you do optimize, only optimize something that actually needs optimization.

Mike Green · 2012-11-08 18:29

From a debugging mechanics standpoint, I like to have a display with "significant" information on it ... pointers, status bytes, indicator bytes, etc. I often will use a dedicated video output. The 1-pin TV driver is great for this as long as there's enough memory for the buffer and routines. For PASM debugging, I will use the same driver that I'll start up in a cog, then have very short PASM routines to write bytes and hex values directly to the display buffer from the cog being debugged. Another option is the display routine from Sphinx which keeps the display buffer in the cog that's generating the video and communicates via a one long "mailbox".

Phil Pilgrim (PhiPi) · 2012-11-08 19:13

Program quality begins in the planning and writing phase. If you're debugging a large program, it's probably too late, and you should consider starting over. It took me years to recognize three simple principles of coding:

1. Don't write and enter the whole program at once. Write as little as you have to in order to guarantee one small success. Then build on that success, one little bit at a time. Write and test, write and test, write and test. It's not only more reliable to code this way, but incremental successes -- no matter how small -- are very satisfying!

2. If, while testing your small, incremental additions, something doesn't go the way you expect it to, it's tempting to think about, and dwell upon, the stuff you did right. "Well this should work!" But why make that assumption if it doesn't work? Put your mind in critic mode, as if you're trying to investigate someone else's code. Better still, if you still think it should work, try explaining it to someone else. They don't even have to understand a word of what you're saying: it's the act of explaining that will lead you to the solution, not the other person's response.

3. Never be afraid just to start over from scratch. At some point in the development, you might think, "If only I had seen this sooner!" or "If only I had done it this way!" Well, instead of beating a dead horse, do it that way. You'll probably be much happier in the end and will have learned a lot in the process.

As a matter of personal taste, I loathe debuggers. To me, they're nothing but crutches that promote bad habits. Serial debug output is good and so is a scope if you have one.

-Phil

Mike Green · 2012-11-08 20:01

I mostly agree with Phil, certainly regarding #2 and #3. Critic mode is very important. You have to be able to look at the code as if someone else wrote it and you know how it works only by looking at it. Starting over from scratch often will result in better, more efficient and reliable code. You're not really starting from scratch ... you've got the advantage of doing it once.

Sometimes you have to write enough of the program to fill out the structure. It does help to plan things so you can test it piece by piece.

Most of the time full featured debuggers are not very useful. If you've designed the program carefully, there are probably only a few variables that you need to monitor to see how things are working. It's very useful to have a tool that can set breakpoints at specific points in the program and display the values of specific variables, but breakpoints don't have to be general purpose. A simple loop waiting on the state of an I/O pin will do just fine. You should know where you need to put them. You may compile in additional ones as you narrow in on specific problems. Attach a pushbutton with a pullup and a capacitor for debouncing and you're set. If you have a board with some indicator LEDs, you can often use those to give you feedback on the state of variables. The QuickStart Board is nice this way. It has indicator LEDs and you can use the touch switches instead of pushbuttons.
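One way such a pushbutton breakpoint could look, as a sketch only. The pin numbers are assumptions, and the button is assumed to pull the pin low when pressed, with hardware debouncing as described above.

```spin
CON
  BREAK_PIN = 0                      ' assumed: pushbutton to ground with a pull-up
  LED_PIN   = 16                     ' assumed: indicator LED

PUB Main | i
  dira[LED_PIN]~~
  repeat i from 0 to 9
    ' ... work under test goes here ...
    BreakPoint

PRI BreakPoint
  outa[LED_PIN]~~                    ' show that we are halted
  repeat until ina[BREAK_PIN] == 0   ' wait for the button press
  repeat until ina[BREAK_PIN] == 1   ' ...and for the release
  outa[LED_PIN]~
```

Drop a `BreakPoint` call wherever you want the program to stop, and print or display the variables of interest just before it.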

MagIO2 · 2012-11-09 02:02

One big source of bugs is that you often have limits within which the code works. If those limits are exceeded, the code will stop working properly. The point is that usually you don't check for the limits!
What limits? For example:
* the index of an array (in the Java world you'll get an index-out-of-bounds exception; in the Propeller world anything can happen, such as overwriting variables or code)
* the wait time for a waitcnt (which might be the reason for a COG running well for a while and then pausing for ~50 sec)
* the number of nested function calls (limited by the stack)
* calculations produce a valid result only if they never exceed the variable's limits (so watch out especially when doing multiplications and divisions)

So, one possibility for increasing the stability of the SW is to at least document those limits, and even better, check them beforehand! (Especially in objects)
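Two of these checks, sketched in Spin. The ~50 sec pause comes from the 32-bit system counter: at 80 MHz it wraps about every 53.7 seconds, so a waitcnt whose target has already passed waits for nearly a full wrap. `BUF_SIZE` and `MIN_TICKS` are illustrative values, not measured ones.

```spin
CON
  BUF_SIZE  = 16
  MIN_TICKS = 400                    ' assumed floor; Spin waitcnt needs a few hundred clocks

VAR
  long buffer[BUF_SIZE]

PUB Main
  Store(3, 42)                       ' accepted
  Store(99, 42)                      ' rejected, nothing is overwritten
  Pause(10)                          ' too short, raised to MIN_TICKS

PUB Store(index, value) : ok
  if index < 0 or index => BUF_SIZE  ' reject instead of overwriting a neighbour
    return false
  buffer[index] := value
  return true

PUB Pause(ticks)
  waitcnt((ticks #> MIN_TICKS) + cnt)   ' #> is the "limit minimum" operator
```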

prof_braino · 2012-11-09 04:57

Phil Pilgrim (PhiPi) wrote: »

Program quality begins in the planning and writing phase. If you're debugging a large program, it's probably too late, and you should consider starting over. It took me years to recognize ....

An aside: Estimating Schedules.

You can only accurately estimate doing something you've done before. If you attempt to estimate a schedule for something completely new, you are just pulling numbers out of your butt. I see this constantly on new development, and it's not pretty. The more "new" something has, the less reliable your estimate.

On the other hand, you CAN estimate how long it could take to "get a handle" on something new. That is, slap together a basic driver and UI. At the end of that period, you MIGHT be in a position to make an estimate on how long development would take, based on the time and effort of the R&D cycle. The danger is that management will tend to say "just go with the prototype, that already looks like it works". NEVER ship the prototype. This also ends in disaster, not always, but I've never seen one that didn't.

Kind of a tangent, but related to finding bugs, in that the whole technique is preventing them from happening in the first place.

T Chap · 2012-11-09 05:20

When there is a consistent freeze with no logical explanation for it, that is always a stack overrun in my cases. There is a simple way to find out if you are running over the stack space, but this is assuming that your program is in fact even using allocated stack space for other cogs to load in. Since you stated there are 'many things going on at once', then it is assumed you have multiple cogs and have allocated stack space. You could set up a simple test for stack overrun by placing a test variable immediately after the stack space in the var section. For example:

VAR LONG stackspace[24], stacktester1, stack2[16], stacktester2, x, y, z 'etc.

PUB CheckStackOverrun
   ' guard longs start at 0; if one changes, the stack declared just before it has overrun
   if stacktester1 <> 0
       'indicate the overrun using the stack name on LCD, piezo, serial terminal etc.
       repeat   ' lock up the code here and take note of which stack was overrun
   if stacktester2 <> 0
       'indicate the overrun using the stack name on LCD, piezo, serial terminal etc.
       repeat

Place this debug code in a free cog if one is available, else you will have to include it in a loop somewhere. What happens on the stack overrun is that the cog is exceeding the allocated stack space and encroaching on the next variable to the right of the stack array. This means other parts of your RAM are getting written over.

One method I like to use for debugging is to have a piezo buzzer connected to a pin and a method called Beep. I use several different frequencies on the piezo for different indications(beep1000, beep2500 etc), even some multiple beeps that give a nice audible as to where the code is at and what it is doing. You could drop in some beeps at different points to let you know where things are happening which may help nail down the precursor to the crash. I also like using the piezo so that I can hear what is going on over the phone with clients, I create several beep patterns using sequences and different frequencies to know what is happening. The piezo does not require a cog.
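A simple way to get such a beep without a dedicated cog is to toggle the pin from the calling code. This is only a sketch of the idea, not the poster's actual Beep method; the pin number and the blocking approach are my assumptions.

```spin
CON
  PIEZO = 7                          ' assumed pin with the piezo attached

PUB Main
  Beep(1000, 100)                    ' e.g. "reached init"
  Beep(2500, 100)                    ' e.g. "entered main loop"

PUB Beep(freq, ms) | halfPeriod, cycles
  halfPeriod := clkfreq / (freq * 2) ' clock ticks between pin toggles
  cycles := freq * ms / 1000         ' number of full cycles to play
  dira[PIEZO]~~
  repeat cycles * 2                  ' two toggles per cycle
    !outa[PIEZO]
    waitcnt(halfPeriod + cnt)
  dira[PIEZO]~                       ' release the pin when done
```

The caller is blocked for the length of the beep, so keep the beeps short in timing-sensitive code.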

Mike G · 2012-11-09 05:38

I live by Phil's #1 suggestion. Over the last 2 years I have been using test driven development. Basically, replace the debugger with unit test code. It's pretty simple, write a logical block of code to accomplish some task. Then write code to test the logical block. This does two things, 1) forces you to write small concise code with expected inputs and outputs 2) provides a new perspective - "How would I break this logical block?".

The knee jerk reaction to test driven development is - That's crazy and double the amount of code. It is true, you end up writing extra code. Code not used in a production environment. However, the code produced is high quality and well tested.

Like anything, test driven development takes practice. Speaking from experience, it is not uncommon for me to write a day's worth of code, never touch the debugger, and have the code work out of the box.
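To make the idea concrete, here is a small sketch of a test harness in Spin. `Clamp` is a made-up unit under test and `Check` is a hypothetical helper; results go out over FullDuplexSerial on the assumed programming-port pins.

```spin
CON
  _clkmode = xtal1 + pll16x          ' assumes a 5 MHz crystal (80 MHz system clock)
  _xinfreq = 5_000_000

OBJ
  ser : "FullDuplexSerial"

VAR
  long failures

PUB RunTests
  ser.start(31, 30, 0, 115_200)
  waitcnt(clkfreq + cnt)             ' give the terminal a second to connect
  failures := 0
  Check(string("clamp low"),  Clamp(-5, 0, 10), 0)
  Check(string("clamp mid"),  Clamp(7, 0, 10), 7)
  Check(string("clamp high"), Clamp(99, 0, 10), 10)
  ser.str(string("failures: "))
  ser.dec(failures)
  ser.tx(13)

PRI Clamp(value, lo, hi)             ' hypothetical unit under test
  return value #> lo <# hi           ' limit minimum, then limit maximum

PRI Check(name, actual, expected)
  ser.str(name)
  if actual == expected
    ser.str(string(": pass"))
  else
    failures++
    ser.str(string(": FAIL, got "))
    ser.dec(actual)
  ser.tx(13)
```

Keeping the tests in a separate top-level file means none of this code ends up in the production build.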

For code that already exists, I use divide and conquer.

Heater. · 2012-11-09 06:31

I'm all for test driven development or at least lots of test units as described. My experience in the aerospace industry showed me that there can be far more lines of code in the test harnesses than the code being tested, which I guess is why people tend not to bother. It becomes vital when you have a large project that is going to be developed by a team over perhaps years. Then you can run your test cases over every new version or release and be sure that nothing that used to work in the older versions has been broken by the changes creating the new version. Which is not to say that it can't be helpful in a one man project.

Mag748 · 2012-11-10 16:04

I agree with the idea of stopping what you're doing if you look at your code and say, "What idiot wrote this", because that's exactly what I did with the first version of this project. It was essentially one big mess. This second version, which I did start from scratch, uses state machines (two of them, each in its own cog). The "master" state machine feeds data to the secondary state machine and monitors it for updates on its end. I have found this works quite well.

I found this post by Tracy Allen to be cool and I think it will be a great tool for debugging:

Re: Make an LED pulse (smoothly go on and off) without any code interaction (forum post 1138869)

This will allow me to create signals with an indicator while the program is running (freeing extra cogs). Originally I was skeptical about doing this, since I realized that this counter-driven status LED would continue indicating away even though the program "goes off into the weeds", as Tracy Allen puts it. Clearly this wasn't good, but what I plan to do now is have the LED blink twice or so, then stop for a while, just like a heartbeat. This must be a really common example now that I think about it.
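One possible reading of that heartbeat idea, sketched without the counter trick from the linked post: the LED is updated only when the main loop calls `Heartbeat`, so it stops changing if the loop hangs. The pin number and timing are assumptions.

```spin
CON
  _clkmode = xtal1 + pll16x          ' assumes a 5 MHz crystal (80 MHz system clock)
  _xinfreq = 5_000_000
  LED      = 16                      ' assumed indicator pin

PUB Main
  dira[LED]~~
  repeat
    ' ... one pass of the real main loop goes here ...
    Heartbeat

PRI Heartbeat | phase
  phase := (cnt >> 22) & 15          ' 16 steps of about 52 ms each at 80 MHz
  outa[LED] := (phase == 0) or (phase == 2)   ' two short flashes, then a pause
```

A frozen LED, whether stuck on or off, then means the main loop has stopped making passes.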

I never really learned how to take a step by step approach to designing a complex application. I love the mad scientist approach, which to me is non-stop coding until I can't take it anymore, then run it and go back and band-aid anything that doesn't work. With the second version of this application, I am realizing that if I had taken the time to completely, or even partially, design this from the beginning, I would have had more time to write the code only once, and would not have needed the time to go back and affix band-aids.

The bright side is that I can still take this project and apply many of these advised techniques and turn it into a very reliable and professional application. And I look forward to doing that. Divide and conquer is the way I plan to go at the moment. I will try out some test driven code as well.

Thanks,
Marcus

Mag748 · 2012-11-10 16:07

T Chap wrote: »

When there is a consistent freeze with no logical explanation for it, that is always a stack overrun in my cases. There is a simple way to find out if you are running over the stack space, but this is assuming that your program is in fact even using allocated stack space for other cogs to load in. Since you stated there are 'many things going on at once', then it is assumed you have multiple cogs and have allocated stack space. You could set up a simple test for stack overrun by placing a test variable immediately after the stack space in the var section.

I have been using the Stack Length object in the Propeller library for determining stack sizes for my home-brewed methods that run in their own cogs. I never thought about using it for the main stack space. It didn't actually occur to me that that would ever be an issue, but I will make sure to keep it in mind now.

Thanks,
Marcus

Tracy Allen · 2012-11-11 10:45

In objects that have aborts (such as an SD card driver), be absolutely sure that you provide code to handle them gracefully.
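In Spin, an abort that nobody traps unwinds the whole call stack and the cog stops, which can look just like a freeze. A minimal sketch of trapping one with the `\` operator follows; `ReadConfig`, `CardPresent`, and `UseDefaults` are made-up names standing in for a real driver.

```spin
PUB Main | err
  err := \ReadConfig                 ' "\" traps an abort raised anywhere below this call
  if err < 0
    UseDefaults                      ' recover instead of letting the cog stop
  repeat
    ' ... rest of the program ...

PRI ReadConfig
  if not CardPresent
    abort -1                         ' hypothetical error code
  ' ... normal read path would go here ...
  return 0

PRI CardPresent
  return false                       ' stub: pretend the card is missing

PRI UseDefaults
  ' stub: load built-in settings
```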

I/O devices and things that deal with communication channels are subject to glitches and unexpected responses. "What would happen if this device is suddenly disconnected or short circuited or loses a bit here or there?" Allow for timeouts. If you are using an object from the OBEX, you can't assume that it covers all the bases or that it will play nice with your application. Testing is one approach, and another is to dig in to be sure that it does in fact have timeouts or aborts associated with any code that waits for a certain response.

The autonomous LED is very useful in field settings where you may have to ask someone else to tell you at a glance what state a system is in, or when the system is untethered in operation. Most useful in main programs that have to run through a sequence of tasks; each can be bracketed by a unique blink pattern.
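As a small illustration of a wait with a timeout, FullDuplexSerial's `rxtime` returns -1 if no byte arrives within the given number of milliseconds. The request byte and the 100 ms limit below are assumptions for the sketch.

```spin
CON
  _clkmode = xtal1 + pll16x          ' assumes a 5 MHz crystal (80 MHz system clock)
  _xinfreq = 5_000_000

OBJ
  ser : "FullDuplexSerial"

PUB Main | c
  ser.start(31, 30, 0, 115_200)
  repeat
    ser.tx("?")                      ' hypothetical request byte
    c := ser.rxtime(100)             ' wait at most 100 ms for the reply
    if c == -1
      ser.str(string("timeout", 13)) ' -1 means nothing arrived in time
    else
      ser.hex(c, 2)
      ser.tx(13)
    waitcnt(clkfreq + cnt)
```

The blocking `rx` method, by contrast, waits forever if the other end never answers.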

cavelamb · 2012-11-11 11:47

The Mythical Man-Month: Essays on Software Engineering...

An interesting and entertaining book on just this very subject.
It challenges the notion that all men and all months are created equal.

Written by Fred Brooks, who headed the OS/360 project, which
some say took over 1 million man years to accomplish!

Mr. Brooks' central theme is that "adding manpower to a late software project makes it later".
(Brooks's Law)

And other interesting ideas, such as the second-system effect (I've been guilty of that one)
and strong advocacy of prototyping.

Not a "how to debug your code" manual, but highly recommended.

Alex.Stanfield · 2012-11-13 18:54

Besides all the excellent recommendations above I love to use a logic analyzer for debugging/testing/measuring. Just the simplest/cheapest will save hours in any complex project.
I go by:
1) Find a free pin, or free one up for testing
2) Set that pin to output in the cog you are inspecting
3) Set pin to 1 on any significant event (entering a subroutine, calling waitcnt, or whatever you need to analyze)
4) reset pin to 0 after the event

By using this method you will be able to:

- Check your code is getting executed
- Measure your code's speed
- Spot waitcnts that arrive late (creating hangs for ~50secs)
- Spot conditions when your code is being called at rates different than you expect.
- Spot protocol problems (specially if your logic analyzer decodes the protocol)

This tool is hard to ignore once you try it. Besides, it's trivial to set up in any language and in any part of the code, with no new objects to add and a very low footprint.

On the other hand, when you don't have a clue where the problem is and you are running Spin in several cogs, I include my object "SYSLOG" (http://obex.parallax.com/objects/864/). It logs messages onto an SD card AND a serial port, with timestamps and the cog numbers involved.
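The four steps above can be sketched in Spin like this. The pin number is an assumption, and `DoWork` stands in for the code being measured.

```spin
CON
  PROBE = 8                          ' assumed free pin wired to the logic analyzer

PUB Main
  dira[PROBE]~~                      ' must be set in the cog being inspected
  repeat
    outa[PROBE]~~                    ' high: event starts
    DoWork
    outa[PROBE]~                     ' low: event ends

PRI DoWork
  waitcnt(clkfreq / 1000 + cnt)      ' stand-in for the code being measured
```

The width of each high pulse on the analyzer is the execution time of the bracketed code, and the gaps between pulses show how often it is called.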

Hope it helps

Alex

Mag748 · 2012-11-14 08:40

Alex, thanks for this addition. I had just come across a post by kuisma that touches on this subject of "syslog" as well. It uses a server on another computer to send logs using the Spinneret:

Re: W5100 Sn_SR undocumented misfeature / bug (forum post 976729)

I will look into using syslog on my next project.

Thanks,

T Chap · 2012-11-15 20:50

I had been seeing a strange crash lately, only in the case of receiving data from an RS485 port, which comes into the Prop on one port of the 4 port object. I added this code to a repeating loop and immediately saw the array that was being overrun. I bumped up the array size until it was no longer being exceeded.

This is crude and slows down the loop but it is not intended to stay in the program permanently if speed is a concern for the loop it is living in.

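VAR LONG MTRStack[28], a1, iTripSetArray[18], a2, ProfileStack[36], a3, PingStack[48], a4, KPStack[48], a5
       'etc more vars


PUB StackCheck
    ' a1..a5 are guard longs placed after each array; any nonzero guard means an overrun
    if a1 or a2 or a3 or a4 or a5   ' a plain sum could cancel out or go negative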
       Beep2500
       GoZeroClear
       ser.str(3, string("Stack Overflow")) 
       go(1, 0)   ' LCD bottom row, move to col 0
       ser.dec(3, a1)
       go(1, 3)    'col 3
       ser.dec(3, a2)
       go(1,6)
       ser.dec(3, a3)
       go(1, 9)
       ser.dec(3, a4)
       go(1, 12)
       ser.dec(3, a5)

Joerg · 2013-03-05 13:41

Hi

I have been using a lot of microprocessors and I just love the Propeller chip. The only thing I miss is a good simulator! I have also developed my own OS for Freescale processors in assembly and C, and thanks to the very good simulation tools from Freescale my system is working on a lot of MCUs. I recently developed a 32-bit communication protocol to address different Propeller chips with only two ports (and some pullups!) and a second layer with the mechanism for requesting and answering data. Since all of my modules are written in assembly (yes, I am an oldie!! [1958]; I still remember the times when I had to program an 8085 CPU with DIP switches!) I had to write some code to debug with 8 LEDs connected to one 8-bit port. This is more or less the style of the 1980s, but it also works with modern hardware.
If somebody is interested in my hardware, have a look at systech-gmbh.ch or mail me (the address can also be found on this site).

So finally my old wish remains: create a powerful simulation tool with debugging capabilities, or a hardware-level tool that does the same!

Saluti (greetings in Italian)
Joerg

Christof Eb. · 2013-03-06 03:28

Hi, J
