Propeller and this involuntary shut down? — Parallax Forums


grasshopper Posts: 438
edited 2008-12-26 16:15 in Propeller 1
I am running USB communications to a custom Propeller PCB with some extensive code. My problem is that after about 4 hours the Propeller reboots itself. Can someone give me some ideas as to why this may be? Some of the details are as follows:

1. USB to PC running 64bit Vista at 19200 baud with the FTDI IC
2. Running an 80 MHz Propeller
3. Sending data to PC every 15 seconds


Some things I have considered are resetting all my variables at the start of each method and making sure that I don't overfill a memory space over time.

All ideas are welcome. Thanks.

Comments

  • Mike Green Posts: 23,101
    edited 2008-12-23 16:33
    Sounds like some kind of array overflow or stack overflow. Typically other variables get overwritten or possibly code gets overwritten and this sort of thing cascades until the program eventually causes a reboot.
  • grasshopper Posts: 438
    edited 2008-12-23 16:39
    How do I go about finding this? So far I reset all my variables. Could it be the FullDuplexSerial object? This is the only object that I have not even looked into.
  • Baggers Posts: 3,019
    edited 2008-12-23 16:45
    How big are your code, data, and stack space?

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    http://www.propgfx.co.uk/forum/ home of the PropGFX Lite
  • grasshopper Posts: 438
    edited 2008-12-23 16:48
    Well, honestly I am not sure how big my code is. I can say that I have over 1600 lines of code in one object and some 500 in the others. Is there a way that the compiler can tell me? I figured that it was a shady area using the Propeller and the Propeller Tool.

    *Edited*

    Program: 2,315 longs
    Variable: 260 longs
    Stack / free: 5,613 longs
  • Mike Green Posts: 23,101
    edited 2008-12-23 16:58
    The problem is likely in your program somewhere, not in anything else. Unfortunately, tracking it down is a painstaking task. It helps to have someone else help you because most people have a hard time temporarily forgetting how their program is supposed to work so they can focus on what the code actually does.

    You need to look at any of your code that stores into an array or stores using a pointer. You also have to make sure that you don't have some recursion somewhere (where a routine calls itself, possibly through a long chain of other calls).

    If you're using more than one cog (except for library routines that start up assembly cogs), you may have more than one cog updating a variable at the same time.
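
    A minimal sketch of how the shared-variable case could be guarded with the Propeller's hardware locks, assuming two or more Spin cogs touch the same pair of values. The names temps, lockId, InitShared, WriteTemps and ReadTemps are made up for illustration and are not from grasshopper's program. A single long read or write to hub RAM is atomic on its own, so a lock mainly matters when several values must stay consistent with each other.

    ```spin
    VAR
      long temps[2]                        ' hypothetical pair of values that must stay consistent
      long lockId                          ' lock number from LOCKNEW, or -1 if none was free

    PUB InitShared : ok
      '' Check out one of the eight hardware locks. Returns TRUE on success.
      lockId := locknew
      return lockId => 0

    PUB WriteTemps(a, b)
      '' Update both values as one unit.
      repeat until not lockset(lockId)     ' wait until this cog owns the lock
      temps[0] := a
      temps[1] := b
      lockclr(lockId)                      ' release the lock for the other cogs

    PUB ReadTemps(aPtr, bPtr)
      '' Copy both values out as one unit into the caller's longs.
      repeat until not lockset(lockId)
      long[aPtr] := temps[0]
      long[bPtr] := temps[1]
      lockclr(lockId)
    ```

    Every cog that reads or writes the pair has to go through these methods. The lock is only a convention and does nothing for code that bypasses it.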
  • soshimo Posts: 215
    edited 2008-12-23 17:03
    Just a thought, but what are you using as a communication program to the Propeller? If you are using a Prop Plug or the serial programming circuit as your transceiver, then a DTR signal from the computer will reset the Propeller. That's the protocol for downloading a program: raise DTR, once the Propeller resets look for a sequence of bytes, and then tell the Propeller that you are programming it via another sequence. After that you download the program to the chip, a typical ICSP protocol. The point being, if a spurious DTR signal comes in, your chip will reset.
  • grasshopper Posts: 438
    edited 2008-12-23 17:18
    @soshimo,
    Nice thought. I made this circuit board and placed a jumper that I can remove so that the Propeller can't be reset from the computer anymore. This way I can't program it until I replace the jumper.

    @Mike,
    The updating of a memory space from one cog to the next may be an issue. I am not sure how to get around this. I do run a loop in one cog that reads temperatures and then places them in a memory address that other cogs can read. Hmm, not sure why this would be a problem after 4 hours though. Basically I have the Propeller sitting in an infinite loop reading temperatures every second and then sending them to the computer.

    I'll keep this thread going until the problem is solved. Some day someone else may have the same problem.
  • Baggers Posts: 3,019
    edited 2008-12-23 17:42
    Are you using graphics.spin? You're using >$2000 in code, so if you're using double-buffered graphics that would cause program corruption.

  • grasshopper Posts: 438
    edited 2008-12-23 17:59
    @Baggers,
    No, the only objects that I am using that I did not create are the FullDuplexSerial, FloatString, and PWMasm objects.

    Otherwise the problem lies in my code somewhere. I can say that I have been running the device for 2 hours now with no rebooting yet. I'll give it another 2 hours to see what happens.
  • Baggers Posts: 3,019
    edited 2008-12-23 18:19
    OK, no worries, good luck :)

    Also, if you're using pointers, that's another place where issues can arise if you're not checking boundaries when writing data.

  • Mike Green Posts: 23,101
    edited 2008-12-23 18:19
    grasshopper,
    The fact that it (the reboot) seems to be reproducible about once every 4 hours suggests that there's some kind of reproducible failure where some kind of problem accumulates until the program fails. Typically, this is from a subscript out of range as I said earlier. Without runtime bounds checking (which is usually expensive in terms of execution time and space) you have to just go through the program line by line, flag the lines with subscript or pointer references, and go through the program logic involved to assure yourself that you can't get a subscript out of range. There are some programming practices that can help when you're writing the program, but, in your case, it's just a line by line check that's needed.
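
    As an illustration of what a hand-written bounds check might look like in Spin (which has no built-in range checking), here is a small sketch. The names BUF_LEN, samples, badIndexCount and StoreSample are hypothetical and only stand in for whatever arrays the real program uses.

    ```spin
    CON
      BUF_LEN = 16                         ' hypothetical array size

    VAR
      long samples[BUF_LEN]                ' hypothetical array being written
      long badIndexCount                   ' how many out-of-range writes were rejected

    PUB StoreSample(index, value) : ok
      '' Store value only if index lies inside samples[0..BUF_LEN-1].
      '' Returns TRUE if stored, FALSE if the index was rejected.
      if index < 0 or index => BUF_LEN
        badIndexCount++                    ' count the rejected write instead of corrupting memory
        return false
      samples[index] := value
      return true
    ```

    Reporting badIndexCount to the PC now and then would show whether a subscript ever goes out of range during a long run, at the cost of a few extra instructions per store.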
  • StefanL38 Posts: 2,292
    edited 2008-12-23 18:51
    Hello Grasshopper,

    how about two ways of analyzing it:

    First is what Mike suggested: going through the code, line by line, checking EVERY line regardless of your guess that the error can't be there. REALLY EVERY line.

    Second: make everything constant, with no real reading of temperature or time or anything. Include sending a memory dump to the PC to analyse whether some part of the memory gets overwritten.

    Another idea: send the data as fast as you can to reproduce it faster than every 4 hours. But count the number of loops to see if it always happens after the same number of loops.
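
    A rough sketch of the loop counter plus memory dump idea, assuming the FullDuplexSerial object already used in this project, the standard programming pins (31/30), and a 5 MHz crystal for the 80 MHz clock. WATCH_LEN, watched, loopCount and DumpLongs are illustrative names, not part of any existing code.

    ```spin
    CON
      _clkmode = xtal1 + pll16x            ' assumption: 5 MHz crystal x16 = 80 MHz
      _xinfreq = 5_000_000

      WATCH_LEN = 8                        ' hypothetical number of longs to watch

    OBJ
      ser : "FullDuplexSerial"

    VAR
      long loopCount                       ' passes through the main loop since boot
      long watched[WATCH_LEN]              ' hypothetical region suspected of being overwritten

    PUB Main
      ser.start(31, 30, 0, 19200)          ' assumption: rx = P31, tx = P30, 19200 baud
      repeat
        loopCount++
        ser.str(string("LOOP "))
        ser.dec(loopCount)
        ser.tx(13)
        DumpLongs(@watched, WATCH_LEN)
        ' no delay here: run as fast as the serial link allows

    PRI DumpLongs(addr, count)
      '' Print count longs starting at hub address addr as "AAAA:DDDDDDDD" lines.
      repeat count
        ser.hex(addr, 4)
        ser.tx(":")
        ser.hex(long[addr], 8)
        ser.tx(13)
        addr += 4
    ```

    If the last LOOP number logged before each reboot is always the same, that points at the software rather than at the hardware.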

    If everything stays constant, could you easily create a version of the code that can run on a Propeller without any special hardware, and attach that version here?

    best regards

    Stefan
  • parsko Posts: 501
    edited 2008-12-24 16:17
    Grasshopper,

    also check your hardware. My desktop Prop board resets itself when I turn the TV on or off, or if there is a big change in power coming into the regulators. Maybe the heat is coming on every 4 hours? I attribute it to my hardware design. Good luck.

    -Parsko
  • Timothy D. Swieter Posts: 1,613
    edited 2008-12-25 11:45
    So, for sure, is the problem repeatable every four hours? Exactly on time, on cue?

    If so, then I would really investigate the software or something the software is doing. I would set up the Propeller to send the data along with some sort of counter or time stamp. Then let the Propeller run for 24 hours and check the data. There should be consistent reboots, and a counter of seconds or minutes would show exactly when each reboot occurs and whether it happens at the exact same time.
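
    One way the counter could look, sketched in Spin under the same assumptions as the rest of the thread (FullDuplexSerial on P31/P30 at 19200 baud, 5 MHz crystal for 80 MHz). The names uptimeSec and the "BOOT"/"UP" record formats are invented for this example.

    ```spin
    CON
      _clkmode = xtal1 + pll16x            ' assumption: 5 MHz crystal x16 = 80 MHz
      _xinfreq = 5_000_000

    OBJ
      ser : "FullDuplexSerial"

    VAR
      long uptimeSec                       ' seconds since the last (re)boot

    PUB Main | t
      ser.start(31, 30, 0, 19200)          ' assumption: rx = P31, tx = P30
      ser.str(string("BOOT", 13))          ' appears in the log once per reboot
      t := cnt
      repeat
        waitcnt(t += clkfreq)              ' wake once per second without drift
        uptimeSec++
        if uptimeSec // 15 == 0            ' same 15-second reporting interval as the real data
          ser.str(string("UP "))
          ser.dec(uptimeSec)
          ser.tx(13)
    ```

    In the PC log, the last "UP" value before each "BOOT" line is the run time at the moment of the reset, so a day of logging shows directly whether the interval is constant.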

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Timothy D. Swieter, E.I.
    www.brilldea.com - Prop Blade, LED Painter, RGB LEDs, uOLED-IOC, eProto for SunSPOT, BitScope
    www.tdswieter.com
  • Timothy D. Swieter Posts: 1,613
    edited 2008-12-25 11:48
    By the way, nice idea for the jumper on the reset line. Do you have a pull-up on the reset line? Is it possible that interference is causing the reset? This is easily dismissed by running the tests above and seeing a repeatable reset.

  • grasshopper Posts: 438
    edited 2008-12-25 16:12
    Timothy D. Swieter said...
    By the way, nice idea for the jumper on the reset line. Do you have a pull-up on the reset line? Is it possible that interference is causing the reset? This is easily dismissed by running the tests above and seeing a repeatable reset.

    Nope, no pull-up. Good idea that I let slip by. On the next board I'll put a pull-up, but I am 99% sure this is not the reason. I think the book says that the reset can be left floating. I'll go check soon.

    So far I have combed through the code over and over and it's taking a toll on my confidence. Electronics and programming are about building confidence in my opinion, and at times I get to feeling overwhelmed. Like to hell with this!

  • Timothy D. Swieter Posts: 1,613
    edited 2008-12-25 18:52
    I understand the building of confidence and your opinion after a long bout with a problem like this.

    Let's see, yes, I believe the reset line can be left floating. I saw in another thread the idea of pulling it high to make the design more robust. I thought I would try this in one of my next designs. In the end the pull-up resistor doesn't have to be populated, but I thought it made sense to ensure it was pulled high unless intended to be pulled low. Especially in a system without the USB components installed (in other words, programmed by the Prop Plug or something similar, meaning there is a reset line running around a board to a connector).

    Grasshopper, tell us more about the reset. Can you run a program and log the data over, say, 13 hours and see that it resets 3 times? Can you implement some counter or something in the code and the logged data to see that it resets at exactly the same moment? If we can narrow it down to exactly repeatable, semi-repeatable, or not repeatable, that can help us determine where to look.

  • grasshopper Posts: 438
    edited 2008-12-25 19:41
    Well, the reset is a routine in my program, but I'm not sure if the reset is occurring from my code (the method) or from the actual Propeller (due to stack overwriting or some other reason).

    Basically I can reset the instrument from 2 logical modes: the first time it is ever programmed, when the EEPROM has to be populated with the correct run-time data, or calibration. This resetting allows me to give the cogs the proper data that they need to run correctly.

    Who knows. For the last 2 days I have not found any errors in the code doing as Mike suggested. Yet I keep digging!

    As far as pulling the reset up with a pull-up resistor goes: yes, I am going to put a resistor pad in place, and you're right, I don't have to populate it, but it's there if needed.

    Today it's been running for 3 hours and 37 minutes. I have placed a counter that keeps time for now.

  • kwinn Posts: 8,697
    edited 2008-12-25 19:58
    grasshopper, I understand how a difficult problem can undermine your confidence, but you need to view this as a challenge and a learning opportunity. I find you learn far more from solving a difficult problem than any course, tutorial, or manual teaches you. Stay with it and try the suggestions everyone made earlier in addition to the suggestions below.

    I searched the Prop manual for every instance of "reset" and looked at the schematics for a pull-up resistor on the reset line but found none, so I assume there is an on-chip pull-up. Being a devout believer in Murphy's law, I would still put a 1K pull-up on the reset line just to be sure.
    Build a small latch circuit to monitor the reset line. A quad latch circuit with Schmitt triggers and inverters on the input and LEDs on the output will confirm or eliminate an external source for the reset.
    If possible, add a routine to send the values of stack pointers, arrays, etc. to the PC for logging along with the regular data you send every 15 seconds.
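
    For the stack part of that last suggestion, one possible approach (a sketch, not something taken from grasshopper's code) is to pre-fill the stack given to a Spin cog with a known pattern and later count how many longs at the top are still untouched. It relies on the Spin stack growing upward from the address passed to COGNEW. STACK_LEN, PATTERN, workerStack, StartWorker, FreeStackLongs and Worker are all hypothetical names.

    ```spin
    CON
      STACK_LEN = 64                       ' hypothetical stack size in longs
      PATTERN   = $5AFE_C0DE               ' arbitrary fill value unlikely to occur by chance

    VAR
      long workerStack[STACK_LEN]          ' stack handed to the worker cog

    PUB StartWorker : cog
      '' Fill the stack with the pattern, then launch the worker in a new cog.
      longfill(@workerStack, PATTERN, STACK_LEN)
      cog := cognew(Worker, @workerStack)  ' returns the cog number, or -1 if none was free

    PUB FreeStackLongs : n
      '' Count the longs at the top of the stack that still hold the pattern.
      n := 0
      repeat while n < STACK_LEN
        if workerStack[STACK_LEN - 1 - n] <> PATTERN
          quit
        n++

    PRI Worker
      repeat
        ' hypothetical work: read sensors, call other methods, and so on
    ```

    Logging FreeStackLongs with the 15-second data would show the margin shrinking over time if a stack is slowly being used up. A result of 0 means the stack has been filled completely and has probably overflowed into whatever follows it in memory.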

    Good Luck
  • grasshopper Posts: 438
    edited 2008-12-25 20:06
    Thanks Kwinn,

    Yes, I am not going to give up per se. Hell, my boss would kill me. Especially after he has seen what the mighty Propeller can do, there is no turning back. I am thinking that I may have found the problem, because at this point no resetting has happened. I'll let it run for more hours and I'll post my findings.

    Wish me luck
  • soshimo Posts: 215
    edited 2008-12-25 20:24
    grasshopper -
    I too feel your pain, as I think we all have at one point or another. I think as engineers and/or hobbyists (let's admit it, we are all engineers in some fashion, some more classically trained than others, but we all have the "knack" as it were), we tend (at least I know I do in my day job and hobby) to focus on a problem to the exclusion of everything else. This tends to actually harm us when we are trying to work through a difficult problem. When faced with adversity, humans' natural reaction is to go back to something they know, the old tried and true. This is why we tend to do the same things over and over when trying to figure out a problem. I learned long ago to just let it go and walk away for a day, a week, whatever it takes. Focus on something completely different for a while and you will be amazed that ideas will suddenly pop into your head to try. Just don't fall into the same trap if those ideas don't work: chalk it up as a failed experiment and move on.

    Yes, it can be frustrating, and trust me, I've had countless times I felt like just locking my part bins up in storage and tossing my computer out the window. I find, though, that if I detach myself I can figure things out much more easily. It's that whole "thinking outside the box", to use an overused term. When you are trapped IN the box it's difficult to think outside of the box. You have to remove yourself from the box.
  • grasshopper Posts: 438
    edited 2008-12-25 21:01
    Soshimo,

    So true. "I am stuck inside the box". Well, I have found some promising results; the problem is that in my mad dash to fix it I can't pinpoint where the problem was. I did so much code tweaking that the problem has all but vanished. I recall one particular line of code that I had not commented out, and yet it seemed harmless. I removed this among other things and now it has been running for over 6 hours.

    Here is the code, from memory.
    Declaring the variable as global:

    VAR long OverTempWarning

    This line was placed inside a loop elsewhere:

    OverTempWarning := checkovertemp(temperature)

    This was just lingering in a loop, not really used or reset. It was supposed to be commented out for use later. OverTempWarning was just reloading the results from the checkovertemp method. This seems harmless but I am not sure.
  • grasshopper Posts: 438
    edited 2008-12-26 16:15
    Thanks to everyone that helped me through this nightmare. It has now run for 19 hours straight and never restarted. I still can't see where I fixed the bug, due to my mad dash with all the code changes and object tweaks.

    System is all good now.

    Thanks