Arlobot DHB-10 motor board bug with encoders

Hello,
Since October, we have written to Parallax support a few times and tried several things, but without luck so far. I hope the forum will be more successful.
In particular, I am interested in hearing if anybody else experiences similar issues with the Arlobot DHB-10 motor board when using encoders.

The DHB-10 motor board (Firmware 1.0) crashes after a short period of time when the encoders are used.

On the hardware side:
* We bought 3 Parallax Arlobot kits at our university, and our colleagues at another university bought 10 more. The same problem occurs on all 13 robots, which were assembled and tested by two different people.
* The first three robots use the default Propeller Activity Board, while the other ten robots use an Arduino Uno board. The problem is the same with both types of boards.
* The problem does not seem to appear when unplugging the encoders and driving the motor board by power “GO” (instead of by speed “GOSPD”). N.B.: keeping the encoders attached and driving the motor board by power does not help.
* We have tested with batteries fully charged.

On the software side:

Output of the “Arlo - Test Encoder Connections” default test:
Testing...

ticksL = 203, ticksR = 185

Encoder connections are
correct!  Your Arlo is
ready for the next step.

Test done.

We were expecting a more even tick count for right and left (when driving the motors by speed), but besides that, the test seems to be OK.

Then, with our own minimal tests, from the Propeller Activity Board:
//DHB10-GOSPD.c
#include "simpletools.h"
#include "arlodrive.h"

int main() {
	char s[32];
	memset(s, 0, 32);
	while (1) {
		sprint(s, "GOSPD %d %d\r", (rand() % 100) - 50, (rand() % 100) - 50);
		dhb10_com(s);
		pause(100);
	}
}

* Minimal test code “DHB10-GOSPD.c” for the Propeller Activity Board. It is to be used with the encoders attached, and controls the motor board by speed. It makes the motor board crash within a few seconds to a few minutes.
** Here is a video of a test: https://photos.app.goo.gl/nv09gSV6ktdSRAF82 . One can see the wheels stopping and the diagnostic diodes blinking at the end.
//DHB10-GO.c
#include "simpletools.h"
#include "arlodrive.h"

int main() {
	char s[32];
	memset(s, 0, 32);
	while (1) {
		sprint(s, "GO %d %d\r", (rand() % 100) - 50, (rand() % 100) - 50);
		dhb10_com(s);
		pause(1000);
		sprint(s, "GO %d %d\r", 0, 0);
		dhb10_com(s);
		pause(1000);
	}
}

* Minimal test code “DHB10-GO.c” for the Propeller Activity Board. It controls the motor board by power.
** If the encoders are attached, this code typically makes the motor board crash after a few minutes.
** If the encoders are not attached, then this code does not seem to make the motor board crash.

We can happily provide more information.

For now, since connecting the encoders to the DHB-10 board does not work, we have a work-around, which is to connect them to the Propeller Activity Board WX, but this is far from ideal:
https://github.com/chrisl8/ArloBot/pull/26
We have not yet looked at the DHB-10 firmware https://www.parallax.com/downloads/dhb-10-motor-controller-firmware ourselves, but we do believe there is a bug somewhere in it.

Related thread:
* http://forums.parallax.com/discussion/166105/dhb-10-error-motor-or-encoder-error

Comments

  • Hi @Alkarex

    I'll make some inquiries, and also clear some time to look over the code this week.

    Whilst I've not come across this issue before, I can sure set up an Arlo and look to replicate the behavior.

    What date-code do you have stamped on your DHB-10 pcbs?
    (Look for 4-digits, bottom right of the pcb, under the RevA text)

    In the meantime, other users may have some experience to help too.
  • Alkarex, this issue should've been escalated inside Parallax. We have the firmware and hardware team on staff, so we should be able to sort this out quickly. I've used it without problems, as have other customers, so I'm wondering if the "mass fail" across all of your robots is the same error.

    To get you the support you deserve, I'm calling in David Carrier.

    He will be on this thread by 11:00 am Pacific time on Thursday.

    Thanks,

    Ken Gracey
  • Alkarex,
    Regarding the different tick counts when running 'Arlo - Test Encoder Connections', it is running both motors with the same input power, but the output speed will differ, because of variations in efficiency between motors. This Arlo also uses left- and right-handed worm gear drives in each motor, so when driving forward or in reverse, one worm gear is pushing and one is pulling, creating a stronger difference in efficiency than when rotating in place.

    As for when the DHB-10 blinks the LEDs and stops driving the motors, that is the programmed behavior when it detects a problem with either a motor or an encoder. There aren't any known issues with it doing so while using the 'GO' command, so I'd like to get some more information about what is happening with your setup.

    When the DHB-10 has encountered an error, and the LEDs are blinking, it will not drive the motors, but it is still running and will respond to commands. If you issue a movement command, while the DHB-10 is displaying an error, it will respond with a line containing an error message or number. If you issue a request that doesn't move the motors, it will respond as normal.

    Can you run 'DHB10-GO.c' on one of the boards, with the encoders connected, then after it displays an error, capture the error message the DHB-10 is responding with? Afterward, can you use the 'DIST' command to read the current encoder positions?

    Thank you,
    David Carrier
  • Thanks all for the feedback. One of my colleagues will make the suggested tests and report on Monday.
    Best regards,
    Alexandre
  • Shouldn't memset use the size of s rather than just a hard coded value?
    Ex:
    memset(s, 0, 32);
    

    Should be:
    memset(s, 0, sizeof(s));
    

    Also, in other instances I would declare a variable as 'volatile' just to ensure the compiler does not optimize the value out.

    Ex:
    volatile char s[32];
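
The sizeof suggestion above also applies when the command string is built. The following is a minimal sketch, assuming standard C `snprintf` is used in place of the `sprint` call from the thread; `format_gospd` is a hypothetical helper name, not part of arlodrive. Note that `sizeof s` only gives the buffer size while `s` is an array in scope; inside a function that receives a `char *`, the size has to be passed in, as done here.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper (not part of arlodrive): writes "GOSPD <left> <right>\r"
 * into buf without ever writing past size bytes.
 * Returns the command length on success, or -1 if it did not fit. */
int format_gospd(char *buf, size_t size, int left, int right)
{
    int n = snprintf(buf, size, "GOSPD %d %d\r", left, right);

    /* snprintf reports the length it wanted to write; a value >= size
     * means the output was truncated. */
    return (n >= 0 && (size_t)n < size) ? n : -1;
}
```

A caller would write `char s[32]; format_gospd(s, sizeof s, left, right);` so that the 32 appears only once, in the declaration.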
    

  • Whit Posts: 3,883
    edited January 19
    All,

    I've been running my Arlo several months - I had one error at first, but my notched encoder wheel was not centered (between the emitter and reader). This error showed in the very first diagnostic test. I centered the wheel and have had no issues since. All of this was as suggested in the tutorial.

    Whit+

    "We keep moving forward, opening new doors, and doing new things, because we're curious and curiosity keeps leading us down new paths." - Walt Disney
  • Hi @VonSzarvas,

    For the three Arlobots we have, the date-code of two of them is 1715 while the other one is 1646.
  • zhowan wrote: »
    Hi @VonSzarvas,

    For the three Arlobots we have, the date-code of two of them is 1715 while the other one is 1646.

    The datecode I have is 1513. I'll forward the note to David Carrier so he can test with the same version boards as you.

    He is Parallax's in-house engineer, so he is much better placed to solve the issues than us community guys anyway!

    In any event, I'll check my Arlo too, and run the tests David has recommended above. Let's see if we find any clues to help you.

  • Hi All,

    I have tested again on our Arlo with the encoders plugged into the DHB-10 board, and the test-encoder-connection code works fine:
    Testing...
    
    ticksL = 197, ticksR = 185
    
    Encoder connections are 
    correct!  Your Arlo is 
    ready for the next step.
    
    Test done.
    

    Then I tested the robot with the following code:
    #include "simpletools.h"                      // Include simple tools
    #include "arlodrive.h"
    
    int main()                                    // Main function
    {
      char s[32]; // Hold strings converted for sending to DHB-10
      memset(s, 0, 32);
      char *reply;
      
      while(1)
      {
        // Add main loop code here.
        sprint(s, "GOSPD %d %d\r", rand() % 100 - 50, rand() % 100 - 50);
        reply = dhb10_com(s);
        print("GOSPD:%s\n",reply);
        pause(100);
        reply = dhb10_com("SPD\r");
        print("SPD:%s\n",reply);
        reply = dhb10_com("DIST\r");
        print("DIST:%s\n",reply);
        pause(100);
        
      }  
    }
    

    After one or two minutes, the terminal shows:
    GOSPD:ERROR - Motor or encoder error
    
    SPD:0 0
    
    DIST:204 292
    

    This means the GOSPD command returns an error while the SPD and DIST commands respond normally, exactly as David Carrier said.
    @JonM, changing the variable declaration to 'volatile char s[32];' does not help.
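
The replies above can also be checked in code rather than by reading the terminal. This is a minimal sketch, assuming an error reply always begins with the word ERROR (as in the line captured above) and that DIST and SPD replies are two integers separated by a space; both function names are hypothetical and not part of arlodrive.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the reply starts with "ERROR",
 * as in "ERROR - Motor or encoder error", else 0. */
int dhb10_reply_is_error(const char *reply)
{
    return reply != NULL && strncmp(reply, "ERROR", 5) == 0;
}

/* Hypothetical helper: parses a two-value reply such as "204 292"
 * (assumed format of DIST and SPD replies) into left and right.
 * Returns 1 on success, 0 if the reply is an error or malformed. */
int dhb10_parse_pair(const char *reply, int *left, int *right)
{
    if (reply == NULL || dhb10_reply_is_error(reply))
        return 0;
    return sscanf(reply, "%d %d", left, right) == 2;
}
```

In the test loop, the string returned by `dhb10_com(s)` could be passed to `dhb10_reply_is_error`, and the loop stopped at the first error so that the last DIST values remain visible on the terminal.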

    Finally, I changed the GOSPD command to GO, and it works fine now (at least for 20 minutes).





  • zhowan, we're asking David to get another look at this today. If it turns out that our boards/firmware are at fault we'll certainly ship you some replacements. In the meantime, thank you for your time diagnosing the issue with us. - Ken Gracey
  • Can you try doing the test on one motor at a time, e.g. just the left then just the right? Can you also try pushing the encoder wheel as close as possible to the motor, and see if it takes longer to fail, and also check to see if it has moved at all? The encoder wheel is the black toothed disk pictured here: https://learn.parallax.com/tutorials/robot/arlo/arlo-robot-assembly-guide/section-1-motor-mount-and-wheel-kit-assembly/step-1.

    Considering that the 'GO' command runs the motors without the encoders and doesn't generate the issue, it is likely an encoder issue. It could be that the encoder wheel is moving around on the axle, or rubbing against the optical sensor.
  • Alkarex Posts: 27
    edited January 24
    Hello,
    We have made some more tests today, with more photos and videos:
    https://photos.app.goo.gl/QplqmX2viKLQJ6453

    The video of the single wheel is a right-side wheel, with encoder REV A 29321, and DHB-10 Motor Controller #28231 REV A 1715.
    I have done my best to show close-ups of the encoders.

    As far as we can see, there is no mechanical problem with the encoders.
    Furthermore, all the encoders seem to work fine when we plug them to the Propeller Activity Board WX and perform the odometry there.

    We have tried one motor at a time as you asked, on three different robots, and for all of them, the left-side wheel works fine, while the right-side wheel fails.

    It would also be possible to ask our colleagues with the other 10 Arlo robots to reproduce the findings.
    We have a 14th Arlo, bought a bit later (December 2017 vs. September 2017) and not assembled yet, which we plan to test probably next week.

    Let us know whether there is any other test we could do to help solve this issue.

    P.S.: We will later today post a minimal test running on the Propeller Activity Board WX with the encoders attached to it (instead of attached to the motor board).
  • Alkarex wrote: »
    Hello,
    We have tried one motor at a time as you asked, on three different robots, and for all of them, the left-side wheel works fine, while the right-side wheel fails.

    This enables an additional test to try and isolate the source of the problem....

    1. What happens if you swap the connections on the DHB-10? That is, plug the left wheel and its encoder into the DHB-10 channel that's usually used by the right wheel. This could identify whether the problem is with the DHB-10 channel or with the wheel/encoder.

    2. Another thought, if the above test determines the issue is with the DHB-10... I think the source code is open source? If so, maybe it's possible to change the pin defines for the encoders of the right-wheel to use the Aux1/Aux2 headers instead.



  • Alkarex Posts: 27
    edited January 24
    As promised, here is a test that we have run for testing the encoders from the Propeller Activity Board.
    #include "simpletools.h"                      // Include simple tools
    #include "arlodrive.h"
    
    // Define the encoders pins
    #define LEFT_A 3
    #define LEFT_B 2
    #define RIGHT_A 1
    #define RIGHT_B 0
    
    static volatile long int left_ticks = 0, right_ticks = 0;
    static volatile int last_left_A = 2, last_right_A = 2;
    void encoderCount(void *par);
    unsigned int encoderCountStack[128];
    
    int main()                                    // Main function
    {
      // Start the encoder cog
      cogstart(&encoderCount, NULL, encoderCountStack, sizeof encoderCountStack);
      char s[32];
      memset(s, 0, 32);
      char *reply;
      int cmd = 0;
     
      while(1)
      {
        cmd = rand() % 100 - 50;
        sprint(s, "GO %d %d\r", cmd, cmd);
        reply = dhb10_com(s);
        print("GO:%s\n",reply);
        print("left = %d, right = %d\n", left_ticks, right_ticks); 
        pause(100);
      }  
    }
    
    /**
     * For when the encoders are connected to the Propeller board
     * instead of being connected to the motor board.
     */
    void encoderCount(void *par)
    {
        while(1)
        {
            int left_A = input(LEFT_A);
            int left_B = input(LEFT_B);
            int right_A = input(RIGHT_A);
            int right_B = input(RIGHT_B);
    
            if (last_left_A == 0)
            {
                if (left_A == 1)
                {
                    if (left_B == 0)
                    {
                        left_ticks++;
                    }
                    else
                    {
                        left_ticks--;
                    }
                }
            }
            else if (last_left_A == 1)
            {
                if (left_A == 0)
                {
                    if (left_B == 1)
                    {
                        left_ticks++;
                    }
                    else
                    {
                        left_ticks--;
                    }
                }
            }
            last_left_A = left_A;
    
            if (last_right_A == 0)
            {
                if (right_A == 1)
                {
                    if (right_B == 0)
                    {
                        right_ticks++;
                    }
                    else
                    {
                        right_ticks--;
                    }
                }
            }
            else if (last_right_A == 1)
            {
                if (right_A == 0)
                {
                    if (right_B == 1)
                    {
                        right_ticks++;
                    }
                    else
                    {
                        right_ticks--;
                    }
                }
            }
            last_right_A = right_A;
        }
    }
    


    The code seems to run fine. We left it running during our lunch and it produced:
    GO
    left = -133, right = -379
    

    (Same comment as at the very top of this thread: we were expecting slightly closer values between right and left, but it otherwise looks fine.)
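
The edge-counting logic in encoderCount above can be condensed into one pure function, which also makes it testable off the robot. This is a minimal sketch, not the code we ran: `quad_step` is a hypothetical name, and it keeps the same convention as the code above (count only edges on channel A, use channel B for the direction).

```c
/* Hypothetical helper: returns +1, -1 or 0 for one sample of channels
 * A and B, given the previous value of channel A. */
int quad_step(int last_a, int a, int b)
{
    if (last_a == 0 && a == 1)        /* rising edge on A */
        return (b == 0) ? +1 : -1;
    if (last_a == 1 && a == 0)        /* falling edge on A */
        return (b == 1) ? +1 : -1;
    return 0;                         /* no edge, or first sample (last_a == 2) */
}
```

Inside the cog loop, each wheel would then need two lines, for example `left_ticks += quad_step(last_left_A, left_A, left_B);` followed by `last_left_A = left_A;`.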


    Good idea @VonSzarvas , and we have tried to swap the left/right encoders, and swap the left/right motors.
    The result is that it is now the left-side wheel that crashes (the one the motor board believes is the right-side one), while the right-side wheel runs fine (the one the motor board believes is the left-side one).

    Regarding the firmware of the motor board, the source code is available, but we are not (yet) familiar with the .spin format and e.g. how to compile it.
  • Hi @Alkarex,

    I spotted something in your photos. The soldering at the headers seems like it didn't re-flow on all the pins fully. I think I can see one of the pads under the solder blob, so it's likely not making a good connection. Especially with motion, that can be intermittent.

    I'd recommend you get those encoders off and try to hold the soldering iron on those header pins long enough, until the solder reflows nicely on all the pins.

    As for the firmware- I'm not sure either, but I'm sure David Carrier can help out if that test becomes necessary. Could you check the soldering first though, just to get that gremlin off the table?
  • Alkarex wrote: »
    and we have tried to swap the left/right encoders, and swap the left/right motors.
    The result is that it is now the left-side wheel that crashes (the one the motor board believes is the right-side one), while the right-side wheel runs fine (the one the motor board believes is the left-side one).

    So this suggests the issue occurs with the "Motor 2" channel.

    I think the soldering should be addressed first.

    If that doesn't solve it, then moving the Motor2 encoders up to the Aux1/2 pins would still be an interesting test. We can ask Parallax's help with that I'm sure.

    (Like the PAB-WX headers, Aux 1/2 has lower input impedance than the DHB-10's dedicated encoder inputs, which may help overcome problems of higher resistance in the encoder signals, like from poor solder joints or some other yet-unknown gremlin!)

  • @VonSzarvas Thanks for your feedback. We also believe that there is an error with the "Channel 2" on the DHB-10 motor board.
    Although we do not think there is anything wrong with the encoders (since swapping right and left produces the error on the other side, since they work fine from the Propeller Activity Board WX, and since it is the same problem with all our robots), we have redone the soldering as you suggested for the sake of completeness, which made no difference (still the same crash).
    New photos on https://photos.app.goo.gl/QplqmX2viKLQJ6453

    Tomorrow, we will test the motor board of our 14th Arlo bot, which was bought a couple of months later.

    Anything new from @DavidCarrier or @KenGracey ? (I do not know how to make user mentions for people with a space in their username)

    Best regards
  • Alkarex wrote: »
    Anything new from @DavidCarrier or @KenGracey ? (I do not know how to make user mentions for people with a space in their username)

    I'm not aware of any possible way to address people with spaces. But I've just e-mailed David and alerted him to your message. Both David and Ken will also get a forum pop-up next time they login here, so they will see your message soon.

    I'll keep watching too, in-case I have any other ideas.
  • @Alkarex

    Actually one thought...

    Would you be able to add a photo of a couple of your DHB-10 boards (I think you mentioned having a couple of different date-versions), from the top view, so we can see the board components, and also where/how the wires are plugged in?

    In the interests of completeness, it might catch something to compare your setup and board parts to what is expected. And at least we can do that whilst David prepares his thoughts.
  • @VonSzarvas I have added a photo of the board used for the last tests https://photos.app.goo.gl/QplqmX2viKLQJ6453
    Tomorrow, when we test the 14th board, I will put a couple more photos.
  • This is all guesswork until I can go check the robot and datasheets, but the only thing I immediately spotted was that the large motor cables are attached with "red" to the negative terminal, and "blue" to the positive terminal.

    I don't recall those motors having polarity, or indeed if red should be positive :), but maybe that's something to double check. They are automotive motors, and maybe they assume gnd is connected to places that other motors don't.

    Stranger things have happened. Maybe voltage can get from the motor housing, along the spindle and to the encoder in some way. I suppose swapping the connections would be a quick test, if you agree there is no warning on the motor about polarity. The motors should just run in the opposite direction.


  • @VonSzarvas reversing the polarity of the motor 2 (right-side) without reversing the encoder channels gives the following error in the encoder test:
    Testing...

    ticksL = 241, ticksR = -223

    Motor 1 encoder cables
    are connected correctly.

    ERROR: Motor 2 encoder
    connections are reversed!

    Test done.
    

    Reversing both the motor polarity and the encoder channels passes the test, but produces the same crash.
  • Dang ! OK, That's off the list. Thanks for trying it though.

    I guess we wait for the CA timezone to awaken.

  • Alkarex,
    The "Motor or encoder error" message indicates that there is a disagreement between a motor and its encoder, or something wrong with the encoder itself. It could, for example, be that the motor is unable to reach the requested speed or that the encoder is catching on or otherwise misaligned with the optical sensor. Other than switching both channels at the same time, encoders by themselves are not able to report errors, but will instead give inaccurate results. The DHB-10 firmware detects the inaccurate results, then reports the error, so a program without the detection and reporting ability will not show an error.
    It is good to know that it is only occurring on the left side of the DHB-10s. The Arlo hardware and the DHB-10 are designed with a significant level of symmetry, but there are some minor asymmetries, which can help narrow down where the issue is.
    I think the next step from here is for me to modify the DHB-10 firmware to include more in-depth messages describing the disagreement between the motor drive power and the encoder position readings. I can either create a setup here to model the problem, then make some modifications to the firmware to narrow down what the issue is, and send you a debug firmware with loading instructions, or I can send you a shipping label, so that I can test a few of the parts here.
    Let me know which you prefer, and I will get everything started.
  • @DavidCarrier It is apparently always the right-side motor that fails, not the left-side one. Yes, please send a firmware with additional debug information. We are also open to sending you back one of our Arlo robots.

    On my side, I will try to get one of the 10 Arlo robots from our colleague from another university who has the same motor board crash problem when encoders are connected, to do the same testing as we have done above, to confirm it is the same issue.

    Thanks!
  • Alkarex,
    I have attached both Spin and binary files for a modified version of the firmware with an added STATUS command. You can load the binary file by connecting a Prop Plug to the 4-pin socket next to the reset button, then using the Propeller Tool software, available from: https://www.parallax.com/downloads/propeller-tool-software-windows-spin-assembly, to load EEPROM. The STATUS command will return a cog number (0 through 7) that the motor controller is running in, or an error, as a negative number. Here's the meaning of the negative numbers:
    -1 = The motor controller cog failed to launch
    -2 = The left motor driver IC detected a hardware fault
    -3 = The right motor driver IC detected a hardware fault
    -4 = The left wheel encoder expected movement but read no movement for a full second
    -5 = The right wheel encoder expected movement but read no movement for a full second
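
For anyone logging STATUS values from a test program, the list above can be kept in one lookup function. This is a minimal sketch with a hypothetical function name; the descriptions paraphrase the list, and any other value is reported as unknown rather than guessed.

```c
/* Hypothetical helper: maps a STATUS value to a short description.
 * 0 through 7 is the cog the motor controller runs in; -1 through -5
 * are the errors listed above. */
const char *dhb10_status_text(int status)
{
    if (status >= 0 && status <= 7)
        return "motor controller running (value is the cog number)";

    switch (status) {
    case -1: return "motor controller cog failed to launch";
    case -2: return "left motor driver IC detected a hardware fault";
    case -3: return "right motor driver IC detected a hardware fault";
    case -4: return "left wheel encoder read no movement for a full second";
    case -5: return "right wheel encoder read no movement for a full second";
    default: return "unknown status value";
    }
}
```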

    If you get -2 or -3, then the Allegro A4940 H-bridge has triggered its fault output pin. If you get -4 or -5, then three conditions held at once: there was no movement during the last 50 position readings (readings occur 50 times per second), the motor is significantly behind its expected position, and the motor's power level is above a threshold that should produce noticeable movement. In that case, either the motor is jamming due to a high load or a weak power source, or the encoders are not providing readings.
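
The three conditions behind -4 and -5 can be illustrated as follows. This is only a sketch of the description above, not the DHB-10 firmware: the struct, the function name, and the two threshold values are hypothetical, and only the figure of 50 readings per second comes from the text.

```c
#include <stdlib.h>

#define READINGS_PER_SECOND 50   /* from the description above */
#define LAG_LIMIT_TICKS     20   /* hypothetical: "significantly behind" */
#define POWER_THRESHOLD     10   /* hypothetical: "noticeable movement" */

typedef struct {
    int still_readings;   /* consecutive readings with no encoder movement */
    int last_position;    /* encoder position at the previous reading */
} StallDetector;

/* Feed one position reading. Returns 1 only when all three conditions
 * hold: a full second without movement, the position well behind the
 * expected one, and enough power applied that the wheel should move. */
int stall_update(StallDetector *d, int position, int expected_position, int power)
{
    if (position == d->last_position) {
        if (d->still_readings < READINGS_PER_SECOND)
            d->still_readings++;
    } else {
        d->still_readings = 0;    /* any movement restarts the count */
    }
    d->last_position = position;

    return d->still_readings >= READINGS_PER_SECOND
        && abs(expected_position - position) > LAG_LIMIT_TICKS
        && abs(power) > POWER_THRESHOLD;
}
```

With this structure, an encoder that stops reporting and a motor that is truly jammed look the same to the detector, which matches the two possible causes given above.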

  • Hi David Carrier,

    Thanks for the files. I have tried both the firmware you sent and the one from the Parallax website, they both work without any crash. So I think there was a bug with the firmware I had before.
  • VonSzarvas Posts: 1,145
    edited February 6
    ... or perhaps the battery condition at the time of your previous tests?

    A battery with low voltage or low capacity* could cause the motors to lag behind, and lead to the sort of errors David described.

    And if the code drives or feedback-checks motor 1 first and then motor 2, even with only microseconds of difference, that might explain why motor 2 seemed to fail each time.

    I'm glad to have learned from this that the Arlo code is quite "intelligent" about detecting when a motor is falling behind, and that a good-quality, fully charged battery is an essential recommendation, especially when debugging motor issues.

    Thank you @Alkarex and @zhowan for sharing your feedback.



    * By low capacity, I mean that the battery could be old, and/or not have enough instantaneous current capacity to power both motors, especially when the motors first start to turn or change direction as they will draw an initial peak-current well in excess of the typical running current. Hookup cables that are underrated may also be a factor.


    @DavidCarrier - I think having that new STATUS command available for all users might be a good improvement to the regular firmware.
  • @VonSzarvas I believe the potential battery problem can be ruled out: we generally tested with fully charged batteries at office temperature, and we tested many robots in different states of charge, all exhibiting the same issue. We also tested the Arlo with load and with no load at all (wheels not in contact with the floor), with no difference.
    So apparently, there was a bug in the firmware that was pre-flashed on the motor board, which is solved by flashing a new firmware.
    We have started to flash our other robots and will report whether they all start working fine.
  • VonSzarvas Posts: 1,145
    edited February 6
    @Alkarex

    Interesting feedback. Then there remains something curious. It seems unlikely to me Parallax could have been flashing an older version of firmware, especially across various datecode pcbs. Although if that is the only problem, then at least the matter is solved. Maybe the manufacturing image became corrupt or replaced by an older one at some point, which Parallax can investigate. Or the boards were re-flashed/reset at some other point inside or outside Parallax, inadvertently with older firmware.

    I still think the battery was something I overlooked earlier in the thread, so making a note here reinforces that in my mind for the future. I'm glad you were able to rule that out already though, and lifting the wheels off the floor is also a great diagnosis tip, to reduce current draw.

    Looking forward to your next report. Thank you.