Can't Wait for PropGCC on the P2?


Comments

  • Dave
    I am using Windows 7.
    All was good, but all of a sudden everything changed. I started with someone's code that worked and was using it as an example. It was moving along smoothly, then all of a sudden I got garbage out.
    Question: is the patch for spin2gui? If so, how do I run the patch?
  • @pilot0315 : This thread is mainly for talking about p2gcc, Dave's code for supporting GCC on P2. It does share a loader with spin2gui, but that's the only thing in common. Are you able to run other programs and get serial output? If so, it's not a loadp2 problem. Please post either to the fastspin thread or create a new thread for your problem, but it sounds like it's most likely a bug in your code (if you started from working code and then changed it and it no longer works).
  • @Dave Hein

    I have been playing with your code for a while now trying to get SPI to work and I notice that the p2gcc.bat file doesn't work. It seems the -g -t options somehow migrated back in after being taken out.

    Mike

  • iseries, thanks for letting me know. I updated p2gcc.bat in GitHub, and removed the -g -t options.
  • pilot0315 wrote: »
    Dave
    I am using Windows 7.
    All was good, but all of a sudden everything changed. I started with someone's code that worked and was using it as an example. It was moving along smoothly, then all of a sudden I got garbage out.
    Question: is the patch for spin2gui? If so, how do I run the patch?

    Check the configure commands, run command. Check the -b value, make sure that it is the same as the baud you specify in your program. Either change the -b value to be the same as that in the program or change the baud in the program to be the same as the -b value.
    Tom
  • rogloh Posts: 987
    edited 2019-02-05 - 05:37:35
    Still having weird issues with my high speed downloads with loadp2 on the Mac. Sometimes works ok, sometimes code fails to startup, or seems to hang part way in. I'm wondering if it is a bad serial transfer.

    I recall someone else mentioned earlier that it could be nice for the SINGLE mode to be updated to use base64 transfers with Prop_Txt. I think this is a good thing to support and also with checksum enabled so then the download tool can confirm the download result to be sure it was a reliable transfer. Given that a P2 clock mode can also be setup using Prop_Clk before the download happens that may be sufficient for a single stage download operation to be done at high clock speeds.

    In contrast the existing multi-stage CHIP mode downloader is still handy as it provides the extra capability to start at a non-zero address and has a 33% performance improvement with binary vs base64 transfers, but right now it doesn't validate the image download and this may become an issue at high speeds as you won't really know if the image was bad or there is a real bug in your code. I guess this CHIP mode could also have some type of checksum added in its byte loop for more reliability, but then it would also need to be able to transmit out and that's a little more complex (though certainly possible).
  • The intermittent failed loads has always happened for me, both FPGA and silicon. Happens for -SINGLE too. One possible reason could be inaccurate auto-bauding.
  • jmg Posts: 13,249
    rogloh wrote: »
    ...
    I recall someone else mentioned earlier that it could be nice for the SINGLE mode to be updated to use base64 transfers with Prop_Txt. I think this is a good thing to support and also with checksum enabled so then the download tool can confirm the download result to be sure it was a reliable transfer. Given that a P2 clock mode can also be setup using Prop_Clk before the download happens that may be sufficient for a single stage download operation to be done at high clock speeds.

    There is also a Python P2 loader
    https://forums.parallax.com/discussion/168850/python-p2-loader/p1

    Looks like that changes CLK settings, and can autoscan for P2's, but does not include Checksum option yet ?

  • evanh wrote: »
    The intermittent failed loads has always happened for me, both FPGA and silicon. Happens for -SINGLE too. One possible reason could be inaccurate auto-bauding.

    I would hope that inaccurate auto-bauding is not the case, as being in ROM it plays a critical part of the download process. If it turned out to be flaky that would be a problem that will drive people crazy for sure. Perhaps a varying/low frequency RC oscillator could cause issues at higher baud rates but I'd suspect that an external crystal oscillator should be stable enough for auto-bauding to be decent and reliable once the P2 frequency is high enough that there are sufficient cycles per bit for measuring intervals and the timing quantization is not too coarse. I imagine the HW should resync at the start bit for each byte, though reception looks to be done in the smartpin according to the ROM code I browsed.

    Actually this whole auto-baud process in the ROM appears non-trivial, at least for my first attempt to figure out what the code is doing. It involves coordinating interrupts and smartpin timing modes etc. So yes if there is a bug buried in there it could potentially mess up the download.
  • jmg Posts: 13,249
    rogloh wrote: »
    evanh wrote: »
    The intermittent failed loads has always happened for me, both FPGA and silicon. Happens for -SINGLE too. One possible reason could be inaccurate auto-bauding.

    I would hope that inaccurate auto-bauding is not the case, as being in ROM it plays a critical part of the download process. If it turned out to be flaky that would be a problem that will drive people crazy for sure. Perhaps a varying/low frequency RC oscillator could cause issues at higher baud rates but I'd suspect that an external crystal oscillator should be stable enough for auto-bauding to be decent and reliable once the P2 frequency is high enough that there are sufficient cycles per bit for measuring intervals and the timing quantization is not too coarse. I imagine the HW should resync at the start bit for each byte, though reception looks to be done in the smartpin according to the ROM code I browsed.

    Actually this whole auto-baud process in the ROM appears non-trivial, at least for my first attempt to figure out what the code is doing. It involves coordinating interrupts and smartpin timing modes etc. So yes if there is a bug buried in there it could potentially mess up the download.

    Initial autobaud has to use RCFAST.
    The first autobaud captures the time signature of '>' (falling edge to falling edge, =\_ to =\_, of 7 bit times, and a high pulse, _/==\_, of 5 bit times), which is unique, and then applies that time to the smart pin UART.
    This has a very wide dynamic range.
    Later autobauds also expect to see '>', so they assume that any RCFAST creep has not broken RX.
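The 7-bit-time and 5-bit-time figures above follow from ordinary UART framing. Here is a minimal sketch, assuming 8N1 framing with the data bits sent LSB first; the helper names `uart_frame` and `signature` are made up for illustration and are not taken from the ROM code.

```python
def uart_frame(byte):
    """Return the 10 line levels of an 8N1 frame: start bit, 8 data bits LSB first, stop bit."""
    data_bits = [(byte >> i) & 1 for i in range(8)]
    return [0] + data_bits + [1]


def signature(byte):
    """Return (falling-edge to falling-edge interval, longest high pulse), both in bit times."""
    levels = [1] + uart_frame(byte)  # prepend one bit time of idle-high line
    falls = [i for i in range(1, len(levels))
             if levels[i - 1] == 1 and levels[i] == 0]
    high_run = longest = 0
    for level in levels[1:-1]:  # skip the idle and stop levels, which have no fixed length
        high_run = high_run + 1 if level else 0
        longest = max(longest, high_run)
    interval = falls[1] - falls[0] if len(falls) > 1 else None
    return interval, longest


print(uart_frame(ord(">")))  # [0, 0, 1, 1, 1, 1, 1, 0, 0, 1]
print(signature(ord(">")))   # (7, 5)
```

The start bit and bit 0 are both low, bits 1 to 5 are high, and bits 6 and 7 are low again, which gives the two falling edges 7 bit times apart with a 5-bit-time high pulse between them.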

    By sampling over multiple bit times, it can reduce the quantise effects.

    As you say, the cycles available for calibrate matter, and higher bauds would expect to fail more often than lower ones.
    It looks like 3MBd is marginal at a typical RCFAST frequency, but 2MBd is better (and lower bauds are better still).

    Let's run some numbers :
    3MBd & 23MHz
     round(0x10000/7)*53 = 0b 0111 1001 0010 0011 1010
     round(0x10000/7)*54 = 0b 0111 1011 0110 1100 1100
    2MBd & 23MHz
     round(0x10000/7)*80 = 0b 1011 0110 1101 1010 0000
     round(0x10000/7)*81 = 0b 1011 1001 0010 0011 0010
                              DDDD FFFF FFxx ---- ----
    DDDD   = X[25..16] whole division
    FFFFFF = X[15..10] fractional 
    
     23M/3M = 7.6666666666666666667
     23M/2M = 11.5
    
    Maps to P2 DOC maths like this 
     23M/(0b0111+ 0b100100/64) = 3.041322
     23M/(0b0111+ 0b101101/64) = 2.985801
    
     23M/(0b1011 +0b011011/64) = 2.013680
     23M/(0b1011 +0b100100/64) = 1.989189
    
    However, I'm not entirely sure you can apply better than 1/8 fraction to a single byte of 8 samples.
    eg You only have a choice of /7 or /8 on each of 8 bit samples, which gives a new formula of 
     
     23M/(0b0111+ 0b100/8)     = 3.066667
     23M/(0b0111+ 0b101/8)     = 3.016393
    
     23M/(0b1011 +0b011/8)     = 2.021978
     23M/(0b1011 +0b100/8)     = 2.000000
    

    So 2Mbd is looking ok, and 3MBd is more marginal. (2.4MBd should also be a valid baud value)
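The numbers in this post can be reproduced with a few lines of arithmetic. This is only a sketch of the calculation as written above (measured clock count over the 7-bit span, scaled by round(0x10000/7), then split into a whole part and a 6-bit fraction); the function name `autobaud_result` is hypothetical and the ROM may do things differently in detail.

```python
def autobaud_result(count, sysclk_hz, span_bits=7):
    """Turn a clock count measured over span_bits bit times into (whole, frac6, baud)."""
    scale = round(0x10000 / span_bits)  # 9362 for the 7-bit span of '>'
    x = scale * count                   # 16.16 fixed-point clocks per bit
    whole = x >> 16                     # DDDD: whole clocks per bit
    frac6 = (x >> 10) & 0x3F            # FFFFFF: 6-bit fraction, in 1/64 clock steps
    baud = sysclk_hz / (whole + frac6 / 64)
    return whole, frac6, baud


# 53/54 bracket 3MBd and 80/81 bracket 2MBd at 23MHz, as in the post above.
for count in (53, 54, 80, 81):
    whole, frac6, baud = autobaud_result(count, 23_000_000)
    print(f"count={count}: {whole} + {frac6}/64 clocks per bit -> {baud / 1e6:.6f} MBd")
```

The four lines printed match the 3.041322, 2.985801, 2.013680 and 1.989189 figures quoted above.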
  • jmg Posts: 13,249
    rogloh wrote: »
    Still having weird issues with my high speed downloads with loadp2 on the Mac. Sometimes works ok, sometimes code fails to startup, or seems to hang part way in. I'm wondering if it is a bad serial transfer.
    evanh wrote: »
    The intermittent failed loads has always happened for me, both FPGA and silicon. Happens for -SINGLE too. One possible reason could be inaccurate auto-bauding.

    Did you try the Python loader ? (does it work on a MAC ?) - does it have the same failure rate, on the same download images ?

    Looking at the Python loader code, I see it applies 3MBd, then does two
    '> Prop_Clk 0 0 0 0 '
    to change to PLL, before doing a
    '> Prop_Txt 0 0 0 0 '
    with a single block download.

    Provided the first 2 Prop_Clk steps worked, the final download should be at Xtal/PLL speeds, with fresh higher autobaud values, so should have no drift issues.
  • jmg wrote: »

    Did you try the Python loader ? (does it work on a MAC ?) - does it have the same failure rate, on the same download images ?
    The Python loader works on macOS, with a small modification. As written, the script will try to open 'any' port that looks like a serial device. This includes Bluetooth devices that had been seen by the Mac previously, but not currently available.

    The loader tries to open "cu.Bluetooth-Incoming-Port" and fails. Here are the serial ports that show up on my system:
    cu.Bluetooth-Incoming-Port
    cu.JBLEverestElite300-Avne
    cu.JBLEverestElite300-Avne-3
    cu.usbserial-P2EEI8V
    
    A quick fix (macOS only and only for P2ES or other boards with FTDI chips) is to look for a device with "cu.usbserial" in its name. This is NOT a good fix as it will cause other OS's to fail, but gets you by until the loader script is more formally fixed and works on all PC types.
    for spx in portlist:
        if str(spx.device).strip() != "":
            # Look for: cu.usbserial
            # Added the following line and indented the code below it
            if "cu.usbserial" in str(spx.device).strip():
                print(spx.device + " = ", end="")
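The same filter can be tried on its own, without the loader script or a serial library. This is a small sketch using the port names listed above; `pick_usb_serial` is a hypothetical helper, not part of the Python loader.

```python
def pick_usb_serial(device_names, marker="cu.usbserial"):
    """Keep the non-empty device names that contain the marker substring."""
    return [name.strip() for name in device_names
            if name.strip() and marker in name]


ports = [
    "cu.Bluetooth-Incoming-Port",
    "cu.JBLEverestElite300-Avne",
    "cu.JBLEverestElite300-Avne-3",
    "cu.usbserial-P2EEI8V",
]
print(pick_usb_serial(ports))  # ['cu.usbserial-P2EEI8V']
```

Passing a different `marker` would be one way to keep the filter from being macOS-only, though what the right marker is on other systems is not covered here.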
    
  • jmg wrote: »
    Did you try the Python loader ? (does it work on a MAC ?) - does it have the same failure rate, on the same download images ?
    I haven't yet but I might like to try it. It should work on the Mac, being cross-platform, if I have its dependencies installed.
    Looking at the Python loader code, I see it applies 3MBd, then does two
    '> Prop_Clk 0 0 0 0 '
    to change to PLL, before doing a
    '> Prop_Txt 0 0 0 0 '
    with a single block download.

    Provided the first 2 Prop_Clk steps worked, the final download should be at Xtal/PLL speeds, with fresh higher autobaud values, so should have no drift issues.

    I wonder if there should be some ">" characters injected into the data stream for the loadp2 app running in SINGLE mode which uses the RC OSC. Right now it doesn't do that so possibly it could drift slightly during the download. The download code already breaks the transfers up into 128 byte blocks during the file read so it wouldn't be much more to throw in a ">" into the stream at these boundaries (adding just 1 byte of overhead for each 384 bytes sent if Prop_Hex is used, ie. not much). This may help for a long transfer. Clock drift could get worse for large file downloads as they take longer to complete. Same thing could also be done in CHIP mode, though that mode can already change the clock source to the external oscillator meaning no significant drift should occur as you mentioned.

    I studied the ROM code for the P2 serial input and didn't see any obvious bugs, though I'm really no expert there. I think it could be subtle if there is one. It's rather tricky code. Best way for us to be sure the serial downloaded ok is to make use of the checksum.
  • jmg Posts: 13,249
    dgately wrote: »
    The Python loader works on macOS, with a small modification. As written, the script will try to open 'any' port that looks like a serial device. This includes Bluetooth devices that had been seen by the Mac previously, but not currently available.
    Nice details.
    Maybe the Python loader needs a command line option to enable/disable that 'smart' port finder, as it does spit stuff into ALL connected ports as it trundles along...
    It's student friendly, but dangerous on more advanced systems.

    Or, maybe the port scan can check info, but not need to open/send to indicate which might be P2 devices ?

    Device Instance ID FTDIBUS\VID_0403+PID_6015+P2EEI94A\0000

    I think that P2EE is universal on all P2 Evals (Eng sample1) - that requires a matching serial ID, so would work well on P2 Eval, less well on P2D2 (external USB).
  • rogloh Posts: 987
    edited 2019-02-06 - 01:42:10
    What I'm finding weird is that in one test case a 2000000 bps download seems to work pretty much most of the time, but 1000000bps does not work at all as the P2 program won't start up. I have scoped the Rx/Tx pins with a logic analyser and it seems that the code is being sent with a bit period of 1us for 1Mbps and 0.5us for 2Mbps and I can see the decoded strings like "> Prop_Hex" etc so the Mac FTDI driver appears to be doing the right thing, but I haven't yet looked for truncation or actual byte corruption in the raw stream. I've been testing this with DaveHein's life demo. I don't know why the faster download works but the slower one doesn't. This is using -CHIP mode. I will have to patch loadp2 for checksum and see what it returns. I'll also try to probe the reset line as well in case that is doing something weird part way through.

    Such a pity the P2-EVAL overlooked breaking out the reset pin to a header! Seems I'll need to solder a wire to the reset switch for accessing it.
  • Agreed re the reset pin

    On the loading issues I seem to recall OzProp having to pad out the end of the transfer to make loading more reliable, probably in base64, could that be an issue here?
  • Well it looks like there is a difference in output from my Mac between the two bitrates. At 2Mbps things look and work okay but at 1Mbps things start out fine then about 100ms into the image transfer portion the serial data goes mental.. and starts looking like a whole bunch of different length breaks which continues for about 700ms until the transfer stops. No idea why, might be bad FTDI driver or a baud rate it isn't able to sustain. Attached is a capture of an image which was sent in Prop_Hex format using SINGLE mode when it dies.

    So it looks like at least some of my problems are related to my own setup. Interestingly 1.2Mbps and 1.5Mbps also appear to download ok. 2.4Mbps and 3Mbps do not, haven't checked their actual outputs with the analyzer. I need to test more with larger images too.
    [Attached logic analyzer capture: 1538 x 159, 45K]
  • I added a checksum to the -SINGLE mode, and checked it into GitHub. The only file changed is loadp2.c.
  • Great, thanks Dave! I'll use it if/when it becomes available in Github. For some reason I can't see your latest checkin (yet), maybe Github takes a while to update or my browser is having issues....

    [Attached image: 436 x 296, 32K]
  • Sorry, I forgot to push it. It should be there now. BTW, I also added a ">" character after every 128 bytes.
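For illustration, here is a rough sketch of what a hex text stream with a ">" after every 128 bytes could look like. It assumes each byte is sent as a space plus two hex digits (the format used by the checksum code in loadp2.c) and that the ">" goes after each full block; `prop_hex_body` is a hypothetical name and this is not Dave's actual implementation.

```python
def prop_hex_body(data, block=128):
    """Encode bytes as ' xx' hex text, adding a '>' resync character after each full block."""
    out = []
    for offset in range(0, len(data), block):
        chunk = data[offset:offset + block]
        out.append("".join(f" {b:02x}" for b in chunk))
        if len(chunk) == block:  # only full blocks get a trailing '>'
            out.append(">")
    return "".join(out)


body = prop_hex_body(bytes(256))
print(len(body))        # 770: 256 bytes * 3 characters, plus 2 resync characters
print(body.count(">"))  # 2
```

Each full block is 384 characters of hex text plus one ">", which is the 1-in-384 overhead mentioned earlier in the thread.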
  • jmg Posts: 13,249
    edited 2019-02-06 - 03:28:37
    rogloh wrote: »
    What I'm finding weird is that in one test case a 2000000 bps download seems to work pretty much most of the time, but 1000000bps does not work at all as the P2 program won't start up. ...


    That's certainly weird...

    rogloh wrote: »
    Well it looks like there is a difference in output from my Mac between the two bitrates. At 2Mbps things look and work okay but at 1Mbps things start out fine then about 100ms into the image transfer portion the serial data goes mental.. and starts looking like a whole bunch of different length breaks which continues for about 700ms until the transfer stops. No idea why, might be bad FTDI driver or a baud rate it isn't able to sustain. Attached is a capture of an image which was sent in Prop_Hex format using SINGLE mode when it dies.

    So it looks like at least some of my problems are related to my own setup. Interestingly 1.2Mbps and 1.5Mbps also appear to download ok. 2.4Mbps and 3Mbps do not, haven't checked their actual outputs with the analyzer. I need to test more with larger images too.

    Hmm... Sounds like a broken driver, though it's hard to fathom how it can start OK and then seem to lose its baud settings.
    I see FTDI lists 2 MAC drivers - which OS do you have ?
    Mac OS X 10.3 to 10.8 2012-08-10 2.2.18 2.2.18 2.2.18
    Mac OS X 10.9 and above 2017-05-12 - 2.4.2
    Windows* 2017-08-30 2.12.28 2.12.28

    I've experimented to find a minimal P2-Echo test, and created some large files, with and without Autobaud, to see how long it can 'hold baud' and to give sustained tests > most buffers.
    This should also check Serial drivers/OS hosts with an easier ASCII test.
    P2 may drop some of those dummy packer chars here, but it seems to do so in a controlled manner.

    Up to 2.4MBd it replies the expected 5000 copies of CrLf'Prop_Ver A'CrLf (14 chars) = 70000chars, but at 3MBd it replied a stable 3750 copies, so I needed an EL version (extended line) that adds packer spaces.
    I suspect there, the added SW turn-around overhead time, was not being allowed for at 3MBd


    Prop_Ver_Repeat5000Q.TXT Starts with "> " and is able to pack replies up to 2.4MBd
    Prop_Ver_NoGT_Repeat5000Q.TXT trims leading "> ", so repeat sending of this file, skips any Autobaud, use to check for drift of RCFAST
    Prop_Ver_EL_Repeat5000Q.TXT Added Spaces version, that will echo 5000 replies at 3MBd (needs quite a pack for 3Mbd, maybe the interrupt overhead is pushing things... 3MBd on 23MHz is quite impressive)

    The ROM sleeps after a while, so it's not so easy to run this for hours... but a SW loop could autobaud once, and then run overnight to check temperature drift, or someone could heat-gun their P2.

    Summary: RCFAST does look to be quite/very stable, even at 3MBd where margins are least, it holds baud for as long as I've run tests (ie many 10's of seconds).
  • jmg Posts: 13,249
    Dave Hein wrote: »
    BTW, I also added a ">" character after every 128 bytes.

    Even tho I could not prove it is needed, that seems a good idea to have. Cost is quite low.
    eg Perhaps a rapidly cooling P2, could drift enough during download ?


  • jmg wrote: »
    I see FTDI lists 2 MAC drivers - which OS do you have ?
    Mac OS X 10.3 to 10.8 2012-08-10 2.2.18 2.2.18 2.2.18
    Mac OS X 10.9 and above 2017-05-12 - 2.4.2
    Windows* 2017-08-30 2.12.28 2.12.28

    I found I still had the older default Apple supplied FTDI driver (2.3) for my Mac (OS X 10.10.5), so I updated it to the latest FTDI 2.4.2 version. Didn't seem to help.

    What is weird is that different length programs run at different baud rates. The Blink demo can be loaded at 1Mbps but not 2Mbps, while the Life demo using the LED accessory runs at 2Mbps but not at 1Mbps. :confused: What gives?

    I found that the Life demo, sent at 1Mbps in CHIP mode, failed when the stream started turning into breaks 0x140B bytes into the $17AC sized image, and another time at offset 0x14FC (so not an identical file position). Still tracking it down. I'd imagine that if it was a bad USB cable the FTDI chip would be able to detect errors in the USB transfer?
  • rogloh Posts: 987
    edited 2019-02-06 - 05:44:48
    Found something very interesting...

    When I was using Dave Hein's latest change that checks for the response in SINGLE mode I noticed that the P2 chip was reset early before the actual transfer had fully completed on the wire (see my capture below with the P2 rx pin and reset line). This is an important data point because it appears that the loadp2 application is not accounting for buffer delays. So perhaps the problem I am seeing with the CHIP mode is that the baud rate and other terminal settings are being changed for the console while image data is still being sent out, thus corrupting the remaining transfers causing breaks and other garbage to be sent. It might even be valid data at 115200bps - I need to check.
    UPDATE: yes it is valid data at 115200 not random breaks!
    This is a problem with waiting for the output to complete in loadp2 before the terminal is entered. i.e. output buffer delays.

    RLs-MacBook-Pro:life roger$ loadp2 -t -l 1000000 -SINGLE -p /dev/cu.usbserial-P23HPBP6 a.out
    a.out failed to load
    Error response was ""

    I think adding more delay after the download completes before terminal mode is entered might be worth a shot, it's only set to 50ms right now and it looks like it needs another 10ms or so. You basically want to wait on a flush operation in the output data path before continuing. There is probably some correct way to do this without using arbitrary delays.
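A back-of-envelope estimate shows why a fixed delay can come up short. This sketch assumes 10 bits on the wire per byte (8N1) and, for Prop_Hex, 3 text characters per image byte; it ignores command headers and any USB or driver buffering, so real times will be somewhat longer. `wire_time_s` is a made-up helper name.

```python
def wire_time_s(payload_bytes, baud, bits_per_byte=10, chars_per_byte=1):
    """Minimum time in seconds to shift payload_bytes out of the UART at the given baud."""
    return payload_bytes * chars_per_byte * bits_per_byte / baud


image = 0x17AC  # size of the Life demo image mentioned earlier in the thread
for baud in (1_000_000, 2_000_000):
    binary = wire_time_s(image, baud)
    hex_text = wire_time_s(image, baud, chars_per_byte=3)
    print(f"{baud} bps: binary {binary * 1e3:.1f} ms, Prop_Hex {hex_text * 1e3:.1f} ms")
```

Since the write call can return long before the last bit has left the wire, the time that still has to be waited out depends on the image size and baud rate, which is why waiting on the driver to drain is more robust than any constant.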


    [Attached logic analyzer capture: 1556 x 122, 18K]
  • Isn't there a "flush" command that can be used to block until all bytes are sent? Wouldn't that be better than a delay?
    David
  • rogloh Posts: 987
    edited 2019-02-06 - 05:50:26
    Exactly.

    I think it is tcdrain() or some similar function.
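As a quick way to try the idea outside of loadp2, the same POSIX call is exposed in Python as `termios.tcdrain()`. The sketch below is only an illustration of "write everything, then block until the driver has sent it"; the `send_and_drain` helper and its `drain` parameter are hypothetical, and the parameter exists only so the function can be exercised without a real serial port.

```python
import os
import termios


def send_and_drain(fd, data, drain=termios.tcdrain):
    """Write all of data to fd, then block until the output queue has been transmitted."""
    view = memoryview(data)
    while view:
        written = os.write(fd, view)  # may be a partial write
        view = view[written:]
    drain(fd)  # tcdrain() returns once all queued output has been sent


# Demonstration on a pipe, with a stand-in for tcdrain (a pipe is not a terminal).
read_end, write_end = os.pipe()
drained = []
send_and_drain(write_end, b"~", drain=drained.append)
print(os.read(read_end, 16), drained == [write_end])
```

On a real port the default `drain` would be left in place, so the call returns only after the kernel reports the output queue as sent.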
  • rogloh Posts: 987
    edited 2019-02-06 - 06:13:57
    Adding a wait_drain function in osint_linux.c which calls tcdrain() on the file descriptor fixes my problem in CHIP mode and I can now download ok at many baud rates below 2Mbps. The drain should also help the SINGLE mode but the rx_timeout setting needs more work as it doesn't seem to read the "." which the P2 returns (I see it sent on my logic analyzer). Some more adjustment is needed there. Here's where I added the drain calls in loadp2.c for CHIP mode. :smile:
        txval(size);
        txval(address);
        tx((uint8_t *)"~", 1);
        wait_drain();
        msleep(100);
        while ((num=fread(buffer, 1, 1024, infile)))
        {
            if (patch)
            {
                patch = 0;
                memcpy(&buffer[0x14], &clock_freq, 4);
                memcpy(&buffer[0x18], &clock_mode, 4);
                memcpy(&buffer[0x1c], &user_baud, 4);
            }
            tx((uint8_t *)buffer, num);
            totnum += num;
        }
        wait_drain();
        msleep(50);
        if (verbose) printf("%s loaded\n", fname);
        return 0;
    
  • jmgjmg Posts: 13,249
    rogloh wrote: »
    ... but the rx_timeout setting needs more work as it doesn't seem to read the "." which the P2 returns (I see it sent on my logic analyzer). Some more adjustment is needed there.

    If it fails to see the echo, or times-out early, then that explains the baud-flip.
    If it waited correctly, the echo must come in after the last terminator/command char byte is sent.

  • I found this change to loadp2.c below worked on my system with lots of different baud rates in loadfilesingle() with SINGLE mode. I added the wait_drain call before the "?" is sent to the P2 and increased the receive timeout to 100ms from 10ms. Without increasing this number it would still timeout on the receive. How close it is to the actual limit I'm not sure.
     ....  
     
        if (use_checksum)
        {
            char *ptr = (char *)&checksum;
            checksum = 0x706f7250 - checksum;
            for( i = 0; i < 4; i++ )
                sprintf( &buffer[i*3], " %2.2x", ptr[i] & 255 );
            tx( (uint8_t *)buffer, strlen(buffer) );
    
            tx((uint8_t *)"?", 1);
            wait_drain();  // drain out before requesting checksum validation with "?" char  --- *** NEW
            //msleep(70);   // no need for this anymore?  --- *** NEW 
            num = rx_timeout((uint8_t *)buffer, 100, 100);  // change to 100ms timeout  --- *** NEW
            if (num >= 0) buffer[num] = 0;
            else buffer[0] = 0;
            if (strcmp(buffer, "."))
            {
                printf("%s failed to load\n", fname);
                printf("Error response was \"%s\" len=%d\n", buffer, num);
                promptexit(1);
            }
            if (verbose)
                printf("Checksum validated\n");
        }
        else
        {
            tx((uint8_t *)"~", 1);   // Added for Prop2-v28
            wait_drain();  // drain out before continuing further with optional terminal --- *** NEW
        }
        msleep(50);
        if (verbose) printf("%s loaded\n", fname);
        return 0;
    }
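The checksum step in the excerpt above can be tried offline. This sketch assumes the running `checksum` is the sum of the image's 32-bit little-endian longs (that part of loadp2.c is not shown here) and that a short final long is zero-padded; the function names are hypothetical. The correction long is chosen so that the total comes to 0x706F7250, which is "Prop" read as a little-endian long.

```python
import struct

PROP_MAGIC = 0x706F7250  # the bytes "Prop" read as a little-endian 32-bit value


def checksum_long(image):
    """Return the long that makes the 32-bit sum of all image longs equal PROP_MAGIC."""
    padded = image + b"\x00" * (-len(image) % 4)  # assumption: zero-pad to whole longs
    total = sum(struct.unpack(f"<{len(padded) // 4}I", padded)) & 0xFFFFFFFF
    return (PROP_MAGIC - total) & 0xFFFFFFFF


def checksum_as_hex_text(image):
    """Format the correction long as four ' xx' bytes, least significant byte first."""
    return "".join(f" {b:02x}" for b in struct.pack("<I", checksum_long(image)))


print(checksum_as_hex_text(b""))  # ' 50 72 6f 70'
```

The text form mirrors the `" %2.2x"` loop in the excerpt, which walks the four bytes of the value in memory order on a little-endian host.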
    
  • By the way, if the application starts sending data right after boot without any delay, it will corrupt the P2 response checking code. I removed the sleep(1) from main in life.c and got this error now. I think only one byte should be requested from the serial port not 100 (looks like that value in the code was pasted from elsewhere where it reads in the version string). The garbage output below is from this change...
    <life.c>
    void main(void)
    {
      int i, j;
    
      //sleep(1);
      printf("Starting...\n");
    
    
    RLs-MacBook-Pro:life roger$ p2gcc life.c led.c pins.c -r -t -p /dev/cu.usbserial-P23HPBP6 
    ioctl succeeded
    Setting baudrate to 115200
    a.out failed to load
    Error response was ".??8??x" len=10
    RLs-MacBook-Pro:life roger$ p2gcc life.c led.c pins.c -r -t -p /dev/cu.usbserial-P23HPBP6 
    ioctl succeeded
    Setting baudrate to 115200
    a.out failed to load
    Error response was ".??8" len=4
    

    When I changed it to read only one byte it fixes the problem.
            num = rx_timeout((uint8_t *)buffer, 1, 100);  // change to 100ms timeout  --- *** NEW
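The effect of asking for only one byte can be shown with a small stand-alone sketch. It uses `select` for the timeout and a pipe in place of the serial port; `read_reply` and `checksum_ok` are hypothetical names, not functions from loadp2.

```python
import os
import select


def read_reply(fd, timeout_s=0.1):
    """Wait up to timeout_s for data, then read exactly one byte (b'' on timeout)."""
    ready, _, _ = select.select([fd], [], [], timeout_s)
    return os.read(fd, 1) if ready else b""


def checksum_ok(fd, timeout_s=0.1):
    """True if the single-byte reply is '.', the response the loader code above checks for."""
    return read_reply(fd, timeout_s) == b"."


# The application starts printing right after the '.', as in the failing runs above.
read_end, write_end = os.pipe()
os.write(write_end, b".Starting...\n")
print(checksum_ok(read_end))    # True
print(os.read(read_end, 64))    # b'Starting...\n'
```

Because only one byte is consumed, whatever the program prints immediately after boot stays in the receive queue for the terminal instead of being mixed into the response check.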
    