Simple question about SimpleIDE and fdserial @ 115200 baud — Parallax Forums

I have a simple question about using a QuickStart board to pass through serial data. The code below isn't fast enough to keep up at 115 kbaud, so some serial corruption can occur. Is there a simple change that could make it fast enough? I believe that the default memory model is CMM, and the optimization is -Os Size. Maybe changing those settings is enough, but perhaps it requires a change away from fdserial? I haven't been using Propellers much recently but I know that fdserial has been a subject of discussion over the years. I'm working with some remote engineers who have never used Propellers so SimpleIDE seemed like a good choice. I can see if I can come up with a good way to detect serial corruption (e.g. getting statistics based on file transfers to an embedded device's U-Boot), and then try various solutions. Nothing else is running on the Propeller, so all resources are available. Using cogs inefficiently is o.k. if it makes this easier to get working - e.g. doesn't require installing a new serial object.

For anyone wondering why I would do this: it started out with different baud rates for the host and device sides, to deal with an embedded device that used a non-standard baud rate. Once communications were established, the device was reprogrammed to work at 115 kbaud, and it was just convenient to leave the QuickStart in place for some testing. That testing is automated, though, and it hits the serial port a lot harder than the initial traffic did. The host side uses the on-board FTDI chip to connect to USB.
#include "simpletools.h"
#include "fdserial.h"

int main() {
  fdserial *host;
  fdserial *device;
  int c;
  
  simpleterm_close();
  host   = fdserial_open(31, 30, 0, 115200);
  device = fdserial_open(23, 22, 0, 115200);

  while(1)
  {
    c = fdserial_rxCheck(host);
    if (c != -1) {
      fdserial_txChar(device, c);
    }
          
    c = fdserial_rxCheck(device);
    if (c != -1) {
      fdserial_txChar(host, c);
    }      
  }  
}
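
For a rough sense of the time budget the loop above has to meet, here is a minimal sketch of the arithmetic. It assumes an 80 MHz system clock and 8N1 framing (10 bits on the wire per byte); the helper name `clocks_per_frame` is made up for illustration and is not part of fdserial or simpletools.

```c
#include <stdint.h>

/* System clocks that elapse while one serial frame is on the wire.
 * clock_hz:        system clock in Hz (assumed 80 MHz on a QuickStart)
 * baud:            line rate in bits per second
 * bits_per_frame:  start + data + stop bits (assumed 10 for 8N1)
 * The multiply is done in 64 bits so it cannot overflow. */
static uint32_t clocks_per_frame(uint32_t clock_hz, uint32_t baud,
                                 uint32_t bits_per_frame)
{
    return (uint32_t)(((uint64_t)clock_hz * bits_per_frame) / baud);
}
```

Under those assumptions, `clocks_per_frame(80000000, 115200, 10)` comes to 6944 clocks per byte in one direction. With both directions saturated the loop has to move two bytes in that time, so roughly 3472 clocks per byte on average, and every `fdserial_rxCheck` and `fdserial_txChar` call has to fit inside that.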

Comments

  • Recompiling as XMM would likely help a lot. CMM is faster than Spin by anywhere from 2x to 4x, but FDSerial itself has a lot of internal overhead, which will slow it down.

    Since you know that your transmit rate is limited to 115,200 by the source, you could re-implement fdserial_txChar to skip buffer checks, and that would speed things up a lot.

    Here's the internals of txChar:
    int fdserial_txChar(fdserial *term, int txbyte)
    {
      int rc = -1;
      volatile fdserial_st* fdp = (fdserial_st*) term->devst;
      volatile char* txbuf = (volatile char*) fdp->buffptr + FDSERIAL_BUFF_MASK+1;
    
      while(fdp->tx_tail == ((fdp->tx_head+1) & FDSERIAL_BUFF_MASK))
          ; // wait for queue to be empty
      txbuf[fdp->tx_head] = txbyte;
      fdp->tx_head = (fdp->tx_head+1) & FDSERIAL_BUFF_MASK;
      if(fdp->mode & FDSERIAL_MODE_IGNORE_TX_ECHO)
          rc = fdserial_rxChar(term); // why not rxcheck or timeout ... this blocks for char
      return rc;
    }
    

    If you locally cache the FDP pointer in your own code, you can directly manipulate the buffer yourself, instead of calling the function to do it for you, and that saves a function call with all the push/pop state save/restore that comes with it, and the pointer de-reference. If you cache the buffer pointer, you avoid the dereference and the add. Remove the check for the ignore_echo too.

    And then, after all that, you could skip the buffer wait. You'll end up with this:
    #include "simpletools.h"
    #include "fdserial.h"
    
    void MyTxChar( volatile fdserial_st * fdp, volatile char * txbuf , int txbyte )
    {
      // Remove or comment these two lines to skip the buffer wait
      while(fdp->tx_tail == ((fdp->tx_head+1) & FDSERIAL_BUFF_MASK))
          ; // wait while the transmit queue is full
    
      // put the new character in the transmit buffer
      txbuf[fdp->tx_head] = txbyte;
    
      // update the buffer head index
      fdp->tx_head = (fdp->tx_head+1) & FDSERIAL_BUFF_MASK;
    }
    
    int main() {
      fdserial *host;
      fdserial *device;
      int c;
    
      simpleterm_close();
      host   = fdserial_open(31, 30, 0, 115200);
      device = fdserial_open(23, 22, 0, 115200);
    
      // buffptr is a member of fdserial_st, so reach it through the cached state pointer
      volatile fdserial_st* deviceFdp= (fdserial_st*) device->devst;
      volatile char* deviceBuf= (volatile char*) deviceFdp->buffptr + FDSERIAL_BUFF_MASK+1;
    
      volatile fdserial_st* hostFdp= (fdserial_st*) host->devst;
      volatile char* hostBuf= (volatile char*) hostFdp->buffptr + FDSERIAL_BUFF_MASK+1;
    
    
      while(1)
      {
        c = fdserial_rxCheck(host);
        if (c != -1) {
          MyTxChar(deviceFdp, deviceBuf, c);
        }
              
        c = fdserial_rxCheck(device);
        if (c != -1) {
          MyTxChar(hostFdp, hostBuf, c);
        }      
      }  
    }
    

    I haven't compiled this, but I have done exactly this on my own to get transmit times to be quicker. You can do similar things with the receive code - the rxCheck call is basically just a "is the head the same as the tail" check, so putting that inline would eliminate a bunch of overhead too.
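
To illustrate the "is the head the same as the tail" check, here is a minimal host-side sketch of a non-blocking read on a power-of-two ring buffer. The `ring_t` type, `RING_MASK`, `ring_check` and `ring_put` are stand-ins invented for this example; they are not the fdserial structures, and the real field names and buffer layout should be taken from fdserial.h.

```c
#define RING_MASK 63  /* hypothetical: 64-byte buffer, size must be a power of two */

typedef struct {
    volatile int  head;                /* advanced by the producer */
    volatile int  tail;                /* advanced by the consumer */
    volatile char buf[RING_MASK + 1];
} ring_t;

/* Non-blocking read: returns the next byte (0..255), or -1 if the buffer is empty. */
static inline int ring_check(ring_t *r)
{
    int c = -1;
    if (r->tail != r->head) {                 /* head == tail means empty */
        c = (unsigned char)r->buf[r->tail];
        r->tail = (r->tail + 1) & RING_MASK;  /* wrap with the mask */
    }
    return c;
}

/* Producer side, included only so the sketch can be exercised on a PC.
 * Returns 0 on success, -1 if the buffer is full. */
static inline int ring_put(ring_t *r, int c)
{
    int next = (r->head + 1) & RING_MASK;
    if (next == r->tail)
        return -1;
    r->buf[r->head] = (char)c;
    r->head = next;
    return 0;
}
```

Because the check is a compare, a load and a masked increment, writing it inline in the pass-through loop avoids the call overhead on every iteration, including the many iterations where nothing has arrived.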
  • Ariba
    If the Propeller just needs to pass the signals through, you can set up two hardware counters in a feedback mode. The feedback mode outputs the inverted state of the APIN input pin on the BPIN output pin.
    Because this only works with the signal inverted, you need two additional pins and have to do it twice so that the passed-through signal is not inverted.
    So it needs 4 counters and therefore a second cog. Here is an untested example:
    #include "propeller.h"
    
    volatile int stack[50];
    
    void cog2()
    {
       DIRA = (1<<30) + (1<<17);         // outputs
       CTRA = (9<<26) + (17<<9) + 23;    // device RX -> /P17
       CTRB = (9<<26) + (30<<9) + 17;    // /P17 -> host TX
       while(1) ;
    }
    
    int main()
    {
       cogstart(&cog2,0,stack,sizeof(stack)); // start second cog for 2 more counters (stack size is given in bytes)
       DIRA = (1<<22) + (1<<16);         // outputs
       CTRA = (9<<26) + (16<<9) + 31;    // host RX -> /P16
       CTRB = (9<<26) + (22<<9) + 16;    // /P16 -> device TX
       while(1) ;
    }
    
    The cogs just wait in a loop; all the work is done by the counters. The sampling frequency is the clock frequency (80 MHz), and there is only one clock cycle of delay (12.5 ns).

    Andy
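
For readers decoding constants such as `(9<<26) + (17<<9) + 23` in the example above, here is a small sketch that builds the same value from named fields. It assumes the Propeller 1 counter layout with CTRMODE in bits 30..26, BPIN in bits 14..9 and APIN in bits 5..0, and that mode 9 (%01001) is the POS detector with feedback; `ctr_config` and `CTR_MODE_POS_FEEDBACK` are names made up for this illustration, not part of propeller.h.

```c
#include <stdint.h>

#define CTR_MODE_POS_FEEDBACK 9u   /* %01001: BPIN is driven with the inverse of APIN (assumed) */

/* Pack a counter control word from its fields.
 * mode: 5-bit CTRMODE, bpin: 6-bit output pin, apin: 6-bit input pin.
 * PLLDIV and the unused bits are left at zero. */
static uint32_t ctr_config(uint32_t mode, uint32_t bpin, uint32_t apin)
{
    return ((mode & 0x1Fu) << 26) | ((bpin & 0x3Fu) << 9) | (apin & 0x3Fu);
}
```

With this helper, `ctr_config(CTR_MODE_POS_FEEDBACK, 17, 23)` gives the same word as `(9<<26) + (17<<9) + 23`, which makes it easier to see that P23 is the input and P17 the inverted output of that stage.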
  • That's pretty damn clever. :)
  • JasonDorie wrote:
    Recompiling as XMM would likely help a lot.

    I think you mean LMM :)
  • Thanks guys - I figured LMM as well. There are a couple of typos in the fdserial example but they are pretty obvious. e.g. device->buffptr should be deviceFdp->buffptr

    I can try Ariba's solution too, because for this case it's really just emulating a wire. And supporting higher baud rates would be great.

    I'm playing around with binary kermit transfers first such that I can validate performance improvements. I'm starting with an FTDI cable to show that it's reliable, and then will switch back to the Propeller. (I think that someone here bought some counterfeit FTDI cables from Amazon too - nothing related to Parallax, but it just adds to the confusion)
  • DavidZemon wrote:
    JasonDorie wrote:
    Recompiling as XMM would likely help a lot.

    I think you mean LMM :)

    Ooops - Yes, that is exactly what I meant.
  • Andy - thank you for posting that code. This method is working just fine. Thanks to Jason too; perhaps I will use those ideas for other cases where the Propeller needs to do something with the data.