FlexProp: a complete programming system for P2 (and P1) - Page 57 — Parallax Forums

FlexProp: a complete programming system for P2 (and P1)



  • ersmith Posts: 6,147

    There's a new flexprop release (7.0.4) with some significant bug fixes. Thanks to @"Stephen Moraco" for finding a subtle problem with optimizations around functions that call ABORT, and to @Wuerfel_21 for her testing and many useful suggestions.

  • Christof Eb. Posts: 1,295
    edited 2025-02-22 15:20

    PANIC Flexprop 7.0.4 when I add the line
    mount("", _vfs_open_sdcard());
    then the size of the program explodes +340000 bytes or so???? :#
    Is there a tiny file system for SD card only?
    Thanks a lot! Christof

    Ah, the big size comes from my silly HEAPSIZE, which is ignored if you don't have the mount command.

  • evanh Posts: 16,345
    edited 2025-02-22 11:21

    An empty string probably isn't valid.

    My file read/write tester program, with diagnostics, is about 64 kB (Spin2 binary), 55 kB (C binary).
    Here it is configured to use the built-in SD card driver.

  • Thank you very much @evanh, your code works with my SD card. If I insert the card into my PC, I can see the files it has generated.
    Back on track again. Great!

    The sd_dir.bas example works, but the sd_write.bas example does not seem to.
    hello.txt is not there when I insert the card into the PC.

  • Christof Eb. Posts: 1,295
    edited 2025-02-23 16:53

    Hi, Eric,
    something seems to be broken with the sd_write.bas example. I changed /dir to /sd but this did not help. "" instead of "/sd" seems generally not to be allowed either?
    Also I have the question, if there is the possibility to have a second SD Card in the system?
    Thank you and cheers, Christof

  • Hi,
    I am using getchar() in C and it is easily overrun, so I conclude there is no fifo.
    Is there a fifo buffered version of serial input to P2 via getchar()?
    Thanks, Christof

  • ersmith Posts: 6,147

    @"Christof Eb." said:
    Hi, Eric,
    something seems to be broken with the sd_write.bas example. I changed /dir to /sd but this did not help.

    The problem here is that the new sd driver needs a file handle, and allocates #3, which conflicts with the BASIC code. If you change all of the #3 in the BASIC to #4 it should work. I'll fix the samples soon, thanks for catching this.

    "" instead of "/sd" seems generally not to be allowed either?

    Correct, an empty string is not allowed for mount and returns EINVAL.

    Also I have the question, if there is the possibility to have a second SD Card in the system?

    Yes, it should be possible by using the _vfs_open_sdcardx(a, b, c, d) call to specify the pins explicitly instead of _vfs_open_sdcard.

    I am using getchar() in C and it is easily overrun, so I conclude there is no fifo.

    That's almost right, there is room for 4 characters to be buffered inside the smart pin.

    Is there a fifo buffered version of serial input to P2 via getchar()?

    You could use one of the existing buffered serial objects in Spin2, and hook it up to stdio. Getting it to work with stdio functions like getchar would be a bit of work, but just using the object directly is very easy.
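
    As a rough illustration of what such a buffered serial object does internally, here is a minimal ring-buffer sketch in plain C. The names (rx_ring, rx_put, rx_get, RX_SIZE) are hypothetical and not part of FlexProp or of any existing Spin2 object; on a P2 the producer side would typically run in its own COG and call rx_put for every character the smart pin delivers.

```c
#include <stdint.h>

#define RX_SIZE 64u   /* assumed buffer size; must be a power of two */

typedef struct {
    volatile uint32_t head;        /* next write index (producer only) */
    volatile uint32_t tail;        /* next read index (consumer only)  */
    unsigned char data[RX_SIZE];
} rx_ring;

/* Producer: store one received character. Returns 0 if the buffer is full. */
static int rx_put(rx_ring *rb, unsigned char ch)
{
    uint32_t next = (rb->head + 1u) & (RX_SIZE - 1u);
    if (next == rb->tail)
        return 0;                  /* full: the character is dropped */
    rb->data[rb->head] = ch;
    rb->head = next;
    return 1;
}

/* Consumer: fetch one character, or -1 if the buffer is empty. */
static int rx_get(rx_ring *rb)
{
    if (rb->tail == rb->head)
        return -1;
    unsigned char ch = rb->data[rb->tail];
    rb->tail = (rb->tail + 1u) & (RX_SIZE - 1u);
    return ch;
}
```

    Because one index is only ever written by the producer and the other only by the consumer, this layout needs no lock for a single writer and a single reader. One slot is left unused so that "full" and "empty" can be told apart.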

  • Christof Eb. Posts: 1,295
    edited 2025-02-25 16:43

    @ersmith said:

    Is there a fifo buffered version of serial input to P2 via getchar()?

    You could use one of the existing buffered serial objects in Spin2, and hook it up to stdio. Getting it to work with stdio functions like getchar would be a bit of work, but just using the object directly is very easy.

    Thanks for the reply!
    As far as I know, these SPIN2 objects need extra COGs.
    Taqoz has an interrupt driven serial input in the main COG. The service routine and the ring buffer is in HUB. Also a few register variables including the vector register. Wouldn't that be interesting for FlexProp? For me it is not clear how to find the service routine in HUB RAM to install it. Below is Peter's code.
    What would that look like: mov ijmp1,##@taqoz_rxisr ?

    016ac                 | '#######################################################################'
    016ac                 | '   serial.p2a
    016ac                 | '#######################################################################'
    016ac                 | 
    016ac                 | 
    016ac                 | InitSerial  drvh    #tx_pin             ' force high '
    016ac     59 7C 64 FD 
    016b0     3E F8 0C FC |         wrpin   #$7C,#tx_pin            ' asynchronous transmit
    016b4     3F 7C 0C FC |         wrpin   #$3E,#rx_pin            ' asynchronous receive
    016b8                 | 'SETBAUD                        ' calculate baud timing at runtime'
    016b8     14 20 04 FB |         rdlong  r0,#@_CPUHZ         ' read from config table in low hub'
    016bc     1C 22 04 FB |         rdlong  r1,#@_BAUDR
    016c0     11 20 10 FD |         qdiv    r0,r1               ' cpuhz/baud'
    016c4     18 20 60 FD |         getqx   r0
    016c8     10 20 64 F0 |         shl r0,#16              ' (cpuhz/baud)<<16'
    016cc     07 20 04 F1 |         add r0,#7               ' add 8 data bits
    016d0     3F 20 14 FC |         wxpin   r0,#rx_pin          ' write config baud ticks and 8 data
    016d4     02 20 04 F1 |         add r0,#morestops               ' ADD 2 MORE STOP BITS FOR TX 191120 '
    016d8     3E 20 14 FC |         wxpin   r0,#tx_pin          ' baud 8 data
    016dc     3E 1A 2C FC |         wypin   #$0D,#tx_pin            ' kick off transmit empty flag'
    016e0                 | 
    016e0     41 7E 64 FD |         dirh    #rx_pin             ' enable smartpin mode'
    016e4     3F 42 9C FA |         rdpin   rxdat,#rx_pin wc        ' clear rx?
    016e8     25 00 64 FD |         setint1 #0              ' disable int0'
    016ec     00 3E 04 F6 |         mov     rxwrC,#0
    016f0     00 40 04 F6 |         mov rxrdC,#0
    016f4     1F 3C CC F9 |         bmask   rxlong,#31          ' invalid codes'
    016f8                 | 
    016f8     20 7E 67 FD |         setse1  #%110<<6+rx_pin             ' set se1 to trigger on  rx char event?????
    016fc     0B 00 00 FF 
    01700     0C E9 07 F6 |                 mov     ijmp1,##@taqoz_rxisr        ' set int1 jump vector to receive buffer
    01704     25 08 64 FD |         setint1 #4              ' Enable int0 to trigger on se1'
    01708     2D 00 64 FD |         ret
    0170c                 | 
    0170c                 | '
    0170c                 | ' ( SERIAL RECEIVE INTERRUPT SERVICE ROUTINE )' 47!
    0170c                 | '
    0170c                 | taqoz_rxisr
    0170c     3F 42 8C FA |                 rdpin   rxdat,#rx_pin               ' recv byte (bits31:24) from rx pin
    01710     18 42 44 F0 |         shr rxdat,#24           ' right justify'
    01714     08 3C 64 F0 |         shl     rxlong,#8           ' maintain a 4 character deep sequence long'
    01718     21 3C 40 F5 |         or  rxlong,rxdat            ' update history word'
    0171c     1F 44 00 F6 |         mov rxptr,rxwrC         ' update write pointer'
    01720     FA 03 00 FF 
    01724     00 45 04 F1 |         add rxptr,##rxbuffers
    01728     22 42 40 FC |         wrbyte  rxdat,rxptr         ' write char to buffer'
    0172c     01 00 00 FF 
    01730     FF 3E 04 F7 |         incmod  rxwrC,##rxsize-1        ' update write index'
    01734     1C 42 14 F2 |         cmp rxdat,#$1C wc           ' don't check sequences if not a certain control
    01738     F5 FF 3B 3B |     if_nc   reti1
    0173c                 |         ''
    0173c     00 3C 0C F2 |         cmp rxlong,#0 wz
    01740     30 00 90 AD |     if_z    jmp #RST
    01744     14 42 14 F2 |         cmp rxdat,#$14 wc           ' ignore control chars < $14 '
    01748     F5 FF 3B CB |     if_c    reti1
    0174c     0A 0A 0A FF 
    01750     14 3C 0C F2 |         cmp rxlong,##$14141414 wz       ' check ^T^T^T^T for force TRACE sequence'
    01754     C8 EC BF AD |     if_z    call    #TRACE
    01758     F5 FF 3B AB |     if_z    reti1
    0175c     8A 8A 0A FF 
    01760     15 3D 0C F2 |         cmp rxlong,##$15151515 wz       ' check ^U^U^U^U for force UNTRACE sequence'
    01764     CC EC BF AD |     if_z    call    #UNTRACE            ' The TRACING locations has the default doNEXT instruction'
    01768     F5 FF 3B AB |     if_z    reti1
    0176c     8D 8D 0D FF 
    01770     1B 3D 0C F2 |         cmp rxlong,##$1B1B1B1B wz       ' check esc esc esc esc sequence'
    01774                 | RST if_z    coginit #0,#0               ' force reset '
    01774     00 00 EC AC 
    01778     0D 0D 0D FF 
    0177c     1A 3C 0C F2 |         cmp rxlong,##$1A1A1A1A wz       ' check for ^Z^Z^Z^Z sequence '
    01780     1C 42 C4 A9 |     if_z    decod   rxdat,#28           ' reboot via hub'
    01784     00 42 60 AD |     if_z    hubset  rxdat
    01788     F5 FF 3B FB |         reti1
    0178c                 | 
    0178c                 | ' TAQOZ interface '
    0178c                 | ' Read the next character from the serial rx buffer or return with null if empty
    0178c                 | ' Note: INTERNAL - called by CONKEY
    0178c                 | READRX      cmp rxrdC,rxwrC wz      ' check head/tail
    0178c     1F 40 08 F2 
    01790     81 00 80 AD |     if_z    jmp #PUSHACC        ' return with null if empty'
    01794     20 22 00 F6 |         mov r1,rxrdC        ' save current read index '
    01798     FA 03 00 FF 
    0179c     00 23 04 F1 |         add r1,##rxbuffers      ' index into the rx buffer'
    017a0     11 1C C0 FA |         rdbyte  u,r1            ' read the data'
    017a4     01 00 00 FF 
    017a8     FF 40 04 F7 |         incmod  rxrdC,##rxsize-1    ' and update read index with wrap'
    017ac                 | rr1     jmp #PUSHACC
    017ac     81 00 80 FD 
    017b0                 | 
    017b0                 | {       *** P2 INTERRUPTS ***
    017b0                 | When an interrupt event occurs, certain conditions must be met before the interrupt branch can happen:
    017b0                 | 
    017b0                 | ALTxx / CRCNIB / SCA / SCAS / GETXACC / SETQ / SETQ2 / XORO32 / XBYTE must not be executing
    017b0                 | AUGS must not be executing or waiting for a S/# instruction
    017b0                 | AUGD must not be executing or waiting for a D/# instruction
    017b0                 | REP must not be executing or active
    017b0                 | STALLI must not be executing or active
    017b0                 | The cog must not be stalled in any WAITx instruction
    017b0                 | }
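
    The "4 character deep sequence long" that taqoz_rxisr maintains in rxlong can be modelled in a few lines of C. This is only a sketch of that one idea, with hypothetical names (history_push, last_four_are); it does not reproduce the interrupt handling or the ring buffer.

```c
#include <stdint.h>

#define SEQ_ESC4  0x1B1B1B1Bu   /* ESC ESC ESC ESC: reset in the listing above */
#define SEQ_CTLZ4 0x1A1A1A1Au   /* ^Z^Z^Z^Z: reboot via hub in the listing above */

/* Shift one received byte into the 4-character history word,
   as the `shl rxlong,#8` / `or rxlong,rxdat` pair does. */
static uint32_t history_push(uint32_t history, unsigned char ch)
{
    return (history << 8) | ch;
}

/* Returns 1 when the last four characters were all `ch`. */
static int last_four_are(uint32_t history, unsigned char ch)
{
    uint32_t pattern = ch * 0x01010101u;   /* replicate the byte four times */
    return history == pattern;
}
```

    Starting from an all-ones history (what `bmask rxlong,#31` sets up), four consecutive ESC characters make the history word equal to SEQ_ESC4, which is the comparison the service routine performs.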
  • evanh Posts: 16,345

    Flex doesn't officially support interrupts at this stage. Using them would likely cause problems.

  • @evanh said:
    Flex doesn't officially support interrupts at this stage. Using them would likely cause problems.

    Hm, yes, I have seen the statement in the doc.
    There are a lot of discussions in this forum about enhancements of this and that. I have the impression that a buffered serial might be of more practical value to more people than some other things, so perhaps it would be good to give this some priority. At the moment, the serial stream not only loses characters but also seems to lose start bit synchronisation. At least, there are a lot of chars that were never sent.
    Unfortunately, Teraterm only allows a transmit delay in whole milliseconds, so if I set it to 1, the speed is reduced very much....

    _rxraw(tim) can only be set to whole milliseconds too. I think that was a good idea back when baud rates were 9600 or so.

  • evanh Posts: 16,345
    edited 2025-02-26 08:55

    It's only really intended for keyboard entry.

    Eric has a file transfer protocol over comport functioning already I believe. If that's what you're after.

    But it also wouldn't be too hard to take control of those serial smartpins yourself and ignore the C libs.

  • ersmith Posts: 6,147

    @"Christof Eb." said:
    As far as I know, these SPIN2 objects need extra COGs.
    Taqoz has an interrupt driven serial input in the main COG. The service routine and the ring buffer is in HUB. Also a few register variables including the vector register. Wouldn't that be interesting for FlexProp? For me it is not clear how to find the service routine in HUB RAM to install it. Below is Peter's code.

    As Evan has already mentioned, interrupts aren't supported by the compiler -- the code generated is not designed to be interrupt safe or to yield periodically for interrupts. Partly this is because the compiler also targets P1, and partly it's because interrupts are a fairly rare thing.

    Giving up a COG for high priority tasks seems like a reasonable trade off. I know people try to save as many COGs as possible, but on P2 with the smartpins there is far less pressure on COGs than on P1, so I don't think using one COG for serial I/O (and perhaps other I/O too) is too bad. It's a rare program that actually needs all 8 COGs.

  • Christof Eb. Posts: 1,295
    edited 2025-02-27 07:22

    OK, so I try to work around.
    Yes, this is about a "fast" data transfer. 230400 baud. P2 @ 200MHz. There should be plenty of time to get a char?
    At other places in the program I use getchar() and also gets() so I have not taken over control over serial entirely.
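
    As a rough check of the timing budget mentioned here, the clocks and microseconds available per character can be computed as follows. The function names are hypothetical; the only inputs are the clock frequency, the baud rate and the number of bit times per character (10 for 8N1).

```c
#include <stdint.h>

/* System clocks available per character.
   bits_per_char is 10 for 8N1 (start + 8 data + stop). */
static uint64_t clocks_per_char(uint64_t clkfreq, uint64_t baud, unsigned bits_per_char)
{
    return clkfreq * bits_per_char / baud;
}

/* Character time in microseconds. */
static double char_time_us(uint64_t baud, unsigned bits_per_char)
{
    return 1e6 * (double)bits_per_char / (double)baud;
}
```

    With the figures from this post (200 MHz, 230400 baud, 8N1) this gives about 8680 clocks, or roughly 43.4 microseconds, per character.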

    The first version in C works, but only if throttled down by the transmitter with delays per char. The routine _rxraw(1) does a lot of things, as far as I can see in the listing file. Is this the vfs getting in the way? I have no /host activated.

    I have found no good description of the async mode. How it can buffer 4 chars is a mystery to me.

    The other two very simple versions crash.
    ((The compiler loads the WAITLOOP into the cache for each char. This must be AI.... :D ))
    I have tried if_c and if_nc but neither works.

    What's wrong with my attempts? Thank you very much for some hints!

    PRIMI(gettext) { // get text into block Buffer until ESC
       R1= (int)(forthMem+FORTHSIZE-1-BSIZE);
       do {
          do {
          } while(R0==-1);
          if(R0!=27 ) { *(char*)R1 = R0; R1++; }
       } while(R0!=27);
       *(char*)R1 = 0;
    }

    PRIMI(gettext) { // get text into block Buffer until ESC=27 or EURO=128
       R1= (int)(forthMem+FORTHSIZE-1-BSIZE);
       do {
          __asm {
             WAITLOOP testp #63 wc   // test for char received
                if_c jmp #WAITLOOP
                rdpin   R0, #63        // get the char and acknowledge
                shr R0,#32-8
                wrpin R0,#62         // echo back
          }
          if(R0!=27 ) { *(char*)R1 = R0; R1++; }
       } while(R0!=27 && R0!=128);
       *(char*)R1 = 0;
    }

    PRIMI(gettext) { // get text into block Buffer until ESC=27 or EURO=128
       R1= (int)(forthMem+FORTHSIZE-1-BSIZE);
       do {
          __asm {
                rdpin   R0, #63 wc        // get the char and acknowledge
                if_nc jmp #WAITLOOP      // try again
                shr R0,#32-8
                wrpin R0,#62         // echo back
          }
          if(R0!=27 ) { *(char*)R1 = R0; R1++; }
       } while(R0!=27 && R0!=128);
       *(char*)R1 = 0;
    }

  • evanh Posts: 16,345
    edited 2025-02-27 09:30

    Put the whole buffering loop inside the assembly block. Then it'll go real fast. The way it is now, there is a notable overhead from copying the Pasm code into cogRAM on each pass through the loop.

    BTW: How fast did the _rxraw() solution go?

  • OK, some very strange progress:
    This works, but as badly as the C version: only with a delay per char. Does it rely on a very large number of stop bits?

    PRIMI(gettext) { // get text into block Buffer until ESC=27 or EURO=128
       R1= (int)(forthMem+FORTHSIZE-1-BSIZE);
       R0=(int)_rxraw(1); // for some kind of setup ?????
       if(R0!=27 && R0!=-1) { *(char*)R1 = R0; R1++; }
       do {
          __asm {
             WAITLOOP testp #63 wc   // test for char received
                if_nc jmp #WAITLOOP  // nc from serial full duplex
                rdpin   R0, #63        // get the char and acknowledge
                // shr R0,#32-8
                shr R0,#4            // why on earth?
                getbyte R0, R0, #0
                // wrpin R0,#62         // echo back
          }
          if(R0!=27 ) { *(char*)R1 = R0; R1++; }
       } while(R0!=27 && R0!=128);
       *(char*)R1 = 0;
    }

  • ersmith Posts: 6,147

    @"Christof Eb." Honestly I wouldn't use inline assembly for this, especially at 230400 baud -- the P2 code is not going to be the bottleneck. I would just use a loop calling _rxraw(0) and checking for the returned value. @evanh did some clever things to make _rxraw buffer up to 3 characters (not 4 as I thought; the stop/start bits have to fit in the 32 bit smartpin buffer) so it's robust against small variations in timing.

    The flexc library routines are optimized for synchronous transfers, such as the ones performed by the built in host file system. Indeed using the host file system you can get up to ~21000 bytes/second transferred at 230400 baud, e.g. with a program like the following:

    #include <stdio.h>
    #include <propeller2.h>
    #define BIG_BUFFER_SIZE 64*1024
    char bigbuf[BIG_BUFFER_SIZE];
    int main()
    {
        FILE *f;
        unsigned long elapsed;
        mount("/data", _vfs_open_host());
        //mount("/data", _vfs_open_sdcard());
        f = fopen("/data/test.bin", "rb");
        if (!f) {
            return 1;
        }
        printf("starting test...\n");
        int errors = 0;
        elapsed = _getms();
        int r = fread(bigbuf, 1, BIG_BUFFER_SIZE, f);
        elapsed = _getms() - elapsed;
        float bytes_per_second = 1000 * (float)r / (float)elapsed;
        printf("time to read %d bytes: %u ms\n", r, elapsed);
        printf("  (%f bytes per second)\n", bytes_per_second);
        return 0;
    }

  • Christof Eb. Posts: 1,295
    edited 2025-02-28 06:44

    Hi Eric,
    this is a very nice forum and I appreciate your FlexProp very much. But this discussion is :s . I hope that I can manage to keep a polite tone. These things are very time consuming for me, because I have to look up so many things. For example I have been wondering for quite a while, what magical things "getbyte local06, local06, #0" does, because I did not get what is wrong. Normally the byte should sit in the MSB and be shifted by 24.

    @ersmith said:
    @"Christof Eb." Honestly I wouldn't use inline assembly for this, especially at 230400 baud -- the P2 code is not going to be the bottleneck. I would just use a loop calling _rxraw(0) and checking for the returned value.

    This is exactly what I have done and (after a trial in the Forth level, which also should be fast enough) posted in #1694 and what does not work!!! Please have a look at the code of __system___rxraw and of __system___setbaud !!! If you then still think, that this is like it should be, then please add the following to your documentation:

    "The serial receiver is set up in a mighty clever way. It does not only sample 8bit but even 8+20 bits. These additional 20 bits are discarded. So to prevent loss of characters and avoid de-synchronisation just set your sending software to use more than 20 stopbits."

    Sorry, Christof

    P.S. If the vfs needs this strange setup (I wonder if this can ever work reliably, if the sender is working over USB?), then only this part of the software should use the 28bits.

    11518                 | __system___setbaud
    11518     14 32 07 FB |     rdlong  muldiva_, #20
    1151c     6A 35 03 F6 |     mov muldivb_, arg01
    11520     3A 01 A0 FD |     call    #divide_
    11524     17 07 48 FB |     callpa  #(@LR__1831-@LR__1830)>>2,fcache_load_ptr_
    11528                 | LR__1830
    11528     3E B6 9E FA |     rdpin   result1, #62 wc
    1152c     5C B9 A2 F1 |     subx    result2, result2
    11530     F4 FF 9F CD |  if_b   jmp #LR__1830
    11534                 | LR__1831
    11534     40 7C 64 FD |     dirl    #62
    11538     40 7E 64 FD |     dirl    #63
    1153c     56 35 63 FC |     wrlong  muldivb_, ptr___system__dat__
    11540     10 34 67 F0 |     shl muldivb_, #16
    11544     07 D6 06 F6 |     mov arg02, #7
    11548     9A D7 02 F1 |     add arg02, muldivb_
    1154c     3E F8 0C FC |     wrpin   #124, #62
    11550     3E D6 16 FC |     wxpin   arg02, #62
    11554     3F 7C 0C FC |     wrpin   #62, #63
    11558     14 D6 06 F1 |     add arg02, #20 <<<<<<<<<<<<<<<<<<<<<<<<<<< ???????
    1155c     3F D6 16 FC |     wxpin   arg02, #63
    11560     41 7C 64 FD |     dirh    #62
    11564     41 7E 64 FD |     dirh    #63
    11568                 | __system___setbaud_ret
    11568     2D 00 64 FD |     ret

    Edit, this idiotic patching code works without additional stop bits:

    PRIMI(gettext) { // get text into block Buffer until ESC=27 or EURO=128
       R1= (int)(forthMem+FORTHSIZE-1-BSIZE);
       __asm {
          wxpin ##((868<<16)+7),#63 // 200E6/230400, patch to 8 bit
          WAITLOOP testp #63 wc   // test for char received
             if_nc jmp #WAITLOOP  // nc from serial full duplex
             rdpin  R0, #63        // get the char and acknowledge
             //shr R0,#24
             getbyte    R0, R0, #3
             cmp R0,#27 wz        // ESC
             if_ne wrbyte R0,R1
             if_ne add R1,#1
             if_ne cmp R0,#128 wz   // EURO
             if_ne jmp #WAITLOOP
          wxpin ##((868<<16)+27),#63 // 200E6/230400, 28 bit patch back
          wrbyte #0,R1
       }
    }
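
    The two wxpin constants in the routine above follow one pattern: the bit period in system clocks goes into the upper 16 bits and the word length minus one into the low bits. A small C helper (hypothetical name async_x_value) reproduces them; it keeps only the integer part of the bit period, as the constants in the post do.

```c
#include <stdint.h>

/* X value for a P2 async serial smart pin, built the same way as the
   constants in the post: bit period in clocks in the upper 16 bits,
   (word length - 1) in the low bits. */
static uint32_t async_x_value(uint32_t clkfreq, uint32_t baud, uint32_t word_bits)
{
    uint32_t clocks_per_bit = clkfreq / baud;   /* integer part only */
    return (clocks_per_bit << 16) + (word_bits - 1u);
}
```

    For 200 MHz and 230400 baud, a word length of 8 gives (868<<16)+7 and a word length of 28 gives (868<<16)+27, matching the "patch to 8 bit" and "28 bit patch back" lines.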
  • ersmith Posts: 6,147

    @"Christof Eb." said:

    @ersmith said:
    @"Christof Eb." Honestly I wouldn't use inline assembly for this, especially at 230400 baud -- the P2 code is not going to be the bottleneck. I would just use a loop calling _rxraw(0) and checking for the returned value.

    This is exactly what I have done and (after a trial in the Forth level, which also should be fast enough) posted in #1694 and what does not work!!! Please have a look at the code of __system___rxraw and of __system___setbaud !!! If you then still think, that this is like it should be, then please add the following to your documentation:


    "The serial receiver is set up in a mighty clever way. It does not only sample 8bit but even 8+20 bits. These additional 20 bits are discarded. So to prevent loss of characters and avoid de-synchronisation just set your sending software to use more than 20 stopbits."

    That is absolutely not true. If you tried the program I posted earlier you'll see that we can transfer ~21000 bytes/second at 230400 baud; clearly that could not be done with 20 stop bits. But the _rxraw code is very tricky, and if you're trying to sync up to it with your own inline assembly it probably won't work. In that case you're better off avoiding _rxraw and using an existing serial object (like jm_serial.spin2) instead.
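
    The arithmetic behind that argument is simple enough to sketch. The helper name below is hypothetical; it just divides the baud rate by the number of bit times each character occupies on the wire.

```c
/* Upper bound on payload bytes per second for a given baud rate and
   number of bit times spent per character on the wire. */
static unsigned long max_bytes_per_second(unsigned long baud, unsigned bits_per_char)
{
    return baud / bits_per_char;
}
```

    At 230400 baud, 10 bit times per character (8N1) allow at most 23040 bytes/second, so the measured ~21000 bytes/second is about 91% of the limit. With 20 additional stop bits (30 bit times per character) the ceiling would be 7680 bytes/second, well below the measured rate.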

    I did make a mistake with _rxraw(0); that should have returned -1 immediately, but instead it waits forever. There is a missing function to check for a character and return if it is not present (_rxraw(1) is too slow because it waits a millisecond). I will add an _rxpoll function in the next release to correct this.

    In the meantime, what (at a high level) are you trying to accomplish? If it's transferring large amounts of data quickly from PC to the P2, then I really suggest you use the built in host file system. That can keep up even at 2_000_000 baud (transferring 100K of data in about 832 milliseconds). If it's interactive reads, then _rxraw(1) is sufficient (people can't type faster than 1000 keys/second). If it's something else, then you can either get the new version of flexspin with _rxpoll() or use an object running in another COG.

  • Christof Eb. Posts: 1,295
    edited 2025-02-28 19:42

    Which part of my sentence is not true, in your opinion? Your code receives 28 bits and returns only 8 of them. It swallows the rest, which might be the next char. The problem is that the arrival time of the next byte is totally unknown. That is what start and stop bits are for, and you have put them out of action.

    Even your own ansi terminal has a paste feature, which would be handy if it worked sending to P2.

    getchar() and gets() are, in my opinion, functions with defined behaviour. If they swallow 2 of 3 chars, it is clearly a bug.
    These functions are universal and can be used for anything. Hand-typed input is only one special application.

    I will stop now about this topic.

  • evanh Posts: 16,345

    Time to get detailed, not walk away - Quote exact lines of source code and where they came from. Version numbers.

    I have yet to find source code for _rxraw() myself.

  • ersmith Posts: 6,147

    @"Christof Eb." I'm sorry that you've found this experience frustrating. It's clear that the use case that this code was optimized for (large blocks of data streaming as fast as possible out of the PC, so in 10 bit chunks framed with single start/stop bits and no gaps between them) is not the use case that you seem to have right now. So indeed it probably makes sense for you to use a different serial library for your application.

    The source code for _rxraw() is in flexspin, in the sys/p2_code.spin file. It looks like:

    ' timeout is approximately in milliseconds (actually in 1024ths of a second)
    pri _rxraw(timeout = 0) : rxbyte = long | z, endtime, temp2, rxpin
      if _bitcycles == 0
      if timeout
        endtime := _getcnt() + timeout * (__clkfreq_var >> 10)
        endtime := 0 ' just gets rid of a compiler warning
      rxbyte := -1
      rxpin := _rxpin
      z := 0
      temp2 := _rx_temp
      '' slightly tricky code for pulling out the bytes from the 28 bits
      '' of data presented by the smartpin
      '' Courtesy of evanh
              testb  temp2, #8 wc     ' check framing of prior character for valid character   
              testbn temp2, #9 andc   ' more framing check (1 then 0)
              shr    temp2, #10       ' shift down next character, if any
      if_c    mov    z, #1
      if_c    jmp    #.breakone
              testp  rxpin wz
      if_z    mov    z, #1
      if_z    rdpin  temp2, rxpin
      if_z    shr    temp2, #32 - 28
      until z or (timeout and (endtime - _getcnt() < 0))
      if z
        rxbyte := temp2 & $ff
      _rx_temp := temp2

    Note the repeat ... until loop containing the assembly language. It looks for start/stop bits and shifts until it finds one. Excess bits that are left over get stored in the _rx_temp variable and used again on the next call to _rxraw. They are not "thrown away", and as I mentioned you can test this by streaming data at high speed to the P2 -- the bytes are all received and come at roughly 10 bits per byte, just as one would expect, and not 28 bits/byte.
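
    To make the shifting logic easier to follow, here is a small C model of one polling step. It is a sketch only: rx_temp_model and rxraw_model are hypothetical names, the smart pin is replaced by a pointer that is either NULL (nothing new) or points at an already right-justified 28-bit word, and the timeout handling is left out.

```c
#include <stdint.h>
#include <stddef.h>

/* Leftover bits carried between calls, playing the role of _rx_temp. */
static uint32_t rx_temp_model = 0;

/* One polling step. `pin_word` points at a freshly arrived 28-bit word
   (already right-justified), or is NULL if the smart pin has nothing new.
   Returns the next byte, or -1 if no character is available. */
static int rxraw_model(const uint32_t *pin_word)
{
    uint32_t temp = rx_temp_model;
    /* bit 8 must be a stop bit (1) and bit 9 the next start bit (0) */
    int framed = ((temp >> 8) & 1u) && !((temp >> 9) & 1u);
    int got = 0;

    temp >>= 10;                         /* move the next character down */
    if (framed) {
        got = 1;                         /* another character was buffered */
    } else if (pin_word != NULL) {
        temp = *pin_word & 0x0FFFFFFFu;  /* take a new 28-bit word */
        got = 1;
    }
    rx_temp_model = temp;
    return got ? (int)(temp & 0xFFu) : -1;
}
```

    Feeding it one word that holds three gapless characters, followed by polls with NULL, returns the three characters in order and then -1, which is the behaviour described above: the extra bits are kept for the next call rather than thrown away.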

    At least, that's the case on my Linux machine. Maybe Windows leaves gaps between the bytes? That would confuse things and could be the source of the problems.

  • evanh Posts: 16,345
    edited 2025-03-01 00:29

    hmmm, oddly, I don't have that source file ... time to start a fresh git thingy ....

    EDIT: Doh! I was looking in include/sys

  • evanh Posts: 16,345
    edited 2025-03-01 02:34

    Here are hopefully better comments explaining that routine. I've also optimised it.
    temp2 adds double-buffering as well as the hack extending to 30-bit framing. Without interrupts or a dedicated cog, there's little value in going further.

                    testb   temp2, #8   wc      ' prior stop-bit (1) check
                    testbn  temp2, #9   andc    ' next start-bit (0) check
                    shr     temp2, #10          ' shift down next character, if any
                    testp   rxpin   wz          ' 30 bits have arrived, minus the framing for 28 bits in hardware buffer
      if_c_or_z     mov     z, #1               ' characters are available
      if_nc_and_z   rdpin   temp2, rxpin        ' move hardware buffer into temp2 buffer
      if_nc_and_z   shr     temp2, #32 - 28     ' and adjust, makes temp2[31:28] = 0 always

    The hardware frame length is set to 30 bits instead of 10 bits. This, in turn, demands a higher accuracy of the tx and rx bauds matching each other.

    If the temp2 stop-start bit checks fail then it considers the whole temp2 buffer as empty.

    A potential failure of the hack is when there are small character-to-character gaps. In other words, it relies on the characters either being gapless or having gaps longer than 20 bit times (two characters).

  • evanh Posts: 16,345
    edited 2025-03-01 02:36

    Maybe could make the buffer shift conditional. This would freeze the final character and, more importantly, ignore any trailing bits that got swept up between then and end of 30-bit frame.

                    testb   temp2, #8   wc      ' prior stop-bit (1) check
                    testbn  temp2, #9   andc    ' next start-bit (0) check
      if_c          shr     temp2, #10          ' shift down next character, if any
                    testp   rxpin   wz          ' 30 bits have arrived, minus the framing for 28 bits in hardware buffer
      if_c_or_z     mov     z, #1               ' characters are available
      if_nc_and_z   rdpin   temp2, rxpin        ' move hardware buffer into temp2 buffer
      if_nc_and_z   shr     temp2, #32 - 28     ' and adjust, makes temp2[31:28] = 0 always
  • @evanh and @ersmith
    Ah, thanks for taking this seriously now!

    So I was wrong in my conclusion that 2/3 of the chars are swallowed. The problem is the broken synchronisation.

    To have a common basis, I have written this little test for gets()

    // Test for gets()
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    enum { _xtlfreq = 25_000_000 } ;
    enum {    _clkfreq = 200_000_000 } ;
    enum {    DOWNLOAD_BAUD = 230_400 } ;
    enum {    DEBUG_BAUD = 230_400 } ;
    extern register int R0, R1, R2, R3; //
    char line[300];
    void mygets(void) { // get text into block Buffer until ESC=27 or cr=13
       R1= (int)line;
       __asm {
          wxpin ##((868<<16)+7),#63 // 200E6/230400, patch to 8 bit
          WAITLOOP testp #63 wc   // test for char received
             if_nc jmp #WAITLOOP  // nc from serial full duplex
             rdpin  R0, #63        // get the char and acknowledge
             getbyte    R0, R0, #3
             cmp R0,#27 wz        // ESC
             if_ne wrbyte R0,R1
             if_ne add R1,#1
             if_ne cmp R0,#13 wz   // cr
             if_ne jmp #WAITLOOP
          wxpin ##((868<<16)+27),#63 // 200E6/230400, 28 bit patch back
          wrbyte #0,R1
       }
    }
    void main(void) {
        while(1) {
                printf("Input Text \n");
                mygets(); // hardcoded for line
                printf("%s \n",line);
        }
    }

    If you use it with the standard gets() and with "internal ANSI terminal", paste is offered but does not work at all. The text appears on the screen, but it is not sent over and therefore not repeated.

    If you use it with Teraterm, 2 stop bits and a 1 ms delay per char, paste works as it should, but very slowly.
    If you switch off the delay, the paste gets destroyed to "ucË«£›g6[¶l–". Only the very first char 'u' is correct and the number of chars is lower, which is what I meant by "swallowed". (CR was inserted manually to end gets().)

    By chance it worked here when I switched to 1 stop bit. For some reason Windows USB did not insert any unknown delays in this case. That's the last line.

    The question is whether we agree that gets() must work (as usual!!!) with any delay between chars and also, as usual, with 1.5 or 2 stop bits?

    I do not understand why you use such a complicated approach, since my straightforward code shows that the smartpin 8 bit async mode just works out of the box, even with one stop bit (tested with Teraterm)?????

    A good use case for a P1/P2 is to use it as a subsystem for a PC or Linux board. In this case there is no terminal on the PC side but some other program. For example, a GRBL setup uses an original Arduino to drive the stepper motors while some program on the PC side does the graphics and display and sends commands. I myself had a similar setup for my little lathe with P1 and Tachyon. On the Raspi side a Python script was used, which sent commands over to the P1 and got answers. Forth was used as the protocol. It is a very powerful protocol.
    So in my experience it is not a good idea to restrict the serial use case to either hand input or "exactly one stop bit". Long term it would make a lot of sense to have a buffered serial for getchar() and fgets(). For the moment I will have to do it myself.

    If you still think that the code is as it should be, then please insert a warning in the docs that the sender must either use more than 20 stop bits or guarantee exactly one stop bit with no additional delays. (Who can with Windows?)
    Cheers Christof

  • evanh Posts: 16,345
    edited 2025-03-01 11:54

    Yes, it assumes one stop bit. Why use anything else?

    1.5 certainly won't be an option but I'll have a go at handling 2+, see if it can be done efficiently ...

    PS: The reason for doing this is straightforward speed. Both the smartpin's shifter and buffer are triple length. Bigger buffering allows more to be collected per access.

  • evanh Posts: 16,345

    grrr, loadp2 only does 8N1.

  • evanh Posts: 16,345
    edited 2025-03-01 13:36

    nah, failed. What we've got works for 8N1. Otherwise best to configure the smartpin normally.

  • ersmith Posts: 6,147

    @evanh I think the real problem isn't so much stop bits but what I will for lack of a better term call "idle bits" -- time when there is no data on the wire. If the data is coming at full speed the existing code works fine (at 8N1). If it's coming very slowly then it works fine too (there is only one actual character in each period of 28 bits). It's when the data is coming quickly but not quite at full speed that things break down -- we get 8 real data bits, a stop bit, then some number of idle bits (time when the signal is high because there is no transmission), and then a new start bit. The number of idle bits is unpredictable :(.
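
    The effect of idle bits can be reproduced with a small C model of the 28-bit window. This is a sketch under simplifying assumptions: it only looks at the first window captured after the first start bit, every gap has the same length, and the names (capture_window, unpack_window) are hypothetical.

```c
#include <stdint.h>

#define FRAME_BITS 28

/* Model the 28 bits captured after the first start bit, for characters
   separated by `gap` extra idle (high) bit times. */
static uint32_t capture_window(const unsigned char *chars, int n, int gap)
{
    uint32_t word = 0x0FFFFFFFu;   /* an idle line reads as all ones */
    int pos = 0;                   /* bit position of the current data bit 0 */
    for (int i = 0; i < n; i++) {
        /* start bit (0) of every character after the first */
        if (i > 0 && pos - 1 < FRAME_BITS)
            word &= ~(1u << (pos - 1));
        for (int b = 0; b < 8; b++) {
            int p = pos + b;
            if (p < FRAME_BITS && !((chars[i] >> b) & 1))
                word &= ~(1u << p);          /* data bit is 0 */
        }
        pos += 8 + 1 + gap + 1;    /* data, stop bit, idle bits, next start bit */
    }
    return word;
}

/* Apply the stop/start check: bit 8 must be 1 and bit 9 must be 0
   before the next character is accepted. Returns the character count. */
static int unpack_window(uint32_t word, unsigned char *out)
{
    int count = 0;
    out[count++] = (unsigned char)(word & 0xFFu);
    while (count < 3 && ((word >> 8) & 1u) && !((word >> 9) & 1u)) {
        word >>= 10;
        out[count++] = (unsigned char)(word & 0xFFu);
    }
    return count;
}
```

    With gap = 0 all three characters are recovered. With gap = 1 only the first is, even though most of the second character and part of the third sit in the same window. With gap = 20 the window holds a single clean character again.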

    We might be able to solve this by inverting the data and using ENCOD to search for the start bits within the 28 bit window. Re-synchronizing could be a pain though. For now I think I will drop back to the traditional 8 bit per character method. I did really like the 28 bit receive because it gave us a much bigger leeway for starting the receive code, but the problems with synchronization are a bit scary.

  • evanh Posts: 16,345

    I think he actually has been trying to use 2 stop bits. All the rest of the talk I think was just efforts to solve it.
