There's a new flexprop release (7.0.4) with some significant bug fixes. Thanks to @"Stephen Moraco" for finding a subtle problem with optimizations around functions that call ABORT, and to @Wuerfel_21 for her testing and many useful suggestions.
Solved
PANIC Flexprop 7.0.4 when I add the line
mount("", _vfs_open_sdcard());
then the size of the program explodes +340000 bytes or so????
Is there a tiny file system for SD card only?
Thanks a lot! Christof
Ah, the big size comes from my silly HEAPSIZE setting, which is ignored if you don't have the mount command.
My file read/write tester program, with diagnostics, is about 64 kB (Spin2 binary), 55 kB (C binary).
Here it is configured to use the built-in SD card driver.
Thank you very much @evanh, your code works with my SD card. And if I insert the card into my PC I can see the files it has generated.
Put on track again. Great!
The sd_dir.bas works but sd_write.bas example seems not to work.
The hello.txt is not there, if I insert the card in the PC.
@ersmith
Hi, Eric,
something seems to be broken with the sd_write.bas example. I changed /dir to /sd but this did not help. "" instead of "/sd" seems generally not to be allowed either?
Also I have the question, if there is the possibility to have a second SD Card in the system?
Thank you and cheers, Christof
Hi,
I am using getchar() in C and it is easily overrun, so I conclude there is no fifo.
Is there a fifo buffered version of serial input to P2 via getchar()?
Thanks, Christof
@"Christof Eb." said:
@ersmith
Hi, Eric,
something seems to be broken with the sd_write.bas example. I changed /dir to /sd but this did not help.
The problem here is that the new sd driver needs a file handle, and allocates #3, which conflicts with the BASIC code. If you change all of the #3 in the BASIC to #4 it should work. I'll fix the samples soon, thanks for catching this.
"" instead of "/sd" seems generally not to be allowed either?
Correct, an empty string is not allowed for mount and returns EINVAL.
Also I have the question, if there is the possibility to have a second SD Card in the system?
Yes, it should be possible by using the _vfs_open_sdcardx(a, b, c, d) call to specify the pins explicitly instead of _vfs_open_sdcard.
I am using getchar() in C and it is easily overrun, so I conclude there is no fifo.
That's almost right, there is room for 4 characters to be buffered inside the smart pin.
Is there a fifo buffered version of serial input to P2 via getchar()?
You could use one of the existing buffered serial objects in Spin2, and hook it up to stdio. Getting it to work with stdio functions like getchar would be a bit of work, but just using the object directly is very easy.
Thanks for the reply!
As far as I know, these SPIN2 objects need extra COGs.
Taqoz has an interrupt driven serial input in the main COG. The service routine and the ring buffer is in HUB. Also a few register variables including the vector register. Wouldn't that be interesting for FlexProp? For me it is not clear how to find the service routine in HUB RAM to install it. Below is Peter's code.
How would that look like: mov ijmp1,##@taqoz_rxisr
016ac | '#######################################################################'
016ac | ' serial.p2a
016ac | '#######################################################################'
016ac |
016ac |
016ac | InitSerial drvh #tx_pin ' force high '
016ac 59 7C 64 FD
016b0 3E F8 0C FC | wrpin #$7C,#tx_pin ' asynchronous transmit
016b4 3F 7C 0C FC | wrpin #$3E,#rx_pin ' asynchronous receive
016b8 | 'SETBAUD ' calculate baud timing at runtime'
016b8 14 20 04 FB | rdlong r0,#@_CPUHZ ' read from config table in low hub'
016bc 1C 22 04 FB | rdlong r1,#@_BAUDR
016c0 11 20 10 FD | qdiv r0,r1 ' cpuhz/baud'
016c4 18 20 60 FD | getqx r0
016c8 10 20 64 F0 | shl r0,#16 ' (cpuhz/baud)<<16'
016cc 07 20 04 F1 | add r0,#7 ' add 8 data bits
016d0 3F 20 14 FC | wxpin r0,#rx_pin ' write config baud ticks and 8 data
016d4 02 20 04 F1 | add r0,#morestops ' ADD 2 MORE STOP BITS FOR TX 191120 '
016d8 3E 20 14 FC | wxpin r0,#tx_pin ' baud 8 data
016dc 3E 1A 2C FC | wypin #$0D,#tx_pin ' kick off transmit empty flag'
016e0 |
016e0 41 7E 64 FD | dirh #rx_pin ' enable smartpin mode'
016e4 3F 42 9C FA | rdpin rxdat,#rx_pin wc ' clear rx?
016e8 25 00 64 FD | setint1 #0 ' disable int0'
016ec 00 3E 04 F6 | mov rxwrC,#0
016f0 00 40 04 F6 | mov rxrdC,#0
016f4 1F 3C CC F9 | bmask rxlong,#31 ' invalid codes'
016f8 |
016f8 20 7E 67 FD | setse1 #%110<<6+rx_pin ' set se1 to trigger on rx char event?????
016fc 0B 00 00 FF
01700 0C E9 07 F6 | mov ijmp1,##@taqoz_rxisr ' set int1 jump vector to receive buffer
01704 25 08 64 FD | setint1 #4 ' Enable int0 to trigger on se1'
01708 2D 00 64 FD | ret
0170c |
0170c | '
0170c | ' ( SERIAL RECEIVE INTERRUPT SERVICE ROUTINE )' 47!
0170c | '
0170c | taqoz_rxisr
0170c 3F 42 8C FA | rdpin rxdat,#rx_pin ' recv byte (bits31:24) from rx pin
01710 18 42 44 F0 | shr rxdat,#24 ' right justify'
01714 08 3C 64 F0 | shl rxlong,#8 ' maintain a 4 character deep sequence long'
01718 21 3C 40 F5 | or rxlong,rxdat ' update history word'
0171c 1F 44 00 F6 | mov rxptr,rxwrC ' update write pointer'
01720 FA 03 00 FF
01724 00 45 04 F1 | add rxptr,##rxbuffers
01728 22 42 40 FC | wrbyte rxdat,rxptr ' write char to buffer'
0172c 01 00 00 FF
01730 FF 3E 04 F7 | incmod rxwrC,##rxsize-1 ' update write index'
01734 1C 42 14 F2 | cmp rxdat,#$1C wc ' don't check sequences if not a certain control
01738 F5 FF 3B 3B | if_nc reti1
0173c | ''
0173c 00 3C 0C F2 | cmp rxlong,#0 wz
01740 30 00 90 AD | if_z jmp #RST
01744 14 42 14 F2 | cmp rxdat,#$14 wc ' ignore control chars < $14 '
01748 F5 FF 3B CB | if_c reti1
0174c 0A 0A 0A FF
01750 14 3C 0C F2 | cmp rxlong,##$14141414 wz ' check ^T^T^T^T for force TRACE sequence'
01754 C8 EC BF AD | if_z call #TRACE
01758 F5 FF 3B AB | if_z reti1
0175c 8A 8A 0A FF
01760 15 3D 0C F2 | cmp rxlong,##$15151515 wz ' check ^U^U^U^U for force UNTRACE sequence'
01764 CC EC BF AD | if_z call #UNTRACE ' The TRACING locations has the default doNEXT instruction'
01768 F5 FF 3B AB | if_z reti1
0176c 8D 8D 0D FF
01770 1B 3D 0C F2 | cmp rxlong,##$1B1B1B1B wz ' check esc esc esc esc sequence'
01774 | RST if_z coginit #0,#0 ' force reset '
01774 00 00 EC AC
01778 0D 0D 0D FF
0177c 1A 3C 0C F2 | cmp rxlong,##$1A1A1A1A wz ' check for ^Z^Z^Z^Z sequence '
01780 1C 42 C4 A9 | if_z decod rxdat,#28 ' reboot via hub'
01784 00 42 60 AD | if_z hubset rxdat
01788 F5 FF 3B FB | reti1
0178c |
0178c | ' TAQOZ interface '
0178c | ' Read the next character from the serial rx buffer or return with null if empty
0178c | ' Note: INTERNAL - called by CONKEY
0178c | READRX cmp rxrdC,rxwrC wz ' check head/tail
0178c 1F 40 08 F2
01790 81 00 80 AD | if_z jmp #PUSHACC ' return with null if empty'
01794 20 22 00 F6 | mov r1,rxrdC ' save current read index '
01798 FA 03 00 FF
0179c 00 23 04 F1 | add r1,##rxbuffers ' index into the rx buffer'
017a0 11 1C C0 FA | rdbyte u,r1 ' read the data'
017a4 01 00 00 FF
017a8 FF 40 04 F7 | incmod rxrdC,##rxsize-1 ' and update read index with wrap'
017ac | rr1 jmp #PUSHACC
017ac 81 00 80 FD
017b0 |
017b0 | { *** P2 INTERRUPTS ***
017b0 | When an interrupt event occurs, certain conditions must be met before the interrupt branch can happen:
017b0 |
017b0 | ALTxx / CRCNIB / SCA / SCAS / GETXACC / SETQ / SETQ2 / XORO32 / XBYTE must not be executing
017b0 | AUGS must not be executing or waiting for a S/# instruction
017b0 | AUGD must not be executing or waiting for a D/# instruction
017b0 | REP must not be executing or active
017b0 | STALLI must not be executing or active
017b0 | The cog must not be stalled in any WAITx instruction
017b0 | }
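The buffering scheme in the listing above (the ISR stores a byte and advances a write index with INCMOD, READRX compares the two indices and advances a read index) can be modelled in plain C. This is only a host-side sketch for following the logic, not code for FlexProp: the names `rx_ring`, `rx_ring_put`, `rx_ring_get` and the size `RX_SIZE` are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define RX_SIZE 16u   /* illustrative size; the listing uses its own rxsize */

typedef struct {
    volatile uint32_t wr;            /* like rxwrC: advanced only by the producer */
    volatile uint32_t rd;            /* like rxrdC: advanced only by the consumer */
    volatile uint8_t  buf[RX_SIZE];  /* like rxbuffers */
} rx_ring;

/* Same behaviour as INCMOD: wrap to 0 once the limit is reached. */
uint32_t incmod(uint32_t i, uint32_t limit)
{
    return (i == limit) ? 0 : i + 1;
}

void rx_ring_init(rx_ring *r)
{
    r->wr = 0;
    r->rd = 0;
}

/* Producer side (the part the ISR plays): store the byte, then advance
   the write index. As in the listing, there is no check for a full buffer. */
void rx_ring_put(rx_ring *r, uint8_t ch)
{
    assert(r->wr < RX_SIZE);
    r->buf[r->wr] = ch;
    r->wr = incmod(r->wr, RX_SIZE - 1);
}

/* Consumer side (the part READRX plays): return -1 when the indices match
   (empty), otherwise the oldest byte. */
int rx_ring_get(rx_ring *r)
{
    if (r->rd == r->wr)
        return -1;
    int ch = r->buf[r->rd];
    r->rd = incmod(r->rd, RX_SIZE - 1);
    return ch;
}
```

Because each index has a single writer, no lock is needed between producer and consumer. The price of leaving out the full check is that a producer which gets a whole buffer ahead makes the indices equal again, so the buffer then looks empty and its contents are lost.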
@evanh said:
Flex doesn't officially support interrupts at this stage. Using them would likely cause problems.
Hm, yes, I have seen the statement in the doc.
There are a lot of discussions in this forum about enhancements of this and that. I have the impression that a buffered serial might be of more practical value for more people than some other things. So perhaps it would be good to give this some priority. At the moment, the serial stream not only loses characters but also seems to lose start-bit synchronisation. At least, a lot of chars arrive which have never been sent.
Unfortunately Tera Term only allows a transmit delay in whole milliseconds, so if I set it to 1, the speed is reduced very much....
_rxraw(tim) can only be set to whole milliseconds too. I think that was a good idea once, when baud rates were 9600 or so.
@"Christof Eb." said:
As far as I know, these SPIN2 objects need extra COGs.
Taqoz has an interrupt driven serial input in the main COG. The service routine and the ring buffer is in HUB. Also a few register variables including the vector register. Wouldn't that be interesting for FlexProp? For me it is not clear how to find the service routine in HUB RAM to install it. Below is Peter's code.
As Evan has already mentioned, interrupts aren't supported by the compiler -- the code generated is not designed to be interrupt safe or to yield periodically for interrupts. Partly this is because the compiler also targets P1, and partly it's because interrupts are a fairly rare thing.
Giving up a COG for high priority tasks seems like a reasonable trade off. I know people try to save as many COGs as possible, but on P2 with the smartpins there is far less pressure on COGs than on P1, so I don't think using one COG for serial I/O (and perhaps other I/O too) is too bad. It's a rare program that actually needs all 8 COGs.
OK, so I try to work around.
Yes, this is about a "fast" data transfer. 230400 baud. P2 @ 200MHz. There should be plenty of time to get a char?
At other places in the program I use getchar() and also gets() so I have not taken over control over serial entirely.
The first version in C works, but only if throttled down by the transmitter with delays per char. The routine _rxraw(1) does a lot of things as far as I can see in the listing file. Is this the vfs getting in the way? I have no /host activated.
I have found no good description of the async mode. How it can buffer 4 chars is a miracle?
The other two very simple versions crash.
((The compiler loads the WAITLOOP into the cache for each char. This must be AI.... ))
I have tried if_c and if_nc but neither works.
What's wrong with my attempts? Thank you very much for some hints!
/*
PRIMI(gettext) { // get text into block Buffer until ESC
R1= (int)(forthMem+FORTHSIZE-1-BSIZE);
do {
do {
R0=(int)_rxraw(1);
} while(R0==-1);
if(R0!=27 ) { *(char*)R1 = R0; R1++; }
} while(R0!=27);
*(char*)R1 = 0;
}
PRIMI(gettext) { // get text into block Buffer until ESC=27 or EURO=128
R1= (int)(forthMem+FORTHSIZE-1-BSIZE);
do {
__asm {
WAITLOOP testp #63 wc // test for char received
if_c jmp #WAITLOOP
rdpin R0, #63 // get the char and acknowledge
shr R0,#32-8
wrpin R0,#62 // echo back
};
if(R0!=27 ) { *(char*)R1 = R0; R1++; }
} while(R0!=27 && R0!=128);
*(char*)R1 = 0;
}
*/
PRIMI(gettext) { // get text into block Buffer until ESC=27 or EURO=128
R1= (int)(forthMem+FORTHSIZE-1-BSIZE);
do {
__asm {
WAITLOOP
rdpin R0, #63 wc // get the char and acknowledge
if_nc jmp #WAITLOOP // try again
shr R0,#32-8
wrpin R0,#62 // echo back
};
if(R0!=27 ) { *(char*)R1 = R0; R1++; }
} while(R0!=27 && R0!=128);
*(char*)R1 = 0;
}
Put the whole buffering loop inside the assembly block. Then it'll go real fast. With the way it is now there is a notable overhead in copying the Pasm code into cogRAM upon each and every time around the loop.
OK, some very strange progress:
This works, but just as badly as the C version: only with a delay per char. Does it rely on a very large number of stop bits?
PRIMI(gettext) { // get text into block Buffer until ESC=27 or EURO=128
R1= (int)(forthMem+FORTHSIZE-1-BSIZE);
R0=(int)_rxraw(1); // for some kind of setup ?????
if(R0!=27 && R0!=-1) { *(char*)R1 = R0; R1++; }
do {
__asm {
WAITLOOP testp #63 wc // test for char received
if_nc jmp #WAITLOOP // nc from serial full duplex
rdpin R0, #63 // get the char and acknowledge
// shr R0,#32-8
shr R0,#4 // why on earth?
getbyte R0, R0, #0
// wrpin R0,#62 // echo back
};
if(R0!=27 ) { *(char*)R1 = R0; R1++; }
} while(R0!=27 && R0!=128);
*(char*)R1 = 0;
}
@"Christof Eb." Honestly I wouldn't use inline assembly for this, especially at 230400 baud -- the P2 code is not going to be the bottleneck. I would just use a loop calling _rxraw(0) and checking for the returned value. @evanh did some clever things to make _rxraw buffer up to 3 characters (not 4 as I thought; the stop/start bits have to fit in the 32 bit smartpin buffer) so it's robust against small variations in timing.
The flexc library routines are optimized for synchronous transfers, such as the ones performed by the built in host file system. Indeed using the host file system you can get up to ~21000 bytes/second transferred at 230400 baud, e.g. with a program like the following:
#include <stdio.h>
#include <propeller2.h>
#define BIG_BUFFER_SIZE (64*1024)
char bigbuf[BIG_BUFFER_SIZE];
int main()
{
FILE *f;
unsigned long elapsed;
mount("/data", _vfs_open_host());
//mount("/data", _vfs_open_sdcard());
f = fopen("/data/test.bin", "rb");
if (!f) {
perror("test.bin");
return 1;
}
printf("starting test...\n");
int errors = 0;
elapsed = _getms();
int r = fread(bigbuf, 1, BIG_BUFFER_SIZE, f);
fclose(f);
elapsed = _getms() - elapsed;
float bytes_per_second = 1000 * (float)r / (float)elapsed;
printf("time to read %d bytes: %u ms\n", r, elapsed);
printf(" (%f bytes per second)\n", bytes_per_second);
return 0;
}
Hi Eric,
this is a very nice forum and I appreciate your FlexProp very much. But this discussion is . I hope that I can manage to keep a polite tone. These things are very time consuming for me, because I have to look up so many things. For example I have been wondering for quite a while, what magical things "getbyte local06, local06, #0" does, because I did not get what is wrong. Normally the byte should sit in the MSB and be shifted by 24.
@ersmith said:
@"Christof Eb." Honestly I wouldn't use inline assembly for this, especially at 230400 baud -- the P2 code is not going to be the bottleneck. I would just use a loop calling _rxraw(0) and checking for the returned value.
This is exactly what I have done and (after a trial in the Forth level, which also should be fast enough) posted in #1694 and what does not work!!! Please have a look at the code of __system___rxraw and of __system___setbaud !!! If you then still think, that this is like it should be, then please add the following to your documentation:
"The serial receiver is set up in a mighty clever way. It does not only sample 8bit but even 8+20 bits. These additional 20 bits are discarded. So to prevent loss of characters and avoid de-synchronisation just set your sending software to use more than 20 stopbits."
Sorry, Christof
P.S. If the vfs needs this strange setup (I wonder if this can ever work reliably, if the sender is working over USB?), then only this part of the software should use the 28bits.
Edit, this idiotic patching code works without additional stop bits:
PRIMI(gettext) { // get text into block Buffer until ESC=27 or EURO=128
R1= (int)(forthMem+FORTHSIZE-1-BSIZE);
__asm {
wxpin ##((868<<16)+7),#63 // 200E6/230400, patch to 8 bit
WAITLOOP testp #63 wc // test for char received
if_nc jmp #WAITLOOP // nc from serial full duplex
rdpin R0, #63 // get the char and acknowledge
//shr R0,#24
getbyte R0, R0, #3
cmp R0,#27 wz // ESC
if_ne wrbyte R0,R1
if_ne add R1,#1
if_ne cmp R0,#128 wz // EURO
if_ne jmp #WAITLOOP
wxpin ##((868<<16)+27),#63 // 200E6/230400, 28 bit patch back
wrbyte #0,R1
}
}
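For reference, the two constants written with wxpin in the patch above follow the same pattern as the TAQOZ listing: clocks per bit in the upper 16 bits, and the number of bits minus one in the low bits. A small host-side helper (the name `async_x_value` is hypothetical, and the fractional baud bits are ignored here) reproduces them:

```c
#include <assert.h>
#include <stdint.h>

/* Build the value written with WXPIN for async serial:
   (clkfreq / baud) << 16, plus (number of bits - 1).
   Integer division only; any fractional part is dropped in this sketch. */
uint32_t async_x_value(uint32_t clkfreq, uint32_t baud, uint32_t nbits)
{
    assert(baud != 0 && nbits >= 1 && nbits <= 32);
    uint32_t clocks_per_bit = clkfreq / baud;
    assert(clocks_per_bit < 0x10000u);   /* must fit in the upper 16 bits */
    return (clocks_per_bit << 16) | (nbits - 1);
}
```

With a 200 MHz clock and 230400 baud the division gives 868, so `async_x_value(200000000, 230400, 8)` equals `(868<<16)+7` and `async_x_value(200000000, 230400, 28)` equals `(868<<16)+27`, the two values used in the patch.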
@ersmith said:
@"Christof Eb." Honestly I wouldn't use inline assembly for this, especially at 230400 baud -- the P2 code is not going to be the bottleneck. I would just use a loop calling _rxraw(0) and checking for the returned value.
This is exactly what I have done and (after a trial in the Forth level, which also should be fast enough) posted in #1694 and what does not work!!! Please have a look at the code of __system___rxraw and of __system___setbaud !!! If you then still think, that this is like it should be, then please add the following to your documentation:
"The serial receiver is set up in a mighty clever way. It does not only sample 8bit but even 8+20 bits. These additional 20 bits are discarded. So to prevent loss of characters and avoid de-synchronisation just set your sending software to use more than 20 stopbits."
That is absolutely not true. If you tried the program I posted earlier you'll see that we can transfer ~21000 bytes/second at 230400 baud; clearly that could not be done with 20 stop bits. But the _rxraw code is very tricky, and if you're trying to sync up to it with your own inline assembly it probably won't work. In that case you're better off avoiding _rxraw and using an existing serial object (like jm_serial.spin2) instead.
I did make a mistake with _rxraw(0); that should have returned -1 immediately, but instead it waits forever. There is a missing function to check for a character and return if it is not present (_rxraw(1) is too slow because it waits a millisecond). I will add an _rxpoll function in the next release to correct this.
In the meantime, what (at a high level) are you trying to accomplish? If it's transferring large amounts of data quickly from PC to the P2, then I really suggest you use the built in host file system. That can keep up even at 2_000_000 baud (transferring 100K of data in about 832 milliseconds). If it's interactive reads, then _rxraw(1) is sufficient (people can't type faster than 1000 keys/second). If it's something else, then you can either get the new version of flexspin with _rxpoll() or use an object running in another COG.
Hi,
What exactly of my sentence is not true in your opinion? Your code receives 28 bits and returns only 8 of them. It swallows the rest which might be the next char. The problem is, that the time for the next byte is totally unknown. For this there are start and stop bits. Which you put out of order.
Even your own ansi terminal has a paste feature, which would be handy if it worked sending to P2.
Getchar() and gets() are in my opinion a function which has a defined behaviour. If it swallows 2 of 3 chars it is clearly a bug.
These functions are universal and can be used for anything. Hand typed input is only one special application.
@"Christof Eb." I'm sorry that you've found this experience frustrating. It's clear that the use case that this code was optimized for (large blocks of data streaming as fast as possible out of the PC, so in 10 bit chunks framed with single start/stop bits and no gaps between them) is not the use case that you seem to have right now. So indeed it probably makes sense for you to use a different serial library for your application.
The source code for _rxraw() is in flexspin, in the sys/p2_code.spin file. It looks like:
' timeout is approximately in milliseconds (actually in 1024ths of a second)
pri _rxraw(timeout = 0) : rxbyte = long | z, endtime, temp2, rxpin
if _bitcycles == 0
_setbaud(__default_baud__)
if timeout
endtime := _getcnt() + timeout * (__clkfreq_var >> 10)
else
endtime := 0 ' just gets rid of a compiler warning
rxbyte := -1
rxpin := _rxpin
z := 0
temp2 := _rx_temp
'' slightly tricky code for pulling out the bytes from the 28 bits
'' of data presented by the smartpin
'' Courtesy of evanh
repeat
asm
testb temp2, #8 wc ' check framing of prior character for valid character
testbn temp2, #9 andc ' more framing check (1 then 0)
shr temp2, #10 ' shift down next character, if any
if_c mov z, #1
if_c jmp #.breakone
testp rxpin wz
if_z mov z, #1
if_z rdpin temp2, rxpin
if_z shr temp2, #32 - 28
.breakone
endasm
until z or (timeout and (endtime - _getcnt() < 0))
if z
rxbyte := temp2 & $ff
_rx_temp := temp2
Note the repeat ... until loop containing the assembly language. It looks for start/stop bits and shifts until it finds one. Excess bits that are left over get stored in the _rx_temp variable and used again on the next call to _rxraw. They are not "thrown away", and as I mentioned you can test this by streaming data at high speed to the P2 -- the bytes are all received and come at roughly 10 bits per byte, just as one would expect, and not 28 bits/byte.
At least, that's the case on my Linux machine. Maybe Windows leaves gaps between the bytes? That would confuse things and could be the source of the problems.
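A side note on the loop's exit test in the listing above: `endtime - _getcnt() < 0` stays correct when the 32-bit counter wraps, because only the signed difference is examined. A plain C model of that arithmetic, with hypothetical function names and the counter value passed in as a parameter, might look like this:

```c
#include <assert.h>
#include <stdint.h>

/* Deadline for `timeout` units of 1/1024 second, as in the Spin2 code:
   endtime := now + timeout * (clkfreq >> 10). Wraps modulo 2^32. */
uint32_t deadline_ticks(uint32_t now, uint32_t timeout, uint32_t clkfreq)
{
    /* the signed comparison below only works for spans under 2^31 ticks */
    assert((uint64_t)timeout * (clkfreq >> 10) < 0x80000000u);
    return now + timeout * (clkfreq >> 10);
}

/* Nonzero once `now` has moved past `endtime`, even across a counter wrap.
   The subtraction is done unsigned, then reinterpreted as signed. */
int deadline_passed(uint32_t endtime, uint32_t now)
{
    return (int32_t)(endtime - now) < 0;
}
```

At 200 MHz one timeout unit is 200000000 >> 10 = 195312 ticks, slightly less than the 200000 ticks of a true millisecond, which is why the comment in the source says "approximately".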
Here's hopefully better comments explaining that routine. I've also optimised it now as well. temp2 adds double-buffering as well as the hack extending to 30-bit framing. Without interrupts or a dedicated cog, there's little value in going further.
asm
testb temp2, #8 wc ' prior stop-bit (1) check
testbn temp2, #9 andc ' next start-bit (0) check
shr temp2, #10 ' shift down next character, if any
testp rxpin wz ' 30 bits have arrived, minus the framing for 28 bits in hardware buffer
if_c_or_z mov z, #1 ' characters are available
if_nc_and_z rdpin temp2, rxpin ' move hardware buffer into temp2 buffer
if_nc_and_z shr temp2, #32 - 28 ' and adjust, makes temp2[31:28] = 0 always
endasm
The hardware frame length is set to 30 bits instead of 10 bits. This, in turn, demands a higher accuracy of the tx and rx bauds matching each other.
If the temp2 stop-start bit checks fail then it considers the whole temp2 buffer as empty.
The hack can fail when there are small character-to-character gaps. In other words, it relies on the characters either being gapless or having gaps longer than 20 bit times (two characters).
Maybe the buffer shift could be made conditional. This would freeze the final character and, more importantly, ignore any trailing bits that got swept up between then and the end of the 30-bit frame.
asm
testb temp2, #8 wc ' prior stop-bit (1) check
testbn temp2, #9 andc ' next start-bit (0) check
if_c shr temp2, #10 ' shift down next character, if any
testp rxpin wz ' 30 bits have arrived, minus the framing for 28 bits in hardware buffer
if_c_or_z mov z, #1 ' characters are available
if_nc_and_z rdpin temp2, rxpin ' move hardware buffer into temp2 buffer
if_nc_and_z shr temp2, #32 - 28 ' and adjust, makes temp2[31:28] = 0 always
endasm
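The behaviour described in this post (gapless characters and long gaps work, short gaps do not) can be reproduced on a PC with a small model of the stop/start check. This is a sketch under assumptions, not the flexspin source: it assumes the smart pin delivers the 28 line levels that follow a start bit, first bit in bit 0, and the helper names are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

#define FRAME_BITS 28

/* Append one 8N1 character (start, 8 data bits LSB first, stop) followed by
   `idle` extra high bit times. Returns the new number of line levels. */
int append_char(int *line, int n, uint8_t ch, int idle)
{
    line[n++] = 0;                          /* start bit */
    for (int i = 0; i < 8; i++)
        line[n++] = (ch >> i) & 1;          /* data bits, LSB first */
    line[n++] = 1;                          /* stop bit */
    while (idle-- > 0)
        line[n++] = 1;                      /* idle line is high */
    return n;
}

/* Assumed capture model: the FRAME_BITS levels after the first start bit,
   padded with idle (high) if the line is shorter. */
uint32_t capture_frame(const int *line, int n)
{
    assert(n > 0 && line[0] == 0);          /* capture begins at a start bit */
    uint32_t frame = 0;
    for (int i = 0; i < FRAME_BITS; i++) {
        int level = (i + 1 < n) ? line[i + 1] : 1;
        frame |= (uint32_t)level << i;
    }
    return frame;
}

/* Extract characters the way the check in the assembly does: a following
   character is accepted only if bit 8 is 1 (stop) and bit 9 is 0 (start).
   Writes at most 3 characters to out and returns how many. */
int decode_frame(uint32_t temp2, uint8_t *out)
{
    int count = 0;
    out[count++] = temp2 & 0xff;            /* first character of the frame */
    for (;;) {
        int stop_ok  = (temp2 >> 8) & 1;    /* testb  temp2, #8 */
        int start_ok = !((temp2 >> 9) & 1); /* testbn temp2, #9 */
        temp2 >>= 10;                       /* shr    temp2, #10 */
        if (!(stop_ok && start_ok))
            break;                          /* rest of the frame is dropped */
        out[count++] = temp2 & 0xff;
    }
    return count;
}
```

Feeding it "abc" with no gaps yields all three characters, and a single character followed by a long idle period yields exactly one. With one idle bit time between 'a' and 'b', only 'a' comes out: bit 9 is high, the check fails, and the bits of 'b' that were already captured are discarded.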
@evanh and @ersmith
Ah, thanks for taking this seriously now!
So I was wrong in my conclusion that 2/3 of the chars are swallowed. The problem is the broken synchronisation.
To have a common basis, I have written this little test for gets()
// Test for gets()
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
enum { _xtlfreq = 25_000_000 } ;
enum { _clkfreq = 200_000_000 } ;
enum { DOWNLOAD_BAUD = 230_400 } ;
enum { DEBUG_BAUD = 230_400 } ;
extern register int R0, R1, R2, R3; //
char line[300];
void mygets(void) { // get text into block Buffer until ESC=27 or cr=13
R1= (int)line;
__asm {
wxpin ##((868<<16)+7),#63 // 200E6/230400, patch to 8 bit
WAITLOOP testp #63 wc // test for char received
if_nc jmp #WAITLOOP // nc from serial full duplex
rdpin R0, #63 // get the char and acknowledge
getbyte R0, R0, #3
cmp R0,#27 wz // ESC
if_ne wrbyte R0,R1
if_ne add R1,#1
if_ne cmp R0,#13 wz // cr
if_ne jmp #WAITLOOP
wxpin ##((868<<16)+27),#63 // 200E6/230400, 28 bit patch back
wrbyte #0,R1
}
}
void main(void) {
while(1) {
printf("Input Text \n");
//gets(line);
mygets(); // hardcoded for line
printf("%s \n",line);
}
}
If you use it with the standard gets() and with "internal ANSI terminal", paste is offered but does not work at all. The text appears on the screen, but it is not sent over and therefore not repeated.
If you use it with Teraterm 2 stopbits+1ms delay per char, paste works as it should, but very slowly.
If you switch off the delay, the paste gets destroyed to "ucË«£›g6[¶l–". Only the very first char 'u' is correct, and fewer chars arrive than were sent, which is what I meant by "swallowed". (CR was inserted manually to end gets().)
By chance it worked here, if I switch to 1 stopbit. For some reason Windows USB did not insert any unknown delays in this case. That's the last line.
The question is whether we agree that gets() must work (as usual!!!) with any delay between chars and also, as usual, with 1.5 or 2 stop bits?
I do not understand why you use such a complicated way, when my straightforward code shows that the smartpin 8-bit async mode just works out of the box even with one stop bit, tested with Tera Term?????
A good use case for a P1/P2 is to use it as a subsystem for a PC or LINUX board. In this case on the PC side there is no Terminal but some other program. For example GRBL setup uses an original Arduino to drive the stepper motors while some program on the PC side is doing graphics and display and sends commands. Myself had a similar setup for my little lathe with P1 and Tachyon. On the raspi side a Python script was used, which sent commands over to P1 and got answers. Forth was used as protocol. It is a very powerful protocol.
So in my experience it is not a good idea, to fix the use case of the serial to either hand input or "exactly one stop bit". Long term it would make a lot of sense to have a buffered serial for getchar() and fgets(). For the moment I will have to do it myself.
If you still think that the code is as it should be, then please insert a warning in the docs that the sender must either use more than 20 stop bits or guarantee exactly one stop bit with no additional delays. (Who can, with Windows?)
Cheers Christof
Yes, it assumes one stop bit. Why use anything else?
1.5 certainly won't be an option but I'll have a go at handling 2+, see if it can be done efficiently ...
PS: The reason for doing this is straightforward speed. Both the smartpin's shifter and its buffer are triple length. Bigger buffering allows more to be collected per access.
@evanh I think the real problem isn't so much stop bits but what I will for lack of a better term call "idle bits" -- time when there is no data on the wire. If the data is coming at full speed the existing code works fine (at 8N1). If it's coming very slowly then it works fine too (there is only one actual character in each period of 28 bits). It's when the data is coming quickly but not quite at full speed that things break down -- we get 8 real data bits, a stop bit, then some number of idle bits (time when the signal is high because there is no transmission), and then a new start bit. The number of idle bits is unpredictable.
We might be able to solve this by inverting the data and using ENCOD to search for the start bits within the 28 bit window. Re-synchronizing could be a pain though. For now I think I will drop back to the traditional 8 bit per character method. I did really like the 28 bit receive because it gave us a much bigger leeway for starting the receive code, but the problems with synchronization are a bit scary.
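To make the idea of searching for start bits inside the 28-bit window concrete, here is a host-side sketch in C. It uses a plain loop rather than an inverted value and ENCOD, it assumes the frame layout of first received bit in bit 0, and the function name and the `resync_needed` flag are invented for this illustration; it is not what flexspin does.

```c
#include <assert.h>
#include <stdint.h>

#define FRAME_BITS 28

/* Decode every complete character in a 28-bit frame, skipping idle (high)
   bit times between a stop bit and the next start bit.
   Writes at most 3 characters to out and returns how many.
   *resync_needed is set when a stop bit is missing, or when a start bit was
   found but its character runs past the end of the frame. */
int decode_frame_skip_idle(uint32_t frame, uint8_t *out, int *resync_needed)
{
    assert(out != 0 && resync_needed != 0);
    int count = 0;
    int pos = 0;                            /* index of the next data bit 0 */
    *resync_needed = 0;
    while (pos + 8 <= FRAME_BITS) {
        out[count++] = (frame >> pos) & 0xff;
        pos += 8;                           /* now at the stop bit position */
        if (pos < FRAME_BITS && !((frame >> pos) & 1)) {
            *resync_needed = 1;             /* framing error: stop bit is low */
            return count;
        }
        while (pos < FRAME_BITS && ((frame >> pos) & 1))
            pos++;                          /* skip stop and idle bits */
        if (pos >= FRAME_BITS)
            return count;                   /* only idle time remained */
        pos++;                              /* step over the start bit */
    }
    *resync_needed = 1;                     /* character cut off by frame end */
    return count;
}
```

With one idle bit between 'a' and 'b' this recovers both characters from a single frame. With a third character after another idle bit, the start bit lands at bit 21 and the data would need bits 22 to 29, so the character is cut off by the end of the frame. The receiver would then have to pick it up again from the middle, which is the re-synchronization problem mentioned above.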
An empty string probably isn't valid.
It's only really intended for keyboard entry.
Eric has a file transfer protocol over comport functioning already I believe. If that's what you're after.
But it also wouldn't be too hard to take control of those serial smartpins yourself and ignore the C libs.
As Evan has already mentioned, interrupts aren't supported by the compiler -- the code generated is not designed to be interrupt safe or to yield periodically for interrupts. Partly this is because the compiler also targets P1, and partly it's because interrupts are a fairly rare thing.
Giving up a COG for high priority tasks seems like a reasonable trade off. I know people try to save as many COGs as possible, but on P2 with the smartpins there is far less pressure on COGs than on P1, so I don't think using one COG for serial I/O (and perhaps other I/O too) is too bad. It's a rare program that actually needs all 8 COGs.
OK, so I try to work around.
Yes, this is about a "fast" data transfer. 230400 baud. P2 @ 200MHz. There should be plenty of time to get a char?
At other places in the program I use getchar() and also gets() so I have not taken over control over serial entirely.
The first version in C works, but only if throttled down by the transmitter with delays per char. The routine _rxraw(1) does a lot of things as far as I can see in the the listing file. Is this the vfs getting in the way? I have no /host activated.
I have found no good description of the async mode. How it can buffer 4 chars is a miracle?
The other two very simple versions crash.
((The compiler loads the WAITLOOP into the cache for each char. This must be AI....))
I have tried if_c and if_nc but neither works.
What's wrong with my attempts? Thank you very much for some hints!
Put the whole buffering loop inside the assembly block. Then it'll go real fast. With the way it is now there is a notable overhead in copying the Pasm code into cogRAM upon each and every time around the loop.
BTW: How fast did the _rxraw() solution go?
OK, some very strange progress:
This works, but just as badly as the C version: only with a delay per char. Does it rely on a very large number of stop bits?
@"Christof Eb." Honestly I wouldn't use inline assembly for this, especially at 230400 baud -- the P2 code is not going to be the bottleneck. I would just use a loop calling _rxraw(0) and checking for the returned value. @evanh did some clever things to make _rxraw buffer up to 3 characters (not 4 as I thought; the stop/start bits have to fit in the 32 bit smartpin buffer) so it's robust against small variations in timing.
The flexc library routines are optimized for synchronous transfers, such as the ones performed by the built in host file system. Indeed using the host file system you can get up to ~21000 bytes/second transferred at 230400 baud, e.g. with a program like the following:
Hi Eric,
this is a very nice forum and I appreciate your FlexProp very much. But this discussion is
. I hope that I can manage to keep a polite tone. These things are very time consuming for me, because I have to look up so many things. For example I have been wondering for quite a while, what magical things "getbyte local06, local06, #0" does, because I did not get what is wrong. Normally the byte should sit in the MSB and be shifted by 24.
This is exactly what I have done and (after a trial in the Forth level, which also should be fast enough) posted in #1694 and what does not work!!! Please have a look at the code of __system___rxraw and of __system___setbaud !!! If you then still think, that this is like it should be, then please add the following to your documentation:
"The serial receiver is set up in a mighty clever way. It does not only sample 8bit but even 8+20 bits. These additional 20 bits are discarded. So to prevent loss of characters and avoid de-synchronisation just set your sending software to use more than 20 stopbits."
Sorry, Christof
P.S. If the vfs needs this strange setup (I wonder if this can ever work reliably, if the sender is working over USB?), then only this part of the software should use the 28bits.
Edit, this idiotic patching code works without additional stop bits:
That is absolutely not true. If you tried the program I posted earlier you'll see that we can transfer ~21000 bytes/second at 230400 baud; clearly that could not be done with 20 stop bits. But the _rxraw code is very tricky, and if you're trying to sync up to it with your own inline assembly it probably won't work. In that case you're better off avoiding _rxraw and using an existing serial object (like jm_serial.spin2) instead.
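The arithmetic behind this argument can be sketched as follows, assuming 8N1 framing costs 10 bits on the wire per byte and that 20 extra stop bits would make it 30. The helper names are purely illustrative.

```c
#include <stdint.h>

/* Upper bound on payload bytes per second, given the baud rate and the
   total number of wire bits per byte (start + data + stop/idle). */
uint32_t max_bytes_per_sec(uint32_t baud, uint32_t wire_bits_per_byte)
{
    return baud / wire_bits_per_byte;
}

/* Average wire bits per byte implied by an observed throughput,
   in tenths of a bit (truncated) to avoid floating point. */
uint32_t wire_bits_per_byte_x10(uint32_t baud, uint32_t bytes_per_sec)
{
    return (uint32_t)(((uint64_t)baud * 10u) / bytes_per_sec);
}
```

At 230400 baud the 8N1 ceiling is 23040 bytes/second, while 30 bits per byte would cap it at 7680 bytes/second. An observed ~21000 bytes/second works out to just under 11 wire bits per byte.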
I did make a mistake with _rxraw(0); that should have returned -1 immediately, but instead it waits forever. There is a missing function to check for a character and return if it is not present (_rxraw(1) is too slow because it waits a millisecond). I will add an _rxpoll function in the next release to correct this.
In the meantime, what (at a high level) are you trying to accomplish? If it's transferring large amounts of data quickly from PC to the P2, then I really suggest you use the built in host file system. That can keep up even at 2_000_000 baud (transferring 100K of data in about 832 milliseconds). If it's interactive reads, then _rxraw(1) is sufficient (people can't type faster than 1000 keys/second). If it's something else, then you can either get the new version of flexspin with _rxpoll() or use an object running in another COG.
Hi,
What exactly of my sentence is not true in your opinion? Your code receives 28 bits and returns only 8 of them. It swallows the rest which might be the next char. The problem is, that the time for the next byte is totally unknown. For this there are start and stop bits. Which you put out of order.
Even your own ansi terminal has a paste feature, which would be handy if it worked sending to P2.
Getchar() and gets() are, in my opinion, functions with a defined behaviour. If they swallow 2 of 3 chars, it is clearly a bug.
These functions are universal and can be used for anything. Hand typed input is only one special application.
I will stop now about this topic.
Time to get detailed, not walk away - Quote exact lines of source code and where they came from. Version numbers.
I have yet to find source code for _rxraw() myself.
@"Christof Eb." I'm sorry that you've found this experience frustrating. It's clear that the use case that this code was optimized for (large blocks of data streaming as fast as possible out of the PC, so in 10 bit chunks framed with single start/stop bits and no gaps between them) is not the use case that you seem to have right now. So indeed it probably makes sense for you to use a different serial library for your application.
The source code for _rxraw() is in flexspin, in the sys/p2_code.spin file. It looks like:
Note the repeat ... until loop containing the assembly language. It looks for start/stop bits and shifts until it finds one. Excess bits that are left over get stored in the _rx_temp variable and used again on the next call to _rxraw. They are not "thrown away", and as I mentioned you can test this by streaming data at high speed to the P2 -- the bytes are all received and come at roughly 10 bits per byte, just as one would expect, and not 28 bits/byte.
At least, that's the case on my Linux machine. Maybe Windows leaves gaps between the bytes? That would confuse things and could be the source of the problems.
hmmm, oddly, I don't have that source file ... time to start a fresh git thingy ....
EDIT: Doh! I was looking in include/sys
Here are, hopefully, better comments explaining that routine. I've also optimised it.
temp2 adds double-buffering as well as the hack extending to 30-bit framing. Without interrupts or a dedicated cog, there's little value in going further.
The hardware frame length is set to 30 bits instead of 10 bits. This, in turn, demands a higher accuracy of the tx and rx bauds matching each other.
If the temp2 stop-start bit checks fail then it considers the whole temp2 buffer as empty.
The hack can fail when there are small character-to-character gaps. In other words, it relies on the characters either being gapless or having gaps longer than 20 bit times (two characters).
Maybe could make the buffer shift conditional. This would freeze the final character and, more importantly, ignore any trailing bits that got swept up between then and end of 30-bit frame.
@evanh and @ersmith
Ah, thanks for taking this seriously now!
So I was wrong in my conclusion that 2/3 of the chars are swallowed. The problem is the dysfunctional synchronising.
To have a common basis, I have written this little test for gets()
If you use it with the standard gets() and with "internal ANSI terminal", paste is offered but does not work at all. The text appears on the screen, but it is not sent over and therefore not repeated.

If you use it with Teraterm 2 stopbits+1ms delay per char, paste works as it should, but very slowly.
If you switch off the delay, the paste gets destroyed to "ucË«£›g6[¶l–". Only the very first char 'u' is correct and the number of chars is lower, which is what I meant by "swallowed". (CR was inserted manually, to end gets()).
By chance it worked here, if I switch to 1 stopbit. For some reason Windows USB did not insert any unknown delays in this case. That's the last line.

The question is whether we agree that gets() must work (as usual!!!) with any delay between chars and also, as usual, with 1.5 or 2 stop bits?
I do not understand why you use such a complicated way, as my straightforward code shows that the smartpin 8 bit async mode just works out of the box even with one stop bit, tested with Teraterm?????
A good use case for a P1/P2 is to use it as a subsystem for a PC or LINUX board. In this case there is no terminal on the PC side but some other program. For example, a GRBL setup uses an original Arduino to drive the stepper motors while some program on the PC side does graphics and display and sends commands. I myself had a similar setup for my little lathe with P1 and Tachyon. On the raspi side a Python script was used, which sent commands over to the P1 and got answers. Forth was used as the protocol. It is a very powerful protocol.
So in my experience it is not a good idea to restrict the use case of the serial input to either hand input or "exactly one stop bit". Long term it would make a lot of sense to have a buffered serial for getchar() and fgets(). For the moment I will have to do it myself.
If you still think that the code is as it should be, then please insert a warning in the docs that the sender must either use more than 20 stop bits or make sure that exactly one stop bit with no additional delays is guaranteed. (Who can with Windows?)
Cheers Christof
Yes, it assumes one stop bit. Why use anything else?
1.5 certainly won't be an option but I'll have a go at handling 2+, see if it can be done efficiently ...
PS: The reason for doing this is straightforward speed. Both the smartpin's shifter and buffer are triple length. Bigger buffering allows more to be collected per access.
grrr, loadp2 only does 8N1.
nah, failed. What we've got works for 8N1. Otherwise best to configure the smartpin normally.
@evanh I think the real problem isn't so much stop bits but what I will for lack of a better term call "idle bits" -- time when there is no data on the wire. If the data is coming at full speed the existing code works fine (at 8N1). If it's coming very slowly then it works fine too (there is only one actual character in each period of 28 bits). It's when the data is coming quickly but not quite at full speed that things break down -- we get 8 real data bits, a stop bit, then some number of idle bits (time when the signal is high because there is no transmission), and then a new start bit. The number of idle bits is unpredictable.
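To make the idle-bit problem concrete, here is a small host-side simulation in C. It is not the _rxraw code and does not model the 28 bit window; it only contrasts a decoder that assumes back-to-back 10-bit frames with one that hunts for each start bit. All names are made up for the illustration, and the line is modelled as one array element per bit time.

```c
#include <stddef.h>
#include <stdint.h>

/* Append one 8N1 frame (start, 8 data bits LSB first, stop) followed by
   idle_bits of idle-high line. Returns the new write position. */
size_t uart_encode(uint8_t *line, size_t pos, uint8_t byte, unsigned idle_bits)
{
    line[pos++] = 0;
    for (int i = 0; i < 8; i++)
        line[pos++] = (uint8_t)((byte >> i) & 1u);
    line[pos++] = 1;
    while (idle_bits--)
        line[pos++] = 1;
    return pos;
}

/* Assumes a frame starts at every multiple of 10 bit times. */
size_t decode_fixed(const uint8_t *line, size_t nbits, uint8_t *out)
{
    size_t n = 0;
    for (size_t pos = 0; pos + 10 <= nbits; pos += 10) {
        uint8_t b = 0;
        for (int i = 0; i < 8; i++)
            b |= (uint8_t)(line[pos + 1 + i] << i);
        out[n++] = b;
    }
    return n;
}

/* Skips idle-high bits, then accepts a frame only if a low start bit is
   followed nine bit times later by a high stop bit. */
size_t decode_resync(const uint8_t *line, size_t nbits, uint8_t *out)
{
    size_t n = 0, pos = 0;
    while (pos + 10 <= nbits) {
        if (line[pos] != 0 || line[pos + 9] != 1) {
            pos++;                  /* idle bit or framing error: slide by one */
            continue;
        }
        uint8_t b = 0;
        for (int i = 0; i < 8; i++)
            b |= (uint8_t)(line[pos + 1 + i] << i);
        out[n++] = b;
        pos += 10;
    }
    return n;
}
```

With no idle bits both decoders agree. With three idle bits between characters the fixed-stride decoder gets the first character right and garbage after that, much like the pasted text that only survives its first letter. The start-bit search still assumes it begins on an idle line; starting in the middle of a character can produce a false start.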
We might be able to solve this by inverting the data and using ENCOD to search for the start bits within the 28 bit window. Re-synchronizing could be a pain though. For now I think I will drop back to the traditional 8 bit per character method. I did really like the 28 bit receive because it gave us a much bigger leeway for starting the receive code, but the problems with synchronization are a bit scary.
I think he actually has been trying to use 2 stop bits. All the rest of the talk I think was just efforts to solve it.