Propeller Serial with Tight Byte Spacing

SRLM · 2013-08-07 19:27

I'll start with my problem: the new version of the RN-42 Bluetooth (firmware v6.15) seems to have a different serial driver where it spaces the bytes very close to one another. In the old version, if I sent a byte every few milliseconds I'd get a nice empty space between bytes:

attachment.php?attachmentid=103237&d=1375928690


(it's the middle channel that is of interest)

Now, with the same transmission the new RN-42 Bluetooth outputs the bytes very close together. In fact, it appears to be at full baud saturation:

attachment.php?attachmentid=103236&d=1375928690


(it's the top channel that is of interest)

Notice how there is almost no gap between bytes from the new RN-42. I'm using FDS1 as my Propeller serial driver and it can't handle the bytes being so close together. My question is:

Is there a serial driver that can handle the full saturation of the RX channel?

Preferably it's a duplex serial driver, and can run at 460800 baud. I know there are other options (transmit the bytes with a delay, change to 8N2, etc.) but that's just masking the problem (slow serial driver).

JonnyMac · 2013-08-07 20:09

At that speed you may need to run separate half-duplex drivers. Unrolled drivers can deal with very high bit rates.

Mike Green · 2013-08-07 20:29

You may also be overrunning the buffer. FDS normally uses a small (16 byte) buffer. The 4-port serial driver that was revised by Tracy Allen can be configured for a receive buffer around 256 bytes and a transmit buffer of 32 bytes and, with only one port configured, can run at over 500KB. See the comments at the beginning for details.

localroger · 2013-08-07 20:29

Echoing JonnyMac, at such a high baud a timesharing driver is likely to miss start bits badly enough to hose reception. Do you actually need half a megabaud? I've had very few applications in 30 years of industrial service that needed more than 19200, and those were building elaborate user interfaces on serial terminals.

lonesock · 2013-08-07 21:06

Would you please try the kuroneko-special version of FFDS1 located in this post: FFDS1 link? He fixed a timing bug that cropped up on close input, so it may fix this problem.

thanks,
Jonathan

SRLM · 2013-08-07 21:42

JonnyMac wrote: »

At that speed you may need to run separate half-duplex drivers. Unrolled drivers can deal with very high bit rates.

Hmmm. I was afraid of this answer. Any suggestions on a half duplex driver that I can use?

lonesock wrote: »

Would you please try the kuroneko-special version of FFDS1 located in this post: FFDS1 link? He fixed a timing bug that cropped up on close input, so it may fix this problem.

thanks,
Jonathan

Yep, I'm actually using that version right now (at least the assembly portion. I have a C++ portion sitting on top). And I'd like to say thank you! It works great for nearly everything, and I like the code. There aren't any corrupted bytes, just missing bytes.

Peter Jakacki · 2013-08-07 21:59

SRLM wrote: »

Hmmm. I was afraid of this answer. Any suggestions on a half duplex driver that I can use?

Yep, I'm actually using that version right now (at least the assembly portion. I have a C++ portion sitting on top). And I'd like to say thank you! It works great for nearly everything, and I like the code. There aren't any corrupted bytes, just missing bytes.

I have my serial driver running at over 3M baud in Tachyon. I could extract that and turn it into a Spin object and add in a transmit section to complete a high-speed half-duplex driver. Would that do?

SRLM · 2013-08-07 22:44

Peter Jakacki wrote: »

I have my serial driver running at over 3M baud in Tachyon. I could extract that and turn it into a Spin object and add in a transmit section to complete a high-speed half-duplex driver. Would that do?

Does it do RX? That's mostly what I'm interested in here. If yes, then that would be useful. Thanks!

Peter Jakacki · 2013-08-07 23:34

SRLM wrote: »

Does it do RX? That's mostly what I'm interested in here. If yes, then that would be useful. Thanks!

Yes, of course, as the tx part was optional. I will look at converting it into a Spin object sometime in the next 24 hours.

JonnyMac · 2013-08-08 07:34

I've attached a half-duplex serial object -- the unrolled code was inspired by what Peter did for his Tachyon receiver.

Dave Hein · 2013-08-08 08:40

SRLM wrote: »

Yep, I'm actually using that version right now (at least the assembly portion. I have a C++ portion sitting on top). And I'd like to say thank you! It works great for nearly everything, and I like the code. There aren't any corrupted bytes, just missing bytes.

You also have to determine if your C++ code can keep up with the incoming data rate. A Spin program has a maximum sustained receive rate of slightly over 57,600 baud when sitting in a tight loop calling the rx method and storing the data in a linear buffer. Spin code converted to LMM C++ will be quite a bit faster, but I don't know if it can keep up with back-to-back bytes at 460,800. It would have to run 8 times faster than the Spin code. I've been able to receive data at 1 mbps in Spin using a block read routine instead of the rx routine. You may need to do a similar thing with your code. It depends on how big the back-to-back bursts of data are, and whether they will fit in the serial driver's serial buffer.
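To put rough numbers on that, here is a minimal Python sketch of the arithmetic. It assumes 8N1 framing (10 bits per byte on the wire); the function name `byte_budget` is made up for illustration and is not part of any driver.

```python
def byte_budget(baud, bits_per_frame=10):
    """Return (bytes_per_second, microseconds_per_byte) for back-to-back frames."""
    bytes_per_second = baud / bits_per_frame
    return bytes_per_second, 1_000_000 / bytes_per_second

SPIN_SUSTAINED_BAUD = 57_600   # sustained Spin receive rate quoted above
TARGET_BAUD = 460_800          # rate the original poster wants

speedup = TARGET_BAUD / SPIN_SUSTAINED_BAUD
rate, us_per_byte = byte_budget(TARGET_BAUD)

print(f"required speedup over Spin: {speedup:.0f}x")
print(f"{rate:.0f} bytes/s, {us_per_byte:.2f} us available per byte")
```

Under these assumptions the consumer has a little under 22 µs per byte during a saturated burst, which is why a block read (or a buffer large enough to absorb the burst) matters more than the per-call speed of rx.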

lonesock · 2013-08-08 10:20

On the off chance that a block-read would help, here's a more-advanced-but-less-tested version of FFDS1 with a block read function ('RxBuf'). (This is my in-testing version of FFDS1 version 1.0.) When doing block reads it is helpful to make sure the RX buffer size is > 2x the largest block read you plan to get.
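As an illustration of the idea only (this is not the FFDS1 code, and the names `RingBuffer`, `put` and `read_block` are hypothetical), a ring buffer with a block read can be sketched in Python like this:

```python
class RingBuffer:
    """Toy model of a serial RX ring buffer with a block-read method."""

    def __init__(self, size):
        self.buf = bytearray(size)
        self.size = size
        self.head = 0   # next write index (driver side)
        self.tail = 0   # next read index (consumer side)

    def available(self):
        return (self.head - self.tail) % self.size

    def put(self, byte):
        """Driver side: store one byte, or drop it if the buffer is full."""
        nxt = (self.head + 1) % self.size
        if nxt == self.tail:
            return False            # full: the byte is lost, not corrupted
        self.buf[self.head] = byte
        self.head = nxt
        return True

    def read_block(self, max_len):
        """Consumer side: remove up to max_len bytes in one call."""
        n = min(max_len, self.available())
        out = bytes(self.buf[(self.tail + i) % self.size] for i in range(n))
        self.tail = (self.tail + n) % self.size
        return out


rb = RingBuffer(64)
for b in range(40):
    rb.put(b)
block = rb.read_block(16)
print(len(block), rb.available())
```

In this model an overrun shows up as missing bytes rather than corrupted ones. Sizing the buffer at more than twice the block size leaves room for a second block to arrive while the first is still being drained.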

thanks,
Jonathan

Tracy Allen · 2013-08-08 12:03

I ran into a similar problem with a network of Memsic Iris wireless nodes. They transmit packets asynchronously, and the base station receiver queues them up and delivers them at 115kbaud with exactly one stop bit, no flow control. The nodes didn't transmit very often, so on the average, removing data from the buffer was not a problem. However, allowance had to be made for occasional pileups of data from several nodes (i.e., a large buffer). The requirement was not for high speed per se, but for low power, which amounts to the same thing. The Prop was to receive at 115200 while operating with clkfreq=5MHz. That would amount to 1.8MBaud at 80MHz.
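The scaling in that last sentence can be checked with a short Python sketch; the helper names are invented for illustration.

```python
def equivalent_baud(baud, clkfreq_hz, reference_hz=80_000_000):
    """Baud rate at reference_hz that leaves the same clock ticks per bit."""
    return baud * reference_hz / clkfreq_hz

def ticks_per_bit(baud, clkfreq_hz):
    """Number of system clock ticks in one bit time."""
    return clkfreq_hz / baud

eq = equivalent_baud(115_200, 5_000_000)
tpb = ticks_per_bit(115_200, 5_000_000)

print(f"equivalent baud at 80 MHz: {eq:.0f}")
print(f"clock ticks per bit at 5 MHz: {tpb:.1f}")
```

Either way the receiver has only about 43 clock ticks per bit to work with.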

The critical step is not the bit to bit timing, but the time from the detection of the last data bit to the leading edge of the next start bit. Time for housekeeping and a couple of hub writes.

I'll attach my rx object (HSLP=HighSpeedLowPower). It is similar to Jon's, but the buffer size is unrestricted, up to available memory, and two hub writes instead of 3 in the wrapup. I'm curious how Peter gets 3Mbaud with writes into a hub buffer.

Peter Jakacki · 2013-08-08 21:09

Tracy Allen wrote: »

I ran into a similar problem with a network of Memsic Iris wireless nodes. They transmit packets asynchronously, and the base station receiver queues them up and delivers them at 115kbaud with exactly one stop bit, no flow control. The nodes didn't transmit very often, so on the average, removing data from the buffer was not a problem. However, allowance had to be made for occasional pileups of data from several nodes (i.e., a large buffer). The requirement was not for high speed per se, but for low power, which amounts to the same thing. The Prop was to receive at 115200 while operating with clkfreq=5MHz. That would amount to 1.8MBaud at 80MHz.

The critical step is not the bit to bit timing, but the time from the detection of the last data bit to the leading edge of the next start bit. Time for housekeeping and a couple of hub writes.

I'll attach my rx object (HSLP=HighSpeedLowPower). It is similar to Jon's, but the buffer size is unrestricted, up to available memory, and two hub writes instead of 3 in the wrapup. I'm curious how Peter gets 3Mbaud with writes into a hub buffer.

Hi Tracy, I just scoped my 3M operation and it looks like it has a lot of gaps there when I paste into Minicom, however when I do an ASCII file transfer it really packs it together tight with only one stop bit and I definitely lose stuff. I'm not too worried about this at 3M baud because I get the serial cog to also process the data a bit before it writes it to the hub. Testing this at 230400 baud there is no problem of course.

Repeating this at 460800 I have dropouts but then again I am doing this additional processing by filtering out whitespaces etc and recognizing command sequences and break detection etc.
Again at 460800 with 2 stop bits and leaving in the preprocessor it works fine. Removing this will mean that it should work fine at 460800, I can't see any problem with that.

pjv · 2013-08-08 22:26

Several years ago I posted an ASCII receiver as well as a transmitter that would handle back to back bytes with hub writes/reads and no intervening spaces other than the regular stop bit. My recollection is that it ran flawlessly at 5 megabits/sec. There was not much interest, so I just parked the code. I can probably find it if there is some interest now.

Cheers,

Peter (pjv)

Peter Jakacki · 2013-08-08 23:02

pjv wrote: »

Several years ago I posted an ASCII receiver as well as a transmitter that would handle back to back bytes with hub writes/reads and no intervening spaces other than the regular stop bit. My recollection is that it ran flawlessly at 5 megabits/sec. There was not much interest, so I just parked the code. I can probably find it if there is some interest now.

Cheers,

Peter (pjv)

You should post the link then. I know that if I skip reading and writing the buffer write index and simply write the data to the buffer, this speeds things up too. However, software that needs to access the buffer needs to know where it's been written to. I take it you are referring to asynchronous reception, where you need to wait for a start bit that can arrive at any time? That is the part that can cause timing problems at high speed, as opposed to synchronized reception (even with a "start" bit) at a specific bit rate such as 5MHz derived from the Prop's cycle time, especially if you are just handling a fixed block before updating hub variables etc.

Phil Pilgrim (PhiPi) · 2013-08-08 23:15

I think the trick -- if there is one -- is to begin looking for the next start bit as soon as the stop bit is only partway through. This technique will help to avoid "bit creep" if the receive timing happens to lag the transmit timing by a little.

-Phil

Peter Jakacki · 2013-08-08 23:33

Phil Pilgrim (PhiPi) wrote: »

I think the trick -- if there is one -- is to begin looking for the next start bit as soon as the stop bit is only partway through. This technique will help to avoid "bit creep" if the receive timing happens to lag the transmit timing by a little.

-Phil

Yes, but skipping the stop bit altogether can prove a disaster as async streams can get out of sync with glitches for whatever reason plus they all need at least one character gap to help resync to the real start bit. So typically the stop bit is checked half-way through to confirm it's right otherwise I take that as a framing error or wrong baud rate or part of a break if the data was also zero. Certain protocols allow you to store the data much faster in cog RAM as the packet is of specified size so all the slow hub stuff can be done afterwards.
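The decision described here can be sketched in Python as follows. This is a simplified model with a hypothetical function name, not code from Tachyon: the stop bit is sampled mid-bit, and a low stop bit is treated as a break if the data was also zero, otherwise as a framing error.

```python
def classify_frame(data, stop_bit):
    """Classify a received 8-bit frame from its data and sampled stop bit."""
    if stop_bit == 1:
        return "ok"                 # line returned to idle: accept the byte
    if data == 0:
        return "break"              # all zeros and no stop bit: line held low
    return "framing_error"          # wrong baud rate, glitch, or lost sync


for data, stop in [(0x41, 1), (0x41, 0), (0x00, 0), (0x00, 1)]:
    print(f"data={data:#04x} stop={stop} -> {classify_frame(data, stop)}")
```

Note that a zero byte with a valid stop bit is still ordinary data; only the missing stop bit distinguishes it from a break.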

pjv · 2013-08-09 07:32

Sorry Peter, I know I should have linked it, but I'm on the road right now vacationing, and it will be difficult for me to locate it. I'm not a searching expert, and working on a small tablet just now.

As far as I recall, Phil has the right idea. I store the data when the stop bit is sampled, and then immediately start searching for the falling edge of the next start bit.

The next character can start at any time, not necessarily on "even" bit times. It is fully asynchronous.

Just now I can't recall what happens at speeds just below 5 Mb/s as it has been optimized to hit the hub access sweet spot.

To avoid these link issues in the future, perhaps I should find a place in the OBEX to put it. I have never done that before.

My apologies.

Cheers,

Peter, (pjv)

Peter Jakacki · 2013-08-09 10:15

Well here's a snippet of the main receive routine that I have optimized to work with one stop bit between characters and it's been tested reliably to 2M baud and I'm still tweaking it. What I did was distribute the buffer write code through the unrolled loop so that as soon as it has assembled a byte and before it waits for a stop bit it has already calculated the hub buffer write address and internally updated the write index so that it does a wrbyte just before a waitcnt for the stop bit. Since the worst case for a hub access is 23 cycles that's 287.5ns while a waitcnt for 3M baud would require 333ns so that could just work too.

Here's the snippet:

receive     mov        rxcnt,stticks          ' Adjusted bit timing for start bit
            waitpne    rxpin,rxpin            ' wait for a low = start bit
'
' START BIT DETECTED
'                                             ' time sample for middle of start bit
            add        rxcnt,cnt              ' uses special start bit timing
            waitcnt    rxcnt,rxticks
            test       rxpin,ina wz           ' sample middle of start bit
rxcond2     if_nz      jmp    #receive        ' restart if false start
'
' START bit validated
' Read in data bits lsb first
' No point in looping as we have plenty of code to play with
' and inlining (don't call) and unrolling (don't loop) can lead to higher receive speeds
'
            waitcnt    rxcnt,rxticks
            test       rxpin,ina wc
            muxc       rxdata,#01
            waitcnt    rxcnt,rxticks
            test       rxpin,ina wc
            muxc       rxdata,#02
            waitcnt    rxcnt,rxticks
            test       rxpin,ina wc
            muxc       rxdata,#04
            waitcnt    rxcnt,rxticks
            test       rxpin,ina wc
            muxc       rxdata,#08
            waitcnt    rxcnt,rxticks
            test       rxpin,ina wc
            muxc       rxdata,#$10
            waitcnt    rxcnt,rxticks
            test       rxpin,ina wc
            muxc       rxdata,#$20
            mov        X1,rxbuf
            add        X1,rxwr                ' X points to buffer location to store
            waitcnt    rxcnt,rxticks
            test       rxpin,ina wc
            muxc       rxdata,#$40
            add        rxwr,#1
            and        rxwr,wrapmask
            waitcnt    rxcnt,rxticks
            test       rxpin,ina wc
            muxc       rxdata,#$80
            wrbyte     rxdata,X1              ' save data in buffer - could take 287.5ns worst case
            waitcnt    rxcnt,rxticks          ' check stop bit (a little earlier) (need to detect errors and breaks)
           'test       rxpin,ina wc           ' ignoring stop bit state, only timing
            wrword     rxwr,hubrxwr           ' update hub index for code reading the buffer (100ns)
            jmp        #receive
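
The hub-access margin mentioned before the snippet can be checked with a few lines of Python. This is only a sketch of the arithmetic, assuming an 80MHz clock and the 23-cycle worst case quoted in the post; the helper names are made up.

```python
CLKFREQ_HZ = 80_000_000
WORST_HUB_TICKS = 23          # worst-case hub access quoted in the post

def ticks_to_ns(ticks, clkfreq_hz=CLKFREQ_HZ):
    """Convert system clock ticks to nanoseconds."""
    return ticks * 1e9 / clkfreq_hz

def bit_time_ns(baud):
    """Duration of one bit in nanoseconds."""
    return 1e9 / baud

hub_ns = ticks_to_ns(WORST_HUB_TICKS)
for baud in (2_000_000, 3_000_000):
    margin = bit_time_ns(baud) - hub_ns
    print(f"{baud} baud: bit {bit_time_ns(baud):.1f} ns, "
          f"hub {hub_ns:.1f} ns, margin {margin:.1f} ns")
```

At 3M baud the worst-case wrbyte leaves well under one instruction time (50ns) of slack before the next waitcnt target, which is consistent with "could just work".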

Tracy Allen · 2013-08-09 11:12

I agree that skipping stop bit detection is a bad thing, for the reasons stated. Also when the rx pin is left floating in the break state, you can end up with an unwanted stream of zeros in the receive buffer. FullDuplexSerial4port tests for the stop bit. However, when it does detect a framing error, all it does is drop the character rather than putting it in the buffer. There is no mechanism for throwing out an error message or for re-syncing with a serial stream.

Detecting the stop bit means that the decision to put the char in the buffer (or not) has to wait until halfway through the stop bit. That leaves less than 1/2 bit time to take care of all the housekeeping, writing the data and head pointer back to the hub. On the other hand, by skipping stop bit detection that leaves more, something between 1 and ~<1.5 bit times to take care of business.

Two hub writes with two intervening operations to hit the sweet spot are going to take 23 to 38 clock ticks. Add 4 more instructions for shifting or muxing in the final data bit, updating the head pointer and for jumping to start bit detection and you have to allow at least 54 clock ticks. (I allowed 66 ticks in mine for a couple of instructions that could be cut if I borrow from Jon's driver and did more patching in the init code). Give it 60 clock ticks for leeway, baud mismatch etc, at 80MHz clock it adds up to 0.75 µs. If that is 1.5 bit, you have 0.5µs per bit, or a baud rate of 2 MBaud. No doubt that could be faster if it can use a cog buffer instead of a hub buffer.
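
The same budget can be written as a small Python calculation. The function name and the 0.5-bit comparison case are illustrative assumptions; the 60-tick and 1.5-bit figures come from the paragraph above.

```python
def max_baud(wrapup_ticks, window_bits, clkfreq_hz=80_000_000):
    """Highest baud rate at which wrapup_ticks fits inside window_bits bit times."""
    return clkfreq_hz * window_bits / wrapup_ticks

WRAPUP_TICKS = 60   # hub writes plus housekeeping, with some leeway

no_stop_check = max_baud(WRAPUP_TICKS, 1.5)    # stop bit not tested
with_stop_check = max_baud(WRAPUP_TICKS, 0.5)  # decision waits for mid stop bit

print(f"skipping stop-bit detection: {no_stop_check:.0f} baud")
print(f"with stop-bit detection:     {with_stop_check:.0f} baud")
```

Under this simple model, testing the stop bit before writing anything costs roughly a factor of three in top speed, which is why the faster drivers overlap the hub writes with the tail of the frame.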

richaj45 · 2013-08-09 18:12

You guys are amazing at how much performance can be squeezed out of a software UART.
But I notice that another situation creeps up where another "save" is made in software.

I wonder if there was just a couple of simple, well thought out shift registers in hardware, how much better serial protocols might be handled and the standby UART would not have to be modified every three months.

If the shifter was thought out it could be used for UART, SPI, I2C, NRZI, USB, Ethernet, Manchester, Group codes, etc. with just a small amount of software like the video shift register is used for TV, VGA, Sprites and others.

Just wondering.

cheers,
Rich

Phil Pilgrim (PhiPi) · 2013-08-09 18:43

Rich,

Your wondering is not unprecedented in this forum. The subject has come up from time to time, particularly in regards to the dev efforts surrounding the Prop II. There is a segment who believes that universal hardware shift registers are as fundamental as universal counters. Another segment believes that such hardware is unnecessary, since the software can do everything adequately. If you're interested in perusing the opinions of these various factions, go to the Propeller 2 forum and read some of the dev threads. I'm sure you will be entertained!

-Phil

Peter Jakacki · 2013-08-10 22:41

I think I've done enough on this driver now to see that 2M baud or so is the highest speed that one can run to with a general-purpose "any baud" routine. That's assuming that it has one hub write for the data and one for the write index. Trying to go faster means that the WAITCNT can overrun and be locked up until it times out if a hub op takes too long. I tried looping back looking for the start bit as a normal timed bit (no overhead) and that does indeed work except the timing will drift after enough back-to-back characters (two systems are not timing locked).
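
A rough Python model of that drift, under simplifying assumptions (10-bit frames, a fixed clock mismatch between the two systems, and a sample that fails once it has slipped half a bit); the function names are hypothetical:

```python
def chars_before_slip(mismatch_ppm, bits_per_frame=10):
    """Back-to-back characters received before the accumulated drift reaches
    half a bit, if the receiver never re-syncs on a start-bit edge."""
    half_bit_ppm = 500_000                      # 0.5 bit in millionths of a bit
    drift_per_char_ppm = mismatch_ppm * bits_per_frame
    return half_bit_ppm // drift_per_char_ppm

def worst_error_with_resync(mismatch_ppm, bits_per_frame=10):
    """Timing error (in bit times) at the mid-stop-bit sample when every
    character re-syncs on its own start-bit edge."""
    return mismatch_ppm * (bits_per_frame - 0.5) / 1_000_000

for ppm in (2_000, 10_000):                     # 0.2% and 1% mismatch
    print(f"{ppm} ppm: slips after {chars_before_slip(ppm)} chars without resync, "
          f"{worst_error_with_resync(ppm):.3f} bit worst error with resync")
```

With per-character resync the error never accumulates past one frame, which is why waiting on the start-bit edge is worth its overhead.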

There is another method I could use and that is to only write once to the hub for every character but that would either mean using two bytes as a WRWORD with one as flag or else encoding the 8th bit of the character so that a buffer read could work out if it's an unread character etc. So this means that the general-purpose routine could be used up to 3M baud at full throughput but it's either limited to 7-bit ASCII or uses double the receive buffer memory.
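
The word-with-flag variant can be modelled in Python as below. This is a sketch of the idea only, with invented names; in the real driver the write would be a single WRWORD and the reader would run in another cog.

```python
FLAG = 0x100   # bit 8 set means "unread character in this slot"

class FlaggedBuffer:
    """Toy model: one 16-bit word per character, no shared write index."""

    def __init__(self, size):
        self.words = [0] * size
        self.wr = 0     # private to the driver
        self.rd = 0     # private to the reader

    def driver_write(self, byte):
        # Single store per character: data and "valid" flag together.
        self.words[self.wr] = FLAG | (byte & 0xFF)
        self.wr = (self.wr + 1) % len(self.words)

    def read(self):
        word = self.words[self.rd]
        if not word & FLAG:
            return None                 # slot empty: nothing new yet
        self.words[self.rd] = 0         # mark slot as consumed
        self.rd = (self.rd + 1) % len(self.words)
        return word & 0xFF


fb = FlaggedBuffer(8)
fb.driver_write(0x41)
fb.driver_write(0x00)
print(fb.read(), fb.read(), fb.read())
```

The flag is what lets a received zero byte be told apart from an empty slot, at the cost of two bytes of buffer per character.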

For the moment I can handle 3M baud with only a single stop bit between characters but the 3M receive is a separate routine using instruction timing (hack).

'**************************************** SERIAL RECEIVE ******************************************

{ This is a dedicated serial receive routine that runs in its own cog }

DAT
                        long 0[2]    ' read and write    ' this hub space is used for rxwr & rxrd at runtime
rxbuffers                    ' hub ram gets reused as the receive buffer
                            org

HSSerialRx
            mov        rxwr,#0
            wrword     rxwr,hubrxwr     ' clear rxwr in hub
            wrword     rxwr,hubrxrd     ' make rxwr = rxrd
            cmp        rxticks,#26 wz   ' is it 3M baud?
    if_z    jmp        #receive3        ' use special receive routine for 3M
            mov        stticks,rxticks
            shr        stticks,#1       ' half a bit time less
            sub        stticks,#4       ' compensate timing - important at high speeds
receive     mov        rxcnt,stticks    ' Adjusted bit timing for start bit
            waitpne    rxpin,rxpin      ' wait for a low = start bit
'
' START BIT DETECTED
'                                       ' time sample for middle of start bit
            add        rxcnt,cnt        ' uses special start bit timing
            waitcnt    rxcnt,rxticks
            test       rxpin,ina wz     ' sample middle of start bit
    if_nz   jmp        #receive         ' restart if false start
'
' START bit validated
' Read in data bits lsb first
' No point in looping as we have plenty of code to play with
' and inlining (don't call) and unrolling (don't loop) can lead to higher receive speeds
'
            waitcnt    rxcnt,rxticks
            test       rxpin,ina wc
            muxc       rxdata,#01
            waitcnt    rxcnt,rxticks
            test       rxpin,ina wc
            muxc       rxdata,#02
            waitcnt    rxcnt,rxticks
            test       rxpin,ina wc
            muxc       rxdata,#04
            waitcnt    rxcnt,rxticks
            test       rxpin,ina wc
            muxc       rxdata,#08
            waitcnt    rxcnt,rxticks
            test       rxpin,ina wc
            muxc       rxdata,#$10
' data bit 5
            waitcnt    rxcnt,rxticks
            test       rxpin,ina wc
            muxc       rxdata,#$20
            mov        X1,rxbuf
            add        X1,rxwr          ' X points to buffer location to store
' data bit 6
            waitcnt    rxcnt,rxticks
            test       rxpin,ina wc
            muxc       rxdata,#$40
            add        rxwr,#1
            and        rxwr,wrapmask
' last data bit 7
            waitcnt    rxcnt,rxticks
            test       rxpin,ina wc
            muxc       rxdata,#$80
            wrbyte     rxdata,X1        ' save data in buffer - could take 287.5ns worst case
' stop bit
            waitcnt    rxcnt,rxticks    ' check stop bit (a little earlier) (need to detect errors and breaks)
            test       rxpin,ina wc     ' stop bit
    if_nc   sub        rxwr,#1          ' restore rxwr on error - no need to worry yet about wrap
    if_c    wrword     rxwr,hubrxwr     ' update hub index for code reading the buffer if all good
            jmp        #receive

' 3M baud receive
receive3
' 333.33ns/bit with 50ns/instruction or 26.66 clocks
            waitpne    rxpin,rxpin      ' wait for a low = start bit
'
' START BIT DETECTED
'                                       ' time sample for middle of start bit
            nop
            nop
            test       rxpin,ina wz     ' sample middle of start bit
    if_nz   jmp        #receive3        ' restart if false start
' 200ns
' START bit validated
'
            nop
            nop
' @300ns
' bit 0
            nop
            nop
            nop
            test       rxpin,ina wc
            muxc       rxdata,#01
' bit 1
            nop
            nop
            nop
            nop
            test       rxpin,ina wc
            muxc       rxdata,#02
' bit 2
            nop
            nop
            nop
            nop
            nop
            test       rxpin,ina wc
            muxc       rxdata,#04
' bit 3
            nop
            nop
            nop
            nop
            test       rxpin,ina wc
            muxc       rxdata,#08
' bit 4
            nop
            nop
            nop
            nop
            nop
            test       rxpin,ina wc
            muxc       rxdata,#$10
' bit 5
            nop
            nop
            mov        X1,rxbuf
            add        X1,rxwr          ' X points to buffer location to store
            test       rxpin,ina wc
            muxc       rxdata,#$20
' bit 6
            nop
            nop
            nop
            add        rxwr,#1
            and        rxwr,wrapmask
            test       rxpin,ina wc
            muxc       rxdata,#$40
' bit 7
            nop
            nop
            nop
            nop
            test       rxpin,ina wc
            muxc       rxdata,#$80
            wrbyte     rxdata,X1        ' save data in buffer - could take 287.5ns worst case
' stop bit
            test       rxpin,ina wc     ' stop bit
[COLOR=#000000][FONT=Ubuntu Mono]        if_nc    sub    rxwr,#1    ' restore rxwr on error - no need to worry yet about wrap[/FONT][/COLOR]
[COLOR=#990000][FONT=Ubuntu Mono]        if_c    wrword    rxwr,hubrxwr[/FONT][/COLOR]
[COLOR=#000000][FONT=Ubuntu Mono]             jmp    #receive3[/FONT][/COLOR]


rxpin       long    |<rxd               ' mask of rx pin
hubrxrd     long    @rxbuffers+s-4      ' ptr to rxrd in hub
hubrxwr     long    @rxbuffers+s-2      ' word address of rxwr in hub (after init)
rxbuf       long    @rxbuffers+s        ' pointer to rxbuf in hub memory
mode        long    0
rxticks     long    0
stticks     long    0
spticks     long    0
breakcnt    long    40
wrapmask    long    bufsize-1
rxcnt       long    0
rxdata      long    0                   ' assembled character
lastch      long    0
X1          long    0
savdat      long    0
rxwr        long    0                   ' cog receive buffer write index - copied to hub ram
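
As a rough cross-check of the unrolled receive3 timing, the sample-to-sample spacing can be tallied from the instruction counts in the listing. This is only a sketch: it assumes an 80 MHz system clock (the listing does not state it), counts just the bits 1..7 visible here, and relies on nop, test, muxc, mov, add and `and` each taking 4 clocks. The names `FILLERS`, `sample_intervals` and `drift` are made up for illustration.

```python
CLKFREQ = 80_000_000      # assumed system clock in Hz; not stated in the listing
BAUD = 3_000_000          # rate the receive3 hack is aimed at
CLOCKS_PER_INSTR = 4      # nop, test, muxc, mov, add and 'and' each take 4 clocks

# Filler instructions (nops or the mov/add/and bookkeeping) ahead of each
# sample, read off the listing for bits 1..7.
FILLERS = {1: 4, 2: 5, 3: 4, 4: 5, 5: 4, 6: 5, 7: 4}

def sample_intervals(fillers=FILLERS):
    """Clocks from the previous bit's test to this bit's test: muxc + fillers + test."""
    return {bit: (n + 2) * CLOCKS_PER_INSTR for bit, n in fillers.items()}

def drift(fillers=FILLERS, clkfreq=CLKFREQ, baud=BAUD):
    """Cumulative sample timing error in clocks versus ideal spacing (negative = early)."""
    ideal = clkfreq / baud
    total = 0
    out = {}
    for count, (bit, clocks) in enumerate(sample_intervals(fillers).items(), start=1):
        total += clocks
        out[bit] = total - count * ideal
    return out

if __name__ == "__main__":
    print(sample_intervals())
    print({bit: round(d, 2) for bit, d in drift().items()})
```

The intervals alternate between 24 and 28 clocks, so the drift figures are relative to wherever bit 0 was sampled, which is outside this part of the listing.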

jmg · 2013-08-11 00:17

Peter Jakacki wrote: »

I think I've done enough on this driver now to see that 2M baud or so is the highest speed that one can run to with a general-purpose "any baud" routine....

Because this is Rx only, and there is very good lock to the start edge (much better than a /16 or /8 UART), there is scope to squeeze a little more out of the rate by using that better start precision as skew at the end.
I.e. you deliberately sample a little fast, so you buy more time between the critical stages at the end.
It is also not vital to sample Stop at exactly 50%; sampling could start from 60~75%, to still avoid clock creep issues.

The FT232H has a virtual Baud clock of 96MHz, so if this fast link is talking to a PC, there will be some better-matching Baud choices. FT232H clock precision is very good, as they use crystals, and so too usually does the Prop.
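
The "sample a little fast" idea can be put into numbers with a small sketch. It is a simplified model, not anything taken from the driver: it assumes the start bit is sampled at its centre and that each of the nine following sample periods is shortened by the same number of clocks. `stop_sample_position` and `clocks_gained` are hypothetical helper names.

```python
def stop_sample_position(bit_clocks, shave_per_bit, intervals=9):
    """Fraction of the way through the stop bit at which it gets sampled.

    Assumes the start bit is sampled at its centre (0.5) and that each of the
    `intervals` sample periods after it is `shave_per_bit` clocks shorter
    than the nominal bit time of `bit_clocks` clocks.
    """
    return 0.5 - intervals * shave_per_bit / bit_clocks

def clocks_gained(shave_per_bit, intervals=9):
    """Extra clocks available after the stop sample, before the next start edge."""
    return intervals * shave_per_bit

if __name__ == "__main__":
    # Example: 26 clocks per bit (roughly 3M baud at an assumed 80 MHz clock).
    for shave in (0, 1):
        print(shave, stop_sample_position(26, shave), clocks_gained(shave))
```

The trade-off is that every clock shaved per bit moves the later samples closer to the leading edge of their bits, so the usable skew is bounded by how much edge margin the link can tolerate.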

Peter Jakacki · 2013-08-11 01:45

jmg wrote: »

Because this is Rx only, and there is very good lock to the start edge (much better than a /16 or /8 UART), there is scope to squeeze a little more out of the rate by using that better start precision as skew at the end.
I.e. you deliberately sample a little fast, so you buy more time between the critical stages at the end.
It is also not vital to sample Stop at exactly 50%; sampling could start from 60~75%, to still avoid clock creep issues.

The FT232H has a virtual Baud clock of 96MHz, so if this fast link is talking to a PC, there will be some better-matching Baud choices. FT232H clock precision is very good, as they use crystals, and so too usually does the Prop.

I looked at different methods, including sampling early, but none of these help: it's the time between the last data bit and the stop bit that's critical, which is why I have even used a different bit time for the start and stop bits (applied at the last data bit). Writing the data before the stop bit is okay, as it's not accepted as a new byte in the buffer until the write index is updated. The single-write method could wait until the stop bit is sampled early, but it still takes up to 23 additional cycles, and with 26 cycles per bit at 3M there are only 13 cycles between a stop bit verified at half bit time and the next start bit. I think it's more than good enough to handle 2M rates with one stop bit between characters and to allow 3M with 2 stop bits. The end result is that we are assured the receiver will capture all data up to 2M, whereas most forum members rarely venture beyond 115.2K, if that.
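
The 3M cycle budget above can be worked through as a short calculation. This is only a sketch of the single-write case described in the post: it assumes an 80 MHz system clock (which matches 26 cycles per bit at 3M), and the function names are invented for illustration.

```python
CLKFREQ = 80_000_000       # assumed system clock in Hz
HUB_WRITE_WORST = 23       # worst-case additional cycles for the hub write, per the post

def bit_cycles(baud, clkfreq=CLKFREQ):
    """Whole clock cycles per bit at the given baud rate."""
    return clkfreq // baud

def stop_bit_slack(baud, stop_bits=1, clkfreq=CLKFREQ):
    """Cycles from sampling the stop bit at mid-bit to the earliest next start edge."""
    bit = bit_cycles(baud, clkfreq)
    return bit * stop_bits - bit // 2

if __name__ == "__main__":
    for stops in (1, 2):
        slack = stop_bit_slack(3_000_000, stops)
        verdict = "fits" if slack >= HUB_WRITE_WORST else "does not fit"
        print(f"3M baud, {stops} stop bit(s): {slack} cycles of slack, "
              f"a {HUB_WRITE_WORST}-cycle hub write {verdict}")
```

This only models a write deferred until after the stop-bit sample; it says nothing about the general-purpose receive path, which writes the data before the stop bit.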

PJAllen · 2013-08-11 11:04

I believe the following links to pjv's effort

http://forums.parallax.com/showthread.php/120125-5Mbit-sec-ASCII-streaming-from-to-Hub-RAM

MJB · 2013-08-11 12:18

There are so many NOPs in this code that it makes me wonder whether writing the pointer to hub,
which seems to be the critical point,
could be deferred into the reception of the next char?
Unfortunately we don't have a wait with timeout yet; that is coming with the Prop2.
For some communication, like lines ending in CRLF, it might be OK to get the last char only at the next burst ...

Peter Jakacki · 2013-08-11 15:41

PJ Allen wrote: »

I believe the following links to pjv's effort

http://forums.parallax.com/showthread.php/120125-5Mbit-sec-ASCII-streaming-from-to-Hub-RAM

As I suspected, like Beau's 14.5M baud serial object, this one too is not adjustable or general-purpose and is in fact very specialized for same-frequency Prop-to-Prop comms. An async serial object needs to be programmable for any usable baud rate and able to receive characters at any time in a continuous fashion. I have used a variation of Beau's serial driver for fibre-optic communications, which is DC balanced and runs at a continuous 10Mbit data rate, but that is all I can use it for.

Peter Jakacki · 2013-08-11 15:48

MJB wrote: »

There are so many NOPs in this code that it makes me wonder whether writing the pointer to hub,
which seems to be the critical point,
could be deferred into the reception of the next char?
Unfortunately we don't have a wait with timeout yet; that is coming with the Prop2.
For some communication, like lines ending in CRLF, it might be OK to get the last char only at the next burst ...

I think the NOPs you are seeing are simply part of the 3 Mbaud-only hack (receive3) and not the general-purpose code (receive), which is limited to around 2 Mbaud max. At this speed, trying to test for special end conditions etc. would make the code unreliable; however, the code works reliably up to 2M baud without a hitch.

pjv · 2013-08-12 17:39

Hi Peter,

The code I posted three years ago was not intended to be a universal, speed-programmable model. It was a response to a need I had for the "fastest serial ASCII link" in an industrial application, and I just thought I would post it in case there was interest in the subject. There seemed to be none. Regrettably it also appears not to meet your requirements... but luckily there are many others to choose from.

Cheers,

Peter (pjv)
