Putty & Prop Plug -- what is the highest Baud rate feasible? — Parallax Forums

Hi all,

I'm messing about with a serial driver I'm writing and I'm trying to push the speed up to its maximum Baud rate. I'm using a Prop Plug (circa 2009-ish) and Putty for my terminal software.

Now, Putty seems to be very happy writing out data at 2,000,000 baud, but push it any further than that and the actual rate measured suddenly falls back to what appears to be 115200 Baud.

I think that the Prop Plug is not to blame (because all the bytes do seem to appear, just at a much lower rate than expected).

Can someone just quickly check to see how fast they can run Putty's serial Baud rate?

I'm using 8 data bits, no parity, 1 stop bit, no flow control.

If it turns out to be a bug / limitation in Putty then I'll have to find a faster terminal program, which is fine I guess.

Comments

  • The FT232 can run at 3 Mbaud, and I think Putty's limit is 921600.

  • I know I run PST on the Prop at 1_000_000 baud, but at 2_000_000 it gets funky.

  • jmg Posts: 14,756
    edited 2021-06-17 02:00

    @Cabbage said:
    Now, Putty seems to be very happy writing out data at 2,000,000 baud, but push it any further than that and the actual rate measured suddenly falls back to what appears to be 115200 Baud.

    Usually modern Terminal programs will pass to the driver any baud rate you type in, (eg I have a UART+Terminal here that accepts and delivers 7868852 as valid) and the vendor drivers will then decode what they consider legal.
    Ideally a driver rounds correctly to the nearest granular baud value, but some jump to a default value, if given something they do not understand.

    Way back, FTDI allowed a kludge of baud aliasing, so maybe you have bumped into that ?
    http://ftdichip.com/Documents/AppNotes/AN232B-05_BaudRates.pdf

  • @jmg said:
    Way back, FTDI allowed a kludge of baud aliasing, so maybe you have bumped into that ?
    http://ftdichip.com/Documents/AppNotes/AN232B-05_BaudRates.pdf

    Now that is interesting. I'll have a look at this tomorrow. Thanks for the info :)

  • Not sure about the Prop Plug, but the limits of Real-Term (or the UART on my PC) are at 2 Mbaud. I have a PIC16F15324 running at 4 Mbaud talking to another PIC without any issue; I imagine the Prop can do at least the same speed.

  • Yeah, the Prop is perfectly happy at speeds way above 2 MHz, it's just that an ATMega1284P's hardware USART maxes out at 2.5 MHz (from a 20 MHz xtal), and the next lowest Baud (on the AVR) that Putty+FTDI driver can cope with is 1 MHz (from the internal RC osc @ 8 MHz).

    Hmm, that's got me thinking. I wonder what the theoretical maximum Baud for the P8X32A is (for both stable TX and RX). I'm sure it's at least 4MHz, without having to go above 80 MHz system clock.

  • Tried some naive PASM code just now. P8X32A is trivially able to transmit RS-232 style at 6.5 MHz (8 data bits, no parity, 1 stop bit) from a single Cog.

    Receive will be a bit slower I suspect.

  • jmg Posts: 14,756

    @Cabbage said:
    Tried some naive PASM code just now. P8X32A is trivially able to transmit RS-232 style at 6.5 MHz (8 data bits, no parity, 1 stop bit) from a single Cog.

    Receive will be a bit slower I suspect.

    8.N.2 might be more practical, and a triggered RX might do good burst speeds with some rules ?
    The XR21B1420 I'm testing supports baud rates of 480M/N, so it can get to within about 0.2% of 6.5 MBd.

  • @jac_goudsmit made an 8Mbps transmitter a few years back:
    https://github.com/parallaxinc/propeller/tree/master/libraries/community/p1/All/Txx - Fast Serial Transmitter for PASM and Spin

    I haven't tried it yet myself, but it certainly caught my eye when I saw '8 megabits per second' :D

  • jac_goudsmit Posts: 417
    edited 2021-06-18 22:53

    @avsa242 said:
    @jac_goudsmit made an 8Mbps transmitter a few years back:
    https://github.com/parallaxinc/propeller/tree/master/libraries/community/p1/All/Txx - Fast Serial Transmitter for PASM and Spin

    I haven't tried it yet myself, but it certainly caught my eye when I saw '8 megabits per second' :D

    Ha, thanks for that link to the new location, I couldn't even find it myself when I searched for it a while ago. I also recently made my own Github repo for this project at https://github.com/jacgoudsmit/TXX. I think the code is probably pretty much the same between the two locations at the time of this writing.

    Yes, TXX is capable of using up to 8mbps as baud rate. However, the FTDI chip on the Prop Plug (and various other Parallax products like the FLiP) can only handle baud rates up to 3 mbps.

    Furthermore, the maximum throughput of the FTDI is much lower than 3 mbps. I recently did some tests by using the core character transmitting code from my TXX module to transmit a continuous stream of "1234567890" at 3 mbps to see how much delay I had to insert between the characters in order to not overrun the receive buffer of the FTDI chip, and I got to a disappointingly low number of something like 77 kilobytes per second maximum throughput.

    If you're not limited by 3rd party chips in your project, the throughput can be much faster. I think Beau Schwabe made a high-speed serial transmitter/receiver between Propellers a long time ago, and he could get the speed up to 16 megabits per second if I'm not mistaken. I think he basically had two Propellers that were clocked by the same crystal and that simplifies things a little bit because you don't have to worry about things such as synchronization and clock drift.

    ===Jac

  • Cabbage Posts: 32
    edited 2021-06-19 14:48

    CabbageComms (tm)

    I need you guys to sanity check an idea I had at 3 o'clock this morning.

    How about this as an idea for making a symmetrical 20 MHz RS-232 transceiver in one cog...

    Since each Cog executes most normal instructions in 4 clock cycles, we can configure CounterA to sample the serial RX_PIN at the same rate: 20 MHz.

    Theory of operation (for the RX part)...

    1. Set PHSA to 0
    2. Set FRQA to 0x01 (least significant bit first)
    3. use WAITPEQ to detect the START BIT (this takes 6+ clock cycles, not sure if that's good or bad)
    4. enable CTRA to sample the RX input pin, probably mode %01000 POS DETECTOR
    5. SHL FRQA, #1 //0x02
    6. SHL FRQA, #1 //0x04
    7. SHL FRQA, #1 //0x08
    8. SHL FRQA, #1 //0x10
    9. SHL FRQA, #1 //0x20
    10. SHL FRQA, #1 //0x40
    11. SHL FRQA, #1 //0x80
    12. disable CTRA
    13. PHSA now contains the byte read in over serial!

    The problem is I have no way to test this - I have no means to generate 20 MHz RS-232 traffic :(

    Anyway, here's the TX method (very similar but uses counter mode %00100 (NCO single ended)...

    1. Set FRQA to 0
    2. Set PHSA to the byte you want to send (right aligned, which is super convenient actually)
    3. OR PHSA with 0xffffff00 (to make sure we return to idle high after transmitting, this may not actually be needed but I don't know)
    4. ROR PHSA, #1 //pre-load bit 31 with the LSB of the byte we're sending
    5. Enable CTRA
    6. ROR PHSA, #1 //second bit goes out
    7. ROR PHSA, #1 //third
    8. ROR PHSA, #1
    9. ROR PHSA, #1
    10. ROR PHSA, #1
    11. ROR PHSA, #1
    12. ROR PHSA, #1 //the last bit (MSB of the original byte)
    13. disable CTRA

    Would this work? My addled brain seems to think it might, but I'm not certain and I can't prove it (that's why I haven't written any code properly).

    What do you guys think?

    (Runs out to buy an oscilloscope, sig-gen and high speed logic analyser :))

    EDIT: I forgot the start and stop bits on the TX side, but I think that can be added without affecting the Baud rate by appending them to each side of the byte being sent before enabling the counter.
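
    A quick way to sanity-check the TX framing off-chip is to model PHSA as a 32-bit value whose bit 31 drives the pin. The Python sketch below is only a model under that assumption (it ignores real counter and pin timing). `ror32` and `tx_levels` are hypothetical helper names, and the stop bit is assumed to sit in bit 8 of the value loaded into PHSA, with bit 31 left clear to act as the start bit.

```python
MASK32 = 0xFFFFFFFF

def ror32(value, count=1):
    """Rotate a 32-bit value right, like PASM `ror`."""
    count %= 32
    return ((value >> count) | (value << (32 - count))) & MASK32

def tx_levels(byte):
    """Return the 10 pin levels (start, 8 data bits LSB first, stop) for one frame."""
    phsa = (byte & 0xFF) | 0x100      # stop bit in bit 8; bit 31 clear = start bit
    levels = [phsa >> 31]             # level on the pin during the start bit
    for _ in range(9):                # 8 data bits plus the stop bit
        phsa = ror32(phsa)
        levels.append(phsa >> 31)
    return levels

print(tx_levels(0x41))  # → [0, 1, 0, 0, 0, 0, 0, 1, 0, 1]
```

    Each rotate brings the next least significant bit into bit 31, so nine rotates after the initial load cover the eight data bits and the stop bit.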

  • Cabbage Posts: 32
    edited 2021-06-19 15:30

    I'm being dense... of course I can test it. I can just use PLL1X instead of PLL16X; if it works, it'll simply run at 1/16th the speed and should scale back up to 20MHz perfectly.

    I'll write some code..............

    EDIT: This is actually looking promising :)

    EDIT #2: I've got TX working at 1.25 MHz using PLL1X, so running the same code with PLL16X should give 20 MHz.

  • jmg Posts: 14,756

    @Cabbage said:
    The problem is I have no way to test this - I have no means to generate 20 MHz RS-232 traffic :(

    You can test RX with simple square waves, at least to some level.

    10MHz   -> S01010101
    5MHz    -> e01100110 (ignore the bad stop bit for now)
    3.33MHz -> S00011100 
    2.5MHz  -> e01111000 (ignore the bad stop bit for now)
    S is valid stop bit, e is bad stop bit 
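
    The patterns above can be reproduced with a small model. It assumes the receiver takes one sample per bit at 20 MBd, that the frame starts on a falling edge of the square wave, and that each pattern is written stop bit first, followed by data bits 7..0. That reading of the notation is an assumption, and `square_wave_frame` is a hypothetical helper name.

```python
def square_wave_frame(half_period_bits, data_bits=8):
    """Decode one frame from a square wave that goes low at the start bit.

    half_period_bits is the number of bit times the wave stays at each level,
    e.g. 1 for a 10 MHz wave sampled at 20 MBd.
    Returns (stop_bit_ok, byte_value).
    """
    # Level seen in each bit slot: start bit, data bits (LSB first), stop bit.
    levels = [(slot // half_period_bits) % 2 for slot in range(data_bits + 2)]
    data = levels[1:1 + data_bits]
    byte = sum(bit << i for i, bit in enumerate(data))
    return levels[-1] == 1, byte

for half_period in (1, 2, 3, 4):
    ok, byte = square_wave_frame(half_period)
    mhz = 20 / (2 * half_period)
    print(f"{mhz:.2f} MHz -> {'S' if ok else 'e'}{byte:08b}")
```

    Under those assumptions the four lines printed match the four test patterns listed, including which ones end in a bad stop bit.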
    

    EDIT #2: I've got TX working at 1.25 MHz using PLL1X, so running the same code with PLL16X should give 20 MHz.

    RX also needs to consider the next-byte arrival time, most UARTS only give you up to 2 Stop bits, but you can force parity to sneak a 3rd stop bit.
    The vast majority of PC UARTS go up to 12MBd, even HS-USB ones. ( A niche Exar part claims 15MBd)

    20MHz would be useful for P1 to P1 custom links, in which case more than 8 bits may be useful.

  • Well, I've hit a wall with this. The TX side works well (because it's entirely deterministic within the timing domain of the sender).
    However the RX side is causing me great trouble. It does detect the start bit reliably, but the subsequent 8 samples are basically jumbled garbage.

    I think part of the problem is that WAITPEQ takes 6 or more cycles before it resumes normal execution. It is not clear how many clock cycles have elapsed since the falling edge was detected. What a pain.

    I could make this work if I simply abandon the RS-232 signalling standard and use perhaps 2 start bits, but then it would lose all the benefits of being compliant with normal terminal software. sigh

  • jmg Posts: 14,756
    edited 2021-06-20 20:49

    @Cabbage said:
    Well, I've hit a wall with this. The TX side works well (because it's entirely deterministic within the timing domain of the sender).

    Fast TX could still have uses for SPI connections, eg like HC595 strings ? - This might work for 1,2,3,4 HC595 in series ?
    The tricky part may be stopping the SPI clock at the right time - maybe correctly timed STCP signal is ok ?
    On a HC595 string, there is a bit of tolerance on start clock phasing, as only the last N*8 bits are latched.

    ie if above becomes
    ROR PHSA, #1 // clocks in middle of this bit
    ROR PHSA, #1
    ROR PHSA, #1 //the last bit (MSB of the original byte)
    SET STCP // latch in the previous N*8
    I'm not sure if the PHSA delay differs from the Pin-delay, but there are 2 sysclks until the next CLK edge, and you could sneak a 3rd sysclk tolerance by making the SPI 595 CLK edges 75% positioned.

    However the RX side is causing me great trouble. It does detect the start bit reliably, but the subsequent 8 samples are basically jumbled garbage.
    I think part of the problem is that WAITPEQ takes 6 or more cycles before it resumes normal execution. It is not clear how many clock cycles have elapsed since the falling edge was detected. What a pain.

    Other parts of the problem are the CTRs are actually adders, and the mode is an ADD-enable, not what is actually wanted, which is a single centre-bit-sample.
    That means every sysclk FRQA is added, so 4 sysclks per bit will add 4 times, which becomes a shift-left-2.
    Notice too that any small skew in the ADD-Enable will bleed from adjacent bits, to give 75% from wanted bit and 25% from unwanted adjacent bit.
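
    The adder behaviour described here can be illustrated with a simplified model: on every system clock the counter adds FRQx to PHSx while the pin is high, and software shifts FRQx left once per 4-clock bit time. This is only a sketch under those assumptions (it ignores the real counter's input pipeline delay), and `pos_detector_phs` is a hypothetical name.

```python
def pos_detector_phs(data_bits, clocks_per_bit=4, skew=0):
    """Model PHSx after receiving data_bits (LSB first) in POS detector mode.

    FRQx starts at 1 and doubles once per bit time, as in the proposed SHL
    sequence. skew (>= 0) is how many clocks late the software runs relative
    to the bit edges.
    """
    stream = [bit for bit in data_bits for _ in range(clocks_per_bit)]
    phs = 0
    for clk in range(len(stream)):
        frq = 1 << (clk // clocks_per_bit)              # FRQx during this clock
        idx = clk + skew                                # pin sample actually seen
        pin = stream[idx] if idx < len(stream) else 1   # line idles high afterwards
        if pin:
            phs += frq
    return phs

byte = 0x55
bits = [(byte >> i) & 1 for i in range(8)]
print(hex(pos_detector_phs(bits)))          # aligned: the byte shifted left by 2
print(hex(pos_detector_phs(bits, skew=1)))  # one clock late: adjacent bits bleed in
```

    With perfect alignment the result is the byte multiplied by 4 (the shift-left-2 effect). With a single clock of skew, each bit position gets three adds from the wanted bit and one from its neighbour, so the value no longer maps cleanly back to the byte.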

    I think for async it is not practical, but if P1 generates the CLK, perhaps it might work on a SPI receive use case. Those are not as common as P1/P2 -> HC595

    Addit: I think a general Async Serial RX can be made with a variant of jac_goudsmit code, that uses 2 lines for pin test and shift.
    That would support SysCLK/N baud speeds for any N >= 14 (6+4+4), working on a bit-time basis, but the inter-byte time may dictate the top practical rate, as you need to unload the byte and prepare for the next start bit inside the stop bit time. Ideally the stop bit should also be verified as 1, but that could be skipped.
    Maybe 5333333.N.8.2 or 5333333.M.8.2 are candidate practical values ? (PL2303GC and XR21B1420 can support this value)
    4.8MBd is supported by PL2303GC, XR21B1420, EFM8UB3 & FT232H, & a fractional Baud 80MHz P1 can do ((80M/16.7)) = 4790419 for ~ 0.199% baud error
    The XR21B1420 can exactly follow all SysCLK/N (N>=14) possible baud values, with 0% average baud error. The bit-skews look stable within any byte, so a P1 could be coded to match that pattern.

  • jmg Posts: 14,756
    edited 2021-06-20 22:20

    @jac_goudsmit said:
    Ha, thanks for that link to the new location, I couldn't even find it myself when I searched for it a while ago.

    I like the details of the code

                            waitcnt chr_time, chr_bittime   ' Never waits
                            mov     PHSA, chr_char          ' Bit 31 is 0: start bit
                            waitcnt chr_time, chr_bittime   ' Wait for next bit
                            ror     PHSA, #1                ' Output original bit 0
                            waitcnt chr_time, chr_bittime   ' Wait for next bit
                            ror     PHSA, #1                ' Output original bit 1
                            waitcnt chr_time, chr_bittime   ' Wait for next bit
                            ror     PHSA, #1                ' Output original bit 2
                            waitcnt chr_time, chr_bittime   ' Wait for next bit
                            ror     PHSA, #1                ' Output original bit 3
                            waitcnt chr_time, chr_bittime   ' Wait for next bit
                            ror     PHSA, #1                ' Output original bit 4
                            waitcnt chr_time, chr_bittime   ' Wait for next bit
                            ror     PHSA, #1                ' Output original bit 5
                            waitcnt chr_time, chr_bittime   ' Wait for next bit
                            ror     PHSA, #1                ' Output original bit 6
                            waitcnt chr_time, chr_bittime   ' Wait for next bit
                            ror     PHSA, #1                ' Output original bit 7
                            waitcnt chr_time, chr_bittime   ' Wait for next bit
                            ror     PHSA, #1                ' Output original bit 8: stop bit
                            waitcnt chr_time, chr_bittime   ' Ensure stop bit at least 1 bit time
    
    

    because those 10 bit waits do not need to be identical, you can selectively add 1 more sysclk to any of the 10, to give fractional baud capability. (tho compile time locked)
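
    One possible way to pick the ten per-bit waits is to round each ideal bit edge to the nearest system clock and take the differences. The sketch below assumes that approach; `bit_waits` is a hypothetical helper, not part of TXX, and it ignores the minimum number of clocks a real waitcnt/ror pair needs.

```python
def bit_waits(sysclk_hz, baud, bits_per_frame=10):
    """Integer sysclk counts for each bit so edges track the ideal positions."""
    waits = []
    previous_edge = 0
    for k in range(1, bits_per_frame + 1):
        # Ideal edge k sits at k * sysclk / baud clocks; round to nearest integer.
        edge = (2 * k * sysclk_hz + baud) // (2 * baud)
        waits.append(edge - previous_edge)
        previous_edge = edge
    return waits

waits = bit_waits(80_000_000, 4_800_000)
frame_clocks = sum(waits)
effective_baud = 80_000_000 * len(waits) / frame_clocks
print(waits, frame_clocks, round(effective_baud))
```

    For an 80 MHz clock and a 4.8 MBd target this gives a mix of 16- and 17-clock bits totalling 167 clocks per frame, which works out to roughly 4,790,419 baud, in line with the 80M/16.7 figure quoted earlier in the thread.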

    Yes, TXX is capable of using up to 8mbps as baud rate. However, the FTDI chip on the Prop Plug (and various other Parallax products like the FLiP) can only handle baud rates up to 3 mbps.

    Furthermore, the maximum throughput of the FTDI is much lower than 3 mbps. I recently did some tests by using the core character transmitting code from my TXX module to transmit a continuous stream of "1234567890" at 3 mbps to see how much delay I had to insert between the characters in order to not overrun the receive buffer of the FTDI chip, and I got to a disappointingly low number of something like 77 kilobytes per second maximum throughput.

    Which FTDI variant was that ?

    UARTS are improving all the time, the CP2102N can sustain 3.428571MBd.N.8.1 one way, and the newest ones like PL2303GC or XR21B1420 can offer even better FS-USB performance
    Many have fractional baud clocks, which is nice for P1/P2 as it means you have more sysclk freedom choice

    The XR21B1420 can sustain over 10Mbd averaged, (one way) with handshake line connected.

    Virtual Baud CLOCKS - for values below 12M (4M on CP2102N) possible baud is VirtualBaudCLK/N

    Part        VirtualBaudCLK
    CP2102N     24MHz
    PL2303GC    96MHz
    XR21B1420   480MHz
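
    To see what VirtualBaudCLK/N means for a given target, the sketch below picks N by rounding the clock-to-baud ratio and reports the resulting error. `nearest_divided_baud` is a hypothetical helper, and it does not model each part's maximum supported baud rate or any fractional-divider tricks.

```python
def nearest_divided_baud(virtual_clk_hz, target_baud):
    """Return (N, actual_baud, error_percent) for baud = virtual_clk / N."""
    n = max(1, round(virtual_clk_hz / target_baud))
    actual = virtual_clk_hz / n
    error_percent = abs(actual - target_baud) / target_baud * 100
    return n, actual, error_percent

# XR21B1420 (480 MHz virtual clock) aiming for 6.5 MBd
print(nearest_divided_baud(480_000_000, 6_500_000))
# CP2102N (24 MHz virtual clock) aiming for 3.428571 MBd
print(nearest_divided_baud(24_000_000, 3_428_571))
```

    The first case lands on N = 74, about 0.2% below 6.5 MBd; the second is 24 MHz / 7, which is where the 3.428571 MBd figure comes from.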

    Here is an example of fractional baud clock support in XR21B1420 - if you look carefully, not all bits are exactly the same duration, but they do appear consistent within the 10b frame.

  • @jmg said:
    Fast TX could still have uses for SPI connections, eg like HC595 strings ? - This might work for 1,2,3,4 HC595 in series ?

    Another way to effectively drive a 595 is to share all of the clocks and all of the mode lines and have a separate Data line for each 595 and just load all 595's at once rather than daisy chaining them all together. This goes for any of the shift in/ shift out registers ... i.e. HC165, etc.

  • @jmg said:
    because those 10 bit waits do not need to be identical, you can selectively add 1 more sysclk to any of the 10, to give fractional baud capability. (tho compile time locked)

    Well... not compile time locked; you could wait for 10 different values that you calculate at runtime somehow when you initialize the cog...

    Yes, TXX is capable of using up to 8mbps as baud rate. However, the FTDI chip on the Prop Plug (and various other Parallax products like the FLiP) can only handle baud rates up to 3 mbps.

    Which FTDI variant was that ?

    I measured with the FLiP which apparently has an FT231XS. I think all the FTDI chips have the same maximum baud rate of 3mbps but I suppose it's possible that different variations perform better when it comes to actual bandwidth.

    UARTS are improving all the time, the CP2102N can sustain 3MBd one way, and the newest ones like PL2303GC or XR21B1420 can do a little more.
    ...
    The XR21B1420 can sustain over 10Mbd averaged, (one way) with handshake line connected.

    Sustained throughput? In other words, uninterrupted traffic of 300,000 or 1,000,000 characters per second, using isochronous transfers? This would be interesting for a project I'm working on (S/PDIF decoder for Propeller 1). I'll have to look into that...

    ===Jac

  • jmg Posts: 14,756
    edited 2021-06-20 23:09

    @jac_goudsmit said:

    @jmg said:
    because those 10 bit waits do not need to be identical, you can selectively add 1 more sysclk to any of the 10, to give fractional baud capability. (tho compile time locked)

    Well... not compile time locked; you could wait for 10 different values that you calculate at runtime somehow when you initialize the cog...

    :) yes, I guess you could self-modify the code if you were motivated enough.
    The main use I see is to define for a given crystal value, so you get a useful baud from a standard xtal. Xtals are less likely to change at runtime :)

    Yes, TXX is capable of using up to 8mbps as baud rate. However, the FTDI chip on the Prop Plug (and various other Parallax products like the FLiP) can only handle baud rates up to 3 mbps.

    Which FTDI variant was that ?

    I measured with the FLiP which apparently has an FT231XS. I think all the FTDI chips have the same maximum baud rate of 3mbps but I suppose it's possible that different variations perform better when it comes to actual bandwidth.

    Here are my test notes - the FT232H sustains no-gaps tx here. This is a simplex test.
    FT2232H transmit c8 -> CP2102N OK at 1MBd, 2MBd, 2.181818MBd, 2.4MBd, 2.666666MBd, 3MBd, 3.428571MBd OK, Drops Chars at 4MBd 8,N,1, & 4M,8,N,2, but is looking OK at 4M 8,M,2
    FT2232H transmit c8 -> FT232R OK at 1MBd, 2MBd, 3MBd, No Support for 4MBd
    FT2232H transmit c8 -> FT231X OK at 1MBd, 2MBd, 3MBd, No Support for 4MBd

    Your "disappointingly low number of something like 77 kilobytes per second maximum throughput" must have been some other issue? Was that duplex?

    UARTS are improving all the time, the CP2102N can sustain 3MBd one way, and the newest ones like PL2303GC or XR21B1420 can do a little more.
    ...
    The XR21B1420 can sustain over 10Mbd averaged, (one way) with handshake line connected.

    Sustained throughput? In other words, uninterrupted traffic of 300,000 or 1,000,000 characters per second, using isochronous transfers? This would be interesting for a project I'm working on (S/PDIF decoder for Propeller 1). I'll have to look into that...

    Certainly one way traffic of 300k is easy, & close to 1M is possible over FS-USB. That's part of what impresses me on these parts.

    pasted from my test notes

    FT2232H transmit c8 -> EXAR XR21B1420 OK at 1MBd, 2MBd, 4MBd 100,000 chars continual, 8,N,1, 4.8MBd 8,M,2 OK, 6MBd 8,M,2 OK, 6MBd 8,N,1 OK, 8MBd 8,N,1 OK
    and a bit more is possible with HW handshake
    FT232H -> XR21B1420_12MBd_RX_3000000_RTS.png shows handshakes.
    12MBd HWHS: 9764140 sustained speed, handshake needed.
    8MBd : 8.00014 and no handshakes seen - cannot test other speeds as FT232H does not do between 8-12MBd,

    XR21B1420 (Driver 2.6.0.0 Dec 2019) -> FT232H 12MBd sustained average 10.447200 MBd

    XR21B1420 Loopback tests - 8.N.1 - one indication of duplex capability (no handshake)
    5333333 bd -> 2.666696*2 = 5.333392M duplex
    6000000 bd -> 3.000000*2 = 6.000000M duplex
    6500000 bd -> 3.039*2 = 6.078M
    8000000 bd -> 3.039*2 = 6.078M
    12000000 bd -> 3.039*2 = 6.078M

    Sanity check of transported data bits over FS-USB: 2*8*(3.040*2/10) = 9.728 Mbit/s, which fits inside the 12 Mbit/s link less overhead.

    That indicates a tad over 600k bytes/second both ways in loopback duplex is around the ceiling, with more in simplex: ~1044 kB/s to PC or ~976 kB/s from PC.
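
    The arithmetic behind these figures can be laid out as a short sketch. It assumes 10 bits on the wire per byte (8.N.1) and that the loopback ceiling of about 6.08 MBd applies in each direction; `bytes_per_second` is a hypothetical helper name.

```python
def bytes_per_second(baud, frame_bits=10):
    """Payload bytes per second for back-to-back frames (8.N.1 = 10 bits)."""
    return baud / frame_bits

loopback_baud = 2 * 3_040_000                # ~6.08 MBd ceiling seen in loopback
each_way = bytes_per_second(loopback_baud)   # bytes/s in each direction
usb_payload_bits = 2 * 8 * each_way          # both directions, 8 data bits per byte

print(each_way, usb_payload_bits)
print(bytes_per_second(10_447_200), bytes_per_second(9_764_140))
```

    The first line reproduces the roughly 608 kB/s each way and 9.728 Mbit/s of payload on the USB link; the second gives the simplex figures of about 1044 kB/s and 976 kB/s.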

  • jmg Posts: 14,756

    @Cabbage said:
    How about this as an idea for making a symmetrical 20 MHz RS-232 transceiver in one cog...

    Since each Cog executes most normal instructions in 4 clock cycles, we can configure CounterA to sample the serial RX_PIN at the same rate: 20 MHz.

    Theory of operation (for the RX part)...

    1. Set PHSA to 0
    2. Set FRQA to 0x01 (least significant bit first)
    3. use WAITPEQ to detect the START BIT (this takes 6+ clock cycles, not sure if that's good or bad)
    4. enable CTRA to sample the RX input pin, probably mode %01000 POS DETECTOR
    5. SHL FRQA, #1 //0x02

    You were quite close, just missed a gated sample detail, see this code for 20MHz RX over SPI - for Async you still need the CLK pin for the RX-sample-pulse, and use MODE LOGIC A & B
    https://forums.parallax.com/discussion/comment/1466234/#Comment_1466234

    For a practical PC-Async system, a 96MHz sysclk, and /8 on the sample pulses would give a 12Mbd link.

  • @jmg,
    Of course, you're right about the adding frqx into phsx once per clock. I had mistakenly thought I could use the pll_div field to slow that down to once every 4th clock to match the instruction speed.

    Oh well, it was fun getting the tx working though. Thanks for the help everyone :)

  • Cabbage Posts: 32
    edited 2021-06-21 18:36

    Well I'm not beaten yet. More ideas...

    I have come to peace with the fact that the P1 cannot react fast enough to a waitpeq to capture the first data bit transferred when the Baud rate is equal to (CLK / 4). Can't be done.

    Yet another (literally) half-baked idea for an RX mechanism...

    (this is not working code, but rather a conceptual description of my new crackpot idea)

    CON
      _clkmode = xtal1 | PLL1X
      _xinfreq = 5_000_000
    
    PUB Main
      coginit(0, @C0_ENTRY, 0)
    
    DAT
                  org       0
    C0_ENTRY
    
                  or        dira, TX_PIN
                  or        dira, DUMMY_HIGH
                  or        outa, DUMMY_HIGH
    
    :c0_loop
                  mov       PHSA, #0 'reference counter (free running, unconditional)
                  mov       PHSB, #0
                  mov       FRQA, #1
                  mov       FRQB, #1
    
                  mov       CTRA, CTRA_MODE 'set the reference counter running
                  mov       CTRB, CTRB_MODE 'set the receiver counter running
    
                  waitpeq   ALL_ZEROES, RX_PIN 'casually wait for start bit, this might be non-critical since the counters are always running regardless
    
                  nop
                  nop
                  nop
                  nop 'these nops could be replaced by a fancy
                  nop 'set of bit manipulations on PHSB
                  nop
                  nop
                  nop
    
                  'stop counters
                  mov       CTRA, #0
                  mov       CTRB, #0
    
    
    
                  'DO SOMETHING CLEVER HERE
                  'is it possible to calculate the serial byte received based on the
                  'differences between PHSA and PHSB?
    
    
    
                  mov       ser_byte, the_byte_we_received
                  call      #Ser_TX 'echo the byte back to the sender
    
                  jmp       #:c0_loop 'repeat
    
    
    
    
    
    'expects ser_byte to contain the byte to send (gets clobbered)
    Ser_TX
                  'TX initialisation ------------------------------------------
                  mov       FRQA, #0      'needs to stay at 0 always
                  neg       PHSA, #1      '0xffffffff 'this assures that the idle high state is default
                  mov       CTRA, CTRA_TX 'this counter is always running, but sending out IDLE high 1..1..1...
                  or        dira, TX_PIN  'output, glitch-less :)
                  '------------------------------------------------------------
    
                  or        ser_byte, #$01_00 'put the STOP bit after the last data bit
                  mov       PHSA, ser_byte    'bit 31 must be clear, it acts as the START bit
    
                  ror       PHSA, #1  'rotate right |x.....>>>>| to shift out bits (LSB first) into bit 31
                  ror       PHSA, #1
                  ror       PHSA, #1
                  ror       PHSA, #1
    
                  ror       PHSA, #1
                  ror       PHSA, #1
                  ror       PHSA, #1
                  ror       PHSA, #1
    
                  neg       PHSA, #1 '0xffffffff  'returns us to idle high state
    
    Ser_TX_ret    ret
    
    '########################################
    
    CTRA_TX       long      %0_00100_000_00000000_000000_000_000000
    CTRA_MODE     long      %0_11111_000_00000000_000000_000_000000   'mode "LOGIC always"
    CTRB_MODE     long      %0_11000_000_00000000_000010_000_000001   'mode "RX & P2"
    
    ALL_ZEROES    long      $00000000
    ALL_ONES      long      $ffffffff
    
    TX_PIN        long      |< 0
    RX_PIN        long      |< 1
    DUMMY_HIGH    long      |< 2
    
    ser_byte      res       1
    
                  fit
    
    

    Consider that PHSA and PHSB will run alongside in parallel until a byte arrives on the serial RX pin.
    PHSB will be affected by the passing of these modulated bits, but PHSA will not.
    PHSA could act like a "control" sample in a scientific test.
    PHSB would be the actual "test subject" in that test.

    This gives us a difference that perhaps we can analyse to figure out the actual bit pattern that came in without us having to actively clock the bits in manually.

    Can this be made to work do you think?

    All opinions invited.

  • jmg Posts: 14,756
    edited 2021-06-23 00:02

    @Cabbage said:
    This gives us a difference that perhaps we can analyse to figure out the actual bit pattern that came in without us having to actively clock the bits in manually.
    Can this be made to work do you think?
    All opinions invited.

    I think the problem here is that the huge time taken to extract a single byte of info, would swamp any gains in the faster RX
    The code I linked to above can (self) RX at 20MHz, so why not do a variant of that ? (tho it does need a spare pin)

    In a practical ASYNC system, 20MHz is not of great use (no PC-UARTS exist), and the interbyte delays are critical.

    I think 12Mbd, (possible with 96MHz sysclk and a /8 rate) or 10MBd (80MHz sysclk and /8) are possible fixed P1 upper targets for async, with maybe 8MBd(9.6MBd at 96Mhz) tops, on a WAIT-based Async block.

    Even here, there are going to need to be design/system agreements between the P1 side and host side, to make use of this burst ability.
    Some packet buffer size needs to be agreed and allocated, as random length timeout is hard to manage. HW handshake will also be needed to pause the HW until the SW is poised at the right place.
    Stop bits also need to be agreed, as a burst RX needs time to decrement and write between bytes.

    The good news is there are USB-UARTS that can manage these high speeds, so it is worthwhile checking into this.
    Here is what 10MBd (fractional baud) looks like, on the one brand of UART that manages it

    Addit: here is 12M.8.M.2 which is the longest STOP duration standard uarts can provide. This delivers exactly 1MBytes/s over the link, and I think a 96MHz P1 can manage this, COG-local at least.
    An important gain of high burst speeds, is the link overhead is much reduced. eg users code can report 4 bytes and return in 4us.
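
    The 1 MByte/s and 4 us figures follow from the frame length. A minimal sketch, assuming 8.M.2 means 1 start bit, 8 data bits, 1 mark parity bit and 2 stop bits (12 bits per character), with `frame_bits` and `burst_time_us` as hypothetical helper names:

```python
def frame_bits(data_bits=8, parity=True, stop_bits=2):
    """Bits on the wire per character: start + data + optional parity + stop."""
    return 1 + data_bits + (1 if parity else 0) + stop_bits

def burst_time_us(n_bytes, baud, **frame):
    """Time in microseconds to send n_bytes back to back."""
    return n_bytes * frame_bits(**frame) / baud * 1e6

print(frame_bits())                   # bits per character for 8.M.2
print(12_000_000 / frame_bits())      # characters per second at 12 MBd
print(burst_time_us(4, 12_000_000))   # time for a 4-byte report, in microseconds
```

    The same helpers show the cost of shorter frames: at 8.N.1 the frame is 10 bits, which raises the byte rate but leaves only one stop bit of slack between characters.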

  • Cabbage Posts: 32
    edited 2021-06-23 21:01

    I finally relented and turned to external hardware for the RX.

    74HC595 to the rescue. A combination of fast clocking TX into the 595, followed up with a short pulse for the end-of-byte latch (shared with the receiving Prop so it knows when to read the 595).
    Works beautifully as long as you remember to wire up the 8-bit parallel lines the right way around on the remote Prop's input port. :)

    1.5 megabytes per second peak performance (80 MHz clk) in one direction, quite nice. Also the two Props don't need a common clock source as long as they are running at nominally the same frequency.
    Perhaps the Props could differ by maybe 20% in clock speed and that difference would be absorbed quite well by the 595 because it is a clocked intermediary. A large difference in the two clocks would only introduce latency in two-way comms, but probably not actual read/write errors. Speculation is fun. :)

  • Cluso99 Posts: 17,839

    If you are willing to sacrifice 8 pins for parallel transfers then using two 74LVC573's back to back would work nicely, or even a 74xx543 or 74xx646 (not sure what series are available) which are back to back 8bit latches in a 24 pin package.

  • jmg Posts: 14,756

    @Cabbage said:
    I finally relented and turned to external hardware for the RX.
    1.5 megabytes per second peak performance (80 MHz clk) in one direction, quite nice.

    If you really want to add external chips, you can add something like SN74ACT2228, and with the 20MHz clocked speeds mentioned above, that's 32 bytes of 2.5MBytes/second transfer in both directions.
    Or, you could use a compact FPGA like ICE40UL1K-SWG16ITR, to make a modern 2021 version of a hardware bridge :)

  • Amusingly, I have absolutely no application in mind for high speed serial / parallel transfer. I was just pushing my brain to find out where the extremes of the Propeller's counters were.

    It is fun to play with this stuff and to try to optimise it. I'm pretty pleased with the result even though it's not symmetrical or even close to it.
