Big update for DE2-115 and DE0-Nano users w/add-on boards - Page 8 — Parallax Forums

Big update for DE2-115 and DE0-Nano users w/add-on boards


Comments

  • jmgjmg Posts: 15,175
    edited 2013-10-03 14:09
    evanh wrote: »
    It just occurred to me that SPI normally uses "chip select" for framing, but as the FTDI High Speed Serial mode has demonstrated, the Prop doesn't have to work that way as a slave. That suggests the length control from the UART could be used for a slave too, eliminating the need for another pin assignment. It might restrict what can talk to it as a slave though.

    Chip select is somewhat optional on SPI, but the better SPI designs seem to use it, as it allows clock-spike error trapping.

    It splits SPI into two phases: shift 8*N bits, which can include a cascade of many slaves, and then load on CS _/= , which checks the clock count on all slaves and, if it is not a multiple of 8, can skip the update.

    I guess a good slave-clock sync block could also include a CLK spike filter, but that usually demands higher divide ratios.

    I think a target of 50+MHz is practical, given other devices' data sheets. 80+MHz looks unlikely to have enough margin.
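The CS clock-count check described above can be sketched as a small Python model (the class and its behaviour are illustrative assumptions, not actual P2 or SPI-slave hardware): the slave counts clock edges per CS frame and only commits the shifted byte when the count is an exact multiple of 8, so a clock spike makes the frame fail the MOD-8 test and the update is skipped.

```python
class SpiSlaveModel:
    """Behavioural sketch of an SPI slave with MOD-8 clock-count trapping."""

    def __init__(self):
        self.shifter = 0      # 8-bit shift register
        self.latch = 0        # last committed (valid) byte
        self.clock_count = 0  # SCLK edges seen in the current CS frame

    def clock_edge(self, mosi_bit):
        """Shift one bit in on each (possibly spurious) SCLK edge."""
        self.shifter = ((self.shifter << 1) | (mosi_bit & 1)) & 0xFF
        self.clock_count += 1

    def cs_rising(self):
        """CS _/= : commit only if the edge count is an exact multiple of 8."""
        ok = self.clock_count > 0 and self.clock_count % 8 == 0
        if ok:
            self.latch = self.shifter
        self.clock_count = 0
        return ok

slave = SpiSlaveModel()
for bit in [1, 0, 1, 0, 1, 0, 1, 0]:    # clean 8-clock frame: 0xAA
    slave.clock_edge(bit)
print(slave.cs_rising(), hex(slave.latch))   # True 0xaa

slave.clock_edge(1)                     # a clock spike: 9 edges this frame
for bit in [0] * 8:
    slave.clock_edge(bit)
print(slave.cs_rising(), hex(slave.latch))   # False 0xaa (update skipped)
```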
  • Phil Pilgrim (PhiPi)Phil Pilgrim (PhiPi) Posts: 23,514
    edited 2013-10-03 14:16
    evanh wrote:
    Chip has said the current implementation is simple. ...
    I totally agree. It's about as simple as a UART can get. But people have been saying that, once the serial peripheral box has been opened a crack, why not fill it with other stuff, like a USART or universal SERDES? That's where things get less simple and where a discussion about where to draw the line between hardware and software seems appropriate. A USART can be complicated. My thought was to let the software handle the complicated stuff, assisted by the minimum amount of hardware necessary to make that possible.

    -Phil
  • Heater.Heater. Posts: 21,230
    edited 2013-10-03 14:24
    Phil,
    My thought was to let the software handle the complicated stuff, assisted by the minimum amount of hardware necessary to make that possible.
    Exactly.

    With the emphasis on "minimal". Enough to provide the speed that software cannot achieve. Simple enough to not make the instruction set and registers impossible to understand. Whilst still flexible.

    Is this even possible? Do we know how to do it? Cleanly?

    Or should the PII design be frozen where it is so that it can actually be produced soon?

    Not my call. I'm just getting nervous about never ending feature creep.
  • KC_RobKC_Rob Posts: 465
    edited 2013-10-03 14:27
    That's where things get less simple and where a discussion about where to draw the line between hardware and software seems appropriate.
    Amen.
  • evanhevanh Posts: 16,031
    edited 2013-10-03 14:44
    JMG pointed out, in post #202, that extending the frame-length (I called it word length just before) capabilities is a good improvement: make it fully definable up to the full 32-bit length of the shift registers, instead of the current 8- or 32-bit-only option, with send and receive having separate parameters.

    There is a presumption of at least one full latch of buffering on the receive side so that it can handle "gapless" streaming. The same on the send side would be nice too.
  • evanhevanh Posts: 16,031
    edited 2013-10-03 14:54
    My thought was to let the software handle the complicated stuff, assisted by the minimum amount of hardware necessary to make that possible.

    I understand, but this didn't come in on a breeze. The comms link was seen as an important role to fill; anything slower can be dealt with purely in software, with no hardware assist. And we do have eight of these things now, so why not open them up a little more? Chip will say if a particular request is too demanding.
  • jmgjmg Posts: 15,175
    edited 2013-10-03 15:07
    evanh wrote: »
    There is a presumption of at least one full latch of buffering on the receive side so that it can handle "gapless" streaming. The same on the send side would be nice too.

    Correct, and I think this implies some simple flags.
    Commonly:
    Rx_Buffer_Valid : Set by HW, and cleared on SW read of RxBuffer.
    Tx_Buffer_Ready : Cleared by SW Write to TxBuffer, and set by HW when HW moves buffer into shifter.

    A 32 bit read can access all status info, on Rx & Tx - flags and progress counters and state bits.
    Config setup may exceed 32 bits, but simple stacking schemes can put that in one register location.
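The flag handshake jmg outlines can be modelled in a few lines of Python (the flag names are from his post; the exact behaviour here is my assumption, not P2 hardware): Rx_Buffer_Valid is set by "hardware" when a word lands and cleared by a software read; Tx_Buffer_Ready is cleared by a software write and set again once the word moves into the shifter.

```python
class SerialBuffers:
    """Behavioural model of the Rx/Tx buffer flags described in the post."""

    def __init__(self):
        self.rx_buffer = 0
        self.tx_buffer = 0
        self.rx_valid = False   # Rx_Buffer_Valid: set by HW, cleared by SW read
        self.tx_ready = True    # Tx_Buffer_Ready: cleared by SW write, set by HW

    # --- "hardware" side ---
    def hw_receive_complete(self, word):
        self.rx_buffer = word
        self.rx_valid = True

    def hw_load_shifter(self):
        """Move a pending word into the shifter, freeing the buffer."""
        if not self.tx_ready:
            self.tx_ready = True
            return self.tx_buffer
        return None             # nothing pending

    # --- "software" side ---
    def sw_read_rx(self):
        self.rx_valid = False   # cleared on read, per the post
        return self.rx_buffer

    def sw_write_tx(self, word):
        assert self.tx_ready, "overrun: previous word not yet in shifter"
        self.tx_buffer = word
        self.tx_ready = False

    def status(self):
        """One status read covering both directions (bit layout is mine)."""
        return (int(self.rx_valid) << 0) | (int(self.tx_ready) << 1)
```

A real 32-bit status read would also carry the progress counters and state bits mentioned above; this sketch only shows the two flags.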
  • Cluso99Cluso99 Posts: 18,069
    edited 2013-10-03 17:17
    Serial Circuitry
    I was going to start a new thread but see lots of discussion here overnight (here in Oz).

    I, like Phil, want to keep it simple and flexible. The VGA in P1 was used by a few to do some special serial output. But what was lacking was being able to input to it. In P2, this mode is no longer possible due to the DACs.

    The UART implementation that Chip has done went too far IMHO because it added start/stop bits, which blocked its general use. So let's examine this first...
    * excellent clock generation, but lacks an option for external clocking (SPI slave, etc) and an option to output the clock (SPI master, etc)
    * very fast, fine-grained clock (baud) generation, great for inter-Prop comms etc
    * great for async (start/stop insertion/removal), but lacks an option to turn that off, which sync protocols (SPI, USB, etc) require
    * 8,32,36 bit modes (36 with addressing) - needs options for 1..32 bits as well (maybe not 1?) for a true UART and also for sync (SPI, USB, etc)
    * two tx/rx pairs A & B

    Now, let's step back a bit...
    In P1 we can (just) do 4x UARTs in software in 1 cog (4 x 115200 IIRC). With P2, 4 x 1Mbps should be possible in software, which kind of defeats the purpose of hardware assist for this part. Most cogs are, after all, designed for soft peripherals.

    So, what is the hardware assist required for...
    * High speed
    * Synchronous (SPI, USB, etc) - async as possible extension
    * Specialised protocols
    * There is likely only one/two of these required for the project
    * SPIQ (quad SPI) - thanks Phil for the reminder with VGA possibilities

    My take on this has been based on...
    * Generalised simple assistance
    * Maximum flexibility
    * Fast
    * Maximum clock granularity

    Therefore, I ultimately came up with (see post #69 and #162) these generalised 32 bit shifters...
    * easy clock generation (baud) with 1 clock granularity (as per Chips baud/clock generator)
    -* option to clock from external pin
    -* option to output clock
    * basic 32 bit shift register
    -* capable of being read and written in parallel (32 bits)
    -* can be used as transmitter or receiver (two shifters required to do transmitter and receiver)
    -* input selectable from external pin or "0" or "1" (0/1 for stop bits insertion on underrun when used as transmitter)
    -* output selectable to external pin (perhaps with option to force "0" or "1" ???)
    -* because we can have both an input and output pin on the same shift register, we can use this as a daisy-chained serial shifter like a '595 etc
    * LSB first (we can simply reverse bits in software)
    * start & stop bits can be added simply in software

    A few features that might be nice...
    * read/write counter to count bit clocks (at least 6-7 bits to indicate underrun/overrun and 0.5 bit time)
    * inversion for clock, input, output - can be done at the pin ???
    * option to commence clocking on data change (option to specify going "0" or "1") - can be used as start detection
    * apparently we can do differential pairs at the pins
    * need at least 2 of these per cog, so option to daisy-chain the A & B shifters
    * can the clocks be provided by the counters? They are not likely to be used otherwise in a cog using the shifters
    * dare I say CRC assist ???
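The "start & stop bits can be added simply in software" point above can be illustrated with a small Python model of what the software would do around a bare LSB-first shifter (the helper names are mine, purely for illustration):

```python
def frame_8n1(byte):
    """Build a 10-bit 8N1 frame for an LSB-first shifter:
    bit 0 = start bit (0, shifted out first), bits 1..8 = data, bit 9 = stop bit (1)."""
    return (1 << 9) | ((byte & 0xFF) << 1)

def deframe_8n1(frame):
    """Strip framing from a received 10-bit value; return (byte, framing_ok)."""
    start_ok = (frame & 1) == 0
    stop_ok = ((frame >> 9) & 1) == 1
    return (frame >> 1) & 0xFF, start_ok and stop_ok

# 0x55 framed: start 0, then data bits LSB-first, then stop 1
print(bin(frame_8n1(0x55)))       # 0b1010101010
print(deframe_8n1(0b1010101010))  # (85, True)
```

Parity or a second stop bit is just one more OR/shift before loading the shifter, which is the flexibility argument made above.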

    As has been said elsewhere, UARTs to 1Mbps and I2C can easily be done in software without any hardware assist.

    While I remember it: the ability to daisy-chain the counters was mentioned previously ???
  • SeairthSeairth Posts: 2,474
    edited 2013-10-03 17:34
    K2 wrote: »
    So you are more a Propeller philosopher than a Propeller user? Interesting, but not very fulfilling.

    As a Propeller philosopher, are you defending or attacking the Propeller concept? I can't tell. The various posts you've made seem to be all over the place on this point.

    Please kindly discuss the ideas, not the person.
  • jmgjmg Posts: 15,175
    edited 2013-10-03 17:54
    Cluso99 wrote: »
    I, like Phil, want to keep it simple and flexible.

    I think everyone does.
    The simplest model is to let HW manage the bits and the SW manage the Bytes/words.
    Cluso99 wrote: »
    With P2, 4 x 1Mbps should be possible in software. Kind of defeats the purpose for hardware assist for this part.

    Not at all. 1Mbps is very slow access to a 160MIP core. You need to lift the bar :)

    Serial comms should be targeting 50MHz plus, in both Sync and Async (with optional clock), and should also target energy saving, so that the CORE can idle while the simple hardware collects bits, if it needs to.

    eg FT23xxH parts can give me a simple COM port interface, to a 50MHz Duplex link to any PC (clocked Async, where P2 gives the CLK).

    For a moderate price, I can buy 15kV ESD Protected, 100Mbps, 5V, PROFIBUS, Full Fail-safe, RS-485/RS-422 Transceivers
    (and also 80 and 40 MBd Transceivers )


    It would be a shame if the P2 could not match that in a small HW serializer. I'm sure it can.


    Cluso99 wrote: »
    Therefore, I ultimately came up with (see post #69 and #162) these generalised 32 bit shifters...
    * easy clock generation (baud) with 1 clock granularity (as per Chips baud/clock generator)

    Was this the one using a Timer as Baud ?

    To me, not a great use of silicon, as the new timers are much smarter, and I really do not want to waste one, just for simple baud generate.

    Instead, I want both smarter timers, and good bit-shifting in HW.

    Localised logic like small 8..12 bit counters takes very little room, and can save inter-block routing.
  • Cluso99Cluso99 Posts: 18,069
    edited 2013-10-03 18:45
    jmg wrote: »
    I think everyone does.
    The simplest model is to let HW manage the bits and the SW manage the Bytes/words.
    Agreed.
    Not at all. 1Mbps is very slow access to a 160MIP core. You need to lift the bar :)
    Disagree - this is for a cog doing 4 or 8 UARTs or a mix of UARTs and I2C, PS2 or similar
    I am not talking about a specialised cog to do 50+MHz here.
    Serial comms should be targeting 50MHz plus, in both Sync and Async (with optional clock), and should also target energy saving, so that the CORE can idle while the simple hardware collects bits, if it needs to.

    eg FT23xxH parts can give me a simple COM port interface, to a 50MHz Duplex link to any PC (clocked Async, where P2 gives the CLK).

    For a moderate price, I can buy 15kV ESD Protected, 100Mbps, 5V, PROFIBUS, Full Fail-safe, RS-485/RS-422 Transceivers
    (and also 80 and 40 MBd Transceivers )

    It would be a shame if the P2 could not match that in a small HW serializer. I'm sure it can.
    Without totally dedicated hw there isn't going to be any way P2 can match 100Mbps specialised protocol. P2 is after all, a microcontroller.

    But with some simple hw (and I mean here: flexible, flexible, flexible), lots of sw chores can be assisted.

    Chip never foresaw what we would be able to do with the P1, particularly the counters and VGA. If we get some nice flexible basics in the P2 just imagine what we can do. Phil's DSP stuff is amazing, we have amazing audio by Ahle2, etc.

    I have been working on USB LS & FS on the P1. Many are wanting to run Quad SPI Flash as cache. Native SD (not SPI) would be really neat. Not so sure about Ethernet. We all want some hw assist for this, but to be so specific it only achieves one of them at the expense of the others. And I at least want it to be general enough to be able to do some other really cool things with it.

    I was disappointed that we could not use the counters to gate inputs in the P1 (like a deserialiser), or clock a signal(s) into the VGA and read the VGA registers. Of course, none of this was thought of when Chip designed P1. He dedicated some hw at the time to do some amazing things that could not be done back then.
    Was this the one using a Timer as Baud ?
    To me, not a great use of silicon, as the new timers are much smarter, and I really do not want to waste one, just for simple baud generate.
    Instead, I want both smarter timers, and good bit-shifting in HW.
    Localised logic like small 8..12 bit counters takes very little room, and can save inter-block routing.
    We have the counters there already, so why not take advantage of them. IMHO in what I want the hw assist for, the counters would not be used in that cog, so why not make use of them?


    As a final note, all the high-speed stuff is going sync again. I used to design and program synchronous comms (BSC, SDLC, etc) back in the late 70's and early 80's. Then everything went async. Now the tide has shifted, but microcontrollers haven't caught up yet, hence the need for chips like the FT232 (USB to async). The FT232 is expensive, at half the price of the P1.

    This is just my 2c.
  • Phil Pilgrim (PhiPi)Phil Pilgrim (PhiPi) Posts: 23,514
    edited 2013-10-03 18:56
    jmg wrote:
    The simplest model is to let HW manage the bits and the SW manage the Bytes/words.
    Yes, that seems like a good division of labor.
    ________

    Now, regarding all this talk about differential signaling: there's a whole heckuva lot more to it than just opposing logic states! RS-422/RS-485 transceivers, for example, tolerate common-mode voltages that extend below ground and above Vdd (typically -7V to +12V). They also tailor the rise and fall slopes to avoid ringing on the transmission line. Timing skew is also carefully managed -- especially important at high speeds. I would never drive a differential transmission line directly from a microcontroller or use a micro's voltage comparator for reception. That's what the specialized transceiver chips are for.

    -Phil
  • jmgjmg Posts: 15,175
    edited 2013-10-03 18:57
    Cluso99 wrote: »
    Without totally dedicated hw there isn't going to be any way P2 can match 100Mbps specialised protocol.
    P2 is after all, a microcontroller.

    It will get close, and the final limit I suspect will be pin drivers and Tsu/Th (as seems to be the case in other vendors' devices).

    It is common for a uC to manage SysClk/2 on Data+Clk TX/RX; ignoring pin drivers, that gives 80MHz.

    That was why I said targeting 50MHz plus.

    The fact you can buy 100Mbps Transceivers, was to underline why you should not set the bar too low.
  • jmgjmg Posts: 15,175
    edited 2013-10-03 19:00
    Yes, that seems like a good division of labor.
    ________

    Now, regarding all this talk about differential signaling: there's a whole heckuva lot more to it than just opposing logic states! RS-422/RS-485 transceivers, for example, have common-mode ranges that extend below ground and well above Vdd. They also tailor the rise and fall slopes to avoid ringing on the transmission line. I would never drive a differential transmission line directly from a microcontroller. That's what the specialized transceiver chips are for.

    Yup - and do not forget ESD as well... P2 pins will NOT be 15kV ESD rated ;)

    Still, there are plenty of fast, moderately priced transceiver chips out there. I found one at 100Mbps, and many at 80Mbps.
  • Cluso99Cluso99 Posts: 18,069
    edited 2013-10-03 19:47
    Yes, that seems like a good division of labor.
    ________

    Now, regarding all this talk about differential signaling: there's a whole heckuva lot more to it than just opposing logic states! RS-422/RS-485 transceivers, for example, tolerate common-mode voltages that extend below ground and above Vdd (typically -7V to +12V). They also tailor the rise and fall slopes to avoid ringing on the transmission line. Timing skew is also carefully managed -- especially important at high speeds. I would never drive a differential transmission line directly from a microcontroller or use a micro's voltage comparator for reception. That's what the specialized transceiver chips are for.

    -Phil
    USB requires opposing states, with exceptions for the SE0/SE1 acks, at 3V3 on short lines with 68R series resistors; LS is 1.5Mbps, FS is 12Mbps, and HS IIRC is 480Mbps.
    Chip reminded me of the differential circuitry in the I/O pins to achieve the inversion on an adjacent pin. So LS & FS should be achievable in the P2, even if not fully compliant.
    A little help in serialising and deserialising and it should be relatively easy.

    Otherwise, totally agree with specialised transceiver chips. But then again, we are never going to get 100+Mbps or more out of P2 - it's only a microcontroller ;)
  • jmgjmg Posts: 15,175
    edited 2013-10-03 20:11
    Cluso99 wrote: »
    I have been working on USB LS & FS on the P1.

    Are you able to port that, to run on your P2 FPGA system ?
  • Cluso99Cluso99 Posts: 18,069
    edited 2013-10-03 20:53
    jmg wrote: »
    Are you able to port that, to run on your P2 FPGA system ?
    WIP
  • average joeaverage joe Posts: 795
    edited 2013-10-03 23:16
    I've stayed quiet about this for a while, but I can no longer hold my tongue. I really like Chip's ASYNC implementation as it exists right now. While I do understand the drive for SER/DESER in hardware, in my opinion this is not the time to handle it. I would honestly like to see this in a P3, but not now. I feel there is just too much on the line, too late in the game, and the real issue is that the community can't even agree on what a minimal implementation would require. There are too many permutations to be accounted for without some serious justification. It may suck not being able to max out my 104MHz QSPI flash chips, but I still feel that it's best left to software. Having ASYNC hardware is good enough IMO, especially considering the bulk of debugging is done over this protocol. Synchronous serial, on the other hand, is just too diverse. Please guys, Chip has enough on his plate without adding more to worry about.
    I've been following developments closely for the past few months, and great headway has been made. If I could justify the expenditure on a DE2 to my better half, I'd have one. Sadly that is not the case... Hopefully we will have real chips soon; until then, maybe we will get a 2-cog variant?
  • evanhevanh Posts: 16,031
    edited 2013-10-03 23:46
    Chip has encouraged us twice in this thread to outline the features we would like added onto the existing UART.
  • evanhevanh Posts: 16,031
    edited 2013-10-04 00:13
    Cluso99 wrote: »
    Without totally dedicated hw there isn't going to be any way P2 can match 100Mbps specialised protocol.
    Profibus is just beefed-up RS-485: a generic, but fast, UART with the specialised protocol layered on top. I'm not sure it actually goes up to 100 Mbps though.
    We have the counters there already, so why not take advantage of them. IMHO in what I want the hw assist for, the counters would not be used in that cog, so why not make use of them?
    Well, there is the distinct possibility of multiple drivers being packaged onto one Cog. I'm easy on this one. Chip's call, imho.
  • average joeaverage joe Posts: 795
    edited 2013-10-04 00:15
    evanh wrote: »
    Chip has encouraged us twice in this thread to outline the features we would like added onto the existing UART.

    Fair enough. I'm just not sure how to satisfy SPI and I2C. USB would be a wonderful feature to have; sadly, I have no experience with the USB protocol. I think the bare minimum hardware is the way to go. What this means exactly, I'm not sure.

    I wonder how much can be based around the P1's video hardware? Could this be a good starting point, considering how it's been used and abused?
  • jmgjmg Posts: 15,175
    edited 2013-10-04 01:22
    Fair enough. I'm just not sure how to satisfy SPI and I2C.

    SPI is simple enough - basically, disable the Start and Stop state engine, and allow master/slave choices.

    I2C needs almost no special hardware; it can be done in SW, but if you wanted to add Start and Stop sensing in HW, that is just 2 D FFs.

    On a Prop, some option to wait-on-edge could be generally useful?
    For I2C the Start condition is (SCL=H)(SDA= =\_) and Stop is (SCL=H)(SDA= _/= )

    The highest I2C speed is 5Mbd, and that is one-way only; 3.4Mbd is the top two-way speed.
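The 2-D-FF Start/Stop sense amounts to watching SDA transitions while SCL is high. A behavioural Python sketch (my model, not hardware; it assumes the line is sampled fast enough that SCL and SDA never change in the same step):

```python
def i2c_events(samples):
    """Detect I2C START/STOP conditions from a list of (scl, sda) samples.
    START: SDA falls while SCL is high.  STOP: SDA rises while SCL is high."""
    events = []
    prev_scl, prev_sda = samples[0]
    for scl, sda in samples[1:]:
        if scl == 1 and prev_scl == 1:      # SDA transition while clock high
            if prev_sda == 1 and sda == 0:
                events.append("START")
            elif prev_sda == 0 and sda == 1:
                events.append("STOP")
        prev_scl, prev_sda = scl, sda
    return events

# idle, START, one data bit clocked while SCL low, then STOP
trace = [(1, 1), (1, 0), (0, 0), (0, 1), (0, 0), (1, 0), (1, 1)]
print(i2c_events(trace))   # ['START', 'STOP']
```

Note that the data bit changing at (0,1)→(0,0) is ignored because SCL is low, which is exactly why only two flip-flops of state are needed.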
  • SapiehaSapieha Posts: 2,964
    edited 2013-10-04 05:32
    Hi Chip.

    Thanks for info.

    BUT can that be run as a free-running counter, without needing some code to execute it? As in my proposal - otherwise it needs some code for that.


    cgracey wrote: »
    The Prop2 counters do that. They can measure frequency or period.
  • SeairthSeairth Posts: 2,474
    edited 2013-10-04 08:03
    Here's a possible generalization (it definitely needs refinement, and may duplicate what others already said):
    • Two 32-bit parallel/serial shift registers (SRA and SRB)
    • Two configuration registers, configurable through CFGSRx
    • Associated pins are always paired (similar to the other pin modes)
    • RDSRx and WRSRx perform a parallel read or write from/to SRx
    • CLKSRx does the actual shifting of the registers
    • CLKSRx stalls (unless C flag set) in single-task mode, loops in multi-task mode
    • The existing CFGPINS is used for inverting the signal, setting schmitt input, etc.

    CFGSRx D (%MM_AAAAAAA_E_CCCCCC_CCCCCCCC_CCCCCCCC)

    configure the shift register
    MM = mode (%00 = disabled, %01 = shift out only, %10 = shift in only, %11 = shift out and in)
    AAAAAAA = 0..126, output pin (input is A+1)
    E = enable clocked pin (when enabled, clock input/output is A+2)
    CCCCCC_CCCCCCCC_CCCCCCCC = clocks per bit (0..2^22-1)

    WRSRx D/#n

    write a value to the shift register

    RDSRx D

    read the shift register to a register

    CLKSR D/#n (%AAAAA_BBBBB)

    clock the shift register
    AAAAA = number of bits to shift SRA
    BBBBB = number of bits to shift SRB
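For concreteness, the CFGSRx encoding above packs like this (the helper and the example pin numbers are illustrative only; this is a model of the proposed encoding, not a real P2 instruction):

```python
def cfgsr(mode, pin, clocked, clocks_per_bit):
    """Pack the proposed %MM_AAAAAAA_E_CCCCCC_CCCCCCCC_CCCCCCCC config word."""
    assert 0 <= mode <= 3                 # MM: 00 off, 01 out, 10 in, 11 out+in
    assert 0 <= pin <= 126                # AAAAAAA: output pin (input is pin+1)
    assert 0 <= clocks_per_bit < (1 << 22)
    return (mode << 30) | (pin << 23) | (int(clocked) << 22) | clocks_per_bit

# SPI-master-style setup: shift out and in on pins 8/9, clock out on pin 10,
# one bit every 4 system clocks
word = cfgsr(0b11, 8, True, 4)
print(f"{word:032b}")   # 11000100010000000000000000000100
```

Setting `clocks_per_bit` to zero would then select the slave (externally clocked) case described further down.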


    For serial (e.g. 8N1), you'd transmit by loading ten bits (start bit, 8-bit value, stop bit) into SRA, then clock it out. For receive, you use WAITPEQs to detect the start condition, then clock in 8 bits to SRB. If your baud rate was low enough, you could also toggle SRA between transmit-only and receive-only modes (which would be fine for protocols like MODBUS RTU, which is strictly request/response). If you wanted to add a parity bit, that would be handled (both generating and verifying) in software. In order to asynchronously send and receive, use two tasks and both shift registers.

    For SPI, you enable master by enabling the clocking pin and specifying a non-zero clocks-per-bit value. For slave, you set the clocks-per-bit to zero (indicating that the clock pin is an input instead). Note that the clocking pin might need additional configuration (steal one more bit from the clocks-per-bit field?) to control when the rising/falling edge occurs relative to the shifted data bit. If this is the case, it might be that this approach would only work at CLK/2 max, which I'd still be fine with.

    Half-duplex protocols (I2C) can use a single shift register that simply switches between transmit-only and receive-only mode. The in and out pins would be tied together. Multi-master is simply a matter of setting/clearing the clocks-per-bit value. All of the protocol-level stuff would be done in software.

    Differential could be handled with two shift registers. Alternatively, if the shift-out and shift-in pins were separately configurable, then you could use CFGPINS to enable differential output of the shift-out pin.

    Note that it might be possible to rework the above to support more than two shift registers. If you could get at least 4 registers, you'd be able to do QSPI.

    Anyhow, this is about as generic an approach as I can imagine. Feel free to tear it apart, tweak it, or even ignore it completely. :)
  • jmgjmg Posts: 15,175
    edited 2013-10-04 12:00
    Seairth wrote: »
    For serial (e.g. 8N1), you'd transmit by loading ten bits (start bit, 8-bit value, stop bit) into SRA, then clock it out. For receive, you use WAITPEQs to detect the start condition, then clock in 8 bits to SRB.

    That's less than Chip has already working, and IMO has too many operational restrictions.
    The Start/Stop handler is only a 2 bit state engine, which is already in the FPGA code.

    * Classic async is full duplex, i.e. it needs both Rx and Tx, with any phase on Rx start, and it can run gapless both ways.
    * Async hardware gives a full bit-frame (char time) to respond.
    The above plan squeezes you to under half a bit time.
    * FTDI High Speed USB parts have a 50MHz clocked Async mode, duplex (at the pins).
    What Chip has already, pretty much just needs a clock-out added to talk to FTDI parts.

    An easy 50MHz pathway to PC is a quite compelling selling point.
    Especially a low power, low overhead one which lets you do stuff and still keep up.
  • SeairthSeairth Posts: 2,474
    edited 2013-10-04 12:33
    jmg wrote: »
    That's less than Chip has already working, and IMO has too many operational restrictions.
    The Start/Stop handler is only a 2 bit state engine, which is already in the FPGA code.

    * Classic async is full duplex, i.e. it needs both Rx and Tx, with any phase on Rx start, and it can run gapless both ways.
    * Async hardware gives a full bit-frame (char time) to respond.
    The above plan squeezes you to under half a bit time.
    * FTDI High Speed USB parts have a 50MHz clocked Async mode, duplex (at the pins).
    What Chip has already, pretty much just needs a clock-out added to talk to FTDI parts.

    An easy 50MHz pathway to PC is a quite compelling selling point.
    Especially a low power, low overhead one which lets you do stuff and still keep up.

    I thought this was more flexible, myself. For instance, UARTs could have variable character lengths and parity bits, which the current UARTs don't have. As I pointed out, full duplex async could be done with two tasks and two shift registers. One transmits, one receives. Neither is dependent on the other.

    I'm not sure I understand the full/half bit time comment.

    So, is this approach so far off the mark that it couldn't be made workable? Otherwise, what would need to be changed?
  • Cluso99Cluso99 Posts: 18,069
    edited 2013-10-04 13:30
    Seairth: What you have said is pretty much as I have said. You missed the pin definitions so there needs to be a second cfg command. Chip can/will work that out.

    For async, IMHO this is an easy way to tx
    ' char is the 8-bit char to be transmitted (in a cog long register with all upper bits = 0)
       shl    char,#1         'insert start bit ("0" in the lsb - shifted out first)
       or     char,stops      'set upper 23 bits to "1" = stop bits (srx also cfg'd to shift in "1"s)
       wrsrx  char            'load shift out
       cfgsrx cfgval          'enable shift out (no need if already running - it will wait for the next valid bit, likely after the current stop bit)
       mov    txtime,cnt      'save the timer
       add    txtime,bittime9 '9 bit times (might have to be 9.2+)
    ' go do something (just check for cnt > txtime to know "now sending stop bit")
    
    The above is easy to change for any other bit length (1-32), including parity (user-inserted) and 2 stop bits (user increases bittimeX by 1).
    Just 2 instructions (SHL and OR) do the tx start and stop bits.

    Receive is slightly more complicated but quite similar.

    But I still don't get why we cannot use the counters for the clk generation. In an ultra fast serial, the cog won't have time to do anything else so the counters will be unused.

    In a slower situation (multiple UARTs + maybe I2C, or SPI with multiple CSs), a lot of this can be done in sw and most likely will not even bother with the shift registers - there are only 2 sets in Chip's current implementation anyway, and we will want to create a driver capable of doing 4 or 8 UARTs, so that will still be done in sw.
    So its the high and ultra-high speed single I/O driver for the cog that we are really trying to assist, and these are mostly sync these days.

    jmg: What Chip has done actually goes further, and this is not general enough - you keep missing this point in your quest to use the FTDI part.

    General:
    * What would be more assistance here would be a CRC instruction.
    * Another help might be a shift register bit time counter (1 for tx, 1 for rx - ie 2 for generalised shift registers). This timer would count half-bit times and would count at least 64 bit times to allow for roll-over of 32 bits+start/stop. A polled read instruction would work for this. Probably need a bit more work to specify.
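The "CRC instruction" suggested above would offload a bit-serial CRC update that software otherwise does per bit. A Python sketch, using CRC-16/CCITT (polynomial 0x1021) purely as an example (the thread does not specify a polynomial):

```python
def crc16_ccitt_bit(crc, bit):
    """Advance a CRC-16/CCITT (poly 0x1021) by one bit, MSB-first."""
    fb = ((crc >> 15) & 1) ^ bit   # feedback = top bit XOR incoming bit
    crc = (crc << 1) & 0xFFFF
    if fb:
        crc ^= 0x1021
    return crc

def crc16_ccitt(data, crc=0xFFFF):
    """Byte-stream wrapper around the bit-serial update (init 0xFFFF)."""
    for byte in data:
        for i in range(7, -1, -1):     # MSB first
            crc = crc16_ccitt_bit(crc, (byte >> i) & 1)
    return crc

print(hex(crc16_ccitt(b"123456789")))   # 0x29b1 (standard check value)
```

The per-bit function is exactly the LFSR a hardware assist would implement next to the shifter; software would then only read the final residue.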
  • jmgjmg Posts: 15,175
    edited 2013-10-04 13:48
    Seairth wrote: »
    So, is this approach so far off the mark that it couldn't be made workable? Otherwise, what would need to be changed?

    The only change, is really no-change - just leave the Async Start/Stop state engine in silicon, as Chip has it now.
    Variable length is already mentioned above.

    Rather than break Async, the focus is on adding SPI modes(master/slave), by allowing optional use of that Start/Stop state engine, and adding clock out choices, in both modes.
  • jmgjmg Posts: 15,175
    edited 2013-10-04 14:06
    Cluso99 wrote: »
    img: What Chip has done actually goes further and this is not general enough - you keep missing this point in your quest to use the FTDI part.

    The point is Chip's first pass is a great first pass, but do not break that, in the effort to make it more general.
    You can do both Async and SPI properly, with very little effort.

    Async is already coded, tested and in FPGA. Code is already using this feature.

    Sure, expand it to make it more general, but do not remove it.

    FTDI is just one example, from the real world.
    Cluso99 wrote: »
    But I still don't get why we cannot use the counters for the clk generation.

    Sure, you can use the counters, but do not force all users to use them. That becomes a wasteful constraint.

    Small uC often give users the choice of timers OR local baud rate generation.
    If there is room for a bit to choose Timer or local BRG, I'd be fine with that.
    No constraints and no surprise.

    Cluso99 wrote: »
    In an ultra fast serial, the cog won't have time to do anything else so the counters will be unused.

    ?? Did you really say that?

    If the fast serial has the bits managed in HW (as now), there are plenty of cycles for other threads.

    One nice feature of the FTDI 50MHz Clocked Async mode, is that you can feed it as slow as you want, or you can go up to the full speed in bursts as needed.
    The user has the choice.
  • SeairthSeairth Posts: 2,474
    edited 2013-10-04 14:13
    jmg wrote: »
    Rather than break Async, the focus is on adding SPI modes(master/slave), by allowing optional use of that Start/Stop state engine, and adding clock out choices, in both modes.

    Actually, I was trying to go beyond that. In the grand tradition of generality on the Propeller, I was aiming for some basic hardware that would allow support of just about any serial interface (within reason). It may not be an ideal solution for any single interface, but it would hopefully let any single interface have a better implementation than the software-only approach would provide.