P2 Serial / Shift Register discussion — Parallax Forums

P2 Serial / Shift Register discussion

Cluso99Cluso99 Posts: 18,062
edited 2013-11-24 12:19 in Propeller 2
I thought there should be a new dedicated thread as there are other discussions also going on in the "Big update..." thread
http://forums.parallax.com/showthread.php/150588-Big-update-for-DE2-115-and-DE0-Nano-users-w-add-on-boards/page8

Here is a block diagram for use as a starting point for further discussion...

There are 4 x Shift Registers, 4 x BRGs (Baud Generators), 4 x Configuration registers for the Shift Registers and 4 x Configuration registers for the BRGs. Each register is 32 bits wide and can be read and written.

Each Shift Register can be used as Shift In, Shift Out, or both Shift In & Out. Each Data Input (DI0..3), each Data Output (DO0..3), each Clock In and each Clock Out can be individually inverted, and each Clock In can be selected from either the Clock Input (XC0..3) or the Internal Clock divided by the BRG.

[Attached image: P2_serial_003.jpg (block diagram)]

Comments

  • Cluso99Cluso99 Posts: 18,062
    edited 2013-10-05 21:52
    Chip ultimately will decide, but as a starting point to get things going here are some suggestions
    ...
    Shift Register Configuration:
    xxxxxxxxx_xxxxx_DDddddddd_SSsssssss

    DD = 00 = Data Output Disabled
    DD = 01 = -spare-
    DD = 10 = Data Output Enabled & Not Inverted
    DD = 11 = Data Output Enabled & Inverted
    ddddddd = Data Output Pin P0..127
    SS = 00 = Data Input Disabled
    SS = 01 = -spare-
    SS = 10 = Data Input Enabled & Not Inverted
    SS = 11 = Data Input Enabled & Inverted
    sssssss = Data Input Pin P0..127

    xxxxxxxxx_xxxxx = currently undefined (we need to work out what modes the Shift Registers can operate in)

    Clock/Baud Configuration:
    nnnn....._xxxx_C_DDddddddd_SSsssssss

    DD = 00 = Clock Output Disabled
    DD = 01 = -spare-
    DD = 10 = Clock Output Enabled & Not Inverted
    DD = 11 = Clock Output Enabled & Inverted
    ddddddd = Clock Output Pin P0..127
    SS = 00 = Clock Input Disabled
    SS = 01 = -spare-
    SS = 10 = Clock Input Enabled & Not Inverted
    SS = 11 = Clock Input Enabled & Inverted
    sssssss = Clock Input Pin P0..127 (ignored if internal)

    C = 0 = Select BRG as Clock Input
    C = 1 = Select Clock Input Pin "sssssss" as Clock Input

    nnnn... = Baud Generator (Internal Clock divided by "nnnn...")
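To make the proposed field layout concrete, here is a minimal Python sketch that packs the two 32-bit configuration words. It is only an illustration of the suggestion above, not a description of any real silicon: the function names are hypothetical, the undefined "x" bits are left at zero, and the 9-bit divisor width is an assumption read off the "nnnn....." placeholder.

```python
# Hypothetical helpers for the proposed layout; nothing here is a real P2 API.
PIN_BITS = 7  # ddddddd / sssssss select P0..P127
MODE_DISABLED, MODE_SPARE, MODE_ENABLED, MODE_INVERTED = 0b00, 0b01, 0b10, 0b11


def pack_pin_field(mode: int, pin: int) -> int:
    """Build one 9-bit field: 2 mode bits followed by a 7-bit pin number."""
    if not 0 <= mode <= 3:
        raise ValueError("mode must fit in 2 bits")
    if not 0 <= pin <= 127:
        raise ValueError("pin must be in P0..P127")
    return (mode << PIN_BITS) | pin


def pack_sr_config(out_mode: int, out_pin: int, in_mode: int, in_pin: int) -> int:
    """xxxxxxxxx_xxxxx_DDddddddd_SSsssssss, with the undefined x bits left at 0."""
    return (pack_pin_field(out_mode, out_pin) << 9) | pack_pin_field(in_mode, in_pin)


def pack_clock_config(divisor: int, use_pin_clock: bool,
                      out_mode: int, out_pin: int, in_mode: int, in_pin: int) -> int:
    """nnnnnnnnn_xxxx_C_DDddddddd_SSsssssss (9-bit divisor is an assumed width)."""
    if not 0 <= divisor < (1 << 9):
        raise ValueError("divisor must fit in the assumed 9 bits")
    return ((divisor << 23)
            | (int(use_pin_clock) << 18)      # C bit: 0 = BRG, 1 = clock input pin
            | (pack_pin_field(out_mode, out_pin) << 9)
            | pack_pin_field(in_mode, in_pin))


if __name__ == "__main__":
    # Data out on P5 (not inverted), data in on P6 (inverted)
    print(f"{pack_sr_config(MODE_ENABLED, 5, MODE_INVERTED, 6):032b}")
```

Keeping the DD/SS pin fields in the same bit positions in both registers means one helper can build either field.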
  • Cluso99Cluso99 Posts: 18,062
    edited 2013-10-05 21:52
    Reserved for updates/summary
  • Cluso99Cluso99 Posts: 18,062
    edited 2013-10-05 22:30
    What would we like to do???
    Here are some - we need to discuss properly to ensure we neither limit flexibility nor overcomplicate the silicon...
    • Always LSB first
      • MSB can be achieved by REV instruction
    • UART 8/32/36 bit with Start/Stop (as Chip has already done)
      • 4 bit address option is excellent
    • Sync
      • Disable auto insert/delete of stop bits
      • Specify number of bits 1..32 to tx/rx
      • Do we need to be able to read how many bits were read?
      • Do we need to specify the data edge to start the shift in on?
      • When we write, do we need to be able to restart the clocks, or do we continue at the next clock edge?
    • USB
      • Use Sync mode above
      • Bit Stuffing/Unstuffing by sw
      • CRC16 by sw
    • SPI
      • Use Sync mode above
    • Quad SPI
      • To do this we need to operate the 4 SRs in parallel and specify the same clock
      • Use Sync mode above
      • Anything else?
      • Mux/shifter to read the bits in correct (or reverse and use REV) order.
    • ICS or AC97 ???
    • I2C
    • Daisy-chain up to 4 shift registers (SRs) (could be done using the internal pins P92..127)
      • Work like multiple daisy-chained 74LS595 (without latch)
    • Parallel shift register pairs to use one as a data latch???
    • When we read a SR do we need to be able to clear/set the SR or clock or clock counter?
    • Do we need to count the clocks?
    • Could we use the Counters for the BRG instead? (only 2 - is this OK)
    • We don't want to limit I/O pins to groups (except the hw differential pairs)
    • Add Mux to input to allow "1" or "0" to be shifted in (permits a stop bit shifted in on underrun, or break)
    What have I missed???
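As a small illustration of the "always LSB first, use REV for MSB first" idea in the list above, the sketch below reverses a word in software before it would be handed to an LSB-first shifter. The function `rev_bits` is a hypothetical stand-in written for this example; it is not a model of the actual REV instruction's operands or timing.

```python
def rev_bits(value: int, bits: int = 32) -> int:
    """Reverse the order of the low `bits` bits of `value`."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (value & 1)  # move the current LSB to the other end
        value >>= 1
    return result


def lsb_first(value: int, bits: int) -> list:
    """Return the bit sequence an LSB-first shift register would emit."""
    return [(value >> i) & 1 for i in range(bits)]


if __name__ == "__main__":
    byte = 0b11010000
    # Reversing first makes the LSB-first shifter emit the original MSB first.
    print(lsb_first(rev_bits(byte, 8), 8))
```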
  • average joeaverage joe Posts: 795
    edited 2013-10-05 22:36
    Looking very good. I see options for this as synchronous as well as asynchronous. I really like the fact that there are 4 shift registers. This would make QSPI very feasible. The other protocol option I would personally be interested in is USB. Even if it's not fully compliant, it would be nice to have. I like how this maps to 4 quads. I wish I had more to add but I've only worked with bit-bang SPI and I2C. Never done much on the hardware implementation end..
  • average joeaverage joe Posts: 795
    edited 2013-10-05 22:48
    Cluso99 wrote: »
    Do we need to be able to read how many bits were read?

    I don't think it's absolutely necessary, but it could be very nice.
    Cluso99 wrote: »
    Do we need to specify the data edge to start the shift in on?

    YES, I absolutely think this is necessary.
    Cluso99 wrote: »
    When we write, do we need to be able to restart the clocks, or do we continue at the next clock edge?

    I think continuing at the next clock edge would suffice.

    Cluso99 wrote: »
    SPI
    Quad SPI
    To do this we need to operate the 4 SRs in parallel and specify the same clock
    Anything else?

    If there was an easy way to handle /CS strobe in hardware it would be handy. Otherwise software could handle.
    Cluso99 wrote: »
    When we read a SR do we need to be able to clear/set the SR or clock or clock counter?
    I don't think these are necessary. Set / clear could be done by writing back to SR. Not sure about clock or counter.

    Cluso99 wrote: »
    Do we need to count the clocks? Can this be done with the BRG?

    I think this could be very handy. Might be part of my above statement about /CS?
    Cluso99 wrote: »
    Could we use the Counters for the BRG instead? (only 2 - is this OK)

    I think this could be done. I would like the BRGs better though...
    Cluso99 wrote: »
    We don't want to limit I/O pins to any form of groups (except the hw differential pairs)

    I absolutely agree. Any pin, any task.
    Cluso99 wrote: »
    What have I missed???

    Seems pretty complete to me, although I'm sure others will have more thoughts.
  • jmgjmg Posts: 14,946
    edited 2013-10-06 00:13
    Cluso99 wrote: »
    • UART 8/32/36 bit with Start/Stop (as Chip has already done)

    This first pass lacks an important 9 bit mode. I think 5 bits for bit-frame size, makes this properly flexible.
    Uart mode needs to have a Clocked variant. - same bits as classic Async.


    • Sync
      • Disable auto insert/delete of stop bits
      • Specify number of bits 1..32 to tx/rx
      • Do we need to be able to read how many bits were read?
      • Do we need to specify the data edge to start the shift in on?
      • When we write, do we need to be able to restart the clocks, or do we continue at the next clock edge?


    If you have Bit Count field for sync, use that also for Async.

    Reading the bits is a good noise-filter. As mentioned before, some SPI slaves check MOD 8 on CLK-Count, before they accept.
    This removes one-off impulse noise effects on the CLK line.
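The MOD 8 check mentioned above can be sketched in a few lines. This is only a behavioural model under assumed names (`ClockCountFilter`, `select`, `clock_edge`, `deselect` are all hypothetical): it counts clock edges while the slave is selected and accepts the transfer only if the count is a whole number of words.

```python
class ClockCountFilter:
    """Count clocks between select and deselect; accept only whole words."""

    def __init__(self, word_bits: int = 8):
        self.word_bits = word_bits
        self.count = 0

    def select(self) -> None:
        """Slave select asserted: start a fresh count."""
        self.count = 0

    def clock_edge(self) -> None:
        """One active clock edge seen while selected."""
        self.count += 1

    def deselect(self) -> bool:
        """Slave select released: True if the clock count was a multiple of word_bits."""
        return self.count > 0 and self.count % self.word_bits == 0


if __name__ == "__main__":
    f = ClockCountFilter()
    f.select()
    for _ in range(17):      # 16 real clocks plus one noise glitch
        f.clock_edge()
    print(f.deselect())      # the glitched transfer is rejected
```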

    It is vital to specify the active edge, otherwise you will be sampling very close to where the data changes.

    Transmit usually does not restart CLK (but maybe a Baud-reload, should re-prime to avoid one-more-bit at old rate effects?)

    Gapless Send and receive are important, in all modes.

    Some slave-mode SPI parts include SS support in the HW: it gates the clock (via CE) and tri-states the SPI output.
    i.e. SPI is a 4-pin hardware design exercise.

    • USB
      • Use Sync mode above
      • Bit Stuffing/Unstuffing by sw
      • CRC16 by sw

    USB would be nice, but presents a conundrum.
    SW only has many compromises and can limit packet sizes, or not fully meet specs - ok for limited tasks, but not commercial.
    Bit level HW support is not complex, but needs significant testing time.

    Bit-Toggle modulation is really a T-FF, and could be safely made a SPI option. Needs minimal testing.
    Edge Sync needs a moderate divide ratio, to give Edge-snap, rather than a pure PLL.
    Support of this needs a means to modulate the Baud Divider. Either +1/+0/-1 or pre-load

    Bit Stuff / unstuff needs a Gated Counter/state engine, and a Clock Enable on the SPI shifter.
    Not large HW, but it needs to be tested.
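For readers who have not met these two USB line-coding steps, here is a minimal software sketch of both: bit stuffing (a 0 is inserted after six consecutive 1s) and the toggle-style NRZI encoding (a 0 bit toggles the line, a 1 bit leaves it unchanged). The function names and the list-of-bits representation are assumptions for illustration; this is not a proposal for how the hardware or any driver would be structured.

```python
def bit_stuff(bits):
    """Insert a 0 after every run of six consecutive 1s."""
    out, run = [], 0
    for b in bits:
        out.append(b)
        if b == 1:
            run += 1
            if run == 6:
                out.append(0)   # stuffed bit
                run = 0
        else:
            run = 0
    return out


def bit_unstuff(bits):
    """Remove the 0 that follows every run of six consecutive 1s."""
    out, run, expect_stuffed = [], 0, False
    for b in bits:
        if expect_stuffed:
            if b != 0:
                raise ValueError("bit-stuff violation: seventh consecutive 1")
            expect_stuffed, run = False, 0
            continue            # drop the stuffed 0
        out.append(b)
        if b == 1:
            run += 1
            expect_stuffed = run == 6
        else:
            run = 0
    return out


def nrzi_encode(bits, level=1):
    """Toggle the line level on every 0 bit (the T-FF behaviour); hold it on a 1."""
    out = []
    for b in bits:
        if b == 0:
            level ^= 1
        out.append(level)
    return out
```

The stuffing step is what guarantees the receiver sees a transition at least every seven bit times, which is what the edge re-sync described above relies on.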


    • Quad SPI
      • To do this we need to operate the 4 SRs in parallel and specify the same clock

    Not quite, as that gives 4 shuffled decks - a pain to unshuffle.
    I think Chip had worked out how to manage QuadSPI.
    The issue is mainly around pin-mapping.
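The "4 shuffled decks" problem can be shown with a short model. It assumes a 32-bit word sent a nibble per clock, most significant nibble first, with IO3 carrying the top bit of each nibble; four independent 1-bit shift registers (one per IO line) then each capture every fourth bit, and software has to re-interleave them. All names here are hypothetical.

```python
def split_across_lanes(word: int, bits: int = 32) -> list:
    """Model four 1-bit shift registers, one per IO line, capturing a nibble-wide transfer."""
    lanes = [0, 0, 0, 0]
    for shift in range(bits - 4, -1, -4):           # one nibble per clock, MS nibble first
        nibble = (word >> shift) & 0xF
        for k in range(4):                          # lane k is wired to IOk
            lanes[k] = (lanes[k] << 1) | ((nibble >> k) & 1)
    return lanes


def unshuffle(lanes: list, bits: int = 32) -> int:
    """Software re-interleave: rebuild the word from the four lane registers."""
    word = 0
    for i in range(bits // 4 - 1, -1, -1):          # oldest captured bit sits highest in each lane
        nibble = 0
        for k in range(4):
            nibble |= ((lanes[k] >> i) & 1) << k
        word = (word << 4) | nibble
    return word


if __name__ == "__main__":
    lanes = split_across_lanes(0x12345678)
    print([f"{lane:08b}" for lane in lanes])        # four 8-bit "decks"
    print(hex(unshuffle(lanes)))
```

The nested loop in `unshuffle` is the per-word software cost that a nibble-aware shifter would avoid.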

    • ICS or AC97 ???

    This can be done as a SPI variant. It needs the SS pin to have small HW support, of update on BitField End.
    (ie not just a SW wiggle but add a single D-FF, or a T-FF.). Probably also a good idea on SPI SS as well.
    Who would want to change SS part way thru a field ?
    Supposing there is a rare case, use the Length field, and send it in portions.

    • Could we use the Counters for the BRG ?

    Sure, as mentioned before, as a user option. BRG is the default, so the new, smarter and expanded counters are not wasted.

    • We don't want to limit I/O pins to groups (except the hw differential pairs)

    Whilst this sounds nice, it has a high cost, as you have a LOT of Pin-Map fields.

    Mentioned earlier after a Chip discussion, was a pin-follower scheme, where a Pin-map field sets a lead pin, and the others fall into line behind.
    Do users really want to scatter their SPI pins all over the device ?
  • evanhevanh Posts: 13,098
    edited 2013-10-06 00:52
    Are you sure Chip has Quad-SPI sorted? I just noted Quad-SPI uses bidirectional data pins, I guess that should be obvious but it deviates pretty severely from generic SPI. It's gonna make the pin mux'ing more complicated. Can the direction change timing be managed in software? Do we want to slave this too?

    Lol, I also note the data is not arranged serially. Except for command compatibility, Quad-SPI is really a 4-bit parallel port. That's gonna be a hassle to re-package the data to and from the shift registers. It might actually be faster to bit-bang it!
  • average joeaverage joe Posts: 795
    edited 2013-10-06 01:40
    evanh wrote: »
    Are you sure Chip has Quad-SPI sorted? I just noted Quad-SPI uses bidirectional data pins, I guess that should be obvious but it deviates pretty severely from generic SPI. It's gonna make the pin mux'ing more complicated. Can the direction change timing be managed in software? Do we want to slave this too?

    Lol, I also note the data is not arranged serially. Except for command compatibility, Quad-SPI is really a 4-bit parallel port. That's gonna be a hassle to re-package the data to and from the shift registers. It might actually be faster to bit-bang it!

    These are very good questions. My experience with the P1 and QSPI was a bit annoying just because of the bi-directional pins. I have all but given up using counters to run the clock due to timing restrictions. You are also very right about re-packaging the data to and from shift registers. Bit banging may be the way to go.
  • average joeaverage joe Posts: 795
    edited 2013-10-06 02:05
    jmg wrote: »
    This first pass lacks an important 9 bit mode. I think 5 bits for bit-frame size, makes this properly flexible.
    Uart mode needs to have a Clocked variant. - same bits as classic Async.


    • Sync
      • Disable auto insert/delete of stop bits
      • Specify number of bits 1..32 to tx/rx
      • Do we need to be able to read how many bits were read?
      • Do we need to specify the data edge to start the shift in on?
      • When we write, do we need to be able to restart the clocks, or do we continue at the next clock edge?


    If you have Bit Count field for sync, use that also for Async.

    Reading the bits is a good noise-filter. As mentioned before, some SPI slaves check MOD 8 on CLK-Count, before they accept.
    This removes one-off impulse noise effects on the CLK line.

    It is vital to specify the active edge, otherwise you will be sampling very close to where the data changes.

    Transmit usually does not restart CLK (but maybe a Baud-reload, should re-prime to avoid one-more-bit at old rate effects?)

    Gapless Send and receive are important, in all modes.

    Some slave-mode SPI parts include SS support in the HW: it gates the clock (via CE) and tri-states the SPI output.
    i.e. SPI is a 4-pin hardware design exercise.

    • USB
      • Use Sync mode above
      • Bit Stuffing/Unstuffing by sw
      • CRC16 by sw

    USB would be nice, but presents a conundrum.
    SW only has many compromises and can limit packet sizes, or not fully meet specs - ok for limited tasks, but not commercial.
    Bit level HW support is not complex, but needs significant testing time.

    Bit-Toggle modulation is really a T-FF, and could be safely made a SPI option. Needs minimal testing.
    Edge Sync needs a moderate divide ratio, to give Edge-snap, rather than a pure PLL.
    Support of this needs a means to modulate the Baud Divider. Either +1/+0/-1 or pre-load

    Bit Stuff / unstuff needs a Gated Counter/state engine, and a Clock Enable on the SPI shifter.
    Not large HW, but it needs to be tested.


    • Quad SPI
      • To do this we need to operate the 4 SRs in parallel and specify the same clock

    Not quite, as that gives 4 shuffled decks - a pain to unshuffle.
    I think Chip had worked out how to manage QuadSPI.
    The issue is mainly around pin-mapping.

    • ICS or AC97 ???

    This can be done as a SPI variant. It needs the SS pin to have small HW support, of update on BitField End.
    (ie not just a SW wiggle but add a single D-FF, or a T-FF.). Probably also a good idea on SPI SS as well.
    Who would want to change SS part way thru a field ?
    Supposing there is a rare case, use the Length field, and send it in portions.

    • Could we use the Counters for the BRG ?

    Sure, as mentioned before, as a user option. BRG is the default, so the new, smarter and expanded counters are not wasted.

    • We don't want to limit I/O pins to groups (except the hw differential pairs)

    Whilst this sounds nice, it has a high cost, as you have a LOT of Pin-Map fields.

    Mentioned earlier after a Chip discussion, was a pin-follower scheme, where a Pin-map field sets a lead pin, and the others fall into line behind.
    Do users really want to scatter their SPI pins all over the device ?

    My interest for USB is simple devices such as keyboard and mouse. FLASH drives would be a great addition but not entirely necessary.
  • jmgjmg Posts: 14,946
    edited 2013-10-06 02:34
    evanh wrote: »
    Are you sure Chip has Quad-SPI sorted? I just noted Quad-SPI uses bidirectional data pins, I guess that should be obvious but it deviates pretty severely from generic SPI. It's gonna make the pin mux'ing more complicated. Can the direction change timing be managed in software? Do we want to slave this too?

    There was a previous comment suggesting this was largely fleshed out.


    Not that much more complex, just follow the rule of Bit-level in HW and Byte/word level in SW.

    evanh wrote: »
    Lol, I also note the data is not arranged serially. Except for command compatibility, Quad-SPI is really a 4-bit parallel port. That's gonna be a hassle to re-package the data to and from the shift registers. It might actually be faster to bit-bang it!

    No, you just build the shift register with muxes. It shifts nibbles, but reads / writes in parallel, as 32 bits, with the bits in the correct places.
  • evanhevanh Posts: 13,098
    edited 2013-10-06 03:01
    jmg wrote: »
    No, you just build the shift register with muxes. It shifts nibbles, but reads / writes in parallel, as 32 bits, with the bits in the correct places.
    It can't be an ordinary bit shifter since the data goes out as a parallel nibble not a serial nibble. That's quite a difference.
  • evanhevanh Posts: 13,098
    edited 2013-10-06 03:25
    Interesting, there is another mode called QPI, in addition to Quad-SPI, where the commands are also in parallel. Not all Quad-SPI devices support QPI mode though.

    I have yet to find an example of how the hardware handles transforming between the various serial and parallel modes.
  • rabaggettrabaggett Posts: 96
    edited 2013-10-06 04:43
    average joe wrote: »
    I absolutely agree. Any pin, any task.

    average joe seems to be doing pretty much exactly the same things I am. I can't add much except to highlight free pin assignments. I don't know about everyone else, but my pin assignments constantly change until the board layout is finished. I will happily update my code and schematic to get rid of a via or two!
  • pjvpjv Posts: 1,903
    edited 2013-10-06 08:31
    Cluso,

    This seems overly complicated to me..... in fact I can't even follow your proposal without studying it more than I'm prepared to do at this time.

    I'd be happy with 4 cascadable registers, each double buffered with an auto-transfer after N clocks so the driving software can continuously feed/read data at jittery intervals. I simply wish to be able to continuously stream data perfectly, and have time to analyze data (CRC, LFSR, CHECKSUM, BOSE-CHAUDHURI, etc) while streaming is ongoing. Also, it would be nice (but not imperative) if the clock generator could also employ a "swallow" mode like radio RF generators so any frequency could be made, not just integer-divides of the clock.

    Just my opinion.

    Cheers,

    Peter (pjv)
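One possible reading of the "swallow" mode mentioned above is a dual-modulus divider: the output period is normally N input clocks, and every so often one extra input clock is swallowed so the period becomes N+1, giving a fractional average. The sketch below models that interpretation only; the function name, the accumulator scheme and the parameters are all assumptions, and the resulting edges carry up to one input clock of jitter.

```python
def swallow_divider(sys_clocks: int, n: int, frac_num: int, frac_den: int) -> list:
    """Return the input-clock indices of output edges.

    The average period is n + frac_num/frac_den input clocks, built from
    periods of n and n + 1 clocks chosen by a simple accumulator.
    """
    if n < 1 or not 0 <= frac_num < frac_den:
        raise ValueError("need n >= 1 and 0 <= frac_num < frac_den")
    edges, acc, t = [], 0, 0
    while True:
        period = n
        acc += frac_num
        if acc >= frac_den:       # accumulator overflow: swallow one extra clock
            acc -= frac_den
            period += 1
        t += period
        if t > sys_clocks:
            return edges
        edges.append(t)


if __name__ == "__main__":
    # Divide by 3.5 on average: periods alternate between 3 and 4 clocks
    print(swallow_divider(35, 3, 1, 2))
```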
  • ctwardellctwardell Posts: 1,714
    edited 2013-10-06 08:54
    pjv wrote: »
    each double buffered with an auto-transfer after N clocks so the driving software can continuously feed/read data at jittery intervals

    I agree that buffering is needed to get the most benefit from the shifters.

    Chris Wardell
  • jmgjmg Posts: 14,946
    edited 2013-10-06 12:38
    evanh wrote: »
    It can't be an ordinary bit shifter since the data goes out as a parallel nibble not a serial nibble. That's quite a difference.

    No, it is not ordinary, it has muxes added.

    That changes the hand-over points, so each bit reaches 4 ahead, and the last 4 (or first 4) drive the IO pins.
    8 clocks move all 32 bits, but it reads as a 32 bit register

    - this is a couple of lines in HDL.
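A behavioural model of the nibble-wide shifter described above might look like the following. It is a sketch, not HDL: the class name is hypothetical, and shifting the most significant nibble out first is an assumption made for the example. The point it illustrates is that the register shifts 4 bits per clock but is still read and written as an ordinary 32-bit value, so software does no reshuffling.

```python
MASK32 = 0xFFFFFFFF


class NibbleShifter:
    """32-bit register that moves 4 bits per clock; the top nibble drives IO3..IO0."""

    def __init__(self, value: int = 0):
        self.value = value & MASK32

    def clock_out(self) -> int:
        """One clock in transmit mode: present the top nibble, then shift up by 4."""
        nibble = (self.value >> 28) & 0xF
        self.value = (self.value << 4) & MASK32
        return nibble

    def clock_in(self, nibble: int) -> None:
        """One clock in receive mode: shift up by 4 and capture the pins at the bottom."""
        self.value = ((self.value << 4) | (nibble & 0xF)) & MASK32


if __name__ == "__main__":
    tx = NibbleShifter(0x12345678)
    rx = NibbleShifter()
    for _ in range(8):                 # 8 clocks move all 32 bits
        rx.clock_in(tx.clock_out())
    print(hex(rx.value))
```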
  • jmgjmg Posts: 14,946
    edited 2013-10-06 12:44
    jmg wrote: »
    • ICS or AC97 ???

    This can be done as a SPI variant. It needs the SS pin to have small HW support, of update on BitField End.
    (ie not just a SW wiggle but add a single D-FF, or a T-FF.).

    Expanding:
    This simple control-pin option of a D/T FF, (Bit-frame clocked, so control pin is exactly known) should also be available in all Async modes, where it would be very useful for RS485 direction handling.
  • evanhevanh Posts: 13,098
    edited 2013-10-06 14:10
    jmg wrote: »
    ... and the last 4 (or first 4) drive the IO pins. 8 clocks move all 32 bits, but it reads as a 32 bit register

    I think I got it. Only one of the four 32 bit buffer registers gets used. :( I suppose the odd parallel arrangement is to allow easier bit-bashing for uC that don't have the right hardware.
  • evanhevanh Posts: 13,098
    edited 2013-10-06 14:14
    jmg wrote: »
    Expanding:
    This simple control-pin option of a D/T FF, (Bit-frame clocked, so control pin is exactly known) should also be available in all Async modes, where it would be very useful for RS485 direction handling.

    The special case wiring for Quad-SPI is stacking up.

    I know ... that's a bit snarky. It's just I'm finding myself a bit shocked at how non-SPI it is. It's a bit rude to have a related name.
  • jmgjmg Posts: 14,946
    edited 2013-10-06 14:37
    evanh wrote: »
    The special case wiring for Quad-SPI is stacking up.

    I know ... that's a bit snarky. It's just I'm finding myself a bit shocked at how non-SPI it is. It's a bit rude to have a related name.

    Not really, it is a nibble wide super set, designed to still fit into 8 pin standard packages.
    Most (all?) QuadSPI parts can still run 1-bit SPI mode, and to make it nibble wide, you have to go bi-directional.
    So there will be caveats, but nothing some simple HW cannot accommodate.

    Then there is DDR QuadSPI .... ;)
    - because the Prop 2 is likely to be pin-bound ~ 50MHz pin wiggle, DDR has appeal, and starts to make XIP more compelling.
  • evanhevanh Posts: 13,098
    edited 2013-10-06 15:01
    Lol, good yarn. Quad-SPI probably has more in common with ECP printer ports than it does with SPI.
  • jmgjmg Posts: 14,946
    edited 2013-10-06 16:16
    Where anyone thinks QuadSPI 'belongs' in the spectrum of history, is not that important.

    More important, is to look at a device like Spansion S25FL128SA, available for under $2, and see how the QuadSPI DDR works.

    It does not look too bad,
    4.2.1 shows a 1-bit SPI command sent, then 32 bits of Address, then Mode of 8 bits, left justified in 32 bits (24 bits dummy) and then Data follows after a turn around time.

    Figure 10.52 Continuous DDR Quad I/O Read Subsequent Access (4-byte Address, EHPLC=01b)
    gives clearer info on the turn around.
    32 bits of Address. DDR
    8 bits of mode
    24 bits of floating IO (3 full clks at DDR = 6 slots = 24 bits )
    then the Flash device emits 32 bits of DLP, and then Data streams.

    If you wanted to software-assist this, you need length control in 4 bit multiples (fits inside the proposed 1 bit granularity)
    and a Dirn bit on TriState, that can be bit-Frame sync'd

    From Figure 10.52 Sw actions are then just this :
    Length=32, TxMode(drive pins) Send Address. [4 clocks]
    Length=8, TxMode(drive pins) Send Mode [1 clock]
    Length=24, RxMode(Float pins) Discard RX data [3 clocks]
    Length=32 (DLP size), RxMode(Float pins) Discard RX data [4 clocks]
    (adjust these Float+Discard phases, in SW, to suit vendor or chip changes )

    and after the above 12 clocks setting address and chip getting ready, then valid read data streams from Flash

    repeat for as much data as you want ...
    Length=UserChoice, RxMode(Float pins) Valid RX data [Length/8 Clocks]

    Note that continuous DDR skips the 1-bit command preamble, so has a 'sticky' command, but drops the clocks needed.
    Again we follow the rule, HW-does-bits, SW-does-bytes.
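The clock counts in the sequence above follow from 4 data lines times 2 edges per clock, i.e. 8 bits per clock in DDR quad mode. The sketch below just redoes that arithmetic for the phases listed in the post; the phase table and function names are illustrative and are not taken from the datasheet.

```python
BITS_PER_SLOT = 4      # four data lines move one nibble per clock edge
SLOTS_PER_CLOCK = 2    # DDR: data moves on both clock edges

# (phase name, length in bits, pin direction), as listed in the post above
SETUP_PHASES = [
    ("address", 32, "tx"),
    ("mode", 8, "tx"),
    ("turnaround", 24, "rx, discard"),
    ("DLP", 32, "rx, discard"),
]


def clocks_for(length_bits: int) -> float:
    """Clocks needed for a phase; length must be a multiple of 4 bits."""
    if length_bits <= 0 or length_bits % BITS_PER_SLOT:
        raise ValueError("length must be a positive multiple of 4 bits")
    return length_bits / (BITS_PER_SLOT * SLOTS_PER_CLOCK)


def setup_clocks(phases=SETUP_PHASES) -> float:
    """Total clocks spent before valid read data starts streaming."""
    return sum(clocks_for(bits) for _, bits, _ in phases)


if __name__ == "__main__":
    for name, bits, direction in SETUP_PHASES:
        print(f"{name:10s} {bits:3d} bits  {direction:12s} {clocks_for(bits):g} clocks")
    print("total setup clocks:", setup_clocks())
```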
  • Cluso99Cluso99 Posts: 18,062
    edited 2013-10-06 16:23
    rabaggett wrote: »
    average joe seems to be doing pretty much exactly the same things I am. I can't add much except to highlight free pin assignments. I don't know about everyone else, but my pin assignments constantly change until the board layout is finished. I will happily update my code and schematic to get rid of a via or two!
    Me too! Even move pins to put them near the connectors. I even swap address lines to the SRAM and swap the data lines too. But another reason could be that the silicon has some special function like differential tx & rx pairs. This is limited to even/odd pairs, so if we force consecutive pins we fall foul of being able to use these as well.
  • Cluso99Cluso99 Posts: 18,062
    edited 2013-10-06 16:29
    pjv wrote: »
    Cluso,

    This seems overly complicated to me..... in fact I can't even follow your proposal without studying it more than I'm prepared to do at this time.

    I'd be happy with 4 cascadable registers, each double buffered with an auto-transfer after N clocks so the driving software can continuously feed/read data at jittery intervals. I simply wish to be able to continuously stream data perfectly, and have time to analyze data (CRC, LFSR, CHECKSUM, BOSE-CHAUDHURI, etc) while streaming is ongoing. Also, it would be nice (but not imperative) if the clock generator could also employ a "swallow" mode like radio RF generators so any frequency could be made, not just integer-divides of the clock.

    Just my opinion.

    Cheers,

    Peter (pjv)
    What do you mean by swallow? My view of the BRG is that it is not actually a divider, but a counter. So we count "n" bits and reset the counter. This gives us a totally flexible bitrate of "n" clocks, so it can be set to a granularity of 1 clock.

    Sorry if you are not understanding me. I am not trying to make it overly complex, just extremely flexible, with all shift registers being identical.
    What parts don't you understand? - my fault if you don't follow, so please ask.

    I would love to ask for a CRC16 instruction but am not game to ask. This is one of the problems with FS USB. Without a shift register, FS is proving to be quite difficult at 80MHz but I think I will eventually get there.
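For reference, the CRC16 that would otherwise be done in software can be sketched bit by bit as below. It uses the usual CRC-16/USB parameters (polynomial 0x8005 processed LSB first as 0xA001, initial value 0xFFFF, final inversion). The function name is hypothetical and this is a plain reference loop, not the optimised code a cog would need to keep up with FS USB.

```python
def crc16_usb(data: bytes) -> int:
    """Bitwise CRC-16/USB over `data`, LSB first."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ 0xA001   # 0x8005 with its bits reversed
            else:
                crc >>= 1
    return crc ^ 0xFFFF                      # final inversion


if __name__ == "__main__":
    print(hex(crc16_usb(b"123456789")))
```

The inner loop is eight conditional shift-and-XOR steps per byte, which is the per-bit work a dedicated instruction or a hardware LFSR would remove.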
  • Cluso99Cluso99 Posts: 18,062
    edited 2013-10-06 16:33
    jmg wrote: »
    Expanding:
    This simple control-pin option of a D/T FF, (Bit-frame clocked, so control pin is exactly known) should also be available in all Async modes, where it would be very useful for RS485 direction handling.
    I am not understanding this option. Can you explain further perhaps with a block diagram?
  • jmgjmg Posts: 14,946
    edited 2013-10-06 16:36
    Cluso99 wrote: »
    My view of the BRG is not actually a divider, but a counter. So we count "n" bits and reset the counter. This gives us a totally flexible bitrate of "n" clocks - so able to set to a granularity of 1 clock.

    This becomes semantics, one man's divider can be another man's counter.

    Most modern BRGs are reloadable dividers/counters, so they are not binary, but /N.
    Because RX needs to do a half-bit offset, there is usually an 'Even N' implied, so the formula is Fb = FSys/(2*N)

    Next, practical Async sample-width/jitter considerations usually have N > 2, but a CLK-OUT Async mode could support FSys/2,
    up to the pin-limit.


    I prefer a baud formula to words. It makes it very clear what is supported.
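Taking the formula Fb = FSys/(2*N) at face value, the nearest divisor and the resulting baud error are easy to compute. The sketch below assumes that formula exactly as written above; the function name and the 80 MHz example clock are illustrative choices.

```python
def baud_divisor(f_sys: float, target_baud: float):
    """Return (N, actual baud, error in percent) for Fb = FSys / (2 * N)."""
    n = max(1, round(f_sys / (2 * target_baud)))   # N cannot go below 1
    actual = f_sys / (2 * n)
    error_pct = (actual - target_baud) / target_baud * 100
    return n, actual, error_pct


if __name__ == "__main__":
    n, actual, err = baud_divisor(80_000_000, 115_200)
    print(f"N = {n}, actual = {actual:.1f} baud, error = {err:+.3f}%")
```

With N = 1 the formula gives FSys/2, which matches the upper limit mentioned for a CLK-OUT mode.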
  • Cluso99Cluso99 Posts: 18,062
    edited 2013-10-06 16:37
    jmg wrote: »
    Where anyone thinks QuadSPI 'belongs' in the spectrum of history, is not that important.

    More important, is to look at a device like Spansion S25FL128SA, available for under $2, and see how the QuadSPI DDR works.

    It does not look too bad,
    4.2.1 shows a 1-bit SPI command sent, then 32 bits of Address, then Mode of 8 bits, left justified in 32 bits (24 bits dummy) and then Data follows after a turn around time.

    Figure 10.52 Continuous DDR Quad I/O Read Subsequent Access (4-byte Address, EHPLC=01b)
    gives clearer info on the turn around.
    32 bits of Address. DDR
    8 bits of mode
    24 bits of floating IO (3 full clks at DDR = 6 slots = 24 bits )
    then the Flash device emits 32 bits of DLP, and then Data streams.

    If you wanted to software-assist this, you need length control in 4 bit multiples (fits inside the proposed 1 bit granularity)
    and a Dirn bit on TriState, that can be bit-Frame sync'd

    From Figure 10.52 Sw actions are then just this :
    Length=32, TxMode(drive pins) Send Address. [4 clocks]
    Length=8, TxMode(drive pins) Send Mode [1 clock]
    Length=24, RxMode(Float pins) Discard RX data [3 clocks]
    Length=32 (DLP size), RxMode(Float pins) Discard RX data [4 clocks]
    (adjust these Float+Discard phases, in SW, to suit vendor or chip changes )

    and after the above 12 clocks setting address and chip getting ready, then valid read data streams from Flash

    repeat for as much data as you want ...
    Length=UserChoice, RxMode(Float pins) Valid RX data [Length/8 Clocks]

    Note that continuous DDR skips the 1-bit command preamble, so has a 'sticky' command, but drops the clocks needed.
    Again we follow the rule, HW-does-bits, SW-does-bytes.
    Seems to be quite complex. What speed does this operate at? How can minimal hw be added so that sw can do the bulk?
  • jmgjmg Posts: 14,946
    edited 2013-10-06 16:40
    Cluso99 wrote: »
    I am not understanding this option. Can you explain further perhaps with a block diagram?

    For RS485, many modern UARTs now have an RS485 direction bit.
    This changes at the exact end of the Stop bit, so the BUS Transceiver flips direction. (half duplex)

    In SPI, SS-Pin control is best bit-Frame aligned, this just extends that option, to Async, to support the RS485.DIR pin.
  • Cluso99Cluso99 Posts: 18,062
    edited 2013-10-06 16:47
    Overnight I had a couple of thoughts...
    • We can free up a whole instruction and perhaps this should be the WTG.
      • The R bit indicates read or write
    • Regarding clocking etc, I wondered if all instructions should be 2 clocks.
      • If the instruction read/writes the shift register while a shift clock executes, then the instruction read/writes again on the next clock, else the clock is just a nop stall. Keeps timing deterministic.
    • Regarding Q-SPI, read/write of data effectively goes thru a barrel shifter (think that's the correct term). D0-3 have to go to separate shift registers for D0, etc. Same applies to read. Others may refer to this as muxing. We only need to be able to explain what is required and Chip will understand.
    Seems to me that none of this is likely to work with multi-threads and super fast I/O. I am basing this on the USB FS that I am working on.
    Currently all I am trying to do is monitor the FS data. I am not getting enough time to get it going yet.
  • jmgjmg Posts: 14,946
    edited 2013-10-06 16:51
    Cluso99 wrote: »
    Seems to be quite complex. What speed does this operate at? How can minimal hw be added so that sw can do the bulk?

    Software is doing the bulk - see

    Length=32, TxMode(drive pins) Send Address. [4 clocks]
    Length=8, TxMode(drive pins) Send Mode [1 clock]
    Length=24, RxMode(Float pins) Discard RX data [3 clocks]
    Length=32 (DLP size), RxMode(Float pins) Discard RX data [4 clocks]
    Length=UserChoice, RxMode(Float pins) Valid RX data [Length/8 Clocks]


    The HW is minimal, - a couple of lines of HDL to morph shifter to Quad compatible, (it reads/writes bit-correct, no shuffles needed)
    and even DDR is not complex if you have a higher SysClk, which P2 does.

    To me the pin-mapping is the more complex issue, but I think Chip had that covered.

    in DDR, standard Shifter clocks from faster internal CLK, and that toggles a pin, for the slower DDR clk. (master mode)
    Some Tpd care would be needed, to phase the CLK pin and Data (this could be Data from a other-edge FF, at SysClk speeds)

    The Spansion part specs 66MHz DDR.Quad, so depending on the Prop 2 pins I'd guess 50~66MHz nibble rate.