Using the video generator for SPI, i2s etc — Parallax Forums

Ahle2 Posts: 1,179
edited 2014-03-23 19:28 in Propeller 1
I realized that it might be a good idea to share a piece of code that has been lying on my shelf for a couple of years now.
It's a video-generator-driven SPI driver. I think it's possible to use the same "trick" for other serial protocols as well. (I2S?)
The thing is that it's actually possible to do some crunching simultaneously with a byte being transferred by the video generator.

In short, it works as described below.
Each byte in a data stream is picked from a table in cog memory (addresses 0 - 255). Every position in the table holds an 8-bit value with the clock interleaved.
The video generator is then set up in a certain way to output a clock signal on one pin and the bit stream on another pin.

I will soon upload something... stay tuned!

Comments

  • pik33 Posts: 2,394
    edited 2012-07-06 05:10
    It seems the Propeller is still underestimated after 6 years and can still do more. With fast SPI transfer, maybe we can get a high-color video driver...
  • Martin Hodge Posts: 1,246
    edited 2012-07-06 06:58
    So this method requires more than half the cog for a look up table?
  • Ahle2 Posts: 1,179
    edited 2012-07-06 07:02
    Yes, but the LUT is calculated on startup, so the footprint isn't very big.
  • kuroneko Posts: 3,623
    edited 2012-07-06 07:30
    Ahle2 wrote: »
    Yes, but the LUT is calculated on startup, so the footprint isn't very big.
    Why don't you just use the PLL clock itself (%00010)?
  • Ahle2 Posts: 1,179
    edited 2012-07-06 07:51
    @kuroneko
    Because then I can't output both clock and data simultaneously without any PASM intervention.

    I can simply do
    wrByteData waitvid   CISO, 0-0 ' Start writing a byte using the video generator (clock + serial data)
    
               do
               something
               while
               the
               video
               generator
               is 
               busy
               writing
               a
               byte
               COMPLETELY
               by
               itself
    
    wrNextByte waitvid   CISO, 0-0 
    
  • Mark_T Posts: 1,981
    edited 2012-07-06 08:33
    But many SPI transactions require input, not just output, so you'll have to synchronize carefully to be able to pull in the bits.

    How long does the video generation technique take to start up and tear down? Surely the PLL has to stabilise (I haven't seen this mentioned in the manual; perhaps it's very quick).
  • pik33 Posts: 2,394
    edited 2012-07-06 08:49
    Mark_T wrote: »

    How long does the video generation technique take to start up and tear down? Surely the PLL has to stabilise (I haven't seen this mentioned in the manual; perhaps it's very quick).

    Give it a few milliseconds to stabilize, but it is a one-time job.
  • Ahle2 Posts: 1,179
    edited 2012-07-06 09:02
    @Mark_T
    The PLL is set at startup and isn't touched after that.
    It's then up to the code to feed the video generator at a steady rate.

    I was able to achieve a true 20 Mbit/s from hub RAM to SPI RAM on my C3 without any hiccups when the PLL was aligned to the hub window.
    I am pretty sure it's possible to do better than that with some clever unrolling in combination with rdlong. (Of course, the SPI RAM on the C3 can't handle more than 20 Mbit/s.)
    It's been more than a year since I did anything on this. First of all I have to locate the code somewhere on my drives.
    Then I may have to do some code cleaning.
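As a rough check of the figures above, the per-byte timing budget can be worked out on a PC. This is only an illustrative sketch: it assumes an 80 MHz system clock (the figure mentioned elsewhere in this thread) and the Propeller 1's hub access slot of once every 16 system clocks per cog. The variable names are made up for the example.

```python
# Timing budget for one SPI byte at 20 Mbit/s (illustrative arithmetic only).
SYS_CLK_HZ = 80_000_000       # assumed system clock
SPI_BIT_RATE = 20_000_000     # bit rate quoted in the post
HUB_WINDOW_CLOCKS = 16        # Propeller 1: one hub slot per cog every 16 clocks

clocks_per_bit = SYS_CLK_HZ // SPI_BIT_RATE                   # system clocks per SPI bit
clocks_per_byte = 8 * clocks_per_bit                          # system clocks per SPI byte
hub_windows_per_byte = clocks_per_byte // HUB_WINDOW_CLOCKS   # hub slots per byte
instruction_slots = clocks_per_byte // 4                      # 4-clock instruction slots per byte
byte_time_ns = 1e9 * 8 / SPI_BIT_RATE                         # duration of one byte in ns

print(clocks_per_byte, hub_windows_per_byte, instruction_slots, byte_time_ns)
```

Under these assumptions a byte lasts 32 system clocks, which is exactly two hub windows. The feeding loop therefore has about eight 4-clock instruction slots per byte, and a hub read uses more than one of them, which is why keeping the loop aligned to the hub window matters.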
  • Kye Posts: 2,200
    edited 2012-07-06 18:05
    You can do 20 Mb/s output with the Propeller at 80 MHz without the video generator. It's the 10 Mb/s input without the video generator that's the bottleneck. Please see either FSRW or the FATEngine SPI driver code.

    Thanks,
  • ctwardell Posts: 1,716
    edited 2012-07-06 19:10
    Kye wrote: »
    You can do 20 Mb/s output with the Propeller at 80 MHz without the video generator. It's the 10 Mb/s input without the video generator that's the bottleneck. Please see either FSRW or the FATEngine SPI driver code.

    Thanks,

    Ahle2 is talking about using the video generator to output SPI at 20 MHz while freeing the cog to do other work.

    C.W.
  • Ahle2 Posts: 1,179
    edited 2012-07-07 00:17
    @Kye
    I've been looking at the code for both FSRW and the FATEngine... and as far as I can tell they only burst out bytes at 20 Mbit/s. The real continuous performance from hub RAM to SPI is far less.
    I'm talking about a loop of "rdbyte-from-HUB-output-to-SPI-RAM" with a real bandwidth of 20 Mbit/s without any kind of hiccup. That's impossible with the technique used in the objects you are talking about, because you can't read data from the hub while a byte is being sent.

    Since the video generator obviously can't read data, I am not able to boost read performance, though.
    But I think it's possible to do more than 20 Mbit/s (hopefully 40 Mbit/s) for I2S and other "one-way protocols" with my technique. And that's where it's getting interesting.
  • Ahle2 Posts: 1,179
    edited 2012-07-07 00:28
    When the LUT at addresses 0-255 in cog memory is in place and the video generator is set up in a certain way...

    This
    writeSPISpeed           movi    frqa,                  #00_0000_0                ' Write out data.                        
                            shl     phsb,                  #1                           '
                            shl     phsb,                  #1                           '
                            shl     phsb,                  #1                           '
                            shl     phsb,                  #1                           '
                            shl     phsb,                  #1                           '
                            shl     phsb,                  #1                           '
                            shl     phsb,                  #1                           '
    
    
    

    Can be replaced with
    waitvid   CISO, 0-0 ' Writes a byte (both clock and data)

    "0-0" can be any byte value and will point to the LUT.
    Both clock and data are interleaved in the lookup.
  • Ahle2 Posts: 1,179
    edited 2012-07-07 02:12
    I have located the code now. But since it's a complete memory suite for the C3 and it's also a little bit messy ATM, I will only provide key fragments.

    Here's the code to generate the interleaved clk+data LUT from address 0 to 255
    '-----------------------------------------------------------
    'Pre-calculate spi-byte-data and interleave into the 32 bit LUT
    '-----------------------------------------------------------
    programStart  mov       count1, #255
    precalcLoop1  mov       tmpVal, count1
                  movd      lutAddress1, count1
                  mov       count2, #8
    bitLoop       shr       tmpVal, #1                      wc
    lutAddress1   rcl       0-0, #4
                  djnz      count2, #bitLoop 
                  djnz      count1, #precalcLoop1
                  mov       0, #0
    '-----------------------------------------------------------
    'Pre-calculate spi-clock-data and interleave into the 32 bit LUT
    '-----------------------------------------------------------
                  mov       count1, #255
    precalcLoop2  movd      lutAddress2, count1
                  movd      lutAddress3, count1
                  mov       dummy, #0                       wz
    lutAddress2   muxnz     0-0, spiClkLowBits
    lutAddress3   muxz      0-0, spiClkHighBits
                  djnz      count1, #precalcLoop2  
                  mov       dummy, #0                       wz
                  muxnz     0, spiClkLowBits
                  muxz      0, spiClkHighBits
    '
    '
    Further down
    '
    '
    spiClkHighBits long     _10_00_10_00_10_00_10_00_10_00_10_00_10_00_10
    spiClkLowBits  long     _00_10_00_10_00_10_00_10_00_10_00_10_00_10_00
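
The two loops above can be modeled on a PC to see what each LUT entry ends up holding. This is a host-side Python sketch, not PASM, and not the author's code. The leading digits of the two clock mask literals are cut off in the listing, so the sketch assumes the visible pattern simply repeats across all 32 bits (%00_10 per nibble for the high mask, %10_00 per nibble for the low mask). The names `build_entry` and `lut` are hypothetical.

```python
MASK32 = 0xFFFFFFFF
CLK_HIGH_BITS = 0x22222222   # assumed: %00_10 repeated, clock bit set in the first pixel of each bit
CLK_LOW_BITS = 0x88888888    # assumed: %10_00 repeated, clock bit cleared in the second pixel

def build_entry(byte):
    """Model of the PASM loops: spread each data bit over a nibble, then overlay the clock."""
    entry = 0
    tmp = byte
    for _ in range(8):
        carry = tmp & 1                                            # shr tmpVal, #1 wc
        tmp >>= 1
        entry = ((entry << 4) | (0xF if carry else 0x0)) & MASK32  # rcl entry, #4
    entry &= ~CLK_LOW_BITS & MASK32   # muxnz with Z set clears the masked bits
    entry |= CLK_HIGH_BITS            # muxz with Z set sets the masked bits
    return entry

lut = [build_entry(b) for b in range(256)]
print(hex(lut[0x80]))
```

In this model bit 7 of the byte lands in the lowest nibble, which is the part of the long that the video generator shifts out first, so the byte goes out MSB first. A data bit of 1 gives the nibble %0111 (clock high with data 1, then clock low with data 1) and a data bit of 0 gives %0010.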
    


    The video generator is then initialized for 4-color VGA mode on pin group 1 (where the SPI pins are on the C3), and the pixel clock is set to 40 MHz (since each SPI clock equals two "pixels").
    I will not provide code for this since it's the standard procedure.

    For it to work, I need to set up the 4 "colors" as shown below.
    ' CISO stands for "Color In Spi Out" and can be seen as a lookup table to convert
    ' the 32bit interleaved SPI clk/data, "pixel data", to the corresponding SPI pins on the C3
    
    
                  '        11       10       01       00       <- Incoming "Color" value
                  '         |        |        |        |
    CISO          long 001010_00001000_00000010_00000000   '<- Outgoing SPI data on pin 8 and pin 10 (serial out and clock)
    


    When everything is set up correctly, a simple
    waitvid   CISO, X
    


    will send a full byte on the SPI bus.
    X is the byte to send.
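
To see why a single waitvid is enough, the colour lookup can be simulated on a PC. The sketch below is a host-side Python model under stated assumptions, not the author's code: the full CISO value is assumed to be $0A080200 (the literal in the post is truncated), the data and clock lines are taken as bits 1 and 3 of the pin group as read off that value, the clock is assumed high in the first pixel of each bit, and the receiver is assumed to sample on the rising clock edge. The function names are hypothetical.

```python
CISO = 0x0A080200    # assumed full "colour" long; one output byte per 2-bit colour value
DATA_PIN_BIT = 1     # bit of the pin group driven by colour %01 (serial data)
CLK_PIN_BIT = 3      # bit of the pin group driven by colour %10 (clock)

def make_pixels(byte):
    """Build the interleaved clock/data long for one byte (same layout as one LUT entry)."""
    pixels = 0
    for i in range(8):                          # i = 0 is the first bit sent (the MSB)
        bit = (byte >> (7 - i)) & 1
        pixels |= (0b10 | bit) << (4 * i)       # first pixel: clock high plus data
        pixels |= bit << (4 * i + 2)            # second pixel: clock low plus data
    return pixels

def waitvid(colors, pixels):
    """Return the 16 pin-group bytes a 4-colour waitvid shifts out, lowest pixel first."""
    out = []
    for i in range(16):
        color = (pixels >> (2 * i)) & 0b11
        out.append((colors >> (8 * color)) & 0xFF)
    return out

def receive(samples):
    """Model an SPI receiver: sample the data line on each rising clock edge, MSB first."""
    value, prev_clk = 0, 0
    for s in samples:
        clk = (s >> CLK_PIN_BIT) & 1
        if clk and not prev_clk:
            value = (value << 1) | ((s >> DATA_PIN_BIT) & 1)
        prev_clk = clk
    return value

print(hex(receive(waitvid(CISO, make_pixels(0xA5)))))
```

In this model every byte value survives the round trip, which is the property the technique relies on: once the long is handed to the video generator, all eight clock pulses and data bits come out without further instructions.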
  • Ahle2 Posts: 1,179
    edited 2012-07-07 03:33
    I will start experimenting with an I2S driver using this method.
    PCM1795DB seems like a good candidate for my experiments, has anyone tried it?
  • kuroneko Posts: 3,623
    edited 2012-07-07 03:47
    Ahle2 wrote: »
    @kuroneko
    Because then I can't output both clock and data simultaneously without any pasm intervention.
    So presumably you'll still have to keep waitvid busy otherwise you'll get undefined data on the bus. Which still doesn't explain why you can't use the PLL clock as is, i.e. instead of PLL mode 1 you use 2 which gives you active video h/w and I/O pins for the clock. I don't see how this requires PASM intervention.
  • Mark_T Posts: 1,981
    edited 2012-07-07 07:34
    Ahle2 wrote: »
    I will start experimenting with an I2S driver using this method.
    PCM1795DB seems like a good candidate for my experiments, has anyone tried it?

    No, but it's got current outputs (needs a dual RtR opamp on the output) and has an SPI interface; the WM8759 is cheaper, is simpler to solder (SOIC14), can drive headphones directly and doesn't need any SPI programming, so it takes up fewer Prop pins.
  • Ahle2 Posts: 1,179
    edited 2012-07-07 09:45
    "the WM8759 is cheaper"
    I can almost buy 10 x WM8759 for the same price as one PCM1795DB. I think I will order 6 pcs for some experimentation! :)
  • porcupine Posts: 80
    edited 2014-03-23 18:16
    Hello, sorry to wake up a very old thread, but I've become curious.

    I've got an MCP4922 (two-channel 12-bit through-hole DAC) hooked up via SPI to my SpinStudio board. I can successfully drive audio-rate (48 kHz) output through it, but only on one channel. Once I get to two, the cog I'm using can't keep up and I have to drop the rate below 20 kHz. I'm doing this all in PropGCC (C++ code, LMM mode, FWIW).

    Anyways, came across this old post, and was wondering if the WAITVID technique is applicable here. Seems like a lot of the time in my loop is spent iterating the bits in the word to toggle the serial output line; seems like waitvid could be used to send all 16 bits per channel (or 32 bits for both channels?) at once?

    Unfortunately I'm also a bit of a Propeller newbie (I've had this board for years and am just learning it now), and PASM is unfamiliar to me, the video generation hardware even more so.

    Any tips on how to set up the video hardware for doing SPI this way? I'd _love_ to be able to push high bitrate audio out this way to the DAC.
  • kuroneko Posts: 3,623
    edited 2014-03-23 19:28
    porcupine wrote: »
    Any tips on how to set up the video hardware for doing SPI this way?
    This thread (thread 143679) may help you to get an idea.