SPI communication between Parallax P2 and Raspberry Pi

disha_sudra · 2023-09-18 05:39

@evanh said:
Is that the SPI clock rate? If so then I'm impressed. And I assume your Prop2 is at 340 MHz and therefore achieving just short of sysclock/3. SPI receiver smartpin has proven to be good for that. Transmitter not so much - because of the lag issue.

Yes, the P2 runs at 340 MHz, but even with a 110 MHz SPI clock on the RPi I am not getting data any faster. Is there any workaround we can follow to receive the data from the RPi on the P2 in 1.5 to 2 ms? Also, if we increase the RPi SPI clock frequency from 110 MHz to 130 MHz, we do not receive the data properly.

evanh · 2023-09-18 06:10

@disha_sudra said:
Is there any workaround that we can follow to receive data from RPi to P2 in 1.5 to 2ms?

You still need to separate overheads from data rate. If we don't know the achievable data rate of the interface then we can't gauge how much of that 3 ms is overheads. Then, after that, figure out where the overheads are coming from.

Also if we increase RPI spi clock frequency from 110 MHZ to 130 MHZ, we are not getting data properly.

That would be right. It exceeds sysclock/3 rate. 340 / 3 = 113 MHz.

evanh · 2023-09-18 06:25

Theoretical peak data rate is 110 Mb/s. 19200 x 8 = 153600 bits to transfer. So ideal best case is 153600 / 110e6 = 1.4 ms. I guess, unlike SD card block transfers, there isn't any need for block gaps. It can transfer everything in one long burst. So the only overhead should be the initiation handshake.
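The arithmetic in the last two posts can be checked with a few lines of C. This is only a sketch: the function names are made up for illustration, and the sysclock/3 ceiling is the figure quoted above, not something the code derives.

```c
#include <stdint.h>

/* Highest SPI clock the receiver smartpin is said to handle: sysclock / 3. */
static double max_spi_clock_hz(double sysclock_hz)
{
    return sysclock_hz / 3.0;
}

/* Ideal burst time in milliseconds, ignoring all handshake overheads. */
static double ideal_transfer_ms(uint32_t bytes, double spi_clock_hz)
{
    return (double)bytes * 8.0 / spi_clock_hz * 1000.0;
}
```

With a 340 MHz sysclock the ceiling comes out at about 113.3 MHz, and 19200 bytes at 110 MHz takes roughly 1.4 ms.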

EDIT: A bonus is if a spare cog is used to be the SPI slave then it can effectively operate in the background. Not taking up any foreground processing time at all. At least not until notification of transfer completion. The RPi controls the transfer.

I'm not sure if this is an advantage for you though. Why do you need such a short transfer time?

evanh · 2023-09-18 06:45

@disha_sudra said:

@evanh said:
Question: Is disha_sudra and chintan_joshi the same person?

We are working together on same project.

Ah, cool. Thanks.

disha_sudra · 2023-09-18 06:50

There is a buffer limit in the P2, so we fixed the buffer size at 19200 bytes.

The whole scenario is as below:

The RPi should transfer 19200 bytes in 1.5 to 2 ms.
After receiving 19200 bytes, the P2 starts processing them while, in parallel, a cog and the RPi can start filling the next 19200 bytes. (The P2's processing time here is less than 2 ms.) So we need the RPi to transfer 19200 bytes in less than 2 ms to prevent any delay in printing.

evanh · 2023-09-18 09:03

So a sustained 76.8 Mb/s. Yeah, that's substantial. And as you're saying buffer size is naturally constrained.
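That figure is simply the buffer size divided by the time budget. A minimal sketch of the calculation, with a function name invented for this illustration:

```c
#include <stdint.h>

/* Data rate in Mb/s needed to move `bytes` within `window_ms` milliseconds. */
static double required_rate_mbps(uint32_t bytes, double window_ms)
{
    return (double)bytes * 8.0 / (window_ms * 1e-3) / 1e6;
}
```

For 19200 bytes in 2 ms this gives 76.8 Mb/s. At the 1.5 ms end of the range the requirement rises to 102.4 Mb/s, which is close to the 110 Mb/s peak discussed earlier.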

Good news is a background slave would work fine. I'll continue with finishing that. Others will make use of it too.

evanh · 2023-09-22 10:35

I've managed to stay sane long enough to refine the SPI handler a little more. The events hardware isn't all that well documented so needed a rock solid understanding.

Here's the newest [revised again] version of the command handler I've been working on:

'==========================================
'  Wait for a command from master
'==========================================
' If CLK glitches while CS still high then a fast reset of smartpins occurs
' If CS glitches then RDPIN returns a zero and jump table back to "cmdloop"
' 
cmdloop
        testp   p_cs   wz          ' CS high?  (Z=1 is yes)
    if_nz   jmp     #cmdloop       ' wait until CS high, avoids random garbage with CS low
.retry
        dirl    p_txrx             ' reset both Tx and Rx smartpins, clears buffers
        dirh    p_txrx             ' enable both Tx and Rx smartpins
.cawait
        jpat    #.retry            ' CS and CLK rise (clocking edge), clock glitch while CS high
        testp   p_mosi   wz        ' cmd+addr received?  (Z=1 is yes)
    if_nz   jnse2   #.cawait       ' wait until CA complete or CS pin rise

        rdpin   cmdaddr, p_mosi    ' read smartpin's buffer (and ACK the smartpin)
        rev     cmdaddr            ' 32-bit endian bit-swap
        getnib  pb, cmdaddr, #7    ' first 4 bits is command
        setnib  cmdaddr, #0, #7    ' remove command from the address
        altgb   pb, #cmdjmp        ' lookup table of 8-bit pointers
        getbyte pb
        jmp     pb
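
For readers who don't read PASM2, the decode step after RDPIN can be sketched in C. It mirrors the listing: reverse the bit order of the received word (the REV), then treat the top four bits as the command and the remaining 28 bits as the address. The names `rev32`, `decode_cmdaddr` and `struct cmdaddr` are illustrative only and not part of the driver, and the jump-table dispatch is left out.

```c
#include <stdint.h>

/* Reverse the bit order of a 32-bit word (what the REV instruction does). */
static uint32_t rev32(uint32_t x)
{
    uint32_t r = 0;
    for (int i = 0; i < 32; i++) {
        r = (r << 1) | (x & 1u);
        x >>= 1;
    }
    return r;
}

struct cmdaddr {
    uint8_t  cmd;   /* top nibble: command number 0..15 */
    uint32_t addr;  /* lower 28 bits: longword address  */
};

/* Split a raw word read from the Rx smartpin into command and address. */
static struct cmdaddr decode_cmdaddr(uint32_t raw)
{
    uint32_t word = rev32(raw);
    struct cmdaddr ca;
    ca.cmd  = (uint8_t)(word >> 28);
    ca.addr = word & 0x0FFFFFFFu;
    return ca;
}
```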

disha_sudra · 2023-09-22 10:51

How can we use this in our existing code mentioned in comment #20?

evanh · 2023-09-22 11:46

Getting there. Still further to go. Just letting you know of progress.

disha_sudra · 2023-09-22 11:48

@evanh said:
Getting there. Still further to go. Just letting you know of progress.

That's very kind of you. I'm actually unfamiliar with assembly code.

Beau Schwabe · 2023-09-22 18:57

Not to derail any of your efforts, but I am curious why you are using a SPI interface as opposed to just a standard USART interface between the Pi and the Propeller.

jmg · 2023-09-22 19:09

@"Beau Schwabe" said:
Not to derail any of your efforts, but I am curious why you are using a SPI interface as opposed to just a standard USART interface between the Pi and the Propeller.

I think the Pi UART is nowhere near fast enough ?

evanh · 2023-09-22 20:10

That's a point. If the RPi can do 100 Mb/s UART with 32-bit word framing, then that'd be ideal for the Prop2.

Most likely the speed is a problem because regular hardware implementations of UARTs have x16 input sampling of each bit spacing. Such would then need 1600 MHz sampling clock on the RX serial pins.

jmg · 2023-09-22 20:45

@evanh said:
That's a point. If the RPi can do 100 Mb/s UART with 32-bit word framing, then that'd be ideal for the Prop2.

Most likely the speed is a problem because regular hardware implementations of UARTs have x16 input sampling of each bit spacing. Such would then need 1600 MHz sampling clock on the RX serial pins.

Finding actual Pi UART use cases above even 4Mbd is not easy.

The web finds paper numbers like this
UART: 476baud to 31.25Mbaud (theoretical maximum)
SPI: 3.8 kHz to 250 MHz (theoretical maximum)
I2C: 400kbps

but the Pi also has small FIFOs, so you would need to lock a core to a UART to manage it

Google also finds this

If the system clock is 250 MHz and the speed field is zero the SPI clock frequency is 125 MHz. The practical SPI clock will be lower as the I/O pads can not transmit or receive signals at such high speed

and this
https://forums.raspberrypi.com/viewtopic.php?t=192447#p1482053
mentions 'parallel connection' but also baud rate ??
The measured 2G/hr is ballpark 550k Bytes/sec averaged, so they are well short of sustaining the set 10.5MBd

There was a lot of problems but it can be done. I know that it took a lot of time but I created comunnication beetwen RPi and other side devices (parallel connection) using baudrate = 10500000 (10,5Mb).... Now transfer 2GB of data takes about 1 hour

evanh · 2023-09-22 21:32

@jmg said:
The web finds paper numbers like this
UART: 476baud to 31.25Mbaud (theoretical maximum)
SPI: 3.8 kHz to 250 MHz (theoretical maximum)

From that, I'll presume a 500 MHz sysclock. (31.25 x 16 = 500)

EDIT: Haven't read them but found these:
https://datasheets.raspberrypi.com/bcm2835/bcm2835-peripherals.pdf
https://datasheets.raspberrypi.com/bcm2711/bcm2711-peripherals.pdf

evanh · 2023-09-23 22:33

An answer to JMG's earlier question from https://forums.parallax.com/discussion/comment/1544113/#Comment_1544113

@jmg said:
How long does CS need to be high between bursts ?

Now that I've got something near complete: I haven't tested it for real, but glancing at the typical code path it looks like about 32 sysclocks from detecting a CS rise until ready for a subsequent CS low. So about 100 ns at 340 MHz.

3/4 of that time is the loop period of prior function. Not all are the same length so there is going to be variations. Maybe I could utilise interrupts to make that more responsive ...

EDIT: Interrupts now working smoothly!
I'm not sure what the worst case response will be though. It only needs two instructions (4 sysclock ticks) within the ISR to be ready. Worst case IRQ blocking is just branch instructions. So that's another 4 ticks I guess, plus the 4 ticks for the branch into the ISR itself. So 12 ticks total, maybe. So we're down to a relatively consistent 35 ns hopefully.

EDIT2: Okay, not that great. It needs 20 ticks when SPI clock frequency is sysclock/8. Didn't need any packing at all when sysclock/80. I'm thinking the old method probably needed over 120 ticks.

And the measure is not exactly how long the CS high pulse is but rather the time from CS rise until the shifter smartpins come out of reset. The CS high pulse can be shorter as long as the subsequent SPI clocking is delayed enough.

EDIT3: sysclock/3 and sysclock/4 are both 24 ticks. So at 340 MHz that's 70 ns minimum from CS rise until clocks can start. CS can go low again anytime in between.

evanh · 2023-09-24 23:00

Huh, turns out I was getting unexpected clock glitch events. By clearing that event state along with the shifter reset it has now closed the margins some more. And also explains some erratic-ness that would come and go.

New figure based on previous measuring technique is now 12 ticks at sysclock/3. Which is exactly what I was originally hoping for. But I should hook up the scope because I suspect there is a little more when viewed from the physical pin timings ...

EDIT: Right, it's actually 13 ticks on the scope. And that's to the shifter clocking edge. Which can be either the first or second clock edge depending on CPHA of SPI clock mode. So at 340 MHz sysclock that's 38 ns minimum between CS rise and first clocking edge of next command. With CS going low again anytime in between.
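
The tick counts quoted in the last few posts convert to time as ticks divided by sysclock. A small helper, with a name chosen just for this illustration:

```c
/* Convert a count of sysclock ticks to nanoseconds. */
static double ticks_to_ns(unsigned ticks, double sysclock_hz)
{
    return (double)ticks * 1e9 / sysclock_hz;
}
```

At 340 MHz, 12 ticks is about 35 ns, 13 ticks about 38 ns and 24 ticks about 71 ns, in line with the figures given above.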

evanh · 2023-09-25 00:32

Well, I got a newly observed oddity now - The Tx smartpin doesn't forget its data during a reset. If re-enabled it'll just repeat the old data word over and over. It makes sense because both initial loading modes allow a data word to be placed directly into the shifter, bypassing the buffer, by starting with DIR low (Which would normally clear all the working registers in many other smartpin modes).

It could be argued it's not an issue but I think I should probably do something to clear the data.

EDIT: [see answer to JMG below]

jmg · 2023-09-25 04:08

@evanh said:
Well, I got a newly observed oddity now - The Tx smartpin doesn't forget its data during a reset. If re-enabled it'll just repeat the old data word over and over. It makes sense because both initial loading modes allow a data word to be placed directly into the shifter, bypassing the buffer, by starting with DIR low (Which would normally clear all the working registers in many other smartpin modes).

Do you mean in slave mode, if more clocks arrive before new data, it repeats the last sent data word ?
That underrun behaviour is somewhat common I think.

evanh · 2023-09-25 07:01

This is after a CS high reset of the smartpins - software performs a DIRL-DIRH combo. Yes, as the slave. The Tx buffer register retains prior data and happily feeds it out when the smartpin is re-enabled. The software needs to either leave the smartpin in reset or feed it a zero value to keep the data pin quiet. In the past I've always left it in reset until needed.

Now though, the Tx smartpin is live all the time. This is so that I can pace out a known number of SPI clocks to the start of a reply to a command.

PS: The fix is simply add a WYPIN #0,txpin. Just had to work out the best place for it was all. Most convenient place is after the smartpin reset.

ISR_cs_rise    ' abandon prior command handling
        dirl    p_txrx        ' reset both Tx and Rx smartpins
        dirh    p_txrx        ' enable both Tx and Rx smartpins
        pollpat    ' clear any prior clock glitch event, placed after smartpin enable due to more I/O stages
        wypin   #0, p_miso    ' clear the Tx buffer and shifter

The down side is it is a little laggy at the physical pin. The Tx pin transitions about 3 sysclock ticks after the clocking edge. Which is no issue internally, the shifter still gets cleared before the bit-shift occurs. But that can still effect a high first bit externally when at the minimum 12 ticks.

Upping the required minimum gap, of 12 ticks from CS rise to new clocks, would sort it.

EDIT: Err, that was just plain wrong. I'd unwittingly removed the data source in my testing. The smartpin clearing has to be during reset. The minimum gap needs adjusted accordingly ...

ISR_cs_rise    ' abandon prior command handling
        dirl    p_txrx        ' reset both Tx and Rx smartpins
        wypin   #0, p_miso    ' clear the Tx buffer and shifter
        dirh    p_txrx        ' enable both Tx and Rx smartpins
        pollpat    ' clear any prior clock glitch event, placed after smartpin enable due to more I/O stages

EDIT2: Bugger, even with the sooner WYPIN it's still 19 sysclock ticks from CS rise until Tx pin clears on the scope. So that puts a minimum of 20 ticks on that gap. 59 ns at 340 MHz sysclock.

evanh · 2023-09-27 06:47

Righty-ho, first draft with just the minimum block read and write:

EDIT: Couple of fixes made and C tester program added.
EDIT2: Added new documenting comment:

CMD4 BlockRead is more speed limited than CMD5. Where CMD5 can achieve sysclock/3 rate, CMD4
is restricted to maybe sysclock/7. Maybe it can do better, I haven't fully tested.

EDIT3: Another typo in the docs. I'd written "24-bit address" but should have been "28-bit address". The diagram was right though. 4 + 28 = 32 bits for the command+address.
EDIT4: I've both fixed up and cleaned up the loopback tester.

disha_sudra · 2023-09-28 09:58

It seems that the above code is written for sending data in a loop on the P2. In my case I do not need the P2 to fill sbuff, as it should be filled with data received from the R-Pi. So I modified the above code to only read data from MOSI, treating the P2 as a slave. I am confused by the "Block Read Test" portion: it should read data from its SPI pin, so why are we using wypin there? Shouldn't it be rdpin?

for( j = 0; j < 12; j++ )
    {
        printf( "Block Read Test  " );
        filldat( (uint8_t *)sbuff, BUFFSIZE/4 );
//        wypin( LCLK, 5 );
//        waitus( 10 );
        __asm {
        dirl    #LMOSI
        mov     pb, ##0x4000_0000
        add     pb, addr
        rev     pb
        wypin   pb, #LMOSI     // load CMD4 (BlockRead) into shifter
        outh    #LCS           // signal end of prior SPI transaction
        waitx   #4
        outl    #LCS           // signal start of SPI transaction
        dirh    #LMOSI         // ready for clocks
        dirl    #LCLK
        dirh    #LCLK
        wypin   #300, #LCLK    // fire!
        wypin   #0, #LMOSI     // clear buffer register
        }
        _waitms( 1 );                  // pause for clocking to complete
//        _pinh( LCS );                  // signal completed SPI transaction
        for( i = addr; i < addr + 6; i++ )
            printf( " %08x ", sbuff[i] );
        puts("");
    }

evanh · 2023-09-28 10:25

Don't use that as an example for your use. It was just my loopback tester converted to C. It is only useful for seeing the simplicity of starting the slave cog.

PS: The main feature of the driver is the blank sheet of sbuff[]. The Rpi gets to read and write wherever it likes within sbuff[]. As does the other seven cogs of the Prop2. You can make it whatever size you like, and chop it up into however many structures you like.

I'll add this note to the docs ...

evanh · 2023-09-28 10:39

Just to be clear, the driver takes care of all I/O at the Prop2 end. What you need to do now is write the Rpi I/O to communicate with it.

What you are looking at in that loopback tester is just a quick hack of mine at simulating a SPI master talking to the slave.
Sorry, I was maybe a little tired and shouldn't have posted that.

evanh · 2023-09-28 10:58

Here's a starter program (untested) for starting up the slave driver and then reporting any changes to sbuff[0].

evanh · 2023-09-29 04:58

Oh, I probably should highlight there are a couple of pin order restrictions because of smartpin routing limitations.

MISO and MOSI pins have to be a neighbouring pair. Also, it needs a minor update to allow either order. The existing source code assumes MISO comes first.

The second restriction on pin order is CLK pin has to be within three pins of both MISO and MOSI pins.

The driver init routine does not complain at this stage. It just fails silently.

EDIT: Here's a minor update allowing either MISO or MOSI to come first.

EDIT2: MISO is always driven by the slave. It doesn't float when CS high.
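
Since the init routine doesn't report a bad pin assignment, a check like the following could be run before starting the driver. It is a sketch that encodes only the two rules stated above (MISO and MOSI adjacent, CLK within three pins of both). It does not model any other smartpin routing detail, and `spi_pins_ok` is a hypothetical name, not part of the driver.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Return true if the pin numbers satisfy the two ordering rules above. */
static bool spi_pins_ok(int clk, int miso, int mosi)
{
    if (abs(miso - mosi) != 1)      /* MISO and MOSI must be neighbours */
        return false;
    if (abs(clk - miso) > 3)        /* CLK within three pins of MISO */
        return false;
    if (abs(clk - mosi) > 3)        /* CLK within three pins of MOSI */
        return false;
    return true;
}
```

For example, CLK = 26 with MISO = 24 and MOSI = 25 passes, while moving CLK to pin 29 fails the distance rule.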

disha_sudra · 2023-09-29 11:29

I am trying to send data using the R-Pi code below. On the P2 I used the latest code you provided (spislavetest1.c and spi_slave_truncated.spin2). I am wondering why I am not able to receive data on the P2. I tried changing cmode and frequency as well.

EDIT: I have used the same pins and configuration as you did. One more query: how is sbuff on the P2 filled when I send data from the R-Pi? Is it filled by the driver itself, or is a receiving function required?

#include <sys/ioctl.h>
#include <linux/spi/spidev.h>
#include <fcntl.h>
#include <iostream>
#include <cstdint>    // uint32_t
#include <cstdio>     // printf
#include <cstring>    // memset

#define CHUNK 19200

int fd;
unsigned char chunk[CHUNK] = {0};

void Txfunc(unsigned char chunkx[], int l) {
    struct spi_ioc_transfer spi;

    memset (&spi, 0, sizeof (spi));
    spi.tx_buf        = (unsigned long)chunkx;
    spi.len           = l;

    // Report failures instead of silently discarding the result
    if (ioctl (fd, SPI_IOC_MESSAGE(1), &spi) < 0)
        perror ("SPI_IOC_MESSAGE");
}

int main () {
    fd = open("/dev/spidev0.0", O_RDWR);
    uint32_t speed = 10000000;

    ioctl (fd, SPI_IOC_WR_MAX_SPEED_HZ, &speed);

    for (int i = 0, var = 1; i < CHUNK; i++){    // i < CHUNK: stay inside the array
        chunk[i] = var;
        var++;
        if(var > 100){
            var = 1;
        }
        //printf("p = %d\n",chunk[i]);
    }
    printf ("Sending data....\n");      

    for(int ix = 0; ix < 56;ix++){
        Txfunc(chunk, CHUNK);
    }

    printf("finished sending data...\n");
    return 0;
}

evanh · 2023-09-29 13:05

It uses commands like most SPI devices. CMD5 for the master to send data. It'll just ignore an invalid command and any associated data.

  Example of writing a data block into the slave's buffer
  _____    starting from address $4AF (longword granular)           ____
   CS  |_______________________________________________ ... _______|
         CMD5    28-bit address ($4AF)      DATA in multiples of 32 bits
         0101 0000000000000000010010101111 xxxxxxxxxxxx ... xxxx

EDIT: This should get you going (Assuming it is working otherwise):

#include <sys/ioctl.h>
#include <linux/spi/spidev.h>
#include <fcntl.h>
#include <iostream>
#include <cstdint>    // uint32_t
#include <cstdio>     // printf
#include <cstring>    // memset

#define CHUNK 19200

int fd;
unsigned char chunk[CHUNK] = {0};

void Txfunc(unsigned char chunkx[], int l) {
    struct spi_ioc_transfer spi;

    memset (&spi, 0, sizeof (spi));
    spi.tx_buf        = (unsigned long)chunkx;
    spi.len           = l;

    // Report failures instead of silently discarding the result
    if (ioctl (fd, SPI_IOC_MESSAGE(1), &spi) < 0)
        perror ("SPI_IOC_MESSAGE");
}

int main () {
    fd = open("/dev/spidev0.0", O_RDWR);
    uint32_t speed = 10000000;

    ioctl (fd, SPI_IOC_WR_MAX_SPEED_HZ, &speed);

    for (int i = 0, var = 1; i < CHUNK; i++){    // i < CHUNK: stay inside the array
        chunk[i] = var;
        var++;
        if(var > 100){
            var = 1;
        }
        //printf("p = %d\n",chunk[i]);
    }
    chunk[0] = 0x50;
    chunk[1] = 0;
    chunk[2] = 0;
    chunk[3] = 0;
    printf ("Sending data....\n");      

    for(int ix = 0; ix < 56;ix++){
        Txfunc(chunk, CHUNK);
    }

    printf("finished sending data...\n");
    return 0;
}
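
The four header bytes in the program above hard-code CMD5 with address 0. For other addresses the header could be built with a helper like this one. It is a sketch based on the diagram above (a 4-bit command followed by a 28-bit longword address, most significant bit first on the wire) and assumes the RPi sends each byte MSB first, which is the spidev default. `propspi_header` is a made-up name.

```c
#include <stdint.h>

/* Fill hdr[0..3] with a 4-bit command followed by a 28-bit longword address,
   most significant byte first. */
static void propspi_header(uint8_t hdr[4], unsigned cmd, uint32_t addr)
{
    uint32_t word = ((uint32_t)(cmd & 0xFu) << 28) | (addr & 0x0FFFFFFFu);
    hdr[0] = (uint8_t)(word >> 24);
    hdr[1] = (uint8_t)(word >> 16);
    hdr[2] = (uint8_t)(word >> 8);
    hdr[3] = (uint8_t)word;
}
```

Calling `propspi_header(chunk, 5, 0x4AF)` would produce the bytes 0x50, 0x00, 0x04, 0xAF, matching the bit pattern in the diagram.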

disha_sudra · 2023-10-06 12:18

Hi @evanh

Thanks for the code, I tested it. You did amazing job.

I modified the code to check the receive time for multiple buffers. I applied some logic that notes the time when the first and last bytes of the buffer arrive; I'm not sure it is correct, but I tried it as given below. I found that the time is almost always the same, around 1.8 milliseconds, but at the start some buffers take longer. Why is this time variation happening?

#include <Simpletools.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>    // memset
#include <math.h>
#include <sys/time.h>


enum {
    //_xinfreq = 20_000_000,
    _xtlfreq = 20_000_000,
    _clkfreq = 360_000_000,

    DOWNLOAD_BAUD = 230_400,
    DEBUG_BAUD = DOWNLOAD_BAUD,

    CS = 29, //Rpi24
    CLK = 26, //rpi 23
    MISO = 24, //rpi 21
    MOSI = MISO + 1, // rpi 19

    CMODE = 3,  // CPOL is bit1, CPHA is bit0

    BUFFSIZE = 19200,    // longwords

    SPINS = CS | CLK<<6 | MISO<<12 | MOSI<<18 | CMODE<<30,
};



struct __using("spi_slave_truncated.spin2") rpi;


//static uint32_t  sbuff[BUFFSIZE];
unsigned char sbuff[BUFFSIZE];

static long stackxrts1[BUFFSIZE];
int countx = 0;


void  main(void) {
    high(30);
    high(31);

    printf( "   clkfreq = %d   clkmode = 0x%x\n", _clockfreq(), _clockmode() );

    rpi.start_SPI_slave( SPINS, sbuff, BUFFSIZE );
    _waitatn();    // wait for cog to start up


    // Slave cog is ready and waiting.  At this stage the tester program has not setup any of its pins,
    // so we are going to be glitching them from the start.
    int t1, t2;
    int flag = 1;
    while (1) {

        if ( (sbuff[0] == 1) && flag == 1 ) {
            t1 = _getus();
            flag = 0;
        }

        if( sbuff[BUFFSIZE-1] == 138 ) {
            t2 = _getus();
            countx++;
            printf("countx = %d\n t2-t1 = %d\n", countx, t2-t1);
            memset(sbuff, 0 , BUFFSIZE);
            flag = 1;
        }
    }
}

The output:

( Entering terminal mode.  Press Ctrl-] or Ctrl-Z to exit. )                    
   clkfreq = 360000000   clkmode = 0x10011fb                                    
countx = 1                                                                      
 t2-t1 = 4614                                                                   
countx = 2                                                                      
 t2-t1 = 4618                                                                   
countx = 3                                                                      
 t2-t1 = 4618                                                                   
countx = 4                                                                      
 t2-t1 = 4624                                                                   
countx = 5                                                                      
 t2-t1 = 4625                                                                   
countx = 6                                                                      
 t2-t1 = 4628                                                                   
countx = 7                                                                      
 t2-t1 = 1843                                                                   
countx = 8                                                                      
 t2-t1 = 1895                                                                   
countx = 9                                                                      
 t2-t1 = 1843                                                                   
countx = 10                                                                     
 t2-t1 = 1843                                                                   
countx = 11                                                                     
 t2-t1 = 1844                                                                   
countx = 12                                                                     
 t2-t1 = 3975                                                                   
countx = 13                                                                     
 t2-t1 = 1843                                                                   
countx = 14                                                                     
 t2-t1 = 1844                                                                   
countx = 15                                                                     
 t2-t1 = 1845                                                                   
countx = 16                                                                     
 t2-t1 = 1843                                                                   
countx = 17                                                                     
 t2-t1 = 1874                                                                   
countx = 18                                                                     
 t2-t1 = 1843                                                                   
countx = 19                                                                     
 t2-t1 = 3933                                                                   
countx = 20                                                                     
 t2-t1 = 1879                                                                   
countx = 21                                                                     
 t2-t1 = 1880                                                                   
countx = 22                                                                     
 t2-t1 = 1876                                                                   
countx = 23                                                                     
 t2-t1 = 1843                                                                   
countx = 24                                                                     
 t2-t1 = 1843                                                                   
countx = 25                                                                     
 t2-t1 = 3855                                                                   
countx = 26                                                                     
 t2-t1 = 1842                                                                   
countx = 27                                                                     
 t2-t1 = 1845                                                                   
countx = 28                                                                     
 t2-t1 = 1656                                                                   
countx = 29                                                                     
 t2-t1 = 1880                                                                   
countx = 30                                                                     
 t2-t1 = 1843                                                                   
countx = 31                                                                     
 t2-t1 = 3948                                                                   
countx = 32                                                                     
 t2-t1 = 1874                                                                   
countx = 33                                                                     
 t2-t1 = 1875                                                                   
countx = 34                                                                     
 t2-t1 = 1843                                                                   
countx = 35                                                                     
 t2-t1 = 1843                                                                   
countx = 36                                                                     
 t2-t1 = 1842                                                                   
countx = 37                                                                     
 t2-t1 = 1843                                                                   
countx = 38                                                                     
 t2-t1 = 1880                                                                   
countx = 39                                                                     
 t2-t1 = 1881                                                                   
countx = 40                                                                     
 t2-t1 = 1876                                                                   
countx = 41                                                                     
 t2-t1 = 1843                                                                   
countx = 42                                                                     
 t2-t1 = 1842                                                                   
countx = 43                                                                     
 t2-t1 = 1843                                                                   
countx = 44                                                                     
 t2-t1 = 1843                                                                   
countx = 45                                                                     
 t2-t1 = 1880                                                                   
countx = 46                                                                     
 t2-t1 = 1879                                                                   
countx = 47                                                                     
 t2-t1 = 1880                                                                   
countx = 48                                                                     
 t2-t1 = 1843                                                                   
countx = 49                                                                     
 t2-t1 = 1843                                                                   
countx = 50                                                                     
 t2-t1 = 1843                                                                   
countx = 51                                                                     
 t2-t1 = 1842                                                                   
countx = 52                                                                     
 t2-t1 = 1843                                                                   
countx = 53                                                                     
 t2-t1 = 1879                                                                   
countx = 54                                                                     
 t2-t1 = 3907                                                                   
countx = 55                                                                     
 t2-t1 = 1843                                                                   
countx = 56                                                                     
 t2-t1 = 1879

evanh · 2023-10-06 13:02

Did you modify the meaning of BUFFSIZE in the Spin code? It was written to be in longwords but you've used bytes instead.

Use this line to correct: rpi.start_SPI_slave( SPINS, sbuff, BUFFSIZE/4 );

PS: I've posted the version 1 code to a new topic and named it as PropSPI - https://forums.parallax.com/discussion/175543/propspi-emulation-of-a-generic-spi-ram-chip-using-the-prop2/p1
It has a new undocumented feature: notification when a transfer has completed.

Hope this code works (It requires the newer v1 PropSPI and also the attached pllset.spin2):

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>    // memset
#include <math.h>
#include <sys/time.h>


enum {
    //_xinfreq = 20_000_000,
    _xtlfreq = 20_000_000,
    _clkfreq = 360_000_000,

    DOWNLOAD_BAUD = 230_400,
    DEBUG_BAUD = DOWNLOAD_BAUD,

    CS = 29, //Rpi24
    CLK = 26, //rpi 23
    MISO = 24, //rpi 21
    MOSI = MISO + 1, // rpi 19

    CMODE = 3,  // CPOL is bit1, CPHA is bit0

    BUFFSIZE = 19200,    // bytes

    SPINS = CS | CLK<<6 | MISO<<12 | MOSI<<18 | CMODE<<30,
};



struct __using("pllset.spin2") lib;
struct __using("propspi_v1_truncated.spin2") rpi;


unsigned char sbuff[BUFFSIZE];


void  main(void) {
    _pinh(30);
    _pinh(31);

    printf( "   clkfreq = %d   clkmode = 0x%x\n", _clockfreq(), _clockmode() );

    rpi.start_propspi( SPINS, sbuff, BUFFSIZE/4 );
    _waitatn();    // wait for cog to start up


    // Slave cog is ready and waiting.  At this stage the tester program has not setup any of its pins,
    // so we are going to be glitching them from the start.
    unsigned  countx = 0, lastch, t1, t2, usec;

    _pollatn();
    while (1)
    {
        if( sbuff[0] ) usec = 0;    // missed it!
        else usec = 1;    // all good

        while( !sbuff[0] );    // detect start
        t1 = _cnt();
        _waitatn();    // detect end
        t2 = _cnt();
        countx++;
        lastch = sbuff[BUFFSIZE-1];
        memset( sbuff, 0, BUFFSIZE );  // clear ASAP

        if( usec )    // valid measurement
            usec = lib.muldiv65( t2 - t1, 1000000, _clockfreq() );
        printf( " last byte = 0x%02x   countx = %d   t2-t1 = %d us\n", lastch, countx, usec );
    }
}

EDIT: Improved transfer start detection
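
The `lib.muldiv65()` call above converts a tick count to microseconds without overflowing 32 bits. On a platform with 64-bit integers the same conversion can be sketched as follows. This is a stand-in for illustration, not the code inside pllset.spin2, and `ticks_to_us` is a hypothetical name.

```c
#include <stdint.h>

/* Convert elapsed sysclock ticks to whole microseconds.
   The 64-bit intermediate avoids overflow of ticks * 1000000. */
static uint32_t ticks_to_us(uint32_t ticks, uint32_t clkfreq_hz)
{
    return (uint32_t)(((uint64_t)ticks * 1000000u) / clkfreq_hz);
}
```

Because the counter values are unsigned, `t2 - t1` stays correct across a single wrap of the 32-bit counter, so the difference can be passed straight in.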
