SPI communication between Parallax P2 and Raspberry Pi
in C/C++
Hello,
I'm trying to transfer data from a Raspberry Pi (as master) to a Parallax P2. I've attached the RPi code, rpi_master_spi.cpp. I have tested it with an Arduino Mega as the slave and can communicate bidirectionally with this code. Now I want to use the Parallax P2 as the slave. I've tried the shift_in() and shift_out() functions, but they are not working for me. I am using the following pins on the Parallax P2:
SCK 24
DATA 25
SS 26
So, is there any sample code, example or reference link available for P2 to work with SPI?
Comments
For the P2 you will want to look into smart pins. In Spin2, you would start with something like this
con

  SS   = 26
  MISO = 25
  SCLK = 24
Assuming the RPi CPOL is 0, you could setup the SPI input like this:
  m := P_SYNC_RX                          ' spi rx mode
  m |= ((SCLK-MISO) & %111) << 24         ' add SCLK offset (B pin)
  x := %0_00000 | (8-1)                   ' sample ahead of b pin rise, 8 bits
  pinstart(MISO, m, x, 0)                 ' configure smart pin
  pinf(MISO)                              ' reset/disable until used
This SUGGESTED code (two versions) is based on other things I've worked on, but I have not used the P2 as a SPI slave with another device. Ultimately, you'll probably want a cog that monitors the SPI input and updates a buffer on the fly (like serial objects do).
pub shiftin(count, p_buf) | value

  repeat count
    pinl(MISO)                            ' enable MISO
    repeat until pinr(MISO)               ' wait for a byte
    value := rdpin(MISO)                  ' get the value
    pinf(MISO)                            ' reset MISO
    value rev= 31                         ' restore MSBFIRST order
    byte[p_buf++] := value


pub shiftin_pasm(count, p_buf) | value

  org
.loop       drvl    #MISO                 ' enable MISO
            testp   #MISO       wc        ' wait for a byte
  if_nc     jmp     #$-1
            rdpin   pr0, #MISO            ' get the value
            fltl    #MISO                 ' reset MISO
            rev     pr0                   ' restore MSBFIRST order
            wrbyte  pr0, p_buf            ' write to buffer
            add     p_buf, #1             ' update buffer pointer
            djnz    count, #.loop
  end
Hi JonnyMac,
Thanks for reply. I tried this code. However I'm unable to communicate.
I want to set clock frequency of P2 same as Raspberry Pi. And receive data from Raspberry Pi.
Can you please explain in details about how should the m and x be selected. And what is the role of SCLK particularly.
I've saved above code as testcode.spin2 and used it in c code as below.
#include "Simpletools.h"
#include "stdio.h"

struct __using ("testcode.spin2") test;

int main(){
    int buff[50] = {0};
    test.setup();
    while(1){
        test.shiftin(50, buff);
        for (int i=0; i<50; i++){
            printf("%d \n", buff[i]);
        }
    }
}
The details can be found here:
-- https://docs.google.com/document/d/1gn6oaT5Ib7CytvlZHacmrSbVBJsD9t_-kmvjd7nUR6o/edit#heading=h.53grmlqytmbv
I use variable m for the smart pin mode (in this case synchronous RX), and x and y are the registers used by the smart pin (y is 0 in this case). I have used smart pin synchronous RX many times, but always with an internally generated clock. This should not be an issue, though, because the smart pin setup links the clock pin (an input) to the MISO pin. The only restriction is the delta in pin #s. You need to make sure that your clock pin is within 3 pins of the MISO and MOSI that it might be affecting.
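To make the role of m and x more concrete, here is a minimal host-side sketch in C of how the clock-pin offset and bit count from the Spin2 setup above are packed. The helper names b_input_field and sync_rx_x are hypothetical, and the field layout (offset in bits 26..24 of the mode word, bit count minus one in the low five bits of x, bit 5 selecting the sample point) is taken from the expressions and comments in the posted code rather than from a definitive reference.

```c
#include <stdint.h>

/* Hypothetical helper: clock-pin (B input) selector for the smart pin mode
 * word, mirroring ((SCLK-MISO) & %111) << 24 from the Spin2 code.
 * Returns 0xFFFFFFFF when the clock pin is more than 3 pins away from the
 * data pin, which is the restriction mentioned in the post. */
uint32_t b_input_field(int clk_pin, int data_pin)
{
    int delta = clk_pin - data_pin;
    if (delta < -3 || delta > 3)
        return 0xFFFFFFFFu;
    return ((uint32_t)(delta & 0x7)) << 24;
}

/* Hypothetical helper: X register value for synchronous receive.
 * Bit 5 = 0 samples just ahead of the clock edge (as the posted comment
 * says); bits 4..0 hold the number of bits minus one. */
uint32_t sync_rx_x(int bits, int sample_on_edge)
{
    return ((uint32_t)(sample_on_edge ? 1 : 0) << 5)
         | (uint32_t)((bits - 1) & 0x1F);
}
```

With SCLK = 24 and MISO = 25 the offset is -1, which encodes as %111 in the selector field, and 8-bit reception gives x = 7.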
There is no reason to run the P2 at the same clock speed as the RPi. In fact, you might try slowing your SPI output on the RPi. When a byte is received by the P2, it has to fix and store it before the next rising clock edge. A logic analyzer will be helpful here. Also, why not write test code to get one byte from RPi to P2 before you try 50? This is a common problem with programmers: jumping directly to the end goal instead of working their way there. I've been coding the Propeller for a long time, and I still do it very incrementally. The process only seems slower -- what I find is that incremental development eliminates bugs before they can become buried in a lot of code.
Have a look at the attached demo. It uses one Spin cog to simulate the RPi:
pri master_spi() | m, x, value                        ' launch with cogspin()

' setup smart pins for spi output

  pinhigh(CS0)                                        ' make output/high (deactivate)

  m := P_SYNC_TX | P_OE                               ' spi tx mode
  m |= ((SCLK0-MOSI0) & %111) << 24                   ' add clk offset (B pin)
  x := %1_00000 | (8-1)                               ' start/stop mode, 8 bits
  pinstart(MOSI0, m, x, 0)                            ' activate smart pin
  pinfloat(MOSI0)                                     ' reset/disable until used

  m := P_PULSE | P_OE                                 ' spi clock mode (CPOL = 0)
  x.word[0] := 2 #> (clkfreq / 31_250_000) <# $FFFF   ' ticks in period (31.25 MHz)
  x.word[1] := x.word[0] >> 1                         ' ticks in low cycle (50%)
  pinstart(SCLK0, m, x, 0)                            ' initialize smart pin

  waitms(500)                                         ' let other cog start

' Send byte every 250 ms
' -- order is MSBFIRST

  repeat
    repeat outcount from 0 to 255
      pinlow(CS0)
      pinlow(MOSI0)                                   ' enable spi output
      wypin(MOSI0, outcount rev 7)                    ' send count MSBFIRST
      wypin(SCLK0, 8)                                 ' 8 clock pulses
      repeat until pinr(SCLK0)                        ' wait until done
      wypin(MOSI0, (255 - outcount) rev 7)            ' flip value, send MSBFIRST
      wypin(SCLK0, 8)                                 ' 8 clock pulses
      repeat until pinr(SCLK0)                        ' wait until done
      pinfloat(MOSI0)                                 ' reset tx
      pinhigh(CS0)
      waitms(333)
After I got things working with one byte at a time I added a second byte of output to the master.
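The clock setup in the master cog clamps the period with the Spin2 limit operators (2 #> ... <# $FFFF) and then takes half of it for the low time. As a rough illustration, the same arithmetic can be sketched in host-side C. The function names are hypothetical, and the 200 MHz clkfreq in the note below is only an assumed example value, not something stated for the demo.

```c
#include <stdint.h>

/* Clamp v into [lo, hi], the C equivalent of "lo #> v <# hi" in Spin2. */
uint32_t clamp_u32(uint32_t v, uint32_t lo, uint32_t hi)
{
    if (v < lo) return lo;
    if (v > hi) return hi;
    return v;
}

/* Hypothetical helper: pack the pulse-mode X value the way the demo does,
 * period ticks in the low word and low-time ticks (50%) in the high word. */
uint32_t pulse_x(uint32_t clkfreq, uint32_t spi_hz)
{
    uint32_t period = clamp_u32(clkfreq / spi_hz, 2u, 0xFFFFu);
    uint32_t low    = period >> 1;
    return (low << 16) | period;
}
```

Because the division is an integer one, an assumed 200 MHz clkfreq gives 6 ticks per period, so the generated clock would be about 33.3 MHz rather than exactly 31.25 MHz; a 250 MHz clkfreq would give exactly 8 ticks.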
The slave cog uses an external clock (from the master cog) and monitors its CS; when this drops it waits for a byte to come in or CS to go back high. When a byte comes in it is read from the smart pin, fixed (for MSBFIRST), and written to the buffer. Here's the Spin-only version.
pri slave_spi() | m, x, value, p_buf                  ' launch with cogspin()

  pinclear(CS1)                                       ' make input

  m := P_SYNC_RX                                      ' spi rx mode
  m |= ((SCLK1-MOSI1) & %111) << 24                   ' add clock offset (B pin)
  x := %0_00000 | (8-1)                               ' sample ahead of b pin rise, 8 bits
  pinstart(MOSI1, m, x, 0)                            ' configure smart pin
  pinfloat(MOSI1)                                     ' reset/disable until used

' Wait for CS drop
' -- get 8 bits, convert to MSBFIRST

  repeat
    p_buf := @incount                                 ' reset input buffer pointer
    repeat while pinread(CS1)                         ' wait for chip select
    pinlow(MOSI1)                                     ' enable spi rx
    repeat
      if pinread(MOSI1)                               ' byte received?
        byte[p_buf++] := rdpin(MOSI1) rev 31          ' convert to MSBFIRST and save
      if pinread(CS1)                                 ' slave deselected?
        pinfloat(MOSI1)                               ' disable/reset
        quit
After the SPI RX pin is set up (associated with the external clock), it drops into a loop that waits for CS to drop and then writes incoming bytes to the buffer. It works, but is only good for the demo because we know two bytes are coming.
As I said earlier, doing things incrementally is a good way to go. Once all the Spin2 code was working, I converted the ending repeat loop to this inline PASM function. It does the same, but will return the number of bytes that were received while CS was low.
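The "rev 31" step is easier to follow with a small host-side model. The C sketch below assumes, as the posted code implies, that each received bit enters the 32-bit shifter at the top and earlier bits move right, so an MSB-first byte ends up bit-reversed in the top byte of the value read back. Both function names are hypothetical, and the shifter model is an assumption made for illustration.

```c
#include <stdint.h>

/* Reverse all 32 bits, the same operation as "rev 31" in the Spin2 code. */
uint32_t rev32(uint32_t v)
{
    uint32_t r = 0;
    for (int i = 0; i < 32; i++) {
        r = (r << 1) | (v & 1u);   /* move the lowest bit of v into r */
        v >>= 1;
    }
    return r;
}

/* Assumed model of the receive shifter: the byte arrives MSB first, each
 * new bit is placed at bit 31 and the previous bits shift right by one. */
uint32_t shift_in_msb_first(uint8_t byte)
{
    uint32_t shifter = 0;
    for (int i = 7; i >= 0; i--) {
        uint32_t bit = (uint32_t)(byte >> i) & 1u;
        shifter = (shifter >> 1) | (bit << 31);
    }
    return shifter;
}
```

Under this model, sending $1D leaves $B800_0000 in the shifter, and reversing all 32 bits brings it back to $1D in the low byte, ready to be stored with a byte write.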
pri shift_in(p_buf) : count

'' Wait for cs to drop
'' -- read bytes from SPI to p_buf
'' -- returns count when CS goes high
'' -- SPI bit order is MSBFIRST
'' -- blocks until CS transitions low, then high

  org
.spi_in     testp   #CS1        wc        ' monitor chip select
  if_c      jmp     #.spi_in              ' wait for drop
            drvl    #MOSI1                ' enable spi rx

.get_byte   testp   #MOSI1      wc        ' data from master?
  if_nc     jmp     #.check_cs            ' no, check cs
            rdpin   pr0, #MOSI1           ' get the value
            rev     pr0                   ' fix MSBFIRST
            wrbyte  pr0, p_buf            ' save to hub
            add     p_buf, #1             ' point to next addr
            add     count, #1             ' update return count

.check_cs   testp   #CS1        wc        ' check state of cs
  if_nc     jmp     #.get_byte            ' still active?
            fltl    #MOSI1                ' disable/reset
  end
All the code is in the demo. Please, please, please, give it a try before converting it to another language. Go slowly.
As you already owe me a beer or an ice cream, I will leave adding the MISO mechanism to you. It should be no problem. Ultimately, you could create a SPI slave object with circular buffers and the ability to read as many bytes as one wants. Have a look at jm_fullduplexserial.spin2 for ideas on circular buffers and using head and tail pointers.
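For readers unfamiliar with the head/tail idea, here is a minimal circular buffer sketch in C. It is not the layout used by jm_fullduplexserial.spin2; the type and function names (ring_t, ring_put, ring_get, ring_count) and the 256-byte size are assumptions chosen for illustration. The receiving cog would call ring_put for each byte taken from the smart pin, and the application would call ring_get whenever it wants data.

```c
#include <stdint.h>
#include <stdbool.h>

#define RING_SIZE 256u               /* must be a power of two */

typedef struct {
    uint8_t  buf[RING_SIZE];
    uint32_t head;                   /* next slot to write (producer) */
    uint32_t tail;                   /* next slot to read (consumer)  */
} ring_t;

void ring_init(ring_t *r)
{
    r->head = 0;
    r->tail = 0;
}

/* Number of bytes waiting to be read. */
uint32_t ring_count(const ring_t *r)
{
    return (r->head - r->tail) & (RING_SIZE - 1u);
}

/* Store one byte; returns false when the buffer is full.
 * One slot is left unused so that head == tail always means "empty". */
bool ring_put(ring_t *r, uint8_t b)
{
    uint32_t next = (r->head + 1u) & (RING_SIZE - 1u);
    if (next == r->tail)
        return false;
    r->buf[r->head] = b;
    r->head = next;
    return true;
}

/* Fetch one byte; returns false when the buffer is empty. */
bool ring_get(ring_t *r, uint8_t *out)
{
    if (r->head == r->tail)
        return false;
    *out = r->buf[r->tail];
    r->tail = (r->tail + 1u) & (RING_SIZE - 1u);
    return true;
}
```

Keeping one writer for head and one reader for tail is what lets a producer cog and a consumer cog share the buffer without extra locking in this simple scheme.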
Hi @JonnyMac
Thank you for the solution. I have run this demo code, but I got an error on the pr0 register.
I replaced pr0 with its address as given below
But, I am not getting exact data sent from Raspberry-pi. So what am I doing wrong here? Can you please suggest?
PR0..PR7 are user registers in the Spin2 interpreter -- I thought FlexProp understood these registers. That said, it's easy to add a local variable (I wouldn't use a hard address as you're doing; no telling what's at that address under FlexProp). As you can see, with a small adjustment for FlexProp, the demo works there, too.
Now I suggest you get a logic analyzer and see what's coming out of your Raspberry Pi. I set the SPI clock back to 1MHz and captured the output from my master cog which sends a count value, and 255-count in each CS frame. Here's the capture:
I think I've done all I can given your limited information and feedback.
-- "I am not getting exact data sent from Raspberry-pi" is not helpful to those from whom you're asking for help.
Programming can be fiddly. Time to fiddle until you understand what's going on with your hardware. Good luck with your project.
Thank you for your valuable help.
Hello @disha_sudra,
You can try this "C programming language" method to receive the data from Rpi to propeller 2 using SPI protocol.
#include "simpletools.h"
#include <stdio.h>
#include "spi-c.h"

#define NSAMPS_MAX 8

int data[NSAMPS_MAX];

void main()
{
    high(38); // This is to provide 3.3v to logic level converter.
    int i = 0;
    DIRA = 0;
    DIRA |= MOSI ; // set to outputs
    while (1) {
        int value = 0, decimal = 0, base = 1;
        while(_pinr(CS) == 1);
        while(_pinr(CS) == 0) {
            if(_pinr(CLK) == 1)
                ((_pinr(CLK) == 1))&&((_pinr(MOSI) == 1)) ? (data[i++] = 1) : (data[i++] = 0);
        }
        for(int count = 7; count >=0; (decimal = decimal + (data[count] * base)),(base *= 2),count --);
        printf("%d\n",decimal);
        i = 0;
    }
}
PR0..PR7 symbols were added to Flex suite as of v5.9.15 - the latest release.
v5.9.15 fixes a bug or two with inline Pasm too.
Have you tested that with physical signals? It seems like when the CS line drops and the CLK line goes high, the code is going to be in a very tight loop that blasts values to the data[] array -- there seems to be nothing stopping the code from generating a bunch of samples from the same clock pulse, and this will continue as long as CS is low. The P2 smart pin SPI feature takes care of this; we simply check to see if a byte has been received. When that is not the case, we can check the state of the CS line to see if the master is done sending. And with mode 0 SPI, you want to sample right before the rising clock edge, not while it's high. If I'm wrong, please tell me -- I understand C, but I don't use it with the Propeller.
I don't like to post untested code (especially in a language I don't use regularly, and never on the Propeller), but maybe the C version with smart pin SPI looks like this:
int shift_in(uint8_t buf[])
{
    int count = 0;                                  // bytes received

    while(_pinr(CS) == 1);                          // wait while CS high
    _pinl(MOSI);                                    // enable smart pin SPI RX
    while(_pinr(CS) == 0) {                         // while CS is low
        if (_pinr(MOSI))                            // if byte waiting
            buf[count++] = _rev(_rdpin(MOSI));      // read, fix MSB, save to array
    }
    _pinf(MOSI);                                    // disable/reset SPI RX

    return count;
}
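The objection to the bit-banged loop (many samples taken from the same clock pulse) can be illustrated off-target. The C sketch below works on arrays of polled pin samples rather than real pins. All three function names are hypothetical, and the trace generator assumes mode 0 with the clock idling low and three polls per half clock period, figures picked only for the example. Note that the edge-detecting decoder samples on the first poll after the rise, which only approximates the "just before the rising edge" sampling the smart pin performs.

```c
#include <stdint.h>
#include <stddef.h>

/* Build an oversampled trace of one MSB-first byte: for each bit the clock
 * is low for samples_per_half polls, then high for the same number, with
 * the data line held steady. Caller provides 16 * samples_per_half entries.
 * Returns the number of samples written. */
size_t make_trace(uint8_t byte, unsigned samples_per_half,
                  uint8_t *clk, uint8_t *mosi)
{
    size_t n = 0;
    for (int i = 7; i >= 0; i--) {
        uint8_t bit = (uint8_t)((byte >> i) & 1u);
        for (unsigned s = 0; s < samples_per_half; s++) {
            clk[n] = 0; mosi[n] = bit; n++;
        }
        for (unsigned s = 0; s < samples_per_half; s++) {
            clk[n] = 1; mosi[n] = bit; n++;
        }
    }
    return n;
}

/* What a level test does: every poll that sees the clock high stores a bit. */
size_t bits_seen_by_level(const uint8_t *clk, size_t n)
{
    size_t bits = 0;
    for (size_t i = 0; i < n; i++)
        if (clk[i])
            bits++;
    return bits;
}

/* Edge detection: store exactly one bit per low-to-high clock transition. */
size_t decode_on_rising_edge(const uint8_t *clk, const uint8_t *mosi,
                             size_t n, uint32_t *value)
{
    size_t   bits = 0;
    uint32_t v = 0;
    uint8_t  prev = 0;                 /* clock assumed to idle low */
    for (size_t i = 0; i < n; i++) {
        if (clk[i] && !prev) {
            v = (v << 1) | (mosi[i] & 1u);
            bits++;
        }
        prev = clk[i];
    }
    *value = v;
    return bits;
}
```

With three polls per half period, the level test reports 24 "bits" for a single byte, while the edge-detecting version reports 8 and recovers the original value.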
I have .14 -- will look for the latest. Thanks for the tip.
Can you add a small delay on the RPi between CS dropping and bits going out? Maybe there is a race condition.
Hi @JonnyMac
As per your suggestion I've used smart pins and created a function for receiving.
I am using P_SYNC_RX as per your suggestion in previous post.
I am attaching my RPi code and P2 code in c++.
RPI Frequency: 5 MHZ
P2 Frequency: 200 MHZ
With the frequency setup above, I get correct data for buffer sizes from 10 to 255 bytes, but when I increase the buffer size on the RPi from 255 to 500 or more, I get garbage data on the P2. The same thing happens if I change the clock frequency of the RPi or the P2. I need to set the RPi SPI frequency to 10 MHz for high-speed data transfer. If someone can point out what I am doing wrong here, it would be a great help. I am new to the P2 and smart pin configuration. Thank you.
Code for P2:
#include "spi-c.h"

unsigned char data[4096];

enum { _clkfreq = 200000000 };

int my_shift_in(void)
{
    int count = 0;                                  // bytes received

    while(_pinr(CS) == 1);                          // wait while CS high
    _pinl(MOSI);                                    // enable smart pin SPI RX
    while(_pinr(CS) == 0) {                         // while CS is low
        if (_pinr(MOSI))                            // if byte waiting
            data[count++] = _rev(_rdpin(MOSI));     // read, fix MSB, save to array
    }
    _pinf(MOSI);                                    // disable/reset SPI RX

    return count;
}

void main()
{
    DIRA = 0;
    int m = 0, x = 0, ret = 0;
    DIRA |= MOSI ;
    _pinclear(CS);
    m = P_SYNC_RX;
    m |= ((CLK-MOSI) & 0b111) << 24;
    x = 0b0_00000 | (8-1);
    _pinstart(MOSI, m, x, 0);
    while(1){
        ret = 0;
        ret = my_shift_in();
        if(ret>0)
            for(int i = 0; i<ret; i++){
                printf("%d ",data[i]);
            }
    }
}
Code for R-pi:
#include <iostream>
#include "/home/admin/Documents/SPI/wiringPiSPI.h"

#define SPI_CHANNEL 0
#define SPI_CLOCK_SPEED 5000000

int main(int argc, char **argv)
{
    int fd = wiringPiSPISetupMode(SPI_CHANNEL, SPI_CLOCK_SPEED, 0);
    if (fd == -1) {
        std::cout << "Failed to init SPI communication.\n";
        return -1;
    }
    unsigned char buf[4096] = {0};
    for(int i=0, count = 0,k = 0;i<4096;i++){
        buf[count++] = k++;
        if(k==256) k=0;
    }
    wiringPiSPIDataRW(SPI_CHANNEL, buf, 10);
    return 0;
}
Disha,
Give this a try. It compiles but I've not tested it with any hardware.
PS: The inner loop is 14 sysclock ticks. That's fast enough to keep up with the smallest recommended oversampling of 3. Therefore max bitrate is 200 MHz / 3 = 66 Mbit/s.
PPS: Although sysclock/5 is probably the more reliable figure, which means up to 200 / 5 = 40 Mbit/s.
PPPS: Oversampling is a needed factor here because the Prop2 doesn't provide external synchronous clocking for any I/O. The SPI clock input pin is just another data I/O pin, same as the SPI data pins. If the clocking source (the RPi in this case) is not common to the Prop2's internal clock, then the I/O has to be treated as asynchronously sampled, which means all the usual sampling aliasing problems rear up.
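The bit rate figures above follow from dividing the system clock by the oversampling factor. A small C sketch of that arithmetic is below. The function names are hypothetical, and the assumption that the 14-tick inner loop runs once per 8-bit word is mine, made only to show why the loop can keep up.

```c
#include <stdint.h>
#include <stdbool.h>

/* Highest SPI bit rate (bits per second) for a given system clock and
 * oversampling factor (sysclock ticks per SPI bit). */
uint32_t max_spi_bitrate(uint32_t sysclk_hz, uint32_t oversample)
{
    return sysclk_hz / oversample;
}

/* True when a polling loop of loop_ticks sysclocks finishes within the
 * time one word of word_bits takes to arrive at that oversampling. */
bool loop_keeps_up(uint32_t loop_ticks, uint32_t oversample, uint32_t word_bits)
{
    return loop_ticks <= oversample * word_bits;
}
```

At 200 MHz this gives roughly 66.7 Mbit/s for sysclock/3 and 40 Mbit/s for sysclock/5. With 8-bit words at sysclock/3 a new byte arrives every 24 ticks, so a 14-tick loop has time to spare.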
I have tested this code with hardware setup. I have setup Rpi frequency to 10MHz. First I have sent 10 Bytes of buffer from Rpi to P2 but most of time receiving garbage data and also tested with large buffer size (Approx 100KB) but result was same. I've to send 1 to 5 Mbyte of image data.
Thank you so much for your valuable reply.
Hmm, maybe still need to sort out CPOL/CPHA and the likes. Any idea what the RPi is sending as?
Oh, with the DIRH placed after CS low detection, it does need a small amount of time from CS low until smartpin is accepting serial data. Maybe 10 sysclocks, 50 ns post-CS-low. If SPI clocks start in less time then it'll be a mess.
Maybe better to enable the smartpin earlier ...
EDIT: I've added an extra step to check for a prior low CS level. This allows for an earlier enable of the smartpin.
EDIT2: After reading some of Jon's earlier postings I noticed another timing improvement - check for rising CS loop exit only when smartpin buffer is empty. Updated my code to incorporate that too.
PS: The forum's formatting doesn't work. Attached file instead:
Final slave code is presented in other topic - https://forums.parallax.com/discussion/comment/1543798/#Comment_1543798
'Twas a much more intensive exercise than I was expecting. It really kind of caught me off guard. I didn't dig out the scope partly because of the fiddliness of hooking it up, which cost me more time chasing my tail.
Using the P2 code below, I measured the time to receive one buffer (of 19200 bytes) as 13 milliseconds. How can I increase the speed? One more question: if the SPI frequency of the P2 differs from _clkfreq, how can I achieve the maximum SPI clock frequency?
In R-Pi I have taken :
uint32_t speed = 12000000;
ioctl (fd, SPI_IOC_WR_MAX_SPEED_HZ, &speed);
Below is P2's code for communication with R-Pi.
enum {
    //_xinfreq = 20_000_000,
    _xtlfreq = 20_000_000,
    _clkfreq = 340_000_000,
    DOWNLOAD_BAUD = 230_400,
    DEBUG_BAUD = DOWNLOAD_BAUD,
    PINS = 20 | 7 << 6,
    GROUP = 2,
    CLK_PIN = 6,
    TX_REGD = 1,
    CLK_REGD = 0,
    CPOL = 1,
    CPHA = 0,
    CS = 32,
    SCLK = 33,
    MOSI = 34,
    MISO = 35,
    HEAPSIZE = 80000,
    CLK_DIV = 20,
    M_NCO = 0x8000_0000UL / CLK_DIV + (0x8000_0000UL % CLK_DIV > 0UL ? 1UL : 0UL), // round up
    M_LEADIN = X_IMM_32X1_1DAC1 | 3 + CLK_DIV + CLK_REGD - TX_REGD - (1 ^ CPHA) * (CLK_DIV >> 1),
};

int my_shift_in(void)
{
    int count = 0;                                  // bytes received

    while(_pinr(CS) == 1);                          // wait while CS high
    _pinl(MOSI);                                    // enable smart pin SPI RX
    while(_pinr(CS) == 0) {                         // while CS is low
        if (_pinr(MOSI)){                           // if byte waiting
            data[count++] = _rev(_rdpin(MOSI));     // read, fix MSB, save to array
        }
    }
    _pinf(MOSI);                                    // disable/reset SPI RX

    return count;
}

int spi_shift_out( uint8_t *addr )
{
    int count = 2;    // +2 when buffer data moves to the shifter

    __asm volatile {
            rdfast  #0, addr                // setup the hubRAM FIFO hardware for sequential reads
            neg     pa, #SCLK & 0x20   wc   // SETPAT port select: C=0 for INA, C=1 for INB
            modz    15   wz                 // set Z flag: SETPAT equality (INx == S) event
            setpat  mask1, mask1            // Setup CLK glitch event: When CS high and CLK high
            setse1  #2<<6 | CS              // Setup CS falling event: 1=rising-edge, 2=falling-edge

            rflong  pa                      // read four bytes from FIFO (hubRAM)
            movbyts pa, #0b00_01_10_11      // 32-bit byte-swap to suit big-endian shifted out from lsb of register
            rev     pa                      // 32-bit bit-swap to suit big-endian shifted out from lsb of register
.retry
            dirl    #MISO                   // disable/reset SPI TX smartpin, pin is driven low
            wypin   pa, #MISO               // place first two bytes directly in smartpin's shift register
            dirh    #MISO                   // enable SPI TX smartpin, first bit is presented to pin
.wait
            jpat    #.retry                 // CLK toggled while CS still high, treat as a glitch
            jnse1   #.wait                  // wait while CS is high
            setse1  #1<<6 | CS              // Setup CS rising event: 1=rising-edge, 2=falling-edge

            movbyts pa, #0b01_00_11_10      // move second lot down
            wypin   pa, #MISO               // place second lot in buffer register
.loop
            testp   #MISO   wc              // check for empty smartpin buffer register, 32 bits shifted out
    if_c    rfword  pa                      // read two bytes from FIFO (hubRAM)
    if_c    movbyts pa, #0x1b               // 32-bit byte-swap to suit big-endian shifted out from lsb of register
    if_c    rev     pa                      // 32-bit bit-swap to suit big-endian shifted out from lsb of register
    if_c    wypin   pa, #MISO               // place another two bytes in buffer register
    if_c    add     count, #2
    _ret_   jnse1   #.loop                  // break loop when CS high

mask1       long    (1<<(CS & 0x1f)) | (1<<(SCLK & 0x1f))    // CS/CLK pin mask for SETPAT
    }
    // Exiting from Fcache/Cogexec triggers an implicit RDFAST by the branch to hubexec
    return count;
}

void Data_rec()
{
    printf("Started new cog Data_rec()\n");
    DIRA = 0;
    int m = 0, x = 0, i = 0;
    DIRA |= MOSI ;

    _wrpin(CS, P_HIGH_15K | P_LOW_FLOAT | P_SYNC_IO | P_SCHMITT_A);    // do this first
    _drvh(CS);    // 15 kR pull-up resistor
    _waitms(200);
    printf( "\n clkfreq = %d clkmode = 0x%x\n", _clockfreq(), _clockmode() );
    _wrpin(SCLK, P_SYNC_IO | P_SCHMITT_A);    // single ended
    //_wrpin(SCLK, P_SYNC_IO | P_FILT0_AB | P_COMPARE_AB);    // differential pair

    _pinclear(CS);
    m = P_SYNC_RX;
    m |= ((SCLK-MOSI) & 0b111) << 24;
    x = 0b0_00000 | (8-1);
    _pinstart(MOSI, m, x, 0);

    _wrpin(CS, P_HIGH_15K | P_LOW_FLOAT | P_SCHMITT_A);    // do this first
    _drvh(CS);    // 15 kR pull-up resistor
    _wrpin(SCLK, (CPOL ? 0 : P_INVERT_A) | P_SCHMITT_A);    // This invert doesn't apply to other smartA/B muxes
    m = P_SYNC_TX | P_SCHMITT_A | P_OE;
    m |= ((SCLK - MISO) & 0b0111) << 24;    // smartB (clock pin) input select
    m |= (CPOL ^ !CPHA) ? 0 : P_INVERT_B;    // SPI Clock Mode (see above documentation)
    x = 0<<5 | (32-1);    // continuous buffer mode and 32-bit shifter and buffer
    _pinstart(MISO, m, x, 0);

    int cnt = 0, rx = 0;

    // To write received data in a text file in SD card.
    mount("/sd", _vfs_open_sdcard());
    fptr = fopen("/sd/data.dat", "a");
    if (fptr == NULL) {
        printf("File failed to open\n");
        return;
    }
    else {
        printf("The file is now opened\n");
    }

    while(true) {
        while(_pollatn()==0);
        printf("received atn\n");
        while (1){
            my_shift_in();    // continuous check for signal -> if got 245 then ready to receive next chunk of data.
            if (data[0] == 248){
                printf("break loop \n");    // If receive 248 then break the loop as it is signal for whole data has sent.
                flag_y = 1;
                fclose(fptr);
                printf("File closed \n");
                umount( "/sd" );
                break;
            }
            if (data[0] == 245 ){
                t1 = _getms();
                spi_shift_out( txdata );
                txdata[0] = txdata[0] + 1;
                if (txdata[0] > 100){
                    txdata[0] = 1;
                }
                rec = my_shift_in();    // Receive a buffer
                t2 = _getms();
                fwrite(data, sizeof(char), sizeof(data), fptr);
                printf("time = %d\nrec = %d\n", t2-t1, rec);    // it takes 13 milliseconds
                cnt++;
            }
        }
    }
}
Decrease CLK_DIV. EDIT: Oh, you've cut'n'paste from different methods. CLK_DIV isn't actually used by the code.
Yes, mismatched clocks is a limitation. It would be preferable to have a common whole-board reference system clock for both CPUs.
Aside from that there is also a significant lag in the Prop's I/O that you won't see in other SPI slave devices. This has to be worked around to get decent speed. And will take the form of using prior clock cycle to predict later data need. NOTE: A down side of this strategy is any gaps in the master clock will mess up the timing. Ie: Doesn't follow proper SPI clocking rules at all.
Reading your code, it looks like it's mostly from what I'd posted back in March - Gone back and found that now too.
That was written to actually follow SPI clock rules and tested to work up to sysclock/5. Or at least that's what the notes say. I'd really forgotten it worked that well.
So, with that 340 MHz sysclock, you should be fine using, say, a 60 MHz SPI master clock from the RPi.
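A quick back-of-the-envelope check, using only figures posted in this thread, suggests the 13 ms measurement is dominated by time on the wire at the RPi's 12 MHz setting. The C sketch below computes the ideal transfer time, ignoring any gaps the master inserts between bytes; the function name transfer_time_us is hypothetical.

```c
#include <stdint.h>

/* Ideal time in microseconds to clock `bytes` bytes at `spi_hz` bits per
 * second, with no idle gaps between bytes. */
uint64_t transfer_time_us(uint64_t bytes, uint64_t spi_hz)
{
    return bytes * 8u * 1000000u / spi_hz;
}
```

For the 19200-byte buffer this gives 12800 us at 12 MHz, which is close to the measured 13 ms, and 2560 us at 60 MHz. In other words, raising the SPI clock on the RPi side is the main lever for speed in this setup.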
Question: Why do you need a small file sent so fast?
Hello @evanh , Is there any config parameter we can set in P2 to setup SPI clock frequency to receive data from Raspberry pi? For example if we want to use 20 MHZ Clock frequency for SPI communication then in RPI we can set that frequency as per @disha_sudra's code, But in P2 how can we set it?
Nope, as a slave it just follows the clock. That's one reason why it's tricky to be a slave without a real SPI port.
Looking at my old programs from back in March, I can see I've also written an "always-on" slave too. It uses a whole Cog. It can handle both transmit and receive functions. I don't know if it was finished though ... It can't have been finished, it'd need a command set - and that definitely isn't there.
I could finish it as an example. Do you have a spare cog?
@evanh
As per your comment #18, I tried the my_shift_in() function to continuously receive data from the R-Pi. I have used a level shifter between them and am able to receive a 19200-byte buffer in about 2 to 4 milliseconds at a 110 MHz R-Pi clock speed. When I remove the level shifter and connect the P2 directly to the R-Pi, I am not able to receive data buffer by buffer. Is it possible to receive a buffer in less than 2 milliseconds, with a steady time for all buffers?
Yes, I have.
Is that the SPI clock rate? If so then I'm impressed. And I assume your Prop2 is at 340 MHz and therefore achieving just short of sysclock/3. SPI receiver smartpin has proven to be good for that. Transmitter not so much - because of the lag issue.
Question: Is disha_sudra and chintan_joshi the same person?
We are working together on same project.