Okay! My simple Prop-to-Prop cables arrived this week, so I have connected a P2_EVAL to a P2_EDGE, and here is my initial test program - simple but reliable synchronous transfers using a 32 bit parallel bus between two or more Propellers at 32Mb/s ...
/*
* Program to test how fast and reliably bus read/writes can be done using a
* simple synchronous parallel bus connecting two or more Propellers.
*
* There can be only one sender cog at a time, but can be multiple receiver
* cogs. The intention is to start one sender cog on one Propeller, and
* multiple receiver cogs on the other Propellers. This program supports
* four receivers on a single Propeller.
*
* The Propellers must be connected pin to pin on pins 00 .. 31 (i.e. port A).
* If you have more than two Propellers connected, you could start additional
* receiver cogs on the other Propellers.
*
* This is a synchronous transfer, so how much data can be transferred depends
* on how closely the respective Propeller clocks are synchronized. The crystal
* accuracy is typically +/-0.5 PPM. At 25 clocks per long, this means the
* clocks may be out of sync by up to one clock after 40,000 longs. A suitable
* maximum size for a single synchronous transfer might therefore be 20,000
* longs, which is why this value is used by this program. Larger transfers
* can be performed by doing multiple smaller transfers (which is the point
* of this test program!).
*
* The program uses P2 NATIVE PASM, so it must be compiled in P2 NATIVE mode
* (which is the default mode for the Propeller 2).
*
* To maximize available cogs, add -C NO_MOUSE and -C NO_FLOAT (and -C SIMPLE
* if using a serial HMI). However in this simple-minded program, buffer space
* is likely to be the limiting factor, not cogs.
*
* To build as a sender, add -C SENDER
*
* To build as a receiver, add -C RECEIVER
*
* For example, compile with a command like:
*
* catalina -p2 -lci p2_bus.c -o sender -C NO_MOUSE -C NO_FLOAT -C SENDER
* or
* catalina -p2 -lci p2_bus.c -o receiver -C NO_MOUSE -C NO_FLOAT -C RECEIVER
*
* Then load and execute with commands like:
*
* payload sender -PN -i
* payload receiver -PM -i
*
* where N & M are the Propeller ports to use for the sender and receiver,
* respectively (add more 'receiver' commands if there are more than two
* Propellers on the bus).
*/

#if !defined (__CATALINA_SENDER) && !defined(__CATALINA_RECEIVER)
#error EITHER SENDER OR RECEIVER MUST BE DEFINED!
#endif

#include <stdio.h>
#include <stdlib.h>
#include <catalina_cog.h>
#include <catalina_plugin.h>

#define XFER_DELAY 7 // clocks per long is 18+this (7 works!)
#define NUM_LONGS 20000 // number of longs to transfer (20000 works!)
#define STACK_SIZE 500 // size of cog stack (stdio requires 500)
#define TOTAL_ERRORS_ONLY // print only total errors, not details

static unsigned long start = 0; // clock count used to synchronize cogs
static int lock = 0; // lock to protect I/O
static unsigned long *send_buff; // data to be sent
static unsigned long *rcv1_buff; // data received (1)
static unsigned long *rcv2_buff; // data received (2)
static unsigned long *rcv3_buff; // data received (3)
static unsigned long *rcv4_buff; // data received (4)

/*
* sync - synchronize multiple cogs to start on a specific clock count
*
* 'start' should be set to a clock count some time in the
* future - e.g. _cnt() + _clockfreq() for one second
*/
int sync(unsigned long start) {
return PASM (
" getct r0\n"
" sub r2, r0\n"
" waitx r2\n"
" getct r0\n"
);
}
/*
* send - pasm code to write a number of longs to the 32 bit bus
* Note: sending starts immediately, and then sends a new long
* every 'time'+18 ticks.
* Note: loads the code into LUT RAM and executes it there.
*
* 'time' (passed in r4) is the time between writes in clocks (+18)
* 'buff' (passed in r3) is an array holding longs to send
* 'size' (passed in r2) is the size of the array
*/
int send(int time, void *buff, int size) {
return PASM (
// set pins 00 .. 31 as outputs
" mov outa, #0\n"
" or dira,##$FFFFFFFF\n"
// load LUT RAM:
" setq2 #(send_end - send_start - 1)\n"
" rdlong 0, ##@send_start\n"
// jump to code in LUT RAM:
" jmp #send_start\n"
// code to be executed in LUT RAM:
" org $200\n"
"send_start\n"
" getct r0\n" // LUT: 2 (clocks)
"\n"
" rep #7, r2\n" // LUT: 2
" addct1 r0, #18\n" // LUT: 2
" rdlong r1, r3\n" // LUT: 9 .. 16
" waitct1\n" // LUT: 2
" add r3, #4\n" // LUT: 2
" mov outa, r1\n" // LUT: 2
" addct1 r0, r4\n" // LUT: 2
" waitct1\n" // LUT: 2
"\n"
" addct1 r0, r4\n"
" addct1 r0, #18\n"
" waitct1\n"
" mov outa, #0\n"
" getct r0\n"
" jmp #send_cont\n"
"send_end\n"
// resume Hub Execution:
" orgh\n"
"send_cont\n"
);
}
/*
* sender - send an array of longs to one or more receivers
*
* 'buff' is an array of NUM_LONGS longs.
*/
void sender(void *buff) {
unsigned int started, stopped, total;
int me = _cogid();
started = _cnt();
stopped = send(XFER_DELAY, buff, NUM_LONGS);
total = stopped - started;
ACQUIRE(lock);
printf("send (cog %d) took %d clocks (%d per long)\n\n",
me, total, total/NUM_LONGS);
RELEASE(lock);
while(1); // don't exit
}
/*
* recv - pasm code to read a number of longs from the 32 bit bus.
* Note: receiving starts 'time' clock ticks after any non-zero
* value is detected on the bus, and then reads a new long
* every 'time'+18 ticks.
* Note: loads the code into LUT RAM and executes it there.
*
* 'time' (passed in r4) is the time between reads in clocks (+18)
* 'buff' (passed in r3) is an array to hold longs received
* 'size' (passed in r2) is the size of the array
*/
int recv(int time, void *buff, int size) {
return PASM (
// set pins 00 .. 31 as inputs
" andn dira,##$FFFFFFFF\n"
// load LUT RAM:
" setq2 #(recv_end - recv_start - 1)\n"
" rdlong 0, ##@recv_start\n"
// jump to code in LUT RAM:
" jmp #recv_start\n"
// code to be executed in LUT RAM:
" org $200\n"
"recv_start\n"
" mov r1, ina\n" // LUT: 2 (clocks)
" cmp r1, #0 wz\n" // LUT: 2
" if_z jmp #recv_start\n" // LUT: 4
" getct r0\n" // LUT: 2
"\n"
" rep #7, r2\n" // LUT: 2
" addct1 r0, #18\n" // LUT: 2
" wrlong r1, r3\n" // LUT: 3 .. 10
" waitct1\n" // LUT: 2
" add r3, #4\n" // LUT: 2
" addct1 r0, r4\n" // LUT: 2
" waitct1\n" // LUT: 2
" mov r1, ina\n" // LUT: 2
"\n"
" getct r0\n"
" jmp #recv_cont\n"
"recv_end\n"
// resume Hub Execution:
" orgh\n"
"recv_cont\n"
);
}
/*
* receiver - receive an array of longs from the 32 bit bus
*
* 'buff' is an array of NUM_LONGS longs.
*/
void receiver(void *buff) {
unsigned int started, stopped;
int me = _cogid();
started = sync(start);
stopped = recv(XFER_DELAY, buff, NUM_LONGS);
ACQUIRE(lock);
printf("recv (cog %d) started at clock 0x%08x\n", me, started);
RELEASE(lock);
while(1); // don't exit
}
void main(void) {
unsigned long i;
long send_stack[STACK_SIZE];
long rcv1_stack[STACK_SIZE];
long rcv2_stack[STACK_SIZE];
long rcv3_stack[STACK_SIZE];
long rcv4_stack[STACK_SIZE];
int send_cog;
int rcv1_cog;
int rcv2_cog;
int rcv3_cog;
int rcv4_cog;
int errors = 0;
int transfers = 0;
// assign a lock to be used to avoid plugin contention
lock = _locknew();
// give the vt100 emulator (if used) a chance to start
#ifdef __CATALINA_VT100
_waitms(500);
#endif
// allocate the arrays (we allocate them all whether we are a sender
// or a receiver)
send_buff = malloc(NUM_LONGS*4);
rcv1_buff = malloc(NUM_LONGS*4);
rcv2_buff = malloc(NUM_LONGS*4);
rcv3_buff = malloc(NUM_LONGS*4);
rcv4_buff = malloc(NUM_LONGS*4);
// initialize the arrays
for (i = 0; i < NUM_LONGS; i++) {
send_buff[i] = i+1; // can be anything, but must be non-zero
rcv1_buff[i] = 0;
rcv2_buff[i] = 0;
rcv3_buff[i] = 0;
rcv4_buff[i] = 0;
}
#ifdef __CATALINA_SENDER
// set all bus pins to zero
PASM(
" mov outa, #0\n"
" or dira,##$FFFFFFFF\n"
);
// start ONE sender
k_clear();
ACQUIRE(lock);
printf("SENDER (Clock = %lu Hz)\n\n", _clockfreq());
printf("Start the receiver cogs in the Receiver program,\n");
printf("then press any key to start sender cog\n");
RELEASE(lock);
k_wait();
// keep transferring forever
while (1) {
ACQUIRE(lock);
printf("\nStarting sender cog ...\n\n");
RELEASE(lock);
send_cog = _cogstart_C(&sender, send_buff, send_stack, STACK_SIZE);
// give the sender a chance to send (3 seconds is generous!)
_waitms(3000);
ACQUIRE(lock);
printf("... done\n");
RELEASE(lock);
// cancel the sender (we restart it again for each transfer)
_cogstop(send_cog);
// update and print statistics
transfers++;
ACQUIRE(lock);
printf("\nTotal %d transfers\n",transfers);
RELEASE(lock);
}
#endif

#ifdef __CATALINA_RECEIVER
ACQUIRE(lock);
printf("RECEIVER (Clock = %lu Hz)\n\n", _clockfreq());
printf("Press any key to start receiver cogs, then start the sender\n");
printf("cog in the sender program\n");
RELEASE(lock);
k_wait();
ACQUIRE(lock);
printf("Starting receiver cogs ...\n\n");
RELEASE(lock);
while (1) {
// set a start time for the receiver cogs to use in the sync function
start = _cnt() + _clockfreq(); // set start time for +1 seconds
// start FOUR receiver cogs
rcv1_cog = _cogstart_C(&receiver, rcv1_buff, rcv1_stack, STACK_SIZE);
rcv2_cog = _cogstart_C(&receiver, rcv2_buff, rcv2_stack, STACK_SIZE);
rcv3_cog = _cogstart_C(&receiver, rcv3_buff, rcv3_stack, STACK_SIZE);
rcv4_cog = _cogstart_C(&receiver, rcv4_buff, rcv4_stack, STACK_SIZE);
ACQUIRE(lock);
printf("... done\n");
RELEASE(lock);
// give receiver a chance to receive
_waitms(1000);
// wait till receivers complete
while (
(rcv1_buff[NUM_LONGS - 1] == 0)
&& (rcv2_buff[NUM_LONGS - 1] == 0)
&& (rcv3_buff[NUM_LONGS - 1] == 0)
&& (rcv4_buff[NUM_LONGS - 1] == 0)
) {
_waitms(1000);
}
// terminate the receiver cogs (we restart them again for each transfer)
_cogstop(rcv1_cog);
_cogstop(rcv2_cog);
_cogstop(rcv3_cog);
_cogstop(rcv4_cog);
// check the results
ACQUIRE(lock);
printf("\nChecking data ...\n");
for (i = 0; i < NUM_LONGS; i++) {
// check receiver 1 got the correct data
if (send_buff[i] != rcv1_buff[i]) {
printf("send[%3d]=0x%08X != rcv1[%3d]=0x%08X\n",
i, send_buff[i], i, rcv1_buff[i]);
_waitms(5);
#ifdef TOTAL_ERRORS_ONLY
// if only counting total errors, we are done - we don't
// report each mismatch
errors++;
break;
#endif
}
// check receiver 2 got the correct data
if (send_buff[i] != rcv2_buff[i]) {
printf("send[%3d]=0x%08X != rcv2[%3d]=0x%08X\n",
i, send_buff[i], i, rcv2_buff[i]);
_waitms(5);
#ifdef TOTAL_ERRORS_ONLY
// if only counting total errors, we are done - we don't
// report each mismatch
errors++;
break;
#endif
}
// check receiver 3 got the correct data
if (send_buff[i] != rcv3_buff[i]) {
printf("send[%3d]=0x%08X != rcv3[%3d]=0x%08X\n",
i, send_buff[i], i, rcv3_buff[i]);
_waitms(5);
#ifdef TOTAL_ERRORS_ONLY
// if only counting total errors, we are done - we don't
// report each mismatch
errors++;
break;
#endif
}
// check receiver 4 got the correct data
if (send_buff[i] != rcv4_buff[i]) {
printf("send[%3d]=0x%08X != rcv4[%3d]=0x%08X\n",
i, send_buff[i], i, rcv4_buff[i]);
_waitms(5);
#ifdef TOTAL_ERRORS_ONLY
// if only counting total errors, we are done - we don't
// report each mismatch
errors++;
break;
#endif
}
}
// update and print statistics
transfers++;
printf("Total errors = %d (from %d transfers)\n", errors, transfers);
RELEASE(lock);
// re-initialize the arrays for the next transfer
for (i = 0; i < NUM_LONGS; i++) {
rcv1_buff[i] = 0;
rcv2_buff[i] = 0;
rcv3_buff[i] = 0;
rcv4_buff[i] = 0;
}
}
#endif
}
More to come!
I've updated my "Propeller2Propeller" bus test program (now called p2p.c) to add the ability for a Propeller to either be a 'sender', a 'receiver', or a 'transceiver' which alternates between sending and receiving (which makes it a more realistic test).
Also, I had some failures with transferring synchronous blocks of 20,000 longs so I have wound it back to 10,000 longs at a time for the moment. I think this is due to slight differences between the Propeller clocks. Larger transfers will need to be done in multiple synchronous blocks anyway, but eventually I might add some code to auto detect the maximum block size so the user doesn't need to configure it.
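The drift arithmetic behind the block size can be sketched roughly as follows. This is only a model, assuming a constant relative frequency error between the two Propeller clocks; `longs_until_drift` and its parameters are hypothetical names and are not part of p2p.c. With the figures from the program comment (two crystals at +/-0.5 PPM, so 1 PPM relative, and 25 clocks per long) it gives the 40,000 longs quoted there.

```c
#include <stdint.h>

/*
 * longs_until_drift - hypothetical helper (not part of p2p.c)
 *
 * Estimate how many longs can be transferred before two free-running
 * clocks drift apart by 'drift_clocks' clocks, assuming a constant
 * relative frequency error of 'rel_ppb' parts per billion and a
 * transfer rate of 'clocks_per_long' clocks per long.
 */
uint32_t longs_until_drift(uint32_t rel_ppb,
                           uint32_t clocks_per_long,
                           uint32_t drift_clocks)
{
    uint64_t denom = (uint64_t)rel_ppb * clocks_per_long;
    uint64_t longs;

    if (denom == 0) {
        return UINT32_MAX; // no drift (or no time per long): no limit
    }
    // clocks until the drift is reached, divided by clocks per long
    longs = ((uint64_t)drift_clocks * 1000000000ULL) / denom;
    return (longs > UINT32_MAX) ? UINT32_MAX : (uint32_t)longs;
}
```

For example, `longs_until_drift(1000, 25, 1)` models 1 PPM relative error at 25 clocks per long, and `longs_until_drift(2000, 25, 1)` shows that doubling the error halves the usable block size.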
The next step is to add a higher level protocol to allow multiple Propellers to share the bus and send and receive without bus contention.
Also, I intend to make it configurable whether the P2P bus is 8, 16 or 32 bits wide.
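One way a narrower bus could carry the same longs is to send each long in several parts. The sketch below is an assumption about how that might look, not the planned implementation: the helper name `split_long` and the choice to send the least significant part first are both made up for illustration.

```c
#include <stdint.h>

/*
 * split_long - hypothetical helper (not part of p2p.c)
 *
 * Split one 32-bit long into the values that would be put on a bus that
 * is 'width_bits' wide (8, 16 or 32), least significant part first.
 * Stores the values in 'beats' (room for 4 entries is always enough)
 * and returns how many were stored, or 0 for an unsupported width.
 */
unsigned split_long(uint32_t value, unsigned width_bits, uint32_t beats[4])
{
    uint32_t mask;
    unsigned count, i;

    if (width_bits != 8 && width_bits != 16 && width_bits != 32) {
        return 0;
    }
    // avoid shifting a 32-bit value by 32, which is undefined in C
    mask = (width_bits == 32) ? 0xFFFFFFFFu : ((1u << width_bits) - 1u);
    count = 32 / width_bits;
    for (i = 0; i < count; i++) {
        beats[i] = (value >> (i * width_bits)) & mask;
    }
    return count;
}
```

On an 8-bit bus each long then costs four bus writes instead of one, so the clocks-per-long figure (and the drift budget per block) would scale accordingly.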
Ross.
andn dira,##$FFFFFFFF
can be
PASM(" and dira, #0\n");
or
PASM(" mov dira, #0\n");

or dira,##$FFFFFFFF
can be
not dira, #0
or
neg dira, #1
or
bmask dira, #31
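As a quick check that the three all-ones alternatives above give the same value, here is a small C model of each instruction's result as I understand the instruction descriptions. The function names are hypothetical. Note that all three overwrite the whole register, whereas `or dira, mask` only sets the bits in the mask; for a full 32-bit bus the result is the same.

```c
#include <stdint.h>

/* C models of the three PASM2 alternatives (hypothetical helper names) */

// not D, S : D = bitwise complement of S
uint32_t p2_not(uint32_t s) { return ~s; }

// neg D, S : D = two's complement negation of S
uint32_t p2_neg(uint32_t s) { return (uint32_t)0 - s; }

// bmask D, S : D = mask of (S[4:0] + 1) low bits, i.e. (2 << S[4:0]) - 1
uint32_t p2_bmask(uint32_t s) { return ((uint32_t)2 << (s & 31u)) - 1u; }
```

Of the three, only the `bmask` form generalises directly to narrower buses on the low pins, e.g. `p2_bmask(7)` models the mask for pins 0 .. 7.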
True, but eventually I will allow for 8, 16 or 32 bit bus configurations using any combination of the 4 bytes in the port, so I was keeping it straightforward - i.e. you set the corresponding bit to 1 to include that bit in the bus. Also, this means I can eventually just define one mask to represent the bus bits and use it everywhere.
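A minimal sketch of the "one mask used everywhere" idea, assuming the bus is described as a set of byte lanes within the port. The names `bus_mask`, `bus_outputs` and `bus_inputs` are hypothetical and only model what the `or dira` and `andn dira` instructions in the test program do with such a mask.

```c
#include <stdint.h>

/*
 * bus_mask - hypothetical helper (not part of p2p.c)
 *
 * Build a 32-bit pin mask from a set of byte lanes. Bit n of 'lanes'
 * (n = 0 .. 3) selects pins 8n .. 8n+7 of the port.
 */
uint32_t bus_mask(unsigned lanes)
{
    uint32_t mask = 0;
    unsigned n;

    for (n = 0; n < 4; n++) {
        if (lanes & (1u << n)) {
            mask |= (uint32_t)0xFFu << (8 * n);
        }
    }
    return mask;
}

// model of " or dira, mask": make the bus pins outputs
uint32_t bus_outputs(uint32_t dira, uint32_t mask) { return dira | mask; }

// model of " andn dira, mask": make the bus pins inputs
uint32_t bus_inputs(uint32_t dira, uint32_t mask) { return dira & ~mask; }
```

With this shape, `bus_mask(0xF)` is the current 32-bit bus, while something like `bus_mask(0x5)` would describe a 16-bit bus on bytes 0 and 2, and the other pins in the port are left untouched in both directions.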
@evanh said:
There is also
dirl #basepin | 7<<6
and
dirh #basepin | 7<<6
for eight consecutive pins. And this also works for port B, although you can't straddle both ports in one op.
PS: The previous 32-bit ops can also be
dirl #0 | 31<<6
and
dirh #0 | 31<<6
respectively.

The P2 has more possibilities than I have had hot dinners!
Oops, those "PS:" don't encode without ##. #7<<6 is the largest immediate.
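The encoding limit can be checked with a little arithmetic. This sketch assumes the operand layout described above (base pin in bits 5..0, additional pin count in bits 10..6) and a 9-bit field for plain `#` immediates; `pin_field` and `fits_imm9` are hypothetical names used only for illustration.

```c
#include <stdint.h>

#define IMM9_MAX 511u // largest value a 9-bit immediate can hold

/*
 * pin_field - hypothetical helper: build the operand used by dirl/dirh,
 * with the base pin in bits 5..0 and the number of additional pins in
 * bits 10..6 (so 7 means eight pins, 31 means thirty-two).
 */
uint32_t pin_field(unsigned basepin, unsigned addpins)
{
    return (uint32_t)(basepin & 63u) | ((uint32_t)(addpins & 31u) << 6);
}

// returns 1 if 'value' fits a plain # immediate, 0 if it needs ##
int fits_imm9(uint32_t value)
{
    return value <= IMM9_MAX;
}
```

So `pin_field(0, 7)` is 448 and fits, while `pin_field(0, 31)` is 1984 and needs the `##` form, which matches the correction above.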