P2 PASM P2 to P2 data transfer
pic18f2550
Hello,
I am looking for a way to transfer 30 longs from one P2 to another P2 in 2 µs.
Comments
How many pins are you willing to use? That's around 480 megabits per second, which would be rather difficult on one pin (even with the P2 overclocked). But with 4 pins it should be possible, with 8 pins pretty easy, and with 33 pins almost trivial.
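As a rough check of the figure above, the aggregate rate and the per-pin rate for a few pin counts can be worked out like this. It is only an arithmetic sketch in Python; the function names are made up for illustration, and framing overhead (start/stop bits, sync) is ignored.

```python
LONGS = 30          # longs per packet (from the original question)
BITS_PER_LONG = 32
WINDOW_S = 2e-6     # 2 µs transfer window


def aggregate_mbps(longs=LONGS, window_s=WINDOW_S):
    """Raw payload rate in Mbit/s, ignoring any framing overhead."""
    return longs * BITS_PER_LONG / window_s / 1e6


def per_pin_mbps(pins, longs=LONGS, window_s=WINDOW_S):
    """Payload rate each pin must carry if the bits are spread evenly."""
    return aggregate_mbps(longs, window_s) / pins


if __name__ == "__main__":
    print(f"aggregate: {aggregate_mbps():.0f} Mbit/s")
    for pins in (1, 4, 8, 32):
        print(f"{pins:2d} pins: {per_pin_mbps(pins):.0f} Mbit/s per pin")
```

With 8 pins each pin only has to carry about 60 Mbit/s of payload, which is why the wider options look much more comfortable.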
I was thinking of doing it with the smartpins.
The first pin would carry only the first long.
This would provide a kind of synchronisation for the entire data packet.
The remaining data is sent on the other pins with a time delay.
I also thought about streaming, as with HDMI, but there it is not possible to read the data into the P2.
Each streamer can read or write between I/O and hubRAM, on every sysclock tick, 1-bit wide (one data pin) or 2-bit wide (two data pins) or 4-bit wide (four data pins) or 8-bit wide (eight data pins) or 16-bit wide (sixteen data pins) or 32-bit wide (thirty two data pins).
Achieving the max rate (one transfer per tick) would require both Prop2 chips to run from the same external clock source, i.e. a single oscillator feeding the XI pin on both Prop2 chips.
One transfer per sysclock tick is no good for two P2s that run from different crystals.
The correctness of the data could then not be guaranteed.
So back to the smartpin or I/O routine.
Smartpins cannot work faster than streamers. Prop2 does not have external direct clocking of I/O registers. All I/O clocking is internal against sysclock.
So a common crystal is the only way to achieve the fastest data rate.
A common crystal is no guarantee of synchronicity.
As a source of deviation, the transient behaviour of the PLL must be taken into account.
The priority for the data transmission is that the data block is received 100% correctly.
The transmission by means of several smartpins seems to me to be the better way.
Does anyone have a PASM example for sending and receiving a long for me? I will then add the other channels myself.
Thanks.
Okay. I haven't tested a pair of PLLs like that. You're probably right.
An asynchronous serial smartpin can accept serial data up to about sysclock/3. It doesn't need a clock pin therefore lowest pin count. Trick will be in splitting and merging the data at each end for multiple pins. EDIT: Ah! Just send whole longwords in each smartpin. Makes it simpler at both ends.
A 102 MBd UART can send 6 longs in 2 µs with 1 stop bit, or a 105 MBd UART can send 6 longs in 2 µs with 2 stop bits (6 × 35 bits = 210 bits).
5 lanes would then manage 30 longs, assuming that 105 MBd can actually work.
More practical may be 70 MBd for 4 longs/2 µs, so 8 lanes can then manage 32 longs/2 µs, plus a pin for sync/triggering.
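The baud figures above can be reproduced with a little arithmetic. This is a sketch only, assuming each long is framed as 1 start bit + 32 data bits + N stop bits; the 105 MBd and 70 MBd figures match N = 2, which is my reading of the numbers rather than something stated outright. Function names are hypothetical.

```python
def required_mbaud(longs_per_lane, stop_bits=1, window_us=2.0):
    """Baud rate (in MBd) needed to push `longs_per_lane` framed longs
    through one lane within `window_us` microseconds."""
    bits = longs_per_lane * (1 + 32 + stop_bits)  # start + data + stop
    return bits / window_us                       # bits per µs == MBd


def lanes_needed(total_longs, longs_per_lane):
    """Number of parallel lanes, rounded up."""
    return -(-total_longs // longs_per_lane)


if __name__ == "__main__":
    print(required_mbaud(6, stop_bits=1))   # 6 longs per lane, 1 stop bit
    print(required_mbaud(6, stop_bits=2))   # 6 longs per lane, 2 stop bits
    print(required_mbaud(4, stop_bits=2))   # 4 longs per lane, 2 stop bits
    print(lanes_needed(30, 6), lanes_needed(32, 4))
```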
I did a bit of maths.
At 256 MHz it would be 512 clks / (3 * (32+2)) = 5.019 longs per line.
So 6 lines + 1 sync line are needed.
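The same calculation, written out so that other sysclock values can be tried. A sketch only: the 3 clocks per bit and the 34-bit frame (32 data + start + stop) are taken from the formula above, and the function names are made up.

```python
def longs_per_lane(sysclk_hz, window_s=2e-6, clocks_per_bit=3, frame_bits=34):
    """Whole longs one async serial lane can move inside the window."""
    clocks = round(sysclk_hz * window_s)          # 512 clocks at 256 MHz
    return clocks // (clocks_per_bit * frame_bits)


def data_lanes(total_longs, sysclk_hz):
    """Data lanes needed for `total_longs` (sync line not included)."""
    per_lane = longs_per_lane(sysclk_hz)
    if per_lane == 0:
        raise ValueError("sysclock too low for even one long per window")
    return -(-total_longs // per_lane)            # ceiling division


if __name__ == "__main__":
    print(longs_per_lane(256_000_000), data_lanes(30, 256_000_000))
```

Note that the margin is thin: 5 longs need 510 of the 512 available clocks, so there is almost no slack for start-up or sync latency.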
Do you want this as a continuous stream? Unidirectional, from one chip to the other, e.g. 30 input channels?
Or more complex bidirectional comms? I.e., does it need to handle tx/rx turnaround on this comms bus?
It is a continuous stream in one direction, from one P2 to the next P2.
The first one only sends and the last one only receives.
The 7 in between can do both, but on separate pins. Here the data is further processed and sent to the next P2.
I did some experiments with the streamer https://forums.parallax.com/discussion/170216/ringbuffer-was-streamer-questions-how-to-sync
But that was with Rev A so the code does not run anymore.
There is a mode added in Rev C that allows ADC input with an external clock. According to Chip, this should also work with the ADC disabled, to stream data in with the streamer from pins to hub, with the external clock provided by the sending unit.
That was my next step with the ring buffer concept, sadly I do not have any rev C chips and also no time to do so right now.
Waiting for the P2 Edge with external RAM.
Regards,
Mike
Rev C only cut a track for the ADC inside the chip die. There are no other silicon changes from Rev B chips.
Maybe Rev B made the changes to the streamer, but my Rev A chips are outdated.
Mike
Wow, that would be way cool.
The concept needs more explanation (Chip?).
How is the data registered with the internal sysclock (possible data glitches?)
This could be really handy for a lot of other applications, too. It could be used to clock in data from an Ethernet PHY, for instance (off topic).
I think they are now called SCOPE modes, because they are used by debug.
But they sample at up to sysclock/2, with a trigger(?), from pins to hub.
I can't test it, I only have Rev A.
Please try it and report back,
Mike
The smartpins could handle an externally clocked PDM bitstream even with Rev A silicon. It was just Sinc1 only; it didn't have the Sinc2 or Sinc3 modes.
Prop2 handles external clocks by sampling them the same way as a data line. Not sure how fast. Clock edges may be limited to sysclock/3, which would mean data rate limited to sysclock/6. Probably sysclock/2 and /4 respectively. The sysclock/3 is for async serial.
%11110: Asynchronous serial transmit (AST)
%11111: Asynchronous serial receive (ASR)
Register X bits X[31:16] set the number of system-clock periods in a bit period. In a case where your calculated bit rate leaves bits X[31:26] all zero, bits X[15:10] let you set a base-2 fraction of a system-clock period to obtain an accurate bit-per-second transmission rate.
X[31:16] = 1?
256 MHz system clock = 256,000,000 Bd?
Correct. Don't expect that value to work though. Read the state machine sequence - step 3 in particular:
And yep, that snippet looks good to transmit.
EDIT: Throw in an ASMCLK and set, say, _CLKFREQ = 10_000_000. It'll give you a specific clock rate to confirm the ratios against.
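To confirm the ratios on paper first, the X value for the async serial modes can be modelled as below. This is a Python sketch, not PASM, built from the X[31:16] / X[15:10] description quoted earlier. The word-length field in X[4:0] (bits minus 1) is how I remember the smart pin documentation and should be treated as an assumption to check; the function name is hypothetical.

```python
def async_x_value(sysclk_hz, baud, bits=32):
    """Model of the X register value for the async serial smart pin modes.

    X[31:16] = whole sysclock periods per bit
    X[15:10] = base-2 fraction of a sysclock period (per the quoted text,
               only relevant when X[31:26] are all zero)
    X[4:0]   = word length minus 1 (ASSUMPTION, verify against the docs)
    """
    if not 1 <= bits <= 32:
        raise ValueError("bits must be 1..32")
    period_16_16 = (sysclk_hz * 65536) // baud      # 16.16 fixed point
    if not 0x10000 <= period_16_16 <= 0xFFFFFFFF:
        raise ValueError("bit period out of range")
    return (period_16_16 & 0xFFFFFC00) | (bits - 1)


if __name__ == "__main__":
    # One sysclock per bit, as asked above (not expected to work in practice).
    print(hex(async_x_value(256_000_000, 256_000_000)))
    # Three sysclocks per bit, roughly the limit mentioned for async receive.
    print(hex(async_x_value(300_000_000, 100_000_000)))
    # Slow test clock, easy to check on a scope.
    print(hex(async_x_value(10_000_000, 115_200, bits=8)))
```

Running the same function with a slow test clock such as 10 MHz makes it easy to compare the computed bit period against what a scope shows.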
(Duplicate post.)
Start and stop bits are added automatically, but what about the parity bit?
I suspect I will have to build this myself.
But that would mean that I can't transfer 32-bit values.
Honestly, using a single parity bit for a 31 or 32 bit transfer doesn’t get you much benefit as it will only detect odd numbers of bit errors and won’t tell you which bits are in error. A better approach might be to run the data through a CRC routine on each end although that will increase the number of longs to send by one.
Chip included the CRCNIB and CRCBIT instructions for that sort of thing.
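To illustrate the "one extra long" cost of the CRC approach, here is a bit-serial CRC modelled in Python. It is a sketch of the general technique, not a model of what CRCNIB/CRCBIT do cycle for cycle; the polynomial (the common reflected CRC-32, 0xEDB88320), the little-endian packing of the longs, and all function names are my assumptions.

```python
import struct


def crc32_bitwise(data: bytes, poly=0xEDB88320) -> int:
    """Reflected CRC-32, processed one bit at a time."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ poly if crc & 1 else crc >> 1
    return crc ^ 0xFFFFFFFF


def _pack(longs):
    """Pack 32-bit values as little-endian bytes."""
    return struct.pack(f"<{len(longs)}I", *longs)


def append_crc(longs):
    """Sender side: return the longs with one CRC long appended."""
    return list(longs) + [crc32_bitwise(_pack(longs))]


def check_packet(packet):
    """Receiver side: True if the trailing CRC long matches the payload."""
    *longs, crc = packet
    return crc32_bitwise(_pack(longs)) == crc


if __name__ == "__main__":
    packet = append_crc(range(30))      # 30 data longs + 1 CRC long
    print(len(packet), check_packet(packet))
    packet[3] ^= 1 << 7                 # flip a single bit in transit
    print(check_packet(packet))
```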
A software solution is not what I am looking for. There is no time for that either.
Correct. Hardware only performs stop bits = 1, parity = none. Any synthesis of extra bits will subtract from the 32 bits.
It depends on the criticality of the data being correct and the action taken on detecting an error.
If you have no time for a CRC then what would you do with a parity bit? Using a traditional UART, if a parity error occurred you would throw that data away and request a resend if there was a return channel and enough time.
In this case, you are proposing no return channel so you would normally use some method of Forward Error Correction, which increases the amount of data to send, but provides a reasonable probability of error detection and correction.
If there’s no time for anything like that then just send the data alone, and hope (pray?) that any bit errors that occur don’t have a critical effect.
@jmg has indicated that you could send 32 longs within your time budget, which would give you data space to carry some form of ECC, so it then comes down to processing time.
In the event of a transmission error, the entire data packet is to be discarded and a stored safety (fallback) data set is used instead.
I just wanted to add up the error bits and discard the packet if the result is <> 0.
A resend is useless because the data is already being processed (emergency data) and the next data packet is already waiting.
I think I will include a simple CRC check routine; for this I will add another I/O pin to the data transfer.
The CRC check will consist of only an XOR instruction. That has to be enough.
What you could do is add another pin-to-pin channel that is the XOR of the other N; that is fast and easy to generate and check (which may be what you meant here?).
You can trial with the extra check channel and see if the XOR (== 'sideways' parity) ever triggers.
Parity was there for long runs of cables, to things like serial printers.
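The extra XOR channel can be modelled in a few lines. A sketch in Python, assuming each lane carries the same number of longs and the check lane carries the XOR of the corresponding longs from all data lanes; the names are made up for illustration.

```python
from functools import reduce
from operator import xor


def check_lane(lanes):
    """XOR the i-th long of every data lane to get the i-th check long."""
    return [reduce(xor, column, 0) for column in zip(*lanes)]


def packet_ok(lanes, check):
    """Receiver side: recompute the check lane and compare."""
    return check_lane(lanes) == check


if __name__ == "__main__":
    # 6 data lanes of 5 longs each, as in the 256 MHz estimate above.
    lanes = [[lane * 0x01010101 + i for i in range(5)] for lane in range(6)]
    check = check_lane(lanes)
    print(packet_ok(lanes, check))

    lanes[2][1] ^= 1 << 9            # single bit error: detected
    print(packet_ok(lanes, check))

    lanes[4][1] ^= 1 << 9            # same bit, same slot, other lane: cancels
    print(packet_ok(lanes, check))
```

The last case shows the limit of this check: like ordinary parity, it misses an even number of errors in the same bit position of the same slot.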
This is how the P2s in the chain should receive and forward their data.
The 1st P2 starts with a cyclic timer and has no RX routines.
The last P2 has no TX routines.