PASM Coding Challenge - 4 wire Prop to Prop communication.

heater · 2009-10-03 10:53

The aim here is to implement bidirectional Prop to Prop communication using the following scheme:

1) Two pairs of signal wires connect the Props, one pair for each direction.
2) Within a pair let the signals be called "zero_bit" and "one_bit"
3) The initial and resting state of both wires is low.
4) A transition, low to high or vice versa, on the zero_bit wire indicates a 0 is transmitted. The actual signal level is unimportant.
5) A transition, low to high or vice versa on the one_bit wire indicates a 1 is transmitted. The actual signal level is unimportant.
6) Assume data is transmitted in 8 bit bytes.
7) At the end of transmitting the 8 bits the wires are set back low, the rest state.
8) The byte is transmitted most significant bit first.
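
The eight rules above can be modelled on a PC before any PASM is written. What follows is only a rough host-side Python sketch, not Propeller code; the function name `wire_levels` and the (one_bit, zero_bit) tuple ordering are made up for illustration.

```python
def wire_levels(byte):
    """Return the (one_bit, zero_bit) wire levels after each of the 8 transitions,
    followed by the return to the rest state."""
    one_bit = zero_bit = 0                  # rule 3: both wires rest low
    levels = []
    for position in range(7, -1, -1):       # rule 8: highest bit first
        if (byte >> position) & 1:
            one_bit ^= 1                    # rule 5: toggle the one_bit wire for a 1
        else:
            zero_bit ^= 1                   # rule 4: toggle the zero_bit wire for a 0
        levels.append((one_bit, zero_bit))
    levels.append((0, 0))                   # rule 7: wires set back low
    return levels


for step, level in enumerate(wire_levels(0b10001011)):
    print(step, level)
```

Note that for some bytes the eighth transition already lands on (0, 0), in which case the final reset changes nothing on the wires.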

So the code to transmit using this scheme might look something like this:

tx_byte                                        'Assume wires are low, resting, before we start
        mov    data_byte, some_byte_to_send    'Get a byte to send
        shl    data_byte, #24                  'Move it to the top of the long
        mov    count, #8                       'Count 8 bits
:loop
        shl    data_byte, #1 wc                'Get the value of the top most bit
if_c    xor    outa, one_bit                   'It's a 1 so transition the "one" wire
if_nc   xor    outa, zero_bit                  'It's a 0 so transition the "zero" wire
        djnz   count, #:loop                   'Round again if need be
        mov    outa, #0                        'Done, set the wires back to resting state.
tx_byte_ret ret

Now that code is a bit jittery in its timing, as the transition times depend on whether the data bit is zero or one. Not to worry for now.

The Tx loop is 4 instructions (16 clocks), so about 5M bits/second on a normal Prop setup at 80 MHz. Unrolling the loop would drop the djnz, giving maybe 6.6M bits/second.
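
The arithmetic behind those figures can be checked with a few lines of Python. This is a sketch under two assumptions that are not stated in the post: an 80 MHz system clock and 4 clocks per instruction (a conditional instruction that is skipped still costs its 4 clocks). `bit_rate` is a hypothetical helper name.

```python
CLOCK_HZ = 80_000_000        # assumed system clock for "a normal prop set up"
CLOCKS_PER_INSTRUCTION = 4   # assumed cost of each instruction in the Tx loop


def bit_rate(instructions_per_bit):
    """Bits per second when each bit costs this many 4-clock instructions."""
    return CLOCK_HZ / (instructions_per_bit * CLOCKS_PER_INSTRUCTION)


print(bit_rate(4))   # looped transmitter: shl, xor, xor, djnz
print(bit_rate(3))   # unrolled transmitter: shl, xor, xor
```

By the same reckoning, any scheme that gets down to two instructions per bit would land at 10M bits/second.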

Now the challenge, if you have not guessed, is the hard part. Create the receiving side of this communication link. As fast as possible. Using whatever pins are most convenient/helpful. Changing the Tx code is allowed to fix the jitter or slow it down if necessary.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
For me, the past is not over yet.

JonnyMac · 2009-10-03 12:41

This is not tested -- it's simply based on conversations I've had with PJMonty vis-a-vis processor-to-processor comms that he uses in his animatronics control projects.

Instead of using one line for zero and the other for one, I use one line as the bit and the other as the bit-ready indicator (state change).

txbyte                  mov     txwork, bytetosend
                        mov     count, #8

:loop                   shr     txwork, #1              wc
                        muxc    outa, txmask
                        xor     outa, readymask                 ' indicate bit ready
                        djnz    count, #:loop
                        andn    outa, txmask

txbyte_ret              ret

rxbyte                  mov     count, #4

:loop                   waitpeq readymask, readymask            ' wait for ready high (odd bit)
                        test    rxmask, ina             wc
                        rcr     rxwork, #1
                        waitpne readymask, readymask            ' wait for ready low (even bit)
                        test    rxmask, ina             wc
                        rcr     rxwork, #1
                        djnz    count, #:loop
                        shr     rxwork, #24                     ' clean-up

rxbyte_ret              ret

Jeepers... it's 5:40 AM on a Saturday and I'm up and writing code (I hope it's intelligible).

Post Edited (JonnyMac) : 10/3/2009 12:47:40 PM GMT

kuroneko · 2009-10-03 12:48

Side challenge: Make the transmitter run with 1 bit/instruction (using the original specification) without using self-modifying code (as this would be detrimental to the overall transfer rate).

heater · 2009-10-03 15:01

JonnyMac. Nice, but sadly rejected. The challenge is as it is stated, a zero line and a one line. There are reasons for this.

Kuroneko. Let's see....

JonnyMac · 2009-10-03 15:08

Hmmm... what part of "Changing the Tx code is allowed..." did I misunderstand? And if it is a Propeller-to-Propeller scheme then why would you reject, out-of-hand, a potential answer? Now, if there are mitigating factors (e.g., external hardware you cannot change) you should state them; otherwise your request (and calling it a "challenge") is disingenuous, maybe even deceitful.

Post Edited (JonnyMac) : 10/3/2009 3:33:07 PM GMT

Beau Schwabe · 2009-10-03 15:19

heater,

"The aim here is to implement bidirectional Prop to Prop communication" - can this be simultaneous bi-directional data transfer? and is there a COG restriction or limit?

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Beau Schwabe

IC Layout Engineer
Parallax, Inc.

heater · 2009-10-03 15:41

"Simultaneous bidirectional". Yes. Full duplex.
There is no COG restriction or limit. I was only thinking as far as one for Tx and one for Rx.
If it can be done with a single COG performing both tx and rx that would be cool even with a huge speed hit.

heater · 2009-10-03 16:14

JonnyMac: OK, you are right I should explain more of my thinking here and where this all comes from. I will, but first:

The phrase "Changing the Tx code is allowed..." was intended to mean that it could be rewritten in any way, to speed it up or slow it down or fix the jittery bit edge timing, whatever, BUT that it should implement the 4 wire scheme/protocol as specified. Sorry if that was not so clear.

Yes, I am looking at this as a Propeller to Propeller scheme but also looking toward a particular external hardware.

So now, more explanation: There is a certain multi-core embedded processor chip design that implements communication channels between devices using this protocol. So my thinking here is motivated by a few things:

1) Academic - Why did they settle on that particular method? I'm something of an ignoramus here so I'm curious. What benefits does it have over other methods?

2) Practical - Would it be beneficial in any way for Prop to Prop comms? Is it implementable in a way that makes use of the benefits?

3) Optimistic - The very unlikely and distant possibility of getting a Prop to interoperate with said devices using this protocol. Not that I am sure that this is a desirable way to get interoperation at this stage.

I dare not name said devices here as they can be seen as a competing product to the Parallax Propeller (which I don't necessarily agree with). Those who follow such things know it already. That's why my initial challenge post may be a bit more mysterious than I intended.

Phil Pilgrim (PhiPi) · 2009-10-03 16:24

Heater,

Is it okay to assume that the receiving pins are A1 and A0, say, or does it have to be pin-independent?

-Phil

heater · 2009-10-03 16:38

Use whichever pins make life easy :)

Beau Schwabe · 2009-10-03 17:12

heater,

There was an earlier thread that talked about using 4-wires (unidirectional) where each wire represented a symbol...

Wire1 = 00
Wire2 = 01
Wire3 = 10
Wire4 = 11

...In this implementation the wire would transition based on the bit pattern (symbol) of data that needed to be sent.

for example.... sending 10001011 would require a transition on W3, followed by W1, W3 again, and W4. So in total 4 'output cycles' on the transmission line. Using the same 4 wires, if you simply output a 'nibble' at a time, the number of 'output cycles' would be reduced to 2 instead of 4. .... '1000' followed by '1011'

What I'm getting at, is that you 'see' the same sort of thing with your two wire symbolic method. Basically your 'symbols' are 1 and 0 and split among two wires.

Wire1 = 0
Wire2 = 1

It would take 8 'output cycles' to transmit 10001011, whereas if you sent the data in half nibbles you could reduce the number of 'output cycles' down to 4 ... 10, 00, 10, 11

If you are going for overall speed then this probably isn't the best method to use.
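
The cycle counts in this comparison can be reproduced with a short Python sketch. It is only a counting model with hypothetical helper names; it assumes the wire count is a power of two and that the byte length divides evenly into symbols.

```python
def cycles_one_wire_per_symbol(bit_count, wires):
    """One transition per symbol; with `wires` symbols each carries log2(wires) bits."""
    bits_per_symbol = wires.bit_length() - 1     # exact log2 for a power of two
    return bit_count // bits_per_symbol


def cycles_parallel(bit_count, wires):
    """Every wire carries one bit per output cycle."""
    return bit_count // wires


def symbols(value, bits_per_symbol, width=8):
    """Split value into symbols, most significant first."""
    mask = (1 << bits_per_symbol) - 1
    return [(value >> shift) & mask
            for shift in range(width - bits_per_symbol, -1, -bits_per_symbol)]


data = 0b10001011
print(["W%d" % (s + 1) for s in symbols(data, 2)])       # wires toggled, 4-wire scheme
print(cycles_one_wire_per_symbol(8, 4), cycles_parallel(8, 4))
print(cycles_one_wire_per_symbol(8, 2), cycles_parallel(8, 2))
```

The count ignores everything except the number of output operations, so it says nothing about clock recovery, which is the thing the transition scheme buys.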

Cluso99 · 2009-10-03 23:42

I like Beau's original 2 wire method. I figure that we can leave the code running at full speed and just pass data in (expecting 32 bits) and have it pop out.
For more than 2 props an extra pin could be added, so that each prop had its own tx pin and multiple rx pins if speed is such an issue.

However, when things settle, I am going to use Beau's method, together with some other methods suggested using the counters, to just continually shift data in and out. I see this as being fast and optimal.

I see optimal designs using multiple props. My RamBlade (release soon now) will be a single prop with ram & sd and use a 2 in ultra high speed serial interface based on Beau's method. It is a cheap and minimal system which is an easy module to add. Another module with prop needs to be the display(s) and inputs (keyboard, mouse, etc). The third module is a totally flexible uncommitted I/O board. Sound familiar??? It is the TriBlade on separate pcbs

And I do not have to worry about other processor code.

Note that the prop is now cheap enough, and maybe we can look forward to another (smaller) price drop now another batch of chips has been produced. IMHO the only thing we are missing is pins. If we had the pins, memory is NOT an issue. Every day we see another piece of brilliant code being added. We are now doing things that I am sure were not even thought of 1 year ago, and there is plenty more that can be done.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Links to other interesting threads:

· Home of the MultiBladeProps: TriBlade, RamBlade, RetroBlade, TwinBlade, SixBlade, website
· Single Board Computer: 3 Propeller ICs and a TriBladeProp board (ZiCog Z80 Emulator)
· Prop Tools under Development or Completed (Index)
· Emulators: Micros eg Altair, and Terminals eg VT100 (Index) ZiCog (Z80), MoCog (6809)
· Search the Propeller forums (uses advanced Google search)
My cruising website is: www.bluemagic.biz   MultiBladeProp is: www.bluemagic.biz/cluso.htm

Phil Pilgrim (PhiPi) · 2009-10-04 06:17

Here is a receiver which, I believe, adheres to the spirit, if not the letter, of the original challenge:

rcv           waitpeq   p00,pins

w00_0         waitpne   p00,pins
              test      pin1,ina wc
              rcl       data,#1
        if_c  jmp       #w10_1

w01_1         waitpne   p01,pins
              test      pin1,ina wc
              rcl       data,#1
        if_c  jmp       #w11_2

w00_2         waitpne   p00,pins
              test      pin1,ina wc
              rcl       data,#1
        if_c  jmp       #w10_3

w01_3         waitpne   p01,pins
              test      pin1,ina wc
              rcl       data,#1
        if_c  jmp       #w11_4

w00_4         waitpne   p00,pins
              test      pin1,ina wc
              rcl       data,#1
        if_c  jmp       #w10_5

w01_5         waitpne   p01,pins
              test      pin1,ina wc
              rcl       data,#1
        if_c  jmp       #w11_6

w00_6         waitpne   p00,pins
              test      pin1,ina wc
              rcl       data,#1
        if_c  jmp       #w10_7

w01_7         waitpne   p01,pins
              test      pin1,ina wc
              rcl       data,#1
              jmp       rcv_ret

w10_1         waitpne   p10,pins
              andn      pin1,ina wc,nr
              rcl       data,#1
        if_c  jmp       #w00_2

w11_2         waitpne   p11,pins
              andn      pin1,ina wc,nr
              rcl       data,#1
        if_c  jmp       #w01_3

w10_3         waitpne   p10,pins
              andn      pin1,ina wc,nr
              rcl       data,#1
        if_c  jmp       #w00_4

w11_4         waitpne   p11,pins
              andn      pin1,ina wc,nr
              rcl       data,#1
        if_c  jmp       #w01_5

w10_5         waitpne   p10,pins
              andn      pin1,ina wc,nr
              rcl       data,#1
        if_c  jmp       #w00_6

w11_6         waitpne   p11,pins
              andn      pin1,ina wc,nr
              rcl       data,#1
        if_c  jmp       #w01_7

w10_7         waitpne   p10,pins
              andn      pin1,ina wc,nr
              rcl       data,#1
rcv_ret       ret

pins
p11           long      1<<one_pin|1<<zero_pin
p00           long      0
p01           long      1<<zero_pin
pin1
p10           long      1<<one_pin

data          res       1

zero_pin and one_pin are constants that represent the pin numbers of the "0" and "1" pins, respectively. rcv is the subroutine that gets the next byte. Basically, this is an unrolled state machine whose transition diagram is a lattice. The current state is like a ball in a two-column Pachinko machine, dropping straight down on an incoming zero, and pinging to the adjacent column on a "1". The bits are collected in data.

Timing-wise, this will not quite hit the 5MHz mark, due to the extra clock states in the waitpne instructions. But it comes close!
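
For anyone who wants to trace the lattice by hand, here is a rough Python model of the state machine. It is a host-side sketch only, not PASM; it borrows the wXY_n label scheme from the listing (X is the level of the "1" pin, Y the level of the "0" pin, n the index of the bit being waited for), and `lattice_receive` is a made-up name.

```python
def lattice_receive(samples):
    """Model of the unrolled receiver.

    samples: the (one_pin, zero_pin) levels seen after each of the 8 transitions.
    Returns the received byte and the list of state labels visited.
    """
    one_level, zero_level = 0, 0                 # rest state, label w00_0
    path = ["w00_0"]
    data = 0
    for n, (one_now, zero_now) in enumerate(samples, start=1):
        bit = 1 if one_now != one_level else 0   # the "1" wire moved, so a 1 arrived
        data = (data << 1) | bit                 # same job as rcl data,#1
        one_level, zero_level = one_now, zero_now
        if n < 8:                                # after the 8th bit the routine returns
            path.append("w%d%d_%d" % (one_level, zero_level, n))
    return data, path


# 10001011 sent most significant bit first, as (one_pin, zero_pin) after each edge
samples = [(1, 0), (1, 1), (1, 0), (1, 1), (0, 1), (0, 0), (1, 0), (0, 0)]
byte, path = lattice_receive(samples)
print(format(byte, "08b"))
print(" -> ".join(path))
```

Every label the model visits for this byte exists in the listing, which is a cheap way to convince yourself the jump targets pair up correctly.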

-Phil

Edit: ROLs changed to RCLs.

Post Edited (Phil Pilgrim (PhiPi)) : 10/4/2009 6:45:53 AM GMT

kuroneko · 2009-10-04 06:35

@Phil: Don't you mean rcl for bit collection?

Post Edited (kuroneko) : 10/4/2009 6:53:16 AM GMT

Beau Schwabe · 2009-10-04 06:37

Phil,

Very nice!

Since the Transmitter ends with 'both' pins LOW, couldn't you assume that it would also start with both pins LOW so that the Receiver would look for this condition to be false as a SYNC?

This way you wouldn't need to use the waitpne for every bit... True the Transmitter creates some jitter in the way that the data is sent, but it would be 'predictable' on the receiver end depending on if the bit sent from the transmitter was detected as a "1" or a "0" on the receiver.

Edit:
After the first waitpne detecting that the pins have changed, "another" redundant waitpne should place any further pin reads during the time frame that the Transmitter is still in the djnz count, #:loop loop or the shl data_byte, #1 wc which should be a stable position to read the data.

Post Edited (Beau Schwabe (Parallax)) : 10/4/2009 7:06:51 AM GMT

Phil Pilgrim (PhiPi) · 2009-10-04 06:58

kuroneko,

Yes, indeed that's what I meant. Amazing how one bad gene can spoil the whole family tree when replicated!

Beau,

You're right, assuming both Props are running at the same clock speed. I could add a nop or two after the first transition to make sure every sample comes after the pin change, then replace all the remaining waitpnes with nops. Moreover, if the transmit loop were unrolled, I could just eliminate the nops altogether, resulting in a 6.6MHz bit rate.

-Phil

Post Edited (Phil Pilgrim (PhiPi)) : 10/4/2009 7:05:28 AM GMT

Beau Schwabe · 2009-10-04 07:07

Phil,

bump! - see my Edit just above your post

Phil Pilgrim (PhiPi) · 2009-10-04 07:28

I think this will work:

xmt           shl       data,#24

              shl       data,#1 wc
        if_c  xor       outa,xpin1
        if_nc xor       outa,xpin0

              shl       data,#1 wc
        if_c  xor       outa,xpin1
        if_nc xor       outa,xpin0

              shl       data,#1 wc
        if_c  xor       outa,xpin1
        if_nc xor       outa,xpin0

              shl       data,#1 wc
        if_c  xor       outa,xpin1
        if_nc xor       outa,xpin0

              shl       data,#1 wc
        if_c  xor       outa,xpin1
        if_nc xor       outa,xpin0

              shl       data,#1 wc
        if_c  xor       outa,xpin1
        if_nc xor       outa,xpin0

              shl       data,#1 wc
        if_c  xor       outa,xpin1
        if_nc xor       outa,xpin0

              shl       data,#1 wc
        if_c  xor       outa,xpin1
        if_nc xor       outa,xpin0

xmt_ret       ret
              
rcv           waitpeq   zero,rpins

              waitpne   zero,rpins
              nop                        'Necessary?

w00_0         test      rpin1,ina wc
              rcl       data,#1
        if_c  jmp       #w10_1

w01_1         test      rpin1,ina wc
              rcl       data,#1
        if_c  jmp       #w11_2

w00_2         test      rpin1,ina wc
              rcl       data,#1
        if_c  jmp       #w10_3

w01_3         test      rpin1,ina wc
              rcl       data,#1
        if_c  jmp       #w11_4

w00_4         test      rpin1,ina wc
              rcl       data,#1
        if_c  jmp       #w10_5

w01_5         test      rpin1,ina wc
              rcl       data,#1
        if_c  jmp       #w11_6

w00_6         test      rpin1,ina wc
              rcl       data,#1
        if_c  jmp       #w10_7

w01_7         test      rpin1,ina wc
              rcl       data,#1
              jmp       rcv_ret

w10_1         andn      rpin1,ina wc,nr
              rcl       data,#1
        if_c  jmp       #w00_2

w11_2         andn      rpin1,ina wc,nr
              rcl       data,#1
        if_c  jmp       #w01_3

w10_3         andn      rpin1,ina wc,nr
              rcl       data,#1
        if_c  jmp       #w00_4

w11_4         andn      rpin1,ina wc,nr
              rcl       data,#1
        if_c  jmp       #w01_5

w10_5         andn      rpin1,ina wc,nr
              rcl       data,#1
        if_c  jmp       #w00_6

w11_6         andn      rpin1,ina wc,nr
              rcl       data,#1
        if_c  jmp       #w01_7

w10_7         andn      rpin1,ina wc,nr
              rcl       data,#1
rcv_ret       ret

zero          long      0

rpins         long      1<<r_one_pin|1<<r_zero_pin
rpin1         long      1<<r_one_pin

xpin0         long      1<<x_zero_pin
xpin1         long      1<<x_one_pin

data          res       1

The nop may or may not be necessary, depending on how soon after the pin change the waitpne resumes. Has this timing ever been established?

All of this begs the question, though: if the transmitter and receiver are this well synchronized, why use two pins? Simple NRZ with start and stop bits would work with one pin at 10MHz.

-Phil

heater · 2009-10-04 07:39

Gosh, all this already.
I'm just off to work again, hope to have time to study this later.

Beau: As stated in the first post "The initial and resting state of both wires is low". We can also assume this rest state lasts for a couple of bit times or a few if needed.

We should assume the Tx and Rx clocks are not actually synchronized, just derived from separate, normally accurate, crystal sources.

ericball · 2009-10-05 20:09

@PhiPi - great job. The nop shouldn't be necessary as you're trying to read INA asap. Some observations:
1. End w01_7 with "JMP rcv_ret" (no immediate flag)
2. The code works great if the sender & receiver are running from the same clock. But the whole reason for this weird protocol is it's used by some external device which may not share the same clock rate. Therefore, heater may want to use the waitpne version. Lower maximum bit rate, but it will handle slower bit rates without tweaking. (Although the transmitter would need to be tweaked.)
3. One thing which hasn't been defined is the minimum "rest" time. It will need to be at least 27 cycles (WRBYTE + WAITPEQ).

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Composite NTSC sprite driver: Forum
NTSC & PAL driver templates: ObEx Forum
OnePinTVText driver: ObEx Forum

kuroneko · 2009-12-14 05:27

[POC] 10Mbaud link @80MHz

2 cog transmitter (8 cycles/bit)
single cog receiver (hard-wired to 8 cycles/bit)

heater · 2009-12-14 07:13

I love all the ideas that come up for a problem like this.

Phil's original receiver with a waitpne for each bit would probably work in ideal circumstances but suffers from one problem: if the rx and tx are not "in sync" then it can never get back into sync. I.e. if transmission is interrupted half way through a byte then you can't get the receiver back to looking for the beginning of the next byte. Well, not the easy way by just resting the line long enough.

Beau is right that the rest state of the link is both lines low, so it's good to wait for pins not zero to get the first bit of a byte transmission and then just sample the rest of the bits at the right moment, like a regular UART after a start bit.

This brings us to Phil's second solution, which does exactly that: wait for "non-rest" and then sample the rest of the bits without waiting. It's neat but has one problem: the transmitter has very jittery timing on the edges. This may or may not matter when connected to something on a different clock with hardware tx and rx of this protocol.

By the way Phil asked: "if the transmitter and receiver are this well synchronized, why use two pins? Simple NRZ with start and stop bits would work with one pin at 10MHz"

If you have not guessed from my previous comments this is for connecting the Prop to chips that implement this protocol in hardware. I and others want to use them as co-processors for the Prop for say Ethernet and/or USB etc. This is the "native" way to connect to them that saves resources at the other end.

ericball: Yes the minimum rest time is not defined. Turns out we have some flexibility there. It can be configured to be very long.

With further research I find I need to send 9 bits not 8, the 9th bit is a data/control flag. There is a tenth bit transition to get both lines back to the rest state. When receiving, having the lines not at rest at the end indicates a parity error.
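
The reason a single tenth transition is always enough can be seen in a small Python model. This is a host-side sketch of the description above, not the target hardware's specification: the bit order (data bit 7 down to bit 0, then the control flag) follows the transmitter in this post, `send_token` and `receive_token` are hypothetical names, and `PARITY_ERROR` merely stands in for the minus_one value the receiver in this post uses.

```python
def send_token(token):
    """token is 9 bits: bit 8 is the data/control flag, bits 7..0 are the data.
    Returns the (one_wire, zero_wire) levels after each of the 10 transitions."""
    bits = [(token >> n) & 1 for n in range(7, -1, -1)]   # data, bit 7 first
    bits.append((token >> 8) & 1)                         # then the control flag
    one_wire = zero_wire = 0
    levels = []
    for bit in bits:
        if bit:
            one_wire ^= 1
        else:
            zero_wire ^= 1
        levels.append((one_wire, zero_wire))
    # Nine transitions is an odd number, so exactly one wire is high here
    # and a single tenth transition on that wire restores the rest state.
    assert one_wire != zero_wire
    levels.append((0, 0))
    return levels


PARITY_ERROR = -1    # illustrative stand-in for the receiver's error value


def receive_token(levels):
    """Rebuild the 9-bit token from which wire changed; flag a bad ending."""
    if levels[9] != (0, 0):
        return PARITY_ERROR
    previous_one = 0
    bits = []
    for one_wire, _zero_wire in levels[:9]:
        bits.append(1 if one_wire != previous_one else 0)
        previous_one = one_wire
    data = 0
    for bit in bits[:8]:
        data = (data << 1) | bit
    return (bits[8] << 8) | data


print(send_token(0b1_10001011))
print(receive_token(send_token(0b1_10001011)) == 0b1_10001011)
```

With only 8 bits per byte the wires could end with both high or both low, which is presumably part of why the ninth bit makes the framing tidier.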

So my jitter free transmitter at only 5Mb/s looks like:

token_tx                test    ttoken, #%0_10000000 wz 'Test data bit 7 (NZ if it is a 1)
        if_nz           mov     out_1, wire_1           'To toggle wire 1...
        if_z            mov     out_1, wire_0           '...or wire 0.
                        xor     outa, out_1             'Toggle selected wire

                        test    ttoken, #%0_01000000 wz 'Test data bit 6 (NZ if it is a 1)
        if_nz           mov     out_1, wire_1           'To toggle wire 1...
        if_z            mov     out_1, wire_0           '...or wire 0
                        xor     outa, out_1             'Toggle selected wire

                        test    ttoken, #%0_00100000 wz 'Test data bit 5 (NZ if it is a 1)
        if_nz           mov     out_1, wire_1           'To toggle wire 1...
        if_z            mov     out_1, wire_0           '...or wire 0
                        xor     outa, out_1             'Toggle selected wire

                        test    ttoken, #%0_00010000 wz 'Test data bit 4 (NZ if it is a 1)
        if_nz           mov     out_1, wire_1           'To toggle wire 1...
        if_z            mov     out_1, wire_0           '...or wire 0
                        xor     outa, out_1             'Toggle selected wire

                        test    ttoken, #%0_00001000 wz 'Test data bit 3 (NZ if it is a 1)
        if_nz           mov     out_1, wire_1           'To toggle wire 1...
        if_z            mov     out_1, wire_0           '...or wire 0
                        xor     outa, out_1             'Toggle selected wire

                        test    ttoken, #%0_00000100 wz 'Test data bit 2 (NZ if it is a 1)
        if_nz           mov     out_1, wire_1           'To toggle wire 1...
        if_z            mov     out_1, wire_0           '...or wire 0
                        xor     outa, out_1             'Toggle selected wire

                        test    ttoken, #%0_00000010 wz 'Test data bit 1 (NZ if it is a 1)
        if_nz           mov     out_1, wire_1           'To toggle wire 1...
        if_z            mov     out_1, wire_0           '...or wire 0
                        xor     outa, out_1             'Toggle selected wire

                        test    ttoken, #%0_00000001 wz 'Test data bit 0 (NZ if it is a 1)
        if_nz           mov     out_1, wire_1           'To toggle wire 1...
        if_z            mov     out_1, wire_0           '...or wire 0
                        xor     outa, out_1             'Toggle selected wire

                        test    ttoken, #%1_00000000 wz 'Test "control" indicator (NZ if set)
        if_nz           mov     out_1, wire_1           'To toggle wire 1...
        if_z            mov     out_1, wire_0           '...or wire 0
                        xor     outa, out_1             'Toggle selected wire

                        nop                             'Bit time delay before coming to rest.
                        nop
                        nop
                        mov     outa, #%00              'Set wires to rest (0,0) state.
token_tx_ret            ret

The 5Mb/s code to receive looks like:

token_rx                mov     rtoken, #0

                        waitpne rest_state, pins_rx     'Wait for wires to leave the rest state
                        mov     in_0, ina               'Get wires from P0 and P1
                        test    in_0, #%1000 wz         'Test for wire 1 change
        if_nz           or      rtoken, #%0_10000000    'Wire 1 set so set data bit 7

                        nop                             'Delay to get subsequent samples nearer the
                        nop                             'middle of bit periods.

                        mov     in_1, ina               'Get wires from P0 and P1
                        xor     in_0, in_1              'Get wire changes
                        test    in_0, #%1000 wz         'Test for wire 1 change
        if_nz           or      rtoken, #%0_01000000    'Wire 1 change so set data bit 6

                        mov     in_0, ina               'Get wires from P0 and P1
                        xor     in_1, in_0              'Get wire changes
                        test    in_1, #%1000 wz         'Test for wire 1 change
        if_nz           or      rtoken, #%0_00100000    'Wire 1 change so set data bit 5

                        mov     in_1, ina               'Get wires from P0 and P1
                        xor     in_0, in_1              'Get wire changes
                        test    in_0, #%1000 wz         'Test for wire 1 change
        if_nz           or      rtoken, #%0_00010000    'Wire 1 change so set data bit 4

                        mov     in_0, ina               'Get wires from P0 and P1
                        xor     in_1, in_0              'Get wire changes
                        test    in_1, #%1000 wz         'Test for wire 1 change
        if_nz           or      rtoken, #%0_00001000    'Wire 1 change so set data bit 3

                        mov     in_1, ina               'Get wires from P0 and P1
                        xor     in_0, in_1              'Get wire changes
                        test    in_0, #%1000 wz         'Test for wire 1 change
        if_nz           or      rtoken, #%0_00000100    'Wire 1 change so set data bit 2

                        mov     in_0, ina               'Get wires from P0 and P1
                        xor     in_1, in_0              'Get wire changes
                        test    in_1, #%1000 wz         'Test for wire 1 change
        if_nz           or      rtoken, #%0_00000010    'Wire 1 change so set data bit 1

                        mov     in_1, ina               'Get wires from P0 and P1
                        xor     in_0, in_1              'Get wire changes
                        test    in_0, #%1000 wz         'Test for wire 1 change
        if_nz           or      rtoken, #%0_00000001    'Wire 1 change so set data bit 0

                        mov     in_0, ina               'Get wires from P0 and P1
                        xor     in_1, in_0              'Get wire changes
                        test    in_1, #%1000 wz, wc     'Test for wire 1 change, C = Parity
        if_nz           or      rtoken, #%1_00000000    'Wire 1 change so set "control" bit

                        mov     in_1, ina
                        test    in_1, pins_rx wz        'Check wires for rest state (0,0) at end  of token
        if_nz           mov     rtoken, minus_one       'Indicate parity error if need be.

Sorry about the hard wired pin immediates.

This does work Prop to Prop, no alien hardware to test against yet.

Kuroneko: Thanks, I'll take a look when I have a mo.

heater · 2009-12-14 09:02

Kuroneko: That solution looks totally brilliant.

Any chance you could stretch it to the 9 bits + "rest" I mentioned above :)
I'd like to try it against real target hardware one day.

The actual 4 wire protocol is described in "XS1-L System Specification" here:
www.xmos.com/support/documentation

Andrey Demenev · 2009-12-14 11:42

10 Mbit/s TX using just one cog, at cost of code size. It is not complete, but you can get the idea - build a table of 256 patterns, use data to TX as index to that table, and then xor and shift pattern 8 times.

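send
            and data, #$FF             ' keep only the low 8 bits
            add data, #bit_table       ' cog address of the table entry for this byte
            movs pattern, data         ' patch that address into the source field below
            nop                        ' one instruction must separate movs from the patched one
pattern    mov ptn, bit_table + 0      ' load the toggle pattern (source patched by movs)
            xor OUTA, ptn              ' first bit (MSB): toggle the wire selected by the low pair
            ror ptn, #2                ' bring the next 2-bit pair down to bits 1:0
            xor OUTA, ptn
            ror ptn, #2
            xor OUTA, ptn
            ror ptn, #2
            xor OUTA, ptn
            ror ptn, #2
            xor OUTA, ptn
            ror ptn, #2
            xor OUTA, ptn
            ror ptn, #2
            xor OUTA, ptn
            ror ptn, #2
            xor OUTA, ptn              ' eighth bit (LSB)
            nop                    ' just in case,
            andn OUTA, bit_pattern    ' if bus should be set to IDLE state
send_ret    ret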

bit_pattern long %0011

bit_table
            long %1010101010101010
            long %0110101010101010
            long %1001101010101010
' cut here - total 256 entries
            long %1001100101010101
            long %0101100101010101
            long %1010010101010101
            long %0110010101010101
            long %1001010101010101
            long %0101010101010101

ptn            res 1
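
The 256 entries are tedious to type by hand. Judging from the entries shown above, each data bit becomes a 2-bit pair (%01 for a 1, %10 for a 0), and the first bit to transmit (the MSB) sits in bits 1:0 of the long, since the pattern is rotated right by 2 after each xor. Under that reading, a small host-side Python script could generate the table. This is only a sketch: `pattern_for` and `emit_table` are made-up names, and the layout is inferred from the nine entries in the post, not confirmed by Andrey.

```python
def pattern_for(byte):
    """Toggle pattern for one byte, inferred from the table entries in the post."""
    ptn = 0
    for i in range(8):                    # i = transmit order, 0 is sent first
        bit = (byte >> (7 - i)) & 1       # the MSB goes out first
        pair = 0b01 if bit else 0b10      # %01 toggles the pin at bit 0, %10 the pin at bit 1
        ptn |= pair << (2 * i)            # the first pair sits in bits 1:0
    return ptn


def emit_table():
    """Return the table as PASM source lines, one long per byte value."""
    return ["            long %{:016b}".format(pattern_for(b)) for b in range(256)]


if __name__ == "__main__":
    print("bit_table")
    print("\n".join(emit_table()))
```

The generated lines for byte values 0, 1, 2 and 250 to 255 can be compared against the entries in the post before pasting the full table into the object.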

heater · 2009-12-14 13:43

Andrey: That is so elegantly brilliant I don't know what to say!

It's a nice mirror image of kuroneko's equally elegant and brilliant Rx solution at 10 Mbit/s.

kuroneko, in your solution you mention the overheads of the table lookup after a byte reception. It does not look like too much though. There are a few other things that need doing at that point to satisfy the higher level protocol requirements. When talking to the target hardware it is possible to configure it to put quite large delays between bytes.

Thing is, I was very keen on not using more than two COGs to do this duplex link. COGS are precious.

Looks like I have my work cut out now to test this Prop to Prop.

So Andrey and kuroneko, it looks like you are both sharing the prize in this challenge. Two instructions per bit must be the limit. What prize you ask? ..err..um..I'll think of something.

Is it OK with you two if I use that code in my final object for this protocol?
Can be MIT license or whatever if you like.

kuroneko · 2009-12-14 14:04

heater said...
Andrey: That is so elegantly brilliant I don't know what to say!

Good thing is that Andrey's solution should manage 20Mbaud (ror outa, #n). Unfortunately that's TX only :(

heater said...
kuroneko, in your solution you mention the overheads of the table lookup after a byte reception. It does not look like too much though.

I get a bit nervous when the pre/post processing takes up nearly as much time as the actual transfer :) Just thought I'd mention it in a comment.

heater said...
Is it OK with you two if I use those codes in my final object for this protocol?

No problems from my side. Thanks.

heater · 2009-12-14 14:45

The Rx post processing is a bit tricky. Some control "tokens" that arrive should really be acted on as soon as possible; they should not go into any FIFO or such, and they never go up to the application level. Ideally no further tokens should arrive whilst the last control token is being acted upon, and we can't act on it without decoding it first.

A big thank you for your contribution.

Andrey Demenev · 2009-12-14 14:59

kuroneko said...
heater said...
Is it OK with you two if I use those codes in my final object for this protocol?

No problems from my side. Thanks.

No prob.

heater · 2009-12-16 08:50

A little problem.

The "tokens" exchanged with the target hardware are actually 9 bits long. That means the table used in Andrey Demenev's 10Mbit/s Tx solution is not going to fit in COG.

Maybe the same with the 10Mbit/s receiver, I'll have to check.

Could put the tables in HUB at risk of slowing things a bit. Still that's a lot of HUB space to chew up. Must be able to squish them into 256 LONGs.
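
One possible way to stay within 256 longs (an assumption, not something tested in this thread): a 9-bit index would need 512 entries, more than the 496 general-purpose longs of a COG, but each 8-bit pattern only uses 16 of the 32 bits in its long. The pair for the ninth bit could therefore be ORed in at bits 17:16 on the fly. The Python model below illustrates the idea. `pattern_for` follows the pair layout inferred from Andrey's table, `token_pattern` and `wire_toggles` are hypothetical names, and the ninth bit is assumed to be 1 for a control token.

```python
def pattern_for(byte):
    """8-bit toggle pattern: %01 for a 1, %10 for a 0, first bit sent in bits 1:0."""
    ptn = 0
    for i in range(8):
        bit = (byte >> (7 - i)) & 1
        ptn |= (0b01 if bit else 0b10) << (2 * i)
    return ptn


def token_pattern(byte, control):
    """Extend the table entry with a ninth pair instead of using a 512-entry table."""
    ninth_pair = 0b01 if control else 0b10    # assumption: control tokens send a 1
    return pattern_for(byte) | (ninth_pair << 16)


def wire_toggles(ptn, nbits):
    """Replay the xor/ror sequence: the 2-bit toggle applied at each step."""
    return [(ptn >> (2 * i)) & 0b11 for i in range(nbits)]
```

In PASM this would presumably cost one extra `or` on the loaded pattern plus one more xor/ror step per token, with the table itself unchanged.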

kuroneko · 2009-12-16 08:54

heater said...
The "tokens" exchanged with the target hardware are actually 9 bits long.

Is there a quick explanation as to what the 9th bit is? Maybe we can deal with it on-the-fly?

Post Edited (kuroneko) : 12/16/2009 9:01:40 AM GMT

heater · 2009-12-16 09:33

Basically the xlink protocol uses 9 bit "tokens". The first eight bits are data, transmitted most significant bit first. The ninth bit indicates whether the 8 bits carry a data byte or a control byte value. Control bytes are things such as HELLO tokens to initiate a link, and RESET tokens. These commands should be acted on without them entering the Rx FIFO. Also CREDIT tokens tell how many bytes the other end has free in its Rx FIFO. There is a 10th bit time in the transmission, used to set both lines back to zero.
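
As a rough illustration of that token layout, here is a host-side Python model of one direction of the link. It is a sketch under assumptions, not the XMOS specification: the ninth bit is taken to be 1 for control tokens, the wires are modelled as (zero_wire, one_wire) level pairs sampled once per bit time, and `encode_token` and `decode_token` are made-up names. One thing the model makes visible: nine transitions split over two wires always leave exactly one wire high, so the tenth bit time needs only a single transition to return to rest, and a receiver can treat "not at rest after ten bit times" as a parity error.

```python
def encode_token(value, control):
    """Return the (zero_wire, one_wire) levels after each of the 10 bit times."""
    bits = [(value >> (7 - i)) & 1 for i in range(8)]   # data, MSB first
    bits.append(1 if control else 0)                    # ninth bit: control flag (assumed polarity)
    zero = one = 0                                      # both wires rest low
    levels = []
    for b in bits:
        if b:
            one ^= 1        # a 1 is a transition on the one wire
        else:
            zero ^= 1       # a 0 is a transition on the zero wire
        levels.append((zero, one))
    # Nine toggles across two wires leave exactly one wire high,
    # so a single tenth transition returns the pair to rest.
    assert zero + one == 1
    levels.append((0, 0))
    return levels


def decode_token(levels):
    """Recover (value, control) from 10 sampled level pairs; raise on a bad token."""
    if len(levels) != 10:
        raise ValueError("expected 10 bit times")
    prev = (0, 0)
    token = 0
    for cur in levels[:9]:
        changed_zero = cur[0] ^ prev[0]
        changed_one = cur[1] ^ prev[1]
        if changed_zero + changed_one != 1:
            raise ValueError("expected exactly one wire to change")
        token = (token << 1) | changed_one
        prev = cur
    if levels[9] != (0, 0):
        raise ValueError("wires not at rest after token")
    return token >> 1, bool(token & 1)
```

For example, `decode_token(encode_token(0xA5, False))` gives back `(0xA5, False)`, and a token whose last level pair is corrupted raises instead of returning data.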
