Full Duplex Serial: Is it really full duplex? - Page 2 — Parallax Forums

Comments

  • Phil Pilgrim (PhiPi) Posts: 23,514
    edited 2015-08-27 23:47
    heater wrote:
    Of course in all synchronous schemes the transmitter does supply the clock, perhaps in the same line as the data.
    Sometimes the receiver provides the clock, such as when a micro is reading from an SPI peripheral. There might even be synchronous networking schemes where the clock is provided by an external master.
    ... We will guarantee to give you edges often enough to make that possible.
    Actually, "You get an edge for every bit."

    -Phil
  • Heater. Posts: 21,230
    Phil,

    Quite so, the receiver can supply the clock. Or a third party. It's all the same. Both ends are synchronized to the clock.

    I'm not sure what you mean by "You get an edge for every bit." If that were always true there would be no need for bit stuffing.
  • Peter Jakacki Posts: 10,193
    edited 2015-08-28 01:22
    I have written many special purpose serial drivers for the Prop and practically all of them do multiple data bits which can encompass 9-bit address mode, parity and detect framing errors. In fact the framing errors that are contiguous nulls are counted up to detect breaks which I use over serial links including RS485 and RF for resetting the system. Even Tachyon does this.

    The other thing is that at higher speeds it is not good enough just to use half the bit time from detecting the start bit edge to sampling the "middle" of the bit as there are overheads so my start bit sampling time is calculated with this offset and used as a constant. But since practically all my drivers dedicate a cog to receive (even if it is half or auto duplex), I know that my edge detection is very accurate as it either uses a WAITPNE or a tight loop.

    Also once the stop bit is sampled and the data finalized I make sure the data line is high before I test for a start bit, just in case there was a framing error or it happens that the serial cog is started while receive data is active etc.

    There is also another enhancement that I could use but have never had to and that is to resynch on all edges but the nature of asynch is that at most you might only have 8 or 9 bits plus a stop bit to worry about so unless the clock source is sloppy there is not normally a need for this.

    Now if all asynch characters as standard always had a synch bit of the opposite polarity from the start bit and immediately following it, it would be so much easier to determine baud rates, not just for an initial sequence but for each and every character, as the start bit period can be measured easily. Some systems allow me to just set the lsb, as I can convey enough information in the other 7 bits anyway, and my software can then lock onto the actual baud rate of the connected system rather than the standardized baud rates. So if a system transmitted at 250,608 baud then my driver locks on as long as it uses the bit0 synch method, which is application-dependent but transparent to regular UARTs.
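
As a rough illustration of the bit0 synch idea: if the LSB is always 1, the start bit is a single bit period of low bracketed by a falling and a rising edge, so its width in system clocks gives the baud rate directly. The sketch below is Python rather than PASM, and the 80 MHz clock and the function name are assumptions made for illustration, not taken from any driver mentioned in this thread.

```python
CLKFREQ = 80_000_000  # assumed system clock in Hz

def baud_from_start_bit(width_ticks, clkfreq=CLKFREQ):
    """Estimate the baud rate from the start-bit low time in clock ticks."""
    if width_ticks <= 0:
        raise ValueError("start-bit width must be positive")
    return clkfreq / width_ticks

# A sender at 250,608 baud produces a start bit about 319 ticks wide.
ticks = round(CLKFREQ / 250_608)
estimate = baud_from_start_bit(ticks)
```

Because the width is measured in whole clocks, the estimate is quantized, but at these rates the error stays far inside the tolerance of an 8-bit asynchronous frame.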

    BTW: I2C start and stop have absolutely nothing to do with synch as the clock is supplied but these conditions (not bits) are used to signal an I2C address is to follow a start so it is nothing more than a "serial chip select" if you like. A lot of times you can get away without the stop condition and just issue start conditions but some chips use the stop condition such as EEPROMs to signal it to start programming etc.
  • Phil Pilgrim (PhiPi) Posts: 23,514
    edited 2015-08-28 01:28
    heater wrote:
    I'm not sure what you mean by "You get an edge for every bit." If that were always true there would be no need for bit stuffing.
    'Not sure what you mean, either. But it's simple: with synchronous serial, every bit is clocked in with a provided clock edge.
    I2C start and stop have absolutely nothing to do with synch as the clock is supplied but these conditions (not bits) are used to signal an I2C address is to follow a start so it is nothing more than a "serial chip select" if you like.
    If that's not a description of synchronization, I don't know what is. :)

    -Phil
  • Sock,

    The counter approach is pretty clever and I considered using it for the 1X version of the driver. In the end, it didn't seem necessary as I have 460K baud working well. It's not an option for the 4x version of the driver.

    One investigation that remains for the 1X version is whether the counters could be used for either assembly of receive data or shifting transmit data.

    There's some clever SPI implementation in the OBEX that uses the counters to shift data, but I've never taken a close look at whether that's possible for the needs of the 1X serial driver.

    Have you considered that?


  • ksltd wrote: »
    Sock,

    The counter approach is pretty clever and I considered using it for the 1X version of the driver. In the end, it didn't seem necessary as I have 460K baud working well. It's not an option for the 4x version of the driver.

    One investigation that remains for the 1X version is whether the counters could be used for either assembly of receive data or shifting transmit data.

    There's some clever SPI implementation in the OBEX that uses the counters to shift data, but I've never taken a close look at whether that's possible for the needs of the 1X serial driver.

    This is where P1 or perhaps P1.5 could have made life easier with simple shift registers on each cog rather than video.

    I have used the counters to transmit 10Mbit data with clocking bits resulting in a 20MHz stream for fibre-optic datacomms since the single-mode modules I used did have capacitive coupling internally.

  • Cluso99 Posts: 18,066
    FYI: In synchronous comms, the TXC and RXC were often output on pins 15 & 17 of the DB25 DCE. Sometimes the DTE was able to supply clocks on pin 24 of the DB25 (eg from a mainframe computer for direct connection to another mainframe without modems, and using a crossover cable).

    In the 70's & 80's I built all sorts of comms equipment that enabled mainframes to talk to other mainframes, such as ASCII to EBCDIC code converters where one mainframe used ASCII and the other used EBCDIC code characters. This included an Apple //e and Apple /// card that enabled the Apple to act as a 327x Terminal connected to an IBM Mainframe, which we built and sold for Apple.
  • One limitation of the descendants of full duplex serial is the slow byte-by-byte transfer of string data to the transmit buffer. I like the more recent alternatives such as the put_bytes(..) method in Wayne's serial_Nx. Also, the txBuf(..) method in Jonathan's FFDS2. Put_bytes(..) uses bytemove to move chunks of data rapidly into the transmit buffer, while txBuf(..) dispenses with the transmit buffer altogether and simply points to any old buffer in the hub. I think Peter uses a similar technique. Avoiding the speed bottleneck of Spin is essential for things like high speed transfer of SD card files over a serial link.

    There is a lot out there to learn and share. I think I picked up the cmpsub trick from Phil's PBnJ zero-jitter object, and framing-error detection from SerialMirror, although granted, detecting framing errors does have a cost. It all comes down to synchronizing to specific engineering needs.

  • Peter Jakacki Posts: 10,193
    edited 2015-08-29 02:29
    I do believe I can write a 32-channel full-duplex object that runs in a single cog reliably at 115,200 baud. Whaaaa!? "Shirley" you're not serious you might say but seriously don't call me Shirley.

    For further details see my post in the Tachyon thread. (sure would help if we could link individual posts).

    However on the subject of buffers, when you are working at higher baud rates it's better to skip the buffer and transmit directly if possible but at lower baud rates another technique I use that doesn't rely upon a dedicated buffer is to have a simple long variable that characters can be passed to one at a time as long as it's empty (-1). Sounds dumb until you go to print a string and instead of passing characters one by one you simply pass the address which being >$FF is interpreted as a pointer to the "null terminated buffer". Way more efficient don't you think? This version was going to go into Tachyon but the technique may end up being used on this new multiport object instead. Also if the address is >$8000 then it uses Tachyon's virtual memory straight from the SD card.
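
The single-long mailbox described above can be modelled in a few lines. This is only a Python sketch of the idea as stated in the post (-1 means empty, values up to $FF are characters, anything larger is a pointer to a null-terminated string); the names `service_mailbox`, `hub` and `send` are hypothetical, and the >$8000 virtual-memory case is left out.

```python
EMPTY = -1  # mailbox value meaning "free, ready for the next item"

def service_mailbox(mailbox, hub, send):
    """Consume one mailbox value and return EMPTY to mark it free again."""
    if mailbox == EMPTY:
        return EMPTY                 # nothing waiting
    if mailbox <= 0xFF:
        send(mailbox)                # a single character
    else:
        addr = mailbox               # a pointer into hub memory
        while hub[addr] != 0:        # walk the null-terminated string
            send(hub[addr])
            addr += 1
    return EMPTY

hub = bytearray(0x400)               # stand-in for hub RAM
hub[0x200:0x206] = b"hello\x00"
sent = []
service_mailbox(ord("A"), hub, sent.append)   # one character
service_mailbox(0x200, hub, sent.append)      # a whole string by address
```

The caller writes one long per string instead of one long per character, which is where the efficiency gain comes from.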
  • pjv Posts: 1,903
    Peter,

    If you mean 32 continuous no-break standard 8 bit data streams with one stop bit from different sources (as in non synced bit clocks), I would bet that you cannot achieve that.

    I'd be very pleased and extremely impressed to be proven wrong.

    Cheers,

    Peter (pjv)
  • sure would help if we could link individual posts

    I'm not going to attempt the 32 continuous no-break data streams but I can help with linking to individual posts.

    To the right of the avatar on each post is a date. You can "copy link address" to this date to link to the individual post. Here's a link to your post.

    forums.parallax.com/discussion/comment/1342567/#Comment_1342567

    Tracy, thanks for the reminder about Wayne's driver. It sounds like something I need for my current project. And thanks to Wayne for the driver.

  • You are supposed to say "impossible", then I would have a little more confidence. A challenge nonetheless!
  • Peter,

    What you propose, presuming you meant to implement on the P8x32a without significant external hardware, isn't only impossible, it doesn't stand up to nearly any first pass review of being remotely possible.

    32 full duplex ports is 64 channels of IO.

    First, 64 channels require 64 IO pins. You're already more than 50% over subscribed.

    Second, there's just not enough performance. 64 channels at 115200 bits per second allows you only 10.8 clocks per bit at 80 MHz. Any kind of counted looping primitive requires at least 8 clocks, leaving you only 2.8 clocks for all other work on a per-bit basis. Even at 100 MHz, you only get 13.5 clocks per bit.

    Third, there's not enough memory in the cores. You'd need some kind of state for each of the 64 channels and some amount of code. There's not enough processing bandwidth for the code to be self modifying, so you need per-channel code. This means the 496 addressable 32-bit units need to be shared among the 64 channels, so you only have 7.75 per channel.

    The current breakdown of per-core memory utilization looks like this:

    Tx Init: 3
    Tx code: 26
    Tx data: 8

    Rx Init: 1
    Rx code: 27
    Rx data: 6

    That's 71 32-bit words of storage per full duplex port. That suggests that perhaps 7 ports could be supported with code that's been heavily optimized by many for years. Your claim for more than quadrupling that is well beyond insulting, it's just absurd.
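
The figures in this post can be checked with a few lines of arithmetic. The sketch below is Python and simply restates the numbers quoted above; the variable names are illustrative.

```python
CHANNELS = 64        # 32 full duplex ports = 64 one-way channels
BAUD = 115_200
COG_LONGS = 496      # addressable 32-bit units per cog, as quoted above

def clocks_per_bit(clkfreq, channels=CHANNELS, baud=BAUD):
    """System clocks available per bit when one cog serves every channel."""
    return clkfreq / (channels * baud)

budget_80 = clocks_per_bit(80_000_000)     # about 10.85
budget_100 = clocks_per_bit(100_000_000)   # about 13.56

longs_per_channel = COG_LONGS / CHANNELS   # 7.75
longs_per_port = (3 + 26 + 8) + (1 + 27 + 6)   # Tx + Rx footprint listed above
ports_that_fit = COG_LONGS // longs_per_port   # whole ports at that footprint
```

At 71 longs per port the quotient is just under 7, so six complete ports fit with the existing per-port code, which is the basis for the "perhaps 7" estimate.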

    I could go on, but why bother, it's simply a complete waste of time.


  • Peter Jakacki Posts: 10,193
    edited 2015-08-29 14:48
    Indeed it's a waste of time but obviously you are wasting your time because perhaps you haven't read the post I mentioned. Anyway, when I say 32 channels I am OBVIOUSLY referring to the 32 I/O lines but not that we would use every one of them which would be pretty dumb, but the driver is designed to drive each and every one of them in any kind of practical mix.

    The timing you are referring to would be the conventional method would it not? Obviously that is NOT the way I am tackling this.

    Anyway I'm wasting my time testing this out at the moment transmitting up to 32 channels in what turns out to be a very simple loop which handles all 32 channels in parallel. Then I will move on to testing my receiver methods as described in my preliminary post on the Tachyon thread. I said it was a challenge but obviously I wouldn't be talking about it or tackling it if I didn't think I had a chance and without a big challenge we would probably never be able to accomplish anything great, except maybe in the area of scoffing at others.

    From my observations of your posts k I see that although you have great insight it also seems that you must really like to or perhaps need to find critical fault to the point of blinding yourself by the brilliance of your own wisdom. I'd rather plunge myself into these do or die challenges and have fun at the same time. What's the worst that can happen? I try and fail? Or that I fail to try?
  • No, Peter, that's not it at all.

    I have read your post and did dig through 100s of posts to find it.

    And forgive me for interpreting what you wrote for what you meant to have said - but it is a written medium and my mind reading skills are, clearly, not so good.

    You did say 32 full duplex channels. And that's clearly not possible.

    Even if there were 32 full duplex UARTs built in and all the driver was required to do was to buffer receive data and feed transmit data, that alone consumes more than 50% of the available cycles at 11520 bytes per second on each of 64 lines. And the straight line code to do enqueue/dequeue is about 20 instructions per port, so more than all available space for 32 ports. That could be reduced with self modifying code at the expense of performance, but there's no performance to spare.

    Again, in terms of wires, code space and computational throughput it is not feasible to drive 32 full duplex ports at 115200 with one core.

    The fact of the matter remains that there's far more to the driver than the twiddling of the wires. And a simple analysis of the whole problem is fairly convincing with respect to its feasibility. It's not off by a little, it's off by a lot. It's just not possible.

    And in response to your question about what's the worst that can happen, here's my perspective.

    First, is that it derails useful threads that are of interest to those who are trying to solve real problems. This is the 3rd such serial driver thread that has descended into obscurity/absurdity.

    Second, you seem to be one of the few actually capable and willing to make contributions and yet you're tilting at windmills. I have a selfish interest in seeing your energy pointed in a direction that's not quite so likely to be wasted.

    Finally, let's call a spade a spade. For most here, Propeller is not about building product for revenue, which is my interest. It is, instead, a combination of hobby and education. While I can't imagine something less interesting than technology as a hobby, I can't think of anything more worthwhile than technology education. And from an educational perspective, there is a certain amount of engineering discipline that ought to be inherent in the way one presents concepts and goals. Whether it's a mid-term essay, a product proposal at a Fortune 100 company or a business pitch for venture capital, the notion of feasibility is critical to technology undertakings. One need only look at the serial (pun not intended) disaster that is P2 development to appreciate that presenting and pursuing those things that are not feasible in the face of trivial analysis has severe consequences. So if these forums are primarily about technology education, ought not the conversations be operating at a level that might get a passing grade?
  • Peter Jakacki Posts: 10,193
    edited 2015-08-29 16:51
    I appreciate your input k but if you are interested in tech ed then fair enough, I try to make money out of this stuff and at the moment I'm having quite some fun transmitting independent data simultaneously on 20 I/O at 9600 in Forth from one cog without even trying too hard. The 20 channels may as well be 1 or 32 as it makes no difference to the driver. It's 2.30 in the morning but when I get a bit more time I will hook up the logic analyser and publish the timing graphs. This little exercise has in very little time demonstrated this much already that is both educational and useful.
  • Heater. Posts: 21,230
    edited 2015-08-29 16:41
    ksltd,
    While I can't imagine something less interesting than technology as a hobby...
    That is one of the saddest things I have ever read here. I have been playing with electronics since getting a Philips electronics kit at age 9. Later building some tube ham gear was fun, hacking up a digital clock out of TTL and Nixie tubes in the early 1970's was a gas (get it?). Still I can't help but get hold of new and interesting gadgets and "playing" with them when I have the time. It's intrinsically interesting, it's a social thing, it's a lot more fun than sudoku :)

    Peter's ambition may well be impossible. But he can have fun trying and achieve something better than existing serial solutions even if the target is not reached. That all sounds good educationally and may well produce something useful.


  • Peter said: For further details see my post in the Tachyon thread. (sure would help if we could link individual posts).
    Wayne said: I have read your post and did dig through 100s of posts to find it.

    Wayne, can you please share the link you went to such great effort to find? Or Peter. I find the claim hard to believe too, but I suspend judgement. I expect there is a catch of some sort and a special design consideration. But I'd like the pasm to speak for itself apart from the Tachyon wrapper.
    Duane said: To the right of the avatar on each post is a date. You can "copy link address" to this date to link to the individual post.
  • Peter Jakacki Posts: 10,193
    edited 2015-08-29 17:51
    Peter said: For further details see my post in the Tachyon thread. (sure would help if we could link individual posts).
    Wayne said: I have read your post and did dig through 100s of posts to find it.

    Wayne, can you please share the link you went to such great effort to find? Or Peter. I find the claim hard to believe too, but I suspend judgement. I expect there is a catch of some sort and a special design consideration. But I'd like the pasm to speak for itself apart from the Tachyon wrapper.
    Duane said: To the right of the avatar on each post is a date. You can "copy link address" to this date to link to the individual post.

    The post is the very latest post in the Tachyon thread, no need to scour as I've only just started this. But using the link method it is here.

    Anyway as I mentioned already, I have 20 channels transmitting independent data without jitter at 9600 baud in Forth and of course PASM will be much faster because all this will be accessed internally in the cog rather than reading and writing and shuffling in the slow hub.

    Doesn't the fact that I have a 32-channel jitter-free 115,200 baud (ok, 9600 right at this moment) serial object working grant some credibility to what I'm saying?
    Proof to follow in the morning or so, but it is so.


  • Dave Hein Posts: 6,347
    edited 2015-08-29 18:05
    A "jitter-free" 115200 baud channel can be done only if the system clock is an exact multiple of 115200 Hz. 80 MHz is 694.444... times 115200 Hz. If you're running at 80 MHz you are going to have some jitter.

    Of course, 8-bit asynch can allow for about +/- 2.5% jitter/clock mismatch between the transmitter and the receiver. At a nominal rate of 115200 baud the transmit/receive clocks could be off by 2,880 Hz and still function OK.

    EDIT: After thinking about it for a while, it is possible to use a divisor of 694 and be jitter free. However, the baud rate would be off by about 74 Hz.
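
The numbers in this post can be reproduced as follows. This is a small Python sketch of the arithmetic only, assuming the 80 MHz clock mentioned above; the variable names are illustrative.

```python
CLKFREQ = 80_000_000   # system clock in Hz
TARGET = 115_200       # nominal baud rate

exact_divisor = CLKFREQ / TARGET       # 694.44..., not a whole number
divisor = int(exact_divisor)           # 694 whole clocks per bit, no jitter
actual_baud = CLKFREQ / divisor        # rate actually produced
error_hz = actual_baud - TARGET        # how far off nominal
error_pct = 100 * error_hz / TARGET
tolerance_hz = 0.025 * TARGET          # the +/- 2.5% figure quoted above
```

With a whole-number divisor every bit is the same length, and the resulting rate error is a small fraction of the tolerance.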
  • If we want to be pedantic about when jitter is "jitter" we can always set the baud rate to 115273.7752 for "jitter free" but there is always some jitter somewhere if we zoom up on it far enough. Compared to FDS and 4-port this FDS32 is "jitter free".
  • Dave Hein Posts: 6,347
    edited 2015-08-29 18:12
    Peter, I was editing my post about the time you added your post, so we ended up saying similar things. Jitter and frequency are not a problem as long as they are within the 2.5% limit for 8-bit async.
  • Tracy,

    It's here. It doesn't really say much. While I can see some applicability to transmit, I can't see how receive can work scalably - especially start bit detect. And I can't see how to make it work at all if the ports are operating at baud rates that differ substantially or by a ratio that is not a power of two. And I don't see how buffer management can possibly be done by the same core in the face of variable latency memory operations, which means that the driver likely requires an additional thread.

    Peter,

    I made it clear that my interests are commercial; my reference to education was with respect to forum content.

    What you've accomplished so far is about 2.5% of what you claimed possible in terms of aggregate bit rates. How much of the available resources have you used?
  • Well, in my opinion Peter is a pretty brilliant guy. Love to see him work his magic.
  • Peter Jakacki Posts: 10,193
    edited 2015-08-30 02:58
    Shawn Lowe wrote: »
    Well, in my opinion Peter is a pretty brilliant guy. Love to see him work his magic.
    Cool, some encouraging remarks for when we stupidly volunteer to venture into the "impossible". Just got back into it again today and connected the logic analyser with 16 channels of capture. Since this is proof-of-concept at the moment the code is all high level and yet to benefit from cool PASM tricks and some more enhancements yet to be tested. What this means is that I already have a 32-channel serial transmit object especially once I code it in PASM. I will probably polish up my inline interactive assembler now to speed up the process of testing.

    In this diagram the main cog is writing a character then waiting for synch to write the next so that it doesn't write in the middle of a shift operation, so each channel is staggered by 1 bit and there are currently 20 channels transmitting a $30 plus their channel offset, so $30 to $3F in the diagram. Bear in mind that this is just test code and PASM is what is needed to make it fly and fill in the gaps and up the speed.


    [Image: FDS32 transmitting.png (1366 x 768)]
  • Peter Jakacki Posts: 10,193
    edited 2015-08-30 07:00
    What was Clarke's third law again? Something about sufficiently advanced technology being indistinguishable from magic?

    32 channels transmitting strings at 115,200 baud from a single cog, some I/O pins are not enabled of course.
    BTW, this works just as well at over 1M baud but I need to eventually match the receivers so 115,200 is the goal.
    [Image: FDS32 115200baud transmitting.png (1366 x 768)]
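
No code had been released at this point in the thread, so the following is only a guess at how a single loop could drive many pins at once, not Peter's method. The idea sketched here is to build, for each bit time, one 32-bit word with one bit per I/O pin, so that a single write to the output register advances every channel together. It is written in Python for clarity, assumes 8N1 framing with all channels at the same baud rate and aligned in time, and the function names are hypothetical.

```python
def frame(byte):
    """8N1 frame, LSB first: start bit (0), eight data bits, stop bit (1)."""
    return [0] + [(byte >> i) & 1 for i in range(8)] + [1]

def output_words(chars_by_pin):
    """Return ten 32-bit port values, one per bit time, that send one
    character on every listed pin simultaneously."""
    frames = {pin: frame(ch) for pin, ch in chars_by_pin.items()}
    words = []
    for t in range(10):
        word = 0xFFFF_FFFF                 # unused pins idle high
        for pin, bits in frames.items():
            if bits[t] == 0:
                word &= ~(1 << pin)        # pull this pin low for this bit
        words.append(word)
    return words

# Sixteen channels, each sending $30 plus its channel number.
words = output_words({pin: 0x30 + pin for pin in range(16)})
```

In this model the per-bit work is one register write no matter how many channels are active; the cost moves into preparing the words, which is the part a real driver would have to make fast.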
  • Okay Peter... you've got us all thoroughly impressed/confused/skeptical. When you said 32 channels @ 115,200, I thought "He's crazy! I hope he does it".

    But now 1MBaud??? I've moved from hopeful to skeptical now :P Please share your work. Both the key loop that enables this magic and the API for a user of this object.
  • The tx stuff is dead simple really, even 1Mbaud x32, although there are some "tricks". It's all in the method, which I am now adjusting to simplify the loading etc. Since it was only yesterday that I actually started coding and testing, I think you will understand that I don't want to complicate things by releasing what code I have yet, but be patient as it won't be too long before I do. The 1M baud is not a problem for the transmitters but won't work for the receivers, which is why I am "only" planning 115,200.

    Follow my progress then in my new MAGIC32 thread :)
  • Follow my progress then in my new MAGIC32 thread :)

    I will! With keen interest!
  • Tell us about the buffer management, please ...