W5200 ethernet performance with QuickStart — Parallax Forums

W5200 ethernet performance with QuickStart

manitoumanitou Posts: 29
edited 2013-02-25 09:24 in Propeller 1
I hooked up WIZ820io (W5200 chip) on a breadboard to quickstart and used W5200/SPI library from the following
http://code.google.com/p/propeller-w5200-driver/source/checkout
Wiznet specs the W5200 at 33mbs (megabits/sec) compared to 0.3 mbs for the older W5100. A simple test of writing/reading the W5200 buffers ran at 8.9/5.3 mbs on the quickstart. Some simple UDP tests achieved 3mbs. For W5200 performance on other MCUs (UNO, DUE, maple, teensy) see wizperf.txt at the following github site
https://github.com/manitou48/DUEZoo

Comments

  • D.PD.P Posts: 790
    edited 2013-02-22 09:18
    manitou wrote: »
    I hooked up WIZ820io (W5200 chip) on a breadboard to quickstart and used W5200/SPI library from the following
    http://code.google.com/p/propeller-w5200-driver/source/checkout
    Wiznet specs the W5200 at 33mbs (megabits/sec) compared to 0.3 mbs for the older W5100. A simple test of writing/reading the W5200 buffers ran at 8.9/5.3 mbs on the quickstart. Some simple UDP tests achieved 3mbs. For W5200 performance on other MCUs (UNO, DUE, maple, teensy) see wizperf.txt at the following github site
    https://github.com/manitou48/DUEZoo

    Nice work on gathering these stats, thanks for posting this.
  • twctwc Posts: 107
    edited 2013-02-22 16:22
    ...yes, thanks, definitely some interesting info, might want to post a copy to the Spinneret forum. Good fodder for discussion of hot topics like soft peripherals, h/w stack vs. s/w stack, Prop vs. Beagle/Pi/ARM, etc. There's a clear (and not unexpected) dependence on SPI throughput, so a real SPI (or maybe just a shift register) would go a long way. Mike's SPI driver is great and a big improvement, but still nowhere near the W5200 spec of 80 MHz. I don't know much about the Beagle/Pi, but I'm guessing they exploit a high-bandwidth (i.e. fast & wide) network-memory interface. Another interesting point is how the W5200 has essentially the same performance for UDP and TCP, but Beagle/Pi TCP is only a small fraction (<25%) of UDP. Raises even more questions: What would happen if it was 8 sockets? Or a different buffer size (the W5200 could be configured as 1 socket x 16KB)? Then finally there's the question of what you've got left in terms of MIPS and real-time response under full network load. In the case of the Prop I guess this network test is using about 1/4 of the capability (ex: 2 cogs, 8KB) so there's plenty of code space and real-time bandwidth left over. By contrast I suspect an Arduino doesn't have much left. Not sure about Beagle/Pi in terms of MIPS consumed and real-time response. Anyway, interesting stuff, thanks again!
  • Peter JakackiPeter Jakacki Posts: 10,193
    edited 2013-02-22 21:29
    I started writing code for the W5100 before I changed over to the W5200, which I then realized was a much, much better chip despite the similarities. Although I will be using this chip in P1 designs, it is the P2 that I hope to use for maximum performance. So after I have settled on some P1 code and functionality for the W5200, I will be coding this in Tachyon Forth for the P2 on the DE2 emulation platform.
  • D.PD.P Posts: 790
    edited 2013-02-22 23:44
    I started writing code for the W5100 before I changed over to the W5200, which I then realized was a much, much better chip despite the similarities. Although I will be using this chip in P1 designs, it is the P2 that I hope to use for maximum performance. So after I have settled on some P1 code and functionality for the W5200, I will be coding this in Tachyon Forth for the P2 on the DE2 emulation platform.

    DE Nano here, in case a single cog version of Tachyon appears.
  • manitoumanitou Posts: 29
    edited 2013-02-23 04:15
    twc wrote: »
    ...yes, thanks, definitely some interesting info, might want to post a copy to the Spinneret forum. Good fodder for discussion of hot topics like soft peripherals, h/w stack vs. s/w stack, Prop vs. Beagle/Pi/ARM, etc. There's a clear (and not unexpected) dependence on SPI throughput, so a real SPI (or maybe just a shift register) would go a long way. Mike's SPI driver is great and a big improvement, but still nowhere near the W5200 spec of 80 MHz. I don't know much about the Beagle/Pi, but I'm guessing they exploit a high-bandwidth (i.e. fast & wide) network-memory interface. Another interesting point is how the W5200 has essentially the same performance for UDP and TCP, but Beagle/Pi TCP is only a small fraction (<25%) of UDP. Raises even more questions: What would happen if it was 8 sockets? Or a different buffer size (the W5200 could be configured as 1 socket x 16KB)? Then finally there's the question of what you've got left in terms of MIPS and real-time response under full network load. In the case of the Prop I guess this network test is using about 1/4 of the capability (ex: 2 cogs, 8KB) so there's plenty of code space and real-time bandwidth left over. By contrast I suspect an Arduino doesn't have much left. Not sure about Beagle/Pi in terms of MIPS consumed and real-time response. Anyway, interesting stuff, thanks again!

    Re: TCP. I'm pretty sure TCP on the beagle/raspberry can perform as well as UDP on a local net with bigger block transfers and larger window sizes -- I just took the default settings. TCP provides reliable service and adapts to available bandwidth, whereas UDP can lose packets, have errors in packets, or have packets arrive out of order or even be duplicated. But UDP is a good measure of underlying performance.

    Re: W5200 buffers. The default W5200 configuration allocates 2k-byte buffers per socket. For UDP that allows the wiznet to receive up to 2k at wire speed; otherwise, as noted, the wiz performance is limited by SPI performance. Larger wiz buffers would allow buffering larger bursts of UDP packets. Though the W5200 spec says it can handle 80 MHz SPI, wiznet also says they have only reliably tested it up to 33 MHz. Those high rates would be hard to sustain in a breadboard configuration.

    Just for the record, the WIZ820io module I have seems a bit temperamental -- irrespective of the MCU used to exercise it. Sometimes several repetitions of tests will fail and then start working ...
  • twctwc Posts: 107
    edited 2013-02-23 09:49
    ...I've got a solid Prop+Wiz820io setup if you want me to test your code.

    Another look (admittedly superficial) at the results seems to give some insight into the SPI and CPU load issue. Most apparent with the DUE and Maple is the difference in the way UDP performance scales with SPI clock for regular SPI vs. SPI+DMA. Namely, it scales semi-directly for regular SPI, but only up to a point, beyond which it continues to scale higher only with SPI+DMA. The most obvious example is the Maple: regular SPI UDP performance is the same for a 9 or 18 MHz SPI clock, whereas SPI+DMA at 18 MHz delivers a lot more performance than at 9 MHz. The implication I draw is that the CPU is saturating while driving the regular SPI. Advantage Prop, which has plenty of cogs and memory for other stuff.
  • manitoumanitou Posts: 29
    edited 2013-02-23 09:54
    TCP results: send 3.6mbs recv 1.8mbs (megabits/sec)
    Sending 100 1000-byte records, or receiving 100 1000-byte records.
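
As a quick sanity check on the figures above, throughput and elapsed time are related through the payload size. The following is a minimal Python sketch, assuming that only payload bytes are counted (no TCP/IP header overhead) and that 1 megabit is 10^6 bits; the function names are illustrative and do not come from either driver.

```python
def mbits_per_sec(total_bytes: int, elapsed_s: float) -> float:
    """Payload throughput in megabits per second (1 Mbit = 10**6 bits)."""
    return total_bytes * 8 / elapsed_s / 1e6


def elapsed_seconds(total_bytes: int, mbps: float) -> float:
    """Time needed to move total_bytes at the given Mbit/s rate."""
    return total_bytes * 8 / (mbps * 1e6)


# The TCP test moved 100 records of 1000 bytes each.
payload = 100 * 1000
send_time = elapsed_seconds(payload, 3.6)  # reported send rate
recv_time = elapsed_seconds(payload, 1.8)  # reported receive rate
print(f"send: {send_time * 1000:.0f} ms, recv: {recv_time * 1000:.0f} ms")
```

Under those assumptions, the reported rates correspond to roughly 222 ms for the send test and 444 ms for the receive test.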
  • twctwc Posts: 107
    edited 2013-02-23 10:02
    ...so another implication would be: absent DMA, there's little reason to bother running the SPI clock > 10-15 MHz.

    Any way to characterize the CPU load on the beagle/pi? Are they saturated, or is the scheduler saving bandwidth for other tasks? Just curious...
  • twctwc Posts: 107
    edited 2013-02-23 10:07
    ...thanks for confirming, essentially no penalty for using TCP instead of UDP.
  • DynamoBenDynamoBen Posts: 366
    edited 2013-02-23 16:13
    manitou wrote: »
    Wiznet specs the W5200 at 33mbs (megabits/sec) compared to 0.3 mbs for the older W5100. A simple test of writing/reading the W5200 buffers ran at 8.9/5.3 mbs on the quickstart. Some simple UDP tests achieved 3mbs. For W5200 performance on other MCUs (UNO, DUE, maple, teensy) see wizperf.txt at the following github site
    https://github.com/manitou48/DUEZoo

    Mind trying the same tests with this driver?
  • manitoumanitou Posts: 29
    edited 2013-02-24 09:12
    DynamoBen wrote: »
    Mind trying the same tests with this driver?

    The underlying SPI code for this driver looks about the same -- write/read are clocked at 20 MHz/10 MHz using frqa. The test of just writing/reading 1000 bytes to the W5200 socket buffer area ran slower at 6.4/4.9 mbs (megabits/sec), compared to 8.8/5.3 mbs for the other driver. The other driver combines the first 4 bytes (W5200 command) into a single SPI write operation, so that may account for some of its speed.

    Presumably UDP and TCP would run proportionally slower as well. I haven't done the UDP tests since txUDP in this driver wants me to build a UDP header at the front of my data...
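
To put the buffer numbers above in perspective, they can be compared against the SPI clock, since SPI moves one bit per clock and a 20 MHz clock therefore caps throughput at 20 Mbit/s. This is a rough Python sketch, assuming both drivers really do clock writes at 20 MHz and reads at 10 MHz (the post only says the SPI code "looks about the same"); the dictionary labels are made up for illustration.

```python
def spi_efficiency(measured_mbps: float, spi_clock_mhz: float) -> float:
    """Fraction of the raw SPI bit rate that shows up as measured throughput."""
    return measured_mbps / spi_clock_mhz


# (measured Mbit/s, SPI clock in MHz), figures taken from the post above.
results = {
    "this driver, write": (6.4, 20),
    "this driver, read": (4.9, 10),
    "other driver, write": (8.8, 20),
    "other driver, read": (5.3, 10),
}

for label, (mbps, clock) in results.items():
    print(f"{label:20s} {spi_efficiency(mbps, clock):.0%} of SPI clock rate")
```

Under that assumption, every case lands at roughly a third to a half of the raw clock rate, so a large share of the time goes to something other than shifting data bits.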
  • DynamoBenDynamoBen Posts: 366
    edited 2013-02-24 09:37
    manitou wrote: »
    The underlying SPI code for this driver looks about the same -- write/read are clocked at 20 MHz/10 MHz using frqa. The test of just writing/reading 1000 bytes to the W5200 socket buffer area ran slower at 6.4/4.9 mbs (megabits/sec), compared to 8.8/5.3 mbs for the other driver.

    Hmmm, this is a touch disappointing; I would have expected the same speed or better. I suspect the Spin overhead is killing my speed.
    The other driver combines the first 4 bytes (W5200 command) into a single SPI write operation, so that may account for some of its speed.

    I will check that out. My driver is utilizing the W5200's "burst" mode, so after the first byte is sent (40 bits total) I only send data bytes (8 bits).
  • manitoumanitou Posts: 29
    edited 2013-02-24 10:25
    DynamoBen wrote: »
    Mind trying the same tests with this driver?

    I tried a simple UDP echo test with txUDP -- txUDP never returns, sigh. W5200 responds to pings. I commented out the first "if" in txUDP and that let it "run" -- but there were no UDP packets seen on the Ethernet (tcpdump). Do you have a working UDP example?

    The other W5200 driver had lots of examples, so it was easy to create various tests ....

    thanks
  • DynamoBenDynamoBen Posts: 366
    edited 2013-02-24 10:47
    manitou wrote: »
    I tried a simple UDP echo test with txUDP -- txUDP never returns, sigh. W5200 responds to pings. I commented out the first "if" in txUDP and that let it "run" -- but there were no UDP packets seen on the Ethernet (tcpdump). Do you have a working UDP example?

    The other W5200 driver had lots of examples, so it was easy to create various tests ....

    The driver I'm pointing you to is a rework of the W5100, so all the examples that Timothy created for the W5100 should work. I've attached one that demonstrates UDP.
  • manitoumanitou Posts: 29
    edited 2013-02-24 16:12
    OK here are the UDP numbers for your lib
                          yourlib         otherlib
    UDP echo(us)            1322            2621
    UDP send(mbs)            4.3             3.7
    UDP recv(mbs)            1.9             2.1
    buffer rd(mbs)           4.9             5.3
    buffer wrt(mbs)          6.4             8.9
    
  • Mike GMike G Posts: 2,702
    edited 2013-02-24 17:47
    manitou, it would be helpful to see your test code. For the UDP stuff, if socket.Available was used in the echo test you'll get lower throughput than with socket.DataReady, as socket.Available polls for data until it times out.

    Otherwise, thank you for your test results. I was able to identify a bottleneck in my code which I'm actively fixing.
  • manitoumanitou Posts: 29
    edited 2013-02-25 04:18
    Mike G wrote: »
    manitou, it would be helpful to see your test code. For the UDP stuff if socket.Available was used in the echo test you'll get a lower throughput than using Socket.DataReady as socket.Available polls for data until it times out.

    Otherwise, thank you for your test results. I was able to identify a bottle neck in my code which I'm actively fixing.

    Yes, I used socket.Available; I'll try your suggestion. Here are snippets of the code being measured.
    w5200 buffer read and write
      wiz.Read($8000,@buff,1000)
      wiz.Write($8000,@buff,1000)
    
    UDP echo
        sock.Send(@udpMsg, 8)
        bytesToRead := sock.Available
        sock.Receive(@buff, bytesToRead)
    
    UDP send
        sock.Send(@buff, 1000)
    
    UDP recv (buffer sometimes had 2 packets)
            bytesToRead := sock.Available
            if(bytesToRead =< 0)
                next
            sock.Receive(@buff, bytesToRead)
            pst.Dec(bytesToRead)
            pst.Char(pst#NL)
            if (bytesToRead > 1100)
                pkts++
            pkts++
    

    thanks for the w5200 code
  • twctwc Posts: 107
    edited 2013-02-25 08:20
    There is a fair amount of overhead for W5200 buffer management (logical->physical address translation and wraparound) in the UDP/TCP critical path which the buffer rd/wr test code doesn't exercise (this fits with the results showing SPI+DMA buffer access nearly equal to the SPI clock). That's actually helpful since it isolates the buffer management (and other) overhead. For the Prop+W5200, UDP throughput is less than half of buffer throughput, confirming there's quite a bit of overhead that has nothing to do with the performance of the low-level SPI transactions.

    Thanks for the posting, helpful to illuminate some optimization ideas.
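
The logical-to-physical translation and wraparound mentioned above can be sketched as follows. This is a hedged Python illustration of the general ring-buffer technique, not code from either driver: it assumes a power-of-two socket buffer (2 KB, the default mentioned earlier in the thread) and uses $8000 as the buffer base because that is the address the test snippets read and write; the function name is hypothetical.

```python
def split_for_wraparound(pointer: int, length: int, base: int, size: int):
    """Map a free-running logical pointer onto a ring buffer.

    Returns a list of (physical_address, byte_count) chunks: one chunk when
    the transfer fits before the end of the buffer, two when it wraps.
    """
    if size <= 0 or size & (size - 1):
        raise ValueError("size must be a power of two")
    if not 0 <= length <= size:
        raise ValueError("length must fit in the buffer")
    if length == 0:
        return []
    offset = pointer & (size - 1)            # logical -> offset inside buffer
    first = min(length, size - offset)       # bytes before the buffer end
    chunks = [(base + offset, first)]
    if first < length:                       # remainder restarts at the base
        chunks.append((base, length - first))
    return chunks


BASE = 0x8000   # assumed buffer base, as used in the thread's test snippets
SIZE = 2048     # assumed 2 KB per-socket buffer

print(split_for_wraparound(0x1234, 100, BASE, SIZE))  # fits in one chunk
print(split_for_wraparound(0x07F0, 100, BASE, SIZE))  # wraps into two chunks
```

Each socket send or receive needs this masking and possible split before any SPI transfer starts, which is the kind of per-packet work the raw buffer read/write test skips.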
  • Mike GMike G Posts: 2,702
    edited 2013-02-25 09:24
    Thanks for the snippet that helps me keep my tests consistent.

    Yep, the receive buffer can contain more than one packet. I've been running performance tests this weekend and built a few testing applications. I'm still trying to make sense of the results.

    As twc mentioned, at least for W5200.spin, the buffer logic sucks up a lot of ticks. I'll offload that logic to PASM and that should speed things up. I'm pretty sure multiple UDP packets in the Rx buffer is always a possibility.
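
Since several UDP packets can sit in the Rx buffer at once, a receiver can walk the buffer packet by packet instead of guessing from the byte count (as the `bytesToRead > 1100` check in the test snippet does). The sketch below is Python and rests on an assumption that should be checked against the datasheet and the driver in use: that each received datagram is preceded by an 8-byte header holding the peer IP (4 bytes), peer port (2 bytes, big-endian) and payload length (2 bytes, big-endian), and that the driver hands this header back to the caller. The names are illustrative.

```python
import struct

# Assumed per-datagram header: peer IP, peer port, payload length (big-endian).
HEADER = struct.Struct("!4sHH")


def iter_udp_packets(buf: bytes):
    """Yield (peer_ip, peer_port, payload) for each complete packet in buf."""
    pos = 0
    while pos + HEADER.size <= len(buf):
        ip, port, length = HEADER.unpack_from(buf, pos)
        pos += HEADER.size
        if pos + length > len(buf):
            break  # payload not fully received yet; stop here
        yield ".".join(str(b) for b in ip), port, buf[pos:pos + length]
        pos += length


# Two back-to-back datagrams as they might appear in one receive call.
peer = bytes([192, 168, 1, 10])
raw = HEADER.pack(peer, 5000, 3) + b"abc" + HEADER.pack(peer, 5000, 2) + b"hi"

for ip, port, payload in iter_udp_packets(raw):
    print(ip, port, payload)
```

Counting packets this way does not depend on the record size used in the test.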