Webserver randomly becomes unresponsive - Page 3 — Parallax Forums

Webserver randomly becomes unresponsive

Comments

  • Beau Schwabe Posts: 6,566
    edited 2011-01-13 07:25
    Mike G,

    So that begs the question: what changes did you make? :-)
  • chillybasen Posts: 16
    edited 2011-01-13 09:09
    My webserver started hanging again, so I decided to try jstjohnz's solution. However, while waiting on SocketTCPestablished the status would read $14 and the socket would get reset. So I modified the code slightly:
      repeat
        if !ETHERNET.SocketTCPestablished(0)
          if ETHERNET.SocketStatus(0) == $00 OR ETHERNET.SocketStatus(0) == $1C
            ResetSocket
        else
            'send HTML
    

    And ResetSocket is just
    PRI ResetSocket
      ETHERNET.SocketTCPdisconnect(0)
      ETHERNET.SocketClose(0)
      ETHERNET.SocketOpen(0, ETHERNET#_TCPPROTO, localSocket, destSocket, @destIP[0])
      ETHERNET.SocketTCPlisten(0)
      return
    

    My server has been up 3 days now, a record. And I've hit it with thousands of requests in testing.
  • Beau Schwabe Posts: 6,566
    edited 2011-01-13 14:47
    chillybasen,

    It might also be a good idea to periodically check 'ETHERNET.SocketTCPestablished(0)' and make sure that it still reads as what you would expect, as well as the status.

    In addition to checking for a status of $14 ... it might be a good idea to check that it eventually becomes $17 while the ETHERNET.SocketTCPestablished(0) is still valid.
    A valid status of $14 normally changes to $16 before it changes to $17, but it can also go directly from $14 to $17 ... much of this depends on the flavor of browser and the handshaking implemented on the client side.

    $00 is basically a complete shutdown (release of socket service) on the server side ... could be caused by a terminated or misdirected client connection.
    $1C is generated when the client forces a quit ... i.e., they close the browser connection.
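
    The status codes described above can be folded into one small decision routine. This is only a sketch, not code from anyone's server: the constant names follow the WIZnet W5100 datasheet, while Classify and the three ACT_ values are hypothetical names made up for illustration. Treating every unlisted status as a reason to reset is an assumption, and a cautious one.

```spin
CON
  ' W5100 socket status values mentioned in this thread (names as in the WIZnet datasheet)
  SOCK_CLOSED      = $00
  SOCK_LISTEN      = $14
  SOCK_SYNRECV     = $16
  SOCK_ESTABLISHED = $17
  SOCK_CLOSE_WAIT  = $1C

  ' Hypothetical result codes, used only by this sketch
  ACT_WAIT  = 0    ' keep polling, nothing has gone wrong yet
  ACT_SERVE = 1    ' connection is up, safe to read the request
  ACT_RESET = 2    ' close, reopen and listen again

PUB Classify(status) : action
  '' Map a raw socket status byte to one of the three actions above.
  case status & $FF
    SOCK_LISTEN, SOCK_SYNRECV:
      action := ACT_WAIT
    SOCK_ESTABLISHED:
      action := ACT_SERVE
    other:                       ' includes SOCK_CLOSED and SOCK_CLOSE_WAIT
      action := ACT_RESET
```

    The main loop would feed this with whatever it reads from the socket 0 status register and call something like the ResetSocket routine posted earlier when it gets ACT_RESET back.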
  • zapmaster Posts: 54
    edited 2011-01-15 17:42
    This is the code I use for serving the current 3 web pages.
    It does not lock up any more.
    I have only tested this in house.
    72.55.239.53
    Give it a try:
    PUB webserver | idx, x, temp0

      'Infinite loop of the server
      repeat
        'Waiting for a client to connect:
        'read Socket 0's status register and look for a client connecting to our server
        ETHERNET.readIND(ETHERNET#_S0_SR, @temp0, 1)
        temp0 <<= 24      'shift left, then right, to keep only the first byte
        temp0 >>= 24
        PST.Str(string("  hold for page"))
        PST.Str(string(PST#NL))

        repeat while temp0 == $14
          dira[23]~~      'make the LED pin an output
          outa[23]~~      'turn on LED to show ready for web page
          temp0 <<= 24    'shift left, then right, to keep only the first byte
          temp0 >>= 24
          ETHERNET.readIND(ETHERNET#_S0_SR, @temp0, 1)
          waitcnt(clkfreq/40 + cnt)

        ETHERNET.readIND(ETHERNET#_S0_SR, @temp0, 1)
        temp0 <<= 24      'shift left, then right, to keep only the first byte
        temp0 >>= 24
        PST.Str(string(PST#NL))
        if temp0 <> $17
          PST.hex(temp0,8)
          PST.Str(string("  socket received bad data "))
          PST.Str(string(PST#NL))
          crash := crash + 1
          i2cObject.Writelong(i2cSCL, EEPROMAddr, crashprom, crash)
          end             'reset the socket
          next
        outa[23] := 0
        'Initialize the buffers and bring the data over
        bytefill(@data, 0, _bytebuffersize)
        ETHERNET.rxTCP(0, @data)

        if data[5] == "A"   'A = home page
          home_page
        if data[5] == "B"   'B = boiler page
          boiler_page
        if data[5] == "C"   'C = Solar page
          Solar_page
        if data[0] == "G"
          home_page
    
    .
  • sstandfast Posts: 39
    edited 2011-01-18 09:21
    @zapmaster - just wanted to point out that you can make a small improvement by changing all your
    temp0 <<= 24 'bit shift left for the first byte
    temp0 >>=24 'bit right shift for the first byte
    

    with a simple masking function:
     temp0 &= $FF
    

    This will accomplish the same thing as your back-to-back shifts with two fewer instructions (shift then assignment).

    Just thought I would mention it.

    Shawn
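
    For anyone who wants to convince themselves the two forms agree, here is a minimal side-by-side sketch. LowByteByShift and LowByteByMask are hypothetical names used only for this comparison.

```spin
PUB LowByteByShift(value) : lowByte
  '' zapmaster's approach: push the low byte to the top of the long, then bring it back down.
  lowByte := value
  lowByte <<= 24
  lowByte >>= 24       ' >> is a logical shift in Spin, so zeros are shifted in from the left

PUB LowByteByMask(value) : lowByte
  '' Shawn's approach: keep bits 0..7 and clear bits 8..31 in one operation.
  lowByte := value & $FF
```

    Given $ABCD_EF17, both return $17. The equivalence relies on >> being the logical shift; Spin's arithmetic shift ~> would copy the sign bit back down and give a different answer for bytes of $80 and above.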
  • Mike G Posts: 2,702
    edited 2011-01-20 06:06
    @Beau (and all),

    Sorry I was off on a San Diego field trip last weekend with the Young Explorers. We saw a Grey Whale out in the Pacific, got video and all. Anyway, kinda' relaxed a bit and stayed out of the office.

    I followed Beau's advice to get around the lockups. Plus I read the Wiz5100 manual. I imagine this is similar to what everyone else coded. This snippet is by no means production code but it seems to work. BTW, I also handle 0 data. I'm not sure why I get 0 data but it happens often enough.
        t1 := cnt + HTTP_TIMEOUT  
        repeat while !W5100.SocketTCPestablished(0)
          if(t1 > cnt)
            W5100.readIND(W5100#_S0_SR, @S0_SR, 1)
            ifnot((S0_SR == $14) OR (S0_SR == $16) OR (S0_SR == $17))
              PST.Str(string("Status Code 2: "))
              PST.Hex(S0_SR, 2)
              ResetSocket
    

    http://spinneret.servebeer.com:5000/
    http://www.agaverobotics.com/spinneret/controlpanel.htm
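
    One thing to watch with a deadline computed as cnt + HTTP_TIMEOUT: Spin comparisons are signed and cnt wraps, so if the sum crosses from $7FFF_FFFF to $8000_0000 the test t1 > cnt is false straight away. A common way around this is to compare elapsed ticks instead. The sketch below assumes a two-second timeout (the value given later in the thread); TimedOut and WaitDemo are hypothetical names, not part of Mike G's code.

```spin
PUB WaitDemo | t0
  '' Poll for up to two seconds, then fall through.
  t0 := cnt
  repeat until TimedOut(t0, clkfreq * 2)
    ' read and check the socket status register here
    waitcnt(clkfreq / 100 + cnt)        ' 10 ms between polls

PUB TimedOut(startCnt, timeoutTicks) : expired
  '' True once at least timeoutTicks system clock ticks have passed since startCnt.
  '' The subtraction wraps together with cnt, so counter rollover does not matter
  '' as long as timeoutTicks stays below $8000_0000 (about 26 s at 80 MHz)
  '' and the loop polls more often than that.
  expired := (cnt - startCnt) => timeoutTicks
```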
  • Oldbitcollector (Jeff) Posts: 8,091
    edited 2011-01-24 07:33
    I'm seeing this as well..

    Perhaps I'm using old code?

    My server is: http://software.propellerpowered.com with a link to the code itself at the bottom of the page.

    OBC
  • DynamoBen Posts: 366
    edited 2011-01-24 07:36
    I wonder if this issue has to do with an uninitialized variable or a variable rolling over somewhere. :innocent:
  • Mike G Posts: 2,702
    edited 2011-01-24 13:24
    I'm finding that the lockups are usually my fault. I log every request along with key variable states like the status register. This has helped me find bugs in my code by following what the user was doing when the server locked up.
  • Timothy D. Swieter Posts: 1,613
    edited 2011-01-24 16:45
    When discussing code lockup or bugs or responsiveness issues, we should reference which driver and version is being used.

    There have been a handful of bug fixes to both the SPI and Indirect versions of the drivers. The driver code lives on Google Code here: http://code.google.com/p/spinneret-web-server/

    If you are able to, I'd suggest downloading the latest SPI and IND drivers and giving them a try: one, to see if it improves the performance, and two, to verify we aren't adding new problems and bugs. I love having a community to test code - you guys have the applications where I don't right now, so I am happy to be tech support for the drivers.

    W5100_SPI_Driver.spin
    W5100_Indirect_Driver.spin
  • Oldbitcollector (Jeff) Posts: 8,091
    edited 2011-01-25 06:22
    @Mike,

    What is the value of HTTP_TIMEOUT in your server?

    OBC

  • Mike G Posts: 2,702
    edited 2011-01-25 12:32
    2 seconds but I removed the timeout in the most recent stuff.
  • Oldbitcollector (Jeff) Posts: 8,091
    edited 2011-01-25 12:42
    I'm still getting lock ups over here. Would you mind sharing your source code (complete) for some comparison?

    OBC
  • Timothy D. Swieter Posts: 1,613
    edited 2011-01-25 15:49
    I was digging in the WIZnet datasheet last night while working on the feature for adjusting the W5100 socket memory size. While I was in the datasheet I saw a couple of W5100 registers: the Retry Time-value Register (RTR) and the Retry Count Register (RCR). These are settings inside the W5100 for handling communications. The default for RTR is 200 ms and for RCR it is 8. I need to read and do some experimenting, but I wonder if RTR should be increased to 400 ms or 500 ms. I am thinking this has to do with latency in a network.

    Any regular web server programmers around that can talk about RTR timing on a real web server? How long should a server wait for connections and response before moving on?

    After I get the socket memory size register going I was thinking of building a demo for watching the RTR, RCR and various error bits in the W5100 so we can verify if it is the W5100 having issues with serving or if it is our code in the Propeller. Maybe this weekend I will get to it, unless someone wants to beat me to it.
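
    For anyone who wants to experiment before that demo exists, the arithmetic is small. Per the W5100 datasheet, RTR counts in 100 us steps and is stored high byte first, which is why the 200 ms default appears as 2000 ($07D0). The sketch below only computes the bytes; the helper names are hypothetical, and how the bytes get written depends on the driver's register-write method, which is not shown in this thread.

```spin
CON
  RTR_ADDR = $0017     ' Retry Time-value Register, 2 bytes, high byte first (W5100 datasheet)
  RCR_ADDR = $0019     ' Retry Count Register, 1 byte

PUB RtrFromMs(ms) : rtr
  '' Convert a retry timeout in milliseconds to the 16-bit RTR value (100 us per count).
  rtr := (ms * 10) & $FFFF

PUB RtrHighByte(rtr) : b
  '' Byte destined for RTR_ADDR
  b := (rtr >> 8) & $FF

PUB RtrLowByte(rtr) : b
  '' Byte destined for RTR_ADDR + 1
  b := rtr & $FF
```

    With these, 400 ms works out to 4000 ($0FA0) and 500 ms to 5000 ($1388).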
  • Mike G Posts: 2,702
    edited 2011-01-25 18:00
    @OBC, the base code can be found on http://forums.parallax.com/showthread.php?128445-Dynamic-Spinneret-with-HttpRequest-Object-and-EEPROM-Configuration-Page

    The only difference related to lockups is the code snippet above. My latest code has a lot more stuff like HTTP file upload development and tons of logging. I'm afraid to make that code available on the forum. I ran out of programming space so the latest code has blocks plugged in here and there and would be difficult to follow if you're not familiar with the logic.
  • CassLan Posts: 586
    edited 2011-01-27 13:20
    So after reading through the thread...is my understanding right that this is all solved?

    I had this issue while working several weeks back and would like to get back on things :)

    Rick
  • Oldbitcollector (Jeff) Posts: 8,091
    edited 2011-01-27 17:17
    Not resolved here..

    I've started using this and I can run the server for most of a day without a powercycle. (However, it does cause more problems with page load, requiring a 'reload', but at least not a powercycle.)
      'Infinite loop of the server
      repeat
        ETHERNET.SocketOpen(0, ETHERNET#_TCPPROTO, localSocket, destSocket, @destIP[0])
        ETHERNET.SocketTCPlisten(0)
    
        repeat while !ETHERNET.SocketTCPestablished(0)
        
          ResetSocket   
          bytefill(@data, 0, _bytebuffersize)
          WaitCnt(clkfreq / 100 + cnt) ' 10mSec    
    
        ETHERNET.rxTCP(0, @data)
    
        if data[0] == "G" ' Assume a GET request
          ParseURL
    
           'Send the web page - hardcoded here
          'status line
          StringSend(0, string("HTTP/1.1 200 OK", CR, LF))
    

    PRI ResetSocket
      ETHERNET.SocketTCPdisconnect(0)
      ETHERNET.SocketClose(0)
      ETHERNET.SocketOpen(0, ETHERNET#_TCPPROTO, localSocket, destSocket, @destIP[0])
      ETHERNET.SocketTCPlisten(0)
      return
    

    OBC
  • Mike G Posts: 2,702
    edited 2011-01-27 18:35
    @CassLan, simply verifying the status register goes from 0x14 to 0x16 or 0x17 and checking for zero data fixed my problems.

    OBC, you're making the Spinneret do a lot of unnecessary socket work?
  • Beau Schwabe Posts: 6,566
    edited 2011-01-28 13:32
    I think a majority of us have decided that it's just a matter of getting the right state machine going.

    Understanding the meaning of the status codes helps.

    In addition to checking for "0x14 to 0x16 or 0x17" depending on what your code does, it might also be wise to check for 0x00 and 0x1C which can occur after 0x17.

    0x00 - can be generated from the browser if the person reloads before the previous page has loaded.

    0x1C - can be generated from the browser if the person just closes the browser

    In either case if you get a 0x00 or 0x1C 'before' you send the HTML, you might as well not send anything before you re-negotiate.
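
    That last check, just before the reply goes out, could look something like the sketch below. OkToSend is a hypothetical helper; the commented usage borrows ETHERNET.SocketStatus(0) and ResetSocket from chillybasen's post earlier in the thread and assumes they behave as shown there.

```spin
PUB OkToSend(status) : ok
  '' True only while the connection is still established ($17).
  '' $00 (socket closed) or $1C (client closed the connection) means the
  '' reply has nowhere to go, so the caller should skip it and re-negotiate.
  ok := (status & $FF) == $17

' Intended use inside the server loop (assumes the thread's ETHERNET object and ResetSocket):
'   if OkToSend(ETHERNET.SocketStatus(0))
'     'send HTML
'   else
'     ResetSocket
```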
  • Timothy D. Swieter Posts: 1,613
    edited 2011-01-28 18:28
    Regarding the state machine, should this be something that we embed in the driver and keep out of the application code? The application code would still need to know something about the state of the socket, but the gory details could be handled in the driver and the status announced to the application code if it asks for it.
  • CassLan Posts: 586
    edited 2011-01-28 19:29
    @Mike - Yeah, that's what I was wondering...whether all (or most) of the causes had been determined and there was a way to deal with them.

    @Beau - Thank you for explaining, I guess it's def worth reading the 5100 sheet.

    @Tim - There are times when folks may want this stuff true, maybe a behavior profile with which to call the driver...certain automated flags depending on how "simple" you want to use it?


    I'm planning some quality Spinneret time tomorrow muhahahahaha. Mike, I love your EEPROM IP settings save page.
    Maybe I'll have something fun to post back by EOD tomorrow!!
  • Beau Schwabe Posts: 6,566
    edited 2011-01-28 19:43
    Timothy D. Swieter,

    "Regarding the state machine, should this be something that we embed in the driver and keep out of the application code?" - For most applications I would agree, but as soon as I say that there's probably a reason why it shouldn't be managed.

    The Demo Code I posted attempts to move all of the 'gory details' to the background...
  • ags Posts: 386
    edited 2011-02-02 21:40
    I coded a simple web server (just to prove the basic building blocks for a larger effort) and during development found with my first implementation that for a client on my LAN, everything was fine. When I tried to browse with an iPad over WiFi, or my Android phone over wireless, things fell apart. It turned out to be a mistake in the way I expected state transitions to happen (added latency made intermediate steps last longer, causing problems). This was the source of my problems, and I wonder if it's the cause of the problems described here.

    I'll offer my tiny server up for your abuse - for the purpose of seeing if what I've done is truly a robust implementation that could be shared. The page is static: no buttons, just #page hits and uptime.

    I'm already shuddering at what is going to happen here - but go at it, try to bust it!

    http://xxxxxxxxx
  • eagletalontim Posts: 1,399
    edited 2014-11-29 06:58
    I just experienced my first inbound connection loss after leaving my server up and running overnight. I was able to see the Spinneret sending data out to my server and receiving the needed data back along with DHCP and SNTP checks but I was not able to access the server through a web browser. A simple reset and I was able to connect again. I am using the latest W5100 driver. Do I need to post my code or is this still an issue with the W5100?