Webserver randomly becomes unresponsive — Parallax Forums

Webserver randomly becomes unresponsive

chillybasen Posts: 16
edited 2014-11-29 06:58 in Accessories
I'm using Bean's code, but I've had this happen with Mike's as well.

Everything works perfectly fine, sometimes for days. And then I'll just be unable to connect. I tried telnetting to port 80 and it won't even connect. It just times out.

If I reset it, everything is fine again.

Anyone else experiencing this? Any ideas on how to fix?

Comments

  • zapmaster Posts: 54
    edited 2011-01-03 19:38
    Yes, this is happening to me.
    No, I have not solved it yet.
    Also, when my page is viewed on my network it is fine, but when viewed from the web it is cut short.
  • Beau Schwabe Posts: 6,566
    edited 2011-01-03 20:29
    I have seen this also... it appears that this is hanging somewhere in software. The reason I say this is because if I hit 'Retry' in the web browser when it claims it cannot load the page, there is consistent activity on the Spinneret lights every time I hit 'Retry'. So the request seems to make it to the Spinneret, but the Spinneret is hung waiting for something else.

    One solution might be to introduce a timeout if the data packet is not cleared to zero length in a reasonable amount of time.
  • Beau Schwabe Posts: 6,566
    edited 2011-01-03 20:59
    Partially SOLVED!!

    I was testing a program that was borderline, and adding a few extra spaces caused the Spinneret to hang.

    It has to do with the number of BYTES you are sending at once to the W5100 .... It breaks if you try to send more than 2048 bytes at once. Looking at the code at the very top...

    bytebuffersize = 2048

    .... So apparently That's my sign, smack it right on my forehead :-) Duh!


    EDIT: Breaking the HTML into shorter pieces seems to have solved my problems with the Spinneret becoming unresponsive.
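
A rough illustration of the "break it into shorter pieces" workaround, written in Python rather than Spin so it can be run on a PC. `send_in_chunks` and the `send` callback are hypothetical names, not part of any Spinneret driver; the 2048-byte limit is the `bytebuffersize` value quoted above.

```python
from typing import Callable, List

BYTE_BUFFER_SIZE = 2048  # the bytebuffersize value quoted in the post above


def send_in_chunks(data: bytes, send: Callable[[bytes], None],
                   limit: int = BYTE_BUFFER_SIZE) -> int:
    """Pass data to send() in pieces of at most `limit` bytes.

    `send` stands in for whatever routine hands bytes to the W5100;
    it is a hypothetical callback. Returns the number of calls made.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    calls = 0
    for start in range(0, len(data), limit):
        send(data[start:start + limit])
        calls += 1
    return calls


if __name__ == "__main__":
    sent: List[bytes] = []
    page = b"x" * 5000  # a page larger than one buffer
    count = send_in_chunks(page, sent.append)
    print(count, [len(part) for part in sent])
```

No single call ever exceeds the limit, which is the property the workaround relies on. It says nothing about whether the driver has drained the previous piece before the next one arrives.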
  • Daniel Harris Posts: 207
    edited 2011-01-03 22:38
    This is a good description of the exact problem I seem to be having (as described in your thread, Beau). Beau, when you say "partially solved", what is still broken? Does it still hang for you? For me, I could get my Spinneret to hang if I spammed a bunch of page requests in my browser (i.e., holding F5). Maybe a receive buffer in the Propeller or transmit buffer in the W5100 is getting full as well?

    I'll mess around with it a bit tomorrow to see if I can characterize my Spinneret's behavior a little better.
  • Beau Schwabe Posts: 6,566
    edited 2011-01-03 22:45
    Daniel Harris,

    I say partially, because I literally just stumbled on what appears to be the problem... in other words it should have further confirmation.

    Sending the string size to the serial terminal just before sending it to the W5100 is where I had an 'Ah Ha' moment.

    To answer your question... it doesn't hang right now, but under previous conditions I can cause it to hang.

    I'll let it run until tomorrow, but at around 6pm I need to take it off-line ... (Meeting with Robotics group) ... and I actually don't have a Spinneret of my own. The one I'm using is being borrowed from the Robotics club, go figure :-)
  • jstjohnz Posts: 91
    edited 2011-01-03 22:49
    Re buffer size issues. The W5100 defaults to a 2k receive and a 2k transmit buffer for each of the 4 sockets, but these buffer sizes can be altered, keeping in mind the total buffer size is 16k. If you only need 1 socket you could (I believe) have an 8k transmit buffer and an 8k receive buffer on that socket.
    Also, there is no check in the driver for the situation where you are trying to send a string longer than the buffer size. If you do, the driver will hang forever.
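
The per-socket sizes mentioned above are selected through the W5100's RMSR (receive) and TMSR (transmit) registers, which hold one 2-bit size code per socket (00 = 1 KB, 01 = 2 KB, 10 = 4 KB, 11 = 8 KB), with 8 KB available in each direction. A minimal Python sketch of how such a register value could be computed; `memory_size_register` is a hypothetical helper, not a driver function.

```python
SIZE_CODES = {1: 0b00, 2: 0b01, 4: 0b10, 8: 0b11}  # KB -> 2-bit field
TOTAL_KB = 8  # 8 KB of TX memory and 8 KB of RX memory, configured separately


def memory_size_register(sizes_kb):
    """Return the RMSR/TMSR byte for per-socket sizes given in KB.

    sizes_kb lists sockets 0..3 in order. A size of 0 marks a socket
    left unusable because earlier sockets already took all 8 KB; its
    field is written as 00.
    """
    if len(sizes_kb) != 4:
        raise ValueError("expected four sizes, for sockets 0..3")
    value = 0
    used = 0
    for socket, size in enumerate(sizes_kb):
        if size == 0:
            if used < TOTAL_KB:
                raise ValueError("a socket can only be unused once all 8 KB are assigned")
            continue  # field stays 00
        if size not in SIZE_CODES:
            raise ValueError("size must be 1, 2, 4 or 8 KB")
        used += size
        if used > TOTAL_KB:
            raise ValueError("more than 8 KB assigned")
        value |= SIZE_CODES[size] << (2 * socket)
    return value


if __name__ == "__main__":
    print(hex(memory_size_register([2, 2, 2, 2])))  # the 2 KB-per-socket default
    print(hex(memory_size_register([8, 0, 0, 0])))  # everything on socket 0
```

The same function serves both registers, since RMSR and TMSR share the layout. A driver that changes these sizes would also have to recompute each socket's buffer base address and mask, which this sketch does not cover.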
  • kuisma Posts: 134
    edited 2011-01-03 23:49
    Beau Schwabe wrote: »
    Partially SOLVED!!

    I was testing a program that was border line and adding a few extra spaces caused the Spinneret to hang.

    It has to do with the number of BYTES you are sending at once to the W5100 .... It breaks if you try to send more than 2048 bytes at once. Looking at the code at the very top...

    bytebuffersize = 2048

    .... So apparently That's my sign, smack it right on my forehead :-) Duh!


    EDIT Breaking down the html code length seems to have solved my problems with the Spinneret becoming unresponsive.

    I patched this bug weeks ago and posted it here. It's not really a matter of the size of the W5100 buffer, but the concept of the TCP implementation.

    Again, you can find my patched version at http://whiteboard.ping.se/Propeller/Network.
  • Beau Schwabe Posts: 6,566
    edited 2011-01-03 23:59
    kuisma,

    Thanks! I've implemented your patch for the Indirect driver on the Web server... If all goes well it will still be up and running by 6pm tomorrow. ...Leaving it alone til then.
  • Mike G Posts: 2,702
    edited 2011-01-04 07:43
    Thanks, kuisma. Updated mine as well.
    http://spinneret.servebeer.com:5000

    Check out the XML stylesheet transform (XSLT). It works in IE but not Firefox, and I'm not sure about other browsers. I think I know what's up though. Will check when I get home from work.
    http://spinneret.servebeer.com:5000/xslt/sensor.xml

    Also the configuration link pulls Spinneret settings from EEPROM. The save button only shows the posted values.
  • chillybasen Posts: 16
    edited 2011-01-04 09:20
    Thanks kuisma. I tried your new patch, but it still hangs after a certain number of requests.

    I'm able to pretty easily recreate the issue now by using Apache Benchmark (ab):

    ab -n 300 -c 1 http://192.168.1.252/

    After around 250 requests, it hangs. I'm guessing the number of requests depends on the number of bytes being received and sent.
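
For anyone without `ab` at hand, the same kind of sequential test can be sketched in Python. This is a minimal sketch, not a replacement for `ab`: `first_failure` and `http_fetch` are hypothetical names, and the fetch routine is passed in so the loop can be exercised without a real server.

```python
import http.client
import urllib.request


def http_fetch(url: str, timeout: float) -> int:
    """Fetch url once and return the HTTP status code."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.status


def first_failure(url, total, fetch=http_fetch, timeout=5.0):
    """Issue up to `total` requests one at a time.

    Returns the 1-based index of the first request that fails
    (timeout, refused connection, HTTP error, malformed reply),
    or None if every request succeeded.
    """
    for index in range(1, total + 1):
        try:
            fetch(url, timeout)
        except (OSError, http.client.HTTPException):
            return index
    return None
```

Calling `first_failure("http://192.168.1.252/", 300)` would mirror the `ab` run above (one request at a time, 300 in total) and report the request number at which the server stopped answering.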
  • chillybasen Posts: 16
    edited 2011-01-04 09:24
    Mike, I did just try hammering your server with 400 requests and it's still up, so maybe it's my server code as well. Can you post your latest server code?
  • Mike G Posts: 2,702
    edited 2011-01-04 09:30
    Sure, when I get home from work.
  • sstandfast Posts: 39
    edited 2011-01-04 09:32
    I too have been having this problem, but I am not sure it is 100% the fault of the driver. I have added Kuisma's modifications to the SPI driver and I am still getting the random hangs. I have also set my prop/W5100 combo up to use interrupts. When the server becomes unresponsive, the W5100 does not fire the interrupt after the "refresh" button is pressed, but I can see on Wireshark that it is generating lots of TCP traffic in response to the request. Also, since my page is essentially just a modified version of the HTTP demo (HTML embedded in the SPIN code), I know that I am not coming anywhere near the 2K buffer boundary. I have not yet had an opportunity to dig through the TCP traffic and figure out what the packets are but I will hopefully soon. I do know that it does not matter if I am on my local network or an outside network, when it hangs, I can not get a page at all.

    Shawn
  • sstandfast Posts: 39
    edited 2011-01-04 09:39
    @Mike G

    Your XML style sheet does not work in Chrome. I get the following error:
    "
    This page contains the following errors:

    error on line 2 at column 6: XML declaration allowed only at the start of the document
    Below is a rendering of the page up to the first error."

    Followed by a blank page. Just FYI...
  • kuisma Posts: 134
    edited 2011-01-04 09:45
    sstandfast wrote: »
    I know that I am not coming anywhere near the 2K buffer boundary.

    With my code, there is no longer any 2K boundary. It is all handled internally in a sound way, and you can transmit (i.e., enqueue) a 10kB request with txTCP, as TCP is supposed to work. Try it out some more, and if you still believe it is the driver, I'll have another look at it.
  • sstandfast Posts: 39
    edited 2011-01-04 10:04
    No, I think it is NOT the driver, but rather a glitch in the W5100 itself. Since I've enabled interrupts (basically replaced the .TCPListen() with waitpeq) and the fact that when the server hangs, no more interrupts are generated, the Prop code is not in question. (at least not when no responses are concerned.) Since the W5100 doesn't interrupt, the Prop sits and waits, like it should. However, I do see TCP traffic being generated by the W5100 on Wireshark, I just haven't had an opportunity to open the packets up and see if they are connection requests, or what. I should have some time later this week to dig some more, but I'll let you know what I find.

    Shawn
  • kuisma Posts: 134
    edited 2011-01-04 10:14
    sstandfast wrote: »
    No, I think it is NOT the driver, but rather a glitch in the W5100 itself. Since I've enabled interrupts (basically replaced the .TCPListen() with waitpeq) and the fact that when the server hangs, no more interrupts are generated,

    It sounds like a race condition. Do you check you have processed everything before clearing the interrupt, or do you only process one event?
  • sstandfast Posts: 39
    edited 2011-01-04 11:07
    Right now, I am only processing TCP Connect events on socket 0. Any other interrupts are ignored. I suppose I should be responding to 'Timeouts' as well by resetting the socket. Here is the bit of code associated with interrupts:
      'Enable W5100 Interrupts on Socket 0
      dira[8] := w5100#_input
      W5100.ReadSPI(W5100#_IMR, @InterruptMask, 1)'Get current Interrupt mask
      InterruptMask |= 1  'Turn on Socket 0 interrupt
      w5100.WriteSPI(true, W5100#_IMR, @interruptMask, 1) 'Write new Interrupt mask to W5100
    
      repeat
    
        repeat  'Wait for an interrupt to occur
          waitpeq(0, |< 8, 0) 'Conserve power by halting until the INT pin goes low
          repeat until not lockset(lock_id)
          W5100.ReadSPI(W5100#_IR, @InterruptMask, 1) 'Get source of interrupt
          if InterruptMask & $01  'If socket 0 interrupt
            W5100.ReadSPI(W5100#_S0_IR, @InterruptMask, 1) 'Get Socket Interrupt Status
            W5100.WriteSPI(true, W5100#_S0_IR,@InterruptMask, 1) 'Clear Socket Interrupt; This should drive INT pin back High unless other interrupt sources are enabled.
            if InterruptMask & $01 'We only care about Connection Established Interrupts
              quit  'If connected continue          
          lockclr(lock_id)
    

    Perhaps I should try this variation when I get home tonight:
      'Enable W5100 Interrupts on Socket 0
      dira[8] := w5100#_input
      W5100.ReadSPI(W5100#_IMR, @InterruptMask, 1)'Get current Interrupt mask
      InterruptMask |= 1  'Turn on Socket 0 interrupt
      w5100.WriteSPI(true, W5100#_IMR, @interruptMask, 1) 'Write new Interrupt mask to W5100
    
      repeat
    
        repeat  'Wait for an interrupt to occur
          waitpeq(0, |< 8, 0) 'Conserve power by halting until the INT pin goes low
          repeat until not lockset(lock_id)
          W5100.ReadSPI(W5100#_IR, @InterruptMask, 1) 'Get source of interrupt
          if InterruptMask & $01  'If socket 0 interrupt
            W5100.ReadSPI(W5100#_S0_IR, @InterruptMask, 1) 'Get Socket Interrupt Status
            W5100.WriteSPI(true, W5100#_S0_IR,@InterruptMask, 1) 'Clear Socket Interrupt; This should drive INT pin back High unless other interrupt sources are enabled.
            if InterruptMask & $08 'Check for socket timeout
               'Connection terminated
               W5100.SocketClose(0)
    
               'Clear any interrupts generated by closing the socket
               InterruptMask := $FF
               W5100.WriteSPI(true, W5100#_S0_IR, @InterruptMask,1)
               
               'Once the connection is closed, need to open socket again
               OpenSocketAgain
               
            elseif InterruptMask & $01 'We only care about Connection Established Interrupts
              quit  'If connected continue          
          lockclr(lock_id)
    

    I'll let you know what happens.

    Shawn
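
To make kuisma's question concrete: the listings above read S0_IR, write the same value back (which clears every flag that was set), and then act on only some of the bits. A small Python sketch of which events get cleared without being handled. The bit names follow the W5100 socket interrupt register layout (bit 0 CON, bit 1 DISCON, bit 2 RECV, bit 3 TIMEOUT, bit 4 SEND_OK); `decode_sn_ir` and `unhandled_events` are hypothetical helpers.

```python
SN_IR_BITS = (
    (0x01, "CON"),      # connection established
    (0x02, "DISCON"),   # peer requested disconnect
    (0x04, "RECV"),     # data received
    (0x08, "TIMEOUT"),  # connect or transmit timeout
    (0x10, "SEND_OK"),  # send completed
)


def decode_sn_ir(value):
    """Return the names of the interrupt flags set in a Sn_IR value."""
    return [name for mask, name in SN_IR_BITS if value & mask]


def unhandled_events(value, handled=("CON",)):
    """Flags that would be cleared by writing `value` back but never acted on."""
    return [name for name in decode_sn_ir(value) if name not in handled]


if __name__ == "__main__":
    # A disconnect and a timeout arriving together, handler only knows CON:
    print(unhandled_events(0x0A))
    # Same value with the second listing, which also handles TIMEOUT:
    print(unhandled_events(0x0A, handled=("CON", "TIMEOUT")))
```

With the first listing any value without the CON bit is cleared and dropped. The second listing adds TIMEOUT, but a DISCON on its own would still be discarded, which may or may not matter for this hang.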
  • Beau Schwabe Posts: 6,566
    edited 2011-01-04 11:42
    Last night, just after I posted and changed to the patched version of the Indirect driver, someone hit me with about 300 requests. ... chillybasen, was that you? :-) ... As a result my end locked up and I had to reset it.

    I'm not convinced that this is an issue within the W5100 ... I still think it might be how the software handles the W5100. Anyone run a trace as to where 'in code' the Propeller is when this happens?

    The reason I'm still leaning that the W5100 is not at fault is because I can hit it when it 'appears' to be locked up and the Status lights respond as I would expect them to. ... The page just doesn't load, 'I think' because the code is looking for something else to happen and doesn't see the request in order to present the html code to the browser.
  • Mike G Posts: 2,702
    edited 2011-01-04 11:48
    I don't think this is a hardware issue either. I have 1359 hits today and counting. I'm also able to retrieve the 500k Spinneret pdf.
    http://spinneret.servebeer.com:5000/docs/32203.pdf
  • chillybasen Posts: 16
    edited 2011-01-04 12:18
    haha, nope, but I would have if you posted the URL :)
  • Mike G Posts: 2,702
    edited 2011-01-04 12:20
    @sstandfast, did you happen to view the source in Chrome? I think the problem is with the headers sent back from the server. IE does not care, but Firefox and, I guess, Chrome do care. Not sure though until I get home.
  • sstandfast Posts: 39
    edited 2011-01-04 12:39
    I too see that the W5100 is still talking via TCP even when it is hung. However, I have my scope connected to the ~INT line, and when the server is "locked up" I do not see it pulsing low in response to the TCP traffic, indicating to me that the problem is on the W5100 side, i.e., it is not establishing a TCP connection. *Edit* This could be due to software misconfiguration (such as clearing an interrupt without performing some action first) or it could be hardware. */Edit* Now it could be that the pulse is too fast for my scope (it is only a 10 MHz single-channel B&K Precision scope) and I am just not seeing it, so I will modify the interrupt code to stretch the time before it clears the S0_IR register and see if it shows up then. Although I thought that even at an 80 MHz clock, SPIN only appears to run at about 1.5 MIPS, which means I should be able to see a nice little pulse.

    Right now, my interrupt code only responds to a socket 0 connection established interrupt. All other interrupts are ignored. I will also check the status of the ISR when the server locks up to see if it is waiting for something to happen, like waiting for me to reset the socket after a connection timeout or similar. I'll post back when I get home.

    Shawn
  • sstandfast Posts: 39
    edited 2011-01-04 12:43
    @Mike G.

    Below is a copy of the source. The most notable thing is the presence of a blank line as the first line in the file.
    
    <?xml version="1.0" encoding="utf-8"?>
    
    <?xml-stylesheet type="text/xsl" href="sensor.xsl"?>
    
    <root>
    
      <readings>
    
        <reading time="1">
    
          <ir distance="10"  />
    
          <ping distance="12" />
    
          <tilt x="1" y="2" z="3" />
    
        </reading>
    
        <reading time="2">
    
          <ir distance="12" />
    
          <ping distance="36" />
    
          <tilt x="85" y="100" z="-69" />
    
        </reading>
    
      </readings>	
    
    </root>
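
Chrome's complaint can be reproduced off-line: an XML declaration is only accepted at the very start of the document, so even a single leading blank line makes a strict parser reject it. A minimal Python sketch using the standard library parser; `parses_cleanly` is a hypothetical helper and the sample document is a cut-down stand-in for the file above.

```python
import xml.etree.ElementTree as ET


def parses_cleanly(document: bytes) -> bool:
    """Return True if the document parses as well-formed XML."""
    try:
        ET.fromstring(document)
    except ET.ParseError:
        return False
    return True


GOOD = b'<?xml version="1.0" encoding="utf-8"?>\n<root><readings/></root>'
BAD = b"\n" + GOOD  # same document with one blank line in front

if __name__ == "__main__":
    print(parses_cleanly(GOOD))
    print(parses_cleanly(BAD))
    print(parses_cleanly(BAD.lstrip()))  # stripping the leading blank line fixes it
```

This only demonstrates the parser rule behind the error message; it does not say where the extra line comes from on the server side.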
    
  • Mike G Posts: 2,702
    edited 2011-01-04 13:17
    @sstandfast, yeah, when I dropped the files on the SD card this morning I forgot to add a content-type header for XML and XSL files. So you're seeing the main header content render a line, then a condition block that is never entered, and finally two blank lines, for a total of 3 blank lines. I hope that's all it is. I'll know more when I get out of the office.
  • sstandfast Posts: 39
    edited 2011-01-04 15:56
    I just got home a few minutes ago and have looked at the TCP traffic to and from the W5100 during a "Lock-Up" condition. It appears that when the W5100 is "hung", all TCP "Connection Establish Requests" (TCP Flags value = 0x02) are responded to with "Connection Reset" (TCP Flags value = 0x04) responses from the W5100 and no interrupt is generated. This could mean that the previous TCP connection did not close properly. I will amend my code to test this theory. Attached is the capture from Wireshark if anybody wants to view the TCP traffic in a "hung" condition.


    Ethernet Capture.zip

    *Edit - I should qualify the above attachment with the fact that the Prop/W5100 is located at 192.168.0.100 and the PC is at 192.168.0.102.
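
For readers going through the capture, the flag values quoted above map onto the standard TCP flag bits: 0x02 is SYN and 0x04 is RST. A small Python sketch for decoding the flags byte shown by Wireshark; `decode_tcp_flags` is a hypothetical helper.

```python
TCP_FLAG_BITS = (
    (0x01, "FIN"),
    (0x02, "SYN"),
    (0x04, "RST"),
    (0x08, "PSH"),
    (0x10, "ACK"),
    (0x20, "URG"),
)


def decode_tcp_flags(value):
    """Return the names of the TCP flags set in the low six flag bits."""
    return [name for mask, name in TCP_FLAG_BITS if value & mask]


if __name__ == "__main__":
    print(decode_tcp_flags(0x02))  # the browser's connection request
    print(decode_tcp_flags(0x04))  # the W5100's reply in the hung state
    print(decode_tcp_flags(0x12))  # what a listening socket would normally send
```

A reset in reply to a SYN is what a TCP stack sends when nothing is listening on that port, which fits the theory that the socket never returned to the listening state.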
  • Mike G Posts: 2,702
    edited 2011-01-04 16:57
    Just got home and added the "content-type: text/xml" header for XML and XSL files. Now Firefox is responding as expected. IE, Safari, Opera, and Chrome look good too.
    http://spinneret.servebeer.com:5000/xslt/sensor.xml

    I had the web server up all day. The server received 1780 hits with no crashes, not too shabby. I requested the Spinneret PDF, which is 500k, several times with no problems. I'm seeing timeouts in my log file. Not sure of the cause. When a timeout occurs I reset the socket, and that seems to work pretty well.

    @sstandfast, thanks for the Ethernet capture.
  • jstjohnz Posts: 91
    edited 2011-01-04 23:35
    kuisma wrote: »
    I patched this bug weeks ago and posted it here. It's not really a matter of the size of the W5100 buffer, but the concept of the TCP implementation.

    Again, you find my patched version at http://whiteboard.ping.se/Propeller/Network.

    Hello,
    I have found a bug in the SPI driver that truncates transmitted UDP packets. Since you have addressed a couple of other issues in your version, you may want to add that fix as well, so that there will be a single version with all known bug fixes. It's referenced in another thread on this forum, and also in the collaboration thread.
  • Beau Schwabe Posts: 6,566
    edited 2011-01-05 09:41
    I think we (our robotics group) might have found something that could be the cause of the webserver randomly becoming unresponsive...

    It appears to be a timing issue, especially if you have dynamic HTML code that is generated.

    We reduced the html down to bare minimum, and the hang problems went away, but we could introduce a controlled amount of delay of the html generation and the hang problems came back.

    minimum HTML code:
    HTTP/1.1 404 File Not Found
    Content-Type: text/html
    

    The two lines above are sent separately using ... StringSend
    An adjustable delay using ... PauseMSec ... between the two lines was used to simulate an 'html creation' delay.

    A separate cog for html generation might be required, but then you introduce a latency with the most recent data.
    This might boil down to a speed issue between Spin vs. Pasm.

    ...Anyway, just thought I would pass this info along. I think the software driver is OK, and so is the W5100... it's a UCE ('user code error') problem.
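
One way to keep slow page generation out of the transmit path, sketched in Python under the assumption that the whole response fits in memory: build everything first, then hand it to the sender in a single call, so no delay can fall between two sends. `build_response` is a hypothetical helper, and whether this avoids the hang on the Spinneret is not confirmed in this thread.

```python
def build_response(status_line: str, headers: dict, body: bytes = b"") -> bytes:
    """Assemble a complete HTTP response before anything is transmitted.

    Header lines end with CRLF and a blank line separates them from
    the body, as HTTP requires.
    """
    lines = [status_line] + [f"{name}: {value}" for name, value in headers.items()]
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


if __name__ == "__main__":
    # The minimal response from the post above, built in one piece.
    response = build_response("HTTP/1.1 404 File Not Found",
                              {"Content-Type": "text/html"})
    print(response)
```

Any slow work (reading sensors, formatting values) then happens before the first byte is queued, at the cost of holding the whole response in a buffer.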
  • chillybasen Posts: 16
    edited 2011-01-05 09:58
    @Beau, that could definitely be my problem, as I am controlling a servo based on the request. Is there any way to thread that?