Parallax forums website: 504 Gateway Time-out — Parallax Forums

I'm getting recurring 504 errors. Each outage lasts about 15-20 minutes, and they've been recurring over the last 12 hours or so.

Wondering if it may be due to the wider Let's Encrypt certificate expiry issues?

Comments

  • I experienced the same issue yesterday that practically rendered the forum site unusable. Today's fine but will see what happens next.

  • Publison Posts: 12,154

    Got more 504's this morning. Will have to call the mothership on Monday to see what is going on.

  • @Publison said:
    Got more 504's this morning. Will have to call the mothership on Monday to see what is going on.

    Hello!
    I saw one as well. It might be the failure of the certs from that company. Then again it might be something else.

    @Evanh said:
    Wondering if it may be due to the wider Let's Encrypt certificate expiry issues?

    A good point. I wonder how many other sites are affected?

  • @Publison said:
    Got more 504's this morning. Will have to call the mothership on Monday to see what is going on.

    Hello!
    So what happened? Everything is up today, so it must have been fixed.

  • evanh Posts: 11,856

    Still happening to me less than 12 hours back.

  • Hello,

    We started seeing increased error rates on Sunday and into today. Specifically, we are seeing that the gateway is reporting that the server is unreachable and cannot deliver traffic. This appears to happen more often when the site is being crawled by various search engines.

    The server that is hosting the website is an older configuration that meters its resources when it comes under load. When it gets too busy, it idles out for 60-120 seconds. The gateway sees that and starts blocking traffic. Once the web server goes active again, the gateway begins forwarding traffic again.

    We are now working to upgrade the underlying bits, replacing the resource-sensitive server with a cluster that can scale up or down to meet the traffic.

    I will provide updates as this moves along.
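
The failure mode described above can be sketched as a small simulation. This is only an illustration under assumptions: the `Backend` and `Gateway` classes, the probe-on-every-request behavior, and the 90-second idle period are hypothetical and not taken from the actual Parallax setup.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Backend:
    """Hypothetical web server that goes idle for a while when overloaded."""
    idle_until: float = 0.0  # time (seconds) before which the server is unresponsive

    def overload(self, now: float, idle_seconds: float) -> None:
        # Model the resource metering: the server stops answering for a while.
        self.idle_until = now + idle_seconds

    def is_responsive(self, now: float) -> bool:
        return now >= self.idle_until


class Gateway:
    """Forwards a request only if the backend answers a health probe."""

    def __init__(self, backend: Backend) -> None:
        self.backend = backend

    def handle(self, now: float) -> int:
        # Assumption: the gateway probes the backend before forwarding.
        if self.backend.is_responsive(now):
            return 200  # request forwarded and served
        return 504      # backend did not answer in time


def simulate() -> List[int]:
    backend = Backend()
    gateway = Gateway(backend)
    statuses = [gateway.handle(0.0)]               # server is healthy
    backend.overload(now=10.0, idle_seconds=90.0)  # idle until t=100
    statuses.append(gateway.handle(30.0))          # during the idle period
    statuses.append(gateway.handle(99.0))          # still idle
    statuses.append(gateway.handle(100.0))         # server active again
    return statuses
```

Calling `simulate()` gives a healthy response, two 504s while the backend is idle, and a healthy response once it wakes up.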

  • evanh Posts: 11,856
    edited 2021-10-05 06:17

    Thanks Jim, that'll be appreciated.

    So the website is getting more users surfing it in general? Or just more background crap?

  • The user traffic is about the same over the past 4 weeks. The search engines are generally polite but I am seeing two and sometimes three crawling the site at the same time.

    With the new configuration, we will automatically spin up additional servers to ensure that the crawlers don't interfere with user activities on the site.

  • Publison Posts: 12,154

    Thanks for the update Jim.

  • @Publison said:
    Thanks for the update Jim.

    Hello!
    Yes, thank you Jim. You've done yeoman's service getting this, ah, spun up and still working. I was going to suggest that you (or someone in the offices) set up a mirror of everything and point the search bots to it, to keep them off the regular forum, but I suspect that would be too much work.

    But that does not explain why there is a nest of mountain lions taking over the backlawn of the firm to raise kittens. At least four families worth are there.

  • The search bots are looking for forums.parallax.com, so there isn't any way that I am aware of to force bots to look at a different address. We can reroute them once we have received the headers, but that is going to slow down all the other traffic, since we would have to look at the headers in every request.

    The plan is to scale horizontally as the load increases - keep throwing servers at the site until the load is mitigated. The system will start removing servers as the load drops to keep the accountants happy.
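
A scale-out/scale-in rule of the kind described above might look like the sketch below. The CPU thresholds, the server limits, and the function name `desired_server_count` are assumptions made for illustration; the thread does not say which metric or thresholds the real configuration uses.

```python
def desired_server_count(
    current: int,
    avg_cpu_percent: float,
    scale_up_at: float = 70.0,    # hypothetical threshold for adding a server
    scale_down_at: float = 30.0,  # hypothetical threshold for removing one
    min_servers: int = 2,
    max_servers: int = 8,
) -> int:
    """Return how many servers to run, one step at a time."""
    if avg_cpu_percent > scale_up_at:
        target = current + 1   # load is high: throw another server at it
    elif avg_cpu_percent < scale_down_at:
        target = current - 1   # load has dropped: remove a server to save cost
    else:
        target = current       # load is in the comfortable band: do nothing
    # Never go below the floor or above the ceiling.
    return max(min_servers, min(max_servers, target))
```

Stepping by one server per evaluation and keeping a gap between the two thresholds is a common way to avoid flapping between sizes when the load hovers near a single cutoff.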

  • VonSzarvas Posts: 2,467
    edited 2021-10-05 17:25

    Another protection could be to drop packets, or route them to a separate low-cost, low-resource "overload" server, based on source IP and requests per second. Something like anti-hammering protection. That avoids the risk of extra server cost if the issue really is just robots and homebrew hijinks. Various other benefits too.
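
The anti-hammering idea above can be sketched as a per-IP sliding-window limiter. This is a minimal illustration, not something the forum is known to run: the class name `PerIpRateLimiter`, the limit of 3 requests per second, and the example addresses (taken from the reserved documentation range) are all assumptions.

```python
from collections import defaultdict, deque


class PerIpRateLimiter:
    """Allow at most max_requests per source IP within a sliding time window."""

    def __init__(self, max_requests: int, window_seconds: float) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # src_ip -> timestamps of accepted requests

    def allow(self, src_ip: str, now: float) -> bool:
        hits = self._hits[src_ip]
        # Forget requests that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # over the limit: drop, or send to the "overload" server
        hits.append(now)
        return True


limiter = PerIpRateLimiter(max_requests=3, window_seconds=1.0)
burst = [limiter.allow("203.0.113.7", t) for t in (0.0, 0.1, 0.2, 0.3)]
other = limiter.allow("203.0.113.8", 0.3)   # a different IP has its own budget
later = limiter.allow("203.0.113.7", 1.05)  # the oldest hit has expired by now
```

Here the fourth request in the burst is refused, while the other address and the later request from the first address are allowed.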

  • Update:

    We have implemented fully load-balanced multi-server support for the forums. We are working now on retiring the original server to eliminate the 504 error from the site.

    Please let me know if anything is amiss and we'll get on it right away.

  • Update:

    The original web server has been taken offline and is retired. We will continue to monitor the site closely.

  • It appears that the site is unable to send email. We are working to correct that now.

  • Email is working again.

  • Seems like I only get 504s on my iPad, and on a different ISP.

  • The new server configuration appears to have resolved the 504 Gateway errors. We have not logged a single 504 error since the evening of October 5th.

    The site is now running on two dual-core ARM servers, with traffic load balanced between them. Over the past 24 hours, each server has been running at between 5% and 8% CPU. The higher value is approximately when the Microsoft Bing crawler was scouring the site.
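
For readers unfamiliar with load balancing, the simplest policy is round-robin, sketched below. The thread does not say which policy the forum's load balancer actually uses, so round-robin and the server names here are assumptions for illustration only.

```python
from itertools import cycle
from typing import List


def round_robin(servers: List[str], request_count: int) -> List[str]:
    """Assign each incoming request to the next server in rotation."""
    rotation = cycle(servers)  # endlessly repeats server-a, server-b, server-a, ...
    return [next(rotation) for _ in range(request_count)]


# Hypothetical names for the two servers mentioned above.
assignments = round_robin(["server-a", "server-b"], 5)
```

With two servers, consecutive requests simply alternate between them, which matches the roughly equal CPU figures one would expect from an even split.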

  • evanh Posts: 11,856

    Nice. Yep, no issues for me in past couple of days. Neat to hear ARM servers in use.

  • @evanh said:
    Nice. Yep, no issues for me in past couple of days. Neat to hear ARM servers in use.

    Same here in the EU. All good. With electricity prices rising sharply, these ARM servers sound like a smart choice, energy-wise.

  • Amazon is incentivizing customers to use ARM servers. They are about 20% less expensive to operate and they benchmark at about 20-30% faster than the equivalent Intel-based servers.

    6 days without a single 504-Gateway error. Time for the next step.
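
Taking the quoted figures at face value (about 20% cheaper, about 20-30% faster), the combined effect on work done per dollar can be worked out as below. The function name and the normalization of the Intel-based server to 1.0 are assumptions for the sake of the arithmetic; actual pricing and benchmarks will vary by instance type and workload.

```python
def relative_work_per_dollar(cost_ratio: float, speed_ratio: float) -> float:
    """Work per dollar relative to a baseline server with cost 1.0 and speed 1.0."""
    return speed_ratio / cost_ratio


# 20% less expensive -> cost ratio 0.8; 20-30% faster -> speed ratio 1.2 to 1.3.
low_estimate = relative_work_per_dollar(cost_ratio=0.8, speed_ratio=1.2)
high_estimate = relative_work_per_dollar(cost_ratio=0.8, speed_ratio=1.3)
print(f"{low_estimate:.2f}x to {high_estimate:.2f}x work per dollar")
```

Under those assumptions the ARM servers would deliver roughly one and a half times the work per dollar of the baseline.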
