Github is really broken today.
Heater.
Posts: 21,230
Github has been really messed up this morning.
I have this repository: https://github.com/ZiCog/xoroshiro-plusplus/tree/master/src/main/scala/XoroshiroPlusPlus which is a week old and to which I pushed changes some hours ago. If I poke around in the files there I often get a 404 Not Found Error. Sometimes a red alert appears on the page announcing that it cannot supply the latest commits at this time.
To make things more interesting some hours ago I renamed that repo to just "xoroshiro". Well https://github.com/ZiCog/xoroshiro is 404 Not Found.
But in my home page xoroshiro was listed as one of my repos earlier today and I was able to surf it. Not any more.
Basically, github is serving up random files or randomly saying things are not found.
At one point I could not clone either of these as git clone said it could not find them at all!
Seems it's not just me:
https://www.theregister.co.uk/2018/10/22/github_down_storage_failure/
And MS hasn't even taken over github yet.
Luckily git is a distributed system. Even if github disappeared altogether, things can just keep rolling along here.
Comments
It's weird, my repo is really out of sync. Sometimes I see "xoroshiro" in my repo list, sometimes its old name "xoroshiro-plusplus". Sometimes I can access the files of one or the other. Then again not. Sometimes I can clone one or the other, sometimes not.
The problem for many is that they have become dependent on github's issue tracker. They have no work to do today as they see no issues coming in. Or they cannot show any progress.
This is why one should use cloud systems that are distributed and fault tolerant, not "cloud" systems that are that in name only and really highly centralized.
I've just tried to access your repo, and apparently, it's coming back, slowly, file by file.
Sometimes a 404 error shows its ugly face, then, one more try and things seem to be working as expected.
It will be interesting to see where my repo lands when they have everything synced up again.
For a short while there I could not sign in at all.
Not completely working anymore. Unstable!
Another set of retries, and back again at 404. Weird behaviour, like a vintage Cuckoo Clock!
However, I naively assumed that the issue of failed disks/machines and software in distributed cloud services was a solved problem. That, basically, there are dozens, hundreds, thousands of nodes. That nodes being broken or offline was expected to be a normal situation and that systems are designed to keep working despite that, without downtime or interruption.
Why would I assume that? Well, it's been a long time since Lamport, Shostak, and Pease wrote their famous paper on the problem of "Byzantine" fault tolerance. Since then consensus algorithms such as Paxos and Raft have come into wide use.
This past year I have been using CockroachDB, a distributed SQL database that will remain functional and, importantly, consistent, provided a majority of nodes can agree on the state of things. Cockroach uses the Raft consensus algorithm. It's been kind of fun playing with this, killing off nodes and watching how well it survives. It does.
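To put numbers on the "majority of nodes" condition, here is a minimal sketch of the arithmetic only. The function names are made up for illustration and have nothing to do with CockroachDB's actual code or API.

```python
def majority(n: int) -> int:
    """Smallest number of nodes that forms a majority of an n-node cluster."""
    return n // 2 + 1


def tolerated_failures(n: int) -> int:
    """How many nodes can be lost while a majority is still reachable."""
    return n - majority(n)


def can_make_progress(n: int, alive: int) -> bool:
    """True if the surviving nodes can still form a majority."""
    return alive >= majority(n)


if __name__ == "__main__":
    for n in (3, 4, 5, 7):
        print(f"{n} nodes: majority={majority(n)}, "
              f"can lose {tolerated_failures(n)}")
```

Note that a 4 node cluster tolerates no more failures than a 3 node one, which is why such clusters are usually run with an odd number of nodes.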
Still. On the whole I think Github has done very well over the years. They did survive the biggest DDOS attack in history some time back. I have never noticed any such prolonged unavailability before.
One wonders what they use for storage. Googling around nobody seems to know.
All seems to be up and running stably again.
For example after I renamed my repo from "xoroshiro-plusplus" to "xoroshiro" it was oscillating between those two names in my repo list on my front page. That and sometimes a page was available and sometimes 404'ed under either name of the repo. Clearly different parts of their store were in different stages of update, out of sync.
From this I think we can conclude they are not ensuring consensus between nodes with any consensus algorithm like Paxos or Raft. Rather they have an "eventually consistent" distributed store. Something like MongoDB perhaps.
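A toy model of the difference, just to illustrate the guess above. This is not how github's store works (nobody here knows that), and it is a simple read/write quorum rather than Paxos or Raft. The replica class, the version numbers and the 3 replica setup are all assumptions for the sake of the example.

```python
import itertools


class Replica:
    """One copy of the repo metadata, tagged with a version number."""

    def __init__(self, name, version=0):
        self.name = name
        self.version = version

    def apply(self, name, version):
        # Only move forward: ignore updates older than what we hold.
        if version > self.version:
            self.name, self.version = name, version


def possible_single_replica_reads(replicas):
    """Every answer a client could get by asking one arbitrary replica."""
    return {r.name for r in replicas}


def possible_quorum_reads(replicas, quorum_size):
    """Every answer a client could get by asking any quorum_size replicas
    and keeping the reply with the highest version."""
    return {
        max(subset, key=lambda r: r.version).name
        for subset in itertools.combinations(replicas, quorum_size)
    }


# Three replicas; the rename has reached two of them, the third lags behind.
replicas = [Replica("xoroshiro-plusplus") for _ in range(3)]
for r in replicas[:2]:
    r.apply("xoroshiro", version=1)

single = possible_single_replica_reads(replicas)
quorum = possible_quorum_reads(replicas, quorum_size=2)
print(single)  # both names are possible, depending on which replica answers
print(quorum)  # only the new name, since any 2 of 3 include an updated replica
```

Reading from one arbitrary replica gives the flip-flopping between names seen in the repo list. Reading from any two of three always overlaps with the two that took the write, so the stale name never wins.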
This presents the slight worry that when one does a download or git clone it may be possible to get an out of date version of a repo on occasion.
Yep, all seems well just now.