Solaris Machine Shut Down After 3737 Days of Uptime
An anonymous reader writes "After running uninterrupted for 3737 days, this humble Sun 280R server running Solaris 9 was shut down. At the time of making the video it was idle, the last service it had was removed sometime last year. A tribute video was made with some feelings about Sun, Solaris, the walk to the data center and freeing a machine from internet-slavery."
Re:Oracle sucks. (Score:5, Informative)
I don't think his comment suggested anything else. You should probably parse it like this:
(Oracle really ground my gears when they stopped supporting OpenSolaris) && (OpenIndiana is going nowhere fast)
The clause about Oracle support applies only to the left side of the statement. The point was that with support gone, and the only alternative to the supported version going nowhere, the Solaris world is completely shit out of luck.
Re:Uptime fetish (Score:5, Informative)
You can apply patches, even kernel patches, without having to restart the system. That was one of its selling points back in the day; some systems even allowed you to hot-swap or hot-upgrade CPUs and memory.
Re:So what did it do all that time? (Score:4, Informative)
No, it was idle "only" since day 3509 (served as a hot backup if we had to restore the service from the new machines).
Re:Uptime fetish (Score:5, Informative)
The summary is misleading. It was acting as a backup server for its own replacement.
Re:in other news ... (Score:5, Informative)
/usr/xpg4/something is not /bin/sh, the latter being what POSIX requires.
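The distinction is easy to demonstrate: Solaris's /bin/sh was a legacy pre-POSIX Bourne shell, while the standards-conforming utilities lived under /usr/xpg4. A quick probe (a sketch, not from the original comment; it tests whatever `sh` is on PATH, not a Solaris box) is to try a POSIX feature the old Bourne shell lacked, such as arithmetic expansion:

```shell
# Probe whether the sh on PATH understands POSIX arithmetic expansion,
# a feature the legacy Solaris /bin/sh (a pre-POSIX Bourne shell) did
# not have. On Solaris the POSIX shell was under /usr/xpg4/bin instead.
if sh -c ': $((1+2))' 2>/dev/null; then
  echo "sh on PATH handles POSIX arithmetic"
else
  echo "sh on PATH is a legacy Bourne shell"
fi
```

On any modern Linux or BSD system this prints the first message; on a stock Solaris 9 box invoking /bin/sh directly it would have printed the second.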
Re:So what did it do all that time? (Score:4, Informative)
I don't remember which paper the result was in, but I do remember the overall idea of the proof.
The general result says that to tolerate t failures there must be at least 3t+1 nodes in total.
It is a proof by contradiction: initially we assume the nodes can be partitioned into three groups, and that any two of those groups can reach a consensus without involving the third. We then show that under those assumptions the system breaks down.
So imagine two fully functional groups out of the three: the network within each group is stable, but the link between them is slow. All the nodes in the third group suffer a Byzantine failure, which causes them to send corrupted messages. Imagine that this failing group is still communicating with each of the functional groups, but sends different information to each. Under those circumstances the failing group together with one functional group can reach consensus, because we assumed any two groups can decide without the third. But at the same time the failing group can reach consensus on a different result with the other functional group.
In the above partitioning into three groups, we could have t nodes in each group, which proves that with t failures among 3t nodes we cannot reach consensus. Additionally, there exist solutions that will reach consensus with t failures among 3t+1 nodes. They are randomized, which means the runtime is theoretically unbounded, but the probability that the protocol runs forever is zero, and on average it completes quickly. For example, the Asynchronous Binary Byzantine Agreement protocol operates in rounds and has at least a 50% probability of finishing in any given round; if it fails to complete, it runs another round with another 50% chance of finishing there. The idea in that protocol is that if there are two candidate results with roughly the same number of nodes supporting each, the nodes flip a shared coin and try to agree on the coin's result. That attempt can only fail if the coin suggested the result that was behind in supporters, so there is at least a 50% chance the coin lands on the side that leads to agreement.
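Both claims above can be sanity-checked with a short script. This is a sketch using the standard quorum-intersection view of the bound (not taken from whichever paper the parent half-remembers): a deciding quorum of size q must be reachable with t nodes silent (q ≤ n − t), and any two quorums must overlap in more than t nodes so that the overlap contains at least one honest node (2q − n > t). The smallest n satisfying both comes out to 3t+1. The second half simulates the coin-flip protocol's round count under the assumption that each round finishes with probability exactly 1/2, giving a geometric distribution with mean 2.

```python
import random

def min_nodes(t):
    """Smallest n admitting a quorum size q that is (a) reachable with
    t nodes silent (q <= n - t) and (b) overlaps any other quorum in
    more than t nodes (2q - n > t), guaranteeing an honest node in
    every pairwise intersection."""
    n = t + 1
    while True:
        if any(q <= n - t and 2 * q - n > t for q in range(1, n + 1)):
            return n
        n += 1

for t in (1, 2, 3, 4):
    print(t, min_nodes(t))  # n = 3t + 1 in each case

# Rounds until agreement when each round succeeds with probability 1/2:
# a geometric random variable with mean 2.
rng = random.Random(0)

def rounds_until_done(rng):
    r = 1
    while rng.random() >= 0.5:  # coin landed on the "wrong" side
        r += 1
    return r

avg = sum(rounds_until_done(rng) for _ in range(100_000)) / 100_000
print(f"average rounds: {avg:.2f}")  # close to 2
```

The `min_nodes` search confirms the 3t+1 bound for small t, and the simulated average lands near the 2-round expectation the parent describes.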
The Byzantine failure model is a bit extreme, but that means protocols designed for it are resilient to extreme failures. The stop-dead (fail-stop) model, on the other hand, is a bit unrealistic, which means protocols designed for it are only proven correct under unrealistic assumptions. They may work in practice most of the time, but the proof of correctness isn't valid in the real world. I don't know if anybody has managed to come up with a sensible model that lies somewhere between those two.
10 years up, 1 day until copyright removal (Score:2, Informative)
If you live in Germany, the video is unavailable. Apparently it contains some music from UMG (or someone claimed it does).
10 years of uptime and one day until the video was killed by the copyright mafia. Way to go, guys!
Re:*nix does not need to reboot for more updates u (Score:4, Informative)
Kernel updates generally required reboots even in the Unix/Linux world. On Windows, you could also avoid a reboot by stopping the services being patched and restarting them after the patch was applied.