
Solaris Machine Shut Down After 3737 Days of Uptime

An anonymous reader writes "After running uninterrupted for 3737 days, this humble Sun 280R server running Solaris 9 was shut down. At the time of making the video it was idle, the last service it had was removed sometime last year. A tribute video was made with some feelings about Sun, Solaris, the walk to the data center and freeing a machine from internet-slavery."
This discussion has been archived. No new comments can be posted.

  • by cod3r_ ( 2031620 ) on Thursday March 14, 2013 @04:36PM (#43175731)
    A *nix machine being idle for 3737 days is not all that interesting.
    • by Anonymous Coward on Thursday March 14, 2013 @04:45PM (#43175847)
      Somewhere at my last job, there was a Solaris 8 machine with over 4000 days uptime, that everybody hated to do anything with, but one person loved it and refused to migrate the last service that was still on it to something more modern.

      Uptime is irrelevant for an individual server, anyway. If there's fail over (and there should be if uptime is important), take it down and update the kernel for security reasons, who cares?
      • by crutchy ( 1949900 ) on Thursday March 14, 2013 @04:54PM (#43175941)

        If there's fail over (and there should be if uptime is important)

        i agree... if you're responsible for a single server performing a mission critical function with no fail over, you may as well just fire yourself

        • by h4rr4r ( 612664 ) on Thursday March 14, 2013 @05:04PM (#43176077)

          Just get it in writing.
          Been there done that, when it has to come down for hardware failure or something like that you can show you tried to get a backup machine, you tried to do things right.

          • by Anarchduke ( 1551707 ) on Thursday March 14, 2013 @06:11PM (#43176823)
            an even more important part of your job than ensuring failover. that is, covering your ass.
          • by AK Marc ( 707885 )
            Doesn't help. When my boss lied to my boss's boss (the owner), I had documented proof I was right. My choices were to ignore it, and take punishment for things I didn't do, or prove my boss wrong, defending my position and probably costing me my job. I stayed quiet, found a new job, and left a place where my boss would lie to sell me out.

            Proof you are right doesn't help. It has to be shared and spread long before there's an issue, or you can still end up in an unwinnable situation.
            • by Almost-Retired ( 637760 ) on Friday March 15, 2013 @12:11AM (#43179603) Homepage

              I'd differ with that. I was fresh on the job, just 2 or 3 months, long enough to get the feeling I would be the scapegoat. The owner came in, and a deal the GM had made in a bar 2 weeks back hadn't worked out, and as the 3 of us were walking to the back of the garage to look at what we had, The GM tried to say it was all my idea.

              Wrong, I skipped out in front, spun around and said this stops right here and now, I was just following orders. The owner looked at the GM, looked at me, gave a barely perceptible nod, and started walking again. I didn't get pushed to take the blame again, but I did get pushed in every other way it seemed.

              Owners didn't get to be owners without a sense of who's right and who's wrong in boss/employee differences. Tell the truth even if you lose, because if you lose, that job was looking for somebody to do it when you walked in. I'd a hell of a lot prefer to stand my ground if I'm right, and admit it if I'm wrong, and I've done quite a bit of both in my 78 years. Honesty has paid off handsomely several times.

              About 2 years later another situation came to a boil, and I was the first one called to the owner's office when he arrived. He wanted to know what it would take to fix it. I said 2 things, the gear these people are using is just plain worn out, it's been on the road non-stop for at least 5 years, I can't get parts because the parts bills aren't being paid. I need 10 grand in parts, and I can't get a P.O. for more than $200 a month, COD. Hell of a way to run a train. Besides that, the technology has moved on. It's time to upgrade.

              His next question floored me, he wanted to know if he needed a new GM. I had to say it looked like he was, at the end of the day, the biggest roadblock to making things run smoothly. Then he had another dept head paged, 3 all told in the next 30 minutes. Years later he said they all agreed with me, so we had a new GM by the next morning. That and $150,000 in new gear put out the fire. That GM didn't work so well either after a couple years, but that's another story I am not directly involved in. The 3rd one is a pussy cat and we sometimes get into very noisy arguments even now, just to entertain the troops. He's a decent man, a motivated manager, but in a war of wits with me on technical stuff, he is unarmed and knows it very very well.

              Bottom line to this story is that I had already proved my worth from the 1st day on the job because they had about half the gear packed up to go back to the factory shop, expected 2 to 3 grand each for repairs with a 2 week turnaround time. I canceled that, unpacked them and handed in parts orders at about 10% of that per machine. All were back in service inside of 10 days, half that waiting on FEDEX or UPS.

              So it was a question of who was worth more to the person who owns the place. I stayed there 18+ years, have now been retired for 11 years, and the owner and I are still friends.

              Cheers, Gene

        • by kasperd ( 592156 ) on Thursday March 14, 2013 @06:27PM (#43176941) Homepage Journal

          mission critical function with no fail over

          It is surprisingly hard to guarantee data integrity when doing a fail over.

          If you want to guarantee a system keeps operating and maintains data integrity when a single computer fails, you need at least another three computers that are still running with no failures. There is a mathematical proof for this.

          If you want to go lower than four computers, you have to make assumptions about how the failures behave. And if just one computer fails in a way that does not match your assumptions, the system will fail.

          If you do decide to go with the four computers required to handle a single failure, the protocols to ensure they agree on the current state of your data are quite complicated. The protocols have to be non-deterministic. That's another proven fact. No matter how many machines you throw at the problem, a deterministic protocol cannot handle even a single failure.

          You can get around the non-deterministic requirement if you make assumptions about the timing of communication. But you'd slow down the system unnecessarily because you'd have to wait for the maximum time you assumed packet delivery could take on every operation, and if the network was slower than you assumed, the system would fail.

          Knowing how difficult fail over can be, it is no surprise that sometimes it is decided to not bother with it and instead hire an operator, who you assume can make everything be ok as long as you have backups plus spare hardware ready to put in production.
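The four-machine figure above matches the classic bound for arbitrary (Byzantine) failures, where tolerating f faulty nodes takes at least 3f + 1 nodes in total. Reading the comment that way is an assumption, since it names no specific theorem. A minimal sketch of the arithmetic, with illustrative function names that come from no particular library:

```python
def min_nodes(faulty: int) -> int:
    """Smallest cluster that tolerates `faulty` arbitrary (Byzantine) failures."""
    return 3 * faulty + 1


def quorum_size(faulty: int) -> int:
    """Votes needed in a 3f+1 cluster so that any two quorums
    share at least one non-faulty node."""
    return 2 * faulty + 1


def max_tolerated(nodes: int) -> int:
    """Largest number of arbitrary failures a cluster of `nodes` can survive."""
    return max((nodes - 1) // 3, 0)


if __name__ == "__main__":
    for f in (1, 2, 3):
        print(f"tolerate {f} failure(s): {min_nodes(f)} nodes, quorum of {quorum_size(f)}")
    # Three machines are not enough to survive even one arbitrary failure.
    print("failures tolerated by 3 nodes:", max_tolerated(3))
```

Under weaker failure assumptions (for example, nodes that only crash and never lie) fewer machines suffice, which is the trade-off the comment describes.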

      • If you use something like Ksplice, you can install the kernel security patches without rebooting, although I don't think they were doing that here. I'm so disappointed that Oracle bought Ksplice.

        • by TheLink ( 130905 )
          The problem is ksplice hasn't been around for more than 3737 days ;).

          If you run everything on a "cluster" layer (your apps are not dependent or maybe not even aware of the noncluster layer) then you won't have such problems - you can reboot a node with minimal impact. In the old days the ones famous for uptimes were Tandem and VMS.
      • Yup. And running a full marathon is pointless and irrelevant - anyone could run 26.2 miles in a couple of months, half a mile at a time.

      • by dkf ( 304284 )

        [There go the mod points]

        Uptime is irrelevant for an individual server, anyway. If there's fail over (and there should be if uptime is important), take it down and update the kernel for security reasons, who cares?

        Not all critical services are necessarily internet facing. I know of someone who had an application that ran continually for over 10 years, highly business-critical (master video stream controller for a TV network) and with very fancy hardware attached that it was tricky to replicate. The hardware was gradually updated over that decade, as was the code of the application (dlclose() FTW!)

      • by crontabminusell ( 995652 ) on Thursday March 14, 2013 @08:25PM (#43177941)

        Somewhere at my last job, there was a Solaris 8 machine with over 4000 days uptime, that everybody hated to do anything with, but one person loved it and refused to migrate the last service that was still on it to something more modern.

        Uptime is irrelevant for an individual server, anyway. If there's fail over (and there should be if uptime is important), take it down and update the kernel for security reasons, who cares?

        It's like Cory Doctorow said in When Sysadmins Ruled the Earth [craphound.com]:

        “Greedo will rise again,” Felix said. “I’ve got a 486 downstairs with over five years of uptime. It’s going to break my heart to reboot it.”

        “What the everlasting shit do you use a 486 for?”

        “Nothing. But who shuts down a machine with five years uptime? That’s like euthanizing your grandmother.”

    • by Anonymous Coward on Thursday March 14, 2013 @05:13PM (#43176181)

      No, it was idle "only" since day 3509 (served as a hot backup if we had to restore the service from the new machines).

  • Oracle sucks. (Score:5, Insightful)

    by RocketRabbit ( 830691 ) on Thursday March 14, 2013 @04:37PM (#43175745)

    I'd just like to leave this here. Yeah, I know Linux is great and everyfink, but Solaris is excellent and better in some ways. Oracle really ground my gears when they stopped supporting OpenSolaris and OpenIndiana is going nowhere fast.

    RIP Sun.

    • Oracle never supported OpenIndiana, it's a distribution of illumos (the OpenSolaris fork).

      • Re:Oracle sucks. (Score:5, Informative)

        by tnk1 ( 899206 ) on Thursday March 14, 2013 @04:52PM (#43175917)

        I don't think his comment suggested anything else. You should probably parse it like this:

        (Oracle really ground my gears when they stopped supporting OpenSolaris) && (OpenIndiana is going nowhere fast)

        Oracle support only applies to the Left Side of the statement. The point of the statement was to suggest that with support gone, and the only alternative to the supported version going nowhere, the Solaris world is completely Shit Out of Luck.

    • I've never really understood why Oracle had to steal RHEL's distro and rebrand it as its own, when they had a perfectly good OS in Solaris which existed not just on SPARCs, but on x86s as well. As for OpenIndiana, I don't get the point of that project since it doesn't support SPARC, and there is a plethora of OSs for x86
      • Re:Oracle sucks. (Score:4, Insightful)

        by Reschekle ( 2661565 ) on Thursday March 14, 2013 @06:45PM (#43177067)

        OK, first off, it is not stolen. You cannot steal open source software. Oracle is following the GPL.

        Second, Oracle was doing OEL before they acquired Sun.

        Solaris is a technically good and high quality OS but its hardware support was limited. If you bought the Sun-branded boxes and Sun-branded cards, you were OK. However if you are white-boxing a server, you had to be careful to select chipsets that were on their compatibility list. Then support got murky at that point even then.

        I really, really love Solaris, but let's face the facts. Outside of the SPARC platform, there is no reason for Solaris. Linux does everything as well or nearly as well. Linux is weaker in some areas, but not weak enough to justify the cost and lock-in of Solaris.

        Solaris exists for Oracle to milk legacy customers on support contracts who aren't ready or willing to migrate to Linux and commodity x86 hardware . There isn't much if any new development going on, and Oracle is only pushing Solaris to new customers as part of their big data warehouse solutions (where customers have $$$$$ and want to spend it with one vendor) where they want to get people locked in to one vendor.

    • by Tom ( 822 )

      Actually, last I checked Linux can not show you an uptime of 3737 days.

      No, that's not a dig on Linux being unstable. The real reason is both more boring and more interesting at the same time. A Linux system with that kind of uptime would have to be running a kernel from a time where the uptime counter overflows after around 400 days.

      And yes, I've seen that happen. :-)
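The overflow described above is consistent with a 32-bit tick counter incremented 100 times per second, which wraps after roughly 497 days. Both the counter width and the tick rate are assumptions here, since the comment states neither. A small sketch of the arithmetic:

```python
SECONDS_PER_DAY = 86_400


def days_until_wrap(counter_bits: int, ticks_per_second: int) -> float:
    """Days before an unsigned tick counter of the given width wraps to zero."""
    return 2 ** counter_bits / ticks_per_second / SECONDS_PER_DAY


if __name__ == "__main__":
    # Assumed values: 32-bit counter, 100 ticks per second.
    wrap = days_until_wrap(32, 100)
    print(f"counter wraps after about {wrap:.1f} days")
    print("3737 days would need", int(3737 // wrap), "full wraps")
```

With those assumed values a 3737-day run would have wrapped the counter seven times, so the reported uptime would be far smaller than the real one.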

  • by Anonymous Coward

    Last place I worked at still used token ring. Packet-Packet-Give baby!

    • I'm not sure how uptime and Token Ring really compare. Though I will say that I haven't worked on *any* Token Ring since '94 -- and that was a Thomas Conrad bastardization that did 100 Mbit over fiber. Haven't touched the copper stuff since '92.

  • by StefanJ ( 88986 ) on Thursday March 14, 2013 @04:39PM (#43175787) Homepage Journal

    . . . Mar 12 11:57:03 hedvig kernel:WILL I DREAM?

  • by Anonymous Coward on Thursday March 14, 2013 @04:42PM (#43175819)

    In another 57 years the uptime command might've had rollover issues.
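The 57-year figure fits a signed 32-bit counter of seconds, which tops out after about 68 years. That the commenter had such a counter in mind is an assumption. A quick check of the numbers:

```python
SECONDS_PER_DAY = 86_400
SECONDS_PER_YEAR = 365.25 * SECONDS_PER_DAY


def years_until_rollover(uptime_days: int, value_bits: int = 31) -> float:
    """Years left before a seconds counter with `value_bits` usable bits overflows.

    value_bits=31 models a signed 32-bit integer (an assumed counter type).
    """
    remaining_seconds = 2 ** value_bits - uptime_days * SECONDS_PER_DAY
    return remaining_seconds / SECONDS_PER_YEAR


if __name__ == "__main__":
    print(f"{years_until_rollover(3737):.1f} years of headroom left")
```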

  • This is news? (Score:5, Interesting)

    by Fished ( 574624 ) <amphigory@gma[ ]com ['il.' in gap]> on Thursday March 14, 2013 @04:44PM (#43175837)

    I work at a Very Large Company (who must remain nameless.) We've got Solaris boxes that were last rebooted in the 90's. Yes. Really. Running Solaris 2.6, even.

    • Re:This is news? (Score:5, Insightful)

      by bobbied ( 2522392 ) on Thursday March 14, 2013 @05:21PM (#43176265)

      I work at a Very Large Company (who must remain nameless.) We've got Solaris boxes that were last rebooted in the 90's. Yes. Really. Running Solaris 2.6, even.

      I am not surprised. I've seen Sparc/Solaris boxes run for very long times and even when not properly cared for have run times measured in months and years. I've had to shut down boxes to move them that had been running for 5 years. We were scared to death the disk drives would not spin back up after 2 days in the truck, but when we plugged them back in, they powered right back up. Sun built some SOLID hardware and produced a SOLID operating system.

      • Re:This is news? (Score:5, Interesting)

        by Grog6 ( 85859 ) on Thursday March 14, 2013 @06:42PM (#43177037)

        Amazingly enough, in my experience, two days in a truck is not nearly as bad as a few weeks in an extremely temperature-controlled, vibration free room.

        The drives will weld to the platter if there's no vibration or movement after "spinning themselves flat" over many years' time.

        Apparently, all the micro-projections on the surface of the heads and disks get worn off over time, making the disk and heads Extremely flat; they stick like glue when the air barrier between them escapes over time.

        Thermal changes and ambient vibration are apparently enough to keep things 'fluid', and not as likely to stick.

        YMMV.

        • by dbIII ( 701233 )
          It's well known with bearings and is actually due to stuff from one polished surface diffusing into the other. The smoother the surface the greater the chance of it happening.
    • by amicusNYCL ( 1538833 ) on Thursday March 14, 2013 @06:20PM (#43176889)

      I work at a Very Large Company (who must remain nameless.) We've got Solaris boxes that were last rebooted in the 90's. Yes. Really. Running Solaris 2.6, even.

      I'm willing to hazard a guess who you work for. Let's see.. you're running servers that have an OS that was released in 1997, and apparently you haven't rebooted them since. Almost like your company is stuck in the mid- to late-90s. You're the only Slashdotter I've seen with an AOL instant messenger screen name in their profile. That can't be a coincidence. You work for AOL. They have you designing the latest Free CD labels.

    • Re: (Score:3, Funny)

      by shafty ( 81434 )

      Interesting, I left a Very Large Company in the late 90's after having set up a few Solaris 2.x machines for our R&D projects. I had a Quake server running on one of them. There was a lot of incentive to keep that server up.

  • Uptime fetish (Score:2, Insightful)

    I will never for the life of me understand the "uptime fetish" that uneducated sysadmins have. Who the hell cares? The only people who give a crap about this sort of thing are linux fanbois. The only thing this tells me is that this machine has had an uninterrupted power supply, which is mildly impressive. Otherwise it's a Solaris box which is missing A SHITLOAD OF PATCHES. WTF, sysadmins? What kind of pro sysadmin worships at the altar of individual machine uptime? Much less a Solaris sysadmin?
    • Re:Uptime fetish (Score:4, Insightful)

      by FileNotFound ( 85933 ) on Thursday March 14, 2013 @04:47PM (#43175875) Homepage Journal

      Funny because you're right - "Impressive UPS" is all I thought.

    • Re:Uptime fetish (Score:4, Insightful)

      by tepples ( 727027 ) <tepplesNO@SPAMgmail.com> on Thursday March 14, 2013 @04:47PM (#43175877) Homepage Journal

      Otherwise it's a Solaris box which is missing A SHITLOAD OF PATCHES.

      Apply a patch to a service and restart the service, not the whole computer. Or what am I missing?

      • Re:Uptime fetish (Score:4, Insightful)

        by Richard_at_work ( 517087 ) on Thursday March 14, 2013 @04:54PM (#43175943)

        Impressive if you can do that on the kernel and still be confident of stability.

      • by Cenan ( 1892902 )

        You can't really patch the kernel while it's running

      • He's used to microsoft or apple products?

      • Re:Uptime fetish (Score:5, Insightful)

        by Bacon Bits ( 926911 ) on Thursday March 14, 2013 @05:10PM (#43176135)

        You have no idea if the system can start from a cold boot. And if it fails to start from a cold boot, you have no idea which of the hundreds of patches you've applied in the last 10 years is the one that is causing the boot process to fail, or if it's hardware that's randomly gone sketchy. The last known-good cold state is 10 years ago.

        Power systems fail. Backup power is limited. Buildings get damaged and remodeled. For these reasons it is unwise to assume you will never need to power a system off. Even with the super hotswapping of the VAX you would occasionally need to move the system to a different building with new server rooms. If you never demonstrate that a server can safely power back on to a running state, you have no idea what state the system will be in when you do it.

        Consider the system in this article for a moment. The last service was removed last year. Why was it left powered on? It was literally doing nothing but counting the seconds until it was shut down today. That's a disgusting waste of power.

    • Re: (Score:3, Funny)

      by Anonymous Coward

      Boy, you must be fun at parties.

    • Re:Uptime fetish (Score:5, Informative)

      by guruevi ( 827432 ) on Thursday March 14, 2013 @05:00PM (#43176023)

      You can get patches, even kernel patches without having to restart the system. That was one of its selling points back in the day, some systems even allowed you to hot-swap or hot-upgrade CPUs and memory.

      • by guruevi ( 827432 )

        And with the right hardware, my OpenSolaris still does it. It "reboots" the kernel but never has to go through the whole BIOS thing. If you ever however have the wrong drivers (like Areca) the system is simply going to complain it can't quiesce the driver and reboot anyway.

      • Actually, the video shows him rooting the box with a custom compiled program as a normal user without entering any password, so it's likely not updated.
    • Re: (Score:2, Troll)

      by jedidiah ( 1196 )

      Why would "missing patches" be of concern for a Unix machine?

      That sounds like the sort of thing a WinDOS consumer would need to be fixated on, not an "educated sysadmin".

      • Why would "missing patches" be of concern for a Unix machine?

        Missing services patches can leave one vulnerable to being hacked. Fortunately, you don't need a reboot to install those. Security related kernel patches do happen and they do require a reboot. However, these are generally of the privilege escalation variety and require specially written code to exploit. If you don't have untrustworthy people logging in to your machine it isn't a major problem if you don't have all the kernel patches.

        Of more serious concern is the general lack of patches for Solaris 9.

    • by Bigbutt ( 65939 )

      The unfortunate software house where the dev teams are broken up after a project is complete. Then approvals are denied to patch systems because there are no devs to correct for any problems that occur due to the patch.

      Uptime is all I have.

      (FreeBSD box with 3,196 day uptime running internal DNS).

      [John]

      • by h4rr4r ( 612664 )

        Why would the box hosting DNS need to stay up?

        Mine could stay up that long, but there are a bunch of VMs doing that task so rebooting them is no big deal.

        Linux, but no real reason not to do the same with FreeBSD.

    • Re:Uptime fetish (Score:5, Insightful)

      by Anonymous Coward on Thursday March 14, 2013 @05:09PM (#43176127)

      If you don't care, you don't understand history. And sadly, looking at your attitude and phrasing, I got a feeling you're older than I and should know it better.

      That you understand it's not worthy of worship is a mark in your favor -- but not as big as you're hoping.

      It's not fanboyism. It's from the old cult of service. From taking your limited resources on a system that costs more than your pension, and absolutely positively guaranteeing they were available to your userbase.

      We didn't all have roundrobin DNS, sharding, clouds in the early 2000's.

      Some of us had Sun's, BSD's, Vaxen, and other systems that might be missing security fixes, but that by and large were secure as long as you made sure nobody that didn't belong on it had an account.

      Kernel and driver patches? It might be a performance boost, it might be a security patch. It might be a driver problem that could cause data loss, but only if you were running a certain service. A great admin can choose which are needed. A good admin knows they should apply them all

      There's something to be said about rebooting machines -- just to make sure they'll still boot. But the best sysadmins didn't need to check -- they knew.

      Uptime differentiated us from our little brothers running windows, who couldn't even change network settings without a reboot. Who had to restart every 28 days or crash horribly. Who could be brought to a grinding halt with a single large ICMP request.

      In short, uptime was an additional proxy variable for admin competence (given the presence of an unrooted box).

      Yeah, any idiot could leave a system plugged into a UPS in a closet and have it come out OK. But if you didn't get cracked and filled with porn, you were doing something right.

      Given elastic clouds, round robin DNS, volume licensing, SAS... it's very nearly cheaper to spin up a new image and run the install scripts than reboot these days.

      I'm not convinced this makes modern sysadmin practices better -- just more resilient to single-host failure.

      Just the other week we had a million dollar NAS go down for nearly 12 hours (during the week) while applying a kernel update to the cluster.

      If you did that in 99 on a Unix system, you'd have probably been shot after the execs showed you out the door.

      Somehow, the cult of service availability has been replaced with the cult of 'good enough'

    • Re:Uptime fetish (Score:4, Interesting)

      by arth1 ( 260657 ) on Thursday March 14, 2013 @05:12PM (#43176163) Homepage Journal

      The old adage holds true: Iffen ain't broke, don't fix it.

      If the machine is in an area where security is important, certain security patches might be needed. But that's no certainty. Other patches - well, with an uptime of 10+ years, adding a stability patch which causes downtime seems rather counter-productive.

      Then, experienced sysadmins, which you clearly are not, know that like the most dangerous time for an airplane is during takeoff and landing, the most dangerous time for a server is during shutdown and start. Stiction on old drives, minor internal power surges during boot that doesn't affect a running system, and much else can cause problems.

      Oh, and there are also services that you may want to provide 24/7 with no downtime at all, so help you cod. You even mention one such in your nickname. But I have strong doubts whether you truly have kept that service up and running 24/7, even with failovers, if you install patches and reboot just to install patches and reboot.

      • Re:Uptime fetish (Score:4, Interesting)

        by DerekLyons ( 302214 ) <fairwater@@@gmail...com> on Thursday March 14, 2013 @07:00PM (#43177193) Homepage

        Then, experienced sysadmins, which you clearly are not, know that like the most dangerous time for an airplane is during takeoff and landing, the most dangerous time for a server is during shutdown and start. Stiction on old drives, minor internal power surges during boot that doesn't affect a running system, and much else can cause problems.

        On the other hand, I worked on a system for the US Navy that controlled Trident-I missiles... we rebooted both of our main computers every six hours to ensure that we could reboot them when needed - and the first one after midnight included an extensive hard drive self test to make sure it was working to spec. The gentleman down thread has it right, the answer to 100% uptime is redundancy and failover or switchover, not relying on nothing ever going wrong.
         
        In addition, you seem to be unclear on the difference between a reboot and power cycling... In the latter case, if you're worried about stiction and power surges, that's an indication that you should have been thinking about replacing the machine for quite a while rather than hoping nothing ever goes wrong. Because eventually, something will - and when that happens, now you've potentially got two problems... the one that brought the machine to its knees, *and* the undiscovered ones because you've never rebooted or cycled power.

        • by arth1 ( 260657 )

          if you're worried about stiction and power surges, that's an indication that you should have been thinking about replacing the machine for quite a while rather than hoping nothing ever goes wrong

          More likely, someone should have thought of that long before the hardware became legacy. When a new sysadmin comes aboard, the best that can be done for legacy systems is often to keep spares and backups, and try not to trigger any faults. The software might not be supported, and the cost of porting can run to millions.

          You're lucky if you've never had to support legacy systems. And a company that has them is lucky if they don't get a new sysadmin who first thing causes downtime by well-meaning patching t

    • by Pecisk ( 688001 )

      Rarely you see trollish behavior modded insightful, therefore I will bite.

      First of all, "linux fanbois" pitched uptime as a feature 13 to 15 years ago, when Windows stability was a joke. And feature-wise, Linux systems weren't less complex than Windows ones. Microsoft just made quite a number of fundamental mistakes in designing the Windows 95/98/98SE/ME line and also older Windows NT versions, having the graphics driver in ring 0, for example. All this made Windows usable only with regular reboots. Yes, there was carefully

    • by jgrahn ( 181062 )

      I will never for the life of me understand the "uptime fetish" that uneducated sysadmins have. Who the hell cares? The only people who give a crap about this sort of thing are linux fanbois.

      I don't care about uptime per se, but I hate to try to attach to a screen(1) session only to discover it's gone because someone decided it was somehow "good for the machine" to have a power cycle.

      I don't ask for years of uptime, just no gratuitous reboots.

    • The simple fact is that it ran for ten years without needing the patches. I have run Linux web server machines for 4 years with ZERO maintenance. The PSU invariably gives up the ghost after 3 to 4 years. They were never updated and never compromised in all that time.
    • Re:Uptime fetish (Score:4, Interesting)

      by bobbied ( 2522392 ) on Thursday March 14, 2013 @05:40PM (#43176475)

      It's not like Sun has issued very many Solaris 2.6 patches in the last few years...

      Besides... Many Solaris patches simply didn't require a full reboot. In fact, unless you are changing the Kernel, there was no reason to because it just takes longer. Then there is the mission critical system that is on an isolated network that you take a "If it ain't broke, don't fix it" approach. Who cares what patches are on or not? The system just needs to work, day in and out, sans patches.

      Windows users amaze me with all the "got to reboot the box" they put up with. Install software? Reboot! Install new drivers? Reboot! Things start to slow down for unknown reasons? Reboot! I simply don't believe that it should be necessary to reboot a box very often. Reboots should not be required unless you are changing hardware and have to actually power it off or need to change parts of the memory resident portions of the operating system (i.e. the booted kernel image). Windows is getting better about this, but you still need to reboot it way too often for all the "recommended" patches to get installed.

    • by tnk1 ( 899206 )

      I don't know about you, but if I am running a stable version of Solaris, with a version of whatever application that we run on it that is no longer having to be constantly updated, that is pretty much the holy grail of production. While a bazillion hours of uptime doesn't guarantee that is the case, it is a necessary condition for it happening.

      As for patches, half the time new code causes as many problems as they fix. And besides, if you're running something like 2.6 or 2.8 on it, you're so far out of sup

  • Router's at "Time: 14:08:44 up 335 days, 13:29, load average: 0.37, 0.11, 0.02". That's the best I've got. Longest running computer is "1:46pm up 280 days, 21:01, 3 users, load average: 0.00, 0.01, 0.00". Tho it is a roughly 15 year old machine and it's had longer runs than the current run, I doubt it's broken a thousand days straight. But 335 and 280 days is pretty good for equipment that's not plugged into a UPS.

  • by jayhawk88 ( 160512 ) <jayhawk88@gmail.com> on Thursday March 14, 2013 @04:58PM (#43176001)

    Did they power it back up again after shutting it off? Just to see?

  • by yakatz ( 1176317 ) on Thursday March 14, 2013 @04:58PM (#43176005) Homepage Journal
  • Netware 3.12 (Score:5, Interesting)

    by slaker ( 53818 ) on Thursday March 14, 2013 @04:59PM (#43176019)

    One of my clients had a Netware 3.12 machine on site that operated continuously for about 16 years. It was retired unceremoniously when they moved to a new location, but that machine did not in all its life have a hardware fault or abend.

  • by onyxruby ( 118189 ) <onyxruby&comcast,net> on Thursday March 14, 2013 @05:11PM (#43176159)

    Last place I was at that had server admins that bragged about /years/ of uptime quickly turned into a discovery that we had thousands of servers that had not been patched in years. Only a few systems can patch the kernel without rebooting and those are the exception, not the rule. It turned into a six month project but in the end we were patching systems that were vulnerable to 5 year old exploits (mix of *nix and Windows).

    I had to make the argument that server uptime meant jack, and to make it I put forward the argument that the only thing that mattered was /service/ uptime. Frankly it is the service that needs to be always available, not the server. This is why you have maintenance windows, for the explicit purpose of allowing a given system to be patched and rebooted at a predictable time without interrupting services.

    If your server is really that important it will have a fail over server for redundancy (SQL cluster, whatever). If your server isn't important enough to have a failover server for service redundancy then it isn't so important that you can't have a maintenance window. Think service, not server!

    The only thing that matters is service availability.
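The "service, not server" point can be put in numbers. The sketch below assumes servers fail independently and that failover is instantaneous, neither of which holds exactly in practice; the function names and the 99% figure are illustrative only.

```python
HOURS_PER_YEAR = 8_760


def service_availability(per_server: float, replicas: int) -> float:
    """Probability that at least one of `replicas` independent servers is up."""
    return 1 - (1 - per_server) ** replicas


def downtime_hours_per_year(availability: float) -> float:
    """Expected hours per year during which the service is unavailable."""
    return (1 - availability) * HOURS_PER_YEAR


if __name__ == "__main__":
    single = 0.99  # hypothetical server that is down 1% of the time for patching
    for n in (1, 2, 3):
        a = service_availability(single, n)
        print(f"{n} server(s): {a:.6f} available, "
              f"{downtime_hours_per_year(a):.2f} h/year down")
```

Under these assumptions a pair of servers that each reboot regularly gives a more available service than one server that never reboots, which is the argument for maintenance windows plus failover.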

  • That is the stability of UNIX and the advantage of using a mature code base. Try doing THAT with Windows!
  • by prisoner-of-enigma ( 535770 ) on Thursday March 14, 2013 @05:48PM (#43176581) Homepage

    Kevin Flynn was trapped in there!

  • by Shag ( 3737 ) on Thursday March 14, 2013 @09:25PM (#43178515) Journal

    I know, I know, it's just a coincidence...

  • by stox ( 131684 ) on Thursday March 14, 2013 @10:30PM (#43178975) Homepage

    I don't know this for sure, but I suspect there is one out there with 30 years of uptime now, or damn close to that, running Unix-RTR as part of a 5ESS switch.
