Data Storage Hardware

Hitachi Promises 4-TB Hard Drives By 2011

zhang1983 writes "Hitachi says its researchers have successfully shrunk read heads in hard drives to the range of 30-50 nanometers. This will pave the way for quadrupling today's storage limits to 4 terabytes for desktop computers and 1 terabyte on laptops in 2011." Update: 10/15 10:39 GMT by KD: News.com has put up a writeup and a diagram of Hitachi's CPP-GMR head.
  • Waiting for... (Score:5, Insightful)

    by Steffan ( 126616 ) on Monday October 15, 2007 @01:37AM (#20979403)
    Cue the "Nobody needs more than []300GB []1TB []x because I don't have a reason for it" posters
    • Re: (Score:3, Funny)

      by TheBOfN ( 1137629 )
      Don't forget the "finally something to hold all my pr0n" posts...
    • Re:Waiting for... (Score:4, Insightful)

      by MikeFM ( 12491 ) on Monday October 15, 2007 @01:49AM (#20979477) Homepage Journal
      If I could afford petabytes I'd use petabytes. There is no real limit to the amount of HDD space I can go through. No matter how much I add I always feel like I'm running out of space. I'm always shuffling around a couple hundred gigs here and a couple hundred gigs there to try to fit stuff in. This weekend I downloaded over 100GB of files from the web, several gigs of files using BitTorrent, and had several gigs of mail.

      Even my non-geek friends and family are starting to feel the pain as working with video and BitTorrent becomes more common. Multiple-TB usage won't be that uncommon, I think. What we really need now though is RAID-5 for the average Joe.
    • actually, no, we're going to change to "look at the colossal amount of largely useless unimportant data those schmucks will lose; look at the colossal amount of data they'll have no means to back up within the budget of the home user, hahahaha!" I trust this will make you feel much better.
      • You use another hard drive for backup: that's not difficult. For off-site or archival backup, you use yet another hard drive.

        I use this approach at work, rather than spending colossal amounts of money on expensive tape libraries and backup software. It seems quite effective, although it does require a bit of thought to use effectively. (Don't back up live MySQL databases, write them to a backup file!)
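
        For example, a minimal dump-then-copy sketch of that approach (the paths, database scope, and the external mount point are only placeholders for illustration):

        # dump the databases to a dated SQL file instead of copying live MySQL data files
        mysqldump --single-transaction --all-databases > /backup/mysql-$(date +%F).sql
        # then copy the dump (and anything else worth keeping) to the second / off-site drive
        rsync -a /backup/ /mnt/offsite-drive/backup/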
    • by TrevorB ( 57780 )
      I built a 4x320GB RAID5 array last November for my MythTV Backend.

      It has confirmed my belief that all new large hard drives will fill up in 4 months.

      Sad part is 900GB seems kinda small by today's standards.
      • Yeah, I've got an identical setup.

        Hard drives cannot keep up with space demands these days.
        RAID 5 is the only way to go if you want a lot of cheap redundant space.
      • Re: (Score:3, Informative)

        by tomstdenis ( 446163 )
        Good lord, either you're encoding at like 10 Mbit/sec, or you're not throwing away old shows. My Myth setup ran at like 3.2 Mbit/sec-ish on a single 80GB drive and I was able to keep at least 2-3 episodes of each show in backlog. 900GB? I'd probably be able to keep a week or two...

        It's not the size of your RAID mate, it's how you use it.
    • Re:Waiting for... (Score:5, Insightful)

      by mcrbids ( 148650 ) on Monday October 15, 2007 @02:35AM (#20979693) Journal
      Cue the "Nobody needs more than []300GB []1TB []x because I don't have a reason for it" posters

      Actually, my sickened mind went in a completely different direction... remember when we were going to have 8 GHz Pentium 4s with 6 GB of RAM to run Windows Vista?

      Heck, it's still common to see computers sold with 256 MB of RAM, which wasn't a particularly large amount 5 years ago... that it's even salable today speaks volumes. I have an "end of life" Pentium 4 2.4 GHz that I picked up this w/e for like $50. 20 GB HDD, 512 MB DDR RAM, CD, Sound, etc.

      Other than the small-ish HD and the CD instead of the DVD, this system is not significantly different than a low-end new system. And, when it was first sold 3-4 years ago, its specs weren't particularly exciting.

      Point being, there's a "we don't talk about it" stagnation going on in the Computer industry. I honestly think that most of the new purchases are based on the expectation of EOL and the spread of viruses. It's gotten to where it's actually cheaper to buy a new computer than it is to reload your old one. Part of that is the fact that it takes a full business day of rebooting the computer to update Windows from whatever came on the CD.

      This part just floors me. I have the original install disk for the aforementioned $50 Dell 2.4 GHz system, and am reloading from scratch so it's all clean. It takes ALL FREAKIN DAY simply to update Windows to the latest release, with a 1.5 Mbit Internet connection (not high end, but still no particular slouch).

      Yet it takes about an hour and just ONE short line to update CentOS (RHEL) to current:

      # yum -y update; shutdown -r now;
      I'm getting spoiled by the "ready to go in 10 minutes, fully updated in under an hour with no oversight" way of getting things loaded. Windows is just a serious pain in the neck, IMHO.

      My point to all this?

      The computer industry has (finally) reached a stable point. Performance increases are flat-lining to incremental, rather than exponential, and there's little incentive to change this, since a 4-year-old computer still does most anything anybody needs a computer to do. There will always be a high-performance niche, but it's a niche. The money has moved from computing power to connectivity.

      People no longer pay for processing power, they pay for connections. Thus the Intarweb...
      • by DrYak ( 748999 ) on Monday October 15, 2007 @03:34AM (#20979929) Homepage
        There's a small thing you failed to take account for.

        Yes, indeed, we've reached the point where any computer, even if 4 years old, is good enough to do most day-to-day activities (hanging around on the web, writing some stuff in a word processor, e-mails, and ROFL/LMAOing on AIM/MSN/GMail/Facebook or whatever is the social norm du jour).
        Case in point, my current home PC is still Intel Tualatin / 440BX based.

        *BUT*...

        . I honestly think that most of the new purchases are based on [ ... ] the spread of viruses. It's gotten to where it's actually cheaper to buy a new computer than it is to reload your old one.

        As you said (and that's something I can confirm around here too), Joe Six Pack buys a new computer every other year, just because his current machine is crawling with viruses and running too slow (and spitting pop-ups by the dozen). He either pays wads of cash to some repair service that may or may not fix his problems and may or may not lose his data in the process, while he waits without a machine for a couple of days; or he gets a new machine. And...

        remember when we were going to have 8 Ghz Pentium 4s with 6 GB of RAM to run Windows Vista?

        Those outrageous configurations never showed up. Nevertheless, it seems like Vista was still designed with them in mind.

        So in the end, the new machine Joe Six Pack buys *WILL* have to be better/faster/stronger, simply because the latest Windows-du-jour has tripled its hardware requirements for no apparent reason.

        OS makers will continue to make new versions on a regular basis, mostly because that's their business and they have to keep the cash flowing in. Also, there are security issues to fix (by adding additional layers of garbage over something that was initially broken by design), legal stuff (add whatever new DRM / Trusted Computing stupidity is the latest requirement voted in by the **AA lobby), and a lot of dubious features that only 0.1% of the user base will need (built-in tools to sort / upload photos, built-in tools to edit home-made movies, or whatever; modern OSes tend to get confused with distributions and go the Emacs way of bloat).
        All this results in newer OSes that take twice the horsepower to perform the exact same tasks as the older ones.

        And thus, each time Joe Six Pack changes his computer, he gets a newer one, which will obviously have the latest OS on it, and thus will *need* to have 4x the computing power. Just to continue hanging on some IM, sending e-mail, writing things, and browsing porn.

        • Modern OSes tend to get confused with distributions and go the Emacs way of bloat.

          Actually, the complete Emacs "operating system" takes up less than 75 MB, uncompressed and including all documentation and LISP source code. The main emacs package is just 25 MB uncompressed. By today's standards, that's positively tiny. Damn Small Linux claims to fit a complete OS in only 50 MB, but like many Live CDs, it "cheats" by storing everything in compressed form and decompressing it on the fly.

          • The main emacs package is just 25 MB uncompressed. By today's standards, that's positively tiny.

            I meant Emacs from the point of view of functionality. Initially, Emacs was supposed to be an editor with some extension capability.

            This extension capability has been abused over time, and now Emacs can be used as an e-mail client or a browser, features interactive chatbots, and has pretty much everything else, including probably a kitchen sink (indeed: there's a Nethack extension [nongnu.org] for Emacs, and Nethack does featur

      • Re:Waiting for... (Score:5, Interesting)

        by gaspyy ( 514539 ) on Monday October 15, 2007 @04:12AM (#20980041)
        I agree with the stagnation part. At work some of our laptops are more than 4 years old (May 2003) and they are still perfectly capable and working (P4 @ 2.8 GHz, 512 MB RAM, 60GB HDD). We even have two T30 Thinkpads that are just enough when traveling to browse, check email and write a doc.

        Regarding the second part (reinstalling XP) - you should really look at Acronis True Image - it's what we use.
        Basically, you install WinXP+patches and whatever programs you need once, make an image and store it on a DVD, the network, or a hidden partition on the HDD. At boot, you can press F11 to start Acronis instead of Windows from the hidden partition (it's a lightweight Linux distro) and you can restore your image in 5-10 minutes. Even if the image is 6 months old, you still need to download just a few patches and software updates (e.g. update from FF 2.0.0.0 to 2.0.0.7).
      • Re: (Score:3, Funny)

        by ivoras ( 455934 )
        In other words: Where's my flying car?!
      • by dave420 ( 699308 )
        If you did yum -y update; shutdown -r now; on a distro released a few years back it might take a bit longer than an hour, surely :)
      • Re: (Score:3, Informative)

        # yum -y update; shutdown -r now;

        Next time do "# yum -y update && shutdown -r now" instead; the && means that it will only run shutdown if yum reports successful completion, so if yum breaks you can see the errors. :D
    • Re: (Score:3, Insightful)

      by Anonymous Coward
      Hard drives are just getting to the point where a few of them in a RAID configured NAS can hold a decent sized DVD collection in uncompressed form. If HD-DVD/BluRay catch on, we'll need new drives like these in order to accomplish the same thing with the newer formats.

      As someone with close to 300 DVDs (yeah, yeah...I know, MPAA evil...but I try to buy as many of them used as I can), I'm going to wait until HD technology starts catching up with disc technology before upgrading to HD. So any breakthroughs tha
    • by Moraelin ( 679338 ) on Monday October 15, 2007 @05:52AM (#20980409) Journal
      Actually, the scary part is that I can easily see how someone will take it as an invitation to install more bloat on your hard drive, do things even less efficiently, etc.

      I started my programming experience almost directly with assembly. Well, I had about a year of BASIC on my parents' ZX-81 first. But that was a damn slow machine (80% or so of the CPU was busy just doing the screen refresh) and Sinclair BASIC was one of the slowest BASICs too. So with that and 1K RAM (you read that right: one kilobyte), you just couldn't do much, you know. So my dad took the Sink-Or-Swim approach and gave me a stack of Intel and Zilog manuals. Anyway, you had to be particularly thrifty on that machine, because its budget of CPU cycles and bytes makes your average wristwatch or fridge nowadays look like a supercomputer.

      I say that only to contrast it to the first time I saw a stacktrace (Java, obviously) of an exception in a particularly bloated Cocoon application running in WebSphere. If you printed it, it would run over more than two pages. There were layers upon layers upon layers that the flow had to go through, just to call a method which, here's the best part, didn't even do much. That nested call and all the extra code for reusability's sake, and checks, and some reflection thrown in for good measure, obviously took more time than the method code itself needed.

      It hurt. Looking at that stacktrace was enough to cause physical pain.

      Now I'm not necessarily saying you should throw Cocoon and J2EE away, obviously there are better ways to do that even with them. Like, for a start, make sure your EJB calls are coarse-grained so you don't go back and forth over RMI/IIOP just to check 1 flag.

      But how many people do?

      The second instance when it caused me pain is when I was testing a particularly bloated XML-based framework, and it took 1.1 seconds on a 2.26 GHz Pentium 4 just for a call to a method that did nothing at all. It just logged the call and returned. That's it. That's 2.5 _billion_ CPU cycles wasted just for a method call. That's more than 30 years worth of Moore's law. Worse yet, someone had used it between methods in the same program, because apparently going through XML layers is so much cooler than plain old method calls. A whole 30 years worth of Moore's Law wasted for the sake of a buzzword. The realization hurt. Literally.

      Again, I'm not saying throw XML away generally, though I would say: "bloody use it for what it was meant, not as a buzzword, and not internally between classes in the same program and indeed the same module." It just isn't a replacement for data objects (what Java calls "beans"), nor for a database, nor as just a buzzword to have on the resume.

      Each iteration of Moore's Law is taken as yet another invitation to write crappier code, with less skilled monkeys, and don't bother optimizing... or even designing it well in the first place. Why bother? The next generation of CPUs will run it anyway.

      And the same applies to RAM and HDD, more or less. I've seen more than one web application which had ballooned to several tens of megabytes (zipped!) by linking every framework in sight. One had 3 different versions of Xerces inside, and some classloader magic, just because it beat sorting out which module needs which version. Better yet, they were mostly just the GUI to an EJB-based application. They didn't actually _do_ more than display the results and accept the input in some forms. Tens of MB just for that.

      So now look on your hard drive, especially if you have Vista, and take a wild guess whether those huge executables and DLLs were absolutely needed, or are there mostly because RAM and HDD space are cheap?

      At this rate and given 4TB HDDs, how long until you'll install a word processor or spreadsheet off a full HD DVD?
  • So? (Score:4, Insightful)

    by Mr_eX9 ( 800448 ) on Monday October 15, 2007 @01:39AM (#20979411) Homepage
    I hope we won't be using hard drives in four years. Let's all pray for a breakthrough in solid-state storage.
    • by l0b0 ( 803611 )

      How about working for it instead of praying for it?

      Sincerely, an atheist.

    • Re: (Score:2, Informative)

      by tomee ( 792877 )
      Last I heard, the rate at which flash memory prices are falling is 70% a year. You can find 32GB 2.5-inch solid state disks for about $320 at the moment, so $10 per GB, and $40000 for 4TB. So:

      2007: $40000
      2008: $12000
      2009: $3600
      2010: $1080
      2011: $324

      If this works out, 2011 might be about the time solid state disks overtake hard disks.
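
      (For what it's worth, that table is just compounding the 70% yearly drop quoted above; a quick shell sketch, with the $40000 starting figure assumed from the post, reproduces it:)

      price=40000                          # assumed 2007 price for 4TB of flash, from above
      for year in 2008 2009 2010 2011; do
          price=$(( price * 30 / 100 ))    # a 70% drop leaves 30% of the previous year's price
          echo "$year: \$$price"
      done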
  • FTFA:
    "But GMR-based heads maxed out, and the industry replaced the technology in recent years with an entirely different kind of head. Yet researchers are predicting that technology will soon run into capacity problems, and now GMR is making a comeback as the next-generation successor."

    *Scotty sets down mouse, looks at keyboard and replies: "How quaint."*

    Having seen all of the referenced articles and links on my own, this just ties it all together nicely.

    On the downside, if you haven't been subjected or hunt
  • by Anonymous Coward on Monday October 15, 2007 @01:44AM (#20979445)
    30-50 metric nanometers is not as small as 30-to-50 *2^-30* meters, so you purchase one of these drives and they rip you off with a head bigger than the size you expect.
  • by jkrise ( 535370 ) on Monday October 15, 2007 @01:50AM (#20979481) Journal
    Trying to build an open source PACS system at a hospital I consult with. The need is basically for lots and lots of storage, without needing to access a DVD or tape. A typical MRI / CT scan can generate 1 GB of data, so with dozens of scans a day, and the need to store and access patient data for, say, 10 years, these drives will be really useful.

    A simple SATA RAID controller interfaced with 4 such drives can give me 12TB of cheap, fast storage. At 1TB per year, that should be good enough for my needs. H/w vendors currently recommend expensive SAN boxes, which I don't like... no useful value for the application at hand.
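
    A minimal sketch of how such a four-drive array could be assembled with Linux software RAID (mdadm) rather than a hardware card; the device names, filesystem, and mount point are placeholders, not part of the actual setup:

    # four drives in a RAID 5 set: usable space = 3 x drive size (e.g. 3 x 4TB = 12TB)
    mdadm --create /dev/md0 --level=5 --raid-devices=4 /dev/sdb /dev/sdc /dev/sdd /dev/sde
    mkfs.ext3 /dev/md0          # ext3 was the common choice at the time
    mount /dev/md0 /srv/pacs    # hypothetical mount point for the PACS image store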
    • I will say this: don't use a commodity RAID controller for something like that. They're pretty good for home use, but I would really recommend you spend a little and get a real RAID card. Unless you do a lot of tape backups, if that RAID unbinds you'll be in so many flavors of screwed that Baskin-Robbins will sue you for trademark infringement.
      • by FireFury03 ( 653718 ) <slashdot@nexTIGERusuk.org minus cat> on Monday October 15, 2007 @06:36AM (#20980599) Homepage
        ...get a real RAID card. Unless you do a lot of tape backups...

        I sincerely hope you do backups anyway. RAID is simply there to allow you to continue running a service under some specific failure conditions that would otherwise cause the service to be down whilst hardware is replaced and backups restored - it is not a substitute for backups, RAID and backups accomplish different jobs.

        Some examples of failure conditions where RAID won't save you but backups will:

        - Some monkey does rm -rf / (or some rogue bit of software buggers the file system).
        - The power supply blows up and sends a power spike to all the hard drives in your array (I've personally seen this happen to a business who didn't take backups because they believed RAID did the same job - they lost everything since all the drives in the array blew up).
        - The building bursts into flames and guts your server room.

        In all these conditions, having a regular off-site backup would save you whereas just using a RAID will not.
    • That's terrifying; you would trust that kind of data to a simple RAID 5 commodity card? A SAN is a must, with a disk jukebox backing it up.

      Sure, you can recover from 1 disk loss, but what about 2? Murphy is a cruel bastard who enjoys eating fools like you for breakfast.
      • by SamP2 ( 1097897 )
        Sure, you can recover from 1 disk loss, but what about 2?

        RAID 6 is your friend.
        • Except that's not what the OP had in mind. 4TB drives, 4 disks with a total overall capacity of 12TB. That's RAID 5.

          Even RAID 6 in this configuration is scary. I'd want a SAN, if for no other reason than the backend management. On top of that, you slam 16 drives in the bloody thing (minimum for this kind of data) and dedicate half of them as hot spares to a RAID 6 array. On top of this, you have a support contract with the vendor, so if a drive dies you have an exact replacement in under 24 hours. You dum
          • by jkrise ( 535370 )
            Except that's not what the OP had in mind. 4TB drives, 4 disks with a total overall capacity of 12TB. That's RAID 5.

            Actually, the setup includes an off-site Disaster recovery setup that will have identical storage size, in an external drive cage, attached to vanilla hardware. So in the event of a major crash, I just need to transport the DR box and rebuild the RAID.
          • Re: (Score:3, Insightful)

            Which CT scanner are we not going to get, in order to pay for all this? Or which MRI will we fail to pay for the last 25% of to pay for all this?

            Because, you see, you've just spent your budget on hardware that will likely never be used, gets you no visible day-to-day advantage, and still leaves you vulnerable to multiple simultaneous drive failures. (This is surprisingly likely: go read the Google paper on drive failure rates.)

            Instead, you use a second system with snapshot backups, possibly using a syste
          • by drsmithy ( 35869 )

            On top of the fact that you slam 16 drives in the bloody thing ( minimum for this kind of data ), and have half as hot spares to a raid6 array.

            Holy shit, dude. There's responsible redundancy, then there's paranoia, then there's overkill, then, far off in the distance, there's having half a shelf dedicated to hot spares.

            One hot spare per shelf is heaps. Consider a 7*750G RAID6 that suffers a disk failure. An array rebuild will take ca. 20 hours (assuming it's not offlined during the rebuild). Even a c

      • by norton_I ( 64015 )
        Where on there did he say he didn't have backups?

        Just because you can't be bothered to make a reliable system that actually meets the requirements at hand does not mean everyone needs to spend an order of magnitude more for features that provide no value for the problem.
    • Re: (Score:3, Interesting)

      by drsmithy ( 35869 )
      Being in (roughly) the same industry and situation, I can sympathise. Our setup for study archiving is a front-end "head" unit that receives data to a local 4-disk RAID10 (via a hardware 3ware card, for transparency). It is connected to a number of back-end "disk servers"[0] via GbE and iSCSI to present their disk space to the front end as block devices, which are then stitched together using LVM.

      Studies are archived daily, with an automated script simply carving an appropriately-sized LV out of the VG,

  • by webplay ( 903555 ) on Monday October 15, 2007 @01:53AM (#20979501)
    With the current market trends, the flash memory-based HDs should be cheap enough to replace magnetic hard drives in laptops by 2011 in most applications. They are already superior in access time, drive life, power use, and transfer speeds (see the FusionIO demo or MTRON drives).
    • But that fits a different need - the need for fast access times, low power, etc. This fits its own need - people that need extremely large amounts of storage space, no matter the access time or power usage tradeoffs. Also, while this'll be pretty expensive, keep in mind that SSD drives are still gonna be expensive as hell, and even assuming the price of SSD drives comes down, 500GB is still gonna cost a pretty penny, while normal mechanical HDDs at that size will probably be no more than $50 (since
  • by Zantetsuken ( 935350 ) on Monday October 15, 2007 @01:54AM (#20979511) Homepage
    I know most people think they don't need that much, but still, that's a helluva lot of porn!
  • by TwoBit ( 515585 ) on Monday October 15, 2007 @01:58AM (#20979541)
    I want more reliability. Over the last ten years of using hard drives, I have about a 50% failure rate.
    • by pla ( 258480 ) on Monday October 15, 2007 @02:20AM (#20979621) Journal
      I want more reliability. Over the last ten years of using hard drives, I have about a 50% failure rate.

      I see comments like this all the time, and really don't understand them.

      I have personally bought an average of one HDD per six months over the past decade, and, except for ones outright DOA, I have only had one fail, ever (and that after it had served for a good many years). And I include both DiamondMaxes and the legendary DeathStars in that list, both considered some of the most prone-to-failure out there.

      Considering my work environment, I can expand that sample to most likely 100+ HDDs; of those, only two have failed, both laptop drives.

      I have to suspect the people experiencing the flakiness of HDDs either fail to adequately cool them (I put ALL my HDDs loosely-packed in 5.25" bays with a front-mounted 120mm low-RPM fan cooling them) or somehow subject them to mechanical stresses not intended (car PC? portable gaming rig? screws tight against the drive's board?).
      • by G Fab ( 1142219 )
        Look man, you did the right thing. Fans for each HDD probably saved you a lot of money. But the typical consumer wants a hard drive that is durable enough that it can be abused a bit more. That's all the parent means.

        I don't have time to cool all my hard drives. In fact, I'm sure the one in this computer is covered in dust. It's a deskstar, and it's been making odd rattles for a while, so I know this system is headed south. Could I have babied it to where that wasn't going to happen? Yeah, but I don'
        • by pla ( 258480 )
          But the typical consumer wants a hard drive that is durable enough that it can be abused a bit more.

          Fair enough - I can accept that interpretation... But ignoring the reality that HDDs have rapidly moving parts hovering mere nanometers apart that must never touch, combined with a high sensitivity to heat, well, that just asks for trouble. Ideally, we'd have better. Practically, we have what we have.



          I don't have time to cool all my hard drives.

          I didn't mean to imply that I have some complicated setup... Ju
      • by Aladrin ( 926209 )
        It depends on how you use the drive. I've killed a few drives in my time. 1 was from kicking the computer while it was on (I was young and stupid.) The other 2 died from excessive use. They were being read and written constantly, 24 hours a day, 7 days a week for about a year. I know it was constant because it was stuff I was grabbing from the net, and a lot of it ended up deleted before it was ever fully looked through. (I had my reasons.)

        This kind of constant use is apparently too hard on consumer h
    • by Wildclaw ( 15718 )
      Hard drives will always crash eventually.

      What you really want is a hard drive that is big enough to contain all your data, while cheap enough that you can buy a few without going over budget. That way it is easier to make backups, as well as to implement a redundant RAID.
      • Re: (Score:3, Funny)

        by fractoid ( 1076465 )

        ...as well as implementing a redundant RAID.
        Is that like an ATM Machine? Or a PIN Number? ;)
        • by Wildclaw ( 15718 )
          My bad. I was specifically trying to not include RAID 0 which doesn't provide any fault tolerance.

          • I thought of that just as I clicked the submit button... but even if I'd been quicker, you don't expect me to pass up a chance to be pedantic, do you? :) RAID0 is evil anyway.
  • by pla ( 258480 ) on Monday October 15, 2007 @02:09AM (#20979583) Journal
    This will pave the way for quadrupling today's storage limits to 4 terabytes for desktop computers and 1 terabyte on laptops in 2011.

    Prior to the rise of perpendicular recording [wikipedia.org], we had cheap and plentiful 200-400GB HDDs using plain ol' longitudinal recording. Suddenly PMR hits the market, promising 10x the storage density at up to 1Tb/in^2 (which Seagate claims they actually achieve), and two years later we have only two real models (with a few variations for SATA/PATA) of 1TB drives available.

    Call me crazy, but a few really trivial calculations show that at 6.25 in^2 (of usable area) per platter surface, times two surfaces per platter, times three platters, we should have, using today's technology, 4.5TB (note the change in case of the "B", no confusing units here) 3.5" HDDs. (Rough arithmetic below.)

    So forgive me for not wetting my pants in excitement about an "announcement" that something realistically achievable today won't actually be available for another half a decade.
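
    The back-of-the-envelope version of that claim, assuming the 1Tb/in^2 figure quoted above and ignoring formatting overhead:

    # areal density (Tbit/in^2) x usable area per surface (in^2) x surfaces per platter x platters / 8 bits per byte
    echo "scale=2; 1 * 6.25 * 2 * 3 / 8" | bc    # => 4.68 decimal TB, i.e. roughly the 4.5TB figure quoted above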
  • The bigger problem (Score:5, Interesting)

    by SamP2 ( 1097897 ) on Monday October 15, 2007 @02:17AM (#20979605)
    The real problem is not the lack of space but the systematic, chronic inability of the industry and users alike (but especially the industry) to properly manage their files.

    Yes, there are some cases where 4TB truly isn't enough without the problem being poor data management (large datacenter, huge DVD-quality media collection, etc). But far too often we see the reason for more space being poorly managed mail servers, tons of WIP that has not been properly archived or disposed of, huge amounts of unhandled spam, work-related casual conversations that really don't need to be stored after the work they relate to has been completed, outdated and obsolete software not being uninstalled, inflated registry (or any other overhead data) that keeps being backed up and restored without any cleanup involved...

    A lot of people, when challenged with the problem of this vast array of useless junk data will just respond "well we have space, and if we run out we can always buy more, and the purchase price is way cheaper than the manhours needed to clean up this mess, so why bother". Another common excuse is "it doesn't bother me, so why not keep it just in the potential case I'll ever need it again, even if the chance is extremely small".

    It does not occur to these people that proper data management is an extremely important procedure, and must be ingrained in the business process. Much the same way you clean up physical garbage, remove obsolete physical equipment, empty the contents of that blue recycle bin under your desk, and do it all on a regular basis to keep the garbage from getting out of hand. Trash not worth keeping in real life does not become valuable when stored online, even if it can be stored for free or cheaper than the disposal price.

    Properly disposing of data as a business process will take time, but this time will be saved many times over when people don't have to dig through junk to find what they need, when important things are not buried in crap, when all data worth storing is clean and polished and free of rust, when your OS is not cluttered up by crap processes or temporary files, when your DBE doesn't have to go through zillions of rows of crap stored in the database to find a single row, when you do the cleanup as-you-go, rather than waiting for things to be completely out of hand and then doing a half-assed job because by that point it is really hard to tell apart the good from the junk.

    The problem is spiraling - the longer people don't properly clean up data, the harder it is to clean it, especially as files grow larger and more complex as hardware and applications evolve. In turn, it motivates people to just invest in extra drive space, processing power, memory, etc, because by that time it's cheaper than the cleanup. And of course, once the resources have been invested into, they are filled with even more crap until they are full too.

    But the biggest problem of poor data management is actually not technical, it's business-related. As we are faced with an increasing information overload, it is very easy to make poor decisions based on data that is not necessarily wrong, but is outdated, matched with incompatible other data, or just not put in the right perspective. The whole "data warehousing" principle absolutely REQUIRES proper and timely maintenance and cleanup of data. This is so important that (and this has been proven over and over again) large corporations with proper data management gain a substantial strategic advantage over those who don't.

    It's not just about a little slower response time, or some more work to find what you need on the server. It's about right business decisions vs. wrong business decisions. And it's also about not being taken advantage of - contractors and business partners can easily manipulate data to present it in the light favorable to them, and if you are a private business, this kind of crap can make you bankrupt. Of course, it happens day after day in the government with the taxpayers footing the bill, but that's another story altog
    • by svunt ( 916464 ) on Monday October 15, 2007 @02:53AM (#20979755) Homepage Journal
      You seem to be approaching the need for big disks from a purely sysadmin point of view. In my case, and the case of a lot of friends/family, massive media collections aren't the exception, they're the rule. Between backups, downloads and plain old piracy, a lot of individuals need enormous data storage, as do film makers, musicians, artists etc. The sort of issues *you* face make it clear where your priorities lie, but don't assume that your experience is definitive.

      I for one am getting sick of having to navigate between endless stacks of DVD-spindles every time I'm in a house!

      • Even without piracy, this occurs. I've probably got 100 commercial DVD's in my house, scattered in various boxes and shelves of particular collectible sets. Add the open source CD and DVD's to that, and it's another 100 DVD's worth of such media collected over the years. It would be nice to have the antique RedHat 6.2 installation media online for historical reference, since I still work with tools that haven't evolved much since then, and it's become a real problem to find online.

        So with 200 DVD's, at roughl
    • by Kjella ( 173770 ) on Monday October 15, 2007 @03:01AM (#20979811) Homepage
      Properly disposing of data as a business process will take time, but this time will be saved many times over when people don't have to dig through junk to find what they need, when important things are not buried in crap, when all data worth storing is clean and polished and free of rust,

      I'm sorry, but this is just fantasy world 101. I almost never have to look through old mail, but when I do it's because some clients are trying to dredge up something that's just not how it happened. Often when I do, it's important that I have all the "useless" mails as well, so I can say with confidence that "No, you just brought this up two months before the project deadline and it wasn't in any of the workshop summaries [which are in project directories, not mail] before that either."

      When I do, it's far more efficient to search up what I need rather than going over old junk - what you're saying is something which would imply that the Internet is useless since it's full of so much redundant, unorganized information. It's quite simply not true, and even though you should extract vital bits to organized systems, keeping the primary source around is very useful.

      Extracting experience from current communication to improve business systems (or for that matter, technical routines) should be an ongoing process - it's vital going forward. Going back to old junk to try to figure out what's deletable just to run a "clean ship" is just a big timesink and waste of money. Maybe you'd have an argument if there was a good system not being used because it's all kept as unorganized mailboxes. In my experience, usually the problem is there's no such system, and doing a clean-up would do nothing to change that.
    • by l0b0 ( 803611 )
      Very good points. Also, this is one place where open standards are great: Even if the file format is long obsolete, there's a good chance there are modern tools available to read them, and you can create your own scripts to extract data automatically for review.
    • I think Google has DEFINITIVELY PROVEN that you can find a needle in a haystack and software is getting better at it all the time. Every few months or so, I select everything on my desktop, move it into a folder labeled with the current date, and stash that folder into the "Desktop Junk" parent folder of my "Archive" directory. I've got a couple GB of junk over the last few years and I can not count the number of times a 2 second, indexed search located something super useful in that directory.

      Proper data o
  • by ryanisflyboy ( 202507 ) * on Monday October 15, 2007 @02:17AM (#20979607) Homepage Journal
    Okay, that's great. Hard drives will get bigger. The problem is they aren't getting any faster. I'm having a hard time getting RAID 6 to work well with my 1TB drives (think rebuild times; RAID 5 will be on its way out). How do I manage a RAID array of 4TB disks that still only give me about 60MB/s real-world write performance? So I put 12 in a RAID 6 and end up with 40TB. How many days will it take to rebuild a failed drive under real-world workloads? (Rough numbers below.) Capacity is great - but at some point we are all going to wake up and start begging for faster speeds as well. I think hybrid drives might have a shot, 1TB of flash with 3TB of disk might be the right match - but you're still waiting forever on rebuilds (and need a policy to manage it).

    I imagine some of you out there, like myself, are starting to see problems with data integrity as the mountain of data you are sitting on climbs into the petabytes. All I can say is: bit flips suck! Do you KNOW your data is intact? Do you REALLY believe your dozens of 750GB-1TB SATA drives are keeping your data safe? Do you think your RAID card knows what to do if your parity doesn't match on read - does it even CHECK? I hope your backup didn't copy over the silent corruption. I further hope you have the several days it will take to copy your data back over to your super big - super slow - hard drive.

    Is anyone thinking optical? Or how about just straight flash? I have a whole stack of 2GB USB flash drives - should I put them in a RAID array? ;-)
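
    For a rough sense of the rebuild problem raised above: rewriting a whole 4TB replacement drive at the quoted 60MB/s sustained rate already takes the better part of a day, before any foreground I/O slows it down further:

    # 4 TB = 4,000,000 MB; at 60 MB/s this is the minimum time just to rewrite one member
    echo "scale=1; 4000000 / 60 / 3600" | bc    # => ~18.5 hours, best case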
    • Re: (Score:2, Informative)

      by Anonymous Coward
      Let me get this straight - you're complaining about write times and you recommend we use flash? Flash has access times several orders of magnitude better than HD. However, write and read performance is about half from what I can remember.

      Also, if Hitachi manages to get 4 TB onto a single or 2 platter arrangement, data density will be much higher now which should mean quite a bump in read/write speed (about 4 times, no?).
    • Re: (Score:3, Informative)

      You don't. You build 4 or 8 sets of smaller RAID arrays, and use the others for snapshotted backup. This makes doing a straight rebuild/reformat/restore vastly faster and keeps the recovery times down to a quarter of the time of using a single array. It also lets you re-allocate the smaller arrays, or upgrade them, over time.
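
      One cheap way to get the "snapshotted backup" part of that scheme is rsync with hard-linked snapshots; the mount points and snapshot layout here are only illustrative:

      # copy the live array into a dated snapshot, hard-linking unchanged files against the previous one
      today=$(date +%F)
      rsync -a --link-dest=/mnt/backup-array/last /mnt/live-array/ /mnt/backup-array/$today/
      ln -sfn /mnt/backup-array/$today /mnt/backup-array/last   # point "last" at the newest snapshot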
    • Re: (Score:3, Informative)

      by slashflood ( 697891 )

      I have a whole stack of 2GB USB flash drives - should I put them in a RAID array?
      Why [joensuu.fi] not? [pixelfeuer.de]
    • Re: (Score:3, Informative)

      Okay, that's great. Hard drives will get bigger. The problem is they aren't getting any faster.

      Not true. Media transfer speed increases as the data capacity increases (though less than linearly) and seek "rate" improves in terms of number of tracks the head passes over in the same time. What doesn't increase much at all is rotation speed, which means that average seek time gets worse and worse over time in relation to transfer speed. It's still very fast though, currently about 6-7 ms for commodity drives. If you're unhappy with the overall performance of your disk system, it isn't the fault of

  • Ugh, no. (Score:2, Insightful)

    by JewGold ( 924683 )
    Whenever I read about advancements in storage space, what comes to mind for me is that now there will be NO incentive for companies to ever throw away information they have about you. In years past, physical storage limits--and later data storage limits--have caused companies (and the government) to routinely purge data. With hard drives getting bigger at a rate faster than they can fill them, why expend the effort to get rid of old data? Why would they spend the manhours to delete old data, when it's cheaper jus
    • by Fizzl ( 209397 )
      Yes, but consider the amount of porn you can store!
      I think we can all agree that the benefits clearly outweigh the disadvantages.
  • More About 2TMR head (Score:3, Interesting)

    by vivekg ( 795441 ) on Monday October 15, 2007 @05:00AM (#20980253) Homepage Journal
    1. CPP-GMR: As an alternative to existing TMR heads, CPP-GMR head technology has a lower electrical resistance level, due to its reliance on metallic rather than tunneling conductance, and is thus suited to high-speed operation and scaling to small dimensions.

    2. TMR head (Tunnel Magneto-Resistance head): A tunnel magneto-resistance device is composed of a three-layer structure of an insulating film sandwiched between ferromagnetic films. The change in current resistance which occurs when the magnetization direction of the upper and lower ferromagnetic layers changes (parallel or anti-parallel) is known as the TMR effect, and the ratio of electrical resistance between the two states is known as the magneto-resistance ratio.

    Source: Official Press Release [hitachigst.com]
  • Too bad you won't be able to import them in the US! :)
  • Now I can watch High Def. streams in boring powerpoint slides.
