Data Storage Capacity Mostly Wasted In Data Center

Lucas123 writes "Even after the introduction of technologies such as thin provisioning, capacity reclamation, and storage monitoring and reporting software, 60% to 70% of data capacity remains unused in data centers because of overprovisioning for applications and misconfigured storage systems. While the price of storage resource management software can be high, the cost of wasted storage is even higher: 100TB works out to roughly $1 million once human resources, floor space, and electricity are figured in. 'It's a bit of a paradox. Users don't seem to be willing to spend the money to see what they have,' said Andrew Reichman, an analyst at Forrester Research."
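
To put the summary's numbers together, here is a quick back-of-the-envelope sketch; the $1 million per 100TB fully loaded cost and the 60% to 70% waste rate come from the article, and everything else is plain arithmetic.

```python
# Rough cost-of-waste arithmetic using only the figures quoted in the summary.
cost_per_tb = 1_000_000 / 100          # ~$10,000 per TB fully loaded (article's figure)
capacity_tb = 100                      # example footprint, matching the summary's 100TB
for unused_fraction in (0.60, 0.70):   # the 60-70% unused range cited above
    wasted_dollars = capacity_tb * unused_fraction * cost_per_tb
    print(f"{unused_fraction:.0%} unused of {capacity_tb} TB -> ~${wasted_dollars:,.0f} tied up")
```

In other words, the article's own figures put the idle capacity at roughly $600,000 to $700,000 for every 100TB deployed.
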
  • by Anonymous Coward on Wednesday July 28, 2010 @01:22PM (#33058504)

    If the numbers ran the other way, the story would be generating much gnashing of teeth about the evil corporations and the corner cutting that was bringing down our pink unicorns.

    Can't win for losing around here.

  • Overprovisioning (Score:4, Interesting)

    by shoppa ( 464619 ) on Wednesday July 28, 2010 @01:23PM (#33058516)

    It's so easy to over-provision. Hardware is cheap and if you don't ask for more than you think you need, you may end up (especially after the app becomes popular, gasp!) needing more than you thought at first.

    It's like two kids fighting over a pie. Mom comes in, and kid #1 says "I think we should split it equally". Kid #2 says "I want it all". Mom listens to both sides and the kid who wanted his fair share only gets one quarter of the pie, while the kid who wanted it all gets three quarters. That's why you have to ask for more than you fairly need. It happens not just at the hardware purchase end but all the way up the pole. And you better spend the money you asked for or you're gonna lose it, too.

  • Disk space is free (Score:5, Interesting)

    by amorsen ( 7485 ) <benny+slashdot@amorsen.dk> on Wednesday July 28, 2010 @01:27PM (#33058574)

    Who cares if you leave disks only 10% full? To get rid of the minimum of two disks per server you need to boot from SAN, and disk space in the SAN is often 10x the cost of standard SAS disks, especially if the server could have made do with the two built-in disks and saved the cost of an FC card plus an FC switch port.

    I/Os per second, on the other hand, cost real money, so it is a waste to leave 15k and SSD disks idle. Being only a quarter full does not matter if they are I/O saturated; the rest of the capacity is just wasted, but then again you often cannot buy a disk a quarter of the size with the same I/Os per second.

  • by Anonymous Coward on Wednesday July 28, 2010 @01:38PM (#33058752)

    If you RTFA (and admittedly, it is not very clear), the article tries to make the point that you don't need all of this storage capacity to be live. Instead, you've got a bunch of storage pools or machines sitting idle rather than actually doing something. What the article is trying to say is that using provisioning tools to spin up storage pools or servers as they are needed (as demand grows) is a much better solution than just leaving them running. Obviously peak load will cause issues, but you can configure your provisioning tools to start bringing up capacity at lighter loads or at specific times of day. The point still stands that most data centers have idling machines that could just as easily be shut off most of the time and brought up automatically when needed; it's just that most shops don't use these tools, despite the savings in electricity, wear, and cooling.

    The article muddies the issue by starting with the lack of monitoring tools that leads to overprovisioning and ending with a discussion of how to make storage use more efficient (thin provisioning). The thing is, thin provisioning only works when you have the extra capacity on hand, but it isn't live until you need it. You still need to overprovision, but you aren't running all those resources idle at once just in case.
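
As a rough illustration of the thin-provisioning idea described in the comment above, here is a toy model; it is not any vendor's API, and the pool size, growth threshold, expansion step, and write pattern are invented for the example.

```python
# Toy model of thin provisioning: volumes are promised more capacity than is
# physically backed; physical extents are only brought online when real usage
# approaches what is currently live, instead of keeping the full allocation
# spinning from day one.
class ThinPool:
    def __init__(self, physical_gb, expand_step_gb=500, threshold=0.8):
        self.physical_gb = physical_gb        # capacity actually live right now
        self.used_gb = 0                      # blocks actually written
        self.expand_step_gb = expand_step_gb  # how much to bring online at a time
        self.threshold = threshold            # expand when usage crosses this fraction

    def write(self, gb):
        self.used_gb += gb
        while self.used_gb > self.physical_gb * self.threshold:
            self.physical_gb += self.expand_step_gb   # "spin up" more backing storage
            print(f"expanding pool to {self.physical_gb} GB (used: {self.used_gb} GB)")

pool = ThinPool(physical_gb=1000)        # only 1 TB live, whatever was promised
for month in range(12):
    pool.write(150)                      # hypothetical monthly growth actually landing
```

The point of the sketch is only that the capacity promised to applications and the capacity that is actually live (drawing power and cooling) are decoupled.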

  • by eldavojohn ( 898314 ) * <eldavojohn@gma[ ]com ['il.' in gap]> on Wednesday July 28, 2010 @01:44PM (#33058842) Journal

    Who cares if you leave disks only 10% full? To get rid of the minimum of two disks per server you need to boot from SAN, and disk space in the SAN is often 10x the cost of standard SAS disks, especially if the server could have made do with the two built-in disks and saved the cost of an FC card plus an FC switch port.

    I/Os per second, on the other hand, cost real money, so it is a waste to leave 15k and SSD disks idle. Being only a quarter full does not matter if they are I/O saturated; the rest of the capacity is just wasted, but then again you often cannot buy a disk a quarter of the size with the same I/Os per second.

    I don't know too much about what you just said, but I do know that the Linux images I get at work are virtual machines running a free Linux distribution. I can request any size I want, but my databases often grow, and resizing a partition through our provisioner is very expensive. So what do we do? We estimate how much space our web apps take up per month and then request space for 10 years out, because a resize is so damned expensive (roughly the estimate sketched after this comment). Those sizes are usually pretty small anyway if you're building databases. Then, when the provisioner's dashboard shows space getting low, we notify our managers and re-assess the application. Is it getting unexpectedly popular, or was it bad estimation from the beginning?

    I don't know if I should be bothering with the hardware level of things. I sure do like it this way, even though it's an expensive price for the project; the payment stays inside the company anyway. It's internal to the company, so we're all using some nebulous group of actual machines and RAIDs to produce a massive cloud of smaller servers as images. There are some downsides and a bit of overhead to pay for virtualization, but I thought everyone had moved to this model ...
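
A minimal sketch of the sizing estimate described in the comment above; the growth rate, headroom factor, and rounding step are hypothetical, and the point is simply that when resizes are expensive you provision for the whole horizon up front.

```python
# Back-of-the-envelope provisioning when partition resizes are expensive:
# estimate monthly growth, pick a horizon, add headroom, round up.
import math

def provision_gb(monthly_growth_gb, years, headroom=1.25, round_to_gb=50):
    """Size a volume for the whole horizon so it never needs a resize."""
    needed = monthly_growth_gb * 12 * years * headroom
    return math.ceil(needed / round_to_gb) * round_to_gb

# e.g. a database growing ~2 GB/month, provisioned 10 years out as in the comment
print(provision_gb(monthly_growth_gb=2, years=10))   # -> 300 GB requested up front
```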

  • by alen ( 225700 ) on Wednesday July 28, 2010 @01:45PM (#33058852)

    time to go and buy up all kinds of expensive software to tell us something or other

    it's almost like the DR consultants who say we need to spend a fortune on a DR site in case a nuclear bomb goes off and we need to run the business from 100 miles away. i'll be 2000 miles away, living with mom again in the middle of nowhere and making sure my family is safe, not going to some DR site that's going to close anyway because half of NYC will go bankrupt in the depression after a WMD attack.

  • by bobcat7677 ( 561727 ) on Wednesday July 28, 2010 @01:48PM (#33058878) Homepage

    Parent has an excellent point. Utilization is not always about how full the disk is, especially in a data center where there are frequently large database operations requiring extreme numbers of IOPS. In the past, the answer was to throw "more spindles" at the problem. You could theoretically end up with a 20GB database spread across 40 of the typical 73GB SAS disks (about 1.5TB of usable space) just to reach the IOPS needed to handle heavy update/insert/read operations. A huge waste of space, but it was the only way to do it with spinning disks.

    SSDs can of course solve the problem, but most SAN vendors are still charging insane prices for what meager SSD options they offer, and some vendors don't offer SSD options at all yet. Then you can end up on the other end of the scale, having to buy more IOPS capacity than you need just to get enough SSD space for your data. Adaptec has some cool technology for "hybrid" arrays consisting of both SSDs and spindle disks in the same array (I have heard the latest versions of Solaris can do this with ZFS too). But the applications for hybrid arrays are somewhat limited, because write performance still sucks once any available write cache is saturated (and especially if the controller or software array has no cache).
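
To make the spindle math in the comment above concrete, here is a small sketch; the 73GB disks and the 20GB database follow the comment, the mirrored layout is inferred from its ~1.5TB usable figure, and the per-spindle IOPS and the target workload are assumptions.

```python
# Sizing an array when IOPS, not capacity, is the binding constraint.
import math

db_size_gb       = 20     # the database from the comment
disk_size_gb     = 73     # typical small SAS disk of the era (from the comment)
iops_per_spindle = 180    # rough rule of thumb for a 15k drive (assumption)
target_iops      = 7200   # hypothetical heavy update/insert/read workload
mirror_factor    = 2      # mirroring (RAID 10) halves usable capacity

disks_for_capacity = math.ceil(db_size_gb * mirror_factor / disk_size_gb)
disks_for_iops     = math.ceil(target_iops / iops_per_spindle)
disks              = max(disks_for_capacity, disks_for_iops)

usable_tb = disks * disk_size_gb / mirror_factor / 1000
print(f"{disks} spindles needed for IOPS, leaving ~{usable_tb:.1f} TB usable "
      f"for a {db_size_gb} GB database")
```

With these assumed numbers the IOPS requirement, not the 20GB of data, dictates roughly 40 spindles and ~1.5TB of mostly empty capacity, which is exactly the kind of "waste" the comment describes.
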
  • No... (Score:3, Interesting)

    by rickb928 ( 945187 ) on Wednesday July 28, 2010 @02:11PM (#33059242) Homepage Journal

    "It's a bit of a paradox. Users don't seem to be willing to spend the money to see what they have,"

    I think he meant users don't seem willing to spend the money to MANAGE what they have.

    As many have pointed out, you need 'excess' capacity to avoid failing for unusual or unexpected processes. How often has the DBA team asked for a copy of a database? And when that file is a substantial portion of storage on a volume, woopsie, out of space messages can happen. Of course they should be copying it to a non-production volume. Mistakes happen. Having a spare TB of space means never having to say 'you're sorry'.

    Aside from the obvious problems of keeping volumes too low on free space, there was a time when you could recover deleted files. Too little free space pretty much guarantees you won't be recovering deleted files much older than, sometimes, 15 minutes ago. In the old days, NetWare servers would let you recover anything not overwritten. I saved users from file deletions over the span of YEARS, in those halcyon days when storage became relatively cheap and a small office server could never fill a 120MB array. Those days are gone, but without free space, recovery is futile, even over the span of a week. Windows servers, of course, present greater challenges.

    'Online' backups rely on delta files or some other scheme that involves either duplicating a file so it can be written intact, or saving changes so they can be rolled in after the process. More free space here means you actually get the backup to complete. Not wasted space at all.

    Many of the SANs I've had the pleasure of working with had largely poor management implementations. Trying to manage dynamic volumes and overcommits had to wait for Microsoft to get its act together. Linux had a small lead here, but unless your SAN lets you do automatic allocation and volume expansion, you might as well instrument the server and use SNMP to warn you when volume space runs low, and be prepared for the nighttime alerts (a bare-bones version of that check is sketched after this comment). Does your SAN let you increase volume space when free space runs low, and then reclaim it later when free space exceeds a threshold? Do you get this for less than six figures? Seven? I don't know; I've been blessed with not having to do SAN management for about five years. I sleep much better, thanks.

    Free space is precisely like empty parking lots. When business picks up, the lot is full. This is good.
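
A bare-bones version of the "instrument the server and warn on low free space" fallback mentioned in the comment above; it uses plain local checks from Python's standard library rather than SNMP, and the mount points and threshold are placeholders.

```python
# Minimal free-space watchdog: the poor man's version of SAN-side alerting.
# Polls each volume and complains when free space drops below a threshold.
import shutil

VOLUMES = ["/var/lib/mysql", "/srv/data"]   # placeholder mount points
MIN_FREE_FRACTION = 0.15                    # alert below 15% free (arbitrary)

def check_volumes():
    for path in VOLUMES:
        usage = shutil.disk_usage(path)
        free_fraction = usage.free / usage.total
        if free_fraction < MIN_FREE_FRACTION:
            # In practice this would page someone or send an SNMP trap.
            print(f"WARNING: {path} down to {free_fraction:.0%} free")

if __name__ == "__main__":
    check_volumes()
```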

  • Re:Mod parent up (Score:2, Interesting)

    by minorproblem ( 891991 ) on Thursday July 29, 2010 @12:08AM (#33065040)

    I've seen worse. At my company they moved CAD software management to the drafters, then broke up the drafting department and assigned each drafter to a team. I am an engineer and I sit near the IT department. I feel sorry for the poor buggers: now not only do they have to run around like headless chooks, but so do the CAD drafters, because load levelling used to be done by a head drafter allocating work. Now it's managers running around asking other managers whether they can "borrow" their drafter, we have different people running different versions, and to sum it up, it's hell to watch.

    And the only reason they implemented such a scheme was that accounting told them it would save money... So instead of having 8 drafters for the whole company we now have 12 (one for each project). Sometimes the world doesn't work on numbers alone!

"Engineering without management is art." -- Jeff Johnson

Working...