Dell Says 90% of Recorded Business Data Is Never Read 224

Posted by timothy
from the sounds-lowball-no-matter-the-methodology dept.
Barence writes "According to a Dell briefing given to PC Pro, 90% of company data is written once and never read again. If Dell's observation about dead weight is right, then it could easily turn out that splitting your data between live and old, fast and slow, work-in-progress versus archive, will become the dominant way to price and specify your servers and network architectures in the future. 'The only remaining question will then be: why on earth did we squander so much money by not thinking this way until now?'" As the writer points out, the "90 percent" figure is ambiguous, to put it lightly.
  • by Hognoxious (631665) on Saturday July 10, 2010 @07:07AM (#32859370) Homepage Journal

    90% - just like the percentage of statistics that are made up on the spot.

    • by dov_0 (1438253) on Saturday July 10, 2010 @08:31AM (#32859728)
      Or is Dell about to make a press release about faulty storage in their servers resulting in about 90% data loss?
      • by espiesp (1251084) on Saturday July 10, 2010 @09:30AM (#32860008)

        Or having developed a new memory technology.

        "Dell releases a new drive based on their patented WORN architecture. Because this device forgoes the need to read your data they can be made lighter and faster and more power efficient than even the latest SSD drive technology."

        • I have an Outlook Folder on my work machine, labeled "Things I Will Never Read". Last count, there were 4,700+ unread items in that location...

          Of course, they are available for CYA through the search capability.

  • Which 90% ? (Score:5, Insightful)

    by mbone (558574) on Saturday July 10, 2010 @07:10AM (#32859380)

    I could believe the 90% number. There is plenty of data sitting around in case it is needed. Some of it will be needed. Much of it won't be. How do you predict which is which?

    • Re:Which 90% ? (Score:5, Insightful)

      by eldavojohn (898314) * <eldavojohn AT gmail DOT com> on Saturday July 10, 2010 @07:15AM (#32859404) Journal

      I could believe the 90% number. There is plenty of data sitting around in case it is needed. Some of it will be needed. Much of it won't be. How do you predict which is which?

      Yeah, as someone who has implemented a few auditing solutions where I work, I must confess that it seems that 99% of the data we archive is never looked at again. A lot of it is due to policies and is only used after something goes dreadfully wrong. If they are well thought out, the metrics can be collected as the data is written instead of needing to search across the data.

      I think their "90% dead-weight rule" is really a misnomer as you could probably claim that 90% of Google's indexing is never read but we all know that it's the potential that data holds that makes it so valuable and necessary. If Google knew every future possible search then they could delete the data they will never use ... but how do they know they will never use it? How do I know that the auditing data will never have a use--by new metric or incident investigation? The truth is simply that you don't.

      • Re: (Score:3, Insightful)

        by sco08y (615665)

        I think their "90% dead-weight rule" is really a misnomer as you could probably claim that 90% of Google's indexing is never read but we all know that it's the potential that data holds that makes it so valuable and necessary.

        Another problem is figuring out _why_ data isn't used before archiving it. Is it not useful, or are the tools not in place to use it?

        If companies decide that the x% least used data will be shoved away in the attic, then "x% of data isn't useful" becomes a self-fulfilling prophecy.

        • Re: (Score:3, Interesting)

          by BrokenHalo (565198)
          Another problem is figuring out _why_ data isn't used before archiving it.

          The problem is that so much data is made available without anyone ever considering how useful it might be. At least we've come some way in the last 20 years:

          Back in the '70s and '80s I worked at many sites where mainframe ops used to clear tonnes of fanfold paper every day. This is why we had separate printer rooms: a bank of 6 or 8 barrel-printers belting out 132 columns of text at 1800 lines/minute created sacksful of dust.

          Mo
      • by shentino (1139071)

        Just like 90 percent of the time you don't need to file an insurance claim, but when you do, you really do need it.

        It's just insurance.

        Sorta like how we have a big military that is spending more time in training than actual combat.

        • Re: (Score:3, Insightful)

          by jeffmeden (135043)

          Bingo. The first thing I thought of is "sure 90% goes to waste but you don't know *which* 90% until after the fact"...

          Is Dell working on a patent to send information back from the future about what stored data is never used again? I just hope they don't stumble on the Slashdot comment archives, the future-tubes would be clogged indefinitely.

          • Re: (Score:3, Insightful)

            by Cylix (55374) *

            I'm afraid they will run into issues if they do. There are already storage providers that will determine what data you are accessing frequently and move said data chunk to the faster storage area. Conversely it will move less frequently accessed data to the slower and cheaper bulk disks.

            It's a nifty optimization/shuffle technique that allows you to mix ssd, sas and sata disks for their various needs. The best part is it is rather auto-magic.

            We used to do something similar in a very manual process by keeping
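The "auto-magic" tiering described above (frequently accessed data promoted to faster disks) can be sketched as a simple promotion policy. This is a toy illustration only: the tier names and access-count thresholds are invented for the example and don't reflect any vendor's actual algorithm.

```python
# Minimal sketch of access-frequency storage tiering.
# Thresholds and tier names are illustrative assumptions.
from collections import Counter

class TieringStore:
    def __init__(self):
        self.tier_of = {}       # block id -> current tier
        self.hits = Counter()   # access count per block

    def write(self, block):
        # New data starts on the cheap/slow tier until it proves "hot".
        self.tier_of[block] = "sata"

    def read(self, block):
        self.hits[block] += 1
        self._maybe_promote(block)
        return self.tier_of[block]

    def _maybe_promote(self, block):
        # Promote blocks as they cross (made-up) access thresholds.
        if self.hits[block] >= 10:
            self.tier_of[block] = "ssd"
        elif self.hits[block] >= 3:
            self.tier_of[block] = "sas"
```

A real implementation would also demote data that cools off and would work on extents rather than whole files, but the shape of the policy is the same.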

            • Re: (Score:2, Informative)

              by BrokenHalo (565198)
              We used to do something similar in a very manual process by keeping the most frequently accessed Oracle data on the leading edge of the disk platters.

              I haven't really kept up to date with HDD technology in recent years, but there was a time when some operating systems (Data General's AOS/VS, for example) allowed you to keep your most frequently accessed files (or even records in a database) around the middle of the disk platter, on the principle that the heads spent more time on average around the middle t
      • it's the potential that data holds that makes it so valuable and necessary

        What matters is the cost/benefit ratio.

        The potential for the data being valuable may be very low, but the cost of storing it is going down all the time. Disk space today is a dime a gigabyte, so let's keep it just in case.

        • by vivian (156520)

          The real cost is not storing it - but rather the cost in recording all that info in the first place. Someone has to type in all that data to start with, and possibly someone else has to at least glance at the resulting reams of reports that are produced from it.

          It is all too tempting to create database apps to record all sorts of information "just in case", but more often than not all you end up doing is making the system more complex than it has to be, and more time consuming in maintenance of both the ap

          • Re: (Score:3, Interesting)

            by Chris Mattern (191822)

            Someone has to type in all that data to start with

            Not true; a lot of data is harvested automatically these days. And if you're getting the data by having the customer fill something out, then you're not paying for the typing.

      • Re: (Score:2, Insightful)

        by Anonymous Coward

        the metrics can be collected as the data is written instead of needing to search across the data.

        Yet if you are only ever going to look at it once, then why bother optimizing for that case? I have also seen cases where doing this you lose some other piece of information. Like my example below: maybe right now you only care about total time at a drop-off, but at some future point you care more about when it started and ended? So be careful what you prune.

        Having implemented a few systems myself one

    • Re: (Score:2, Interesting)

      by Anonymous Coward

      I work for a large resource company and we collect loads of data... some of which is valuable today and some of which is valuable tomorrow... interestingly, what is of value tomorrow depends on how mature our data consumption is today...

      so we collect the data not because it's of value today, but because we might analyse it tomorrow in a new way.

    • Re:Which 90% ? (Score:4, Interesting)

      by alexhs (877055) on Saturday July 10, 2010 @08:43AM (#32859790) Homepage Journal

      If each piece of data has a 90% probability of not being read again...

      You discard only 10 pieces out of 100, or out of 1 billion, whatever...

      The probability that none of these 10 pieces of data would have ever been needed again is 0.9^10 = 0.348 = 34.8%

      Which means that you keep all of your data.

      Caveats :

      • This assumes that all pieces have equal interest (but maybe you store a field that the interface doesn't allow you to retrieve).
      • Assuming random access on the 10% used: if you remove 10 out of 100, you have a much higher retrieval failure rate than if you remove 10 out of a billion. Some retrieval failure rate could be acceptable.
      • If each piece of data has 90% probability of not [being] read again...

        Each piece of data has a 10% chance of being necessary. For any given sample, 1/10th of them will be necessary.

        Now, a MUCH more useful set of data is probability over time. 1/10 within 10 years? 5 years? 1 week?

        • Re:Which 90% ? (Score:4, Informative)

          by alexhs (877055) on Saturday July 10, 2010 @01:16PM (#32861142) Homepage Journal

          For any given sample, 1/10th of them will be necessary.

          I'm sorry, but you're wrong. That's not how statistics work.

          Let's play heads or tails.
          Each toss has a 50% chance of being heads.
          According to you, for any number of tosses, 50% of them will be heads. In other words, you're saying that there is a 100% chance that half of them will be heads.

          For a sample of two tosses, that would mean a 100% probability of one head(s) and one tail(s).
          I hope that you see how this is wrong. You would actually have 50% probability of one head and one tail, 25% probability of two heads, 25% probability of two tails.

          For a sample of size n, with a 10% probability for each piece of data to be necessary, the correct formula says that the probability of at least one element of the sample being necessary is 1-(0.9^n), which quickly approaches 1 (100%) as n increases.

          Now, a MUCH more useful set of data is probability over time. 1/10 within 10 years? 5 years? 1 week?

          It depends on what you mean by probability over time. What I can tell you is that as more time elapses, the probability of an element being necessary (more correctly, having been necessary) increases. The 90% never read is supposedly over an infinite span of time (that's what "never" means, right?).

    • by CAIMLAS (41445)

      A big part of it is: how are they quantifying "data"?

      We keep machine backups. They are each anywhere from 3GB up through 20GB and averaging around 10GB, and each host has 2-3 copies (taken weekly). Then we've got database backups which, likewise, are taken multiple times a month. These databases aren't pure data, but are instead part of larger systems - transactional tables, but also the application's "we need this data to run" tables which, from what I've seen, rarely get used much at all. The subset of da

    • Re: (Score:3, Funny)

      by obarthelemy (160321)

      as someone once said: "50% of my advertising budget is wasted... only I don't know which 50%"

    • Re:Which 90% ? (Score:5, Insightful)

      by Mspangler (770054) on Saturday July 10, 2010 @11:43AM (#32860688)

      Note that I'm working from a process control perspective in a chemical plant, but 90% of data written is never read again sounds about right for when things are going well. It's when something goes wrong and you have to figure out what went wrong at exactly what time and what the regulatory consequences were that having all that previously unread data suddenly becomes very interesting indeed.

      And also when you start looking at a system in detail to see if you can increase output, or change a composition, all that usually ignored data becomes very valuable.

    • You identify criteria by which you can divide data into activity categories. Say data from within the warranty period for whatever you're selling has a 30 percent likelihood of being needed again (if, for instance, you're selling Xbox 360s), but things from outside that period have less than 1 percent, you keep them both available, but the least likely data to be needed gets stored by cheaper means. Occasionally, you will have slower access to data you need right now, but most of the time what you need wi
  • by eldavojohn (898314) * <eldavojohn AT gmail DOT com> on Saturday July 10, 2010 @07:10AM (#32859382) Journal
    From the article:

    Opportunity too good to pass up

    It was just about then that one of my favourite bargain-hunting websites turned up a device called the CORAID EtherDrive. Take a look at the product range at CORAID, but don’t spend too long on it.

    That's the same device from a story I submitted yesterday [slashdot.org]. I hope they don't plan on getting a Z-Series running ZFS.

  • which 90% (Score:3, Insightful)

    by marmusa (557884) on Saturday July 10, 2010 @07:11AM (#32859386)
    Which 90% though? Like the Coca Cola exec who remarked that he was pretty sure half of his advertising budget was wasted, he just wasn't sure which half.
    • Re:which 90% (Score:5, Informative)

      by Koby77 (992785) on Saturday July 10, 2010 @08:00AM (#32859600)
      I worked in a call center, and I can definitely believe that 90% of the data is never read again. However, when a customer is calling back (and is angry!), you don't have time on a live call to wait to see what's up with the account. Also there can be some litigious aspects, and a lot of information was recorded for C.Y.A. purposes. Again, you never know which part is needed for C.Y.A. purposes, but that 10% sure is valuable.

      So yeah, we needed to store ALL the account information, and we needed fast access to ALL of it ALL the time.
      • Re: (Score:3, Insightful)

        by itwerx (165526)

        "...we needed to store ALL the account information, and we needed fast access to ALL of it ALL the time."

        Which is why decent needs analysis is critical. In other situations that would not be the case.

        I must say this line at the end of the article does more to reflect the ignorance of the author than anything else, "...why on earth did we squander so much money by not thinking this way until now?"
        Who is this "we", kemosabe? Smart IT people have been thinking this way since the dawn of

      • I also worked in a call center, and while we had the same needs, we didn't get anything like that.

    • Re: (Score:2, Informative)

      by bwintx (813768)

      Like the Coca Cola exec who remarked that he was pretty sure half of his advertising budget was wasted, he just wasn't sure which half.

      FWIW, and pointing this out only because I've seen this quote referenced so many times over the years...

      John Wanamaker, a 19th century entrepreneur, Lord Leverhulme, founder of consumer goods giant Unilever, and Franklin Winfield Woolworth, the founder of Woolworth's, have all been credited with the quote: "I know that half of my advertising is wasted. I just don't know w

  • by drinkypoo (153816) <martin.espinoza@gmail.com> on Saturday July 10, 2010 @07:12AM (#32859388) Homepage Journal

    People always bitch that they have to pay for Microsoft (or whatever) Office's features because they only use 5% of its functionality. But you buy all those features at once because you don't know which you will need in the future. Data warehousing is the same way. If you start taking data offline you'll just need that data. That's why analyses of very large data sets are performed before archiving.

    But what is really wanted is a way to cluster the database servers, with old data automatically cycled to the slowest, most remote nodes, and with the most frequently-altered data heavily replicated and aggressively synchronized.

    • Re: (Score:3, Insightful)

      by 1u3hr (530656)
      People always bitch that they have to pay for Microsoft (or whatever) Office's features because they only use 5% of its functionality. But you buy all those features at once because you don't know which you will need in the future.

      Bullshit. True only if you've never used a wordprocessor in your life before. If you have, you know what you use. And you can read the description of other features to decide if you want them.

      And this is a pointless analogy because if in the future you decide you do need the 3D

      • Re: (Score:3, Insightful)

        by drinkypoo (153816)

        Bullshit. True only if you've never used a wordprocessor in your life before. If you have, you know what you use. And you can read the description of other features to decide if you want them.

        It doesn't make it unreasonable to purchase a lighter word processor with fewer features, but I for one would not want to support a word processor where you buy access to toolbar buttons. And if I'm doing database reporting (for which I have been paid in the past) I would not want to have to request that pieces of data be reloaded into the database so I can perform analyses. And further, if I have to do a year-by-year analysis, I do not want to have to load and unload data sets, crunching one year at a time.

        • by 1u3hr (530656)
          It doesn't make it unreasonable to purchase a lighter word processor with fewer features, but I for one would not want to support a word processor where you buy access to toolbar buttons.

          You're talking about what you want to support, I'm talking about what the user wants. Which may be simplicity and speed; some people prefer that to 20 tool bars and the need for a 6-core processor to open a memo. Anyway, MS doesn't give you any such choice: you take the whole multi-gigabyte package, or nothing. So the o

    • Re: (Score:3, Insightful)

      by icebraining (1313345)

      No, I think Office features are different; everyone only uses 5%, but each person uses a different 5%.

    • by Vellmont (569020)


      People always bitch that they have to pay for Microsoft (or whatever) Office's features because they only use 5% of its functionality. But you buy all those features at once because you don't know which you will need in the future.

      Heh. Individuals use about 5% of Office's features. 80% (as a group) use 20% of Office's features. 50% of Office's features are never or rarely used by anyone, and exist solely as marketing and justification to buy the thing again. (Numbers all made up on the spot to illustrate a

    • by cgenman (325138)

      The other question is: why would not selling you certain features of Microsoft Office reduce what the consumer has to pay?

      1. The additional software features have zero per-unit cost. They don't save anything by not shipping it to you.
      2. Microsoft already charges individual markets more or less whatever it thinks the market will bear.
      3. Microsoft wants you to try out and get locked into the advanced features.

      Now, there are end-user experience and programmatic reasons to kill the bloat. But from a business

    • 1. Oracle already has that, under partitioning. If there's a column you can define intervals on, you can have your database partitioned like that. E.g., you can have the database sliced by year, and move the old tablespaces to another HDD.

      Probably DB/2 too, though I don't have that much experience with that one.

      2. _But_ as Oracle itself points out, if you're doing it because of some delusions of gaining speed, you're doing it wrong. In this case while "90% never read", don't forget that in a well indexed da
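As a rough illustration of the year-based partitioning idea above, here is a toy routing sketch. The class and method names are invented for the example; real Oracle partitioning is declared in DDL and handled by the database engine, not in application code like this.

```python
# Toy sketch of range partitioning by year: rows are routed to
# per-year partitions, so old years could live on slow storage
# while callers keep issuing ordinary queries.
from collections import defaultdict

class PartitionedTable:
    def __init__(self):
        self.partitions = defaultdict(list)  # year -> list of rows

    def insert(self, row):
        # Route each row on its partition key (here, the year).
        self.partitions[row["year"]].append(row)

    def query(self, year=None):
        # Callers never name a partition; "pruning" happens here:
        # a query constrained to one year touches only that partition.
        if year is not None:
            return list(self.partitions[year])
        return [r for part in self.partitions.values() for r in part]
```

The point the comment makes survives in the sketch: the query interface is unchanged, and only the physical placement of each year's partition differs.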

    • Word processors aren't a great example. When I've worked as a word processor, I found that almost all the work could be done, more quickly and conveniently, in a simple text editor. The only things for which a word processor was needed were setting the margins, the line spacing, and the font. The only reason that Microsoft Word would be needed, rather than Wordpad, was so that you could read other Microsoft Word files.

  • by shoppa (464619) on Saturday July 10, 2010 @07:16AM (#32859406)

    Interesting that this seems to have been written up as a "hardware" or "storage" topic.

    The problem is that IT people dream up all these "write only" applications that record data, without any rational plan for what the data might actually be used for in the business.

    For example, some people worry about privacy when they go to the grocery store and know that all their purchases are being tracked by their loyalty card, or worry that the big bad US government is tapping all the E-mail.

    In fact, I'm 100% sure that some IT geek had some wet dream years ago about recording everybody's purchases and E-mails and phone calls, and it's being done every which way.

    The true "IT application" issue is that there is no real business need for this data 99.999% of the time. It gets recorded, probably gets staged off to tape, maybe indexed in some giant table, and then ... sits there for years with no actual need for it.

    I'm sure the IT geeks who dreamed up the technical ability to record all this stuff, thought they were hot shit when they came up with it. Oh, man, those IT architects were just having a big go-round whipping this problem in scalability. In their heads, they were gonna record everything on disk, then go home and fuck the prom queen.

    • by mikael_j (106439) on Saturday July 10, 2010 @07:33AM (#32859486)

      The problem is that IT people dream up all these "write only" applications that record data, without any rational plan for what the data might actually be used for in the business.

      These plans mostly come into being because us "IT people" (read: developers) know that the "business people" love changing the specs and they'll blame us if they want to start using data they didn't ask us to save and we tell them we can't save data retroactively (really, they'll basically blame the developers for not being able to time-travel). This is why we'd rather save everything than not save enough.

    • Re: (Score:3, Insightful)

      by DerekLyons (302214)

      The problem is that IT people dream up all these "write only" applications that record data, without any rational plan for what the data might actually be used for in the business.

      Seems to me that the IT folks shouldn't be making these decisions (what data to capture and store) any more than they should be deciding what to stock for the Memorial Day sale.

    • Who exactly is going to say no in the decision chain? The vendors wine and dine the managers because they've all got lots of stuff to push: Sun the hardware, EMC the storage, IBM the hardware, storage and backup solution, Oracle the database and analysis tools, etc. The manager wants to justify his position, and this stuff sounds nice and science-fictiony and "pro-active" and really, really expensive, so you know it's good. The IT guys get an increased budget, lots of new shiny toys to play with and a couple of p

  • by sirwired (27582) on Saturday July 10, 2010 @07:17AM (#32859412)

    Automated Hierarchical Storage Management has literally been around for decades. It may be new-ish on low-end crap x86 servers, but for say, mainframe users, it isn't new at all.

    What is new is the available implementation choices. When your tier choices are between enterprise disk and enterprise tape, you are biased towards keeping data on disk; there are still use cases for HSM with only high-end disk and tape, but they aren't as great. Now, with lower-cost disk available, you have a cheap disk choice too, with fairly reasonable access time.

    SirWired

  • Perfect (Score:4, Funny)

    by Andreaskem (999089) on Saturday July 10, 2010 @07:18AM (#32859418)

    A perfect application for my patented write-only memory.

  • This is new? (Score:5, Interesting)

    by rapturizer (733607) on Saturday July 10, 2010 @07:25AM (#32859440)
    I saw this over a decade ago when I was working as an IT consultant in the advertising industry. They regularly used only 5% - 10% of their information (and that's being generous). The systems I designed included a server for active work, an archive server for information used in the last 24 months, and then an archive solution (Magneto Optical at the time) that allowed for the information to be available, just not on demand. This idea has been working since for the clients that are still in business.
  • FINALLY!!! AN APPLICATION FOR THE WOM!!!! http://www.national.com/rap/files/datasheet.pdf [national.com] Bob Pease sure was foresighted, since this memory chip was invented back in the seventies!
  • In my experience with small businesses, it may be never read but will absolutely need to be found for some type of emergency presentation/proposal.

  • by mlts (1038732) * on Saturday July 10, 2010 @07:30AM (#32859474)

    This is one reason I like tape: The drives are expensive, but the tapes are $30-$50 (LTO-4 is $30 on mail-order). So having an autochanger moving all the rarely used data into storage is likely the most efficient way of moving data to long term archiving. Even better is making sure that 2-3 sets of tapes are used (one onsite, one offsite.)

    Of course, hard disks by themselves may seem cheaper, but they are not a true archival medium. There are so many moving parts in an HDD, and each of them (bearings, heads, spindles, motors, controller card) is a point of failure.

    With HDD capacities no longer growing as exponentially as they did last decade, it would be nice if tape companies would not just catch up with 2-3TB native tape offerings, but also offer drives at a lower price so home and SOHO users could use them for long-term storage. I'm sure that if someone offered a consumer-level tape drive for $500 with a decent capacity, a lot of small businesses would buy it, especially if it came with decent backup software (Retrospect, Backup Exec, Amanda, bru, or another similar utility). Since some tape drives are even bootable (some HP offerings have a section of the tape that emulates a boot CD or DVD), it would be ideal for bare-metal recoveries even by nontechnical users. Pop in the tape, boot the machine, type in the encryption key, select where the data should be restored to, walk off for a bit, and it's done.

    Even though the SAN companies have said tape is going to die, until another form of media (perhaps super-inexpensive flash media [1]) is as reliable as tapes and can be put in the Iron Mountain case and sent offsite for safekeeping for decades on end, tape will be with us. Only optical comes close to tape for long term archiving abilities.

    [1]: I can see someone making flash media that is semi-smart: it is put in a specific case, shipped to an offsite warehouse, and that warehouse plugs the cases into 5-12VDC. Then, over time, the circuitry on the flash drives periodically checks the stored flash media for damage or bit rot, corrects errors by rewriting blocks, and periodically moves good blocks to ensure that there is a high signal-to-noise level on all media. Of course, this requires power, while tapes can happily sit in a climate-controlled warehouse and still be recoverable.

    • by mbone (558574) on Saturday July 10, 2010 @09:48AM (#32860108)

      Tapes are not archival storage either. In either case, archival storage is a system, not a medium.

      I hope you are reading all of those tapes on a 5 year cycle, and writing new ones with the recovered data. I also hope you are making sure that the humidity and temperature are strictly controlled at all times in the tape storage room.

      • by Vellmont (569020)


        I hope you are reading all of those tapes on a 5 year cycle, and writing new ones with the recovered data. I also hope you are making sure that the humidity and temperature are strictly controlled at all times in the tape storage room.

        Not everyone has the same standards as to data retention. Believe it or not, some people actually couldn't care less if a 6 year old version of a document they last touched 5 years ago can't be recovered!

        In my experience, this kind of extreme level of data retention has more

      • Re: (Score:3, Informative)

        by vrmlguy (120854)

        I also hope you are making sure that the humidity and temperature are strictly controlled at all times in the tape storage room.

        That's why the OP said to use Iron Mountain [wikipedia.org]. They maintain the humidity and temperature at all times in their storage rooms.

        It costs a little extra, but if you want long term storage, rent some underground space. According to http://mic.imtc.gatech.edu/preservationists_portal/presv_costcompare.htm [gatech.edu], underground storage costs can get as low as $2/year per cubic foot (not including relocation, initial filing charges, retrieval & re-file charges) if you're buying four delivery trucks worth of space.

      • Re: (Score:3, Informative)

        by mlts (1038732) *

        5 year cycles are close enough. In business, with laws like Sarbanes Oxley, FERPA, HIPAA, PCI-DSS, and many others, if a business puts it on tape (where the maker says the archival life is in decades), drops it off at Iron Mountain, and has a documentable chain of custody system, should an audit happen and some tapes are not readable, they are off the hook. Management can look at the auditor and say that any missing data was stored in multiple places, and if anything is lost due to tape failures/bit rot o

  • But if you revised this to say, "Never accessed again a week after its creation", I'd believe it.

  • by rubycodez (864176) on Saturday July 10, 2010 @07:39AM (#32859524)

    this helps me to be a better employee. From now on I'll only save 25% of the data I acquire, because the odds are the other 75% would only be needed 7.5% of the time. In other words, 92.5% chance not likely to be needed at all.

  • by petes_PoV (912422) on Saturday July 10, 2010 @07:47AM (#32859560)
    If you're talking about blog entries. Almost all of them (well, almost all of *mine* :-) are written once and never read, unless you count spiders as reading them.
  • ...I didn't bother to read any further because I felt it was probably useless data anyway.

  • Over 92% of fire extinguishers will never be used, we could probably save a bit of space by having the unneeded ones stored off-site, or in less accessible corners of the garage.

    Slightly more seriously, we can certainly answer this question posed by the linked article easily: "why on earth did we squander so much money by not thinking this way until now?" The answer is: because you are a moron. Anyone who has given even a moment's thought to storage has known this, either implicitly or explicitly, for a long time. So whoever's included in your "we," Steve Cassidy, is just profoundly stupid. I think that quite easily explains why you all squandered so much money by not thinking about this. Next question?

  • In other news... (Score:3, Interesting)

    by argStyopa (232550) on Saturday July 10, 2010 @07:59AM (#32859598) Journal

    ...at least 70% of the crap you store in your house isn't really needed, either. Do you really ever LOOK at the pictures hanging on the walls? Are you sure you're going to read every book you own, again?

  • Anyone who manages large systems knows that this is very true, yet the data piles up. I've often wished that databases would allow us to make a view or some other type of abstraction that lets you decide whether or not to join an archive table. Right now, everything needs to be handled on a program-by-program or query-by-query basis. Hey, maybe I should quickly patent this idea, then I can license it to Oracle. :)
    • Re: (Score:3, Informative)

      by afidel (530433)
      Oracle's way ahead of you; they've had programmatically partitioned tables for quite some time. Queries don't need to be altered: if they call for data outside the active table's range, the archive table(s) are automatically used.
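      The partitioning idea afidel describes can be sketched in miniature with SQLite: hide the active/archive split behind a view so existing queries need no changes. (This is a stand-in for Oracle's native range partitioning; the table and column names here are made up for illustration.)

      ```python
      import sqlite3

      conn = sqlite3.connect(":memory:")
      conn.executescript("""
          CREATE TABLE orders_active  (id INTEGER, placed TEXT, total REAL);
          CREATE TABLE orders_archive (id INTEGER, placed TEXT, total REAL);
          -- Queries keep targeting 'orders'; the view unions both partitions.
          CREATE VIEW orders AS
              SELECT * FROM orders_active
              UNION ALL
              SELECT * FROM orders_archive;
      """)
      conn.execute("INSERT INTO orders_active  VALUES (2, '2010-07-01', 19.99)")
      conn.execute("INSERT INTO orders_archive VALUES (1, '2004-03-15',  5.00)")

      # An unmodified query transparently reaches archived rows too.
      rows = conn.execute("SELECT id FROM orders ORDER BY placed").fetchall()
      print(rows)  # [(1,), (2,)]
      ```

      A real partitioned setup would also prune partitions at query time so the archive tables aren't even scanned when the predicate stays inside the active range; a plain view doesn't give you that.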
  • to anyone aware of Sturgeon's Law [wikipedia.org]. 90% of everything is crud.

  • So what? (Score:5, Insightful)

    by davidbrit2 (775091) on Saturday July 10, 2010 @08:26AM (#32859696) Homepage
    And if you didn't have that 10% that is eventually needed, you'd be totally screwed. Do we really need to play the 20/20 hindsight game every time somebody thinks of something like this?
    • by DaveGod (703167)

      And if you didn't have that 10% that is eventually needed, you'd be totally screwed. Do we really need to play the 20/20 hindsight game every time somebody thinks of something like this?

      I know /. summaries are traditionally highly unreliable and jumping to obvious conclusions after picking up on a couple of key words is often a safer bet, but this time we have a good one. It goes straight (perhaps too straight) to the point that some data is in use that needs to be on expensive servers, and there is data th

  • Many businesses work with a customer file a few times and then never again - for example lawyers and realtors. I'd like to see a file system that will auto archive data and shift it transparently into long-term storage, and then transparently undo it when needed again.
    • by IrquiM (471313)

      Dell delivers servers with this system. It's nothing new. We've used it the last 3-4 years.
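      The auto-archiving filesystem the grandparent asks for is essentially hierarchical storage management (HSM). A toy sketch of the migration half, assuming a simple "untouched for 90 days" policy (the threshold and directory layout are illustrative; real HSM systems also migrate data back transparently on access, which is omitted here):

      ```python
      import os, shutil, tempfile, time
      from pathlib import Path

      ARCHIVE_AFTER = 90 * 24 * 3600  # seconds: untouched for ~90 days => archive

      def archive_stale(live_dir, archive_dir, now=None):
          """Move files not modified within ARCHIVE_AFTER from live_dir to archive_dir."""
          now = now if now is not None else time.time()
          os.makedirs(archive_dir, exist_ok=True)
          moved = []
          for name in os.listdir(live_dir):
              path = os.path.join(live_dir, name)
              if os.path.isfile(path) and now - os.path.getmtime(path) > ARCHIVE_AFTER:
                  shutil.move(path, os.path.join(archive_dir, name))
                  moved.append(name)
          return moved

      # Demo: one fresh file stays "live", one back-dated a year gets archived.
      live, archive = tempfile.mkdtemp(), tempfile.mkdtemp()
      Path(live, "fresh.txt").write_text("work in progress")
      old = Path(live, "old.txt"); old.write_text("cold data")
      year_ago = time.time() - 365 * 24 * 3600
      os.utime(old, (year_ago, year_ago))

      moved = archive_stale(live, archive)
      print(moved)  # ['old.txt']
      ```

      In practice you'd key the policy on access time rather than modification time, but atime is often disabled (noatime) for performance, which is exactly why real tiering products track access themselves.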

  • I also wonder if +90% of all health insurance benefits go unused each year. And you probably have business data and insurance for some of the same reasons: it's better to have it and not need it than need it and not have it. amirite?

  • Is there a reliable metric as to which 10% will be needed again at the time the data is written? If not then I don't see what this buys us.
  • by christoofar (451967) on Saturday July 10, 2010 @08:36AM (#32859738)

    If the data was recorded by Dell computers... then yeah I would expect that 90% of business customers aren't able to read it back.

  • At least 90% is Write Once Read Never.

    Wonder if you could go into business archiving never-read data. I mean you could guarantee privacy....
  • Exactly. (Score:4, Insightful)

    by brusk (135896) on Saturday July 10, 2010 @08:54AM (#32859824)
    I wasted money on a dictionary that has tens of thousands of words but have only ever looked up a few hundred. I should have bought one that just had the words I would actually need.
  • Solutions: (Score:5, Interesting)

    by drolli (522659) on Saturday July 10, 2010 @09:00AM (#32859862) Journal

    a) Forbid *unmanaged* storage of documents. If the question "where is the most up-to-date version of this document stored?" is systematically and easily answered, then people can delete the crap from their laptops.

    b) Forbid in-company attachments to mails. If the last version can be easily found, including the revision history, a link to that revision is worth *more* than the current state of the document. Most of the space in my inbox is totally useless attached documents.

    c) Forbid the use of formats unsuitable for storing a certain kind of information. (Where I work, they use PowerPoint/Word files for electronics forms.)

    d) Provide a good archiving and backup service. Besides the quality improvement from using a service, it also prevents the 100th unsystematic copy of the same data (forbid this explicitly).

    e) Thin clients: store the data on a server. Deduplicate.

    f) I would expect that most of the documents in a company can (and should) be stored in a database.
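    The deduplication in point e) can be sketched as content-addressed storage: hash each blob and keep only one copy per digest. (A minimal in-memory sketch; real filers chunk the data and hash per block rather than per file.)

    ```python
    import hashlib

    def deduplicate(blobs):
        """Store each unique content once; return (store, refs).

        store maps sha256 digest -> bytes; refs lists one digest per input,
        so many references can share a single stored copy.
        """
        store, refs = {}, []
        for blob in blobs:
            digest = hashlib.sha256(blob).hexdigest()
            store.setdefault(digest, blob)   # first writer wins; dupes are free
            refs.append(digest)
        return store, refs

    # The 100th copy of the same attachment costs one dictionary entry.
    copies = [b"quarterly-report.pptx bytes"] * 100 + [b"unique memo"]
    store, refs = deduplicate(copies)
    print(len(refs), len(store))  # 101 references, 2 stored blobs
    ```

    This is the same trick that makes "forbid the 100th copy" in point d) enforceable by the storage layer instead of by policy.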

  • Dell has been doing this for our company the last 3-4 years now.

  • Backup snapshots are wasting space 99% of the time!

  • Most people don't understand the nature of large amounts of data like that. They think "I want more, more, more" and never beyond that. Getting data is easy, getting useful data is far more important and for that you need to have your customers spend some time with the database where they can tell you everything that they don't need or want. Once you can confirm the accuracy of that information you can then purge your data of the clutter.

    What people really fail to understand though is that getting rid of da

  • And now that Dell's looked at the files, they've been read. There goes that theory.
  • I have serious doubts about how they came up with that number. Data captured once can be stored in a data warehouse and analyzed and reused in many different ways for analytics and reporting, so I am not sure how they estimate that 90% of data is never used again (unless, of course they meant that it is not pulled up again on the frontend application side, which would still make no sense at all).

    At our hospital, they have replaced the inpatient electronic medical records system at least 3 times in the la
  • I work on communications and control systems for subway and light-rail.

    A lot of stuff is recorded in case there is an incident or accident that they want to investigate. Even phone calls to the control center and radio transmissions are recorded. CPUC and FRA regulators come by, especially during construction and early service, and poke around, ask questions, pull records and so on.

    There is a regulatory retention period. If nothing happens for that period, the stuff gets deleted. But a lot of minor stuff g

  • For the major app that I work on for my company, I would say that a lot of the data is write-only until something goes wrong. There is a lot of data that is recorded simply for auditing purposes. The system keeps a copy of every version of a form that it has seen and in ideal situations these data rows, and sometimes entire documents that someone has written, are not looked at again - they are there so that if a problem is found or a complaint made everything can be tracked down to the source and procedures

  • by Dcnjoe60 (682885) on Saturday July 10, 2010 @11:07AM (#32860478)

    Rate of access does not equal importance of data. How important are, say, dental records or DNA? To the majority of people, probably not too important. However, in law enforcement, they could be very important. The US military has DNA records on all of its members. However, unless you are dead and they are trying to identify your body, 99% of it is just stored and never used.

    Medical records are stored and unlikely to be used on a regular basis; however, for someone coming into the emergency room at the local hospital with chest pains, quick and timely access to those records may be important.

    What the author seems to be proposing, however, is that records be stored on the basis of how often they will be needed (needed frequently: high-speed storage; needed once in a blue moon: slow or offline storage). In reality, data should be stored based on the cost associated with it not being available when needed.

    Using the medical example, it seems that patient data would have a high cost of not being available when needed (death). Payroll information, however, which is needed somewhat frequently, has a lower cost if not available (employee having to wait for the information). As such, the metric should not be on how often the data is accessed, but instead on how vital quick access is.
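  Dcnjoe60's metric can be sketched as a tiering rule driven by a cost-of-unavailability score rather than access frequency. (The thresholds, score scale, and record types below are purely illustrative, not from any real policy.)

  ```python
  def pick_tier(unavailability_cost):
      """Map a notional cost-of-not-having-it score (0-100) to a storage tier."""
      if unavailability_cost >= 80:
          return "fast/online"      # e.g. ER-relevant medical records
      if unavailability_cost >= 40:
          return "nearline"         # e.g. payroll history
      return "offline/archive"      # e.g. routine audit trails

  # Rarely-read patient charts still land on fast storage; frequently-read
  # but low-stakes data can sit on slower tiers.
  records = {"patient_chart": 95, "payroll_2009": 55, "old_meeting_notes": 10}
  tiers = {name: pick_tier(cost) for name, cost in records.items()}
  print(tiers)
  ```

  The interesting (and hard) part is assigning the score in the first place, which is a business judgment, not something the storage system can infer from access patterns.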

  • The problem is-- (Score:3, Insightful)

    by Chris Mattern (191822) on Saturday July 10, 2010 @11:08AM (#32860484)

    If you can't figure out which 10% you'll need later, you can't use this fact to cut down on your data storage.

  • On identifying the 10% which will be needed ahead of time? I think the focus should be the opposite: to preserve MORE data and index it better. It's not hard to imagine that an additional 10% could have been used if made available at the point of need in a relevant format, effectively doubling productivity. How many employees in a company with 10K+ developers are still coding hashtables? Sure there are variations in languages and needs, but someone HAS already written JUST what you need and if you had access to

  • The devil is in the details: figuring which part is the 90% that you'll never need again, and which is the 10% that will be needed. Some of that "write-only" data is stuff that companies are legally obligated to retain, some is CYA records that you hope you'll never need again. In both cases when the court, IRS, etc. orders you to produce the documents, you'd better have them.
  • Purge policy. This is not news, though the figure may be ambiguous. Any SA can tell you, if asked, how long the data has remained untouched. You see this in database backups that go un-queried for years. We give it 3 years, then it's gone. Storing data for 3 years isn't going to break the bank, per se.
  • Probably 95% of the records are for compliance and things legal wants saved for CYA purposes. This is more a function of the legal environment, where everyone wants to sue every business that looks at them funny, and how courts expect tons of documents on everything you've ever done. It'd be an interesting analysis to see what the costs of excess records retention are compared to the legal losses, and more importantly, the losses consumers incur because they can't afford to fight well documented machines
