To Purge Or Not To Purge Your Data

Lucas123 writes "The average company pays from $1 million to $3 million per terabyte of data during legal e-discovery. The average employee generates 10GB of data per year at a cost of $5 per gigabyte to back it up — so a 5,000-worker company will pay out $1.25 million for five years of storage. So while you need to pay attention to retaining data for business and legal requirements, experts say you also need to be keeping less, according to a story on Computerworld. The problem is, most organizations hang on to more data than they need, for much longer than they should. 'Many people would prefer to throw technology at the problem than address it at a business level by making changes in policies and processes.'"
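
The backup figure in the summary can be reproduced with a few lines of Python. This is a minimal sketch, assuming each gigabyte is charged once, in the year it is generated; the function name `backup_cost` and its parameters are illustrative and do not come from the Computerworld story.

```python
def backup_cost(employees: int, gb_per_employee_per_year: float,
                cost_per_gb: float, years: int) -> float:
    """Total backup cost, charging each year's newly generated data once (an assumption)."""
    return employees * gb_per_employee_per_year * cost_per_gb * years


# Figures quoted in the summary: 5,000 workers, 10GB each per year, $5 per GB, 5 years.
total = backup_cost(employees=5000, gb_per_employee_per_year=10, cost_per_gb=5, years=5)
print(f"${total:,.0f}")  # $1,250,000
```

If older data had to be backed up again every year it is retained, the total would be higher than this simple model suggests.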

  • by William Robinson ( 875390 ) on Thursday September 18, 2008 @10:32AM (#25054769)

    For example, financial institutions are required to keep data for longer periods, both for legal purposes and for traceability (during investigations of fraud or other kinds of crime). The banks I worked for had a legal requirement to keep data in two places at least 15 km apart, with all kinds of protection against fire and intrusion.

    A good manufacturing company would keep data for longer periods not only to comply with ISO standards, but also to trace manufacturing defects and to serve as evidence of past history for the insurance company in case of theft, fire and other kinds of problems.

    We used to keep daily source code changes for only the previous releases, and purge the rest (we kept the final source code and patches of all previous releases, but purged the daily changes).

    In a nutshell, it depends on your type of business.

  • Data Discovery Woes (Score:1, Informative)

    by Anonymous Coward on Thursday September 18, 2008 @10:43AM (#25054955)

    I work for a few lawyers and we just began running into issues with "data discovery". Two recent examples:

    They are a medium-sized law firm and they were involved in a lawsuit with another law firm. The other law firm (much smaller) required a copy of all the data from the firm.

    Data from encrypted laptops = 80GB x 6 users
    2 hours per laptop to decrypt and image (12 hours)
    Data from 4 servers and email = 65GB (2 hours)
    That's now about 545GB and 14 billable hours of support.
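
    As a quick check, the figures in this example can be tallied in Python. The variable names are illustrative; the numbers are the ones quoted in this comment.

```python
# Figures quoted in the comment.
laptop_count = 6
gb_per_laptop = 80
hours_per_laptop = 2   # time to decrypt and image one laptop
server_gb = 65         # four servers plus email
server_hours = 2

laptop_gb = laptop_count * gb_per_laptop                        # laptops alone: 480GB
total_gb = laptop_gb + server_gb                                # laptops plus servers
total_hours = laptop_count * hours_per_laptop + server_hours

print(total_gb, total_hours)  # 545 14
```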


    The law firm was involved in a lawsuit where they were doing discovery and had to review evidence.
    They were going to get data from 10 laptops (800GB total), which will require backups and archival for X years (so far it has been 1 year, with no end date set).

    Data discovery is quickly getting expensive - and annoying on a technical level.

  • litigation hold (Score:2, Informative)

    by Benjamin_Wright ( 1168679 ) on Thursday September 18, 2008 @11:51AM (#25056129) Homepage
    Any record destruction policy must include a "litigation hold". A litigation hold means that record destruction must stop when litigation is anticipated or pending. But in a complex enterprise, it is tricky to know what litigation the enterprise anticipates. It was the trickiness of litigation hold that led to the demise of Arthur Andersen. The risks associated with litigation hold give enterprises incentive to store lots more records. --Ben
  • by ubercam ( 1025540 ) on Thursday September 18, 2008 @11:57AM (#25056231)

    Users aren't meant to be making those decisions, the Records Management department should be... that is if you even have one! If you leave everything up to the users, you WILL have a cluster fuck of records.

    I work in Records Management at a large company with many different divisions in diverse fields. RM is completely left up to us. We manage well over 10,000 boxes and there's only 3 of us. We alone determine when something is to be destroyed (but require authorization from dept heads to be shredded), how long it's kept, etc.

    Disclaimer: We work mainly with paper records, but the exact same principles apply to electronic records.

    You need a retention schedule. Look at your national, state/provincial and municipal laws to determine the minimum legally required length of time each TYPE of record is to be kept. Employee time cards are different from pension plans, sales invoices and legal files. It's not *always* 7 years either. Some are less, some are more, some are permanent. Also, you don't have to shred when the law says it's time if there's a valid business reason to keep that set of records. I mean, let's get this straight. You don't HAVE TO shred at all, but you're digging yourself a deep hole if you don't... "You can get in just as much trouble by keeping records too long as you can by destroying them too quickly." - Dr. Mark Langemo

    If this was all left up to individuals, they would just keep everything. I've seen what this is like, and it's pathetic, maddening and counterproductive. Things must be properly named and catalogued down to the file level when put in storage, or you will NEVER find ANYTHING without an exhaustive search EVERY time. It might be alright when it's on your desk or in your local filing area and you know what's where, but once you archive it, you can't assume the guy looking for the file you need knows anything about it. We need explicit details or else we can't help you. At my company we require everyone to fill out a nice sheet detailing the contents of their box, the type of records, dates (most remember dates above all else), sender's name, dept, etc.

    We are by no means a perfect operation here, but we're far better than 90% of other companies out there.

    There is a series of excellent seminars done by Dr. Mark Langemo (sorry no links) to teach you how to deal with records. Also check out ARMA International if you're looking to get in touch with other Records Managers in your area. They have local chapters all over the place.

    To summarize, if your company doesn't have a Records Manager, HIRE ONE NOW and give him/her the resources to get your records under control! Check out ARMA, they have jobs posted on their site. There are also many companies out there that will help you clean up your stuff and get you started on the right track.

  • Re:Easier to keep (Score:3, Informative)

    by guruevi ( 827432 ) <`moc.stiucricve' `ta' `ive'> on Thursday September 18, 2008 @12:53PM (#25057131) Homepage

    1) This is the average. Your company might have 700MB/user; in my organization, close to 1TB/user/year gets added. We're doing medical imaging.

    2) It's not just tape libraries. The cost for D2D2T or D2D2D (what we're doing) goes way up compared to a 'simple' backup scheme. Especially if you're like us and require multiple gigabit streams, disk storage can't be just 4 cheap SATA disks in RAID5. We have 2 storage arrays with 14 drives each for general access and another storage array with 10 SATA disks for primary backup, and those things don't come cheap, especially since you need multiple servers to handle the load.

    3) Encryption, tape rotation or multiple locations add to the costs.

    4) If you're buying a solution, e.g. from IBM (Tivoli), you need to pay for a consultant and/or another employee to get that stuff running. We're doing what we're doing with open source and it's going well, but if you can't and need to pay for software, it adds up (especially for Windows systems).

  • Sarbanes Oxley (Score:1, Informative)

    by Anonymous Coward on Thursday September 18, 2008 @01:22PM (#25057647)

    I'm not sure if most of you understand what is really being written about here. There are laws in place that REQUIRE that companies retain EVERY document according to a certain set of rules. These rules change depending on the type of company, but a good rule of thumb is 2 years of document retention. Publicly traded companies are under even stricter guidelines, including Sarbanes-Oxley.

    Exchange servers alone will generate huge amounts of data in no time at all. When these companies go into litigation (and they almost always do), all of this data is considered discoverable and can cost the legal department huge fees.

    When involved in litigation, these documents cannot simply be pulled out of the archive and made available for review. A very strict set of rules requires that these documents be produced in a non-editable, read-only image format (usually TIFF) and then loaded into discovery review platforms such as Concordance or Summation. This costs tons of money because companies typically do not produce them in house.

    The cost of producing the documents is only the beginning, though. After they are produced, having them reviewed is where the really steep fees come into play. Lawyer fees can run upwards of 250 - 400 dollars per hour. That means that an email that took someone 10 minutes to type out might be reviewed for 30 min - 1 hr by the legal team (depending on relevance). So that single email could end up costing several hundred dollars between document production and review.

    Now, if there are suspicious documents that have links to files that no longer exist, the opposing counsel has the right to do a forensic investigation on the system to look for deleted files. If they are found to be deleted when they should have been kept, the court can actually sanction the company in question... not a good position to be in!

    Electronic Discovery is huge business these days and only grows as more and more companies enter litigation each year.

  • by Anonymous Coward on Thursday September 18, 2008 @02:54PM (#25059433)
    AFAIK, one of the Windows file systems has an Archive bit to represent this.
  • I've become the e-discovery guy (at least for email) where I work. Our lawyers told me that the latest revision of the FRCP (Federal Rules of Civil Procedure) requires an entity to keep evidence, even if automatic purging systems are in place.

    Rule 37 of the FRCP says that if you are ordered to hand over the evidence, and you cannot, then the judge can order that "designated facts be taken as established for purposes of the action, as the prevailing party claims". In other words, if the person suing you claims you sent them an email offering a million dollars to not go to court, and you auto-purge your email (taking away the ability to prove you didn't send the email), the judge has the option of deciding that yes, you did make an offer of a million dollars via email. 'Twould suck to be you.

    It even gets a little worse. Although you must keep evidence after being told you are being taken to court, it turns out you also need to keep all evidence in case you are taken to court. I'm told that the criterion here is "reasonable expectation that the matter will go to court". It's reasonable (for example) to expect to end up in court if an employee dies while on the job (and it wasn't due to natural causes). The point here is that if a person dies, you'd better keep any email about the situation that led to the death - 60 day auto-purging email expiration practice be damned.

    Auto-purging is a fine thing, as long as you have the ability to except items out, in case they become evidence.
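
    A minimal sketch of such an exception, in Python: messages older than the retention window are selected for purging unless they are flagged as being on hold. The `Message` class, the `on_hold` flag and `select_for_purge` are hypothetical names for illustration, not part of any real mail system; the 60-day window is the one mentioned above.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Message:
    msg_id: str
    received: date
    on_hold: bool = False  # hypothetical flag: set when the item may become evidence


def select_for_purge(messages, today, retention_days=60):
    """Return messages older than the retention window that are not on hold."""
    cutoff = today - timedelta(days=retention_days)
    return [m for m in messages if m.received < cutoff and not m.on_hold]


inbox = [
    Message("a", date(2008, 6, 1)),                # old, not on hold: purgeable
    Message("b", date(2008, 6, 1), on_hold=True),  # old, but excepted from purging
    Message("c", date(2008, 9, 10)),               # still inside the 60-day window
]
purgeable = select_for_purge(inbox, today=date(2008, 9, 18))
print([m.msg_id for m in purgeable])  # ['a']
```

    The hard part in practice is deciding when to set the flag, since the hold has to start once litigation is reasonably expected, not only once it has been filed.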
