
Data Deduplication Comparative Review

snydeq writes "InfoWorld's Keith Schultz provides an in-depth comparative review of four data deduplication appliances to vet how well the technology stacks up against the rising glut of information in today's datacenters. 'Data deduplication is the process of analyzing blocks or segments of data on a storage medium and finding duplicate patterns. By removing the duplicate patterns and replacing them with much smaller placeholders, overall storage needs can be greatly reduced. This becomes very important when IT has to plan for backup and disaster recovery needs or when simply determining online storage requirements for the coming year,' Schultz writes. 'If admins can increase effective storage capacity by 20, 40, or 60 percent by removing duplicate data, that allows current storage investments to go that much further.' Under review are dedupe boxes from FalconStor, NetApp, and SpectraLogic."
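
For readers who want to see the idea in miniature, here is a minimal Python sketch of block-level deduplication, assuming fixed-size 4KB blocks and SHA-256 fingerprints (shipping appliances typically use variable-size chunking and verify candidate matches rather than trusting the hash alone). It illustrates the concept only and is not any vendor's implementation.

    import hashlib

    BLOCK_SIZE = 4096  # assumed fixed block size for this sketch

    def deduplicate(data: bytes):
        store = {}   # fingerprint -> unique block contents
        refs = []    # ordered fingerprints standing in for the original blocks
        for i in range(0, len(data), BLOCK_SIZE):
            block = data[i:i + BLOCK_SIZE]
            digest = hashlib.sha256(block).hexdigest()
            store.setdefault(digest, block)  # keep only the first copy of each block
            refs.append(digest)
        return store, refs

    def rehydrate(store, refs) -> bytes:
        # Reassemble the original data by following the references.
        return b"".join(store[d] for d in refs)

    # Usage: a payload with heavy duplication shrinks to two unique blocks.
    data = b"A" * BLOCK_SIZE * 10 + b"B" * BLOCK_SIZE * 10
    store, refs = deduplicate(data)
    assert rehydrate(store, refs) == data
    print(f"{len(refs)} blocks referenced, {len(store)} unique blocks stored")

The savings come entirely from how often blocks repeat, which is why backup streams and virtual machine images, with their long runs of identical data, are the usual showcase for the technique.
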
  • Second post (Score:2, Funny)

    by Anonymous Coward on Wednesday September 15, 2010 @07:13PM (#33594064)

    Same as the first.

  • Re:Um.. (Score:3, Funny)

    by igny ( 716218 ) on Wednesday September 15, 2010 @09:33PM (#33595344) Homepage Journal

    Yeah! To fight dupes I compute a CRC checksum for each file and store it (and only it) on my backup drive. That method removes dupes almost automatically, and as a side effect it gives a huge compression ratio too. I have been downloading high-def videos from the Internet for quite a while now, and with my compression method I have used less than 10 percent of a 1GB flash drive! I strongly recommend this method to everyone!
  • by MyLongNickName ( 822545 ) on Wednesday September 15, 2010 @09:57PM (#33595544) Journal

    After an analysis of a 1TB drive, I noticed that roughly 95% were 0's with only 5% being 1's.

    I was then able to compress this dramatically. I just record that there are 950M 0's and 50M 1's. The space taken up drops to around 37 bits. Throw in a few checksum bits, and I am still under eight bytes.

    I am not sure what is so hard about this disaster recovery planning. Heck, I figure I am up for a promotion after I implement this.

  • by zooblethorpe ( 686757 ) on Wednesday September 15, 2010 @10:31PM (#33595784)

    ...so reducing sinning platters can be a bad thing.

    Satan, is that you?

    Cheers,

  • by StikyPad ( 445176 ) on Thursday September 16, 2010 @05:20PM (#33604666) Homepage

    Sounds like what we need is a giant table of all possible byte sequences up to length 2^n; then we can just provide the index into this master table instead of the data itself. I call this proposal the storage-storage tradeoff where, in exchange for requiring large amounts of storage, we require even more storage. I'll even throw in the extra time requirements for free.
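
StikyPad's "storage-storage tradeoff" is a counting argument in disguise: an index into a table of every possible byte string must carry at least as many bits as the string it points to, so nothing is saved even before you try to store the table's 256^n entries. A small Python sketch (with hypothetical helper names) makes the point:

    def master_table_index(data: bytes) -> int:
        # The lexicographic position of `data` among all byte strings of its
        # length is just the big-endian integer those bytes already encode.
        return int.from_bytes(data, "big")

    def from_master_table(index: int, length: int) -> bytes:
        # "Looking up" the entry recovers the original bytes --
        # the index was the data all along.
        return index.to_bytes(length, "big")

    data = b"deduplicate me"
    idx = master_table_index(data)
    print(idx.bit_length(), len(data) * 8)  # the index is essentially as large as the data
    assert from_master_table(idx, len(data)) == data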

"May your future be limited only by your dreams." -- Christa McAuliffe

Working...