Data Deduplication Comparative Review

Posted by samzenpus on Wednesday September 15, 2010 @07:10PM from the a-little-order-please dept.

snydeq writes "InfoWorld's Keith Schultz provides an in-depth comparative review of four data deduplication appliances to vet how well the technology stacks up against the rising glut of information in today's datacenters. 'Data deduplication is the process of analyzing blocks or segments of data on a storage medium and finding duplicate patterns. By removing the duplicate patterns and replacing them with much smaller placeholders, overall storage needs can be greatly reduced. This becomes very important when IT has to plan for backup and disaster recovery needs or when simply determining online storage requirements for the coming year,' Schultz writes. 'If admins can increase storage usage 20, 40, or 60 percent by removing duplicate data, that allows current storage investments to go that much further.' Under review are dedupe boxes from FalconStor, NetApp, and SpectraLogic."
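As a rough illustration of the process the summary describes, here is a minimal sketch of fixed-size block deduplication. It is not how any of the reviewed appliances work internally; the 4096-byte block size, the choice of SHA-256, and the names `dedupe` and `restore` are all assumptions made for the example.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size; real products differ and may use variable-size segments


def dedupe(data: bytes, block_size: int = BLOCK_SIZE):
    """Split data into fixed-size blocks and keep each unique block only once."""
    store = {}   # digest -> block bytes (unique blocks only)
    recipe = []  # ordered digests: the small "placeholders" that stand in for blocks
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        digest = hashlib.sha256(block).digest()
        store.setdefault(digest, block)  # store the block only the first time it is seen
        recipe.append(digest)
    return store, recipe


def restore(store, recipe) -> bytes:
    """Rebuild the original data by following the placeholders in order."""
    return b"".join(store[digest] for digest in recipe)


if __name__ == "__main__":
    data = b"A" * BLOCK_SIZE * 3 + b"B" * BLOCK_SIZE
    store, recipe = dedupe(data)
    print(len(recipe), "blocks referenced,", len(store), "blocks stored")
    assert restore(store, recipe) == data
```

The saving comes from the gap between the number of blocks referenced and the number actually stored; the recipe itself costs one digest per block, so data with few repeats gains little.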

This discussion has been archived. No new comments can be posted.

Wrong layer (Score:5, Insightful)

by Hatta ( 162192 ) writes: on Wednesday September 15, 2010 @07:15PM (#33594088) Journal

Filesystems should be doing this.

Which filesystem should be doing this??? (Score:2, Insightful)

by DanDD ( 1857066 ) writes: on Wednesday September 15, 2010 @07:40PM (#33594300)

Filesystems should be doing this.
The one on your desktop machine, or the primary NAS storage that you access shared data from, or the backup server that ends up getting it all anyway? You see, this is a shared database problem. If your local filesystem does this, then it has to 'share' knowledge of all the unique blocklets with every other server/filesystem that wishes to share in this compressed file space. De-duplication is a means of compression that works across many filesystems - or at least it can be, if it is properly implemented.
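The "shared database" point can be sketched as a single block index that several filesystems write through. This is only an illustration under stated assumptions: `SharedBlockIndex` and `ClientFilesystem` are hypothetical names, everything lives in one process, and a real deployment would need networking, locking, deletion, and garbage collection.

```python
import hashlib


class SharedBlockIndex:
    """One index of unique blocks, shared by every participating filesystem."""

    def __init__(self):
        self.blocks = {}  # hex digest -> block bytes

    def put(self, block: bytes) -> str:
        key = hashlib.sha256(block).hexdigest()
        self.blocks.setdefault(key, block)  # stored once, whoever writes it first
        return key

    def get(self, key: str) -> bytes:
        return self.blocks[key]


class ClientFilesystem:
    """Hypothetical client that keeps only block references per file."""

    def __init__(self, name: str, index: SharedBlockIndex, block_size: int = 4096):
        self.name = name
        self.index = index
        self.block_size = block_size
        self.files = {}  # path -> ordered list of block keys

    def write(self, path: str, data: bytes) -> None:
        size = self.block_size
        self.files[path] = [
            self.index.put(data[i:i + size]) for i in range(0, len(data), size)
        ]

    def read(self, path: str) -> bytes:
        return b"".join(self.index.get(key) for key in self.files[path])


if __name__ == "__main__":
    index = SharedBlockIndex()
    desktop = ClientFilesystem("desktop", index)
    nas = ClientFilesystem("nas", index)
    payload = b"x" * 8192 + b"y" * 4096
    desktop.write("/home/report.doc", payload)
    nas.write("/share/report.doc", payload)
    print(len(index.blocks), "unique blocks held for two copies of the file")
```

If each client kept its own private index instead, the second copy would be stored again in full, which is the cost of deduplicating per filesystem rather than across them.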

Re:Don't forget to weigh in the cost (Score:4, Insightful)

by h4rr4r ( 612664 ) writes: on Wednesday September 15, 2010 @09:08PM (#33595132)

More disk is still so much cheaper it really cannot be justified on that front. More disks also mean more IOPS, so reducing sinning platters can be a bad thing.
There are some reasons to go for it, but even with thousands of clients it may or may not be suitable for what you are doing.
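A back-of-the-envelope sketch of the IOPS point: the number of disks you need is set by whichever is larger, the capacity requirement or the IOPS requirement. The function name `disks_needed` and every figure below are made up for illustration, not measurements of any product.

```python
import math


def disks_needed(capacity_tb: float, required_iops: float,
                 disk_tb: float, disk_iops: float) -> int:
    """Disks required to satisfy both the capacity and the IOPS target."""
    for_capacity = math.ceil(capacity_tb / disk_tb)
    for_iops = math.ceil(required_iops / disk_iops)
    return max(for_capacity, for_iops)


if __name__ == "__main__":
    # Hypothetical workload: 100 TB of data, 6000 IOPS, on 2 TB disks at 150 IOPS each.
    before = disks_needed(100, 6000, disk_tb=2, disk_iops=150)
    # Assume dedupe halves the stored data; the IOPS demand does not change.
    after = disks_needed(50, 6000, disk_tb=2, disk_iops=150)
    print(before, "disks before dedupe,", after, "disks after")
```

With these assumed numbers, halving the data only drops the disk count from 50 to 40, because the workload becomes IOPS-bound rather than capacity-bound.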

Re:Wrong layer (Score:3, Insightful)

by h4rr4r ( 612664 ) writes: on Wednesday September 15, 2010 @09:12PM (#33595174)

OpenSolaris is dead, and there are kernel bugs in the latest version, so good luck with that. I looked at doing it at one time and, due to fears about OpenSolaris, I stayed away. I consider myself lucky.

Ya it is (Score:4, Insightful)

by Sycraft-fu ( 314770 ) writes: on Wednesday September 15, 2010 @09:20PM (#33595246)

Something you start to appreciate when you are called on to do a really high availability, high reliability system is to have features like this. For one thing it reduces the time it takes to get a replacement. Unless a drive fails late at night, you get one the next day. You don't have to rely on someone to notice the alert, place the order, etc. It just happens. Also, like most high end support companies, their shipping time is fairly late so even late in the day it is next day service. What arrives is the drive you need, in its caddy, ready to go.
Then there's just the fact of having someone else help monitor things. It's easy to say "Oh ya I'll watch everything important and deal with it right away," but harder to do it. I've known more than a few people who are not nearly as good at monitoring their critical system as they ought to be. A backup is not a bad thing.
You have to remember that the kind of stuff you are talking about for things like NetApps is when no downtime is ok, when no data loss is ok. You can't say "Ya a disk died and before we got a new one in, another died, so sorry, stuff is gone."
Not saying that your situation needs it, but there are those that do. They offer other features along those lines like redundant units, so if one fails the other continues no problem.
Basically they are for when data (and performance) is very important and you are willing to spend money for that. You put aside the tech-tough guy attitude of "I can manage it all myself," and accept that the data is that important.

Re:Ya it is (Score:3, Insightful)

by h4rr4r ( 612664 ) writes: on Wednesday September 15, 2010 @09:47PM (#33595466)

I mean have the Nagios server order the drive without any human intervention.
Also, if it was really critical you would keep several disks ready to go on site. You know, for when you can't wait for next day. Also, like NetApp, you too can have many hot spares in the volume.
If you have problems with people not noticing or reacting to alerts you need to fire them.

Re:Wrong layer (Score:3, Insightful)

by drsmithy ( 35869 ) writes: <drsmithy@nOSPAm.gmail.com> on Wednesday September 15, 2010 @11:33PM (#33596124)

Sweet, thanks for the pointer. I was also concerned about the death of OpenSolaris but it sounds like Nexenta may be just what I want.
Nexenta is built off OpenSolaris and is, therefore, also dead - though it may take longer for the thrashing to stop.

Re:Wrong layer (Score:4, Insightful)

by drsmithy ( 35869 ) writes: <drsmithy@nOSPAm.gmail.com> on Wednesday September 15, 2010 @11:41PM (#33596174)

Filesystems should be doing this.
No, block devices should be doing this. Then you get the benefits regardless of which filesystem you want to layer on top.
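To make the block-device idea concrete, here is a minimal sketch of a dedup layer that sits under any filesystem: it maps logical block addresses to shared physical blocks and knows nothing about files. `DedupBlockDevice` is a hypothetical in-memory class, not a real kernel or driver interface.

```python
import hashlib


class DedupBlockDevice:
    """Toy block device: logical block addresses map to shared physical blocks."""

    def __init__(self, block_size: int = 4096):
        self.block_size = block_size
        self.lba_to_digest = {}  # logical block address -> digest
        self.physical = {}       # digest -> (block bytes, reference count)

    def write_block(self, lba: int, data: bytes) -> None:
        if len(data) != self.block_size:
            raise ValueError("writes must be exactly one block")
        digest = hashlib.sha256(data).digest()
        old = self.lba_to_digest.get(lba)
        if old == digest:
            return  # same content already mapped at this address
        if old is not None:
            self._release(old)
        if digest in self.physical:
            block, refs = self.physical[digest]
            self.physical[digest] = (block, refs + 1)  # share the existing block
        else:
            self.physical[digest] = (data, 1)
        self.lba_to_digest[lba] = digest

    def _release(self, digest: bytes) -> None:
        """Drop one reference; free the physical block when nothing points at it."""
        block, refs = self.physical[digest]
        if refs == 1:
            del self.physical[digest]
        else:
            self.physical[digest] = (block, refs - 1)

    def read_block(self, lba: int) -> bytes:
        digest = self.lba_to_digest.get(lba)
        if digest is None:
            return bytes(self.block_size)  # unwritten blocks read back as zeros
        return self.physical[digest][0]


if __name__ == "__main__":
    dev = DedupBlockDevice(block_size=4)
    dev.write_block(0, b"aaaa")
    dev.write_block(1, b"aaaa")
    dev.write_block(2, b"bbbb")
    print(len(dev.lba_to_digest), "logical blocks,", len(dev.physical), "physical blocks")
```

Because the layer only sees fixed-size blocks at addresses, whatever filesystem is formatted on top gets the benefit without knowing about it; the trade-off is that duplicates which do not line up on block boundaries go undetected.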

Re:Don't forget to weigh in the cost (Score:4, Insightful)

by Krahar ( 1655029 ) writes: on Wednesday September 15, 2010 @11:54PM (#33596236)

Sinning platters cause original spin.

Re:Ya it is (Score:2, Insightful)

by Anonymous Coward writes: on Thursday September 16, 2010 @12:05AM (#33596296)

I'll butt in and say that firing people is a piss poor way to fix problems unless you've made very sure that the person in question needs to go. What you do is find out what happened if an alert goes unnoticed and make a change that removes the root cause of that failure. That may be that you have to let go of the guy doing drugs in the corner, but it may also be that your hardware issues alerts in a way that is easy to miss. You may also realize that perhaps an alert happens only once a year, and in that case you may need to issue spurious alerts to make sure that people know what to do and remain vigilant. The root cause may even be that your staff is completely overworked, and just think where firing someone is going to put you then. Or maybe what you need is to put a siren on the damn thing that will make it impossible to miss even at 3 in the night when the guy on watch falls asleep because he's been pulling all-nighters to keep your company in business. Firing someone just because a fuck-up happened is sometimes a very bad response.

Re:Ya it is (Score:3, Insightful)

by totally bogus dude ( 1040246 ) writes: on Thursday September 16, 2010 @12:16AM (#33596348)

Developing a monitoring system for a complicated piece of storage that reacts properly to every possible failure mode is a massive undertaking. It will take a lot of time just to figure out everything that you need to monitor, and the possible values for them during normal operation; let alone actually test that your system correctly detects and responds to every possibility.
If your business is providing SAN management/support services, then I can see this as being worthwhile. It's a massive investment in technology and skills amongst your staff, but if that's what you make your money doing, it may well give you a competitive edge.
But if your business is anything else, why are you going to invest so much into something that's really just a background piece of infrastructure? What's your plan for retaining the staff that know how the monitoring system works, and know your storage system in sufficient detail to be able to understand all the things it's checking, etc?
If you really have the expertise on-hand to implement such a thing in a way that you're comfortable relying on, why on earth wouldn't you use them for something more productive that will actually make your business money? Again, if your business is monitoring storage infrastructure, it makes sense. If your business is anything else, why are you spending the time of highly skilled people to implement something you can easily buy off-the-shelf (i.e. a standard support contract)?

Re:Don't forget to weigh in the cost (Score:3, Insightful)

by TheRaven64 ( 641858 ) writes: on Thursday September 16, 2010 @07:07AM (#33597848) Journal

No, good department managers don't know that. Department managers in companies with bad senior management know that. Companies with competent senior management are willing to increase the budgets for departments that have shown that they are fiscally responsible, and cut the budgets or fire the department heads of others.
