Data Storage

Error-Proofing Data With Reed-Solomon Codes 196

Posted by kdawson
from the trust-but-verify dept.
ttsiod recommends a blog entry in which he details steps to apply Reed-Solomon codes to harden data against errors in storage media. Quoting: "The way storage quality has been nose-diving in the last years, you'll inevitably end up losing data because of bad sectors. Backing up, using RAID and version control repositories are some of the methods used to cope; here's another that can help prevent data loss in the face of bad sectors: Hardening your files with Reed-Solomon codes. It is a software-only method, and it has saved me from a lot of grief..."

  • by xquark (649804) on Sunday August 03, 2008 @05:49PM (#24459605) Homepage

    slow news day anyone?

  • ZFS? (Score:3, Interesting)

    by segfaultcoredump (226031) on Sunday August 03, 2008 @05:53PM (#24459653)

    Uh, is this not one of the main features of the ZFS file system? It does a checksum on every block written and will reconstruct the data if an error is found? (assuming you are using either raid-z or mirroring. Otherwise it will just tell you that you had an error).

  • by symbolset (646467) on Monday August 04, 2008 @12:45AM (#24462479) Journal

    Look, if it's secret, one copy is too many. For everything else, gmail it to five separate recipients. It's not like Google has ever lost any of the millions of emails I've received to date. (This is not a complaint -- they don't show me the spam unless I ask for it).

    And if they ever did lose an email, well, to paraphrase an old Doritos commercial, "They'll make more."

    Seriously, personally I view the persistence of data as a problem. It's harder to let go of than it is to keep.

  • Speed? (Score:4, Interesting)

    by grasshoppa (657393) <`gro.oc-onpt' `ta' `ydenneks'> on Monday August 04, 2008 @12:51AM (#24462515) Homepage

    My question is of speed; this seems a promising addition to anyone's backup routine. However, most folks I know have 100s of gigs of data to back up. While differentials could be involved, right now tar'ing to tape works fast enough that the backup is done before the first staff shows up for work.

    I assume we're beating the hell out of the processor here; so I'm wondering how painful is this in terms of speed?

  • by Marillion (33728) <ericbardes@gmail . c om> on Monday August 04, 2008 @01:09AM (#24462623)

    My biggest failed prediction in the world of computers was the CD-ROM.

    I was an audio CD early adopter and I knew from articles I read that audio CDs often had a certain defect rate. The defect rate was usually such that you would never hear it. One artist even published all the defects in the liner notes.

    Based upon this, I presumed that you would never get the defect rate to zero and that no one would trust a data medium with anything less than perfection - and thus predicted the CD-ROM would never catch on.

    They don't have to get the rate to zero. Just close enough to zero for the RS to function.

  • by xquark (649804) on Monday August 04, 2008 @01:15AM (#24462645) Homepage

    My understanding is that it is possible to drill a few holes no larger than 2mm in diameter, equally spread over the surface of an "audio cd", and still get a near perfect reconstruction with the help of h/w RS erasure decoding, channel interleaving and channel prediction (e.g. probabilistically reconstructing the missing right channel from the known left channel). That's what usually happens to overcome scratches and other kinds of simple surface defects.

  • Re:Speed? (Score:5, Interesting)

    by xquark (649804) on Monday August 04, 2008 @01:23AM (#24462691) Homepage

    The speed of encoding and decoding directly relates to the type of RS and the amount of FEC required. Generally speaking, erasure-style RS can go as low as O(n log n) (essentially inverting and solving for a Vandermonde- or Cauchy-style matrix). A more general code that can correct errors (the difference between an error and an erasure is that in the latter you know the location of the error but not its magnitude) may require a more complex process, something like syndrome computation, Berlekamp-Massey and Forney, which is about O(n^2).

    It is possible to buy specialised h/w (or even GPUs) to perform the encoding steps (getting roughly 100+MB/s) and most software encoders can do about 50-60+Mb/s for RS(255,223) - YMMV
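To make the erasure case concrete, here is a minimal sketch in Python. It is an illustration, not what any production encoder does: it assumes a prime field GF(257) instead of the usual GF(2^8), uses plain Lagrange interpolation (quadratic cost per block, not the O(n log n) mentioned above), and the names `encode` and `recover` are hypothetical. The data symbols are treated as values of a polynomial at x = 0..k-1, and the parity symbols are that polynomial's values at further points, so any k surviving symbols are enough to rebuild the rest.

```python
P = 257  # assumption: a prime modulus, chosen so ordinary modular arithmetic works

def _interpolate_at(points, x):
    """Evaluate at x the unique polynomial through `points` (Lagrange form, mod P)."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # pow(den, -1, P) is the modular inverse (Python 3.8+)
        total = (total + yi * num * pow(den, -1, P)) % P
    return total

def encode(data, n):
    """Systematic encode: symbols 0..k-1 are the data, k..n-1 are parity."""
    k = len(data)
    points = list(enumerate(data))
    return list(data) + [_interpolate_at(points, x) for x in range(k, n)]

def recover(codeword, k):
    """Rebuild the k data symbols from a codeword whose erased slots are None."""
    known = [(x, y) for x, y in enumerate(codeword) if y is not None][:k]
    if len(known) < k:
        raise ValueError("too many erasures to recover")
    return [_interpolate_at(known, x) for x in range(k)]
```

With 5 data symbols and n = 9, any 4 symbols can be marked as erased (set to `None`) and `recover` still returns the original 5. One caveat of the GF(257) shortcut: a parity symbol can take the value 256, which does not fit in a byte, and that is one reason real codes work in GF(2^8).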

  • by InakaBoyJoe (687694) on Monday August 04, 2008 @01:28AM (#24462721)

    TFA introduces some new ".shielded" file format. But do we need yet another file format when PAR (Parchive) [wikipedia.org] has been doing the same job for years now? The PAR2 format is standardized and well-supported cross-platform, and might just have a future even IF you believe that Usenet is dying [slashdot.org]...

    I always thought it would be cool to have a script that:

    • Runs at night and creates PAR2 files for the data on your HD.
    • Occasionally verifies file integrity against the PAR2 files.

    With a system like this, you wouldn't have to worry about throwing away old backups for fear that some random bit error might have crept into your newer backups. Also, if you back up the PAR2 files together with your data, as your backup media gradually degrades with time, you could rescue the data and move it to new media before it was too late.

    Of course, at the filesystem level there is always error correction, but having experienced the occasional bit error, I'd like the extra security that having a PAR2 file around would provide. Also, filesystem-level error correction tends to happen silently and not give you any warning until it fails and your data is gone. So a user-level, user-adjustable redundancy feature that's portable across filesystems and uses a standard file format like PAR would be really useful.
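A rough sketch of the nightly script described above, written in Python around the par2cmdline tool. It assumes a `par2` binary is on the PATH, that `-r<n>` sets the redundancy percentage, and that `par2 verify` exits with status 0 only when the files are intact; the function names and the 10% default are hypothetical choices, not anything taken from TFA.

```python
import subprocess
from pathlib import Path

def create_cmd(target, redundancy_pct=10):
    """Command line that writes <target>.par2 with the given redundancy (%)."""
    target = Path(target)
    return ["par2", "create", f"-r{redundancy_pct}", f"{target}.par2", str(target)]

def verify_cmd(target):
    """Command line that checks <target> against its existing .par2 file."""
    return ["par2", "verify", f"{Path(target)}.par2"]

def protect(target, redundancy_pct=10):
    """Create recovery files; raises CalledProcessError if par2 fails."""
    subprocess.run(create_cmd(target, redundancy_pct), check=True)

def is_intact(target):
    """True if par2 reports no damage (assumption: exit status 0 means intact)."""
    return subprocess.run(verify_cmd(target)).returncode == 0
```

A cron job could call `protect()` on new files and `is_intact()` on a rotating subset of old ones, flagging anything that fails so it can be repaired while the recovery data is still good.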

  • by Firehed (942385) on Monday August 04, 2008 @01:53AM (#24462853) Homepage

    From what I've read and heard, ZFS is designed to pretty much be the last filesystem we'll ever need. I'm pretty sure they've considered hash collisions with regards to data integrity.

    Also consider that you probably won't need to reconstruct the entire sector, but only a few bits from it. If there was some sort of insane scenario where you had to reconstruct a complete 1GB block from a single MD5 hash... (i.e., "here's an MD5 hash. Give me a sequence of 1073741824 bytes to make it") well it's technically possible, though the electric bill for your server farm may piss off more than a few treehuggers. On the other hand, if you had only a few bytes that needed repair, brute-force reconstruction, while still time-consuming, suddenly becomes much more feasible. I always wonder why I can't apply this kind of logic to torrents with that one file stuck at 99.98%...

    I'm sure that kind of thing is largely irrelevant with ZFS as it's designed to be somewhat more efficient, but you get the point.
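The "only a few bytes need repair" case can be sketched like this. It is a toy illustration under loud assumptions: exactly one byte is wrong, its position is unknown, and the MD5 of the good data is known. The function name `repair_single_byte` is hypothetical, and this is not how ZFS repairs anything.

```python
import hashlib

def repair_single_byte(damaged: bytes, expected_md5: str):
    """Try every one-byte substitution until the MD5 matches; None if none does."""
    if hashlib.md5(damaged).hexdigest() == expected_md5:
        return damaged  # nothing to fix
    buf = bytearray(damaged)
    for pos in range(len(buf)):
        original = buf[pos]
        for value in range(256):
            if value == original:
                continue
            buf[pos] = value
            if hashlib.md5(buf).hexdigest() == expected_md5:
                return bytes(buf)
        buf[pos] = original  # restore before trying the next position
    return None
```

The cost is about 255 hashes per byte of input, so it is tolerable for a sector and hopeless for a gigabyte, and it grows combinatorially with each extra damaged byte. A match is also only very probably the original data, since different inputs can share a hash.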

  • by femto (459605) on Monday August 04, 2008 @02:10AM (#24462939) Homepage
    Another view is that everything is a code in a noisy environment, so there is no way to talk about "the underlying device" as it itself is just another type of coding. Magnetic recording can be viewed as a way of encoding information onto the underlying (thermal) noisy matter. There is some very deep stuff happening in information theory. Let's take the empty universe as a noisy channel. Now every structure in the universe (including you and me) becomes information encoded over the empty universe. One gets the feeling that any "ultimate theory" won't be expressed in terms of forces and fields but some underlying, unifying, concept of information.
  • by billcopc (196330) <vrillco@yahoo.com> on Monday August 04, 2008 @05:11AM (#24463815) Homepage

    Reed-Solomon is ancient compared to par2.

    No, you're dumb. Par2 IS Reed-Solomon. Silly me to expect an AC to fact-check the most trivial subjects of a post.

    The procedure explained in TFA is basically adapting a different tool to behave more or less like single-file par2. That makes it redundant (in the /. sense, not the data-recovery sense).

    There is one thing I would love to see, and that's local disk checksumming. That's right, take a 500gb disk, chop it into slices and do RAID-5 on them as if they were individual spindles. It's been years since I've had a hard drive actually die on me, but I've seen bit-errors more often than I'd like. Having self-checking built into the filesystem (or low-level disk access) would help ensure 100% data integrity, and you could still do RAID-1 on top of it for safety.
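The slice idea boils down to XOR parity, sketched below in Python. This is a simplification under stated assumptions: the parity sits in a fixed last slot (closer to RAID-4; real RAID-5 rotates it across members), all slices are the same length, and the function names are hypothetical.

```python
from functools import reduce

def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def add_parity(slices):
    """Return the data slices followed by one parity slice."""
    return list(slices) + [xor_blocks(slices)]

def rebuild(stripe, lost_index):
    """Recompute the slice at lost_index by XOR-ing all the surviving slices."""
    survivors = [s for i, s in enumerate(stripe) if i != lost_index]
    return xor_blocks(survivors)
```

Note that parity alone can only rebuild a slice you already know is bad. To catch silent bit errors you would also need a per-slice checksum to locate the damaged slice, which is the error-versus-erasure distinction again.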

  • by MagdJTK (1275470) on Monday August 04, 2008 @08:49AM (#24465075)
    Indeed. In fact, the code used for CDs can cope with 4000 consecutive bits being unreadable. Quite remarkable!
  • by Wowsers (1151731) on Monday August 04, 2008 @08:53AM (#24465137) Journal

    I loved my DAT (for audio) portable recorder. It employed double Reed-Solomon error correction; you would have to do some serious hammering to the side of the recorder to get the tape to "skip" in a way the error correction could not correct and you'd hear it drop out. Running and recording was NOT out of the question though.

    Now what do the consumers have for recorders - cr*ppy, cheap, nasty, low bitrate, overcompressed MP3 recorders. The recording industry killed off an excellent (but expensive) format to palm off rubbish compressed audio to the masses. (Proper PCM recorders are no different in price to the DAT decks).

