Researchers Achieve Storage Density of 2.2 Petabytes Per Gram of DNA
A reader sends news of researchers who encoded an MP3, a PDF, a JPG, and a TXT file into DNA, along with another file that explains the encoding. The researchers estimate the storage density of this technique at 2.2 petabytes per gram (abstract). "We knew we needed to make a code using only short strings of DNA, and to do it in such a way that creating a run of the same letter would be impossible. So we figured, let's break up the code into lots of overlapping fragments going in both directions, with indexing information showing where each fragment belongs in the overall code, and make a coding scheme that doesn't allow repeats. That way, you would have to have the same error on four different fragments for it to fail – and that would be very rare," said one of the study's authors. "We've created a code that's error tolerant using a molecular form we know will last in the right conditions for 10 000 years, or possibly longer," said another.
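The scheme described in the quote (no repeated letters, short overlapping fragments in both directions, an index per fragment) can be sketched roughly as follows. This is a minimal illustration, not the researchers' actual code: the ternary mapping, the six-trits-per-byte packing, the fragment length of 100 with a step of 25, and all function names are assumptions made for the example.

```python
# Illustrative sketch only; parameters and names are assumptions, not the study's code.
BASES = "ACGT"
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, read in the opposite direction."""
    return seq.translate(COMPLEMENT)[::-1]


def bytes_to_trits(data: bytes) -> list[int]:
    """Write each byte as six base-3 digits (3**6 = 729 covers 0..255)."""
    trits = []
    for byte in data:
        digits = []
        for _ in range(6):
            digits.append(byte % 3)
            byte //= 3
        trits.extend(reversed(digits))
    return trits


def trits_to_bytes(trits: list[int]) -> bytes:
    """Inverse of bytes_to_trits."""
    out = bytearray()
    for i in range(0, len(trits), 6):
        value = 0
        for t in trits[i:i + 6]:
            value = value * 3 + t
        out.append(value)
    return bytes(out)


def encode(data: bytes, start: str = "A") -> str:
    """Each trit picks one of the three bases that differ from the previous
    base, so the same letter can never appear twice in a row."""
    prev, out = start, []
    for t in bytes_to_trits(data):
        choices = [b for b in BASES if b != prev]
        prev = choices[t]
        out.append(prev)
    return "".join(out)


def decode(dna: str, start: str = "A") -> bytes:
    """Recover the trits by asking which of the three allowed bases was used."""
    prev, trits = start, []
    for base in dna:
        choices = [b for b in BASES if b != prev]
        trits.append(choices.index(base))
        prev = base
    return trits_to_bytes(trits)


def fragments(dna: str, length: int = 100, step: int = 25) -> list[tuple[int, str]]:
    """Cut the sequence into overlapping, indexed pieces. With step = length / 4
    most bases land in four fragments. Every other fragment is stored as its
    reverse complement ("going in both directions"). The last piece may be
    shorter than `length`; a real scheme would pad it."""
    frags, pos, index = [], 0, 0
    while True:
        piece = dna[pos:pos + length]
        if index % 2:
            piece = reverse_complement(piece)
        frags.append((index, piece))
        if pos + length >= len(dna):
            break
        pos += step
        index += 1
    return frags
```

With these assumed parameters a single base error has to show up in all four fragments covering that position before the majority read is wrong, which is the failure case the quote calls very rare.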
Re:Latency and bandwidth? (Score:5, Interesting)
Not if it is for archival purposes, like Amazon storage.
Re:Latency and bandwidth? (Score:4, Interesting)
It's not useless. One interesting part is how long it holds up in storage. There isn't any effective storage medium available today that lasts for 10k+ years. Another is how high the information density is.
Re:Please use a real unit of measure (Score:5, Interesting)
Major challenge: Retrieval and storage (Score:4, Interesting)
Okay, storing is "solved". How about retrieval? Especially random-access retrieval that is simple and fast (relatively speaking), so that such a storage medium is practical? Certainly not DNA sequencing, which can take weeks to complete.
The second problem: DNA denatures and fragments at room temperature. Certainly a -80 °C lab freezer for storage wouldn't be practical.
Third problem: DNA secondary and tertiary structure. The coding scheme must also solve the problem of DNA's tendency to form secondary structures (like hairpins) or tertiary structures (like supercoils) that can hamper reading of, or access to, the information. I think this is the reason why the scheme uses short sequences. But short DNA sequences like the ones proposed (~100 bp, from the figure) could still form such structures.
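To make the hairpin concern concrete, here is a rough sketch of how one might screen a short fragment for stretches that could fold back and pair with themselves. It only looks for exact reverse-complement matches; the stem length of 6 and minimum loop of 3 are arbitrary assumptions, the function names are made up for this example, and a real check would use a thermodynamic folding model instead.

```python
# Naive hairpin screen; thresholds and names are illustrative assumptions.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the sequence that would base-pair with `seq`."""
    return seq.translate(COMPLEMENT)[::-1]


def hairpin_candidates(seq: str, stem: int = 6, min_loop: int = 3) -> list[tuple[int, int]]:
    """Return (i, j) pairs where seq[i:i+stem] could pair with seq[j:j+stem],
    with at least `min_loop` unpaired bases between the two arms."""
    hits = []
    for i in range(len(seq) - 2 * stem - min_loop + 1):
        target = reverse_complement(seq[i:i + stem])
        j = seq.find(target, i + stem + min_loop)
        while j != -1:
            hits.append((i, j))
            j = seq.find(target, j + 1)
    return hits
```

A fragment that returns an empty list here is not guaranteed to stay unfolded, since imperfect stems can also pair, but any hit is a sign the fragment may be harder to read.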