Data Storage Biotech Science

Researchers Achieve Storage Density of 2.2 Petabytes Per Gram of DNA

Posted by Soulskill
from the nature-is-much-smarter-than-we-are dept.
SternisheFan sends news of researchers who encoded an MP3, a PDF, a JPG, and a TXT file into DNA, along with another file that explains the encoding. The researchers estimate the storage density of this technique at 2.2 petabytes per gram (abstract). "We knew we needed to make a code using only short strings of DNA, and to do it in such a way that creating a run of the same letter would be impossible. So we figured, let's break up the code into lots of overlapping fragments going in both directions, with indexing information showing where each fragment belongs in the overall code, and make a coding scheme that doesn't allow repeats. That way, you would have to have the same error on four different fragments for it to fail – and that would be very rare," said one of the study's authors. "We've created a code that's error tolerant using a molecular form we know will last in the right conditions for 10 000 years, or possibly longer," said another.
  • by ShanghaiBill (739463) * on Wednesday January 23, 2013 @05:12PM (#42673513)

    Each DNA nucleotide has a molecular weight of about 150. So a gram of DNA should contain about 6e23/150 = 4e21 bases. At two bits per base, that is 8e21 bits, or 1e21 bytes. These guys are getting 2e15. So, in theory, they are getting about two millionths of the potential storage, or 0.0002%.

  • by Anonymous Coward on Wednesday January 23, 2013 @05:13PM (#42673523)

    Huge latency and low bandwidth. From the abstract:

    DNA-based storage scheme could be scaled far beyond current global information volumes and offers a realistic technology for large-scale, long-term and infrequently accessed digital archiving

  • by tragedy (27079) on Wednesday January 23, 2013 @05:22PM (#42673631)

    Well, this smbc comic [smbc-comics.com] addresses that, except that it's stored in bacterial DNA.

  • by reverseengineer (580922) on Wednesday January 23, 2013 @06:55PM (#42674745)

    These are artificial DNA oligos, so there shouldn't be any of those sorts of modifications. However, a figure of MW 150 per base leaves out the sugar-phosphate backbone, and doesn't account for this being double-stranded DNA. Molecular weight per base pair should be around 700 g/mol.

    Of course, that's really nitpicking. What really accounts for the low ratio of achieved versus theoretical is that they made "~1.2x10^7 copies of each DNA string."

    They go on to explain in the supplementary materials that "With the latest platform, up to 244,000 unique sequences are synthesized in parallel and delivered as ~1-10 pmol pools of oligos... In our experiment, three runs were used to synthesize 153,335 designs, leading to the higher figure of ~12-120x10^6 (= 3-30 x 10^-12 x6.02x10^23/153,335)." A more accurate assessment of their coding scheme is that they used 153,335 strings of 117 nucleotides (17,940,195 total) to encode 5,165,800 bits of Shannon information, or about 0.29 bits per nucleotide.

    The fact they made ten million copies of each string is more of a current technical limitation of DNA oligo synthesis and automated DNA sequencing than a limit on the efficiency of the encoding itself. With the appropriate technology, you could make a few thousand copies (for appropriate error correction) instead of ten million, and your mass of DNA would be in the femtograms instead of hundreds of picograms.
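
For anyone who wants to check the arithmetic in the comments above, here is a minimal Python sketch. It only recomputes figures quoted in the thread (MW of about 150 per base, two bits per base, 153,335 strings of 117 nucleotides, 5,165,800 bits, a 3 pmol pool). The function names are made up for illustration and are not from the paper, and the physical model is deliberately crude.

```python
# Back-of-envelope checks for the numbers quoted in the thread.
# All function names are hypothetical; inputs are the thread's own figures.

AVOGADRO = 6.022e23  # molecules per mole


def theoretical_bytes_per_gram(mw_per_base, bits_per_base=2.0):
    """Upper bound on bytes per gram if every base stored bits_per_base bits."""
    bases_per_gram = AVOGADRO / mw_per_base
    return bases_per_gram * bits_per_base / 8.0


def bits_per_nucleotide(n_strings, string_len, info_bits):
    """Information actually encoded per synthesized nucleotide."""
    return info_bits / (n_strings * string_len)


def copies_per_design(pool_mol, n_designs):
    """Average number of physical copies of each design in a pooled synthesis."""
    return pool_mol * AVOGADRO / n_designs


# Theoretical ceiling with the thread's rough MW of 150 per base.
ceiling = theoretical_bytes_per_gram(150)
# Fraction of that ceiling reached at the rounded 2e15 bytes per gram.
fraction = 2e15 / ceiling
# Coding efficiency from the supplementary-material figures.
efficiency = bits_per_nucleotide(153_335, 117, 5_165_800)
# Copies per design at the low end of the quoted 3-30 pmol range.
copies = copies_per_design(3e-12, 153_335)

print(f"ceiling:    {ceiling:.2e} bytes/gram")
print(f"fraction:   {fraction:.1e}")
print(f"efficiency: {efficiency:.2f} bits/nucleotide")
print(f"copies:     {copies:.2e} per design")
```

Passing a larger molecular weight (for example, the roughly 700 g/mol per base pair suggested above) lowers the ceiling proportionally, but the dominant factor is still the number of copies per design.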
