What Happens After Throughput to DNA Storage Drives Surpasses 2 Gbps? (ieee.org) 35
High-capacity DNA data storage "is closer than you think," Slashdot wrote in 2019.
Now IEEE Spectrum brings an update on where we're at — and where we're headed — by a participant in the DNA storage collaboration between Microsoft and the Molecular Information Systems Lab of the Paul G. Allen School of Computer Science and Engineering at the University of Washington. "Organizations around the world are already taking the first steps toward building a DNA drive that can both write and read DNA data," while "funding agencies in the United States, Europe, and Asia are investing in the technology stack required to field commercially relevant devices." The challenging part is learning how to get the information into, and back out of, the molecule in an economically viable way... For a DNA drive to compete with today's archival tape drives, it must be able to write about 2 gigabits per second, which at demonstrated DNA data storage densities is about 2 billion bases per second. To put that in context, I estimate that the total global market for synthetic DNA today is no more than about 10 terabases per year, which is the equivalent of about 300,000 bases per second over a year. The entire DNA synthesis industry would need to grow by approximately 4 orders of magnitude just to compete with a single tape drive. Keeping up with the total global demand for storage would require another 8 orders of magnitude of improvement by 2030. But humans have done this kind of scaling up before. Exponential growth in silicon-based technology is how we wound up producing so much data. Similar exponential growth will be fundamental in the transition to DNA storage...
Companies like DNA Script and Molecular Assemblies are commercializing automated systems that use enzymes to synthesize DNA. These techniques are replacing traditional chemical DNA synthesis for some applications in the biotechnology industry... [I]t won't be long before we can combine the two technologies into one functional device: a semiconductor chip that converts digital signals into chemical states (for example, changes in pH), and an enzymatic system that responds to those chemical states by adding specific, individual bases to build a strand of synthetic DNA. The University of Washington and Microsoft team, collaborating with the enzymatic synthesis company Ansa Biotechnologies, recently took the first step toward this device... The path is relatively clear; building a commercially relevant DNA drive is simply a matter of time and money...
At the same time, advances in DNA synthesis for DNA storage will increase access to DNA for other uses, notably in the biotechnology industry, and will thereby expand capabilities to reprogram life. Somewhere down the road, when a DNA drive achieves a throughput of 2 gigabases per second (or 120 gigabases per minute), this box could synthesize the equivalent of about 20 complete human genomes per minute. And when humans combine our improving knowledge of how to construct a genome with access to effectively free synthetic DNA, we will enter a very different world... We'll be able to design microbes to produce chemicals and drugs, as well as plants that can fend off pests or sequester minerals from the environment, such as arsenic, carbon, or gold. At 2 gigabases per second, constructing biological countermeasures against novel pathogens will take a matter of minutes. But so too will constructing the genomes of novel pathogens. Indeed, this flow of information back and forth between the digital and the biological will mean that every security concern from the world of IT will also be introduced into the world of biology...
The future will be built not from DNA as we find it, but from DNA as we will write it.
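The throughput figures quoted above can be checked with a few lines of arithmetic. This is only a sketch: the market size and drive rate are taken from the excerpt, while the genome size of roughly 6 billion bases (a diploid human genome) is an assumption chosen because it matches the "about 20 genomes per minute" figure.

```python
import math

SECONDS_PER_YEAR = 365 * 24 * 3600  # 31,536,000 seconds

# Figures quoted in the article excerpt.
market_bases_per_year = 10e12   # "about 10 terabases per year"
drive_bases_per_second = 2e9    # "about 2 billion bases per second"

# Assumption (not stated in the excerpt): a diploid human genome is ~6 billion bases.
human_genome_bases = 6e9

# Today's whole synthesis market, expressed as a sustained write rate.
market_bases_per_second = market_bases_per_year / SECONDS_PER_YEAR

# How far the whole market is from a single tape-class DNA drive.
gap = drive_bases_per_second / market_bases_per_second
orders_of_magnitude = math.log10(gap)

# Genome-equivalents such a drive could write per minute.
genomes_per_minute = drive_bases_per_second * 60 / human_genome_bases

print(f"market rate: {market_bases_per_second:,.0f} bases/s")
print(f"gap to one drive: {gap:,.0f}x (~{orders_of_magnitude:.1f} orders of magnitude)")
print(f"genomes per minute: {genomes_per_minute:.0f}")
```

Under these assumptions the market rate comes out a little above 300,000 bases per second and the gap is a bit under four orders of magnitude, which is consistent with the article's rounding.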
The article makes an interesting point — that biology labs around the world already order chemically-synthesized ssDNA, "delivered in lengths of up to several hundred bases," and sequence DNA molecules up to thousands of bases in length.
"In other words, we already convert digital information to and from DNA, but generally using only sequences that make sense in terms of biology."
More ominous question (Score:2)
If we start down the route of using DNA in mainstream computing devices, what sort of new vectors for attacking people is that going to open up? I don't think there would be many with just a hard drive or usb stick sort of device, but if we start doing implanted tech I think this is seriously asking for trouble because it's very rare that theoreticians are right about what engineers and hackers end up being able to do unless they're approaching theory from a pessimistic perspective.
Re: (Score:2)
Re: More ominous question (Score:5, Interesting)
Probably none. Even if you could write a single DNA strand and somehow get that DNA into a human, it would be nothing compared to the DNA and RNA we encounter/eat on a daily basis.
The primary problem with this tech is going to be read back and longevity.
Right now, to "decode" DNA we need a big ass machine that costs an order of magnitude more than a server with drives. More often, a single 2U server with drives can maintain all the output from the above machine for a few months (millions of bases per hour).
Organic storage has already been tried (writeable CD and DVD), and we now know that even with the best sealing technology we can commercialize, we get at best 10 years out of it and, more often than not, a lot less. Most of your own DNA gets refreshed every few weeks/months; keeping up with that rate of production commercially is going to be difficult unless they co-locate the production facility. That means practically unlimited storage forever (as long as you give it food and power), and thus you will sell one machine every time a major advancement is made (like tape, hence it is very expensive for smaller deployments).
Re: (Score:2)
Re: (Score:1)
Totally different thing. The DNA is not complete in any of those cases and fossil "DNA" is reconstructed based on DNA from modern animals and trying to map it into fragments of what we have.
It's like reconstructing the Bible from ancient papyrus, if you just took those samples, the result would be practically unreadable. You add onto the text what you already know from other texts to fill in the gaps. And sure you could do that in this case as well (and you probably would have to RAI-DNA) but the older you
Re: (Score:2)
Why don't we pick another molecule which is similar to DNA and not biologically active?
Re: (Score:2)
> Would there be some actual advantage to using DNA storage versus the traditional kinds we all know and loathe?
I wonder about that, and how efficient and reliable the process is.
"Optimizing DNA Stability for DNA Data Storage" - https://www.biorxiv.org/conten... [biorxiv.org]
Re: (Score:2)
None, actually.
DNA and ma
DMCA Attack (Score:2)
> what sort of new vectors for attacking people is that going to open up?
Probably legal ones since now your data will be able to copy itself so expect the copyright lawyers to come calling!
A virus infection ... (Score:3)
of a DNA based storage system could really be something interesting!
Alternatively.. (Score:2)
No. (Score:1)
"No" happens, as always with Betteridge...
So what's the point in doing this? (Score:2)
Would there be some actual advantage to using DNA storage versus the traditional kinds we all know and loathe?
And let's qualify that with "... and can conceivably be brought down to a comparable cost?"
Re: (Score:2)
Data density for one. One estimate for DNA storage has been put around 200 PB per gramme.
My question is "why DNA?".
Is there a particular reason why we are looking at DNA?
Is it simply because it's evolved to be a stable (comparatively) way to store information and we've already been experimenting in ways to read and write it for decades, or are there other, simpler, candidate molecules we could be looking at?
As to cost, you only need look at how much easier and cheaper it is to sequence a strand of DNA now c
I am really tired of this nonsense (Score:4, Interesting)
No, this is not a valid storage option. No, this is not "closer than I think" (which is "certainly not in the next 30 years"). And no, I do not want to hear about it.
Get me all those never materialized "holographic storage" first. That was at least halfway believable.
Re: (Score:1)
Crystalline atomic storage seems like the best solution. Not only is it dense but it's durable. DNA storage isn't durable at all and in fact degrades really fast.
The problem with crystalline storage is that it requires more energy than we can currently create. So uh, bring on the fusion power!
Re: (Score:1)
Here you go [walmart.com]
Re: (Score:2)
Most people don't understand this. I'd give you mod points if I currently had them. I'd have gone with BDs, though. ;)
makin' money (Score:2)
So will this amazing technology be used to advance mankind ?
or just make more money ?????
Stable storage (Score:2)
Re: (Score:2)
> How do we ensure storage is stable?
You don't. You use ECC and read multiple strands. In fact, with a lot of DNA sequencing methods you have to do this anyway, because they only grab fragments of a longer chain and have to be pieced together by matching the ends up where the sequences overlap.
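To make the "read multiple strands" idea concrete, here is a minimal sketch of a per-position majority vote across several noisy reads of the same strand. It assumes the reads are already aligned and of equal length (substitution errors only); real pipelines also have to handle insertions and deletions and typically layer a proper error-correcting code on top. The function name `consensus` is hypothetical.

```python
from collections import Counter


def consensus(reads):
    """Return the per-position majority base across aligned, equal-length reads."""
    if not reads:
        raise ValueError("need at least one read")
    length = len(reads[0])
    if any(len(read) != length for read in reads):
        raise ValueError("this sketch assumes all reads have the same length")
    # zip(*reads) yields one column of bases per strand position;
    # the most common base in each column wins the vote.
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*reads))


# Five reads of the same strand, three of them with a single substitution error.
reads = ["ACGTAC", "ACGTAC", "ACCTAC", "ACGTAA", "TCGTAC"]
recovered = consensus(reads)
print(recovered)
```

Each error here lands in a different position, so every column still has a clear majority and the original strand is recovered.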
No sale (Score:2)
Re: (Score:2)
Re: (Score:2)
Yes and no. If you’re considering the rate of single strand breaks, then yes. But you aren’t going to store data this way because at the resolution of a single nucleotide you would only have 2 bits (one of four bases), so you need a byte encoding method that uses a stretch of nucleotides, and if you do that correctly you can include some sort of ECC. Also, if you look at living systems as models there are several additional redundancies. The first is double-stranded DNA, which uses complementarity to fix many error
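Since each nucleotide is one of four bases, it can carry at most two bits, so a byte needs a stretch of four nucleotides. The sketch below shows the simplest such byte encoding. The mapping 00=A, 01=C, 10=G, 11=T is an arbitrary assumption for illustration, and the scheme includes no ECC and no constraints on the sequence (such as avoiding long runs of the same base), which a practical encoding would presumably need.

```python
BASES = "ACGT"  # assumed mapping: 00 -> A, 01 -> C, 10 -> G, 11 -> T


def encode(data: bytes) -> str:
    """Encode each byte as four bases, most significant bit pair first."""
    return "".join(
        BASES[(byte >> shift) & 0b11]
        for byte in data
        for shift in (6, 4, 2, 0)
    )


def decode(strand: str) -> bytes:
    """Invert encode(): every group of four bases becomes one byte."""
    if len(strand) % 4 != 0:
        raise ValueError("strand length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            # str.index raises ValueError on anything other than A, C, G, T.
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


strand = encode(b"Hi")
print(strand)
print(decode(strand))
```

A scheme with ECC would add redundant bases on top of this, trading some density for the ability to repair damaged or misread positions.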
Hahahahaha (Score:2)
I already own a few so called "DNA printers" .. these are nowhere near gigabase per second.. it takes MINUTES to do ONE BASE. So good luck with that, you idiots. We are about a century from being able to print megabits per second .. let alone gigabits. Note .. printing an arbitrary sequence is different than copying. We can use an enzyme to copy DNA at kilobit per second per copy rates per strand (and it can be massively parallel). As in you can have trillions of copies after 45 minutes.
Re: (Score:2)
Re: (Score:2)
Re: (Score:2)
What? (Score:2)