
Pocket-Sized DNA Reader Used To Scan Entire Human Genome Sequence (arstechnica.com) 76

An anonymous reader quotes a report from Ars Technica: A few years back, a company called Oxford Nanopore announced it was developing a radically different way of sequencing DNA. Its approach involved taking single strands of the double helix and stuffing them through a protein pore. With a small bit of current flowing across the pore, the four bases of DNA each created a distinct (if tiny) change in the voltage as it passed through. These could be used to read the DNA one base at a time as it wiggled through the pore. After several years of slow progress, Oxford Nanopore announced that its sequencing hardware would be as distinctive as its wetware: a USB device that could fit comfortably in a person's hand. As the first devices went out to users, it became clear that the device had some pros and cons. On the plus side, the device was quick and could be used without requiring a large facility to support it. It could also read very long stretches of DNA at once. But the downside was significant: it made lots of mistakes.

With a few years of experience, people are now starting to learn to make the most of the devices, as demonstrated by a new paper in which researchers use it to help sequence a human genome. By using the machine's long reads -- in one case, nearly 900,000 bases from one DNA molecule -- the authors were able to get data out of areas of the human genome that resisted characterization before. And they were able to distinguish between the two sets of chromosomes (one from mom, one from dad) and locate areas of epigenetic control in many areas of the genome. In light of all the distinct information it can provide, the machine's error rate is seeming like less of a problem.

This discussion has been archived. No new comments can be posted.

  • Just remember it was a cautionary tale and NOT an operations manual.

    • by Gravis Zero ( 934156 ) on Tuesday January 30, 2018 @10:42PM (#56037961)

      Just remember it was a cautionary tale and NOT an operations manual.

      Don't be ridiculous! I mean, everyone knows that 1984 is the real instruction manual. ;)

      • Re: (Score:1, Troll)

        by sehlat ( 180760 )

        Don't be ridiculous! I mean, everyone knows that 1984 is the real instruction manual. ;)

        No, it isn't. Orwell was an optimist.

    • our brains already do the gross DNA analysis with sexism, racism and stereotypes, this is just a fine tuning.

    • This reminds me of the Human Genome Project. After a few years of trying to get funding for a fifteen year project to sequence the entire human genome, the Reagan administration allocated $3 billion to get started. It was "finished" 13 years later. Now this iPhone doohickey does it in seconds or minutes.

  • Easy Fix (Score:2, Insightful)

    by Anonymous Coward

    Just do multiple passes and match the commonalities. Should be an easy way to sort out the errors and make it much more accurate

    • Re:Easy Fix (Score:5, Informative)

      by Pseudonym ( 62607 ) on Tuesday January 30, 2018 @11:28PM (#56038133)

      That's what we do now with short reads. It kind of works, but only because we understand in a lot of detail how errors happen.

      For example, 454 sequencing tends to get the number of nucleotides in a repeat sequence wrong. So, for example, CTAAAGT might be read as CTAAAAGT. Illumina sequencing doesn't have that problem, but tends to degrade along the length of the read. So the last few nucleotides are more likely to be wrong than the first few.
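The homopolymer miscount described above can be illustrated with a small sketch. One common way to make comparisons tolerant of this error class is to collapse every run of identical bases to a single base before comparing. The function names here are illustrative and not taken from any particular assembler.

```python
from itertools import groupby


def compress_homopolymers(seq: str) -> str:
    """Collapse each run of identical bases to a single base."""
    return "".join(base for base, _run in groupby(seq))


def run_lengths(seq: str) -> list:
    """Return (base, run length) pairs, e.g. to see where two reads disagree."""
    return [(base, len(list(run))) for base, run in groupby(seq)]


true_read = "CTAAAGT"    # example from the comment above
miscounted = "CTAAAAGT"  # same read with one extra A in the repeat

# The raw reads differ, but their homopolymer-compressed forms agree.
same_after_compression = (
    compress_homopolymers(true_read) == compress_homopolymers(miscounted)
)
```

The trade-off is that compression also discards real differences in repeat length, so it suits matching reads to each other better than calling the final sequence.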

      And this is just read errors; with short-read sequencing, there are also PCR amplification errors, which is why we think nanopore sequencing will do better. When you start "unwinding" a chromosome, the parts that you unwind first tend to get amplified more than the parts that you unwind nearer to the end. Some sequences are amplified more than others for chemical reasons, and the relative error might depend on the specific revision of reagent chemicals.

      We don't really understand enough about nanopore sequencing to be able to develop appropriate algorithms to match long-read sequences together. We don't even know what the right number of multiple passes is yet. And that's important, because genomics and transcriptomics are important, but the bigger issue for researchers is economics.

      • Isn't the Illumina problem fixed by paired end reads on the rather short fragments?

        • Yes and no. Paired end reads give you either longer reads or longer range information. The problem isn't fixed because as the technology gets better we just push up the read length.

      • With machine learning, you can theoretically use a known set of good DNA reads to determine what needs adjustment. That, of course, requires a human to train the machine learning algorithm to better-interpret the data and learn properly. It also requires a lot of manual setup reading and rereading known DNA, as well as making adjustments to the hardware to decrease its error rate as you discover particular error conditions for which you can correct directly.

        Even with all the manual work involved, it's g

    • "Just do multiple passes and match the commonalities. Should be an easy way to sort out the errors and make it much more accurate"

      Just like an idiot calculating stuff: make him do it multiple times and he'll be a stable genius.

  • in finding out what kinds of DNA is in my pocket,
    • Your own, if you play "pocket pool"

    • by gringer ( 252588 )

      You can find out yourself for the low, low price of $1000 USD. ... or wait a few months for SmidION to come out, which will be a bit cheaper, and plug into your iPhone or Android device.

      • by mentil ( 1748130 )

        Can't wait for Apple/Google to have my sequenced DNA information... what could possibly go wrong?!

    • String or nothing!

  • Just need to ID marker DNA sequences not the whole thing.

    Scanning for Flu. Searching for H1N1, Negative, H1N2 Negative........... H2N3 POSITIVE! Confirmation Scan? Y/N?
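A minimal sketch of the marker-scan idea, assuming each target is identified by a short marker sequence. The marker names and sequences below are made-up placeholders, not real viral sequences. Exact substring matching is also a simplification: with error-prone reads, a real pipeline would need approximate matching or alignment.

```python
def reverse_complement(seq: str) -> str:
    """Reverse-complement a DNA string over the alphabet A, C, G, T."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_markers(read: str, markers: dict) -> list:
    """Names of markers found exactly in the read, on either strand."""
    hits = []
    for name, marker in markers.items():
        if marker in read or reverse_complement(marker) in read:
            hits.append(name)
    return hits


# Hypothetical markers, for illustration only.
markers = {"marker_A": "GATTACAGG", "marker_B": "CCGTTAGCA"}

forward_hit = find_markers("TTTGATTACAGGTTT", markers)  # marker_A, forward strand
reverse_hit = find_markers("AATGCTAACGGAA", markers)    # marker_B, reverse strand
no_hit = find_markers("AAAAAAAAAAAA", markers)
```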

  • by wisebabo ( 638845 ) on Tuesday January 30, 2018 @11:02PM (#56038021) Journal

    ... and it (kinda) works as advertised. It is also VERY low cost (compared to the previous generation of sequencing machines which cost 700K and up, it costs about $1K). The main disadvantages are that 1) it's still inaccurate, maybe only in the ~90% accuracy rate (not a good thing when you're reading 3B base pairs) and 2) the reagents and flow cell used are expensive (so on big jobs you're almost better off using a traditional sequencer). Still, it does do LONG reads which gets over one of the big disadvantages of the previous gen. machines.

    Even with a high error rate, if the errors are UNBIASED then you can overcome them by simply sequencing the same area over and over again to come up with a consensus. This is called "coverage" and usually a factor of 10X is used but if the sequencing technology is cheap enough why not do it 30X or 100X or more?
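To put rough numbers on the consensus argument: at about 90% per-base accuracy, a single pass over 3 billion bases would be expected to contain on the order of 300 million errors. The sketch below assumes errors are independent between reads and that the reads are already aligned and of equal length. Neither assumption holds exactly in practice, and the thread notes that some nanopore errors are systematic. It shows a per-column majority vote and a binomial upper bound on how often at least half the reads are wrong at one position.

```python
from collections import Counter
from math import comb


def majority_consensus(aligned_reads: list) -> str:
    """Per-column majority vote over equal-length, already-aligned reads."""
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*aligned_reads)
    )


def prob_half_or_more_wrong(coverage: int, per_base_error: float) -> float:
    """Binomial tail: chance that at least half of the reads err at a position.

    Assumes independent errors. This is an upper bound on the chance that the
    vote fails, because wrong reads need not agree on the same wrong base.
    """
    k_min = (coverage + 1) // 2  # smallest count that is at least half
    return sum(
        comb(coverage, k)
        * per_base_error**k
        * (1 - per_base_error) ** (coverage - k)
        for k in range(k_min, coverage + 1)
    )


# Three noisy copies of the same stretch; each error is outvoted.
consensus = majority_consensus(["ACGTAC", "ACGTTC", "ACCTAC"])

# The bound shrinks quickly as coverage grows (10% per-base error assumed).
bounds = {n: prob_half_or_more_wrong(n, 0.10) for n in (1, 10, 30, 100)}
```

If errors are systematic rather than unbiased, the same wrong base recurs across reads and extra coverage does not help, which is why the question of correlated errors matters.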

    For us citizen scientists, you'll still need a way of processing and purifying your DNA; I'm trying to get a Bento Lab (hopefully shipping in a month or two). Also the technology will hopefully get better and better; the next version will supposedly have the nanopore membrane separate from the flow cell so the whole thing won't have to be replaced when the membrane is used up. (The version after THAT will supposedly be a tiny device directly attachable to an iPhone with an even tinier replaceable membrane so maybe it'll become really cheap to sequence DNA; at parties even :). Finally, I think they may be moving to freeze dried or otherwise non-perishable reagents so the storage requirements will become a little easier (I have a dedicated battery backed freezer at home).

    Now with CRISPR kits for only $40, there's no end to the fun (and disasters) that we can do with our basement genetic experiments!

    I should mention you'll need a little lab experience and know how to use a pipette and have steady hands! Go take some courses at the local community college and you'll be good to go. (Of course in order to interpret your results you'll need to study BioInformatics, my specialty :)

    • by Anonymous Coward

      I work for a major company pursuing orders of magnitude synthesis and sequencing more than just about anybody else in the world. We have a bunch of these things in addition to the more traditional sequencers. They fit our long read pipeline very nicely but I'd hesitate to use them on their own.

    • by Pseudonym ( 62607 ) on Tuesday January 30, 2018 @11:38PM (#56038169)

      it's still inaccurate, maybe only in the ~90% accuracy rate (not a good thing when you're reading 3B base pairs)

      Former de novo assembly software writer here. Do we have a good handle on the kinds of errors that you tend to find? You know how 454 reads tend to miscount repeat sequences and Illumina tends to decline in quality along the read. Do we understand where the errors come from?

      Also, are the errors correlated? If you try to sequence the same 500k read twice, will it make errors in the same places?

      • by Cyberax ( 705495 )
        Former Illumina Long Read kit developer here. Nanopore reads are lousy with repeated nucleotides (to be fair, our kit also kinda was because of multiple PCR cycles needed) and it does have a certain GC bias.

        Right now regular Illumina short reads with a little bit of long reads are enough to get phased SNP information from most relevant parts of a human genome. It's also cheaper at scale, human genome sequencing at 3x can be done for less than $500.

        With de-novo assembly it's a bit different. Nanopores pr
        • Right. So for de novo (as noted, that was my field) it seems to me that the best approach might be to build and clean up a de Bruijn graph from short reads, and then align long reads to the graph to get contigs.

            • I would do it the other way - align short reads on contigs made from long reads. No need for de Bruijn graphs; simple OLC (overlap layout consensus) is sufficient.
            • The benefit of doing it the other way is you can use existing efficient graph cleanup algorithms like tour bus.

              It will be interesting.

              • by gringer ( 252588 )

                The benefit of creating scaffolds first from long reads is that it's a lot easier to capture regions where there is a Very-long Complex Tandem Repeat (VeCTR). These regions are collapsed in scaffolds assembled from short reads.

                • That case would still work because CTRs correspond to a loop in the de Bruijn graph. The theory is that all true contigs are paths in the graph, and you can use the long reads to find each one.

                  But I agree that you could do it either way and we don't know which one would be better until we have more experience.
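The point that a tandem repeat shows up as a loop in the de Bruijn graph can be sketched on toy data. This is a minimal illustration, assuming error-free reads and a tiny k. Real assemblers add error cleanup and much larger k, and the reads below are invented.

```python
from collections import defaultdict


def de_bruijn(reads: list, k: int) -> dict:
    """Nodes are (k-1)-mers; every k-mer in a read adds an edge prefix -> suffix."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def has_cycle(graph: dict) -> bool:
    """Depth-first search for a directed cycle (a node reached while still open)."""
    WHITE, GREY, BLACK = 0, 1, 2
    state = defaultdict(int)  # every node starts WHITE

    def visit(node: str) -> bool:
        state[node] = GREY
        for nxt in graph.get(node, ()):
            if state[nxt] == GREY:
                return True
            if state[nxt] == WHITE and visit(nxt):
                return True
        state[node] = BLACK
        return False

    return any(state[node] == WHITE and visit(node) for node in list(graph))


# Short reads drawn from a sequence containing the tandem repeat ACGACGACG.
repeat_graph = de_bruijn(["TTACGACG", "ACGACGGG"], k=4)
# Short reads drawn from a sequence with no repeated 3-mer.
plain_graph = de_bruijn(["TTACGGA", "CGGATC"], k=4)

repeat_has_loop = has_cycle(repeat_graph)  # ACG -> CGA -> GAC -> ACG
plain_has_loop = has_cycle(plain_graph)
```

The graph records that the loop exists but not how many times to go around it. That count is what a long read spanning the whole repeat can supply.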

                  • by gringer ( 252588 )

                    Not necessarily. If the unit length of the repeat is greater than the fragment length (I've seen tandem repeats with unit lengths of 40 kb), then the region will not be detected as repetitive.

    • by Anonymous Coward

      Fuck the iPhone accessory thing.

      I want my DNA sequenced, but I don't want to hand it to some bullshit Cloud AI IoT App company that will sell my DNA to advertisers.

      "Hi! I see your sequence here is AGTAGG, would you like some hard liquor?"

  • by gringer ( 252588 ) on Tuesday January 30, 2018 @11:42PM (#56038183)

    I did a Q&A on this sequencer on SoylentNews a couple of years ago:

    https://soylentnews.org/articl... [soylentnews.org]

    The technology has improved substantially since then. Feel free to ask me any more questions about the sequencing. Although I'm not an author on this paper, I'm fairly familiar with the sequencing project that was done, and am happy to answer any general questions you might have on this technology.

    • by ngc5194 ( 847747 )

      Are the errors random, or are they consistent? That is, can we just run strands through enough times to get the error rates down to acceptable levels?

      • by gringer ( 252588 ) on Wednesday January 31, 2018 @03:01AM (#56038567)

        Some errors are random, some are systematic. The systematic errors tend to be either small shifts in long stretches of the same base, or interesting features of the DNA (e.g. methylation), and there are a few people trying to work out what those interesting features are.

        A key obstacle to getting people interested in nanopore sequencing (or other types of observational sequencing) is that we have been locked in for so long to the idea of DNA as a sequence of letters that we forget there are other things attached to it that also have functional roles. Nanopore is more accurate when matching sequences at the signal/electrical level, but almost no one is doing that yet.

        • Nanopore is more accurate when matching sequences at the signal/electrical level, but almost no one is doing that yet.

          Reminds me of matching peptide sequences at the mass-spectrometry level in proteomics (Disclaimer: used to work at GeneBio).

    • How long will it be before I'm turned down for a job because my genes are bad?
  • Employers will use such under the table to screen candidates for medical and/or genetic problems. I've worked for slimebags who would happily cheat at anything to gain an edge.

  • bring on the DNA-reader PAM module so I can log into my laptop by licking instead of swiping my finger. on second thoughts, maybe not a good idea because everyone can get a spit sample and log into my linux...
  • There are two plagues in current WGS: errors in sequence (frameshifts on monomer runs, flaky stop codons in the middle of ORFs, etc.) and the problem of assembly of short reads in repeated sequences.

    This method helps the second problem.

    Errors in sequence can be minimized by doing things several times.

  • by Anonymous Coward

    Someone mentioned 454?

    To the best of my knowledge, 454 was a small company on the East Coast, maybe New Hampshire? They were acquired by Roche; the whole operation was moved west.

    Did I say the whole operation? Well, they picked and chose who they wanted and who they didn't. In the case of the IT department, they brought exactly one guy west, and, I infer, laid off everyone else.

    I came in as a contractor - I gathered the impression that part of the deal involved a two-week-long, all-expenses-paid vacation in
