Next-generation sequencing, or simply NGS, has completely changed the way we do biological research. Also referred to as massively parallel sequencing, NGS offers high-throughput, high-resolution methods to decipher nucleic acid sequences, which in turn provide important clues about protein function, regulatory and signalling pathways, mechanisms of human, plant and animal disease, diagnostics, and much more.
This blog post will introduce you to the basics of NGS and provide an overview of some of its major applications.
So what exactly is NGS?
Over the past two decades or so, NGS has more or less replaced the famous Sanger sequencing method, which is named after Frederick Sanger, who developed the technique with colleagues in the UK in the late 1970s.
Sanger sequencing is an early, gel-based dideoxynucleotide method for determining the sequence of a chain of nucleotides. In this method, fragments ending at a specific nucleotide are generated using a mix of normal and chemically modified nucleotides; the modified nucleotides terminate extension when they are incorporated into the growing DNA strand copied from the molecule being sequenced.
This chain-termination step results in a range of fragment lengths that can then be separated by gel or capillary electrophoresis and read from shortest to longest to determine the overall target sequence. Although Sanger sequencing remains one of the most accurate sequencing methods today, its low throughput makes it unsuitable for most large-scale applications.
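To make the "read from shortest to longest" idea concrete, here is a minimal Python sketch. It is an idealised toy model, not a description of any real instrument: it assumes exactly one terminated fragment per position, and the strand and function names are made up for illustration.

```python
def chain_terminated_fragments(synthesized: str) -> list[str]:
    """Return every prefix of the newly synthesized strand.

    Each prefix stands for a fragment whose extension stopped when a
    chain-terminating nucleotide was incorporated at its last position.
    (Assumption: one fragment for every possible length.)
    """
    return [synthesized[:i] for i in range(1, len(synthesized) + 1)]


def read_by_length(fragments: list[str]) -> str:
    """Order fragments from shortest to longest and read each terminal base."""
    ordered = sorted(fragments, key=len)
    return "".join(fragment[-1] for fragment in ordered)


# Hypothetical strand, used only for illustration.
strand = "ATGCCGTA"

# Reverse the fragments to mimic an unordered mixture before separation.
mixture = list(reversed(chain_terminated_fragments(strand)))

recovered = read_by_length(mixture)
print(recovered)
```

Sorting by length plays the role of electrophoresis here, and reading the last base of each fragment plays the role of detecting which terminator ended it.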
Today, NGS refers to a collection of rapidly evolving methodologies that allow high-throughput sequencing at single-nucleotide resolution and generate very large datasets, whose analysis has been made possible by parallel advances in computing. In contrast to Sanger sequencing, which is conducted in a step-wise manner, NGS performs sequencing and detection simultaneously, on a scale that typically allows anything from thousands to billions of fragments to be sequenced in a single instrument run.
There are a number of major NGS platform providers on the market, including Illumina, PacBio and Oxford Nanopore Technologies (ONT) (see here for an overview of what was available in 2021).
These providers vary in their approach, typically differing in the length of the stretches of DNA that are sequenced and then pieced together to solve larger sequences, i.e., the ‘read length’, as well as in the detection method used to determine nucleotide identity. Common to all platforms, however, is the need for upstream preparation of high-quality nucleic acid samples into sequencing-ready libraries. During this process, high molecular weight DNA is typically sheared to an appropriate size and then ligated to platform-specific adapters that allow the sequencing process to be initiated.
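The shearing and adapter-ligation steps can be pictured with a small Python sketch. This is only a toy model under stated assumptions: real shearing is random rather than evenly spaced, and the adapter sequences below are invented placeholders, not the adapters of any actual platform.

```python
def shear(dna: str, fragment_size: int) -> list[str]:
    """Cut a long sequence into consecutive pieces of at most fragment_size bases.

    Assumption: evenly spaced cuts, chosen for simplicity.
    """
    return [dna[i:i + fragment_size] for i in range(0, len(dna), fragment_size)]


def ligate_adapters(fragments: list[str], left: str, right: str) -> list[str]:
    """Attach an adapter to both ends of every fragment."""
    return [left + fragment + right for fragment in fragments]


# Hypothetical input molecule and placeholder adapters (not real sequences).
dna = "ATGCGTACGTTAGCCTAGGA"
LEFT_ADAPTER = "CCCC"
RIGHT_ADAPTER = "GGGG"

fragments = shear(dna, fragment_size=8)
library = ligate_adapters(fragments, LEFT_ADAPTER, RIGHT_ADAPTER)

for molecule in library:
    print(molecule)
```

Every molecule in the resulting library carries the same known sequences at its ends, which is what gives the sequencer a common starting point regardless of the insert.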
Current NGS applications
The power of NGS technology is so great that it has found its way into most, if not all, areas of biological research. Some of the most widespread applications include:
- Whole genome sequencing. NGS has made it possible to reconstruct entire genomes from a wide range of animal, plant and microbial species.
- RNA sequencing (RNA-seq). Here, NGS-based sequencing of whole RNA samples can provide a snapshot of the numbers and identities of RNA molecules in a sample at a given time and under the conditions of interest. Such studies can help to reveal candidate genes that may be involved in regulating animal and plant development, as well as identify differential gene expression in health and disease. RNA-seq has also revealed that eukaryotic transcriptomes are much more complex than previously thought!
- Metagenomics and metatranscriptomics. Metagenomics and metatranscriptomics are used to analyse and compare communities of organisms under different conditions, e.g., the gut microbiomes of healthy and sick individuals. These fields involve the study of the structure and function of all the DNA or RNA sequences, respectively, that are isolated from a sample containing a mixture of organisms. They provide a detailed molecular insight into what is happening within the community, i.e., which pathways are turned on, which signalling molecules are present, which metabolic genes are expressed, and so on.
- Genome-wide association studies (GWAS). These studies involve scanning the genomes of a large group of individuals for single nucleotide polymorphisms (SNPs). Studying the prevalence of these variations in large groups of people can help to identify new disease-associated genes and can also be used to estimate the prevalence of disease-associated genes across populations.
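As a rough illustration of the GWAS idea in the list above, the Python sketch below tests whether one allele of a single SNP is more common in affected individuals than in unaffected ones, using a standard chi-square test on a 2x2 table of allele counts. The counts are invented for the example, and a real study would repeat this kind of test across very many SNPs and correct for multiple testing.

```python
import math


def allelic_chi_square(case_risk, case_other, control_risk, control_other):
    """Chi-square statistic (1 degree of freedom) for a 2x2 allele-count table."""
    a, b, c, d = case_risk, case_other, control_risk, control_other
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


def p_value_1df(chi2):
    """Upper-tail probability of a chi-square distribution with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))


# Hypothetical allele counts at a single SNP (invented for this example).
chi2 = allelic_chi_square(case_risk=60, case_other=40,
                          control_risk=45, control_other=55)
p = p_value_1df(chi2)
print(f"chi-square = {chi2:.2f}, p = {p:.3f}")
```

A small p-value suggests that the allele frequencies differ between the two groups, which is the kind of signal that flags a SNP for follow-up.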
NGS technology is advancing and expanding all the time. Some of these advances have pushed the cost of DNA sequencing down, while others have addressed areas of the genome that were previously inaccessible to sequencing workflows. As the NGS toolbox continues to expand, we expect that its applications will become even more diverse.
That's it for now! Stay tuned for Part 2, where we look at the typical NGS library workflow, including tips for good sample collection and NGS library preparation.