Next Generation Sequencing 101 – Part 2

June 09 2022

Welcome back to our next generation sequencing (NGS) 101 mini-series! In Part 1, we introduced the basics of NGS and provided an overview of some of its major applications. In this follow-up, we take a look at the typical NGS workflow including tips for good sample collection and NGS library preparation.

So what does a typical NGS workflow look like?

NGS is an umbrella term used to describe several modern second- or third-generation sequencing technologies that vary in their underlying chemistries. While the chemistries and protocols vary, the overall workflows are similar (Figure 1).

Figure 1: Typical NGS workflow from sample collection to data analysis

In the remainder of this blog post, we will focus primarily on NGS for DNA samples. If you are working with RNA, we recommend that you check out our RNA-seq blog series!

The NGS workflow step-by-step

1. Sample collection

A good start is more than half the battle when it comes to NGS! Samples destined for NGS (and indeed any other nucleic acid analysis application) are vulnerable to nucleic acid degradation through enzymatic activity, as well as biological and environmental factors.

Degradation affects not only the overall nucleic acid yield; it also impacts the genomic and transcriptomic profile of your sample. Such changes may be of particular concern in microbiomics research, where NGS is typically used in metagenomics experiments to identify the structure and function of entire nucleotide sequences from all the organisms present in a complex sample (e.g. soil, faeces). In experiments to compare how microbial communities behave under different conditions, proper sample storage is critical to ensure that the nucleic acid profiles of the samples don’t change between the time of collection and nucleic acid isolation.

The best way to preserve a sample for NGS is to store it at the point of collection in a preservative such as DNA/RNA Shield™ from Zymo Research. This is an all-in-one reagent for the collection and preservation of any sample type that ensures nucleic acid stabilisation at ambient, fridge or freezer temperatures. DNA/RNA Shield™ inactivates nucleases, microorganisms and viruses, providing an unbiased molecular snapshot by preserving the genetic integrity of your sample at the time of collection. You can read more about DNA/RNA Shield™ here.

2. Nucleic acid isolation and preparation

Intact, high-purity nucleic acids are a critical starting point for a successful NGS experiment and meaningful data. Our top tips for DNA isolation and preparation include:

  • Choose a DNA isolation kit that is validated for your sample type (microbial, plant, blood, yeast, etc.) and that suits your experimental goal. For instance, if you want to perform long-read sequencing, choose an isolation kit that is validated for high-molecular-weight (HMW) DNA isolation, if one is available.
  • Always perform quality control on your isolated samples, e.g., through fluorimetry or capillary electrophoresis, to check the integrity, purity and yield.
  • Use freshly isolated samples when possible. Otherwise, make sure that older samples have been stored correctly, at an appropriate temperature and in the presence of stabilisers as necessary.
  • Note that some DNA isolation methods may introduce inhibitors of the downstream enzymatic reactions needed for library preparation. This is particularly the case for some in-house plant DNA isolation protocols that fail to eliminate plant polyphenolic compounds. One way to mitigate this risk is to choose a validated isolation protocol optimised for your exact sample type (including species), or a dedicated, validated commercial isolation kit. Commercial PCR inhibitor removal kits may also be useful to clean up DNA samples prior to library preparation.
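
As a rough illustration of the quality control tip above, the sketch below records a concentration (for example from a fluorimetric assay), an elution volume and an integrity score (for example from capillary electrophoresis) for each sample, then checks them against minimum requirements. The `SampleQC` class, the sample names, the 0-10 integrity scale and both thresholds are hypothetical placeholders, not values from any kit; always use the input requirements stated in your own library preparation protocol.

```python
from dataclasses import dataclass


@dataclass
class SampleQC:
    """QC measurements for one isolated DNA sample (hypothetical structure)."""
    name: str
    conc_ng_per_ul: float   # concentration, e.g. from a fluorimetric assay
    volume_ul: float        # elution volume
    integrity_score: float  # assumed 0-10 scale, e.g. from capillary electrophoresis


def total_yield_ng(sample: SampleQC) -> float:
    """Total yield is simply concentration multiplied by volume."""
    return sample.conc_ng_per_ul * sample.volume_ul


def passes_qc(sample: SampleQC,
              min_yield_ng: float = 100.0,   # placeholder threshold
              min_integrity: float = 7.0) -> bool:  # placeholder threshold
    """Return True if the sample meets both the yield and integrity minimums."""
    return (total_yield_ng(sample) >= min_yield_ng
            and sample.integrity_score >= min_integrity)


# Made-up example measurements
samples = [
    SampleQC("soil_01", conc_ng_per_ul=12.5, volume_ul=50.0, integrity_score=8.2),
    SampleQC("soil_02", conc_ng_per_ul=1.2, volume_ul=50.0, integrity_score=8.9),
    SampleQC("soil_03", conc_ng_per_ul=20.0, volume_ul=50.0, integrity_score=4.1),
]

report = {s.name: passes_qc(s) for s in samples}
for name, ok in report.items():
    print(f"{name}: {'PASS' if ok else 'FAIL'}")
```

In this made-up data set, only the first sample has both enough DNA and sufficient integrity to move on to library preparation; the second fails on yield and the third on integrity.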

3. Library preparation

This part of the workflow is usually performed using a kit compatible with the sequencing platform in use. The exact steps and the order in which they are performed may vary, but a very generic workflow includes:

  1. Fragmentation. Larger DNA molecules are typically fragmented either mechanically or enzymatically to produce smaller fragments that are suitable for sequencing.
  2. PCR amplification. The fragments are PCR amplified to yield a library, which is a collection of specifically-sized DNA amplicons that are compatible with the chosen sequencing platform. The primers used in library preparation are designed based on the sequences of interest, which may span anything from an entire genome to a selection of chosen target DNA sequences.
  3. Adapter ligation. Sequencing adapters are ligated to the amplicons, allowing them to interact with the surface of a sequencing flow cell later during the sequencing run. For multiplex sequencing runs, unique barcodes are also ligated to the amplicons to allow individual samples to be “demultiplexed” later during data analysis.
  4. Library cleanup, size selection and amplification. Library cleanup, often referred to as size selection, involves the targeted removal of sub-optimal DNA fragments including primers, primer-dimers and adapters from the library prior to sequencing. This is a relatively cost‐effective part of the workflow that can have a profound impact on data quality, as it ensures that the amplicons in the library are within the optimum size range for the specific sequencing instrument; for example, this range is 200-500 bp for Illumina platforms.
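
To make the idea of demultiplexing from the adapter ligation step more concrete, here is a minimal sketch in Python. It assumes, purely for illustration, that each read starts with a fixed-length barcode that must match a known barcode exactly. The `demultiplex` function, the barcodes and the reads are all made up, and dedicated demultiplexing software is considerably more sophisticated (for instance, it may tolerate sequencing errors in the barcode).

```python
from collections import defaultdict


def demultiplex(reads, barcode_to_sample, barcode_len):
    """Assign reads to samples based on an exact match of a leading barcode.

    Reads whose barcode is not recognised go into an 'undetermined' bin.
    The barcode itself is trimmed off before the read is stored.
    """
    bins = defaultdict(list)
    for read in reads:
        barcode, insert = read[:barcode_len], read[barcode_len:]
        sample = barcode_to_sample.get(barcode, "undetermined")
        bins[sample].append(insert)
    return dict(bins)


# Made-up 4-base barcodes and reads, for illustration only
barcodes = {"ACGT": "sample_A", "TGCA": "sample_B"}
reads = [
    "ACGTGGATTACA",  # barcode ACGT -> sample_A
    "TGCACCCGGGTT",  # barcode TGCA -> sample_B
    "ACGTTTTTAAAA",  # barcode ACGT -> sample_A
    "GGGGACGTACGT",  # barcode GGGG is unknown -> undetermined
]

bins = demultiplex(reads, barcodes, barcode_len=4)
for sample, inserts in bins.items():
    print(sample, inserts)
```

Each sample ends up with its own list of reads, which is what allows many libraries to be pooled and sequenced together in a single run.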

4. Sequencing

NGS technologies are evolving rapidly, and there is currently no universal consensus on what constitutes a second- or third-generation technology, but generally speaking the generations differ with respect to read length (see here for a more detailed discussion on this topic).

Short-read sequencing approaches are dominated by Illumina and Ion Torrent technologies, while Pacific Biosciences’ and Oxford Nanopore’s technologies are the most popular choices for long-read sequencing. These four technologies are briefly described below:

  • Illumina sequencing: Illumina’s technology is based on the sequencing by synthesis (SBS) method. In this method, 3’ fluorescently-labelled nucleotides (i.e. A, G, C and T) are added as the substrates for DNA synthesis using the NGS library as a template. In each round of sequencing, the growing DNA strand is imaged to determine which of the 4 fluorophores has been newly incorporated. The labelled nucleotides each contain a reversible terminator that is cleaved after imaging to allow the next round of synthesis to take place. As the DNA strand grows, it is continuously imaged and the fluorescent data is collected to reveal the identity of the newly synthesised DNA strands or reads.
  • Ion Torrent sequencing (Thermo Fisher Scientific): In nature, when a polymerase incorporates a nucleotide into a strand of DNA, a hydrogen ion (H+) is released as a by-product. Ion Torrent sequencing measures the direct release of H+ (protons) from the incorporation of individual bases into target DNA fragments by DNA polymerase.
  • Pacific Biosciences SMRT sequencing: PacBio’s core technology is Single Molecule, Real-Time (SMRT) sequencing. This involves the use of a SMRT Cell containing millions of tiny wells called zero-mode waveguides, in which single DNA molecules are immobilised while DNA polymerase incorporates fluorescently-labelled nucleotides. The light emitted at the top of the zero-mode waveguide is recorded and analysed, resulting in the detection of each nucleotide. SMRT sequencing allows DNA fragments to be read multiple times, resulting in higher sequencing accuracy. SMRT sequencing can handle very long DNA fragments and generate very long sequence reads (tens of kilobases in length), making this a good choice for whole-genome assembly.
  • Oxford Nanopore Technologies sequencing: Nanopore sequencing enables direct, real-time analysis of long DNA or RNA fragments. The method works by monitoring changes to an electrical current as nucleic acids are passed through a protein nanopore. The resulting signal is then deciphered into the specific DNA or RNA sequence.
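
The sequencing-by-synthesis principle described for Illumina above can be pictured with a toy example: in each cycle, one of four fluorescent signals dominates, and the brightest one indicates which base was incorporated. The sketch below is a deliberately simplified illustration of that idea, not a description of how any real base-calling software works; the `call_bases` function, the channel order and the intensity values are all assumptions made for the example.

```python
BASES = "ACGT"  # assumed channel order: one intensity value per base


def call_bases(intensities):
    """Call one base per sequencing cycle by picking the brightest channel.

    `intensities` is a sequence of cycles, each holding four signal values
    in the order given by BASES.
    """
    read = []
    for cycle in intensities:
        if len(cycle) != len(BASES):
            raise ValueError("each cycle needs exactly four intensity values")
        brightest = max(range(len(BASES)), key=lambda i: cycle[i])
        read.append(BASES[brightest])
    return "".join(read)


# Made-up intensities for four cycles (A, C, G, T channels)
cycles = [
    (0.9, 0.1, 0.0, 0.1),  # brightest channel: A
    (0.1, 0.0, 0.8, 0.2),  # brightest channel: G
    (0.0, 0.1, 0.1, 0.7),  # brightest channel: T
    (0.2, 0.9, 0.1, 0.0),  # brightest channel: C
]

read = call_bases(cycles)
print(read)  # AGTC
```

Real instruments have to deal with noisy and overlapping signals, so production base callers are far more involved, but the cycle-by-cycle logic is the same in spirit.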

5. Data analysis

Once the sequencing reactions are complete, the fun begins: vast amounts of data must be processed and analysed to arrive at meaningful scientific interpretations. There is a plethora of NGS data analysis approaches available, including ready-to-use pipelines, many of which are freely available online. Data analysis for NGS is beyond the scope of this blog post, but you will hopefully find useful information in the suggested literature below.

Do you have questions about NGS?

In this mini-series, we aim to provide basic information about what NGS is, how it works and what it can be used for. If you have specific questions about any NGS workflow, or if there is a topic you would like to see covered in a future blog post, don’t hesitate to get in touch with us at info@nordicbiosite.com. We look forward to hearing from you!

Related Nordic BioSite Blog Posts:

Next Generation Sequencing 101

Size Selection for NGS

Five Steps to NGS Success!

Sample Collection and Preservation – Critical Starting Points in Your Research!

Suggested Literature:

Goodwin, S., McPherson, J. & McCombie, W. Coming of age: ten years of next-generation sequencing technologies. Nat Rev Genet 17, 333–351 (2016). https://doi.org/10.1038/nrg.2016.49.

Slatko BE, Gardner AF, Ausubel FM. Overview of Next-Generation Sequencing Technologies. Curr Protoc Mol Biol. 2018 Apr;122(1):e59. doi: 10.1002/cpmb.59. PMID: 29851291; PMCID: PMC6020069.