The Importance of Native VH/VL Pairing in Antibody Repertoire Sequencing
Antibodies play a critical role in the adaptive immune system's response to pathogens. They bind specific antigen epitopes through their variable regions, which are formed by a paired heavy-chain (VH) and light-chain (VL) variable domain. This native VH/VL pairing is critical for maintaining the specificity and functionality of an antibody. However, analyzing the highly diverse antibody repertoire encoded by human B cells while preserving VH/VL pairs has been a major challenge.
Next-generation DNA sequencing (NGS) of antibody repertoires has become a powerful approach for discovering novel antibodies, analyzing B cell development, and investigating immune responses. Most NGS workflows, however, sequence VH and VL chains separately, whereas paired VH/VL data are required for complete antibody characterization.1 This article discusses techniques for recovering natively paired antibody sequences with NGS.
Emulsion Encapsulation RT-PCR
Molecular and microfluidic strategies have made it possible to re-link heavy and light chain sequences during sample preparation, before sequencing. A good example is the emulsion-based approach developed by Rajan et al.2 Their method compartmentalizes single B cells into picoliter water-in-oil droplets, where the antibody transcripts of each cell are amplified by reverse transcription PCR (RT-PCR). Early PCR cycles concatenate cognate VH and VL amplicons into a single scFv construct before the limiting reagents are consumed. In this way, native pairing is maintained for >96% of sequences.
In addition to NGS, the scFv amplicon library can be cloned directly into a phage display vector for functional screening. With custom primers, paired-end sequencing on Illumina instruments (the study used the Illumina MiSeq) yields millions of natively paired VH and VL sequences.2
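Once a linked amplicon has been sequenced and assembled into a single read, recovering the pair amounts to cutting the read at the linker. The following is a minimal sketch of that idea, not the published pipeline: the linker sequence, the VH-linker-VL orientation, and the function name `split_linked_amplicon` are all assumptions made for illustration.

```python
from typing import Optional, Tuple

# Hypothetical linker sequence; a real protocol defines its own.
LINKER = "GGTGGAGGCGGTTCA"


def split_linked_amplicon(read: str, linker: str = LINKER) -> Optional[Tuple[str, str]]:
    """Split a linked read into (VH, VL), assuming VH-linker-VL orientation.

    Returns None when the linker is missing or when either side is empty,
    because such a read cannot provide a VH/VL pair.
    """
    read = read.upper()
    # str.partition cuts at the first occurrence of the linker.
    left, found, right = read.partition(linker)
    if not found or not left or not right:
        return None
    return left, right


if __name__ == "__main__":
    example = "caggtgcagctg" + LINKER + "gacatccagatg"
    print(split_linked_amplicon(example))
```

Exact string matching is used here for clarity; reads with sequencing errors inside the linker would be discarded by this sketch, so a tolerant match would be needed in practice.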
Using this workflow, Rajan et al.2 isolated rare broadly neutralizing antibodies against influenza hemagglutinin from healthy donors, demonstrating the potential of the technology.
Cellular Barcoding and Molecular Tagging
An alternative to linking amplicons inside the emulsion is to label cDNA with unique barcodes during RT, tagging all sequences that originate from the same cell. The 10x Genomics Chromium platform is one example: it partitions single cells into droplets together with gel beads carrying oligonucleotides that bear a 16-bp bead-specific barcode.5
After the barcoded cDNA is amplified and the emulsion is broken, NGS reads that share a barcode can be assigned to their parent cell. Some stochastic exchange of mRNA between compartments does occur, so bioinformatics tools determine consensus VH and VL pairs from the reads behind each barcode.
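The barcode-based assignment can be sketched as a grouping step followed by a majority vote. This is an illustrative simplification rather than the algorithm of any particular tool: the input layout `(barcode, chain, sequence)`, the function name `pair_chains_by_barcode`, and the `min_fraction` threshold are assumptions, and identical-sequence voting stands in for true consensus building.

```python
from collections import Counter, defaultdict


def pair_chains_by_barcode(reads, min_fraction=0.5):
    """Return {barcode: (VH, VL)} for barcodes with one dominant chain of each type.

    reads: iterable of (barcode, chain, sequence), with chain "VH" or "VL".
    A barcode is dropped if a chain is missing or if its most common sequence
    accounts for less than min_fraction of that chain's reads, which is a
    simple guard against cross-contaminating transcripts.
    """
    counts = defaultdict(lambda: {"VH": Counter(), "VL": Counter()})
    for barcode, chain, seq in reads:
        if chain in ("VH", "VL"):
            counts[barcode][chain][seq] += 1

    pairs = {}
    for barcode, by_chain in counts.items():
        chosen = {}
        for chain, counter in by_chain.items():
            if not counter:
                break  # this chain was never observed for the barcode
            seq, n = counter.most_common(1)[0]
            if n / sum(counter.values()) < min_fraction:
                break  # no clearly dominant sequence
            chosen[chain] = seq
        if len(chosen) == 2:
            pairs[barcode] = (chosen["VH"], chosen["VL"])
    return pairs


if __name__ == "__main__":
    demo = [
        ("BC1", "VH", "AAA"), ("BC1", "VH", "AAA"), ("BC1", "VH", "AAT"),
        ("BC1", "VL", "CCC"),
        ("BC2", "VH", "GGG"),  # no light chain, so BC2 is not paired
    ]
    print(pair_chains_by_barcode(demo))
```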
More specialized molecular tagging employs template switching to incorporate unique molecular identifiers (UMIs) and cell barcodes during RT. McDaniel et al.3 demonstrated this approach by co-encapsulating splenocytes with uniquely barcoded beads bearing an oligo(dT) primer. Following RT, emulsion breaking, and PCR, reads can be traced back to single cells via the bead barcode. Aggregating UMI counts also allows somatic variants within antibody lineages to be quantified with high confidence.
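The reason UMIs help with quantification is that PCR duplicates share a UMI, so counting distinct UMIs approximates the number of original transcripts rather than the number of reads. A minimal sketch of that counting step follows; the input layout `(barcode, umi, sequence)` and the function name `count_umis` are hypothetical, and UMI sequencing errors are ignored.

```python
from collections import defaultdict


def count_umis(reads):
    """Count distinct UMIs per (cell barcode, sequence).

    reads: iterable of (barcode, umi, sequence).
    Reads that repeat a UMI for the same barcode and sequence are treated
    as PCR duplicates and counted once.
    """
    umis = defaultdict(set)
    for barcode, umi, seq in reads:
        umis[(barcode, seq)].add(umi)
    return {key: len(seen) for key, seen in umis.items()}


if __name__ == "__main__":
    demo = [
        ("BC1", "U1", "AAA"),
        ("BC1", "U1", "AAA"),  # PCR duplicate of the read above
        ("BC1", "U2", "AAA"),
        ("BC1", "U3", "AAT"),  # a variant supported by a single molecule
    ]
    print(count_umis(demo))
```

A variant supported by several distinct UMIs is more likely to be a genuine somatic variant than one seen in many reads that all carry the same UMI.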
Long-Read Sequencing
Long-read sequencing holds particular promise for resolving native VH and VL pairs, with reads frequently spanning entire variable domain transcripts.4 For example, Oxford Nanopore sequencing was recently used to obtain full-length antibody sequences from rat hybridoma cell lines.7 Although throughput is currently lower than that of short-read platforms, ongoing advances in long-read sequencing and analysis are likely to narrow this gap.
Antibody Clonotype Assembly: Library Preparation and Computational Pipeline
Realizing the potential of high-throughput, natively paired antibody sequencing requires specialized library construction protocols together with tailored computational analysis workflows. Fahad et al.6 developed a protocol for processing millions of Illumina MiSeq 2 × 300 bp reads into high-confidence V(D)J clonotypes linked to complete VH/VL consensus sequences. Reads are first annotated and filtered, then clustered into clonally related lineages defined by junctional diversity. Finally, the highest-quality full-length consensus sequence is called to represent each paired antibody lineage cluster.
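The cluster-then-pick-a-representative logic can be illustrated with a deliberately simplified sketch. It is not the published protocol: the `PairedRead` record, its field names, the `call_clonotypes` function, and the use of exact CDR-H3/CDR-L3 matching as the lineage definition are all assumptions, and real pipelines typically tolerate some junction mismatches.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class PairedRead:
    """Hypothetical annotated record for one natively paired read."""
    cdrh3: str           # heavy-chain junction
    cdrl3: str           # light-chain junction
    sequence: str        # full paired VH/VL sequence
    mean_quality: float  # mean base quality of the read


def call_clonotypes(reads, min_quality=20.0):
    """Group reads by paired junction and keep the best read per group."""
    clusters = defaultdict(list)
    for read in reads:
        if read.mean_quality >= min_quality:  # filtering step
            clusters[(read.cdrh3, read.cdrl3)].append(read)
    # The representative of each cluster is its highest-quality member.
    return {
        key: max(members, key=lambda r: r.mean_quality)
        for key, members in clusters.items()
    }


if __name__ == "__main__":
    demo = [
        PairedRead("ARDY", "QQSY", "seq_a", 30.0),
        PairedRead("ARDY", "QQSY", "seq_b", 35.0),
        PairedRead("ARGG", "QQSY", "seq_c", 10.0),  # removed by the filter
    ]
    for key, rep in call_clonotypes(demo).items():
        print(key, rep.sequence)
```

Keying on both junctions, rather than on the heavy chain alone, is what lets the clonotype definition take advantage of the native pairing information.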
The Future of High-Throughput Native Antibody Sequencing
In summary, techniques such as emulsion RT-PCR, cellular barcoding, and long-read sequencing have made it possible to resolve the linkage of natively paired antibody chains. Coupling these approaches with streamlined computational pipelines provides a framework for functional interrogation of vast antibody repertoires. These technologies promise to transform antibody discovery and immune repertoire analysis by addressing questions that lie beyond the reach of standard, unpaired NGS.