Introduction
Based on data from a subset of DNA synthesis service providers, error rates for DNA synthesis can range from roughly 0.007% to 0.05% per base. Re-sequencing synthesized sequences before amplification and subsequent functional analysis can therefore be a valuable safeguard against downstream errors.
With the cost of synthesizing a base pair hovering around $0.10, synthesizing tens or hundreds of sequences remains affordable for most labs. In repertoire sequencing or synthetic library campaigns, hit-picked antibody sequences must be synthesized for subsequent functional assays, and engineered sequences or synthetic conjugates are likewise synthesized before functional testing.
To put these error rates into context: for a 96-well plate of IgG heavy chain variable domain (VH) sequences at an average length of 110 amino acids, you could have as many as 15 sequences containing a nucleotide error.
Checking for inadvertent synthesis errors, for example by re-sequencing synthesized sequences before downstream analyses, can therefore be well worth the trouble.
De novo gene synthesis technologies
Standard gene synthesis has largely relied on the phosphoramidite process developed by Beaucage and Caruthers in 1981, which produced nucleoside phosphites stable enough to handle under normal laboratory conditions1. Deoxynucleotide coupling in chemical synthesis generally exceeds 99% efficiency per synthetic cycle2.
With single synthesis runs typically limited to around 100 nucleotides, a typical IgG VH is constructed by assembling several synthesized oligonucleotides with overlapping ends. To guarantee correct synthesis and assembly, service providers therefore implement various quality control (QC) measures, such as mass spectrometry, enzymatic assays, chromatography, light spectroscopy or sequencing.
Since the standard phosphoramidite-based DNA synthesis cycle can reliably produce at most around 200 nucleotides, longer DNA sequences require assembling several oligonucleotides3. As an alternative to the gold-standard chemical approach, enzymatic synthesis methods are being developed by companies such as Ansa Biotechnologies and DNA Script, whose solutions use the template-independent enzyme terminal deoxynucleotidyl transferase (TdT). These technologies promise higher quality (DNA Script has reported 99.7% and Camena Bioscience 99.9% coupling efficiency) as well as faster production of synthesized DNA4. Successful commercialization will, however, still require reaching a feasible scale and throughput.
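To see why oligo length is limited, consider the cumulative effect of per-cycle coupling efficiency: even at 99%, the fraction of full-length product decays exponentially with length. A back-of-the-envelope sketch, using the efficiencies quoted above purely as illustrative inputs:

```python
def full_length_yield(length: int, coupling_efficiency: float) -> float:
    """Fraction of oligos reaching full length, assuming each of the
    (length - 1) coupling steps succeeds independently."""
    return coupling_efficiency ** (length - 1)

# Compare chemical (~99%) with reported enzymatic efficiencies
for eff in (0.99, 0.997, 0.999):
    for n in (100, 200, 330):
        print(f"{n}-mer at {eff:.1%} coupling: "
              f"{full_length_yield(n, eff):.1%} full-length product")
```

At 99% coupling, a 200-mer yields only about 13% full-length product, which is one intuition for why providers assemble genes from shorter overlapping oligos rather than synthesizing them in one run.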
As successful DNA assembly is required to produce longer DNA fragments from oligos, chemical synthesis technologies depend on extensive QC to reach high accuracy. Assembly solutions from, among others, Codex DNA (Gibson assembly) can potentially streamline antibody discovery workflows by combining steps such as synthesis, assembly, cloning, colony-picking, culture and prep into a single hardware solution. In chemical synthesis, however, the larger the aggregate number of assembled synthesized nucleotides, the greater the potential for errors, which highlights the need for effective QC.
Quality control of synthesized sequences
After successfully screening your antibody library and hit-picking your sequences, you would normally synthesize your antibody genes for epitope mapping, binding kinetics studies and more. With plenty of synthesis options available, you can generally expect high-quality synthesized sequences. Nevertheless, as we pointed out above, the process is still imperfect, which is why oligo synthesis equipment vendors and service providers disclose maximum error rates. Vendors also differ in their QC processes, which is likely reflected in the price of the synthesized sequences.
We collected a subset of error rate examples from different providers and different product types (gene synthesis, oligo pools) available on the market.
The reliability of gene synthesis is inversely correlated with the number of nucleotides synthesized and assembled: the more and the longer the sequences, the more likely they are to contain errors. Although the heavy (VH) and light chain (VL) variable domains of IgGs are relatively short, typically around 110 amino acids (330 base pairs), larger synthesized sequence libraries inevitably run the risk of containing errors unless extensively quality controlled.
For example, assume we’ve hit-picked a 96-well plate of IgG variable regions of the heavy chain (VH) to synthesize. At the typical length of 330 nucleotides per VH, we will have a total of 31,680 nucleotides synthesized.
Depending on the applied methodology for synthesis and the associated QC before delivery, our de novo synthesized sequences could contain anywhere from 2 to 16 errors in our 96 sequences.
While there is no guarantee of how the errors are distributed across our sequences, if they were spread evenly with no single VH containing more than one error, anywhere from roughly 2% to 16% of our sequences could contain an error.
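The arithmetic above can be sketched in a few lines. The two per-base error rates are the ends of the illustrative range quoted earlier; the Poisson approximation for the fraction of error-free sequences is an added assumption (errors striking bases independently), not something the vendors' figures guarantee:

```python
import math

PLATE = 96                  # sequences per plate
VH_LEN = 330                # nucleotides per VH
TOTAL_NT = PLATE * VH_LEN   # 31,680 nucleotides synthesized in total

for error_rate in (0.00007, 0.0005):  # 0.007% and 0.05% per base
    expected_errors = TOTAL_NT * error_rate
    # Poisson approximation: chance a single 330-nt sequence is error-free
    p_clean = math.exp(-VH_LEN * error_rate)
    print(f"rate {error_rate:.3%}: ~{expected_errors:.1f} errors per plate, "
          f"~{PLATE * (1 - p_clean):.1f} sequences expected to carry an error")
```

Under these assumptions the two rates give roughly 2 and 16 expected errors per plate, matching the range above; the Poisson view adds the nuance that a few errors may land in the same sequence, so slightly fewer than 16 sequences would typically be affected at the high end.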
Concluding remarks
We are excited to see the next generation of gene synthesis technologies being commercialized, giving researchers access to faster and more affordable methods for producing synthesized DNA across a range of applications, therapeutic research among them. Despite the often extensive QC vendors perform on synthesized DNA products, validating your antibody sequences by re-sequencing mitigates the risk of incorrect antibodies entering the downstream functional research and development cycle.