Non-canonical cysteines in antibody repertoires – From Camelids to Humans

Role of the disulfide bridges in shaping the functional capabilities of the antibody repertoires.

Category: Science
Date: July 24, 2024
Read time: 9 min

Canonical cysteines are the conserved cysteine residues found in the core of Ig domains, which connect the polypeptide chains of an antibody molecule encoded by variable and constant gene segments.2 However, a subset of cysteine residues, known as non-canonical cysteines, is typically encoded in humans by diversity gene segments, primarily IGHD2 and other D gene families.1

Non-canonical cysteines are more prevalent in species such as chickens3, camels4, llamas5, sharks6, and cows7. Non-canonical cysteine residues participate in the formation of diverse intra-heavy chain disulfide bonds within the CDR-H3 of antibody variable domains.2 Furthermore, these non-canonical cysteines mediate disulfide bonding between the CDR-H3 loop and other CDRs or framework regions (FRs).8

Illustration close-up of immunoglobulin antibody canonical cysteine disulfide bond and non-canonical disulfide bond on the 1ZVH antibody structure and 5IVO antibody structure from PDB as examples of changes in structure based on non-canonical cysteines occurrence.
Figure 1. Examples of disulfide bonds formed by cysteines: the canonical bond between cysteines 23 and 104 in the 1ZVH structure and a non-canonical bond between cysteines 50 and 102 in the 5IVO structure (structures from the PDB).

The disulfide bridges formed by non-canonical cysteine residues help generate the substantial conformational diversity observed in antibody repertoires and confer distinct structural conformations on antigen-binding sites. Moreover, these disulfide bonds have been implicated in modulating various effector functions of antibody molecules. The presence and positioning of non-canonical cysteines are therefore molecular determinants of the functional capabilities of the antibody repertoire.8

Non-canonical cysteines in Bactrian camels

Camelids, such as camels and llamas, possess a unique ability to produce heavy-chain-only antibodies (HCAbs), whose variable domains (VHHs) carry an exceptionally long CDR-H3 region.9 This extended CDR-H3 loop can form non-canonical disulfide bonds, enabling recognition of unique epitopes that are inaccessible to conventional antibodies.1

Bar chart presenting a comparison of CDR3 length distribution in Bactrian camels VH and VHH, pointing at the differences of relative frequency on each position in amino acid length of CDR3.
Figure 2. Comparison of CDR3 length (in amino acids) and relative frequency between VH and VHH in Bactrian camels.

In a study by Liu et al.,10 the canonical cysteines at positions 23 and 104 were found to be present in all VHs and VHHs. However, non-canonical cysteines were specifically enriched in VHHs. A comparison of the locations of these non-canonical cysteines revealed that they were primarily situated on CDR1 in the Bactrian camel and dromedary, but on FR2 (site 55) in the alpaca. Moreover, the proportion of VHHs harboring non-canonical cysteines was significantly higher in the Bactrian camel and dromedary than in the alpaca.10
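The position-based comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline used by Liu et al.: it assumes each sequence has already been numbered under the IMGT scheme (in which the canonical cysteines sit at 23 and 104) and is supplied as a dictionary from position to residue. The function names and the toy input are hypothetical, and insertion codes are not handled. Other numbering schemes, such as Kabat, use different region boundaries.

```python
# IMGT region boundaries for a V domain (inclusive position ranges).
IMGT_REGIONS = [
    ("FR1", 1, 26),
    ("CDR1", 27, 38),
    ("FR2", 39, 55),
    ("CDR2", 56, 65),
    ("FR3", 66, 104),
    ("CDR3", 105, 117),
    ("FR4", 118, 128),
]
CANONICAL_CYS = {23, 104}


def region_of(position):
    """Return the IMGT region name that contains a numbered position."""
    for name, start, end in IMGT_REGIONS:
        if start <= position <= end:
            return name
    raise ValueError(f"position {position} is outside 1-128")


def noncanonical_cysteines(numbered):
    """Map each non-canonical cysteine position to its region.

    `numbered` is an assumed input format: {IMGT position: residue}.
    """
    return {
        pos: region_of(pos)
        for pos, aa in sorted(numbered.items())
        if aa == "C" and pos not in CANONICAL_CYS
    }


# Toy example (invented, not real data): cysteines at 23, 55, 104 and 110.
toy = {1: "Q", 23: "C", 50: "A", 55: "C", 104: "C", 110: "C"}
result = noncanonical_cysteines(toy)
print(result)
```

Run on the toy input, the canonical pair is ignored and the two remaining cysteines are reported together with the region each one falls in.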

Visualization of possible positions of canonical and non-canonical cysteines on the sequence through FR1, CDR1, FR2, CDR2, FR3 with comparison to canonical disulfide from canonical cysteine positions 23/104 to other types of distributions across camelid species.
Figure 3. Presence of canonical and non-canonical cysteines in the IGHV3 family with distinguished position types adapted from Liu et al. 10

Based on the positions of non-canonical cysteine residues, VHHs could be classified into eight distinct types, which exhibited varying distributions across the camelid species. Types 2b, 2c, and 3a were the predominant types in the Bactrian camel and dromedary, while Types 1 and 3b were dominant in the alpaca.10 The presence of non-canonical cysteines facilitates the formation of additional disulfide bonds, thereby enhancing the structural diversity of VHHs.10 This effect is more profound in the Bactrian camel and dromedary compared to the alpaca.10

Notably, the remarkable usage of non-canonical cysteines within VHHs was identified in the Bactrian camel, both at the germline and rearranged levels. These non-canonical cysteines are rarely observed in VHs and other non-camelid species.11 It is hypothesized that the cysteine residues on CDR3 may form disulfide bridges with the cysteine within FR2 or the cysteine on CDR1, resulting in a special disulfide bond configuration.11,12

Disulfide bonds formed by cysteine residues impact protein folding. The additional non-canonical cysteines introduced by heavy-chain antibodies (HCAbs) could lead to novel loop conformations and thereby increase the structural diversity and stabilize the VHH domain.13 Additionally, compared to the alpaca, the Bactrian camel and dromedary exhibited significantly more and distinct non-canonical cysteines, implying a greater diversity of HCAb structures in these two species.10

The introduction of non-canonical cysteines, along with extended CDR3 lengths and increased hypermutation hotspots, may contribute to the acquisition of a diverse antigen-binding repertoire in HCAbs, compensating for the absence of light chains.13,14

Non-canonical disulfide bonds across Camelids

One prominent example is a non-canonical disulfide bond between cysteine residues that lie outside the traditional antibody framework regions. It occurs in the dromedary camel (Camelus dromedarius), where the most common non-canonical disulfide bond links the complementarity determining regions CDR1 and CDR3, with the cysteine on CDR1 (position 33) being germline encoded.12 The prevalence of this structural motif suggests it plays an important role in the structure and/or function of the HCAbs encoded by this germline gene segment.

Bar chart of presence of noncanonical disulfide bonds in CDR1-CDR3 and CDR2-CDR3 among different camelids and visualization of proportions of occurrence of number of canonical and non-canonical cysteines among different species.
Figure 4. Presence of non-canonical cysteines in the variable regions, shown by frequency per region and by occurrence in different species for VH and VHH, adapted from Liu et al. 10

Through systematic investigations on a set of five camel VHH domains (the single N-terminal variable domains of HCAbs), two key hypotheses have emerged regarding the potential functional significance of the CDR1-CDR3 disulfide bond15:

Antigen Binding Affinity Hypothesis

It is proposed that the CDR1-CDR3 disulfide bond contributes to the antigen binding affinity of HCAbs/VHH by reducing entropic penalties. The CDR3 loops in camelid VHH average three residues longer than those of conventional antibodies.16 By constraining the conformational flexibility of these elongated CDR3 loops, the non-canonical disulfide bond could lower the entropic cost associated with loop immobilization upon antigen binding. However, experimental evidence indicates the effects on affinity are variable across different VHH, rather than being a universal entropy-driven effect.15
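The entropic argument can be written out with the standard relation for binding free energy. This is a textbook-level sketch rather than a model taken from the cited studies; splitting the entropy change into a loop-conformational term and a remainder is an assumption made purely for illustration.

```latex
\Delta G_{\mathrm{bind}} = \Delta H_{\mathrm{bind}} - T\,\Delta S_{\mathrm{bind}},
\qquad
\Delta S_{\mathrm{bind}} = \Delta S_{\mathrm{loop}} + \Delta S_{\mathrm{other}}
```

Immobilizing a flexible CDR3 loop on binding makes \Delta S_{\mathrm{loop}} negative, so the term -T\,\Delta S_{\mathrm{loop}} is positive and works against binding. If a disulfide bond already restricts the loop in the unbound state, the magnitude of \Delta S_{\mathrm{loop}} is smaller and the penalty shrinks, provided the constrained conformation remains compatible with the antigen.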

Thermal Stability Adaptation Hypothesis

The second hypothesis suggests an evolutionary role where the CDR1-CDR3 disulfide bond enhances the biophysical properties of the VHH domain to prevent heat-induced aggregation in the absence of a light chain partner.15 This disulfide cross-link may represent a key adaptation allowing camels to survive the extreme desert climates and high body temperatures of their natural habitat.17 Camels are known to conserve water by elevating their internal body temperature as high as 40°C during the day and only dissipating heat at night to avoid water loss from evaporation.18

Experiments monitoring the aggregation propensity of a camel VHH (R303) and a variant lacking the CDR1-CDR3 bond (R303 C33A/C102A) at moderate temperatures support the notion that this disulfide enhances the reversible folding and solubility of the isolated VHH domain under the denaturing thermal conditions seen in the camel's body.16 While not conclusive, these results indicate that the thermal stability hypothesis is plausible.

The CDR1-CDR3 disulfide bonding pattern is almost exclusively found in Camelus dromedarius among camelid species, while llamas and alpacas more commonly exhibit non-canonical bonds between CDR2 and CDR3.19,20,21 Furthermore, around 30% of camel VHH sequences contain the germline-encoded cysteine at position 33 that facilitates CDR1-CDR3 bonding, compared to just ~12% in alpacas and ~3.5% in llamas.15
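The percentages quoted above are simply the fraction of sequences in a repertoire that carry a cysteine at a given numbered position. A minimal sketch follows, assuming sequences are already numbered and stored as position-to-residue dictionaries; the function name and the toy repertoire are hypothetical and the data are invented.

```python
def fraction_with_cys(numbered_seqs, position):
    """Fraction of sequences that carry a cysteine at `position`."""
    if not numbered_seqs:
        return 0.0
    hits = sum(1 for seq in numbered_seqs if seq.get(position) == "C")
    return hits / len(numbered_seqs)


# Toy repertoire (invented): 3 of 10 sequences have Cys at position 33.
toy_repertoire = [{33: "C"}] * 3 + [{33: "A"}] * 7
frac = fraction_with_cys(toy_repertoire, 33)
print(f"{frac:.0%}")
```

The position must be interpreted under the same numbering scheme that was used to number the sequences, otherwise frequencies from different studies are not comparable.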

This phylogenetic distribution suggests the CDR1-CDR3 disulfide bond represents a specialized adaptation of camels compared to other camelids.15 The higher frequency in camels may reflect an evolutionary response to the more extreme desert environments and heat stress faced by this particular species within the Camelidae family.15

Overview across other species

Comparison of immunoglobulin structure of different species based on the host including: conventional IgGs from humans, heavy chain antibodies (HCAbs) and nanobodies (VHHs) from camelids, IgNAR from sharks and IgY in chickens.
Figure 5. Visualization of differences in repertoire in different hosts including: human, camel, shark and chicken.

Because non-canonical cysteines in CDR-H3s can form intra- or inter-chain disulfide bonds, they can stabilize unique conformations and create novel paratope shapes.2 The antibodies of chickens, known as immunoglobulin Y (IgY)22, are renowned for their high abundance of non-canonical cysteines in CDR-H3s.23 These cysteines form various intra-CDR-H3 disulfide bonds, contributing to the structural diversity and antigen-binding capabilities of chicken antibodies.24

In sharks, the exceptional diversity of non-canonical cysteines in their antibody repertoire is an evolutionary adaptation to the aquatic environment.6 Shark antibodies, known as IgNARs (immunoglobulin new antigen receptors), possess long CDR-H3s with multiple non-canonical cysteines, forming intricate disulfide bond networks that contribute to their structural stability and antigen-binding capabilities.8

Cows also exhibit a remarkable diversity of non-canonical cysteines in their antibody repertoire, with some antibodies containing up to 28 cysteines in their CDR-H3s.7 These ultra-long CDR-H3s, stabilized by an array of disulfide bonds, are thought to play a role in recognizing complex antigens, such as carbohydrate epitopes found on pathogens.11

Rabbit IgG antibodies frequently contain non-canonical disulfide bonds between CDR-H1 and CDR-H2, as well as between the kappa light chain and CDR-H1/CDR-H2.28 These non-canonical disulfides are often viewed as liabilities for therapeutic antibody development from rabbits.27 The kappa light chains of rabbits can also contain an intra-chain disulfide between cysteines 80 and 171 in the dominant K1 isotype.26 This poses challenges when making chimeric rabbit/human Fabs, because the human kappa constant domain lacks cysteine 171. Rabbits without this K1 intra-chain disulfide show higher sequence diversity and affinity of chimeric Fabs.26 Some rabbit allotypes, such as b9, instead have an alternative intra-chain disulfide in the kappa light chain between cysteines 108 and 171. The high diversity of rabbit light chains is functionally important for antigen binding.26

Non-canonical cysteines in human repertoire

The landscape of non-canonical cysteines in the human antibody repertoire has been elucidated through the analysis of large next-generation sequencing (NGS) datasets. By analyzing nearly 3 billion VH sequences from ten individuals, Prabakaran et al.1 identified 12 million unique VH sequences containing non-canonical cysteines in CDR-H3s. These findings challenge the prevailing notion that non-canonical cysteines are rare or absent in human antibodies. The number of non-canonical cysteines in CDR-H3s ranged from one to eight, with two cysteines being the most prevalent. Surprisingly, higher numbers of non-canonical cysteines, typically associated with antibodies from other species like chicken, shark, and cow, were also observed in human CDR-H3s.1 Such non-canonical cysteines created diverse patterns and motifs involving contiguous duplets, triplets, and even septuplets, which were previously unknown in humans.1
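The two summary statistics mentioned above, the number of cysteines per CDR-H3 and the length of contiguous cysteine stretches, are straightforward to compute. The sketch below is illustrative only and is not the analysis code of Prabakaran et al.; it assumes each CDR-H3 is given as a one-letter amino acid string that excludes the flanking canonical Cys 104, and the example strings are invented.

```python
import re
from collections import Counter


def cysteine_count(cdrh3):
    """Number of cysteine residues in a CDR-H3 amino acid string."""
    return cdrh3.upper().count("C")


def longest_cys_run(cdrh3):
    """Length of the longest stretch of contiguous cysteines (0 if none)."""
    runs = re.findall(r"C+", cdrh3.upper())
    return max((len(run) for run in runs), default=0)


# Invented CDR-H3 strings, for illustration only.
toy_cdrh3s = ["ARDYYGSSYFDY", "ARDCSGGSCYFDY", "ARCCGYW", "ARDCSSTSCYDAFDI"]

# How many sequences carry 0, 1, 2, ... cysteines.
distribution = Counter(cysteine_count(s) for s in toy_cdrh3s)
# Longest contiguous cysteine stretch per sequence (2 = duplet, 3 = triplet).
runs = [longest_cys_run(s) for s in toy_cdrh3s]

print(sorted(distribution.items()))
print(runs)
```

A run length of 2 corresponds to a contiguous duplet and 3 to a triplet, so the same helper covers the longer stretches as well.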

These non-canonical cysteine motifs in human CDR-H3s display remarkable diversity, with over 4,000 unique patterns identified.1 The CXnC motif, where Xn represents the number of amino acids separating the two cysteines, was found to be the most prevalent, with the CX4C motif accounting for nearly 75% of all CXnC motifs. In total, 34,266 unique tetrapeptides were embedded within the CX4C motifs, highlighting the sequence diversity.1
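A CXnC motif can be located with a regular expression. The sketch below is one possible way to do it, under the assumption that a motif is defined by two consecutive cysteines separated by n non-cysteine residues; the exact definition used in the cited study may differ, and the names and example sequences are invented.

```python
import re
from collections import Counter

# A lookahead is used so that every cysteine is tested as a motif start.
# [^C]* limits each match to the next cysteine in the sequence.
CXNC = re.compile(r"C(?=([^C]*)C)")


def cxnc_motifs(cdrh3):
    """Return (n, spacer) for each pair of consecutive cysteines, n >= 1."""
    return [
        (len(m.group(1)), m.group(1))
        for m in CXNC.finditer(cdrh3.upper())
        if m.group(1)  # skip contiguous CC, which has no spacer
    ]


# Invented CDR-H3 strings, for illustration only.
toy_cdrh3s = ["ARDCSGGSCYFDY", "ARDCSSTSCYDAFDI", "ARCGYCW", "ARCCGYW"]

motifs = [m for s in toy_cdrh3s for m in cxnc_motifs(s)]
spacing_counts = Counter(n for n, _ in motifs)
tetrapeptides = {spacer for n, spacer in motifs if n == 4}

print(spacing_counts)
print(sorted(tetrapeptides))
```

Counting the distinct spacers for n = 4 mirrors the tetrapeptide tally described above, here on toy data.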

The presence of non-canonical cysteines in human CDR-H3s is reminiscent of the patterns observed in other species, suggesting an evolutionary relationship between the VHs of humans and those of other animals. For instance, the two-cysteine CXnC motifs found in human CDR-H3s resemble those observed in chicken, camel, llama, shark, and cow antibodies.8 However, the diversity and complexity of non-canonical cysteine motifs in human CDR-H3s appear to be more extensive, encompassing a broader range of patterns and potential disulfide bonds.1

References

  1. Prabakaran P, Chowdhury PS. Landscape of Non-canonical Cysteines in Human VH Repertoire Revealed by Immunogenetic Analysis. Cell Rep. 2020 Jun 30;31(13):107831. doi: 10.1016/j.celrep.2020.107831. PMID: 32610132; PMCID: PMC7326410.
  2. Frangione B, Milstein C, Pink JR. Structural studies of immunoglobulin G. Nature. 1969 Jan 11;221(5176):145-8. doi: 10.1038/221145a0. PMID: 5782707.
  3. Wu L, Oficjalska K, Lambert M, Fennell BJ, Darmanin-Sheehan A, Ní Shúilleabháin D, Autin B, Cummins E, Tchistiakova L, Bloom L, Paulsen J, Gill D, Cunningham O, Finlay WJ. Fundamental characteristics of the immunoglobulin VH repertoire of chickens in comparison with those of humans, mice, and camelids. J Immunol. 2012 Jan 1;188(1):322-33. doi: 10.4049/jimmunol.1102466. Epub 2011 Nov 30. PMID: 22131336.
  4. Muyldermans S, Atarhouch T, Saldanha J, Barbosa JA, Hamers R. Sequence and structure of VH domain from naturally occurring camel heavy chain immunoglobulins lacking light chains. Protein Eng. 1994 Sep;7(9):1129-35. doi: 10.1093/protein/7.9.1129. PMID: 7831284.
  5. Harmsen MM, Ruuls RC, Nijman IJ, Niewold TA, Frenken LG, de Geus B. Llama heavy-chain V regions consist of at least four distinct subfamilies revealing novel sequence features. Mol Immunol. 2000 Aug;37(10):579-90. doi: 10.1016/s0161-5890(00)00081-x. PMID: 11163394.
  6. Feng M, Bian H, Wu X, Fu T, Fu Y, Hong J, Fleming BD, Flajnik MF, Ho M. Construction and next-generation sequencing analysis of a large phage-displayed VNAR single-domain antibody library from six naïve nurse sharks. Antib Ther. 2019 Jan;2(1):1-11. doi: 10.1093/abt/tby011. Epub 2018 Nov 7. PMID: 30627698; PMCID: PMC6312525.
  7. Haakenson JK, Deiss TC, Warner GF, Mwangi W, Criscitiello MF, Smider VV. A Broad Role for Cysteines in Bovine Antibody Diversity. Immunohorizons. 2019 Oct 16;3(10):478-487. doi: 10.4049/immunohorizons.1900058. PMID: 31619454; PMCID: PMC7366855.
  8. de los Rios M, Criscitiello MF, Smider VV. Structural and genetic diversity in antibody repertoires from diverse species. Curr Opin Struct Biol. 2015 Aug;33:27-41. doi: 10.1016/j.sbi.2015.06.002. Epub 2015 Jul 17. PMID: 26188469; PMCID: PMC7039331.
  9. Asaadi Y, Jouneghani FF, Janani S, Rahbarizadeh F. A comprehensive comparison between camelid nanobodies and single chain variable fragments. Biomark Res. 2021 Dec 4;9(1):87. doi: 10.1186/s40364-021-00332-6. PMID: 34863296; PMCID: PMC8642758.
  10. Liu Y, Yi L, Li Y, Wang Z, Jirimutu. Characterization of heavy chain antibody gene repertoires in Bactrian camels. J Genet Genomics. 2022;50. doi: 10.1016/j.jgg.2022.04.010.
  11. Haakenson JK, Deiss TC, Warner GF, Mwangi W, Criscitiello MF, Smider VV. A Broad Role for Cysteines in Bovine Antibody Diversity. Immunohorizons. 2019 Oct 1;3(10):478-487. doi: 10.4049/immunohorizons.1900058.
  12. Mendoza MN, Jian M, King MT, Brooks CL. Role of a noncanonical disulfide bond in the stability, affinity, and flexibility of a VHH specific for the Listeria virulence factor InlB. Protein Sci. 2020 Apr;29(4):1004-1017. doi: 10.1002/pro.3831. Epub 2020 Feb 8. PMID: 31981247; PMCID: PMC7096713.
  13. Nguyen VK, Hamers R, Wyns L, Muyldermans S. Camel heavy-chain antibodies: diverse germline V(H)H and specific mechanisms enlarge the antigen-binding repertoire. EMBO J. 2000 Mar 1;19(5):921-30. doi: 10.1093/emboj/19.5.921. PMID: 10698934; PMCID: PMC305632.
  14. Nguyen VK, Muyldermans S, Hamers R. The specific variable domain of camel heavy-chain antibodies is encoded in the germline. J Mol Biol. 1998 Jan 23;275(3):413-8. doi: 10.1006/jmbi.1997.1477. PMID: 9466919.
  15. Govaert J, Pellis M, Deschacht N, Vincke C, Conrath K, Muyldermans S, Saerens D. Dual beneficial effect of interloop disulfide bond for single domain antibody fragments. J Biol Chem. 2012 Jan 13;287(3):1970-9. doi: 10.1074/jbc.M111.242818. Epub 2011 Nov 29. PMID: 22128183; PMCID: PMC3283254.
  16. Mitchell LS, Colwell LJ. Comparative analysis of nanobody sequence and structure data. Proteins. 2018 Jul;86(7):697-706. doi: 10.1002/prot.25497. Epub 2018 Apr 15. PMID: 29569425; PMCID: PMC6033041.
  17. Kunz P, Zinner K, Mücke N, Bartoschik T, Muyldermans S, Hoheisel JD. The structural basis of nanobody unfolding reversibility and thermoresistance. Sci Rep. 2018 May 21;8(1):7934. doi: 10.1038/s41598-018-26338-z. PMID: 29784954; PMCID: PMC5962586.
  18. Schmidt-Nielsen K, Schmidt-Nielsen B, Jarnum SA, Houpt TR. Body temperature of the camel and its relation to water economy. Am J Physiol. 1957 Jan;188(1):103-12. doi: 10.1152/ajplegacy.1956.188.1.103. PMID: 13402948.
  19. Harmsen MM, Ruuls RC, Nijman IJ, Niewold TA, Frenken LG, de Geus B. Llama heavy-chain V regions consist of at least four distinct subfamilies revealing novel sequence features. Mol Immunol. 2000 Aug;37(10):579-90. doi: 10.1016/s0161-5890(00)00081-x. PMID: 11163394.
  20. Maass DR, Sepulveda J, Pernthaner A, Shoemaker CB. Alpaca (Lama pacos) as a convenient source of recombinant camelid heavy chain antibodies (VHHs). J Immunol Methods. 2007 Jul 31;324(1-2):13-25. doi: 10.1016/j.jim.2007.04.008. Epub 2007 May 15. PMID: 17568607; PMCID: PMC2014515.
  21. Vu KB, Ghahroudi MA, Wyns L, Muyldermans S. Comparison of llama VH sequences from conventional and heavy chain antibodies. Mol Immunol. 1997 Nov-Dec;34(16-17):1121-31. doi: 10.1016/s0161-5890(97)00146-6. PMID: 9566760.
  22. Lanzarini NM, Bentes GA, Volotão EM, Pinto MA. Use of chicken immunoglobulin Y in general virology. J Immunoassay Immunochem. 2018;39(3):235-248. doi: 10.1080/15321819.2018.1500375. PMID: 30044696.
  23. Gjetting T, Gad M, Fröhlich C, Lindsted T, Melander MC, Bhatia VK, Grandal MM, Dietrich N, Uhlenbrock F, Galler GR, et al. Sym021, a promising anti-PD1 clinical candidate antibody derived from a new chicken antibody discovery platform. MAbs. 2019;11:666-680.
  24. Könitzer JD, Pramanick S, Pan Q, Augustin R, Bandholtz S, Harriman W, Izquierdo S. Generation of a highly diverse panel of antagonistic chicken monoclonal antibodies against the GIP receptor. MAbs. 2017;9:536-549.
  25. Liang WC, Xi H, Sun D, D'Ascenzo L, Zarzar J, Stephens N, Cook R, Li Y, Ye Z, Matsumoto M, Payandeh J, Masureel M, Wu Y. Structure- and machine learning-guided engineering demonstrate that a non-canonical disulfide in an anti-PD-1 rabbit antibody does not impede antibody developability. MAbs. 2024 Jan-Dec;16(1):2309685. doi: 10.1080/19420862.2024.2309685. Epub 2024 Feb 14. PMID: 38356181; PMCID: PMC10877986.
  26. Weber J, Peng H, Rader C. From rabbit antibody repertoires to rabbit monoclonal antibodies. Exp Mol Med. 2017;49:e305. doi: 10.1038/emm.2017.23.
  27. Mage RG, Esteves PJ, Rader C. Rabbit models of human diseases for diagnostics and therapeutics development. Dev Comp Immunol. 2019;92:99-13. doi: 10.1016/j.dci.2018.10.003. PMID: 30339876.
  28. Liang WC, Xi H, Sun D, D'Ascenzo L, Zarzar J, Stephens N, Cook R, Li Y, Ye Z, Matsumoto M, Payandeh J, Masureel M, Wu Y. Structure- and machine learning-guided engineering demonstrate that a non-canonical disulfide in an anti-PD-1 rabbit antibody does not impede antibody developability. MAbs. 2024 Jan-Dec;16(1):2309685. doi: 10.1080/19420862.2024.2309685. Epub 2024 Feb 14. PMID: 38356181; PMCID: PMC10877986.
