Reliable NGS genotyping of MHC class I and II genes requires template-specific optimization of pipeline settings
  • Artemis Efstratiou,
  • Arnaud Gaigher,
  • Sven Künzel,
  • Ana Teles,
  • Tobias Lenz
Artemis Efstratiou
University of Hamburg Faculty of Mathematics Computer Science and Natural Sciences

Corresponding Author: efstratiou@evolbio.mpg.de

Arnaud Gaigher
University of Hamburg Faculty of Mathematics Computer Science and Natural Sciences
Sven Künzel
Max Planck Institute for Evolutionary Biology
Ana Teles
University of Hamburg Faculty of Mathematics Computer Science and Natural Sciences
Tobias Lenz
University of Hamburg Faculty of Mathematics Computer Science and Natural Sciences

Abstract

Using high-throughput sequencing for precise genotyping of multi-locus gene families, such as the Major Histocompatibility Complex (MHC), remains challenging due to the complexity of the data and difficulties in distinguishing genuine from erroneous variants. Several dedicated genotyping pipelines for data from high-throughput sequencing, such as next-generation sequencing (NGS), have been developed to tackle the ensuing risk of artificially inflated diversity. Here, we thoroughly assess three such multi-locus genotyping pipelines for NGS data, using MHC class IIβ datasets of three-spined stickleback gDNA, cDNA, and “artificial” plasmid samples with known allelic diversity. We show that genotyping of gDNA and plasmid samples at optimal pipeline parameters was highly accurate and reproducible across methods. However, for cDNA data, the same configuration yielded lower overall genotyping precision and lower consistency between pipelines. Further adjustments of key clustering parameters were required to account for higher error rates and larger variation in sequencing depth per allele, highlighting the importance of template-specific pipeline optimization for reliable genotyping of multi-locus gene families. Through accurate paired gDNA-cDNA genotyping and MHC-II haplotype inference, we show that MHC-II allele-specific expression levels correlate negatively with allele number across haplotypes. Lastly, sibship-assisted cDNA genotyping of MHC-I revealed novel variants and haplotype-based allelic segregation, with a higher individual allelic diversity for MHC-I in sticklebacks than previously reported. In conclusion, we provide novel genotyping protocols for MHC-I and -II genes of the three-spined stickleback, evaluate the performance of popular NGS-genotyping pipelines, and highlight the need for template-specific optimization for reliable multi-locus genotyping.
01 Jun 2023 Submitted to Molecular Ecology Resources
05 Jun 2023 Submission Checks Completed
05 Jun 2023 Assigned to Editor
05 Jun 2023 Review(s) Completed, Editorial Evaluation Pending
28 Jun 2023 Reviewer(s) Assigned
06 Sep 2023 Editorial Decision: Revise Minor
21 Oct 2023 1st Revision Received
24 Oct 2023 Submission Checks Completed
24 Oct 2023 Assigned to Editor
24 Oct 2023 Review(s) Completed, Editorial Evaluation Pending
25 Jan 2024 Review(s) Completed, Editorial Evaluation Pending
25 Jan 2024 Editorial Decision: Accept