Evaluating the accuracy of variant calling methods using the frequency
of parent-offspring genotype mismatch
Abstract
The use of next-generation sequencing (NGS) datasets has increased
dramatically over the last decade; however, there have been few
systematic analyses quantifying the
accuracy of the commonly used variant caller programs. Here we used a
familial design consisting of diploid tissue from a single Pinus
contorta parent and the maternally derived haploid tissue from 106
full-sibling offspring, a design in which parent-offspring genotype
mismatches can arise only from mutation or bioinformatic error. Given
the rarity of mutation, we used
the rate of mismatches between parent and offspring genotype calls to
infer the SNP genotyping error rates of FreeBayes, HaplotypeCaller,
SAMtools, UnifiedGenotyper, and VarScan. With baseline filtering,
HaplotypeCaller and UnifiedGenotyper yielded numbers of SNPs and error
rates that were one to two orders of magnitude larger, whereas
FreeBayes, SAMtools, and VarScan yielded fewer SNPs and more modest
error rates. To facilitate comparison between variant callers, we used
additional filtering to standardize each SNP set to the same number of
SNPs. After standardization, UnifiedGenotyper consistently produced the
smallest proportion of genotype errors, followed by HaplotypeCaller,
VarScan, SAMtools, and FreeBayes. Additionally, we found that error
rates were minimized for
SNPs called by more than one variant caller. Finally, we evaluated the
performance of various commonly used filtering metrics on SNP calling.
Our analysis provides a quantitative assessment of the accuracy of five
widely used variant calling programs and offers valuable insights into
both the choice of variant caller program and the choice of filtering
metrics, especially for researchers using non-model study systems.
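
As an illustration of the mismatch-based error estimate described above, the following is a minimal sketch and not the pipeline used in this study. It assumes that genotype calls have already been parsed into plain Python dictionaries; the function name mismatch_rate, the site keys, and the toy data are hypothetical. A haploid offspring call is counted as a mismatch when its allele is not one of the two alleles called in the diploid parent, and missing calls are skipped.

```python
from typing import Dict, Hashable, Optional, Sequence, Tuple

Site = Hashable  # e.g. a (contig, position) tuple; hypothetical key format


def mismatch_rate(
    parent: Dict[Site, Tuple[str, str]],
    offspring: Sequence[Dict[Site, Optional[str]]],
) -> float:
    """Proportion of non-missing haploid offspring calls whose allele
    is absent from the diploid parent's genotype at the same site."""
    mismatches = 0
    compared = 0
    for site, parent_alleles in parent.items():
        for calls in offspring:
            allele = calls.get(site)
            if allele is None:  # missing call: not comparable
                continue
            compared += 1
            if allele not in parent_alleles:
                mismatches += 1
    # No comparable calls: the rate is undefined.
    return mismatches / compared if compared else float("nan")


# Toy data (invented for illustration only).
parent_calls = {("contig1", 100): ("A", "G"), ("contig1", 200): ("C", "C")}
offspring_calls = [
    {("contig1", 100): "A", ("contig1", 200): "C"},
    {("contig1", 100): "T", ("contig1", 200): None},
    {("contig1", 100): "G", ("contig1", 200): "G"},
]
rate = mismatch_rate(parent_calls, offspring_calls)
```

Under the design's assumption that mutation is rare, a rate computed this way serves as an estimate of genotyping error for a given variant caller and filter set. In the toy data, two of the five non-missing calls are mismatches.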