Abstract
Sequencing-based genotyping of heterozygous diploids requires sufficient
depth to accurately call heterozygous genotypes. In interspecific
hybrids, alignment of reads to both parental genomes simultaneously can
generate haploid data, potentially eliminating the problem of
heterozygosity. Two populations of interspecific hybrid rootstocks of
walnut (Juglans) and pistachio (Pistacia) were genotyped by aligning
reads to the maternal genome alone, to the paternal genome alone, and to
both genomes simultaneously (dual alignment). Downsampling was used to examine concordance of
imputed genotype calls as a function of sequencing depth. Dual alignment
resulted in datasets essentially free of heterozygous genotypes,
simplifying the identification and removal of cross-contaminated
samples. Concordance between full and downsampled genotype calls was
always highest after dual alignment. Nearly all SNPs in dual alignment
datasets were shared with the corresponding single-parent datasets, but
60-90% of single-parent SNPs were private to that dataset. Private SNPs
in single-parent datasets had higher rates of heterozygosity, lower
levels of concordance, and were enriched in fixed differences between
parental genomes (“homeo-SNPs”) compared to shared SNPs in the same
dataset. In multi-parental walnut hybrids, the paternal-aligned dataset
was ineffective at resolving population structure in the maternal
parent. Overall, the dual alignment strategy effectively produced
phased, haploid data, increasing data quality and reducing cost.
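
As an illustration of the concordance comparison described above, the following is a minimal sketch in Python. It assumes that concordance is the fraction of genotype calls, among sample-by-SNP pairs called in both the full-depth and the downsampled dataset, that are identical. The abstract does not give the exact definition, so the function name `concordance`, the dictionary layout, and the genotype codes ("A", "B", "H") are hypothetical choices made for this example only.

```python
from typing import Dict, Hashable, Optional

# Hypothetical genotype encoding: "A"/"B" for the two homozygous states,
# "H" for heterozygous, None for a missing call.
Genotype = Optional[str]


def concordance(full: Dict[Hashable, Genotype],
                downsampled: Dict[Hashable, Genotype]) -> float:
    """Return the fraction of jointly called entries on which both datasets agree.

    Keys identify a (sample, SNP) pair. Entries missing in either dataset
    are excluded from the denominator (an assumption of this sketch).
    """
    compared = 0
    matches = 0
    for key, full_call in full.items():
        down_call = downsampled.get(key)
        if full_call is None or down_call is None:
            continue  # skip entries not called in both datasets
        compared += 1
        if full_call == down_call:
            matches += 1
    if compared == 0:
        return float("nan")  # nothing to compare
    return matches / compared


# Toy data, invented for illustration only.
full_calls = {
    ("s1", "snp1"): "A",
    ("s1", "snp2"): "B",
    ("s1", "snp3"): "H",
    ("s1", "snp4"): "A",
}
down_calls = {
    ("s1", "snp1"): "A",   # agrees
    ("s1", "snp2"): "A",   # disagrees
    ("s1", "snp3"): None,  # missing after downsampling, excluded
    ("s1", "snp4"): "A",   # agrees
}
rate = concordance(full_calls, down_calls)
```

Excluding missing calls from the denominator is one possible design choice. Counting them as discordant would instead penalize datasets that lose calls at low depth.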