Data processing and linkage map construction in Lep-MAP3
We used mpileup in SAMtools v1.9 (Li et al., 2009), together with the
pileupParser2 and pileup2posterior scripts distributed with Lep-MAP3
(Rastas, 2017), to calculate genotype posterior probabilities from
trimmed reads aligned separately to the male and female Hi-C HiRise
genome assemblies. The resulting posterior files were used as input to
Lep-MAP3 to produce male- and female-aligned linkage maps.
We used identity-by-descent (IBD) scores to verify the assignment of
individuals to discrete families, and removed individuals that shared
less than 25% IBD with at least half of the individuals within their
respective families. We used the ParentCall2 module to impute genotypes
for F1 parents that were not recovered from the bolts. The Filtering2
module removed markers with high segregation distortion or excessive
missing data (dataTolerance = 0.01, following Lep-MAP3 recommendations).
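The IBD-based family check described above can be sketched as follows. This is a minimal illustration only, not the code used in this study: the function name flag_low_ibd, the nested-dictionary input, and the toy IBD values are all hypothetical, and the sketch assumes that pairwise IBD proportions within a family have already been computed.

```python
from typing import Dict, List


def flag_low_ibd(ibd: Dict[str, Dict[str, float]],
                 min_ibd: float = 0.25,
                 min_fraction: float = 0.5) -> List[str]:
    """Return individuals sharing < min_ibd with >= min_fraction of the
    other members of their family (hypothetical helper)."""
    flagged = []
    for individual, row in ibd.items():
        # Pairwise IBD values to every other family member.
        others = [value for other, value in row.items() if other != individual]
        if not others:
            continue
        low = sum(1 for value in others if value < min_ibd)
        if low / len(others) >= min_fraction:
            flagged.append(individual)
    return sorted(flagged)


# Toy family (made-up values): A, B and C look like siblings, X does not.
family = {
    "A": {"B": 0.48, "C": 0.52, "X": 0.05},
    "B": {"A": 0.48, "C": 0.50, "X": 0.02},
    "C": {"A": 0.52, "B": 0.50, "X": 0.04},
    "X": {"A": 0.05, "B": 0.02, "C": 0.04},
}
print(flag_low_ibd(family))
```

In this toy example only X falls below the 25% threshold for at least half of its family, so only X would be removed.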
Next, we used the SeparateChromosomes2 module of Lep-MAP3 to separate
SNPs into distinct linkage groups representing putative chromosomes. We
required retained linkage groups to contain at least 70 SNPs, and set
the informativeMask parameter to 23, which excluded markers that were
informative only for the fathers (i.e., we retained markers that were
either informative for the mothers, or for both mothers and fathers).
Including markers informative only for fathers substantially reduced the
number of SNPs assigned to linkage groups. We adjusted the LOD score
threshold until the number of retained linkage groups closely matched
the known chromosome number (11 autosomes + 2 neo-sex chromosomes) from
mountain pine beetle karyology (Lanier & Wood, 1968). In general, an
appropriate LOD score is similar to the number of chromosomes in the
genome (Rastas, 2017).
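The LOD adjustment described above amounts to a small search, sketched below. This is a hedged illustration rather than the procedure as run: the helper names count_retained_groups and pick_lod are hypothetical, the per-marker linkage group assignments are assumed to have been read from the SeparateChromosomes2 output beforehand, and group 0 is assumed to mean "unassigned". The toy example uses tiny thresholds so that it stays readable; the study used at least 70 SNPs per group and a target of 13 groups.

```python
from collections import Counter
from typing import Dict, List


def count_retained_groups(assignments: List[int], min_snps: int = 70) -> int:
    """Count linkage groups holding at least min_snps markers.

    Assumption: group 0 marks markers not assigned to any linkage group.
    """
    sizes = Counter(group for group in assignments if group != 0)
    return sum(1 for n in sizes.values() if n >= min_snps)


def pick_lod(runs: Dict[int, List[int]], target: int = 13,
             min_snps: int = 70) -> int:
    """Return the LOD whose retained-group count is closest to target.

    Ties are broken in favour of the lowest LOD value.
    """
    return min(
        sorted(runs),
        key=lambda lod: abs(count_retained_groups(runs[lod], min_snps) - target),
    )


# Toy assignments (made up): one list of per-marker groups per LOD value.
runs = {
    5: [1, 1, 1, 1, 1, 0],
    10: [1, 1, 1, 2, 2, 0],
    15: [1, 1, 2, 3, 4, 0],
}
print(pick_lod(runs, target=2, min_snps=2))
```

With these toy values, LOD 10 is the only setting that yields two groups of at least two markers, so it is the one selected.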
We then ordered each linkage group in both maps five times using the
OrderMarkers2 module of Lep-MAP3 and, for each group, selected the
marker order with the highest likelihood. We checked each file for
incorrect marker ordering by visualizing linkage group graphs with xdot
v1.1 (Fonseca, 2019). If any of these graphs indicated improper marker
ordering, we discarded that replicate, chose the replicate with the
next-highest likelihood score, and checked it again. This produced
separate SNP recombination distances for male and female specimens in
each linkage group, which were used as input for ALLMAPS (Tang et al.,
2015). These linkage maps were used to inform joining, ordering, and
orientation of scaffolds in the male and female Hi-C HiRise genome
assemblies (described above). After these assembly modifications and
subsequent steps were completed, we regenerated the male- and
female-aligned linkage maps and ALLMAPS figures using the final versions
of the male and female genome assemblies and the same parameters
described above. We then repeated the marker ordering step in Lep-MAP3,
this time outputting a single, sex-averaged distance for each SNP in the
linkage groups in order to produce chromosome maps using the
LinkageMapView package (Ouellette et al., 2018) in R v3.6.1 (R Core
Team, 2020). We visualized each chromosome as a density map to identify
regions with strong genetic linkage, indicated by shorter per-locus cM
distances.
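Two steps from this paragraph can be sketched in a few lines: choosing the highest-likelihood ordering replicate that passes the visual check, and computing per-locus cM distances for a density view. This is an illustrative sketch under stated assumptions, not the workflow as run: the Replicate record, the passes_check flag (standing in for the manual inspection of the xdot graphs), the helper names, and all numeric values are hypothetical.

```python
from typing import List, NamedTuple, Optional


class Replicate(NamedTuple):
    """One ordering run for a linkage group (hypothetical record)."""
    name: str
    log_likelihood: float
    passes_check: bool  # outcome of the manual graph inspection


def select_order(replicates: List[Replicate]) -> Optional[Replicate]:
    """Return the highest-likelihood replicate that passed inspection."""
    ranked = sorted(replicates, key=lambda r: r.log_likelihood, reverse=True)
    for replicate in ranked:
        if replicate.passes_check:
            return replicate
    return None  # every replicate was rejected


def per_locus_gaps(positions_cm: List[float]) -> List[float]:
    """Distances in cM between consecutive markers along one group."""
    ordered = sorted(positions_cm)
    return [b - a for a, b in zip(ordered, ordered[1:])]


# Toy values (made up): run2 has the best likelihood but failed inspection.
runs = [
    Replicate("run1", -1520.4, True),
    Replicate("run2", -1498.7, False),
    Replicate("run3", -1503.2, True),
]
print(select_order(runs).name)
print(per_locus_gaps([0.0, 0.5, 0.5, 3.0]))
```

Short gaps between consecutive markers correspond to the tightly linked regions that a density map highlights.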