Stacks parameter optimization
The protocol established by
Paris et al. (2017) to identify the optimal parameters for the “de
novo” analysis was followed. This optimization is critical because two
different species are compared, and interspecific samples are more variable
than intraspecific ones. A sub-sample of one individual from each
population was selected at random to capture as much of the genetic diversity
of the full sample set as possible. For this reduced data set, the Stacks
pipeline was run several times, varying only one parameter in each
run. Initially, the default parameters were used: M, the maximum distance (in
nucleotides) allowed between stacks (default 2); m, the minimum depth of
coverage required to create a stack (default 3); and N, the maximum distance
allowed to align secondary reads to primary stacks (default M + 2).
Then, the ustacks parameters were increased one at a time: m from 3 to
4 (m3-m4) and M from 2 to 5 (M2-M5). The cstacks parameter n, the number of
mismatches allowed between sample loci when building the catalog, was also
tested from 1 to 5 (n1-n5). In each series, the parameters not under test
were held at m3, M2, and n1. For each parameter, the value that maximized the
number of recovered polymorphic loci was selected, that is, the value beyond
which further increases recovered the same number of polymorphic loci. To
incorporate the
interspecific polymorphisms, the populations parameter -R was set to
90%, so that only loci present in at least 90% of the samples were retained,
including at least one A. alalia individual.
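
The optimization logic described above can be sketched in a few lines of Python. This is a minimal illustration and not the code used in the study: the function names, the popmap dictionary, and the random seed are hypothetical, and the sketch only enumerates parameter sets and applies the plateau rule; it does not call Stacks itself. The default values and tested ranges are those given in the text.

```python
import random

# Defaults from the text: ustacks m=3, M=2; cstacks n=1.
DEFAULTS = {"m": 3, "M": 2, "n": 1}
# Ranges tested in the text, one parameter at a time.
RANGES = {"m": range(3, 5), "M": range(2, 6), "n": range(1, 6)}


def pick_one_per_population(popmap, seed=0):
    """Randomly pick one individual from each population.

    popmap maps a sample name to its population label (hypothetical input).
    """
    rng = random.Random(seed)
    by_pop = {}
    for sample, pop in sorted(popmap.items()):
        by_pop.setdefault(pop, []).append(sample)
    return {pop: rng.choice(samples) for pop, samples in by_pop.items()}


def one_at_a_time_runs(defaults=DEFAULTS, ranges=RANGES):
    """List parameter sets in which at most one parameter departs from the defaults."""
    runs = [dict(defaults)]  # the all-defaults run comes first
    for name, values in ranges.items():
        for value in values:
            if value != defaults[name]:
                runs.append({**defaults, name: value})
    return runs


def plateau_value(loci_by_value):
    """Return the smallest value after which a further increase leaves the
    number of polymorphic loci unchanged. If no plateau is observed,
    fall back to the value with the most polymorphic loci."""
    values = sorted(loci_by_value)
    for current, following in zip(values, values[1:]):
        if loci_by_value[following] == loci_by_value[current]:
            return current
    return max(values, key=lambda v: loci_by_value[v])
```

With the ranges above, one_at_a_time_runs() yields nine parameter sets: the default run, one additional m value, three additional M values, and four additional n values. The polymorphic-locus counts passed to plateau_value would come from the populations output of each run; no such counts are reported in this section, so none are shown here.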