Optimization of Stacks parameters
With the default Stacks parameters, we obtained 9,971 polymorphic loci,
with 8,126 SNPs present in 90% of our sub-sample of 12 individuals.
Increasing the -m parameter, which is the minimum depth of coverage
required to create a stack, slightly reduced the number of polymorphic
loci (R 90% = 9,604 loci). We therefore kept -m = 3, the default of the
ustacks program, for all subsequent analyses (including the optimized
one). Similarly, increasing the
-M parameter, which is the maximum number of mismatches allowed between
stacks within an individual, from its default value (-M = 2) to -M = 3
resulted in a non-significant rise in the number of polymorphic loci
(R 90% = 9,894 loci). However, the number of observed SNPs increased to
8,251 with -M = 4. Raising -M further to 5 increased neither the number
of polymorphic loci nor the number of SNPs, so we kept -M = 4 in the
subsequent analyses.
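
The -M sweep described above can be sketched as a small script that only builds the ustacks command lines, one per -M value, without running them. This is a minimal sketch and not the pipeline used in the study: the input file name, sample ID, and output directories are hypothetical placeholders, and only the ustacks options -f, -i, -o, -m, and -M are assumed.

```python
# Sketch only: builds (does not run) ustacks command lines for a -M sweep.
# The file name, sample ID, and output directories are hypothetical.
from pathlib import Path


def ustacks_command(fastq, sample_id, out_dir, m=3, M=2):
    """Return the argument list for one ustacks run."""
    return [
        "ustacks",
        "-f", str(fastq),      # input reads for one individual
        "-i", str(sample_id),  # unique integer ID for the sample
        "-o", str(out_dir),    # output directory for this run
        "-m", str(m),          # minimum depth of coverage to create a stack
        "-M", str(M),          # maximum mismatches allowed between stacks
    ]


def sweep_M(fastq, sample_id, base_dir, M_values=(2, 3, 4, 5), m=3):
    """Build one command per -M value, each writing to its own directory."""
    return {
        M: ustacks_command(fastq, sample_id,
                           Path(base_dir) / f"m{m}_M{M}", m=m, M=M)
        for M in M_values
    }


commands = sweep_M("sample_01.fq.gz", 1, "stacks_opt")
for M, cmd in commands.items():
    print(" ".join(cmd))
```

Writing each run to its own directory keeps the outputs for different -M values separate, so the numbers of polymorphic loci and SNPs can be compared afterwards.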
As expected for data obtained from two different species, the parameter
that most increased the number of polymorphic loci was -n, which sets
the number of mismatches allowed between sample loci when the catalog
is built. In highly variable populations or between different species,
interspecific or interindividual variation makes it more likely that the
same locus will be split into more than one stack (or alignment) in
de novo analyses. The default value of -n is 1, which means that if two
copies of the same locus (100 base pairs) differ by more than one
mismatch or mutation, they are treated as different loci. In that case,
the locus is divided into two stacks, decreasing the
number of SNPs and increasing the number of non-variable loci.
Increasing -n to 5, the maximum recommended by Paris et al. (2017) for
-M = 4, recovered 12,460 polymorphic loci, for a total of 11,178 SNPs.
Therefore, this value of -n was used for the analysis of all samples.
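
The effect of the -n threshold can be illustrated with a toy example. The sketch below is not the cstacks algorithm, whose matching is more involved. It only assumes that two equal-length sequences are merged into one catalog locus when they differ at no more than n positions. The two 20-bp sequences are invented for illustration and are not data from this study.

```python
# Toy illustration, not the cstacks algorithm: two equal-length sequences
# are treated as the same catalog locus when they differ at <= n sites.

def mismatches(seq_a, seq_b):
    """Count differing positions between two equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


def same_catalog_locus(seq_a, seq_b, n=1):
    """Return True if the two sequences would be merged under threshold n."""
    return mismatches(seq_a, seq_b) <= n


# Hypothetical alleles of one locus in two species, differing at 3 sites.
species_a = "ACGTACGTACGTACGTACGT"
species_b = "ACGAACGTACCTACGTACGA"

print(mismatches(species_a, species_b))               # 3
print(same_catalog_locus(species_a, species_b, n=1))  # False: split in two
print(same_catalog_locus(species_a, species_b, n=5))  # True: merged
```

Under the default threshold the two divergent alleles end up as separate, apparently non-variable loci. Under the higher threshold they are merged into a single locus carrying three SNPs.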