Optimization of Stacks parameters
With the default Stacks parameters, 9,971 polymorphic loci were obtained, with 8,126 SNPs present in 90% of our sub-sample of 12 individuals. Increasing the -m parameter, the minimum depth of coverage required to create a stack, slightly reduced the number of polymorphic loci (R 90% = 9,604 loci). Therefore, -m = 3, the default of the ustacks program, was applied in all subsequent analyses (including the optimized one). Similarly, increasing the -M parameter, the maximum distance (in nucleotides) allowed between stacks within an individual, from its default value (-M = 2) to -M = 3 resulted in a non-significant rise in the number of polymorphic loci (R 90% = 9,894 loci). At -M = 4, however, the number of observed SNPs increased to 8,251. Raising -M further to 5 increased neither the number of polymorphic loci nor the number of SNPs, so -M = 4 was retained for the subsequent analyses.
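The 90% presence criterion used above can be illustrated with a minimal sketch. It assumes that a locus counts as retained when it is genotyped in at least 90% of the individuals, which for 12 individuals means at least 11. The function names, sample labels and the small presence table are hypothetical and serve only as an illustration; they are not part of Stacks or of our data.

```python
from typing import Dict, List, Set


def min_samples(n_individuals: int, percent: int) -> int:
    """Smallest number of individuals that reaches `percent` of the sample.

    Integer ceiling division avoids floating-point rounding issues.
    """
    return -(-n_individuals * percent // 100)


def loci_passing(presence: Dict[str, Set[str]],
                 n_individuals: int,
                 percent: int = 90) -> List[str]:
    """Return the loci genotyped in at least `percent` of the individuals."""
    threshold = min_samples(n_individuals, percent)
    return sorted(locus for locus, inds in presence.items()
                  if len(inds) >= threshold)


# Hypothetical sub-sample of 12 individuals (labels are illustrative only).
samples = [f"ind{i:02d}" for i in range(1, 13)]

# Hypothetical presence table: locus -> individuals in which it was genotyped.
presence = {
    "locus_1": set(samples),       # 12 of 12 individuals
    "locus_2": set(samples[:11]),  # 11 of 12 individuals
    "locus_3": set(samples[:10]),  # 10 of 12 individuals
}

kept = loci_passing(presence, len(samples))
print(min_samples(len(samples), 90), kept)
```

Under this assumption, a locus missing in a single individual is still counted, whereas a locus missing in two individuals is excluded.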
As expected for data obtained from two different species, the parameter that most increased the number of polymorphic loci was -n, which determines the number of mismatches (“errors”) allowed between sample loci when the catalog is built. In highly variable populations or between different species, interspecific or interindividual variation makes it more likely that the same locus will be split into more than one stack (or alignment) in “de novo” analyses. The default value of -n is 1, which means that sequences differing by more than one “error” or mutation within the same locus (100 base pairs) are treated as different loci. In that case, the locus is split into two stacks, decreasing the number of SNPs and increasing the number of non-variable loci. Increasing -n to 5, the maximum recommended by Paris et al. (2017) for -M = 4, recovered 12,460 polymorphic loci, for a total of 11,178 SNPs. Therefore, this value of -n was used for the analysis of all samples.
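The effect of the -n threshold can be sketched with a simplified example. It assumes that two sample loci are compared by counting substitutions between equal-length consensus sequences, and that they are merged into one catalog locus when the count does not exceed -n. This is a deliberate simplification and not the matching algorithm implemented in cstacks; the function names and the sequences are hypothetical.

```python
def mismatches(a: str, b: str) -> int:
    """Count substitutions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def same_catalog_locus(a: str, b: str, n: int) -> bool:
    """Simplified rule: merge two sample loci if they differ by at most n sites."""
    return mismatches(a, b) <= n


# Hypothetical 100 bp consensus sequence of a locus in the first species.
species_a = "ACGTACGTAC" * 10

# The same locus in the second species, carrying three fixed substitutions.
bases = list(species_a)
for pos in (10, 45, 80):
    bases[pos] = "C" if bases[pos] == "A" else "A"
species_b = "".join(bases)

print(mismatches(species_a, species_b))             # number of differing sites
print(same_catalog_locus(species_a, species_b, 1))  # default -n = 1
print(same_catalog_locus(species_a, species_b, 5))  # -n = 5
```

In this illustration the two sequences differ at three sites, so they are kept as separate loci with -n = 1 and merged into a single locus with -n = 5. The three interspecific differences appear as SNPs only in the merged case.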