SNPs calling and Population Statistics

Firstly, all loci were recovered in each sample using ustacks (optimized parameters: -M 4, -m 3), which aligned the short-read sequences into exactly matching stacks (or putative alleles). A catalog of loci was created in the cstacks (parameter: -n 5), merging the alleles of all samples to form a consensus without a reference genome. Following, samples were matched back against the catalog using the option sstacks and transposing the data (so it is oriented by locus instead of by sample) with tsv2bam. For de novo analyses and single-end reads, the final step was to call SNPs from the loci with the gstacks program.
The population program (parameters –write_single_snp -R 50 -p 6) was used to compute standard population genetics statistics (expected/observed heterozygosity, π, and FIS), as well as to export a variety of standard output formats (Catchen et al., 2011). We calculated divergence from Hardy-Weinberg equilibrium (HWE) for each locus and estimated SNP and haplotype-based F statistics.
Phylogenetic reconstruction
The W-IQ-TREE tool (Trifinopoulos et al., 2016) was used to reconstruct a maximum likelihood (ML) phylogenetic hypothesis, with two objectives: to ascertain whether there is a phylogenetic structure between the species and whether populations (localities) have a phylogenetic structure. The matrix was composed of 142 individuals and 9370 SNPs, 6550 of which were parsimony-informative markers, and 2821 singleton sites. Four individuals were removed from the original matrix due to more than 50% missing values. Two other species from the Actinote ’s orangish-red mimicry complex were used as outgroup: A. catarina,the putative sister species of the A. alalia + A. mantiqueira clade, and A. dalmeidai . The program ModelFinder (Kalyaanamoorthy et al., 2017) was applied to determine the best substitution model according to the Bayesian information criterion (BIC), including an ascertainment bias correction (+ASC) model (Lewis, 2001), as suggested for alignments without constant sites (such as SNP data). In this way, the ML analysis was run under the model TVMe+ASC+G4. Support for nodes was evaluated with 5,000 Shimodaira-Hasegawa-like (SH-aLRT) and ultrafast bootstrap (UFBoot2) approximations (Guindon et al., 2010; Hoang et al., 2017).