Paleoclimatic distribution simulations
Ecological niche modeling (ENM) was used to test the influence of paleoclimate on the distribution of A. mantiqueira and A. alalia. The occurrence points for each species were obtained from an extensive dataset (Gueratto et al. submitted) and mapped onto a grid with a cell size of 5 arc-minutes (~10 km). As a result, a total of 21 occurrence points were recorded for A. alalia and 78 for A. mantiqueira. The number of occurrences of each species in each grid cell was standardized to avoid oversampling of specific environmental conditions (abundance biases).
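The gridding step can be illustrated with a minimal Python sketch. It assumes that "standardized" means keeping a single record per occupied 5 arc-minute cell, which the text does not state explicitly; the function names and coordinates are hypothetical and are not taken from the study's dataset or scripts.

```python
import math

CELL_DEG = 5.0 / 60.0  # 5 arc-minutes expressed in decimal degrees


def cell_index(lon, lat, cell=CELL_DEG):
    """Return the (column, row) of the grid cell that contains a point."""
    col = math.floor((lon + 180.0) / cell)
    row = math.floor((lat + 90.0) / cell)
    return col, row


def thin_to_grid(points, cell=CELL_DEG):
    """Keep one record per occupied cell, reported at the cell centre.

    Assumption: one record per cell is how abundance bias is removed.
    """
    occupied = {cell_index(lon, lat, cell) for lon, lat in points}
    return sorted(
        ((col + 0.5) * cell - 180.0, (row + 0.5) * cell - 90.0)
        for col, row in occupied
    )


# Hypothetical (lon, lat) records; the first two fall in the same cell.
records = [(-45.451, -22.701), (-45.452, -22.702), (-44.60, -22.40)]
thinned = thin_to_grid(records)
print(len(records), "records reduced to", len(thinned), "occupied cells")
```

Reporting the cell centre rather than the original coordinates keeps the thinned records aligned with the climate grid.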
To understand the pattern of spatial expansion and retraction of both Actinote species over glacial cycles, models were built from the present to 800 thousand years ago (kya), a period covering nineteen glacial-interglacial cycles. The climate variables for the Last Glacial Maximum (21 kya), Last Interglacial (LIG; 130 kya) and Upper Pleistocene (~787 kya) periods were obtained from the Paleoclim database (http://www.paleoclim.org/) (Fordham et al. 2017, Brown et al. 2018). To avoid collinearity among the climate variables, five axes were selected through Principal Component Analysis (PCA), accounting for the greatest variation in temperature and precipitation in the current climate data for the Neotropical region. The PCA coefficients were then applied to the past climates to obtain their scores on the same axes, thereby keeping the dimensionality of the climate predictors consistent across periods (see the protocols in Manly et al. 1994, Legendre & Legendre 1998, Amaral et al. 2021). The climate conditions (PCA axes) were interpolated using the global stacked δ18O curve (Lisiecki & Raymo 2005) as a covariate (see Lawing & Poly 2011, Raposo et al. 2021). Interpolation was calculated every 1,000 years between the present and 600 kya, and every 2,000 years in the interval from 602 to 786 kya.
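The two steps described above, projecting past climates onto present-day PCA axes and interpolating between periods with δ18O as a covariate, can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' R code: it assumes the variables are standardized with present-day means and standard deviations before projection, and that interpolation scales linearly with δ18O between two anchor periods. All loadings, climate values and δ18O values are hypothetical placeholders.

```python
def project_onto_axes(values, present_mean, present_sd, loadings):
    """Project one grid cell's climate values onto axes fitted to the present.

    Assumption: variables are standardized with present-day statistics, so
    past and present scores share the same axes and units.
    """
    z = [(v - m) / s for v, m, s in zip(values, present_mean, present_sd)]
    return [sum(zi * w for zi, w in zip(z, axis)) for axis in loadings]


def interpolate_scores(scores_a, scores_b, d18o_a, d18o_b, d18o_t):
    """Interpolate PCA scores between two anchor periods.

    Assumption: the weight is the position of the d18O value at time t
    between the d18O values of the two anchor periods.
    """
    weight = (d18o_t - d18o_a) / (d18o_b - d18o_a)
    return [a + weight * (b - a) for a, b in zip(scores_a, scores_b)]


# Hypothetical present-day statistics for two variables
# (temperature in degrees C, precipitation in mm) and two PCA axes.
present_mean = [20.0, 1500.0]
present_sd = [5.0, 500.0]
loadings = [[0.7071, 0.7071], [0.7071, -0.7071]]

present_scores = project_onto_axes([22.0, 1600.0], present_mean, present_sd, loadings)
glacial_scores = project_onto_axes([15.0, 1000.0], present_mean, present_sd, loadings)

# Hypothetical d18O values for the present, the glacial anchor and time t.
scores_t = interpolate_scores(present_scores, glacial_scores, 3.2, 5.0, 4.1)
print(scores_t)
```

Reusing the present-day means, standard deviations and loadings for every period is what keeps the axes comparable through time; refitting a PCA per period would give axes with different meanings.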
Because each algorithm yields distinct predictions according to the niche breadth of the species (Qiao et al. 2015), their combined use may increase the accuracy of predictions by considering different niche tolerances in the potential distribution of species (Araujo & New 2007, Diniz-Filho et al. 2009). Therefore, five different algorithms were used: Bioclim (Nix 1986), the distance method Domain (Gower distance; Carpenter et al. 1993), support vector machines (SVM; Tax & Duin 2004), maximum entropy (Maxent; Phillips & Dudík 2008) and Ecological-Niche Factor Analysis (ENFA; Hirzel et al. 2002). To evaluate the generated models, the occurrence points were randomly divided into two groups, training and testing, containing 70% and 30% of the occurrence points, respectively. Model performance was evaluated with the D statistic, which considers only presence records and weights the true positive rate (TPR) by the complement of the proportion of the study area predicted as present (pi): D = TPR × (1 − pi) (Pearson et al. 2007). Thresholds that maximize sensitivity and specificity were calculated in order to maximize the correct classification of presences and absences. After defining the thresholds, a prediction map for each species was obtained following the ensemble technique (Araujo & New 2007).
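The evaluation step can be illustrated with a short Python sketch of the 70/30 split and the D statistic as defined above. The function names, the random seed and the example predictions are hypothetical; only the formula D = TPR × (1 − pi) and the split proportions come from the text.

```python
import random


def split_occurrences(points, train_fraction=0.7, seed=0):
    """Randomly split occurrence records into training and testing sets."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    shuffled = list(points)
    rng.shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


def d_statistic(test_predicted_present, predicted_area_fraction):
    """Compute D = TPR * (1 - pi) from presence-only test records.

    test_predicted_present: one boolean per test record, True when the
        thresholded model predicts presence at that record.
    predicted_area_fraction: pi, the proportion of the study area that
        the model predicts as present.
    """
    tpr = sum(test_predicted_present) / len(test_predicted_present)
    return tpr * (1.0 - predicted_area_fraction)


# 78 stand-in record identifiers (the number reported for A. mantiqueira).
train, test = split_occurrences(range(78))

# Hypothetical outcome: 20 of the test records are predicted present and
# the model predicts presence over 25% of the study area.
hits = [True] * 20 + [False] * (len(test) - 20)
print(len(train), len(test), d_statistic(hits, 0.25))
```

Because D shrinks as the predicted area grows, a model that predicts presence everywhere scores zero even though it captures every test record.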
All modeling methods described above were applied separately to build a climate-based model for each time period, thus resulting in 695 final consensus maps for each of the two Actinote species: one model every 1 ky between the present and 600 ky BP and one every 2 ky between 602 and 786 ky BP. The glacial and interglacial periods were delimited according to the Marine Isotope Stages (MIS) proposed by Lisiecki & Raymo (2005). All analyses were performed in R 4.0.2 (R Development Core Team 2010).
Results
Genotyping-by-Sequencing Statistics
The number of reads per lane in each run ranged from 348 to 352 million. Phred quality scores were above 32 across all bases, no adapters were found in the sequences, and the proportion of uncalled bases (N) was negligible. After demultiplexing, 96.7% of reads were retained, and 266,933 loci were genotyped with a mean coverage of 89.9X and a standard deviation of 35.8X.