3.3 Origin of medfly infestation based on demographic history
and phylogenetic analysis
In the DIYABC-RF analysis that tested six hypothetical evolutionary
scenarios (Fig. 5), the classification votes (i.e. the number of times
a scenario is selected by the trees of the random forest) for Scenarios
1 to 6 were 599, 411, 123, 467, 267 and 133, respectively. Based on the
classification votes and posterior probabilities, the best fit was
Scenario 1, with a posterior probability of 0.596 and global and local
error rates of 0.499 and 0.404, respectively (Fig. 5). The projection
of the training set and the observed data set onto the linear
discriminant analysis (LDA) axes indicated low power to discriminate
among Scenarios 1, 2 and 4, because the observed data set was located
within the cloud of their simulated data (Fig. S5-C, Supplementary
information). To improve
prediction quality and discriminatory power, we ran a second analysis
restricted to Scenarios 2, 4 and 5 (selected based on the results of the
first test). The classification votes were 778, 1025 and 197, respectively.
The best fit scenario was Scenario 4, with a posterior probability of
0.697 and global and local error rates of 0.246 and 0.303, respectively
(Fig. 5). In this second test, the projection of the observed data set
onto the LDA axes was located within the cloud of Scenario 2, indicating
substantial power to discriminate among the tested scenarios (Fig. S5-C,
Supplementary information). Overall, the demographic colonisation
scenarios suggested a long and interconnected history of invasions of
C. capitata in the studied sites. Both best-fit scenarios predicted that
the Brazilian population diverged from the ancestral South African population.
However, Scenario 1 predicted direct colonisation from Brazil to the
other sampling sites, while Scenario 4 predicted that the
Spain-Guatemala group originated from the admixture between lineages
from South Africa and Brazil. According to these results, the Brazilian
population was established by direct colonisation from South Africa, and
admixture events likely led to the establishment of the remaining
lineages (i.e. Spain-Guatemala and Greece-Australia).
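The vote counts and error rates reported above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not part of the DIYABC-RF workflow itself. The variable and function names (votes_first, votes_second, reported, summarise_votes) are hypothetical, and the assignment of the second set of votes to Scenarios 2, 4 and 5 in that order is assumed from the word "respectively". It illustrates that the share of votes received by a scenario is not its posterior probability, and that each reported local error rate is the complement of the corresponding posterior probability.

```python
# Classification votes as reported in the text, keyed by scenario number.
# The mapping of the second analysis to Scenarios 2, 4 and 5 is an assumption
# based on the order in which the scenarios are listed.
votes_first = {1: 599, 2: 411, 3: 123, 4: 467, 5: 267, 6: 133}
votes_second = {2: 778, 4: 1025, 5: 197}

# Posterior probability and local error rate reported for each best-fit scenario.
reported = {
    "first": {"best": 1, "posterior": 0.596, "local_error": 0.404},
    "second": {"best": 4, "posterior": 0.697, "local_error": 0.303},
}


def summarise_votes(votes):
    """Return (total votes, top-voted scenario, vote share per scenario)."""
    total = sum(votes.values())
    top = max(votes, key=votes.get)
    shares = {scenario: count / total for scenario, count in votes.items()}
    return total, top, shares


for label, votes in (("first", votes_first), ("second", votes_second)):
    total, top, shares = summarise_votes(votes)
    r = reported[label]
    print(f"{label} analysis: {total} votes in total, top-voted Scenario {top}")
    print(f"  vote share of Scenario {top}: {shares[top]:.4f}")
    print(f"  reported posterior probability: {r['posterior']}")
    # The reported posterior probability and local error rate should sum to 1.
    print(f"  posterior + local error: {r['posterior'] + r['local_error']:.3f}")
```

In both analyses the top-voted scenario coincides with the reported best-fit scenario, while its vote share is lower than its posterior probability, which is why the two quantities are reported separately.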
SNAPP recovered a total of 15 consensus tree topologies. Consensus
tree 1 accounted for 37.18% of all sampled trees (Fig. 6), increasing
to 67.04% when consensus trees 2 and 3 were included. The consensus
tree topologies were consistent across the independent runs in which
different individuals were sampled from each location (Fig. 6; Fig. S6,
Supplementary information), indicating that subsampling did not
substantially affect the topology of the SNAPP trees. The species tree
revealed three highly supported lineages (posterior probability, PP=1)
corresponding to South Africa, Brazil and a third lineage comprising all
other regions, whereas two nodes consistently showed moderate support,
corresponding to the divergence of the Greece lineage (PP=0.81) and of
the Guatemala and Australia lineages (PP=0.84). These results are
consistent with the genetic
clusters found in the DAPC and Structure analyses (Fig. 2 and Fig. 3).
Effective population size, inferred from theta estimates and represented
by branch thickness in the consensus tree, was highest in South Africa,
followed by Spain, Brazil and Greece, which showed intermediate
values (Fig. 6).