2.4 Demographic events and population assignment
Genetic diversity indices for each population, including observed and expected heterozygosity, private alleles, nucleotide diversity and inbreeding coefficient, were calculated using populations  in STACKS v2.2 (Rochette et al., 2019). In addition, Tajima’s  was calculated using VCFtools (Danecek et al., 2011) to assess recent historical demographic events. Finally, population differentiation was analysed using pairwise Fst and the statistical significance was calculated under 20,000 permutations in ARLEQUIN v.3.5.2.1.
We assessed the population structure and differentiation using three clustering approaches. Firstly, we used the discriminant analysis of principal components (DAPC) (Jombart, Devillard, & Balloux, 2010) as implemented in the adegenet  package (Jombart, 2008) in R (Team, 2020). The cross-validation xvaldapc  function  was used to determine the optimal number of principal components (PCs) using a training set of 0.6 with 1000 replicates (n.rep = 1000 ). The find.cluster  function  was used to infer the most likely number of clusters with the Akaike Information Criterion (AIC) and K-means of 100 (Jombart et al., 2010). The optimal number of genetic clusters was identified as the lowest AIC value. Secondly, we ran STRUCTURE v2.3.4. (Pritchard, Stephens, & Donnelly, 2000) with 200,000 MCMC iterations following a burn-in of 100,000 iterations, using the admixture ancestry model and no sampling location as a prior, setting a putative K  from 1 to 6 in a total of 15 independent iterations in each run (Falush, Stephens, & Pritchard, 2003). The web version of STRUCTURE HARVESTER (Earl & vonHoldt, 2011) (http://taylor0.biology.ucla.edu/structureHarvester/) was used to infer the most likely K  value based on the Evanno method (Evanno, Regnaut, & Goudet, 2005) and the highest posterior mean log-likelihood (mean LnP(K)). CLUMPP (Jakobsson & Rosenberg, 2007) and DISTRUCT v1.1 (Rosenberg, 2003) were used to summarise the similarity between replicates and visualise the STRUCTURE plots, respectively. Finally, the fineRADstructure  package (Malinsky, Trucchi, Lawson, & Falush, 2018) was run to understand the recent shared ancestry at high resolution derived from haplotype linkage information among all the samples. The database used contained 13,149 SNPs obtained after running populations  on the set of 1,907 neutral SNPs deselecting the -write-single-snps  option; this allowed for the inclusion of linked SNPs in the different RAD-tags. This dataset was reordered first according to linkage disequilibrium using the script sampleLD.R, fineRADstructure  was run using the default settings. The results were graphically interpreted using the Finestructure  package (Lawson, Hellenthal, Myers, & Falush, 2012) and the fineRADstructurePlot.R  script (Malinsky et al., 2018).
2.5 Demographic history scenarios testing and species tree
To investigate which specific demographic history hypotheses were observed to be most compatible with genetic structure, we applied a Bayesian approximate computation approach with supervised machine learning as implemented in DIYABC Random Forest (DIYABC-RF) v 1.0 (Collin et al., 2021). Since the method is computational demanding, the scenarios were kept simple and easily distinguishable from each other (Cabrera & Palsboll, 2017). To save calculation time, we grouped individuals into four groups (1) South Africa, (2) Brazil, (3) Spain and Guatemala (Spain-Guatemala), and (4) Greece and Australia (Greece-Australia) according to genetic groups defined by DAPC and STRUCTURE analyses (see Results, Fig. 3). The modelling strategy was focused on testing different phylogeographic scenarios reflecting hypotheses of colonisation routes from South Africa to Brazil or to the cluster Spain-Guatemala based on the results in Fig. 6 and the historical records (Fig. 1). Main phylogeographic hypotheses are explained graphically in Fig. S1A-B, Supporting information. These scenarios were analysed individually, after selecting the best hypothetical evolutionary scenario, we performed a second analysis only for the three scenarios with the highest classification votes (Fig. S1B, Supporting information).
In accordance with DIYABC’s requirements, those SNPs missing in all individuals in one population need to be excluded, which reduced the dataset to 312 SNPs across the 92 individuals. Uniform priors were used as described in Karsten et al. (2015). Effective population sizes (Ne ) were allowed to range between 10 to 100,000; the priors for generation times between divergence events were set from 10 to 10,000 and admixture rate ranged from 0.001 to 0.999. A total of 20,000 data sets were simulated for each scenario and all simulated data sets were used in each Random Forest training set. We used five noise variables and ran 2,000 random trees to select the most likely phylogeographic scenario.
The species tree was inferred using SNAPP (Bryant, Bouckaert, Felsenstein, Rosenberg, & RoyChoudhury, 2012), a coalescent-based method analysis that uses SNP markers directly to compute the likelihood of the species tree under a finite-sites mutation model over all possible gene trees. SNAPP is implemented in BEAST v2.6.2 (Bouckaert et al., 2014) and estimates the probability of allele frequency change across ancestor and descendent nodes, giving as results a posterior distribution for the species tree (Bryant et al., 2012).
The putative neutral dataset was additionally filtered for non-polymorphic loci following the model assumptions. Running SNAPP comes at a high computational cost; therefore, we pruned our dataset to include five randomly selected individuals from each population (N=30). The mutation rates ( and v ) were calculated and ran for at least one million generations per replicate with sampling every 1,000 generations. We conducted three independent runs and evaluated convergence in Tracer v1.7 (Rambaut, Drummond, Xie, Baele, & Suchard, 2018). The following analyses were carried out with the runs whose parameters presented ESS > 200. We removed 10% of trees as burn-in and merged tree and log files from the different runs using LOGCOMBINER v 2.4.1. Then, TREEANNOTATOR v2.6.2 was used to obtain maximum credibility trees and the trees that were contained in the 95% highest posterior density (HDP). To visualise the posterior distribution of trees we used DENSITREE v2.2.7 (Bouckaert, 2010). This process was repeated further by randomly resampling different individuals than those used for the first run (with the exception of sampling sites from Australia and Guatemala with < 10 individuals) to determine the subsampling effect on the analysis.
2.6 Microbiome analysis
Sixty-three different individuals collected from the populations used in the RADseq analyses were used to study their microbial structure and community composition. Individuals from all populations were used, except for Greece and Guatemala. In addition, we included new samples from Colombia (CO) and Israel (IS) in this analysis, resulting in a total of six sampling sites collected in nine different host plants (Figure 1 and details in Supplementary Table 1). DNA was extracted following the protocol mentioned above, and the V4 region of the 16S rRNA gene was amplified in duplicates using the primers 515F-806R (Caporaso et al., 2011) in one step PCR (95°C 5min, 25 cycles x [95°C 20s, 55°C 20s, 72°C 30s], 72°C 5min). The resulting amplicons were pooled per sample and cleaned (AMPure XP magnetic beads, Beckman Coulter, Indianapolis, USA) and PCR indexing was carried out using a Nextera XT DNA Library Preparation Kit (Illumina Inc., San Diego, US-CA). The paired-end sequencing was performed at the Molecular Core Labs (Sequencing Facility) of The Natural History Museum on a complete flow cell of the Illumina MiSeq platform using v3 chemistry (2 x 300 bp).
Sequence data were processed in Mothur v1.41.3 by adapting the MiSeq SOP protocol (Kozich, Westcott, Baxter, Highlander, & Schloss, 2013; Schloss et al., 2009). Following quality filtering, we merged forward and reverse reads. Primers were removed, ambiguous bases were discarded, and sequences with over 15 homopolymers were excluded from further analysis. Unique sequences were aligned against the Silva Seed v132 reference database (Quast et al., 2013), which was reduced and customized to the V4 region to avoid errors and improve alignment quality. The maximum length of the sequences was set to 275 bp, and those that did not align well were removed. Duplicate sequences were merged, and denoising of unique sequences was performed with the commandpre.cluster with an allowance of 1 difference per 100 bp (Callahan et al., 2016) to infer Amplicon Sequence Variants (ASVs).  Singletons that did not cluster were removed at this stage with the command split.abund. Chimaeras were removed using UCHIME and the Silva Gold database (Edgar, Haas, Clemente, Quince, & Knight, 2011). ASVs were taxonomically classified using the Silva NR v132 reference database with the command classify.seqs , and unwanted lineages, such as Archaea and Eukaryota, were removed with the commandremove.lineage .
ASVs were summarised at the phylum, family, and genus taxonomic levels for further analysis. All statistical analyses were computed in R (Team, 2020) using the packages phyloseq, microbiome, ape  and vegan . The number of ASVs shared in groups between different countries were visualized with Venn diagrams produced through the Van de Peer lab website (http://bioinformatics.psb.ugent.be/webtools/Venn/). Alpha diversity (within samples and communities) was calculated with the species richness (total number of ASVs) estimator, and the Shannon (Shannon, 1948) and Inverse Simpson’s (Simpson, 1949) indexes. In addition, the Bray-Curtis distance metric was used to quantify the dissimilarity of community composition between samples in the different sampling localities, and permutational multivariate analysis of variance (PERMANOVA) was used on the distance matrix to test for statistical differences between groups with the function adonis()  from the vegan package. Group homogeneity of dispersion was evaluated with the function betadisper() from the vegan package (Anderson et al., 2006). Pairwise comparisons were tested with the pairwise.adoniswrapper function by Martinez Arbizu (2020), and p-values were adjusted using Bonferroni corrections. We also performed analysis of similarities (ANOSIM) tests, with the Bray-Curtis distance metric, to evaluate whether any differences in microbiome composition between groups were attributed to geographical distance or medfly diet. ANOSIM results in a ratio of between to within groups variation, with an R value closer to 1 suggesting dissimilarity between groups and values closer to 0 suggesting even distribution between and within groups (Clarke, 1993). For these tests we used the anosim() function from the vegan package.
Differential abundance analyses were performed on the most abundant ASVs to assess differences in the microbiome composition across sample sites with the package edgeR  (Robinson, McCarthy, & Smyth, 2010) implemented in R (Team, 2020). First, ASVs with relative abundance ≥ 0.01% were chosen, and data normalisation was implemented with function calcNormFactors() . Then, an exact binomial test was performed to generalise over dispersed counts with the function exactTest()  (Xia, Sun, & Chen, 2018). Next, the top differentially abundant ASVs were tabulated with function topTags() and the false discovery rate (FDR) was estimated by adjusting the p-values  with the Benjamini-Hochberg correction method (Benjamini & Hochberg, 1995); ASVs with an FDR < 0.01 were considered significant and were chosen to visualize differential abundance in the microbiome composition with the package ggplot2   (Wickham, 2016; Xia et al., 2018).