4.1 Population genomics of Ceratitis capitata
The present study is the first to combine reduced-representation genome sequencing with microbial characterisation in the medfly C. capitata, using samples from key geographic locations to investigate the species’ population history and microbiome on a global scale. We find strong evidence for two genetic clusters, one corresponding to the South African individuals and the other to the localities in the introduced range, in agreement with virtually all previous studies using allozymes (Gasperi et al., 2002; Gasperi et al., 1991; Kourti, 2004; Malacrida et al., 1992), mitochondrial DNA markers (Arias et al., 2018; Elfékih et al., 2010; Elfékih, Makni, & Haymer, 2013; Karsten, van Vuuren, Barnaud, & Terblanche, 2013; Ruiz-Arce et al., 2020), and microsatellites (Bonizzoni et al., 2004; M. Bonizzoni et al., 2001; Deschepper et al., 2021; Karsten et al., 2015; Nikolouli et al., 2020). In addition, when we analysed the five sampling sites from the introduced range separately (i.e., after removing the South African samples), populations from Brazil formed a unique genetic cluster that had not been recognised in previous studies. The Brazil cluster was also characterised by a distinct microbiome and the highest overall bacterial diversity.
Karsten et al. (2013, 2015) observed high genetic diversity in South African medflies due to a large number of alleles present at low frequency, including many private alleles, which led them to suggest that this population was ancestral and had maintained a large population size over time. Other studies have shown that populations derived from the African lineage exhibit a gradual decrease in genetic variation (Malacrida et al., 2007; Deschepper et al., 2021 and references therein), first towards the Mediterranean basin populations and then towards the American populations, thus dividing the colonisation process of the medfly into three main categories: ancestral populations (sub-Saharan Africa), ancient populations (Mediterranean basin) and recent populations (America) (Gasperi et al., 2002; Malacrida et al., 1998). Our results showed a different pattern of genetic variation: across the introduced range, all sampled locations except Brazil belonged to the same large genetic cluster, indicating gene flow among these locations.
Gasperi et al. (2002), using allozymes, found levels of genetic variability in South American populations (i.e., Argentina, Brazil and Peru) similar to those of ancestral African populations, and they stated that these populations had not had enough time to reach equilibrium and differentiate further. Furthermore, Nikolouli et al. (2020) described a discrete genetic cluster in some South American populations (Argentina, Brazil and Bolivia) using microsatellites; however, they stressed that this genetic cluster was not clearly distinct from other medfly populations worldwide. Our study identified high genetic diversity and a genetically distinct cluster in medflies collected in Brazil. These findings suggest that some South American populations might be derived from different genetic sources.
The combination of population structure and ABC analyses with supervised machine learning allowed us to reconstruct the most probable evolutionary scenario of C. capitata. It is important to note that this study covers only a portion of the total geographical distribution of the medfly. Nevertheless, our limited data set supported an initial divergence from ancestral South African populations that gave rise to populations in Brazil at a different time than those in the rest of the world. These findings are partially congruent with historical records of medfly distributions (Malacrida et al., 1998); this medfly colonisation route may have occurred through the transatlantic trade in enslaved people, as has been described for D. melanogaster, which also originated in the Afrotropical region (David & Capy, 1988). Furthermore, the medfly museum collections at the Natural History Museum of London provided historical records that support this suggested new colonisation route. The oldest record we found in the collection was dated 1904, with specimens collected on the tropical island of Saint Helena (Fig. 1). This island, a UK overseas territory located in the South Atlantic Ocean, midway between Africa and South America, was an important port during the crown colony period and an obligatory stop for the transatlantic trade in the colonial period. The human population of Saint Helena carries genomic traces linking it with Central-West African populations that were moved during the slavery years (Sandoval-Velasco et al., 2019). This colonisation route for the medfly, different from the one connecting the Mediterranean area and South America, had been mentioned by Gasperi et al. (2002) and Ruiz-Arce et al. (2020) but has never been tested at the genetic level.
Our results revise the previous view that only Mediterranean basin medfly populations contributed to the colonisation of South America, as described in previous publications (Deschepper et al., 2021; Malacrida et al., 1998; Malacrida et al., 2007), and point to new potential ancestry sources for the genetic units in the South American populations that need further investigation.