Results
Ten individuals of Arabian gray mangrove (A. marina var.marina ) from each of 19 Arabian Peninsula populations plus an
eastern Australian population to be used as an outgroup (A.
marina var. australasica ) were sampled for resequencing, for a
total of 200 sequenced individuals (Fig. 1; Table 1; Table S1,
Supplementary Information). Geographic coordinates of each sampled tree
were recorded to enable genotype-environment analysis using
georeferenced environmental data. SNP calling was carried out with the
GATK Genome Analysis Toolkit (GATK; 28 ) to produce a set of
genome-wide SNPs for downstream genetic analyses. The genome coverage
per individual was 16.8X on average, ranging from 6.1X to 27.1X, after
quality and missing data filtering.
Population structure and genetic diversity analyses The study aimed at evaluating how the array of spatial settings and
temporally volatile geographic barriers of the Arabian Peninsula shaped
the diversity of Arabian mangrove populations. To explore patterns of
population structure, we produced a set of independent, selectively
neutral SNPs after applying quality and missing data filters. A
t-distributed stochastic neighbor embedding (t-SNE) analysis revealed
marked levels of population structure, highly congruent with the
geographic distribution of the sampled populations. Populations from the
Red Sea and the PAG clustered apart, while those from the Arabian Sea
occupied a central, more scattered position, relatively closer to the
PAG cluster yet differentiated from it (Fig. 2A). Structure was also
observable within each of the water bodies defining the three
biogeographic regions, with little to no overlapping among the sampling
locations, revealing considerable differentiation at the population
level (Fig. 2A).
We also examined population divergence in Arabian mangroves using a
sparse non-negative matrix factorization method (SNMF; 29 ), which
recovered variability patterns consistent with the t-SNE results (Fig.
2B). At K = 2, populations clustered into two groups separating the Red
Sea from the Arabian Sea, the Sea of Oman and the PAG. The plot for K =
3 also revealed the more isolated, geographically adjacent populations
from the southern Arabian Sea (Salalah and Taqah) as a differentiated
group, with populations along the northeast Oman coast and into the PAG
showing decreasing degrees of shared ancestry. At K = 4, populations
from the northern basin of the PAG (Bahrain and Dammam) appeared as a
differentiated cluster. At K = 5, populations from northern Red Sea
(Duba, Al Wajh 1 and Al Wajh 2) also appeared as a genetically
differentiated cluster, yet showing a signal of coancestry decreasing
with latitude with proximal populations from the central Red Sea (Al
Kharrar and King Abdullah Economic City, hereafter KAEC) and to a lesser
extent, Al Lith. In the plot for K = 6, populations from the Sea of
Oman, Qurm and Shinas, clustered together with Filim, yet Shinas showed
intermediate levels of shared ancestry with the genetic cluster of the
southern PAG basin, which included Ras Al Khaimah (hereafter RAK), Umm
Al Quawain (hereafter UAQ), Ras Ghurab and Suweihat. At K = 7,
populations from southern Red Sea (Al Lith, Farasan Banks 1 and Farasan
Banks 2; hereafter FB1 and FB2) clustered apart from the central Red Sea
populations, yet Al Lith presented intermediate levels of coancestry
with the central Red Sea population KAEC.
Pairwise F ST and Nei’s genetic distances based on
neutral loci were generally congruent with the t-SNA and SNMF results,
showing high correlation with the geographic distribution of the
mangrove populations. Both indices showed the highest differentiation
scores when comparing populations from the Red Sea to those of the PAG.
Within-region F ST comparisons were higher for the
Red Sea and Nei’s were higher for the PAG, suggesting divergent
population histories (Table S2, Supplementary Material). Heterozygosity
and nucleotide diversity indices revealed overall similar levels of
genetic variability, with more admixed populations at Al Kharrar, KAEC,
Filim, Qurm and Shinas (see SNMF results above) generally showing the
highest scores (Table 1).
Phylogenetic analysis To explore how evolutionary relationships fitted a scenario of
recolonization after the last glacial maximum (LGM) versus isolation in
putative glacial refugia in the Red Sea and the PAG, we conducted a
phylogenetic analysis based on resequencing data. A maximum likelihood
phylogenetic reconstruction was consistent with the t-SNE and SNMF
analyses, showing marked differentiation between and within
biogeographic regions, with almost all the populations sampled resolving
in monophyletic clades with high node support. Arabian mangrove
populations grouped in three major, reciprocally monophyletic clades or
phylogenetic lineages: the Red Sea lineage (Major Clade I), the southern
Arabian Sea lineage (Major Clade II) and the PAG plus the Sea of Oman
lineage (Major Clade III) (Fig. 3). Within the Red Sea’s Major Clade I,
the phylogenetic analysis revealed a sequence of cladogenetic events
consistent with the geographic distribution of the sampled populations,
with the first split separating northern (Duba, Al Wajh 1 and Al Wajh 2)
and central populations (Al Kharrar and KAEC) from those of the south
(Al Lith, FB1 and FB2); adjacent Al Kharrar and KAEC populations from
central Red Sea showed little differentiation and grouped into a single
clade. The Arabian Sea’s Major Clade II encompassed the populations
Salalah, Taqah and Filim, all occurring in southern Oman, and appeared
as the sister group of the Major Clade III, which includes all Sea of
Oman and PAG populations. Major Clade III was further divided into two
clades separating the populations of the northern basin of the PAG
(Bahrain and Dammam) from those of the southern basin (RAK, UAQ, Ras
Ghurab and Suweihat) and the Sea of Oman (Qurm and Shinas). Populations
of these two latter subregions, which occur on each side of the Strait
of Hormuz, appeared as reciprocally monophyletic sister groups. The
Brisbane population belonging to the Australian variety of the gray
mangrove (A. marina var. australasica ) was used as
outgroup (Fig. 3).
Population and demographic history of the Arabian gray mangrove We performed model comparisons under the likelihood framework developed
in fastSIMCOAL2 (30 ) to test the competing hypotheses of
colonization after the LGM versus potential isolation in glacial refugia
in the Red Sea and the PAG. Informed by the previously inferred
phylogenetic relationships and using three populations from each
biogeographic region as lineage representatives, two sets of models were
independently compared for the Red Sea and the PAG plus the Sea of Oman
lineages (Fig. S1, Supplementary Material). The Brisbane population of
Australian mangroves was included as outgroup and to calibrate times of
divergence (26, 31 ). For the Red Sea, the model with the greatest
likelihood score (ΔAIC = 1,707.58; Table S3, Fig. S1, Supplementary
Information) revealed two cladogenetic events and splitting times dating
back to 89,950 years for Duba and Al Kharrar (95% CI = [55,910 -
221,093]); and 166,975 years for FB2 (95% CI = [94,333 -
290,121]), consistent with a process of lineage differentiation in
geographic isolation during glacial periods of low marine connectivity.
No gene exchange among extant lineages was detected (MIG = 0, 95% CI =
[0 – 0]). Potential gene flow was detected among ancestral
lineages, yet the 95% confidence intervals varied by orders of
magnitude and included the zero (MIG anc. pop. 1 → FB2 = 36, 95% CI =
[0 – 78]; MIG FB2 → anc. pop. 1 = 0, 95% CI = [0 - 5,979]
migrants per generation), revealing little impact of these parameters in
the SFS (Table 2, Fig. 4). In the case of the PAG, the model that better
fitted our data (ΔAIC = 7,769.98; Table S3, Fig. S1, Supplementary
Information) revealed a scenario of simultaneous cladogenesis dating
back 32,725 years (95% CI = [19,411-92,498]), prior to the LGM.
Signs of some gene flow were detected among PAG lineages (MIG Dammam →
Ras Ghurab = 13, 95% CI = [0 – 15]; MIG Ras Ghurab → Dammam = 0,
95% CI = [0 – 9]; MIG Ras Ghurab → Shinas = 16, 95% CI = [14 –
30]; MIG Shinas → Ras Ghurab = 0, 95% CI = [0 – 7] migrants per
generation). While narrower than in the Red Sea estimates, the ranges
also included zero in most cases, reducing the certainty in the model
(Table 2, Fig. 4).
We used TreeMix (32 ) to further explore patterns of historical
gene flow among mangrove populations within the Red Sea and the PAG,
separately. TreeMix generates a maximum likelihood tree based on allele
frequencies and then estimates migration events among branches to
account for excess of covariance. The Salalah population from the
Arabian Sea was used to root the trees in both models. In the case of
the Red Sea, a tree model with no migration events accounted for 99.9%
of variation in the SNP dataset, revealing no signs of gene flow among
Red Sea populations (32 ) (Fig, 5A), congruently with the
fastSIMCOAL2 analysis. In the PAG plus the Sea of Oman, a tree model
with three migration events between ancestral (Dammam + Bahrain → Ras
Ghurab + Suweihat) and terminal branches (Shinas → RAK; Suweihat →
Dammam) explained up to 99.8% of the SNP variance, revealing signs of
both historical and contemporary gene flow among mangrove populations in
the northern and southern basins of PAG, and between the PAG and Sea of
Oman populations (Fig. 5B).
To examine changes in the effective population size of the Arabian gray
mangroves, we used the MSMC2 program (33 ) for a reduced set of
populations representative of each one of the marine biogeographic
regions of the peninsula. The results showed similar patterns for all
lineages, revealing an ancestral demographic expansion two to five
million years ago, followed by a gradual decrease of the population
sizes, slightly more pronounced in the lineages from the Red Sea. The
analysis also revealed a moderate demographic recovery 20,000 to 10,000
years ago for all lineages, overlapping with the end of the Last Glacial
Maximum and rising sea levels (Fig. S2, Supplementary Information).
Adaptive divergence in Arabian gray mangroves We conducted a genotype-environment association analysis to identify
candidate loci potentially involved in differential selection among
Arabian gray mangrove populations. We implemented a redundancy analysis
(RDA) based on population allele frequencies and georeferenced
environmental data approximating maximum air temperature of the warmest
month, minimum air temperature of the coldest month, maximum of sea
surface salinity (averaged over months) and annual precipitation as
explanatory variables. The environmental predictors explained 18.3% of
the total genetic variance (Adjusted R2 = 0.183,
p-value < 0.001). Per population RDA scores revealed distinct
association patterns between biogeographic regions. Red Sea mangroves
showed high, negative loading values on RDA axis (RD) 1, while the
remaining populations correlated positively with this axis, and heavily
in the case of the Sea of Oman and PAG southern basin. In contrast, the
Arabian Sea populations showed high negative association values with
RD2, while Red Sea and PAG mangroves showed positive association scores
along this axis, yet the former with much lower correlation values
(Table S4, Supplementary Information). The analysis revealed 668 genetic
variants showing significant associations with one or more of the four
environmental predictors (α = 1x10-4; Fig. S3, Supp.
Inf.). Of the 668 outliers, 79 were located within up to 51 functionally
annotated genes, from which at least 11 (21.6%) presented functions
potentially relevant for adaptation to the extreme and variable habitats
of the seas around the Arabian Peninsula
(See Supplementary Information
for details and a comprehensive list of candidate genes with reported
functions).
To visually explore how adaptive variability is structured at the
individual level, we conducted a complementary redundancy analysis based
on individual genotypes of the 668 candidate loci alone. The plot of the
first two RDA axes revealed distinctive associations patterns:
Populations from the northern Red Sea (Duba, Al Wajh 1 and Al Wajh 2),
the central Red Sea (Al Kharrar and KAEC) and the southern Red Sea (Al
Lith, FB1 and FB2) differentiated along a gradient loading heavily on
minimum temperature of the coldest month and annual precipitation.
Populations from the Arabian Sea (Taqah and Salalah) and Qurm, the
proximal population from the Sea of Oman, presented similar
environment-genotype association patterns, correlating negatively with
maximum sea surface salinity and to a lesser extent, with maximum
temperature of the warmest month. PAG populations from the northern
(Dammam and Bahrain) and southern (Suweihat, Ras Ghurab, RAK and UAQ)
basins clustered together and with northern Red Sea populations, loading
negatively on minimum temperature of the coldest month and positively on
maximum sea surface salinity. The population of Shinas in the Sea of
Oman also presented similar correlation patterns to the PAG/northern Red
Sea cluster, yet showed weaker association with maximum sea surface
salinity and loaded negatively on maximum temperature of the warmest
month (Fig. 6A). Individual Euclidean distances averaged over
populations in the RDA ordination space showed general similar trends
(Fig. 6B), with within-region, highest differences among association
patterns occurring along the Red Sea and outrunning the Northern Red Sea
versus PAG scores. Populations FB1 and especially FB2 showed the most
dissimilar association scores of all pairwise population comparisons of
the RDA based on candidate loci (Fig. 6B).