Revealing the Demographic History of the European nightjar
(Caprimulgus europaeus ).
Abstract
A species’ demographic history provides important context to
contemporary population genetics and a possible insight into past
responses to climate change. An individual’s genome provides a window
into the evolutionary history of contemporary populations. Pairwise
sequentially Markovian coalescent (PSMC) analysis uses information from
a single genome to derive fluctuations in effective population size
change over the last ~5 million years. Here we apply
PSMC analysis to two European nightjar (Caprimulgus europaeus )
genomes, sampled in Northwest and Southern Europe, with the aim of
revealing the demographic history of nightjar in Europe. We successfully
reconstructed effective population size over the last 5 million years
for two contemporary nightjar populations. Our analysis shows that
nightjar are responsive to global climate change, with effective
population size broadly increasing under stable warm periods and
decreasing during cooler spans and prolonged glacial periods. PSMC
analysis on the pseudo-diploid combination of the two genomes revealed
fluctuations in gene flow between the populations over time, with gene
flow ceasing by the last-glacial maximum. This pattern of
differentiation is in line with the species utilising different refugia
during glacial maxima. We suggest that nightjar in Europe may show
latitudinal (East-West) genetic structuring as a result of reduced gene
flow between different glacial refugia. Finally, our results suggest
that migratory behaviour in nightjar likely evolved prior to the
last-glacial maximum, with long-distance migration seemingly persisting
throughout the Pleistocene. However, further genetic structure analysis
of nightjar from known breeding sites across the species’ contemporary
range is needed to fully understand the extent and origins of range-wide
differentiation in the species.
Keywords; PSMC, conservation genomics, refugia, Caprimulgus
europaeus , demography
Introduction
Technological and methodological advancements have led to genomics
playing an increasingly important role in conservation biology
(Allendorf, 2017).
Genomes provide a repository from which information on historic changes
in genetic diversity, effective population size
(N e), speciation, and population structuring can
be inferred and used to track adaptations to environmental change
(Mather et al., 2019;
Patil and Vijay, 2021). Specifically, sequence data from a single
aligned genome can be used to track historic demographic patterns
exhibited by a species or population
(Li and Durbin,
2011). Pairwise sequentially Markovian coalescent (PSMC) analysis is a
powerful tool which infers ancestral changes inN e from a single genome. The analysis applies
hidden-Markov modelling to the coalescence framework, treating a genome
as multiple historic genealogies partitioned by recombination events
(see Li and Durbin,
2011; Mather et al., 2019 for detailed explanation of the method). PSMC
analysis has been used to determine ancestral (up to ~ 5
Mya) population trends and track time scales of species and population
divergences (e.g. Ficedula flycatchers;
Nadachowska-Brzyska et
al., 2016, Catharus thrushes;
Termignoni-Garcia et
al., 2022), as well as shedding light on periods of gene flow among
otherwise genetically structured populations
(Sato et al., 2020).
PSMC analysis can be applied to pseudo-diploid genomes constructed from
two individuals from different populations to investigate changes in
gene flow and timing of divergence
(Li and Durbin, 2011;
Prado-Martinez et al., 2013; Sato et al., 2020). For example, PSMC
applied to pseudo-diploid genomes from three Golden eagle (Aquila
chrysaetos ) subspecies revealed the timing of divergence and gene flow
change among the subspecies over a time scale of ~11
million years (Sato et
al., 2020). When combined with geological and paleoclimate data, PSMC
analysis can reflect a species’ past ability to adapt to environmental
change, and how different populations, or species, have been affected by
broad climate trends
(Nadachowska-Brzyska
et al., 2015; Mather et al., 2019). Understanding a species’ response
to past environmental change allows for predictions to be made regarding
vulnerability to contemporary and future climate change and how this may
vary interspecifically under different life histories
(Kozma et al., 2018,
2016; Chattopadhyay et al., 2019), or between populations at different
locations across a species range
(Sato et al., 2020).
Over the past ~5 million years the global climate has
fluctuated dramatically, oscillating between periods of extensive
glaciation and interglacial warming. Long glacial and short interglacial
periods during the mid-Pleistocene revolution (MPR; ~1
Mya - 450 Kya) resulted in cooler interglacial temperatures than those
presently recorded
(Pisias and Moore,
1981). However, throughout the Mid-Brunhes Event (MBE;
~450-110 Kya), interglaciations were characterised by
warmer temperatures, with comparatively less severe glacial periods
compared with the mid-Pleistocene
(Candy et al., 2010;
Barth et al., 2018). During the last glacial maxima (LGM;
~110 Kya) the Fennoscandian ice sheet covered much of
Western and North Western Europe, restricting temperate zones to
contemporary Southern Eurasia
(Denton and Hughes,
1981). These significant shifts in global climate have been shown to
correspond with fluctuations in historic population size in a number of
species
(Nadachowska-Brzyska
et al., 2015; Kozma et al., 2016, 2018). Over periods of cooling,
temperate Western-Palearctic species will have likely been restricted to
southern refugia in Europe (Iberia, Apennines, and Balkans;
Hewitt, 1999; but see
Thorup et al., 2021).
Restriction to different glacial refugia and subsequent northward
expansion during interglacial periods have been linked to contemporary
population structure and subspecies divergence in multiple species,
including birds and aerial insects (e.g.
Schmitt, 2007; Hansson
et al., 2008; Nadachowska-Brzyska et al., 2016; de Greef et al., 2022).
Occupation of separate glacial refugia by different populations is
thought to have driven spatial patterns of genetic differentiation in
temperate species, with many Palearctic birds exhibiting contemporary
East - West patterns in genetic structure and speciation (e.g.
Hansson et al., 2008;
Lombardo et al., 2022; Väli et al., 2022).
The European nightjar (Caprimulgus europaeus ), henceforth
nightjar, is a long-distance migratory bird with a temperate breeding
distribution ranging from Northwest Europe through to East Asia
(BirdLife
International, 2022). Nightjar likely originated from the Afrotropics
(Han et al., 2010),
with the most closely related extant species being an Afrotropic
resident (Rufous-cheeked nightjar Caprimulgus rufigena )
(Han et al., 2010).
Nightjar are composed of six subspecies (C.e. europaeus ,meridionalis , sarudnyi , unwini , plumipes ,dementievi ) broadly following an East-West clinal distribution
(Cleere, 1998; Cleere
et al., 2021), although mtDNA analysis has found little association
between genetic variation and current subspecies classifications
(Schweizer et al.,
2020). Based on mtDNA analysis,
Larsen et al., (2007)
proposed that nightjar migratory behaviour evolved at the end of the LGM
with glacial expansion and cooling temperatures likely restricting the
availability of temperate breeding habitat in Eurasia. However, it seems
unlikely that nightjar migratory behaviour developed post-LGM; with
evidence of a range-wide phylogenetic divergence in nightjar between
Eastern and Western clades, likely originating in ~2.5
Mya, predating the LGM
(Schweizer et al.,
2020). Suitable breeding habitat would have been available across
temperate Eurasia throughout warm interglacial periods, with breeding
birds able to seek refuge in Mediterranean or North Africa during
glacial periods (Ponti
et al., 2020). If nightjars exhibited an Afro-European migration
strategy pre-LGM, paleoclimatic-driven periods of dramaticN e change should be evident from PSMC analysis,
following population expansion and contraction from refugia during
global warming and cooling, respectively.
Nightjar have been subject to population decline across the NW of their
range (Conway et al.,
2007; Langston et al., 2007). Although current population trends are
not a cause for concern (IUCN: Least Concern;
BirdLife
International, 2022), nightjar migratory behaviour and habitat
specialisation leave them sensitive to environmental change as seen in
other taxa ((Case et
al., 2015; Bairlein, 2016). Its ancestral demographic history may leave
nightjar vulnerable to contemporary and future environmental change, if,
for example, populations have been subject to bottlenecks resulting in
genetic variation depletion
(Bürger and Lynch,
1995; Frankham et al., 2010; Nadachowska-Brzyska et al., 2015; Hohenlohe
et al., 2021). The reference genome for the European nightjar was
sequenced and assembled in 2021 from a bird captured in Southern Europe
during the spring migration period
(Secomandi et al.,
2021). Here we use this published genome alongside a novel Pacbio HIFI
sequence, sequenced by us, sampled from a population from the extreme
North-western range limit in the UK. We apply PSMC analysis to determine
the ancestral demography of nightjar from two contemporary populations
in Europe to estimate the historic N e change over
time from 10 Kya to 5 Mya. Specifically, we aimed to;
- Investigate historic N e trends of two
contemporary nightjar populations relative to past climate
fluctuations. We hypothesised that, in both populations, decreases inN e will have followed periods of global cooling
(i.e LGM (110–10 Kya)), as well as the rapid global climate
oscillations of the MPR (~1.2 Mya - 450 Kya). We
predict that N e increased following periods of
warming (i.e the warmer interglacial periods during the MBE (430–110
Kya));
- Compare N e trends from the two contemporary
populations and a pseudo-diploid combined genome to examine divergence
between the two populations, addressing the hypothesis that the two
populations will have diverged during the LGM.
- Examine temporal N e patterns to investigate the
evolution of migratory behaviour in nightjar. We hypothesise that
nightjar will have exhibited fluctuating N e and
population divergence before the LGM, suggestive of long distance
migratory behaviour arising prior to the LGM.
Methods
Sampling Genetic Material, Extraction & Sequencing
A female nightjar from a breeding population in the East of England
(latitude: 53.531, longitude: -0.953) was used to extract DNA for
sequencing (population henceforth referred to as NW Europe or NWE). The
bird died on 7th August 2019, so was assumed to have been part of the
breeding population and not moving through on migration, with spring and
autumn migration for nightjar occurring between April - June and late
August and September respectively. High molecular weight DNA was
extracted from a blood clot in the individual’s heart using a modified
version of the phenol-chloroform protocol outlined by
Sambrook et al.,
(1989). Full extraction protocol details can be found in the supporting
information. The high molecular weight DNA was then sent to the Centre
for Genomics Research facility at the University of Liverpool for PacBio
HiFi sequencing library preparation. The reference genome (for assembly
details see Secomandi
et al., 2021) and 10x Genomics Illumina sequence reads were sequenced
from a single female nightjar captured in South West Italy in spring
2021 (latitude: 40.794, longitude: 13.427) provided by
Secomandi et al.,
2021) (population henceforth referred to as South Europe or SE).
Genome Alignment
Minimap (minimap2 v. 2.18-r101; Li, 2018) and BWA mem (arXiv:1303.3997v1
[q-bio.GN]; Li, 2013) software were used to align the reads from the
NWE (HIFI reads) and SE (10 x Illumina reads) nightjars to the reference
genome, respectively. Unmapped reads were then filtered from both files
leaving only mapped reads.
PSMC Analysis
To understand ancestral changes in N e a
Pairwise Sequential Markovian Coalescent (PSMC) method was applied to
the mapped bam files from the HIFI and 10x Illumina reads, for which the
average coverage was 30.5x and 88.1x respectively. First, consensus
sequences were generated from the aligned indexed bam files from the
HIFI and 10x reads using SAMtools mpileup command and vcfutils.pl as per
Li and Durbin, (2011)
(https://github.com/lh3/psmc). Consensus files were generated for each
chromosome independently before being combined. For the Hifi data, from
the NWE genome, four chromosomes (chromosome numbers; 3, 5, 25, and 32)
failed to produce consensus files and reduced representations for two of
the four chromosomes (3 and 5) were used, with two chromosomes (25 and
32) excluded from the analysis. This resulted in a loss of only
~1% of genomic material for analysis. Sex chromosomes
were also excluded from the analysis for both HIFI and 10x genomes. This
resulted in 89.8% of the NWE genome and 90.8% of the SE genome being
retained for downstream analysis. Consensus files were then filtered for
read depth and quality. In order to reduce the effects of low coverage
and collapsed regions, consensus files were filtered by excluding reads
~ < ⅓ and >2x mean depth. This
resulted in removing reads <10x and >60x for the
Hifi data and <30 and >120x for the 10x reads,
respectively. Finally, filtering for base quality scores of
<20 for the HIFI reads and 10x reads were applied.
The PSMC analysis was then run on the combined consensus .fastq files
using the PSMC software package
(Li and Durbin, 2011;
https://github.com/lh3/psmc). PSMC parameters used by
Nadachowska-Brzyska et
al., (2015) for demographic analysis of 38 different bird species were
chosen for our analysis, where “-N” (30) is the number of iterations,
“-t” (5) is the maximum time to the most recent common ancestor,
“-r” (5) is the initial mutation/ recombination rate (r = θ/ρ)
and “-p” (4+30*2+4+6+10) denotes the distribution of atomic time
intervals. In order to determine variation in PSMC predictions, the data
were bootstrapped 100 times.
PSMC analysis can be applied to pseudo-diploid genomes formed from the
fusing of haploid genomes from two separate populations or species. When
PSMC is applied, deviations in N e trends of the
pseudo-diploid genome from the two parent populations can denote
reductions in gene flow and points of divergence between the two
populations signified by the N e of the
pseudo-diploid genome tending towards infinity (reducing coalescence
events leading to an apparent increase in Ne)
(Li and Durbin, 2011;
Prado-Martinez et al., 2013; Sato et al., 2020). To determine the
timing of divergence between the two sampled populations a
pseudo-diploid genome was created by first generating pseudo-haploid
genomes through randomly sampling heterozygous alleles using Seqtk V1.3
‘randbase’ (-r)
(https://github.com/lh3/seqtk;
accessed Oct 25th, 2022) from both consensus sequence files as generated
above. Pseudo-haploid files were then merged using Seqtk ‘mergefa’ to
produce a single pseudo-diploid genome consensus file. PSMC analysis was
then applied to the pseudo-diploid genome as described above.
Finally, all PSMC results were plotted using gnuplot
(http://www.gnuplot.info/) with
the -R flag applied to export .txt files. In order to plot the PSMC
results, the data must be scaled to real-time by using mutation rate and
generation time (Li
and Durbin, 2011). A generation time of 2 (-g 2) was selected for
nightjars following that used for Chuck-will’s-widow (Antrostomus
carolinensis )
(Nadachowska-Brzyska
et al., 2015), with birds able to breed in their second year
(Cramp and Simmons,
1985). As no species-specific mutation rates were available for
European nightjar, a mutation rate of μ = 4.6 × 10–9was used as per Sato
et al., (2020). The mutation rate was initially estimated for collared
flycatchers (Ficedula albicollis )
(Smeds et al., 2016),
but has since been successfully applied to other passerines
(Ericson et al.,
2022), raptors (Hanna
et al., 2017; Sato et al., 2020) and waterfowl species
(Ericson et al.,
2017).
Results and Discussion
In this study we explored the demographic history of two modern nightjar
populations (NW and S Europe). Using PSMC we found significant
fluctuation in N e in European nightjar over the
last 5 million years, coinciding with major paleoclimatic events
(Fig1a). The timing of initial divergence between the two nightjar
populations was ~1.2Mya (Fig1A), with final divergence
found to coincide with the LGM (~110 Kya) (Fig1B).
Demographic History of European Nightjar
Our analysis suggests that nightjar have experienced significant
fluctuations in N e over the last
~5 million years. Two of the most significantN e changes occurred during the Pleistocene, with
both populations (NWE & SE) increasing throughout the early Pleistocene
to a maximum N e of ~780,000
individuals, before decreasing to ~570,000 individuals
by 600 Kya during the MPR (~1 Mya - 450 Kya; Fig 1a). As
hypothesised, the N e of both populations then
increased throughout the MBE to ~ 1 million individuals
by ~240 Kya (Fig 1a). Both populations then decreased inN e until ~100 Kya (Fig 1a). At
the onset of the LGP, populations exhibited a peak inN e, followed by a steep decline as the LGP
progressed (Fig 1a). The N e of the two
populations then diverged in size (see below).