Revealing the Demographic History of the European nightjar (Caprimulgus europaeus ).
Abstract
A species’ demographic history provides important context to contemporary population genetics and a possible insight into past responses to climate change. An individual’s genome provides a window into the evolutionary history of contemporary populations. Pairwise sequentially Markovian coalescent (PSMC) analysis uses information from a single genome to derive fluctuations in effective population size change over the last ~5 million years. Here we apply PSMC analysis to two European nightjar (Caprimulgus europaeus ) genomes, sampled in Northwest and Southern Europe, with the aim of revealing the demographic history of nightjar in Europe. We successfully reconstructed effective population size over the last 5 million years for two contemporary nightjar populations. Our analysis shows that nightjar are responsive to global climate change, with effective population size broadly increasing under stable warm periods and decreasing during cooler spans and prolonged glacial periods. PSMC analysis on the pseudo-diploid combination of the two genomes revealed fluctuations in gene flow between the populations over time, with gene flow ceasing by the last-glacial maximum. This pattern of differentiation is in line with the species utilising different refugia during glacial maxima. We suggest that nightjar in Europe may show latitudinal (East-West) genetic structuring as a result of reduced gene flow between different glacial refugia. Finally, our results suggest that migratory behaviour in nightjar likely evolved prior to the last-glacial maximum, with long-distance migration seemingly persisting throughout the Pleistocene. However, further genetic structure analysis of nightjar from known breeding sites across the species’ contemporary range is needed to fully understand the extent and origins of range-wide differentiation in the species.
Keywords; PSMC, conservation genomics, refugia, Caprimulgus europaeus , demography
Introduction
Technological and methodological advancements have led to genomics playing an increasingly important role in conservation biology (Allendorf, 2017). Genomes provide a repository from which information on historic changes in genetic diversity, effective population size (N e), speciation, and population structuring can be inferred and used to track adaptations to environmental change (Mather et al., 2019; Patil and Vijay, 2021). Specifically, sequence data from a single aligned genome can be used to track historic demographic patterns exhibited by a species or population (Li and Durbin, 2011). Pairwise sequentially Markovian coalescent (PSMC) analysis is a powerful tool which infers ancestral changes inN e from a single genome. The analysis applies hidden-Markov modelling to the coalescence framework, treating a genome as multiple historic genealogies partitioned by recombination events (see Li and Durbin, 2011; Mather et al., 2019 for detailed explanation of the method). PSMC analysis has been used to determine ancestral (up to ~ 5 Mya) population trends and track time scales of species and population divergences (e.g. Ficedula flycatchers; Nadachowska-Brzyska et al., 2016, Catharus thrushes; Termignoni-Garcia et al., 2022), as well as shedding light on periods of gene flow among otherwise genetically structured populations (Sato et al., 2020). PSMC analysis can be applied to pseudo-diploid genomes constructed from two individuals from different populations to investigate changes in gene flow and timing of divergence (Li and Durbin, 2011; Prado-Martinez et al., 2013; Sato et al., 2020). For example, PSMC applied to pseudo-diploid genomes from three Golden eagle (Aquila chrysaetos ) subspecies revealed the timing of divergence and gene flow change among the subspecies over a time scale of ~11 million years (Sato et al., 2020). When combined with geological and paleoclimate data, PSMC analysis can reflect a species’ past ability to adapt to environmental change, and how different populations, or species, have been affected by broad climate trends (Nadachowska-Brzyska et al., 2015; Mather et al., 2019). Understanding a species’ response to past environmental change allows for predictions to be made regarding vulnerability to contemporary and future climate change and how this may vary interspecifically under different life histories (Kozma et al., 2018, 2016; Chattopadhyay et al., 2019), or between populations at different locations across a species range (Sato et al., 2020).
Over the past ~5 million years the global climate has fluctuated dramatically, oscillating between periods of extensive glaciation and interglacial warming. Long glacial and short interglacial periods during the mid-Pleistocene revolution (MPR; ~1 Mya - 450 Kya) resulted in cooler interglacial temperatures than those presently recorded (Pisias and Moore, 1981). However, throughout the Mid-Brunhes Event (MBE; ~450-110 Kya), interglaciations were characterised by warmer temperatures, with comparatively less severe glacial periods compared with the mid-Pleistocene (Candy et al., 2010; Barth et al., 2018). During the last glacial maxima (LGM; ~110 Kya) the Fennoscandian ice sheet covered much of Western and North Western Europe, restricting temperate zones to contemporary Southern Eurasia (Denton and Hughes, 1981). These significant shifts in global climate have been shown to correspond with fluctuations in historic population size in a number of species (Nadachowska-Brzyska et al., 2015; Kozma et al., 2016, 2018). Over periods of cooling, temperate Western-Palearctic species will have likely been restricted to southern refugia in Europe (Iberia, Apennines, and Balkans; Hewitt, 1999; but see Thorup et al., 2021). Restriction to different glacial refugia and subsequent northward expansion during interglacial periods have been linked to contemporary population structure and subspecies divergence in multiple species, including birds and aerial insects (e.g. Schmitt, 2007; Hansson et al., 2008; Nadachowska-Brzyska et al., 2016; de Greef et al., 2022). Occupation of separate glacial refugia by different populations is thought to have driven spatial patterns of genetic differentiation in temperate species, with many Palearctic birds exhibiting contemporary East - West patterns in genetic structure and speciation (e.g. Hansson et al., 2008; Lombardo et al., 2022; Väli et al., 2022).
The European nightjar (Caprimulgus europaeus ), henceforth nightjar, is a long-distance migratory bird with a temperate breeding distribution ranging from Northwest Europe through to East Asia (BirdLife International, 2022). Nightjar likely originated from the Afrotropics (Han et al., 2010), with the most closely related extant species being an Afrotropic resident (Rufous-cheeked nightjar Caprimulgus rufigena ) (Han et al., 2010). Nightjar are composed of six subspecies (C.e. europaeus ,meridionalis , sarudnyi , unwini , plumipes ,dementievi ) broadly following an East-West clinal distribution (Cleere, 1998; Cleere et al., 2021), although mtDNA analysis has found little association between genetic variation and current subspecies classifications (Schweizer et al., 2020). Based on mtDNA analysis, Larsen et al., (2007) proposed that nightjar migratory behaviour evolved at the end of the LGM with glacial expansion and cooling temperatures likely restricting the availability of temperate breeding habitat in Eurasia. However, it seems unlikely that nightjar migratory behaviour developed post-LGM; with evidence of a range-wide phylogenetic divergence in nightjar between Eastern and Western clades, likely originating in ~2.5 Mya, predating the LGM (Schweizer et al., 2020). Suitable breeding habitat would have been available across temperate Eurasia throughout warm interglacial periods, with breeding birds able to seek refuge in Mediterranean or North Africa during glacial periods (Ponti et al., 2020). If nightjars exhibited an Afro-European migration strategy pre-LGM, paleoclimatic-driven periods of dramaticN e change should be evident from PSMC analysis, following population expansion and contraction from refugia during global warming and cooling, respectively.
Nightjar have been subject to population decline across the NW of their range (Conway et al., 2007; Langston et al., 2007). Although current population trends are not a cause for concern (IUCN: Least Concern; BirdLife International, 2022), nightjar migratory behaviour and habitat specialisation leave them sensitive to environmental change as seen in other taxa ((Case et al., 2015; Bairlein, 2016). Its ancestral demographic history may leave nightjar vulnerable to contemporary and future environmental change, if, for example, populations have been subject to bottlenecks resulting in genetic variation depletion (Bürger and Lynch, 1995; Frankham et al., 2010; Nadachowska-Brzyska et al., 2015; Hohenlohe et al., 2021). The reference genome for the European nightjar was sequenced and assembled in 2021 from a bird captured in Southern Europe during the spring migration period (Secomandi et al., 2021). Here we use this published genome alongside a novel Pacbio HIFI sequence, sequenced by us, sampled from a population from the extreme North-western range limit in the UK. We apply PSMC analysis to determine the ancestral demography of nightjar from two contemporary populations in Europe to estimate the historic N e change over time from 10 Kya to 5 Mya. Specifically, we aimed to;
  1. Investigate historic N e trends of two contemporary nightjar populations relative to past climate fluctuations. We hypothesised that, in both populations, decreases inN e will have followed periods of global cooling (i.e LGM (110–10 Kya)), as well as the rapid global climate oscillations of the MPR (~1.2 Mya - 450 Kya). We predict that N e increased following periods of warming (i.e the warmer interglacial periods during the MBE (430–110 Kya));
  2. Compare N e trends from the two contemporary populations and a pseudo-diploid combined genome to examine divergence between the two populations, addressing the hypothesis that the two populations will have diverged during the LGM.
  3. Examine temporal N e patterns to investigate the evolution of migratory behaviour in nightjar. We hypothesise that nightjar will have exhibited fluctuating N e and population divergence before the LGM, suggestive of long distance migratory behaviour arising prior to the LGM.
Methods
Sampling Genetic Material, Extraction & Sequencing
A female nightjar from a breeding population in the East of England (latitude: 53.531, longitude: -0.953) was used to extract DNA for sequencing (population henceforth referred to as NW Europe or NWE). The bird died on 7th August 2019, so was assumed to have been part of the breeding population and not moving through on migration, with spring and autumn migration for nightjar occurring between April - June and late August and September respectively. High molecular weight DNA was extracted from a blood clot in the individual’s heart using a modified version of the phenol-chloroform protocol outlined by Sambrook et al., (1989). Full extraction protocol details can be found in the supporting information. The high molecular weight DNA was then sent to the Centre for Genomics Research facility at the University of Liverpool for PacBio HiFi sequencing library preparation. The reference genome (for assembly details see Secomandi et al., 2021) and 10x Genomics Illumina sequence reads were sequenced from a single female nightjar captured in South West Italy in spring 2021 (latitude: 40.794, longitude: 13.427) provided by Secomandi et al., 2021) (population henceforth referred to as South Europe or SE).
Genome Alignment
Minimap (minimap2 v. 2.18-r101; Li, 2018) and BWA mem (arXiv:1303.3997v1 [q-bio.GN]; Li, 2013) software were used to align the reads from the NWE (HIFI reads) and SE (10 x Illumina reads) nightjars to the reference genome, respectively. Unmapped reads were then filtered from both files leaving only mapped reads.
PSMC Analysis
​​To understand ancestral changes in N e a Pairwise Sequential Markovian Coalescent (PSMC) method was applied to the mapped bam files from the HIFI and 10x Illumina reads, for which the average coverage was 30.5x and 88.1x respectively. First, consensus sequences were generated from the aligned indexed bam files from the HIFI and 10x reads using SAMtools mpileup command and vcfutils.pl as per Li and Durbin, (2011) (https://github.com/lh3/psmc). Consensus files were generated for each chromosome independently before being combined. For the Hifi data, from the NWE genome, four chromosomes (chromosome numbers; 3, 5, 25, and 32) failed to produce consensus files and reduced representations for two of the four chromosomes (3 and 5) were used, with two chromosomes (25 and 32) excluded from the analysis. This resulted in a loss of only ~1% of genomic material for analysis. Sex chromosomes were also excluded from the analysis for both HIFI and 10x genomes. This resulted in 89.8% of the NWE genome and 90.8% of the SE genome being retained for downstream analysis. Consensus files were then filtered for read depth and quality. In order to reduce the effects of low coverage and collapsed regions, consensus files were filtered by excluding reads ~ < ⅓ and >2x mean depth. This resulted in removing reads <10x and >60x for the Hifi data and <30 and >120x for the 10x reads, respectively. Finally, filtering for base quality scores of <20 for the HIFI reads and 10x reads were applied.
The PSMC analysis was then run on the combined consensus .fastq files using the PSMC software package (Li and Durbin, 2011; https://github.com/lh3/psmc). PSMC parameters used by Nadachowska-Brzyska et al., (2015) for demographic analysis of 38 different bird species were chosen for our analysis, where “-N” (30) is the number of iterations, “-t” (5) is the maximum time to the most recent common ancestor, “-r” (5) is the initial mutation/ recombination rate (r = θ/ρ) and “-p” (4+30*2+4+6+10) denotes the distribution of atomic time intervals. In order to determine variation in PSMC predictions, the data were bootstrapped 100 times.
PSMC analysis can be applied to pseudo-diploid genomes formed from the fusing of haploid genomes from two separate populations or species. When PSMC is applied, deviations in N e trends of the pseudo-diploid genome from the two parent populations can denote reductions in gene flow and points of divergence between the two populations signified by the N e of the pseudo-diploid genome tending towards infinity (reducing coalescence events leading to an apparent increase in Ne) (Li and Durbin, 2011; Prado-Martinez et al., 2013; Sato et al., 2020). To determine the timing of divergence between the two sampled populations a pseudo-diploid genome was created by first generating pseudo-haploid genomes through randomly sampling heterozygous alleles using Seqtk V1.3 ‘randbase’ (-r) (https://github.com/lh3/seqtk; accessed Oct 25th, 2022) from both consensus sequence files as generated above. Pseudo-haploid files were then merged using Seqtk ‘mergefa’ to produce a single pseudo-diploid genome consensus file. PSMC analysis was then applied to the pseudo-diploid genome as described above.
Finally, all PSMC results were plotted using gnuplot (http://www.gnuplot.info/) with the -R flag applied to export .txt files. In order to plot the PSMC results, the data must be scaled to real-time by using mutation rate and generation time (Li and Durbin, 2011). A generation time of 2 (-g 2) was selected for nightjars following that used for Chuck-will’s-widow (Antrostomus carolinensis ) (Nadachowska-Brzyska et al., 2015), with birds able to breed in their second year (Cramp and Simmons, 1985). As no species-specific mutation rates were available for European nightjar, a mutation rate of μ = 4.6 × 10–9was used as per Sato et al., (2020). The mutation rate was initially estimated for collared flycatchers (Ficedula albicollis ) (Smeds et al., 2016), but has since been successfully applied to other passerines (Ericson et al., 2022), raptors (Hanna et al., 2017; Sato et al., 2020) and waterfowl species (Ericson et al., 2017).
Results and Discussion
In this study we explored the demographic history of two modern nightjar populations (NW and S Europe). Using PSMC we found significant fluctuation in N e in European nightjar over the last 5 million years, coinciding with major paleoclimatic events (Fig1a). The timing of initial divergence between the two nightjar populations was ~1.2Mya (Fig1A), with final divergence found to coincide with the LGM (~110 Kya) (Fig1B).
Demographic History of European Nightjar
Our analysis suggests that nightjar have experienced significant fluctuations in N e over the last ~5 million years. Two of the most significantN e changes occurred during the Pleistocene, with both populations (NWE & SE) increasing throughout the early Pleistocene to a maximum N e of ~780,000 individuals, before decreasing to ~570,000 individuals by 600 Kya during the MPR (~1 Mya - 450 Kya; Fig 1a). As hypothesised, the N e of both populations then increased throughout the MBE to ~ 1 million individuals by ~240 Kya (Fig 1a). Both populations then decreased inN e until ~100 Kya (Fig 1a). At the onset of the LGP, populations exhibited a peak inN e, followed by a steep decline as the LGP progressed (Fig 1a). The N e of the two populations then diverged in size (see below).