Genetic diversity and demographic history
Genetic diversity (θπ ) was calculated for
different site categories across the genome based on the publicly
available genome annotation in MonarchBase
(Zhan & Reppert, 2013). This was done
separately in windows of 10kb for eastern and western monarchs. Genomic
positions were categorized as intergenic, intronic,
1st, 2nd, 3rdcodon positions and 4-fold degenerate sites (4D).
We used two different approaches to analyze demographic history. First,
we used a diffusion approximation method of ∂a∂i
(Gutenkunst, Hernandez, Williamson &
Bustamante, 2010) to investigate the joint demographic history of
eastern and western monarchs. For this analysis we only considered
autosomal scaffolds as Z chromosomes have different effective population
sizes and a low density of SNPs, which could affect the resulting site
frequency spectrum. We generated a Two-Dimensional Joint Site Frequency
Spectrum (2D-JSFS) of eastern and western monarchs using the
“dadi.Misc.make_data_dict_vcf” function provided with ∂a∂i. We
scanned for the likelihoods of a set of 15 demographic models to address
the following questions (Fig. S9): (1) have eastern and western monarchs
diverged in the past; (2) if there is divergence, is there migration
between the two populations; (3) if there is migration, what is the rate
of migration; (4) what is the most likely scenario to explain changes in
effective population size? We simulated 4 two-population models provided
with ∂a∂i (Fig. S9-L to S9-O) and 11 two-population models provided with
dadi_pipeline
(github.com/dportik/dadi_pipeline)
(Portik et al. , 2017). We used log
likelihoods to find the most likely model that can explain the joint
site frequency spectrum of eastern and western monarchs. We performed
parameter optimization by running 100 iterations performed in a total of
4 rounds (10, 20, 30 and 40 iterations for the first, second, third and
fourth round, respectively). Parameters with the highest log-likelihood
were used as starting parameters of the next round. We used the Broyden
Fletcher Goldfarb Shanno (BFGS) algorithm to optimize the parameters.
Results of all 16 optimized models were summarized using
“Summarize_Outputs.py” provided with “dadi_pipeline”. This script
extracts the iterations with the highest log-likelihood and compares
them between models. The model with the highest log-likelihood score and
the lowest Akaike Information Criterion (AIC) was considered the most
likely model to explain the 2D-JSFS.
As a second approach to understand the demographic history of eastern
and western monarchs, we analyzed the genome-wide patterns of Tajima’s D
(TD ) calculated in windows of 10 kb using
Vcftools v. 0.1.15 (Danecek et al. ,
2011) to determine if eastern and western monarchs have different
demographic histories.