Genome-wide SNP genotyping
We used the same samples and quality-filtered reads included in our previous study on the genetic architecture of limb length in A. sagrei (Bock et al., 2021). This sampling includes the A. sagrei males used here for dewlap measurements, non-native population samples obtained earlier in the invasion (i.e., in 2003), and samples from the native range of A. sagrei. Using quality-filtered reads for all samples (see detailed methods in Bock et al., 2021), we repeated the SNP calling and variant filtering steps based on version 2.1 of the A. sagrei genome, which recently became available (Geneva et al., 2021). Reads were aligned to the genome using the dDocent v2.2.20 pipeline (Puritz et al., 2014), and SNPs were called using FreeBayes v1.3.2 (Garrison & Marth, 2012).
Filtering of the resultant variant calls was implemented using vcflib (https://github.com/vcflib/vcflib) and consisted of sequential steps based on number of alleles (i.e., keeping only biallelic markers), type of variant (i.e., keeping only SNPs), read mapping quality (i.e., keeping only SNPs with a MAPQ score > 20), and depth of sequencing (i.e., keeping only genotypes with DP > 7). For the remaining filtered SNPs, we used BCFtools v1.9 (Narasimhan et al., 2016) to subset genotypes corresponding to the sequenced A. sagrei individuals obtained in 2018 from Florida and Georgia, for which dewlap trait data were also available (N = 561). Of the 561 samples, nine were sequenced in duplicate, resulting in 570 total sequencing libraries. We then kept only SNPs with data for more than 70% of samples and a minor allele frequency > 1%, and calculated identity-by-state (IBS) between samples using the SNPRelate R package (v1.19.4; Zheng et al., 2012). Following previous studies (e.g., Bock et al., 2021), we relied on IBS values between DNA replicates to estimate the rate of genotyping errors in our dataset. We then kept one replicate per sample and repeated the 70% call rate and minor allele frequency filters described above across all 561 genetically distinct samples. Finally, we removed one sample that had missing data at more than 30% of the final filtered SNPs, keeping the remaining 560 genotypes for downstream analyses.
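The call rate and minor allele frequency filters, and the replicate-based IBS check, can be illustrated with a minimal Python sketch. This is not the code used in the study, which relied on vcflib, BCFtools, and SNPRelate. The sketch assumes genotypes are coded as alternate-allele counts (0, 1, 2) with None for missing calls, and it treats IBS between two replicates as one minus half the mean absolute genotype difference over sites called in both, which is a simplifying assumption. All function names are hypothetical.

```python
from typing import List, Optional, Sequence

# Assumed coding: 0, 1, 2 = alternate-allele count; None = missing genotype.
Genotype = Optional[int]


def call_rate(snp: Sequence[Genotype]) -> float:
    """Fraction of samples with a non-missing genotype at one SNP."""
    return sum(g is not None for g in snp) / len(snp)


def minor_allele_frequency(snp: Sequence[Genotype]) -> float:
    """Frequency of the rarer allele among called genotypes at one SNP."""
    called = [g for g in snp if g is not None]
    if not called:
        return 0.0
    alt_freq = sum(called) / (2 * len(called))
    return min(alt_freq, 1 - alt_freq)


def filter_snps(
    snps: Sequence[Sequence[Genotype]],
    min_call_rate: float = 0.70,
    min_maf: float = 0.01,
) -> List[Sequence[Genotype]]:
    """Keep SNPs called in more than 70% of samples with MAF > 1%."""
    return [
        snp
        for snp in snps
        if call_rate(snp) > min_call_rate
        and minor_allele_frequency(snp) > min_maf
    ]


def replicate_ibs(rep_a: Sequence[Genotype], rep_b: Sequence[Genotype]) -> float:
    """Mean IBS between two replicates over sites called in both (assumed definition)."""
    shared = [(a, b) for a, b in zip(rep_a, rep_b) if a is not None and b is not None]
    if not shared:
        raise ValueError("no sites called in both replicates")
    mean_diff = sum(abs(a - b) for a, b in shared) / len(shared)
    return 1 - mean_diff / 2


# Toy data: three SNPs (rows) genotyped in four samples (columns).
toy_snps = [
    [0, 1, 2, None],     # call rate 0.75, MAF 0.5: kept
    [0, 0, None, None],  # call rate 0.50: dropped
    [0, 0, 0, 0],        # monomorphic, MAF 0: dropped
]
kept_snps = filter_snps(toy_snps)

# Toy replicate pair: one discordant genotype among four shared sites.
ibs = replicate_ibs([0, 1, 2, None, 2], [0, 1, 1, 2, 2])
```

Under this definition, an IBS below 1 between two libraries of the same individual reflects discordant genotype calls, so 1 − IBS gives a rough upper bound on the per-allele discordance between replicates.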
Of the resulting filtered SNPs, we removed candidate gametolog SNPs following Bock et al. (2021). These gametolog SNPs occur on the X chromosome of A. sagrei (Geneva et al., 2021) and result from homology between the X chromosome (i.e., scaffold 7 in the A. sagrei v2.1 genome assembly) and the Y chromosome, which is currently not included in the genome assembly. After gametolog removal, we excluded markers that were in strong linkage disequilibrium (i.e., r² > 0.4) by scanning the genome in windows of 5,000 SNPs, using the ‘--indep-pairwise’ option in PLINK (v1.9; Purcell et al., 2007). We then kept all SNPs located on the largest 14 scaffolds of the genome assembly, which correspond to the known number of chromosomes for A. sagrei and cover more than 99% of the total assembly length (Geneva et al., 2021).
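The logic of pairwise LD pruning can be sketched in Python as follows. This is a simplified greedy illustration, not PLINK's algorithm: it assumes complete genotype data coded as alternate-allele counts, computes r² as the squared Pearson correlation between genotype vectors, and always retains the earlier SNP of a correlated pair. The function names are hypothetical, and the window is treated here as a simple look-back distance in SNP positions.

```python
from typing import List, Sequence


def r_squared(x: Sequence[int], y: Sequence[int]) -> float:
    """Squared Pearson correlation between two genotype vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a monomorphic SNP carries no LD information
    return cov * cov / (var_x * var_y)


def greedy_ld_prune(
    snps: Sequence[Sequence[int]],
    window: int = 5000,
    r2_threshold: float = 0.4,
) -> List[int]:
    """Return indices of SNPs retained after greedy pairwise pruning.

    A SNP is retained only if its r² with every already-retained SNP
    located fewer than `window` positions upstream is at or below the threshold.
    """
    kept: List[int] = []
    for i, snp in enumerate(snps):
        in_ld = any(
            r_squared(snp, snps[j]) > r2_threshold
            for j in kept
            if i - j < window
        )
        if not in_ld:
            kept.append(i)
    return kept


# Toy data: three SNPs (rows) genotyped in six samples (columns).
toy_snps = [
    [0, 1, 2, 0, 1, 2],
    [0, 1, 2, 0, 1, 1],  # strongly correlated with the first SNP
    [2, 0, 1, 1, 0, 2],  # uncorrelated with the first SNP
]
retained = greedy_ld_prune(toy_snps)
```

In this toy example the second SNP is removed because its r² with the first exceeds 0.4, whereas the third SNP is retained.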