IBD relatedness and selection analysis
Shared ancestry and relatedness between isolates were estimated using identity-by-descent (IBD). PED and MAP files were created using VCFtools from an LD-pruned VCF dataset of the full genome (core plus (sub)telomeric and low-complexity regions of the 14 chromosomes). IBD-sharing between pairs of samples was calculated using the isoRelate package in R, which can analyse IBD in haploid recombining microorganisms in the presence of multiclonal infections. Genetic distance was calculated using an estimated mean map unit size from Plasmodium chabaudi of 13.7 kb/centimorgan (cM). We set the IBD thresholds at the minimum number of SNPs (n = 20) and the minimum length of IBD segments (5000 bp) reported to reduce false-positive calls, using an error rate of 0.001. IBD has been shown to be superior to probabilistic models such as STRUCTURE for understanding the relatedness and interconnectivity of parasite populations. Networks of IBD-sharing (>10% of the genome shared) between individuals were created using the igraph package in R, and the cumulative level of IBD-sharing between isolates in countries in the network was plotted as a connection map with SCImago Graphica and used as a measure of connectivity between countries.
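The arithmetic behind these thresholds can be illustrated with a minimal Python sketch. It is not the isoRelate or igraph code used in the analysis. It assumes a uniform recombination rate of 13.7 kb/cM, a hypothetical list of detected segments (dictionaries with the illustrative keys sample_a, sample_b, start_bp, end_bp and n_snps), and segments that do not overlap within a pair. All function names are hypothetical.

```python
from collections import defaultdict

KB_PER_CM = 13.7            # mean map unit size stated in the text (kb per cM)
MIN_SNPS = 20               # minimum number of SNPs per IBD segment
MIN_LENGTH_BP = 5000        # minimum IBD segment length in base pairs
MIN_SHARED_FRACTION = 0.10  # network edges require >10% of the genome shared


def bp_to_cm(position_bp):
    """Convert a physical position (bp) to genetic distance (cM),
    assuming a uniform recombination rate along the chromosome."""
    return position_bp / (KB_PER_CM * 1000.0)


def keep_segment(segment):
    """Return True if a segment meets both the SNP-count and length thresholds."""
    length_bp = segment["end_bp"] - segment["start_bp"]
    return segment["n_snps"] >= MIN_SNPS and length_bp >= MIN_LENGTH_BP


def shared_fraction(segments, genome_length_bp):
    """Sum the retained segment lengths per sample pair and divide by genome length.
    Assumes segments within a pair do not overlap."""
    totals = defaultdict(int)
    for seg in segments:
        if keep_segment(seg):
            pair = tuple(sorted((seg["sample_a"], seg["sample_b"])))
            totals[pair] += seg["end_bp"] - seg["start_bp"]
    return {pair: total / genome_length_bp for pair, total in totals.items()}


def network_edges(fractions):
    """Keep only pairs sharing more than the threshold fraction of the genome."""
    return sorted(pair for pair, frac in fractions.items()
                  if frac > MIN_SHARED_FRACTION)
```

The resulting list of pairs corresponds to the edge list from which a relatedness network could be built. In this sketch, the per-pair shared fractions could also be summed by country to give the kind of cumulative value used for the connection map.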
For the samples from Latin America, the proportion of pairs of isolates sharing IBD, as well as the significance of IBD-sharing, was calculated using the isoRelate package in R for all samples together and for each population (defined by country), as a measure of positive selection.
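As a rough illustration of the first quantity only, the proportion of pairs IBD at each SNP can be sketched in Python as below. This is not isoRelate's implementation, and it does not reproduce the package's significance statistic. The input is a hypothetical binary matrix with one row per isolate pair and one column per SNP (1 where the pair is inferred IBD at that SNP), together with a hypothetical population label for each pair.

```python
def ibd_proportion_per_snp(ibd_matrix):
    """Proportion of isolate pairs that are IBD at each SNP.
    ibd_matrix[i][j] is 1 if pair i is IBD at SNP j, else 0."""
    n_pairs = len(ibd_matrix)
    if n_pairs == 0:
        return []
    n_snps = len(ibd_matrix[0])
    return [sum(row[j] for row in ibd_matrix) / n_pairs for j in range(n_snps)]


def ibd_proportion_by_population(ibd_matrix, pair_populations):
    """Compute the per-SNP proportions separately for each population label.
    pair_populations[i] is the (assumed) population label of pair i,
    for example the country shared by both isolates."""
    grouped = {}
    for row, label in zip(ibd_matrix, pair_populations):
        grouped.setdefault(label, []).append(row)
    return {label: ibd_proportion_per_snp(rows) for label, rows in grouped.items()}
```

In this sketch, SNPs where the proportion is high relative to the rest of the genome would be the candidates for the signals of positive selection described above.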