2.7 Gene set overlap analysis
In addition to the multivariate analysis, which was expected to reveal broad genome-wide parallelism (in genetic divergence or gene expression) or tradeoffs (in gene expression only), we used a simpler approach based on counting the number of outlier genes shared between two or more contrasts, as well as the number of contrast-specific outliers. To determine how many genes are expected in each set by chance alone, we used 5,000 permutations of the test statistic to obtain null distributions. We calculated the p-value for observing a given number of genes in an overlapping or contrast-specific gene set as the fraction of permuted analyses returning the same or a more extreme number of genes in that set, multiplied by two to account for the two-tailed nature of the test. The outlier genes representing each contrast were those in the top 25% of the genetic divergence (FST) distribution, or in the top and bottom 12.5% of the log-fold change in gene expression. To calculate gene set overlaps we used the function venn in the R package gplots, and we visualized the results as UpSet plots using the R package UpSetR (Conway et al., 2017).
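The permutation logic described above can be illustrated with a minimal sketch for the overlap between two contrasts. This is not the code used in the study, which was run in R. The sketch is written in Python with the standard library only, and the function names (top_fraction, overlap_permutation_test) are hypothetical. It rests on several assumptions: that permuting the test statistic means shuffling the per-gene statistic of one contrast across genes, that the "more extreme" tail is the smaller of the two tail fractions, and that the doubled p-value is capped at 1.

```python
import random


def top_fraction(stats, fraction=0.25):
    """Return the indices of the genes whose statistic lies in the top `fraction`."""
    n_top = int(len(stats) * fraction)
    ranked = sorted(range(len(stats)), key=lambda i: stats[i], reverse=True)
    return set(ranked[:n_top])


def overlap_permutation_test(stats_a, stats_b, fraction=0.25, n_perm=5000, seed=0):
    """Permutation test for the number of outlier genes shared by two contrasts.

    stats_a, stats_b: per-gene test statistics (e.g. FST), in the same gene order.
    Returns (observed overlap, null overlap counts, two-tailed p-value).
    """
    rng = random.Random(seed)
    outliers_a = top_fraction(stats_a, fraction)
    observed = len(outliers_a & top_fraction(stats_b, fraction))

    # Assumption: the null is built by shuffling contrast B's statistic across genes.
    shuffled = list(stats_b)
    null = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        null.append(len(outliers_a & top_fraction(shuffled, fraction)))

    # Fraction of permutations with the same or a more extreme count, in each tail.
    upper = sum(count >= observed for count in null) / n_perm
    lower = sum(count <= observed for count in null) / n_perm

    # Assumption: double the smaller tail and cap the result at 1.
    p_value = min(1.0, 2 * min(upper, lower))
    return observed, null, p_value


if __name__ == "__main__":
    # Simulated statistics for 1,000 genes; not real data.
    rng = random.Random(1)
    contrast_a = [rng.random() for _ in range(1000)]
    contrast_b = [rng.random() for _ in range(1000)]
    observed, null, p = overlap_permutation_test(contrast_a, contrast_b, n_perm=1000)
    print(observed, p)
```

For gene expression data, the same scheme would apply with the outlier set defined as the union of the top and bottom 12.5% of log-fold changes. Contrast-specific sets can be tested by counting genes in one outlier set but not the other.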