2.5 Identification of differentially expressed genes and
pathways
To quantify expression we ran the “align_and_estimate_abundance.pl”
script in Trinity v.2.5.1 (Grabherr et al., 2011) software incorporating
the RSEM program (Li & Dwey, 2011). Gene-level quantification estimates
produced by RSEM were imported into R/Bioconductor with the tximport
package. The tximport package produces count matrices from gene-level
quantification files with effective gene length taken into account
(Soneson, Love, & Robinson, 2016). To detect differential gene
expression, we passed the estimated count matrices from tximport to
DESeq2 (Love, Huber, & Anders, 2014) and analyzed the two species
separately. To build the model for the differential expression analysis
we removed low count genes (<10) and genes that were not
present in at least 2 replicates. To examine intra-species temperature
specific expression pattern in B. calyciflorus s.s., we performed
pairwise
contrasts within the model in the following combinations: 20 vs.26 oC (control vs. mild heat), 20 vs. 32oC (control vs. high heat), and 26 vs.32 oC (mild vs. high heat) (Figure 1). ForB. fernandoi similarly we performed pairwise comparisons in the
following combinations: 20 vs. 23 oC (controlvs. mild heat), 20 vs. 26 oC (controlvs. high heat), and 23 vs. 26 oC (mildvs. high heat) (Figure 1). We used a false discovery rate (FDR)
threshold of 0.05 to correct for multiple testing. All differentially
expressed genes (DEGs) were annotated against the NCBI non-redundant
(nr ) database using blastx (e-value cutoff
1e-10). We assessed overall temperature-dependent
patterns of expression by plotting a two-dimensional principal component
analysis (PCA) of log-transformed counts for each species separately.
We further categorized differential expression at the gene-pathway level
using the online Kyoto Encyclopedia of Genes and Genomes (KEGG)
automatic server for KEGG pathway analysis (Moriya, Itoh, Okuda,
Yoshizawa, & Kanehisa, 2007), which clusters genes based on their
association in biochemical pathways. To estimate whole KEGG pathway
expression, genes belonging to the same KEGG pathway were clustered
together and a differential pathway expression analysis was performed.
Pathways were considered differentially expressed at a false discovery
rate (FDR) below 0.05. Analyses and visualization was performed using
the “gage” (Luo et al., 2009), “clusterProfiler” (Yu, Wang, Han, &
He, 2012), “pathview” (Luo, Weijun, Brouwer, & Cory, 2013), and
“ggplot2” (Wickham, 2016) R packages (R Core Team).
To capture similar or contrasting patterns of expression between the two
species, orthologous genes were identified using Orthofinder (Emms &
Kelly, 2015). From this, we compared expression of orthologous genes
between B. fernandoi and B. calyciflorous usingClust (Abu-Jamous & Kelly, 2018). In Clust, gene
clusters (groups) are identified that are consistently co-expressed
(well-correlated) in both shared and contrasting patterns between
species. Within this, we chose patterns that were biologically
meaningful for further analysis. These groups/clusters were checked for
functional enrichment in any KEGG pathways, using a Fisher Exact Test
and correcting for false positives (FDR=0.05).