4 Discussion
The increasing use of RNA-Seq for ecological, physiological, and evolutionary studies on wild-caught organisms has required appraisal of the influence of different sampling techniques, storage methods, processing time, and tissue types on RNA quality and data production (Camacho-Sanchez et al. 2013, Cheviron et al. 2011, Nakatsuji et al. 2019, Yu et al. 2013). Among the most important current applications of RNA-Seq are testing for rapid adaptation to environmental change (e.g., to captivity or climate warming) and determining whether environmentally induced gene expression shifts are transgenerationally transmitted (e.g., Christie et al. 2016, Charlesworth et al. 2017, Skvortsova et al. 2018, Navarro-Martin et al. 2020, Sävilammi et al. 2020). Our results will facilitate future research testing for transgenerational transmission of potentially epigenetic hatchery-adaptive traits in wild fish populations (e.g., Christie et al. 2016, Le Luyer et al. 2017, Wellband et al. 2020).
Evidence is accumulating regarding the effects that sampling techniques, sample processing time, RNA degradation, and different RNA-Seq libraries have on RNA-Seq data (e.g., Gayral et al. 2012, Romero et al. 2014, Ma et al. 2019; see also Introduction). We tested these effects on westslope cutthroat trout sampled by dip netting or electrofishing. We also tested whether distinct tissues are differently affected by these conditions. Samples were sourced from a wild non-introgressed population raised in controlled environments in order to minimize variation in gene expression.
Overall, we obtained high RNA quality for all tissues (mean RIN > 9.0 for the different tissues) except liver (mean RIN = 8.0). Liver is a tissue with a high rate of protein synthesis and degradation, and the higher RNA degradation observed for this tissue in comparison to blood, muscle, and gills is likely the result of higher enzymatic activity in the liver (Carter et al. 2001, Wiseman et al. 2007). In our experiment, liver was the third tissue sampled after euthanasia, after blood and muscle, and it took us between 2 and 3 minutes to sample. Because of its importance in detoxification mechanisms, physiological studies may require sampling of this tissue. We therefore suggest sampling liver first when more than one tissue is sampled, to minimize RNA degradation.
We also found no difference in RNA quality between samples obtained through dip netting and electrofishing, even when tissue was not harvested until 5 minutes after death. While opinions differ on the cutoff RIN value needed to obtain reliable gene expression data, it has been shown that degraded RNA still recovers the same uniquely mapped genes as non-degraded RNA, although the coverage of mapped reads is lower for degraded RNA and the reduction is gene-specific (Romero et al. 2014, Wang et al. 2016). However, while RNA degradation may not strongly affect mapping, it may drastically affect estimates of differential gene expression (Chen et al. 2014, Romero et al. 2014). Furthermore, different RNA-Seq techniques may be differentially affected by RNA degradation, so the most appropriate RNA-Seq library should be selected according to RNA quality (Adiconis et al. 2013).
We found that gene expression among individuals belonging to the same group was generally very similar for the majority of comparisons (correlation coefficients > 0.9), independent of the sampling method or harvesting time. However, we observed among-sample variation in gene expression, which highlights the importance of larger sample sizes in RNA-Seq studies to decrease the influence of stochastic effects on variation in gene expression that could otherwise be interpreted as biologically relevant (Ching et al. 2020). Furthermore, we also observed similar expression levels among samples obtained with the two sampling methods (dip netting or electrofishing) or subjected to different tissue harvest times (immediately or 5 minutes after death). Sampling individuals of the same age, in the same environment, and on the same day, with many biological replicates per treatment and using only samples with highly similar RNA quality, most likely reduced the effects of non-biological variation and of non-relevant biological variation in our experiments (Fang & Cui 2010, Wong et al. 2012, Yu et al. 2014).
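The within-group similarity check described above can be illustrated with a minimal sketch. It assumes pairwise Pearson correlation on log2(count + 1) values, which is one common choice and not necessarily the procedure used in this study. The sample names and counts are hypothetical toy values, not data from this work.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def log_counts(counts):
    """log2(count + 1) transform, so a few highly expressed genes do not dominate."""
    return [math.log2(c + 1) for c in counts]


def within_group_correlations(samples):
    """Return {(sample_a, sample_b): r} for every pair of samples in one group.

    `samples` maps a sample name to its per-gene counts, with genes in the
    same order for every sample.
    """
    logged = {name: log_counts(counts) for name, counts in samples.items()}
    return {
        (a, b): pearson(logged[a], logged[b])
        for a, b in combinations(sorted(logged), 2)
    }


# Hypothetical toy counts for three fish from one treatment group (assumption).
group = {
    "fish_1": [120, 0, 35, 980, 14],
    "fish_2": [110, 2, 40, 1010, 9],
    "fish_3": [135, 1, 28, 900, 20],
}

correlations = within_group_correlations(group)
# Pairs at or below the 0.9 threshold mentioned in the text would be flagged here.
low_pairs = [pair for pair, r in correlations.items() if r <= 0.9]
```

Pairs listed in `low_pairs` would be candidates for closer inspection (for example, for differences in RNA quality) before being interpreted as biological variation.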
We recovered a higher number of reads per sample with the whole mRNA-Seq library technique used here (NEB) than with 3’ Tag-Seq (around 10 times higher in NEB than in 3’ Tag-Seq), as expected (Ma et al. 2019). Similar to results reported by Ma et al. (2019), the number of mapped genes we recovered was also higher (at least twofold) for samples processed with NEB than with 3’ Tag-Seq, independent of the number of reads per gene and gene transcript length. This higher number suggests that researchers should use whole mRNA-Seq when their research question requires genome-wide coverage and the study of large numbers of genes.
Selection of 11M reads and 40M reads for 3’ Tag-Seq and whole mRNA (NEB) libraries, respectively, resulted in a very similar proportion of uniquely mapped reads on the O. mykiss reference genome for the two library techniques (75% NEB versus 77% 3’ Tag-Seq). Therefore, while RNA-Seq samples prepared using NEB libraries yield more raw reads than those prepared using the 3’ Tag-Seq library, the larger number of reads did not increase the proportion of uniquely mapped reads on the reference genome. Previous studies (Liu et al. 2014, Ma et al. 2019) also found similar estimates of gene expression for sequencing depths equal to or above 10M reads. However, independently of the sequencing depth (in this study NEB: 40M reads and 3’ Tag-Seq: 11M reads), we found different gene expression between NEB and 3’ Tag-Seq, with higher estimated gene expression being gene-specific and not library-dependent. Whole mRNA-Seq has been found to detect more differentially expressed genes, even at sequencing depths below 10M reads, potentially as a consequence of the increased number of mapped reads for longer transcripts in whole mRNA-Seq versus 3’ RNA-Seq (Ma et al. 2019). We did find a very slight trend toward a higher proportion of genes with greater gene expression for NEB relative to 3’ Tag-Seq with increasing transcript length.
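Comparing expression estimates between libraries sequenced to different depths requires some form of depth normalization. The sketch below uses counts per million (CPM), a widely used normalization that is assumed here for illustration only; the text does not state which normalization was applied. Gene names and counts are hypothetical.

```python
def counts_per_million(counts):
    """Scale raw per-gene counts so that each library sums to one million."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("library has no mapped reads")
    return {gene: count * 1_000_000 / total for gene, count in counts.items()}


# Hypothetical toy libraries (assumption): the same relative expression profile,
# but the whole mRNA library is sequenced ten times deeper than the tag library.
whole_mrna = {"gene_a": 4000, "gene_b": 1000, "gene_c": 5000}
tag_seq = {"gene_a": 400, "gene_b": 100, "gene_c": 500}

cpm_whole = counts_per_million(whole_mrna)
cpm_tag = counts_per_million(tag_seq)

# After normalization, a difference in raw depth alone no longer produces
# a difference in the expression estimate for any gene.
same_profile = all(
    abs(cpm_whole[gene] - cpm_tag[gene]) < 1e-6 for gene in whole_mrna
)
```

Differences that remain after such scaling, like the gene-specific differences reported above, cannot be attributed to raw read depth alone.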
Although stress levels associated with dip netting and electrofishing may differ, sampling technique did not affect gene expression levels. This result was independent of the RNA-Seq library type (3’ Tag-Seq or NEB) and the tissue used. Whole mRNA-Seq has been reported to be more sensitive to differentially expressed genes than 3’ RNA-Seq methods (Ma et al. 2019), yet we found no differences in estimated gene expression between the two sampling methods with either library type. This further supports that researchers can confidently use either one or both of these sampling methods to obtain fish tissues for studies using RNA-Seq. As field conditions often change among sampling locations, researchers could opt to use electrofishing where it is more efficient and compare the results with fish obtained by netting in other localities without worrying about introducing extraneous variation in gene expression.
We also found that harvesting the tissue immediately or 5 minutes after death did not produce variation in gene expression, suggesting that it is safe to euthanize fish in batches and then proceed to tissue harvesting. In our work, the maximum processing time of the last tissue harvested after death was approximately 10 min (for fish processed 5 minutes after death). Although sampling techniques and tissue processing time did not influence variation in gene expression, we observed a large proportion of differentially expressed genes among the different tissues.
We found fewer expressed genes in blood compared to gill and muscle, and a smaller proportion of genes with higher expression in blood than in the other two tissues. Blood and muscle were also the tissues with the fewest expressed genes in common. Gill was the tissue in which the highest number of total expressed genes was recovered. This may be due to the active cellular processes occurring in gills (Stolper et al. 2019), especially in animals that are experiencing growth, as were the ones we sampled. This interpretation is supported by the types of genes found to be highly expressed in this tissue (e.g., genes related to metabolic and growth-related processes). Depending on the study question, sampling different tissues may ensure that multiple genes and multiple biological processes are considered in studies of differential gene expression.
In summary, our study indicates that differential gene expression results are likely to be comparable for dip netting and electrofishing. Additionally, gill, blood, and muscle all produce good-quality RNA with reliable results if sampled within 10 minutes of death. Only liver samples showed reduced RNA quality. Finally, although whole mRNA-Seq detects more differentially expressed genes, this did not produce different results in terms of distinct gene expression among the groups tested here. 3’ Tag-Seq can therefore be more cost-effective, ensuring sufficient depth of coverage and allowing larger sample sizes to be processed at a lower cost, thus potentially increasing the statistical power to detect differential gene expression. Consequently, depending on the study question, sequencing a large number of individuals using 3’ Tag-Seq (and a subset of samples with whole mRNA-Seq) will often be the best strategy to test for differences in gene expression among tested groups. Our study provides crucially needed data to advance the use of RNA-Seq to investigate gene expression variation and its role in phenomena such as adaptation to environmental variation and climate change in natural populations.