3.1 RNA and raw sequencing data quality statistics
Out of the 120 samples for which RNA was extracted, 86 had a RIN value
(a measure of RNA integrity) equal to or above 8.8. Little variation in RIN
scores was observed among the sampled tissues and sampling methods
(Supporting Information Table S1), except for liver, which overall
showed higher levels of RNA degradation and was therefore not used for
library construction and sequencing. Mean and standard deviation for RIN
values for the four tissues were: 9.6±0.22 (blood), 9.2±0.40 (muscle),
8.0±1.21 (liver), and 9.0±1 (gill). Mean and standard deviation for RIN
values for the three treatment groups without the liver were: 9.2±0.43
(dip netting), 9.3±0.34 (electrofishing), and 9.2±1.06 (tissue
harvesting after 5 minutes). We found no significant differences in RIN
values among groups (F = 0.299, df = 2, p = 0.74) or among tissues
within each group (F = 0.595, df = 4, p = 0.67), after excluding the
liver from the analyses.
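F statistics of this kind typically come from an analysis of variance. As a minimal sketch of how a one-way ANOVA F statistic is computed, assuming a simple one-factor design (the function name `one_way_anova_f` and the input values are illustrative placeholders, not the study's data or analysis script):

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    `groups` is a list of lists, one list of measurements per group.
    """
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    group_means = [mean(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of individual values around their own group mean.
    ss_within = sum(
        (v - m) ** 2 for g, m in zip(groups, group_means) for v in g
    )

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Placeholder RIN-like values for three hypothetical groups (NOT the study's data).
example_groups = [
    [9.1, 9.3, 9.2, 9.4],
    [9.2, 9.5, 9.3, 9.1],
    [9.0, 9.6, 9.2, 9.3],
]
f_stat, df_b, df_w = one_way_anova_f(example_groups)
print(f"F = {f_stat:.3f}, df = {df_b}, {df_w}")
```

With three groups the between-group degrees of freedom are 2, as in the among-group test reported above. A nested comparison (tissues within groups) would need a different model than this sketch.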
RNA sequencing of the 3’ Tag-Seq samples, regardless of tissue type, yielded
a total of 367.2 million reads for individuals captured by net (mean =
13.1 million ± 0.72; N = 28), 328.2 million reads for samples
collected immediately after electrofishing (mean = 12.62 million ± 0.46;
N = 26), and 347.6 million reads for samples electrofished and processed
after 5 minutes (mean = 12.87 million ± 0.71; N = 27) (Supporting
Information Table S1). The final
number of reads per individual ranged from 11 million to 15.6 million
(mean = 12.88 million ± 0.67). On average, around 77% of the 11 million
reads randomly selected for each sample mapped uniquely to the rainbow
trout (O. mykiss) genome, independently of the sampling method used
(range: 67.7–86.3%, Supporting Information Table S1), indicating that the
libraries were of good quality (Dobin and Gingeras, 2015) for downstream
analyses.
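Selecting a fixed number of reads at random from a larger file can be done in a single pass with reservoir sampling. The sketch below shows one common approach and is not necessarily the procedure used in this study (see Materials and Methods for that); the function name `reservoir_sample`, the `seed` parameter, and the toy read names are hypothetical.

```python
import random


def reservoir_sample(records, k, seed=0):
    """Return k records chosen uniformly at random from an iterable.

    Uses reservoir sampling (Algorithm R), so the input is read once and
    only k records are held in memory. If the input has fewer than k
    records, all of them are returned.
    """
    rng = random.Random(seed)
    reservoir = []
    for i, record in enumerate(records):
        if i < k:
            # Fill the reservoir with the first k records.
            reservoir.append(record)
        else:
            # Replace a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = record
    return reservoir


# Toy example: pick 11 of 1,000 hypothetical read names.
reads = (f"read_{i}" for i in range(1000))
subset = reservoir_sample(reads, 11)
print(len(subset))
```

Fixing the seed makes the selection reproducible, which matters when the same subsample has to be regenerated for a later analysis.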
RNA sequencing from the 14 whole mRNA-Seq (NEB) samples (blood only)
yielded a total of 564 million reads for individuals captured by net
(mean = 112.9 million ± 13.95; N = 5), 563.4 million reads for
samples collected by electrofishing and sampled immediately (mean =
112.7 million ± 22.4; N = 5), and 350.4 million reads for
electrofishing samples processed after 5 minutes (mean = 87.6 million
± 7.4; N = 4). The final number of reads per individual ranged
from 77.8 to 148.8 million reads (mean = 105.6 million ± 19.1). The
number of reads per sample was therefore on average roughly eight times
higher for NEB than for 3’ Tag-Seq (105.6 vs. 12.88 million).
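As a quick arithmetic check, the pooled mean number of reads per sample for each library type, and the ratio between them, can be recomputed from the group totals and sample sizes reported above. The variable and function names below are illustrative only.

```python
# Totals (million reads) and sample sizes as reported in the text above.
tag_seq = {
    "net": (367.2, 28),
    "electrofishing_immediate": (328.2, 26),
    "electrofishing_5_min": (347.6, 27),
}
neb = {
    "net": (564.0, 5),
    "electrofishing_immediate": (563.4, 5),
    "electrofishing_5_min": (350.4, 4),
}


def pooled_mean(groups):
    """Mean reads per sample (millions): sum of group totals / sum of N."""
    total_reads = sum(total for total, _ in groups.values())
    total_n = sum(n for _, n in groups.values())
    return total_reads / total_n


tag_mean = pooled_mean(tag_seq)
neb_mean = pooled_mean(neb)
fold = neb_mean / tag_mean

print(f"3' Tag-Seq mean: {tag_mean:.2f} million reads per sample")
print(f"NEB mean:        {neb_mean:.2f} million reads per sample")
print(f"NEB / Tag-Seq:   {fold:.1f}x")
```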
After mapping the randomly selected 11 million (3’ Tag-Seq) or 40 million
(NEB) reads (see Materials and Methods) to the reference genome of O.
mykiss, each 3’ Tag-Seq and NEB sample had >8 and >28 million reads,
respectively, available for the analyses of gene expression (Supporting
Information Table S1). The percentage of reads that mapped uniquely to
the O. mykiss genome was similar among all the groups compared in this
study (see % mapping per group above and in Supporting Information
Table S1), suggesting that, with >10 million reads, the two RNA-Seq
library constructions (3’ Tag-Seq and NEB) uniquely map to roughly the
same percentage of the reference genome, even though we used 40 million
reads for the whole mRNA-Seq data instead of the 11 million reads used
for 3’ Tag-Seq.
Raw reads (i.e., before selecting 11M reads for 3’ Tag-Seq and 40M
reads for whole mRNA-Seq) mapped to the rainbow trout (O. mykiss)
genome recovered a different number of genes between the two RNA
library sequencing types, independently of the number of reads mapped
per gene. Specifically, whole mRNA-Seq recovered two to three times more
genes than 3’ Tag-Seq (Supporting Information Table S1). Differential
expression analysis (see below) for the 14 blood samples for which RNA
libraries were built using both 3’ Tag-Seq and mRNA-Seq indicates that
the presence/absence of genes between the two techniques is independent
of gene transcript length (Supporting Information Figure S1).
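A comparison of gene presence/absence between two library types can be expressed with simple set operations on per-gene read counts. The sketch below is illustrative only: the gene IDs, the counts, and the `min_reads` detection threshold are hypothetical assumptions, not values from this study.

```python
def detected_genes(counts, min_reads=1):
    """Return the set of gene IDs with at least `min_reads` mapped reads."""
    return {gene for gene, n in counts.items() if n >= min_reads}


def compare_detection(counts_a, counts_b, min_reads=1):
    """Split genes into those detected by both libraries or by only one."""
    a = detected_genes(counts_a, min_reads)
    b = detected_genes(counts_b, min_reads)
    return {"shared": a & b, "only_a": a - b, "only_b": b - a}


# Hypothetical per-gene read counts for one sample (illustration only).
tag_seq_counts = {"geneA": 120, "geneB": 0, "geneC": 15, "geneD": 0}
mrna_seq_counts = {"geneA": 950, "geneB": 40, "geneC": 210, "geneD": 3}

result = compare_detection(tag_seq_counts, mrna_seq_counts)
for label, genes in result.items():
    print(label, sorted(genes))
```

Raising `min_reads` gives a stricter definition of "detected", which changes how many genes count as present in each library.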