5. Sequencing Data Analysis
We adapted a previously published pipeline from the R package ‘phyloseq’
for the statistical analysis of our microbial sequence data. Once the
sequences were obtained from the sequencing facility, we used FastQC
to check the initial quality of the reads, and we trimmed primer
sequences from them. The per-base quality scores indicated how much
overlap to retain when merging the paired reads. To keep a minimum
Phred score of 25, we truncated the forward reads at 275 bp and the
reverse reads at 225 bp. The Phred score Q expresses the estimated
probability P that a base call is wrong, Q = -10 log10(P), so a score of
25 corresponds to an error probability of about 0.3%. The similarity
cutoff threshold was 99% for these sequence data. We used the
SILVA v138 reference taxonomy dataset to identify the microbial species
present within each sample’s microbial community. Our R script is
available on GitHub
(https://github.com/rusty-russ/Russell-et-al.-Methods-Paper).
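
The relationship between the Phred threshold and the truncation positions described above can be illustrated with a short sketch. This is not the R script from the repository; it is a minimal Python illustration, and the function names, the Phred+33 quality encoding, and the example quality values are assumptions made for demonstration only.

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Convert a Phred score Q to an error probability, P = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p: float) -> float:
    """Convert an error probability P to a Phred score, Q = -10 * log10(P)."""
    return -10 * math.log10(p)


def decode_fastq_quality(qual: str, offset: int = 33) -> list:
    """Decode a FASTQ quality string into integer Phred scores.

    Assumes Phred+33 encoding (an assumption, not stated in the text).
    """
    return [ord(ch) - offset for ch in qual]


def truncation_length(mean_quals: list, min_q: float = 25) -> int:
    """Return how many leading positions keep a mean quality >= min_q.

    mean_quals holds the mean Phred score at each read position
    (hypothetical input; in practice it would come from a quality report).
    """
    for position, q in enumerate(mean_quals):
        if q < min_q:
            return position
    return len(mean_quals)


if __name__ == "__main__":
    # A Phred score of 25 corresponds to roughly a 0.3% error probability.
    print(f"P(error) at Q25: {phred_to_error_prob(25):.4f}")

    # Made-up per-position mean qualities for a short read.
    example_means = [38, 36, 30, 26, 24, 20]
    print("Truncate after position:", truncation_length(example_means))
```

Applied to real per-position mean qualities, a helper like truncation_length would suggest where the forward and reverse reads drop below the chosen threshold; the 275 bp and 225 bp values reported above were chosen by the authors from their own quality profiles.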