Sequence Processing
Raw sequences were approximately 200 bp long. They were quality-trimmed
using Trimmomatic (Bolger, Lohse, & Usadel, 2014): reads were scanned
with a 4-base-wide sliding window and cut when the average quality
within the window dropped below 15. For merging of paired-end reads, we
used the script ‘join-paired-ends’ within the open-source bioinformatics
pipeline Quantitative Insights into Microbial Ecology v.1.8.0 (QIIME;
Caporaso et al., 2010) with a minimum read overlap of 20 bases. Further
analysis was performed following an in-house developed pipeline (Stecher
et al., 2016) also using QIIME v.1.8.0 (Caporaso et al., 2010). Briefly,
reads were quality-filtered according to recommended settings in
Bokulich et al. (2013). Only sequences that fully matched the primer
sequences at both their beginning and end, and that were between 200 and
500 bp in length, were processed further. For
chimera detection and clustering of sequences into OTUs, we used the
QIIME workflow ‘usearch.qf’, which incorporates UCHIME (Edgar et al.,
2011). Pre-clustered sequences were checked for chimeras, both de novo
and against the Silva 119 SSU Ref NR reference database. The remaining
sequence set was clustered de novo into OTUs with a similarity threshold
of 98%. All OTUs
consisting of four or fewer sequences were removed. Quality-filtered
sequences were then classified using ‘Mothur’ (Schloss et al., 2009)
against the Northern Microbial Eukaryote Database (Lovejoy et al.,
2016), with a confidence threshold (-c) of 0.8. All sequences assigned
to metazoans, bryozoans, fungi and Viridiplantae were removed. The
number of reads per library was then rarefied to a uniform depth of
9,081, based on the sample with the lowest read count. Libraries with
fewer than ca. 5,000 reads in at least one of the samples or primer sets
were removed from all datasets to ensure comparability.
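
The filtering and rarefaction steps described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study, which relied on Trimmomatic, QIIME and Mothur. All function names are hypothetical, the sliding-window logic is a simplification that may differ from Trimmomatic's own edge handling, and rarefaction is assumed here to be random subsampling without replacement.

```python
import random
from collections import Counter


def sliding_window_trim(quals, window=4, min_avg=15):
    """Return how many leading bases to keep (simplified sketch).

    Scans left to right and cuts at the start of the first window whose
    mean quality falls below min_avg, comparable to a 4-base window
    with a quality threshold of 15.
    """
    for start in range(0, len(quals) - window + 1):
        if sum(quals[start:start + window]) / window < min_avg:
            return start
    return len(quals)


def length_ok(seq_len, min_len=200, max_len=500):
    """Keep only sequences between 200 and 500 bp (inclusive)."""
    return min_len <= seq_len <= max_len


def drop_rare_otus(otu_counts, max_size=4):
    """Remove OTUs consisting of max_size or fewer sequences."""
    return {otu: n for otu, n in otu_counts.items() if n > max_size}


def rarefy(otu_counts, depth, seed=0):
    """Subsample a library to `depth` reads without replacement.

    Assumption: rarefaction is modelled as drawing reads uniformly at
    random from the pooled reads of one library.
    """
    pool = [otu for otu, n in otu_counts.items() for _ in range(n)]
    if len(pool) < depth:
        raise ValueError("library has fewer reads than the requested depth")
    rng = random.Random(seed)
    return dict(Counter(rng.sample(pool, depth)))
```

In this sketch, a library would first pass through `drop_rare_otus` and then through `rarefy` with `depth=9081`; libraries smaller than the requested depth raise an error rather than being silently retained.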