Genome sampling and assembly
Collection of all fluid samples and total genomic DNA extractions from
those fluids, as well as corresponding physical and geochemical data,
have been described previously (Lau et al., 2014, 2016; Osburn et al., 2014; Magnabosco et al., 2016; Heard et
al., 2017; Momper et al., 2017). All MAGs from North America and
Africa were reconstructed according to the methods of Momper et al. (2017). MAG identifiers and sources are listed in Table
S2. Completeness was calculated using the composite values from five
widely accepted core essential gene metrics. Duplicate copies of any of
these single-copy marker genes were interpreted as a measure of
contamination (Creevey et al., 2011; Dupont et al., 2012;
Wu and Scott, 2012; Campbell et al., 2013; Alneberg et
al., 2014). Individual genomes were then submitted for gene calling and
annotation through the DOE Joint Genome Institute IMG-ER (Integrated
Microbial Genomes Expert Review) pipeline (Markowitz et al.,
2008; Huntemann et al., 2015). For quality control purposes, the
genes flanking every denitrification gene presented in this study were
individually searched against the NCBI RefSeq database using the BLASTp
algorithm, confirming that the top hits for all flanking genes were also to
Chloroflexi. This step ensured that the nitrogen-transforming genes of
interest presented here were not simply on scaffolds that had been
incorrectly binned into a putative Chloroflexi genome.
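
The two quality checks described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study. It assumes that the composite value is a simple mean over marker sets, that contamination is the percentage of markers found in more than one copy, and that the phylum of each top BLASTp hit has already been tabulated. The function names and the toy marker sets in the usage example are hypothetical.

```python
from collections import Counter
from statistics import mean


def marker_set_stats(hits, marker_set):
    """Return (completeness, contamination) in percent for one marker set.

    hits: marker-gene names detected in the MAG, one entry per copy found.
    marker_set: single-copy marker-gene names, each expected exactly once.
    """
    markers = set(marker_set)
    counts = Counter(h for h in hits if h in markers)
    present = len(counts)  # markers seen at least once
    duplicated = sum(1 for c in counts.values() if c > 1)  # markers seen 2+ times
    completeness = 100.0 * present / len(markers)
    contamination = 100.0 * duplicated / len(markers)
    return completeness, contamination


def composite_stats(hits, marker_sets):
    """Average both values over several marker sets (assumed composite rule)."""
    per_set = [marker_set_stats(hits, s) for s in marker_sets]
    return mean(c for c, _ in per_set), mean(d for _, d in per_set)


def flanking_genes_consistent(top_hit_phyla, expected="Chloroflexi"):
    """True if every flanking gene's top hit belongs to the expected phylum.

    top_hit_phyla: dict mapping flanking-gene ID -> phylum of its best hit.
    An empty dict returns False, since nothing was confirmed.
    """
    return bool(top_hit_phyla) and all(
        phylum == expected for phylum in top_hit_phyla.values()
    )


# Toy usage with two hypothetical marker sets and one duplicated marker.
toy_sets = [{"rpoB", "recA", "gyrB", "rpsC"}, {"rpoB", "recA"}]
toy_hits = ["rpoB", "recA", "recA", "gyrB"]
completeness, contamination = composite_stats(toy_hits, toy_sets)
```

In this sketch a scaffold would pass the flanking-gene check only if every neighbouring gene agrees with the bin's taxonomy, which mirrors the conservative criterion stated in the text.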