Shotgun metagenomics of soil invertebrate communities reflects taxonomy, biomass and reference genome properties
  • Alexandra Schmidt,
  • Clément Schneider,
  • Peter Decker,
  • Karin Hohberg,
  • Jörg Römbke,
  • Ricarda Lehmitz,
  • Miklos Balint
Alexandra Schmidt
Universität Konstanz

Clément Schneider
Senckenberg Gesellschaft für Naturforschung

Peter Decker
Senckenberg Museum of Natural History Görlitz Library

Karin Hohberg
Senckenberg Museum of Natural History

Jörg Römbke
ECT Oekotoxikologie, Flörsheim, Germany

Ricarda Lehmitz
Senckenberg Museum für Naturkunde Görlitz

Miklos Balint
Senckenberg Biodiversity Climate Research Center

Abstract

Metagenomics, the shotgun sequencing of all DNA fragments from a community DNA extract, is routinely used to describe the composition, structure and function of microorganism communities. Advances in DNA sequencing and the availability of genome databases increasingly allow the use of shotgun metagenomics on eukaryotic communities. Metagenomics offers major advances in the recovery of biomass relationships in comparison to approaches based on taxonomic marker genes (metabarcoding). However, little is known about the factors that influence metagenomics data from eukaryotic communities, such as differences among organism groups and the properties of reference genomes and genome assemblies. We evaluated how shotgun metagenomics records composition and biomass in artificial soil invertebrate communities. We generated mock communities of controlled biomass ratios from 28 species from all major soil mesofauna groups: mites, springtails, nematodes, tardigrades and potworms. We shotgun-sequenced these communities and taxonomically assigned the reads with a database of over 270 soil invertebrate genomes. We recovered 90% of the species and observed relatively high false positive detection rates. We found strong differences in the number of reads assigned to different taxa, with some groups consistently attracting more hits than others. Biomass could be predicted from read counts after considering taxon-specific differences. Larger genomes and more complete assemblies consistently attracted more reads than smaller genomes and less complete assemblies. The GC content of the genome assemblies had no effect on the biomass-read relationships. The results show considerable differences in taxon recovery and in the taxon specificity of biomass recovery from metagenomic sequence data. Properties of reference genomes and genome assemblies also influence biomass recovery, and they should be considered in metagenomic studies of eukaryotes. We provide a roadmap for investigating factors that influence metagenomics-based eukaryotic community reconstructions. Understanding these factors is timely, as the accessibility of DNA sequencing and the momentum of reference genome projects point to a future where the taxonomic assignment of DNA from any community sample becomes a reality.

Peer review status: UNDER REVIEW

27 Nov 2021: Submitted to Ecology and Evolution
29 Nov 2021: Assigned to Editor
29 Nov 2021: Submission Checks Completed
29 Nov 2021: Reviewer(s) Assigned