SCNIC: Sparse Correlation Network Investigation for Compositional Data
  • Michael Shaffer,
  • Kumar Thurimella,
  • John Sterrett,
  • Catherine Lozupone
Michael Shaffer
University of Colorado - Anschutz Medical Campus

Corresponding Author: michael.t.shaffer@colostate.edu

Kumar Thurimella
University of Colorado - Anschutz Medical Campus
John Sterrett
University of Colorado at Boulder
Catherine Lozupone
University of Colorado - Anschutz Medical Campus

Abstract

Background: Microbiome studies are often limited by low statistical power, a consequence of small sample sizes combined with a large number of features. The problem is exacerbated in correlative studies of multi-omic datasets. Statistical power can be increased by finding and summarizing modules of correlated observations, which is a form of dimensionality reduction. Modules also provide biological insight, because correlated groups of microbes may reflect relationships among their members.

Results: To address these challenges, we developed SCNIC: Sparse Cooccurrence Network Investigation for compositional data. SCNIC is open-source software that generates correlation networks and detects and summarizes modules of highly correlated features. Modules can be formed using either the Louvain Modularity Maximization (LMM) algorithm or a Shared Minimum Distance (SMD) algorithm, which we introduce here and relate to LMM using simulated data. Applying SCNIC to two published datasets increased statistical power and identified microbes that not only differed across groups but also correlated strongly with each other, suggesting shared environmental drivers or cooperative relationships among them.

Conclusions: SCNIC provides an easy way to generate correlation networks, identify modules of correlated features, and summarize them for downstream statistical analysis. Although SCNIC was designed with properties of microbiome data in mind, such as compositionality and sparsity, it can be applied to other data types, including metabolomics data, and used to integrate multiple data types. SCNIC allows functional microbial relationships to be identified at scale while increasing statistical power through feature reduction.
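
The general idea of building a correlation network, grouping correlated features into modules, and collapsing each module into one feature can be illustrated with a minimal sketch. This is not SCNIC's code or API. It rests on several assumptions made only for illustration: plain Pearson correlation stands in for a compositionality-aware measure, modules are taken to be connected components of the thresholded network (neither LMM nor SMD), each module is summarized by summing its members per sample, and the function names, the 0.8 threshold, and the toy table are all hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def find_modules(table, min_r=0.8):
    """Group features linked by pairwise correlations of at least min_r.

    Simplification: modules are connected components of the thresholded
    network, found with union-find. This is NOT the LMM or SMD algorithm.
    """
    parent = {name: name for name in table}

    def root(name):
        while parent[name] != name:
            name = parent[name]
        return name

    for a, b in combinations(table, 2):
        if pearson(table[a], table[b]) >= min_r:
            parent[root(a)] = root(b)

    groups = {}
    for name in table:
        groups.setdefault(root(name), []).append(name)
    # Only groups with two or more features count as modules.
    return [sorted(g) for g in groups.values() if len(g) > 1]


def summarize(table, modules):
    """Replace each module's members with one feature: their per-sample sum."""
    in_module = {name for module in modules for name in module}
    reduced = {k: list(v) for k, v in table.items() if k not in in_module}
    for i, module in enumerate(modules):
        columns = zip(*(table[m] for m in module))
        reduced[f"module_{i}"] = [sum(col) for col in columns]
    return reduced


# Hypothetical toy table: feature name -> counts across five samples.
toy = {
    "otu_a": [10, 20, 30, 40, 50],
    "otu_b": [12, 19, 33, 41, 52],
    "otu_c": [50, 10, 40, 20, 30],
    "otu_d": [5, 5, 6, 90, 4],
}
modules = find_modules(toy)
reduced = summarize(toy, modules)
```

In this toy table only otu_a and otu_b are strongly correlated, so four features become three. Fewer features means fewer hypotheses tested downstream, which is the source of the power gain described in the abstract.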
28 May 2022: Submitted to Molecular Ecology Resources
08 Jun 2022: Reviewer(s) Assigned
27 Jun 2022: Review(s) Completed, Editorial Evaluation Pending
18 Jul 2022: Editorial Decision: Revise Minor
17 Aug 2022: Review(s) Completed, Editorial Evaluation Pending
17 Aug 2022: 1st Revision Received
18 Aug 2022: Editorial Decision: Accept
Sep 2022: Published in Molecular Ecology Resources. DOI: 10.1111/1755-0998.13704