A universal tool for marine metazoan species identification -- Towards
best practices in proteomic fingerprinting
Abstract
Proteomic fingerprinting using MALDI-TOF mass spectrometry is a
well-established tool for identifying microorganisms and has shown
promising results for identification of animal species, particularly
disease vectors and marine organisms. It can therefore be a vital tool for
biodiversity assessments in ecological studies. However, few studies
have tested species identification across different orders and classes.
In this study, we collected data from 1,246 specimens of 198 species to
test species identification in a diverse dataset. We also evaluated
different specimen preparation and data processing approaches for
machine learning and developed a workflow to optimize classification
using random forest. Classification success rates exceeded 90%,
but we also found that the size of the reference library affects
classification error. Additionally, we demonstrated the ability of the
method to differentiate marine cryptic-species complexes and to
distinguish sexes within species.
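
As a rough illustration of the classification step summarized above, the following minimal sketch trains a small random forest on synthetic peak-intensity vectors and reports the share of held-out specimens assigned to the correct species. Everything in it is an assumption made for illustration: the spectra are simulated rather than measured, the function names (make_spectra, fit_forest, predict_forest) and all parameter values are hypothetical, and the forest is a simplified hand-written version, not the workflow or software used in this study.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(X, y, n_try, rng):
    """Search a random subset of features for the lowest weighted Gini split."""
    best = None  # (score, feature index, threshold)
    for f in rng.sample(range(len(X[0])), n_try):
        values = sorted(set(row[f] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # midpoint, so both sides are non-empty
            left = [y[i] for i, row in enumerate(X) if row[f] <= t]
            right = [y[i] for i, row in enumerate(X) if row[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best


def grow_tree(X, y, n_try, rng, depth=0, max_depth=8):
    """Grow one tree; a leaf is a label, an inner node is a 4-tuple."""
    majority = Counter(y).most_common(1)[0][0]
    if len(set(y)) == 1 or depth == max_depth:
        return majority
    split = best_split(X, y, n_try, rng)
    if split is None:  # sampled features were all constant
        return majority
    _, f, t = split
    left = [i for i, row in enumerate(X) if row[f] <= t]
    right = [i for i, row in enumerate(X) if row[f] > t]
    return (
        f,
        t,
        grow_tree([X[i] for i in left], [y[i] for i in left], n_try, rng, depth + 1, max_depth),
        grow_tree([X[i] for i in right], [y[i] for i in right], n_try, rng, depth + 1, max_depth),
    )


def predict_tree(node, row):
    """Walk from the root to a leaf and return its label."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node


def fit_forest(X, y, n_trees=25, seed=0):
    """Train trees on bootstrap samples, trying sqrt(n_features) per split."""
    rng = random.Random(seed)
    n_try = max(1, int(len(X[0]) ** 0.5))
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in X]  # bootstrap sample
        forest.append(grow_tree([X[i] for i in idx], [y[i] for i in idx], n_try, rng))
    return forest


def predict_forest(forest, row):
    """Majority vote over all trees."""
    return Counter(predict_tree(tree, row) for tree in forest).most_common(1)[0][0]


def make_spectra(n_species, n_per_species, n_bins=20, noise=0.1, seed=1):
    """Simulate binned peak intensities: one random profile per species plus noise."""
    rng = random.Random(seed)
    X, y = [], []
    for s in range(n_species):
        profile = [rng.random() for _ in range(n_bins)]
        for _ in range(n_per_species):
            X.append([max(0.0, p + rng.gauss(0, noise)) for p in profile])
            y.append(f"species_{s}")
    return X, y


# Synthetic stand-in for a reference library: 5 species, 8 specimens each.
X, y = make_spectra(n_species=5, n_per_species=8)
train = [i for i in range(len(X)) if i % 4 != 0]
test = [i for i in range(len(X)) if i % 4 == 0]  # 2 held-out specimens per species

forest = fit_forest([X[i] for i in train], [y[i] for i in train])
predictions = [predict_forest(forest, X[i]) for i in test]
success_rate = sum(p == y[i] for p, i in zip(predictions, test)) / len(test)
print(f"success rate on held-out synthetic specimens: {success_rate:.2f}")
```

Rerunning the sketch with different values of n_species or n_per_species is one simple way to explore how the composition of a simulated reference library changes the error; results on such simulated data say nothing about the figures reported here.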