Abstract
Metazoa-level Universal Single-Copy Orthologs (mzl-USCOs) are
universally applicable markers for DNA taxonomy in animals that can
replace or supplement single-gene barcodes. While mzl-USCOs from target
enrichment data were previously shown to reliably distinguish species,
here we tested whether USCOs are an evenly distributed, representative
sample of a given metazoan genome and therefore able to cope with past
hybridization events and incomplete lineage sorting. This is relevant
for coalescent-based species delimitation approaches, which critically
depend on the assumption that the investigated loci do not exhibit
autocorrelation due to physical linkage. Based on 239 chromosome-level
genome assemblies, we confirmed that mzl-USCOs are, for practical
purposes, genetically unlinked and constitute a representative sample of
a genome in terms of both the reciprocal distances between USCOs on a
chromosome and their distribution across chromosomes. We tested the
suitability of mzl-USCOs extracted from genomes for species delimitation
and phylogeny in four case studies: Anopheles mosquitoes, Drosophila
fruit flies, Heliconius butterflies, and Darwin’s finches. In almost all
instances, USCOs allowed delineating species and yielded phylogenies
that correspond to those generated from whole-genome data. Our
phylogenetic analyses indicate that USCOs can complement single-gene
DNA barcodes and provide more accurate taxonomic inferences. Combining
USCOs from sources that used different versions of ortholog reference
libraries to infer marker orthology may be challenging and may at times
affect taxonomic conclusions. However, we expect this problem to become
less severe as the rapidly growing number of reference genomes provides
a better representation of the number and diversity of organismic
lineages.