Selection of ORF target regions
Mapping the clustered ORFs from our supertranscriptome to the BUSCO Metazoa_odb9 database, we retained 633 single-copy and 334 duplicated BUSCO hits (Table 2, last column), of which 633 and 186, respectively, were kept as likely single-copy orthologous targets across our ingroup taxa. Evaluation of orthology for ORFs that mapped to the Unioverse probe set suggested that 186 of the 811 Unioverse loci (22.9%) are affected by homology issues in our ingroup taxa (which are among the taxa for which the Unioverse probe set was designed). In most of these cases, several divergent ORFs mapped to a single Unioverse locus, suggesting paralogy. We also observed instances where Unioverse loci were not orthologous to their ‘associated’ Bathymodiolus target region, as indicated by less sequence divergence between Bathymodiolus and the matching fragment of our ingroup ORFs than between Bathymodiolus and the associated Unioverse loci. Nevertheless, our evaluation suggested that most Unioverse loci are single-copy orthologs in Coelaturini, which resulted in the addition of 297 ORF targets from the Unioverse probe set (usually, several Unioverse loci map to a single ORF). Mapping ORFs and subregions against each other led to the removal of one ORF from the duplicated BUSCO selection and another from the Unioverse set, leaving a total of 1,114 retained ORFs that cover 1,677,936 nucleotides (on average 1,506 nt/ORF).
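The counts reported above can be cross-checked with a short tally. The following is a minimal sketch in Python that uses only the figures stated in this section; the variable names are illustrative and are not part of the original analysis pipeline.

```python
# Counts taken from the text above; variable names are illustrative only.
busco_single_copy = 633            # single-copy BUSCO hits kept as targets
busco_duplicated_kept = 186        # duplicated BUSCO hits kept as targets
unioverse_added = 297              # ORF targets added from the Unioverse probe set
removed_after_cross_mapping = 2    # one duplicated-BUSCO ORF and one Unioverse ORF

# Total number of retained ORF targets
total_orfs = (busco_single_copy + busco_duplicated_kept
              + unioverse_added - removed_after_cross_mapping)

# Mean ORF length from the total covered nucleotides
total_nt = 1_677_936
mean_orf_length = total_nt / total_orfs

# Share of Unioverse loci flagged with homology issues
unioverse_loci = 811
unioverse_flagged = 186
flagged_fraction = unioverse_flagged / unioverse_loci

print(total_orfs)                 # 1114
print(round(mean_orf_length))     # 1506
print(f"{flagged_fraction:.1%}")  # 22.9%
```

The tally reproduces the reported total of 1,114 ORFs, the mean length of about 1,506 nt per ORF, and the 22.9% of Unioverse loci affected by homology issues.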