Keywords
Cephalopholis sonnerati, chromosome-level genome assembly, genome annotation, comparative genome analysis
Introduction
Groupers (subfamily Epinephelinae, Serranidae, Percoidei, Perciformes), the largest subfamily in the family Serranidae, comprise more than 160 species in 16 genera (Zhang et al., 2013). These commercially important fishes are characterized by a long lifespan, large size, slow growth, vulnerability and delayed reproduction (Morris et al., 2000). They usually inhabit coral reefs along tropical and subtropical coasts. Among them, the genus Cephalopholis is the most abundant serranid in the Gulf of Aqaba (Red Sea) (Shpigel & Fishelson, 2010).
The tomato hind Cephalopholis sonnerati (Valenciennes) (Serranidae), belonging to the genus Cephalopholis, is a bottom-dwelling coral reef fish found at depths of 12–120 m in the Indo-Pacific and Red Sea. C. sonnerati are protogynous hermaphrodites that feed on small fish and invertebrates (Shpigel, 1985; Shpigel & Fishelson, 1989a,b; Shpigel & Fishelson, 2010). They are also characterized by complex social structures and behavioural mechanisms. They naturally form social groups in which several females occupy individual territories within the male’s larger territory (Meyer, 2008; Shpigel & Fishelson, 1989b). However, due to overfishing, anthropogenic activities and water pollution, the natural populations of C. sonnerati have declined (Hawkins & Roberts, 1994). Previous studies of the genus Cephalopholis mainly focused on fishery management, species conservation (Galal-Khallaf et al., 2018), behavioural biology (Shpigel & Fishelson, 2010), nutrition biology, and phylogeographic biology (Gaither et al., 2011). Nevertheless, owing to the lack of genomic resources, molecular-genetic studies and genomic breeding remain unexplored in this species.
PacBio single-molecule real-time (SMRT) sequencing, a third-generation sequencing technology, generates long reads with uniform coverage and high consensus accuracy, in contrast to second-generation sequencing technology, which generates short reads (Rhoads & Au, 2015). Moreover, third-generation sequencing technology is less expensive than second-generation sequencing technology and does not depend on amplification for library generation (Ze-Gang & Shao-Wu, 2018). Additionally, Hi-C, a chromosome conformation capture-based method, can convert chromatin interactions, which reflect topological chromatin structures, into digital information (Belaghzal et al., 2017). It has now become a mainstream technology in 3D genomics. Although the genome sequences of more than 270 aquatic organisms have been published (https://www.ncbi.nlm.nih.gov/genome/browse#!/overview/fish), genome sequences are available for only three grouper species: the giant grouper Epinephelus lanceolatus (Zhou et al., 2019), the red-spotted grouper Epinephelus akaara (Ge et al., 2019) and the leopard coral grouper Plectropomus leopardus (Zhou et al., 2020). Therefore, obtaining genome sequences of more grouper species is important for research on the classification, evolution, genetics and biology of groupers.
In the present study, we report the first chromosome-level genome assembly of C. sonnerati, obtained using PacBio long-read sequencing and Hi-C sequencing technologies. This reference genome will lay a solid foundation for studies on the genetic conservation, resistance breeding and evolution of C. sonnerati.