K-mer analysis and evaluation of genome size
All paired-end reads generated in this study were cleaned using fastp
(Chen, Zhou, Chen, & Gu, 2018) under default settings. To estimate the
T. dalaica genome size by k-mer analysis, 37.5 Gb of clean data, about 50X
coverage of the estimated genome, were randomly selected from the
whole-genome sequencing data. The depth distribution of effective
17-mers was estimated using Jellyfish version 2.2.10 (Marcais &
Kingsford, 2011), with the depth of the main peak taken as Kdepth. Genome
size was then estimated using the following formula: \(Genome\_Size\ =\ Knum/Kdepth\), where Knum is the total number of effective 17-mers. Genome heterozygosity was estimated
using GenomeScope 2.0 (Vurture et al., 2017).
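
The genome size calculation described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study. It assumes a two-column k-mer histogram (depth, number of distinct k-mers at that depth), which is the layout written by `jellyfish histo`. The function names, the toy histogram, and the rule for discarding low-depth error k-mers (everything below the first local minimum) are assumptions made for illustration only; the text does not state how effective k-mers were defined.

```python
from typing import List, Optional, Tuple

Histogram = List[Tuple[int, int]]  # (depth, number of distinct k-mers at that depth)


def parse_histo(text: str) -> Histogram:
    """Parse whitespace-separated 'depth count' lines into a sorted histogram."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            rows.append((int(parts[0]), int(parts[1])))
    return sorted(rows)


def first_valley(histo: Histogram) -> int:
    """Return the depth of the first local minimum (assumed error cutoff)."""
    for i in range(len(histo) - 1):
        if histo[i][1] < histo[i + 1][1]:
            return histo[i][0]
    return histo[0][0]


def estimate_genome_size(
    histo: Histogram, min_depth: Optional[int] = None
) -> Tuple[float, int, int]:
    """Return (genome_size, kdepth, knum) using Genome_Size = Knum / Kdepth."""
    if min_depth is None:
        # Assumption: k-mers below the first valley are sequencing errors.
        min_depth = first_valley(histo)
    effective = [(d, c) for d, c in histo if d >= min_depth]
    # Kdepth: depth of the main peak among effective k-mers.
    kdepth = max(effective, key=lambda row: row[1])[0]
    # Knum: total number of effective k-mer occurrences.
    knum = sum(d * c for d, c in effective)
    return knum / kdepth, kdepth, knum


# Toy histogram (hypothetical values, not data from this study).
toy = parse_histo("1 1000\n2 200\n3 50\n4 100\n5 400\n6 900\n7 400\n8 100")
size, kdepth, knum = estimate_genome_size(toy)
print(f"Kdepth={kdepth}, Knum={knum}, Genome_Size={size:.0f}")
```

On real data the histogram would have many more rows and the cutoff for error k-mers may need to be checked by eye against the depth distribution. As a rough consistency check on the sampling step, 37.5 Gb at about 50X corresponds to a prior genome size of approximately 0.75 Gb.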