Abstract
Northern red oak (Quercus rubra L.) is an ecologically and economically
important forest tree native to the northeastern United States. We
present a chromosome-scale, haplotype-resolved genome of Q. rubra, a
representative red oak species, generated by combining PacBio
sequencing with chromatin conformation capture (Hi-C) scaffolding. This is
the first reference genome from the red oak clade (section Lobatae). The
Q. rubra assembly spans 739 megabases (Mb), with 95.27% of the genome
sequence scaffolded into 12 chromosomes, and contains 33,333
protein-coding genes. Comparisons to the genomes of Q. lobata and
Q. mongolica reveal high collinearity, with intrachromosomal structural
variants present.
Orthologous gene family analysis with other oak and rosid tree species
revealed that gene families associated with defense response were
expanding and contracting simultaneously across the Q. rubra genome.
Quercus rubra had the most CC-NBS-LRR and TIR-NBS-LRR resistance genes
of the nine species analyzed. Terpene synthase gene family
comparisons further reveal tandem gene duplications in the TPS-b
subfamily, similar to Q. robur. Single major QTL regions were identified
for vegetative bud break and for marcescence; these contain candidate
genes for further research, including a putative ortholog of the
circadian clock constituent cryptochrome (CRY2) and a family of eight
tandemly duplicated genes for serine protease inhibitors, respectively.
Genome-environment associations across natural populations identified
candidate abiotic stress tolerance genes and predicted performance in a
common garden. This high-quality red oak genome is an essential
resource for the oak genomics community and will further advance
knowledge of Quercus genomics.