2.8 Physical Genome Intervals
Physical genome intervals in the P. trichocarpa genome (v3.0) were examined for each significant QTL for biotic associations. The intervals were defined as 1 Mb regions centered on the marker with the highest LOD score. Fixed physical interval sizes were used rather than LOD-based intervals because the magnitude of LOD varied widely among the significant QTL. For example, 1-LOD intervals centered on the QTL ranged in size from 169 to 4620 kb. Much of this variation was likely due to differences in marker density and local recombination rates, in addition to phenotyping and genotyping error. We believe that a fixed 1 Mb interval is a more consistent and conservative approach. On average, this represents approximately 6.34 cM, based on a total map size of 2617 cM and a total assembled genome length of 420 Mb.
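The fixed-interval definition can be illustrated with a minimal sketch. The function name, the 0-based half-open coordinate convention, and the clipping of windows at chromosome ends are assumptions made for illustration; the text does not state how QTL near chromosome ends were handled.

```python
def fixed_interval(peak_pos, chrom_len, width=1_000_000):
    """Return (start, end) of a window of `width` bp centered on the peak marker.

    Coordinates are 0-based, half-open (BED-like). Clipping the window to
    [0, chrom_len] is an assumption, not a rule taken from the text.
    """
    half = width // 2
    start = max(0, peak_pos - half)          # do not run off the chromosome start
    end = min(chrom_len, peak_pos + half)    # do not run past the chromosome end
    return start, end


# Hypothetical peak marker at 5 Mb on a 20 Mb chromosome
interval = fixed_interval(5_000_000, 20_000_000)
```

Under this convention, an unclipped window spans exactly 1 Mb regardless of the LOD profile around the peak, which is the property that motivates the fixed-size approach.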
Orthologous intervals were identified in the P. deltoides clone WV94 reference genome (v2.1) obtained from Phytozome (Goodstein et al., 2012). Orthology was determined from a combination of protein sequence conservation and synteny using MCScanX (Wang et al., 2012). Briefly, all proteins were compared in all-vs-all blastp searches, both within and between genomes. The resulting hits were then chained into collinear segments using the MCScanX algorithm. Orthologous segments were identified based on the presence of large numbers of gene pairs in collinear order with high sequence similarity (median blastp E-value < 1e-180) (Figure 2). Synonymous (Ks) and nonsynonymous (Ka) nucleotide substitution rates were calculated using the BioPerl DNAStatistics module (Stajich, 2002) (Table S1). Domain composition (Table S2) and Gene Ontology (GO) terms (Table S3) were obtained for each genome from Phytozome (v12.1).
Intervals were customized for the grandparents of the pseudo-backcross progeny (clones 93-968 for P. trichocarpa and D124 for P. deltoides) using approximately 150X depth of 2x250 paired-end Illumina sequences. Reads were aligned to the reference genome of the respective species using bwa mem with default parameters. SNPs and small indels were identified using samtools mpileup (Li, 2011; Li et al., 2009), and sequence depth was extracted using vcftools (Danecek et al., 2011). The reference sequences were then converted to clone-specific sequences using the vcftools utility vcf-consensus. Genes with no coverage in the alignments were excluded from the intervals for each species.
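The final exclusion step can be sketched as follows. This is an illustration under stated assumptions rather than the pipeline actually used: the function name and input structures are hypothetical, and "no coverage" is interpreted here as zero read depth at every position of the gene.

```python
def covered_genes(genes, depth):
    """Return IDs of genes with at least one covered position.

    genes: list of (gene_id, start, end) tuples, 0-based half-open,
           all on one chromosome (hypothetical input layout).
    depth: dict mapping position -> read depth on that chromosome;
           positions absent from the dict are treated as depth 0.
    """
    kept = []
    for gene_id, start, end in genes:
        # Assumption: a gene is excluded only if every position has depth 0
        if any(depth.get(pos, 0) > 0 for pos in range(start, end)):
            kept.append(gene_id)
    return kept


# Toy example with made-up gene IDs and depths
toy_genes = [("geneA", 0, 5), ("geneB", 10, 15)]
toy_depth = {2: 140, 3: 152}
retained = covered_genes(toy_genes, toy_depth)
```

In this toy example only the first gene overlaps positions with nonzero depth, so the second gene would be dropped from the interval for that clone.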