2.9 Tandem Duplications
Tandemly duplicated genes were identified using all-vs-all blastp
searches within each genome for the biotic stress-associated intervals
(Table S4). Genes with blastp E-values < 1e-180 that were
located within 500 kb of one another were considered to be recent tandem
duplications. The window size was determined by testing a range of
values and choosing the window size at which the number of
newly discovered tandem duplicates began to decline (Figure 3). The
stringent E-value cutoff was intended to focus the analysis on genes
that are recently duplicated and therefore potentially differentially
duplicated between the species. The QTL intervals were tested for
significant enrichment of tandem duplicates by using a Monte Carlo
simulation. For each QTL interval, a set of contiguous genes equal in
number to those in the interval was randomly selected from the whole
genome, and the number of tandem duplicates in the sampled set was
counted. This was repeated 10,000 times, and the observed number of
tandem duplicates was compared to the simulated distribution to derive
an empirical P-value.
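
The two steps described above can be illustrated with a minimal sketch. It is not the code used in this study: the function names, the input layout (blastp hits as (query, subject, E-value) tuples, gene positions as (chromosome, midpoint in bp)), the treatment of the genome as a single ordered gene list, the one-sided test, and the +1 correction in the empirical P-value are all assumptions made for illustration.

```python
import random

def tandem_pairs(hits, positions, max_evalue=1e-180, window_bp=500_000):
    """Return gene pairs that pass the E-value and distance filters.

    hits      : iterable of (query, subject, evalue) from all-vs-all blastp
    positions : dict mapping gene -> (chromosome, midpoint_bp)  [assumed layout]
    """
    pairs = set()
    for query, subject, evalue in hits:
        if query == subject or evalue >= max_evalue:
            continue  # skip self-hits and hits that fail the cutoff
        if query not in positions or subject not in positions:
            continue
        q_chrom, q_pos = positions[query]
        s_chrom, s_pos = positions[subject]
        if q_chrom == s_chrom and abs(q_pos - s_pos) <= window_bp:
            pairs.add(frozenset((query, subject)))  # unordered pair
    return pairs

def empirical_p(gene_order, tandem_genes, interval_size, observed,
                n_iter=10_000, seed=0):
    """One-sided empirical P-value for enrichment of tandem duplicates.

    gene_order : all genes in genome order (assumption: one flat list, so
                 sampled windows may span chromosome boundaries)
    """
    n_starts = len(gene_order) - interval_size + 1
    if interval_size < 1 or n_starts < 1:
        raise ValueError("interval_size must be between 1 and the gene count")
    # Prefix sums give the tandem count of any contiguous window in O(1).
    prefix = [0]
    for gene in gene_order:
        prefix.append(prefix[-1] + (gene in tandem_genes))
    rng = random.Random(seed)
    at_least_observed = 0
    for _ in range(n_iter):
        start = rng.randrange(n_starts)
        count = prefix[start + interval_size] - prefix[start]
        if count >= observed:
            at_least_observed += 1
    # +1 correction (an assumption here) keeps the estimate above zero.
    return (at_least_observed + 1) / (n_iter + 1)

# Toy example: ten genes spaced 100 kb apart on one chromosome.
genes = [f"g{i}" for i in range(10)]
positions = {g: ("chr1", i * 100_000) for i, g in enumerate(genes)}
hits = [
    ("g0", "g1", 1e-200), ("g1", "g0", 1e-200),  # close and significant
    ("g2", "g9", 0.0),                           # significant but 700 kb apart
    ("g3", "g4", 1e-50),                         # close but fails the cutoff
    ("g5", "g5", 0.0),                           # self-hit
]
pairs = tandem_pairs(hits, positions)
tandem_genes = set().union(*pairs)
p_value = empirical_p(genes, tandem_genes, interval_size=2, observed=2,
                      n_iter=1000, seed=1)
```

In the toy example only the g0/g1 pair passes both filters, so an interval of two genes containing two tandem duplicates is matched by only one of the nine possible sampled windows.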