Discussion
Assessment of Northern Wild Rice via Genotyping-By-Sequencing. Cost-effective sequencing technologies that are capable of generating robust sets of genome-wide molecular markers, such as GBS, are providing researchers, especially those working with complex plant genomes and limited public resources, an avenue for rapid variant detection64–68. In this study, we developed a large genome-wide SNP dataset, aligned to the NWR reference genome, to assess the relationship between natural and cultivated populations as well as to provide a basis for future breeding and conservation studies. However, we want to emphasize that this GBS approach, which is based on the BtgI and TaqI restriction enzymes, may have introduced a bias in allele frequencies due to polymorphisms in restriction sites69, which could have led to slightly skewed inferences regarding the population genetics of NWR.
Relationships within the Zizania Genus. Species of Zizania are endemic to North America 70,71 and split from the Oryzinae subtribe 20-30 million years ago (MYA)34,72,73. Following this split, the Zizaniinae subtribe is hypothesized to have experienced a radiation of speciation across North America and into eastern Asia as individuals made it over the Bering land bridge, leading to the speciation of Zizania latifolia 74. Comparison of the Z. palustris and Z. latifolia genomes suggests that the two species split 6-8 MYA 34. Extant North American species split 0.7-1.1 MYA; this split was likely precipitated by increases in the habitat range of the Zizania progenitor species as climatic conditions shifted over the last million years 71,75. Evidence suggests that Zizania texana, an endangered species living in a small stretch of the San Marcos River Valley in the southern US, is a relic, isolated population of the ancestral Z. palustris species 74. However, the evolutionary relationship between Z. aquatica, a species found from the Great Lakes region to the east coast of the continental US, and Z. palustris is not well understood. This is likely due to their overlapping range and interspecific crossability 74,76. In this study, the UPGMA, STRUCTURE, and principal coordinate analyses showed only moderate support for the separation of Z. palustris and Z. aquatica. Fst values were primarily affected by geographic distance and were highest within Z. palustris rather than between Z. palustris and Z. aquatica (Figure 4). This may be due to the limited sample size in this study, and more research is needed to resolve the complex relationship between these Zizania species.
Structure of Northern Wild Rice Populations is Tied to Geography and a Complex History of Ecosystem Management. Previous genetic analyses of wild populations of NWR have found limited gene flow between populations, and a lack of data to support a correlation between population structure and geographical location5,24,74. In this study, we did not find evidence of significant gene flow among wild populations of NWR, with populations by and large clustering according to their lake or river of origin (Figures 2, 3, and S3). This level of differentiation was also evidenced by the moderate Fst values (0.05-0.15)77 found between Natural Stand populations (Figure 4). However, the majority of our analyses, including the Mantel test (Figure S6), suggest a geographic basis for population structure in NWR. For example, Garfield Lake and Necktie River, the two closest populations in our study, displayed a high level of similarity with one another (Figure 2b; Figure 3). Therefore, it appears that while gene flow is limited in NWR, it likely occurs between populations in close proximity. More research is needed to understand the spatial dynamics of NWR populations, whose aquatic habitats are often discrete and fragmented. While the primary drivers of gene flow in NWR are not well understood, Lu et al. (2005) found that the area and size of an NWR population, along with its degree of isolation, were major factors affecting the genetic variability and gene flow among the NWR populations tested. Additionally, recent pollen travel studies found that most pollen is dispersed within the first 7 m for Z. palustris 78 and 1.5 m for Z. texana 79, limiting the likelihood of high levels of gene flow via wind-pollination in the genus.
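The isolation-by-distance reasoning above can be illustrated with a permutation Mantel test, which asks whether pairwise genetic distances track pairwise geographic distances more closely than expected by chance. The following is a minimal Python sketch, not the analysis pipeline used in this study: the function names, the six population positions, and the distance values are hypothetical and chosen only for illustration.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def upper_triangle(matrix, order):
    """Flatten the upper triangle after relabeling populations by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def mantel_test(genetic, geographic, permutations=999, seed=0):
    """Return (r, one-sided p-value) for two symmetric distance matrices."""
    identity = list(range(len(genetic)))
    geo_vec = upper_triangle(geographic, identity)
    observed = pearson(upper_triangle(genetic, identity), geo_vec)
    rng = random.Random(seed)
    hits = 0
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # permute population labels of one matrix only
        if pearson(upper_triangle(genetic, order), geo_vec) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Hypothetical example: six populations placed along a line (positions in km),
# with genetic distance increasing linearly with geographic distance.
positions_km = [0, 5, 40, 60, 150, 300]
geographic = [[abs(a - b) for b in positions_km] for a in positions_km]
genetic = [
    [0.0 if i == j else 0.05 + 0.0005 * geographic[i][j] for j in range(6)]
    for i in range(6)
]

r, p = mantel_test(genetic, geographic)
print(f"Mantel r = {r:.3f}, p = {p:.3f}")
```

Because population labels, rather than individual distances, are permuted, the test respects the non-independence of pairwise distances. In practice, an established implementation with real distance matrices would be preferable to this sketch.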
The historical management and development of lakes across NWR’s natural range have likely contributed to the population structure identified in this study. Efforts to establish new stands of NWR as well as to address declining population sizes have resulted in reseeding efforts across the species’ natural range 80,81. For example, Upper Rice Lake (RRN), which is known to have undergone extensive reseeding efforts since the 1930s (Dr. Kimball, personal communication), clustered primarily with several UMR populations while showing limited overlap with other RRN populations (Figure 2b; Figure 3). Upper Rice Lake also showed heavy admixture with a number of lakes in STRUCTURE analyses (Figure 3). Taken together, these results suggest that human intervention may have altered the genetic variability and population structure of the Upper Rice Lake population assessed in this study. Additionally, Phantom Lake of the SCR watershed displayed heavy admixture with Cultivated materials. These results were surprising as Phantom Lake is one of the most geographically distant Natural Stand populations from cNWR production in this study, and closer populations displayed little to no admixture with Cultivated materials (Figure 3). However, Phantom Lake is part of the Crex Meadows State Wildlife Area and was artificially created in the 1950s, when a series of levee systems were installed. NWR restoration in this area began in 1991, with 500 lbs (227 kg) of seed sown over the course of three years82. We hypothesize that at least a portion of the seed utilized in these efforts came from cNWR production, further highlighting the complexity of population genetic studies in NWR, as well as the importance of documenting seed sources used in reseeding efforts. We also suggest that future reseeding efforts should not use Phantom Lake populations as a seed source based on the recommendations of the Great Lakes Indian Fish & Wildlife Commission83.
The data presented here for 12 wild populations of NWR likely represent only a small fraction of the species’ genetic diversity. However, even with our small sample size, we were able to identify unique genetic variation within many of the populations in the Natural Stand collection. This indicates that for conservation efforts, it is important to consider populations of NWR individually as they may harbor unique alleles and may be more or less adapted to environmental change. Further studies, using a broader range and more even distribution of sampling locations, will increase our knowledge about the population structure and genetic relationships among wild NWR populations and aid with decision making for future reseeding and other conservation-based efforts.
Spatio-Temporal Genetic Diversity Analyses Can Aid in Conservation Efforts. Comprehensive monitoring of the spatio-temporal genetic diversity of a species can provide a better understanding of the evolutionary change a species undergoes over time and help to identify targets for conservation efforts. Although present-day genetic diversity studies are available for a wide array of plant species, the majority of spatio-temporal diversity assessments have focused on large agricultural commodity crops 84–86. Few studies have focused on natural populations 87,88, and many have been only theoretical89,90. As NWR is an important target for conservation, monitoring the spatio-temporal diversity of wild populations would provide impactful data for resource managers and environmental agencies interested in the health and preservation of NWR populations across the species’ natural range.
As a preview of what a more extensive study on the spatio-temporal genetic diversity of NWR could provide, we evaluated two populations, Garfield and Shell Lakes, in 2010 and 2018. Comparing these two time points, we identified a reduction of diversity in samples collected from Shell Lake in 2018 compared to those collected in 2010, while observing limited change in the Garfield Lake population (Figure 2d). This may suggest that Shell Lake has experienced a loss of genetic diversity during the eight years between collection times, while Garfield Lake has not. However, more data are needed to confirm this hypothesis. A wide array of factors could have contributed to this reduction in diversity in Shell Lake, including the 3-5 year boom-and-bust cycles of NWR91, or environmental conditions favorable for specific genotypes in the population’s seed bank. It is also possible that shoreline development and recreation stemming from campgrounds and resorts on Shell Lake could have impacted the health of its native NWR population 15.
Cultivated Northern Wild Rice is Distinct from Natural Stand Populations. Gene flow between domesticated crops and their wild counterparts can have significant impacts on both natural ecosystems and agricultural production systems. Genetic contamination, loss of identity and genetic diversity, and increased weediness are all potential consequences of gene flow 92. For these reasons, the extent of gene flow between crops and their wild cohorts has been evaluated in numerous species and found to be dependent on a variety of factors including, but not limited to, mating system (i.e. out-crossing vs selfing), the type and frequency of pollination (i.e. insect vs wind), the selective (dis)advantage of particular domesticated traits (e.g. seed shattering resistance reducing seed dispersal), genetic drift, and genotype × environment interactions 92–95. Some studies, such as those in soybean (Glycine max), have identified limited gene flow, with domesticated and wild samples separating into monophyletic clades 96,97. Other studies have identified significant historical gene flow during domestication, such as in emmer wheat (Triticum dicoccon)98, as well as on-going gene flow between crop-weed complexes, such as those in cowpea (Vigna unguiculata (L.) Walp)99, pearl millet (Pennisetum glaucum)100, and species in the Sorghum genus101,102.
Given the out-crossing nature of NWR and that cNWR production occurs within the centers of origin and diversity of Z. palustris, it is important to understand the extent of gene flow between cultivated and wild populations. This study found that Natural Stand and Cultivated collections are genetically distinct from one another (Figure 2a; Figure 3; Figure S4), indicating minimal gene flow between these two groups and corroborating the results of previous diversity studies in NWR using different marker systems 5,24,25. However, based on the 1st principal coordinate from Figure 2a, we identified more similarities between the Cultivated collection and Bass, Decker, and Dahler Lakes than other Natural Stand populations. These lakes are geographically close to the UMN cNWR paddy complex in Grand Rapids, MN, which could suggest gene flow. However, it is more likely that this similarity is due to a shared ancestral relationship, as neither STRUCTURE analysis (Figure 3) nor D-statistics (Table S6) suggest recent gene flow between the two populations. Importantly, the cultivated germplasm in use today is all descended from natural stand samples originally collected from this geographical region within the UMR watershed. Cultivation and domestication of NWR began in Aitkin, MN and several small enterprises likely gathered seeds from local populations to build their germplasm bases 13.
Domestication and Stewardship of Cultivated Northern Wild Rice. As domestication is a process rather than a specific event, species exhibit varying levels of domestication 103. In cereals and other major agricultural crops, seed retention and size, seed dormancy and germination, plant growth habit, and plant size are domestication traits commonly targeted for selection104. The presence of these common traits across multiple taxa is known as the domestication syndrome, which differentiates domesticated species from their wild counterparts. While many of today’s largest agricultural commodity crops have undergone mass selection for thousands of years, the advent of new technologies, such as genomic sequencing, provides today’s plant breeders with new opportunities for the rapid, targeted domestication of new crops105. Additionally, these technologies afford researchers the opportunity to study the domestication process in real-time 106.
To begin exploring the domestication process of cNWR, we evaluated changes in nucleotide diversity levels and allele frequency distributions between Natural Stand and cNWR populations using Tajima’s D, FST, and XP-CLR tests. No significant overlap was identified between the three tests, suggesting there is limited evidence for selective sweeps in cNWR. However, two 1-Mb regions on ZPchr0011 and ZPchr0013 had overlapping top 1% of FST and XP-CLR scores, suggesting there is some evidence of genetic changes in cNWR compared to the Natural Stands (Figure 5). A preliminary scan of genes in these two regions identified 5 putative genes whose functions in other species, mainly white rice, are related to drought and salt stresses as well as abscisic acid (ABA) signaling. These included a 60S ribosomal protein kinase 32-like gene 107; a CBL-interacting protein kinase 32-like gene 108; an E3 ubiquitin-protein ligase RZFP34 isoform X2 109,110; and two copies of ras-related protein RABC2a 111. Unlike wild populations of NWR, cNWR is grown in man-made irrigated paddies, which are drained shortly after flowering (Principal Phenological Stage 6)112 to allow for mechanical harvesting of the grain. Therefore, cNWR experiences conditions similar to upland crops, for which standing water is not available during the development of fruit, ripening, and senescence. These results may suggest that stress-related genes, particularly drought-related genes, were heavily selected for in cNWR germplasm to adapt to this drastic change in environmental conditions compared with its natural habitats.
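The idea of looking for genomic windows that fall in the top 1% of more than one statistic can be sketched as a simple set intersection. The Python example below is only an illustration under stated assumptions, not the procedure used in this study: the helper names, the chromosome label chr_demo, the window positions, and all scores are hypothetical.

```python
def top_fraction(scores, fraction=0.01):
    """Return the windows whose scores fall in the top `fraction` of the
    empirical distribution (at least one window is always returned)."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    keep = max(1, int(len(ranked) * fraction))
    return set(ranked[:keep])


def shared_outlier_windows(fst, xpclr, fraction=0.01):
    """Windows that are top-`fraction` outliers for both statistics."""
    return sorted(top_fraction(fst, fraction) & top_fraction(xpclr, fraction))


# Hypothetical scores for 200 one-megabase windows on a made-up chromosome.
# Keys are (chromosome, window start in bp); values are per-window scores.
windows = [("chr_demo", i * 1_000_000) for i in range(200)]
fst = {w: 0.05 for w in windows}
xpclr = {w: 1.0 for w in windows}
fst[windows[10]] = 0.40     # outlier in both statistics
fst[windows[50]] = 0.35     # outlier in FST only
xpclr[windows[10]] = 25.0   # outlier in both statistics
xpclr[windows[120]] = 30.0  # outlier in XP-CLR only

shared = shared_outlier_windows(fst, xpclr)
print(shared)  # [('chr_demo', 10000000)]
```

Requiring agreement between statistics is conservative: a window that is extreme for only one statistic is dropped, which reduces false positives at the cost of missing sweeps that one test is poorly suited to detect.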
As XP-CLR is more robust than FST for identifying recent selection events 63, we looked at the two additional XP-CLR regions that contained the top 1% of the statistic’s empirical distribution, including a region on ZPchr0005 between 8.5-9.7 Mb and a region on ZPchr0006 between 1.2-1.4 Mb. Within these regions, we identified a calcium-dependent protein kinase family protein associated with drought and salt tolerance in white rice113,114; a 2,3-bisphosphoglycerate-independent phosphoglycerate mutase-like gene involved with chlorophyll synthesis and photosynthesis in white rice 115; a CTD nuclear envelope phosphatase 1 homolog associated with seed shattering resistance in white rice 116; a KH domain-containing protein SPIN1-like associated with flowering time in white rice 117; and a pentatricopeptide repeat-containing protein At1g11900 isoform X1 associated with male sterility in Petunia 118. Two paralogs of cytochrome P450 714D1-like were identified in both the ZPchr0005 and ZPchr0006 regions of interest. In white rice, this gene is associated with seed dormancy and flowering time 119,120. While not in the scope of our current study, we think these regions merit further investigation. Given the significance of NWR to a wide range of stakeholders, it is important to understand the potential impact of gene flow from cNWR to wild NWR populations. Therefore, while understanding the domestication process in cNWR is important for the plant breeding process, it can also be used to monitor the genetic diversity of natural stands, allowing for better stewardship of these vital populations.
Domestication indices that account for varying levels of domestication have been proposed for several species and typically include: the extent of phenotypic differentiation between the domesticated species and its wild counterparts; the length of a species’ domestication history; whether major genetic changes to the domesticated species have been identified; whether the species has been adapted to agricultural settings through targeted breeding efforts; and the extent of the species’ cultivation 121–123. Cultivated NWR is somewhat phenotypically distinct from wild NWR, mainly in its growth habit and seed retention characteristics, which have been made possible through breeding efforts. While the species has a short history of cultivation, its production has expanded to California, which is outside the species’ natural range. For these reasons, we suggest that cNWR should be classified as semi-domesticated.