INTRODUCTION
The canola flower midge (CFM), Contarinia brassicola Sinclair, is a newly discovered cecidomyiid fly from the Canadian Prairies that forms flower galls on canola, Brassica napus L. (Mori et al. 2019). Canola was initially developed from rapeseed, Brassica rapa L. and B. napus, in the Canadian provinces of Manitoba and Saskatchewan in the 1970s, and has since grown to become one of the largest oilseed crops in the world due to its widespread use as livestock feed, biofuel, and cooking oil (Barthet 2016; Canola Council of Canada 2020a). Today, the Canadian Prairies produce and export the largest amount of canola in the world, and the highest levels of Canadian production occur in Saskatchewan (LMC International 2016; Statistics Canada 2019).
CFM is hypothesized to be native to Canada (Mori et al. 2019), although knowledge of its biology is limited because the species was described only recently. Prior to its description in 2019, the canola midge pests of the Prairie provinces were erroneously thought to be the swede midge, Contarinia nasturtii (Kieffer), a morphologically and ecologically similar congener of CFM. Swede midge causes significant crop damage in parts of Europe and Asia and, more recently, as an invasive pest of canola in North America (Hallett et al. 2007; Chen et al. 2011). Like swede midge, CFM appears to be multivoltine. Initial adult emergence occurs in June and July, during canola bud formation, with a second generation in August; however, CFM larvae have been observed in the field throughout the summer and into September, suggesting that the species may produce more than two generations per year (Chen et al. 2011; Andreassen et al. 2018; Mori et al. 2019; Soroka et al. 2019). Larvae are small, up to a few millimeters in length, and they feed within developing canola flower buds. This causes the buds to transform into galls, which then fail to flower or produce seed (Mori et al. 2019). Due to its feeding behaviour and ability to produce multiple generations per year, CFM is potentially capable of causing a significant impact on Canadian canola crop yields.
While several aspects of CFM ecology have been described (Mori et al. 2019; Soroka et al. 2019), little is known about CFM population dynamics. Prior genetic investigation of CFM was restricted to specimens sampled primarily from Saskatchewan and used only a single mitochondrial gene (Mori et al. 2019). There have been no assessments of CFM population structure at wider geographic scales, which limits effective monitoring and risk assessment across the canola-producing region. Population genetics is a powerful tool for integrated pest management, and can inform effective management strategies by elucidating how genetic diversity, population size, and habitat connectivity influence population dynamics (Rollins et al. 2006; Tiroesele et al. 2014; Pélissié et al. 2018; Combs et al. 2019). Genetic assessments of population dynamics are particularly important when organisms lack comprehensive historical occurrence records (e.g. Mori et al. 2016) or are not easily observed in the field, as is the case with CFM. Next-generation sequencing (NGS) approaches, particularly those that use restriction enzymes to digest DNA and ultimately produce large single nucleotide polymorphism (SNP) datasets, have recently become widespread in population genomic studies. These approaches can assess hundreds or thousands of markers across the genome in organisms with no existing genomic resources (Davey & Blaxter 2010; Andrews et al. 2016), and often provide a more comprehensive representation of population dynamics than one or a few markers (Dussex et al. 2016; Vendrami et al. 2017; Liu et al. 2019).
Small organisms present a challenge to restriction enzyme-based approaches, as these approaches require DNA of higher quality and quantity than traditional gene sequencing. The development of whole genome amplification (WGA) techniques, which amplify genomic DNA prior to NGS library preparation and sequencing, presents a possible solution to this problem. However, the application of WGA to NGS datasets is relatively new, and few studies to date have assessed whether WGA is likely to introduce amplification biases that may impact genome coverage and genotyping, particularly in organisms that lack a reference genome (Lovmar et al. 2006; El Sharawy et al. 2012; Ellegaard et al. 2013; Cruaud et al. 2018). In the first study to comprehensively investigate WGA for insect population genetics using non-pooled samples, de Medeiros & Farrell (2018) found that WGA resulted in sufficient libraries for analysis, albeit with fewer loci. Given the likelihood that WGA techniques will see increased use in SNP-based studies of small organisms, further assessment of amplification and sequencing biases in this context is necessary.
Here we assessed the genomic structure of CFM and investigated whether the use of WGA prior to NGS introduced differences in locus recovery, SNP genotyping, and estimates of polymorphism that may impact downstream population genomic analyses. We sampled CFM across its known range and compared the population structure recovered with COI haplotype data to that recovered with genomic SNPs. This is the first genomic study of CFM. It presents a data-rich foundation for continued study of the population dynamics of this pest and highlights several areas for future research to improve risk assessment and monitoring efforts for CFM.
METHODS