INTRODUCTION
The canola flower midge (CFM), Contarinia brassicola Sinclair, is
a newly discovered cecidomyiid fly from the Canadian prairies that forms
flower galls on canola, Brassica napus L. (Mori et al. 2019). Canola was initially developed from rapeseed, Brassica
rapa L. and B. napus in the Canadian provinces of Manitoba and
Saskatchewan in the 1970s, and has since grown to become one of the
largest oilseed crops in the world due to widespread use as livestock
feed, biofuel, and cooking oil (Barthet 2016; Canola Council of Canada
2020a). Today, the Canadian Prairies produce and export the largest
amount of canola in the world, and the highest levels of Canadian
production occur in Saskatchewan (LMC International 2016; Statistics
Canada 2019).
CFM is hypothesized to be native to Canada (Mori et al. 2019),
although knowledge of its biology is limited because the species was
only recently described. Prior to its description in 2019, the canola midge
pests of the Prairie provinces were erroneously thought to be the swede
midge, Contarinia nasturtii (Kieffer), a morphologically and
ecologically similar congener of CFM. Swede midge causes significant
crop damage in parts of Europe and Asia and, more recently, as an invasive
pest of canola in North America (Hallett et al. 2007; Chen et al. 2011). Like swede midge, CFM appears to be multivoltine.
Initial adult emergence occurs in June and July, during canola bud
formation, with a second generation in August; however, CFM larvae have
been observed in the field throughout the summer and into September,
suggesting that they may produce more than two generations per year
(Chen et al. 2011; Andreassen et al. 2018; Mori et
al. 2019; Soroka et al. 2019). Larvae are small, up to a few
millimeters in length, and they feed within developing canola flower
buds. This causes the buds to transform into galls, which then fail to
flower or produce seed (Mori et al. 2019). Because of its feeding
behaviour and ability to produce multiple generations per year, CFM is
potentially capable of substantially reducing Canadian canola
crop yields.
While several aspects of CFM ecology have been described (Mori et
al. 2019; Soroka et al. 2019), little is known about CFM
population dynamics. Prior genetic investigation of CFM was restricted
to specimens sampled primarily from Saskatchewan and relied on only a
single mitochondrial gene (Mori et al. 2019). There have been no
assessments of CFM population structure at wider geographic scales, which
limits effective monitoring and risk assessment across the
canola-producing region. Population genetics is a powerful tool for integrated
pest management, and can inform effective management strategies by
elucidating how genetic diversity, population size, and habitat
connectivity influence population dynamics (Rollins et al. 2006;
Tiroesele et al. 2014; Pélissié et al. 2018; Combs et al. 2019). Genetic assessments of population dynamics are
particularly important when organisms lack comprehensive historical
occurrence records (e.g. Mori et al. 2016) or are not easily
observed in the field, as is the case with CFM. Next-generation
sequencing (NGS) approaches, particularly those that use restriction
enzymes to digest DNA and ultimately produce large single nucleotide
polymorphism (SNP) datasets, have recently become widespread in
population genomic studies. These approaches can assess hundreds or
thousands of markers across the genome in organisms with no existing
genomic resources (Davey & Blaxter 2010; Andrews et al. 2016),
and often provide a more comprehensive representation of population
dynamics compared to one or a few markers (Dussex et al. 2016;
Vendrami et al. 2017; Liu et al. 2019).
Small organisms present a challenge to restriction enzyme-based
approaches, as these approaches require DNA of higher quality and in
greater quantity than traditional gene sequencing. The development of whole genome
amplification (WGA) techniques, which amplify genomic DNA prior to NGS
library preparation and sequencing, presents a possible solution to this
problem. However, the application of WGA to NGS datasets is relatively
new and there have been few studies to date that have assessed whether
WGA is likely to introduce amplification biases that may impact genome
coverage and genotyping, particularly in organisms that lack a reference
genome (Lovmar et al. 2006; El Sharawy et al. 2012;
Ellegaard et al. 2013; Cruaud et al. 2018). In the first
study to comprehensively investigate WGA for insect population genetics
using non-pooled samples, de Medeiros & Farrell (2018) found that WGA
produced libraries sufficient for analysis, albeit with fewer loci.
Given the likelihood that WGA techniques will see increased use in
SNP-based studies of small organisms, further assessment of
amplification and sequencing biases in this context is necessary.
Here we assessed the genomic structure of CFM, and investigated whether
the use of WGA prior to NGS introduced differences in locus recovery,
SNP genotyping, and estimates of polymorphism that may impact downstream
population genomic analyses. We sampled CFM across its known range and
compared the population structure recovered with COI haplotype
data to that recovered with genomic SNPs. This is the first genomic study of CFM.
It provides a data-rich foundation for continued study of the population
dynamics of this pest and highlights several areas for future research
to improve risk assessment and monitoring efforts for CFM.
METHODS