Peng Liu

and 4 more

Accurate and efficient genotyping of microsatellite loci is essential for their application in population genetics and various demographic analyses. Protocols for next-generation sequencing of microsatellite loci offer high-throughput, cross-compatible allele scoring, addressing common issues associated with size separation in conventional capillary-based protocols. To support these protocols, we have developed Seq2Sat, a novel, ultra-fast, all-in-one software written in C++ for accurate automated microsatellite genotyping. It takes raw reads of microsatellite amplicons directly as input and performs read quality control before inferring genotypes based on read depth, sequence composition and length. It does not produce any intermediate files, making I/O very efficient. Additionally, we developed a module in Seq2Sat for sex identification based on sex locus amplicons. We further developed a user-friendly web-based platform, SatAnalyzer, which conducts reads-to-report analyses by calling Seq2Sat to generate genotype tables and interactive genotype graphs for manual editing. SatAnalyzer also allows visualization of read quality and read distribution across loci and samples, to help troubleshoot multiplex optimization and library preparation. To evaluate its performance, we benchmarked SatAnalyzer against conventional capillary gel electrophoresis and an existing microsatellite genotyping software, MEGASAT. Results show that SatAnalyzer achieves a genotyping accuracy of >0.993 and that Seq2Sat is ~5 times faster than MEGASAT, despite generating many more informative tables and figures. Seq2Sat and SatAnalyzer are freely available on GitHub (https://github.com/ecogenomicscanada/Seq2Sat) and Docker Hub (https://hub.docker.com/r/rocpengliu/satanalyzer).
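
To make the idea of inferring genotypes from read depth and allele length concrete, the following is a minimal sketch in Python. It is not Seq2Sat's algorithm or code: the function names, the `min_depth` and `min_het_ratio` thresholds, and the rule for calling heterozygotes are all assumptions chosen for illustration, and the sketch ignores sequence composition (for example, SNPs within the amplicon) and read quality control, both of which the abstract says Seq2Sat handles.

```python
import re
from collections import Counter


def longest_repeat_run(read, motif):
    """Return the largest number of consecutive copies of `motif` in `read`."""
    runs = re.findall(f"(?:{re.escape(motif)})+", read)
    return max((len(run) // len(motif) for run in runs), default=0)


def call_genotype(reads, motif, min_depth=10, min_het_ratio=0.3):
    """Call up to two alleles (as repeat counts) for one locus.

    Hypothetical rule, for illustration only: the best-supported allele is
    always called; the second allele is called only if its read depth is at
    least `min_het_ratio` times that of the first. Returns None when fewer
    than `min_depth` informative reads are available.
    """
    counts = Counter(longest_repeat_run(read, motif) for read in reads)
    counts.pop(0, None)  # reads without any copy of the motif are uninformative
    if not counts or sum(counts.values()) < min_depth:
        return None
    ranked = counts.most_common(2)
    top_allele, top_depth = ranked[0]
    if len(ranked) == 2 and ranked[1][1] / top_depth >= min_het_ratio:
        return tuple(sorted((top_allele, ranked[1][0])))
    return (top_allele, top_allele)


def make_reads(repeats, n, motif="AC"):
    """Build `n` toy amplicon reads carrying `repeats` copies of `motif`."""
    return ["TT" + motif * repeats + "GG"] * n


# Toy data: a heterozygous locus and a homozygous locus with minor stutter.
het_reads = make_reads(8, 12) + make_reads(10, 9) + make_reads(7, 2)
hom_reads = make_reads(8, 20) + make_reads(7, 2)
```

With the toy reads above, `call_genotype(het_reads, "AC")` returns both well-supported alleles, while `call_genotype(hom_reads, "AC")` treats the low-depth shorter allele as stutter and returns a homozygous call. In practice the thresholds would need tuning per locus and library.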

Rebecca Taylor

and 3 more

Conservation genomics is an important tool for managing threatened species in the face of current biodiversity loss. Recent advances in sequencing technology mean that we can now use whole genomes to investigate demographic history, local adaptation, inbreeding, and more in unprecedented detail. However, for many rare and elusive species only non-invasive samples such as faeces can be obtained, making it difficult to take advantage of whole genome data. We present a method to extract DNA from the mucosal layer of faecal samples and reconstruct high-coverage whole genomes using standard laboratory techniques, making the approach cost-effective and efficient. We use faecal pellets collected from wild caribou (Rangifer tarandus), a species undergoing declines in many parts of its range in Canada and subject to comprehensive conservation and population monitoring measures. We compare four faecal genomes to two tissue genomes sequenced in the same run. Quality metrics were similar between faecal and tissue samples, with the main difference being the alignment success of raw reads to the reference genome, likely due to differences in endogenous DNA content, which affects overall coverage. One of our faecal genomes was only reconstructed at low coverage (1.6X); however, the other three reached between 7 and 15X, compared to 19 and 25X for the tissue samples. We successfully reconstructed high-quality whole genomes from faecal DNA and, to our knowledge, are the first to obtain genome-wide data from wildlife faecal DNA in a non-primate species, representing an important advancement for non-invasive conservation genomics.