Practical guide for obtaining and validating chromosome-scale genome assemblies with Hi-C scaffolding
  • Kazuaki Yamaguchi,
  • Mitsutaka Kadota,
  • Osamu Nishimura,
  • Yuta Ohishi,
  • Yuki Naito,
  • Shigehiro Kuraku
Kazuaki Yamaguchi
RIKEN Center for Biosystems Dynamics Research

Corresponding Author: kazuaki.yamaguchi@riken.jp

Mitsutaka Kadota
RIKEN Center for Biosystems Dynamics Research
Osamu Nishimura
RIKEN Center for Biosystems Dynamics Research
Yuta Ohishi
RIKEN Center for Biosystems Dynamics Research
Yuki Naito
Database Center for Life Science
Shigehiro Kuraku
RIKEN Center for Biosystems Dynamics Research

Abstract

Recent progress in ecological studies has been fueled by the massive amounts of information derived from chromosome-scale genome sequences, even for species whose genetic linkage was previously inaccessible. This was enabled mainly by the application of Hi-C, a genome-wide chromosome conformation capture method originally developed for investigating long-range chromatin interactions. Genome scaffolding with Hi-C data is highly resource-demanding: it requires elaborate laboratory steps for sequencing sample preparation, construction of a primary genome assembly as input, and computation for the scaffolding itself, followed by careful validation. This article summarizes existing solutions for these steps and provides a test case of their application to a reptile species, the Madagascar ground gecko (Paroedura picta). Among the metrics frequently used for evaluating scaffolding results, we investigate the validity of completeness assessment using single-copy reference orthologs and report problems with the widely used program pipeline BUSCO.