Genetic Data
As our knowledge of rare disease genetics develops and the interactions
between genetic loci become more fully understood, there is a pressing need
for the visualization of all types of genetic variation within a single
interface. DECIPHER fulfills this need, supporting many types of genetic
variation including sequence variants, CNVs, aneuploidy, uniparental
disomy (UPD), inversions, insertions and short tandem repeats (STRs)
(Fig. 2).
Variant deposition: Variants are deposited using genomic
coordinates. Sequence variants can also be deposited using a relevant
subset of HGVS nomenclature (den Dunnen et al., 2006), and are
normalised (left-aligned, parsimonious) during the deposition process
(Tan et al., 2015). For known STRs, the disease-relevant STR can
be selected from a dropdown in the web interface. Additional information
about the variant such as inheritance, genotype, pathogenicity, and
contribution to phenotype can also be recorded.
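The normalisation step mentioned above can be illustrated with a minimal Python sketch of the left-align-and-trim procedure described by Tan et al. (2015). The function name `normalise`, the in-memory `genome` string and the toy sequence are assumptions made for illustration; this is not DECIPHER's implementation, and it handles only a single alternate allele on one contig.

```python
def normalise(pos, ref, alt, genome):
    """Left-align and trim a variant (after Tan et al., 2015).

    pos    -- 1-based position of the first base of `ref` in `genome`
    ref    -- reference allele (may be empty for a pure insertion)
    alt    -- alternate allele (may be empty for a pure deletion)
    genome -- reference sequence as a plain string (illustrative assumption)
    """
    if ref == alt:
        raise ValueError("ref and alt are identical: not a variant")

    # Step 1: trim shared trailing bases; whenever an allele becomes
    # empty, extend both alleles one base to the left.
    while True:
        if ref and alt and ref[-1] == alt[-1]:
            ref, alt = ref[:-1], alt[:-1]
        elif not ref or not alt:
            if pos == 1:
                break  # start of sequence: cannot extend further left
            pos -= 1
            base = genome[pos - 1]
            ref, alt = base + ref, base + alt
        else:
            break

    # Step 2: trim shared leading bases while both alleles keep >= 1 base.
    while len(ref) > 1 and len(alt) > 1 and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1

    return pos, ref, alt


# Toy reference containing a CA repeat; a CA deletion written at
# position 8 is shifted to its leftmost equivalent position.
toy_genome = "GGGCACACACAGGG"
print(normalise(8, "CAC", "C", toy_genome))    # (3, 'GCA', 'G')
print(normalise(4, "CAC", "CGC", toy_genome))  # (5, 'A', 'G')
```

Because every equivalent representation of the same deletion in the repeat collapses to one left-aligned, minimal form, two depositors describing the same event differently end up with the same record.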
Mosaicism: For de novo mosaic variants, it is possible to
record the mosaicism observed in each tissue, as a percentage. This
information is clinically important as it can help explain the
variability of clinical symptoms, for example the difference between
nevus sebaceous and Schimmelpenning syndrome (where extracutaneous
abnormalities are present), caused by HRAS and KRAS variants (Groesser et al., 2012).
Mitochondrial variants: DECIPHER supports the deposition and
interpretation of variants in the nuclear and mitochondrial genomes.
Mitochondrial diseases are the most common form of inherited
neuro-metabolic disorders, and are caused by mutations in the nuclear or
mitochondrial genomes. In addition, nuclear genetic factors have been
shown to influence clinical outcomes for mitochondrial DNA mutations
(Boggan et al., 2019). Thus the display of both genomes in a
single interface is clinically important. In DECIPHER it is possible to
record homoplasmy or the percentage of heteroplasmy per tissue, which is
clinically essential as it has been shown to contribute to disease
progression (Grady et al., 2018).
Variant haplotypes: Variants may work in cis to create or
modify a disease allele or in trans to cause a biallelic
disorder. For this reason DECIPHER users can assign variants to a
haplotype, e.g. for compound heterozygous variants, the variants will be
shown as in trans. As our understanding of rare disease genetics
improves, the representation of its complexity is becoming even more
essential. Genetic modifiers are known to alleviate or exacerbate
disease severity (Rahit and Tarailo-Graovac, 2020), and there
are recent examples where rare pathogenic haplotypes have been shown to
cause disease, such as an albinism-causing TYR haplotype (Campbell et al., 2019).
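The cis/trans distinction can be made concrete with a small Python sketch. The `PhasedVariant` class, the haplotype labels "A" and "B", the placeholder variant names and the function `phase_relation` are all hypothetical and chosen only for illustration; they do not describe DECIPHER's data model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PhasedVariant:
    name: str                 # free-text label for the variant (placeholder)
    gene: str                 # gene symbol
    haplotype: Optional[str]  # hypothetical label, e.g. "A" or "B"; None if unassigned


def phase_relation(a: PhasedVariant, b: PhasedVariant) -> str:
    """Return 'cis', 'trans' or 'unknown' for two variants in one individual."""
    if a.haplotype is None or b.haplotype is None:
        return "unknown"  # phase cannot be inferred without an assignment
    return "cis" if a.haplotype == b.haplotype else "trans"


# Two heterozygous variants in the same gene on different haplotypes,
# as in a compound heterozygous (biallelic) presentation.
v1 = PhasedVariant("variant_1", "GENE_X", "A")
v2 = PhasedVariant("variant_2", "GENE_X", "B")
v3 = PhasedVariant("variant_3", "GENE_X", None)

print(phase_relation(v1, v2))  # trans
print(phase_relation(v1, v1))  # cis
print(phase_relation(v1, v3))  # unknown
```

Keeping "unknown" as an explicit outcome avoids treating unphased variants as though they were confirmed to lie in trans.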
Pathogenicity predictors: For all sequence variants deposited to
DECIPHER, predictions from the Ensembl Variant Effect Predictor (VEP;
McLaren et al., 2016) are displayed across all Ensembl/GENCODE
transcripts. Predictions include the consequence (e.g. missense,
frameshift), the protein change, and several pathogenicity scores: SIFT
(Sim et al., 2012), PolyPhen-2 (Adzhubei et al., 2013),
CADD (Kircher et al., 2014), REVEL (Ioannidis et al.,
2016), and SpliceAI (Jaganathan et al., 2019). DECIPHER seeks
advice from experts in the field and refers to benchmarking studies for
pathogenicity predictors (e.g. Gunning et al., 2021) prior to the
inclusion of additional scores, assisting in the application of good
practice.
Reference genome: All genomic information is displayed in the
GRCh38 assembly of the human genome, so that the most
up-to-date genome and transcript information is used for
accurate variant interpretation. The display of genomic data in GRCh38
permits DECIPHER to promote the use of Matched Annotation from NCBI and
EMBL-EBI (MANE) transcripts, where the paired RefSeq and Ensembl/GENCODE
transcripts for a protein-coding gene are identical (5’ UTR,
coding region, and 3’ UTR). DECIPHER currently promotes and highlights
MANE Select transcripts, one high-quality representative transcript per
protein-coding gene that is well-supported by experimental data and
represents the biology of the gene
(https://tark.ensembl.org/web/mane_project). Describing variants
relative to a single, recommended transcript, along with sequence
variant normalisation, assists in the standardisation of variant
reporting.
Reference conversion tools: Deposition with GRCh37/hg19
coordinates is still supported: prior to normalisation, DECIPHER remaps
GRCh37 coordinates onto the GRCh38 assembly, using an algorithm based on
the UCSC LiftOver tool
(https://genome.ucsc.edu/cgi-bin/hgLiftOver,
Kuhn et al., 2013). A range of tools are also provided to allow
users to visualise the differences between assemblies. These include
GRCh37 and GRCh38 comparative genome browsers, gene lists for variants
lifted over by DECIPHER which display genes that no longer overlap the
variant, and a liftover mapping genome browser track (Fig. 3).
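The idea behind coordinate remapping and the "genes that no longer overlap" lists can be sketched in Python. This is a simplified illustration, not the UCSC LiftOver algorithm or DECIPHER's code: it assumes a single chromosome, forward-strand aligned blocks sorted by start position, and entirely made-up toy coordinates and gene names. The names `Block`, `lift_position`, `lift_interval` and `overlapping` are hypothetical.

```python
from bisect import bisect_right
from typing import NamedTuple


class Block(NamedTuple):
    src_start: int  # 0-based inclusive start in the source assembly
    src_end: int    # 0-based exclusive end in the source assembly
    dst_start: int  # start of the same block in the target assembly


def lift_position(pos, blocks):
    """Map one source position through sorted aligned blocks, or None."""
    starts = [b.src_start for b in blocks]
    i = bisect_right(starts, pos) - 1
    if i < 0 or pos >= blocks[i].src_end:
        return None  # position falls in an unaligned gap
    return blocks[i].dst_start + (pos - blocks[i].src_start)


def lift_interval(start, end, blocks):
    """Map a half-open interval [start, end) by lifting both endpoints."""
    new_start = lift_position(start, blocks)
    new_last = lift_position(end - 1, blocks)
    if new_start is None or new_last is None or new_last < new_start:
        return None
    return new_start, new_last + 1


def overlapping(start, end, genes):
    """Names of genes whose half-open span overlaps [start, end)."""
    return {n for n, (g_start, g_end) in genes.items()
            if g_start < end and start < g_end}


# Toy alignment: the target assembly has 50 extra bases after position 100.
blocks = [Block(0, 100, 0), Block(100, 200, 150)]
old_genes = {"GENE_A": (80, 95), "GENE_B": (105, 120)}   # source annotation
new_genes = {"GENE_A": (80, 95), "GENE_B": (200, 215)}   # target annotation

variant_old = (90, 110)
variant_new = lift_interval(*variant_old, blocks)
print(variant_new)  # (90, 160)

# Genes overlapped before liftover but not after it.
lost = overlapping(*variant_old, old_genes) - overlapping(*variant_new, new_genes)
print(sorted(lost))  # ['GENE_B']
```

A production liftover must also handle strand changes, multiple chains per region and intervals that split across chains, all of which this sketch deliberately leaves out.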