Introduction
The population prevalence of rare disease has recently been estimated to be 3.5–5.9%, which equates to 263–446 million people affected globally. A large proportion of these rare diseases, approximately 72%, are known to have a genetic basis (Wakap et al., 2020; PMID: 31527858). Despite advances in genomic technologies for determining causal variants, such as whole-exome sequencing, the genetic basis of disease is currently identified for only 25–40% of patients (Stranneheim et al., 2021; Quaio et al., 2020; Sawyer et al., 2016). As a result, many patients undergoing diagnostic genetic testing do not receive a genetic diagnosis, and they often experience long delays, which have a substantial emotional impact on the family (Miller, 2021) and incur significant healthcare costs (Monroe et al., 2016). A genetic diagnosis has multiple benefits for the patient and their family, including better understanding of the prognosis, personalised treatment, tailored management and surveillance, improved access to health and social care, and increased reproductive choice (Wright et al., 2018).
The number of rare Mendelian diseases with known molecular aetiology is estimated to be 5,000–6,000 (Hartley et al., 2018); however, for the majority of disease-associated genes it is not known which variants are disease-causing and which are benign. Different pathogenic variants in the same gene can cause different diseases; for example, variants in FGFR3 can cause multiple diseases including Muenke syndrome, hypochondroplasia, and achondroplasia. Different diseases caused by variants in the same gene must be considered distinct because of their disparate clinical presentations and differing treatment options. The sharing of patient-level variants and phenotypes is therefore essential to accelerate our understanding of the molecular basis of genetic disease.
DECIPHER (Firth et al., 2009; Swaminathan et al., 2012; Bragin et al., 2014; Chatzimichali et al., 2015) is a global web-based platform which shares phenotype-linked variant data from rare disease patients (Fig. 1A). It is freely available via a web interface at https://www.deciphergenomics.org. Approximately 40,000 of the patient records held by DECIPHER have explicit patient consent for open sharing on the website (Fig. 1B). These openly shared records contain more than 51,000 variants and more than 172,000 phenotype terms. The integration of this phenotype and variant data enables the discovery of new gene-disease and variant-disease relationships, driving diagnosis and our understanding of human biology. Since DECIPHER was established in 2004, the platform has been used and cited in more than 2,600 published manuscripts.
Patient records in DECIPHER are deposited by academic clinical centres, which are affiliated both to a hospital that oversees the treatment of patients with genetic disorders and to a local university department of human/clinical genetics. Eligible centres (https://www.deciphergenomics.org/join/overview) can apply to join DECIPHER using an online application form. Data from a centre is stored within a DECIPHER project. A senior clinician at that centre (the clinical coordinator), sometimes in conjunction with a senior clinical scientist (the lab coordinator), is responsible for approving or rejecting applications from individuals working at that centre who wish to access the data in the project.
The platform supports the deposition of almost all types of genetic variation, including sequence variants, short tandem repeats, copy-number variants (CNVs) and large structural variants. Variant interpretation interfaces are provided, including genome and protein browsers, which contextualise genetic and phenotype information to enable accurate interpretation. These interfaces integrate external datasets such as the Genome Aggregation Database (gnomAD; Karczewski et al., 2020), which can be used to exclude variants seen at appreciable frequency in the general population, in addition to disease-relevant datasets such as ClinVar (Landrum et al., 2018; PMID: 29165669) and DECIPHER records themselves. DECIPHER also encourages the use of global standards to promote good practice, including the American College of Medical Genetics and Genomics (ACMG) guidelines for sequence variant interpretation (Richards et al., 2015) and the ACMG/ClinGen technical standards for interpreting CNVs (Riggs et al., 2020).
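The population-frequency exclusion mentioned above can be illustrated with a minimal sketch. This is not DECIPHER's implementation and does not call any gnomAD API: the `Variant` record, the variant labels, the allele frequencies and the `max_af` threshold are all hypothetical values chosen for illustration, and an appropriate threshold in practice depends on the disease's prevalence, penetrance and inheritance pattern.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Variant:
    """Hypothetical candidate variant; not a DECIPHER or gnomAD data model."""
    label: str
    # Allele frequency in a population reference dataset.
    # None means the variant was not observed in that dataset.
    population_af: Optional[float]


def is_rare(variant: Variant, max_af: float) -> bool:
    """Return True if the variant is absent from the reference dataset
    or observed at a frequency no greater than max_af."""
    return variant.population_af is None or variant.population_af <= max_af


def filter_rare(variants: List[Variant], max_af: float = 1e-4) -> List[Variant]:
    """Keep only variants that pass the frequency filter.

    The default max_af is an illustrative assumption, not a recommended cut-off.
    """
    return [v for v in variants if is_rare(v, max_af)]


# Made-up example data.
candidates = [
    Variant("variant_A", None),     # never observed in the reference dataset
    Variant("variant_B", 0.00002),  # very rare
    Variant("variant_C", 0.03),     # common; unlikely to cause a rare disease
]

rare = filter_rare(candidates)
print([v.label for v in rare])
```

With these made-up values, only `variant_C` is excluded. Variants that pass such a filter are merely retained for further interpretation; rarity alone is not evidence of pathogenicity.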
In the following sections, we present examples of the genotype and phenotype data deposited in DECIPHER and shared with the rare disease community. In addition, we present the tools provided by DECIPHER for assessing the pathogenicity of variants according to international standards, and the utility of DECIPHER for mapping the clinically relevant parts of the genome.