Evolution of Matchmaking
Matchmaking using MME is based on a two-sided framework in which two interested parties are both looking for a match for the same gene (Philippakis et al., 2015). As the past 7 years of experience have demonstrated, this approach has been very successful in advancing the discovery of novel disease-gene relationships. However, it only works when both parties have taken the time to flag a highly compelling novel candidate gene of interest. But what of the many datasets for which, for any number of reasons, extensive manual review has not occurred? Discovery in these datasets needs to happen differently. One-sided matchmaking (Figure 1) can occur when one party is interested in a candidate gene and queries a database hosting genome-wide sequencing data from undiagnosed patients to identify variants in that gene, together with any associated information. Zero-sided matchmaking (Figure 1) describes the situation in which no candidates have been identified and computational analysis across the cohort is instead used to identify genes with predicted damaging variants in common across phenotypes. For example, the genebass.org website allows users to query precomputed gene burden analyses across all genes for all phenotypes in the UK Biobank (Karczewski et al., 2021). In another example, the Deciphering Developmental Disorders (DDD) study applies burden testing frameworks to identify genes with significant enrichment of damaging variants, such as genes with more de novo loss-of-function variants in the DDD cohort than expected (PMC: 7116826). Likewise, the new GREGoR consortium (gregorconsortium.org) is amassing rare disease data on the AnVIL platform (Schatz et al., 2022), both from the prior NIH Centers for Mendelian Genomics and from prospectively collected data, to improve power for identifying gene-disease candidates. As more and more data are generated, this type of approach will be critical to ensure we can analyze unsolved datasets at scale.
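To make the idea of zero-sided matchmaking more concrete, the following is a minimal sketch of a per-gene burden test. It is not the DDD framework itself, whose details are not given here; it assumes a simple one-sided Poisson test of observed against expected de novo loss-of-function counts with a Bonferroni correction. The gene names, counts, and expected rates are hypothetical.

```python
import math


def poisson_upper_tail(observed: int, expected: float) -> float:
    """Return P(X >= observed) for X ~ Poisson(expected)."""
    if observed <= 0:
        return 1.0
    # Accumulate P(X = k) for k < observed, then take the complement.
    # Adequate for a sketch; very small p-values lose precision this way.
    term = math.exp(-expected)
    cdf_below = 0.0
    for k in range(observed):
        cdf_below += term
        term *= expected / (k + 1)
    return max(0.0, 1.0 - cdf_below)


def enriched_genes(observed_counts, expected_counts, alpha=0.05):
    """List genes whose observed count exceeds expectation after Bonferroni correction."""
    threshold = alpha / len(expected_counts)  # one test per gene
    hits = []
    for gene, expected in expected_counts.items():
        observed = observed_counts.get(gene, 0)
        p_value = poisson_upper_tail(observed, expected)
        if p_value < threshold:
            hits.append((gene, observed, expected, p_value))
    return sorted(hits, key=lambda hit: hit[3])


# Hypothetical cohort-level counts of de novo loss-of-function variants.
observed = {"GENE_A": 9, "GENE_B": 1, "GENE_C": 0}
# Hypothetical expected counts, e.g. from a per-gene mutation rate model.
expected = {"GENE_A": 0.5, "GENE_B": 0.8, "GENE_C": 0.3}

hits = enriched_genes(observed, expected)
for gene, obs, exp, p_value in hits:
    print(f"{gene}: observed={obs}, expected={exp}, p={p_value:.2e}")
```

No candidate gene is supplied by any user in this sketch; the candidates emerge from the cohort-wide comparison, which is what distinguishes the zero-sided case from the one-sided and two-sided cases.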
Several data platforms are approaching one-sided matchmaking by providing information about the existence of a specific variant and its associated information (e.g., phenotype). These databases include MyGene2 (NHGRI/NHLBI University of Washington-Center for Mendelian Genomics (UW-CMG), Seattle, WA), Geno2MP (University of Washington Center for Mendelian Genomics), VariantMatcher (Wohler et al., 2021), and Franklin (Genoox). MyGene2 and Geno2MP are public databases in which anyone can access the displayed variant-level data associated with phenotypic terms; in the case of MyGene2, sharing is driven by families. VariantMatcher accepts variant-specific queries, searches its database of variants, and reports whether the variant is present, along with the associated phenotype if available, with dual notification to the querier and the data submitter (Wohler et al., 2021). Franklin is an interpretation and connection platform that supports a community of users to facilitate variant interpretation. These four data platforms are working to facilitate a federated connection to one another using Data Connect, a GA4GH standard for discovery and search (Rodrigues et al., 2022).
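The variant-level query flow described above (look up a variant, report presence and any phenotype, and notify both sides) can be sketched as follows. This is an illustrative in-memory model only, not the API of VariantMatcher or any other platform; the class names, fields, and message texts are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Variant:
    chrom: str
    pos: int
    ref: str
    alt: str


@dataclass
class Submission:
    submitter: str
    phenotype: Optional[str] = None  # phenotype may not be available


@dataclass
class VariantIndex:
    records: dict = field(default_factory=dict)  # Variant -> list of Submission
    outbox: list = field(default_factory=list)   # (recipient, message) pairs

    def submit(self, variant: Variant, submission: Submission) -> None:
        self.records.setdefault(variant, []).append(submission)

    def query(self, querier: str, variant: Variant) -> dict:
        matches = self.records.get(variant, [])
        if not matches:
            return {"present": False, "phenotypes": []}
        # Dual notification: every matching submitter and the querier are told.
        for submission in matches:
            self.outbox.append(
                (submission.submitter, f"{querier} queried a variant you submitted")
            )
        self.outbox.append((querier, f"{len(matches)} matching submission(s) found"))
        phenotypes = [s.phenotype for s in matches if s.phenotype]
        return {"present": True, "phenotypes": phenotypes}


# Hypothetical data: two submitters hold the same variant, one with a phenotype.
index = VariantIndex()
shared = Variant("1", 12345, "A", "G")
index.submit(shared, Submission("lab_a", "seizures"))
index.submit(shared, Submission("lab_b"))

result = index.query("clinician_x", shared)
miss = index.query("clinician_x", Variant("2", 500, "C", "T"))
print(result, miss, len(index.outbox))
```

Only one party needs a candidate in this model, which is the sense in which the query is one-sided; the submitters did not have to flag the variant as being of interest.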
At the gene level, several databases, such as DECIPHER (Foreman et al., 2022), RD-Connect GPAP (Laurie et al., 2022), Genomics4RD (Driver et al., 2022), and seqr, if used in collaboration with the Broad Center for Mendelian Genomics (Pais et al., 2022), are now individually approaching this challenge using internal one-sided matchmaking. Here, an internal user with a candidate gene identified in an undiagnosed patient can query the genomic data housed in the database to see all variants identified in this candidate gene at a certain frequency, or of a certain type, across the dataset, along with associated phenotypic and often inheritance data. While these approaches are currently siloed and only available to internal users because of the level of data being shared, efforts are underway to make more of these data available. For example, Geno2MP (University of Washington Center for Mendelian Genomics) allows searches of the rare variants generated by the majority of the Centers for Mendelian Genomics, which are linked to very high-level phenotypic information (Baxter et al., 2022). Genomics4RD (Driver et al., 2022) is piloting a one-sided matchmaking platform for external users that uses a registered access model to facilitate multi-level filtering on both genetic variation and phenotypic information and ensures that compound heterozygous variants in a single participant are identifiable. Beacon is a genomic discovery protocol and data access API issued by the GA4GH. Its most recent version (v2), presented in this issue, offers new and enhanced features for complex queries and richer responses (Rambla et al., 2022). Beacon v2 is designed to sit on top of existing solutions, can be integrated into Beacon networks, and provides a way forward for the next phase of genomic matchmaking and other data queries.
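The gene-level one-sided query described in this section (all variants in a candidate gene below a frequency cutoff or of a given type, with phenotype attached, and with possible compound heterozygotes identifiable per participant) might look like the sketch below. It does not reproduce the query interface of any named platform or of Beacon v2; the record fields, thresholds, and identifiers are hypothetical, and without phasing or parental data two heterozygous variants in one participant are only a candidate compound heterozygote.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Call:
    participant: str
    gene: str
    variant_id: str
    consequence: str         # e.g. "missense" or "lof"
    allele_frequency: float  # population allele frequency
    zygosity: str            # "het" or "hom"
    phenotype: str


def query_gene(calls, gene, max_af=0.001, consequences=None):
    """Return calls in `gene` at or below `max_af`, optionally limited by consequence."""
    return [
        c for c in calls
        if c.gene == gene
        and c.allele_frequency <= max_af
        and (consequences is None or c.consequence in consequences)
    ]


def possible_compound_hets(hits):
    """Map participants to their variants when two or more distinct het variants are present."""
    by_participant = defaultdict(set)
    for c in hits:
        if c.zygosity == "het":
            by_participant[c.participant].add(c.variant_id)
    return {p: sorted(v) for p, v in by_participant.items() if len(v) >= 2}


# Hypothetical dataset of variant calls from undiagnosed participants.
calls = [
    Call("P1", "GENE_X", "v1", "missense", 0.0001, "het", "hypotonia"),
    Call("P1", "GENE_X", "v2", "lof", 0.00005, "het", "hypotonia"),
    Call("P2", "GENE_X", "v3", "lof", 0.0002, "het", "seizures"),
    Call("P3", "GENE_X", "v4", "missense", 0.05, "het", "short stature"),
    Call("P1", "GENE_Y", "v5", "lof", 0.0001, "het", "hypotonia"),
]

rare_hits = query_gene(calls, "GENE_X")
lof_hits = query_gene(calls, "GENE_X", consequences={"lof"})
candidates = possible_compound_hets(rare_hits)
print([c.variant_id for c in rare_hits], candidates)
```

Grouping by participant before returning results is one way to keep compound heterozygous candidates identifiable; a variant-by-variant response without participant identifiers would lose that link.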