Using Networks and Prior Knowledge to Uncover Novel Rare Disease Phenotypes

Abstract

Rare diseases are characterized by low prevalence and high phenotypic diversity. Accurately identifying phenotypes associated with rare diseases is crucial for facilitating their diagnosis and management. However, this task presents significant challenges: rare disease datasets are typically small, making statistical assessments difficult, and they often report phenotypes using various terminologies, hindering the identification of common phenotypes across datasets or the recognition of those already documented in literature and knowledge databases. The Xcelerate RARE 2023 challenge was established to address the identification of phenotypes associated with rare diseases. Our team, MAGNET, developed a network-based approach that integrates patient clinical data from the Xcelerate RARE 2023 challenge with existing knowledge from Orphanet and the Human Phenotype Ontology (HPO). Our approach first builds a patient-disease-phenotype network comprising two layers: the Xcelerate layer encoding disease-patient-symptom associations, and the Prior Knowledge layer incorporating relationships between Orphanet rare diseases and HPO phenotypes. Then, for each rare disease included in the Xcelerate dataset, a Random Walk with Restart (RWR) algorithm is applied to the multilayer network to prioritize phenotype nodes. This framework effectively prioritizes phenotypes associated with rare diseases while distinguishing novel phenotypes from those already documented in knowledge bases, hence offering new perspectives for improving the diagnosis and characterization of rare diseases. Our solution was awarded the prize for the most innovative approach in the Xcelerate RARE challenge.
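The prioritization step described above can be illustrated with a minimal sketch of Random Walk with Restart on a toy network. This is not the MAGNET implementation: the node names, the edges, the single flattened adjacency matrix (in place of a true multilayer formalism), and the restart probability of 0.7 are all assumptions chosen for illustration only.

```python
import numpy as np

def random_walk_with_restart(adjacency, seed_indices, restart_prob=0.7,
                             tol=1e-10, max_iter=1000):
    """Return stationary visiting probabilities of a walk restarting at the seeds."""
    adjacency = np.asarray(adjacency, dtype=float)
    col_sums = adjacency.sum(axis=0)
    col_sums[col_sums == 0] = 1.0           # avoid division by zero for isolated nodes
    transition = adjacency / col_sums       # column-stochastic transition matrix
    p0 = np.zeros(adjacency.shape[0])
    p0[list(seed_indices)] = 1.0 / len(seed_indices)
    p = p0.copy()
    for _ in range(max_iter):
        p_next = (1 - restart_prob) * (transition @ p) + restart_prob * p0
        if np.abs(p_next - p).sum() < tol:  # L1 convergence check
            return p_next
        p = p_next
    return p

# Hypothetical toy network (all names are illustrative, not from the challenge data).
nodes = ["disease", "patient_1", "patient_2", "symptom_a", "symptom_b",
         "orpha_disease", "hpo_1", "hpo_2"]
edges = [
    # Assumed "Xcelerate-like" layer: disease-patient-symptom associations
    ("disease", "patient_1"), ("disease", "patient_2"),
    ("patient_1", "symptom_a"), ("patient_2", "symptom_a"),
    ("patient_2", "symptom_b"),
    # Assumed "prior knowledge" layer: Orphanet disease to HPO phenotypes
    ("orpha_disease", "hpo_1"), ("orpha_disease", "hpo_2"),
    # Assumed links between the two layers
    ("disease", "orpha_disease"), ("symptom_a", "hpo_1"),
]

index = {name: i for i, name in enumerate(nodes)}
adjacency = np.zeros((len(nodes), len(nodes)))
for u, v in edges:
    adjacency[index[u], index[v]] = 1.0
    adjacency[index[v], index[u]] = 1.0     # undirected edges

# Seed the walk on the disease of interest and rank the phenotype nodes.
scores = random_walk_with_restart(adjacency, [index["disease"]])
phenotype_nodes = ["symptom_a", "symptom_b", "hpo_1", "hpo_2"]
ranking = sorted(phenotype_nodes, key=lambda n: scores[index[n]], reverse=True)
for name in ranking:
    print(f"{name}: {scores[index[name]]:.4f}")
```

In this toy setting, a reported symptom that ranks highly but has no link to the prior-knowledge phenotypes of the seeded disease (here `symptom_b`) would be a candidate novel phenotype, whereas one linked to a documented HPO term (here `symptom_a`) would count as already known. How the actual approach separates novel from documented phenotypes is not detailed in the abstract, so this reading is an assumption.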

Competing Interest Statement

The authors have declared no competing interest.

Funding Statement

This work was supported by the French National Research Agency (ANR-21-CE45-0001), the European Rare Diseases Research Alliance (ERDERA), the MarMaRa Institute (AMX-19-IET-007), and the European Union's Horizon 2020 research and innovation programme (EJP RD COFUND-EJP No 825575).

Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.

Yes

The details of the IRB/oversight body that provided approval or exemption for the research described are given below:

Only existing public datasets were used in this study.

I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.

Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

Yes

I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.

Yes

Data Availability

All data used in this study were obtained from the Xcelerate RARE Open Science Data Challenge. The data can be accessed with a registered account on Synapse: RARE-X: A Rare Disease Open Science Data Challenge [Internet]. [cited 2025 Mar 17]. Available from: https://www.synapse.org/Synapse:syn51198355

