In its first year, ERDERA has brought together 10 000 harmonised genomic and phenotypic datasets from unsolved rare disease cases across Europe, creating a secure, standardised and scalable resource.

ERDERA unites 10 000 rare disease datasets in a single, secure European resource

In its first year, the European Rare Diseases Research Alliance (ERDERA) has assembled a unified collection of 10 000 harmonised genomic and phenotypic datasets from unsolved rare disease cases across European centres.

Hosted within the ERDERA Diagnostic Research environment, this resource is designed to help rare disease clinicians and researchers work more effectively across borders, with the shared aim of improving diagnosis and opening up new routes to discovery for people living with a rare disease and their families.

Beyond the number itself, the step-change here is practical: moving from fragmented, institution-specific case files to a shared, standardised dataset that can be searched, compared and analysed responsibly across multiple countries and systems.

A shared dataset built for comparability

Rare disease data are often rich, but not easily comparable. Different sites may record similar clinical features in different ways, use different naming systems, or structure files differently.

The ERDERA Diagnostic Research Programme has coordinated the collation of unsolved cases contributed by multiple partners across countries and institutions, alongside a standardisation pipeline that maps incoming records to common data models and ontologies so that datasets become interoperable and comparable.

Complementary quality control workflows then validate each dataset, safeguarding reliability for downstream analyses, cross-cohort queries and reproducible research. The result is not simply a large data pool, but a harmonised one: 10 000 datasets that “speak the same language”, making it far easier to investigate rare genetic causes and phenotype–genotype relationships at scale.
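To make the idea of datasets that “speak the same language” concrete, here is a minimal sketch of a harmonisation step followed by a quality check. It is not ERDERA's actual pipeline, which the article does not describe. The record layout, the function names `harmonise` and `quality_check`, and the small lookup table are all hypothetical. The use of Human Phenotype Ontology (HPO) identifiers as the shared vocabulary is likewise an assumption made for illustration.

```python
import re

# Hypothetical lookup from site-specific wording to HPO term IDs.
# A real pipeline would use a curated ontology service, not a dict.
LOCAL_TO_HPO = {
    "seizures": "HP:0001250",
    "epileptic seizure": "HP:0001250",
    "microcephaly": "HP:0000252",
    "small head": "HP:0000252",
    "global developmental delay": "HP:0001263",
}

# HPO identifiers have the form "HP:" followed by seven digits.
HPO_ID = re.compile(r"^HP:\d{7}$")


def harmonise(record):
    """Map one site-specific record (assumed shape) to a common shape."""
    mapped, unmapped = set(), []
    for label in record.get("features", []):
        term = LOCAL_TO_HPO.get(label.strip().lower())
        if term:
            mapped.add(term)
        else:
            unmapped.append(label)  # keep for manual curation
    return {
        "case_id": str(record.get("id", "")).strip(),
        "phenotypes": sorted(mapped),
        "unmapped": unmapped,
    }


def quality_check(harmonised):
    """Return a list of problems; an empty list means the record passes."""
    problems = []
    if not harmonised["case_id"]:
        problems.append("missing case_id")
    if not harmonised["phenotypes"]:
        problems.append("no mapped phenotype terms")
    for term in harmonised["phenotypes"]:
        if not HPO_ID.match(term):
            problems.append(f"malformed term: {term}")
    if harmonised["unmapped"]:
        problems.append(f"unmapped labels: {harmonised['unmapped']}")
    return problems


# Two sites describing similar features in different words.
site_a = {"id": "A-001", "features": ["Seizures", "Small head"]}
site_b = {"id": "B-17", "features": ["epileptic seizure", "microcephaly", "floppy"]}

for rec in (site_a, site_b):
    h = harmonise(rec)
    print(h["case_id"], h["phenotypes"], quality_check(h))
```

After mapping, both records carry the same two term identifiers even though the sites used different wording, so they can be compared directly. The second record is flagged because one label could not be mapped.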

“The idea is that these datasets are hosted on a standard platform where we have the variants of all 10 000 patients. They are then uploaded to the European Genome-phenome Archive (EGA)”, explains Vicente Yépez, from the Diagnostics group in ERDERA. “If a researcher has a valid scientific purpose and their proposal is approved, they can download and work with this data. It is a controlled access system to ensure the data is used only for high-quality research.”

Governance that enables reuse while protecting people

Bringing data together is only useful if it is done in a way that protects participants and earns trust. ERDERA has implemented a joint data controllership model and a formal data-sharing framework agreement intended to support secure, auditable and compliant use.

“We have a responsibility towards consent, confidentiality and regulatory requirements, but we also have the opportunity to help institutions reach faster, more reliable diagnoses through responsible data sharing”, says Holm Graessner, lead of the Diagnostic Research Programme in ERDERA, who has been actively involved in the process. “This aim balances the imperative to share with the duty to protect.”

To support responsible and effective use, the ERDERA Diagnostic Research Programme leverages established, community-recognised platforms, including the RD-Connect Genome-Phenome Analysis Platform (GPAP), the RD3 (Rare Disease Data about Data) database and the European Genome-phenome Archive (EGA).

Using familiar interfaces matters: it reduces friction for researchers and helps make analysis a routine part of collaborative work, rather than a bespoke, one-off effort.

Engineered to scale: from 10 000 to 100 000

While the 10 000 genomic datasets were the primary focus for the first year, the goal is to reach 100 000 (gen)omics datasets over the course of ERDERA.

To achieve this, ERDERA members are working to make additional omics data types available, including at least 3 000 RNA sequences, 800 long-read DNA sequences and around 400 optical genome mapping datasets to help identify broader chromosomal aberrations. The goal is to integrate all these different layers of data across all ERDERA partners.

This scale-up is intended to strengthen Europe’s rare disease research ecosystem and support advances in interpretation, therapeutic development and precision care.

As ERDERA Scientific Coordinator Daria Julkowska explains: “In rare diseases, individual conditions are uncommon and patient cohorts are small — which makes integration not optional, but essential. ERDERA has built the foundations of a European infrastructure for answers and is now advancing towards a federated model, empowering national capacities while connecting them into a seamless ecosystem where knowledge can flow without data having to move.” By bringing together what was previously scattered, ERDERA is creating the conditions for more diagnoses, faster discovery and better-informed care.

 
