Demo corpus. Scores are computed on a select set of biomedical papers and datasets and may be inaccurate for papers outside this corpus, because DataRank relies on network effects that improve with scale. We aim to expand this into a fully open resource pending additional funding.

Mondo: Unifying diseases for the world, by the world

(2022) · DOI: 10.1101/2022.04.13.22273750 · Source: DataRank Database

"Mondo: Unifying diseases for the world, by the world" is a dataset (2022). On the Sindex it has a DataRank of 2.6, placing it in the top 33.7% of the data-sharing corpus. It has been cited 100 times, with 72 citing works in its 1-hop citation network. Its calibrated FAIR score is 56/100.

DataRank: 2.6 · Top 34%

Dataset · Open Access · 100 citations · base score 4.6


datarank_citation_only_1hop_v6 · scope data_only · Methodology

Abstract

There are thousands of distinct disease entities and concepts, each of which is known by different and sometimes contradictory names. The lack of a unified system for managing these entities poses a major challenge for both machines and humans that need to harmonize information to better predict causes and treatments for disease. The Mondo Disease Ontology is an open, community-driven ontology that integrates key medical and biomedical terminologies, supporting disease data integration to improve diagnosis, treatment, and translational research. Mondo records the sources of all data and is continually updated, making it suitable for research and clinical applications that require up-to-date disease knowledge.

Data sources & pipeline

Pipeline: Metadata → Data-paper check → Enrichment → Citation network → Scoring

Enrichment: Pending

FAIR Checklist

Context only (not used in score)

Findable (1/2)

Has DOI

Accessible (1/2)

Open Access

Interoperable (0/2)

Reusable (1/3)

Dataset classification

FAIR checklist signals are shown for context only and do not affect DataRank scoring.

FAIR score: 56/100

F Findable

A Accessible

I Interoperable

R Reusable

Top 8% by FAIR · LLM-assessed · ✓ full text read

Calibrated FAIR score — a parallel quality metric, independent of the DataRank citation score. See the full evaluation →

DataRank Breakdown

Base Score 27% · Citation Network 73%

Base Score Contribution

0.686

From this paper's citation signal

Citation Network Contribution

1.9

From 49 citing papers with measurable signal

Learn more about DataRank methodology →

Top 3 citers driving the network score

Ranked by citation count — the same ordering the engine uses when summing log1p(C_q) over citers.

Considerations for building and using integrated single-cell atlases
Nature Methods · 2024 · 32 citations · DataRank 0.524
The Single-cell Pediatric Cancer Atlas: Data portal and open-source tools for single-cell transcriptomics of pediatric tumors
2024 · 11 citations · DataRank 0.424 · Top 50%
Exploratory electronic health record analysis with ehrapy
2023 · 2 citations · DataRank 0.165

Why this DataRank?

DataRank blends this paper's own citation count with the influence of the papers that cite it. Here, roughly 27% comes from its base citations and 73% from the citation network (49 citing papers contributed measurable signal).
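
As a quick arithmetic check, the percentage split can be reproduced from the two contribution values shown in the breakdown (0.686 and 1.9). This is only a sketch; the function name `contribution_shares` is hypothetical and not part of DataRank.

```python
def contribution_shares(base_contribution: float, network_contribution: float):
    """Return (total, base share, network share) for two already-weighted contributions."""
    total = base_contribution + network_contribution
    if total == 0:
        # Avoid dividing by zero for a paper with no signal at all.
        return 0.0, 0.0, 0.0
    return total, base_contribution / total, network_contribution / total


# Values copied from the breakdown cards on this page.
total, base_share, network_share = contribution_shares(0.686, 1.9)
print(f"DataRank ≈ {total:.1f}, base {base_share:.0%}, network {network_share:.0%}")
# prints: DataRank ≈ 2.6, base 27%, network 73%
```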

Base score B(p): log1p(citation_count) — grows sub-linearly, so a paper with 1,000 citations is not 10× a paper with 100.
Network N(p): Σ over citers of log1p(C_q) ÷ max(outdegree_q, 1). Being cited by a highly-cited paper with few references counts most.
Damping factor d = 0.85: DataRank = (1−d)·B(p) + d·N(p) — the two cards above are each already multiplied by their share.
Self-citations excluded: Citers sharing any OpenAlex author ID with this paper are filtered out before the network sum.
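
The four rules above can be put together in a few lines. The following is a minimal sketch of the stated formulas, not the DataRank engine itself: the `Citer` record, the function names, and the example citers are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from math import log1p

DAMPING = 0.85  # d, as stated in the text


@dataclass(frozen=True)
class Citer:
    """Hypothetical record for one citing work q."""
    citation_count: int    # C_q
    outdegree: int         # number of references q makes
    author_ids: frozenset  # OpenAlex author IDs of q


def base_score(citation_count: int) -> float:
    """B(p) = log1p(citation_count)."""
    return log1p(citation_count)


def network_score(citers, paper_author_ids) -> float:
    """N(p): sum of log1p(C_q) / max(outdegree_q, 1), skipping self-citations."""
    total = 0.0
    for q in citers:
        if q.author_ids & paper_author_ids:
            continue  # shares at least one author ID with the paper
        total += log1p(q.citation_count) / max(q.outdegree, 1)
    return total


def datarank(citation_count, citers, paper_author_ids, d=DAMPING) -> float:
    """DataRank = (1 - d) * B(p) + d * N(p)."""
    return (1 - d) * base_score(citation_count) + d * network_score(citers, paper_author_ids)


# Made-up citers, purely illustrative.
paper_authors = frozenset({"A_self"})
example_citers = [
    Citer(citation_count=9, outdegree=1, author_ids=frozenset({"A1"})),
    Citer(citation_count=99, outdegree=2, author_ids=frozenset({"A2"})),
    Citer(citation_count=500, outdegree=0, author_ids=frozenset({"A_self"})),  # filtered out
]
score = datarank(100, example_citers, paper_authors)
```

The sub-linear growth of the base score is easy to confirm with the same helper: `base_score(1000) / base_score(100)` is roughly 1.5, not 10.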

Citers are pulled from OpenAlex sorted by cited_by_count:desc and capped per paper, so when the cap binds we keep the highest-signal citers and the score is reproducible across reruns.
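
A request of this shape might look like the sketch below. It only builds the query URL and applies a cap locally. The cap value, the placeholder work ID `W0000000000`, the function names, and the tie-break on ID are assumptions for illustration; the page does not state the actual cap or how the engine issues its requests. The `cites:` filter, the `sort` parameter, and the `per-page` parameter follow the OpenAlex works API as we understand it, including its page-size limit of 200.

```python
from urllib.parse import urlencode

OPENALEX_WORKS_URL = "https://api.openalex.org/works"
CITER_CAP = 200  # hypothetical cap; the real value is not given on this page


def citers_query_url(openalex_work_id: str, cap: int = CITER_CAP) -> str:
    """Build a URL listing works that cite the given work, most-cited first."""
    params = {
        "filter": f"cites:{openalex_work_id}",
        "sort": "cited_by_count:desc",
        "per-page": min(cap, 200),  # assumed OpenAlex page-size limit
    }
    # Keep ':' unescaped so the URL stays readable.
    return f"{OPENALEX_WORKS_URL}?{urlencode(params, safe=':')}"


def apply_cap(citers: list[dict], cap: int = CITER_CAP) -> list[dict]:
    """Keep the `cap` most-cited citers; ties are broken by ID (an assumption)
    so the same input always yields the same selection."""
    ordered = sorted(citers, key=lambda c: (-c["cited_by_count"], c["id"]))
    return ordered[:cap]


url = citers_query_url("W0000000000")  # placeholder ID, not a real work
```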

Read the full methodology →

[Interactive citation network graph. Node colors: Center, Data Paper, Data + Open Access, Non-data, Selected & links. Node size = percentile rank.]