Demo corpus. Scores are computed on a select set of biomedical papers and datasets and may be inaccurate for papers outside this corpus, because DataRank relies on network effects that improve with scale. We aim to expand this into a fully open resource pending additional funding.
A joint NCBI and EMBL-EBI transcript set for clinical genomics and research
9.4 · Top 21% (percentile)
Dataset
584 citations · base score 6.4
Data sources & pipeline
Pipeline: Metadata → Data-paper check → Enrichment → Citation network → Scoring
Enrichment: Pending
FAIR Checklist
Context only (not used in score)
Findable (2/2)
- Has DOI
- Indexed in repositories
Accessible (0/2)
Interoperable (2/2)
- DataCite relations
- Linked datasets
Reusable (1/3)
- Dataset classification
FAIR checklist signals are shown for context only and do not affect DataRank scoring.
DataRank Breakdown
Base Score 10% · Citation Network 90%
Base Score Contribution
0.956
From this paper's citation signal
Citation Network Contribution
8.5
From 200 citing papers with measurable signal
Top citers
Why this DataRank?
DataRank blends this paper's own citation count with the influence of the papers that cite it. Here, roughly 10% comes from its base citations and 90% from the citation network (200 citing papers contributed measurable signal).
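As a quick check, the quoted split can be recomputed from the two contribution values shown in the breakdown (0.956 and 8.5). Both are displayed values and presumably rounded, so the result is approximate:

\[
\frac{0.956}{0.956 + 8.5} \approx 0.101 \;(\approx 10\%),
\qquad
\frac{8.5}{0.956 + 8.5} \approx 0.899 \;(\approx 90\%)
\]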
- Base score B(p): log1p(citation_count). It grows sub-linearly, so a paper with 1,000 citations is not 10× a paper with 100.
- Network N(p): Σ over citers q of log1p(C_q) ÷ max(outdegree_q, 1). Being cited by a highly-cited paper with few references counts most.
- Damping factor d = 0.85: DataRank = (1−d)·B(p) + d·N(p). The two cards above are each already multiplied by their share.
- Self-citations excluded: citers sharing any OpenAlex author ID with this paper are filtered out before the network sum.
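The definitions in this list can be sketched in a few lines of Python. This is a minimal illustration, not the DataRank implementation: the `Citer` class, its field names, and the function names are hypothetical, and only the formulas and d = 0.85 come from the page.

```python
import math
from dataclasses import dataclass

D = 0.85  # damping factor stated on the page


@dataclass(frozen=True)
class Citer:
    """One citing work. Field names are hypothetical, not the DataRank schema."""
    citation_count: int    # C_q: citations received by the citer
    outdegree: int         # number of references the citer makes
    author_ids: frozenset  # OpenAlex author IDs of the citer


def base_score(citation_count: int) -> float:
    """B(p) = log1p(citation_count)."""
    return math.log1p(citation_count)


def network_score(citers, target_author_ids) -> float:
    """N(p): sum of log1p(C_q) / max(outdegree_q, 1) over non-self citers."""
    total = 0.0
    for q in citers:
        # Skip citers that share any author ID with the target (self-citation).
        if q.author_ids & target_author_ids:
            continue
        total += math.log1p(q.citation_count) / max(q.outdegree, 1)
    return total


def datarank(citation_count, citers, target_author_ids, d=D):
    """Return (base share, network share, total) per (1-d)*B(p) + d*N(p)."""
    base_part = (1 - d) * base_score(citation_count)
    network_part = d * network_score(citers, target_author_ids)
    return base_part, network_part, base_part + network_part


# Base share for the 584 citations shown on this page, with no citers supplied.
base_part, network_part, total = datarank(584, [], frozenset())
print(round(base_part, 3))  # 0.956
```

With 584 citations the base share comes out at about 0.956, which matches the Base Score Contribution card. The network share cannot be reproduced here because the individual citers are not listed on the page.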
Citers are pulled from OpenAlex sorted by cited_by_count:desc and capped per paper, so when the cap binds we keep the highest-signal references and the score is reproducible across reruns.
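The sort-then-cap step might look like the sketch below. It assumes each citer is a dict with `id` and `cited_by_count` keys. The `select_citers` name, the `cap` parameter, and the tie-break on `id` are illustrative assumptions; the page does not state the cap value or how ties are ordered.

```python
def select_citers(citers, cap):
    """Keep the `cap` most-cited citers, ordered by cited_by_count descending.

    The secondary sort on `id` is an assumption added here so that ties
    resolve the same way on every run.
    """
    ranked = sorted(citers, key=lambda c: (-c["cited_by_count"], c["id"]))
    return ranked[:cap]


# Hypothetical citers; the IDs and counts are made up for illustration.
sample = [
    {"id": "W1", "cited_by_count": 5},
    {"id": "W2", "cited_by_count": 50},
    {"id": "W3", "cited_by_count": 5},
]
kept = select_citers(sample, cap=2)
print([c["id"] for c in kept])  # ['W2', 'W1']
```

Sorting before truncating means that a binding cap drops only the least-cited citers. A fixed tie-break is one way to keep the selection stable between reruns.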