Demo corpus. Scores are computed on a selected set of biomedical papers and datasets and may be inaccurate for papers outside this corpus, because DataRank relies on network effects that improve with scale. We aim to expand this into a fully open resource pending additional funding.

A joint NCBI and EMBL-EBI transcript set for clinical genomics and research

Name: A joint NCBI and EMBL-EBI transcript set for clinical genomics and research
Published: 2022

Nature (2022), DOI: 10.1038/s41586-022-04558-8
Source: DataRank Database

DataRank: 9.4 · Top 21%

Dataset

584 citations · base score 6.4

datarank_citation_only_1hop_v6 · scope data_only

Data sources & pipeline

Pipeline: Metadata → Data-paper check → Enrichment → Citation network → Scoring

Enrichment: Pending

FAIR Checklist

Context only (not used in score)

Findable (2/2)

Has DOI
Indexed in repositories

Accessible (0/2)

Interoperable (2/2)

DataCite relations
Linked datasets

Reusable (1/3)

Dataset classification

FAIR checklist signals are shown for context only and do not affect DataRank scoring.

DataRank Breakdown

Base Score 10% · Citation Network 90%

Base Score Contribution: 0.956 (from this paper's citation signal)

Citation Network Contribution: 8.5 (from 200 citing papers with measurable signal)


Top citers

Why this DataRank?

DataRank blends this paper's own citation count with the influence of the papers that cite it. Here, roughly 10% comes from its base citations and 90% from the citation network (200 citing papers contributed measurable signal).

Base score B(p): log1p(citation_count) — grows sub-linearly, so a paper with 1,000 citations is not 10× a paper with 100.
Network N(p): Σ over citers of log1p(C_q) ÷ max(outdegree_q, 1). Being cited by a highly-cited paper with few references counts most.
Damping factor d = 0.85: DataRank = (1−d)·B(p) + d·N(p) — the two cards above are each already multiplied by their share.
Self-citations excluded: Citers sharing any OpenAlex author ID with this paper are filtered out before the network sum.
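
The formulas above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the DataRank implementation: the `Citer` class, its field names, and the function names are hypothetical, and only the formulas and d = 0.85 come from the text.

```python
import math
from dataclasses import dataclass

DAMPING = 0.85  # d, as stated in the text


@dataclass
class Citer:
    """Hypothetical record for one citing paper q."""
    cited_by_count: int                    # C_q
    outdegree: int                         # number of references q makes
    author_ids: frozenset = frozenset()    # OpenAlex author IDs of q


def base_score(citation_count):
    # B(p) = log1p(citation_count): grows sub-linearly with citations
    return math.log1p(citation_count)


def network_score(citers, paper_author_ids=frozenset()):
    # N(p) = sum over citers of log1p(C_q) / max(outdegree_q, 1)
    total = 0.0
    for q in citers:
        # Self-citation filter: skip citers sharing any author ID with p
        if q.author_ids & paper_author_ids:
            continue
        total += math.log1p(q.cited_by_count) / max(q.outdegree, 1)
    return total


def datarank(citation_count, citers, paper_author_ids=frozenset()):
    # DataRank = (1 - d) * B(p) + d * N(p); returns both parts and the sum
    base_part = (1 - DAMPING) * base_score(citation_count)
    network_part = DAMPING * network_score(citers, paper_author_ids)
    return base_part, network_part, base_part + network_part
```

With 584 citations the base part works out to 0.15 × log1p(584) ≈ 0.956, which agrees with the Base Score Contribution shown on this page. The network part depends on the individual citers, which are not listed here, so it cannot be reproduced from this page alone.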

Citers are pulled from OpenAlex sorted by cited_by_count:desc and capped per paper, so when the cap binds we keep the highest-signal references and the score is reproducible across reruns.
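
A rough sketch of how such a capped, sorted pull could look is given below. It only builds the request URL and applies the cap locally; it does not call the network. The `cites:` filter, `sort`, and `per-page` parameters follow the public OpenAlex API as commonly documented, while the cap value of 200, the placeholder work ID, the helper names, and the tie-break on ID are assumptions made for illustration, not details confirmed by this page.

```python
from urllib.parse import urlencode

OPENALEX_WORKS = "https://api.openalex.org/works"


def citers_url(openalex_work_id, cap=200):
    """Build a query for works citing `openalex_work_id`, most-cited first.

    `cap` is a hypothetical per-paper limit; the real cap is not stated.
    """
    params = {
        "filter": f"cites:{openalex_work_id}",
        "sort": "cited_by_count:desc",
        "per-page": cap,
    }
    # safe=":" keeps the colons readable instead of percent-encoding them
    return f"{OPENALEX_WORKS}?{urlencode(params, safe=':')}"


def top_citers(citers, cap):
    """Keep the `cap` most-cited citers from a list of dicts.

    Ties are broken by ID (an assumption) so repeated runs over the
    same input always select the same citers.
    """
    ranked = sorted(citers, key=lambda c: (-c["cited_by_count"], c["id"]))
    return ranked[:cap]
```

Sorting before truncating is what makes the result stable: the same input always yields the same capped set, regardless of the order in which the records arrived.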


Authors (40)

Shashikant Pujar, Jane E. Loveland, Alex Astashyn, Ruth Bennett, Andrew Berry
