🏆 Finalist — NIH Data Sharing Index (“S-Index”) Challenge
Demo corpus. Scores are computed on a select set of biomedical papers and datasets and may be inaccurate for papers outside this corpus, because DataRank relies on network effects that improve with scale. We aim to expand this into a fully open resource pending additional funding.

UniProt: the universal protein knowledgebase in 2021

(2020) · DOI: 10.1093/nar/gkaa1100 · Source: DataRank Database
DataRank 16.0 · Top 11% (percentile)
Dataset · Open Access
7014 citations · base score 8.9
datarank_citation_only_1hop_v6 · scope data_only
Data sources & pipeline
Pipeline: Metadata → Data-paper check → Enrichment → Citation network → Scoring
Enrichment: Pending

FAIR Checklist

Context only (not used in score)
Findable (1/2)
  • Has DOI
Accessible (1/2)
  • Open Access
Interoperable (0/2)
Reusable (1/3)
  • Dataset classification

    FAIR checklist signals are shown for context only and do not affect DataRank scoring.

    DataRank Breakdown

    Base Score 8% · Citation Network 92%

    Base Score Contribution

    1.3

    From this paper's citation signal

    Citation Network Contribution

    14.7

    From 190 citing papers with measurable signal

    Learn more about DataRank methodology →

    Why this DataRank?

    DataRank blends this paper's own citation count with the influence of the papers that cite it. Here, roughly 8% comes from its base citations and 92% from the citation network (190 citing papers contributed measurable signal).
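
The split quoted above can be checked by hand. The following is a minimal sketch, assuming the damping factor d = 0.85 and the base score log1p(citation_count) stated in the breakdown on this page. The variable names are illustrative and not part of DataRank, and the network contribution is backed out from the rounded displayed score rather than recomputed from the citers.

```python
import math

citation_count = 7014   # citation count shown on this page
datarank = 16.0         # displayed DataRank (already rounded)
d = 0.85                # damping factor stated in the breakdown

base = math.log1p(citation_count)                     # B(p), about 8.9
base_contribution = (1 - d) * base                    # about 1.3
network_contribution = datarank - base_contribution   # about 14.7 (derived, not recomputed)

base_share = base_contribution / datarank
print(f"base {base_contribution:.1f} ({base_share:.0%}), "
      f"network {network_contribution:.1f} ({1 - base_share:.0%})")
# prints: base 1.3 (8%), network 14.7 (92%)
```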

    Base score B(p)
    log1p(citation_count) — grows sub-linearly, so a paper with 1,000 citations is not 10× a paper with 100.
    Network N(p)
    Σ over citers q of log1p(C_q) ÷ max(outdegree_q, 1). Being cited by a highly-cited paper with few references counts most.
    Damping factor d = 0.85
    DataRank = (1−d)·B(p) + d·N(p) — the two cards above are each already multiplied by their share.
    Self-citations excluded
    Citers sharing any OpenAlex author ID with this paper are filtered out before the network sum.
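
The definitions above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not DataRank's actual code: the `Citer` record, its field names, and the function names are hypothetical, and the citers are assumed to be already fetched and capped.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Citer:
    """Hypothetical record for one citing paper q."""
    citation_count: int                     # C_q: citations received by q
    outdegree: int                          # number of references q makes
    author_ids: frozenset = field(default_factory=frozenset)  # OpenAlex author IDs


def base_score(citation_count):
    """B(p) = log1p(citation_count)."""
    return math.log1p(citation_count)


def network_score(citers, paper_author_ids):
    """N(p): sum of log1p(C_q) / max(outdegree_q, 1) over non-self citers."""
    paper_author_ids = set(paper_author_ids)
    total = 0.0
    for q in citers:
        if paper_author_ids & q.author_ids:
            continue  # shares an author ID with the paper: treated as a self-citation
        total += math.log1p(q.citation_count) / max(q.outdegree, 1)
    return total


def datarank(citation_count, citers, paper_author_ids, d=0.85):
    """DataRank = (1 - d) * B(p) + d * N(p)."""
    return (1 - d) * base_score(citation_count) + d * network_score(citers, paper_author_ids)


# Toy data (invented for illustration): the third citer shares author "A9"
# with the paper, so it is dropped before the network sum.
toy_citers = [
    Citer(100, 10, frozenset({"A1"})),
    Citer(50, 0, frozenset({"A2"})),   # outdegree 0 is clamped to 1
    Citer(999, 5, frozenset({"A9"})),
]
toy_score = datarank(20, toy_citers, {"A9"})
```

Dividing by `max(outdegree, 1)` is what makes a citation from a paper with few references count more than one from a paper with a long reference list.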

    Citers are pulled from OpenAlex sorted by cited_by_count:desc and capped per paper, so when the cap binds we keep the highest-signal references and the score is reproducible across reruns.
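
One way to picture the cap is as a deterministic top-k selection. The sketch below is an assumption-laden illustration: the dictionary keys mirror OpenAlex's `id` and `cited_by_count` fields, but the function name, the cap value, and the tie-break on `id` are hypothetical choices made here so that equal counts always resolve the same way; the page does not say how DataRank breaks ties.

```python
def select_citers(citers, cap):
    """Keep the `cap` most-cited citers; ties are broken by ID (an assumption)."""
    ranked = sorted(citers, key=lambda c: (-c["cited_by_count"], c["id"]))
    return ranked[:cap]


# Toy data (invented): two citers tie on cited_by_count.
sample = [
    {"id": "W3", "cited_by_count": 5},
    {"id": "W1", "cited_by_count": 5},
    {"id": "W2", "cited_by_count": 40},
]
kept_ids = [c["id"] for c in select_citers(sample, cap=2)]
print(kept_ids)  # prints: ['W2', 'W1']
```

Because the ordering depends only on the data and not on arrival order, rerunning the selection on the same citers yields the same subset.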

    Authors (99)

    María Martin (ORCID), Sandra Orchard (ORCID), Michele Magrane (ORCID), Rahat Agivetova, Shadab Ahmad (ORCID)

    Related Papers