🏆 Finalist — NIH Data Sharing Index (“S-Index”) Challenge
Demo corpus. Scores are computed on a select set of biomedical papers and datasets and may be inaccurate for papers outside this corpus, because DataRank relies on network effects that improve with scale. We aim to expand this into a fully open resource pending additional funding.

anndata: Access and store annotated data matrices

Journal of Open Source Software (2024) · 10.21105/joss.04371 · Source: DataRank Database

anndata: Access and store annotated data matrices is a dataset published in Journal of Open Source Software (2024). On the Sindex it has a DataRank of 2.9, placing it in the top 32.5% of the data-sharing corpus. It has been cited 182 times, with 171 citing works in its 1-hop citation network. Its calibrated FAIR score is 49/100.

DataRank: 2.9 · Top 32% (percentile)
Dataset · Open Access · 182 citations · base score 5.0
datarank_citation_only_1hop_v6 · scope data_only · Methodology
Data sources & pipeline
Pipeline: Metadata → Data-paper check → Enrichment → Citation network → Scoring
Enrichment: Pending

FAIR Checklist

Context only (not used in score)
Findable (1/2)
  • Has DOI
Accessible (1/2)
  • Open Access
Interoperable (0/2)
Reusable (1/3)
  • Dataset classification

    FAIR checklist signals are shown for context only and do not affect DataRank scoring.

    FAIR score: 49/100
    F Findable: 33
    A Accessible: 80
    I Interoperable: 50
    R Reusable: 33
    Top 22% by FAIR · LLM-assessed · ✓ full text read

    Calibrated FAIR score — a parallel quality metric, independent of the DataRank citation score. See the full evaluation →

    DataRank Breakdown

    Base Score 26% · Citation Network 74%

    Base Score Contribution: 0.748 (from this paper's citation signal)

    Citation Network Contribution: 2.1 (from 92 citing papers with measurable signal)

    Learn more about DataRank methodology →

    Top 5 citers driving the network score

    Ranked by citation count, the same ordering the engine uses when summing log1p(C_q) over citers q.

    1. PyTorch: An Imperative Style, High-Performance Deep Learning Library
      arXiv (Cornell University) · 2019 · 16,186 citations · DataRank 1.5
    2. Integrated analysis of multimodal single-cell data
      Cell · 2021 · 15,542 citations · DataRank 1.4
    3. Tidy Data
      Journal of Statistical Software · 2014 · 885 citations · DataRank 1.0
    4. Over 1000 tools reveal trends in the single-cell RNA-seq analysis landscape
      Genome Biology · 2021 · 187 citations · DataRank 4.4 · Top 29%

    Why this DataRank?

    DataRank blends this paper's own citation count with the influence of the papers that cite it. Here, roughly 26% comes from its base citations and 74% from the citation network (92 citing papers contributed measurable signal).
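The 26% / 74% split can be reproduced from the two contribution values displayed on this page. The snippet below is a minimal sketch of that arithmetic only; the function name `contribution_shares` is hypothetical and not part of DataRank. Because the displayed values are rounded, their sum (about 2.85) need not match the headline score exactly.

```python
def contribution_shares(base_contribution: float, network_contribution: float):
    """Return each contribution as a fraction of their sum.

    Hypothetical helper for illustration; inputs are the already-weighted
    values shown on the page, i.e. (1 - d) * B(p) and d * N(p).
    """
    total = base_contribution + network_contribution
    return base_contribution / total, network_contribution / total


# Values as displayed on this page (rounded by the page itself).
base_share, network_share = contribution_shares(0.748, 2.1)
print(f"base share: {base_share:.0%}")        # base share: 26%
print(f"network share: {network_share:.0%}")  # network share: 74%
```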

    Base score B(p)
    log1p(citation_count) — grows sub-linearly, so a paper with 1,000 citations is not 10× a paper with 100.
    Network N(p)
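    Σ over citers q of log1p(C_q) ÷ max(outdegree_q, 1), where C_q is the citation count of citer q and outdegree_q is the number of references it makes. Being cited by a highly-cited paper with few references counts most.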
    Damping factor d = 0.85
    DataRank = (1−d)·B(p) + d·N(p) — the two cards above are each already multiplied by their share.
    Self-citations excluded
    Citers sharing any OpenAlex author ID with this paper are filtered out before the network sum.
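The four definitions above can be put together as a short Python sketch. This is an illustration of the formulas as stated on this page, not the DataRank engine itself: the function names and the citer record fields (`author_ids`, `cited_by_count`, `outdegree`) are assumptions chosen for the example, and the toy data is made up.

```python
import math

DAMPING = 0.85  # d, as stated on the page


def base_score(citation_count: int) -> float:
    """B(p) = log1p(citation_count)."""
    return math.log1p(citation_count)


def network_score(paper_author_ids: set, citers: list) -> float:
    """N(p) = sum over citers q of log1p(C_q) / max(outdegree_q, 1).

    Citers sharing any author ID with the paper are skipped (self-citations).
    The record layout of each citer is an assumption for this sketch.
    """
    total = 0.0
    for citer in citers:
        if paper_author_ids & citer["author_ids"]:
            continue  # self-citation: excluded before the network sum
        total += math.log1p(citer["cited_by_count"]) / max(citer["outdegree"], 1)
    return total


def datarank(citation_count: int, paper_author_ids: set, citers: list,
             d: float = DAMPING):
    """Return (score, base_part, network_part) with score = (1-d)*B + d*N."""
    base_part = (1 - d) * base_score(citation_count)
    network_part = d * network_score(paper_author_ids, citers)
    return base_part + network_part, base_part, network_part


# Toy data, invented for illustration only.
toy_citers = [
    {"author_ids": {"A1"}, "cited_by_count": 100, "outdegree": 10},  # self-citation
    {"author_ids": {"A9"}, "cited_by_count": 0, "outdegree": 0},     # contributes 0
    {"author_ids": {"A2"}, "cited_by_count": 9, "outdegree": 2},
]
score, base_part, network_part = datarank(3, {"A1"}, toy_citers)
```

In this toy case only the third citer contributes, so N(p) is log1p(9) / 2. The sub-linear growth of the base score is also easy to check: `base_score(1000) / base_score(100)` is roughly 1.5, not 10.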

    Citers are pulled from OpenAlex sorted by cited_by_count:desc and capped per paper, so when the cap binds we keep the highest-signal references and the score is reproducible across reruns.
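A rough sketch of what such a capped, sorted citer query could look like follows. It uses the public OpenAlex works endpoint with its `cites:` filter, `sort`, and `per-page` parameters. The cap value, the work ID, the helper names, and the tie-break on `id` are assumptions for illustration; the page does not state the actual cap or how ties are broken.

```python
from urllib.parse import urlencode

OPENALEX_WORKS = "https://api.openalex.org/works"


def citers_query_url(work_id: str, cap: int = 200) -> str:
    """Build an OpenAlex query for works citing `work_id`, most-cited first.

    `cap` is a hypothetical per-paper limit; OpenAlex pages hold at most
    200 results, so a larger cap would need pagination (not shown here).
    """
    params = {
        "filter": f"cites:{work_id}",
        "sort": "cited_by_count:desc",
        "per-page": min(cap, 200),
    }
    return f"{OPENALEX_WORKS}?{urlencode(params, safe=':')}"


def keep_top_citers(citers: list, cap: int) -> list:
    """Keep the `cap` most-cited citers.

    Ties are broken by `id` so reruns give the same set; this tie-break
    is an assumption of the sketch, not documented DataRank behavior.
    """
    ranked = sorted(citers, key=lambda c: (-c["cited_by_count"], c["id"]))
    return ranked[:cap]


# "W123" is a placeholder work ID, not the ID of this dataset.
url = citers_query_url("W123", cap=50)
```

Sorting before truncating is what makes the cap deterministic: the same citers survive on every run as long as their citation counts are unchanged.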


    [Interactive citation network graph. Node colors: Center, Data Paper, Data + Open Access, Non-data, Selected & links. Node size = percentile rank.]