🏆 Finalist: NIH Data Sharing Index (“S-Index”) Challenge

Methodology

DataRank v4.0 Citation-Only 1-Hop Model

DataRank now uses a citation-only 1-hop approximation. FAIR/DataCite/repository metadata remain visible for context and auditing, but do not affect score computation.

v4.0: Citation-Only 1-Hop DataRank

Current

March 2026

Core equations

Seed base score B(p) uses only the paper's citation count. Network score N(p) uses only citer citation counts and reference outdegree, with self-citations removed by OpenAlex author-ID overlap.
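The two terms can be illustrated with a minimal sketch. The exact functional forms are not stated in the text above, so everything marked ASSUMPTION below is a guess for illustration only: the log transform in the base score, the PageRank-style division by the citer's reference outdegree, and the damped blend of the two terms. The `Paper` record and its field names are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Paper:
    """Hypothetical record; field names are illustrative, not the real schema."""
    citation_count: int
    reference_count: int = 0            # outdegree: number of works this paper cites
    author_ids: frozenset = frozenset()  # e.g. OpenAlex author IDs


def base_score(paper: Paper) -> float:
    # ASSUMPTION: log transform of the raw count. The source only says
    # that B(p) depends on the paper's citation count alone.
    return math.log1p(paper.citation_count)


def is_self_citation(paper: Paper, citer: Paper) -> bool:
    # A citation counts as a self-citation when the author-ID sets overlap.
    return bool(paper.author_ids & citer.author_ids)


def network_score(paper: Paper, citers: list) -> float:
    # ASSUMPTION: each citer passes on its own base score divided by its
    # reference outdegree, as in a single PageRank step.
    total = 0.0
    for citer in citers:
        if is_self_citation(paper, citer):
            continue  # self-citations are removed before scoring
        total += base_score(citer) / max(citer.reference_count, 1)
    return total


def datarank_1hop(paper: Paper, citers: list, damping: float = 0.85) -> float:
    # ASSUMPTION: PageRank-like blend of the base and network terms.
    return (1.0 - damping) * base_score(paper) + damping * network_score(paper, citers)
```

Because only direct citers are visited, the cost per paper is linear in its citation count, which is consistent with the 1-hop design described here.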

Scoring scope

  • Active runtime damping: DATARANK_DAMPING (default 0.85)
  • SSCORE_CITATION_BOOST is still accepted for backward compatibility with existing environments, but it is a no-op
  • FAIR/DataCite/file/download/OA signals are retained in API/UI as metadata only (non-scoring)
  • Canonical API fields are emitted with legacy aliases mapped to the same numeric values
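As a rough sketch of how these settings might be handled, the snippet below reads the damping factor with its 0.85 default, tolerates the legacy boost variable without using it, and copies canonical values onto legacy alias names. The function names, the returned dictionary shape, and the alias names passed in by the caller are all hypothetical; only the two environment variable names and the default come from the list above.

```python
import os

DEFAULT_DAMPING = 0.85


def load_scoring_config(env=None):
    """Read scoring settings from an environment-like mapping."""
    env = os.environ if env is None else env
    damping = float(env.get("DATARANK_DAMPING", DEFAULT_DAMPING))
    # SSCORE_CITATION_BOOST is tolerated so older deployments still start,
    # but its value is never read into the scoring path.
    ignored = [name for name in ("SSCORE_CITATION_BOOST",) if name in env]
    return {"damping": damping, "ignored": ignored}


def with_legacy_aliases(payload, aliases):
    """Copy each canonical value to its legacy alias names.

    `aliases` maps a canonical field name to a list of legacy names;
    the actual field names used by the API are not given in the source.
    """
    out = dict(payload)
    for canonical, legacy_names in aliases.items():
        if canonical in payload:
            for legacy in legacy_names:
                out[legacy] = payload[canonical]  # same numeric value
    return out
```

For example, `load_scoring_config({"DATARANK_DAMPING": "0.5"})` would yield a damping of 0.5, and an empty mapping falls back to 0.85.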

Strengths

  • Simple and auditable: score depends only on citation counts and graph structure
  • Stable interpretation across sources: no metadata-weight tuning in scoring
  • 1-hop model is fast enough for live DOI streaming

Limitations

  • Depends on citation coverage and latency in external indexes
  • 1-hop approximation omits multi-hop citation propagation
  • Percentile rankings are relative to the indexed corpus and can shift when the corpus changes

Percentile Ranking

Papers are sorted by DataRank and mapped into equal quintiles (S1–S5). Percentiles and quintiles are recomputed on each full corpus backfill.
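The ranking step could look roughly like the sketch below. It makes several assumptions that the text does not confirm: that S5 is the highest-scoring fifth (flip the labels if the site uses the opposite order), that the percentile is the rank position divided by corpus size, and that ties are broken by input order. The function name `assign_tiers` is hypothetical.

```python
def assign_tiers(scores):
    """Map {doi: datarank} to {doi: (percentile, tier)}.

    ASSUMPTION: tiers run from S1 (lowest fifth) to S5 (highest fifth).
    """
    ordered = sorted(scores, key=lambda doi: scores[doi])  # ascending score
    n = len(ordered)
    result = {}
    for rank, doi in enumerate(ordered):
        percentile = 100.0 * (rank + 1) / n  # share of corpus at or below this paper
        quintile = rank * 5 // n + 1         # equal-sized bins numbered 1..5
        result[doi] = (percentile, f"S{quintile}")
    return result
```

Since both values depend on the whole corpus, adding papers during a backfill can move an existing paper to a different tier even when its own score is unchanged.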


Researcher Score

Each researcher's score is the mean of the DataRank scores of their top 10 papers within the indexed corpus, so it reflects the author's highest-scoring work in the database.

Where p1, p2, …, pk are the author's papers sorted by DataRank descending.
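A minimal sketch of this calculation follows. It assumes that an author with fewer than 10 indexed papers is averaged over the papers they have (so k is at most 10), and that an author with no indexed papers scores 0.0; neither edge case is specified above. The function name is hypothetical.

```python
def researcher_score(paper_scores, top_n=10):
    """Mean of the author's top_n DataRank scores in the indexed corpus."""
    top = sorted(paper_scores, reverse=True)[:top_n]  # p1..pk, descending
    if not top:
        return 0.0  # ASSUMPTION: no indexed papers means a score of zero
    return sum(top) / len(top)
```

For instance, an author whose indexed papers score 1.0, 3.0, and 2.0 would receive 2.0, while an author with twelve papers has the two lowest scores left out of the mean.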

Scope

  • Author scores reflect only papers indexed in this database, not the author's full publication record
  • Co-authors on the same paper currently receive the same paper-level DataRank contribution
  • Click on any author to see which papers contributed to their score

Computation Audit

Each run is logged to datarank_snapshots with the algorithm version (current: datarank_citation_only_1hop_v4), damping factor, corpus size, and score distribution.
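To make the audit record concrete, here is one way such a snapshot row might be assembled before it is written. Only the version string and the four logged quantities come from the text; the key names, the choice of summary statistics, and the timestamp field are assumptions for illustration.

```python
import statistics
from datetime import datetime, timezone

ALGORITHM_VERSION = "datarank_citation_only_1hop_v4"


def build_snapshot(scores, damping=0.85):
    """Summarize one scoring run as a dict (key names are hypothetical)."""
    if scores:
        distribution = {
            "min": min(scores),
            "median": statistics.median(scores),
            "mean": statistics.fmean(scores),
            "max": max(scores),
        }
    else:
        distribution = {}  # nothing to summarize for an empty corpus
    return {
        "algorithm_version": ALGORITHM_VERSION,
        "damping": damping,
        "corpus_size": len(scores),
        "score_distribution": distribution,
        "created_at": datetime.now(timezone.utc).isoformat(),  # assumed field
    }
```

Storing the version and damping alongside the distribution lets two runs be compared without rerunning the scoring.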

See it in action

Search any DOI and get a DataRank score, percentile tier, and base-vs-network breakdown in seconds.