About DataRank

The metric data sharing has been waiting for

DataRank measures the impact of a scientific paper by combining its own citation count with the citations of the papers that cite it. FAIR and DataCite metadata sit alongside each score for context — they aren't baked into the number.

Multi-source corpus

NIH-funded papers, enriched with OpenAlex citation data and DataCite/FAIR repository signals for context.

DataRank engine

Each paper's own citations plus a one-step propagation through the papers that cite it. Heavily cited citers carry more weight.

Percentile ranking

Data papers are mapped to a 0–100 percentile against the rest of the data-paper corpus. The 99th percentile is the top 1%.

The pipeline

From DOI to DataRank

Six steps from raw DOI metadata to a percentile-ranked score.

Aggregate metadata (🐬 DOIphin)

DOIphin, our federated aggregator, crosswalks each DOI across 14+ scholarly APIs (CrossRef, OpenAlex, DataCite, Zenodo, Dryad, and more) into one unified record, and builds the citation/link graph that DataRank scores. FAIR/DataCite signals are kept as context and do not enter the score.

Identify data papers (DrPaper)

Our DrPaper classifier — a fine-tuned SciBERT model — reads each paper and decides whether its main contribution is a dataset (cohort, atlas, benchmark, database) versus a method, theory, or review. Only data papers receive a percentile ranking.

FAIR checklist

Each paper is evaluated against the FAIR principles (Findable, Accessible, Interoperable, Reusable). The results are shown alongside scores for transparency; they are not part of the score itself.

Build the citation graph

For each paper we fetch its citers from OpenAlex — the papers that cite it — to measure downstream influence.
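
As a rough illustration of this step, the sketch below builds an OpenAlex query for the works that cite a given paper and reads each citer's own citation count from one page of results. It assumes the public OpenAlex `works` endpoint with its `cites` filter and the `cited_by_count` field. The function names are hypothetical, and pagination, retries, and rate limiting are left out. It is not a description of how DOIphin itself makes these calls.

```python
from urllib.parse import urlencode

OPENALEX_WORKS = "https://api.openalex.org/works"


def citers_url(work_id: str, per_page: int = 200) -> str:
    """Build a query URL for the works that cite `work_id` (e.g. "W123")."""
    # safe=":" keeps the "cites:W123" filter readable instead of "cites%3AW123".
    query = urlencode({"filter": f"cites:{work_id}", "per-page": per_page}, safe=":")
    return f"{OPENALEX_WORKS}?{query}"


def parse_citers(page: dict) -> dict[str, int]:
    """Map each citer's short OpenAlex ID to its own citation count."""
    citers = {}
    for work in page.get("results", []):
        # IDs arrive as full URLs such as "https://openalex.org/W1".
        short_id = work["id"].rsplit("/", 1)[-1]
        # Assumption: treat a missing count as zero citations.
        citers[short_id] = work.get("cited_by_count", 0)
    return citers
```

Fetching the URL with any HTTP client and passing the decoded JSON to `parse_citers` yields the citer list, along with the citation counts that the scoring step uses as weights.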

Compute DataRank

Combine the paper's own citation count with a one-step propagation through its citers, weighted so heavily cited citers count more. Self-citations are removed.
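
The exact weighting is documented in the methodology, not here, so the following is only a minimal sketch of the idea under stated assumptions. It treats a self-citation as any citer sharing an author with the paper, weights each remaining citer by `log1p` of its own citation count, and blends the two terms with a hypothetical factor `alpha`. None of these choices is confirmed as the v6.0 formula.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Paper:
    paper_id: str
    authors: frozenset[str]
    cited_by_count: int = 0


def datarank_sketch(paper: Paper, citers: list[Paper], alpha: float = 0.5) -> float:
    """Own citations plus a one-step, citation-weighted term over citers."""
    # Assumption: a citer sharing any author with the paper is a self-citation.
    external = [c for c in citers if not (c.authors & paper.authors)]
    # Own citation count, after self-citations are removed.
    own = len(external)
    # Assumption: log1p damping, so heavily cited citers count more
    # without letting a single blockbuster dominate the sum.
    propagated = sum(math.log1p(c.cited_by_count) for c in external)
    return own + alpha * propagated
```

Under these assumptions, a paper cited once by a highly cited paper outscores one cited once by an uncited paper, and a citation from a co-author contributes nothing.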

Rank as a percentile

Sort data papers by DataRank and map each to a percentile in [0, 100] within the data-paper corpus. A paper at the 99th percentile is in the top 1% by impact.
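
One common way to do this mapping is sketched below. It assumes a paper's percentile is the share of the data-paper corpus scoring strictly below it, so tied papers share a percentile. The tie-handling rule and the function name are illustrative assumptions, not the documented method.

```python
from bisect import bisect_left


def percentile_ranks(scores: dict[str, float]) -> dict[str, float]:
    """Map each paper ID to the percentage of papers scoring strictly lower."""
    if not scores:
        return {}
    ordered = sorted(scores.values())
    n = len(ordered)
    # bisect_left returns how many scores are strictly below s.
    return {pid: 100.0 * bisect_left(ordered, s) / n for pid, s in scores.items()}
```

With this convention, a paper at percentile 99 outscores 99% of the corpus, which matches the "top 1%" reading above.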

Go deeper

Explore the methodology, meet the team, or browse the open-source tools we've built.

Methodology

Full technical details on the citation-only 1-hop scoring system (v6.0) and percentile ranking.

Team

Meet the computational scientists and open-science advocates behind DataRank.

Resources

Open-source notebooks, datasets, APIs, and models — all freely available.
