🏆 Finalist — NIH Data Sharing Index (“S-Index”) Challenge
Demo corpus. Scores are computed on a select set of biomedical papers and datasets and may be inaccurate for papers outside this corpus, because DataRank relies on network effects that improve with scale. We aim to expand this into a fully open resource pending additional funding.

Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences

Proceedings of the National Academy of Sciences (2021) · DOI: 10.1073/pnas.2016239118 · Source: DataRank Database
DataRank: 13.8 (Top 17%)
Open Access
2999 citations · base score 8.0
datarank_citation_only_1hop_v4 · Methodology
Data sources & pipeline
Pipeline: CrossRef, SciBERT, doi-metadata, OpenAlex, DataRank
Enrichment: Pending
FAIR Checklist (context only, not used in score)
F (Findable): Has DOI
A (Accessible): Open Access
I (Interoperable)
R (Reusable)

FAIR checklist signals are shown for context only and do not affect DataRank scoring.

DataRank Breakdown

Base Component 9% · Network Component 91%

Base Score Contribution

1.2

From this paper's citation signal

Citation Network Contribution

12.6

From 197 citing papers with measurable signal
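
The breakdown figures can be cross-checked with simple arithmetic. The sketch below assumes that the displayed DataRank is the plain sum of the two contributions and that each percentage is a component's share of that sum, rounded to a whole percent. The page does not state either rule; they are inferred only because they reproduce the numbers shown. The variable names are illustrative, and the relationship between the base score of 8.0 and the 1.2 contribution is not modeled here because the page does not describe it.

```python
# Values copied from the DataRank Breakdown on this page.
base_contribution = 1.2      # "Base Score Contribution"
network_contribution = 12.6  # "Citation Network Contribution"

# Assumption: the DataRank score is the sum of the two contributions.
total = base_contribution + network_contribution

# Assumption: each percentage is the component's share of the total,
# rounded to the nearest whole percent.
base_share = round(100 * base_contribution / total)
network_share = round(100 * network_contribution / total)

print(f"DataRank: {total:.1f}")
print(f"Base Component: {base_share}%")
print(f"Network Component: {network_share}%")
```

Under these assumptions the result matches the displayed score of 13.8 and the 9% / 91% split.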


Authors (11)

Joshua Meier (ORCID), Tom Sercu (ORCID), Siddharth Goyal, Zeming Lin (ORCID), Jason Liu (ORCID)

Related Papers