Precision annotation of digital samples in NCBI’s gene expression omnibus is a dataset published in Scientific Data (2017). On the Sindex it has a DataRank of 1.9, placing it in the top 36% of the data-sharing corpus. It has been cited 53 times, with 38 citing works in its 1-hop citation network. Its calibrated FAIR score is 65/100.
The Gene Expression Omnibus (GEO) contains more than two million digital samples from functional genomics experiments amassed over almost two decades. However, individual sample metadata remains poorly described by unstructured free-text attributes, preventing its large-scale reanalysis. We introduce the Search Tag Analyze Resource for GEO as a web application (http://STARGEO.org) to curate better annotations of sample phenotypes uniformly across different studies, and to use these sample annotations to define robust genomic signatures of disease pathology by meta-analysis. In this paper, we target a small group of biomedical graduate students to show rapid crowd-curation of precise sample annotations across all phenotypes, and we demonstrate the biological validity of these crowd-curated annotations for breast cancer. STARGEO.org makes GEO data findable, accessible, interoperable and reusable (i.e., FAIR) to ultimately facilitate knowledge discovery. Our work demonstrates the utility of crowd-curation and interpretation of open 'big data' under FAIR principles as a first step towards realizing an ideal paradigm of precision medicine.
FAIR checklist signals are shown for context only and do not affect DataRank scoring.
Calibrated FAIR score: a parallel quality metric, independent of the DataRank citation score.
Base Score Contribution: 0.587 (from this paper's citation signal)
Citation Network Contribution: 1.3 (from 33 citing papers with measurable signal)
Ranked by citation count — the same ordering the engine uses when summing log1p(Cq) over citers.
DataRank blends this paper's own citation count with the influence of the papers that cite it. Here, roughly 31% comes from its base citations and 69% from the citation network (33 citing papers contributed measurable signal).
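The split quoted above can be checked with a few lines of arithmetic. This is a minimal sketch that assumes the displayed DataRank is simply the sum of the two contributions shown on this page; the variable names are illustrative and not taken from the scoring engine.

```python
# Values as reported on this page.
base_contribution = 0.587     # from this paper's own citation signal
network_contribution = 1.3    # from the 33 citing papers with measurable signal

# Assumption: the displayed DataRank is the plain sum of the two parts.
total = base_contribution + network_contribution

# Share of the total attributable to each part.
base_share = base_contribution / total
network_share = network_contribution / total

print(f"total         = {total:.3f}")
print(f"base share    = {base_share:.0%}")
print(f"network share = {network_share:.0%}")
```

Under that assumption the total rounds to the DataRank of 1.9 shown for this dataset, and the shares round to the 31% and 69% quoted above.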
Citers are pulled from OpenAlex sorted by cited_by_count:desc and capped per paper, so when the cap binds we keep the highest-signal references and the score is reproducible across reruns.
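The retrieval and capping step might look roughly like the sketch below. It is illustrative only: the work ID `W0000000000`, the cap of 3, and the sample citation counts are hypothetical, and the helper names `citers_url` and `network_signal` are not the engine's. Any scaling or weighting the engine applies after summing log1p(Cq) is not described on this page, so the sketch does not attempt to reproduce the 1.3 network contribution.

```python
import math
from urllib.parse import urlencode


def citers_url(work_id: str, cap: int) -> str:
    """Build an OpenAlex query for works citing `work_id`, most-cited first."""
    params = {
        "filter": f"cites:{work_id}",
        "sort": "cited_by_count:desc",
        "per-page": cap,  # assumption: the cap fits in a single result page
    }
    # Keep ':' unescaped so the URL reads like the filter and sort expressions.
    return "https://api.openalex.org/works?" + urlencode(params, safe=":")


def network_signal(citer_counts: list[int], cap: int) -> float:
    """Sum log1p(Cq) over the `cap` most-cited citers."""
    top = sorted(citer_counts, reverse=True)[:cap]
    return sum(math.log1p(c) for c in top)


# Hypothetical inputs for illustration only.
url = citers_url("W0000000000", cap=3)
counts = [4, 120, 0, 37, 9]
signal = network_signal(counts, cap=3)
print(url)
print(signal)
```

Because the citers are sorted before the cap is applied, the same top citers are kept regardless of the order in which they arrive, which is what makes the sum stable across reruns.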