🏆 Finalist — NIH Data Sharing Index (“S-Index”) Challenge
Demo corpus. Scores are computed on a select set of biomedical papers and datasets and may be inaccurate for papers outside this corpus, because DataRank relies on network effects that improve with scale. We aim to expand this into a fully open resource pending additional funding.

Transcriptional regulatory code of a eukaryotic genome

Nature (2004) · 10.1038/nature02800 · Source: DataRank Database

Transcriptional regulatory code of a eukaryotic genome is a dataset published in Nature (2004). On the S-Index it has a DataRank of 18.1, placing it in the top 8.1% of the data-sharing corpus. It has been cited 2,183 times, with 188 citing works in its 1-hop citation network. Its calibrated FAIR score is 31/100.

DataRank 18.1 · Top 8% (percentile)
Dataset · Open Access · 2,183 citations · base score 7.7
datarank_citation_only_1hop_v6 · scope data_only · Methodology

Abstract

DNA-binding transcriptional regulators interpret the genome's regulatory code by binding to specific sequences to induce or repress gene expression. Comparative genomics has recently been used to identify potential cis-regulatory sequences within the yeast genome on the basis of phylogenetic conservation, but this information alone does not reveal if or when transcriptional regulators occupy these binding sites. We have constructed an initial map of yeast's transcriptional regulatory code by identifying the sequence elements that are bound by regulators under various conditions and that are conserved among Saccharomyces species. The organization of regulatory elements in promoters and the environment-dependent use of these elements by regulators are discussed. We find that environment-specific use of regulatory elements predicts mechanistic models for the function of a large population of yeast's transcriptional regulators.

Data sources & pipeline
Pipeline: Metadata → Data-paper check → Enrichment → Citation network → Scoring
Enrichment: Pending

FAIR Checklist

Context only (not used in score)
Findable (1/2)
  • Has DOI
Accessible (1/2)
  • Open Access
Interoperable (0/2)
Reusable (1/3)
  • Dataset classification

    FAIR checklist signals are shown for context only and do not affect DataRank scoring.

    FAIR score: 31
    F Findable: 64
    A Accessible: 48
    I Interoperable: 0
    R Reusable: 13
    Top 87% by FAIR · LLM-assessed · ⚠ abstract only
    Estimated from the abstract only. The agent couldn't read this paper's full text, so body-dependent criteria (data-availability statement, formats, license) are inferred. For a confident score, upload the PDF or supply full text →

    Calibrated FAIR score — a parallel quality metric, independent of the DataRank citation score. See the full evaluation →

    DataRank Breakdown

    Base Score 6% · Citation Network 94%

    Base Score Contribution: 1.2 (from this paper's citation signal)

    Citation Network Contribution: 16.9 (from 188 citing papers with measurable signal)


    Top 5 citers driving the network score

    Ranked by citation count, the same ordering the engine uses when summing log1p(C_q) over citers.

    1. Initial sequencing and comparative analysis of the mouse genome
      Nature · 2002 · 7,236 citations · DataRank 16.2 · Top 10%
    2. Genetic regulatory mechanisms in the synthesis of proteins
      Journal of Molecular Biology · 1961 · 6,563 citations · DataRank 1.3
    3. Wisdom of crowds for robust gene network inference
      Nature Methods · 2012 · 1,765 citations · DataRank 1.1
    4. A genomic code for nucleosome positioning
      Nature · 2006 · 1,524 citations · DataRank 1.1
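
To make the ranking above concrete, the sketch below computes the log1p(C_q) term for each listed citer from the citation counts shown on this page. It covers only the numerator of each citer's contribution: the page does not show each citer's reference count (outdegree), so the full per-citer contribution cannot be reproduced here. The variable names are illustrative.

```python
import math

# (title, citation count C_q) for the citers listed on this page.
top_citers = [
    ("Initial sequencing and comparative analysis of the mouse genome", 7236),
    ("Genetic regulatory mechanisms in the synthesis of proteins", 6563),
    ("Wisdom of crowds for robust gene network inference", 1765),
    ("A genomic code for nucleosome positioning", 1524),
]

# log1p compresses large counts, so the gap between citers narrows sharply.
numerators = [(title, math.log1p(c_q)) for title, c_q in top_citers]

for title, value in numerators:
    print(f"{value:.2f}  {title}")
```

The first citer has almost five times the citations of the fourth, yet its log1p term is only about 20% larger, which is the sub-linear behavior the methodology relies on.
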
    Why this DataRank?

    DataRank blends this paper's own citation count with the influence of the papers that cite it. Here, roughly 6% comes from its base citations and 94% from the citation network (188 citing papers contributed measurable signal).
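
The 6% / 94% split can be checked with a few lines of arithmetic. The sketch below uses only figures displayed on this page (2,183 citations, DataRank 18.1) together with the damping factor d = 0.85 and the formula DataRank = (1−d)·B(p) + d·N(p) given further down in this section. The network contribution is taken as the remainder, which is an assumption about how the displayed total was rounded.

```python
import math

# Values displayed on this page.
citation_count = 2183
datarank_total = 18.1
d = 0.85  # damping factor

base_score = math.log1p(citation_count)       # B(p)
base_contribution = (1 - d) * base_score      # (1 - d) * B(p)
# Assumption: whatever is not base contribution is d * N(p).
network_contribution = datarank_total - base_contribution

base_share = base_contribution / datarank_total
network_share = network_contribution / datarank_total

print(f"B(p) = {base_score:.1f}")
print(f"base contribution = {base_contribution:.1f} ({base_share:.0%})")
print(f"network contribution = {network_contribution:.1f} ({network_share:.0%})")
```

Rounded to the precision shown on the page, these values agree with the base score of 7.7, the contributions of 1.2 and 16.9, and the 6% / 94% split.
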

    Base score B(p)
    log1p(citation_count) — grows sub-linearly, so a paper with 1,000 citations is not 10× a paper with 100.
    Network N(p)
    Σ over citers q of log1p(C_q) ÷ max(outdegree_q, 1). Being cited by a highly cited paper with few references counts most.
    Damping factor d = 0.85
    DataRank = (1−d)·B(p) + d·N(p) — the two cards above are each already multiplied by their share.
    Self-citations excluded
    Citers sharing any OpenAlex author ID with this paper are filtered out before the network sum.
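
The four rules above can be combined into a short sketch. This is an illustration of the formulas as stated on this page, not the DataRank engine itself: the `Citer` record, its field names, and the function names are hypothetical, and details the page does not specify (such as any normalization or percentile step) are left out.

```python
import math
from dataclasses import dataclass, field

DAMPING = 0.85  # d, as stated on this page


@dataclass
class Citer:
    """Hypothetical record for one citing paper q."""
    cited_by_count: int   # C_q
    outdegree: int        # number of references made by q
    author_ids: frozenset = field(default_factory=frozenset)


def base_score(citation_count):
    """B(p) = log1p(citation_count)."""
    return math.log1p(citation_count)


def network_score(citers, paper_author_ids):
    """N(p): sum of log1p(C_q) / max(outdegree_q, 1) over non-self citers."""
    total = 0.0
    for q in citers:
        if q.author_ids & paper_author_ids:
            continue  # shares an author ID with the paper: self-citation
        total += math.log1p(q.cited_by_count) / max(q.outdegree, 1)
    return total


def datarank(citation_count, citers, paper_author_ids, d=DAMPING):
    """DataRank = (1 - d) * B(p) + d * N(p)."""
    return ((1 - d) * base_score(citation_count)
            + d * network_score(citers, paper_author_ids))


# Toy example with made-up values, only to show the call shape.
toy_citers = [
    Citer(cited_by_count=99, outdegree=10, author_ids=frozenset({"A1"})),
    Citer(cited_by_count=1000, outdegree=5, author_ids=frozenset({"A9"})),
]
print(datarank(3, toy_citers, paper_author_ids={"A9"}))
```

In the toy call, the second citer shares author ID "A9" with the paper and is skipped, so only the first citer contributes to the network term.
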

    Citers are pulled from OpenAlex sorted by cited_by_count:desc and capped per paper, so when the cap binds we keep the highest-signal references and the score is reproducible across reruns.
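
A minimal sketch of why sorting before capping gives a reproducible result, applied to records already held in memory. The cap value is not stated on this page, so `cap` is a parameter here, and the tie-break on `id` is an assumption added for illustration; the page only says citers are sorted by cited_by_count in descending order.

```python
def select_citers(citers, cap):
    """Keep the `cap` most-cited citers.

    `citers` is a list of dicts with "id" and "cited_by_count" keys.
    Ties are broken by id (an assumption) so that the result does not
    depend on the order in which records arrive.
    """
    ranked = sorted(citers, key=lambda c: (-c["cited_by_count"], c["id"]))
    return ranked[:cap]


# Made-up records, only to show the behavior.
sample = [
    {"id": "W3", "cited_by_count": 5},
    {"id": "W1", "cited_by_count": 50},
    {"id": "W2", "cited_by_count": 5},
    {"id": "W4", "cited_by_count": 1},
]
kept = select_citers(sample, cap=2)
print([c["id"] for c in kept])
```

Without a deterministic tie-break, two citers with equal citation counts could swap places between runs when the cap falls between them, which would change the network sum.
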



    Authors (26)

    D. Benjamin Gordon, Tong Ihn Lee, Nicola J. Rinaldi, Kenzie D. Macisaac, Timothy W. Danford

    Related Papers (10)

    Current Opinion in Genetics & Development (2009) · co-cited · 10.1016/j.gde.2009.09.007
    Nature (2012) · co-cited, same journal · 10.1038/nature11232
    Nature Genetics (2007) · co-cited · 10.1038/ng2117