Demo corpus. Scores are computed on a select set of biomedical papers and datasets and may be inaccurate for papers outside this corpus, because DataRank relies on network effects that improve with scale. We aim to expand this into a fully open resource pending additional funding.

EAnnot: A genome annotation tool using experimental evidence

Genome Research (2004) · DOI: 10.1101/gr.3152604 · Source: DataRank Database

"EAnnot: A genome annotation tool using experimental evidence" is a research paper published in Genome Research (2004). On the Sindex it has a DataRank of 1.3. It has been cited 21 times, with 19 citing works in its 1-hop citation network.

DataRank 1.3 · unranked

21 citations · base score 3.1


datarank_citation_only_1hop_v6 · scope data_only · Methodology

Abstract

The sequence of any genome becomes most useful for biological experimentation when a complete and accurate gene set is available. Gene prediction programs offer an efficient way to generate an automated gene set. Manual annotation, when performed by experienced annotators, is more accurate and complete than automated annotation. However, it is a laborious and expensive process, and by its nature, introduces a degree of variability not found with automated annotation. EAnnot (Electronic Annotation) is a program originally developed for manually annotating the human genome. It combines the latest bioinformatics tools to extract and analyze a wide range of publicly available data in order to achieve fast and reliable automatic gene prediction and annotation. EAnnot builds gene models based on mRNA, EST, and protein alignments to genomic sequence, attaches supporting evidence to the corresponding genes, identifies pseudogenes, and locates poly(A) sites and signals. Here, we compare manual annotation of human chromosome 6 with annotation performed by EAnnot in order to assess the latter's accuracy. EAnnot can readily be applied to manual annotation of other eukaryotic genomes and can be used to rapidly obtain an automated gene set.

Data sources & pipeline

Pipeline: Metadata · Data-paper check · Enrichment · Citation network · Scoring

Enrichment: Pending

FAIR Checklist

Context only (not used in score)

Findable (1/2)

Has DOI

Accessible (0/2)

Interoperable (0/2)

Reusable (0/3)

FAIR checklist signals are shown for context only and do not affect DataRank scoring.


DataRank Breakdown

Base Score 34% · Citation Network 66%

Base Score Contribution

0.464

From this paper's citation signal

Citation Network Contribution

0.883

From 16 citing papers with measurable signal


Top 1 citer driving the network score

Ranked by citation count — the same ordering the engine uses when summing log1p(C_q) over citers.

Ensembl 2002: accommodating comparative genomics
Nucleic Acids Research · 2003 · 242 citations · DataRank 24.8

Why this DataRank?

DataRank blends this paper's own citation count with the influence of the papers that cite it. Here, roughly 34% comes from its base citations and 66% from the citation network (16 citing papers contributed measurable signal).
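The percentages quoted above can be checked directly from the two contribution cards. The snippet below is a small arithmetic sketch, not DataRank's own code; the function name `shares` is hypothetical, and the two input values are copied from this page.

```python
# Contribution values as shown on this page's breakdown cards.
base_contribution = 0.464      # "Base Score Contribution"
network_contribution = 0.883   # "Citation Network Contribution"


def shares(base: float, network: float) -> tuple[float, float, float]:
    """Return (total score, base share, network share) for two contributions."""
    total = base + network
    return total, base / total, network / total


total, base_share, network_share = shares(base_contribution, network_contribution)

# The total rounds to the displayed DataRank, and the shares to the displayed split.
print(f"DataRank ~ {total:.1f}")
print(f"base {base_share:.0%}, network {network_share:.0%}")
```

The two contributions sum to about 1.347, which rounds to the displayed DataRank of 1.3.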

Base score B(p): log1p(citation_count) — grows sub-linearly, so a paper with 1,000 citations is not 10× a paper with 100.
Network N(p): Σ over citers of log1p(C_q) ÷ max(outdegree_q, 1). Being cited by a highly-cited paper with few references counts most.
Damping factor d = 0.85: DataRank = (1−d)·B(p) + d·N(p) — the two cards above are each already multiplied by their share.
Self-citations excluded: Citers sharing any OpenAlex author ID with this paper are filtered out before the network sum.
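The four rules above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the DataRank engine: the `Citer` record and all function names are hypothetical, and only the formulas, the damping factor, and the self-citation rule come from the text.

```python
import math
from dataclasses import dataclass

DAMPING = 0.85  # d, as stated above


@dataclass(frozen=True)
class Citer:
    """Hypothetical record for one citing paper q."""
    cited_by_count: int              # C_q
    outdegree: int                   # number of references q makes
    author_ids: frozenset = frozenset()


def base_score(citation_count: int) -> float:
    """B(p) = log1p(citation_count)."""
    return math.log1p(citation_count)


def network_score(citers, paper_author_ids: frozenset) -> float:
    """N(p) = sum over citers of log1p(C_q) / max(outdegree_q, 1)."""
    total = 0.0
    for q in citers:
        # Self-citation filter: skip citers sharing any author ID with the paper.
        if q.author_ids & paper_author_ids:
            continue
        total += math.log1p(q.cited_by_count) / max(q.outdegree, 1)
    return total


def datarank(citation_count: int, citers, paper_author_ids: frozenset,
             d: float = DAMPING) -> tuple[float, float, float]:
    """Return (base part, network part, DataRank) with DataRank = (1-d)*B + d*N."""
    base_part = (1 - d) * base_score(citation_count)
    network_part = d * network_score(citers, paper_author_ids)
    return base_part, network_part, base_part + network_part


# With the 21 citations reported on this page and no citers supplied,
# only the base part is non-zero: 0.15 * log1p(21).
base_part, network_part, score = datarank(21, [], frozenset())
```

For 21 citations, `base_score` gives about 3.09 and the base part about 0.464, matching the base score and Base Score Contribution shown on this page. The network part cannot be reproduced here because the page does not list the 16 citers or their outdegrees.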

Citers are pulled from OpenAlex sorted by cited_by_count:desc and capped per paper, so when the cap binds we keep the highest-signal citers and the score is reproducible across reruns.
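As a rough sketch of what such a query and cap could look like, the snippet below builds an OpenAlex works URL using the `cites:` filter and the `cited_by_count:desc` sort, then truncates a citer list deterministically. The cap of 200, the work ID `W123`, the tie-break on `id`, and both function names are assumptions made for illustration; the page does not state the actual cap or how ties are broken.

```python
from urllib.parse import urlencode

OPENALEX_WORKS = "https://api.openalex.org/works"
MAX_CITERS = 200  # assumed cap; the real per-paper cap is not given on this page


def citers_query_url(openalex_work_id: str, per_page: int = MAX_CITERS) -> str:
    """Build a URL listing works that cite the given work, most-cited first."""
    params = {
        "filter": f"cites:{openalex_work_id}",
        "sort": "cited_by_count:desc",
        "per-page": per_page,
    }
    return f"{OPENALEX_WORKS}?{urlencode(params)}"


def cap_citers(citers: list[dict], cap: int = MAX_CITERS) -> list[dict]:
    """Keep the `cap` most-cited citers.

    Ties are broken by `id` (an assumption) so that repeated runs over the
    same input always keep the same citers.
    """
    ordered = sorted(citers, key=lambda c: (-c["cited_by_count"], c["id"]))
    return ordered[:cap]


# "W123" is a placeholder ID, not this paper's real OpenAlex identifier.
url = citers_query_url("W123")
kept = cap_citers(
    [
        {"id": "W3", "cited_by_count": 5},
        {"id": "W1", "cited_by_count": 242},
        {"id": "W2", "cited_by_count": 5},
    ],
    cap=2,
)
```

Sorting before truncating is what makes the cap reproducible: the same input always yields the same kept set, regardless of the order in which citers arrive.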

