Evidence for a novel overlapping coding sequence in POLG initiated at a CUG start codon
Evidence for a novel overlapping coding sequence in POLG initiated at a CUG start codon is a research paper published in BMC Genetics (2020). On theSindex it has a DataRank of 1.1. It has been cited 41 times, with 24 citing works in its 1-hop citation network.
Abstract
Background
POLG, located on nuclear chromosome 15, encodes the DNA polymerase γ (Pol γ). Pol γ is responsible for the replication and repair of mitochondrial DNA (mtDNA). Pol γ is the only DNA polymerase found in mitochondria for most animal cells. Mutations in POLG are the most common single-gene cause of diseases of mitochondria and have been mapped over the coding region of the POLG ORF.
Results
Using PhyloCSF to survey alternative reading frames, we found a conserved coding signature in an alternative frame in exons 2 and 3 of POLG, herein referred to as ORF-Y, that arose de novo in placental mammals. Using the synplot2 program, synonymous site conservation was found among mammals in the region of the POLG ORF that is overlapped by ORF-Y. Ribosome profiling data revealed that ORF-Y is translated and that initiation likely occurs at a CUG codon. Inspection of an alignment of mammalian sequences containing ORF-Y revealed that the CUG codon has a strong initiation context and that a well-conserved predicted RNA stem-loop begins 14 nucleotides downstream. Such features are associated with enhanced initiation at near-cognate non-AUG codons. Reanalysis of the Kim et al. (2014) draft human proteome dataset yielded two unique peptides that map unambiguously to ORF-Y. An additional conserved uORF, herein referred to as ORF-Z, was also found in exon 2 of POLG. Lastly, we surveyed ClinVar variants that are synonymous with respect to the POLG ORF and found that most of these variants cause amino acid changes in ORF-Y or ORF-Z.
Conclusions
We provide evidence for a novel coding sequence, ORF-Y, that overlaps the POLG ORF. Ribosome profiling and mass spectrometry data show that ORF-Y is expressed. PhyloCSF and synplot2 analysis show that ORF-Y is subject to strong purifying selection. An abundance of disease-correlated mutations that map to exons 2 and 3 of POLG but also affect ORF-Y provides potential clinical significance to this finding.
FAIR Checklist
Context only (not used in score)
- Has DOI
- Open Access
FAIR checklist signals are shown for context only and do not affect DataRank scoring.
DataRank Breakdown
Base Score Contribution
0.561
From this paper's citation signal
Citation Network Contribution
0.511
From 22 citing papers with measurable signal
Top 5 citers driving the network score
Ranked by citation count, the same ordering the engine uses when summing log1p(C_q) over citers.
- MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Research, 2004. 46,160 citations. DataRank 1.6
- Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes. Journal of Molecular Biology, 2001. 12,937 citations. DataRank 1.4
- EMBOSS: The European Molecular Biology Open Software Suite. Trends in Genetics, 2000. 9,778 citations. DataRank 1.4
- GENCODE 2021. Nucleic Acids Research, 2020. 1,452 citations. DataRank 9.7 (Top 22%)
- PhyloCSF: a comparative genomics method to distinguish protein coding and non-coding regions. Bioinformatics, 2011. 1,073 citations. DataRank 1.0
Why this DataRank?
DataRank blends this paper's own citation count with the influence of the papers that cite it. Here, roughly 52% comes from its base citations and 48% from the citation network (22 citing papers contributed measurable signal).
- Base score B(p): log1p(citation_count). It grows sub-linearly, so a paper with 1,000 citations is not 10× a paper with 100.
- Network N(p): Σ over citers q of log1p(C_q) ÷ max(outdegree_q, 1). Being cited by a highly-cited paper with few references counts most.
- Damping factor d = 0.85: DataRank = (1−d)·B(p) + d·N(p). The two cards above are each already multiplied by their share.
- Self-citations excluded: citers sharing any OpenAlex author ID with this paper are filtered out before the network sum.
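As a minimal sketch of the formula described above, assuming each citer is represented by its own citation count, its outdegree, and a set of OpenAlex author IDs. The `Citer` class and the function name `data_rank_parts` are hypothetical and are not the engine's actual code.

```python
import math
from dataclasses import dataclass


@dataclass
class Citer:
    """Hypothetical record for one citing paper q."""
    cited_by_count: int               # C_q in the text
    outdegree: int                    # number of references q makes
    author_ids: frozenset = frozenset()


def data_rank_parts(citation_count, citers, paper_author_ids=frozenset(), d=0.85):
    """Return the (base, network) contributions; DataRank is their sum."""
    # Self-citation filter: drop citers sharing any author ID with the paper.
    independent = [q for q in citers if not (q.author_ids & paper_author_ids)]

    # B(p) = log1p(citation_count)
    base = math.log1p(citation_count)

    # N(p) = sum over citers of log1p(C_q) / max(outdegree_q, 1)
    network = sum(
        math.log1p(q.cited_by_count) / max(q.outdegree, 1) for q in independent
    )

    # Each part is already multiplied by its share, as on the two cards.
    return (1 - d) * base, d * network


base_part, network_part = data_rank_parts(41, [])
print(round(base_part, 3))
```

With 41 citations and d = 0.85, the base part is 0.15 · log1p(41) ≈ 0.561, which matches the Base Score Contribution shown above. The network part depends on the citer list, which is not reproduced here.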
Citers are pulled from OpenAlex sorted by cited_by_count:desc and capped per paper, so when the cap binds we keep the highest-signal references and the score is reproducible across reruns.
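A citer query of this kind could be built roughly as follows. This is a sketch only: it assumes the public OpenAlex `/works` endpoint with its `cites:` filter and its `sort` and `per-page` parameters. The work ID `W0000000000` and the cap of 50 are placeholders, since the page does not state the actual cap or how the engine issues its requests.

```python
from urllib.parse import urlencode

OPENALEX_WORKS = "https://api.openalex.org/works"


def citer_query_url(work_id: str, cap: int) -> str:
    """Build a URL listing works that cite `work_id`, most-cited first."""
    params = {
        "filter": f"cites:{work_id}",     # works citing the given paper
        "sort": "cited_by_count:desc",    # highest-cited citers first
        "per-page": cap,                  # assumed cap, fetched as one page
    }
    # Keep ':' unescaped so the filter and sort values stay readable.
    return f"{OPENALEX_WORKS}?{urlencode(params, safe=':')}"


print(citer_query_url("W0000000000", 50))
```

Sorting before capping is what makes the truncated list deterministic: the same top citers are kept on every run as long as the citation counts have not changed.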