A comprehensive fitness landscape model reveals the evolutionary history and future evolvability of eukaryotic <i>cis</i> -regulatory DNA sequences is a research paper (2021). On theSindex it has a DataRank of 0.307. It has been cited 4 times, with 4 citing works in its 1-hop citation network.
Mutations in non-coding cis -regulatory DNA sequences can alter gene expression, organismal phenotype, and fitness. Fitness landscapes, which map DNA sequence to organismal fitness, are a long-standing goal in biology, but have remained elusive because it is challenging to generalize accurately to the vast space of possible sequences using models built on measurements from a limited number of endogenous regulatory sequences. Here, we construct a sequence-to-expression model for such a landscape and use it to decipher principles of cis -regulatory evolution. Using tens of millions of randomly sampled promoter DNA sequences and their measured expression levels in the yeast Sacccharomyces cerevisiae , we construct a deep transformer neural network model that generalizes with exceptional accuracy, and enables sequence design for gene expression engineering. Using our model, we predict and experimentally validate expression divergence under random genetic drift and strong selection weak mutation regimes, show that conflicting expression objectives in different environments constrain expression adaptation, and find that stabilizing selection on gene expression leads to the moderation of regulatory complexity. We present an approach for detecting selective constraint on gene expression using our model and natural sequence variation, and validate it using observed cis -regulatory diversity across 1,011 yeast strains, cross-species RNA-seq from three different clades, and measured expression-to-fitness curves. Finally, we develop a characterization of regulatory evolvability, use it to visualize fitness landscapes in two dimensions, discover evolvability archetypes, quantify the mutational robustness of individual sequences and highlight the mutational robustness of extant natural regulatory sequence populations. Our work provides a general framework that addresses key questions in the evolution of cis -regulatory sequences.
FAIR checklist signals are shown for context only and do not affect DataRank scoring.
Base Score Contribution
0.241
From this paper's citation signal
Citation Network Contribution
0.0659
From 4 citing papers with measurable signal
Ranked by citation count — the same ordering the engine uses when summing log1p(Cq) over citers.
DataRank blends this paper's own citation count with the influence of the papers that cite it. Here, roughly 79% comes from its base citations and 21% from the citation network (4 citing papers contributed measurable signal).
Citers are pulled from OpenAlex sorted by cited_by_count:descand capped per paper, so when the cap binds we keep the highest-signal references and the score is reproducible across reruns.
Click a node to highlight its connections. Use scroll to zoom. Drag to pan.