Demo corpus. Scores are computed on a select set of biomedical papers and datasets and may be inaccurate for papers outside this corpus, because DataRank relies on network effects that improve with scale. We aim to expand this into a fully open resource pending additional funding.

How Much Should We Trust Differences-In-Differences Estimates?

The Quarterly Journal of Economics (2004) · DOI: 10.1162/003355304772839588 · Source: DataRank Database

How Much Should We Trust Differences-In-Differences Estimates? is a research paper published in The Quarterly Journal of Economics (2004). On theSindex it has a DataRank of 1.4. It has been cited 10,436 times.

1.4 DataRank · unranked

10,436 citations · base score 9.3

datarank_citation_only_1hop_v6 · scope data_only · Methodology

Abstract

Most papers that employ Differences-in-Differences estimation (DD) use many years of data and focus on serially correlated outcomes but ignore that the resulting standard errors are inconsistent. To illustrate the severity of this issue, we randomly generate placebo laws in state-level data on female wages from the Current Population Survey. For each law, we use OLS to compute the DD estimate of its "effect" as well as the standard error of this estimate. These conventional DD standard errors severely understate the standard deviation of the estimators: we find an "effect" significant at the 5 percent level for up to 45 percent of the placebo interventions. We use Monte Carlo simulations to investigate how well existing methods help solve this problem. Econometric corrections that place a specific parametric form on the time-series process do not perform well. Bootstrap (taking into account the autocorrelation of the data) works well when the number of states is large enough. Two corrections based on asymptotic approximation of the variance-covariance matrix work well for moderate numbers of states and one correction that collapses the time series information into a "pre"- and "post"-period and explicitly takes into account the effective sample size works well even for small numbers of states.

Data sources & pipeline

Pipeline: Metadata → Data-paper check → Enrichment → Citation network → Scoring

Enrichment: Pending

FAIR Checklist

Context only (not used in score)

Findable (1/2)

Has DOI

Accessible (0/2)

Interoperable (0/2)

Reusable (0/3)

FAIR checklist signals are shown for context only and do not affect DataRank scoring.

DataRank Breakdown

Base Score 100% · Citation Network 0%

Base Score Contribution

1.4

From this paper's citation signal

Citation Network Contribution

Citation network not refreshed for this result

This paper's DataRank is currently driven only by its base citation score. Citation network data was not refreshed for this result.

Why this DataRank?

DataRank blends this paper's own citation count with the influence of the papers that cite it. Here, roughly 100% comes from its base citations and 0% from the citation network.

Base score B(p): log1p(citation_count) — grows sub-linearly, so a paper with 1,000 citations is not 10× a paper with 100.
Network N(p): Σ over citers of log1p(C_q) ÷ max(outdegree_q, 1). Being cited by a highly-cited paper with few references counts most.
Damping factor d = 0.85: DataRank = (1−d)·B(p) + d·N(p) — the two cards above are each already multiplied by their share.
Self-citations excluded: Citers sharing any OpenAlex author ID with this paper are filtered out before the network sum.
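
The four rules above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the DataRank implementation: the function names and the dictionary keys (`cited_by_count`, `outdegree`, `author_ids`) are hypothetical, and natural-log `log1p` is assumed because it reproduces the base score of 9.3 and the DataRank of 1.4 shown on this page.

```python
import math

DAMPING = 0.85  # damping factor d from the text


def base_score(citation_count):
    """B(p) = log1p(citation_count); grows sub-linearly with citations."""
    return math.log1p(citation_count)


def network_score(citers, paper_author_ids=frozenset()):
    """N(p): sum over citers q of log1p(C_q) / max(outdegree_q, 1).

    Citers sharing any author ID with the paper are skipped first
    (self-citation filter). The dict keys are hypothetical names.
    """
    total = 0.0
    for q in citers:
        if paper_author_ids & set(q.get("author_ids", ())):
            continue  # self-citation: excluded before the network sum
        total += math.log1p(q["cited_by_count"]) / max(q["outdegree"], 1)
    return total


def datarank(citation_count, citers=(), paper_author_ids=frozenset(), d=DAMPING):
    """DataRank = (1 - d) * B(p) + d * N(p)."""
    b = base_score(citation_count)
    n = network_score(citers, paper_author_ids)
    return (1 - d) * b + d * n


# This page's case: 10,436 citations and no refreshed citation network.
print(round(base_score(10436), 1))  # 9.3
print(round(datarank(10436), 1))    # 1.4
```

With an empty citer list the network term is zero, so the whole score is (1 − 0.85) · 9.25 ≈ 1.4, which matches the "Base Score 100%" breakdown above.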

Citers are pulled from OpenAlex sorted by cited_by_count:desc and capped per paper, so when the cap binds we keep the highest-signal references and the score is reproducible across reruns.
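
The capping step can be sketched as a sort followed by a slice. This is only an illustration: the cap value is not stated on this page, so `cap` is a hypothetical parameter, and the tie-break on `id` is an added assumption to make the ordering fully deterministic when two citers have the same `cited_by_count`.

```python
def select_citers(citers, cap):
    """Keep the `cap` most-cited citers, in descending cited_by_count order.

    Ties are broken by the citer's id (an assumption, not confirmed by the
    source) so that repeated runs select the same citers.
    """
    ranked = sorted(citers, key=lambda c: (-c["cited_by_count"], c["id"]))
    return ranked[:cap]


# Hypothetical citer records for illustration only.
example = [
    {"id": "W3", "cited_by_count": 5},
    {"id": "W1", "cited_by_count": 50},
    {"id": "W2", "cited_by_count": 5},
]
kept = select_citers(example, cap=2)
print([c["id"] for c in kept])  # ['W1', 'W2']
```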
