The Data

California counties show different rates of maternal mortality. Some adopted a new maternal health program; others did not. Counties with the program have lower mortality. Is the program working? (Data are simulated for illustration.)

County Wealth vs Maternal Mortality

No Program
Has Program

The crucial question: Is the program reducing mortality, or are certain types of counties more likely to adopt it?

Who Gets Treated?

Economists start by asking: "What determines which units receive treatment?" The answer reveals potential threats to causal inference. If treatment assignment is related to outcomes for reasons other than the treatment itself, simple comparisons will be misleading.

Causal diagram showing multiple factors (wealth, infrastructure, political will) affecting both program adoption and maternal mortality, with a dashed arrow from program to mortality indicating uncertain causal effect

Arrows show potential causal relationships. The dashed arrow is what we want to measure.

Understanding selection: The association between program and outcomes reflects two things mixed together.

How do we separate the causal effect from the selection effect?

Selection Mechanisms

Use the filters below to explore what predicts program adoption. When certain characteristics strongly predict treatment, comparing treated and untreated units becomes problematic.

Explore by:

Selection Into Treatment

Selection into treatment occurs when units that receive treatment differ systematically from those that do not. It creates bias when:

  • The same factors that predict treatment also predict outcomes
  • Treated units would have different outcomes even without treatment
  • The treatment-outcome association reflects both causation and selection

The pattern is clear: Wealthier counties with better infrastructure are far more likely to adopt the program.

What does this mean for interpreting the mortality difference?

Implications

Selection into treatment has profound consequences for policy evaluation. The observed difference between treated and untreated groups tells us little about what would happen if we implemented the program elsewhere.

What the Comparison Actually Captures

Component Description What It Means
True Program Effect Lives saved by the program itself What we want to measure
Selection Effect Advantage from being a county that adopts programs Wealthier, better-resourced counties would do better anyway
Observed Difference Program Effect + Selection Effect Overstates the true impact of the program

Overestimating Program Benefits

If wealthy counties adopt programs and have better outcomes, we may attribute to the program what would have happened anyway.

What would mortality look like in these counties if they had NOT adopted the program?

Misleading Policy Conclusions

Scaling a program to different counties may not replicate the observed benefits if those counties lack the same underlying advantages.

Would the program work equally well in under-resourced counties that haven't adopted it?

The Counterfactual Problem

We never observe what would have happened to treated counties without treatment. Selection makes this counterfactual especially hard to estimate.

The control group may not tell us what treated counties would have experienced.

Regression Cannot Fully Adjust

Statistical controls help but cannot eliminate bias from unmeasured factors that predict both adoption and outcomes.

What unmeasured factors might wealthy counties have that affect mortality?

Selection is a fundamental challenge: It does not invalidate observational research, but it demands careful thinking about the treatment assignment mechanism.

What strategies can address selection bias?

Key Insight

Understanding selection is the first step toward credible causal inference. These questions help you think about whether selection threatens your analysis. They do not guarantee valid causal estimates, but they reveal where the analysis is vulnerable.

Questions Economists Ask About Selection

What predicts treatment? List the factors that make units more likely to receive treatment.
Do those factors also predict outcomes? If so, selection bias is likely.
Is there overlap? Are there similar treated and untreated units to compare?
What is unmeasured? Even after controlling for observables, what hidden factors might remain?
Why did some units NOT get treated? The reasons for non-adoption matter as much as reasons for adoption.

Key Takeaway

Selection into treatment is not a statistical problem with a statistical fix. When units that receive treatment differ systematically from those that do not, simple comparisons confound causation with selection. The solution is not more sophisticated adjustment but finding sources of treatment variation that operate independently of the factors that determine outcomes. Policy mandates, geographic rollouts, and eligibility thresholds can provide this. This is what economists mean by "identification."