Moving Beyond Correlation
An interactive journey into Causal Inference: the science of asking "What if?" and truly understanding cause and effect.
Correlation ≠ Causation
This is the foundational principle of causal inference. Two things can be related without one causing the other. This section provides an interactive example to demonstrate this critical distinction.
The Classic Example
The chart shows a strong positive correlation between monthly ice cream sales and the number of drowning incidents. Does eating ice cream cause drowning?
The Hidden Confounder: Heat!
Of course not. The lurking, or "confounding", variable is hot weather. Summer heat drives up both ice cream sales and the number of people swimming, which unfortunately leads to more drowning incidents. The two variables are correlated, but neither causes the other.
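A small simulation can make the confounder visible. The sketch below is illustrative only: the temperature range, the coefficients, and the noise levels are made-up assumptions, not real sales or drowning data. It generates ice cream sales and drownings that both depend on temperature and never on each other, then compares their raw correlation with the correlation left after temperature has been regressed out of both.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def residuals(ys, xs):
    """What is left of ys after subtracting a least-squares line fitted on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return [y - (my + slope * (x - mx)) for x, y in zip(xs, ys)]


rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical data: temperature is the only common cause.
temperature = [rng.uniform(0, 35) for _ in range(2000)]
ice_cream = [50 + 10 * t + rng.gauss(0, 40) for t in temperature]
drownings = [2 + 0.3 * t + rng.gauss(0, 2) for t in temperature]

# Association as it appears in the raw data.
raw_corr = pearson(ice_cream, drownings)

# Association after removing the part of each variable explained by temperature.
adjusted_corr = pearson(residuals(ice_cream, temperature),
                        residuals(drownings, temperature))

print(f"raw correlation:      {raw_corr:.2f}")
print(f"adjusted correlation: {adjusted_corr:.2f}")
```

Under these assumed numbers the raw correlation should come out strongly positive, while the temperature-adjusted correlation should sit close to zero, which is what we expect when the entire association runs through the confounder.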
Core Concepts
To perform causal analysis, we need a specific vocabulary. These are the building blocks for understanding the methods used to uncover cause-and-effect relationships.
Treatment
The action, intervention, or variable whose effect we want to measure. For example, a new drug, a marketing campaign, or a government policy.
Counterfactual
The "what if" scenario. What would have happened to the same group if they had not received the treatment? The core challenge is that we can never observe this directly.
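The potential-outcomes notation is one common way to state this challenge precisely. The symbols below ($T_i$ for whether unit $i$ was treated, $Y_i(1)$ and $Y_i(0)$ for its outcome with and without treatment, and $\tau$ for the effect) are introduced here for illustration and do not appear elsewhere on this page.

```latex
\begin{align*}
Y_i    &= T_i \, Y_i(1) + (1 - T_i) \, Y_i(0) && \text{(the outcome we actually observe for unit } i\text{)} \\
\tau_i &= Y_i(1) - Y_i(0)                     && \text{(individual effect: one of the two terms is never observed)} \\
\tau   &= \mathbb{E}\,[\, Y(1) - Y(0) \,]      && \text{(average treatment effect over the population)}
\end{align*}
```

Because $T_i$ is either 0 or 1, the first line reveals only one potential outcome per unit, so $\tau_i$ cannot be computed directly. Causal methods typically target an average such as $\tau$ instead.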
Confounding
A distortion that arises when a third variable (a confounder) influences both the treatment and the outcome, creating a spurious association between them. Heat played this role in the previous example.
The Ladder of Causation
Computer scientist Judea Pearl proposed a three-level hierarchy for causal reasoning. This framework helps us understand the types of questions we can answer with data.
Association (Seeing)
Intervention (Doing)
Counterfactuals (Imagining)
Level 1: Association (Seeing)
This is the level of standard statistics and machine learning. We look for patterns and correlations in data.
Question: "What is the relationship between cholesterol and heart disease?"
Action: Observing data, finding correlations.
Causal Inference Methods
Scientists have developed various methods to climb the ladder of causation and estimate causal effects, especially when direct experiments are not possible.
Randomized Controlled Trials (RCTs)
Considered the "gold standard." A population is randomly assigned to a treatment group (gets the intervention) or a control group (does not). Because the assignment is random, it is independent of every confounder, measured or unmeasured, so the two groups are comparable on average. Any systematic difference in outcomes between the groups can then be attributed to the treatment.
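The effect of randomization can be sketched with a simulation. Everything here is a hypothetical setup chosen for illustration: a made-up confounder called severity, an assumed true treatment effect of 2.0, and arbitrary coefficients. In the observational arm, sicker patients are more likely to be treated. In the randomized arm, a coin flip decides. Both arms use the same simple difference in group means.

```python
import random

TRUE_EFFECT = 2.0  # assumed effect of treatment on the outcome (hypothetical)


def outcome(treated, severity, rng):
    """Hypothetical outcome model: treatment helps, severity hurts, plus noise."""
    return 10.0 + TRUE_EFFECT * treated - 3.0 * severity + rng.gauss(0, 1)


def diff_in_means(rows):
    """Mean outcome of the treated minus mean outcome of the controls."""
    treated = [y for t, y in rows if t == 1]
    control = [y for t, y in rows if t == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


def simulate(randomized, n=20000, seed=1):
    """Return the difference-in-means estimate for one simulated study."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        severity = rng.random()  # confounder, uniform on [0, 1)
        if randomized:
            p_treat = 0.5  # coin flip, unrelated to severity
        else:
            p_treat = 0.1 + 0.8 * severity  # sicker units get treated more often
        treated = 1 if rng.random() < p_treat else 0
        rows.append((treated, outcome(treated, severity, rng)))
    return diff_in_means(rows)


observational_estimate = simulate(randomized=False)
rct_estimate = simulate(randomized=True)

print(f"true effect:            {TRUE_EFFECT:.2f}")
print(f"observational estimate: {observational_estimate:.2f}")
print(f"randomized estimate:    {rct_estimate:.2f}")
```

With these assumptions the observational estimate should understate the true effect, because the treated group is sicker to begin with, while the randomized estimate should land close to 2.0. The analysis is identical in both arms. Only the assignment mechanism differs.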
Why It Matters: Real-World Applications
Causal inference isn't just academic; it's crucial for making effective decisions in business, policy, medicine, and beyond. Understanding causality helps us design better systems and interventions.
Medicine & Public Health
Does a new vaccine prevent disease? Does a public health campaign reduce smoking rates? Answering these questions requires rigorous causal analysis to save lives.
Economics & Policy
Does increasing the minimum wage cause unemployment? Does a universal basic income program improve health outcomes? Governments rely on causal estimates to enact effective policies.
Business & Technology
Did our new website design cause an increase in sales? Does a specific ad campaign have a positive return on investment? Companies use causal inference to optimize strategies and products.