Why don't machines ask: what if?

📚 Based on

Przyczyny i skutki. Rewolucyjna nauka wnioskowania przyczynowego (the Polish edition of The Book of Why: The New Science of Cause and Effect)

👤 About the Author

Judea Pearl

University of California, Los Angeles (UCLA)

Judea Pearl (born 1936) is an Israeli-American computer scientist and philosopher, widely recognized for his pioneering work in artificial intelligence and statistics. He is a professor emeritus at the University of California, Los Angeles (UCLA), where he directs the Cognitive Systems Laboratory. Pearl is best known for championing the probabilistic approach to artificial intelligence, including the invention of Bayesian networks, and for developing a comprehensive mathematical framework for causal inference based on structural models. His work has fundamentally transformed how researchers in computer science, statistics, epidemiology, and the social sciences analyze cause-and-effect relationships. In 2011, he received the A.M. Turing Award, the highest distinction in computer science, for his fundamental contributions to the field. Beyond his technical research, he is a prominent advocate for the study of causality and has authored several influential books bridging technical theory and broader philosophical inquiry.

Dana Mackenzie

Dana Walter Nance Mackenzie (born November 4, 1958) is an American mathematician, science writer, and chess expert. He earned his Ph.D. in mathematics from Princeton University in 1983. After an academic career, he transitioned into full-time science writing, contributing to publications such as Science, New Scientist, and Scientific American. Mackenzie is well-known for his ability to communicate complex mathematical and scientific concepts to general audiences. He has authored or co-authored numerous books, including 'The Universe in Zero Words' and 'The Big Splat, or How Our Moon Came to Be'. He is also recognized for his collaboration with Judea Pearl on 'The Book of Why', which explores the science of causal inference. Beyond his writing, Mackenzie is a USCF Life Master in chess and an active volunteer in his local community in Santa Cruz, California.

Introduction

Modern data science is stuck on the first rung of Judea Pearl’s Ladder of Causation: association. Although algorithms process massive datasets, they confuse statistical correlation with actual causal relationships. This article explains why data without a causal model is merely a silent record of the past. The reader will learn how moving from passive observation to intervention and counterfactual thinking forms the foundation of true intelligence, distinguishing humans from advanced probability calculators.

The Association Trap: Why Data Is Not Enough for Thinking

Modern AI systems do not understand causality because they operate solely on conditional probability P(Y | X). These systems are like an owl observing the movements of a mouse—they predict events based on historical patterns without understanding the mechanisms of the world. The inability to distinguish cause from effect stems from the fact that raw data contains no information about the structure of reality. Algorithms lack the innate mental models that would allow them to understand, for example, that a rooster’s crowing does not cause the sun to rise.
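The gap between seeing and doing can be made concrete in a few lines of simulation. The sketch below (all probabilities invented for illustration) encodes a hidden common cause, dawn, that drives both the rooster's crow and the sunrise: conditioning on the crow makes the sunrise look certain, while forcing the crow by intervention leaves the sunrise probability unchanged.

```python
import random

random.seed(0)

def observe(n=100_000):
    """Observational world: approaching dawn causes both the rooster's
    crow and the sunrise. Crowing has no effect on the sun."""
    data = []
    for _ in range(n):
        dawn = random.random() < 0.5            # hidden common cause
        crow = dawn and random.random() < 0.9   # rooster reacts to dawn
        sunrise = dawn                          # the sun follows dawn only
        data.append((crow, sunrise))
    return data

data = observe()
# Association (rung 1): P(sunrise | crow) is very high.
p_sun_given_crow = (sum(s for c, s in data if c)
                    / sum(1 for c, s in data if c))

def intervene(n=100_000):
    """Intervention (rung 2): do(crow=1) forces the rooster to crow,
    cutting the arrow dawn -> crow; the sunrise is unaffected."""
    hits = 0
    for _ in range(n):
        dawn = random.random() < 0.5
        crow = True                             # do(crow = 1)
        sunrise = dawn
        hits += sunrise
    return hits / n

p_sun_do_crow = intervene()
print(round(p_sun_given_crow, 2))   # near 1.0: strong association
print(round(p_sun_do_crow, 2))      # near 0.5: no causal effect
```

The two numbers diverge precisely because the simulator, unlike a pattern-matching algorithm, knows which arrow the intervention cuts.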

Why Algorithms Do Not Understand the Counterfactual World

AI systems cannot move beyond the levels of correlation and intervention because they lack intentionality and the formal do(X) operator. Intervention requires simulating a change in a system, and counterfactual thinking requires the ability to analyze scenarios that never occurred. Algorithms are limited to facts, whereas intelligence requires imagination. Without an internal model of the world, machines cannot answer "what if" questions because they are not agents with their own goals, but merely passive processors of the statistical echoes of past decisions.
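Counterfactual reasoning does have a formal recipe: Pearl's three steps of abduction, action, and prediction. A minimal sketch in a toy linear structural causal model (the equations and numbers are assumptions chosen purely for illustration):

```python
# Toy structural causal model:
#   X := U_x
#   Y := 2*X + U_y
def counterfactual_y(x_obs, y_obs, x_alt):
    """Answer "Y was y_obs when X was x_obs; what would Y have been
    had X been x_alt?" via Pearl's three-step procedure."""
    # 1. Abduction: recover the latent noise from the observation.
    u_y = y_obs - 2 * x_obs
    # 2. Action: replace the equation for X with X := x_alt (do-operator).
    # 3. Prediction: propagate forward with the recovered noise.
    return 2 * x_alt + u_y

# "Y was 3 when X was 1; had X been 0, Y would have been 1."
print(counterfactual_y(x_obs=1, y_obs=3, x_alt=0))  # -> 1
```

The crucial ingredient is the model itself: the data alone never reveals the equation for Y, which is exactly why a purely statistical learner cannot take this step.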

The Counterfactual Paradox: How to Study Worlds That Do Not Exist

Science manages to study alternative scenarios thanks to causal models and the calculus of potential outcomes. Although we cannot observe two paths simultaneously, mathematics allows us to build synthetic worlds. Understanding causality is essential for resolving paradoxes such as Simpson’s paradox or Berkson’s paradox. Data alone leads to erroneous conclusions because it ignores the structure of information generation. Understanding that variables can be colliders or confounders allows us to avoid false correlations that, in pure statistics, are indistinguishable from the truth.
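Simpson's paradox is easiest to see with concrete counts. The numbers below are illustrative, patterned on the well-known kidney-stone example: treatment A beats B in every severity subgroup yet loses in the aggregate, because severity (a confounder) drives both the choice of treatment and the outcome.

```python
from fractions import Fraction as F

# (successes, patients) per treatment and stone size; illustrative counts.
data = {
    ("A", "small"): (81, 87),
    ("B", "small"): (234, 270),
    ("A", "large"): (192, 263),
    ("B", "large"): (55, 80),
}

def rate(treatment, size=None):
    """Exact success rate, pooled over sizes unless one is given."""
    items = [(s, n) for (t, sz), (s, n) in data.items()
             if t == treatment and (size is None or sz == size)]
    succ = sum(s for s, _ in items)
    total = sum(n for _, n in items)
    return F(succ, total)

assert rate("A", "small") > rate("B", "small")  # A wins on small stones
assert rate("A", "large") > rate("B", "large")  # A wins on large stones
assert rate("A") < rate("B")                    # ...yet loses overall
```

No amount of extra data resolves the reversal; only the causal diagram tells us that the subgroup rates, not the pooled ones, answer the question "which treatment should I take?"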

Why Statistics Is Not Enough: The Power of Causal Thinking

Pure statistical analysis without a causal model leads to errors because it treats all correlations as equivalent. In the face of paradoxes like Lord’s paradox, controlling variables without knowledge of their role in the causal structure can distort the result. Understanding causality allows us to distinguish noise from significant mechanisms. Without this tool, statistics becomes a sophisticated deception that fails to explain why certain phenomena occur. True science requires moving from "what is happening" to "why it is happening."
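Collider bias, the distortion that comes from controlling for a variable that is an effect rather than a cause, can be demonstrated directly. In the sketch below (population and probabilities invented), talent and beauty are independent, both raise the chance of fame, and conditioning on fame manufactures a negative correlation between them.

```python
import random

random.seed(1)

def correlation(pairs):
    """Pearson correlation of a list of (x, y) pairs."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    cov = sum((x - mx) * (y - my) for x, y in pairs) / n
    sx = (sum((x - mx) ** 2 for x, _ in pairs) / n) ** 0.5
    sy = (sum((y - my) ** 2 for _, y in pairs) / n) ** 0.5
    return cov / (sx * sy)

population = []
for _ in range(200_000):
    talent = random.random() < 0.3
    beauty = random.random() < 0.3
    famous = talent or beauty        # collider: both causes feed into it
    population.append((talent, beauty, famous))

c_all = correlation([(t, b) for t, b, _ in population])
c_famous = correlation([(t, b) for t, b, f in population if f])
print(round(c_all, 2))      # near 0: independent in the full population
print(round(c_famous, 2))   # clearly negative among the famous only
```

"Controlling for fame" here does not remove noise; it creates a spurious link, which is why variable selection must follow the causal structure rather than statistical habit.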

How to Avoid Statistical Traps Through Causal Inference

Causal inference uses Bayesian networks and causal diagrams to block false information paths. With them, the researcher becomes a geologist who understands the history of the processes rather than merely seeing the rock. Applying the do(X) operator makes it possible to simulate experiments where ethics or logistics rule out real-world intervention. This approach restores the questions that raw data leaves silently unanswered, allowing us to consciously design the future instead of passively guessing at it.
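The adjustment behind do(X) can be written out explicitly. For a single observed confounder Z on a back-door path (Z influences both X and Y), the interventional distribution is P(y | do(x)) = Σ_z P(y | x, z) P(z). The sketch below (all probability tables are made-up numbers for a binary model) compares this adjusted estimate with naive conditioning.

```python
# Hand-specified discrete model with structure Z -> X, Z -> Y, X -> Y.
P_z = {0: 0.6, 1: 0.4}                              # P(Z=z)
P_x1_given_z = {0: 0.2, 1: 0.8}                     # P(X=1 | Z=z)
P_y1_given_xz = {(0, 0): 0.3, (1, 0): 0.5,          # P(Y=1 | X=x, Z=z)
                 (0, 1): 0.6, (1, 1): 0.8}

def p_y_do_x(x):
    """P(Y=1 | do(X=x)) via the back-door adjustment formula."""
    return sum(P_y1_given_xz[(x, z)] * P_z[z] for z in P_z)

def p_y_given_x(x):
    """Ordinary conditioning P(Y=1 | X=x): confounded by Z."""
    num = den = 0.0
    for z in P_z:
        p_x = P_x1_given_z[z] if x == 1 else 1 - P_x1_given_z[z]
        num += P_y1_given_xz[(x, z)] * p_x * P_z[z]
        den += p_x * P_z[z]
    return num / den

print(round(p_y_do_x(1), 3))     # adjusted causal estimate: 0.62
print(round(p_y_given_x(1), 3))  # naive estimate, inflated by Z: 0.718
```

The naive estimate overshoots because Z = 1 raises both X and Y; the adjustment formula severs that back-door path without ever running the experiment.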

Summary

In the era of big data, it is easy to develop a deceptive sense of omniscience derived from charts. True knowledge, however, requires moving beyond the passivity of an observer. The ability to ask counterfactual questions, about what could have been, defines our freedom and responsibility. In a world dominated by algorithms that see everything but understand nothing, will our most important skill remain the ability to consciously create alternatives? It is this very question that marks the boundary between a mindless machine and human reason.

📖 Glossary

Ladder of Causation
A three-rung classification of cognitive abilities, comprising association, intervention, and counterfactuals, developed by Judea Pearl.
The do(X) Operator
A formal mathematical tool for simulating an external intervention in a causal model, independently of the variable's natural causes.
Counterfactuals
The highest level of reasoning, consisting of analyzing alternative scenarios and "what would have happened if" questions about the past.
Confounder
A variable that simultaneously influences both the cause and the effect, creating spurious statistical correlations that suggest a nonexistent direct relationship.
Conditional Probability P(Y | X)
A measure of the chance that event Y occurs given that event X has been observed; the foundation of the statistical level of association.
Randomized Controlled Trials (RCT)
A research method based on randomly assigning subjects to groups, which isolates the effect of a specific intervention from other variables.

Frequently Asked Questions

Why does modern AI have trouble understanding causality?
Most algorithms are based on passive observation and statistical pattern matching, which does not allow for distinguishing correlations from real causal relationships.
How is association different from intervention on the Ladder of Causation?
Association is the passive observation of data and estimation of probabilities based on patterns, while intervention is the active action of changing the system in order to observe effects.
What is the do(X) operator in causal modeling?
It is a mathematical notation of an intervention that allows us to model a situation in which we deliberately force a change in a specific value, ignoring its previous statistical conditions.
Why is data alone not enough to build intelligent machines?
Raw data is a record of facts that have already occurred; it does not contain information about the workings of the world or what would have happened in alternative scenarios that the machine has not seen.
What is the birth weight paradox mentioned in the text?
This is an example of a statistical error resulting from an incorrect selection of variables (colliders), where the correlation falsely suggests that maternal smoking may be beneficial for the newborn.

🧠 Thematic Groups

Tags: Ladder of Causation, Judea Pearl, association, intervention, counterfactuals, do(X) operator, machine learning, correlation, causality, conditional probability, randomized controlled trials, confounder, mental models, neural networks, causal inference