Statistics after the loss of innocence: New rigor in the age of AI

📚 Based on

Deep Learning Assisted Statistical Methods with Examples in R
Chapman and Hall/CRC
ISBN: 9781041158431

👤 About the Author

Tianyu Zhan

AbbVie Inc.

Tianyu Zhan is a Director at AbbVie Inc., where he focuses on innovative clinical trial designs and advanced statistical analysis methods. He earned his Ph.D. in Biostatistics from the University of Michigan, Ann Arbor, in 2017. His professional work and research interests center on late-phase clinical trials, leveraging computational approaches to optimize trial design and outcomes. Zhan has contributed significantly to the field by integrating deep learning techniques with traditional statistical methods to address complex challenges in hypothesis testing, point estimation, and optimization. His work aims to enhance statistical efficiency and interpretability in practical applications, particularly within the biopharmaceutical industry. He is recognized for his efforts in promoting responsible and valid applications of AI-based methodologies in scientific research.

Introduction

Modern statistics has lost its innocence, evolving into a discipline where traditional rigor must coexist with the power of artificial intelligence. Tianyu Zhan’s project proposes a new methodology: statistics as an architecture of accountability. Instead of blind faith in algorithms, researchers must adopt the role of inference architects who utilize neural networks to optimize processes while maintaining inviolable mathematical guarantees. This article explains how, in the age of AI, one can reconcile innovation with regulatory rigor and ethical responsibility for data.

Statistics in the Age of AI: From Watchmaker to Bridge Engineer

In adaptive research, deep neural networks (DNNs) serve as auxiliary modules that optimize test statistics where classical formulas fall short. To avoid losing control over inference, Zhan proposes a two-stage test construction: the first network optimizes the statistic for the data, while the second determines the critical values, guaranteeing that the Type I error rate is controlled at its nominal level. As a result, AI does not replace statistical thinking but becomes a tool for building solutions where theory fails to provide closed-form expressions. This approach integrates modern machine learning with the requirements of scientific rigor, treating the algorithm as a loyal executor within human-defined boundaries.
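
To make the two-stage idea concrete, here is a minimal sketch in R (our illustration, not code from the book): a tiny `nnet` model stands in for the first-stage DNN that produces the test statistic, and the second stage calibrates the critical value by Monte Carlo simulation under the null hypothesis, which is what keeps the Type I error at its nominal level. The names `dnn_statistic` and `simulate_null_stat` are ours.

```r
## Minimal sketch, not the book's implementation: a small nnet model stands
## in for the statistic-producing DNN; the critical value is then calibrated
## by Monte Carlo simulation under H0, so the Type I error is held at the
## nominal level by construction.

library(nnet)   # small feed-forward networks; included with standard R

set.seed(2024)  # frozen before any trial data are seen
alpha <- 0.05
n     <- 50     # per-arm sample size

## Stage 1: the "learned" statistic -- in-sample accuracy of a tiny network
## trying to tell the two arms apart from the outcome alone.
dnn_statistic <- function(y, arm) {
  fit <- nnet(x = matrix(y, ncol = 1), y = arm, size = 2,
              trace = FALSE, maxit = 100)
  mean((fitted(fit) > 0.5) == arm)
}

## Stage 2: the critical value is the (1 - alpha) quantile of the statistic
## under the null hypothesis of no treatment effect.
simulate_null_stat <- function() {
  y   <- rnorm(2 * n)            # both arms drawn from the same distribution
  arm <- rep(c(0, 1), each = n)
  dnn_statistic(y, arm)
}
null_stats <- replicate(500, simulate_null_stat())
crit_value <- quantile(null_stats, 1 - alpha)

## Sanity check: the empirical Type I error on fresh null data stays near alpha.
mean(replicate(200, simulate_null_stat() > crit_value))
```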

The Mechanization of Integrity: AI as a Guarantor of Research Rigor

The use of trained DNN models strengthens research integrity through quantitative pre-specification. Instead of relying on a researcher's "gentleman's agreement," we freeze the algorithm's parameters before data arrival, which eliminates the temptations of p-hacking and subjective interpretation. The mechanization of procedural honesty makes the system deterministic and resistant to post-hoc manipulation. Integrating DNNs into such a structure changes the nature of clinical statistics from an ethical declaration into an institutional control system. Consequently, even in complex studies, the cognitive process remains auditable, and the analyst's role shifts from improvisation to the guardianship of non-negotiable inference principles.
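
As a rough illustration of what freezing the algorithm's parameters before data arrival can look like in practice (a sketch under our own assumptions, not the book's code), the analysis rule below is trained only on simulated pre-trial data, serialized to disk, and fingerprinted with a checksum that can be recorded in the analysis plan; the eventual analysis is then a deterministic replay of the locked object.

```r
## Illustrative sketch of quantitative pre-specification (file and object
## names are ours): train on simulated data, freeze, fingerprint, replay.

library(nnet)

set.seed(123)                                     # part of the frozen protocol

## 1. Train the analysis rule on simulated, pre-trial data only.
sim   <- data.frame(x = rnorm(200))
sim$y <- rbinom(200, 1, plogis(sim$x))
frozen_rule <- nnet(y ~ x, data = sim, size = 3, trace = FALSE)

## 2. Freeze: serialize the fitted object and record its checksum in the
##    statistical analysis plan; any later modification becomes detectable.
saveRDS(frozen_rule, "frozen_rule.rds")
cat("MD5 fingerprint:", tools::md5sum("frozen_rule.rds"), "\n")

## 3. At analysis time, the locked rule is reloaded and applied verbatim --
##    no retraining, no tuning, no analyst discretion.
trial_data  <- data.frame(x = rnorm(50))          # placeholder for real data
predictions <- predict(readRDS("frozen_rule.rds"), newdata = trial_data)
```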

Algorithms as Infrastructure Citizens and Guardians of Evidence

Reconciling advanced models with ethics and regulations requires treating algorithms as citizens of the infrastructure. To ensure reproducibility and safety, AI-based systems must be equipped with safeguards that protect against errors outside the training range. In biostatistics, this means the necessity of controlling data representativeness to avoid perpetuating social asymmetries. Integrating deep learning with classical statistics allows for the construction of hybrids that combine high predictive power with rigorous error control. The inference architect must therefore ensure that the model does not become an "epistemological fraud," but rather a tool that, thanks to its computational architecture, guarantees scientific credibility even in the face of an unpredictable reality.
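
One simple form such a safeguard can take is sketched below (an illustration with names of our choosing, such as `safeguarded_predict`, not the book's code): the DNN-based prediction is trusted only inside the covariate range seen during training, and outside that range the system falls back to a plain, well-understood classical model.

```r
## Minimal "safeguarding" wrapper: use the network inside its training
## support, fall back to a classical linear model outside it.

library(nnet)

set.seed(7)
train   <- data.frame(x = runif(300, 0, 1))
train$y <- sin(2 * pi * train$x) + rnorm(300, sd = 0.2)

dnn_fit    <- nnet(y ~ x, data = train, size = 5, linout = TRUE, trace = FALSE)
linear_fit <- lm(y ~ x, data = train)             # classical fallback
x_range    <- range(train$x)                      # training support

safeguarded_predict <- function(newdata) {
  inside <- newdata$x >= x_range[1] & newdata$x <= x_range[2]
  out    <- numeric(nrow(newdata))
  if (any(inside))
    out[inside]  <- predict(dnn_fit, newdata[inside, , drop = FALSE])
  if (any(!inside))
    out[!inside] <- predict(linear_fit, newdata[!inside, , drop = FALSE])
  out
}

## The last point lies outside the training range and triggers the fallback.
safeguarded_predict(data.frame(x = c(0.3, 0.9, 1.7)))
```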

Summary

Statistics is becoming a new regime of cognition, in which methodological sovereignty is ensured by an architecture of accountability. The key to success is statistical republicanism: the distribution of power between human discernment and precisely designed algorithms. Can we harness computational power so that, instead of quick illusions, it provides us with secure guarantees of reliability? The answer lies in building systems that are safe by design, rather than merely by the researcher's intent.

📄 Full analysis available in PDF

📖 Glossary

Safeguarding
The practice of protecting statistical methods against errors generated both by the world and by the model, so that rigor is maintained in hybrid systems.
Quantitative pre-specification
Shifting the key sources of variability to the pre-study phase through rigid rules and frozen algorithm parameters, which limits the analyst's discretion.
Study integrity
Designing procedures so that the result cannot be manipulated after the fact, guaranteeing the objectivity of the scientific process.
Estimand
The theoretical quantity describing the treatment effect, which must be precisely defined and consistent with the objective of the clinical trial and with the chosen estimator.
Limiting properties
Statistical guarantees that the model retains even when it is imperfect, ensuring safe inference in complex systems.
Type I error
Erroneously rejecting a true null hypothesis; keeping this error at its nominal level is essential for the new regime of cognition.

Frequently Asked Questions

How do traditional statistics differ from new statistics in the AI era?
Traditional statistics relied on mathematical elegance and full interpretability of parameters, whereas the new approach shifts the focus to computational architecture and the careful construction of training data, while still preserving the rigor of inference.
What is the role of neural networks in Zhan's proposal?
Deep neural networks do not replace statistical thinking, but work as auxiliary modules optimizing test statistics and determining critical values in areas analytically inaccessible to classical methods.
Why is quantitative pre-specification more important than qualitative pre-specification?
Qualitative pre-specification is just a declaration of intent in the documentation, while quantitative pre-specification creates an immutable, deterministic inference architecture based on frozen code, resistant to human weaknesses and manipulation.
What is safeguarding in the context of hybrid systems?
Safeguarding is the practice of protecting research methods from errors resulting from the unpredictability of data and models, thereby maintaining scientific rigor and nominal error control in the world of AI black boxes.
What is the importance of computing infrastructure for modern methodology?
Infrastructure shifts the focus from the model itself to the computational process. The modern statistician becomes the architect of a system that must withstand the unpredictable shocks generated by incoming data streams.

🧠 Thematic Groups

Tags: statistics, artificial intelligence, deep neural networks, safeguarding, statistical inference, clinical trials, quantitative pre-specification, study integrity, computing infrastructure, estimand, adaptability, ICH E20 guidelines, Type I error, ensemble estimator, methodological rigor