Workshop: "Safe, Anytime-Valid Inference" (SAVI)
Dec 13 - Dec 17
A large fraction of the research published in top journals in applied sciences such as medicine and psychology has been claimed to be irreproducible. In light of this 'replicability crisis', traditional methods for hypothesis testing, most notably those based on p-values, have come under intense scrutiny. One central problem is the following: if our test result is promising but inconclusive (say, p = 0.07), we cannot simply decide to gather a few more data points. While this practice is ubiquitous in science, it invalidates p-values and error guarantees and makes the results of standard meta-analyses very hard to interpret. This issue is not unique to p-values: other approaches, such as replacing testing by estimation with confidence intervals, suffer from similar optional continuation problems. Over the last few years several distinct but closely related solutions have been proposed, such as anytime-valid confidence sequences, anytime-valid p-values, and safe tests.
Remarkably, all these approaches can be understood in terms of (sequential) gambling. One formulates a gambling strategy under which one would not expect to gain any money if the null hypothesis were true. If, for the given data, one would have won a large amount of money in this game, this provides evidence against the null hypothesis. The test statistic of traditional statistics is replaced by the gambling strategy; the p-value is replaced by the (virtual) amount of money gained. In more mathematical terms, evidence against the null and confidence sets are derived in terms of non-negative supermartingales. While this idea in essence goes back to Wald's sequential testing of the 1950s and its extensions by Robbins and coworkers in the early 1960s and Lai in the 1970s, it never really caught on because it used to be applicable only to very simple statistical models and testing scenarios.
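To make the gambling picture concrete, here is a minimal sketch for a coin-flip null hypothesis. It is an illustration under stated assumptions rather than a method prescribed by the workshop: the function names `betting_wealth` and `first_rejection`, the null value `p0 = 0.5`, the alternative `p1 = 0.7`, and the level `alpha = 0.05` are all hypothetical choices. The gambler starts with capital 1 and multiplies it by the likelihood ratio of each outcome. Under the null the expected multiplier is exactly 1, so the wealth is a non-negative martingale, and by Ville's inequality it exceeds 1/alpha at any time with probability at most alpha.

```python
def betting_wealth(outcomes, p0=0.5, p1=0.7):
    """Wealth after each 0/1 outcome, starting from capital 1.

    Each round the wealth is multiplied by the likelihood ratio of the
    alternative (success probability p1) to the null (p0). Under the
    null the expected multiplier is p0*(p1/p0) + (1-p0)*((1-p1)/(1-p0)) = 1.
    p0 and p1 are illustrative assumptions, not values from the text.
    """
    wealth = 1.0
    path = []
    for x in outcomes:
        if x == 1:
            wealth *= p1 / p0
        else:
            wealth *= (1 - p1) / (1 - p0)
        path.append(wealth)
    return path


def first_rejection(path, alpha=0.05):
    """First time (1-indexed) the wealth reaches 1/alpha, or None.

    Stopping at this time keeps the type-I error at most alpha,
    no matter how long data collection continues.
    """
    threshold = 1.0 / alpha
    for t, w in enumerate(path, start=1):
        if w >= threshold:
            return t
    return None
```

For instance, ten successes in a row multiply the wealth by 1.4 each round, so `first_rejection(betting_wealth([1] * 10))` returns 9, the first round at which the wealth passes 20. A strictly alternating sequence shrinks the wealth by a factor 0.84 every two rounds and never leads to rejection.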
However, recent work shows that this idea is essentially universally applicable: one can design supermartingales for large classes of tests, both parametric and non-parametric, and for many estimation problems, yielding anytime-valid p-values using non-asymptotic versions of the law of the iterated logarithm. A variation of the idea, the S-value, can even be applied to completely arbitrary tests, at the price of only allowing for a weaker form of optional stopping. All these techniques have both a gambling and a Bayes factor interpretation. These directions thus go some way toward uniting Bayesian and frequentist ways of thinking: they allow the explicit use of prior knowledge and often rely on Bayesian techniques, while retaining frequentist error control and confidence bounds.
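One standard way to turn such a wealth process into an anytime-valid p-value is to take the reciprocal of its running maximum, capped at 1. The sketch below assumes a non-negative wealth path that started from capital 1 and is a (super)martingale under the null; the function name `anytime_valid_p_values` is hypothetical, and this is only one construction among those the text alludes to.

```python
def anytime_valid_p_values(wealth_path):
    """Map a wealth path to a non-increasing sequence of p-values.

    Assumes the wealth started at 1 and is a non-negative
    (super)martingale under the null. By Ville's inequality,
    p_t = min(1, 1 / max_{s <= t} wealth_s) then satisfies
    P(p_t <= alpha for some t) <= alpha under the null.
    """
    running_max = 1.0  # the initial capital counts toward the maximum
    p_values = []
    for w in wealth_path:
        running_max = max(running_max, w)
        p_values.append(min(1.0, 1.0 / running_max))
    return p_values
```

Because the running maximum never decreases, the resulting p-values can only shrink as data accumulate, so they may be inspected after every observation without inflating the error rate.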
This workshop aims to bring together two groups of researchers: those who have been developing the mathematical, probabilistic and statistical foundations of this area, and practitioners who have studied and written about the reproducibility crisis in the sciences.
This conference supports the Welcoming Environment Statement of the Association for Women in Mathematics (AWM).
|Peter Grünwald||CWI, Amsterdam|
|Aaditya Ramdas||Carnegie Mellon University|
|Akshay Balsubramani||Stanford University||Genetics, Computer Science|
|Leonhard Held||University of Zurich||Biostatistics, Reproducibility|
|Chris Jennison||University of Bath||Statistics|
|Wouter Koolen||CWI, Amsterdam||Computer Science (ML)|
|Daniel Lakens||TU Eindhoven||Human-Technology Interaction, Reproducibility|
|Theis Lange||University of Copenhagen||Biostatistics, Reproducibility|
|Alexander Ly||University of Amsterdam and CWI||Psychology, Computer Science|
|Deborah Mayo||Virginia Tech||Philosophy|
|Luigi Pace||University of Udine||Economics, Statistics|
|Don van Ravenzwaaij||Groningen University||Psychology, Statistics|
|Christian Robert||Université Paris-Dauphine + Warwick University||Statistics|
|Alessandra Salvan||University of Padova||Statistics|
|Glenn Shafer||Rutgers University||Foundations of Probability and Statistics|
|Mark Simmonds||University of York||Review and Dissemination|
|Anne Lyngholm Sørensen||University of Copenhagen||Statistics|
|Vladimir Vovk||Royal Holloway London||Computer Science (ML), Statistics|
|Eric-Jan Wagenmakers||University of Amsterdam||Psychology|
|Ruodu Wang||University of Waterloo||Statistics, Actuarial Science|
|Larry Wasserman||Carnegie Mellon University||Statistics, ML|