# European Institute for Statistics, Probability, Stochastic Operations Research and its Applications

## Young European Statisticians Workshop (YES-III)

Workshop "Paradigms of Model Choice", October 5-7, 2009

### Summary

This is the third workshop in the series of YES (Young European Statisticians) workshops. The first was held in October 2007 on Shape Restricted Inference, with seminars given by Lutz Dümbgen (Bern) and Jon Wellner (Seattle) together with shorter talks by Laurie Davies (Duisburg-Essen) and Geurt Jongbloed (Delft). The second workshop was held in October 2008 on High Dimensional Statistics, with seminars given by Sara van de Geer (Zürich), Nicolai Meinshausen (Oxford) and Gilles Blanchard (Berlin).

The present workshop is directed at young statisticians (mainly Ph.D. students and postdocs) who are interested in the problem of model choice. Short seminars, each consisting of three 45-minute talks on various aspects of model choice, will be given by Professor Laurie Davies, Duisburg-Essen, Professor Peter Grünwald, Amsterdam, Professor Nils Hjort, Oslo, and Professor Christian Robert, Paris. The participants will also have the opportunity to give short talks on their own research (25 minutes, followed by 5 minutes of discussion).

Model choice has for many years been a point of disagreement and research in statistics. The applications range from the choice between several low dimensional models, all of which are reasonable models for the data, through the choice of variables to be included in a linear regression, to the choice of smoothing parameter in nonparametric regression, inverse and other ill-posed problems. Over the years several techniques have been developed, such as AIC, BIC, MDL (Minimum Description Length), cross-validation, Lasso (more generally L_1-penalization) and FIC (focused information criterion). Many of these techniques have proved successful for certain types of problem, but there is still a need for a discussion of the principles (if any) involved as well as the advantages and disadvantages of these approaches.
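Two of the criteria named above, AIC and BIC, can be illustrated with a minimal sketch for a choice between two low dimensional models. The example is not taken from the workshop material: the two candidate Gaussian models (mean fixed at zero versus free mean), the function names and the data are all assumptions made for illustration.

```python
import math


def gaussian_max_loglik(xs, mean):
    """Maximised Gaussian log-likelihood with the mean held at `mean`
    and the variance set to its maximum likelihood estimate."""
    n = len(xs)
    var_hat = sum((x - mean) ** 2 for x in xs) / n
    return -0.5 * n * (math.log(2 * math.pi * var_hat) + 1)


def aic(loglik, k):
    """Akaike information criterion for k fitted parameters (smaller is better)."""
    return 2 * k - 2 * loglik


def bic(loglik, k, n):
    """Bayesian information criterion for k fitted parameters and n observations."""
    return k * math.log(n) - 2 * loglik


def compare_models(xs):
    """Score two illustrative candidates: N(0, s^2) with k=1 and N(m, s^2) with k=2."""
    n = len(xs)
    xbar = sum(xs) / n
    candidates = {
        "zero_mean": (gaussian_max_loglik(xs, 0.0), 1),
        "free_mean": (gaussian_max_loglik(xs, xbar), 2),
    }
    return {
        name: {"aic": aic(ll, k), "bic": bic(ll, k, n)}
        for name, (ll, k) in candidates.items()
    }


def best(scores, criterion):
    """Name of the candidate with the smallest value of the given criterion."""
    return min(scores, key=lambda name: scores[name][criterion])


if __name__ == "__main__":
    # Hypothetical data, made up for this sketch.
    data = [1.9, 2.1, 2.4, 1.6, 2.0, 2.2, 1.8, 2.3]
    scores = compare_models(data)
    for criterion in ("aic", "bic"):
        print(criterion, "prefers", best(scores, criterion))
```

Both criteria trade goodness of fit against the number of parameters. They differ only in the penalty: 2 per parameter for AIC and ln(n) per parameter for BIC, so for n ≥ 8 BIC penalises an additional parameter more heavily than AIC.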
It is the aim of the workshop to inform the participants of the state of the art in each of these paradigms and to encourage discussion between the different schools. It is also intended that each paradigm of model choice be illustrated by examples of its use in real problems, to demonstrate its applicability to the analysis of data. Each of the speakers will concentrate on the problem of model choice from their own perspective.

### Approximate models and regularization (Davies)

This approach to model choice is based on the idea of approximate models. A model is regarded as an adequate approximation to a data set if 'typical' data generated under the model 'looks like' the real data. The word 'typical' is made precise by specifying a real number α, 0 < α < 1, which determines what proportion of the data sets generated under the model are to be regarded as typical. The words 'looks like' must be operationalized (in practice often in the form of a computer program) so that for any model and any data set it is possible to decide whether the model is an adequate approximation to the data. The precise nature of this will depend on the problem at hand; there is no general principle which can be used.

Typically there will be many adequate models, and interest will centre on certain simplest ones, where simplicity can be defined in terms of shape (e.g. the minimum number of local extreme values), smoothness (minimum total variation of a derivative) or the absence of 'free lunches' (minimum Fisher information). The ideas and the applications will be illustrated by several examples, amongst others from the area of nonparametric regression.

#### References

Davies, P. L. (1995) Data features. Statistica Neerlandica, 49, 185-245.

Davies, P. L. (2008) Approximating data (with discussion). Journal of the Korean Statistical Society, 37, 191-240.

Tukey, J. W. (1993) Issues relevant to an honest account of data-based inference, partially in the light of Laurie Davies's paper.
Princeton University, Princeton, http://www.stat-math.uni-essen.de/tukey/tukey.php

### The Minimum Description Length Principle (Grünwald)

We give a self-contained introduction to the Minimum Description Length (MDL) Principle, introduced by J. Rissanen in 1978. MDL is a theory of inductive inference, based on the idea that the more one is able to compress a given set of data, the more one can be said to have learned about the data. This idea can be applied to general statistical problems, and in particular to problems of model choice. In its simplest form, for a given class of probability models M and sample D, it tells us to pick the model H ∈ M that minimizes the sum of the number of bits needed to describe first the model H and then the data D, where D is encoded 'with the help of H'. This is a special case of the general formulation of MDL, which is based on the information-theoretic concept of a 'universal model', which embodies an automatic trade-off between goodness-of-fit and complexity. In these lectures, we focus on three aspects:

* Frequentist Considerations - Consistency and Minimax Convergence Rates: MDL model choice and prediction are statistically consistent under a wide variety of conditions. We review A. Barron's surprisingly simple proofs of these results, which provide a direct link between data compression and statistical convergence rates: each estimator can be interpreted as a code, and the better this code compresses the data in expectation, the faster the estimator's risk converges.
* Bayesian Considerations: since prior distributions may be interpreted as codes, practical MDL implementations are often quite similar to Bayes factor model selection and model averaging, but there are important differences. For example, the Bayes predictive distribution reappears in MDL, but the Bayes posterior does not.
Also, MDL avoids the Bayesian inconsistency results of Diaconis and Freedman, since these are based on priors that provably do not lead to data compression.
* AIC/BIC dilemma: standard MDL does not achieve the optimal minimax convergence rates in some nonparametric settings. We explain this phenomenon and describe the switch distribution as a potential remedy.

#### References

A. Barron, J. Rissanen and B. Yu. The Minimum Description Length Principle in Coding and Modeling. IEEE Transactions on Information Theory 44(6), 2743-2760, 1998.

P. Grünwald. A Tutorial Introduction to the MDL Principle. Chapters 1 and 2 of 'Advances in MDL: Theory and Practice', MIT Press, 2005.

P. Grünwald. The Minimum Description Length Principle. MIT Press, 2007.

T. van Erven, P. Grünwald and S. de Rooij. Catching up Faster by Switching Sooner: a prequential solution to the AIC-BIC dilemma. Preprint, arXiv:0807.1005, November 2008.

### Focused Information Criterion (Hjort)

The FIC was developed by Gerda Claeskens and Nils Hjort in two articles in the Journal of the American Statistical Association in 2003. They have since become two of the most cited articles on the problem of model choice. Whereas criteria such as AIC or BIC choose a model without reference to its intended use, the FIC explicitly demands that the use to which the model is to be put be made precise. If, for example, a quantile is of interest, one may choose a different model from the one chosen if the mean were the quantity of interest. If both are of interest for the same data set, then one could choose one model for the quantile and a different one for the mean. Claeskens and Hjort have made this precise in an asymptotic setting and shown how their approach can be validated. They are also able to prove the advantage of model averaging if the results for different models are close together. A further result which comes from their analysis is the calculation of confidence intervals.
Many statisticians choose a model on the basis of some criterion and then, having chosen it, calculate confidence intervals neglecting the process by which the model was chosen. This is known to lead to over-optimistic confidence intervals. Claeskens and Hjort have shown how this problem can be overcome within their paradigm, so that the confidence intervals have, at least asymptotically, the correct coverage probability.

#### References

Hjort, N. L. and Claeskens, G. (2003) Frequentist model average estimators. Journal of the American Statistical Association, 98, 879-899.

Claeskens, G. and Hjort, N. L. (2003) The focused information criterion. Journal of the American Statistical Association, 98, 900-916.

### Computational approaches to Bayesian model choice (Robert)

The seminar will cover recent developments in the computation of marginal distributions for the comparison of statistical models in a Bayesian framework. Although the introduction of reversible jump MCMC by Green in 1995 is rightly perceived as the 'second MCMC revolution', its implementation is often too complex for the problems at hand. When the number of models under consideration is of a reasonable magnitude, there exist computational alternatives such as bridge sampling, nested sampling and ABC (Approximate Bayesian Computation), which avoid model exploration with reasonable efficiency. The seminar will be devoted to discussing the advantages and disadvantages of these alternatives.

### Registration

Registration is closed.

### Conference Location

The workshop location is EURANDOM, Den Dolech 2, 5612 AZ Eindhoven, Laplace Building, 1st floor, LG 1.105.

EURANDOM is located on the campus of Eindhoven University of Technology, in the 'Laplacegebouw' building (LG on the map). The university is a 10-minute walk from Eindhoven railway station (take the exit on the north side and walk towards the tall building on the right with the sign TU/e).
For all information on how to get to Eindhoven, please check http://www.eurandom.tue.nl/contact.htm

### Hotel

For keynote speakers, Hotel Queen is reserved. You are requested to indicate arrival and departure dates on the registration form.

For accepted contributed speakers, a reservation will be made in the Sandton Hotel Eindhoven City Centre, free of charge. The room will be shared. If you prefer a single room (45 euro per night), mark the box on the registration form.

For participants it is possible to book a room in the Sandton Hotel Eindhoven City Centre through the organisation, at the reduced price of 89 euro per night plus 3.50 euro tourist tax. Breakfast is included. Indicate arrival and departure dates on the registration form.

For private bookings we suggest consulting the web pages of the Tourist Information Eindhoven, Postbus 7, 5600 AA Eindhoven.

### Lunches/dinner

On October 5 and 6 lunches are organised, free of charge for all participants, if ordered on the registration form.

The conference dinner will be held on Tuesday, October 6. Non-invitees are asked to pay 35 euro in cash on arrival (preferably the exact amount in euros). Indicate your attendance on the registration form.

### Contact

For more information please contact Mrs. Lucienne Coolen, workshop officer of EURANDOM, at coolen'at'eurandom.tue.nl

### Organisers

* Prof. P. L. Davies, University of Duisburg-Essen, Germany / Eindhoven University of Technology, Eindhoven / EURANDOM, Eindhoven (laurie.davies'at'uni-duisburg-essen.de)
* Prof. G. Jongbloed, University of Technology, Delft (G.jongbloed'at'ewi.tudelft.nl)

P.O. Box 513, 5600 MB Eindhoven, The Netherlands. Tel. +31 40 2478100, fax +31 40 2478190, e-mail: office@eurandom.tue.nl