European Institute for Statistics, Probability, Stochastic Operations Research and its Applications


Young European Statisticians Workshop (YES-III)

"Paradigms of Model Choice"
October 5-6-7, 2009

Programme

Monday 5th October

09.30-10.30 Registration  
10.30-10.45 Welcome  
10.45-11.30 Peter Grünwald The Minimum Description Length Principle 1
11.30-12.15 Nils Hjort Focused Information Criterion 1
12.15-12.35 Joern Dannemann Testing for two states in a hidden Markov model
12.35-13.45 Lunch  
13.45-14.30 Laurie Davies Location and the analysis of variance 1
14.30-15.15 Christian Robert Computational approaches to Bayesian model choice 1
15.15-15.45 Coffee/tea break  
15.45-16.05 Maik Schwarz Adaptive circular deconvolution by model selection under unknown error distribution
16.05-16.25 Maya Shevlyakova Forward Stagewise variable selection for the high-dimensional Cox's model
16.25-16.45 Mehrdad Niaparast Quasi-Likelihood and the Optimal Designs for the Poisson regression Models with Random Intercept
16.45-17.05 Lidia Burzala Estimation of human dose response curve based on the allometric scaling rule
17.05-17.30 Discussion  
18.30 Workshop Dinner  

Tuesday 6th October

09.00-09.45 Laurie Davies A concept of approximation
09.45-10.05 Uwe Saint-Mont Modelling in Context
10.05-10.30 Coffee/tea break  
10.30-11.15 Nils Hjort Focused Information Criterion 2
11.15-11.35 Birgit Witte Smooth plug-in inverse estimators in the current status continuous mark model
11.35-11.55 Joseph Tadjuidje Kamgaing Some asymptotics for hidden Markov mixture of autoregressive processes with ARCH components
11.55-12.15 Habib Jafari Optimal Design for MNL and NMNL Models in Discrete Choice Experiment
12.15-12.35 Rembert De Blander Extended Hausman-Taylor Estimation with Automated Instrument Selection
12.35-13.30 Lunch  
13.30-14.15 Peter Grünwald The Minimum Description Length Principle 2
14.15-15.00 Christian Robert Computational approaches to Bayesian model choice 2
15.00-15.30 Coffee/tea break  
15.30-16.15 Laurie Davies Approximate Models and Regularization
16.15-17.00 Discussion  

Wednesday 7th October

09.00-09.45 Peter Grünwald The Minimum Description Length Principle 3
09.45-10.30 Nils Hjort Focused Information Criterion 3
10.30-11.00 Coffee/tea break with Surprise!
11.00-11.45 Christian Robert Computational approaches to Bayesian model choice 3
11.45-12.30 Discussion  
12.30 End of the workshop  

 

ABSTRACTS


Rembert De Blander

Extended Hausman-Taylor Estimation with Automated Instrument Selection

In this presentation, I discuss the extension of the Hausman-Taylor (1981) (HT) estimator to the two-way error component panel data model proposed by Wyhowski (1994). This type of instrumental variables (IV) procedure typically yields a large number of potential instruments, all of which require testing. Since this task can become quite time-consuming, I propose an automated downward sequential Lagrange multiplier (LM), i.e. score, testing procedure that eliminates unsuitable instruments.


Lidia Burzala

Estimation of human dose response curve based on the allometric scaling rule

In this talk we introduce two models for estimating the human dose response curve based on the allometric scaling rule. These models are examples of Shape Invariant Models. Estimation of the human dose response curve involves estimation of a common model function (archetype) and a scaling factor. We consider a parametric and a shape constrained nonparametric approach for modeling the archetype function. The asymptotic properties of these estimators will be discussed.


Jörn Dannemann

Testing for two states in a hidden Markov model

For hidden Markov models (HMMs), choosing the number of states of the underlying Markov chain is an essential problem. Our test for this problem is an extension to HMMs of the modified likelihood ratio test for two states in a finite mixture. It is based on inference for the marginal mixture distribution of the HMM. Its asymptotic distribution theory under the null hypothesis of two states is derived.


Laurie Davies
I   - Location and the analysis of variance 1
II  - A concept of approximation
III - Approximation and examples


Peter Grünwald

The Minimum Description Length Principle

We give a self-contained introduction to the Minimum Description Length (MDL) Principle, introduced by J. Rissanen in 1978. MDL is a theory of inductive inference, based on the idea that the more one is able to compress a given set of data, the more one can be said to have learned about the data. This idea can be applied to general statistical problems, and in particular to problems of model choice. In its simplest form, for a given class of probability models M and sample D, it tells us to pick the model H in M that minimizes the sum of the number of bits needed to describe first the model H and then the data D, where D is encoded 'with the help of H'. This is a special case of the general formulation of MDL, which is based on the information-theoretic concept of 'universal models', which embody an automatic trade-off between goodness-of-fit and complexity.

First Lecture:

We give a crash course on data compression. We focus on the 1-to-1 relationship between codelength functions and probability mass functions via the Kraft inequality. This allows us, essentially, to view log-likelihoods as codelengths. We then introduce the fundamental notion of a 'universal code'. Given a set of codes (data compression methods) L, a universal code is a code that allows us to compress data at least as well as the code in L that is optimal with hindsight; this should hold no matter what the data are. We introduce three important universal codes: two-part codes, Bayesian codes and minimax optimal 'normalized maximum likelihood' codes.
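The correspondence between codelengths and probabilities can be illustrated with a small sketch. The alphabet and probabilities below are made up for illustration and are not taken from the lectures; the helper names are likewise hypothetical. The sketch assigns each symbol the ideal codelength -log2 P(x), checks that the lengths satisfy the Kraft inequality, and recovers the probabilities from the lengths.

```python
import math

def codelengths(pmf):
    """Ideal codelengths in bits: L(x) = -log2 P(x)."""
    return {x: -math.log2(p) for x, p in pmf.items() if p > 0}

def kraft_sum(lengths):
    """Left-hand side of the Kraft inequality: sum over x of 2^(-L(x))."""
    return sum(2.0 ** -l for l in lengths.values())

def pmf_from_lengths(lengths):
    """Map codelengths back to probabilities: P(x) = 2^(-L(x))."""
    return {x: 2.0 ** -l for x, l in lengths.items()}

# Hypothetical four-symbol source (illustrative values only).
pmf = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
lengths = codelengths(pmf)
print(lengths)
print(kraft_sum(lengths))          # at most 1 for any uniquely decodable code
print(pmf_from_lengths(lengths))
```

Because the probabilities here are powers of two, the codelengths are whole numbers of bits; for general distributions the lengths are non-integer and are usually treated as idealized codelengths.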

Second Lecture:

We define the MDL principle for model selection, estimation and prediction. We show how optimal coding is essentially the same as optimal sequential prediction under a log scoring rule. This allows us to prove frequentist consistency and rate-of-convergence theorems for MDL estimation and model selection; we briefly explain Barron's two main theorems on this issue. We highlight the close similarities yet essential differences between MDL and Bayesian inference.
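The link between coding and sequential prediction under log loss can be checked numerically in a simple case. The sketch below is an illustration under stated assumptions, not material from the lecture: it uses a Bernoulli model with a uniform prior, for which the Bayes predictive distribution is Laplace's rule of succession. The binary sequence and the function names are hypothetical. The cumulative log loss of the sequential predictor should equal the Bayesian codelength, i.e. minus the base-2 logarithm of the Bayes marginal likelihood of the whole sequence.

```python
import math

def sequential_log_loss(bits):
    """Cumulative log loss (in bits) of Laplace's rule-of-succession predictor."""
    ones = 0
    total = 0.0
    for t, b in enumerate(bits):
        p_one = (ones + 1) / (t + 2)          # predictive probability of a 1
        p = p_one if b == 1 else 1.0 - p_one  # probability given to the outcome
        total += -math.log2(p)
        ones += b
    return total

def bayes_codelength(bits):
    """-log2 of the Bayes marginal likelihood under a uniform prior on theta."""
    n, k = len(bits), sum(bits)
    marginal = 1.0 / ((n + 1) * math.comb(n, k))  # equals k!(n-k)!/(n+1)!
    return -math.log2(marginal)

bits = [1, 0, 1, 1, 0, 1, 1, 1, 0, 1]  # hypothetical data
print(sequential_log_loss(bits), bayes_codelength(bits))
```

The two numbers agree up to floating-point error because the marginal likelihood factorizes into the product of the one-step-ahead predictive probabilities.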

Third Lecture:

We analyze the reasons that MDL and Bayes factor model selection do not achieve the minimax optimal convergence rates in nonparametric model selection problems. We show how this so-called 'catch-up phenomenon' may be addressed by putting prior distributions (i.e. codes) on sequences of models rather than individual models. In this way we provide a novel solution to the AIC-BIC dilemma in model selection: we arrive at a model selection/averaging procedure that is both consistent if the true model happens to be finite dimensional, and achieves the minimax rates if the true model cannot be expressed with a finite number of parameters. We discuss this result in the greater framework of Dawid's prequential analysis.

In these lectures, we focus on three aspects:

* Frequentist Considerations - Consistency and Minimax Convergence Rates: MDL model choice and prediction is statistically consistent under a wide variety of conditions. We review A. Barron's surprisingly simple proofs of these results, which provide a direct link between data compression and statistical convergence rates: each estimator can be interpreted as a code, and the better this code compresses the data in expectation, the faster the estimator's risk converges.
* Bayesian Considerations: Since prior distributions may be interpreted as codes, practical MDL implementations are often quite similar to Bayes factor model selection and model averaging, but there are important differences. For example, the Bayes predictive distribution reappears in MDL, but the Bayes posterior does not. Also, MDL avoids the Bayesian inconsistency results of Diaconis and Freedman, since these are based on priors that provably do not lead to data compression.
* AIC/BIC-dilemma: Standard MDL does not achieve the optimal minimax convergence rates in some nonparametric settings. We explain this phenomenon and describe the switch distribution as a potential remedy.


Nils Hjort

Focused Information Criterion

The FIC was developed by Gerda Claeskens and Nils Hjort in two articles in the Journal of the American Statistical Association in 2003. They have since become two of the most cited articles on the problem of model choice. Whereas criteria such as AIC or BIC choose a model without reference to its intended use, the FIC explicitly demands that the use to which the model is to be put be made precise. If, for example, a quantile is of interest, one may choose a different model than if the mean were the quantity of interest. If both are of interest for the same data set, then one could choose one model for the quantile and a different one for the mean. Claeskens and Hjort have made this precise in an asymptotic setting and shown how their approach can be validated. They are also able to prove the advantage of model averaging if the results for different models are close together. A further result which comes from their analysis is the calculation of confidence intervals. Many statisticians choose a model on the basis of some criterion and then, having chosen it, calculate confidence intervals neglecting the process by which the model was chosen. This is known to lead to over-optimistic confidence intervals. Claeskens and Hjort have shown how this problem can be overcome within their paradigm so that the confidence intervals have, at least asymptotically, the correct coverage probability.


Habib Jafari, Otto-von-Guericke-Universität Magdeburg, Institute for Mathematical Stochastics (IMST)

Optimal Design for MNL and NMNL Models in Discrete Choice Experiment


Christian Robert

Computational approaches to Bayesian model choice

1
After introducing the basic Bayesian concepts for model choice and testing, the first lecture will focus on importance sampling based simulation methods used to compute the Bayes factor. This includes bridge sampling, harmonic mean estimators, defensive sampling, and nested sampling.
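As a rough illustration of importance sampling for a Bayes factor, the sketch below uses a toy binomial problem that is not taken from the lecture. It assumes hypothetical data (7 successes in 10 trials), a uniform prior on the success probability under the alternative model, and a point null at 0.5. The proposal is simply the prior, which is the most basic importance sampler; the methods named above (bridge sampling, nested sampling and so on) are refinements that this sketch does not implement.

```python
import math
import random

def binomial_likelihood(theta, k, n):
    """P(k successes in n trials | theta)."""
    return math.comb(n, k) * theta**k * (1.0 - theta)**(n - k)

def marginal_likelihood_is(k, n, num_draws, rng):
    """Importance-sampling estimate of the marginal likelihood for
    theta ~ Uniform(0, 1), k | theta ~ Binomial(n, theta).
    The proposal equals the prior, so every importance weight is 1."""
    draws = (rng.random() for _ in range(num_draws))
    return sum(binomial_likelihood(t, k, n) for t in draws) / num_draws

k, n = 7, 10                          # hypothetical data
rng = random.Random(0)                # fixed seed for reproducibility
m1_hat = marginal_likelihood_is(k, n, 20_000, rng)
m1_exact = 1.0 / (n + 1)              # closed form under the uniform prior
m0 = binomial_likelihood(0.5, k, n)   # marginal likelihood of the point null
bayes_factor_10 = m1_hat / m0         # evidence for the alternative over the null
print(m1_hat, m1_exact, bayes_factor_10)
```

The closed-form value is available only because the toy model is conjugate; it is included here as a check on the Monte Carlo estimate.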

2
The second lecture makes the link from within-model techniques to cross-model techniques, with a foray inside reversible techniques. It will also cover the functional identities of Chib and of Dickey-Savage.

3
The third lecture considers a fundamentally different approach to model choice in a setting where the likelihood is not available, such as Gibbs random fields. It focuses on the ABC (Approximate Bayesian Computation) methodology introduced a few years ago in genomics for the analysis of phylogenetic trees. Fundamental properties of the ABC method are recalled, along with an illustration in the case of the Potts model for the selection of a neighbourhood structure.
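The general idea of ABC for model choice can be sketched with a rejection sampler on a toy problem. This is not the Potts model example from the lecture: it assumes two hypothetical coin-tossing models (a fair coin versus an unknown bias with a uniform prior), hypothetical data, and the number of heads as the summary statistic. The model index is drawn from its prior together with the parameter, data are simulated, and a draw is kept only if the simulated summary matches the observed one; the share of kept draws from each model estimates its posterior probability.

```python
import math
import random

def simulate_heads(theta, n, rng):
    """Simulate n coin tosses with success probability theta; return the head count."""
    return sum(1 for _ in range(n) if rng.random() < theta)

def abc_model_choice(k_obs, n, num_sims, rng):
    """Rejection ABC over (model index, parameter).
    Returns the estimated posterior probability of model 1."""
    accepted = [0, 0]
    for _ in range(num_sims):
        model = rng.randrange(2)                      # uniform prior on the model index
        theta = 0.5 if model == 0 else rng.random()   # model 0: fair coin; model 1: uniform prior
        if simulate_heads(theta, n, rng) == k_obs:    # exact match, i.e. tolerance 0
            accepted[model] += 1
    total = sum(accepted)
    return accepted[1] / total if total else float("nan")

k_obs, n = 7, 10                       # hypothetical data: 7 heads in 10 tosses
rng = random.Random(1)                 # fixed seed for reproducibility
p1_hat = abc_model_choice(k_obs, n, 50_000, rng)

# Exact posterior probability of model 1, available only in this toy case.
m0 = math.comb(n, k_obs) * 0.5**n
m1 = 1.0 / (n + 1)
p1_exact = m1 / (m0 + m1)
print(p1_hat, p1_exact)
```

In this toy case the likelihood is known, so the ABC estimate can be compared with the exact answer. In the likelihood-free settings the lecture addresses, no such check is available and the choice of summary statistic and tolerance matters much more.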

Slides are available in a preliminary version on http://www.slideshare.net/xianblog/maxent-2009-talk


Joseph Tadjuidje Kamgaing

Some asymptotics for hidden Markov mixture of autoregressive processes with ARCH components

In this paper a class of generalized hidden Markov mixtures of nonlinear and nonparametric autoregressive processes with ARCH components is introduced. For illustration, we present stability conditions that allow for mixtures of stationary and nonstationary processes, e.g., causal and noncausal AR processes, and derive the asymptotic behavior of the maximum likelihood estimator for the mixture of AR(1)-ARCH(1) processes with Gaussian residuals. Finally, some numerical results are presented.


Mehrdad Niaparast

Quasi-Likelihood and the Optimal Designs for the Poisson regression Models with Random Intercept

Most of the research on optimal designs focuses on Linear Models (LMs) and Linear Mixed Models (LMMs). There exist only a few results on Generalized Linear Mixed Models (GLMMs). In the latter case, due to the random effects, inference based on the likelihood function is quite intractable. In this talk we consider two special cases of GLMMs, which we call the Simple Poisson Regression Model with Random Intercept (SPRMRI) and the Quadratic Poisson Regression Model with Random Intercept (QPRMRI). We also consider a quasi-likelihood approach as an alternative (approximate) method to estimate the unknown parameters of the models. We present some points of comparison between the results on optimal designs for these models and the corresponding results on optimal designs for LMs.


Uwe Saint-Mont

Modelling in Context

The talk compares the four paradigms of modeling in statistics. It draws a clear distinction between the procedure- and the model-driven approaches. Models are classified, and standard criticism directed towards them is considered. It turns out that the concepts of truth and information are crucial, giving the whole project direction and embedding it in the "circle of research".


Maik Schwarz

Adaptive circular deconvolution by model selection under unknown error distribution


Maya Shevlyakova

Forward Stagewise variable selection for the high-dimensional Cox's model

Traditional approaches to the fitting of proportional hazard models do not apply in the setting of high-dimensional data. We adapt the idea of forward stagewise variable selection from the linear models and apply it to Cox's model. The main interest is to study the limits of partial likelihood, its cross-validation performance and its power in variable selection.


Birgit Witte

Smooth plug-in inverse estimators in the current status continuous mark model

We consider the problem of estimating the joint distribution function of the event time and a continuous mark variable when the event time is subject to interval censoring and the continuous mark variable is only observed if the event occurred before the time of inspection.

The nonparametric maximum likelihood estimator in this model is inconsistent. We study two alternative smooth estimators, based on the relation between the distribution function of interest and the density of the observable vector. We derive the pointwise asymptotic distribution of both estimators.


Last updated 5 October 2009
by LC
 

    P.O. Box 513, 5600 MB  Eindhoven, The Netherlands
tel. +31 40 2478100  fax +31 40 2478190  
  e-mail: office@eurandom.tue.nl