Stochastics - Theoretical and Applied Research


Cluster Plan

STAR -- Stochastics - Theoretical and Applied Research
MATHEMATICS CLUSTER STOCHASTICS

1.  Motivation
What is stochastics?
The stochastics cluster STAR
What research topics is the stochastics cluster targeting?

2.  Participating staff

3.  Structure
3.1  Goals
3.2  Organisation
3.3  Collaboration
3.4  Budget

4.  Education

5.  Science, industry and society

6.  Research projects
6.1  General methodology
6.2  Mathematical statistical physics
6.3  Stochastics and the life sciences
6.4  Stochastic networks
6.5  Stochastic finance and econometrics


1.  Motivation

What is stochastics?

Randomness is key to phenomena as diverse as phase transitions in polymer chains, resource demands in computer networks, and data variation in micro-array experiments. Stochastics is the science of randomness. It is a branch of mathematics that builds general tools and theories that enable us to understand, predict and often control the numerous phenomena that are subject to chance.

Stochastics is a multidisciplinary science that takes its motivation from a wide range of scientific fields and industrial applications. Examples are: aging in disordered materials, fluctuations of interest rates, congestion in telephone lines or road traffic, the effect of air pollution on health, instability in logistic processes, climate change, and genetic determinants of diseases. Mathematical abstraction allows us to extract a common denominator from the chance mechanisms inherent in such phenomena. The building of a unified mathematical body of concepts and theories relating to randomness has turned out to be extremely useful: from constructing, understanding and analysing to fitting and optimising.

In the rapid technological advancement of the past few decades, complexity features as a keyword alongside randomness. Technology enables us to study in ever finer detail the various processes occurring in nature, and to build sophisticated instruments to monitor and influence these processes. In coping with complexity, mathematics – alongside the natural and life sciences – plays a crucial role. As part of mathematics, stochastics is particularly well equipped to model, analyse and optimise complex systems, be it because such systems are intrinsically random or because a probabilistic description captures their essential features.

Stochastics encompasses the areas of probability theory, statistics and stochastic operations research. Probability theory builds up the mathematical framework to describe and interpret complex and random systems, statistics provides the methods and tools to properly handle and interpret the data drawn from these systems, while stochastic operations research offers ways to optimize and control their performance. These capabilities make stochastics an essential enabling technology. Conversely, the dynamic developments in the various areas present a challenging research agenda for stochastics. Very often, it is the fundamental work that leads to the deepest insight and the broadest range of applicability.

The stochastics cluster STAR

Stochastics is important both as an enabling technology and as a scientific discipline. Stochastics is presently flourishing internationally. The Dutch stochastics community is strong and thriving, as is evidenced by its high reputation, visibility and level of activity, both nationally and internationally. Over the years, it has built up a broad spectrum of active working relations with researchers from physics, biology, medicine, economics and industry. In addition, it has begun to coordinate its MSc and PhD educational programs at the regional and national levels. It aims to contribute to society, while engaging in stochastics as an important and exciting field of scientific research.

The establishment, in 1998, of the internationally oriented research institute Eurandom has given a boost to Dutch stochastics. Its workshop and visitor programs have drawn the best researchers in the field worldwide to The Netherlands, and its postdoc program has attracted a large number of highly talented postdocs from abroad – quite a few of whom have subsequently accepted a tenured position at a Dutch university or company.

In a number of areas Dutch stochastics is of internationally recognized excellence. However, Dutch stochastics is relatively small and its reputation hinges upon a small number of senior researchers. To keep the momentum, it is essential to offer a stimulating and attractive research environment to a new generation of talented researchers. In a few fields of vital importance Dutch stochastics is under-represented and a strong impetus is needed. Examples are biostatistics and stochastic finance. These are fields that, internationally, are going through a period of feverish activity and generate a large demand for well-trained probabilists and statisticians who can contribute to the ensuing application areas.

The stochastics cluster STAR, with the European research institute Eurandom in a coordinating role, pushes Dutch stochastics further to the international forefront and allows the Dutch stochastics community to attract and train the most talented students and young researchers worldwide. It gives a much needed impulse to under-represented fields and strengthens the existing leading position in well-represented fields. The cluster leads to more critical mass in stochastics, and further stimulates the interaction of researchers from probability theory, statistics and stochastic operations research with researchers from other disciplines. The cluster plan is ambitious, but it should be realized that similarly ambitious plans are being developed in places like Berlin, Paris, Zürich, Berkeley and Vancouver. For example, the University of British Columbia in Vancouver has 5 full professors in probability and aims to have 10. It aims to establish “the world’s leading center for research and graduate training in stochastic science”, bundling forces with the Universities of Victoria and Washington, the Microsoft Theory Group in Seattle, and the Pacific Institute for the Mathematical Sciences. This is spurred by the belief that “we see the dawning of the age of stochasticity in every aspect of basic science and its applications, affecting virtually all of science in this century.” The best way in which The Netherlands can continue to collaborate on an equal footing with the leading centers in the world, and be able to keep its most talented probabilists and statisticians, is to join forces in a stochastics cluster.

What research topics is the stochastics cluster targeting?

There is great potential for a comprehensive activity in a few well-chosen areas of stochastics. The cluster aims for a coordinated research effort on the following five topics:

(1) General methodology
(2) Mathematical statistical physics
(3) Stochastics and the life sciences
(4) Stochastic networks
(5) Stochastic finance and econometrics

The remainder of this text is organized as follows. In Section 2 we list the participating staff. In Section 3 we describe the structure of the cluster, specify the main goals, and sketch how we intend to reach them. Educational aspects are discussed in Section 4. The contribution to industry and society at large is the subject of Section 5. In Section 6 we add a list of research projects that will be addressed by the cluster. These projects are grouped under the five topics mentioned above and represent the best the cluster has to offer in terms of importance, viability and strength.

2.  Participating staff

The cluster has Eurandom as its central, coordinating node. Apart from that, we have not put the emphasis on institutes as research nodes, but rather on leading senior researchers in stochastics in The Netherlands. Below is a (non-exhaustive) list of key researchers, with their first affiliation and, where present, a second affiliation (several of them are advisors at Eurandom or CWI, or part-time professors at another university), who are already in some way involved in the cluster activities.

• I.J.B.F. Adan (TU/e + Eurandom + UvA)
• J. van den Berg (CWI + VU)
• S.C. Borst (TU/e + Eurandom)
• R.J. Boucherie (UT + Eurandom)
• O.J. Boxma (TU/e + Eurandom)
• F. Camia (VU)
• F.M. Dekking (TUD)
• J. Einmahl (UvT)
• A. van Enter (RUG)
• A.J. van Es
• R. Fernandez (UU)
• R.D. Gill (UL)
• A. Gnedin (UU)
• M.C.M. de Gunst (VU + Eurandom)
• R. van der Hofstad (TU/e + Eurandom)
• W.Th.F. den Hollander (UL + Eurandom)
• G. Hooghiemstra (TUD)
• G. Jongbloed (TUD + Eurandom)
• C.A.J. Klaassen (UvA + Eurandom)
• G.M. Koole (VU)
• C. Külske (RUG)
• R. Laeven (UvA + Eurandom)
• J.S.H. van Leeuwaarden (TU/e + Eurandom)
• M.C. van Lieshout (CWI + Eurandom)
• M.R.H. Mandjes (UvA + Eurandom)
• R.W.J. Meester (VU)
• R.D. van der Mei (CWI + VU)
• J. van Neerven (TUD)
• R. Núñez-Queija (CWI + UvA)
• F. Redig (TUD + CWI)
• M. Schröder (VU)
• V. Sidoravicius (CWI + UL + Eurandom)
• P.J.C. Spreij (UvA)
• A.W. van der Vaart (UL)
• M. Vlasiou (TU/e + Eurandom)
• M.A. van de Wiel (VU)
• J.H. van Zanten (UvA)
• A.P. Zwart (CWI + VU + Eurandom)

Researchers appointed on STAR grants 2010/2011:

• A.C. Fey
• W. Ruszell
• B.T. Szabo
• J.P. Dorsman
• M.R. Schauer
• M. Heydenreich
• B. Ros

Although this list is restricted to researchers who presently work in The Netherlands, we emphasize that the cluster will be highly internationally oriented. All the above-mentioned researchers have active collaborations with researchers abroad, and many of them participate in international networks. For example, the European research institute Eurandom presently participates in a European Network of Excellence, in a project of the European Investment Bank, in an FP7 project, in a Marie Curie project, and in a large bilateral Dutch-German research program funded by NWO and DFG; most of these projects are executed with postdocs who come from all over the world, and similar remarks can be made for the other involved research groups.

3.  Structure

3.1  Goals

The goals of the cluster are:

• To further strengthen the quality of Dutch stochastics research, including under-represented areas like biostatistics and stochastic finance; to enhance its coherence; and to increase its visibility (see the remainder of Section 3).
• To have a strong impact on the education of stochastics at the level of MSc and PhD
students (see Section 4).
• To make a major contribution to the analysis and optimisation of complex and random systems, arising in science, industry and society at large (see Section 5).
• To do top-level research in a number of key areas of stochastics (see Section 6).

3.2  Organisation

Eurandom acts as coordinating and facilitating node of the cluster. Eurandom is a research institute in the area of stochastics and its applications, located in Eindhoven. It has no tenured research staff, being predominantly a postdoc institute, with 20 postdocs (coming from all over the world) and 5-7 PhD students in temporary appointments. Eurandom has been operational since 1998, and has rapidly built up a very strong reputation as an institute with “an extremely stimulating research environment, a dynamic and high-quality research program, a strong visitor program and a very extensive lecture and workshop program”, according to a review by an international panel in 2005. The stochastics cluster will enable Eurandom to maintain and strengthen its role as a stochastics facility that organizes scientific meetings, attracts leading researchers to The Netherlands, and facilitates teaching at the MSc and PhD level.

As mentioned above, all the leading Dutch senior researchers in stochastics participate in the cluster. Some of the institutes with which they are affiliated will appoint a young researcher at the assistant professorship level, guaranteeing continuation of the funding of these positions when the term of the cluster has ended. There are several extremely talented young Dutch researchers in stochastics presently working abroad. The cluster will offer an excellent opportunity to bring some of them back to The Netherlands.

Eurandom and CWI will use part of the cluster money for hiring postdocs, as an effective and flexible way of providing a stimulus to Dutch stochastics. Experience has taught that:
(i) postdoc positions attract talented researchers from abroad, who often stay in The Netherlands;
(ii) a postdoc period, without substantial teaching obligations, is a crucial step forward in the career of a promising young researcher.

The (financial) administration of the cluster will be placed at Eurandom. The overall direction and research quality of the cluster will be supervised by a Scientific Committee, which will initially consist of the following persons:

O.J. Boxma (TU/e + Eurandom)
W.Th.F. den Hollander (UL + Eurandom)
R.D. van der Mei (VU + CWI)
C.A.J. Klaassen (UvA + Eurandom)
A.W. van der Vaart (UL + Eurandom)

Each of the five research topics, described in Sections 6.1–6.5, will have a coordinator or coordinating team, initially:

(1) R.D. Gill (UL)
(2) R.W. van der Hofstad (TU/e + Scientific Director Eurandom)
(3) M.C.M. de Gunst (VU + Eurandom)
(4) M.R.H. Mandjes (UvA + Eurandom)
(5) P.J.C. Spreij (UvA), M. Schröder (VU)

The Educational Committee is responsible for the coordination of the educational activities in the MSc and PhD programs. It coordinates the MSc program activities with the directors of education at the participating institutes, the national “regie-orgaan” of Master Math, regional cooperations such as Stochastics and Financial Mathematics (S&FM), and the Dutch graduate network in Operations Research (LNMB). For the courses at the PhD level, it coordinates with LNMB and with the aio-network in stochastics. The initial committee presently consists of:

R.J. Boucherie (UT + Eurandom)
F. Redig (TUD + CWI)
J.H. van Zanten (UvA + Eurandom)

3.3  Collaboration

For each of the five topics there is a more or less fixed day on which the participants meet and run a joint seminar. These meetings mainly take place at Eurandom and in Amsterdam. Office space will be made available at the institutes hosting the seminars to accommodate the weekly visitors. For each of the five topics there will further be at least one workshop per year, including “user days” with participants from industry, science and society at large. There will also be national events, including the annual Lunteren conferences, which already have a strong tradition in The Netherlands.

3.4  Budget

For the two-year period 2010-2011, STAR has received 750K Euro per year from NWO, which was allocated to 11 projects in 2010. These projects are shown in Table 1; the financial breakdown for 2010-2011 is as follows:

STAR budget in 2010-2011, in k€ per year

  Tenure track              4 x 67.5    270
  Postdoc                   2 x 60      120
  PhD                       5 x 50      250
  Advisor                                10
  Workshops, visitors                    60
  Outreach, administration               40
  Total                                 750

The plans encompass investment in permanent positions, and funding for workshops, visitors, seminars, exchanges, and administration. The total costs are 1480 K€ per year. The financial breakdown is shown below.

Requested STAR budget from 2012, in k€ per year

  Tenure track                  8 x 70      560
  Full professor finance        100         100
  Tenure track finance          2 x 70      140
  Postdoc                       6 x 60      360
  Workshops                                 150
  Special months and exchanges               60
  Visitors                                   50
  Seminars                                   30
  Outreach, administration                   30
  Total                                    1480

4.  Education

The cluster will take a leading role in the teaching of stochastics at the level of MSc and PhD students. A coherent program of courses will be offered, where possible making use of successful existing programs. In particular, the cluster will become a partner in the following existing activities:

• The Dutch graduate network of Operations Research (LNMB) offers an extensive and coherent program in the area of stochastic operations research at the MSc and PhD level: 7 MSc courses (with exams) are being offered by LNMB each year, while 18 PhD courses (with extensive homework exercises and assignments) are presented in a biennial cycle. (See http://www.math.leidenuniv.nl/~lnmb/) These programs have given a strong impetus to the field of Operations Research, providing better training to students and strengthening interactions between staff and students, as well as among students.
• The master program Stochastics and Financial Mathematics (S&FM), currently run by the VU, UvA, UU and UL, offers a broad master program in stochastics and its applications, including finance and the life sciences. (See http://www.math.vu.nl/sto/onderwijs/sfm/) The program coordinates the master courses of the participating universities (some 15-20 courses every year), which include both basic master courses and more specialized courses, that can be viewed as being partially at the PhD level.
• MasterMath is the national cooperation in mathematics education; it contains several stochastics modules, including the 3TU program. (See http://www.mastermath.nl/)

These activities will not come completely under the cluster, because they include areas outside stochastics (the LNMB also covers deterministic operations research, while the MasterMath is concerned with mathematics in general) and because they are partly of a regional character. However, the cluster will take responsibility and put these activities in a wider perspective:

• The cluster will suggest the stochastics core curriculum and its lecturers to the MasterMath ‘regie-orgaan’.
• The cluster will coordinate regional and local programs with the aims of increasing the efficiency of stochastics education and ensuring that all key subjects in stochastics are taught on a regular basis in The Netherlands. For the near future the cluster will, for instance, strive for an increase in the number of courses in the area of stochastics and the life sciences.
• The cluster will organize minicourses and workshops, and other special activities.
• The cluster will advertise The Netherlands as an excellent place for studies in stochastics.

Eurandom will act as facilitating node, giving administrative support. National courses will be mainly given in Amsterdam.

Eurandom has organized many minicourses and tutorials in the past years, and will continue this tradition in the coming years. The lecturers were, among others, the Eurandom chairs (leading researchers receiving an appointment as visiting professor at Eurandom) and selected Stieltjes professors (appointed by the research school Thomas Stieltjes Institute of Mathematics).

Eurandom is hosting a series of “Young European Probabilists (YEP)” workshops. These workshops focus on a single topic, and are organized for and by young researchers. With the exception of two keynote speakers, the participants are researchers at a stage either shortly before or shortly after their PhD. The YEP workshops have been highly successful, and have drawn many talented young researchers to The Netherlands. Recently, Eurandom has also organized workshops with a similar scope in statistics and stochastic operations research. The cluster will enable Eurandom to continue and extend this fruitful initiative.

Minicourses for PhD students are also organized by the aio-network Stochastics in their yearly “Hilversum” spring meetings, which have the purpose of strengthening the communication between the PhD students in stochastics at the various Dutch universities (http://www.math.vu.nl/~stochgrp/aionetwerk/). The activities of the aio-network will be brought under the responsibility of the cluster.

Minicourses in finance, for PhD students, researchers and people in industry, are given at the yearly Winter school in finance in Lunteren (See http://staff.science.uva.nl/~spreij/stieltjes/winterschool.html).

5.  Science, industry and society

In Section 1 many examples were mentioned of situations in which probabilistic modelling is important. Thus it is not surprising that many researchers in the stochastics cluster have ties with researchers in other sciences, industry, government research institutes, or society at large. These and new ties will be actively pursued within the cluster. The cluster as a whole will function as an access point for expertise in the broad area of stochastics, and also as a training center in stochastic modelling for a new generation of researchers. In Section 1 we listed the five topics on which the cluster will focus its attention. As argued in that same section, stochastics takes its motivation from a wide range of scientific fields, and once stochastic models and methods have been developed for one field they are often applicable to other fields as well. Hence we shall keep an open eye for interesting problems in fields like astronomy, chemistry and materials science (with, a.o., fascinating problems regarding stochastic geometry), social sciences and law, even if these fields do not feature prominently in the list of research projects in Section 6.

The cluster received support letters from the following persons/organisations:

1. Prof.dr.ir. G.J. van Oortmerssen - Director of TNO-ICT
2. M.A. van den Brink - Executive Vice-President, ASML
3. The Scientific Council of Eurandom

Several more letters will be provided later on.

To illustrate the wide range of applications in which the cluster is involved, an (incomplete) list of current projects that have already led to output in the form of joint publications or implementation of methodology is given below.

• Ion channel kinetics. B. van Duijn, Fytagoras Plant Science BV, Leiden
• Neuroscience, A.B. Brussaard, A.B. Smit, Department of Biology, VU, J. Verhaagen, Department of Neuroregeneration, Netherlands Institute for Neuroscience
• Carcinogenesis, E.G. Luebeck, Fred Hutchinson Cancer Research Center, Seattle
• Genomics, Proteomics, B. Ylstra, Faculty of Biology, VU, G. Meier, VU Medical Centre, E. Marchiori, Dept. of Computer Science. A.B. Smit, Department of Biology, VU, C. Jimenez, VU Medical Centre
• Statistical Genetics, Biological Psychology, D.I. Boomsma, Faculty of Psychology, VU, P. van Dommelen, TNO-Leiden, P. Heutink, VU Medical Centre
• Medical Imaging (PET, MEG), R. Boellaard, A.A. Lammertsma, C. Stam, VU Medical Centre
• Detecting effects of attachment therapy to disabled children, C.G.C. Janssen and C. Schuengel, Dept. of Special Education, VU
• Epidemiology, Biostatistics, J. Robins, Harvard School of Public Health
• Infectious animal diseases, H. Heesterbeek, Department Animal Medicine, UU, G. Boender, D. Klinkenberg, M. de Jong, IDDLO-Lelystad
• Forensic science, M. Sjerps, Forensic Institute, Rijswijk
• Batch-quality of horticultural products, O. van Kooten, L.L.M. Tijskens, Horticultural Production Chains Group, Wageningen University
• Statistical process control, R.J.M.M. Does, IBIS UvA BV
• Risk management, A. Lucas, Faculty of Economy and Business Sciences, VU
• Wireless networks, various projects, a.o. with: M. Cook, J. Bruck, Dept Electrical Engineering, M. de Graaf (Thales), Ph. Whiting, P. Gupta (Lucent Technologies, Bell Labs, Murray Hill); with J. Wieland (Vodafone); with J.L. van den Berg, R. Litjens et al. (TNO-ICT)
• Wired networks, Surfnet, TNO-ICT, WorldCom, Lucent Technologies
• GRID networking, H. Bal, TH. Kielmann, Dept. Computer Science, VU
• Performance evaluation of ad-hoc networks, J.L. van den Berg, TNO-ICT
• Performance analysis of computer-communication networks, Projects on (i) Cable access networks, (ii) Networks-on-Chips, (iii) Mesh networks. T.J.J. Denteneer, A.J.E.M. Janssen, V. Pronk et al., Philips Research Laboratories

6.  Research projects

6.1  General methodology

The title of this subsection, uninformative as it may be, reflects the fact that an important part of stochastics research is directed at the development and understanding of models and methods that are “universally” applicable. For instance, it is clear that the application of methods from Bayesian statistics has increased dramatically in the past decade, in almost any area where statistics is important, including the other four research topics below. However, there is still a large gap in the understanding of the accuracy of these methods. The outcome of the following projects will be important in a broader context.

1. Bayesian semiparametrics

Bayesian methods in statistics go back to Bayes in the 18th century. They express prior beliefs about a situation in terms of a probability distribution, and next update this distribution using empirical evidence concerning the situation under study. These methods were initially advocated by subjectivist Bayesians, who stressed the subjective nature of prior beliefs, but have increasingly been adopted by objective statisticians, who use the Bayesian methods within a classical statistical set-up. A wide variety of Bayesian procedures can now be implemented using computer simulation (e.g. “MCMC”), and prior distributions can be used to model complicated structures arising in a variety of applications (e.g. classification, curve-fitting, covariate modelling, network modelling). In the past decade the emphasis has been on developing and studying the algorithms used to implement Bayesian procedures, using a variety of priors. Although it is known that in an objective sense many (or even most) prior distributions give adverse results, inferior to non-Bayesian methods, relatively little research has been carried out to study the performance of Bayesian methods. Particularly in the complex settings where prior modelling is thought to be of help, deeper insight into the effect of choosing a particular prior is necessary. We intend to study Bayesian procedures for priors on complex structures, loosely speaking semiparametric models, or models of high dimension. Priors for functions may be constructed using models for stochastic processes, such as Gaussian or Lévy processes. For priors on large discrete structures, such as graphical networks or pedigrees, it is necessary to develop appropriate asymptotic methods to study the quality of posteriors.
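
The prior-to-posterior updating described above can be made concrete with the simplest conjugate example. The sketch below (with illustrative numbers, not taken from the text) updates a Beta prior on a success probability with binomial data; for the process priors discussed in this project the posterior is not available in closed form, and one resorts to simulation (MCMC) instead.

```python
# Minimal illustration of the Bayesian update: a Beta(a, b) prior on a
# success probability p, updated with binomial data.  Conjugacy makes the
# posterior available in closed form here; the semiparametric priors in
# the text (Gaussian or Levy process priors) require simulation instead.

def beta_binomial_update(a, b, successes, failures):
    """Return the parameters of the Beta posterior."""
    return a + successes, b + failures

def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)

# Prior belief: p is around 0.5 (Beta(2, 2)); data: 70 successes in 100 trials.
a_post, b_post = beta_binomial_update(2, 2, 70, 30)
print(beta_mean(2, 2))                       # prior mean: 0.5
print(round(beta_mean(a_post, b_post), 3))   # posterior mean: 0.692, pulled toward the data
```

The posterior mean is a weighted average of the prior mean and the empirical frequency; the weight of the prior shrinks as data accumulate, which is the behaviour whose analogue in infinite-dimensional models the project aims to understand.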

2. High-dimensional models

In the 1980s and 1990s a new branch of statistical modelling was developed, enabling the application of statistical techniques in situations where little a priori knowledge of the relationships between the various variables is available. These techniques have been adopted in diverse areas of application. One may think of the success or failure of a medical treatment as a function of background variables of patients. Often many covariates (e.g. age, sex, weight, blood pressure, medical history, genetic factors) are thought to influence the outcome, but little is known about the exact numerical relationships. One may also think of an economic variable such as unemployment, or the effect of counseling on unemployment, as a function of person characteristics or other economic variables. One may think of nonresponse in an interview conducted by the CBS (Central Bureau of Statistics of The Netherlands). As a fourth example one may think of the effect of air pollution on health as a function of environmental and population data.

2a. Very high-dimensional models

To measure the causal effect of a treatment or condition (e.g. air pollution) using observational data it is necessary to include all variables that could influence the treatment in the analysis. Semiparametric models as developed in the past allow this, and are much more flexible than the parametric models that were the standard before, but in many ways they still make many assumptions concerning the structure of the data (linearities, additivity, low dimensionality, homogeneity, smoothness, etc.). We wish to investigate methods that use still larger models, which can better fit reality. The conclusions that can be drawn using such larger models will have a larger uncertainty margin (technically: wider confidence intervals), which is unpleasant on the one hand, but on the other hand provides more honest indications of uncertainty given the available data. Many controversial applications of statistics center around a proper quantification of uncertainty. For instance, it is simply not that easy to estimate the effect of air pollution on health.

2b. Shape-constrained methods

Shape constraints show up very naturally in different areas of application, e.g. in earth sciences, medical imaging and survival analysis. Often it is possible to fit useful models that only impose these shape constraints. We aim at developing deeper understanding of shape-constrained methods, conceptually, computationally and asymptotically. We also wish to popularize shape-constrained methods in diverse areas of application.
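
As a concrete instance of a shape-constrained method, the sketch below implements the classical pool-adjacent-violators algorithm for isotonic regression: the least-squares fit under the sole constraint that the fitted values are non-decreasing. The function name and input data are illustrative, not drawn from the projects above.

```python
def pava(y, w=None):
    """Pool-adjacent-violators: least-squares fit of y under the
    constraint that the fitted values are non-decreasing."""
    n = len(y)
    w = [1.0] * n if w is None else list(w)
    # Each block stores (weighted mean, total weight, number of points).
    blocks = []
    for yi, wi in zip(y, w):
        blocks.append([yi, wi, 1])
        # Merge adjacent blocks while the monotonicity constraint is violated.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, n2 = blocks.pop()
            m1, w1, n1 = blocks.pop()
            wt = w1 + w2
            blocks.append([(w1 * m1 + w2 * m2) / wt, wt, n1 + n2])
    fit = []
    for m, _, length in blocks:
        fit.extend([m] * length)
    return fit

print(pava([1, 3, 2, 4]))  # -> [1, 2.5, 2.5, 4]: the violating pair is pooled
```

Note that no smoothness or parametric form is imposed; the shape constraint alone determines the estimator, which is what makes the asymptotics of such methods non-standard.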

2c. Regularization

Many of the theoretical results for infinite-dimensional parameter spaces take the form of showing that a certain estimator is pointwise consistent or has a certain pointwise rate of convergence. For finite sample sizes, however large, there will often be a large (infinite) subset of the parameter space whose models are consistent with the data. Depending on the problem under consideration, some form of regularization will be necessary. If, for example, the model is a semiparametric translation model where the density is to be estimated, it makes sense to minimize the Fisher information within the class of models consistent with the data. Without this, any confidence interval based on the calculated density may be wildly optimistic. None of the standard methods of estimation is concerned with this form of global regularization subject to a data fit. The aim of the project is to develop such methods for some standard problems in non- and semiparametrics.

2d. Sparsity

An aspect of high dimensional models that can often be used to construct successful predictors is sparsity. A model is called sparse if the parameter space is high dimensional in principle, but the number of important parameters is known to be small. Sparsity based methods automatically select the important parameters from the large set of potential parameters and give meaning to estimation problems that used to be classified as unidentifiable, e.g. because the number of parameters was larger than the number of observed data points. A highly relevant application of sparse modelling is the analysis of microarray data in statistical genetics.
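
One standard sparsity-based method (mentioned here as an illustration, not as the project's chosen technique) is the lasso. In the special case of an orthonormal design it reduces to soft-thresholding of the least-squares coefficients, which makes the automatic variable selection visible in a few lines; the coefficients and threshold below are made up for illustration.

```python
# Soft-thresholding: the closed-form lasso solution for an orthonormal
# design.  The l1 penalty sets small coefficients exactly to zero, which
# is what performs the automatic variable selection described in the text.

def soft_threshold(x, lam):
    """Shrink x toward zero by lam; set it to zero if |x| <= lam."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0

# Illustrative "least-squares" coefficients: two strong signals among noise.
beta_ls = [5.0, -0.2, 0.1, 3.5, -0.05, 0.3]
beta_lasso = [soft_threshold(b, lam=0.5) for b in beta_ls]
print(beta_lasso)   # -> [4.5, 0.0, 0.0, 3.0, 0.0, 0.0]
selected = [i for i, b in enumerate(beta_lasso) if b != 0.0]
print(selected)     # -> [0, 3]: only the important parameters survive
```

With p parameters and fewer than p observations the least-squares problem is unidentifiable, but if only a few coefficients are truly nonzero, thresholding of this kind recovers them; this is the mechanism behind sparse analysis of microarray data.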

3. Statistics of extremes

Over the last two decades considerable progress has been made in the statistics of extremes. This statistical theory is based on extreme value theory, the probability theory of extremes. A prominent example is the estimation of quantiles that are so large that they lie on the boundary of, or outside, the range of the dataset. The statistics of extremes is one of the subfields of mathematical statistics with the nice feature that the obtained theoretical results are applied immediately in various fields, while applications often trigger theoretical developments. Most of the existing theory deals with independent, identically distributed, one-dimensional random variables, and in case the data are multivariate, the results are typically only of use in dimension 2 or 3. It is challenging and very useful to extend the present theory to more complex settings. One extension is to deal with high-dimensional data or even stochastic processes. Another extension considers time series instead of independent data. A third extension replaces the identical distributions by distributions in which covariates play a role: the regression setting. Although some results have been obtained in all these directions, the majority of problems still have to be addressed.
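
A standard tool from this field is the Hill estimator of the tail index, combined with a Weissman-type extrapolation to quantiles beyond the range of the data. The sketch below (synthetic Pareto data and an illustrative choice of k, not from the text) shows the idea for the one-dimensional i.i.d. case that the project takes as its starting point.

```python
import math
import random

def hill_estimator(data, k):
    """Hill estimator of the tail index, based on the k largest observations."""
    xs = sorted(data, reverse=True)
    return sum(math.log(xs[i] / xs[k]) for i in range(k)) / k

def extreme_quantile(data, k, p):
    """Weissman-type estimate of the (1 - p)-quantile, possibly beyond the sample."""
    n = len(data)
    xs = sorted(data, reverse=True)
    gamma = hill_estimator(data, k)
    return xs[k] * (k / (n * p)) ** gamma

random.seed(0)
# Pareto(alpha = 2) sample via inversion: true tail index gamma = 1/alpha = 0.5.
sample = [random.random() ** (-1 / 2) for _ in range(10000)]
gamma_hat = hill_estimator(sample, k=200)
print(round(gamma_hat, 2))                       # close to the true value 0.5
print(extreme_quantile(sample, k=200, p=1e-4))   # a quantile at the edge of the data
```

The extrapolation is driven entirely by the estimated tail index, which is why understanding the accuracy of such estimators (and their extensions to dependent, high-dimensional or covariate-driven data) matters so much.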

4. Model choice

Most paradigms of model choice allow only comparisons between different models. They are based on a single-valued fidelity measure and a single-valued penalty term which measures the complexity of the model. Within this paradigm it is not possible to say that a particular model fits, or does not fit, the data without reference to other models. One essential ingredient of scientific practice and the main motor of scientific advance, that of not being satisfied with any of the models on offer, is therefore missing. The question whether the model fits the data is often reduced to one of diagnostics, although it is clear that this is a much too simplistic and dismissive attitude to the problem. The aim of the project is to develop direct measures of fit which are multivalued and give grounds for accepting or rejecting a model on its own merits. This has to be complemented by measures of simplicity, which again may be multivalued, with the intention of calculating one or more simplest models which provide a satisfactory fit to the data, in so far as one such exists.
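
The paradigm being criticised, a single fidelity measure plus a single complexity penalty, can be made concrete with a toy AIC comparison of two Gaussian models (the data below are illustrative). Note what the example shows: the comparison ranks the two models against each other, but says nothing about whether either one actually fits the data, which is precisely the gap the project targets.

```python
import math

def gaussian_loglik(data, mu, sigma2):
    """Log-likelihood of i.i.d. N(mu, sigma2) data."""
    n = len(data)
    rss = sum((x - mu) ** 2 for x in data)
    return -0.5 * n * math.log(2 * math.pi * sigma2) - rss / (2 * sigma2)

def aic(loglik, n_params):
    # Single fidelity term (log-likelihood) plus single penalty (parameter count).
    return -2 * loglik + 2 * n_params

data = [0.1, 0.4, -0.3, 0.2, 0.0, 0.5, -0.1, 0.3]
n = len(data)
mean = sum(data) / n

# Model A: mean fixed at 0; Model B: fitted mean.  Variance fitted in both.
var_a = sum(x ** 2 for x in data) / n
var_b = sum((x - mean) ** 2 for x in data) / n
aic_a = aic(gaussian_loglik(data, 0.0, var_a), n_params=1)
aic_b = aic(gaussian_loglik(data, mean, var_b), n_params=2)
print(round(aic_a, 2), round(aic_b, 2))  # the smaller AIC is "selected" --
                                         # but neither number says the winner fits
```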

5. Multi-dimensional structured Markov processes

Multi-dimensional structured Markov processes (MSMP) form the most natural model for complex real-world phenomena that exhibit stochastic behaviour. They are widely applied in fields as diverse as the physical sciences, biology, engineering, computer-communications and logistics. The adjective “structured” has been added to emphasize that, in nearly every application, Markov processes possess structural properties, and these properties should be exploited in the analysis. Multi-dimensional Markov processes are much less well understood than their one-dimensional counterparts. New methods have to be developed for obtaining explicit and computable results on the behaviour of MSMP: matrix-analytic methods, methods to obtain bounds, and asymptotic methods for studying rare events in MSMP.

6. Transient behavior of high-dimensional Markov chains

Transient behaviour of stochastic processes represents systems before their relaxation to equilibrium, or systems where equilibrium is either trivial or non-existent. Typical applications where transient behaviour plays a dominant role are the following. In the PageRank problem, pages are ordered according to their current relevance in the WWW graph, which is modelled as a Markov chain with a huge state space. In (wireless) communication systems, dimensioning (capacity allocation) must often be carried out on short time scales, taking into account, e.g., teletraffic bursts during rush hours. Other applications include large population dynamics in biology, neuronal networks, and large flow models such as those for atmospheric flows. In contrast with equilibrium behaviour, little is known about transient behaviour. Short-term behaviour especially is dominated by transient effects, and may be highly sensitive to small perturbations in transition intensities, due for instance to steering of parameters or to uncertainty in system parameters. Besides the obvious theoretical challenge of understanding the transient behaviour of stochastic processes, current and future applications increasingly require a detailed understanding of short-term behaviour. In this respect, it is challenging to integrate techniques from numerical analysis and (partial) differential or difference equations for very large systems into the framework of stochastic processes, explicitly taking into account the algebraic structure of the stochastic process.
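
The PageRank computation mentioned above reduces, in miniature, to the power iteration below; the damping factor 0.85 and the three-node toy graph are illustrative assumptions (real WWW-scale chains require sparse, carefully engineered methods).

```python
import numpy as np

def pagerank(adj, damping=0.85, tol=1e-10):
    """PageRank via power iteration on a small dense adjacency matrix:
    the stationary distribution of a Markov chain that follows a random
    outgoing link with probability `damping` and jumps to a uniformly
    random page otherwise (a toy sketch)."""
    n = adj.shape[0]
    out = adj.sum(axis=1, keepdims=True)
    out[out == 0] = 1                      # guard against dangling nodes
    P = adj / out                          # row-stochastic transition matrix
    r = np.full(n, 1.0 / n)
    while True:
        r_new = (1 - damping) / n + damping * (r @ P)
        if np.abs(r_new - r).sum() < tol:
            return r_new
        r = r_new

# Toy graph: 0 -> 1, 1 -> 2, 2 -> 0 and 2 -> 1
A = np.array([[0, 1, 0],
              [0, 0, 1],
              [1, 1, 0]], dtype=float)
ranks = pagerank(A)
```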

• The key concept that binds all projects is “high dimensionality”. This is another way of referring to the complexity of present-day science, industry and society, mentioned in the introduction of the proposal. There is much information available that can be used to control and understand systems of importance, from biological networks to traffic on the internet or on the road, and new techniques to gather even more information are invented regularly. However, it is often not easy to use all this information in a useful and accurate manner. Modelling in terms of probabilities is a fruitful approach in many situations, but the models must reflect reality in order to lead to understanding or predictions. This requires studying the properties of the models, and developing techniques to tune them to the data. Because the situations are often of a novel nature, “classical” techniques are often not sufficient, even though classical fundamental concepts are always a good starting point.

6.2 Mathematical statistical physics

In physics, spatially extended systems consist of a large number of components that, though interacting only locally, exhibit a long-range global dependence, resulting in anomalous fluctuation phenomena and phase transitions. At the microscopic scale, a random dynamics acts on the components of the system, resulting in an evolution that is typically highly complex. The key challenge is to give a precise mathematical treatment of the interesting physics that arises from this complexity, at the macroscopic scale. Both equilibrium and non-equilibrium behavior are relevant. There is a strong link with stochastic networks, through the study of the statics and the dynamics of random networks of interactions. In addition, many notions and ideas developed in mathematical statistical physics are slowly making their way into mathematical biology, e.g. hierarchical structures, coalescence, universality.

1. Self-organized criticality

Certain classes of physical systems have the property that they evolve naturally – without fine-tuning of parameters such as temperature – into a stationary state that behaves in a critical way, e.g. characterized by power-law decay of correlations. This natural evolution into a critical state is called self-organized criticality (SOC), and has been observed and studied in a wide variety of models. Examples are the Bak-Sneppen model (for evolution of fitness), the Abelian sandpile model (for the motion of grains in a sandpile), forest fires and earthquakes. Models with SOC typically exhibit some form of “avalanches”, i.e., non-local rare events in which a large part of the system is updated. These avalanches are fundamental in order to create the self-organized critical state. The mathematical study of models with such non-local behavior is a challenging new area of interacting particle systems. The two-dimensional Abelian sandpile model represents a major challenge, as it is related to logarithmic conformal field theory and various interesting combinatorial objects, such as spanning trees and oriented circuits.
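
The Abelian sandpile dynamics described above can be sketched directly: sites holding at least four grains topple, sending one grain to each neighbour, and grains toppled over the open boundary are lost. The grid size and the number of grains dropped are illustrative choices.

```python
import numpy as np

def stabilize(grid):
    """Topple an Abelian sandpile on a square grid with open boundary
    until every site holds at most 3 grains.  By the 'Abelian' property,
    the final stable configuration does not depend on the order in which
    unstable sites are toppled."""
    grid = grid.copy()
    n, m = grid.shape
    while True:
        unstable = np.argwhere(grid >= 4)
        if len(unstable) == 0:
            return grid
        for i, j in unstable:
            grid[i, j] -= 4
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ni, nj = i + di, j + dj
                if 0 <= ni < n and 0 <= nj < m:
                    grid[ni, nj] += 1       # grains off the edge are lost

# Drop a pile of 100 grains on the centre of an 11 x 11 grid
g = np.zeros((11, 11), dtype=int)
g[5, 5] = 100
stable = stabilize(g)
```

The resulting avalanche spreads the pile over a neighbourhood of the centre, a small-scale version of the non-local updates discussed above.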

2. Metastability

Metastability is a phenomenon where a physical, chemical or biological system, under the influence of a noisy dynamics, moves between different regions of its state space on different time scales. On short time scales the system is in a quasi-equilibrium within a single region, while on long time scales it undergoes rapid transitions between quasi-equilibria in different regions. Examples of metastability are found in biology (folding of proteins), climatology (effects of global warming), economics (crashes of financial markets), materials science (anomalous relaxation in disordered media) and physics (freezing of supercooled liquids). The task of mathematics is to formulate microscopic models of the relevant underlying dynamics, to prove the occurrence of metastable behavior in these models on macroscopic space-time scales, and to identify the key mechanisms behind the experimentally observed universality in the metastable behavior of whole classes of systems.
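
A toy picture of metastability, assuming an overdamped Langevin dynamics in a double-well potential, can be simulated in a few lines; the potential, noise strength and time horizon are illustrative choices. For small noise the path stays near one well for a long time before hopping to the other, the two-time-scale behaviour described above.

```python
import numpy as np

def double_well_path(eps, T, dt=1e-3, x0=-1.0, seed=42):
    """Euler-Maruyama simulation of dX = -V'(X) dt + sqrt(2*eps) dW with
    the double-well potential V(x) = (x^2 - 1)^4... no: V(x) = (x^2 - 1)^2 / 4,
    whose minima at x = -1 and x = +1 play the role of quasi-equilibria."""
    rng = np.random.default_rng(seed)
    n = int(T / dt)
    x = np.empty(n + 1)
    x[0] = x0
    for i in range(n):
        drift = -x[i] * (x[i] ** 2 - 1)          # -V'(x)
        x[i + 1] = x[i] + drift * dt + np.sqrt(2 * eps * dt) * rng.standard_normal()
    return x

path = double_well_path(eps=0.05, T=50.0)
```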

3. Random polymers

A polymer is a long chain consisting of monomers that are tied together via chemical bonds. The monomers can be either single atoms (such as carbon) or molecules with an internal structure (such as the adenine-thymine and cytosine-guanine pairs in the DNA double helix). Examples of polymers are: proteins, sugars, fats, plastic and rubber. The chemical bonds are flexible, so that the polymer can arrange itself in various spatial configurations. The longer the chain, the more complex these configurations tend to be. For instance, the polymer can wind around itself to form a knot, attract or repel itself due to the presence of charges it carries, interact with a surface on which it may be adsorbed, or live in a wedge between two confining surfaces. The key challenge is to unravel the complexity in behavior due to the long-range interactions characteristic for polymer chains. Particularly challenging are phase transitions as a function of underlying model parameters, signalling drastic changes in behavior when these parameters cross critical values.

4. Spin glasses

Spin glasses are prime examples of complex systems. Spin glass models were introduced in condensed matter physics to describe amorphous systems (diluted magnetic alloys, structural glasses). In contrast to homogeneous systems, such as crystals or ferromagnets, spin glasses have a highly non-trivial broken symmetry, with a hierarchical organization of equilibrium states. This is also reflected in anomalous dynamical behavior: slow relaxation, aging, memory effects. The mathematical description of spin glasses has made considerable progress in recent years. The rich mathematical structure that arises from the solution of the mean-field Sherrington-Kirkpatrick model is nowadays the subject of intensive investigation. Concepts originally introduced in the study of spin glasses have also found applications in areas as diverse as combinatorial optimization, neural networks, protein folding and economics.

5. Non-equilibrium steady states

A paradigm of non-equilibrium is a system in contact with two heat baths at different temperatures or with particle reservoirs at different densities. The system will eventually evolve into a stationary state that is non-equilibrium (i.e., non-reversible, current-carrying) and that typically shows behavior very different from that of an equilibrium system. Of fundamental interest are the correlation structure and the large deviation behavior of such a non-equilibrium steady state. Typically, non-local large deviation free energies appear and the system exhibits persistent long-range correlations.

6. Random surfaces

Properties of random surfaces are of wide scientific interest. Not only can they be used as models for interfaces between different phases, describing the separation between gas and liquid or between liquid and solid, they also play an important role in interacting particle theories, and they occur in biological systems in the form of (cell or intracellular) membranes. Whether subject to random fluctuations, disordered environments or external forces (such as osmotic pressure), their shapes and widths, governed by the size of the fluctuations, are of wide interest. Random surfaces can in some sense be viewed as two-dimensional analogues of (one-dimensional) polymers, but their properties tend to be harder to study. Phase transitions between different types of behavior are expected to occur, but many of their properties are as yet ill understood.

7. Critical phenomena in two dimensions

Physicists have extensively studied second order phase transitions (also known as critical phenomena) using the renormalization group and, in two dimensions, conformal field theory. These traditional approaches leave many questions unanswered, especially those that concern the geometric aspects of critical phenomena. Many physical systems undergoing a second order phase transition (such as magnetic materials) present fluctuating boundaries, which assume random shapes that can be described by conformally invariant random fractal curves. For this reason, one is led to study stochastic geometric models. In particular, the recent discovery of the Stochastic Loewner Evolution (SLE) has produced spectacular developments, has linked conformal field theory to probability theory and complex analysis, and has opened up new and unexpected perspectives. Key challenges are: using SLE to obtain a better understanding of critical phenomena, making further connections between SLE and some of the most important mathematical models of statistical mechanics, and explaining why many different models present the same behavior at or near the phase transition point.

8. Quantum statistics

Quantum statistics aims to do statistical inference (and design of experiments), in order to learn about the state of quantum systems and the working of quantum operations, with such a small amount of data that the quantum randomness in measurement outcomes is of a size similar to the signal one wants to extract. Only in the past few years have physicists been able to work in the laboratory with single, or very small numbers of, quantum systems. The field of quantum information has grown explosively, partly with a view to future nanotechnology at, say, the atomic level. On the theoretical side, the contours of a Le Cam-like theory of convergence of quantum statistical experiments are only just beginning to emerge; such a theory should be used to clarify the presently fragmented and confusing asymptotic results on optimal state estimation. Topics such as quantum tomography should benefit from present-day statistical insight into nonparametric curve estimation and nonparametric inverse problems. The design of Bell-type experiments, to confirm some of quantum physics’ most startling predictions, is linked to nonparametric missing-data problems from classical statistics.

• The eight research projects listed above are linked through their common search for methods to describe complex macroscopic phenomena based on simple microscopic dynamics. Common tools are large deviation theory, variational calculus, Gibbs theory and operator theory. Coalescence is a key notion in metastability and spin glasses, universality is a driving force of self-organized criticality, random surfaces and critical phenomena in two dimensions, while ergodic theory underpins much of random polymers and non-equilibrium states.

6.3 Stochastics and the life sciences

Randomness and complexity abound in the life sciences. In genetics, the use of probabilistic models and statistical techniques has a long history, but the recent revolution in genomics and proteomics has changed the outlook dramatically. In the neurosciences and molecular cell biology, mathematical models play an increasingly important role, often in the form of networks of various nature (genetic, metabolic, neural). Stochastics contributes both via the analysis of new probabilistic models (e.g. time series, random graphs, Markov processes of many types, particle systems) and via supplying new statistical techniques that can cope with these new models and the many new data platforms, often of high-throughput type.

1. Population genetics

The evolution of DNA-sequences is part of population dynamics, with individuals representing alleles. Individuals are organized in “colonies”, migrate between colonies, and evolve within a colony due to resampling, mutation, selection and recombination. It turns out that on large space-time scales the population behaves in a way that is to a large extent independent of the precise evolution mechanism within a single colony. This is because the migration provides an effective way of “averaging out” over the population. For populations consisting of only one type of individual this phenomenon is well understood. For populations consisting of two or more competing types, very little is known, and a much greater richness of universality classes is expected.
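
A minimal sketch of the resampling mechanism within a single colony, assuming the neutral two-allele Wright-Fisher model (no migration, mutation, selection or recombination): each generation is a binomial resample of the previous allele frequency, and the frequency is eventually absorbed at 0 or 1.

```python
import numpy as np

def wright_fisher(N, p0, generations, seed=0):
    """Neutral Wright-Fisher resampling for a two-allele population of
    constant size N (an illustrative sketch).  Stops early once one
    allele fixes or dies out, since those states are absorbing."""
    rng = np.random.default_rng(seed)
    freqs = [p0]
    p = p0
    for _ in range(generations):
        p = rng.binomial(N, p) / N       # binomial resampling step
        freqs.append(p)
        if p in (0.0, 1.0):              # absorption
            break
    return np.array(freqs)

traj = wright_fisher(N=100, p0=0.5, generations=10_000)
```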

2. Immunology

The human immune system contains 10^7 different types of T-cells. These T-cells interact with antigens (“antibody generating” agents) and trigger an immune response. However, the number of different types of intruder cells is at least 10^6 times larger than the number of available types of T-cells. Thus the question arises: “How does the system manage to recognize so many intruder cells in an effective manner, against a background of our ‘own’ cells that are non-intruders?” Stochastic models are capable of explaining this efficiency, both qualitatively and quantitatively: recognition is more robust when it is done in a stochastic rather than in a specific manner.

3. Sequence alignment

Given two DNA-sequences, how likely is it that the differences in base pairs occurring along these sequences are due to chance? In other words, what is the probability that the sequences are not related to each other via some given evolutionary mechanism? The so-called BLAST algorithm is a way to weigh the differences according to an appropriate penalty scheme and, based on the outcome, to accept or reject the hypothesis that the sequences are not related. This algorithm is used extensively throughout the genetics community, but so far has been given very little mathematical foundation. With the help of large deviation theory, it is possible to estimate the probability of seeing a large number of differences in stochastically unrelated sequences. This probability exhibits “Gumbel law” type behavior, known from extreme value theory. It is a challenge to compute the relevant parameters in this law, which are needed to provide the appropriate confidence intervals in statistical estimation.
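
The “Gumbel law” type behaviour referred to can be sketched via the Karlin-Altschul approximation that underlies BLAST's significance calculations: the number of chance local alignments scoring at least S is roughly Poisson with mean E = K·m·n·exp(-λS). The parameter values for λ and K below are illustrative placeholders, not calibrated to any actual scoring scheme.

```python
import math

def karlin_altschul_pvalue(score, m, n, lam, K):
    """P-value for a local alignment score under the Karlin-Altschul
    theory: with E = K*m*n*exp(-lam*score) chance hits expected, the
    probability of at least one is approximately 1 - exp(-E).  The
    constants lam and K depend on the scoring scheme; computing them
    is exactly the challenge described above."""
    E = K * m * n * math.exp(-lam * score)
    return 1.0 - math.exp(-E)

# Two sequences of length 1000, hypothetical parameters lam = 0.3, K = 0.1
p = karlin_altschul_pvalue(score=60, m=1000, n=1000, lam=0.3, K=0.1)
```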

4. Statistical genetics

Statistical genetics is a classic branch of applied statistics, studying the relationship of phenotypes, such as disease status or quantitative traits, to genotypes. The field has attained new emphasis with the advances of cell biology and experimental technology, which allow the measurement of unprecedented amounts of genotypic and phenotypic data: e.g. SNP-arrays, expression- and CGH-arrays, proteomics profiles. Classical statistical techniques, such as pedigree analysis, nonparametric linkage or association analysis, or variance decompositions, have not lost their value, but must be transformed to take into account the new questions and data. High-dimensionality is a common theme, and challenge, for all such extensions. Not only are the data themselves massive; potential interactions (e.g. between genes or SNPs) or networks (genetic, proteomic, metabolic) also rapidly multiply the number of possibilities one needs to investigate. Many traits are thought to depend on many genes, which may have small and nonadditive effects. Many medicines are thought to have nonlinear effects that depend on many factors. The techniques in modern statistical genetics come in a great variety. Some build a layer on existing techniques, e.g. multiple testing. Others are based on probabilistic models for the phenomenon under study, e.g. the coalescent for DNA evolution. A common theme is that high-dimensional data also provide high-dimensional noise, and statistical techniques are necessary to extract the signal. A general challenge is to link up with statistical and machine-learning methods for model selection and adaptation that have been developed in theoretical statistics and other areas of application.
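
As a concrete instance of the multiple-testing layer mentioned above, here is a sketch of the Benjamini-Hochberg step-up procedure; the p-values are invented for illustration.

```python
import numpy as np

def benjamini_hochberg(pvals, alpha=0.05):
    """Benjamini-Hochberg step-up procedure: returns a boolean mask of
    rejected hypotheses, controlling the false discovery rate at level
    alpha for independent tests."""
    p = np.asarray(pvals)
    m = len(p)
    order = np.argsort(p)
    thresholds = alpha * np.arange(1, m + 1) / m
    below = p[order] <= thresholds
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])   # largest i with p_(i) <= i*alpha/m
        reject[order[: k + 1]] = True      # reject the k+1 smallest p-values
    return reject

pvals = [0.001, 0.008, 0.039, 0.041, 0.60, 0.74]
mask = benjamini_hochberg(pvals, alpha=0.05)
```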

5. Survival analysis

How can genetic and historical information be combined to better predict the residual life of a person or an illness? The analysis of (censored) survival data has received much attention during the past decades. Asymptotic theory and algorithms exist for estimators in complicated models, but there are still big challenges ahead. One of those is to connect shape constrained survival models to high dimensional covariates. Such models can be used for prediction purposes, combining historical (often censored) information with usually high dimensional (genetic) data for the patient at hand. There is a strong link with Section 6.1.

6. Biological networks

Building mathematical models for biological networks and developing statistical techniques for the analysis of the corresponding data are at the core of many areas: the investigation of gene regulatory networks in genetics and molecular biology, of neuronal networks in neuroscience, and of metabolic networks in molecular and cellular biology all generate complex modelling problems and new statistical issues. Because the characteristics of these networks and the biological questions about them differ, the required modelling and analysis tools are different for each field. Here too, high-dimensionality is a common theme, as is the complex interaction structure between the different network components of the same and of different types.

Typically, interactions between genes or gene products, between spiking neurons, or between local field potentials or EEG and MEG signals of different brain areas are investigated pairwise or by means of classical techniques like cluster analysis. Although this still yields important information, there is a need for new multivariate techniques, in the time domain as well as in the frequency domain, with which more complex network connectivity patterns can be inferred. For more detailed investigations, model-based techniques are needed. Development of probabilistic models together with appropriate parameter estimation methods for different types of networks is therefore an important issue. Candidate models range from dynamic Bayesian networks or counting processes to hidden Markov models. Bayesian estimation techniques together with simulation-based methods such as MCMC are expected to play a major role, but they need to be tuned to the high-dimensional context of these applications. Traditionally, (elements of) metabolic networks have been modeled primarily by deterministic models consisting of sets of differential equations.
New technical developments have resulted in larger and more complex data sets in this area, and in the possibility to look at one and the same biological question from different angles and on different scales. This means that several sources of randomness or noise – due to small space or time scales, or stemming from the observation of individual objects instead of groups/populations of objects – need to be taken into account for accurate modelling and parameter estimation. Next to building stochastic models that can be incorporated in or connected to the deterministic ones, another challenge is to adapt existing, or develop new, statistical techniques that can deal with the analysis of data based on such combined and complex models.
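
The MCMC machinery invoked above can be sketched in miniature: a random-walk Metropolis sampler targeting a one-dimensional standard normal “posterior”. The target, the proposal step size and the run length are illustrative choices, not prescriptions from the proposal.

```python
import numpy as np

def metropolis(logpost, x0, steps, scale=0.5, seed=0):
    """Random-walk Metropolis sampler: propose a Gaussian step and accept
    it with probability min(1, posterior ratio), yielding a Markov chain
    whose stationary distribution is the target posterior."""
    rng = np.random.default_rng(seed)
    x = x0
    samples = np.empty(steps)
    for i in range(steps):
        prop = x + scale * rng.standard_normal()
        if np.log(rng.random()) < logpost(prop) - logpost(x):
            x = prop                     # accept the proposed move
        samples[i] = x                   # otherwise keep the current state
    return samples

# Target: a standard normal log-posterior (illustrative)
draws = metropolis(lambda t: -0.5 * t * t, x0=0.0, steps=20_000)
```

In the high-dimensional network settings above, the proposal mechanism and the tuning of `scale` become the central difficulty.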

• Most of the projects listed above require input from a wide range of topics in stochastics, often used in a joint manner. For instance, statistical analysis of biological systems requires probability models, but realistic probability models can only be built by comparing their outputs statistically to data. The projects will be carried out in continuous interaction with biologists, psychologists, medical scientists, and other scientists from the life sciences. The results will be mathematical and methodological progress, but should also have a direct impact on the subject matter sciences. Several research topics also include a notable interaction with computer science.

6.4 Stochastic networks

Congestion phenomena occur when resources (machines at a factory, elevators, telephone lines, traffic lights) cannot immediately render the amount or the kind of service required by their users. Similar queueing phenomena also arise, at the byte level, in modern data-handling technologies (communication systems, computer networks); they are typically less visible, but their effects at user level are usually no less serious. Such congestion phenomena are often very effectively studied by mathematical methods from queueing theory.
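
A classical building block for such congestion questions is the Erlang-B formula for the blocking probability of a loss system of telephone lines; the sketch below uses the standard numerically stable recursion, and the offered load and number of lines are illustrative.

```python
def erlang_b(servers, load):
    """Erlang-B blocking probability for an M/M/c/c loss system, e.g.
    `load` Erlangs of traffic offered to `servers` telephone lines,
    computed with the standard recursion
        B(0) = 1,   B(c) = a*B(c-1) / (c + a*B(c-1)).
    A call arriving when all lines are busy is blocked and lost."""
    b = 1.0
    for c in range(1, servers + 1):
        b = load * b / (c + load * b)
    return b

# Offering 8 Erlangs to 10 lines: roughly a 12% chance a call is blocked
p_block = erlang_b(10, 8.0)
```

Dimensioning then amounts to choosing the smallest number of lines for which the blocking probability falls below a target.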

Congestion control in stochastic networks is an extremely active area of research. One of the key reasons for its strong viability is that, time and time again, interesting new questions from application areas like computer-communications and manufacturing give rise to new and challenging queueing problems. Much research is being triggered by the need to understand and control these highly complex systems, and thus to improve their design and performance.

Presently, novel communication networks (wireless, peer-to-peer, ad-hoc) are giving rise to random graph models and systems that exhibit some form of self-organization. The performance analysis of such networks requires the use of techniques from queueing theory and statistical physics. The cluster will stimulate interaction of researchers from both fields.

1. Simultaneous resource possession

In classical queueing networks customers (jobs, particles, products, transaction requests) move through the network requiring service from one service entity at a time and only request resources from another node after having released the previous one. However, in many real-life situations it is much more natural to allow customers to consume multiple resources simultaneously. Typical application areas are production (multi-type, multi-skill product composition) systems, and wireless communication networks (mobile ad-hoc networks, mesh networks) and computer-communication systems with intensive software/hardware interaction (application servers, middleware).

The phenomenon of simultaneous resource possession (SRP) causes strong dependencies between the operations of service nodes and the residence times of individual customers in the system, which opens up a wealth of challenging questions regarding the performance and control of such systems. Motivated by this, models with SRP have attracted considerable attention. For example, stability for such systems has been shown to be non-trivial, even in seemingly simple small-scale toy examples. Particular emphasis has been put on the use of fluid limit and diffusion scaling techniques. Another main line of research has focused on “computable” resource sharing strategies.

Today, an in-depth understanding of the fundamental dynamics of these systems is lacking. In this context, our aim is to further study systems with SRP, in particular via a variety of asymptotic analysis techniques. Besides fluid and diffusion scalings, the system complexity may be reduced under scaling regimes such as heavy traffic and nearly complete decomposability, which are often natural for the applications mentioned above. Singular perturbation methods for infinite-dimensional Markov processes may be used to study these complex systems in a suitable asymptotic regime. In particular, by equipping the “first” order limit with higher-order refinements, the applicability of the chosen regime can be studied and adapted when necessary.

2. Online control of queueing systems

In many systems some form of control exists: jobs may have to be ordered before processing, jobs can be routed to different servers, and admission decisions have to be taken. In general, the class of all control policies is huge, but sometimes it can be shown that the optimal policy lies within a class of policies characterized by only a few parameters. Even when this is not the case, restricting attention to a properly chosen subset of policies is often not far from optimal. Typical questions that arise are: How to determine the proper set of control parameters? What is the (near-)optimal decision given these parameters?

A complicating factor is that these control parameters usually depend on other parameters, such as the projected number of new customers about to arrive. The standard approach is to first use statistical techniques to estimate parameter values and then to look for the optimal policy within the chosen set. The main disadvantage of this approach, however, is that in practice these parameters fluctuate and are hard to estimate. This raises the need for the development of effective yet simple online congestion-control techniques in which parameter estimation is an integral part of the policy selection process.

3. Random graphs and complex networks

Empirical studies of real networks, such as the Internet, the World-Wide Web, social and sexual networks, and networks describing protein interactions, show fascinating similarities. Most of these networks are “small worlds” (meaning that typical distances in the network are small) and have “power-law degree sequences” (meaning that the number of vertices with degree k falls off as an inverse power of k). Prompted by these empirical findings, random graph models have been proposed to model and explain these phenomena. While the proposed models are quite different in nature, they behave rather universally, in the sense that many graph properties, such as the typical graph distance, the amount of clustering in the graph, its diameter and its connectivity properties, depend in a similar way on the degree sequence of the graph. Topological properties of networks are crucial for many processes living on these networks, such as the spread of a disease in a social network, viruses on the Internet, HIV in a sexual network, or search engines on the WWW.
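
A minimal sketch of one such random graph model is preferential attachment, in which each new vertex attaches to an existing vertex chosen proportionally to its degree; this mechanism produces the power-law degree sequences discussed above. The graph size and the single-edge seed graph are illustrative choices.

```python
import random

def preferential_attachment(n, seed=0):
    """Grow a graph by preferential attachment, one edge per new vertex.
    Sampling uniformly from the list of edge endpoints ('stubs') is the
    same as sampling a vertex proportionally to its degree."""
    random.seed(seed)
    stubs = [0, 1]              # start from a single edge between 0 and 1
    degree = {0: 1, 1: 1}
    for v in range(2, n):
        target = random.choice(stubs)
        stubs += [v, target]
        degree[v] = 1
        degree[target] += 1
    return degree

deg = preferential_attachment(10_000)
max_deg = max(deg.values())
```

The largest degrees grow like the square root of the graph size, far beyond the logarithmic maximum degree of a comparable homogeneous random graph.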

4. Excitable media

Consider a network with a large number of nodes, each of which can be in two states: ‘on’ (alert) or ‘off’ (recovering). A node in state ‘on’ can be triggered by signals from outside. When this happens, the signal spreads instantly over the entire connected component of ‘on’-nodes, after which all these nodes are turned ‘off’ and need a (random) recovery time before they turn ‘on’ again. In these systems typically the number of signals from outside, per node per unit time, is small. Examples are forest-fires (where the nodes are locations of trees, signals correspond to ignitions, and ‘on’ and ‘off’ stand for a tree being present or absent), networks of neurons, and rapidly spreading infections. These systems exhibit a form of self-organization, although how is still poorly understood.

• The four research projects listed above are deeply intertwined. For example, online control problems in queueing systems in which the simultaneous possession of resources plays a predominant role occur naturally in the modelling of many computer-communication systems (e.g. in the derivation of effective thread-spawning algorithms in file servers that may boost the performance of such servers). Another example, currently a “hot topic”, is the online control of stochastic networks in which the topology is largely subject to randomness (e.g. mobile ad-hoc networks and sensor networks). A final example concerns the problems related to excitable media spreading signals over connected components, which is closely related to the spreading of viruses over the Internet and has links with invasion percolation in statistical physics.

6.5 Stochastic finance and econometrics

In the field of economics, stochastics can contribute to finance and econometrics in particular. Econometrics has a strong ongoing link to statistics. Main concerns in mathematical finance are risk management, derivative pricing and portfolio optimisation. This field has grown exponentially in the past decade, and is dominated by stochastic modelling. The whole spectrum of stochastics is relevant, with examples ranging from questions in applied probability to notions usually encountered in statistical quantum physics. There is ample opportunity for a larger contribution of the Dutch stochastic community, in particular in the core field of derivatives.

1. Derivatives

Derivative securities such as options are financial instruments that allow taking out or reducing the risks of financial transactions. The markets for these instruments have grown worldwide in size and importance, and mathematical models are instrumental for determining the prices on these markets.

To capture market realities better, one seeks models more general than Brownian motion, for instance in terms of Lévy processes. Such models entail new questions of both a conceptual and a practical nature: pricing in incomplete markets, calibration of martingale measures, or the fine structure of asset prices given high-frequency data, to name but a few. High-dimensional concepts are once more instrumental here. The stochastic volatility models developed as extensions of the classical but simplifying Black-Scholes model depend on stochastic surfaces or higher-dimensional stochastic manifolds of finite dimension; models for fixed-income or credit derivatives work with infinite-dimensional stochastic (Banach) manifolds. Methods to compute derivative prices in these models are generally lacking, let alone explicit pricing formulas.
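
For reference, the classical Black-Scholes benchmark that these richer models generalize can be evaluated in closed form; the parameter values below are illustrative.

```python
from math import log, sqrt, exp, erf

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bs_call(S, K, T, r, sigma):
    """Black-Scholes price of a European call on spot S with strike K,
    maturity T, interest rate r and volatility sigma -- the 'classical
    but simplifying' model discussed above."""
    d1 = (log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return S * norm_cdf(d1) - K * exp(-r * T) * norm_cdf(d2)

def bs_put(S, K, T, r, sigma):
    """European put via put-call parity: C - P = S - K*exp(-r*T)."""
    return bs_call(S, K, T, r, sigma) - S + K * exp(-r * T)

call = bs_call(S=100, K=100, T=1.0, r=0.02, sigma=0.2)
```

It is precisely such explicit formulas that are generally unavailable for the Lévy and stochastic volatility models, motivating the constructive methods of projects 1a to 1e.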

Building on existing competence and expertise, we shall pursue a two-pronged approach, combining the development of constructive methods with conceptual understanding. The main lines of research to be pursued initially include the following.

1a. Probabilistic structure of fundamental constructions in derivatives

Averaging over time of stochastic processes is typically such a construction. In the Brownian situation the difficulties of this construction have been resolved only recently, by establishing non-obvious but characteristic connections with other branches of mathematics, in particular with harmonic analysis on Poincaré's upper half plane. We will now in particular study the structure of time-averages of Lévy processes.

1b. Stochastics for new derivatives

Widely traded classes of options, in particular path-dependent options, suffer from intrinsic problems. We shall develop the stochastic concepts, and the consequent new financial instruments, that make it possible to cope with these problems. Barrier options, a typical example, are prone to problems caused by the discontinuities that arise when their underlyings hit their barriers. Various remedies have been proposed in the Brownian case. We will complete the study of the explicit structure of the occupation-time and excursion-theoretic concepts on which these proposals are based in the Brownian case, and then seek the relevant concepts and extensions of these results, in particular for Lévy processes.
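
The discontinuity problem can be made concrete with a small simulation: an up-and-out call pays nothing once the underlying touches the barrier, so its value is sensitive to the barrier level. The following Monte Carlo sketch, under simple Brownian (Black-Scholes) dynamics with illustrative parameters, shows the effect of moving the barrier:

```python
import random
from math import exp, sqrt

def up_and_out_call_mc(S0, K, B, T, r, sigma,
                       n_steps=100, n_paths=20000, seed=1):
    """Monte Carlo price of an up-and-out barrier call under geometric
    Brownian motion: the payoff drops to zero as soon as the underlying
    reaches the barrier B. All parameters are illustrative."""
    rng = random.Random(seed)
    dt = T / n_steps
    drift = (r - 0.5 * sigma ** 2) * dt
    vol = sigma * sqrt(dt)
    payoff_sum = 0.0
    for _ in range(n_paths):
        S = S0
        knocked_out = False
        for _ in range(n_steps):
            S *= exp(drift + vol * rng.gauss(0.0, 1.0))
            if S >= B:               # barrier hit: option extinguished
                knocked_out = True
                break
        if not knocked_out:
            payoff_sum += max(S - K, 0.0)
    return exp(-r * T) * payoff_sum / n_paths

# A tighter barrier knocks out many more paths, so the price drops sharply.
p_tight = up_and_out_call_mc(100.0, 100.0, 110.0, 1.0, 0.05, 0.2)
p_loose = up_and_out_call_mc(100.0, 100.0, 130.0, 1.0, 0.05, 0.2)
```

Both prices lie well below the corresponding vanilla call value, and the gap between them illustrates why the hedging of such contracts near the barrier is delicate.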

1c. Constructive stochastics methods for high-dimensional valuation problems

Initially we will look at problems depending on stochastic surfaces, as they arise in stochastic volatility models. We shall seek extensions of the methods developed in the Brownian case, which are based on orthogonal series and have proved to provide efficient computational schemes there. Extensions to higher-dimensional situations will form the second step of the program.

1d. Discrete and lattice methods in derivatives

Recent classes of fixed-income derivatives combine discrete construction features with continuous-time, continuous-space models. To address this type of valuation problem we shall seek to adopt operator-calculus methods and methods from lattice models in statistical quantum physics. We shall emphasize both the ability to estimate pricing kernels and the ability to achieve high-precision computation.

1e. Stochastic structure of infinite dimensional derivative valuation problems

The evolution of financial variables such as interest rates, credit ratings, and many others is typically governed by SDEs or SPDEs and thus takes place in infinite-dimensional stochastic manifolds. On the one hand we will investigate the structure of these manifolds when additional constraints such as positivity are to be satisfied. This will be approached by further developing the connections with stochastic geometry, Malliavin calculus, and the theory of pseudodifferential and Fourier integral operators. On the other hand, we will seek efficient methods for approximating the corresponding infinite-dimensional valuation problems by finite-dimensional ones (using Banach bases or generalized orthogonal function spaces), and thus develop constructive approaches to the explicit valuation of derivatives in such situations.

2. Time series and risk modelling

Financial risk management requires the historical analysis of financial time series in order to build accurate models that can predict future exposure to risk. Owing to the Basel agreements on capital management, which regulate the internal risk management of financial institutions, the demand for accurate models, and for their understanding and statistical estimation, is greater than ever. Two main challenges are modelling the dependencies between changes in the values of financial assets and modelling their extremal behaviour. Typical portfolios consist of hundreds of assets whose joint behaviour is not independent and cannot be captured by simple measures such as correlations. As risk has to do with extreme values of assets (e.g. “value-at-risk”), it is particularly important to understand the dependencies when one or more time series take large values. Hidden Markov models and other stochastic process models are increasingly used to model dependencies between asset prices. We contribute to this field through the study of extremes and the development of statistical methods for stochastic processes.
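
The value-at-risk notion mentioned above can be illustrated with a minimal sketch: the historical (empirical-quantile) estimator, here applied to an artificial return series with one large loss in the tail. Data and parameters are purely illustrative:

```python
def historical_var(returns, alpha=0.99):
    """Historical value-at-risk at level alpha: the loss threshold that
    past losses exceeded with empirical probability about 1 - alpha.
    Minimal sketch; 'returns' are periodic portfolio returns."""
    losses = sorted(-r for r in returns)       # losses, ascending
    idx = int(alpha * len(losses))             # empirical alpha-quantile
    return losses[min(idx, len(losses) - 1)]

# Artificial daily returns: a uniform grid plus one extreme loss of 8%.
rets = [0.001 * k for k in range(-50, 50)] + [-0.08]
var_99 = historical_var(rets, alpha=0.99)
```

The estimator uses only the empirical tail, which is exactly where data are scarce; this is why the extreme-value and dependence modelling described above is needed.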

3. Credit risk and insurance

Credit risk is concerned with the risk of default by a party to a credit contract. The risks in the credit market are huge and typically involve events (defaults) that occur with very low probability but concern very large amounts of money. Credit derivatives are used as insurance and can be very complex. There are essentially two well-accepted approaches to modelling credit risk. The first is a structural approach, linking the occurrence of default directly to the behaviour of the firm’s value: default happens if the firm value falls below a certain low threshold. This approach uses techniques and stochastic processes that are also used in equity modelling. The other is an intensity-based approach, in which default happens exogenously. The default time can be modelled as the jump time of a counting process, and there is a close connection to queueing theory. Stylized features of financial data in a credit-risk setting are non-normal returns, heavy-tailedness and certain jump dynamics. Defaults and credit risk are driven by shocks to the economy or to individual firms. Modelling default risk without jump dynamics is not realistic and clearly severely underestimates the risks present. We shall study the impact of our results about averaging constructions for reduced-form models. We further pursue applications of derivative methods and results in insurance constructions such as embedded options in insurance contracts.
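
In the simplest version of the intensity-based approach, default is the first jump of a Poisson process with constant intensity, so the default time is exponentially distributed and the survival probability up to time t is exp(-λt). A minimal simulation sketch (intensity and horizon are illustrative):

```python
import random
from math import exp, log

def simulate_default_time(intensity, rng):
    """Default time as the first jump of a Poisson process with constant
    intensity: exponentially distributed with the given rate."""
    return -log(1.0 - rng.random()) / intensity

def survival_probability(intensity, t):
    """P(no default by time t) under a constant default intensity."""
    return exp(-intensity * t)

# Illustrative numbers: 2% hazard rate per year, five-year horizon.
rng = random.Random(0)
lam, horizon, n = 0.02, 5.0, 100_000
survived = sum(1 for _ in range(n)
               if simulate_default_time(lam, rng) > horizon)
empirical = survived / n
theoretical = survival_probability(lam, horizon)
```

Realistic reduced-form models replace the constant intensity by a stochastic one driven by economic shocks, which reintroduces the jump dynamics emphasized above.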

4. Change-point problems

Change-point problems have been studied extensively for the retrospective analysis of independent univariate data. This setting is too restrictive for practical use. In the process industries, for example, one usually has several correlated process characteristics, while feedback controllers produce observations that are highly correlated and for which persistent internal process changes can only be perceived through specific short-term patterns in external observations. To tackle such problems, one needs to study sequential procedures such as generalized likelihood ratio (GLR) statistics for multivariate data under specific epidemic alternatives. The distribution of such procedures is very complicated and must be sufficiently understood to allow implementation. Moreover, post-mortem analysis of detected change-points should reveal the nature of the change, which is a delicate multivariate statistical problem.
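
For the simplest univariate case, a sequential detection procedure can be sketched with the classical one-sided CUSUM statistic for an upward mean shift (all tuning constants are illustrative); the multivariate GLR procedures discussed above generalize this idea:

```python
def cusum_detect(xs, target_mean, drift, threshold):
    """One-sided CUSUM for detecting an upward shift in the mean.
    Returns the first index at which the cumulative statistic exceeds
    the threshold, or None if no alarm is raised. Sketch only."""
    s = 0.0
    for i, x in enumerate(xs):
        # Accumulate evidence of a shift; reset to zero when negative.
        s = max(0.0, s + (x - target_mean - drift))
        if s > threshold:
            return i
    return None

# In-control observations around mean 0, then a shift to mean 2 at index 50.
data = [0.1, -0.1] * 25 + [2.0] * 20
alarm = cusum_detect(data, target_mean=0.0, drift=0.5, threshold=4.0)
```

The drift and threshold constants trade off detection delay against false-alarm rate; choosing them for correlated, multivariate data is precisely the open problem described above.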

• The four research projects above are but a selection from a wide range of possible subjects. Other topics will be added after enlarging the research staff in mathematical finance, in consultation with partners at financial institutions and economics departments. The topics span all three areas of stochastics (statistics, probability and operations research) and involve some of the most advanced mathematical theory available (e.g. stochastic integration, semimartingale theory, and stochastic processes of many types). Because derivatives are used for risk management by the financial industry, there is a strong interaction between the first three projects.
