# Workshop "Heavy Tails"

## Dec 9 - Dec 13

#### Summary

The goal of the workshop is to bring together researchers from probability, statistics, and application areas such as computer science, operations research, physics, engineering, and finance, so that they can learn from each other about the latest developments in theory (covering both stochastic processes and spatial models), statistical and simulation algorithms, and applications.

#### Sponsors

#### Organizers

| Name | Affiliation |
| --- | --- |
| Remco van der Hofstad | TU Eindhoven |
| Adam Wierman | Caltech |
| Bert Zwart | CWI / TU Eindhoven |

#### Speakers

| Name | Affiliation |
| --- | --- |
| Bojan Basrak | University of Zagreb |
| Ayan Bhattacharya | Wroclaw University |
| Jose Blanchet | Stanford University |
| Alessandra Cipriani | TU Delft |
| Aaron Clauset | University of Colorado |
| Claudia Klüppelberg | TU München |
| Anja Janssen | KTH |
| Dmitri Krioukov | Northeastern University |
| Daniel Lacker | Columbia University |
| Marie-Colette van Lieshout | CWI Amsterdam |
| Nelly Litvak | University of Twente |
| Thomas Mikosch | University of Copenhagen |
| Sid Resnick | Cornell University |
| Chang-Han Rhee | Northwestern University |
| Gennady Samorodnitsky | Cornell University |
| Johan Segers | UCLouvain |
| Fiona Sloothaak | TU Eindhoven |
| Clara Stegehuis | University of Twente |
| Caspar de Vries | Erasmus University Rotterdam |
| Nassim Taleb | New York University |
| Olivier Wintenberger | Sorbonne Université |

#### Programme

The workshop will begin:

Monday December 9, 10.00

Expected closing:

Friday December 13, 16.00

#### Abstracts

**A k-means clustering procedure for extremes**

Dimension reduction has become an important topic in statistics and has more recently also been applied in the context of extreme value theory.

In this talk, we start with an overview of some approaches that have been pursued in this context so far, and then discuss how the standard assumption of multivariate regular variation can be used to construct simple and efficient ways to model and describe the dependence structure of multivariate extremes. In particular, we introduce a k-means clustering procedure on the empirical spectral measure that allows for a comprehensive description of "extremal prototypes". We illustrate our method with several data examples.

(joint work with Phyllis Wan from Erasmus University Rotterdam)
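
To make the idea of clustering the empirical spectral measure concrete, here is a minimal sketch, not necessarily the procedure presented in the talk. It assumes the Euclidean norm, a hypothetical threshold given by the top 5% of norms, and a plain spherical k-means variant (centers renormalized to the unit sphere, similarity measured by the inner product). The function name `extremal_prototypes` and all parameter values are made up for illustration.

```python
import numpy as np

def extremal_prototypes(data, n_clusters, top_fraction=0.05, n_iter=50, seed=0):
    """Cluster the directions of the largest observations (illustrative sketch)."""
    data = np.asarray(data, dtype=float)
    radii = np.linalg.norm(data, axis=1)
    # Keep only observations whose norm exceeds a high empirical quantile.
    threshold = np.quantile(radii, 1.0 - top_fraction)
    mask = radii > threshold
    # Project the extremes onto the unit sphere: the empirical spectral measure.
    angles = data[mask] / radii[mask, None]

    rng = np.random.default_rng(seed)
    start = rng.choice(len(angles), size=n_clusters, replace=False)
    centers = angles[start].copy()
    for _ in range(n_iter):
        # Assign each direction to the center with the largest inner product.
        labels = np.argmax(angles @ centers.T, axis=1)
        for j in range(n_clusters):
            members = angles[labels == j]
            if len(members) > 0:
                total = members.sum(axis=0)
                norm = np.linalg.norm(total)
                if norm > 0:
                    centers[j] = total / norm  # renormalize to the sphere
    labels = np.argmax(angles @ centers.T, axis=1)
    return centers, labels

# Synthetic example: heavy-tailed radii, with mass concentrated near two axes.
rng = np.random.default_rng(1)
n = 5000
axis = rng.integers(0, 2, size=n)
radius = rng.pareto(2.0, size=n) + 1.0
noise = 0.05 * np.abs(rng.normal(size=(n, 2)))
sample = radius[:, None] * (np.eye(2)[axis] + noise)

centers, labels = extremal_prototypes(sample, n_clusters=2)
print(centers)
```

In this toy example each row of `centers` is a unit vector that plays the role of an "extremal prototype", that is, a typical direction in which large observations occur.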

**Power Loss with Power Laws**

One common task in network and data science is to make reliable inferences from data, which is always finite. Perhaps the simplest example: given the adjacency matrix of a real-world network, is the network sparse or dense? It does not appear to be widely recognized in network science that this question cannot have any rigorous answer. It is then not surprising that the question of whether a given network is power-law or not has not been rigorously addressed at all, even though this question is so foundational in the history of network science.

We review the state of the art in extreme value statistics, where power laws are understood as regularly varying distributions that properly formalize the idea in network science that "power laws are straight lines in the log-log scale". There exists a multitude of power-law exponent estimators whose consistency on any regularly varying data was proven long before network science was born. In application to real-world networks these estimators tell us what we already know -- that many of these networks are scale-free. Yet applied to any data, these estimators always report some estimate, and the nature of the infinite-dimensional space of regularly varying distributions is such that these estimates cannot be translated into any rigorous guarantees or hypothesis testing methodologies that would be able to tell whether the data comes from a regularly varying distribution or not. This situation is conceptually no different from the impossibility of telling whether a given finite data set is sparse or dense, whether it comes from a finite- or infinite-variance distribution, or whether it shows that the system has a phase transition. All these questions can be rigorously answered only in the infinite data size limit, which is never achieved in reality. An interesting big open problem in data science is how and why we tend to make correct inferences about finite data using tools and concepts that are known to work properly only at infinity and whose convergence speed is unknown.
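
As a small illustration of the point that an exponent estimator reports a value on any data, here is a minimal sketch using the Hill estimator, one standard tail-index estimator. The abstract does not name a specific estimator, so the choice of Hill, the function name `hill_estimator`, the sample size, and the number of upper order statistics `k` are all assumptions made for illustration.

```python
import numpy as np

def hill_estimator(sample, k):
    """Hill estimate of the tail index alpha from the k largest observations."""
    x = np.sort(np.asarray(sample, dtype=float))[::-1]  # descending order
    if not 0 < k < len(x):
        raise ValueError("k must satisfy 0 < k < len(sample)")
    if x[k] <= 0:
        raise ValueError("the (k+1)-th largest observation must be positive")
    # Mean log-excess over the (k+1)-th largest value estimates 1/alpha.
    gamma = np.mean(np.log(x[:k]) - np.log(x[k]))
    return 1.0 / gamma

rng = np.random.default_rng(0)
n, k = 10_000, 500

# Classical Pareto with scale 1 and tail index 2: regularly varying.
pareto = rng.pareto(2.0, size=n) + 1.0
# Lognormal: heavy-tailed, but not regularly varying.
lognormal = rng.lognormal(mean=0.0, sigma=1.0, size=n)

alpha_pareto = hill_estimator(pareto, k)
alpha_lognormal = hill_estimator(lognormal, k)
print("Pareto sample:   ", alpha_pareto)
print("Lognormal sample:", alpha_lognormal)
```

Both calls return a finite positive number. Nothing in the output itself signals that the second sample does not come from a regularly varying distribution.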

**Nearest-neighbour Markov point processes on graphs with Euclidean edges**

We define nearest-neighbour point processes on graphs with Euclidean edges and linear networks. They can be seen as analogues of renewal processes on the real line. We show that the Delaunay neighbourhood relation on a tree satisfies the Baddeley–Møller consistency conditions and provide a characterisation of Markov functions with respect to this relation. We show that a modified relation defined in terms of the local geometry of the graph satisfies the consistency conditions for all graphs with Euclidean edges that do not contain triangles.

**Risk forecasting in the context of time series**

We propose an approach for forecasting risk contained in future observations in a time series. We take into account both the shape parameter and the extremal index of the data. This significantly improves the quality of risk forecasting over methods that are designed for i.i.d. observations and over the return level approach.

We prove functional joint asymptotic normality of the common estimators of the shape parameter and the extremal index, based on which the statistical properties of the proposed forecasting procedure can be analyzed.

(joint work with Xiaoyang Lu)

**One- versus multi-component regular variation**

One-component regular variation refers to the weak convergence of a properly rescaled random vector conditionally on the event that a single given variable exceeds a high threshold. Although the weak limit depends on the variable concerned by the conditioning event, the various limits are connected through an identity that resembles the time-change formula for regularly varying stationary time series. The formula is most easily understood through a single multi-component regular variation property concerning some (but not necessarily all) variables simultaneously.

The theory is illustrated for max-linear models, in particular recursive max-linear models on acyclic graphs, and for Markov trees. In the latter case, the one-component limiting distributions take the form of a collection of coupled multiplicative random walks generated by independent increments indexed on the edges of the tree. Changing the conditioning variable then amounts to changing the directions of certain edges and transforming their increment distributions in a specific way.

Reference:

Segers, J. (2019). "One- versus multi-component regular variation and extremes of Markov trees", https://arxiv.org/abs/1902.02226.

#### Registration

Please use this **link** to register

More information to follow soon!!