- This event has passed.
Community Detection and Network Reconstruction
Sep 19, 2017 - Sep 22, 2017
Recent advances in the study of networks in the physics, mathematics and computer science communities have provided novel probabilistic tools to address various network-related problems of practical relevance. Two seemingly different, but actually intimately related, problems are community detection and network reconstruction.
Community detection is the identification of groups of nodes that share some property (e.g., are densely connected with each other in a given real-world network). This seemingly simple task presents us with several computational and analytical challenges, even in very simple and idealized scenarios.
Network reconstruction problems, on the other hand, attempt to infer features of a real-world network, of which only partial and/or aggregate information is available (e.g., inference from a sample of a subset of the network).
These two broad classes of problems are conveniently tackled using random graph models that preserve some observed features of the real network, while providing a probabilistic estimate for the unobserved features.
This workshops focuses on various aspects of community detection and network reconstruction. Emphasis will be put on the interdisciplinary nature of both problems, and contributions will highlight both the applied and the theoretical side.
|Luca Avena||Leiden University|
|Rui Castro||TU Eindhoven|
|Diego Garlaschelli||Leiden University|
|Tim Hulshof||TU Eindhoven|
|Viresh Patel||University of Amsterdam|
|Ton Coolen||King’s College London, Abstract|
|Jean-Charles Delvenne||University of Louvain, Abstract|
|Andrea Gabrielli||ISC-CNR, Abstract|
|Fengnan Gao||Fudan University, Shanghai, Abstract|
|Alexandre Gaudilliere||CNRS, Abstract|
|Olga Klopp||ESSEC, Abstract|
|Iman van Lelyveld||VU Amsterdam, Abstract|
|Po-Ling Loh||University of Wisconsin, Abstract|
|Malwina Luczak||Queen Mary, University of London|
|Laurent Massoulié||MSR-Inria, Abstract|
|Raj Rao Nadakuditi||University of Michigan, Abstract|
|Tiago P. Peixoto||University of Bath, UK and ISI Foundation, Italy, Abstract|
|Kevin Scaman||Microsoft Research - INRIA Joint Center, Abstract|
|Tiziano Squartini||IMT, Abstract|
|Vincent Traag||CWTS, Abstract|
|Nicolas Tremblay||CNRS, Abstract|
|Ernst Wit||University of Groningen, Abstract|
Tailoring of Random Networks and Graphs
Networks are popular tools for characterizing and visualizing complex systems with many interacting variables. It is often desirable to be able to generate random networks with statistical features that mimic those of a given network instance. These random networks can serve as `null hypotheses' in hypothesis testing, or as proxies in simulation studies or system modelling. Defining and generating random networks with controlled topological characteristics (imposed via hard and/or soft constraints) involves interesting theoretical and practical challenges, including questions related to counting graphs analytically (via ensemble entropies) and the construction of detailed balance Markov chains on the basis of nontrivial rewiring moves. In this talk I will first discuss these challenges in the context of graph ensembles that typically produce treelike graphs, and focus towards the end on ensembles of `loopy’ graphs.
Dynamics on Networks and Communities
The problem of community detection arises naturally in many a context, in each time a specific flavour leading to various overlapping but distinct definitions or optimality criteria. We will review some of those contexts, and focus on the case where communities aim at capturing those patterns of the dynamics operating on the network that are induced by the structure of the network. The prototypical example is the tendency for a diffusion process---random walker, information spreading, epidemics, synchronization---to stay confined for a long time in a community, for lack of sufficiently many edges leaving the community.
We will compare this approach with the statistical inference approach, and extend some ideas to the network reconstruction problem.
Organization and hierarchy of the human functional brain network lead to a chain-like core
The brain is a paradigmatic example of a complex system as its functionality emerges as a global property of local mesoscopic and microscopic interactions. Complex network theory allows to elicit the functional architecture of the brain in terms of links (correlations) between nodes (grey matter regions) and to extract information out of the noise. Here we present the analysis of functional magnetic resonance imaging data from forty healthy humans during the resting condition for the investigation of the basal scaffold of the functional brain network organization. We show how brain regions tend to coordinate by forming a highly hierarchical chain-like structure of homogeneously clustered anatomical areas. A combined maximum spanning tree and network percolation approach  revealed the centrality of the occipital cortex and the peculiar aggregation of cerebellar regions to form a closed core. We also report the hierarchy of network segregation and the level of clusters integration as a function of the connectivity strength between brain regions.
Spectra, scales and partitions through random forests
We will discuss how random rooted forests provide tools for graph spectral analysis and multiscale partitioning.
(joint work with L. Avena, F. Castell and C. Mélot)
Statistical Estimation of Preferential Attachment Networks
The preferential attachment (PA) network is a popular way of modeling the social networks, the collaboration networks and etc. The PA network model is an evolving network where new nodes keep coming in. When a new node arrives, it establishes only one connection with an existing node. The random choice on the existing node is via a multinomial distribution with probability weights based on a preferential function $f$ on the degrees. $f$ is assumed apriori non-decreasing, which means the nodes with high degrees are more likely to get new connections, i.e., "the rich get richer". We proposed an estimator on $f$, that maps the natural numbers to the positive real line. We show, with techniques from branching process, our estimator is consistent. If $f$ is affine, meaning $f(k) = k + \delta$, it is well known that such a model leads to a power-law degree distribution. We proposed a maximum likelihood estimator for $\delta$ and establish the asymptotic normality and efficiency on the MLE. If the PA function is sublinear and assumes a parametric form, then we show the maximum likelihood estimator is also efficient, despite the difficulty in analyzing the likelihood. We will also talk about some recent advances revealing the connection between the empirical estimator and the maximum likelihood estimator.
(joint work with Aad van der Vaart)
Optimal graphon estimation
Inhomogeneous random graph models encompass many network models such as stochastic block models and latent position models. We study the twin problems of estimating the connection probability matrix of an inhomogeneous random graph and the graphon of a $W$-random graph. We consider classes of block constant matrices and step function graphons and establish the minimax estimation rates with respect to different distances: $l_1$, $l_2$ and the cut metric. We also extend our study to the case of unbounded graphons. Our results cover the important setting of sparse networks.
(joint work with A. Tsybakov and N. Verzelen)
Capturing networks for financial stability analysis
Capturing financial network linkages and contagion in stress test models are important goals for banking supervisors and central banks responsible for micro- and macroprudential policy. However, granular data on financial networks is often lacking, and instead the networks must be reconstructed from partial data. In a recent paper, we conduct a horse race of network reconstruction methods using network data obtained from 25 different markets spanning 13 jurisdictions. Our contribution is two-fold: first, we collate and analyze data on a wide range of financial networks. And second, we rank the methods in terms of their ability to reconstruct the structures of links and exposures in networks. In addition I will discuss recent developments in financial network data at the Dutch Central bank.
Optimal rates for community estimation in the weighted stochastic block model
Community identification in a network is an important problem in fields such as social science, neuroscience, and genetics. Over the past decade, stochastic block models (SBMs) have emerged as a popular statistical framework for this problem. However, SBMs have an important limitation in that they are suited only for networks with unweighted edges; in various scientific applications, disregarding the edge weights may result in a loss of valuable information. We study a weighted generalization of the SBM, in which observations are collected in the form of a weighted adjacency matrix and the weight of each edge is generated independently from an unknown probability density determined by the community membership of its endpoints. We characterize the optimal rate of misclustering error of the weighted SBM in terms of the Renyi divergence of order 1/2 between the weight distributions of within-community and between-community edges, substantially generalizing existing results for unweighted SBMs. Furthermore, we present a principled, computationally tractable algorithm based on discretization that achieves the optimal error rate without assuming knowledge of the weight densities.
Phase transitions on community detectability for various types of stochastic block models
In this talk we will survey available results and open questions on detectability of communities using polynomial-time algorithms for several variants of the stochastic block model (SBM). We will in particular consider degree-corrected SBM’s and labelled SBM’s and discuss how the phase transition captured by the so-called Kesten-Stigum threshold in the classical case translates in these other two scenarios.
OptFuse: Random matrix theory enabled optimal fusion
We place ourselves in the setting where we are given multiple low-rank ‘signal’ plus ‘noise’ type matrices and the objective is to optimally combine these matrices to best estimate the low-rank part. A concrete application in mind includes the determination of common community structure from multiple SBM sampled adjacency matrices. We describe an random matrix theory inspired algorithm that we called OptFuse that formulates the optimization problem in terms of an objective function derived using ‘free probability’ theory. We compare the empirical results with closed-form solutions, illustrate its success on a real-world dataset and discuss some extensions.
(joint work with Himanshu Nayar)
Nonparametric weighted stochastic block models
We present a Bayesian formulation of weighted stochastic block models that can be used to infer the large-scale modular structure of weighted networks, including their hierarchical organization. Our method is nonparametric, and thus does not require the prior knowledge of the number of groups or other dimensions of the model, which are instead inferred from data. We give a comprehensive treatment of different kinds of edge weights (i.e. continuous or discrete, signed or unsigned, bounded or unbounded), as well as arbitrary weight transformations, and describe an unsupervised model selection approach to choose the best network description. We illustrate the application of our method to a variety of empirical weighted networks, such as global migrations, voting patterns in congress, and neural connections in the human brain.
New results in distributed optimization and consensus learning
(The talk assumes no prior knowledge in convex optimization.) Consider a graph where each vertex v is associated to a convex function f_v. Assume that each vertex is a computing unit with oracle access to its associated function. We consider the distributed optimization problem of reaching the minimum of f:=\sum_v f_v. We show how the expansion properties of the graph and the conditioning of f interact in the description of the fundamental limits for this problem. A practical decentralized algorithm will also be presented with experiments for least-squares and logistic regression.
(joint work with Francis Bach, Yin Tat Lee, Laurent Massoulie and Sebastien Bubeck)
Network reconstruction techniques: an overview
When approaching the study of complex networks, a problem that is systematically encountered is represented by the limitation of available information. Examples are provided by economic and financial networks, whose structure is particularly difficult to access because of the privacy issues preventing these kinds of data to be publicly available. On top of this, the aggregate information one is able to collect is often inadequate to reproduce the set of interconnections between economic and financial entities, thus dramatically reducing the possibility of providing a realistic estimate of their crucial properties (e.g., the network resilience to the propagation of shocks).
In order to infer the presence of otherwise unaccesible connections as accurately as possible, researchers have developed algorithms to compensate for the scarcity of data: this talk is devoted to illustrate examples of such techniques, with particular emphasis on those methods rooted into entropy-maximization. A recently-developed Bayesian approach to network reconstruction will be also briefly discussed and a comparison with more traditional likelihood-based inference procedures provided.
Finding well connected communities in networks
Many complex networks have a modular structure: groups of densely connected nodes with few connections between the groups. Nodes in such groups often have something in common, and enrich our understanding of complex networks. Finding such so-called communities in large networks is far from trivial. One of the best-known methods for community detection is modularity, which specifies a quality function of a partition. However, modularity suffers from a well-known flaw, known as the resolution limit: it tends to oversimplify, and lump together several (sub)communities in one large community. We here show that only few quality functions can address this issue. One of the best algorithms for optimising modularity is the Louvain algorithm. We here show that it can lead to arbitrarily badly connected communities---in addition to the resolution limit of modularity. In particular, it can lead to disconnected communities. We introduce a new algorithm, which not only addresses this caveat, it asymptotically ensures that all subsets of all communities are locally optimally assigned. Finally, combined with a fast local move subroutine, we not only obtain an algorithm that finds higher quality partitions, it is also faster than Louvain.
Unifiying some multiscale community detection methods in the graph Fourier space
Multiscale community detection methods are important analysis tools for networks. They first scan the network's structure at different scales, then automatically detect relevant scales if they exist, before providing a partition in communities for each of these selected scales. We will explore the links existing between these methods and the field of graph signal processing that attempts to generalize the theory of signal processing to signals defined on graphs. We will show how the graph Fourier space provides a unifiying framework, and recast existing multiscale approaches as graph filters.
Reconstruction of Biological Networks
Biological processes are intrinsically complex: they deal with genetic regulation, metabolic interactions and various levels of organization. Modelling this biological reality is a process of abstraction: given a particular set of circumstances, one should select a time-scale and a functional mode of operation through which to view the phenomenon of interest.
Often, networks provide a good syntax of the underlying biological reality: gene regulatory networks, cell signaling networks, metabolic networks, cell differentiation networks. The semantics of the network, however, can change depending on which phenomena is described. We focus on two generative models, stochastic differential equation model (SDE) and ordinary differential equation model (ODE), and one exploratory model, the vector autoregressive model (VAR).
We show that by embedding these models into a penalized inference framework, they can be used to reconstruct biological networks. Clearly, most of the statistical methods are generic and can be applied in other settings too.
Registration is for the workshop is free, but compulsory: REGISTRATION
Eurandom, Mathematics and Computer Science Dept, TU Eindhoven,
De Groene Loper 5, 5612 AE EINDHOVEN, The Netherlands
Eurandom is located on the campus of Eindhoven University of Technology, in the MetaForum building, 4th floor (more about the building). The university is located at 10 minutes walking distance from Eindhoven main railway station (take the exit north side and walk towards the tall building on the right with the sign TU/e).
For lecturers and invited speakers, we will take care of accommodation. Other attendees must make their own arrangements.
Our preferred hotel is The Student Hotel Eindhoven. They offer 10% discount if you book with promocode, please follow the link TSHE .
For hotels around the university, please see: Hotels (please note: prices listed are “best available”).
More hotel options can be found on Tourist Information Eindhoven.
For those arriving by plane, there is a convenient direct train connection between Amsterdam Schiphol airport and Eindhoven. This trip will take about one and a half hour. For more detailed information, please consult the NS travel information pages.
Many low cost carriers also fly to Eindhoven Airport. There is a bus connection to the Eindhoven central railway station from the airport. (Bus route number 401) For details on departure times consult Public Transport.
The University can be reached easily by car from the highways leading to Eindhoven. For details: Route and map TU/e campus.
● Conference facilities
Conference room, MetaForum Building “MF11 & 12”.
The meeting-room is equipped with a data projector, an overhead projector, a projection screen and a blackboard. Please note that speakers and participants giving an oral presentation are kindly requested to bring their own laptop or their presentation on a memory stick.
● Conference Secretariat
Upon arrival, participants should register with the workshop officer, and collect their name badges. The workshop officer will be present for the duration of the conference, taking care of the administrative aspects and the day-to-day running of the conference: registration, issuing certificates and receipts, etc.
Should you need to cancel your participation, please contact Patty Koorn, the Workshop Officer.
Mrs. Patty Koorn, Workshop Officer, Eurandom/TU Eindhoven, email@example.com