European Institute for Statistics, Probability, Stochastic Operations Research and its Applications

About | Research | Events | People | Reports | Alumni | ContactHome


YEQT-IV (Young European Queueing Theorists)

"Optimal Control in Stochastic Systems"

November 25-27, 2010



Thursday November 25

10.30-10.45 Welcome Connie Cantrijn, Managing Director EURANDOM
10.45-11.15 Nicolas Gast Mean field for Markov decision processes: from discrete to continuous optimization
11.15-11.30 Break  
11.30-12.00 Rene Bekker Scheduling admissions and reducing variability in bed demand
12.00-12.30 Zumbul Atan Supply Disruptions in Multi-Echelon Inventory Systems
12.30-13.30 Lunch  
13.30-14.30 Chip White (Keynote) Optimal Control in Stochastic Systems: Challenges for the Future
14.30-15.00 Break  
15.00-16.00 Ulrich Rieder (Tutorial) Markov Decision Processes with Applications
16.00-16.20 Break  
16.20-16.50 Cagdas Buyukkaramikli Periodic Capacity Management under a Lead Time Performance Constraint
16.50-17.20 Taoying Farenhorst-Yuan A Perturbation Analysis Approach to Phantom Estimators
18.30           Dinner

Friday November 26

09.30-10.00 Dinard van der Laan Optimal mixing of suboptimal decision rules for MDP control
10.00-10.30 Itai Gurvich t.b.a.
10.30-11.00 Break  
11.00-11.30 Gerard Hoekstra Dynamic Traffic Splitting to Parallel Wireless Networks with Partial Information: A Bayesian Approach
11.30-12.00 Sofia Villar Real-state restless bandit problems: marginal productivity indexation methodology and its application to exible sensor tracking systems
12.00-12.30 David Hodge Asymptotic optimality of multi-action restless bandits
12.30-13.30 Lunch  
13.30-14.30 Semyon Meerkov (Keynote) Production Systems Engineering: Optimality through Improvability
14.30-15.00 Break  
15.00-16.00 Bernd Heidergott (Tutorial) Recent Trends in Optimization and Control of Markovian Systems
16.00-16.20 Break  
16.20-17.20 Oualid Jouni

On Queueing Systems with Finite Arrivals

Saturday November 27

10.00-11.00 Ger Koole (Tutorial) Monotonicity Results in MDP
11.00-11.15 Break  
11.15-11.45 Flora Spieksma Local uniformisation in the successive approximations method for processor sharing retrial queues
11.45-12.15 Jan-Pieter Dorsman Optimization of polling systems with batch service
12.15-13.00 Lunch  
13.00-13.30 Rene Haijema One-step policy improvements to reduce the waiting time at traffic lights
13.30-14.00 Sandra van Wijk Optimal lateral transshipment policies in spare parts inventory models



Zumbul Atan

Supply Disruptions in Multi-Echelon Inventory Systems

The effect of supply disruptions in multi-echelon inventory systems can be more severe than single location systems. In multi-echelon systems, a disruption in the supply process of a location can impact all the other locations. Some of them may not receive the inventory they request and the others may need to carry excess inventories which they originally planned to send if disruptions did not happen. For effective management of multi-echelon inventory systems subject to disruptions, all locations need to develop strategies for protecting against the damages caused by disruptions. In this study, we analyze two-echelon serial and distribution systems subject to supply disruptions. We consider a range of assumptions and propose algorithms to find the optimal or near-optimal stocking levels of all the locations in the system. We show how supply disruptions in different parts of the network affect inventory decisions and propose strategies that firms can use to mitigate the effects of these disruptions.


Rene Bekker

Scheduling admissions and reducing variability in bed demand

With the growing demand for health care resources, the pressure on the efficient usage of the available bed capacity is increasing. An efficient use of capacity is however complicated by a considerable degree of variability, which is typical for the workload at clinical wards. Surprisingly, studies have shown that the variation in the number of admitted scheduled (or elective) patients is generally at least as large as the variation in the number of emergency admissions. The variability in admission pattern leads to highly variable bed occupancy for both elective and emergency patients. Moreover, during the weekend the number of elective admissions is generally very small leading to extra workload fluctuations over the week.
The aim of this presentation is to get insight in the impact of the admission pattern on the variability in bed occupancy and the number of refused admissions in clinical wards. More specifically, we consider both the impact of reducing the daily variability in admissions, as well as reducing the variability in the weekly occupancy pattern, caused by the lack of scheduled admissions during the weekends.
Currently, the Erlang loss and delay models are often used to determine the required number of beds in clinical wards. This model is a natural candidate as the arrival process of (elective) patients can often well be approximated by a Poisson process. We propose to apply a heavy-traffic limit theorem for the G/G/infty queue yielding an intuitively appealing approximation in case the arrival process is not Poisson.
Regarding the weekly occupancy pattern, we assume a time-dependent weekly admissions pattern and determine the mean and variance in offered load per day. This performance analysis is the basis for finding an optimal admission schedule. Given a target average occupancy per day, we use the time-dependent analysis in combination with a Quadratic Programming model to determine the optimal number of elective admissions per day, such that the difference between the target and realized load is minimized. From the mathematical results, practical scenarios and guidelines are derived that can be used by hospital managers and support the method of quota scheduling.
Note: this is based on joint work with Paulien Koeleman-Out.


Cagdas Buyukkaramikli

Periodic Capacity Management under a Lead Time Performance Constraint

In this paper, we study a production system that operates under a lead time performance constraint which guarantees the completion of an order before a pre-determined lead time with a certain probability. The demand arrival times and the service requirements for the orders are random. To reduce the capacity related operational costs, the production system under study has the option to use flexible capacity. We focus on periodic capacity policies and model the production system as a queuing system that can change its capacity periodically and choose from two levels of capacity available: a permanent capacity and a permanent plus contingent capacity. The permanent capacity is always employed in the system. Contingent capacity is supplied if needed at the start of a period, and is available during that period, at a cost rate that is decreasing in period length. For a given lead time performance constraint and a permanent/contingent capacity cost structure, we develop a procedure to create a set for candidate period lengths and permanent/contingent capacity levels. Afterwards, we propose a search algorithm that finds the capacity levels and the threshold point that minimizes the capacity related costs for a given period length. Due to the cost structure of the permanent and contingent capacity levels, the behavior of the operational costs changes drastically under different period lengths and different cost structures. In our computational study, we observe that the periodic capacity flexibility reduces the capacity related operational costs significantly (up to 35 %). The percentage savings are higher for more ambitious lead time performance constraints. Moreover, we observe that the use of contingent capacity makes the total system costs more insensitive to the lead time performance requirements.


Jan-Pieter Dorsman

Optimization of polling systems with batch service

We consider a batch service polling system consisting of an inner part and an outer part. Type-i customers arrive at the outer system according to a renewal process and accumulate into a type-i batch. As soon as a certain fixed number of customers have arrived, the batch is forwarded to the inner system, the polling system, where the batch is processed.
Under the assumption that the service requirement of a batch is independent of its size (the number of customers in the batch), we study the problem of determining the combination of batch sizes that minimizes the (weighted) sum of the mean waiting times of the customers in the complete system. A balance in the trade-off between the waiting time in the inner part and the outer part of the system needs to be found.
Taking larger batch sizes results in a smaller load for the polling system, and thus in a shorter "inner waiting time", but on the other hand accumulation of a batch will take more time, increasing the "outer waiting time".
As this model does not easily allow for an exact analysis, we propose both a numerical approach for this problem and a closed-form approximation for the optimal combination of batch sizes, which allows for easy implementation. It turns out that these two methods complement each other.


Nicolas Gast

Mean field for Markov decision processes: from discrete to continuous optimization

In this talk, I will focus on the optimal control of large stochastic systems. I will explain how to extend classical mean field methods to study the controlled behavior of large systems. I will show that under mild assumptions, solving an optimal control problem for a system with a large number of objects can be reduced to the solving of a problem for a deterministic system as the number of objects grows large. In practice, this allows one to evaluate the performance of a policy and provide a way to asymptotically solve problems that used to be intractable.
This is a joint work with B. Gaujal and J.Y Le Boudec


Itai Gurvich

Motivated by call-center outsourcing problems, we consider a network with multiple call centers overflowing some of their calls to a central call center. Relying on a separation of time scales, we establish heavy-traffic approximations and prove an asymptotic independence result which facilitates, in turn, simplified expressions for the customer waiting-time distribution. The asymptotic independence is shown to be useful in solving some call-center optimization problems.


Rene Haijema

One-step policy improvements to reduce the waiting time at traffic lights

We consider a single intersection of roads that is controlled by traffic lights. For each traffic flow f leading to the intersection cameras and/or induction loops measure the number of cars approaching the intersection and their distances as well as the number of cars (qf) waiting at queue f. We model the control in discrete time to acknowledge a fixed switchover time when a light changes from green to yellow to all red. When a light shows green one either keeps it green or turns it into yellow.
An optimal policy that minimizes the overall long-run average waiting time given all possible queue states (q1, q2, …, qF) and all possible signal state (l) can be computed by standard MDP solution techniques (such as value iteration) but only for simple infrastructures of a single intersection in isolation. For more complex infrastructures, the MDP becomes intractable due to the curse of dimensionality in the state space: the number of states becomes easily too large to allow the computation of the relative values of all states (l ; q1, q2, …, qF).
Therefore we suggest a one-step policy improvement approach that improves the fixed cycle control (FC). In a fixed cycle the order in which the queues are served as well as the duration of the green periods of each flow are fixed. Therefore under FC the state space is decomposed into subspaces related to each traffic flow. For each flow in isolation of all other flows we formulate a Markov chain and numerically compute the so called relative values of the states under FC.
Because of the periodicity of the fixed cycle, the computation of relative values by value iteration is not straightforward. We discuss the right relative value concept for periodic problems. By simulation we test multiple new control policies that use these relative values of FC. Key idea of our relative value based rules is to interrupt the fixed cycle, by dynamically shortening or lengthening green periods and when all signals show red to prescribe which queue to serve next. The results indicate a great reduction of the overall long-run average waiting time compared to FC and exhaustive control. For simple intersection the new polices appear to perform close to the optimal (MDP) policy (where tractable).
If time permits we also address how these rules can be extended to deal with arrival information. In the cases with arrival information one uses information on the upstream positions of cars that are approaching the intersection, next to info on (l ; q1, q2, …, qF). It appears that the new policies may outperform current fixed cycle control policies in complex networks of intersection with green waves between intersections.


Bernd Heidergott

Recent Trends in Optimization and Control of Markovian Systems

Markovian systems are stochastic models which play an important role in many applications in areas as diverse as biology, finance, and industrial production. In this lecture we will provide an overview on recent trends in optimization and control of Markovian systems. We will discuss simulation based methods, such as gradient estimation and on-line control, and numerical approaches, such as Markov decision processes. Specifically, we will put an emphasis on the interplay between simulation based and numerical approaches.


David Hodge

Asymptotic optimality of multi-action restless bandits

The class of restless multi-armed bandits as proposed by Whittle in the 1980s have long since been known to be very hard to solve optimally. We present an optimality result which borrows greatly from the primary work on optimality of policies in the restless bandit framework, namely the 1990 paper of Weber & Weiss. We extend their framework to bandits with multiple levels of activation and therefore a more general resource constraint too. The work is motivated by a number of recent works of Glazebrook et al. which demonstrate the performance of index heuristics for resource allocation in stochastic systems. Before now these heuristics have only been shown to be optimal for the Whittle relaxation of the standard problem, i.e. to problems where resource is purchased by bandits rather than a fixed resource level is optimally shared. We find that under key assumptions about the solution to a deterministic differential equation the index heuristics above are asymptotically optimal.


Gerard Hoekstra

Dynamic Traffic Splitting to Parallel Wireless Networks with Partial Information: A Bayesian Approach

Contemporary wireless communication networks are based on a wide range of different technologies providing overlapping coverage. This offers users the opportunity to use several networks at the same time. Motivated by this, we consider a networking environment in which users are able to select between the available wireless networks to minimize the mean processing times for file downloads in the presence of background traffic. The information available to the user is only the total number of jobs in each network, rather than the per-network numbers of foreground and background jobs. This leads to a complex partial information decision problem which is the focus of this talk. A Bayesian learning algorithm is presented that optimally splits a traffic stream that minimizes the expected sojourn time.


Oualid Jouni

On Queueing Systems with Finite Arrivals

This work is motivated in practice by a situation where a finite number of arrivals occur in a service system. Although the importance of such type of systems, a few results are available. We consider single and multi-server models with exponential service times. The source of customers is assumed to be finite to the contrary of standard queueing systems. We obtain the system state probabilities and use them to evaluate several measures of system performance, including the distribution of the waiting time of an arbitrary customer. We then conduct numerical experiments and study the characteristic of these special types of queuing systems. In particular, we analyze the effect of heterogeneity in inter-arrival times on the performance measures. We show that modeling without accounting such features may lead to inappropriate results.


Ger Koole

Monotonicity Results in MDP

We focus on monotonicity results for dynamic systems that take values in the natural numbers or in more-dimensional lattices. The results are mostly formulated in terms of controlled queueing systems, but there are also applications to maintenance systems, revenue management, and so forth. We concentrate on results that are obtained by inductively proving properties of the dynamic programming value function. We give a framework for using this method that unifies results obtained for different models. We also give a comprehensive overview of the results that can be obtained through it, in which we discuss not only (partial) characterizations of optimal policies but also applications of monotonicity to optimization problems and comparisons of systems.


Dinard van der Laan

Optimal mixing of suboptimal decision rules for MDP control

Consider a Markov Decision Process (MDP) with the restriction that at decision epochs only a finite number of given Markov decision rules are admissible. Many open-loop control problems can be modeled as an MDP with such a restriction on the admissible decision rules. Moreover, such model can be considered for closed-loop problems if for example it is difficult to obtain or implement the optimal control policy. Then the set of admissible Markov decision rules D could be some suboptimal, but easy-implementable decision rules.
Within the class of available policies, optimal policies are generally non-stationary and it is difficult to prove that some policy is optimal. An interesting question is whether admissible optimal policies are non-stationary and periodic, for example having a period cycle like d1, d1, d2, d1, d2 in case of two admissible decision rules, D = {d1, d2}. To investigate such questions it is useful to consider an associated MDP on the space of all probability distributions on the state space of the original restricted MDP. Some structural results can be obtained if for the associated MDP a stationary optimal policy exists and performances do not depend on the initial state space distribution.  A sufficient condition is obtained.


Semyon Meerkov

Production Systems Engineering: Optimality through Improvability

Production Systems Engineering (PSE) is an emerging branch of Engineering intended to uncover fundamental principles that govern production systems and utilize them for the purposes of analysis, continuous improvement, and design. In PSE, the machines are assumed to be unreliable and the buffers are finite. Under these assumptions, production lines are nonlinear stochastic systems. Analyses of their statics, dynamics, and optimal operation are the goals of PSE. In this talk, the main problems of PSE and their solutions will be described along with several applications. In particular, the notion of improvability, which addresses the issue of system optimization through a series of suboptimal steps, will be discussed. A justification of this approach is in the fact that, under insufficient information on the factory floor, the optimality may be unachievable, while the improvability is possible. In addition, the so-called PSE Toolbox, which implements the methods and algorithms developed, will be discussed. A demo of the toolbox can be found at http://www.ProductionSystemsEngineering.com .


Ulrich Rieder

Markov Decision Processes with Applications

Markov Decision Processes are considered w.r.t. the total reward criterion. We present general verification and structure theorems for finite-stage and infinite-stage MDPs. Special attention is given to the computation of optimal policies. In the second part, stochastic
optimization problems with partially observable Markov processes and with piecewise deterministic Markov processes are investigated. We show how these problems can be solved by the theory of MDPs. As applications we treat a queueing problem, bandit problems and consumption-investment problems from finance.
The talk is based on the recent book of Baeuerle and Rieder (2010).


Flora Spieksma

Local uniformisation in the successive approximations method for processor sharing retrial queues

Successive Approximations is an important tool for proving structural results for average optimal policies in a Markov Decision Process. Its application requires a uniformisation step, that is not applicable when the jump rates are unbounded in action and state variables. This occurs in e.g. call centre models with abandonments and retrial queues.
However, the transformation operator associated with an induction step of the successive approximations method can often be easily shown to preserve structural properties of interest, provided it be locally uniformised. So far, it has not been clear in general whether this fact is sufficient to conclude that the value function of the MDP has the same structural properties.
Using a smoothed rate truncation method, we can show that this is indeed the case under weak conditions. We will illustrate our method with the example of a processor sharing retrial queue.

Taoying Farenhorst-Yuan

A Perturbation Analysis Approach to Phantom Estimators

There are two distinct philosophies to the implementation and coding of gradient estimators for stochastic cost functions: single-run computation and parallel systems computation. There is a trade-off between these two families of estimators. The single-run estimators are computationally more efficient, but have higher variance than the parallel estimators. In this talk, we discuss the possibility of combining the best of both worlds, namely, low variance and numerical efficiency.

Sofia Villar

Real-state restless bandit problems: marginal productivity indexation methodology and its application to exible sensor tracking systems

Many control problems of modern technological systems are readily formulated in the framework of Markov decision processes (MDPs), by considering the optimal sequential allocation of common resources to a collection of independent stochastic projects, each modeled as binary-action MDPs (i.e. one-armed bandit models), whose state variable naturally moves in a continuous state space according to transitions which are action speci c. Under such conditions, the optimal policy solves a multi-armed restless bandit problem (MARBP). Hence, its practical implementation is hindered both by the intractability of the conventional dynamic programming (DP) technique and by particular difficulties of continuous state spaces. A promising approach to overcome this limitation is based on the design of a priority index policy resulting from a Lagrangian relaxation and decomposition approach, introduced by Whittle (1988) and further developed by Nino-Mora (2001-) into a unifying framework.
In this talk, some concepts of real-state restless bandit indexation will be revised. Special emphasis will be placed on the economic interpretations given by the structural property of the one-armed restless bandit problem optimal policy, termed indexability, as well as its key role in providing a tractable and nearly optimal procedure to operate such systems. These ideas shall be illustrated by means of a concrete example of a multitarget tracking model with Kalman Filter dynamics.
This talk is based on the work of Ni~no-Mora on restless bandit indexation (2001-), in particular on the recently announced extensions to real-state restless bandits (2008) and on a joint work with Nino-Mora on Multitarget tracking (2009)

Chelsea White III

Optimal Control in Stochastic Systems: Challenges for the Future

We examine models of sequential decision making under uncertainty and risk; overview what we know about optimal value function and optimal policy structure and their implementation and computational implications in the context of applications; and examine several frontiers for future research in this general research area.


Sandra van Wijk

Optimal lateral transshipment policies in spare parts inventory models

We consider an inventory model for spare parts, where two stockpoints provide repairable parts for a critical component of advanced technical systems. As downtime costs for these systems are huge, ready-for-use spare parts are kept on stock, to be able to quickly respond to a breakdown of a system. We allow for lateral transshipments (stock transfers) of parts between the stockpoints upon a demand arrival. We are interested in the optimal lateral transshipment policy. Using dynamic programming, we completely characterize and prove the structure of this optimal policy. We also derive conditions under which simple policies are optimal. In addition, we discuss similar inventory models for which we are able to derive the same type of results.




Last updated 17-dec-2010,

    P.O. Box 513, 5600 MB  Eindhoven, The Netherlands
tel. +31 40 2478100  fax +31 40 2478190  
  e-mail: office@eurandom.tue.nl