- This event has passed.
YEQT X on “Queueing Theory in Operations Research”
Nov 7, 2016 - Nov 9, 2016
Part of Stochastic Activity Month: Data Driven Operations Management
The theme of this year’s YEQT workshop is “Queueing Theory in Operations Research”, where we aim to discuss research that uses understanding gained from queueing theory to help make better decisions in a variety of fields of operations. In this workshop, we intend to build a queueing-theory oriented program around the following OR‐related applications:
- Communication networks
- Energy networks
- Traffic networks
- Service systems
- Healthcare systems
|Leiden University and University of Amsterdam
|Technion – Israel Institute of Technology
|University of Manchester
|University of Southern California
|Télécom Paris Tech
|Carnegie Mellon University
|University of Southern California
|University of Queensland / University of Amsterdam
|Peter van de Ven
How inpatient boarding affects emergency department performance: A queueing analysis
Using a Markov-modulated fluid queue approach, this study seeks insights into the effect of boarding patients on emergency department performance, which is evaluated by expected waiting times, service levels, and maximum throughput. The elegant matrix-geometric property of the stationary distribution allows a more efficient calculation of service levels than the standard Quasi-birth-death method. The focal network features inpatient boarding; discontinuous treatment, such that patients may visit a physician more than once; and a semi-open structure, which ensures the capability to model limited bed capacity in addition to a limited physician capacity. Boarding reduction policies that focus on decreasing the boarding time perform better than policies that focus on decreasing the probability of boarding.
Networks of multi-server queues with parallel processing
We consider a network of multi-class multi-server queues. Each job can be processed in parallel by any subset of servers within a pre-defined set that depends on its class. Each server is allocated in FCFS order at each queue. Jobs arrive according to Poisson processes, have independent exponential service requirements and are routed independently at random. We prove that, when stable, the network has a product-form stationary distribution. From a practical perspective, we propose an algorithm on this basis to allocate the resources of a computer cluster according to balanced fairness. We finally examine further developments of this model to allow for dynamic adaptation to the system occupancy.
(joint work with Thomas Bonald)
Stein’s Method for Steady-State Approximations: Error Bounds and Engineering Solutions
Through queueing systems modeling customer call centers and hospital patient flows, I will give an introduction on how to use Stein’s method both as an engineering tool for generating good steady-state approximations and as a mathematical too for establishing error bounds for these approximations. These approximations are often universally accurate in multiple parameter regions, from underloaded to overloaded (when abandonment is possible).
I will focus on diffusion models for performance analysis, briefly discussing recent works by others on ergodic optimal controls and mean-field approximations.
(joint works with Anton Braverman and Jiekun Feng from Cornell and Pengyi Shi from Purdue)
OR in healthcare: data-driven appointment scheduling
We consider the problem of determining appointment schedules in settings where patients can request different types of consultations in advance. The main challenge is to schedule the appointments to time slots so that cost-effective service is ensured while patients experience short waiting times.
In our study, we investigate a local search-based procedure to optimize schedules under fairly general conditions. Our method is based on an exact evaluation algorithm of De Vuyst et al. (2014), which obtains accurate performance predictions at a low computational cost by using the transient solution of a modified Lindley recursion and a discrete-time setting. In contrast to many other studies in the literature, this model requires no special assumptions regarding the consultation time distributions, no-show probabilities or other model parameters. The fact that each patient may have a distinct consultation time distributions and no-show probability allows us to take prior knowledge into account when estimating the consultation time. Moreover, the method is fast in comparison to simulation and the explicit expressions of the Lindley recursion allow us to exploit some structural properties of the problem.
The performance of the heuristic is tested against a wide and diverse test-bed. We compare the results with both traditional appointment rules and the optimal solution. The analysis reveals that our method is able to find very good results within a few seconds (average optimality gap is 0.003%). In another experiment, we also investigated the impact of incorporating doctor’s lateness into the model. Most studies assume punctual doctors, but Klassen and Yoogalingam (2013) reported that this is not always the case. Our results indicate that modeling lateness should only be considered when there are few other sources of variability. That is, when the no-show rates and service time variability are relatively low.
Approximations of Queueing Performance for Rapid Systems Design
Recent advances in queueing analysis have yielded tractable approximations of performance metrics that can be used to quickly explore initial designs, to reduce computational burdens associated with simulation, or even to eliminate the need for simulation altogether. This tutorial takes you on an accessible tour of these recent methods, shows you how to apply them using numerical examples drawn from real applications, and discusses implementation challenges and potential opportunities.
(joint work with Steve Hackman (Georgia Tech))
A Queueing Model for Internal Wards
Hospital queues have unique features, which are not captured by standard queueing assumptions, necessitating the development of specialized models. In this work, we propose a queueing model that takes into account the most salient features of queues associated with large Internal Wards (IWs). We characterize the maximum long-run workload that the IW can handle, and introduce a deterministic (fluid) approximation for the non-stationary dynamics. The fluid model is shown to have a unique periodic equilibrium, so that long-run performance analysis can be carried out by simply considering that equilibrium. Consequently, evaluating the effects of policy changes on system’s performance and optimizing long-run operational costs are facilitated considerably.
A Better Model for Job Redundancy: Decoupling Server Slowdown and Job Size
Recent computer systems research has proposed using redundant requests — creating multiple copies of the same job and waiting for the first copy to complete service — to reduce latency. In the past few years, queueing theorists have begun to study redundancy, first via approximations, and, more recently, via exact analysis. Unfortunately, for analytical tractability, most existing theoretical analysis has assumed a model in which the replicas of a job each experience independent runtimes (service times) at different servers. This model is unrealistic and has led to theoretical results which can be at odds with computer systems implementation results. We introduce a much more realistic model of redundancy. Our model allows us to decouple the inherent job size (X) from the server-side slowdown (S), where we track both S and X for each job. Analysis within the S&X model is, of course, much more difficult. Nevertheless, we design a policy, Redundant-to-Idle-Queue (RIQ) which is both analytically tractable within the S&X model and has provably excellent performance.
(joint work with Mor Harchol-Balter and Alan Scheller-Wolf)
Refining Workload Measure in Hospital Units: From Census to Acuity-Adjusted Census in Intensive Care Units
We aim to better understand the impact of Intensive Care Unit (ICU) workload on patient outcomes, so that practitioners and researchers can use such understanding to provide high quality care despite increased hospital crowding. We use data collected from the medical ICU and the surgical ICU of a major teaching hospital. We measure ICU workload in a novel way that takes into account not only the census but also patient acuity. Having categorized the ICU workload at time of patient departure as low census, high-census/low-acuity, and high-census/high-acuity, we find that patients discharged on a day of high-census/high-acuity ICU workload had significantly worse health status than patients discharged from a low-census ICU workload. Moreover, we find patients with poorer health status at the time of ICU discharge experienced worse longer-term outcomes, including longer post-ICU length-of-stay (LOS), higher mortality, and higher total hospital costs.
In sum, we find that acuity was critical in accurately characterizing ICU workload, resulting ICU discharge decisions, and ultimately patient outcomes and hospital costs. Our findings suggest that (1) ICUs need to track the changes in patient acuity in addition to census, in order to use such information to prevent reaching high census with high acuity patients and (2) future studies (both empirical and analytical) of ICU workload should take patient acuity into account in ICU workload measures. Lastly, using a simulation study, we show how high-census/high-acuity workload can be prevented by reducing seasonality in the volume and acuity of patient arrivals.
Strategic Behavior in Transportation Systems
This research focuses on the behavior of strategic customers in transportation systems. We consider two different models. In the first model, arriving customers decide whether to join the transportation station or balk, based on a natural reward-cost structure that models their desire for service and their unwillingness to wait. Solving the game among customers, we determine their strategic behavior and explore the effect of key service parameters, such as the frequency and the punctuality of service, on customer behavior. In the second model, we assume that the administrator of the transportation system makes decisions as well. Specifically, arriving customers decide whether to join the station or balk and the administrator sets the fee. In this case, a two-stage game among the customers and the administrator takes place. We explore how system parameters affect the customer behavior and the fee imposed by the administrator. Moreover, we consider three cases distinguished by the level of delay information provided to customers at their arrival instants. We compare these three cases and show that the customers almost always prefer to know their exact waiting times whereas the administrator prefers to provide either no information or the exact waiting time depending on the system parameters.
Heavy traffic analysis of redundancy routing and join the shortest queue policies
We will present heavy traffic limit results on the JSQ and redundancy routing policies for the parallel server model, with heterogeneous and fixed number of servers. These include state space collapse, identification of the limit and delay calculations. A key role is played by considering the following link between the two policies: A policy that sends multiple replica to a number of servers can otherwise be described as one which sends only to the server having the least workload. (joint work with Prof. Rami Atar and Prof. Isaac Keslassy)
Reliability of DC Power Grids Under Uncertainty: \\a Large Deviations Approach
The advent of renewable energy has huge implications for the design and control of power grids. Due to increasing supply-side uncertainty, traditional reliability constraints such as strict bounds on current, voltage and temperature in a transmission line have to be replaced by chance constraints which are computationally hard. In this talk we use large deviations techniques to study the probability of current and temperature overloads in a DC network with stochastic power injections, and develop corresponding safe capacity regions. In particular, we characterize the set of admissible power injections such that the probability of overloading of any line over a given time interval stays below a fixed target. We show how enforcing (stochastic) constraints on temperature, rather than on current, results in a less conservative approach and can thus lead to capacity gains in power grids.
Detecting Markov Chain Instability: A Monte Carlo Approach
We devise a Monte Carlo based method for detecting whether a non-negative Markov chain is stable for a given set of potential parameterizations. More precisely, for a given set in parameter space, we develop an algorithm that is capable of deciding whether the set has a subset of positive Lebesgue measure for which the Markov chain is unstable. The approach is based on a variant of simulated annealing, and consequently only mild assumptions are needed to obtain performance guarantees.
I will illustrate the usage of our algorithm on models of communication and traffic networks.
(joint work with Michel Mandjes (University of Amsterdam) and Neil Walton (The University of Manchester))
Optimal model simplification to improve performance prediction in service processes
Operational process models such as generalized stochastic Petri nets and queueing networks are useful when answering performance queries on business processes (e.g. `how long will it take for a case to finish?’). Recent process mining methods discover and enrich operational models based on a log of recorded executions of processes, which enables evidence-based process analysis. To avoid a bias due to infrequent execution paths, mining algorithms strive for a balance between over-fitting and under-fitting regarding the originating log. However, state-of-the-art discovery algorithms address this balance solely for the control-flow dimension, neglecting possible over-fitting in terms of performance analysis.
In this talk, we introduce a technique for automated model reduction of operational process models, using structural simplification rules. Each rule induces an error in performance estimates with respect to the original model, and adds generality with respect to the data. We further demonstrate how to find an optimal sequence of simplification rules that obtain a minimal model under a given error budget for the performance estimates. We evaluate the approach with real-world data that stems from an outpatient cancer hospital, showing that model simplification indeed yields significant improvement in time prediction accuracy.
Peter van de Ven
Managing Appointment Scheduling under Patient Choices
Motivated by the use of appointment templates in healthcare scheduling practice, we study how to offer appointment slots to patients in order to maximize the utilization of provider time. We develop two models, non-sequential scheduling and sequential scheduling, to capture different types of interactions between patients and the scheduling system. In these models, the scheduler offers either a single set of appointment slots, or multiple sets in sequence, for arriving patients to choose, without knowing their preference information. For the non-sequential scheduling model, we identify certain problem instances where the greedy policy is suboptimal, but show through analytical and numerical results that for most moderate and large instances, greedy performs remarkably well. For the sequential model, we explicitly derive the optimal policy for a large class of instances; for those that we cannot solve for the optimal policy in closed-form, we develop an effective heuristic. Our case study, based on real patient preference data, demonstrates a potentially up to 17% improvement in provider capacity utilization by adopting our proposed scheduling policy.
This improvement may translate into $45k-$120k increase in annual revenues for a single primary care provider.
Longest-Queue Shortest-Queue: the long and short of it!
When arriving at a set of queues, it is natural to want to join the shortest and, when serving queues, it is natural to want to serve the longest. In this tutorial, we separately study Joint the Shortest Queue and Longest Queue First service. We discuss the consequences and generalizations of these decision rules. We survey classical results and techniques, we discuss recent applications and motivating technologies. We apply contemporary methods to understand optima and passimal performance in these systems.
Matching for Real-time Ridesharing
In a ridesharing system such as Uber or Lyft, arriving customers must be matched with available drivers. These decisions affect the overall number of customers matched, because they impact whether or not future available drivers will be close to the locations of arriving customers. A common policy used in practice is the closest driver (CD) policy that offers an arriving customer the closest driver. This is an attractive policy because no parameter information is required. However, we expect that a parameter-based policy can achieve better performance.
We propose to base the matching decisions on the solution to a linear program (LP) that accounts for (i) the differing arrival rates of drivers and customers in different areas of the city, (ii) how long customers are willing to wait for driver pick-up, and (iii) the time-varying nature of all the aforementioned parameters. Our main theorems establish the asymptotic optimality of an LP-based policy in a large market regime in which drivers are fully utilized. We show through extensive simulation experiments that an LP-based policy significantly outperforms the CD policy when there are large imbalances in customer and driver arrival rates across different areas in the city.
(joint with Erhun Ozkan)
(SSRN link: http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2844451)
Observational Learning and Abandonment Behavior in Queues
In several service operations settings users only have partial information on system parameters pertinent to performance. In some cases it may only be possible for them to infer this through their own observations or experiences in the system. In this talk we present a simple stylized model of a queueing system that strives to capture some salient features of “observational learning,” and elucidate its effects on user behavior and the system’s equilibrium operating point.
(joint work with John Yao and Costis Maglaras.)
Bed Blocking in Hospitals due to Scarce Capacity in Geriatric Institutions — Cost Minimization via Fluid Models
This research focuses on elderly patients who have been hospitalized, are ready to be discharged but must remain in the hospital until a bed in a geriatric institution becomes available; these patients “block” a hospital bed. Bed-blocking has become a challenge to healthcare operators due to its economic implications and quality-of-life effect on patients. Hospital-delayed patients, who cannot access their most appropriate treatment (e.g., rehabilitation), prevent new admissions. Moreover, bed-blocking is costly since a hospital bed is more expensive to operate than a geriatric bed. We are thus motivated to model and analyze the flow of patients between hospitals and geriatric institutions, in order to improve their joint operation. To this end, we develop a mathematical fluid model of this patient flow, which accounts for blocking, mortality and readmission, all significant features of the discussed environment. Analyzing the fluid model and its offered-load counterpart yields a closed-form expression for bed allocation decisions, which minimizes underage and overage costs. Finally, we validate the model with a two-year data set from a hospital chain, which includes four general hospitals and three geriatric hospitals. Solving for the optimal number of geriatric beds in our system demonstrates that significant cost reductions are achievable, when compared to current operations.