Auxiliary Variables for Bayesian Inference in Multi-Class Queueing Networks

Queue networks describe complex stochastic systems of both theoretical and practical interest. They provide the means to assess alterations, diagnose poor performance and evaluate robustness across sets of interconnected resources. In the present paper, we focus on the underlying continuous-time Markov chains induced by these networks, and we present a flexible method for drawing parameter inference in multi-class Markovian cases with switching and different service disciplines. The approach is directed towards the inferential problem with missing data and introduces a slice sampling technique with mappings to the measurable space of task transitions between service stations. The method deals with time and tractability issues, can handle prior system knowledge and overcomes common restrictions on service rates across existing inferential frameworks. Finally, the proposed algorithm is validated on synthetic data and applied to a real data set, obtained from a service delivery tasking tool implemented in two university hospitals.


Introduction
Recent literature addressing queueing networks (QNs) has aimed to study inferential methods for the estimation of service requirements. These networks offer the means to describe complex stochastic systems through sets of interacting resources, and have found applications in the design of engineering and computing systems (Kleinrock, 1976), or within call centres (Koole and Mandelbaum, 2002), factories (Buzacott and Shanthikumar, 1993) and hospitals (Osorio and Bierlaire, 2009). Enabling the understanding of service performance is important, since it provides quantitative input for the optimal design of interconnected service stations. Yet, drawing inference on parameters is a challenging task, since in most applications successive network states are never fully observed. Hence, proposed approaches often rely on reduced summaries such as queue lengths, visit counts or response times, and perform inference in different ways, including regression-based estimation procedures, non-linear numerical optimization or maximum likelihood methods. For a recent review on the matter we refer the reader to Spinner et al. (2015) and references therein.

* Work supported by RCUK through the Horizon Digital Economy Research grants (EP/G065802/1, EP/M000877/1) and The Health Foundation through the Insight 2014 project "Informatics to identify and inform best practice in out of hours secondary care" (7382). † Corresponding author, e-mail: iker.perez@nottingham.ac.uk
In this paper, we focus on the underlying continuous-time Markov chains (CTMCs) induced by general-form open QNs, and we develop a flexible framework for drawing Bayesian inference on the parameters that govern these models, in the presence of general patterns of missing data currently only discussed in Sutton and Jordan (2011). Statistical computation is very difficult within this family of models, as it involves working with often countably infinite state spaces where observations provide little indirect information. Here, we target multi-class Markovian cases with possible class switching and different service disciplines, where few or no individual job departure times are observed at specific servers. Hence, knowledge is mostly restricted to task arrival and departure times to, and from, the network. A task is a collection of jobs undertaken at different service stations, and high loads make it virtually impossible to determine the state of the network at any point in time, including the ordering of jobs across multiple queues. We propose an inferential framework that allows the imposition of prior system knowledge and overcomes common restrictions on service rates across popular service disciplines in traditional modelling approaches. A key contribution is that we introduce a slice-sampling approach relying on mappings to the measurable space of task transitions across the service stations; this enables studying systems where the transition paths of tasks among the queues are unknown, and leads to an efficient sampler. The approach draws motivation from techniques aimed at exploring countably infinite state spaces within Dirichlet mixture models or infinite-state hidden Markov models (Walker, 2007; Van Gael et al., 2008; Kalli et al., 2011), and sits well within a uniformization-oriented MCMC scheme for jump processes as presented in Rao and Teh (2013).
Currently, common assumptions in inferential frameworks include the existence of complete data, product-form equilibrium distributions or unique classes with shared service rates. However, we often encounter systems where the completion of jobs at individual stations is only occasionally registered. In addition, inference on the basis of balance may in some cases be inaccurate; for instance, the existence of equilibrium in service delivery systems with human workers is a strong assumption, since workload is usually externally controlled and arrivals hardly constitute a Poisson process. There also exist concerns regarding the use of steady-state metrics whenever prior knowledge and constraints are imposed on parameters (Armero and Bayarri, 1994); and the use of product-form solutions within popular BCMP networks (Baskett et al., 1975) restricts first come first served (FCFS) queues to share service distributions over different task classes.
Aiming for flexible inferential methods, Bayesian procedures relying on Markov Chain Monte Carlo techniques were first explored in Sutton and Jordan (2011). There, the authors discussed a latent variable model targeting networks where only subsets of transition times are observed; the method was applicable to open QNs and defined through deterministic transformations between the data and independent service times across different disciplines. Later, Wang et al. (2016) proposed the use of a Gibbs sampler relying on product-form distributions and queue length observations, advancing the study of closed BCMP networks and offering an approximation method for the normalizing constant within the network's equilibrium distribution. To the best of our knowledge, no further advances exist in the study of exact Monte Carlo inferential frameworks overcoming known restrictions in the study of QNs. Yet, significant progress has been made with sampling techniques and approximate inference methods for continuous-time dynamic systems, often modelled as CTMCs or continuous-time Bayesian networks (CTBNs) (Nodelman et al., 2002; Fan and Shelton, 2008). However, simulating system dynamics conditioned on scarce observations remains a complex task; a review on the efficiency of various methods for this purpose (including direct sampling, rejection sampling and uniformization methods) can be found in Hobolth and Stone (2009).
Recently, Rao and Teh (2013) have presented a noteworthy contribution based on the principles of uniformization (Lippman, 1975; Jensen, 1953). Their work explores a class of auxiliary variable MCMC methods allowing for the efficient and exact computation of state evolutions in systems with discrete support (such as Markov jump processes). The framework relies on producing highly dependent time discretizations within subsequent blocked steps in a Gibbs sampler, and is in principle applicable to the study of system evolutions within QNs. However, such systems exhibit strong and characteristic temporal dependencies (cf. Sutton and Jordan (2011)), transitions over an infinite set of states, varying specifications of service disciplines and Markovian regimes often subject to switching. Hence, we face major impediments which require elaborate implementations of slice sampling techniques (Neal, 2003). In this work, we describe a method that controls the computational complexity within simulation procedures; for that matter, we employ families of auxiliary variables across steps in a Gibbs sampler targeting network paths. The result is a method that imposes strong restrictions within the vast space of permissible network transitions at each iteration; however, each subsequent step in the sampler allows for significant timing and routing deviations in limited numbers of tasks routed through the network, ensuring convergence to (i) the distribution of network path evolutions across its full space, given the evidence, and (ii) the posterior distribution of the arrival and service rates. Finally, we present results on both synthetic and real data, obtained from a service delivery tasking tool implemented in two jointly coordinated university hospitals in the United Kingdom.
The rest of the paper is organised as follows. Section 2 describes CTMCs induced by general-form QNs, introduces notions of compatibility with observations, and states the problem addressed in this work. In Section 3, the principle of uniformization and its application to networks is briefly revised, mappings to task transitions and auxiliary variables are introduced, and the proposed sampler is described. Section 4 introduces results for three example networks of varying complexity with both synthetic and real data. Finally, Section 5 offers a brief closing discussion.

Queue networks and continuous-time Markov processes
Consider an open Markovian network with M single service stations, a population set C of different task classes and a non-deterministic network topology defined by a family of routing probability matrices P = {P^c : c ∈ C}, such that:

• P^c_{i,j} denotes the probability of a class c ∈ C task immediately moving to station j after completing a job service in station i, for all 1 ≤ i, j ≤ M.

• P^c_{i,0} denotes the probability of a class c ∈ C task immediately exiting the network after completing a job service in station i, for all 1 ≤ i ≤ M.

• ∑_{j=0}^{M} P^c_{i,j} = 1, for all 1 ≤ i ≤ M, c ∈ C.

Furthermore, let λ_c > 0 denote external arrival rates for each task class c ∈ C, and p^c_{0,i} the corresponding probabilities for its first job to enter station i, 1 ≤ i ≤ M. Servers in the network are assumed independent and may differ in their queueing discipline. Service times are exponentially distributed, with constant rates that vary over servers and classes; we denote them µ^c_i for all 1 ≤ i ≤ M, c ∈ C. Switching is allowed and thus classes are not permanent categorizations; state-dependent service rates are not considered but follow naturally.
In Figure 1 we observe two example networks further examined within Section 4 of this paper. There, shaded circles indicate servers with exponential service rates µ^c_i, each accompanied by a corresponding job queueing area pictured as an empty rectangle. Together, each such server and queue pair represents a service station i, 1 ≤ i ≤ M. The shaded boxes are probabilistic routing junctions, where task destinations after a job service (or arrival) are determined according to P^c (or p^c). Finally, λ_c denote rates for exponential task arrivals from outside the network. Under exponential and independence assumptions, there exists an underlying continuous-time Markov process X = (X_t)_{t≥0} that describes the system behaviour. Formally, denoting by S the countably infinite set of possible states in the network, X is a right-continuous stochastic process such that the time-indexed variables X_t are defined within a measurable space (S, Σ_S), where Σ_S stands for the power set of S. On a basic level, X holds the ordering of jobs in each queue and server, along with their classes and task identifiers; and S is the multidimensional product of all possible congruent states at every station. The infinitesimal generator matrix Q of X is infinite, with entries Q_{x,x′} for all x, x′ ∈ S. Elements in the generator describe rates for transitions within states in the chain; Q_{x,x′} ≥ 0 for all x ≠ x′, and Q_x := Q_{x,x} = −∑_{x′∈S : x′≠x} Q_{x,x′}. Hence, rows in Q sum to 0, and the full rate for a state departure is given by |Q_x|, for all x ∈ S. Note that transition rates are the product of the routing probabilities and exponential rates above; for instance:

• λ_c p^c_{0,i} is the transition rate among states in S accounting for a class-c arrival to service station i,

• µ^c_i P^c_{i,j} is the transition rate among states in S accounting for a job of class c serviced at station i immediately transitioning to station j.
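For intuition, the generative dynamics just described can be sketched with a standard Gillespie-style simulation. The sketch below (Python, illustrative only, not the paper's implementation) covers a single-class tandem of two FCFS M/M/1 stations with P_{1,2} = P_{2,0} = 1; it tracks only queue lengths, whereas the full chain X would additionally record task identifiers and orderings.

```python
import random

def simulate_tandem(lam, mu1, mu2, horizon, seed=0):
    """Gillespie-style simulation of a single-class tandem network
    with two FCFS M/M/1 stations. States are queue lengths (n1, n2)."""
    rng = random.Random(seed)
    t, n1, n2 = 0.0, 0, 0
    events = []  # (time, transition name, new state) records
    while True:
        # Active transition rates from the current state.
        rates = [("arrival", lam)]
        if n1 > 0:
            rates.append(("service1", mu1))  # rate mu1 * P_{1,2}
        if n2 > 0:
            rates.append(("service2", mu2))  # rate mu2 * P_{2,0}
        total = sum(r for _, r in rates)
        t += rng.expovariate(total)  # exponential holding time, rate |Q_x|
        if t > horizon:
            return events
        # Pick a transition in proportion to its rate.
        u, acc = rng.random() * total, 0.0
        for name, r in rates:
            acc += r
            if u <= acc:
                break
        if name == "arrival":
            n1 += 1
        elif name == "service1":
            n1, n2 = n1 - 1, n2 + 1
        else:
            n2 -= 1
        events.append((t, name, (n1, n2)))
```

Discarding the `service1` records from the output mimics the partially observed regime studied later, where only arrivals and departures are retained.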

Observations
Let Γ = {0, . . . , M}^2 × N define a task transition space. A triplet γ = (i, j, k) ∈ Γ denotes a transition for a uniquely identifiable task k, with i and j specifying the departure and entry stations respectively. Note that it is possible to augment Γ in order to include task classes, yet given unique identifiers this information is redundant. In this work, a transition triplet is never fully observed; instead, we define a set O of partial observations within the measurable space (Γ, Σ_Γ), where Σ_Γ stands for the power set of Γ.
Definition 2.1. An observation in O is a subset of Γ that contains all permitted task transitions in the network at some specified time t > 0, given external information on an arrival, departure or job service.

Figure 2: Sample observations generated by a single task transitioning a bottleneck network with 3 servers. Observations are represented by rectangles. Dots inform us of transition times. Information below the dots specifies the actual task transitions at each step.
In Figure 2 we observe a bottleneck network produce four partial observations as it evolves over time. The network corresponds to that in Figure 1 (top), and the observations include a single task arrival, two job services for the task, and a departure immediately after the final service. There, each task transition (i, j, k) ∈ Γ is marked as k_{i→j} at its corresponding time point; note that the indexes i, j take the value 0 in order to specify an external arrival or a departure. The observations take the form of elements of O.
In this toy example, it is possible to deduce the original path X in the network when considering the available observations along with the topology in Figure 1; including task orderings across all queues and servers at every point in time. However, in real world applications job service observations are often missing or do not exist at all. In this work, only arrivals and departures are assumed to always be available.

Compatibility
Let T : S^2 → Γ ∪ ∅ define a measurable function, equipped with the corresponding products of discrete algebras, which maps a pair of states x, x′ ∈ S to its task transition triplet in Γ. For instance, T(x, x′) = (2, 3, 12) should x′ be reachable from x by servicing a job for task 12 in server 2 and immediately routing it to queue 3. Note that for this to be possible, a job for task 12 must be in server 2 within x, and the remaining tasks in the system must be distributed and ordered across stations so that there exists full agreement with x′. If a state x′ is not directly reachable from x, then T(x, x′) = ∅. We note that the preimage of a triplet under T is given by a countably infinite set of pairs of network states in S, unless bounds on the task population are imposed.
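A toy version of the map T can be written down explicitly once a concrete state representation is fixed. The sketch below (Python, illustrative) assumes states are tuples of FCFS queues holding task identifiers, with the head of each tuple in service, and for brevity omits external arrivals and departures (triplets with i or j equal to 0); all names are assumptions of this sketch.

```python
def transition_triplet(x, x_prime):
    """Toy sketch of T: S^2 -> Gamma ∪ {None}. A state is a tuple of
    FCFS queues (tuples of task identifiers), one per station. Returns
    (i, j, k) if x' follows from x by servicing the head-of-line job of
    task k at station i and appending it to station j, else None."""
    diff = [s for s in range(len(x)) if x[s] != x_prime[s]]
    if len(diff) != 2:                 # exactly two stations must change
        return None
    for i, j in (diff, diff[::-1]):    # try both directions of the move
        if (x[i] and x[i][1:] == x_prime[i]            # i loses its head job
                and x_prime[j] == x[j] + (x[i][0],)):  # j gains it at the back
            return (i, j, x[i][0])
    return None
```

Note how the triplet is recoverable from a state pair, while the converse fails: infinitely many state pairs map to the same triplet, as remarked above.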
Definition 2.2. Fix some terminal time T > 0 and let {O_{t_r} ∈ O : r = 1, . . . , R} be a sequence of observations at times 0 < t_1 < · · · < t_R ≤ T, and define the sets of compatible state pairs Y_{t_r} = {(y, y′) ∈ S^2 : T(y, y′) ∈ O_{t_r}} for all r = 1, . . . , R. Then, we say that a process X is compatible with an observation O_{t_r}, and we write X ⊥ O_{t_r}, if lim_{s↗t_r} X_s = y and X_{t_r} = y′ for some pair of network configurations (y, y′) ∈ Y_{t_r}. Furthermore, we say that a process X is fully compatible with the observations if X ⊥ O_{t_r} for all r = 1, . . . , R.

Figure 3: Example network paths, all compatible with arrival and departure observations for a single task entering and leaving a bottleneck network with three servers.
In Figure 3 we observe task transitions for sample paths X which are compatible with the arrival and departure information shown in Figure 2. Notice that the first sequence corresponds to the original path forming the observations. This time, no job services have been retained, and there exist infinitely many paths X that could have produced the same output, with varying transition times and task orderings across the different stations. In large networks with multiple tasks simultaneously transitioning through the system, it is hard to picture the infinite number of fully compatible paths X, unless large proportions of job services are retrieved.

Latent network and problem statement
Denote by x_0 ∈ S the initial state in X. In this paper, this is assumed to be an empty state, where no jobs populate the network. It is however possible to define an initial distribution π over states, s.t. π(x_0) > 0. Assume we observe K independent realizations of the system, yielding observation sequences Õ = {O_k : k = 1, . . . , K}. The likelihood function is proportional to the product of network path densities fully compatible with Õ, and is thus intractable. A Gibbs sampling approach centred around latent network evolutions is appropriate, iterating between paths and parameters. For that, note that every X is a piecewise-constant process and may be fully characterized by a set of transition times t = {t_1, . . . , t_n} along with network states x = {x_1, . . . , x_n}, so that X ≡ (t, x) with X_0 = x_0. To ease notation, denote θ ≡ (P, p, λ, µ), where p is the vector of arrival routing probabilities. Now, let δ_x be the number of transitions in x excluding task arrivals and departures. For each k = 1, . . . , K, the density of (t, x) given O_k is (up to proportionality) such that

f_X(t, x | θ, O_k, x_0) ∝ q^{δ_o} (1 − q)^{δ_x − δ_o} · ∏_{i=1}^{n} Q_{x_{i−1},x_i} e^{Q_{x_{i−1}}(t_i − t_{i−1})} · e^{Q_{x_n}(T − t_n)},   (1)

where q denotes the probability that a job service in X is observed, δ_o ≤ δ_x is the corresponding number of service observations in O_k, and t_0 = 0. This density is supported on a suitably defined space of finite network evolutions; the first term is proportional to Bernoulli trials penalizing network paths with unobserved job services, and the remaining product follows from the properties of the minimum of exponentially distributed random variables.
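Given access to the generator, the unnormalized log-density above is straightforward to evaluate. The following sketch (Python, illustrative) treats every transition as a potentially observed job service for brevity, whereas δ_x in the text excludes external arrivals and departures; the function names and the callable generator `Q` are assumptions of this sketch, not part of the paper's implementation.

```python
import math

def path_log_density(times, states, Q, q, n_obs, T):
    """Unnormalized log-density of a path (t, x) following the shape of
    Eq. (1): Bernoulli terms for observed/unobserved services times the
    usual CTMC path density. Q(x, y) returns the transition rate and
    Q(x, x) the (negative) diagonal entry; n_obs counts observed
    services. Every transition is treated as a job service here."""
    n = len(times)
    assert len(states) == n + 1          # states = [x0, x1, ..., xn]
    logp = n_obs * math.log(q) + (n - n_obs) * math.log(1 - q)
    prev_t = 0.0
    for i in range(n):
        logp += math.log(Q(states[i], states[i + 1]))          # jump rate
        logp += Q(states[i], states[i]) * (times[i] - prev_t)  # holding time
        prev_t = times[i]
    logp += Q(states[n], states[n]) * (T - prev_t)  # survival to terminal T
    return logp
```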
Hence, drawing parameter inference entails the complex task of simulating network configurations from (1), over increasingly large state spaces and with strong conditional dependencies. In the following, we revise the notion of uniformization and sampling methods for jump processes introduced in Rao and Teh (2013), and we present an auxiliary observation-variable sampler fit for inference in QN models.

Uniformization and auxiliary observations
A generative process for sampling X requires alternating between exponentially distributed times and transitions in proportion to rates. Instead, a uniformization-based (Lippman, 1975; Jensen, 1953) sampling scheme employs a dominating rate Ω and introduces the notion of virtual transitions, so that all times are sampled in a blocked step.
In Algorithm 1 we observe a uniformization procedure to produce network paths.
Algorithm 1 Uniformization procedure for process X
1: Fix a dominating rate Ω ≥ max_{x∈S} |Q_x|.
2: Sample transition times 0 ≤ t′_1 < · · · < t′_m ≤ T from a homogeneous Poisson process with rate Ω.
3: Sample states x′_1, . . . , x′_m sequentially from a discrete-time Markov chain with initial state x_0 and transition probabilities π_{x_{i−1}} given by the matrix B = I + Ω^{−1}Q.

A proof of probabilistic equivalence between the generative and uniformized sampling schemes involves comparing the marginal distribution across states at any time t ≥ 0, and can be found in Hobolth and Stone (2009). A uniformization procedure yields an augmented set of times t′ = {t′_1, . . . , t′_m} and states x′ = {x′_1, . . . , x′_m} that accounts for both real and virtual transitions in X. Whenever x′_i = x′_{i−1} we refer to transition i as virtual, and note that the number of such transitions depends on the choice of Ω. Finally, the density function in (1) may be rewritten to include virtual jumps, so that

f_X(t′, x′ | θ, O_k, x_0) ∝ q^{δ_o} (1 − q)^{δ_x − δ_o} · ∏_{i=1}^{m} ( (Ω − |Q_{x′_{i−1}}|) 1[x′_i = x′_{i−1}] + Q_{x′_{i−1},x′_i} 1[x′_i ≠ x′_{i−1}] ),

where terms not proportional to (t′, x′) are omitted. In practice, simulating X only requires considering a limited number of candidate states in each transition, in close relation to the number of service stations. In Figure 4 we observe a graphical representation of times, states and transition probabilities for a uniformization-based procedure in the single-class bottleneck network in Figure 1 (top). There, we observe only one task from entry to departure, and we notice x′ is unaltered after virtual transitions. Vertical rectangles are divided in proportion to rates for services and arrivals, and infeasible services are hashed in grey (the additional hashed area at the bottom accounts for a strictly positive dominating rate Ω). This determines the probabilities leading to new states at subsequent times, with virtual jumps associated to the sum of all hashed regions. Finally, removing virtual transitions within (t′, x′) yields the desired realization (t, x) in X.
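Algorithm 1 is easy to prototype on a finite truncation of the state space, where Ω can dominate every |Q_x| uniformly. The sketch below (Python, illustrative) represents the generator by its off-diagonal rates as a dict of dicts and returns the uniformized frame (t′, x′), virtual self-transitions included; it is a sketch under these assumptions, not the paper's implementation.

```python
import random

def uniformized_path(Q, x0, T, omega, rng):
    """Algorithm 1 on a finite state space. Q holds off-diagonal rates
    (Q[x][y] for y != x; diagonals implied) and omega must dominate
    every |Q_x|. Returns the uniformized frame (t', x')."""
    # Step 2: candidate transition times from a Poisson(omega) process.
    times, t = [], 0.0
    while True:
        t += rng.expovariate(omega)
        if t > T:
            break
        times.append(t)
    # Step 3: discrete-time chain with transition matrix B = I + Q/omega.
    states, x = [], x0
    for _ in times:
        rates = Q.get(x, {})
        u = rng.random() * omega
        if u >= omega - sum(rates.values()):      # real transition
            u -= omega - sum(rates.values())      # u now lies in [0, |Q_x|)
            for y, r in rates.items():
                if u < r:
                    x = y
                    break
                u -= r
        # else: virtual transition, the state is left unchanged
        states.append(x)
    return times, states
```

Dropping the virtual self-transitions from the returned frame recovers a realization (t, x) of X, as in the last step of the discussion above.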

An auxiliary observation-variable sampler
A uniformization oriented approach can enable the construction of a Gibbs sampler targeting the conditional distribution f_X((t, x) | θ, O_k, x_0). For such a purpose, Rao and Teh (2013) show it is possible to recycle groups of real transition times within each iteration. The method applies well to many families of Markov jump processes, but it is insufficient to tackle complex systems such as QNs, due to a quadratic cost on the number of states when producing x. This is a known problem in discrete-time systems with large state spaces (such as dynamic Bayesian networks or infinite-state hidden Markov models), and proposed solutions include approximate inference methods (Boyen and Koller, 1998; Ng et al., 2002) or the use of slice sampling techniques for exact inference (Van Gael et al., 2008).
However, QNs contain strong serial dependencies, and transitions over an infinite set of states are triggered by a very reduced number of rates; hence, this can render techniques aimed at Dirichlet mixture models (Walker, 2007; Kalli et al., 2011) or hidden Markov models unusable. A viable approach would ideally consider limited divergences in network paths X over subsequent steps in a sampler, yet allow for considerable deviations in the routing of a reduced set of tasks. Here, we describe a sampling scheme that achieves this goal by employing random auxiliary mappings to the space of task transitions Γ. Intuitively:

• In each iteration we will first produce additional auxiliary evidence, resulting from task transitions within the current trajectory of X.

• This evidence will be used next in order to significantly restrict the explorable range of network paths in the following sampler iteration.
This approach poses a computationally tractable technique suited for the analysis of system transitions in QNs, and constructs a Markov chain of posterior trajectories over the entire range of paths in full agreement with the original observed evidence, where reasonably distant samples in the chain are statistically unrelated.

Figure 4: Graphical representation of times t′, states x′ and transition probabilities for a uniformization-based simulation in a single-class bottleneck network. We observe a single task routed from entry to departure, with virtual transitions represented by empty dots. Vertical rectangles are proportionally split according to the likelihood of the various possible services and arrivals.

Preliminaries
Set Ω > max_{x∈S} |Q_x| and let t′ and x′ define some auxiliary frames of transition times and states in X, including both real and virtual values. Arrival, departure and job service observations must come at transition times in t′; hence, we may define an augmented sequence of observations O′_k = {O′_1, . . . , O′_m}, where O′_i equals the original observation at time t′_i whenever one exists, and O′_i = Γ ∪ ∅ otherwise. This accounts for missing observations; note that since arrivals and departures are always observed, a missing observation offers evidence for either an inner transition or a virtual jump in the network. For simplicity, we assume that no state is reachable from itself in a transition, so that T(x, x) = ∅; however, the framework naturally extends to networks where self-transitions are a possibility. Now, denote by u = {u_1, . . . , u_m} an auxiliary family of subsets of Γ ∪ ∅, such that

u_i | x′ = {T(x′_{i−1}, x′_i)} with probability p, and u_i | x′ = Γ ∪ ∅ with probability 1 − p,   (2)

for some fixed p ∈ [0, 1] and all i = 1, . . . , m. Hence, auxiliary variables u ∈ u will refer to either the entire space of task transitions or sets with a single element in Γ; we note that these single-element sets will be further contained within a larger observation set O ∈ O′_k. Recall that in queueing networks a task transition may follow from an infinite number of network configurations; that is, there may exist an infinite number of task orderings across the stations so that a specific job is serviced in one given server and routed to another. However, any network state can only transition to a finite set of states, by relocating one task in a new queue after a service or an arrival. Thus, strictly, |{x′ ∈ S : T(x, x′) ≠ ∅}| < ∞ for all x ∈ S. Moreover, any u ∈ u such that u ≠ Γ ∪ ∅ can only be produced by a limited set of uniformized paths in X, and the compatibility notions in Definition 2.2 extend naturally to these auxiliary-observation variables.
Restrictions are of two types:

• Transition triplets impose a transition for an identifiable task. The transition probability is identical over all pairs of compatible states (x, x′) ∈ S^2.

• Null sets impose virtual jumps. The transition probability (or lack thereof) depends both on the network configuration and the dominating rate Ω.
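The auxiliary draw in (2) reduces to a sequence of coin flips over the uniformized epochs: with probability p the variable is clamped to the singleton task transition (or the empty transition for a virtual jump), and otherwise left as the unrestricted set Γ ∪ ∅. A minimal sketch (Python; the sentinel value and function names are assumptions of this sketch):

```python
import random

def sample_auxiliary(x_states, transition_map, p, rng, gamma_all="ALL"):
    """Draws u_i as in Eq. (2): with probability p, clamp to the
    singleton {T(x'_{i-1}, x'_i)} (None standing in for the empty
    transition of a virtual jump); otherwise leave the unrestricted
    set Gamma ∪ ∅, represented by the sentinel gamma_all."""
    u = []
    for prev, cur in zip(x_states, x_states[1:]):
        if rng.random() < p:
            u.append({transition_map(prev, cur)})   # clamped node
        else:
            u.append(gamma_all)                     # open node
    return u
```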

Sampler
Let (t, x) denote a network path in X fully compatible with O_k, with t = {t_1, . . . , t_n} and x = {x_1, . . . , x_n}; then, marginalizing over (x′, u), the frame t′ is independent of any observations and such that (cf. Rao and Teh (2013))

f(t′ | t, x, Ω, θ) ∝ ∏_{i=0}^{n} (Ω + Q_{x_i})^{|t′ ∩ (t_i, t_{i+1})|} e^{−(Ω + Q_{x_i})(t_{i+1} − t_i)},

with t_0 = 0 and t_{n+1} = T. Thus, it may be sampled in a collapsed step incorporating virtual transitions to the times in t, employing a succession of Poisson processes with rates {Ω + Q_{x_i} : x_i ∈ x}. Note that t′ along with (t, x) induces preliminary sequences of missing observations in O′_k and uniformized transitions in x′. Next, we target u | x′, sampling m auxiliary-observation variables from (2). Finally, we obtain a new path (t, x) in full agreement with both real and auxiliary observations, producing t, x, x′ in a blocked step. This simplifies to sampling a sequence x′ | t′, u, Ω, θ, O′_k, x_0 and removing virtual entries; it is achieved by employing dynamic arrays within a procedure for discrete-time state-space models, as shown in Algorithm 2. Alternatively, note that it is possible to employ a particle filtering approach within a forward procedure, in order to impose further memory constraints.
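The blocked update x′ | t′, u, Ω, θ, O′_k, x_0 follows the usual forward filtering backward sampling recursions, with per-epoch weights zeroed out wherever a candidate move violates the real or auxiliary evidence. A generic sketch (Python, illustrative; the epoch-weight functions stand in for the network-specific transition terms and are assumptions of this sketch):

```python
import random

def ffbs(init, steps, rng):
    """Forward filtering backward sampling over a discrete-time frame.
    init maps states to initial weights; steps[i] returns, for a
    previous state, a dict {next_state: weight}, with zero/absent
    weight wherever a move violates the evidence (clamped nodes)."""
    # Forward pass: alpha_i(x) = sum_prev alpha_{i-1}(prev) * w_i(prev, x).
    alphas = [dict(init)]
    for w in steps:
        cur = {}
        for xp, a in alphas[-1].items():
            for x, weight in w(xp).items():
                cur[x] = cur.get(x, 0.0) + a * weight
        alphas.append(cur)

    def draw(weights):
        # Sample a key in proportion to its unnormalized weight.
        u = rng.random() * sum(weights.values())
        for x, wt in weights.items():
            u -= wt
            if u < 0:
                return x
        return x

    # Backward pass: sample x_m ~ alpha_m, then x_{i-1} | x_i recursively.
    path = [draw(alphas[-1])]
    for i in range(len(steps) - 1, -1, -1):
        back = {xp: a * steps[i](xp).get(path[0], 0.0)
                for xp, a in alphas[i].items()}
        path.insert(0, draw({k: v for k, v in back.items() if v > 0}))
    return path
```

Because only a handful of candidate states carry positive weight at each epoch, each dict stays small, which is the memory argument behind the dynamic arrays mentioned above.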

Properties and considerations
Along with observations and naturally restrictive constraints on state transitions within QNs, auxiliary variables in u allow us to limit the space a sampler is allowed to explore within each iteration. These restrictions apply both within forward and backward procedures and leave the underlying filtering equations unaltered, up to proportionality. Increasing the value of p will make computationally expensive iterations less likely, at the cost of a higher dependence between subsequent realizations of X. Also, the term (1−q) enters the forward procedure penalizing network paths with unobserved transitions and is only proportionally relevant when no observation exists.
In Figure 5 we observe a task transition diagram with a single iteration in the proposed sampler, for the bottleneck network in Figure 1 (top). In this example, two tasks (numbered 1 and 2) are observed entering and leaving the network at different times; however, there exists no information regarding job services within the network. In each iteration, the sampler begins with a network path whose task transitions are fully compatible with the existing evidence. In an initial step, the existing path is supplemented with virtual transitions at the corresponding Poisson rates. In the Figure, we observe that nodes for both virtual jumps and the unobserved job services are superimposed over shaded boxes; the boxes represent further evidence for the lack of task arrivals or departures at these times. Next, auxiliary variables are produced across real and virtual jumps; the subsets are loosely represented by ticks (Γ ∪ ∅) and crosses ({T(x′_{i−1}, x′_i)}) for open and clamped nodes respectively. Then, the uniformized frame is emptied and both real and auxiliary evidence is propagated, imposing task transitions or virtual jumps within clamped nodes and resulting in a restricted frame for possible network paths. Finally, a new compatible path is sampled via forward filtering backward sampling as summarized in Algorithm 2; this will consider the imposed task transitions and weight successive network states over the clamped epochs. The resulting path is fully compatible with the observed evidence; however, notice that task transitions at arrival or departure times may change between iterations.
Note that by choosing Ω strictly greater than all values in the diagonal of Q, the resulting Markov chain over posterior network transitions is irreducible. Increasing the dominating rate will improve mixing in exchange for higher computational requirements. Finally, we note that a high value of p may hinder the sampler from fully exploring the posterior range of network paths.

Parameter sampling
Finally, given a new family of network realizations X = {X_k}_{k=1,...,K} fully compatible with the observation sequences O = {O_k}_{k=1,...,K}, we may obtain posterior samples of the arrival and service rate parameters. For traditional FCFS stations, assuming independent network parameters and uninformative priors, this is such that

λ_c | X ∼ Gamma(δ_c, KT) and µ^c_i | X ∼ Gamma(γ^c_i, τ^c_i), for c ∈ C, i = 1, . . . , M.

Here δ_c, γ^c_i and τ^c_i denote respectively the number of class c arrivals, the number of class c jobs served at station i, and the time server i has been occupied by a class c job, in all realizations in X. Finally, posterior probability vectors for class c routings in every node i = 1, . . . , M are given by

P^c_{i,·} | X ∼ Dirichlet(1 + κ^c_i),

where κ^c_i defines a vector of transition counts from server i in X. Arrival posteriors in p are defined the same way. We note that in order to ease identifiability in the inferential problem, it is also possible to fix parameters, incorporate conjugate priors or impose inequality constraints and bounds across parameters; we show examples in Section 4 below. Also, the above expressions must be altered when stations respond to prioritization regimes other than FCFS (see Example 3 in Section 4).

Figure 5: Task transition diagram with a single iteration in the proposed sampler, for a bottleneck network with three servers. Here, tasks 1 and 2 are observed entering and leaving the network. First, start with a path whose task transitions are fully compatible with the evidence. Then, supplement it with virtual jumps at the corresponding rates, and produce auxiliary variables across real and virtual nodes. Next, empty the uniformized frame and propagate both real and auxiliary evidence, imposing task transitions or virtual jumps within clamped nodes. Finally, repopulate the frame via forward filtering backward sampling, hence maintaining agreement with the existing evidence.
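These conjugate updates reduce to standard Gamma and Dirichlet draws from sufficient statistics of the completed paths. A single-class, single-station sketch (Python; the Gamma(count, exposure) convention and all names are assumptions of this sketch, consistent with uninformative priors):

```python
import random

def sample_rates(arrivals, services, busy_time, horizon, routing_counts, rng):
    """Conjugate posterior draws from sufficient statistics of the
    completed network paths: Gamma posteriors for the arrival and
    service rates, and a Dirichlet posterior for routing probabilities
    (sampled via normalized Gamma variates)."""
    # lambda | X ~ Gamma(#arrivals, total observation time);
    # gammavariate takes a scale, i.e. the inverse rate.
    lam = rng.gammavariate(arrivals, 1.0 / horizon)
    # mu | X ~ Gamma(#services at the station, time the server was busy)
    mu = rng.gammavariate(services, 1.0 / busy_time)
    # P_i | X ~ Dirichlet(1 + transition counts)
    g = [rng.gammavariate(1.0 + k, 1.0) for k in routing_counts]
    probs = [x / sum(g) for x in g]
    return lam, mu, probs
```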

Examples
In the following, we discuss results obtained across three example networks with both synthetic and real data, in order of increasing difficulty. In all cases, results are obtained through a JAVA implementation of the proposed sampler, and starting compatible network paths have been manually assigned.
The examples demonstrate the ability of the proposed algorithm to handle missing data in multi-class inferential problems with varying service disciplines, class switching and imposed prior constraints. Hence, the sampler offers the means to overcome assumptions linked to the common use of product-form equilibrium expressions for QNs. To the best of our knowledge, there exists no alternative approach overcoming these restrictions when drawing exact inference in general open Markovian networks.

Tandem network
In the simplest example, we analyse simulated data on a tandem network with two M/M/1 stations, FCFS service disciplines and a single task class. Data is generated so that the true service rates are µ_1 = 0.2 and µ_2 = 0.5, arrivals are given by λ = 0.12 and the network topology is defined by a routing probability matrix P such that P_{1,2} = 1 and P_{2,0} = 1. Also, jobs enter directly into the first queue, and p_{0,1} = 1. For the inferential problem, job service observations (in the first station) are always ignored and the only source of information is end-to-end measurements. Thus, available knowledge is limited to the times when tasks enter the queue at the first station and when they depart through the second station. Overall, we examine 5000 realizations totalling 17827 tasks during 115601 time units. For this purpose, the network topology in P is fixed as deterministic, since there exists a unique route from start to completion of tasks. Also, in order to ensure identifiability we impose an inequality constraint on the service rates and assign fairly uninformative parameter priors, so that π(µ_1, µ_2) ∝ I(µ_1 ≤ µ_2) × exp(−10^{−3}(µ_1 + µ_2)).
Note that the problem is directly linked to the inferential task for two exponentially distributed random variables when only their sum is observed, with the further complexity that unknown waiting times must be discounted from the empirical observations.
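The simpler of these two tasks admits a short data-augmentation sketch: impute the first stage duration from its truncated-exponential conditional, then draw the rates from conjugate Gamma updates under the exponential priors and ordering constraint above. This is an illustrative Gibbs sampler in Python, not the proposed slice sampler, and the function name is ours.

```python
import math
import random

def gibbs_sum_of_exponentials(sums, iters=500, seed=0):
    """Gibbs sampler for rates (mu1, mu2) of two exponential stages
    when only the sums S = X1 + X2 are observed.

    X1 | S has density proportional to exp(-(mu1 - mu2) x) on (0, S);
    given the imputed X1's, the rates have Gamma conditionals under
    exponential priors with rate 1e-3, and the constraint mu1 <= mu2
    is enforced by rejection.
    """
    rng = random.Random(seed)
    n = len(sums)
    mu1, mu2 = 0.1, 1.0
    draws = []
    for _ in range(iters):
        # impute first-stage durations by inverse-CDF sampling
        t1 = 0.0
        for s in sums:
            r = mu1 - mu2
            if abs(r) < 1e-12:
                x = rng.uniform(0.0, s)
            else:
                u = rng.random()
                x = -math.log(1.0 - u * (1.0 - math.exp(-r * s))) / r
            t1 += x
        t2 = sum(sums) - t1
        # conjugate Gamma updates; reject draws violating mu1 <= mu2
        while True:
            m1 = rng.gammavariate(n + 1, 1.0 / (t1 + 1e-3))
            m2 = rng.gammavariate(n + 1, 1.0 / (t2 + 1e-3))
            if m1 <= m2:
                mu1, mu2 = m1, m2
                break
        draws.append((mu1, mu2))
    return draws
```

The extra difficulty in the network problem is precisely that the observed end-to-end times also contain unknown queueing delays, which this toy sampler does not model.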
In Figure 6 (right) we observe a contour plot of the joint posterior kernel density estimate over service rates, and we notice a significant negative correlation between the values (the dashed vertical and horizontal lines represent the original parameter values used in the network path simulations). Results are obtained across two chains with 100000 iterations each, a 10000-iteration burn-in stage, varying starting rates and different scales for the dominating rates and the probabilities producing auxiliary observations, so that p_1 = 0 and Ω_1 = 2 max_{x∈S} |Q_x|, and p_2 = 0.25, Ω_2 = 1.5 max_{x∈S} |Q_x|. Note that the second chain is produced employing restrictive auxiliary observations, as opposed to the first; hence, stronger serial dependencies across subsequent latent paths in the network should be expected. Yet, the remaining plots show marginal posterior kernel density estimates for both service rates, along with an autocorrelation summary across a thinned sample of the second chain, showing satisfactory mixing.
A discussion of the effects and computational gains resulting from employing restrictive auxiliary observations follows in the next example. In general, networks of interest are complex and p = 0 would pose a computationally infeasible problem. Moreover, even in simple networks such as this example, computing times can be excessive, and considerable reductions can be obtained at the cost of higher serial dependencies.

Bottleneck network
We examine simulated data in the bottleneck network in Figure 1 (top), with 3 FCFS stations and 3 different task classes. The true service rates can be observed in Table 1, and task arrivals are given by λ_1 = 0.08, λ_2 = 0.06 and λ_3 = 0.04. In this case, along with end-to-end measurements, approximately half of all generated job service observations are retrieved, so that q = 0.5. The network topology is identical for all three classes and assumed to be known. In addition, job entries are split evenly, i.e. p^c_{0,1} = 0.5 and p^c_{0,2} = 0.5 for all c ∈ C.
Table 1: Summary statistics for posterior service rates along with computing times across three chains tuned differently in the bottleneck network in Figure 1 (top).
In total, we analyse 500 network realizations totalling 1281 tasks during 5083 time units. In order to ease identifiability we assume the existence of a slow, medium and fast server, and assign rather uninformative parameter priors, i.e. π(µ^c_1, µ^c_2, µ^c_3) ∝ I(µ^c_1 ≤ µ^c_2 ≤ µ^c_3) × exp(−10^{-3}(µ^c_1 + µ^c_2 + µ^c_3)) for all c ∈ C. Note that this network type may not be analysed by means of product-form representations centred around queue-length figures (cf. Wang et al. (2016)). This is because traditional BCMP networks require FCFS stations to share service rates across task classes. On the other hand, an MCMC sampler as presented in Sutton and Jordan (2011) could be extended to handle general service distributions and target network path transitions; however, the framework is not designed for such an aim, it would require an additional Metropolis-Hastings step and it is likely to perform poorly.
In Table 1 we observe summary statistics, computing times and effective sample sizes across three chains with 100000 iterations each, a 10000-iteration burn-in stage, varying starting rates and different scales for the dominating rates Ω and probabilities p producing auxiliary observations. There, we notice a good trade-off between effective sample sizes and the drastic decrease in computing times obtained when imposing strong serial relations on network paths across subsequent iterations of the sampler. This is the case even when the volume of virtual jumps produced is reduced, and it emphasizes the need for such slice sampling techniques in inferential problems with QNs. In addition, Table 2 displays the overall posterior correlation matrix between service rate parameters, and shows very mild relations among rates for each task class. This highlights the importance of employing posterior samples from the produced chains in order to answer extrapolation-type questions about network systems. Finally, note that it is possible to ease the imposed restrictions on the network topology and to employ different service disciplines across servers (see the next example).

Feedback network
Finally, we show how the proposed sampling scheme may be used to analyse a real data set. For the purpose, we employ work-logs for medical clinicians. Briefly, the data set includes task requests and completions for individual doctors outside the 9:00-17:00, Monday to Friday, in-hours setting. It belongs to two jointly coordinated university hospitals in the United Kingdom, together servicing a geographical region with over 2.5 million residents.
Figure 7: Sample diagram with a subset of tasking data (task classes Fall, Clerking and Urgency) linked to a clinician during a shift.
In Figure 7 we show a diagram with a small subset of data linked to a clinician during a shift; there we observe three overlapping tasks recorded in the system (from request to completion), each belonging to a different class. Note that it is not possible to know when the clinician was engaged with each duty, as individual jobs for tasks are not registered while queueing or being routed across teams of administrative staff, nurses and doctors. An extended description of the data set may be found in Perez et al. (2016). Tasks are grouped into 14 categories and analysed with a feedback network as shown in Figure 8. There, we notice the presence of two M/M/1 servers with alternative disciplines and route switching among classes. Task observations for doctors are of roughly two kinds, based on whether they require engagement or not. Many tasks are recorded and erased within doctor work-logs in a short time span, due to no need for action; the remaining tasks exhibit long processing times, indicating the need for considerable doctor activity. In the proposed example, arriving jobs are buffered within an administrative FCFS priority-type queue and depart to a transition center, where they either leave the system or get routed for processing with some unknown probability.
Once assigned to further processing, they join the doctor's processing centre and switch their routing mechanism, so that they will depart the network the next time they undergo administrative processing in the first queue. The service station intended to capture strain on doctor workload is assigned a processor sharing (PS) discipline with a single worker, so as to accommodate doctors attending concurrent duties outside standard working hours. No job service observations are available, so that q = 0 and only the arrival and departure times of tasks to and from the network are observed.
In total, we analyse a reduced subset of 10000 doctor shifts roughly distributed across 4 years of observations. The network topology is partially known; i.e. P^c_{2,1} = p^c_{0,1} = 1 for all task classes, and P^c_{1,0} = 1 after tasks have undergone processing and hence switched routing mechanism. However, P^c_{1,0} = 1 − P^c_{1,2} needs to be determined for all existing task classes. Processing rates for tasks are assumed equal in the first service station and different in the PS server; we assign no constraints and impose loosely uninformative priors such that π(µ^c_i) ∝ exp(−10^{-3} µ^c_i) for all i ∈ {1, 2} and c ∈ C. Also, note that within a PS discipline the posterior rates given network realizations are given by µ^c_2 | X ~ Gamma(γ_c + 1, 10^{-3} + Σ_k Σ_j ∫_{a_j}^{d_j} n_k(t)^{-1} dt), where γ_c denotes the number of class c jobs served at the station across all realizations in X, n_k(t) denotes the number of jobs present in the server at time t, the summations are across all jobs processed in the PS station in realization k, and a_j, d_j denote the arrival and departure times of the job to the server. In Table 3 we observe summary statistics for parameters across two chains with 100000 iterations each, a 50000-iteration burn-in stage and varying starting rates. In one chain, we use Ω = 2 max_{x∈S} |Q_x| and p = 0.75; in the second we have Ω = 1.5 max_{x∈S} |Q_x| and p = 0.5. In addition, Table 4 shows point estimates and standard errors for average completion times across all task types; these correspond to the full processing times from entry to departure in the network (excluding queueing times) and are reported in hour units. Hence, we notice that it is possible to assess workload both globally and across single components of the system, thus allowing us to answer extrapolation-type questions on workload, i.e. in relation to means, variances and extreme values for system strain under likely alterations.
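Under a single-worker PS discipline, each job receives a 1/n(t) share of the server while n(t) jobs are present, so the exponential likelihood term for each job involves the integral of 1/n(t) over its sojourn (a_j, d_j). The following sketch (illustrative Python; the function name is ours) computes these integrals exactly on the piecewise-constant occupancy of a single realization:

```python
def ps_exposure(jobs):
    """Per-job processor-sharing exposure for a single-worker PS station.

    `jobs` is a list of (a_j, d_j) sojourn intervals.  The occupancy
    n(t) is piecewise constant between arrival/departure epochs, so each
    integral of 1/n(t) over (a_j, d_j) is a finite sum over the event
    grid.  Returns one exposure value per job.
    """
    # epochs at which the occupancy n(t) can change
    times = sorted({t for a, d in jobs for t in (a, d)})
    out = []
    for a, d in jobs:
        total = 0.0
        for lo, hi in zip(times, times[1:]):
            if hi <= a or lo >= d:
                continue  # interval lies outside this job's sojourn
            seg_lo, seg_hi = max(lo, a), min(hi, d)
            mid = 0.5 * (seg_lo + seg_hi)
            n = sum(1 for aa, dd in jobs if aa <= mid < dd)  # occupancy
            total += (seg_hi - seg_lo) / n
        out.append(total)
    return out
```

For example, two jobs with sojourns (0, 2) and (1, 3) overlap on (1, 2), where each receives a half share, giving an exposure of 1.5 for both.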

Discussion
This paper has presented a flexible approach for carrying out exact Bayesian inference within known or hypothesized queueing networks. Its focus is on multi-class, open and Markovian cases, and the approach is centred around the underlying continuous-time Markov chains induced by these complex stochastic systems. The proposed method relies on a slice sampling technique with mappings to the space of task transitions across servers in the network. It builds on the uniformization-oriented MCMC approaches introduced in Rao and Teh (2013) and can deal with missing data, imposed prior knowledge and the strong serial dependencies that pose a complex inferential task (cf. Sutton and Jordan (2011)).
The need for such inferential frameworks with missing data is justified by the ability of general-form networks to support the evaluation of response times in complex systems. Overall, recovering measures such as processing times is a technically difficult task when designing increasingly complex IT systems (Liu et al., 2006), or in service delivery networks (such as those in hospitals), where direct measurement raises ethical issues due to its intrusiveness (Perez et al., 2016). Yet, QNs provide the tools to assess system alterations, diagnose poor performance or evaluate robustness to spikes in workload.
The advantage of the presented inferential method is that it permits retrospectively assessing the likely status of a system at any point in time, rather than only providing summary information on strain over individual components. However, limitations relate to tractability restrictions in large networks. In such cases, controlling the dimensionality of unobservable state spaces requires imposing strong serial dependencies within the simulated latent network paths across steps of the sampler. This, however, may restrict the produced chain from exploring the posterior range of network paths efficiently. Approximate inferential frameworks relying on reduced product-form simplifications of state beliefs may improve the scalability of the method. Moreover, it is possible to explore the use of particle filtering approaches along with auxiliary variables for this purpose, since clamping the explorable spaces within filtering procedures would likely ease the usual challenges regarding particle degeneracy, that is, ending with very few particles having non-zero weights.
Also, the use of the uniformization technique limits applications of the present framework to the study of purely Markovian processes. While it is possible to employ Markov-modulated regimes that adapt service and arrival rates to network states, this would greatly expand the state spaces under consideration. Moreover, uniformization may render the sampler computationally inefficient should service rates vary greatly across queues or job classes, as certain transition types will greatly dominate the underlying discrete-time Markov chain.
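This dominance effect can be read directly off the uniformized transition matrix P = I + Q/Ω: when Ω is tied to the fastest exit rate, slow states receive self-transition (virtual jump) probabilities close to one. A toy sketch in Python, with a two-state generator of our own choosing purely for illustration:

```python
def uniformize(Q, omega):
    """Uniformized DTMC transition matrix P = I + Q / omega for a CTMC
    with generator Q (list of rows); omega must dominate every exit
    rate |Q_xx|, so each row of P is a probability distribution."""
    n = len(Q)
    assert omega >= max(-Q[i][i] for i in range(n))
    return [[(1.0 if i == j else 0.0) + Q[i][j] / omega
             for j in range(n)] for i in range(n)]

# Exit rates differing by two orders of magnitude: with
# omega = 2 * max |Q_xx|, the slow state almost always self-loops.
Q = [[-0.1, 0.1],
     [10.0, -10.0]]
P = uniformize(Q, 20.0)
```

Here P[0][0] = 1 − 0.1/20 = 0.995, so a uniformization-based sampler spends nearly all of its discrete steps on virtual jumps in the slow state.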
Finally, the paper assumes that the volume of job service observations retrieved across the network is given by 100·q % of the total processing during a fixed time interval.
For simplicity, q is assumed fixed and known to the user. Many network structures (such as bottleneck networks) allow uncertainty in this parameter to be quantified by means of the presented sampler, as each iteration provides a total number of network transitions complementing the observation count as a sufficient statistic. However, it is necessary to impose knowledge of q in order to ensure model identifiability whenever networks contain either global or self-loops.
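Where q is identifiable, the quantification alluded to above reduces to a conjugate Beta update per sampler iteration, treating each imputed service completion as logged independently with probability q. A hypothetical sketch of that single update (this extension is not part of the presented sampler):

```python
import random

def sample_q_posterior(m, n, a=1.0, b=1.0, draws=1000, seed=0):
    """Conditional posterior draws for the observation probability q.

    Given an imputed network path with n service completions in total,
    of which m were logged, and a Beta(a, b) prior, the conditional
    posterior is Beta(a + m, b + n - m) by Beta-Binomial conjugacy.
    """
    rng = random.Random(seed)
    return [rng.betavariate(a + m, b + (n - m)) for _ in range(draws)]
```

Within the sampler, n would be recomputed from the latent path at each iteration and a single draw of q taken per sweep; for networks with global or self-loops this step must be dropped and q fixed, as noted above.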