Size-aware dispatching in two queues

Problem statement

We consider two heterogeneous G/G/1 queues in parallel. When a job arrives, it joins a dispatcher, which sends it to one of the queues instantly. The dispatcher can observe the size \(S_n\) of each job \(n\), and it adopts the following policy: if \(S_n\le s\), then it sends the job to queue \(1\), otherwise to queue 2. The idea here is to force the separation of long and short jobs.

The goal of this session is to find the cutoff size \(s\) that minimizes the mean waiting time.

Unfortunately, no exact formula is available for the mean waiting time of the G/G/1 queue in general. Given this difficulty, we can approach this problem either by simulation or by analytical approximations. In the former case, we rely on Lindley’s equation (no need to use the general code written for the G/G/1 queue) while in the latter, we assume that the arrival process is Poisson and rely on the Pollaczek–Khinchine formula. We will assess the accuracy of the approximation given by the Pollaczek–Khinchine formula with respect to the optimal probability computed via simulation.

Hint: for the simulation part, you can adapt the code of the former session Response time minimization in two parallel queues

Assumptions

We assume that the service rate at queue \(i\) is \(\mu_i\) and that the arrival rate is \(\lambda\). We assume that the job sizes are independent and identically distributed. We let \(S\) denote a random variable having the distribution of the job size (\(S\) has the same probability distributions of the \(S_n\)’s). For stability, we assume that \(\lambda \mathbb{E}[S]<\mu_1+\mu_2\).

Approximation via Pollaczek–Khinchine formula

Using the Pollaczek–Khinchine formula and the Poisson decomposition property, the mean waiting time of a random job as a function of the cutoff size \(s\), say \(W=W(s)\), is given by S|S<s \[ W(s) = p W_1(s) + (1-p) W_2(s) = p \times \frac{\lambda p}{2} \frac{ \mathbb{E}[S^2\mid S\le s]/\mu_1^2}{1-\lambda p\mathbb{E}[S\mid S\le s]/\mu_1} + (1-p) \times \frac{\lambda (1-p)}{2} \frac{\mathbb{E}[S^2\mid S>s]/\mu_2^2}{1-\lambda(1-p)\mathbb{E}[S\mid S> s]/\mu_2} \] where \[ p:=\mathbb{P}(S_n\le s) \] and we are interested in minimizing \(W(s)\) over \(s\in[0,S_{\max}]\). This is not trivial to solve analytically but we can always solve it numerically.

When \(S\) follows a uniform distribution over \([0,S_{\max}]\), we obtain \(p=\frac{s}{S_{\max}}\), \(\mathbb{E}[S\mid S<s]=\frac{s}{2}\) and \(\mathbb{E}[S^2\mid S<s]= \frac{s^2}{12}+\frac{s^2}{4} = \frac{s^2}{3}\).

Optimization via simulation

For all \(n=1,2,\ldots,N\), let \(W_{1,n}\) and \(W_{2,n}\) denote the waiting times of the \(n\)-th job joining the system if it is sent to queue 1 and 2, respectively; we can assume \(W_{1,n}=W_{2,n}=0\). Using Lindley’s equation, we obtain \[ W_{1,n+1} = \max \left(W_{1,n} + \frac{S_n}{\mu_1}\,I_{(S_{n}\le s)} - A_n, \,0 \right) \] and \[ W_{2,n+1} = \max \left(W_{2,n} + \frac{S_n}{\mu_2}\,I_{(S_{n}> s)} - A_n, \,0 \right) \] where \(I_E\) is the indicator function of the event \(E\), \(A_n\) denotes the inter-arrival time between jobs \(n\) and \(n+1\), and \(S_n\) is the job size of job \(n\). Now, we write a code that iterates over this equation and plots the average waiting time \(\frac{1}{N}\sum_{n=1}^N W_n\) as a function of \(s\).

knitr::opts_chunk$set(echo = TRUE)

LB_SizeInterval <- function(N,s,mu,InterArrival,JobSize) {      

  WaitingTime=rep(0,2);
  WaitingTimeCum=rep(0,2);
  
  # first job
  U = ifelse(JobSize[1]<=s,1,2);
  
  for (n in 2:N) #for each job
  {
    for (k in 1:2) #for each server
    {
      # Update the waiting time of server k (Lindley's recursion)
      WaitingTime[k] = max(WaitingTime[k] + ifelse(U==k,1,0)*JobSize[n-1]/mu[k] - InterArrival[n], 0);
    }
    # Dispatching decision for job n
    U = ifelse(JobSize[n]<=s,1,2);
    
    WaitingTimeCum[U] = WaitingTimeCum[U]+WaitingTime[U];
  }

 return(sum(WaitingTimeCum)/N);
  
}

################################################################################
N=1e5; # number of jobs to simulate
set.seed(7); # Set seed for reproducibility

mu = sample(x=1:3, 2, replace = TRUE); # service rates
mu = c(3,1);

Smax=100;
JobSize=runif(N,0,Smax); # Job sizes
ES=Smax/2; 
rho=0.8; # network load to test: lambda*E[S]/(mu_1+mu_2)
lambda = rho*sum(mu)/ES; # arrival rate

# InterArrival = rexp(n=N, rate = lambda);           # Inter-arrival times: Poisson
InterArrival = runif(n=N, min = 0, max=2/lambda);    # Inter-arrival times: Uniform

# smin=65;smax=85;
smin=84;smax=95;
step=(smax-smin)/10;
s_values=seq(smin,smax, step);
W_SizeInterval=rep(NA,length(s_values));

cnt=1;
for (s in s_values) 
{
    W_SizeInterval[cnt]=LB_SizeInterval(N,s,mu,InterArrival,JobSize);
    cat(paste("Avg Waiting Time (s=", round(s,digits=3), ")=", round(W_SizeInterval[cnt],digits=3),sep=""), "\n");
    cnt=cnt+1;
}
## Avg Waiting Time (s=84)=121.073 
## Avg Waiting Time (s=85.1)=60.954 
## Avg Waiting Time (s=86.2)=42.891 
## Avg Waiting Time (s=87.3)=35.693 
## Avg Waiting Time (s=88.4)=34.26 
## Avg Waiting Time (s=89.5)=35.69 
## Avg Waiting Time (s=90.6)=39.712 
## Avg Waiting Time (s=91.7)=46.013 
## Avg Waiting Time (s=92.8)=58.378 
## Avg Waiting Time (s=93.9)=78.893 
## Avg Waiting Time (s=95)=120.582
# plot the results
plot(s_values, W_SizeInterval, xlim=c(smin,smax), ylim = c(0.85*min(W_SizeInterval,na.rm = T), 1.3*mean(W_SizeInterval,na.rm = T)),
     col="blue", lty=1, ylab="",xlab="size s", type = "b", pch = 19, main=paste("Average Waiting Time W(s)"));

grid(nx = NULL, ny = NULL, lty = 2, col = "lightgray", lwd = 1)

cat("The optimal cutoff size is:", (which.min(W_SizeInterval)-1)*step+smin,'\n');
## The optimal cutoff size is: 88.4