Job dispatching among parallel servers TD Évaluation de Performances, INFO4 Polytech Grenoble

Problem Statement

We consider a system composed of \(K\) servers working in parallel, each with its own queue. In the following, “queues” and “servers” are synonyms. Queues have an infinite buffer and operate under the First-In First-Out (FIFO) scheduling discipline. Server \(k\) operates with speed \(\mu_k\). Jobs join the network via an exogenous process with rate \(\lambda\) and are dispatched to queue \(k\) according to some dispatching algorithm; see below. After processing, jobs leave the system and never return.

Job sizes are independent and classified as either long, i.e., of size \(x_M\), or short, i.e., of size \(x_m\), with \(0< x_m < x_M <\infty\). Let \(X\) denote the random variable of the job sizes. We also assume that job sizes have a “heavy-tailed” distribution in the sense that

\[ \mathbb{P}(\mbox{Long job}) =\mathbb{P}(X=x_M)=\alpha\, x_M^{-\beta},\quad \alpha>0, \, 0\le \beta\le 1 \]

with \(x_M\) “large”. Here, the tail of \(X\) decays polynomially and we will be particularly interested in the case where \(\beta\le1\) as this implies high variance when \(x_M\) grows; say \(x_m=1\) and \(x_M=10^3\).

We are interested in evaluating the performance induced by the following dispatching algorithms:

Random (RND): Upon arrival, each job is dispatched to queue \(i\) with probability \(1/K\) independently of all else.
Round-Robin (RR): Upon arrival, the \(n\)-th job is sent to queue \(1+(n\bmod K)\).
Least Loaded (LL-\(d\)): Upon arrival, each job is dispatched to the queue with the shortest workload among \(d\) selected uniformly at random.
Size Interval Task Allocation (SITA-\(c\)): Upon arrival, a short resp. job is sent uniformly at random among servers \(\{1,\ldots,c\}\) resp. \(\{c+1,\ldots,K\}\), for some \(c\).

The performance metric of interest is the waiting time, i.e., the average time that jobs spend waiting before receiving service.

The objective is to understand which one works better and, if possible, to extract some quantitative insight. Towards this purpose, we rely on simulation and initially assume that servers operate with speed one, i.e., \(\mu_k=1\).

Implementation in R

We write a code to simulate the waiting times induced by the job dispatching schemes above. We rely on Lindley’s recursion.

knitr::opts_chunk$set(echo = TRUE)

LB_random <- function(N,K,mu,InterArrival,JobSize) {      

  WaitingTime=rep(0,K);
  WaitingTimeCum=rep(0,K);
  
  for (n in 1:N) #for each job
  {
    # Dispatching decision of RANDOM
    U = sample(x=1:K,1); 
    for (k in 1:K) #for each server
    {
      # Update the waiting time of server k (Lindley's recursion)
      WaitingTime[k] = max(WaitingTime[k] + ifelse(U==k,1,0)*JobSize[n]/mu[k] - InterArrival[n], 0);
    }
    WaitingTimeCum[U] = WaitingTimeCum[U]+WaitingTime[U];
  }

 return(sum(WaitingTimeCum)/N);
}

LB_RR <- function(N,K,mu,InterArrival,JobSize) {      

  WaitingTime=rep(0,K);
  WaitingTimeCum=rep(0,K);

  for (n in 1:N) #for each job
  {
    # Dispatching decision of Round Robin
    U = n%%K; 
    U = U+1;
    for (k in 1:K) #for each server
    {
      # Update the waiting time of server k (Lindley's recursion)
      WaitingTime[k] = max(WaitingTime[k] + ifelse(U==k,1,0)*JobSize[n]/mu[k] - InterArrival[n], 0);
    }
    WaitingTimeCum[U] = WaitingTimeCum[U]+WaitingTime[U];
  }

 return(sum(WaitingTimeCum)/N);
}

LB_LLd <- function(N,K,mu,InterArrival,JobSize,d) {      

  WaitingTime=rep(0,K);
  WaitingTimeCum=rep(0,K);

  for (n in 1:N) #for each job
  {
    # Dispatching decision of Least Loaded - d
    s=sample(x=1:K, d, replace = F);
    k_win=s[which.min(WaitingTime[s])];

    for (k in 1:K) #for each server
    {
      # Update the waiting time of server k (Lindley's recursion)
      WaitingTime[k] = max(WaitingTime[k] + ifelse(k_win==k,1,0)*JobSize[n]/mu[k_win] - InterArrival[n], 0);
    }

    WaitingTimeCum[k_win] = WaitingTimeCum[k_win]+WaitingTime[k_win];

  }

 return(sum(WaitingTimeCum)/N);
}

LB_SITAc <- function(N,K,mu,InterArrival,JobSize,c,xm) {      

  WaitingTime=rep(0,K);
  WaitingTimeCum=rep(0,K);
  
  for (n in 1:N) #for each job
  {
    
    # Dispatching decision of SITA-c
    k_win=sample(x=(c+1):K);
    if(JobSize[n]==xm)
    {
      # job n is short
      k_win=sample(x=1:c);
    }

    for (k in 1:K) #for each server
    {
      # Update the waiting time of server k (Lindley's recursion)
      WaitingTime[k] = max(WaitingTime[k] + ifelse(k_win==k,1,0)*JobSize[n]/mu[k_win] - InterArrival[n], 0);
    }

    WaitingTimeCum[k_win] = WaitingTimeCum[k_win]+WaitingTime[k_win];

  }

 return(sum(WaitingTimeCum)/N)

}

################################################################################
N=2e6; # number of jobs to simulate
K=50;  # number of servers
mu = sample(x=1:1, K, replace = TRUE); # service rates

set.seed(17); # Set seed for reproducibility

beta=0.8;
xM=1e3;
xm=1;
JobSize=sample(c(xm,xM),N,replace=T,prob=c(1-1/xM^beta,1/xM^beta)); # Job sizes

EX=xm*(1-1/xM^beta) + xM^(1-beta); # Expected value of job sizes

B=10;
xdata=(0.5:(B-0.5))/B;
W_RANDOM=rep(0,length(xdata));
W_RR=rep(0,length(xdata));
W_LLd=rep(0,length(xdata));

cnt=1;
for (rho in xdata) 
{
  # For stability
  lambda=K * rho/EX; #"overall arrival rate"<"overall service rate"=(mu[1]+...+ mu[K])/EX
  InterArrival = rexp(n=N, rate = lambda);    # Inter-arrival times

  W_RANDOM[cnt]=LB_random(N,K,mu,InterArrival,JobSize);
  W_RR[cnt]=LB_RR(N,K,mu,InterArrival,JobSize);
  W_LLd[cnt]=LB_LLd(N,K,mu,InterArrival,JobSize,2);
  
  print(paste("Waiting Times with rho =",format(round(rho, 2), nsmall = 2),"and K =",K))
  
  print(paste("Random: ", W_RANDOM[cnt]))
  print(paste("RR: ", W_RR[cnt]))
  print(paste("LL-d: ", W_LLd[cnt]))

  #c=floor(K/2); # making sure that c is an integer
  #print(paste("SITA-c: ", LB_SITAc(N,K,mu,InterArrival,JobSize,c,xm)))

  cnt=cnt+1;
  cat("\n")
}

## [1] "Waiting Times with rho = 0.05 and K = 50"
## [1] "Random:  24.9407391177727"
## [1] "RR:  23.0829875179501"
## [1] "LL-d:  4.72371764317381"
## 
## [1] "Waiting Times with rho = 0.15 and K = 50"
## [1] "Random:  74.8361655570211"
## [1] "RR:  72.682005354119"
## [1] "LL-d:  9.4317051164116"
## 
## [1] "Waiting Times with rho = 0.25 and K = 50"
## [1] "Random:  134.66345694938"
## [1] "RR:  135.451641230803"
## [1] "LL-d:  18.6189321444192"
## 
## [1] "Waiting Times with rho = 0.35 and K = 50"
## [1] "Random:  215.498936756677"
## [1] "RR:  216.550889872638"
## [1] "LL-d:  33.3977975647431"
## 
## [1] "Waiting Times with rho = 0.45 and K = 50"
## [1] "Random:  333.154659102608"
## [1] "RR:  325.490107048932"
## [1] "LL-d:  55.1474898744794"
## 
## [1] "Waiting Times with rho = 0.55 and K = 50"
## [1] "Random:  500.662539471911"
## [1] "RR:  485.258878268817"
## [1] "LL-d:  85.9999186207679"
## 
## [1] "Waiting Times with rho = 0.65 and K = 50"
## [1] "Random:  728.358773357734"
## [1] "RR:  731.22991791053"
## [1] "LL-d:  132.188519279443"
## 
## [1] "Waiting Times with rho = 0.75 and K = 50"
## [1] "Random:  1187.48012625492"
## [1] "RR:  1171.1901540522"
## [1] "LL-d:  206.983161390235"
## 
## [1] "Waiting Times with rho = 0.85 and K = 50"
## [1] "Random:  2089.34184001803"
## [1] "RR:  2040.60518620151"
## [1] "LL-d:  335.984144695851"
## 
## [1] "Waiting Times with rho = 0.95 and K = 50"
## [1] "Random:  4398.69389561897"
## [1] "RR:  4296.29082343331"
## [1] "LL-d:  675.476173668452"

length(xdata)

## [1] 10

xdata

##  [1] 0.05 0.15 0.25 0.35 0.45 0.55 0.65 0.75 0.85 0.95

length(W_RANDOM)

## [1] 10

plot(xdata, W_RANDOM, xlim=c(0,1), ylim = c(0, max(W_RANDOM)*1.1), type="o",
     col="blue", pch="o", lty=1, ylab="",xlab="rho",
     main=paste("Average Waiting Times - K=",K))
points(xdata, W_RR, col="red", pch="*")
lines(xdata, W_RR, col="red",lty=2)
points(xdata, W_LLd, col="dark red", pch="*")
lines(xdata, W_LLd, col="dark red",lty=2)
legend("topleft",legend=c("RANDOM","RR","LL2"), col=c("blue","red","black"),
                                   pch=c("o","*","+"),lty=c(1,2,3), ncol=1)

Therefore, LL-2 provides a much better performance than RR and RANDOM, which provide more or less the same performance.

Homework

How the above plots change if job sizes follow an exponential distribution with rate one?
Assume \(K=100\) and compare the average waiting time of RR-2 with the one of SITA-\(c\) where \(c=\lfloor c^* \rfloor\) with \(c^*\) given by \(\frac{x_m}{c^*}=\frac{x_M}{K-c^*}\). What is the interpretation of \(c^*\)?

Job dispatching among parallel servers
TD Évaluation de Performances, INFO4 Polytech Grenoble

Jonatha Anselmi

2023-02-03

Problem Statement

Implementation in R

Homework