Load Balancing

Introduction

In multi-server distributed queueing systems, the access of stochastically arriving jobs to resources is often regulated by a dispatcher, also known as load balancer. A fundamental problem consists in designing a load balancing algorithm that minimizes the delays experienced by jobs in the long run.

In this homework assignment, we consider a system composed of two servers working in parallel, each with its own queue. In the following, “queues and”servers" are synonyms. Queues have an infinite buffer and operate under the First-In First-Out (FIFO) scheduling discipline with unitary speed. Jobs join the network via an exogenous Poisson process with rate \(\lambda\) and are dispatched to queue \(i\) according to some dispatching algorithm; see below. The processing, or service, times of all jobs are modelled as independent and identically distributed random variables equal in distribution to some random variable \(S\) such that \(\mathbb{E}[S]=1\). We assume that \(S\) follows a Pareto(\(s_m,\alpha\)) distribution where \(s_m\) resp. \(\alpha\) is the scale resp. shape parameter. Since common values of \(\alpha\) range in the interval \([1,2]\), we let \(\alpha=1.5\). On the other hand, \(s_m\) is given by the condition \(\mathbb{E}[S]=1\). Service and inter-arrival times are also assumed mutually independent. After processing, jobs leaves the system and never return.

We are interested in evaluating the performance induced by three dispatching algorithms:

Random (RND): Upon arrival, each job is dispatched to queue \(i\) with probability \(1/2\) independently of all else.
Round-Robin (RR): Upon arrival, the \(n\)-th job is sent to queue \(1+(n\bmod 2)\).
Size-Interval Task Allocation (SITA): Upon arrival, each job is dispatched to queue one if its service time is less than a given threshold \(x\) and otherwise to two.

We are particularly interested in the response time, i.e., the average time that jobs spend in the network. In the following, let \(R_{RND}\), \(R_{RR}\) and \(R_{SITA-x}\) denote the response times obtained with the dispatching algorithms above.

Homework assignment

The goal of this homework assignment is i) to write a code in R that simulates the dynamics of jobs in the multiserver platform described above and ii) to use the code to investigate some practical questions.

Step 1: Simulation code

Modify the code that simulates the dynamics of a G/G/1 queue to simulate two parallel G/G/1 queues as described above; just one RMarkdown file simulating all the three dispatching schemes.

System parameters

N: number of arriving jobs to simulate;
lambda: job arrival rate;
sm,alpha: the parameters of the Pareto distribution, where sm resp. alpha is the scale resp. shape parameter.
mu: the average service rate, i.e., \(1/\mathbb{E}[S]\).
x: threshold value for SITA.

Using the inversion method, note that

N=5;
sm=1;
alpha=1.5;
sm/(runif(N)^(1/alpha));

## [1] 1.330070 1.927783 2.377986 1.215944 3.258546

gives a sample associated to N i.i.d. Pareto distributed random variables.

State variables

t: simulation time;
Arrival: size-\(N\) vector of jobs’ arrival times;
Service: size-\(N\) vector of jobs’ service times;
Queue: size-\(N\) vector indicating where the jobs are dispatched to;
Remaining: size-\(N\) vector indicating jobs’ remaining times;
Completion: size-\(N\) vector indicating jobs’ completion times;
CurrJob: size-\(2\) vector indicating the current job in processing at each server, i.e., CurrJob[i], for \(i=1,2\), takes value in \(\{1,\ldots,N\}\);
NextJob: this variable is incremented by one each time a new job arrives (not necessary but may simplify the code).

Do not hesitate to introduce other variables if needed.

Code structure

# Initialize Arrivals, Service, Queue, Remaining and other variables.

while (TRUE) {
    dtA = ... # time of the next arrival
    dtC = ... # time of the next completion
    if(is.na(dtA) & is.na(dtC)) {break;}
    dt = min(dtA,dtC)

    if((NextArrival <=N) & (Arrival[NextArrival] == t)) 
    {
      # update Remaining and, possibly, and other state variables
    }
    
    for(k in 1:2)
    {
      if(!is.na(CurrJob[k]))
      {
        # update Remaining and, possibly, CurrJob to NA and other state variables
      }
    }
    
    for(k in 1:2)
    {
      if(is.na(CurrJob[k]))
      {
        # assign a job and a copy to server k
        # update state variables
      }
    }

}

Note for the evaluation: The more efficient the code, the better. Efficiency refers to the ability of simulating an increased number of jobs given a time constraint.

Step 2: Theoretical and numerical investigation

Question 1

Specify the stability condition induced by RND, RR and SITA, respectively.
Hint: for RND and SITA, use the Poisson thinning property and for RR, construct the sequence of inter-arrival times at each queue.

Question 2

Plot the response times \(R_{RND}\), \(R_{RR}\) and \(R_{SITA-x}\) as a function of lambda, where lambda varies between 0 and \(95\)% of the size of the stability region obtained by the corresponding dispatching algorithm; one single figure.

Choose the threshold x as you believe is more appropriate and make sure that N is large enough to make your plots representative of the mean time that jobs spend in the network.
Verify that \(R_{RR}\le R_{RND}\) for all \(\lambda\)

Question 3

Assume \(\lambda=1.5\). Among RND, RR and SITA, we want to understand which dispatching algorithm provides the best response time. Towards this purpose, we want to check whether there exists x such that \(R_{SITA-x}\le \min\{R_{RND},R_{RR}\}\). Let us call such xstar such x. Does xstar exists? If yes, compare the stability region induced by such xstar with the one induced by RND and RR.

Question 4 (Optional)

Propose, and possibly implement, an alternative dispatching scheme that may improve the performance achieved by RR, RND and SITA with x=xstar. Explain your intuition.

Load Balancing - DM

Jonatha Anselmi

February 25, 2021