1 The Inverse Transform Method

We want to use the simulation way to generate the value of a discrete random variable $X$, which have the probability mass function

\[ P(X=x_i)=p_i, \ i=0,1,\dots \], where $\sum_i p_i=1$.

First, we generate a random number $U$, which follows uniform(0,1).

Set

\[ X=\left\{\begin{array}{ll} x_0 & \text{if } U<p_0, \\ x_1 & \text{if } p_0 \le U < p_0+p_1, \\ \vdots & \\ x_j & \text{if } \sum_{i=0}^{j-1} p_i \le U < \sum_{i=0}^{j} p_i \\ \vdots \end{array}\right. \] Hence,

\[ P(X=x_j)=P\bigg( \sum_{i=0}^{j-1} p_i \le U < \sum_{i=0}^{j} p_i \bigg) = p_j \]

If $x_0<x_1<\dots$, then $X$ will be equal $x_j$ if $F(x_{j-1}) \le U < F(x_j)$. It is called the discrete inverse transform method for generating $X$.

Example 1: Bernoulli random variables

Let $X \sim Ber(0,1)$. Then

\[ P(X=x) = \left\{\begin{array}{ll} 1 & \text{, if } U_i< p, \\ 0 & \text{, if } U_i\ge p, \end{array}\right. \]

, $i=1,2,3,\dots,n$.

ber.fun = function(n,p){
  u = runif(n,0,1)
  x = ifelse(u<=p, 1, 0)
  return(x)
}
res = ber.fun(10000, 0.4)
data.frame('x' = c('failed (0)','success (1)'),
           'Prob.' = c(0.6, 0.4),
           'Freq.' = as.vector(table(res))/10000)

##             x Prob.  Freq.
## 1  failed (0)   0.6 0.6027
## 2 success (1)   0.4 0.3973

Example 2: Arbitrary $p_j$

Let $X$ be a discrete random variable with the pmf

\[ P(X=x)=\left\{\begin{array}{ll} 0.2 & ,x=1 \\ 0.15 & ,x=2 \\ 0.25 & ,x=3 \\ 0.4 & ,x=4 \end{array}\right. \]

Method 1.

discrete.fun1 = function(n){
  draws = rep(NA,n)
  for (i in 1:n) {
    u = runif(1,0,1)
    if (u < 0.2) {
      draws[i] = 1
    }else if(u <0.35){
      draws[i] = 2
    }else if(u <0.6){
      draws[i] = 3
    }else{
      draws[i] = 4
    }
  }
  return(draws)
}
res = discrete.fun1(100000)
data.frame('x' = 1:4,
           'Prob.' = c(0.2,0.15,0.25,0.4),
           'Freq.' = as.vector(table(res)/100000))

##   x Prob.   Freq.
## 1 1  0.20 0.19864
## 2 2  0.15 0.15020
## 3 3  0.25 0.25163
## 4 4  0.40 0.39953

Method 2. (Can be used for arbitrary random variable with 4 events)

discrete.fun2 = function(n, p, x){
  draws = rep(NA,n)
  for (i in 1:n) {
    u = runif(1,0,1)
    if (u < p[1]) {
      draws[i] = x[1]
    }else if(u < sum(p[1:2])){
      draws[i] = x[2]
    }else if(u < sum(p[1:3])){
      draws[i] = x[3]
    }else{
      draws[i] = x[4]
    }
  }
  return(draws)
}
res = discrete.fun2(100000, c(0.2,0.15,0.25,0.4), c(1,2,3,4))
data.frame('x' = 1:4,
           'Prob.' = c(0.2,0.15,0.25,0.4),
           'Freq.' = as.vector(table(res)/100000))

##   x Prob.   Freq.
## 1 1  0.20 0.19807
## 2 2  0.15 0.15040
## 3 3  0.25 0.24979
## 4 4  0.40 0.40174

Method 3. (Most general)

discrete.fun3 = function(n,x,p){
  draws = rep(NA,n)
  for (j in 1:n) {
    u = runif(1,0,1)
    cum.p = p[1]
    i = 1
    while (u >= cum.p) {
      i = i + 1
      cum.p = cum.p + p[i]
    }
    draws[j] = i
  }
  return(draws)
}
res = discrete.fun3(100000, c(1,2,3,4), c(0.2,0.15,0.25,0.4))
data.frame('x' = 1:4,
           'Prob.' = c(0.2,0.15,0.25,0.4),
           'Freq.' = as.vector(table(res)/100000))

##   x Prob.   Freq.
## 1 1  0.20 0.20137
## 2 2  0.15 0.14926
## 3 3  0.25 0.24822
## 4 4  0.40 0.40115

Example 3: Discrete Uniform distribution

Let $X$ follows a discrete uniform distribution. Then,

\[ P(X=x)=\frac{1}{k} \quad , \ x = 1, \dots, k. \]

Note that

\[ P(X \le j) = \frac{j}{k} \]

Then, after generating $U$,

\[ X = j, \quad \text{if} \ \frac{j-1}{k} \le U < \frac{j}{k} \]

It implies

\[ X=j, \quad \text{if} \ j-1 \le kU < j \]

Then,

\[ X = \lceil kU \rceil \]

Therefore, we can improve our algorithm as follows.

discrete.unif = function(n, k){
  u = runif(n,0,1)
  X = ceiling(k*u)
  return(X)
}
res = discrete.unif(100000, 10)
data.frame('x' = 1:10,
           'Prob.' = rep(0.1, 10),
           'Freq.' = as.vector(table(res)/100000))

##     x Prob.   Freq.
## 1   1   0.1 0.10048
## 2   2   0.1 0.09891
## 3   3   0.1 0.09865
## 4   4   0.1 0.10110
## 5   5   0.1 0.10054
## 6   6   0.1 0.09934
## 7   7   0.1 0.10055
## 8   8   0.1 0.09961
## 9   9   0.1 0.10122
## 10 10   0.1 0.09960

Example 4: Geometry random variables

Let $X \sim \text{Geom}(p)$. Then,

\[ P(X=i) = p(1-p)^{i-1}, i=1,2,3,\dots \]

Here, $X$ is the total number of trial when we get the first success. The probability of a success is $p$, and that of a failure is $1-p$. Sometime, we will denoted $1-p$ by $q$. Hence, the cumulative distribution function is

\[ \begin{align} P(X \le j-1) &= 1 - P(X>j-1) \\ &= 1 - P\{\text{First $j-1$ trails are all failed} \} \\ &= 1 - q^{j-1}, \ j \ge 1. \end{align} \]

We can generate $X=j$ by generating $U$, satisfying

\[ P(X \le j-1) \le U < P(X \le j) \]

That is,

\[ 1-q^{j-1} \le U < 1-q^j \]

This implies

\[ q^j < 1-U \le q^{j-1} \]

Therefore,

\[ \begin{align} X &= \min\{j : q^j < 1-U \} \\ &= \min\{j : j\log(q) < \log(1-U) \} \\ &= \min\{j : j > \frac{\log(1-U)}{\log(q)} \} \\ &= \bigg\lceil \frac{\log(1-U)}{\log(q)} \bigg\rceil \end{align} \]

geom.fun1 = function(n, p){
  u = runif(n,0,1)
  X = ceiling(log(1-u)/log(1-p))
  return(X)
}
res = geom.fun1(100000, 0.2)
data.frame('x' = 1:5,
           'Prob.' = dgeom(0:4, 0.2),
           'Freq.' = as.vector(table(res)/100000)[1:5])

##   x   Prob.   Freq.
## 1 1 0.20000 0.20082
## 2 2 0.16000 0.16187
## 3 3 0.12800 0.12787
## 4 4 0.10240 0.10253
## 5 5 0.08192 0.08207

If we use another definition of the geometry distribution, then\[ P(X=i) = p(1-p)^i, i=0,1,2,\dots \]

That is, $X$ is the total number of failure when we get the first success trail. Then,

\[ \begin{align} P(X \le j) &= 1 - P(X>j) \\ &= 1 - P\{\text{First $j$ trails are all failed} \} \\ &= 1 - q^j, \ j \ge 0. \end{align} \]

If $X=j$, then

\[ P(X \le j) \le U < P(X \le j+1) \iff 1 - q^j \le U < 1 - q^{j+1} \iff q^{j+1} < 1-U \le q^j \]

Therefore,

\[ \begin{align} X &= \min\{j : q^{j+1} < 1-U \} \\ &= \min\{j : (j+1)\log(q) < \log(1-U) \} \\ &= \min\{j : j+1 > \frac{\log(1-U)}{\log(q)} \} \\ &= \bigg\lceil \frac{\log(1-U)}{\log(q)} - 1 \bigg\rceil \\ &= \bigg\lfloor \frac{\log(1-U)}{\log(q)} \bigg\rfloor \end{align} \]

geom.fun2 = function(n, p){
  u = runif(n,0,1)
  X = ceiling(log(1-u)/log(1-p) - 1)
  return(X)
}
res = geom.fun2(100000, 0.2)
data.frame('x' = 1:5,
           'Prob.' = dgeom(0:4, 0.2),
           'Freq.' = as.vector(table(res)/100000)[1:5])

##   x   Prob.   Freq.
## 1 1 0.20000 0.19968
## 2 2 0.16000 0.16091
## 3 3 0.12800 0.12808
## 4 4 0.10240 0.10223
## 5 5 0.08192 0.08124

Example 5: Poisson random variables

Let $X\sim Poisson(\lambda)$. \[ P(X=x)=\frac{\lambda^xe^{-\lambda}}{x!}, \ x = 0,1,2,\dots \]

pois.fun1 = function(n, lambda){
  draws = rep(NA,n)
  u = runif(n,0,1) # Or generate every time in the loop
  for (j in 1:n){
    i = 0
    cum.p = exp(-lambda)
    while (u[j] >= cum.p){
      i = i + 1 
      cum.p = cum.p + (exp(-lambda) * lambda^i / factorial(i))
    }
    draws[j] = i
  }
  return (draws)
}
res = pois.fun1(100000, 5)
data.frame('x' = 0:12,
           'Pr' = round(dpois(0:12, 5), 2),
           'Simulation' = round(as.vector(table(res)[1:13])/100000, 2)
           )

##     x   Pr Simulation
## 1   0 0.01       0.01
## 2   1 0.03       0.03
## 3   2 0.08       0.09
## 4   3 0.14       0.14
## 5   4 0.18       0.18
## 6   5 0.18       0.18
## 7   6 0.15       0.15
## 8   7 0.10       0.11
## 9   8 0.07       0.06
## 10  9 0.04       0.04
## 11 10 0.02       0.02
## 12 11 0.01       0.01
## 13 12 0.00       0.00

## The approximate of the mean:  4.99377

By the probability mass function,

\[ \frac{p_{i+1}}{p_i} = \frac{\Pr(X=i+1)}{\Pr(X=i)} = \frac{\frac{\lambda^{i+1}e^{-\lambda}}{(i+1)!}}{\frac{\lambda^ie^{-\lambda}}{i!}} = \frac{\lambda}{i+1} \]

Alternatively, we can improve the algorithm which is the way more efficient.

pois.fun2 = function(n, lambda){
  draws = rep(NA,n)
  u = runif(n,0,1)
  for (j in 1:n){
    i = 0
    p = exp(-lambda)
    cum.p = p
    while (u[j] >= cum.p){
       p = p * lambda/(i+1)
       cum.p = cum.p + p
       i = i + 1
    }
    draws[j] = i
  }
  return (draws)
}
res = pois.fun2(100000, 5)
data.frame('x' = 0:9,
           'Pr' = round(dpois(0:9, 5), 2),
           'Simulation' = round(as.vector(table(res)[1:10])/100000, 2)
           )

##    x   Pr Simulation
## 1  0 0.01       0.01
## 2  1 0.03       0.03
## 3  2 0.08       0.08
## 4  3 0.14       0.14
## 5  4 0.18       0.18
## 6  5 0.18       0.17
## 7  6 0.15       0.15
## 8  7 0.10       0.11
## 9  8 0.07       0.06
## 10 9 0.04       0.04

Example 6: Binomial random variables

Let \[X\sim Bin(n,p)\].

\[ \Pr(X=x)=\binom{n}{x}p^x(1-p)^{n-x}, \ x = 0,1,2,\dots \]

Then, we have

\[ \frac{p_{i+1}}{p_i}=\frac{\binom{n}{i+1}p^{i+1}(1-p)^{n-i-1}}{\binom{n}{i}p^i(1-p)^{n-i}} = \frac{n-i}{i+1}\frac{p}{1-p} \]

That is,

\[ p_{i+1} = \frac{(n-i)p}{(i+1)(1-p)}p_i \]

bin.fun = function(N, n, p){
  draws = rep(NA, N)
  for (j in 1:N){
    i = 0
    pr = (1-p)^n
    Fx = pr
    u = runif(N,0,1)
    while(u[j] >= Fx){
      pr = pr * (n-i)/(i+1)*p/(1-p)
      Fx = Fx + pr
      i = i+1
    }
    draws[j] = i
  }
  return(draws)
}
res = bin.fun(10000, 10, 0.5)
data.frame('x' = 0:10,
           'Pr' = round(dbinom(0:10, 10, 0.5), 2),
           'Simulation' = round(as.vector(table(res)[1:11])/10000, 2)
           )

##     x   Pr Simulation
## 1   0 0.00       0.00
## 2   1 0.01       0.01
## 3   2 0.04       0.05
## 4   3 0.12       0.12
## 5   4 0.21       0.20
## 6   5 0.25       0.25
## 7   6 0.21       0.20
## 8   7 0.12       0.12
## 9   8 0.04       0.04
## 10  9 0.01       0.01
## 11 10 0.00       0.00

2 The Composite Approach

Let $X$ follow a random variable with mixture distribution.

\[ P(X=j) = \alpha p_j^{(1)} + (1-\alpha) p_j^{(2)}, \quad j = 1,2, \dots, \]

where $0<\alpha<1$. Or, it can be expressed as

\[ X = \left\{\begin{array}{l} X_1 \quad \text{with probability} \ \alpha, \\ X_2 \quad \text{with probability} \ 1-\alpha, \end{array}\right. \]

where $X_1$ and $X_2$ are random variables having respective mass functions $\{p_j^{(1)} \}$ and $\{p_j^{(2)} \}$.

Example 7:

Let $X$ be a random variable with pmf

\[ p_j = P(X=j) = \left\{\begin{array}{ll} 0.05 & j=1,2,3,4,5 \\ 0.15 & j=6,7,8,9,10. \end{array}\right. \]

By noting that $p_j=0.5p_j^{(1)}+0.5p_j^{(2)}$, where

\[ p_j^{(1)} = 0.1 \quad , \ j=1,2,\dots,10, \\ p_j^{(2)} = 0.2 \quad , \ j=6,7,8,9,10. \]

mix.fun = function(n, alpha, X1, X2){
  X = rep(NA, n)
  for (i in 1:n){
    u1 = runif(1,0,1)
    u2 = runif(1,0,1)
    if (u1 < alpha){
      index = ceiling(length(X1) * u2)
      X[i] = X1[index]
    }else{
      index = ceiling(length(X2) * u2)
      X[i] = X2[index]
    }
  }
  return(X)
}
res.H = mix.fun(100000, 0.5, 1:10, 6:10)
res.V = mix.fun(100000, 0.25, 1:5, 6:10)
data.frame('X' = 1:10, 
           'Prob.' = c(rep(0.05,5), rep(0.15, 5)),
           'Freq.Hor' = round(as.vector(table(res.H))/100000, 2), 
           'Freq.Vert' = round(as.vector(table(res.V))/100000, 2)
           )

##     X Prob. Freq.Hor Freq.Vert
## 1   1  0.05     0.05      0.05
## 2   2  0.05     0.05      0.05
## 3   3  0.05     0.05      0.05
## 4   4  0.05     0.05      0.05
## 5   5  0.05     0.05      0.05
## 6   6  0.15     0.15      0.15
## 7   7  0.15     0.15      0.15
## 8   8  0.15     0.15      0.15
## 9   9  0.15     0.15      0.15
## 10 10  0.15     0.15      0.15

Example 8:

Let $X$ be a mixture random variable with the pmf

$X=x_j$	1	2	3	4	5	6	7	8	9	10
$p_j$	0.06	0.06	0.06	0.06	0.06	0.15	0.13	0.14	0.15	0.13

mix.fun2 = function(n, support, prob){
  
  p = unique(prob)  # The catagory of prob.
  num.support = length(support) # The num of the place that have nonzero prob.
  num.group = length(p) # The num of the catagories of prob.
  
  alpha = rep(NA, num.group)
  cum.p = rep(NA, num.group)
  group = list(NA)
  
  for (i in 1:num.group) {
    index.in.prob = which(prob == p[i]) # Find those j's s.t. P(X=xj) are the same
    p.group = prob[index.in.prob]
    group[[i]] = support[index.in.prob] # Grouping support
    alpha[i] = sum(p.group) # The probability of each group
    cum.p[i] = sum(alpha[1:i])
  }
  
  X = rep(NA, n)
  for (j in 1:n){
    u1 = runif(1,0,1)
    u2 = runif(1,0,1)
    k = min(which(u1 < cum.p))
    index.in.group = ceiling(length(group[[k]]) * u2)
    X[j] = group[[k]][index.in.group]
  }
  return(X)
}
res = mix.fun2(100000, 1:10, c(rep(0.06,5), 0.15, 0.13, 0.14, 0.15, 0.13))
data.frame('x' = 1:10,
           'Prob.' = c(rep(0.06,5), 0.15, 0.13, 0.14, 0.15, 0.13),
           'Relative Freq.' = round(as.vector(table(res))/100000, 2) )

##     x Prob. Relative.Freq.
## 1   1  0.06           0.06
## 2   2  0.06           0.06
## 3   3  0.06           0.06
## 4   4  0.06           0.06
## 5   5  0.06           0.06
## 6   6  0.15           0.15
## 7   7  0.13           0.13
## 8   8  0.14           0.14
## 9   9  0.15           0.15
## 10 10  0.13           0.13

3 The Acceptance-Rejection Method

Suppose that we want to generate $X$ random variable from a discrete distribution with the pmf

\[ P\{ X=j \} = p_j. \]

Suppose that we can easily generate a random variable, $Y$, with its pmf

\[ P\{ Y=j \} = q_j, \]

such that

\[ \frac{P(X=j)}{P(Y=j)} = \frac{p_j}{q_j} \le c, \ \text{for all} \ j. \]

It can be further expressed as

\[ c \ge \sup_j \{ p_j/q_j \}. \]

Theorem. The acceptance-rejection algorithm generates a random variable $X$ with its pmf $P(X =j) =p_j$. In addition, the number of iterations of the algorithm is a geometric random variable with mean $c$.

proof:

When we easily generate a random variable, $Y$ , with its pmf $P(Y = j) = q_j$, then determine the probability that a single iteration produces the accepted value $j$.

\[ P(Y = j \text{ and } U < p_j/cq_j) =P(U < p_j/cq_j \mid Y =j) P(Y = j) = \frac{p_j}{cq_j}qj = \frac{p_j}{c} \]

Then,

\[ P\{ Acceptance \} = \sum_j P\{ Y=j \ \text{and} \ U<\frac{p_j}{cq_j} \} = \frac{1}{c} \]

Hence, each iteration produces an accepted value with probability $\frac{1}{c}$ and it is a geometric random variable with mean $c$.

In addition,

\[ P( X=j ) = P(Y=j \mid U< \frac{p_j}{cq_j}) = \frac{p_j/c}{1/c} = p_j. \]

Example 9:

$X$ is a random variable with pmf

$x_j$	1	2	3	4	5	6	7	8	9	10
$p_j$	0.11	0.12	0.09	0.08	0.12	0.1	0.09	0.09	0.1	0.1
$q_j$	0.1	0.1	0.1	0.1	0.1	0.1	0.1	0.1	0.1	0.1
$p_j/q_j$	1.1	1.2	0.9	0.8	1.2	1	0.9	0.9	1	1
Acceptance Rate	$\frac{1.1}{1.2}$	1	$\frac{0.9}{1.2}$	$\frac{0.8}{1.2}$	1	$\frac{1}{1.2}$	$\frac{0.9}{1.2}$	$\frac{0.9}{1.2}$	$\frac{1}{1.2}$	$\frac{1}{1.2}$

We generate $Y$ from a discrete uniform distribution with $q_j=0.1, \ j=1,2,\dots, 10$. Then, c = 1.2.

Next step, if $U \le p_j/(1.2 \times 0.1)$, set $X=Y$.

AR.method = function(n, support, p){
  num.support = length(support)
  q = 1/num.support
  c = max( p/q )
  acceptance.rate = p/(q*c)
  
  X = rep(NA, n)
  iter.accept = rep(1,n)
  for (i in 1:n) {
    index = ceiling(num.support * runif(1))
    Y = support[index]
    u = runif(1,0,1)
    
    while (u >= acceptance.rate[index]) {
      index = ceiling(num.support * runif(1))
      Y = support[index]
      u = runif(1,0,1)
      iter.accept[i] = iter.accept[i] + 1
    }
    
    X[i] = Y
  }
  res = list()
  res$X = X
  res$iter.accept = iter.accept
  return (res)
}

res = AR.method(100000, 1:10, c(0.11,0.12,0.09,0.08,0.12,0.1,0.09,0.09,0.1,0.1))
data.frame('x' = 1:10,
           'Prob.' = c(0.11,0.12,0.09,0.08,0.12,0.1,0.09,0.09,0.1,0.1),
           'Relative Freq.' = round(as.vector(table(res$X))/100000, 3) )

##     x Prob. Relative.Freq.
## 1   1  0.11          0.112
## 2   2  0.12          0.120
## 3   3  0.09          0.088
## 4   4  0.08          0.080
## 5   5  0.12          0.119
## 6   6  0.10          0.098
## 7   7  0.09          0.091
## 8   8  0.09          0.090
## 9   9  0.10          0.103
## 10 10  0.10          0.099

## The mean of the number of the iteration is 1.19768

Chapter 4