Probability Integral Transformation

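# If X ~ Exp(lambda) with CDF F_X, then Y = F_X(X) should be U(0,1).
n <- 1000
lambda <- 5
x <- rexp(n, lambda)
y <- 1 - exp(-lambda*x)                  # apply the exponential CDF to each sample
F_y <- cumsum(table(y))/sum(table(y))    # empirical CDF at the sorted values of y
par(mfrow=c(1,3))
hist(x, col='blue')
# Plot F_y against the sorted y values (stored in its names), not the index.
plot(as.numeric(names(F_y)), F_y, col='green', pch=19, xlab='y', ylab=expression('F'['Y']), main=expression(paste('Y=1-e'^ paste('-',lambda,'X'))))
plot(ecdf(y), col='red', pch=19, xlab='y', ylab='ECDF(y)')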

Application

  • Suppose the only random number generator at our disposal produces i.i.d. samples from \(X \sim U(0,1)\), and we want to use it to generate i.i.d. samples from some other distribution, say \(Y \sim Exp(\lambda=5)\), \(Y \sim Geom(p=0.3)\), or \(Y \sim Laplace(\mu=0, b=4)\).

  • We can transform a \(U(0,1)\) random variable with the inverse CDF of the target distribution; the result is a random variable whose CDF is that of the target.
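A one-line justification, as a sketch: assume the target CDF \(F\) is continuous and strictly increasing, so that \(F^{-1}\) exists, and write \(U\) for a \(U(0,1)\) random variable (the symbol \(U\) is introduced here only for illustration). Then

\(P(F^{-1}(U) \leq y) = P(U \leq F(y)) = F(y)\)

where the last step uses \(P(U \leq t) = t\) for \(t \in [0,1]\). So \(F^{-1}(U)\) has CDF \(F\).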

  • Let’s use this fact to draw \(n=1000\) samples from \(Y \sim Exp(\lambda=5)\) with the probability integral transform, using only samples from \(X \sim U(0,1)\) drawn with the R function runif.

    • First draw \(n\) samples from \(X \sim U(0,1)\).
    • Transform them as \(Y=F_Y^{-1}(X)\) with the inverse CDF of the exponential distribution, \(F_Y^{-1}(x)=-\frac{1}{\lambda}\ln(1-x)\).
    • Now \(Y\) has the CDF \(F_Y(y)=1-e^{-\lambda y}\), i.e. \(Y \sim Exp(\lambda)\).
  • Let’s compare the histogram of the samples drawn from \(Y \sim Exp(\lambda=5)\) using the probability integral transform with that of samples drawn using the R function rexp. As expected, the histograms look almost exactly the same, as can be seen from the following figure. Also, the times taken to draw 10000 such samples are quite comparable.
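The exponential case can be sketched in a few lines of R. This is a minimal illustration rather than the exact code behind the figure; the seed, the variable names y_pit and y_ref, and the number of histogram breaks are assumptions made here.

```r
# Inverse transform sampling for Exp(lambda), using only runif.
set.seed(1)                        # arbitrary seed, for reproducibility only
n <- 1000
lambda <- 5
u <- runif(n)                      # U(0,1) draws
y_pit <- -log(1 - u) / lambda      # inverse CDF of Exp(lambda) applied to u
y_ref <- rexp(n, rate = lambda)    # reference draws from the built-in sampler

# Side-by-side histograms of the two samples
par(mfrow = c(1, 2))
hist(y_pit, breaks = 30, col = 'blue', main = 'Inverse transform', xlab = 'y')
hist(y_ref, breaks = 30, col = 'green', main = 'rexp', xlab = 'y')

# Both sample means should be close to the theoretical mean 1/lambda
c(mean(y_pit), mean(y_ref))
```

Wrapping each sampling line in system.time() is one way to compare the time taken by the two approaches.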

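  • Similarly, let’s use the probability integral transform to draw samples from \(Y \sim Geom(p=0.1)\) using only \(X \sim U(0,1)\). The geometric distribution has the CDF \(F_Y(y)=1-(1-p)^y\), so the inverse CDF is \(F_Y^{-1}(x)=\frac{\ln(1-x)}{\ln(1-p)}\). Because the geometric distribution is discrete, this value has to be rounded to an integer: rounding up gives the number of trials up to and including the first success (support \(1,2,\ldots\)), while rounding down gives the number of failures before the first success (support \(0,1,\ldots\)), which is the convention rgeom uses. We then compare with samples drawn using the R function rgeom. As expected, the histograms look almost exactly the same, as can be seen from the following figure. Also, the times taken to draw 10000 such samples are quite comparable.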
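A minimal sketch of the geometric case follows. Note the floor(): rgeom counts the failures before the first success (support 0, 1, 2, ...), so the continuous value from the inverse CDF is rounded down to match that convention. The seed and variable names are assumptions made for illustration.

```r
# Inverse transform sampling for Geom(p), using only runif.
set.seed(1)                        # arbitrary seed, for reproducibility only
n <- 1000
p <- 0.1
u <- runif(n)
# log(1 - u) / log(1 - p) is continuous; floor() maps it to 0, 1, 2, ...
y_pit <- floor(log(1 - u) / log(1 - p))
y_ref <- rgeom(n, prob = p)        # reference draws from the built-in sampler

# Side-by-side histograms of the two samples
par(mfrow = c(1, 2))
hist(y_pit, breaks = 30, col = 'blue', main = 'Inverse transform', xlab = 'y')
hist(y_ref, breaks = 30, col = 'green', main = 'rgeom', xlab = 'y')

# Both sample means should be close to the theoretical mean (1 - p) / p
c(mean(y_pit), mean(y_ref))
```

Using ceiling() instead of floor() would give the trial-count convention (support 1, 2, ...), which is shifted by one relative to rgeom.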

  • Again, let’s use the probability integral transform to draw samples from \(Y \sim Laplace(\mu=0, b=4)\) using only \(X \sim U(0,1)\), transformed with the following inverse CDF

\(F_Y^{-1}(x)=\left\{ \begin{array}{ll} \mu+b\ln(2x) & x < \frac{1}{2} \\ \mu-b\ln(2-2x) & x \geq \frac{1}{2} \\ \end{array} \right.\)

since the Laplace distribution has the following CDF

\(F_Y(y)=\left\{ \begin{array}{ll} \frac{1}{2}\exp\left(\frac{y-\mu}{b}\right) & y < \mu \\ 1-\frac{1}{2}\exp\left(-\frac{y-\mu}{b}\right) & y \geq \mu \\ \end{array} \right.\)

  • Then compare with samples drawn using the R function rlaplace (provided by add-on packages rather than base R). As expected, the histograms look almost exactly the same, as can be seen from the following figure. Also, the times taken to draw 10000 such samples are quite comparable.
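A minimal sketch of the Laplace case is given below. Since base R has no rlaplace, this sketch checks the samples against the theoretical Laplace density instead of a second sampler. The helper name qlaplace_sketch is hypothetical and defined here, not taken from any package; the seed is arbitrary.

```r
# Inverse transform sampling for Laplace(mu, b), using only runif.
set.seed(1)                        # arbitrary seed, for reproducibility only
n <- 1000
mu <- 0
b <- 4

# Hypothetical helper: inverse CDF of Laplace(mu, b).
# The branch depends on whether u lies below or above 1/2.
qlaplace_sketch <- function(u, mu, b) {
  ifelse(u < 0.5, mu + b * log(2 * u), mu - b * log(2 - 2 * u))
}

y_pit <- qlaplace_sketch(runif(n), mu, b)

# Histogram on the density scale, with the theoretical density overlaid
hist(y_pit, breaks = 40, freq = FALSE, col = 'blue',
     main = 'Inverse transform', xlab = 'y')
curve(exp(-abs(x - mu) / b) / (2 * b), add = TRUE, col = 'red', lwd = 2)
```

If a package providing rlaplace is installed, its samples can be plotted next to y_pit in the same way as for rexp and rgeom.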

  • Finally, suppose we don’t have the function rnorm but only the inverse CDF qnorm, and we want to sample from a normal distribution with a given mean and variance using the probability integral transform.

  • Then let’s compare the histogram with the one generated using rnorm. As can be seen, they look almost exactly the same. Also, the times taken to draw 10000 such samples are quite comparable.
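A minimal sketch of the normal case: uniform draws are passed through qnorm, which takes the standard deviation rather than the variance. The mean and standard deviation used here (2 and 3) are hypothetical values chosen for illustration, as are the seed and variable names.

```r
# Sampling from N(mu, sigma^2) with qnorm and runif only.
set.seed(1)                        # arbitrary seed, for reproducibility only
n <- 1000
mu <- 2                            # hypothetical mean
sigma <- 3                         # hypothetical standard deviation
u <- runif(n)
y_pit <- qnorm(u, mean = mu, sd = sigma)   # inverse CDF applied to U(0,1) draws
y_ref <- rnorm(n, mean = mu, sd = sigma)   # reference draws from rnorm

# Side-by-side histograms of the two samples
par(mfrow = c(1, 2))
hist(y_pit, breaks = 30, col = 'blue', main = 'Inverse transform', xlab = 'y')
hist(y_ref, breaks = 30, col = 'green', main = 'rnorm', xlab = 'y')

# Sample mean and standard deviation should be close to mu and sigma
c(mean(y_pit), sd(y_pit))
```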