Inverse Transform method

The inverse transform method, otherwise known as inverse CDF method, is a probabilistic technique used to generate random numbers from a desired probability distribution by applying the inverse of the cumulative distribution function to uniformly distributed random numbers. For the continuous case:

Suppose that we have a continuous random variable \(X\) having Cumulative Density Function (CDF) \(F_{X}\). Then, the random variable

\[ U = F_{X}(X) \]

has a Uniform distribution \(U \sim U_{[0, 1]}\). So to generate a random variate \(x\) from the distribution of \(X\), we can use the following transformation

\[ F_{X}^{-1}(U) = x \]

where \(F_{X}^{-1}(.)\) is the inverse CDF or quantile function.

Weibull example

Example We want to generate a sample of \(10,000\) random realizations from a Weibull distribution \(W(5, 2)\) using the Inverse CDF method.

If \(X \sim\mathcal{W}ei(\alpha, \beta)\) having PDF and CDF defined respectively as

\[ \begin{align*} & \textbf{PDF} \ \ \ \ \ f_{X}(x) = \frac{ \alpha }{ \beta } \bigg( \frac{ x }{ \beta }\bigg)^{\alpha - 1} exp \bigg( - \frac{ x }{ \beta } \bigg)^{\alpha } \\ & \textbf{CDF} \ \ \ \ \ F_{X}(x) = 1 - exp \bigg( - \frac{x}{\beta} \bigg)^{\alpha} \\ \end{align*} \]

Weibull example solved

Then by the Inverse CDF method, we can generate realizations of \(X\) by equating \(U = F_{X}(x)\) and solving for \(x\). We have

\[ \begin{align*} U & = 1 - exp \bigg(- \frac{x}{2} \bigg)^{5} \\ 1 - U & = exp \bigg( - \frac{x}{2} \bigg)^{5} \\ - ln (1 - U ) &= \bigg( \frac{x}{2} \bigg)^{5} \\ \big(- ln (1 - U ) \big)^{1/5 } & = \frac{x}{2} \\ 2 \big(- ln (1 - U ) \big)^{1/5} & = x \\ \end{align*} \]

Code for Weibull realizations usingthe inverse CDF in R

# 1. Define Inverse CDF function
Inverse_CDF_Weibull = function(n, alpha, beta) {
   u <- runif(n)
   data <- beta*((-log(1-u))^(1/alpha))
   return(data.frame(data))
}

# 2. Generate realizations of the desired distribution
set.seed(2023)
dataset <- Inverse_CDF_Weibull(n = 1000, alpha = 5, beta = 2)
head(dataset)
##        data
## 1 1.8226044
## 2 1.6719235
## 3 1.4157089
## 4 1.7441408
## 5 0.9975117
## 6 1.3275159

Histogram in R

Then we can visualize the realizations using a histogram and a kernel density estimator in R as follows:

# 3. Plot an Histogram with ggplot2
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.2     ✔ readr     2.1.4
## ✔ forcats   1.0.0     ✔ stringr   1.5.0
## ✔ ggplot2   3.4.2     ✔ tibble    3.2.1
## ✔ lubridate 1.9.2     ✔ tidyr     1.3.0
## ✔ purrr     1.0.1     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
ggplot(dataset, aes(x=data)) + 
  geom_histogram(aes(y = ..density..), color="black", fill="darkred", binwidth = 0.1) +
  geom_density(alpha=.2, color = 'black', lwd = 1.2) +
  labs(title = 'Histogram of Weibull(5,2) realizations',
       subtitle = 'Inverse CDF generated artificial dataset',
       y="count", x="value") +
  theme(axis.text=element_text(size=8),
        axis.title=element_text(size=8),
        axis.text.x=element_text(angle=40, hjust=1),
        plot.subtitle=element_text(size=9, face="italic", color="darkred"),
        panel.background = element_rect(fill = "white", colour = "grey50"),
        panel.grid.major = element_line(colour = "grey90"))
## Warning: The dot-dot notation (`..density..`) was deprecated in ggplot2 3.4.0.
## ℹ Please use `after_stat(density)` instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.

References

Rizzo, M.L. (2019). Statistical Computing with R, Second Edition (2nd ed.). Chapman and Hall/CRC. \ https://doi.org/10.1201/9780429192760

The R Project for Statistical Computing: