1. Introduction

The beta distribution is a continuous probability distribution defined on the interval (0, 1). Characterized by two positive shape parameters, alpha (α) and beta (β), it is a versatile tool for modeling random variables that represent proportions or probabilities. The shape of the distribution is determined by these parameters, allowing for a wide range of curves from U-shaped to J-shaped and everything in between.

1.1 History

While the exact origins of the beta distribution are somewhat obscure, its mathematical foundations can be traced back to the works of mathematicians like Euler and Legendre in the 18th century. However, its application in probability and statistics gained prominence in the 20th century with the development of Bayesian inference. The conjugate relationship between the beta distribution and the binomial distribution, discovered by mathematicians like Harold Jeffreys and Bruno de Finetti, solidified its role in statistical modeling.

1.2 Applications

The beta distribution finds applications in numerous fields due to its ability to model proportions and probabilities. Here are some key areas:

  • Bayesian Statistics:

    • As a conjugate prior for the Bernoulli, binomial, negative binomial, and geometric distributions, the beta distribution is widely used in Bayesian inference to represent prior beliefs about parameters.

    • It’s employed in various applications, including A/B testing, clinical trials, and reliability analysis.

  • Machine Learning:

    • Beta distribution can be used to model the probability of events in classification problems.

    • It’s also used in reinforcement learning to represent uncertainty about the value of actions.

  • Finance: Modeling default probabilities, credit risk, and option pricing.

  • Quality Control: Analyzing process capability and control charts.

  • Reliability Engineering: Modeling failure rates and time-to-failure distributions.

  • Other Applications: Weather forecasting, economics, and social sciences.

2. Probability Density Function (PDF) of the Beta Distribution

2.1 Shape, Domain, and Parameters

The probability density function (PDF) of the beta distribution is defined as

\[ f(x; \alpha, \beta) = \frac{x^{\alpha-1} (1-x)^{\beta-1}}{B(\alpha, \beta)} \]

where:

  • x is the random variable, with 0 <= x <= 1

  • α and β are shape parameters, both greater than 0

  • B(α, β) is the beta function, a normalization constant

The beta distribution is defined on the interval (0, 1), making it suitable for modeling probabilities or proportions. The shape of the distribution is determined entirely by the parameters α and β:

  • α = β = 1: Uniform distribution on (0, 1).

  • α > 1, β > 1: Unimodal distribution with a peak within the interval (0, 1). As α and β increase, the peak becomes sharper and more centered.

  • α < 1, β < 1: U-shaped distribution with peaks at the endpoints of the interval.

  • α > 1, β < 1: Right-skewed distribution.

  • α < 1, β > 1: Left-skewed distribution.

2.2 Visualization of the Beta Distribution

The following R code demonstrates the flexibility of the beta distribution by plotting PDFs for different parameter values:

suppressWarnings({library(ggplot2)})

# Function to plot beta distribution
plot_beta <- function(alpha, beta) {
  x <- seq(0, 1, length.out = 100)
  y <- dbeta(x, alpha, beta)
  data <- data.frame(x = x, y = y)
  ggplot(data, aes(x, y)) +
    geom_line() +
    labs(title = paste("Beta Distribution (alpha =", alpha, ", beta =", beta, ")"),
         x = "x", y = "Density")
}

# Plot examples
plot_beta(2, 2)  # Symmetric, bell-shaped

plot_beta(4, 2)  # Right-skewed

plot_beta(2, 4)  # Left-skewed

plot_beta(0.5, 0.5)  # U-shaped

3. Distribution and Survival Functions of the Beta Distribution

3.1 Distribution Function (CDF)

The cumulative distribution function (CDF) of a continuous random variable X, denoted F(x), is the probability that X takes a value less than or equal to x. For the beta distribution, the CDF is given by the incomplete beta function ratio:

\[ F(x; α, β) = B(x; α, β) / B(α, β) \]

where:

  • B(x; α, β) is the incomplete beta function

  • B(α, β) is the complete beta function

In R, the CDF can be calculated using the pbeta function.

3.2 Survival Function (SF)

The survival function (SF), also known as the complementary cumulative distribution function (CCDF), is the probability that X takes a value greater than x. It is defined as:

\[ S(x; α, β) = 1 - F(x; α, β) = 1 - B(x; α, β) / B(α, β) \]

In R, the SF can be calculated using the pbeta function with the lower.tail = FALSE argument.

3.3 Plotting and Characteristics

Let’s plot the CDF and SF for a beta distribution with parameters α = 2 and β = 3:

suppressWarnings({library(ggplot2)})

# Function to plot CDF and SF
plot_beta_cdf_sf <- function(alpha, beta) {
  x <- seq(0, 1, length.out = 100)
  cdf <- pbeta(x, alpha, beta)
  sf <- pbeta(x, alpha, beta, lower.tail = FALSE)
  data <- data.frame(x = x, cdf = cdf, sf = sf)
  ggplot(data, aes(x = x)) +
    geom_line(aes(y = cdf, color = "CDF")) +
    geom_line(aes(y = sf, color = "SF")) +
    labs(title = paste("Beta Distribution CDF and SF (alpha =", alpha, ", beta =", beta, ")"),
         x = "x", y = "Probability") +
    scale_color_manual(values = c("CDF" = "blue", "SF" = "red"))
}

# Plot CDF and SF
plot_beta_cdf_sf(2, 3)

3.4 Characteristics of the CDF and SF

  • Both CDF and SF are increasing functions on the interval (0, 1).

  • The CDF approaches 0 as x approaches 0 and approaches 1 as x approaches 1.

  • The SF approaches 1 as x approaches 0 and approaches 0 as x approaches 1.

  • The sum of the CDF and SF at any point x is equal to 1.

The shapes of the CDF and SF depend on the values of α and β. For example, if α > 1 and β > 1, the CDF is S-shaped and the SF is concave down.

4. Parameter Estimation for the Beta Distribution

4.1 Method of Moments

A common method for estimating parameters is the method of moments. It involves equating the theoretical moments of the distribution to the sample moments. For the beta distribution, the first two moments are:

  • Mean: \(E(X) = \frac{\alpha}{\alpha + \beta}\)

  • Variance: \(\text{Var}(X) = \frac{\alpha \beta}{(\alpha + \beta)^2 (\alpha + \beta + 1)}\)

Given a sample of data, x1, x2, …, xn, we can calculate the sample mean and variance:

  • Sample mean : \(\bar{x} = \frac{\sum x_i}{n}\).

  • Sample variance: \(s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}\).

Equating the theoretical and sample moments, we obtain a system of two equations with two unknowns (α and β). Solving these equations simultaneously gives the estimates for α and β.

4.1.1 Limitations of the Method of Moments

While the method of moments is simple to implement, it can be inefficient and sensitive to outliers. Other methods, such as maximum likelihood estimation, often provide more accurate estimates.

5. R Code to Obtain Parameter Estimates

# Sample data
set.seed(123)
sample_data <- rbeta(100, 2, 5)  # Generating a sample with known parameters for demonstration

# Method of Moments Estimators
sample_mean <- mean(sample_data)
sample_var <- var(sample_data)

# Solving for alpha and beta
moment_equations <- function(params) {
  alpha <- params[1]
  beta <- params[2]
  
  eq1 <- sample_mean - (alpha / (alpha + beta))
  eq2 <- sample_var - (alpha * beta) / ((alpha + beta)^2 * (alpha + beta + 1))
  
  return(c(eq1, eq2))
}

initial_guess <- c(1, 1)
estimates <- nleqslv::nleqslv(initial_guess, moment_equations)$x

alpha_hat <- estimates[1]
beta_hat <- estimates[2]

cat("Estimated alpha:", alpha_hat, "\n")
## Estimated alpha: 2.391231
cat("Estimated beta:", beta_hat, "\n")
## Estimated beta: 6.013886

6. Estimator Characteristics Through Simulation

To study the characteristics of the estimators, we can perform a simulation:

# Simulation parameters
n_simulations <- 1000
sample_size <- 100

# True parameters
true_alpha <- 2
true_beta <- 5

# Vectors to store estimates
alpha_estimates <- numeric(n_simulations)
beta_estimates <- numeric(n_simulations)

# Simulation loop
for (i in 1:n_simulations) {
  sample_data <- rbeta(sample_size, true_alpha, true_beta)
  sample_mean <- mean(sample_data)
  sample_var <- var(sample_data)
  
  moment_equations <- function(params) {
    alpha <- params[1]
    beta <- params[2]
    
    eq1 <- sample_mean - (alpha / (alpha + beta))
    eq2 <- sample_var - (alpha * beta) / ((alpha + beta)^2 * (alpha + beta + 1))
    
    return(c(eq1, eq2))
  }
  
  initial_guess <- c(1, 1)
  estimates <- nleqslv::nleqslv(initial_guess, moment_equations)$x
  
  alpha_estimates[i] <- estimates[1]
  beta_estimates[i] <- estimates[2]
}

# Analyzing the results
alpha_mean <- mean(alpha_estimates)
beta_mean <- mean(beta_estimates)

alpha_sd <- sd(alpha_estimates)
beta_sd <- sd(beta_estimates)

cat("True alpha:", true_alpha, "\n")
## True alpha: 2
cat("True beta:", true_beta, "\n")
## True beta: 5
cat("Mean of estimated alpha:", alpha_mean, "\n")
## Mean of estimated alpha: 2.022752
cat("Mean of estimated beta:", beta_mean, "\n")
## Mean of estimated beta: 5.05223
cat("Standard deviation of estimated alpha:", alpha_sd, "\n")
## Standard deviation of estimated alpha: 0.29692
cat("Standard deviation of estimated beta:", beta_sd, "\n")
## Standard deviation of estimated beta: 0.7779181
# Plotting the estimates
hist(alpha_estimates, main = "Histogram of Alpha Estimates", xlab = "Alpha Estimates", breaks = 30)

hist(beta_estimates, main = "Histogram of Beta Estimates", xlab = "Beta Estimates", breaks = 30)

This code simulates data from a beta distribution with known parameters, estimates parameters using both methods for each simulated dataset, and stores the results. You can then analyze the distribution of estimated parameters, calculate biases, variances, and other statistics to compare the performance of the two estimation methods.

7. Application

Describe at least one application of the selected probability distribution, including plotting and parameter estimation:

Example: Estimating Click-Through Rates (CTR) Using Moment Estimators

Suppose you are running an online advertisement campaign and you want to estimate the click-through rates (CTR) for two different ads (Ad A and Ad B). You can use the Beta distribution to model and compare the CTRs based on observed data. This time, we will use the method of moments to estimate the parameters of the Beta distribution.

R Code to Estimate Parameters and Plot the Distributions

# Load required packages
suppressPackageStartupMessages({
  library(ggplot2)
  library(tidyr)
  library(dplyr)
})


# Data for Ad A
clicks_A <- 50
impressions_A <- 1000

# Data for Ad B
clicks_B <- 30
impressions_B <- 800

# Sample mean and variance for Ad A
mean_A <- clicks_A / impressions_A
var_A <- mean_A * (1 - mean_A) / (impressions_A + 1)

# Sample mean and variance for Ad B
mean_B <- clicks_B / impressions_B
var_B <- mean_B * (1 - mean_B) / (impressions_B + 1)

# Method of moments estimates for Ad A
alpha_A <- mean_A * ((mean_A * (1 - mean_A) / var_A) - 1)
beta_A <- (1 - mean_A) * ((mean_A * (1 - mean_A) / var_A) - 1)

# Method of moments estimates for Ad B
alpha_B <- mean_B * ((mean_B * (1 - mean_B) / var_B) - 1)
beta_B <- (1 - mean_B) * ((mean_B * (1 - mean_B) / var_B) - 1)

# Create a sequence of x values from 0 to 1
x <- seq(0, 1, length.out = 1000)

# Compute the PDF of the Beta distribution for both ads
pdf_A <- dbeta(x, alpha_A, beta_A)
pdf_B <- dbeta(x, alpha_B, beta_B)

# Create a data frame for plotting
data <- data.frame(x = x, Ad_A = pdf_A, Ad_B = pdf_B)
data_melt <- pivot_longer(data, cols = c('Ad_A', 'Ad_B'), names_to = 'Ad', values_to = 'Density')

# Plot the posterior distributions
ggplot(data_melt, aes(x = x, y = Density, color = Ad, fill = Ad)) +
  geom_line(linewidth = 1) +
  geom_area(alpha = 0.3) +
  labs(title = 'Estimated Distributions of Click-Through Rates Using Moment Estimator',
       x = 'Click-Through Rate',
       y = 'Density',
       color = 'Ad',
       fill = 'Ad') +
  theme_minimal()

8. Goodness-of-Fit for Beta Distribution

Goodness-of-Fit for Beta Distribution Using Chi-Square Test

Overview

The Chi-Square test is a statistical method used to evaluate the goodness-of-fit for a distribution by comparing observed frequencies with expected frequencies. In the context of the Beta distribution, we can bin the observed data, calculate the expected frequencies for each bin based on the fitted Beta distribution, and then perform the Chi-Square test.

Example Data

Suppose we have the following observed data (proportion of successes in 10 trials):

observed_data=[0.1,0.2,0.3,0.5,0.7,0.9]

Steps in the Chi-Square Goodness-of-Fit Test

  1. Estimate Parameters: Estimate the parameters of the Beta distribution using the method of moments.

  2. Bin the Data: Divide the observed data into bins.

  3. Calculate Expected Frequencies: Compute the expected frequencies for each bin based on the fitted Beta distribution.

  4. Perform Chi-Square Test: Compare the observed and expected frequencies to calculate the Chi-Square statistic and p-value.

R Code for Chi-Square Goodness-of-Fit Test

# Load required packages
suppressPackageStartupMessages({
  library(MASS)
  library(ggplot2)
})
suppressWarnings({library(fitdistrplus)})
## Loading required package: survival
# Example observed data
observed_data <- c(0.1, 0.2, 0.3, 0.5, 0.7, 0.9)

# Estimate the parameters of the Beta distribution using method of moments
alpha_est <- mean(observed_data) * ((mean(observed_data) * (1 - mean(observed_data)) / var(observed_data)) - 1)
beta_est <- (1 - mean(observed_data)) * ((mean(observed_data) * (1 - mean(observed_data)) / var(observed_data)) - 1)

# Fit Beta distribution to the data
fit <- fitdist(observed_data, "beta", start = list(shape1 = alpha_est, shape2 = beta_est))

# Bin the data
bins <- cut(observed_data, breaks = seq(0, 1, length.out = 6))
observed_frequencies <- table(bins)

# Calculate expected frequencies
bin_edges <- seq(0, 1, length.out = 6)
expected_frequencies <- diff(pbeta(bin_edges, fit$estimate[1], fit$estimate[2])) * length(observed_data)

# Perform Chi-Square Test
chi_square_stat <- sum((observed_frequencies - expected_frequencies)^2 / expected_frequencies)
chi_square_p_value <- 1 - pchisq(chi_square_stat, df = length(observed_frequencies) - 1 - 2)

# Print results
cat("Chi-Square Test Statistic:", chi_square_stat, "\n")
## Chi-Square Test Statistic: 0.8087095
cat("Chi-Square Test P-Value:", chi_square_p_value, "\n")
## Chi-Square Test P-Value: 0.6674073
# Plot histogram and fitted PDF
ggplot() +
  geom_histogram(aes(x = observed_data, y = after_stat(density)), bins = 10, fill = "lightblue", alpha = 0.7) +
  stat_function(fun = dbeta, args = list(shape1 = fit$estimate[1], shape2 = fit$estimate[2]), color = "red", linewidth = 1) +
  labs(title = "Histogram and Fitted Beta PDF",
       x = "Observed Data",
       y = "Density") +
  theme_minimal()