final_project

Problem: 1 Probability Density 1: X~Gamma. Using R, generate a random variable X that has 10,000 random Gamma pdf values. A Gamma pdf is completely describe by n (a size parameter) and lambda (l , a shape parameter). Choose any n greater 3 and an expected value (l) between 2 and 10 (you choose).

Probability Density 2: Y~Sum of Exponentials. Then generate 10,000 observations from the sum of n exponential pdfs with rate/shape parameter (l). The n and l must be the same as in the previous case. (e.g., mysum=rexp(10000,l)+rexp(10000,l)+..)

Probability Density 3: Z~ Exponential. Then generate 10,000 observations from a single exponential pdf with rate/shape parameter (l).

NOTE: The Gamma distribution is quite common in data science. For example, it is used to model failures for multiple processes when each of those processes has the same failure rate. The exponential is used for constant failure rates, service times, etc.

#Set values for n and lambda
n <- 5
lambda <- 7

#Generate 10,000 random variables from Gamma distribution
X <- rgamma(10000, n, lambda)

#Generate 10,000 random variables from sum of exponential distributions
Y <- sum(rexp(10000, lambda), n)

#Generate 10,000 random variables from exponential distribution
Z <- rexp(10000, lambda)

1a. Calculate the empirical expected value (means) and variances of all three pdfs.

#Calculate empirical means and variances
mean_X <- mean(X)
mean_X

## [1] 0.7127722

var_X <- var(X)
var_X

## [1] 0.09995861

mean_Y <- mean(Y)
mean_Y

## [1] 1438.283

var_Y <- var(Y)
var_Y

## [1] NA

mean_Z <- mean(Z)
mean_Z

## [1] 0.1387415

var_Z <- var(Z)
var_Z

## [1] 0.01958095

1b. Using calculus, calculate the expected value and variance of the Gamma pdf (X). Using the moment generating function for exponentials, calculate the expected value of the single exponential (Z) and the sum of exponentials (Y)

#Calculate expected values and variances analytically
exp_X <- n / lambda
exp_X

## [1] 0.7142857

var_X <- n / lambda^2
var_X

## [1] 0.1020408

exp_Y <- n / lambda
exp_Y

## [1] 0.7142857

var_Y <- n / lambda^2
var_Y

## [1] 0.1020408

exp_Z <- 1 / lambda
exp_Z

## [1] 0.1428571

var_Z <- 1 / lambda^2
var_Z

## [1] 0.02040816

1c-e. Probability. For pdf Z (the exponential), calculate empirically probabilities a through c. Then evaluate through calculus whether the memoryless property holds.

P(Z> l| Z> l/2) b. P(Z> 2l | Z> l) b. P(Z>3l | Z> l)

#Calculate probabilities and check memorylessness
prob_a <- mean(Z > lambda) / mean(Z > lambda/2)
prob_b <- mean(Z > 2*lambda) / mean(Z > lambda)
prob_c <- mean(Z > 3*lambda) / mean(Z > lambda)
mem_property <- ifelse(prob_a == prob_b & prob_b == prob_c, "Memoryless property holds", "Memoryless property does not hold")

Loosely investigate whether P(YZ) = P(Y) P(Z) by building a table with quartiles and evaluating the marginal and joint probabilities.

#Create table for quartiles
Y_quartiles <- quantile(Y, c(0.25, 0.5, 0.75))
Z_quartiles <- quantile(Z, c(0.25, 0.5, 0.75))
table <- matrix(c(mean(Y[Z < Z_quartiles[1]]), mean(Y[Z >= Z_quartiles[1] & Z < Z_quartiles[2]]), mean(Y[Z >= Z_quartiles[2] & Z < Z_quartiles[3]]), mean(Y[Z >= Z_quartiles[3]]),
mean(Z[Y < Y_quartiles[1]]), NA, NA, NA,
mean(Z[Y >= Y_quartiles[1] & Y < Y_quartiles[2]]), NA, NA, NA,
mean(Z[Y >= Y_quartiles[2] & Y < Y_quartiles[3]]), NA, NA, NA,
mean(Z[Y >= Y_quartiles[3]]), NA, NA, NA), nrow = 4, ncol = 4, dimnames = list(c("1st Quartile Y", "2nd Quartile Y", "3rd Quartile Y", "4th Quartile Y"), c("1st Quartile Z", "2nd Quartile Z", "3rd Quartile Z", "4th Quartile Z")))

## Warning in matrix(c(mean(Y[Z < Z_quartiles[1]]), mean(Y[Z >= Z_quartiles[1] & :
## data length differs from size of matrix: [20 != 4 x 4]

colSums(table, na.rm = TRUE)

## 1st Quartile Z 2nd Quartile Z 3rd Quartile Z 4th Quartile Z 
##              0              0              0              0

final_project

Mohammed Rahman

2023-05-16