Problem 1. Probability Density 1: X~Gamma. Using R, generate a random variable X that has 10,000 random Gamma pdf values. A Gamma pdf is completely described by n (a size parameter) and lambda (λ, a shape parameter). Choose any n greater than 3 and an expected value (λ) between 2 and 10 (you choose).

n <- 10        # size/shape parameter
lambda <- 7    # rate parameter
random <- 10000
x_gamma <- rgamma(random, shape = n, rate = lambda) # generates 10,000 random Gamma(n, lambda) values

hist(x_gamma)
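
As an optional visual check (not required by the problem), the theoretical Gamma density with the same shape and rate can be overlaid on the histogram; a minimal sketch:

# Overlay the theoretical Gamma(shape = n, rate = lambda) density on a density-scale histogram
hist(x_gamma, freq = FALSE)
curve(dgamma(x, shape = n, rate = lambda), add = TRUE, lwd = 2)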

Probability Density 2: Y~Sum of Exponentials. Then generate 10,000 observations from the sum of n exponential pdfs with rate/shape parameter (λ). The n and λ must be the same as in the previous case. (e.g., mysum=rexp(10000,λ)+rexp(10000,λ)+..)

Y <- rexp(random, lambda) + rexp(random, lambda) # sum of two Exponential(lambda) draws per observation
typeof(Y)
## [1] "double"
hist(Y)
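
Since n = 10 was chosen in Problem 1, the sum of all n exponentials (rather than just two) could also be generated; a minimal sketch, where Y_n is an illustrative name:

# Sum n independent Exponential(lambda) draws for each of the 10,000 observations
Y_n <- rowSums(matrix(rexp(random * n, rate = lambda), nrow = random, ncol = n))
mean(Y_n)   # should be close to n / lambda = 10/7
hist(Y_n)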

Probability Density 3: Z~ Exponential. Then generate 10,000 observations from a single exponential pdf with rate/shape parameter (λ).

# rexp(number of observations, rate)
z <- rexp(random, lambda)
hist(z)

NOTE: The Gamma distribution is quite common in data science. For example, it is used to model failures for multiple processes when each of those processes has the same failure rate. The exponential is used for constant failure rates, service times, etc.

1a. Calculate the empirical expected values (means) and variances of all three pdfs.


mean(x_gamma)
## [1] 1.432131
mean(Y)
## [1] 0.2849196
mean(z)
## [1] 0.1436883
var(x_gamma)
## [1] 0.2036016
var(Y)
## [1] 0.04067144
var(z)
## [1] 0.02089289
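
For comparison, the theoretical moments implied by the chosen parameters can be computed directly (values in the comments are rounded; Y here is the sum of two exponentials as generated above):

c(mean_X = n / lambda, var_X = n / lambda^2)   # 10/7 ~ 1.4286, 10/49 ~ 0.2041
c(mean_Y = 2 / lambda, var_Y = 2 / lambda^2)   # 2/7  ~ 0.2857, 2/49  ~ 0.0408
c(mean_Z = 1 / lambda, var_Z = 1 / lambda^2)   # 1/7  ~ 0.1429, 1/49  ~ 0.0204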

1b. Using calculus, calculate the expected value and variance of the Gamma pdf (X). Using the moment generating function for exponentials, calculate the expected value of the single exponential (Z) and the sum of exponentials (Y).

Mean (expected value). Using the same parametrization as rgamma above (shape n, rate λ), the Gamma pdf is \[ f(x)=\frac{\lambda^{n}}{\Gamma(n)}x^{n-1}e^{-\lambda x}, \quad x>0 \] where: \[\Gamma(n)=\int_{0}^{\infty} t^{n-1}e^{-t}\,dt \]

based on the property of the Gamma function:

\[\Gamma(n+1)=n\,\Gamma(n) \]

\[ Mean=E(X)=\int_{0}^{\infty} x\,\frac{\lambda^{n}}{\Gamma(n)}x^{n-1}e^{-\lambda x}\, dx =\frac{\lambda^{n}}{\Gamma(n)}\int_{0}^{\infty}x^{n}e^{-\lambda x}\, dx \]

Using the substitution used in the video referenced below:

\[ t=\lambda x; \quad x=\frac{t}{\lambda}; \quad dx=\frac{dt}{\lambda}; \qquad E(X)=\frac{\lambda^{n}}{\Gamma(n)}\int_{0}^{\infty}\left(\frac{t}{\lambda}\right)^{n}e^{-t}\,\frac{dt}{\lambda}=\frac{1}{\lambda\,\Gamma(n)}\int_{0}^{\infty}t^{n}e^{-t}\, dt \] \[ \therefore E(X)=\frac{\Gamma(n+1)}{\lambda\,\Gamma(n)}=\frac{n}{\lambda} \]

According to the video, textbook, etc:

\[ Variance: E(X-EX)^{2}=EX^{2}-(EX)^{2}\]

\[EX^{2}=\int_{0}^{\infty} x^{2}\,\frac{\lambda^{n}}{\Gamma(n)}x^{n-1}e^{-\lambda x}\, dx =\frac{1}{\lambda^{2}\,\Gamma(n)}\int_{0}^{\infty}t^{n+1}e^{-t}\, dt =\frac{\Gamma(n+2)}{\lambda^{2}\,\Gamma(n)}=\frac{n(n+1)}{\lambda^{2}} \] so \[ Var(X)=EX^{2}-(EX)^{2}=\frac{n(n+1)}{\lambda^{2}}-\frac{n^{2}}{\lambda^{2}}=\frac{n}{\lambda^{2}} \]

With n = 10 and λ = 7 this gives E(X) = 10/7 ≈ 1.43 and Var(X) = 10/49 ≈ 0.204, matching the empirical values above. The moment generating function (MGF) of an exponential distribution with rate parameter λ is given by:

\[M(t) = E[e^{tX}] = \int_{0}^{\infty} e^{tx}\,\lambda\, e^{-\lambda x}\, dx \]

where X is a random variable that follows an exponential distribution with parameter λ. To calculate the expected value of a single exponential, we can differentiate the MGF with respect to t and evaluate it at t=0:

\[ M'(t) = \frac{\lambda}{(\lambda - t)^{2}}, \qquad E[X] = M'(0) = \frac{1}{\lambda} \]

Therefore, the expected value of a single exponential is 1/λ.

Exponential pdf:

\[ f(x)=\lambda e^{-\lambda x} \] \[ M(t)=\int_{0}^{\infty} e^{tx}\,\lambda e^{-\lambda x}\, dx =\lambda \int_{0}^{\infty} e^{(t-\lambda)x}\, dx \] \[ =\lambda \left[\frac{1}{t-\lambda}e^{(t-\lambda)x}\right]_{0}^{\infty}=\frac{\lambda}{\lambda-t}, \quad t<\lambda \]

Evaluate at t = 0: \[ E(X)=\frac{dM(t)}{dt}\Big|_{t=0}=\frac{\lambda}{(\lambda-0)^{2}}=\frac{1}{\lambda} \]

2nd Moment: \[ E(X^{2})=\frac{d^{2}M(t)}{dt^{2}}\Big|_{t=0}=\frac{2\lambda}{(\lambda-0)^{3}}=\frac{2}{\lambda^{2}} \]

Variance: \[ V(X)=E(X^{2})-(E(X))^{2} \]

Plugging the moments back into the variance formula, we get: \[\therefore V(X)=\frac{2}{\lambda^{2}}-\left(\frac{1}{\lambda}\right)^{2}=\frac{1}{\lambda^{2}} \]
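
The same MGF approach gives the expected value of the sum of exponentials (Y). Because the MGF of a sum of independent random variables is the product of their individual MGFs, a sum of k independent Exponential(λ) variables (k = 2 in the code above, or k = n in the general statement of the problem) has

\[ M_{Y}(t)=\left(\frac{\lambda}{\lambda-t}\right)^{k} \quad\Rightarrow\quad E(Y)=M_{Y}'(0)=\frac{k}{\lambda}, \qquad V(Y)=\frac{k}{\lambda^{2}} \]

For k = 2 and λ = 7 this gives E(Y) = 2/7 ≈ 0.286 and V(Y) = 2/49 ≈ 0.041, in line with the empirical values above.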

1c-e. Probability. For pdf Z (the exponential), calculate empirically probabilities a through c. Then evaluate through calculus whether the memoryless property holds.

a. P(Z> λ| Z> λ/2)

P(Z > λ | Z > λ/2): This probability represents the conditional probability that Z is greater than λ, given that Z is already greater than λ/2.

P(Z > λ): This probability represents the probability that Z is greater than λ, without any conditions.

To calculate P(Z > λ | Z > λ/2), we need to consider the conditional probability formula:

P(A | B) = P(A ∩ B) / P(B)

In this case, A represents the event Z > λ and B represents the event Z > λ/2.

# Set the parameters
lambda_half <- lambda / 2

# Calculate the probabilities using the exponential distribution
prob_Z_greater_than_lambda <- 1 - pexp(lambda, rate = 1/lambda)
prob_Z_greater_than_lambda_half <- 1 - pexp(lambda_half, rate = 1/lambda)

# Calculate P(Z > λ | Z > λ/2)
prob_conditional <- prob_Z_greater_than_lambda / prob_Z_greater_than_lambda_half

# Print the result
cat("P(Z > λ | Z > λ/2):", prob_conditional, "\n")
## P(Z > λ | Z > λ/2): 0.6065307
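
Note that pexp() above is called with rate = 1/λ (so the theoretical mean is λ and the thresholds λ/2 and λ fall in the body of the distribution), which differs from the rate = λ used to simulate z earlier. Under that same rate = 1/λ assumption, the conditional probability can also be checked empirically against a fresh simulated sample (z_check is an illustrative name):

# Empirical check of P(Z > lambda | Z > lambda/2) under the same rate = 1/lambda
z_check <- rexp(random, rate = 1/lambda)
mean(z_check > lambda & z_check > lambda_half) / mean(z_check > lambda_half)
# should be close to exp(-1/2) ~ 0.6065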

b. P(Z> 2λ | Z> λ)

lambda_twice <- 2 * lambda

# Calculate the probabilities using the exponential distribution
prob_Z_greater_than_lambda <- 1 - pexp(lambda, rate = 1/lambda)
prob_Z_greater_than_lambda_twice <- 1 - pexp(lambda_twice, rate = 1/lambda)

# Calculate P(Z > 2λ | Z > λ)
prob_conditional <- prob_Z_greater_than_lambda_twice / prob_Z_greater_than_lambda

# Print the result
cat("P(Z > 2λ | Z > λ):", prob_conditional, "\n")
## P(Z > 2λ | Z > λ): 0.3678794

c. P(Z>3λ | Z> λ)

lambda_three_times <- 3 * lambda

# Calculate the probabilities using the exponential distribution
prob_Z_greater_than_lambda <- 1 - pexp(lambda, rate = 1/lambda)
prob_Z_greater_than_lambda_three_times <- 1 - pexp(lambda_three_times, rate = 1/lambda)

# Calculate P(Z > 3λ | Z > λ)
prob_conditional <- prob_Z_greater_than_lambda_three_times / prob_Z_greater_than_lambda

# Print the result
cat("P(Z > 3λ | Z > λ):", prob_conditional, "\n")
## P(Z > 3λ | Z > λ): 0.1353353
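
Together, parts (a)-(c) illustrate the memoryless property. Evaluating it through calculus for an exponential with rate r, the survival function is \( P(Z > s)=\int_{s}^{\infty} r e^{-rx}\, dx = e^{-rs} \), so

\[ P(Z > s+t \mid Z > s)=\frac{P(Z > s+t)}{P(Z > s)}=\frac{e^{-r(s+t)}}{e^{-rs}}=e^{-rt}=P(Z > t) \]

The conditional probability depends only on the additional amount t beyond the conditioning threshold, which is exactly what the three results above show (e^{-1/2}, e^{-1}, and e^{-2} for additional amounts of λ/2, λ, and 2λ at rate 1/λ): the memoryless property holds.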

Loosely investigate whether P(YZ) = P(Y) P(Z) by building a table with quartiles and evaluating the marginal and joint probabilities.

set.seed(123)
Y <- rexp(10000, lambda)
Z <- rexp(10000, lambda)
# Compute the quartiles for Y and Z
quartiles_Y <- quantile(Y, probs = c(0.25, 0.5, 0.75))
quartiles_Z <- quantile(Z, probs = c(0.25, 0.5, 0.75))

# Build the table with quartiles and initialize the sum
table_data <- matrix(0, nrow = 5, ncol = 5)
colnames(table_data) <- c("1st Quartile Y", "2nd Quartile Y", "3rd Quartile Y", "4th Quartile Y", "Sum")
rownames(table_data) <- c("1st Quartile Z", "2nd Quartile Z", "3rd Quartile Z", "4th Quartile Z", "Sum")

# Fill the upper 3x3 block with exceedance probabilities
# P(Y >= i-th quartile cut of Y AND Z >= j-th quartile cut of Z);
# the 4th-quartile row/column are never filled and stay 0
for (i in 1:3) {
  for (j in 1:3) {
    prob_joint <- mean(Y >= quartiles_Y[i] & Z >= quartiles_Z[j])
    table_data[i, j] <- prob_joint
  }
}

# Column and row sums of the filled 3x3 block (used as marginal totals below)
marginal_Y <- colSums(table_data[1:3, 1:3])
marginal_Z <- rowSums(table_data[1:3, 1:3])

# Compute the sum
table_data[5, 1:4] <- c(marginal_Y, sum(marginal_Y))
table_data[1:4, 5] <- c(marginal_Z, sum(marginal_Z))
table_data[5, 5] <- sum(table_data[5, 1:4])

# Print the table
print(table_data)
##                1st Quartile Y 2nd Quartile Y 3rd Quartile Y 4th Quartile Y
## 1st Quartile Z         0.5620         0.3729         0.1858         0.0000
## 2nd Quartile Z         0.3750         0.2466         0.1226         0.0000
## 3rd Quartile Z         0.1864         0.1225         0.0599         0.0000
## 4th Quartile Z         0.0000         0.0000         0.0000         0.0000
## Sum                    1.1234         0.7420         0.3683         2.2337
##                   Sum
## 1st Quartile Z 1.1207
## 2nd Quartile Z 0.7442
## 3rd Quartile Z 0.3688
## 4th Quartile Z 2.2337
## Sum            4.4674
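
As a cross-check on the table above (which fills only the upper 3 x 3 block with exceedance probabilities), one could instead bin each variable into its own quartiles with cut() and tabulate counts; then each marginal is 0.25 by construction and, under independence, each joint cell should be close to 0.25 × 0.25 = 0.0625. A minimal sketch, where Y_q, Z_q, and joint_counts are illustrative names:

# Bin Y and Z into their own quartiles (4 levels each)
Y_q <- cut(Y, breaks = quantile(Y, probs = seq(0, 1, 0.25)),
           include.lowest = TRUE, labels = paste0("Q", 1:4))
Z_q <- cut(Z, breaks = quantile(Z, probs = seq(0, 1, 0.25)),
           include.lowest = TRUE, labels = paste0("Q", 1:4))

# 4 x 4 table of counts, then joint and marginal proportions
joint_counts <- table(Z_q, Y_q)
joint_probs <- prop.table(joint_counts)   # joint P(Y in column quartile, Z in row quartile)
addmargins(round(joint_probs, 3))         # row/column margins should each be about 0.25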

Check to see if independence holds by using Fisher’s Exact Test and the Chi Square Test. What is the difference between the two? Which is most appropriate? Source: https://www.reneshbedre.com/blog/fisher-exact-test.html

# Perform Fisher's Exact Test
fisher_result <- fisher.test(table_data[1:3, 1:3])
## Warning in fisher.test(table_data[1:3, 1:3]): 'x' has been rounded to integer:
## Mean relative difference: 0.9444867
# Perform Chi-Square Test
chi2_result <- chisq.test(table_data[1:3, 1:3])
## Warning in chisq.test(table_data[1:3, 1:3]): Chi-squared approximation may be
## incorrect
fisher_result
## 
##  Fisher's Exact Test for Count Data
## 
## data:  table_data[1:3, 1:3]
## p-value = 1
## alternative hypothesis: two.sided
chi2_result
## 
##  Pearson's Chi-squared test
## 
## data:  table_data[1:3, 1:3]
## X-squared = 3.2482e-05, df = 4, p-value = 1

The difference between the two tests is that the chi-squared test is a non-parametric test that can be applied without specific assumptions about the distribution the data are drawn from, but it relies on a large-sample approximation and is therefore prone to errors with smaller sample sizes (or small expected cell counts). Fisher's exact test, by contrast, computes the p-value exactly and is preferred for smaller samples (roughly fewer than 1,000 observations).

"X-squared = 3.2482e-05": This is the calculated value of the chi-squared test statistic. The value is very close to zero, indicating very little discrepancy between the observed and expected cell values.

"p-value = 1": The p-value represents the probability of observing a test statistic as extreme as, or more extreme than, the one calculated under the null hypothesis. A p-value of 1 suggests that there is no evidence to reject the null hypothesis, meaning there is no significant association between the categorical variables in the given data.

Two warnings were present: 'x' has been rounded to integer (mean relative difference: 0.9444867) and 'Chi-squared approximation may be incorrect.' Both arise because the table passed to the tests contains proportions rather than integer counts: fisher.test rounds the values to integers, and chisq.test ends up with very small expected cell counts, so the approximate p-values may not be right. Therefore, in this case, I believe that Fisher's Exact Test is the most appropriate test to use for our data.
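
As a sanity check on that warning, both tests could be rerun on an integer count table (for example, the joint_counts table from the sketch above) rather than on proportions; because an exact p-value for a 4 x 4 table with 10,000 observations is expensive to compute, fisher.test can be asked for a simulated (Monte Carlo) p-value:

# Chi-squared test on counts; expected cell counts are now large,
# so the approximation warning should no longer apply
chisq.test(joint_counts)

# Fisher's exact test on counts; the exact network algorithm is costly
# for an r x c table this large, so use a simulated p-value
fisher.test(joint_counts, simulate.p.value = TRUE, B = 10000)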