Using R, generate a random variable X that has 10,000 random Gamma pdf values. A Gamma pdf is completely described by n (a size parameter) and lambda (λ, a shape parameter). Choose any n greater than 3 and an expected value (λ) between 2 and 10 (you choose).

n <- 5       # Size parameter (used as the Gamma shape)
lambda <- 4  # λ (passed to rgamma() as the rate)


X <- rgamma(10000, shape = n, rate = lambda)
head(X)

## [1] 0.5759197 1.2978958 1.3236099 0.7709575 1.4093825 0.8408435

Then generate 10,000 observations from the sum of n exponential pdfs with rate/shape parameter (λ). The n and λ must be the same as in the previous case.

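# Draw 10,000 exponential values and arrange them in a 2,000 x n matrix
y_exp <- rexp(10000, rate = lambda)
# colSums() returns n column totals, each the sum of 2,000 draws
Y <- colSums(matrix(y_exp, ncol = n, byrow = TRUE))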
head(Y)

## [1] 499.5612 513.5093 489.4681 499.7636 502.8363
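
The five values above, each close to 500, arise because colSums() totals the columns of a 2,000 x 5 matrix, so each entry is a sum of 2,000 exponential draws rather than n = 5. A minimal sketch of one way to obtain 10,000 sums of n exponentials follows. The seed and the names `n_obs`, `exp_draws`, and `Y_sum` are illustrative assumptions and are not part of the original analysis.

```r
set.seed(123)            # assumed seed, only for reproducibility of the sketch
n <- 5                   # number of exponentials per sum (as above)
lambda <- 4              # exponential rate (as above)
n_obs <- 10000           # number of sums to generate

# Draw n_obs * n exponentials so that every sum has its own n draws
exp_draws <- rexp(n_obs * n, rate = lambda)

# One row per observation, one column per exponential term; sum across each row
Y_sum <- rowSums(matrix(exp_draws, nrow = n_obs, ncol = n))

length(Y_sum)            # 10000
mean(Y_sum)              # should be close to n / lambda
var(Y_sum)               # should be close to n / lambda^2
```

Under this construction each element of `Y_sum` is one draw from the sum of n exponentials, so its sample mean and variance can be compared directly with those of X.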

Then generate 10,000 observations from a single exponential pdf with rate/shape parameter (λ).

Z <- rexp(10000, rate = lambda)
head(Z)

## [1] 0.03861341 0.02760268 0.42822660 0.32754520 0.01521667 0.23514084

1A

X_mean <- mean(X)
X_var <- var(X)

Y_mean <- mean(Y)
Y_var <- var(Y)

Z_mean <- mean(Z)
Z_var <- var(Z)


X_mean

## [1] 1.245656

X_var

## [1] 0.3008274

Y_mean

## [1] 501.0277

Y_var

## [1] 74.10843

Z_mean

## [1] 0.2518294

Z_var

## [1] 0.06181484

1B Calculating expected value and variance

I used R to work through the calculus problems; I hope this isn't an issue. I'm not sure whether you wanted the full formulas with the integrals or just the simplified forms. I figured you would want just the simplified forms since this is a recording. If that is incorrect, please let me know so I can correct it if possible.

Expected Value (Mean): The expected value of the Gamma distribution : E(X) = α / β.

Variance: The variance of the Gamma distribution : Var(X) = α / β².

In our case α = n and β = lambda (λ).

Therefore for our chosen variables

expmean <- n / lambda
expvar <- n / lambda^2
cat(expmean, expvar)

## 1.25 0.3125

Using the MGF, we can find the expected values of the single exponential (Z) and the sum of exponentials (Y).

We know that M(t) = 1 / (1 - t/λ) = λ / (λ - t) for an exponential distribution, valid for t < λ.

To evaluate this further, we differentiate the MGF and evaluate the derivative at t = 0, since the expected value equals M’(0).

E(Z)

M’(t) = λ / (λ - t)²

M’(0) = λ / λ²

M’(0) = 1 / λ

E(Z) = 1/4

E(Y) Because Y is the sum of n independent exponentials, its MGF is the single-exponential MGF raised to the power of n. The expected value is again the first derivative evaluated at t = 0, and it matches the Gamma mean n/λ calculated above.

M_Y(t) = (λ / (λ - t))^n

M_Y’(t) = n λ^n / (λ - t)^(n+1)

M_Y’(0) = n / λ

E(Y) = 5/4
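
The MGF derivatives can also be checked in R with the symbolic differentiation function D(). The sketch below is only a cross-check under the same n and λ: it evaluates M’(0) for the mean and M’’(0) - M’(0)² for the variance. The helper name `moments_from_mgf` is hypothetical and introduced only for illustration.

```r
n <- 5
lambda <- 4

# MGFs written as R expressions in t
mgf_Z <- expression(lambda / (lambda - t))        # single exponential
mgf_Y <- expression((lambda / (lambda - t))^n)    # sum of n exponentials

# Hypothetical helper: mean and variance from the first two derivatives at t = 0
moments_from_mgf <- function(mgf) {
  d1 <- D(mgf[[1]], "t")          # M'(t)
  d2 <- D(d1, "t")                # M''(t)
  at_zero <- list(t = 0, n = n, lambda = lambda)
  m1 <- eval(d1, at_zero)         # E[X]   = M'(0)
  m2 <- eval(d2, at_zero)         # E[X^2] = M''(0)
  c(mean = m1, var = m2 - m1^2)
}

moments_from_mgf(mgf_Z)   # mean 1/lambda, variance 1/lambda^2
moments_from_mgf(mgf_Y)   # mean n/lambda, variance n/lambda^2
```

For the sum of exponentials this reproduces the Gamma mean and variance (n/λ and n/λ²), which is consistent with Y and X having the same distribution.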

1 C-E

For pdf Z (the exponential), calculate empirically probabilities a through c. Then evaluate through calculus whether the memoryless property holds.

P(Z > λ | Z > λ/2)
P(Z > 2λ | Z > λ)
P(Z > 3λ | Z > λ)

The memoryless property of the exponential distribution states that for any positive values a and b:

P(Z > a + b | Z > a) = P(Z > b)

To verify this property through calculus, we compare the conditional probability on the left-hand side with the probability on the right-hand side and check whether they are equal.

The pdf of the exponential distribution is f(z) = λ * e^(-λz) for z >= 0.

The cumulative distribution function (CDF) is the integral of the pdf: F(z) = 1 - e^(-λz). The probability of exceeding z is therefore P(Z > z) = 1 - F(z) = e^(-λz).

Because Z > a + b implies Z > a, the conditional probability is P(Z > a + b | Z > a) = P(Z > a + b) / P(Z > a).

Using the CDF, we can calculate the conditional probability:

P(Z > a + b | Z > a) = e^(-λ(a + b)) / e^(-λa) = e^(-λb)

P(Z > b) = e^(-λb). The two expressions are equal, so the memoryless property holds.

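Each probability below uses P(Z > z) = e^(-λz) with rate λ = 4, so every threshold is multiplied by λ in the exponent.

P(Z > λ | Z > λ/2) = P(Z > λ/2)

exp(-lambda * lambda) / exp(-lambda * lambda / 2)

## [1] 0.0003354626

exp(-lambda * lambda / 2)

## [1] 0.0003354626

P(Z > 2λ | Z > λ) = P(Z > λ)

exp(-lambda * 2 * lambda) / exp(-lambda * lambda)

## [1] 1.125352e-07

exp(-lambda * lambda)

## [1] 1.125352e-07

P(Z > 3λ | Z > λ) = P(Z > 2λ)

exp(-lambda * 3 * lambda) / exp(-lambda * lambda)

## [1] 1.266417e-14

exp(-lambda * 2 * lambda)

## [1] 1.266417e-14

The memoryless property holds in all three cases.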
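
The question also asks for empirical probabilities. A rough sketch of how they could be estimated from simulated draws is given here. The seed, the sample `Z_sim`, and the helper `cond_prob_empirical` are assumptions introduced for illustration. With rate λ = 4, values above λ/2 = 2 are very rare, so 10,000 draws usually contain only a handful of observations beyond the first threshold and often none beyond λ. The estimates for parts a through c can therefore be unstable or undefined (NA), and the final lines use smaller thresholds, also assumed, where the comparison is meaningful.

```r
set.seed(2023)                       # assumed seed for reproducibility
lambda <- 4
Z_sim <- rexp(10000, rate = lambda)  # stand-in for the simulated Z above

# Hypothetical helper: empirical P(Z > a + b | Z > a) from a sample z
cond_prob_empirical <- function(z, a, b) {
  exceed_a <- z > a
  if (!any(exceed_a)) return(NA_real_)   # no draws exceed a: estimate undefined
  mean(z[exceed_a] > a + b)
}

# Thresholds from parts a through c
cond_prob_empirical(Z_sim, a = lambda / 2, b = lambda / 2)
cond_prob_empirical(Z_sim, a = lambda,     b = lambda)
cond_prob_empirical(Z_sim, a = lambda,     b = 2 * lambda)

# Smaller thresholds (assumed, for illustration) where both events are common
emp  <- cond_prob_empirical(Z_sim, a = 0.1, b = 0.2)
theo <- exp(-lambda * 0.2)           # P(Z > 0.2), the memoryless prediction
c(empirical = emp, theoretical = theo)
```

If the memoryless property holds, `emp` and `theo` should agree up to sampling error.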

Loosely investigate whether P(YZ) = P(Y) P(Z) by building a table with quartiles and evaluating the marginal and joint probabilities.

# Compute the quartiles for Y and Z
quartiles_Y <- quantile(Y, probs = c(0.25, 0.5, 0.75))
quartiles_Z <- quantile(Z, probs = c(0.25, 0.5, 0.75))

# Build the table with quartiles and initialize the sum
table_data <- matrix(0, nrow = 5, ncol = 5)
colnames(table_data) <- c("1st Quartile Y", "2nd Quartile Y", "3rd Quartile Y", "4th Quartile Y", "Sum")
rownames(table_data) <- c("1st Quartile Z", "2nd Quartile Z", "3rd Quartile Z", "4th Quartile Z", "Sum")

# Fill the 3 x 3 block with upper-tail joint probabilities:
# P(Y >= i-th quartile cut point and Z >= j-th quartile cut point)
for (i in 1:3) {
  for (j in 1:3) {
    prob_joint <- mean(Y >= quartiles_Y[i] & Z >= quartiles_Z[j])
    table_data[i, j] <- prob_joint
  }
}

# Column and row totals of the 3 x 3 block (sums of overlapping tail probabilities, so they can exceed 1)
marginal_Y <- colSums(table_data[1:3, 1:3])
marginal_Z <- rowSums(table_data[1:3, 1:3])

# Compute the sum
table_data[5, 1:4] <- c(marginal_Y, sum(marginal_Y))
table_data[1:4, 5] <- c(marginal_Z, sum(marginal_Z))
table_data[5, 5] <- sum(table_data[5, 1:4])

# Print the table
print(table_data)

##                1st Quartile Y 2nd Quartile Y 3rd Quartile Y 4th Quartile Y
## 1st Quartile Z         0.5981         0.4001         0.2012         0.0000
## 2nd Quartile Z         0.4496         0.3004         0.1523         0.0000
## 3rd Quartile Z         0.2987         0.1998         0.1010         0.0000
## 4th Quartile Z         0.0000         0.0000         0.0000         0.0000
## Sum                    1.3464         0.9003         0.4545         2.7012
##                   Sum
## 1st Quartile Z 1.1994
## 2nd Quartile Z 0.9023
## 3rd Quartile Z 0.5995
## 4th Quartile Z 2.7012
## Sum            5.4024

# Perform Fisher's Exact Test
fisher_result <- fisher.test(table_data[1:3, 1:3])

## Warning in fisher.test(table_data[1:3, 1:3]): 'x' has been rounded to integer:
## Mean relative difference: 0.9273656

# Perform Chi-Square Test
chi2_result <- chisq.test(table_data[1:3, 1:3])

## Warning in chisq.test(table_data[1:3, 1:3]): Chi-squared approximation may be
## incorrect

fisher_result

## 
##  Fisher's Exact Test for Count Data
## 
## data:  table_data[1:3, 1:3]
## p-value = 1
## alternative hypothesis: two.sided

chi2_result

## 
##  Pearson's Chi-squared test
## 
## data:  table_data[1:3, 1:3]
## X-squared = 4.4027e-06, df = 4, p-value = 1

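Fisher’s Exact Test is preferred for small sample sizes or when the Chi-Square assumptions are violated, while the Chi-Square Test is more suitable for larger sample sizes when its assumptions are met. Both tests expect a table of counts. Here they were applied to a table of proportions, which is why R warns that the input was rounded to integers and that the Chi-squared approximation may be incorrect, so the p-values of 1 above are not reliable evidence of independence.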
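
As a point of comparison, the sketch below shows one possible way to build a quartile-by-quartile table in which every observation falls into exactly one cell, so the joint probabilities sum to 1 and each marginal is about 0.25. It is not the method used above. The seed, the simulated samples `Y_sim` and `Z_sim`, and the helper `quartile_bin` are assumptions made for illustration, and Y is simulated here as 10,000 independent sums of n exponentials.

```r
set.seed(42)                         # assumed seed for reproducibility
n <- 5
lambda <- 4
n_obs <- 10000

# Independent samples: Y as a sum of n exponentials, Z as a single exponential
Y_sim <- rowSums(matrix(rexp(n_obs * n, rate = lambda), nrow = n_obs))
Z_sim <- rexp(n_obs, rate = lambda)

# Hypothetical helper: assign each value to its sample-quartile bin
quartile_bin <- function(x) {
  cut(x, breaks = quantile(x, probs = seq(0, 1, 0.25)),
      include.lowest = TRUE, labels = c("Q1", "Q2", "Q3", "Q4"))
}

counts <- table(Y = quartile_bin(Y_sim), Z = quartile_bin(Z_sim))  # 4 x 4 counts
joint <- prop.table(counts)                        # joint probabilities, sum to 1
expected <- outer(rowSums(joint), colSums(joint))  # P(Y bin) * P(Z bin)

round(addmargins(joint), 4)          # joint table with marginal sums
max(abs(joint - expected))           # small if Y and Z are independent
chisq.test(counts)                   # test of independence on the counts
```

Passing the counts rather than the proportions to chisq.test() keeps the test within its assumptions, since every expected cell count is far above 5.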

Final Exam

Keeno Glanville

2023-05-16

1A