Assignment 1. Part 1: Exponential Distribution

The exponential distribution can be simulated in R with rexp(n, lambda), where lambda is the rate parameter. The mean of the exponential distribution is 1/lambda and the standard deviation is also 1/lambda. Set lambda = 0.2 for all of the simulations. In this simulation, you will investigate the distribution of averages of 40 exponential(0.2)s. Note that you will need to do a thousand or so simulated averages of 40 exponentials.

Since the mean of the exponential distribution is 1/lambda and lambda = 0.2, the theoretical mean is 1/0.2 = 5.

###1. Show where the distribution is centered and compare it to the theoretical center of the distribution.

We do this by plotting the distribution of the simulated averages.

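#We create a function that takes the number of simulations as a parameter
#and returns that many averages of 40 exponential(0.2) draws.
#The seed is set inside the function, so every call starts from the same random stream.
simulate <- function(iteration){
    set.seed(10)
    means <- numeric(iteration)  #preallocate instead of growing with append()
    for(i in seq_len(iteration)){
        means[i] <- mean(rexp(40, 0.2))
    }
    return(means)
}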

par(mar = par("mar")/2)

Hmeans <- simulate(100)
hist(Hmeans, main="Histogram for 100 simulations")
abline(v=mean(Hmeans), col = "red")

Kmeans <- simulate(1000)
hist(Kmeans, main="Histogram for 1,000 simulations")
abline(v=mean(Kmeans), col = "red")

TENKmeans <- simulate(10000)
hist(TENKmeans, main="Histogram for 10,000 simulations")
abline(v=mean(TENKmeans), col = "red")

HUDREKmeans <- simulate(100000)
hist(HUDREKmeans, main="Histogram for 100,000 simulations")
abline(v=mean(HUDREKmeans), col = "red")

The vertical red line in each histogram shows the mean of the simulated averages. Note that all four means are close to 5, the theoretical center of the distribution.
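The comparison with the theoretical center can also be made numerically rather than by eye. The following is a minimal sketch; it repeats the simulation inline so that it runs on its own, and the names `lambda`, `n`, `theoretical_mean`, `sim_means`, and `center_diff` are illustrative and not part of the original analysis.

```r
lambda <- 0.2                     # rate parameter used throughout
n <- 40                           # number of exponentials per average
theoretical_mean <- 1 / lambda    # theoretical center, 5

set.seed(10)
# 1,000 simulated averages of 40 exponential(0.2) draws
sim_means <- replicate(1000, mean(rexp(n, lambda)))

# Difference between the simulated center and the theoretical center
center_diff <- mean(sim_means) - theoretical_mean
print(mean(sim_means))
print(center_diff)
```

A difference that is small relative to the spread of the averages supports the claim that the distribution is centered at 1/lambda.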

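###2. Show how variable it is and compare it to the theoretical variance of the distribution.

We calculate the standard deviation of the simulated averages. Note that the theoretical SD of a single exponential(0.2) is 5, but the theoretical SD of an average of 40 of them is 5/sqrt(40), which is about 0.79 (a variance of 0.625). Our results are close to that value.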

## [1] 0.819
## [1] 0.7983
## [1] 0.7872
## [1] 0.7912
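The code that produced the four values above is not shown in the report; they are presumably the result of calling sd() on each vector of simulated averages. As a hedged sketch of the comparison, the snippet below computes the theoretical standard deviation of an average of 40 exponentials, (1/lambda)/sqrt(40), and sets it next to a simulated value. The names `theoretical_sd`, `theoretical_var`, and `sim_means` are assumptions introduced for illustration.

```r
lambda <- 0.2
n <- 40

# Theoretical SD and variance of the average of n exponential(lambda) draws
theoretical_sd <- (1 / lambda) / sqrt(n)
theoretical_var <- theoretical_sd^2

set.seed(10)
# 10,000 simulated averages of 40 exponential(0.2) draws
sim_means <- replicate(10000, mean(rexp(n, lambda)))

print(c(simulated_sd = sd(sim_means), theoretical_sd = theoretical_sd))
print(c(simulated_var = var(sim_means), theoretical_var = theoretical_var))
```

The factor of sqrt(40) is what separates the spread of individual exponentials (SD of 5) from the spread of their averages.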

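###3. Show that the distribution is approximately normal.

In a normal distribution, about 68%, 95%, and 99.7% of the values are supposed to fall within 1, 2, and 3 standard deviations of the mean. We check these proportions for the simulated averages.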

CENTER <- mean(HUDREKmeans) #center of distribution
SD <- sd(HUDREKmeans) #standard deviation

#create subsets of the data within 1, 2, and 3 SDs of the center
#then find the fraction of the values within each subset
within1SD <- HUDREKmeans[HUDREKmeans >= (CENTER - SD) & HUDREKmeans <= (CENTER + SD)]
length(within1SD)/length(HUDREKmeans)
## [1] 0.6845
within2SD <- HUDREKmeans[HUDREKmeans >= (CENTER - 2 * SD) & HUDREKmeans <= (CENTER + 2 * SD)]
length(within2SD)/length(HUDREKmeans)
## [1] 0.9556
within3SD <- HUDREKmeans[HUDREKmeans >= (CENTER - 3 * SD) & HUDREKmeans <= (CENTER + 3 * SD)]
length(within3SD)/length(HUDREKmeans)
## [1] 0.9963

Note that the percentages of values within 1, 2, and 3 standard deviations of the center are about 68.5%, 95.6%, and 99.6%, which are close to the 68%, 95%, and 99.7% expected for a normal distribution.
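The reference percentages for a normal distribution do not have to be taken from memory; they can be computed with pnorm() and placed next to the observed fractions. This is a minimal sketch that re-simulates a smaller set of averages so it runs on its own; `k`, `expected`, `observed`, `center`, and `s` are illustrative names.

```r
# Expected fraction of a normal distribution within k standard deviations
k <- 1:3
expected <- pnorm(k) - pnorm(-k)

set.seed(10)
# 10,000 simulated averages of 40 exponential(0.2) draws
sim_means <- replicate(10000, mean(rexp(40, 0.2)))
center <- mean(sim_means)
s <- sd(sim_means)

# Observed fraction of simulated averages within k standard deviations
observed <- sapply(k, function(j) mean(abs(sim_means - center) <= j * s))

print(round(rbind(expected, observed), 4))
```

A normal quantile plot, for example qqnorm(sim_means) followed by qqline(sim_means), is another common way to make the same check visually.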

###4. Evaluate the coverage of the confidence interval for 1/lambda.

We use the t quantile function qt to find the lower and upper limits of a 95% confidence interval for the mean of the simulated averages.

ERROR <- qt(0.975, df=length(HUDREKmeans)-1)*SD/sqrt(length(HUDREKmeans))
LOWER <- CENTER - ERROR
UPPER <- CENTER + ERROR
LOWER
## [1] 4.991
UPPER
## [1] 5.001

The results show that the 95% confidence interval is (4.991, 5.001), which contains the theoretical mean 1/lambda = 5.
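The interval above is built from all 100,000 simulated averages at once. Coverage is more commonly evaluated in a different way: build an interval from each individual sample of 40 exponentials and count how often it contains 1/lambda. The following sketch does that, assuming the interval mean +/- 1.96 * sd/sqrt(40) for each sample; this is a swapped-in approach, not the method used in the report, and the names `true_mean`, `covered`, and `coverage` are illustrative.

```r
lambda <- 0.2
n <- 40
true_mean <- 1 / lambda

set.seed(10)
# For each of 10,000 samples, record whether its interval contains the true mean
covered <- replicate(10000, {
    x <- rexp(n, lambda)
    half_width <- qnorm(0.975) * sd(x) / sqrt(n)
    (mean(x) - half_width <= true_mean) && (true_mean <= mean(x) + half_width)
})

# Fraction of intervals that contain 1/lambda
coverage <- mean(covered)
print(coverage)
```

Because the exponential distribution is skewed, the observed coverage may come out somewhat below the nominal 95% for samples of this size.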