Title: “Assignment 1 - Statistical Inference”

Author: “ahlulwulus”

Date: “November 21, 2015”

Output: pdf_document

This is the report for part 1 (simulation). The problem statement is defined as follows:

The exponential distribution can be simulated in R with rexp(n, lambda), where lambda is the rate parameter. The mean of the exponential distribution is 1/lambda and the standard deviation is also 1/lambda. Set lambda = 0.2 for all of the simulations.

First, we will collect 1,000 sample means, each computed from a sample of size n = 40 drawn from an exponential distribution with lambda = 0.2.

set.seed(1)
library(data.table)

Define a function that returns the mean of one sample of size n = 40 from the aforementioned distribution:

# x is a dummy argument (always 1 here) so that the function can be used with apply()
sim_exp = function(x){return(x * mean(rexp(40, 0.2)))}

Then, collect 1000 sample means of size n=40 with lambda = 0.2

sim = apply(data.table(rep.int(1, 1000)), 1, sim_exp)
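The same collection step can also be sketched with base R's replicate(), which avoids the dummy argument. This is only an illustrative alternative, not the code used for the results in this report; the names n_sims and sim_alt are introduced here for the example.

```r
# Illustrative alternative (assumption: not the code that produced the report's output)
set.seed(1)
n_sims <- 1000   # number of simulated samples (hypothetical name)
n      <- 40     # size of each sample
lambda <- 0.2    # rate parameter of the exponential distribution

# Draw n exponential values and take their mean, repeated n_sims times
sim_alt <- replicate(n_sims, mean(rexp(n, lambda)))

length(sim_alt)  # number of sample means collected
```

replicate() evaluates its expression once per repetition, so each of the 1,000 means comes from a fresh draw of 40 values.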

Show where the distribution is centered and compare it to the theoretical center of the distribution.
Show how variable it is and compare it to the theoretical variance of the distribution.

These two questions can be answered by calculating the observed and theoretical mean and variance of the sample means. The theoretical mean is 1/lambda. The theoretical variance of the sample mean is the square of its standard error, which is the population standard deviation 1/lambda divided by the square root of the sample size n = 40.
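The arithmetic behind these theoretical values can be written out as a short derivation. The symbols $\bar{X}$ (sample mean) and $\sigma$ (population standard deviation) are notation introduced here for convenience.

```latex
\begin{align*}
E[\bar{X}] &= \frac{1}{\lambda} = \frac{1}{0.2} = 5, \\
\operatorname{Var}(\bar{X}) &= \frac{\sigma^2}{n}
  = \frac{(1/\lambda)^2}{n}
  = \frac{25}{40} = 0.625, \\
\operatorname{SE}(\bar{X}) &= \frac{\sigma}{\sqrt{n}}
  = \frac{5}{\sqrt{40}} \approx 0.791 .
\end{align*}
```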

theoretical_variance = round(((1/.2) / sqrt(40))^2,3)
actual_variance = round(var(sim),3)
theoretical_mean = 1/.2 
actual_mean = round(mean(sim),3)
compare_var = rbind(theoretical_mean,actual_mean,theoretical_variance, actual_variance)

The theoretical and observed mean and variance of the distribution of sample means:

compare_var

##                       [,1]
## theoretical_mean     5.000
## actual_mean          4.990
## theoretical_variance 0.625
## actual_variance      0.611

As we can see, the observed variance (0.611) is close to the theoretical variance (0.625), and the observed mean (4.990) is close to the theoretical mean (5.000).

Show that the distribution is approximately normal.

Generate the Histogram

As we can see from this histogram, the observed density of the sample means closely follows the normal density, so it is reasonable to say that the distribution of sample means is approximately normal.
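The plotting code is not shown in this report. A minimal sketch of how such a histogram with a normal overlay could be drawn in base R follows; the variable names sim_demo and h, the bin count, and the labels are assumptions chosen for illustration, and the original figure may have been produced differently.

```r
# Hypothetical sketch: the report's actual plotting code is not shown
set.seed(1)
sim_demo <- replicate(1000, mean(rexp(40, 0.2)))  # 1,000 sample means, n = 40

# Histogram on the density scale so it is comparable to a density curve
h <- hist(sim_demo, breaks = 30, freq = FALSE,
          main = "Distribution of 1,000 sample means (n = 40)",
          xlab = "Sample mean")

# Overlay the normal density with the theoretical mean and standard error
curve(dnorm(x, mean = 1/0.2, sd = (1/0.2) / sqrt(40)), add = TRUE, lwd = 2)
```

Using freq = FALSE matters here: with counts on the vertical axis the normal curve would be on a different scale and the comparison would not be meaningful.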

Evaluate the coverage of the confidence interval for 1/lambda: $\bar{X} \pm 1.96 \, s/\sqrt{n}$.

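Calculate the confidence interval, centered at the mean of the sample means and using the theoretical standard deviation 1/lambda in place of s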

ci = mean(sim) + (c(-1,1)* (1.96 * ((1/0.2) / sqrt(40))))
ci

## [1] 3.440509 6.539541

Calculate the coverage

sum(between(sim, ci[1], ci[2])) / length(sim)

## [1] 0.949

About 94.9% of the sample means fall inside this interval, which is close to the nominal 95%. This is expected because the sample means are approximately normally distributed.
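The calculation above uses a single interval built from the theoretical standard deviation. A different reading of the question is to build one interval per sample from that sample's own mean and standard deviation s, and count how often it contains 1/lambda. The sketch below illustrates that approach under the same settings; it is not the method used above, and the names covered and coverage are introduced for the example.

```r
# Hypothetical sketch of per-sample coverage (not the calculation used above)
set.seed(1)
lambda <- 0.2
n      <- 40
n_sims <- 1000

covered <- replicate(n_sims, {
  x <- rexp(n, lambda)                  # one sample of size n
  half_width <- 1.96 * sd(x) / sqrt(n)  # 1.96 * s / sqrt(n) for this sample
  # TRUE if this sample's interval contains the true mean 1/lambda
  (mean(x) - half_width <= 1/lambda) && (1/lambda <= mean(x) + half_width)
})

coverage <- mean(covered)  # proportion of intervals that contain 1/lambda
coverage
```

Because s is estimated from a skewed sample of only 40 values, the proportion obtained this way can differ somewhat from the nominal 95%.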