Statistical Inference - Course Project

library(knitr)
library(ggplot2)
opts_chunk$set(echo=TRUE, results='asis', cache=TRUE)

Simulation

The exponential distribution can be simulated in R with rexp(n, lambda) where lambda is the rate parameter. The mean of exponential distribution is 1/lambda and the standard deviation is also 1/lambda. For these simulations, we set lambda to 0.2. We investigate the distribution of averages of 40 exponentials. For this purpose, we perform a thousand or so simulated averages of 40 exponentials.

set.seed(123)
lamda <- 0.2
nSamples <- 40
nSimulation <- 1000
exp_simulation <- replicate(nSimulation, expr = rexp(nSamples, lamda))
## Histogram of total simulated values
plot1 <- qplot(as.vector(exp_simulation), type = "histogram")+ xlab("values") + ylab("count")
plot1

## stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to adjust this.

plot of chunk unnamed-chunk-1

Now we’ll take the average of every simulation, we need to remember that this is a random variable itself.

exp_simulation_means <- apply(exp_simulation, MARGIN = 2, FUN = mean)
plot2 <-ggplot(as.data.frame(exp_simulation_means), aes(x = exp_simulation_means))+geom_histogram(aes(y=..density..)) + xlab("means of simulations") + geom_vline(xintercept= mean(exp_simulation_means), colour = "white", linetype = 4) + stat_function(fun = dnorm, args = list(mean = 1/lamda, sd = (1/lamda)/sqrt(nSamples)), size = 2)
plot2

## stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to adjust this.

plot of chunk unnamed-chunk-2

Show where the distribution is centered at and compare it to the theoretical center of the distribution.

On the latter graph you can see a vertical line that represent the average of the means of every simulation.

## Value of mean of simulation means
mean(exp_simulation_means)

[1] 5.012

## Value from expression 
1/lamda

[1] 5

Show how variable it is and compare it to the theoretical variance of the distribution.

## Value of standard diviation of simulation means
sd(exp_simulation_means)

[1] 0.7749

## Value from expression 
(1/lamda)/sqrt(nSamples)

[1] 0.7906

Show that the distribution is approximately normal.

We have already shown on our second graph how close are the average of means from the simulation to the normal distibution. We can also see the same thing trough a QQ plot:

qqnorm(exp_simulation_means)
qqline(exp_simulation_means, col = "red")

plot of chunk unnamed-chunk-5

Statistical Inference - Course Project - Part 1

Augusto Hassel

Sunday, November 23, 2014

Simulation

Show where the distribution is centered at and compare it to the theoretical center of the distribution.

Show how variable it is and compare it to the theoretical variance of the distribution.

Show that the distribution is approximately normal.