Victor D. Salda?a C.

PhD(c) in Geoinformatics Engineering
Technical University of Madrid (Spain)
September 2016

Overview.

This is the report of the first part of the Statistical Inference Peer Assignment: A simulation exercise. It includes supporting material such as the codes and figures.

1. Load packages.

#1. Let's load the packages we need. If you haven't download them do it first
library("ggplot2")
library("knitr")
library("datasets")
library("ggplot2")
library("ggpmisc")
library("gridExtra")
library("plyr")

3. Simulation.

#Create a logical vector to be filled with the means of the "n" samples from
#the exponential distribution. 
means<-vector()

#Loop to fill the vector with 1000 samples (with n= 40) means from the exponential distribution. 
for (i in 1:1000){
        means<-cbind(means,mean(rexp(40,rate=0.2)))
}

4. Sample Mean versus Theoretical Mean.

#histogram of the sample means (histmeans).
histmeans<-hist(means,xlab = "Sample Means", col="light gray",breaks=30,
                main="Sampling distribution of the mean")

#Let's show the sample mean and compare it to the theoretical mean 
#of the distribution. To calculate the center of the distribution (cd) is needed to 
#calculate the mean of the sampling distribution of the mean. On the other hand, the
#theoretical center (tc) of the distribution is just 1/lambda.
cd<-mean(means)
tc<-1/0.2

#Let's add these values to the histogram to show where the distribution is centered
#(blue line) at and compare it to the theoretical center (red line). 
abline(v=cd,col="blue",lwd=3)
abline(v=tc,col="red",lwd=3)

5. Sample Variance versus Theoretical Variance.

#Show how variable the sample is (via variance) and compare it to 
#the theoretical variance of the distribution.
vsd<-var(as.vector(means))
vtc<-(1/0.2^2)/40 
#vsd y vtc
## [1] 0.581875 0.625000

6. Normality of the distribution.

#Let's show that the distribution is approximately normal. Therefore, a QQPlot
#with normal theorical quantiles is going to be used. 
qqnorm(means)
qqline(means)

#As seen in the plot the distribution is quite normal