Statistical Inference part 1

Final project for Statistical Inference by Jose Ramon Pineda

Packages used: tidyverse, ggpubr

#PART 1

lambda <-  0.2
n <- 40
sim <- 1000
set.seed(seed = 421)
mu <- 1/lambda

To begin, we expect the mean of the sample means to be 1/lambda = 5.

test <- replicate(sim, rexp(n, lambda))
meantest <- data.frame(means = apply(test, 2, mean))

mean(meantest$means)

## [1] 5.017794

#The mean of the simulated sample means (5.02) is close to the expected value of 5
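
How close is "close"? A rough sketch, using only the numbers reported above: the mean of 1000 sample means has a Monte Carlo standard error of (1/lambda)/sqrt(n*sim), so the gap between 5.017794 and 5 can be expressed in standard errors. The names `observed`, `mc_se` and `z` are illustrative and not part of the original analysis.

```r
lambda <- 0.2
n <- 40
sim <- 1000

# Mean of the simulated sample means, copied from the output above.
observed <- 5.017794

# Standard deviation of the average of sim sample means,
# each of which averages n exponentials with sd 1/lambda.
mc_se <- (1 / lambda) / sqrt(n * sim)

# Distance from the theoretical mean, in standard errors.
z <- (observed - 1 / lambda) / mc_se

c(mc_se = mc_se, z = z)
```

The observed mean sits less than one standard error from 5, which is what we would expect from simulation noise alone.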

var <- ((1/lambda)/sqrt(n))^2
var

## [1] 0.625

testvar <- var(meantest$means)
testvar

## [1] 0.5995459

#We see that the expected variance (0.625) is a bit higher than the variance we got in the test (0.600)
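
The two comparisons so far can be collected side by side. This is a minimal sketch that re-runs the simulation with the same parameters and seed; `sim_means` and `summary_tbl` are names made up for this example, and the exact simulated values may depend on the R version.

```r
# Same parameters and seed as above.
lambda <- 0.2
n <- 40
sim <- 1000
set.seed(421)

# Each column of the matrix is one sample of n exponentials.
sim_means <- colMeans(replicate(sim, rexp(n, lambda)))

# Theoretical values for the mean of n iid exponentials:
# E[mean] = 1/lambda and Var[mean] = (1/lambda)^2 / n.
summary_tbl <- data.frame(
  quantity    = c("mean", "variance"),
  theoretical = c(1 / lambda, (1 / lambda)^2 / n),
  simulated   = c(mean(sim_means), var(sim_means))
)

# Ratio near 1 means simulation and theory agree.
summary_tbl$ratio <- summary_tbl$simulated / summary_tbl$theoretical
print(summary_tbl)
```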

ggplot(data = meantest, aes(x = means)) + 
        geom_histogram(binwidth=0.1, aes(y=..density..), alpha=0.2) + 
        geom_vline(xintercept = mu, size=1, colour="red") + 
        geom_density(colour="blue", size=1) +
        scale_x_continuous(breaks=seq(mu-5,mu+5,1), limits=c(mu-5,mu+6))  +
        labs(title = "Is this a normal distribution?", subtitle = "At first glance, yes") +
        stat_function(fun = dnorm, args = list(mean = mu , sd = sqrt(var)), colour = "red", size=1)

## Warning: Removed 2 rows containing missing values (geom_bar).

Graphically this looks like a normal distribution. The blue curve is the estimated density of our simulated means, the red curve is the normal density with mean mu and standard deviation sqrt(var), and the vertical red line marks mu. Let’s run a Shapiro-Wilk test to check whether the means are actually normal.

shapiro.test(meantest$means)

## 
##  Shapiro-Wilk normality test
## 
## data:  meantest$means
## W = 0.98919, p-value = 1.005e-06

ggqqplot(meantest$means, title = "QQ plot")

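The Shapiro-Wilk test returns an extremely small p-value (about 1e-06), so we reject the null hypothesis (H0) that the sample means are normally distributed. With 1000 observations the test is sensitive to even small departures from normality.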

We also draw a QQ plot to look at this more closely. Some of the points in the right tail fall outside the confidence band around the normal reference line, which likely explains the Shapiro-Wilk result.
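
One possible explanation for the heavier right tail: the mean of n independent exponentials follows a gamma distribution with shape n, whose skewness is 2/sqrt(n), about 0.32 for n = 40, whereas a normal distribution has skewness 0. The sketch below compares that value with the sample skewness of the simulated means. The helper `skewness` is defined here for illustration (it is not from tidyverse or ggpubr), and the simulation is re-run with the same seed.

```r
lambda <- 0.2
n <- 40
sim <- 1000
set.seed(421)
sim_means <- colMeans(replicate(sim, rexp(n, lambda)))

# Sample skewness: third central moment divided by the
# second central moment to the power 1.5 (simple moment
# estimator, no small-sample correction).
skewness <- function(x) {
  m <- mean(x)
  mean((x - m)^3) / mean((x - m)^2)^1.5
}

theoretical_skew <- 2 / sqrt(n)      # skewness of a gamma with shape n
sample_skew <- skewness(sim_means)   # skewness of the simulated means

c(theoretical = theoretical_skew, sample = sample_skew)
```

A positive sample skewness of similar size would be consistent with the right-tail points seen in the QQ plot.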