R Markdown

download.file("http://www.openintro.org/stat/data/ames.RData", destfile = "ames.RData")
load("ames.RData")
population <- ames$Gr.Liv.Area
samp <- sample(population, 60)
hist(samp)

describe the distribution

This histogram shows that the sample is skewed right.

Would you expect another student's distribution to be identical?

I would not expect another student's distribution to be identical. However, because the sample size is 60 (which is fairly large), I would expect theirs to be similar and to display the same overall shape.

sample mean

sample_mean <- mean(samp)
mean(samp)
## [1] 1472.583

95% confidence interval

se <- sd(samp) / sqrt(60)
lower <- sample_mean - 1.96 * se
upper <- sample_mean + 1.96 * se
c(lower, upper)
## [1] 1351.138 1594.029
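The same calculation can be wrapped in a small function. This is only a sketch: `ci_mean` is a hypothetical helper that is not part of the lab, and the data are simulated so the block runs without `ames.RData`. It takes the multiplier from `qnorm()` instead of typing 1.96.

```r
# Hypothetical helper (not from the lab): normal-approximation
# confidence interval for a mean.
ci_mean <- function(x, conf = 0.95) {
  z  <- qnorm(1 - (1 - conf) / 2)   # about 1.96 when conf = 0.95
  se <- sd(x) / sqrt(length(x))     # standard error of the sample mean
  c(lower = mean(x) - z * se, upper = mean(x) + z * se)
}

# Stand-in data, simulated only so the sketch is self-contained.
set.seed(1)
x <- rlnorm(60, meanlog = 7.2, sdlog = 0.3)
ci <- ci_mean(x)
ci
```

With the lab data, `ci_mean(samp)` should give nearly the same interval as the lines above. The small difference comes from rounding the multiplier to 1.96.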

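When will the confidence interval be valid?

When the sample is random, the observations are independent, and the sample size is at least 30 with a distribution that is not too strongly skewed.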

population mean

mean(population)
## [1] 1499.69

## Does your confidence interval capture the true average size of houses in Ames?

Yes, it does: the interval (1351.138, 1594.029) contains the population mean of 1499.69.

Each student in your class should have gotten a slightly different confidence interval. What proportion of those intervals would you expect to capture the true population mean?

About 95%, or roughly 47 or 48 out of 50, since the method captures the true mean for 95% of random samples.
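As a rough check on the expected count, the number of capturing intervals can be modeled as binomial. This is a sketch that assumes the 50 samples are independent and that each interval captures the mean with probability 0.95. The variable names are illustrative.

```r
n_intervals <- 50
conf_level  <- 0.95

# Expected number of intervals that capture the population mean
expected_hits <- n_intervals * conf_level   # 47.5

# Probability of each possible number of captures, 0 through 50
p_hits <- dbinom(0:n_intervals, size = n_intervals, prob = conf_level)

# Chance that 49 or more of the 50 intervals capture the mean
# (element k + 1 of p_hits is the probability of exactly k captures)
p_49_or_more <- sum(p_hits[50:51])
p_49_or_more
```

The expected count is not a whole number, so no single class result can match 95% exactly when there are 50 intervals.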

confidence interval for first of 50 samples

samp_mean <- rep(NA, 50)
samp_sd <- rep(NA, 50)
n <- 60

for(i in 1:50){
  samp <- sample(population, n) # obtain a sample of size n = 60 from the population
  samp_mean[i] <- mean(samp)    # save sample mean in ith element of samp_mean
  samp_sd[i] <- sd(samp)        # save sample sd in ith element of samp_sd
}
lower_vector <- samp_mean - 1.96 * samp_sd / sqrt(n) 
upper_vector <- samp_mean + 1.96 * samp_sd / sqrt(n)

c(lower_vector[1], upper_vector[1])
## [1] 1403.571 1628.363

on your own

## 1. What proportion of your confidence intervals include the true population mean? Is this proportion exactly equal to the confidence level? If not, explain why.

plot_ci(lower_vector, upper_vector, mean(population))

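Only 1 of the 50 intervals missed the true population mean, so 49 of 50 (98%) captured it. This is not exactly equal to the 95% confidence level, and it does not have to be: 95% is the long-run capture rate, so with only 50 intervals the observed proportion varies by random chance. The more samples you take, the closer the proportion should get to 95%.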
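The proportion can also be counted in code instead of read off the plot. The sketch below follows the loop above, but it uses a simulated right-skewed population as a stand-in (an assumption, so the block runs without `ames.RData`). Its counts will therefore differ from the lab's.

```r
set.seed(42)
# Stand-in population (simulated); the lab uses ames$Gr.Liv.Area instead.
population <- rlnorm(3000, meanlog = 7.25, sdlog = 0.33)
n <- 60

samp_mean <- rep(NA, 50)
samp_sd   <- rep(NA, 50)
for (i in 1:50) {
  samp <- sample(population, n)
  samp_mean[i] <- mean(samp)
  samp_sd[i]   <- sd(samp)
}
lower_vector <- samp_mean - 1.96 * samp_sd / sqrt(n)
upper_vector <- samp_mean + 1.96 * samp_sd / sqrt(n)

# TRUE where an interval contains the population mean
mu <- mean(population)
captured <- lower_vector <= mu & mu <= upper_vector
sum(captured)    # number of intervals that capture the mean
mean(captured)   # proportion of intervals that capture the mean
```

With the real `lower_vector` and `upper_vector`, only the last four lines are needed.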

## 2. Pick a confidence level of your choosing, provided it is not 95%. What is the appropriate critical value?

for 90% confidence

qnorm(0.95)  # a 90% interval leaves 5% in each tail, so use the 0.95 quantile
## [1] 1.644854
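For a two-sided interval, the critical value is the quantile at 1 - (1 - level) / 2, so 90% confidence corresponds to the 0.95 quantile, not the 0.90 quantile. A minimal sketch follows. `crit_value` is a hypothetical helper name, and it uses the normal distribution to match the 1.96 multiplier used earlier.

```r
# Two-sided critical value: put (1 - level) / 2 in each tail.
crit_value <- function(level) qnorm(1 - (1 - level) / 2)

crit_value(0.90)  # about 1.645
crit_value(0.95)  # about 1.96
crit_value(0.99)  # about 2.576

# A t-based alternative for samples of size 60 uses n - 1 = 59
# degrees of freedom and gives a slightly larger multiplier.
t_90 <- qt(0.95, df = 59)
```

The normal and t values are close at n = 60, so either choice gives nearly the same interval here.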

## 3. Calculate 50 confidence intervals at the confidence level you chose in the previous question.

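# 1.6 is a rounded stand-in for the 90% critical value (qnorm(0.95) is about 1.645).
# This reuses sample_mean and se from the first sample, so it gives one interval.
lower <- sample_mean - 1.6 * se
upper <- sample_mean + 1.6 * se
c(lower, upper)
## [1] 1373.444 1571.723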
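To get all 50 intervals at the 90% level, the multiplier can be applied to 50 sample means and standard deviations at once. The sketch below is one way to do that. It draws from a simulated stand-in population (an assumption, so it runs without `ames.RData`) and uses `replicate()` in place of the `for` loop. The names `z_90`, `lower_90`, and `upper_90` are illustrative.

```r
set.seed(7)
# Stand-in population (simulated); the lab uses ames$Gr.Liv.Area instead.
population <- rlnorm(3000, meanlog = 7.25, sdlog = 0.33)
n    <- 60
z_90 <- qnorm(0.95)   # critical value for 90% confidence

# Each column of `stats` holds the mean and sd of one sample of size n.
stats <- replicate(50, {
  s <- sample(population, n)
  c(mean(s), sd(s))
})
samp_mean <- stats[1, ]
samp_sd   <- stats[2, ]

lower_90 <- samp_mean - z_90 * samp_sd / sqrt(n)
upper_90 <- samp_mean + z_90 * samp_sd / sqrt(n)

# Proportion of the 50 intervals that contain the population mean
mu <- mean(population)
mean(lower_90 <= mu & mu <= upper_90)
```

With the lab data, `plot_ci(lower_90, upper_90, mean(population))` would draw these intervals the same way as the 95% ones. About 90% of them should capture the mean, so a few more misses are expected than at 95%.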