library(openintro)
download.file("http://www.openintro.org/stat/data/ames.RData", destfile = "ames.RData")
load("ames.RData")
population <- ames$Gr.Liv.Area
samp <- sample(population, 60)

Exercise 1

Describe the distribution of your sample. What would you say is the “typical” size within your sample? Also state precisely what you interpreted “typical” to mean.

hist(samp)

summary(samp)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##     492    1103    1394    1475    1733    3222

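The distribution is right skewed: the middle half of the homes fall between 1103 and 1733 square feet, with a tail extending past 3000. I interpreted “typical” to mean the sample mean, which is 1475. The median, 1394, is lower than the mean, as expected for a right-skewed distribution.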

Exercise 2

I would not expect another student’s data to be exactly the same because a random 60 values are being chosen. It should be similar, but not the same. …
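The point about randomness can be checked directly. The sketch below is not part of the lab: it uses a small made-up vector (`toy_population`, a hypothetical stand-in for `ames$Gr.Liv.Area`) and shows that two draws generally differ unless the random seed is fixed first with `set.seed()`.

```r
# Hypothetical stand-in for the population; the lab itself uses ames$Gr.Liv.Area.
toy_population <- seq(500, 3500, by = 10)

# Two draws without a shared seed will usually differ.
draw_a <- sample(toy_population, 60)
draw_b <- sample(toy_population, 60)
identical(draw_a, draw_b)   # almost certainly FALSE

# Fixing the same seed before each draw reproduces the sample exactly.
set.seed(42)
seeded_a <- sample(toy_population, 60)
set.seed(42)
seeded_b <- sample(toy_population, 60)
identical(seeded_a, seeded_b)   # TRUE
```

The seed value 42 is arbitrary; any fixed integer would do.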

sample_mean <- mean(samp)
se <- sd(samp) / sqrt(60)
lower <- sample_mean - 1.96 * se
upper <- sample_mean + 1.96 * se
c(lower, upper)
## [1] 1348.223 1602.010
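The same arithmetic can be wrapped in a small function so that the confidence level is a parameter rather than a hard-coded 1.96. This is only a sketch: `mean_ci()` is a hypothetical helper, not a function from `openintro` or base R, and the toy vector is made up purely to show the mechanics (seven values are too few for the normal approximation to be trusted).

```r
# mean_ci() is a hypothetical helper written for this sketch.
# It returns a normal-approximation confidence interval for the mean of x.
mean_ci <- function(x, conf_level = 0.95) {
  n <- length(x)
  se <- sd(x) / sqrt(n)                       # standard error of the mean
  z_star <- qnorm(1 - (1 - conf_level) / 2)   # about 1.96 when conf_level = 0.95
  c(lower = mean(x) - z_star * se,
    upper = mean(x) + z_star * se)
}

# Usage on a small made-up vector (illustrative values only)
toy <- c(900, 1100, 1250, 1400, 1600, 1750, 2100)
ci_95 <- mean_ci(toy)
ci_90 <- mean_ci(toy, conf_level = 0.90)   # narrower than the 95% interval
ci_95
ci_90
```

In the lab itself the call would presumably be `mean_ci(samp)`, which should reproduce the interval computed by hand above up to the rounding of 1.96.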

Exercise 3

For the confidence interval to be valid, the sample mean must be normally distributed and have standard error $s / \sqrt{n}$. What conditions must be met for this to be true?

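Three conditions must be met. The observations must come from a random sample, so that they are independent. The sample size must be sufficiently large (at least 30). The population distribution must not be strongly skewed. The stronger the skew, the larger the sample needed for the sample mean to be approximately normal.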
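Two of these conditions can be screened numerically. The sketch below is an assumption-laden illustration: `check_ci_conditions()` is a hypothetical helper, and the cutoffs it uses (n at least 30, n at most 10% of the population, absolute skewness below 1) are rules of thumb chosen for this example, not fixed requirements. Whether the sample was drawn at random cannot be verified from the data alone.

```r
# check_ci_conditions() is a hypothetical helper written for this sketch.
# The cutoffs are rule-of-thumb assumptions, not fixed requirements.
check_ci_conditions <- function(x, population_size) {
  n <- length(x)
  # Moment-based sample skewness: mean cubed deviation divided by sd cubed
  skew <- mean((x - mean(x))^3) / sd(x)^3
  c(large_enough    = n >= 30,
    under_10_pct    = n <= 0.10 * population_size,
    not_very_skewed = abs(skew) < 1)
}

# Usage with a made-up symmetric sample of 60 values
# and a hypothetical population of 2000 units
toy_sample <- seq(1000, 2180, by = 20)
conditions <- check_ci_conditions(toy_sample, population_size = 2000)
conditions
```

For the lab data the call would presumably be `check_ci_conditions(samp, length(population))`.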

Exercise 4

What does “95% confidence” mean? If you’re not sure, see Section 4.2.2.

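“95% confidence” means that if we repeatedly drew samples and built an interval from each one in the same way, about 95% of those intervals would contain the true population mean. For our single interval, we say we are 95% confident that the true population mean lies between its lower and upper bounds.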

mean(population)
## [1] 1499.69

Exercise 5

Does your confidence interval capture the true average size of houses in Ames?

Mean = 1499.69
Confidence interval = (1348.223, 1602.010)

Yes, the population mean of 1499.69 lies within the confidence interval.
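The comparison can also be done in code rather than by eye. A minimal sketch, using the interval printed in the Exercise 2 output and the population mean printed above; the variable names `ci_lower`, `ci_upper`, `pop_mean`, and `captured` are introduced here for illustration only.

```r
# Values copied from the printed output in this report
ci_lower <- 1348.223
ci_upper <- 1602.010
pop_mean <- 1499.69

# TRUE when the interval [ci_lower, ci_upper] contains the population mean
captured <- ci_lower <= pop_mean & pop_mean <= ci_upper
captured   # TRUE
```

With the lab objects still in memory, the equivalent check would be `lower <= mean(population) & mean(population) <= upper`.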

Exercise 6

Each student in your class should have gotten a slightly different confidence interval. What proportion of those intervals would you expect to capture the true population mean? Why? If you are working in this lab in a classroom, collect data on the intervals created by other students in the class and calculate the proportion of intervals that capture the true population mean.

We would expect about 95% of those intervals to capture the true population mean, because each student built the interval with a procedure that captures the mean 95% of the time.
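How far the observed proportion might stray from 95% can be sketched with a binomial model. The assumption, stated loudly, is that each interval captures the mean independently with probability 0.95. The count of 50 intervals matches the simulation in the lab code; the variable names are introduced here for illustration.

```r
# Assumption for this sketch: each interval captures the mean independently
# with probability 0.95, so the number of captures is Binomial(50, 0.95).
n_intervals <- 50
p_capture <- 0.95

expected_captures <- n_intervals * p_capture                    # 47.5
sd_captures <- sqrt(n_intervals * p_capture * (1 - p_capture))  # about 1.54

# Probability that all 50 intervals capture the mean
prob_all <- dbinom(n_intervals, size = n_intervals, prob = p_capture)
c(expected = expected_captures, sd = sd_captures, all_capture = prob_all)
```

Under this model, seeing anywhere from about 46 to 49 captures out of 50 would be unremarkable.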

samp_mean <- rep(NA, 50)
samp_sd <- rep(NA, 50)
n <- 60
for(i in 1:50){
  samp <- sample(population, n) # obtain a sample of size n = 60 from the population
  samp_mean[i] <- mean(samp)    # save sample mean in ith element of samp_mean
  samp_sd[i] <- sd(samp)        # save sample sd in ith element of samp_sd
}
lower_vector <- samp_mean - 1.96 * samp_sd / sqrt(n) 
upper_vector <- samp_mean + 1.96 * samp_sd / sqrt(n)
c(lower_vector[1], upper_vector[1])
## [1] 1406.230 1654.603

On your own

plot_ci(lower_vector, upper_vector, mean(population))

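47 out of 50 confidence intervals (94%) include the true population mean. This is close to, but slightly below, the nominal 95% level. With only 50 intervals the proportion can change only in steps of 2%, so exactly 95% is not attainable, and some run-to-run variation is expected.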
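Counting captured intervals from the plot is easy to get wrong, so it may help to compute the proportion directly. A minimal sketch, where `coverage()` is a hypothetical helper and the three toy intervals are made up to show what it returns:

```r
# coverage() is a hypothetical helper: the share of intervals that contain mu.
coverage <- function(lower, upper, mu) {
  mean(lower <= mu & mu <= upper)
}

# Toy check with three made-up intervals and mu = 4.5:
# [1, 4] misses, while [2, 5] and [3, 6] capture, so the result is 2/3.
toy_coverage <- coverage(c(1, 2, 3), c(4, 5, 6), mu = 4.5)
toy_coverage

# In the lab this would be called as:
# coverage(lower_vector, upper_vector, mean(population))
```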

lower_vector_90 <- samp_mean - 1.645 * samp_sd / sqrt(n)  # 1.645 is the 90% critical value
upper_vector_90 <- samp_mean + 1.645 * samp_sd / sqrt(n)
plot_ci(lower_vector_90, upper_vector_90, mean(population))

At a 90% confidence level the critical value is 1.645. A higher percentage of the intervals miss the true population mean, which is what we would expect at a lower confidence level: the smaller critical value makes each interval narrower.
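Critical values such as 1.645 and 1.96 do not have to be looked up in a table. The sketch below derives them with `qnorm()` for three common confidence levels and compares the resulting interval widths; the variable names are introduced here for illustration.

```r
conf_levels <- c(0.90, 0.95, 0.99)

# Two-sided critical value: the standard normal quantile that leaves
# (1 - conf_level) / 2 in each tail
z_star <- qnorm(1 - (1 - conf_levels) / 2)
round(z_star, 3)   # 1.645 1.960 2.576

# Width of each interval relative to the 95% interval
# (for the same sample, width is proportional to the critical value)
relative_width <- z_star / z_star[2]
round(relative_width, 3)
```

For the same sample, a 90% interval is therefore narrower than a 95% interval, which is why fewer 90% intervals capture the population mean.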
