library(tidyverse)
library(openintro)
download.file("http://www.openintro.org/stat/data/ames.RData", destfile = "ames.RData")
load("ames.RData")
Exercise 1
Shape: right-skewed
Center: mean = 1505 sq. ft.
Spread: s = 517.4 sq. ft.
Let the “typical” house size be the mean of the house sizes. The
typical house size of the sample is 1505 square feet.
set.seed(1)
population <- ames$Gr.Liv.Area
samp <- sample(population, 60)
hist(samp)

summary(samp)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
##     672    1100    1484    1505    1841    2690
sd(samp)
## [1] 517.4003
Exercise 2
I wouldn’t expect another student’s distribution to be identical to
mine, because each sample is drawn at random (and in R it definitely is,
unless we share a seed). I would still expect the distributions to look
broadly similar, since a sample of 60 is large enough to reflect the
overall shape of the population.
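To illustrate, here is a minimal sketch that draws two independent samples of 60 and compares their summaries. So that it runs without the Ames download, it uses a simulated right-skewed stand-in population; `sim_population` and its log-normal parameters are assumptions for illustration, not the real `Gr.Liv.Area` values.

```r
# Simulated stand-in population (assumption: log-normal, right-skewed;
# this is NOT the real Ames data).
set.seed(42)
sim_population <- rlnorm(3000, meanlog = log(1440), sdlog = 0.33)

# Two "students" each draw their own random sample of 60 houses.
samp_a <- sample(sim_population, 60)
samp_b <- sample(sim_population, 60)

# The samples contain different houses, so the summaries differ somewhat,
# but both should sit reasonably close to the population mean.
rbind(student_a = summary(samp_a), student_b = summary(samp_b))
c(population_mean = mean(sim_population))
```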
Exercise 3
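The conditions that we must meet to use these calculations are that
the observations are independent and identically distributed (i.i.d.),
which holds when the sampling is random and the sample size is no larger
than 10% of the population size, and that the sample is sufficiently
large (and the population not too strongly skewed) for the sampling
distribution of the mean to be approximately normal.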
sample_mean <- mean(samp)
se <- sd(samp) / sqrt(60)
lower <- sample_mean - 1.96 * se
upper <- sample_mean + 1.96 * se
c(lower, upper)
## [1] 1374.296 1636.137
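The same calculation can be sketched with the critical value computed by `qnorm()` instead of hard-coding 1.96, alongside rough checks of the conditions. The helper `z_interval()`, the simulated `sim_population`, and the "at least 30 observations" rule of thumb are assumptions made for this illustration, not part of the lab.

```r
# Simulated stand-in population (assumption, not the real Ames data).
set.seed(1)
sim_population <- rlnorm(3000, meanlog = log(1440), sdlog = 0.33)
samp <- sample(sim_population, 60)

# Rough condition checks: the sample is at most 10% of the population,
# and it is large enough by the common n >= 30 rule of thumb.
n <- length(samp)
conditions <- c(
  at_most_10_percent = n <= 0.10 * length(sim_population),
  at_least_30_obs    = n >= 30
)

# Hypothetical helper: z-based confidence interval for a mean.
z_interval <- function(x, conf_level = 0.95) {
  z_star <- qnorm(1 - (1 - conf_level) / 2)  # roughly 1.96 for 95%
  se <- sd(x) / sqrt(length(x))              # estimated standard error
  c(lower = mean(x) - z_star * se, upper = mean(x) + z_star * se)
}

conditions
ci <- z_interval(samp)
ci
```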
Exercise 4
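95% confidence means that if we repeatedly drew samples of this size and
built an interval from each one in the same way, about 95% of those
intervals would contain the population parameter we are trying to
estimate. The parameter is fixed, so any single interval either contains
it or does not; the 95% describes the method, not one interval.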
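The long-run meaning of a confidence level can be checked by simulation. The sketch below repeats the sample-and-interval procedure many times on a simulated stand-in population (`sim_population`, an assumption rather than the Ames data) and reports the share of intervals that contain the population mean.

```r
# Simulated stand-in population (assumption, not the real Ames data).
set.seed(2024)
sim_population <- rlnorm(3000, meanlog = log(1440), sdlog = 0.33)
mu <- mean(sim_population)
n <- 60

# Repeat the procedure 5000 times; record whether each 95% interval
# contains the population mean.
captured <- replicate(5000, {
  s <- sample(sim_population, n)
  half_width <- 1.96 * sd(s) / sqrt(n)
  abs(mean(s) - mu) <= half_width
})

# Share of intervals that captured the mean; expected to be near 0.95.
coverage <- mean(captured)
coverage
```

With a skewed population and n = 60 the observed share will not be exactly 0.95, but it should land close to it.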
Exercise 5
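My confidence interval does capture the true average house size in
Ames: the population mean of 1499.69 sq. ft. lies between the interval
bounds of 1374.296 and 1636.137.
c(lower, upper)
## [1] 1374.296 1636.137
mean(population)
## [1] 1499.69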
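The capture check can also be written as a logical expression rather than read off by eye. This small sketch re-enters the values printed above; the variable names `population_mean` and `captured` are illustrative.

```r
# Values copied from the output above.
lower <- 1374.296
upper <- 1636.137
population_mean <- 1499.69

# The interval captures the mean when the mean lies between the bounds.
captured <- lower <= population_mean && population_mean <= upper
captured  # TRUE

# How far the population mean sits from the centre of the interval
# (the centre is the sample mean).
sample_mean <- (lower + upper) / 2
population_mean - sample_mean
```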
Exercise 6
If every student draws an independent sample, the number of students in
our class whose intervals capture the population mean follows a binomial
distribution with success probability 0.95. The proportion of students who
captured the population mean should therefore be around 0.95, because the
confidence level is the long-run proportion of intervals, built this way
from repeated samples, that contain the true parameter.
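The binomial reasoning can be sketched numerically. The class size of 30 below is hypothetical (the lab does not state one), and the calculation assumes independent samples with a true capture probability of exactly 0.95.

```r
class_size <- 30    # hypothetical class size, for illustration only
p_capture  <- 0.95  # assumed probability that one interval captures the mean

# Expected number of capturing intervals and the standard deviation
# of the class-wide proportion.
expected_captures <- class_size * p_capture
sd_proportion <- sqrt(p_capture * (1 - p_capture) / class_size)

# Binomial probabilities for 0, 1, ..., class_size capturing intervals.
capture_probs <- dbinom(0:class_size, size = class_size, prob = p_capture)

# Probability that every student's interval captures the mean.
p_all_capture <- dbinom(class_size, size = class_size, prob = p_capture)

c(expected = expected_captures, sd_prop = sd_proportion, all_capture = p_all_capture)
```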
On your Own
- 0.96 is the proportion of confidence intervals that captured the
population mean (48/50): 2 intervals (in red) did not capture it. The
proportion is not exactly equal to the confidence level, and it could not
be, because no integer divided by 50 equals 0.95 (47.5 captures makes no
sense when each sample is a discrete unit). The closest achievable values
are 0.94 and 0.96, and random variation between sets of 50 samples means
the observed proportion can land somewhat further away as well.
set.seed(5)
samp_mean <- rep(NA, 50)
samp_sd <- rep(NA, 50)
n <- 60
for (i in 1:50) {
  samp <- sample(population, n) # obtain a sample of size n = 60 from the population
  samp_mean[i] <- mean(samp)    # save sample mean in ith element of samp_mean
  samp_sd[i] <- sd(samp)        # save sample sd in ith element of samp_sd
}
lower_vector <- samp_mean - 1.96 * samp_sd / sqrt(n)
upper_vector <- samp_mean + 1.96 * samp_sd / sqrt(n)
plot_ci(lower_vector, upper_vector, mean(population))
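To make the "ratio of an integer to 50" point concrete, this sketch lists the capture counts closest to the 95% target and the binomial probability of each nearby count. It assumes 50 independent intervals with a true capture probability of exactly 0.95; the variable names are illustrative.

```r
n_intervals <- 50
counts <- 0:n_intervals

# 95% of 50 intervals is 47.5, which no whole number of captures can match.
target_count <- 47.5
distance <- abs(counts - target_count)
closest_counts <- counts[distance == min(distance)]
closest_counts                # 47 48
closest_counts / n_intervals  # 0.94 0.96

# Probability of each capture count from 44 to 50 under the assumptions above.
capture_probs <- dbinom(counts, size = n_intervals, prob = 0.95)
names(capture_probs) <- counts
round(capture_probs[counts >= 44], 3)
```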

- Let the new confidence level be 99.7%. I will use the t confidence
interval, not the z confidence interval, because the two should be
very similar for n = 60 anyway and I want the practice.
# tail area: (1 - 0.997) / 2 = 0.003 / 2 = 0.0015, so the critical value
# is the 1 - 0.0015 = 0.9985 quantile of the t distribution with n - 1 = 59 df
qt(0.9985, df = 59)
## [1] 3.095932
# this critical value replaces 1.96 in the interval formula
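As a check on how close the t and z approaches are at this sample size, the sketch below computes both critical values for a 99.7% level. The variable names are illustrative.

```r
conf_level <- 0.997
tail_area <- (1 - conf_level) / 2   # 0.0015 in each tail

# Critical values: standard normal versus t with n - 1 = 59 degrees of freedom.
z_star <- qnorm(1 - tail_area)
t_star <- qt(1 - tail_area, df = 59)

c(z = z_star, t = t_star, ratio = t_star / z_star)
```

The t critical value is slightly larger, so the t interval is slightly wider than the z interval for the same sample.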
- The proportion of the confidence intervals that captured the
population mean was 1.00. This is, similarly, as close as you can get
to 0.997 as a ratio of an integer to 50. What I would be interested in
is how much wider these CIs are than those for 0.95, which is a simple
calculation (the critical values give 3.096 / 1.96, or about 1.58 times
as wide). More importantly, is this sacrifice in precision worth the
increased confidence? At what confidence level does the interval
become practically useless?
lower_vector <- samp_mean - qt(0.9985, df = 59) * samp_sd / sqrt(n)
upper_vector <- samp_mean + qt(0.9985, df = 59) * samp_sd / sqrt(n)
plot_ci(lower_vector, upper_vector, mean(population))
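One way to explore the precision question is to tabulate how the interval width grows with the confidence level. This sketch uses t critical values with 59 degrees of freedom and expresses each width relative to the 95% t interval; the chosen levels and variable names are assumptions for illustration.

```r
# Confidence levels to compare (chosen for illustration).
conf_levels <- c(0.90, 0.95, 0.99, 0.997, 0.9999)

# t critical value for each level, with n - 1 = 59 degrees of freedom.
t_star <- qt(1 - (1 - conf_levels) / 2, df = 59)

# Width relative to the 95% interval: for a fixed sample, width is
# proportional to the critical value.
relative_width <- t_star / qt(0.975, df = 59)

data.frame(
  conf_level     = conf_levels,
  t_star         = round(t_star, 3),
  relative_width = round(relative_width, 2)
)
```

The width keeps growing as the level approaches 1, so how wide is too wide depends on the precision the application needs.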
