This module is about the intersection between statistical computing and statistical inference.
All questions refer to the College data in the ISLR package, which comes from the 1995 US News and World Report data on 777 US colleges and universities.
install.packages("ISLR") # Only do this the first time you use the package
library(ISLR) # Adds package to working library
attach(College)
dim(College) # Shows the number of rows and columns in the dataset
## [1] 777 18
help(College) # Shows the documentation associated with the dataset
## starting httpd help server ... done
Submit .rmd and .html (output after knitting an .rmd file) files to eCampus by 11:59pm on September 5, 2022.
Not sure how to use R Markdown yet? No problem! This is a very helpful resource.
In this question, we will look at the use of different functions to
understand how to use best practices in R based on timing and based on
results. We will go through three different storage methods for storing
and running bootstrap intervals. Run a bootstrap confidence interval for
the Apps variable in the College dataset (this is the
number of applications received), using 1,000 bootstrap samples.
Calculate the 80% Confidence interval for the mean number of
applications each school receives. Pick one of these three methods to
use (Hint: Check out the Atlanta Commute Example):
Interpret the 80% confidence interval that you calculated.
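As a rough illustration of the bootstrap procedure described in this question, the sketch below computes a percentile interval for a mean. The helper name `boot_mean_ci`, the seed value, and the choice of a pre-allocated numeric vector for storage are all assumptions made for this example; they are not prescribed by the assignment, and other storage methods would work as well.

```r
# Percentile bootstrap confidence interval for the mean of a numeric vector.
# boot_mean_ci is a hypothetical helper, not part of ISLR or base R.
boot_mean_ci <- function(x, B = 1000, level = 0.80, seed = 1) {
  set.seed(seed)                 # arbitrary seed, chosen for reproducibility
  n <- length(x)
  boot_means <- numeric(B)       # pre-allocated storage for the B resampled means
  for (b in seq_len(B)) {
    # Resample n observations with replacement and record the mean
    boot_means[b] <- mean(sample(x, size = n, replace = TRUE))
  }
  alpha <- 1 - level
  # The middle 80% of the bootstrap means gives the percentile interval
  quantile(boot_means, probs = c(alpha / 2, 1 - alpha / 2))
}

# Usage on the Apps variable, assuming the ISLR package is installed
if (requireNamespace("ISLR", quietly = TRUE)) {
  print(boot_mean_ci(ISLR::College$Apps, B = 1000, level = 0.80, seed = 1))
}
```

The two quantiles returned are the 10th and 90th percentiles of the bootstrap means, which together bound the 80% interval.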
Building on problem 1, take your preferred method and calculate the intervals using 3 different random seeds (you used one previously so you need two more) to see how sensitive the result is to the random number generator.
After this, you will have 3 intervals with 1,000 bootstrap samples. Using the same three seeds and methodology, calculate intervals using 5,000 bootstrap samples. Report the resulting intervals and comment on the results.
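One possible way to organize the seed comparison is to tabulate the interval for every combination of seed and bootstrap size. This is only a sketch: the helper names `boot_ci` and `compare_seeds` are hypothetical, and the seeds 1, 2, and 3 are arbitrary placeholders for whichever seeds you actually use.

```r
# Hypothetical helper: percentile bootstrap CI for the mean under a given seed.
boot_ci <- function(x, B, level = 0.80, seed) {
  set.seed(seed)
  means <- replicate(B, mean(sample(x, replace = TRUE)))
  alpha <- 1 - level
  unname(quantile(means, c(alpha / 2, 1 - alpha / 2)))
}

# Hypothetical helper: one row per (seed, B) pair with the interval and its width.
compare_seeds <- function(x, seeds = c(1, 2, 3), sizes = c(1000, 5000)) {
  grid <- expand.grid(seed = seeds, B = sizes)
  # mapply returns a 2 x nrow(grid) matrix; transpose so each row is one interval
  ci <- t(mapply(function(s, B) boot_ci(x, B = B, seed = s), grid$seed, grid$B))
  data.frame(grid, lower = ci[, 1], upper = ci[, 2], width = ci[, 2] - ci[, 1])
}

# Usage on the Apps variable, assuming the ISLR package is installed
if (requireNamespace("ISLR", quietly = TRUE)) {
  print(compare_seeds(ISLR::College$Apps))
}
```

Comparing the `lower` and `upper` columns across seeds within each value of `B` shows how much the endpoints move with the random number generator alone.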
In the same College dataset, we have the Private
variable, which is an indicator of what schools are private versus
public. Test the statement that public schools have more applications
than private schools, using randomized distributions. Pick a reasonable
significance level and state your result. Comment on the data size. Are
777 data points enough to make any conclusions? Defend your statement
(don't need to do any analysis, just give reasoning). Hint:
Check the example average SAT scores between men and women.
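A randomization test of this kind can be sketched by shuffling the group labels and recomputing the difference in means each time. The helper name `perm_test`, the number of shuffles, and the seed are assumptions for illustration. The sketch relies on the Private factor having levels "No" (public) and "Yes" (private).

```r
# Hypothetical helper: one-sided randomization test of
# H0: mean(public) = mean(private)  vs.  Ha: mean(public) > mean(private).
# `group` is assumed to be a factor with levels "No" (public) and "Yes" (private).
perm_test <- function(y, group, B = 5000, seed = 1) {
  set.seed(seed)
  # Test statistic: public mean minus private mean
  stat <- function(g) mean(y[g == "No"]) - mean(y[g == "Yes"])
  observed <- stat(group)
  # Shuffling the labels breaks any link between group and response,
  # which simulates the distribution of the statistic under H0
  null_stats <- replicate(B, stat(sample(group)))
  # Share of shuffles at least as extreme as the observed difference
  # (the +1 terms count the observed arrangement itself)
  p_value <- (sum(null_stats >= observed) + 1) / (B + 1)
  list(observed = observed, p_value = p_value)
}

# Usage on the College data, assuming the ISLR package is installed
if (requireNamespace("ISLR", quietly = TRUE)) {
  print(perm_test(ISLR::College$Apps, ISLR::College$Private))
}
```

The resulting p-value is then compared against whatever significance level you chose before running the test.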
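# Private is a factor with levels "No" and "Yes" (not "1" and "2"),
# so subset the full data frame on those labels.
private <- subset(College, Private == "Yes") # private schools
public <- subset(College, Private == "No") # public schools
dim(private)
## [1] 565 18
dim(public)
## [1] 212 18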
College$Private
## [1] Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes
## [19] Yes No Yes No Yes No Yes No Yes No Yes Yes Yes Yes Yes Yes Yes Yes
## [37] Yes Yes Yes Yes Yes Yes Yes Yes Yes No Yes Yes Yes Yes Yes Yes Yes Yes
## [55] Yes Yes No Yes Yes Yes Yes No Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes
## [73] Yes Yes Yes Yes Yes Yes No No Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes
## [91] Yes Yes No Yes Yes Yes Yes Yes Yes Yes Yes Yes No No No Yes Yes Yes
## [109] Yes Yes Yes Yes No Yes Yes Yes Yes Yes No No Yes Yes Yes Yes Yes No
## [127] Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes No Yes Yes No Yes Yes
## [145] Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes No
## [163] Yes Yes Yes No Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes No No Yes Yes
## [181] No No Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes No Yes Yes No Yes No
## [199] Yes Yes Yes No Yes No Yes Yes No No Yes Yes Yes Yes Yes Yes Yes Yes
## [217] Yes Yes No Yes Yes Yes No No Yes Yes Yes Yes Yes Yes Yes Yes Yes No
## [235] Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes
## [253] Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes No
## [271] Yes Yes Yes No No Yes Yes No Yes No Yes No Yes Yes Yes No Yes Yes
## [289] No Yes No Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes No Yes No
## [307] Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes No Yes Yes Yes No No Yes Yes
## [325] No No Yes Yes Yes Yes Yes Yes Yes No Yes Yes Yes Yes Yes Yes No Yes
## [343] Yes Yes Yes Yes Yes No Yes Yes Yes Yes Yes Yes Yes No Yes Yes Yes Yes
## [361] Yes Yes Yes No Yes No No No Yes No Yes Yes Yes Yes Yes No No No
## [379] Yes Yes Yes No No No Yes No Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes
## [397] Yes Yes Yes Yes Yes No Yes Yes Yes No No Yes Yes Yes No No No Yes
## [415] Yes No Yes No Yes No No No Yes Yes Yes Yes Yes No Yes Yes Yes Yes
## [433] No Yes Yes Yes No Yes Yes Yes Yes Yes Yes Yes No No Yes No Yes Yes
## [451] Yes Yes Yes Yes Yes Yes Yes No Yes Yes Yes No Yes Yes Yes No No Yes
## [469] Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes No No No No
## [487] Yes Yes Yes No Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes
## [505] Yes Yes Yes Yes No Yes No Yes Yes Yes Yes Yes Yes Yes Yes Yes No Yes
## [523] Yes Yes Yes Yes Yes Yes Yes No No No Yes No Yes Yes No No Yes Yes
## [541] Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes No Yes Yes Yes Yes Yes Yes Yes
## [559] No Yes No No No No No No No No No No No No No No Yes Yes
## [577] Yes Yes Yes Yes Yes No No Yes Yes No Yes No Yes Yes Yes No Yes Yes
## [595] Yes Yes Yes Yes Yes Yes Yes Yes No No No No No No Yes Yes No No
## [613] Yes Yes Yes Yes Yes Yes Yes No No Yes No No No Yes No Yes No No
## [631] No No No No No No Yes No No No No No No No No Yes No No
## [649] Yes No No No No No No No No No No No Yes No No Yes No Yes
## [667] Yes No Yes Yes Yes Yes No Yes No No No Yes No No No Yes Yes No
## [685] No No No Yes Yes Yes Yes No No No No No No No No No No No
## [703] No Yes Yes Yes No Yes Yes Yes Yes No No No Yes Yes Yes Yes Yes Yes
## [721] Yes Yes Yes Yes Yes Yes Yes No Yes No Yes Yes Yes Yes Yes Yes Yes Yes
## [739] No No Yes No Yes No Yes No No No Yes Yes Yes No Yes Yes Yes Yes
## [757] Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes No No Yes Yes Yes Yes No Yes
## [775] Yes Yes Yes
## Levels: No Yes