Assignment Introduction

This module is about the intersection between statistical computing and statistical inference.

All questions refer to the College data in the ISLR package, which comes from the 1995 issue of US News and World Report and covers 777 US colleges and universities.

install.packages("ISLR") # Only do this the first time you use the package
library(ISLR) # Adds package to working library
attach(College)
dim(College)  # Shows the number of rows and columns in the dataset
## [1] 777  18
help(College) # Shows the documentation associated with the dataset
## starting httpd help server ... done

Submit .rmd and .html (output after knitting an .rmd file) files to eCampus by 11:59pm on September 5, 2022.

Not sure how to use R Markdown yet? No problem! This is a very helpful resource.

Problem 1 (15 points)

In this question, we will compare different functions to learn best practices in R, judged by both run time and results. We will go through three different methods for storing and running bootstrap intervals. Compute a bootstrap confidence interval for the Apps variable in the College dataset (the number of applications received), using 1,000 bootstrap samples. Calculate the 80% confidence interval for the mean number of applications each school receives. Pick one of these three methods to use (Hint: Check out the Atlanta Commute Example):

Interpret the 80% confidence interval that you calculated.
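As a rough illustration of the bootstrap interval described in Problem 1, here is a minimal sketch of a percentile interval for a mean. The helper name `boot_mean_ci` and the simulated vector `apps_demo` are assumptions made for this example and are not part of the assignment; the sketch uses `replicate()`, which is only one of several ways to store the bootstrap means.

```r
# Hypothetical helper: percentile bootstrap confidence interval for the mean of x.
boot_mean_ci <- function(x, B = 1000, level = 0.80, seed = 1) {
  set.seed(seed)
  # Each replicate resamples length(x) values with replacement and records the mean.
  boot_means <- replicate(B, mean(sample(x, length(x), replace = TRUE)))
  alpha <- 1 - level
  # For an 80% interval, take the 10th and 90th percentiles of the bootstrap means.
  quantile(boot_means, probs = c(alpha / 2, 1 - alpha / 2))
}

# Stand-in data (simulated, NOT the real College data) so the sketch runs on its own.
set.seed(42)
apps_demo <- round(rexp(777, rate = 1 / 3000))

ci <- boot_mean_ci(apps_demo, B = 1000, level = 0.80, seed = 1)
ci
```

With the ISLR package loaded, the same call on the real data would be `boot_mean_ci(College$Apps)`. The percentile method is used here for simplicity; other interval types (for example, one based on the bootstrap standard error) are equally valid choices.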

Problem 2 (10 points)

Building on Problem 1, take your preferred method and calculate the intervals using 3 different random seeds (you used one previously, so you need two more) to see how sensitive the result is to the random number generator.

After this, you will have 3 intervals based on 1,000 bootstrap samples. Using the same three seeds and methodology, calculate the intervals again using 5,000 bootstrap samples. Report the resulting intervals and comment on the results.
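One possible way to organise the comparison in Problem 2 is to loop over every combination of seed and bootstrap size and collect the intervals in a single table. The sketch below assumes a percentile interval and uses arbitrary seeds (1, 2, 3); the helper `boot_mean_ci` and the simulated `apps_demo` vector are hypothetical names for this example only.

```r
# Hypothetical helper: percentile bootstrap confidence interval for the mean of x.
boot_mean_ci <- function(x, B, level = 0.80, seed) {
  set.seed(seed)
  boot_means <- replicate(B, mean(sample(x, length(x), replace = TRUE)))
  alpha <- 1 - level
  quantile(boot_means, probs = c(alpha / 2, 1 - alpha / 2))
}

# Stand-in data (simulated, NOT the real College data).
set.seed(42)
apps_demo <- round(rexp(777, rate = 1 / 3000))

# Every combination of three arbitrary seeds and the two bootstrap sizes.
grid <- expand.grid(seed = c(1, 2, 3), B = c(1000, 5000))

# One interval per row of the grid; mapply returns a 2 x 6 matrix, so transpose it.
ci <- t(mapply(function(s, b) boot_mean_ci(apps_demo, B = b, seed = s),
               grid$seed, grid$B))

results <- data.frame(grid,
                      lower = ci[, 1],
                      upper = ci[, 2],
                      width = ci[, 2] - ci[, 1])
results
```

Comparing the `lower` and `upper` columns across seeds within each value of `B` gives a direct view of how much the endpoints move with the random number generator. For the real data, replace `apps_demo` with `College$Apps`.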

Problem 3 (15 points)

In the same College dataset, we have the Private variable, which indicates which schools are private and which are public. Test the statement that public schools receive more applications than private schools, using randomization distributions. Pick a reasonable significance level and state your result. Comment on the data size. Are 777 data points enough to draw any conclusions? Defend your statement (you don't need to do any analysis, just give reasoning). Hint: Check the example on average SAT scores between men and women.

# Private is a factor with levels "No" and "Yes", so filter the data frame on those labels
private <- subset(College, Private == "Yes")
public <- subset(College, Private == "No")

dim(private)
## [1] 565  18
dim(public)
## [1] 212  18
College$Private
##   [1] Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes
##  [19] Yes No  Yes No  Yes No  Yes No  Yes No  Yes Yes Yes Yes Yes Yes Yes Yes
##  [37] Yes Yes Yes Yes Yes Yes Yes Yes Yes No  Yes Yes Yes Yes Yes Yes Yes Yes
##  [55] Yes Yes No  Yes Yes Yes Yes No  Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes
##  [73] Yes Yes Yes Yes Yes Yes No  No  Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes
##  [91] Yes Yes No  Yes Yes Yes Yes Yes Yes Yes Yes Yes No  No  No  Yes Yes Yes
## [109] Yes Yes Yes Yes No  Yes Yes Yes Yes Yes No  No  Yes Yes Yes Yes Yes No 
## [127] Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes No  Yes Yes No  Yes Yes
## [145] Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes No 
## [163] Yes Yes Yes No  Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes No  No  Yes Yes
## [181] No  No  Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes No  Yes Yes No  Yes No 
## [199] Yes Yes Yes No  Yes No  Yes Yes No  No  Yes Yes Yes Yes Yes Yes Yes Yes
## [217] Yes Yes No  Yes Yes Yes No  No  Yes Yes Yes Yes Yes Yes Yes Yes Yes No 
## [235] Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes
## [253] Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes No 
## [271] Yes Yes Yes No  No  Yes Yes No  Yes No  Yes No  Yes Yes Yes No  Yes Yes
## [289] No  Yes No  Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes No  Yes No 
## [307] Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes No  Yes Yes Yes No  No  Yes Yes
## [325] No  No  Yes Yes Yes Yes Yes Yes Yes No  Yes Yes Yes Yes Yes Yes No  Yes
## [343] Yes Yes Yes Yes Yes No  Yes Yes Yes Yes Yes Yes Yes No  Yes Yes Yes Yes
## [361] Yes Yes Yes No  Yes No  No  No  Yes No  Yes Yes Yes Yes Yes No  No  No 
## [379] Yes Yes Yes No  No  No  Yes No  Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes
## [397] Yes Yes Yes Yes Yes No  Yes Yes Yes No  No  Yes Yes Yes No  No  No  Yes
## [415] Yes No  Yes No  Yes No  No  No  Yes Yes Yes Yes Yes No  Yes Yes Yes Yes
## [433] No  Yes Yes Yes No  Yes Yes Yes Yes Yes Yes Yes No  No  Yes No  Yes Yes
## [451] Yes Yes Yes Yes Yes Yes Yes No  Yes Yes Yes No  Yes Yes Yes No  No  Yes
## [469] Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes No  No  No  No 
## [487] Yes Yes Yes No  Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes
## [505] Yes Yes Yes Yes No  Yes No  Yes Yes Yes Yes Yes Yes Yes Yes Yes No  Yes
## [523] Yes Yes Yes Yes Yes Yes Yes No  No  No  Yes No  Yes Yes No  No  Yes Yes
## [541] Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes No  Yes Yes Yes Yes Yes Yes Yes
## [559] No  Yes No  No  No  No  No  No  No  No  No  No  No  No  No  No  Yes Yes
## [577] Yes Yes Yes Yes Yes No  No  Yes Yes No  Yes No  Yes Yes Yes No  Yes Yes
## [595] Yes Yes Yes Yes Yes Yes Yes Yes No  No  No  No  No  No  Yes Yes No  No 
## [613] Yes Yes Yes Yes Yes Yes Yes No  No  Yes No  No  No  Yes No  Yes No  No 
## [631] No  No  No  No  No  No  Yes No  No  No  No  No  No  No  No  Yes No  No 
## [649] Yes No  No  No  No  No  No  No  No  No  No  No  Yes No  No  Yes No  Yes
## [667] Yes No  Yes Yes Yes Yes No  Yes No  No  No  Yes No  No  No  Yes Yes No 
## [685] No  No  No  Yes Yes Yes Yes No  No  No  No  No  No  No  No  No  No  No 
## [703] No  Yes Yes Yes No  Yes Yes Yes Yes No  No  No  Yes Yes Yes Yes Yes Yes
## [721] Yes Yes Yes Yes Yes Yes Yes No  Yes No  Yes Yes Yes Yes Yes Yes Yes Yes
## [739] No  No  Yes No  Yes No  Yes No  No  No  Yes Yes Yes No  Yes Yes Yes Yes
## [757] Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes No  No  Yes Yes Yes Yes No  Yes
## [775] Yes Yes Yes
## Levels: No Yes
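
The randomization test asked for in Problem 3 can be sketched as follows, under the assumption that the test statistic is the difference in mean applications (public minus private) and that the alternative is one-sided. The function name `perm_test_diff` and the simulated data are hypothetical and only stand in for the real College data; the group labels follow the "No"/"Yes" levels printed above.

```r
# Hypothetical helper: one-sided randomization test for
# H0: no difference in means vs. H1: mean(public) > mean(private).
perm_test_diff <- function(y, group, B = 5000, seed = 1) {
  set.seed(seed)
  # Observed statistic: public ("No") mean minus private ("Yes") mean.
  obs <- mean(y[group == "No"]) - mean(y[group == "Yes"])
  # Shuffle the group labels B times and recompute the statistic each time.
  perm <- replicate(B, {
    g <- sample(group)
    mean(y[g == "No"]) - mean(y[g == "Yes"])
  })
  # Share of shuffled statistics at least as large as the observed one;
  # the +1 terms count the observed arrangement itself.
  p_value <- (sum(perm >= obs) + 1) / (B + 1)
  list(observed = obs, p_value = p_value)
}

# Stand-in data (simulated, NOT the real College data): 777 schools in two groups,
# with the "No" group given a larger mean on purpose.
set.seed(42)
group_demo <- factor(rep(c("Yes", "No"), times = c(565, 212)))
apps_demo <- round(rexp(777, rate = ifelse(group_demo == "No", 1 / 5000, 1 / 2000)))

result <- perm_test_diff(apps_demo, group_demo, B = 2000)
result
```

For the real data, the call would be `perm_test_diff(College$Apps, College$Private)`, and the resulting p-value would be compared with the chosen significance level. A difference in medians could be substituted for the difference in means if the skew in Apps is a concern.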