STAT 360 Lab 9: Sampling Distributions (Part 1)

Name: Rebecca Lewis

Load R libraries

library(rmarkdown)
library(knitr)

Set the Seed

set.seed(33)

Exercise 1:

Use rbinom() to take a random sample of 10 people from a population where 13% of individuals have the flu. Calculate the corresponding sample proportion:

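sample<-rbinom(1,10,.13)
sample
## [1] 1
# rbinom() returns the number of flu cases, so divide by n = 10 for the sample proportion
sample/10
## [1] 0.1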

Exercise 2:

Use rbinom() to take 10,000 random samples of 10 people each from a population where 13% of individuals have the flu. Calculate the corresponding sample statistics and assign them to the object props10:

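# rbinom() returns the number of flu cases in each sample of 10;
# props10/10 gives the sample proportions and leaves the histogram shape unchanged
props10<-rbinom(10000,10,.13)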

Exercise 3:

Generate a histogram of the 10,000 sample statistics and describe the resulting distribution:

hist(props10)

The distribution is right skewed and unimodal with a couple of high-value outliers. 
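
The right skew can also be checked against the exact binomial probabilities instead of the simulated draws. The block below is only a sketch for illustration; the object names `k` and `probs10` are made up for this example and are not part of the lab.

```r
# Exact probabilities for the number of flu cases in a sample of 10 (p = 0.13)
k <- 0:10
probs10 <- dbinom(k, size = 10, prob = 0.13)

# Pair each possible sample proportion with its probability
round(data.frame(proportion = k / 10, probability = probs10), 4)

# More than half of the probability sits at 0 or 1 cases,
# with a long thin tail toward larger values
sum(probs10[k <= 1])
```

Because no random numbers are drawn, running this does not change the results produced by set.seed(33) elsewhere in the lab.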

Exercise 4:

Use rbinom() to take 10,000 random samples of 50, 100, and 500 people each from a population where 13% of individuals have the flu. Calculate the corresponding sample statistics and assign them to the objects props50, props100, and props500. Generate a histogram for each set of 10,000 sample statistics:

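# rbinom() returns the number of flu cases per sample; dividing each object by its
# sample size (50, 100, 500) gives the sample proportions with the same histogram shape
props50<-rbinom(10000,50,.13)
props100<-rbinom(10000,100,.13)
props500<-rbinom(10000,500,.13)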
hist(props50)

hist(props100)

hist(props500)

Exercise 5:

Describe what happens to the shape of the sampling distribution as the sample size increases:

As the sample size increases, the right skew of the sampling distribution fades and the histograms become more symmetric and bell-shaped, closer to a normal distribution.
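
The fading skew can be put in numbers with the standard skewness formula for a binomial count, (1 - 2p) / sqrt(np(1 - p)). The sketch below assumes that formula; the names `p_flu`, `sizes`, and `skew` are made up for this example.

```r
# Skewness of a binomial count. Dividing by n does not change skewness,
# so the same values apply to the sample proportion.
p_flu <- 0.13
sizes <- c(10, 50, 100, 500)
skew <- (1 - 2 * p_flu) / sqrt(sizes * p_flu * (1 - p_flu))

# Label each skewness value with its sample size
round(setNames(skew, sizes), 3)
```

The values shrink toward 0 as n grows, which matches the histograms losing their right tail.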

Exercise 6:

Describe how this pattern illustrates the Central Limit Theorem:

The Central Limit Theorem states that as the sample size increases, the sampling distribution of the sample proportion approaches a normal distribution, regardless of the shape of the population distribution. The histograms show this pattern: the distribution is clearly right skewed at n = 10 and looks increasingly normal at n = 50, 100, and 500.
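
One way to see how close the sampling distribution gets to a normal curve is to compare the exact binomial CDF with a normal CDF that has the same mean and standard deviation. This is only a sketch: it uses a continuity correction of 0.5, and the names `p_flu`, `sizes`, and `max_gap` are made up for this example.

```r
p_flu <- 0.13
sizes <- c(10, 50, 100, 500)

# Largest gap between the exact binomial CDF and the matching normal CDF
# (with a 0.5 continuity correction) for each sample size
max_gap <- sapply(sizes, function(n) {
  k <- 0:n
  exact  <- pbinom(k, size = n, prob = p_flu)
  normal <- pnorm(k + 0.5, mean = n * p_flu,
                  sd = sqrt(n * p_flu * (1 - p_flu)))
  max(abs(exact - normal))
})

round(setNames(max_gap, sizes), 4)
```

A smaller gap means the normal curve is a better stand-in for the true sampling distribution at that sample size.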

Exercise 7:

Calculate the standard deviation of the empirical sampling distribution for a sample size of 100:

# props100 holds counts of flu cases, so divide by n = 100 to work on the proportion scale
sd(props100/100)
## [1] 0.03395121

Exercise 8:

How does your answer to Exercise 7 compare to the theoretical standard error of the sampling distribution for a sample size of 100?

p<-.13
q<-1-p
n<-100
theosd<-sqrt(p*q/n)
theosd
## [1] 0.03363034
The theoretical standard error is 0.03363, or about 3.36 percentage points. This is very similar to the standard deviation of the empirical sampling distribution for a sample size of 100, which was 0.03395.
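
The standard error formula sqrt(p(1 - p)/n) can be wrapped in a small function to see how it changes with sample size. `se_prop` is a hypothetical helper name used only for this sketch.

```r
# Hypothetical helper: theoretical standard error of a sample proportion
se_prop <- function(p, n) sqrt(p * (1 - p) / n)

# Standard errors for the four sample sizes used in this lab
se_prop(0.13, c(10, 50, 100, 500))
```

The standard error falls as n increases, so larger samples give sample proportions that cluster more tightly around 0.13.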

Exercise 9:

Calculate the minimum sample size you would need to take in order to properly construct a one-sample z-interval:

10/.13
## [1] 76.92308
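A one-sample z-interval requires at least 10 expected successes and 10 expected failures (np >= 10 and n(1 - p) >= 10). With p = 0.13, the success condition is the harder one to meet: 10/0.13 = 76.92, so we need a sample of at least 77 people. A sample of 77 also has about 67 expected people without the flu, so the failure condition is met as well.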
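
The same check can be written as a function so it works for any population proportion. `min_n` is a hypothetical helper name, and the threshold of 10 is the rule of thumb used in this exercise.

```r
# Hypothetical helper: smallest n with at least min_count expected successes
# and at least min_count expected failures
min_n <- function(p, min_count = 10) ceiling(min_count / min(p, 1 - p))

min_n(0.13)
```

Using min(p, 1 - p) picks whichever outcome is rarer, since that outcome determines how large the sample must be.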

Exercise 10:

Use rbinom() to take a single random sample of the minimum sample size calculated in Exercise 9 from a population where 13% of individuals have the flu. Calculate the corresponding sample statistic, assign it to the object p.hat, and display the value:

# rbinom() returns the number of flu cases, so divide by n = 77 to get the sample proportion
p.hat<-rbinom(1,77,.13)/77
p.hat
## [1] 0.1038961

Exercise 11:

Use p.hat to construct a 95% Confidence Interval for the proportion of the entire population that has the flu:

n<-77
# estimate the standard error from p.hat, since the population proportion is unknown in practice
se.hat<-sqrt(p.hat*(1-p.hat)/n)
Confupper<-p.hat+1.96*se.hat
Conflower<-p.hat-1.96*se.hat
Conflower
## [1] 0.03574239
Confupper
## [1] 0.1720498

Exercise 12:

Does your Confidence Interval capture the true population parameter of 13%?

Yes. The interval runs from about 3.6% to 17.2%, which contains the true population parameter of 13%.
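
A one-sample z-interval works on the proportion scale, so a count returned by rbinom() has to be divided by the sample size before the interval is built. The sketch below shows one way to do this, starting from the 8 flu cases observed in the sample of 77. `prop_ci` is a hypothetical helper name, and it uses qnorm() for the critical value instead of the rounded 1.96.

```r
# Hypothetical helper: one-sample z-interval from a count x out of n
prop_ci <- function(x, n, conf = 0.95) {
  p_hat <- x / n                        # convert the count to a proportion
  z <- qnorm(1 - (1 - conf) / 2)        # about 1.96 for 95% confidence
  se <- sqrt(p_hat * (1 - p_hat) / n)   # estimated standard error
  c(lower = p_hat - z * se, upper = p_hat + z * se)
}

ci <- prop_ci(x = 8, n = 77)
ci

# TRUE when the interval contains the true proportion of 0.13
unname(ci["lower"] <= 0.13 & 0.13 <= ci["upper"])
```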