Lab 2: Theory Based Inference for a Single Proportion & for a Single Mean

library(rmarkdown)
library(openintro)

## Loading required package: airports

## Loading required package: cherryblossom

## Loading required package: usdata

library(tidyverse)

## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──

## ✔ ggplot2 3.3.6      ✔ purrr   0.3.4 
## ✔ tibble  3.1.8      ✔ dplyr   1.0.10
## ✔ tidyr   1.2.0      ✔ stringr 1.4.1 
## ✔ readr   2.1.2      ✔ forcats 0.5.2 
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()

library(readr)

GROUP 1

Exercise 1:

Consider the scenario from Section 1.2, where we looked at the rock-paper-scissors game with a novice player who threw scissors in 4 of the 20 rounds. Will the theory-based approach (more formally called the one proportion z-test; normal approximation) work here? Explain.

This test would not work here. The normal approximation needs a sample large enough to contain at least 10 successes and 10 failures, but this sample has only 4 successes (scissors) and 16 failures. With a larger sample size, a theory-based approach could work.
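The condition described above can be checked directly in R. This is a minimal sketch, assuming the usual guideline of at least 10 successes and 10 failures; the variable names are illustrative and not part of the lab.

```r
# Rock-paper-scissors sample: scissors thrown in 4 of 20 rounds
n <- 20
successes <- 4
failures <- n - successes

# Assumed guideline: at least 10 successes and at least 10 failures
condition_met <- successes >= 10 && failures >= 10
condition_met
```

Here the check fails because the number of successes is below 10.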

Exercise 2:

Go to the One Proportion applet and run both the simulation and theory-based approaches for this rock-paper-scissors scenario. What do you notice?

Probability of success: .333 Sample size (n): 20 Number of samples: 10000

The graph is symmetrical with a mean of 6.748 and a standard deviation of 2.207
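For comparison, the applet's simulation can be approximated in R. The following is a sketch only: it assumes the null probability is 1/3, that the statistic is the count of scissors in 20 rounds, and that the alternative is "scissors are thrown less than one third of the time". The seed and variable names are arbitrary, and simulated values will differ slightly from the applet's.

```r
set.seed(138)  # arbitrary seed, used only so the run is reproducible
n <- 20
null_pi <- 1/3

# 10,000 simulated counts of scissors under the null hypothesis
sims <- rbinom(10000, size = n, prob = null_pi)
sim_mean <- mean(sims)
sim_sd <- sd(sims)

# Mean and SD that theory predicts for a binomial count
theory_mean <- n * null_pi
theory_sd <- sqrt(n * null_pi * (1 - null_pi))

# Probability of 4 or fewer scissors: simulated and exact
sim_p <- mean(sims <= 4)
exact_p <- pbinom(4, size = n, prob = null_pi)
```

Comparing `sim_mean` and `sim_sd` with `theory_mean` and `theory_sd` shows how closely the simulated distribution follows the theoretical one.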

GROUP 2

Exercise 3:

I am left-eye dominant.

As we saw in Lab 1, we can use the command read_table( ) to import the EyeDominance data from the following address “https://willamette.edu/~ijohnson/138/EyeDominance.txt”. This data contains responses to the question ‘Are you right-eye dominant?’ for a random sample of students at a large university.

EyeDominance <- read_table("https://willamette.edu/~ijohnson/138/EyeDominance.txt", col_types="c")

count(EyeDominance)

## # A tibble: 1 × 1
##       n
##   <int>
## 1   115

Exercise 4:

Write commands to view the head and tail of the data in the two R-chunks below.

Conventional wisdom says that more people are right-handed than left, so for now let’s have our research hypothesis be that people tend to be right-eye dominant more often than left-eye dominant. Our null hypothesis is that left and right eye dominance are equally prevalent; in other words, the proportion of right-eye dominance is 0.50. Our alternative hypothesis is that right-eye dominance is more prevalent than left, or that the proportion of right-eye dominance is greater than 0.50.

Using notation, H0: π = 0.50 and Ha: π > 0.50.

head(EyeDominance)

## # A tibble: 6 × 1
##   RightDominance
##   <chr>         
## 1 N             
## 2 Y             
## 3 N             
## 4 Y             
## 5 Y             
## 6 N

tail(EyeDominance)

## # A tibble: 6 × 1
##   RightDominance
##   <chr>         
## 1 Y             
## 2 N             
## 3 Y             
## 4 N             
## 5 Y             
## 6 Y

Exercise 5:

Create a table of values that counts the number of Y’s and N’s for the RightDominance variable. (Hint: refer to the commands from Lab 1.) Name your table table_eye.

# Pull out the response column and count the Y's and N's
RightDominance <- EyeDominance$RightDominance

table_eye <- table(RightDominance)

barplot(table_eye, main="Right Eye Dominance")
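The observed proportion can be read from a table like this with `prop.table()`. The sketch below uses a small made-up vector rather than the EyeDominance data so that it runs without downloading the file; the names `responses`, `toy_table`, `toy_props`, and `toy_obs_prop` are illustrative.

```r
# Hypothetical responses, for illustration only (not the lab data)
responses <- c("Y", "N", "Y", "Y", "N")

# Counts of each answer, then counts converted to proportions
toy_table <- table(responses)
toy_props <- prop.table(toy_table)

# Proportion of "Y" answers as a plain number
toy_obs_prop <- as.numeric(toy_props["Y"])
toy_obs_prop
```

Applying the same two calls to `table_eye` would give the sample proportion of right-eye dominant students.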

Exercise 6:

Record the observed proportion, sample size and the assumed proportion under the null hypothesis in R.

# Eye dominance: observed proportion, sample size, and null proportion
obs_prop <- 0.62
n <- 115
null_pi <- 0.50

Exercise 7:

Use R as a calculator to check the conditions for using the one proportion z-test. Describe what the numbers mean in the space below the R chunk.

115*0.62

## [1] 71.3

115 - 115*0.62

## [1] 43.7
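These two numbers are the approximate counts of "Y" and "N" responses in the sample; they are not whole numbers because 0.62 is a rounded proportion. The sketch below repeats the check with named values, assuming the guideline of at least 10 successes and 10 failures; `approx_yes`, `approx_no`, and `conditions_ok` are illustrative names.

```r
n <- 115
obs_prop <- 0.62   # rounded observed proportion used in the lab

approx_yes <- n * obs_prop        # approximate number of "Y" responses
approx_no  <- n * (1 - obs_prop)  # approximate number of "N" responses

# Assumed guideline: at least 10 observations in each category
conditions_ok <- approx_yes >= 10 && approx_no >= 10
conditions_ok
```

Both counts are well above 10, so under this guideline the one proportion z-test is appropriate.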

Exercise 8:

Calculate the standard deviation of our null distribution

SD <- sqrt(0.50*(1 - 0.50)/115)
SD

## [1] 0.04662524

Exercise 9:

Next calculate the standardized statistic z.

z_stat <- (0.62 - 0.50)/0.047
z_stat

## [1] 2.553191

Exercise 10:

Last, calculate the theory-based p-value and write out a conclusion that includes the research question, the p-value, and the significance of the p-value.

SSp_value <- pnorm(z_stat, lower.tail = FALSE)
SSp_value

## [1] 0.00533704
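Exercise 9 divides by the rounded value 0.047. As a cross-check, here is a sketch of the same calculation that carries the unrounded standard deviation through; `sd_null`, `z_full`, and `p_full` are illustrative names, and 0.62 is still the rounded observed proportion.

```r
obs_prop <- 0.62
null_pi <- 0.50
n <- 115

# Standard deviation of the null distribution, kept at full precision
sd_null <- sqrt(null_pi * (1 - null_pi) / n)

# Standardized statistic and one-sided (upper-tail) p-value
z_full <- (obs_prop - null_pi) / sd_null
p_full <- pnorm(z_full, lower.tail = FALSE)
z_full
p_full
```

The standardized statistic comes out slightly larger (about 2.57) and the p-value stays near 0.005, so the conclusion does not depend on the rounding.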

Conclusion for 10: A higher percentage of people are right-handed than left-handed. This study, using a random sample of 115 students at a large university, asks whether right-eye dominance is likewise more prevalent than left-eye dominance. With the null proportion set at 0.50, the observed proportion of 0.62 gives a p-value of about 0.005. Because the sample is random, it should be representative of the larger student body. This p-value provides very strong evidence against the null hypothesis: it is unlikely that random chance alone would have produced a sample proportion this far above 0.50.

GROUP 3

Exercise 11:

Although it is known that the white shark grows to a mean length of 21 feet, a marine biologist believes that the great white sharks off the Bermuda coast grow much longer due to unusual feeding habits. To test this claim, a number of full-grown great white sharks are captured off the Bermuda coast, measured, and then set free. For the 15 sharks that were caught, a sample mean of 22.1 feet and a sample standard deviation of 3.2 feet were found. A histogram of the sample data appeared to be approximately normal. Perform a test of significance using a significance level of 0.05.

t_stat <- (22.1-21)/(3.2/sqrt(15))
t_stat

## [1] 1.331338

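One-sided p-value

# Ha: mean > 21, so the p-value is the upper-tail area
PC1_p_value <- pt(t_stat, df=15-1, lower.tail=FALSE)
PC1_p_value

## [1] 0.1021756

With a p-value of about 0.102, which is greater than the significance level of 0.05, we fail to reject the null hypothesis. A sample mean of 22.1 feet is within the range expected from chance variation if the true mean is 21 feet, so the data do not provide convincing evidence that great white sharks off the Bermuda coast average longer than 21 feet.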
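The whole test can also be wrapped in a small function that works from summary statistics. This is a sketch, not part of the lab: `one_mean_t` is a hypothetical helper, and it assumes the alternative hypothesis is that the population mean is greater than 21 feet, so the p-value is taken from the upper tail.

```r
# Hypothetical helper: one-sample t-test from summary statistics,
# with the alternative "population mean is greater than mu0"
one_mean_t <- function(xbar, s, n, mu0) {
  t_stat <- (xbar - mu0) / (s / sqrt(n))
  df <- n - 1
  p_value <- pt(t_stat, df = df, lower.tail = FALSE)  # upper-tail area
  list(t_stat = t_stat, df = df, p_value = p_value)
}

shark <- one_mean_t(xbar = 22.1, s = 3.2, n = 15, mu0 = 21)
shark$t_stat
shark$p_value

# Critical value for a one-sided test at the 0.05 level
t_crit <- qt(0.95, df = shark$df)
shark$t_stat > t_crit  # FALSE means "fail to reject"
```

Comparing the t-statistic with the critical value gives the same decision as comparing the p-value with 0.05.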

Theo Balwit

2022-10-04