library(rmarkdown)
library(openintro)
## Loading required package: airports
## Loading required package: cherryblossom
## Loading required package: usdata
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
## ✔ ggplot2 3.3.6 ✔ purrr 0.3.4
## ✔ tibble 3.1.8 ✔ dplyr 1.0.10
## ✔ tidyr 1.2.0 ✔ stringr 1.4.1
## ✔ readr 2.1.2 ✔ forcats 0.5.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
library(readr)
GROUP 1
Exercise 1:
Consider the scenario from Section 1.2, where we looked at the rock-paper-scissors game with a novice player who threw scissors in 4 of the 20 rounds. Will the theory-based approach (more formally called the one proportion z-test; normal approximation) work here? Explain.
This test would not work here. The theory-based approach requires a sample with at least 10 successes and 10 failures, but this sample has only 4 successes (scissors) and 16 failures. With a larger sample that met this condition, a theory-based approach could work.
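The validity condition can also be checked directly in R. This is a minimal sketch; the variable names (`successes`, `failures`, `conditions_met`) are chosen here for illustration and do not come from the lab.

```r
# Validity check for the one proportion z-test:
# the sample needs at least 10 successes and at least 10 failures.
n <- 20                       # rounds played
successes <- 4                # rounds in which scissors was thrown
failures <- n - successes     # rounds in which scissors was not thrown

conditions_met <- successes >= 10 && failures >= 10
conditions_met
```

Because `successes` is below 10, `conditions_met` is `FALSE`, which matches the answer above.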
Exercise 2:
Go to the One Proportion applet and run both the simulation and theory-based approaches for this rock-paper-scissors scenario. What do you notice?
Probability of success: .333 Sample size (n): 20 Number of samples: 10000
The simulated null distribution is roughly symmetrical, with a mean of 6.748 and a standard deviation of 2.207.
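A comparable simulation can be sketched in R instead of the applet. The seed is arbitrary, the object names are made up for this example, and the direction of the alternative (scissors thrown less than one third of the time) is an assumption about the Section 1.2 scenario. Simulated values will differ slightly from run to run and from the applet.

```r
set.seed(138)  # arbitrary seed so the simulation is reproducible

n <- 20
null_pi <- 1/3

# Simulate 10000 samples of 20 rounds under the null hypothesis
# and record the number of scissors throws in each sample.
sims <- rbinom(10000, size = n, prob = null_pi)

sim_mean <- mean(sims)
sim_sd <- sd(sims)

# Values predicted by theory for the count of successes
theory_mean <- n * null_pi
theory_sd <- sqrt(n * null_pi * (1 - null_pi))

# Simulation-based p-value: proportion of samples with 4 or fewer scissors
sim_p <- mean(sims <= 4)

# Theory-based p-value from the normal approximation on the proportion scale
normal_p <- pnorm(4 / n, mean = null_pi,
                  sd = sqrt(null_pi * (1 - null_pi) / n))

c(sim_mean = sim_mean, sim_sd = sim_sd,
  theory_mean = theory_mean, theory_sd = theory_sd,
  sim_p = sim_p, normal_p = normal_p)
```

Comparing `sim_p` with `normal_p` shows how far the normal approximation drifts from the simulation when the sample has fewer than 10 successes.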
GROUP 2
Exercise 3:
I am left-eye dominant.
As we saw in Lab 1, we can use the command read_table( ) to import the EyeDominance data from the following address “https://willamette.edu/~ijohnson/138/EyeDominance.txt”. This data contains responses to the question ‘Are you right-eye dominant?’ for a random sample of students at a large university.
EyeDominance <- read_table("https://willamette.edu/~ijohnson/138/EyeDominance.txt", col_types="c")
count(EyeDominance)
## # A tibble: 1 × 1
## n
## <int>
## 1 115
Exercise 4:
Write commands to view the head and tail of the data in the two R-chunks below.
Conventional wisdom says that more people are right-handed than left, so for now let’s have our research hypothesis be that people tend more often to be right-eye dominant. Our null hypothesis is that left and right eye dominance are equally prevalent; in other words, the proportion of right-eye dominance is 0.50. Our alternative hypothesis is that right-eye dominance is more prevalent than left, or that the proportion of right-eye dominance is greater than 0.50.
Using notation, H0:π=0.50 and Ha:π>0.50.
head(EyeDominance)
## # A tibble: 6 × 1
## RightDominance
## <chr>
## 1 N
## 2 Y
## 3 N
## 4 Y
## 5 Y
## 6 N
tail(EyeDominance)
## # A tibble: 6 × 1
## RightDominance
## <chr>
## 1 Y
## 2 N
## 3 Y
## 4 N
## 5 Y
## 6 Y
Exercise 5:
Create a table of values that counts the number of Y’s and N’s for the RightDominance variable. (Hint: refer to the commands from Lab 1.) Name your table table_eye.
RightDominance <- EyeDominance$RightDominance
table_eye <- table(EyeDominance)
barplot(table_eye, main="Right Eye Dominance")
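The observed proportion can be read off a table like this one. The sketch below uses a hypothetical stand-in vector, `responses`, with 71 Y's and 44 N's; those counts are an assumption chosen to be close to the 0.62 used later in this lab, and the real counts come from the EyeDominance data.

```r
# Hypothetical stand-in for EyeDominance$RightDominance (assumed counts)
responses <- c(rep("Y", 71), rep("N", 44))

# Count the Y's and N's, then convert the counts to proportions
table_demo <- table(responses)
prop_demo <- prop.table(table_demo)

# Observed proportion of right-eye dominant responses
obs_prop_demo <- mean(responses == "Y")

table_demo
prop_demo
obs_prop_demo
```

With the real data, `prop.table(table_eye)` would give the same kind of summary.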
Exercise 6:
Record the observed proportion, sample size and the assumed proportion under the null hypothesis in R.
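# Example: eye dominance
obs_prop <- 0.62  # observed proportion of right-eye dominant responses
n <- 115          # sample size
null_pi <- 0.50   # proportion assumed under the null hypothesis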
Exercise 7:
Use R as a calculator to check the conditions for using the one proportion z-test. Describe what the numbers mean in the space below the R chunk.
115*0.62
## [1] 71.3
115 - 115*0.62
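## [1] 43.7
These are the observed counts in the sample: about 71 right-eye dominant responses (successes) and about 44 left-eye dominant responses (failures). Both are at least 10, so the validity condition for the one proportion z-test is met.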
Exercise 8:
Calculate the standard deviation of our null distribution.
SD <- sqrt(0.50*(1 - 0.50)/115)
SD
## [1] 0.04662524
Exercise 9:
Next calculate the standardized statistic z.
z_stat <- (0.62 - 0.50)/0.047
z_stat
## [1] 2.553191
Exercise 10:
Last, calculate the theory-based p-value and write out a conclusion that includes the research question, the p-value, and the significance of the p-value.
SSp_value <- pnorm(z_stat, lower.tail = FALSE)
SSp_value
## [1] 0.00533704
Conclusion for 10: A higher percentage of people are right-handed than left-handed. This study, using a random sample of 115 students at a large university, asks whether right-eye dominance is similarly more common than left-eye dominance. With the null hypothesis value set at 0.50, the observed proportion of 0.62 has a p-value of about 0.005. This provides very strong evidence against the null hypothesis: it is unlikely that random chance alone would have produced a sample proportion this far above 0.50. Because the sample is random, it is representative of the larger student body, so the data support the claim that more than half of the students at this university are right-eye dominant.
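The steps from Exercises 8 through 10 can be collected into one small helper. This is a sketch only: `one_prop_z` is a name invented for this example, not a function from any package, and it tests the one-sided alternative that the proportion is greater than the null value. It uses the unrounded standard deviation, so its z statistic and p-value differ slightly from the values above, which were computed with the rounded value 0.047.

```r
# Hypothetical helper: one proportion z-test for Ha: pi > null_pi
one_prop_z <- function(obs_prop, null_pi, n) {
  # Standard deviation of the null distribution of sample proportions
  sd_null <- sqrt(null_pi * (1 - null_pi) / n)
  # Standardized statistic
  z <- (obs_prop - null_pi) / sd_null
  # Upper-tail area under the standard normal curve
  p_value <- pnorm(z, lower.tail = FALSE)
  list(sd_null = sd_null, z = z, p_value = p_value)
}

eye <- one_prop_z(obs_prop = 0.62, null_pi = 0.50, n = 115)
eye
```

The conclusion is unchanged: the p-value stays well below 0.01 whether or not the standard deviation is rounded.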
GROUP 3
Exercise 11:
Although it is known that the white shark grows to a mean length of 21 feet, a marine biologist believes that the great white sharks off the Bermuda coast grow much longer due to unusual feeding habits. To test this claim, a number of full-grown great white sharks are captured off the Bermuda coast, measured and then set free. For the 15 sharks that were caught, a sample mean of 22.1 feet and sample standard deviation of 3.2 feet was found. A histogram of the sample data appeared to be approximately normal. Perform a test of significance using a significance level of 0.05.
t_stat <- (22.1-21)/(3.2/sqrt(15))
t_stat
## [1] 1.331338
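One-sided p-value (upper tail, because the alternative hypothesis is that the mean length is greater than 21 feet)
PC1_p_value <- pt(t_stat, df=15-1, lower.tail=FALSE)
PC1_p_value
## [1] 0.1021756
The p-value of about 0.102 is greater than the 0.05 significance level, so we fail to reject the null hypothesis. A sample mean of 22.1 feet is within the range that random chance could plausibly produce if the true mean length were 21 feet, so this sample does not provide convincing evidence that great white sharks off the Bermuda coast average longer than 21 feet.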
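The same test can be written as a small function that works from summary statistics. This is a sketch under stated assumptions: `t_test_from_summary` is a hypothetical name, and it handles only the one-sided alternative that the mean is greater than the null value, so the p-value is the area above the t statistic.

```r
# Hypothetical helper: one-sample t-test from summary statistics, Ha: mu > mu0
t_test_from_summary <- function(xbar, mu0, s, n) {
  se <- s / sqrt(n)                  # standard error of the sample mean
  t_stat <- (xbar - mu0) / se        # standardized statistic
  df <- n - 1                        # degrees of freedom
  p_value <- pt(t_stat, df = df, lower.tail = FALSE)  # upper-tail area
  list(t = t_stat, df = df, p_value = p_value)
}

shark <- t_test_from_summary(xbar = 22.1, mu0 = 21, s = 3.2, n = 15)

# Critical value for a one-sided test at the 0.05 significance level
crit <- qt(0.95, df = shark$df)
reject_null <- shark$t > crit

shark
reject_null
```

Comparing the t statistic with the critical value and comparing the p-value with 0.05 are two routes to the same decision.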