library(readr)
HarrisburgSenators <- read_csv("HarrisburgSenators.csv")
## Error: 'HarrisburgSenators.csv' does not exist in current working directory ('/Users/xinyizhu/Dropbox').
View(HarrisburgSenators)
## Error in as.data.frame(x): object 'HarrisburgSenators' not found
When you examine the data frame you imported into R, you will notice that it is not in the appropriate format for a chi-square test. Convert the data frame into a table whose columns show the Win-Lose variable (W and L) and whose rows show a binary Age variable. To create the Age variable, take the median age and split the players into two groups at that value. Use the R categorical functions that you have learned in the course. Name the table you create ‘Senators_t’. Copy and paste below the code that created the table.
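As a rough illustration of the median split described above, the sketch below builds a 2 x 2 table from a small made-up data frame. The data frame `players`, its columns `Age` and `WL`, and the name `example_t` are assumptions for illustration only; the imported file may use different column names.

```r
# Hypothetical stand-in for the imported data; real column names may differ.
players <- data.frame(
  Age = c(22, 24, 25, 26, 27, 28, 29, 31),
  WL  = c("W", "L", "W", "W", "L", "L", "W", "L")
)

# Split at the median age: players at or below the median are "Younger".
med_age <- median(players$Age)
players$AgeGroup <- cut(
  players$Age,
  breaks = c(-Inf, med_age, Inf),
  labels = c("Younger", "Older")
)

# Rows are the age groups, columns are the win/loss outcome.
example_t <- table(Age = players$AgeGroup, Outcome = players$WL)
example_t

# Same table with row, column, and grand totals added.
addmargins(example_t)
```

Whether players exactly at the median belong in the lower or the upper group is a design choice; here `cut()` places them in the lower group because its intervals are closed on the right by default.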
Develop the appropriate null and alternative hypotheses for testing an association between these variables (Winning-Losing, Age group). State the null and alternative hypotheses in the context of this problem.
H0:
H1:
Run the chi-square test in R using the assocstats() function. Copy and paste your code and the output below. What is the value of the chi-square test statistic? What is the p-value related to the chi-square test statistic?
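A minimal sketch of this step on made-up counts (not the Harrisburg Senators data) might look like the following. It assumes the `vcd` package, which provides `assocstats()`, is installed; the base R call to `chisq.test()` without the continuity correction is included only as a cross-check on the Pearson statistic.

```r
# Illustrative counts only; these are not the assignment data.
example_t <- matrix(
  c(12, 8,
     5, 15),
  nrow = 2, byrow = TRUE,
  dimnames = list(Age = c("Younger", "Older"), Outcome = c("W", "L"))
)

# Base R cross-check: Pearson chi-square without the Yates correction.
base_fit <- chisq.test(example_t, correct = FALSE)
base_fit$statistic
base_fit$p.value

# assocstats() reports the likelihood-ratio and Pearson statistics,
# their degrees of freedom, and their p-values.
if (requireNamespace("vcd", quietly = TRUE)) {
  print(vcd::assocstats(example_t))
}
```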
Compute by hand the χ² test statistic that you obtained from the R output in part d. Show your computations, including those for the expected counts. Provide a final table that includes the observed counts with the expected counts in parentheses for each cell. The final table should have row and column margin totals and a grand total.
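The hand computation can be checked in R along the following lines. This is a sketch on made-up counts, with all object names chosen for illustration: each expected count is (row total × column total) / grand total, and the statistic sums (observed − expected)² / expected over the cells.

```r
# Illustrative counts only; these are not the assignment data.
observed <- matrix(
  c(12, 8,
     5, 15),
  nrow = 2, byrow = TRUE,
  dimnames = list(Age = c("Younger", "Older"), Outcome = c("W", "L"))
)

row_tot <- rowSums(observed)
col_tot <- colSums(observed)
n       <- sum(observed)

# Expected count for each cell: row total * column total / grand total.
expected <- outer(row_tot, col_tot) / n

# Pearson chi-square statistic: sum over cells of (O - E)^2 / E.
chi_sq <- sum((observed - expected)^2 / expected)
chi_sq

# Observed counts with expected counts in parentheses, plus margins.
cells <- matrix(
  sprintf("%.0f (%.1f)", observed, expected),
  nrow = 2, dimnames = dimnames(observed)
)
final <- rbind(cbind(cells, Total = row_tot), Total = c(col_tot, n))
noquote(final)
```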
Does the test statistic you found by hand in part e match the χ² test statistic from the assocstats() output in part d? State Yes or No.
Use the pchisq() function in R to find the p-value associated with the χ² test statistic. Copy and paste your code below. Does the p-value you found with the pchisq() function match the p-value from the assocstats() output in part d? State Yes or No.
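For reference, an upper-tail p-value from pchisq() can be obtained as sketched below. The statistic used here is an illustrative value, not the answer to this problem, and the degrees of freedom assume a 2 x 2 table.

```r
# Illustrative test statistic; substitute the value from your own table.
chi_sq <- 5.0128

# Degrees of freedom for an r x c table: (r - 1) * (c - 1).
df <- (2 - 1) * (2 - 1)

# The p-value is the area to the right of the statistic,
# so request the upper tail explicitly.
p_value <- pchisq(chi_sq, df = df, lower.tail = FALSE)
p_value
```

Using `lower.tail = FALSE` is equivalent to `1 - pchisq(chi_sq, df)` but avoids loss of precision for very small p-values.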
Using the p-value for this test, do you reject or fail to reject the null hypothesis?
State a conclusion back in terms of the context of the problem.
#code from text
fat <- matrix(c(6, 4, 2, 11), 2, 2)
dimnames(fat) <- list(diet = c("LoChol", "HiChol"),
                      disease = c("No", "Yes"))
fat
##         disease
## diet     No Yes
##   LoChol  6   2
##   HiChol  4  11
You want to perform another test of association between the variables diet and disease, and you decide not to use an odds ratio test.
State only the name of the appropriate statistical test; you do not have to perform the test.
You are going to use the Cochran-Mantel-Haenszel (CMH) test to test some associations among these variables.
Why is a CMH test appropriate for this type of data?
Run the CMH test in R. Copy and paste your code and output below.
Letters c-e ask questions specific to the R output line entitled ‘rmeans’.
c. State the null and alternative hypotheses for the line in the R output entitled ‘rmeans’. State the hypotheses in the context of the problem; do not use the generic statements from the course notes.
H0:
H1:
What is the decision of this statistical test? That is, do you reject or fail to reject the null hypothesis? Use a significance level of α = 0.05.
Using the line ‘rmeans’ in the R output, state a conclusion back in the context of the problem.
Letters f-h ask questions specific to the R output line entitled ‘cmeans’.
f. State the null and alternative hypotheses for the line in the R output entitled ‘cmeans’. State the hypotheses in the context of the problem; do not use the generic statements from the course notes.
H0:
H1:
What is the decision of this statistical test? That is, do you reject or fail to reject the null hypothesis? Use a significance level of α = 0.05.
Using the line ‘cmeans’ in the R output, state a conclusion back in the context of the problem.