Jessica McPhaul - 6372 _Unit_10_Homework
Consider the vitamin C study previously described. Use the workflow discussed in class to write a final report that includes what the study design is, the report and conclusion of a test, and report and conclusion of a confidence interval.
VitC<-matrix(c(335,76,302,105),2,2,byrow=T)
dimnames(VitC)<-list(Exposure=c("Placebo","Vit C"),"Arrested"=c("Cold","No Cold"))
VitC
## Arrested
## Exposure Cold No Cold
## Placebo 335 76
## Vit C 302 105
STUDY DESIGN: Participants were given either a placebo or Vitamin C, and researchers recorded whether each participant developed a cold. Because the researchers assigned the treatment, this is an experiment rather than an observational study. The goal was to assess whether taking Vitamin C reduced the incidence of the common cold compared with a placebo. The results were compiled into the following table:
| Exposure | Cold | No Cold |
|---|---|---|
| Placebo | 335 | 76 |
| Vit C | 302 | 105 |
Analysis and Results: Test for Association (Fisher’s Exact Test):
# Create the matrix
VitC <- matrix(c(335, 76, 302, 105), 2, 2, byrow = TRUE)
dimnames(VitC) <- list(Exposure = c("Placebo", "Vit C"), "Arrested" = c("Cold", "No Cold"))
# Perform Fisher's Exact Test
fisher_result <- fisher.test(VitC)
# Print the results
fisher_result
##
## Fisher's Exact Test for Count Data
##
## data: VitC
## p-value = 0.01444
## alternative hypothesis: true odds ratio is not equal to 1
## 95 percent confidence interval:
## 1.083492 2.172490
## sample estimates:
## odds ratio
## 1.531722
Reflecting on the results from Fisher’s Exact Test on the Vitamin C study data I analyzed:
P-value: The test yielded a p-value of 0.01444. This value is below the commonly used significance threshold of 0.05, so the difference in cold incidence between the Vitamin C and placebo groups is statistically significant. In simpler terms, if there were truly no association between treatment and colds, a difference at least as extreme as the one observed would arise by chance only about 1.4% of the time.
Confidence Interval for the Odds Ratio: The 95% confidence interval for the odds ratio ranges from 1.083492 to 2.172490. This interval does not include 1, which is consistent with the significant test result. With the table arranged as above, the odds ratio compares the odds of a cold in the placebo group with the odds of a cold in the Vitamin C group. Equivalently, the odds of not catching a cold are estimated to be between about 1.08 and 2.17 times as high for those taking Vitamin C as for those taking a placebo. This is a statement about odds, not about how many "times more likely" an outcome is.
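As a rough cross-check on the interval above, the sample odds ratio and a Wald-type interval can be computed by hand on the log-odds scale. This is only a sketch: the names `or_hat`, `se_log_or`, and `wald_ci` are illustrative, and the Wald interval is a large-sample approximation, so it should land close to, but not exactly on, the conditional estimate and exact interval reported by `fisher.test()`.

```r
# Rebuild the 2x2 table so this block runs on its own
VitC <- matrix(c(335, 76, 302, 105), 2, 2, byrow = TRUE,
               dimnames = list(Exposure = c("Placebo", "Vit C"),
                               Arrested = c("Cold", "No Cold")))

# Sample (cross-product) odds ratio: odds of a cold, placebo vs. Vitamin C
or_hat <- (VitC[1, 1] * VitC[2, 2]) / (VitC[1, 2] * VitC[2, 1])

# Large-sample standard error of log(OR): sqrt of the sum of reciprocal counts
se_log_or <- sqrt(sum(1 / VitC))

# 95% Wald interval, built on the log scale and back-transformed
wald_ci <- exp(log(or_hat) + c(-1, 1) * qnorm(0.975) * se_log_or)

or_hat
wald_ci
```

The point estimate here is the simple cross-product ratio, whereas `fisher.test()` reports a conditional maximum likelihood estimate, which is why the two differ slightly in the later decimal places.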
Interpretation: Based on the p-value and the confidence interval, my analysis suggests that Vitamin C intake is associated with a lower incidence of colds. Participants who took Vitamin C had significantly lower odds of catching a cold than those who took a placebo.
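To put the "lower incidence" statement on the probability scale, the observed cold rates in each group can be compared directly. The sketch below is an addition to the analysis, not part of the original workflow: it uses base R's `prop.test()` for a large-sample interval on the difference in proportions, and the names `colds`, `totals`, `cold_rate`, and `diff_test` are illustrative.

```r
# Number of colds and group sizes, taken from the 2x2 table
colds  <- c(Placebo = 335, `Vit C` = 302)
totals <- c(Placebo = 335 + 76, `Vit C` = 302 + 105)

# Observed proportion of participants who caught a cold in each group
cold_rate <- colds / totals
cold_rate

# Large-sample test and 95% interval for the difference (placebo minus Vit C)
diff_test <- prop.test(x = colds, n = totals)
diff_test$conf.int
```

A difference in proportions is often easier to explain to a general audience than an odds ratio, although the two summaries answer slightly different questions.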
It’s important for me to note in my report that, although the results suggest a beneficial effect of Vitamin C on preventing colds, the study design—whether it was randomized or not—plays a crucial role in interpreting these findings. If it wasn’t a randomized controlled trial, other unmeasured factors might influence the incidence of colds, and the observed association might not purely be due to the effect of Vitamin C.
Therefore, while the statistical analysis indicates a significant association, I should conclude cautiously, highlighting the need for further research, preferably through randomized controlled trials, to confirm these findings and establish a causal relationship between Vitamin C intake and cold prevention.
Researchers took random samples of 534 women who had breast cancer and 1044 women who did not have cancer. They reached out individually and asked the question, “Do you have fewer than four drinks per week?” of which they could answer yes or no. The table of the results follow.
The goal of the study was to assess the risk of drinking on breast cancer. Use the workflow discussed in class to write a final report that includes what the study design is, the test being used, the conclusion of the test, and interpretation of a confidence interval.
Drinks<-matrix(c(330,658,204,386),2,2,byrow=T)
dimnames(Drinks)<-list(Drinks=c("Fewer than 4","4 or more"),"Status"=c("Cancer","Control"))
Drinks
## Status
## Drinks Cancer Control
## Fewer than 4 330 658
## 4 or more 204 386
Study Design: This is an observational, retrospective (case-control) study: women were sampled separately according to their health status (breast cancer or no cancer) and then asked about their drinking habits. It was designed to explore the relationship between alcohol consumption (namely, having fewer than four drinks per week) and the occurrence of breast cancer.
Data Description: The subjects comprised two groups: 534 women with a breast cancer diagnosis (cases) and 1044 women without cancer, serving as the control group. Each participant was questioned individually about her alcohol consumption, specifically whether she had fewer than four drinks per week, giving a binary response of ‘yes’ or ‘no.’
Data Collection: The responses were organized into a 2x2 contingency table based on alcohol consumption frequency and cancer status, as follows:
| Drinks per week | Cancer | Control |
|---|---|---|
| < 4 drinks | 330 | 658 |
| ≥ 4 drinks | 204 | 386 |
Statistical Method: A Chi-square test for independence was used to analyze the association between reported alcohol consumption and breast cancer status. This test is appropriate for determining whether there is a significant association between two categorical variables.
library(MASS)
# Create the matrix of observed frequencies
# (rows = drinking level, columns = cancer status; filled row by row)
Drinks <- matrix(c(330, 658, 204, 386), nrow = 2, byrow = TRUE)
dimnames(Drinks) <- list(Drinks = c("Fewer than 4", "4 or more"),
Status = c("Cancer", "Control"))
# View the table
Drinks
## Status
## Drinks Cancer Control
## Fewer than 4 330 658
## 4 or more 204 386
# Perform the Chi-square test of independence
chi_test_result <- chisq.test(Drinks)
# Print the test result
chi_test_result
##
## Pearson's Chi-squared test with Yates' continuity correction
##
## data: Drinks
## X-squared = 0.1785, df = 1, p-value = 0.6727
# Display the p-value and the expected frequencies
chi_test_result$p.value
## [1] 0.6726679
chi_test_result$expected
## Status
## Drinks Cancer Control
## Fewer than 4 334.3422 653.6578
## 4 or more 199.6578 390.3422
Analysis and Results: Test for Association (Chi-Square Test): To analyze the data, I used the Chi-square test to see whether drinking habits were associated with cancer status. The test (with Yates’ continuity correction) gave a statistic of about 0.178 on 1 degree of freedom and a p-value of about 0.673. Because this p-value is far above 0.05, there is no statistically significant evidence of an association between the frequency of alcohol consumption (fewer than four drinks per week versus four or more) and breast cancer status in this study sample. The expected counts under no association (about 334 women with cancer drinking fewer than four drinks a week, and about 200 women with cancer drinking four or more) were close to the observed counts of 330 and 204.
Expected Frequencies: The expected frequencies, calculated under the assumption of no association between alcohol consumption frequency and breast cancer status, are:
For “Fewer than 4 drinks per week”: 334.34 for cancer and 653.66 for control.
For “4 or more drinks per week”: 199.66 for cancer and 390.34 for control.
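The expected counts and the test statistic can also be reproduced by hand, which makes clear where they come from: each expected count is (row total × column total) / grand total. The sketch below is illustrative only; the names `expected_by_hand`, `yates_stat`, and `p_by_hand` are not part of the original analysis, and the 0.5 adjustment is the Yates continuity correction that `chisq.test()` applies by default to 2x2 tables.

```r
# Rebuild the table so this block runs on its own
Drinks <- matrix(c(330, 658, 204, 386), nrow = 2, byrow = TRUE,
                 dimnames = list(Drinks = c("Fewer than 4", "4 or more"),
                                 Status = c("Cancer", "Control")))

# Expected counts under independence: row total * column total / grand total
expected_by_hand <- outer(rowSums(Drinks), colSums(Drinks)) / sum(Drinks)
expected_by_hand

# Chi-square statistic with Yates' continuity correction
yates_stat <- sum((abs(Drinks - expected_by_hand) - 0.5)^2 / expected_by_hand)

# Upper-tail p-value on 1 degree of freedom
p_by_hand <- pchisq(yates_stat, df = 1, lower.tail = FALSE)

yates_stat
p_by_hand
```

All four expected counts are far above 5, which is the usual rule of thumb for the Chi-square approximation to be reasonable.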
Confidence Interval for Odds Ratio: For this I ran Fisher’s Exact Test. The results:
# Perform Fisher's Exact Test for Count Data
fisher_result <- fisher.test(Drinks)
# Print the results
fisher_result
##
## Fisher's Exact Test for Count Data
##
## data: Drinks
## p-value = 0.6601
## alternative hypothesis: true odds ratio is not equal to 1
## 95 percent confidence interval:
## 0.7611771 1.1842188
## sample estimates:
## odds ratio
## 0.9489964
# Extract and print just the confidence interval for the odds ratio
fisher_result$conf.int
## [1] 0.7611771 1.1842188
## attr(,"conf.level")
## [1] 0.95
Final Report: After running Fisher’s Exact Test on my data, I found that the p-value is approximately 0.6601, which again indicates that there is no statistically significant association between alcohol consumption (fewer than four versus four or more drinks per week) and breast cancer status, since the p-value is much higher than the conventional alpha level of 0.05.
The 95 percent confidence interval for the odds ratio ranges from about 0.761 to 1.184. This means that, with 95% confidence, the true ratio of the odds of breast cancer for women who drink fewer than four drinks per week to the odds for women who drink four or more falls within this range. Since the interval includes 1, it is consistent with the conclusion that there is no significant association between alcohol consumption at the specified levels and breast cancer in this sample.
The estimated odds ratio is approximately 0.949, meaning that in this sample the odds of breast cancer were about 5% lower among women who drink fewer than four drinks per week than among those who drink four or more. This difference is not statistically significant. The interpretation agrees with both the Chi-square and Fisher’s exact test results: the data I analyzed provide no evidence of a link between the specified levels of alcohol consumption and breast cancer.
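Because the women were sampled by cancer status, the table cannot be used to estimate the risk of cancer directly, which is why the odds ratio is the summary reported here. One property that makes the odds ratio suitable for this design is that it does not change when the roles of rows and columns are swapped. The sketch below illustrates this; the names `or_rows`, `or_cols`, `fisher_or`, and `fisher_or_t` are illustrative.

```r
# Rebuild the table so this block runs on its own
Drinks <- matrix(c(330, 658, 204, 386), nrow = 2, byrow = TRUE,
                 dimnames = list(Drinks = c("Fewer than 4", "4 or more"),
                                 Status = c("Cancer", "Control")))

# Cross-product odds ratio with drinking level in the rows
or_rows <- (Drinks[1, 1] * Drinks[2, 2]) / (Drinks[1, 2] * Drinks[2, 1])

# Same calculation after transposing (cancer status in the rows)
Drinks_t <- t(Drinks)
or_cols <- (Drinks_t[1, 1] * Drinks_t[2, 2]) / (Drinks_t[1, 2] * Drinks_t[2, 1])

# Conditional estimates from Fisher's Exact Test for both orientations
fisher_or   <- unname(fisher.test(Drinks)$estimate)
fisher_or_t <- unname(fisher.test(Drinks_t)$estimate)

c(or_rows = or_rows, or_cols = or_cols)
c(fisher_or = fisher_or, fisher_or_t = fisher_or_t)
```

The same invariance holds for the Chi-square statistic, so the test conclusions do not depend on which variable is placed in the rows.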