df <- read.csv("Cancer.csv")
1.a. Yes, breast has the highest median. It also has the greatest IQR, along with ovary. Ovary’s box appears to have values skewed downward, and stomach’s as well (though to a lesser extent).
boxplot(
  Survival ~ Organ,
  data = df,
  main = "Survival Time by Organ",
  ylab = "Survival (days)",
  xlab = "Organ"
)
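1.b. The sample means all differ, ranging from about 212 days (bronchus) to about 1396 days (breast). Differences among sample means alone do not justify rejecting the null hypothesis; that decision rests on the ANOVA F test in 1.h.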
mean_by_organ <- tapply(df$Survival, df$Organ, mean, na.rm = TRUE)
print(mean_by_organ)
##    Breast  Bronchus     Colon     Ovary   Stomach
## 1395.9091  211.5882  457.4118  884.3333  286.0000
1.c. The sample standard deviations all differ, ranging from about 210 days (bronchus) to about 1239 days (breast).
stdev_by_organ <- tapply(df$Survival, df$Organ, function(x) sd(x, na.rm = TRUE))
print(stdev_by_organ)
##    Breast  Bronchus     Colon     Ovary   Stomach
## 1238.9667  209.8586  427.1686 1098.5788  346.3096
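As a rough check on how far apart these standard deviations are, the sketch below computes the ratio of the largest to the smallest. The values are copied from the printed output rather than recomputed from `Cancer.csv`, and the name `sd_printed` is introduced here only for illustration. The "ratio below 2" benchmark in the comment is a common textbook rule of thumb for the equal-variance assumption in ANOVA, not something stated in the assignment.

```r
# Standard deviations copied from the printed output above (not recomputed).
sd_printed <- c(
  Breast   = 1238.9667,
  Bronchus = 209.8586,
  Colon    = 427.1686,
  Ovary    = 1098.5788,
  Stomach  = 346.3096
)

# Ratio of the largest to the smallest sample standard deviation.
# Rule of thumb (an assumption here): a ratio well above 2 suggests the
# equal-variance assumption behind one-way ANOVA may be doubtful.
sd_ratio <- max(sd_printed) / min(sd_printed)

cat("Largest SD: ", names(which.max(sd_printed)), "\n")
cat("Smallest SD:", names(which.min(sd_printed)), "\n")
cat("Ratio:      ", round(sd_ratio, 2), "\n")
```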
1.d. The significance level is α = 0.05.
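1.e-g. SSTO = 11535761 + 26448144 = 37983905 on 4 + 59 = 63 degrees of freedom, so MSTO = SSTO / 63 ≈ 602919.1. (MSTO is the total sum of squares divided by the total degrees of freedom, not the sum of the two mean squares.)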
anova_result <- aov(Survival ~ Organ, data = df)
summary(anova_result)
##             Df   Sum Sq Mean Sq F value   Pr(>F)
## Organ        4 11535761 2883940   6.433 0.000229 ***
## Residuals   59 26448144  448274
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
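The total sum of squares and total mean square asked for in 1.e-g can be recovered from this table. The sketch below assumes only the rounded values printed by `summary()`; the variable names are illustrative and do not come from the original analysis.

```r
# Values copied from the ANOVA table above (rounded as printed).
ss_organ <- 11535761   # between-group (Organ) sum of squares
ss_resid <- 26448144   # within-group (Residuals) sum of squares
df_organ <- 4
df_resid <- 59

ssto  <- ss_organ + ss_resid   # total sum of squares
df_to <- df_organ + df_resid   # total degrees of freedom, n - 1
msto  <- ssto / df_to          # total mean square

print(c(SSTO = ssto, df = df_to, MSTO = round(msto, 1)))
```

Note that the mean squares are not additive: only the sums of squares and the degrees of freedom add up to their totals.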
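1.h. Reject the null hypothesis. The p-value (0.000229) is below α = 0.05, so there is evidence that at least one organ's mean survival time differs from the others. The F test does not show that every pair of means differs.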
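The decision in 1.h can be reproduced from the printed mean squares, as in the following sketch. It assumes the rounded table values and the 0.05 level from 1.d; `f_stat`, `f_crit`, and `p_value` are names chosen here for illustration.

```r
alpha <- 0.05          # significance level from 1.d

# Mean squares and degrees of freedom copied from the ANOVA table.
ms_organ <- 2883940
ms_resid <- 448274
df_organ <- 4
df_resid <- 59

# F statistic: between-group mean square over within-group mean square.
f_stat <- ms_organ / ms_resid

# Critical value and upper-tail p-value from the F distribution.
f_crit  <- qf(1 - alpha, df1 = df_organ, df2 = df_resid)
p_value <- pf(f_stat, df1 = df_organ, df2 = df_resid, lower.tail = FALSE)

reject <- p_value < alpha
cat("F =", round(f_stat, 3), " critical F =", round(f_crit, 3), "\n")
cat("p =", signif(p_value, 3), " reject H0:", reject, "\n")
```

Comparing `f_stat` with `f_crit` and comparing `p_value` with `alpha` are equivalent ways of stating the same test, so the two always lead to the same decision.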