This part loads the ToothGrowth data and performs some basic exploratory data analyses.
knitr::opts_chunk$set(echo = TRUE)
if(!require(tidyverse)){
install.packages("tidyverse")
}
library(tidyverse)
library(datasets) #load the dataset
data(ToothGrowth)
str(ToothGrowth)
## 'data.frame': 60 obs. of 3 variables:
## $ len : num 4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
## $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
## $ dose: num 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...
table(ToothGrowth$supp, ToothGrowth$dose) # split of cases between different dose levels and delivery methods
##
## 0.5 1 2
## OJ 10 10 10
## VC 10 10 10
ToothGrowth is a data frame with 60 observations on 3 variables, namely “len”, “supp” and “dose”.
The variable “len” refers to the length of odontoblasts in 60 guinea pigs. Each animal received one of three dose levels of vitamin C (0.5, 1, and 2 mg/day, under the variable “dose”) by one of two delivery methods, orange juice (coded as OJ) or ascorbic acid (a form of vitamin C and coded as VC), under the variable “supp”. There were 10 animal subjects for each dose level under each delivery method.
The basic exploratory data analyses of ToothGrowth are shown below:
ggplot(data=ToothGrowth,
mapping=aes(x=as.factor(dose), y=len, fill=supp))+
geom_boxplot() +
labs(x= "Dose of Vitamin C (mg/day)",
y = "Length of odontoblasts",
title = "Effect of vitamin C on the length of odontoblasts in guinea pigs")+
stat_summary(fun = median,
geom = "text",
aes(label = after_stat(y)),
vjust = -1, # Adjust vertical position above the line
size = 4) # Adjust text size
print(summary(ToothGrowth))
## len supp dose
## Min. : 4.20 OJ:30 Min. :0.500
## 1st Qu.:13.07 VC:30 1st Qu.:0.500
## Median :19.25 Median :1.000
## Mean :18.81 Mean :1.167
## 3rd Qu.:25.27 3rd Qu.:2.000
## Max. :33.90 Max. :2.000
print("Summary table of ToothGrowth dataset")
## [1] "Summary table of ToothGrowth dataset"
Meandata<-ToothGrowth |>
group_by(supp, dose)|>
summarise(mean_length=mean(len), SD=sd(len),.groups = "drop")
print(Meandata)
## # A tibble: 6 × 4
## supp dose mean_length SD
## <fct> <dbl> <dbl> <dbl>
## 1 OJ 0.5 13.2 4.46
## 2 OJ 1 22.7 3.91
## 3 OJ 2 26.1 2.66
## 4 VC 0.5 7.98 2.75
## 5 VC 1 16.8 2.52
## 6 VC 2 26.1 4.80
print("Summary table of the mean length of odontoblasts with different dosages and delivery methods")
## [1] "Summary table of the mean length of odontoblasts with different dosages and delivery methods"
Meandata2<-ToothGrowth |>
group_by(dose)|>
summarise(mean_length=mean(len), SD=sd(len), .groups = "drop")
print(Meandata2)
## # A tibble: 3 × 3
## dose mean_length SD
## <dbl> <dbl> <dbl>
## 1 0.5 10.6 4.50
## 2 1 19.7 4.42
## 3 2 26.1 3.77
print("Summary table of the mean length of odontoblasts")
## [1] "Summary table of the mean length of odontoblasts"
print("with different dosages")
## [1] "with different dosages"
ggplot(data=Meandata2,
mapping=aes(x=dose, y=mean_length))+
geom_point(colour="blue")+
geom_smooth(formula = 'y ~ x', method = 'lm', se=FALSE, colour="grey")+
geom_errorbar(aes(ymin = mean_length - SD, ymax = mean_length + SD),
width = 0.2, position = position_dodge(0.3))+
labs(x= "Dose of Vitamin C (mg/day)",
y = "Length of odontoblasts",
title = "Effect of vitamin C on the mean length of odontoblasts in guinea pigs")
Based on the boxplot and the line graph above, vitamin C supplementation shows a positive, dose-dependent relationship with the length of odontoblasts in guinea pigs: the length increased with the dose. OJ seems to be a more effective delivery method than VC at the lower dose levels (0.5 mg/day and 1 mg/day), as suggested by the higher median values in the boxplot. The same trends appear in the tables of mean length values. Statistical analyses are needed to confirm these observations.
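The OJ and VC medians read off the boxplot can also be tabulated directly. The following is a minimal sketch in base R, assuming only the built-in ToothGrowth data; the object name `med` is illustrative and not part of the analysis above.

```r
library(datasets)  # provides ToothGrowth

# Median length for each delivery method (rows) and dose (columns)
med <- with(ToothGrowth,
            tapply(len, list(supp = supp, dose = dose), median))
print(med)

# Difference in medians (OJ minus VC) at each dose level
print(med["OJ", ] - med["VC", ])
```

A positive difference at a given dose means the OJ median is the larger one at that dose.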
To prepare the data for statistical analyses, the ToothGrowth dataset was subset as follows:
oj_data <- filter(ToothGrowth, supp == "OJ")
vc_data <- filter(ToothGrowth, supp == "VC")
dose0.5 <- ToothGrowth[ToothGrowth$dose == 0.5, ]$len
dose1 <- ToothGrowth[ToothGrowth$dose == 1, ]$len
dose2 <- ToothGrowth[ToothGrowth$dose == 2, ]$len
oj_dose0.5 <- oj_data[oj_data$dose == 0.5, ]$len
oj_dose1 <- oj_data[oj_data$dose == 1, ]$len
oj_dose2 <- oj_data[oj_data$dose == 2, ]$len
vc_dose0.5 <- vc_data[vc_data$dose == 0.5, ]$len
vc_dose1 <- vc_data[vc_data$dose == 1, ]$len
vc_dose2 <- vc_data[vc_data$dose == 2, ]$len
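The same delivery-method and dose subsets can be built in one step instead of one assignment per group. The following is a sketch using base R `split()`; the name `len_by_cell` is hypothetical and the element names follow the `<supp>.<dose>` pattern that `split()` generates.

```r
library(datasets)  # provides ToothGrowth

# One list element per delivery method / dose combination
len_by_cell <- split(ToothGrowth$len,
                     list(ToothGrowth$supp, ToothGrowth$dose))
print(names(len_by_cell))

# Same values as vc_dose0.5 in the subsetting code above
print(len_by_cell[["VC.0.5"]])
```

Keeping the groups in a single list makes it easier to loop over comparisons and harder to mix up similarly named vectors.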
To support the notion that vitamin C supplementation shows a positive
relationship with the length of odontoblasts in guinea pigs, it is
hypothesized that a higher dose results in a greater length, while the
null hypotheses state that the mean length remains unchanged
regardless of the dose level. One-sided t-tests on the element-wise
differences between dose groups were performed as follows:
Test 1:
H0: length mean of dose0.5 = length mean of dose1
Ha: length mean of dose0.5 < length mean of dose1
t.test(dose1-dose0.5, paired = FALSE, alt = "greater")
##
## One Sample t-test
##
## data: dose1 - dose0.5
## t = 6.9669, df = 19, p-value = 6.127e-07
## alternative hypothesis: true mean is greater than 0
## 95 percent confidence interval:
## 6.863996 Inf
## sample estimates:
## mean of x
## 9.13
Test 2:
H0: length mean of dose1 = length mean of dose2
Ha: length mean of dose1 < length mean of dose2
t.test(dose2-dose1, paired = FALSE, alt = "greater")
##
## One Sample t-test
##
## data: dose2 - dose1
## t = 4.6046, df = 19, p-value = 9.671e-05
## alternative hypothesis: true mean is greater than 0
## 95 percent confidence interval:
## 3.974821 Inf
## sample estimates:
## mean of x
## 6.365
In both cases, the one-sided 95 percent confidence intervals for the
mean difference lie entirely above 0 and the p-values are less than
0.05. Thus, the null hypotheses are rejected, supporting the alternative
hypotheses, i.e. a higher dose resulted in a greater length of
odontoblasts. To investigate if this also holds true for each delivery
method, further one-sided t-tests (Tests 3 - 6) are carried out as follows:
Test 3:
H0: length mean of oj_dose0.5 = length mean of oj_dose1
Ha: length mean of oj_dose0.5 < length mean of oj_dose1
Test 4:
H0: length mean of oj_dose1 = length mean of oj_dose2
Ha: length mean of oj_dose1 < length mean of oj_dose2
Test 5:
H0: length mean of vc_dose0.5 = length mean of vc_dose1
Ha: length mean of vc_dose0.5 < length mean of vc_dose1
Test 6:
H0: length mean of vc_dose1 = length mean of vc_dose2
Ha: length mean of vc_dose1 < length mean of vc_dose2
p3<-t.test(oj_dose1-oj_dose0.5, paired = FALSE, alt = "greater")$p.value
p4<-t.test(oj_dose2-oj_dose1, paired = FALSE, alt = "greater")$p.value
p5<-t.test(vc_dose1-vc_dose0.5, paired = FALSE, alt = "greater")$p.value
p6<-t.test(vc_dose2-vc_dose1, paired = FALSE, alt = "greater")$p.value
# Create a data frame with the results
p_values_table <- data.frame(
Test = c("Test 3", "Test 4", "Test 5", "Test 6"),
Comparison = c("OJ: Dose 1 vs 0.5", "OJ: Dose 2 vs 1", "VC: Dose 1 vs 0.5", "VC: Dose 2 vs 1"),
P_Value = c(p3, p4, p5, p6)
)
# Print the table to the console
print(p_values_table)
## Test Comparison P_Value
## 1 Test 3 OJ: Dose 1 vs 0.5 0.00121757
## 2 Test 4 OJ: Dose 2 vs 1 0.04191956
## 3 Test 5 VC: Dose 1 vs 0.5 0.00121757
## 4 Test 6 VC: Dose 2 vs 1 0.04191956
As all the p-values are lower than 0.05, the null hypotheses are rejected at the 0.05 significance level, supporting the notion that a higher dose results in a greater length of odontoblasts regardless of the delivery method (OJ or VC).
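Tests 3 to 6 share one pattern (a higher dose against the next lower dose, within one delivery method), so the p-values can be generated from a single helper instead of four hand-written calls. The following is a sketch under that assumption; `dose_p_value`, `grid`, and `doses` are hypothetical names, and the helper mirrors the one-sided test on element-wise differences used above.

```r
library(datasets)  # provides ToothGrowth

# Hypothetical helper: one-sided t-test on the element-wise differences
# (higher dose minus lower dose) within a single delivery method
dose_p_value <- function(data, supplement, lo_dose, hi_dose) {
  lo <- data$len[data$supp == supplement & data$dose == lo_dose]
  hi <- data$len[data$supp == supplement & data$dose == hi_dose]
  t.test(hi - lo, alternative = "greater")$p.value
}

doses <- c(0.5, 1, 2)

# Every delivery method combined with each adjacent-dose step
grid <- expand.grid(supp = c("OJ", "VC"), step = 1:2,
                    stringsAsFactors = FALSE)
grid$comparison <- paste0(grid$supp, ": Dose ", doses[grid$step + 1],
                          " vs ", doses[grid$step])
grid$p_value <- mapply(dose_p_value,
                       supplement = grid$supp,
                       lo_dose = doses[grid$step],
                       hi_dose = doses[grid$step + 1],
                       MoreArgs = list(data = ToothGrowth))

print(grid[order(grid$supp, grid$step), c("comparison", "p_value")])
```

Because the delivery method is passed as an argument, each row of the result is computed from the group named in its label.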
To evaluate whether OJ is a more effective delivery method than VC at different dose levels, the following t-tests are performed (one-sided for Tests 7 to 9; two-sided for Test 10):
Test 7:
H0: length mean of vc_dose0.5 = length mean of oj_dose0.5
Ha: length mean of vc_dose0.5 < length mean of oj_dose0.5
Test 8:
H0: length mean of vc_dose1 = length mean of oj_dose1
Ha: length mean of vc_dose1 < length mean of oj_dose1
Test 9:
H0: length mean of vc_dose2 = length mean of oj_dose2
Ha: length mean of vc_dose2 < length mean of oj_dose2
Test 10:
H0: length mean of vc_dose2 = length mean of oj_dose2
Ha: length mean of vc_dose2 =/= length mean of oj_dose2
p7<-t.test(oj_dose0.5-vc_dose0.5, paired = FALSE, alt = "greater")$p.value
p8<-t.test(oj_dose1-vc_dose1, paired = FALSE, alt = "greater")$p.value
p9<-t.test(oj_dose2-vc_dose2, paired = FALSE, alt = "greater")$p.value
p10<-t.test(oj_dose2-vc_dose2, paired = FALSE, alt = "two.sided")$p.value
# Create a data frame with the results
p_values_table2 <- data.frame(
Test = c("Test 7", "Test 8", "Test 9", "Test 10"),
Comparison = c("Dose 0.5: OJ vs VC", "Dose 1: OJ vs VC", "Dose 2: OJ vs VC", "Dose 2: OJ vs VC (two-sided)"),
P_Value = c(p7, p8, p9, p10)
)
# Print the table to the console
print(p_values_table2)
## Test Comparison P_Value
## 1 Test 7 Dose 0.5: OJ vs VC 0.007736024
## 2 Test 8 Dose 1: OJ vs VC 0.004114624
## 3 Test 9 Dose 2: OJ vs VC 0.516521648
## 4 Test 10 Dose 2: OJ vs VC (two-sided) 0.966956704
As the p-values for dose levels 0.5 mg/day and 1 mg/day (Test 7 and Test 8) are lower than 0.05, we can reject the null hypotheses in favour of the alternative hypotheses: OJ is a more effective delivery method at the dose levels 0.5 mg/day and 1 mg/day. However, at dose level 2 mg/day, the p-values for both the one-sided (Test 9) and the two-sided (Test 10) t-tests are much higher than 0.05, and thus the null hypothesis cannot be rejected. OJ and VC therefore appear to make no significant difference in stimulating the growth of odontoblasts at the highest dose level.
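The OJ versus VC comparison at each dose can also be written as a two-sample test with the formula interface of `t.test()`. The following is a sketch of that alternative form, not a record of the calls above: it treats the two delivery-method groups as independent samples and runs two-sided Welch tests. The name `welch_p` is illustrative.

```r
library(datasets)  # provides ToothGrowth

# Two-sided Welch two-sample t-test of len by supp, one test per dose level
welch_p <- sapply(c(0.5, 1, 2), function(d) {
  at_dose <- subset(ToothGrowth, dose == d)
  t.test(len ~ supp, data = at_dose, var.equal = FALSE)$p.value
})
names(welch_p) <- c("0.5", "1", "2")
print(welch_p)
```

The formula `len ~ supp` keeps the grouping variable explicit, so no separate vector per group is needed.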
Conclusions:
- Length of odontoblasts increased with the dose of Vitamin C, regardless
of the delivery method, suggesting a positive influence of Vitamin C on
the growth of odontoblasts.
- OJ was a more effective delivery method compared to VC at the dose
levels 0.5 mg/day and 1 mg/day.
Assumptions:
- The experiment was done with random assignment of guinea pigs to
different dose levels and delivery methods to control for confounders
that might affect the outcome.
- Members of the sample population, i.e. the 60 guinea pigs, are
representative of the entire population of guinea pigs. This assumption
allows us to generalize the results.
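- For the t-tests, each call received a single vector of element-wise
differences, so R performed one-sample t-tests (see the “One Sample
t-test” output). These assume that the differences are approximately
normally distributed; the var.equal setting applies only to two-sample
calls.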
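The `var.equal` setting mentioned in the last bullet only takes effect when `t.test()` receives two samples. The following is a minimal sketch of that two-sample (Welch) form for the dose comparisons, offered as an alternative to the calls used above and not as a record of them; `len_at` is a hypothetical helper.

```r
library(datasets)  # provides ToothGrowth

# Hypothetical helper: all tooth lengths recorded at one dose level
len_at <- function(d) ToothGrowth$len[ToothGrowth$dose == d]

# Passing two vectors makes t.test() run a two-sample test;
# var.equal = FALSE (the default) requests the Welch approximation
welch_1_vs_0.5 <- t.test(len_at(1), len_at(0.5),
                         alternative = "greater", var.equal = FALSE)
welch_2_vs_1 <- t.test(len_at(2), len_at(1),
                       alternative = "greater", var.equal = FALSE)

print(welch_1_vs_0.5$method)
print(c(p_1_vs_0.5 = welch_1_vs_0.5$p.value,
        p_2_vs_1 = welch_2_vs_1$p.value))
```

In this form the two groups need not be the same size, and no pairing by row order is implied.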