The ToothGrowth dataset, which ships with R, records the length of odontoblasts (the cells responsible for tooth growth) in 60 guinea pigs. Each guinea pig received one of three dose levels of vitamin C (0.5, 1, or 2 mg/day) through one of two delivery methods: orange juice (coded as OJ) or ascorbic acid (a form of vitamin C, coded as VC).
Load necessary libraries and the ToothGrowth dataset from the datasets package, preparing it for analysis.
library(ggplot2)
library(dplyr)
library(datasets)
# Every later call passes the data frame explicitly, so attach() is not needed
data(ToothGrowth)
Convert the ‘dose’ variable to a factor, ensuring correct categorical representation for analysis.
ToothGrowth$dose <- factor(ToothGrowth$dose)
Examine the structure of the ToothGrowth dataset to understand its variables and data types.
str(ToothGrowth)
## 'data.frame': 60 obs. of 3 variables:
## $ len : num 4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
## $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
## $ dose: Factor w/ 3 levels "0.5","1","2": 1 1 1 1 1 1 1 1 1 1 ...
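As a side note on the factor conversion above, the short sketch below shows what it does to the stored values. The name dose_f is introduced here purely for illustration. Once dose is a factor its levels are character labels, so converting the factor directly to a number returns the internal level codes rather than the doses.

```r
# Illustration only: dose_f is a hypothetical name, not part of the analysis
data(ToothGrowth)
dose_f <- factor(ToothGrowth$dose)

levels(dose_f)                          # "0.5" "1" "2"
head(as.integer(dose_f))                # internal level codes: 1 1 1 1 1 1
head(as.numeric(as.character(dose_f)))  # the original doses: 0.5 0.5 0.5 ...
```

Going through as.character() first is the usual way to recover the numeric doses if they are needed later, for example as a continuous predictor.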
Create a scatter plot and box plot to visualize the relationship between tooth length and dose levels, colored by the supplement delivery method.
set.seed(123)
ggplot(ToothGrowth, aes(dose, len)) +
geom_boxplot(aes(fill = supp)) +
geom_jitter(alpha = 0.7, aes(color = supp)) +
scale_color_manual(values = c("green", "blue")) +
labs(title = "Scatter Plot of Tooth Length and Dose Levels",
x = "Dose Levels", y = "Tooth Length (Millimeters)") +
theme_minimal()
Density plot illustrating the distribution of tooth lengths, categorized by the method of supplement delivery.
ggplot(ToothGrowth, aes(len, fill = supp)) +
geom_density(alpha = 0.5) +
scale_fill_manual(values = c("green", "blue")) +
labs(title = "Density Plot of Tooth Length by Supplement Type",
x = "Tooth Length (Millimeters)", y = "Density") +
theme_minimal()
Bar plot with facets to visualize the mean tooth length for each combination of dose level and supplement delivery method.
# Calculate mean tooth length for each combination of dose level and supplement delivery method
mean_lengths <- ToothGrowth %>%
group_by(dose, supp) %>%
summarise(mean_length = mean(len))
# Create a faceted bar plot
ggplot(mean_lengths, aes(x = supp, y = mean_length, fill = supp)) +
geom_bar(stat = "identity") +
facet_wrap(~dose, scales = "free_x", ncol = 3) +
labs(title = "Mean Tooth Length by Dose Level and Supplement Delivery Method",
x = "Supplement Delivery Method",
y = "Mean Tooth Length (Millimeters)") +
scale_fill_manual(values = c("green", "blue")) +
theme_minimal()
Generate summary statistics to gain insights into the central tendency and spread of the tooth growth data.
summary(ToothGrowth)
##       len        supp     dose
##  Min.   : 4.20   OJ:30   0.5:20
##  1st Qu.:13.07   VC:30   1  :20
##  Median :19.25           2  :20
##  Mean   :18.81
##  3rd Qu.:25.27
##  Max.   :33.90
Tabulate the distribution of observations by supplement delivery method and dose levels, providing a categorical overview of the data.
table(ToothGrowth$supp, ToothGrowth$dose)
##
##      0.5  1  2
##   OJ  10 10 10
##   VC  10 10 10
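The summary above describes tooth length only for the sample as a whole. A minimal sketch of the same idea per group, using base R's tapply(), is given below. The names group_mean and group_sd are assumptions introduced for this example.

```r
# Sketch: mean and standard deviation of tooth length in each supp x dose cell
data(ToothGrowth)

group_mean <- tapply(ToothGrowth$len,
                     list(supp = ToothGrowth$supp, dose = ToothGrowth$dose),
                     mean)
group_sd <- tapply(ToothGrowth$len,
                   list(supp = ToothGrowth$supp, dose = ToothGrowth$dose),
                   sd)

print(round(group_mean, 2))
print(round(group_sd, 2))
```

The per-cell standard deviations give a rough sense of whether the group variances look similar, which is the question behind choosing between the pooled and the Welch t-test.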
We aim to assess the relationship between the delivery method of supplements and the change in tooth growth. We proceed under the assumption of unequal variances between the two groups.
t.test(len ~ supp, paired = FALSE, var.equal = FALSE, data = ToothGrowth)
##
## Welch Two Sample t-test
##
## data: len by supp
## t = 1.9153, df = 55.309, p-value = 0.06063
## alternative hypothesis: true difference in means between group OJ and group VC is not equal to 0
## 95 percent confidence interval:
## -0.1710156 7.5710156
## sample estimates:
## mean in group OJ mean in group VC
## 20.66333 16.96333
The 95% confidence interval for the difference in means, [-0.1710156, 7.5710156], contains zero, and the p-value, 0.06063, exceeds the significance threshold of 0.05. Consequently, we fail to reject the null hypothesis.
Based on this t-test, which pools all dose levels, there is no statistically significant difference in mean tooth length between the two delivery methods.
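To make the numbers in the output above less of a black box, the following sketch recomputes the Welch statistic, the Welch-Satterthwaite degrees of freedom, the two-sided p-value, and the confidence interval by hand. The variable names (oj, vc, se2_oj, and so on) are assumptions introduced for this illustration.

```r
# Sketch: Welch two-sample t-test for len by supp, computed step by step
data(ToothGrowth)
oj <- ToothGrowth$len[ToothGrowth$supp == "OJ"]
vc <- ToothGrowth$len[ToothGrowth$supp == "VC"]

# Squared standard error of each group mean (no pooled variance)
se2_oj <- var(oj) / length(oj)
se2_vc <- var(vc) / length(vc)

diff_means <- mean(oj) - mean(vc)
se_diff <- sqrt(se2_oj + se2_vc)

# Test statistic and Welch-Satterthwaite degrees of freedom
t_stat <- diff_means / se_diff
df_welch <- (se2_oj + se2_vc)^2 /
  (se2_oj^2 / (length(oj) - 1) + se2_vc^2 / (length(vc) - 1))

# Two-sided p-value and 95% confidence interval for the difference in means
p_value <- 2 * pt(-abs(t_stat), df = df_welch)
ci <- diff_means + c(-1, 1) * qt(0.975, df = df_welch) * se_diff

print(c(t = t_stat, df = df_welch, p = p_value))
print(ci)
```

These values should agree with the t.test() output shown above (t = 1.9153, df = 55.309, p-value = 0.06063).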
Subset the data for different dose level combinations for further analysis.
# dose is a factor, so match against its level labels ("0.5", "1", "2")
# Subset for dose levels 0.5 and 1.0
Dose_0510 <- ToothGrowth %>% filter(dose %in% c("0.5", "1"))
# Subset for dose levels 0.5 and 2.0
Dose_0520 <- ToothGrowth %>% filter(dose %in% c("0.5", "2"))
# Subset for dose levels 1.0 and 2.0
Dose_1020 <- ToothGrowth %>% filter(dose %in% c("1", "2"))
Next, we test whether mean tooth length differs between dose levels, again without assuming equal variances in the two groups being compared.
The hypotheses for the following three t-tests are:
Null hypothesis (H0): Mean tooth length is the same at the two dose levels being compared.
Alternative hypothesis (H1): Mean tooth length differs between the two dose levels being compared.
t.test(len ~ dose, paired = FALSE, var.equal = FALSE, data = Dose_0510)
##
## Welch Two Sample t-test
##
## data: len by dose
## t = -6.4766, df = 37.986, p-value = 1.268e-07
## alternative hypothesis: true difference in means between group 0.5 and group 1 is not equal to 0
## 95 percent confidence interval:
## -11.983781 -6.276219
## sample estimates:
## mean in group 0.5 mean in group 1
## 10.605 19.735
In this scenario, the 95% confidence interval, ranging from -11.983781 to -6.276219, does not encompass zero. Additionally, the p-value of 1.268e-07 is below the significance level of 0.05. Consequently, we have sufficient evidence to confidently reject the null hypothesis.
t.test(len ~ dose, paired = FALSE, var.equal = FALSE, data = Dose_0520)
##
## Welch Two Sample t-test
##
## data: len by dose
## t = -11.799, df = 36.883, p-value = 4.398e-14
## alternative hypothesis: true difference in means between group 0.5 and group 2 is not equal to 0
## 95 percent confidence interval:
## -18.15617 -12.83383
## sample estimates:
## mean in group 0.5 mean in group 2
## 10.605 26.100
In this scenario, the 95% confidence interval, ranging from -18.15617 to -12.83383, does not encompass zero. Additionally, the p-value of 4.398e-14 is below the significance level of 0.05. Consequently, we have sufficient evidence to confidently reject the null hypothesis.
t.test(len ~ dose, paired = FALSE, var.equal = FALSE, data = Dose_1020)
##
## Welch Two Sample t-test
##
## data: len by dose
## t = -4.9005, df = 37.101, p-value = 1.906e-05
## alternative hypothesis: true difference in means between group 1 and group 2 is not equal to 0
## 95 percent confidence interval:
## -8.996481 -3.733519
## sample estimates:
## mean in group 1 mean in group 2
## 19.735 26.100
In this scenario, the 95% confidence interval, ranging from -8.996481 to -3.733519, does not encompass zero. Additionally, the p-value of 1.906e-05 is below the significance level of 0.05. Consequently, we have sufficient evidence to confidently reject the null hypothesis.
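The three dose comparisons above can also be collected into a single table, which makes them easier to read side by side. The sketch below is one possible way to do that in base R. The names tg, dose_pairs, and dose_tests are assumptions introduced for this example.

```r
# Sketch: run the Welch t-test for every pair of dose levels and tabulate results
data(ToothGrowth)
tg <- transform(ToothGrowth, dose = factor(dose))

# All pairs of dose levels: (0.5, 1), (0.5, 2), (1, 2)
dose_pairs <- combn(levels(tg$dose), 2, simplify = FALSE)

dose_tests <- do.call(rbind, lapply(dose_pairs, function(pr) {
  sub <- droplevels(tg[tg$dose %in% pr, ])  # keep only the two levels compared
  res <- t.test(len ~ dose, data = sub, var.equal = FALSE)
  data.frame(comparison = paste(pr, collapse = " vs "),
             t = unname(res$statistic),
             df = unname(res$parameter),
             p_value = res$p.value,
             ci_low = res$conf.int[1],
             ci_high = res$conf.int[2])
}))

print(dose_tests)
```

Each row should match the corresponding t.test() output above. Because three tests are run on the same data, a reader who wants a more conservative threshold could pass the p_value column to p.adjust(), although the original analysis does not do so.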
Next, we test whether mean tooth length differs between delivery methods within each dose level, again without assuming equal variances between the two groups.
The hypotheses for the following three t-tests are:
Null hypothesis (H0): At the specified dose level, mean tooth length is the same for both delivery methods.
Alternative hypothesis (H1): At the specified dose level, mean tooth length differs between the two delivery methods.
Dose05 <- ToothGrowth %>% filter(dose == 0.5)
Dose10 <- ToothGrowth %>% filter(dose == 1.0)
Dose20 <- ToothGrowth %>% filter(dose == 2.0)
t.test(len ~ supp, paired = FALSE, var.equal = FALSE, data = Dose05)
##
## Welch Two Sample t-test
##
## data: len by supp
## t = 3.1697, df = 14.969, p-value = 0.006359
## alternative hypothesis: true difference in means between group OJ and group VC is not equal to 0
## 95 percent confidence interval:
## 1.719057 8.780943
## sample estimates:
## mean in group OJ mean in group VC
## 13.23 7.98
In this instance, the 95% confidence interval, which ranges from 1.719057 to 8.780943, does not include zero. Additionally, the p-value of 0.006359 is lower than the commonly accepted significance level of 0.05. Thus, we have substantial evidence to reject the null hypothesis with confidence.
t.test(len ~ supp, paired = FALSE, var.equal = FALSE, data = Dose10)
##
## Welch Two Sample t-test
##
## data: len by supp
## t = 4.0328, df = 15.358, p-value = 0.001038
## alternative hypothesis: true difference in means between group OJ and group VC is not equal to 0
## 95 percent confidence interval:
## 2.802148 9.057852
## sample estimates:
## mean in group OJ mean in group VC
## 22.70 16.77
In this situation, the 95% confidence interval, spanning from 2.802148 to 9.057852, does not include zero. Moreover, the p-value of 0.001038 falls below the significance threshold of 0.05, so we have substantial evidence to reject the null hypothesis.
t.test(len ~ supp, paired = FALSE, var.equal = FALSE, data = Dose20)
##
## Welch Two Sample t-test
##
## data: len by supp
## t = -0.046136, df = 14.04, p-value = 0.9639
## alternative hypothesis: true difference in means between group OJ and group VC is not equal to 0
## 95 percent confidence interval:
## -3.79807 3.63807
## sample estimates:
## mean in group OJ mean in group VC
## 26.06 26.14
In this case, the 95% confidence interval ranges from -3.79807 to 3.63807, inclusive of zero. Furthermore, the p-value of 0.9639 exceeds the significance level of 0.05. Therefore, we do not have enough evidence to reject the null hypothesis.
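The per-dose comparisons above follow the same recipe three times, so they can be written as one pass over the data split by dose. The sketch below uses split() and lapply() from base R. The names by_dose and p_values are assumptions introduced for this illustration.

```r
# Sketch: Welch t-test of len by supp within each dose level
data(ToothGrowth)

by_dose <- lapply(split(ToothGrowth, ToothGrowth$dose), function(d) {
  t.test(len ~ supp, data = d, var.equal = FALSE)
})

# One p-value per dose level, named by dose
p_values <- sapply(by_dose, function(res) res$p.value)
print(signif(p_values, 4))
```

The p-values should reproduce those reported above: below 0.05 at doses 0.5 and 1, and well above 0.05 at dose 2.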
Increasing the supplement dose increases tooth length: All three pairwise comparisons between dose levels show a statistically significant difference in mean tooth length, and the mean rises with each increase in dose (10.605 at 0.5 mg/day, 19.735 at 1 mg/day, and 26.100 at 2 mg/day). This suggests that higher doses of vitamin C may promote tooth growth in guinea pigs.
Supplement delivery method has no significant overall effect on tooth length: Pooled across dose levels, the difference between the two delivery methods is not statistically significant. Within specific dose levels, however, the picture differs. At doses of 0.5 and 1 mg/day, orange juice is associated with significantly greater mean tooth length than ascorbic acid. At 2 mg/day, there is no significant difference between the two delivery methods. This implies that the effect of delivery method on tooth length depends on the dose level.