Required libraries
library(datasets)
library(ggplot2)
library(RColorBrewer)
Overview:
In this project we will analyze the ‘ToothGrowth’ data from the R datasets package.
Task 1: Load the ‘ToothGrowth’ data
data("ToothGrowth")
dim(ToothGrowth)
## [1] 60 3
head(ToothGrowth)
## len supp dose
## 1 4.2 VC 0.5
## 2 11.5 VC 0.5
## 3 7.3 VC 0.5
## 4 5.8 VC 0.5
## 5 6.4 VC 0.5
## 6 10.0 VC 0.5
str(ToothGrowth)
## 'data.frame': 60 obs. of 3 variables:
## $ len : num 4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
## $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
## $ dose: num 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...
summary(ToothGrowth)
## len supp dose
## Min. : 4.20 OJ:30 Min. :0.500
## 1st Qu.:13.07 VC:30 1st Qu.:0.500
## Median :19.25 Median :1.000
## Mean :18.81 Mean :1.167
## 3rd Qu.:25.27 3rd Qu.:2.000
## Max. :33.90 Max. :2.000
Summary of data:
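The ToothGrowth dataset contains 60 observations on the effect of two supplements, VC (Vitamin C) and OJ (Orange Juice), on tooth growth in guinea pigs. Each row records the tooth length (len), the supplement type (supp), and the dose (dose: 0.5, 1 or 2).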
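As a quick check on the structure described above, the group sizes and mean tooth length can be tabulated for each supplement and dose combination. This is a minimal sketch using base R only; the object names `group_counts` and `group_means` are illustrative and not part of the original analysis.

```r
library(datasets)
data("ToothGrowth")

# Number of observations for each supplement (rows) and dose (columns)
group_counts <- table(ToothGrowth$supp, ToothGrowth$dose)
print(group_counts)

# Mean tooth length for each supplement and dose combination
group_means <- aggregate(len ~ supp + dose, data = ToothGrowth, FUN = mean)
print(group_means)
```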
Task 2: Perform some basic exploratory data analysis on the ToothGrowth dataset
Creating a plot of tooth length by supplement type and dose
g<-ggplot(data=ToothGrowth, aes(x=interaction(supp, dose), y=len, fill=supp))
# Adding boxplots (outliers in blue) and a fill palette
g<-g+geom_boxplot(outlier.colour="blue")+scale_fill_brewer(palette="Paired")
# Setting labels for the plot
g<-g+labs(x="Supplements type and dose", y="Tooth length",
title="Tooth growth by dose")
print(g)

Conclusion from Exploratory data analysis:
The graph shows that at the highest dose level (2) both supplements give similar results, but supplement ‘OJ’ gives more consistent results than ‘VC’ at that dose.
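The consistency of each group can also be checked numerically rather than read off the boxplot. The sketch below computes the standard deviation of tooth length for each supplement and dose; using the standard deviation as the measure of spread is an assumption made here, and `group_sd` is an illustrative name.

```r
library(datasets)
data("ToothGrowth")

# Standard deviation of tooth length within each supplement and dose group;
# a smaller value means more consistent results within that group
group_sd <- aggregate(len ~ supp + dose, data = ToothGrowth, FUN = sd)
names(group_sd)[names(group_sd) == "len"] <- "sd_len"
print(group_sd)
```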
Task 3: Compare tooth growth by supplement and dosage using t-tests.
Run t-test
t.test(len~supp, data=ToothGrowth)
##
## Welch Two Sample t-test
##
## data: len by supp
## t = 1.9153, df = 55.309, p-value = 0.06063
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -0.1710156 7.5710156
## sample estimates:
## mean in group OJ mean in group VC
## 20.66333 16.96333
The p-value of this test is 0.06063, which is greater than 0.05, and the 95% confidence interval (-0.17, 7.57) contains zero.
So we cannot reject the null hypothesis: supplement type does not appear to have a significant effect on tooth growth.
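The test above pools all three dose levels. A possible follow-up, sketched here as an assumption rather than as part of the original analysis, is to repeat the same Welch t-test of len by supp within each dose level. The names `by_dose` and `supp_by_dose` are illustrative.

```r
library(datasets)
data("ToothGrowth")

# Run a Welch t-test of len by supp separately for each dose level
by_dose <- lapply(split(ToothGrowth, ToothGrowth$dose), function(d) {
  t.test(len ~ supp, data = d)
})

# Collect the p-value and 95% confidence interval for each dose
supp_by_dose <- data.frame(
  dose     = names(by_dose),
  p_value  = sapply(by_dose, function(tt) tt$p.value),
  ci_lower = sapply(by_dose, function(tt) tt$conf.int[1]),
  ci_upper = sapply(by_dose, function(tt) tt$conf.int[2])
)
print(supp_by_dose)
```

Each row can be read the same way as the pooled test: a p-value above 0.05 together with a confidence interval containing zero means the null hypothesis is not rejected at that dose.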
Now, we will compare tooth growth by supplement dosage.
# Subsetting data into pairs of dose levels
testI<-subset(ToothGrowth, dose %in% c(0.5, 1))
testII<-subset(ToothGrowth, dose %in% c(1, 2))
testIII<-subset(ToothGrowth, dose %in% c(0.5, 2))
# Run t-test by dosage
t.test(len~dose, data=testI)
##
## Welch Two Sample t-test
##
## data: len by dose
## t = -6.4766, df = 37.986, p-value = 1.268e-07
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -11.983781 -6.276219
## sample estimates:
## mean in group 0.5 mean in group 1
## 10.605 19.735
t.test(len~dose, data=testII)
##
## Welch Two Sample t-test
##
## data: len by dose
## t = -4.9005, df = 37.101, p-value = 1.906e-05
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -8.996481 -3.733519
## sample estimates:
## mean in group 1 mean in group 2
## 19.735 26.100
t.test(len~dose, data=testIII)
##
## Welch Two Sample t-test
##
## data: len by dose
## t = -11.799, df = 36.883, p-value = 4.398e-14
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -18.15617 -12.83383
## sample estimates:
## mean in group 0.5 mean in group 2
## 10.605 26.100
The p-values and confidence intervals from the tests above clearly show that supplement dosage has an impact on tooth growth: all three p-values are far below 0.05, and none of the confidence intervals contains zero.
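Because three dose comparisons are made, the same tests can also be run in a single call with a multiple-comparison adjustment. The sketch below uses `pairwise.t.test` from the stats package with `pool.sd = FALSE`, so each pair is compared with a Welch-type test as above. The Bonferroni adjustment is an assumption made here and is not part of the original analysis.

```r
library(datasets)
data("ToothGrowth")

# Pairwise t-tests between dose levels, without pooling the standard
# deviation across groups, with Bonferroni-adjusted p-values
pw <- pairwise.t.test(
  x = ToothGrowth$len,
  g = ToothGrowth$dose,
  p.adjust.method = "bonferroni",
  pool.sd = FALSE
)
print(pw)

# Matrix of adjusted p-values (unused cells are NA)
adjusted_p <- pw$p.value
```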
Conclusion from t-test:
The t-tests indicate that the supplement delivery method has no statistically significant effect on tooth length at the 5% level when all doses are pooled, but increased dosages do result in increased tooth length.