This report analyzes the ToothGrowth data in the R datasets package. Per the course project instructions, the report should:
1. Load the ToothGrowth data and perform some basic exploratory data analyses.
2. Provide a basic summary of the data.
3. Use confidence intervals and/or hypothesis tests to compare tooth growth by supp and dose (only use the techniques from class, even if there’s other approaches worth considering).
4. State your conclusions and the assumptions needed for your conclusions.
ToothGrowth dataset
The ToothGrowth dataset records the length of odontoblasts (the cells responsible for tooth growth) in 60 guinea pigs. Each animal received one of three dose levels of vitamin C (0.5, 1, and 2 mg/day) by one of two delivery methods (orange juice or ascorbic acid), giving 10 animals per dose and delivery combination.
The format is a data frame with 60 observations on 3 variables:
len   numeric  Tooth length
supp  factor   Supplement type (VC or OJ)
dose  numeric  Dose in milligrams
Now, let’s load the ToothGrowth data:
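library(datasets)    # provides the ToothGrowth data frame
library(data.table)  # provides data.table()
toothGrowthDF <- data.table(ToothGrowth)
head(toothGrowthDF)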
## len supp dose
## 1: 4.2 VC 0.5
## 2: 11.5 VC 0.5
## 3: 7.3 VC 0.5
## 4: 5.8 VC 0.5
## 5: 6.4 VC 0.5
## 6: 10.0 VC 0.5
Structure of the ToothGrowth dataset:
str(toothGrowthDF)
## Classes 'data.table' and 'data.frame': 60 obs. of 3 variables:
## $ len : num 4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
## $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
## $ dose: num 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...
## - attr(*, ".internal.selfref")=<externalptr>
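The numeric variable dose is an experimental factor with three discrete levels (0.5, 1 and 2 milligrams), so we convert it to a factor variable in the original ToothGrowth data frame. The data.table copy toothGrowthDF, created earlier, keeps dose as a numeric column.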
ToothGrowth$dose <- factor(ToothGrowth$dose)
table(ToothGrowth$dose)
##
## 0.5 1 2
## 20 20 20
Summary of the ToothGrowth dataset:
summary(toothGrowthDF)
## len supp dose
## Min. : 4.20 OJ:30 Min. :0.500
## 1st Qu.:13.07 VC:30 1st Qu.:0.500
## Median :19.25 Median :1.000
## Mean :18.81 Mean :1.167
## 3rd Qu.:25.27 3rd Qu.:2.000
## Max. :33.90 Max. :2.000
toothGrowthFreqTable <- table( ToothGrowth$supp, ToothGrowth$dose)
toothGrowthFreqTable
##
## 0.5 1 2
## OJ 10 10 10
## VC 10 10 10
plot(len ~ supp, data = ToothGrowth)  # tooth length by supplement type
plot(len ~ dose, data = ToothGrowth)  # tooth length by dose level
The two plots show that tooth length increases as the dosage increases. They do not, however, show clearly which of the two supplements (VC or OJ) contributes more to tooth growth.
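As a complement to the plots, the mean tooth length can be tabulated for each supplement and dose combination. The following is a minimal sketch, not part of the original analysis; the names `tg`, `groupMeans` and `meanTable` are hypothetical and introduced only for illustration.

```r
# Work on a fresh copy of the built-in data so dose is still numeric.
tg <- datasets::ToothGrowth

# Mean tooth length for every supplement/dose combination (long format).
groupMeans <- aggregate(len ~ supp + dose, data = tg, FUN = mean)
groupMeans

# The same means arranged as a supplement-by-dose table.
meanTable <- with(tg, tapply(len, list(supp, dose), mean))
round(meanTable, 2)
```

Reading across a row of `meanTable` shows the dose trend within one supplement; reading down a column compares the two supplements at the same dose.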
In the subsequent steps we will use confidence intervals and hypothesis tests to compare tooth growth by supplement and dosage.
Null hypothesis: increasing the dosage of VC or OJ does NOT increase the length of the teeth
Let’s begin by creating three vectors of data, one for each dosage level (0.5 mg, 1 mg and 2 mg):
subset0.5mg <- subset(toothGrowthDF, dose == 0.5)$len
subset1.0mg <- subset(toothGrowthDF, dose == 1)$len
subset2.0mg <- subset(toothGrowthDF, dose == 2)$len
To test this null hypothesis, we compare adjacent dosage levels in increasing order, using Welch two-sample t-tests to see whether the mean tooth length differs between the two dosage groups.
Performing t-tests:
tTest0.5mgTo1.0mg <- t.test(subset0.5mg, subset1.0mg, paired=FALSE,var.equal=FALSE)
Get the 95% confidence interval for the difference in mean tooth length (0.5 mg minus 1.0 mg):
tTest0.5mgTo1.0mg$conf.int[1:2]
## [1] -11.983781 -6.276219
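When we increase the dosage from 0.5 mg to 1.0 mg, the 95% confidence interval for the difference in means does not contain zero (0). The interval is entirely negative, so the mean tooth length at 0.5 mg is lower than at 1.0 mg.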
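To make the interval less of a black box, the Welch confidence interval can be rebuilt by hand and compared with the one returned by t.test. This is a sketch under the assumption that the unequal-variance (Welch) interval is what `var.equal=FALSE` produces; the variable names (`x`, `y`, `manualCI`, `builtinCI`) are hypothetical.

```r
# Fresh copy of the built-in data, with dose still numeric.
tg <- datasets::ToothGrowth
x <- tg$len[tg$dose == 0.5]
y <- tg$len[tg$dose == 1]

# Per-group variance of the sample mean.
vx <- var(x) / length(x)
vy <- var(y) / length(y)

# Standard error of the difference and Welch-Satterthwaite degrees of freedom.
se <- sqrt(vx + vy)
welchDf <- (vx + vy)^2 / (vx^2 / (length(x) - 1) + vy^2 / (length(y) - 1))

# 95% interval: difference in means plus/minus the t quantile times the SE.
manualCI <- (mean(x) - mean(y)) + c(-1, 1) * qt(0.975, welchDf) * se
builtinCI <- t.test(x, y, paired = FALSE, var.equal = FALSE)$conf.int[1:2]

all.equal(manualCI, builtinCI)
```

Both endpoints are negative because the difference is taken as 0.5 mg minus 1.0 mg and the lower dose has the smaller mean.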
Performing t-tests:
tTest1.0mgTo2.0mg <- t.test(subset1.0mg, subset2.0mg, paired=FALSE,var.equal=FALSE)
Get the 95% confidence interval for the difference in mean tooth length (1.0 mg minus 2.0 mg):
tTest1.0mgTo2.0mg$conf.int[1:2]
## [1] -8.996481 -3.733519
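When we increase the dosage from 1.0 mg to 2.0 mg, the 95% confidence interval for the difference in means again does not contain zero (0). The interval is entirely negative, so the mean tooth length at 1.0 mg is lower than at 2.0 mg.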
In both cases the confidence interval does not contain zero (0), so we reject the null hypothesis and conclude that increasing the dose DOES increase the length of the teeth.
Null hypothesis: the supplement type alone (VC or OJ) does NOT affect the length of the teeth
Let’s begin by creating two vectors of data, one for VC and the other for OJ:
subsetVC <- subset(toothGrowthDF, supp == 'VC')$len
subsetOJ <- subset(toothGrowthDF, supp == 'OJ')$len
Performing t-tests:
# OJ is passed first, so the reported difference is OJ minus VC
tTestSupplements <- t.test(subsetOJ, subsetVC, paired=FALSE, var.equal=FALSE)
Get the p-value and the 95% confidence interval for the difference in means (OJ minus VC):
tTestSupplements$p.value
## [1] 0.06063451
tTestSupplements$conf.int[1:2]
## [1] -0.1710156 7.5710156
We observe that the p-value is about 0.06, which is above 0.05, and the confidence interval contains 0. We therefore DO NOT REJECT the null hypothesis: at the 5% significance level, the data do not show that the supplement type alone changes the length of the teeth.
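The p-value and the confidence interval come from the same test statistic, which is why they lead to the same decision. A minimal sketch of how the two-sided p-value follows from the t statistic and its degrees of freedom (the names `oj`, `vc`, `tt` and `manualP` are hypothetical):

```r
# Fresh copy of the built-in data.
tg <- datasets::ToothGrowth
oj <- tg$len[tg$supp == "OJ"]
vc <- tg$len[tg$supp == "VC"]

# Welch two-sample t-test, OJ minus VC.
tt <- t.test(oj, vc, paired = FALSE, var.equal = FALSE)

# Two-sided p-value: probability mass in both tails beyond |t|.
tStat <- unname(tt$statistic)
welchDf <- unname(tt$parameter)
manualP <- 2 * pt(-abs(tStat), df = welchDf)

c(manual = manualP, builtin = tt$p.value)
```

A two-sided p-value above 0.05 corresponds to a 95% confidence interval that includes zero, which matches the output above.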
Null hypothesis: at the 0.5 mg dosage of vitamin C, the supplement type does NOT affect tooth growth
Let’s begin by creating two vectors of data, one for VC and the other for OJ, at the 0.5 mg dosage:
subsetVC0.5mg <- subset(toothGrowthDF,supp=='VC' & dose == '0.5')$len
subsetOJ0.5mg <- subset(toothGrowthDF,supp=='OJ' & dose == '0.5')$len
Performing t-tests:
tTestSupplements0.5mg <- t.test(subsetVC0.5mg, subsetOJ0.5mg, paired=FALSE,var.equal=FALSE)
Get the p-value and the 95% confidence interval for the difference in means (VC minus OJ):
tTestSupplements0.5mg$p.value
## [1] 0.006358607
tTestSupplements0.5mg$conf.int[1:2]
## [1] -8.780943 -1.719057
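We observe that the p-value is below 0.05 and the confidence interval does not contain 0, and therefore we REJECT the null hypothesis. The interval for VC minus OJ is entirely negative, so at the 0.5 mg dosage OJ is associated with greater tooth length than VC.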
Null hypothesis: at the 1.0 mg dosage of vitamin C, the supplement type does NOT affect tooth growth
Let’s begin by creating two vectors of data, one for VC and the other for OJ, at the 1.0 mg dosage:
subsetVC1.0mg <- subset(toothGrowthDF,supp=='VC' & dose == '1')$len
subsetOJ1.0mg <- subset(toothGrowthDF,supp=='OJ' & dose == '1')$len
Performing t-tests:
tTestSupplements1.0mg <- t.test(subsetVC1.0mg, subsetOJ1.0mg, paired=FALSE,var.equal=FALSE)
Get the p-value and the 95% confidence interval for the difference in means (VC minus OJ):
tTestSupplements1.0mg$p.value
## [1] 0.001038376
tTestSupplements1.0mg$conf.int[1:2]
## [1] -9.057852 -2.802148
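We observe that the p-value is below 0.05 and the confidence interval does not contain 0, and therefore we REJECT the null hypothesis. The interval for VC minus OJ is entirely negative, so at the 1.0 mg dosage OJ is associated with greater tooth length than VC.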
Null hypothesis: at the 2.0 mg dosage of vitamin C, the supplement type does NOT affect tooth growth
Let’s begin by creating two vectors of data, one for VC and the other for OJ, at the 2.0 mg dosage:
subsetVC2.0mg <- subset(toothGrowthDF,supp=='VC' & dose == '2')$len
subsetOJ2.0mg <- subset(toothGrowthDF,supp=='OJ' & dose == '2')$len
Performing t-tests:
tTestSupplements2.0mg <- t.test(subsetVC2.0mg, subsetOJ2.0mg, paired=FALSE,var.equal=FALSE)
Get the p-value and the 95% confidence interval for the difference in means (VC minus OJ):
tTestSupplements2.0mg$p.value
## [1] 0.9638516
tTestSupplements2.0mg$conf.int[1:2]
## [1] -3.63807 3.79807
We observe that the p-value is close to 1 and the confidence interval contains 0, and therefore we FAIL TO REJECT the null hypothesis: at a vitamin C dosage of 2.0 mg, the data show no significant difference in tooth growth between the two supplement types. At this dosage we therefore cannot conclude which of the two supplement types has a greater impact.
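The three per-dose comparisons above repeat the same steps, so they can be collected into one table for side-by-side reading. The following is a sketch of one way to do that, not the code used to produce the results above; `tg`, `doseLevels` and `resultsByDose` are hypothetical names, and the 0.05 threshold is the significance level assumed throughout this report.

```r
# Fresh copy of the built-in data, with dose still numeric.
tg <- datasets::ToothGrowth
doseLevels <- sort(unique(tg$dose))

# One Welch t-test (VC minus OJ) per dose level, collected into a data frame.
resultsByDose <- do.call(rbind, lapply(doseLevels, function(d) {
  vc <- tg$len[tg$supp == "VC" & tg$dose == d]
  oj <- tg$len[tg$supp == "OJ" & tg$dose == d]
  tt <- t.test(vc, oj, paired = FALSE, var.equal = FALSE)
  data.frame(dose = d,
             pValue = tt$p.value,
             ciLower = tt$conf.int[1],
             ciUpper = tt$conf.int[2],
             rejectNull = tt$p.value < 0.05)
}))

resultsByDose
```

Each row corresponds to one of the three sections above, with `rejectNull` marking the doses at which the supplement types differ significantly.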