Overview

This report analyzes the ToothGrowth data in the R datasets package. Per the course project instructions, the following items should occur:

  1. Load the ToothGrowth data and perform some basic exploratory data analyses.
  2. Provide a basic summary of the data.
  3. Use confidence intervals and/or hypothesis tests to compare tooth growth by supp and dose (only use the techniques from class, even if there are other approaches worth considering).
  4. State your conclusions and the assumptions needed for your conclusions.

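About the ToothGrowth dataset

The ToothGrowth dataset records the length of odontoblasts (the cells responsible for tooth growth) in 60 guinea pigs. Each animal received one of three dose levels of Vitamin C (0.5, 1, and 2 mg/day) by one of two delivery methods: orange juice (OJ) or ascorbic acid (VC). This gives 10 animals for each combination of dose and delivery method.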

The format is a data frame with 60 observations on 3 variables:

  1. len (numeric): Tooth length.
  2. supp (factor): Supplement type (VC or OJ).
  3. dose (numeric): Dose in milligrams.

Now, let's load the ToothGrowth data into a data.table (this requires the data.table package):

library(data.table)
toothGrowthDF <- data.table(ToothGrowth)
head(toothGrowthDF)
##     len supp dose
## 1:  4.2   VC  0.5
## 2: 11.5   VC  0.5
## 3:  7.3   VC  0.5
## 4:  5.8   VC  0.5
## 5:  6.4   VC  0.5
## 6: 10.0   VC  0.5

Basic Exploratory Data Analysis

Review the internal structure of the ToothGrowth dataset:

str(toothGrowthDF)
## Classes 'data.table' and 'data.frame':   60 obs. of  3 variables:
##  $ len : num  4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
##  $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
##  $ dose: num  0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...
##  - attr(*, ".internal.selfref")=<externalptr>

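The numeric variable dose is an experimental factor with three discrete levels (0.5, 1 and 2 milligrams), so we convert it to a factor variable. Note that the conversion below is applied to the original ToothGrowth data frame, which is used for the frequency table and the plots; toothGrowthDF keeps dose as a numeric column.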

ToothGrowth$dose <- factor(ToothGrowth$dose)
table(ToothGrowth$dose)
## 
## 0.5   1   2 
##  20  20  20

Review the summary of the ToothGrowth dataset:

summary(toothGrowthDF)
##       len        supp         dose      
##  Min.   : 4.20   OJ:30   Min.   :0.500  
##  1st Qu.:13.07   VC:30   1st Qu.:0.500  
##  Median :19.25           Median :1.000  
##  Mean   :18.81           Mean   :1.167  
##  3rd Qu.:25.27           3rd Qu.:2.000  
##  Max.   :33.90           Max.   :2.000

Build a frequency table to verify the number of observations for each supplement and dose:

toothGrowthFreqTable <- table(ToothGrowth$supp, ToothGrowth$dose)
toothGrowthFreqTable
##     
##      0.5  1  2
##   OJ  10 10 10
##   VC  10 10 10

Plot tooth length versus supplement (VC or OJ):

plot(len ~ supp, data = ToothGrowth)

Plot tooth length versus dosage:

plot(len ~ dose, data = ToothGrowth)

From the two plots above, we notice that tooth length increases as the dosage increases. However, we cannot yet determine which of the two supplements (VC or OJ) contributes more to tooth growth.
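
The visual impression from the plots can be backed by group means. The following is a minimal sketch, assuming only the built-in `datasets::ToothGrowth` data frame and base R's `aggregate`; the name `groupMeans` is introduced here for illustration and is not part of the original analysis:

```r
# Mean tooth length for every supplement/dose combination.
# groupMeans is a hypothetical name used only in this sketch.
groupMeans <- aggregate(len ~ supp + dose,
                        data = datasets::ToothGrowth,
                        FUN = mean)
groupMeans
```

The result has one row per supplement/dose combination, which makes it easy to compare OJ and VC at the same dosage.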

Confidence Interval and Null Hypothesis Tests

In the subsequent steps we use confidence intervals and null hypothesis tests to compare tooth growth by supplement and dosage.

Comparative Analysis of Dosages

Null Hypothesis: Increasing the dosage of VC or OJ DOES NOT increase the length of the teeth.

Let's begin by creating three vectors of data, one for each dosage level (0.5 mg, 1 mg and 2 mg):

subset0.5mg <- subset(toothGrowthDF,dose=='0.5')$len
subset1.0mg <- subset(toothGrowthDF,dose=='1')$len
subset2.0mg <- subset(toothGrowthDF,dose=='2')$len

To test this null hypothesis, we compare adjacent dosage levels in increasing order, using two-sample t-tests to see whether the mean tooth length differs between the two dosage groups.

Case 1: Increase the dosage from 0.5mg to 1.0mg:

Perform the t-test:

tTest0.5mgTo1.0mg <- t.test(subset0.5mg, subset1.0mg, paired = FALSE, var.equal = FALSE)

Get the 95% confidence interval for the difference in means (0.5 mg group minus 1.0 mg group):

tTest0.5mgTo1.0mg$conf.int[1:2]
## [1] -11.983781  -6.276219

When we increase the dosage from 0.5 mg to 1.0 mg, the confidence interval does not contain zero (0). It is entirely negative, so the mean tooth length at 0.5 mg is lower than at 1.0 mg.

Case 2: Increase the dosage from 1.0mg to 2.0mg:

Perform the t-test:

tTest1.0mgTo2.0mg <- t.test(subset1.0mg, subset2.0mg, paired = FALSE, var.equal = FALSE)

Get the 95% confidence interval for the difference in means (1.0 mg group minus 2.0 mg group):

tTest1.0mgTo2.0mg$conf.int[1:2]
## [1] -8.996481 -3.733519

When we increase the dosage from 1.0 mg to 2.0 mg, the confidence interval does not contain zero (0). It is entirely negative, so the mean tooth length at 1.0 mg is lower than at 2.0 mg.

Conclusion from Case 1 and Case 2:

In both cases the confidence interval does not contain zero (0), so we reject the null hypothesis and conclude that increasing the dose DOES increase the length of the teeth.
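
The decision rule used in both cases can be written as a small check. This is a sketch only: `excludesZero` is a hypothetical helper that does not appear in the original report, and the data are re-read from `datasets::ToothGrowth` so that the block runs on its own:

```r
# Hypothetical helper: TRUE when a confidence interval lies entirely
# on one side of zero, i.e. the two-sided test rejects equal means.
excludesZero <- function(confInt) {
  confInt[1] > 0 || confInt[2] < 0
}

tg <- datasets::ToothGrowth
len0.5 <- tg$len[tg$dose == 0.5]
len1.0 <- tg$len[tg$dose == 1]
len2.0 <- tg$len[tg$dose == 2]

# Welch two-sample t-tests, as in Case 1 and Case 2.
excludesZero(t.test(len0.5, len1.0, paired = FALSE, var.equal = FALSE)$conf.int)
excludesZero(t.test(len1.0, len2.0, paired = FALSE, var.equal = FALSE)$conf.int)
```

At the default 95% confidence level, an interval that excludes zero corresponds to a two-sided p-value below 0.05, so this check and the p-value lead to the same decision.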

Comparative Analysis of Supplements

Null Hypothesis: The supplement type alone (OJ or VC) DOES NOT affect the length of the teeth.

Let's begin by creating two vectors of data, one for OJ and the other for VC:

# Each vector name now matches the supplement it holds
subsetOJ <- subset(toothGrowthDF, supp == 'OJ')$len
subsetVC <- subset(toothGrowthDF, supp == 'VC')$len

Compare the two supplements:

Perform the t-test:

tTestSupplements <- t.test(subsetOJ, subsetVC, paired = FALSE, var.equal = FALSE)

Get the p-value and the 95% confidence interval for the difference in means (OJ minus VC):

tTestSupplements$p.value
## [1] 0.06063451
tTestSupplements$conf.int[1:2]
## [1] -0.1710156  7.5710156

Conclusion

We observe that the p-value is 0.06 (above 0.05) and the confidence interval contains 0, and therefore we DO NOT REJECT the null hypothesis: the data do not show that the supplement type alone affects the length of the teeth.

Comparative Analysis of the two supplements at each of the three dosages

Null Hypothesis: At the 0.5 mg dosage, the supplement type (VC or OJ) DOES NOT affect tooth growth.

Let's begin by creating two vectors of data, one for VC and the other for OJ, at the 0.5 mg dosage:

subsetVC0.5mg <- subset(toothGrowthDF, supp == 'VC' & dose == '0.5')$len
subsetOJ0.5mg <- subset(toothGrowthDF, supp == 'OJ' & dose == '0.5')$len

Compare the two supplements at the 0.5 mg dosage:

Perform the t-test:

tTestSupplements0.5mg <- t.test(subsetVC0.5mg, subsetOJ0.5mg, paired = FALSE, var.equal = FALSE)

Get the p-value and the 95% confidence interval for the difference in means (VC minus OJ):

tTestSupplements0.5mg$p.value
## [1] 0.006358607
tTestSupplements0.5mg$conf.int[1:2]
## [1] -8.780943 -1.719057
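
Conclusion

We observe that the confidence interval does not contain 0 and therefore we REJECT the null hypothesis. The interval for the VC mean minus the OJ mean is entirely negative, so OJ is associated with greater tooth length at this dosage.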

Null Hypothesis: At the 1.0 mg dosage, the supplement type (VC or OJ) DOES NOT affect tooth growth.

Let's begin by creating two vectors of data, one for VC and the other for OJ, at the 1.0 mg dosage:

subsetVC1.0mg <- subset(toothGrowthDF, supp == 'VC' & dose == '1')$len
subsetOJ1.0mg <- subset(toothGrowthDF, supp == 'OJ' & dose == '1')$len

Compare the two supplements at the 1.0 mg dosage:

Perform the t-test:

tTestSupplements1.0mg <- t.test(subsetVC1.0mg, subsetOJ1.0mg, paired = FALSE, var.equal = FALSE)

Get the p-value and the 95% confidence interval for the difference in means (VC minus OJ):

tTestSupplements1.0mg$p.value
## [1] 0.001038376
tTestSupplements1.0mg$conf.int[1:2]
## [1] -9.057852 -2.802148
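
Conclusion

We observe that the confidence interval does not contain 0 and therefore we REJECT the null hypothesis. The interval for the VC mean minus the OJ mean is entirely negative, so OJ is associated with greater tooth length at this dosage.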

Null Hypothesis: At the 2.0 mg dosage, the supplement type (VC or OJ) DOES NOT affect tooth growth.

Let's begin by creating two vectors of data, one for VC and the other for OJ, at the 2.0 mg dosage:

subsetVC2.0mg <- subset(toothGrowthDF, supp == 'VC' & dose == '2')$len
subsetOJ2.0mg <- subset(toothGrowthDF, supp == 'OJ' & dose == '2')$len

Compare the two supplements at the 2.0 mg dosage:

Perform the t-test:

tTestSupplements2.0mg <- t.test(subsetVC2.0mg, subsetOJ2.0mg, paired = FALSE, var.equal = FALSE)

Get the p-value and the 95% confidence interval for the difference in means (VC minus OJ):

tTestSupplements2.0mg$p.value
## [1] 0.9638516
tTestSupplements2.0mg$conf.int[1:2]
## [1] -3.63807  3.79807

Conclusion

We observe that the p-value is close to 1 and the confidence interval contains 0, and therefore we FAIL TO REJECT the null hypothesis: at the 2.0 mg dosage there is no evidence that the supplement type affects tooth growth, and we cannot conclude which of the two supplement types has a greater impact.
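
The three per-dosage comparisons above repeat the same steps, so they can also be run in a single pass. The following is a sketch under the same settings (Welch test, unpaired); `perDose` and the column labels are names chosen here for illustration, and the data are taken from `datasets::ToothGrowth` so that the block is self-contained:

```r
tg <- datasets::ToothGrowth

# For each dose, run a Welch t-test of VC versus OJ and keep the
# p-value and the 95% confidence interval for mean(VC) - mean(OJ).
perDose <- sapply(c(0.5, 1, 2), function(d) {
  vc <- tg$len[tg$supp == "VC" & tg$dose == d]
  oj <- tg$len[tg$supp == "OJ" & tg$dose == d]
  tt <- t.test(vc, oj, paired = FALSE, var.equal = FALSE)
  c(dose = d,
    p.value = tt$p.value,
    lower = tt$conf.int[1],
    upper = tt$conf.int[2])
})

# One row per dose, one column per statistic.
t(perDose)
```

Each row should reproduce the p-value and interval reported for the corresponding dosage above.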

Final Conclusions

  1. Increasing the dosage of Vitamin C increases tooth growth.
  2. The supplement type alone (OJ versus VC, pooled over all dosages) shows no statistically significant effect on tooth growth at the 5% level.
  3. At a dosage of 0.5 mg or 1.0 mg, the Orange Juice (OJ) supplement yields greater tooth growth than Ascorbic Acid (VC).
  4. At a dosage of 2.0 mg, there is no significant difference in tooth growth between Orange Juice (OJ) and Ascorbic Acid (VC), and we cannot conclude which of these two supplements has a greater effect on tooth growth.

Assumptions

  1. We are working with Independent and Identically Distributed (IID) samples.
  2. The observations in the groups being compared are not paired, and therefore we do not use a paired test.
  3. The variances are not assumed to be equal, and therefore we do not use a pooled variance estimate; instead the Welch (or Satterthwaite) approximation of the degrees of freedom is used.
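
To make the third assumption concrete, the Welch (Satterthwaite) degrees of freedom can be computed by hand and compared with what `t.test` reports. This is an illustrative sketch for the 0.5 mg versus 1.0 mg comparison; `x`, `y`, `vx`, `vy` and `welchDf` are names introduced here and do not appear in the analysis above:

```r
tg <- datasets::ToothGrowth
x <- tg$len[tg$dose == 0.5]
y <- tg$len[tg$dose == 1]

# Estimated variance of each sample mean.
vx <- var(x) / length(x)
vy <- var(y) / length(y)

# Welch-Satterthwaite approximation of the degrees of freedom.
welchDf <- (vx + vy)^2 /
  (vx^2 / (length(x) - 1) + vy^2 / (length(y) - 1))

welchDf
# Degrees of freedom reported by t.test with var.equal = FALSE.
unname(t.test(x, y, paired = FALSE, var.equal = FALSE)$parameter)
```

The two values should agree, which shows that `var.equal = FALSE` selects this approximation and not the pooled-variance test with n1 + n2 - 2 degrees of freedom.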