Now in the second portion of the class, we’re going to analyze the ToothGrowth data in the R datasets package.

data(ToothGrowth)

1. Load the ToothGrowth data and perform some basic exploratory data analyses

The ToothGrowth data records the length of odontoblasts (the cells responsible for tooth growth) in 60 guinea pigs. Each animal received one of three dose levels of Vitamin C (0.5, 1, or 2 mg/day) by one of two delivery methods: orange juice (OJ) or ascorbic acid (VC). This gives 10 animals in each dose-and-method group.

[Figure: two plots from chunk unnamed-chunk-2 (images not preserved)]

From the graphs, higher doses appear to be associated with greater length. OJ also appears to be associated with greater length than VC at the low (0.5 mg) and medium (1 mg) doses.
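The original plotting code is not shown, so the following is only a sketch of one way to look at the same relationships: a grouped boxplot plus a table of group means. The variable name group_means and the axis labels are illustrative assumptions, not part of the original analysis.

```r
data(ToothGrowth)

# Boxplot of tooth length for each combination of delivery method and dose
boxplot(len ~ supp + dose, data = ToothGrowth,
        xlab = "Delivery method and dose (mg/day)",
        ylab = "Tooth length")

# Mean tooth length in each of the six supp-by-dose groups
group_means <- aggregate(len ~ supp + dose, data = ToothGrowth, FUN = mean)
print(group_means)
```

If the pattern described above holds, the OJ means should exceed the VC means at doses 0.5 and 1, and the two should be close at dose 2.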

2. Provide a basic summary of the data

summary(ToothGrowth); str(ToothGrowth)

There are 60 observations of 3 variables in this dataset. ‘len’, the tooth length, is a continuous numeric variable ranging from 4.2 to 33.9. ‘supp’, the supplement type, is a categorical variable (a factor) with two levels: OJ or VC. ‘dose’, the dose in mg/day, is a numeric variable that takes only three values: 0.5, 1, and 2.
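The figures quoted above can be checked directly. This is a small sketch using only base R; the name cell_counts is an illustrative choice.

```r
data(ToothGrowth)

dim(ToothGrowth)             # number of rows and columns
range(ToothGrowth$len)       # smallest and largest tooth length
levels(ToothGrowth$supp)     # levels of the supplement factor
sort(unique(ToothGrowth$dose))  # distinct dose values

# Number of animals in each supp-by-dose cell
cell_counts <- with(ToothGrowth, table(supp, dose))
print(cell_counts)
```

The cross-tabulation is a quick way to confirm that the design is balanced before comparing groups.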

3. Use confidence intervals and hypothesis tests to compare tooth growth by supp and dose. (Use the techniques from class even if there are other approaches worth considering.)

We will use two-sample t tests to determine whether the means of two groups (e.g. OJ and VC) are equal. The null hypothesis is that the two means are equal, and the alternative is that they are not. Based on the graphs from section 1 above, the variances of the two delivery methods do not appear to be equal, so we use the unequal-variance (Welch) form of the test (var.equal=FALSE).
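The impression of unequal spread comes from the graphs. As a rough numerical complement, the group standard deviations can be tabulated. This is only a sketch, and the name group_sds is an illustrative assumption.

```r
data(ToothGrowth)

# Standard deviation of tooth length in each supp-by-dose group
group_sds <- aggregate(len ~ supp + dose, data = ToothGrowth, FUN = sd)
names(group_sds)[names(group_sds) == "len"] <- "sd_len"
print(group_sds)
```

With only 10 animals per group these estimates are noisy, which is one reason to prefer the Welch test rather than rely on an equal-variance assumption.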

# Welch two-sample t tests of OJ vs. VC within each dose (unpaired is the default)
t.test(len ~ supp, data = ToothGrowth, subset = dose == 0.5, var.equal = FALSE)
t.test(len ~ supp, data = ToothGrowth, subset = dose == 1, var.equal = FALSE)
t.test(len ~ supp, data = ToothGrowth, subset = dose == 2, var.equal = FALSE)

At dose 0.5 mg, the p-value for the comparison of delivery methods (OJ vs. VC) is 0.006359 and the 95% confidence interval for the difference in means (OJ minus VC) is 1.719 to 8.781. Because the p-value is below 0.05 and the interval excludes zero, we reject the null hypothesis that the two means are equal in favor of the alternative that the two delivery methods differ.

At dose 1 mg, the p-value for the comparison of delivery methods (OJ vs. VC) is 0.001038 and the 95% confidence interval for the difference in means is 2.802 to 9.058. Again we reject the null hypothesis that the two means are equal in favor of the alternative that the two delivery methods differ.

At dose 2 mg, the p-value for the comparison of delivery methods (OJ vs. VC) is 0.9639 and the 95% confidence interval for the difference in means is -3.798 to 3.638. The interval contains zero, so we cannot reject the null hypothesis that the two means are equal.
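The three comparisons above can also be collected into a single table, which makes the p-values and intervals easier to compare side by side. This is a sketch rather than the original code; the names doses and by_dose are illustrative.

```r
data(ToothGrowth)

doses <- sort(unique(ToothGrowth$dose))

# Run the Welch t test of OJ vs. VC at each dose and keep the key numbers
by_dose <- do.call(rbind, lapply(doses, function(d) {
  tt <- t.test(len ~ supp,
               data = ToothGrowth[ToothGrowth$dose == d, ],
               var.equal = FALSE)
  data.frame(dose    = d,
             p_value = tt$p.value,
             ci_low  = tt$conf.int[1],   # lower bound for mean(OJ) - mean(VC)
             ci_high = tt$conf.int[2])   # upper bound for mean(OJ) - mean(VC)
}))

print(by_dose, digits = 4)
```

Because OJ is the first level of supp, each interval is for the OJ mean minus the VC mean.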

# OJ: dose 0.5 vs. 1, dose 1 vs. 2, and dose 0.5 vs. 2
t.test(len ~ dose, data = toj12, var.equal = FALSE)
t.test(len ~ dose, data = toj23, var.equal = FALSE)
t.test(len ~ dose, data = toj13, var.equal = FALSE)

# VC: the same three dose pairs
t.test(len ~ dose, data = tvc12, var.equal = FALSE)
t.test(len ~ dose, data = tvc23, var.equal = FALSE)
t.test(len ~ dose, data = tvc13, var.equal = FALSE)
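The data frames toj12, toj23, toj13, tvc12, tvc23, and tvc13 are not defined anywhere in the text. Judging from their names and from the results reported for them, they are presumably subsets holding one delivery method and two dose levels each. The construction below is an assumption along those lines, not the original code; the helper names oj and vc are also illustrative.

```r
data(ToothGrowth)

# Split by delivery method
oj <- subset(ToothGrowth, supp == "OJ")
vc <- subset(ToothGrowth, supp == "VC")

# Assumed meaning of the suffixes: 1 = 0.5 mg, 2 = 1 mg, 3 = 2 mg
toj12 <- subset(oj, dose %in% c(0.5, 1))
toj23 <- subset(oj, dose %in% c(1, 2))
toj13 <- subset(oj, dose %in% c(0.5, 2))

tvc12 <- subset(vc, dose %in% c(0.5, 1))
tvc23 <- subset(vc, dose %in% c(1, 2))
tvc13 <- subset(vc, dose %in% c(0.5, 2))

# Example: Welch t test for OJ, dose 0.5 vs. dose 1
t.test(len ~ dose, data = toj12, var.equal = FALSE)
```

Each subset has 20 rows (10 animals per dose), so dose has exactly two distinct values and can serve as the grouping variable in the formula.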

With delivery method OJ, the p-value for dose 0.5 mg vs. dose 1 mg is 8.785e-05 (highly significant), the p-value for dose 1 mg vs. dose 2 mg is 0.0392 (significant at the 0.05 level), and the p-value for dose 0.5 mg vs. dose 2 mg is 1.324e-06 (highly significant).

With delivery method VC, the p-value for dose 0.5 mg vs. dose 1 mg is 6.811e-07, the p-value for dose 1 mg vs. dose 2 mg is 9.156e-05, and the p-value for dose 0.5 mg vs. dose 2 mg is 4.682e-08. All three are highly significant.

Applying the Bonferroni correction for the three t tests within each delivery method, we divide 0.05 by 3 and use 0.0167 as our cutoff value. The only result this new cutoff changes is the comparison between dose 1 mg and dose 2 mg for delivery method OJ (p = 0.0392, which is above 0.0167). This means that, after correction, we cannot reject the null hypothesis that the mean for 1 mg and the mean for 2 mg are equal.
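An equivalent way to apply the correction is to multiply each p-value by the number of tests and compare against 0.05, which is what p.adjust does with method = "bonferroni". The sketch below uses the p-values reported above, treating the three tests per delivery method as one family; the vector names are illustrative.

```r
# p-values reported in the text, three dose comparisons per delivery method
p_oj <- c("0.5 vs 1" = 8.785e-05, "1 vs 2" = 0.0392,    "0.5 vs 2" = 1.324e-06)
p_vc <- c("0.5 vs 1" = 6.811e-07, "1 vs 2" = 9.156e-05, "0.5 vs 2" = 4.682e-08)

alpha <- 0.05

# Bonferroni: each p-value is multiplied by the number of tests (capped at 1)
adj_oj <- p.adjust(p_oj, method = "bonferroni")
adj_vc <- p.adjust(p_vc, method = "bonferroni")

# TRUE where the comparison remains significant after correction
print(adj_oj < alpha)
print(adj_vc < alpha)
```

Comparing adjusted p-values with 0.05 gives the same decisions as comparing the raw p-values with 0.05 / 3.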

4. State your conclusions and the assumptions needed for your conclusions

The supplement type (i.e. orange juice vs. ascorbic acid) has a statistically significant effect on odontoblast length at the 0.5 mg and 1 mg Vitamin C doses, but not at the 2 mg dose. That is, Vitamin C from orange juice is more effective than ascorbic acid at producing greater odontoblast length at the 0.5 mg and 1 mg doses.

For the most part, the dose of Vitamin C (0.5 mg, 1 mg, or 2 mg) has a statistically significant effect on odontoblast length. For the ascorbic acid delivery method, a higher dose results in greater odontoblast length; that is, the 2 mg dose produces greater length than either lower dose. For the orange juice delivery method, the benefit of a higher dose is clear only up to 1 mg: 1 mg is better than 0.5 mg, but the difference between 2 mg and 1 mg is not significant after the Bonferroni correction.

Depending on the cost difference between OJ and VC, the guinea pigs can be given either OJ at the 1 mg dose or VC at the 2 mg dose to obtain the greatest odontoblast length.

The assumptions under which these conclusions are drawn are: 1. The data are i.i.d., that is, a random draw from a population. 2. The guinea pigs were randomly assigned to dose and delivery-method groups. 3. There are no other confounding factors affecting odontoblast length.