Basic Inferential Data Analysis

In Part 2 of this paper, I analyze the ToothGrowth data from the R datasets package. The ToothGrowth dataset records the effect of vitamin C on tooth growth in guinea pigs.

According to the ToothGrowth Documentation:

  1. The response is the length of odontoblasts (cells responsible for tooth growth) in 60 guinea pigs.
  2. Each animal received one of three dose levels of vitamin C (0.5, 1, and 2 mg/day) by one of two delivery methods: orange juice (coded as OJ) or ascorbic acid (a form of vitamin C, coded as VC).

Here, I will load the ToothGrowth data and perform some basic exploratory data analyses.

data("ToothGrowth")
dim(ToothGrowth)
## [1] 60  3
str(ToothGrowth)
## 'data.frame':    60 obs. of  3 variables:
##  $ len : num  4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
##  $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
##  $ dose: num  0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...

The dim() function allows me to see how many rows and columns the ToothGrowth dataset has. There are 60 rows and 3 columns of data in this dataset. The str() function allows me to see that the dataset consists of three fields, len (num), supp (Factor w/2 levels “OJ”, “VC”), and dose (num).
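
As a quick complement to dim() and str(), the sketch below cross-tabulates the two grouping variables to show how the 60 animals are spread across supplement types and dose levels. The variable name counts is only illustrative and is not part of the original analysis.

```r
data("ToothGrowth")

# Count the animals in each supplement-by-dose cell
counts <- with(ToothGrowth, table(supp, dose))
print(counts)
```

Each of the six cells should contain 10 animals, which means the later per-dose comparisons are made on groups of equal size.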

Overview

Below is a basic summary of the data.

summary(ToothGrowth)
##       len        supp         dose      
##  Min.   : 4.20   OJ:30   Min.   :0.500  
##  1st Qu.:13.07   VC:30   1st Qu.:0.500  
##  Median :19.25           Median :1.000  
##  Mean   :18.81           Mean   :1.167  
##  3rd Qu.:25.27           3rd Qu.:2.000  
##  Max.   :33.90           Max.   :2.000
head(ToothGrowth)
##    len supp dose
## 1  4.2   VC  0.5
## 2 11.5   VC  0.5
## 3  7.3   VC  0.5
## 4  5.8   VC  0.5
## 5  6.4   VC  0.5
## 6 10.0   VC  0.5

Here is a plot of tooth length versus dose, conditioned on the type of supplement given to each guinea pig. This plot is based on an example from the ToothGrowth documentation in the R manual.

require(graphics)
coplot(len ~ dose | supp, data = ToothGrowth, panel = panel.smooth,
       xlab = "ToothGrowth data: length vs dose, given type of supplement")
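
The trend in the plot can also be read as numbers. The following is a minimal sketch that uses aggregate() to compute the mean tooth length for each combination of supplement and dose; the name group_means is an assumption made for illustration.

```r
data("ToothGrowth")

# Mean tooth length for every supplement/dose combination
group_means <- aggregate(len ~ supp + dose, data = ToothGrowth, FUN = mean)
print(group_means)
```

These six means are the same group means reported by the per-dose t-tests in the Hypothesis Testing section.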

Hypothesis Testing

I have decided to use t-tests to compare tooth growth by supp and dose, using the R function t.test. I will look at tooth length by supplement first. Recall that the supp field is a factor with two levels, “OJ” and “VC”.

t.test(len ~ supp, data = ToothGrowth)
## 
##  Welch Two Sample t-test
## 
## data:  len by supp
## t = 1.9153, df = 55.309, p-value = 0.06063
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.1710156  7.5710156
## sample estimates:
## mean in group OJ mean in group VC 
##         20.66333         16.96333
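
To make the output above less of a black box, the Welch statistic, degrees of freedom, and two-sided p-value can be reproduced by hand. This is only a sketch of the standard Welch formulas; the variable names (oj, vc, se2_oj, se2_vc, t_stat, welch_df, p_value) are chosen here for illustration.

```r
data("ToothGrowth")

# Split tooth lengths by supplement type
oj <- ToothGrowth$len[ToothGrowth$supp == "OJ"]
vc <- ToothGrowth$len[ToothGrowth$supp == "VC"]

# Squared standard error of each group mean
se2_oj <- var(oj) / length(oj)
se2_vc <- var(vc) / length(vc)

# Welch t statistic: difference in means over the combined standard error
t_stat <- (mean(oj) - mean(vc)) / sqrt(se2_oj + se2_vc)

# Welch-Satterthwaite approximation to the degrees of freedom
welch_df <- (se2_oj + se2_vc)^2 /
  (se2_oj^2 / (length(oj) - 1) + se2_vc^2 / (length(vc) - 1))

# Two-sided p-value from the t distribution
p_value <- 2 * pt(-abs(t_stat), welch_df)

# Reference result from t.test for comparison
ref <- t.test(len ~ supp, data = ToothGrowth)
print(c(t = t_stat, df = welch_df, p = p_value))
```

The three values should agree with the t, df, and p-value lines printed by t.test.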

Since the p-value (0.06063) is greater than 0.05 and the 95 percent confidence interval, [-0.1710156, 7.5710156], contains zero, we fail to reject the null hypothesis that supplement type has no effect on tooth length. For further analysis, I will now compare the two supplements at each dose level. Based on the ToothGrowth documentation, I know that each guinea pig received one of three dose levels of vitamin C (0.5, 1, and 2 mg/day). So, I will run three separate t-tests, one per dose, below.

# Test using subset for 0.5 mg/day of Vitamin C
ToothGrowth.dose0.5 <- subset(ToothGrowth, dose == 0.5)
t.test(len ~ supp, data = ToothGrowth.dose0.5)
## 
##  Welch Two Sample t-test
## 
## data:  len by supp
## t = 3.1697, df = 14.969, p-value = 0.006359
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  1.719057 8.780943
## sample estimates:
## mean in group OJ mean in group VC 
##            13.23             7.98
# Test using subset for 1.0 mg/day of Vitamin C
ToothGrowth.dose1.0 <- subset(ToothGrowth, dose == 1.0)
t.test(len ~ supp, data = ToothGrowth.dose1.0)
## 
##  Welch Two Sample t-test
## 
## data:  len by supp
## t = 4.0328, df = 15.358, p-value = 0.001038
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  2.802148 9.057852
## sample estimates:
## mean in group OJ mean in group VC 
##            22.70            16.77
# Test using subset for 2.0 mg/day of Vitamin C
ToothGrowth.dose2.0 <- subset(ToothGrowth, dose == 2.0)
t.test(len ~ supp, data = ToothGrowth.dose2.0)
## 
##  Welch Two Sample t-test
## 
## data:  len by supp
## t = -0.046136, df = 14.04, p-value = 0.9639
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -3.79807  3.63807
## sample estimates:
## mean in group OJ mean in group VC 
##            26.06            26.14
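
The three per-dose tests above can also be collected into one table, which makes the pattern across doses easier to compare. The following is a minimal sketch, not a replacement for the individual tests; the names doses and by_dose and the column labels are assumptions made for illustration.

```r
data("ToothGrowth")

# The three dose levels present in the data
doses <- sort(unique(ToothGrowth$dose))

# Run the Welch t-test of len by supp within each dose and keep key results
by_dose <- as.data.frame(t(sapply(doses, function(d) {
  res <- t.test(len ~ supp, data = subset(ToothGrowth, dose == d))
  c(dose = d,
    ci_lower = res$conf.int[1],
    ci_upper = res$conf.int[2],
    p_value = res$p.value)
})))
print(by_dose)
```

At 0.5 and 1 mg/day the confidence intervals exclude zero, while at 2 mg/day the interval contains zero and the p-value is large.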

Below, I will test dosage levels independently of supplement type to see whether dosage alone has an impact on tooth length. This portion of the analysis also requires three separate t-tests, one for each pair of dose levels.

# Test using subset comparing 0.5 mg/day of Vitamin C to 1.0 mg/day of Vitamin C
ToothGrowth.dose0.5To1.0 <- subset(ToothGrowth, dose %in% c(0.5, 1.0))
t.test(len ~ dose, data = ToothGrowth.dose0.5To1.0)
## 
##  Welch Two Sample t-test
## 
## data:  len by dose
## t = -6.4766, df = 37.986, p-value = 1.268e-07
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -11.983781  -6.276219
## sample estimates:
## mean in group 0.5   mean in group 1 
##            10.605            19.735
# Test using subset comparing 0.5 mg/day of Vitamin C to 2.0 mg/day of Vitamin C
ToothGrowth.dose0.5To2.0 <- subset(ToothGrowth, dose %in% c(0.5, 2.0))
t.test(len ~ dose, data = ToothGrowth.dose0.5To2.0)
## 
##  Welch Two Sample t-test
## 
## data:  len by dose
## t = -11.799, df = 36.883, p-value = 4.398e-14
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -18.15617 -12.83383
## sample estimates:
## mean in group 0.5   mean in group 2 
##            10.605            26.100
# Test using subset comparing 1.0 mg/day of Vitamin C to 2.0 mg/day of Vitamin C
ToothGrowth.dose1.0To2.0 <- subset(ToothGrowth, dose %in% c(1.0, 2.0))
t.test(len ~ dose, data = ToothGrowth.dose1.0To2.0)
## 
##  Welch Two Sample t-test
## 
## data:  len by dose
## t = -4.9005, df = 37.101, p-value = 1.906e-05
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -8.996481 -3.733519
## sample estimates:
## mean in group 1 mean in group 2 
##          19.735          26.100
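
Because three pairwise comparisons are made on the same data, it may be worth checking whether the results survive a multiple-comparison adjustment. The sketch below applies a Bonferroni correction with p.adjust(); the choice of Bonferroni is an assumption of this example and was not part of the original analysis, and the names dose_pairs, raw_p, and adj_p are illustrative.

```r
data("ToothGrowth")

# The three pairs of dose levels compared above
dose_pairs <- list(c(0.5, 1), c(0.5, 2), c(1, 2))

# Unadjusted Welch t-test p-value for each pair
raw_p <- sapply(dose_pairs, function(pr) {
  t.test(len ~ dose, data = subset(ToothGrowth, dose %in% pr))$p.value
})
names(raw_p) <- sapply(dose_pairs, paste, collapse = " vs ")

# Bonferroni adjustment: each p-value is multiplied by the number of tests
adj_p <- p.adjust(raw_p, method = "bonferroni")
print(adj_p)
```

All three adjusted p-values remain far below 0.05, so the dose effect does not depend on ignoring the multiple comparisons.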

Conclusions and Assumptions

Assumptions

  1. This was a random sample of 60 guinea pigs.
  2. The population variances are not assumed to be equal, so Welch's t-test (the default in t.test) is used.
  3. Each animal received one of three dose levels of vitamin C (0.5, 1, and 2 mg/day).
  4. Each animal received its vitamin C through either an orange juice supplement or an ascorbic acid supplement.
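
The variance assumption can be inspected informally by comparing the sample variances of the two supplement groups. This is only a descriptive sketch, not a formal test of equal variances; the names group_var, group_sd, and var_ratio are assumptions made for illustration.

```r
data("ToothGrowth")

# Sample variance and standard deviation of tooth length per supplement
group_var <- tapply(ToothGrowth$len, ToothGrowth$supp, var)
group_sd <- sqrt(group_var)

# Ratio of the larger to the smaller variance; values near 1 suggest similar spread
var_ratio <- max(group_var) / min(group_var)
print(group_sd)
print(var_ratio)
```

Whatever the ratio turns out to be, the Welch test used throughout remains valid, since it does not require the variances to match.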

Conclusions

  1. Tooth length increases as the vitamin C dosage level increases, regardless of the supplement type.
  2. If we consider supplement type, then OJ (orange juice) yields significantly more tooth growth than VC (ascorbic acid) at 0.5 and 1 mg/day, but there is no significant difference between the two at 2 mg/day (p-value = 0.9639).
  3. The longest teeth were observed at the highest dose (2 mg/day), where the OJ and VC supplements performed about equally (mean lengths of 26.06 and 26.14).