Introduction

The response is the length of odontoblasts (cells responsible for tooth growth) in 60 guinea pigs. Each animal received one of three dose levels of vitamin C (0.5, 1, and 2 mg/day) by one of two delivery methods: orange juice (coded as OJ) or ascorbic acid (a form of vitamin C, coded as VC).

A data frame with 60 observations on 3 variables.

[,1] len numeric Tooth length

[,2] supp factor Supplement type (VC or OJ)

[,3] dose numeric Dose in milligrams/day

Basic Data Summary

The ToothGrowth data is divided into two equal-sized subgroups by supplement type, either Orange Juice [OJ] or Vitamin C [VC] (ascorbic acid), with 30 entries each. Each of these subgroups is further partitioned by dosage into lots of 10 entries each for 0.5, 1.0 and 2.0 mg/day. We can therefore think of the original set as 6 semi-subgroups, one for each combination of supplement and dosage. The output displayed below gives representative statistics. The graph groups all of this data together with the proper partitioning. The six points identified on the graph are the mean growth lengths for each of the 6 semi-subgroups. These means are also calculated and stored in the vector meanT.

library(dplyr) ## filter() used below is dplyr::filter, not stats::filter
tg <- ToothGrowth

head(tg)
##    len supp dose
## 1  4.2   VC  0.5
## 2 11.5   VC  0.5
## 3  7.3   VC  0.5
## 4  5.8   VC  0.5
## 5  6.4   VC  0.5
## 6 10.0   VC  0.5
str(tg)
## 'data.frame':    60 obs. of  3 variables:
##  $ len : num  4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
##  $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
##  $ dose: num  0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...
summary(tg)
##       len        supp         dose      
##  Min.   : 4.20   OJ:30   Min.   :0.500  
##  1st Qu.:13.07   VC:30   1st Qu.:0.500  
##  Median :19.25           Median :1.000  
##  Mean   :18.81           Mean   :1.167  
##  3rd Qu.:25.27           3rd Qu.:2.000  
##  Max.   :33.90           Max.   :2.000
## This chunk calculates the mean tooth length for each of the 6 supplement/dose groups
meanT <- NULL
for (i in c("OJ", "VC")) {        ## SUPPLEMENTS
  for (j in c(0.5, 1.0, 2.0)) {   ## DOSAGE
    meanT <- rbind(meanT, mean(filter(tg, supp == i & dose == j)$len)) ## MEAN TOOTH LENGTH
  }
}
meanT
##       [,1]
## [1,] 13.23
## [2,] 22.70
## [3,] 26.06
## [4,]  7.98
## [5,] 16.77
## [6,] 26.14
coplot(len ~ dose | supp, data = tg, panel = panel.smooth,xlab="Dose", ylab="Growth Length",
      main = "ToothGrowth Data: Length vs Dose, given supplement type")
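
The six group means can also be obtained without an explicit loop. The following is a minimal sketch using only base R; the name group_means is illustrative and not part of the original analysis.

```r
## Sketch: mean tooth length per supplement/dose group via tapply() (base R only).
tg <- ToothGrowth  # built-in dataset from the 'datasets' package

## Rows are supplements (OJ, VC); columns are doses ("0.5", "1", "2").
group_means <- with(tg, tapply(len, list(supp = supp, dose = dose), mean))
group_means
```

Each cell of the resulting 2 x 3 table corresponds to one entry of meanT, so the two approaches can be checked against each other.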

Confidence Intervals and Hypothesis Testing Analysis

The Confidence Interval (t-interval) is defined as X̄ +/- t_(n-1)*s/sqrt(n), where X̄ is the sample mean, s is the sample standard deviation, n is the sample size, and t_(n-1) is the relevant quantile of the t distribution with n-1 degrees of freedom. The t interval assumes that the data are iid normal, though it is robust to this assumption and works well whenever the distribution of the data is roughly symmetric and mound shaped.
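
To make the formula concrete, here is a small sketch that computes a 95% t-interval by hand for all 60 tooth lengths and compares it with the interval returned by t.test. Pooling all 60 observations is only for illustration; the variable names are assumptions, not part of the original analysis.

```r
## Sketch: one-sample 95% t-interval computed by hand and via t.test().
x    <- ToothGrowth$len
n    <- length(x)
xbar <- mean(x)               # sample mean
s    <- sd(x)                 # sample standard deviation
tq   <- qt(0.975, df = n - 1) # two-sided 95% quantile of the t distribution

manual_ci  <- xbar + c(-1, 1) * tq * s / sqrt(n)
builtin_ci <- t.test(x, conf.level = 0.95)$conf.int

manual_ci
builtin_ci
```

The two intervals should agree, since t.test applies the same formula for a single sample.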

Part 1: Effects of the Supplement

We can use the R function t.test with arguments ojtg$len and vctg$len to arrive at our first result. The paired argument is FALSE since the two groups consist of different guinea pigs. This is a one-sided test since the alternative is set to “greater”. The Null Hypothesis: the mean tooth length for Orange Juice is less than or equal to that for Vitamin C (the difference OJ - VC is at most zero). The Alternative Hypothesis: the mean tooth length for Orange Juice is greater than that for Vitamin C. We test at the 5% significance level (95% confidence). As with the other R test functions, t.test returns a lot of information.

ojtg<-filter(tg,supp=="OJ") ## CREATE SUBGROUPS BY CONSTANT SUPPLEMENT
vctg<-filter(tg,supp=="VC")
## Welch two-sample t-test comparing tooth length in the OJ and VC subgroups

gsupp<-t.test(ojtg$len,vctg$len,alternative = "greater", paired = FALSE, var.equal = FALSE, conf.level = 0.95)

gsupp
## 
##  Welch Two Sample t-test
## 
## data:  ojtg$len and vctg$len
## t = 1.9153, df = 55.309, p-value = 0.03032
## alternative hypothesis: true difference in means is greater than 0
## 95 percent confidence interval:
##  0.4682687       Inf
## sample estimates:
## mean of x mean of y 
##  20.66333  16.96333
qt(.975,55)
## [1] 2.004045

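The P-value (.03) being less than alpha (.05) indicates that we should reject the Null Hypothesis. Note that qt(.975, 55) = 2.004 is the critical value for a two-sided test; for this one-sided test the relevant 5% critical value is qt(.95, 55), roughly 1.67, which the calculated t statistic (1.9153) exceeds. The one-sided 95% confidence interval for the difference (0.468 to Inf) also excludes zero. This analysis suggests that, averaged over all doses, Orange Juice is a greater contributor to tooth growth than Vitamin C.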
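
The comparison between the calculated t statistic and the tabulated critical value can be made explicit. The sketch below uses base R only and computes both the one-sided and two-sided 5% critical values at the Welch degrees of freedom; names such as t_obs and crit_one_sided are illustrative.

```r
## Sketch: compare the Welch t statistic with one- and two-sided critical values.
oj <- ToothGrowth$len[ToothGrowth$supp == "OJ"]
vc <- ToothGrowth$len[ToothGrowth$supp == "VC"]

res      <- t.test(oj, vc, alternative = "greater", var.equal = FALSE)
t_obs    <- unname(res$statistic)  # calculated t statistic
df_welch <- unname(res$parameter)  # Welch-Satterthwaite degrees of freedom

crit_one_sided <- qt(0.95,  df = df_welch) # cutoff for alternative = "greater"
crit_two_sided <- qt(0.975, df = df_welch) # cutoff for a two-sided test

c(t_obs = t_obs, one_sided = crit_one_sided, two_sided = crit_two_sided)

## The one-sided test rejects when the statistic exceeds the one-sided cutoff.
reject_h0 <- t_obs > crit_one_sided
reject_h0
```

The statistic falls between the two cutoffs, which is why the one-sided test rejects at the 5% level even though the statistic is below the two-sided value.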

Part 2: Effects of the Dosage

This time the tooth data is partitioned into three subgroups depending on the given dosage. Halftg is the set of all entries where the dosage is equal to 0.5 mg/day. Onetg and Twotg are the respective subgroups for 1.0 mg/day and 2.0 mg/day.

Again, we use the R function t.test, first with arguments Halftg$len and Onetg$len, followed by a call with Onetg$len and Twotg$len. The paired argument is FALSE since the groups consist of different guinea pigs. Both tests are one-sided since the alternative is set to “greater”. With the smaller dose given as the first argument, the Null Hypothesis is that the mean tooth length at the smaller dose is less than or equal to that at the larger dose; the Alternative Hypothesis is that the smaller dose produces the greater mean tooth length.

Halftg<-filter(tg,dose==0.5)## CREATE SUBGROUPS BY CONSTANT DOSAGE
Onetg<-filter(tg,dose==1.0)
Twotg<-filter(tg,dose==2.0)

t.test(Halftg$len, Onetg$len, alternative = "greater", paired = FALSE, var.equal = FALSE, conf.level = 0.95)
## 
##  Welch Two Sample t-test
## 
## data:  Halftg$len and Onetg$len
## t = -6.4766, df = 37.986, p-value = 1
## alternative hypothesis: true difference in means is greater than 0
## 95 percent confidence interval:
##  -11.50668       Inf
## sample estimates:
## mean of x mean of y 
##    10.605    19.735
t.test(Onetg$len, Twotg$len, alternative = "greater", paired = FALSE, var.equal = FALSE, conf.level = 0.95)
## 
##  Welch Two Sample t-test
## 
## data:  Onetg$len and Twotg$len
## t = -4.9005, df = 37.101, p-value = 1
## alternative hypothesis: true difference in means is greater than 0
## 95 percent confidence interval:
##  -8.55613      Inf
## sample estimates:
## mean of x mean of y 
##    19.735    26.100

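The P-value (1.0) being greater than alpha (.05) indicates that we fail to reject the Null Hypothesis in both tests: there is no evidence that the smaller dose produces greater tooth growth. The t statistics (-6.4766 and -4.9005) are large and negative, and the sample means increase with dose (10.605, 19.735, 26.100). The same statistics tested with alternative = “less” would give P-values far below .05. This analysis indicates that a larger dose is a greater contributor to tooth growth than a smaller dose.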
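
Because the direction of a one-sided test determines what a P-value of 1 means, it can help to run the dose comparison in both directions. The following is a sketch using base R subsetting rather than filter; the variable names are assumptions made for illustration.

```r
## Sketch: the same Welch test run with both one-sided alternatives.
half <- ToothGrowth$len[ToothGrowth$dose == 0.5]
one  <- ToothGrowth$len[ToothGrowth$dose == 1.0]
two  <- ToothGrowth$len[ToothGrowth$dose == 2.0]

## H1: smaller-dose mean is greater (the direction used in the text)
p_greater <- t.test(half, one, alternative = "greater")$p.value
## H1: smaller-dose mean is less, i.e. the larger dose gives more growth
p_less    <- t.test(half, one, alternative = "less")$p.value
p_less_12 <- t.test(one, two, alternative = "less")$p.value

c(p_greater = p_greater, p_less = p_less, p_less_12 = p_less_12)
```

The two one-sided P-values for the same pair of samples sum to 1, so a P-value near 1 in one direction corresponds to strong evidence in the opposite direction.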

Part 3: Upper Limit of Data Analysis

As we examine the original graph for each type of supplement, it is apparent that for small dosages (i.e. 0.5 mg/day and 1.0 mg/day) the effect of OJ is clearly predominant over that of VC. There appears, however, to be some uncertainty in making this determination for the 2.0 mg/day dosage. Maybe some further analysis is warranted.

Again we produce two subgroups for OJ and VC, but this time we limit them to the 2.0 mg/day entries. We want to see how close the means are to each other; in other words, how closely the difference of the means hovers around zero. This indicates we should use a two-sided test. The Null Hypothesis: the means are identical. The Alternative Hypothesis: the means differ.

OJtgTwo<-filter(tg,supp=="OJ" & dose == 2.0) ## OJ and high dose only
VCtgTwo<-filter(tg,supp=="VC" & dose == 2.0)

t.test(OJtgTwo$len, VCtgTwo$len, alternative = "two.sided", paired = FALSE, var.equal = FALSE, conf.level = 0.95)
## 
##  Welch Two Sample t-test
## 
## data:  OJtgTwo$len and VCtgTwo$len
## t = -0.046136, df = 14.04, p-value = 0.9639
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -3.79807  3.63807
## sample estimates:
## mean of x mean of y 
##     26.06     26.14

The mean values of the two groups, as we saw earlier, are very close to each other (i.e. OJtgTwo = 26.06 and VCtgTwo = 26.14). We also notice that the P-value of .964 is far greater than alpha (.05), so we fail to reject the Null Hypothesis. The confidence interval is nearly centered on zero (i.e. -3.79807 to 3.63807).
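
The same two-sided comparison can be repeated at every dose level to see where the supplements differ. This is a sketch under the assumption that a separate Welch test per dose is an acceptable exploratory step (no multiple-comparison correction is applied); p_by_dose is an illustrative name.

```r
## Sketch: two-sided Welch test of OJ vs VC within each dose level.
doses <- c(0.5, 1, 2)

p_by_dose <- sapply(doses, function(d) {
  sub <- ToothGrowth[ToothGrowth$dose == d, ]  # 20 rows: 10 OJ, 10 VC
  t.test(len ~ supp, data = sub, alternative = "two.sided")$p.value
})
names(p_by_dose) <- as.character(doses)

p_by_dose
```

The entry for dose 2 should reproduce the P-value of 0.9639 reported above.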

Conclusion

At low dosages, tooth growth tends to be greater for the guinea pigs that received Orange Juice. However, as the dosage is increased, the group that received Vitamin C seems to grow at a steeper rate, and at 2.0 mg/day the two supplements cannot be distinguished. I conclude that more testing needs to be performed before a valid conclusion can be formulated.