Project procedure

  1. Load the ToothGrowth data and perform some basic exploratory data analyses.
  2. Provide a basic summary of the data.
data("ToothGrowth")
summary(ToothGrowth)
##       len        supp         dose      
##  Min.   : 4.20   OJ:30   Min.   :0.500  
##  1st Qu.:13.07   VC:30   1st Qu.:0.500  
##  Median :19.25           Median :1.000  
##  Mean   :18.81           Mean   :1.167  
##  3rd Qu.:25.27           3rd Qu.:2.000  
##  Max.   :33.90           Max.   :2.000
head(ToothGrowth)
##    len supp dose
## 1  4.2   VC  0.5
## 2 11.5   VC  0.5
## 3  7.3   VC  0.5
## 4  5.8   VC  0.5
## 5  6.4   VC  0.5
## 6 10.0   VC  0.5
tail(ToothGrowth)
##     len supp dose
## 55 24.8   OJ    2
## 56 30.9   OJ    2
## 57 26.4   OJ    2
## 58 27.3   OJ    2
## 59 29.4   OJ    2
## 60 23.0   OJ    2
data("ToothGrowth")
library(ggplot2)
# Tooth length by supplement, coloured by dose
ggplot(data = ToothGrowth, aes(x = supp, y = len)) +
        geom_point(aes(colour = factor(dose)))

# Box plots of tooth length by supplement, one panel per dose
p <- ggplot(data = ToothGrowth, aes(x = supp, y = len)) +
        geom_boxplot(aes(colour = factor(supp)))
p + facet_grid(. ~ dose)
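
As a numeric complement to the plots, the mean tooth length in each supplement/dose group can be tabulated directly. This is a minimal sketch using base R's aggregate(); the name group_means is introduced here for illustration and is not part of the original analysis.

```r
# Mean tooth length for each supplement/dose combination (base R only)
data("ToothGrowth")
group_means <- aggregate(len ~ supp + dose, data = ToothGrowth, FUN = mean)
group_means
```

The six means in this table are the same group means reported as "sample estimates" by the per-dose t-tests further down.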

  3. Use confidence intervals and/or hypothesis tests to compare tooth growth by supp and dose. (Only use the techniques from class, even if there’s other approaches worth considering.)

Null Hypothesis: Dose has no effect on tooth growth.

Alternate Hypothesis: Tooth growth increases with dose.

does05 <- ToothGrowth[ToothGrowth$dose == 0.5,]
does1 <- ToothGrowth[ToothGrowth$dose == 1,]
does2 <- ToothGrowth[ToothGrowth$dose == 2,]

Comparing dose = 0.5 and dose = 1

t.test(does05$len, does1$len, alternative = "less")
## 
##  Welch Two Sample t-test
## 
## data:  does05$len and does1$len
## t = -6.4766, df = 37.986, p-value = 6.342e-08
## alternative hypothesis: true difference in means is less than 0
## 95 percent confidence interval:
##       -Inf -6.753323
## sample estimates:
## mean of x mean of y 
##    10.605    19.735

Because the p-value is very small, we reject H0 and conclude that mean len is greater at dose 1 than at dose 0.5.

Comparing dose = 1 and dose = 2

t.test(does1$len, does2$len, alternative = "less")
## 
##  Welch Two Sample t-test
## 
## data:  does1$len and does2$len
## t = -4.9005, df = 37.101, p-value = 9.532e-06
## alternative hypothesis: true difference in means is less than 0
## 95 percent confidence interval:
##      -Inf -4.17387
## sample estimates:
## mean of x mean of y 
##    19.735    26.100

Because the p-value is very small, we reject H0 and conclude that mean len is greater at dose 2 than at dose 1.
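
To make clear what t.test() reports in the output above, the Welch statistic, its degrees of freedom, and the one-sided p-value can be recomputed by hand. The following is only a sketch for the dose 0.5 versus dose 1 comparison; the variable names (x, y, se2_x, se2_y, t_stat, welch_df, p_value) are illustrative and do not appear in the original analysis.

```r
data("ToothGrowth")
x <- ToothGrowth$len[ToothGrowth$dose == 0.5]
y <- ToothGrowth$len[ToothGrowth$dose == 1]

# Squared standard error of each group mean
se2_x <- var(x) / length(x)
se2_y <- var(y) / length(y)

# Welch t statistic: difference in means over its standard error
t_stat <- (mean(x) - mean(y)) / sqrt(se2_x + se2_y)

# Welch-Satterthwaite approximation to the degrees of freedom
welch_df <- (se2_x + se2_y)^2 /
  (se2_x^2 / (length(x) - 1) + se2_y^2 / (length(y) - 1))

# One-sided p-value for the alternative "mean of x is less than mean of y"
p_value <- pt(t_stat, welch_df)

c(t = t_stat, df = welch_df, p = p_value)
```

These values should agree with the t, df, and p-value lines printed by t.test(does05$len, does1$len, alternative = "less").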

Comparing different supplements by dosage

Null Hypothesis: There is no difference in tooth growth between OJ and VC

Alternate Hypothesis: OJ produces greater tooth growth than VC

OJ <- ToothGrowth[ToothGrowth$supp == "OJ",]
VC <- ToothGrowth[ToothGrowth$supp == "VC",]
t.test(OJ$len, VC$len, alternative = "greater")
## 
##  Welch Two Sample t-test
## 
## data:  OJ$len and VC$len
## t = 1.9153, df = 55.309, p-value = 0.03032
## alternative hypothesis: true difference in means is greater than 0
## 95 percent confidence interval:
##  0.4682687       Inf
## sample estimates:
## mean of x mean of y 
##  20.66333  16.96333

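The p-value (0.030) is below 0.05 but well above 0.001, so with all doses pooled the evidence that OJ is better than VC is weak: the null hypothesis is rejected at the 5% level used for the other tests in this report, but not at a stricter 0.1% level.

It is therefore more informative to compare OJ and VC separately at each dosage.

At a dosage of 0.5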

OJ05 <- OJ[OJ$dose == 0.5,]
VC05 <- VC[VC$dose == 0.5,]
t.test(OJ05$len, VC05$len, alternative = "greater")
## 
##  Welch Two Sample t-test
## 
## data:  OJ05$len and VC05$len
## t = 3.1697, df = 14.969, p-value = 0.003179
## alternative hypothesis: true difference in means is greater than 0
## 95 percent confidence interval:
##  2.34604     Inf
## sample estimates:
## mean of x mean of y 
##     13.23      7.98

Since the p-value is smaller than 0.05, we reject the null hypothesis.

At a dosage of 1

OJ1 <- OJ[OJ$dose == 1,]
VC1 <- VC[VC$dose == 1,]
t.test(OJ1$len, VC1$len, alternative = "greater")
## 
##  Welch Two Sample t-test
## 
## data:  OJ1$len and VC1$len
## t = 4.0328, df = 15.358, p-value = 0.0005192
## alternative hypothesis: true difference in means is greater than 0
## 95 percent confidence interval:
##  3.356158      Inf
## sample estimates:
## mean of x mean of y 
##     22.70     16.77

Since the p-value is smaller than 0.05, we reject the null hypothesis.

At a dosage of 2

OJ2 <- OJ[OJ$dose == 2,]
VC2 <- VC[VC$dose == 2,]
t.test(OJ2$len, VC2$len, alternative = "greater")
## 
##  Welch Two Sample t-test
## 
## data:  OJ2$len and VC2$len
## t = -0.046136, df = 14.04, p-value = 0.5181
## alternative hypothesis: true difference in means is greater than 0
## 95 percent confidence interval:
##  -3.1335     Inf
## sample estimates:
## mean of x mean of y 
##     26.06     26.14

The p-value is about 0.52, so we cannot reject H0. At this dosage there is no significant evidence that OJ is better than VC.

  4. State your conclusions and the assumptions needed for your conclusions.

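At dosages of 0.5 mg and 1 mg, OJ promotes significantly more tooth growth than VC. At 2 mg, there is no clear improvement.

These conclusions rest on the assumptions of the Welch two-sample t-test as used above: the observations are independent, the groups are unpaired, tooth length is approximately normally distributed within each group, and the group variances are not assumed to be equal.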
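
The three per-dose supplement comparisons can also be run in one pass, which makes the pattern behind the conclusion easy to check. This is a sketch only; the names doses and p_values are introduced here for illustration, and the 0.05 threshold is the one used in the per-dose tests above.

```r
data("ToothGrowth")
doses <- sort(unique(ToothGrowth$dose))

# One-sided Welch t-test (OJ greater than VC) at each dose; keep the p-value
p_values <- sapply(doses, function(d) {
  at_dose <- ToothGrowth[ToothGrowth$dose == d, ]
  t.test(at_dose$len[at_dose$supp == "OJ"],
         at_dose$len[at_dose$supp == "VC"],
         alternative = "greater")$p.value
})
names(p_values) <- doses

# TRUE where H0 is rejected at the 5% level
p_values < 0.05
```

Running three tests on the same data raises the chance of at least one false rejection, so the individual p-values are best read together rather than in isolation.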