Exploratory data analyses

The dataset records the effect of vitamin C on tooth growth in guinea pigs. The response is the length of odontoblasts (cells responsible for tooth growth) in 60 guinea pigs. Each animal received one of three dose levels of vitamin C (0.5, 1, and 2 mg/day) by one of two delivery methods, orange juice or ascorbic acid.

First I load the data, give the columns and the supplement levels clearer names, convert the dose to a factor, and finally summarize the basic properties of the data frame.

data(ToothGrowth)
names(ToothGrowth)<-c("length", "supplement", "dose")
levels(ToothGrowth$supplement)<-c("orange juice", "ascorbic acid")
ToothGrowth$dose<-as.factor(ToothGrowth$dose)
summary(ToothGrowth)
##      length              supplement  dose   
##  Min.   : 4.20   orange juice :30   0.5:20  
##  1st Qu.:13.07   ascorbic acid:30   1  :20  
##  Median :19.25                      2  :20  
##  Mean   :18.81                              
##  3rd Qu.:25.27                              
##  Max.   :33.90

I prepare a figure to show the effect of both the delivery method (orange juice or ascorbic acid) and the vitamin C dose level. In this figure each box is based on 10 data points.

library(ggplot2)

ggplot(data=ToothGrowth, aes(x=dose, y=length, fill=supplement)) +
  geom_boxplot() +
  ggtitle("Tooth Growth for the Effect of Vitamin C") +
  xlab("dose levels (mg/day)") +
  ylab("length of odontoblasts")
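
The claim that every box rests on 10 observations can be checked with a cross-tabulation. The sketch below is not part of the original analysis; it works on a fresh copy of the built-in data (the name `tg` is an assumption chosen for illustration), so it uses the original column names `len`, `supp` and `dose` rather than the renamed ones.

```r
# Pristine copy of the built-in data (original names: len, supp, dose)
tg <- datasets::ToothGrowth

# Cross-tabulate delivery method (OJ = orange juice, VC = ascorbic acid) by dose
group_counts <- table(supplement = tg$supp, dose = tg$dose)
print(group_counts)
```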

Summary of the ToothGrowth data

The dataset consists of 60 rows and 3 variables. The exploratory analysis indicates that tooth length increases with the vitamin C dose. At dose levels of 0.5 and 1 mg/day a difference between the delivery methods can also be observed: orange juice seems more effective than ascorbic acid. At 2 mg/day the odontoblasts are, averaged over both delivery methods, around 2.5 times longer than at 0.5 mg/day; judging by the group means reported in the tests below, the ratio is about 2.0 for orange juice and about 3.3 for ascorbic acid.
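
The length ratio between the highest and lowest dose can be computed from the group means. This is a minimal sketch, not part of the original analysis; `tg`, `group_means` and the ratio variables are names assumed for illustration, and the code uses the original column names of the built-in dataset.

```r
# Pristine copy of the built-in data (original names: len, supp, dose)
tg <- datasets::ToothGrowth

# Mean length for every supplement/dose combination (2 x 3 matrix)
group_means <- tapply(tg$len, list(supplement = tg$supp, dose = tg$dose), mean)
print(round(group_means, 2))

# Ratio of mean length at 2 mg/day to mean length at 0.5 mg/day, per supplement
ratio_by_supp <- group_means[, "2"] / group_means[, "0.5"]

# The same ratio with both delivery methods pooled
dose_means <- tapply(tg$len, tg$dose, mean)
ratio_pooled <- unname(dose_means["2"] / dose_means["0.5"])

print(round(c(ratio_by_supp, pooled = ratio_pooled), 2))
```

The pooled ratio is close to 2.5, but it hides a clear difference between the two delivery methods.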

Tooth growth by supp and dose

Statistical hypothesis testing can address three questions: (1) whether the vitamin C dose has an effect on tooth growth, (2) whether the two delivery methods, orange juice and ascorbic acid, differ in effectiveness, and (3) whether 1 mg/day from orange juice is as effective as 2 mg/day from ascorbic acid (as the figure suggests).

Since the samples are small (10 per group), t-tests are applied to compare two groups at a time. These tests assume that each population follows a normal distribution (so that the test statistic follows a Student’s t distribution under the null hypothesis) and that the samples are independent. The two population variances are not assumed to be equal, so Welch’s variant of the t-test is used, which is the default behavior of t.test in R.
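
The normality assumption can be examined informally for each of the six groups. The Shapiro-Wilk test used below is one possible check and is not part of the original analysis; with only 10 observations per group it has little power, so a large p-value is weak evidence at best. The names `tg`, `groups` and `shapiro_p` are assumptions made for this sketch.

```r
# Pristine copy of the built-in data (original names: len, supp, dose)
tg <- datasets::ToothGrowth

# One vector of lengths per supplement/dose combination (6 groups)
groups <- split(tg$len, list(tg$supp, tg$dose))

# Shapiro-Wilk p-value for each group; small values hint at non-normality
shapiro_p <- sapply(groups, function(x) shapiro.test(x)$p.value)
print(round(shapiro_p, 3))
```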

When the p-value ≥ 0.05, the null hypothesis (equality of means) cannot be rejected. When the p-value < 0.05, the null hypothesis is rejected in favor of the alternative hypothesis, “the true difference in means is not equal to 0”.

1, Student’s t-Tests for the effect of dose

Effect of dose from orange juice

0.5 <-> 1 mg/day

p-value = 8.785e-05. There is a statistically significant difference.

t.test(ToothGrowth$length[ToothGrowth$dose==0.5 & ToothGrowth$supplement=="orange juice"], ToothGrowth$length[ToothGrowth$dose==1 & ToothGrowth$supplement=="orange juice"])
## 
##  Welch Two Sample t-test
## 
## data:  ToothGrowth$length[ToothGrowth$dose == 0.5 & ToothGrowth$supplement == "orange juice"] and ToothGrowth$length[ToothGrowth$dose == 1 & ToothGrowth$supplement == "orange juice"]
## t = -5.0486, df = 17.698, p-value = 8.785e-05
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -13.415634  -5.524366
## sample estimates:
## mean of x mean of y 
##     13.23     22.70

1 <-> 2 mg/day

p-value = 0.0392. There is a statistically significant difference, although the p-value is close to the 0.05 threshold.

t.test(ToothGrowth$length[ToothGrowth$dose==1 & ToothGrowth$supplement=="orange juice"], ToothGrowth$length[ToothGrowth$dose==2 & ToothGrowth$supplement=="orange juice"])
## 
##  Welch Two Sample t-test
## 
## data:  ToothGrowth$length[ToothGrowth$dose == 1 & ToothGrowth$supplement == "orange juice"] and ToothGrowth$length[ToothGrowth$dose == 2 & ToothGrowth$supplement == "orange juice"]
## t = -2.2478, df = 15.842, p-value = 0.0392
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -6.5314425 -0.1885575
## sample estimates:
## mean of x mean of y 
##     22.70     26.06
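
The t statistic, the fractional degrees of freedom and the p-value in these outputs can be reproduced by hand. The sketch below recomputes the 0.5 vs 1 mg/day orange juice comparison using the Welch statistic and the Welch-Satterthwaite approximation for the degrees of freedom. It is an illustration added here, not part of the original analysis, and all variable names are assumptions.

```r
# Pristine copy of the built-in data (original names: len, supp, dose)
tg <- datasets::ToothGrowth
x <- tg$len[tg$supp == "OJ" & tg$dose == 0.5]
y <- tg$len[tg$supp == "OJ" & tg$dose == 1]

# Squared standard errors of the two group means
se2_x <- var(x) / length(x)
se2_y <- var(y) / length(y)

# Welch t statistic: difference in means over its standard error
t_manual <- (mean(x) - mean(y)) / sqrt(se2_x + se2_y)

# Welch-Satterthwaite approximation of the degrees of freedom
df_manual <- (se2_x + se2_y)^2 /
  (se2_x^2 / (length(x) - 1) + se2_y^2 / (length(y) - 1))

# Two-sided p-value from the t distribution
p_manual <- 2 * pt(-abs(t_manual), df = df_manual)

print(c(t = t_manual, df = df_manual, p = p_manual))
```

The printed values should agree with the first orange juice output above (t = -5.0486, df = 17.698, p-value = 8.785e-05).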

Effect of dose from ascorbic acid

0.5 <-> 1 mg/day

p-value = 6.811e-07. There is a statistically significant difference.

t.test(ToothGrowth$length[ToothGrowth$dose==0.5 & ToothGrowth$supplement=="ascorbic acid"], ToothGrowth$length[ToothGrowth$dose==1 & ToothGrowth$supplement=="ascorbic acid"])
## 
##  Welch Two Sample t-test
## 
## data:  ToothGrowth$length[ToothGrowth$dose == 0.5 & ToothGrowth$supplement == "ascorbic acid"] and ToothGrowth$length[ToothGrowth$dose == 1 & ToothGrowth$supplement == "ascorbic acid"]
## t = -7.4634, df = 17.862, p-value = 6.811e-07
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -11.265712  -6.314288
## sample estimates:
## mean of x mean of y 
##      7.98     16.77

1 <-> 2 mg/day

p-value = 9.156e-05. There is a statistically significant difference.

t.test(ToothGrowth$length[ToothGrowth$dose==1 & ToothGrowth$supplement=="ascorbic acid"], ToothGrowth$length[ToothGrowth$dose==2 & ToothGrowth$supplement=="ascorbic acid"])
## 
##  Welch Two Sample t-test
## 
## data:  ToothGrowth$length[ToothGrowth$dose == 1 & ToothGrowth$supplement == "ascorbic acid"] and ToothGrowth$length[ToothGrowth$dose == 2 & ToothGrowth$supplement == "ascorbic acid"]
## t = -5.4698, df = 13.6, p-value = 9.156e-05
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -13.054267  -5.685733
## sample estimates:
## mean of x mean of y 
##     16.77     26.14
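
The four dose comparisons above can also be obtained in one pass. A possible shortcut, sketched here as an alternative to the separate calls, is `pairwise.t.test` with `pool.sd = FALSE`, which runs unpooled (Welch) t-tests for every pair of dose levels; `p.adjust.method = "none"` keeps the raw p-values. The names `tg` and `dose_pvalues` are assumptions made for illustration.

```r
# Pristine copy of the built-in data (original names: len, supp, dose)
tg <- datasets::ToothGrowth

# For each supplement (OJ, VC): Welch t-tests between all pairs of dose levels
dose_pvalues <- lapply(split(tg, tg$supp), function(d) {
  pairwise.t.test(d$len, d$dose,
                  pool.sd = FALSE,            # do not pool variances (Welch)
                  p.adjust.method = "none")$p.value
})
print(dose_pvalues)
```

Each matrix also contains the 0.5 vs 2 mg/day comparison, which is not tested separately in this report.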

2, Student’s t-Tests for the effect of supplement delivery method (orange juice <-> ascorbic acid)

Effect of delivery method at 0.5 mg/day

p-value = 0.006359. There is a statistically significant difference.

t.test(ToothGrowth$length[ToothGrowth$dose==0.5 & ToothGrowth$supplement=="orange juice"], ToothGrowth$length[ToothGrowth$dose==0.5 & ToothGrowth$supplement=="ascorbic acid"])
## 
##  Welch Two Sample t-test
## 
## data:  ToothGrowth$length[ToothGrowth$dose == 0.5 & ToothGrowth$supplement == "orange juice"] and ToothGrowth$length[ToothGrowth$dose == 0.5 & ToothGrowth$supplement == "ascorbic acid"]
## t = 3.1697, df = 14.969, p-value = 0.006359
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  1.719057 8.780943
## sample estimates:
## mean of x mean of y 
##     13.23      7.98

Effect of delivery method at 1 mg/day

p-value = 0.001038. There is a statistically significant difference.

t.test(ToothGrowth$length[ToothGrowth$dose==1 & ToothGrowth$supplement=="orange juice"], ToothGrowth$length[ToothGrowth$dose==1 & ToothGrowth$supplement=="ascorbic acid"])
## 
##  Welch Two Sample t-test
## 
## data:  ToothGrowth$length[ToothGrowth$dose == 1 & ToothGrowth$supplement == "orange juice"] and ToothGrowth$length[ToothGrowth$dose == 1 & ToothGrowth$supplement == "ascorbic acid"]
## t = 4.0328, df = 15.358, p-value = 0.001038
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  2.802148 9.057852
## sample estimates:
## mean of x mean of y 
##     22.70     16.77

Effect of delivery method at 2 mg/day

p-value = 0.9639. The null hypothesis cannot be rejected; the two means are not statistically significantly different.

t.test(ToothGrowth$length[ToothGrowth$dose==2 & ToothGrowth$supplement=="orange juice"], ToothGrowth$length[ToothGrowth$dose==2 & ToothGrowth$supplement=="ascorbic acid"])
## 
##  Welch Two Sample t-test
## 
## data:  ToothGrowth$length[ToothGrowth$dose == 2 & ToothGrowth$supplement == "orange juice"] and ToothGrowth$length[ToothGrowth$dose == 2 & ToothGrowth$supplement == "ascorbic acid"]
## t = -0.046136, df = 14.04, p-value = 0.9639
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -3.79807  3.63807
## sample estimates:
## mean of x mean of y 
##     26.06     26.14
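
The three delivery-method comparisons can be written more compactly with the formula interface of `t.test`, applied to each dose level in turn. This is a sketch of an alternative formulation, not the code used for the outputs above; `tg` and `supp_pvalues` are names assumed for illustration.

```r
# Pristine copy of the built-in data (original names: len, supp, dose)
tg <- datasets::ToothGrowth

# For each dose level: Welch t-test of length by delivery method (OJ vs VC)
supp_pvalues <- sapply(split(tg, tg$dose), function(d) {
  t.test(len ~ supp, data = d)$p.value
})
print(signif(supp_pvalues, 4))
```

The three p-values should match those reported above for 0.5, 1 and 2 mg/day.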

3, Orange juice 1 mg/day <-> ascorbic acid 2 mg/day

p-value = 0.09653. The null hypothesis cannot be rejected; the two means are not statistically significantly different from each other.

t.test(ToothGrowth$length[ToothGrowth$dose==1 & ToothGrowth$supplement=="orange juice"], ToothGrowth$length[ToothGrowth$dose==2 & ToothGrowth$supplement=="ascorbic acid"])
## 
##  Welch Two Sample t-test
## 
## data:  ToothGrowth$length[ToothGrowth$dose == 1 & ToothGrowth$supplement == "orange juice"] and ToothGrowth$length[ToothGrowth$dose == 2 & ToothGrowth$supplement == "ascorbic acid"]
## t = -1.7574, df = 17.297, p-value = 0.09653
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -7.5643336  0.6843336
## sample estimates:
## mean of x mean of y 
##     22.70     26.14
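
Eight tests are each judged at the 0.05 level, so the chance of at least one false positive across the whole set is larger than 0.05. One possible safeguard, which the analysis above does not apply, is to adjust the p-values for multiple comparisons. The sketch below uses the Holm method via `p.adjust` on the p-values copied from the eight outputs; the labels and variable names are assumptions made for illustration.

```r
# p-values copied from the eight test outputs in this report
p_raw <- c(
  "OJ 0.5 vs 1"       = 8.785e-05,
  "OJ 1 vs 2"         = 0.0392,
  "VC 0.5 vs 1"       = 6.811e-07,
  "VC 1 vs 2"         = 9.156e-05,
  "OJ vs VC at 0.5"   = 0.006359,
  "OJ vs VC at 1"     = 0.001038,
  "OJ vs VC at 2"     = 0.9639,
  "OJ 1 vs VC 2"      = 0.09653
)

# Holm step-down adjustment controls the family-wise error rate
p_holm <- p.adjust(p_raw, method = "holm")

print(data.frame(raw = p_raw,
                 holm = signif(p_holm, 4),
                 reject = p_holm < 0.05))
```

Under this adjustment the orange juice 1 vs 2 mg/day comparison (raw p-value 0.0392) rises to about 0.118 and is no longer significant at the 0.05 level, while the five smallest p-values remain below 0.05.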

Conclusions

Both the exploratory data analysis and the statistical hypothesis tests support the following statements:

1. There is a positive effect of vitamin C on tooth growth. As the dose grows, the length grows as well for both orange juice and ascorbic acid.

2. At the lower vitamin C doses of 0.5 and 1 mg/day, orange juice is statistically significantly more effective than ascorbic acid. At 2 mg/day, however, no statistically significant difference between the two supplements was found.

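3. I showed that the mean lengths for 1 mg/day orange juice and for 2 mg/day ascorbic acid are not statistically significantly different (p-value = 0.09653). This suggests that giving 1 mg/day vitamin C by orange juice to guinea pigs may be a cost-effective tooth growth supplement method. Failing to reject the null hypothesis does not prove that the two treatments are equivalent, however: the 95 percent confidence interval for the difference in means (-7.56 to 0.68) still allows a sizeable advantage for 2 mg/day ascorbic acid.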