Overview
- The ToothGrowth dataset records the effect of Orange Juice (OJ) and Vitamin C (VC) intake on tooth length in guinea pigs.
- Each supplement is given at three doses: 0.5, 1 and 2 mg/day.
- I will analyse the impact of these supplements on tooth length.
Exploratory Data Analysis
- Tooth length is plotted against dose in a separate panel for each supplement, with a line connecting the mean length at each dose.
- A table is created containing the mean and median tooth length for each supplement at each dose.

- From the EDA (see figure), I found that tooth length increases as the dose increases.
Summary of Tooth Length
| dose | supp | meanLen | medLen |
|-----:|:-----|--------:|-------:|
| 0.5 | OJ | 13.23 | 12.25 |
| 0.5 | VC | 7.98 | 7.15 |
| 1.0 | OJ | 22.70 | 23.45 |
| 1.0 | VC | 16.77 | 16.50 |
| 2.0 | OJ | 26.06 | 25.95 |
| 2.0 | VC | 26.14 | 25.95 |
- The table also shows that at a dose of 2 the mean tooth length is approximately equal for the two supplements (26.06 for OJ, 26.14 for VC).
For the code, refer to the Appendix.
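Confidence Intervals and Hypothesis Tests
I ran paired t-tests, with the corresponding t confidence intervals, on the data. Each test uses paired = TRUE, which matches observations by their row position within each group; this pairing is an assumption of the analysis. All tests use a 95% confidence level, i.e. a significance level of 0.05: the null hypothesis is rejected when the p-value is below 0.05.
First, the data for Orange Juice and Vitamin C are compared irrespective of dose.
Assumption (Null Hypothesis) - Orange Juice and Vitamin C have the same impact on tooth growth, i.e. the mean of the differences between the two supplements is 0.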
tVC <- ToothGrowth[1:30,]
tOJ <- ToothGrowth[31:60,]
tIntervalOJVC <- t.test(tOJ$len,tVC$len,paired = TRUE)
- In the above test, the 95% confidence interval for the mean of the differences is (1.4086586, 5.9913414).
- This means we are 95% confident that the true mean of the differences lies in this interval; the observed mean difference is 3.7.
- The p-value is 0.0025498, which is less than the 0.05 significance level, so we reject the null hypothesis.
- The difference (Orange Juice minus Vitamin C) is positive, which indicates that Orange Juice has a larger effect on tooth length than Vitamin C.
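The decision rule used in the bullets above can be sketched directly from the object that t.test() returns. This is a minimal illustration, not the original report code: the names alpha, ci, meanDiff, pValue and rejectNull are introduced here for the example, and the 0.05 significance level is assumed from the 95% confidence level.

```r
# ToothGrowth ships with base R; rows 1-30 are VC and rows 31-60 are OJ.
data("ToothGrowth")
tVC <- ToothGrowth[1:30, ]
tOJ <- ToothGrowth[31:60, ]

alpha <- 0.05  # assumed significance level, matching a 95% confidence interval

# Same paired test as in the report, with the confidence level made explicit.
tIntervalOJVC <- t.test(tOJ$len, tVC$len, paired = TRUE, conf.level = 1 - alpha)

ci       <- as.numeric(tIntervalOJVC$conf.int)  # lower and upper bound
meanDiff <- unname(tIntervalOJVC$estimate)      # observed mean of differences (OJ - VC)
pValue   <- tIntervalOJVC$p.value

# Reject the null hypothesis only when the p-value is below alpha.
rejectNull <- pValue < alpha

print(round(c(lower = ci[1], upper = ci[2], meanDiff = meanDiff, pValue = pValue), 4))
print(rejectNull)
```

Rejecting at the 0.05 level is equivalent to the 95% confidence interval excluding 0, so the interval and the p-value always lead to the same decision.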
Second, the two supplements are compared at each dose level.
Assumption (Null Hypothesis) - Orange Juice and Vitamin C have the same impact on tooth growth, i.e. the mean of the differences between the two supplements is 0 at each dose.
For dose = 0.5
library(dplyr)  # provides filter()
t05 <- filter(ToothGrowth, dose == 0.5)
tInterval05 <- t.test(filter(t05, supp == 'OJ')$len, filter(t05, supp == 'VC')$len, paired = TRUE)
- In the above test, the 95% confidence interval for the mean of the differences is (1.2634583, 9.2365417).
- This means we are 95% confident that the true mean of the differences lies in this interval; the observed mean difference is 5.25.
- The p-value is 0.015472, which is less than the 0.05 significance level, so we reject the null hypothesis.
- The difference (Orange Juice minus Vitamin C) is positive, which indicates that Orange Juice has a larger effect on tooth length than Vitamin C at a dose of 0.5.
For dose = 1
t1 <- filter(ToothGrowth, dose == 1)
tInterval1 <- t.test(filter(t1, supp == 'OJ')$len, filter(t1, supp == 'VC')$len, paired = TRUE)
- In the above test, the 95% confidence interval for the mean of the differences is (1.9519109, 9.9080891).
- This means we are 95% confident that the true mean of the differences lies in this interval; the observed mean difference is 5.93.
- The p-value is 0.0082292, which is less than the 0.05 significance level, so we reject the null hypothesis.
- The difference (Orange Juice minus Vitamin C) is positive, which indicates that Orange Juice has a larger effect on tooth length than Vitamin C at a dose of 1.
For dose = 2
t2 <- filter(ToothGrowth, dose == 2)
tInterval2 <- t.test(filter(t2, supp == 'OJ')$len, filter(t2, supp == 'VC')$len, paired = TRUE)
- In the above test, the 95% confidence interval for the mean of the differences is (-4.3289765, 4.1689765).
- This means we are 95% confident that the true mean of the differences lies in this interval; the observed mean difference is -0.08.
- The p-value is 0.9669567, which is greater than the 0.05 significance level, so we fail to reject the null hypothesis, i.e. there is no evidence that the two supplements differ at this dose level.
- The observed difference (Orange Juice minus Vitamin C) is slightly negative, but the confidence interval contains 0, so it does not show that Orange Juice has less effect on tooth length than Vitamin C at a dose of 2.
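The three per-dose comparisons above can be collected into one table. The sketch below uses base R subsetting instead of dplyr so that it runs on its own; the names perDose, pPaired, pWelch and rejectNull are introduced here for illustration. The pPaired column repeats the paired test from the report. The pWelch column is an addition, not part of the original analysis: the R documentation for ToothGrowth describes 60 guinea pigs that each received a single dose and supplement combination, so an independent-samples (Welch) test is shown alongside for comparison.

```r
data("ToothGrowth")
alpha <- 0.05  # assumed significance level, matching a 95% confidence interval

perDose <- do.call(rbind, lapply(c(0.5, 1, 2), function(d) {
  oj <- ToothGrowth$len[ToothGrowth$dose == d & ToothGrowth$supp == "OJ"]
  vc <- ToothGrowth$len[ToothGrowth$dose == d & ToothGrowth$supp == "VC"]

  paired <- t.test(oj, vc, paired = TRUE)  # test used in the report
  welch  <- t.test(oj, vc)                 # unpaired Welch test (added for comparison)

  data.frame(
    dose       = d,
    meanDiff   = unname(paired$estimate),  # OJ minus VC
    lower      = paired$conf.int[1],
    upper      = paired$conf.int[2],
    pPaired    = paired$p.value,
    pWelch     = welch$p.value,
    rejectNull = paired$p.value < alpha    # decision based on the paired test
  )
}))

print(perDose)
```

Reading the rejectNull column row by row gives the same decisions as the bullets above: reject at doses 0.5 and 1, fail to reject at dose 2.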
Third, for each supplement, the effect of increasing the dose is examined.
Assumption (Null Hypothesis) - The effect of increasing the dose from 0.5 to 1 is the same as the effect of increasing it from 1 to 2, for each supplement.
For Vitamin C
tVC105 <- filter(tVC,dose==1)$len - filter(tVC,dose==0.5)$len
tVC21 <- filter(tVC,dose==2)$len - filter(tVC,dose==1)$len
tIntervalVC <- t.test(tVC21,tVC105,paired = TRUE)
- For the above test, the differences in length between doses 1 and 0.5 and between doses 2 and 1 are computed, and the two sets of differences are then compared with a paired t confidence interval.
- For Vitamin C, the mean of the differences is 0.58, which is very small, and the 95% confidence interval is (-5.3223375, 6.4823375).
- The p-value is 0.8290479, which is greater than the 0.05 significance level, so we fail to reject the null hypothesis.
- These results give no evidence that increasing the Vitamin C dose from 1 to 2 has a different impact than increasing it from 0.5 to 1.
For Orange Juice
tOJ105 <- filter(tOJ,dose==1)$len - filter(tOJ,dose==0.5)$len
tOJ21 <- filter(tOJ,dose==2)$len - filter(tOJ,dose==1)$len
tIntervalOJ <- t.test(tOJ21,tOJ105,paired = TRUE)
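- For the above test, the differences in length between doses 1 and 0.5 and between doses 2 and 1 are computed, and the two sets of differences are then compared with a paired t confidence interval.
- For Orange Juice, the mean of the differences is -6.11, which is negative, and the 95% confidence interval is (-14.3884822, 2.1684822).
- The p-value is 0.1293383, which is greater than the 0.05 significance level, so we fail to reject the null hypothesis.
- The observed mean suggests that increasing the Orange Juice dose from 1 to 2 has less impact than increasing it from 0.5 to 1, but the confidence interval contains 0, so this difference is not statistically significant.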
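The dose-increment comparison for both supplements can be sketched as one helper function. This is an illustration under the same assumptions as the report (observations matched by row position within each supplement); the function name incrementTest and the column names are hypothetical, and base R subsetting is used so the block does not depend on dplyr.

```r
data("ToothGrowth")
alpha <- 0.05  # assumed significance level, matching a 95% confidence interval

# Compare the gain from dose 1 -> 2 against the gain from dose 0.5 -> 1
# for one supplement, pairing observations by row position (an assumption).
incrementTest <- function(suppName) {
  s    <- ToothGrowth[ToothGrowth$supp == suppName, ]
  low  <- s$len[s$dose == 1] - s$len[s$dose == 0.5]  # gain from 0.5 to 1
  high <- s$len[s$dose == 2] - s$len[s$dose == 1]    # gain from 1 to 2
  res  <- t.test(high, low, paired = TRUE)

  data.frame(
    supp       = suppName,
    meanDiff   = unname(res$estimate),  # (gain 1 -> 2) minus (gain 0.5 -> 1)
    lower      = res$conf.int[1],
    upper      = res$conf.int[2],
    pValue     = res$p.value,
    rejectNull = res$p.value < alpha
  )
}

increments <- rbind(incrementTest("VC"), incrementTest("OJ"))
print(increments)
```

A negative meanDiff means the second increment added less length than the first; whether that is statistically significant depends on the rejectNull column, not on the sign alone.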
Appendix
Code of Exploratory Data Analysis
library(ggplot2)
data("ToothGrowth")
g <- ggplot(ToothGrowth,aes(x = dose,y = len, color = supp))
g <- g + geom_point()
# fun and linewidth replace the deprecated fun.y and size arguments (ggplot2 >= 3.4.0)
g <- g + stat_summary(aes(group=1),geom = "line",fun = mean, col="black", linewidth = 1)
g + facet_grid(. ~ supp)
library(xtable); library(dplyr); library(knitr)
tGroup <- group_by(ToothGrowth,dose,supp)
tSummary <- kable(summarize(tGroup,meanLen = mean(len), medLen = median(len)),
format = "pandoc", caption = "Summary of Tooth Length")
tSummary