Given: ToothGrowth data
Required: Perform basic exploratory data analysis. Provide basic summary of the data. Use confidence intervals and/or hypothesis tests to compare tooth growth by supplement and dose. State your conclusions and assumptions needed for the conclusions.
ToothGrowth is a dataset from a study that used the growth of the odontoblasts of the incisor teeth as a criterion of vitamin C intake in guinea pigs.* The dataset is a data frame of 60 rows (observations) and 3 columns (variables), with no missing values. The variable len ranges from 4.2 to 33.9. The variable dose takes three repeating values: 0.5, 1.0, and 2.0. The variable supp (supplement type) is a factor with 2 levels, OJ and VC. A boxplot of len by dose and supplement shows that the effect on tooth growth varies with both dose level and supplement type. It also suggests that at dose 2.0, OJ and VC have a similar effect on growth.
str(ToothGrowth)
## 'data.frame':    60 obs. of  3 variables:
##  $ len : num  4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
##  $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
##  $ dose: num  0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...
summary(ToothGrowth)
##       len        supp         dose
##  Min.   : 4.20   OJ:30   Min.   :0.500
##  1st Qu.:13.07   VC:30   1st Qu.:0.500
##  Median :19.25           Median :1.000
##  Mean   :18.81           Mean   :1.167
##  3rd Qu.:25.27           3rd Qu.:2.000
##  Max.   :33.90           Max.   :2.000
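The group structure and the boxplot described above could be reproduced with a short sketch like the following. It assumes the built-in ToothGrowth data frame from R's datasets package; the object names cell_counts, cell_means, and bp are illustrative and do not come from the original analysis.

```r
# ToothGrowth ships with base R (package 'datasets')
data(ToothGrowth)

# Number of observations in each supplement x dose cell
cell_counts <- table(ToothGrowth$supp, ToothGrowth$dose)
print(cell_counts)

# Mean tooth length in each supplement x dose cell
cell_means <- aggregate(len ~ supp + dose, data = ToothGrowth, FUN = mean)
print(cell_means)

# Boxplot of tooth length for every supplement/dose combination
bp <- boxplot(len ~ supp * dose, data = ToothGrowth,
              xlab = "Supplement and dose",
              ylab = "Tooth length",
              main = "Tooth growth by supplement and dose")
```

The cell means give a numeric counterpart to the boxplot, which makes it easier to see where OJ and VC diverge and where they converge.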
This section compares the effect of dose and supplement variables on the length variable.
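Null hypothesis, Ho: The difference in mean tooth length between the two supplements is equal to 0
Alternative hypothesis, Ha: The difference in mean tooth length between the two supplements is not equal to 0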
The result gives a 95% confidence interval of -7.57 to 0.17 for the difference in mean tooth growth between the two supplements (VC minus OJ). Because 0 falls within the interval, a population difference of 0 between the 2 groups cannot be ruled out, so we fail to reject the null hypothesis. The p-value of 0.06 > 0.05 means the difference is not significant at the 5% level.
# Make VC the reference level so that the reported difference is VC minus OJ
t.test(len ~ I(relevel(supp, 2)), paired = FALSE, data = ToothGrowth)
##
## Welch Two Sample t-test
##
## data: len by I(relevel(supp, 2))
## t = -1.9153, df = 55.309, p-value = 0.06063
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -7.5710156 0.1710156
## sample estimates:
## mean in group VC mean in group OJ
## 16.96333 20.66333
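To make the reported t statistic, degrees of freedom, and interval less of a black box, the following sketch recomputes them by hand and compares them with t.test(). It assumes the standard Welch formulas (unpooled standard error and the Welch-Satterthwaite degrees of freedom); the variable names are illustrative.

```r
# Tooth lengths for each supplement group
vc <- ToothGrowth$len[ToothGrowth$supp == "VC"]
oj <- ToothGrowth$len[ToothGrowth$supp == "OJ"]

# Standard error of the difference in means (variances are not pooled)
se <- sqrt(var(vc) / length(vc) + var(oj) / length(oj))
t_stat <- (mean(vc) - mean(oj)) / se

# Welch-Satterthwaite approximation to the degrees of freedom
df <- se^4 / ((var(vc) / length(vc))^2 / (length(vc) - 1) +
              (var(oj) / length(oj))^2 / (length(oj) - 1))

# Two-sided p-value and 95% confidence interval for mean(VC) - mean(OJ)
p_value <- 2 * pt(-abs(t_stat), df)
ci <- (mean(vc) - mean(oj)) + c(-1, 1) * qt(0.975, df) * se

# Reference result: the Welch test is the default (var.equal = FALSE)
res <- t.test(vc, oj)
print(c(t = t_stat, df = df, p = p_value))
print(ci)
```

The decision rule is then a simple check: the null hypothesis is rejected at the 5% level only if the interval excludes 0, which is equivalent to p_value < 0.05.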
The data is then subset by dose. The 95% confidence intervals for the difference in mean tooth growth (OJ minus VC) are entirely positive for doses 0.5 and 1.0, but the interval for dose 2.0 contains 0. The p-values are low (< 0.05) for doses 0.5 and 1.0, which indicates a significant difference, whereas for dose 2.0 the difference is not significant (p = 0.96 > 0.05).
# One subset per dose level: 0.5, 1.0 and 2.0
toothdose4 <- subset(ToothGrowth, dose == 0.5)
toothdose5 <- subset(ToothGrowth, dose == 1.0)
toothdose6 <- subset(ToothGrowth, dose == 2.0)
t.test(len ~ supp, data = toothdose4)
##
## Welch Two Sample t-test
##
## data: len by supp
## t = 3.1697, df = 14.969, p-value = 0.006359
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 1.719057 8.780943
## sample estimates:
## mean in group OJ mean in group VC
## 13.23 7.98
t.test(len ~ supp, data = toothdose5)
##
## Welch Two Sample t-test
##
## data: len by supp
## t = 4.0328, df = 15.358, p-value = 0.001038
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 2.802148 9.057852
## sample estimates:
## mean in group OJ mean in group VC
## 22.70 16.77
t.test(len ~ supp, data = toothdose6)
##
## Welch Two Sample t-test
##
## data: len by supp
## t = -0.046136, df = 14.04, p-value = 0.9639
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -3.79807 3.63807
## sample estimates:
## mean in group OJ mean in group VC
## 26.06 26.14
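The three per-dose tests above can also be collected into one table, which makes the comparison across doses easier to read. This is a sketch of an alternative to the three separate subsets, not the code used in the analysis; by_dose and dose_summary are illustrative names.

```r
# Run a Welch t-test of len by supp within each dose level
by_dose <- lapply(split(ToothGrowth, ToothGrowth$dose), function(d) {
  res <- t.test(len ~ supp, data = d)   # difference is OJ minus VC
  data.frame(dose     = d$dose[1],
             ci_lower = res$conf.int[1],
             ci_upper = res$conf.int[2],
             p_value  = res$p.value)
})

# Stack the per-dose results and flag significance at the 5% level
dose_summary <- do.call(rbind, by_dose)
dose_summary$significant <- dose_summary$p_value < 0.05
print(dose_summary)
```

Each row reports the interval for OJ minus VC at one dose, so a row whose interval lies entirely above 0 indicates longer teeth with OJ at that dose.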
Assumptions:
The sample is representative of the population.
The observations are random and independent.
Variances are not assumed to be equal between groups: the Welch t-tests used above do not require equal variances.
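The variance and distribution assumptions can be examined informally. The sketch below assumes that an F test (var.test) and a Shapiro-Wilk test (shapiro.test) are acceptable diagnostics for this data; neither is part of the original analysis, and the object names are illustrative.

```r
# F test for equal variances between the two supplement groups
var_check <- var.test(len ~ supp, data = ToothGrowth)
print(var_check$p.value)

# Shapiro-Wilk normality test within each supplement group
norm_check <- tapply(ToothGrowth$len, ToothGrowth$supp,
                     function(x) shapiro.test(x)$p.value)
print(norm_check)

# Welch (default) versus pooled-variance t-test on the same comparison
welch  <- t.test(len ~ supp, data = ToothGrowth)
pooled <- t.test(len ~ supp, data = ToothGrowth, var.equal = TRUE)
print(c(welch = welch$p.value, pooled = pooled$p.value))
```

Comparing the Welch and pooled p-values shows how much the conclusion depends on the equal-variance assumption. If the two agree closely, the choice between them matters little for this comparison.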
Conclusions:
Pooled across doses, there is no significant difference between the supplements OJ and VC in their effect on tooth length (p = 0.06).
There is a significant difference in tooth length between the different dose combinations: the CIs for these comparisons contain only negative values, so tooth length increases with dose.
At doses 0.5 and 1.0, there is a significant difference between the supplements: the CIs (OJ minus VC) contain only positive values, so OJ is associated with greater tooth length than VC.
At dose 2.0, there is no significant difference between the supplements, as the 95% CI contains 0.
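The conclusion about dose levels rests on comparisons between pairs of doses. A minimal sketch of how such comparisons could be run follows, assuming Welch t-tests on each pair of dose levels with both supplements pooled; the helper compare_doses and the other object names are hypothetical.

```r
# All pairs of dose levels: (0.5, 1), (0.5, 2), (1, 2)
dose_pairs <- combn(sort(unique(ToothGrowth$dose)), 2, simplify = FALSE)

# Hypothetical helper: Welch t-test of len between two dose levels
compare_doses <- function(pair) {
  d <- subset(ToothGrowth, dose %in% pair)
  res <- t.test(len ~ dose, data = d)   # difference is lower dose minus higher dose
  data.frame(comparison = paste(pair, collapse = " vs "),
             ci_lower   = res$conf.int[1],
             ci_upper   = res$conf.int[2],
             p_value    = res$p.value)
}

# One row per pair of doses
dose_comparison <- do.call(rbind, lapply(dose_pairs, compare_doses))
print(dose_comparison)
```

Because the difference is taken as lower dose minus higher dose, an interval that lies entirely below 0 corresponds to longer teeth at the higher dose.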
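References: * Crampton, E. W. (1947). The growth of the odontoblast of the incisor teeth as a criterion of vitamin C intake of the guinea pig. The Journal of Nutrition 33(5): 491-504.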