For this project, the ToothGrowth dataset was used. Comparisons were made between the length of odontoblasts in guinea pigs that received different doses of vitamin C, delivered either as orange juice or as ascorbic acid supplements. T-tests were used to compare the means of the sample groups and to determine whether there is a statistically significant difference in tooth length between vitamin C doses (0.5, 1, 2 mg/day), or between the two delivery methods.
The response is the length of odontoblasts (cells responsible for tooth growth) in 60 guinea pigs. Each animal received one of three dose levels of vitamin C (0.5, 1, and 2 mg/day) by one of two delivery methods, orange juice or ascorbic acid (a form of vitamin C and coded as VC).
An exploratory data analysis was performed. The dataset consists of 60 observations of 3 variables. The variable dose was converted to a factor for plotting and statistical testing, and subsets were then created to compare samples from different doses.
str(ToothGrowth)
## 'data.frame': 60 obs. of 3 variables:
## $ len : num 4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
## $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
## $ dose: num 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...
summary(ToothGrowth)
## len supp dose
## Min. : 4.20 OJ:30 Min. :0.500
## 1st Qu.:13.07 VC:30 1st Qu.:0.500
## Median :19.25 Median :1.000
## Mean :18.81 Mean :1.167
## 3rd Qu.:25.27 3rd Qu.:2.000
## Max. :33.90 Max. :2.000
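library(dplyr)  # provides filter()
# Treat dose as a categorical variable for plotting and testing
ToothGrowth$dose <- as.factor(ToothGrowth$dose)
# One subset per pair of dose levels
data_subset1 <- filter(ToothGrowth, dose %in% c("0.5", "1"))
data_subset2 <- filter(ToothGrowth, dose %in% c("0.5", "2"))
data_subset3 <- filter(ToothGrowth, dose %in% c("1", "2"))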
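The dose subsets above are meant for pairwise comparisons between dose levels, but the report does not show those tests. A minimal sketch of how they could be run is given below, using base R only. The names `tg`, `dose_pairs`, and `dose_p_values` are illustrative and not part of the original analysis, and no multiple-comparison correction is applied.

```r
# Sketch only: pairwise Welch t-tests of tooth length between dose levels.
# `tg`, `dose_pairs` and `dose_p_values` are hypothetical names for illustration.
tg <- datasets::ToothGrowth  # fresh copy; dose is still numeric here

dose_pairs <- list(c(0.5, 1), c(0.5, 2), c(1, 2))

dose_p_values <- sapply(dose_pairs, function(pair) {
  pair_data <- tg[tg$dose %in% pair, ]
  # t.test() uses the two remaining dose values as the grouping variable
  t.test(len ~ dose, data = pair_data)$p.value
})
names(dose_p_values) <- sapply(dose_pairs, paste, collapse = " vs ")

print(dose_p_values)
```

Because three tests are run on the same data, an adjustment such as `p.adjust(dose_p_values, method = "bonferroni")` may be worth considering before interpreting the results.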
A box plot was made to compare tooth length by dose for both delivery methods.
library(ggplot2)
# Box plot of length by dose, one panel per supplement type
g <- ggplot(data = ToothGrowth, aes(x = dose, y = len, fill = dose))
g + geom_boxplot() +
  scale_fill_brewer(palette = "Accent") +
  facet_grid(cols = vars(supp)) +
  labs(title = "Tooth growth length by supplement type and dose",
       x = "Dose", y = "Length") +
  theme_minimal()
The plot shows a clear positive association between dose and tooth length, and growth appears to be greater in the orange juice groups. We can hypothesize that mean growth in the orange juice groups is larger than mean growth in the ascorbic acid (VC) groups. Formal testing is needed to evaluate this hypothesis.
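The visual impression from the box plot can be checked against the group means. The following is a small sketch using base R's `aggregate()`. The name `group_means` is illustrative and does not appear in the original analysis.

```r
# Sketch only: mean tooth length for each supplement-by-dose group.
# `group_means` is a hypothetical name introduced for illustration.
group_means <- aggregate(len ~ supp + dose,
                         data = datasets::ToothGrowth,
                         FUN = mean)

print(group_means)
```

Each of the six rows corresponds to one combination of delivery method and dose, with 10 animals per group.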
T-tests were used to determine whether there is a statistically significant difference in length between delivery methods at 0.5 mg/day, 1 mg/day and 2 mg/day.
Below are the results of the t-test comparing the means of the orange juice and ascorbic acid (VC) samples at the 0.5 mg/day dose.
t.test(len~supp, data = ToothGrowth[ToothGrowth$dose == "0.5", ])
##
## Welch Two Sample t-test
##
## data: len by supp
## t = 3.1697, df = 14.969, p-value = 0.006359
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 1.719057 8.780943
## sample estimates:
## mean in group OJ mean in group VC
## 13.23 7.98
Below are the results of the t-test comparing the means of the orange juice and ascorbic acid (VC) samples at the 1 mg/day dose.
t.test(len~supp, data = ToothGrowth[ToothGrowth$dose == "1", ])
##
## Welch Two Sample t-test
##
## data: len by supp
## t = 4.0328, df = 15.358, p-value = 0.001038
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 2.802148 9.057852
## sample estimates:
## mean in group OJ mean in group VC
## 22.70 16.77
Finally, below are the results of the t-test comparing the means of the orange juice and ascorbic acid (VC) samples at the 2 mg/day dose.
t.test(len~supp, data = ToothGrowth[ToothGrowth$dose == "2", ])
##
## Welch Two Sample t-test
##
## data: len by supp
## t = -0.046136, df = 14.04, p-value = 0.9639
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -3.79807 3.63807
## sample estimates:
## mean in group OJ mean in group VC
## 26.06 26.14
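The hypothesis stated earlier is directional (orange juice mean larger than ascorbic acid mean), whereas the tests above are two-sided. A one-sided variant could be sketched as follows for the 0.5 mg/day dose. This is an illustration of the `alternative` argument of `t.test()`, not part of the original analysis, and the names `tg`, `low_dose`, and `one_sided` are hypothetical.

```r
# Sketch only: one-sided Welch t-test at the 0.5 mg/day dose.
# `tg`, `low_dose` and `one_sided` are hypothetical names for illustration.
tg <- datasets::ToothGrowth
low_dose <- tg[tg$dose == 0.5, ]

# supp has levels OJ, VC, so "greater" tests whether mean(OJ) - mean(VC) > 0
one_sided <- t.test(len ~ supp, data = low_dose, alternative = "greater")

print(one_sided$p.value)
```

When the observed difference lies in the hypothesized direction, the one-sided p-value is half the two-sided one, so the choice of alternative should be fixed before looking at the data.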
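To perform the t-tests, the following assumptions were made:
* The groups being compared are independent of each other.
* The samples were drawn at random from the population of interest.
* The data in each group follow an approximately normal distribution.
* Equal variances are not required, because the tests shown are Welch two-sample t-tests (the default in t.test()), which do not assume homogeneity of variances.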
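The assumptions listed above can be examined informally. The sketch below uses `shapiro.test()` for normality within each group and `var.test()` to compare variances between delivery methods at one dose. These checks are not part of the original analysis, the names `tg`, `shapiro_p`, `low_dose`, and `variance_check` are hypothetical, and with only 10 animals per group such tests have limited power.

```r
# Sketch only: informal checks of the t-test assumptions.
# All object names below are hypothetical and introduced for illustration.
tg <- datasets::ToothGrowth

# Shapiro-Wilk normality test within each supplement-by-dose group
groups <- split(tg$len, list(tg$supp, tg$dose))
shapiro_p <- sapply(groups, function(x) shapiro.test(x)$p.value)
print(round(shapiro_p, 3))

# F test comparing the variances of OJ and VC at the 0.5 mg/day dose
low_dose <- tg[tg$dose == 0.5, ]
variance_check <- var.test(len ~ supp, data = low_dose)
print(variance_check$p.value)
```

A small p-value from either check would suggest that the corresponding assumption deserves a closer look, for example with a Q-Q plot.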
From our data analysis, a higher vitamin C dose was associated with greater tooth length in our samples. Orange juice was associated with significantly more growth than ascorbic acid at the 0.5 mg/day and 1 mg/day doses. At 2 mg/day there was no statistically significant difference between orange juice and ascorbic acid, although mean growth in both groups was higher than at the lower doses.
We conclude that, in our sample, vitamin C delivered as orange juice was associated with more growth than ascorbic acid supplements at the 0.5 and 1 mg/day doses, with no statistically significant difference at the 2 mg/day dose. Dose and growth are also positively associated. The table below summarizes the p-values and 95% confidence intervals for the t-tests comparing delivery methods at each dose.
| Dose | p-value | 95% confidence interval |
|---|---|---|
| 0.5 mg/day | 0.0063586 | 1.719 to 8.781 |
| 1 mg/day | 0.0010384 | 2.802 to 9.058 |
| 2 mg/day | 0.9638516 | -3.798 to 3.638 |
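The figures in the table can be reproduced programmatically. The following is a minimal sketch that loops over the three doses and collects the p-value and confidence interval from each Welch t-test. The names `tg`, `doses`, and `summary_table` are illustrative and may differ from however the original table was built.

```r
# Sketch only: collect p-values and 95% confidence intervals per dose.
# `tg`, `doses` and `summary_table` are hypothetical names for illustration.
tg <- datasets::ToothGrowth
doses <- c(0.5, 1, 2)

summary_table <- do.call(rbind, lapply(doses, function(d) {
  res <- t.test(len ~ supp, data = tg[tg$dose == d, ])
  data.frame(dose     = d,
             p.value  = res$p.value,
             ci.lower = res$conf.int[1],
             ci.upper = res$conf.int[2])
}))

print(summary_table, digits = 4)
```

The confidence intervals refer to the difference in mean length, orange juice minus ascorbic acid.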