The length of odontoblasts (cells responsible for tooth growth) was measured in 60 guinea pigs (the ToothGrowth data set supplied with the datasets package in R). Each guinea pig received one of three doses of vitamin C (0.5, 1, or 2 mg/day) by one of two delivery methods (orange juice or ascorbic acid). The data set was loaded and subjected to a basic summary analysis. The association of odontoblast length with vitamin C delivery method, and then with vitamin C dose, was investigated.
First, I loaded the ToothGrowth data and stored it in a variable df.
Then I performed basic exploratory data analysis (see below). Applying the function str to the data set showed that it contains 60 observations of 3 variables: len (numeric), supp (a factor with 2 levels, “OJ” and “VC”), and dose (numeric). The function unique applied to the dose variable revealed 3 dose levels: 0.5, 1.0, and 2.0. To check for missing values, the result of is.na was passed to sum. Since the outcome is 0, there are no missing data points in the data set.
str(df)
## 'data.frame': 60 obs. of 3 variables:
## $ len : num 4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
## $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
## $ dose: num 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...
unique(df$dose)
## [1] 0.5 1.0 2.0
sum(is.na(df))
## [1] 0
To visualize the data by group, odontoblast length was plotted as box plots (Figure 1 in the Appendix). When grouped by delivery method (upper left panel), the odontoblast lengths appear to be roughly symmetrical and similarly distributed in both groups. When grouped by vitamin C dose (upper right panel), the lengths appear to be symmetrical in all groups; however, there is a trend toward longer odontoblasts with increasing dose. Finally, the lengths were grouped by delivery method and plotted separately for each dose (bottom panel). The lengths appear to be symmetrical within the groups. Odontoblasts tend to be shorter with ascorbic acid than with orange juice at 0.5 mg/day and 1 mg/day, with no apparent difference at 2 mg/day.
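The pattern read off the box plots can also be checked numerically. The following is a minimal sketch, not part of the original analysis, that tabulates the mean odontoblast length for every combination of delivery method and dose using base R only; the variable name `means` is introduced here for illustration.

```r
# Sketch: mean odontoblast length per delivery method (rows) and dose (columns).
df <- datasets::ToothGrowth

# tapply with two grouping vectors returns a 2 x 3 matrix of group means.
means <- tapply(df$len, list(supp = df$supp, dose = df$dose), mean)
round(means, 2)

# Difference between delivery methods at each dose (orange juice minus ascorbic acid).
round(means["OJ", ] - means["VC", ], 2)
```

These are descriptive statistics only; they illustrate the trend but do not test it.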
To assess whether odontoblast length follows a normal distribution, summary statistics were computed with the stat.desc function from the pastecs package (Table 1 of the Appendix). The absolute values of the kurtosis (kurt.2SE) and skewness (skew.2SE) criteria are below 1 for all groups, suggesting that neither departs significantly from the value expected for a normal distribution, i.e. the data are approximately mesokurtic and symmetrical. In addition, the p-values of the Shapiro-Wilk normality test (normtest.p) are greater than 0.05 for all groups, so there is no evidence against normality within any group.
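For readers without the pastecs package, the normality check can be approximated with base R alone. This is a sketch under the assumption that the Shapiro-Wilk test is applied separately to each of the six delivery method by dose groups, as in Table 1; the names `groups` and `shapiro_p` are illustrative.

```r
# Sketch: per-group Shapiro-Wilk test using base R only.
df <- datasets::ToothGrowth

# Split odontoblast lengths into the six supp x dose groups.
groups <- split(df$len, list(df$supp, df$dose))

# Extract the Shapiro-Wilk p-value for each group; values above 0.05
# give no evidence against normality at the 5% level.
shapiro_p <- sapply(groups, function(x) shapiro.test(x)$p.value)
round(shapiro_p, 2)
```

With only 10 animals per group the test has limited power, so a large p-value should be read as "no evidence against normality" rather than proof of it.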
Since there are 2 factors (dose and delivery method) available in the data set, I analyzed the association of each factor with the odontoblast length.
To test whether vitamin C delivery method affects odontoblast length, the length values were grouped by delivery method and compared using the t.test function. The null hypothesis is that delivery method has no effect on odontoblast length; the alternative hypothesis is that it has an effect. Since it is unknown whether the variances of the two groups are equal, I assumed that they are different (Welch's t-test). The p-value of this test is 0.0606. The 95% confidence interval (\(\alpha\) = 0.05) for the difference in means, orange juice minus ascorbic acid, is [-0.17, 7.57]. Since the confidence interval contains 0, I fail to reject the null hypothesis. Therefore, at the 5% significance level, the data do not show an effect of delivery method on odontoblast length in guinea pigs.
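To make the unequal-variance assumption concrete, the sketch below recomputes the test by hand and cross-checks it against t.test. It assumes the standard Welch formulation (unpooled standard error with Welch-Satterthwaite degrees of freedom); all variable names are illustrative and not taken from the Appendix code.

```r
# Sketch: Welch's two-sample t-test computed step by step.
df <- datasets::ToothGrowth
oj <- df$len[df$supp == "OJ"]
vc <- df$len[df$supp == "VC"]

# Squared standard error of each group mean (variances are NOT pooled).
se2_oj <- var(oj) / length(oj)
se2_vc <- var(vc) / length(vc)
se_diff <- sqrt(se2_oj + se2_vc)

# Welch-Satterthwaite approximation to the degrees of freedom.
nu <- (se2_oj + se2_vc)^2 /
  (se2_oj^2 / (length(oj) - 1) + se2_vc^2 / (length(vc) - 1))

# Test statistic, two-sided p-value, and 95% confidence interval
# for the difference in means (orange juice minus ascorbic acid).
diff_means <- mean(oj) - mean(vc)
t_stat <- diff_means / se_diff
p_value <- 2 * pt(-abs(t_stat), df = nu)
ci <- diff_means + c(-1, 1) * qt(0.975, df = nu) * se_diff

# Cross-check against the built-in implementation.
welch <- t.test(len ~ supp, data = df, var.equal = FALSE)
all.equal(p_value, welch$p.value)
all.equal(ci, as.numeric(welch$conf.int))
```

Both all.equal calls should agree with the built-in result, which confirms that the reported interval is for orange juice minus ascorbic acid.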
To test whether vitamin C dose affects odontoblast length in guinea pigs, the length values were grouped by dose and compared pairwise using the t.test function. Under the null hypothesis, the dose of vitamin C does not affect odontoblast length. Under the alternative hypothesis, the dose of vitamin C affects odontoblast length. The results of the comparisons are presented in Table 2 of the Appendix. Since performing multiple comparisons inflates the chance of a false positive, a Bonferroni adjustment was applied; the adjusted p values are also shown in Table 2 of the Appendix. The p values for all three comparisons (0.5 vs 1, 1 vs 2, 0.5 vs 2) remain below 0.05 after adjustment, therefore I reject the null hypothesis and conclude, at the 5% significance level, that the dose of vitamin C has an effect on odontoblast length in guinea pigs. Increasing the vitamin C dose leads to an increase in odontoblast length.
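The same pairwise comparison can be sketched more compactly with pairwise.t.test from the stats package. This is an alternative to the three separate t.test calls in the Appendix, not the code used for Table 2; setting pool.sd = FALSE is assumed here so that each pair is compared with an unequal-variance t-test.

```r
# Sketch: pairwise Welch t-tests between doses with Bonferroni adjustment.
df <- datasets::ToothGrowth

# pool.sd = FALSE runs a separate unequal-variance t-test for each pair of doses.
pw <- pairwise.t.test(df$len, df$dose,
                      p.adjust.method = "bonferroni",
                      pool.sd = FALSE)
pw$p.value

# The Bonferroni adjustment itself: multiply each raw p-value by the
# number of comparisons (3 here) and cap the result at 1.
raw <- pairwise.t.test(df$len, df$dose,
                       p.adjust.method = "none",
                       pool.sd = FALSE)$p.value
manual <- pmin(raw * 3, 1)
all.equal(manual, pw$p.value)
```

The result is a lower-triangular matrix of adjusted p values, with NA in the unused cell.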
To conclude, no significant effect of the vitamin C delivery method (orange juice or ascorbic acid) on odontoblast length in guinea pigs was detected, whereas the dose of vitamin C does affect odontoblast length: increasing the dose leads to an increase in odontoblast length.
To reach these conclusions, the following assumptions were made:
1. The guinea pigs used in this study were independent and identically distributed.
2. Odontoblast length is normally distributed.
3. The variances of the groups, whether split by vitamin C delivery method or by vitamin C dose, are not assumed to be equal.
setwd("~/Documents/classes/dataScSpec/statInfer/Assignment")
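# library() stops with an error if a package is missing,
# whereas require() only returns FALSE with a warning.
library(knitr)      # kable
library(datasets)   # ToothGrowth
library(ggplot2)    # plotting
library(pastecs)    # stat.desc
library(dplyr)      # group_by, summarize, filter, %>%
library(gridExtra)  # grid.arrange
df <- ToothGrowth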
p <- ggplot(aes(x = supp, y = len), data = df) +
    geom_boxplot(aes(fill = supp)) +
    labs(x = "Delivery method",
         y = "Odontoblast length (micron)")
q <- ggplot(aes(x = dose, y = len), data = df) +
    geom_boxplot(aes(fill = factor(dose))) +
    labs(x = "Vitamin C dose (mg/day)",
         y = "Odontoblast length (micron)")
r <- ggplot(aes(x = supp, y = len), data = df) +
    facet_grid(. ~ dose) +
    geom_boxplot(aes(fill = supp)) +
    labs(x = "Delivery method",
         y = "Odontoblast length (micron)")
lay <- rbind(c(1, 2),
             c(3, 3))
grid.arrange(grobs = list(p, q, r), layout_matrix = lay)
summ <- df %>%
    group_by(supp, dose) %>%
    summarize(mean = mean(len),
              median = median(len),
              sd = round(sd(len), 2),
              max = max(len),
              min = min(len),
              skew.2SE = round(stat.desc(len, norm = TRUE)["skew.2SE"], 2),
              kurt.2SE = round(stat.desc(len, norm = TRUE)["kurt.2SE"], 2),
              normtest.W = round(stat.desc(len, norm = TRUE)["normtest.W"], 2),
              normtest.p = round(stat.desc(len, norm = TRUE)["normtest.p"], 2))
kable(summ,
      caption = "Basic summary of the odontoblast length by the vitamin C \
      delivery method and the vitamin C dose")
# Welch's t-test for the effect of the delivery method
suppTest <- t.test(len ~ supp, paired = FALSE, var.equal = FALSE, data = df)
# Pairwise Welch's t-tests between dose levels
dose1test <- t.test(len ~ dose, paired = FALSE, var.equal = FALSE,
                    data = df %>% filter(dose %in% c(0.5, 1)))
dose2test <- t.test(len ~ dose, paired = FALSE, var.equal = FALSE,
                    data = df %>% filter(dose %in% c(1, 2)))
dose3test <- t.test(len ~ dose, paired = FALSE, var.equal = FALSE,
                    data = df %>% filter(dose %in% c(0.5, 2)))
# Raw and Bonferroni-adjusted p values for the three comparisons
pvals <- c(dose1test$p.value, dose2test$p.value, dose3test$p.value)
padj <- p.adjust(pvals, method = "bonferroni")
# Format each confidence interval as "[lower,upper]"
confInts <- rbind(dose1test$conf.int, dose2test$conf.int, dose3test$conf.int)
confVect <- sapply(1:3, function(i){
    paste0(
        "[",
        round(confInts[i, 1], 2),
        ",",
        round(confInts[i, 2], 2),
        "]"
    )
})
doseTable <- data.frame(
    comparison = c("0.5 vs 1", "1 vs 2", "0.5 vs 2"),
    confidence.interval = confVect,
    p.value = pvals,
    adj.p.value = padj
)
names(doseTable) <- c("Comparison", "Confidence Interval", "P value", "Adjusted p value")
kable(doseTable, caption = "Effect of the vitamin C dose on the odontoblast length in \
      guinea pigs. For the purpose of the confidence intervals the second condition \
      in the comparison column was subtracted from the first condition")

Table 1. Basic summary of the odontoblast length by the vitamin C delivery method and the vitamin C dose.

| supp | dose | mean | median | sd | max | min | skew.2SE | kurt.2SE | normtest.W | normtest.p |
|---|---|---|---|---|---|---|---|---|---|---|
| OJ | 0.5 | 13.23 | 12.25 | 4.46 | 21.5 | 8.2 | 0.32 | -0.51 | 0.89 | 0.18 |
| OJ | 1.0 | 22.70 | 23.45 | 3.91 | 27.3 | 14.5 | -0.50 | -0.25 | 0.93 | 0.42 |
| OJ | 2.0 | 26.06 | 25.95 | 2.66 | 30.9 | 22.4 | 0.27 | -0.41 | 0.96 | 0.81 |
| VC | 0.5 | 7.98 | 7.15 | 2.75 | 11.5 | 4.2 | 0.10 | -0.68 | 0.89 | 0.17 |
| VC | 1.0 | 16.77 | 16.50 | 2.52 | 22.5 | 13.6 | 0.67 | 0.03 | 0.91 | 0.27 |
| VC | 2.0 | 26.14 | 25.95 | 4.80 | 33.9 | 18.5 | 0.12 | -0.46 | 0.97 | 0.92 |

Table 2. Effect of the vitamin C dose on the odontoblast length in guinea pigs. For the confidence intervals, the second condition in the Comparison column was subtracted from the first condition.

| Comparison | Confidence Interval | P value | Adjusted p value |
|---|---|---|---|
| 0.5 vs 1 | [-11.98,-6.28] | 1.00e-07 | 4.00e-07 |
| 1 vs 2 | [-9,-3.73] | 1.91e-05 | 5.72e-05 |
| 0.5 vs 2 | [-18.16,-12.83] | 0.00e+00 | 0.00e+00 |
Figure 1. Distributions of odontoblast length by vitamin C delivery method, by vitamin C dose, or by both. The upper left panel shows the distribution of odontoblast length grouped by vitamin C delivery method. The upper right panel shows the distribution of odontoblast length grouped by vitamin C dose. The bottom panel shows the distribution of odontoblast length grouped by vitamin C delivery method, with a separate plot for each vitamin C dose.