An HTML version is available at RPubs: http://rpubs.com/svicente99/Inference_Peer_Assesment_2
In this project we analyse the ToothGrowth data set from the R datasets package. After a quick exploratory analysis of the data, we use confidence intervals and/or hypothesis tests to compare tooth growth by supplement and dosage. Lastly, we draw some conclusions.
The data object of this study
The data analysed are the ToothGrowth data provided as part of the R {datasets} package. Before beginning the analysis, it is necessary to clarify some details about the nature of the data, because the information provided in R is either misleading or incorrect. The data consist of measurements of the mean size of the odontoblast cells harvested from the incisor teeth of a population of 60 guinea pigs. These animals were divided into 6 groups of 10, and each group was fed a diet with one of 6 Vitamin C supplement regimes for a period of 42 days. The Vitamin C was administered either in the form of Orange Juice (OJ) or as chemically pure Vitamin C (VC) in aqueous solution. Within a group, every animal received the same daily dose of Vitamin C (0.5, 1.0 or 2.0 milligrams) throughout the period. Since each combination of supplement type and dose was given to 10 animals, the study required 60 animals in total. After 42 days, the animals were euthanized and their incisor teeth were harvested and analysed by optical microscopy to determine the length (in microns) of the odontoblast cells (the layer between the pulp and the dentine).
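The balanced design described above (6 groups of 10 animals) can be checked directly against the data. The following is a minimal sketch, assuming the ToothGrowth data frame shipped with the datasets package; the name `design` is illustrative and is not used elsewhere in this report.

```r
library(datasets)

# Cross-tabulate supplement type against dose: each cell counts the animals
# in one supplement/dose group.
design <- table(ToothGrowth$supp, ToothGrowth$dose)
print(design)

# The design is balanced if every cell holds the same number of animals (10).
print(all(design == 10))
```

A cross-tabulation is used here rather than a row count because it confirms both the total number of animals and their even split across the six regimes.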
Main Reference
Follow this link in JN - The Journal of Nutrition.
Nomenclature
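The ToothGrowth data set consists of 60 observations of 3 variables: len (the measured length), supp (the supplement type, OJ or VC) and dose (the daily dose of Vitamin C: 0.5, 1.0 or 2.0 mg).
Loading the ToothGrowth data to be processed: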
library(datasets)
data(ToothGrowth)
TG <- ToothGrowth
head(TG)
## len supp dose
## 1 4.2 VC 0.5
## 2 11.5 VC 0.5
## 3 7.3 VC 0.5
## 4 5.8 VC 0.5
## 5 6.4 VC 0.5
## 6 10.0 VC 0.5
We have 60 rows in total, one for each animal used in this study. A brief summary of these data, as produced by summary(TG):
## len supp dose
## Min. : 4.20 OJ:30 Min. :0.500
## 1st Qu.:13.07 VC:30 1st Qu.:0.500
## Median :19.25 Median :1.000
## Mean :18.81 Mean :1.167
## 3rd Qu.:25.27 3rd Qu.:2.000
## Max. :33.90 Max. :2.000
boxplot(len ~ supp * dose, data=TG, ylab="Tooth Length", main="Tooth Growth - BOX PLOTS")
box("outer", col="maroon", lwd=3)
Looking at these boxplots, we note that tooth length tends to increase with the dose of vitamin C (from 0.5 to 2.0 mg), whether it is supplied as Orange Juice (OJ) or as synthetic ascorbic acid (VC).
However, the difference between the supplementation methods is visually apparent only at the two lower doses (0.5 and 1.0 mg). At the highest dose (2.0 mg) the difference is almost imperceptible, with a mean length of roughly 26 microns for both supplements: the delivery method appears to have no effect, a presumption that ought to be validated by a hypothesis test.
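The visual impression from the boxplots can be put into numbers by computing the group means. This is a minimal sketch, assuming the ToothGrowth data frame from the datasets package; `group_means` and `mean_diff` are illustrative names introduced here, not objects from the original analysis.

```r
library(datasets)

# Mean tooth length per dose (rows) and supplement type (columns).
group_means <- tapply(ToothGrowth$len,
                      list(dose = ToothGrowth$dose, supp = ToothGrowth$supp),
                      mean)
print(round(group_means, 2))

# Difference in mean length, OJ minus VC, at each dose.
mean_diff <- group_means[, "OJ"] - group_means[, "VC"]
print(round(mean_diff, 2))
```

These differences are only descriptive. Whether they are statistically meaningful is a separate question that requires a test.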
To test this, we first subset the data by dose:
TG_0.5 <- TG[ TG$dose==0.5, ]
TG_1.0 <- TG[ TG$dose==1.0, ]
TG_2.0 <- TG[ TG$dose==2.0, ]
Now we can run a hypothesis test within each subset, comparing ‘len’ between the two supplement types.
## Welch two-sample t-tests (unequal variances). 'paired' is omitted: unpaired is
## the default, and recent R versions reject 'paired' in the formula interface.
Ttest_0.5 <- t.test(len ~ supp, var.equal = FALSE, data = TG_0.5)
Ttest_1.0 <- t.test(len ~ supp, var.equal = FALSE, data = TG_1.0)
Ttest_2.0 <- t.test(len ~ supp, var.equal = FALSE, data = TG_2.0)
The associated p-values are:
pValue0.5 <- Ttest_0.5$p.value ## 0.5 mg/day
pValue1.0 <- Ttest_1.0$p.value ## 1.0 mg/day
pValue2.0 <- Ttest_2.0$p.value ## 2.0 mg/day
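Since this report also relies on confidence intervals, it may help to see the p-value and the interval for each dose side by side. The sketch below reruns the same Welch test for every dose level directly on ToothGrowth rather than reusing the TG_* subsets; `doses` and `results` are illustrative names, and the interval refers to the difference in means OJ minus VC (the order of the factor levels of supp).

```r
library(datasets)

# Run a Welch two-sample t-test of len by supp within each dose level and
# collect the p-value and the 95% confidence interval for the OJ - VC difference.
doses <- sort(unique(ToothGrowth$dose))
results <- do.call(rbind, lapply(doses, function(d) {
  test <- t.test(len ~ supp, data = ToothGrowth[ToothGrowth$dose == d, ],
                 var.equal = FALSE)
  data.frame(dose = d,
             p_value = test$p.value,
             ci_lower = test$conf.int[1],
             ci_upper = test$conf.int[2])
}))
print(results)
```

An interval that excludes zero leads to the same decision as a p-value below 0.05, so the two columns can be read as a cross-check on each other.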
Decisions for these tests:
Null hypothesis (H0): there is no difference in mean tooth length between treatments OJ and VC.
Dose=0.5: p-value = 0.0063586 < 0.05 (5%) --> strong evidence against the null hypothesis
I.e., there is a difference between treatments OJ and VC [H0 REJECTED].
Dose=1.0: p-value = 0.0010384 < 0.05 (5%) --> strong evidence against the null hypothesis
I.e., there is a difference between treatments OJ and VC [H0 REJECTED].
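The decision rule applied above (reject the null hypothesis when the p-value falls below the 5% significance level) can be written as a small function. This is only a sketch: `decide` is a hypothetical helper introduced for illustration, and the p-values are the ones reported in the text for the 0.5 and 1.0 mg/day subsets.

```r
# Hypothetical helper: turns p-values into decisions at significance level alpha.
decide <- function(p_value, alpha = 0.05) {
  ifelse(p_value < alpha, "reject H0", "fail to reject H0")
}

# p-values reported above for the 0.5 and 1.0 mg/day subsets.
p_values <- c("0.5" = 0.0063586, "1.0" = 0.0010384)
decisions <- decide(p_values)
print(decisions)
```

Keeping alpha as an argument makes it easy to see how the decisions would change under a stricter threshold such as 0.01.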