Farzad Ravari
May 8, 2017
Basic Inferential Data Analysis
Analysis of the ToothGrowth data in the R datasets package.
1. Load the ToothGrowth data and perform some basic exploratory data analyses.
2. Provide a basic summary of the data.
3. Use confidence intervals and/or hypothesis tests to compare tooth growth by supp and dose. (Only use the techniques from class, even if there are other approaches worth considering.)
4. State your conclusions and the assumptions needed for your conclusions.
Load necessary libraries
library(ggplot2)
library(graphics)
Load Tooth Growth Data
data("ToothGrowth")
summary(ToothGrowth)
len supp dose
Min. : 4.20 OJ:30 Min. :0.500
1st Qu.:13.07 VC:30 1st Qu.:0.500
Median :19.25 Median :1.000
Mean :18.81 Mean :1.167
3rd Qu.:25.27 3rd Qu.:2.000
Max. :33.90 Max. :2.000
Evaluate first few rows
head(ToothGrowth)
len supp dose
1 4.2 VC 0.5
2 11.5 VC 0.5
3 7.3 VC 0.5
4 5.8 VC 0.5
5 6.4 VC 0.5
6 10.0 VC 0.5
Data details
str(ToothGrowth)
'data.frame': 60 obs. of 3 variables:
$ len : num 4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
$ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
$ dose: Factor w/ 3 levels "0.5","1","2": 1 1 1 1 1 1 1 1 1 1 ...
Arrangement of Variables
ToothGrowth$dose<-as.factor(ToothGrowth$dose)
ggplot(aes(x = supp, y = len), data = ToothGrowth) + geom_boxplot(aes(fill = supp))
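As a complement to the boxplot, the group means can be tabulated by supplement and dose. This is a minimal sketch using base R's aggregate(); the object name group_means is an illustrative choice, not part of the original analysis.

```r
# Reload the data so the example is self-contained
data("ToothGrowth")

# Mean tooth length for each supplement/dose combination
# (2 supplements x 3 doses = 6 rows)
group_means <- aggregate(len ~ supp + dose, data = ToothGrowth, FUN = mean)
print(group_means)
```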
T-test to evaluate Tooth Growth by Supplement
t.test(ToothGrowth$len[ToothGrowth$supp == "OJ"], ToothGrowth$len[ToothGrowth$supp == "VC"])
t = 1.9153, df = 55.309, p-value = 0.06063
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval: -0.1710156 7.5710156
sample estimates:
mean of x mean of y
20.66333 16.96333
At the 95% confidence level (significance level 0.05), the p-value is 0.061 and the confidence interval (-0.171, 7.571) includes zero, so we fail to reject the null hypothesis: there is no statistically significant difference in mean tooth length between the two supplements, “OJ” and “VC”.
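The same comparison can be written with the formula interface of t.test(), and the "interval includes zero" check can be made explicit in code. This is a sketch only; the names supp_test, ci, and includes_zero are illustrative. By default t.test() runs the Welch test, which does not assume equal variances in the two groups but does assume independent, roughly normal samples.

```r
data("ToothGrowth")

# Welch two-sample t-test of length by supplement (var.equal = FALSE is the default)
supp_test <- t.test(len ~ supp, data = ToothGrowth)

# Extract the 95% confidence interval for the difference in means (OJ minus VC)
ci <- supp_test$conf.int

# TRUE when the interval straddles zero, i.e. no significant difference at the 5% level
includes_zero <- ci[1] < 0 && ci[2] > 0
print(includes_zero)
```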
Compare dose 0.5 vs 1
t.test(ToothGrowth$len[ToothGrowth$dose == 0.5], ToothGrowth$len[ToothGrowth$dose == 1])
t = -6.4766, df = 37.986, p-value = 1.268e-07
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval: -11.983781 -6.276219
sample estimates:
mean of x mean of y
10.605 19.735
Compare dose 0.5 vs 2
t.test(ToothGrowth$len[ToothGrowth$dose == 0.5], ToothGrowth$len[ToothGrowth$dose == 2])
t = -11.799, df = 36.883, p-value = 4.398e-14
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval: -18.15617 -12.83383
sample estimates:
mean of x mean of y
10.605 26.100
Compare dose 1 vs 2
t.test(ToothGrowth$len[ToothGrowth$dose == 1], ToothGrowth$len[ToothGrowth$dose == 2])
t = -4.9005, df = 37.101, p-value = 1.906e-05
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval: -8.996481 -3.733519
sample estimates:
mean of x mean of y
19.735 26.100
The p-values of these three t-tests are all far below 0.05 and none of the confidence intervals includes zero, so the mean tooth length differs between every pair of doses. This shows that the dose of vitamin C affects tooth length: higher doses are associated with longer teeth.
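Because three pairwise tests are run on the same data, one optional sanity check is to adjust the reported p-values for multiple comparisons. The sketch below applies a Bonferroni correction with p.adjust() to the p-values printed above. This step is not part of the original analysis and may go beyond the techniques covered in class, and the names p_raw and p_bonf are illustrative.

```r
# p-values copied from the three dose comparisons reported above
p_raw <- c("0.5 vs 1" = 1.268e-07,
           "0.5 vs 2" = 4.398e-14,
           "1 vs 2"   = 1.906e-05)

# Bonferroni adjustment: each p-value is multiplied by the number of tests (capped at 1)
p_bonf <- p.adjust(p_raw, method = "bonferroni")

# All comparisons remain significant at the 5% level after adjustment
print(p_bonf < 0.05)
```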
ggplot(aes(x = dose, y = len), data = ToothGrowth) + geom_boxplot(aes(fill = supp))
According to the boxplots above, at low doses vitamin C (VC) is associated with less tooth growth than orange juice (OJ). At higher doses the effect of vitamin C increases, and orange juice shows the same increase with dose. Therefore, where possible, orange juice is preferable to vitamin C supplements; in regions where orange juice is not an option, for example because of its price, a higher dose of vitamin C is the better choice for tooth growth.
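The reading of the boxplots can be checked numerically by comparing the two supplements within each dose level. The following is a minimal sketch using the same Welch t-test as in the earlier sections; the names doses and p_by_dose are illustrative, and the per-dose tests are an addition rather than part of the original report.

```r
data("ToothGrowth")

# Dose levels as numbers (works whether dose is stored as numeric or factor)
doses <- sort(unique(as.numeric(as.character(ToothGrowth$dose))))

# For each dose, test OJ vs VC and keep the p-value
p_by_dose <- sapply(doses, function(d) {
  t.test(len ~ supp, data = ToothGrowth[ToothGrowth$dose == d, ])$p.value
})
names(p_by_dose) <- doses

print(p_by_dose)
```

A small p-value at a given dose indicates that the supplements differ at that dose; a large one indicates no detectable difference.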