Basic Inferential Data Analysis
- Load the ToothGrowth data and perform some basic exploratory data analyses
- Provide a basic summary of the data.
- Use confidence intervals and/or hypothesis tests to compare tooth growth by supp and dose.
- State your conclusions and the assumptions needed for your conclusions.

## Load the ToothGrowth data and perform some basic exploratory data analyses
Import Library
Load Data
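The code for these two steps is not shown. A minimal sketch, assuming that the library is ggplot2 (the later plotting code calls `ggplot`) and that the data frame is named `data` (the name used in the later code), could look like this:

```r
# Assumption: ggplot2 is the library imported here, since the plots use ggplot().
library(ggplot2)

# ToothGrowth ships with R's built-in 'datasets' package; copy it into `data`,
# the name the rest of the analysis refers to.
data <- datasets::ToothGrowth

# Quick structural check: column names, types and number of observations.
str(data)
```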
Summarize and explore the data
##    len supp dose
## 1  4.2   VC  0.5
## 2 11.5   VC  0.5
## 3  7.3   VC  0.5
## 4  5.8   VC  0.5
## 5  6.4   VC  0.5
## 6 10.0   VC  0.5
##       len        supp         dose
##  Min.   : 4.20   OJ:30   Min.   :0.500
##  1st Qu.:13.07   VC:30   1st Qu.:0.500
##  Median :19.25           Median :1.000
##  Mean   :18.81           Mean   :1.167
##  3rd Qu.:25.27           3rd Qu.:2.000
##  Max.   :33.90           Max.   :2.000
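The two outputs above have the shape of `head()` and `summary()` results. The calls themselves are not shown, so the following is only a sketch of how they could be produced. The cross-tabulation at the end is an addition, included to check how the observations are spread over the supplement and dose combinations.

```r
data <- datasets::ToothGrowth

head(data)     # first six rows
summary(data)  # per-column summary statistics

# Addition (not in the original analysis): observations per supp/dose cell
cell_counts <- table(data$supp, data$dose)
print(cell_counts)
```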
# Boxplot of tooth length for each dose, filled on a blue-to-red gradient by dose
b <- ggplot(data, aes(x = factor(dose), y = len, fill = dose)) +
  geom_boxplot(outlier.colour = "red", outlier.shape = 1, outlier.size = 4) +
  scale_fill_gradient(low = "blue", high = "red")
b + ggtitle("Plot of length by dose") + xlab("Dose") + ylab("length") +
  theme_dark() + theme(legend.position = "none")

# Boxplot of tooth length for each dose, split by supplement.
# supp is a discrete factor, so no colour gradient scale is applied here.
b <- ggplot(data, aes(x = factor(dose), y = len, fill = supp)) +
  geom_boxplot(outlier.colour = "red", outlier.shape = 1, outlier.size = 4)
b + ggtitle("Plot of length by dose and supplement") + xlab("Dose") +
  ylab("length") + theme_dark() + theme(legend.position = "bottom")

## Use confidence intervals and/or hypothesis tests to compare tooth growth by supp and dose.
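Hypothesis: for each comparison, the null hypothesis is that the two group means are equal; the alternative is that they differ (two-sided).
1. By supplement
First, divide the data into one group per supplement.
Then run a two-sample t-test at the 95% confidence level.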
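The split and the test call are not shown. The output below names the groups `group1` and `group2`, and its sample means suggest `group1` is OJ and `group2` is VC. A sketch under that assumption:

```r
data <- datasets::ToothGrowth

# Split tooth length by supplement. group1 = OJ and group2 = VC is an
# assumption, chosen to match the order of the sample means in the output.
group1 <- data$len[data$supp == "OJ"]
group2 <- data$len[data$supp == "VC"]

# t.test() defaults to a two-sided Welch test (unequal variances);
# conf.level = 0.95 is also the default and is spelled out for clarity.
supp_test <- t.test(group1, group2, conf.level = 0.95)
print(supp_test)
```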
##
## Welch Two Sample t-test
##
## data: group1 and group2
## t = 1.9153, df = 55.309, p-value = 0.06063
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -0.1710156 7.5710156
## sample estimates:
## mean of x mean of y
## 20.66333 16.96333
2. By dose
The dose takes three known values: 0.5, 1.0, and 2.0. We therefore run two tests on adjacent doses: 0.5 versus 1.0, and 1.0 versus 2.0.
group_dose1 <- data$len[data$dose == 0.5]
group_dose2 <- data$len[data$dose == 1.0]
group_dose3 <- data$len[data$dose == 2.0]
##
## Welch Two Sample t-test
##
## data: group_dose1 and group_dose2
## t = -6.4766, df = 37.986, p-value = 1.268e-07
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -11.983781 -6.276219
## sample estimates:
## mean of x mean of y
## 10.605 19.735
##
## Welch Two Sample t-test
##
## data: group_dose2 and group_dose3
## t = -4.9005, df = 37.101, p-value = 1.906e-05
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -8.996481 -3.733519
## sample estimates:
## mean of x mean of y
## 19.735 26.100
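The `t.test` calls behind the two outputs above are not shown. The sketch below reuses the group names from the text and gathers the confidence intervals and p-values into one table; the names `dose_test_low`, `dose_test_high` and `dose_results` are hypothetical.

```r
data <- datasets::ToothGrowth

group_dose1 <- data$len[data$dose == 0.5]
group_dose2 <- data$len[data$dose == 1.0]
group_dose3 <- data$len[data$dose == 2.0]

# Two-sided Welch t-tests on adjacent doses
dose_test_low  <- t.test(group_dose1, group_dose2)  # 0.5 versus 1.0
dose_test_high <- t.test(group_dose2, group_dose3)  # 1.0 versus 2.0

# Collect the 95% confidence intervals and p-values side by side
dose_results <- data.frame(
  comparison = c("0.5 vs 1.0", "1.0 vs 2.0"),
  ci_lower   = c(dose_test_low$conf.int[1], dose_test_high$conf.int[1]),
  ci_upper   = c(dose_test_low$conf.int[2], dose_test_high$conf.int[2]),
  p_value    = c(dose_test_low$p.value, dose_test_high$p.value)
)
print(dose_results)
```

Both intervals lie entirely below zero, which matches the reading that the mean length at the lower dose is smaller than at the higher dose in each pair.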
3. By supplement and dose
The dose takes the values 0.5, 1.0, and 2.0, and the supplement is either OJ or VC. We therefore run three tests, comparing OJ with VC at each dose.
group_dose_supp1 <- data$len[data$dose == 0.5 & data$supp == 'OJ']
group_dose_supp2 <- data$len[data$dose == 0.5 & data$supp == 'VC']
group_dose_supp3 <- data$len[data$dose == 1.0 & data$supp == 'OJ']
group_dose_supp4 <- data$len[data$dose == 1.0 & data$supp == 'VC']
group_dose_supp5 <- data$len[data$dose == 2.0 & data$supp == 'OJ']
group_dose_supp6 <- data$len[data$dose == 2.0 & data$supp == 'VC']
##
## Welch Two Sample t-test
##
## data: group_dose_supp1 and group_dose_supp2
## t = 3.1697, df = 14.969, p-value = 0.006359
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 1.719057 8.780943
## sample estimates:
## mean of x mean of y
## 13.23 7.98
##
## Welch Two Sample t-test
##
## data: group_dose_supp3 and group_dose_supp4
## t = 4.0328, df = 15.358, p-value = 0.001038
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 2.802148 9.057852
## sample estimates:
## mean of x mean of y
## 22.70 16.77
##
## Welch Two Sample t-test
##
## data: group_dose_supp5 and group_dose_supp5
## t = 0, df = 18, p-value = 1
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -2.494589 2.494589
## sample estimates:
## mean of x mean of y
## 26.06 26.06
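The last output above names `group_dose_supp5` on both sides, so it appears to compare the OJ group at dose 2.0 with itself; t = 0 and p = 1 then follow automatically and say nothing about VC. One way to avoid this kind of slip is to build both groups inside a single function, so each test always pairs OJ with VC at the same dose. The sketch below does this; `compare_supp_at_dose` and `supp_by_dose` are hypothetical names, not part of the original analysis.

```r
data <- datasets::ToothGrowth

# Hypothetical helper: Welch t-test of OJ (x) against VC (y) at one dose.
compare_supp_at_dose <- function(d) {
  oj <- data$len[data$dose == d & data$supp == "OJ"]
  vc <- data$len[data$dose == d & data$supp == "VC"]
  t.test(oj, vc)
}

doses <- c(0.5, 1.0, 2.0)
supp_by_dose <- lapply(doses, compare_supp_at_dose)
names(supp_by_dose) <- paste0("dose_", doses)  # "dose_0.5", "dose_1", "dose_2"

# p-value of each OJ-versus-VC comparison
print(sapply(supp_by_dose, function(test) test$p.value))

# Full output for the dose 2.0 comparison, the one affected by the slip
print(supp_by_dose[["dose_2"]])
```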
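## State your conclusions and the assumptions needed for your conclusions.
From the t-tests carried out, the following conclusions can be drawn:
- Pooled over all doses, the difference between OJ and VC is not significant at the 5% level (p = 0.06063; the 95% confidence interval contains 0).
- Tooth length increases with dose: both the 0.5 versus 1.0 and the 1.0 versus 2.0 comparisons are significant (p < 0.001 in each).
- At the 0.5 and 1.0 mg doses, OJ gives significantly greater length than VC (p = 0.006359 and p = 0.001038).
- At the 2.0 mg dose, the reported test names group_dose_supp5 on both sides, so it compares the OJ group with itself, and its t = 0, p = 1 result carries no information about VC. The claim that VC and OJ have the same effect on length at this dose needs a test of the OJ group against the VC group.

These conclusions assume that the observations are independent, that the groups are unpaired, and that tooth length is approximately normally distributed within each group. Welch's t-test does not assume equal variances.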