Loading the data
data("ToothGrowth")
Summarizing the data
ToothGrowth$dose = factor(ToothGrowth$dose)
str(ToothGrowth)
## 'data.frame': 60 obs. of 3 variables:
## $ len : num 4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
## $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
## $ dose: Factor w/ 3 levels "0.5","1","2": 1 1 1 1 1 1 1 1 1 1 ...
Mean tooth growth in the guinea pigs differs noticeably between the two delivery methods of the supplement.
Average tooth length increases with each step up in dosage.
We observe a marked difference in mean tooth length between OJ and VC for guinea pigs given the 0.5 or 1 mg dose, but the difference almost vanishes at the 2 mg dose.
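The observations above can be checked numerically with a short summary. This is a minimal sketch: the names `tg`, `cell_means` and `dose_means` are illustrative and not part of the original analysis, and a local copy of the data is used so the `ToothGrowth` object from the earlier steps is left untouched.

```r
# Local copy so the document's own ToothGrowth object is not modified
tg <- datasets::ToothGrowth

# Mean tooth length for every supplement/dose combination
cell_means <- aggregate(len ~ supp + dose, data = tg, FUN = mean)
print(cell_means)

# Mean tooth length per dose, ignoring the delivery method
dose_means <- tapply(tg$len, tg$dose, mean)
print(dose_means)

# Box plots of tooth length by supplement within each dose
boxplot(len ~ supp + dose, data = tg,
        xlab = "Supplement and dose", ylab = "Tooth length")
```

The per-dose means should agree with the sample estimates reported by the t-tests later in this report.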
Consider the hypothesis that supplement OJ performs better than supplement VC. The null hypothesis is that the delivery method has no effect on mean tooth length.
## Subsetting the data
x = ToothGrowth$len
group = ToothGrowth$supp
## Performing the t-test (Welch two-sample; unpaired is the default)
t.test(x ~ group)
##
## Welch Two Sample t-test
##
## data: x by group
## t = 1.9153, df = 55.309, p-value = 0.06063
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -0.1710156 7.5710156
## sample estimates:
## mean in group OJ mean in group VC
## 20.66333 16.96333
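The p-value (0.06063) is greater than 0.05, so we fail to reject the null hypothesis at the 5% level. Consistent with this, the 95 percent confidence interval for the difference in means includes 0.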
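The decision can also be read directly from the test object rather than from the printed output. The sketch below assumes a significance level of 0.05 (matching the 95 percent confidence interval). `tg`, `alpha`, `res` and `reject_null` are illustrative names.

```r
# Local copy of the data; the supplement column has levels "OJ" and "VC"
tg <- datasets::ToothGrowth

alpha <- 0.05  # assumed significance level

# Welch two-sample t-test of tooth length by delivery method
res <- t.test(len ~ supp, data = tg)

# Reject the null hypothesis only when the p-value falls below alpha
reject_null <- res$p.value < alpha

cat("p-value:", signif(res$p.value, 4), "\n")
cat("95% CI:", round(res$conf.int, 3), "\n")
cat("Reject null at alpha =", alpha, "?", reject_null, "\n")
```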
Calculating the power with which we can reject the null hypothesis
n = length(x)
mu0 = mean(x[group=="OJ"])
mua = mean(x[group=="VC"])
sigma = sd(x[group=="VC"])
delta = (mua - mu0)/sigma
power.t.test(n = n, sd = sigma, delta = delta, type = "one.sample", alt = "one.sided")$power
## [1] 0.01972157
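The computed power is very low. This calculation should be read with caution: it treats the comparison as a one-sample, one-sided test, uses all 60 observations as n, and standardizes the difference twice (delta is already divided by sigma, and sd = sigma is passed again). In any case, the t-test did not reject the null hypothesis, so we have no evidence of a significant difference between the supplement delivery methods.
As the visualization shows, the two distributions overlap heavily.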
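For comparison, a power calculation that matches the two-sample design might look like the sketch below. It is not the calculation used above. It assumes equal variances in the two groups (which `power.t.test` requires, unlike the Welch test), a two-sided alternative and a 5% significance level, and it treats the observed difference in means as the true effect. The names `tg`, `oj`, `vc`, `sd_pooled` and `pw` are illustrative.

```r
# Local copy of the data
tg <- datasets::ToothGrowth

oj <- tg$len[tg$supp == "OJ"]
vc <- tg$len[tg$supp == "VC"]

# power.t.test expects the number of observations PER GROUP
n_per_group <- length(oj)

# Pooled standard deviation (simple average of variances; group sizes are equal)
sd_pooled <- sqrt((var(oj) + var(vc)) / 2)

# delta is the raw (unstandardized) difference in means; sd is passed separately
pw <- power.t.test(n = n_per_group,
                   delta = abs(mean(oj) - mean(vc)),
                   sd = sd_pooled,
                   sig.level = 0.05,
                   type = "two.sample",
                   alternative = "two.sided")$power
print(pw)
```

Power computed after the fact from the observed effect is closely tied to the p-value, so it adds little beyond the test itself. It is more useful for planning sample sizes.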
The supplement~dosage visualization also showed a considerable difference in tooth length across the dosage levels 0.5, 1 and 2.
Take as the null hypothesis that dosage level has no effect on tooth length.
Let us now check whether we get a significant p-value for each pair of dosage levels.
## Dose 0.5 vs dose 1
x = subset(ToothGrowth, dose %in% c(0.5, 1))$len
group = subset(ToothGrowth, dose %in% c(0.5, 1))$dose
t.test(x ~ group)
##
## Welch Two Sample t-test
##
## data: x by group
## t = -6.4766, df = 37.986, p-value = 1.268e-07
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -11.983781 -6.276219
## sample estimates:
## mean in group 0.5 mean in group 1
## 10.605 19.735
## Dose 1 vs dose 2
x = subset(ToothGrowth, dose %in% c(1, 2))$len
group = subset(ToothGrowth, dose %in% c(1, 2))$dose
t.test(x ~ group)
##
## Welch Two Sample t-test
##
## data: x by group
## t = -4.9005, df = 37.101, p-value = 1.906e-05
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -8.996481 -3.733519
## sample estimates:
## mean in group 1 mean in group 2
## 19.735 26.100
## Dose 0.5 vs dose 2
x = subset(ToothGrowth, dose %in% c(0.5, 2))$len
group = subset(ToothGrowth, dose %in% c(0.5, 2))$dose
t.test(x ~ group)
##
## Welch Two Sample t-test
##
## data: x by group
## t = -11.799, df = 36.883, p-value = 4.398e-14
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -18.15617 -12.83383
## sample estimates:
## mean in group 0.5 mean in group 2
## 10.605 26.100
In each case the p-value is far below 0.05, so we reject the null hypothesis that dosage level has no effect on tooth length.
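Because three pairwise tests were run, one optional check is to adjust the p-values for multiple comparisons. The sketch below applies a Bonferroni correction to the three p-values reported above. The choice of Bonferroni is an assumption (the original analysis does not adjust), and `p_raw` and `p_adj` are illustrative names.

```r
# p-values copied from the three Welch t-test outputs above
p_raw <- c("0.5 vs 1" = 1.268e-07,
           "1 vs 2"   = 1.906e-05,
           "0.5 vs 2" = 4.398e-14)

# Bonferroni: multiply each p-value by the number of tests (capped at 1)
p_adj <- p.adjust(p_raw, method = "bonferroni")
print(p_adj)

# TRUE if every comparison remains significant at the 5% level
print(all(p_adj < 0.05))
```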
Dosage level has a statistically significant effect on tooth growth: as dosage increases, mean tooth length increases.
The delivery method (VC or OJ) appears to play a much smaller role: the overall difference was not significant at the 5% level. OJ does show a higher tooth growth outcome at dosage levels 0.5 and 1, but the difference is negligible at dosage level 2.
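The per-dose comparison of delivery methods is based on the visualization only. A minimal sketch of how it could be tested is shown below: one Welch t-test of OJ against VC within each dosage level. This is an illustration rather than part of the original analysis, and `tg` and `p_by_dose` are illustrative names.

```r
# Local copy of the data; dose is numeric here (0.5, 1, 2)
tg <- datasets::ToothGrowth

# Split the data by dose and run a Welch t-test of len by supp in each subset
p_by_dose <- sapply(split(tg, tg$dose), function(d) {
  t.test(len ~ supp, data = d)$p.value
})

# One p-value per dosage level, named "0.5", "1" and "2"
print(signif(p_by_dose, 3))
```

Each subset holds only 10 animals per delivery method, so these tests are less sensitive than the overall comparisons.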