In this assignment we analyse the ToothGrowth data, a dataset that is part of the R datasets package. The response is the length of odontoblasts (cells responsible for tooth growth) in 60 guinea pigs, 10 at each combination of three dose levels of Vitamin C (0.5, 1, and 2 mg/day) and two delivery methods (orange juice or ascorbic acid). The assignment uses exploratory plots, confidence intervals, and t tests to analyse the data and draw inferences.
The R code used to answer Questions 1 to 3 of the exercise is given below.
#loading the data

library(datasets)
data(ToothGrowth)
#quick check of the data
head(ToothGrowth) 
##    len supp dose
## 1  4.2   VC  0.5
## 2 11.5   VC  0.5
## 3  7.3   VC  0.5
## 4  5.8   VC  0.5
## 5  6.4   VC  0.5
## 6 10.0   VC  0.5
# how many observations
str(ToothGrowth)
## 'data.frame':    60 obs. of  3 variables:
##  $ len : num  4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
##  $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
##  $ dose: num  0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...
#summary of data
summary(ToothGrowth)
##       len        supp         dose      
##  Min.   : 4.20   OJ:30   Min.   :0.500  
##  1st Qu.:13.07   VC:30   1st Qu.:0.500  
##  Median :19.25           Median :1.000  
##  Mean   :18.81           Mean   :1.167  
##  3rd Qu.:25.27           3rd Qu.:2.000  
##  Max.   :33.90           Max.   :2.000
ToothGrowth$dose <- as.factor(ToothGrowth$dose)
sd(ToothGrowth$len)
## [1] 7.649315

Question 1 Load the ToothGrowth data and perform some basic exploratory data analyses.

The output above shows 60 observations of three variables: len (tooth length), supp (a factor with two levels, OJ and VC), and dose. There are 10 observations for each combination of supplement (OJ, VC) and dose (0.5, 1.0, 2.0). The mean length is 18.81, with a minimum of 4.2 and a maximum of 33.9. The lengths are fairly widely spread (sd is 7.65), so we now explore the data further.
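The balance of the design and the quoted spread can be checked directly. The following is a minimal sketch, assuming a fresh copy of the packaged data so that dose is still numeric; the names `tg`, `counts`, and `len_sd` are illustrative and not part of the analysis above.

```r
# Local copy of the packaged data (illustrative name, assumption of this sketch)
tg <- datasets::ToothGrowth

# Cross-tabulate supplement by dose to confirm every cell has the same size
counts <- table(supp = tg$supp, dose = tg$dose)
print(counts)

# Overall standard deviation of tooth length, rounded to two decimals
len_sd <- round(sd(tg$len), 2)
print(len_sd)
```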
require(graphics)
coplot(len ~ dose | supp, data = ToothGrowth, panel = panel.smooth,
       xlab = "ToothGrowth data: length vs dose, given type of supplement") 

# Boxplot of Tooth Growth Against 2 Crossed Factors
# boxes colored for ease of interpretation 
boxplot(len~supp*dose, data=ToothGrowth, notch= FALSE, 
  col=(c("lightblue","darkblue")),
  main="Tooth Growth", xlab="Supplement and Dose")

Question 2 Provide a basic summary of the data

The plots above show that len increases with dose for both supplements. At doses 0.5 and 1.0, OJ tends to produce higher len values than VC, but with a larger spread. Conversely, at dose 2.0, VC has a larger spread and a higher maximum len than OJ.
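The visual impression from the plots can be backed by group-level numbers. As a rough sketch, the mean and standard deviation of len for each supplement and dose can be tabulated with `tapply`; the names `tg`, `group_mean`, and `group_sd` are illustrative.

```r
# Local copy of the packaged data (illustrative name, assumption of this sketch)
tg <- datasets::ToothGrowth

# Mean tooth length per supplement (rows) and dose (columns)
group_mean <- with(tg, tapply(len, list(supp, dose), mean))

# Standard deviation of tooth length for the same groups
group_sd <- with(tg, tapply(len, list(supp, dose), sd))

print(round(group_mean, 2))
print(round(group_sd, 2))
```

Rows are indexed by supplement and columns by dose, so for example `group_sd["VC", "2"]` is the spread for ascorbic acid at the highest dose.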

Question 3 Use confidence intervals and/or hypothesis tests to compare tooth growth by supp and dose.

Given the above analysis, we test the null hypothesis H0: the mean length at one dose level equals the mean length at another dose level, against the alternative H1: the mean length at one dose level differs from the mean length at the other. We use 95% confidence intervals from independent two-sample t tests.
Given the 3 dose levels (0.5, 1.0, 2.0), the t tests are performed in the following order: (1) dose 0.5 vs. dose 1.0; (2) dose 1.0 vs. dose 2.0; and (3) dose 0.5 vs. dose 2.0. Each interval is for the mean at the higher dose minus the mean at the lower dose.
mn05 <- mean(ToothGrowth$len[ToothGrowth$dose == 0.5])
sd05 <- sd(ToothGrowth$len[ToothGrowth$dose == 0.5])
n05 <- length(ToothGrowth$len[ToothGrowth$dose == 0.5])
        
mn1 <- mean(ToothGrowth$len[ToothGrowth$dose == 1.0])
sd1 <- sd(ToothGrowth$len[ToothGrowth$dose == 1.0])
n1 <- length(ToothGrowth$len[ToothGrowth$dose == 1.0])

mn2 <- mean(ToothGrowth$len[ToothGrowth$dose == 2.0])
sd2 <- sd(ToothGrowth$len[ToothGrowth$dose == 2.0])
n2 <- length(ToothGrowth$len[ToothGrowth$dose == 2.0])
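The per-dose means, standard deviations, and counts are the ingredients of a pooled-variance confidence interval. As a sketch of how they combine, the interval for dose 1.0 minus dose 0.5 can be built by hand and compared with what `t.test` returns under `var.equal = TRUE`. The names `tg`, `x`, `y`, `sp`, `manual_ci`, and `builtin_ci` are illustrative, and a fresh copy of the data is assumed.

```r
# Local copy of the packaged data (illustrative name, assumption of this sketch)
tg <- datasets::ToothGrowth

x <- tg$len[tg$dose == 1.0]  # lengths at the higher dose
y <- tg$len[tg$dose == 0.5]  # lengths at the lower dose
nx <- length(x)
ny <- length(y)

# Pooled standard deviation under the equal-variance assumption
sp <- sqrt(((nx - 1) * var(x) + (ny - 1) * var(y)) / (nx + ny - 2))

# 95% interval: difference in means +/- t quantile * standard error
manual_ci <- mean(x) - mean(y) +
  c(-1, 1) * qt(0.975, df = nx + ny - 2) * sp * sqrt(1 / nx + 1 / ny)
print(manual_ci)

# Same interval from t.test, for comparison
builtin_ci <- t.test(x, y, paired = FALSE, var.equal = TRUE)$conf.int
print(builtin_ci)
```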

t.test(ToothGrowth$len[ToothGrowth$dose == 1.0], ToothGrowth$len[ToothGrowth$dose == 0.5], paired = FALSE, var.equal = TRUE)$conf.int
## [1]  6.276252 11.983748
## attr(,"conf.level")
## [1] 0.95
t.test(ToothGrowth$len[ToothGrowth$dose == 2.0], ToothGrowth$len[ToothGrowth$dose == 1.0], paired = FALSE, var.equal = TRUE)$conf.int
## [1] 3.735613 8.994387
## attr(,"conf.level")
## [1] 0.95
t.test(ToothGrowth$len[ToothGrowth$dose == 2.0], ToothGrowth$len[ToothGrowth$dose == 0.5], paired = FALSE, var.equal = TRUE)$conf.int
## [1] 12.83648 18.15352
## attr(,"conf.level")
## [1] 0.95
None of the three 95% confidence intervals contains zero, so we reject H0 in each comparison. We conclude that mean tooth length increases with the dose of Vitamin C.
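The intervals above compare doses only. As a minimal sketch of the supplement comparison that Question 3 also asks for, the following runs one t test of OJ against VC within each dose level. It uses R's default Welch test, which does not assume equal variances; that is a choice made for this sketch rather than something taken from the analysis above. The names `tg` and `supp_tests` are illustrative.

```r
# Local copy of the packaged data (illustrative name, assumption of this sketch)
tg <- datasets::ToothGrowth

# One Welch two-sample t test (OJ minus VC) within each dose level
supp_tests <- lapply(split(tg, tg$dose), function(d) {
  tt <- t.test(len ~ supp, data = d)
  c(lower = tt$conf.int[1], upper = tt$conf.int[2], p.value = tt$p.value)
})

# Stack the results into a matrix with one row per dose
supp_tests <- do.call(rbind, supp_tests)
print(round(supp_tests, 4))
```

An interval that excludes zero at a given dose indicates a difference between delivery methods at that dose. These are the comparisons on which any statement about supplement type at a particular dose would rest.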

Question 4 State your conclusions and the assumptions needed for your conclusions.

Based on the sample data provided:
1. At lower dosages (0.5 mg and 1 mg), orange juice provides more tooth growth than ascorbic acid.
2. At the higher dosage (2 mg), tooth growth is not statistically different between the two supplement methods.
3. Regardless of the supplement method, dosage is a key factor in tooth growth.
4. These conclusions assume that the data are normally distributed and that the groups are independent with equal variances (the t tests used paired = FALSE and var.equal = TRUE). t tests were used given the relatively small sample size.
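The distributional assumptions can be given a rough check. The sketch below applies a Shapiro-Wilk test within each supplement-by-dose group and a Bartlett test of equal variances across the three dose groups. Neither test appears in the analysis above, and with only 10 animals per group they have limited power, so the output is a sanity check, not proof. The names `tg`, `normality_p`, and `variance_p` are illustrative.

```r
# Local copy of the packaged data (illustrative name, assumption of this sketch)
tg <- datasets::ToothGrowth

# Shapiro-Wilk p-value for each supplement (rows) and dose (columns) group
normality_p <- with(tg, tapply(len, list(supp, dose),
                               function(x) shapiro.test(x)$p.value))
print(round(normality_p, 3))

# Bartlett test of equal variances of len across the three dose groups
variance_p <- bartlett.test(tg$len, factor(tg$dose))$p.value
print(variance_p)
```

Small p-values would cast doubt on the corresponding assumption; large ones are merely consistent with it.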