INTRODUCTION: The response is the length of odontoblasts (cells responsible for tooth growth) in 60 guinea pigs. Each animal received one of three dose levels of vitamin C (0.5, 1, or 2 mg/day) by one of two delivery methods: orange juice (OJ) or ascorbic acid (VC).

Dataset: ToothGrowth. The data frame has 60 observations on 3 variables:
1. len: tooth length (numeric)
2. supp: supplement type, VC or OJ (factor)
3. dose: dose in milligrams/day (numeric)

Hypothesis Testing: Is the tooth growth of the animal affected by the dose level and, at a given dose, by the delivery method (OJ or VC)?
1. Null hypothesis (H_0): The supplement delivery method does not affect tooth length.
2. Alternative hypothesis (H_a): The supplement delivery method affects tooth length.

Setting up the Environment: The analysis is written in R. The following packages need to be installed and loaded:

if (!require('knitr'))
{
  install.packages('knitr',repos="https://cran.us.r-project.org");
  library(knitr) 
  }
## Loading required package: knitr
if (!require('rmarkdown'))
{
  install.packages('rmarkdown',repos="https://cran.us.r-project.org");
  library(rmarkdown) 
}
## Loading required package: rmarkdown
## Warning: package 'rmarkdown' was built under R version 3.4.4
if (!require('tinytex'))
{
  install.packages('tinytex',repos="https://cran.us.r-project.org");
  library(tinytex)
}
## Loading required package: tinytex
## Warning: package 'tinytex' was built under R version 3.4.4
library(ggplot2)
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(rmarkdown)
library(datasets)  # provides the ToothGrowth data set
head(ToothGrowth)
##    len supp dose
## 1  4.2   VC  0.5
## 2 11.5   VC  0.5
## 3  7.3   VC  0.5
## 4  5.8   VC  0.5
## 5  6.4   VC  0.5
## 6 10.0   VC  0.5

Summary of the Dataset:

summary(ToothGrowth)
##       len        supp         dose      
##  Min.   : 4.20   OJ:30   Min.   :0.500  
##  1st Qu.:13.07   VC:30   1st Qu.:0.500  
##  Median :19.25           Median :1.000  
##  Mean   :18.81           Mean   :1.167  
##  3rd Qu.:25.27           3rd Qu.:2.000  
##  Max.   :33.90           Max.   :2.000
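
The summary lists the marginal counts for supp but not how the animals are spread across supplement and dose together. A minimal sketch of that check follows; the names `tg`, `design`, and `balanced` are illustrative and not part of the original analysis, and the code works on a fresh copy of the built-in data set.

```r
# Assumption: use a fresh copy of the built-in data set so this check
# does not depend on any transformation made elsewhere in the report.
tg <- datasets::ToothGrowth

# Cross-tabulate supplement by dose; each cell counts the animals in that group.
design <- table(supp = tg$supp, dose = tg$dose)
print(design)

# TRUE when every supplement/dose combination holds the same number of animals.
balanced <- length(unique(as.vector(design))) == 1
print(balanced)
```

A balanced layout like this is what makes the later group-wise comparisons straightforward to set up.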

Exploratory Data Analysis: This approach analyzes the data set and summarizes its main characteristics with visual methods. Here the variables are dose, len, and supp.

Transform the data type of dose (from numeric to factor): dose is stored as numeric but takes only three values (0.5, 1, and 2), so we convert it to a factor and treat it as a categorical variable.

ToothGrowth$dose <- as.factor(ToothGrowth$dose)
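
One pitfall of this conversion is worth a short illustration: once dose is a factor, as.numeric() returns the internal level codes rather than the dose values. The sketch below works on a copy named `tg` (an illustrative name, not from the original code) so it can be run on its own.

```r
# Work on a copy so the sketch is independent of the rest of the report.
tg <- datasets::ToothGrowth
tg$dose <- as.factor(tg$dose)

# The factor has one level per distinct dose.
print(levels(tg$dose))

# as.numeric() on a factor gives the level codes (1, 2, 3), not the doses.
codes <- as.numeric(tg$dose)

# Going through as.character() first recovers the original dose values.
doses <- as.numeric(as.character(tg$dose))

print(unique(codes))
print(unique(doses))
```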

Plot of mean tooth length by dose: Here len (tooth length) is the dependent variable and dose is the independent variable.

tg_aggr <- aggregate(len ~ dose, data = ToothGrowth, mean)
plot(tg_aggr,xlab = 'Dose', ylab = "Length",main = "Average growth of tooth per dose", type = "b", col = "purple")

The plot shows that average tooth length increases with the dose of vitamin C given to the animal. Because it averages over both delivery methods, it does not show whether, or by how much, the supplement type affects growth.
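
To see the supplement alongside the dose, the group means can be tabulated by both variables. This is a sketch only; `tg`, `wide`, and `gap` are illustrative names that do not appear in the original analysis.

```r
# Fresh copy of the built-in data set (dose left as numeric here).
tg <- datasets::ToothGrowth

# Mean tooth length for every dose (rows) and supplement (columns).
wide <- tapply(tg$len, list(dose = tg$dose, supp = tg$supp), mean)
print(round(wide, 2))

# Difference in mean length, OJ minus VC, at each dose level.
gap <- wide[, "OJ"] - wide[, "VC"]
print(round(gap, 2))
```

The same table can be drawn with `interaction.plot(tg$dose, tg$supp, tg$len)` from the stats package, which plots one line of mean length per supplement.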

Hypothesis Testing:

This test compares tooth growth between the two supplements to assess whether the delivery method affects growth.

t.test(len ~ supp, data = ToothGrowth)
## 
##  Welch Two Sample t-test
## 
## data:  len by supp
## t = 1.9153, df = 55.309, p-value = 0.06063
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.1710156  7.5710156
## sample estimates:
## mean in group OJ mean in group VC 
##         20.66333         16.96333
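
For readers who want to see where these numbers come from, the Welch statistic, its degrees of freedom, and the two-sided p-value can be reproduced by hand. This is a sketch for illustration; the variable names (`oj`, `vc`, `t_stat`, `df`, `p_value`) are assumptions of this example and are not used elsewhere in the report.

```r
# Fresh copy of the built-in data set.
tg <- datasets::ToothGrowth
oj <- tg$len[tg$supp == "OJ"]
vc <- tg$len[tg$supp == "VC"]

# Squared standard error of each group mean.
se2_oj <- var(oj) / length(oj)
se2_vc <- var(vc) / length(vc)

# Welch t statistic: difference in means over the combined standard error.
t_stat <- (mean(oj) - mean(vc)) / sqrt(se2_oj + se2_vc)

# Welch-Satterthwaite approximation to the degrees of freedom.
df <- (se2_oj + se2_vc)^2 /
  (se2_oj^2 / (length(oj) - 1) + se2_vc^2 / (length(vc) - 1))

# Two-sided p-value from the t distribution.
p_value <- 2 * pt(-abs(t_stat), df)

print(c(t = t_stat, df = df, p = p_value))
```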

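The t-test gives a p-value of 0.06063. Because the p-value is greater than 0.05 (equivalently, the 95% confidence interval contains zero), we fail to reject the null hypothesis at the 5% level: pooled over all doses, the data do not show a significant difference between the supplements.

Comparison of tooth growth by dose: This data set has 6 groups (3 doses and 2 supplements). The tests below compare the two supplements within each dose. Note that they are run as paired tests (pair = TRUE), although each animal received only one supplement, so the OJ and VC groups are independent samples; the reported p-values should be read with that in mind.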

tg_dose1 <- filter(ToothGrowth,dose == 0.5)
tg_dose2 <- filter(ToothGrowth,dose == 1.0)
tg_dose3 <- filter(ToothGrowth,dose == 2.0)
t.test(len ~ supp, data = tg_dose1, pair = TRUE)
## 
##  Paired t-test
## 
## data:  len by supp
## t = 2.9791, df = 9, p-value = 0.01547
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  1.263458 9.236542
## sample estimates:
## mean of the differences 
##                    5.25
t.test(len ~ supp, data = tg_dose2, pair = TRUE)
## 
##  Paired t-test
## 
## data:  len by supp
## t = 3.3721, df = 9, p-value = 0.008229
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  1.951911 9.908089
## sample estimates:
## mean of the differences 
##                    5.93
t.test(len ~ supp, data = tg_dose3, pair = TRUE)
## 
##  Paired t-test
## 
## data:  len by supp
## t = -0.042592, df = 9, p-value = 0.967
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -4.328976  4.168976
## sample estimates:
## mean of the differences 
##                   -0.08
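
Since each guinea pig received only one supplement, the OJ and VC groups within a dose are independent samples. As an alternative to the paired calls above, an unpaired Welch test per dose can be sketched as follows; `tg`, `by_dose`, `welch`, and `p_values` are illustrative names, and this is not the analysis the report itself ran.

```r
# Fresh copy of the built-in data set (dose left as numeric here).
tg <- datasets::ToothGrowth

# One data frame per dose level.
by_dose <- split(tg, tg$dose)

# Unpaired Welch t-test of len by supp within each dose
# (t.test defaults to the unpaired, unequal-variance test).
welch <- lapply(by_dose, function(d) t.test(len ~ supp, data = d))

# Collect the two-sided p-values, named by dose.
p_values <- sapply(welch, function(res) res$p.value)
print(round(p_values, 4))
```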

Linear Regression Analysis:

Model1

model1 <- lm(len ~ supp, data = ToothGrowth)
summary(model1)
## 
## Call:
## lm(formula = len ~ supp, data = ToothGrowth)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -12.7633  -5.7633   0.4367   5.5867  16.9367 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   20.663      1.366  15.127   <2e-16 ***
## suppVC        -3.700      1.932  -1.915   0.0604 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 7.482 on 58 degrees of freedom
## Multiple R-squared:  0.05948,    Adjusted R-squared:  0.04327 
## F-statistic: 3.668 on 1 and 58 DF,  p-value: 0.06039

Model2

model2 <- lm(len ~ supp + dose, data = ToothGrowth)
summary(model2)
## 
## Call:
## lm(formula = len ~ supp + dose, data = ToothGrowth)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -7.085 -2.751 -0.800  2.446  9.650 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  12.4550     0.9883  12.603  < 2e-16 ***
## suppVC       -3.7000     0.9883  -3.744 0.000429 ***
## dose1         9.1300     1.2104   7.543 4.38e-10 ***
## dose2        15.4950     1.2104  12.802  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.828 on 56 degrees of freedom
## Multiple R-squared:  0.7623, Adjusted R-squared:  0.7496 
## F-statistic: 59.88 on 3 and 56 DF,  p-value: < 2.2e-16
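
The two models are nested, so one way to compare them formally is an F-test with anova(). The sketch below refits both models on a copy of the data (named `tg` here, an assumption of this example) and also fits a hypothetical third model with a supplement-by-dose interaction, which the original analysis does not include.

```r
# Fresh copy with dose as a factor, matching the models in this report.
tg <- datasets::ToothGrowth
tg$dose <- as.factor(tg$dose)

model1 <- lm(len ~ supp, data = tg)
model2 <- lm(len ~ supp + dose, data = tg)

# F-test: does adding dose improve on the supplement-only model?
cmp <- anova(model1, model2)
print(cmp)

# Hypothetical extension (not in the original report): let the supplement
# effect differ by dose, then test the interaction against Model2.
model3 <- lm(len ~ supp * dose, data = tg)
print(anova(model2, model3))
```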

Conclusion :

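1. Dosage influences tooth length: the higher the dose, the longer the teeth.
2. Pooled over all doses, the effect of the supplement delivery method is not significant at the 5% level (p = 0.06). Within doses, OJ is associated with longer teeth than VC at 0.5 and 1 mg/day, while at 2 mg/day there is no detectable difference. In Model2, which adjusts for dose, the supp coefficient is significant (p = 0.000429).
3. These conclusions assume that the sample is representative of the population.
4. They also assume that the sample means are approximately normally distributed, as the central limit theorem suggests.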
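
The distributional assumption can be given a rough check on the residuals of the dose-adjusted model. The sketch below uses shapiro.test() from the stats package; the names `tg`, `fit`, and `res_check` are illustrative, and this check is an addition for illustration rather than part of the original analysis.

```r
# Fresh copy with dose as a factor, as in Model2.
tg <- datasets::ToothGrowth
tg$dose <- as.factor(tg$dose)

# Refit the additive model and test its residuals for normality.
fit <- lm(len ~ supp + dose, data = tg)
res_check <- shapiro.test(residuals(fit))
print(res_check)

# A small p-value would be evidence against normally distributed residuals;
# a large one does not prove normality, it only fails to contradict it.
```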