2026-03-06

Introduction

Linear regression is a statistical method for modeling how a response variable changes with a predictor variable.

In this study, we use R and the ToothGrowth dataset to examine how vitamin C dose relates to tooth growth in guinea pigs.

What is Linear Regression?

Simple linear regression models the relationship between a predictor variable and a response variable as a straight line.

The model estimates how changes in one variable are associated with changes in another variable.

Linear Regression Model

\[ y = \beta_0 + \beta_1 x + \epsilon \]

Where:

  • \(y\) is the response variable
  • \(x\) is the predictor variable
  • \(\beta_0\) is the intercept
  • \(\beta_1\) is the slope
  • \(\epsilon\) represents random error
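
To make the roles of these terms concrete, the following sketch simulates data from the model and fits it with lm(). The true intercept (2), slope (3), noise level, and sample size are arbitrary values chosen for illustration and are not taken from the ToothGrowth analysis.

```r
# Simulate data from y = beta0 + beta1 * x + epsilon and recover the coefficients.
# All numeric values here are illustrative assumptions.
set.seed(1)
n     <- 500
beta0 <- 2                             # assumed true intercept
beta1 <- 3                             # assumed true slope
x     <- runif(n, min = 0, max = 10)   # predictor values
eps   <- rnorm(n, mean = 0, sd = 1)    # random error term
y     <- beta0 + beta1 * x + eps       # response generated by the model

sim_fit  <- lm(y ~ x)
sim_coef <- coef(sim_fit)              # estimated (intercept, slope)
print(sim_coef)
```

With this much data and modest noise, the estimates should land close to the assumed true values, though they will not match them exactly.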

Prediction Equation

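\[ \hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x \]

This equation predicts the value of the response variable from the predictor variable, where \(\hat{\beta}_0\) and \(\hat{\beta}_1\) are the intercept and slope estimated from the data.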
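
For reference, the ordinary least squares estimates of the intercept and slope, which are what R's lm() computes, can be written in closed form. Here \(\bar{x}\) and \(\bar{y}\) denote the sample means and \(n\) the number of observations; this notation is introduced here and does not appear elsewhere in the slides.

\[ \hat{\beta}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}, \qquad \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} \]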

In this analysis:

  • \(y\) = tooth length
  • \(x\) = vitamin C dose

The ToothGrowth Dataset

The ToothGrowth dataset measures tooth length in guinea pigs receiving vitamin C.

Variables include:

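  • len (tooth length)
  • dose (vitamin C dose: 0.5, 1, or 2 mg/day)
  • supp (supplement type: OJ = orange juice, VC = ascorbic acid)

Structure of the data, as printed by str(ToothGrowth):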
## 'data.frame':    60 obs. of  3 variables:
##  $ len : num  4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
##  $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
##  $ dose: num  0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...
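
The structure shown above can be reproduced, and the experimental design checked, with a few base R calls. This is a minimal sketch; the object name design is introduced here for illustration.

```r
# ToothGrowth ships with base R (package 'datasets'), so no download is needed.
data(ToothGrowth)

str(ToothGrowth)   # compact summary of the three columns

# Cross-tabulate dose by supplement type to see how many animals are in each group.
design <- table(ToothGrowth$dose, ToothGrowth$supp)
print(design)
```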

Tooth Length vs Vitamin C Dose
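
A scatter plot of this kind can be sketched in base R as follows. The plotting package, axis labels, and styling used in the original figure are not known, so the choices below are assumptions, as is the helper name mean_by_dose.

```r
# Scatter plot of tooth length against dose (base graphics; styling is assumed).
plot(len ~ dose, data = ToothGrowth,
     xlab = "Vitamin C dose (mg/day)",
     ylab = "Tooth length",
     main = "Tooth Length vs Vitamin C Dose",
     pch = 19, col = "steelblue")

# Mean tooth length at each dose level, as a numeric companion to the plot.
mean_by_dose <- tapply(ToothGrowth$len, ToothGrowth$dose, mean)
print(mean_by_dose)
```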

Linear Regression Fit
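
The fitted line can be overlaid on a scatter plot with abline(), which accepts an lm object for a single-predictor model. This is a base R sketch under assumed styling, and the name fit is introduced here for illustration.

```r
# Fit the simple linear regression and draw the fitted line over the data.
fit <- lm(len ~ dose, data = ToothGrowth)

plot(len ~ dose, data = ToothGrowth,
     xlab = "Vitamin C dose (mg/day)",
     ylab = "Tooth length",
     main = "Linear Regression Fit",
     pch = 19, col = "grey40")
abline(fit, col = "firebrick", lwd = 2)   # line uses the intercept and slope from fit
```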

Interactive Plot

[Interactive figure not reproduced in this text version.]

Linear Regression Model

model <- lm(len ~ dose, data = ToothGrowth)
summary(model)
## 
## Call:
## lm(formula = len ~ dose, data = ToothGrowth)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -8.4496 -2.7406 -0.7452  2.8344 10.1139 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   7.4225     1.2601    5.89 2.06e-07 ***
## dose          9.7636     0.9525   10.25 1.23e-14 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.601 on 58 degrees of freedom
## Multiple R-squared:  0.6443, Adjusted R-squared:  0.6382 
## F-statistic: 105.1 on 1 and 58 DF,  p-value: 1.233e-14
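
Reading the output: the estimated slope of 9.7636 means that each additional mg/day of vitamin C is associated with an increase of about 9.76 in predicted tooth length, and the R-squared of 0.6443 means that dose accounts for roughly 64% of the variance in tooth length. The sketch below shows one way to extract these quantities and obtain predictions; the names new_doses, predicted, slope_ci, and r_squared are introduced here for illustration.

```r
model <- lm(len ~ dose, data = ToothGrowth)

# Predicted tooth length at the three doses used in the experiment.
new_doses <- data.frame(dose = c(0.5, 1, 2))
predicted <- predict(model, newdata = new_doses)
print(predicted)

# 95% confidence interval for the slope.
slope_ci <- confint(model, "dose", level = 0.95)
print(slope_ci)

# Proportion of variance in len explained by dose.
r_squared <- summary(model)$r.squared
print(r_squared)
```

Predictions are most trustworthy inside the observed dose range of 0.5 to 2 mg/day; the straight-line form is not guaranteed to hold beyond it.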

Conclusion

Linear regression provides a simple way to analyze relationships between variables.

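Using the ToothGrowth dataset, the fitted model estimates that each additional mg/day of vitamin C is associated with about 9.76 more units of tooth length (p < 0.001), with dose explaining about 64% of the variance. The results suggest that, within the 0.5 to 2 mg/day range studied, higher vitamin C doses are associated with increased tooth growth in guinea pigs.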