2025-03-16

Simple Linear Regression

Linear Regression is a tool used to model a dependent variable as a function of one or more independent variables, assuming that each independent variable has a linear relationship with the one dependent variable.

  • Although this assumption does not always hold, the linear model tends to be the simplest to fit, and its error and plot can be used to judge whether a linear relationship seems plausible or whether a nonlinear function is needed to describe the variables better.

Simple Linear Regression, however, is the special case that involves only one independent variable and one dependent variable.

  • This makes it one of the most basic tools used to measure how two variables are related to each other.
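For reference, the model that simple linear regression fits can be written as below. The symbols \(\beta_0\), \(\beta_1\), and \(\varepsilon_i\) are standard textbook notation introduced here for illustration; they are not used elsewhere in this write-up.

\[ y_i = \beta_0 + \beta_1 x_i + \varepsilon_i \]

Here \(y_i\) is the dependent variable, \(x_i\) is the single independent variable, \(\beta_0\) is the intercept, \(\beta_1\) is the slope, and \(\varepsilon_i\) is the error left over for observation \(i\).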

Simple Linear Regression on Puromycin

The Puromycin dataset measures the reaction velocity of an enzymatic reaction at different substrate concentrations.

The dataset is split based on whether the reaction’s state is “treated” or “untreated”.

I will be testing whether the reaction rate is linearly related to the concentration of the solution, both when it is treated and when it is not treated.

We will then see whether the treated or untreated state has any impact on the relationship between concentration and reaction rate.

Quick summary and example of data

summary(Puromycin)

head(Puromycin)

##       conc             rate             state   
##  Min.   :0.0200   Min.   : 47.0   treated  :12  
##  1st Qu.:0.0600   1st Qu.: 91.5   untreated:11  
##  Median :0.1100   Median :124.0                 
##  Mean   :0.3122   Mean   :126.8                 
##  3rd Qu.:0.5600   3rd Qu.:158.5                 
##  Max.   :1.1000   Max.   :207.0
##   conc rate   state
## 1 0.02   76 treated
## 2 0.02   47 treated
## 3 0.06   97 treated
## 4 0.06  107 treated
## 5 0.11  123 treated
## 6 0.11  139 treated

Treated Linear Regression

## 
## Call:
## lm(formula = rate ~ conc, data = Puromycin, subset = (state == 
##     "treated"))
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -58.696 -19.701   2.126  24.584  35.676 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   103.49      12.02   8.607 6.17e-06 ***
## conc          110.42      23.37   4.725 0.000811 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 30.9 on 10 degrees of freedom
## Multiple R-squared:  0.6906, Adjusted R-squared:  0.6597 
## F-statistic: 22.32 on 1 and 10 DF,  p-value: 0.0008106
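The summary above can be reproduced with a call like the one below. The formula, data, and subset come straight from the Call line in the output; the object name `fit_treated` is just a name chosen here for illustration.

```r
# Fit rate against conc using only the treated rows; this mirrors the
# Call shown in the summary output.
fit_treated <- lm(rate ~ conc, data = Puromycin, subset = (state == "treated"))

# Print the full summary (residuals, coefficients, R-squared, F-statistic).
summary(fit_treated)

# Pull out the individual pieces used to write the regression line.
coef(fit_treated)                    # intercept and slope
summary(fit_treated)$r.squared       # multiple R-squared
summary(fit_treated)$adj.r.squared   # adjusted R-squared
```

Swapping `"treated"` for `"untreated"` in the subset should give the second model in the same way.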

Treated Graph

Regression Line: Rate = 103.49 + 110.42*Conc; \(R^2\) = 0.6906 (adjusted \(R^2\) = 0.6597)

## `geom_smooth()` using formula = 'y ~ x'

Untreated Linear Regression

## 
## Call:
## lm(formula = rate ~ conc, data = Puromycin, subset = (state == 
##     "untreated"))
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -36.824 -14.111   2.135  18.722  25.308 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   86.038      8.764   9.817 4.17e-06 ***
## conc          89.338     20.729   4.310  0.00196 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 22 on 9 degrees of freedom
## Multiple R-squared:  0.6736, Adjusted R-squared:  0.6373 
## F-statistic: 18.57 on 1 and 9 DF,  p-value: 0.001963
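Each summary reports two \(R^2\) figures, and it is easy to mix them up. For a model with one predictor and \(n\) observations, the adjusted value follows from the multiple \(R^2\) by the standard formula below. The symbol \(R^2_{\text{adj}}\) is notation introduced here, and the results are approximate because the inputs are already rounded.

\[ R^2_{\text{adj}} = 1 - (1 - R^2)\,\frac{n - 1}{n - 2} \]

\[ \text{treated } (n = 12):\quad 1 - (1 - 0.6906)\,\frac{11}{10} \approx 0.6597 \]

\[ \text{untreated } (n = 11):\quad 1 - (1 - 0.6736)\,\frac{10}{9} \approx 0.6373 \]

Both results match the adjusted values printed in the two summaries, and the denominators match the residual degrees of freedom (10 and 9).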

Untreated Graph

Regression Line: Rate = 86.038 + 89.338*Conc; \(R^2\) = 0.6736 (adjusted \(R^2\) = 0.6373)

## `geom_smooth()` using formula = 'y ~ x'

Treated and Untreated graph together
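A combined plot along these lines could be drawn with ggplot2. This is only a sketch: the `geom_smooth()` message earlier in this write-up confirms the `y ~ x` formula, but `method = "lm"`, the colors by state, the axis labels, and the name `p_both` are assumptions made here.

```r
library(ggplot2)

# One scatter plot holding both groups, colored by state, with a separate
# least-squares line drawn for each group (assumed method = "lm").
p_both <- ggplot(Puromycin, aes(x = conc, y = rate, color = state)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  labs(x = "Concentration", y = "Rate", color = "State")

# Printing the object draws the plot.
p_both
```

Mapping `color = state` inside `aes()` is what makes `geom_smooth()` fit one line per group instead of a single line through all the points.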

Conclusion (no linear relationship)

Neither the untreated nor the treated solution shows a clearly linear relationship between the initial concentration and the corresponding reaction rate. Instead, the data look closer to a square root relationship:

\[ \text{rate} \propto \text{conc}^{1/2} \]

or in other words,

\[ \text{rate}^2 \propto \text{conc}. \]

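To show this more quantitatively, the \(R^2\) values of the linear regression models were too low. The multiple \(R^2\) was 0.6906 for the treated data and 0.6736 for the untreated data (adjusted: 0.6597 and 0.6373), so both remained in the range of 0.60–0.70. As a rule of thumb, a strong linear relationship is usually expected to give \(R^2 > 0.90\), so these results suggest that a straight line does not fit this dataset well, even though both slopes are statistically significant (p < 0.01).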
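One way to check the square root idea would be to fit both forms to each state and compare their \(R^2\) values. The sketch below is an assumption about how that check could be set up, not something that was run for this write-up; the helper `compare_fits` and the object `r2_by_state` are names made up here, and no result is claimed.

```r
# Return R-squared for a straight-line fit and a square-root fit
# on one subset of the data.
compare_fits <- function(d) {
  linear <- lm(rate ~ conc, data = d)
  root   <- lm(rate ~ sqrt(conc), data = d)
  c(linear_r2 = summary(linear)$r.squared,
    sqrt_r2   = summary(root)$r.squared)
}

# Split the data by state and apply the comparison to each part.
# The result is a matrix: one column per state, one row per model.
r2_by_state <- sapply(split(Puromycin, Puromycin$state), compare_fits)
r2_by_state
```

Both models have the same number of parameters, so their \(R^2\) values can be compared directly. A residual plot for each fit (for example `plot(fitted(m), resid(m))` on a fitted model `m`) would be a useful second check, since curvature in the residuals is more direct evidence of nonlinearity than \(R^2\) alone.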