library(gtsummary)
library(dplyr)
head(Formaldehyde)
require(stats); require(graphics)
plot(optden ~ carb, data = Formaldehyde,
     xlab = "Carbohydrate (ml)", ylab = "Optical Density",
     main = "Formaldehyde data", col = 4, las = 1)
abline(fm1 <- lm(optden ~ carb, data = Formaldehyde))
summary(fm1)
Call:
lm(formula = optden ~ carb, data = Formaldehyde)

Residuals:
        1         2         3         4         5         6
-0.006714  0.001029  0.002771  0.007143  0.007514 -0.011743

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.005086   0.007834   0.649    0.552
carb        0.876286   0.013535  64.744 3.41e-07 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.008649 on 4 degrees of freedom
Multiple R-squared:  0.999,  Adjusted R-squared:  0.9988
F-statistic:  4192 on 1 and 4 DF,  p-value: 3.409e-07
opar <- par(mfrow = c(2, 2), oma = c(0, 0, 1.1, 0))
plot(fm1)
par(opar)
Interpretation
Step-by-step Interpretation
🔹 1. Inspecting the Data
head(Formaldehyde)
This displays the first few rows of the dataset. The Formaldehyde dataset contains two variables:
carb: Carbohydrate concentration in mL
optden: Optical density
Example:
  carb optden
1  0.1  0.086
2  0.3  0.269
3  0.5  0.446
…
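To look beyond the first few rows, the structure and range of the data can be checked directly. This is a minimal sketch using only base R; the variable name `carb_range` is introduced here for illustration and is not part of the original analysis.

```r
# Formaldehyde is a built-in data frame from the datasets package.
data(Formaldehyde)

# Dimensions and column types of the data frame.
str(Formaldehyde)

# Minimum, quartiles, mean, and maximum for each column.
summary(Formaldehyde)

# Range of the predictor (illustrative helper variable);
# useful later when judging whether a prediction is an extrapolation.
carb_range <- range(Formaldehyde$carb)
carb_range
```

Knowing the observed range of carb matters when reading the intercept later, since carb = 0 lies outside the measured values.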
🔹 2. Plotting the Data and Fitting a Linear Model
plot(optden ~ carb, data = Formaldehyde,
     xlab = "Carbohydrate (ml)", ylab = "Optical Density",
     main = "Formaldehyde data", col = 4, las = 1)
abline(fm1 <- lm(optden ~ carb, data = Formaldehyde))
The first call draws a scatter plot of optden against carb.
The second call fits the linear regression optden ~ carb with lm(), stores the model as fm1, and overlays the fitted line on the plot with abline().
Seeing the points against the line shows how well a straight line fits the data.
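As a numeric companion to the visual check, the strength of the linear association can be summarised with the Pearson correlation. This is a small sketch, not part of the original analysis; the name `r` is chosen here for illustration.

```r
# Pearson correlation between carbohydrate and optical density
# (r is an illustrative variable name).
r <- with(Formaldehyde, cor(carb, optden))
r

# With a single predictor, the squared correlation equals the
# R-squared reported by summary() for the linear model.
r^2
```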
🔹 3. Model Summary
summary(fm1)
This gives detailed information about the linear model. The key part of the output is:
Call:
lm(formula = optden ~ carb, data = Formaldehyde)
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.005086   0.007834   0.649    0.552
carb        0.876286   0.013535  64.744 3.41e-07 ***
Interpretation:
Model Equation:
optden = 0.005086 + 0.876286 × carb
Intercept: 0.005086, the predicted optical density when carb is 0. Its p-value (0.552) means it is not statistically distinguishable from zero.
Slope: 0.876286. For every 1 mL increase in carbohydrate, optical density increases by approximately 0.876 units.
Significance (Pr(>|t|)): The p-value for carb is extremely low (3.41e-07), indicating the relationship is statistically significant.
R-squared: Multiple R-squared is 0.999 (adjusted 0.9988), so the fitted line accounts for about 99.9% of the variation in optical density.
A value close to 1 indicates a good fit.
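The quantities read off the printed summary can also be extracted programmatically, which avoids transcription mistakes. The sketch below refits the model so it runs on its own; `s`, `new_obs`, and `pred` are illustrative names, and carb = 0.5 is an arbitrary value picked from inside the observed range.

```r
# Refit the model so this block is self-contained.
fm1 <- lm(optden ~ carb, data = Formaldehyde)

# Point estimates: intercept and slope of the fitted line.
coef(fm1)

# 95% confidence intervals for both coefficients.
confint(fm1, level = 0.95)

# R-squared and adjusted R-squared from the summary object.
s <- summary(fm1)
c(r.squared = s$r.squared, adj.r.squared = s$adj.r.squared)

# Predicted optical density at carb = 0.5 (hypothetical input),
# with a 95% confidence interval for the mean response.
new_obs <- data.frame(carb = 0.5)
pred <- predict(fm1, newdata = new_obs, interval = "confidence")
pred
```

The confidence interval for the intercept gives the same message as its p-value: if the interval contains zero, the data do not rule out a line through the origin.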
🔹 4. Diagnostic Plots
opar <- par(mfrow = c(2, 2), oma = c(0, 0, 1.1, 0))
plot(fm1)
This creates 4 diagnostic plots for the linear model:
Residuals vs Fitted:
Checks linearity and homoscedasticity (equal variance).
Should look like random scatter. Patterns indicate problems.
Normal Q-Q:
Checks if residuals are normally distributed.
Points should lie close to a straight line.
Scale-Location:
Also checks for homoscedasticity.
A horizontal line with equal spread is ideal.
Residuals vs Leverage:
Identifies influential observations.
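Points outside the dashed Cook's distance contours may be influential.
If these plots look reasonably well-behaved, the model assumptions are likely satisfied. With only six observations, though, the plots can reveal gross problems but not subtle ones.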
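The numbers behind the four plots can be tabulated as well, which is often easier to read with so few points. This is a sketch under the assumption that a plain data frame is an acceptable way to present them; `diag_tbl` and its column names are made up for this example.

```r
# Refit the model so this block is self-contained.
fm1 <- lm(optden ~ carb, data = Formaldehyde)

# One row per observation with the quantities the diagnostic plots use
# (diag_tbl and its column names are illustrative).
diag_tbl <- data.frame(
  fitted   = fitted(fm1),         # x-axis of Residuals vs Fitted
  residual = residuals(fm1),      # raw residuals
  std_res  = rstandard(fm1),      # standardized residuals (Q-Q, Scale-Location)
  leverage = hatvalues(fm1),      # x-axis of Residuals vs Leverage
  cooks_d  = cooks.distance(fm1)  # influence measure behind the dashed contours
)

round(diag_tbl, 4)
```

Rows with large `cooks_d` or `leverage` are the same points that stand out in the Residuals vs Leverage panel.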
✅ Summary
The analysis shows:
A strong linear relationship between carbohydrate and optical density.
The model is statistically significant (F = 4192 on 1 and 4 DF, p = 3.409e-07) with a clear positive trend.
Slope ≈ 0.876, indicating a consistent increase in optical density with more carbohydrate.
Diagnostic plots should be checked to validate model assumptions.