15-12-2023            >eR-BioStat
Simple linear regression using R
Ziv Shkedy et al
## load/install libraries
.libPaths(c("./Rpackages",.libPaths()))
library(knitr)
library(tidyverse)
library(deSolve)
library(minpack.lm)
library(ggpubr)
library(readxl)
library(gamlss)
library(data.table)
library(grid)
library(png)
library(nlme)
library(gridExtra)
library(mvtnorm)
library(e1071)
library(lattice)
library(ggplot2)
library(dslabs)
library(NHANES)
library(plyr)
library(dplyr)
library(nasaweather)
library(ggplot2)
library(gganimate)
library(av)
library(gifski)
library(foreach)
library("DAAG")
library(DT)
dim(mtcars)
## [1] 32 11
head(mtcars)
## mpg cyl disp hp drat wt qsec vs am gear carb
## Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
## Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
## Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
## Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
## Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2
## Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1
#plot(mtcars$wt,mtcars$mpg, ylab = "mpg", xlab = "weight (1000 lbs)")
qplot(wt,mpg,data = mtcars)
Figure 1: mog vs. weight
cor(mtcars$wt,mtcars$mpg)
## [1] -0.8676594
For the mtcars dataset, we consider the model
\[mpg_{i}=\beta_{0}+\beta_{1} \times weight_{i}+\varepsilon_{i}\].
fit.lm<-lm(mtcars$mpg~mtcars$wt)
summary(fit.lm)
##
## Call:
## lm(formula = mtcars$mpg ~ mtcars$wt)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.5432 -2.3647 -0.1252 1.4096 6.8727
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 37.2851 1.8776 19.858 < 2e-16 ***
## mtcars$wt -5.3445 0.5591 -9.559 1.29e-10 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.046 on 30 degrees of freedom
## Multiple R-squared: 0.7528, Adjusted R-squared: 0.7446
## F-statistic: 91.38 on 1 and 30 DF, p-value: 1.294e-10
The parametr estimates for the intercept and slope are equal. respectivly, to \(\hat{\beta}_{0}=37.28\) and \(\hat{\beta}_{1}=-5.34\)
Figure 2 shows the data (mpg vs. weight) and fitted regression line, \(\hat{mpg}_{i}=37.28-5.34 \times wt_{i}\)
qplot(wt,mpg,data = mtcars)+
geom_smooth(method = "lm",se = F)
Figure 2: Data and fitted model
For the mtcars data, the residuals from the model can be obtained by calling to the object resid, Figure 5 shows the diagnostic plots for the regression model.
fit.lm$resid
## 1 2 3 4 5 6 7
## -2.2826106 -0.9197704 -2.0859521 1.2973499 -0.2001440 -0.6932545 -3.9053627
## 8 9 10 11 12 13 14
## 4.1637381 2.3499593 0.2998560 -1.1001440 0.8668731 -0.0502472 -1.8830236
## 15 16 17 18 19 20 21
## 1.1733496 2.1032876 5.9810744 6.8727113 1.7461954 6.4219792 -2.6110037
## 22 23 24 25 26 27 28
## -2.9725862 -3.7268663 -3.4623553 2.4643670 0.3564263 0.1520430 1.2010593
## 29 30 31 32
## -4.5431513 -2.7809399 -3.2053627 -1.0274952
par(mfrow=c(1,2))
qqnorm(fit.lm$resid)
abline(0,1)
plot(fit.lm$fit,fit.lm$resid)
abline(0,0,col=2)
Figure 3: Diagnostic plots for the residuals