# Load packages
library(dplyr)
library(ggplot2)
library(openintro)
# Load Data
data(cars)
The simple linear regression model can be visualized as a straight line, a “best fit” line that cuts through the data so as to minimize the sum of squared vertical distances (residuals) between the line and the data points. In ggplot2, this line can be added to a scatterplot with the geom_smooth() function.
# Scatterplot with regression line
ggplot(data = cars, aes(x = weight, y = price)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE) # method = "lm" fits a linear model; se = FALSE hides the standard-error band
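As a rough illustration of what `method = "lm"` computes, the least-squares line for a single predictor has a closed form: the slope is the covariance of x and y divided by the variance of x, and the intercept makes the line pass through the point of means. The helper below is a minimal sketch; `ols_line` is a hypothetical name introduced here, not a function from ggplot2 or base R.

```r
# Hypothetical helper: closed-form least-squares line for a single predictor.
ols_line <- function(x, y) {
  slope <- cov(x, y) / var(x)              # change in y per unit change in x
  intercept <- mean(y) - slope * mean(x)   # line passes through (mean(x), mean(y))
  c(intercept = intercept, slope = slope)
}
```

Assuming the `cars` data loaded above, `ols_line(cars$weight, cars$price)` should agree with the coefficients that `lm(price ~ weight, data = cars)` reports.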
Create a linear model using lm(). This function returns a model object of class “lm”. The object holds a lot of information about the regression model, including the call and the estimated coefficients, which are printed by default:
# Linear model for price as a function of weight
lm(price ~ weight, data = cars)
##
## Call:
## lm(formula = price ~ weight, data = cars)
##
## Coefficients:
## (Intercept)       weight
##   -20.29521      0.01326
Interpretation
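One way to read the coefficients printed above: the slope is the expected change in `price` for a one-unit increase in `weight`, and the fitted line gives a predicted price for any weight. The sketch below uses only the two printed estimates. The weight of 3000 is a hypothetical input chosen for illustration, and the units (price in thousands of dollars, weight in pounds) are an assumption about this dataset rather than something shown in the output.

```r
# Coefficients copied from the lm() printout above
b0 <- -20.29521   # intercept
b1 <- 0.01326     # slope for weight

# Expected change in price for 100 extra units of weight
price_change_per_100 <- b1 * 100

# Predicted price at a hypothetical weight of 3000
weight_new <- 3000
price_hat <- b0 + b1 * weight_new

price_change_per_100   # 1.326
price_hat              # 19.48479
```

The negative intercept is the fitted value at a weight of zero, which lies far outside the observed data, so it should not be read as a realistic price.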
Show that the mean of the residuals is zero (in practice not exactly zero, because of floating-point rounding error), and calculate the residual standard error.
# Create a linear model
mod <- lm(price ~ weight, data = cars)
# View summary of model
summary(mod)
##
## Call:
## lm(formula = price ~ weight, data = cars)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -12.767  -3.766  -1.155   2.568  35.440
##
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)
## (Intercept) -20.295205   4.915159  -4.129 0.000132 ***
## weight        0.013264   0.001582   8.383 3.17e-11 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 7.575 on 52 degrees of freedom
## Multiple R-squared: 0.5747, Adjusted R-squared: 0.5666
## F-statistic: 70.28 on 1 and 52 DF, p-value: 3.173e-11
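The two residual checks can be sketched as follows. For a model with an intercept, least squares forces the residuals to sum to zero, and the residual standard error is the square root of the residual sum of squares divided by the residual degrees of freedom. `residual_checks` is a hypothetical helper name introduced here so the code runs on any fitted `lm` object; it is not part of the original analysis.

```r
# Hypothetical helper: residual checks for any fitted lm object.
residual_checks <- function(model) {
  res <- residuals(model)
  c(
    mean_residual = mean(res),                              # zero up to floating-point error
    rse_manual    = sqrt(sum(res^2) / df.residual(model)),  # sqrt(SSE / residual df)
    rse_builtin   = sigma(model)                            # same quantity via stats::sigma()
  )
}
```

Calling `residual_checks(mod)` on the model fitted above should return a mean residual that is zero up to floating-point error and two matching standard errors, in line with the 7.575 on 52 degrees of freedom that `summary(mod)` reports.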
Interpretation
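Several numbers in the summary are tied together, which can help when reading it. The sketch below recomputes them from the printed, rounded values only, so small differences in the last digit are expected. The sample size of 54 is inferred from the 52 residual degrees of freedom plus the two estimated coefficients.

```r
# Values copied from the summary(mod) output above
estimate  <- 0.013264   # slope for weight
std_error <- 0.001582   # standard error of the slope
r_squared <- 0.5747     # multiple R-squared
n <- 54                 # inferred: 52 residual df + 2 coefficients
p <- 1                  # number of predictors

# t value is the estimate divided by its standard error
t_value <- estimate / std_error                           # about 8.38

# With a single predictor, the F-statistic equals the squared t value
f_stat <- t_value^2                                       # about 70.3

# Adjusted R-squared penalizes R-squared for the number of predictors
adj_r2 <- 1 - (1 - r_squared) * (n - 1) / (n - p - 1)     # about 0.5665

# Two-sided p-value for the slope from the t distribution
p_value <- 2 * pt(-abs(t_value), df = n - p - 1)
```

Read this way, weight accounts for roughly 57% of the variability in price in this sample, and the very small p-value indicates strong evidence that the slope differs from zero.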