View(mtcars)
plot(mtcars$mpg, mtcars$wt)
scatter.smooth(mtcars$mpg, mtcars$wt)
There is a negative relationship between miles per gallon and weight.
model <- lm(mtcars$mpg ~ mtcars$wt)
summary(model)
##
## Call:
## lm(formula = mtcars$mpg ~ mtcars$wt)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.5432 -2.3647 -0.1252 1.4096 6.8727
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 37.2851 1.8776 19.858 < 2e-16 ***
## mtcars$wt -5.3445 0.5591 -9.559 1.29e-10 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.046 on 30 degrees of freedom
## Multiple R-squared: 0.7528, Adjusted R-squared: 0.7446
## F-statistic: 91.38 on 1 and 30 DF, p-value: 1.294e-10
The regression coefficients, 37.2851 (intercept) and -5.3445 (slope for weight), can be interpreted as follows:
When the weight of a car is zero, the predicted mileage is 37.2851 miles per gallon. No car weighs zero, so the intercept is an extrapolation rather than a realistic value.
If the weight of a car increases by 1 unit (1,000 lb, the unit of wt in mtcars), the predicted mileage decreases by 5.3445 miles per gallon.
If the weight of a car decreases by 1 unit, the predicted mileage increases by 5.3445 miles per gallon.
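The slope interpretation above can be checked with a small sketch. It assumes the model is refit with the `data` argument (the name `fit` is introduced here only for illustration), because `predict()` cannot map new data onto a formula written as `mtcars$mpg ~ mtcars$wt`.

```r
# Refit with the data argument so predict() can accept new data
# (fit is a name introduced for this sketch, not part of the original code)
fit <- lm(mpg ~ wt, data = mtcars)
coef(fit)

# Predicted mpg for cars weighing 3 and 4 units (3,000 lb and 4,000 lb)
pred <- predict(fit, newdata = data.frame(wt = c(3, 4)))
pred

# The difference between the two predictions equals the slope
slope_check <- unname(pred[2] - pred[1])
slope_check
```

The two predictions differ by exactly the slope coefficient, which is what "a 1-unit increase in weight lowers predicted mileage by 5.3445 mpg" means.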
The Multiple R-squared value tells how well the ‘x’ variable explains the ‘y’ variable. In this case, the multiple R-squared value of 0.7528 means that weight explains 75.28% of the variance in miles per gallon.
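As a rough illustration of where the R-squared value comes from, the sketch below recomputes it by hand as one minus the ratio of the residual sum of squares to the total sum of squares. The names `ss_res`, `ss_tot`, and `r2` are introduced here for illustration.

```r
# Same model as above
model <- lm(mtcars$mpg ~ mtcars$wt)

# Residual sum of squares: variation in mpg left unexplained by weight
ss_res <- sum(residuals(model)^2)

# Total sum of squares: variation in mpg around its mean
ss_tot <- sum((mtcars$mpg - mean(mtcars$mpg))^2)

# R-squared is the share of total variation explained by the model
r2 <- 1 - ss_res / ss_tot
r2

# With a single predictor, this equals the squared correlation
cor(mtcars$mpg, mtcars$wt)^2
```

Both expressions should agree with the Multiple R-squared reported by `summary(model)`.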
plot(model,1)
The plot shows a non-linear relationship between the fitted values and the residuals, which suggests that a straight line does not fully capture how mileage changes with weight.
plot(model,2)
The residuals are approximately normally distributed. Most of the points (about 95%) lie close to the reference line of the Q-Q plot, but a few outliers deviate from it.
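The visual reading of the Q-Q plot could be supplemented with a formal check. The sketch below uses the Shapiro-Wilk test from base R; this test is not part of the original analysis and is shown only as one possible complement to the plot.

```r
# Same model as above
model <- lm(mtcars$mpg ~ mtcars$wt)

# Shapiro-Wilk test of normality on the residuals
# (null hypothesis: the residuals come from a normal distribution)
sw <- shapiro.test(residuals(model))
sw
```

A small p-value would count as evidence against normality. With only 32 observations the test has limited power, so it is best read alongside the plot rather than instead of it.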
plot(model,3)
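The plot shows that the spread of the residuals is not constant across the fitted values (heteroscedasticity). This means the size of the model's prediction errors depends on the fitted value, so the model is more reliable in some ranges of predicted mileage than in others.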
plot(model,4)
#The next plot labels the 5 most influential observations in the model, ranked by Cook's distance.
plot(model,4,id.n=5)
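The observations labelled in the plot can also be listed numerically. The sketch below extracts Cook's distances with `cooks.distance()` and sorts them; the names `cd` and `top5` are introduced here for illustration. Because the model was fit on bare vectors, the car names are attached from `rownames(mtcars)`.

```r
# Same model as above
model <- lm(mtcars$mpg ~ mtcars$wt)

# Cook's distance for every observation, labelled with the car names
cd <- cooks.distance(model)
names(cd) <- rownames(mtcars)

# The five observations with the largest Cook's distance
top5 <- sort(cd, decreasing = TRUE)[1:5]
top5
```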
#MAE - mean absolute error
mean(abs(model$residuals))
## [1] 2.340642
#MAPE - Mean absolute percentage error
#(absolute residuals divided by the observed response, mpg, not by the predictor, wt)
mean(abs(model$residuals)/abs(mtcars$mpg))
#MSE - Mean square error
mean(model$residuals^2)
## [1] 8.697561
#RMSE - root mean square error
sqrt(mean(model$residuals^2))
## [1] 2.949163
Whereas R-squared is a relative measure of fit, RMSE is an absolute measure of fit. As the square root of a variance, RMSE can be interpreted as the standard deviation of the residuals (the variation left unexplained by the model), and has the useful property of being in the same units as the response variable. Lower values of RMSE indicate better fit.
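The RMSE of 2.949 is slightly smaller than the residual standard error of 3.046 reported by `summary(model)`. The sketch below shows why: both are built from the same squared residuals, but RMSE divides by the number of observations (32), while the residual standard error divides by the residual degrees of freedom (30). The names `res`, `rmse`, and `rse` are introduced here for illustration.

```r
# Same model as above
model <- lm(mtcars$mpg ~ mtcars$wt)
res <- residuals(model)

# RMSE: divide the sum of squared residuals by n
rmse <- sqrt(mean(res^2))

# Residual standard error: divide by the residual degrees of freedom (n - 2)
rse <- sqrt(sum(res^2) / df.residual(model))

c(RMSE = rmse, RSE = rse)

# rse matches the value stored in the model summary
all.equal(rse, summary(model)$sigma)
```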