Suppose that you want to build a regression model that predicts the price of cars using a data set named cars.
There is a strong positive relationship between price and weight (the correlation is about 0.76). Lighter cars tend to be less expensive, while heavier cars tend to be more expensive.
Create scatterplots
## 'data.frame': 54 obs. of 6 variables:
## $ type : Factor w/ 3 levels "large","midsize",..: 3 2 2 2 2 1 1 2 1 2 ...
## $ price : num 15.9 33.9 37.7 30 15.7 20.8 23.7 26.3 34.7 40.1 ...
## $ mpgCity : int 25 18 19 22 22 19 16 19 16 16 ...
## $ driveTrain: Factor w/ 3 levels "4WD","front",..: 2 2 2 3 2 2 3 2 2 2 ...
## $ passengers: int 5 5 6 4 6 6 6 5 6 5 ...
## $ weight : int 2705 3560 3405 3640 2880 3470 4105 3495 3620 3935 ...
## [1] 0.758112
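As a rough sketch of how a scatterplot and correlation like the ones above can be produced, the code below rebuilds only the first ten rows of price and weight shown in the structure output. The name `excerpt` is made up for this illustration, and because it holds ten rows rather than all 54, its correlation will not match the 0.758112 reported for the full data.

```r
# First ten rows only, copied from the str() output above.
# This is NOT the full 54-row cars data set.
excerpt <- data.frame(
  price  = c(15.9, 33.9, 37.7, 30, 15.7, 20.8, 23.7, 26.3, 34.7, 40.1),
  weight = c(2705, 3560, 3405, 3640, 2880, 3470, 4105, 3495, 3620, 3935)
)

# Scatterplot of price against weight
# (axis units are assumptions: pounds and thousands of dollars)
plot(price ~ weight, data = excerpt,
     xlab = "Weight (pounds)",
     ylab = "Price (thousands of dollars)")

# Pearson correlation between the two variables
r_excerpt <- cor(excerpt$price, excerpt$weight)
r_excerpt
```

With the full data set loaded, the same two calls on `cars` would presumably give the plot and the correlation reported above.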
Interpretation
Run a regression model for price with one explanatory variable, weight, and answer Q2 through Q5.
Yes. The coefficient on weight is statistically significant at the 5% level: its p-value is 3.17e-11, far below 0.05.
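To see where that conclusion comes from, the sketch below recomputes the t value and a two-sided p-value from the estimate and standard error reported for weight in the regression output. The variable names are chosen for this illustration, and because the inputs are rounded, the results only approximately match the printed values.

```r
# Values copied from the coefficient table for weight (rounded as printed)
estimate  <- 0.013264
std_error <- 0.001582
df_resid  <- 52   # residual degrees of freedom: 54 observations - 2 coefficients

# t value is the estimate divided by its standard error
t_value <- estimate / std_error

# Two-sided p-value from the t distribution
p_value <- 2 * pt(-abs(t_value), df = df_resid)

t_value          # close to the reported 8.383
p_value < 0.05   # TRUE means significant at the 5% level
```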
The model predicts that a car weighing 4,000 pounds would cost about $32,800. However, the only data point given to us shows that a car was purchased for $48,000, so the model underpredicts that car's price by roughly $15,200.
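The prediction can be checked by hand from the fitted coefficients. This is a minimal sketch that assumes price is recorded in thousands of dollars, which is how the answer above reads the output; the variable names are illustrative.

```r
# Coefficients copied from the one-variable regression output
b0 <- -20.295205   # intercept
b1 <- 0.013264     # slope for weight

# Predicted price for a 4,000-pound car (assumed unit: thousands of dollars)
predicted <- b0 + b1 * 4000

# Residual for the car purchased at $48,000: observed minus predicted
observed <- 48
residual <- observed - predicted

predicted
residual
```

A positive residual means the observed price lies above the regression line.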
The reported residual standard error is 7.575 on 52 degrees of freedom. With price measured in thousands of dollars, this means that actual prices typically differ from the prices predicted by the regression line by about $7,575.
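The residual standard error is the square root of the residual sum of squares divided by the residual degrees of freedom. The sketch below illustrates that calculation on the first ten rows shown in the structure output, stored under the made-up name `excerpt`. Because this is not the full data set, the number it produces will differ from 7.575.

```r
# First ten rows only, copied from the str() output; not the full data set
excerpt <- data.frame(
  price  = c(15.9, 33.9, 37.7, 30, 15.7, 20.8, 23.7, 26.3, 34.7, 40.1),
  weight = c(2705, 3560, 3405, 3640, 2880, 3470, 4105, 3495, 3620, 3935)
)

fit <- lm(price ~ weight, data = excerpt)

# Residual sum of squares and residual degrees of freedom (n - 2 here)
rss <- sum(resid(fit)^2)
df  <- df.residual(fit)

# Residual standard error computed by hand
rse <- sqrt(rss / df)

# Should agree with the value that summary() reports
all.equal(rse, summary(fit)$sigma)
```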
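The adjusted R-squared is 0.5666, or about 56.7%. This means that weight explains roughly 57% of the variation in price, after adjusting for the number of explanatory variables in the model.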
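In a regression with one explanatory variable, R-squared equals the squared correlation, and the adjusted version applies a small penalty for the number of predictors. The sketch below works through both using the correlation reported earlier; the variable names are illustrative.

```r
r <- 0.758112   # correlation between price and weight, from the output above
n <- 54         # number of observations
p <- 1          # number of explanatory variables

# Multiple R-squared for a one-variable model is the squared correlation
r_squared <- r^2

# Adjusted R-squared penalizes for the number of predictors
adj_r_squared <- 1 - (1 - r_squared) * (n - 1) / (n - p - 1)

round(c(r_squared, adj_r_squared), 4)
# -> 0.5747 0.5666
```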
Run a second regression model for price with two explanatory variables: weight and passengers, and answer Q6.
Model 2 fits the data better. It has a smaller residual standard error (7.3 versus 7.575), and its adjusted R-squared is larger (0.5975 versus 0.5666), so it explains more of the variation in price even after accounting for the extra variable.
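The comparison can be laid out side by side. This sketch copies the two fit statistics from the regression summaries into a small table; the data frame and its column names are invented for this illustration.

```r
# Fit statistics copied from the two regression summaries
models <- data.frame(
  model  = c("price ~ weight", "price ~ weight + passengers"),
  rse    = c(7.575, 7.3),       # residual standard error (lower is better)
  adj_r2 = c(0.5666, 0.5975),   # adjusted R-squared (higher is better)
  stringsAsFactors = FALSE
)

# Model preferred under each criterion
best_by_rse <- models$model[which.min(models$rse)]
best_by_adj <- models$model[which.max(models$adj_r2)]

best_by_rse
best_by_adj
```

Both criteria point to the two-variable model. Adjusted R-squared is used rather than multiple R-squared because the latter never decreases when a variable is added.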
Build regression model
##
## Call:
## lm(formula = price ~ weight, data = cars)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -12.767  -3.766  -1.155   2.568  35.440
##
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)
## (Intercept) -20.295205   4.915159  -4.129 0.000132 ***
## weight        0.013264   0.001582   8.383 3.17e-11 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 7.575 on 52 degrees of freedom
## Multiple R-squared: 0.5747, Adjusted R-squared: 0.5666
## F-statistic: 70.28 on 1 and 52 DF, p-value: 3.173e-11
##
## Call:
## lm(formula = price ~ weight + passengers, data = cars)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -14.647  -3.688  -1.134   2.677  33.704
##
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept) -7.348709   7.480301  -0.982   0.3305
## weight       0.015891   0.001925   8.256  5.8e-11 ***
## passengers  -4.094465   1.831085  -2.236   0.0297 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 7.3 on 51 degrees of freedom
## Multiple R-squared: 0.6127, Adjusted R-squared: 0.5975
## F-statistic: 40.34 on 2 and 51 DF, p-value: 3.127e-11
Interpretation