Suppose that you want to build a regression model that predicts the price of cars using a data set named cars.
You can see there is a positive relationship between weight and price: lighter cars tend to be cheaper, while heavier cars tend to be more expensive.
## 'data.frame': 54 obs. of 6 variables:
## $ type : Factor w/ 3 levels "large","midsize",..: 3 2 2 2 2 1 1 2 1 2 ...
## $ price : num 15.9 33.9 37.7 30 15.7 20.8 23.7 26.3 34.7 40.1 ...
## $ mpgCity : int 25 18 19 22 22 19 16 19 16 16 ...
## $ driveTrain: Factor w/ 3 levels "4WD","front",..: 2 2 2 3 2 2 3 2 2 2 ...
## $ passengers: int 5 5 6 4 6 6 6 5 6 5 ...
## $ weight : int 2705 3560 3405 3640 2880 3470 4105 3495 3620 3935 ...
## [1] 0.758112
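The value printed above appears to be the correlation between weight and price (an assumption, since the call that produced it is not shown). In a simple linear regression, the squared correlation should match the Multiple R-squared of the one-variable model, which gives a quick sanity check. A minimal sketch using only the printed number:

```r
# Correlation printed in the output above
# (assumed to come from cor(cars$weight, cars$price)).
r <- 0.758112

# In a simple linear regression, R-squared equals the squared correlation.
r_squared <- r^2
print(round(r_squared, 4))
```

The result can be compared with the Multiple R-squared reported in the summary of the model with price as the only explanatory variable.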
Interpretation
Run a regression model for price with one explanatory variable, weight, and answer Q2 through Q5.
Yes. In the fitted model the slope coefficient has a p-value of 3.17e-11, far below 0.05, so the linear relationship between weight and price is statistically significant at the 5% level.
For a car that weighs 4,000 pounds, I predict a price of around $43,000 (price is recorded in thousands of dollars). This was found by solving the fitted equation for price:
W = 2171 + 43P; 4,000 = 2171 + 43P; 4,000 - 2171 = 43P; 1829 = 43P; P = 1829/43 ≈ 42.53
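The same arithmetic can be sketched in R with the unrounded coefficients from the summary output. The helper name `price_for_weight` is hypothetical, and the approach simply inverts the fitted line for weight as done above; it is not a refit of the model. With the unrounded coefficients the answer comes out slightly lower than with the rounded ones.

```r
# Coefficients copied from the summary of lm(weight ~ price, data = cars).
intercept <- 2171.113  # pounds
slope     <- 43.331    # pounds per $1,000 of price

# Hypothetical helper: solve weight = intercept + slope * price for price.
price_for_weight <- function(weight) {
  (weight - intercept) / slope
}

predicted_price <- price_for_weight(4000)  # in thousands of dollars
print(round(predicted_price, 2))
```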
Hint: Check the units of the variables in the openintro manual.
The reported residual standard error is 433 pounds on 52 degrees of freedom. It is the typical difference between a car's actual weight and the weight predicted by the model.
The reported Adjusted R-squared for this model is 0.5666. The Multiple R-squared of 0.5747 means that 57.47% of the variability in weight can be explained by price.
Run a second regression model for price with two explanatory variables: weight and passengers, and answer Q6.
Model 2 fits the data better. It has a smaller residual standard error (347.4 versus 433 for model 1) and a higher Adjusted R-squared (0.7209 versus 0.5666).
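The Adjusted R-squared values in the two summaries can be recomputed from the Multiple R-squared, the number of observations, and the number of explanatory variables. The sketch below uses the standard adjustment formula with the rounded values from the output, so the results should agree with the reported figures only to about three decimal places. The helper name `adjusted_r2` is hypothetical.

```r
# Hypothetical helper: adjusted R-squared from R-squared,
# n observations, and p explanatory variables.
adjusted_r2 <- function(r2, n, p) {
  1 - (1 - r2) * (n - 1) / (n - p - 1)
}

n <- 54  # observations in the data set

adj_model1 <- adjusted_r2(0.5747, n, p = 1)  # weight ~ price
adj_model2 <- adjusted_r2(0.7315, n, p = 2)  # weight ~ price + passengers

print(round(c(model1 = adj_model1, model2 = adj_model2), 3))
```

Because the adjustment penalizes each added variable, a higher adjusted value for model 2 suggests that passengers adds explanatory power beyond what the extra term alone would give.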
Build regression model
##
## Call:
## lm(formula = weight ~ price, data = cars)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1328.29 -228.09 10.92 258.19 924.27
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2171.113 118.956 18.251 < 2e-16 ***
## price 43.331 5.169 8.383 3.17e-11 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 433 on 52 degrees of freedom
## Multiple R-squared: 0.5747, Adjusted R-squared: 0.5666
## F-statistic: 70.28 on 1 and 52 DF, p-value: 3.173e-11
##
## Call:
## lm(formula = weight ~ price + passengers, data = cars)
##
## Residuals:
## Min 1Q Median 3Q Max
## -976.81 -201.56 6.13 151.33 799.88
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 294.25 356.98 0.824 0.414
## price 35.99 4.36 8.256 5.80e-11 ***
## passengers 395.91 72.56 5.456 1.44e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 347.4 on 51 degrees of freedom
## Multiple R-squared: 0.7315, Adjusted R-squared: 0.7209
## F-statistic: 69.46 on 2 and 51 DF, p-value: 2.748e-15
Interpretation