Suppose that you want to build a regression model that predicts the price of cars using a data set named cars.
Make sure to interpret the direction and the magnitude of the relationship between price and weight. In addition, keep in mind that correlation (or regression) coefficients do not show causation but only association.
- According to the scatterplot and the computed correlation coefficient, the relationship between the two variables "price" and "weight" is positive: the points trend upward from left to right, so heavier cars tend to cost more. With a coefficient of about 0.76, the association is fairly strong.
Create scatterplots
## 'data.frame': 54 obs. of 6 variables:
## $ type : Factor w/ 3 levels "large","midsize",..: 3 2 2 2 2 1 1 2 1 2 ...
## $ price : num 15.9 33.9 37.7 30 15.7 20.8 23.7 26.3 34.7 40.1 ...
## $ mpgCity : int 25 18 19 22 22 19 16 19 16 16 ...
## $ driveTrain: Factor w/ 3 levels "4WD","front",..: 2 2 2 3 2 2 3 2 2 2 ...
## $ passengers: int 5 5 6 4 6 6 6 5 6 5 ...
## $ weight : int 2705 3560 3405 3640 2880 3470 4105 3495 3620 3935 ...
## [1] 0.758112
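The scatterplot and the correlation coefficient could be produced along the following lines. This is only a sketch: `cars_sample` is a hypothetical name, and it holds just the first ten values of `price` and `weight` listed in the structure output above, so its correlation will differ from the 0.758112 obtained from all 54 cars.

```r
# Hypothetical 10-row sample copied from the first values in the str() output;
# the full data set has 54 observations.
cars_sample <- data.frame(
  price  = c(15.9, 33.9, 37.7, 30, 15.7, 20.8, 23.7, 26.3, 34.7, 40.1),
  weight = c(2705, 3560, 3405, 3640, 2880, 3470, 4105, 3495, 3620, 3935)
)

# Scatterplot with the explanatory variable (weight) on the x-axis
# and the response (price) on the y-axis
plot(price ~ weight, data = cars_sample,
     xlab = "weight", ylab = "price")

# Pearson correlation coefficient between the two variables
r <- cor(cars_sample$weight, cars_sample$price)
r
```

The sign of `r` gives the direction of the relationship and its absolute value gives the strength; neither says anything about causation.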
Interpretation
Run a regression model for price with one explanatory variable, weight, and answer Q2 through Q5.
Run a regression model for price with two explanatory variables, weight and passengers, and answer Q6.
Build regression model
##
## Call:
## lm(formula = weight ~ price, data = cars)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1328.29 -228.09 10.92 258.19 924.27
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2171.113 118.956 18.251 < 2e-16 ***
## price 43.331 5.169 8.383 3.17e-11 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 433 on 52 degrees of freedom
## Multiple R-squared: 0.5747, Adjusted R-squared: 0.5666
## F-statistic: 70.28 on 1 and 52 DF, p-value: 3.173e-11
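Two figures in this summary can be cross-checked by hand. Note that the call shown is `weight ~ price`, so weight is the response in this output and the slope of 43.331 describes how weight changes with price. In a simple regression the R-squared equals the squared correlation coefficient regardless of which variable is the response, and each t value is the estimate divided by its standard error. The sketch below only reuses numbers copied from the outputs above; the variable names are illustrative.

```r
# Values copied from the outputs above
r        <- 0.758112   # correlation between price and weight
estimate <- 43.331     # slope estimate for price
std_err  <- 5.169      # standard error of that slope

# Squared correlation: matches the reported Multiple R-squared (0.5747)
r_squared <- r^2
round(r_squared, 4)

# Estimate divided by standard error: matches the reported t value (8.383)
t_value <- estimate / std_err
round(t_value, 3)
```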
##
## Call:
## lm(formula = weight ~ price + passengers, data = cars)
##
## Residuals:
## Min 1Q Median 3Q Max
## -976.81 -201.56 6.13 151.33 799.88
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 294.25 356.98 0.824 0.414
## price 35.99 4.36 8.256 5.80e-11 ***
## passengers 395.91 72.56 5.456 1.44e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 347.4 on 51 degrees of freedom
## Multiple R-squared: 0.7315, Adjusted R-squared: 0.7209
## F-statistic: 69.46 on 2 and 51 DF, p-value: 2.748e-15
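The task asks for price as the response, whereas both calls above use `weight` on the left-hand side of the formula. A minimal sketch of the two models with price as the response might look like the following. It assumes a hypothetical data frame `cars_sample` built from the first ten values in the structure output, so its coefficients will not match a fit on all 54 cars. The helper `adj_r2` is also hypothetical and applies the usual adjusted R-squared formula, which is the fairer figure to use when comparing models with different numbers of predictors.

```r
# Hypothetical 10-row sample copied from the first values in the str() output
cars_sample <- data.frame(
  price      = c(15.9, 33.9, 37.7, 30, 15.7, 20.8, 23.7, 26.3, 34.7, 40.1),
  weight     = c(2705, 3560, 3405, 3640, 2880, 3470, 4105, 3495, 3620, 3935),
  passengers = c(5, 5, 6, 4, 6, 6, 6, 5, 6, 5)
)

# Response on the left of ~, explanatory variables on the right
fit1 <- lm(price ~ weight, data = cars_sample)
fit2 <- lm(price ~ weight + passengers, data = cars_sample)

# Adjusted R-squared for n observations and p explanatory variables
adj_r2 <- function(r2, n, p) 1 - (1 - r2) * (n - 1) / (n - p - 1)

# Compare the two fits on adjusted R-squared
adj <- c(one_predictor  = summary(fit1)$adj.r.squared,
         two_predictors = summary(fit2)$adj.r.squared)
adj

# The same formula applied to the reported values above (n = 54, p = 2)
# gives roughly the reported adjusted R-squared of 0.7209
adj_r2(0.7315, n = 54, p = 2)
```

In the outputs above, adding `passengers` raises the adjusted R-squared from 0.5666 to 0.7209 and lowers the residual standard error from 433 to 347.4, so the two-variable model fits better.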
Interpretation