Suppose that you want to build a regression model that predicts the price of cars using a data set named cars.
Q1. Per the scatter plot and the computed correlation coefficient, describe relationships between the two variables - price and weight.
Make sure to interpret the direction and the magnitude of the relationship. In addition, keep in mind that correlation (or regression) coefficients do not show causation but only association.
- There is an strong upward sloping, positive relationship between the two variables. Because the absolute value is greater than .6, the relationship is strong
Create scatterplots 
Run a regression model for price with one explanatory variable, weight, and answer Q2 through Q5.
Q2. Is the coefficient of weight statistically significant at 5%? Interpret the coefficient.
- Yes it is statistically significant at 5% because it is significant at .001 which means it is significant at all other higher percentages.
Q3. What price does the model predict for a car that weighs 4000 pounds?
Hint: Check the units of the variables in the openintro manual. *$48,000 USD
Q4. What is the reported residual standard error? What does it mean?
*433 this is the models difference in the actual weight compared to the models predicted weight (how much the model misses the actual weight on average) the model estimated weight misses the actual weight by about 433 lbs.
Q5. What is the reported adjusted R squared? What does it mean?
*.5666 means that 56.66% of the variability in weight can be explained by price. This model is good to compare one model to another. The higher the percentage the better the model.
Run a second regression model for price with two explanatory variables: weight and passengers, and answer Q6.
Q6. Which of the two models better fits the data? Discuss your answer by comparing the residual standard error and the adjusted R squared between the two models.
- Model #2 would be a better model to use. The adjusted R squared has a higher percentage in the second model and the residual standard error is smaller which makes for a more accurate model because it more closely represents the actual weight on average.
Build regression model