#####################
# READING IN DATA
#####################
sales <- read.csv("retail_sales.csv")
Quiz 2A: Multiple Linear Regression
Multiple Linear Regression
#############################
# MULTIPLE LINEAR REGRESSION
#############################
model <- lm(sales ~ markup + advertising, data=sales)
summary(model)
Call:
lm(formula = sales ~ markup + advertising, data = sales)

Residuals:
     Min       1Q   Median       3Q      Max
-1721.57  -610.36    25.09   674.06  2211.27

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  1784.04     424.83   4.199 0.000161 ***
markup       -511.73     265.67  -1.926 0.061788 .
advertising    37.36      13.41   2.787 0.008347 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 851 on 37 degrees of freedom
Multiple R-squared:  0.2017,    Adjusted R-squared:  0.1586
F-statistic: 4.675 on 2 and 37 DF,  p-value: 0.01548
Question 1
The residual standard error for this model is:
\(851\)
Question 2
The upper bound of the confidence interval for the advertising coefficient is:
######################
# CONFIDENCE INTERVAL
######################
confint(model)
                  2.5 %    97.5 %
(Intercept)   923.25430 2644.8306
markup      -1050.02007   26.5635
advertising    10.19832   64.5222
\(64.52\)
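The interval reported by confint() can be checked by hand as estimate ± t critical value × standard error, using the residual degrees of freedom. The following is a minimal sketch that uses the rounded values printed in the summary output, so it should agree with confint(model) only to about two decimal places:

```r
# Rounded values copied from the summary(model) output
estimate  <- 37.36   # advertising coefficient
std_error <- 13.41   # standard error of the advertising coefficient
df_resid  <- 37      # residual degrees of freedom

# Two-sided 95% interval: estimate +/- t_{0.975, df} * SE
t_crit <- qt(0.975, df = df_resid)
ci <- estimate + c(-1, 1) * t_crit * std_error

round(ci, 2)  # lower and upper bounds, approximately matching confint(model)
```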
Question 3
The degrees of freedom for the t-test to test the significance of the markup coefficient is: (enter your answer as a whole number)
\(\text{df}=n-p-1=40-2-1=37\)
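As a rough check, the same degrees of freedom reproduce the p-value that R reports for the markup coefficient. This sketch uses the rounded estimate and standard error from the coefficient table, so small rounding differences are expected:

```r
n <- 40   # number of observations
p <- 2    # number of explanatory variables (markup, advertising)
df_resid <- n - p - 1

# t statistic = estimate / standard error (rounded values from the output)
t_value <- -511.73 / 265.67

# Two-sided p-value from the t distribution with n - p - 1 degrees of freedom
p_value <- 2 * pt(-abs(t_value), df = df_resid)

c(df = df_resid, t = t_value, p = p_value)
```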
Question 4
The numerator degrees of freedom for the F-test of overall model significance is: (enter your answer as a whole number)
The numerator of the F-statistic is \(MS_{reg}\), which has \(p\) degrees of freedom. In this case, \(p=2\).
Question 5
The standard error of the coefficient for advertising is:
\(13.41\)
Question 6
The value of the test statistic for the test of overall model significance is:
\(4.675\)
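The F-statistic can also be recovered from the multiple R-squared, since \(F = \frac{R^2/p}{(1-R^2)/(n-p-1)}\). A short sketch, assuming the rounded R-squared from the output (so the result matches 4.675 only approximately):

```r
r_squared <- 0.2017  # Multiple R-squared from the summary output
n <- 40              # number of observations
p <- 2               # number of explanatory variables

# Ratio of explained variation per model df to unexplained variation per residual df
f_stat <- (r_squared / p) / ((1 - r_squared) / (n - p - 1))

round(f_stat, 2)
```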
Question 7
The value of the test statistic for the test of overall model significance is:
True
False
False
Question 8
The model coefficient estimates that appear in the R output are the values of the population parameters (β0, β1 and β2).
True
False
False
Question 9
The correlation between sales and markup is high.
True
False
#####################
# CORRELATION MATRIX
#####################
cor(sales$markup, sales$sales)
[1] -0.1848131
False
Question 10
The lower bound of the confidence interval for the markup coefficient is:
\(-1050.02\)
Question 11
The test of overall model significance shows that the model is significant at the 1% level.
True
False
False
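The answer to Question 11 follows from comparing the F-test p-value with each significance level. A minimal sketch that recomputes the p-value from the reported F-statistic and its degrees of freedom:

```r
f_stat <- 4.675  # F-statistic from the summary output
df1 <- 2         # numerator degrees of freedom (p)
df2 <- 37        # denominator degrees of freedom (n - p - 1)

# Upper-tail probability of the F distribution
p_value <- pf(f_stat, df1 = df1, df2 = df2, lower.tail = FALSE)

p_value < 0.05  # significant at the 5% level?
p_value < 0.01  # significant at the 1% level?
```

The p-value lies between 0.01 and 0.05, so the model is significant at the 5% level but not at the 1% level.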
Multiple-Choice
Question 12
In a multiple linear regression model, multicollinearity occurs when:
- The independent variables provide complementary information about the dependent variable.
- There exists a high degree of correlation between the independent variables and the dependent variable.
- There exists a high degree of correlation between the independent variables included in the model.
- The dependent variable provides redundant information about the independent variables.
- The fitted model yields estimates that are non-linear in form.
(c)
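To illustrate option (c), the sketch below simulates three hypothetical predictors (x1, x2, x3 are made-up names, not variables from the retail data) and computes a variance inflation factor by hand as \(1/(1-R_j^2)\), where \(R_j^2\) comes from regressing one predictor on the others. It is an illustration under these assumptions, not part of the quiz analysis:

```r
set.seed(1)
n  <- 200
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.1)  # nearly a copy of x1, so strongly collinear with it
x3 <- rnorm(n)                 # generated independently of x1 and x2

# VIF for a predictor: 1 / (1 - R^2) from regressing it on the other predictors
r2_x1  <- summary(lm(x1 ~ x2 + x3))$r.squared
vif_x1 <- 1 / (1 - r2_x1)

r2_x3  <- summary(lm(x3 ~ x1 + x2))$r.squared
vif_x3 <- 1 / (1 - r2_x3)

cor(x1, x2)                  # correlation between the two collinear predictors
c(x1 = vif_x1, x3 = vif_x3)  # large VIF for x1, VIF near 1 for x3
```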
Question 13
Which of the following interpretations of the advertising coefficient (from the multiple linear regression analysis in the previous part) is correct?
- On average, average monthly sales increases by 37.36 sales for every R1000 increase in advertising spend, holding all else constant.
- On average, average monthly sales increases by 37.36 sales for every R1 increase in advertising spend, holding all else constant.
- On average, advertising spend increases by 37.36 units for a unit increase in average monthly sales, holding all else constant.
- We cannot interpret the advertising coefficient because it is not statistically significant.
- On average, average monthly sales increases by 1784.04 sales for every unit increase in advertising spend, holding all else constant.
(a)
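The "holding all else constant" part of the interpretation can be seen by predicting at two points that differ only in advertising. This sketch uses the rounded coefficients from the output. The markup and advertising values plugged in are hypothetical, chosen only for illustration:

```r
# Rounded coefficient estimates from the summary output
coefs <- c(intercept = 1784.04, markup = -511.73, advertising = 37.36)

# Fitted value for given predictor values
predict_sales <- function(markup, advertising) {
  coefs[["intercept"]] + coefs[["markup"]] * markup +
    coefs[["advertising"]] * advertising
}

# Same (hypothetical) markup, advertising one unit higher
change <- predict_sales(markup = 1.5, advertising = 11) -
  predict_sales(markup = 1.5, advertising = 10)

change  # equals the advertising coefficient, whatever markup value is held fixed
```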
Question 14
What is the difference between a simple and multiple linear regression model?
A simple linear regression model only uses one explanatory variable to describe the change in the dependent variable whereas a multiple linear regression model uses more than one explanatory variable.
Question 15
What are we testing when performing an F-test of overall model significance?
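Whether the fitted model explains significantly more variation in the dependent variable than a null (intercept-only) model, that is, whether at least one slope coefficient differs from zero.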
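The comparison with a null model can be made explicit with anova(). The sketch below uses simulated data (y, x1 and x2 are hypothetical, not the retail data) and shows that the F-statistic from comparing the intercept-only model with the full model is the same one reported by summary():

```r
set.seed(42)
n  <- 40
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 1 + 0.5 * x1 + 0.8 * x2 + rnorm(n)  # simulated response

null_model <- lm(y ~ 1)        # intercept only
full_model <- lm(y ~ x1 + x2)  # both explanatory variables

# F-test comparing the two nested models
comparison <- anova(null_model, full_model)
f_anova    <- comparison$F[2]

# Overall F-statistic reported by summary()
f_summary <- unname(summary(full_model)$fstatistic["value"])

all.equal(f_anova, f_summary)  # the two F-statistics coincide
```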