Determine whether each of the following six claims is true or false. Provide support or explanation for your answers. Simply stating true or false without support will receive no credit. The subparts of this question are not related to each other.

1a) Let Model 1 be \(Y = \alpha_1 + \beta_1 X + \varepsilon\). Let Model 2 be \(Y = \alpha_2 + \beta_2 X + \delta_2 Z + \varepsilon\). Suppose we fit each model using least squares and suppose that cov\((X,Z)=0\) in the data. Our notation will be that \(b_j\) is the estimate of \(\beta_j\). I claim that the fitted coefficients on the \(X\) variable are equal, or in other words that \(b_1 = b_2\). Is this claim true or false?

1b) Suppose our dataset has variables \(Y\), \(X_1\), and \(X_2\). We find that cov\((X_1,X_2)=0.36\). I claim that it is possible that this dataset could exhibit perfect multicollinearity between \(X_1\) and \(X_2\). Is this claim true or false?

1c) Suppose our data results from a data generating process where the errors are heteroskedastic, but we assume they are homoskedastic. We then estimate the parameters of our linear model using OLS (ordinary least squares). I claim that the OLS estimator is unbiased in this situation. Is this claim true or false?

1d) I claim the Box-Ljung test is one way to check for multicollinearity. Is this claim true or false?

1e) Suppose you use a bootstrap method to calculate the standard error of the slope parameter for a simple linear regression model. I claim that the bootstrap distribution is centered over the true parameter value \(\beta\). Is this claim true or false?

1f) I claim that all maximum likelihood estimators are unbiased. Is this claim true or false?

Consider the simple linear regression model: \(Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i\) with \(\varepsilon_i \sim \text{i.i.d. } \mathcal{N}(0,\sigma^2)\). Suppose you estimated the parameters of this model using least squares with a dataset containing 1000 observations. Some calculations using the \(X\) matrix, \(Y\) vector, and vector of residuals (\(e\)) are provided below. Use that information to test the null hypothesis that \(\beta_1 = 5\) at a 95% confidence level. What do you conclude?

\[ [X'X]^{-1} = \begin{bmatrix} 0.5 & 0.1 \\ 0.1 & 3 \end{bmatrix} \hspace{3em} X'Y = \begin{bmatrix} -4 \\ 2 \end{bmatrix} \hspace{3em} e'e = 212.91 \]
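
As a reminder, the standard least squares quantities that enter this test can be written as follows. This sketch assumes the intercept is the first column of \(X\), so the slope corresponds to the second diagonal element of \([X'X]^{-1}\); here \(n = 1000\), and \(s^2\) denotes the estimated error variance (notation introduced only for this reminder).

\[ b = [X'X]^{-1} X'Y \hspace{3em} s^2 = \frac{e'e}{n-2} \hspace{3em} \text{se}(b_1) = \sqrt{s^2 \left([X'X]^{-1}\right)_{22}} \hspace{3em} t = \frac{b_1 - 5}{\text{se}(b_1)} \]

Under the null hypothesis and the stated normality assumption, \(t\) follows a \(t\) distribution with \(n-2\) degrees of freedom.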

Use the information on this page to answer the questions on the next page.

The `Flat_Panel_TV` dataset contains data on 70 televisions for sale. The data include the following variables:

- `Price` – the price of the television in dollars
- `Size` – the diagonal length in inches of the screen
- `Brand` – one of LG, Panasonic, or Samsung
- `Type` – either LED or Plasma

A summary of the data is as follows:

`summary(Flat_Panel_TV)`

```
##      Price             Size            Brand        Type
##  Min.   : 499.0   Min.   :32.00   LG       :21   LED   :36
##  1st Qu.: 927.5   1st Qu.:46.00   Panasonic:17   Plasma:34
##  Median :1335.5   Median :50.00   Samsung  :32
##  Mean   :1423.2   Mean   :49.57
##  3rd Qu.:1795.0   3rd Qu.:55.00
##  Max.   :4049.0   Max.   :65.00
```

The code and abbreviated output for two different linear regressions with these data are provided below:

`lmSumm(lm(log(Price) ~ log(Size) + Type + Brand, data=Flat_Panel_TV))`

```
Coefficients:
                Estimate  Std Error  t value  p value
(Intercept)     -0.91070    0.72990    -1.25    0.217
log(Size)        2.09100    0.19040    10.98    0.000
TypePlasma      -0.25710    0.07322    -3.51    0.001
BrandPanasonic  -0.03968    0.08850    -0.45    0.655
BrandSamsung     0.17450    0.06717     2.60    0.012
---
Standard Error of the Regression: 0.2388
Multiple R-squared: 0.712 Adjusted R-squared: 0.695
Overall F stat: 40.22 on 4 and 65 DF, pvalue= 0
```

`lmSumm(lm(log(Price) ~ log(Size) + Type, data=Flat_Panel_TV))`

```
Coefficients:
             Estimate  Std Error  t value  p value
(Intercept)   -1.3040    0.73370    -1.78     0.08
log(Size)      2.2190    0.19160    11.58     0.00
TypePlasma    -0.3254    0.06605    -4.93     0.00
---
Standard Error of the Regression: 0.2531
Multiple R-squared: 0.667 Adjusted R-squared: 0.657
Overall F stat: 67.1 on 2 and 67 DF, pvalue= 0
```

3a) According to the first regression, which brand sells the cheapest 56-inch Plasma TVs?

3b) How do you interpret the coefficient on `log(Size)` in the first regression?

3c) According to the first regression, what is the predicted price of a 65-inch Samsung LED TV?

3d) Test whether the set of brand dummy variables significantly improved the regression at a 95% confidence level. Note that `qf(p=0.95, df1=2, df2=65)` = 3.138.
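
One common way to set up this kind of test is the partial F statistic computed from the R-squared values of the full and restricted models. The sketch below assumes that form of the test and uses the rounded R-squared values printed in the two outputs above; the variable names are illustrative only.

```r
# R-squared values read from the two regression outputs above
r2_full       <- 0.712   # model with the Brand dummies
r2_restricted <- 0.667   # model without the Brand dummies
q      <- 2              # number of restrictions (two Brand dummies)
df_res <- 65             # residual degrees of freedom in the full model

# Partial F statistic for the joint restriction
f_stat <- ((r2_full - r2_restricted) / q) / ((1 - r2_full) / df_res)

# Critical value quoted in the question
f_crit <- qf(p = 0.95, df1 = q, df2 = df_res)

# TRUE if the brand dummies are jointly significant at this level
reject_null <- f_stat > f_crit
```

Because the inputs are rounded to three decimals, the resulting statistic is approximate.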

The `mammals` dataset has information on body weight (in kilograms) and brain weight (in grams) for 62 mammals. Suppose we regress brain weight on body weight and its square:

```
data(mammals, package="DataAnalytics")
mammals$bodywgt2 <- mammals$bodywgt^2
lmSumm(lm(brainwgt ~ bodywgt + bodywgt2, data=mammals))
```

```
## Multiple Regression Analysis:
## 3 regressors(including intercept) and 62 observations
##
## lm(formula = brainwgt ~ bodywgt + bodywgt2, data = mammals)
##
## Coefficients:
##              Estimate    Std Error  t value  p value
## (Intercept)  20.1600000  2.747e+01     0.73    0.466
## bodywgt       2.1230000  1.179e-01    18.00    0.000
## bodywgt2     -0.0001893  1.870e-05   -10.12    0.000
## ---
## Standard Error of the Regression: 204
## Multiple R-squared: 0.953 Adjusted R-squared: 0.952
## Overall F stat: 604.56 on 2 and 59 DF, pvalue= 0
```

4a) Does this regression suggest that there is a nonlinear relationship between body weight and brain weight? Why or why not?

4b) Write the formula that describes the expected change in brain weight for a small change in body weight, according to the fitted regression.

5a) Suppose we would like to forecast the log-GDP of the United States. We are debating whether to use a linear-trend model or a simple auto-regressive (i.e., an AR(1)) model. How could we decide which model is more appropriate?

5b) Suppose instead we decide to use an ARIMA(1,1,0) model. The code below estimates the ARIMA(1,1,0) model on monthly US GDP data for the 287 months ending in November 2018. Using the data below and the fitted ARIMA(1,1,0) model, what is the predicted GDP in December 2018? (Hint: not the log GDP)

```
Date        lnGDP
2018-08-01  9.810481
2018-09-01  9.815965
2018-10-01  9.826152
2018-11-01  9.834753
```

`lmSumm(lm(diff(lnGDP)~back(diff(lnGDP))))`

```
Coefficients:
                   Estimate  Std Error  t value  p value
(Intercept)           0.005  0.0006724     7.44        0
back(diff(lnGDP))     0.360  0.0553200     6.51        0
```
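
The regression above models the first difference of log GDP as a function of its own lag, so a one-step forecast can be built by predicting the next difference and adding it to the last observed level. The sketch below assumes that reading of the output and uses a naive back-transformation (simply exponentiating the forecast log level, ignoring any retransformation adjustment); the variable names are illustrative.

```r
# Last two observed log-GDP values from the table above
ln_gdp_oct <- 9.826152
ln_gdp_nov <- 9.834753

# Fitted coefficients from the regression output above
intercept <- 0.005
phi       <- 0.360

last_diff   <- ln_gdp_nov - ln_gdp_oct      # most recent change in log GDP
pred_diff   <- intercept + phi * last_diff  # forecast of the next change
pred_ln_gdp <- ln_gdp_nov + pred_diff       # forecast of the December log level
pred_gdp    <- exp(pred_ln_gdp)             # convert back from logs to levels
```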

6a) The complexity parameter (\(\lambda\)) in the LASSO model is usually determined by k-fold cross validation. What is k-fold cross validation? (Hint: one way to answer this question is to provide pseudo-code that outlines the steps of k-fold cross validation.)
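
A minimal sketch of the general idea, under simplifying assumptions: it uses an ordinary `lm` fit rather than a LASSO fit, simulated data, and a hypothetical helper named `kfold_mse`. For LASSO, the same loop would be repeated for each candidate \(\lambda\), and the \(\lambda\) with the smallest cross-validated error would be chosen.

```r
# Hypothetical helper: k-fold cross-validated mean squared error for a linear model
kfold_mse <- function(formula, data, k = 5, seed = 1) {
  set.seed(seed)
  n <- nrow(data)
  # Randomly assign each row to one of k roughly equal-sized folds
  fold <- sample(rep(seq_len(k), length.out = n))
  response <- all.vars(formula)[1]
  fold_mse <- numeric(k)
  for (j in seq_len(k)) {
    train <- data[fold != j, , drop = FALSE]   # fit on the other k - 1 folds
    test  <- data[fold == j, , drop = FALSE]   # hold out fold j
    fit   <- lm(formula, data = train)
    pred  <- predict(fit, newdata = test)
    fold_mse[j] <- mean((test[[response]] - pred)^2)
  }
  # Average the out-of-sample errors across the k folds
  mean(fold_mse)
}

# Simulated data, for illustration only
set.seed(42)
sim <- data.frame(x = rnorm(100))
sim$y <- 1 + 2 * sim$x + rnorm(100)
cv_error <- kfold_mse(y ~ x, data = sim, k = 5)
```

Each observation is used for validation exactly once and for fitting in the other k - 1 rounds.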

6b) Why does it make sense to standardize the \(X\) variables (i.e., scale each \(X_j\) variable so that it has unit variance) before using those variables in a LASSO model?