Q1
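m <- c(1, 2, 4)  # mean vector
cov <- matrix(data = c(2, 3, 1, 3, 5, 2, 1, 2, 6), nrow = 3, ncol = 3)  # covariance matrix (symmetric, so the fill order does not matter)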
Compute the slope and intercept in the conditional mean function E[y|x]. Compute the two slopes and the intercept in the conditional mean function E[y|x,z]. Is the slope on x the same in the two functions? Explain.
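A minimal sketch of the conditional-mean calculation. It assumes the three variables are ordered (y, x, z) in the mean vector and covariance matrix, and that the conditional means are linear (as they are under joint normality). The names `mu`, `S`, and `cond_mean_coefs` are illustrative and not part of the assignment.

```r
# Assumption: the variables are ordered (y, x, z).
mu <- c(y = 1, x = 2, z = 4)
S <- matrix(c(2, 3, 1,
              3, 5, 2,
              1, 2, 6),
            nrow = 3, byrow = TRUE,
            dimnames = list(names(mu), names(mu)))

# Hypothetical helper: coefficients of the linear conditional mean
# E[dep | cond] = intercept + slopes' cond, where
# slopes = S_cc^{-1} s_cd and intercept = mu_dep - slopes' mu_cond.
cond_mean_coefs <- function(mu, S, dep, cond) {
  slopes <- solve(S[cond, cond, drop = FALSE], S[cond, dep])
  intercept <- mu[[dep]] - sum(slopes * mu[cond])
  c(intercept = intercept, setNames(slopes, cond))
}

coefs_x  <- cond_mean_coefs(mu, S, "y", "x")          # E[y | x]
coefs_xz <- cond_mean_coefs(mu, S, "y", c("x", "z"))  # E[y | x, z]
print(coefs_x)
print(coefs_xz)
```

Under these assumptions the slope on x is 3/5 in E[y|x] and 16/26 in E[y|x,z]. The two differ because x and z are correlated (their covariance is 2) and z has a nonzero coefficient once x is held fixed.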
Compute the conditional variance Var[y|x]. Compute the conditional variance Var[y|x,z].
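The conditional variances can be sketched with the partitioned-covariance formula Var[y|w] = σ_yy − σ_yw Σ_ww⁻¹ σ_wy. This again assumes the ordering (y, x, z) and joint normality, under which the conditional variance is constant. `cond_var` is a hypothetical helper name.

```r
# Assumption: the variables are ordered (y, x, z).
vars <- c("y", "x", "z")
S <- matrix(c(2, 3, 1,
              3, 5, 2,
              1, 2, 6),
            nrow = 3, byrow = TRUE, dimnames = list(vars, vars))

# Hypothetical helper: Var[dep | cond] = S_dd - s_dc S_cc^{-1} s_cd.
cond_var <- function(S, dep, cond) {
  s_cd <- S[cond, dep]
  S[dep, dep] - sum(s_cd * solve(S[cond, cond, drop = FALSE], s_cd))
}

var_y_x  <- cond_var(S, "y", "x")          # 2 - 9/5
var_y_xz <- cond_var(S, "y", c("x", "z"))  # 2 - 47/26
print(c(var_y_x = var_y_x, var_y_xz = var_y_xz))
```

Conditioning on z in addition to x can only lower (or leave unchanged) the conditional variance, which is what the two values show.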
Compute the squared correlation between y and x. Compute the squared correlation between y and E[y|x].
Compute the squared correlation between y and E[y|x,z].
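A sketch of the squared correlations, under the same assumptions (ordering (y, x, z), linear conditional means). For a linear conditional mean, the squared correlation between y and E[y|w] equals the explained variance σ_yw Σ_ww⁻¹ σ_wy divided by Var[y]. `r2_pop` is a hypothetical helper name.

```r
# Assumption: the variables are ordered (y, x, z).
vars <- c("y", "x", "z")
S <- matrix(c(2, 3, 1,
              3, 5, 2,
              1, 2, 6),
            nrow = 3, byrow = TRUE, dimnames = list(vars, vars))

# Hypothetical helper: squared correlation between dep and the linear
# conditional mean E[dep | cond] = explained variance / total variance.
r2_pop <- function(S, dep, cond) {
  s_cd <- S[cond, dep]
  explained <- sum(s_cd * solve(S[cond, cond, drop = FALSE], s_cd))
  explained / S[dep, dep]
}

rho2_yx <- S["y", "x"]^2 / (S["y", "y"] * S["x", "x"])  # Corr(y, x)^2
r2_x    <- r2_pop(S, "y", "x")                          # Corr(y, E[y|x])^2
r2_xz   <- r2_pop(S, "y", c("x", "z"))                  # Corr(y, E[y|x,z])^2
print(c(rho2_yx = rho2_yx, r2_x = r2_x, r2_xz = r2_xz))
```

The first two values coincide because E[y|x] is a linear function of x, and correlation is unchanged by a linear transformation with nonzero slope.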
Q2
Using the “GasBill” data, the first tab in “homework1data.xls,” complete the following questions. The data contain the gas bill and the number of rooms for each of 144 apartments.
gasbill_data<-read.csv("D:/Boston College/MS AE Courses/Spring 2018 - Econometrics/Homework 1.csv")
library(lattice)
xyplot(GasBill ~ Units, data = gasbill_data,
       xlab = "No. of Units",
       ylab = "Gas Bill",
       main = "Gas Bill by Units",
       sub = "Scatter Plot")
plot(GasBill ~ as.factor(Units), data = gasbill_data)
aggregate(GasBill ~ Units, data = gasbill_data, mean)
##   Units   GasBill
## 1     3  236.6667
## 2     4  325.2857
## 3     5  422.0278
## 4     6  525.0385
## 5     7  720.8148
## 6     8  812.9565
## 7     9 1085.0625
## 8    10  912.7500
## 9    11 1179.0000
Q3
The regression model of interest is y = X1β1 + X2β2 + ε, where X1 contains K1 variables, including a constant, and X2 contains K2 variables, not including a constant. It is believed that multicollinearity between the columns of X1 and X2 is adversely affecting the regression. Consider the following ‘cure.’ We first regress each variable in X2 on all of the variables in X1. By construction, the residuals in these regressions, call them Z2 = (z1,…,zK2), are orthogonal to every variable in X1. So, instead of regressing y on X1 and X2, we linearly regress y on X1 and Z2. Denote by b = (b1,b2) the least squares coefficients in the original regression, and by c = (c1,c2) the least squares coefficients in the regression of y on X1 and Z2. Show the algebraic relation between b and c. Is c unbiased?
Using the gasoline data, second tab in “homework1data.xls”, let y be the variable G in the data set, let X2 denote the three macroeconomic price indexes, Pd, Pn, and Ps, and let X1 denote the other independent variables, constant, GasP, and PCIncome. Carry out the computations listed above and verify that the algebraic results you obtained do appear in the empirical results.
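A hedged numerical sketch of the relation between b and c. It uses simulated stand-in data, not the gasoline data, and all coefficient values and variable names below are illustrative. Writing X2 = X1 P + Z2 with P = (X1'X1)⁻¹X1'X2 gives X1 b1 + X2 b2 = X1 (b1 + P b2) + Z2 b2, which suggests c2 = b2 and c1 = b1 + P b2. The code checks that numerically.

```r
set.seed(1)
n <- 200

# Simulated stand-in data: X1 holds a constant and two regressors,
# X2 holds three regressors that are correlated with X1.
X1 <- cbind(1, rnorm(n), rnorm(n))
X2 <- X1[, 2:3] %*% matrix(c(0.5, -0.3, 0.2, 0.4, 0.1, 0.6), nrow = 2) +
  matrix(rnorm(n * 3), n, 3)
y <- drop(X1 %*% c(1, 2, -1) + X2 %*% c(0.5, -0.5, 1) + rnorm(n))

# Coefficients from regressing each column of X2 on X1, and the residuals Z2.
P  <- solve(crossprod(X1), crossprod(X1, X2))
Z2 <- X2 - X1 %*% P

# Least squares on (X1, X2) and on (X1, Z2); the constant is a column of X1.
b  <- unname(lm.fit(cbind(X1, X2), y)$coefficients)
cc <- unname(lm.fit(cbind(X1, Z2), y)$coefficients)
b1 <- b[1:3];  b2 <- b[4:6]
c1 <- cc[1:3]; c2 <- cc[4:6]

# Checks: c2 equals b2, c1 equals b1 + P b2, and c1 equals the
# coefficients from regressing y on X1 alone.
print(all.equal(c2, b2))
print(all.equal(c1, drop(b1 + P %*% b2)))
print(all.equal(c1, unname(lm.fit(X1, y)$coefficients)))
```

If the relation holds, c2 inherits the unbiasedness of b2, while c1 is the short-regression estimator and is centered on β1 + Pβ2 rather than β1. To repeat the check on the gasoline data, X1 would be built from a constant, GasP, and PCIncome, and X2 from Pd, Pn, and Ps.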
Q4
What is the effect on R² of computing a linear regression without a constant term? (Note: it depends on how R² is computed.)
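A small illustration on simulated data (the data-generating values are arbitrary assumptions) of how the two common definitions behave when the constant is dropped. The centered version, 1 − e'e / Σ(y − ȳ)², can fall below zero. The uncentered version, 1 − e'e / y'y, stays between 0 and 1 and is what `summary.lm` reports for a model without an intercept.

```r
set.seed(42)
n <- 100
x <- rnorm(n)
y <- 5 + 0.5 * x + rnorm(n)  # simulated data with a nonzero intercept

fit0 <- lm(y ~ 0 + x)        # regression without a constant term
e <- resid(fit0)

# Centered R^2: total variation measured around the mean of y.
r2_centered <- 1 - sum(e^2) / sum((y - mean(y))^2)

# Uncentered R^2: total variation measured around zero.
r2_uncentered <- 1 - sum(e^2) / sum(y^2)

print(c(centered   = r2_centered,
        uncentered = r2_uncentered,
        reported   = summary(fit0)$r.squared))
```

Without a constant the residuals need not sum to zero, so the identity SST = SSR + SSE around the mean no longer holds. That is why the centered measure is not confined to [0, 1].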