Since this is a midterm, you are to work on this assignment by yourself.

You can ask me for help, but my replies will be deliberately vague so that I do not give away the answers.

Please type your answers in a Word document. I will provide a place in Blackboard to upload your answers.

Section 0:

  1. What is your name?
  2. What is your quest?
  3. What is the airspeed velocity of an unladen swallow?

Section 1: Multiple choice

  1. The interpretation of the slope coefficient in the model \(\log(y) = \beta_0 + \beta_1 \log(x_1) + u\) is:
  1. a 1% change in \(x_1\) is associated with a \(\beta_1\) percent change in \(y\).
  2. a change in \(x_1\) by one unit is associated with a \(\beta_1\) change in \(y\).
  3. a change in \(x_1\) by one unit is associated with a \(100\beta_1\) percent change in \(y\).
  4. a 1% change in \(x_1\) is associated with a change in \(y\) of \(0.01\beta_1\).

  1. In the linear probability model, the interpretation of the slope coefficient is
  1. the change in odds associated with a unit change in X, holding other regressors constant.
  2. not all that meaningful since the dependent variable is either 0 or 1.
  3. the change in probability that Y=1 associated with a unit change in X, holding other regressors constant.
  4. the response in the dependent variable to a percentage change in the regressor.

  1. The slope estimator, \(\hat{\beta}_1\), has a smaller standard error, other things equal, if
  1. there is more variation in the explanatory variable, x.
  2. there is a large variance of the error term, u.
  3. the sample size is smaller.
  4. the intercept, \(\beta_0\), is small.

  1. When there are omitted variables in the regression, which are determinants of the dependent variable, then
  1. you cannot measure the effect of the omitted variable, but the estimator of your included variable(s) is (are) unaffected.
  2. this has no effect on the estimator of your included variable because the other variable is not included.
  3. this will always bias the OLS estimator of the included variable.
  4. the OLS estimator is biased if the omitted variable is correlated with the included variable.

  1. The overall regression F-statistic tests the null hypothesis that
  1. all slope coefficients are zero.
  2. all slope coefficients and the intercept are zero.
  3. the intercept in the regression and at least one, but not all, of the slope coefficients is zero.
  4. the slope coefficient of the variable of interest is zero, but that the other slope coefficients are not.

  1. Nonexperimental data is called _____________.
  1. cross-sectional data
  2. observational data
  3. time series data
  4. panel data

  1. Empirical analysis relies on _______ to test a theory.
  1. common sense
  2. ethical considerations
  3. data
  4. customs and conventions

  1. A dependent variable is also known as a(n)__________.
  1. explanatory variable
  2. control variable
  3. predictor variable
  4. response variable

  1. The normality assumption implies that:
  1. the population error \(u\) is dependent on the explanatory variables and is normally distributed with mean equal to one and variance \(\sigma^2\).
  2. the population error \(u\) is independent of the explanatory variables and is normally distributed with mean equal to one and variance \(\sigma\).
  3. the population error \(u\) is dependent on the explanatory variables and is normally distributed with mean zero and variance \(\sigma\).
  4. the population error \(u\) is independent of the explanatory variables and is normally distributed with mean zero and variance \(\sigma^2\).

  1. A variable is standardized in the sample:
  1. by multiplying by its mean.
  2. by subtracting off its mean and multiplying by its standard deviation.
  3. by subtracting off its mean and dividing by its standard deviation.
  4. by multiplying by its standard deviation.

Section 2: Short answer

  1. Use these regression results to answer the questions below.
load("../380/Wooldridge Material/Data Sets- R/wage1.RData")
wage1 <- data
fit1 <- lm(lwage ~ educ, wage1)
summary(fit1)
## 
## Call:
## lm(formula = lwage ~ educ, data = wage1)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -2.21158 -0.36393 -0.07263  0.29712  1.52339 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 0.583773   0.097336   5.998 3.74e-09 ***
## educ        0.082744   0.007567  10.935  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.4801 on 524 degrees of freedom
## Multiple R-squared:  0.1858, Adjusted R-squared:  0.1843 
## F-statistic: 119.6 on 1 and 524 DF,  p-value: < 2.2e-16
  1. What is the estimated effect of an extra year of schooling on wages?
  2. If you had a strong belief that years of high school education were different from college education, how would you modify the equation? What if your theory suggested that there was a “diploma effect”?
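
A note on reproducing the output above: the `load()` call points to a path on the instructor's machine. The sketch below is one possible alternative, assuming the CRAN package `wooldridge` is installed. That package is not mentioned in this exam; it is an assumption made here only because it ships a data frame that is also named `wage1`. The guard keeps the snippet from failing when the package is missing.

```r
# Assumption: the CRAN package 'wooldridge' is available. The exam itself
# loads wage1 from a local .RData file instead.
if (requireNamespace("wooldridge", quietly = TRUE)) {
  # Creates a data frame called wage1 in the current environment
  data("wage1", package = "wooldridge")

  # Same specification as fit1 in the exam
  fit1 <- lm(lwage ~ educ, data = wage1)
  print(summary(fit1))
} else {
  message("Package 'wooldridge' is not installed; try install.packages('wooldridge').")
}
```

If the package data matches the file used for the exam, the summary should agree with the output printed above; treat any difference as a sign that the two copies of the data are not identical.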

  1. Use these regression results to answer the questions below.
library(HistData) # provides the Galton data set
df <- Galton
fit2 <- lm(child ~ parent, df)
summary(fit2)
## 
## Call:
## lm(formula = child ~ parent, data = df)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -7.8050 -1.3661  0.0487  1.6339  5.9264 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 23.94153    2.81088   8.517   <2e-16 ***
## parent       0.64629    0.04114  15.711   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.239 on 926 degrees of freedom
## Multiple R-squared:  0.2105, Adjusted R-squared:  0.2096 
## F-statistic: 246.8 on 1 and 926 DF,  p-value: < 2.2e-16
  1. What is the predicted height for a child whose parents’ height was 70 inches?
  2. How would you test the hypothesis that a child’s height is one inch greater when the parents’ height is one inch greater?

  1. Use these regression results to answer the questions below.
fit3 <- lm(lwage ~ female * scale(educ,scale=F), wage1)
summary(fit3)
## 
## Call:
## lm(formula = lwage ~ female * scale(educ, scale = F), data = wage1)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -2.02673 -0.27468 -0.03721  0.26221  1.34740 
## 
## Coefficients:
##                                 Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                    1.796e+00  2.702e-02  66.486   <2e-16 ***
## female                        -3.609e-01  3.907e-02  -9.236   <2e-16 ***
## scale(educ, scale = F)         7.723e-02  8.988e-03   8.593   <2e-16 ***
## female:scale(educ, scale = F) -6.408e-05  1.450e-02  -0.004    0.996    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.4459 on 522 degrees of freedom
## Multiple R-squared:  0.3002, Adjusted R-squared:  0.2962 
## F-statistic: 74.65 on 3 and 522 DF,  p-value: < 2.2e-16
  1. What is the expected return to an extra year of schooling for men?
  2. What is the expected return to an extra year of schooling for women?
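
For readers unfamiliar with the formula syntax in `fit3`, the sketch below shows what the two pieces do mechanically: `scale(x, scale = FALSE)` only subtracts the mean of `x`, and `a * b` in an R formula is shorthand for `a + b + a:b`. It runs on simulated data; the data frame `sim`, its coefficients, and the object names `fit_star`, `fit_long`, and `fit_raw` are hypothetical and are not taken from `wage1`.

```r
set.seed(1)
n <- 200

# Simulated stand-in for the exam data (all numbers are made up)
sim <- data.frame(
  female = rbinom(n, 1, 0.5),
  educ   = sample(8:18, n, replace = TRUE)
)
sim$lwage <- 0.5 + 0.08 * sim$educ - 0.3 * sim$female + rnorm(n, sd = 0.4)

# scale(..., scale = FALSE) centers the variable: mean removed, units unchanged
sim$educ_c <- as.numeric(scale(sim$educ, scale = FALSE))
print(mean(sim$educ_c))  # zero up to floating-point error

# The compact formula used in the exam ...
fit_star <- lm(lwage ~ female * scale(educ, scale = FALSE), data = sim)
# ... written out term by term
fit_long <- lm(lwage ~ female + educ_c + female:educ_c, data = sim)
# The same model without centering
fit_raw  <- lm(lwage ~ female * educ, data = sim)

# The compact and written-out formulas give the same coefficients
same_fit <- isTRUE(all.equal(unname(coef(fit_star)), unname(coef(fit_long))))

# Centering leaves the education slope and the interaction unchanged;
# only the intercept and the female coefficient move
same_slopes <- isTRUE(all.equal(unname(coef(fit_star))[3:4],
                                unname(coef(fit_raw))[3:4]))

print(c(same_fit = same_fit, same_slopes = same_slopes))
```

Centering is a design choice about where the intercept and the dummy coefficient are evaluated (at average education rather than at zero years); it does not change the fitted values.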

  1. What is multicollinearity? What problems does it create in regression results? (Answer in three sentences or less.)

  1. Use these regression results to answer the questions below.
data1 <- read.csv('http://global.oup.com/uk/orc/busecon/economics/dougherty4e/01student/datasets/csv/eaef/eaef10.csv')
fit4 <- lm(EARNINGS ~ FEMALE*SINGLE + FEMALE*DIVORCED + FEMALE*MARRIED - 1,data1)
summary(fit4)
## 
## Call:
## lm(formula = EARNINGS ~ FEMALE * SINGLE + FEMALE * DIVORCED + 
##     FEMALE * MARRIED - 1, data = data1)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -20.443  -7.651  -2.737   3.596  83.857 
## 
## Coefficients: (1 not defined because of singularities)
##                 Estimate Std. Error t value Pr(>|t|)    
## FEMALE          -10.8074     1.3385  -8.074 4.54e-15 ***
## SINGLE           19.3230     1.8691  10.338  < 2e-16 ***
## DIVORCED         16.8120     1.8110   9.283  < 2e-16 ***
## MARRIED          26.7126     0.9583  27.876  < 2e-16 ***
## FEMALE:SINGLE     6.4785     3.5482   1.826 0.068433 .  
## FEMALE:DIVORCED   9.1449     2.7536   3.321 0.000958 ***
## FEMALE:MARRIED        NA         NA      NA       NA    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 12.68 on 534 degrees of freedom
## Multiple R-squared:  0.7213, Adjusted R-squared:  0.7182 
## F-statistic: 230.3 on 6 and 534 DF,  p-value: < 2.2e-16
  1. What is the predicted hourly wage of a single man?
  2. What is the difference between the predicted hourly wage of a divorced man and a divorced woman?