(i) Simple Regression Model

Model: log(price) = B0 + B1*log(dist) + u

# Subset the data for the year 1981
data_1981 <- subset(data, year == 1981)
# Perform simple regression
model1 <- lm(log(price) ~ log(dist), data = data_1981)
summary(model1)
## 
## Call:
## lm(formula = log(price) ~ log(dist), data = data_1981)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.87318 -0.22657 -0.01985  0.25687  0.95045 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  8.04716    0.64624  12.452  < 2e-16 ***
## log(dist)    0.36488    0.06576   5.548 1.39e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.3543 on 140 degrees of freedom
## Multiple R-squared:  0.1803, Adjusted R-squared:  0.1744 
## F-statistic: 30.79 on 1 and 140 DF,  p-value: 1.395e-07

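B1 = 0.36488

The estimate is positive and statistically significant (t = 5.548, p = 1.39e-07): a 1% increase in distance from the incinerator is associated with roughly a 0.36% higher price. A positive B1 is the sign expected if proximity to the incinerator lowers home values, since prices then rise as one moves farther away from it. Because this simple regression omits every other house characteristic, the estimate should not yet be read as the effect of the incinerator alone.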
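As a quick check on the significance claim, the t statistic and a 95% confidence interval for B1 can be recomputed from the printed estimate and standard error. This is a minimal sketch that hard-codes the numbers from the summary(model1) output above rather than re-estimating anything; the variable names are illustrative only.

```r
# Values copied from the summary(model1) output (not re-estimated here)
b1    <- 0.36488   # estimate on log(dist)
se_b1 <- 0.06576   # its standard error
df    <- 140       # residual degrees of freedom

# t statistic for H0: B1 = 0
t_stat <- b1 / se_b1

# Two-sided 95% confidence interval based on the t distribution
crit <- qt(0.975, df)
ci   <- c(lower = b1 - crit * se_b1, upper = b1 + crit * se_b1)

t_stat
ci
```

When the fitted object is available, confint(model1) should give the same interval up to rounding.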

(ii) Adding Additional Variables

Model: log(price) = B0 + B1*log(dist) + B2*log(intst) + B3*log(area) + B4*log(land) + B5*rooms + B6*baths + B7*age + u

model2 <- lm(log(price) ~ log(dist) + log(intst) + log(area) + log(land) + rooms + baths + age, data = data_1981)
summary(model2)
## 
## Call:
## lm(formula = log(price) ~ log(dist) + log(intst) + log(area) + 
##     log(land) + rooms + baths + age, data = data_1981)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.74072 -0.10669  0.00932  0.11817  0.61387 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  7.592332   0.641711  11.831  < 2e-16 ***
## log(dist)    0.055389   0.057621   0.961 0.338153    
## log(intst)  -0.039032   0.051662  -0.756 0.451261    
## log(area)    0.319294   0.076418   4.178 5.27e-05 ***
## log(land)    0.076824   0.039505   1.945 0.053908 .  
## rooms        0.042528   0.028251   1.505 0.134588    
## baths        0.166923   0.041944   3.980 0.000113 ***
## age         -0.003567   0.001059  -3.369 0.000985 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.201 on 134 degrees of freedom
## Multiple R-squared:  0.7475, Adjusted R-squared:  0.7344 
## F-statistic: 56.68 on 7 and 134 DF,  p-value: < 2.2e-16

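After controlling for the other house characteristics, the coefficient on log(dist) falls from 0.365 to 0.055 and is no longer statistically significant (t = 0.961, p = 0.338). The large estimate in (i) therefore appears to reflect omitted variables: several of the controls added in (ii), notably log(area), baths, and age, are significant, and the change in the log(dist) estimate suggests they are correlated with distance from the incinerator.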
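To make the comparison between (i) and (ii) concrete, the log(dist) rows of the two coefficient tables can be placed side by side. The sketch below copies the printed estimates and standard errors by hand (the data frame and its column names are illustrative, not part of the original analysis) and recomputes the t statistics and two-sided p-values.

```r
# log(dist) rows copied from the two summary() tables above
cmp <- data.frame(
  model    = c("(i) simple", "(ii) with controls"),
  estimate = c(0.36488, 0.055389),
  se       = c(0.06576, 0.057621),
  df       = c(140, 134)
)

# Recompute t statistics and two-sided p-values
cmp$t_value <- cmp$estimate / cmp$se
cmp$p_value <- 2 * pt(-abs(cmp$t_value), cmp$df)

# Flag significance at the 5% level
cmp$signif_5pct <- cmp$p_value < 0.05
cmp
```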

(iii) Adding log(intst)

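Model (as estimated below): log(price) = B0 + B1*log(dist) + B2*log(intst) + u

The role of functional form was assessed by comparing these results with model (ii). With only log(dist) and log(intst) as regressors, log(dist) is insignificant (p = 0.621) while log(intst) is positive and significant (p = 1.77e-05); in model (ii), with the full set of controls, neither of the two is significant.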

model3 <- lm(log(price) ~ log(dist) + log(intst), data = data_1981)
summary(model3)
## 
## Call:
## lm(formula = log(price) ~ log(dist) + log(intst), data = data_1981)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.92562 -0.19871 -0.02897  0.22916  0.88075 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  8.65979    0.62227  13.917  < 2e-16 ***
## log(dist)    0.04682    0.09449   0.495    0.621    
## log(intst)   0.26555    0.05971   4.447 1.77e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.3327 on 139 degrees of freedom
## Multiple R-squared:  0.2824, Adjusted R-squared:  0.272 
## F-statistic: 27.35 on 2 and 139 DF,  p-value: 9.664e-11

(iv) Significance of the Square of log(dist)

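In the output below, the coefficient on the squared term [log(dist)]^2 (stored as dist_squared) is -0.251 with t = -3.016 (p = 0.003), so it is significant at the 1% level. This indicates that the relationship between log(price) and log(dist) is not linear: the elasticity of price with respect to distance declines as distance increases.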

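# Square of log(dist); despite its name, dist_squared holds [log(dist)]^2, not dist^2
data_1981$dist_squared <- (log(data_1981$dist))^2
# Re-estimate model (ii) with the squared term added
model4 <- lm(log(price) ~ log(dist) + log(intst) + log(area) + log(land) + rooms + baths + age + dist_squared, data = data_1981)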
summary(model4)
## 
## Call:
## lm(formula = log(price) ~ log(dist) + log(intst) + log(area) + 
##     log(land) + rooms + baths + age + dist_squared, data = data_1981)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.75771 -0.10073  0.00257  0.11296  0.56269 
## 
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  -15.758968   7.768580  -2.029  0.04450 *  
## log(dist)      4.844616   1.589145   3.049  0.00277 ** 
## log(intst)     0.052321   0.058606   0.893  0.37360    
## log(area)      0.311436   0.074255   4.194 4.97e-05 ***
## log(land)      0.059774   0.038777   1.541  0.12558    
## rooms          0.039987   0.027448   1.457  0.14752    
## baths          0.166285   0.040733   4.082 7.65e-05 ***
## age           -0.002855   0.001055  -2.706  0.00770 ** 
## dist_squared  -0.251447   0.083383  -3.016  0.00307 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.1952 on 133 degrees of freedom
## Multiple R-squared:  0.7637, Adjusted R-squared:  0.7495 
## F-statistic: 53.73 on 8 and 133 DF,  p-value: < 2.2e-16
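
Because model4 includes both log(dist) and its square, the elasticity of price with respect to distance is no longer a single number: it equals the coefficient on log(dist) plus twice the coefficient on dist_squared times log(dist). The sketch below hard-codes the two printed estimates and evaluates that expression. The function name and the distance value of 10000 are arbitrary illustrations, not figures taken from the data.

```r
b_ldist  <- 4.844616    # coefficient on log(dist)
b_ldist2 <- -0.251447   # coefficient on dist_squared, i.e. [log(dist)]^2

# Elasticity of price with respect to dist at a given distance
elasticity_at <- function(dist) {
  b_ldist + 2 * b_ldist2 * log(dist)
}

# log(dist) at which the elasticity crosses zero, and the implied distance
turning_log_dist <- -b_ldist / (2 * b_ldist2)
turning_dist     <- exp(turning_log_dist)

elasticity_at(10000)
turning_log_dist
turning_dist
```

Beyond the turning point the estimated elasticity becomes negative, so the quadratic should be interpreted with care for distances near or above that value.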

# Load the wage1 data set from the wooldridge package
library(wooldridge)
data("wage1")

(i) OLS Estimation

# Perform OLS estimation
model <- lm(log(wage) ~ educ + exper + I(exper^2), data = wage1)
summary(model)
## 
## Call:
## lm(formula = log(wage) ~ educ + exper + I(exper^2), data = wage1)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.96387 -0.29375 -0.04009  0.29497  1.30216 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  0.1279975  0.1059323   1.208    0.227    
## educ         0.0903658  0.0074680  12.100  < 2e-16 ***
## exper        0.0410089  0.0051965   7.892 1.77e-14 ***
## I(exper^2)  -0.0007136  0.0001158  -6.164 1.42e-09 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.4459 on 522 degrees of freedom
## Multiple R-squared:  0.3003, Adjusted R-squared:  0.2963 
## F-statistic: 74.67 on 3 and 522 DF,  p-value: < 2.2e-16

(ii) Significance of exper

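The p-value for exper can be extracted directly from the coefficient table, so the “t value” column does not need to be read separately. At about 1.77e-14 it is far below 0.01, so exper is statistically significant at the 1% level.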

# Test the significance of exper
exper_summary <- summary(model)
exper_p_value <- coef(exper_summary)["exper", "Pr(>|t|)"]
exper_p_value
## [1] 1.765073e-14
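
The same p-value can be reproduced from the t statistic and the residual degrees of freedom, which shows what the "Pr(>|t|)" column reports. This is a minimal sketch using the rounded values printed in the coefficient table, so it matches the extracted value only approximately; the variable names are illustrative.

```r
t_exper <- 7.892   # t value for exper from the coefficient table
df_res  <- 522     # residual degrees of freedom

# Two-sided p-value: probability of a |t| at least this large under H0
p_exper <- 2 * pt(-abs(t_exper), df = df_res)

p_exper
p_exper < 0.01   # TRUE if exper is significant at the 1% level
```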

(iii) Approximate Returns

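Using the approximation %Δwage ≈ 100*(B_exper + 2*B_exper_squared*exper)*Δexper with Δexper = 1, the approximate return to one more year of experience was evaluated at exper = 5 and exper = 20, giving about 3.39% and 1.25%, respectively.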

# Calculate approximate returns
B_exper <- coef(model)["exper"]
B_exper_squared <- coef(model)["I(exper^2)"]
approx_return_5_years <- 100 * (B_exper + 2 * B_exper_squared * 5)
approx_return_20_years <- 100 * (B_exper + 2 * B_exper_squared * 20)
approx_return_5_years
##    exper 
## 3.387329
approx_return_20_years
##    exper 
## 1.246655
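
The calculation above can also be written as a small function of experience, which makes it easy to evaluate the approximate return at other values. The sketch below hard-codes the rounded coefficients from the regression output, so results differ slightly from those computed from the fitted model; approx_return is a name introduced here for illustration.

```r
b_exper  <- 0.0410089    # coefficient on exper
b_exper2 <- -0.0007136   # coefficient on I(exper^2)

# Approximate % change in wage from one more year, evaluated at `exper`
approx_return <- function(exper) {
  100 * (b_exper + 2 * b_exper2 * exper)
}

returns <- approx_return(c(5, 20))
names(returns) <- c("exper = 5", "exper = 20")
returns
```

Because the coefficient on the squared term is negative, the approximate return declines as experience rises.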

(iv) Predicted Log(wage) and Experience