Make sure to include the unit of the values whenever appropriate.
Hint: The variables are available in the CPS85 data set from the mosaicData package.
data(CPS85, package="mosaicData")
wages_lm <- lm(wage ~ educ + exper + sex,
data = CPS85)
# View summary of model 1
summary(wages_lm)
##
## Call:
## lm(formula = wage ~ educ + exper + sex, data = CPS85)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
##  -9.571  -2.746  -0.653   1.893  37.724
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept) -6.50451    1.20985  -5.376 1.14e-07 ***
## educ         0.94051    0.07886  11.926  < 2e-16 ***
## exper        0.11330    0.01671   6.781 3.19e-11 ***
## sexM         2.33763    0.38806   6.024 3.19e-09 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 4.454 on 530 degrees of freedom
## Multiple R-squared: 0.2532, Adjusted R-squared: 0.2489
## F-statistic: 59.88 on 3 and 530 DF, p-value: < 2.2e-16
The coefficient of education is statistically significant at the 5% level: its p-value (< 2e-16) is far below 0.05.
Hint: Discuss both its sign and magnitude.
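The coefficient of education is positive and its magnitude is 0.94. This means that each additional year of education is associated with an hourly wage about $0.94 higher, holding experience and sex constant. Because its p-value is below 0.001, we are extremely confident that education has an impact on wages.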
Hint: Discuss all three aspects of the relevant predictor: 1) statistical significance, 2) sign, and 3) magnitude.
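There is evidence consistent with gender discrimination in wages. The coefficient of sex (sexM) is statistically significant at the 0.1% level (p-value = 3.19e-09). Its sign is positive and its magnitude is 2.34, so men are predicted to earn about $2.34 more per hour than women with the same education and experience.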
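As a rough check on the significance of the sex coefficient, a 95% confidence interval can be built from the estimate and standard error reported in the summary. The interval itself is not part of the original output; this is only a sketch that copies the three reported numbers by hand and uses base R's `qt()`.

```r
# Values copied from the sexM row of the first model's summary output
estimate  <- 2.33763   # estimated wage gap, dollars per hour
std_error <- 0.38806   # standard error of the estimate
df_resid  <- 530       # residual degrees of freedom

# Two-sided 95% interval: estimate +/- t critical value * standard error
t_crit <- qt(0.975, df = df_resid)
ci <- estimate + c(-1, 1) * t_crit * std_error
round(ci, 2)
```

The interval comes out near $1.58 to $3.10 per hour and does not contain 0, which agrees with the very small p-value. With the fitted model available, `confint(wages_lm)` should give the same interval without copying numbers.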
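The predicted wage for a woman who has 15 years of education and 5 years of experience is $8.17 per hour: -6.50451 + 0.94051 × 15 + 0.11330 × 5 = 8.17. The sexM term drops out because female is the reference level.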
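The prediction can be checked by hand from the reported coefficients. The sketch below copies the estimates from the first model's summary and assumes that female is the reference level of sex (so sexM = 0); the vector names are only illustrative.

```r
# Coefficients copied from the summary output of the first model
coefs <- c(intercept = -6.50451, educ = 0.94051, exper = 0.11330, sexM = 2.33763)

# A woman (reference level, sexM = 0) with 15 years of education
# and 5 years of experience; the leading 1 multiplies the intercept
x <- c(intercept = 1, educ = 15, exper = 5, sexM = 0)

# Predicted hourly wage in dollars = sum of coefficient * value
predicted_wage <- sum(coefs * x)
round(predicted_wage, 2)  # 8.17
```

With the fitted model in memory, `predict(wages_lm, newdata = data.frame(educ = 15, exper = 5, sex = "F"))` should return the same value up to rounding.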
Hint: Provide a technical interpretation.
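The intercept is -6.50451. This means that the predicted wage is -$6.50 per hour when all predictors are 0, that is, for a woman with 0 years of education and 0 years of experience. A negative wage is not meaningful, because no one in the data is close to that combination of values.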
Hint: Discuss in terms of both residual standard error and reported adjusted R squared.
data(CPS85, package="mosaicData")
wages_lm <- lm(wage ~ educ + exper + sex + union,
data = CPS85)
# View summary of model 2
summary(wages_lm)
##
## Call:
## lm(formula = wage ~ educ + exper + sex + union, data = CPS85)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
##  -9.496  -2.708  -0.712   1.909  37.784
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept) -6.48023    1.20159  -5.393 1.05e-07 ***
## educ         0.93495    0.07835  11.934  < 2e-16 ***
## exper        0.10692    0.01674   6.387 3.70e-10 ***
## sexM         2.14765    0.39097   5.493 6.14e-08 ***
## unionUnion   1.47111    0.50932   2.888  0.00403 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 4.423 on 529 degrees of freedom
## Multiple R-squared: 0.2648, Adjusted R-squared: 0.2592
## F-statistic: 47.62 on 4 and 529 DF, p-value: < 2.2e-16
The residual standard error is 4.454 for the first model and 4.423 for the second model. The error for the second model is lower, which means its predictions miss the actual wage by about $4.42 per hour on average, compared with about $4.45 for the first model.
The adjusted R-squared is 0.2489 for the first model and 0.2592 for the second model. This means that about 25% and 26% of the variability in wages is explained by the models.
The second model misses the actual wage by slightly less than the first model and also explains slightly more of the variability, so the second model is better, although the improvement is small.
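The size of the improvement can be put in numbers. The sketch below only copies the residual standard errors and adjusted R-squared values from the two summary outputs; the variable names are illustrative.

```r
# Values copied from the two summary() outputs
rse    <- c(model1 = 4.454,  model2 = 4.423)   # residual standard error, dollars per hour
adj_r2 <- c(model1 = 0.2489, model2 = 0.2592)  # adjusted R-squared

# Change when union is added to the model (model 2 minus model 1)
rse_change    <- unname(rse["model2"] - rse["model1"])
adj_r2_change <- unname(adj_r2["model2"] - adj_r2["model1"])

round(rse_change, 3)     # -0.031
round(adj_r2_change, 4)  # 0.0103
```

Adding union lowers the typical prediction error by about 3 cents per hour and raises the adjusted R-squared by about 1 percentage point.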
Hint: Use message, echo and results in the chunk options. Refer to the RMarkdown Reference Guide.
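The three options named in the hint are standard knitr chunk options. The sketch below assumes the goal is to hide the code, the messages and the printed results; that goal is an assumption, so the values should be changed to whatever the question actually asks for.

```r
# Per-chunk form, written in the chunk header of the .Rmd file:
#   {r, echo = FALSE, message = FALSE, results = "hide"}

# Document-wide equivalent, placed in a setup chunk near the top:
knitr::opts_chunk$set(
  echo    = FALSE,   # do not show the R source code
  message = FALSE,   # suppress messages such as package start-up notes
  results = "hide"   # do not show printed text output
)
```

Options set in an individual chunk header override the document-wide defaults for that chunk.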