library(tidyverse)
## ── Attaching packages ──────────────────────────────────────────────────────────────────────── tidyverse 1.2.1 ──
## ✔ ggplot2 3.2.1 ✔ purrr 0.3.2
## ✔ tibble 2.1.3 ✔ dplyr 0.8.3
## ✔ tidyr 0.8.3 ✔ stringr 1.4.0
## ✔ readr 1.3.1 ✔ forcats 0.4.0
## ── Conflicts ─────────────────────────────────────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
library(scales)
##
## Attaching package: 'scales'
## The following object is masked from 'package:purrr':
##
## discard
## The following object is masked from 'package:readr':
##
## col_factor
options(scipen = 999)
data(CPS85, package="mosaicData")
# Full model: hourly wage on education, race, sex, age and experience
wage_lm_full <- lm(wage ~ educ + race + sex + age + exper,
data = CPS85)
# View summary of the full model
summary(wage_lm_full)
##
## Call:
## lm(formula = wage ~ educ + race + sex + age + exper, data = CPS85)
##
## Residuals:
## Min 1Q Median 3Q Max
## -9.605 -2.695 -0.683 1.963 37.641
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -5.1857 6.8316 -0.759 0.448
## educ 1.2806 1.1186 1.145 0.253
## raceW 0.9355 0.5827 1.605 0.109
## sexM 2.3637 0.3885 6.084 0.00000000225 ***
## age -0.3467 1.1180 -0.310 0.757
## exper 0.4606 1.1190 0.412 0.681
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 4.451 on 528 degrees of freedom
## Multiple R-squared: 0.2569, Adjusted R-squared: 0.2499
## F-statistic: 36.51 on 5 and 528 DF, p-value: < 0.00000000000000022
Make sure to include the unit of the values whenever appropriate.
Hint: The variables are available in the CPS85 data set from the mosaicData package.
data(CPS85, package="mosaicData")
# Reduced model: hourly wage on education, experience and sex
wage_lm <- lm(wage ~ educ + exper + sex,
data = CPS85)
# View summary of the reduced model
summary(wage_lm)
##
## Call:
## lm(formula = wage ~ educ + exper + sex, data = CPS85)
##
## Residuals:
## Min 1Q Median 3Q Max
## -9.571 -2.746 -0.653 1.893 37.724
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -6.50451 1.20985 -5.376 0.0000001141795 ***
## educ 0.94051 0.07886 11.926 < 0.0000000000000002 ***
## exper 0.11330 0.01671 6.781 0.0000000000319 ***
## sexM 2.33763 0.38806 6.024 0.0000000031877 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 4.454 on 530 degrees of freedom
## Multiple R-squared: 0.2532, Adjusted R-squared: 0.2489
## F-statistic: 59.88 on 3 and 530 DF, p-value: < 0.00000000000000022
Yes, the coefficient of education is statistically significant at the 5% level because its p-value is less than 2e-16, which is far below 0.05.
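The p-value for educ can be recovered from its reported t value and the residual degrees of freedom. The following is a minimal sketch that assumes only the numbers printed in the summary above; the variable names are illustrative.

```r
# Values copied from the summary output of the three-predictor model
t_educ <- 11.926   # t value reported for educ
df_res <- 530      # residual degrees of freedom

# Two-sided p-value from the t distribution
p_educ <- 2 * pt(-abs(t_educ), df = df_res)

# TRUE means the coefficient is significant at the 5% level
p_educ < 0.05
```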
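Hint: Discuss both its sign and magnitude.
The coefficient of education is positive (0.94051): each additional year of education is associated with about $0.94 more in hourly wage, holding experience and sex constant. The three stars after the p-value mean that it is below 0.001, so the coefficient is significant at the 0.1% level.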
Hint: Discuss all three aspects of the relevant predictor: 1) statistical significance, 2) sign, and 3) magnitude.
Yes, there is evidence consistent with gender discrimination in wages. The coefficient of sexM is statistically significant (p = 0.0000000031877), positive, and sizable: men are paid about $2.34 more per hour than women with the same education and experience. For example, I found a female teacher who has 12 years of education and 44 years of experience and is paid $9.17 an hour, and a male teacher who has 14 years of education and 22 years of experience and is paid $10.00 an hour. A man can have less experience than a woman and still be paid more.
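A 95% confidence interval is one way to put the size of the wage gap in context. The sketch below builds it from the estimate and standard error reported for sexM in the summary above; the variable names are illustrative. An interval that excludes zero agrees with significance at the 5% level.

```r
# Values copied from the summary output of the three-predictor model
est_sexM <- 2.33763   # estimated wage gap, dollars per hour
se_sexM  <- 0.38806   # standard error of the estimate
df_res   <- 530       # residual degrees of freedom

# Critical t value for a two-sided 95% interval
t_crit <- qt(0.975, df = df_res)

# Lower and upper bounds of the interval
ci_sexM <- est_sexM + c(-1, 1) * t_crit * se_sexM
ci_sexM
```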
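I would say around $9.75, because in the data there is a woman who has 12 years of education and 5 years of experience and is paid $9.57 an hour. The fitted model itself predicts a lower wage for a woman with that profile: -6.50451 + 0.94051(12) + 0.11330(5) ≈ $5.35 an hour.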
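A model-based prediction for any profile can be reproduced from the reported coefficients. This is a minimal sketch; the profile of a woman with 12 years of education and 5 years of experience is taken from the answer above, and the object names are illustrative.

```r
# Coefficients copied from the summary output of the three-predictor model
coefs <- c(intercept = -6.50451, educ = 0.94051, exper = 0.11330, sexM = 2.33763)

# Profile to predict for: sexM = 0 codes a woman, intercept is always 1
profile <- c(intercept = 1, educ = 12, exper = 5, sexM = 0)

# Predicted hourly wage in dollars: sum of coefficient times value
predicted_wage <- sum(coefs * profile[names(coefs)])
predicted_wage
```

With the fitted model object available, `predict()` with a `newdata` data frame gives the same kind of result without retyping the coefficients.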
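Hint: Provide a technical interpretation.
Three stars after the p-value mean that it is below 0.001, so the variable is significant at the 0.1% level. In other words, if the true coefficient were zero, an estimate this far from zero would occur less than 0.1% of the time.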
Hint: Discuss in terms of both residual standard error and reported adjusted R squared.
The reported residual standard error is 4.454, which means the wages predicted by the model typically miss the actual wages by about $4.45 an hour. The reported adjusted R^2 is 0.2489, which means the model explains about 24.89% of the variation in wages after adjusting for the number of predictors.
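The adjusted R-squared can be checked against the multiple R-squared using the standard adjustment for the number of predictors. The sketch below uses only figures from the summary above; the sample size of 534 is inferred from the 530 residual degrees of freedom plus four estimated coefficients.

```r
# Values taken from the summary output of the three-predictor model
r2 <- 0.2532   # multiple R-squared
n  <- 534      # observations: 530 residual df + 3 predictors + 1 intercept
p  <- 3        # number of predictors (educ, exper, sexM)

# Adjusted R-squared penalizes R-squared for each added predictor
adj_r2 <- 1 - (1 - r2) * (n - 1) / (n - p - 1)
round(adj_r2, 3)
```

The result matches the reported 0.2489 up to the rounding of the printed R-squared.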
Hint: Use message, echo and results in the chunk options. Refer to the RMarkdown Reference Guide.
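One possible way to apply these options is to set them once for the whole document with `knitr::opts_chunk$set()`. This is a sketch that assumes the document is knitted with knitr; the option values shown are one reasonable choice and not the required answer.

```r
# Place in a setup chunk near the top of the .Rmd file
knitr::opts_chunk$set(
  message = FALSE,    # hide package start-up messages such as "Attaching packages"
  echo    = TRUE,     # keep the R code visible in the knitted document
  results = "markup"  # show printed results; use "hide" to suppress them
)
```

The same options can also be set for a single chunk in its header, for example `{r, message=FALSE}` on the chunk that loads the libraries.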