library(tidyverse)
library(scales)
options(scipen = 999)

data(CPS85, package = "mosaicData")

# Preliminary model: wage regressed on all five candidate predictors
wage_all_lm <- lm(wage ~ educ + race + sex + age + exper,
                  data = CPS85)

# View summary of the preliminary model
summary(wage_all_lm)
## 
## Call:
## lm(formula = wage ~ educ + race + sex + age + exper, data = CPS85)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -9.605 -2.695 -0.683  1.963 37.641 
## 
## Coefficients:
##             Estimate Std. Error t value      Pr(>|t|)    
## (Intercept)  -5.1857     6.8316  -0.759         0.448    
## educ          1.2806     1.1186   1.145         0.253    
## raceW         0.9355     0.5827   1.605         0.109    
## sexM          2.3637     0.3885   6.084 0.00000000225 ***
## age          -0.3467     1.1180  -0.310         0.757    
## exper         0.4606     1.1190   0.412         0.681    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.451 on 528 degrees of freedom
## Multiple R-squared:  0.2569, Adjusted R-squared:  0.2499 
## F-statistic: 36.51 on 5 and 528 DF,  p-value: < 0.00000000000000022

Make sure to include the unit of the values whenever appropriate.

Q1 Build a regression model to predict wages using the following predictors: 1) years of education, 2) years of experience, and 3) sex.

Hint: The variables are available in the CPS85 data set from the mosaicData package.

data(CPS85, package = "mosaicData")

# Model 1: hourly wage predicted by education, experience, and sex
wage_lm <- lm(wage ~ educ + exper + sex,
              data = CPS85)

# View summary of model 1
summary(wage_lm)
## 
## Call:
## lm(formula = wage ~ educ + exper + sex, data = CPS85)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -9.571 -2.746 -0.653  1.893 37.724 
## 
## Coefficients:
##             Estimate Std. Error t value             Pr(>|t|)    
## (Intercept) -6.50451    1.20985  -5.376      0.0000001141795 ***
## educ         0.94051    0.07886  11.926 < 0.0000000000000002 ***
## exper        0.11330    0.01671   6.781      0.0000000000319 ***
## sexM         2.33763    0.38806   6.024      0.0000000031877 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.454 on 530 degrees of freedom
## Multiple R-squared:  0.2532, Adjusted R-squared:  0.2489 
## F-statistic: 59.88 on 3 and 530 DF,  p-value: < 0.00000000000000022

Q2 Is the coefficient of education statistically significant at 5%?

Yes, the coefficient of education is statistically significant at the 5% level: its p-value is less than 2e-16, far below 0.05.

Q3 Interpret the coefficient of education.

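Hint: Discuss both its sign and magnitude.

The coefficient of education is positive (0.94): each additional year of education is associated with an hourly wage about $0.94 higher, holding experience and sex constant. The three stars next to its p-value mean the coefficient is significant at the 0.1% level (p < 0.001); they do not mean the estimate is 99.9% certain to be true.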

Q4 Is there evidence for gender discrimination in wages? Make your argument using the relevant test results.

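Hint: Discuss all three aspects of the relevant predictor: 1) statistical significance, 2) sign, and 3) magnitude.

Yes, the results are consistent with gender discrimination in wages. The coefficient of sexM is statistically significant at 5% (p-value of about 3.2e-9), it is positive, and its magnitude is 2.34: a man is predicted to earn about $2.34 more per hour than a woman with the same education and experience. As an illustration from the data, I found a female teacher with 12 years of education and 44 years of experience who is paid $9.17 an hour, and a male teacher with 14 years of education and 22 years of experience who is paid $10.00 an hour. A single pair like this is only anecdotal, and the model controls only for education and experience, so the gap could partly reflect factors that are not in the model.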
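As a rough check on the size of the gap, the 95% confidence interval for the sexM coefficient can be rebuilt from the estimate, standard error, and residual degrees of freedom printed in the Q1 summary. This is only a sketch that copies those three numbers by hand; the variable names are illustrative, and calling `confint()` on the fitted model would be the more direct route.

```r
# Values copied from the Q1 summary output for sexM
estimate  <- 2.33763   # estimated wage gap in dollars per hour
std_error <- 0.38806   # standard error of the estimate
df_resid  <- 530       # residual degrees of freedom

# t statistic, as reported in the "t value" column
t_stat <- estimate / std_error

# Two-sided 95% confidence interval based on the t distribution
t_crit <- qt(0.975, df = df_resid)
ci <- estimate + c(-1, 1) * t_crit * std_error

round(t_stat, 3)
round(ci, 2)
```

If the interval excludes zero, that agrees with the coefficient being significant at the 5% level.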

Q5 Predict wage for a woman who has 15 years of education, 5 years of experience.

I would predict about $8.17 an hour. Plugging into the model with sexM = 0 for a woman: -6.50451 + 0.94051 × 15 + 0.11330 × 5 = 8.17.
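The model-based prediction for this person can be worked out directly from the coefficients in the Q1 summary. The sketch below copies those coefficients by hand (the vector names are illustrative); with the fitted model object available, `predict()` with a `newdata` data frame would give the same kind of answer.

```r
# Coefficients copied from the Q1 summary output
coefs <- c(intercept = -6.50451, educ = 0.94051, exper = 0.11330, sexM = 2.33763)

# A woman with 15 years of education and 5 years of experience
# (sexM = 0 because female is the baseline category)
woman <- c(intercept = 1, educ = 15, exper = 5, sexM = 0)

# Predicted hourly wage in dollars
predicted_wage <- sum(coefs * woman)
round(predicted_wage, 2)
```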

Q6 Interpret the Intercept.

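Hint: Provide a technical interpretation.

The intercept is -6.50: the model predicts an hourly wage of -$6.50 for a woman (the baseline category of sex) with 0 years of education and 0 years of experience. A negative wage is not meaningful in practice, since it is an extrapolation rather than a realistic prediction. The three stars mean the intercept is statistically significant at the 0.1% level (p-value of about 1.1e-7).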

Q7 Build another model by adding a predictor to the model above. The additional predictor is whether the person is a union member. Which of the two models is better?

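Hint: Discuss in terms of both residual standard error and reported adjusted R squared.

For the model above, the residual standard error is 4.454, meaning the model's predictions typically miss the actual wage by about $4.45 an hour. The adjusted R squared is 0.2489, meaning the model explains about 24.89% of the variation in wages after adjusting for the number of predictors. The model with union membership added is the better one if it has a lower residual standard error and a higher adjusted R squared than these values.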
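One way to make the comparison is to fit both models and put their residual standard errors and adjusted R squared values side by side. This sketch assumes the mosaicData package is installed and that union membership is stored in the `union` column of CPS85; the object names are illustrative.

```r
# Assumes the mosaicData package is installed
data(CPS85, package = "mosaicData")

# Model 1 from Q1, and model 2 with union membership added
wage_lm  <- lm(wage ~ educ + exper + sex, data = CPS85)
union_lm <- lm(wage ~ educ + exper + sex + union, data = CPS85)

# Collect the two fit statistics for each model
comparison <- data.frame(
  model         = c("educ + exper + sex", "educ + exper + sex + union"),
  residual_se   = c(sigma(wage_lm), sigma(union_lm)),
  adj_r_squared = c(summary(wage_lm)$adj.r.squared,
                    summary(union_lm)$adj.r.squared)
)
print(comparison)
```

The preferred model is the row with the smaller `residual_se` and the larger `adj_r_squared`.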

Q8 Hide the messages, but display the code and its results on the webpage.

Hint: Use message, echo and results in the chunk options. Refer to the RMarkdown Reference Guide.
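A minimal sketch of one way to do this is to set the three options once for the whole document in a setup chunk, using knitr's `opts_chunk$set()`. The same options can instead be written in an individual chunk header, for example `{r, message=FALSE, echo=TRUE, results='markup'}`.

```r
# Placed in the first chunk of the .Rmd file, this applies to every later chunk
knitr::opts_chunk$set(
  message = FALSE,     # hide package start-up and other messages
  echo    = TRUE,      # still display the code
  results = "markup"   # still display the printed results
)
```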

Q9 Display the title and your name correctly at the top of the webpage.

Q10 Use the correct slug.