Introduction

In this homework, I examine the relationship between education and income and ask whether that relationship differs by gender.

Data

The data for this homework come from the voteincome dataset in the Zelig package. It contains 1,500 observations on 7 variables.

Results

library(Zelig)
data("voteincome")
tibble::glimpse(voteincome)
## Observations: 1,500
## Variables: 7
## $ state     <fct> AR, AR, AR, AR, AR, AR, AR, AR, AR, AR, AR, AR, AR, ...
## $ year      <int> 2000, 2000, 2000, 2000, 2000, 2000, 2000, 2000, 2000...
## $ vote      <int> 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1...
## $ income    <int> 9, 11, 12, 16, 10, 12, 14, 10, 17, 8, 15, 13, 10, 9,...
## $ education <int> 2, 2, 2, 4, 4, 3, 4, 1, 2, 1, 3, 3, 2, 2, 2, 3, 2, 1...
## $ age       <int> 73, 24, 24, 40, 85, 78, 31, 75, 54, 78, 71, 40, 46, ...
## $ female    <int> 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0...
m1 <- lm(income ~ education, data = voteincome)
m2 <- lm(income ~ education + female, data = voteincome)
m3 <- lm(income ~ education * female, data = voteincome)
library(texreg)
htmlreg(list(m1, m2, m3))
Statistical models
                    Model 1    Model 2    Model 3
(Intercept)         6.97***    7.33***    7.89***
                    (0.24)     (0.26)     (0.37)
education           2.07***    2.05***    1.85***
                    (0.08)     (0.08)     (0.13)
female                         -0.55**    -1.54**
                               (0.17)     (0.48)
education:female                          0.37*
                                          (0.17)
R²                  0.29       0.30       0.30
Adj. R²             0.29       0.30       0.30
Num. obs.           1500       1500       1500
RMSE                3.30       3.29       3.28
*** p < 0.001, ** p < 0.01, * p < 0.05

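Model 1 shows that a one-unit increase in education is associated with a 2.07-unit increase in income, on average. Both variables are measured on the dataset's coded integer scales, and education alone accounts for about 29% of the variance in income (R² = 0.29).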
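
To make the Model 1 slope concrete, the sketch below plugs the rounded coefficients from the table into the fitted line. The helper name predict_m1 is hypothetical, and the education levels 1 through 4 are an assumption based on the values visible in the glimpse output.

```r
# Rounded Model 1 coefficients copied from the regression table
b0 <- 6.97  # intercept
b1 <- 2.07  # slope on education

# Hypothetical helper: fitted income in the dataset's coded units
predict_m1 <- function(education) b0 + b1 * education

# Education levels assumed to run from 1 to 4
fitted_m1 <- predict_m1(1:4)
print(round(fitted_m1, 2))
```

Each step up in education moves the fitted value by the same 2.07 units, which is what the single slope in Model 1 implies.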

Model 2 shows that after controlling for gender, the estimated effect of education on income barely changes, moving from 2.07 to 2.05. Model 2 also suggests that, at the same level of education, women on average earn 0.55 units less than men.

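Model 3, which adds an interaction term between education and gender, suggests a few things. First, the interaction between education and gender is statistically significant (p < 0.05). Second, education is associated with a larger increase in income for women than for men: the education slope is 1.85 for men and 1.85 + 0.37 = 2.22 for women. Third, the female coefficient of -1.54 is the estimated gap at education = 0, which lies below the education levels 1 to 4 that appear in the data. Within that range the gap narrows as education rises but stays negative, so on average men still earn more than women. Note that R² stays at 0.30, so the interaction adds little explanatory power.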
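
The Model 3 interpretation can be checked with a little arithmetic on the reported coefficients. The following is a minimal sketch that uses the rounded values from the table rather than the fitted m3 object, and it assumes education takes the values 1 through 4. The variable names are illustrative only.

```r
# Rounded Model 3 coefficients copied from the regression table
b_education   <- 1.85
b_female      <- -1.54
b_interaction <- 0.37

# Education slope for each group
slope_men   <- b_education
slope_women <- b_education + b_interaction

# Predicted female-minus-male income gap at each education level
# (education assumed to range from 1 to 4)
education <- 1:4
gap <- b_female + b_interaction * education

print(c(men = slope_men, women = slope_women))
print(round(setNames(gap, education), 2))
```

Under these assumptions the gap shrinks with every additional level of education without changing sign. Because the inputs are rounded, the results may differ slightly from what predict() would return on the fitted model.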

Discussion

Now, use the results to return to the larger motivating questions you set out to answer. Why do people care about the relationship between education and income? Why do you care about how gender interacts with that relationship? Discuss the relevance of your findings to the world at large. This is generally how the homeworks should look from now on.