Using Maximum Likelihood Estimations (MLE) to Understand the Relationship between Income & Education

Introduction

The national attainment rate for post-secondary education, including quality credentials, was 47.6% as of 2017 [1]. Each year, a gradually growing share of Americans earns a degree beyond high school or a credential that provides a skill to advance their role in the workforce. Prior data suggest a strong correlation between educational attainment and income: as education increases, people specialize in a particular field and tend to earn a higher income.

I am interested in assessing the relationship between education and income using the “turnout” dataset from the Zelig package, which provides a range of statistical models through a unified interface. To assess the relationship between the independent variable (education) and the dependent variable (income), I will fit two maximum likelihood models and compare their results, following Slide 4.15 and Slide 4.19. The Slide 4.15 model estimates education’s effect on mean income (mu). The Slide 4.19 model estimates education’s effect on the spread of income, or income inequality (sigma).

Maximum likelihood estimation determines parameter values (such as a mean, slope, y-intercept, or standard deviation) from a data sample by choosing the values that maximize the probability (likelihood) of obtaining the observed data.
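
Written out for the two normal models compared in this analysis, the idea looks roughly as follows. The notation ($y_i$ for income, $x_i$ for education, $n$ for the number of respondents) is introduced here for illustration; it is inferred from the R functions in this document and is an assumption about how the slides set up the models.

```latex
\hat{\theta} = \arg\max_{\theta} \sum_{i=1}^{n} \log f(y_i \mid \theta),
\qquad
\log f(y_i \mid \mu_i, \sigma_i) = -\tfrac{1}{2}\log(2\pi) - \log\sigma_i - \frac{(y_i - \mu_i)^2}{2\sigma_i^2}

\text{Mean model (Slide 4.15): } \mu_i = \beta_1 + \beta_2 x_i, \quad \sigma_i = \sigma

\text{Spread model (Slide 4.19): } \mu_i = \mu, \quad \sigma_i = \theta_1 + \theta_2 x_i
```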

Installed Packages

library(Zelig)
## Loading required package: survival
library(maxLik)
## Loading required package: miscTools
## 
## Please cite the 'maxLik' package as:
## Henningsen, Arne and Toomet, Ott (2011). maxLik: A package for maximum likelihood estimation in R. Computational Statistics 26(3), 443-458. DOI 10.1007/s00180-010-0217-1.
## 
## If you have questions, suggestions, or comments regarding the 'maxLik' package, please use a forum or 'tracker' at maxLik's R-Forge site:
## https://r-forge.r-project.org/projects/maxlik/

Dataset

The “turnout” dataset is imported from the Zelig package. The variables I will be looking at are education and income. Education serves as the independent variable, while income is the dependent variable. When finding the probability mass function (PMF), income will be treated as a discrete variable, and when finding the probability density function (PDF), income will be treated as a continuous variable.

data("turnout")
head(turnout)
##    race age educate income vote
## 1 white  60      14 3.3458    1
## 2 white  51      10 1.8561    0
## 3 white  24      12 0.6304    0
## 4 white  38       8 3.4183    1
## 5 white  25      12 2.7852    1
## 6 white  67      12 2.3866    1
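
The log-likelihood functions in this document all rely on summing normal log densities with dnorm(..., log = TRUE). As a small illustration, the sketch below uses the six income values printed by head(turnout) and arbitrary trial values for the mean and standard deviation (mu_trial and sigma_trial are hypothetical, chosen only for the example) to show that this sum equals the log of the product of the densities.

```r
# First six income values, copied from the head(turnout) output
y <- c(3.3458, 1.8561, 0.6304, 3.4183, 2.7852, 2.3866)

# Arbitrary trial parameter values (hypothetical, for illustration only)
mu_trial <- 2
sigma_trial <- 1

# Log-likelihood as the sum of the log densities of each observation
ll_sum <- sum(dnorm(y, mean = mu_trial, sd = sigma_trial, log = TRUE))

# The same quantity as the log of the product of the densities
ll_prod <- log(prod(dnorm(y, mean = mu_trial, sd = sigma_trial)))

# The two agree up to floating-point error
isTRUE(all.equal(ll_sum, ll_prod))
```

Summing logs is preferred in practice because the product of many small densities can underflow to zero on a full-size sample.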

The Log-likelihood Function:

Impact of Education on Mean Income (Slide 4.15)

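ols.lf <- function(param) {
    # param = (sigma, beta1, beta2): drop the first element to get the betas
    beta <- param[-1]
    sigma <- param[1]
    y <- as.vector(turnout$income)
    # design matrix: a column of 1s for the intercept, then education
    x <- cbind(1, turnout$educate)
    # the mean of income is a linear function of education
    mu <- x %*% beta
    # log-likelihood: sum of the normal log densities
    sum(dnorm(y, mu, sigma, log = TRUE))
}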
param <- c(1,2,3,4,5)
param[-1]
## [1] 2 3 4 5
mle_ols <- maxLik(logLik = ols.lf, start = c(sigma = 1, beta1 = 1, beta2 = 1), method = "nm")
summary(mle_ols)
## --------------------------------------------
## Maximum Likelihood estimation
## Nelder-Mead maximization, 178 iterations
## Return code 0: successful convergence 
## Log-Likelihood: -4691.257 
## 3  free parameters
## Estimates:
##       Estimate Std. error t value Pr(> t)    
## sigma  2.52584    0.03992  63.274 < 2e-16 ***
## beta1 -0.65328    0.20518  -3.184 0.00145 ** 
## beta2  0.37636    0.01641  22.937 < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## --------------------------------------------

Results

Slide 4.15 displays the relationship between education and average income. There are three parameters in this first log-likelihood function (sigma, beta1, and beta2). Sigma represents the standard deviation of income around its predicted mean and is estimated at 2.526. Beta1 is the y-intercept for the mean, -0.653: when education equals 0, the predicted mean income is -0.653. Beta2 is the slope for the mean, 0.376: when education increases by one unit, mean income increases by 0.376 units. These results suggest a positive relationship between education and average income; as education increases, so does income.
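
For a normal model with constant sigma, the maximum likelihood estimates of the intercept and slope coincide with ordinary least squares, which offers a way to cross-check a fit like this one. The following is a minimal sketch on simulated data, not the turnout data: the variables educ and income, their generating values, and the use of base R's optim() with a log-sigma parameterization in place of maxLik() are all assumptions made to keep the example self-contained.

```r
# Simulated stand-in data (hypothetical; not the turnout dataset)
set.seed(1)
n <- 400
educ <- sample(8:18, n, replace = TRUE)
income <- -0.5 + 0.4 * educ + rnorm(n, sd = 2.5)

# Negative log-likelihood; param = (log sigma, intercept, slope).
# Working with log sigma keeps sigma positive for any parameter value.
neg_ll <- function(param) {
  sigma <- exp(param[1])
  mu <- cbind(1, educ) %*% param[-1]
  -sum(dnorm(income, mu, sigma, log = TRUE))
}

# Start from the intercept-only model: overall sd, overall mean, zero slope
start <- c(log(sd(income)), mean(income), 0)
fit <- optim(start, neg_ll, method = "BFGS",
             control = list(maxit = 1000, reltol = 1e-10))

# Ordinary least squares fit of the same mean model
ols <- lm(income ~ educ)

mle_coef <- fit$par[-1]
ols_coef <- unname(coef(ols))

# Side-by-side comparison of intercept and slope
round(cbind(mle = mle_coef, ols = ols_coef), 3)
```

On the real data, the analogous check would compare the beta estimates from maxLik with coef(lm(income ~ educate, data = turnout)).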

The Likelihood Function:

Impact of Education on Standard Deviation of Income (Slide 4.19)

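ols.lf2 <- function(param) {
  # param = (mu, theta1, theta2): a constant mean, then the coefficients for sigma
  mu <- param[1]
  theta <- param[-1]
  y <- as.vector(turnout$income)
  x <- cbind(1, turnout$educate)
  # the standard deviation of income is a linear function of education
  sigma <- x %*% theta
  sum(dnorm(y, mu, sigma, log = TRUE))
}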
mle_ols2 <- maxLik(logLik = ols.lf2, start = c(mu = 1, theta1 = 1, theta2 = 1), method = "nm")
summary(mle_ols2)
## --------------------------------------------
## Maximum Likelihood estimation
## Nelder-Mead maximization, 154 iterations
## Return code 0: successful convergence 
## Log-Likelihood: -4861.964 
## 3  free parameters
## Estimates:
##        Estimate Std. error t value Pr(> t)    
## mu     3.516252   0.070582   49.82  <2e-16 ***
## theta1 1.462007   0.107430   13.61  <2e-16 ***
## theta2 0.109056   0.009234   11.81  <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## --------------------------------------------

Results

Slide 4.19 displays the relationship between education and income inequality. There are three parameters in this likelihood estimation (mu, theta1, and theta2). Mu, the mean of income, is estimated at 3.516. Theta1 is now the y-intercept for the standard deviation, 1.462: when education equals 0, the standard deviation of income across individuals is 1.462. Theta2 is the slope, 0.109: when education increases by one unit, the standard deviation of income increases by 0.109. These results suggest a positive relationship between education and income inequality; as education increases, incomes vary more.

Log-likelihood Function 2:

Age as Second Independent Variable

This function will assess the relationship between age, education, and income.

ols.lf3<-function(param){
  beta <- param[-1]
  sigma <- param[1]
  y <- as.vector(turnout$income)
  x <- cbind(1, turnout$educate, turnout$age)
  mu <- x%*%beta
  sum(dnorm(y, mu, sigma, log=TRUE))}
mle_ols3<-maxLik(logLik=ols.lf3, start=c(sigma=1, beta1=1, beta2=1, beta3=1), method = "nm")
summary(mle_ols3)
## --------------------------------------------
## Maximum Likelihood estimation
## Nelder-Mead maximization, 123 iterations
## Return code 0: successful convergence 
## Log-Likelihood: -4702.483 
## 4  free parameters
## Estimates:
##        Estimate Std. error t value  Pr(> t)    
## sigma  2.676915   0.046455  57.624  < 2e-16 ***
## beta1  0.561730   0.322201   1.743 0.081261 .  
## beta2  0.327644   0.018671  17.548  < 2e-16 ***
## beta3 -0.012936   0.003602  -3.591 0.000329 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## --------------------------------------------

Results

By adding a second independent variable (age) to the first statistical model, I can further examine the relationship between income and education. I predict that age will have a positive correlation with income, much like education. As individuals get older, they may decide to go back to college to earn a degree or certification that elevates their position in their job field. Young mothers may return to college after their children grow up. With a return to school at a later age, income will most likely increase as individuals become more skilled and experienced.

In the model above, there are four parameters (sigma, beta1, beta2, and beta3). Sigma, the standard deviation, is 2.677. The y-intercept (beta1) is 0.562: when education and age both equal zero, the mean income is 0.562, although this estimate is significant only at the 0.1 level (p = 0.081). The slope for education (beta2) is 0.328, so each one-unit increase in education is associated with an increase in mean income of 0.328. However, beta3, the slope for age, is negative (-0.013). This suggests my original prediction was incorrect: holding education constant, each one-unit increase in age is associated with a decrease in mean income of 0.013.
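
One caveat on these estimates: the log-likelihood reported for this four-parameter model (-4702.483) is lower than the -4691.257 reported for the education-only model. Because the smaller model is a special case of the larger one (beta3 = 0), the maximized log-likelihood of the larger model cannot be lower, so the Nelder-Mead search may have stopped short of the maximum. The sketch below illustrates the property on simulated data; the variables and generating values are hypothetical, and lm() with logLik() from base R stands in for maxLik().

```r
# Simulated stand-in data (hypothetical; not the turnout dataset)
set.seed(2)
n <- 300
educ <- sample(8:18, n, replace = TRUE)
age <- sample(18:80, n, replace = TRUE)
income <- 0.5 + 0.3 * educ - 0.01 * age + rnorm(n, sd = 2.5)

# Nested normal models: the small one is the large one with the age slope at 0
small <- lm(income ~ educ)
large <- lm(income ~ educ + age)

# logLik() on an lm fit returns the maximized Gaussian log-likelihood
ll_small <- as.numeric(logLik(small))
ll_large <- as.numeric(logLik(large))

c(small = ll_small, large = ll_large)

# Adding a regressor can never lower the maximized log-likelihood
ll_large >= ll_small
```

If a numerical optimizer reports the opposite ordering, restarting from different starting values or switching the method argument (for example to "BFGS") is a reasonable next step.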

Likelihood Function 2

Age as Second Independent Variable

ols.lf4 <- function(param){
  mu <- param[1]
  theta <- param[-1]
  y <- as.vector(turnout$income)
  x <- cbind(1, turnout$educate, turnout$age)
  sigma <- x%*%theta
  sum(dnorm(y, mu, sigma, log=TRUE))}
mle_ols4 <- maxLik(logLik = ols.lf4, start = c(mu = 1, theta1 = 1, theta2 = 1, theta3 = 1), method = "nm")
## Warning in dnorm(y, mu, sigma, log = TRUE): NaNs produced

summary(mle_ols4)
## --------------------------------------------
## Maximum Likelihood estimation
## Nelder-Mead maximization, 269 iterations
## Return code 0: successful convergence 
## Log-Likelihood: -4866.043 
## 4  free parameters
## Estimates:
##         Estimate Std. error t value  Pr(> t)    
## mu      3.221451   0.069045  46.657  < 2e-16 ***
## theta1 -0.722354   0.123845  -5.833 5.45e-09 ***
## theta2  0.187446   0.007655  24.486  < 2e-16 ***
## theta3  0.029008   0.002344  12.377  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## --------------------------------------------

Results

There are four parameters in the likelihood function above (mu, theta1, theta2, and theta3). Mu, the mean of income, is 3.221. The y-intercept (theta1) is -0.722, which would be the standard deviation of income when age and education both equal zero; because a standard deviation cannot be negative, this value is an extrapolation outside the range of the data and has no direct interpretation. The slope for education (theta2) is 0.187: as education increases by one unit, the standard deviation of income increases by 0.187. Theta3, the slope for age, is 0.029: as age increases by one unit, the standard deviation of income increases by 0.029. Both slopes are statistically significant in the summary above (p < 0.001), suggesting that age and education each have a positive relationship with income inequality, although the NaN warnings raised during estimation mean these estimates should be interpreted with caution.
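
The NaN warnings most likely arise because a linear formula for sigma can turn negative for some of the parameter values the optimizer tries, and dnorm() returns NaN for a negative standard deviation. One common alternative, which is not the approach taken in the slides, is to model log(sigma) as linear so that sigma is always positive. The sketch below shows this on simulated data; all variable names and generating values are hypothetical, and base R's optim() is used in place of maxLik().

```r
# Simulated stand-in data (hypothetical; not the turnout dataset)
set.seed(3)
n <- 500
educ <- sample(8:18, n, replace = TRUE)
age <- sample(18:80, n, replace = TRUE)
true_sigma <- exp(0.2 + 0.05 * educ + 0.005 * age)
income <- 3 + rnorm(n, sd = true_sigma)

# Negative log-likelihood; param = (mu, theta1, theta2, theta3).
# sigma = exp(x %*% theta) is positive for every theta.
neg_ll <- function(param) {
  mu <- param[1]
  x <- cbind(1, educ, age)
  sigma <- exp(x %*% param[-1])
  -sum(dnorm(income, mu, sigma, log = TRUE))
}

# Start from a constant-sigma model: overall mean, log of overall sd, zero slopes
start <- c(mean(income), log(sd(income)), 0, 0)
fit <- optim(start, neg_ll, method = "BFGS", control = list(maxit = 1000))

# Fitted standard deviations, one per observation
sigma_hat <- as.vector(exp(cbind(1, educ, age) %*% fit$par[-1]))
range(sigma_hat)
```

Under this parameterization the theta coefficients describe proportional changes in sigma per unit of education or age, so they are not directly comparable to the theta estimates reported above.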

References

  1. Lumina Foundation. Accessed February 2019. http://strongernation.luminafoundation.org/report/2019/#nation