4.15 and 4.19 of the lecture slidesIn order of find correlation between education and income, two different parameter used on those slides. Response variable or dependent variable is education and explanatory variable or independent variable is education. In slide 4.15, the most commonly used log likelihood function used to maximize the estimate of income in terms of the parameter \(\mu\). The model used:
\[ \begin{aligned} Y \sim N(\beta_0+ \beta_1X,\sigma) \\ \end{aligned} \]
Now lets us recreate the result:
# Installing the packeges
library(Zelig)
library(maxLik)
# Loding the data
data(turnout)
#log likelihood function
ols.lf <- function(param) {
beta <- param[-1]
sigma <- param[1]
y <- as.vector(turnout$income)
x <- cbind(1, turnout$educate)
mu <- x%*%beta
sum(dnorm(y, mu, sigma, log = TRUE))}
# Maximize the log likelihood function
mle_ols <- maxLik(logLik = ols.lf, start = c(sigma = 1, beta0 = 1, beta1 = 1))
summary(mle_ols)
--------------------------------------------
Maximum Likelihood estimation
Newton-Raphson maximisation, 12 iterations
Return code 2: successive function values within tolerance limit
Log-Likelihood: -4691.256
3 free parameters
Estimates:
Estimate Std. error t value Pr(> t)
sigma 2.52613 0.03989 63.326 < 2e-16 ***
beta0 -0.65207 0.20827 -3.131 0.00174 **
beta1 0.37613 0.01663 22.612 < 2e-16 ***
---
Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1
--------------------------------------------
lm Function, We Will Check Our Result # Linear regression model summary
summary(lm(income ~ educate, data = turnout))
Call:
lm(formula = income ~ educate, data = turnout)
Residuals:
Min 1Q Median 3Q Max
-6.2028 -1.7363 -0.4273 1.3150 11.0632
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.65207 0.21016 -3.103 0.00194 **
educate 0.37613 0.01677 22.422 < 2e-16 ***
---
Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1
Residual standard error: 2.527 on 1998 degrees of freedom
Multiple R-squared: 0.201, Adjusted R-squared: 0.2006
F-statistic: 502.8 on 1 and 1998 DF, p-value: < 2.2e-16
For the result, I can see \(\beta_0\) is y-intercept (Professor used \(\beta_1\) for Y-intercept variable) and \(\beta_1\) represent slope. \(\beta_0\) value tells us the average \(y\) value when \(x=0\). In our case, form \(\beta_0\), we can expect that people with no education expect to make on average \(-0.65\). \(\beta_1\) give us the slope value and it tells us, given one unite increase in education(\(x\)), there will be expected increase in income(\(y\)) which mean that average income expected to increase by \(0.38\) for one year of education. Therefore, we can say that increase in education will increase average income.
On the other hand, slide 4.19, less commonly used parameter Sigma or standard deviation was used to estimate correlation between income and education. The model used:
\[ \begin{aligned} Y \sim N(\mu,\theta_0+ \theta_1X) \\ \end{aligned} \]
#Log likelihood function for parameter standard deviation
ols.lf2 <- function(param) {
mu <- param[1]
theta <- param[-1]
y <- as.vector(turnout$income)
x <- cbind(1, turnout$educate)
sigma <- x%*%theta
sum(dnorm(y, mu, sigma, log = TRUE))
}
mle_ols2 <- maxLik(logLik = ols.lf2, start = c(mu = 1, theta0 = 1, theta1 = 1))
summary(mle_ols2)
--------------------------------------------
Maximum Likelihood estimation
Newton-Raphson maximisation, 9 iterations
Return code 2: successive function values within tolerance limit
Log-Likelihood: -4861.964
3 free parameters
Estimates:
Estimate Std. error t value Pr(> t)
mu 3.516764 0.070320 50.01 <2e-16 ***
theta0 1.461011 0.106745 13.69 <2e-16 ***
theta1 0.109081 0.009185 11.88 <2e-16 ***
---
Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1
--------------------------------------------
From the result, we see that \(\theta_0\) which is the y-intercept and \(\theta_1\) is the slope for our likelihood function for parameter standard deviation. \(\theta_0\) \(=\) \(1.46\) mean that people with no education expected to have \(1.46\) standard deviation of income. theata1 is the slop means that people with one year of education expected to increase \(0.109\) standard deviation of income. Since it is a positive correlation, we can say that higher education cause higher income differences.
Including Age in our model: \(Y \sim N(\beta_0+\beta_1X,\sigma)\): I am expecting to have a positive correlation, since more older people get, more educated they are and more money they make.
#log likelihood function with Age
ols.lf_with_Age <- function(param) {
beta <- param[-1]
sigma <- param[1]
y <- as.vector(turnout$income)
x <- cbind(1, turnout$educate, turnout$age)
mu <- x%*%beta
sum(dnorm(y, mu, sigma, log = TRUE))}
# Maximize the log likelihood function
mle_ols_with_Age<- maxLik(logLik = ols.lf_with_Age, start = c(sigma = 1, beta0 = 1, beta1 = 1,beta2 = 1))
summary(mle_ols_with_Age)
--------------------------------------------
Maximum Likelihood estimation
Newton-Raphson maximisation, 16 iterations
Return code 2: successive function values within tolerance limit
Log-Likelihood: -4690.815
4 free parameters
Estimates:
Estimate Std. error t value Pr(> t)
sigma 2.525576 0.039919 63.268 <2e-16 ***
beta0 -0.446047 0.300583 -1.484 0.138
beta1 0.371011 0.017493 21.209 <2e-16 ***
beta2 -0.003184 0.003373 -0.944 0.345
---
Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1
--------------------------------------------
lm Function, We Will Check Our Result # Multiple Linear regression model summary
summary(lm(income ~ educate + age, data = turnout))
Call:
lm(formula = income ~ educate + age, data = turnout)
Residuals:
Min 1Q Median 3Q Max
-6.2128 -1.7471 -0.4217 1.3042 11.1256
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.446084 0.303955 -1.468 0.142
educate 0.371013 0.017641 21.031 <2e-16 ***
age -0.003183 0.003394 -0.938 0.348
---
Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1
Residual standard error: 2.527 on 1997 degrees of freedom
Multiple R-squared: 0.2014, Adjusted R-squared: 0.2006
F-statistic: 251.8 on 2 and 1997 DF, p-value: < 2.2e-16
From the result, we can see the y-intercept beta0 is \(-0.446\), which mean people with zero education and age expect to have average income of \(-0.446\). From \(\beta_2\) value, we can a see a week negative correlation. \(\beta_2\) \(=\) \(-0.003\) tells us that one year increase of age will decrease average income by \(0.003\) adjusting or controlling for income. Beta1 which is education still have almost same value of \(0.37\), if we round up. Therefore, result proof my educated guess or hypothesis wrong.
Including Age in our model, \(Y \sim N(\mu,\theta_0+\theta_1X)\): Again I am expecting to have a positive correlation, since more older people get and more education they have, we can expect more income inequality.
#Log likelihood function with age for parameter standard deviation
ols.lf2_With_Age <- function(param) {
mu <- param[1]
theta <- param[-1]
y <- as.vector(turnout$income)
x <- cbind(1, turnout$educate, turnout$age)
sigma <- x%*%theta
sum(dnorm(y, mu, sigma, log = TRUE))
}
mle_ols2_with_Age <- maxLik(logLik = ols.lf2_With_Age, start = c(mu = 1, theta0 = 1, theta1 = 1, theta2 = 1),method='BFGS')
summary(mle_ols2_with_Age)
--------------------------------------------
Maximum Likelihood estimation
BFGS maximization, 150 iterations
Return code 0: successful convergence
Log-Likelihood: -4843.15
4 free parameters
Estimates:
Estimate Std. error t value Pr(> t)
mu 3.555011 0.069193 51.378 < 2e-16 ***
theta0 0.362114 0.204550 1.770 0.0767 .
theta1 0.133349 0.010756 12.398 < 2e-16 ***
theta2 0.017507 0.002852 6.139 8.32e-10 ***
---
Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1
--------------------------------------------
From the result, we can see that y-intercept is \(0.362\), which is people with zero education and age expected there income’s standards deviation of \(0.0175\). From \(\theta_2\) value, we associate that an increase of one year in Age with a standard deviation of income by \(0.0175\) controlling for Education. Moreover, \(\theta_1\) value is still positive. Therefore, I can say that my hypothesis is right that as people get older and more educated income inequality increase.
newdata <- subset(turnout,select=c(age, educate,income))
# Creating Four Catecory for Education
newdata$educate[newdata$educate>= 9 & newdata$educate <15]<- "Intermediate Education"
newdata$educate[newdata$educate >= 15 & newdata$educate <= 20 ]<- "Higher Education"
newdata$educate[newdata$educate>= 4 & newdata$educate<= 8]<- "Some Education"
newdata$educate[newdata$educate< 4]<- "Little Education"
ggplot(newdata) +
aes(x = age) +
aes(y = income) +
facet_wrap(~ educate, scales = "free_y", nrow = 2) +
geom_smooth(col = "grey30")+
ylab("INCOME")+
xlab( "AGE") +
labs(title = "Income Patterns for Different Education Level Throughout Different Age")+
theme_bw(base_size = 13)
From the patterns of income, we can see that people with little education make significantly make less money from the people with higher education. Therefore, the positive correlation of education and income is visible here. When comes to age, we see that people in different age group make the most money in there early age and people with at least some education income started to go down once they are over 60 since they started to retire. This might explain week negative correlation between income and age. If we filter out people over sixty-five, we most likely see a positive correlation between income and age. Only exception is people with low education. Their income increase after their retirement that may cause by the social welfare they receive in the old age.