library(Zelig)
data(turnout)
head(turnout)
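ols.lf <- function(param) {
  # param = c(sigma, beta1, beta2): standard deviation, then intercept and slope
  beta <- param[-1]
  sigma <- param[1]
  y <- as.vector(turnout$income)
  x <- cbind(1, turnout$educate)        # column of 1s for the intercept, then education
  mu <- x %*% beta                      # mean of income depends on education
  sum(dnorm(y, mu, sigma, log = TRUE))  # normal log-likelihood
}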
library(maxLik)
mle_ols <- maxLik(logLik = ols.lf, start = c(sigma = 1, beta1 = 1, beta2 = 1))
Warning: NaNs produced (warning repeated)
summary(mle_ols)
--------------------------------------------
Maximum Likelihood estimation
Newton-Raphson maximisation, 12 iterations
Return code 2: successive function values within tolerance limit
Log-Likelihood: -4691.256
3 free parameters
Estimates:
      Estimate Std. error t value  Pr(> t)
sigma  2.52613    0.03989  63.326  < 2e-16 ***
beta1 -0.65207    0.20827  -3.131  0.00174 **
beta2  0.37613    0.01663  22.612  < 2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
--------------------------------------------
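The "NaNs produced" warnings most likely appear because the optimiser tries negative values of sigma along the way, and dnorm() returns NaN when the standard deviation is negative. A minimal sketch of one common workaround is to estimate log(sigma) instead, so sigma is always positive. This uses simulated stand-in data (not the turnout data) and base R optim() instead of maxLik, so all names below are assumptions made for illustration.

```r
# Simulated stand-in data (hypothetical; not the turnout data set)
set.seed(1)
n <- 500
x <- cbind(1, runif(n, 0, 20))
y <- as.vector(x %*% c(-0.5, 0.4) + rnorm(n, sd = 2.5))

# Same normal log-likelihood, but the first parameter is log(sigma),
# so sigma = exp(param[1]) is positive for any real value the optimiser tries.
ols_lf_log <- function(param) {
  sigma <- exp(param[1])
  beta  <- param[-1]
  mu    <- as.vector(x %*% beta)
  sum(dnorm(y, mu, sigma, log = TRUE))
}

# optim() minimises by default; fnscale = -1 makes it maximise instead.
start <- c(log_sigma = log(sd(y)), beta1 = mean(y), beta2 = 0)
fit <- optim(start, ols_lf_log, method = "BFGS",
             control = list(fnscale = -1))

sigma_hat <- exp(fit$par[1])  # back-transform to the sigma scale
beta_hat  <- fit$par[2:3]
```

The estimates of beta should come out close to coef(lm(y ~ x[, 2])), and sigma_hat is back on the original scale after exponentiating.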
summary(lm(income ~ educate, data = turnout))
Call:
lm(formula = income ~ educate, data = turnout)
Residuals:
    Min      1Q  Median      3Q     Max
-6.2028 -1.7363 -0.4273  1.3150 11.0632
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.65207    0.21016  -3.103  0.00194 **
educate      0.37613    0.01677  22.422  < 2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 2.527 on 1998 degrees of freedom
Multiple R-squared: 0.201, Adjusted R-squared: 0.2006
F-statistic: 502.8 on 1 and 1998 DF, p-value: < 2.2e-16
So this is what we looked at in figure 4.14. The (Intercept) here is beta1 from the previous function, and the educate coefficient is beta2, which is the slope. The residual standard error is what we called sigma before (2.527 here versus 2.526 from maximum likelihood; lm divides the residual sum of squares by n - 2, while maximum likelihood divides by n).
So far, what we are looking into is whether or not education influences income. From our results, more education is associated with higher income: a one-unit increase in educate goes with an increase of about 0.376 in income.
Now we are going to use the same approach to demonstrate figure 4.18, and then we will compare the results.
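A small sketch of why sigma from maximum likelihood (2.52613) and the residual standard error from lm (2.527) are so close but not identical. The data here are simulated stand-ins with made-up coefficients, so the variable names and numbers are assumptions and not the turnout data.

```r
# Simulated stand-in data (hypothetical values, roughly the same scale)
set.seed(2)
n    <- 2000
educ <- runif(n, 0, 19)
inc  <- -0.65 + 0.38 * educ + rnorm(n, sd = 2.5)

fit <- lm(inc ~ educ)
rss <- sum(resid(fit)^2)       # residual sum of squares

sigma_mle <- sqrt(rss / n)        # maximum likelihood divides by n
sigma_lm  <- sqrt(rss / (n - 2))  # lm divides by n minus the number of coefficients

# sigma_lm is exactly what summary(lm) prints as "Residual standard error"
isTRUE(all.equal(sigma_lm, summary(fit)$sigma))

# The two differ only by the factor sqrt(n / (n - 2))
sigma_lm / sigma_mle
```

Applying the same factor to the output above gives 2.52613 * sqrt(2000 / 1998), which is about 2.5274 and rounds to the 2.527 that lm reports.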
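ols.lf2 <- function(param) {
  # param = c(mu, theta1, theta2): constant mean, then intercept and slope for sigma
  mu <- param[1]
  theta <- param[-1]
  y <- as.vector(turnout$income)
  x <- cbind(1, turnout$educate)        # column of 1s for the intercept, then education
  sigma <- x %*% theta                  # standard deviation of income depends on education
  sum(dnorm(y, mu, sigma, log = TRUE))  # normal log-likelihood
}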
library(maxLik)
mle_ols2 <- maxLik(logLik = ols.lf2, start = c(mu = 1, theta1 = 1, theta2 = 1))
Warning: NaNs produced (warning repeated)
summary(mle_ols2)
--------------------------------------------
Maximum Likelihood estimation
Newton-Raphson maximisation, 9 iterations
Return code 2: successive function values within tolerance limit
Log-Likelihood: -4861.964
3 free parameters
Estimates:
       Estimate Std. error t value Pr(> t)
mu     3.516764   0.070320   50.01  <2e-16 ***
theta1 1.461011   0.106745   13.69  <2e-16 ***
theta2 0.109081   0.009185   11.88  <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
--------------------------------------------
So the function itself is similar, except that the roles are swapped: mu is now a single constant (the average income), and it is the standard deviation, sigma, that depends on education. The coefficients are also now called theta instead of beta. That means theta2 is not about the level of income: a one-unit increase in education goes with a 0.109 increase in the standard deviation of income, so incomes are more spread out among people with more education. The first model says that average income rises with education; this one says that the variability of income rises with education. The numbers are different because the two models answer different questions. The first model also fits better (log-likelihood of -4691.256 versus -4861.964, with three parameters each).
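One rough way to compare the two fits is AIC, which is 2k - 2 * logLik, where k is the number of free parameters. The sketch below just plugs in the log-likelihoods printed in the two summaries above; the function and variable names are made up for illustration.

```r
# Log-likelihoods copied from the two summary() outputs above
loglik_mean_model <- -4691.256  # ols.lf: mean depends on education
loglik_sd_model   <- -4861.964  # ols.lf2: sigma depends on education
k <- 3                          # both models have 3 free parameters

# AIC = 2k - 2 * logLik (smaller is better)
aic <- function(loglik, k) 2 * k - 2 * loglik

aic_mean <- aic(loglik_mean_model, k)
aic_sd   <- aic(loglik_sd_model, k)

aic_sd - aic_mean  # positive, so the first model is preferred
```

Because both models have the same number of parameters, the AIC comparison comes down to the difference in log-likelihood, and the model where education shifts the mean comes out ahead.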
So what would happen if we added age? For one, I think we would get different results for each function, but I’m not sure how much they would really change. We can assume that as people get older, their education and their income both tend to be higher, but there are other factors to consider. Sometimes education doesn’t matter, and someone’s income can be high just because of the number of years they have worked for a company, in which case we can assume the person is older. But maybe the older the person, the higher their education and income.
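A minimal sketch of how the first function could be extended to include age: add a column to x and one more beta. The data are simulated stand-ins (the coefficients 0.35 and 0.02 are made up), and base R optim() is used instead of maxLik, so this only illustrates the mechanics and is not a result for the turnout data.

```r
# Simulated stand-in data (hypothetical; not the turnout data set)
set.seed(3)
n       <- 1000
educate <- runif(n, 0, 19)
age     <- runif(n, 18, 90)
income  <- 0.5 + 0.35 * educate + 0.02 * age + rnorm(n, sd = 2.5)

x <- cbind(1, educate, age)  # one extra column for age

ols_lf_age <- function(param) {
  sigma <- param[1]
  if (sigma <= 0) return(-Inf)  # reject invalid sigma instead of producing NaNs
  beta <- param[-1]             # now three betas: intercept, educate, age
  mu   <- as.vector(x %*% beta)
  sum(dnorm(income, mu, sigma, log = TRUE))
}

# optim() minimises by default; fnscale = -1 makes it maximise instead.
start   <- c(sigma = sd(income), beta1 = mean(income), beta2 = 0, beta3 = 0)
fit_age <- optim(start, ols_lf_age, method = "BFGS",
                 control = list(fnscale = -1, maxit = 500))

fit_age$par                       # sigma, then the three betas
coef(lm(income ~ educate + age))  # should be close to the betas above
```

With the real data the same change would be x <- cbind(1, turnout$educate, turnout$age) plus a fourth starting value, and the educate coefficient would then be read as the effect of education holding age fixed.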