RETURNS TO EDUCATION

Can this be considered a causal effect? To find out, ask what the relationship is between the variables that you control for and the variables that you do not control for, which end up in the error term u.

Are there other things that can affect wage? Examples are how much your parents make, or the fact that people who go to college tend to be more determined. You want to control for as much as possible. Ask: is there any other factor that I did not control for that may affect my y? Anything you ignore ends up in the error term, and if it is related to the variables you do control for, you will not be able to get a ceteris paribus effect.

Education and fertility

Let kids denote the number of children ever born to a woman, and let educ denote years of education for the woman. A simple model relating fertility to years of education is kids=β0+β1educ+u where u is the unobserved error. What kinds of factors might be contained in u? Are these likely to be correlated with level of education? The factors contained in u include some that are correlated with the level of education and some that are not.

An example of a factor correlated with education is the education of the woman’s parents. An example of an uncorrelated factor would be whether the women in the sample are able to have children.

Based on your answer in part (a), will a simple regression analysis uncover the ceteris paribus effect of education on fertility? Explain. A simple regression analysis would not uncover the ceteris paribus effect of education on a woman’s fertility. The reason is that some factors left in u, such as the education of the woman’s parents, are correlated with educ, so the estimate of β1 also picks up their effect and is biased. To get closer to a causal effect, the model would need to control for those correlated factors explicitly.
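The argument above can be illustrated with a small simulation. This is only a sketch under made-up assumptions: the variable parent_educ, the sample size, and every coefficient below are hypothetical and are not estimates from any real data.

```r
# Hypothetical simulation: every number below is made up for illustration.
set.seed(1)
n <- 10000

parent_educ <- rnorm(n)                                  # omitted factor (would sit in u)
educ <- 12 + 2 * parent_educ + rnorm(n)                  # educ is correlated with it
kids <- 4 - 0.10 * educ - 0.50 * parent_educ + rnorm(n)  # assumed true beta1 = -0.10

# Simple regression: parent_educ is left in the error term.
b1_simple <- unname(coef(lm(kids ~ educ))["educ"])

# Regression that controls for the omitted factor.
b1_control <- unname(coef(lm(kids ~ educ + parent_educ))["educ"])

print(round(c(simple = b1_simple, controlled = b1_control), 3))
```

Under these assumptions the simple-regression slope should land near -0.30 rather than -0.10, because it absorbs the effect of parent_educ through its correlation with educ, while the controlled regression should recover a value close to -0.10.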

Most important, the estimator needs to be unbiased.

The following table contains the ACT scores and the GPA (grade point average) for eight college students. Grade point average is based on a four-point scale and has been rounded to one digit after the decimal.

StudentID = c(001, 002, 003, 004, 005, 006, 007, 008)
GPA = c(2.8, 3.4, 3.0, 3.5, 3.6, 3.0, 2.7, 3.7)
ACT = c(21, 24, 26, 27, 29, 25, 25, 30)

Estimate the relationship between GPA and ACT using OLS; that is, obtain the intercept and slope estimates in the equation,

GPA=β0+β1ACT+u

Comment on the direction of the relationship. Does the intercept have a useful interpretation here? Explain. How much higher is the GPA predicted to be if the ACT score is increased by five points? The relationship is positive: the fitted line slopes upward. The intercept is 0.5681 and the slope is 0.1022. The intercept says that predicted GPA is 0.5681 when ACT is 0. This is not useful information because no ACT score in this data set is anywhere near 0 (the scores range from 21 to 30). The slope is the parameter that carries the useful information. If ACT increases by 5 points, predicted GPA increases by 0.1022 * 5 = 0.511. The intercept drops out of this difference, so it should not be added; 0.5681 + (0.1022 * 5) = 1.079 is instead the predicted GPA level at ACT = 5.
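As a cross-check on the numbers quoted above, the slope and intercept can be computed directly from the textbook formulas, slope = Cov(ACT, GPA) / Var(ACT) and intercept = mean(GPA) - slope * mean(ACT). The names b0, b1, and delta5 are illustrative and not part of the original script.

```r
# Data for the eight students (same values as in the table).
GPA <- c(2.8, 3.4, 3.0, 3.5, 3.6, 3.0, 2.7, 3.7)
ACT <- c(21, 24, 26, 27, 29, 25, 25, 30)

# OLS slope and intercept from the simple-regression formulas.
b1 <- cov(ACT, GPA) / var(ACT)
b0 <- mean(GPA) - b1 * mean(ACT)

# Predicted change in GPA for a 5-point increase in ACT: only the slope matters.
delta5 <- 5 * b1

print(round(c(intercept = b0, slope = b1, change_for_5_points = delta5), 4))
```

The values should agree with the lm() estimates (intercept about 0.5681, slope about 0.1022), and the 5-point change is about 0.511.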

library(huxtable)
## Warning: package 'huxtable' was built under R version 4.2.3
modelo <- lm(GPA ~ ACT)
summary(modelo)
## 
## Call:
## lm(formula = GPA ~ ACT)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.42308 -0.14863  0.06703  0.10742  0.37912 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)  
## (Intercept)  0.56813    0.92842   0.612   0.5630  
## ACT          0.10220    0.03569   2.863   0.0287 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.2692 on 6 degrees of freedom
## Multiple R-squared:  0.5774, Adjusted R-squared:  0.507 
## F-statistic: 8.199 on 1 and 6 DF,  p-value: 0.02868

Intercept (0.56). The mean GPA is estimated at 0.56 (1.00) when the ACT score is zero; it is the part of predicted GPA that does not depend on the ACT score. The coefficient is not statistically significant (p-value = 0.563); that is, we cannot reject the hypothesis that the constant equals zero at the 5% significance level.

ACT (0.10). A one-point increase in the ACT score, holding other factors fixed, is estimated to raise GPA by 0.10 points. P-value = 0.0287 < 0.05; the variable is statistically significant.

The R2 coefficient is 0.5774, indicating that variation in the ACT score explains about 57.7% of the variation in GPA around its mean.

The overall significance statistic (F) is 8.199 with p-value 0.029 < 0.05. We have enough empirical evidence to reject H0. The slope is statistically significant; with a single regressor, this F test is equivalent to the t test on ACT.

prediccion = 0.10220*5
print(prediccion)
## [1] 0.511

A 5-point increase in ACT is estimated to raise GPA by 0.51 points on average, holding other factors fixed.

GPA1 <- 0.56813+0.10220*5
print(GPA1)
## [1] 1.07913

The predicted GPA for an individual with an ACT score of 5 is about 1.08 points.

GPA2 <- 0.56813+0.10220*(5+5)
print(GPA2)
## [1] 1.59013

The predicted GPA for an individual with an ACT score of 10 is about 1.59 points.

dif <- GPA2-GPA1
print(dif)
## [1] 0.511

A 5-point increase in ACT raises predicted GPA by 0.511, whatever the starting ACT score.

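Compute the fitted values and residuals for each observation, and verify that the residuals (approximately) sum to zero. The fitted values and residuals are computed below. The mean of the residuals is zero up to rounding error, and the Residuals vs Fitted plot shows the residuals scattered around zero.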

fitted <- modelo$fitted.values
head(fitted, n = 8)
##        1        2        3        4        5        6        7        8 
## 2.714286 3.020879 3.225275 3.327473 3.531868 3.123077 3.123077 3.634066
resids <- modelo$residuals 
head(resids, n = 6)
##           1           2           3           4           5           6 
##  0.08571429  0.37912088 -0.22527473  0.17252747  0.06813187 -0.12307692
x <- lm(fitted ~ resids)
par(mfrow = c(2,2))
plot(modelo)

mean(resids)
## [1] -3.469447e-18

The mean of the residuals is practically zero; the tiny nonzero value is floating-point rounding error.

t_test_result <- t.test(resids, mu = 0)
t_test_result
## 
##  One Sample t-test
## 
## data:  resids
## t = -3.9377e-17, df = 7, p-value = 1
## alternative hypothesis: true mean is not equal to 0
## 95 percent confidence interval:
##  -0.2083414  0.2083414
## sample estimates:
##     mean of x 
## -3.469447e-18

The test statistic is approximately zero and the p-value is 1 > 0.05, so we fail to reject the null hypothesis that the mean of the residuals is zero.

The 95% confidence interval is [-0.208, 0.208], which contains 0 (H0: mu = 0). We therefore do not reject the null hypothesis that the mean of the errors is zero.
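A more direct check is also possible, because OLS with an intercept forces the residuals to sum to zero by construction, and also to be uncorrelated in sample with the regressor. The sketch below refits the model so that it is self-contained; the names fit, res, sum_res, and sum_res_x are illustrative.

```r
# Refit the model so this block runs on its own.
GPA <- c(2.8, 3.4, 3.0, 3.5, 3.6, 3.0, 2.7, 3.7)
ACT <- c(21, 24, 26, 27, 29, 25, 25, 30)
fit <- lm(GPA ~ ACT)

res <- resid(fit)            # all eight residuals
sum_res <- sum(res)          # first-order condition for the intercept
sum_res_x <- sum(ACT * res)  # first-order condition for the slope

# Both sums should be zero up to floating-point rounding.
print(isTRUE(all.equal(0, sum_res)))
print(isTRUE(all.equal(0, sum_res_x)))
```

Any deviation from zero here reflects machine precision rather than a property of the data, which is why a hypothesis test is not strictly needed for this step.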

What is the predicted value of GPA when ACT = 20?

prediccion2 <- 0.56813+0.10220*(20)
print(prediccion2)
## [1] 2.61213

The predicted GPA for an individual with an ACT score of 20 is 2.61 (3.00).
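The same prediction can be obtained from the fitted model object instead of retyping rounded coefficients. This is a minimal sketch using predict() with a newdata data frame; the names dat, fit, and pred20 are illustrative.

```r
# Keep the data in a data frame so predict() can match the ACT column by name.
dat <- data.frame(
  GPA = c(2.8, 3.4, 3.0, 3.5, 3.6, 3.0, 2.7, 3.7),
  ACT = c(21, 24, 26, 27, 29, 25, 25, 30)
)
fit <- lm(GPA ~ ACT, data = dat)

# Predicted GPA at ACT = 20, using the unrounded coefficients.
pred20 <- unname(predict(fit, newdata = data.frame(ACT = 20)))
print(round(pred20, 3))
```

The result should match the hand calculation to about three decimal places; any small gap comes from the coefficients having been rounded to five digits in the manual version.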

How much of the variation in GPA for these eight students is explained by ACT?

The R2 coefficient is 0.5774, indicating that variation in the ACT score explains about 57.7% of the variation in GPA around its mean.
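The R2 value can be reproduced from its definition, R2 = 1 - SSR/SST, where SSR is the sum of squared residuals and SST is the total sum of squares of GPA around its mean. The names fit, sst, ssr, and r2 in this sketch are illustrative.

```r
# Refit the model so this block runs on its own.
GPA <- c(2.8, 3.4, 3.0, 3.5, 3.6, 3.0, 2.7, 3.7)
ACT <- c(21, 24, 26, 27, 29, 25, 25, 30)
fit <- lm(GPA ~ ACT)

sst <- sum((GPA - mean(GPA))^2)  # total variation in GPA
ssr <- sum(resid(fit)^2)         # variation left unexplained by ACT
r2 <- 1 - ssr / sst              # share of variation explained

print(round(r2, 4))
```

The value should coincide with the Multiple R-squared reported by summary(), about 0.5774.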