Main Questions:
Do children do better economically than their parents?
What does economic mobility look like in different areas (commuting zones)?
What are the average characteristics of commuting zones with high opportunity?
Main Findings:
This analysis produced five main findings on the characteristics of commuting zones across the United States:
Regression Equation: prob_q1_q5 = b0 + b1(urban) + e (Note: all variables should have “i” subscripts)
The coefficient of -3.3276 for b1 in the regression equation listed above indicates that living in an urban commuting zone is associated with a 3.3276 percentage point lower probability that a child reaches the top fifth of the income distribution when their parents are in the bottom fifth.
Ho: b1 = 0
Ha: b1 != 0
Based on the p-value for this test, the coefficient on urban is statistically significant at the one percent level of significance.
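The test above can be checked by hand from the printed regression output. The sketch below is only an illustration: the estimate, standard error, and degrees of freedom are copied from the summary table, and the variable names (b1_hat, se_b1, t_crit) are assumptions introduced for this example. Because the inputs are rounded, the recomputed t statistic should be close to, but not exactly, the reported -9.792.

```r
# Values copied from the printed output of lm(prob_q1_q5 ~ urban)
b1_hat <- -3.3276   # estimated coefficient on urban
se_b1  <- 0.3398    # standard error of that estimate
df     <- 707       # residual degrees of freedom

# t statistic for Ho: b1 = 0
t_stat <- b1_hat / se_b1

# Two-sided p-value from the t distribution
p_value <- 2 * pt(-abs(t_stat), df = df)

# Two-sided critical value at the 1% level of significance
t_crit <- qt(1 - 0.01 / 2, df = df)

cat("t statistic:", round(t_stat, 3), "\n")
cat("p-value:", format(p_value, digits = 3), "\n")
cat("reject Ho at the 1% level:", abs(t_stat) > t_crit, "\n")
```

Since the absolute t statistic far exceeds the critical value, the conclusion matches the one stated above.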
##
## Call:
## lm(formula = prob_q1_q5 ~ urban, data = Data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -9.0762 -2.8728 -0.5563 2.0083 24.4279
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 11.2864 0.2301 49.056 <2e-16 ***
## urban -3.3276 0.3398 -9.792 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 4.508 on 707 degrees of freedom
## Multiple R-squared: 0.1194, Adjusted R-squared: 0.1182
## F-statistic: 95.89 on 1 and 707 DF, p-value: < 2.2e-16
urban: The coefficient of -2.41917 for b1 in the regression equation listed above indicates that, holding cz_race_black constant, living in an urban commuting zone is associated with a 2.41917 percentage point lower probability that a child reaches the top fifth of the income distribution when their parents are in the bottom fifth. (Note: This variable is significant at the 1% level of significance, p = 3.36e-15)
cz_race_black: The coefficient of -0.18543 for b2 in the regression equation listed above indicates that, holding urban status constant, a one percentage point increase in the Black share of the commuting zone population is associated with a 0.18543 percentage point lower probability that a child reaches the top fifth of the income distribution when their parents are in the bottom fifth. (Note: This variable is significant at the 1% level of significance, p < 2e-16)
Based on the regression equation from question 2, it seems that the previous model did suffer from omitted variable bias: once cz_race_black is added, the coefficient on urban shrinks in magnitude from -3.3276 to -2.41917. It is likely that Black residents make up a larger share of the population in urban zones, so the urban and cz_race_black variables are likely correlated. Because cz_race_black also has a negative coefficient, leaving it out pushed the urban coefficient downward, which indicates that our coefficient in the previous model was biased and inconsistent.
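The size and direction of the bias can be illustrated with the printed coefficients alone. The sketch below relies on the standard OLS identity that, when both regressions use the same sample (both have 709 observations here), the short-regression coefficient equals the long-regression coefficient plus the omitted variable's coefficient times delta, the slope from regressing cz_race_black on urban. The names b1_short, b1_long, b2_long, bias, and delta are assumptions for this example, and delta is an implied value backed out of the identity, not something estimated directly from the data.

```r
# Coefficients copied from the two printed regression outputs
b1_short <- -3.3276    # urban, model without cz_race_black
b1_long  <- -2.41917   # urban, model with cz_race_black
b2_long  <- -0.18543   # cz_race_black

# Change in the urban coefficient caused by omitting cz_race_black
bias <- b1_short - b1_long

# Identity: b1_short = b1_long + b2_long * delta, so delta = bias / b2_long.
# With a 0/1 urban indicator, delta is the urban minus non-urban gap in
# average cz_race_black (implied here, not estimated from the data).
delta <- bias / b2_long

cat("bias in the short regression:", round(bias, 5), "\n")
cat("implied urban gap in cz_race_black:", round(delta, 3), "\n")
```

A negative bias together with a positive implied delta is consistent with the reasoning above: urban zones have a higher Black population share, and that share has a negative coefficient.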
##
## Call:
## lm(formula = prob_q1_q5 ~ urban + cz_race_black, data = Data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -9.931 -2.390 -0.550 1.565 23.390
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 12.36969 0.21144 58.501 < 2e-16 ***
## urban -2.41917 0.30030 -8.056 3.36e-15 ***
## cz_race_black -0.18543 0.01207 -15.358 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.906 on 706 degrees of freedom
## Multiple R-squared: 0.34, Adjusted R-squared: 0.3381
## F-statistic: 181.8 on 2 and 706 DF, p-value: < 2.2e-16
Urban Commuting Zones Regression Equation: prob_q1_q5i = (b0 + b1) + (b2 + b3)taxrate + b4cz_race_black + b5cz_labforce + ei (with “i” subscripts)
Non-Urban Commuting Zones Regression Equation: prob_q1_q5i = b0 + b2taxrate + b4cz_race_black + b5cz_labforce + ei (with “i” subscripts)
Effect of a commuting zone’s tax rate on intergenerational mobility for urban zones: b2 + b3
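Calculated effect of a commuting zone’s tax rate on intergenerational mobility for urban zones: (1.04607 - 0.57833 = 0.46774) For an urban commuting zone, a one percentage point increase in the tax rate is associated with a 0.46774 percentage point increase in the probability that a child reaches the top fifth of the income distribution when their parents are in the bottom fifth. Note that the urban:taxrate interaction is not statistically significant at conventional levels (p = 0.154), so we cannot reject that the tax rate effect is the same in urban and non-urban zones.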
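As a quick check of the arithmetic, the urban and non-urban intercepts and tax rate slopes can be computed from the estimates as printed in the interaction regression output. This is a minimal sketch: the names b0 through b3 follow the equations above, while the derived variable names are assumptions for this example. It reports point estimates only, since a standard error for b2 + b3 would also need the covariance between the two estimates, which the output does not show.

```r
# Estimates copied from the printed output of the interaction model
b0 <- 4.67902    # (Intercept)
b1 <- -1.31460   # urban
b2 <- 1.04607    # taxrate
b3 <- -0.57833   # urban:taxrate

# Non-urban zones (urban = 0): intercept b0, tax rate slope b2
intercept_nonurban <- b0
slope_nonurban     <- b2

# Urban zones (urban = 1): intercept b0 + b1, tax rate slope b2 + b3
intercept_urban <- b0 + b1
slope_urban     <- b2 + b3

cat("non-urban tax rate slope:", slope_nonurban, "\n")
cat("urban tax rate slope:", slope_urban, "\n")
cat("urban intercept:", intercept_urban, "\n")
```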
##
## Call:
## lm(formula = prob_q1_q5 ~ urban + taxrate + urban * taxrate +
## cz_race_black + cz_labforce, data = Data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -12.060 -2.127 -0.491 1.397 23.083
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 4.67902 1.61843 2.891 0.00396 **
## urban -1.31460 0.94317 -1.394 0.16382
## taxrate 1.04607 0.17442 5.997 3.21e-09 ***
## cz_race_black -0.16009 0.01222 -13.099 < 2e-16 ***
## cz_labforce 0.08371 0.02723 3.074 0.00219 **
## urban:taxrate -0.57833 0.40528 -1.427 0.15403
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.758 on 702 degrees of freedom
## (1 observation deleted due to missingness)
## Multiple R-squared: 0.3925, Adjusted R-squared: 0.3881
## F-statistic: 90.7 on 5 and 702 DF, p-value: < 2.2e-16
The marginal effect of a commuting zone’s tax rate on intergenerational mobility from the regression equation is b2 + 2b3taxrate (with “i” subscripts), so it depends on the level of the tax rate.
The positive coefficient for b2 (on taxrate) and the negative coefficient for b3 (on taxrate2, the squared tax rate) tell us that the relationship is concave: at low average tax rates, a higher tax rate is associated with a higher probability of upward mobility, but that association weakens as the average tax rate rises.
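The shape described above can be made concrete by evaluating the marginal effect at a few tax rates and finding where it reaches zero. The sketch below uses the estimates printed for the model with cz_race_black and cz_labforce. The function name marginal_effect, the variable turning_point, and the example tax rates (1, 3, and 5) are hypothetical choices for illustration; the document does not state the units or the observed range of taxrate, so whether any commuting zone lies near the turning point is not something this example can show.

```r
# Estimates copied from the printed output of the quadratic model
b2 <- 2.05415    # taxrate
b3 <- -0.16775   # taxrate2 (squared tax rate)

# Marginal effect of taxrate: derivative of b2*taxrate + b3*taxrate^2
marginal_effect <- function(taxrate) b2 + 2 * b3 * taxrate

# Tax rate at which the marginal effect equals zero
turning_point <- -b2 / (2 * b3)

# Hypothetical tax rates, chosen only to illustrate the declining effect
example_rates <- c(1, 3, 5)
effects <- marginal_effect(example_rates)

print(data.frame(taxrate = example_rates, marginal_effect = effects))
cat("turning point:", round(turning_point, 3), "\n")
```

Below the turning point the estimated marginal effect is positive and shrinking; above it, the fitted quadratic implies a negative marginal effect.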
##
## Call:
## lm(formula = prob_q1_q5 ~ urban + taxrate + taxrate2 + cz_race_black +
## cz_labforce, data = Data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -8.2071 -2.1301 -0.5284 1.4890 22.8121
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3.84972 1.66454 2.313 0.02102 *
## urban -2.65621 0.31377 -8.465 < 2e-16 ***
## taxrate 2.05415 0.47852 4.293 2.01e-05 ***
## taxrate2 -0.16775 0.06843 -2.451 0.01447 *
## cz_race_black -0.15866 0.01221 -12.998 < 2e-16 ***
## cz_labforce 0.07685 0.02727 2.818 0.00496 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.748 on 702 degrees of freedom
## (1 observation deleted due to missingness)
## Multiple R-squared: 0.3959, Adjusted R-squared: 0.3916
## F-statistic: 92 on 5 and 702 DF, p-value: < 2.2e-16
##
## Call:
## lm(formula = prob_q1_q5 ~ urban + taxrate + taxrate2, data = Data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -8.599 -2.779 -0.540 2.085 23.329
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 5.3621 0.8065 6.649 5.94e-11 ***
## urban -3.0897 0.3266 -9.461 < 2e-16 ***
## taxrate 3.2560 0.5293 6.152 1.29e-09 ***
## taxrate2 -0.2700 0.0771 -3.502 0.000492 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 4.257 on 704 degrees of freedom
## (1 observation deleted due to missingness)
## Multiple R-squared: 0.2181, Adjusted R-squared: 0.2148
## F-statistic: 65.46 on 3 and 704 DF, p-value: < 2.2e-16