Based on Chetty et al. (2014)

A Brief Summary

Main Questions:

Do children do better economically than their parents?

What does economic mobility look like in different areas (commuting zones)?

What are the average characteristics of commuting zones with high opportunity?

Main Findings:

The study reports five main findings on the characteristics of commuting zones (CZs) across the United States:

  1. Racial Segregation - Areas with larger black populations have lower rates of upward mobility.
  2. Schools funded by property taxes - Areas with higher local tax rates, which are predominantly used to finance public schools, have higher rates of mobility.
  3. Inequality - CZs with larger Gini coefficients have less upward mobility.
  4. Strength of Community Ties - High upward mobility areas tend to have higher fractions of religious individuals and greater participation in local civic organizations.
  5. Family Structure - Children of married parents have higher rates of upward mobility if they live in communities with fewer single parents. (This is an association between two-parent households in the community and upward mobility.)

Regression Q2

Regression Equation: prob_q1_q5 = b0 + b1(urban) + e (Note: all variables should have “i” subscripts)

The coefficient of -3.3276 for b1 in the regression equation listed above indicates that, for children whose parents are in the bottom fifth of the income distribution, the probability of reaching the top fifth is on average 3.3276 percentage points lower in urban commuting zones than in non-urban ones. This is an association, not a causal effect.

H0: b1 = 0

Ha: b1 != 0

The p-value for this test is below 2e-16, so we reject H0: the coefficient on urban is statistically significant at the one percent level of significance.

## 
## Call:
## lm(formula = prob_q1_q5 ~ urban, data = Data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -9.0762 -2.8728 -0.5563  2.0083 24.4279 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  11.2864     0.2301  49.056   <2e-16 ***
## urban        -3.3276     0.3398  -9.792   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.508 on 707 degrees of freedom
## Multiple R-squared:  0.1194, Adjusted R-squared:  0.1182 
## F-statistic: 95.89 on 1 and 707 DF,  p-value: < 2.2e-16
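The test above can be reproduced by hand from the printed estimates. The following is a minimal sketch that uses only the rounded numbers shown in the output, so the t statistic may differ from R's in the last digit; the variable names are illustrative and not part of the original analysis.

```r
# Estimates copied from the summary() output above (rounded as printed)
b0  <- 11.2864   # intercept: average prob_q1_q5 in non-urban CZs
b1  <- -3.3276   # coefficient on urban
se1 <- 0.3398    # standard error of b1
df  <- 707       # residual degrees of freedom

# t statistic for H0: b1 = 0
t_stat <- b1 / se1

# Two-sided p-value from the t distribution
p_value <- 2 * pt(-abs(t_stat), df = df)

# Critical value for a two-sided test at the 1% level
t_crit <- qt(1 - 0.01 / 2, df = df)
reject_at_1pct <- abs(t_stat) > t_crit

# With a single dummy regressor, the fitted values are the two group averages
pred_nonurban <- b0
pred_urban    <- b0 + b1
```

Because urban is the only regressor, `pred_urban` and `pred_nonurban` are simply the average values of prob_q1_q5 in the two groups of commuting zones.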

Regression Q3

urban: The coefficient of -2.41917 for b1 in the regression prob_q1_q5 = b0 + b1(urban) + b2(cz_race_black) + e indicates that, holding the black share of the population constant, the probability that a child reaches the top fifth of the income distribution given parents in the bottom fifth is on average 2.41917 percentage points lower in urban commuting zones. (Note: This variable is significant at the 1% level of significance.)

cz_race_black: The coefficient of -0.18543 for b2 in the same regression indicates that, holding urban status constant, a one percentage point increase in the black share of the commuting zone population is associated with a 0.18543 percentage point decrease in the probability that a child reaches the top fifth of the income distribution given parents in the bottom fifth. (Note: This variable is significant at the 1% level of significance.)

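Comparing this regression with the one from question 2 suggests that the previous model suffered from omitted variable bias. Once cz_race_black is included, the coefficient on urban shrinks in magnitude from -3.3276 to -2.41917. Omitted variable bias requires that the omitted variable both affects the outcome and is correlated with an included regressor. Here cz_race_black has a significant negative coefficient, and black residents are likely more concentrated in urban zones, so urban and cz_race_black are likely positively correlated. Together these imply a negative bias, which matches the more negative urban coefficient in question 2. The coefficient in the previous model was therefore biased and inconsistent.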

## 
## Call:
## lm(formula = prob_q1_q5 ~ urban + cz_race_black, data = Data)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -9.931 -2.390 -0.550  1.565 23.390 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   12.36969    0.21144  58.501  < 2e-16 ***
## urban         -2.41917    0.30030  -8.056 3.36e-15 ***
## cz_race_black -0.18543    0.01207 -15.358  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.906 on 706 degrees of freedom
## Multiple R-squared:   0.34,  Adjusted R-squared:  0.3381 
## F-statistic: 181.8 on 2 and 706 DF,  p-value: < 2.2e-16
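The omitted variable bias argument can be made concrete with the standard OLS identity linking the short regression (question 2) and the long regression (question 3). The symbol \hat{\delta}_1, the slope from an auxiliary regression of cz_race_black on urban, does not appear in the output above. The value below is only what the two sets of printed estimates imply, assuming both regressions use the same 709 commuting zones (consistent with the reported degrees of freedom).

```latex
\hat{b}_1^{\text{short}} = \hat{b}_1^{\text{long}} + \hat{b}_2\,\hat{\delta}_1
\quad\Longrightarrow\quad
\hat{\delta}_1
  = \frac{\hat{b}_1^{\text{short}} - \hat{b}_1^{\text{long}}}{\hat{b}_2}
  = \frac{-3.3276 - (-2.41917)}{-0.18543}
  \approx 4.90
```

Read this way, urban commuting zones have on average a value of cz_race_black about 4.9 units higher than non-urban ones. Multiplying that gap by the negative coefficient on cz_race_black gives the bias of roughly -0.91 that was absorbed by the urban coefficient in question 2.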

Interaction Term + Regression

Urban Commuting Zones Regression Equation: prob_q1_q5i = (b0 + b1) + (b2 + b3)taxrate + b4cz_race_black + b5cz_labforce + ei (with “i” subscripts)

Non-Urban Commuting Zones Regression Equation: prob_q1_q5i = b0 + b2taxrate + b4cz_race_black + b5cz_labforce + ei (with “i” subscripts)

Effect of a commuting zone’s tax rate on intergenerational mobility for urban zones: b2 + b3

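Calculated effect of a commuting zone’s tax rate on intergenerational mobility for urban zones: 1.04607 - 0.57833 = 0.46774. For an urban commuting zone, a one percentage point increase in the tax rate is associated with a 0.46774 percentage point increase in the probability that a child reaches the top fifth of the income distribution given parents in the bottom fifth. Note that the interaction coefficient b3 is not statistically significant (p = 0.154), so we cannot reject the hypothesis that the tax rate has the same association with mobility in urban and non-urban zones.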

## 
## Call:
## lm(formula = prob_q1_q5 ~ urban + taxrate + urban * taxrate + 
##     cz_race_black + cz_labforce, data = Data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -12.060  -2.127  -0.491   1.397  23.083 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)    4.67902    1.61843   2.891  0.00396 ** 
## urban         -1.31460    0.94317  -1.394  0.16382    
## taxrate        1.04607    0.17442   5.997 3.21e-09 ***
## cz_race_black -0.16009    0.01222 -13.099  < 2e-16 ***
## cz_labforce    0.08371    0.02723   3.074  0.00219 ** 
## urban:taxrate -0.57833    0.40528  -1.427  0.15403    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.758 on 702 degrees of freedom
##   (1 observation deleted due to missingness)
## Multiple R-squared:  0.3925, Adjusted R-squared:  0.3881 
## F-statistic:  90.7 on 5 and 702 DF,  p-value: < 2.2e-16
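The group-specific intercepts and slopes implied by the interaction model can be computed directly from the coefficients as printed in the output. This is a minimal sketch with illustrative variable names; it relies only on the rounded estimates shown above.

```r
# Estimates copied from the interaction regression output (rounded as printed)
b_intercept     <- 4.67902    # b0
b_urban         <- -1.31460   # b1: intercept shift for urban CZs
b_taxrate       <- 1.04607    # b2: taxrate slope for non-urban CZs
b_urban_taxrate <- -0.57833   # b3: additional taxrate slope for urban CZs
se_interaction  <- 0.40528    # standard error of b3

# Slope on taxrate in each group
slope_nonurban <- b_taxrate
slope_urban    <- b_taxrate + b_urban_taxrate

# Intercept in each group (before adding the b4 and b5 terms)
intercept_nonurban <- b_intercept
intercept_urban    <- b_intercept + b_urban

# t statistic for H0: b3 = 0, i.e. equal taxrate slopes in both groups
t_interaction <- b_urban_taxrate / se_interaction
```

The standard error of the urban slope b2 + b3 cannot be recovered from the printed table because it depends on the covariance between the two estimates. With the fitted model object at hand (call it `fit`, a hypothetical name), that covariance would be available from `vcov(fit)`.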

Scatterplot From Simplified Q4 Regression


Squared Term Regression

The marginal effect of a commuting zone’s tax rate on intergenerational mobility from the regression equation is b2 + 2b3taxrate (with “i” subscripts), so it depends on the level of the tax rate.

The positive coefficient b2 on taxrate and the negative coefficient b3 on taxrate2 (the squared tax rate) tell us that, at low tax rates, a higher tax rate is associated with a higher probability of upward mobility, but the association weakens as the tax rate rises: mobility increases at a decreasing rate.

## 
## Call:
## lm(formula = prob_q1_q5 ~ urban + taxrate + taxrate2 + cz_race_black + 
##     cz_labforce, data = Data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -8.2071 -2.1301 -0.5284  1.4890 22.8121 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)    3.84972    1.66454   2.313  0.02102 *  
## urban         -2.65621    0.31377  -8.465  < 2e-16 ***
## taxrate        2.05415    0.47852   4.293 2.01e-05 ***
## taxrate2      -0.16775    0.06843  -2.451  0.01447 *  
## cz_race_black -0.15866    0.01221 -12.998  < 2e-16 ***
## cz_labforce    0.07685    0.02727   2.818  0.00496 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.748 on 702 degrees of freedom
##   (1 observation deleted due to missingness)
## Multiple R-squared:  0.3959, Adjusted R-squared:  0.3916 
## F-statistic:    92 on 5 and 702 DF,  p-value: < 2.2e-16
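The marginal effect b2 + 2b3taxrate can be evaluated at specific tax rates using the coefficients printed above. The sketch below assumes taxrate is measured in the same units as in the regression. The function name and the example tax rates are illustrative, and the output alone does not show whether the turning point lies inside the observed range of taxrate.

```r
# Estimates copied from the squared-term regression output (rounded as printed)
b_taxrate  <- 2.05415    # b2: coefficient on taxrate
b_taxrate2 <- -0.16775   # b3: coefficient on taxrate squared

# Marginal effect of taxrate on prob_q1_q5 at a given tax rate
marginal_effect <- function(taxrate) {
  b_taxrate + 2 * b_taxrate2 * taxrate
}

# Marginal effects at two illustrative tax rates
me_at_1 <- marginal_effect(1)
me_at_3 <- marginal_effect(3)

# Tax rate at which the marginal effect is zero (peak of the fitted parabola)
turning_point <- -b_taxrate / (2 * b_taxrate2)
```

The marginal effect falls from about 1.72 at a tax rate of 1 to about 1.05 at a tax rate of 3, and would reach zero at a tax rate of roughly 6.12.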

Simplified Regression Q8 Scatterplot

## 
## Call:
## lm(formula = prob_q1_q5 ~ urban + taxrate + taxrate2, data = Data)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -8.599 -2.779 -0.540  2.085 23.329 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   5.3621     0.8065   6.649 5.94e-11 ***
## urban        -3.0897     0.3266  -9.461  < 2e-16 ***
## taxrate       3.2560     0.5293   6.152 1.29e-09 ***
## taxrate2     -0.2700     0.0771  -3.502 0.000492 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.257 on 704 degrees of freedom
##   (1 observation deleted due to missingness)
## Multiple R-squared:  0.2181, Adjusted R-squared:  0.2148 
## F-statistic: 65.46 on 3 and 704 DF,  p-value: < 2.2e-16
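The fitted curves that such a scatterplot would overlay can be written out from the simplified regression. This is a sketch built only from the printed coefficients; the function name `fitted_mobility` is hypothetical and the original plotting code is not shown in this report.

```r
# Estimates copied from the simplified regression output (rounded as printed)
b_intercept <- 5.3621
b_urban     <- -3.0897
b_taxrate   <- 3.2560
b_taxrate2  <- -0.2700

# Fitted prob_q1_q5 for a given tax rate and urban indicator (0 or 1)
fitted_mobility <- function(taxrate, urban) {
  b_intercept + b_urban * urban + b_taxrate * taxrate + b_taxrate2 * taxrate^2
}

# Example: fitted values at a tax rate of 2 for each type of commuting zone
fit_nonurban_2 <- fitted_mobility(2, urban = 0)
fit_urban_2    <- fitted_mobility(2, urban = 1)

# The two curves are parallel: they differ by b_urban at every tax rate
gap <- fit_urban_2 - fit_nonurban_2
```

With the data frame `Data` loaded, a call such as `plot(Data$taxrate, Data$prob_q1_q5)` followed by `curve(fitted_mobility(x, urban = 0), add = TRUE)` and `curve(fitted_mobility(x, urban = 1), add = TRUE)` would draw the two fitted parabolas over the scatterplot.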