Main Analyses
bivar.model <- lm(Unemployment.Rate ~ minwagegap)
summary(bivar.model)
##
## Call:
## lm(formula = Unemployment.Rate ~ minwagegap)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.18557 -0.51693 -0.04903 0.61086 1.70040
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 4.43427 0.50797 8.729 1.78e-11 ***
## minwagegap -0.04833 0.04154 -1.163 0.25
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.7902 on 48 degrees of freedom
## Multiple R-squared: 0.02742, Adjusted R-squared: 0.007161
## F-statistic: 1.353 on 1 and 48 DF, p-value: 0.2504
plot(Unemployment.Rate ~ minwagegap,
main = "Bivariate Model: Unemployment Rate ~ Minimum Wage Gap",
xlab = "Minimum Wage Gap $",
ylab = "Unemployment Rate %",
col = "blue")
abline(bivar.model, col = "red")
The first bivariate regression examines whether states with larger minimum wage gaps have different unemployment rates. The scatterplot shows a slight negative trend: the fitted line slopes downward, so unemployment tends to fall slightly as the wage gap increases. The coefficient for the minimum wage gap is -0.048, meaning that each additional dollar in the wage gap is associated with a predicted decrease of about 0.05 percentage points in the unemployment rate. However, with a p-value of 0.25, the coefficient is not statistically significant at the 0.05 level, so the data do not provide evidence of a relationship between the wage gap and unemployment when the wage gap is examined alone. The R-squared value of 0.027 indicates that only about 2.7% of the variation in state unemployment rates is explained by the minimum wage gap, suggesting that other factors play a much larger role in determining unemployment levels.
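The t value, p-value, and a 95% confidence interval for the wage gap slope can be reproduced by hand from the printed estimate and standard error. The sketch below hard-codes the rounded values from the summary output above, so results may differ from R's own in the last digit; the variable names are illustrative and not part of the original analysis.

```r
# Values copied (rounded) from the summary(bivar.model) output
estimate  <- -0.04833   # slope for minwagegap
std_error <- 0.04154    # standard error of the slope
df_resid  <- 48         # residual degrees of freedom

# t statistic and two-sided p-value for H0: slope = 0
t_value <- estimate / std_error
p_value <- 2 * pt(-abs(t_value), df = df_resid)

# 95% confidence interval for the slope
t_crit  <- qt(0.975, df = df_resid)
ci_low  <- estimate - t_crit * std_error
ci_high <- estimate + t_crit * std_error

cat("t =", round(t_value, 3), " p =", round(p_value, 3), "\n")
cat("95% CI: [", round(ci_low, 3), ",", round(ci_high, 3), "]\n")
```

The interval contains zero, which is another way of seeing that the slope is not statistically significant at the 0.05 level.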
plot(fitted(bivar.model), residuals(bivar.model),
main = "Bivariate Residual Plot: Unemployment ~ Minimum Wage Gap",
xlab = "Fitted Values",
ylab = "Residuals",
col = "steelblue")
abline(h = 0, col = "red", lty = 2)
The residual plot for this first bivariate model helps assess whether the linear regression assumptions are met. The plot shows the residuals, that is, the differences between observed and predicted unemployment rates, scattered around the horizontal line at zero. Ideally, residuals should be randomly distributed around zero without a clear pattern. Here the residuals are relatively evenly distributed above and below the zero line across the range of fitted values, suggesting that a linear model is appropriate for the data. However, a few points have larger residuals, above 1 and below -1, indicating that the model does not fit well for all states. Overall, the residual plot gives no clear sign that the assumptions of linear regression are violated, though the weak relationship indicated by the low R-squared remains.
bivar.model2 <- lm(Unemployment.Rate ~ Average.Household.Income)
summary(bivar.model2)
##
## Call:
## lm(formula = Unemployment.Rate ~ Average.Household.Income)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.75165 -0.52045 0.02928 0.52543 1.79361
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.039e+00 8.748e-01 2.331 0.0240 *
## Average.Household.Income 2.936e-05 1.401e-05 2.095 0.0414 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.767 on 48 degrees of freedom
## Multiple R-squared: 0.0838, Adjusted R-squared: 0.06472
## F-statistic: 4.391 on 1 and 48 DF, p-value: 0.04144
plot(Unemployment.Rate ~ Average.Household.Income,
main = "Bivariate Model: Unemployment Rate ~ Average Household Income",
xlab = "Average Household Income $",
ylab = "Unemployment Rate %",
col = "blue")
abline(bivar.model2, col = "red")
The second bivariate regression examines the relationship between state unemployment rates and average household income. The scatterplot reveals a slight positive trend, with the regression line suggesting that states with higher average incomes tend to have slightly higher unemployment rates. The coefficient for average household income is 0.00002936, meaning that for every additional $10,000 in average household income, unemployment is predicted to increase by approximately 0.29 percentage points. With a p-value of 0.041, this relationship is statistically significant at the 0.05 level, allowing the rejection of the null hypothesis that there is no relationship between income and unemployment. The R-squared value of 0.084 indicates that about 8.4% of the variation in state unemployment rates can be explained by average household income alone. While most of the variation in unemployment rates remains unaccounted for, this relationship is notably stronger than the one found for the minimum wage gap, and the statistical significance suggests that income has some value in predicting state unemployment levels.
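The per-$10,000 interpretation is a simple rescaling of the coefficient. In a regression with a single predictor, the F statistic and R-squared also follow directly from the slope's t value. The sketch below checks these relationships using the rounded numbers printed in the summary output; the variable names are illustrative.

```r
# Values copied (rounded) from the summary(bivar.model2) output
coef_income <- 2.936e-05  # slope for Average.Household.Income (per $1)
t_value     <- 2.095      # t statistic of the slope
df_resid    <- 48         # residual degrees of freedom

# Rescale the slope to a $10,000 change in income
effect_per_10k <- coef_income * 10000

# With one predictor, F equals t squared and
# R-squared equals t^2 / (t^2 + residual df)
f_stat    <- t_value^2
r_squared <- t_value^2 / (t_value^2 + df_resid)

cat("Effect per $10,000:", round(effect_per_10k, 3), "percentage points\n")
cat("F =", round(f_stat, 3), " R-squared =", round(r_squared, 4), "\n")
```

Small differences from the printed F statistic of 4.391 come from rounding in the t value.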
plot(fitted(bivar.model2), residuals(bivar.model2),
main = "Bivariate Residual Plot: Unemployment ~ Household Income",
xlab = "Fitted Values",
ylab = "Residuals",
col = "steelblue")
abline(h = 0, col = "red", lty = 2)
The residual plot for this second bivariate model shows residuals scattered around the zero line across the range of fitted values. As in the first residual plot, the points appear relatively random with no strong patterns, giving no clear sign that the assumptions of linear regression are violated. A few states have larger residuals, above 1.5 and below -1.5, indicating that the model does not predict unemployment well for all states: some states have unemployment rates considerably higher or lower than expected given their income levels. Overall, the residual plot supports the appropriateness of using a linear model for this relationship. However, it also highlights that other unmeasured factors beyond income are important for explaining the variation.
multivar.model <- lm(Unemployment.Rate ~ minwagegap*Average.Household.Income)
summary(multivar.model)
##
## Call:
## lm(formula = Unemployment.Rate ~ minwagegap * Average.Household.Income)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.59338 -0.46966 -0.02287 0.41757 1.80671
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -5.123e+00 5.159e+00 -0.993 0.3259
## minwagegap 6.163e-01 4.308e-01 1.431 0.1593
## Average.Household.Income 1.540e-04 8.372e-05 1.840 0.0722 .
## minwagegap:Average.Household.Income -1.074e-05 7.028e-06 -1.529 0.1332
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.7568 on 46 degrees of freedom
## Multiple R-squared: 0.1452, Adjusted R-squared: 0.08948
## F-statistic: 2.605 on 3 and 46 DF, p-value: 0.06307
The final model examines whether the relationship between the minimum wage gap and unemployment depends on a state’s average household income by including an interaction term that multiplies the minimum wage gap by average household income. Because of the interaction, each main-effect coefficient describes the slope of that variable when the other variable equals zero. The coefficient for the minimum wage gap is 0.616, and the coefficient for average household income is 0.000154, or 1.54 percentage points for every $10,000 increase in income when the wage gap is zero. The interaction coefficient is -0.0000107, meaning the slope on the wage gap falls by about 0.107 for each additional $10,000 in average household income, so the estimated relationship between the wage gap and unemployment becomes less positive as state income rises. However, none of these individual coefficients is statistically significant at the 0.05 level; average household income comes closest, with a p-value of 0.072.
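In a model with an interaction, the slope on the wage gap is not a single number: it equals the wage gap coefficient plus the interaction coefficient times income. The sketch below evaluates that conditional slope from the rounded coefficients in the output above. The three income levels are hypothetical values chosen only for illustration, not figures taken from the data, and `gap_slope` is an illustrative helper.

```r
# Coefficients copied (rounded) from the summary(multivar.model) output
b_gap         <- 6.163e-01   # minwagegap
b_interaction <- -1.074e-05  # minwagegap:Average.Household.Income

# Conditional slope of unemployment on the wage gap at a given income
gap_slope <- function(income) b_gap + b_interaction * income

# Hypothetical income levels, for illustration only
incomes <- c(50000, 60000, 70000)
slopes  <- gap_slope(incomes)
print(data.frame(income = incomes, gap_slope = round(slopes, 3)))

# Income at which the estimated wage gap slope crosses zero
zero_income <- -b_gap / b_interaction
cat("Estimated slope is zero at an income of about", round(zero_income), "\n")
```

Because neither the wage gap coefficient nor the interaction coefficient is statistically significant, these conditional slopes and the crossover point are descriptive only and carry considerable uncertainty.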
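Despite the lack of individual statistical significance, the R-squared value of 0.145 indicates that this model explains about 14.5% of the variation in state unemployment rates, more than either bivariate model alone (2.7% and 8.4%). This comparison should be read with caution, because R-squared cannot decrease when predictors are added. The adjusted R-squared of 0.089 is only modestly higher than the 0.065 of the income-only model, and the overall F-test (p = 0.063) does not reach significance at the 0.05 level. Considering both variables together, along with their interaction, may provide a somewhat more complete picture of state unemployment patterns, but the evidence for this is weak.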
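The adjusted R-squared and overall F-test reported in the output can be recovered from R-squared, the number of predictors, and the residual degrees of freedom. This is a minimal sketch using the rounded values printed above, with illustrative variable names.

```r
# Values copied (rounded) from the summary(multivar.model) output
r_squared <- 0.1452
n_pred    <- 3    # two main effects plus the interaction
df_resid  <- 46   # residual degrees of freedom
n_obs     <- df_resid + n_pred + 1  # 50 observations

# Adjusted R-squared penalizes each additional predictor
adj_r_squared <- 1 - (1 - r_squared) * (n_obs - 1) / df_resid

# Overall F statistic and its p-value
f_stat  <- (r_squared / n_pred) / ((1 - r_squared) / df_resid)
p_value <- pf(f_stat, n_pred, df_resid, lower.tail = FALSE)

cat("Adjusted R-squared =", round(adj_r_squared, 4), "\n")
cat("F =", round(f_stat, 3), " p =", round(p_value, 4), "\n")
```

The penalty for the two extra terms removes much of the apparent gain in explained variation over the income-only model.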
plot(fitted(multivar.model), residuals(multivar.model),
main = "Multivariate Residual Plot:\nUnemployment ~ Minimum Wage Gap x Household Income",
xlab = "Fitted Values",
ylab = "Residuals",
col = "steelblue")
abline(h = 0, col = "red", lty = 2)
The residual plot shows residuals distributed randomly around zero with no apparent patterns, supporting the appropriateness of the linear model with the interaction term. However, several observations still have residuals above 1.5 and below -1.5, indicating that some states’ unemployment rates are not well predicted even by this more complex model.