Introduction

My study examines the cross-sectional relationship between state unemployment rates and the gap between the minimum wage and the living wage in 2025. Using bivariate regressions, I first investigate the association between the minimum wage gap and state unemployment rates, then examine the relationship between state median household income and unemployment rates. Finally, I fit a multiple regression with an interaction term to determine whether the relationship between the minimum wage gap and unemployment varies across states with different income levels.

Data Description

statedata2025 <- read.csv("GovFullData.csv")
attach(statedata2025)
head(statedata2025)
##        State Average.Household.Income Unemployment.Rate MIT.Living.Wage
## 1    Alabama                    53394             3.200           20.50
## 2     Alaska                    66130             4.712           24.11
## 3    Arizona                    63045             4.050           24.42
## 4   Arkansas                    51251             3.688           19.49
## 5 California                    76960             5.388           28.72
## 6   Colorado                    71968             4.650           25.47
##   Minimum.Wage
## 1         7.25
## 2        13.00
## 3        14.70
## 4        11.00
## 5        16.50
## 6        14.81
variable.names(statedata2025)
## [1] "State"                    "Average.Household.Income"
## [3] "Unemployment.Rate"        "MIT.Living.Wage"         
## [5] "Minimum.Wage"

The dataset used in this analysis was compiled from three sources to examine all 50 U.S. states in 2025. The first dataset, obtained from the U.S. Department of Justice’s “Census Bureau Median Family Income By Family Size,” provides state-level average household income data for single-earner households with no children. The second dataset comes from the Bureau of Labor Statistics’ “State unemployment rates over the last 10 years, seasonally adjusted.” For this project, I calculated the mean unemployment rate for each state across all months in 2025 to capture the annual average. The third dataset is MIT’s 2025 “Living Wage Calculator,” which provides both the estimated living wage and the current minimum wage for each state, though some minimum wages have changed since the dataset was last updated in February.
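The annual averaging step described above can be sketched as follows. This is only an illustration of the technique: the data frame `monthly`, its column names, and every value in it are hypothetical placeholders, not figures from the BLS series, and the real file layout may differ.

```r
# Hypothetical long-format monthly data. The state labels and rates below are
# made up for illustration and are NOT taken from the BLS series.
monthly <- data.frame(
  State = rep(c("StateA", "StateB"), each = 3),
  Month = rep(c("Jan", "Feb", "Mar"), times = 2),
  Rate  = c(3.0, 3.2, 3.4, 4.5, 4.7, 4.9)
)

# Average the monthly rates within each state to get one row per state
annual <- aggregate(Rate ~ State, data = monthly, FUN = mean)
names(annual)[2] <- "Unemployment.Rate"
print(annual)
```

The resulting one-row-per-state table could then be joined to the income and living wage tables by state name, for example with `merge()`.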

Each observation in the final dataset corresponds to a U.S. state, for a total of 50 observations; the District of Columbia is excluded. The key variables are: Unemployment Rate, the state’s mean monthly unemployment rate in 2025 (in percent), which serves as the dependent variable; MIT Living Wage, the hourly wage needed to cover basic living expenses for a single adult; Minimum Wage, the state’s legal minimum hourly wage; and Average Household Income, which, despite its name, records the median household income for single-person households in each state.

minwagegap <- `MIT.Living.Wage` - `Minimum.Wage`
head(minwagegap)
## [1] 13.25 11.11  9.72  8.49 12.22 10.66

To measure the gap between what workers need to earn and what they are legally guaranteed, I created a new variable, Minimum Wage Gap, by subtracting the state’s minimum wage from MIT’s estimated living wage. This gap represents the additional dollars per hour needed beyond the minimum wage to meet basic living costs.
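As a small check on this definition, the sketch below recomputes the gap for the six states shown in the `head()` output and stores it as a column of the data frame, which keeps the variable tied to its rows without relying on `attach()`. The name `first6` is introduced here for illustration only.

```r
# First six rows as printed by head(statedata2025)
first6 <- data.frame(
  State = c("Alabama", "Alaska", "Arizona", "Arkansas", "California", "Colorado"),
  MIT.Living.Wage = c(20.50, 24.11, 24.42, 19.49, 28.72, 25.47),
  Minimum.Wage    = c(7.25, 13.00, 14.70, 11.00, 16.50, 14.81)
)

# Gap in dollars per hour: living wage minus legal minimum wage
first6$minwagegap <- first6$MIT.Living.Wage - first6$Minimum.Wage
print(first6[, c("State", "minwagegap")])
```

The values agree with the `head(minwagegap)` output above: Alabama, with the federal minimum of $7.25, has the largest gap of the six at $13.25 per hour.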

Descriptive Analyses

summary(Unemployment.Rate)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   1.850   3.272   3.800   3.858   4.447   5.562
hist(Unemployment.Rate, 
     main="Histogram of Unemployment Rate (2025)", 
     xlab = "Unemployment Rate %", 
     ylab = "Number of States", 
     col = "lightblue")

The unemployment rate across the 50 states in 2025 ranges from 1.85% to 5.56%, with a mean of 3.86% and a median of 3.80%. The close agreement between the mean and median points to a relatively symmetric distribution. The histogram shows that most states cluster in the 3.5% to 4.5% range, and the majority fall between 3% and 5%. Only South Dakota has an unemployment rate below 2.5%, and only Nevada has an unemployment rate above 5.5%. Overall, the distribution appears roughly bell-shaped, with its mass concentrated in the middle.

summary(minwagegap)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   7.120   9.685  11.960  11.929  13.785  17.530
hist(minwagegap, 
     main="Histogram of Minimum Wage Gap (2025)", 
     xlab = "Minimum Wage Gap in Dollars", 
     ylab = "Number of States", 
     breaks = 10, 
     col = "lightgreen")

The minimum wage gap, the difference between MIT’s living wage and the state minimum wage, ranges from $7.12 to $17.53 per hour across states. The mean gap is $11.93 per hour, while the median is $11.96. As with the unemployment rate, the close agreement between the mean and median indicates a relatively symmetric distribution. The histogram reveals that the most common wage gap falls in the $13-14 range, which contains 11 states. However, there is considerable variation across states: some have relatively small gaps of around $7-9 per hour, while others have substantial gaps exceeding $15 per hour. In most states, minimum wage workers would need to earn an additional $10-14 per hour to meet basic living costs.

summary(Average.Household.Income)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   47569   56300   60112   61951   66484   80330
hist(Average.Household.Income, 
     main="Average Annual Income By State (2025)", 
     xlab = "Single Person Average Income", 
     ylab = "Number of States", 
     col = "lightcoral")

Average household income for single-person households varies substantially across states, ranging from $47,569 to $80,330. The mean income is $61,951, while the median is $60,112, suggesting a roughly symmetric distribution with a slight rightward skew. The histogram shows that the largest group, 15 states, has average incomes in the $55,000-60,000 range. Most states cluster between $55,000 and $70,000, though a few states have particularly high incomes above $75,000. At the extremes, Mississippi is the only state below $50,000, and Massachusetts is the only state above $80,000.

Main Analyses

bivar.model <- lm(Unemployment.Rate ~ minwagegap)
summary(bivar.model)
## 
## Call:
## lm(formula = Unemployment.Rate ~ minwagegap)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -2.18557 -0.51693 -0.04903  0.61086  1.70040 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  4.43427    0.50797   8.729 1.78e-11 ***
## minwagegap  -0.04833    0.04154  -1.163     0.25    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.7902 on 48 degrees of freedom
## Multiple R-squared:  0.02742,    Adjusted R-squared:  0.007161 
## F-statistic: 1.353 on 1 and 48 DF,  p-value: 0.2504
plot(Unemployment.Rate ~ minwagegap, 
     main = "Bivariate Model: Unemployment Rate ~ Minimum Wage Gap", 
     xlab = "Minimum Wage Gap $", 
     ylab = "Unemployment Rate %", 
     col = "blue")
abline(bivar.model, col = "red")

The first bivariate regression examines whether states with larger minimum wage gaps experience different unemployment rates. The scatterplot shows a slight negative trend: the regression line suggests that unemployment rates decrease slightly as the wage gap increases. The coefficient for the minimum wage gap is -0.048, meaning that each additional dollar in the wage gap is associated with a predicted decrease in unemployment of approximately 0.05 percentage points. However, with a p-value of 0.25, this estimate is not statistically significant at the 0.05 level, so the data do not support the conclusion that the wage gap and unemployment are related when examined alone. The R-squared value of 0.027 indicates that only about 2.7% of the variation in state unemployment rates is explained by the minimum wage gap, suggesting that other factors play a much larger role in determining unemployment levels.
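The numbers in this paragraph can be reproduced from the printed regression table alone. The sketch below copies the estimates from the `summary(bivar.model)` output (it does not refit the model) and recomputes the t statistic, the two-sided p-value, and the prediction at the mean wage gap of 11.929 reported earlier; the variable names are introduced here for illustration.

```r
# Values copied from the summary(bivar.model) output
b0       <- 4.43427    # intercept
b1       <- -0.04833   # slope on minwagegap
se_b1    <- 0.04154    # standard error of the slope
df_resid <- 48         # residual degrees of freedom (50 states - 2 coefficients)

# t statistic and two-sided p-value for the slope
t_stat <- b1 / se_b1
p_val  <- 2 * pt(-abs(t_stat), df = df_resid)

# Predicted unemployment rate at the mean wage gap
pred_at_mean <- b0 + b1 * 11.929

print(round(c(t = t_stat, p = p_val, pred_at_mean = pred_at_mean), 3))
```

The prediction at the mean gap comes out at about 3.86, the mean unemployment rate, as expected for a least-squares line with an intercept, which always passes through the point of means.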

plot(fitted(bivar.model), residuals(bivar.model),
     main = "Bivariate Residual Plot: Unemployment ~ Minimum Wage Gap",
     xlab = "Fitted Values",
     ylab = "Residuals",
     col = "steelblue")
abline(h = 0, col = "red", lty = 2)

The residual plot for this first bivariate model helps assess whether the linear regression assumptions are met. The plot shows the residuals, that is, the differences between observed and predicted unemployment rates, scattered around the horizontal line at zero. Ideally, residuals should be randomly distributed around zero without a clear pattern. The plot shows that the residuals are relatively evenly distributed above and below the zero line across the range of fitted values, suggesting that the linear model is appropriate for the data. However, there are a few points with larger residuals above 1 and below -1, indicating that the model does not fit perfectly for all states. Overall, the residual plot suggests that the assumptions of linear regression are adequately met, though the weak relationship indicated by the low R-squared remains.
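One point worth keeping in mind when reading a residual plot is that some of its balance is guaranteed by construction. The sketch below fits a toy model to only the six states shown in the `head()` output, so its coefficients will not match `bivar.model`; the names `first6` and `toy.model` are introduced here for illustration. It shows that least-squares residuals sum to zero and are uncorrelated with the fitted values in any model with an intercept.

```r
# First six rows from the head(statedata2025) output; the full data set has 50.
first6 <- data.frame(
  Unemployment.Rate = c(3.200, 4.712, 4.050, 3.688, 5.388, 4.650),
  MIT.Living.Wage   = c(20.50, 24.11, 24.42, 19.49, 28.72, 25.47),
  Minimum.Wage      = c(7.25, 13.00, 14.70, 11.00, 16.50, 14.81)
)
first6$minwagegap <- first6$MIT.Living.Wage - first6$Minimum.Wage

# Toy fit on six states only; coefficients will differ from bivar.model.
toy.model <- lm(Unemployment.Rate ~ minwagegap, data = first6)
res <- residuals(toy.model)
fit <- fitted(toy.model)

# Both quantities are zero up to floating-point error, by construction
resid_sum     <- sum(res)
resid_fit_cor <- cor(res, fit)
print(c(resid_sum = resid_sum, resid_fit_cor = resid_fit_cor))
```

Because these two properties always hold, the informative features of a residual plot are curvature and changes in spread across the fitted values rather than the overall balance around the zero line.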

bivar.model2 <- lm(Unemployment.Rate ~ Average.Household.Income)
summary(bivar.model2)
## 
## Call:
## lm(formula = Unemployment.Rate ~ Average.Household.Income)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.75165 -0.52045  0.02928  0.52543  1.79361 
## 
## Coefficients:
##                           Estimate Std. Error t value Pr(>|t|)  
## (Intercept)              2.039e+00  8.748e-01   2.331   0.0240 *
## Average.Household.Income 2.936e-05  1.401e-05   2.095   0.0414 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.767 on 48 degrees of freedom
## Multiple R-squared:  0.0838, Adjusted R-squared:  0.06472 
## F-statistic: 4.391 on 1 and 48 DF,  p-value: 0.04144
plot(Unemployment.Rate ~ Average.Household.Income, 
     main = "Bivariate Model: Unemployment Rate ~ Average Household Income", 
     xlab = "Average Household Income $", 
     ylab = "Unemployment Rate %", 
     col = "blue")
abline(bivar.model2, col = "red")

The second bivariate regression examines the relationship between state unemployment rates and average household income. The scatterplot reveals a slight positive trend, with the regression line suggesting that states with higher average incomes tend to have slightly higher unemployment rates. The coefficient for average household income is 0.00002936, meaning that for every additional $10,000 in average household income, unemployment is predicted to increase by approximately 0.29 percentage points. With a p-value of 0.041, this relationship is statistically significant at the 0.05 level, allowing the rejection of the null hypothesis that there is no linear relationship between income and unemployment. The R-squared value of 0.084 indicates that about 8.4% of the variation in state unemployment rates can be explained by average household income alone. While much of the variation in unemployment rates remains unaccounted for, this relationship is notably stronger than the one found for the minimum wage gap, and its statistical significance suggests that income is a detectable, if modest, predictor of state unemployment levels.
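The rescaling to $10,000 units, and the uncertainty around it, can be worked out from the printed table. The sketch below copies the slope and standard error from the `summary(bivar.model2)` output and computes a conventional 95% confidence interval; the variable names are introduced here for illustration, and the interval is a standard t-based calculation rather than output from the original analysis.

```r
# Values copied from the summary(bivar.model2) output
b1       <- 2.936e-05   # slope per $1 of income
se_b1    <- 1.401e-05   # standard error of the slope
df_resid <- 48          # residual degrees of freedom

# Slope expressed per $10,000 of income (percentage points of unemployment)
per_10k <- b1 * 1e4

# 95% confidence interval for the slope, also per $10,000
t_crit     <- qt(0.975, df = df_resid)
ci_per_10k <- (b1 + c(-1, 1) * t_crit * se_b1) * 1e4

print(round(c(per_10k = per_10k, lower = ci_per_10k[1], upper = ci_per_10k[2]), 3))
```

The interval excludes zero, which is consistent with the p-value of 0.041, but its lower end sits very close to zero, so the size of the income effect is estimated imprecisely.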

plot(fitted(bivar.model2), residuals(bivar.model2),
     main = "Bivariate Residual Plot: Unemployment ~ Household Income",
     xlab = "Fitted Values",
     ylab = "Residuals",
     col = "steelblue")
abline(h = 0, col = "red", lty = 2)

The residual plot for this second bivariate model shows residuals scattered around the zero line across the range of fitted values. As in the first residual plot, the points appear relatively random with no strong patterns, suggesting that the assumptions of linear regression are satisfied. A few states have larger residuals above 1.5 and below -1.5, indicating that the model does not perfectly predict unemployment across all states: some states have unemployment rates considerably higher or lower than expected given their income levels. Overall, the residual plot supports the appropriateness of using a linear model for this relationship. However, it also highlights that other unmeasured factors beyond income are important for explaining the variation.

multivar.model <- lm(Unemployment.Rate ~ minwagegap*Average.Household.Income)
summary(multivar.model)
## 
## Call:
## lm(formula = Unemployment.Rate ~ minwagegap * Average.Household.Income)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.59338 -0.46966 -0.02287  0.41757  1.80671 
## 
## Coefficients:
##                                       Estimate Std. Error t value Pr(>|t|)  
## (Intercept)                         -5.123e+00  5.159e+00  -0.993   0.3259  
## minwagegap                           6.163e-01  4.308e-01   1.431   0.1593  
## Average.Household.Income             1.540e-04  8.372e-05   1.840   0.0722 .
## minwagegap:Average.Household.Income -1.074e-05  7.028e-06  -1.529   0.1332  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.7568 on 46 degrees of freedom
## Multiple R-squared:  0.1452, Adjusted R-squared:  0.08948 
## F-statistic: 2.605 on 3 and 46 DF,  p-value: 0.06307

The final model examines whether the relationship between the minimum wage gap and unemployment depends on a state’s average household income by including an interaction term that multiplies the minimum wage gap by average household income. Because of the interaction, each main-effect coefficient describes the slope of that variable only when the other variable equals zero, a value that lies outside the observed data. The coefficient for the minimum wage gap is 0.616, and the coefficient for average household income is 0.000154, or 1.54 percentage points for every $10,000 increase. The interaction term coefficient is -0.0000107, so the estimated slope of unemployment on the wage gap falls as state income rises: it is positive in the lowest-income states, reaches zero at an income of roughly $57,400 (0.6163 / 0.00001074), and is negative above that level. However, none of these individual coefficients is statistically significant at the 0.05 level.
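With an interaction in the model, the slope on the wage gap is not a single number but a function of income: slope = b_gap + b_int × income. The sketch below evaluates that expression using the coefficients printed in the `summary(multivar.model)` output and the minimum, mean, and maximum incomes from the descriptive summary; the function name `gap_slope` is introduced here for illustration.

```r
# Coefficients copied from the summary(multivar.model) output
b_gap <- 6.163e-01    # minwagegap
b_int <- -1.074e-05   # minwagegap:Average.Household.Income

# Slope of unemployment on the wage gap at a given income level
gap_slope <- function(income) b_gap + b_int * income

# Minimum, mean, and maximum income from summary(Average.Household.Income)
incomes <- c(min = 47569, mean = 61951, max = 80330)
slopes  <- gap_slope(incomes)

# Income at which the estimated slope changes sign
crossover <- -b_gap / b_int

print(round(slopes, 3))
print(round(crossover))
```

The estimated slope is positive at the lowest income, close to the bivariate estimate of about -0.05 at the mean income, and negative at the highest income. Since the interaction term is not statistically significant, this sign change should be read as a description of the fitted surface rather than as an established pattern.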

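Despite the lack of individual statistical significance, the R-squared value of 0.145 indicates that this model explains about 14.5% of the variation in state unemployment rates, more than either bivariate model (2.7% and 8.4%). Some of that increase is mechanical, since R-squared cannot fall when predictors are added; the adjusted R-squared, which penalizes the extra terms, rises more modestly, from 0.065 in the income-only model to 0.089. The overall F-test (p = 0.063) also falls short of the 0.05 threshold. Considering both variables together, along with their interaction, may therefore provide a somewhat more complete picture of state unemployment patterns, but the evidence for this is weak.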
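Comparisons of R-squared across models with different numbers of predictors can be checked against the printed output. The sketch below recomputes the adjusted R-squared of each model from its reported R-squared, then forms a partial F statistic for the two terms added to the income-only model. The F comparison is an addition for illustration, computed from the rounded R-squared values rather than from the data, so it is approximate; the function name `adj_r2` is hypothetical.

```r
n <- 50  # number of states

# Adjusted R-squared from R-squared, sample size, and number of predictors
adj_r2 <- function(r2, n, p) 1 - (1 - r2) * (n - 1) / (n - p - 1)

# R-squared values copied from the three summary() outputs
adj <- c(
  gap_only    = adj_r2(0.02742, n, 1),
  income_only = adj_r2(0.0838,  n, 1),
  interaction = adj_r2(0.1452,  n, 3)
)
print(round(adj, 4))

# Partial F statistic: do the wage gap and the interaction add explanatory
# power beyond income alone? (q = 2 added terms, 46 residual df in full model)
r2_full    <- 0.1452
r2_reduced <- 0.0838
q          <- 2
df_full    <- 46
f_partial  <- ((r2_full - r2_reduced) / q) / ((1 - r2_full) / df_full)
p_partial  <- pf(f_partial, q, df_full, lower.tail = FALSE)
print(round(c(F = f_partial, p = p_partial), 3))
```

With the fitted model objects available, the same comparison could be run directly as `anova(bivar.model2, multivar.model)`.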

plot(fitted(multivar.model), residuals(multivar.model),
     main = "Multivariate Residual Plot:\nUnemployment ~ Minimum Wage Gap x Household Income",
     xlab = "Fitted Values",
     ylab = "Residuals",
     col = "steelblue")
abline(h = 0, col = "red", lty = 2)

The residual plot shows residuals distributed randomly around zero with no apparent patterns, supporting the appropriateness of the linear model with the interaction term. However, several observations still have residuals above 1.5 and below -1.5, indicating that some states’ unemployment rates are not well predicted even by this more complex model.

Discussion

My analysis examined whether state unemployment rates are related to the gap between minimum wage and living wage, and whether this relationship depends on average household income levels. The bivariate regressions showed that the minimum wage gap alone was not significantly related to unemployment, while average household income showed a weak but significant positive relationship with unemployment. The multiple regression model with the interaction term suggested that the effect of the wage gap on unemployment may differ across state income levels, though the interaction did not reach statistical significance. Overall, the combined model explained 14.5% of the variation in state unemployment rates, more than either variable alone.

These findings suggest that the relationship between wage policy and unemployment is more complex than a simple direct effect. The lack of a significant relationship between the minimum wage gap and unemployment in the bivariate model suggests that a larger gap between the minimum wage and the living wage does not necessarily translate into higher unemployment rates across states. The positive relationship between average income and unemployment may reflect other economic factors, such as industry composition and cost-of-living dynamics, that affect both wage and employment patterns. The interaction model hints that income levels could moderate the relationship between wage gaps and unemployment, but the interaction term is not statistically significant, and with only 50 observations, detecting such subtle effects is challenging.

Future research could extend these findings by examining whether these relationships hold over time using data from multiple years, allowing an examination of how changes in minimum wage policies affect unemployment within the states. Other state-level control variables, such as industry composition, education levels, and cost-of-living adjustments, could help explain a greater share of the variation in unemployment rates. Additionally, analyzing whether the wage gap affects different demographic groups differently could provide more targeted insight into the effects on unemployment rates.