1. Data Overview
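I converted the count columns listed below to rates per 100,000 people by dividing each count by the state's population for that year and multiplying by 100,000.
Each converted column keeps its original name with a `_per_100k` suffix: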
- Overdose_Total: Overdose_Total_per_100k
- Employer: Employer_per_100k
- Non_Group: Non_Group_per_100k
- Medicaid: Medicaid_per_100k
- Medicare: Medicare_per_100k
- Military: Military_per_100k
- Uninsured: Uninsured_per_100k
- Individual_Homeless: Individual_Homeless_per_100k
- Family_Homeless: Family_Homeless_per_100k
- Total_Homeless: Total_Homeless_per_100k
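As a quick check on the conversion, here is a minimal sketch that recomputes three of the per-100k values for Alabama 2008 from the raw counts in the table below. The helper name `per_100k` is my own for illustration; the report itself does not show the code that was used.

```python
def per_100k(count: float, population: float) -> float:
    """Convert a raw count to a rate per 100,000 residents."""
    return count / population * 100_000


# Alabama 2008 values copied from the data table in this section.
alabama_2008 = {
    "Population": 4718206,
    "Overdose_Total": 185,
    "Employer": 2346200,
    "Total_Homeless": 5387,
}

# Build the *_per_100k columns for every count column (everything but Population).
rates = {
    f"{name}_per_100k": round(per_100k(value, alabama_2008["Population"]), 2)
    for name, value in alabama_2008.items()
    if name != "Population"
}

for column, value in rates.items():
    print(f"{column}: {value}")
```

The results should agree with the table's 3.92, 49726.53, and 114.17 for that row.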
| State | Year | Population | Overdose_Male | Overdose_Female | Overdose_Total | Median_Income | LFPR | Employer | Non_Group | Medicaid | Medicare | Military | Uninsured | Unemployment_Rate | HS_Grad_Rate | Individual_Homeless | Family_Homeless | Total_Homeless | Poverty_Rate | Overdose_Total_per_100k | Employer_per_100k | Non_Group_per_100k | Medicaid_per_100k | Medicare_per_100k | Military_per_100k | Uninsured_per_100k | Individual_Homeless_per_100k | Family_Homeless_per_100k | Total_Homeless_per_100k |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Alabama | 2008 | 4718206 | 107 | 78 | 185 | 47244 | 57.2 | 2346200 | 245000.0 | 646300.0 | 578200 | 78700.00 | 632500.0 | 5.7 | 80.80 | 4086 | 1301 | 5387 | 14.3 | 3.92 | 49726.53 | 5192.65 | 13698.00 | 12254.66 | 1668.01 | 13405.52 | 86.60 | 27.57 | 114.17 |
| Alabama | 2009 | 4757938 | 118 | 88 | 206 | 46905 | 53.1 | 2230800 | 226000.0 | 808200.0 | 576300 | 93300.00 | 634500.0 | 11.0 | 80.80 | 4686 | 1394 | 6080 | 16.6 | 4.33 | 46885.86 | 4749.96 | 16986.35 | 12112.39 | 1960.93 | 13335.61 | 98.49 | 29.30 | 127.79 |
| Alabama | 2010 | 4785437 | 117 | 70 | 187 | 51999 | 53.2 | 2195700 | 232900.0 | 842600.0 | 599200 | 94100.00 | 686800.0 | 10.5 | 81.40 | 4553 | 4553 | 4553 | 17.2 | 3.91 | 45882.96 | 4866.85 | 17607.59 | 12521.32 | 1966.38 | 14351.88 | 95.14 | 95.14 | 95.14 |
| Alabama | 2011 | 4799069 | 101 | 75 | 176 | 51248 | 53.4 | 2191200 | 248700.0 | 857600.0 | 609800 | 107500.00 | 661100.0 | 9.6 | 81.90 | 4325 | 1233 | 5558 | 15.4 | 3.67 | 45658.86 | 5182.26 | 17870.13 | 12706.63 | 2240.02 | 13775.59 | 90.12 | 25.69 | 115.81 |
| Alabama | 2012 | 4815588 | 105 | 60 | 165 | 47382 | 53.5 | 2210500 | 233300.0 | 903600.0 | 625800 | 98500.00 | 624300.0 | 8.0 | 82.60 | 3825 | 1384 | 5209 | 16.2 | 3.43 | 45903.01 | 4844.68 | 18764.06 | 12995.30 | 2045.44 | 12964.15 | 79.43 | 28.74 | 108.17 |
| Alabama | 2013 | 4830081 | 104 | 62 | 166 | 47885 | 53.6 | 2143300 | 263600.0 | 893600.0 | 667700 | 94600.00 | 642900.0 | 7.2 | 83.10 | 3387 | 1302 | 4689 | 16.7 | 3.44 | 44374.00 | 5457.47 | 18500.72 | 13823.78 | 1958.56 | 13310.34 | 70.12 | 26.96 | 97.08 |
| Alabama | 2014 | 4841799 | 160 | 110 | 270 | 48812 | 53.3 | 2191400 | 268700.0 | 915500.0 | 683100 | 90700.00 | 564800.0 | 6.8 | 83.70 | 3115 | 1446 | 4561 | 17.8 | 5.58 | 45260.04 | 5549.59 | 18908.26 | 14108.39 | 1873.27 | 11665.09 | 64.34 | 29.86 | 94.20 |
| Alabama | 2015 | 4852347 | 180 | 102 | 282 | 50988 | 53.5 | 2205700 | 289700.0 | 948700.0 | 708200 | 97500.00 | 480300.0 | 6.1 | 84.30 | 2868 | 1102 | 3970 | 16.3 | 5.81 | 45456.35 | 5970.31 | 19551.36 | 14595.00 | 2009.34 | 9898.30 | 59.11 | 22.71 | 81.82 |
| Alabama | 2016 | 4863525 | 218 | 125 | 343 | 52633 | 53.9 | 2216700 | 318500.0 | 952800.0 | 712600 | 94400.00 | 435200.0 | 5.8 | 84.80 | 3019 | 1092 | 4111 | 16.2 | 7.05 | 45578.05 | 6548.75 | 19590.73 | 14651.92 | 1940.98 | 8948.24 | 62.07 | 22.45 | 84.53 |
| Alabama | 2017 | 4874486 | 277 | 145 | 422 | 49989 | 54.6 | 2179900 | 292200.0 | 977900.0 | 745800 | 93100.00 | 452600.0 | 4.4 | 85.30 | 2985 | 808 | 3793 | 15.0 | 8.66 | 44720.61 | 5994.48 | 20061.60 | 15300.07 | 1909.94 | 9285.08 | 61.24 | 16.58 | 77.81 |
| Alabama | 2018 | 4887681 | 248 | 133 | 381 | 51798 | 55.4 | 2232000 | 257400.0 | 928300.0 | 754900 | 96600.00 | 483400.0 | 3.9 | 85.04 | 2570 | 864 | 3434 | 16.0 | 7.80 | 45665.83 | 5266.30 | 18992.65 | 15444.95 | 1976.40 | 9890.17 | 52.58 | 17.68 | 70.26 |
| Alabama | 2019 | 4903185 | 262 | 152 | 414 | 64010 | 57.5 | 2250900 | 263400.0 | 929500.0 | 763800 | 99000.00 | 460400.0 | 3.2 | 87.10 | 2519 | 742 | 3261 | 12.9 | 8.44 | 45906.90 | 5372.02 | 18957.07 | 15577.63 | 2019.10 | 9389.81 | 51.37 | 15.13 | 66.51 |
| Alabama | 2020 | 5031864 | 421 | 190 | 611 | 61650 | 56.8 | 2224986 | 266185.7 | 897814.3 | 687200 | 95771.43 | 547842.9 | 6.4 | 88.00 | 2497 | 854 | 3351 | 14.9 | 12.14 | 44217.92 | 5290.00 | 17842.58 | 13656.97 | 1903.30 | 10887.47 | 49.62 | 16.97 | 66.60 |
| Alabama | 2021 | 5050380 | 667 | 314 | 981 | 63750 | 56.7 | 2281400 | 293700.0 | 941200.0 | 791900 | 102900.00 | 489600.0 | 3.4 | 87.90 | 2064 | 492 | 2556 | 15.9 | 19.42 | 45172.84 | 5815.40 | 18636.22 | 15680.01 | 2037.47 | 9694.32 | 40.87 | 9.74 | 50.61 |
| Alabama | 2022 | 5073903 | 751 | 346 | 1097 | 62290 | 56.7 | 2274100 | 293500.0 | 1023600.0 | 803500 | 99900.00 | 421400.0 | 2.5 | 88.80 | 2482 | 1270 | 3752 | 13.6 | 21.62 | 44819.54 | 5784.50 | 20173.82 | 15835.94 | 1968.90 | 8305.24 | 48.92 | 25.03 | 73.95 |
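One thing worth flagging in the rows above: Individual_Homeless + Family_Homeless equals Total_Homeless in every Alabama row except 2010, where all three columns are 4553. The sketch below shows one way to flag such rows. The tuple layout is an assumption made for illustration and only the first five years are copied in.

```python
# (Year, Individual_Homeless, Family_Homeless, Total_Homeless)
# Values copied from the Alabama rows of the table above.
rows = [
    (2008, 4086, 1301, 5387),
    (2009, 4686, 1394, 6080),
    (2010, 4553, 4553, 4553),
    (2011, 4325, 1233, 5558),
    (2012, 3825, 1384, 5209),
]

# A row is consistent when the two components add up to the total.
mismatched_years = [
    year for year, individual, family, total in rows
    if individual + family != total
]

print(mismatched_years)  # [2010]
```

Running the same check over all states would show whether the 2010 pattern is a one-off or a wider issue in the source data.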
Summary Statistics
##
## Summary Statistics
## ==========================================================================================
## Statistic                     N    Mean           St. Dev.       Min          Max
## ------------------------------------------------------------------------------------------
## Population                    765  6,274,779.00   7,068,697.00   546,043      39,512,223
## Overdose_Male                 765  543.64         710.98         7.50         5,736.00
## Overdose_Female               765  255.30         279.12         2.50         1,710.00
## Overdose_Total                765  798.95         984.62         10           7,347
## Median_Income                 765  64,820.71      13,486.62      37,714       112,500
## LFPR                          765  62.04          4.46           49.30        73.20
## Employer                      765  3,006,814.00   3,246,341.00   272,900.00   18,538,700.00
## Non_Group                     765  362,617.00     464,168.70     18,000.00    2,946,300.00
## Medicaid                      765  1,147,663.00   1,477,773.00   50,800.00    10,466,200.00
## Medicare                      765  778,675.90     807,229.50     38,500.00    4,577,200.00
## Military                      765  88,832.19      102,552.40     3,500.00     509,900.00
## Uninsured                     765  707,730.90     1,052,865.00   18,600.00    6,781,300.00
## Unemployment_Rate             765  5.84           2.27           2.00         13.70
## HS_Grad_Rate                  765  88.38          3.45           78.90        95.00
## Individual_Homeless           765  7,256.67       15,361.04      247          145,983
## Family_Homeless               765  4,135.99       7,878.52       75           96,940
## Total_Homeless                765  10,890.98      20,591.95      327          171,521
## Poverty_Rate                  765  12.65          3.46           3.70         23.10
## Overdose_Total_per_100k       765  13.53          10.21          0.63         70.19
## Employer_per_100k             765  48,623.23      5,131.97       33,146.74    61,880.66
## Non_Group_per_100k            765  5,707.48       1,476.35       2,116.46     10,829.41
## Medicaid_per_100k             765  17,634.96      4,634.79       6,508.07     33,496.30
## Medicare_per_100k             765  12,794.70      2,267.51       5,546.92     18,441.57
## Military_per_100k             765  1,656.23       1,036.19       341.99       6,587.14
## Uninsured_per_100k            765  10,344.59      4,350.53       2,342.92     23,422.56
## Individual_Homeless_per_100k  765  109.04         92.12          18.24        725.22
## Family_Homeless_per_100k      765  66.97          75.11          2.98         680.50
## Total_Homeless_per_100k       765  168.13         150.42         21.22        1,217.53
## ------------------------------------------------------------------------------------------
2. EDA
Total Deaths
## Total Deaths (2008-2022): 611193
## Total Deaths per 100k people (2008-2022): 12.73
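The 12.73 figure appears to be a pooled rate: total deaths divided by total person-years (the sum of Population over all 765 state-year rows), times 100,000. That is different from the mean of the state-year rates (13.53 in the summary table), which weights every state equally regardless of size. A rough sketch of the arithmetic, using the rounded mean population from the summary table as a stand-in for the exact population sum (an approximation on my part):

```python
# Figures copied from this report.
total_deaths = 611_193          # Total Deaths (2008-2022)
n_state_years = 765             # N in the summary table
mean_population = 6_274_779.00  # Mean of Population in the summary table

# Sum of Population over all rows, recovered as N * mean (approximate,
# because the reported mean is rounded).
person_years = n_state_years * mean_population

# Pooled rate: deaths per 100,000 person-years.
pooled_rate = total_deaths / person_years * 100_000

print(round(pooled_rate, 2))  # 12.73
```

Because large states dominate the pooled rate, a gap between 12.73 and 13.53 would be expected if smaller states tend to have higher rates.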
Overdose Deaths per 100k by State
Year over Year Trend
Year over Year State Trends
3. Modeling
Correlation for all years
VIF
VIF Results
| Variable | VIF |
|---|---|
| Employer_per_100k | 17.45 |
| Medicaid_per_100k | 15.01 |
| Uninsured_per_100k | 12.60 |
| Medicare_per_100k | 5.75 |
| Poverty_Rate | 5.05 |
| HS_Grad_Rate | 3.57 |
| LFPR | 3.49 |
| Military_per_100k | 2.76 |
| Median_Income | 2.65 |
| Non_Group_per_100k | 2.65 |
| Unemployment_Rate | 2.45 |
| Total_Homeless_per_100k | 1.58 |
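To help read the table: a VIF is defined as 1 / (1 - R²), where R² comes from regressing that variable on all the other predictors, so each VIF can be turned back into the share of the variable's variance explained by the rest. A common rule of thumb treats values above 5 or 10 as a warning sign. One plausible reason the insurance columns score so high is that the six coverage categories add up to nearly the whole population (about 95,945 per 100k for Alabama 2008), so each one is close to a linear combination of the others. The sketch below is illustrative arithmetic only; the function name `implied_r2` is my own.

```python
def implied_r2(vif: float) -> float:
    """R-squared of a predictor regressed on the others, recovered from its VIF."""
    return 1.0 - 1.0 / vif


# The four largest VIFs from the table above.
vif_table = {
    "Employer_per_100k": 17.45,
    "Medicaid_per_100k": 15.01,
    "Uninsured_per_100k": 12.60,
    "Medicare_per_100k": 5.75,
}

for name, vif in vif_table.items():
    print(f"{name}: VIF={vif:.2f}, implied R2={implied_r2(vif):.3f}")

# Alabama 2008 coverage categories per 100k, copied from the data table:
# Employer, Non_Group, Medicaid, Medicare, Military, Uninsured.
coverage_al_2008 = [49726.53, 5192.65, 13698.00, 12254.66, 1668.01, 13405.52]
coverage_total = sum(coverage_al_2008)
print(f"Coverage categories sum to {coverage_total:.2f} per 100k")
```

If the near-constant sum is the cause, dropping one coverage category as a reference group would be one way to test it.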
Fixed Effects Model
##
## Fixed Effects Model with no scaling
## =============================================
## Dependent variable:
## ---------------------------
## Overdose_Total_per_100k
## ---------------------------------------------
## LFPR 1.03***
## (0.16)
##
## Unemployment_Rate 0.47**
## (0.19)
##
## Medicare_per_100k 0.004***
## (0.0003)
##
## ---------------------------------------------
## Observations 765
## R2 0.41
## Adjusted R2 0.37
## F Statistic 165.77*** (df = 3; 711)
## =============================================
## Note: *p<0.1; **p<0.05; ***p<0.01
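For intuition on what the fixed-effects estimate is doing, here is a minimal sketch of the within (demeaning) estimator with a single regressor. The toy data are made up for illustration, and I am assuming the model above is a one-way state fixed-effects model: the residual degrees of freedom of 711 are consistent with 765 observations, 51 state intercepts (765 rows / 15 years), and 3 slopes.

```python
from collections import defaultdict


def within_slope(rows):
    """Fixed-effects (within) slope for rows of (group, x, y)."""
    # Accumulate per-group sums so group means can be subtracted.
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for group, x, y in rows:
        sums[group][0] += x
        sums[group][1] += y
        sums[group][2] += 1

    sxy = sxx = 0.0
    for group, x, y in rows:
        sum_x, sum_y, n = sums[group]
        dx = x - sum_x / n  # deviation from the group's mean x
        dy = y - sum_y / n  # deviation from the group's mean y
        sxy += dx * dy
        sxx += dx * dx
    return sxy / sxx


# Hypothetical toy panel: both groups follow y = intercept + 2 * x,
# but group "A" has a higher intercept and lower x values.
toy = [
    ("A", 1, 12), ("A", 2, 14), ("A", 3, 16),
    ("B", 4, 8), ("B", 5, 10), ("B", 6, 12),
]

fe_slope = within_slope(toy)

# Residual degrees of freedom implied by the F statistic above
# (51 groups is inferred from 765 rows / 15 years, not stated in the output).
resid_df = 765 - 51 - 3

print(fe_slope, resid_df)
```

In this toy panel a pooled regression that ignores the groups would give a negative slope, while demeaning within each group recovers the common slope of 2.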
Random Effects Model
##
## Random Effects Model with no scaling
## =============================================
## Dependent variable:
## ---------------------------
## Overdose_Total_per_100k
## ---------------------------------------------
## LFPR 0.65***
## (0.14)
##
## Unemployment_Rate 0.14
## (0.18)
##
## Medicare_per_100k 0.004***
## (0.0002)
##
## Constant -72.93***
## (11.08)
##
## ---------------------------------------------
## Observations 765
## R2 0.35
## Adjusted R2 0.35
## F Statistic 414.71***
## =============================================
## Note: *p<0.1; **p<0.05; ***p<0.01
Model Comparison
##
## Comparison of Regression Models with no Logs
## =========================================================
##                             Dependent variable:
##                   ---------------------------------------
##                   Opioid Overdose Deaths per 100k People
##                     Fixed Effects      Random Effects
## ---------------------------------------------------------
## LFPR                  1.03***            0.65***
##                       (0.16)             (0.14)
## Unemployment_Rate     0.47**             0.14
##                       (0.19)             (0.18)
## Medicare_per_100k     0.004***           0.004***
##                       (0.0003)           (0.0002)
## Constant                                 -72.93***
##                                          (11.08)
## ---------------------------------------------------------
## Observations          765                765
## =========================================================
## Note:                        *p<0.1; **p<0.05; ***p<0.01
Hausman test
The fixed-effects model is preferred. The p-value (1.246e-12) is far below 0.05, so the test rejects the null hypothesis that the random-effects and fixed-effects estimates agree. This implies that the state-specific effects are correlated with the explanatory variables, which violates a key assumption of the random-effects model and makes its estimates inconsistent for our data.
##
## Hausman Test
##
## data: Overdose_Total_per_100k ~ LFPR + Unemployment_Rate + Medicare_per_100k
## chisq = 58.473, df = 3, p-value = 1.246e-12
## alternative hypothesis: one model is inconsistent
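The reported p-value can be reproduced from the test statistic alone, since under the null the Hausman statistic follows a chi-square distribution with df equal to the number of compared coefficients (3 here). The sketch below uses the closed-form upper-tail probability for 3 degrees of freedom so that it needs only the standard library; the function name `chi2_sf_df3` is my own.

```python
import math


def chi2_sf_df3(x: float) -> float:
    """Upper-tail probability P(X > x) for a chi-square variable with 3 df.

    Closed form for df = 3: erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2).
    """
    return math.erfc(math.sqrt(x / 2.0)) + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)


chisq = 58.473  # test statistic from the output above
p_value = chi2_sf_df3(chisq)

print(f"p-value = {p_value:.3e}")
```

The result should land very close to the 1.246e-12 shown in the output.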
5. Questions for Dr. MacDonald
Scaling Data
Some variables are counts per 100k people and others are rates (percentages). Do the per-100k variables need to be scaled differently from the rate columns?
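One general property of linear regression that may help frame this question: dividing a predictor by a constant multiplies its coefficient by that constant and leaves the fit unchanged, so the choice of units affects how coefficients read rather than what the model finds. The sketch below checks this on made-up data with a simple one-regressor OLS. The numbers are hypothetical (chosen so the slope comes out at 0.004, like Medicare_per_100k above) and the helper `simple_ols` is my own.

```python
import math


def simple_ols(x, y):
    """Return (slope, r_squared) for a one-regressor least-squares fit."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    slope = sxy / sxx
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, r_squared


# Hypothetical data: a predictor in people per 100k and an outcome rate.
x_per_100k = [12000, 13000, 14000, 15000, 16000]
y_rate = [4.0, 8.5, 11.0, 16.5, 20.0]

# Same predictor expressed as a percentage of the population.
x_percent = [value / 1000 for value in x_per_100k]

slope_100k, r2_100k = simple_ols(x_per_100k, y_rate)
slope_pct, r2_pct = simple_ols(x_percent, y_rate)

print(f"per-100k units: slope={slope_100k:.4f}, R2={r2_100k:.4f}")
print(f"percent units:  slope={slope_pct:.4f}, R2={r2_pct:.4f}")
```

Read this way, a coefficient of 0.004 per person per 100k is the same effect as roughly 4 deaths per 100k for each additional percentage point of the population on Medicare.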
What else should I look into for variable selection? Are correlations and VIF enough?