DATA OVERVIEW & PROCESSING.

data <- read.csv(file = "~/Desktop/ECON 505/505_data.csv")
attach(data)
summary(data)
##     state              stateid          year            t           cpi       
##  Length:52          Min.   :1.00   Min.   :2006   Min.   : 6   Min.   :198.3  
##  Class :character   1st Qu.:1.75   1st Qu.:2009   1st Qu.: 9   1st Qu.:211.1  
##  Mode  :character   Median :2.50   Median :2012   Median :12   Median :226.7  
##                     Mean   :2.50   Mean   :2012   Mean   :12   Mean   :224.0  
##                     3rd Qu.:3.25   3rd Qu.:2015   3rd Qu.:15   3rd Qu.:233.9  
##                     Max.   :4.00   Max.   :2018   Max.   :18   Max.   :247.9  
##  mcare_millions  medicaid_spend_actual   overdoses      population      
##  Min.   :26584   Min.   :1.276e+10     Min.   : 944   Min.   :18166990  
##  1st Qu.:36465   1st Qu.:2.175e+10     1st Qu.:1171   1st Qu.:19522574  
##  Median :42798   Median :3.769e+10     Median :1520   Median :22301948  
##  Mean   :45126   Mean   :4.038e+10     Mean   :1684   Mean   :25726531  
##  3rd Qu.:51326   3rd Qu.:5.401e+10     3rd Qu.:1958   3rd Qu.:30476800  
##  Max.   :77603   Max.   :8.553e+10     Max.   :3245   Max.   :39461588  
##     hhincome       state_gdp       unemployment_pct labor_participation_pct
##  Min.   :50468   Min.   : 758264   Min.   : 3.200   Min.   :54.80          
##  1st Qu.:54798   1st Qu.:1112822   1st Qu.: 4.675   1st Qu.:57.17          
##  Median :57898   Median :1333119   Median : 5.850   Median :60.80          
##  Mean   :59032   Mean   :1454560   Mean   : 6.525   Mean   :59.77          
##  3rd Qu.:62981   3rd Qu.:1791195   3rd Qu.: 8.150   3rd Qu.:61.52          
##  Max.   :69065   Max.   :2721651   Max.   :12.200   Max.   :63.40          
##   insured_pct     grad_hs_pct    post_recession   per_capita_state_tax_revenue
##  Min.   :75.26   Min.   :78.70   Min.   :0.0000   Min.   :3235                
##  1st Qu.:82.08   1st Qu.:80.95   1st Qu.:1.0000   1st Qu.:3612                
##  Median :85.10   Median :83.50   Median :1.0000   Median :4440                
##  Mean   :85.12   Mean   :83.34   Mean   :0.8462   Mean   :5152                
##  3rd Qu.:88.60   3rd Qu.:85.53   3rd Qu.:1.0000   3rd Qu.:6413                
##  Max.   :94.60   Max.   :87.60   Max.   :1.0000   Max.   :9498                
##  pct_change_str       medicaidml    tmcaremcaidml    tmcaremcaidmlreal
##  Min.   :-0.08900   Min.   :12763   Min.   : 44107   Min.   : 55133   
##  1st Qu.: 0.00025   1st Qu.:21750   1st Qu.: 65046   1st Qu.: 71734   
##  Median : 0.03150   Median :37693   Median : 81984   Median : 92451   
##  Mean   : 0.02792   Mean   :40379   Mean   : 86417   Mean   : 94691   
##  3rd Qu.: 0.04925   3rd Qu.:54015   3rd Qu.: 95847   3rd Qu.:108307   
##  Max.   : 0.15300   Max.   :85531   Max.   :167579   Max.   :167579   
##  Prescription.Rate   prison_pop         hless_pop     
##  Min.   :35.10     Min.   :   129.1   Min.   : 23122  
##  1st Qu.:48.23     1st Qu.: 59583.0   1st Qu.: 36874  
##  Median :55.70     Median :103041.5   Median : 59309  
##  Mean   :59.26     Mean   :108463.2   Mean   : 69681  
##  3rd Qu.:71.33     3rd Qu.:158074.5   3rd Qu.: 97411  
##  Max.   :87.60     Max.   :173942.0   Max.   :151278

We have to make prison,homeless pop & overdoses into ratios.

df2 <- data.frame(state,stateid,t,overdoses,population,unemployment_pct,labor_participation_pct,
                  insured_pct,pct_change_str,grad_hs_pct,mcare_millions,
                  Prescription.Rate,prison_pop,hless_pop,hhincome)
df2$pris_rate <- (df2$prison_pop / df2$population) * 100
df2$hless_pop_rate <- (df2$hless_pop / df2$population) * 100
df2$overdose_rate <- (df2$overdoses / df2$population) * 100

Now that we have the population rates, we can drop the hless_pop, prison_pop & overdoses columns.

df2 = subset(df2, select = -c(prison_pop,hless_pop, overdoses) )
summary(df2)
##     state              stateid           t        population      
##  Length:52          Min.   :1.00   Min.   : 6   Min.   :18166990  
##  Class :character   1st Qu.:1.75   1st Qu.: 9   1st Qu.:19522574  
##  Mode  :character   Median :2.50   Median :12   Median :22301948  
##                     Mean   :2.50   Mean   :12   Mean   :25726531  
##                     3rd Qu.:3.25   3rd Qu.:15   3rd Qu.:30476800  
##                     Max.   :4.00   Max.   :18   Max.   :39461588  
##  unemployment_pct labor_participation_pct  insured_pct    pct_change_str    
##  Min.   : 3.200   Min.   :54.80           Min.   :75.26   Min.   :-0.08900  
##  1st Qu.: 4.675   1st Qu.:57.17           1st Qu.:82.08   1st Qu.: 0.00025  
##  Median : 5.850   Median :60.80           Median :85.10   Median : 0.03150  
##  Mean   : 6.525   Mean   :59.77           Mean   :85.12   Mean   : 0.02792  
##  3rd Qu.: 8.150   3rd Qu.:61.52           3rd Qu.:88.60   3rd Qu.: 0.04925  
##  Max.   :12.200   Max.   :63.40           Max.   :94.60   Max.   : 0.15300  
##   grad_hs_pct    mcare_millions  Prescription.Rate    hhincome    
##  Min.   :78.70   Min.   :26584   Min.   :35.10     Min.   :50468  
##  1st Qu.:80.95   1st Qu.:36465   1st Qu.:48.23     1st Qu.:54798  
##  Median :83.50   Median :42798   Median :55.70     Median :57898  
##  Mean   :83.34   Mean   :45126   Mean   :59.26     Mean   :59032  
##  3rd Qu.:85.53   3rd Qu.:51326   3rd Qu.:71.33     3rd Qu.:62981  
##  Max.   :87.60   Max.   :77603   Max.   :87.60     Max.   :69065  
##    pris_rate         hless_pop_rate    overdose_rate     
##  Min.   :0.0003301   Min.   :0.08283   Min.   :0.003883  
##  1st Qu.:0.3097544   1st Qu.:0.16470   1st Qu.:0.004844  
##  Median :0.4741612   Median :0.29773   Median :0.005429  
##  Mean   :0.4333532   Mean   :0.26910   Mean   :0.007013  
##  3rd Qu.:0.5547532   3rd Qu.:0.33357   3rd Qu.:0.008082  
##  Max.   :0.6947599   Max.   :0.47053   Max.   :0.016458
attach(df2)

VIZUALIZATIONS

Using Boxplots for checking data outliers. There are not a lot of outliers present, therefore we can use the data to make acurate predictions.

library(ggplot2)
ggplot(data = df2,
       mapping = aes(x = reorder(state, hless_pop_rate), y = overdoses, fill = state)) +
  geom_boxplot() +
  labs(x=NULL, y="overdoses") +
  coord_flip() +
  geom_hline(yintercept = 493, linetype="dashed", color = "red", size = 1) +
  theme_bw()

Using the plotly to investigate overdoses~prescription rates in all 4 states.

library(plotly)
gap4 <- ggplot(data, aes(x = overdoses, y = Prescription.Rate, color = state)) +
  geom_point(aes(shape = state,size = population,frame = year))
ggplotly(gap4)

Using Scatter Plots to Further Investigate relation b/t Variables.

library(gridExtra)
library(cowplot)
sc1 = ggplot(data = df2, aes(x =overdose_rate, y = hless_pop_rate)) + geom_point() + geom_smooth(method = lm, fill="blue", color="blue", se = FALSE) + xlab("overdose") + ylab("Prison Population") + ggtitle("overdoses ~ Homeless pop.")
sc2 = ggplot(data = df2, aes(x = overdose_rate, y = unemployment_pct)) + geom_point() + geom_smooth(method = lm, fill="blue", color="blue", se = FALSE) + xlab("overdose") + ylab("Unemployment %") + ggtitle("overdoses ~ Unemployment %.")
sc3 = ggplot(data = df2, aes(x =overdose_rate, y = Prescription.Rate)) + geom_point() + geom_smooth(method = lm, fill="blue", color="red", se = FALSE) + xlab("overdose") + ylab("Prescription Rate") + ggtitle("overdoses ~ Prescription rate")
sc4 = ggplot(data = df2, aes(x =Prescription.Rate, y = pris_rate)) + geom_point() + geom_smooth(method = lm, fill="blue", color="red", se = FALSE) + xlab("homeless") + ylab("prison") + ggtitle("Prescription Rate ~ Prison_pop %")
plot_grid(sc1, sc2,sc3,sc4)

correlation of Variables, using the corrplot library.

library(corrplot)
library(RColorBrewer)
df3<- data.frame(overdose_rate,hless_pop_rate,Prescription.Rate,grad_hs_pct,pct_change_str,
                 labor_participation_pct,unemployment_pct, pris_rate,insured_pct)
M <-cor(df3)
corrplot(M, type="upper", order="hclust",
         col=brewer.pal(n=8, name="RdYlBu"))

### We can see that overdose rates are most heavily coorelated with high school draduation rates, homeless pop. rate, and insured % of pop. Whereas, homeless rates are most corelated with unemployment and change in state revenues.

COW PLOTS. From the density charts below, we conlude that our dependend variables are evenly disterbuted even with the large population data.

library(ggplot2)
library(cowplot)
ovd = ggplot(data = df2, aes(x = overdose_rate))  + 
  geom_density(col = "pink", fill = "pink", alpha = 0.5) +
  xlab("Overdoses") + ggtitle("Overdoses for all 4 states")

ho = ggplot(data = df2, aes(x = hless_pop_rate))  + 
  geom_density(col = "blue", fill = "blue", alpha = 0.5) +
  xlab("homeless pop.") + ggtitle("Homeless pop. for all 4 states")

pop = ggplot(data = df2, aes(x = pris_rate))  + 
  geom_density(col = "yellow", fill = "yellow", alpha = 0.5) +
  xlab("Prison pop.") + ggtitle("Prison pop. for all 4 states")

ur = ggplot(data = df2, aes(x = Prescription.Rate))  + 
  geom_density(col = "green", fill = "green", alpha = 0.5) +
  xlab("Prescription Rate") + ggtitle("Prescription Rate for all 4 states")

plot_grid(ovd, ho, pop, ur)

MODELS & PREDICTIONS.

MODEL 1: Linear regression with Dummy Variables for the years and states.

dummyvar = lm(overdose_rate~hless_pop_rate+factor(t) +factor(stateid), 
              data = df2)
summary(dummyvar)
## 
## Call:
## lm(formula = overdose_rate ~ hless_pop_rate + factor(t) + factor(stateid), 
##     data = df2)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -0.0033044 -0.0010011  0.0001143  0.0009087  0.0038190 
## 
## Coefficients:
##                    Estimate Std. Error t value Pr(>|t|)    
## (Intercept)      -0.0024337  0.0023137  -1.052 0.300085    
## hless_pop_rate    0.0162226  0.0058569   2.770 0.008913 ** 
## factor(t)7        0.0003577  0.0013368   0.268 0.790613    
## factor(t)8        0.0004908  0.0013368   0.367 0.715751    
## factor(t)9        0.0009702  0.0013380   0.725 0.473208    
## factor(t)10       0.0009078  0.0013372   0.679 0.501681    
## factor(t)11       0.0012008  0.0013379   0.897 0.375587    
## factor(t)12       0.0008895  0.0013386   0.664 0.510753    
## factor(t)13       0.0011494  0.0013416   0.857 0.397400    
## factor(t)14       0.0016295  0.0013476   1.209 0.234677    
## factor(t)15       0.0028141  0.0013495   2.085 0.044404 *  
## factor(t)16       0.0051521  0.0013546   3.803 0.000549 ***
## factor(t)17       0.0059194  0.0013464   4.396 9.78e-05 ***
## factor(t)18       0.0053334  0.0013386   3.984 0.000327 ***
## factor(stateid)2  0.0059228  0.0009483   6.246 3.67e-07 ***
## factor(stateid)3  0.0032882  0.0007806   4.212 0.000168 ***
## factor(stateid)4  0.0028640  0.0014492   1.976 0.056040 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.00189 on 35 degrees of freedom
## Multiple R-squared:  0.7871, Adjusted R-squared:  0.6898 
## F-statistic: 8.087 on 16 and 35 DF,  p-value: 1.488e-07

MODEL 1.1: Linear regression with Dummy Variables for the years and states.

dummyvar2 = lm(hless_pop_rate~Prescription.Rate+factor(t) +factor(stateid), 
              data = df2)
summary(dummyvar2)
## 
## Call:
## lm(formula = hless_pop_rate ~ Prescription.Rate + factor(t) + 
##     factor(stateid), data = df2)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.055500 -0.031028 -0.000693  0.027108  0.089260 
## 
## Coefficients:
##                    Estimate Std. Error t value Pr(>|t|)    
## (Intercept)       -0.192873   0.108795  -1.773  0.08496 .  
## Prescription.Rate  0.010532   0.002055   5.124 1.10e-05 ***
## factor(t)7        -0.029383   0.029744  -0.988  0.33000    
## factor(t)8        -0.048197   0.030501  -1.580  0.12307    
## factor(t)9        -0.066492   0.031184  -2.132  0.04008 *  
## factor(t)10       -0.073459   0.032012  -2.295  0.02786 *  
## factor(t)11       -0.064264   0.031059  -2.069  0.04598 *  
## factor(t)12       -0.053703   0.030271  -1.774  0.08475 .  
## factor(t)13       -0.027017   0.029199  -0.925  0.36117    
## factor(t)14       -0.011454   0.029364  -0.390  0.69886    
## factor(t)15        0.026306   0.031276   0.841  0.40601    
## factor(t)16        0.041586   0.032985   1.261  0.21575    
## factor(t)17        0.105252   0.039002   2.699  0.01064 *  
## factor(t)18        0.163802   0.045041   3.637  0.00088 ***
## factor(stateid)2  -0.356213   0.052382  -6.800 6.93e-08 ***
## factor(stateid)3   0.090596   0.018785   4.823 2.74e-05 ***
## factor(stateid)4  -0.371871   0.035042 -10.612 1.75e-12 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.04124 on 35 degrees of freedom
## Multiple R-squared:  0.9031, Adjusted R-squared:  0.8588 
## F-statistic: 20.38 on 16 and 35 DF,  p-value: 3.81e-13

MODEL 2: Panel Linear Model with state fixed effects.

library(plm)
within = plm(overdose_rate ~ hless_pop_rate+ factor(t), index="stateid", model="within", data=df2 )
summary(within)
## Oneway (individual) effect Within Model
## 
## Call:
## plm(formula = overdose_rate ~ hless_pop_rate + factor(t), data = df2, 
##     model = "within", index = "stateid")
## 
## Balanced Panel: n = 4, T = 13, N = 52
## 
## Residuals:
##        Min.     1st Qu.      Median     3rd Qu.        Max. 
## -0.00330442 -0.00100111  0.00011431  0.00090869  0.00381900 
## 
## Coefficients:
##                  Estimate Std. Error t-value  Pr(>|t|)    
## hless_pop_rate 0.01622265 0.00585687  2.7699 0.0089132 ** 
## factor(t)7     0.00035765 0.00133676  0.2676 0.7906132    
## factor(t)8     0.00049076 0.00133683  0.3671 0.7157513    
## factor(t)9     0.00097019 0.00133801  0.7251 0.4732083    
## factor(t)10    0.00090778 0.00133718  0.6789 0.5016809    
## factor(t)11    0.00120077 0.00133791  0.8975 0.3755869    
## factor(t)12    0.00088946 0.00133863  0.6645 0.5107528    
## factor(t)13    0.00114941 0.00134156  0.8568 0.3974004    
## factor(t)14    0.00162953 0.00134757  1.2092 0.2346772    
## factor(t)15    0.00281412 0.00134952  2.0853 0.0444037 *  
## factor(t)16    0.00515214 0.00135459  3.8035 0.0005490 ***
## factor(t)17    0.00591936 0.00134639  4.3965 9.776e-05 ***
## factor(t)18    0.00533338 0.00133863  3.9842 0.0003268 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Total Sum of Squares:    0.00033096
## Residual Sum of Squares: 0.00012508
## R-Squared:      0.62206
## Adj. R-Squared: 0.44929
## F-statistic: 4.43135 on 13 and 35 DF, p-value: 0.00021424

PLM no.2

within2 = plm(hless_pop_rate~ Prescription.Rate+ factor(t), index="stateid", model="within", data=df2 )
summary(within2)
## Oneway (individual) effect Within Model
## 
## Call:
## plm(formula = hless_pop_rate ~ Prescription.Rate + factor(t), 
##     data = df2, model = "within", index = "stateid")
## 
## Balanced Panel: n = 4, T = 13, N = 52
## 
## Residuals:
##       Min.    1st Qu.     Median    3rd Qu.       Max. 
## -0.0555000 -0.0310277 -0.0006928  0.0271076  0.0892600 
## 
## Coefficients:
##                     Estimate Std. Error t-value  Pr(>|t|)    
## Prescription.Rate  0.0105325  0.0020555  5.1241 1.102e-05 ***
## factor(t)7        -0.0293830  0.0297440 -0.9879   0.33000    
## factor(t)8        -0.0481972  0.0305014 -1.5802   0.12307    
## factor(t)9        -0.0664917  0.0311841 -2.1322   0.04008 *  
## factor(t)10       -0.0734593  0.0320125 -2.2947   0.02786 *  
## factor(t)11       -0.0642643  0.0310585 -2.0691   0.04598 *  
## factor(t)12       -0.0537025  0.0302706 -1.7741   0.08475 .  
## factor(t)13       -0.0270169  0.0291995 -0.9253   0.36117    
## factor(t)14       -0.0114536  0.0293640 -0.3901   0.69886    
## factor(t)15        0.0263059  0.0312761  0.8411   0.40601    
## factor(t)16        0.0415857  0.0329855  1.2607   0.21575    
## factor(t)17        0.1052518  0.0390018  2.6986   0.01064 *  
## factor(t)18        0.1638020  0.0450409  3.6367   0.00088 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Total Sum of Squares:    0.11194
## Residual Sum of Squares: 0.059527
## R-Squared:      0.46821
## Adj. R-Squared: 0.22511
## F-statistic: 2.37043 on 13 and 35 DF, p-value: 0.021162

MODEL 3: General Additives Model (GAM)

library(gam)
gam1<- gam(overdose_rate~ns(hless_pop_rate, 1)+ unemployment_pct+Prescription.Rate , data=df2)
summary(gam1)
## 
## Call: gam(formula = overdose_rate ~ ns(hless_pop_rate, 1) + unemployment_pct + 
##     Prescription.Rate, data = df2)
## Deviance Residuals:
##        Min         1Q     Median         3Q        Max 
## -0.0051749 -0.0019833 -0.0004031  0.0005986  0.0088362 
## 
## (Dispersion Parameter for gaussian family taken to be 0)
## 
##     Null Deviance: 6e-04 on 51 degrees of freedom
## Residual Deviance: 5e-04 on 48 degrees of freedom
## AIC: -446.9452 
## 
## Number of Local Scoring Iterations: 2 
## 
## Anova for Parametric Effects
##                       Df     Sum Sq    Mean Sq F value  Pr(>F)  
## ns(hless_pop_rate, 1)  1 0.00005629 5.6295e-05  5.8144 0.01977 *
## unemployment_pct       1 0.00005563 5.5628e-05  5.7455 0.02047 *
## Prescription.Rate      1 0.00001085 1.0849e-05  1.1205 0.29510  
## Residuals             48 0.00046474 9.6820e-06                  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Ploting the GAM prediction

plot(gam1, se=TRUE, col="blue")  # not an impressive representation"

MODEL 4: ANOVA in GAM

gam.V1 = gam(overdose_rate~s(hless_pop_rate,5), data=df2)
gam.V2 = gam(overdose_rate~s(per_capita_state_tax_revenue,5), data=df2)
gam.V3 = gam(overdose_rate~s(Prescription.Rate,5), data=df2)

anova(gam.V1, gam.V2, gam.V3, test="F")
## Analysis of Deviance Table
## 
## Model 1: overdose_rate ~ s(hless_pop_rate, 5)
## Model 2: overdose_rate ~ s(per_capita_state_tax_revenue, 5)
## Model 3: overdose_rate ~ s(Prescription.Rate, 5)
##   Resid. Df Resid. Dev         Df    Deviance     F   Pr(>F)    
## 1        46 0.00029515                                          
## 2        46 0.00027392 0.00010574  2.1228e-05 20984 1.05e-05 ***
## 3        46 0.00044010 0.00011944 -1.6618e-04                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
summary(gam.V2)
## 
## Call: gam(formula = overdose_rate ~ s(per_capita_state_tax_revenue, 
##     5), data = df2)
## Deviance Residuals:
##        Min         1Q     Median         3Q        Max 
## -0.0033684 -0.0011363 -0.0001925  0.0003673  0.0083391 
## 
## (Dispersion Parameter for gaussian family taken to be 0)
## 
##     Null Deviance: 6e-04 on 51 degrees of freedom
## Residual Deviance: 3e-04 on 46.0001 degrees of freedom
## AIC: -470.4338 
## 
## Number of Local Scoring Iterations: NA 
## 
## Anova for Parametric Effects
##                                    Df     Sum Sq    Mean Sq F value    Pr(>F)
## s(per_capita_state_tax_revenue, 5)  1 7.5031e-05 7.5031e-05    12.6 0.0009022
## Residuals                          46 2.7392e-04 5.9550e-06                  
##                                       
## s(per_capita_state_tax_revenue, 5) ***
## Residuals                             
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Anova for Nonparametric Effects
##                                    Npar Df Npar F     Pr(F)    
## (Intercept)                                                    
## s(per_capita_state_tax_revenue, 5)       4 10.015 6.476e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Plotting the predictions

plot(gam.V1, se=TRUE, col="blue")

plot(gam.V3, se=TRUE, col="red")

Making predictions with our best performing model

PredictGam = predict(gam.V2, newdata=df2)
PredictGam
##           1           2           3           4           5           6 
## 0.004982173 0.004853897 0.004833103 0.005073297 0.005014066 0.004846096 
##           7           8           9          10          11          12 
## 0.004910060 0.004852216 0.004892576 0.005045020 0.005135518 0.005171789 
##          13          14          15          16          17          18 
## 0.005381501 0.006681986 0.006349033 0.006370425 0.007118888 0.007428355 
##          19          20          21          22          23          24 
## 0.007490464 0.007476304 0.007485523 0.007495347 0.007439793 0.007386162 
##          25          26          27          28          29          30 
## 0.007140049 0.007199480 0.005263255 0.005775406 0.006070270 0.005875179 
##          31          32          33          34          35          36 
## 0.005881283 0.006624971 0.007604558 0.008689495 0.010462683 0.012185695 
##          37          38          39          40          41          42 
## 0.013925214 0.014088003 0.016441044 0.007592290 0.007430287 0.007251627 
##          43          44          45          46          47          48 
## 0.007307070 0.007409443 0.007290695 0.006940937 0.006581177 0.006168943 
##          49          50          51          52 
## 0.005978688 0.006242525 0.005875346 0.005681131

Adding local reression fits by lo() Models

gam.lo = gam(overdose_rate~s(hless_pop_rate,df=4)+lo(per_capita_state_tax_revenue,span=0.5)+Prescription.Rate, data=df2)


summary(gam.lo)
## 
## Call: gam(formula = overdose_rate ~ s(hless_pop_rate, df = 4) + lo(per_capita_state_tax_revenue, 
##     span = 0.5) + Prescription.Rate, data = df2)
## Deviance Residuals:
##        Min         1Q     Median         3Q        Max 
## -0.0036708 -0.0008376 -0.0004748  0.0004493  0.0070003 
## 
## (Dispersion Parameter for gaussian family taken to be 0)
## 
##     Null Deviance: 6e-04 on 51 degrees of freedom
## Residual Deviance: 2e-04 on 42.122 degrees of freedom
## AIC: -475.8456 
## 
## Number of Local Scoring Iterations: 15 
## 
## Anova for Parametric Effects
##                                                  Df     Sum Sq    Mean Sq
## s(hless_pop_rate, df = 4)                     1.000 9.6109e-05 9.6109e-05
## lo(per_capita_state_tax_revenue, span = 0.5)  1.000 9.7430e-06 9.7430e-06
## Prescription.Rate                             1.000 4.4878e-05 4.4878e-05
## Residuals                                    42.122 2.1264e-04 5.0480e-06
##                                              F value    Pr(>F)    
## s(hless_pop_rate, df = 4)                    19.0381 8.126e-05 ***
## lo(per_capita_state_tax_revenue, span = 0.5)  1.9300   0.17206    
## Prescription.Rate                             8.8897   0.00475 ** 
## Residuals                                                         
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Anova for Nonparametric Effects
##                                              Npar Df  Npar F     Pr(F)    
## (Intercept)                                                               
## s(hless_pop_rate, df = 4)                        3.0  2.5964   0.06493 .  
## lo(per_capita_state_tax_revenue, span = 0.5)     2.9 13.2865 3.971e-06 ***
## Prescription.Rate                                                         
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1