1. Research Question: What is the impact of the unemployment rate and the share of residents without a high school diploma on the poverty rate in Chicago?
library(sf)     
## Warning: package 'sf' was built under R version 4.3.2
## Linking to GEOS 3.11.2, GDAL 3.7.2, PROJ 9.3.0; sf_use_s2() is TRUE
library(dplyr)   
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(spData) 
## Warning: package 'spData' was built under R version 4.3.3
## To access larger datasets in this package, install the spDataLarge
## package with: `install.packages('spDataLarge',
## repos='https://nowosad.github.io/drat/', type='source')`
library(ggplot2)
## Warning: package 'ggplot2' was built under R version 4.3.2
library(ggthemes)
## Warning: package 'ggthemes' was built under R version 4.3.2
library(spdep)
## Warning: package 'spdep' was built under R version 4.3.3
library(spatialreg)
## Warning: package 'spatialreg' was built under R version 4.3.3
## Loading required package: Matrix
## 
## Attaching package: 'spatialreg'
## The following objects are masked from 'package:spdep':
## 
##     get.ClusterOption, get.coresOption, get.mcOption,
##     get.VerboseOption, get.ZeroPolicyOption, set.ClusterOption,
##     set.coresOption, set.mcOption, set.VerboseOption,
##     set.ZeroPolicyOption
library(GWmodel)
## Warning: package 'GWmodel' was built under R version 4.3.3
## Loading required package: robustbase
## Warning: package 'robustbase' was built under R version 4.3.3
## Loading required package: sp
## Warning: package 'sp' was built under R version 4.3.2
## Loading required package: Rcpp
## Welcome to GWmodel version 2.3-2.
library(tidyr)
## 
## Attaching package: 'tidyr'
## The following objects are masked from 'package:Matrix':
## 
##     expand, pack, unpack
chicago <- st_read("airbnb_Chicago 2015.shp")
## Reading layer `airbnb_Chicago 2015' from data source 
##   `C:\Users\jules\Downloads\Spring 24\DIDA 370\airbnb_Chicago 2015.shp' 
##   using driver `ESRI Shapefile'
## Simple feature collection with 77 features and 20 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: -87.94011 ymin: 41.64454 xmax: -87.52414 ymax: 42.02304
## Geodetic CRS:  WGS 84
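# Keep the outcome and the two predictors (sf retains the geometry column); drop rows with missing poverty
chi_model_data <- chicago %>% select(poverty, without_hs, unemployed) %>% filter(!is.na(poverty))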

ggplot()+
  geom_sf(data = chicago, aes(fill=as.numeric(poverty)))+
  scale_fill_steps(
    name = "Poverty",
    low = "lightsteelblue1",
    high = "tomato1",
    n.breaks = 5,
    show.limits = TRUE)+
  theme_void()

model <- lm(poverty ~ without_hs + unemployed, chi_model_data) 
summary(model)
## 
## Call:
## lm(formula = poverty ~ without_hs + unemployed, data = chi_model_data)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -14.0642  -5.6940  -0.2384   4.8156  15.6825 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  1.12939    1.95078   0.579   0.5644    
## without_hs   0.15609    0.07047   2.215   0.0298 *  
## unemployed   1.13589    0.11044  10.285 6.51e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 6.789 on 74 degrees of freedom
## Multiple R-squared:  0.6625, Adjusted R-squared:  0.6533 
## F-statistic: 72.62 on 2 and 74 DF,  p-value: < 2.2e-16

Based on the linear regression results, both the share of individuals without a high school diploma and the unemployment rate are statistically significant predictors of the poverty rate across the 77 areas in the data (p = 0.0298 and p < 0.001, respectively). Together they explain about 66% of the variation in poverty (R-squared = 0.6625). Holding unemployment constant, a one-unit increase in the share without a high school diploma is associated with an increase of about 0.16 units in the poverty rate. Holding education constant, a one-unit increase in the unemployment rate is associated with an increase of about 1.14 units in the poverty rate.
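To make the coefficient interpretation concrete, the fitted equation can be evaluated by hand. The sketch below copies the rounded estimates from the summary output above; the input values (20 for `without_hs`, 10 for `unemployed`) are hypothetical and chosen only for illustration, not taken from the data.

```r
# Rounded OLS estimates copied from the summary(model) output
b <- c(intercept = 1.12939, without_hs = 0.15609, unemployed = 1.13589)

# Hypothetical area (illustrative values, not taken from the data)
x <- c(intercept = 1, without_hs = 20, unemployed = 10)

# Predicted poverty = intercept + b1 * without_hs + b2 * unemployed
pred <- sum(b * x)

# Change in the prediction for one more unit of unemployment,
# holding without_hs fixed
x_plus <- x
x_plus["unemployed"] <- x["unemployed"] + 1
diff_unemp <- sum(b * x_plus) - pred

print(round(pred, 2))
print(round(diff_unemp, 5))
```

The second number is simply the `unemployed` coefficient, which is what "holding the other predictor constant" means in practice. With the fitted object available, `predict(model, newdata = ...)` should give the same prediction at full precision.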

# Contiguity neighbors from the polygons, then spatial weights (nb2listw defaults to row-standardized)
chi_list <- chi_model_data %>% 
  poly2nb() %>% 
  nb2listw(zero.policy = TRUE)

lm_moran_test <- lm.morantest(model, chi_list)

lm_moran_test
## 
##  Global Moran I for regression residuals
## 
## data:  
## model: lm(formula = poverty ~ without_hs + unemployed, data =
## chi_model_data)
## weights: chi_list
## 
## Moran I statistic standard deviate = 5.692, p-value = 6.279e-09
## alternative hypothesis: greater
## sample estimates:
## Observed Moran I      Expectation         Variance 
##      0.371038481     -0.028342012      0.004923201
  1. There is strong evidence of positive spatial autocorrelation in the residuals of the regression model. The Moran’s I standard deviate of 5.692 lies far in the upper tail of its reference distribution, and the associated p-value of 6.279e-09 is well below the typical significance level of 0.05. The observed Moran’s I of 0.371, compared with an expectation of -0.028, indicates that areas with similar residuals cluster together, so the OLS assumption of independent errors does not hold.
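The standard deviate and p-value in the test output follow directly from the three sample estimates. As a minimal sketch, with all numbers copied from the `lm.morantest()` output above and the variable names chosen here for illustration:

```r
# Values copied from the lm.morantest() output
moran_obs <- 0.371038481   # observed Moran's I of the residuals
moran_exp <- -0.028342012  # expectation under no spatial autocorrelation
moran_var <- 0.004923201   # variance under the null

# Standard deviate: distance from the expectation in standard-error units
z <- (moran_obs - moran_exp) / sqrt(moran_var)

# One-sided p-value, matching "alternative hypothesis: greater"
p <- pnorm(z, lower.tail = FALSE)

print(round(z, 3))
print(signif(p, 3))
```

The p-value is compared with 0.05; the standard deviate is not. A deviate of about 5.7 means the observed statistic sits roughly 5.7 standard errors above what would be expected without spatial autocorrelation.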
lm_lag <- lagsarlm(poverty ~ without_hs + unemployed,
              data = chi_model_data,
              listw = chi_list,
              zero.policy = TRUE, 
              na.action = na.omit)

summary(lm_lag)
## 
## Call:
## lagsarlm(formula = poverty ~ without_hs + unemployed, data = chi_model_data, 
##     listw = chi_list, na.action = na.omit, zero.policy = TRUE)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -14.80438  -4.63181  -0.86583   4.15380  15.89788 
## 
## Type: lag 
## Coefficients: (asymptotic standard errors) 
##              Estimate Std. Error z value  Pr(>|z|)
## (Intercept) -2.369114   2.022446 -1.1714   0.24143
## without_hs   0.150221   0.065599  2.2900   0.02202
## unemployed   0.912207   0.127489  7.1552 8.358e-13
## 
## Rho: 0.32033, LR test value: 8.3426, p-value: 0.0038726
## Asymptotic standard error: 0.10583
##     z-value: 3.0269, p-value: 0.002471
## Wald statistic: 9.1619, p-value: 0.002471
## 
## Log likelihood: -251.0314 for lag model
## ML residual variance (sigma squared): 38.863, (sigma: 6.234)
## Number of observations: 77 
## Number of parameters estimated: 5 
## AIC: 512.06, (AIC for lm: 518.41)
## LM test for residual autocorrelation
## test value: 10.971, p-value: 0.00092567
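The fit statistics in this summary can be reproduced by hand, which helps when comparing the lag model with the linear model. The sketch below uses only numbers copied from the output above; the variable names are illustrative.

```r
# Values copied from the summary(lm_lag) output
loglik_lag <- -251.0314  # log likelihood of the lag model
k_lag      <- 5          # number of parameters estimated, as reported
lr_stat    <- 8.3426     # LR test value for Rho

# AIC = 2k - 2 * logLik
aic_lag <- 2 * k_lag - 2 * loglik_lag

# The LR test compares the lag model with the linear model (Rho = 0),
# so the statistic is referred to a chi-squared distribution with 1 df
p_lr <- pchisq(lr_stat, df = 1, lower.tail = FALSE)

print(round(aic_lag, 2))
print(signif(p_lr, 5))
```

A lower AIC for the lag model (512.06 against 518.41 for `lm`) and a small LR p-value both point the same way: adding the spatial lag improves the fit by more than the extra parameter costs.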
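  1. The very small p-value from the Global Moran’s I test on the regression residuals indicates significant spatial autocorrelation, so a model that accounts for spatial dependence is needed. A spatial lag model does this by including a spatially lagged dependent variable, which gives more reliable estimates of the relationships between poverty, educational attainment, and unemployment in Chicago. The fitted model supports this choice: the spatial parameter is significant (Rho = 0.32, LR test p = 0.0039) and the AIC falls from 518.41 for the linear model to 512.06. The LM test for residual autocorrelation is still significant (test value 10.971, p < 0.001), however, so some spatial dependence remains after adding the lag term.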

  2. In the spatial lag regression model, both the proportion of individuals without a high school diploma and the unemployment rate are statistically significant predictors of the poverty rate in Chicago. A one-unit increase in the proportion without a high school diploma is associated with a 0.150 increase in the poverty rate (p = 0.022), and a one-unit increase in the unemployment rate is associated with a 0.912 increase (p < 0.001). Because the model includes a spatial lag of poverty, these coefficients are not full marginal effects: a change in one area also feeds through to neighboring areas and back. These findings point to the importance of educational attainment and employment opportunities for mitigating poverty in urban areas like Chicago.
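One way to see how much the spatial feedback adds is to compute the average total impact of each predictor. This is a sketch under a stated assumption: the weights are row-standardized (the default `style = "W"` in `nb2listw()`) and every area has at least one neighbor, in which case the average total impact reduces to the coefficient divided by (1 - Rho). The estimates are copied from the summary output above.

```r
# Estimates copied from the summary(lm_lag) output
rho  <- 0.32033
beta <- c(without_hs = 0.150221, unemployed = 0.912207)

# Assumption: row-standardized weights and no neighborless areas, so the
# average total impact of each predictor is beta / (1 - rho)
total_impact <- beta / (1 - rho)

print(round(total_impact, 3))
```

Under that assumption, the total association for unemployment is noticeably larger than the 0.912 coefficient alone. With the fitted object available, `impacts(lm_lag, listw = chi_list)` from spatialreg should report the direct, indirect, and total impacts without relying on this shortcut.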