Geographically Weighted Regression

Generally, if we have a continuous outcome, we consider using the OLS model and when we have data collected over space, we have other assumptions too.

library(spdep)
library(maptools)
library(car)
library(lmtest)
library(spgwr)
library(classInt)
library(RColorBrewer)
setwd("~/Google Drive//dem7263/data")

Stationarity is a general term in statistics/math, generally it means something doesn’t change over time or space e.g. stationary population e.g. stationary time series In spatial statistics, we can stay stationarity = homogeneity of an effect, or, that a process works the same regardless of where you observe the process. In spatial statistics, the latter can be a weak assumption, and we can ask, does X affect Y differently at different geographic locations, or in terms of parameters: If we estimate in OLS \(\beta\) = .5, are there locations in the data where \(\beta\) != .5?

Approaches to non-stationarity

There are several approaches to dealing with this problem. Among them are: * Create zones of homogeneity and stratify * Allow parameters to vary constantly

The first technique is stratification, or spatial regimes. This is where we stratify the data by either a known fixed quality, such as the state in which a county is located, or perhaps some other socio-demographic or socio-economic variable. Technique 1 - Stratification

We then fit separate regression models for each region. These could be OLS or some other specification we have discussed.

First, we read in our data and make a spatial representation of it

spdat<-readShapePoly("~/Google Drive/dem7263/data/usdata_mort.shp")
spdat$population<-as.numeric(as.character(spdat$Population))
spdat$deaths<-as.numeric(as.character(spdat$Deaths))
spdat<-spdat[spdat$STATE%in%c("05","22","40","35", "48"),]
spdat$mig_group<-cut(spdat$p5yrinmig, breaks = quantile(spdat$p5yrinmig, p=c(0, .3, .6, 1)), include.lowest = T)
lvs<-levels(spdat$mig_group)

nbs<-knearneigh(coordinates(spdat), k=2)

nbsmg1<-knearneigh(coordinates(spdat[spdat$mig_group==lvs[1],]), k=2)
nbsmg2<-knearneigh(coordinates(spdat[spdat$mig_group==lvs[2],]), k=2)
nbsmg3<-knearneigh(coordinates(spdat[spdat$mig_group==lvs[3],]), k=2)

nbs<-knn2nb(nbs,sym=T)
nbs1<-knn2nb(nbsmg1, sym=T)
nbs2<-knn2nb(nbsmg2, sym=T)
nbs3<-knn2nb(nbsmg3, sym=T)

Here’s the basic OLS model:

fit.1<-lm(I(deaths/population)~scale(ppersonspo)+scale(p65plus)+scale(pblack_1)+scale(phisp)+I(RUCC>=7), data=spdat)
summary(fit.1)

## 
## Call:
## lm(formula = I(deaths/population) ~ scale(ppersonspo) + scale(p65plus) + 
##     scale(pblack_1) + scale(phisp) + I(RUCC >= 7), data = spdat)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -0.0066877 -0.0005928  0.0000377  0.0006448  0.0031275 
## 
## Coefficients:
##                     Estimate Std. Error t value Pr(>|t|)    
## (Intercept)        1.046e-02  7.256e-05 144.199  < 2e-16 ***
## scale(ppersonspo)  5.545e-04  7.066e-05   7.847 2.64e-14 ***
## scale(p65plus)     2.092e-03  6.286e-05  33.281  < 2e-16 ***
## scale(pblack_1)    9.032e-05  7.291e-05   1.239    0.216    
## scale(phisp)      -8.643e-04  7.401e-05 -11.677  < 2e-16 ***
## I(RUCC >= 7)TRUE  -1.620e-04  1.195e-04  -1.356    0.176    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.001159 on 495 degrees of freedom
## Multiple R-squared:  0.8073, Adjusted R-squared:  0.8054 
## F-statistic: 414.8 on 5 and 495 DF,  p-value: < 2.2e-16

We could stratify by state:

#Examine basic regimes, using states as a nesting mechanism
spplot(spdat, "STATE")

fit.strat<-lm(I(deaths/population)~STATE/(scale(ppersonspo)+scale(p65plus)+scale(pblack_1)+scale(phisp)+I(RUCC>=7)), data=spdat)
summary(fit.strat)

## 
## Call:
## lm(formula = I(deaths/population) ~ STATE/(scale(ppersonspo) + 
##     scale(p65plus) + scale(pblack_1) + scale(phisp) + I(RUCC >= 
##     7)), data = spdat)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -0.0044801 -0.0005148  0.0000290  0.0006616  0.0043913 
## 
## Coefficients:
##                             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                9.682e-03  6.809e-04  14.219  < 2e-16 ***
## STATE22                   -9.979e-04  2.059e-03  -0.485 0.628201    
## STATE35                   -1.388e-04  1.560e-03  -0.089 0.929152    
## STATE40                    1.940e-03  8.023e-04   2.418 0.015976 *  
## STATE48                    6.572e-04  6.893e-04   0.953 0.340858    
## STATE05:scale(ppersonspo)  6.526e-04  2.551e-04   2.558 0.010828 *  
## STATE22:scale(ppersonspo)  3.219e-04  2.586e-04   1.245 0.213715    
## STATE35:scale(ppersonspo)  5.528e-04  1.943e-04   2.845 0.004634 ** 
## STATE40:scale(ppersonspo)  8.200e-04  1.930e-04   4.249 2.59e-05 ***
## STATE48:scale(ppersonspo)  2.606e-04  1.112e-04   2.344 0.019477 *  
## STATE05:scale(p65plus)     1.953e-03  1.991e-04   9.808  < 2e-16 ***
## STATE22:scale(p65plus)     2.355e-03  2.776e-04   8.485 2.83e-16 ***
## STATE35:scale(p65plus)     1.575e-03  1.694e-04   9.299  < 2e-16 ***
## STATE40:scale(p65plus)     2.530e-03  1.855e-04  13.639  < 2e-16 ***
## STATE48:scale(p65plus)     2.228e-03  8.561e-05  26.025  < 2e-16 ***
## STATE05:scale(pblack_1)    2.350e-04  1.605e-04   1.464 0.143734    
## STATE22:scale(pblack_1)    9.331e-05  1.901e-04   0.491 0.623848    
## STATE35:scale(pblack_1)    6.666e-04  2.100e-03   0.317 0.751093    
## STATE40:scale(pblack_1)    1.879e-04  4.565e-04   0.412 0.680838    
## STATE48:scale(pblack_1)    4.049e-04  1.633e-04   2.480 0.013505 *  
## STATE05:scale(phisp)      -1.783e-03  9.267e-04  -1.925 0.054887 .  
## STATE22:scale(phisp)      -3.092e-03  2.549e-03  -1.213 0.225843    
## STATE35:scale(phisp)       1.852e-04  2.336e-04   0.793 0.428117    
## STATE40:scale(phisp)       1.186e-04  5.806e-04   0.204 0.838172    
## STATE48:scale(phisp)      -4.991e-04  1.334e-04  -3.741 0.000206 ***
## STATE05:I(RUCC >= 7)TRUE   2.247e-04  3.017e-04   0.745 0.456661    
## STATE22:I(RUCC >= 7)TRUE   3.250e-04  3.560e-04   0.913 0.361859    
## STATE35:I(RUCC >= 7)TRUE  -5.091e-04  4.390e-04  -1.160 0.246774    
## STATE40:I(RUCC >= 7)TRUE  -2.993e-04  2.947e-04  -1.016 0.310346    
## STATE48:I(RUCC >= 7)TRUE  -3.663e-04  1.629e-04  -2.249 0.024975 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.001079 on 471 degrees of freedom
## Multiple R-squared:  0.8412, Adjusted R-squared:  0.8314 
## F-statistic: 86.01 on 29 and 471 DF,  p-value: < 2.2e-16

anova(fit.1, fit.strat)

## Analysis of Variance Table
## 
## Model 1: I(deaths/population) ~ scale(ppersonspo) + scale(p65plus) + scale(pblack_1) + 
##     scale(phisp) + I(RUCC >= 7)
## Model 2: I(deaths/population) ~ STATE/(scale(ppersonspo) + scale(p65plus) + 
##     scale(pblack_1) + scale(phisp) + I(RUCC >= 7))
##   Res.Df        RSS Df  Sum of Sq     F    Pr(>F)    
## 1    495 0.00066533                                  
## 2    471 0.00054848 24 0.00011685 4.181 4.489e-10 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Or, we could examine an alternative regime, based on a demographic characteristic, say migration:

spplot(spdat, "mig_group", col.regions=brewer.pal(n=3, "Greys"))

fit.strat2<-lm(I(deaths/population)~mig_group/(scale(ppersonspo)+scale(p65plus)+scale(pblack_1)+scale(phisp)+I(RUCC>=7)), data=spdat)
summary(fit.strat2)

## 
## Call:
## lm(formula = I(deaths/population) ~ mig_group/(scale(ppersonspo) + 
##     scale(p65plus) + scale(pblack_1) + scale(phisp) + I(RUCC >= 
##     7)), data = spdat)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -0.0060718 -0.0005637  0.0000301  0.0006485  0.0033088 
## 
## Coefficients:
##                                            Estimate Std. Error t value
## (Intercept)                               1.052e-02  1.417e-04  74.268
## mig_group(0.18,0.227]                     2.396e-04  1.891e-04   1.267
## mig_group(0.227,0.504]                   -3.476e-04  1.900e-04  -1.830
## mig_group[0.0622,0.18]:scale(ppersonspo)  2.522e-04  1.135e-04   2.222
## mig_group(0.18,0.227]:scale(ppersonspo)   5.935e-04  1.557e-04   3.812
## mig_group(0.227,0.504]:scale(ppersonspo)  5.085e-04  1.198e-04   4.243
## mig_group[0.0622,0.18]:scale(p65plus)     1.738e-03  1.272e-04  13.661
## mig_group(0.18,0.227]:scale(p65plus)      2.175e-03  1.264e-04  17.204
## mig_group(0.227,0.504]:scale(p65plus)     2.118e-03  8.822e-05  24.007
## mig_group[0.0622,0.18]:scale(pblack_1)    1.319e-04  1.149e-04   1.147
## mig_group(0.18,0.227]:scale(pblack_1)    -1.363e-04  1.404e-04  -0.971
## mig_group(0.227,0.504]:scale(pblack_1)    9.340e-05  1.453e-04   0.643
## mig_group[0.0622,0.18]:scale(phisp)      -8.396e-04  1.184e-04  -7.092
## mig_group(0.18,0.227]:scale(phisp)       -9.079e-04  1.304e-04  -6.962
## mig_group(0.227,0.504]:scale(phisp)      -8.952e-04  1.386e-04  -6.461
## mig_group[0.0622,0.18]:I(RUCC >= 7)TRUE   3.227e-04  2.125e-04   1.519
## mig_group(0.18,0.227]:I(RUCC >= 7)TRUE   -2.944e-04  2.044e-04  -1.441
## mig_group(0.227,0.504]:I(RUCC >= 7)TRUE  -3.302e-04  1.901e-04  -1.737
##                                          Pr(>|t|)    
## (Intercept)                               < 2e-16 ***
## mig_group(0.18,0.227]                    0.205722    
## mig_group(0.227,0.504]                   0.067872 .  
## mig_group[0.0622,0.18]:scale(ppersonspo) 0.026740 *  
## mig_group(0.18,0.227]:scale(ppersonspo)  0.000156 ***
## mig_group(0.227,0.504]:scale(ppersonspo) 2.65e-05 ***
## mig_group[0.0622,0.18]:scale(p65plus)     < 2e-16 ***
## mig_group(0.18,0.227]:scale(p65plus)      < 2e-16 ***
## mig_group(0.227,0.504]:scale(p65plus)     < 2e-16 ***
## mig_group[0.0622,0.18]:scale(pblack_1)   0.251779    
## mig_group(0.18,0.227]:scale(pblack_1)    0.332149    
## mig_group(0.227,0.504]:scale(pblack_1)   0.520587    
## mig_group[0.0622,0.18]:scale(phisp)      4.74e-12 ***
## mig_group(0.18,0.227]:scale(phisp)       1.10e-11 ***
## mig_group(0.227,0.504]:scale(phisp)      2.54e-10 ***
## mig_group[0.0622,0.18]:I(RUCC >= 7)TRUE  0.129412    
## mig_group(0.18,0.227]:I(RUCC >= 7)TRUE   0.150336    
## mig_group(0.227,0.504]:I(RUCC >= 7)TRUE  0.082976 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.001116 on 483 degrees of freedom
## Multiple R-squared:  0.8257, Adjusted R-squared:  0.8196 
## F-statistic: 134.6 on 17 and 483 DF,  p-value: < 2.2e-16

anova(fit.1, fit.strat2)

## Analysis of Variance Table
## 
## Model 1: I(deaths/population) ~ scale(ppersonspo) + scale(p65plus) + scale(pblack_1) + 
##     scale(phisp) + I(RUCC >= 7)
## Model 2: I(deaths/population) ~ mig_group/(scale(ppersonspo) + scale(p65plus) + 
##     scale(pblack_1) + scale(phisp) + I(RUCC >= 7))
##   Res.Df        RSS Df Sum of Sq      F    Pr(>F)    
## 1    495 0.00066533                                  
## 2    483 0.00060171 12 6.362e-05 4.2557 2.056e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

anova(fit.strat2, fit.strat)

## Analysis of Variance Table
## 
## Model 1: I(deaths/population) ~ mig_group/(scale(ppersonspo) + scale(p65plus) + 
##     scale(pblack_1) + scale(phisp) + I(RUCC >= 7))
## Model 2: I(deaths/population) ~ STATE/(scale(ppersonspo) + scale(p65plus) + 
##     scale(pblack_1) + scale(phisp) + I(RUCC >= 7))
##   Res.Df        RSS Df  Sum of Sq      F    Pr(>F)    
## 1    483 0.00060171                                   
## 2    471 0.00054848 12 5.3229e-05 3.8092 1.477e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

plot(spdat, main="Neighbors for low migration counties")
plot(nbs1,coords=coordinates(spdat[spdat$mig_group==lvs[1],]), col=2, add=T)

plot(spdat, main="Neighbors for medium migration counties")
plot(nbs2,coords=coordinates(spdat[spdat$mig_group==lvs[2],]), col=2, add=T)

plot(spdat, main="Neighbors for high migration counties")
plot(nbs3,coords=coordinates(spdat[spdat$mig_group==lvs[3],]), col=2, add=T)

wts<-nb2listw(nbs)
wts1<-nb2listw(nbs1)
wts2<-nb2listw(nbs2)
wts3<-nb2listw(nbs3)

So we see non-stationarity in several of the covariates, with respect to state. The effects of poverty, age structure and race/ethnicty all vary significantly by state. Similar results are also seen for the migration rate grouping variable, althought the final model comparison of the two stratified models shows more variation between states than between migration groups.

If we wanted, we could stratify our analysis by state. Below, I simply stratify the analysis by state and use the OLS model.

fit.ar<-lm(I(deaths/population)~scale(ppersonspo)+scale(p65plus)+scale(pblack_1)+scale(phisp)+I(RUCC>=7), data=spdat, subset=STATE=="05")
summary(fit.ar)

## 
## Call:
## lm(formula = I(deaths/population) ~ scale(ppersonspo) + scale(p65plus) + 
##     scale(pblack_1) + scale(phisp) + I(RUCC >= 7), data = spdat, 
##     subset = STATE == "05")
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -2.356e-03 -4.771e-04  3.499e-05  5.518e-04  2.209e-03 
## 
## Coefficients:
##                     Estimate Std. Error t value Pr(>|t|)    
## (Intercept)        0.0096818  0.0005626  17.210  < 2e-16 ***
## scale(ppersonspo)  0.0006526  0.0002108   3.097  0.00283 ** 
## scale(p65plus)     0.0019531  0.0001645  11.871  < 2e-16 ***
## scale(pblack_1)    0.0002350  0.0001326   1.773  0.08072 .  
## scale(phisp)      -0.0017834  0.0007656  -2.329  0.02277 *  
## I(RUCC >= 7)TRUE   0.0002247  0.0002492   0.902  0.37037    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.0008916 on 69 degrees of freedom
## Multiple R-squared:  0.8106, Adjusted R-squared:  0.7969 
## F-statistic: 59.08 on 5 and 69 DF,  p-value: < 2.2e-16

fit.la<-lm(I(deaths/population)~scale(ppersonspo)+scale(p65plus)+scale(pblack_1)+scale(phisp)+I(RUCC>=7), data=spdat, subset=STATE=="22")
summary(fit.la)

## 
## Call:
## lm(formula = I(deaths/population) ~ scale(ppersonspo) + scale(p65plus) + 
##     scale(pblack_1) + scale(phisp) + I(RUCC >= 7), data = spdat, 
##     subset = STATE == "22")
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -0.0035436 -0.0003792 -0.0000223  0.0005726  0.0018121 
## 
## Coefficients:
##                     Estimate Std. Error t value Pr(>|t|)    
## (Intercept)        8.684e-03  1.663e-03   5.222 2.50e-06 ***
## scale(ppersonspo)  3.219e-04  2.212e-04   1.455    0.151    
## scale(p65plus)     2.355e-03  2.375e-04   9.916 4.25e-14 ***
## scale(pblack_1)    9.331e-05  1.627e-04   0.573    0.569    
## scale(phisp)      -3.092e-03  2.181e-03  -1.417    0.162    
## I(RUCC >= 7)TRUE   3.250e-04  3.047e-04   1.067    0.291    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.0009234 on 58 degrees of freedom
## Multiple R-squared:  0.7974, Adjusted R-squared:  0.7799 
## F-statistic: 45.65 on 5 and 58 DF,  p-value: < 2.2e-16

fit.ok<-lm(I(deaths/population)~scale(ppersonspo)+scale(p65plus)+scale(pblack_1)+scale(phisp)+I(RUCC>=7), data=spdat, subset=STATE=="40")
summary(fit.ok)

## 
## Call:
## lm(formula = I(deaths/population) ~ scale(ppersonspo) + scale(p65plus) + 
##     scale(pblack_1) + scale(phisp) + I(RUCC >= 7), data = spdat, 
##     subset = STATE == "40")
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -3.029e-03 -5.192e-04 -5.854e-05  6.775e-04  2.030e-03 
## 
## Coefficients:
##                     Estimate Std. Error t value Pr(>|t|)    
## (Intercept)        0.0116220  0.0003860  30.106  < 2e-16 ***
## scale(ppersonspo)  0.0008200  0.0001755   4.671 1.38e-05 ***
## scale(p65plus)     0.0025302  0.0001688  14.993  < 2e-16 ***
## scale(pblack_1)    0.0001879  0.0004152   0.452    0.652    
## scale(phisp)       0.0001186  0.0005281   0.225    0.823    
## I(RUCC >= 7)TRUE  -0.0002993  0.0002681  -1.116    0.268    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.0009816 on 71 degrees of freedom
## Multiple R-squared:  0.8346, Adjusted R-squared:  0.8229 
## F-statistic: 71.65 on 5 and 71 DF,  p-value: < 2.2e-16

fit.nm<-lm(I(deaths/population)~scale(ppersonspo)+scale(p65plus)+scale(pblack_1)+scale(phisp)+I(RUCC>=7), data=spdat, subset=STATE=="35")
summary(fit.nm)

## 
## Call:
## lm(formula = I(deaths/population) ~ scale(ppersonspo) + scale(p65plus) + 
##     scale(pblack_1) + scale(phisp) + I(RUCC >= 7), data = spdat, 
##     subset = STATE == "35")
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -0.0044801 -0.0007503 -0.0000290  0.0008573  0.0043913 
## 
## Coefficients:
##                     Estimate Std. Error t value Pr(>|t|)    
## (Intercept)        0.0095430  0.0021307   4.479 0.000123 ***
## scale(ppersonspo)  0.0005528  0.0002949   1.874 0.071753 .  
## scale(p65plus)     0.0015753  0.0002571   6.126 1.52e-06 ***
## scale(pblack_1)    0.0006666  0.0031883   0.209 0.835951    
## scale(phisp)       0.0001852  0.0003545   0.522 0.605598    
## I(RUCC >= 7)TRUE  -0.0005091  0.0006664  -0.764 0.451516    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.001638 on 27 degrees of freedom
## Multiple R-squared:  0.6221, Adjusted R-squared:  0.5521 
## F-statistic:  8.89 on 5 and 27 DF,  p-value: 4.382e-05

fit.tx<-lm(I(deaths/population)~scale(ppersonspo)+scale(p65plus)+scale(pblack_1)+scale(phisp)+I(RUCC>=7), data=spdat, subset=STATE=="48")
summary(fit.tx)

## 
## Call:
## lm(formula = I(deaths/population) ~ scale(ppersonspo) + scale(p65plus) + 
##     scale(pblack_1) + scale(phisp) + I(RUCC >= 7), data = spdat, 
##     subset = STATE == "48")
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -0.0042947 -0.0005506  0.0000912  0.0006799  0.0029490 
## 
## Coefficients:
##                     Estimate Std. Error t value Pr(>|t|)    
## (Intercept)        1.034e-02  1.101e-04  93.921  < 2e-16 ***
## scale(ppersonspo)  2.606e-04  1.144e-04   2.278 0.023568 *  
## scale(p65plus)     2.228e-03  8.809e-05  25.292  < 2e-16 ***
## scale(pblack_1)    4.049e-04  1.680e-04   2.410 0.016702 *  
## scale(phisp)      -4.991e-04  1.373e-04  -3.636 0.000337 ***
## I(RUCC >= 7)TRUE  -3.663e-04  1.676e-04  -2.186 0.029786 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.00111 on 246 degrees of freedom
## Multiple R-squared:  0.8453, Adjusted R-squared:  0.8422 
## F-statistic: 268.9 on 5 and 246 DF,  p-value: < 2.2e-16

Or, we could run a SAR model for each subset using the weights we created above:

#first the "global" model
efit<-lagsarlm(scale(I(deaths/population))~scale(ppersonspo)+scale(p65plus)+scale(pblack_1)+scale(phisp)+I(RUCC>=7), data=spdat, listw=wts)

Then, a regime specification as used by Brazil 2015

efit0<-lagsarlm(scale(I(deaths/population))~mig_group/(scale(ppersonspo)+scale(p65plus)+scale(pblack_1)+scale(phisp)+I(RUCC>=7)), data=spdat, listw=wts)
summary(efit0)

## 
## Call:
## lagsarlm(formula = scale(I(deaths/population)) ~ mig_group/(scale(ppersonspo) + 
##     scale(p65plus) + scale(pblack_1) + scale(phisp) + I(RUCC >= 
##     7)), data = spdat, listw = wts)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -2.2535080 -0.2175824  0.0092613  0.2420036  1.2534095 
## 
## Type: lag 
## Coefficients: (asymptotic standard errors) 
##                                           Estimate Std. Error z value
## (Intercept)                               0.070702   0.052020  1.3591
## mig_group(0.18,0.227]                     0.062960   0.069376  0.9075
## mig_group(0.227,0.504]                   -0.144586   0.069608 -2.0771
## mig_group[0.0622,0.18]:scale(ppersonspo)  0.089667   0.041622  2.1543
## mig_group(0.18,0.227]:scale(ppersonspo)   0.212126   0.057090  3.7156
## mig_group(0.227,0.504]:scale(ppersonspo)  0.161934   0.044400  3.6472
## mig_group[0.0622,0.18]:scale(p65plus)     0.613779   0.047683 12.8721
## mig_group(0.18,0.227]:scale(p65plus)      0.791811   0.047329 16.7298
## mig_group(0.227,0.504]:scale(p65plus)     0.784719   0.033044 23.7478
## mig_group[0.0622,0.18]:scale(pblack_1)    0.052671   0.042112  1.2507
## mig_group(0.18,0.227]:scale(pblack_1)    -0.042097   0.051436 -0.8184
## mig_group(0.227,0.504]:scale(pblack_1)    0.038860   0.053224  0.7301
## mig_group[0.0622,0.18]:scale(phisp)      -0.284882   0.044298 -6.4311
## mig_group(0.18,0.227]:scale(phisp)       -0.301421   0.048620 -6.1995
## mig_group(0.227,0.504]:scale(phisp)      -0.282886   0.052171 -5.4223
## mig_group[0.0622,0.18]:I(RUCC >= 7)TRUE   0.086956   0.078148  1.1127
## mig_group(0.18,0.227]:I(RUCC >= 7)TRUE   -0.121716   0.075011 -1.6226
## mig_group(0.227,0.504]:I(RUCC >= 7)TRUE  -0.154449   0.069837 -2.2116
##                                           Pr(>|z|)
## (Intercept)                              0.1741000
## mig_group(0.18,0.227]                    0.3641342
## mig_group(0.227,0.504]                   0.0377878
## mig_group[0.0622,0.18]:scale(ppersonspo) 0.0312148
## mig_group(0.18,0.227]:scale(ppersonspo)  0.0002027
## mig_group(0.227,0.504]:scale(ppersonspo) 0.0002651
## mig_group[0.0622,0.18]:scale(p65plus)    < 2.2e-16
## mig_group(0.18,0.227]:scale(p65plus)     < 2.2e-16
## mig_group(0.227,0.504]:scale(p65plus)    < 2.2e-16
## mig_group[0.0622,0.18]:scale(pblack_1)   0.2110316
## mig_group(0.18,0.227]:scale(pblack_1)    0.4131111
## mig_group(0.227,0.504]:scale(pblack_1)   0.4653192
## mig_group[0.0622,0.18]:scale(phisp)      1.267e-10
## mig_group(0.18,0.227]:scale(phisp)       5.665e-10
## mig_group(0.227,0.504]:scale(phisp)      5.883e-08
## mig_group[0.0622,0.18]:I(RUCC >= 7)TRUE  0.2658359
## mig_group(0.18,0.227]:I(RUCC >= 7)TRUE   0.1046657
## mig_group(0.227,0.504]:I(RUCC >= 7)TRUE  0.0269975
## 
## Rho: 0.12088, LR test value: 16.799, p-value: 4.156e-05
## Asymptotic standard error: 0.028979
##     z-value: 4.1712, p-value: 3.0305e-05
## Wald statistic: 17.399, p-value: 3.0305e-05
## 
## Log likelihood: -264.3111 for lag model
## ML residual variance (sigma squared): 0.16721, (sigma: 0.40891)
## Number of observations: 501 
## Number of parameters estimated: 20 
## AIC: 568.62, (AIC for lm: 583.42)
## LM test for residual autocorrelation
## test value: 3.5091, p-value: 0.061031

anova(efit, efit0)

##       Model df    AIC  logLik Test L.Ratio    p-value
## efit      1  8 595.77 -289.89    1                   
## efit0     2 20 568.62 -264.31    2   51.15 8.7649e-07

Or, I could fit the fully stratified models, one regime at a time by subsetting and using the group-specific spatial weights:

efit1<-lagsarlm(scale(I(deaths/population))~scale(ppersonspo)+scale(p65plus)+scale(pblack_1)+scale(phisp)+I(RUCC>=7), data=spdat[spdat$mig_group==lvs[1],], listw=wts1)

efit2<-lagsarlm(scale(I(deaths/population))~scale(ppersonspo)+scale(p65plus)+scale(pblack_1)+scale(phisp)+I(RUCC>=7), data=spdat[spdat$mig_group==lvs[2],], listw=wts2)

efit3<-lagsarlm(scale(I(deaths/population))~scale(ppersonspo)+scale(p65plus)+scale(pblack_1)+scale(phisp)+I(RUCC>=7), data=spdat[spdat$mig_group==lvs[3],], listw=wts3)

summary(efit1)

## 
## Call:lagsarlm(formula = scale(I(deaths/population)) ~ scale(ppersonspo) + 
##     scale(p65plus) + scale(pblack_1) + scale(phisp) + I(RUCC >= 
##     7), data = spdat[spdat$mig_group == lvs[1], ], listw = wts1)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -2.4544851 -0.2654621 -0.0015307  0.2899709  1.0866101 
## 
## Type: lag 
## Coefficients: (asymptotic standard errors) 
##                    Estimate Std. Error z value  Pr(>|z|)
## (Intercept)       -0.044804   0.054807 -0.8175   0.41365
## scale(ppersonspo)  0.122484   0.054066  2.2655   0.02348
## scale(p65plus)     0.573820   0.048549 11.8195 < 2.2e-16
## scale(pblack_1)    0.054060   0.064939  0.8325   0.40515
## scale(phisp)      -0.401658   0.072147 -5.5672 2.588e-08
## I(RUCC >= 7)TRUE   0.102547   0.091158  1.1249   0.26062
## 
## Rho: 0.15548, LR test value: 6.4493, p-value: 0.0111
## Asymptotic standard error: 0.059236
##     z-value: 2.6247, p-value: 0.0086724
## Wald statistic: 6.8891, p-value: 0.0086724
## 
## Log likelihood: -103.2209 for lag model
## ML residual variance (sigma squared): 0.22757, (sigma: 0.47704)
## Number of observations: 151 
## Number of parameters estimated: 8 
## AIC: 222.44, (AIC for lm: 226.89)
## LM test for residual autocorrelation
## test value: 2.8762, p-value: 0.089897

summary(efit2)

## 
## Call:lagsarlm(formula = scale(I(deaths/population)) ~ scale(ppersonspo) + 
##     scale(p65plus) + scale(pblack_1) + scale(phisp) + I(RUCC >= 
##     7), data = spdat[spdat$mig_group == lvs[2], ], listw = wts2)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -1.244411 -0.258925 -0.022572  0.238146  1.121171 
## 
## Type: lag 
## Coefficients: (asymptotic standard errors) 
##                    Estimate Std. Error z value  Pr(>|z|)
## (Intercept)        0.061767   0.047187  1.3090    0.1905
## scale(ppersonspo)  0.170906   0.038992  4.3831 1.170e-05
## scale(p65plus)     0.778337   0.043870 17.7418 < 2.2e-16
## scale(pblack_1)   -0.031947   0.041736 -0.7655    0.4440
## scale(phisp)      -0.315580   0.044718 -7.0571 1.701e-12
## I(RUCC >= 7)TRUE  -0.133604   0.074314 -1.7978    0.0722
## 
## Rho: 0.10064, LR test value: 3.8633, p-value: 0.049354
## Asymptotic standard error: 0.050143
##     z-value: 2.007, p-value: 0.044746
## Wald statistic: 4.0282, p-value: 0.044746
## 
## Log likelihood: -77.42665 for lag model
## ML residual variance (sigma squared): 0.16374, (sigma: 0.40464)
## Number of observations: 150 
## Number of parameters estimated: 8 
## AIC: 170.85, (AIC for lm: 172.72)
## LM test for residual autocorrelation
## test value: 0.16968, p-value: 0.6804

summary(efit3)

## 
## Call:lagsarlm(formula = scale(I(deaths/population)) ~ scale(ppersonspo) + 
##     scale(p65plus) + scale(pblack_1) + scale(phisp) + I(RUCC >= 
##     7), data = spdat[spdat$mig_group == lvs[3], ], listw = wts3)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -1.384600 -0.188351  0.037029  0.229312  1.174578 
## 
## Type: lag 
## Coefficients: (asymptotic standard errors) 
##                    Estimate Std. Error z value  Pr(>|z|)
## (Intercept)        0.048072   0.038961  1.2338   0.21726
## scale(ppersonspo)  0.143595   0.035900  3.9998 6.339e-05
## scale(p65plus)     0.856224   0.036718 23.3190 < 2.2e-16
## scale(pblack_1)    0.023191   0.034032  0.6815   0.49558
## scale(phisp)      -0.207571   0.034207 -6.0680 1.295e-09
## I(RUCC >= 7)TRUE  -0.119329   0.067403 -1.7704   0.07666
## 
## Rho: 0.015616, LR test value: 0.12585, p-value: 0.72278
## Asymptotic standard error: 0.043516
##     z-value: 0.35886, p-value: 0.7197
## Wald statistic: 0.12878, p-value: 0.7197
## 
## Log likelihood: -97.39382 for lag model
## ML residual variance (sigma squared): 0.15505, (sigma: 0.39376)
## Number of observations: 200 
## Number of parameters estimated: 8 
## AIC: 210.79, (AIC for lm: 208.91)
## LM test for residual autocorrelation
## test value: 0.80749, p-value: 0.36886

GWR Model

First proposed by Brunsdon et al 1996

Their model was specified, where now each \(\beta_k\) is estimated at the location \(u_i,v_j\) where i and j are the coordinates or geographic location of the observation i. \(\beta_k(ui,vj)\) is the local realization of \(\beta\) at location i,j. This constructs a trend surface of parameter values for each independent variable and the model intercept.
Note that the basic OLS regression model above is just a special case of the GWR model where the coefficients are constant over space. The parameters in the GWR are estimated by weighted least squares. The weighting matrix is a diagonal matrix, with each diagonal element being a function of the location of the observation. if Wi is the weighting matrix at location i, then the parameter estimate at that location would be

\(\hat \beta_i = (X'WX)^{-1} X'WY\)

The role of the weight matrix is to give more value to observations that are close to i. as it is assumed that observations that are close will influence each other more than those that are far away. This is NOT the same W matrix as was used in Moran’s I!

W is generally constructed based on the distances between locations, and if \(d_ij\) < d , then \(w_{ij}\)=1, and 0 otherwise. Brunsdson, Charlton, and Fotheringham used a kernel-density approach to measure W

\(w_{ij} = exp(- \frac{1}{2}(d_{ij}/b)^2)\)

where b is the bandwidth parameter to be estimated. under this weighting scheme, the weight for points that exactly co-occur will be 1 and for those that are the distance \(d_{ij}\) apart, the weight will decrease as a Gaussian curve as \(d_{ij}\) increases.

GWR Kernels

There are several ways to do this, but most programs use a cross-validation method to choose the optimal kernel bandwidth. The cross validation method takes the form: \(CV = \Sigma_i \left[y_i - \hat y_{\neq i} (\beta) \right ]^2\)

Where \(\hat y_{\neq i} (\beta)\) is the fitted value of \(y_i\) with the observations for point i omitted from the calibration process. This in effect minimizes the sum of squared errors at all locations i, and arrives at an optimal bandwidth.

When the model is fit, we can also calculate local z/t tests to test whether the effect of a specifix X variable is significant at a particular location. Likewise, we can calculate a local \(R^2\) measure at each location.

Leung et al 2000 proposed a series of F tests for model improvement over OLS, and they also derive an F test for variation in the beta coefficients (so called F3 test.

GWR Modeling

The advantage of GWR is that we can visualize the difference in the covariate effects as a continuous surface. There are a few steps to fitting the GWR model. First we need to select a bandwidth for the weighting algorithm

gwr.b.f<-gwr.sel(I(deaths/population)~scale(ppersonspo)+scale(p65plus)+scale(pblack_1)+scale(phisp)+I(RUCC>=7), spdat)

## Bandwidth: 837305.2 CV score: 0.0006802827 
## Bandwidth: 1353436 CV score: 0.0006843992 
## Bandwidth: 518319 CV score: 0.000671332 
## Bandwidth: 321174.7 CV score: 0.0006543891 
## Bandwidth: 199332.9 CV score: 0.0006472186 
## Bandwidth: 124030.4 CV score: 0.0006733644 
## Bandwidth: 245872.3 CV score: 0.0006478826 
## Bandwidth: 210553.7 CV score: 0.0006470031 
## Bandwidth: 215074.7 CV score: 0.0006469975 
## Bandwidth: 213356.6 CV score: 0.0006469947 
## Bandwidth: 213425.2 CV score: 0.0006469947 
## Bandwidth: 213408.1 CV score: 0.0006469947 
## Bandwidth: 213407.9 CV score: 0.0006469947 
## Bandwidth: 213407.9 CV score: 0.0006469947 
## Bandwidth: 213407.9 CV score: 0.0006469947 
## Bandwidth: 213388.3 CV score: 0.0006469947 
## Bandwidth: 213400.4 CV score: 0.0006469947 
## Bandwidth: 213405 CV score: 0.0006469947 
## Bandwidth: 213406.8 CV score: 0.0006469947 
## Bandwidth: 213407.4 CV score: 0.0006469947 
## Bandwidth: 213407.7 CV score: 0.0006469947 
## Bandwidth: 213407.8 CV score: 0.0006469947 
## Bandwidth: 213407.8 CV score: 0.0006469947 
## Bandwidth: 213407.9 CV score: 0.0006469947 
## Bandwidth: 213407.9 CV score: 0.0006469947 
## Bandwidth: 213407.9 CV score: 0.0006469947

gwr.b.f

## [1] 213407.9

#this is the distance (in meters, because our data are projected in a system measured in meters), which the weighting fuction will search, and include all observations within this radius.

gwr.f<-gwr(I(deaths/population)~scale(ppersonspo)+scale(p65plus)+scale(pblack_1)+scale(phisp)+I(RUCC>=7), spdat, bandwidth=gwr.b.f, se.fit=T, hatmatrix=T)
gwr.f

## Call:
## gwr(formula = I(deaths/population) ~ scale(ppersonspo) + scale(p65plus) + 
##     scale(pblack_1) + scale(phisp) + I(RUCC >= 7), data = spdat, 
##     bandwidth = gwr.b.f, hatmatrix = T, se.fit = T)
## Kernel function: gwr.Gauss 
## Fixed bandwidth: 213407.9 
## Summary of GWR coefficient estimates at data points:
##                         Min.    1st Qu.     Median    3rd Qu.       Max.
## X.Intercept.       9.360e-03  1.027e-02  1.045e-02  1.080e-02  1.125e-02
## scale.ppersonspo. -2.531e-04  4.700e-04  6.296e-04  7.949e-04  9.485e-04
## scale.p65plus.     1.410e-03  2.060e-03  2.136e-03  2.220e-03  2.367e-03
## scale.pblack_1.   -1.134e-04  4.689e-05  1.157e-04  2.391e-04  2.934e-03
## scale.phisp.      -2.182e-03 -1.048e-03 -9.715e-04 -7.932e-04  3.420e-04
## I.RUCC....7.TRUE  -1.096e-03 -3.638e-04 -2.153e-04  5.367e-05  4.777e-04
##                    Global
## X.Intercept.       0.0105
## scale.ppersonspo.  0.0006
## scale.p65plus.     0.0021
## scale.pblack_1.    0.0001
## scale.phisp.      -0.0009
## I.RUCC....7.TRUE  -0.0002
## Number of data points: 501 
## Effective number of parameters (residual: 2traceS - traceS'S): 46.7347 
## Effective degrees of freedom (residual: 2traceS - traceS'S): 454.2653 
## Sigma (residual: 2traceS - traceS'S): 0.001059183 
## Effective number of parameters (model: traceS): 33.82929 
## Effective degrees of freedom (model: traceS): 467.1707 
## Sigma (model: traceS): 0.001044451 
## Sigma (ML): 0.001008572 
## AICc (GWR p. 61, eq 2.33; p. 96, eq. 4.21): -5416.218 
## AIC (GWR p. 96, eq. 4.22): -5457.412 
## Residual sum of squares: 0.0005096261 
## Quasi-global R2: 0.8524083

BFC02.gwr.test(gwr.f)

## 
##  Brunsdon, Fotheringham & Charlton (2002, pp. 91-2) ANOVA
## 
## data:  gwr.f
## F = 1.3055, df1 = 495.00, df2 = 454.27, p-value = 0.001934
## alternative hypothesis: greater
## sample estimates:
## SS OLS residuals SS GWR residuals 
##     0.0006653254     0.0005096261

BFC99.gwr.test(gwr.f)

## 
##  Brunsdon, Fotheringham & Charlton (1999) ANOVA
## 
## data:  gwr.f
## F = 3.4071, df1 = 256.14, df2 = 470.49, p-value < 2.2e-16
## alternative hypothesis: greater
## sample estimates:
## SS GWR improvement   SS GWR residuals 
##       0.0001556992       0.0005096261

LMZ.F1GWR.test(gwr.f)

## 
##  Leung et al. (2000) F(1) test
## 
## data:  gwr.f
## F = 0.83467, df1 = 470.49, df2 = 495.00, p-value = 0.02385
## alternative hypothesis: less
## sample estimates:
## SS OLS residuals SS GWR residuals 
##     0.0006653254     0.0005096261

LMZ.F2GWR.test(gwr.f)

## 
##  Leung et al. (2000) F(2) test
## 
## data:  gwr.f
## F = 2.8438, df1 = 66.187, df2 = 495.000, p-value = 5.23e-11
## alternative hypothesis: greater
## sample estimates:
##   SS OLS residuals SS GWR improvement 
##       0.0006653254       0.0001556992

LMZ.F3GWR.test(gwr.f)

## 
## Leung et al. (2000) F(3) test
## 
##                   F statistic Numerator d.f. Denominator d.f.     Pr(>)
## (Intercept)            2.9132        54.3122           470.49 5.168e-10
## scale(ppersonspo)      3.7056       143.4838           470.49 < 2.2e-16
## scale(p65plus)         2.1047       106.2542           470.49 5.926e-08
## scale(pblack_1)        2.8647        42.1765           470.49 3.026e-08
## scale(phisp)           2.4810        46.2870           470.49 9.023e-07
## I(RUCC >= 7)TRUE       2.3018       205.7501           470.49 8.846e-14
##                      
## (Intercept)       ***
## scale(ppersonspo) ***
## scale(p65plus)    ***
## scale(pblack_1)   ***
## scale(phisp)      ***
## I(RUCC >= 7)TRUE  ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

gwr.dat<-gwr.f$SDF
cols<-brewer.pal(n=6, name="RdBu")
names(gwr.dat)

##  [1] "sum.w"                    "X.Intercept."            
##  [3] "scale.ppersonspo."        "scale.p65plus."          
##  [5] "scale.pblack_1."          "scale.phisp."            
##  [7] "I.RUCC....7.TRUE"         "X.Intercept._se"         
##  [9] "scale.ppersonspo._se"     "scale.p65plus._se"       
## [11] "scale.pblack_1._se"       "scale.phisp._se"         
## [13] "I.RUCC....7.TRUE_se"      "gwr.e"                   
## [15] "pred"                     "pred.se"                 
## [17] "localR2"                  "X.Intercept._se_EDF"     
## [19] "scale.ppersonspo._se_EDF" "scale.p65plus._se_EDF"   
## [21] "scale.pblack_1._se_EDF"   "scale.phisp._se_EDF"     
## [23] "I.RUCC....7.TRUE_se_EDF"  "pred.se_EDF"

#pdf("gwrfixed.pdf")
spplot(gwr.dat, "scale.ppersonspo.", at=quantile(gwr.f$SDF$scale.ppersonspo.), col.regions=cols, main="% Poverty effect")

spplot(gwr.dat, "scale.pblack_1.", at=quantile(gwr.f$SDF$scale.pblack_1.), col.regions=cols, main="% Black effect")

spplot(gwr.dat, "scale.phisp.", at=quantile(gwr.f$SDF$scale.phisp.), col.regions=cols, main="% Hispanic effect")

#another "lighter" plot method
#brks<-c(-3.038e-04,  3.441e-04,  5.882e-04,  7.912e-04 , 1.355e-03); cols<-brewer.pal(n=5,"Reds")
#plot(gwr.a$SDF, col=cols[findInterval(gwr.a$SDF$scale.ppersonspo.,brks, all.inside=TRUE)])

#Local R^2
brks<-seq(0, 1, length.out = 9); cols=brewer.pal(n=9, "Reds")
spplot(gwr.f$SDF, "localR2", at=brks, col.regions=cols, main="local R^2")

#plots of the correlations of the coefficients
pairs(gwr.f$SDF[,3:7])

cor(as(gwr.f$SDF[,3:7], "data.frame"))

##                   scale.ppersonspo. scale.p65plus. scale.pblack_1.
## scale.ppersonspo.       1.000000000   -0.001474733      -0.3823172
## scale.p65plus.         -0.001474733    1.000000000      -0.6196133
## scale.pblack_1.        -0.382317219   -0.619613301       1.0000000
## scale.phisp.           -0.650938360   -0.193689549       0.6597657
## I.RUCC....7.TRUE       -0.286463792    0.134305334      -0.2545913
##                   scale.phisp. I.RUCC....7.TRUE
## scale.ppersonspo.   -0.6509384       -0.2864638
## scale.p65plus.      -0.1936895        0.1343053
## scale.pblack_1.      0.6597657       -0.2545913
## scale.phisp.         1.0000000       -0.2432299
## I.RUCC....7.TRUE    -0.2432299        1.0000000

This plot shows a problem in GWR, that the coefficients can often be highly correlated See Wheeler and Tiefelsdorf (2005), one of the major critiques of GWR. In general, the critiques of GWR have pushed most users to use it for “exploratory” purposes only. So, one way we may explore our model is to examine where the regression effects are “significant” versus not. This can be done by plotting the local t-statistics, or z-statistics if you prefer.

cols<-brewer.pal(8,"RdBu")
dfree<-gwr.f$results$edf

names(gwr.dat)

##  [1] "sum.w"                    "X.Intercept."            
##  [3] "scale.ppersonspo."        "scale.p65plus."          
##  [5] "scale.pblack_1."          "scale.phisp."            
##  [7] "I.RUCC....7.TRUE"         "X.Intercept._se"         
##  [9] "scale.ppersonspo._se"     "scale.p65plus._se"       
## [11] "scale.pblack_1._se"       "scale.phisp._se"         
## [13] "I.RUCC....7.TRUE_se"      "gwr.e"                   
## [15] "pred"                     "pred.se"                 
## [17] "localR2"                  "X.Intercept._se_EDF"     
## [19] "scale.ppersonspo._se_EDF" "scale.p65plus._se_EDF"   
## [21] "scale.pblack_1._se_EDF"   "scale.phisp._se_EDF"     
## [23] "I.RUCC....7.TRUE_se_EDF"  "pred.se_EDF"

gwr.dat$pov.t<-gwr.dat$scale.ppersonspo./gwr.dat$scale.ppersonspo._se
gwr.dat$pov.t.p<-2*pt(-abs(gwr.dat$pov.t)  , dfree)
tbrks<- c(min(gwr.dat$pov.t),qt(c( .975, .9, .1, .025),df=dfree, lower.tail = F),max(gwr.dat$pov.t))
spplot(gwr.dat, "pov.t", col.regions=cols, at=tbrks , main="t-stat %Poverty")

gwr.dat$black.t<-gwr.dat$scale.pblack_1./gwr.dat$scale.pblack_1._se
gwr.dat$black.t.p<-2*pt(-abs(gwr.dat$black.t)  , dfree)
tbrks<- c(min(gwr.dat$black.t),qt(c( .975, .9, .1, .025),df=dfree, lower.tail = F),max(gwr.dat$black.t))
spplot(gwr.dat, "black.t", col.regions=cols,at=tbrks , main="t-stat %Black")

gwr.dat$hisp.t<-gwr.dat$scale.phisp./gwr.dat$scale.phisp._se
gwr.dat$hisp.t.p<-2*pt(-abs(gwr.dat$hisp.t)  , dfree)
tbrks<- c(min(gwr.dat$hisp.t),qt(c( .975, .9, .1, .025),df=dfree, lower.tail = F),max(gwr.dat$hisp.t))
spplot(gwr.dat, "hisp.t", col.regions=cols, at=tbrks, main="t-stat %Hispanic")

So, those are the results from the fixed bandwith analysis. But there are obviously places in our data where counties are more densely occurring, so an adaptive kernel approach is also warranted. If all our data were equally spaced, then this may not be necessary.

Now we try everthing with the adaptive bandwidth:

gwr.b.a<-gwr.sel(I(deaths/population)~scale(ppersonspo)+scale(p65plus)+scale(pblack_1)+scale(phisp)+I(RUCC>=7), spdat, adapt=T)

## Adaptive q: 0.381966 CV score: 0.0006708917 
## Adaptive q: 0.618034 CV score: 0.0006759116 
## Adaptive q: 0.236068 CV score: 0.0006662507 
## Adaptive q: 0.145898 CV score: 0.0006610212 
## Adaptive q: 0.09016994 CV score: 0.0006542459 
## Adaptive q: 0.05572809 CV score: 0.0006442366 
## Adaptive q: 0.03444185 CV score: 0.0006379112 
## Adaptive q: 0.02128624 CV score: 0.0006543154 
## Adaptive q: 0.04257247 CV score: 0.000642091 
## Adaptive q: 0.02941686 CV score: 0.0006457471 
## Adaptive q: 0.03754747 CV score: 0.0006396459 
## Adaptive q: 0.03252248 CV score: 0.0006416178 
## Adaptive q: 0.03543098 CV score: 0.0006381852 
## Adaptive q: 0.03389516 CV score: 0.0006379173 
## Adaptive q: 0.03419852 CV score: 0.0006378657 
## Adaptive q: 0.03415783 CV score: 0.0006378591 
## Adaptive q: 0.0340575 CV score: 0.0006378438 
## Adaptive q: 0.03399549 CV score: 0.0006378352 
## Adaptive q: 0.0339548 CV score: 0.00063783 
## Adaptive q: 0.0339548 CV score: 0.00063783

gwr.b.a

## [1] 0.0339548

This is the proportion of all cases which the weighting fuction will search, and include this fraction of observations in a model for each county. In this case it will include 17.0113545 counties in the model for a particular county.

gwr.a<-gwr(I(deaths/population)~scale(ppersonspo)+scale(p65plus)+scale(pblack_1)+scale(phisp)+I(RUCC>=7), spdat, adapt=gwr.b.a, se.fit=T, hatmatrix=T )
gwr.a

## Call:
## gwr(formula = I(deaths/population) ~ scale(ppersonspo) + scale(p65plus) + 
##     scale(pblack_1) + scale(phisp) + I(RUCC >= 7), data = spdat, 
##     adapt = gwr.b.a, hatmatrix = T, se.fit = T)
## Kernel function: gwr.Gauss 
## Adaptive quantile: 0.0339548 (about 17 of 501 data points)
## Summary of GWR coefficient estimates at data points:
##                         Min.    1st Qu.     Median    3rd Qu.       Max.
## X.Intercept.       7.577e-03  9.934e-03  1.049e-02  1.089e-02  1.191e-02
## scale.ppersonspo. -3.038e-04  3.441e-04  5.882e-04  7.912e-04  1.355e-03
## scale.p65plus.     1.403e-03  1.998e-03  2.190e-03  2.333e-03  2.790e-03
## scale.pblack_1.   -3.075e-04  6.612e-05  2.370e-04  4.524e-04  2.271e-03
## scale.phisp.      -4.785e-03 -1.187e-03 -9.467e-04 -5.321e-04  7.882e-04
## I.RUCC....7.TRUE  -1.135e-03 -4.824e-04 -2.081e-04  1.074e-04  5.971e-04
##                    Global
## X.Intercept.       0.0105
## scale.ppersonspo.  0.0006
## scale.p65plus.     0.0021
## scale.pblack_1.    0.0001
## scale.phisp.      -0.0009
## I.RUCC....7.TRUE  -0.0002
## Number of data points: 501 
## Effective number of parameters (residual: 2traceS - traceS'S): 101.2227 
## Effective degrees of freedom (residual: 2traceS - traceS'S): 399.7773 
## Sigma (residual: 2traceS - traceS'S): 0.00104379 
## Effective number of parameters (model: traceS): 72.72109 
## Effective degrees of freedom (model: traceS): 428.2789 
## Sigma (model: traceS): 0.001008461 
## Sigma (ML): 0.0009324024 
## AICc (GWR p. 61, eq 2.33; p. 96, eq. 4.21): -5396.638 
## AIC (GWR p. 96, eq. 4.22): -5497.204 
## Residual sum of squares: 0.0004355565 
## Quasi-global R2: 0.8738594

BFC02.gwr.test(gwr.a)

## 
##  Brunsdon, Fotheringham & Charlton (2002, pp. 91-2) ANOVA
## 
## data:  gwr.a
## F = 1.5275, df1 = 495.00, df2 = 399.78, p-value = 5.41e-06
## alternative hypothesis: greater
## sample estimates:
## SS OLS residuals SS GWR residuals 
##     0.0006653254     0.0004355565

BFC99.gwr.test(gwr.a)

## 
##  Brunsdon, Fotheringham & Charlton (1999) ANOVA
## 
## data:  gwr.a
## F = 2.2147, df1 = 394.70, df2 = 437.65, p-value = 4.244e-16
## alternative hypothesis: greater
## sample estimates:
## SS GWR improvement   SS GWR residuals 
##       0.0002297688       0.0004355565

LMZ.F1GWR.test(gwr.a)

## 
##  Leung et al. (2000) F(1) test
## 
## data:  gwr.a
## F = 0.81058, df1 = 437.65, df2 = 495.00, p-value = 0.01213
## alternative hypothesis: less
## sample estimates:
## SS OLS residuals SS GWR residuals 
##     0.0006653254     0.0004355565

LMZ.F2GWR.test(gwr.a)

## 
##  Leung et al. (2000) F(2) test
## 
## data:  gwr.a
## F = 1.7952, df1 = 149.55, df2 = 495.00, p-value = 1.501e-06
## alternative hypothesis: greater
## sample estimates:
##   SS OLS residuals SS GWR improvement 
##       0.0006653254       0.0002297688

LMZ.F3GWR.test(gwr.a)

## 
## Leung et al. (2000) F(3) test
## 
##                   F statistic Numerator d.f. Denominator d.f.     Pr(>)
## (Intercept)            1.4040        16.7158           437.65  0.131199
## scale(ppersonspo)      1.9580       141.0856           437.65 1.094e-07
## scale(p65plus)         1.7019       124.0449           437.65 4.939e-05
## scale(pblack_1)        1.5254        60.3445           437.65  0.009694
## scale(phisp)           1.0542        18.9957           437.65  0.396847
## I(RUCC >= 7)TRUE       1.4327       177.0345           437.65  0.001656
##                      
## (Intercept)          
## scale(ppersonspo) ***
## scale(p65plus)    ***
## scale(pblack_1)   ** 
## scale(phisp)         
## I(RUCC >= 7)TRUE  ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

gwr.data<-gwr.a$SDF
spplot(gwr.data, "scale.ppersonspo.", at=quantile(gwr.a$SDF$scale.ppersonspo.), col.regions=cols, main="% Poverty effect")

spplot(gwr.data, "scale.pblack_1.", at=quantile(gwr.a$SDF$scale.pblack_1.), col.regions=cols, main="% Black effect")

spplot(gwr.data, "scale.phisp.", at=quantile(gwr.a$SDF$scale.phisp.), col.regions=cols, main="% Hispanic effect")

cols<-brewer.pal(8,"RdBu")
dfree<-gwr.a$results$edf

names(gwr.data)

##  [1] "sum.w"                    "X.Intercept."            
##  [3] "scale.ppersonspo."        "scale.p65plus."          
##  [5] "scale.pblack_1."          "scale.phisp."            
##  [7] "I.RUCC....7.TRUE"         "X.Intercept._se"         
##  [9] "scale.ppersonspo._se"     "scale.p65plus._se"       
## [11] "scale.pblack_1._se"       "scale.phisp._se"         
## [13] "I.RUCC....7.TRUE_se"      "gwr.e"                   
## [15] "pred"                     "pred.se"                 
## [17] "localR2"                  "X.Intercept._se_EDF"     
## [19] "scale.ppersonspo._se_EDF" "scale.p65plus._se_EDF"   
## [21] "scale.pblack_1._se_EDF"   "scale.phisp._se_EDF"     
## [23] "I.RUCC....7.TRUE_se_EDF"  "pred.se_EDF"

gwr.data$pov.t<-gwr.data$scale.ppersonspo./gwr.data$scale.ppersonspo._se
gwr.data$pov.t.p<-2*pt(-abs(gwr.data$pov.t)  , dfree)
tbrks<- c(min(gwr.data$pov.t),qt(c( .975, .9, .1, .025),df=dfree, lower.tail = F),max(gwr.data$pov.t))
spplot(gwr.data, "pov.t", col.regions=cols, at=tbrks , main="t-stat %Poverty")

gwr.data$black.t<-gwr.data$scale.pblack_1./gwr.data$scale.pblack_1._se
gwr.data$black.t.p<-2*pt(-abs(gwr.data$black.t)  , dfree)
tbrks<- c(min(gwr.data$black.t),qt(c( .975, .9, .1, .025),df=dfree, lower.tail = F),max(gwr.data$black.t))
spplot(gwr.data, "black.t", col.regions=cols,at=tbrks , main="t-stat %Black")

gwr.data$hisp.t<-gwr.data$scale.phisp./gwr.data$scale.phisp._se
gwr.data$hisp.t.p<-2*pt(-abs(gwr.data$hisp.t)  , dfree)
tbrks<- c(min(gwr.data$hisp.t),qt(c( .975, .9, .1, .025),df=dfree, lower.tail = F),max(gwr.data$hisp.t))
spplot(gwr.data, "hisp.t", col.regions=cols, at=tbrks, main="t-stat %Hispanic")

#Local R^2
brks<-seq(0, 1, length.out = 9); cols=brewer.pal(n=9, "Reds")
spplot(gwr.a$SDF, "localR2", at=brks, col.regions=cols, main="local R^2")

Alternative Construction of spatial regimes - Clustering on GWR coefficients

The first step in the construction of spatial regimes is to do a hierarchical clustering method. We use a distance matrix between all observations for this In this case, we’re using the GWR coefficients as our data.

dat.dist<-dist(gwr.data@data[, 3:7])

#The hclust() function does basic hierarchical clustering
#according to several different methods, we'll use Ward's method
clust.dat<-hclust(dat.dist, method="ward.D")

#And we'll plot the dendrogram, or tree plot
plot(clust.dat)

#I only want a few groups, so I'll cut the tree so I get 5 clusters
gwr.data$clus<-cutree(clust.dat, k=4)


#i'll use table() to get the frequencies of each cluster
table(gwr.data$clus)

## 
##   1   2   3   4 
##  85 269  90  57

#to get it to plot right, we have to convert the cluster number
#to a factor variable
gwr.data$b.cf<-as.factor(gwr.data$clus)
spplot(gwr.data,"b.cf", col.regions=brewer.pal(4, "Accent"), main="GWR Spatial Regimes")

#We can get the means of each coefficient for each of the regime areas
b.means<-aggregate(gwr.data@data[, 3:5], by=list(gwr.data$clus), mean)
b.means

##   Group.1 scale.ppersonspo. scale.p65plus. scale.pblack_1.
## 1       1      0.0005680811    0.002101334    0.0002283936
## 2       2      0.0006852921    0.002200900    0.0001550988
## 3       3      0.0002219154    0.002300691    0.0004603948
## 4       4      0.0005660539    0.001853176    0.0013099481

#The main idea behind examining spatial regimes is that we are looking for areas where the model may be working differently, we've done this, now we will
#fit separate ols models to each regime
spdat$cluster<-gwr.data$clus
lm1<-lm(I(deaths/population)~scale(ppersonspo)+scale(p65plus)+scale(pblack_1)+scale(phisp)+I(RUCC>=7),spdat, subset=spdat$cluster==1)
lm2<-lm(I(deaths/population)~scale(ppersonspo)+scale(p65plus)+scale(pblack_1)+scale(phisp)+I(RUCC>=7),spdat, subset=spdat$cluster==2)
lm3<-lm(I(deaths/population)~scale(ppersonspo)+scale(p65plus)+scale(pblack_1)+scale(phisp)+I(RUCC>=7),spdat, subset=spdat$cluster==3)
lm4<-lm(I(deaths/population)~scale(ppersonspo)+scale(p65plus)+scale(pblack_1)+scale(phisp)+I(RUCC>=7),spdat, subset=spdat$cluster==4)

#examine each of the spatial regimes
summary(lm1)

## 
## Call:
## lm(formula = I(deaths/population) ~ scale(ppersonspo) + scale(p65plus) + 
##     scale(pblack_1) + scale(phisp) + I(RUCC >= 7), data = spdat, 
##     subset = spdat$cluster == 1)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -1.878e-03 -6.063e-04 -4.957e-05  3.881e-04  2.643e-03 
## 
## Coefficients:
##                     Estimate Std. Error t value Pr(>|t|)    
## (Intercept)        8.251e-03  9.556e-04   8.634 5.04e-13 ***
## scale(ppersonspo)  5.763e-04  1.537e-04   3.749 0.000336 ***
## scale(p65plus)     2.069e-03  1.489e-04  13.896  < 2e-16 ***
## scale(pblack_1)    1.421e-04  1.188e-04   1.196 0.235173    
## scale(phisp)      -3.630e-03  1.288e-03  -2.819 0.006084 ** 
## I(RUCC >= 7)TRUE   7.228e-05  2.263e-04   0.319 0.750227    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.0008858 on 79 degrees of freedom
## Multiple R-squared:  0.8255, Adjusted R-squared:  0.8145 
## F-statistic: 74.75 on 5 and 79 DF,  p-value: < 2.2e-16

summary(lm2)

## 
## Call:
## lm(formula = I(deaths/population) ~ scale(ppersonspo) + scale(p65plus) + 
##     scale(pblack_1) + scale(phisp) + I(RUCC >= 7), data = spdat, 
##     subset = spdat$cluster == 2)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -0.0043029 -0.0005601  0.0000598  0.0006509  0.0029322 
## 
## Coefficients:
##                     Estimate Std. Error t value Pr(>|t|)    
## (Intercept)        1.041e-02  1.040e-04 100.131  < 2e-16 ***
## scale(ppersonspo)  7.790e-04  1.087e-04   7.166 7.79e-12 ***
## scale(p65plus)     2.075e-03  8.251e-05  25.144  < 2e-16 ***
## scale(pblack_1)   -1.279e-04  1.052e-04  -1.216    0.225    
## scale(phisp)      -1.287e-03  1.277e-04 -10.082  < 2e-16 ***
## I(RUCC >= 7)TRUE  -2.334e-04  1.620e-04  -1.441    0.151    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.001093 on 263 degrees of freedom
## Multiple R-squared:  0.8347, Adjusted R-squared:  0.8315 
## F-statistic: 265.5 on 5 and 263 DF,  p-value: < 2.2e-16

summary(lm3)

## 
## Call:
## lm(formula = I(deaths/population) ~ scale(ppersonspo) + scale(p65plus) + 
##     scale(pblack_1) + scale(phisp) + I(RUCC >= 7), data = spdat, 
##     subset = spdat$cluster == 3)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -0.0041216 -0.0005345  0.0000171  0.0006476  0.0022513 
## 
## Coefficients:
##                     Estimate Std. Error t value Pr(>|t|)    
## (Intercept)        1.025e-02  1.868e-04  54.847   <2e-16 ***
## scale(ppersonspo)  2.915e-06  1.564e-04   0.019    0.985    
## scale(p65plus)     2.285e-03  1.562e-04  14.629   <2e-16 ***
## scale(pblack_1)    2.999e-04  2.097e-04   1.430    0.156    
## scale(phisp)      -2.728e-04  1.970e-04  -1.385    0.170    
## I(RUCC >= 7)TRUE  -3.323e-05  2.976e-04  -0.112    0.911    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.001087 on 84 degrees of freedom
## Multiple R-squared:  0.7977, Adjusted R-squared:  0.7857 
## F-statistic: 66.24 on 5 and 84 DF,  p-value: < 2.2e-16

summary(lm4)

## 
## Call:
## lm(formula = I(deaths/population) ~ scale(ppersonspo) + scale(p65plus) + 
##     scale(pblack_1) + scale(phisp) + I(RUCC >= 7), data = spdat, 
##     subset = spdat$cluster == 4)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -0.0051635 -0.0008085 -0.0000394  0.0007192  0.0035645 
## 
## Coefficients:
##                     Estimate Std. Error t value Pr(>|t|)    
## (Intercept)        0.0108274  0.0010132  10.687 1.28e-14 ***
## scale(ppersonspo)  0.0002726  0.0002269   1.201    0.235    
## scale(p65plus)     0.0017292  0.0002076   8.330 4.45e-11 ***
## scale(pblack_1)    0.0014645  0.0014865   0.985    0.329    
## scale(phisp)      -0.0004185  0.0002522  -1.659    0.103    
## I(RUCC >= 7)TRUE  -0.0003595  0.0004359  -0.825    0.413    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.001537 on 51 degrees of freedom
## Multiple R-squared:  0.6137, Adjusted R-squared:  0.5758 
## F-statistic:  16.2 on 5 and 51 DF,  p-value: 1.519e-09

Or, we could cluster on the original attributes in the data to find similar places based on their demographic variables:

dem.dist<-dist(spdat@data[, c("ppersonspo", "p65plus", "pblack_1", "phisp", "RUCC")])

#The hclust() function does basic hierarchical clustering
#according to several different methods, we'll use Ward's method
clust.dem<-hclust(dem.dist, method="ward.D")

#And we'll plot the dendrogram, or tree plot
plot(clust.dem)

#I only want a few groups, so I'll cut the tree so I get 4 clusters
spdat$clus<-cutree(clust.dem, k=4)


#i'll use table() to get the frequencies of each cluster
table(spdat$clus)

## 
##   1   2   3   4 
## 116 113  97 175

#to get it to plot right, we have to convert the cluster number
#to a factor variable
spdat$d.cf<-as.factor(spdat$clus)
spplot(spdat,"d.cf", col.regions=brewer.pal(4, "Accent"), main="Demographic Spatial Regimes")

#We can get the means of each coefficient for each of the regime areas
b.means<-aggregate(spdat@data[, c("ppersonspo", "p65plus", "pblack_1", "phisp", "RUCC")], by=list(spdat$d.cf), mean)
b.means

##   Group.1 ppersonspo   p65plus   pblack_1     phisp     RUCC
## 1       1  0.2035794 0.1623899 0.10176707 0.2068053 7.000000
## 2       2  0.1469202 0.1106256 0.13197354 0.1586634 1.902655
## 3       3  0.1785309 0.1793988 0.06183557 0.1690796 8.577320
## 4       4  0.1896993 0.1480544 0.10835914 0.1856226 5.685714

#fit separate ols models to each regime

lm1<-lm(I(deaths/population)~scale(ppersonspo)+scale(p65plus)+scale(pblack_1)+scale(phisp)+I(RUCC>=7),spdat, subset=spdat$d.cf==1)
lm2<-lm(I(deaths/population)~scale(ppersonspo)+scale(p65plus)+scale(pblack_1)+scale(phisp)+I(RUCC>=7),spdat, subset=spdat$d.cf==2)
lm3<-lm(I(deaths/population)~scale(ppersonspo)+scale(p65plus)+scale(pblack_1)+scale(phisp)+I(RUCC>=7),spdat, subset=spdat$d.cf==3)
lm4<-lm(I(deaths/population)~scale(ppersonspo)+scale(p65plus)+scale(pblack_1)+scale(phisp)+I(RUCC>=7),spdat, subset=spdat$d.cf==4)

#examine each of the spatial regimes
summary(lm1)

## 
## Call:
## lm(formula = I(deaths/population) ~ scale(ppersonspo) + scale(p65plus) + 
##     scale(pblack_1) + scale(phisp) + I(RUCC >= 7), data = spdat, 
##     subset = spdat$d.cf == 1)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -0.0038075 -0.0005097  0.0000522  0.0005946  0.0030558 
## 
## Coefficients: (1 not defined because of singularities)
##                     Estimate Std. Error t value Pr(>|t|)    
## (Intercept)        1.060e-02  1.159e-04  91.449  < 2e-16 ***
## scale(ppersonspo)  6.775e-04  1.349e-04   5.022 1.97e-06 ***
## scale(p65plus)     1.771e-03  1.254e-04  14.128  < 2e-16 ***
## scale(pblack_1)    8.215e-05  1.357e-04   0.605    0.546    
## scale(phisp)      -1.040e-03  1.373e-04  -7.577 1.14e-11 ***
## I(RUCC >= 7)TRUE          NA         NA      NA       NA    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.001051 on 111 degrees of freedom
## Multiple R-squared:  0.8114, Adjusted R-squared:  0.8046 
## F-statistic: 119.4 on 4 and 111 DF,  p-value: < 2.2e-16

summary(lm2)

## 
## Call:
## lm(formula = I(deaths/population) ~ scale(ppersonspo) + scale(p65plus) + 
##     scale(pblack_1) + scale(phisp) + I(RUCC >= 7), data = spdat, 
##     subset = spdat$d.cf == 2)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -2.947e-03 -3.949e-04  4.587e-05  4.450e-04  1.939e-03 
## 
## Coefficients: (1 not defined because of singularities)
##                     Estimate Std. Error t value Pr(>|t|)    
## (Intercept)        0.0104293  0.0001434  72.752  < 2e-16 ***
## scale(ppersonspo)  0.0004382  0.0001168   3.752 0.000284 ***
## scale(p65plus)     0.0023816  0.0001358  17.537  < 2e-16 ***
## scale(pblack_1)    0.0001386  0.0001076   1.288 0.200504    
## scale(phisp)      -0.0007527  0.0001222  -6.161 1.27e-08 ***
## I(RUCC >= 7)TRUE          NA         NA      NA       NA    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.0007792 on 108 degrees of freedom
## Multiple R-squared:  0.8427, Adjusted R-squared:  0.8368 
## F-statistic: 144.6 on 4 and 108 DF,  p-value: < 2.2e-16

summary(lm3)

## 
## Call:
## lm(formula = I(deaths/population) ~ scale(ppersonspo) + scale(p65plus) + 
##     scale(pblack_1) + scale(phisp) + I(RUCC >= 7), data = spdat, 
##     subset = spdat$d.cf == 3)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -0.0056808 -0.0009007  0.0001499  0.0009163  0.0033947 
## 
## Coefficients: (1 not defined because of singularities)
##                     Estimate Std. Error t value Pr(>|t|)    
## (Intercept)        0.0101926  0.0002077  49.078  < 2e-16 ***
## scale(ppersonspo)  0.0004024  0.0002402   1.675   0.0973 .  
## scale(p65plus)     0.0019056  0.0001730  11.012  < 2e-16 ***
## scale(pblack_1)    0.0002083  0.0002587   0.805   0.4228    
## scale(phisp)      -0.0010208  0.0002335  -4.371 3.24e-05 ***
## I(RUCC >= 7)TRUE          NA         NA      NA       NA    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.00163 on 92 degrees of freedom
## Multiple R-squared:  0.6535, Adjusted R-squared:  0.6384 
## F-statistic: 43.38 on 4 and 92 DF,  p-value: < 2.2e-16

summary(lm4)

## 
## Call:
## lm(formula = I(deaths/population) ~ scale(ppersonspo) + scale(p65plus) + 
##     scale(pblack_1) + scale(phisp) + I(RUCC >= 7), data = spdat, 
##     subset = spdat$d.cf == 4)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -0.0040428 -0.0004612 -0.0000613  0.0006240  0.0023597 
## 
## Coefficients: (1 not defined because of singularities)
##                     Estimate Std. Error t value Pr(>|t|)    
## (Intercept)        1.066e-02  7.596e-05 140.344  < 2e-16 ***
## scale(ppersonspo)  2.544e-04  1.106e-04   2.301   0.0226 *  
## scale(p65plus)     2.290e-03  1.081e-04  21.190  < 2e-16 ***
## scale(pblack_1)    6.792e-05  1.041e-04   0.652   0.5150    
## scale(phisp)      -6.464e-04  1.032e-04  -6.261 3.02e-09 ***
## I(RUCC >= 7)TRUE          NA         NA      NA       NA    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.000991 on 170 degrees of freedom
## Multiple R-squared:  0.7996, Adjusted R-squared:  0.7949 
## F-statistic: 169.6 on 4 and 170 DF,  p-value: < 2.2e-16

Spatial Regimes and Geographically Weighted Regression in R

Corey S. Sparks, Ph.D.

October 26, 2015

Geographically Weighted Regression

Approaches to non-stationarity

GWR Model

GWR Modeling

Alternative Construction of spatial regimes - Clustering on GWR coefficients