Chris Brunsdon
UQ July 2018
You will find Chapters 5 and 8 of An Introduction to R for Spatial Analysis and Mapping by Chris Brunsdon and Lex Comber useful.
This session uses the georgia dataset and the GISTools and GWmodel packages. Once you are sure all the packages are installed, load them into the current session:
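A minimal sketch of that loading step, assuming only the two packages named above. The check for missing packages is an addition for illustration and is not part of the original notes.

```r
# Packages named in the text.
pkgs <- c("GISTools", "GWmodel")

# Find any that are not installed yet (install.packages() would fix this).
not_installed <- pkgs[!vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)]
if (length(not_installed) > 0) {
  message("Not installed: ", paste(not_installed, collapse = ", "))
}

# Attach the ones that are available to the current session.
for (p in setdiff(pkgs, not_installed)) {
  library(p, character.only = TRUE)
}
```

Calling library(GISTools) and library(GWmodel) directly does the same thing when both packages are known to be installed.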
The data set contains a number of variables for the counties in Georgia from the 1990 census, including the percentage of the population in each county that:
- is rural (PctRural)
- has a bachelor's degree (PctBach)
- is elderly (PctEld)
- is foreign born (PctFB)
- is classed as being in poverty (PctPov)
- is black (PctBlack)

and the median income of the county (MedInc), in 1000s of dollars.
MedInc | PctRural | PctBach | PctEld | PctFB | PctPov | PctBlack |
---|---|---|---|---|---|---|
32.152 | 75.6 | 8.2 | 11.43 | 0.64 | 19.9 | 20.76 |
27.657 | 100.0 | 6.4 | 11.77 | 1.58 | 26.0 | 26.86 |
29.342 | 61.7 | 6.6 | 11.11 | 0.27 | 24.1 | 15.42 |
29.610 | 100.0 | 9.4 | 13.17 | 0.11 | 24.8 | 51.67 |
36.414 | 42.7 | 13.3 | 8.64 | 1.43 | 17.5 | 42.39 |
41.783 | 100.0 | 6.4 | 11.37 | 0.34 | 15.1 | 3.49 |
Visually, there appears to be some collinearity between PctPov, PctBlack and PctEld.
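A numerical check can complement the visual impression. The sketch below computes pairwise correlations for the three variables using only the six rows printed in the table above, typed in by hand, so the values are illustrative and may differ from those for the full set of counties.

```r
# The six rows shown in the table above (three columns only).
six <- data.frame(
  PctEld   = c(11.43, 11.77, 11.11, 13.17, 8.64, 11.37),
  PctPov   = c(19.9, 26.0, 24.1, 24.8, 17.5, 15.1),
  PctBlack = c(20.76, 26.86, 15.42, 51.67, 42.39, 3.49)
)

# Pairwise Pearson correlations; values near +1 or -1 suggest collinearity.
cor_mat <- cor(six)
round(cor_mat, 2)
```

With the full data frame loaded, the same call on those three columns, for example cor(df[, c("PctEld", "PctPov", "PctBlack")]), would give the correlations for all counties, assuming df holds these columns as in the model call used later.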
We are interested in predicting median income (MedInc) in Georgia. Assume that it is associated with PctRural, PctBach, PctEld, PctFB, PctPov and PctBlack. The equation for a regression is:
$$y_i = \beta_0 + \sum_{k=1}^{m} \beta_k x_{ik} + \epsilon_i \quad \text{for } i \in 1 \cdots n$$
where y_i is the response variable (MedInc here) and the x_ik are the predictor variables (PctPov etc.). For example:
m <- lm(MedInc ~ PctRural + PctBach + PctEld + PctFB + PctPov + PctBlack, data = df)
summary(m)
##
## Call:
## lm(formula = MedInc ~ PctRural + PctBach + PctEld + PctFB + PctPov +
##     PctBlack, data = df)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -12.4203  -2.9897  -0.6163   2.2095  25.8201
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept) 52.59895    3.10893  16.919  < 2e-16 ***
## PctRural     0.07377    0.02043   3.611 0.000414 ***
## PctBach      0.69726    0.11221   6.214 4.73e-09 ***
## PctEld      -0.78862    0.17979  -4.386 2.14e-05 ***
## PctFB       -1.29030    0.47388  -2.723 0.007229 **
## PctPov      -0.95400    0.10459  -9.121 4.19e-16 ***
## PctBlack     0.03313    0.03717   0.891 0.374140
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 5.155 on 152 degrees of freedom
## Multiple R-squared:  0.7685, Adjusted R-squared:  0.7593
## F-statistic: 84.09 on 6 and 152 DF,  p-value: < 2.2e-16
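To connect the fitted coefficients back to the regression equation, the sketch below evaluates the equation by hand for a single county. It uses the coefficient estimates printed in the output above and the first row of the data table shown earlier. Because the printed estimates are rounded, the result will differ very slightly from what fitted(m) would return.

```r
# Coefficient estimates copied from the summary output above.
beta <- c("(Intercept)" = 52.59895, PctRural = 0.07377, PctBach = 0.69726,
          PctEld = -0.78862, PctFB = -1.29030, PctPov = -0.95400,
          PctBlack = 0.03313)

# First row of the data table; the leading 1 multiplies the intercept.
x1 <- c(1, PctRural = 75.6, PctBach = 8.2, PctEld = 11.43,
        PctFB = 0.64, PctPov = 19.9, PctBlack = 20.76)
observed <- 32.152

# beta_0 + sum over k of beta_k * x_1k
predicted <- sum(beta * x1)
residual  <- observed - predicted
round(c(predicted = predicted, residual = residual), 2)
```

For this row the predicted median income is roughly 35.8 (in 1000s of dollars) against an observed 32.152, so the model over-predicts by about 3.6.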
Name | Residual |
---|---|
Charlton | -2.248484 |
Chattahoochee | -2.205354 |
Clarke | -2.918361 |
Fayette | 2.984996 |
Forsyth | 5.620108 |
Seminole | 2.954353 |
Wilkinson | -2.307853 |
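The source does not say which kind of residual the table reports or how the rows were chosen. One common way to build a table like this is to compute studentised residuals and keep the observations whose absolute value exceeds 2. The sketch below illustrates that approach on simulated toy data. The data, the names County1 to County50, the planted outlier and the threshold of 2 are all assumptions for illustration, not the author's method.

```r
set.seed(1)

# Simulated toy data: 50 hypothetical counties, one predictor.
toy <- data.frame(Name = paste0("County", 1:50), x = runif(50, 0, 100))
toy$y <- 40 - 0.2 * toy$x + rnorm(50, sd = 3)
toy$y[7] <- toy$y[7] + 20   # plant one unusual observation

fit <- lm(y ~ x, data = toy)

# Studentised residuals: each residual scaled by its estimated standard error.
s_resid <- rstudent(fit)

# Keep only the observations that the model fits unusually badly.
flagged <- data.frame(Name = toy$Name, Residual = s_resid)[abs(s_resid) > 2, ]
flagged
```

Applied to the income model, the equivalent step would use rstudent(m) together with the county names, assuming those are available as a column of the data.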