UQ July 2018
You will find Chapters 5 and 8 of An Introduction to R for Spatial Analysis and Mapping by Chris Brunsdon and Lex Comber useful.
This session uses the georgia dataset together with the GISTools and GWmodel packages. Once you are sure all the packages are installed, load them into the current session.
The data set contains a number of variables for the counties in Georgia from the 1990 census, including the percentage of the population in each county that is rural (PctRural), has a bachelor's degree (PctBach), is elderly (PctEld), is foreign born (PctFB), is in poverty (PctPov) and is black (PctBlack), and the median income of the county (MedInc, in 1000s of dollars).
| MedInc | PctRural | PctBach | PctEld | PctFB | PctPov | PctBlack |
|---|---|---|---|---|---|---|
| 32.152 | 75.6 | 8.2 | 11.43 | 0.64 | 19.9 | 20.76 |
| 27.657 | 100.0 | 6.4 | 11.77 | 1.58 | 26.0 | 26.86 |
| 29.342 | 61.7 | 6.6 | 11.11 | 0.27 | 24.1 | 15.42 |
| 29.610 | 100.0 | 9.4 | 13.17 | 0.11 | 24.8 | 51.67 |
| 36.414 | 42.7 | 13.3 | 8.64 | 1.43 | 17.5 | 42.39 |
| 41.783 | 100.0 | 6.4 | 11.37 | 0.34 | 15.1 | 3.49 |
Visually, it seems that there may be some collinearity between PctPov, PctBlack and PctEld.
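A numerical check can complement the visual impression. The sketch below is illustrative only: it re-enters the six rows printed in the table above (the name `df_head` is made up for this example) and computes pairwise correlations for the three variables mentioned. With so few rows the values will not match those from the full data set, so treat it as a template for running the same call on the complete data frame.

```r
# The six rows printed in the table above (NOT the full county data set)
df_head <- data.frame(
  MedInc   = c(32.152, 27.657, 29.342, 29.610, 36.414, 41.783),
  PctRural = c(75.6, 100.0, 61.7, 100.0, 42.7, 100.0),
  PctBach  = c(8.2, 6.4, 6.6, 9.4, 13.3, 6.4),
  PctEld   = c(11.43, 11.77, 11.11, 13.17, 8.64, 11.37),
  PctFB    = c(0.64, 1.58, 0.27, 0.11, 1.43, 0.34),
  PctPov   = c(19.9, 26.0, 24.1, 24.8, 17.5, 15.1),
  PctBlack = c(20.76, 26.86, 15.42, 51.67, 42.39, 3.49)
)

# Pairwise Pearson correlations between the three variables of interest
vars <- c("PctPov", "PctBlack", "PctEld")
cor_mat <- cor(df_head[, vars])
print(round(cor_mat, 2))

# A scatterplot matrix gives the visual equivalent:
# pairs(df_head[, vars])
```

Correlations close to 1 or -1 between predictors would support the suspicion of collinearity.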
We are interested in predicting MedInc in Georgia.
Assume that median income is associated with the six percentage variables (PctRural, PctBach, PctEld, PctFB, PctPov and PctBlack). The equation for a regression is:
\(y_i = \beta_{0} + \sum_{k=1}^{m}\beta_{k}x_{ik} + \epsilon_i \textsf{ for } i \in \{1, \ldots, n\}\)
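where \(y_i\) is the dependent variable for observation \(i\) (MedInc here), \(x_{ik}\) is the value of the \(k\)th of \(m\) predictor variables (PctPov etc.), \(\beta_0\) is the intercept, \(\beta_k\) is the coefficient for predictor \(k\), and \(\epsilon_i\) is the error term. Such a model can be fitted in R with lm, for example: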
# Fit an OLS regression of median income on the six percentage variables
m <- lm(MedInc ~ PctRural + PctBach + PctEld + PctFB + PctPov + PctBlack,
        data = df)
summary(m)
## Call:
## lm(formula = MedInc ~ PctRural + PctBach + PctEld + PctFB + PctPov +
##     PctBlack, data = df)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -12.4203  -2.9897  -0.6163   2.2095  25.8201
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept) 52.59895    3.10893  16.919  < 2e-16 ***
## PctRural     0.07377    0.02043   3.611 0.000414 ***
## PctBach      0.69726    0.11221   6.214 4.73e-09 ***
## PctEld      -0.78862    0.17979  -4.386 2.14e-05 ***
## PctFB       -1.29030    0.47388  -2.723 0.007229 **
## PctPov      -0.95400    0.10459  -9.121 4.19e-16 ***
## PctBlack     0.03313    0.03717   0.891 0.374140
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 5.155 on 152 degrees of freedom
## Multiple R-squared:  0.7685, Adjusted R-squared:  0.7593
## F-statistic: 84.09 on 6 and 152 DF,  p-value: < 2.2e-16
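To see how the regression equation turns these estimates into a prediction, the sketch below plugs the reported coefficients into the equation for the first county in the data table. The object names (`beta`, `x`, `fitted_medinc`) are made up for this illustration, and because the coefficients are copied from the rounded printout, the result is approximate.

```r
# Coefficient estimates copied from the summary output above (rounded)
beta <- c("(Intercept)" = 52.59895, PctRural = 0.07377, PctBach = 0.69726,
          PctEld = -0.78862, PctFB = -1.29030, PctPov = -0.95400,
          PctBlack = 0.03313)

# Predictor values for the first row of the data table;
# the leading 1 multiplies the intercept
x <- c(1, PctRural = 75.6, PctBach = 8.2, PctEld = 11.43,
       PctFB = 0.64, PctPov = 19.9, PctBlack = 20.76)

# y_hat = beta_0 + sum_k beta_k * x_k
fitted_medinc <- sum(beta * x)

# Residual = observed MedInc (32.152) minus the fitted value
residual <- 32.152 - fitted_medinc

print(round(c(fitted = fitted_medinc, residual = residual), 3))
```

The fitted value is about 35.76 (thousand dollars) against an observed 32.152, so the model over-predicts this county by roughly 3.6. With the fitted model object available, `fitted(m)` and `residuals(m)` return these quantities for every county.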
| Name | Residual |
|---|---|
| Charlton | -2.248484 |
| Chattahoochee | -2.205354 |
| Clarke | -2.918361 |
| Fayette | 2.984996 |
| Forsyth | 5.620108 |
| Seminole | 2.954353 |
| Wilkinson | -2.307853 |
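Every value in the table above is larger than 2 in absolute value. That pattern is consistent with flagging counties by studentized residual, although the text does not confirm that this is how the table was produced. As a hedged sketch of that general technique, the example below uses R's built-in mtcars data rather than the Georgia data, and the names `m_demo`, `resid_df` and `outliers` are made up for the illustration.

```r
# Illustrative model on R's built-in mtcars data (not the Georgia data)
m_demo <- lm(mpg ~ wt + hp, data = mtcars)

# Studentized residuals: each residual scaled by an estimate of its
# standard error computed with that observation left out
s_resids <- rstudent(m_demo)

# Collect into a Name / Residual table like the one above
resid_df <- data.frame(Name = names(s_resids), Residual = unname(s_resids))

# Keep only observations more than 2 units from zero (assumed threshold)
outliers <- resid_df[abs(resid_df$Residual) > 2, ]
print(outliers, row.names = FALSE)
```

Applied to the Georgia model, the same steps would use `rstudent(m)` and the county names in place of the car names.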