May 2018
Session 1 Presentation, worksheet and data at
Lex Comber email: a.comber@leeds.ac.uk
Paul Harris (‘Harry’) email: paul.harris@rothamsted.ac.uk
There will be 5 sessions
There will be breaks
Aim to finish at 17:00
We have set aside time in the first session (Introduction) to make sure everything is set up
Technical aspects
Code can be entered in the console or saved in a script (like a text file)
Technical aspects: RStudio start
Technical aspects: Open a script
Technical aspects: RStudio Components
Conceptual aspects
Use # to add comments in your script
Conceptual aspects
In each session
You will find Chapters 5 and 8 of An Introduction to R for Spatial Analysis and Mapping by Chris Brunsdon and Lex Comber useful.
The data are provided in two files: one in .csv format and one shapefile (.shp).
The data set contains a number of variables for a study area at Rothamsted Research, where Harry works.
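The variables include sulphur (S), calcium (Ca), iron (Fe), phosphorus (P), soil organic matter (SOM) and the sample location (Easting and Northing):

| ID | Easting | Northing | S | Ca | Fe | P | SOM | pH | Slope | Aspect |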
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 265632 | 99300 | 698.36 | 3163.39 | 28688.60 | 1026.80 | 12.90 | 5.60 | 1.32 | 257.46 |
| 2 | 265625 | 99275 | 585.10 | 2527.01 | 25116.20 | 823.39 | 10.60 | 5.41 | 0.57 | 269.99 |
| 3 | 265650 | 99275 | 595.65 | 2634.03 | 28939.38 | 937.63 | 11.26 | 5.53 | 2.25 | 307.24 |
| 4 | 265600 | 99250 | 576.12 | 2625.56 | 28467.91 | 853.77 | 10.24 | 5.47 | 2.83 | 315.00 |
| 5 | 265625 | 99250 | 576.18 | 2446.51 | 28502.34 | 883.98 | 10.37 | 5.42 | 0.37 | 258.71 |
| 6 | 265650 | 99250 | 576.61 | 2440.93 | 29359.05 | 914.27 | 10.60 | 5.49 | 2.53 | 298.74 |
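
To get a feel for the structure of the data before loading the full file, the six rows shown above can be typed into a data frame by hand. This is only a sketch: the object name `df_head` is an assumption, and six rows are far too few for any modelling.

```r
# The six rows printed in the table above, entered by hand.
# `df_head` is a hypothetical name; the full data set has many more rows.
df_head <- data.frame(
  ID       = 1:6,
  Easting  = c(265632, 265625, 265650, 265600, 265625, 265650),
  Northing = c(99300, 99275, 99275, 99250, 99250, 99250),
  S        = c(698.36, 585.10, 595.65, 576.12, 576.18, 576.61),
  Ca       = c(3163.39, 2527.01, 2634.03, 2625.56, 2446.51, 2440.93),
  Fe       = c(28688.60, 25116.20, 28939.38, 28467.91, 28502.34, 29359.05),
  P        = c(1026.80, 823.39, 937.63, 853.77, 883.98, 914.27),
  SOM      = c(12.90, 10.60, 11.26, 10.24, 10.37, 10.60),
  pH       = c(5.60, 5.41, 5.53, 5.47, 5.42, 5.49),
  Slope    = c(1.32, 0.57, 2.25, 2.83, 0.37, 2.53),
  Aspect   = c(257.46, 269.99, 307.24, 315.00, 258.71, 298.74)
)

str(df_head)                                  # column names and types
summary(df_head[, c("S", "SOM", "pH")])       # quick numeric summaries
dist(df_head[, c("Easting", "Northing")])     # pairwise distances between samples
```

The pairwise distances give a first impression of how closely spaced the sample locations are.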
We are interested in predicting S.
Assume S is associated with the other variables, such as Ca, Fe, P and SOM.
The equation for a regression is:
\(y_i = \beta_{0} + \sum_{k=1}^{m}\beta_{k}x_{ik} + \epsilon_i \textsf{ for } i = 1, \ldots, n\)
where y is the response variable (S here) and the x terms are the predictor variables (SOM etc.), for example:
m <- lm(S ~ Ca + Fe + P + SOM + pH + Slope + Aspect, data = df)
summary(m)
## 
## Call:
## lm(formula = S ~ Ca + Fe + P + SOM + pH + Slope + Aspect, data = df)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -333.55  -33.16   -0.40   31.99  285.15 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  7.466e+02  6.843e+01  10.910  < 2e-16 ***
## Ca           6.213e-02  4.790e-03  12.972  < 2e-16 ***
## Fe          -8.236e-03  3.765e-04 -21.874  < 2e-16 ***
## P            1.152e-01  1.186e-02   9.710  < 2e-16 ***
## SOM          1.952e+01  1.021e+00  19.118  < 2e-16 ***
## pH          -7.060e+01  1.335e+01  -5.289 1.49e-07 ***
## Slope        9.125e+00  8.159e-01  11.184  < 2e-16 ***
## Aspect       1.157e-01  1.481e-02   7.814 1.33e-14 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 51.45 on 1062 degrees of freedom
## Multiple R-squared:  0.8357, Adjusted R-squared:  0.8346 
## F-statistic: 771.5 on 7 and 1062 DF,  p-value: < 2.2e-16
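A fitted model can also be queried directly rather than read off the printed summary. The sketch below uses a small simulated data frame (`sim`, an assumption, not the Rothamsted data) so that it runs on its own; with the workshop data the same calls should apply to `m`.

```r
# Simulated stand-in data: NOT the workshop data, for illustration only.
set.seed(1)
n   <- 200
sim <- data.frame(SOM = runif(n, 5, 20), pH = runif(n, 5, 7))
sim$S <- 300 + 20 * sim$SOM - 50 * sim$pH + rnorm(n, sd = 30)

m_sim <- lm(S ~ SOM + pH, data = sim)

coef(m_sim)                 # named vector of estimated coefficients
confint(m_sim)              # 95% confidence intervals
summary(m_sim)$r.squared    # proportion of variance explained
head(residuals(m_sim))      # observed minus fitted values

# The same estimates from the least-squares normal equations
X    <- model.matrix(m_sim)
beta <- solve(t(X) %*% X, t(X) %*% sim$S)
all.equal(as.numeric(beta), as.numeric(coef(m_sim)))
```

The last lines check that `lm` is estimating the beta terms of the regression equation by ordinary least squares: one coefficient per predictor, fitted once for the whole study area.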
“Everything is related to everything else, but near things are more related than distant things.” (Tobler, 1970)
“the full range of conditions anywhere on the Earth’s surface could in principle be found packed within any small area. There would be no regions of approximately homogeneous conditions to be described by giving attributes to area objects. Topographic surfaces would vary chaotically, with slopes that were everywhere infinite, and the contours of such surfaces would be infinitely dense and contorted. Spatial analysis, and indeed life itself, would be impossible.” (de Smith et al., 2007, p. 44)
Coefficients change
| | Intercept | Ca | Fe | P | SOM | pH | Slope | Aspect |
|---|---|---|---|---|---|---|---|---|
| Min. | -233.472 | -0.026 | -0.015 | -0.093 | -5.664 | -309.436 | -3.774 | -0.104 |
| 1st Qu. | 338.519 | 0.040 | -0.006 | 0.039 | 10.136 | -102.276 | -0.542 | -0.001 |
| Median | 692.092 | 0.060 | -0.004 | 0.097 | 18.789 | -68.268 | 1.033 | 0.043 |
| Mean | 659.049 | 0.063 | -0.005 | 0.103 | 18.120 | -67.445 | 1.456 | 0.086 |
| 3rd Qu. | 896.233 | 0.082 | -0.003 | 0.168 | 26.647 | -21.152 | 3.388 | 0.107 |
| Max. | 1942.674 | 0.176 | 0.002 | 0.284 | 44.712 | 161.084 | 8.880 | 0.737 |
| Global | 746.569 | 0.062 | -0.008 | 0.115 | 19.516 | -70.596 | 9.125 | 0.116 |
Coefficients change over the map:
Some coefficients flip…
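One way coefficients can come to vary over space is by refitting the regression at each location with nearby observations weighted more heavily than distant ones, in the spirit of Tobler's statement. The sketch below does this in base R on simulated data. The Gaussian kernel, the bandwidth `bw` and all object names are assumptions for illustration, and this is not presented as the method used to produce the table above.

```r
# Simulated stand-in data on a 20 x 20 grid: NOT the workshop data.
set.seed(42)
grid <- expand.grid(Easting  = seq(0, 475, by = 25),
                    Northing = seq(0, 475, by = 25))
n <- nrow(grid)
grid$SOM <- runif(n, 5, 20)

# True SOM effect drifts from +10 in the west to -10 in the east
true_b <- 10 - 20 * grid$Easting / max(grid$Easting)
grid$S <- 500 + true_b * grid$SOM + rnorm(n, sd = 5)

bw <- 75  # assumed kernel bandwidth, in the same units as the coordinates

# Fit a distance-weighted regression centred on observation i
local_coef <- function(i) {
  d <- sqrt((grid$Easting - grid$Easting[i])^2 +
            (grid$Northing - grid$Northing[i])^2)
  w <- exp(-0.5 * (d / bw)^2)   # Gaussian kernel: weight decays with distance
  coef(lm(S ~ SOM, data = grid, weights = w))
}

# One row of local coefficients per location
loc_coefs <- t(sapply(seq_len(n), local_coef))

coef(lm(S ~ SOM, data = grid))   # single global estimate
summary(loc_coefs[, "SOM"])      # spread of the local estimates
```

In this simulated example the single global estimate averages over an effect that is positive in one part of the area and negative in another, which is the kind of sign flip a summary of local coefficients can reveal.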