The compressed file oregon.zip contains data on average annual
temperatures for Oregon from a set of climate stations in the shapefile
oregontann.shp in a variable called ‘tann’, and station
elevation in a variable called elevation. A second file
orgrid.shp contains a set of gridded elevations for the state at 10
minute resolution. Code is given below to read in these data and convert
to sf and stars objects for geostatistical
analysis. Using the gstat library, carry out the following
analyses:
Read in the files and produce a sample variogram for average
annual temperatures in Oregon. (Use the variogram()
function)
# Oregon boundaries
orotl <- st_read("../datafiles/oregon/oregon/orotl.shp", quiet = TRUE)
st_crs(orotl) <- 4326
# Oregon temperatures
ortann <- st_read("../datafiles/oregon/oregon/oregontann.shp", quiet = TRUE)
st_crs(ortann) <- 4326
# Oregon DEM file
orgrid <- st_read("../datafiles/oregon/oregon/orgrid.shp", quiet = TRUE)
st_crs(orgrid) <- 4326
orgrid.dem <- st_rasterize(orgrid, dx = 0.1667, dy = 0.1667)
# plot
plot(orgrid.dem, reset = FALSE)
plot(ortann["tann"],
add = TRUE,
pch = 16,
cex = 1.5)
ORtemp_var <- variogram(tann ~ 1, ortann)
plot(ORtemp_var, plot.numbers = TRUE)
It looks like the variogram flattens out around 200.
ORtemp_var2 <- variogram(tann ~ 1, ortann,
cutoff = 175,
width = 15)
plot(ORtemp_var2, plot.numbers = TRUE)
Create a variogram model for this data using the
vgm() function. You will need to choose an appropriate
model and initial parameters for the nugget, sill and range. Report the
values and model you have used
The nugget looks pretty small (close to zero), the sill is around 4, and the range is around 200.
ORTemp_Nugget <- 0.1
ORTemp_Range <- 175
ORTemp_Sill <- 3.7
ORTemp_vgm1 <- vgm(psill = ORTemp_Sill,
model = "Cir",
range = ORTemp_Range,
nugget = ORTemp_Nugget)
plot(ORtemp_var, ORTemp_vgm1, main = "Oregon Temperatures Variogram")
After adjusting the nugget, sill, and range values until I was happy with the fit, I ended with a nugget value of 0.1, sill of 3.7, and range of 175.
Use the fit.variogram() function to fit this model
to the sample variogram from step a. Produce a plot showing the final
variogram
ORTemp_vgm2 <- fit.variogram(ORtemp_var, ORTemp_vgm1)
plot(ORtemp_var, ORTemp_vgm2, main = "Oregon Temperatures Variogram")
Fitted values are:
ORTemp_vgm2
## model psill range
## 1 Nug 0.2076095 0.0000
## 2 Cir 3.7165710 208.1351
I was pretty close for the nugget and sill, but underestimated the range.
Now use this model to interpolate the annual temperatures using
the grid from the DEM, using the krige() function. Produce
a map showing the predicted value on the grid and the prediction
error.
ORTemp_krige <- krige(tann ~ 1,
locations = ortann,
newdata = orgrid.dem,
model = ORTemp_vgm2,
nmax = 40)
## [using ordinary kriging]
my.pal <- brewer.pal(9, "Blues")
plot(ORTemp_krige, col = my.pal,
main = "Interpolated precipitation values (OK)")
Predicted error
plot(ORTemp_krige["var1.var"],
main = "Interpolated precipitation prediction error (OK)")
Use the krige.cv() function with 5-fold
cross-validation to report the root mean squared error and R\(^2\)
ORTemp_Kcv <- krige.cv(tann ~ 1,
locations = ortann,
model = ORTemp_vgm2,
nmax = 40,
nfold = 5)
## RMSEP
rmsep <- sqrt(mean(ORTemp_Kcv$residual^2))
rmsep
## [1] 1.283871
##R2P
R2P <- cor(ORTemp_Kcv$observed, ORTemp_Kcv$var1.pred)^2
R2P
## [1] 0.529979
The average error for making a prediction (RMSEP) is 1.284. The amount of variance in the test data set predicted by the model (R\(^2_P\)) is 0.53. This value seems lower than what would be desired.
Looking at the residuals:
sp::bubble(as_Spatial(ORTemp_Kcv)[,"residual"], pch = 16)
The residual looks like it is higher in the high-desert portions of the state that are near mountains (such as the southern area near Klamath Falls, and the northeast corner near the Wallowa Mountains). These patterns suggest that there are other underlying factors that are missing from the dataset.
plot(ORTemp_Kcv$var1.pred, ORTemp_Kcv$residual,
xlab = 'PPT Predicted Values',
ylab = 'PPT Residuals')
abline(h = 0, lty = 2)
Plotting the residuals vs. the predicted values, there doesn’t appear to be any trend. This means that the predicted model is fairly unbiased.
fin