1 Spatial Interpolation using Inverse Distance Weighing (IDW) in R

library(data.table)
library(raster)
## 载入需要的程辑包:sp
library(leaflet)

1.1 简介

This markdown file is adapted from Applied Spatial Analysis in Public Health online course available from https://hughst.github.io/. We would like to thank the contributors of the course who have provided the datasets and course in open domain.

Spatial interpolation is done to estimate spatial variation in risk of continous spatial variables when the data has been collected from sample point locations. For example, temperature is a continous spatial variable, however, data from meterological stations is available as sample point locations.

One of the common methods employed for spatial interpolation is based on Inverse Distance Weighing (IDW). The present document focuses on its application using R.

1.1.1 加载数据

dat <- fread('F:/learn/R/统计方法R实现/IDW插值/mal_data_eth_2009_no_dups.csv',header=T)

第二个数据集(ethopia_adm) 来自GADM的研究区地理信息数据

ethopia_adm <- raster::getData("GADM", country="ETH", level=1)
## Warning in raster::getData("GADM", country = "ETH", level = 1): getData will be removed in a future version of raster
## . Please use the geodata package instead

1.1.2 查看数据

names(dat)
##  [1] "country"      "country_id"   "continent_id" "site_id"      "site_name"   
##  [6] "latitude"     "longitude"    "rural_urban"  "year_start"   "lower_age"   
## [11] "upper_age"    "examined"     "pf_pos"       "pf_pr"        "method"      
## [16] "title1"       "citation1"
options(tibble.width = Inf)
head(dat)
##     country country_id continent_id site_id          site_name latitude
## 1: Ethiopia        ETH       Africa    6694        Dole School   5.9014
## 2: Ethiopia        ETH       Africa    8017     Gongoma School   6.3175
## 3: Ethiopia        ETH       Africa   12873      Buriya School   7.5674
## 4: Ethiopia        ETH       Africa    6533       Arero School   4.7192
## 5: Ethiopia        ETH       Africa    4150     Gandile School   4.8930
## 6: Ethiopia        ETH       Africa    1369 Melka Amana School   6.2461
##    longitude rural_urban year_start lower_age upper_age examined pf_pos
## 1:   38.9412     UNKNOWN       2009         4        15      220      0
## 2:   39.8362     UNKNOWN       2009         4        15      216      0
## 3:   40.7521     UNKNOWN       2009         4        15      127      0
## 4:   38.7650     UNKNOWN       2009         4        15       56      0
## 5:   37.3632     UNKNOWN       2009         4        15      219      0
## 6:   39.7891     UNKNOWN       2009         4        15      215      1
##          pf_pr     method
## 1: 0.000000000 Microscopy
## 2: 0.000000000 Microscopy
## 3: 0.000000000 Microscopy
## 4: 0.000000000 Microscopy
## 5: 0.000000000 Microscopy
## 6: 0.004651163 Microscopy
##                                                                                                                                 title1
## 1: School-based surveys of malaria in Oromia Regional State, Ethiopia: a rapid survey method for malaria in low transmission settings.
## 2: School-based surveys of malaria in Oromia Regional State, Ethiopia: a rapid survey method for malaria in low transmission settings.
## 3: School-based surveys of malaria in Oromia Regional State, Ethiopia: a rapid survey method for malaria in low transmission settings.
## 4: School-based surveys of malaria in Oromia Regional State, Ethiopia: a rapid survey method for malaria in low transmission settings.
## 5: School-based surveys of malaria in Oromia Regional State, Ethiopia: a rapid survey method for malaria in low transmission settings.
## 6: School-based surveys of malaria in Oromia Regional State, Ethiopia: a rapid survey method for malaria in low transmission settings.
##                                                                                                                                                                                                                                                                                                citation1
## 1: Ashton, RA, Kefyalew, T, Tesfaye, G, Pullan, RL, Yadeta, D, Reithinger, R, Kolaczinski, JH and Brooker, S (2011).  <b>School-based surveys of malaria in Oromia Regional State, Ethiopia: a rapid survey method for malaria in low transmission settings.</b> <i>Malaria Journal</i>, <b>10</b>(1):25
## 2: Ashton, RA, Kefyalew, T, Tesfaye, G, Pullan, RL, Yadeta, D, Reithinger, R, Kolaczinski, JH and Brooker, S (2011).  <b>School-based surveys of malaria in Oromia Regional State, Ethiopia: a rapid survey method for malaria in low transmission settings.</b> <i>Malaria Journal</i>, <b>10</b>(1):25
## 3: Ashton, RA, Kefyalew, T, Tesfaye, G, Pullan, RL, Yadeta, D, Reithinger, R, Kolaczinski, JH and Brooker, S (2011).  <b>School-based surveys of malaria in Oromia Regional State, Ethiopia: a rapid survey method for malaria in low transmission settings.</b> <i>Malaria Journal</i>, <b>10</b>(1):25
## 4: Ashton, RA, Kefyalew, T, Tesfaye, G, Pullan, RL, Yadeta, D, Reithinger, R, Kolaczinski, JH and Brooker, S (2011).  <b>School-based surveys of malaria in Oromia Regional State, Ethiopia: a rapid survey method for malaria in low transmission settings.</b> <i>Malaria Journal</i>, <b>10</b>(1):25
## 5: Ashton, RA, Kefyalew, T, Tesfaye, G, Pullan, RL, Yadeta, D, Reithinger, R, Kolaczinski, JH and Brooker, S (2011).  <b>School-based surveys of malaria in Oromia Regional State, Ethiopia: a rapid survey method for malaria in low transmission settings.</b> <i>Malaria Journal</i>, <b>10</b>(1):25
## 6: Ashton, RA, Kefyalew, T, Tesfaye, G, Pullan, RL, Yadeta, D, Reithinger, R, Kolaczinski, JH and Brooker, S (2011).  <b>School-based surveys of malaria in Oromia Regional State, Ethiopia: a rapid survey method for malaria in low transmission settings.</b> <i>Malaria Journal</i>, <b>10</b>(1):25

1.2 R中的IDW插值

1.2.1 创建插值范围

在进行IDW插值前,制定插值范围是第一步,然后课程单元所使用的现有数据来自 Ethopia 的 Oromia,因此创建了相同的范围

suppressMessages(library(spatstat))
oromia <- ethopia_adm[ethopia_adm$NAME_1=="Oromia",]
obs_window <- owin(oromia@bbox[1,], oromia@bbox[2,])

1.2.2 数据框数据转点形式对象

下一步包括在指定的 owin 中创建点模式对象(ppp)。标记(恶性疟率)作为每个点的属性值。

ppp_malaria<-ppp(dat$longitude,dat$latitude,marks=dat$pf_pr,window=obs_window)

1.2.3 创建IDW插值对象

1.2.3.0.1 IDW 对象是使用 R 中 spatstat 库中的 IDW 函数创建的。我们将使用0.05的幂值和“ at”参数像素作为说明。
idw_malaria <- idw(ppp_malaria, power=0.05, at="pixels")

1.2.4 可视化插值结果

plot(idw_malaria,
     col=heat.colors(20), 
     main="Interpolated spatial variation in risk of malaria based on IDW method \n (Power = 0.05)") 

1.2.5 基于点的空间风险插值

为了演示在 idw 函数中使用“ at”参数时可视化的变化,我们将复制上面的代码,只在“ at”参数中进行变化。

idw_points <- idw(ppp_malaria, power = 0.05, at = "points")
plot(idw_points, col=heat.colors(64))

1.2.6 均方根误差计算

均方差是衡量插值方法准确性的指标。

Metrics::mse(ppp_malaria$marks, idw_points)
## [1] 0.0001646708

1.3 用交叉验证法确定IDW插值最优参数

1.3.1 理解最优权重的需求

par(mfrow=c(2,2))
plot(idw(ppp_malaria, power=0.001, at="pixels"),col=heat.colors(20), main="power = 0.001") 
plot(idw(ppp_malaria, power=0.01, at="pixels"),col=heat.colors(20), main="power = 0.01")
plot(idw(ppp_malaria, power=0.1, at="pixels"),col=heat.colors(20), main="power = 0.1")
plot(idw(ppp_malaria, power=10, at="pixels"),col=heat.colors(20), main="power = 10") 

1.3.2 采用交叉验证获取最小误差对应的权重参数

powers <- seq(0.001, 10, 0.01)
mse_result <- NULL
for(power in powers){
  CV_idw <- idw(ppp_malaria, power=power, at="points")
  mse_result <- c(mse_result,
                  Metrics::mse(ppp_malaria$marks,CV_idw))
}
optimal_power <- powers[which.min(mse_result)]
optimal_power
## [1] 1.411
plot(powers, mse_result)

1.4 可视化插值结果

#转为raster
idw_raster <- raster(idw(ppp_malaria, power=optimal_power, at="pixels"))
## 定义坐标
crs_zb <- crs(ethopia_adm)
proj4string(idw_raster) <- CRS(crs_zb@projargs)
# 定义色带
colPal <- colorNumeric(wesanderson::wes_palette("Zissou1")[1:5], idw_raster[], n = 5)
# 画图并在线显示
leaflet() %>% addTiles() %>% addRasterImage(idw_raster, col = colPal, opacity=0.7) %>%
  addLegend(pal = colPal, values = idw_raster[])