The packages we need:
> library(sp) # coordinates, loaded anyway by raster
> library(raster) # crs, extract
The 2009 land cover data, which is a raster file with 300 m resolution:
> gcvn <- globcoverVN::getgcvn()
> legend <- globcoverVN::show_legend(gcvn)
11 : Post-flooding or irrigated croplands (or aquatic)
14 : Rainfed croplands
20 : Mosaic cropland (50-70%) / vegetation (grassland/shrubland/forest) (20-50%)
30 : Mosaic vegetation (grassland/shrubland/forest) (50-70%) / cropland (20-50%)
40 : Closed to open (>15%) broadleaved evergreen or semi-deciduous forest (>5m)
50 : Closed (>40%) broadleaved deciduous forest (>5m)
60 : Open (15-40%) broadleaved deciduous forest/woodland (>5m)
70 : Closed (>40%) needleleaved evergreen forest (>5m)
100 : Closed to open (>15%) mixed broadleaved and needleleaved forest (>5m)
110 : Mosaic forest or shrubland (50-70%) / grassland (20-50%)
120 : Mosaic grassland (50-70%) / forest or shrubland (20-50%)
130 : Closed to open (>15%) (broadleaved or needleleaved, evergreen or deciduous) shrubland (<5m)
140 : Closed to open (>15%) herbaceous vegetation (grassland, savannas or lichens/mosses)
150 : Sparse (<15%) vegetation
160 : Closed to open (>15%) broadleaved forest regularly flooded (semi-permanently or temporarily) - Fresh or brackish water
170 : Closed (>40%) broadleaved forest or shrubland permanently flooded - Saline or brackish water
190 : Artificial surfaces and associated areas (Urban areas >50%)
200 : Bare areas
210 : Water bodies
220 : Permanent snow and ice
> legend <- with(legend, setNames(landtype, code))
> legend <- c(legend, "NA" = NA)
The GPS coordinates:
> gps <- readxl::read_excel("GPS coordinates.xlsx")
> coordinates(gps) <- ~ Longitude + Latitude
> crs(gps) <- crs(gcvn)
Extracting (takes a loooooooong time (a few days)):
> buffer10m <- extract(gcvn, gps, buffer = 10)
> buffer100m <- extract(gcvn, gps, buffer = 100)
10-m and 100-m buffers just give same thing:
> identical(buffer10m, buffer100m)
[1] TRUE
Furthermore, each coordinate has only one value of land cover:
> unique(sapply(buffer10m, length))
[1] 1
which makes our life easier! So, we can easily merge it back with the data frame of the coordinates:
> land_cover <- as.character(unlist(buffer10m))
> land_cover[is.na(land_cover)] <- "NA"
> land_cover <- data.frame(coordinates(gps), land_cover = legend[land_cover])
The number of GPS coordinates:
> nrow(land_cover)
[1] 730320
The number of missing values:
> sum(is.na(land_cover$land_cover))
[1] 73
> 100 * sum(is.na(land_cover$land_cover)) / nrow(land_cover)
[1] 0.009995618
Calculating proportions:
> land_cover2 <- na.exclude(land_cover)
> 100 * sort(table(land_cover$land_cover) / nrow(land_cover), TRUE)
Rainfed croplands
5.927224e+01
Mosaic vegetation (grassland/shrubland/forest) (50-70%) / cropland (20-50%)
3.617469e+01
Mosaic cropland (50-70%) / vegetation (grassland/shrubland/forest) (20-50%)
3.832567e+00
Water bodies
2.327747e-01
Post-flooding or irrigated croplands (or aquatic)
1.204951e-01
Closed to open (>15%) (broadleaved or needleleaved, evergreen or deciduous) shrubland (<5m)
1.066656e-01
Closed to open (>15%) broadleaved evergreen or semi-deciduous forest (>5m)
9.776536e-02
Closed to open (>15%) mixed broadleaved and needleleaved forest (>5m)
7.407712e-02
Closed to open (>15%) herbaceous vegetation (grassland, savannas or lichens/mosses)
4.463797e-02
Artificial surfaces and associated areas (Urban areas >50%)
1.916968e-02
Closed (>40%) needleleaved evergreen forest (>5m)
8.626356e-03
Open (15-40%) broadleaved deciduous forest/woodland (>5m)
6.161683e-03
Bare areas
1.369263e-04
A pie chart:
> pie(sort(table(land_cover$land_cover) / nrow(land_cover)))
So, basically, we have: