There are couple of options–I found the following combination the most straightforward. It downloads a SpatialPolygonsDataFrame for Nepal, admin level 3 (districts). You can also download other levels (development region, anchals, VDCs,etc) depending on the granularity of the dataset your’re trying to explore.
library(raster)
library(rgeos)
library(maptools)
library(ggplot2)
library(ggalt)
library(ggthemes)
library(viridis)
# Get SpatialPolygonsDataFrame and plot
npl <- getData(country="NP",level=3)
plot(npl)
I downloaded a sample dataset for districts from https://github.com/opennepal/odp-economy The file labor_productivity.csv contains GDP, Labor Productivity, Economically Active Population for each district. I will make plots for GDP–rest is left as an exercise for the user.
ldata <- read.csv("../labor_productivity.csv")
#head(ldata)
# create table with district, GDP (Value Added) Rs. In Million.
gdpDist <- subset(ldata,Sub.Group == "GDP (Value Added) Rs. In Million")
#names(gdpDist)
# Only keep District name, Value.
dropCols <- c("Zone","Geographical.Region","Sub.Group","Development.Region")
gdpDist <- gdpDist[ , !(names(gdpDist) %in% dropCols)]
# Change column name "Value" to "GDP"
colnames(gdpDist) <- c("District", "GDP")
# Clean data ready to merge to SpatialPolygonsDataFrame
head(gdpDist)
## District GDP
## 2 Achham 5513.8
## 5 Arghakhanchi 7198.2
## 8 Baglung 9397.1
## 11 Baitadi 5821.3
## 14 Bajhang 3816.6
## 17 Bajura 2851.5
Before we merge dataset map objects, we want to make sure that the names for districts are same.
distData <- gdpDist$District # name of dist. from labor_data file
distNPL <- npl@data$NAME_3 # name of dist. from npl
# Find the difference: dist. that is in distData but not in distNPL
distNPL[!(distNPL %in% distData)]
## [1] "Kavrepalanchok" "Darchula"
# We see that "Kavrepalanchok" and "Darchula" are named differently. Let's fix this. Firstly, start by converting gdpDist$District to characters
gdpDist$District <- as.character(gdpDist$District)
# Change
gdpDist$District[gdpDist$District=="Kavre"] <- "Kavrepalanchok"
gdpDist$District[gdpDist$District=="Darchaula"] <- "Darchula"
# Merge
npl <- merge(npl,gdpDist,by.x="NAME_3", by.y="District")
summary(npl$GDP)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 845 5927 9476 16650 21490 196700
# Now we are ready to make plots
require(RColorBrewer)
cols <- brewer.pal(n = 4, name = "Greys")
lcols <- cut(npl$GDP,
breaks = quantile(npl$GDP),
labels = cols)
plot(npl, col = as.character(lcols),
main="GDP Quantiles")
From the map, it is pretty clear which districts are on the 1st, 2nd, 3rd, or 4th GDP quantile. Kathmandu valley, obviously, is the one with the maximum GDP. Most of the districts from Terai, especially in central development region, are of darker shade–better than the average.
Terhathum, Okhaldhunga, and Rasuwa are the only districts east of Pokhara (Kaski district) that fall on the lowest quantile. Most of the districts in north-west region, Myagdi, Manang, and Rolpa also have GDP below 25%.
Make a similar map for labor productivity.
## District lProductivity
## 3 Achham 48177.7
## 6 Arghakhanchi 70768.0
## 9 Baglung 84903.3
## 12 Baitadi 54979.9
## 15 Bajhang 43624.1
## 18 Bajura 48084.7
cols <- brewer.pal(n = 4, name = "Greys")
lcols <- cut(npl$lProductivity,
breaks = quantile(npl$lProductivity),
labels = cols)
plot(npl, col = as.character(lcols),
main="Labor Productivity Quantiles")
Interesting to see that Manang, Myagdi, Rasuwa and other districts from 1st quantiles having high labor productivity. Say what?
Making quantiles lumps a group of districts into one of four categories. Instead, we can also make choropleth maps. In the following example, I have scaled down GDP by 10.
# https://stat.ethz.ch/pipermail/r-sig-geo/2016-January/023938.html
np_map <- fortify(npl, region="NAME_3")
choro_dat <- data.frame(region=npl@data$NAME_3,
value=npl@data$GDP/10,
stringsAsFactors=FALSE)
gg <- ggplot()
gg <- gg + geom_map(data=np_map, map=np_map,
aes(x=long, y=lat, map_id=id),
color="#b2b2b200", fill="#ffffff00", size=0.25)
gg <- gg + geom_map(data=choro_dat, map=np_map,
aes(fill=value, map_id=region),
color="#b2b2b2", size=0.25)
gg <- gg + scale_fill_viridis(name="GDP",
alpha=0.75)
gg <- gg + theme_map()
gg <- gg + theme(legend.position="bottom")
gg