Vietnam forest cover 2021

1. Objective

This work is to produce a map illustrating the forest cover rate of all provinces in Vietnam in 2021.

2. Clear R environment and load some pakages

First, we install some libraries and packages that are necessary for the process.

# Clear R environment 

rm(list = ls())

# Load some R packages

library("ggplot2")

library("dplyr")

library("stringi")

library("rgdal")

library(maptools)

install.packages("gpclib", repos = "http://cran.us.r-project.org")

gpclibPermit() # The result must be TRUE
[1] TRUE

3. Import and handle forest cover data

The data source origins from The Statistical Yearbook of Vietnam 2021 published by The General Statistics Office (GSO). Since the data is not available on the internet, you can download it at here. Take note of the path of your Excel file because it differs from my Excel path. The key point in data handling is to create a “matching” with the geospatial data. In this case, the matching is the new variable id_latin_new_1.

# Load forest data from Excel sheet

readxl::read_excel("D:/R/R_2023/gso/gso_2021_forest.xlsx", sheet = 1) -> forest_data

# Change the names of two columns

names(forest_data) <- c("province", "forest_cover")

# Create a new column of province names in Latin

forest_data %>% 
  slice(1:63) %>%      # Chose row 1 to 63
  mutate(id_latin = stri_trans_general(province, "Latin-ASCII")) -> forest_data

# Adjust the names of some provinces 

forest_data %>% 
  mutate(id_latin_new_1 = case_when(id_latin == "TP. Ho Chi Minh" ~ "Ho Chi Minh",
                                  id_latin == "Thua Thien - Hue" ~ "Thua Thien Hue",
                                  id_latin == "Bac Can" ~ "Bac Kan",
                                  TRUE ~ id_latin)) -> forest_data

4. Get and handle geospatial data

Now we get the geospatial data for provincial level of Vietnam. The available data is formatted in Spatial Polygons Data, famously known as GADM data. The levels from 0 to 3 are national, provincial, district, and commune levels, respectively. However, the server now is not stable, so you can download the geospatial data for provincial level at file name: gadm36_VNM_2_sp.rds, 1.8 Mb.

You can try do download the data directly from the server by the following function: geoData_vn_province <- raster::getData(“GADM”, country = “Vietnam”, level = 1)

# Get the provincial geospatial data

geoData_vn_province <- readRDS("D:/R/R_2023/gadm36_VNM/gadm36_VNM_2_sp.rds")

Because the initial data is Spatial Polygons Data, we must convert it into a data frame by the function fortify(). And then, we use the function unique() to filter the names of all provinces. We create a new variable id_latin_new_2 as the “matching”. A very basic map of Vietnam can be drawn by ggplot2.

# Convert geospatial data to data frame

vn_province_df <- geoData_vn_province %>% fortify(region = "NAME_1")

# Filter the provinces' names in the data frame

vn_provinces <- vn_province_df$id %>% unique()

# Create a new variable of province name in Latin

vn_province_df %>% 
  mutate(id_latin_new_2 = stri_trans_general(id, "Latin-ASCII")) -> vn_province_df

# Extract 63 provinces from geospatial data

vn_province_df$id_latin_new_2 %>% unique() -> province_vn

# Draw the first map of Vietnam at province level

vn_province_df %>% 
  ggplot(aes(x = long, y = lat, group = group, fill = id)) + 
  geom_polygon(show.legend = FALSE) +
  geom_path()

5. Check and join the two data sets

We create two sub-data sets of the province names, i.e. a.set and b.set and check whether or not they are different. Next we join the two data sets by using the function full_join() and through the matching factors id_latin_new_1 and id_latin_new_2. A basic map with forest cover density is plotted by using ggplot2.

# Check the two data sets

vn_province_df$id_latin_new_2 %>% unique() -> a.set

forest_data$id_latin_new_1 %>% unique() -> b.set

base::setdiff(a.set, b.set)     # Set a and Set b have no difference
character(0)
# Join 2 data sets

full_join(forest_data, vn_province_df, by = c("id_latin_new_1" = "id_latin_new_2")) -> df_for_mapping

# Draw the second map

df_for_mapping %>% 
  ggplot(aes(x = long, y = lat, group = group, fill = forest_cover)) + 
  geom_polygon()

6. Refine the map

We choose a specific font and a set of colors to apply to the final map. It is noted that GADM data does not provide geospatial data of The Vietnamese Paracel and Spratly Islands, so the note must be included.

# Select font

library(showtext) # For using a new font

my_font <- "Roboto Condensed"

font_add_google(name = my_font, family = my_font)

showtext_auto()

# Draw the  final map

my_colors <- c('#f7fcf5','#c7e9c0','#74c476','#238b45','#00441b') # Choose the colors

df_for_mapping %>% 
  ggplot(aes(x = long, y = lat, group = group, fill = forest_cover)) + 
  geom_polygon() + 
  geom_path(color = "grey20", size = 0.001) + 
  coord_map(projection = "albers", lat0 = 30, lat1 = 40) + 
  labs(title = "Vietnam forest cover 2021",
       subtitle = "Note: The Paracel and Spratly Islands of Vietnam are not shown on this map",
       caption = "Data source: GSO, Statistical Yearbook of Vietnam 2021") + 
  theme(axis.line = element_blank(),
        axis.text.x = element_blank(),
        axis.text.y = element_blank(),
        axis.ticks = element_blank(),
        axis.title.x = element_blank(),
        axis.title.y = element_blank(),
        panel.grid = element_blank(),
        plot.background = element_rect(fill = "white", color = NA),
        panel.background = element_rect(fill = "white", color = NA),
        legend.background = element_rect(fill = "white", color = NA),
        panel.border = element_blank()) +
  theme(plot.title = element_text(family = my_font, color = "grey10", size = 23)) + 
  theme(plot.subtitle = element_text(family = my_font, size = 10, color = "gray30", face = "italic")) + 
  theme(plot.caption = element_text(family = my_font, size = 10, colour = "grey40", face = "italic", hjust = 0)) + 
  theme(legend.text = element_text(family = my_font, color = "grey20", size = 10)) + 
  theme(legend.title = element_text(family = my_font, color = "grey20", size = 10)) + 
  theme(plot.margin = unit(c(0.5, 0.3, 0.3, 0.3), "cm")) +
  theme(legend.position = c(1, 0.65)) +
  scale_fill_gradientn(colors = my_colors, name = "% cover") 

7. Save the map

To capture the map, click the button Export and save it as your expected format. The other way is to use the following codes and find it’s location.

# Save the final map

ggsave("my_map.png", dpi = 150)       # We can change the format and dpi

# Find the location of the final map

getwd()
[1] "D:/R/R_2023/gso"