Extra Credit - R rmd files

Mapping with R

A cool thing I learned this summer was how to use R to do GIS analysis. This is a bit of code from a class assignment. We’ll look at data from countries around the world. The data is found in countries.shp, which is a global shapefile with various socio-economic indicators for different countries such as:

Median GDP: (gdp_md_est),
Population: (pop_est),
Income group: (income_grp)

library(sf)
library(ggplot2)
library(RColorBrewer)
library(viridis)
library(ggsn)
library(lubridate)
library(dplyr)

countries <- st_read("countries/countries/countries.shp")

## Reading layer `countries' from data source 
##   `C:\Users\sherl\Documents\Geog 6680 - R\Mod12\countries\countries\countries.shp' 
##   using driver `ESRI Shapefile'
## Simple feature collection with 177 features and 64 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: -180 ymin: -90 xmax: 180 ymax: 83.64513
## CRS:           NA

countries = countries[ , c("name", "gdp_md_est", "pop_est", "income_grp")]

ggplot() + geom_sf(data=countries, aes(fill = gdp_md_est)) + 
  scale_fill_viridis(option = "viridis", name = "Estimated Median GDP") + theme_bw() + 
  ggtitle("Estimated Median GDP", "by Country") +
  north(countries, location = "topright", symbol = 2, scale = 0.1)

It is very difficult to see the differences between the GDPs for the countries because there is such a large range between the poorest and richest countries. Taking the log() of the GDP data will make for a better scale to visualize the differences in the GDPs. Before taking log(), we need to be sure that the GDP data is greater than zero (gdp_me_est > 0). We’ll first look to see if the country dataframe has any info that is 0 or negative.

# make columns for the log of the population and gdp variables to better display differences between the countries

# first check for values <= 0
countries[countries$gdp_md_est < 1, ]

## Simple feature collection with 1 feature and 4 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: -17.06342 ymin: 20.99975 xmax: -8.665124 ymax: 27.65643
## CRS:           NA
##          name gdp_md_est pop_est    income_grp                       geometry
## 138 W. Sahara        -99     -99 5. Low income MULTIPOLYGON (((-8.794884 2...

countries[countries$pop_est < 1, ]

## Simple feature collection with 1 feature and 4 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: -17.06342 ymin: 20.99975 xmax: -8.665124 ymax: 27.65643
## CRS:           NA
##          name gdp_md_est pop_est    income_grp                       geometry
## 138 W. Sahara        -99     -99 5. Low income MULTIPOLYGON (((-8.794884 2...

Looks like W. Sahara is going to cause an issue. I will remove W. Sahara from the dataset, then take the log() of the population and GDP data for the remaining countries.

# remove W. Sahara
countries2 <- countries[countries$pop_est > 0, ]

# take logs of population and gdp columns
countries2$log_gdp_md_est = log(countries2$gdp_md_est)
countries2$log_pop_est = log(countries2$pop_est)

# create GDP and population maps
ggplot() + geom_sf(data=countries2, aes(fill = log_gdp_md_est)) + 
  scale_fill_viridis(option = "viridis", name = "log(Estimated Median GDP)") + theme_bw() + 
  ggtitle("World Estimated Median GDP", "by Country") +
  north(countries2, location = "topright", symbol = 2, scale = 0.1)

This map looks much better. It is easier to see the spread of GDP beyond 1st world countries. I can create a similar map using population data.

# create map for population
ggplot() + geom_sf(data=countries2, aes(fill = log_pop_est)) + 
  scale_fill_viridis(option = "inferno", name = "log(Estimated Population)") + theme_bw() + 
  ggtitle("Estimated Population", "by Country") +
  north(countries2, location = "topright", symbol = 2, scale = 0.1)

The scales for these maps look much better. It is easier to see the differences between the countries for the two data sets.

I will also create a map for displaying the countries by their income groups.

ggplot() + geom_sf(data=countries, aes(fill = income_grp)) + 
  scale_fill_brewer(palette = "Dark2", name = "Income Group") + theme_bw() + 
  north(countries, location = "topright", symbol = 2, scale = 0.1)

Extra Credit - R rmd files

Sherlyn Hilton

2023-09-24

Mapping with R