Introduction

In this tutorial, I explain how to create Cartograms in R using the cartogram and tmap packages.

About Cartogram

Cartograms are thematic maps that alter the sizes and shapes of geographic regions so that each region's area reflects a data variable, such as population, rather than its land area. This reshaping makes data trends visible that are not easily discernible on standard maps.

Why Cartogram

Cartograms offer several advantages over traditional thematic maps:

  1. Enhanced Visualization: Cartograms provide a visually appealing way to represent complex data, making it easier to identify spatial patterns and relationships. They make a map more engaging and memorable, which enhances its impact.

  2. Effective Communication: By distorting geographic shapes based on data variables, cartograms can effectively communicate key insights to a wide audience.

  3. Highlighting Disparities: Cartograms can highlight regional disparities or concentrations of data, helping to draw attention to areas of interest.

  4. Comparative Analysis: Cartograms enable easy comparison between regions by resizing them according to the same variable, allowing for clearer insights into relative values.

Cartograms are valuable tools to analyze and communicate spatial data in a compelling and accessible manner.

Types

There are several types of cartograms that you can create using various packages:

  • Dorling Cartograms: Represent geographic regions as circles or bubbles, resizing them based on a specific variable.

  • Contiguous Area Cartograms: Maintain the topological relationships between neighboring regions while resizing them according to a variable.

  • Non-Contiguous Area Cartograms: Do not preserve topological relationships, allowing for more flexibility in resizing regions based on data variables.

Required Packages

Before we begin, make sure the following packages are installed. I wrote a custom function, install_if_missing, that installs a package only if it is not already installed. For this tutorial we need the dplyr, tmap, sf, and cartogram packages.

  • dplyr: data manipulation package for streamlined data wrangling tasks in R.

  • tmap: mapping package designed for creating thematic maps and spatial visualizations in R.

  • cartogram: provides functions for creating cartograms, thematic maps that distort the shapes of geographic regions to represent data variables.

  • sf: provides tools for working with spatial data, enabling efficient manipulation, analysis, and visualization of geographic data.
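
The install_if_missing helper mentioned above is not listed in this tutorial, so the following is only a minimal sketch of what such a function might look like, assuming it takes a single package name. It uses requireNamespace() to check whether the package is available and calls install.packages() only when it is not. The return value (TRUE if an install was attempted) is my own addition for illustration.

```r
# Hypothetical implementation of install_if_missing (the original is not shown).
# Installs `pkg` only when it is not already available.
install_if_missing <- function(pkg) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    return(invisible(FALSE))  # already installed, nothing to do
  }
  install.packages(pkg)
  invisible(TRUE)             # an install was attempted
}

# "stats" ships with R, so this call returns without installing anything
install_if_missing("stats")
```

To cover this tutorial's packages, the helper could be applied to each name in turn, for example with invisible(lapply(c("dplyr", "tmap", "sf", "cartogram"), install_if_missing)).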

Load Libraries

Once the required packages are installed, we load them into our document.

library(dplyr)
library(tmap)
library(cartogram)
library(sf)

Get Spatial Data

To demonstrate Cartograms properly, we need a spatial dataset to analyze and visualize. For this tutorial I use a dataset that is built into tmap.

data(World)

World is a built-in tmap dataset of country polygons with attributes such as population and GDP per capita.

southAmerica_data <- World[World$continent == "South America", ]

To narrow the scope, I consider only the South American continent. The line above filters World on the continent column and returns the rows for South America.

Data

Let's take a closer look at the South America dataset.

head(southAmerica_data)
## Simple feature collection with 6 features and 15 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: -80.96777 ymin: -55.61183 xmax: -34.72999 ymax: 12.4373
## Geodetic CRS:  WGS 84
##    iso_a3      name sovereignt     continent           area   pop_est
## 5     ARG Argentina  Argentina South America 2736690 [km^2]  40913584
## 22    BOL   Bolivia    Bolivia South America 1083300 [km^2]   9775246
## 23    BRA    Brazil     Brazil South America 8358140 [km^2] 198739269
## 30    CHL     Chile      Chile South America  743532 [km^2]  16601707
## 36    COL  Colombia   Colombia South America 1109500 [km^2]  45644023
## 47    ECU   Ecuador    Ecuador South America  248360 [km^2]  14573101
##    pop_est_dens                  economy             income_grp gdp_cap_est
## 5     14.950025  5. Emerging region: G20 3. Upper middle income   14027.126
## 22     9.023582  5. Emerging region: G20 4. Lower middle income    4426.487
## 23    23.777930 3. Emerging region: BRIC 3. Upper middle income   10028.214
## 30    22.328167  5. Emerging region: G20 3. Upper middle income   14727.401
## 36    41.139273     6. Developing region 3. Upper middle income    8662.690
## 47    58.677327     6. Developing region 3. Upper middle income    7390.328
##    life_exp well_being footprint inequality      HPI
## 5    75.927        6.5      3.14  0.1642383 35.19024
## 22   67.450        6.0      2.96  0.3498051 23.32149
## 23   73.907        6.9      3.11  0.2163215 34.34498
## 30   81.050        6.6      4.36  0.1430682 31.66552
## 36   73.673        6.4      1.87  0.2350440 40.69501
## 47   75.449        6.0      2.17  0.2188788 37.04272
##                          geometry
## 5  MULTIPOLYGON (((-65.5 -55.2...
## 22 MULTIPOLYGON (((-62.84647 -...
## 23 MULTIPOLYGON (((-57.62513 -...
## 30 MULTIPOLYGON (((-68.63401 -...
## 36 MULTIPOLYGON (((-75.37322 -...
## 47 MULTIPOLYGON (((-80.30256 -...

The dataset contains information about South American countries, including their names, population estimates (pop_est), and geometries.

southAmerica_data <- st_transform(southAmerica_data, 3395)

Next, I transform the data into a usable format. The st_transform() function from the sf package converts spatial objects, such as polygons or points, from one coordinate reference system (CRS) to another.

In the code above, southAmerica_data is transformed from its current CRS (WGS 84, as shown in the output) to EPSG:3395 (World Mercator), a projected CRS. The cartogram functions expect projected coordinates rather than longitude/latitude.
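
To see what the reprojection does, here is a small self-contained sketch that transforms a single point rather than the whole dataset. The coordinates are an illustrative value I chose (roughly Buenos Aires), not something taken from the World data. st_is_longlat() reports whether an object is still in longitude/latitude.

```r
library(sf)

# One illustrative point in longitude/latitude (WGS 84, EPSG:4326)
pt_lonlat <- st_sfc(st_point(c(-58.38, -34.60)), crs = 4326)

# Reproject to EPSG:3395, the same target CRS used for southAmerica_data
pt_merc <- st_transform(pt_lonlat, 3395)

st_is_longlat(pt_lonlat)  # TRUE: coordinates are degrees
st_is_longlat(pt_merc)    # FALSE: coordinates are now in metres
st_coordinates(pt_merc)   # projected X/Y values
```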

Plotting Data

Here we plot the data to understand it better. In this tutorial we use the population estimate, so we plot its distribution across the South American countries. The population varies a lot between countries, so I also plot its square root to make the distribution easier to read.

# Two bar charts side by side: raw values and square-root-transformed values
par(mfrow = c(1, 2))
counts <- southAmerica_data$pop_est
country_label <- southAmerica_data$name
barplot(counts, names.arg = country_label, main = "Population Distribution in SA",
        col = "skyblue", las = 2)
# The square root compresses the range so smaller countries stay visible
barplot(sqrt(counts), names.arg = country_label, main = "Square Root of Population in SA",
        ylab = "sqrt(Population)", col = "skyblue", las = 2)
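
To make the effect of the square root concrete, this short sketch uses the six population estimates printed in the head() output above and compares the largest-to-smallest ratio before and after the transform. It is only an illustration with a subset of the countries, so the ratios for the full dataset may differ.

```r
# pop_est values for the six countries shown in the head() output
pop <- c(Argentina = 40913584, Bolivia = 9775246, Brazil = 198739269,
         Chile = 16601707, Colombia = 45644023, Ecuador = 14573101)

ratio_raw  <- max(pop) / min(pop)              # Brazil vs Bolivia, about 20 to 1
ratio_sqrt <- max(sqrt(pop)) / min(sqrt(pop))  # about 4.5 to 1 after the transform

round(c(raw = ratio_raw, sqrt = ratio_sqrt), 2)
```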

Plotting a Map with tmap

Using the tmap package, we first look at the data as a plain map.

tm_shape(southAmerica_data) +
  tm_borders() +
  tm_layout(main.title = "Map of South America", 
    main.title.position = "center")

We can see the visual representation of South America from the data.

Creating Cartogram

Now, let’s create a Cartogram by adjusting the size of geographic regions based on a variable of interest.

Add Variable

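For this example, I use the variable pop_est (population estimate) to create the Cartograms. I use the population estimate directly as the size of the regions and store a copy of it in a size column. The cartogram calls in the following sections reference pop_est by name, so the size column is only a convenience copy.

southAmerica_data <- southAmerica_data %>%
  mutate(size = pop_est)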

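Contiguous Area Cartogram

Contiguous Area Cartograms keep neighboring regions connected but distort their shapes so that each region's area reflects the variable being mapped. Here cartogram_cont() is run with itermax = 5, which caps the number of iterations of the distortion algorithm.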

# Distort the polygons so that area tracks pop_est (at most 5 iterations)
southAmerica_cont <- cartogram_cont(southAmerica_data, "pop_est", itermax = 5)
tm_shape(southAmerica_cont) +
  tm_polygons("pop_est", style = "cat", palette = "RdYlBu", title = "Population") +
  tm_layout(frame = FALSE, legend.position = c("right", "bottom"),
    main.title = "Contiguous Area Cartogram", main.title.size = 0.8,
    main.title.position = "center")
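
One way to check how far the distortion has gone is to compare each country's share of the total map area with its share of the total population, before and after the transformation. The sketch below rebuilds the data from scratch so it can run on its own; the helper share() and the names sa, sa_cont, err_before, and err_after are my own and not part of any package. With only five iterations the match is not expected to be exact.

```r
library(sf)
library(tmap)       # provides the World dataset
library(cartogram)

data(World)
sa <- st_transform(World[World$continent == "South America", ], 3395)
sa_cont <- cartogram_cont(sa, "pop_est", itermax = 5)

# Hypothetical helper: convert a numeric vector to shares that sum to 1
share <- function(x) as.numeric(x) / sum(as.numeric(x))

area_before <- share(st_area(sa))
area_after  <- share(st_area(sa_cont))

# Mean absolute gap between area share and population share
err_before <- mean(abs(area_before - share(sa$pop_est)))
err_after  <- mean(abs(area_after - share(sa_cont$pop_est)))
c(before = err_before, after = err_after)
```

A smaller value after the transformation means that area tracks population more closely than it does on the original map.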

Non-contiguous Area Cartogram

Non-Contiguous Area Cartograms, also called Discontinuous or Disjoint Cartograms, resize each region according to a specific variable while allowing gaps to open up between neighbors. Unlike Contiguous Area Cartograms, where regions stay connected and their shapes are distorted, Non-Contiguous Area Cartograms keep each region's original shape and scale it in place, so adjacent regions may no longer touch.

# Rescale each country in place; the original borders are drawn underneath for reference
southAmerica_ncont <- cartogram_ncont(southAmerica_data, "pop_est")
tm_shape(southAmerica_data) + tm_borders() +
  tm_shape(southAmerica_ncont) +
  tm_polygons("pop_est", style = "cat", palette = "RdYlBu", title = "Population") +
  tm_layout(frame = FALSE, legend.position = c("right", "bottom"),
    main.title = "Non-Contiguous Area Cartogram", main.title.size = 0.8,
    main.title.position = "center")

The Non-Contiguous Cartogram adjusts the size of each region based on its population estimate while keeping its original shape, and the original borders drawn underneath show how much each region has changed.
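
The resizing step behind a non-contiguous cartogram can be illustrated with simple arithmetic. The sketch below uses the area and pop_est values from the head() output above and assumes the common convention that the densest region keeps its original size while every other region shrinks. I have not verified that cartogram_ncont() uses exactly this anchoring, so treat it as an illustration of the idea rather than the package's implementation. Because area grows with the square of a linear scale factor, the factor is the square root of the relative density.

```r
# area (km^2) and pop_est for the six countries in the head() output
area <- c(Argentina = 2736690, Bolivia = 1083300, Brazil = 8358140,
          Chile = 743532, Colombia = 1109500, Ecuador = 248360)
pop  <- c(Argentina = 40913584, Bolivia = 9775246, Brazil = 198739269,
          Chile = 16601707, Colombia = 45644023, Ecuador = 14573101)

density <- pop / area

# Linear scale factor per country: 1 for the densest, below 1 for the rest
scale_factor <- sqrt(density / max(density))

# After scaling, area is proportional to population
scaled_area <- area * scale_factor^2
round(scale_factor, 2)
```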

Dorling Cartogram

A Dorling Cartogram, also known as a Dorling Area Cartogram, is a type of thematic map where geographic regions are represented by circles or bubbles. Each circle's area is proportional to a specific variable, here the population estimate.

# Replace each country with a circle sized by pop_est
southAmerica_dorling <- cartogram_dorling(southAmerica_data, "pop_est")
tm_shape(southAmerica_data) + tm_borders() +
  tm_shape(southAmerica_dorling) +
  tm_polygons("pop_est", style = "cat", palette = "RdYlBu", title = "Population") +
  tm_layout(frame = FALSE, legend.position = c("right", "bottom"),
    main.title = "Dorling Cartogram", main.title.size = 0.8,
    main.title.position = "center")
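
The sizing rule for the circles follows from the circle area formula: if area should be proportional to the variable, the radius must be proportional to its square root. The sketch below shows this with the pop_est values from the head() output above. The radii are in arbitrary units, and the final placement of the circles, including moving them apart so they do not overlap, is left to cartogram_dorling().

```r
pop <- c(Argentina = 40913584, Bolivia = 9775246, Brazil = 198739269,
         Chile = 16601707, Colombia = 45644023, Ecuador = 14573101)

# Radius such that circle area (pi * r^2) equals the value, in arbitrary units
radius <- sqrt(pop / pi)

# Express radii relative to the largest circle
rel_radius <- radius / max(radius)
round(rel_radius, 2)

# Circle areas recover the original population proportions
circle_area <- pi * radius^2
all.equal(circle_area / sum(circle_area), pop / sum(pop))
```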

Conclusion

Here I have created Cartograms in R using spatial data from the tmap package and the functions in the cartogram package. We can now see how Cartograms are useful for visualizing geographic data.