In statistics and data science, communication is essential. Analyses and conclusions have almost no value if they cannot be presented in a format that suits the context, and they do not matter if they fail to reach the recipient in a clear, easily understandable way. Visualization is key!
The visualization possibilities in RStudio are almost endless. You can create all kinds of charts to visualize your data. When looking at world data, a map is often a good choice because it gives an overview of how a variable differs between countries, for example when visualizing gender distribution or population across the world. A map can be built in several different ways; here I will show how to use the R package rworldmap.
Let’s go!
The package rworldmap has three core functions, which I will describe in this vignette:
joinCountryData2Map() - joins your data to the map so that you can plot it
mapCountryData() - plots a map of country data
mapGriddedData() - plots a map of gridded data
First of all, you need to load the package. (If you don’t have it installed, run install.packages("rworldmap") first.)
library(rworldmap)
## Loading required package: sp
## ### Welcome to rworldmap ###
## For a short introduction type : vignette('rworldmap')
We do this in three steps.
First, read in the data you want to plot. You can use read.csv("filename.csv") or read.table("filename.txt"); type ?read.table for details.
I will use the population dataset that ships with the tidyr package (part of the tidyverse), which comes from the World Health Organization (WHO). I will filter it so that it only contains the latest measured year, 2013. I also load the packages needed for the dataset and the cleaning.
library(janitor)
##
## Attaching package: 'janitor'
## The following objects are masked from 'package:stats':
##
## chisq.test, fisher.test
library(tidyr)
library(tidyverse)
## ── Attaching packages ──────────────────────────────────────────────────────────────────────────── tidyverse 1.3.0 ──
## ✓ ggplot2 3.3.0 ✓ dplyr 0.8.5
## ✓ tibble 3.0.0 ✓ stringr 1.4.0
## ✓ readr 1.3.1 ✓ forcats 0.5.0
## ✓ purrr 0.3.3
## ── Conflicts ─────────────────────────────────────────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
populationData <- population %>%
  filter(year == 2013)
Note: I name the filtered data populationData. FYI, rworldmap also has an example dataset that can be used, called countryExData. In this example I wanted to see how well the matching works with another dataset, but feel free to try countryExData instead.
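If you would rather start from the bundled example data, a minimal sketch for loading it and taking a quick look could be the following. The three columns selected here are simply a few that I picked for illustration; the dataset has many more.

```r
library(rworldmap)

# Load the example dataset that ships with rworldmap
data(countryExData)

# Peek at a few columns: the ISO 3-letter code, the country name
# and one of the indicator columns
head(countryExData[, c("ISO3V10", "Country", "BIODIVERSITY")])
```

The ISO3V10 column holds ISO 3-letter country codes, which makes this dataset easy to join to the map by code instead of by name.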
Now we have the data. Step two, here we come!
Now we use rworldmap’s function joinCountryData2Map() to join the data we read in during step 1 to the map.
nameJoinColumn - the name of the column containing your country identifiers
joinCode - the type of code used. In this case it is country names, "NAME", but other codes are possible, e.g. "ISO3" for ISO 3-letter codes or "UN" for numeric UN codes.
Note: when using names as the identifier, more mismatches may appear because of different spellings and naming conventions. For my dataset populationData I got a few mismatches…
joinData <- joinCountryData2Map( populationData,
joinCode = "NAME",
nameJoinColumn = "country")
## 208 codes from your data successfully matched countries in the map
## 9 codes from your data failed to match with a country code in the map
## 35 codes from the map weren't represented in your data
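To find out which names failed to match, one option is the verbose argument of joinCountryData2Map(), which prints the failed codes instead of only counting them. The sketch below repeats the join with verbose = TRUE and, for comparison, joins the bundled countryExData by ISO 3-letter code. The object names joinDataVerbose and joinDataISO3 are my own and only used for illustration.

```r
library(rworldmap)
library(tidyr)   # provides the `population` dataset
library(dplyr)

populationData <- population %>%
  filter(year == 2013)

# Same join as above, but verbose = TRUE also prints the
# country names that could not be matched to the map
joinDataVerbose <- joinCountryData2Map(populationData,
                                       joinCode = "NAME",
                                       nameJoinColumn = "country",
                                       verbose = TRUE)

# For comparison: joining by ISO 3-letter code avoids spelling issues.
# countryExData stores its codes in the column ISO3V10.
data(countryExData)
joinDataISO3 <- joinCountryData2Map(countryExData,
                                    joinCode = "ISO3",
                                    nameJoinColumn = "ISO3V10")
```

Once you know which names fail, you can rename them in your data to the spelling the map uses, or switch to a code-based join if your data has a code column.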
So, finally, we’ll plot! We plot the joinData object that we created in step 2. I want to see the population, so I set nameColumnToPlot to "population". Calling addMapLegend through do.call lets it receive the output of mapCountryData (stored in theMap), so that the legend and the map display the same data. This approach also gives you more ways to modify the appearance of the legend.
theMap <- mapCountryData( joinData, nameColumnToPlot="population", addLegend=FALSE )
do.call( addMapLegend, c(theMap, legendWidth=1, legendMar = 2))
There is a lot of red on this map, but you can now modify it as you like. One useful parameter is catMethod, which controls how the values are divided into categories and therefore the span of the legend. Here I pass a numeric vector of break points. Playing with it gives different results, for example to investigate which countries have the smallest and the largest populations.
theMap_smallPop <- mapCountryData( joinData, nameColumnToPlot="population", catMethod = c(1344,10000000), addLegend=FALSE )
theMap_ownNr <- mapCountryData( joinData, nameColumnToPlot="population", catMethod = c(150000000, 2000000000), addLegend=FALSE )
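Besides a numeric vector of breaks, catMethod also accepts the name of a built-in method. As a sketch, the example below uses "logFixedWidth", which can suit a skewed variable such as population, together with numCats, colourPalette and mapTitle. The choice of seven categories, the title text and the name theMap_log are my own and only meant as a starting point.

```r
library(rworldmap)
library(tidyr)   # provides the `population` dataset
library(dplyr)

populationData <- population %>%
  filter(year == 2013)

joinData <- joinCountryData2Map(populationData,
                                joinCode = "NAME",
                                nameJoinColumn = "country")

# Let rworldmap compute the breaks on a log scale instead of
# supplying them by hand
theMap_log <- mapCountryData(joinData,
                             nameColumnToPlot = "population",
                             catMethod = "logFixedWidth",
                             numCats = 7,
                             colourPalette = "heat",
                             mapTitle = "Population 2013",
                             addLegend = FALSE)

# Draw the legend from the values returned by mapCountryData
do.call(addMapLegend, c(theMap_log, legendWidth = 1, legendMar = 2))
```

Other method names you can try include "quantiles", "pretty" and "fixedWidth".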
Another good thing to know is how to zoom in on a region. You do that with the mapRegion argument, like this:
mapRegion = "Europe"
theMap_europe <- mapCountryData( joinData, nameColumnToPlot="population", mapRegion = "Europe", addLegend=FALSE )
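If none of the predefined regions fits, mapCountryData() also has xlim and ylim arguments that set the map extent directly in degrees of longitude and latitude. The sketch below uses a bounding box that I chose by eye to roughly cover Europe; both the numbers and the name theMap_box are assumptions for illustration.

```r
library(rworldmap)
library(tidyr)   # provides the `population` dataset
library(dplyr)

populationData <- population %>%
  filter(year == 2013)

joinData <- joinCountryData2Map(populationData,
                                joinCode = "NAME",
                                nameJoinColumn = "country")

# Zoom with an explicit bounding box:
# xlim is longitude (west, east), ylim is latitude (south, north)
theMap_box <- mapCountryData(joinData,
                             nameColumnToPlot = "population",
                             xlim = c(-10, 40),
                             ylim = c(35, 70),
                             addLegend = FALSE)
```

Adjust the limits until the area you care about fills the plot.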
For mapGriddedData() we use the example dataset gridExData that comes with rworldmap. This gives another view that shows more of how the population is distributed.
data(gridExData)
mapGriddedData(gridExData)
I certainly have!