I have been exploring simple world maps with the ‘rworldmap’ package by Andy South. In this post, I’ll just reproduce examples from the vignette. I’m still a newcomer to R, so the basic purpose of this HTML document is to try out the package using the same code.
At the moment I do not have any application in mind, so this is just idle exploration. Maybe later on, I can combine this with some more data processing or make more creative plots. This should serve as a basic template.
First, we load the libraries required for this:
library(rworldmap)
## Warning: package 'rworldmap' was built under R version 3.1.3
## Loading required package: sp
## Warning: package 'sp' was built under R version 3.1.3
## ### Welcome to rworldmap ###
## For a short introduction type : vignette('rworldmap')
library(stringr)
## Warning: package 'stringr' was built under R version 3.1.3
Now load the countryExData dataset that comes with this package.
data("countryExData")
A brief summary of the data is as follows:
class(countryExData)
## [1] "data.frame"
summary(countryExData)
## ISO3V10 Country EPI_regions
## Length:149 Length:149 Length:149
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
##
##
##
##
## GEO_subregion Population2005 GDP_capita.MRYA landlock
## Length:149 Min. : 269.7 Min. : 629.8 Min. :0.0000
## Class :character 1st Qu.: 4382.9 1st Qu.: 2012.9 1st Qu.:0.0000
## Mode :character Median : 10161.0 Median : 5529.4 Median :0.0000
## Mean : 42887.7 Mean : 9976.6 Mean :0.2349
## 3rd Qu.: 29481.8 3rd Qu.:13886.7 3rd Qu.:0.0000
## Max. :1315844.0 Max. :59852.5 Max. :1.0000
## NA's :1
## landarea density EPI ENVHEALTH
## Min. : 1993 Min. : 0.00 Min. :39.10 Min. : 6.00
## 1st Qu.: 65830 1st Qu.: 2.20 1st Qu.:64.00 1st Qu.:58.40
## Median : 237056 Median : 8.70 Median :74.10 Median :84.50
## Mean : 847879 Mean :19.56 Mean :71.87 Mean :74.57
## 3rd Qu.: 669707 3rd Qu.:26.50 3rd Qu.:81.80 3rd Qu.:95.50
## Max. :16679993 Max. :95.30 Max. :95.50 Max. :99.40
##
## ECOSYSTEM ENVHEALTH.1 AIR_E WATER_E
## Min. :37.10 Min. : 6.00 Min. : 44.00 Min. : 0.00
## 1st Qu.:62.80 1st Qu.:58.40 1st Qu.: 87.80 1st Qu.:57.50
## Median :71.30 Median :84.50 Median : 96.10 Median :67.60
## Mean :69.18 Mean :74.57 Mean : 90.72 Mean :66.58
## 3rd Qu.:75.80 3rd Qu.:95.50 3rd Qu.: 98.80 3rd Qu.:79.20
## Max. :92.80 Max. :99.40 Max. :100.00 Max. :99.00
##
## BIODIVERSITY PRODUCTIVE_NATURAL_RESOURCES CLIMATE
## Min. : 0.20 Min. :44.40 Min. :16.10
## 1st Qu.: 23.90 1st Qu.:75.20 1st Qu.:61.90
## Median : 46.50 Median :82.30 Median :72.90
## Mean : 46.35 Mean :79.81 Mean :71.46
## 3rd Qu.: 66.70 3rd Qu.:86.60 3rd Qu.:81.80
## Max. :100.00 Max. :99.00 Max. :99.80
##
## DALY_SC WATER_H AIR_H AIR_E.1
## Min. : 0.00 Min. : 0.00 Min. :16.00 Min. : 44.00
## 1st Qu.:69.40 1st Qu.: 43.80 1st Qu.:52.40 1st Qu.: 87.80
## Median :92.80 Median : 75.70 Median :72.50 Median : 96.10
## Mean :79.86 Mean : 67.96 Mean :70.58 Mean : 90.72
## 3rd Qu.:99.10 3rd Qu.: 96.50 3rd Qu.:92.80 3rd Qu.: 98.80
## Max. :99.80 Max. :100.00 Max. :97.90 Max. :100.00
##
## WATER_E.1 BIODIVERSITY.1 FOREST FISH
## Min. : 0.00 Min. : 0.20 Min. : 0.00 Min. : 0.00
## 1st Qu.:57.50 1st Qu.: 23.90 1st Qu.: 83.08 1st Qu.:53.70
## Median :67.60 Median : 46.50 Median :100.00 Median :77.20
## Mean :66.58 Mean : 46.35 Mean : 88.24 Mean :70.29
## 3rd Qu.:79.20 3rd Qu.: 66.70 3rd Qu.:100.00 3rd Qu.:86.30
## Max. :99.00 Max. :100.00 Max. :100.00 Max. :99.50
## NA's :1 NA's :36
## AGRICULTURE CLIMATE.1 ACSAT_pt WATSUP_pt
## Min. :46.50 Min. :16.10 Min. : 0.00 Min. : 0.00
## 1st Qu.:71.40 1st Qu.:61.90 1st Qu.: 34.50 1st Qu.: 54.20
## Median :77.95 Median :72.90 Median : 73.10 Median : 83.00
## Mean :77.72 Mean :71.46 Mean : 64.76 Mean : 71.15
## 3rd Qu.:82.78 3rd Qu.:81.80 3rd Qu.: 97.70 3rd Qu.: 98.30
## Max. :99.10 Max. :99.80 Max. :100.00 Max. :100.00
## NA's :9
## DALY_pt INDOOR_pt PM10_pt OZONE_H_pt
## Min. : 0.00 Min. : 0.00 Min. : 0.00 Min. : 0.00
## 1st Qu.:69.40 1st Qu.: 20.00 1st Qu.: 59.10 1st Qu.: 98.30
## Median :92.80 Median : 68.40 Median : 83.80 Median : 99.80
## Mean :79.86 Mean : 57.29 Mean : 73.25 Mean : 91.77
## 3rd Qu.:99.10 3rd Qu.: 94.70 3rd Qu.: 95.90 3rd Qu.:100.00
## Max. :99.80 Max. :100.00 Max. :100.00 Max. :100.00
##
## SO2_pt OZONE_E_pt WATQI_pt WATSTR_pt
## Min. : 0.00 Min. : 0.00 Min. : 0.00 Min. : 0.0
## 1st Qu.:91.50 1st Qu.: 95.00 1st Qu.:29.40 1st Qu.: 76.5
## Median :96.70 Median : 99.90 Median :49.60 Median : 93.8
## Mean :91.18 Mean : 90.26 Mean :48.86 Mean : 84.3
## 3rd Qu.:99.10 3rd Qu.:100.00 3rd Qu.:69.20 3rd Qu.:100.0
## Max. :99.90 Max. :100.00 Max. :99.00 Max. :100.0
##
## WATQI_GEMS.station.data FORGRO_pt CRI_pt
## Min. : 0.00 Min. : 0.00 Min. : 0.00
## 1st Qu.:44.90 1st Qu.: 83.08 1st Qu.: 30.40
## Median :65.50 Median :100.00 Median : 72.70
## Mean :61.29 Mean : 88.24 Mean : 63.15
## 3rd Qu.:78.60 3rd Qu.:100.00 3rd Qu.: 97.60
## Max. :99.00 Max. :100.00 Max. :100.00
## NA's :62 NA's :1
## EFFCON_pt AZE_pt MPAEEZ_pt EEZTD_pt
## Min. : 0.00 Min. : 0.00 Min. : 0.0 Min. : 0.00
## 1st Qu.: 11.90 1st Qu.: 19.00 1st Qu.: 1.0 1st Qu.:39.00
## Median : 42.10 Median : 45.70 Median : 9.0 Median :72.30
## Mean : 45.77 Mean : 45.74 Mean : 33.8 Mean :59.87
## 3rd Qu.: 78.60 3rd Qu.: 69.40 3rd Qu.:100.0 3rd Qu.:83.30
## Max. :100.00 Max. :100.00 Max. :100.0 Max. :99.10
## NA's :84 NA's :36
## MTI_pt IRRSTR_pt AGINT_pt AGSUB_pt
## Min. : 0.00 Min. : 0.00 Min. : 0.0 Min. : 0.00
## 1st Qu.: 77.40 1st Qu.: 83.72 1st Qu.: 69.2 1st Qu.: 61.40
## Median : 98.50 Median : 99.90 Median : 90.2 Median :100.00
## Mean : 84.33 Mean : 88.56 Mean : 79.2 Mean : 79.13
## 3rd Qu.:100.00 3rd Qu.:100.00 3rd Qu.: 99.6 3rd Qu.:100.00
## Max. :100.00 Max. :100.00 Max. :100.0 Max. :100.00
## NA's :50 NA's :7 NA's :4
## BURNED_pt PEST_pt GHGCAP_pt CO2IND_pt
## Min. : 0.00 Min. : 0.00 Min. : 0.00 Min. : 0.00
## 1st Qu.: 78.70 1st Qu.: 13.60 1st Qu.: 80.80 1st Qu.: 71.70
## Median : 93.00 Median : 77.30 Median : 89.40 Median : 85.00
## Mean : 82.85 Mean : 59.95 Mean : 83.33 Mean : 78.79
## 3rd Qu.: 98.20 3rd Qu.: 95.50 3rd Qu.: 96.50 3rd Qu.: 96.50
## Max. :100.00 Max. :100.00 Max. :100.00 Max. :100.00
## NA's :6
## CO2KWH_pt ACSAT WATSUP DALY
## Min. : 0.00 Min. : 9.00 Min. : 22.00 Min. : 0.1
## 1st Qu.: 30.50 1st Qu.: 44.00 1st Qu.: 73.00 1st Qu.: 0.5
## Median : 50.50 Median : 77.00 Median : 90.00 Median : 4.0
## Mean : 52.25 Mean : 69.76 Mean : 82.87 Mean : 11.8
## 3rd Qu.: 70.40 3rd Qu.: 98.00 3rd Qu.: 99.00 3rd Qu.: 17.0
## Max. :100.00 Max. :100.00 Max. :100.00 Max. :109.0
##
## INDOOR PM10 OZONE_H SO2
## Min. : 0.00 Min. : 6.40 Min. : 0.0 Min. : 0.000
## 1st Qu.: 5.00 1st Qu.: 24.90 1st Qu.: 0.0 1st Qu.: 0.400
## Median :30.00 Median : 39.30 Median : 4.0 Median : 1.400
## Mean :40.56 Mean : 51.45 Mean : 208.7 Mean : 3.769
## 3rd Qu.:76.00 3rd Qu.: 68.60 3rd Qu.: 31.7 3rd Qu.: 3.600
## Max. :95.00 Max. :181.50 Max. :4948.8 Max. :48.300
##
## OZONE_E WATQI WATQI_GEMS.station.data.1
## Min. :0.000e+00 Min. :34.0 Min. :34.00
## 1st Qu.:0.000e+00 1st Qu.:57.5 1st Qu.:66.85
## Median :5.916e+05 Median :69.7 Median :79.30
## Mean :7.035e+07 Mean :69.2 Mean :76.64
## 3rd Qu.:2.059e+07 3rd Qu.:81.5 3rd Qu.:87.15
## Max. :2.661e+09 Max. :99.4 Max. :99.40
## NA's :62
## WATSTR FORGRO CRI EFFCON
## Min. : 0.00 Min. :0.600 Min. :0.0000 Min. : 0.000
## 1st Qu.: 0.00 1st Qu.:1.000 1st Qu.:0.2000 1st Qu.: 1.200
## Median : 5.60 Median :1.000 Median :0.4000 Median : 4.200
## Mean :14.24 Mean :1.015 Mean :0.3181 Mean : 4.577
## 3rd Qu.:21.30 3rd Qu.:1.100 3rd Qu.:0.5000 3rd Qu.: 7.900
## Max. :90.60 Max. :2.500 Max. :0.5000 Max. :10.000
## NA's :1
## AZE MPAEEZ EEZTD MTI
## Min. : 0.00 Min. : 0.00 Min. :0.0094 Min. :-0.02370
## 1st Qu.: 19.00 1st Qu.: 0.10 1st Qu.:0.1674 1st Qu.:-0.00440
## Median : 45.70 Median : 0.90 Median :0.2769 Median :-0.00030
## Mean : 45.74 Mean : 3.38 Mean :0.4013 Mean :-0.00061
## 3rd Qu.: 69.40 3rd Qu.:10.00 3rd Qu.:0.6099 3rd Qu.: 0.00250
## Max. :100.00 Max. :10.00 Max. :1.0000 Max. : 0.02530
## NA's :84 NA's :36 NA's :50
## IRRSTR AGINT AGSUB BURNED
## Min. : 0.000 Min. : 0.00 Min. : 0.00 Min. : 0.000
## 1st Qu.: 0.000 1st Qu.: 0.30 1st Qu.: 0.00 1st Qu.: 0.250
## Median : 0.100 Median : 6.20 Median : 0.00 Median : 0.900
## Mean : 9.897 Mean :13.41 Mean :10.34 Mean : 2.403
## 3rd Qu.:13.800 3rd Qu.:19.50 3rd Qu.:18.00 3rd Qu.: 2.900
## Max. :98.300 Max. :80.90 Max. :69.00 Max. :21.400
## NA's :7 NA's :4 NA's :6
## PEST GHGCAP CO2IND CO2KWH
## Min. : 0.00 Min. : 1.30 Min. : 0.000 Min. : 0.0
## 1st Qu.: 3.00 1st Qu.: 4.00 1st Qu.: 1.100 1st Qu.: 275.0
## Median :17.00 Median : 7.70 Median : 1.900 Median : 459.0
## Mean :13.19 Mean :10.84 Mean : 2.381 Mean : 452.9
## 3rd Qu.:21.00 3rd Qu.:12.20 3rd Qu.: 2.800 3rd Qu.: 644.8
## Max. :22.00 Max. :54.10 Max. :14.500 Max. :1848.0
##
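The full summary is long, so a narrower view of a few columns may be easier to scan. This is only a sketch: the names `cols` and `gdp_view` are made up here, and the three columns are picked simply because they are the ones the map needs.

```r
library(rworldmap)   # provides the countryExData example data
data("countryExData")

# Keep only the country code, the country name and the per capita GDP
cols <- c("ISO3V10", "Country", "GDP_capita.MRYA")
gdp_view <- countryExData[, cols]

head(gdp_view)                          # first few rows
dim(countryExData)                      # rows and columns of the full data set
sum(is.na(gdp_view$GDP_capita.MRYA))    # how many countries have no GDP value
```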
Let’s say we want to plot the per capita GDP of all the countries on the world map. This data frame holds a tidy data set covering 149 countries. Let’s see which column we need.
nam <- names(countryExData)
indx <- str_detect(nam, "GDP_capita.MRYA")
which(indx == TRUE)
## [1] 6
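A small aside on the lookup above: str_detect() treats its pattern as a regular expression, so the "." in the column name matches any character. It gives the right answer here, but an exact match is a little safer. Here is a minimal base R sketch. To keep it self-contained, `nam` is rebuilt by hand from the first six column names shown in this post, and the `idx_*` names are made up for illustration.

```r
# First six column names, copied from the column listing in this post
nam <- c("ISO3V10", "Country", "EPI_regions",
         "GEO_subregion", "Population2005", "GDP_capita.MRYA")

# Exact match: position of the first element equal to the target
idx_match <- match("GDP_capita.MRYA", nam)

# Same idea with which(); returns every matching position
idx_which <- which(nam == "GDP_capita.MRYA")

# Pattern match with the dot taken literally (fixed = TRUE turns regex off)
idx_grep <- grep("GDP_capita.MRYA", nam, fixed = TRUE)

print(c(idx_match, idx_which, idx_grep))
```

All three give the same position as the str_detect() version.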
Let’s also just display all the columns of this dataset as follows:
print(nam)
## [1] "ISO3V10" "Country"
## [3] "EPI_regions" "GEO_subregion"
## [5] "Population2005" "GDP_capita.MRYA"
## [7] "landlock" "landarea"
## [9] "density" "EPI"
## [11] "ENVHEALTH" "ECOSYSTEM"
## [13] "ENVHEALTH.1" "AIR_E"
## [15] "WATER_E" "BIODIVERSITY"
## [17] "PRODUCTIVE_NATURAL_RESOURCES" "CLIMATE"
## [19] "DALY_SC" "WATER_H"
## [21] "AIR_H" "AIR_E.1"
## [23] "WATER_E.1" "BIODIVERSITY.1"
## [25] "FOREST" "FISH"
## [27] "AGRICULTURE" "CLIMATE.1"
## [29] "ACSAT_pt" "WATSUP_pt"
## [31] "DALY_pt" "INDOOR_pt"
## [33] "PM10_pt" "OZONE_H_pt"
## [35] "SO2_pt" "OZONE_E_pt"
## [37] "WATQI_pt" "WATSTR_pt"
## [39] "WATQI_GEMS.station.data" "FORGRO_pt"
## [41] "CRI_pt" "EFFCON_pt"
## [43] "AZE_pt" "MPAEEZ_pt"
## [45] "EEZTD_pt" "MTI_pt"
## [47] "IRRSTR_pt" "AGINT_pt"
## [49] "AGSUB_pt" "BURNED_pt"
## [51] "PEST_pt" "GHGCAP_pt"
## [53] "CO2IND_pt" "CO2KWH_pt"
## [55] "ACSAT" "WATSUP"
## [57] "DALY" "INDOOR"
## [59] "PM10" "OZONE_H"
## [61] "SO2" "OZONE_E"
## [63] "WATQI" "WATQI_GEMS.station.data.1"
## [65] "WATSTR" "FORGRO"
## [67] "CRI" "EFFCON"
## [69] "AZE" "MPAEEZ"
## [71] "EEZTD" "MTI"
## [73] "IRRSTR" "AGINT"
## [75] "AGSUB" "BURNED"
## [77] "PEST" "GHGCAP"
## [79] "CO2IND" "CO2KWH"
So, we see ‘ISO3V10’, which holds the 3-letter ISO country code that is crucial for this package to work smoothly. If you are using your own dataset, make sure it has such a column. It makes working with this package very easy.
Next, ‘Country’ holds the full country name.
After that come ‘EPI_regions’ and ‘GEO_subregion’, which indicate the region and subregion the country belongs to.
Column 6, ‘GDP_capita.MRYA’, holds the per capita GDP values that we want to plot.
That was a brief summary of the data set.
Now that we have read the data and summarized it, the next step is to join the data set to a map, which gives an object of a special class called ‘SpatialPolygonsDataFrame’. We need to do this because the plotting function uses an object of this class to draw the country map. Here’s the code for it:
sPDF <- joinCountryData2Map( countryExData, joinCode = "ISO3", nameJoinColumn = "ISO3V10")
## 149 codes from your data successfully matched countries in the map
## 0 codes from your data failed to match with a country code in the map
## 95 codes from the map weren't represented in your data
Although the parameters are intuitive to understand, a few words on them follow:
Here, joinCode is an important parameter. Our data uses the 3-letter ISO country code, so we pass “ISO3”. If your dataset has 2-letter country codes, you can use “ISO2”. Another available option is “FIPS”, in case you are using a dataset that comes directly from the US Government. Here is a wiki link on what the Federal Information Processing Standards are: “http://en.wikipedia.org/wiki/Federal_Information_Processing_Standards”. You can use the “NAME” option if the country is specified by name. However, the same country can be written in different ways. For instance, a Dutch dataset could have all the country names in Dutch: ‘The Netherlands’ is ‘Nederland’ in Dutch. This could cause mismatches, so I recommend that you always have a 3-letter country code in your data set.
The next parameter of importance is nameJoinColumn. It is simply the name of the column in your dataset that holds the country code.
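To see the two parameters at work on something other than the bundled data, here is a minimal sketch with a tiny hand-made data frame. Everything in `my_data` is invented for illustration (the column names `iso3` and `score`, and the values). Only joinCountryData2Map() and its joinCode and nameJoinColumn arguments come from the package.

```r
library(rworldmap)

# Hypothetical data: the column names and values below are invented
my_data <- data.frame(
  iso3  = c("NLD", "IND", "BRA"),   # ISO 3-letter country codes
  score = c(1.5, 2.5, 3.5),         # made-up values to map
  stringsAsFactors = FALSE
)

# joinCode says what kind of code we have; nameJoinColumn says where it lives
my_map <- joinCountryData2Map(my_data,
                              joinCode = "ISO3",
                              nameJoinColumn = "iso3")

class(my_map)   # the class of the joined object
```

The join prints the same kind of matched/failed report as above, which is a quick way to spot codes that did not line up with the map.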
Alright, so now we’re ready to plot the dataset. The code for this is as follows:
par(mai=c(0,0,0.2,0),xaxs="i",yaxs="i")
mapCountryData( sPDF, nameColumnToPlot="GDP_capita.MRYA",
mapTitle = "Per Capita GDP Plot by Country" )
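If you want the map in a file rather than on screen, one possible approach is to open a graphics device first and close it afterwards. This is only a sketch under a few assumptions: the file name `gdp_map.pdf` and the page size are arbitrary choices of mine, and `old_par` is just a name used here to put the par() settings back once the plot is done.

```r
library(rworldmap)
data("countryExData")

# Same join as in the post
sPDF <- joinCountryData2Map(countryExData,
                            joinCode = "ISO3",
                            nameJoinColumn = "ISO3V10")

# Hypothetical output file in the session's temporary directory
out_file <- file.path(tempdir(), "gdp_map.pdf")

pdf(out_file, width = 10, height = 6)   # open a PDF device
old_par <- par(mai = c(0, 0, 0.2, 0), xaxs = "i", yaxs = "i")
mapCountryData(sPDF,
               nameColumnToPlot = "GDP_capita.MRYA",
               mapTitle = "Per Capita GDP Plot by Country")
par(old_par)                            # restore the previous settings
dev.off()                               # close the device and write the file
```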
So, there you go. The basic plot was easy! In the next post, we will explore how to go about tweaking and experimenting with more detailed or customized graphs.
Cheers,
Arry87