From Gapminder (https://www.gapminder.org/data/) I downloaded a CSV file named “CO2 emissions (tonnes per person)”.

1. Read and prepare the data

The CSV file lists the countries of the world and how many tonnes of CO2 per person each one emitted into the air. The records run from 1960 to 2014. Many values are missing for the earlier years.

setwd("C:/workspace")
co2=read.csv("co2emission.csv",header=TRUE,stringsAsFactors=FALSE)
head(co2)
##   癤풠ountry.Name Country.Code                         Indicator.Name
## 1           Aruba          ABW CO2 emissions (metric tons per capita)
## 2     Afghanistan          AFG CO2 emissions (metric tons per capita)
## 3          Angola          AGO CO2 emissions (metric tons per capita)
## 4         Albania          ALB CO2 emissions (metric tons per capita)
## 5         Andorra          AND CO2 emissions (metric tons per capita)
## 6      Arab World          ARB CO2 emissions (metric tons per capita)
##       X1960      X1961      X1962      X1963      X1964     X1965
## 1        NA         NA         NA         NA         NA        NA
## 2 0.0460599 0.05360430 0.07376479 0.07423269 0.08629245 0.1014674
## 3 0.0974716 0.07903808 0.20128908 0.19253474 0.20100336 0.1915284
## 4 1.2581949 1.37418605 1.43995596 1.18168114 1.11174196 1.1660990
## 5        NA         NA         NA         NA         NA        NA
## 6 0.6436890 0.68515088 0.76085451 0.87494119 0.99909766 1.1657054
##       X1966     X1967     X1968      X1969     X1970     X1971     X1972
## 1        NA        NA        NA         NA        NA        NA        NA
## 2 0.1076370 0.1237343 0.1154977 0.08682346 0.1502906 0.1660420 0.1307638
## 3 0.2464128 0.1549116 0.2563160 0.41955056 0.5286980 0.4923022 0.6352147
## 4 1.3330555 1.3637463 1.5195513 1.55896757 1.7532399 1.9894979 2.5159144
## 5        NA        NA        NA         NA        NA        NA        NA
## 6 1.2726507 1.3314044 1.5449425 1.78991339 1.8012455 1.9934862 2.1098716
##       X1973     X1974     X1975     X1976     X1977     X1978     X1979
## 1        NA        NA        NA        NA        NA        NA        NA
## 2 0.1362798 0.1556494 0.1689286 0.1547872 0.1829636 0.1631596 0.1683767
## 3 0.6706243 0.6520234 0.5746931 0.4158503 0.4347550 0.6461792 0.6369442
## 4 2.3038974 1.8490067 1.9106336 2.0135846 2.2758764 2.5306250 2.8982085
## 5        NA        NA        NA        NA        NA        NA        NA
## 6 2.3967997 2.2734407 2.1842635 2.5687100 2.6316284 2.7432413 2.8426551
##       X1980     X1981     X1982     X1983     X1984     X1985     X1986
## 1        NA        NA        NA        NA        NA        NA 2.8683194
## 2 0.1328586 0.1519729 0.1648039 0.2036356 0.2349877 0.2978277 0.2708911
## 3 0.5987173 0.5712019 0.4852515 0.5150715 0.4873957 0.4431214 0.4267687
## 4 1.9350583 2.6930239 2.6248568 2.6832399 2.6942914 2.6580154 2.6653562
## 5        NA        NA        NA        NA        NA        NA        NA
## 6 3.0692088 2.9070579 2.7011166 2.7933569 2.9563182 3.0355580 3.2555366
##       X1987      X1988      X1989      X1990      X1991       X1992
## 1 7.2351980 10.0261792 10.6347326 26.3745032 26.0461298 21.44255880
## 2 0.2716117  0.2484726  0.2356946  0.2134498  0.1876727  0.09966647
## 3 0.5184278  0.4455573  0.4235243  0.4202843  0.4054501  0.40067865
## 4 2.4140608  2.3315985  2.7832431  1.6781067  1.3122126  0.77472491
## 5        NA         NA         NA  7.4673357  7.1824566  6.91205339
## 6 3.1688219  3.2644890  3.2261271  2.9890081  3.2072246  3.38524700
##         X1993       X1994       X1995      X1996       X1997       X1998
## 1 22.00078616 21.03624511 20.77193616 20.3183534 20.42681771 20.58766915
## 2  0.08915404  0.08003917  0.07269862  0.0660447  0.05964838  0.05520717
## 3  0.43088926  0.28109258  0.76917343  0.7123063  0.48920938  0.47137391
## 4  0.72379029  0.60020371  0.65453713  0.6366253  0.49036506  0.56027144
## 5  6.73605485  6.49420042  6.66205168  7.0650715  7.23971272  7.66078389
## 6  3.63837855  3.64485889  3.39819977  3.3047937  3.12484849  3.32954828
##        X1999       X2000       X2001       X2002      X2003       X2004
## 1 20.3115668 26.19487524 25.93402441 25.67116178 26.4204521 26.51729342
## 2  0.0423326  0.03850634  0.03900233  0.04871555  0.0518296  0.03937783
## 3  0.5740836  0.58035266  0.57304749  0.72076885  0.4979751  0.99616548
## 4  0.9601644  0.97817468  1.05330418  1.22954071  1.4126972  1.37621273
## 5  7.9754544  8.01928429  7.78695000  7.59061514  7.3157607  7.35862494
## 6  3.3095534  3.68444127  3.59030296  3.58803558  3.7798890  4.05146517
##         X2005       X2006       X2007      X2008      X2009     X2010
## 1 27.20070778 26.94826047 27.89557400 26.2308466 25.9158329 24.670529
## 2  0.05294821  0.06372847  0.08541751  0.1541014  0.2417227  0.293837
## 3  0.97974003  1.09888390  1.19784398  1.1815268  1.2324945  1.243406
## 4  1.41249821  1.30257637  1.32233486  1.4843111  1.4956002  1.578574
## 5  7.29987194  6.74621872  6.51946591  6.4278866  6.1216523  6.122595
## 6  4.16848626  4.26823987  4.10022627  4.3904014  4.5421515  4.615758
##        X2011      X2012     X2013    X2014
## 1 24.5058352 13.1555417 8.3512943 8.408363
## 2  0.4120169  0.3503706 0.3156018 0.299445
## 3  1.2527893  1.3308430 1.2546172 1.291328
## 4  1.8037147  1.6929083 1.7492111 1.978763
## 5  5.8671299  5.9165969 5.9007526 5.832170
## 6  4.5377552  4.8136307 4.6504742 4.860234
names(co2) # encoding error: the first column name is garbled
##  [1] "癤풠ountry.Name" "Country.Code"    "Indicator.Name" 
##  [4] "X1960"           "X1961"           "X1962"          
##  [7] "X1963"           "X1964"           "X1965"          
## [10] "X1966"           "X1967"           "X1968"          
## [13] "X1969"           "X1970"           "X1971"          
## [16] "X1972"           "X1973"           "X1974"          
## [19] "X1975"           "X1976"           "X1977"          
## [22] "X1978"           "X1979"           "X1980"          
## [25] "X1981"           "X1982"           "X1983"          
## [28] "X1984"           "X1985"           "X1986"          
## [31] "X1987"           "X1988"           "X1989"          
## [34] "X1990"           "X1991"           "X1992"          
## [37] "X1993"           "X1994"           "X1995"          
## [40] "X1996"           "X1997"           "X1998"          
## [43] "X1999"           "X2000"           "X2001"          
## [46] "X2002"           "X2003"           "X2004"          
## [49] "X2005"           "X2006"           "X2007"          
## [52] "X2008"           "X2009"           "X2010"          
## [55] "X2011"           "X2012"           "X2013"          
## [58] "X2014"
names(co2)[1]="country name"
which(colnames(co2)=="X2014") 
## [1] 58
names(co2)[which(colnames(co2)=="X2014")]="yr2014" #change a variable name
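The garbled first column name looks like a UTF-8 byte-order mark that was decoded with a different encoding. As a minimal sketch, assuming that is the cause, passing fileEncoding = "UTF-8-BOM" to read.csv should strip the mark at read time, so the manual rename of the first column is no longer needed. The temporary file and its three columns below are invented for illustration; only Aruba's 2014 value is borrowed from the output above.

```r
# Write a tiny CSV that starts with a UTF-8 byte-order mark (EF BB BF),
# imitating the header of the downloaded file (hypothetical content).
tmp <- tempfile(fileext = ".csv")
con <- file(tmp, open = "wb")
writeBin(as.raw(c(0xEF, 0xBB, 0xBF)), con)
writeBin(charToRaw("Country Name,Country Code,2014\nAruba,ABW,8.408363\n"), con)
close(con)

# fileEncoding = "UTF-8-BOM" asks R to drop the mark while reading.
demo <- read.csv(tmp, fileEncoding = "UTF-8-BOM", stringsAsFactors = FALSE)
names(demo)
unlink(tmp)
```

If the real file behaves the same way, the first column should come back as Country.Name instead of the garbled name.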

2. Load the packages and download a world map

library(maps)  
library(ggplot2)
library(RColorBrewer)
map.world=map_data("world")  # world map polygons from the maps package, not a Google map

Note that map.world is a data frame with the columns ‘long’ (longitude), ‘lat’ (latitude), ‘group’, ‘order’, ‘region’ (country name), and ‘subregion’. It holds over 80,000 (long, lat) points, which are later connected to draw the world map.

3. Merge the map data and the co2 data frame

To show a variable on the world map, we have to merge the co2 data frame with the map.world data frame. I got a lot of help from a YouTube video from mitcourseware: https://www.youtube.com/watch?v=2rnsbodsJVc

map.world=merge(map_data("world"),co2,by.x="region",by.y="country name")
ggplot(map.world,aes(x=long,y=lat,group=group))+geom_polygon(fill="white",color="black")

Two strange things show up here. First, ‘merge’ returns the rows in a different order, so the coordinates are connected in the wrong sequence and the map turns into a mess. Look at the United States. We have to restore the right order.

4. Order the data frame

map.world=map.world[order(map.world$group,map.world$order),]
ggplot(map.world,aes(x=long,y=lat,group=group))+geom_polygon(aes(fill=yr2014),color="azure")   # fill by CO2 emissions per person in 2014
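The reordering problem and its fix can be reproduced without any map at all. The sketch below uses two invented data frames (toy_map and toy_co2 are hypothetical stand-ins for map_data("world") and co2) to show that merge sorts the result by the key column, and that sorting by group and order afterwards brings back the drawing sequence.

```r
# Toy stand-in for map_data("world"): two polygons whose points are
# numbered by `order`, the sequence in which they must be connected.
toy_map <- data.frame(
  region = c("B", "B", "B", "A", "A", "A"),
  group  = c(1, 1, 1, 2, 2, 2),
  order  = 1:6,
  long   = c(0, 1, 1, 5, 6, 6),
  lat    = c(0, 0, 1, 5, 5, 6),
  stringsAsFactors = FALSE
)
# Toy stand-in for the co2 data frame (values are made up).
toy_co2 <- data.frame(name = c("A", "B"), yr2014 = c(1.5, 3.2),
                      stringsAsFactors = FALSE)

# merge() sorts by the key column, so region "A" now comes first
# and the `order` column is no longer increasing.
merged <- merge(toy_map, toy_co2, by.x = "region", by.y = "name")
merged$order

# Sorting by group, then order, restores the drawing sequence.
restored <- merged[order(merged$group, merged$order), ]
restored$order
```

geom_polygon connects the points in row order within each group, which is why the unsorted merge result draws lines across the map.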

The second problem is that many countries have no fill colour. 1) Some values are simply missing in the data frame. Nothing can be done about that, and they should not be replaced with 0 or the mean. 2) Others are missing because the country names in co2 and map.world differ. For example, the CSV file says “United States” where the map data says “USA”. There are many such cases. The important lesson: keep naming standards consistent!
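One way to find the mismatched names, rather than spotting them on the map, is to compare the two name columns with setdiff. This is only a sketch: the two vectors below are short hand-written stand-ins, and in the real session they would presumably be unique(map_data("world")$region) and the first column of co2.

```r
# Assumed stand-ins for the two name columns.
map_names <- c("USA", "UK", "South Korea", "Russia", "Qatar")
co2_names <- c("United States", "United Kingdom", "Korea, Rep.",
               "Russian Federation", "Qatar", "Arab World")

# Names in the CO2 data that have no counterpart in the map data.
unmatched <- setdiff(co2_names, map_names)
unmatched
```

Everything that comes back either needs renaming (such as “United States”) or is a regional aggregate with no polygon of its own (such as “Arab World”).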

5. Modifying country names

co2$`country name`[co2$`country name`=="United States"]="USA"
co2$`country name`[co2$`country name`=="United Kingdom"]="UK"
co2$`country name`[co2$`country name`=="Korea, Rep."]="South Korea"
co2$`country name`[co2$`country name`=="Korea, Dem. People’s Rep."]="North Korea"
co2$`country name`[co2$`country name`=="Russian Federation"]="Russia"
co2$`country name`[co2$`country name`=="Qatar"]="Qatar"
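With many names to fix, the one-line-per-country pattern gets long. A possible alternative, sketched here under the assumption that all replacements are exact string matches, is to keep them in a named character vector and apply them in one step. The lookup vector and the country vector are hypothetical; country stands in for the name column of co2.

```r
# Hypothetical lookup table: CSV spelling -> map spelling.
lookup <- c(
  "United States"      = "USA",
  "United Kingdom"     = "UK",
  "Korea, Rep."        = "South Korea",
  "Russian Federation" = "Russia"
)

# Toy stand-in for the country name column of co2.
country <- c("Qatar", "United States", "Korea, Rep.", "Albania")

# Replace only the names that appear in the lookup; leave the rest untouched.
hit <- country %in% names(lookup)
country[hit] <- unname(lookup[country[hit]])
country
```

New mismatches then only need one more entry in the lookup vector.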

6. Make a plot again

map.world=merge(map_data("world"),co2,by.x="region",by.y="country name")
ggplot(map.world,aes(x=long,y=lat,group=group))+geom_polygon(aes(fill=yr2014),color="azure")

map.world=map.world[order(map.world$group,map.world$order),]
ggplot(map.world,aes(x=long,y=lat,group=group))+geom_polygon(aes(fill=yr2014),color="azure")+ scale_fill_gradientn(colours = brewer.pal(8, "RdYlBu")[4:1])

Without scale_fill_gradientn, the fill would be the default gradient of blues, which is a little hard to read. I changed the colours to a range from yellow to red.

7. Results

Let’s look at the map and sort the data.

head(co2[order(co2$yr2014,decreasing=TRUE),][1],n=15) # top 15 entries
##                  country name
## 199                     Qatar
## 50                    Curacao
## 241       Trinidad and Tobago
## 126                    Kuwait
## 21                    Bahrain
## 7        United Arab Emirates
## 30          Brunei Darussalam
## 204              Saudi Arabia
## 224 Sint Maarten (Dutch part)
## 143                Luxembourg
## 250                       USA
## 169             North America
## 171             New Caledonia
## 83                  Gibraltar
## 181                      Oman

The biggest CO2 emitters per capita are, in order, Qatar, Curacao, Trinidad and Tobago, Kuwait, Bahrain, the UAE, Brunei, Saudi Arabia, Sint Maarten, Luxembourg, the US, New Caledonia, Gibraltar, Oman, Australia, Canada… (“North America” in the output is a regional aggregate, not a country.) Many Middle Eastern countries rank near the top, while China and India are not on the list. That may come as a surprise, because those two countries are known for emitting tremendous amounts of CO2. Their total emissions are large, but their emissions per person are not.
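Regional aggregates such as “North America” or “Arab World” sit in the same column as countries, so they can slip into a ranking. A rough sketch of one way to keep them out is to rank only the rows whose name also exists in the map data. The six rows below are copied from the head(co2) output in section 1; map_regions is an assumed stand-in for unique(map_data("world")$region).

```r
# Six rows copied from the head(co2) output (2014 values).
co2_toy <- data.frame(
  name   = c("Aruba", "Afghanistan", "Angola", "Albania", "Andorra", "Arab World"),
  yr2014 = c(8.408363, 0.299445, 1.291328, 1.978763, 5.832170, 4.860234),
  stringsAsFactors = FALSE
)

# Assumed stand-in for the region names available in the map data.
map_regions <- c("Aruba", "Afghanistan", "Angola", "Albania", "Andorra")

# Keep only rows that match a map region, then sort by 2014 emissions.
countries <- co2_toy[co2_toy$name %in% map_regions, ]
ranked <- countries[order(countries$yr2014, decreasing = TRUE), ]
ranked$name
```

This relies on the country names already being harmonised as in section 5; a country whose name still differs from the map would be dropped along with the aggregates.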