Load Packages

library(tidyverse)
## Warning: package 'tidyverse' was built under R version 4.0.2
## -- Attaching packages ------------------------------------------------------------------------- tidyverse 1.3.0 --
## v ggplot2 3.3.0     v purrr   0.3.4
## v tibble  3.0.1     v dplyr   0.8.5
## v tidyr   1.1.2     v stringr 1.4.0
## v readr   1.3.1     v forcats 0.5.0
## Warning: package 'tidyr' was built under R version 4.0.2
## Warning: package 'forcats' was built under R version 4.0.2
## -- Conflicts ---------------------------------------------------------------------------- tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
library(ggthemes)
## Warning: package 'ggthemes' was built under R version 4.0.3
library(plotly)
## Warning: package 'plotly' was built under R version 4.0.2
## 
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
## 
##     last_plot
## The following object is masked from 'package:stats':
## 
##     filter
## The following object is masked from 'package:graphics':
## 
##     layout
library(ggplot2)
library(dplyr)

Set Working Directory

setwd("/Users/Joeyc/Documents/School/Fall 2020/Data 110/Project Two")

Import Data

global_emissions = read_csv("co2_global_emissions.csv")
## Parsed with column specification:
## cols(
##   .default = col_double(),
##   `Country Name` = col_character(),
##   `Country Code` = col_character(),
##   `Indicator Name` = col_character(),
##   `Indicator Code` = col_character(),
##   `2015` = col_logical(),
##   `2016` = col_logical(),
##   `2017` = col_logical(),
##   `2018` = col_logical()
## )
## See spec(...) for full column specifications.

Taking a look at the Data

str(global_emissions)
## tibble [264 x 63] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
##  $ Country Name  : chr [1:264] "Aruba" "Afghanistan" "Angola" "Albania" ...
##  $ Country Code  : chr [1:264] "ABW" "AFG" "AGO" "ALB" ...
##  $ Indicator Name: chr [1:264] "CO2 emissions (metric tons per capita)" "CO2 emissions (metric tons per capita)" "CO2 emissions (metric tons per capita)" "CO2 emissions (metric tons per capita)" ...
##  $ Indicator Code: chr [1:264] "EN.ATM.CO2E.PC" "EN.ATM.CO2E.PC" "EN.ATM.CO2E.PC" "EN.ATM.CO2E.PC" ...
##  $ 1960          : num [1:264] NA 0.0461 0.0975 1.2582 NA ...
##  $ 1961          : num [1:264] NA 0.0536 0.079 1.3742 NA ...
##  $ 1962          : num [1:264] NA 0.0738 0.2013 1.44 NA ...
##  $ 1963          : num [1:264] NA 0.0742 0.1925 1.1817 NA ...
##  $ 1964          : num [1:264] NA 0.0863 0.201 1.1117 NA ...
##  $ 1965          : num [1:264] NA 0.101 0.192 1.166 NA ...
##  $ 1966          : num [1:264] NA 0.108 0.246 1.333 NA ...
##  $ 1967          : num [1:264] NA 0.124 0.155 1.364 NA ...
##  $ 1968          : num [1:264] NA 0.115 0.256 1.52 NA ...
##  $ 1969          : num [1:264] NA 0.0868 0.4196 1.559 NA ...
##  $ 1970          : num [1:264] NA 0.15 0.529 1.753 NA ...
##  $ 1971          : num [1:264] NA 0.166 0.492 1.989 NA ...
##  $ 1972          : num [1:264] NA 0.131 0.635 2.516 NA ...
##  $ 1973          : num [1:264] NA 0.136 0.671 2.304 NA ...
##  $ 1974          : num [1:264] NA 0.156 0.652 1.849 NA ...
##  $ 1975          : num [1:264] NA 0.169 0.575 1.911 NA ...
##  $ 1976          : num [1:264] NA 0.155 0.416 2.014 NA ...
##  $ 1977          : num [1:264] NA 0.183 0.435 2.276 NA ...
##  $ 1978          : num [1:264] NA 0.163 0.646 2.531 NA ...
##  $ 1979          : num [1:264] NA 0.168 0.637 2.898 NA ...
##  $ 1980          : num [1:264] NA 0.133 0.599 1.935 NA ...
##  $ 1981          : num [1:264] NA 0.152 0.571 2.693 NA ...
##  $ 1982          : num [1:264] NA 0.165 0.485 2.625 NA ...
##  $ 1983          : num [1:264] NA 0.204 0.515 2.683 NA ...
##  $ 1984          : num [1:264] NA 0.235 0.487 2.694 NA ...
##  $ 1985          : num [1:264] NA 0.298 0.443 2.658 NA ...
##  $ 1986          : num [1:264] 2.868 0.271 0.427 2.665 NA ...
##  $ 1987          : num [1:264] 7.235 0.272 0.518 2.414 NA ...
##  $ 1988          : num [1:264] 10.026 0.248 0.446 2.332 NA ...
##  $ 1989          : num [1:264] 10.635 0.236 0.424 2.783 NA ...
##  $ 1990          : num [1:264] 26.375 0.213 0.42 1.678 7.467 ...
##  $ 1991          : num [1:264] 26.046 0.188 0.405 1.312 7.182 ...
##  $ 1992          : num [1:264] 21.4426 0.0997 0.4007 0.7747 6.9121 ...
##  $ 1993          : num [1:264] 22.0008 0.0892 0.4309 0.7238 6.7361 ...
##  $ 1994          : num [1:264] 21.036 0.08 0.281 0.6 6.494 ...
##  $ 1995          : num [1:264] 20.7719 0.0727 0.7692 0.6545 6.6621 ...
##  $ 1996          : num [1:264] 20.318 0.066 0.712 0.637 7.065 ...
##  $ 1997          : num [1:264] 20.4268 0.0596 0.4892 0.4904 7.2397 ...
##  $ 1998          : num [1:264] 20.5877 0.0552 0.4714 0.5603 7.6608 ...
##  $ 1999          : num [1:264] 20.3116 0.0423 0.5741 0.9602 7.9755 ...
##  $ 2000          : num [1:264] 26.1949 0.0385 0.5804 0.9782 8.0193 ...
##  $ 2001          : num [1:264] 25.934 0.039 0.573 1.053 7.787 ...
##  $ 2002          : num [1:264] 25.6712 0.0487 0.7208 1.2295 7.5906 ...
##  $ 2003          : num [1:264] 26.4205 0.0518 0.498 1.4127 7.3158 ...
##  $ 2004          : num [1:264] 26.5173 0.0394 0.9962 1.3762 7.3586 ...
##  $ 2005          : num [1:264] 27.2007 0.0529 0.9797 1.4125 7.2999 ...
##  $ 2006          : num [1:264] 26.9483 0.0637 1.0989 1.3026 6.7462 ...
##  $ 2007          : num [1:264] 27.8956 0.0854 1.1978 1.3223 6.5195 ...
##  $ 2008          : num [1:264] 26.231 0.154 1.182 1.484 6.428 ...
##  $ 2009          : num [1:264] 25.916 0.242 1.232 1.496 6.122 ...
##  $ 2010          : num [1:264] 24.671 0.294 1.243 1.579 6.123 ...
##  $ 2011          : num [1:264] 24.506 0.412 1.253 1.804 5.867 ...
##  $ 2012          : num [1:264] 13.16 0.35 1.33 1.69 5.92 ...
##  $ 2013          : num [1:264] 8.351 0.316 1.255 1.749 5.901 ...
##  $ 2014          : num [1:264] 8.408 0.299 1.291 1.979 5.832 ...
##  $ 2015          : logi [1:264] NA NA NA NA NA NA ...
##  $ 2016          : logi [1:264] NA NA NA NA NA NA ...
##  $ 2017          : logi [1:264] NA NA NA NA NA NA ...
##  $ 2018          : logi [1:264] NA NA NA NA NA NA ...
##  - attr(*, "spec")=
##   .. cols(
##   ..   `Country Name` = col_character(),
##   ..   `Country Code` = col_character(),
##   ..   `Indicator Name` = col_character(),
##   ..   `Indicator Code` = col_character(),
##   ..   `1960` = col_double(),
##   ..   `1961` = col_double(),
##   ..   `1962` = col_double(),
##   ..   `1963` = col_double(),
##   ..   `1964` = col_double(),
##   ..   `1965` = col_double(),
##   ..   `1966` = col_double(),
##   ..   `1967` = col_double(),
##   ..   `1968` = col_double(),
##   ..   `1969` = col_double(),
##   ..   `1970` = col_double(),
##   ..   `1971` = col_double(),
##   ..   `1972` = col_double(),
##   ..   `1973` = col_double(),
##   ..   `1974` = col_double(),
##   ..   `1975` = col_double(),
##   ..   `1976` = col_double(),
##   ..   `1977` = col_double(),
##   ..   `1978` = col_double(),
##   ..   `1979` = col_double(),
##   ..   `1980` = col_double(),
##   ..   `1981` = col_double(),
##   ..   `1982` = col_double(),
##   ..   `1983` = col_double(),
##   ..   `1984` = col_double(),
##   ..   `1985` = col_double(),
##   ..   `1986` = col_double(),
##   ..   `1987` = col_double(),
##   ..   `1988` = col_double(),
##   ..   `1989` = col_double(),
##   ..   `1990` = col_double(),
##   ..   `1991` = col_double(),
##   ..   `1992` = col_double(),
##   ..   `1993` = col_double(),
##   ..   `1994` = col_double(),
##   ..   `1995` = col_double(),
##   ..   `1996` = col_double(),
##   ..   `1997` = col_double(),
##   ..   `1998` = col_double(),
##   ..   `1999` = col_double(),
##   ..   `2000` = col_double(),
##   ..   `2001` = col_double(),
##   ..   `2002` = col_double(),
##   ..   `2003` = col_double(),
##   ..   `2004` = col_double(),
##   ..   `2005` = col_double(),
##   ..   `2006` = col_double(),
##   ..   `2007` = col_double(),
##   ..   `2008` = col_double(),
##   ..   `2009` = col_double(),
##   ..   `2010` = col_double(),
##   ..   `2011` = col_double(),
##   ..   `2012` = col_double(),
##   ..   `2013` = col_double(),
##   ..   `2014` = col_double(),
##   ..   `2015` = col_logical(),
##   ..   `2016` = col_logical(),
##   ..   `2017` = col_logical(),
##   ..   `2018` = col_logical()
##   .. )
summary(global_emissions)
##  Country Name       Country Code       Indicator Name     Indicator Code    
##  Length:264         Length:264         Length:264         Length:264        
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##       1960               1961               1962               1963         
##  Min.   : 0.00802   Min.   : 0.00789   Min.   : 0.00848   Min.   : 0.00938  
##  1st Qu.: 0.18213   1st Qu.: 0.18025   1st Qu.: 0.20020   1st Qu.: 0.19774  
##  Median : 0.62003   Median : 0.64923   Median : 0.65233   Median : 0.64797  
##  Mean   : 2.04418   Mean   : 2.15748   Mean   : 2.24880   Mean   : 2.76342  
##  3rd Qu.: 1.70291   3rd Qu.: 1.74622   3rd Qu.: 1.94327   3rd Qu.: 1.72018  
##  Max.   :36.68518   Max.   :36.58378   Max.   :42.24200   Max.   :99.46300  
##  NA's   :72         NA's   :71         NA's   :69         NA's   :68        
##       1964              1965               1966               1967        
##  Min.   : 0.0116   Min.   : 0.01191   Min.   : 0.01326   Min.   : 0.0118  
##  1st Qu.: 0.2176   1st Qu.: 0.23441   1st Qu.: 0.24922   1st Qu.: 0.2518  
##  Median : 0.7661   Median : 0.69654   Median : 0.74914   Median : 0.8029  
##  Mean   : 2.9127   Mean   : 3.03167   Mean   : 3.04470   Mean   : 3.1112  
##  3rd Qu.: 2.0244   3rd Qu.: 2.19005   3rd Qu.: 2.45582   3rd Qu.: 2.9155  
##  Max.   :92.8595   Max.   :85.45859   Max.   :78.62712   Max.   :77.5086  
##  NA's   :61        NA's   :61         NA's   :61         NA's   :61       
##       1968              1969                1970               1971        
##  Min.   :-0.0201   Min.   :  0.01612   Min.   : 0.01229   Min.   : 0.0119  
##  1st Qu.: 0.2700   1st Qu.:  0.32092   1st Qu.: 0.34980   1st Qu.: 0.3405  
##  Median : 0.9867   Median :  1.05863   Median : 1.00036   Median : 1.0981  
##  Mean   : 3.3093   Mean   :  3.91912   Mean   : 4.19749   Mean   : 4.4219  
##  3rd Qu.: 3.2576   3rd Qu.:  3.59743   3rd Qu.: 4.01242   3rd Qu.: 4.5024  
##  Max.   :75.9753   Max.   :100.69767   Max.   :69.11160   Max.   :76.6415  
##  NA's   :61        NA's   :61          NA's   :59         NA's   :58       
##       1972               1973               1974               1975         
##  Min.   : 0.01153   Min.   : 0.01117   Min.   : 0.00974   Min.   : 0.00975  
##  1st Qu.: 0.35436   1st Qu.: 0.36669   1st Qu.: 0.37185   1st Qu.: 0.38111  
##  Median : 1.11133   Median : 1.13637   Median : 1.23269   Median : 1.28549  
##  Mean   : 4.48812   Mean   : 4.80584   Mean   : 4.49946   Mean   : 4.36611  
##  3rd Qu.: 4.51785   3rd Qu.: 5.11695   3rd Qu.: 4.64417   3rd Qu.: 4.85223  
##  Max.   :82.61945   Max.   :87.65265   Max.   :68.23258   Max.   :66.64312  
##  NA's   :56         NA's   :56         NA's   :56         NA's   :56        
##       1976               1977               1978               1979         
##  Min.   : 0.00991   Min.   : 0.01019   Min.   : 0.00738   Min.   : 0.00433  
##  1st Qu.: 0.36318   1st Qu.: 0.38780   1st Qu.: 0.40225   1st Qu.: 0.43831  
##  Median : 1.36287   Median : 1.43705   Median : 1.51862   Median : 1.57835  
##  Mean   : 4.35662   Mean   : 4.48666   Mean   : 4.51104   Mean   : 4.56304  
##  3rd Qu.: 5.17443   3rd Qu.: 5.28561   3rd Qu.: 5.74670   3rd Qu.: 5.49269  
##  Max.   :61.29021   Max.   :54.40915   Max.   :54.82565   Max.   :69.94185  
##  NA's   :56         NA's   :56         NA's   :56         NA's   :56        
##       1980               1981               1982               1983         
##  Min.   : 0.03563   Min.   : 0.02982   Min.   : 0.02843   Min.   : 0.03099  
##  1st Qu.: 0.44931   1st Qu.: 0.46582   1st Qu.: 0.45183   1st Qu.: 0.45037  
##  Median : 1.52564   Median : 1.52441   Median : 1.47916   Median : 1.36581  
##  Mean   : 4.46439   Mean   : 3.99356   Mean   : 3.87247   Mean   : 3.72682  
##  3rd Qu.: 5.49203   3rd Qu.: 5.30724   3rd Qu.: 5.37670   3rd Qu.: 5.40872  
##  Max.   :58.53435   Max.   :51.82543   Max.   :44.53605   Max.   :36.41181  
##  NA's   :56         NA's   :56         NA's   :56         NA's   :56        
##       1984               1985               1986               1987         
##  Min.   : 0.04113   Min.   : 0.03529   Min.   : 0.03567   Min.   : 0.03662  
##  1st Qu.: 0.47515   1st Qu.: 0.46528   1st Qu.: 0.44069   1st Qu.: 0.48470  
##  Median : 1.44877   Median : 1.54159   Median : 1.55041   Median : 1.63990  
##  Mean   : 3.82439   Mean   : 3.91770   Mean   : 3.90545   Mean   : 3.94261  
##  3rd Qu.: 5.25908   3rd Qu.: 5.56298   3rd Qu.: 4.97794   3rd Qu.: 5.38631  
##  Max.   :36.11639   Max.   :35.89097   Max.   :33.41411   Max.   :30.55837  
##  NA's   :56         NA's   :56         NA's   :55         NA's   :55        
##       1988               1989              1990               1991         
##  Min.   : 0.01182   Min.   : 0.0178   Min.   : 0.02401   Min.   : 0.01073  
##  1st Qu.: 0.50639   1st Qu.: 0.4992   1st Qu.: 0.46026   1st Qu.: 0.45024  
##  Median : 1.75620   Median : 1.6438   Median : 1.67303   Median : 1.86103  
##  Mean   : 4.07731   Mean   : 4.2133   Mean   : 4.08245   Mean   : 4.12135  
##  3rd Qu.: 5.79625   3rd Qu.: 5.8379   3rd Qu.: 5.91487   3rd Qu.: 5.98891  
##  Max.   :29.21023   Max.   :31.0288   Max.   :27.95925   Max.   :36.31713  
##  NA's   :55         NA's   :55        NA's   :49         NA's   :47        
##       1992               1993               1994               1995         
##  Min.   : 0.01328   Min.   : 0.01398   Min.   : 0.01516   Min.   : 0.01571  
##  1st Qu.: 0.57081   1st Qu.: 0.52776   1st Qu.: 0.57165   1st Qu.: 0.58101  
##  Median : 2.27881   Median : 2.23531   Median : 2.19315   Median : 2.32266  
##  Mean   : 4.47999   Mean   : 4.50271   Mean   : 4.42463   Mean   : 4.47459  
##  3rd Qu.: 6.47191   3rd Qu.: 6.65546   3rd Qu.: 6.44669   3rd Qu.: 6.47766  
##  Max.   :54.08917   Max.   :61.25241   Max.   :59.60109   Max.   :61.91238  
##  NA's   :23         NA's   :23         NA's   :22         NA's   :21        
##       1996               1997               1998               1999         
##  Min.   : 0.01722   Min.   : 0.01909   Min.   : 0.01938   Min.   : 0.02006  
##  1st Qu.: 0.61819   1st Qu.: 0.68144   1st Qu.: 0.70258   1st Qu.: 0.74136  
##  Median : 2.39780   Median : 2.27434   Median : 2.25260   Median : 2.25969  
##  Mean   : 4.49417   Mean   : 4.49199   Mean   : 4.48218   Mean   : 4.44955  
##  3rd Qu.: 6.75816   3rd Qu.: 6.57679   3rd Qu.: 6.55386   3rd Qu.: 6.69472  
##  Max.   :61.83934   Max.   :70.13564   Max.   :58.86600   Max.   :55.15501  
##  NA's   :21         NA's   :20         NA's   :19         NA's   :19        
##       2000               2001               2002               2003         
##  Min.   : 0.01729   Min.   : 0.01728   Min.   : 0.01862   Min.   : 0.01919  
##  1st Qu.: 0.74018   1st Qu.: 0.76470   1st Qu.: 0.75710   1st Qu.: 0.80170  
##  Median : 2.33916   Median : 2.43634   Median : 2.50363   Median : 2.62574  
##  Mean   : 4.57853   Mean   : 4.63067   Mean   : 4.59742   Mean   : 4.72905  
##  3rd Qu.: 6.60642   3rd Qu.: 6.91775   3rd Qu.: 6.94779   3rd Qu.: 7.24342  
##  Max.   :58.63936   Max.   :67.10602   Max.   :63.35447   Max.   :60.29957  
##  NA's   :19         NA's   :19         NA's   :18         NA's   :18        
##       2004               2005               2006               2007         
##  Min.   : 0.02261   Min.   : 0.02075   Min.   : 0.02437   Min.   : 0.02356  
##  1st Qu.: 0.83543   1st Qu.: 0.85543   1st Qu.: 0.79841   1st Qu.: 0.89101  
##  Median : 2.65745   Median : 2.76730   Median : 2.91508   Median : 2.88561  
##  Mean   : 4.77632   Mean   : 4.82026   Mean   : 4.89865   Mean   : 4.92978  
##  3rd Qu.: 7.13041   3rd Qu.: 7.03361   3rd Qu.: 7.03593   3rd Qu.: 6.92544  
##  Max.   :56.59083   Max.   :58.91873   Max.   :62.82354   Max.   :53.19099  
##  NA's   :18         NA's   :17         NA's   :16         NA's   :15        
##       2008               2009               2010               2011         
##  Min.   : 0.02322   Min.   : 0.02246   Min.   : 0.02426   Min.   : 0.02676  
##  1st Qu.: 0.81606   1st Qu.: 0.82785   1st Qu.: 0.82081   1st Qu.: 0.83982  
##  Median : 3.02796   Median : 2.95356   Median : 2.93322   Median : 2.92997  
##  Mean   : 4.93589   Mean   : 4.72189   Mean   : 4.84474   Mean   : 4.80634  
##  3rd Qu.: 7.01056   3rd Qu.: 6.31907   3rd Qu.: 6.64135   3rd Qu.: 6.71538  
##  Max.   :46.67214   Max.   :43.51448   Max.   :40.74202   Max.   :41.20565  
##  NA's   :15         NA's   :15         NA's   :15         NA's   :15        
##       2012              2013               2014            2015        
##  Min.   : 0.0303   Min.   : 0.03018   Min.   : 0.04449   Mode:logical  
##  1st Qu.: 0.8280   1st Qu.: 0.84854   1st Qu.: 0.88172   NA's:264      
##  Median : 3.0259   Median : 3.00557   Median : 3.15330                 
##  Mean   : 4.9488   Mean   : 4.86222   Mean   : 4.87468                 
##  3rd Qu.: 6.6646   3rd Qu.: 6.71259   3rd Qu.: 6.36518                 
##  Max.   :44.6179   Max.   :37.78009   Max.   :45.42324                 
##  NA's   :13        NA's   :13         NA's   :14                       
##    2016           2017           2018        
##  Mode:logical   Mode:logical   Mode:logical  
##  NA's:264       NA's:264       NA's:264      
##                                              
##                                              
##                                              
##                                              
## 

Check for N/A values

any(is.na(global_emissions))
## [1] TRUE
sum(is.na(global_emissions))
## [1] 3327
colSums(is.na(global_emissions))
##   Country Name   Country Code Indicator Name Indicator Code           1960 
##              0              0              0              0             72 
##           1961           1962           1963           1964           1965 
##             71             69             68             61             61 
##           1966           1967           1968           1969           1970 
##             61             61             61             61             59 
##           1971           1972           1973           1974           1975 
##             58             56             56             56             56 
##           1976           1977           1978           1979           1980 
##             56             56             56             56             56 
##           1981           1982           1983           1984           1985 
##             56             56             56             56             56 
##           1986           1987           1988           1989           1990 
##             55             55             55             55             49 
##           1991           1992           1993           1994           1995 
##             47             23             23             22             21 
##           1996           1997           1998           1999           2000 
##             21             20             19             19             19 
##           2001           2002           2003           2004           2005 
##             19             18             18             18             17 
##           2006           2007           2008           2009           2010 
##             16             15             15             15             15 
##           2011           2012           2013           2014           2015 
##             15             13             13             14            264 
##           2016           2017           2018 
##            264            264            264

There appear to be 3327 NA values in the data set, which we should remove. The 2015-2018 columns are entirely NA, so let’s get rid of those columns first.

global_emissions = global_emissions[-63]
global_emissions = global_emissions[-62]
global_emissions = global_emissions[-61]
global_emissions = global_emissions[-60]
sum(is.na(global_emissions))
## [1] 2271
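
Dropping columns by position works, but it depends on the column order never changing. Here is a minimal sketch of the same step done by column name with dplyr::select(). The `toy` tibble and its values are made up for illustration and are not the World Bank file.

```r
library(dplyr)
library(tibble)

# Hypothetical stand-in for global_emissions: two data years plus
# two all-NA year columns, mirroring the layout of the real file.
toy <- tibble(
  `Country Name` = c("A", "B", "C"),
  `2013` = c(1.2, NA, 3.4),
  `2014` = c(1.3, 2.1, 3.5),
  `2015` = c(NA, NA, NA),
  `2016` = c(NA, NA, NA)
)

# Drop the empty year columns by name instead of by index.
toy_trimmed <- select(toy, -c(`2015`, `2016`))

names(toy_trimmed)
sum(is.na(toy_trimmed))
```

On the real data, the same idea would presumably be a single call that names the four columns `2015` through `2018`.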

We have removed over 1000 NAs just by dropping those columns. Now let’s get rid of the rest of the NAs.

global_emissions_cleaned = global_emissions[complete.cases(global_emissions),]

Let’s check if all the NAs are finally removed from the data frame.

sum(is.na(global_emissions_cleaned))
## [1] 0

As we can see, there are no more NA values in the data frame global_emissions_cleaned. We have removed the rows (countries) with incomplete data. We have stripped 74 countries from the data frame and are now left with only 190 countries with complete sets of data ranging from 1960 to 2014. We now have a functional dataset ready for our analysis.
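
The counts quoted above can be checked by comparing the number of rows before and after the filter. Here is a minimal sketch on a made-up tibble (the `toy` data is hypothetical). It also shows tidyr::drop_na() as an equivalent to the complete.cases() subset.

```r
library(tibble)
library(tidyr)

# Hypothetical data: row "B" is missing one year.
toy <- tibble(
  `Country Name` = c("A", "B", "C"),
  `2013` = c(1.2, NA, 3.4),
  `2014` = c(1.3, 2.1, 3.5)
)

# Base R: keep only rows with no missing values.
toy_base <- toy[complete.cases(toy), ]

# tidyr equivalent of the line above.
toy_tidy <- drop_na(toy)

# Number of rows dropped because of missing values.
rows_dropped <- nrow(toy) - nrow(toy_base)
rows_dropped
```

Applied to the real data, nrow(global_emissions) - nrow(global_emissions_cleaned) should give 264 - 190 = 74.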

summary(global_emissions_cleaned)
##  Country Name       Country Code       Indicator Name     Indicator Code    
##  Length:190         Length:190         Length:190         Length:190        
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##       1960               1961               1962               1963         
##  Min.   : 0.00802   Min.   : 0.00789   Min.   : 0.00848   Min.   : 0.00938  
##  1st Qu.: 0.19121   1st Qu.: 0.18839   1st Qu.: 0.19966   1st Qu.: 0.20191  
##  Median : 0.62003   Median : 0.65550   Median : 0.65940   Median : 0.71137  
##  Mean   : 1.91288   Mean   : 2.01624   Mean   : 2.07878   Mean   : 2.60692  
##  3rd Qu.: 1.69937   3rd Qu.: 1.72743   3rd Qu.: 1.94701   3rd Qu.: 1.73727  
##  Max.   :36.68518   Max.   :36.58378   Max.   :36.01263   Max.   :99.46300  
##       1964              1965               1966               1967        
##  Min.   : 0.0116   Min.   : 0.01475   Min.   : 0.01326   Min.   : 0.0118  
##  1st Qu.: 0.2294   1st Qu.: 0.24059   1st Qu.: 0.25321   1st Qu.: 0.2599  
##  Median : 0.7750   Median : 0.76892   Median : 0.79359   Median : 0.9485  
##  Mean   : 2.7458   Mean   : 2.79025   Mean   : 2.84061   Mean   : 3.0224  
##  3rd Qu.: 2.0684   3rd Qu.: 2.28586   3rd Qu.: 2.52057   3rd Qu.: 3.4252  
##  Max.   :92.8595   Max.   :85.45859   Max.   :78.62712   Max.   :77.5086  
##       1968              1969                1970               1971         
##  Min.   :-0.0201   Min.   :  0.01612   Min.   : 0.01563   Min.   : 0.01612  
##  1st Qu.: 0.2830   1st Qu.:  0.32918   1st Qu.: 0.35192   1st Qu.: 0.35099  
##  Median : 1.0052   Median :  1.11013   Median : 1.04235   Median : 1.26390  
##  Mean   : 3.2727   Mean   :  3.89333   Mean   : 4.25291   Mean   : 4.48162  
##  3rd Qu.: 3.5948   3rd Qu.:  3.78305   3rd Qu.: 4.24164   3rd Qu.: 4.69816  
##  Max.   :75.9753   Max.   :100.69767   Max.   :69.11160   Max.   :76.64148  
##       1972               1973               1974               1975         
##  Min.   : 0.01607   Min.   : 0.01698   Min.   : 0.00974   Min.   : 0.00975  
##  1st Qu.: 0.36552   1st Qu.: 0.38258   1st Qu.: 0.38303   1st Qu.: 0.40617  
##  Median : 1.26578   Median : 1.32940   Median : 1.38408   Median : 1.44610  
##  Mean   : 4.61896   Mean   : 4.98891   Mean   : 4.67840   Mean   : 4.52177  
##  3rd Qu.: 4.99509   3rd Qu.: 5.65342   3rd Qu.: 5.52382   3rd Qu.: 5.44857  
##  Max.   :82.61945   Max.   :87.65265   Max.   :68.23258   Max.   :66.64312  
##       1976               1977               1978               1979         
##  Min.   : 0.00991   Min.   : 0.01019   Min.   : 0.00738   Min.   : 0.00433  
##  1st Qu.: 0.36818   1st Qu.: 0.39951   1st Qu.: 0.43229   1st Qu.: 0.50093  
##  Median : 1.46951   Median : 1.48670   Median : 1.60117   Median : 1.65205  
##  Mean   : 4.49844   Mean   : 4.64651   Mean   : 4.66828   Mean   : 4.74105  
##  3rd Qu.: 5.52068   3rd Qu.: 5.46930   3rd Qu.: 6.13218   3rd Qu.: 5.54783  
##  Max.   :61.29021   Max.   :54.40915   Max.   :54.82565   Max.   :69.94185  
##       1980               1981               1982               1983         
##  Min.   : 0.03642   Min.   : 0.02982   Min.   : 0.02843   Min.   : 0.03099  
##  1st Qu.: 0.45634   1st Qu.: 0.48200   1st Qu.: 0.48002   1st Qu.: 0.50731  
##  Median : 1.70411   Median : 1.67712   Median : 1.61874   Median : 1.63017  
##  Mean   : 4.61483   Mean   : 4.09898   Mean   : 3.99521   Mean   : 3.83024  
##  3rd Qu.: 5.87205   3rd Qu.: 5.56605   3rd Qu.: 5.70157   3rd Qu.: 5.61205  
##  Max.   :58.53435   Max.   :51.82543   Max.   :44.53605   Max.   :36.41181  
##       1984               1985               1986               1987         
##  Min.   : 0.04113   Min.   : 0.03529   Min.   : 0.03567   Min.   : 0.03662  
##  1st Qu.: 0.50734   1st Qu.: 0.50843   1st Qu.: 0.49236   1st Qu.: 0.53190  
##  Median : 1.68523   Median : 1.69800   Median : 1.59963   Median : 1.69854  
##  Mean   : 3.92240   Mean   : 4.01723   Mean   : 3.96980   Mean   : 4.00564  
##  3rd Qu.: 5.64799   3rd Qu.: 5.85255   3rd Qu.: 5.88411   3rd Qu.: 5.61996  
##  Max.   :36.11639   Max.   :35.89097   Max.   :33.41411   Max.   :30.55837  
##       1988               1989              1990               1991         
##  Min.   : 0.01182   Min.   : 0.0178   Min.   : 0.02401   Min.   : 0.01073  
##  1st Qu.: 0.54621   1st Qu.: 0.5358   1st Qu.: 0.48778   1st Qu.: 0.46844  
##  Median : 1.82210   Median : 1.8976   Median : 1.78332   Median : 1.87488  
##  Mean   : 4.13764   Mean   : 4.2808   Mean   : 4.09635   Mean   : 4.17868  
##  3rd Qu.: 6.05304   3rd Qu.: 6.1267   3rd Qu.: 6.09113   3rd Qu.: 6.02683  
##  Max.   :29.21023   Max.   :31.0288   Max.   :27.95924   Max.   :36.31713  
##       1992               1993               1994               1995         
##  Min.   : 0.01328   Min.   : 0.01398   Min.   : 0.01516   Min.   : 0.01571  
##  1st Qu.: 0.49835   1st Qu.: 0.49171   1st Qu.: 0.49326   1st Qu.: 0.50195  
##  Median : 1.98303   Median : 2.10304   Median : 2.13418   Median : 2.16486  
##  Mean   : 4.24612   Mean   : 4.38165   Mean   : 4.38431   Mean   : 4.32098  
##  3rd Qu.: 5.95205   3rd Qu.: 5.96758   3rd Qu.: 5.96360   3rd Qu.: 5.97206  
##  Max.   :54.08917   Max.   :61.25241   Max.   :59.60109   Max.   :61.91238  
##       1996               1997               1998               1999         
##  Min.   : 0.01722   Min.   : 0.01909   Min.   : 0.01938   Min.   : 0.02006  
##  1st Qu.: 0.54114   1st Qu.: 0.50804   1st Qu.: 0.48885   1st Qu.: 0.58075  
##  Median : 2.20562   Median : 2.21016   Median : 2.20055   Median : 2.21292  
##  Mean   : 4.36977   Mean   : 4.39154   Mean   : 4.40004   Mean   : 4.37685  
##  3rd Qu.: 5.94769   3rd Qu.: 5.88703   3rd Qu.: 5.96513   3rd Qu.: 6.24506  
##  Max.   :61.83934   Max.   :70.13564   Max.   :58.86600   Max.   :55.15501  
##       2000               2001               2002               2003         
##  Min.   : 0.01729   Min.   : 0.01728   Min.   : 0.01862   Min.   : 0.01919  
##  1st Qu.: 0.63326   1st Qu.: 0.68919   1st Qu.: 0.72259   1st Qu.: 0.73686  
##  Median : 2.29620   Median : 2.33576   Median : 2.38997   Median : 2.49101  
##  Mean   : 4.50924   Mean   : 4.55150   Mean   : 4.52810   Mean   : 4.64728  
##  3rd Qu.: 6.03727   3rd Qu.: 6.12880   3rd Qu.: 6.42819   3rd Qu.: 6.46812  
##  Max.   :58.63936   Max.   :67.10602   Max.   :63.35447   Max.   :60.29957  
##       2004               2005               2006               2007         
##  Min.   : 0.02261   Min.   : 0.02739   Min.   : 0.02847   Min.   : 0.03007  
##  1st Qu.: 0.78618   1st Qu.: 0.81378   1st Qu.: 0.77798   1st Qu.: 0.81469  
##  Median : 2.55389   Median : 2.69362   Median : 2.77711   Median : 2.76387  
##  Mean   : 4.68189   Mean   : 4.71795   Mean   : 4.77699   Mean   : 4.81334  
##  3rd Qu.: 6.34857   3rd Qu.: 6.39352   3rd Qu.: 6.41385   3rd Qu.: 6.66655  
##  Max.   :56.59083   Max.   :58.91873   Max.   :62.82354   Max.   :53.19099  
##       2008               2009               2010               2011         
##  Min.   : 0.03079   Min.   : 0.02803   Min.   : 0.03131   Min.   : 0.03738  
##  1st Qu.: 0.79320   1st Qu.: 0.80025   1st Qu.: 0.80317   1st Qu.: 0.81467  
##  Median : 2.78908   Median : 2.77911   Median : 2.88506   Median : 2.81892  
##  Mean   : 4.80978   Mean   : 4.65839   Mean   : 4.76052   Mean   : 4.70427  
##  3rd Qu.: 6.66625   3rd Qu.: 6.27109   3rd Qu.: 6.42877   3rd Qu.: 6.58947  
##  Max.   :46.67214   Max.   :43.51448   Max.   :40.74202   Max.   :41.20565  
##       2012               2013               2014         
##  Min.   : 0.03482   Min.   : 0.04635   Min.   : 0.04505  
##  1st Qu.: 0.78524   1st Qu.: 0.82275   1st Qu.: 0.84045  
##  Median : 2.82562   Median : 2.88631   Median : 2.77468  
##  Mean   : 4.68552   Mean   : 4.65162   Mean   : 4.65965  
##  3rd Qu.: 6.51219   3rd Qu.: 6.34278   3rd Qu.: 6.26523  
##  Max.   :44.61793   Max.   :37.78009   Max.   :45.42324

The range of the emissions data between the min and the max is very large, and so is the gap between the 3rd quartile and the max. I’m sure most of you know there are a couple of “emissions giants” that take up a majority of the emissions throughout the world. Let’s take a look at only those countries and see which ones are most to blame for the emissions.

We will use dplyr to filter the data down to the countries that were at the top of emissions in 2014, the most recent year with data, to see how they have progressed over the last 54 years. We will create a new dataset called Leaders_2014 that holds the countries with more than 10 metric tons of emissions per capita in 2014.

Leaders_2014 = filter(global_emissions_cleaned, `2014` > 10)
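
To see which countries pass the threshold and in what order, the filtered rows can be sorted by their 2014 value. Here is a minimal sketch on made-up numbers. The `toy` tibble is hypothetical, and the cutoff of 10 matches the filter above.

```r
library(dplyr)
library(tibble)

# Hypothetical per capita values for four made-up countries.
toy <- tibble(
  `Country Name` = c("A", "B", "C", "D"),
  `2014` = c(4.2, 15.8, 11.1, 30.5)
)

# Keep rows above the cutoff, then rank them from highest to lowest.
toy_leaders <- toy %>%
  filter(`2014` > 10) %>%
  arrange(desc(`2014`)) %>%
  select(`Country Name`, `2014`)

toy_leaders
```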

We have now reduced the number of countries to only the top global emissions leaders in 2014. The CSV file was organized with the years as column headers, so I was unable to plot the data the way I wanted to, and I was unable to group the years properly in dplyr. I had to use write.csv() to export the data and create a column header “Year”.

Emissions_Leaders_2014 = read_csv("Emissons_Leaders_2014.csv")
## Parsed with column specification:
## cols(
##   `Country Name` = col_character(),
##   `Country Code` = col_character(),
##   Year = col_double(),
##   Emissions = col_double()
## )
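
The same wide-to-long reshape can also be done in R without editing the CSV by hand. Here is a minimal sketch using tidyr::pivot_longer() (available in tidyr 1.0.0 and later) on a made-up tibble. The `toy_wide` data is hypothetical, and the `Year` and `Emissions` column names are chosen to match the file read above.

```r
library(tibble)
library(tidyr)

# Hypothetical wide data: one row per country, one column per year.
toy_wide <- tibble(
  `Country Name` = c("A", "B"),
  `Country Code` = c("AAA", "BBB"),
  `2013` = c(11.0, 20.0),
  `2014` = c(12.0, 21.0)
)

# Stack the year columns into Year/Emissions pairs.
toy_long <- pivot_longer(
  toy_wide,
  cols = c(`2013`, `2014`),
  names_to = "Year",
  values_to = "Emissions"
)

# Column names arrive as text, so convert Year to a number for plotting.
toy_long$Year <- as.numeric(toy_long$Year)

toy_long
```

On Leaders_2014, the `cols` argument would presumably cover `1960` through `2014`, and the two indicator columns would need to be dropped first to match the four-column layout above.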

With this new CSV file loaded, I can now plot the points easily.

Emissions_Leaders_2014 %>% 
  ggplot() +
  geom_smooth(aes(x = Year, y=Emissions)) +
  xlab("Year" ) +
  ylab("Emissions Per Capita in Metric Tons") +
  ggtitle("Emissions Per Capita of the Top 16 Offenders") +
  theme_calc()
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'

As we can see, the emissions per capita of the top 16 countries in 2014 have held steady overall since 1970.

Emissions_Leaders_2014 %>%
  ggplot(aes(x = Year, y = Emissions)) +
  geom_point() +
  facet_wrap(~`Country Name`) +
  xlab("Year") +
  ylab("Emissions Per Capita in Metric Tons") +
  ggtitle("Emissions Per Capita of the Top 2014 offenders") +
  theme_economist()

It is tough to tell here whether the emissions of the top 16 countries increased or decreased through the 54 years. Qatar, UAE, and Brunei are all outliers, which stretches the shared y-axis and makes the other panels difficult to read. I used the scales = "free" argument of facet_wrap() so that each facet gets its own scale.

p = Emissions_Leaders_2014 %>%
  ggplot(aes(x = Year, y = Emissions)) +
  geom_point() +
  facet_wrap(~`Country Code`, scales = "free") +
  xlab("Year") +
  ylab("Emissions Per Capita in Metric Tons") +
  ggtitle("Emissions Per Capita of the Top 2014 offenders") +
  theme_economist()
ggplotly(p)

A) The topic of the data, any variables included, what kind of variables they are, where the data came from and how you cleaned it up (be detailed and specific, using proper terminology where appropriate). Be sure to explain why you chose this topic and dataset – what meaning does it have for you?

The topic of the data I chose for this project is CO2 emissions per capita of countries between the years 1960 and 2014. The variables included in this dataset are the country’s name, the country’s code, the CO2 emissions per capita in metric tons for each country, and the year of the emissions. The data originated from the World Bank. I cleaned up the data in multiple ways, as I explained above. I started the clean-up process by removing all NA values. I first checked whether there were any NAs using any(is.na()) and then colSums(is.na()). I then saw that the 2015-2018 columns were included but contained no values, so I eliminated those columns. Finally, I used complete.cases() to remove the countries that had NA values. These countries did not have a complete row of numbers, meaning they did not record emissions in a certain year. I chose this dataset because I am interested in the global climate change crisis that has been accelerating over the years. In high school I was part of the Global Ecology program at Poolesville, and then I went on to study biology at UMBC.

B) Incorporate background research about this topic. This background information will include information you find in an article, website, or book. Please source this background information within the essay or if you have multiple sources, include a bibliography. I am not particular about the format of this bibliography. If you need help finding articles, I am happy to help you and/or show you how to search the MC Library Database.

CO2 emissions have been steadily increasing year to year. We are getting to the point where it doesn’t matter if we slow down CO2 emissions into our atmosphere: the damage has been done, and CO2 will continue to linger in our atmosphere even if we aren’t producing it at the same rate as before. Luckily, CO2 emissions have dropped around the world due to the pandemic, but we are still in a crisis. Too much CO2 in the atmosphere traps heat, leading to a steady increase in Earth’s temperature. This will cause a plethora of issues for our planet, including rising sea levels and disruptions to delicate ecosystems. Source (https://www.nationalgeographic.com/science/2020/05/plunge-in-carbon-emissions-lockdowns-will-not-slow-climate-change/#close) and background information that I have learned throughout my schooling.

C) What the visualization represents, any interesting patterns or surprises that arise within the visualization, and anything that could have been shown that you could not get to work or that you wished you could have included.

One of the interesting patterns I have seen is that small island populations produce some of the higher emissions per capita, compared to countries I would have assumed to be towards the top of the emissions scale. I was surprised to see countries like China and India, which we would assume to be near the top of the list, sitting much lower. I guess that has to do with the size of their populations: even though they have high emissions, they have enough people to be low on a per capita list. Countries that are small and urban, like Qatar and Luxembourg, rank high on the list. Urban and densely populated areas seem to be what top the list.