Load Packages
library(tidyverse)
## Warning: package 'tidyverse' was built under R version 4.0.2
## -- Attaching packages ------------------------------------------------------------------------- tidyverse 1.3.0 --
## v ggplot2 3.3.0 v purrr 0.3.4
## v tibble 3.0.1 v dplyr 0.8.5
## v tidyr 1.1.2 v stringr 1.4.0
## v readr 1.3.1 v forcats 0.5.0
## Warning: package 'tidyr' was built under R version 4.0.2
## Warning: package 'forcats' was built under R version 4.0.2
## -- Conflicts ---------------------------------------------------------------------------- tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
library(ggthemes)
## Warning: package 'ggthemes' was built under R version 4.0.3
library(plotly)
## Warning: package 'plotly' was built under R version 4.0.2
##
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
##
## last_plot
## The following object is masked from 'package:stats':
##
## filter
## The following object is masked from 'package:graphics':
##
## layout
library(ggplot2)
library(dplyr)
Set Working Directory
setwd("/Users/Joeyc/Documents/School/Fall 2020/Data 110/Project Two")
Import Data
global_emissions = read_csv("co2_global_emissions.csv")
## Parsed with column specification:
## cols(
## .default = col_double(),
## `Country Name` = col_character(),
## `Country Code` = col_character(),
## `Indicator Name` = col_character(),
## `Indicator Code` = col_character(),
## `2015` = col_logical(),
## `2016` = col_logical(),
## `2017` = col_logical(),
## `2018` = col_logical()
## )
## See spec(...) for full column specifications.
Taking a look at the Data
str(global_emissions)
## tibble [264 x 63] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
## $ Country Name : chr [1:264] "Aruba" "Afghanistan" "Angola" "Albania" ...
## $ Country Code : chr [1:264] "ABW" "AFG" "AGO" "ALB" ...
## $ Indicator Name: chr [1:264] "CO2 emissions (metric tons per capita)" "CO2 emissions (metric tons per capita)" "CO2 emissions (metric tons per capita)" "CO2 emissions (metric tons per capita)" ...
## $ Indicator Code: chr [1:264] "EN.ATM.CO2E.PC" "EN.ATM.CO2E.PC" "EN.ATM.CO2E.PC" "EN.ATM.CO2E.PC" ...
## $ 1960 : num [1:264] NA 0.0461 0.0975 1.2582 NA ...
## $ 1961 : num [1:264] NA 0.0536 0.079 1.3742 NA ...
## $ 1962 : num [1:264] NA 0.0738 0.2013 1.44 NA ...
## $ 1963 : num [1:264] NA 0.0742 0.1925 1.1817 NA ...
## $ 1964 : num [1:264] NA 0.0863 0.201 1.1117 NA ...
## $ 1965 : num [1:264] NA 0.101 0.192 1.166 NA ...
## $ 1966 : num [1:264] NA 0.108 0.246 1.333 NA ...
## $ 1967 : num [1:264] NA 0.124 0.155 1.364 NA ...
## $ 1968 : num [1:264] NA 0.115 0.256 1.52 NA ...
## $ 1969 : num [1:264] NA 0.0868 0.4196 1.559 NA ...
## $ 1970 : num [1:264] NA 0.15 0.529 1.753 NA ...
## $ 1971 : num [1:264] NA 0.166 0.492 1.989 NA ...
## $ 1972 : num [1:264] NA 0.131 0.635 2.516 NA ...
## $ 1973 : num [1:264] NA 0.136 0.671 2.304 NA ...
## $ 1974 : num [1:264] NA 0.156 0.652 1.849 NA ...
## $ 1975 : num [1:264] NA 0.169 0.575 1.911 NA ...
## $ 1976 : num [1:264] NA 0.155 0.416 2.014 NA ...
## $ 1977 : num [1:264] NA 0.183 0.435 2.276 NA ...
## $ 1978 : num [1:264] NA 0.163 0.646 2.531 NA ...
## $ 1979 : num [1:264] NA 0.168 0.637 2.898 NA ...
## $ 1980 : num [1:264] NA 0.133 0.599 1.935 NA ...
## $ 1981 : num [1:264] NA 0.152 0.571 2.693 NA ...
## $ 1982 : num [1:264] NA 0.165 0.485 2.625 NA ...
## $ 1983 : num [1:264] NA 0.204 0.515 2.683 NA ...
## $ 1984 : num [1:264] NA 0.235 0.487 2.694 NA ...
## $ 1985 : num [1:264] NA 0.298 0.443 2.658 NA ...
## $ 1986 : num [1:264] 2.868 0.271 0.427 2.665 NA ...
## $ 1987 : num [1:264] 7.235 0.272 0.518 2.414 NA ...
## $ 1988 : num [1:264] 10.026 0.248 0.446 2.332 NA ...
## $ 1989 : num [1:264] 10.635 0.236 0.424 2.783 NA ...
## $ 1990 : num [1:264] 26.375 0.213 0.42 1.678 7.467 ...
## $ 1991 : num [1:264] 26.046 0.188 0.405 1.312 7.182 ...
## $ 1992 : num [1:264] 21.4426 0.0997 0.4007 0.7747 6.9121 ...
## $ 1993 : num [1:264] 22.0008 0.0892 0.4309 0.7238 6.7361 ...
## $ 1994 : num [1:264] 21.036 0.08 0.281 0.6 6.494 ...
## $ 1995 : num [1:264] 20.7719 0.0727 0.7692 0.6545 6.6621 ...
## $ 1996 : num [1:264] 20.318 0.066 0.712 0.637 7.065 ...
## $ 1997 : num [1:264] 20.4268 0.0596 0.4892 0.4904 7.2397 ...
## $ 1998 : num [1:264] 20.5877 0.0552 0.4714 0.5603 7.6608 ...
## $ 1999 : num [1:264] 20.3116 0.0423 0.5741 0.9602 7.9755 ...
## $ 2000 : num [1:264] 26.1949 0.0385 0.5804 0.9782 8.0193 ...
## $ 2001 : num [1:264] 25.934 0.039 0.573 1.053 7.787 ...
## $ 2002 : num [1:264] 25.6712 0.0487 0.7208 1.2295 7.5906 ...
## $ 2003 : num [1:264] 26.4205 0.0518 0.498 1.4127 7.3158 ...
## $ 2004 : num [1:264] 26.5173 0.0394 0.9962 1.3762 7.3586 ...
## $ 2005 : num [1:264] 27.2007 0.0529 0.9797 1.4125 7.2999 ...
## $ 2006 : num [1:264] 26.9483 0.0637 1.0989 1.3026 6.7462 ...
## $ 2007 : num [1:264] 27.8956 0.0854 1.1978 1.3223 6.5195 ...
## $ 2008 : num [1:264] 26.231 0.154 1.182 1.484 6.428 ...
## $ 2009 : num [1:264] 25.916 0.242 1.232 1.496 6.122 ...
## $ 2010 : num [1:264] 24.671 0.294 1.243 1.579 6.123 ...
## $ 2011 : num [1:264] 24.506 0.412 1.253 1.804 5.867 ...
## $ 2012 : num [1:264] 13.16 0.35 1.33 1.69 5.92 ...
## $ 2013 : num [1:264] 8.351 0.316 1.255 1.749 5.901 ...
## $ 2014 : num [1:264] 8.408 0.299 1.291 1.979 5.832 ...
## $ 2015 : logi [1:264] NA NA NA NA NA NA ...
## $ 2016 : logi [1:264] NA NA NA NA NA NA ...
## $ 2017 : logi [1:264] NA NA NA NA NA NA ...
## $ 2018 : logi [1:264] NA NA NA NA NA NA ...
## - attr(*, "spec")=
## .. cols(
## .. `Country Name` = col_character(),
## .. `Country Code` = col_character(),
## .. `Indicator Name` = col_character(),
## .. `Indicator Code` = col_character(),
## .. `1960` = col_double(),
## .. `1961` = col_double(),
## .. `1962` = col_double(),
## .. `1963` = col_double(),
## .. `1964` = col_double(),
## .. `1965` = col_double(),
## .. `1966` = col_double(),
## .. `1967` = col_double(),
## .. `1968` = col_double(),
## .. `1969` = col_double(),
## .. `1970` = col_double(),
## .. `1971` = col_double(),
## .. `1972` = col_double(),
## .. `1973` = col_double(),
## .. `1974` = col_double(),
## .. `1975` = col_double(),
## .. `1976` = col_double(),
## .. `1977` = col_double(),
## .. `1978` = col_double(),
## .. `1979` = col_double(),
## .. `1980` = col_double(),
## .. `1981` = col_double(),
## .. `1982` = col_double(),
## .. `1983` = col_double(),
## .. `1984` = col_double(),
## .. `1985` = col_double(),
## .. `1986` = col_double(),
## .. `1987` = col_double(),
## .. `1988` = col_double(),
## .. `1989` = col_double(),
## .. `1990` = col_double(),
## .. `1991` = col_double(),
## .. `1992` = col_double(),
## .. `1993` = col_double(),
## .. `1994` = col_double(),
## .. `1995` = col_double(),
## .. `1996` = col_double(),
## .. `1997` = col_double(),
## .. `1998` = col_double(),
## .. `1999` = col_double(),
## .. `2000` = col_double(),
## .. `2001` = col_double(),
## .. `2002` = col_double(),
## .. `2003` = col_double(),
## .. `2004` = col_double(),
## .. `2005` = col_double(),
## .. `2006` = col_double(),
## .. `2007` = col_double(),
## .. `2008` = col_double(),
## .. `2009` = col_double(),
## .. `2010` = col_double(),
## .. `2011` = col_double(),
## .. `2012` = col_double(),
## .. `2013` = col_double(),
## .. `2014` = col_double(),
## .. `2015` = col_logical(),
## .. `2016` = col_logical(),
## .. `2017` = col_logical(),
## .. `2018` = col_logical()
## .. )
summary(global_emissions)
## Country Name Country Code Indicator Name Indicator Code
## Length:264 Length:264 Length:264 Length:264
## Class :character Class :character Class :character Class :character
## Mode :character Mode :character Mode :character Mode :character
##
##
##
##
## 1960 1961 1962 1963
## Min. : 0.00802 Min. : 0.00789 Min. : 0.00848 Min. : 0.00938
## 1st Qu.: 0.18213 1st Qu.: 0.18025 1st Qu.: 0.20020 1st Qu.: 0.19774
## Median : 0.62003 Median : 0.64923 Median : 0.65233 Median : 0.64797
## Mean : 2.04418 Mean : 2.15748 Mean : 2.24880 Mean : 2.76342
## 3rd Qu.: 1.70291 3rd Qu.: 1.74622 3rd Qu.: 1.94327 3rd Qu.: 1.72018
## Max. :36.68518 Max. :36.58378 Max. :42.24200 Max. :99.46300
## NA's :72 NA's :71 NA's :69 NA's :68
## 1964 1965 1966 1967
## Min. : 0.0116 Min. : 0.01191 Min. : 0.01326 Min. : 0.0118
## 1st Qu.: 0.2176 1st Qu.: 0.23441 1st Qu.: 0.24922 1st Qu.: 0.2518
## Median : 0.7661 Median : 0.69654 Median : 0.74914 Median : 0.8029
## Mean : 2.9127 Mean : 3.03167 Mean : 3.04470 Mean : 3.1112
## 3rd Qu.: 2.0244 3rd Qu.: 2.19005 3rd Qu.: 2.45582 3rd Qu.: 2.9155
## Max. :92.8595 Max. :85.45859 Max. :78.62712 Max. :77.5086
## NA's :61 NA's :61 NA's :61 NA's :61
## 1968 1969 1970 1971
## Min. :-0.0201 Min. : 0.01612 Min. : 0.01229 Min. : 0.0119
## 1st Qu.: 0.2700 1st Qu.: 0.32092 1st Qu.: 0.34980 1st Qu.: 0.3405
## Median : 0.9867 Median : 1.05863 Median : 1.00036 Median : 1.0981
## Mean : 3.3093 Mean : 3.91912 Mean : 4.19749 Mean : 4.4219
## 3rd Qu.: 3.2576 3rd Qu.: 3.59743 3rd Qu.: 4.01242 3rd Qu.: 4.5024
## Max. :75.9753 Max. :100.69767 Max. :69.11160 Max. :76.6415
## NA's :61 NA's :61 NA's :59 NA's :58
## 1972 1973 1974 1975
## Min. : 0.01153 Min. : 0.01117 Min. : 0.00974 Min. : 0.00975
## 1st Qu.: 0.35436 1st Qu.: 0.36669 1st Qu.: 0.37185 1st Qu.: 0.38111
## Median : 1.11133 Median : 1.13637 Median : 1.23269 Median : 1.28549
## Mean : 4.48812 Mean : 4.80584 Mean : 4.49946 Mean : 4.36611
## 3rd Qu.: 4.51785 3rd Qu.: 5.11695 3rd Qu.: 4.64417 3rd Qu.: 4.85223
## Max. :82.61945 Max. :87.65265 Max. :68.23258 Max. :66.64312
## NA's :56 NA's :56 NA's :56 NA's :56
## 1976 1977 1978 1979
## Min. : 0.00991 Min. : 0.01019 Min. : 0.00738 Min. : 0.00433
## 1st Qu.: 0.36318 1st Qu.: 0.38780 1st Qu.: 0.40225 1st Qu.: 0.43831
## Median : 1.36287 Median : 1.43705 Median : 1.51862 Median : 1.57835
## Mean : 4.35662 Mean : 4.48666 Mean : 4.51104 Mean : 4.56304
## 3rd Qu.: 5.17443 3rd Qu.: 5.28561 3rd Qu.: 5.74670 3rd Qu.: 5.49269
## Max. :61.29021 Max. :54.40915 Max. :54.82565 Max. :69.94185
## NA's :56 NA's :56 NA's :56 NA's :56
## 1980 1981 1982 1983
## Min. : 0.03563 Min. : 0.02982 Min. : 0.02843 Min. : 0.03099
## 1st Qu.: 0.44931 1st Qu.: 0.46582 1st Qu.: 0.45183 1st Qu.: 0.45037
## Median : 1.52564 Median : 1.52441 Median : 1.47916 Median : 1.36581
## Mean : 4.46439 Mean : 3.99356 Mean : 3.87247 Mean : 3.72682
## 3rd Qu.: 5.49203 3rd Qu.: 5.30724 3rd Qu.: 5.37670 3rd Qu.: 5.40872
## Max. :58.53435 Max. :51.82543 Max. :44.53605 Max. :36.41181
## NA's :56 NA's :56 NA's :56 NA's :56
## 1984 1985 1986 1987
## Min. : 0.04113 Min. : 0.03529 Min. : 0.03567 Min. : 0.03662
## 1st Qu.: 0.47515 1st Qu.: 0.46528 1st Qu.: 0.44069 1st Qu.: 0.48470
## Median : 1.44877 Median : 1.54159 Median : 1.55041 Median : 1.63990
## Mean : 3.82439 Mean : 3.91770 Mean : 3.90545 Mean : 3.94261
## 3rd Qu.: 5.25908 3rd Qu.: 5.56298 3rd Qu.: 4.97794 3rd Qu.: 5.38631
## Max. :36.11639 Max. :35.89097 Max. :33.41411 Max. :30.55837
## NA's :56 NA's :56 NA's :55 NA's :55
## 1988 1989 1990 1991
## Min. : 0.01182 Min. : 0.0178 Min. : 0.02401 Min. : 0.01073
## 1st Qu.: 0.50639 1st Qu.: 0.4992 1st Qu.: 0.46026 1st Qu.: 0.45024
## Median : 1.75620 Median : 1.6438 Median : 1.67303 Median : 1.86103
## Mean : 4.07731 Mean : 4.2133 Mean : 4.08245 Mean : 4.12135
## 3rd Qu.: 5.79625 3rd Qu.: 5.8379 3rd Qu.: 5.91487 3rd Qu.: 5.98891
## Max. :29.21023 Max. :31.0288 Max. :27.95925 Max. :36.31713
## NA's :55 NA's :55 NA's :49 NA's :47
## 1992 1993 1994 1995
## Min. : 0.01328 Min. : 0.01398 Min. : 0.01516 Min. : 0.01571
## 1st Qu.: 0.57081 1st Qu.: 0.52776 1st Qu.: 0.57165 1st Qu.: 0.58101
## Median : 2.27881 Median : 2.23531 Median : 2.19315 Median : 2.32266
## Mean : 4.47999 Mean : 4.50271 Mean : 4.42463 Mean : 4.47459
## 3rd Qu.: 6.47191 3rd Qu.: 6.65546 3rd Qu.: 6.44669 3rd Qu.: 6.47766
## Max. :54.08917 Max. :61.25241 Max. :59.60109 Max. :61.91238
## NA's :23 NA's :23 NA's :22 NA's :21
## 1996 1997 1998 1999
## Min. : 0.01722 Min. : 0.01909 Min. : 0.01938 Min. : 0.02006
## 1st Qu.: 0.61819 1st Qu.: 0.68144 1st Qu.: 0.70258 1st Qu.: 0.74136
## Median : 2.39780 Median : 2.27434 Median : 2.25260 Median : 2.25969
## Mean : 4.49417 Mean : 4.49199 Mean : 4.48218 Mean : 4.44955
## 3rd Qu.: 6.75816 3rd Qu.: 6.57679 3rd Qu.: 6.55386 3rd Qu.: 6.69472
## Max. :61.83934 Max. :70.13564 Max. :58.86600 Max. :55.15501
## NA's :21 NA's :20 NA's :19 NA's :19
## 2000 2001 2002 2003
## Min. : 0.01729 Min. : 0.01728 Min. : 0.01862 Min. : 0.01919
## 1st Qu.: 0.74018 1st Qu.: 0.76470 1st Qu.: 0.75710 1st Qu.: 0.80170
## Median : 2.33916 Median : 2.43634 Median : 2.50363 Median : 2.62574
## Mean : 4.57853 Mean : 4.63067 Mean : 4.59742 Mean : 4.72905
## 3rd Qu.: 6.60642 3rd Qu.: 6.91775 3rd Qu.: 6.94779 3rd Qu.: 7.24342
## Max. :58.63936 Max. :67.10602 Max. :63.35447 Max. :60.29957
## NA's :19 NA's :19 NA's :18 NA's :18
## 2004 2005 2006 2007
## Min. : 0.02261 Min. : 0.02075 Min. : 0.02437 Min. : 0.02356
## 1st Qu.: 0.83543 1st Qu.: 0.85543 1st Qu.: 0.79841 1st Qu.: 0.89101
## Median : 2.65745 Median : 2.76730 Median : 2.91508 Median : 2.88561
## Mean : 4.77632 Mean : 4.82026 Mean : 4.89865 Mean : 4.92978
## 3rd Qu.: 7.13041 3rd Qu.: 7.03361 3rd Qu.: 7.03593 3rd Qu.: 6.92544
## Max. :56.59083 Max. :58.91873 Max. :62.82354 Max. :53.19099
## NA's :18 NA's :17 NA's :16 NA's :15
## 2008 2009 2010 2011
## Min. : 0.02322 Min. : 0.02246 Min. : 0.02426 Min. : 0.02676
## 1st Qu.: 0.81606 1st Qu.: 0.82785 1st Qu.: 0.82081 1st Qu.: 0.83982
## Median : 3.02796 Median : 2.95356 Median : 2.93322 Median : 2.92997
## Mean : 4.93589 Mean : 4.72189 Mean : 4.84474 Mean : 4.80634
## 3rd Qu.: 7.01056 3rd Qu.: 6.31907 3rd Qu.: 6.64135 3rd Qu.: 6.71538
## Max. :46.67214 Max. :43.51448 Max. :40.74202 Max. :41.20565
## NA's :15 NA's :15 NA's :15 NA's :15
## 2012 2013 2014 2015
## Min. : 0.0303 Min. : 0.03018 Min. : 0.04449 Mode:logical
## 1st Qu.: 0.8280 1st Qu.: 0.84854 1st Qu.: 0.88172 NA's:264
## Median : 3.0259 Median : 3.00557 Median : 3.15330
## Mean : 4.9488 Mean : 4.86222 Mean : 4.87468
## 3rd Qu.: 6.6646 3rd Qu.: 6.71259 3rd Qu.: 6.36518
## Max. :44.6179 Max. :37.78009 Max. :45.42324
## NA's :13 NA's :13 NA's :14
## 2016 2017 2018
## Mode:logical Mode:logical Mode:logical
## NA's:264 NA's:264 NA's:264
##
##
##
##
##
Check for N/A values
any(is.na(global_emissions))
## [1] TRUE
sum(is.na(global_emissions))
## [1] 3327
colSums(is.na(global_emissions))
## Country Name Country Code Indicator Name Indicator Code 1960
## 0 0 0 0 72
## 1961 1962 1963 1964 1965
## 71 69 68 61 61
## 1966 1967 1968 1969 1970
## 61 61 61 61 59
## 1971 1972 1973 1974 1975
## 58 56 56 56 56
## 1976 1977 1978 1979 1980
## 56 56 56 56 56
## 1981 1982 1983 1984 1985
## 56 56 56 56 56
## 1986 1987 1988 1989 1990
## 55 55 55 55 49
## 1991 1992 1993 1994 1995
## 47 23 23 22 21
## 1996 1997 1998 1999 2000
## 21 20 19 19 19
## 2001 2002 2003 2004 2005
## 19 18 18 18 17
## 2006 2007 2008 2009 2010
## 16 15 15 15 15
## 2011 2012 2013 2014 2015
## 15 13 13 14 264
## 2016 2017 2018
## 264 264 264
There appear to be 3327 NA values in the data set, which we should remove. The columns for 2015-2018 are entirely NA, so let’s get rid of those columns first.
# Drop columns 60 to 63 (the years 2015-2018), which contain only NAs
global_emissions = global_emissions[-(60:63)]
sum(is.na(global_emissions))
## [1] 2271
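Dropping columns by position works here, but it depends on the empty years sitting in columns 60 to 63. A minimal sketch of an alternative that drops any column made up entirely of NAs, whatever its position. The `toy` data frame and its values are made up for illustration and are not the World Bank data:

```r
# Toy data frame standing in for the wide emissions table (values are made up).
toy <- data.frame(
  country = c("A", "B", "C"),
  `2014` = c(1.2, NA, 3.4),
  `2015` = c(NA, NA, NA),
  `2016` = c(NA, NA, NA),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Count the non-missing values per column and keep columns with at least one.
has_data <- colSums(!is.na(toy)) > 0
toy_trimmed <- toy[, has_data, drop = FALSE]

names(toy_trimmed)        # the all-NA year columns are gone
sum(is.na(toy_trimmed))   # NAs left inside partially filled columns
```

Columns that are only partly missing, like `2014` in the toy table, are kept; only the fully empty ones are removed.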
We have removed over 1000 NAs just by dropping those columns. Now let’s get rid of the rest of the NAs.
global_emissions_cleaned = global_emissions[complete.cases(global_emissions),]
Let’s check if all the NAs are finally removed from the data frame.
sum(is.na(global_emissions_cleaned))
## [1] 0
As we can see, there are no more NA values in the data frame global_emissions_cleaned. We have removed the rows (countries) with incomplete data. This stripped 74 countries from the data frame, leaving 190 countries with complete data from 1960 to 2014. We now have a functional dataset ready for our analysis.
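To make the effect of complete.cases() concrete, here is a small sketch on a made-up table (`toy`, its column names, and its values are illustrative only). It shows how many rows are dropped and which ones, which is a useful check before discarding data:

```r
# Toy table: four "countries", two "years", with two gaps (values are made up).
toy <- data.frame(
  country = c("A", "B", "C", "D"),
  y1960 = c(0.5, NA, 2.0, 1.1),
  y1961 = c(0.6, 0.9, NA, 1.2),
  stringsAsFactors = FALSE
)

# complete.cases() returns TRUE for rows with no NA in any column.
keep <- complete.cases(toy)
toy_cleaned <- toy[keep, ]

nrow(toy) - nrow(toy_cleaned)   # number of rows dropped
toy$country[!keep]              # which rows were dropped
```

Note that a row is dropped even if only one value is missing, so a country with a single unrecorded year is removed entirely.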
summary(global_emissions_cleaned)
## Country Name Country Code Indicator Name Indicator Code
## Length:190 Length:190 Length:190 Length:190
## Class :character Class :character Class :character Class :character
## Mode :character Mode :character Mode :character Mode :character
##
##
##
## 1960 1961 1962 1963
## Min. : 0.00802 Min. : 0.00789 Min. : 0.00848 Min. : 0.00938
## 1st Qu.: 0.19121 1st Qu.: 0.18839 1st Qu.: 0.19966 1st Qu.: 0.20191
## Median : 0.62003 Median : 0.65550 Median : 0.65940 Median : 0.71137
## Mean : 1.91288 Mean : 2.01624 Mean : 2.07878 Mean : 2.60692
## 3rd Qu.: 1.69937 3rd Qu.: 1.72743 3rd Qu.: 1.94701 3rd Qu.: 1.73727
## Max. :36.68518 Max. :36.58378 Max. :36.01263 Max. :99.46300
## 1964 1965 1966 1967
## Min. : 0.0116 Min. : 0.01475 Min. : 0.01326 Min. : 0.0118
## 1st Qu.: 0.2294 1st Qu.: 0.24059 1st Qu.: 0.25321 1st Qu.: 0.2599
## Median : 0.7750 Median : 0.76892 Median : 0.79359 Median : 0.9485
## Mean : 2.7458 Mean : 2.79025 Mean : 2.84061 Mean : 3.0224
## 3rd Qu.: 2.0684 3rd Qu.: 2.28586 3rd Qu.: 2.52057 3rd Qu.: 3.4252
## Max. :92.8595 Max. :85.45859 Max. :78.62712 Max. :77.5086
## 1968 1969 1970 1971
## Min. :-0.0201 Min. : 0.01612 Min. : 0.01563 Min. : 0.01612
## 1st Qu.: 0.2830 1st Qu.: 0.32918 1st Qu.: 0.35192 1st Qu.: 0.35099
## Median : 1.0052 Median : 1.11013 Median : 1.04235 Median : 1.26390
## Mean : 3.2727 Mean : 3.89333 Mean : 4.25291 Mean : 4.48162
## 3rd Qu.: 3.5948 3rd Qu.: 3.78305 3rd Qu.: 4.24164 3rd Qu.: 4.69816
## Max. :75.9753 Max. :100.69767 Max. :69.11160 Max. :76.64148
## 1972 1973 1974 1975
## Min. : 0.01607 Min. : 0.01698 Min. : 0.00974 Min. : 0.00975
## 1st Qu.: 0.36552 1st Qu.: 0.38258 1st Qu.: 0.38303 1st Qu.: 0.40617
## Median : 1.26578 Median : 1.32940 Median : 1.38408 Median : 1.44610
## Mean : 4.61896 Mean : 4.98891 Mean : 4.67840 Mean : 4.52177
## 3rd Qu.: 4.99509 3rd Qu.: 5.65342 3rd Qu.: 5.52382 3rd Qu.: 5.44857
## Max. :82.61945 Max. :87.65265 Max. :68.23258 Max. :66.64312
## 1976 1977 1978 1979
## Min. : 0.00991 Min. : 0.01019 Min. : 0.00738 Min. : 0.00433
## 1st Qu.: 0.36818 1st Qu.: 0.39951 1st Qu.: 0.43229 1st Qu.: 0.50093
## Median : 1.46951 Median : 1.48670 Median : 1.60117 Median : 1.65205
## Mean : 4.49844 Mean : 4.64651 Mean : 4.66828 Mean : 4.74105
## 3rd Qu.: 5.52068 3rd Qu.: 5.46930 3rd Qu.: 6.13218 3rd Qu.: 5.54783
## Max. :61.29021 Max. :54.40915 Max. :54.82565 Max. :69.94185
## 1980 1981 1982 1983
## Min. : 0.03642 Min. : 0.02982 Min. : 0.02843 Min. : 0.03099
## 1st Qu.: 0.45634 1st Qu.: 0.48200 1st Qu.: 0.48002 1st Qu.: 0.50731
## Median : 1.70411 Median : 1.67712 Median : 1.61874 Median : 1.63017
## Mean : 4.61483 Mean : 4.09898 Mean : 3.99521 Mean : 3.83024
## 3rd Qu.: 5.87205 3rd Qu.: 5.56605 3rd Qu.: 5.70157 3rd Qu.: 5.61205
## Max. :58.53435 Max. :51.82543 Max. :44.53605 Max. :36.41181
## 1984 1985 1986 1987
## Min. : 0.04113 Min. : 0.03529 Min. : 0.03567 Min. : 0.03662
## 1st Qu.: 0.50734 1st Qu.: 0.50843 1st Qu.: 0.49236 1st Qu.: 0.53190
## Median : 1.68523 Median : 1.69800 Median : 1.59963 Median : 1.69854
## Mean : 3.92240 Mean : 4.01723 Mean : 3.96980 Mean : 4.00564
## 3rd Qu.: 5.64799 3rd Qu.: 5.85255 3rd Qu.: 5.88411 3rd Qu.: 5.61996
## Max. :36.11639 Max. :35.89097 Max. :33.41411 Max. :30.55837
## 1988 1989 1990 1991
## Min. : 0.01182 Min. : 0.0178 Min. : 0.02401 Min. : 0.01073
## 1st Qu.: 0.54621 1st Qu.: 0.5358 1st Qu.: 0.48778 1st Qu.: 0.46844
## Median : 1.82210 Median : 1.8976 Median : 1.78332 Median : 1.87488
## Mean : 4.13764 Mean : 4.2808 Mean : 4.09635 Mean : 4.17868
## 3rd Qu.: 6.05304 3rd Qu.: 6.1267 3rd Qu.: 6.09113 3rd Qu.: 6.02683
## Max. :29.21023 Max. :31.0288 Max. :27.95924 Max. :36.31713
## 1992 1993 1994 1995
## Min. : 0.01328 Min. : 0.01398 Min. : 0.01516 Min. : 0.01571
## 1st Qu.: 0.49835 1st Qu.: 0.49171 1st Qu.: 0.49326 1st Qu.: 0.50195
## Median : 1.98303 Median : 2.10304 Median : 2.13418 Median : 2.16486
## Mean : 4.24612 Mean : 4.38165 Mean : 4.38431 Mean : 4.32098
## 3rd Qu.: 5.95205 3rd Qu.: 5.96758 3rd Qu.: 5.96360 3rd Qu.: 5.97206
## Max. :54.08917 Max. :61.25241 Max. :59.60109 Max. :61.91238
## 1996 1997 1998 1999
## Min. : 0.01722 Min. : 0.01909 Min. : 0.01938 Min. : 0.02006
## 1st Qu.: 0.54114 1st Qu.: 0.50804 1st Qu.: 0.48885 1st Qu.: 0.58075
## Median : 2.20562 Median : 2.21016 Median : 2.20055 Median : 2.21292
## Mean : 4.36977 Mean : 4.39154 Mean : 4.40004 Mean : 4.37685
## 3rd Qu.: 5.94769 3rd Qu.: 5.88703 3rd Qu.: 5.96513 3rd Qu.: 6.24506
## Max. :61.83934 Max. :70.13564 Max. :58.86600 Max. :55.15501
## 2000 2001 2002 2003
## Min. : 0.01729 Min. : 0.01728 Min. : 0.01862 Min. : 0.01919
## 1st Qu.: 0.63326 1st Qu.: 0.68919 1st Qu.: 0.72259 1st Qu.: 0.73686
## Median : 2.29620 Median : 2.33576 Median : 2.38997 Median : 2.49101
## Mean : 4.50924 Mean : 4.55150 Mean : 4.52810 Mean : 4.64728
## 3rd Qu.: 6.03727 3rd Qu.: 6.12880 3rd Qu.: 6.42819 3rd Qu.: 6.46812
## Max. :58.63936 Max. :67.10602 Max. :63.35447 Max. :60.29957
## 2004 2005 2006 2007
## Min. : 0.02261 Min. : 0.02739 Min. : 0.02847 Min. : 0.03007
## 1st Qu.: 0.78618 1st Qu.: 0.81378 1st Qu.: 0.77798 1st Qu.: 0.81469
## Median : 2.55389 Median : 2.69362 Median : 2.77711 Median : 2.76387
## Mean : 4.68189 Mean : 4.71795 Mean : 4.77699 Mean : 4.81334
## 3rd Qu.: 6.34857 3rd Qu.: 6.39352 3rd Qu.: 6.41385 3rd Qu.: 6.66655
## Max. :56.59083 Max. :58.91873 Max. :62.82354 Max. :53.19099
## 2008 2009 2010 2011
## Min. : 0.03079 Min. : 0.02803 Min. : 0.03131 Min. : 0.03738
## 1st Qu.: 0.79320 1st Qu.: 0.80025 1st Qu.: 0.80317 1st Qu.: 0.81467
## Median : 2.78908 Median : 2.77911 Median : 2.88506 Median : 2.81892
## Mean : 4.80978 Mean : 4.65839 Mean : 4.76052 Mean : 4.70427
## 3rd Qu.: 6.66625 3rd Qu.: 6.27109 3rd Qu.: 6.42877 3rd Qu.: 6.58947
## Max. :46.67214 Max. :43.51448 Max. :40.74202 Max. :41.20565
## 2012 2013 2014
## Min. : 0.03482 Min. : 0.04635 Min. : 0.04505
## 1st Qu.: 0.78524 1st Qu.: 0.82275 1st Qu.: 0.84045
## Median : 2.82562 Median : 2.88631 Median : 2.77468
## Mean : 4.68552 Mean : 4.65162 Mean : 4.65965
## 3rd Qu.: 6.51219 3rd Qu.: 6.34278 3rd Qu.: 6.26523
## Max. :44.61793 Max. :37.78009 Max. :45.42324
The range of the emissions data is very wide: the max is far above even the 3rd quartile. I’m sure most of you know there are a couple of "emissions giants" that take up a majority of the emissions throughout the world. Let’s look at only those countries and see which ones are most to blame for the emissions.
We use dplyr to filter for only the countries that were at the top of emissions in the most recent year with data (2014), here those above 10 metric tons per capita, to see how they have progressed over the last 54 years. We will create a new dataset called Leaders_2014 to hold the emissions leaders of 2014.
Leaders_2014 = filter(global_emissions_cleaned, `2014` > 10)
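The cutoff of 10 metric tons per capita is one way to pick the leaders. If a fixed number of countries is wanted instead, sorting and keeping the first rows is another option. A small sketch comparing the two on made-up values (the `toy` table and the choice of two rows are illustrative assumptions):

```r
library(dplyr)

# Toy stand-in for the cleaned wide table (values are made up).
toy <- tibble::tibble(
  `Country Name` = c("A", "B", "C", "D"),
  `2014` = c(4.2, 15.8, 11.1, 0.9)
)

# Threshold approach, as in the filter above.
over_10 <- filter(toy, `2014` > 10)

# Fixed-count approach: sort descending and keep the first n rows.
top_2 <- toy %>%
  arrange(desc(`2014`)) %>%
  head(2)

over_10$`Country Name`
top_2$`Country Name`
```

With a threshold, the number of countries kept depends on the data; with a fixed count, the cutoff value does.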
We have now reduced the countries to only the top emissions leaders of 2014. The CSV file was organized with the years as column headers, so I was unable to plot the data the way I wanted to or to group the years properly in dplyr. I had to use write.csv() to export the data and create a new file with a "Year" column header.
Emissions_Leaders_2014 = read_csv("Emissons_Leaders_2014.csv")
## Parsed with column specification:
## cols(
## `Country Name` = col_character(),
## `Country Code` = col_character(),
## Year = col_double(),
## Emissions = col_double()
## )
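The reshaping described above, turning year column headers into a single Year column, can also be done inside R without a round trip through a CSV file. A minimal sketch using tidyr's pivot_longer(), with a made-up two-country table standing in for Leaders_2014. The column names Year and Emissions match the file read in above; the `wide` and `long` names and all values are illustrative:

```r
library(dplyr)
library(tidyr)

# Toy wide table: one row per country, one column per year (values are made up).
wide <- tibble::tibble(
  `Country Name` = c("A", "B"),
  `Country Code` = c("AAA", "BBB"),
  `1960` = c(1.0, 2.0),
  `1961` = c(1.5, 2.5)
)

# Stack every column except the two identifiers into Year/Emissions pairs.
long <- wide %>%
  pivot_longer(
    cols = -c(`Country Name`, `Country Code`),
    names_to = "Year",
    values_to = "Emissions"
  ) %>%
  mutate(Year = as.numeric(Year))   # column names arrive as text

long
```

The result has one row per country and year, which is the shape ggplot expects when Year is mapped to the x axis.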
With this new CSV file loaded, I can now plot the points very easily.
Emissions_Leaders_2014 %>%
ggplot() +
geom_smooth(aes(x = Year, y=Emissions)) +
xlab("Year" ) +
ylab("Emissions Per Capita in Metric Tons") +
ggtitle("Emissions Per Capita of the Top 16 Offenders") +
theme_calc()
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'

As we can see, the emissions per capita of the top 16 countries in 2014 have held steady overall since 1970.
Emissions_Leaders_2014 %>%
ggplot(aes(x=Year, y=Emissions))+
geom_point() +
facet_wrap(~`Country Name`) +
xlab("Year") +
ylab("Emissions Per Capita in Metric Tons") +
ggtitle("Emissions Per Capita of the Top 2014 offenders") +
theme_economist()

It is tough to tell here whether the top 16 countries increased or decreased over the 54 years, since Qatar, UAE, and Brunei are outliers that stretch the shared axes and make the other panels difficult to read. I used the scales = "free" argument so that each facet gets its own scale.
p = Emissions_Leaders_2014 %>%
ggplot(aes(x=Year, y=Emissions))+
geom_point() +
facet_wrap(~`Country Code`, scales = "free") +
xlab("Year") +
ylab("Emissions Per Capita in Metric Tons") +
ggtitle("Emissions Per Capita of the Top 2014 offenders") +
theme_economist()
ggplotly(p)
A) The topic of the data, any variables included, what kind of variables they are, where the data came from and how you cleaned it up (be detailed and specific, using proper terminology where appropriate). Be sure to explain why you chose this topic and dataset – what meaning does it have for you?
The topic of the data I chose for this project was CO2 emissions per capita by country between 1960 and 2014. The variables included in this dataset were the country name, the country code, the CO2 emissions per capita in metric tons for each country, and the year of the emissions. The data originated from the World Bank. I cleaned up the data in multiple ways, as I explained above. I started the clean-up process by removing all NA values. I first checked whether there were any NAs using any(is.na()) and then colSums(is.na()). I then saw that 2015-2018 were included but contained no values, so I eliminated those columns. Finally, I used complete.cases() to remove the countries that had NA values. These countries did not have a complete row of numbers, meaning no emissions were recorded for them in at least one year. I chose this dataset because I am interested in the global climate change crisis that has been accelerating over the years. In high school I was part of the global ecology program at Poolesville, and then I went on to study biology at UMBC.
B) Incorporate background research about this topic. This background information will include information you find in an article, website, or book. Please source this background information within the essay or if you have multiple sources, include a bibliography. I am not particular about the format of this bibliography. If you need help finding articles, I am happy to help you and/or show you how to search the MC Library Database.
C) What the visualization represents, any interesting patterns or surprises that arise within the visualization, and anything that could have been shown that you could not get to work or that you wished you could have included.
One of the interesting patterns I have seen is that small island populations produce some of the higher emissions per capita, compared with countries I would have assumed to be toward the top of the emissions scale. I was surprised to see that countries like China and India, which we would assume to be near the top of the list, rank much lower. I guess that has to do with the size of their populations: even though they have high total emissions, they have enough people to rank low per capita. Countries that are small and urban, like Qatar and Luxembourg, rank high on the list. Urban and densely populated areas seem to be what top the list.