This project aims to explore patterns linking temperature change and gas emissions.

Setting the working directory and loading libraries:

setwd("C:/Users/adilk/OneDrive/Desktop/Uni Work/USL")
getwd()
## [1] "C:/Users/adilk/OneDrive/Desktop/Uni Work/USL"
library("sf")
library("maps")
library("dplyr")
library("ggplot2")
library("ggrepel")
library("countrycode")
library("corrplot")

Load the temperature change dataset and replace NA values with mean, as well as make sure the data is numeric as needed

dataset <- read.csv("climate_data.csv")

climate_data <- dataset %>%
  mutate(across(everything(), ~ ifelse(is.na(.), mean(., na.rm = TRUE), .)))
head(climate_data)
##   ObjectId                      Country ISO2   ISO3
## 1        1 Afghanistan, Islamic Rep. of   AF    AFG
## 2        2                       Africa      AFRTMP
## 3        3                      Albania   AL    ALB
## 4        4                      Algeria   DZ    DZA
## 5        5               American Samoa   AS    ASM
## 6        6                     Americas      AMETMP
##                                                                                          Indicator
## 1 Temperature change with respect to a baseline climatology, corresponding to the period 1951-1980
## 2 Temperature change with respect to a baseline climatology, corresponding to the period 1951-1980
## 3 Temperature change with respect to a baseline climatology, corresponding to the period 1951-1980
## 4 Temperature change with respect to a baseline climatology, corresponding to the period 1951-1980
## 5 Temperature change with respect to a baseline climatology, corresponding to the period 1951-1980
## 6 Temperature change with respect to a baseline climatology, corresponding to the period 1951-1980
##             Unit
## 1 Degree Celsius
## 2 Degree Celsius
## 3 Degree Celsius
## 4 Degree Celsius
## 5 Degree Celsius
## 6 Degree Celsius
##                                                                                                                                                                                                                                                Source
## 1 Food and Agriculture Organization of the United Nations (FAO). 2022. FAOSTAT Climate Change, Climate Indicators, Temperature change. License: CC BY-NC-SA 3.0 IGO. Extracted from: https://www.fao.org/faostat/en/#data/ET. Accessed on 2023-03-28.
## 2 Food and Agriculture Organization of the United Nations (FAO). 2022. FAOSTAT Climate Change, Climate Indicators, Temperature change. License: CC BY-NC-SA 3.0 IGO. Extracted from: https://www.fao.org/faostat/en/#data/ET. Accessed on 2023-03-28.
## 3 Food and Agriculture Organization of the United Nations (FAO). 2022. FAOSTAT Climate Change, Climate Indicators, Temperature change. License: CC BY-NC-SA 3.0 IGO. Extracted from: https://www.fao.org/faostat/en/#data/ET. Accessed on 2023-03-28.
## 4 Food and Agriculture Organization of the United Nations (FAO). 2022. FAOSTAT Climate Change, Climate Indicators, Temperature change. License: CC BY-NC-SA 3.0 IGO. Extracted from: https://www.fao.org/faostat/en/#data/ET. Accessed on 2023-03-28.
## 5 Food and Agriculture Organization of the United Nations (FAO). 2022. FAOSTAT Climate Change, Climate Indicators, Temperature change. License: CC BY-NC-SA 3.0 IGO. Extracted from: https://www.fao.org/faostat/en/#data/ET. Accessed on 2023-03-28.
## 6 Food and Agriculture Organization of the United Nations (FAO). 2022. FAOSTAT Climate Change, Climate Indicators, Temperature change. License: CC BY-NC-SA 3.0 IGO. Extracted from: https://www.fao.org/faostat/en/#data/ET. Accessed on 2023-03-28.
##   CTS.Code                   CTS.Name
## 1     ECCS Surface Temperature Change
## 2     ECCS Surface Temperature Change
## 3     ECCS Surface Temperature Change
## 4     ECCS Surface Temperature Change
## 5     ECCS Surface Temperature Change
## 6     ECCS Surface Temperature Change
##                                                            CTS.Full.Descriptor
## 1 Environment, Climate Change, Climate and Weather, Surface Temperature Change
## 2 Environment, Climate Change, Climate and Weather, Surface Temperature Change
## 3 Environment, Climate Change, Climate and Weather, Surface Temperature Change
## 4 Environment, Climate Change, Climate and Weather, Surface Temperature Change
## 5 Environment, Climate Change, Climate and Weather, Surface Temperature Change
## 6 Environment, Climate Change, Climate and Weather, Surface Temperature Change
##    X1961  X1962 X1963  X1964  X1965  X1966  X1967  X1968  X1969  X1970  X1971
## 1 -0.126 -0.173 0.844 -0.751 -0.220  0.239 -0.348 -0.398 -0.513  0.843  0.642
## 2 -0.017 -0.036 0.063 -0.156 -0.201  0.138 -0.215 -0.221  0.355  0.220 -0.198
## 3  0.635  0.342 0.086 -0.169 -0.390  0.553 -0.082  0.062 -0.036 -0.137 -0.205
## 4  0.155  0.120 0.050  0.254 -0.111  0.405 -0.024 -0.065  0.247  0.080 -0.412
## 5  0.121  0.000 0.211 -0.089 -0.595  0.129 -0.387 -0.185  0.145 -0.034 -0.420
## 6  0.053 -0.084 0.264 -0.244 -0.300 -0.104 -0.113 -0.003  0.098  0.156 -0.239
##    X1972  X1973  X1974  X1975  X1976 X1977  X1978  X1979  X1980  X1981  X1982
## 1 -1.095  0.264 -0.466 -0.442 -0.291 0.544  0.120  0.380  0.655  0.558 -0.286
## 2  0.001  0.377 -0.232 -0.231 -0.252 0.240  0.090  0.445  0.376  0.109  0.222
## 3 -0.081 -0.294 -0.128 -0.203 -0.677 0.530 -0.825  0.191 -0.422 -0.346  0.207
## 4 -0.360 -0.023 -0.506 -0.569 -0.799 0.550  0.060  0.706  0.266  0.249  0.421
## 5 -0.056  0.330 -0.300 -0.110 -0.169 0.157  0.093  0.342  0.344  0.165  0.288
## 6 -0.454  0.164 -0.283 -0.123 -0.136 0.505 -0.072 -0.072  0.506  0.697 -0.093
##    X1983  X1984  X1985  X1986  X1987 X1988  X1989 X1990  X1991  X1992  X1993
## 1  0.240  0.252  0.397 -0.015  0.493 1.035  0.015 0.925 -0.051 -0.212  0.254
## 2  0.418  0.348  0.307  0.295  0.685 0.618  0.000 0.568  0.375  0.092  0.373
## 3 -0.088 -0.238 -0.072  0.600 -0.078 0.397 -0.041 0.809 -0.280  0.106  0.113
## 4  0.617  0.054  0.552  0.347  1.050 1.402  0.453 1.332  0.102 -0.232  0.645
## 5  0.309  0.257  0.235  0.375  0.333 0.510  0.167 0.520  0.677  0.335 -0.092
## 6  0.451  0.072 -0.009  0.287  0.836 0.560  0.174 0.367  0.419  0.252  0.268
##   X1994  X1995  X1996     X1997     X1998 X1999 X2000 X2001 X2002 X2003 X2004
## 1 0.546  0.457 -0.093 0.4600000 0.6640000 1.271 1.064 1.377 1.457 0.710 1.482
## 2 0.393  0.549  0.515 0.5810000 0.9490000 0.622 0.458 0.660 0.886 1.002 0.814
## 3 1.360 -0.137  0.006 0.1110000 0.8210000 0.724 1.084 1.551 0.519 1.026 0.472
## 4 0.830  0.687  0.923 1.1130000 1.1560000 1.502 0.862 1.895 1.291 1.619 1.016
## 5 0.141  0.732  0.787 0.5419583 0.9748904 0.240 0.604 0.882 1.130 0.712 0.244
## 6 0.503  0.711  0.186 0.5670000 1.4170000 0.797 0.725 0.851 0.777 0.981 0.679
##   X2005 X2006 X2007 X2008     X2009 X2010 X2011 X2012 X2013 X2014 X2015 X2016
## 1 0.513 1.838 0.794 0.808 0.9480000 1.664 1.455 0.271 1.345 0.521 1.204 1.612
## 2 1.081 0.925 0.793 0.724 0.9600000 1.501 0.913 0.760 1.021 1.013 1.190 1.392
## 3 0.217 0.390 1.389 1.043 0.9770000 1.261 1.125 1.546 1.394 1.285 1.667 1.558
## 4 1.289 1.422 1.249 1.205 0.9580000 2.275 1.403 1.166 1.221 1.713 1.153 1.787
## 5 0.878 0.480 0.928 0.498 0.9177623 1.131 0.681 0.720 1.012 0.840 0.679 1.209
## 6 1.039 1.223 0.923 0.571 0.6810000 1.313 0.865 1.348 0.846 0.724 1.352 1.766
##   X2017 X2018 X2019 X2020 X2021 X2022 X2023
## 1 1.642 1.624 0.991 0.587 1.475 2.154 1.956
## 2 1.180 1.178 1.297 1.200 1.396 0.996 1.485
## 3 1.196 2.103 1.751 1.567 1.589 1.585 2.122
## 4 1.533 1.231 1.107 1.905 2.360 1.776 2.274
## 5 1.105 0.859 1.209 1.101 0.938 0.925 0.951
## 6 1.353 0.949 1.229 1.303 1.558 1.075 1.772
summary(climate_data)
##     ObjectId        Country              ISO2               ISO3          
##  Min.   :  1.00   Length:236         Length:236         Length:236        
##  1st Qu.: 59.75   Class :character   Class :character   Class :character  
##  Median :118.50   Mode  :character   Mode  :character   Mode  :character  
##  Mean   :118.73                                                           
##  3rd Qu.:177.25                                                           
##  Max.   :241.00                                                           
##   Indicator             Unit              Source            CTS.Code        
##  Length:236         Length:236         Length:236         Length:236        
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##    CTS.Name         CTS.Full.Descriptor     X1961              X1962         
##  Length:236         Length:236          Min.   :-0.69700   Min.   :-1.36400  
##  Class :character   Class :character    1st Qu.:-0.07725   1st Qu.:-0.17150  
##  Mode  :character   Mode  :character    Median : 0.13950   Median :-0.03500  
##                                         Mean   : 0.16655   Mean   :-0.02863  
##                                         3rd Qu.: 0.28100   3rd Qu.: 0.05850  
##                                         Max.   : 1.89400   Max.   : 1.03300  
##      X1963              X1964              X1965             X1966        
##  Min.   :-2.24400   Min.   :-0.87600   Min.   :-1.0510   Min.   :-2.3570  
##  1st Qu.:-0.16725   1st Qu.:-0.23150   1st Qu.:-0.3645   1st Qu.:-0.0065  
##  Median :-0.01623   Median :-0.09661   Median :-0.2457   Median : 0.1073  
##  Mean   :-0.01623   Mean   :-0.09661   Mean   :-0.2457   Mean   : 0.1073  
##  3rd Qu.: 0.17875   3rd Qu.: 0.05700   3rd Qu.:-0.1268   3rd Qu.: 0.2442  
##  Max.   : 1.18700   Max.   : 1.08300   Max.   : 0.8690   Max.   : 1.4240  
##      X1967             X1968              X1969              X1970         
##  Min.   :-1.0230   Min.   :-3.25100   Min.   :-1.74000   Min.   :-1.28100  
##  1st Qu.:-0.2440   1st Qu.:-0.30750   1st Qu.:-0.00025   1st Qu.:-0.00875  
##  Median :-0.1110   Median :-0.21314   Median : 0.13660   Median : 0.10057  
##  Mean   :-0.1077   Mean   :-0.21314   Mean   : 0.13660   Mean   : 0.10015  
##  3rd Qu.:-0.0270   3rd Qu.:-0.07275   3rd Qu.: 0.30900   3rd Qu.: 0.23600  
##  Max.   : 1.1430   Max.   : 0.47900   Max.   : 0.93500   Max.   : 1.00000  
##      X1971             X1972              X1973              X1974        
##  Min.   :-1.3090   Min.   :-1.79200   Min.   :-0.89200   Min.   :-0.9280  
##  1st Qu.:-0.2692   1st Qu.:-0.16425   1st Qu.: 0.09675   1st Qu.:-0.3025  
##  Median :-0.1958   Median :-0.06672   Median : 0.23750   Median :-0.1548  
##  Mean   :-0.1958   Mean   :-0.06672   Mean   : 0.23750   Mean   :-0.1548  
##  3rd Qu.:-0.0905   3rd Qu.: 0.09000   3rd Qu.: 0.41600   3rd Qu.:-0.0710  
##  Max.   : 0.6650   Max.   : 2.38600   Max.   : 1.14700   Max.   : 1.5960  
##      X1975              X1976             X1977             X1978         
##  Min.   :-1.09200   Min.   :-1.1540   Min.   :-0.7030   Min.   :-0.85500  
##  1st Qu.:-0.24025   1st Qu.:-0.3757   1st Qu.: 0.0855   1st Qu.:-0.01000  
##  Median :-0.04500   Median :-0.2422   Median : 0.1810   Median : 0.07109  
##  Mean   :-0.01387   Mean   :-0.2422   Mean   : 0.1810   Mean   : 0.07109  
##  3rd Qu.: 0.06450   3rd Qu.:-0.1133   3rd Qu.: 0.2933   3rd Qu.: 0.20825  
##  Max.   : 1.89300   Max.   : 0.9240   Max.   : 1.0830   Max.   : 0.93200  
##      X1979             X1980             X1981              X1982         
##  Min.   :-1.3220   Min.   :-0.7590   Min.   :-1.06600   Min.   :-0.68500  
##  1st Qu.: 0.1520   1st Qu.: 0.1373   1st Qu.: 0.07175   1st Qu.: 0.01525  
##  Median : 0.2366   Median : 0.2519   Median : 0.18124   Median : 0.16815  
##  Mean   : 0.2366   Mean   : 0.2519   Mean   : 0.18124   Mean   : 0.16815  
##  3rd Qu.: 0.3970   3rd Qu.: 0.4522   3rd Qu.: 0.32125   3rd Qu.: 0.37075  
##  Max.   : 1.2740   Max.   : 1.0620   Max.   : 1.56300   Max.   : 1.18200  
##      X1983             X1984             X1985              X1986         
##  Min.   :-2.4210   Min.   :-1.9230   Min.   :-1.21000   Min.   :-0.76600  
##  1st Qu.: 0.1817   1st Qu.:-0.1168   1st Qu.:-0.01600   1st Qu.: 0.02575  
##  Median : 0.3335   Median : 0.0803   Median : 0.07129   Median : 0.14353  
##  Mean   : 0.3335   Mean   : 0.0803   Mean   : 0.07129   Mean   : 0.14353  
##  3rd Qu.: 0.6550   3rd Qu.: 0.2482   3rd Qu.: 0.25900   3rd Qu.: 0.30000  
##  Max.   : 1.4460   Max.   : 1.8960   Max.   : 1.78900   Max.   : 0.90700  
##      X1987             X1988             X1989             X1990        
##  Min.   :-1.6490   Min.   :-2.1080   Min.   :-1.6760   Min.   :-0.7180  
##  1st Qu.: 0.2355   1st Qu.: 0.3307   1st Qu.: 0.0135   1st Qu.: 0.2820  
##  Median : 0.3927   Median : 0.4760   Median : 0.2180   Median : 0.5465  
##  Mean   : 0.3914   Mean   : 0.4760   Mean   : 0.2474   Mean   : 0.5500  
##  3rd Qu.: 0.6873   3rd Qu.: 0.6310   3rd Qu.: 0.3257   3rd Qu.: 0.6743  
##  Max.   : 1.4290   Max.   : 1.4020   Max.   : 2.1840   Max.   : 1.8390  
##      X1991             X1992              X1993             X1994        
##  Min.   :-0.6440   Min.   :-1.35600   Min.   :-1.5210   Min.   :-0.5750  
##  1st Qu.: 0.2248   1st Qu.: 0.00875   1st Qu.: 0.0185   1st Qu.: 0.2855  
##  Median : 0.3757   Median : 0.23500   Median : 0.2260   Median : 0.5200  
##  Mean   : 0.3757   Mean   : 0.22454   Mean   : 0.2024   Mean   : 0.5922  
##  3rd Qu.: 0.5380   3rd Qu.: 0.51200   3rd Qu.: 0.4868   3rd Qu.: 0.7778  
##  Max.   : 1.3160   Max.   : 1.60200   Max.   : 1.1140   Max.   : 1.9670  
##      X1995             X1996             X1997             X1998        
##  Min.   :-0.4310   Min.   :-0.7930   Min.   :-0.5180   Min.   :-0.5010  
##  1st Qu.: 0.3685   1st Qu.: 0.0525   1st Qu.: 0.3000   1st Qu.: 0.7883  
##  Median : 0.6224   Median : 0.2915   Median : 0.5420   Median : 0.9990  
##  Mean   : 0.6224   Mean   : 0.2915   Mean   : 0.5420   Mean   : 0.9749  
##  3rd Qu.: 0.7925   3rd Qu.: 0.5158   3rd Qu.: 0.7732   3rd Qu.: 1.1965  
##  Max.   : 2.1010   Max.   : 1.7100   Max.   : 1.9660   Max.   : 2.4840  
##      X1999             X2000             X2001             X2002        
##  Min.   :-0.5510   Min.   :-0.7930   Min.   :-0.1510   Min.   :-0.0730  
##  1st Qu.: 0.4105   1st Qu.: 0.3550   1st Qu.: 0.5192   1st Qu.: 0.6913  
##  Median : 0.7015   Median : 0.5780   Median : 0.7915   Median : 0.8790  
##  Mean   : 0.7413   Mean   : 0.6749   Mean   : 0.8503   Mean   : 0.9295  
##  3rd Qu.: 1.0020   3rd Qu.: 0.9882   3rd Qu.: 1.2277   3rd Qu.: 1.1327  
##  Max.   : 2.0670   Max.   : 2.0460   Max.   : 2.0170   Max.   : 2.2550  
##      X2003             X2004             X2005             X2006        
##  Min.   :-0.4550   Min.   :-0.5160   Min.   :-0.2870   Min.   :-0.1660  
##  1st Qu.: 0.6442   1st Qu.: 0.5730   1st Qu.: 0.6647   1st Qu.: 0.6310  
##  Median : 0.8440   Median : 0.7794   Median : 0.8793   Median : 0.8545  
##  Mean   : 0.8413   Mean   : 0.7794   Mean   : 0.8793   Mean   : 0.8903  
##  3rd Qu.: 1.0277   3rd Qu.: 0.9567   3rd Qu.: 1.1000   3rd Qu.: 1.1253  
##  Max.   : 2.3630   Max.   : 2.1610   Max.   : 2.4460   Max.   : 3.8860  
##      X2007             X2008             X2009             X2010        
##  Min.   :-0.4570   Min.   :-0.0330   Min.   :-0.1950   Min.   :-0.3420  
##  1st Qu.: 0.6915   1st Qu.: 0.4510   1st Qu.: 0.6847   1st Qu.: 0.7845  
##  Median : 0.9030   Median : 0.7475   Median : 0.9178   Median : 1.1178  
##  Mean   : 1.0257   Mean   : 0.8090   Mean   : 0.9178   Mean   : 1.1196  
##  3rd Qu.: 1.1458   3rd Qu.: 1.0430   3rd Qu.: 1.1798   3rd Qu.: 1.3300  
##  Max.   : 3.0500   Max.   : 2.5870   Max.   : 1.9060   Max.   : 3.2310  
##      X2011             X2012             X2013            X2014       
##  Min.   :-0.3560   Min.   :-0.2730   Min.   :0.1730   Min.   :0.1300  
##  1st Qu.: 0.5797   1st Qu.: 0.5890   1st Qu.:0.7225   1st Qu.:0.7097  
##  Median : 0.8035   Median : 0.8615   Median :0.9420   Median :1.0060  
##  Mean   : 0.8273   Mean   : 0.9087   Mean   :0.9420   Mean   :1.1141  
##  3rd Qu.: 1.0580   3rd Qu.: 1.1578   3rd Qu.:1.1590   3rd Qu.:1.2810  
##  Max.   : 2.1050   Max.   : 3.8250   Max.   :2.2710   Max.   :3.3820  
##      X2015             X2016           X2017           X2018       
##  Min.   :-0.1320   Min.   :0.234   Min.   :0.243   Min.   :0.3020  
##  1st Qu.: 0.9945   1st Qu.:1.155   1st Qu.:1.023   1st Qu.:0.8575  
##  Median : 1.2613   Median :1.440   Median :1.277   Median :1.1690  
##  Mean   : 1.2613   Mean   :1.440   Mean   :1.281   Mean   :1.3047  
##  3rd Qu.: 1.5063   3rd Qu.:1.692   3rd Qu.:1.501   3rd Qu.:1.6442  
##  Max.   : 3.3130   Max.   :5.327   Max.   :3.335   Max.   :4.0640  
##      X2019           X2020           X2021             X2022        
##  Min.   :0.121   Min.   :0.059   Min.   :-0.5690   Min.   :-0.6980  
##  1st Qu.:1.133   1st Qu.:1.157   1st Qu.: 0.9465   1st Qu.: 0.8598  
##  Median :1.401   Median :1.476   Median : 1.3098   Median : 1.3010  
##  Mean   :1.429   Mean   :1.523   Mean   : 1.3098   Mean   : 1.3490  
##  3rd Qu.:1.658   3rd Qu.:1.797   3rd Qu.: 1.6035   3rd Qu.: 1.8030  
##  Max.   :2.871   Max.   :3.669   Max.   : 3.0120   Max.   : 3.5120  
##      X2023      
##  Min.   :0.215  
##  1st Qu.:1.198  
##  Median :1.647  
##  Mean   :1.647  
##  3rd Qu.:2.066  
##  Max.   :3.578

Only the numeric columns are needed, and they need to be scaled for PCA.

scaled <- scale(climate_data[, -c(1:10)])
scaled <- as.data.frame(scaled)

Plot the data correlation plot

data_corr <- cor(scaled, method = "pearson")
corrplot(data_corr, order = "alphabet", tl.cex = 0.6)

Strong correlation past the year 1999 can be clearly seen. Let’s perform MDS

data_dist <- dist(t(scaled))
years = colnames(scaled) 

years_numeric <- as.numeric(substr(years, 2, 5)) 

mds <- cmdscale(data_dist, k = 2)

colors <- ifelse(years_numeric > 1999, "red", "blue") 

plot(mds, type = 'n')

text(mds, labels = years, cex=0.6, adj=0.5, col=colors)

MDS also shows that years past 1999 are clustered relatively tightly together.

Let’s perform PCA:

pca_result <- prcomp(scaled, scale. = FALSE)

summary(pca_result)
## Importance of components:
##                           PC1    PC2    PC3    PC4     PC5     PC6    PC7
## Standard deviation     4.0030 2.9361 2.5436 2.1030 1.61980 1.56622 1.4527
## Proportion of Variance 0.2544 0.1368 0.1027 0.0702 0.04165 0.03894 0.0335
## Cumulative Proportion  0.2544 0.3912 0.4939 0.5641 0.60573 0.64466 0.6782
##                            PC8     PC9    PC10    PC11    PC12    PC13    PC14
## Standard deviation     1.29055 1.24609 1.16145 1.12926 1.09298 1.00824 0.96600
## Proportion of Variance 0.02644 0.02465 0.02141 0.02024 0.01896 0.01614 0.01481
## Cumulative Proportion  0.70460 0.72924 0.75065 0.77090 0.78986 0.80599 0.82081
##                           PC15    PC16    PC17    PC18    PC19    PC20    PC21
## Standard deviation     0.91572 0.89490 0.86529 0.83392 0.82058 0.71412 0.69462
## Proportion of Variance 0.01331 0.01271 0.01188 0.01104 0.01069 0.00809 0.00766
## Cumulative Proportion  0.83412 0.84683 0.85871 0.86975 0.88044 0.88853 0.89619
##                           PC22    PC23    PC24    PC25    PC26    PC27    PC28
## Standard deviation     0.67690 0.66746 0.64193 0.61953 0.58257 0.56321 0.54381
## Proportion of Variance 0.00727 0.00707 0.00654 0.00609 0.00539 0.00504 0.00469
## Cumulative Proportion  0.90346 0.91054 0.91708 0.92317 0.92856 0.93359 0.93829
##                           PC29    PC30    PC31    PC32    PC33    PC34    PC35
## Standard deviation     0.51528 0.50544 0.48645 0.47296 0.47121 0.44437 0.42447
## Proportion of Variance 0.00421 0.00406 0.00376 0.00355 0.00352 0.00313 0.00286
## Cumulative Proportion  0.94250 0.94656 0.95031 0.95386 0.95739 0.96052 0.96338
##                           PC36    PC37    PC38    PC39    PC40    PC41    PC42
## Standard deviation     0.41881 0.39643 0.39063 0.38511 0.37094 0.35810 0.34525
## Proportion of Variance 0.00278 0.00249 0.00242 0.00235 0.00218 0.00204 0.00189
## Cumulative Proportion  0.96616 0.96866 0.97108 0.97344 0.97562 0.97766 0.97955
##                           PC43   PC44    PC45   PC46    PC47    PC48    PC49
## Standard deviation     0.32539 0.3176 0.31167 0.3073 0.30176 0.28222 0.27077
## Proportion of Variance 0.00168 0.0016 0.00154 0.0015 0.00145 0.00126 0.00116
## Cumulative Proportion  0.98123 0.9828 0.98437 0.9859 0.98732 0.98858 0.98974
##                           PC50    PC51    PC52    PC53    PC54    PC55    PC56
## Standard deviation     0.26537 0.25778 0.24867 0.24794 0.24151 0.22611 0.21434
## Proportion of Variance 0.00112 0.00105 0.00098 0.00098 0.00093 0.00081 0.00073
## Cumulative Proportion  0.99086 0.99192 0.99290 0.99387 0.99480 0.99561 0.99634
##                           PC57    PC58    PC59   PC60   PC61    PC62    PC63
## Standard deviation     0.20212 0.19784 0.18754 0.1777 0.1775 0.16909 0.15409
## Proportion of Variance 0.00065 0.00062 0.00056 0.0005 0.0005 0.00045 0.00038
## Cumulative Proportion  0.99699 0.99761 0.99817 0.9987 0.9992 0.99962 1.00000

As we can see, 63 variables can be condenced into 22 components and keep 90% of variance. Therefore, we use 22 components for further analysis.

pca_scores <- as.data.frame(pca_result$x)


pca_subset <- pca_scores[, 1:22]

Now using the extracted components we perform K-means clustering to cluster countries into 3

set.seed(123)
kmeans_result <- kmeans(pca_subset, centers = 3)


clustered_data <- cbind(climate_data, cluster = kmeans_result$cluster)

Visualisation of clusters.

ggplot(pca_subset, aes(x = PC1, y = PC2, color = factor(kmeans_result$cluster))) +
  geom_point(size = 3) +
  labs(title = "Clusters of Observations in PCA Space", color = "Cluster") +
  theme_minimal()

Now we’d like to plot the clustered data with some labels to have a better idea of what the clusters represent:

countries <- climate_data[, 2]

clustered_data <- cbind(Country = countries, 
                        KMeans_Cluster = kmeans_result$cluster)

clustered_data <- as.data.frame(clustered_data)

ggplot(data = cbind(pca_subset, Country = countries, Cluster = factor(kmeans_result$cluster)), 
       aes(x = PC1, y = PC2, color = Cluster, label = Country)) +
  geom_point(size = 3) +  # Plot the points
  geom_text_repel(size = 3, max.overlaps = 10) +
  labs(title = "Clusters of Observations with Country Names in PCA Space",
       x = "Principal Component 1",
       y = "Principal Component 2",
       color = "Cluster") +
  theme_minimal()

Now let’s colour the world map based on the clustering results. First we need to handle the name mismatch between the dataset’s and map’s country names

world_map <- map_data("world")

world_map <- as.data.frame(world_map)

world_map$region <- as.character(world_map$region)

dataset_countries <- unique(clustered_data$Country)
map_countries <- unique(world_map$region)

mismatched <- setdiff(dataset_countries, map_countries)
print(mismatched)
##  [1] "Afghanistan, Islamic Rep. of"        "Africa"                             
##  [3] "Americas"                            "Andorra, Principality of"           
##  [5] "Antigua and Barbuda"                 "Armenia, Rep. of"                   
##  [7] "Aruba, Kingdom of the Netherlands"   "Asia"                               
##  [9] "Azerbaijan, Rep. of"                 "Bahamas, The"                       
## [11] "Bahrain, Kingdom of"                 "Belarus, Rep. of"                   
## [13] "British Virgin Islands"              "Brunei Darussalam"                  
## [15] "Cabo Verde"                          "Central African Rep."               
## [17] "China, P.R.: Hong Kong"              "China, P.R.: Macao"                 
## [19] "China, P.R.: Mainland"               "Comoros, Union of the"              
## [21] "Congo, Dem. Rep. of the"             "Congo, Rep. of"                     
## [23] "Côte d'Ivoire"                       "Croatia, Rep. of"                   
## [25] "Czech Rep."                          "Dominican Rep."                     
## [27] "Egypt, Arab Rep. of"                 "Equatorial Guinea, Rep. of"         
## [29] "Eritrea, The State of"               "Estonia, Rep. of"                   
## [31] "Eswatini, Kingdom of"                "Ethiopia, The Federal Dem. Rep. of" 
## [33] "Europe"                              "Falkland Islands (Malvinas)"        
## [35] "Fiji, Rep. of"                       "Gambia, The"                        
## [37] "Gibraltar"                           "Guiana, French"                     
## [39] "Holy See"                            "Iran, Islamic Rep. of"              
## [41] "Kazakhstan, Rep. of"                 "Korea, Dem. People's Rep. of"       
## [43] "Korea, Rep. of"                      "Kyrgyz Rep."                        
## [45] "Lao People's Dem. Rep."              "Lesotho, Kingdom of"                
## [47] "Madagascar, Rep. of"                 "Marshall Islands, Rep. of the"      
## [49] "Mauritania, Islamic Rep. of"         "Micronesia, Federated States of"    
## [51] "Moldova, Rep. of"                    "Mozambique, Rep. of"                
## [53] "Nauru, Rep. of"                      "Netherlands, The"                   
## [55] "North Macedonia, Republic of "       "Oceania"                            
## [57] "Palau, Rep. of"                      "Poland, Rep. of"                    
## [59] "Réunion"                             "Russian Federation"                 
## [61] "San Marino, Rep. of"                 "São Tomé and Príncipe, Dem. Rep. of"
## [63] "Serbia, Rep. of"                     "Slovak Rep."                        
## [65] "Slovenia, Rep. of"                   "South Sudan, Rep. of"               
## [67] "St. Kitts and Nevis"                 "St. Lucia"                          
## [69] "St. Vincent and the Grenadines"      "Svalbard Jan Mayen Islands"         
## [71] "Syrian Arab Rep."                    "Taiwan Province of China"           
## [73] "Tajikistan, Rep. of"                 "Tanzania, United Rep. of"           
## [75] "Timor-Leste, Dem. Rep. of"           "Tokelau"                            
## [77] "Trinidad and Tobago"                 "Türkiye, Rep. of"                   
## [79] "Tuvalu"                              "United Kingdom"                     
## [81] "United States"                       "United States Virgin Islands"       
## [83] "Uzbekistan, Rep. of"                 "Venezuela, Rep. Bolivariana de"     
## [85] "Wallis and Futuna Islands"           "West Bank and Gaza"                 
## [87] "World"                               "Yemen, Rep. of"
clustered_data$Country <- countrycode(
  clustered_data$Country, 
  origin = "country.name", 
  destination = "country.name"
)

clustered_data$Country <- ifelse(
  clustered_data$Country == "Côte d'Ivoire", 
  "Ivory Coast", 
  clustered_data$Country
)

clustered_data <- clustered_data %>%
  filter(!Country %in% c("Africa", "Asia", "World", "Antarctica", "Americas"))

colnames(clustered_data) <- c("Country", "KMeans_Cluster")

clustered_data$Country <- as.character(clustered_data$Country)

clustered_data$Country <- ifelse(clustered_data$Country == "United States", "USA", clustered_data$Country)

Now that the mismatch is handled, we can plot the world map with clustering.

map_data_with_clusters <- left_join(world_map, clustered_data, by = c("region" = "Country"))

# Plot the map
ggplot(map_data_with_clusters, aes(x = long, y = lat, group = group, fill = factor(KMeans_Cluster))) +
  geom_polygon(color = "black") +
  scale_fill_manual(values = c("green", "blue", "red")) +
  labs(title = "Map Colored by KMeans Clusters", fill = "Cluster") +
  theme_minimal() +
  coord_fixed(ratio = 1.1)

Now we need to perform the same steps, except for the Gas emissions dataset

perform PCA on the gas emissions data

## Importance of components:
##                           PC1     PC2     PC3     PC4     PC5     PC6     PC7
## Standard deviation     5.4645 1.42641 0.25046 0.15161 0.08615 0.07190 0.05440
## Proportion of Variance 0.9331 0.06358 0.00196 0.00072 0.00023 0.00016 0.00009
## Cumulative Proportion  0.9331 0.99672 0.99868 0.99939 0.99963 0.99979 0.99988
##                            PC8     PC9    PC10    PC11    PC12    PC13    PC14
## Standard deviation     0.03792 0.02699 0.02088 0.01791 0.01403 0.01206 0.01043
## Proportion of Variance 0.00004 0.00002 0.00001 0.00001 0.00001 0.00000 0.00000
## Cumulative Proportion  0.99993 0.99995 0.99996 0.99997 0.99998 0.99998 0.99999
##                           PC15     PC16     PC17     PC18   PC19     PC20
## Standard deviation     0.01001 0.007621 0.006874 0.006775 0.0059 0.005428
## Proportion of Variance 0.00000 0.000000 0.000000 0.000000 0.0000 0.000000
## Cumulative Proportion  0.99999 0.999990 0.999990 0.999990 1.0000 1.000000
##                            PC21     PC22     PC23     PC24     PC25     PC26
## Standard deviation     0.005289 0.004593 0.004286 0.004059 0.003563 0.003351
## Proportion of Variance 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
## Cumulative Proportion  1.000000 1.000000 1.000000 1.000000 1.000000 1.000000
##                            PC27     PC28     PC29     PC30     PC31     PC32
## Standard deviation     0.003022 0.002799 0.002521 0.002247 0.001679 0.001589
## Proportion of Variance 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
## Cumulative Proportion  1.000000 1.000000 1.000000 1.000000 1.000000 1.000000

We use only 2 components as they capture 99% of variance.

Clusters with labels

## Warning: ggrepel: 185 unlabeled data points (too many overlaps). Consider
## increasing max.overlaps

Country name mismatch handling and map plotting

##  [1] "United States"                    "European Union (27)"             
##  [3] "Türkiye"                          "United Kingdom"                  
##  [5] "Côte d'Ivoire"                    "Trinidad and Tobago"             
##  [7] "Macedonia"                        "Eswatini"                        
##  [9] "Antigua and Barbuda"              "Saint Kitts and Nevis"           
## [11] "Saint Vincent and the Grenadines" "Tuvalu"
## Warning: Some values were not matched unambiguously: European Union (27), Micronesia

We can see that there are 2 extreme outliers: China and USA. The rest of the countries are fairly similar, but countries like Russia, India or Brazil have a little more emissions than others.

However, the patterns seen on the temperature change map do not exactly correlate to the patterns seen on the gas emissions map.

While more precise analysis is needed, this visual representation might provide insight in how gas emissions affect the temperature change for each country.