Overview

I am using the .csv file of the Annual Surface Temperature Change indicator from climatedata.imf.org. My goal is to tidy the dataset for analysis so that I can see which regions are most affected by climate change.

Let’s read the .csv file into R from my GitHub repository.

library(tidyverse)  # provides tibble(), select(), pivot_longer(), and ggplot()
climate <- read.csv("https://raw.githubusercontent.com/evelynbartley/Data-607/main/Indicator_3_1_Climate_Indicators_Annual_Mean_Global_Surface_Temperature_577579683071085080.csv")
tibble(climate)
## # A tibble: 225 × 72
##    ObjectId Country         ISO2  ISO3  Indicator Unit  Source CTS.Code CTS.Name
##       <int> <chr>           <chr> <chr> <chr>     <chr> <chr>  <chr>    <chr>   
##  1        1 Afghanistan, I… AF    AFG   Temperat… Degr… Food … ECCS     Surface…
##  2        2 Albania         AL    ALB   Temperat… Degr… Food … ECCS     Surface…
##  3        3 Algeria         DZ    DZA   Temperat… Degr… Food … ECCS     Surface…
##  4        4 American Samoa  AS    ASM   Temperat… Degr… Food … ECCS     Surface…
##  5        5 Andorra, Princ… AD    AND   Temperat… Degr… Food … ECCS     Surface…
##  6        6 Angola          AO    AGO   Temperat… Degr… Food … ECCS     Surface…
##  7        7 Anguilla        AI    AIA   Temperat… Degr… Food … ECCS     Surface…
##  8        8 Antigua and Ba… AG    ATG   Temperat… Degr… Food … ECCS     Surface…
##  9        9 Argentina       AR    ARG   Temperat… Degr… Food … ECCS     Surface…
## 10       10 Armenia, Rep. … AM    ARM   Temperat… Degr… Food … ECCS     Surface…
## # ℹ 215 more rows
## # ℹ 63 more variables: CTS.Full.Descriptor <chr>, X1961 <dbl>, X1962 <dbl>,
## #   X1963 <dbl>, X1964 <dbl>, X1965 <dbl>, X1966 <dbl>, X1967 <dbl>,
## #   X1968 <dbl>, X1969 <dbl>, X1970 <dbl>, X1971 <dbl>, X1972 <dbl>,
## #   X1973 <dbl>, X1974 <dbl>, X1975 <dbl>, X1976 <dbl>, X1977 <dbl>,
## #   X1978 <dbl>, X1979 <dbl>, X1980 <dbl>, X1981 <dbl>, X1982 <dbl>,
## #   X1983 <dbl>, X1984 <dbl>, X1985 <dbl>, X1986 <dbl>, X1987 <dbl>, …

Let’s clean up the dataset to include only the variables we need for analysis: a country identifier and the yearly columns from 1961 through 2000. I want to use the ISO3 code for each country instead of the country’s name for tidiness.

climate1 <- climate |> 
  select(Country = ISO3, X1961:X2000)
tibble(climate1)
## # A tibble: 225 × 41
##    Country  X1961  X1962  X1963  X1964  X1965  X1966  X1967  X1968  X1969  X1970
##    <chr>    <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>
##  1 AFG     -0.113 -0.164  0.847 -0.764 -0.244  0.226 -0.371 -0.423 -0.539  0.813
##  2 ALB      0.627  0.326  0.075 -0.166 -0.388  0.559 -0.074  0.081 -0.013 -0.106
##  3 DZA      0.164  0.114  0.077  0.25  -0.1    0.433 -0.026 -0.067  0.291  0.116
##  4 ASM      0.079 -0.042  0.169 -0.14  -0.562  0.181 -0.368 -0.187  0.132 -0.047
##  5 AND      0.736  0.112 -0.752  0.308 -0.49   0.415  0.637  0.018 -0.137  0.121
##  6 AGO      0.041 -0.152 -0.19  -0.229 -0.196  0.175 -0.081 -0.193  0.188  0.248
##  7 AIA      0.086 -0.024  0.234  0.189 -0.365 -0.001 -0.257 -0.2    0.317  0.082
##  8 ATG      0.09   0.031  0.288  0.214 -0.385  0.097 -0.192 -0.225  0.271  0.109
##  9 ARG      0.122 -0.046  0.162 -0.343  0.09  -0.163  0      0.472  0.292  0.438
## 10 ARM     NA     NA     NA     NA     NA     NA     NA     NA     NA     NA    
## # ℹ 215 more rows
## # ℹ 30 more variables: X1971 <dbl>, X1972 <dbl>, X1973 <dbl>, X1974 <dbl>,
## #   X1975 <dbl>, X1976 <dbl>, X1977 <dbl>, X1978 <dbl>, X1979 <dbl>,
## #   X1980 <dbl>, X1981 <dbl>, X1982 <dbl>, X1983 <dbl>, X1984 <dbl>,
## #   X1985 <dbl>, X1986 <dbl>, X1987 <dbl>, X1988 <dbl>, X1989 <dbl>,
## #   X1990 <dbl>, X1991 <dbl>, X1992 <dbl>, X1993 <dbl>, X1994 <dbl>,
## #   X1995 <dbl>, X1996 <dbl>, X1997 <dbl>, X1998 <dbl>, X1999 <dbl>, …

Instead of having a column for every year, I want one column for the year and one column for the surface temperature change in degrees Celsius.

climate2 <- climate1 %>%
  pivot_longer(
    cols = starts_with("X"),
    names_to = "Year",
    values_to = "TempChange"
  )
tibble(climate2)
## # A tibble: 9,000 × 3
##    Country Year  TempChange
##    <chr>   <chr>      <dbl>
##  1 AFG     X1961     -0.113
##  2 AFG     X1962     -0.164
##  3 AFG     X1963      0.847
##  4 AFG     X1964     -0.764
##  5 AFG     X1965     -0.244
##  6 AFG     X1966      0.226
##  7 AFG     X1967     -0.371
##  8 AFG     X1968     -0.423
##  9 AFG     X1969     -0.539
## 10 AFG     X1970      0.813
## # ℹ 8,990 more rows
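
The Year column above still carries the "X" prefix that read.csv() adds to numeric column names, and it is stored as text. A minimal sketch of one way to get an integer year during the pivot, assuming a tidyr version that supports the names_prefix and names_transform arguments of pivot_longer(). The toy data frame is a small stand-in for climate1 built from values printed above; toy and toy_long are illustrative names, not part of the original analysis.

```r
library(tidyr)

# Stand-in for climate1, using AFG and ALB values copied from the output above
toy <- data.frame(
  Country = c("AFG", "ALB"),
  X1961 = c(-0.113, 0.627),
  X1962 = c(-0.164, 0.326),
  X1963 = c(0.847, 0.075),
  stringsAsFactors = FALSE
)

toy_long <- pivot_longer(
  toy,
  cols = starts_with("X"),
  names_to = "Year",
  names_prefix = "X",                         # strip the leading "X"
  names_transform = list(Year = as.integer),  # store the year as an integer
  values_to = "TempChange"
)
toy_long
```

With an integer Year, later steps such as filtering by year range or plotting over time do not need extra string handling.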

To get a single value per country that summarizes the change in surface temperature from 1961 to 2000, I want to calculate the average of the yearly temperature changes for each country.

climate3 <- climate2 %>%
  group_by(Country) %>%
  summarise(avg = mean(TempChange, na.rm = TRUE))
head(climate3)
## # A tibble: 6 × 2
##   Country    avg
##   <chr>    <dbl>
## 1 ABW     0.147 
## 2 AFG     0.139 
## 3 AGO     0.212 
## 4 AIA     0.189 
## 5 ALB     0.0844
## 6 AND     0.380
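
Because na.rm = TRUE silently drops missing years, a country's average may rest on fewer than 40 values, or on none at all. A minimal sketch of tracking that alongside the mean, using a small stand-in built from the AFG and ARM values printed earlier (toy_long, toy_avg, and n_years are illustrative names, not part of the original analysis):

```r
library(dplyr)

# Stand-in for climate2: AFG has data for these years, ARM is NA in all of them
toy_long <- data.frame(
  Country = rep(c("AFG", "ARM"), each = 3),
  Year = rep(c("X1961", "X1962", "X1963"), times = 2),
  TempChange = c(-0.113, -0.164, 0.847, NA, NA, NA),
  stringsAsFactors = FALSE
)

toy_avg <- toy_long %>%
  group_by(Country) %>%
  summarise(
    avg = mean(TempChange, na.rm = TRUE),  # NaN when every value is missing
    n_years = sum(!is.na(TempChange))      # number of years behind the mean
  )
toy_avg
```

A country with n_years of 0 ends up with an avg of NaN, which is presumably the kind of row the histogram step reports as non-finite.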

Let’s see the distribution of the average change in temperature.

avgofavgs <- mean(climate3$avg, na.rm = TRUE)
ggplot(climate3, aes(x = avg)) +
  geom_histogram() +
  geom_vline(aes(xintercept = avgofavgs), color = "tomato", linewidth = 1)
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
## Warning: Removed 4 rows containing non-finite values (`stat_bin()`).

Our distribution looks roughly normal, with the red line marking the mean of the country averages. There do seem to be two outliers.
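
The message and warning printed with the plot come from the default bin count and from non-finite averages. A minimal sketch of one way to address both, using the six averages shown by head(climate3) plus one placeholder row. The "XXX" country code is hypothetical, and the binwidth of 0.05 is an arbitrary assumption that would need tuning on the full data.

```r
library(dplyr)
library(ggplot2)

# Stand-in for climate3; "XXX" is a hypothetical country with no data
toy_avg <- data.frame(
  Country = c("ABW", "AFG", "AGO", "AIA", "ALB", "AND", "XXX"),
  avg = c(0.147, 0.139, 0.212, 0.189, 0.0844, 0.380, NaN),
  stringsAsFactors = FALSE
)

# Drop non-finite averages explicitly instead of letting stat_bin() warn
plot_data <- filter(toy_avg, is.finite(avg))
avg_of_avgs <- mean(plot_data$avg)

p <- ggplot(plot_data, aes(x = avg)) +
  geom_histogram(binwidth = 0.05) +  # explicit bin width silences the message
  geom_vline(xintercept = avg_of_avgs, color = "tomato", linewidth = 1)
p
```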

I want to see which country had the highest average change in temperature and which had the lowest.

climate3[which.min(climate3$avg), ]
## # A tibble: 1 × 2
##   Country    avg
##   <chr>    <dbl>
## 1 GRL     -0.156
climate3[which.max(climate3$avg), ]
## # A tibble: 1 × 2
##   Country   avg
##   <chr>   <dbl>
## 1 LUX      1.65

Analysis and Conclusion

The country with the lowest average temperature change from 1961 to 2000 was Greenland at -0.156 degrees Celsius, and the country with the highest was Luxembourg at 1.651 degrees Celsius. Using the small sort arrows when viewing climate3 in the data viewer, we can arrange the averages in descending order. From this, I can see that Luxembourg, Belgium, Estonia, Latvia, and Slovenia had the five highest average temperature changes. All five are in Europe, which suggests that, over this period and by this measure, Europe is the region most affected by climate change.
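
The ranking read off the viewer can also be produced in code, which keeps the result reproducible. A minimal sketch using dplyr's arrange() and slice_max(), run on the six rows shown by head(climate3) rather than the full table, so the countries it returns are not the actual top five. The names toy_avg, ranked, and top3 are illustrative.

```r
library(dplyr)

# Stand-in for climate3, using the six averages printed by head(climate3)
toy_avg <- data.frame(
  Country = c("ABW", "AFG", "AGO", "AIA", "ALB", "AND"),
  avg = c(0.147, 0.139, 0.212, 0.189, 0.0844, 0.380),
  stringsAsFactors = FALSE
)

# Full ranking, highest average first
ranked <- toy_avg %>% arrange(desc(avg))

# Only the n highest averages (n = 5 would match the analysis above)
top3 <- toy_avg %>% slice_max(avg, n = 3)
top3
```

On the real climate3, replacing toy_avg with climate3 and setting n = 5 should list the five countries named above, assuming no ties at the cutoff.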