Loading Global Inflation Data

Below I import the CSV file containing global inflation data from GitHub, where I saved the file.

library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ ggplot2   3.4.4     ✔ tibble    3.2.1
## ✔ lubridate 1.9.3     ✔ tidyr     1.3.1
## ✔ purrr     1.0.2     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
fileURL <- 'https://raw.githubusercontent.com/stoybis/DATA607Repo/main/global_inflation_data.csv'

inflationData <- read.csv(url(fileURL))

head(inflationData)
##          country_name                                  indicator_name X1980
## 1         Afghanistan Annual average inflation (consumer prices) rate  13.4
## 2             Albania Annual average inflation (consumer prices) rate    NA
## 3             Algeria Annual average inflation (consumer prices) rate   9.7
## 4             Andorra Annual average inflation (consumer prices) rate    NA
## 5              Angola Annual average inflation (consumer prices) rate  46.7
## 6 Antigua and Barbuda Annual average inflation (consumer prices) rate  19.0
##   X1981 X1982 X1983 X1984 X1985 X1986 X1987 X1988 X1989 X1990 X1991  X1992
## 1  22.2  18.2  15.9  20.4   8.7  -2.1  18.4  27.5  71.5  47.4  43.8  58.19
## 2    NA    NA    NA    NA    NA    NA    NA    NA    NA  -0.2  35.7 226.00
## 3  14.6   6.6   7.8   6.3  10.4  14.0   5.9   5.9   9.2   9.3  25.9  31.70
## 4    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA     NA
## 5   1.4   1.8   1.8   1.8   1.8   1.8   1.8   1.8   1.8   1.8  85.3 299.10
## 6  11.5   4.2   2.3   3.8   1.0   0.5   3.6   6.8   4.4   6.6   4.5   3.00
##     X1993  X1994  X1995   X1996  X1997  X1998  X1999 X2000 X2001  X2002 X2003
## 1   33.99  20.01   14.0   14.01  14.01  14.01  14.01   0.0 -43.4  51.93 35.66
## 2   85.00  22.60    7.8   12.70  33.20  20.60   0.40   0.0   3.1   5.20  2.40
## 3   20.50  29.00   29.8   18.70   5.70   5.00   2.60   0.3   4.2   1.40  4.30
## 4      NA     NA     NA      NA     NA     NA     NA    NA    NA   3.10  3.10
## 5 1379.50 949.80 2672.2 4146.00 221.50 107.40 248.20 325.0 152.6 108.90 98.20
## 6    3.10   6.50    2.7    3.00   0.40   3.30   1.10  -0.2   1.9   2.40  2.00
##   X2004 X2005 X2006 X2007 X2008 X2009 X2010 X2011 X2012 X2013 X2014 X2015 X2016
## 1 16.36 10.57  6.78  8.68 26.42 -6.81  2.18  11.8  6.44  7.39  4.67 -0.66  4.38
## 2  2.90  2.40  2.40  3.00  3.30  2.20  3.60   3.4  2.00  1.90  1.60  1.90  1.30
## 3  4.00  1.40  2.30  3.70  4.90  5.70  3.90   4.5  8.90  3.30  2.90  4.80  6.40
## 4  2.90  3.50  3.70  2.70  4.30 -1.20  1.70   2.6  1.50  0.50 -0.10 -1.10 -0.40
## 5 43.50 23.00 13.30 12.20 12.50 13.70 14.50  13.5 10.30  8.80  7.30  9.20 30.70
## 6  2.00  2.10  1.80  1.40  5.30 -0.60  3.40   3.5  3.40  1.10  1.10  1.00 -0.50
##   X2017 X2018 X2019 X2020 X2021 X2022 X2023 X2024
## 1  4.98  0.63   2.3  5.44  5.06 13.71   9.1    NA
## 2  2.00  2.00   1.4  1.60  2.00  6.70   4.8   4.0
## 3  5.60  4.30   2.0  2.40  7.20  9.30   9.0   6.8
## 4  2.60  1.00   0.5  0.10  1.70  6.20   5.2   3.5
## 5 29.80 19.60  17.1 22.30 25.80 21.40  13.1  22.3
## 6  2.40  1.20   1.4  1.10  1.60  7.50   5.0   2.9
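The `X` prefix on the year columns in the output above comes from `read.csv`'s default `check.names = TRUE`, which rewrites column names that are not syntactically valid R names. As a minimal sketch of an alternative, the block below reads a small inline CSV with `check.names = FALSE`. The CSV text, the country names, and the values in it are made up for illustration and are not taken from the real file.

```r
# Tiny inline CSV standing in for the real file (names and values are hypothetical).
csvText <- "country_name,indicator_name,1980,1981
Aland,Inflation rate,1.5,2.0
Borduria,Inflation rate,,3.5"

# check.names = FALSE stops read.csv from rewriting 1980 as X1980.
sampleData <- read.csv(text = csvText, check.names = FALSE)

# The year columns keep their original names; the empty field is read as NA.
colnames(sampleData)
sampleData
```

With this option the year columns would need backticks when referenced by name (for example `` sampleData$`1980` ``), which is one reason the default behavior exists.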

Tidying the data

The data is not in a tidy format because each row holds multiple observations. Each year column is a separate observation: for every country, we have the average annual inflation rate in 1980, 1981, 1982, and so on. Each of these observations should have its own row.

First, I remove the X prefix from the year column names.

# Anchor the pattern with ^ so only a leading X is removed from each year column name
colnames(inflationData)[3:ncol(inflationData)] <- str_replace(colnames(inflationData)[3:ncol(inflationData)], "^X", "")

head(inflationData)
##          country_name                                  indicator_name 1980 1981
## 1         Afghanistan Annual average inflation (consumer prices) rate 13.4 22.2
## 2             Albania Annual average inflation (consumer prices) rate   NA   NA
## 3             Algeria Annual average inflation (consumer prices) rate  9.7 14.6
## 4             Andorra Annual average inflation (consumer prices) rate   NA   NA
## 5              Angola Annual average inflation (consumer prices) rate 46.7  1.4
## 6 Antigua and Barbuda Annual average inflation (consumer prices) rate 19.0 11.5
##   1982 1983 1984 1985 1986 1987 1988 1989 1990 1991   1992    1993   1994
## 1 18.2 15.9 20.4  8.7 -2.1 18.4 27.5 71.5 47.4 43.8  58.19   33.99  20.01
## 2   NA   NA   NA   NA   NA   NA   NA   NA -0.2 35.7 226.00   85.00  22.60
## 3  6.6  7.8  6.3 10.4 14.0  5.9  5.9  9.2  9.3 25.9  31.70   20.50  29.00
## 4   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA     NA      NA     NA
## 5  1.8  1.8  1.8  1.8  1.8  1.8  1.8  1.8  1.8 85.3 299.10 1379.50 949.80
## 6  4.2  2.3  3.8  1.0  0.5  3.6  6.8  4.4  6.6  4.5   3.00    3.10   6.50
##     1995    1996   1997   1998   1999  2000  2001   2002  2003  2004  2005
## 1   14.0   14.01  14.01  14.01  14.01   0.0 -43.4  51.93 35.66 16.36 10.57
## 2    7.8   12.70  33.20  20.60   0.40   0.0   3.1   5.20  2.40  2.90  2.40
## 3   29.8   18.70   5.70   5.00   2.60   0.3   4.2   1.40  4.30  4.00  1.40
## 4     NA      NA     NA     NA     NA    NA    NA   3.10  3.10  2.90  3.50
## 5 2672.2 4146.00 221.50 107.40 248.20 325.0 152.6 108.90 98.20 43.50 23.00
## 6    2.7    3.00   0.40   3.30   1.10  -0.2   1.9   2.40  2.00  2.00  2.10
##    2006  2007  2008  2009  2010 2011  2012 2013  2014  2015  2016  2017  2018
## 1  6.78  8.68 26.42 -6.81  2.18 11.8  6.44 7.39  4.67 -0.66  4.38  4.98  0.63
## 2  2.40  3.00  3.30  2.20  3.60  3.4  2.00 1.90  1.60  1.90  1.30  2.00  2.00
## 3  2.30  3.70  4.90  5.70  3.90  4.5  8.90 3.30  2.90  4.80  6.40  5.60  4.30
## 4  3.70  2.70  4.30 -1.20  1.70  2.6  1.50 0.50 -0.10 -1.10 -0.40  2.60  1.00
## 5 13.30 12.20 12.50 13.70 14.50 13.5 10.30 8.80  7.30  9.20 30.70 29.80 19.60
## 6  1.80  1.40  5.30 -0.60  3.40  3.5  3.40 1.10  1.10  1.00 -0.50  2.40  1.20
##   2019  2020  2021  2022 2023 2024
## 1  2.3  5.44  5.06 13.71  9.1   NA
## 2  1.4  1.60  2.00  6.70  4.8  4.0
## 3  2.0  2.40  7.20  9.30  9.0  6.8
## 4  0.5  0.10  1.70  6.20  5.2  3.5
## 5 17.1 22.30 25.80 21.40 13.1 22.3
## 6  1.4  1.10  1.60  7.50  5.0  2.9

Then I pivot the data frame to a longer format so that each year is its own observation in a separate row. I also convert the years and values to numeric type and the country names to a factor.

inflationDataTidy <- pivot_longer(inflationData,
                                  cols = !c('country_name','indicator_name'), names_to = 'year', values_to = 'value')

inflationDataTidy$country_name <- as.factor(inflationDataTidy$country_name)
inflationDataTidy$year <- as.numeric(inflationDataTidy$year)
inflationDataTidy$value <- as.numeric(inflationDataTidy$value)

head(inflationDataTidy, n = 10)
## # A tibble: 10 × 4
##    country_name indicator_name                                   year value
##    <fct>        <chr>                                           <dbl> <dbl>
##  1 Afghanistan  Annual average inflation (consumer prices) rate  1980  13.4
##  2 Afghanistan  Annual average inflation (consumer prices) rate  1981  22.2
##  3 Afghanistan  Annual average inflation (consumer prices) rate  1982  18.2
##  4 Afghanistan  Annual average inflation (consumer prices) rate  1983  15.9
##  5 Afghanistan  Annual average inflation (consumer prices) rate  1984  20.4
##  6 Afghanistan  Annual average inflation (consumer prices) rate  1985   8.7
##  7 Afghanistan  Annual average inflation (consumer prices) rate  1986  -2.1
##  8 Afghanistan  Annual average inflation (consumer prices) rate  1987  18.4
##  9 Afghanistan  Annual average inflation (consumer prices) rate  1988  27.5
## 10 Afghanistan  Annual average inflation (consumer prices) rate  1989  71.5

The data frame is now tidy: each observation is in its own row.
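The prefix removal, the pivot, and the year conversion can also be done in a single `pivot_longer` call. The sketch below assumes a small made-up data frame named `wide` with the same column layout as the inflation data. It uses the `names_prefix` and `names_transform` arguments of `tidyr::pivot_longer` and is meant as an illustration rather than a replacement for the steps above.

```r
library(tidyr)

# Hypothetical wide data with the same layout as the inflation data.
wide <- data.frame(
  country_name = c("Aland", "Borduria"),
  indicator_name = "Inflation rate",
  X1980 = c(1.5, NA),
  X1981 = c(2.0, 3.5)
)

long <- pivot_longer(
  wide,
  cols = !c(country_name, indicator_name),    # every column except the two identifiers
  names_to = "year",
  names_prefix = "X",                         # strip the leading X from the column names
  names_transform = list(year = as.integer),  # store the year as an integer, not text
  values_to = "value"
)

long
```

The result has one row per country and year, with `year` already stored as an integer.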

Analysis

One of the questions is how average annual inflation compares across countries in similar regions. I am curious to see how inflation compares for the US, Canada, and Mexico, the three largest countries in North America.

Below I filter the tidy data frame for these countries.

UsCanMex <- inflationDataTidy |> filter(country_name %in% c('Canada', 'Mexico', 'United States'))

head(UsCanMex, n = 10)
## # A tibble: 10 × 4
##    country_name indicator_name                                   year value
##    <fct>        <chr>                                           <dbl> <dbl>
##  1 Canada       Annual average inflation (consumer prices) rate  1980  10.2
##  2 Canada       Annual average inflation (consumer prices) rate  1981  12.5
##  3 Canada       Annual average inflation (consumer prices) rate  1982  10.8
##  4 Canada       Annual average inflation (consumer prices) rate  1983   5.8
##  5 Canada       Annual average inflation (consumer prices) rate  1984   4.3
##  6 Canada       Annual average inflation (consumer prices) rate  1985   4  
##  7 Canada       Annual average inflation (consumer prices) rate  1986   4.2
##  8 Canada       Annual average inflation (consumer prices) rate  1987   4.4
##  9 Canada       Annual average inflation (consumer prices) rate  1988   4  
## 10 Canada       Annual average inflation (consumer prices) rate  1989   5

Below I graph the average annual inflation rates over time.

ggplot(UsCanMex, aes(x = year, y = value, color = country_name)) +
  geom_line() + ggtitle('Average annual inflation rate over time')

While all three countries had higher inflation in the early 1980s than in the 1990s and 2000s, Mexico’s inflation was meaningfully higher than that of the US and Canada. This may be because Mexico’s economy is in a developing stage, whereas the US and Canada are developed economies.
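A visual comparison like this can be backed up with a numeric summary, such as the mean inflation rate per country and decade. The sketch below uses a made-up data frame named `toy` with the same `country_name`, `year`, and `value` columns as the tidy data. The countries and values are hypothetical and do not reproduce the real figures.

```r
library(dplyr)

# Hypothetical tidy data: two countries, two years in each of two decades.
toy <- data.frame(
  country_name = rep(c("Country A", "Country B"), each = 4),
  year = rep(c(1981, 1985, 1992, 1998), times = 2),
  value = c(10, 6, 3, 1, 40, 60, 20, 10)
)

decadeMeans <- toy |>
  mutate(decade = floor(year / 10) * 10) |>   # 1981 -> 1980, 1998 -> 1990
  group_by(country_name, decade) |>
  summarise(mean_inflation = mean(value, na.rm = TRUE), .groups = "drop")

decadeMeans
```

Applying the same pipeline to `UsCanMex` instead of `toy` would give one mean per country per decade, which makes the size of the gap between countries explicit.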

Analysis 2

Below, I repeat the above analysis for Spain, France, Germany, Italy, and Portugal, five of the larger countries in Western Europe.

filterList <- c('Spain', 'France', 'Germany', 'Italy', 'Portugal')

westernEurope <- inflationDataTidy |> filter(country_name %in% filterList)

head(westernEurope, n = 10)
## # A tibble: 10 × 4
##    country_name indicator_name                                   year value
##    <fct>        <chr>                                           <dbl> <dbl>
##  1 France       Annual average inflation (consumer prices) rate  1980  13.1
##  2 France       Annual average inflation (consumer prices) rate  1981  13.3
##  3 France       Annual average inflation (consumer prices) rate  1982  12  
##  4 France       Annual average inflation (consumer prices) rate  1983   9.5
##  5 France       Annual average inflation (consumer prices) rate  1984   7.7
##  6 France       Annual average inflation (consumer prices) rate  1985   5.8
##  7 France       Annual average inflation (consumer prices) rate  1986   2.5
##  8 France       Annual average inflation (consumer prices) rate  1987   3.3
##  9 France       Annual average inflation (consumer prices) rate  1988   2.7
## 10 France       Annual average inflation (consumer prices) rate  1989   6.6

Below I graph the average annual inflation rates over time.

ggplot(westernEurope, aes(x = year, y = value, color = country_name)) +
  geom_line() + ggtitle('Average annual inflation rate over time')

Similar to the North America graph, these countries saw high inflation in the 1980s. However, post-1980, the average annual inflation rates for these countries are higher than those of the US and Canada, even though all of these countries would be considered developed economies.
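Because the two groups of countries are drawn on separate graphs, a cross-region claim is easier to check with a table of regional averages. The sketch below is one possible approach, assuming a made-up data frame `toy` and a hypothetical `regionLookup` vector that maps each country to a region. The cutoff year of 1990 is also an assumption chosen for illustration.

```r
library(dplyr)

# Hypothetical tidy data for three countries.
toy <- data.frame(
  country_name = rep(c("Country A", "Country B", "Country C"), each = 2),
  year = rep(c(1995, 2005), times = 3),
  value = c(2, 3, 4, 2, 1, 3)
)

# Hypothetical mapping from country to region.
regionLookup <- c(
  "Country A" = "Region 1",
  "Country B" = "Region 1",
  "Country C" = "Region 2"
)

regionMeans <- toy |>
  # as.character() matters if country_name is a factor: indexing a named
  # vector with a factor would use its integer codes instead of its labels.
  mutate(region = unname(regionLookup[as.character(country_name)])) |>
  filter(year >= 1990) |>
  group_by(region) |>
  summarise(mean_inflation = mean(value, na.rm = TRUE), .groups = "drop")

regionMeans
```

With the real data, the lookup would map the five European countries to one region and the US and Canada to another, and the two resulting means could be compared directly.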