This is an exercise to analyse the performance of the Phillips Curve. The task is to download data from the IMF World Economic Outlook Database. You can find that here:
World Economic Outlook Database
You must download the data and save it as a csv file. This is a difficult file to manage because, despite the name, it is actually a tab-separated file, and because each series is laid out along a row rather than down a column as usual. Managing and sorting out data is a key part of this exercise. It is not unusual to find that this takes more time than the analysis.
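As a minimal sketch of why the file extension is misleading, the toy file below (made up for illustration, with a hypothetical column layout rather than the real WEO one) is tab-separated but saved with a .csv name. Reading it with read.csv() leaves everything in one column, while read.delim() splits on the tabs.

```r
# Toy stand-in for the download: tab-separated text saved with a .csv name
tmp <- tempfile(fileext = ".csv")
writeLines(c("Country\tSubject\t1980\t1981",
             "United States\tInflation\t13.502\t10.378",
             "United States\tUnemployment\t7.175\t7.617"), tmp)

wrong <- read.csv(tmp)    # splits on commas, so each line stays in a single column
right <- read.delim(tmp)  # splits on tabs

ncol(wrong)   # 1
ncol(right)   # 4
names(right)  # year headers gain an "X" prefix: a column name cannot start with a digit
```

This is also why the year columns in the real file appear as X1980, X1981 and so on.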
If you have set up a Project and have the data file in the same directory or folder as your R code, you will not need the ../../Data/ prefix in front of the file name. You may not need fileEncoding = 'UTF-16LE' or blank.lines.skip = TRUE. Try it with and without. If in doubt, please post a question to the discussion board.
da <- read.delim('../../Data/WEO_Data.csv', fileEncoding = 'UTF-16LE',
blank.lines.skip = TRUE)
str(da)
## 'data.frame': 4 obs. of 49 variables:
## $ Country : Factor w/ 3 levels "","International Monetary Fund, World Economic Outlook Database, April 2023",..: 3 3 1 2
## $ Subject.Descriptor : Factor w/ 3 levels "","Inflation, average consumer prices",..: 2 3 1 1
## $ Units : Factor w/ 3 levels "","Percent change",..: 2 3 1 1
## $ Scale : logi NA NA NA NA
## $ Country.Series.specific.Notes: Factor w/ 3 levels "","See notes for: Inflation, average consumer prices (Index).",..: 2 3 1 1
## $ X1980 : num 13.5 7.17 NA NA
## $ X1981 : num 10.38 7.62 NA NA
## $ X1982 : num 6.16 9.71 NA NA
## $ X1983 : num 3.16 9.6 NA NA
## $ X1984 : num 4.37 7.51 NA NA
## $ X1985 : num 3.53 7.19 NA NA
## $ X1986 : num 1.94 7 NA NA
## $ X1987 : num 3.58 6.17 NA NA
## $ X1988 : num 4.1 5.49 NA NA
## $ X1989 : num 4.79 5.26 NA NA
## $ X1990 : num 5.42 5.62 NA NA
## $ X1991 : num 4.22 6.85 NA NA
## $ X1992 : num 3.04 7.49 NA NA
## $ X1993 : num 2.97 6.91 NA NA
## $ X1994 : num 2.6 6.1 NA NA
## $ X1995 : num 2.81 5.59 NA NA
## $ X1996 : num 2.94 5.41 NA NA
## $ X1997 : num 2.34 4.94 NA NA
## $ X1998 : num 1.55 4.5 NA NA
## $ X1999 : num 2.19 4.22 NA NA
## $ X2000 : num 3.37 3.97 NA NA
## $ X2001 : num 2.82 4.74 NA NA
## $ X2002 : num 1.6 5.78 NA NA
## $ X2003 : num 2.3 5.99 NA NA
## $ X2004 : num 2.67 5.54 NA NA
## $ X2005 : num 3.37 5.08 NA NA
## $ X2006 : num 3.22 4.61 NA NA
## $ X2007 : num 2.87 4.62 NA NA
## $ X2008 : num 3.81 5.8 NA NA
## $ X2009 : num -0.32 9.28 NA NA
## $ X2010 : num 1.64 9.61 NA NA
## $ X2011 : num 3.14 8.93 NA NA
## $ X2012 : num 2.07 8.07 NA NA
## $ X2013 : num 1.47 7.36 NA NA
## $ X2014 : num 1.61 6.16 NA NA
## $ X2015 : num 0.121 5.275 NA NA
## $ X2016 : num 1.27 4.88 NA NA
## $ X2017 : num 2.13 4.36 NA NA
## $ X2018 : num 2.44 3.89 NA NA
## $ X2019 : num 1.81 3.68 NA NA
## $ X2020 : num 1.25 8.09 NA NA
## $ X2021 : num 4.68 5.37 NA NA
## $ X2022 : num 7.99 3.64 NA NA
## $ Estimates.Start.After : int 2022 2022 NA NA
head(da)
## Country
## 1 United States
## 2 United States
## 3
## 4 International Monetary Fund, World Economic Outlook Database, April 2023
## Subject.Descriptor Units Scale
## 1 Inflation, average consumer prices Percent change NA
## 2 Unemployment rate Percent of total labor force NA
## 3 NA
## 4 NA
## Country.Series.specific.Notes
## 1 See notes for: Inflation, average consumer prices (Index).
## 2 Source: National Statistics Office Latest actual data: 2022 Employment type: National definition Primary domestic currency: US dollar Data last updated: 03/2023
## 3
## 4
## X1980 X1981 X1982 X1983 X1984 X1985 X1986 X1987 X1988 X1989 X1990 X1991
## 1 13.502 10.378 6.158 3.16 4.368 3.528 1.944 3.578 4.100 4.791 5.419 4.216
## 2 7.175 7.617 9.708 9.60 7.508 7.192 7.000 6.175 5.492 5.258 5.617 6.850
## 3 NA NA NA NA NA NA NA NA NA NA NA NA
## 4 NA NA NA NA NA NA NA NA NA NA NA NA
## X1992 X1993 X1994 X1995 X1996 X1997 X1998 X1999 X2000 X2001 X2002 X2003 X2004
## 1 3.042 2.970 2.596 2.805 2.937 2.338 1.547 2.193 3.367 2.817 1.596 2.298 2.668
## 2 7.492 6.908 6.100 5.592 5.408 4.942 4.500 4.217 3.967 4.742 5.783 5.992 5.542
## 3 NA NA NA NA NA NA NA NA NA NA NA NA NA
## 4 NA NA NA NA NA NA NA NA NA NA NA NA NA
## X2005 X2006 X2007 X2008 X2009 X2010 X2011 X2012 X2013 X2014 X2015 X2016
## 1 3.366 3.222 2.871 3.815 -0.320 1.637 3.140 2.073 1.466 1.615 0.121 1.267
## 2 5.083 4.608 4.617 5.800 9.283 9.608 8.933 8.075 7.358 6.158 5.275 4.875
## 3 NA NA NA NA NA NA NA NA NA NA NA NA
## 4 NA NA NA NA NA NA NA NA NA NA NA NA
## X2017 X2018 X2019 X2020 X2021 X2022 Estimates.Start.After
## 1 2.131 2.439 1.813 1.251 4.683 7.986 2022
## 2 4.358 3.892 3.683 8.092 5.367 3.642 2022
## 3 NA NA NA NA NA NA NA
## 4 NA NA NA NA NA NA NA
THIS IS A MESS
We create a dataframe from this mess and check the results. Make sure that you understand all the steps.
head(t(da[1:2,7:length(da)-1]))
## 1 2
## X1980 13.502 7.175
## X1981 10.378 7.617
## X1982 6.158 9.708
## X1983 3.160 9.600
## X1984 4.368 7.508
## X1985 3.528 7.192
tail(t(da[1:2,7:length(da)-1]))
## 1 2
## X2017 2.131 4.358
## X2018 2.439 3.892
## X2019 1.813 3.683
## X2020 1.251 8.092
## X2021 4.683 5.367
## X2022 7.986 3.642
str(t(da[1:2,7:length(da)-1]))
## num [1:43, 1:2] 13.5 10.38 6.16 3.16 4.37 ...
## - attr(*, "dimnames")=List of 2
## ..$ : chr [1:43] "X1980" "X1981" "X1982" "X1983" ...
## ..$ : chr [1:2] "1" "2"
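The column selection above depends on R's operator precedence: the colon binds more tightly than the minus sign, so the whole sequence is built first and then shifted down by one. The sketch below shows the same pattern on a small made-up data frame (the values are the first three years from the output above; the Series and Note columns are hypothetical).

```r
# Toy wide table: one row per series, one column per year
wide <- data.frame(Series = c("Inflation", "Unemployment"),
                   X1980  = c(13.502, 7.175),
                   X1981  = c(10.378, 7.617),
                   X1982  = c(6.158, 9.708),
                   Note   = c("a", "b"))

cols <- 3:length(wide) - 1   # evaluated as (3:5) - 1, not 3:(5 - 1)
cols                         # 2 3 4, i.e. the three year columns

m <- t(wide[1:2, cols])      # t() turns the data frame into a matrix, years down the rows
m
```

Writing the selection as 2:(length(wide) - 1) would pick the same columns and may be easier to read.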
Unemployment = t(unname(da[2, c(7:length(da)-1)]))
head(Unemployment)
## 2
## [1,] 7.175
## [2,] 7.617
## [3,] 9.708
## [4,] 9.600
## [5,] 7.508
## [6,] 7.192
Inflation = t(unname(da[1, c(7:length(da)-1)]))
dat <- data.frame("Unemployment" = Unemployment,
"Inflation" = Inflation, row.names = 1980:2022)
colnames(dat) <- c("Unemployment", "Inflation")
head(dat)
## Unemployment Inflation
## 1980 7.175 13.502
## 1981 7.617 10.378
## 1982 9.708 6.158
## 1983 9.600 3.160
## 1984 7.508 4.368
## 1985 7.192 3.528
tail(dat)
## Unemployment Inflation
## 2017 4.358 2.131
## 2018 3.892 2.439
## 2019 3.683 1.813
## 2020 8.092 1.251
## 2021 5.367 4.683
## 2022 3.642 7.986
str(dat)
## 'data.frame': 43 obs. of 2 variables:
## $ Unemployment: num 7.17 7.62 9.71 9.6 7.51 ...
## $ Inflation : num 13.5 10.38 6.16 3.16 4.37 ...
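One possible variation, sketched here on a three-year toy table rather than the real file, is to read the years from the column names instead of typing 1980:2022 by hand. The names wide and toy are made up for this example.

```r
# Toy wide table: row 1 is inflation, row 2 is unemployment (first three years only)
wide <- data.frame(X1980 = c(13.502, 7.175),
                   X1981 = c(10.378, 7.617),
                   X1982 = c(6.158, 9.708))

# Strip the "X" prefix that R added and convert to integer years
years <- as.integer(sub("^X", "", names(wide)))

# unlist() flattens a one-row data frame into a plain numeric vector
toy <- data.frame(Year         = years,
                  Unemployment = unlist(wide[2, ], use.names = FALSE),
                  Inflation    = unlist(wide[1, ], use.names = FALSE))
toy
```

Keeping the year as an ordinary column, rather than as row names, can make later plotting and subsetting a little more direct.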
Plot the two time series.
plot(1980:2022, dat$Unemployment, type = 'l', ylim = c(0, 15),
main = "US Unemployment and inflation", xlab = "Year",
ylab = "Percentage")
lines(1980:2022, dat$Inflation, col = 'red', lty = 2)
legend('topright', inset = 0.08, legend = c("Unemployment",
"Inflation"), col = c('black', 'red'), lty = c(1, 2))
The two series appear to mirror each other, as we would expect from the Phillips curve. However, the relationship is less apparent in a scatter plot.
plot(dat$Unemployment, dat$Inflation,
main = "US unemployment and inflation",
xlab = "Unemployment", ylab = "Inflation")
lm() is the function for linear regression. It returns a regression object that holds the results.
eq1 <- lm(dat$Inflation ~ dat$Unemployment)
summary(eq1)
##
## Call:
## lm(formula = dat$Inflation ~ dat$Unemployment)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.9356 -1.3533 -0.4428 0.3901 10.0842
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.74476 1.45162 1.891 0.0657 .
## dat$Unemployment 0.09381 0.22728 0.413 0.6819
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.489 on 41 degrees of freedom
## Multiple R-squared: 0.004138, Adjusted R-squared: -0.02015
## F-statistic: 0.1704 on 1 and 41 DF, p-value: 0.6819
You can see the results using the summary() function. The R-squared is very low (about 0.004), and the estimated slope (0.094) is positive rather than negative and is not statistically significant (p = 0.68). However, if you look at the original time series chart you can see that the sample starts at the end of the high-inflation 1970s and ends at the beginning of the recent inflation phase. We can look at the results when these periods are removed, dropping 1980 to 1984 and 2020 to 2022.
dat2 <- dat[9:length(dat$Unemployment) -3, ]
head(dat2)
## Unemployment Inflation
## 1985 7.192 3.528
## 1986 7.000 1.944
## 1987 6.175 3.578
## 1988 5.492 4.100
## 1989 5.258 4.791
## 1990 5.617 5.419
tail(dat2)
## Unemployment Inflation
## 2014 6.158 1.615
## 2015 5.275 0.121
## 2016 4.875 1.267
## 2017 4.358 2.131
## 2018 3.892 2.439
## 2019 3.683 1.813
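The row selection above works by position and again relies on the colon being evaluated before the subtraction. As a sketch of an alternative, the rows can be picked by year through the row names. The data frame toy below is a made-up stand-in with the same 1980 to 2022 row names.

```r
# Stand-in data frame with one row per year, 1980 to 2022
toy <- data.frame(x = 1:43, row.names = 1980:2022)

# By position: (9:43) - 3 gives rows 6 to 40
by_position <- toy[9:nrow(toy) - 3, , drop = FALSE]

# By year: row names are character strings, so convert the years first
by_year <- toy[as.character(1985:2019), , drop = FALSE]

range(as.integer(rownames(by_position)))             # 1985 2019
identical(rownames(by_position), rownames(by_year))  # TRUE
nrow(by_year)                                        # 35
```

Selecting by year states the intended sample directly and does not need adjusting if the first year of the data changes.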
Now run the regression again on the trimmed sample.
eq2 <- lm(dat2$Inflation ~ dat2$Unemployment)
summary(eq2)
##
## Call:
## lm(formula = dat2$Inflation ~ dat2$Unemployment)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.57084 -0.65736 0.01069 0.64290 2.77132
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3.3730 0.8133 4.147 0.000221 ***
## dat2$Unemployment -0.1291 0.1337 -0.965 0.341326
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.177 on 33 degrees of freedom
## Multiple R-squared: 0.02747, Adjusted R-squared: -0.001999
## F-statistic: 0.9322 on 1 and 33 DF, p-value: 0.3413
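The headline numbers in a summary table (slope, p-value, R-squared) can also be pulled out as plain values, which helps when comparing several regressions. The sketch below uses only the six rows printed by head(dat) earlier, so its numbers describe 1980 to 1985 and not the full sample. It also passes the data frame through the data argument of lm(), which is a stylistic variation on the dat$ form used in this exercise.

```r
# First six rows of dat as printed by head(dat) (1980 to 1985 only)
toy <- data.frame(
  Unemployment = c(7.175, 7.617, 9.708, 9.600, 7.508, 7.192),
  Inflation    = c(13.502, 10.378, 6.158, 3.160, 4.368, 3.528),
  row.names    = 1980:1985
)

fit <- lm(Inflation ~ Unemployment, data = toy)
s   <- summary(fit)

coefs   <- coef(s)   # matrix with columns Estimate, Std. Error, t value, Pr(>|t|)
slope   <- coefs["Unemployment", "Estimate"]
p_value <- coefs["Unemployment", "Pr(>|t|)"]
r2      <- s$r.squared

c(slope = slope, p_value = p_value, r2 = r2)
```

With a single explanatory variable, the R-squared equals the squared correlation between the two series, which can be checked with cor(toy$Unemployment, toy$Inflation)^2.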
It is possible to extract elements of the regression results. To see the options use the names() function.
names(eq1)
## [1] "coefficients" "residuals" "effects" "rank"
## [5] "fitted.values" "assign" "qr" "df.residual"
## [9] "xlevels" "call" "terms" "model"
You can see that one of these is residuals. We can plot the residuals to see whether they look random, as they should if nothing important is missing from the model.
plot(eq2$residuals)
What is notable here is the downward trend in the residuals. This can be estimated using a time trend (as we did with the GDP data in Excel).
eq3 <- lm(eq2$residuals ~ I(1:35))
summary(eq3)
##
## Call:
## lm(formula = eq2$residuals ~ I(1:35))
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.01051 -0.41931 -0.03432 0.45418 1.94196
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.24403 0.32197 3.864 0.000494 ***
## I(1:35) -0.06911 0.01560 -4.430 9.77e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.9321 on 33 degrees of freedom
## Multiple R-squared: 0.373, Adjusted R-squared: 0.354
## F-statistic: 19.63 on 1 and 33 DF, p-value: 9.771e-05
We can add this to the residual plot.
plot(1985:2019, eq2$residuals, main = "Residuals and trend",
xlab = "Year", ylab = "Residuals")
lines(1985:2019, eq3$fitted.values, col = 'red')
legend('topright', inset = 0.08, legend =
"Downward trend on inflation", lty = 1, col = 'red')
There is something additional putting downward pressure on inflation (other than unemployment). We might add other variables to the model to try to account for this. For example, some suggest that the balance between labour and capital has tipped more decisively towards capital, which makes it much harder for workers to take advantage of tight labour markets. This might be measured by the level of union membership or the level of industry concentration.
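Before looking for new variables, one simple variation is to put the time trend into the Phillips curve regression itself rather than fitting it to the residuals afterwards. The sketch below uses simulated data only, so the numbers say nothing about the US economy, and all names and parameter values in it are made up for illustration.

```r
set.seed(1)   # simulated data, for illustration only
n     <- 35
trend <- 1:n
unemp <- runif(n, min = 3.5, max = 10)
infl  <- 3 - 0.1 * unemp - 0.07 * trend + rnorm(n, sd = 0.5)
sim   <- data.frame(infl, unemp, trend)

# Two-step approach, as in the exercise: regress on unemployment,
# then regress the residuals on the trend
sim$res  <- residuals(lm(infl ~ unemp, data = sim))
two_step <- lm(res ~ trend, data = sim)

# Joint approach: unemployment and the trend in one regression
joint <- lm(infl ~ unemp + trend, data = sim)

coef(two_step)["trend"]
coef(joint)["trend"]
```

The two trend estimates will generally differ. When unemployment is itself correlated with the trend, the two-step estimate is pulled towards zero, so the joint regression is usually the safer way to measure the trend.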