This project analyzes the relationship between interest rates,
inflation, and unemployment in the US from 1970 to 2021.
Loading the Data
Data source: https://www.kaggle.com/datasets/prasertk/inflation-interest-and-unemployment-rate
The dataset contains many countries/regions; we will keep only the rows
for the US.
Load packages
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.1.5
## ✔ forcats 1.0.0 ✔ stringr 1.5.1
## ✔ ggplot2 3.5.1 ✔ tibble 3.2.1
## ✔ lubridate 1.9.3 ✔ tidyr 1.3.1
## ✔ purrr 1.0.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(readr)
library(dplyr)
library(ggplot2)
Import data
inflation_interest_unemployment <- read_csv("Desktop/mydata/inflation interest unemployment.csv")
## Rows: 13832 Columns: 13
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (5): country, iso3c, iso2c, adminregion, incomeLevel
## dbl (8): year, Inflation, consumer prices (annual %), Inflation, GDP deflato...
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
View data
head(inflation_interest_unemployment)
## # A tibble: 6 × 13
## country year Inflation, consumer prices (annual …¹ Inflation, GDP defla…²
## <chr> <dbl> <dbl> <dbl>
## 1 Afghanistan 1970 NA NA
## 2 Afghanistan 1971 NA NA
## 3 Afghanistan 1972 NA NA
## 4 Afghanistan 1973 NA NA
## 5 Afghanistan 1974 NA NA
## 6 Afghanistan 1975 NA NA
## # ℹ abbreviated names: ¹​`Inflation, consumer prices (annual %)`,
## # ²​`Inflation, GDP deflator (annual %)`
## # ℹ 9 more variables: `Real interest rate (%)` <dbl>,
## # `Deposit interest rate (%)` <dbl>, `Lending interest rate (%)` <dbl>,
## # `Unemployment, total (% of total labor force) (national estimate)` <dbl>,
## # `Unemployment, total (% of total labor force) (modeled ILO estimate)` <dbl>,
## # iso3c <chr>, iso2c <chr>, adminregion <chr>, incomeLevel <chr>
Filter US data
inflation_interest_unemployment_us <- inflation_interest_unemployment[inflation_interest_unemployment$country=="United States",]
print(paste0("The US dataset contains ", prettyNum(nrow(inflation_interest_unemployment_us), big.mark = ","), " rows and ", prettyNum(ncol(inflation_interest_unemployment_us), big.mark = ","), " columns."))
## [1] "The US dataset contains 52 rows and 13 columns."
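The same filtering step can also be written with dplyr, which is already loaded here. The following is a minimal sketch rather than the code used in this report: the `toy` tibble and its values are made up so the block runs on its own, and the short column names (`unemployment`, `inflation`, `real_interest`) are hypothetical names chosen for illustration.

```r
library(dplyr)

# Toy stand-in for the imported tibble. The numbers are invented for
# illustration and are NOT taken from the Kaggle dataset.
toy <- tibble(
  country = c("Afghanistan", "United States", "United States", NA),
  year = c(1970, 1970, 1971, 1970),
  `Real interest rate (%)` = c(NA, 1, 2, 3),
  `Inflation, GDP deflator (annual %)` = c(NA, 4, 5, 6),
  `Unemployment, total (% of total labor force) (national estimate)` = c(NA, 7, 8, 9)
)

# filter() drops rows where the condition evaluates to NA, whereas `[`
# indexing with a logical vector returns an all-NA row for each NA.
us <- toy %>%
  filter(country == "United States") %>%
  rename(
    unemployment = `Unemployment, total (% of total labor force) (national estimate)`,
    inflation = `Inflation, GDP deflator (annual %)`,
    real_interest = `Real interest rate (%)`
  )

nrow(us)
names(us)
```

Short column names like these would also let a model formula refer to columns directly instead of to backtick-quoted names.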
Linear regression
Fit a linear regression with the unemployment rate as the dependent variable and the inflation rate and real interest rate as the independent variables.
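# Dependent variable: unemployment rate (national estimate)
y <- inflation_interest_unemployment_us$`Unemployment, total (% of total labor force) (national estimate)`
# Independent variables: inflation (GDP deflator) and real interest rate
x1 <- inflation_interest_unemployment_us$`Inflation, GDP deflator (annual %)`
x2 <- inflation_interest_unemployment_us$`Real interest rate (%)`
# y, x1, and x2 are standalone vectors, so lm() finds them in the calling environment rather than in `data`
model <- lm(y ~ x1 + x2, data = inflation_interest_unemployment_us)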
summary(model)
##
## Call:
## lm(formula = y ~ x1 + x2, data = inflation_interest_unemployment_us)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.3919 -1.1562 -0.3175 1.0649 3.5789
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 6.00651 0.54544 11.012 9.82e-15 ***
## x1 0.11997 0.09409 1.275 0.208
## x2 -0.04620 0.09377 -0.493 0.625
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.579 on 48 degrees of freedom
## (1 observation deleted due to missingness)
## Multiple R-squared: 0.03875, Adjusted R-squared: -0.001302
## F-statistic: 0.9675 on 2 and 48 DF, p-value: 0.3873
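The quantities printed by `summary()` can also be pulled out as numbers, which helps when they need to be quoted or compared. This is a rough sketch on synthetic data: the data frame `toy` and its column names are assumptions made so the block runs on its own, and only the last line uses figures from the output above.

```r
# Synthetic data only, generated so that the block is self-contained.
set.seed(1)
toy <- data.frame(
  inflation = runif(30, 0, 10),
  real_interest = runif(30, -2, 8)
)
toy$unemployment <- 6 + rnorm(30)

fit <- lm(unemployment ~ inflation + real_interest, data = toy)
s <- summary(fit)

# Coefficient matrix with columns Estimate, Std. Error, t value, Pr(>|t|)
coef_table <- coef(s)
p_values <- coef_table[, "Pr(>|t|)"]

r2 <- s$r.squared
adj_r2 <- s$adj.r.squared

# summary.lm stores the F statistic and its degrees of freedom, not the
# p-value, so the p-value is recomputed from the F distribution.
f <- s$fstatistic
f_p <- pf(f["value"], f["numdf"], f["dendf"], lower.tail = FALSE)

# The same formula applied to the F statistic reported above
# (0.9675 on 2 and 48 DF) gives about 0.387, matching the printed p-value.
reported_p <- pf(0.9675, 2, 48, lower.tail = FALSE)
```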
Scatterplot for y vs x1 and y vs x2
ggplot(inflation_interest_unemployment_us, aes(x = x1, y = y)) +
  geom_point(color = "blue") +
  labs(title = "Scatterplot of y vs x1", x = "x1", y = "y")
## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_point()`).
ggplot(inflation_interest_unemployment_us, aes(x = x2, y = y)) +
  geom_point(color = "green") +
  labs(title = "Scatterplot of y vs x2", x = "x2", y = "y")
## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_point()`).
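A fitted line and descriptive axis labels can make scatterplots like these easier to read. The sketch below is one possible variant, not the plot produced above: the `toy` data frame, its values, and its column names are made up for illustration.

```r
library(ggplot2)

# Invented values, including one missing observation, standing in for the
# US subset. They are NOT taken from the dataset.
toy <- data.frame(
  inflation = c(1, 2, 3, 4, 5, NA),
  unemployment = c(5, 6, 5.5, 7, 6.5, 6)
)

p <- ggplot(toy, aes(x = inflation, y = unemployment)) +
  # na.rm = TRUE drops incomplete rows without the "Removed 1 row" warning
  geom_point(color = "blue", na.rm = TRUE) +
  # Overlay a least-squares line with its confidence band
  geom_smooth(method = "lm", formula = y ~ x, se = TRUE, na.rm = TRUE) +
  labs(
    title = "Unemployment vs. inflation (toy data)",
    x = "Inflation, GDP deflator (annual %)",
    y = "Unemployment rate (%)"
  )

p
```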
Conclusion
The regression model does not perform well. Neither x1 (inflation,
p = 0.208) nor x2 (real interest rate, p = 0.625) is statistically
significant at the 5% level, the R-squared value is low (0.039, with a
slightly negative adjusted R-squared), and the F-statistic shows that
the model as a whole is not significant (p = 0.387). This suggests that
the variables included in this model are not good linear predictors of
the unemployment rate over this period, or that the relationship between
the variables might not be linear.
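One way to probe whether the relationship is nonlinear is to compare the linear fit against a model with quadratic terms using a nested F-test. The sketch below is an assumption about how such a check could look, not part of the analysis above: the data are simulated with a deliberately curved relationship, and the column names are hypothetical.

```r
# Simulated data with a built-in quadratic effect of inflation, so the
# comparison has something to detect. Not the real dataset.
set.seed(42)
toy <- data.frame(
  inflation = runif(50, 0, 10),
  real_interest = runif(50, -2, 8)
)
toy$unemployment <- 5 + 0.1 * (toy$inflation - 5)^2 + rnorm(50, sd = 0.5)

# Straight-line model, as in the report
linear_fit <- lm(unemployment ~ inflation + real_interest, data = toy)

# Same predictors with second-degree polynomial terms added
quad_fit <- lm(
  unemployment ~ poly(inflation, 2) + poly(real_interest, 2),
  data = toy
)

# The linear model is nested in the quadratic one, so anova() gives an
# F-test of whether the extra terms reduce the residual sum of squares.
comparison <- anova(linear_fit, quad_fit)
comparison
```

A small p-value in the second row of `comparison` would indicate that the quadratic terms improve the fit; a large one would point toward the predictors simply being weak rather than the functional form being wrong.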