Introduction

This project analyzes the relationship between interest rates, inflation, and unemployment in the United States from 1970 to 2021.

Loading the Data

Data source: https://www.kaggle.com/datasets/prasertk/inflation-interest-and-unemployment-rate

The dataset covers many countries and regions; we keep only the rows for the United States.

Load packages

library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ ggplot2   3.5.1     ✔ tibble    3.2.1
## ✔ lubridate 1.9.3     ✔ tidyr     1.3.1
## ✔ purrr     1.0.2     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(readr)
library(dplyr)
library(ggplot2)

Import data

inflation_interest_unemployment <- read_csv("Desktop/mydata/inflation interest unemployment.csv")
## Rows: 13832 Columns: 13
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (5): country, iso3c, iso2c, adminregion, incomeLevel
## dbl (8): year, Inflation, consumer prices (annual %), Inflation, GDP deflato...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

View data

head(inflation_interest_unemployment)
## # A tibble: 6 × 13
##   country      year Inflation, consumer prices (annual …¹ Inflation, GDP defla…²
##   <chr>       <dbl>                                 <dbl>                  <dbl>
## 1 Afghanistan  1970                                    NA                     NA
## 2 Afghanistan  1971                                    NA                     NA
## 3 Afghanistan  1972                                    NA                     NA
## 4 Afghanistan  1973                                    NA                     NA
## 5 Afghanistan  1974                                    NA                     NA
## 6 Afghanistan  1975                                    NA                     NA
## # ℹ abbreviated names: ¹​`Inflation, consumer prices (annual %)`,
## #   ²​`Inflation, GDP deflator (annual %)`
## # ℹ 9 more variables: `Real interest rate (%)` <dbl>,
## #   `Deposit interest rate (%)` <dbl>, `Lending interest rate (%)` <dbl>,
## #   `Unemployment, total (% of total labor force) (national estimate)` <dbl>,
## #   `Unemployment, total (% of total labor force) (modeled ILO estimate)` <dbl>,
## #   iso3c <chr>, iso2c <chr>, adminregion <chr>, incomeLevel <chr>

Filter US data

inflation_interest_unemployment_us <- inflation_interest_unemployment[inflation_interest_unemployment$country=="United States",]
print(paste0("The US dataset contains ", prettyNum(nrow(inflation_interest_unemployment_us), big.mark = ","), " rows and ", prettyNum(ncol(inflation_interest_unemployment_us), big.mark = ","), " columns."))
## [1] "The US dataset contains 52 rows and 13 columns."
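The same filtering step can also be written with dplyr, and the long column names can be shortened at the same time so later code is easier to read. The following is a minimal sketch, not the code used above: the `toy` tibble and its values are made up for illustration, and the short names `inflation`, `real_interest`, and `unemployment` are hypothetical choices.

```r
library(dplyr)

# Toy stand-in for the full dataset; the numbers are invented for illustration.
toy <- tibble::tibble(
  country = c("United States", "United States", "Canada"),
  year = c(1970, 1971, 1970),
  `Inflation, GDP deflator (annual %)` = c(5.6, 5.1, 4.7),
  `Real interest rate (%)` = c(2.2, 0.6, 3.1),
  `Unemployment, total (% of total labor force) (national estimate)` = c(4.9, 5.9, 5.7)
)

us <- toy %>%
  # Keep only the US rows.
  filter(country == "United States") %>%
  # Replace the long World Bank style names with short ones (new = old).
  rename(
    inflation = `Inflation, GDP deflator (annual %)`,
    real_interest = `Real interest rate (%)`,
    unemployment = `Unemployment, total (% of total labor force) (national estimate)`
  )

names(us)
```

Unlike `df[df$country == "United States", ]`, `filter()` drops rows where the condition evaluates to `NA` instead of returning all-`NA` rows.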

Linear regression

Fit a linear regression with the unemployment rate (national estimate) as the dependent variable y, and the inflation rate (GDP deflator, x1) and the real interest rate (x2) as the independent variables.

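# Response: unemployment rate (national estimate)
y <- inflation_interest_unemployment_us$`Unemployment, total (% of total labor force) (national estimate)`
# Predictors: inflation (GDP deflator) and real interest rate
x1 <- inflation_interest_unemployment_us$`Inflation, GDP deflator (annual %)`
x2 <- inflation_interest_unemployment_us$`Real interest rate (%)`
# y, x1 and x2 are vectors in the global environment, so lm() finds them there rather than in `data`
model <- lm(y ~ x1 + x2, data = inflation_interest_unemployment_us)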
summary(model)
## 
## Call:
## lm(formula = y ~ x1 + x2, data = inflation_interest_unemployment_us)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.3919 -1.1562 -0.3175  1.0649  3.5789 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  6.00651    0.54544  11.012 9.82e-15 ***
## x1           0.11997    0.09409   1.275    0.208    
## x2          -0.04620    0.09377  -0.493    0.625    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.579 on 48 degrees of freedom
##   (1 observation deleted due to missingness)
## Multiple R-squared:  0.03875,    Adjusted R-squared:  -0.001302 
## F-statistic: 0.9675 on 2 and 48 DF,  p-value: 0.3873
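An alternative way to specify the same kind of model is to name data frame columns directly in the formula, so that `data =` determines what is fitted, and then to pull the estimates and p-values out of the summary as numbers. This is a sketch under stated assumptions: the data frame `us` is synthetic (random numbers, not the Kaggle values), and the column names `inflation`, `real_interest`, and `unemployment` are hypothetical.

```r
# Synthetic data for illustration only; not the values from the dataset above.
set.seed(1)
us <- data.frame(
  inflation = runif(30, 0, 10),
  real_interest = runif(30, -2, 8)
)
us$unemployment <- 6 + 0.1 * us$inflation + rnorm(30)

# The variables are columns of `us`, so lm() looks them up in `data`.
fit <- lm(unemployment ~ inflation + real_interest, data = us)

# Coefficient table as a numeric matrix (estimate, std. error, t value, p-value).
coefs <- coef(summary(fit))
coefs[, c("Estimate", "Pr(>|t|)")]

# Overall fit statistics.
summary(fit)$r.squared
summary(fit)$adj.r.squared
```

Working from `coef(summary(fit))` avoids copying numbers by hand from the printed summary when they are quoted later in the text.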

Scatterplots of y (unemployment) against x1 (inflation) and against x2 (real interest rate)

ggplot(inflation_interest_unemployment_us, aes(x = x1, y = y)) +
  geom_point(color = "blue") +
  labs(title = "Scatterplot of y vs x1", x = "x1", y = "y")
## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_point()`).

ggplot(inflation_interest_unemployment_us, aes(x = x2, y = y)) +
  geom_point(color = "green") +
  labs(title = "Scatterplot of y vs x2", x = "x2", y = "y")
## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_point()`).



Conclusion

The regression model does not perform well. Neither x1 nor x2 is statistically significant (p = 0.208 and p = 0.625), the R-squared value is low (0.039), and the F-statistic (p = 0.387) shows that the model as a whole is not significant. This suggests that inflation and the real interest rate, as measured here, are not good linear predictors of the unemployment rate, or that the relationship between the variables is not linear.
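One way to probe the possibility of a non-linear relationship is to compare a straight-line fit against a fit with a squared term, and to look at the residuals of the linear model. The sketch below is illustrative only and was not run on the US data: the data frame is synthetic and deliberately built with a curve, and the column names are hypothetical.

```r
# Synthetic data with a deliberate curve, so the check has something to detect.
set.seed(42)
us <- data.frame(inflation = runif(40, 0, 10))
us$unemployment <- 5 + 0.08 * (us$inflation - 5)^2 + rnorm(40, sd = 0.5)

linear <- lm(unemployment ~ inflation, data = us)
quadratic <- lm(unemployment ~ inflation + I(inflation^2), data = us)

# F-test for whether the squared term improves on the straight-line fit.
cmp <- anova(linear, quadratic)
cmp

# Residuals against fitted values: a visible curve hints at non-linearity.
plot(fitted(linear), resid(linear), xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)
```

A small p-value in the `anova()` table would favor the model with the squared term; a large one would give no evidence of curvature of that particular form.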