MEGHA PATIL

BUSINESS ANALYTICS

UNIVERSITY OF NEW HAVEN

POMPEA SCHOOL OF BUSINESS

For the Diffusion Index Analysis in this report I have picked up the 3 economic variables from FRED i.e Housing Starts , Industrial Production & Unemployment Rate and then created a Diffusion Index between these 3 with a built-in smoother.Later have compared the Diffusion Index with the Chicago Fed National Activity Index.

library(dplyr)
## Warning: package 'dplyr' was built under R version 4.3.3
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(ggplot2)
## Warning: package 'ggplot2' was built under R version 4.3.3
library(lubridate)
## Warning: package 'lubridate' was built under R version 4.3.3
## 
## Attaching package: 'lubridate'
## The following objects are masked from 'package:base':
## 
##     date, intersect, setdiff, union
library(readr)
## Warning: package 'readr' was built under R version 4.3.3

Loading Data Sets

houst <- read.csv("C:/Users/Lenovo/Downloads/HOUST.csv")
indpro <- read.csv("C:/Users/Lenovo/Downloads/INDPRO.csv")
unrate <- read.csv("C:/Users/Lenovo/Downloads/UNRATE.csv")
cfnaidiff <- read.csv("C:/Users/Lenovo/Downloads/CFNAIDIFF.csv")

Converting Data Columns to Data Type

houst$DATE <- as.Date(houst$DATE)
indpro$DATE <- as.Date(indpro$DATE)
unrate$DATE <- as.Date(unrate$DATE)
cfnaidiff$DATE <- as.Date(cfnaidiff$DATE)

Filtering of data from 2017-2024

houst <- houst %>% filter(DATE >= as.Date("2017-01-01") & DATE <= as.Date("2024-12-31"))
indpro <- indpro %>% filter(DATE >= as.Date("2017-01-01") & DATE <= as.Date("2024-12-31"))
unrate <- unrate %>% filter(DATE >= as.Date("2017-01-01") & DATE <= as.Date("2024-12-31"))
cfnaidiff <- cfnaidiff %>% filter(DATE >= as.Date("2017-01-01") & DATE <= as.Date("2024-12-31"))

Calculating montly changes for Diffusion Index

houst_diff <- houst %>% mutate(Change = HOUST - lag(HOUST))
indpro_diff <- indpro %>% mutate(Change = INDPRO - lag(INDPRO))
unrate_diff <- unrate %>% mutate(Change = UNRATE - lag(UNRATE))

Calculating Diffusion Index

diffusion_index <- data.frame(
  DATE = houst$DATE,
  Diffusion = rowMeans(cbind(
    houst_diff$Change > 0,
    indpro_diff$Change > 0,
    unrate_diff$Change < 0
  ), na.rm = TRUE) - 0.5
)

Plotting of Diffusion Index

ggplot(diffusion_index, aes(x = DATE, y = Diffusion)) +
  geom_line(color = "blue") +
  geom_smooth(method = "loess", se = FALSE) +
  geom_hline(yintercept = 0, linetype = "dashed") +
  labs(title = "Diffusion Index (2017-2024)",
       y = "Diffusion Index", x = "Year") +
  theme_classic()
## `geom_smooth()` using formula = 'y ~ x'
## Warning: Removed 1 row containing non-finite outside the scale range
## (`stat_smooth()`).
## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_line()`).

Calculating the correlation coefficient

correlation <- cor(diffusion_index_filtered$Diffusion, cfnaidiff_filtered$CFNAIDIFF, use = "complete.obs")
print(paste("Correlation Coefficient:", round(correlation, 3)))
## [1] "Correlation Coefficient: 0.373"

A Correlation coefficent of 0.37 indicates a weak to moderate positive linear relationship between the variables.

As the Diffusion Index increases , the CFNAIDIFF also tends to increase , but not very perfectly.

ggplot() +
  geom_line(data = diffusion_index_filtered, aes(x = DATE, y = Diffusion, color = "Diffusion Index")) +
  geom_line(data = cfnaidiff_filtered, aes(x = DATE, y = CFNAIDIFF, color = "CFNAIDIFF")) +
  labs(title = "Comparison of Diffusion Index and CFNAIDIFF",
       x = "Year", y = "Index Value", color = "Series") +
  geom_hline(yintercept = 0, linetype = "dashed") +
  theme_classic()
## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_line()`).

From the above graph it indicates that we have got 2 lines in which the blue line indicates Diffusion Index and red line indicates CFNAIDIFF Index .The blue line measure how many parts of the economy are imporving. it moves up & down a lot , meaning that at times , many areas of the economy are doing well.The red line measures the overall strength of the economy.

The graph shows that both the Diffusion Index and CFNAIDIFF are close to zero for the current year. This means the economy is neither growing strongly nor shrinking a lot it’s kind of stable or neutral right now. Things don’t look too bad, but there’s no strong improvement either.