Introduction

Understanding the U.S. economy requires examining both broad national trends and more specific localized factors. This study constructs a diffusion index from data on employment, industrial production, and housing starts. The index is then compared with the Chicago Fed National Activity Diffusion Index (CFNAIDIFF) to identify periods of economic expansion and contraction over time.

Data Selection and Diffusion Index Development

The diffusion index indicates economic expansion (positive values) or contraction (negative values). It is derived from three key variables retrieved from FRED: employment (PAYEMS), industrial production (INDPRO), and housing starts (HOUST).
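
As a rough illustration of the idea, the sketch below builds a diffusion index from three made-up series using only base R. The toy values and the helper name `toy_diffusion_index` are hypothetical and are not the FRED data used in this study. The index is taken here as the share of series that rose minus the share that fell in each period, multiplied by 100.

```r
# Hypothetical toy levels for three indicators (not real FRED data)
toy_levels <- data.frame(
  employment = c(100, 101, 102, 101, 103),
  production = c(50, 51, 50, 49, 50),
  housing    = c(10, 9, 10, 9, 10)
)

# Illustrative helper (an assumption, not part of the original analysis):
# share of series rising minus share of series falling, in percent
toy_diffusion_index <- function(levels) {
  # Period-to-period change of each column, reduced to +1, 0, or -1
  changes <- sign(apply(as.matrix(levels), 2, diff))
  up <- rowSums(changes > 0)
  down <- rowSums(changes < 0)
  # Yields NaN for a period in which no series changed at all
  100 * (up - down) / (up + down)
}

toy_index <- toy_diffusion_index(toy_levels)
print(toy_index)
```

With three inputs the index can only take a handful of distinct values (for example, all three rising gives 100 and all three falling gives -100), which is one reason a moving average is useful before comparing it with a smoother benchmark.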

# Suppress warnings and startup messages from loaded packages
suppressWarnings({
  suppressPackageStartupMessages({
    library(dplyr)
    library(quantmod)
    library(tsbox)
    library(ggplot2)
    library(zoo)
  })
})

# Adjust options for better numerical output
options(digits = 3, scipen = 99999)
graphics.off()
# Retrieve economic data for analysis
symbols <- c("PAYEMS", "INDPRO", "HOUST") # Define symbols
getSymbols(Symbols = symbols,
           src = "FRED", auto.assign = TRUE, 
           from = "2010-01-01", to = Sys.Date(), env = globalenv())
## [1] "PAYEMS" "INDPRO" "HOUST"

Analysis of the Custom Diffusion Index vs. CFNAIDIFF

-  The correlation between the custom Diffusion Index and the Chicago Fed National Activity Diffusion Index (CFNAIDIFF) provides insights into how closely the two indices align.

-  A high positive correlation (approaching 1) indicates that the custom index effectively mirrors the economic patterns captured by CFNAIDIFF.

-  For instance, a correlation of 0.8 or higher would suggest that the custom index is a reasonable proxy for CFNAIDIFF, reflecting comparable economic trends over the analyzed timeframe.
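
To make the interpretation above concrete, the following sketch uses simulated series (hypothetical, seeded random data, not the indices from this study) to show how the Pearson correlation behaves. A noisy copy of a benchmark correlates strongly with it, an unrelated series does not, and rescaling one series, such as multiplying by 100 for plotting, leaves the correlation unchanged.

```r
set.seed(42)  # fixed seed so the simulated example is reproducible

n <- 200
benchmark   <- rnorm(n)                         # hypothetical benchmark index
close_proxy <- benchmark + rnorm(n, sd = 0.3)   # benchmark plus modest noise
unrelated   <- rnorm(n)                         # independent of the benchmark

cor_close     <- cor(benchmark, close_proxy)
cor_unrelated <- cor(benchmark, unrelated)

# Pearson correlation is unaffected by multiplying a series by a positive constant
cor_rescaled <- cor(benchmark, close_proxy * 100)

print(c(close = cor_close, unrelated = cor_unrelated, rescaled = cor_rescaled))
```
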
# Data preprocessing
employment_data <- PAYEMS
industrial_data <- INDPRO
housing_data <- HOUST

# Subset data to the specified date range and convert to time series
employment_subset <- employment_data["2010-01-31/2024-09-01"] |> ts_ts()
industrial_subset <- industrial_data["2010-01-31/2024-09-01"] |> ts_ts()
housing_subset <- housing_data["2010-01-31/2024-09-01"] |> ts_ts()

# Combine the subsets into a single data frame
combined_data <- cbind.data.frame(employment_subset, industrial_subset, housing_subset)

# Compute first differences and clean data
mydf <- combined_data %>%
  mutate(
    emp_diff = tsibble::difference(employment_subset, differences = 1),
    ind_diff = tsibble::difference(industrial_subset, differences = 1),
    house_diff = tsibble::difference(housing_subset, differences = 1)
  ) %>%
  dplyr::select(emp_diff, ind_diff, house_diff) %>%
  na.omit()
## Registered S3 method overwritten by 'tsibble':
##   method               from 
##   as_tibble.grouped_df dplyr

Observations from the First Plot:

# Constructing the Diffusion Index
mydf_sign <- apply(mydf, 2, sign)  # Assign sign to each value in the matrix
positive_counts <- apply(mydf_sign, 1, function(row) sum(row > 0))
negative_counts <- apply(mydf_sign, 1, function(row) sum(row < 0))
total_counts <- positive_counts + negative_counts
diffusion_index <- (positive_counts / total_counts - negative_counts / total_counts) * 100
smoothed_index <- rollmean(diffusion_index, k = 7, align = "right", fill = NA)  # 7-month trailing mean; fill = NA replaces the deprecated na.pad

# Create a data frame with the diffusion index and moving average
dates <- seq.Date(from = as.Date("2010-05-01"), length.out = length(diffusion_index), by = "month")
diffusion_data <- cbind.data.frame(dates, diffusion_index, smoothed_index)

# Plot the Diffusion Index
ggplot(diffusion_data, aes(x = dates, y = diffusion_index)) +
  # Main line representing the index
  geom_line(color = "blue", linewidth = 0.8) +
  # Smoothed trend overlay with transparency
  geom_smooth(color = "red", fill = "skyblue", alpha = 0.3, linewidth = 1.2) +
  # Annotate the first and last observations with text labels
  geom_text(data = diffusion_data[c(1, nrow(diffusion_data)), ],
            aes(label = paste0(round(diffusion_index, 2), "%")),
            vjust = -1, size = 3.5, fontface = "bold", color = "darkred") +
  # Add a horizontal reference line at y = 0
  geom_hline(yintercept = 0, linetype = "dashed", color = "black", linewidth = 0.8) +
  # Highlight a significant event with a vertical line and annotation
  geom_vline(xintercept = as.Date("2020-03-01"), linetype = "dotted", color = "darkgreen", linewidth = 1) +
  annotate("text", x = as.Date("2020-03-01"), y = -90,
           label = "COVID-19", color = "darkgreen", size = 4, hjust = 0, angle = 90) +
  # Add title, subtitle, and axis labels
  labs(
    title = "Economic Diffusion Index Trends in the U.S.",
    subtitle = "Exploring economic performance over time",
    x = "Date",
    y = "Diffusion Index (%)"
  ) +
  # Customize y-axis limits for better view
  scale_y_continuous(limits = c(-120, 120)) +
  # Apply minimal styling and tweak appearance
  theme_minimal(base_size = 14) +
  theme(
    plot.title = element_text(size = 20, face = "bold", hjust = 0.5, color = "navy"),
    plot.subtitle = element_text(size = 14, face = "italic", hjust = 0.5, color = "darkblue"),
    axis.title.x = element_text(size = 14, face = "bold"),
    axis.title.y = element_text(size = 14, face = "bold"),
    axis.text = element_text(size = 12),
    panel.grid.major = element_line(color = "gray85", linewidth = 0.5),
    panel.grid.minor = element_blank(),
    legend.position = "none",
    plot.background = element_rect(fill = "white", color = "white")
  )
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'

Long-Term Trends in the Diffusion Index

# Load the CFNAIDIFF data
getSymbols("CFNAIDIFF", src = "FRED", return.class = "xts", from = "2010-01-01")
## [1] "CFNAIDIFF"
# Subset and align CFNAIDIFF data with the custom Diffusion Index
cfnaidiff_subset <- CFNAIDIFF["2010-05-01/2024-09-01"] |> ts_ts()
max_length <- min(length(smoothed_index), length(cfnaidiff_subset))
smoothed_index <- smoothed_index[1:max_length]
cfnaidiff_subset <- cfnaidiff_subset[1:max_length]
aligned_dates <- dates[1:max_length]

# Create a data frame for comparison
comparison_data <- cbind.data.frame(
  Date = aligned_dates, 
  Diffusion_Index = smoothed_index, 
  CFNAIDIFF = cfnaidiff_subset
)

# Calculate the correlation between the two indices
correlation_value <- cor(
  comparison_data$Diffusion_Index, 
  comparison_data$CFNAIDIFF, 
  use = "complete.obs"
)

# Generate an interactive comparison plot using ggplot and plotly
plotly::ggplotly(
  ggplot(data = comparison_data) +
    # Line for the custom Diffusion Index
    geom_line(aes(x = Date, y = Diffusion_Index, color = "Diffusion Index"), linewidth = 1.2) +
    # Line for CFNAIDIFF with scaling for comparison
    geom_line(aes(x = Date, y = CFNAIDIFF * 100, color = "CFNAIDIFF"), linewidth = 1.2, linetype = "dashed") +
    # Define custom colors for the lines
    scale_color_manual(
      values = c("Diffusion Index" = "blue", "CFNAIDIFF" = "darkred")
    ) +
    # Add plot labels and title
    labs(
      title = "Comparison of Custom Diffusion Index and CFNAIDIFF",
      x = "Date",
      y = "Index Value (%)",
      color = "Indices"
    ) +
    # Apply minimal theme styling
    theme_minimal(base_size = 10) +
    theme(
      plot.title = element_text(size = 14, face = "bold", hjust = 0.5),
      legend.position = "bottom"
    )
)

Key Economic Observations

Recent Economic Trends (Post-September 2024)

Conclusion

The Diffusion Index captures meaningful economic variations over time, with recent decreases reflecting challenges in employment, industrial output, and housing starts. CFNAIDIFF, by contrast, is more stable at the national level and shows less volatility than the custom index. The weak correlation of 0.157 highlights their differing focuses: the Diffusion Index emphasizes localized or sector-specific changes, whereas CFNAIDIFF reflects broader national trends. This contrast suggests that while the Diffusion Index may signal regional or industry-specific vulnerabilities, CFNAIDIFF portrays a steady national economy with no widespread contraction.
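
The correlation quoted above is computed on a 7-month right-aligned moving average with `use = "complete.obs"`. As a minimal sketch of what that combination does, the base R example below uses hypothetical values (not the study's data) and `stats::filter` as a stand-in for `zoo::rollmean`. The trailing average leaves the first six observations missing, and `complete.obs` drops those rows before the correlation is calculated.

```r
# Hypothetical index values and comparison series (not the study's data)
x <- c(100, 100, -100, 100, 33, -33, 100, 100, -100, 33)
y <- seq_along(x)

# Trailing (right-aligned) 7-period mean using base R;
# the first 6 positions have no full window and are returned as NA
trailing_mean <- as.numeric(stats::filter(x, rep(1 / 7, 7), sides = 1))
n_missing <- sum(is.na(trailing_mean))

# "complete.obs" removes rows containing NA before computing the correlation,
# so the result matches a manual calculation on rows 7 to 10 only
r_complete <- cor(trailing_mean, y, use = "complete.obs")
r_manual   <- cor(trailing_mean[7:10], y[7:10])

print(c(n_missing = n_missing, r_complete = r_complete, r_manual = r_manual))
```
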