Untitled

Including Plots

Package Setup and Options

suppressWarnings({
  suppressPackageStartupMessages({
    library(tidyverse)
    library(quantmod)
    library(tsbox)
    library(zoo)
  })
})

options(digits = 3, scipen = 99999)
graphics.off()

What it does:

Loads required libraries: tidyverse for data manipulation and visualization, quantmod for downloading economic data, tsbox for time-series manipulation, and zoo for rolling averages.

Suppresses warnings and messages to declutter the output.

Sets options:

digits = 3: Restricts numerical output to 3 decimal places.

scipen = 99999: Prevents scientific notation in numerical output.

Closes any open graphics devices (graphics.off()).

Load Economic Variables

getSymbols(c("PAYEMS", "INDPRO", "HOUST"),
           src = "FRED", return.class = 'xts', 
           from = "2010-01-01", to = Sys.Date())

## [1] "PAYEMS" "INDPRO" "HOUST"

What it does: Downloads three economic variables from the FRED database:

PAYEMS: Nonfarm Payroll Employment (employment data).

INDPRO: Industrial Production Index.

HOUST: Housing Starts.

Data is stored as xts objects for easy time-series manipulation.

Data Preprocessing

employment_ss <- PAYEMS["2010-01-31/2024-09-01"] |> ts_ts()
industrial_ss <- INDPRO["2010-01-31/2024-09-01"] |> ts_ts()
housing_ss <- HOUST["2010-01-31/2024-09-01"] |> ts_ts()

mydata <- cbind.data.frame(employment_ss, industrial_ss, housing_ss)

What it does:

Subsets the data for the range 2010-01-31 to 2024-09-01.

Converts each variable (employment, industrial_production, and housing) to a ts object using ts_ts() for compatibility.

Combines the three variables into a single data frame (mydata) for processing.

Calculate First Differences

mydf <- mydata %>%
  mutate(
    emp_diff = tsibble::difference(employment_ss, differences = 1),
    ind_diff = tsibble::difference(industrial_ss, differences = 1),
    house_diff = tsibble::difference(housing_ss, differences = 1)
  ) %>%
  dplyr::select(emp_diff, ind_diff, house_diff) %>%
  na.omit()

## Registered S3 method overwritten by 'tsibble':
##   method               from 
##   as_tibble.grouped_df dplyr

What it does:

Uses the difference() function from the tsibble package to calculate first differences for each variable:

emp_diff: Employment differences.

ind_diff: Industrial production differences.

house_diff: Housing starts differences.

Retains only the calculated differences and removes rows with NA values using na.omit().

Construct Diffusion Index

mydf_mat <- apply(mydf, 2, sign)
pos <- apply(mydf_mat, 1, function(row) sum(row > 0))
neg <- apply(mydf_mat, 1, function(row) sum(row < 0))
tot <- pos + neg
index <- (pos / tot - neg / tot) * 100
ma_index <- rollmean(index, 7, align = "right", na.pad = TRUE)

`What it does:

Converts mydf to a matrix of signs (-1, 0, 1) using apply() and sign(): 1: Positive differences. -1: Negative differences. 0: No change.

Create Date Sequence and Final Data Frame

Date <- seq.Date(from = as.Date("2010-05-01"), length.out = length(index), by = "month")
diffusion_df <- cbind.data.frame(Date, index, ma_index)

What it does:

Creates a sequence of dates (Date) from 2010-05-01 with a monthly frequency.
Combines Date, raw index, and smoothed ma_index into a data frame (diffusion_df)


## Visualize the Diffusion Index


```r
ggplot(diffusion_df, aes(x = Date, y = index)) +
  geom_line(color = "navyblue", size = 0.8) +
  geom_smooth(color = "red", fill = "lightblue", alpha = 0.3, size = 1.2) +
  geom_text(data = diffusion_df[c(1, nrow(diffusion_df)), ],
            aes(label = paste0(round(index, 2), "%")), 
            vjust = -1, size = 3.5, fontface = "bold", color = "darkred") +
  geom_hline(yintercept = 0, linetype = "dashed", color = "black", size = 0.8) +
  geom_vline(xintercept = as.Date("2020-03-01"), linetype = "dotted", color = "darkgreen", size = 1) +
  annotate("text", x = as.Date("2020-03-01"), y = -90, 
           label = "COVID-19", color = "darkgreen", size = 4, hjust = 0, angle = 90) +
  labs(
    title = "U.S. Economic Diffusion Index Over Time",
    subtitle = "Analyzing economic trends",
    x = "Year",
    y = "Diffusion Index (%)"
  ) +
  scale_y_continuous(limits = c(-120, 120)) +
  theme_minimal(base_size = 14) +
  theme(
    plot.title = element_text(size = 20, face = "bold", hjust = 0.5, color = "darkblue"),
    plot.subtitle = element_text(size = 14, face = "italic", hjust = 0.5, color = "orange"),
    axis.title.x = element_text(size = 14, face = "bold", color = "yellow"),
    axis.title.y = element_text(size = 14, face = "bold", color = "yellow"),
    axis.text = element_text(size = 12, color = "yellow"),
    panel.grid.major = element_line(color = "gray85", size = 0.5),
    panel.grid.minor = element_blank(),
    legend.position = "none",
    plot.background = element_rect(fill = "white", color = "white")
  )

## Warning: Using `size` aesthetic for lines was deprecated in ggplot2 3.4.0.
## ℹ Please use `linewidth` instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.

## Warning: The `size` argument of `element_line()` is deprecated as of ggplot2 3.4.0.
## ℹ Please use the `linewidth` argument instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.

## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'

What it does:

geom_line: Plots the diffusion index (index) as a navy-blue line.

geom_smooth: Adds a smoothed trend line (red) with a light blue shaded area.

geom_text: Labels the start and end points of the index with percentages.

geom_hline: Adds a horizontal dashed line at y = 0.

geom_vline: Adds a vertical dotted line at the COVID-19 event (2020-03-01).

annotate: Annotates the COVID-19 event on the plot.

labs: Adds a title, subtitle, and axis labels.

scale_y_continuous: Adjusts the y-axis limits to [-120, 120].

theme_minimal: Applies a minimal theme and customizes fonts, colors, and spacing.

Subsetting and Preprocessing the Data

Load CFNAIDIFF

getSymbols("CFNAIDIFF", src = "FRED", return.class = 'xts', from = "2010-01-01")

## [1] "CFNAIDIFF"

cfnaidiff_ss <- CFNAIDIFF["2010-05-01/2024-09-01"] |> ts_ts()
min_length <- min(length(ma_index), length(cfnaidiff_ss))
ma_index <- ma_index[1:min_length]
cfnaidiff_ss <- cfnaidiff_ss[1:min_length]
Date <- Date[1:min_length]

What it does:

Extracts data for the specified date range (2010-05-01 to 2024-09-01) and converts it into a ts object using ts_ts() for easier time-series manipulation.

Prepare the Comparison Data Frame

compare_df <- cbind.data.frame(Date, diffusion_index = ma_index, CFNAIDIFF = cfnaidiff_ss)

```What it does:

Combines the aligned Date, ma_index, and cfnaidiff_ss into a single data frame called compare_df. This structure is suitable for further analysis and plotting.

Calculate Correlation

correlation <- cor(compare_df$diffusion_index, compare_df$CFNAIDIFF, use = "complete.obs")

What it does:

Computes the Pearson correlation coefficient between the diffusion_index and CFNAIDIFF columns of compare_df.

The use = “complete.obs” argument ensures that missing values (NA) are ignored in the calculation.

Create an Interactive Plot

plotly::ggplotly(
  ggplot() +
    geom_line(data = compare_df, aes(x = Date, y = diffusion_index, color = "Diffusion Index"), size = 1.2) +
    geom_line(data = compare_df, aes(x = Date, y = CFNAIDIFF * 100, color = "CFNAIDIFF"), size = 1.2, linetype = "dashed") +
    scale_color_manual(
      values = c("Diffusion Index" = "blue", "CFNAIDIFF" = "green")
    ) +
    labs(
      title = "Comparison of Diffusion Index and CFNAIDIFF",
      x = "Year",
      y = "Index Value",
      color = "Legend"
    ) +
    theme_minimal(base_size = 8) +
    theme(
      plot.title = element_text(size = 10, face = "bold", hjust = 0.5),
      legend.position = "bottom"
    )
)






## R Markdown

This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see <http://rmarkdown.rstudio.com>.

When you click the **Knit** button a document will be generated that includes both content as well as the output of any embedded R code chunks within the document. You can embed an R code chunk like this:


```r
summary(cars)

##      speed           dist    
##  Min.   : 4.0   Min.   :  2  
##  1st Qu.:12.0   1st Qu.: 26  
##  Median :15.0   Median : 36  
##  Mean   :15.4   Mean   : 43  
##  3rd Qu.:19.0   3rd Qu.: 56  
##  Max.   :25.0   Max.   :120

Including Plots

You can also embed plots, for example:

Note that the echo = FALSE parameter was added to the code chunk to prevent printing of the R code that generated the plot.

Untitled

2024-12-11

Including Plots

Package Setup and Options

Load Economic Variables

Data Preprocessing

Calculate First Differences

Construct Diffusion Index

Create Date Sequence and Final Data Frame

Subsetting and Preprocessing the Data

Load CFNAIDIFF

Prepare the Comparison Data Frame

Calculate Correlation

Create an Interactive Plot

Including Plots