This project explores the calculation of a Leading Economic Index (LEI). The LEI is a composite index designed to forecast future economic activity. It is composed of several individual economic indicators that have historically shown consistent patterns of leading the overall economy. The LEI is used by economists, policymakers, and business leaders to anticipate economic trends and potential turning points in economic activity. We will loosely follow the Conference Board's methodology to calculate our index.
Sections 2.0 - 4.0 cover data collection and transformation. Sections 5.0 and 6.0 analyze the index, and section 7.0 collects the references used in this project.
Conference Board Leading Index Components (FRED ID)
BCI-01 Average weekly hours, manufacturing (AWHMAN)
BCI-05 Average weekly initial claims for unemployment insurance (ICSA)
BCI-08 Manufacturers’ new orders, consumer goods and materials (ACOGNO)
BCI-130 ISM new order index (NOCDFSA066MSFRBPHI)
BCI-33 Manufacturers’ new orders, non-defense capital goods excl. aircraft (NEWORDER)
BCI-29 Building permits, new private housing units (PERMIT)
BCI-19 Stock prices, 500 common stocks (M1346BUSM156NNBR)
BCI-107 Leading Credit Index (proxied here by real M2 money stock, M2REAL)
The Leading Credit Index itself is built from six underlying series: 1) the 2-year swap spread (real time); 2) the LIBOR 3-month less 3-month Treasury bill yield spread (real time); 3) debit balances in margin accounts at broker-dealers (monthly); 4) AAII Investor Sentiment, bullish (%) less bearish (%) (weekly); 5) the Senior Loan Officer C&I loan survey, banks tightening credit to large and medium firms (quarterly); and 6) security repurchases (quarterly) from the Total Finance-Liabilities section of the Federal Reserve's Flow of Funds report.
BCI-129 Interest rate spread, 10-year Treasury bonds less federal funds (T10YFF)
BCI-125 Avg. consumer expectations for business and economic conditions (UMCSENT)
Industrial Production: Total Index (INDPRO)
# clean workspace
rm(list = ls())
# Set the seed for reproducibility
set.seed(123)
# Prepare needed libraries
packages <- c("ggplot2",    # visualizations
              "dplyr",      # data manipulation
              "tidyr",      # data formatting
              "stargazer",  # LaTeX tables
              "knitr",      # markdown output formatting
              "readxl",     # read in Excel files
              "tsibble",    # tidy time series extension
              "zoo",        # time-indexing and analysis
              "hpfilter",   # Hodrick-Prescott filter
              "forecast",   # forecasting techniques
              "lubridate",  # date handling
              "tseries",    # time series analysis
              "fabletools", # tools for the fable framework
              "fable",      # tidy forecasting
              "fpp3",       # tsibble/fable meta-package
              "BCDating",   # business cycle dating tools
              "fredr",      # R bindings for the FRED API
              "purrr",      # functional programming tools
              "lintr"       # static code analysis
              )
# install and upload packages
for (i in 1:length(packages)) {
if (!packages[i] %in% rownames(installed.packages())) {
install.packages(packages[i]
, repos = "http://cran.rstudio.com/"
, dependencies = TRUE
)
}
library(packages[i], character.only = TRUE)
}
The data that we will use in this project can be found through the FRED API. We will use the fredr package to access the API quickly, which comes with the added benefit of being able to keep the index up to date. We will then use the fetch_data function (defined below) in a loop to grab all of the data we want and store the data frames in a list so they are easy to keep track of.
# function that collects our data
# coerce to data frame
fetch_data <- function(x, frequency = "m") {
as.data.frame(fredr(
series_id = x,
frequency = frequency
))
}
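# Note: fredr requires a (free) FRED API key before any request will succeed.
# A minimal setup sketch, assuming the key is stored in the FRED_API_KEY
# environment variable; run once per session before calling fetch_data().
fredr::fredr_set_key(Sys.getenv("FRED_API_KEY"))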
# list of series IDs
cmpts <- c(
"AWHMAN",
"ICSA",
"ACOGNO",
"NOCDFSA066MSFRBPHI",
"NEWORDER",
"PERMIT",
"M2REAL",
"T10YFF",
"UMCSENT",
"INDPRO"
)
# place to store data frames
data_list <- list()
# loop through list to collect data and store in list
for(i in 1:length(cmpts)){
data_list[[cmpts[i]]] <- fetch_data(cmpts[i], "m")
}
Now that we have our data we will need to do some processing so that we can create a data frame in the correct format. First we rename the value column from each API request to the name of its series. Then we un-nest our list of data frames and merge them so that each column represents a component and each row represents a month. Finally, we convert the merged data frame to a tibble with a year-month date column.
# change name of column `value` in each data frame in prep
for(i in 1:length(cmpts)) {
# rename individual df columns
names(data_list[[i]])[names(data_list[[i]]) == "value"] <- cmpts[i]
# use list slicing to drop un-needed columns
data_list[[i]] <- data_list[[i]][, c("date", cmpts[i])]
}
# Function to merge data frames by 'date'
merge_dfs_by_date <- function(dfs) {
reduce(dfs, full_join, by = "date")
}
# Merging all data frames by 'date'
merged_df <- merge_dfs_by_date(data_list)
# a tibble will be easiest to work with
df <- merged_df |>
  # format date as a year-month
  mutate(date = as.yearmon(date)) |>
  # coerce to tibble
  as_tibble()
The last step in preparing our data is to collect and then reformat the NBER business cycle dates so that we can plot them. This will allow us to visually inspect our leading index in the context of past business cycles.
We will start by using the fetch_data function that was defined earlier. The FRED API stores the recession dates in binary form (1 = recession, 0 = no recession). We must reshape this for easy plotting with ggplot; to accomplish this we create peak and trough columns that contain the corresponding dates.
# retrieve NBER Recession Dates
RECD <- fetch_data("USRECD")
# drop columns that are not needed
RECD <- RECD |>
select(-realtime_start,-realtime_end)
# show data frame structure
str(RECD)
## 'data.frame': 2037 obs. of 3 variables:
## $ date : Date, format: "1854-12-01" "1855-01-01" ...
## $ series_id: chr "USRECD" "USRECD" "USRECD" "USRECD" ...
## $ value : num 1 0 0 0 0 0 0 0 0 0 ...
Create leading and lagging values, then check those values for changes from 0 to 1 and from 1 to 0 to identify the peaks and troughs of the business cycle.
# find the peaks and trough in recession dates
statechanges <- RECD |>
mutate(prev_value = lag(value),
lead_value = lead(value)) |>
filter((value == 1 & prev_value == 0) | (value == 1 & lead_value == 0)) |>
mutate(indctr = ifelse(lead_value == 1, "peak", "trough") ) |>
select(date,indctr)
We should now have the peaks and troughs of the business cycle. This data needs to be processed slightly so that it is in the proper format for plotting.
# reformat data
turning_points <- statechanges[2:69,] |>
pivot_wider(names_from = indctr,
values_from = date)
# Unnest the list-cols from previous line
turning_points <- unnest(turning_points, cols = c(trough, peak))
# Ensure peak and trough columns in turning_points are in Date format
turning_points$peak <- as.yearmon(as.Date(turning_points$peak, format = "%Y-%m-%d"))
turning_points$trough <- as.yearmon(as.Date(turning_points$trough, format = "%Y-%m-%d"))
# get quarterly GDP data
rgdp_q <- fetch_data("GDPC1", frequency = "q")
# format the data
rgdp_q <- rgdp_q |>
rename(GDPC1 = value) |>
mutate(date = yearquarter(as.Date(date, format = "%Y-%m-%d"))) |>
select(date, GDPC1)
In this section we will create plots of all of our components. This is done so that we can verify that our data came in correctly and to get some insight into each series. We must first create some helper functions that use the ggplot2 library; the functions will then be applied to each component and the results stored in a list to keep them organized.
# ggplot main plotting function
cmpt_plot <- function(Data, Cmpt, color = "blue"){
# Filter the data to remove rows where the Cmpt column is NA
filtered_data <- Data[!is.na(Data[[Cmpt]]), ]
# build plot
ggplot(filtered_data, aes(x = date, y = .data[[Cmpt]])) +
geom_line(color = color) +
theme_linedraw()
}
# function to display recession bars
recession_bars <- function(start,stop){
geom_rect(data=turning_points[start:stop,],
aes(xmin=peak,
xmax=trough,
ymin=-Inf,
ymax=+Inf),
fill='grey1',
alpha=0.2,
inherit.aes = FALSE )
}
# apply the plotting function to every column of df
# (list element [[i]] corresponds to column i, so [[1]] is the date column)
plots_list <- lapply(colnames(df), cmpt_plot, Data = df)
plots_list[[2]] +
labs(title = "Average Weekly Hours of Production and Nonsupervisory Employees, Manufacturing")+
recession_bars(22,44)
plots_list[[3]] +
labs(title = "Average weekly initial claims for unemployment insurance") +
recession_bars(27,44)
plots_list[[4]] +
labs(title = "Manufacturers' new orders, consumer goods and materials") +
recession_bars(31,44)
plots_list[[5]] +
labs(title = "ISM new order index") +
recession_bars(27,44)
plots_list[[6]] +
labs(title = "Manufacturers' new orders, non-defense capital goods excl. aircraft") +
recession_bars(31,44)
plots_list[[7]] +
labs(title = "Housing Permits") + recession_bars(26,44)
plots_list[[8]] +
labs(title = "M2") + recession_bars(26,44)
plots_list[[9]] +
labs(title = "Interest rate spread, 10-year Treasury bonds less federal funds") + recession_bars(26,44)
plots_list[[10]] +
labs(title = "Avg. consumer expectations for business and economic conditions") + recession_bars(24,44)
plots_list[[11]] +
labs(title = "Industrial Production: Total Index ") +
recession_bars(17,44)
# build plot
gdp_plot <- cmpt_plot(rgdp_q,"GDPC1")
# add recession bars
gdp_plot +
labs(title = "Real GDP") +
recession_bars(22,44)
## Warning: Removed 10 rows containing missing values or values outside the scale range
## (`geom_rect()`).
To calculate our leading index we will use the Conference Board's composite index calculation methodology. This method has six steps and distills the information in each series into a single index that can serve as an early warning signal for the business cycle.
Month-to-month changes are computed for each component using the symmetric percent change formula for all components that are not already in that form.
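For the components that do need transforming, the symmetric percent change of a series $X_t$ is

$$x_t = 200 \times \frac{X_t - X_{t-1}}{X_t + X_{t-1}},$$

which is what the symmetric_percent_change function below computes; it treats increases and decreases symmetrically and is bounded between -200 and 200.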
We will start by dropping rows with missing data. Each component has a different length, so this standardizes the sample and means we do not have to do any imputation.
# find complete cases
df$ind <-complete.cases(df)
# drop all but the complete cases
df <- df |>
filter(ind == TRUE) |>
select(-ind)
# we end up with data from 1992- 2024
# slightly verbose way of doing this but it works well
# Function to calculate symmetric percent change for vectors
symmetric_percent_change <- function(X_t) {
return(200 * ((X_t - lag(X_t)) / (X_t + lag(X_t))))
}
# Columns to use
columns_to_change <- c("AWHMAN", "ICSA", "ACOGNO","NEWORDER", "PERMIT","M2REAL", "UMCSENT", "INDPRO")
# Create a new tibble to store results
pchange_results <- df |>
select(date, all_of(columns_to_change))
# calculate symmetric percent change and add new columns.
pchange_results <- pchange_results |>
mutate(across(all_of(columns_to_change),
.fns = ~ symmetric_percent_change(.),
.names = "symmetric_{.col}"))
# Convert to data frame
pchange_results <- as.data.frame(pchange_results)
# Print the results
stargazer(pchange_results, type = "text")
##
## ==================================================================
## Statistic N Mean St. Dev. Min Max
## ------------------------------------------------------------------
## AWHMAN 389 41.202 0.608 38.500 42.300
## ICSA 389 370,741.400 289,092.500 197,750 4,663,250
## ACOGNO 389 162,949.800 45,642.080 86,445 264,445
## NEWORDER 389 59,147.920 9,272.555 33,857 73,992
## PERMIT 389 1,372.758 403.027 513 2,263
## M2REAL 389 4,128.279 1,579.775 2,305.100 7,656.700
## UMCSENT 389 85.845 13.409 50.000 112.000
## INDPRO 389 91.754 11.330 61.919 104.104
## symmetric_AWHMAN 388 0.001 0.630 -6.775 3.951
## symmetric_ICSA 388 -0.347 11.321 -72.838 167.161
## symmetric_ACOGNO 388 0.274 2.331 -19.073 17.352
## symmetric_NEWORDER 388 0.200 3.036 -11.390 9.090
## symmetric_PERMIT 388 0.062 5.014 -24.655 17.011
## symmetric_M2REAL 388 0.260 0.673 -1.429 6.991
## symmetric_UMCSENT 388 -0.002 5.186 -21.504 15.132
## symmetric_INDPRO 388 0.133 1.070 -14.181 6.375
## ------------------------------------------------------------------
The monthly contributions of the components are adjusted to equalize the volatility of each component.
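Concretely, if $v_x$ is the standard deviation of component $x$'s monthly contribution $c_{x,t}$, the weights applied below are

$$w_x = \frac{1}{v_x}, \qquad r_x = \frac{w_x}{\sum_j w_j}, \qquad \text{adjusted contribution}_{x,t} = r_x \, c_{x,t},$$

so the weights sum to one and more volatile components receive proportionally smaller weights (this is what the calculate_adjusted function implements).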
# add the components that are already in the right form (no symmetric % change needed)
# and keep INDPRO plus its growth rate for the trend adjustment later
pchange_results <- pchange_results |>
  mutate(T10YFF = df$T10YFF,
         NOCDF = df$NOCDFSA066MSFRBPHI,
         INDPRO = df$INDPRO,
         INDPRO_GR = symmetric_INDPRO)
calculate_adjusted <- function(data, columns) {
adjusted_columns <- c()
data <- stats::na.omit(data)
for (col in columns) {
# calculate standard deviation
v_x <- sd(data[[col]])
# Check if the standard deviation is zero or NA
if (is.na(v_x) || v_x == 0) {
stop(paste("Standard deviation for column", col, "is NA or zero, can't invert"))
}
# Invert Standard deviation
w_x <- 1/v_x
# Append to adjusted columns
adjusted_columns <- c(adjusted_columns, w_x)
}
# sum inverted standard deviations for all columns
k <- sum(adjusted_columns)
# normalize the weights so they sum to 1
adjusted_columns <- adjusted_columns / k
# Adjust contributions for all columns
for (i in 1:length(columns)) {
col <- columns[i]
rx <- adjusted_columns[i]
data[[paste0("adj_", col)]] <- rx * data[[col]]
}
return(data)
}
# Columns for which you want to calculate adjusted contributions
columns_to_adj <- c("symmetric_AWHMAN",
"symmetric_ICSA",
"symmetric_ACOGNO",
"symmetric_NEWORDER",
"symmetric_PERMIT",
"symmetric_M2REAL",
"symmetric_UMCSENT",
"T10YFF",
"NOCDF")
# Call the function to calculate adjusted contributions for all columns
pchange_results <- calculate_adjusted(pchange_results, columns_to_adj)
# Print the results
stargazer(pchange_results, type = "text", title = "Leading index")
##
## Leading index
## ======================================================================
## Statistic N Mean St. Dev. Min Max
## ----------------------------------------------------------------------
## AWHMAN 388 41.204 0.608 38.500 42.300
## ICSA 388 370,557.200 289,442.900 197,750 4,663,250
## ACOGNO 388 163,147.000 45,534.820 90,386 264,445
## NEWORDER 388 59,213.110 9,194.857 35,258 73,992
## PERMIT 388 1,373.343 403.383 513 2,263
## M2REAL 388 4,132.597 1,579.515 2,305.100 7,656.700
## UMCSENT 388 85.889 13.398 50.000 112.000
## INDPRO 388 91.831 11.242 62.440 104.104
## symmetric_AWHMAN 388 0.001 0.630 -6.775 3.951
## symmetric_ICSA 388 -0.347 11.321 -72.838 167.161
## symmetric_ACOGNO 388 0.274 2.331 -19.073 17.352
## symmetric_NEWORDER 388 0.200 3.036 -11.390 9.090
## symmetric_PERMIT 388 0.062 5.014 -24.655 17.011
## symmetric_M2REAL 388 0.260 0.673 -1.429 6.991
## symmetric_UMCSENT 388 -0.002 5.186 -21.504 15.132
## symmetric_INDPRO 388 0.133 1.070 -14.181 6.375
## T10YFF 388 1.433 1.311 -1.470 3.800
## NOCDF 388 8.684 14.920 -69.300 46.700
## INDPRO_GR 388 0.133 1.070 -14.181 6.375
## adj_symmetric_AWHMAN 388 0.0002 0.195 -2.091 1.219
## adj_symmetric_ICSA 388 -0.006 0.195 -1.252 2.872
## adj_symmetric_ACOGNO 388 0.023 0.195 -1.592 1.448
## adj_symmetric_NEWORDER 388 0.013 0.195 -0.730 0.582
## adj_symmetric_PERMIT 388 0.002 0.195 -0.957 0.660
## adj_symmetric_M2REAL 388 0.075 0.195 -0.413 2.020
## adj_symmetric_UMCSENT 388 -0.0001 0.195 -0.807 0.568
## adj_T10YFF 388 0.213 0.195 -0.218 0.564
## adj_NOCDF 388 0.113 0.195 -0.903 0.609
## ----------------------------------------------------------------------
Sum the adjusted monthly contributions across the components for each month to obtain the growth rate of the index.
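In symbols, the growth rate of the index in month $t$ is the sum of the adjusted contributions across the components:

$$i_t = \sum_x r_x \, c_{x,t}.$$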
# Get the columns with adjusted contributions
adj_columns <- grep("^adj_", names(pchange_results), value = TRUE)
# Calculate the monthly growth rate for each month
pchange_results$new_monthly_growth_rate <- rowSums(pchange_results[, adj_columns])
# Print the results
stargazer(pchange_results, type = "text")
##
## =======================================================================
## Statistic N Mean St. Dev. Min Max
## -----------------------------------------------------------------------
## AWHMAN 388 41.204 0.608 38.500 42.300
## ICSA 388 370,557.200 289,442.900 197,750 4,663,250
## ACOGNO 388 163,147.000 45,534.820 90,386 264,445
## NEWORDER 388 59,213.110 9,194.857 35,258 73,992
## PERMIT 388 1,373.343 403.383 513 2,263
## M2REAL 388 4,132.597 1,579.515 2,305.100 7,656.700
## UMCSENT 388 85.889 13.398 50.000 112.000
## INDPRO 388 91.831 11.242 62.440 104.104
## symmetric_AWHMAN 388 0.001 0.630 -6.775 3.951
## symmetric_ICSA 388 -0.347 11.321 -72.838 167.161
## symmetric_ACOGNO 388 0.274 2.331 -19.073 17.352
## symmetric_NEWORDER 388 0.200 3.036 -11.390 9.090
## symmetric_PERMIT 388 0.062 5.014 -24.655 17.011
## symmetric_M2REAL 388 0.260 0.673 -1.429 6.991
## symmetric_UMCSENT 388 -0.002 5.186 -21.504 15.132
## symmetric_INDPRO 388 0.133 1.070 -14.181 6.375
## T10YFF 388 1.433 1.311 -1.470 3.800
## NOCDF 388 8.684 14.920 -69.300 46.700
## INDPRO_GR 388 0.133 1.070 -14.181 6.375
## adj_symmetric_AWHMAN 388 0.0002 0.195 -2.091 1.219
## adj_symmetric_ICSA 388 -0.006 0.195 -1.252 2.872
## adj_symmetric_ACOGNO 388 0.023 0.195 -1.592 1.448
## adj_symmetric_NEWORDER 388 0.013 0.195 -0.730 0.582
## adj_symmetric_PERMIT 388 0.002 0.195 -0.957 0.660
## adj_symmetric_M2REAL 388 0.075 0.195 -0.413 2.020
## adj_symmetric_UMCSENT 388 -0.0001 0.195 -0.807 0.568
## adj_T10YFF 388 0.213 0.195 -0.218 0.564
## adj_NOCDF 388 0.113 0.195 -0.903 0.609
## new_monthly_growth_rate 388 0.433 0.605 -3.702 2.647
## -----------------------------------------------------------------------
The sum of the adjusted contributions, i.e. the growth rate of the composite index, is adjusted to equate its trend to that of a coincident index. This is accomplished by adding an adjustment factor, $a$, to the growth rate of the index each month ($i_t' = i_t + a$). For example, the trend adjustment factor for the leading index is computed by subtracting its average monthly growth rate ($\sum_t i_t / T$, where $T$ is the number of observations in the sample) from the average monthly growth rate of the coincident index.
# Step 4: Adjust growth rates to match INDPRO
# Using INDPRO to proxy for coincident index
adjust_trend <- function(growth_rate, coincident_index_growth_rate) {
a <- mean(coincident_index_growth_rate) - mean(growth_rate) # Compute adjustment factor
return(growth_rate + a) # Adjusted growth rates
}
pchange_results$New_adj_GROWTHRATE <- adjust_trend(pchange_results$new_monthly_growth_rate,
pchange_results$INDPRO_GR)
The level of the index is computed using the symmetric percent change formula, applied in reverse. The index is calculated recursively starting from an initial value of 100 for the first month of the sample period (January 1959 in the Conference Board's documentation; the first month of our sample here).
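Inverting the symmetric percent change formula gives the recursion used below: starting from $I_1 = 100$,

$$I_t = I_{t-1} \times \frac{200 + i_t}{200 - i_t}.$$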
# Initialize the leading index levels
leading_index <- numeric(nrow(pchange_results))
leading_index[1] <- 100
# Compute the leading index levels recursively
for (i in 2:nrow(pchange_results)) {
leading_index[i] <- leading_index[i - 1] * (200 + pchange_results$New_adj_GROWTHRATE[i]) / (200 - pchange_results$New_adj_GROWTHRATE[i])
}
# Add the leading index levels to the dataframe
pchange_results$leading_index <- leading_index
# Print the results
stargazer(pchange_results, type = "text")
##
## =======================================================================
## Statistic N Mean St. Dev. Min Max
## -----------------------------------------------------------------------
## AWHMAN 388 41.204 0.608 38.500 42.300
## ICSA 388 370,557.200 289,442.900 197,750 4,663,250
## ACOGNO 388 163,147.000 45,534.820 90,386 264,445
## NEWORDER 388 59,213.110 9,194.857 35,258 73,992
## PERMIT 388 1,373.343 403.383 513 2,263
## M2REAL 388 4,132.597 1,579.515 2,305.100 7,656.700
## UMCSENT 388 85.889 13.398 50.000 112.000
## INDPRO 388 91.831 11.242 62.440 104.104
## symmetric_AWHMAN 388 0.001 0.630 -6.775 3.951
## symmetric_ICSA 388 -0.347 11.321 -72.838 167.161
## symmetric_ACOGNO 388 0.274 2.331 -19.073 17.352
## symmetric_NEWORDER 388 0.200 3.036 -11.390 9.090
## symmetric_PERMIT 388 0.062 5.014 -24.655 17.011
## symmetric_M2REAL 388 0.260 0.673 -1.429 6.991
## symmetric_UMCSENT 388 -0.002 5.186 -21.504 15.132
## symmetric_INDPRO 388 0.133 1.070 -14.181 6.375
## T10YFF 388 1.433 1.311 -1.470 3.800
## NOCDF 388 8.684 14.920 -69.300 46.700
## INDPRO_GR 388 0.133 1.070 -14.181 6.375
## adj_symmetric_AWHMAN 388 0.0002 0.195 -2.091 1.219
## adj_symmetric_ICSA 388 -0.006 0.195 -1.252 2.872
## adj_symmetric_ACOGNO 388 0.023 0.195 -1.592 1.448
## adj_symmetric_NEWORDER 388 0.013 0.195 -0.730 0.582
## adj_symmetric_PERMIT 388 0.002 0.195 -0.957 0.660
## adj_symmetric_M2REAL 388 0.075 0.195 -0.413 2.020
## adj_symmetric_UMCSENT 388 -0.0001 0.195 -0.807 0.568
## adj_T10YFF 388 0.213 0.195 -0.218 0.564
## adj_NOCDF 388 0.113 0.195 -0.903 0.609
## new_monthly_growth_rate 388 0.433 0.605 -3.702 2.647
## New_adj_GROWTHRATE 388 0.133 0.605 -4.003 2.346
## leading_index 388 142.715 25.014 100.000 195.462
## -----------------------------------------------------------------------
The index is re-based to average 100 in the base year (2016 here).
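That is, if $\bar{I}_{2016}$ denotes the average of the raw index over the twelve months of 2016, the re-based index is

$$\text{LEI}_t = 100 \times \frac{I_t}{\bar{I}_{2016}}.$$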
raw_index <- pchange_results |>
select(date,leading_index)
# create a new column for the leading index
df <- df |>
left_join(raw_index, by = "date")
# Convert the date strings to yearmon objects
base_year <- as.yearmon(c("2016-01-01", "2016-02-01", "2016-03-01",
"2016-04-01", "2016-05-01", "2016-06-01",
"2016-07-01","2016-08-01", "2016-09-01",
"2016-10-01","2016-11-01", "2016-12-01"))
# Filter the data for the base year (2016)
base_year_data <- df |>
filter(date %in% base_year)
# Calculate the average over the base year (2016)
base_year_average <- mean(base_year_data$leading_index, na.rm = TRUE)
# Rebase the leading index
df <- df |>
mutate(rebased_leading_index = leading_index * 100 / base_year_average)
# rename for readability
df <- df |>
rename(LEI = rebased_leading_index)
head(df)
## # A tibble: 6 × 13
## date AWHMAN ICSA ACOGNO NOCDFSA066MSFRBPHI NEWORDER PERMIT M2REAL T10YFF
## <yearmo> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Feb 1992 40.7 442200 86445 13 33857 1146 2453. 3.28
## 2 Mar 1992 40.7 429500 90386 9.8 35258 1082 2447. 3.53
## 3 Apr 1992 40.8 418250 90535 20.7 35992 1054 2439. 3.64
## 4 May 1992 40.9 417400 92615 12 36811 1056 2433. 3.55
## 5 Jun 1992 40.8 418750 93632 15.2 36666 1057 2422. 3.46
## 6 Jul 1992 40.8 442750 93275 19.8 36629 1089 2416. 3.5
## # ℹ 4 more variables: UMCSENT <dbl>, INDPRO <dbl>, leading_index <dbl>,
## # LEI <dbl>
# drop the NA in the first row, which corresponds to Feb 1992
# the symmetric % change cannot be computed for the first observation
df <- stats::na.omit(as.data.frame(df))
stargazer(df, type = "text")
##
## ==================================================================
## Statistic N Mean St. Dev. Min Max
## ------------------------------------------------------------------
## AWHMAN 388 41.204 0.608 38.500 42.300
## ICSA 388 370,557.200 289,442.900 197,750 4,663,250
## ACOGNO 388 163,147.000 45,534.820 90,386 264,445
## NOCDFSA066MSFRBPHI 388 8.684 14.920 -69.300 46.700
## NEWORDER 388 59,213.110 9,194.857 35,258 73,992
## PERMIT 388 1,373.343 403.383 513 2,263
## M2REAL 388 4,132.597 1,579.515 2,305.100 7,656.700
## T10YFF 388 1.433 1.311 -1.470 3.800
## UMCSENT 388 85.889 13.398 50.000 112.000
## INDPRO 388 91.831 11.242 62.440 104.104
## leading_index 388 142.715 25.014 100.000 195.462
## LEI 388 87.253 15.293 61.138 119.501
## ------------------------------------------------------------------
cmpt_plot(df,"LEI", color = "blue") +
recession_bars(31,34) +
labs(title = "Leading Index")
Unfortunately we only have enough data to construct this index for three recessions, but we will have to make do. Upon visual inspection the leading index does best around the 2008 recession, showing a decline well before the recession, and its trough comes slightly before the NBER trough date. As expected, the leading index does not do a great job of forecasting the pandemic recession; that recession was brought on by an exogenous factor in the form of a virus. Its performance ahead of the 2001 recession is just okay: the index begins to dip in late 2000, but the lead time is much shorter than in 2008. This could be due to some of the conditions surrounding the dot-com bubble that burst in late 2000.
# use kable to display recent LEI values
kable(tail(df)) |>
kableExtra::kable_classic(html_font = "Cambria", full_width = TRUE)
|     | date     | AWHMAN | ICSA   | ACOGNO | NOCDFSA066MSFRBPHI | NEWORDER | PERMIT | M2REAL | T10YFF | UMCSENT | INDPRO   | leading_index | LEI      |
|-----|----------|--------|--------|--------|--------------------|----------|--------|--------|--------|---------|----------|---------------|----------|
| 384 | Jan 2024 | 40.2   | 209500 | 243853 | -17.9              | 73527    | 1508   | 6700.7 | -1.27  | 79.0    | 101.4830 | 168.8196      | 103.2126 |
| 385 | Feb 2024 | 40.5   | 209250 | 249125 | -5.2               | 73929    | 1563   | 6668.9 | -1.12  | 76.9    | 102.7267 | 168.4936      | 103.0132 |
| 386 | Mar 2024 | 40.7   | 213600 | 252395 | 5.4                | 73774    | 1485   | 6673.2 | -1.12  | 79.4    | 102.5001 | 168.2013      | 102.8346 |
| 387 | Apr 2024 | 40.6   | 210250 | 254629 | 12.2               | 73940    | 1440   | 6660.6 | -0.79  | 77.2    | 102.4955 | 167.2732      | 102.2672 |
| 388 | May 2024 | 40.8   | 223000 | 251570 | -7.9               | 73284    | 1399   | 6689.1 | -0.85  | 69.1    | 103.2734 | 165.8758      | 101.4128 |
| 389 | Jun 2024 | 40.8   | 236800 | 250118 | -2.2               | 73673    | 1454   | 6716.2 | -1.02  | 68.2    | 103.5494 | 165.5864      | 101.2359 |
Important note: this data tends to lag by about two months because some of the series take longer than others to be published by the Fed.
The next step is to use the one-sided Hodrick-Prescott (HP) filter to estimate the trend of our index. This isolates the information left over in the cycle component, i.e. the deviations from trend. The HP filter is a robust de-trending technique and is often regarded as the best option outside of the phase-average trend (PAT) method; we do not have many observations, so PAT is not an option in this scenario.
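As a reminder, the standard (two-sided) HP filter chooses the trend $\tau_t$ to minimize

$$\sum_{t=1}^{T} (y_t - \tau_t)^2 + \lambda \sum_{t=2}^{T-1} \big[(\tau_{t+1} - \tau_t) - (\tau_t - \tau_{t-1})\big]^2,$$

where $\lambda$ is the smoothing parameter (144,000 in the code below); the one-sided version implemented by hp1 estimates the trend at each date using only data available up to that date.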
# create data frame for trend calculation
y <- as.data.frame(df$LEI)
# create trend
LEI_hptrend <- hp1(y, lambda = 144000)
# Calculate the cycle / deviation from trend
LEI_hpcycle <- y - LEI_hptrend
# Adding a new column using mutate
df <- df |>
mutate(LEI_hpcycle = LEI_hpcycle[,1],
LEI_hptrend = LEI_hptrend[,1])
# Define colors and labels for the legend
colors <- c( "deepskyblue")
labels <- c("hptrend")
# Build plot
LEI_hptrend_ggplot = cmpt_plot(df, "LEI" ) +
# Line for coincident_index with legend
geom_line(aes(y = LEI_hptrend , color = "hptrend"),
alpha = 0.7,
show.legend = TRUE) +
# Label graph
labs(x = "Date (monthly)",
y = "Index 2016 = 100",
title = " Leading index and HP trend") +
# Assign colors and labels
scale_color_manual(values = colors, labels = labels)
# recession bars
LEI_hptrend_ggplot = LEI_hptrend_ggplot + recession_bars(31,34)
print(LEI_hptrend_ggplot)
# Build plot
LEI_divhptrend_ggplot = cmpt_plot(df, "LEI_hpcycle", color = "darkturquoise" ) +
# Label graph
labs(x = "Date (monthly)",
y = "Index 2016 = 100",
title = " Leading index devations from HP trend")
# recession bars
LEI_divhptrend_ggplot = LEI_divhptrend_ggplot + recession_bars(31,34)
#display plot
print(LEI_divhptrend_ggplot)
# calculate year-over-year growth rate
df <- df|>
mutate(YoY = (LEI - lag(LEI,12)) / lag(LEI,12))
# plot yoy
cmpt_plot(df,"YoY", color = "darkslategray4") +
recession_bars(31,34)+
labs(title= "Leading Index Year over Year Growth Rate")
To get an alternative look at the cyclical portion of the leading index we can use the YoY growth rate. This effectively de-trends the data in a way that is easier to interpret. The resulting plot differs slightly from the one above because it spends more time in positive territory, but it tells pretty much the same story.
The reason we look at the de-trended values becomes clear from the plots. In the YoY plot, for example, the leading index growth rate goes negative a few months before each recession. For the more typical recessions the growth rate turns negative several months beforehand: roughly 19 months of negative readings around the 2008-09 downturn and about 7 months ahead of the 2001 recession. The leading index would have done a great job of forecasting the 2008 recession but could have missed the 2001 recession; 7 months of negative readings is not a strong signal, particularly because there are other periods with negative values that do not result in a recession. The recession triggered by the pandemic is an interesting case and somewhat of an outlier, because the event that caused it was exogenous to the economy and to our index, which includes no measure of global pandemic risk. Nonetheless it is an interesting event to study further, and perhaps some information can be extracted from it.
In this section we will evaluate the leading index. First we will see how well it aligns with the established recession dates, and then we will use the Granger causality test to see whether the leading index helps predict real GDP, our stand-in for the current state of the economy.
# aggregate the data to quarterly frequency
df_q <- df |>
select(date, LEI, INDPRO) |>
mutate(quarter = as.yearqtr(date)) |>
group_by(quarter) |>
summarise(LEI = mean(LEI),
INDPRO = mean(INDPRO))
# create tsibble and format data for eval
df_tsibble <- df_q |>
mutate(date = yearquarter(quarter)) |>
select(-quarter) |>
as_tsibble(index = date)
# join our LEI with real GDP at the quarterly level
ind_eva <- left_join(df_tsibble, rgdp_q, by = "date")
We use the BCDating package, and specifically the Harding-Pagan (BBQ) method, for detecting turning points. The first thing we will do here is take another look at the LEI alongside the official NBER recession dates; this lets us compare them with the turning points detected in our index.
# plot LEI again for comparison
cmpt_plot(df,"LEI", color = "blue") +
recession_bars(31,34) +
labs(title = "Leading Index")
# turning points
turning_points[31:34,]
## # A tibble: 4 × 2
## peak trough
## <yearmon> <yearmon>
## 1 Aug 1990 Mar 1991
## 2 Apr 2001 Nov 2001
## 3 Jan 2008 Jun 2009
## 4 Mar 2020 Apr 2020
# BCDating works with ts objects, so convert the LEI
newind <- ts(ind_eva$LEI,
start = 1993,
end = 2023,
frequency = 4)
# fit model
turns <- BBQ(newind, mincycle = 5, minphase = 2, name = "" )
# Plot turns
plot(turns, main = "Leading Index turning points")
# display turning point
print(turns)
## Peaks Troughs Duration
## 1 1996Q1 1997Q1 4
## 2 1999Q1 1999Q4 3
## 3 2001Q2 2002Q4 6
## 4 2007Q1 2010Q1 12
## 5 2019Q4 2021Q2 6
To perform the Granger causality test we need to make sure that our data is stationary. We will check this by examining the autocorrelation function and the results of the KPSS unit root test, and then difference the data as needed.
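For reference, the Granger test run at the end of this section compares two regressions on the stationary (differenced) series, here with three lags; a sketch of the restricted and unrestricted models:

$$\Delta GDP_t = \alpha + \sum_{j=1}^{3} \beta_j \, \Delta GDP_{t-j} + \varepsilon_t
\quad \text{vs.} \quad
\Delta GDP_t = \alpha + \sum_{j=1}^{3} \beta_j \, \Delta GDP_{t-j} + \sum_{j=1}^{3} \gamma_j \, \Delta LEI_{t-j} + \varepsilon_t,$$

and the F-test asks whether the $\gamma_j$ are jointly zero, i.e. whether lagged values of the LEI add predictive power for GDP.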
We will inspect the data further to gather the information needed for modeling. This is an exploratory step that is required before using the Granger test.
# use autocorrelation function & plot
ind_eva |>
ACF(LEI) |>
autoplot()+
labs(title = "ACF of LEI")
# use partial auto-correlation function & plot
ind_eva |>
PACF(LEI) |>
autoplot()+
labs(title = "PACF of LEI ")
# perform KPSS test
ind_eva |>
features(LEI, unitroot_kpss)
## # A tibble: 1 × 2
## kpss_stat kpss_pvalue
## <dbl> <dbl>
## 1 2.46 0.01
# use autocorrelation function on differenced data & plot
ind_eva |>
  ACF(difference(LEI)) |>
  autoplot() +
  labs(title = "ACF of differenced LEI")
# use partial autocorrelation function on differenced data & plot
ind_eva |>
  PACF(difference(LEI)) |>
  autoplot() +
  labs(title = "PACF of differenced LEI")
ind_eva |>
features(difference(LEI), unitroot_kpss)
## # A tibble: 1 × 2
## kpss_stat kpss_pvalue
## <dbl> <dbl>
## 1 0.127 0.1
# store the first-differenced LEI
ind_eva <- ind_eva |>
  mutate(LEI_diff = difference(LEI))
# plot first-differenced LEI
cmpt_plot(ind_eva, "LEI_diff") +
  recession_bars(31,34) +
  labs(title = "First differenced LEI")
# use autocorrelation function and plot
ind_eva |>
ACF(GDPC1) |>
autoplot()+
labs(title = "ACF of Real GDP")
# use partial autocorrelation function and plot
ind_eva |>
PACF(GDPC1) |>
autoplot()+
labs(title = "PACF of Real GDP")
# perform KPSS test
ind_eva |>
features(GDPC1, unitroot_kpss)
## # A tibble: 1 × 2
## kpss_stat kpss_pvalue
## <dbl> <dbl>
## 1 2.64 0.01
# use autocorrelation function on differenced data & plot
ind_eva |>
  ACF(difference(GDPC1)) |>
  autoplot() +
  labs(title = "ACF of differenced Real GDP")
# use partial autocorrelation function on differenced data & plot
ind_eva |>
  PACF(difference(GDPC1)) |>
  autoplot() +
  labs(title = "PACF of differenced Real GDP")
# KPSS test on differenced data
ind_eva |>
features(difference(GDPC1), unitroot_kpss)
## # A tibble: 1 × 2
## kpss_stat kpss_pvalue
## <dbl> <dbl>
## 1 0.0784 0.1
ind_eva <- ind_eva |>
mutate(GDPC1_diff = difference(GDPC1))
# plot differenced GDP
cmpt_plot(ind_eva, "GDPC1_diff") +
  recession_bars(31,34) +
  labs(title = "First differenced Real GDP")
# grangertest() requires plain ts objects; the series are quarterly, so frequency = 4
LEI_ts <- ts(ind_eva$LEI_diff,
             start = 1993,
             frequency = 4)
GDPC1_ts <- ts(ind_eva$GDPC1_diff,
               start = 1993,
               frequency = 4)
# fit test
cause_test <- lmtest::grangertest(LEI_ts, GDPC1_ts, order=3, test="F")
print(cause_test)
## Granger causality test
##
## Model 1: GDPC1_ts ~ Lags(GDPC1_ts, 1:3) + Lags(LEI_ts, 1:3)
## Model 2: GDPC1_ts ~ Lags(GDPC1_ts, 1:3)
## Res.Df Df F Pr(>F)
## 1 342
## 2 345 -3 4.1956 0.006192 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The Granger causality test indicates that the LEI Granger-causes GDP. This finding suggests that we can use the LEI to understand where the economy is headed in the business cycle. An important caveat is that real GDP (and the INDPRO series used in the trend adjustment) is only a proxy for the overall state of the economy, so this is an approximation. With that said, this Leading Index of Economic Indicators does a reasonably good job of forecasting movements in the economy.
1.) https://www.conference-board.org/topics/consumer-confidence
3.) https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8814802/
4.) https://link.springer.com/referenceworkentry/10.1007/978-94-007-0753-5_542
5.) https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7882057/
6.) https://www.econstor.eu/bitstream/10419/217935/1/sajbm-v17i3-1048.pdf
9.) https://www.jstor.org/stable/1209163
10.) https://acta.uni-obuda.hu/Bilan_Gavurova_Stanislaw_Tkacova_78.pdf
11.) https://www.joim.com/wp-content/uploads/emember/downloads/p0671.pdf
13.) https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2730429
Data sources
14.) https://www.philadelphiafed.org/surveys-and-data/regional-economic-analysis/state-coincident - Not used: state coincident indexes
15.) https://www.conference-board.org/data/bci/index.cfm?id=2160 - Description of components
16.) https://www.federalreserve.gov/data.htm - Useful data hub
18.) https://fred.stlouisfed.org/categories - Also a useful data hub
19.) Using the same file from class for the historic calculation of coincident index with new variable added.
20.) https://www.frbsf.org/research-and-insights/data-and-indicators/daily-news-sentiment-index/ - This is daily news sentiment!