Executive Summary
- Package Installation and Loading
- Data Acquisition and Preparation
Initial Market Overview: Mortgage Rates vs. Home Price Index
Home Price Behavior Around Historical Mortgage Rate Peaks
Home Price Behavior Around Historical Mortgage Rate Lows
The Impact of Interest Rates on Home Buying: A Detailed Financial Analysis
Comparative Analysis of Peak vs. Low Mortgage Rate Scenarios
Key Takeaways for Home Buying Decisions

Executive Summary

This report provides a comprehensive analysis of the historical relationship between U.S. mortgage rates and national home prices. By examining how home values have behaved around periods of both peak and historically low interest rates, I aim to offer data-driven insights for prospective home buyers and real estate investors.

Beyond simple trend observation, I delve into a financial model that quantifies the time to profitability and potential returns for properties purchased during these distinct market cycles. Understanding these dynamics is crucial for making informed decisions in a volatile housing market.

Package Installation and Loading

This section ensures that all necessary R packages for running this report are installed and loaded. If you are running this report for the first time, these packages will be installed automatically.

required_packages <- c("data.table", "lubridate", "ggplot2", "patchwork", "cowplot")

# Check if packages are installed, install if not, then load them
for (pkg in required_packages) {
  if (!require(pkg, character.only = TRUE)) {
    install.packages(pkg, dependencies = TRUE)
    library(pkg, character.only = TRUE)
  }
}

Data Acquisition and Preparation

To conduct this analysis, I leverage two authoritative data sets:

Freddie Mac Primary Mortgage Market Survey (PMMS): Providing historical 30-year fixed mortgage rates. (Source)
Federal Housing Finance Agency (FHFA) House Price Index (HPI): Offering national average home price indices. (Source)

The following steps outline the process of loading, cleaning, and merging these data sets into a cohesive structure suitable for our analysis.

Loading Raw Data

I begin by directly importing the raw data files from their respective online repositories using data.table’s efficient fread function.

# Define the URL for the Freddie Mac mortgage rates CSV
mortgage_rates_url <- "https://www.freddiemac.com/pmms/docs/PMMS_history.csv"

# Load the mortgage rates data directly from the URL
mortgageRates <- fread(mortgage_rates_url, header = TRUE, fill = TRUE)

# Display the first few rows to confirm successful loading
cat("### Raw Freddie Mac Mortgage Rates Data Preview:\n")

## ### Raw Freddie Mac Mortgage Rates Data Preview:

print(head(mortgageRates))

##         date pmms30 pmms30p pmms15 pmms15p pmms51 pmms51p pmms51m pmms51spread
##       <char>  <num>   <num>  <num>   <num>  <num>   <num>   <num>        <num>
## 1:  4/2/1971   7.33      NA     NA      NA     NA      NA      NA           NA
## 2:  4/9/1971   7.31      NA     NA      NA     NA      NA      NA           NA
## 3: 4/16/1971   7.31      NA     NA      NA     NA      NA      NA           NA
## 4: 4/23/1971   7.31      NA     NA      NA     NA      NA      NA           NA
## 5: 4/30/1971   7.29      NA     NA      NA     NA      NA      NA           NA
## 6:  5/7/1971   7.38      NA     NA      NA     NA      NA      NA           NA

# Define the URL for the FHFA House Price Index CSV
home_values_url <- "https://www.fhfa.gov/hpi/download/monthly/hpi_master.csv"

# Load the FHFA Home Price Index data
homeValues <- fread(file = home_values_url)

# Display the first few rows to confirm successful loading
cat("\n### Raw FHFA Home Price Index Data Preview:\n")

## 
## ### Raw FHFA Home Price Index Data Preview:

print(head(homeValues))

##       hpi_type    hpi_flavor frequency                  level
##         <char>        <char>    <char>                 <char>
## 1: traditional purchase-only   monthly USA or Census Division
## 2: traditional purchase-only   monthly USA or Census Division
## 3: traditional purchase-only   monthly USA or Census Division
## 4: traditional purchase-only   monthly USA or Census Division
## 5: traditional purchase-only   monthly USA or Census Division
## 6: traditional purchase-only   monthly USA or Census Division
##                     place_name place_id    yr period index_nsa index_sa
##                         <char>   <char> <int>  <int>     <num>    <num>
## 1: East North Central Division   DV_ENC  1991      1    100.00   100.00
## 2: East North Central Division   DV_ENC  1991      2    100.87   100.87
## 3: East North Central Division   DV_ENC  1991      3    101.32   100.91
## 4: East North Central Division   DV_ENC  1991      4    101.72   100.97
## 5: East North Central Division   DV_ENC  1991      5    102.32   101.31
## 6: East North Central Division   DV_ENC  1991      6    102.80   101.46

Data Cleaning and Date Conversion

Both data sets contain date information in different formats. For accurate merging and time-series analysis, we need to convert these date columns into a standardized format (IDate) and aggregate the mortgage rate data to a monthly frequency to match the HPI data.

# --- Process Home Values Data (FHFA HPI) ---

# The FHFA HPI data uses 'yr' and 'period' (month number) columns.
# We combine them to create a standard 'YYYY-MM-DD' date string (using '01' for the day) and then convert it to an IDate object using lubridate's ymd() function.

homeValues[, date := ymd(paste(yr, period, "01", sep = "-"))]

cat("### Home Price Index Data with Cleaned Dates:\n")

## ### Home Price Index Data with Cleaned Dates:

print(head(homeValues))

##       hpi_type    hpi_flavor frequency                  level
##         <char>        <char>    <char>                 <char>
## 1: traditional purchase-only   monthly USA or Census Division
## 2: traditional purchase-only   monthly USA or Census Division
## 3: traditional purchase-only   monthly USA or Census Division
## 4: traditional purchase-only   monthly USA or Census Division
## 5: traditional purchase-only   monthly USA or Census Division
## 6: traditional purchase-only   monthly USA or Census Division
##                     place_name place_id    yr period index_nsa index_sa
##                         <char>   <char> <int>  <int>     <num>    <num>
## 1: East North Central Division   DV_ENC  1991      1    100.00   100.00
## 2: East North Central Division   DV_ENC  1991      2    100.87   100.87
## 3: East North Central Division   DV_ENC  1991      3    101.32   100.91
## 4: East North Central Division   DV_ENC  1991      4    101.72   100.97
## 5: East North Central Division   DV_ENC  1991      5    102.32   101.31
## 6: East North Central Division   DV_ENC  1991      6    102.80   101.46
##          date
##        <Date>
## 1: 1991-01-01
## 2: 1991-02-01
## 3: 1991-03-01
## 4: 1991-04-01
## 5: 1991-05-01
## 6: 1991-06-01

cat("\nStructure of 'date' column in Home Price Index data:\n")

## 
## Structure of 'date' column in Home Price Index data:

print(str(homeValues$date))

##  Date[1:130687], format: "1991-01-01" "1991-02-01" "1991-03-01" "1991-04-01" "1991-05-01" ...
## NULL

# --- Process Mortgage Rates Data (Freddie Mac) ---

# The mortgageRates data uses a 'date' column in 'Month/Day/Year' format.
# We convert this directly to an IDate.

mortgageRates[, date := mdy(date)]

# Next, we aggregate weekly mortgage rates to a monthly average.
# This ensures consistency with the monthly FHFA HPI data.

monthly_avg_mortgage_rates <- mortgageRates[, .(
  avg_mortgage_rate = mean(`pmms30`, na.rm = TRUE)
), by = .(yr = year(date), period = month(date))]

# Create a matching 'date' column (first day of the month) for the monthly averages
monthly_avg_mortgage_rates[, date := ymd(paste(yr, period, "01", sep = "-"))]

# Select only the relevant columns for merging
monthly_avg_mortgage_rates <- monthly_avg_mortgage_rates[, .(date, avg_mortgage_rate)]

cat("\n### Monthly Average Mortgage Rates Data with Cleaned Dates:\n")

## 
## ### Monthly Average Mortgage Rates Data with Cleaned Dates:

print(head(monthly_avg_mortgage_rates))

##          date avg_mortgage_rate
##        <Date>             <num>
## 1: 1971-04-01            7.3100
## 2: 1971-05-01            7.4250
## 3: 1971-06-01            7.5300
## 4: 1971-07-01            7.6040
## 5: 1971-08-01            7.6975
## 6: 1971-09-01            7.6875

cat("\nStructure of 'date' column in Monthly Average Mortgage Rates data:\n")

## 
## Structure of 'date' column in Monthly Average Mortgage Rates data:

print(str(monthly_avg_mortgage_rates$date))

##  Date[1:652], format: "1971-04-01" "1971-05-01" "1971-06-01" "1971-07-01" "1971-08-01" ...
## NULL

Merging Datasets: Creating the Combined Analytical Base

To enable direct comparison and analysis, I merge the processed national home price index data with the monthly average mortgage rates.

I perform an inner join on the date column, meaning only months where data exists in both data sets will be included in the final merged_data. This ensures that the analysis is based on synchronously available historical information.

# Step 1: Aggregate homeValues to get a single national average index_nsa per date
# The FHFA HPI dataset contains regional data; we average it to get a national view.

national_avg_home_values <- homeValues[, .(
  national_avg_index_nsa = mean(index_nsa, na.rm = TRUE)
), by = .(date)]

# Step 2: Merge the national_avg_home_values with monthly_avg_mortgage_rates
# Using `all = FALSE` for an inner join ensures we only analyze periods
# where both HPI and mortgage rate data are available.

merged_data <- merge(national_avg_home_values, monthly_avg_mortgage_rates,
                     by = "date", all = FALSE)

Initial Market Overview: Mortgage Rates vs. Home Price Index

Before diving into specific scenarios, let’s visualize the overarching trends of both 30-year fixed mortgage rates and the National Home Price Index over the entire data set period. This foundational plot helps us understand the long-term relationship and major market shifts.

# Calculate a scaling factor for the secondary axis
# This helps align the two lines visually for comparison of trends.
# We scale the mortgage rate to fit the home price index range, ensuring both trends are clearly visible.

max_index <- max(merged_data$national_avg_index_nsa, na.rm = TRUE)
max_rate <- max(merged_data$avg_mortgage_rate, na.rm = TRUE)
scaling_factor <- max_index / max_rate

# Create the ggplot
ggplot(merged_data, aes(x = date)) +
  # Line for National Average Home Price Index
  geom_line(aes(y = national_avg_index_nsa, color = "Home Price Index"), size = 1.2) +
  # Line for Average Mortgage Rate (scaled for visual comparison)
  geom_line(aes(y = avg_mortgage_rate * scaling_factor, color = "Mortgage Rate"), size = 1.2) +
  # Set colors for the lines
  scale_color_manual(name = "Data Series",
                     values = c("Home Price Index" = "#0072B2", "Mortgage Rate" = "#D55E00")) + # Blue and Orange
  # Set titles and labels
  labs(
    title = "Mortgage Rates vs. FHFA National Home Price Index (1971-Present)",
    subtitle = "Illustrating Long-Term Trends in U.S. Housing Costs and Values",
    x = "Date",
    y = "National Home Price Index (NSA)",
    caption = paste0("Data Sources: Freddie Mac (Mortgage Rates), FHFA (Home Price Index)\n",
                     "Mortgage Rate scaled by a factor of ", round(scaling_factor, 2), " for visual comparison.")
  ) +
  # Customize Y-axis: primary for Home Price Index, secondary for Mortgage Rate
  scale_y_continuous(
    name = "National Home Price Index (NSA)",
    sec.axis = sec_axis(~ . / scaling_factor, name = "Average 30-Yr Fixed Mortgage Rate (%)")
  ) +
  # Customize X-axis for better date formatting
  scale_x_date(date_breaks = "5 years", date_labels = "%Y") +
  # Apply a clean theme and further customize aesthetics
  theme_minimal() +
  theme(
    plot.title = element_text(hjust = 0.5, face = "bold", size = 18),
    plot.subtitle = element_text(hjust = 0.5, size = 12, color = "gray30"),
    axis.title = element_text(size = 13, face = "bold"),
    axis.text = element_text(size = 10),
    legend.position = "bottom", # Position legend at the bottom
    legend.title = element_text(face = "bold"),
    legend.text = element_text(size = 11),
    panel.grid.minor = element_blank(), # Remove minor grid lines
    panel.grid.major.x = element_line(linetype = "dotted", color = "gray80"), # Dotted vertical grid lines
    panel.grid.major.y = element_line(linetype = "dotted", color = "gray80") # Dotted horizontal grid lines
  )

Home Price Behavior Around Historical Mortgage Rate Peaks

This section visualizes how the Home Price Index (HPI) has historically behaved when mortgage rates reached their highest points. Each panel focuses on a specific peak, normalizing the HPI to 0% at the month of that peak, allowing for a clear comparison of post-peak home value trends.

# 1. Define specific, historically significant mortgage rate peak dates
peak_dates_info <- list(
  list(date = as.IDate("1981-10-01"), label = "Early 1980s Peak (~18%)"),
  list(date = as.IDate("1988-10-01"), label = "Late 1980s Peak (~11%)"),
  list(date = as.IDate("1994-12-01"), label = "Mid 1990s Peak (~9%)"),
  list(date = as.IDate("2000-05-01"), label = "Early 2000s Peak (~8.5%)"),
  list(date = as.IDate("2006-07-01"), label = "Mid 2000s Peak (~6.5%)"),
  list(date = as.IDate("2023-10-01"), label = "Recent Peak (~7.5%)")
)

# Initialize an empty list to store data frames for each cycle
cycle_data_list <- list()

# Define the window size around each peak
years_before_peak <- 4
years_after_peak <- 6

# Get overall min/max dates as standard Date objects for comparison
min_date_overall_base <- as.Date(min(merged_data$date, na.rm = TRUE))
max_date_overall_base <- as.Date(max(merged_data$date, na.rm = TRUE))

# Loop through each peak date to extract and process data for plotting
for (i in seq_along(peak_dates_info)) {
  peak_info <- peak_dates_info[[i]]
  peak_date_obj_idate <- peak_info$date
  cycle_label <- peak_info$label

  # Convert peak_date_obj to base Date for lubridate arithmetic
  peak_date_obj_base <- as.Date(peak_date_obj_idate)

  # Calculate window start and end using base Date and lubridate::years(),
  # then clamp to overall data boundaries, and convert back to IDate.
  window_start <- as.IDate(max(peak_date_obj_base - years(years_before_peak), min_date_overall_base))
  window_end <- as.IDate(min(peak_date_obj_base + years(years_after_peak), max_date_overall_base))

  # Filter merged_data for the current window
  current_cycle_data <- merged_data[date >= window_start & date <= window_end]

  # Ensure there's enough data in the window for a meaningful plot (e.g., at least 6 months)
  if (nrow(current_cycle_data) < 6) {
    message(paste("Skipping cycle '", cycle_label, "' due to insufficient data in window."))
    next
  }

  # --- IMPORTANT: Calculate HPI Percentage Change from Peak ---
  # Find the actual peak rate date and its corresponding HPI value within the *current window*
  # This makes the HPI normalization precise for each panel.
  peak_date_in_panel_row <- current_cycle_data[which.max(avg_mortgage_rate)]
  peak_date_in_panel <- peak_date_in_panel_row$date
  hpi_at_peak_date_for_panel <- peak_date_in_panel_row$national_avg_index_nsa

  # Ensure hpi_at_peak_date_for_panel is not NA or zero to avoid division issues
  if (is.na(hpi_at_peak_date_for_panel) || hpi_at_peak_date_for_panel == 0) {
      message(paste("Skipping cycle '", cycle_label, "' due to missing or zero HPI at peak rate date."))
      next
  }

  current_cycle_data[, hpi_pct_change_from_peak := ((national_avg_index_nsa / hpi_at_peak_date_for_panel) - 1) * 100]

  # Add peak date specific info to the data.table for annotation later
  current_cycle_data[, panel_peak_date := peak_date_in_panel]
  current_cycle_data[, panel_peak_rate_value := peak_date_in_panel_row$avg_mortgage_rate]


  current_cycle_data[, cycle_label := factor(cycle_label, levels = sapply(peak_dates_info, `[[`, "label"))]

  cycle_data_list[[i]] <- current_cycle_data
}

# Combine all cycle data into a single data.table
all_cycle_data <- rbindlist(cycle_data_list, fill = TRUE)

# --- GLOBAL SCALING FACTOR FOR MORTGAGE RATES ---
# This ensures consistency of the secondary axis across all panels.
# We map the range of mortgage rates to the observed range of HPI percentage change.
min_hpi_pct_change_global <- min(all_cycle_data$hpi_pct_change_from_peak, na.rm = TRUE)
max_hpi_pct_change_global <- max(all_cycle_data$hpi_pct_change_from_peak, na.rm = TRUE)
range_hpi_pct_global <- max_hpi_pct_change_global - min_hpi_pct_change_global

min_rate_global <- min(merged_data$avg_mortgage_rate, na.rm = TRUE)
max_rate_global <- max(merged_data$avg_mortgage_rate, na.rm = TRUE)
range_rate_global <- max_rate_global - min_rate_global

# Calculate a conversion factor to map rate range onto HPI percentage range
rate_conversion_factor <- range_hpi_pct_global / range_rate_global

# Calculate an offset to align the minimum of the scaled rate with the minimum of HPI percentage
rate_offset <- min_hpi_pct_change_global - (min_rate_global * rate_conversion_factor)


# 2. Create the list of individual ggplot panels
plot_list <- list()

# Get the levels of cycle_label to ensure plots are generated and ordered correctly
cycle_labels_ordered <- levels(all_cycle_data$cycle_label)

for (label in cycle_labels_ordered) {
  # Filter data for the current panel
  panel_data <- all_cycle_data[cycle_label == label]

  if (nrow(panel_data) == 0) {
    next
  }

  # These values are now stored in panel_data from the loop above
  peak_date_in_panel <- panel_data$panel_peak_date[1] # Take first value, as it's constant for the panel
  peak_rate_value_in_panel <- panel_data$panel_peak_rate_value[1]


  p <- ggplot(panel_data, aes(x = date)) +
    # Line for Home Price Index (% Change from Peak Rate Date)
    geom_line(aes(y = hpi_pct_change_from_peak, color = "Home Price Index (% Change)"), size = 1) +
    # Line for Average Mortgage Rate (scaled for visual comparison)
    geom_line(aes(y = avg_mortgage_rate * rate_conversion_factor + rate_offset, color = "Mortgage Rate"), size = 1) +
    # Mark the peak rate date
    geom_vline(xintercept = as.numeric(peak_date_in_panel), linetype = "solid", color = "darkgreen", size = 0.8) +
    # Annotate the peak rate date and value
    geom_text(aes(x = peak_date_in_panel,
                  y = min(panel_data$hpi_pct_change_from_peak, na.rm = TRUE), # Position at bottom of HPI % range
                  label = paste0("Peak Rate:\n", round(peak_rate_value_in_panel, 2), "%\n(", format(peak_date_in_panel, "%b %Y"), ")")),
              color = "darkgreen", hjust = -0.1, vjust = 1.2, size = 3, angle = 90) +
    # Set colors for the lines
    scale_color_manual(name = "Data Series",
                       values = c("Home Price Index (% Change)" = "#0072B2", "Mortgage Rate" = "#D55E00")) +
    # Set titles and labels
    labs(
      title = label, # Use the predefined label as the panel title
      x = NULL, # Remove x-axis label for individual plots for cleaner layout
      y = "Home Price Index (% Change)"
    ) +
    # Customize Y-axis: primary for HPI % Change, secondary for Mortgage Rate
    scale_y_continuous(
      name = "Home Price Index (% Change)",
      sec.axis = sec_axis(~ (. - rate_offset) / rate_conversion_factor, name = "Average 30-Yr Fixed Mortgage Rate")
    ) +
    # Ensure x-axis shows appropriate breaks for the wider window
    scale_x_date(limits = c(min(panel_data$date), max(panel_data$date)),
                 date_breaks = "2 years", date_labels = "%Y") +
    theme_minimal() +
    theme(
      plot.title = element_text(hjust = 0.5, face = "bold", size = 12),
      axis.title.x = element_blank(),
      axis.title.y = element_text(size = 10),
      axis.text = element_text(size = 8),
      legend.position = "none", # Hide individual legends
      panel.grid.minor = element_blank(),
      panel.grid.major.x = element_line(linetype = "dotted", color = "gray80"),
      panel.grid.major.y = element_line(linetype = "dotted", color = "gray80"),
      plot.margin = unit(c(0.2, 0.5, 0.2, 0.2), "cm") # Adjust plot margins for better spacing
    )

  plot_list[[label]] <- p
}

# Order the plot_list by the original sequence of peak_dates_info
ordered_plot_list <- plot_list[sapply(peak_dates_info, `[[`, "label")]

# 3. Create a common legend for the entire arrangement
dummy_plot <- ggplot(all_cycle_data[1,], aes(x = date)) + # Use all_cycle_data for dummy plot for correct legend names
  geom_line(aes(y = hpi_pct_change_from_peak, color = "Home Price Index (% Change)")) +
  geom_line(aes(y = avg_mortgage_rate * rate_conversion_factor + rate_offset, color = "Mortgage Rate")) +
  scale_color_manual(name = "Data Series",
                     values = c("Home Price Index (% Change)" = "#0072B2", "Mortgage Rate" = "#D55E00")) +
  theme(legend.position = "bottom")

common_legend <- get_legend(dummy_plot)

# 4. Combine all plots using patchwork
final_plot_arrangement <- wrap_plots(ordered_plot_list, ncol = 2)

final_plot <- final_plot_arrangement +
  plot_annotation(
    title = "Home Price Index (% Change) vs. Mortgage Rates Around Historical Peaks",
    subtitle = paste0("Home Price Index normalized to 0% at the month of the Mortgage Rate Peak.\n",
                     "Each panel shows data up to approximately ", years_before_peak, " years before to ", years_after_peak, " years after the indicated peak rate."),
    caption = paste0("Data Sources: Freddie Mac (Mortgage Rates), FHFA (Home Price Index)\n",
                     "Green vertical line indicates the month of the highest mortgage rate in each panel's window."),
    theme = theme(plot.title = element_text(hjust = 0.5, face = "bold", size = 20),
                  plot.subtitle = element_text(hjust = 0.5, size = 14, color = "gray30"),
                  plot.caption = element_text(size = 10, color = "gray50", hjust = 0))
  ) +
  common_legend

# Print the final combined plot
print(final_plot)

Home Price Behavior Around Historical Mortgage Rate Lows

This section provides a detailed look at how the Home Price Index behaved around specific historical lows in mortgage rates. Similar to the peak analysis, the Home Price Index will be normalized to 0% at the month of the mortgage rate low for each panel, highlighting the subsequent home value trends.

# 1. Define specific, historically significant mortgage rate lowest dates
# Now including 5 periods
low_dates_info <- list(
  list(date = as.IDate("1971-04-01"), label = "Early 1970s Low (~7.3%)"),
  list(date = as.IDate("1993-11-01"), label = "Early 1990s Low (~6.7%)"),
  list(date = as.IDate("2003-06-01"), label = "Early 2000s Low (~5.2%)"), # NEW LOW PERIOD ADDED
  list(date = as.IDate("2012-11-01"), label = "Post-2008 Low (~3.3%)"),
  list(date = as.IDate("2020-12-01"), label = "COVID-Era Low (~2.7%)")
)

# Initialize an empty list to store data frames for each cycle
cycle_data_list_low <- list()

# Define the window size around each low
years_before_low <- 4
years_after_low <- 6

# Get overall min/max dates as standard Date objects for comparison
min_date_overall <- min(merged_data$date, na.rm = TRUE)
max_date_overall <- max(merged_data$date, na.rm = TRUE)


# Loop through each low date to extract and process data for plotting
for (i in seq_along(low_dates_info)) {
  low_info <- low_dates_info[[i]]
  low_date_obj_idate <- low_info$date # This is the target low date

  # Find the actual date in merged_data closest to the target low date
  # FIX: Convert difference to numeric before abs()
  closest_date_idx <- which.min(abs(as.numeric(merged_data$date - low_date_obj_idate)))
  actual_low_date_in_data <- merged_data$date[closest_date_idx]

  cycle_label <- low_info$label

  # Calculate window start and end using the actual data point's date
  window_start <- as.IDate(max(actual_low_date_in_data - years(years_before_low), min_date_overall))
  window_end <- as.IDate(min(actual_low_date_in_data + years(years_after_low), max_date_overall))

  # Filter merged_data for the current window
  current_cycle_data <- merged_data[date >= window_start & date <= window_end]

  # Ensure there's enough data in the window for a meaningful plot (e.g., at least 6 months)
  if (nrow(current_cycle_data) < 6) {
    message(paste("Skipping cycle '", cycle_label, "' due to insufficient data in window or before/after dates."))
    next
  }

  # --- IMPORTANT: Calculate HPI Percentage Change from the *actual low rate date in this panel* ---
  # Find the actual low rate date and its corresponding HPI value within the *current window*
  # This makes the HPI normalization precise for each panel.
  # Use which.min to find the row with the minimum mortgage rate within the current panel data
  low_date_in_panel_row <- current_cycle_data[which.min(avg_mortgage_rate)]
  low_date_in_panel <- low_date_in_panel_row$date
  hpi_at_low_date_for_panel <- low_date_in_panel_row$national_avg_index_nsa

  # Ensure hpi_at_low_date_for_panel is not NA or zero to avoid division issues
  if (is.na(hpi_at_low_date_for_panel) || hpi_at_low_date_for_panel == 0) {
      message(paste("Skipping cycle '", cycle_label, "' due to missing or zero HPI at low rate date in panel."))
      next
  }

  current_cycle_data[, hpi_pct_change_from_low := ((national_avg_index_nsa / hpi_at_low_date_for_panel) - 1) * 100]

  # Add low date specific info to the data.table for annotation later
  current_cycle_data[, panel_low_date := low_date_in_panel]
  current_cycle_data[, panel_low_rate_value := low_date_in_panel_row$avg_mortgage_rate]


  current_cycle_data[, cycle_label := factor(cycle_label, levels = sapply(low_dates_info, `[[`, "label"))]

  cycle_data_list_low[[i]] <- current_cycle_data
}

# Combine all cycle data into a single data.table
# Only try to rbindlist if the list is not empty
if (length(cycle_data_list_low) > 0) {
  all_cycle_data_low <- rbindlist(cycle_data_list_low, fill = TRUE)
} else {
  message("No data available for low rate cycles after filtering. Cannot generate plot.")
  # Create an empty data.table if no data to avoid errors downstream
  all_cycle_data_low <- data.table(date = as.IDate(character()),
                                   national_avg_index_nsa = numeric(),
                                   avg_mortgage_rate = numeric(),
                                   hpi_pct_change_from_low = numeric(),
                                   panel_low_date = as.IDate(character()),
                                   panel_low_rate_value = numeric(),
                                   cycle_label = factor())
}


# --- GLOBAL SCALING FACTOR FOR MORTGAGE RATES ---
# This ensures consistency of the secondary axis across all panels.
# Only calculate if all_cycle_data_low has rows, otherwise use defaults
if (nrow(all_cycle_data_low) > 0) {
  min_hpi_pct_change_global_low <- min(all_cycle_data_low$hpi_pct_change_from_low, na.rm = TRUE)
  max_hpi_pct_change_global_low <- max(all_cycle_data_low$hpi_pct_change_from_low, na.rm = TRUE)
  range_hpi_pct_global_low <- max_hpi_pct_change_global_low - min_hpi_pct_change_global_low

  min_rate_global <- min(merged_data$avg_mortgage_rate, na.rm = TRUE)
  max_rate_global <- max(merged_data$avg_mortgage_rate, na.rm = TRUE)
  range_rate_global <- max_rate_global - min_rate_global

  # Avoid division by zero if range_rate_global is 0 (unlikely with real data)
  if (range_rate_global == 0) {
    rate_conversion_factor_low <- 1
    rate_offset_low <- 0
  } else {
    rate_conversion_factor_low <- range_hpi_pct_global_low / range_rate_global
    rate_offset_low <- min_hpi_pct_change_global_low - (min_rate_global * rate_conversion_factor_low)
  }
} else {
  # Set default values if no data to avoid errors
  rate_conversion_factor_low <- 1
  rate_offset_low <- 0
  min_hpi_pct_change_global_low <- -10 # Arbitrary defaults for empty plot
  max_hpi_pct_change_global_low <- 50
}


# 2. Create the list of individual ggplot panels
plot_list_low <- list()

# Get the levels of cycle_label to ensure plots are generated and ordered correctly
cycle_labels_ordered_low <- levels(all_cycle_data_low$cycle_label)

# Only proceed if there are actual labels to plot
if (length(cycle_labels_ordered_low) > 0) {
  for (label in cycle_labels_ordered_low) {
    # Filter data for the current panel
    panel_data <- all_cycle_data_low[cycle_label == label]

    if (nrow(panel_data) == 0) {
      next # Skip if no data for this panel
    }

    low_date_in_panel <- panel_data$panel_low_date[1] # Take first value, as it's constant for the panel
    low_rate_value_in_panel <- panel_data$panel_low_rate_value[1]

    p <- ggplot(panel_data, aes(x = date)) +
      # Line for Home Price Index (% Change from Low Rate Date)
      geom_line(aes(y = hpi_pct_change_from_low, color = "Home Price Index (% Change)"), size = 1) +
      # Line for Average Mortgage Rate (scaled for visual comparison)
      geom_line(aes(y = avg_mortgage_rate * rate_conversion_factor_low + rate_offset_low, color = "Mortgage Rate"), size = 1) +
      # Mark the low rate date
      geom_vline(xintercept = as.numeric(low_date_in_panel), linetype = "solid", color = "darkgreen", size = 0.8) +
      # Annotate the low rate date and value
      geom_text(aes(x = low_date_in_panel,
                    y = min(panel_data$hpi_pct_change_from_low, na.rm = TRUE), # Position at bottom of HPI % range
                    label = paste0("Low Rate:\n", round(low_rate_value_in_panel, 2), "%\n(", format(low_date_in_panel, "%b %Y"), ")")),
                color = "darkgreen", hjust = -0.1, vjust = 1.2, size = 3, angle = 90) +
      # Set colors for the lines
      scale_color_manual(name = "Data Series",
                         values = c("Home Price Index (% Change)" = "#0072B2", "Mortgage Rate" = "#D55E00")) +
      # Set titles and labels
      labs(
        title = label, # Use the predefined label as the panel title
        x = NULL, # Remove x-axis label for individual plots for cleaner layout
        y = "Home Price Index (% Change from Low Rate Date)"
      ) +
      # Customize Y-axis: primary for HPI % Change, secondary for Mortgage Rate
      scale_y_continuous(
        name = "Home Price Index (% Change from Low Rate Date)",
        sec.axis = sec_axis(~ (. - rate_offset_low) / rate_conversion_factor_low, name = "Average 30-Yr Fixed Mortgage Rate (%)"),
        # Adjust limits to ensure labels don't get cut off, especially for lower values
        limits = c(min_hpi_pct_change_global_low - abs(min_hpi_pct_change_global_low) * 0.1, # Extend lower limit
                  max_hpi_pct_change_global_low + abs(max_hpi_pct_change_global_low) * 0.1) # Extend upper limit
      ) +
      # Ensure x-axis shows appropriate breaks for the wider window
      scale_x_date(limits = c(min(panel_data$date), max(panel_data$date)),
                   date_breaks = "2 years", date_labels = "%Y") +
      theme_minimal() +
      theme(
        plot.title = element_text(hjust = 0.5, face = "bold", size = 12),
        axis.title.x = element_blank(),
        axis.title.y = element_text(size = 10),
        axis.text = element_text(size = 8),
        legend.position = "none", # Hide individual legends
        panel.grid.minor = element_blank(),
        panel.grid.major.x = element_line(linetype = "dotted", color = "gray80"),
        panel.grid.major.y = element_line(linetype = "dotted", color = "gray80"),
        plot.margin = unit(c(0.2, 0.5, 0.2, 0.2), "cm") # Adjust plot margins for better spacing
      )

    plot_list_low[[label]] <- p
  }
} else {
  message("No plots generated for low rate cycles.")
}

# Order the plot_list by the original sequence of low_dates_info
ordered_plot_list_low <- plot_list_low[sapply(low_dates_info, `[[`, "label")]

# 3. Create a common legend for the entire arrangement
# Use all_cycle_data_low for dummy plot for correct legend names, ensure it's not empty
if (nrow(all_cycle_data_low) > 0) {
  dummy_plot_low <- ggplot(all_cycle_data_low[1,], aes(x = date)) +
    geom_line(aes(y = hpi_pct_change_from_low, color = "Home Price Index (% Change)")) +
    geom_line(aes(y = avg_mortgage_rate * rate_conversion_factor_low + rate_offset_low, color = "Mortgage Rate")) +
    scale_color_manual(name = "Data Series",
                       values = c("Home Price Index (% Change)" = "#0072B2", "Mortgage Rate" = "#D55E00")) +
    theme(legend.position = "bottom")

  common_legend_low <- get_legend(dummy_plot_low)
} else {
  # Create a dummy legend if no data was available
  common_legend_low <- cowplot::get_plot_component(
    ggplot() + geom_line(aes(x=1, y=1, color="Home Price Index (% Change)")) +
               geom_line(aes(x=1, y=1, color="Mortgage Rate")) +
               scale_color_manual(name = "Data Series",
                                  values = c("Home Price Index (% Change)" = "#0072B2", "Mortgage Rate" = "#D55E00")) +
               theme(legend.position = "bottom"),
    "guide-box"
  )
}


# 4. Combine all plots using patchwork
# Only wrap plots if ordered_plot_list_low is not empty
if (length(ordered_plot_list_low) > 0) {
  final_plot_arrangement_low <- wrap_plots(ordered_plot_list_low, ncol = 2)

  final_plot_low <- final_plot_arrangement_low +
    plot_annotation(
      title = "Home Price Index (% Change) vs. Mortgage Rates Around Historical Lows",
      subtitle = paste0("Home Price Index normalized to 0% at the month of the Mortgage Rate Low.\n",
                       "Each panel shows data approximately ", years_before_low, " years before to ", years_after_low, " years after the indicated low rate."),
      caption = paste0("Data Sources: Freddie Mac (Mortgage Rates), FHFA (Home Price Index)\n",
                       "Green vertical line indicates the month of the lowest mortgage rate in each panel's window."),
      theme = theme(plot.title = element_text(hjust = 0.5, face = "bold", size = 20),
                    plot.subtitle = element_text(hjust = 0.5, size = 14, color = "gray30"),
                    plot.caption = element_text(size = 10, color = "gray50", hjust = 0))
    ) +
    common_legend_low

  # Print the final combined plot
  print(final_plot_low)
} else {
  message("No combined plot generated for low rate cycles due to lack of individual plots.")
}

The Impact of Interest Rates on Home Buying: A Detailed Financial Analysis

Understanding the interplay between mortgage rates and home prices is crucial for any potential buyer. While our previous plots illustrate the historical behavior of the Home Price Index (HPI) around both peak and low interest rate periods, they don’t directly quantify the financial implications of buying at those times. This section will introduce a model to calculate the time to reach profit and the actual profit after specific holding periods (3, 5, and 10 years) for homes purchased during these historical scenarios.

Modeling Assumptions for Profitability

To assess profitability, we need a consistent set of assumptions for purchasing and selling a home. For this analysis, we’ll use the following:

Initial Home Value: $400,000 (a base value for consistent comparison across different time periods)
Buying Costs: 3% of the initial home value (e.g., closing costs, appraisal fees)
Selling Costs: 6% of the eventual selling price (e.g., real estate commissions, closing costs)

Profit is defined as the point where the home’s value (minus selling costs) exceeds the initial total investment (purchase price plus buying costs).

R Code for Financial Scenarios Calculations

First, let’s set up the financial assumptions and a helper function to calculate time to profit and profit at specific timeframes.

# Define financial assumptions for the model
initial_home_value_model <- 400000
buying_cost_percentage_model <- 0.03
selling_cost_percentage_model <- 0.06

# Function to calculate time to profit and profit at given durations
calculate_home_profit_metrics <- function(start_date, purchase_hpi_index, merged_data_df,
                                          initial_home_value, buying_costs_pct, selling_costs_pct) {

  total_initial_investment <- initial_home_value * (1 + buying_costs_pct)

  # Target selling value to break even (i.e., reach profit)
  target_selling_value_for_profit <- total_initial_investment / (1 - selling_costs_pct)

  # Filter data from the purchase date onwards
  # Ensure future_hpi_data starts exactly at start_date, and ordered by date
  future_hpi_data <- merged_data_df[date >= start_date][order(date)]

  if (nrow(future_hpi_data) == 0) {
    return(list(
      time_to_profit_months = NA,
      profit_3_years = NA,
      profit_5_years = NA,
      profit_10_years = NA
    ))
  }

  # --- Calculate Time to Profit ---
  time_to_profit_months <- NA
  # Loop through rows, `j` will be 1-indexed for the rows in future_hpi_data
  for (j in 1:nrow(future_hpi_data)) {
    current_date <- future_hpi_data$date[j]
    current_hpi_index <- future_hpi_data$national_avg_index_nsa[j]

    current_home_value <- initial_home_value * (current_hpi_index / purchase_hpi_index)

    if (current_home_value * (1 - selling_costs_pct) >= total_initial_investment) {
      # FIX: Use j-1 directly for months elapsed from start month (month 0)
      time_to_profit_months <- j - 1
      break # Found the first month where profit is reached, exit loop
    }
  }

  # --- Calculate Profit at Specific Durations ---
  calculate_profit_at_duration <- function(duration_years) {
    target_date <- start_date + years(duration_years)
    # Find the closest data point within merged_data
    # FIX: Convert difference to numeric before abs()
    closest_date_idx <- which.min(abs(as.numeric(merged_data_df$date - target_date)))

    # Ensure the closest date found is not before the purchase date for meaningful calculation
    # And ensure it's within the overall data range
    if (length(closest_date_idx) == 0 || merged_data_df$date[closest_date_idx] < start_date) {
      return(NA) # No valid data point for this duration
    }

    hpi_at_target_date <- merged_data_df$national_avg_index_nsa[closest_date_idx]

    # If the HPI at target date is NA, return NA for profit
    if (is.na(hpi_at_target_date)) {
        return(NA)
    }

    current_home_value <- initial_home_value * (hpi_at_target_date / purchase_hpi_index)
    net_profit <- (current_home_value * (1 - selling_costs_pct)) - total_initial_investment
    return(net_profit)
  }

  profit_3_years <- calculate_profit_at_duration(3)
  profit_5_years <- calculate_profit_at_duration(5)
  profit_10_years <- calculate_profit_at_duration(10)

  return(list(
    time_to_profit_months = time_to_profit_months,
    profit_3_years = profit_3_years,
    profit_5_years = profit_5_years,
    profit_10_years = profit_10_years
  ))
}

Scenario 1: Buying at Peak Interest Rates

Let’s evaluate the financial outcomes if you purchased a home during the historical periods when mortgage rates were at their highest. This analysis quantifies how long it took to break even and the profit generated after 3, 5, and 10 years.

# The 'peak_dates_info' list is defined in a previous chunk.

peak_profit_scenarios <- data.table(
  Scenario = character(),
  Buy_Date = as.IDate(character()),
  Mortgage_Rate_at_Buy = numeric(),
  Time_to_Profit_Months = numeric(), # Keep as numeric
  Profit_3_Years = numeric(),   # Keep as numeric
  Profit_5_Years = numeric(),   # Keep as numeric
  Profit_10_Years = numeric()  # Keep as numeric
)

for (p_info in peak_dates_info) {
  buy_date_target <- p_info$date

  # Find the actual date in merged_data closest to the target buy date
  closest_date_idx <- which.min(abs(as.numeric(merged_data$date - buy_date_target)))
  actual_buy_date_in_data <- merged_data$date[closest_date_idx]

  buy_data_row <- merged_data[date == actual_buy_date_in_data]

  if (nrow(buy_data_row) > 0) {
    purchase_hpi_index <- buy_data_row$national_avg_index_nsa[1]
    mortgage_rate_at_buy <- buy_data_row$avg_mortgage_rate[1]

    metrics <- calculate_home_profit_metrics(
      start_date = actual_buy_date_in_data,
      purchase_hpi_index = purchase_hpi_index,
      merged_data_df = merged_data,
      initial_home_value = initial_home_value_model,
      buying_costs_pct = buying_cost_percentage_model,
      selling_costs_pct = selling_cost_percentage_model
    )

    peak_profit_scenarios <- rbindlist(list(peak_profit_scenarios, data.table(
      Scenario = paste0("Buy at Peak Rate (", p_info$label, ")"),
      Buy_Date = actual_buy_date_in_data,
      Mortgage_Rate_at_Buy = mortgage_rate_at_buy,
      Time_to_Profit_Months = metrics$time_to_profit_months,
      Profit_3_Years = metrics$profit_3_years,
      Profit_5_Years = metrics$profit_5_years,
      Profit_10_Years = metrics$profit_10_years
    )))
  } else {
    message(paste("Warning: Buy date", format(actual_buy_date_in_data, "%Y-%m-%d"), "not found in merged data. Skipping."))
  }
}

# Convert months to years for readability
peak_profit_scenarios[, Time_to_Profit_Years := round(Time_to_Profit_Months / 12, 1)]

# Create a display table for kable with formatted strings
display_peak_profit_scenarios <- copy(peak_profit_scenarios)
display_peak_profit_scenarios[, `:=`(
  Profit_3_Years = paste0("$", formatC(Profit_3_Years, format = "f", digits = 0, big.mark = ",")),
  Profit_5_Years = paste0("$", formatC(Profit_5_Years, format = "f", digits = 0, big.mark = ",")),
  Profit_10_Years = paste0("$", formatC(Profit_10_Years, format = "f", digits = 0, big.mark = ","))
)]
# Handle NA for Time_to_Profit_Months and Time_to_Profit_Years
display_peak_profit_scenarios[is.na(Time_to_Profit_Months), Time_to_Profit_Months := "Not Reached"]
display_peak_profit_scenarios[is.na(Time_to_Profit_Years), Time_to_Profit_Years := "Not Reached"]

Scenario 2: Buying at Lowest Interest Rates

Now, let’s contrast those findings by looking at the financial outcomes when purchasing a home during periods of historically low mortgage rates. This section will highlight the potential benefits of timing your purchase with favorable borrowing costs.

# The 'low_dates_info' list is defined in a previous chunk.

low_profit_scenarios <- data.table(
  Scenario = character(),
  Buy_Date = as.IDate(character()),
  Mortgage_Rate_at_Buy = numeric(),
  Time_to_Profit_Months = numeric(), # Keep as numeric
  Profit_3_Years = numeric(),   # Keep as numeric
  Profit_5_Years = numeric(),   # Keep as numeric
  Profit_10_Years = numeric()  # Keep as numeric
)

for (p_info in low_dates_info) {
  buy_date_target <- p_info$date

  # Find the actual date in merged_data closest to the target buy date
  # This part relies on `merged_data` containing the dates exactly as expected.
  closest_date_idx <- which.min(abs(as.numeric(merged_data$date - buy_date_target)))
  actual_buy_date_in_data <- merged_data$date[closest_date_idx]

  buy_data_row <- merged_data[date == actual_buy_date_in_data]

  if (nrow(buy_data_row) > 0) {
    purchase_hpi_index <- buy_data_row$national_avg_index_nsa[1]
    mortgage_rate_at_buy <- buy_data_row$avg_mortgage_rate[1]

    metrics <- calculate_home_profit_metrics(
      start_date = actual_buy_date_in_data,
      purchase_hpi_index = purchase_hpi_index,
      merged_data_df = merged_data,
      initial_home_value = initial_home_value_model,
      buying_costs_pct = buying_cost_percentage_model,
      selling_costs_pct = selling_cost_percentage_model
    )

    low_profit_scenarios <- rbindlist(list(low_profit_scenarios, data.table(
      Scenario = paste0("Buy at Low Rate (", p_info$label, ")"),
      Buy_Date = actual_buy_date_in_data,
      Mortgage_Rate_at_Buy = mortgage_rate_at_buy,
      Time_to_Profit_Months = metrics$time_to_profit_months,
      Profit_3_Years = metrics$profit_3_years,
      Profit_5_Years = metrics$profit_5_years,
      Profit_10_Years = metrics$profit_10_years
    )))
  } else {
    message(paste("Warning: Buy date", format(actual_buy_date_in_data, "%Y-%m-%d"), "not found in merged data. Skipping."))
  }
}

# Convert months to years for readability
low_profit_scenarios[, Time_to_Profit_Years := round(Time_to_Profit_Months / 12, 1)]

# Create a display table for kable with formatted strings
display_low_profit_scenarios <- copy(low_profit_scenarios)
display_low_profit_scenarios[, `:=`(
  Profit_3_Years = paste0("$", formatC(Profit_3_Years, format = "f", digits = 0, big.mark = ",")),
  Profit_5_Years = paste0("$", formatC(Profit_5_Years, format = "f", digits = 0, big.mark = ",")),
  Profit_10_Years = paste0("$", formatC(Profit_10_Years, format = "f", digits = 0, big.mark = ","))
)]
# Handle NA for Time_to_Profit_Months and Time_to_Profit_Years by converting to "Not Reached" string
display_low_profit_scenarios[is.na(Time_to_Profit_Months), Time_to_Profit_Months := "Not Reached"]
display_low_profit_scenarios[is.na(Time_to_Profit_Years), Time_to_Profit_Years := "Not Reached"]

cat("### Financial Outcomes from Buying at Historical Mortgage Rate Peaks\n\n")

Financial Outcomes from Buying at Historical Mortgage Rate Peaks

print(knitr::kable(display_peak_profit_scenarios,
                   caption = "Comparison of Financial Outcomes When Purchasing at Historical Mortgage Rate Peaks",
                   col.names = c("Scenario", "Purchase Date", "Mortgage Rate (%)", "Time to Profit (Months)", "Profit at 3 Yrs", "Profit at 5 Yrs", "Profit at 10 Yrs", "Time to Profit (Years)"),
                   format = "html") # Ensure HTML format for clean rendering
)

Comparison of Financial Outcomes When Purchasing at Historical Mortgage Rate Peaks
Scenario	Purchase Date	Mortgage Rate (%)	Time to Profit (Months)	Profit at 3 Yrs	Profit at 5 Yrs	Profit at 10 Yrs	Time to Profit (Years)
Buy at Peak Rate (Early 1980s Peak (~18%))	1982-01-01	17.4850	13	$-660	$35,371	$112,648	1.1
Buy at Peak Rate (Late 1980s Peak (~11%))	1989-01-01	10.7300	22	$-1,810	$22,052	$107,249	1.8
Buy at Peak Rate (Mid 1990s Peak (~9%))	1994-12-01	9.1980	28	$435	$51,414	$265,136	2.3
Buy at Peak Rate (Early 2000s Peak (~8.5%))	2000-05-01	8.5150	17	$49,650	$147,063	$102,812	1.4
Buy at Peak Rate (Mid 2000s Peak (~6.5%))	2006-07-01	6.7625	131	$-85,059	$-105,284	$-21,449	10.9
Buy at Peak Rate (Recent Peak (~7.5%))	2023-10-01	7.6200	NA	$-9,693	$-9,693	$-9,693	NA

cat("### Financial Outcomes from Buying at Historical Mortgage Rate Lows\n\n")

Financial Outcomes from Buying at Historical Mortgage Rate Lows

print(knitr::kable(display_low_profit_scenarios,
                   caption = "Comparison of Financial Outcomes When Purchasing at Historical Mortgage Rate Lows",
                   col.names = c("Scenario", "Purchase Date", "Mortgage Rate (%)", "Time to Profit (Months)", "Profit at 3 Yrs", "Profit at 5 Yrs", "Profit at 10 Yrs", "Time to Profit (Years)"),
                   format = "html") # Ensure HTML format for clean rendering
)

Comparison of Financial Outcomes When Purchasing at Historical Mortgage Rate Lows
Scenario	Purchase Date	Mortgage Rate (%)	Time to Profit (Months)	Profit at 3 Yrs	Profit at 5 Yrs	Profit at 10 Yrs	Time to Profit (Years)
Buy at Low Rate (Early 1970s Low (~7.3%))	1975-01-01	9.432	20	$-67,769	$2,289	$79,170	1.7
Buy at Low Rate (Early 1990s Low (~6.7%))	1993-11-01	7.155	38	$-923	$37,549	$218,237	3.2
Buy at Low Rate (Early 2000s Low (~5.2%))	2003-06-01	5.230	10	$73,878	$39,224	$23,244	0.8
Buy at Low Rate (Post-2008 Low (~3.3%))	2012-11-01	3.352	18	$27,073	$82,446	$361,839	1.5
Buy at Low Rate (COVID-Era Low (~2.7%))	2020-12-01	2.684	5	$88,560	$127,199	$127,199	0.4

# Assuming `peak_profit_scenarios` and `low_profit_scenarios` are data.tables
# and their profit columns (`Profit_3_Years`, `Profit_5_Years`, `Profit_10_Years`)
# and time to profit columns (`Time_to_Profit_Months`, `Time_to_Profit_Years`) are numeric.
# These data tables should be the *original numeric* ones, not the `display_*` versions.

# Combine peak and low rate scenarios into a single data.table
all_scenarios <- rbindlist(list(
  # Add a 'Rate_Type' column to differentiate scenarios
  peak_profit_scenarios[, Rate_Type := "Peak Rate"],
  low_profit_scenarios[, Rate_Type := "Low Rate"]
), fill = TRUE)

# Create a more descriptive scenario label for plotting, including the actual mortgage rate
all_scenarios[, Scenario_Label := paste0(gsub("Buy at ", "", Scenario), " (", round(Mortgage_Rate_at_Buy, 1), "%)")]

# --- Plot 1: Profitability Comparison (Bar Chart) ---

# Melt the profit columns into a long format for easier ggplot plotting
profit_long <- melt(all_scenarios,
                    id.vars = c("Scenario", "Scenario_Label", "Buy_Date", "Mortgage_Rate_at_Buy", "Rate_Type"),
                    measure.vars = c("Profit_3_Years", "Profit_5_Years", "Profit_10_Years"),
                    variable.name = "Duration",
                    value.name = "Profit")

# Clean up Duration names for legend
profit_long[, Duration := fcase(
  Duration == "Profit_3_Years", "3 Years",
  Duration == "Profit_5_Years", "5 Years",
  Duration == "Profit_10_Years", "10 Years"
)]
# Ensure Duration is a factor with a specific order for consistent legend
profit_long[, Duration := factor(Duration, levels = c("3 Years", "5 Years", "10 Years"))]

# Order scenarios for plotting: first by Rate_Type, then by Purchase Date within each type
# This creates a consistent order across both plots.
scenario_order_levels <- all_scenarios[order(Rate_Type, Buy_Date)]$Scenario_Label
profit_long[, Scenario_Label := factor(Scenario_Label, levels = scenario_order_levels)]


p_profit_combined <- ggplot(profit_long, aes(x = Scenario_Label, y = Profit, fill = Duration)) +
  geom_bar(stat = "identity", position = "dodge") +
  # Format Y-axis as currency
  scale_y_continuous(labels = scales::dollar,
                     # Add a slight expansion to the y-axis to ensure annotations fit
                     expand = expansion(mult = c(0.1, 0.1))) +
  labs(
    title = "Profitability Comparison",
    subtitle = "Net Profit (in USD) after 3, 5, and 10 Years",
    x = NULL, # Remove X-axis label, as scenario labels are descriptive
    y = "Net Profit"
  ) +
  theme_minimal() +
  theme(
    axis.text.x = element_text(angle = 45, hjust = 1, size = 9), # Rotate X-axis labels for readability
    legend.position = "bottom",
    plot.title = element_text(hjust = 0.5, face = "bold", size = 14),
    plot.subtitle = element_text(hjust = 0.5, size = 10, color = "gray30")
  ) +
  scale_fill_brewer(palette = "Paired") + # Use a color palette for durations
  # Add a vertical line to visually separate Peak Rate from Low Rate scenarios
  geom_vline(xintercept = length(peak_profit_scenarios$Scenario) + 0.5, linetype = "dashed", color = "grey50", size = 0.8) +
  # Add labels for "Peak Rate Scenarios" and "Low Rate Scenarios"
  annotate("text",
           x = length(peak_profit_scenarios$Scenario) / 2 + 0.5, # Midpoint of peak scenarios
           y = max(profit_long$Profit, na.rm = TRUE, -min(profit_long$Profit, na.rm = TRUE))*0.9, # Position above bars
           label = "Peak Rate Scenarios", size = 4, fontface = "bold", color = "#E69F00") + # Color matching peak rate
  annotate("text",
           x = length(peak_profit_scenarios$Scenario) + 0.5 + length(low_profit_scenarios$Scenario) / 2, # Midpoint of low scenarios
           y = max(profit_long$Profit, na.rm = TRUE, -min(profit_long$Profit, na.rm = TRUE))*0.9, # Position above bars
           label = "Low Rate Scenarios", size = 4, fontface = "bold", color = "#56B4E9") # Color matching low rate

# --- Plot 2: Time to Profit Comparison (Bar Chart) ---

# Filter out scenarios where 'Time_to_Profit_Months' is NA (meaning "Not Reached") for cleaner bars
time_to_profit_plot_data <- all_scenarios[!is.na(Time_to_Profit_Months)]
# Ensure the same scenario order as the profit plot
time_to_profit_plot_data[, Scenario_Label := factor(Scenario_Label, levels = scenario_order_levels)]


p_time_to_profit_combined <- ggplot(time_to_profit_plot_data, aes(x = Scenario_Label, y = Time_to_Profit_Months, fill = Rate_Type)) +
  geom_bar(stat = "identity", position = "dodge") +
  scale_y_continuous(name = "Time to Profit (Months)",
                     breaks = scales::pretty_breaks(n = 10)) + # More breaks for months
  labs(
    title = "Time to Profit Comparison",
    subtitle = "Number of Months to Reach Net Profit",
    x = NULL, # Remove X-axis label
    y = "Time to Profit (Months)"
  ) +
  theme_minimal() +
  theme(
    axis.text.x = element_text(angle = 45, hjust = 1, size = 9),
    legend.position = "bottom",
    plot.title = element_text(hjust = 0.5, face = "bold", size = 14),
    plot.subtitle = element_text(hjust = 0.5, size = 10, color = "gray30")
  ) +
  scale_fill_manual(values = c("Peak Rate" = "#E69F00", "Low Rate" = "#56B4E9")) + # Custom colors for rate types
  # Add a vertical line to visually separate Peak Rate from Low Rate scenarios
  geom_vline(xintercept = length(peak_profit_scenarios$Scenario) + 0.5, linetype = "dashed", color = "grey50", size = 0.8)


# --- Combine Plots using patchwork ---
# The '& theme(...)' applies themes to all sub-plots within the combined layout.
final_comparison_plot <- (p_profit_combined / p_time_to_profit_combined) +
  plot_annotation(
    title = "Comparative Financial Outcomes of Home Purchases at Different Mortgage Rate Cycles",
    caption = "Data based on $400,000 initial home value, 3% buying costs, 6% selling costs.\n'Not Reached' scenarios are excluded from 'Time to Profit' plot."
  ) & theme(plot.title = element_text(hjust = 0.5, face = "bold", size = 16),
            plot.caption = element_text(size = 9, color = "gray50", hjust = 0))

# Print the final combined plot
print(final_comparison_plot)

The table above demonstrates that buying at historical low interest rates generally leads to a much faster time to profitability and often significantly higher profits over 3, 5, and 10-year periods. This is largely due to the lower cost of borrowing combined with the subsequent appreciation of home values that often follows periods of low rates.

Here’s the explanation and R code to analyze the average years to profit and comparative percentage profits for peak and low mortgage rate scenarios, excluding the 2008 peak outlier.

Comparative Analysis of Peak vs. Low Mortgage Rate Scenarios

To better understand the long-term financial implications, we compare the average time to reach profitability and the average percentage profit after specific durations for both peak and low mortgage rate purchasing scenarios. The “Post-2008 Bubble Peak” scenario has been excluded from this analysis due to its unique and exceptionally challenging market conditions, which act as a significant outlier.

The percentage profit is calculated based on the initial total investment, which includes the home purchase price and estimated buying costs. For an initial home value of $400,000 with 3% buying costs, the total initial investment is $412,000.

# Ensure 'all_scenarios' data.table is available from previous chunks,
# and its profit and time columns are numeric.
# If 'all_scenarios' was not yet created as numeric, ensure it is.
if (!exists("all_scenarios")) {
  # This block should ideally not be needed if previous chunks are run in order.
  # Re-create all_scenarios if necessary for independent execution.
  # Assumes peak_profit_scenarios and low_profit_scenarios are already defined and numeric.
  all_scenarios <- rbindlist(list(
    peak_profit_scenarios[, Rate_Type := "Peak Rate"],
    low_profit_scenarios[, Rate_Type := "Low Rate"]
  ), fill = TRUE)

  # Ensure numeric types for calculations
  numeric_cols <- c("Mortgage_Rate_at_Buy", "Time_to_Profit_Months", "Profit_3_Years", "Profit_5_Years", "Profit_10_Years", "Time_to_Profit_Years")
  for (col in numeric_cols) {
    all_scenarios[[col]] <- as.numeric(all_scenarios[[col]])
  }
}

# Define the outlier scenario name
outlier_scenario_name <- "Buy at Peak Rate (Post-2008 Bubble Peak (~5.1%))"

# Filter out the outlier scenario from the combined dataset
filtered_scenarios <- all_scenarios[Scenario != outlier_scenario_name]

# --- 1. Calculate Average Years to Profit ---
avg_years_to_profit <- filtered_scenarios[, .(
  Avg_Time_to_Profit_Years = mean(Time_to_Profit_Years, na.rm = TRUE)
), by = Rate_Type]

# --- 2. Calculate Average Percentage Profit ---

# Total initial investment (e.g., $400,000 home + 3% buying costs = $412,000)
total_initial_investment_calc <- initial_home_value_model * (1 + buying_cost_percentage_model)

# Calculate percentage profit for each scenario
filtered_scenarios[, `:=`(
  Pct_Profit_3_Years = (Profit_3_Years / total_initial_investment_calc) * 100,
  Pct_Profit_5_Years = (Profit_5_Years / total_initial_investment_calc) * 100,
  Pct_Profit_10_Years = (Profit_10_Years / total_initial_investment_calc) * 100
)]

# Calculate average percentage profit for each rate type
avg_pct_profit <- filtered_scenarios[, .(
  Avg_Pct_Profit_3_Years = mean(Pct_Profit_3_Years, na.rm = TRUE),
  Avg_Pct_Profit_5_Years = mean(Pct_Profit_5_Years, na.rm = TRUE), # CORRECTED TYPO HERE
  Avg_Pct_Profit_10_Years = mean(Pct_Profit_10_Years, na.rm = TRUE)
), by = Rate_Type]

# --- Display Results ---
cat("#### Average Time to Profit\n\n")

Average Time to Profit

cat(paste0("On average, after removing the 2008 outlier:\n"))

On average, after removing the 2008 outlier:

for (i in 1:nrow(avg_years_to_profit)) {
  cat(paste0("- **", avg_years_to_profit$Rate_Type[i], "** scenarios reach profitability in approximately ",
             round(avg_years_to_profit$Avg_Time_to_Profit_Years[i], 1), " years.\n"))
}

Peak Rate scenarios reach profitability in approximately 3.5 years.
Low Rate scenarios reach profitability in approximately 1.5 years.

cat("\n")

cat("#### Average Percentage Profit\n\n")

Average Percentage Profit

cat(paste0("Comparing the average percentage profit relative to the initial investment ($",
           formatC(total_initial_investment_calc, format = "f", digits = 0, big.mark = ","), "):\n"))

Comparing the average percentage profit relative to the initial investment ($412,000):

for (duration in c("3_Years", "5_Years", "10_Years")) {
  col_name <- paste0("Avg_Pct_Profit_", duration)
  peak_val <- avg_pct_profit[Rate_Type == "Peak Rate", get(col_name)]
  low_val <- avg_pct_profit[Rate_Type == "Low Rate", get(col_name)]

  cat(paste0("- After **", gsub("_", " ", duration), "**: "))
  if (peak_val > low_val) {
    cat(paste0("The **Peak Rate** scenarios show a higher average profit of **", round(peak_val, 2), "%** compared to Low Rate scenarios (", round(low_val, 2), "%).\n"))
  } else if (low_val > peak_val) {
    cat(paste0("The **Low Rate** scenarios show a higher average profit of **", round(low_val, 2), "%** compared to Peak Rate scenarios (", round(peak_val, 2), "%).\n"))
  } else {
    cat(paste0("Both scenarios show approximately equal average profit of **", round(peak_val, 2), "%**.\n"))
  }
}

After 3 Years: The Low Rate scenarios show a higher average profit of 5.86% compared to Peak Rate scenarios (-1.91%).
After 5 Years: The Low Rate scenarios show a higher average profit of 14.01% compared to Peak Rate scenarios (5.7%).
After 10 Years: The Low Rate scenarios show a higher average profit of 39.31% compared to Peak Rate scenarios (22.52%).

cat("\n")

# Display the calculated average percentage profits in a table for clarity
cat("#### Detailed Average Percentage Profits by Rate Type\n\n")

Detailed Average Percentage Profits by Rate Type

display_avg_pct_profit <- copy(avg_pct_profit)
display_avg_pct_profit[, `:=`(
  Avg_Pct_Profit_3_Years = paste0(round(Avg_Pct_Profit_3_Years, 2), "%"),
  Avg_Pct_Profit_5_Years = paste0(round(Avg_Pct_Profit_5_Years, 2), "%"),
  Avg_Pct_Profit_10_Years = paste0(round(Avg_Pct_Profit_10_Years, 2), "%")
)]

print(knitr::kable(display_avg_pct_profit,
                   caption = "Average Percentage Profit by Scenario Type (Excluding 2008 Peak)",
                   col.names = c("Scenario Type", "Avg % Profit (3 Yrs)", "Avg % Profit (5 Yrs)", "Avg % Profit (10 Yrs)"),
                   format = "html"))

Average Percentage Profit by Scenario Type (Excluding 2008 Peak)
Scenario Type	Avg % Profit (3 Yrs)	Avg % Profit (5 Yrs)	Avg % Profit (10 Yrs)
Peak Rate	-1.91%	5.7%	22.52%
Low Rate	5.86%	14.01%	39.31%

Key Takeaways for Home Buying Decisions

Based on these analyses, here are some crucial considerations to inform your home buying decisions:

Mortgage Rate Environment Matters: While interest rates are not the sole factor, historical data clearly shows that the prevailing mortgage rate environment at the time of purchase has a substantial impact on both the time it takes to recoup your initial investment and the overall profitability of your home.
Time Horizon is Critical: The longer you plan to hold a property, the more likely you are to see positive returns, even if you buy at less-than-ideal interest rates. Housing markets tend to appreciate over longer periods, smoothing out short-term fluctuations.
Don’t Overlook Local Market Dynamics: This analysis uses a national Home Price Index. Local market conditions (supply and demand, job growth, etc.) can vary significantly and will ultimately determine your specific home’s appreciation. Always research local trends in addition to national data.
Costs Add Up: Buying and selling costs are significant. They directly impact your break-even point and total profit. Understanding these costs upfront is vital.
Consider Opportunity Cost: While this analysis focuses on home equity, remember that your down payment and accumulated equity could otherwise be invested. A deeper dive might compare housing returns against other investment opportunities over the same timeframes.
Affordability vs. Profitability: Low rates make homes more affordable (lower monthly payments for the same loan amount), but high rates don’t necessarily preclude profitability if home values appreciate significantly. Conversely, low rates don’t guarantee profit if home values stagnate or fall. It’s the combination of rate and HPI movement that dictates the outcome.

In conclusion, while timing the market perfectly is challenging, understanding these historical patterns can provide valuable context. Buying when interest rates are low has historically presented a stronger financial advantage in terms of faster profitability and higher returns. However, robust home price appreciation has also offset high interest rates in some historical periods over the long run. Always consider your personal financial situation, time horizon, and local market conditions when making such a significant investment.

Historical Analysis of U.S. Housing Market Dynamics: Navigating Mortgage Rates and Home Price Trends

An In-Depth Look at Peak vs. Low Rate Buying Scenarios and Their Financial Outcomes

Derek Q. Jackson

July 09, 2025