library(tidyverse)
## Warning: package 'tidyverse' was built under R version 4.2.3
## Warning: package 'lubridate' was built under R version 4.2.3
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.0 ✔ readr 2.1.4
## ✔ forcats 1.0.0 ✔ stringr 1.5.0
## ✔ ggplot2 3.4.1 ✔ tibble 3.1.8
## ✔ lubridate 1.9.3 ✔ tidyr 1.3.0
## ✔ purrr 1.0.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(lubridate)
library(zoo) # for rolling averages
## Warning: package 'zoo' was built under R version 4.2.3
##
## Attaching package: 'zoo'
##
## The following objects are masked from 'package:base':
##
## as.Date, as.Date.numeric
In this project, I explore how changes in U.S. mortgage rates are related to the growth rate of national housing prices. The main question is whether increases in mortgage rates are associated with slower housing price growth, and whether those effects show up immediately or with a delay over time.
The analysis uses mortgage rate data from Freddie Mac’s Primary Mortgage Market Survey and national housing price data from the FHFA House Price Index (HPI). The steps include importing and cleaning the raw data, transforming variables into growth rates, creating visualizations, computing correlations and lagged relationships, estimating regression models, and checking time-series properties using autocorrelation functions.
mort_file <- "data/mortgage_rates.csv"
hpi_file <- "data/house_price_index.csv"
mort_raw <- read_csv(mort_file)
hpi_raw <- read_csv(hpi_file)
glimpse(mort_raw)
## Rows: 2,854
## Columns: 2
## $ observation_date <date> 1971-04-02, 1971-04-09, 1971-04-16, 1971-04-23, 1971…
## $ MORTGAGE30US <dbl> 7.33, 7.31, 7.31, 7.31, 7.29, 7.38, 7.42, 7.44, 7.46,…
head(mort_raw)
glimpse(hpi_raw)
## Rows: 203
## Columns: 2
## $ observation_date <date> 1975-01-01, 1975-04-01, 1975-07-01, 1975-10-01, 1976…
## $ USSTHPI <dbl> 59.99, 60.92, 61.38, 62.24, 62.89, 65.54, 66.58, 67.2…
head(hpi_raw)
At this stage, I imported the two main datasets that I will use
throughout the analysis: the Freddie Mac mortgage rate series and the
FHFA House Price Index (HPI). The glimpse() and
head() functions allowed me to quickly check the structure
of each dataset, verify column names, inspect how dates are formatted,
and confirm the number of observations.
Seeing this information up front is important because it helps identify any issues such as missing values, unusual variable types, or inconsistent date formats. It also confirms that the datasets cover a long enough time period to run meaningful time-series analysis later in the project.
mort <- mort_raw %>%
rename(
DATE = 1,
MORTGAGE_RATE = 2
) %>%
mutate(DATE = as.Date(DATE))
hpi <- hpi_raw %>%
rename(
DATE = 1,
HPI = 2
) %>%
mutate(DATE = as.Date(DATE))
df <- inner_join(mort, hpi, by = "DATE") %>%
arrange(DATE)
glimpse(df)
## Rows: 21
## Columns: 3
## $ DATE <date> 1976-10-01, 1977-04-01, 1977-07-01, 1982-10-01, 1983-04…
## $ MORTGAGE_RATE <dbl> 8.90, 8.70, 8.95, 15.13, 12.82, 13.08, 10.05, 10.39, 6.8…
## $ HPI <dbl> 67.27, 72.66, 74.38, 111.54, 115.79, 116.81, 152.41, 154…
head(df)
tail(df)
summary(df)
## DATE MORTGAGE_RATE HPI
## Min. :1976-10-01 Min. : 2.880 Min. : 67.27
## 1st Qu.:1983-07-01 1st Qu.: 4.940 1st Qu.:116.81
## Median :1994-07-01 Median : 7.700 Median :183.02
## Mean :1998-01-08 Mean : 7.545 Mean :243.89
## 3rd Qu.:2010-04-01 3rd Qu.: 8.950 3rd Qu.:325.93
## Max. :2021-07-01 Max. :15.130 Max. :537.97
In this section, I cleaned and standardized both datasets so they could be merged into a single time series. The mortgage rate data is weekly, while the HPI data is quarterly. After renaming columns and converting dates to a proper date format, I merged the datasets using an inner join on the DATE column. Because an inner join keeps only exact date matches, the merged dataset contains just 21 rows: the dates on which a weekly survey observation happens to fall on the first day of a quarter. These rows are unevenly spaced, with gaps ranging from one quarter to several years.
This merging step is essential because it aligns the two series on
common timestamps so that changes in mortgage rates can be compared
directly to changes in housing prices. The summary(),
head(), and tail() functions helped confirm
that the merged dataset is sorted correctly, and they also show how few
observations survive the join, which limits every result that follows.
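The merged data above has 21 rows even though the HPI file has 203 quarters, because an inner join on DATE keeps only exact date matches. One possible way to keep every quarter is to average the weekly rates within each quarter before joining. The sketch below illustrates the idea in base R on synthetic data; the helper quarter_start() and all values are hypothetical and are not part of the original analysis.

```r
# Synthetic weekly rates (Fridays in 2020) and quarterly index values.
# All numbers are made up for illustration.
weekly <- data.frame(
  DATE = seq(as.Date("2020-01-03"), by = "week", length.out = 52),
  MORTGAGE_RATE = seq(3.7, 2.7, length.out = 52)
)
quarterly <- data.frame(
  DATE = as.Date(c("2020-01-01", "2020-04-01", "2020-07-01", "2020-10-01")),
  HPI = c(280, 284, 291, 299)
)

# Joining on exact dates: no Friday falls on a quarter start in this toy year.
direct <- merge(weekly, quarterly, by = "DATE")

# Hypothetical helper: map each date to the first day of its quarter.
quarter_start <- function(d) {
  m <- as.integer(format(d, "%m"))
  q_month <- as.integer(3 * ((m - 1) %/% 3) + 1)
  as.Date(sprintf("%s-%02d-01", format(d, "%Y"), q_month))
}

# Average the weekly rates within each quarter, then join on the quarter start.
weekly$QDATE <- quarter_start(weekly$DATE)
mort_q <- aggregate(MORTGAGE_RATE ~ QDATE, data = weekly, FUN = mean)
names(mort_q)[1] <- "DATE"
toy_merged <- merge(mort_q, quarterly, by = "DATE")

nrow(direct)      # rows from the exact-date join
nrow(toy_merged)  # rows after aggregating to quarters
```

Averaging is only one choice; taking the last weekly value in each quarter would be another, and the two can give different growth rates.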
df_ts <- df %>%
arrange(DATE) %>%
mutate(
mort_change = MORTGAGE_RATE - lag(MORTGAGE_RATE),
mort_pct_change = (MORTGAGE_RATE / lag(MORTGAGE_RATE)) - 1,
hpi_growth = (HPI / lag(HPI)) - 1,
hpi_log_growth = log(HPI) - log(lag(HPI))
)
glimpse(df_ts)
## Rows: 21
## Columns: 7
## $ DATE <date> 1976-10-01, 1977-04-01, 1977-07-01, 1982-10-01, 1983-…
## $ MORTGAGE_RATE <dbl> 8.90, 8.70, 8.95, 15.13, 12.82, 13.08, 10.05, 10.39, 6…
## $ HPI <dbl> 67.27, 72.66, 74.38, 111.54, 115.79, 116.81, 152.41, 1…
## $ mort_change <dbl> NA, -0.20, 0.25, 6.18, -2.31, 0.26, -3.03, 0.34, -3.50…
## $ mort_pct_change <dbl> NA, -0.02247191, 0.02873563, 0.69050279, -0.15267680, …
## $ hpi_growth <dbl> NA, 0.080124870, 0.023671897, 0.499596666, 0.038102923…
## $ hpi_log_growth <dbl> NA, 0.077076655, 0.023396062, 0.405196182, 0.037394935…
summary(df_ts)
## DATE MORTGAGE_RATE HPI mort_change
## Min. :1976-10-01 Min. : 2.880 Min. : 67.27 Min. :-3.500
## 1st Qu.:1983-07-01 1st Qu.: 4.940 1st Qu.:116.81 1st Qu.:-1.045
## Median :1994-07-01 Median : 7.700 Median :183.02 Median :-0.200
## Mean :1998-01-08 Mean : 7.545 Mean :243.89 Mean :-0.296
## 3rd Qu.:2010-04-01 3rd Qu.: 8.950 3rd Qu.:325.93 3rd Qu.: 0.310
## Max. :2021-07-01 Max. :15.130 Max. :537.97 Max. : 6.180
## NA's :1
## mort_pct_change hpi_growth hpi_log_growth
## Min. :-0.33686 Min. :-0.02090 Min. :-0.02112
## 1st Qu.:-0.17067 1st Qu.: 0.01173 1st Qu.: 0.01166
## Median :-0.04268 Median : 0.04689 Median : 0.04579
## Mean :-0.03209 Mean : 0.11779 Mean : 0.10395
## 3rd Qu.: 0.04185 3rd Qu.: 0.18505 3rd Qu.: 0.16952
## Max. : 0.69050 Max. : 0.49960 Max. : 0.40520
## NA's :1 NA's :1 NA's :1
head(df_ts)
Mortgage rates and housing prices tend to trend over time, which can make it difficult to analyze their short-run relationships in raw levels. To address this, I transformed the series into changes and growth rates.
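mort_change is the period-to-period change in mortgage rates, in percentage points.
mort_pct_change is the period-to-period percent change in mortgage rates.
hpi_growth is the period-to-period percentage growth in home prices.
hpi_log_growth is the log-difference version of HPI growth.
Because the merged rows are unevenly spaced, each of these is a change between consecutive rows of the merged data, not necessarily between consecutive quarters. For example, the hpi_growth value of about 0.50 in the fourth row spans 1977-07-01 to 1982-10-01.
These transformations help stabilize the data and make it more suitable for correlation, autocorrelation, and regression analysis. They also reflect how economists typically work with financial and macroeconomic time series, focusing on returns or changes rather than levels.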
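As a worked example of these transformations, the sketch below recomputes simple and log growth for the first four merged rows shown in the glimpse(df) output, using base R instead of dplyr::lag(). The gap_days column is a hypothetical addition that records how much calendar time each growth figure covers.

```r
# First four merged rows, copied from the glimpse(df) output.
toy <- data.frame(
  DATE = as.Date(c("1976-10-01", "1977-04-01", "1977-07-01", "1982-10-01")),
  HPI  = c(67.27, 72.66, 74.38, 111.54)
)

n <- nrow(toy)
prev_hpi <- c(NA, toy$HPI[-n])  # previous row's HPI (base-R stand-in for lag())

toy$hpi_growth     <- toy$HPI / prev_hpi - 1
toy$hpi_log_growth <- log(toy$HPI) - log(prev_hpi)

# Hypothetical extra column: days elapsed since the previous row.
toy$gap_days <- c(NA, as.numeric(diff(toy$DATE)))

toy
```

The growth values match the hpi_growth and hpi_log_growth columns in the output above, and gap_days makes clear that the fourth value covers more than five years rather than a single quarter.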
ggplot(df_ts, aes(x = DATE, y = MORTGAGE_RATE)) +
geom_line() +
labs(
title = "30-Year Fixed Mortgage Rate Over Time",
x = "Date",
y = "Mortgage Rate (%)"
)
The mortgage rate plot shows the long-term evolution of U.S. 30-year mortgage rates. Rates were extremely high in the early 1980s, gradually declined over the following decades, and then reached historically low levels after 2010. The merged data ends in July 2021, so the sharp rate increases that followed, which accompanied tighter monetary policy and rising inflation, do not appear in this plot.
Understanding this long-run pattern is important because it provides context for how interest rate environments may affect housing affordability and price growth.
ggplot(df_ts, aes(x = DATE, y = HPI)) +
geom_line() +
labs(
title = "FHFA U.S. National House Price Index Over Time",
x = "Date",
y = "House Price Index"
)
The HPI level plot shows a steady upward trend in national home prices over time, with notable slowdowns around the late 2000s housing crash. The overall trend confirms that housing prices tend to rise in the long run, although the pace of growth can vary.
This trend also highlights why raw HPI levels are not suitable for correlation or regression analysis—they contain strong long-term trends that can distort statistical relationships. This is why the analysis focuses on HPI growth instead of levels.
ggplot(df_ts, aes(x = DATE, y = hpi_growth)) +
geom_line() +
labs(
title = "Housing Price Growth Rate Over Time",
x = "Date",
y = "HPI Growth (Percent Change)"
)
## Warning: Removed 1 row containing missing values (`geom_line()`).
The HPI growth plot shows that housing price growth fluctuates across time with periods of strong appreciation and periods of slower or negative growth. Unlike the level plot, the growth series does not show a persistent upward trend. This makes it more suitable for statistical modeling because growth rates behave more like a stationary series.
Identifying these fluctuations helps illustrate how sensitive housing prices are to broader economic conditions.
df_corr <- df_ts %>%
filter(!is.na(mort_change), !is.na(hpi_growth))
cor_same <- cor(df_corr$mort_change, df_corr$hpi_growth, use = "complete.obs")
cor_same
## [1] 0.148769
df_ts <- df_ts %>%
mutate(
mort_change_lag1 = lag(mort_change, 1),
mort_change_lag2 = lag(mort_change, 2),
mort_change_lag3 = lag(mort_change, 3)
)
df_lag <- df_ts %>%
filter(!is.na(mort_change_lag1), !is.na(hpi_growth))
cor_lag1 <- cor(df_lag$mort_change_lag1, df_lag$hpi_growth, use = "complete.obs")
cor_lag2 <- cor(df_lag$mort_change_lag2, df_lag$hpi_growth, use = "complete.obs")
cor_lag3 <- cor(df_lag$mort_change_lag3, df_lag$hpi_growth, use = "complete.obs")
cor_lag1
## [1] 0.09608185
cor_lag2
## [1] -0.07964326
cor_lag3
## [1] 0.4359975
To understand whether mortgage rate changes are associated with housing price growth, I first calculated the simple same-period correlation and then explored lagged relationships. The same-period correlation is small (about 0.15), which suggests that mortgage rate movements are not strongly associated with home price growth in the same period.
The lagged correlations are about 0.10 at lag 1, -0.08 at lag 2, and 0.44 at lag 3. The lag-3 value is the largest, but it is positive, which is the opposite of what the main question anticipates, and it rests on fewer than 20 observations. A delayed response is plausible because home buying and selling are slow-moving processes, but with a sample this small the lag pattern is suggestive at best.
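The lagged correlations can also be collected in one table together with the number of usable observations and a p-value, which helps when judging values from a small sample. The sketch below is one way to do that in base R; lag_cor() is a hypothetical helper and the data are synthetic stand-ins for mort_change and hpi_growth.

```r
set.seed(1)        # synthetic data, for illustration only
x <- rnorm(20)     # stand-in for mort_change
y <- rnorm(20)     # stand-in for hpi_growth

# Hypothetical helper: correlate y with x shifted back by k rows.
lag_cor <- function(x, y, k) {
  x_lag <- if (k == 0) x else c(rep(NA, k), head(x, -k))
  ok <- complete.cases(x_lag, y)          # drop rows lost to the shift
  test <- cor.test(x_lag[ok], y[ok])      # Pearson correlation test
  data.frame(lag = k, n = sum(ok),
             r = unname(test$estimate), p_value = test$p.value)
}

lag_table <- do.call(rbind, lapply(0:3, lag_cor, x = x, y = y))
lag_table
```

Each extra lag removes one more row, so the lag-3 correlation is always estimated on the smallest sample.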
ggplot(df_corr, aes(x = mort_change, y = hpi_growth)) +
geom_point(alpha = 0.5) +
labs(
title = "Mortgage Rate Changes vs Housing Price Growth",
x = "Change in Mortgage Rate (Level)",
y = "Housing Price Growth (Percent Change)"
)
The scatterplot of mortgage rate changes versus housing price growth shows a dispersed cloud of points with no strong linear trend. This visual pattern reinforces the earlier correlation results: mortgage rate changes by themselves do not have a clear or consistent immediate impact on quarterly housing price growth.
Even though the points appear scattered, this is still meaningful because it shows that simple same-period relationships do not explain housing price movements well, and it sets up the need to investigate lagged effects and regression models.
df_reg_same <- df_corr
model1 <- lm(hpi_growth ~ mort_change, data = df_reg_same)
df_reg_lag1 <- df_ts %>%
filter(!is.na(hpi_growth), !is.na(mort_change_lag1))
model2 <- lm(hpi_growth ~ mort_change_lag1, data = df_reg_lag1)
df_reg_lag12 <- df_ts %>%
filter(!is.na(hpi_growth), !is.na(mort_change_lag1), !is.na(mort_change_lag2))
model3 <- lm(hpi_growth ~ mort_change_lag1 + mort_change_lag2, data = df_reg_lag12)
summary(model1)
##
## Call:
## lm(formula = hpi_growth ~ mort_change, data = df_reg_same)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.14346 -0.10800 -0.06040 0.09513 0.31093
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.12103 0.03350 3.613 0.00199 **
## mort_change 0.01095 0.01715 0.638 0.53133
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.1481 on 18 degrees of freedom
## Multiple R-squared: 0.02213, Adjusted R-squared: -0.03219
## F-statistic: 0.4074 on 1 and 18 DF, p-value: 0.5313
summary(model2)
##
## Call:
## lm(formula = hpi_growth ~ mort_change_lag1, data = df_reg_lag1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.13383 -0.09680 -0.06840 0.07584 0.37594
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.121894 0.035526 3.431 0.00319 **
## mort_change_lag1 0.007057 0.017730 0.398 0.69559
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.1531 on 17 degrees of freedom
## Multiple R-squared: 0.009232, Adjusted R-squared: -0.04905
## F-statistic: 0.1584 on 1 and 17 DF, p-value: 0.6956
summary(model3)
##
## Call:
## lm(formula = hpi_growth ~ mort_change_lag1 + mort_change_lag2,
## data = df_reg_lag12)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.14237 -0.10262 -0.07676 0.08847 0.37168
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.125611 0.039201 3.204 0.00591 **
## mort_change_lag1 0.006010 0.019493 0.308 0.76208
## mort_change_lag2 -0.004005 0.019541 -0.205 0.84035
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.1607 on 15 degrees of freedom
## Multiple R-squared: 0.0126, Adjusted R-squared: -0.1191
## F-statistic: 0.09571 on 2 and 15 DF, p-value: 0.9093
The regression models provide a more formal way of testing whether mortgage rate changes help explain housing price growth. In Model 1 (same-period effect), the coefficient on mortgage rate change is small and not statistically significant, which matches the correlation results and the scatterplot.
Model 2 introduces a one-period lag, and Model 3 includes both one- and two-period lags. The coefficients in these models remain small and statistically insignificant, and the R-squared values stay very low across all specifications. This suggests that mortgage rate changes, even when lagged, do not explain much of the quarter-to-quarter variation in national housing price growth on their own.
While the results show weak relationships, the modeling process itself is important because it demonstrates how to test timing effects and how to interpret regression output in a financial context.
acf(na.omit(df_ts$hpi_growth), main = "ACF of HPI Growth")
acf(na.omit(df_ts$mort_change), main = "ACF of Mortgage Rate Change")
The ACF plot for HPI growth shows very weak autocorrelation after the first lag, with most values falling inside the 95% confidence bounds. This is consistent with an approximately stationary series, meaning its statistical properties do not change dramatically over time, although an ACF plot is a visual check rather than a formal test and the bounds are wide with only about 20 observations. Stationarity is important because many time-series techniques assume it.
For mortgage rate changes, the ACF plot shows essentially no autocorrelation beyond lag 0. This means rate changes behave like independent shocks from period to period, which is typical for differenced interest rate series.
Together, these ACF results support working with growth rates and changes rather than levels in the correlation and regression analysis.
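The 95% bounds drawn on an ACF plot are approximately plus or minus 1.96 divided by the square root of the sample size, so they depend heavily on how many observations are available. The sketch below computes the bounds and flags lags outside them; it uses synthetic data of length 20 as a stand-in for the growth series, and the table layout is an illustrative choice.

```r
set.seed(3)       # synthetic data, for illustration only
x <- rnorm(20)    # stand-in for na.omit(df_ts$hpi_growth)

a <- acf(x, lag.max = 8, plot = FALSE)

# Approximate 95% bounds for white noise, as drawn by plot.acf().
bound <- qnorm(0.975) / sqrt(length(x))

# Drop lag 0 (always 1) and flag autocorrelations outside the bounds.
acf_table <- data.frame(
  lag = as.vector(a$lag)[-1],
  acf = as.vector(a$acf)[-1]
)
acf_table$outside <- abs(acf_table$acf) > bound

bound
acf_table
```

With 20 observations the bounds are about plus or minus 0.44, so only quite large autocorrelations would stand out.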
df_pre2000 <- df_ts %>% filter(DATE < as.Date("2000-01-01"))
df_post2000 <- df_ts %>% filter(DATE >= as.Date("2000-01-01"))
nrow(df_pre2000); nrow(df_post2000)
## [1] 12
## [1] 9
cor_pre <- cor(df_pre2000$mort_change, df_pre2000$hpi_growth, use = "complete.obs")
cor_post <- cor(df_post2000$mort_change, df_post2000$hpi_growth, use = "complete.obs")
cor_pre
## [1] 0.3448831
cor_post
## [1] -0.7054775
The split-sample analysis compares the relationship between mortgage rate changes and housing price growth before and after the year 2000. The two subsamples give very different results: the correlation is about 0.34 before 2000 and about -0.71 after 2000. The sign flips and the post-2000 correlation is large in magnitude, so the relationship does not look stable across housing market environments.
These estimates rest on only 12 and 9 rows, so they are imprecise. The contrast is better read as a sign that the full-sample results are fragile than as firm evidence of a structural change.
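To gauge how precise the two subsample correlations are, an approximate confidence interval can be computed with the Fisher z-transformation. The sketch below applies it to the reported values of cor_pre and cor_post; fisher_ci() is a hypothetical helper, and the pre-2000 sample size of 11 is an assumption (12 rows minus the leading NA in mort_change).

```r
# Hypothetical helper: approximate confidence interval for a Pearson
# correlation r from n pairs, using Fisher's z-transformation.
fisher_ci <- function(r, n, level = 0.95) {
  z    <- atanh(r)                     # transform to the z scale
  se   <- 1 / sqrt(n - 3)              # approximate standard error
  crit <- qnorm(1 - (1 - level) / 2)   # normal critical value
  tanh(z + c(-1, 1) * crit * se)       # back-transform the limits
}

ci_pre  <- fisher_ci(0.3448831, n = 11)   # n = 11 is assumed, see above
ci_post <- fisher_ci(-0.7054775, n = 9)

ci_pre
ci_post
```

Under these assumptions the pre-2000 interval includes zero, while the post-2000 interval lies below zero but is very wide, which reflects how little data each subsample contains.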
df_reg_lag3 <- df_ts %>%
filter(!is.na(hpi_growth), !is.na(mort_change_lag3))
model4 <- lm(hpi_growth ~ mort_change_lag3, data = df_reg_lag3)
summary(model4)
##
## Call:
## lm(formula = hpi_growth ~ mort_change_lag3, data = df_reg_lag3)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.11817 -0.07676 -0.03655 0.03928 0.22297
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.11053 0.02812 3.931 0.00133 **
## mort_change_lag3 0.02507 0.01336 1.876 0.08020 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.1148 on 15 degrees of freedom
## Multiple R-squared: 0.1901, Adjusted R-squared: 0.1361
## F-statistic: 3.521 on 1 and 15 DF, p-value: 0.0802
model_summary <- tibble(
model = c("Model 1: same period",
"Model 2: lag 1",
"Model 3: lag 1 + lag 2",
"Model 4: lag 3"),
r_squared = c(summary(model1)$r.squared,
summary(model2)$r.squared,
summary(model3)$r.squared,
summary(model4)$r.squared),
adj_r_sq = c(summary(model1)$adj.r.squared,
summary(model2)$adj.r.squared,
summary(model3)$adj.r.squared,
summary(model4)$adj.r.squared)
)
model_summary
Adding the three-period lag (Model 4) was motivated by the fact that lag 3 had the highest simple correlation with housing price growth. This model fits better than the others: the R-squared is about 0.19 and the coefficient is significant at the 10% level (p ≈ 0.08), though not at the 5% level. The coefficient is positive, however, which is the opposite of the effect the main question anticipates, and the lag was chosen because it had the highest correlation in the same data, so the result should be treated with caution.
The model comparison table summarizes the fit across all four regression specifications. R-squared ranges from about 0.01 to 0.19, so mortgage rate changes alone explain little of the variation in housing price growth in this sample. This does not mean mortgage rates are irrelevant; rather, it suggests that other economic factors play a more significant role in explaining housing price dynamics at the national level.
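Judging by the residual degrees of freedom, the four models above are estimated on slightly different samples (17 to 20 rows), which makes their R-squared values not strictly comparable. One option is to refit every specification on the rows where all lags are available. The sketch below shows the idea in base R with synthetic data; shift() and the column names are hypothetical.

```r
set.seed(2)                               # synthetic data, for illustration only
d <- data.frame(y = rnorm(20), x = rnorm(20))

# Hypothetical helper: shift a vector back by k rows, padding with NA.
shift <- function(v, k) c(rep(NA, k), head(v, -k))
d$x_lag1 <- shift(d$x, 1)
d$x_lag2 <- shift(d$x, 2)
d$x_lag3 <- shift(d$x, 3)

# Keep only rows where every regressor is observed.
common <- d[complete.cases(d), ]

fits <- list(
  same  = lm(y ~ x, data = common),
  lag1  = lm(y ~ x_lag1, data = common),
  lag12 = lm(y ~ x_lag1 + x_lag2, data = common),
  lag3  = lm(y ~ x_lag3, data = common)
)

fit_table <- data.frame(
  model     = names(fits),
  n         = unname(sapply(fits, nobs)),
  r_squared = unname(sapply(fits, function(m) summary(m)$r.squared))
)
fit_table
```

The cost of a common sample is that the shorter-lag models lose a few observations, which matters when there are only about 20 to begin with.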
par(mfrow = c(1, 2))
acf(df_ts$HPI, main = "ACF of HPI Level")
acf(na.omit(df_ts$hpi_growth), main = "ACF of HPI Growth")
par(mfrow = c(1, 1))
The ACF of the raw HPI level shows very strong autocorrelation across many lags, which is typical for trending economic time series. This confirms that the raw level series is non-stationary and not appropriate for simple regression or correlation analysis.
By contrast, the ACF of HPI growth shows much weaker autocorrelation and behaves more like a stationary series. This comparison visually demonstrates why economists work with returns or growth rates instead of raw price levels: it avoids spurious correlations and makes statistical modeling more reliable.
df_smooth <- df_ts %>%
arrange(DATE) %>%
mutate(
mort_ma4 = rollmean(MORTGAGE_RATE, k = 4, fill = NA, align = "right"),
hpi_growth_ma4 = rollmean(hpi_growth, k = 4, fill = NA, align = "right")
)
ggplot(df_smooth, aes(x = DATE)) +
geom_line(aes(y = MORTGAGE_RATE), alpha = 0.4) +
geom_line(aes(y = mort_ma4)) +
labs(
title = "Mortgage Rate with 4-Quarter Rolling Average",
x = "Date",
y = "Mortgage Rate (%)"
)
## Warning: Removed 3 rows containing missing values (`geom_line()`).
ggplot(df_smooth, aes(x = DATE)) +
geom_line(aes(y = hpi_growth), alpha = 0.4) +
geom_line(aes(y = hpi_growth_ma4)) +
labs(
title = "HPI Growth with 4-Quarter Rolling Average",
x = "Date",
y = "HPI Growth (Percent Change)"
)
## Warning: Removed 1 row containing missing values (`geom_line()`).
## Warning: Removed 4 rows containing missing values (`geom_line()`).
The rolling-average plots smooth the short-run volatility in mortgage rates and housing price growth, making underlying trends easier to see. The smoothed mortgage rate series highlights the long downward trend in interest rates. The smoothed HPI growth series shows multi-year periods of stronger or weaker price appreciation.
These plots provide additional context for understanding how the broader economic environment evolves over time and how housing price dynamics shift across different interest rate regimes.
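A right-aligned rolling mean with a window of four averages each row with the three rows before it, so the first three results are missing. The sketch below reproduces that calculation in base R on a toy vector as a cross-check of what rollmean(..., k = 4, align = "right") computes. It calls stats::filter() explicitly because dplyr masks filter(), as the startup messages note.

```r
x <- c(1, 2, 3, 4, 5, 6)   # toy values, for illustration only

# One-sided moving average: equal weights on the current and three prior rows.
ma4 <- as.numeric(stats::filter(x, rep(1 / 4, 4), sides = 1))
ma4
```

The window counts rows, not calendar time, so when rows are unevenly spaced a four-row average does not necessarily cover four quarters.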
Overall, the analysis finds that changes in mortgage rates, by themselves, have only a weak short-run relationship with national housing price growth. Simple correlations, scatterplots, and several regression models point to the same conclusion: period-to-period movements in mortgage rates explain little of the variation in HPI growth, even when lagged effects are considered. The one partial exception is the three-period lag, which is marginally significant but has a positive sign.
The ACF plots suggest that working with growth rates and changes is appropriate and that the transformed series behave reasonably well as stationary variables. The subsample analysis, however, shows that the correlation changes sign between the pre-2000 and post-2000 periods (about 0.34 versus -0.71), so the relationship is not stable over time. All of these results rest on a merged sample of only 21 unevenly spaced observations and should be read as exploratory.
These findings are consistent with the idea that housing prices are influenced by a wide range of factors beyond just mortgage rates, including income growth, credit availability, local supply constraints, and broader macroeconomic conditions. Mortgage rates are clearly important for affordability, but their direct effect on national housing price growth appears to be limited in this dataset.