Project Objective

The primary goal of this project is to identify and analyze the underlying trends and patterns in mortgage interest rates over an extended period (2000= 2023). Utilizing historical data, this project aims to develop a predictive model that can predict future interest rates.

Data Dictionary

The dataset consists of three primary variables, each representing different aspects of mortgage interest rates over time:

  1. Year:
    • Type: Date
    • Description: This column represents the year for which the mortgage interest rate data is recorded.
  2. Period:
    • Type: Integer
    • Description: This column serves as a sequential identifier for the observations.
  3. Interest_Rate:
    • Type: Numeric
    • Description: This column contains the mortgage interest rates recorded for each corresponding year. The values are expressed as percentages and represent the annual interest rate at the time.

Step 1: Install and load required packages

library(dplyr)
library(readxl)
library(zoo)
library(ggplot2)

Step 2: Download and examine data set

library(readxl)
df <- read_excel("C:/Users/belen/OneDrive/Homework/Stats/Mortgage.xlsx")
head(df)
## # A tibble: 6 × 3
##   Year                Period Interest_Rate
##   <dttm>               <dbl>         <dbl>
## 1 2000-01-01 00:00:00      1          8.05
## 2 2001-01-01 00:00:00      2          6.97
## 3 2002-01-01 00:00:00      3          6.54
## 4 2003-01-01 00:00:00      4          5.83
## 5 2004-01-01 00:00:00      5          5.84
## 6 2005-01-01 00:00:00      6          5.87

Step 3: Data Exploration

A) Construct a time series plot and determine the type of pattern that exists in the data

plot(df$Year, df$Interest_Rate, type= "o", col= "red", xlab= "Year", ylab= "Interest Rate", main="Mortgage Interest Rate Over Time")

Interpretation: The graph above suggests a downward trend from 2000 to 2015, when mortgage interest rates stabilized, to finally decline leading to 2020. From 2020 onwards, interest rates appeared to have skyrocketed and keep trending upwards in 2023. 

B) Develop the linear trend equation for the time series

model<- lm(Interest_Rate ~ Period, data = df)
summary(model)
## 
## Call:
## lm(formula = Interest_Rate ~ Period, data = df)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1.3622 -0.7212 -0.2823  0.5015  3.1847 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  6.69541    0.43776  15.295 3.32e-13 ***
## Period      -0.12890    0.03064  -4.207 0.000364 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.039 on 22 degrees of freedom
## Multiple R-squared:  0.4459, Adjusted R-squared:  0.4207 
## F-statistic:  17.7 on 1 and 22 DF,  p-value: 0.0003637
Result - linear trend equation: Interest Rate= 6.70 - 0.13 * Period

C) Forecast the average interest rate for period 25 (i.e., 2024).

forecast_period_25<- predict(model, newdata=data.frame(Period= 25))
forecast_period_25
##        1 
## 3.472942
Interpretation: Based on this model, the 2024 Mortgage Interest Rate is predicted to be 3.47.