Multiple Linear Regression using Stock Price Index

Collect the data

## [1] "C:/MSDS_Course/Spring_2022/DATA_605/Week_12"
##   Year Month Interest_Rate Unemployment_Rate Stock_Price_Index
## 1 2017    12          2.75               5.3              1464
## 2 2017    11          2.50               5.3              1394
## 3 2017    10          2.50               5.3              1357
## 4 2017     9          2.50               5.3              1293
## 5 2017     8          2.50               5.4              1256
## 6 2017     7          2.50               5.6              1254
## [1] "Year"              "Month"             "Interest_Rate"    
## [4] "Unemployment_Rate" "Stock_Price_Index"
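The loading code is not shown in this output. As a small sketch for experimenting with the column layout, the six preview rows printed above can be rebuilt by hand. The name `stock_preview` is an assumption introduced here, and these six rows are only the head of the data, not the full dataset the model was fitted on.

```r
# Illustrative only: the six rows printed by head() above, typed in by hand.
# The full dataset has more rows (the model summary reports 21 residual
# degrees of freedom), so this is not a substitute for the original file.
stock_preview <- data.frame(
  Year              = rep(2017, 6),
  Month             = 12:7,
  Interest_Rate     = c(2.75, 2.50, 2.50, 2.50, 2.50, 2.50),
  Unemployment_Rate = c(5.3, 5.3, 5.3, 5.3, 5.4, 5.6),
  Stock_Price_Index = c(1464, 1394, 1357, 1293, 1256, 1254)
)

str(stock_preview)    # column types
names(stock_preview)  # same column names as listed above
```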

Check linearity of the Stock Price Index with Interest Rate

ggplot(data = stock_data, aes(x = Interest_Rate, y = Stock_Price_Index)) +
 geom_point()

Check linearity of the Stock Price Index with Unemployment Rate

ggplot(data = stock_data, aes(x = Unemployment_Rate, y = Stock_Price_Index)) +
 geom_point()

Apply multiple linear regression

model <- lm(Stock_Price_Index ~ Interest_Rate + Unemployment_Rate, data = stock_data)
summary(model)
## 
## Call:
## lm(formula = Stock_Price_Index ~ Interest_Rate + Unemployment_Rate, 
##     data = stock_data)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -175.959  -38.459    7.664   51.635  111.670 
## 
## Coefficients:
##                   Estimate Std. Error t value Pr(>|t|)   
## (Intercept)         1461.9      942.6   1.551  0.13584   
## Interest_Rate        386.5      115.3   3.351  0.00303 **
## Unemployment_Rate   -206.3      123.9  -1.664  0.11088   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 73.08 on 21 degrees of freedom
## Multiple R-squared:  0.8902, Adjusted R-squared:  0.8797 
## F-statistic: 85.12 on 2 and 21 DF,  p-value: 8.444e-11

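The p-value for Interest_Rate (0.00303) is less than 0.05, so it is statistically significant in the multiple linear regression model. The p-value for Unemployment_Rate (0.11088) is greater than 0.05, so it is not significant at the 5% level.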
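As a rough check, the t values and p-values in the coefficient table can be recomputed from the printed estimates and standard errors. This is a sketch that uses the rounded values from the summary, so the results may differ from the table in the last digit. The variable names are chosen here for illustration.

```r
# Values copied from the coefficient table above (rounded as printed).
estimate  <- c(Interest_Rate = 386.5, Unemployment_Rate = -206.3)
std_error <- c(Interest_Rate = 115.3, Unemployment_Rate = 123.9)
df_resid  <- 21  # residual degrees of freedom from the summary

t_value <- estimate / std_error                  # t = estimate / std. error
p_value <- 2 * pt(-abs(t_value), df = df_resid)  # two-sided t-test

round(t_value, 3)
round(p_value, 4)
p_value < 0.05  # TRUE for Interest_Rate, FALSE for Unemployment_Rate
```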

Using the coefficients in the summary to build the equation:

Stock_Price_Index = (Intercept) + (Interest_Rate_coeff) * \(X_1\) + (Unemployment_Rate_coeff) * \(X_2\)

where \(X_1\) is the interest rate and \(X_2\) is the unemployment rate.

Stock_Price_Index = 1461.9 + 386.5 * \(X_1\) + (-206.3) * \(X_2\)

Make Prediction

Assume the interest rate is 2.1 and the unemployment rate is 5.9. Then the Stock_Price_Index would be:

x1 <- 2.1
x2 <- 5.9
(Stock_Price_Index <-  1461.9 + (386.5* x1)  + ((-206.3)* x2))
## [1] 1056.38
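The same calculation can be wrapped in a small function so that other scenarios are easy to try. This is a minimal sketch: `predict_index` and `coefs` are hypothetical names introduced here, and the coefficients are the rounded values printed in the summary, so results can differ slightly from what the fitted model object would return.

```r
# Coefficients as printed in the summary (rounded to one decimal place).
coefs <- c(intercept = 1461.9, interest = 386.5, unemployment = -206.3)

# predict_index() is a hypothetical helper name, not part of the original code.
predict_index <- function(interest_rate, unemployment_rate) {
  coefs[["intercept"]] +
    coefs[["interest"]] * interest_rate +
    coefs[["unemployment"]] * unemployment_rate
}

predict_index(2.1, 5.9)  # same inputs as the manual calculation above

# With the fitted object available, the usual call would be:
# predict(model, newdata = data.frame(Interest_Rate = 2.1, Unemployment_Rate = 5.9))
```

Using `predict()` on the model object avoids the rounding in the printed coefficients and is generally preferable when the model is in scope.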

Residual Analysis

Fitted Value vs Residuals

plot(model$fitted.values, model$residuals, xlab='Fitted Values', ylab='Residuals')
abline(0,0)

Plotting the residuals against the fitted values shows a sine-like curve, which suggests that the relationship is not purely linear and that the model is missing some structure in the data.

The spread of the residuals may not be constant, since the most extreme residuals seem to vary differently from the rest; however, the evidence is not clear. I think it is reasonable to continue with the analysis and assume the residuals have similar variance.

qqnorm(model$residuals)
qqline(model$residuals)

The normal Q-Q plot of the residuals appears to follow the theoretical line. Residuals are reasonably normally distributed.

The adjusted coefficient of determination of a multiple linear regression model is the coefficient of determination (R-squared) adjusted for the number of predictors in the model.

summary(model)$adj.r.squared 
## [1] 0.8797363
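The adjusted value can be reproduced from the quantities printed in the model summary using the standard formula 1 - (1 - R^2)(n - 1)/(n - p - 1). The sketch below assumes n = 24 observations, inferred from the 21 residual degrees of freedom plus the three estimated coefficients, and uses the rounded R-squared from the summary, so it matches the reported value only to about four decimal places.

```r
r_squared <- 0.8902            # Multiple R-squared, as printed in the summary
df_resid  <- 21                # residual degrees of freedom
p         <- 2                 # predictors: Interest_Rate, Unemployment_Rate
n         <- df_resid + p + 1  # inferred number of observations (24)

# Penalize R-squared for the number of predictors relative to sample size.
adj_r_squared <- 1 - (1 - r_squared) * (n - 1) / (n - p - 1)
round(adj_r_squared, 4)
```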