R Markdown

##Loading the dataset
library(readxl)
price_data <- read_excel("C:/Users/Baha/Downloads/price_data.xlsx")
head(price_data)
## # A tibble: 6 × 5
##   Date                Interest   cpi Inventory `Average Price`
##   <dttm>                 <dbl> <dbl>     <dbl>           <dbl>
## 1 2018-03-01 00:00:00     4.44  250.     19080         134781.
## 2 2018-04-01 00:00:00     4.47  251.     19775         140569.
## 3 2018-05-01 00:00:00     4.59  252.     20616         146346.
## 4 2018-06-01 00:00:00     4.57  252.     21354         150539.
## 5 2018-07-01 00:00:00     4.53  252.     22339         151227.
## 6 2018-08-01 00:00:00     4.55  252.     23305         150136.

##Problem one

Construct a scatter plot of your dependent variable and an independent variable of interest, and use it to discuss the appropriate specification of your model (linear, log-linear, log-log, quadratic to capture nonlinearity, etc.). Use sale price and inventory as the two variables here.

##Problem two

Then estimate a bivariate regression with your dependent variable as a function of a constant term and the independent variable of interest, using the same two variables as in Problem one.

##Problem three

Estimate a multivariate regression with your dependent variable as a function of a constant and your independent variables, as before.

##Problem four

Use the models to predict the future price (constant) based on the cpi, inventory and mortgage rate.

library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.3     ✔ readr     2.1.4
## ✔ forcats   1.0.0     ✔ stringr   1.5.0
## ✔ ggplot2   3.4.4     ✔ tibble    3.2.1
## ✔ lubridate 1.9.2     ✔ tidyr     1.3.0
## ✔ purrr     1.0.2     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(ggplot2)
library(dplyr)
##Scatter plot of average price against inventory
ggplot(price_data, aes(x = Inventory, y = `Average Price`)) + geom_point()

##Scatter plot of average price against cpi
ggplot(price_data, aes(x = cpi, y = `Average Price`)) +
  geom_point() 

##Scatter plot of average price against interest rate
ggplot(price_data, aes(x = Interest, y = `Average Price`)) +
  geom_point()
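
One way to back up the visual judgment from the scatter plots is to fit a few candidate specifications and compare them. The sketch below is only illustrative: it uses the six rows printed by `head(price_data)` above (with prices rounded as displayed) rather than the full dataset, and the names `sample_rows`, `linear_fit`, `quadratic_fit`, `loglog_fit` and `fit_comparison` are assumptions introduced here.

```r
# Six rows copied from the head(price_data) printout (prices rounded as displayed).
sample_rows <- data.frame(
  Inventory    = c(19080, 19775, 20616, 21354, 22339, 23305),
  AveragePrice = c(134781, 140569, 146346, 150539, 151227, 150136)
)

# Candidate specifications for price as a function of inventory.
linear_fit    <- lm(AveragePrice ~ Inventory, data = sample_rows)
quadratic_fit <- lm(AveragePrice ~ Inventory + I(Inventory^2), data = sample_rows)
loglog_fit    <- lm(log(AveragePrice) ~ log(Inventory), data = sample_rows)

# Adjusted R-squared is comparable between the linear and quadratic fits
# because both have AveragePrice (in levels) as the dependent variable.
fit_comparison <- c(
  linear    = summary(linear_fit)$adj.r.squared,
  quadratic = summary(quadratic_fit)$adj.r.squared
)
print(fit_comparison)

# In the log-log fit the slope is read as an elasticity:
# the approximate % change in price for a 1% change in inventory.
print(coef(loglog_fit)[["log(Inventory)"]])
```

With the full dataset, the same calls on `price_data` (with the column written as `Average Price` in backticks) would give the comparison for the whole sample. The log-log fit has a different dependent variable, so its R-squared should not be compared directly with the other two.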

##Estimate a bivariate regression of the dependent variable on a constant term and the independent variable of interest
bi_model <- lm(`Average Price` ~ cpi, data = price_data)
summary(bi_model)
## 
## Call:
## lm(formula = `Average Price` ~ cpi, data = price_data)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -12605.5  -3914.3   -411.7   3201.1  12765.7 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -137497.59    9995.42  -13.76   <2e-16 ***
## cpi            1132.43      36.73   30.83   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 5759 on 65 degrees of freedom
## Multiple R-squared:  0.936,  Adjusted R-squared:  0.935 
## F-statistic: 950.7 on 1 and 65 DF,  p-value: < 2.2e-16

##Conclusion

The negative intercept means the fitted line crosses the y-axis below zero: the predicted average price is negative when cpi is zero. Since the cpi values shown above are around 250, that point is an extrapolation far outside the data and has no economic meaning on its own. The coefficient on cpi is positive: a one-unit increase in cpi is associated with an increase of 1132.43 in average price. Both the intercept and the cpi coefficient are statistically significant, with p-values far below the 5% significance level. The model has an R-squared of 0.936 and an adjusted R-squared of 0.935, so cpi alone explains about 93.6% of the variation in average price, which indicates a close fit to the data.
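
The significance statement can be checked directly from the coefficient table: the t value is the estimate divided by its standard error, and the p-value comes from a t distribution with the residual degrees of freedom. A minimal sketch, using the rounded cpi estimate, standard error and degrees of freedom printed in the summary above; the variable names are assumptions introduced here.

```r
# Values copied from the summary(bi_model) output above (rounded as printed).
estimate  <- 1132.43   # coefficient on cpi
std_error <- 36.73     # its standard error
df_resid  <- 65        # residual degrees of freedom

# t statistic and two-sided p-value for H0: coefficient = 0.
t_value <- estimate / std_error
p_value <- 2 * pt(-abs(t_value), df = df_resid)

# 95% confidence interval for the cpi coefficient.
ci_95 <- estimate + c(-1, 1) * qt(0.975, df = df_resid) * std_error

print(round(t_value, 2))
print(p_value < 0.05)
print(ci_95)
```

With the fitted model in the session, `confint(bi_model)` would give the interval from the unrounded estimates.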

multi_model <- lm(`Average Price` ~ Interest + Inventory + cpi, data = price_data)
summary(multi_model)
## 
## Call:
## lm(formula = `Average Price` ~ Interest + Inventory + cpi, data = price_data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -9561.0 -3821.5  -845.5  2803.4 11802.1 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -1.920e+05  2.481e+04  -7.737 1.03e-10 ***
## Interest    -6.288e+02  9.440e+02  -0.666    0.508    
## Inventory    9.265e-01  3.100e-01   2.988    0.004 ** 
## cpi          1.288e+03  8.777e+01  14.675  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 5341 on 63 degrees of freedom
## Multiple R-squared:  0.9467, Adjusted R-squared:  0.9441 
## F-statistic: 372.7 on 3 and 63 DF,  p-value: < 2.2e-16

##Conclusion

The intercept is negative: the predicted average price is negative when all predictors are zero, a point that lies far outside the observed data. The coefficient on Interest is negative, so, holding inventory and cpi constant, a higher interest rate is associated with a lower average price, and vice versa. However, this coefficient is not statistically significant: its p-value (0.508) is well above the 5% significance level. Inventory and cpi are both statistically significant, with p-values below 5%, and both have positive coefficients, so each is associated with a higher average price when the other predictors are held constant. The model has an R-squared of 0.9467 and an adjusted R-squared of 0.9441, so the three predictors together explain about 94.67% of the variation in average price, which indicates a good fit to the data.
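
Because adding predictors can never lower R-squared, the two models are better compared on adjusted R-squared, which penalizes extra regressors. The sketch below recomputes it from the printed R-squared values using the standard formula 1 - (1 - R²)(n - 1)/(n - k - 1). The sample size n = 67 is inferred from the residual degrees of freedom (65 + 2 and 63 + 4), and the function name `adjusted_r2` is an assumption introduced here.

```r
# Adjusted R-squared from R-squared, sample size n and number of predictors k
# (k excludes the intercept).
adjusted_r2 <- function(r2, n, k) {
  1 - (1 - r2) * (n - 1) / (n - k - 1)
}

n_obs <- 67  # inferred: residual df 65 + 2 estimated coefficients

adj_bi    <- adjusted_r2(r2 = 0.936,  n = n_obs, k = 1)  # cpi only
adj_multi <- adjusted_r2(r2 = 0.9467, n = n_obs, k = 3)  # Interest, Inventory, cpi

print(c(bivariate = adj_bi, multivariate = adj_multi))
```

Both values agree with the summaries above (0.935 and 0.9441) up to rounding of the inputs, and the multivariate model still fits slightly better after the penalty.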

##Use the models to predict the future price (constant) based on the cpi, inventory and mortgage rate.

##Predictions of the future price using the fitted models

##The bivariate model

The prediction equation of the bivariate model is given by:

Y_predicted = -137497.59 + 1132.43 * cpi

Here the estimated intercept (beta_0) is -137497.59 and the estimated slope (beta_1) is 1132.43. This equation can be used to predict the future price. For instance, if this month’s consumer price index is 250.00, the predicted price is given by

Y_predicted = -137497.59 + 1132.43 * 250.00

Y_predicted = 145,609.91

Therefore, the predicted future price is 145,609.91.
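
The hand calculation above can be reproduced in R. This is a sketch using the rounded coefficients from the summary; the names `b0`, `b1`, `cpi_new` and `price_hat` are assumptions introduced here.

```r
# Rounded coefficients from summary(bi_model).
b0 <- -137497.59  # intercept
b1 <- 1132.43     # slope on cpi

cpi_new   <- 250.00
price_hat <- b0 + b1 * cpi_new
print(price_hat)

# With the fitted model in the session, the equivalent call would be:
# predict(bi_model, newdata = data.frame(cpi = cpi_new))
```

`predict()` works from the unrounded coefficients, so its result may differ slightly from the hand calculation.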

##The multivariate model

The prediction equation of the multivariate model is given by:

Y_predicted = -192000 + 0.9265 * Inventory - 628.8 * Interest + 1288 * cpi

For example, to predict the future price at an interest rate of 4.47%, an inventory of 26354 and a consumer price index of 246.78, with Interest entered in percentage points as in the data:

Y_predicted = -192000 + 0.9265 * 26354 - 628.8 * 4.47 + 1288 * 246.78

Y_predicted = 147,458.89

Thus the predicted future price is 147,458.89.
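
The multivariate prediction can also be sketched in R as a dot product of the coefficients with the input values. The coefficients are the rounded estimates from the summary above, and `coefs` and `predict_price` are hypothetical names introduced here. Interest is entered in percentage points (4.47), matching the units of the Interest column in the data preview, because the coefficient is estimated per unit of that column.

```r
# Rounded coefficients from summary(multi_model).
coefs <- c(intercept = -192000, Interest = -628.8, Inventory = 0.9265, cpi = 1288)

# Hypothetical helper: interest is in percentage points, as in the data
# (e.g. 4.47, not 0.0447).
predict_price <- function(interest, inventory, cpi) {
  sum(coefs * c(1, interest, inventory, cpi))
}

price_hat <- predict_price(interest = 4.47, inventory = 26354, cpi = 246.78)
print(price_hat)

# With the fitted model in the session, the equivalent call would be:
# predict(multi_model,
#         newdata = data.frame(Interest = 4.47, Inventory = 26354, cpi = 246.78),
#         interval = "prediction")
```

The `interval = "prediction"` argument additionally returns lower and upper bounds, which gives a sense of how uncertain a single forecast is.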