Abstract
This is a solution for Workshop 6. Not all the workshop will be displayed;only the sections were students needed to work on an exercise or respond questions.Assume that this industry is an oligopoly where there are few main players in the market that account for most of the market share. In an oligopoly, it is very common that price elasticity of demand is high and sensitive. In other words, when a player decreases the price of a product, it is very likely that its demand will significantly increase. For example, for a price elasticity of -2, if a company decides to reduce the price by 5% in one month, it is expected that on average the demand will increase by about 10% on average.
It is also important to examine not only how the demand of one product changes with respect to its own price, but also how its demand changes with respect to the changes in prices of other products. The concept of cross-elasticity refers to the change of demand of one product with respect to the change in price of another product. If the cross-elasticity coefficient between the demand of product 1 and price of product 2 is about 1.5, this is a sign of competition between both products: when the price of product 2 increases in about 1%, then the demand of product 1 increases in about 1.5%. This price sensitivity might be due to competition between products. If you get a negative cross elasticity coefficient, then this might be a sign of complement between the products.
The concept of price elasticity is simple, but in reality it is not easy to get a good estimate since there are other factors that can influence the demand of a product. For example, it is common that the sales of products in units are seasonal, so the increase in demand might be due to seasonaly more than price elasticity. Another common driver of consumer demand is the organic trend due to either distribution or consumer acceptance. So, we need an econometric model that can separate these drivers and be able to measure price elasticities using a robust method.
Let’s see how we can estimate price elasticities, and how these elasticities along with other main factors that affect demand can help us to do a good demand forecast for products.
For this exercise we will use a sample dataset of main consumer products from main competitors of a category in a country.This dataset has historical sales of the main 7 products of a category, which account for 70% of the share in the category. This dataset was created using real data but changing the name of products. For each product you have sales in value ($) and sales in volume (kgs).
Download the dataset from: http://www.apradie.com/ec2004/salesfabs.xlsx
#I set the working directory:
library(readxl)
library(dplyr)
library(xts)
library(zoo)
library(tseries)
library(forecast)
library(lmtest)
library(astsa)
# Download the excel file from a web site:
<- download.file("http://www.apradie.com/ec2004/salesfabs.xlsx",
dataset "dataw6.xlsx", mode="wb")
# The first parameter is the link and the second is a name for the
# local file
# Use the function read_excel()
<- read_excel("dataw6.xlsx") dataset
The columns that start with qfab are related to sales in volume, and vfab refer to sales in value ($). This dataset has historical monthly sales data for 7 consumer products (qfab1, qfab2, etc).
We can create 2 datasets, one for sales in volume and another for sales in value. This will help to easily create another dataset with prices.
<- dataset %>% select(contains("qfab"))
salesvol <- dataset %>% select(contains("vfab")) salesval
Now we can create another dataset for prices. We just divide sales in value by sales in volume to get average historical monthly prices:
# I construct a date variable as a sequence of dates using the month column:
<- seq(as.Date(dataset$month[1]),
date by="month",
length.out = nrow(dataset) )
# I delete the first column since it has the month, and I will construct
# an xts dataset with the index equal to the month
# I create the xts object:
<- xts(x=dataset[,-1], frequency = "monthly", order.by=date)
dataset.xts
<- salesval / salesvol
prices names(prices)<-c("pfab1","pfab2","pfab3","pfab4","pfab5","pfab6","pfab7")
<- ts(coredata(salesvol),start=c(2014,12),frequency=12)
qfabs <- ts(coredata(prices),start=c(2014,12),frequency=12) pfabs
You have to calibrate an ARIMA model for product 4 following the method we learned here. You have to show and explain your steps, and also you have to INTERPRET your final model, and do a prediction for the next 15 months.
We will model the annual % change in demand of product 4 (month by month). We will assume that this product might be competing with another product or might be a complement of other consumer products in the dataset. Then, for this ARIMAX model we have to include the annual % price change of ALL products as independent/explanatory variables.
But before setting up this ARIMAX model, we need to calibrate an ARIMA/SARIMA model for each price of the products in the dataset. In this case, I will provide you with the calibration of the ARIMA/SARIMA models for all prices of the products. I calibrated these models according to what we learned in previous workshops.
We will use the Arima function with a specific calibration for each price:
# Product price 1:
<- Arima(pfabs[,"pfab1"], order = c(1,0,0),
mprice1 seasonal = list(order=c(0,1,0),period=12),
include.constant = TRUE,
lambda = 0)
# I check the pvalues of the coefficients:
coeftest(mprice1)
##
## z test of coefficients:
##
## Estimate Std. Error z value Pr(>|z|)
## ar1 0.59807762 0.10544181 5.6721 1.41e-08 ***
## drift 0.00508341 0.00063234 8.0391 9.05e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# I create and plot the price forecast
<- forecast(mprice1, h=15)
forecastp1 autoplot(forecastp1)
# Product price 2:
<- Arima(pfabs[,"pfab2"], order = c(1,0,0),
mprice2 seasonal = list(order=c(0,1,0),period=12),
include.constant = TRUE,
lambda = 0)
# I check the pvalues of the coefficients:
coeftest(mprice2)
##
## z test of coefficients:
##
## Estimate Std. Error z value Pr(>|z|)
## ar1 0.6480305 0.0979253 6.6176 3.651e-11 ***
## drift 0.0053835 0.0010509 5.1228 3.010e-07 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# I create and plot the price forecast
<- forecast(mprice2, h=15)
forecastp2 autoplot(forecastp2)
# Product price 3:
<- Arima(pfabs[,"pfab3"], order = c(1,0,0),
mprice3 seasonal = list(order=c(0,1,1),period=12),
include.constant = TRUE,
lambda = 0)
# I check the pvalues of the coefficients:
coeftest(mprice3)
##
## z test of coefficients:
##
## Estimate Std. Error z value Pr(>|z|)
## ar1 0.63551916 0.10995083 5.7800 7.469e-09 ***
## sma1 -0.22374081 0.19500882 -1.1473 0.251242
## drift 0.00217850 0.00072782 2.9932 0.002761 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# I create and plot the price forecast
<- forecast(mprice3, h=15)
forecastp3 autoplot(forecastp3)
# Product price 4:
<- Arima(pfabs[,"pfab4"], order = c(1,0,0),
mprice4 seasonal = list(order=c(0,1,1),period=12),
include.constant = TRUE,
lambda = 0)
# I check the pvalues of the coefficients:
coeftest(mprice4)
##
## z test of coefficients:
##
## Estimate Std. Error z value Pr(>|z|)
## ar1 0.66101380 0.09695434 6.8178 9.246e-12 ***
## sma1 -0.56239455 0.16024533 -3.5096 0.0004488 ***
## drift 0.00421852 0.00042551 9.9141 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# I create and plot the price forecast
<- forecast(mprice4, h=15)
forecastp4 autoplot(forecastp4)
# Product price 5:
<- Arima(pfabs[,"pfab5"], order = c(1,0,0),
mprice5 seasonal = list(order=c(0,1,0),period=12),
include.constant = TRUE,
lambda = 0)
# I check the pvalues of the coefficients:
coeftest(mprice5)
##
## z test of coefficients:
##
## Estimate Std. Error z value Pr(>|z|)
## ar1 0.73931832 0.09285911 7.9617 1.697e-15 ***
## drift 0.00556178 0.00095356 5.8326 5.456e-09 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# I create and plot the price forecast
<- forecast(mprice5, h=15)
forecastp5 autoplot(forecastp5)
# Product price 6:
<- Arima(pfabs[,"pfab6"], order = c(1,0,0),
mprice6 seasonal = list(order=c(1,1,0),period=12),
include.constant = TRUE,
lambda = 0)
# I check the pvalues of the coefficients:
coeftest(mprice6)
##
## z test of coefficients:
##
## Estimate Std. Error z value Pr(>|z|)
## ar1 0.8984396 0.0640989 14.0165 < 2.2e-16 ***
## sar1 -0.3927082 0.1315205 -2.9859 0.002827 **
## drift 0.0044928 0.0017715 2.5361 0.011208 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# I create and plot the price forecast
<- forecast(mprice6, h=15)
forecastp6 autoplot(forecastp6)
# Product price 7:
<- Arima(pfabs[,"pfab7"], order = c(1,0,0),
mprice7 seasonal = list(order=c(0,1,0),period=12),
include.constant = TRUE,
lambda = 0)
# I check the pvalues of the coefficients:
coeftest(mprice7)
##
## z test of coefficients:
##
## Estimate Std. Error z value Pr(>|z|)
## ar1 0.7346684 0.0865368 8.4897 <2e-16 ***
## drift 0.0010131 0.0021806 0.4646 0.6422
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# I create and plot the price forecast
<- forecast(mprice7, h=15)
forecastp7 autoplot(forecastp7)
Now that we have a forecast for prices for each product, we can design an ARIMAX model for the demand of product 4 (qfab4):
The historical data of qfab4 is:
"qfab4"] qfabs[,
## Jan Feb Mar Apr May Jun Jul
## 2014
## 2015 602.1692 616.7962 576.9154 552.2004 583.3495 530.5445 544.4641
## 2016 682.1081 707.9934 655.1503 625.7916 617.9704 583.9963 631.9970
## 2017 812.1743 848.8255 817.2468 745.7488 734.3957 654.5958 684.2820
## 2018 801.1178 898.1721 779.3245 759.8208 787.5706 776.4568 773.0577
## 2019 938.2758 1114.3176 883.4799 815.0494 867.1072 849.1447 765.6616
## 2020 986.4335 1071.5877 884.9235 876.3864 872.0506 843.9563 841.7164
## Aug Sep Oct Nov Dec
## 2014 604.8942
## 2015 555.6245 532.7308 562.9418 611.9219 735.2260
## 2016 627.0950 656.2282 654.0440 673.1533 759.4932
## 2017 659.1869 701.4818 710.1428 769.3757 917.9404
## 2018 759.5814 766.0799 767.7176 892.9676 1009.6267
## 2019 782.4975 778.8295 810.1746 920.9684 1073.6561
## 2020 826.6552 807.3565
I start calibrating an ARIMA/SARIMA model for the demand of product 4 (qfab4). This this is the demand of a consumer product, I start checking whether the annual % growth is stationary:
<- diff(log(qfabs[,"qfab4"]),lag=12)
s12.lnfab4 plot(s12.lnfab4)
Now I run the Dicky-Fuller test to check whether the seasonal difference of the log of product 4 demand is stationary:
adf.test(s12.lnfab4,k=0)
## Warning in adf.test(s12.lnfab4, k = 0): p-value smaller than printed p-value
##
## Augmented Dickey-Fuller Test
##
## data: s12.lnfab4
## Dickey-Fuller = -4.926, Lag order = 0, p-value = 0.01
## alternative hypothesis: stationary
Since p-value<0.05, then we can treat the series as stationary, so I set the D=1. In other words, I will continue with the seasonal difference of the log of the series (that is the annual cc % change of demand 4) to do the ARIMA-SARIMA model.
Next, I need to do the ACF and the PACF plots to examine the autocorrelations between the annual % change of product 4 demand and its own lags:
acf2(s12.lnfab4,max.lag=12)
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
## ACF 0.53 0.35 0.19 0.13 0.06 0.29 0.34 0.19 0.09 -0.14 -0.19 -0.33
## PACF 0.53 0.10 -0.03 0.02 -0.02 0.35 0.13 -0.20 -0.07 -0.29 0.03 -0.29
I see a classical AR signature with p=1 since the ACF shows a slow decay in the autocorrelations, while the PACF shows a fast decay after autocorrelation with lag 2.
Then I start including an AR(1) term to the model:
<- Arima(qfabs[,"qfab4"], order = c(1,0,0),
m1qfab4 seasonal = list(order=c(0,1,0),period=12),
include.constant = TRUE,
lambda = 0)
coeftest(m1qfab4)
##
## z test of coefficients:
##
## Estimate Std. Error z value Pr(>|z|)
## ar1 0.5471474 0.1116303 4.9014 9.514e-07 ***
## drift 0.0078296 0.0013094 5.9797 2.236e-09 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
I check whether the model residuals is a white noise:
acf2(m1qfab4$residuals, max.lag=24)
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13]
## ACF -0.06 0.09 -0.02 0.05 -0.18 0.21 0.26 0.04 0.13 -0.17 0.03 -0.36 0.01
## PACF -0.06 0.09 -0.01 0.04 -0.17 0.20 0.33 0.04 0.10 -0.25 0.05 -0.34 -0.20
## [,14] [,15] [,16] [,17] [,18] [,19] [,20] [,21] [,22] [,23] [,24]
## ACF 0.13 0.14 -0.06 0.15 -0.09 -0.15 0.04 -0.01 0.15 0.10 0.08
## PACF 0.19 0.07 0.05 0.15 0.04 0.18 0.01 -0.11 0.04 -0.04 -0.11
The autocorrelation with lag 12 is negative and significant in both plots, so I add a seasonal MA term:
<- Arima(qfabs[,"qfab4"], order = c(1,0,0),
m2qfab4 seasonal = list(order=c(0,1,1),period=12),
include.constant = TRUE,
lambda = 0)
coeftest(m2qfab4)
##
## z test of coefficients:
##
## Estimate Std. Error z value Pr(>|z|)
## ar1 0.57750589 0.11162316 5.1737 2.295e-07 ***
## sma1 -0.48653698 0.15099391 -3.2222 0.001272 **
## drift 0.00783918 0.00080503 9.7377 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
I check whether the model 2 residuals is a stationary series:
acf2(m2qfab4$residuals,max.lag = 12)
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
## ACF -0.11 0.12 0.05 0.05 -0.09 0.28 0.17 0.05 0.10 -0.12 0.03 -0.03
## PACF -0.11 0.11 0.07 0.05 -0.10 0.25 0.26 0.06 0.05 -0.19 0.00 -0.05
Now the lag 6 autocorrelation is positive and significant in both plots. We can think in including either an AR(6) or an MA(6). In these cases, we can look at the ACF plot. For an AR(6) process with positive autocorrelation, we would see that the lag 6, lag 12, lag 18, lag 24 autocorrelations decay slowly. We can expand the ACF plot to 24 lags:
acf2(m2qfab4$residuals,max.lag = 24)
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13]
## ACF -0.11 0.12 0.05 0.05 -0.09 0.28 0.17 0.05 0.10 -0.12 0.03 -0.03 -0.07
## PACF -0.11 0.11 0.07 0.05 -0.10 0.25 0.26 0.06 0.05 -0.19 0.00 -0.05 -0.22
## [,14] [,15] [,16] [,17] [,18] [,19] [,20] [,21] [,22] [,23] [,24]
## ACF 0.19 0.05 -0.08 0.13 -0.08 -0.11 0.05 0.00 0.03 0.08 0.03
## PACF 0.13 0.04 -0.04 0.19 -0.07 -0.02 -0.01 -0.13 0.11 -0.03 0.05
Looking at the ACF plot, lag 6, lag 12, lag 18 autocorrelations does not show a slow decay of autocorrelations, so this might be a sign of an MA(6) process only the lag 6 autocorrelation is positive and significant, but there is a no slow decay in the following lag 12, lag 18 and lag 24 autocorrelations.
We could also run both models, with an AR(6) and an MA(6) and keep the model with the more negative Akaike Information Criteria.
I add an MA(6) term. For now, I need to add all MA terms from 1 to 6:
<- Arima(qfabs[,"qfab4"], order = c(1,0,6),
m3qfab4 seasonal = list(order=c(0,1,1),period=12),
include.constant = TRUE,
lambda = 0)
coeftest(m3qfab4)
##
## z test of coefficients:
##
## Estimate Std. Error z value Pr(>|z|)
## ar1 0.7726986 0.1334643 5.7896 7.057e-09 ***
## ma1 -0.3939526 0.1651325 -2.3857 0.0170478 *
## ma2 0.0580795 0.1304586 0.4452 0.6561787
## ma3 -0.1134056 0.1219631 -0.9298 0.3524563
## ma4 0.1960571 0.1360630 1.4409 0.1496050
## ma5 -0.2320074 0.1477422 -1.5704 0.1163331
## ma6 0.5463719 0.1521144 3.5918 0.0003283 ***
## sma1 -0.4183691 0.1735053 -2.4113 0.0158968 *
## drift 0.0078121 0.0013491 5.7905 7.016e-09 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Unfortunately the terms for MA 1, 2, 3, 4 and 5 were also included. Curiously, the MA 1 term was significant, so I can try to keep it in the model. I can indicate to the Arima model that we DO NOT want to include the MA terms that are not significant by using the option fixed. The coefficients of the model that are show above are in the following order:
ar1, ma1, ma2, ma3, ma4, ma5, ma6, sma1, drift
In the fixed option we set a vector with zeros to the terms that we DO NOT want to include:
<- Arima(qfabs[,"qfab4"], order = c(1,0,6),
m3qfab4 seasonal = list(order=c(0,1,1),period=12),
fixed=c(NA,NA,0,0,0,0,NA,NA,NA),
include.constant = TRUE,
lambda = 0)
#In the fixed option I set a vector with the following values:
#fixed =c(NA,NA,0,0,0,0,NA,NA,NA)
#Since I do not want to include the terms ma2, ma3, ma4 and ma5, then I set
#to zero the values 3, 4, 5, and 6, and NA to the rest.
coeftest(m3qfab4)
##
## z test of coefficients:
##
## Estimate Std. Error z value Pr(>|z|)
## ar1 0.7976025 0.1173969 6.7941 1.090e-11 ***
## ma1 -0.4317473 0.1517493 -2.8451 0.004439 **
## ma6 0.4233557 0.1549114 2.7329 0.006278 **
## sma1 -0.4284858 0.1654629 -2.5896 0.009608 **
## drift 0.0078171 0.0014167 5.5178 3.433e-08 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
I see that the mr6 term is positive and significant as expected. Also, I see that the sum of the AR and MA terms is less than 1 (0.797 + -0.431 + 0.423), that is an important condition for the calibration of an ARIMA model.
I check whether the model residuals is a stationary series:
acf2(m3qfab4$residuals,max.lag = 12)
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
## ACF 0.02 0.03 -0.02 0 -0.09 -0.02 0.12 -0.02 0.07 -0.15 -0.04 -0.06
## PACF 0.02 0.03 -0.02 0 -0.09 -0.01 0.13 -0.03 0.07 -0.16 -0.04 -0.03
Now I see that the residuals look like a stationary series, so I stop here and use this model to start calibrating the ARIMAX model by including the price variables:
# I generate the log of the product prices since the lamda =0 (log
# transformation) does not apply to the independent variables
<- log(pfabs)
lnpfabs
<- Arima(qfabs[,"qfab4"], order = c(1,0,6),
m1qfab4x seasonal = list(order=c(0,1,1),period=12),
fixed=c(NA,NA,0,0,0,0,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA),
include.constant = TRUE,
xreg = lnpfabs,
lambda = 0)
coeftest(m1qfab4x)
##
## z test of coefficients:
##
## Estimate Std. Error z value Pr(>|z|)
## ar1 0.70581474 0.15501660 4.5532 5.285e-06 ***
## ma1 -0.42741487 0.15671674 -2.7273 0.006385 **
## ma6 0.39044162 0.14697393 2.6565 0.007895 **
## sma1 -0.73833927 0.27309231 -2.7036 0.006859 **
## drift 0.01274847 0.00061499 20.7295 < 2.2e-16 ***
## pfab1 -0.19275941 0.15904008 -1.2120 0.225506
## pfab2 -0.14462470 0.14818485 -0.9760 0.329077
## pfab3 0.03983060 0.12635832 0.3152 0.752595
## pfab4 -1.14086373 0.21974575 -5.1917 2.083e-07 ***
## pfab5 -0.00981323 0.19476756 -0.0504 0.959816
## pfab6 0.42601386 0.16178352 2.6332 0.008458 **
## pfab7 -0.20312498 0.09615632 -2.1124 0.034648 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Now I start dropping independent variables one by one, starting with the variable with the highest p-value (the less significant).
I dropped pfab5. I can drop it using the fixed option and setting to zero the cell that is in the position of pfab5, or just deleting pfab5 from the independent variables. I do the first option:
# I will write the fixed vector and the corresponding variable in the model:
#c(NA, 0 , 0, 0, 0, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA)
# ar1,ar2,ar3,ar4,ar5,ar6,sma1,drift,pfab1,pfab2,pfab3,pfab4,pfab5,pfab6,pfab7
<- Arima(qfabs[,"qfab4"], order = c(1,0,6),
mqfab4x seasonal = list(order=c(0,1,1),period=12),
fixed=c(NA,NA,0,0,0,0,NA,NA,NA,NA,NA,NA,NA,0,NA,NA),
include.constant = TRUE,
xreg = lnpfabs,
lambda = 0)
coeftest(mqfab4x)
##
## z test of coefficients:
##
## Estimate Std. Error z value Pr(>|z|)
## ar1 0.7063496 0.1553820 4.5459 5.470e-06 ***
## ma1 -0.4289063 0.1535095 -2.7940 0.005206 **
## ma6 0.3918765 0.1500680 2.6113 0.009019 **
## sma1 -0.7375099 0.2735282 -2.6963 0.007012 **
## drift 0.0127051 0.0020026 6.3441 2.237e-10 ***
## pfab1 -0.1940457 0.2028249 -0.9567 0.338711
## pfab2 -0.1452253 0.1594926 -0.9105 0.362535
## pfab3 0.0411864 0.1288014 0.3198 0.749145
## pfab4 -1.1434493 0.2375941 -4.8126 1.490e-06 ***
## pfab6 0.4272293 0.1787154 2.3906 0.016823 *
## pfab7 -0.2041541 0.0959618 -2.1275 0.033383 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Now I drop pfab3:
<- Arima(qfabs[,"qfab4"], order = c(1,0,6),
mqfab4x seasonal = list(order=c(0,1,1),period=12),
fixed=c(NA,NA,0,0,0,0,NA,NA,NA,NA,NA,0,NA,0,NA,NA),
include.constant = TRUE,
xreg = lnpfabs,
lambda = 0)
coeftest(mqfab4x)
##
## z test of coefficients:
##
## Estimate Std. Error z value Pr(>|z|)
## ar1 0.7112106 0.1532052 4.6422 3.447e-06 ***
## ma1 -0.4311787 0.1533635 -2.8115 0.004931 **
## ma6 0.3885091 0.1510796 2.5716 0.010124 *
## sma1 -0.7398107 0.2764273 -2.6763 0.007443 **
## drift 0.0127165 0.0020064 6.3381 2.326e-10 ***
## pfab1 -0.1863923 0.2027396 -0.9194 0.357903
## pfab2 -0.1386816 0.1585221 -0.8748 0.381661
## pfab4 -1.1475783 0.2386049 -4.8095 1.513e-06 ***
## pfab6 0.4349775 0.1779510 2.4444 0.014511 *
## pfab7 -0.2069631 0.0958833 -2.1585 0.030890 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
I drop pfab2:
<- Arima(qfabs[,"qfab4"], order = c(1,0,6),
mqfab4x seasonal = list(order=c(0,1,1),period=12),
fixed=c(NA,NA,0,0,0,0,NA,NA,NA,NA,0,0,NA,0,NA,NA),
include.constant = TRUE,
xreg = lnpfabs,
lambda = 0)
coeftest(mqfab4x)
##
## z test of coefficients:
##
## Estimate Std. Error z value Pr(>|z|)
## ar1 0.7260142 0.1638397 4.4312 9.369e-06 ***
## ma1 -0.4800588 0.1674882 -2.8662 0.004154 **
## ma6 0.3814251 0.1439032 2.6506 0.008036 **
## sma1 -0.7436167 0.2774953 -2.6797 0.007368 **
## drift 0.0120168 0.0018676 6.4343 1.241e-10 ***
## pfab1 -0.1897519 0.2014832 -0.9418 0.346308
## pfab4 -1.1793780 0.2436953 -4.8396 1.301e-06 ***
## pfab6 0.4525333 0.1788069 2.5308 0.011379 *
## pfab7 -0.2100093 0.0960790 -2.1858 0.028830 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
I drop pfab1:
<- Arima(qfabs[,"qfab4"], order = c(1,0,6),
mqfab4x seasonal = list(order=c(0,1,1),period=12),
fixed=c(NA,NA,0,0,0,0,NA,NA,NA,0,0,0,NA,0,NA,NA),
include.constant = TRUE,
xreg = lnpfabs,
lambda = 0)
coeftest(mqfab4x)
##
## z test of coefficients:
##
## Estimate Std. Error z value Pr(>|z|)
## ar1 0.7220916 0.1574207 4.5870 4.496e-06 ***
## ma1 -0.4678493 0.1519663 -3.0786 0.002079 **
## ma6 0.3941961 0.1498912 2.6299 0.008541 **
## sma1 -0.7721748 0.3001826 -2.5724 0.010101 *
## drift 0.0107543 0.0012532 8.5818 < 2.2e-16 ***
## pfab4 -1.1299331 0.2358930 -4.7900 1.668e-06 ***
## pfab6 0.4761994 0.1762298 2.7022 0.006889 **
## pfab7 -0.2309810 0.0954122 -2.4209 0.015483 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
All independent variables and the AR and the seasonal MA terms are significant (pvalues<0.05). I stop here the ARIMAX calibration process.
I can re-run the same model including only the variables I need instead of fixing to 0 the value for the independent variables. I will get the same result:
<- Arima(qfabs[,"qfab4"], order = c(1,0,6),
mqfab4x seasonal = list(order=c(0,1,1),period=12),
fixed=c(NA,NA,0,0,0,0,NA,NA,NA,NA,NA,NA),
include.constant = TRUE,
xreg = lnpfabs[,c("pfab4","pfab6","pfab7")],
lambda = 0)
# I assign a variable to the coefficient table:
=coeftest(mqfab4x)
coefs coefs
##
## z test of coefficients:
##
## Estimate Std. Error z value Pr(>|z|)
## ar1 0.72209346 0.15566436 4.6388 3.505e-06 ***
## ma1 -0.46786882 0.15084700 -3.1016 0.0019247 **
## ma6 0.39419576 0.13903703 2.8352 0.0045799 **
## sma1 -0.77218837 0.29019826 -2.6609 0.0077932 **
## drift 0.01075430 0.00059463 18.0858 < 2.2e-16 ***
## pfab4 -1.12992075 0.03904113 -28.9418 < 2.2e-16 ***
## pfab6 0.47619743 0.13403670 3.5527 0.0003812 ***
## pfab7 -0.23098294 0.09532131 -2.4232 0.0153843 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Since all independent variables are significant, and also the coefficients of the ARIMA/SARIMA model, then I stop here the ARIMAX calibration process.
INTERPRETATION of the model:
CONSIDERING THE EFFECT OF PRICE CHANGES AND PREVIOUS SHOCKS, THE ANNUAL % DEMAND GROWTH OF PRODUCT 4 IS POSITIVE AND SIGNIFICANTLY RELATED WITH ITS OWN ANNUAL % GROWTH OF THE LAST MONTH. FOR EACH +1% ANNUAL GROWTH OF LAST MONTH, IT IS EXPECTED THAT THE CURRENT ANNUAL % GROWTH WILL MOVE IN ABOUT +0.7220935%. IN OTHER WORDS, 72.2093464% OF THE LAST-MONTH ANNUAL % GROWTH WILL IMPACT THE CURRENT ANNUAL % DEMAND GROWTH OF PRODUCT 4.
THE SHOCKS OF PREVIOUS MONTH AND 12 MONTHS AGO SIGNIFICANTLY AND NEGATIVELY IMPACT THE ANNUAL % DEMAND GROWTH OF PRODUCT 4 AFTER CONSIDERING THE EFFECT OF PRICE CHANGES AND LAST MONTH ANNUAL GROWTH.
THE SHOCK OF 6 MONTHS AGO SIGNIFICANTLY AND POSITIVELY IMPACT THE ANNUAL % DEMAND GROWTH OF PRODUCT 4 AFTER CONSIDERING THE EFFECT OF PRICE CHANGES AND LAST MONTH ANNUAL GROWTH.
NOW, RELATED TO THE PRICE CHANGE RELATIONSHIPS, HERE ARE THE INTERPRETATIONS:
1. THE AVERAGE DIRECT ELASTICITY OF PRODUCT 4 IS -1.1299207, AND THIS ELASTICITY IS SIGNIFICANT. IN OTHER WORDS, FOR EACH +1% CHANGE IN PRODUCT 4 PRICE, IT IS EXPECTED THAT ITS OWN DEMAND WILL MOVE IN ABOUT -1.1299207%, AFTER CONSIDERING THE EFFECT OF PRICE CHANGES OF PRODUCTS 6 AND 7, AND THE AUTOCORRELATIONS OF THE ARIMAX MODEL.
2. THE AVERAGE CROSS-ELASTICITY OF PRODCUT 6 IS 0.4761974. THIS MEANS THAT FOR EACH +1% CHANGE IN PRODUCT 6 PRICE, IT IS EXPECTED THAT THE PRODUCT 4 DEMAND WILL MOVE IN ABOUT 0.4761974%. SINCE THIS COEFFICIENT IS POSSITIVE AND SIGNIFICANT, THEN THIS IS AN EVIDENCE OF POSSIBLE COMPETITION BETWEEN PRODUCT 6 AND PRODUCT 4 SINCE THE PRODUCT 4 DEMAND DECREASES WHEN PRODUCT 6 PRICE DECREASES.
3. THE AVERAGE CROSS-ELASTICITY OF PRODUCT 7 IS -0.2309829. THIS MEANS THAT FOR EACH +1% CHANGE IN PRODUCT 7 PRICE, IT IS EXPECTED THAT THE PRODUCT 4 DEMAND WILL MOVE IN ABOUT 0.4761974%. SINCE THIS COEFFICIENT IS NEGATIVE AND SIGNIFICANT, THEN THIS IS AN EVIDENCE OF COMPLEMENTARITY BETWEEN PRODUCT 7 AND PRODUCT 4 SINCE THE PRODUCT 4 DEMAND INCREASES WHEN PRODUCT 7 PRICE DECREASES.
Finally, I do the forecast of product 4 demand with this ARIMAX model:
# Create a dataset for the forecast log prices of the explanatory variables:
<- cbind(forecastp4$mean,forecastp6$mean,forecastp7$mean)
pricesff <- log(pricesff)
lnpricesff colnames(lnpricesff) <- c("pfab4","pfab6","pfab7")
<- forecast(mqfab4x,xreg=lnpricesff, h=15)
qfab4forecast autoplot(qfab4forecast)
The forecast for product 4 demand is the following:
$mean qfab4forecast
## Jan Feb Mar Apr May Jun Jul
## 2020
## 2021 1046.9095 1164.6108 987.5000 954.5529 984.1217 928.3148 928.1803
## Aug Sep Oct Nov Dec
## 2020 864.4570 963.8279 1110.1721
## 2021 925.1270 927.7429 966.7506 1083.5664 1238.3819