R Markdown

This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.

When you click the Knit button a document will be generated that includes both content as well as the output of any embedded R code chunks within the document. You can embed an R code chunk like this:

1 Check your working directory 2 Set your working directory to “ANLY 580/RScript”. 3 Download “Inflation.xls” data file and set the “observation_date” variable to the date format and the “CPI” variable to the numeric format. The “CPI” variable represents the consumer price index, which indicates the relative prices of a consumer basket.

setwd("C:/Users/aziz/Documents/Harrisburg/CS Master/ANLY 565/RScript")
getwd()
## [1] "C:/Users/aziz/Documents/Harrisburg/CS Master/ANLY 565/RScript"
library(readxl)
Inf_data <- read_xls("Inflation.xls", col_types = c ("date","numeric"))
str(Inf_data)
## tibble [1,273 x 2] (S3: tbl_df/tbl/data.frame)
##  $ observation_date: POSIXct[1:1273], format: "1913-01-01" "1913-02-01" ...
##  $ CPI             : num [1:1273] 9.8 9.8 9.8 9.8 9.7 9.8 9.9 9.9 10 10 ...

4 Create two stand alone variables: “date” and “cpi”. “date” variable should represent the values of the “observation_date” variable from the “Inflation” data set, while, “cpi” variable should represent values of the “cpi” variable from the “Inflation” data set.

date <- Inf_data$observation_date
cpi <- Inf_data$CPI

5 Transform “cpi” variable from numeric format to the time series format by using ts() function. Label the new variable as “cpits”.

range(date)
## [1] "1913-01-01 UTC" "2019-01-01 UTC"
cpits <- ts(cpi, start = c(1913,1), end = c(2019,1), frequency = 12)
head(cpits,5)
## [1] 9.8 9.8 9.8 9.8 9.7

6 Please construct the following three graphs: 1)time series plot, 2) autocorrelation and 3) partial autocorrelation functions for the “cpits” variable. Based on the signature of these graphs, does the variable appear stationary? Explain

plot(cpits, main = "CPI 1913-2019", xlab = "Months", ylab = "CPI")

acf(cpits)

pacf(cpits)

the data appears to have a strong trend moving upward throughout the years therefore not stationnary

7 Use “cpits” variable and window() function to create 2 new variables called “cpi.pre”, “cpi.post”. The “cpi.pre” should include all observations for the period starting from January of 1990 and up until October 2018. The “cpi.post” should include all observations starting from November 2018. and up until the last month in the dataset.

cpi.pre <- window(cpits, start = c(1990,1), end = c(2018,1), frequency = 12)
cpi.post <- window(cpits, start = c(2018,11), end = c(2019,1), frequency = 12)

8 Use time() function and “cpi.pre” variable to create a variable called “Time”. Moreover, use “cpi.pre” variable and cycle() function to create a factor variable titled “Seas”.

Time <- time(cpi.pre)
Seas <- factor(cycle(cpi.pre))

9 Use lm() function to estimate parameter values of a linear regression model by regressing “Time”, and “Seas” on “cpi.pre”. Save these estimates as “cpi.lm”. Set the value of the intercept to 0, in order to interpret the coefficients of the seasonal dummy variables as seasonal intercepts. (Setting intercept to 0 ensures that for each season there is a unique intercept) Save these estimates as cpi.lm

cpi.lm <- lm(cpi.pre ~ 0 + Time +  Seas)
coef(cpi.lm)
##         Time        Seas1        Seas2        Seas3        Seas4        Seas5 
##     4.367881 -8564.638995 -8564.210600 -8563.627590 -8563.365188 -8563.269535 
##        Seas6        Seas7        Seas8        Seas9       Seas10       Seas11 
## -8563.198632 -8563.413265 -8563.444362 -8563.334674 -8563.648842 -8564.287118 
##       Seas12 
## -8564.963858

10 Create the following new items: “new.Time”- sequence of 12 values starting from 2018.75+1/12 and each number going up by 1/12 “new.Seas”- a vector with the following values c(11,12,1,2,3,4,5,6,7,8,9,10) “new.data”- a data frame that combines the “new.Time” and “new.Seas” variables.

new.Time <- seq(2018.75+1/12, length =12, by=1/12)
new.Time
##  [1] 2018.833 2018.917 2019.000 2019.083 2019.167 2019.250 2019.333 2019.417
##  [9] 2019.500 2019.583 2019.667 2019.750
new.Seas <- factor(c(11,12,1,2,3,4,5,6,7,8,9,10))
new.data<- data.frame(Time = new.Time, Seas = new.Seas)
new.data
##        Time Seas
## 1  2018.833   11
## 2  2018.917   12
## 3  2019.000    1
## 4  2019.083    2
## 5  2019.167    3
## 6  2019.250    4
## 7  2019.333    5
## 8  2019.417    6
## 9  2019.500    7
## 10 2019.583    8
## 11 2019.667    9
## 12 2019.750   10

11 Use predict() function and cpi.lm model to create a 12 month ahead forecast of the consumer price index. Save this forecast as “predict.lm”

predict.lm <- predict(cpi.lm, new.data)
predict.lm
##        1        2        3        4        5        6        7        8 
## 253.7366 253.4238 254.1127 254.9050 255.8520 256.4784 256.9381 257.3730 
##        9       10       11       12 
## 257.5223 257.8552 258.3289 258.3787

12 Collect residuals from the “cpi.lm” model and save them as “cpi.lm.resid”. Moreover, construct acf and pacf for the “cpi.lm.resid” series. Is the series stationary? Is there autocorrelation in the residual series?

cpi.lm.resid <- cpi.lm$residuals
acf(cpi.lm.resid)

pacf(cpi.lm.resid)

it shows that there is a strong correlation

13 Based on the AIC, identify the best order of ARMA model (without the seasonal component) for the cpi.lm.resid time series and estimate the value of the parameter coefficients. Please, consider any ARMA model with up to 3 AR and/or MA terms. Save these estimates as resid.best.arma. What is the order of resid.best.arma?

best.order <- c(0,0,0)
best.aic <- Inf
for (i in 0:3) for (j in 0:3) {
  fit.aic <- AIC(arima(cpi.lm.resid, order = c(i, 0,
                                               j)))
  if (fit.aic < best.aic) {
    best.order <- c(i, 0, j)
    resid.best.arma <- arima(cpi.lm.resid, order = best.order)
    best.aic <- fit.aic
  }}
resid.best.arma
## 
## Call:
## arima(x = cpi.lm.resid, order = best.order)
## 
## Coefficients:
##          ar1     ma1     ma2  intercept
##       0.9441  0.5793  0.1627    -0.0728
## s.e.  0.0182  0.0578  0.0549     0.7743
## 
## sigma^2 estimated as 0.2292:  log likelihood = -231.73,  aic = 473.46
best.order
## [1] 1 0 2

looks like best order is 1,0,2

14 Use predict() function and resid.best.arma to create a 12 period ahead forecast of cpi.lm.resid series. Save the forecasted values as resid.best.arma.pred

resid.best.arma.pred <- predict(resid.best.arma, n.ahead =  12)
resid.best.arma.pred$pred
## Time Series:
## Start = 338 
## End = 349 
## Frequency = 1 
##  [1] -1.4978454 -1.3342014 -1.2637141 -1.1971656 -1.1343358 -1.0750168
##  [7] -1.0190125 -0.9661377 -0.9162175 -0.8690867 -0.8245896 -0.7825789

15 Use ts() function to combine the cpi values forecaseted by cpi.lm model and the residual values forecasted by resid.best.arma. Lable this time series as cpi.pred

cpi.pred <- ts((predict.lm + resid.best.arma.pred$pred), start = c(2018,11), frequency =  12)
cpi.pred
##           Jan      Feb      Mar      Apr      May      Jun      Jul      Aug
## 2018                                                                        
## 2019 252.8489 253.7079 254.7177 255.4034 255.9191 256.4068 256.6061 256.9861
##           Sep      Oct      Nov      Dec
## 2018                   252.2387 252.0896
## 2019 257.5043 257.5961

16 Use ts.plot() function to plot cpi.pre and cpi.pred together on one graph. What do you expect will happen to the CPI during the next 12 month?

ts.plot(cbind(cpi.pre, cpi.pred), lty = 1:2)

the cpi will keep going up in the next 12 months

17 Please calculate mean absolute percentage error for the cpi.pred forecast for the first three month (November 2018, December 2018, January 2019) How accurate is the model?

mean(abs((cpi.post[1:3] - cpi.pred[1:3])/ cpi.post[1:3]))*100
## [1] 0.2907615

model is accurate with mape of 0.29%

18 What is the forecasted rate of inflation between December 2018 and January 2019? Hint: Inflation = % change in CPI

diff(log(cpi.pred[1:3]))*100
##           2           3 
## -0.05913051  0.30076564
diff(log(cpi.post[1:3]))*100
## [1] -0.3199074  0.1904781

forecasted inflation is -.06 vs -0.32

19 Policy makers often care more about inflation rather than cpi. Create a new stand alone variable that would represent the first log difference of the the cpits variable. Label this variable “pi”, which represents monthly inflation rate in the US. If percentage change is positive there is inflation (prices go up), and if the percentage change is negative there is deflation (prices fall). What was the lowest monthly rate of inflation(deflation) recorded in US during the time sample? What about was the highest?

pi <- diff(log(cpits))*100
summary(pi)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## -3.2088  0.0000  0.2397  0.2552  0.5451  5.7158

20 Please construct the time series plot, the autocorrelation and partial autocorrelation functions for the “pi” variable. Based on the signature of these graphs, does the variable appear stationary? Explain

plot(pi, main="Inflation 1913-2019", xlab="Months", ylab="Inflation %")

acf(pi)

pacf(pi)

there are period in which there is strong inflation then it stabilizes

21 Use “pi” variable and window() function to create 2 new variables called “pi.pre”, “pi.post”. The “pi.pre” should include all observations for the period starting from January of 1990 and up until October 2018. The “pi.post” should include all observations starting from November 2018. and up until the last month in the dataset.

pi.pre <- window(pi, start = c(1990,1), end=c(2018,10), frequency=12)
pi.post <- window(pi, start = c(2018,11), end=c(2019,1), frequency=12)

22 Please create a function that takes a time series as input, and then uses AIC to identify the best SARIMA model. The function should return the following: - the order of the best SARIMA, - its AIC - and the estimates of its coefficient values Lable this formula get.best.sarima

get.best.sarima <- function(x.ts, maxord = c(2,2,2,2,2,2))
{
  best.aic <- Inf
  n <- length(x.ts)
  for (p in 0:maxord[2]) for(d in 0:maxord[2]) for(q in 0:maxord[2])
    for (P in 0:maxord[2]) for(D in 0:maxord[2]) for(Q in 0:maxord[2])
    {
      fit <- arima(x.ts, order = c(p,d,q),
                   seas = list(order = c(P,D,Q),
                               frequency(x.ts)), method = "CSS")
      fit.aic <- -2 * fit$loglik + (log(n) + 1) * length(fit$coef)
      if (fit.aic < best.aic)
      {
        best.aic <- fit.aic
        best.fit <- fit
        best.order <- c(p,d,q,P,D,Q)
      }
    }
  list(best.order,best.aic, best.fit)
}

23 By using get.best.sarima() function please identify the best SARIMA model for pi.pre time series. Please cosider SARIMA(2,2,2,2,2,2) as the maximum order of the model. Save the results of the get.best.sarima() function as “pi.best.sarima” What is the order of the best SARIMA model?

AIC(arima((pi.pre), order=c(0,0,1), seas=list(order=c(2,1,1))))
## [1] 47.65343
AIC(arima((pi.pre), order=c(2,0,0), seas=list(order=c(2,0,1))))
## [1] 38.91891
pi.best.sarima <- get.best.sarima((pi.pre), maxord = c(2,2,2,2,2,2))
## Warning in arima(x.ts, order = c(p, d, q), seas = list(order = c(P, D, Q), :
## possible convergence problem: optim gave code = 1

## Warning in arima(x.ts, order = c(p, d, q), seas = list(order = c(P, D, Q), :
## possible convergence problem: optim gave code = 1

## Warning in arima(x.ts, order = c(p, d, q), seas = list(order = c(P, D, Q), :
## possible convergence problem: optim gave code = 1

## Warning in arima(x.ts, order = c(p, d, q), seas = list(order = c(P, D, Q), :
## possible convergence problem: optim gave code = 1

## Warning in arima(x.ts, order = c(p, d, q), seas = list(order = c(P, D, Q), :
## possible convergence problem: optim gave code = 1

## Warning in arima(x.ts, order = c(p, d, q), seas = list(order = c(P, D, Q), :
## possible convergence problem: optim gave code = 1

## Warning in arima(x.ts, order = c(p, d, q), seas = list(order = c(P, D, Q), :
## possible convergence problem: optim gave code = 1
pi.best.sarima
## [[1]]
## [1] 0 0 1 2 1 1
## 
## [[2]]
## [1] 52.45196
## 
## [[3]]
## 
## Call:
## arima(x = x.ts, order = c(p, d, q), seasonal = list(order = c(P, D, Q), frequency(x.ts)), 
##     method = "CSS")
## 
## Coefficients:
##          ma1     sar1     sar2     sma1
##       0.4615  -0.1775  -0.0910  -0.8422
## s.e.  0.0444   0.0600   0.0577   0.0379
## 
## sigma^2 estimated as 0.06311:  part log likelihood = -12.53

best order model is 0,0,1,2,1,1

24 Please use predict() function and the best.sarima.pi model to forecast monthly rate of inflation in the US during November 2018, December 2018 and January 2019. Save these predictions as pi.sarima.pred

pi.sarima.pred <- predict(pi.best.sarima[[3]], n.ahead=3)$pred
pi.sarima.pred
##             Jan Feb Mar Apr May Jun Jul Aug Sep Oct        Nov        Dec
## 2018                                                -0.1475132 -0.2388366
## 2019  0.2646618

25 Please calculate mean absolute percentage error of the best.sarima model. How accurate is the model?

mean(abs((pi.post[1:3] - pi.sarima.pred)/(pi.post[1:3])))*100
## [1] 40.10648
pi.post
##             Jan Feb Mar Apr May Jun Jul Aug Sep Oct        Nov        Dec
## 2018                                                -0.3354970 -0.3199074
## 2019  0.1904781
pi.sarima.pred
##             Jan Feb Mar Apr May Jun Jul Aug Sep Oct        Nov        Dec
## 2018                                                -0.1475132 -0.2388366
## 2019  0.2646618

looks like model can do better in accuracy since its mape is 40%

26 Extract the residual series from the pi.best.sarima model, and save them as sarima.resid.

sarima.resid <- pi.best.sarima[[3]]$residuals

27 Plot the acf of the sarima.resid series and acf of the sarima.resid^2 series What can you conclude based on these graphs?

plot(sarima.resid)

acf(sarima.resid)

pacf(sarima.resid)

trend graph shows there is no trend, no autocorrelation also

#28 Download fGarch package and upload it to the library

library(fGarch)
## NOTE: Packages 'fBasics', 'timeDate', and 'timeSeries' are no longer
## attached to the search() path when 'fGarch' is attached.
## 
## If needed attach them yourself in your R script by e.g.,
##         require("timeSeries")

29 Use garchFit() function from the fGarch package to estimate garch(1,1) model of the sarima.resid time series. By doing so you will be able to analyze the volatility of the inflation, or, in other words, how stable it is. Save the estimated coefficients as resid.garch

resid.garch <- garchFit(~garch(1,1), data=sarima.resid, include.mean =F, trace = F)
summary(resid.garch)
## 
## Title:
##  GARCH Modelling 
## 
## Call:
##  garchFit(formula = ~garch(1, 1), data = sarima.resid, include.mean = F, 
##     trace = F) 
## 
## Mean and Variance Equation:
##  data ~ garch(1, 1)
## <environment: 0x000000fc2475d8c0>
##  [data = sarima.resid]
## 
## Conditional Distribution:
##  norm 
## 
## Coefficient(s):
##      omega      alpha1       beta1  
## 0.00065719  0.33466123  0.72819655  
## 
## Std. Errors:
##  based on Hessian 
## 
## Error Analysis:
##         Estimate  Std. Error  t value Pr(>|t|)    
## omega  0.0006572   0.0002065    3.183  0.00146 ** 
## alpha1 0.3346612   0.0737839    4.536 5.74e-06 ***
## beta1  0.7281965   0.0433202   16.810  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Log Likelihood:
##  67.64266    normalized:  0.195499 
## 
## Description:
##  Sat Dec  3 14:11:24 2022 by user: aziz 
## 
## 
## Standardised Residuals Tests:
##                                 Statistic p-Value    
##  Jarque-Bera Test   R    Chi^2  104.6063  0          
##  Shapiro-Wilk Test  R    W      0.9648452 2.11466e-07
##  Ljung-Box Test     R    Q(10)  3.589069  0.9639872  
##  Ljung-Box Test     R    Q(15)  17.51113  0.2892379  
##  Ljung-Box Test     R    Q(20)  18.74426  0.5385011  
##  Ljung-Box Test     R^2  Q(10)  5.76984   0.8342153  
##  Ljung-Box Test     R^2  Q(15)  6.80949   0.9627221  
##  Ljung-Box Test     R^2  Q(20)  10.02731  0.967674   
##  LM Arch Test       R    TR^2   6.544643  0.8861823  
## 
## Information Criterion Statistics:
##        AIC        BIC        SIC       HQIC 
## -0.3736570 -0.3403064 -0.3738056 -0.3603767
coef(resid.garch)
##        omega       alpha1        beta1 
## 0.0006571871 0.3346612263 0.7281965470

30 The main priority of the monetary authority (Federal Reserve) in the United States is to ensure stable value of currency. Simply put, Fed wants to keep inflation stable (no volatility). To maintain stability the Fed depends on a number of tools, and its effectiveness is judged based on the forecasting model of volatility. Please use resid.garch variable and predict function to forecast two period ahead inflation volatility, which is measured by a square of the forecasted standard deviation. How stable will be the currency in February and March of 2019?

resid.garch.pi<- garchFit(~garch(1,1), data = pi, include.mean = F, trace = F)
frcst<- predict(resid.garch.pi, n.ahead = 2)
frcst$standardDeviation^2
## [1] 0.0964593 0.1022910

looks like the inflation will keep increasing in Feb & Mar

Note that the echo = FALSE parameter was added to the code chunk to prevent printing of the R code that generated the plot.