Introduction

American Eagle (AEO) is an American clothing and accessories retailer headquartered in the Southside Works neighborhood of Pittsburgh, Pennsylvania. AEO is the parent company of Aerie, its intimate apparel sub-brand. The brand's target market is male and female university students, although traction with older age segments has been increasing slowly over the years. Its offering ranges from high-quality, on-trend clothing to accessories and personal care products.

Most brick-and-mortar stores have seen difficult times as consumer preference has shifted to online shopping and e-commerce. Despite this troubling period for retail space in North America, apparel retailers have shown strong signs of a rebound over the last 2 years, thanks to an improving consumer spending environment. Shares of AEO roughly doubled over the past 12 months, but they have trended downwards since the company's disappointing 3rd quarter release.

The goal of this research paper is to investigate how various factors have affected the sales revenue of AEO Inc., currently one of the leading brands for millennials. These factors include the acceptance of internet commerce, increasing consumer comfort with technology (for example, ordering items through Amazon Alexa) and the launch of the American Eagle mobile application. Furthermore, AEO's focus on innovation, in particular its digitized distribution network, has led the company to perform better than its competitors. The paper will also try to answer a couple of important questions about the performance of management and of the CEO, Jay L. Schottenstein, who has been in power since Jan 2014 and has sold an estimated $2.4 million worth of shares over the last few months.

The report has 5 parts (Introduction, Data Description & Collection, Data Analysis, Forecasting and Conclusion). One focus is on how the sales numbers from before the innovation affect forecasting. The other is to observe whether the innovation led by AEO has actually brought financial gains or has simply increased costs with little to show in return.

Import Data

# Load libraries
library(fpp2)
library(knitr)
library(vars)

# Defining variables
dataset = read.csv("C:/ECON 4210/final-dataset.csv")
df = ts(dataset, start= c(2006, 3), frequency=4)
aeo_sales = df[, "AEO"]  

Data Description & Collection

As mentioned above, the data used for this paper is the quarterly sales revenue of AEO. To take the analysis to the next level relative to the previous term paper, it is important to take into account other data points that influence American Eagle revenue. We will be looking at 2 additional datasets (all figures in millions of dollars): the sales of GAP Incorporated, a major player in the apparel market, and the real GDP of the USA, one of the biggest economies in the world.

Data Analysis

1: General Properties

# Description
aeo_sales %>%
  stl(t.window=13, s.window="periodic", robust=TRUE) %>%
  autoplot()

STL provides a quick way to understand the general properties of the data. It decomposes the time series into trend, seasonal and remainder components, giving a holistic picture of the pattern in the data, its seasonality, any anomalies and the trend.

The data are non-stationary with increasing variance (graph 1). There is a general upward trend, which has been rising at a high rate since 2014 (graph 2), and very strong seasonality, which is expected given the nature of the business (graph 3). The last graph, the remainder (graph 4), swings between negative and positive values throughout the range. Specifically, unusual observations can be seen between 2009 and 2011, after which the anomalies start to die down. This is linked with the sluggish growth trend in the first couple of years (graph 2).
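
As a minimal sketch of what the decomposition does, the snippet below runs base R's `stl()` on a simulated quarterly series and checks that the three components add back up to the data. The series `y_demo` and all of its parameters are hypothetical and are not the AEO figures.

```r
# Hypothetical quarterly series: linear trend + fixed seasonal pattern + noise.
set.seed(42)
n_obs <- 48
seasonal_pattern <- rep(c(120, -40, -30, -50), length.out = n_obs)
y_demo <- ts(600 + 8 * seq_len(n_obs) + seasonal_pattern + rnorm(n_obs, sd = 10),
             start = c(2006, 3), frequency = 4)

# Same settings as the call above: periodic seasonality, robust fitting.
fit_demo <- stl(y_demo, t.window = 13, s.window = "periodic", robust = TRUE)
components <- fit_demo$time.series   # columns: seasonal, trend, remainder

# STL is additive: seasonal + trend + remainder reproduces the data.
reconstructed <- rowSums(components)
max_gap <- max(abs(reconstructed - as.numeric(y_demo)))
max_gap

# Rough share of the total variance attributable to each component.
round(apply(components, 2, var) / var(as.numeric(y_demo)), 3)
```

Because the remainder is defined as whatever the trend and seasonal components leave unexplained, large remainder values are a natural place to look for the unusual observations discussed above.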

2: Serial Correlation Analysis

tsdisplay(aeo_sales, main="Quarterly AEO Sales Data")

The first plot at the top shows the level of sales over the period, measured in millions of dollars. It shows a fluctuating pattern with a gradual increase in sales. The 2nd plot shows the ACF spiking at every 4th lag, which means that the series is strongly correlated with its own past values from the same quarter of earlier years. The troughs tend to be four quarters apart. It is unusual that the 14th lag touches the significance line on the negative side of the ACF. The 3rd plot (partial autocorrelation) shows an anomaly which needs further investigation.

3: Seasonal Analysis

yr = 100*diff(log(aeo_sales))
Acf(yr, main = "AEO q-o-q sales growth (%)", col="purple", lwd=3)

r4 is higher than the autocorrelations at the other lags. This is due to the seasonal pattern in the data: the peaks tend to be four quarters apart and the troughs tend to be four quarters apart. It is clear that there are big jumps in sales in the first quarter of each year. This might be due to the general retail discount season (near the holidays), when purchases are high.

It is interesting to note that sales drop steeply after the 1st quarter, plausibly because the discount season has ended, and then increase sluggishly over the other quarters.
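
As a rough illustration of why the lag-4 autocorrelation stands out for quarterly data, the sketch below builds a simulated sales series with a fixed seasonal factor and computes the autocorrelations of its quarter-on-quarter growth with base R's `acf()`. The series, growth rate and seasonal factors are all hypothetical.

```r
# Hypothetical quarterly sales: steady growth times a fixed seasonal factor.
set.seed(1)
n_obs <- 48
season_factor <- rep(c(1.30, 0.90, 0.95, 1.05), length.out = n_obs)
sales_demo <- ts(700 * 1.01^seq_len(n_obs) * season_factor *
                   exp(rnorm(n_obs, sd = 0.01)),
                 start = c(2006, 3), frequency = 4)

# Quarter-on-quarter growth in percent, as in the call above.
growth_demo <- 100 * diff(log(sales_demo))

# Sample autocorrelations at lags 1 to 8 (lag 0 dropped).
r <- acf(growth_demo, lag.max = 8, plot = FALSE)$acf[-1]
names(r) <- paste0("r", 1:8)
round(r, 2)
```

In this simulated series the growth rates repeat every four quarters, so the autocorrelation at lag 4 is much larger than at lags 1 to 3, which is the same signature described for the AEO data.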

4: Histogram & Normal Distribution

h <- hist(aeo_sales, col = "lightgray", xlab = "Sales ($ millions)", main = "AEO Sales")
xfit <- seq(min(aeo_sales), max(aeo_sales), length = 40) 
yfit <- dnorm(xfit, mean = mean(aeo_sales), sd = sd(aeo_sales)) 
yfit <- yfit * diff(h$mids[1:2]) * length(aeo_sales) 

lines(xfit, yfit, col = "black", lwd = 2)

The graph shows that the data are not normally distributed. The histogram shows that most of the quarterly sales are below the billion-dollar mark, i.e. in the lower part of the range, with a thinning tail on the right (a right-skewed distribution). The graph as a whole is in line with the analyst consensus of sluggish growth in the apparel market, where changes in the fashion industry and the continued growth of sophisticated e-commerce platforms have changed consumer expectations for purchases. It may be pivotal for AEO to reconsider its business and strategies in order to increase adaptability.
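
The visual check can be complemented with a formal one. The sketch below is an assumption about how one might do this, not part of the original analysis: it applies base R's `shapiro.test()` and a simple moment-based skewness measure to a hypothetical right-skewed sample.

```r
# Hypothetical right-skewed sample (not the AEO data).
x_demo <- exp(seq(0, 5, length.out = 50))

# Shapiro-Wilk test: the null hypothesis is that the sample is normal.
sw <- shapiro.test(x_demo)

# Simple moment-based skewness; positive values indicate a right tail.
skewness <- mean((x_demo - mean(x_demo))^3) / sd(x_demo)^3

c(W = unname(sw$statistic), p_value = sw$p.value, skewness = skewness)
```

Replacing `x_demo` with `as.numeric(aeo_sales)` would apply the same check to the sales series. A small p-value would support the conclusion drawn from the histogram.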

5: Summary Statistics

summary(aeo_sales)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   602.3   696.3   761.8   818.7   940.6  1229.0

Quarterly sales for AEO have been growing for the past few quarters due to a change in the consumer spending environment and strong adaptability to technological changes in the market, but the sales are strongly seasonal. The maximum quarterly sales made by AEO in the past 12 years is $1229 million and the minimum is $602.3 million. The mean is around $818.7 million.


Forecasting

We will first work through basic to mid-tier models, then compare their accuracy, and finally move on to more technical models.

Setup

We will use 80% of the dataset to train the models. There are currently 50 data points in the file, so 80% of the dataset is 50 × 0.8 = 40 data points.

train <- window(aeo_sales,start=c(2006, 3),end=c(2016, 1))
test <- window(aeo_sales, start=c(2016,2))
both <- window(aeo_sales,start=c(2006, 3))
h=length(test)
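
The 80% rule can be turned into a window boundary programmatically. The sketch below uses a placeholder series `y_demo` with the same start date and frequency as `aeo_sales`. Its values and the assumed length of 50 are hypothetical, and only the time index matters.

```r
# Placeholder series with the same start and frequency as aeo_sales;
# the values are hypothetical, only the time index matters here.
n_obs <- 50
y_demo <- ts(seq_len(n_obs), start = c(2006, 3), frequency = 4)

# Number of training observations under an 80/20 split.
n_train <- round(0.8 * n_obs)

# Split by position using the time index of the series.
t_index <- time(y_demo)
train_demo <- window(y_demo, end = t_index[n_train])
test_demo  <- window(y_demo, start = t_index[n_train + 1])

c(train = length(train_demo), test = length(test_demo))

# For comparison: number of quarters from 2006 Q3 to 2016 Q1 inclusive.
n_to_2016q1 <- length(window(y_demo, end = c(2016, 1)))
n_to_2016q1
```

Under these assumptions a 40-observation training set would end at 2016 Q2, whereas a window ending at 2016 Q1 holds 39 observations, so the split used above is close to, rather than exactly, 80%.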

1: Benchmarks

It is important to establish benchmarks (i.e. mean, seasonal naïve, naïve and drift models) before creating our models.

Training

Since the data show strong seasonality, we can expect a good fit from the seasonal naïve technique.

# 4 basic models
fit1 <- meanf(train, h=h)
fit2 <- naive(train, h=h)
fit3 <- snaive(train, h=h)
fit4 <- rwf(train, h=h, drift=TRUE)

# Saving plots to variables
p1 = autoplot(fit1)
p2 = autoplot(fit2)
p3 = autoplot(fit3)
p4 = autoplot(fit4)

Graph Plots

gridExtra::grid.arrange(p1, p2, p3, p4, nrow=2)

The graphs are as expected: the forecasts from the mean model (graph 1), the naïve model (graph 2) and the random walk with drift model (graph 4) fail to capture the seasonality in the data. As expected, seasonal naïve (graph 3) did a much better job than the other models and will therefore be the most important benchmark for reviewing future models.
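
To make the difference between the benchmarks concrete, here is a minimal sketch of the naïve and seasonal naïve rules written by hand in base R. The helper names `naive_by_hand` and `snaive_by_hand` and the data are hypothetical. The `naive()` and `snaive()` functions used above also produce prediction intervals, which this sketch omits.

```r
# Hypothetical quarterly training data (two years).
train_demo <- ts(c(700, 650, 680, 900,
                   720, 660, 700, 950),
                 start = c(2014, 1), frequency = 4)

# Seasonal naive: each forecast equals the last observed value
# from the same quarter.
snaive_by_hand <- function(y, h) {
  m <- frequency(y)
  last_cycle <- tail(as.numeric(y), m)
  rep(last_cycle, length.out = h)
}

# Naive: every forecast equals the last observation.
naive_by_hand <- function(y, h) rep(tail(as.numeric(y), 1), h)

snaive_by_hand(train_demo, h = 6)
naive_by_hand(train_demo, h = 6)
```

The seasonal naïve forecast repeats the last full year of quarterly values, which is why it is the only one of the four benchmarks that can reproduce a seasonal pattern.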

2: Time Series Regression

Training

# Multiple linear model
tps <- tslm(train ~ trend + season)
fit5 = forecast(tps,h=h)

# Linear model 
lnm = tslm(train ~ trend, data=train)
fit6 = forecast(lnm,h=h)

# Saving plots to variables
p5 = autoplot(fit5, main = 'Forecasting using Multiple Linear Model')
p6 = autoplot(fit6, main = 'Forecasting using Linear Model')

Graph Plots

gridExtra::grid.arrange(p5, p6, nrow=2)

The multiple linear model did a great job of capturing both trend and seasonality in the data, as we can see from the graph above, and makes reasonable predictions into the future. The linear model, on the other hand, performed poorly, which is understandable since it ignores seasonality and so doesn't give a holistic picture. An interesting question is whether the linear model performed better than the benchmarks.

3: Time Series Decomposition

Training

# stl method
y.stl <- stl(train, t.window=15, s.window="periodic", robust=TRUE)
fit7 <- forecast(y.stl, method="rwdrift",h=h)

# saving plots to variables
p7 = autoplot(fit7, main = "Forecasting using STL Model")

Graph Plots

p7

4: Exponential Smoothing

Training

# simple exponential moving averages
fit8 <- ses(train, h = h)
p8 = autoplot(fit8)

# holt's linear trend method
fit9 <- holt(train,  h=h) 
p9 = autoplot(fit9)

# holt's damped trend method
fit10 <- holt(train, damped=TRUE, h=h) 
p10 = autoplot(fit10)

# holt winter's  method
fit11 <- hw(train, seasonal="multiplicative", h=h) 
p11 = autoplot(fit11)

# ETS  method
y.ets <- ets(train, model="ZZZ") 
fit12 <- forecast(y.ets, h=h)
p12 = autoplot(fit12)

Graph Plots

gridExtra::grid.arrange(p8, p9, p10, p11, p12, nrow=3)

The first graph (simple exponential smoothing) is the worst of the models plotted, since it smooths out the real data points completely.

Simple exponential smoothing, Holt's method and damped Holt's method are unlikely to be good at forecasting the real sales data, because none of them models seasonality. The other 2 models capture different patterns and attach different values to them, so we can't really compare them by eye. We will look at the accuracy measures to make a better decision.

The forecasts from the Holt-Winters multiplicative method and ETS come with wide prediction intervals, so the real values are likely to fall inside them, but intervals this wide can limit the practical value of a model.

If we had to choose a model from the graphs alone, Holt-Winters might be the more appropriate one, since its prediction interval is narrower than those of the other models.

Accuracy Measures (Basic Models)

#first run
a1 = accuracy(fit1, test)
a2 = accuracy(fit2, test)
a3 = accuracy(fit3, test)
a4 = accuracy(fit4, test)
a5 = accuracy(fit5, test)
a6 = accuracy(fit6, test)

#2nd run
a7 = accuracy(fit7, test)
a8 = accuracy(fit8, test)
a9 = accuracy(fit9, test)
a10 = accuracy(fit10, test)

#3rd run
a11 = accuracy(fit11, test)
a12 = accuracy(fit12, test)

a.table<-rbind(a1, a2, a3, a4, a5, a6, a7, a8, a9, a10, a11, a12)
row.names(a.table)<-c('Mean training','Mean test',  
                        'Naive training', 'Naive test', 
                        'S. Naive training', 'S. Naive test',
                        'Random walk with drift training','Random walk with drift test',
                        'Multiple Linear trend training','Multiple Linear trend test',
                        'Linear trend training','Linear trend test',
                        'STL Training', 'STL Test',
                        'SES training','SES test', 
                        'Holt linear training', 'Holt linear test',
                        'Holt dampled training','Holt damped test', 
                        'Holt Winters training', 'Holt Winters test', 
                        'ETS training', 'ETS test')
                    
a.table<-as.data.frame(a.table)
a.table<-a.table[order(a.table$MASE),]
kable(a.table)
| Model | ME | RMSE | MAE | MPE | MAPE | MASE | ACF1 | Theil’s U |
|---|---|---|---|---|---|---|---|---|
| Holt Winters training | -4.118300 | 34.76725 | 28.15606 | -0.6832755 | 3.559746 | 0.5902136 | -0.0077864 | NA |
| ETS training | 4.147257 | 36.28635 | 29.54061 | 0.3916160 | 3.745230 | 0.6192369 | -0.0066783 | NA |
| STL Training | 0.000000 | 39.39929 | 31.63552 | 0.0286537 | 4.140146 | 0.6631510 | -0.2617571 | NA |
| Multiple Linear trend training | 0.000000 | 39.82614 | 31.87458 | -0.2184208 | 4.074160 | 0.6681622 | 0.5305045 | NA |
| Holt Winters test | 21.585923 | 46.13377 | 33.67314 | 2.4527925 | 3.554635 | 0.7058639 | 0.2077890 | 0.2673476 |
| STL Test | 30.129913 | 51.98004 | 38.52377 | 3.0746137 | 3.945975 | 0.8075440 | 0.1704033 | 0.3014381 |
| S. Naive training | 18.218571 | 57.53526 | 47.70486 | 2.0879073 | 5.866728 | 1.0000000 | 0.6394510 | NA |
| Multiple Linear trend test | 50.074533 | 66.28132 | 50.93137 | 5.2592512 | 5.337359 | 1.0676349 | 0.2089291 | 0.3809775 |
| S. Naive test | 65.227300 | 83.59460 | 67.02730 | 7.1613546 | 7.325438 | 1.4050414 | 0.5209891 | 0.4457402 |
| ETS test | 68.448439 | 87.30851 | 72.42406 | 7.4381400 | 7.800549 | 1.5181696 | 0.3632369 | 0.4988174 |
| Holt damped test | 3.397205 | 137.79840 | 111.65039 | -1.7712256 | 11.937342 | 2.3404406 | -0.1802179 | 0.6818558 |
| Linear trend test | 8.282571 | 140.51898 | 114.71423 | -1.2909644 | 12.219701 | 2.4046656 | -0.1539917 | 0.6951404 |
| Holt linear test | 21.749927 | 142.52450 | 115.34752 | 0.1947408 | 12.104238 | 2.4179408 | -0.1479176 | 0.7146042 |
| Linear trend training | 0.000000 | 141.61060 | 117.97126 | -2.9768973 | 14.746082 | 2.4729402 | -0.1282333 | NA |
| Holt dampled training | 20.990557 | 145.30881 | 118.21844 | -0.3640173 | 14.385819 | 2.4781217 | -0.1265835 | NA |
| Holt linear training | -3.942584 | 141.91507 | 119.58268 | -3.5591935 | 15.038567 | 2.5067192 | -0.1304997 | NA |
| SES test | 83.825980 | 167.28818 | 120.86424 | 7.0268794 | 11.876535 | 2.5335834 | -0.1079347 | 0.8860945 |
| SES training | 26.246628 | 152.46069 | 124.65234 | 0.0503824 | 15.142447 | 2.6129906 | -0.1033246 | NA |
| Mean training | 0.000000 | 151.64574 | 131.26776 | -3.4371502 | 16.506539 | 2.7516644 | -0.0281470 | NA |
| Mean test | 126.374505 | 192.16934 | 141.30322 | 11.7616103 | 13.740111 | 2.9620301 | -0.1079347 | 1.0424207 |
| Random walk with drift training | 0.000000 | 211.70969 | 171.37078 | -3.6922669 | 22.641266 | 3.5923129 | -0.3230413 | NA |
| Naive training | 13.254474 | 212.12419 | 177.80921 | -1.9756869 | 23.307658 | 3.7272769 | -0.3230413 | NA |
| Naive test | -186.673700 | 236.23211 | 211.27370 | -23.0738862 | 25.075513 | 4.4287671 | -0.1079347 | 1.1509595 |
| Random walk with drift test | -259.573305 | 292.80548 | 262.96615 | -30.9426632 | 31.218729 | 5.5123558 | -0.2031001 | 1.4817486 |

Benchmark Accuracy

Seasonal naïve performed better than the other 3 benchmarks. Its test MAPE, which shows the general accuracy of the forecasts, is the lowest of the four (7.33), so we will regard it as the most important benchmark. We will now break down the performance of each model relative to the benchmarks.
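
For reference, the main columns of the table can be reproduced by hand. The sketch below computes ME, RMSE, MAE, MAPE and MASE in base R for a small hypothetical example; none of the numbers are AEO data.

```r
# Hypothetical quarterly data (not the AEO series).
train_demo    <- c(700, 650, 680, 900, 720, 660, 700, 950)
actual_demo   <- c(750, 680, 720, 1000)
forecast_demo <- c(720, 660, 700, 950)   # seasonal naive forecast
m <- 4                                   # seasonal period (quarters)

err <- actual_demo - forecast_demo

me   <- mean(err)                              # mean error (bias)
rmse <- sqrt(mean(err^2))                      # root mean squared error
mae  <- mean(abs(err))                         # mean absolute error
mape <- mean(abs(100 * err / actual_demo))     # mean absolute percentage error

# MASE scales the MAE by the in-sample MAE of the seasonal naive method,
# which is why 'S. Naive training' has a MASE of exactly 1 in the table.
scale_q <- mean(abs(diff(train_demo, lag = m)))
mase <- mae / scale_q

round(c(ME = me, RMSE = rmse, MAE = mae, MAPE = mape, MASE = mase), 3)
```

A MASE below 1 on the test set therefore means that a model's average error is smaller than the average in-sample error of seasonal naïve.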

Detailed Breakdown

Time Series Decomposition

We will now look at the STL decomposition technique and compare it with the benchmarks. On the test set, STL decomposition performed better than the mean, naïve and seasonal naïve benchmarks on every measure, for example a MAPE of 3.95 against 7.33 for seasonal naïve.

Time Series Regression

The multiple linear trend model, which captures both trend and seasonality, beat all of the benchmarks, including seasonal naïve, on ME, RMSE, MAE, MPE, MAPE and MASE. The linear trend model beat the mean and naïve benchmarks but was not as good as seasonal naïve on RMSE, MAE, MAPE and MASE.

ETS

The ETS model beat the mean and naïve benchmarks on every measure but was slightly worse than seasonal naïve throughout (for example, a MASE of 1.52 against 1.41).

Exponential Smoothing

Exponential smoothing techniques attach greater weight to recent values than to older values and are thus considered to provide better forecasting results.
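
The weighting scheme can be shown with a short worked example of simple exponential smoothing. The observations, the smoothing parameter `alpha` and the choice to initialise the level with the first observation are all assumptions made for illustration. `ses()` estimates both the parameter and the initial level from the data.

```r
# Hypothetical observations and smoothing parameter.
y_demo <- c(10, 12, 11, 13)
alpha  <- 0.5

# Weight on the observation j periods back: alpha * (1 - alpha)^j.
weights <- alpha * (1 - alpha)^(0:3)
weights

# Level recursion: l_t = alpha * y_t + (1 - alpha) * l_{t-1},
# initialised here (an assumption) with the first observation.
level <- y_demo[1]
for (obs in y_demo) {
  level <- alpha * obs + (1 - alpha) * level
}
level   # one-step-ahead SES forecast
```

With `alpha = 0.5` the most recent observation carries half of the weight and each older observation half as much again, so the weights decay geometrically.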

Holt's linear method has results similar to the linear trend model. Against seasonal naïve, its forecasts are better on only 2 measures: ME (21.7 vs 65.2) and MPE (0.19 vs 7.16). The method captured the trend but failed to capture the seasonality, and so performed poorly when compared with seasonal naïve.

The Holt-Winters multiplicative method has results similar to STL. It is better than seasonal naïve on every measure, including RMSE (46.1 vs 83.6) and MAPE (3.55 vs 7.33).

Holt's damped method has results similar to Holt's linear method. It beats the mean and naïve benchmarks on every error measure, but against seasonal naïve its forecasts are better on only 2 measures: ME (3.4 vs 65.2) and MPE (-1.77 vs 7.16). This, again, is in line with our earlier hypothesis.

The SES model beat the mean and naïve benchmarks on every error measure. Compared with seasonal naïve, it performed worse on every measure except MPE (7.03 vs 7.16), where the difference is negligible. This is in line with our earlier hypothesis.

Takeaway

Among the basic models, the Holt-Winters method outperformed the benchmarks and all other forecasting models on the test set on RMSE, MAE, MAPE and MASE. Only on ME and MPE did a few models score closer to zero. The method is therefore the best one for this dataset and should be used to make forecasting decisions.

STL (time series decomposition) is the second best.

5: Advanced Forecasting Models

Methodology

Since these are advanced techniques, I will not be including the previous models in this analysis.

1: ARIMA

General
ndiffs(aeo_sales)
## [1] 1

This shows that only one difference is required, so I am setting d = 1.

Training
# arima
fit13 = forecast(auto.arima(train, d=1), h=h)
p13 = autoplot(fit13)
Graph Plots
autoplot(aeo_sales, series = "Actual") + 
  ggtitle("AEO Quarterly Sales (millions $)") +
  autolayer(fit13, PI = FALSE, series = "ARIMA") +
  ylab("Revenue ($ Millions)")+
  theme(plot.title = element_text(size=15, face = "bold",hjust = 0.5) ,
        axis.text.x = element_text(size=10),
        axis.text.y = element_text(size=10))

The ARIMA model seemed to capture the first half of the test period pretty well but underestimated the sales in the second half. Overall, the forecasts track the actual series closely, which shows a good performance.

Vector Auto Regression (VAR)

Correlation
cor(df[,2:4])
##               AEO       GAP   USA.GDP
## AEO     1.0000000 0.8577768 0.4667728
## GAP     0.8577768 1.0000000 0.2758253
## USA.GDP 0.4667728 0.2758253 1.0000000

The table shows the correlation among the variables: US real GDP, GAP sales and AEO sales. There is a strong positive correlation between GAP and AEO (around 0.86). It is to be noted, however, that correlation does not mean causation. GAP sales and AEO sales are both positively correlated with real GDP, with the stronger correlation being with AEO sales (0.47 vs 0.28). We will keep this in mind while building the VAR model.

vardata = log (df[,c(4,3,2)])
plot(vardata, main = "VAR Data (log of $ millions)", xlab = "Year")

Before we create the equations, it is useful to look at the data to understand the general pattern of the variables. The series trend over time, so in principle they need to be differenced or detrended. The graph is in line with what we expected from the correlation between AEO sales and GAP sales, i.e. the general pattern looks similar. US GDP shows an increasing trend at a high rate starting from 2008.

Looking at the plot of AEO sales, I believe that adding a time trend as well as an intercept to each equation of the model will be helpful.

Model Selection

At least 2 full years' worth of data is required for the dataset on hand, which is why I set lag.max = 9. The R output shows the lag length selected by each of the information criteria available in the vars package.

Each candidate model is then checked with a Portmanteau test on its residuals. The null hypothesis of no autocorrelation is rejected when the p-value is lower than the significance level of 0.05. Since autocorrelation is an undesirable feature of the model, we will look for a model that has a p-value greater than 0.05.

VAR select
VARselect(vardata, lag.max = 9, type = "both", season=4)
## $selection
## AIC(n)  HQ(n)  SC(n) FPE(n) 
##      9      9      9      9 
## 
## $criteria
##                    1             2             3             4
## AIC(n) -2.377535e+01 -2.361862e+01 -2.342835e+01 -2.403766e+01
## HQ(n)  -2.340896e+01 -2.311483e+01 -2.278717e+01 -2.325909e+01
## SC(n)  -2.276202e+01 -2.222529e+01 -2.165502e+01 -2.188434e+01
## FPE(n)  4.804167e-11  5.774249e-11  7.335693e-11  4.320814e-11
##                    5             6             7             8
## AIC(n) -2.447483e+01 -2.468719e+01 -2.491922e+01 -2.598952e+01
## HQ(n)  -2.355886e+01 -2.363383e+01 -2.372847e+01 -2.466136e+01
## SC(n)  -2.194151e+01 -2.177387e+01 -2.162591e+01 -2.231620e+01
## FPE(n)  3.156448e-11  3.067954e-11  3.193137e-11  1.644692e-11
##                    9
## AIC(n) -2.743855e+01
## HQ(n)  -2.597300e+01
## SC(n)  -2.338524e+01
## FPE(n)  7.272830e-12

The suggested model is with 9 lags.

var1 <- VAR(vardata, p=9, type="both", season=4)
serial.test(var1, lags.pt=16, type="PT.asymptotic")
## 
##  Portmanteau Test (asymptotic)
## 
## data:  Residuals of VAR object var1
## Chi-squared = 263.69, df = 63, p-value < 2.2e-16

This model is rejected: the p-value is below 0.05, so the residuals are autocorrelated, which is undesirable. We will now look at lower lag orders to find a reasonable model. VAR(9) and all models down to VAR(6) show some residual serial correlation, therefore we fit VAR(5).

var1 <- VAR(vardata, p=5, type="both", season=4)
serial.test(var1, lags.pt=16, type="PT.asymptotic")
## 
##  Portmanteau Test (asymptotic)
## 
## data:  Residuals of VAR object var1
## Chi-squared = 103.12, df = 99, p-value = 0.3683

For the VAR(5) model, the test finds that the null of no autocorrelation cannot be rejected, because the p-value of 0.3683 is greater than the significance level of 0.05. Since there is not enough evidence of autocorrelation, we will go forward with this model.

Summarising result
summary(var1)
## 
## VAR Estimation Results:
## ========================= 
## Endogenous variables: USA.GDP, GAP, AEO 
## Deterministic variables: both 
## Sample size: 44 
## Log Likelihood: 381.312 
## Roots of the characteristic polynomial:
## 0.8889 0.8889 0.838 0.838 0.8285 0.8285 0.8234 0.8234 0.8169 0.8169 0.7919 0.7919 0.6794 0.6794 0.05244
## Call:
## VAR(y = vardata, p = 5, type = "both", season = 4L)
## 
## 
## Estimation results for equation USA.GDP: 
## ======================================== 
## USA.GDP = USA.GDP.l1 + GAP.l1 + AEO.l1 + USA.GDP.l2 + GAP.l2 + AEO.l2 + USA.GDP.l3 + GAP.l3 + AEO.l3 + USA.GDP.l4 + GAP.l4 + AEO.l4 + USA.GDP.l5 + GAP.l5 + AEO.l5 + const + trend + sd1 + sd2 + sd3 
## 
##              Estimate Std. Error t value Pr(>|t|)   
## USA.GDP.l1  0.9170142  0.2688484   3.411   0.0023 **
## GAP.l1      0.0595632  0.0417195   1.428   0.1663   
## AEO.l1     -0.0111618  0.0281665  -0.396   0.6954   
## USA.GDP.l2 -0.0212623  0.3522493  -0.060   0.9524   
## GAP.l2     -0.0203847  0.0378716  -0.538   0.5954   
## AEO.l2      0.0301949  0.0303618   0.995   0.3299   
## USA.GDP.l3  0.0201431  0.3138269   0.064   0.9494   
## GAP.l3      0.0443649  0.0351573   1.262   0.2191   
## AEO.l3     -0.0159078  0.0294713  -0.540   0.5943   
## USA.GDP.l4 -0.2504267  0.3269035  -0.766   0.4511   
## GAP.l4     -0.0075502  0.0392962  -0.192   0.8493   
## AEO.l4     -0.0497927  0.0312633  -1.593   0.1243   
## USA.GDP.l5  0.2721984  0.2040182   1.334   0.1947   
## GAP.l5     -0.0334080  0.0363057  -0.920   0.3666   
## AEO.l5      0.0028176  0.0263846   0.107   0.9158   
## const       0.9656267  1.3267631   0.728   0.4738   
## trend       0.0006297  0.0003209   1.962   0.0615 . 
## sd1         0.0075324  0.0202273   0.372   0.7129   
## sd2         0.0145453  0.0209581   0.694   0.4943   
## sd3         0.0302307  0.0167162   1.808   0.0831 . 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## 
## Residual standard error: 0.005643 on 24 degrees of freedom
## Multiple R-Squared: 0.9955,  Adjusted R-squared: 0.9919 
## F-statistic: 277.6 on 19 and 24 DF,  p-value: < 2.2e-16 
## 
## 
## Estimation results for equation GAP: 
## ==================================== 
## GAP = USA.GDP.l1 + GAP.l1 + AEO.l1 + USA.GDP.l2 + GAP.l2 + AEO.l2 + USA.GDP.l3 + GAP.l3 + AEO.l3 + USA.GDP.l4 + GAP.l4 + AEO.l4 + USA.GDP.l5 + GAP.l5 + AEO.l5 + const + trend + sd1 + sd2 + sd3 
## 
##             Estimate Std. Error t value Pr(>|t|)   
## USA.GDP.l1 -0.439263   1.821410  -0.241  0.81147   
## GAP.l1      0.674769   0.282644   2.387  0.02520 * 
## AEO.l1      0.049729   0.190824   0.261  0.79662   
## USA.GDP.l2 -1.593698   2.386439  -0.668  0.51062   
## GAP.l2      0.111330   0.256574   0.434  0.66823   
## AEO.l2      0.087280   0.205697   0.424  0.67512   
## USA.GDP.l3  2.832051   2.126133   1.332  0.19537   
## GAP.l3     -0.179678   0.238186  -0.754  0.45797   
## AEO.l3      0.269469   0.199664   1.350  0.18973   
## USA.GDP.l4 -3.837447   2.214725  -1.733  0.09598 . 
## GAP.l4      0.575961   0.266226   2.163  0.04069 * 
## AEO.l4     -0.258149   0.211805  -1.219  0.23476   
## USA.GDP.l5  2.220235   1.382194   1.606  0.12128   
## GAP.l5     -0.253428   0.245966  -1.030  0.31312   
## AEO.l5     -0.019086   0.178752  -0.107  0.91586   
## const      13.217798   8.988631   1.471  0.15442   
## trend       0.003821   0.002174   1.757  0.09164 . 
## sd1         0.098445   0.137037   0.718  0.47946   
## sd2         0.152769   0.141988   1.076  0.29266   
## sd3         0.320670   0.113250   2.832  0.00923 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## 
## Residual standard error: 0.03823 on 24 degrees of freedom
## Multiple R-Squared: 0.9328,  Adjusted R-squared: 0.8796 
## F-statistic: 17.53 on 19 and 24 DF,  p-value: 9.567e-10 
## 
## 
## Estimation results for equation AEO: 
## ==================================== 
## AEO = USA.GDP.l1 + GAP.l1 + AEO.l1 + USA.GDP.l2 + GAP.l2 + AEO.l2 + USA.GDP.l3 + GAP.l3 + AEO.l3 + USA.GDP.l4 + GAP.l4 + AEO.l4 + USA.GDP.l5 + GAP.l5 + AEO.l5 + const + trend + sd1 + sd2 + sd3 
## 
##              Estimate Std. Error t value Pr(>|t|)  
## USA.GDP.l1  1.239e-01  1.900e+00   0.065   0.9486  
## GAP.l1      8.499e-02  2.948e-01   0.288   0.7756  
## AEO.l1      5.256e-01  1.991e-01   2.640   0.0143 *
## USA.GDP.l2 -4.083e-01  2.489e+00  -0.164   0.8711  
## GAP.l2      1.852e-01  2.676e-01   0.692   0.4957  
## AEO.l2     -8.251e-02  2.146e-01  -0.385   0.7040  
## USA.GDP.l3  1.746e+00  2.218e+00   0.787   0.4387  
## GAP.l3     -2.466e-01  2.485e-01  -0.992   0.3309  
## AEO.l3      3.613e-02  2.083e-01   0.173   0.8637  
## USA.GDP.l4 -1.871e+00  2.310e+00  -0.810   0.4259  
## GAP.l4      8.245e-02  2.777e-01   0.297   0.7691  
## AEO.l4      1.508e-01  2.209e-01   0.682   0.5015  
## USA.GDP.l5  1.600e+00  1.442e+00   1.110   0.2782  
## GAP.l5     -2.288e-01  2.566e-01  -0.892   0.3813  
## AEO.l5     -4.156e-01  1.865e-01  -2.229   0.0354 *
## const      -1.351e+01  9.376e+00  -1.441   0.1625  
## trend       6.917e-04  2.268e-03   0.305   0.7630  
## sd1         6.657e-02  1.429e-01   0.466   0.6456  
## sd2         2.143e-01  1.481e-01   1.447   0.1609  
## sd3         3.221e-01  1.181e-01   2.726   0.0118 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## 
## Residual standard error: 0.03988 on 24 degrees of freedom
## Multiple R-Squared: 0.9745,  Adjusted R-squared: 0.9543 
## F-statistic: 48.21 on 19 and 24 DF,  p-value: 1.216e-14 
## 
## 
## 
## Covariance matrix of residuals:
##           USA.GDP       GAP       AEO
## USA.GDP 3.184e-05 0.0001395 2.636e-05
## GAP     1.395e-04 0.0014615 5.407e-04
## AEO     2.636e-05 0.0005407 1.590e-03
## 
## Correlation matrix of residuals:
##         USA.GDP    GAP    AEO
## USA.GDP  1.0000 0.6465 0.1171
## GAP      0.6465 1.0000 0.3547
## AEO      0.1171 0.3547 1.0000

We are given one equation per variable, and the adjusted R-squared helps us judge how each equation performed. The 1st equation (USA.GDP) shows a superior result (around 99%), so it does a great job as a model. The GAP equation shows a pretty good result as well, around 88%. The final equation (AEO) also shows a really good result: it effectively captures 95% of the variability.

Roots
roots(var1)
##  [1] 0.88888137 0.88888137 0.83802539 0.83802539 0.82851051 0.82851051
##  [7] 0.82335969 0.82335969 0.81691561 0.81691561 0.79189048 0.79189048
## [13] 0.67937285 0.67937285 0.05244259

All roots are less than 1 (the largest is about 0.89 and the average is about 0.76), which indicates that the model is stable.
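
The stability condition can be illustrated with a small example. The sketch below builds the companion matrix of a hypothetical 2-variable VAR(2) and computes the moduli of its eigenvalues, which is my understanding of what `roots()` reports. The coefficient matrices `A1` and `A2` are made up for illustration.

```r
# Hypothetical coefficient matrices of a 2-variable VAR(2):
# y_t = A1 %*% y_{t-1} + A2 %*% y_{t-2} + e_t
A1 <- matrix(c(0.5, 0.1,
               0.0, 0.4), nrow = 2, byrow = TRUE)
A2 <- matrix(c(0.2, 0.0,
               0.0, 0.1), nrow = 2, byrow = TRUE)

# Companion matrix: coefficient blocks on top, identity block below.
k <- nrow(A1)
companion <- rbind(cbind(A1, A2),
                   cbind(diag(k), matrix(0, k, k)))

# The VAR is stable when every eigenvalue has modulus below 1.
moduli <- sort(Mod(eigen(companion)$values), decreasing = TRUE)
round(moduli, 4)
all(moduli < 1)
```

A modulus at or above 1 would mean that shocks never die out, in which case the impulse responses and forecasts from the model would not be reliable.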

Granger Causality
causality(var1, cause= c("AEO","USA.GDP"))
## $Granger
## 
##  Granger causality H0: USA.GDP AEO do not Granger-cause GAP
## 
## data:  VAR object var1
## F-Test = 0.89344, df1 = 10, df2 = 72, p-value = 0.5436
## 
## 
## $Instant
## 
##  H0: No instantaneous causality between: USA.GDP AEO and GAP
## 
## data:  VAR object var1
## Chi-squared = 14.605, df = 2, p-value = 0.0006737

The null hypothesis that US real GDP and AEO sales do not Granger-cause GAP sales cannot be rejected, because the p-value (0.5436) is greater than the significance level (0.05). This implies that past values of US real GDP and AEO sales do not help predict GAP sales. Note that this call does not test whether GDP and GAP sales drive AEO sales. On the other hand, for instantaneous causality the p-value of 0.0006737 is significant. Therefore, we can reject the null hypothesis and accept the hypothesis that there is an instantaneous causality relationship between the variables.
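
Granger causality asks whether lagged values of one variable improve the prediction of another once the latter's own lags are included. The sketch below shows the idea with a nested-regression F-test in base R on simulated data, where `x` is constructed to drive `y`. It is a single-equation, one-lag illustration under made-up parameters, not the multivariate test that `causality()` runs.

```r
# Simulated data in which lagged x genuinely helps predict y.
set.seed(1)
n <- 200
x <- rnorm(n)
y <- numeric(n)
for (t in 2:n) y[t] <- 0.3 * y[t - 1] + 0.8 * x[t - 1] + rnorm(1)

# Align each observation with its first lags.
d <- data.frame(y = y[-1], y_lag1 = y[-n], x_lag1 = x[-n])

# Restricted model: own lag only. Unrestricted: own lag plus lagged x.
restricted   <- lm(y ~ y_lag1, data = d)
unrestricted <- lm(y ~ y_lag1 + x_lag1, data = d)

# F-test of H0: the coefficient on lagged x is zero
# (x does not Granger-cause y).
gc <- anova(restricted, unrestricted)
p_value <- gc$`Pr(>F)`[2]
p_value
```

In the `causality()` call, the `cause` argument names the variables whose lags are being tested. To ask whether GDP and GAP sales Granger-cause AEO sales, one would therefore pass `cause = c("USA.GDP", "GAP")`.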

IRF

One of the more useful techniques in this area is impulse-response (IR) simulation by way of vector autoregression (VAR) modeling. It is a powerful tool for developing perspective on a recurring question: what could happen to y if x changes by z percent?

AEO Sales Response
par(mfrow=c(2,2))
var1a.irf <- irf(var1,  n.ahead = 16, boot = TRUE,  runs=500, seed=99, cumulative=FALSE)
plot(irf(var1, response = "AEO", n.ahead = 24, boot = TRUE,  runs=500) , plot.type = "single")

The first graph shows the response of AEO sales to a shock in US real GDP. The red dotted lines mark the confidence band, which is wide and sits on either side of the grey zero line over the full range without crossing it. This implies that GDP has limited, if any, impact on AEO sales.

The second graph shows the response to a shock in the sales of a competitor (GAP). The red confidence lines follow a similar pattern to graph 1 but with a much narrower band. Since neither red line crosses to the opposite side of zero from where it started, GAP sales appear to have limited impact on AEO sales.

The third graph has the narrowest confidence band. One red line crosses zero at around the 3rd lag and stays on the other side. Thus it can be concluded that AEO's past revenue performance has an impact on its own sales, which makes sense.

GAP Sales Response
par(mfrow=c(2,2))
plot(irf(var1, response = "GAP", n.ahead = 24, boot = TRUE,  runs=500) , plot.type = "single")

The first graph shows the response of GAP sales to a shock in US real GDP. The red dotted lines mark the confidence band, which is wide and sits on either side of the grey zero line, but one line crosses zero at the start of the range, which implies that US GDP affects GAP performance.

The second graph shows GAP's response to a shock in its own sales. The bottom red line crosses from the positive to the negative side, which indicates that GAP's past sales have an impact on its own sales.

The third graph is interesting because the 2 confidence lines sit on either side of the grey zero line. For this reason, I cannot state with certainty whether AEO affects GAP performance or not.

US Real GDP Response
par(mfrow=c(2,2))
plot(irf(var1, response = "USA.GDP", n.ahead = 24, boot = TRUE,  runs=500) , plot.type = "single")

It is interesting to note the similarities in trends in graph 2 and 3.

The first graph shows the response of US Real GDP to its own shock. The red dotted lines mark the confidence band, which is wide and split around the grey line (signifying 0), and there is a crossing at the start of the range, which implies that the effect of US GDP on itself is prominent.

The second graph is interesting because the two confidence lines intersect at the grey line (representing 0). For this reason, I cannot state with certainty whether GAP performance affects GDP, although logically it should.

The third graph is interesting because the two confidence lines intersect at the grey line (representing 0). For this reason, I cannot state with certainty whether AEO performance affects GDP.
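Reading significance off the plots by eye is error-prone, so the same check can be done numerically. The sketch below is a minimal base R helper (the name significant_horizons and the example numbers are made up for illustration) that returns the positions at which a confidence band lies entirely above or entirely below zero. As far as I know, the object returned by irf() in the vars package keeps the band in its $Lower and $Upper elements, but that is an assumption to verify before plugging real output into the helper.

```r
# Return the positions at which a confidence band excludes zero.
# `lower` and `upper` are numeric vectors of equal length, one entry per horizon.
significant_horizons <- function(lower, upper) {
  stopifnot(length(lower) == length(upper))
  which(lower > 0 | upper < 0)
}

# Made-up band for illustration only (not taken from the fitted model).
lower <- c(-0.20, -0.10, 0.05, 0.10)
upper <- c( 0.30,  0.20, 0.40, 0.50)

sig <- significant_horizons(lower, upper)
print(sig)
```

An empty result means zero stays inside the band at every horizon, which corresponds to the "limited, if any, impact" reading used in the discussion.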

Forecast Error Variance Decomposition (FEVD)

Forecast error variance decompositions allow us to analyze the dynamic interaction between all variables, providing further insight into the VAR model.

# fevd
fevd(var1, n.ahead = 16)
## $USA.GDP
##         USA.GDP        GAP         AEO
##  [1,] 1.0000000 0.00000000 0.000000000
##  [2,] 0.9660453 0.03178840 0.002166327
##  [3,] 0.9296849 0.06579448 0.004520590
##  [4,] 0.8612095 0.13235768 0.006432847
##  [5,] 0.8135766 0.18074049 0.005682867
##  [6,] 0.7814703 0.21008722 0.008442492
##  [7,] 0.7669363 0.22454765 0.008516001
##  [8,] 0.7547666 0.23538418 0.009849236
##  [9,] 0.7475382 0.23999437 0.012467437
## [10,] 0.7415410 0.24466835 0.013790651
## [11,] 0.7367741 0.24947487 0.013751020
## [12,] 0.7302502 0.25558068 0.014169159
## [13,] 0.7225544 0.26213377 0.015311863
## [14,] 0.7132021 0.26981290 0.016985001
## [15,] 0.7042694 0.27561676 0.020113795
## [16,] 0.6982612 0.27962707 0.022111697
## 
## $GAP
##         USA.GDP       GAP         AEO
##  [1,] 0.4179943 0.5820057 0.000000000
##  [2,] 0.3920440 0.6063512 0.001604801
##  [3,] 0.3350109 0.6508547 0.014134399
##  [4,] 0.2909773 0.6031990 0.105823722
##  [5,] 0.2481266 0.6531921 0.098681212
##  [6,] 0.2448555 0.6541993 0.100945168
##  [7,] 0.2493805 0.6505902 0.100029235
##  [8,] 0.2585149 0.6246150 0.116870160
##  [9,] 0.2659878 0.6184604 0.115551856
## [10,] 0.2677276 0.6163358 0.115936524
## [11,] 0.2691411 0.6150689 0.115790062
## [12,] 0.2691302 0.6122704 0.118599402
## [13,] 0.2694810 0.6118791 0.118639955
## [14,] 0.2694361 0.6118578 0.118706131
## [15,] 0.2694523 0.6115527 0.118994932
## [16,] 0.2695043 0.6112229 0.119272779
## 
## $AEO
##          USA.GDP       GAP       AEO
##  [1,] 0.01372207 0.1336677 0.8526102
##  [2,] 0.02359332 0.1505383 0.8258684
##  [3,] 0.04054611 0.1922248 0.7672291
##  [4,] 0.06466397 0.1896667 0.7456694
##  [5,] 0.06367484 0.2002201 0.7361051
##  [6,] 0.06288989 0.2099569 0.7271532
##  [7,] 0.06345254 0.2000716 0.7364759
##  [8,] 0.06630935 0.2141317 0.7195589
##  [9,] 0.07008734 0.2110428 0.7188698
## [10,] 0.07797568 0.2173999 0.7046244
## [11,] 0.09174995 0.2096903 0.6985597
## [12,] 0.10516975 0.2066826 0.6881477
## [13,] 0.11127484 0.2148797 0.6738454
## [14,] 0.11472191 0.2167408 0.6685373
## [15,] 0.11481095 0.2278659 0.6573231
## [16,] 0.11367474 0.2311268 0.6551985

The table shows three categories: USA GDP, GAP, and AEO. Each block gives the share of that variable's forecast error variance attributed to shocks in each of the three variables.

In the first category, we see results in line with our expectations from the IRF: GDP's variance is driven mostly by its own shocks (100% in the 1st quarter, falling to about 70% by the 16th). The shares attributed to GAP and AEO sales rise slightly in almost every quarter, with GAP accounting for the larger proportion (about 28% versus about 2% by the 16th quarter).

In the second category, we again see results in line with our expectations from the IRF: GAP strongly influences itself, with an average of around 60% over the 16 quarters and a maximum of about 65% in the 5th, 6th and 7th quarters. GDP clearly matters for GAP sales, but it is interesting to note that its share falls sharply, from about 42% in the 1st quarter to about 27% by the 16th, which might be explained by the uptick in GDP growth in the preceding quarters we saw at the start of the discussion. The influence of AEO sales is small, settling at around 10% to 12% from the 4th quarter onward.

Lastly, in the third category, we see the same pattern of the variable mostly influencing itself, but interestingly for AEO its own share fluctuates from quarter to quarter while declining from about 85% to about 66%. The shares attributed to USA GDP and GAP generally increase over the horizon, reaching about 11% and 23% respectively by the 16th quarter.
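To make the direction of each block explicit, the first and last rows of the fevd() output can be compared directly. The sketch below copies rows 1 and 16 from the output above into small matrices (the helper shares and the object names are illustrative, not part of the original analysis) and computes how much each shock's share changes over the 16 quarters.

```r
# Helper that builds a 2 x 3 matrix of shares for horizons 1 and 16.
shares <- function(h1, h16) {
  matrix(c(h1, h16), nrow = 2, byrow = TRUE,
         dimnames = list(c("h1", "h16"), c("USA.GDP", "GAP", "AEO")))
}

# Rows 1 and 16 copied from the fevd() output above.
# List names are the variables being explained; columns are the shock sources.
fevd_ends <- list(
  USA.GDP = shares(c(1.0000000, 0.00000000, 0.000000000),
                   c(0.6982612, 0.27962707, 0.022111697)),
  GAP     = shares(c(0.4179943, 0.5820057, 0.000000000),
                   c(0.2695043, 0.6112229, 0.119272779)),
  AEO     = shares(c(0.01372207, 0.1336677, 0.8526102),
                   c(0.11367474, 0.2311268, 0.6551985))
)

# Each row of shares should sum to 1 (up to rounding in the printed output).
row_totals <- sapply(fevd_ends, rowSums)

# Change in each shock's share between horizon 1 and horizon 16.
# Result: rows = shock source, columns = variable being explained.
share_change <- sapply(fevd_ends, function(m) m["h16", ] - m["h1", ])
print(round(share_change, 3))
```

Reading the result column by column: in the GAP column the USA.GDP entry is negative (GDP's share of GAP's variance shrinks), while in the USA.GDP column the GAP entry is positive (GAP's share of GDP's variance grows).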

Training

Since we are using different variables to create this model, it is necessary to set the training and test sets again.

# set training data, test data, out of sample
train_var <- window(vardata,start=c(2006, 3),end=c(2016, 1))

test_var <- window(vardata, start=c(2016,2))
both_var <- window(vardata, start=c(2006, 3))
h_var=dim(test_var)[1]

var2 <- VAR(train_var, p=5, type="both", season=4)

# forecast
var.fc2 = forecast(var2, h=h_var)
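As a quick check on the split above, the window() calls can be replayed on a placeholder series to confirm how many quarters land in each set. This is only a sketch: it assumes the data run from 2006 Q3 to 2018 Q3 (consistent with the forecast table ending in 2018 Q3), and the names y, train_demo and test_demo are illustrative.

```r
# Placeholder quarterly series covering 2006 Q3 to 2018 Q3 (values are just 1..49).
y <- ts(seq_len(49), start = c(2006, 3), frequency = 4)

# Same window boundaries as in the chunk above.
train_demo <- window(y, start = c(2006, 3), end = c(2016, 1))
test_demo  <- window(y, start = c(2016, 2))

# Number of quarters in each split; the test length is the forecast horizon.
n_train <- length(train_demo)
h_demo  <- length(test_demo)
print(c(train = n_train, test = h_demo))
```

Under that assumption the split is 39 training quarters and a 10-quarter horizon, which matches the ten forecast rows per series in the table that follows.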
Forecasted Values
kable(var.fc2)
Time Series Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
2016 Q2 USA.GDP 16.662334 16.655496 16.669172 16.651876 16.672792
2016 Q3 USA.GDP 16.651810 16.641595 16.662026 16.636187 16.667434
2016 Q4 USA.GDP 16.645713 16.630997 16.660430 16.623206 16.668220
2017 Q1 USA.GDP 16.643868 16.625982 16.661755 16.616513 16.671223
2017 Q2 USA.GDP 16.647120 16.627063 16.667177 16.616446 16.677795
2017 Q3 USA.GDP 16.653557 16.632208 16.674906 16.620907 16.686207
2017 Q4 USA.GDP 16.658362 16.636258 16.680466 16.624557 16.692167
2018 Q1 USA.GDP 16.671011 16.647968 16.694053 16.635770 16.706252
2018 Q2 USA.GDP 16.679102 16.655464 16.702741 16.642950 16.715254
2018 Q3 USA.GDP 16.691408 16.666997 16.715818 16.654076 16.728740
2016 Q2 GAP 8.105707 8.056258 8.155157 8.030081 8.181333
2016 Q3 GAP 8.127868 8.069077 8.186658 8.037956 8.217780
2016 Q4 GAP 8.201626 8.126851 8.276401 8.087267 8.315985
2017 Q1 GAP 8.363829 8.284057 8.443601 8.241829 8.485830
2017 Q2 GAP 8.162236 8.079122 8.245350 8.035124 8.289348
2017 Q3 GAP 8.210515 8.126313 8.294718 8.081739 8.339292
2017 Q4 GAP 8.270767 8.184684 8.356849 8.139114 8.402419
2018 Q1 GAP 8.470727 8.381532 8.559923 8.334314 8.607140
2018 Q2 GAP 8.260716 8.169835 8.351597 8.121726 8.399707
2018 Q3 GAP 8.305823 8.213811 8.397835 8.165103 8.446543
2016 Q2 AEO 6.545272 6.489435 6.601109 6.459877 6.630667
2016 Q3 AEO 6.645555 6.578168 6.712941 6.542496 6.748613
2016 Q4 AEO 6.818021 6.744010 6.892032 6.704831 6.931211
2017 Q1 AEO 7.002300 6.927715 7.076885 6.888231 7.116368
2017 Q2 AEO 6.595414 6.518200 6.672629 6.477325 6.713504
2017 Q3 AEO 6.691449 6.614030 6.768867 6.573047 6.809850
2017 Q4 AEO 6.849178 6.769350 6.929006 6.727092 6.971264
2018 Q1 AEO 7.045922 6.963487 7.128356 6.919849 7.171995
2018 Q2 AEO 6.619381 6.536860 6.701901 6.493176 6.745585
2018 Q3 AEO 6.697958 6.615324 6.780592 6.571581 6.824335
Graph Plots
autoplot(var.fc2) + xlab("Year")

Judging by the interval widths in the table above, USA.GDP has the narrowest forecast interval, and AEO's is slightly narrower than GAP's. Every variable is expected to follow a generally increasing trend, with AEO showing a smaller rise than GAP.

ANN (NNETS)

I am setting up the training and test sets again to make sure that the sets we are using are in line with our expectations.

Setup
# set training data, test data, out of sample
train <- window(aeo_sales,start=c(2006, 3),end=c(2016, 1))

test <- window(aeo_sales, start=c(2016,2))
both <- window(aeo_sales, start=c(2006, 3))
h=length(test)
Training
set.seed(7)
fit.nnet <- forecast(nnetar(train), h=h)
Forecasted Values
kable(fit.nnet)
          Qtr1      Qtr2      Qtr3      Qtr4
2016         -  680.8760  828.8692  972.9326
2017 1104.9821  673.5990  851.5492 1004.0249
2018 1101.6303  671.4776  865.8583         -
Graph Plots
autoplot(fit.nnet) +
  autolayer(aeo_sales, series="Original Data") +
  ylab("AEO Sales ($ millions)")

The neural network did a good job of capturing the seasonal pattern, but it seems to underestimate AEO revenue. This makes sense: AEO beat expectations, and many analysts were also predicting more “maintainable” earnings.

BATS (TBATS)

Training
fit.tbats <- forecast(tbats(train), h=h)
Forecasted Values
kable(fit.tbats)
Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
2016 Q2 726.2208 675.6822 780.5394 650.3684 810.9198
2016 Q3 751.8004 697.5345 810.2881 670.4118 843.0697
2016 Q4 888.7325 808.9944 976.3301 769.7217 1026.1442
2017 Q1 1083.9757 986.4584 1191.1332 938.4389 1252.0829
2017 Q2 727.4626 654.0224 809.1493 618.1961 856.0419
2017 Q3 745.0261 669.6982 828.8268 632.9559 876.9392
2017 Q4 890.3015 792.4038 1000.2941 745.0158 1063.9195
2018 Q1 1074.7938 956.4756 1207.7483 899.2092 1284.6641
2018 Q2 729.7604 644.0443 826.8845 602.8230 883.4272
2018 Q3 740.2000 653.1678 838.8289 611.3180 896.2536
Graph Plots
autoplot(fit.tbats) +
  autolayer(aeo_sales, series="Original Data") +
  ylab("AEO Sales ($ millions)")

The model did quite well and captured the pattern properly, but its confidence interval is wide, wide enough that the 95% interval captures the uptick in the last few quarters.

Which model is the best?

There were issues with computing the training accuracy of the VAR model, so I will show its test accuracy in a separate table and then move on to the analysis.

a13 = accuracy(fit13, test)
a15 = accuracy(fit.nnet, test)
a16 = accuracy(fit.tbats, test)


a.table<-rbind(a13, a15, a16)

row.names(a.table)<-c('ARIMA Training','ARIMA Test',
                        'ANN Training', 'ANN Test',
                        'BATS Training', 'BATS Test')

a.table<-as.data.frame(a.table)
a.table<-a.table[order(a.table$MASE),]
kable(a.table, caption="Accuracy Measure Table")
Accuracy Measure Table
ME RMSE MAE MPE MAPE MASE ACF1 Theil’s U
ARIMA Training -2.5594435 39.53353 28.84797 -0.3081758 3.711761 0.6047176 -0.1483885 NA
ANN Training -0.0002033 35.33944 30.62870 -0.2482478 3.911478 0.6420458 0.5493786 NA
ARIMA Test 22.9614989 43.81186 31.08687 2.4788227 3.232087 0.6516500 0.2980984 0.2434355
BATS Training 8.5421618 45.59331 35.13623 0.8405497 4.371480 0.7365337 -0.0386941 NA
ANN Test 43.7463729 80.34422 63.18000 4.9465689 7.005367 1.3243933 0.3089266 0.3818720
BATS Test 83.4989169 103.61070 83.49892 8.9159247 8.915925 1.7503232 0.1851220 0.6010873

Test accuracy of VAR model

a14 = accuracy(exp (var.fc2$forecast$AEO$mean), exp (test_var[,3]))
kable(a14)
ME RMSE MAE MPE MAPE ACF1 Theil’s U
Test set 52.5907 66.94133 53.02231 5.80804 5.847385 0.299744 0.3734272

ARIMA has by far been the best model, with a test MASE of around 0.65, indicating that the predicted sales tracked the actual values closely. ARIMA also had the lowest training MASE among the models. The VAR model did not report a MASE, most likely because accuracy() was given back-transformed forecast values rather than a forecast object that carries the training data needed for scaling. Nevertheless, looking at other metrics such as MAPE and RMSE, it is highly unlikely that the VAR model would have a lower MASE than ARIMA.
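A rough check on that last point is possible from the two accuracy tables alone. MASE is MAE divided by a scaling constant computed from the training series, so dividing any reported MAE by its MASE recovers that constant. The sketch below assumes the VAR test errors are on the same scale and come from the same training window as the other models, which the exp() back-transform and the matching window() calls suggest but do not guarantee. The names scale_q and mase_var are illustrative.

```r
# Values copied from the accuracy tables above.
mae_arima_test  <- 31.08687
mase_arima_test <- 0.6516500
mae_ann_test    <- 63.18000
mase_ann_test   <- 1.3243933
mae_var_test    <- 53.02231

# MASE = MAE / scale, so scale = MAE / MASE.
scale_q <- mae_arima_test / mase_arima_test

# Sanity check: the ANN row should imply the same scaling constant.
scale_q_ann <- mae_ann_test / mase_ann_test

# Implied MASE for the VAR forecast (assumption: same scale and training set).
mase_var <- mae_var_test / scale_q
print(round(c(scale = scale_q, scale_ann = scale_q_ann, mase_var = mase_var), 3))
```

Under those assumptions the implied VAR test MASE is about 1.11, which would place it between ARIMA (0.65) and ANN (1.32) on the test set.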

Conclusion & Deeper Insights

Amazon is a company that has not only diversified into completely different industries but has also been successful in changing the long-standing status quo. The company has rolled out private-label clothing brands, such as Lark & Ro, which resemble fast-fashion looks from the likes of Zara. Amazon is already a major player in clothing and has positioned itself for exponential growth by leveraging economies of scale and one of the biggest e-commerce platforms in the world, amazon.com. Fashion’s biggest brands are struggling to work out how to deal with Amazon’s quiet transformation into an apparel powerhouse. Is it friend or is it foe?

Aerie, a subsidiary of American Eagle, has led the charge in the body positivity movement by refusing to airbrush models in its Aerie Real campaign, leading to double-digit sales growth. This shows that management is active and has done a good job with one of its subsidiaries, but it still stands out that the CEO is offloading stock. We will know more when the next quarterly report comes out.

American Eagle, like many other retail companies, is investing heavily in mobile apps because they have been shown to have the highest engagement with the younger generation (AE's target segment). The company released its mobile application in 2015, and the trend in the data has since improved with a steeper gradient. The model shows that sales growth will be limited, so a change in digital strategy might be required. The innovation has not affected forecasting much because its effect is not significant enough.

One of the most important insights relative to Term Paper 1 was that revenue didn’t jump as much as expected when the economy started to recover, something that was answered statistically by the IRF in the VAR modelling section. This indicates that either the industry has shrunk (which it hasn’t) or American Eagle has become more of a commodity, with limited ability to capture the uptick from the economic boom. This means that management should focus on branding in order to differentiate the company.

Final Words

AEO sales have been sluggish for the last couple of years, and the forecasted values do not paint a great picture. A complete makeover using cutting-edge tools like Machine Learning and Artificial Intelligence might help in this regard. Google, for example, has been working with major retailers in Canada and the United States to prepare for future changes and help increase efficiency.

AEO may lose its prominence unless it learns how to provide similar, or even better, products than companies such as GAP and Amazon.