1 Analyse Apple Aktie inkl. Prognose bis 2020

1.1 Problem & Lösung

1.1.1 Hintergrund

Analyse Apple als wichtige Aktienposition an Fonds UBS Vitainvest 75 World. Der Anteil beträgt rund 1.68%.

1.1.2 Frage

In welchem Bereich entwickelt sich die Apple Aktie bis Ende 2020?

Nullhypothese: Die Apple Aktie folgt mindestens dem durchschnittlichen Wachstum der letzten Jahre.
Alternativhypothese: Die Apple Aktie erfüllt die Erwartungen der Anleger nicht (H0 wird widerlegt).

1.1.3 Ergebnis

Der geschätzte Kurswert bis 2020 beträgt rund 243 (+ ~22% zu Februar 2018). Pro Jahr kann also ein Wertzuwachs von rund 11% erwartet werden. Zu Beginn ist die Streuung klein. Diese wächst mit zunehmender Zeit und findet ihren Höhepunkt beim letzten Schätzpunkt (Dez 2020).

Der wachstum über die letzten 10 Jahre betrug 84% bzw. 8.4% jährlich. Der geschätzte Kursverlauf liegt daher über den Durchschnittswerten der letzten zehn Jahre und somit bestätigt sich die Nullhypothese: Die Rendite-Erwartungen der Anleger werden erfüllt.

1.2 Methodik

1.3 Vorbereitung

1.3.1 Pakete laden

knitr::opts_chunk$set(fig.width=9, fig.height=6) 
library(tidyverse)
library(magrittr)
library(readxl)
library(plotly)
library(lubridate)
library(openintro)
library(tseries)
library(lmtest)
library(forecast)
library(xts)
library(vars)
library(dygraphs)
library(summarytools)

1.3.2 Daten laden & visuelle Datenexploration

# Daten lesen
apple <- read_excel(path = "Data/Prices Nestle and Apple.xlsx", sheet = "Apple")

# .csv Versuch hat Wertproblem auch nicht gelöst
# apple <- read.csv("Data/Prices_Apple.csv", sep=";", header=TRUE)

# Spalten umbennen, Durchschnittswerte per 1. Monatstag berechnen &
# nach Datum aufsteigend sortieren
apple %<>%
  rename(datum = Date,
         wert = Kurswert) %>% 
  mutate(monat = month(datum),
         jahr = year(datum)) %>% 
  unite(monat_jahr, monat, jahr) %>%
  group_by(monat_jahr) %>% 
  mutate(wert_monat = mean(wert)) %>% 
  ungroup() %>% 
  arrange(datum) %>% 
  distinct(monat_jahr, .keep_all = T) %>% 
  dplyr::select(datum, wert_monat)

# Cast zu passendem Zeitformat
apple$datum <- ymd(apple$datum)

# Datenstruktur & statistische Kennzahlen anzeigen
glimpse(apple)

## Observations: 433
## Variables: 2
## $ datum      <date> 1982-11-12, 1982-12-01, 1983-01-03, 1983-02-01, 19...
## $ wert_monat <dbl> 1.1843825, 1.1359391, 1.1840833, 1.6130900, 1.60219...

summary(apple)

##      datum              wert_monat      
##  Min.   :1982-11-12   Min.   :  0.6191  
##  1st Qu.:1991-11-01   1st Qu.:  1.6862  
##  Median :2000-11-01   Median :  2.6965  
##  Mean   :2000-10-31   Mean   : 27.1324  
##  3rd Qu.:2009-11-02   3rd Qu.: 29.2801  
##  Max.   :2018-11-01   Max.   :219.5482

(d <- plot_ly(data = apple, x = ~datum, y = ~wert_monat, type = "scatter", mode = "lines") %>% 
  layout(title = "Apple Aktie",
         xaxis = list(title = "Datum"),
         yaxis = list(title = "Kurswert")))

Was ist 2001 passiert? iPod wird vorgestellt. Scheinbar nur ein kurzer Erfolg. Ereignis 2007? iPhone wird eingeführt.

Offene Fragen:

Warum Tief im Juni 2013?
Warum tief im Mai 2016?

1.3.3 Umwandlung in Zeitreihe & weiterführende Datenexploration

# Umwandlung in Zeitreihe
apple_ts <- ts(apple[[2]], start = 1982, end = 2018, frequency = 12)
dygraph(apple_ts, main = "mit Ausreisser")

plot(apple_decomposition <- decompose(apple_ts))

# Ausreisser bereinigte Datenreihe anschauen
apple$clean_wert_monat = tsclean(apple_ts) 
apple_ts_clean <- ts(apple[[3]], start = 1982, end = 2018, frequency = 12)
dygraph(apple_ts_clean, main = "ohne Ausreisser")

plot(apple_decomposition_clean <- decompose(apple_ts_clean))

# Bessere Übersicht: Zeige je eine Saison
par(mfrow=c(2,1))
plot(apple_decomposition$seasonal[1:12], type = "l", main = "Eine Saison (Jahr)",
     xlab = "Monat", ylab = "")
plot(apple_decomposition_clean$seasonal[1:12], type = "l",  
     main = "Eine Saison (Jahr, ohne Ausreisser)", xlab = "Monat", ylab = "")

# XTS Format für spätere Visualisierung
apple_xts <- xts(x = apple$wert_monat,
                 order.by = apple$datum,
                 frequency = 12,
                 tzone = Sys.getenv("Europe/Berlin"))

1.4 Analyse

1.4.1 Stationarität prüfen mittels Dickey-Fuller Test

adf.test(apple_ts)

## Warning in adf.test(apple_ts): p-value greater than printed p-value

## 
##  Augmented Dickey-Fuller Test
## 
## data:  apple_ts
## Dickey-Fuller = 0.84285, Lag order = 7, p-value = 0.99
## alternative hypothesis: stationary

Nullhypothese “Daten sind nicht stationär” wird angenommen, da p-Wert mit 0.99 > 0.05. Folge: Die Renditen müssen berechnet werden.

# Daten vorbereiten
apple_returns <- ts(diff(log(apple_ts)))[-1]
apple_returns_xts <- diff(log(apple_xts))[-1]

# Visuelle Exploration durch Interaktiver Graph
(d_returns <- dygraph(apple_returns_xts, main = "Apple Renditen") %>% 
    dyRangeSelector() %>% 
    dyOptions(stackedGraph = TRUE, colors = "green"))

# Deskriptive Analyse
(hist_apple_returns <- plot_ly(x = ~as.double(apple_returns), type = "histogram"))

(boxplot_apple_return <- plot_ly(y = ~as.double(apple_returns), type = "box"))

summary(apple_returns)

##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
## -0.99012 -0.04810  0.02251  0.01190  0.08180  0.42704

adf.test(apple_returns)

## Warning in adf.test(apple_returns): p-value smaller than printed p-value

## 
##  Augmented Dickey-Fuller Test
## 
## data:  apple_returns
## Dickey-Fuller = -8.0946, Lag order = 7, p-value = 0.01
## alternative hypothesis: stationary

Nullhypothese “Daten sind nicht stationär” wird verworfen, da p-Wert mit 0.01 < 0.05. Renditen sind somit stationär.

1.4.2 ACF & PACF visualieren, Ergebnisse interpretieren und Modell bestimmen

acf(apple_returns)

pacf(apple_returns)

Ergebnisse mit folgender Entscheidungstabelle interpretieren: AR oder MA: Welches Modell ist ideal?

Fazit: ACF fällt nach Lag q abrupt auf Null & PACF geht stetig gegen Null. Folglich wird zuerst das Modell MA(q) verwendet.

1.5 Prognose

1.5.1 Modelle

# Modell Ma(q) erstellen und Signifikanz prüfen
apple_ar <- Arima(apple_ts, c(0,1,1))
coeftest(apple_ar)

## 
## z test of coefficients:
## 
##     Estimate Std. Error z value  Pr(>|z|)    
## ma1 0.162591   0.047733  3.4063 0.0006585 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

# Eignung Modellwahl prüfen und optimale p, d & q prüfen
# Dabei sollen die zwei gängsigsten Kriterien der Modelleignung optimiert werden:
# 1. Akaike information criteria (AIC) sowie Baysian information criteria (BIC)
auto.arima(apple_ts, trace = TRUE)

## 
##  Fitting models using approximations to speed things up...
## 
##  ARIMA(2,2,2)(1,0,1)[12]                    : 2275.42
##  ARIMA(0,2,0)                               : 2465.71
##  ARIMA(1,2,0)(1,0,0)[12]                    : 2406.784
##  ARIMA(0,2,1)(0,0,1)[12]                    : 2261.105
##  ARIMA(0,2,1)(1,0,1)[12]                    : Inf
##  ARIMA(0,2,1)                               : 2270.395
##  ARIMA(0,2,1)(0,0,2)[12]                    : 2263.14
##  ARIMA(0,2,1)(1,0,2)[12]                    : Inf
##  ARIMA(1,2,1)(0,0,1)[12]                    : Inf
##  ARIMA(0,2,0)(0,0,1)[12]                    : 2467.567
##  ARIMA(0,2,2)(0,0,1)[12]                    : 2259.564
##  ARIMA(1,2,3)(0,0,1)[12]                    : 2264.431
##  ARIMA(0,2,2)(1,0,1)[12]                    : 2269.078
##  ARIMA(0,2,2)                               : 2266.5
##  ARIMA(0,2,2)(0,0,2)[12]                    : 2261.611
##  ARIMA(0,2,2)(1,0,2)[12]                    : Inf
##  ARIMA(1,2,2)(0,0,1)[12]                    : 2262.58
##  ARIMA(0,2,3)(0,0,1)[12]                    : 2261.526
## 
##  Now re-fitting the best model(s) without approximations...
## 
##  ARIMA(0,2,2)(0,0,1)[12]                    : 2269.17
## 
##  Best model: ARIMA(0,2,2)(0,0,1)[12]

## Series: apple_ts 
## ARIMA(0,2,2)(0,0,1)[12] 
## 
## Coefficients:
##           ma1      ma2     sma1
##       -0.8739  -0.1040  -0.1821
## s.e.   0.0528   0.0531   0.0627
## 
## sigma^2 estimated as 11.1:  log likelihood=-1130.54
## AIC=2269.08   AICc=2269.17   BIC=2285.34

# Alterantivmodell anhand Empfehlung auto.arima() prüfen
# apple_ar_2 <- Arima(apple_ts, order = c(0,2,2), seasonal = c(0,0,1))
# einfachere Notation
apple_ar_2 = auto.arima(apple_ts, seasonal = TRUE)
coeftest(apple_ar_2)

## 
## z test of coefficients:
## 
##       Estimate Std. Error  z value  Pr(>|z|)    
## ma1  -0.873920   0.052786 -16.5558 < 2.2e-16 ***
## ma2  -0.104037   0.053059  -1.9608  0.049907 *  
## sma1 -0.182088   0.062727  -2.9029  0.003698 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Alle geschätzten Parameter des menschlich geschätzten ARIMA-Modells (Modell 1) sowie der Modellempfehlung (Modell 2) aus sind statistisch signifikant.

Prognose Modell 1

###############################################################
# Kurzform für im Anschluss aufgeführte Einzelschritte
apple_prediction <- apple_ar %>% 
  forecast(h = 35)

apple_prediction %>% 
  autoplot() + xlab("Jahr")

# Prognose durchführen
# apple_prediction <- forecast(apple_ar, h=35)
# visuell ansprechender ggplot2 Graph anzeigen
# autoplot(apple_prediction)
###############################################################

# Zeige Prognose-Punktschätzer 
apple_prediction_df <- summary(apple_prediction)

## 
## Forecast method: ARIMA(0,1,1)
## 
## Model Information:
## Series: apple_ts 
## ARIMA(0,1,1) 
## 
## Coefficients:
##          ma1
##       0.1626
## s.e.  0.0477
## 
## sigma^2 estimated as 11.46:  log likelihood=-1139.2
## AIC=2282.4   AICc=2282.42   BIC=2290.53
## 
## Error measures:
##                    ME     RMSE      MAE       MPE     MAPE      MASE
## Training set 0.368738 3.376729 1.488186 0.3881103 8.560265 0.1920035
##                      ACF1
## Training set -0.001594156
## 
## Forecasts:
##          Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## Feb 2018        186.806 182.4685 191.1434 180.1723 193.4396
## Mar 2018        186.806 180.1544 193.4575 176.6333 196.9786
## Apr 2018        186.806 178.4590 195.1529 174.0404 199.5715
## May 2018        186.806 177.0540 196.5579 171.8916 201.7203
## Jun 2018        186.806 175.8273 197.7846 170.0156 203.5963
## Jul 2018        186.806 174.7246 198.8873 168.3291 205.2828
## Aug 2018        186.806 173.7144 199.8975 166.7842 206.8277
## Sep 2018        186.806 172.7768 200.8351 165.3502 208.2617
## Oct 2018        186.806 171.8980 201.7139 164.0063 209.6056
## Nov 2018        186.806 171.0683 202.5436 162.7372 210.8747
## Dec 2018        186.806 170.2801 203.3318 161.5318 212.0801
## Jan 2019        186.806 169.5279 204.0841 160.3814 213.2305
## Feb 2019        186.806 168.8070 204.8049 159.2790 214.3330
## Mar 2019        186.806 168.1140 205.4980 158.2190 215.3929
## Apr 2019        186.806 167.4457 206.1662 157.1970 216.4149
## May 2019        186.806 166.7997 206.8122 156.2091 217.4028
## Jun 2019        186.806 166.1740 207.4379 155.2521 218.3598
## Jul 2019        186.806 165.5667 208.0452 154.3233 219.2886
## Aug 2019        186.806 164.9763 208.6357 153.4203 220.1916
## Sep 2019        186.806 164.4014 209.2105 152.5411 221.0708
## Oct 2019        186.806 163.8409 209.7710 151.6839 221.9280
## Nov 2019        186.806 163.2938 210.3181 150.8472 222.7647
## Dec 2019        186.806 162.7591 210.8528 150.0294 223.5825
## Jan 2020        186.806 162.2360 211.3759 149.2295 224.3824
## Feb 2020        186.806 161.7239 211.8880 148.4463 225.1657
## Mar 2020        186.806 161.2220 212.3899 147.6787 225.9332
## Apr 2020        186.806 160.7298 212.8821 146.9259 226.6860
## May 2020        186.806 160.2466 213.3653 146.1870 227.4249
## Jun 2020        186.806 159.7722 213.8397 145.4613 228.1506
## Jul 2020        186.806 159.3059 214.3060 144.7482 228.8637
## Aug 2020        186.806 158.8473 214.7646 144.0470 229.5650
## Sep 2020        186.806 158.3962 215.2157 143.3570 230.2549
## Oct 2020        186.806 157.9522 215.6598 142.6779 230.9340
## Nov 2020        186.806 157.5148 216.0971 142.0090 231.6029
## Dec 2020        186.806 157.0839 216.5280 141.3500 232.2619

# Residuenanalyse
apple_prediction_residuen <- residuals(apple_prediction)
tsdisplay(apple_prediction_residuen, main='(0, 1, 1) Modell Residuen')

# Normalverteilung erfüllt?
hist(apple_prediction_residuen)

qqnorm(apple_prediction_residuen)
qqline(apple_prediction_residuen)

shapiro.test(apple_prediction_residuen)

## 
##  Shapiro-Wilk normality test
## 
## data:  apple_prediction_residuen
## W = 0.64441, p-value < 2.2e-16

# Erwartungswert von 0 prüfen
apple_prediction_fitted_values <- apple_prediction$fitted
(apple_prediction_2_erwartungswert <- plot_ly(x = ~apple_prediction_fitted_values, y = ~apple_prediction_residuen, type = "scatter") %>% 
    layout(xaxis = list(title = "Prognosewerte"),
           yaxis = list(title = "Residuen")) %>% 
    add_lines(y = ~fitted(lm(apple_prediction_residuen ~ apple_prediction_fitted_values)),
            line = list(color = '#07A4B5'),
            name = "Linear Regression", showlegend = TRUE))

Prüfung Modell 1 auf Gaussches weisses Rauschen: unkorrelierte (1), normalverteile (2) Zufallsvariablen mit Erwartungswert Null (3) und konstanter Varianz (4).
Die Residuen … (1) sind grundsätzlich unkorreliert (gemäss ACF & PACF) (2) sind nicht normalverteilt (3) haben tendenziell einen Erwartungswert gegen Null (4) weisen eine grundsätzlich konstante Varianz auf

Drei viertel der Kriterien sind grundsätzlich erfüllt. Die Verteilung müsste weiter analysiert werden.

Prognose Modell 2

###############################################################
# Kurzform für im Anschluss aufgeführte Einzelschritte
apple_prediction_2 <- apple_ar_2 %>% 
  forecast(h = 35)

apple_prediction_2 %>% 
  autoplot() + xlab("Jahr")

# Prognose durchführen
# apple_prediction_2 <- forecast(apple_ar_2, h=35)
# visuell ansprechender ggplot2 Graph anzeigen
# autoplot(apple_prediction_2)
###############################################################

# Zeige Prognose-Punktschätzer 
apple_prediction_df_2 <- summary(apple_prediction_2)

## 
## Forecast method: ARIMA(0,2,2)(0,0,1)[12]
## 
## Model Information:
## Series: apple_ts 
## ARIMA(0,2,2)(0,0,1)[12] 
## 
## Coefficients:
##           ma1      ma2     sma1
##       -0.8739  -0.1040  -0.1821
## s.e.   0.0528   0.0531   0.0627
## 
## sigma^2 estimated as 11.1:  log likelihood=-1130.54
## AIC=2269.08   AICc=2269.17   BIC=2285.34
## 
## Error measures:
##                     ME     RMSE     MAE         MPE     MAPE     MASE
## Training set 0.2035101 3.311642 1.43388 -0.01295457 8.778225 0.184997
##                      ACF1
## Training set -0.007933902
## 
## Forecasts:
##          Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## Feb 2018       190.5743 186.3056 194.8431 184.0458 197.1028
## Mar 2018       192.9375 186.5087 199.3663 183.1056 202.7695
## Apr 2018       196.3521 188.2682 204.4360 183.9888 208.7154
## May 2018       196.4613 186.9586 205.9640 181.9282 210.9945
## Jun 2018       198.5227 187.7430 209.3023 182.0365 215.0088
## Jul 2018       196.5612 184.6000 208.5223 178.2682 214.8542
## Aug 2018       198.4294 185.3557 211.5030 178.4350 218.4238
## Sep 2018       199.9176 185.7834 214.0517 178.3013 221.5339
## Oct 2018       197.5744 182.4203 212.7285 174.3982 220.7506
## Nov 2018       198.7888 182.6470 214.9306 174.1020 223.4756
## Dec 2018       199.7749 182.6715 216.8782 173.6176 225.9322
## Jan 2019       206.3280 188.2846 224.3714 178.7330 233.9229
## Feb 2019       208.4620 189.7213 227.2028 179.8006 237.1235
## Mar 2019       210.0635 190.6514 229.4755 180.3753 239.7516
## Apr 2019       211.6649 191.5845 231.7454 180.9545 242.3753
## May 2019       213.2664 192.5198 234.0129 181.5373 244.9954
## Jun 2019       214.8678 193.4570 236.2786 182.1229 247.6127
## Jul 2019       216.4692 194.3956 238.5428 182.7106 250.2279
## Aug 2019       218.0707 195.3353 240.8061 183.2999 252.8415
## Sep 2019       219.6721 196.2756 243.0687 183.8902 255.4540
## Oct 2019       221.2736 197.2163 245.3309 184.4811 258.0660
## Nov 2019       222.8750 198.1571 247.5930 185.0722 260.6778
## Dec 2019       224.4764 199.0977 249.8552 185.6630 263.2899
## Jan 2020       226.0779 200.0380 252.1178 186.2533 265.9024
## Feb 2020       227.6793 200.9778 254.3809 186.8428 268.5158
## Mar 2020       229.2808 201.9168 256.6447 187.4312 271.1303
## Apr 2020       230.8822 202.8550 258.9094 188.0182 273.7462
## May 2020       232.4836 203.7921 261.1752 188.6037 276.3636
## Jun 2020       234.0851 204.7281 263.4421 189.1874 278.9828
## Jul 2020       235.6865 205.6627 265.7103 189.7691 281.6040
## Aug 2020       237.2880 206.5960 267.9799 190.3487 284.2272
## Sep 2020       238.8894 207.5278 270.2510 190.9260 286.8528
## Oct 2020       240.4909 208.4581 272.5237 191.5009 289.4808
## Nov 2020       242.0923 209.3866 274.7980 192.0733 292.1113
## Dec 2020       243.6937 210.3134 277.0741 192.6430 294.7445

# Residuenanalyse
apple_prediction_2_residuen <- residuals(apple_prediction_2)
tsdisplay(apple_prediction_2_residuen, main='(0,2,2) Modell Residuen')

# Normalverteilung erfüllt?
hist(apple_prediction_2_residuen)

qqnorm(apple_prediction_2_residuen)
qqline(apple_prediction_2_residuen)

shapiro.test(apple_prediction_2_residuen)

## 
##  Shapiro-Wilk normality test
## 
## data:  apple_prediction_2_residuen
## W = 0.63577, p-value < 2.2e-16

# Erwartungswert von 0 prüfen
apple_prediction_2_fitted_values <- apple_prediction_2$fitted
(apple_prediction_2_erwartungswert <- plot_ly(x = ~apple_prediction_2_fitted_values, y = ~apple_prediction_2_residuen, type = "scatter") %>% 
    layout(xaxis = list(title = "Prognosewerte"),
           yaxis = list(title = "Residuen")) %>% 
    add_lines(y = ~fitted(lm(apple_prediction_2_residuen ~ apple_prediction_2_fitted_values)),
            line = list(color = '#07A4B5'),
            name = "Linear Regression", showlegend = TRUE))

Prüfung Modell 2 auf Gaussches weisses Rauschen: unkorrelierte (1), normalverteile (2) Zufallsvariablen mit Erwartungswert Null (3) und konstanter Varianz (4).
Die Residuen … (1) sind grundsätzlich unkorreliert (gemäss ACF & PACF) (2) sind nicht normalverteilt (3) haben einen Erwartungswert gegen Null (4) weisen eine grundsätzlich konstante Varianz auf

Drei viertel der Kriterien sind grundsätzlich erfüllt. Die Verteilung müsste weiter analysiert werden.

Dieses Modell berücksichtigt die saisonale Komponente und wird daher als finales Modell zur Beantwortung der Fragestellung verwendet.

1.5.2 Ergebnis

Der wachstum über die letzten 10 Jahre betrug 84% bzw. 8.4% jährlich. Der geschätzte Kursverlauf liegt daher über den Durchschnittswerten der letzten zehn Jahre und somit bestätigt sich die Nullhypothese: Die Rendite-Erwartungen der Anleger werden erfüllt.

1.6 Lessons Learned & Ausblick

1.6.1 Lessons Learned

Datenvorbereitung und -verständnis ist die halbe Miete!

1.6.2 Ausblick

Experimentieren mit verschiedenen Ausschnitten der Zeitreihe von heute aus zurück. Auf dessen Basis Ergebnisse evaluieren und potentiell bessere Modell prüfen.

Abschliessend folgt ein Versuch für die Prognose mit dygraph und der Verwendung des Holtwinters Modell.

hw <- HoltWinters(apple_ts)
p <- predict(hw, n.ahead = 35, prediction.interval = TRUE)
all <- cbind(apple_ts, p)

dygraph(all, "Prognose Apple Aktie bis 2020 auf Basis Daten bis Jan 2018") %>%
  dySeries("apple_ts", label = "Actual") %>%
  dySeries(c("p.lwr", "p.fit", "p.upr"), label = "Predicted")

apple_ts_ab_2015 <- tail(apple_ts, 37)
hw <- HoltWinters(apple_ts_ab_2015)
p <- predict(hw, n.ahead = 35, prediction.interval = TRUE)
all <- cbind(apple_ts_ab_2015, p)

dygraph(all, "Prognose Apple Aktie bis 2020 auf Basis Daten 2015 - Jan 2018") %>%
  dySeries("apple_ts_ab_2015", label = "Actual") %>%
  dySeries(c("p.lwr", "p.fit", "p.upr"), label = "Predicted")

Leistungsnachweis Time Series Analysis in Finance | Master IDS HSLU

Ramon Schildknecht

14. Dezember 2018