01.Abstract


The financial markets contain a plethora of statistical patterns. The behavior of those patterns is similar to the behavior of patterns in natural phenomena: both are affected by unknown and unstable variables, which leads to high unpredictability and volatility. That makes it almost impossible to forecast future behavior.

Burton Malkiel makes this argument in his 1973 book, “A Random Walk Down Wall Street.”

Nevertheless, one forecasting methodology is to use the past performance of markets as a predictor of the future. That can be achieved by observing the changes over small seasonal intervals, when the time series is stationary.

02.Introduction


The purpose of this project is to analyze three different algorithms for forecasting the Daimler share price. Disclaimer: I work in Advanced Analytics at Daimler AG. This analysis is for educational purposes and is not financial advice.

03.Methodology


We will use historical Daimler share market datasets (from 2010 onward) to forecast future values. In this kind of forecasting, we assume that some patterns in our datasets carry over into short future intervals. The same approach is applied in weather forecasting.

We will add mathematical technical indicators to our datasets from the following domains:
-Support & resistance
-Trend
-Momentum
-Volume
-Volatility

Some of them are:
-Moving average convergence/divergence
-Relative strength index
-Stochastic oscillator
-Ease of movement
-Larry Williams oscillator, etc.
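
To make two of the listed momentum indicators concrete, here is a minimal base R sketch of the stochastic oscillator (%K) and the Larry Williams oscillator (%R). It is not the code used in this analysis: the helper names (`roll_apply`, `stochastic_k`, `williams_r`), the 0 to 100 scaling, and the toy prices are assumptions made for illustration only.

```r
# Apply f over a trailing window of n observations (NA until n values exist).
roll_apply <- function(x, n, f) {
  out <- rep(NA_real_, length(x))
  if (length(x) >= n) {
    for (i in n:length(x)) out[i] <- f(x[(i - n + 1):i])
  }
  out
}

# Stochastic %K: position of the close inside the n-day high/low range (0..100).
stochastic_k <- function(high, low, close, n = 14) {
  hh <- roll_apply(high, n, max)  # highest high of the window
  ll <- roll_apply(low, n, min)   # lowest low of the window
  100 * (close - ll) / (hh - ll)
}

# Williams %R: the same range, measured from the top (-100..0).
williams_r <- function(high, low, close, n = 14) {
  hh <- roll_apply(high, n, max)
  ll <- roll_apply(low, n, min)
  -100 * (hh - close) / (hh - ll)
}

# Hypothetical toy prices, not Daimler data.
high  <- c(10, 11, 12, 13, 12)
low   <- c( 8,  9, 10, 11, 10)
close <- c( 9, 10, 11, 12, 11)

k <- stochastic_k(high, low, close, n = 3)
r <- williams_r(high, low, close, n = 3)
print(cbind(k, r))
```

With this scaling the two indicators carry the same information (%R equals %K minus 100). A window in which the high equals the low would divide by zero, so real data may need a guard for that case.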

The three algorithms that we will compare to predict the Daimler share price behavior are:

-Least Absolute Shrinkage and Selection Operator (LASSO). This method, which is based on a linear regression model, has been proposed as a novel method to predict financial market behavior.

-Deep Learning (Long Short-Term Memory neural network built as a linear stack of densely connected layers).

-eXtreme Gradient Boosting (XGBoost)

Appreciation to the colleagues from:
Cornell University (arXiv:1512.04916v3) [q-fin.CP] (Ruoxuan Xiong et al., 2016)

Cornell University Social and Information Networks (cs.SI); Computational Finance (q-fin.CP) (Jichang Zhao et al., 2019)

04.Disclaimer


This article is intended for academic and educational purposes and is not an investment recommendation. The information that we provide should not be a substitute for advice from an investment professional. The models discussed in this paper do not reflect investment performance. A decision to invest in any product or strategy should not be based on the information or conclusions contained herein. This is neither an offer to sell or buy nor a solicitation of an offer to buy interests in securities.

Load required libraries

Download the Daimler share prices, since 2010

05.Data Observation


We will use the Daimler AG ticker symbol (DDAIF) as our data set.

Display of the first 6 entries of our dataset

              DDAIF.Open               DDAIF.High                DDAIF.Low 
                   FALSE                    FALSE                    FALSE 
             DDAIF.Close             DDAIF.Volume           DDAIF.Adjusted 
                   FALSE                    FALSE                    FALSE 
           Avg_volume_10            Avg_volume_20       Volume_perc_avg_60 
                    TRUE                     TRUE                     TRUE 
                   Range      perc_change_closing         change_from_yest 
                   FALSE                     TRUE                     TRUE 
           moving_avg_10            moving_avg_20            moving_avg_60 
                    TRUE                     TRUE                     TRUE 
      perc_moving_avg_10       perc_moving_avg_20       perc_moving_avg_60 
                    TRUE                     TRUE                     TRUE 
             cash_tradet       avg_cash_trated_10       avg_cash_trated_20 
                   FALSE                     TRUE                     TRUE 
      avg_cash_trated_60 Avg_Dollar_volume_pct_10 Avg_Dollar_volume_pct_20 
                    TRUE                     TRUE                     TRUE 
Avg_Dollar_volume_pct_60                 nightgap           night_gap_perc 
                    TRUE                     TRUE                     TRUE 
     perc_range_previous          perc_range_atpr      perc_range_williams 
                   FALSE                    FALSE                    FALSE 
    one_month_range_perc                    EMA10                    EMA20 
                    TRUE                     TRUE                     TRUE 
                   EMA60                    WMA10                  EVWMA10 
                    TRUE                     TRUE                     TRUE 
                 ZLEMA10                   VWAP10                    HMA10 
                    TRUE                     TRUE                     TRUE 
                  ALMA10 
                    TRUE 
           DDAIF.Open DDAIF.High DDAIF.Low DDAIF.Close DDAIF.Volume
2010-05-03      50.90      51.62     50.68       51.26       753100
           DDAIF.Adjusted Avg_volume_10 Avg_volume_20 Volume_perc_avg_60
2010-05-03       37.62413            NA            NA                 NA
              Range perc_change_closing change_from_yest moving_avg_10
2010-05-03 0.939999                  NA               NA            NA
           moving_avg_20 moving_avg_60 perc_moving_avg_10
2010-05-03            NA            NA                 NA
           perc_moving_avg_20 perc_moving_avg_60 cash_tradet
2010-05-03                 NA                 NA    38603904
           avg_cash_trated_10 avg_cash_trated_20 avg_cash_trated_60
2010-05-03                 NA                 NA                 NA
           Avg_Dollar_volume_pct_10 Avg_Dollar_volume_pct_20
2010-05-03                       NA                       NA
           Avg_Dollar_volume_pct_60  nightgap night_gap_perc
2010-05-03                       NA        NA             NA
           perc_range_previous perc_range_atpr perc_range_williams
2010-05-03           38.297488        1.833787            38.29802
           one_month_range_perc EMA10 EMA20 EMA60 WMA10 EVWMA10 ZLEMA10
2010-05-03                   NA    NA    NA    NA    NA      NA      NA
           VWAP10 HMA10 ALMA10
2010-05-03     NA    NA     NA
 [ reached getOption("max.print") -- omitted 5 rows ]
An 'xts' object on 2010-05-03/2019-04-30 containing:
  Data: num [1:2264, 1:40] 50.9 48.9 47.3 47.1 46.2 ...
 - attr(*, "dimnames")=List of 2
  ..$ : NULL
  ..$ : chr [1:40] "DDAIF.Open" "DDAIF.High" "DDAIF.Low" "DDAIF.Close" ...
  Indexed by objects of class: [Date] TZ: UTC
  xts Attributes:  
List of 2
 $ src    : chr "yahoo"
 $ updated: POSIXct[1:1], format: "2019-05-15 18:38:05"

Calculate log returns
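
As a minimal sketch of this step, assuming the usual definition r_t = log(P_t / P_(t-1)), log returns can be computed in base R as below. The function name `log_returns` and the toy closing prices are illustrative assumptions, not the code or data used in this analysis.

```r
# Log return between consecutive observations: log(P_t) - log(P_(t-1)).
log_returns <- function(price) diff(log(price))

# Hypothetical closing prices, not Daimler data.
close <- c(100, 102, 101, 105)
r <- log_returns(close)
print(r)

# Log returns are additive over time: their sum is the log return
# of the whole period, log(last price / first price).
print(sum(r))
print(log(close[length(close)] / close[1]))
```

The additivity shown in the last lines is one reason log returns are often preferred over simple percentage changes when working with time series.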

Chart series technical analysis graph

Bollinger Bands, and Moving Average Convergence Divergence graph

Interactive graph with the Opening Prices since 2010, per semester

Interactive graph with the Adjusted Closing Prices

Plot of an open-high-low-close chart

We add new indicators to our datasets to improve prediction, such as:
-Weighted Moving Average (WMA)
-Double Exponential Moving Average, a measure of a security’s trending average
-Elastic Volume Weighted Moving Average (EVWMA), which uses the volume to define the period of the moving average
-Zero Lag Exponential Moving Average (ZLEMA)
-Volume Weighted Average Price (VWAP) and moving volume weighted average price
-Hull Moving Average (HMA), developed by Alan Hull
-Arnaud Legoux Moving Average (ALMA), which uses the curve of the Normal (Gaussian) distribution

# We create a response variable. To predict the price of future days, we
# apply the lag function to the price change

# We calculate the simple moving averages (SMA) of the volume for the past
# 10, 20 and 60 days
library(TTR)
DDAIF$Avg_volume_10 <- SMA(DDAIF$DDAIF.Volume, n = 10)
DDAIF$Avg_volume_20 <- SMA(DDAIF$DDAIF.Volume, n = 20)
DDAIF$Avg_volume_60 <- SMA(DDAIF$DDAIF.Volume, n = 60)

# We calculate today's volume as a % of the average volume of the above
# periods (the column names must match the Avg_volume_* columns created above)
DDAIF$Volume_perc_avg_10 <- (DDAIF$DDAIF.Volume/DDAIF$Avg_volume_10) * 100
DDAIF$Volume_perc_avg_20 <- (DDAIF$DDAIF.Volume/DDAIF$Avg_volume_20) * 100
DDAIF$Volume_perc_avg_60 <- (DDAIF$DDAIF.Volume/DDAIF$Avg_volume_60) * 100

# We calculate the range between high and low
DDAIF$Range <- DDAIF$DDAIF.High - DDAIF$DDAIF.Low

# % change of closing price.
DDAIF$perc_change_closing <- (DDAIF$DDAIF.Close - lag(DDAIF$DDAIF.Close))/lag(DDAIF$DDAIF.Close) * 
    100

# Change between the prior day's closing price and today's closing price
DDAIF$change_from_yest <- DDAIF$DDAIF.Close - lag(DDAIF$DDAIF.Close)

# We calculate the moving averages (MA) again, this time for the range, for
# the past 10, 20 and 60 days
DDAIF$moving_avg_10 <- SMA(DDAIF$Range, n = 10)
DDAIF$moving_avg_20 <- SMA(DDAIF$Range, n = 20)
DDAIF$moving_avg_60 <- SMA(DDAIF$Range, n = 60)

# We calculate the % of the average range of the above days
DDAIF$perc_moving_avg_10 <- (DDAIF$Range/DDAIF$moving_avg_10) * 100
DDAIF$perc_moving_avg_20 <- (DDAIF$Range/DDAIF$moving_avg_20) * 100
DDAIF$perc_moving_avg_60 <- (DDAIF$Range/DDAIF$moving_avg_60) * 100

# The total amount of cash traded: closing price multiplied by the volume
# (in dollars)
DDAIF$cash_tradet <- DDAIF$DDAIF.Close * DDAIF$DDAIF.Volume

# The average amount of cash traded for the same periods as above
DDAIF$avg_cash_trated_10 <- SMA(DDAIF$cash_tradet, n = 10)
DDAIF$avg_cash_trated_20 <- SMA(DDAIF$cash_tradet, n = 20)
DDAIF$avg_cash_trated_60 <- SMA(DDAIF$cash_tradet, n = 60)

# Today's cash traded as a % of the average cash traded.
DDAIF$Avg_Dollar_volume_pct_10 <- (DDAIF$cash_tradet/DDAIF$avg_cash_trated_10) * 
    100
DDAIF$Avg_Dollar_volume_pct_20 <- (DDAIF$cash_tradet/DDAIF$avg_cash_trated_20) * 
    100
DDAIF$Avg_Dollar_volume_pct_60 <- (DDAIF$cash_tradet/DDAIF$avg_cash_trated_60) * 
    100
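
The response variable mentioned in the first comment of the code above is not shown in the listing. One plausible reading, sketched here in base R on a plain numeric vector, is to use the next day's price change as the target. The function name `make_response`, the choice of the next-day change as target, and the toy prices are all assumptions and not necessarily what was done in this analysis.

```r
# Build a next-day target: row t holds the change from close[t] to close[t + 1].
make_response <- function(close) {
  change <- c(NA, diff(close))  # change from yesterday (as in change_from_yest)
  c(change[-1], NA)             # shift one day back; the last day has no target
}

# Hypothetical closing prices, not Daimler data.
close <- c(50, 52, 51, 55)
target <- make_response(close)
print(data.frame(close, target))
```

Shifting the change one day back means the features of day t are paired with a value that only becomes known on day t + 1, which avoids leaking the answer into the predictors. The final row has no target and would be dropped before training.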

Augmented Dickey-Fuller test AND correlation test

We calculate the correlations of all numerical variables

We display the correlation of our features, in order to choose the ones without high correlation

We remove the highly correlated variables to avoid overfitting of models
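
The removal step can be sketched in base R as a greedy filter over the absolute correlation matrix. This is an illustration under stated assumptions, not the code used here: the function name `drop_correlated`, the 0.9 cutoff, and the toy data frame are hypothetical. The caret package offers `findCorrelation` for the same purpose, although the text does not say whether it was used.

```r
# Keep a column only if its absolute correlation with every column kept
# so far stays at or below the cutoff.
drop_correlated <- function(df, cutoff = 0.9) {
  cm <- abs(cor(df, use = "pairwise.complete.obs"))
  keep <- character(0)
  for (nm in colnames(cm)) {
    too_close <- any(cm[nm, keep] > cutoff, na.rm = TRUE)
    if (!too_close) keep <- c(keep, nm)
  }
  df[, keep, drop = FALSE]
}

# Hypothetical toy features: b is almost a copy of a, c is independent noise.
set.seed(1)
x <- rnorm(200)
toy <- data.frame(a = x, b = x + rnorm(200, sd = 0.01), c = rnorm(200))

kept <- names(drop_correlated(toy))
print(kept)
```

The result depends on column order, because the first column of each correlated group is the one that survives. `use = "pairwise.complete.obs"` is chosen so that the leading NA values produced by the moving averages do not turn the whole matrix into NA.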

KERAS DEEP LEARNING: TensorFlow backend

If you have already installed Keras and TensorFlow, skip the installation commands below and only load the two packages.

devtools::install_github("rstudio/keras")
devtools::install_github("rstudio/tensorflow")
tensorflow::install_tensorflow()

library(keras)
library(tensorflow)

06.Keras with LSTM


We will apply deep learning networks built as a linear stack of densely connected layers

We train the NN model

Plot of Keras Model History

Interactive plot of Keras Model Predictions vs Actuals

KERAS PRED  REAL PRICES
  61.47425        66.77
  62.02896        66.31
  60.32361        65.07
  62.72711        64.31
  61.83586        65.03
  59.63830        65.13
  60.34718        65.74

07.Lasso regression model


With the caret package, we will apply cross-validation in order to find the optimal hyperparameters
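
To illustrate what the Lasso penalty and the hyperparameter search do, here is a self-contained base R sketch. It is a stand-in and not the caret workflow used in this analysis: it fits the Lasso by cyclic coordinate descent and picks the penalty `lambda` with a single hold-out split instead of full cross-validation. The function names, the lambda grid, and the simulated data are all assumptions.

```r
# Soft-thresholding operator: shrinks z towards zero by g.
soft_threshold <- function(z, g) sign(z) * pmax(abs(z) - g, 0)

# Minimises (1 / (2n)) * sum((y - X b)^2) + lambda * sum(abs(b))
# by cyclic coordinate descent (no intercept: X and y are centred).
lasso_cd <- function(X, y, lambda, n_iter = 200) {
  n <- nrow(X)
  b <- rep(0, ncol(X))
  for (it in seq_len(n_iter)) {
    for (j in seq_len(ncol(X))) {
      # Partial residual that leaves out the contribution of feature j.
      r_j <- y - X[, -j, drop = FALSE] %*% b[-j]
      b[j] <- soft_threshold(sum(X[, j] * r_j) / n, lambda) /
        (sum(X[, j]^2) / n)
    }
  }
  b
}

# Simulated data: only the first two of five features matter.
set.seed(42)
n <- 200
X <- scale(matrix(rnorm(n * 5), n, 5))
y <- 3 * X[, 1] - 2 * X[, 2] + rnorm(n, sd = 0.5)
y <- y - mean(y)

# Hold-out search over a small, hypothetical lambda grid.
lambdas <- c(0.01, 0.1, 1, 5)
train <- 1:150
valid <- 151:200
val_mse <- sapply(lambdas, function(l) {
  b <- lasso_cd(X[train, ], y[train], l)
  mean((y[valid] - X[valid, ] %*% b)^2)
})
best_lambda <- lambdas[which.min(val_mse)]

b_best <- lasso_cd(X, y, best_lambda)
b_heavy <- lasso_cd(X, y, 5)  # a very large penalty removes every feature
print(round(b_best, 3))
print(b_heavy)
```

A larger `lambda` pushes more coefficients exactly to zero, which is how the Lasso performs variable selection. For time series, a split that respects time order (training on earlier rows, validating on later ones, as above) is usually safer than random folds.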

Display of the Lasso regression variable penalties

The Lasso regression used 11 variables and did not use 5 variables.

Interactive plot of Keras and Lasso regression: Predicted vs Actual Prices
            KERAS PRED  LASSO PRED  REAL PRICES
2019-04-22    61.47425    63.98916        66.77
2019-04-23    62.02896    55.00517        66.31
2019-04-24    60.32361    63.50995        65.07
2019-04-25    62.72711    56.25471        64.31
2019-04-26    61.83586    63.31208        65.03
2019-04-29    59.63830    56.38149        65.13
2019-04-30    60.34718    68.96010        65.74

09.Results of all models


            KERAS PRED  LASSO PRED    XGB PRED  REAL PRICES
2019-04-22    61.47425    63.98916    61.09808        66.77
2019-04-23    62.02896    55.00517    62.61695        66.31
2019-04-24    60.32361    63.50995    59.31833        65.07
2019-04-25    62.72711    56.25471    62.09966        64.31
2019-04-26    61.83586    63.31208    56.91272        65.03
2019-04-29    59.63830    56.38149    61.77945        65.13
2019-04-30    60.34718    68.96010    56.54294        65.74
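
The table can be summarised with two common error measures. The sketch below uses the values from the table above. The choice of root mean squared error (RMSE) and mean absolute error (MAE) is an assumption, as the text does not name the metrics used to judge the models.

```r
# Predictions and actual prices copied from the results table above.
results <- data.frame(
  KERAS = c(61.47425, 62.02896, 60.32361, 62.72711, 61.83586, 59.63830, 60.34718),
  LASSO = c(63.98916, 55.00517, 63.50995, 56.25471, 63.31208, 56.38149, 68.96010),
  XGB   = c(61.09808, 62.61695, 59.31833, 62.09966, 56.91272, 61.77945, 56.54294),
  REAL  = c(66.77, 66.31, 65.07, 64.31, 65.03, 65.13, 65.74)
)
models <- c("KERAS", "LASSO", "XGB")

# Root mean squared error and mean absolute error against the real prices.
rmse <- sapply(results[models], function(p) sqrt(mean((p - results$REAL)^2)))
mae  <- sapply(results[models], function(p) mean(abs(p - results$REAL)))

print(round(rbind(rmse, mae), 3))
print(names(which.min(rmse)))
```

With only seven trading days, these numbers describe this particular window and should not be read as a ranking of the algorithms in general.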

10.Conclusion


In ordinary periods, the daily fluctuation of the share price is usually between 1 and 2 percent. Unfortunately, the above models showed high daily fluctuation, despite our adding the mathematical technical indicators to the datasets before training the models.

Unfortunately, our models are currently not adequate to forecast market time series successfully.
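
The fluctuation can be checked directly on the seven days reported in the results table, by comparing the day-to-day percentage change of each prediction series with that of the real prices. This is a minimal base R sketch. The helper name `pct_change` and the use of the mean absolute change as a summary are assumptions.

```r
# Values copied from the results table in section 09.
results <- data.frame(
  KERAS = c(61.47425, 62.02896, 60.32361, 62.72711, 61.83586, 59.63830, 60.34718),
  LASSO = c(63.98916, 55.00517, 63.50995, 56.25471, 63.31208, 56.38149, 68.96010),
  XGB   = c(61.09808, 62.61695, 59.31833, 62.09966, 56.91272, 61.77945, 56.54294),
  REAL  = c(66.77, 66.31, 65.07, 64.31, 65.03, 65.13, 65.74)
)

# Day-to-day change in percent of the previous day's value.
pct_change <- function(x) 100 * diff(x) / head(x, -1)

# Average size of the daily move for every series.
mean_abs_move <- sapply(results, function(x) mean(abs(pct_change(x))))
print(round(mean_abs_move, 2))
```

A series whose average daily move is far above that of the real prices is jumping around more than the market itself did in this window.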

11.Proposal


In another paper, I also created some models to analyze the correlation between social media sentiment and the Daimler share price. There, my models' forecasts were significantly more successful. I would propose creating a model that analyzes and combines the results of:

-Social media sentiment
-Economic News
-Market Time Series Analysis

Thank you for reading my analysis
KR
Niko

Contact


(https://www.linkedin.com/in/niko-papacosmas-mba-pmp-mcse-695a2695/)

REFERENCES

[1] Ruoxuan Xiong, Eric P. Nichols, Yuan Shen. Deep Learning Stock Volatility with Google Domestic Trends. Cornell University (arXiv:1512.04916v3) [q-fin.CP] (2016)

[2] Junran Wu, Ke Xu, Jichang Zhao. Online reviews can predict long-term returns of individual stocks. Cornell University (2019)