Getting the data for forecasting

In this section we’re collecting the data for the target variable (the variable we want to forecast) and for potentially useful explanatory variables. Since the goal is to forecast, we need at least 250 observations at our disposal. The time horizon needed to reach that number depends on the frequency of observation.

That means we could use daily, weekly, monthly (we’ll need at least 21 years) or quarterly data (we’ll need at least 63 years).
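The year counts above follow from dividing the 250-observation minimum by the number of observations per year. A minimal sketch of that arithmetic is shown here; the figure of 252 trading days per year and the variable names are assumptions made for illustration, not values taken from the project.

```r
# Minimum number of observations required for forecasting
min_obs <- 250

# Approximate observations per year for each frequency
# (252 trading days per year is a common convention, assumed here)
obs_per_year <- c(daily = 252, weekly = 52, monthly = 12, quarterly = 4)

# Years of history needed, rounded up to whole years
min_years <- ceiling(min_obs / obs_per_year)
print(min_years)
```

For monthly and quarterly data this gives 21 and 63 years, matching the figures quoted above; under the same assumptions, daily data needs about 1 year and weekly data about 5.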

We settled on daily data for the period from 01-01-2019 to 06-10-2021 (January 1, 2019 to June 10, 2021), with 616 observations in total, or nearly two and a half years of trading data.

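For the analysis, our target variable is Microsoft (MSFT) stock. Some of the explanatory variables (used for regression modeling and forecasting and for the initial data analysis) are the S&P 500 index (GSPC), the Nasdaq Composite index (IXIC), Apple (AAPL), Google (GOOG), IBM (IBM) and 3M (MMM).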

For the data collection process we’re using the quantmod library for R. The last six rows of each series are shown below.
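A download along these lines could be sketched with quantmod's `getSymbols()`. This is an illustration rather than the project's actual code: the helper `fetch_prices` is hypothetical, the Yahoo Finance source and the `^`-prefixed index tickers are assumptions, and the end date is set one day past the last observation on the assumption that the upper bound may be treated as exclusive.

```r
library(quantmod)

# Yahoo Finance tickers (assumed; index tickers carry a "^" prefix on Yahoo)
tickers <- c("MSFT", "^GSPC", "^IXIC", "AAPL", "GOOG", "IBM", "MMM")

# Hypothetical helper: download one ticker and return it as an xts object
# instead of letting getSymbols() assign it into the workspace
fetch_prices <- function(ticker, from = "2019-01-01", to = "2021-06-11") {
  getSymbols(ticker, src = "yahoo", from = from, to = to, auto.assign = FALSE)
}

# The calls below need an internet connection, so they are left commented out:
# msft <- fetch_prices("MSFT")
# tail(msft)   # last six rows of the series
# nrow(msft)   # number of daily observations
```

Returning the object with `auto.assign = FALSE` keeps each series in a named variable, which makes it easier to check the number of observations before modeling.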

Microsoft data

##            MSFT.Open MSFT.High MSFT.Low MSFT.Close MSFT.Volume MSFT.Adjusted
## 2021-06-03    245.22    246.34   243.00     245.71    25307700        245.71
## 2021-06-04    247.76    251.65   247.51     250.79    25281100        250.79
## 2021-06-07    249.98    254.09   249.81     253.81    23079200        253.81
## 2021-06-08    255.16    256.01   252.51     252.57    22455000        252.57
## 2021-06-09    253.81    255.53   253.21     253.59    17937600        253.59
## 2021-06-10    254.29    257.46   253.67     257.24    24563600        257.24

S&P500 data

##            GSPC.Open GSPC.High GSPC.Low GSPC.Close GSPC.Volume GSPC.Adjusted
## 2021-06-03   4191.43   4204.39  4167.93    4192.85  4579450000       4192.85
## 2021-06-04   4206.05   4233.45  4206.05    4229.89  3487070000       4229.89
## 2021-06-07   4229.34   4232.34  4215.66    4226.52  3835570000       4226.52
## 2021-06-08   4233.81   4236.74  4208.41    4227.26  3943870000       4227.26
## 2021-06-09   4232.99   4237.09  4218.74    4219.55  3902870000       4219.55
## 2021-06-10   4228.56   4249.74  4220.34    4239.18  3502480000       4239.18

Nasdaq data

##            IXIC.Open IXIC.High IXIC.Low IXIC.Close IXIC.Volume IXIC.Adjusted
## 2021-06-03  13655.75  13684.13 13548.93   13614.51  5367460000      13614.51
## 2021-06-04  13697.25  13826.82 13692.01   13814.49  4341800000      13814.49
## 2021-06-07  13802.82  13889.11 13784.89   13881.72  4602940000      13881.72
## 2021-06-08  13946.32  13981.72 13831.98   13924.91  5894140000      13924.91
## 2021-06-09  13980.23  14003.50 13906.45   13911.75  5607720000      13911.75
## 2021-06-10  13933.88  14031.19 13904.40   14020.33  4889500000      14020.33

Apple data

##            AAPL.Open AAPL.High AAPL.Low AAPL.Close AAPL.Volume AAPL.Adjusted
## 2021-06-03    124.68    124.85   123.13     123.54    76229200        123.54
## 2021-06-04    124.07    126.16   123.85     125.89    75169300        125.89
## 2021-06-07    126.17    126.32   124.83     125.90    71057600        125.90
## 2021-06-08    126.60    128.46   126.21     126.74    74403800        126.74
## 2021-06-09    127.21    127.75   126.52     127.13    56877900        127.13
## 2021-06-10    127.02    128.19   125.94     126.11    71186400        126.11

Google data

##            GOOG.Open GOOG.High GOOG.Low GOOG.Close GOOG.Volume GOOG.Adjusted
## 2021-06-03   2395.02  2409.745 2382.830    2404.61      917300       2404.61
## 2021-06-04   2422.52  2453.859 2417.770    2451.76     1297400       2451.76
## 2021-06-07   2451.32  2468.000 2441.073    2466.09     1192500       2466.09
## 2021-06-08   2479.90  2494.495 2468.240    2482.85     1253000       2482.85
## 2021-06-09   2499.50  2505.000 2487.330    2491.40     1006300       2491.40
## 2021-06-10   2494.01  2523.260 2494.000    2521.60     1561700       2521.60

IBM data

##            IBM.Open IBM.High IBM.Low IBM.Close IBM.Volume IBM.Adjusted
## 2021-06-03   144.91   145.88  144.04    145.55    4130600       145.55
## 2021-06-04   146.00   147.55  145.76    147.42    3117900       147.42
## 2021-06-07   147.55   148.74  147.17    148.02    3462700       148.02
## 2021-06-08   148.12   150.20  148.12    149.07    5080100       149.07
## 2021-06-09   149.03   151.07  148.82    150.67    5303300       150.67
## 2021-06-10   151.47   152.84  149.76    150.54    4758500       150.54

3M data

##            MMM.Open MMM.High MMM.Low MMM.Close MMM.Volume MMM.Adjusted
## 2021-06-03   202.50   204.67  201.81    203.67    1902100       203.67
## 2021-06-04   204.11   206.12  203.77    206.05    1868600       206.05
## 2021-06-07   206.35   206.81  203.31    203.73    1531100       203.73
## 2021-06-08   202.00   204.04  201.14    203.59    1701400       203.59
## 2021-06-09   203.56   203.57  201.94    202.74    1707600       202.74
## 2021-06-10   204.15   204.97  202.78    203.13    1953100       203.13
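Before any regression modeling, the separate series would typically be aligned on a common date index. The sketch below illustrates this with the xts package, using only the six adjusted closing prices of MSFT and GSPC printed in the tables above; the object names are illustrative, and with the full quantmod objects the adjusted column could be extracted with `Ad()` instead of typing the values in.

```r
library(xts)

# Trading days shown in the tables above
dates <- as.Date(c("2021-06-03", "2021-06-04", "2021-06-07",
                   "2021-06-08", "2021-06-09", "2021-06-10"))

# Adjusted closing prices copied from the MSFT and GSPC tables
msft <- xts(c(245.71, 250.79, 253.81, 252.57, 253.59, 257.24), order.by = dates)
gspc <- xts(c(4192.85, 4229.89, 4226.52, 4227.26, 4219.55, 4239.18), order.by = dates)

# merge() aligns the two series on their shared date index
prices <- merge(msft, gspc)
colnames(prices) <- c("MSFT", "GSPC")

print(prices)
```

The same `merge()` call extends to the remaining series, giving one object with the target variable and the explanatory variables side by side.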

 

Statistics and Financial Data Analysis

A work by: Nikola Krivacevic, Aleksandar Milinkovic and Milos Milunovic

The entire forecasting project is on GitHub: https://github.com/mcf-long-short/statistics-stocks-forecasting