To start our exploratory data analysis, we load all the libraries that will be used.
library(xts)
library(dplyr)
library(zoo)
library(tseries)
library(stats)
library(forecast)
library(astsa)
library(corrplot)
library(AER)
library(vars)
library(dynlm)
library(TSstudio)
library(tidyverse)
library(sarima)
library(dygraphs)
library(lubridate)
# Next, the data set with the companies' stock prices is loaded.
stocks <- read.csv("/Users/gabrielmedina/Downloads/entretainment_stocks.csv")
stocks
## Date Disney_Adj_Close Netflix_Adj_Close Nintendo_Adj_Close
## 1 2007-01-01 29.12 3.26 37.10
## 2 2007-02-01 28.36 3.22 33.10
## 3 2007-03-01 28.51 3.31 36.30
## 4 2007-04-01 28.96 3.17 40.05
## 5 2007-05-01 29.34 3.13 43.65
## 6 2007-06-01 28.65 2.77 45.85
## 7 2007-07-01 27.70 2.46 61.25
## 8 2007-08-01 28.20 2.50 58.15
## 9 2007-09-01 28.86 2.96 64.85
## 10 2007-10-01 29.07 3.78 78.50
## 11 2007-11-01 27.82 3.30 76.10
## 12 2007-12-01 27.09 3.80 74.05
## 13 2008-01-01 25.32 3.59 61.75
## 14 2008-02-01 27.50 4.51 62.40
## 15 2008-03-01 26.62 4.95 64.85
## 16 2008-04-01 27.51 4.57 68.69
## 17 2008-05-01 28.51 4.34 68.95
## 18 2008-06-01 26.47 3.72 69.85
## 19 2008-07-01 25.75 4.41 57.75
## 20 2008-08-01 27.45 4.41 61.10
## 21 2008-09-01 26.04 4.41 53.07
## 22 2008-10-01 21.98 3.54 39.00
## 23 2008-11-01 19.11 3.28 38.84
## 24 2008-12-01 19.25 4.27 47.75
## 25 2009-01-01 17.81 5.16 36.40
## 26 2009-02-01 14.44 5.18 35.25
## 27 2009-03-01 15.64 6.13 36.50
## 28 2009-04-01 18.86 6.47 33.65
## 29 2009-05-01 20.86 5.63 33.68
## 30 2009-06-01 20.09 5.91 34.47
## 31 2009-07-01 21.63 6.28 33.12
## 32 2009-08-01 22.42 6.23 33.55
## 33 2009-09-01 23.65 6.60 31.57
## 34 2009-10-01 23.57 7.64 31.41
## 35 2009-11-01 26.02 8.38 30.78
## 36 2009-12-01 27.77 7.87 29.82
## 37 2010-01-01 25.74 8.89 34.90
## 38 2010-02-01 27.21 9.44 33.87
## 39 2010-03-01 30.41 10.53 41.65
## 40 2010-04-01 32.09 14.13 41.90
## 41 2010-05-01 29.11 15.88 36.45
## 42 2010-06-01 27.44 15.52 37.27
## 43 2010-07-01 29.35 14.65 35.21
## 44 2010-08-01 28.34 17.93 34.75
## 45 2010-09-01 28.83 23.17 31.20
## 46 2010-10-01 31.47 24.80 32.15
## 47 2010-11-01 31.80 29.41 34.10
## 48 2010-12-01 32.67 25.10 36.33
## 49 2011-01-01 34.23 30.58 34.10
## 50 2011-02-01 38.52 29.52 36.65
## 51 2011-03-01 37.94 33.97 33.74
## 52 2011-04-01 37.95 33.24 29.79
## 53 2011-05-01 36.66 38.69 28.90
## 54 2011-06-01 34.38 37.53 23.30
## 55 2011-07-01 34.01 38.00 19.95
## 56 2011-08-01 29.99 33.57 21.98
## 57 2011-09-01 26.56 16.18 18.15
## 58 2011-10-01 30.71 11.73 18.90
## 59 2011-11-01 31.57 9.22 19.05
## 60 2011-12-01 33.02 9.90 16.94
## 61 2012-01-01 34.83 17.17 16.94
## 62 2012-02-01 37.60 15.82 18.51
## 63 2012-03-01 39.20 16.43 18.96
## 64 2012-04-01 38.60 11.45 16.72
## 65 2012-05-01 40.93 9.06 14.32
## 66 2012-06-01 43.42 9.78 14.52
## 67 2012-07-01 44.00 8.12 13.80
## 68 2012-08-01 44.29 8.53 13.93
## 69 2012-09-01 46.81 7.78 15.87
## 70 2012-10-01 43.98 11.32 16.15
## 71 2012-11-01 44.46 11.67 15.10
## 72 2012-12-01 44.58 13.23 13.31
## 73 2013-01-01 48.98 23.61 12.16
## 74 2013-02-01 49.63 26.87 12.06
## 75 2013-03-01 51.64 27.04 13.44
## 76 2013-04-01 57.13 30.87 13.75
## 77 2013-05-01 57.35 32.32 12.47
## 78 2013-06-01 57.41 30.16 14.67
## 79 2013-07-01 58.77 34.93 15.80
## 80 2013-08-01 55.30 40.56 14.09
## 81 2013-09-01 58.63 44.17 14.12
## 82 2013-10-01 62.36 46.07 14.01
## 83 2013-11-01 64.13 52.26 16.08
## 84 2013-12-01 69.46 52.60 16.68
## 85 2014-01-01 66.82 58.48 14.63
## 86 2014-02-01 74.37 63.66 15.40
## 87 2014-03-01 73.69 50.29 14.89
## 88 2014-04-01 73.02 46.01 13.08
## 89 2014-05-01 77.32 59.69 14.56
## 90 2014-06-01 78.91 62.94 14.95
## 91 2014-07-01 79.04 60.39 13.83
## 92 2014-08-01 82.72 68.23 13.89
## 93 2014-09-01 81.94 64.45 13.59
## 94 2014-10-01 84.10 56.11 13.67
## 95 2014-11-01 85.14 49.51 14.50
## 96 2014-12-01 86.69 48.80 12.95
## 97 2015-01-01 84.78 63.11 12.04
## 98 2015-02-01 97.00 67.84 13.34
## 99 2015-03-01 97.76 59.53 18.43
## 100 2015-04-01 101.33 79.50 21.04
## 101 2015-05-01 102.87 89.15 21.16
## 102 2015-06-01 106.38 93.85 20.98
## 103 2015-07-01 111.84 114.31 22.01
## 104 2015-08-01 95.51 115.03 25.62
## 105 2015-09-01 95.81 103.26 20.97
## 106 2015-10-01 106.62 108.38 20.22
## 107 2015-11-01 106.37 123.33 19.13
## 108 2015-12-01 98.51 114.38 17.26
## 109 2016-01-01 90.40 91.84 17.57
## 110 2016-02-01 90.12 93.41 17.39
## 111 2016-03-01 93.69 102.23 17.75
## 112 2016-04-01 97.42 90.03 17.06
## 113 2016-05-01 93.61 102.57 18.34
## 114 2016-06-01 92.29 91.48 17.78
## 115 2016-07-01 90.52 91.25 25.78
## 116 2016-08-01 89.77 97.45 27.28
## 117 2016-09-01 88.25 98.55 32.98
## 118 2016-10-01 88.08 124.87 30.09
## 119 2016-11-01 94.19 117.00 30.83
## 120 2016-12-01 99.04 123.80 25.95
## 121 2017-01-01 105.96 140.71 24.73
## 122 2017-02-01 105.42 142.13 26.11
## 123 2017-03-01 108.59 147.81 29.02
## 124 2017-04-01 110.70 152.20 31.68
## 125 2017-05-01 103.37 163.07 37.83
## 126 2017-06-01 101.75 149.41 41.82
## 127 2017-07-01 105.27 181.66 42.46
## 128 2017-08-01 97.63 174.71 41.63
## 129 2017-09-01 95.10 181.35 45.95
## 130 2017-10-01 94.36 196.43 48.65
## 131 2017-11-01 101.12 187.58 50.98
## 132 2017-12-01 103.72 191.96 45.07
## 133 2018-01-01 105.68 270.30 57.08
## 134 2018-02-01 100.32 291.38 57.05
## 135 2018-03-01 97.68 295.35 55.51
## 136 2018-04-01 97.57 312.46 52.48
## 137 2018-05-01 96.74 351.60 51.04
## 138 2018-06-01 101.93 391.43 40.79
## 139 2018-07-01 110.44 337.45 42.67
## 140 2018-08-01 109.82 367.68 44.95
## 141 2018-09-01 114.64 374.13 45.47
## 142 2018-10-01 112.57 301.78 39.13
## 143 2018-11-01 113.22 286.13 37.87
## 144 2018-12-01 107.49 267.66 33.10
## 145 2019-01-01 110.17 339.50 37.24
## 146 2019-02-01 111.48 358.10 34.25
## 147 2019-03-01 109.69 356.56 35.87
## 148 2019-04-01 135.32 370.54 43.08
## 149 2019-05-01 130.45 343.28 44.14
## 150 2019-06-01 137.95 367.32 45.77
## 151 2019-07-01 141.28 322.99 46.19
## 152 2019-08-01 136.44 293.75 47.25
## 153 2019-09-01 129.54 267.62 46.60
## 154 2019-10-01 129.15 287.41 46.52
## 155 2019-11-01 150.68 314.66 48.39
## 156 2019-12-01 143.77 323.57 49.90
## 157 2020-01-01 138.31 345.09 45.90
## 158 2020-02-01 117.65 369.03 41.98
## 159 2020-03-01 96.60 375.50 48.28
## 160 2020-04-01 108.15 419.85 51.44
## 161 2020-05-01 117.30 419.73 50.84
## 162 2020-06-01 111.51 455.04 55.90
## 163 2020-07-01 116.94 488.88 55.01
## 164 2020-08-01 131.87 529.56 67.37
## 165 2020-09-01 124.08 500.03 70.90
## 166 2020-10-01 121.25 475.74 67.73
## 167 2020-11-01 148.01 490.70 70.95
## 168 2020-12-01 181.18 540.73 80.52
## 169 2021-01-01 168.17 532.39 72.27
## 170 2021-02-01 189.04 538.85 77.12
## 171 2021-03-01 184.52 521.66 70.80
## 172 2021-04-01 186.02 513.47 71.89
## 173 2021-05-01 178.65 502.81 77.32
## 174 2021-06-01 175.77 528.21 72.53
## 175 2021-07-01 176.02 517.57 64.25
## 176 2021-08-01 181.30 569.19 60.08
## 177 2021-09-01 169.17 610.34 59.25
## 178 2021-10-01 169.07 690.31 55.25
## 179 2021-11-01 144.90 641.90 55.08
## 180 2021-12-01 154.89 602.44 58.37
## 181 2022-01-01 142.97 427.14 12.22
## 182 2022-02-01 148.46 394.52 12.71
## 183 2022-03-01 137.16 374.59 12.58
## 184 2022-04-01 111.63 190.36 11.40
## 185 2022-05-01 110.44 197.44 11.12
## 186 2022-06-01 94.40 174.87 10.76
## 187 2022-07-01 106.10 224.90 11.20
## 188 2022-08-01 112.08 223.56 10.22
## 189 2022-09-01 94.33 235.44 10.19
## 190 2022-10-01 106.54 291.88 10.12
## 191 2022-11-01 97.87 305.53 10.73
## 192 2022-12-01 86.88 294.88 10.42
## WBD_Adj_Close EA_Adj_Close Paramount_Adj_Close
## 1 8.47 49.48 22.05
## 2 8.21 49.90 21.48
## 3 9.78 49.84 21.64
## 4 11.11 49.89 22.64
## 5 11.95 48.36 23.70
## 6 11.75 46.83 23.90
## 7 12.12 48.13 22.75
## 8 12.84 52.39 22.60
## 9 14.74 55.41 22.60
## 10 14.57 60.48 20.75
## 11 12.50 55.61 19.84
## 12 12.85 57.80 19.89
## 13 11.87 46.88 18.40
## 14 11.53 46.80 16.66
## 15 10.84 49.40 16.29
## 16 11.83 50.93 17.02
## 17 13.38 49.68 15.92
## 18 11.22 43.97 14.56
## 19 10.16 42.73 12.22
## 20 10.34 48.30 12.08
## 21 7.28 36.61 11.07
## 22 6.97 22.54 7.37
## 23 7.67 18.86 5.06
## 24 7.24 15.87 6.22
## 25 7.41 15.28 4.55
## 26 7.93 16.14 3.40
## 27 8.19 18.00 3.06
## 28 9.70 20.14 5.69
## 29 11.47 22.75 5.96
## 30 11.50 21.49 5.59
## 31 12.52 21.25 6.66
## 32 13.25 18.03 8.41
## 33 14.76 18.85 9.79
## 34 14.05 18.05 9.61
## 35 16.33 16.71 10.46
## 36 15.67 17.57 11.47
## 37 15.16 16.11 10.60
## 38 15.92 16.41 10.64
## 39 17.27 18.47 11.42
## 40 19.79 19.17 13.33
## 41 19.24 16.34 11.97
## 42 18.25 14.25 10.63
## 43 19.73 15.76 12.20
## 44 19.29 15.07 11.41
## 45 22.25 16.28 13.09
## 46 22.83 15.67 14.02
## 47 20.84 14.76 13.95
## 48 21.31 16.21 15.78
## 49 19.93 15.43 16.47
## 50 22.03 18.60 19.81
## 51 20.39 19.33 20.79
## 52 22.62 19.97 20.99
## 53 22.26 24.16 23.26
## 54 20.93 23.35 23.71
## 55 20.34 22.02 22.86
## 56 21.60 22.35 20.93
## 57 19.22 20.24 17.02
## 58 22.21 23.11 21.65
## 59 21.45 22.95 21.85
## 60 20.94 20.39 22.77
## 61 21.91 18.39 23.99
## 62 23.84 16.17 25.18
## 63 25.86 16.32 28.56
## 64 27.81 15.22 28.21
## 65 25.60 13.48 26.97
## 66 27.59 12.22 27.70
## 67 25.87 10.91 28.36
## 68 28.02 13.19 30.80
## 69 30.46 12.56 30.80
## 70 30.16 12.22 27.56
## 71 30.87 14.66 30.60
## 72 32.44 14.37 32.36
## 73 35.45 15.57 35.60
## 74 37.48 17.35 37.03
## 75 40.24 17.52 39.85
## 76 40.28 17.43 39.17
## 77 40.30 22.75 42.36
## 78 39.47 22.75 41.82
## 79 40.74 25.85 45.33
## 80 39.61 26.36 43.84
## 81 43.14 25.28 47.32
## 82 45.41 25.98 50.85
## 83 44.59 21.94 50.35
## 84 46.20 22.70 54.80
## 85 40.77 26.13 50.59
## 86 42.58 28.29 57.79
## 87 42.26 28.71 53.24
## 88 38.78 28.01 49.85
## 89 39.33 34.76 51.45
## 90 37.96 35.50 53.63
## 91 43.54 33.25 49.14
## 92 43.72 37.45 51.27
## 93 37.80 35.24 46.27
## 94 35.35 40.54 47.01
## 95 34.90 43.47 47.58
## 96 34.45 46.53 47.98
## 97 28.99 54.29 47.65
## 98 32.30 56.59 51.38
## 99 30.76 58.21 52.71
## 100 32.36 57.49 54.15
## 101 33.94 62.11 53.79
## 102 33.26 65.81 48.37
## 103 33.02 70.81 46.72
## 104 26.60 65.46 39.53
## 105 26.03 67.05 34.86
## 106 29.44 71.32 40.79
## 107 31.14 67.08 44.26
## 108 26.68 68.01 41.32
## 109 27.59 63.88 41.77
## 110 25.00 63.57 42.55
## 111 28.63 65.42 48.45
## 112 27.31 61.21 49.31
## 113 27.85 75.95 48.69
## 114 25.23 74.97 48.01
## 115 25.09 75.53 46.19
## 116 25.51 80.38 45.13
## 117 26.92 84.51 48.42
## 118 26.11 77.70 50.25
## 119 27.09 78.42 53.89
## 120 27.41 77.94 56.47
## 121 28.35 82.56 57.41
## 122 28.76 85.60 58.68
## 123 29.09 88.59 61.74
## 124 28.78 93.83 59.41
## 125 26.50 112.15 54.54
## 126 25.83 104.62 56.93
## 127 24.60 115.53 58.93
## 128 22.21 120.24 57.35
## 129 21.29 116.83 51.92
## 130 18.88 118.36 50.38
## 131 19.02 105.24 50.33
## 132 22.38 103.97 52.97
## 133 25.07 125.64 51.88
## 134 24.32 122.41 47.71
## 135 21.43 119.98 46.28
## 136 23.65 116.75 44.46
## 137 21.09 129.55 45.52
## 138 27.50 139.55 50.81
## 139 26.58 127.41 47.77
## 140 27.83 112.23 48.09
## 141 32.00 119.24 52.11
## 142 32.39 90.03 52.19
## 143 30.72 83.20 49.30
## 144 24.74 78.09 39.78
## 145 28.38 91.28 45.17
## 146 28.90 94.78 45.85
## 147 27.02 100.57 43.40
## 148 30.90 93.67 46.99
## 149 27.26 92.11 44.25
## 150 30.70 100.21 45.74
## 151 30.31 91.54 47.39
## 152 27.60 92.71 38.69
## 153 26.63 96.80 37.14
## 154 26.96 95.40 33.29
## 155 32.94 99.96 37.30
## 156 32.74 106.39 38.77
## 157 29.26 106.80 31.53
## 158 25.70 100.32 22.73
## 159 19.44 99.13 12.94
## 160 22.42 113.07 16.19
## 161 21.75 121.60 19.45
## 162 21.10 130.68 21.87
## 163 21.10 140.15 24.70
## 164 22.07 138.02 26.38
## 165 21.77 129.05 26.54
## 166 20.24 118.58 27.29
## 167 26.91 126.42 33.70
## 168 30.09 142.11 35.59
## 169 41.42 141.90 46.64
## 170 53.03 132.75 62.02
## 171 43.46 134.14 43.37
## 172 37.66 140.96 39.56
## 173 32.11 141.81 40.91
## 174 30.68 142.70 43.59
## 175 29.01 143.00 39.70
## 176 28.84 144.24 40.20
## 177 25.38 141.47 38.32
## 178 23.44 139.48 35.34
## 179 23.27 123.54 30.20
## 180 23.54 131.17 29.45
## 181 27.91 131.28 31.67
## 182 28.05 128.74 28.98
## 183 24.92 125.20 35.79
## 184 18.15 116.98 27.77
## 185 18.45 137.40 32.74
## 186 13.42 120.55 23.54
## 187 15.00 130.22 22.77
## 188 13.24 125.89 22.52
## 189 11.50 114.99 18.33
## 190 13.00 125.17 17.83
## 191 11.40 129.96 19.54
## 192 9.48 121.60 16.42
Disney was the company chosen for the analysis, so we create a new data frame that contains only the Disney share prices.
disney_bd <- data.frame(
  Date = stocks$Date,
  Disney = stocks$Disney_Adj_Close
)
head(disney_bd)
## Date Disney
## 1 2007-01-01 29.12
## 2 2007-02-01 28.36
## 3 2007-03-01 28.51
## 4 2007-04-01 28.96
## 5 2007-05-01 29.34
## 6 2007-06-01 28.65
We check the new data frame and inspect the type of each variable.
summary(disney_bd)
## Date Disney
## Length:192 Min. : 14.44
## Class :character 1st Qu.: 31.55
## Mode :character Median : 85.92
## Mean : 78.71
## 3rd Qu.:108.26
## Max. :189.04
str(disney_bd)
## 'data.frame': 192 obs. of 2 variables:
## $ Date : chr "2007-01-01" "2007-02-01" "2007-03-01" "2007-04-01" ...
## $ Disney: num 29.1 28.4 28.5 29 29.3 ...
Inspecting the data structure, we can see that the dates are stored as character strings rather than as dates, so they must be converted.
disney_bd$Date <- as.Date(disney_bd$Date, format = "%Y-%m-%d")
disney_bd$Date <- ymd(disney_bd$Date)
# Verify
class(disney_bd$Date)
## [1] "Date"
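As a minimal sketch of what this conversion buys us, the example below applies as.Date() to a few date strings copied from the table above. The names dates_chr and dates are illustrative and are not objects from the analysis.

```r
# Illustrative sketch: dates_chr is a hypothetical vector, not an object from the analysis
dates_chr <- c("2007-01-01", "2007-02-01", "2007-03-01")

# As character strings, the values cannot be used for date arithmetic
class(dates_chr)

# Convert to the Date class using the year-month-day format
dates <- as.Date(dates_chr, format = "%Y-%m-%d")
class(dates)

# Once converted, date arithmetic works: days between consecutive observations
diff(dates)
```

Base R's as.Date() is enough for this format; the ymd() call from lubridate used above would give the same result.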
For better data handling, we convert our dates to the appropriate data type.
summary(disney_bd)
## Date Disney
## Min. :2007-01-01 Min. : 14.44
## 1st Qu.:2010-12-24 1st Qu.: 31.55
## Median :2014-12-16 Median : 85.92
## Mean :2014-12-16 Mean : 78.71
## 3rd Qu.:2018-12-08 3rd Qu.:108.26
## Max. :2022-12-01 Max. :189.04
Descriptive statistics: the main summary statistics of the data set are reviewed, confirming that the variables were read with the correct format and structure.
It is essential to examine how the variable behaves over the whole time series.
# Time series plot 1
plot(disney_bd$Date,disney_bd$Disney,type="l",col="blue", lwd=2, xlab ="Date",ylab ="Adjusted Close Price", main = "Disney Stock Price")
The previous graph shows clearly how the variable has behaved over the
years. At the beginning of the time series (2007-2009) the Disney share
price was relatively stable. After that, two trend patterns can be seen:
from 2009 to 2021 there is a clear positive trend, which reaches an
inflection point in early 2021, and from then on a negative trend
persists until the end of the time series. It is worth examining the
context of Disney to determine what caused that peak in the share price
and why it did not last in the following periods.
# Time series plot 2
disney_bdxts <- xts(disney_bd$Disney, order.by = disney_bd$Date)
plot(disney_bdxts)
In this second graph, the rise in the share price leading up to 2021 is easier to appreciate. Looking into the context of Disney, we found that in the last quarter of 2020 Disney launched its successful streaming service Disney+ in Mexico and Latin America, one of the strongest markets globally. The platform was very well received there, increasing Disney’s income and raising shareholders’ expectations about the future of the company, and thus the share price. Some time later, however, excitement about the streaming service began to fade, while major competitors such as HBO and Paramount launched streaming services of their own. This caused Disney stock to decline and show a negative trend in the latter part of the time series.
# Positive Trend
dygraph(disney_bdxts, main = "Disney Stocks") %>%
dyOptions(colors = RColorBrewer::brewer.pal(4, "Dark2")) %>%
dyShading(from = "2009-02-01",
to = "2021-02-01",
color = "#00FF00")
# Negative Trend
dygraph(disney_bdxts, main = "Disney Stocks") %>%
dyOptions(colors = RColorBrewer::brewer.pal(4, "Dark2")) %>%
dyShading(from = "2021-03-01",
to = "2022-12-01",
color = "#FF0000")
disney_bd_ts <- ts(disney_bd$Disney, frequency = 12, start = c(2007, 1))
disney_ts_decompose <- decompose(disney_bd_ts)
plot(disney_ts_decompose)
To better understand the time series, a decomposition will be used that
involves dividing the observed time series data into four different
elements: the observed values, the trend component, the seasonality
component and the random or residual component. By decomposing time
series data, we can isolate and analyze these individual aspects,
allowing us to discover trends, periodic patterns, and irregular
fluctuations in the data.
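To make the relationship between these components concrete, here is a minimal sketch on a synthetic monthly series (the series x and its parameters are assumptions for illustration, not the Disney data). It shows that in the default additive decomposition the observed values equal trend + seasonal + random wherever the trend is defined.

```r
# Illustrative sketch on synthetic data; x is a hypothetical series, not the Disney prices
set.seed(1)
n <- 48
x <- ts(50 + 0.5 * (1:n) + 5 * sin(2 * pi * (1:n) / 12) + rnorm(n),
        frequency = 12, start = c(2007, 1))

# Additive decomposition (the default type of decompose())
dec <- decompose(x)

# Rebuild the observed series from its three estimated components
recon <- dec$trend + dec$seasonal + dec$random

# The centered moving average leaves the trend undefined at both ends of the series
ok <- !is.na(recon)
sum(!ok)

# Where the trend exists, the components add back up to the observed values
all.equal(as.numeric(recon[ok]), as.numeric(x[ok]))

# The 12 seasonal effects are centered, so they sum to (numerically) zero
sum(dec$figure)
```

The missing values at both ends are why the trend and random panels of a decomposition plot are shorter than the observed panel.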
In this decomposition we can see the clear trends that had already been identified in the periods 2009-2021 and 2021-2022. Likewise, the seasonality component shows the patterns that repeat every year: at the end and the beginning of each year the share price rises, reflecting the increase in Disney’s income during the holiday season from the premieres of its main films, higher attendance at its theme parks and holiday gift purchases. The seasonal component therefore makes sense in the real context.
On the other hand, the random component is stable until 2017 and fluctuates strongly afterwards, which is explained by several events from that year onwards that moved Disney’s share price. For example, the purchase of FOX was announced in 2017 (and completed in 2019), and 2019 saw the successful culmination of the third phase of the Marvel Cinematic Universe. Likewise, in 2020 the COVID-19 pandemic coincided with the launch of the Disney+ streaming service in Latin America, as previously explained.
# The stationarity of the series is checked with the following ADF test.
adf.test(disney_bd$Disney)
##
## Augmented Dickey-Fuller Test
##
## data: disney_bd$Disney
## Dickey-Fuller = -2.363, Lag order = 5, p-value = 0.4243
## alternative hypothesis: stationary
Since the p-value (0.4243) is greater than a commonly used significance level such as 0.05, we cannot reject the null hypothesis of a unit root, so the price series is treated as non-stationary.
acf(disney_bd$Disney,main="Significant Autocorrelations")
It can be seen that the series shows serial autocorrelation at every lag
displayed: the autocorrelation function exceeds the significance bounds
at all of them, so these values are considered significantly different
from 0, and the correlation is positive. The linear relationship between
the data and its lags is therefore very high.
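The significance bounds drawn in the ACF plot can also be computed directly. The following is a minimal sketch on a simulated random walk (the series rw is an assumption for illustration, chosen because it is strongly autocorrelated like a price series); the approximate 95% bounds are ±qnorm(0.975)/sqrt(n).

```r
# Illustrative sketch on simulated data; rw is a hypothetical random walk, not the Disney prices
set.seed(42)
n <- 192
rw <- cumsum(rnorm(n))

# Sample autocorrelations without drawing the plot
ac <- acf(rw, plot = FALSE)

# Approximate 95% bounds under the hypothesis of no autocorrelation
bound <- qnorm(0.975) / sqrt(n)
bound

# Lags (excluding lag 0) whose autocorrelation falls outside the bounds
sig_lags <- which(abs(ac$acf[-1]) > bound)
sig_lags
```

An autocorrelation that decays slowly and stays outside the bounds for many lags, as in the Disney plot, is typical of a non-stationary series.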
The ARMA and ARIMA methods will be used to build a linear model for the time series.
To begin, it is important to note that the observed values do not follow a linear trend, which makes the analysis difficult. Therefore, the series is transformed by taking the logarithm of the variable (log). This stabilizes the variance of the time series and reduces the large variation in the data across time periods.
plot(disney_bd$Date, disney_bd$Disney, type="l", col="blue", lwd=2, xlab="Date", ylab="Stock Price", main="Disney Stock Price")
plot(disney_bd$Date, log(disney_bd$Disney), type="l", col="blue", lwd=2, xlab="Date", ylab="log(Stock Price)", main="Log Disney Stock Price")
# As can be seen in the graph, the log series follows a more linear trend, which makes it easier to analyze.
plot(diff(log(disney_bd$Disney)), type="l", ylab="first order difference", main="Diff - Disney Stock Price")
However, the series continues to show a trend. To facilitate its
analysis and forecasting, it is made stationary by applying first-order
differencing, which takes the difference between each value and the
value at t-1.
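As a small worked example of this transformation, the sketch below uses the first six Disney prices from the table above (the variable names are illustrative). It shows that the first difference of the log series is the log of the ratio between consecutive prices, which is close to the monthly percentage change when changes are small.

```r
# Illustrative sketch; the prices are the first six Disney values from the table above
price <- c(29.12, 28.36, 28.51, 28.96, 29.34, 28.65)

# First-order difference of the log series, as used in the models below
log_diff <- diff(log(price))

# The same quantity written as the log of the ratio of consecutive prices
log_ratio <- log(price[-1] / price[-length(price)])
all.equal(log_diff, log_ratio)

# Simple percentage change, for comparison with the log difference
pct_change <- price[-1] / price[-length(price)] - 1
round(cbind(log_diff, pct_change), 4)
```

Note that differencing shortens the series by one observation, so the differenced Disney series has 191 values instead of 192.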
summary(disney_arma<-arma(diff(log(disney_bd$Disney)),order=c(2,2)))
##
## Call:
## arma(x = diff(log(disney_bd$Disney)), order = c(2, 2))
##
## Model:
## ARMA(2,2)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.203216 -0.037176 0.004314 0.049921 0.204822
##
## Coefficient(s):
## Estimate Std. Error t value Pr(>|t|)
## ar1 -1.563337 0.169892 -9.202 < 2e-16 ***
## ar2 -0.795790 0.132178 -6.021 1.74e-09 ***
## ma1 1.635036 0.199105 8.212 2.22e-16 ***
## ma2 0.804004 0.168098 4.783 1.73e-06 ***
## intercept 0.006519 0.020183 0.323 0.747
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Fit:
## sigma^2 estimated as 0.005307, Conditional Sum-of-Squares = 1, AIC = -448.58
plot(disney_arma)
The estimated coefficients indicate a strong dependence on lags 1 and 2, both in the autoregressive component (AR) and in the moving average component (MA). Furthermore, the p-values for these coefficients are very low, so they are statistically significant, whereas the intercept is not (p = 0.747). The presence of AR and MA terms in the model indicates that the series depends on both past values and past errors. The AIC value (-448.58) is low, which suggests that this model provides a good fit to the data.
summary(disney_arima<-Arima(diff(log(disney_bd$Disney)),order=c(1,1,1)))
## Series: diff(log(disney_bd$Disney))
## ARIMA(1,1,1)
##
## Coefficients:
## ar1 ma1
## 0.0426 -0.9819
## s.e. 0.0774 0.0372
##
## sigma^2 = 0.005536: log likelihood = 223.46
## AIC=-440.92 AICc=-440.79 BIC=-431.18
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set -5.716688e-05 0.07381462 0.05543337 89.16338 150.1821 0.7289193
## ACF1
## Training set -0.0009538925
plot(disney_arima)
The estimated coefficients are 0.0426 for the autoregressive (AR) term and -0.9819 for the moving average (MA) term. The AR coefficient indicates only a slight linear dependence of the current value of the series on its previous value, while the MA coefficient indicates a strong dependence on the previous error. Note that the input series is already differenced, so the order d = 1 in this model differences the log prices a second time.
The AIC value (-440.92) and the corrected AIC value (AICc, -440.79) are relatively low, which suggests that this model provides a good fit to the data. The BIC value is also important for comparing models, and in this case it is -431.18.
Regarding the error metrics for the training set, the mean absolute error (MAE) is approximately 0.0554, which means that, on average, the fitted values deviate by 0.0554 units from the modeled series (the differenced log prices). Furthermore, the ACF1 coefficient is close to zero, suggesting that the residuals do not show significant autocorrelation at lag 1.
In summary, this ARIMA model appears to provide a good fit to the data, although only the MA coefficient is statistically significant: the AR estimate (0.0426) is smaller than its standard error (0.0774).
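Whether each coefficient is statistically significant can be checked from the printed estimates and standard errors. The following is a minimal sketch using the values from the ARIMA(1,1,1) output above; the approximate z-test based on the normal distribution is an assumption made here for illustration, since summary() for this model does not print p-values.

```r
# Estimates and standard errors copied from the ARIMA(1,1,1) output above
est <- c(ar1 = 0.0426, ma1 = -0.9819)
se  <- c(ar1 = 0.0774, ma1 = 0.0372)

# Approximate z statistics: estimate divided by its standard error
z <- est / se
round(z, 2)

# Two-sided p-values under a normal approximation (an assumption of this sketch)
p_value <- 2 * pnorm(-abs(z))
p_value

# TRUE where the coefficient is significant at the 5% level
p_value < 0.05
```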
disney_ARMA_residuals <- disney_arma$residuals
Box.test(disney_ARMA_residuals, lag = 1, type = "Ljung-Box")
##
## Box-Ljung test
##
## data: disney_ARMA_residuals
## X-squared = 0.00027505, df = 1, p-value = 0.9868
Since the p-value is very high (0.9868), we do not have enough evidence to reject the null hypothesis. This indicates that there is no significant serial autocorrelation in the residuals of the ARMA model.
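The test above looks only at the first lag. As a hedged sketch, the same test can be run jointly over several lags; the white-noise series e, the choice of 12 lags, and fitdf = 4 (one degree of freedom per estimated ARMA(2,2) coefficient) are assumptions for illustration and were not part of the original analysis.

```r
# Illustrative sketch on simulated white noise; e stands in for model residuals
set.seed(123)
e <- rnorm(190)

# Ljung-Box test at lag 1 only, as in the analysis above
lb_1 <- Box.test(e, lag = 1, type = "Ljung-Box")

# Joint test over the first 12 lags; fitdf removes one degree of freedom
# per estimated ARMA coefficient (4 for an ARMA(2,2), an assumption here)
lb_12 <- Box.test(e, lag = 12, type = "Ljung-Box", fitdf = 4)

# Degrees of freedom of each test: lag minus fitdf
lb_1$parameter
lb_12$parameter

# p-values: large values mean no evidence of remaining autocorrelation
c(lag_1 = lb_1$p.value, lag_12 = lb_12$p.value)
```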
#Testing residuals
suppressWarnings({
  disney_arma$residuals <- na.omit(disney_arma$residuals)
  adf.test(disney_arma$residuals)
})
##
## Augmented Dickey-Fuller Test
##
## data: disney_arma$residuals
## Dickey-Fuller = -5.5235, Lag order = 5, p-value = 0.01
## alternative hypothesis: stationary
Since the p-value (0.01) is less than a commonly used significance level such as 0.05, we reject the null hypothesis in favor of the alternative hypothesis. This means that the residuals from the ARMA model are stationary and do not show a significant trend in the time series.
suppressWarnings({
  disney_arma$fitted.values <- na.omit(disney_arma$fitted.values)
  adf.test(disney_arma$fitted.values)
})
##
## Augmented Dickey-Fuller Test
##
## data: disney_arma$fitted.values
## Dickey-Fuller = -5.5222, Lag order = 5, p-value = 0.01
## alternative hypothesis: stationary
Since the p-value (0.01) is less than a commonly used significance level such as 0.05, we reject the null hypothesis in favor of the alternative hypothesis. This means that the fitted values of the ARMA model are stationary and do not show a significant trend in the time series.
hist(disney_arma$residuals)
# The histogram suggests that the residuals are approximately normally distributed
disney_arima_residuals <- disney_arima$residuals
Box.test(disney_arima_residuals, lag = 1, type = "Ljung-Box")
##
## Box-Ljung test
##
## data: disney_arima_residuals
## X-squared = 0.00017654, df = 1, p-value = 0.9894
Since the p-value (0.9894) is very high and much higher than a commonly used significance level such as 0.05, we have no evidence to reject the null hypothesis. This means that the residuals from the ARIMA model do not exhibit significant serial autocorrelation. This is good because it indicates that the ARIMA model has well captured the autocorrelation structure in the data and has decreased the noise.
#Testing residuals
suppressWarnings({
  disney_arima$residuals <- na.omit(disney_arima$residuals)
  adf.test(disney_arima$residuals)
})
##
## Augmented Dickey-Fuller Test
##
## data: disney_arima$residuals
## Dickey-Fuller = -6.099, Lag order = 5, p-value = 0.01
## alternative hypothesis: stationary
Since the p-value (0.01) is less than a commonly used significance level such as 0.05, we have evidence to reject the null hypothesis. This means that the residuals appear to be stationary and have no significant trend.
adf.test(disney_arima$fitted)
##
## Augmented Dickey-Fuller Test
##
## data: disney_arima$fitted
## Dickey-Fuller = -2.0019, Lag order = 5, p-value = 0.5754
## alternative hypothesis: stationary
Since the p-value (0.5754) is greater than a commonly used significance level such as 0.05, we have no evidence to reject the null hypothesis. This means that the fitted values of the ARIMA model appear to be non-stationary and could have a significant trend.
MODEL 1 (ARMA(2,2)) shows significant coefficients for the autoregressive terms (ar1 and ar2) and the moving average terms (ma1 and ma2), suggesting that the model captures the dependence structure in the data. Furthermore, the diagnostic tests on the residuals indicate that they show no serial autocorrelation (Box-Ljung test) and are stationary (Augmented Dickey-Fuller test). Its AIC is -448.58, which is lower than that of MODEL 2. MODEL 2 (ARIMA(1,1,1)) shows a significant coefficient for the moving average term (ma1) but not for the autoregressive term (ar1), whose estimate (0.0426) is smaller than its standard error (0.0774). Although its AIC (-440.92) is higher than that of MODEL 1, it is still a competitive value. Both models eliminated serial autocorrelation. The diagnostic tests on the residuals of MODEL 2 show that they are stationary, but this is not the case for its fitted values. For this reason, and because of its lower AIC, MODEL 1 (ARMA) is chosen as the model that best fits the data.
suppressWarnings({
  disney_estimated_stock_price <- exp(disney_arma$fitted.values) # Undo the log transform: exp() is the inverse of log()
  vector = c(disney_estimated_stock_price) # Convert the back-transformed values into a plain vector
  original <- c(disney_bd$Disney) # Original prices as a vector
  restauracion = vector + original # Revert the "diff" operation by adding the original values to the differences
  ts <- ts(restauracion, start = 1, end = length(restauracion), frequency = 1) # Make time series
})
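The block above adds the back-transformed values to the original prices. For comparison only, here is a minimal sketch of a multiplicative way to invert diff(log(x)), in which each price is the previous price times the exponential of the log difference. It is not the procedure used in this analysis; all names are hypothetical, and the actual log differences stand in for fitted values so that the recovery can be checked exactly.

```r
# Illustrative sketch with hypothetical names; prices are the first rows of the table above
price <- c(29.12, 28.36, 28.51, 28.96, 29.34)

# The modeled series: first difference of the log prices
log_diff <- diff(log(price))

# Stand-in for model output (assumption): the "fitted" values are simply the actual ones
fitted_log_diff <- log_diff

# One-step back-transformation: previous price times exp(fitted log difference)
recovered <- price[-length(price)] * exp(fitted_log_diff)
all.equal(recovered, price[-1])

# Equivalent cumulative form, starting from the first observed price
recovered_cum <- price[1] * exp(cumsum(fitted_log_diff))
all.equal(recovered_cum, price[-1])
```

With real fitted values the recovered prices would differ from the observed ones by the model's one-step errors.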
# Forecast (ARMA)
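# Note: forecast() applied to a plain ts object selects an exponential smoothing (ETS) model by default
disney_ARMA_forecast <- forecast(ts, h = 5)
disney_ARMA_forecast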
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 193 87.91991 79.68404 96.15578 75.32423 100.5156
## 194 87.91991 76.25768 99.58214 70.08406 105.7558
## 195 87.91991 73.61781 102.22200 66.04674 109.7931
## 196 87.91991 71.38333 104.45649 62.62940 113.2104
## 197 87.91991 69.40681 106.43301 59.60657 116.2333
plot(disney_ARMA_forecast)
autoplot(disney_ARMA_forecast)
FORECAST INTERPRETATION
Period 1: The forecasted value is 87.9; at the 95% confidence level, the value lies between 75.32423 and 100.5156.
Period 2: The forecasted value is 87.9; at the 95% confidence level, the value lies between 70.08406 and 105.7558.
Period 3: The forecasted value is 87.9; at the 95% confidence level, the value lies between 66.04674 and 109.7931.
Period 4: The forecasted value is 87.9; at the 95% confidence level, the value lies between 62.62940 and 113.2104.
Period 5: The forecasted value is 87.9; at the 95% confidence level, the value lies between 59.60657 and 116.2333.
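The structure of these intervals can be checked with a little arithmetic. The sketch below uses the first row of the forecast table above and assumes, as an illustration, that the intervals are symmetric normal intervals around the point forecast; the variable names are hypothetical.

```r
# Values copied from the first row (period 193) of the forecast table above
point <- 87.91991
lo80 <- 79.68404;  hi80 <- 96.15578
lo95 <- 75.32423;  hi95 <- 100.5156

# Both intervals are centered on the point forecast
c(mid80 = (lo80 + hi80) / 2, mid95 = (lo95 + hi95) / 2)

# Half-widths of the 80% and 95% intervals
half80 <- hi80 - point
half95 <- hi95 - point

# For normal intervals the ratio of half-widths equals the ratio of normal quantiles
ratio_observed <- half95 / half80
ratio_normal <- qnorm(0.975) / qnorm(0.90)
c(observed = ratio_observed, normal = ratio_normal)
```

The point forecast is the same for all five periods, while the intervals widen with the horizon, reflecting growing uncertainty.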
The Walt Disney Company. (2023). Annual Report. Disney. https://thewaltdisneycompany.com/app/uploads/2023/02/2022-Annual-Report.pdf