A00834241 | Regina Rodríguez Chávez
A00833617 | Yessica Acosta Blancheth
A01275763 | Eli Gabriel Hernández Medina
A00833172 | Genaro Rodríguez Alcántara
• Briefly describe the selected company / variable (i.e., What is the company’s background? What is the consumer price index?)
Canadian Solar Inc. is a global company in the solar energy industry. Founded in 2001 in Canada, it has become a leading player in the design, development and manufacturing of solar modules and renewable energy solutions. The company operates around the world and has a significant presence in the Americas, Europe and Asia.
Canadian Solar specializes in the manufacture and sale of solar panels and other solar related products such as solar photovoltaic systems for residential, commercial and industrial use. In addition, it is also involved in the development and construction of large-scale solar projects, including solar parks and solar power plants.
The company has been noted for its focus on quality and innovation in the solar industry, offering a wide range of products and solutions that address the growing demand for clean and sustainable energy. Over the years, Canadian Solar has worked to improve the efficiency of its solar panels and reduce production costs to make solar energy more competitive compared to other energy sources.
The Consumer Price Index (CPI) is a measure used to assess changes in the average cost of a basket of goods and services that consumers typically purchase. The CPI is used to measure inflation, which is the general and sustained increase in prices in an economy. The cost of the basket of goods and services is compared in different periods of time to determine how prices have changed.
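As a small illustration of the index calculation described above, the sketch below prices a hypothetical three-item basket in a base period and in a later period. The items, quantities and prices are invented for the example and are not real CPI data.

```r
# Hypothetical basket: quantities are fixed, only prices change between periods.
basket <- data.frame(
  item       = c("bread", "fuel", "rent"),
  quantity   = c(10, 40, 1),
  price_base = c(2.00, 1.50, 800),   # prices in the base period
  price_now  = c(2.20, 1.80, 840)    # prices in the comparison period
)

cost_base <- sum(basket$quantity * basket$price_base)
cost_now  <- sum(basket$quantity * basket$price_now)

# The index is the cost of the basket relative to the base period, times 100.
cpi_base <- 100
cpi_now  <- 100 * cost_now / cost_base

# Inflation between the two periods is the percentage change in the index.
inflation_pct <- 100 * (cpi_now - cpi_base) / cpi_base

print(round(c(cpi_now = cpi_now, inflation_pct = inflation_pct), 2))
```

Official indices use much larger baskets and expenditure weights, but the comparison of basket cost across periods follows the same idea.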
In the fast-paced world of finance, companies are constantly looking for ways to leverage their historical data to gain insight and make informed decisions. In this context, this project focuses on the construction of a time series model to analyze the financial performance of the company “Canadian Solar Inc.”
The main objective of this project is to develop a model to understand and predict the performance of Canadian Solar Inc. shares and other relevant financial variables. To achieve this, time series analysis techniques will be used, which will provide crucial information for decision-making at both a strategic and operational level.
• How has been the stock price / variable’s performance in the last year?
At first glance, the company's stock price appears to have been roughly flat over the last year: despite considerable variability during the year, it ended the year at values very similar to those at the beginning. However, it is important to analyze this data in depth and compare it with previous years to identify any seasonality and to build a more accurate model for this company. For this reason, we begin with an exploratory analysis that allows us to discover insights about the data and the behavior of the variable over the years.
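One way to put numbers on the "last year" comparison is sketched below. It uses simulated weekly prices as a stand-in for the last 52 rows of the dataset, so the data frame `last_year` and its values are hypothetical; only the column names follow the CSIQ file.

```r
# Hypothetical weekly prices standing in for the last 52 rows of the CSIQ data.
set.seed(1)
last_year <- data.frame(
  Date      = seq(as.Date("2022-01-03"), by = "week", length.out = 52),
  Adj.Close = 30 * exp(cumsum(rnorm(52, mean = 0, sd = 0.05)))
)

first_price <- last_year$Adj.Close[1]
last_price  <- last_year$Adj.Close[nrow(last_year)]

# Net change over the year versus the range covered within the year.
yearly_change_pct <- 100 * (last_price - first_price) / first_price
yearly_range      <- range(last_year$Adj.Close)

# Weekly log returns summarise the variability seen during the year.
weekly_log_returns <- diff(log(last_year$Adj.Close))
weekly_volatility  <- sd(weekly_log_returns)

cat("Change over the year (%):", round(yearly_change_pct, 2), "\n")
cat("Range within the year:   ", round(yearly_range, 2), "\n")
cat("Weekly volatility:       ", round(weekly_volatility, 4), "\n")
```

A small net change together with a wide range would support the reading that the price moved a lot during the year but finished close to where it started.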
#To start our exploratory data analysis, it is necessary to load all the libraries that will be used in our analysis.
library(xts)
## Loading required package: zoo
##
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
##
## as.Date, as.Date.numeric
library(dplyr)
##
## ######################### Warning from 'xts' package ##########################
## # #
## # The dplyr lag() function breaks how base R's lag() function is supposed to #
## # work, which breaks lag(my_xts). Calls to lag(my_xts) that you type or #
## # source() into this session won't work correctly. #
## # #
## # Use stats::lag() to make sure you're not using dplyr::lag(), or you can add #
## # conflictRules('dplyr', exclude = 'lag') to your .Rprofile to stop #
## # dplyr from breaking base R's lag() function. #
## # #
## # Code in packages is not affected. It's protected by R's namespace mechanism #
## # Set `options(xts.warn_dplyr_breaks_lag = FALSE)` to suppress this warning. #
## # #
## ###############################################################################
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:xts':
##
## first, last
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(zoo)
library(tseries)
## Registered S3 method overwritten by 'quantmod':
## method from
## as.zoo.data.frame zoo
library(stats)
library(forecast)
library(astsa)
##
## Attaching package: 'astsa'
## The following object is masked from 'package:forecast':
##
## gas
library(corrplot)
## corrplot 0.92 loaded
library(AER)
## Loading required package: car
## Loading required package: carData
##
## Attaching package: 'car'
## The following object is masked from 'package:dplyr':
##
## recode
## Loading required package: lmtest
## Loading required package: sandwich
## Loading required package: survival
library(vars)
## Loading required package: MASS
##
## Attaching package: 'MASS'
## The following object is masked from 'package:dplyr':
##
## select
## Loading required package: strucchange
## Loading required package: urca
library(dynlm)
library(TSstudio)
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ forcats 1.0.0 ✔ readr 2.1.4
## ✔ ggplot2 3.4.2 ✔ stringr 1.5.0
## ✔ lubridate 1.9.2 ✔ tibble 3.2.1
## ✔ purrr 1.0.1 ✔ tidyr 1.3.0
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ stringr::boundary() masks strucchange::boundary()
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::first() masks xts::first()
## ✖ dplyr::lag() masks stats::lag()
## ✖ dplyr::last() masks xts::last()
## ✖ car::recode() masks dplyr::recode()
## ✖ MASS::select() masks dplyr::select()
## ✖ purrr::some() masks car::some()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(sarima)
## Loading required package: stats4
##
## Attaching package: 'sarima'
##
## The following object is masked from 'package:astsa':
##
## sarima
##
## The following object is masked from 'package:stats':
##
## spectrum
library(dygraphs)
# Next, we load the company's dataset.
CSIQ <- read.csv("/Users/gabrielmedina/Downloads/ts_stock_prices/CSIQ.csv")
# Likewise, some commands are used to understand the database and show an exploratory summary of the type of data we have in our hands.
head(CSIQ)
##         Date  Open  High   Low Close Adj.Close   Volume
## 1 2015-01-05 24.20 24.51 21.72 22.90     22.90  9388500
## 2 2015-01-12 22.66 23.16 21.01 21.29     21.29 10843500
## 3 2015-01-19 21.06 21.19 18.68 19.30     19.30 15126400
## 4 2015-01-26 19.25 20.73 18.98 20.39     20.39 10836000
## 5 2015-02-02 20.84 27.35 20.63 25.66     25.66 29040500
## 6 2015-02-09 25.62 29.72 25.57 28.84     28.84 16991600
colnames(CSIQ)
## [1] "Date" "Open" "High" "Low" "Close" "Adj.Close"
## [7] "Volume"
summary(CSIQ)
##      Date                Open            High            Low
##  Length:417         Min.   :11.02   Min.   :11.70   Min.   :10.25
##  Class :character   1st Qu.:16.07   1st Qu.:16.80   1st Qu.:15.40
##  Mode  :character   Median :20.46   Median :21.83   Median :19.40
##                     Mean   :24.26   Mean   :25.75   Mean   :22.83
##                     3rd Qu.:32.45   3rd Qu.:34.01   3rd Qu.:30.10
##                     Max.   :63.90   Max.   :67.39   Max.   :58.34
##      Close         Adj.Close          Volume
##  Min.   :11.07   Min.   :11.07   Min.   : 1223100
##  1st Qu.:16.19   1st Qu.:16.19   1st Qu.: 3677000
##  Median :20.41   Median :20.41   Median : 5744100
##  Mean   :24.26   Mean   :24.26   Mean   : 7034394
##  3rd Qu.:32.36   3rd Qu.:32.36   3rd Qu.: 9306200
##  Max.   :63.00   Max.   :63.00   Max.   :31584500
str(CSIQ)
## 'data.frame': 417 obs. of 7 variables:
## $ Date : chr "2015-01-05" "2015-01-12" "2015-01-19" "2015-01-26" ...
## $ Open : num 24.2 22.7 21.1 19.2 20.8 ...
## $ High : num 24.5 23.2 21.2 20.7 27.4 ...
## $ Low : num 21.7 21 18.7 19 20.6 ...
## $ Close : num 22.9 21.3 19.3 20.4 25.7 ...
## $ Adj.Close: num 22.9 21.3 19.3 20.4 25.7 ...
## $ Volume : int 9388500 10843500 15126400 10836000 29040500 16991600 8575800 11091300 24919000 14199000 ...
With this first look at the data, we can see that we are working with a clean dataset, so we can begin applying data examination techniques to build our time series model.
Setting time series format
Here we are checking the date data in our database
CSIQ$Date
## [1] "2015-01-05" "2015-01-12" "2015-01-19" "2015-01-26" "2015-02-02"
## [6] "2015-02-09" "2015-02-16" "2015-02-23" "2015-03-02" "2015-03-09"
## [11] "2015-03-16" "2015-03-23" "2015-03-30" "2015-04-06" "2015-04-13"
## [16] "2015-04-20" "2015-04-27" "2015-05-04" "2015-05-11" "2015-05-18"
## [21] "2015-05-25" "2015-06-01" "2015-06-08" "2015-06-15" "2015-06-22"
## [26] "2015-06-29" "2015-07-06" "2015-07-13" "2015-07-20" "2015-07-27"
## [31] "2015-08-03" "2015-08-10" "2015-08-17" "2015-08-24" "2015-08-31"
## [36] "2015-09-07" "2015-09-14" "2015-09-21" "2015-09-28" "2015-10-05"
## [41] "2015-10-12" "2015-10-19" "2015-10-26" "2015-11-02" "2015-11-09"
## [46] "2015-11-16" "2015-11-23" "2015-11-30" "2015-12-07" "2015-12-14"
## [51] "2015-12-21" "2015-12-28" "2016-01-04" "2016-01-11" "2016-01-18"
## [56] "2016-01-25" "2016-02-01" "2016-02-08" "2016-02-15" "2016-02-22"
## [61] "2016-02-29" "2016-03-07" "2016-03-14" "2016-03-21" "2016-03-28"
## [66] "2016-04-04" "2016-04-11" "2016-04-18" "2016-04-25" "2016-05-02"
## [71] "2016-05-09" "2016-05-16" "2016-05-23" "2016-05-30" "2016-06-06"
## [76] "2016-06-13" "2016-06-20" "2016-06-27" "2016-07-04" "2016-07-11"
## [81] "2016-07-18" "2016-07-25" "2016-08-01" "2016-08-08" "2016-08-15"
## [86] "2016-08-22" "2016-08-29" "2016-09-05" "2016-09-12" "2016-09-19"
## [91] "2016-09-26" "2016-10-03" "2016-10-10" "2016-10-17" "2016-10-24"
## [96] "2016-10-31" "2016-11-07" "2016-11-14" "2016-11-21" "2016-11-28"
## [101] "2016-12-05" "2016-12-12" "2016-12-19" "2016-12-26" "2017-01-02"
## [106] "2017-01-09" "2017-01-16" "2017-01-23" "2017-01-30" "2017-02-06"
## [111] "2017-02-13" "2017-02-20" "2017-02-27" "2017-03-06" "2017-03-13"
## [116] "2017-03-20" "2017-03-27" "2017-04-03" "2017-04-10" "2017-04-17"
## [121] "2017-04-24" "2017-05-01" "2017-05-08" "2017-05-15" "2017-05-22"
## [126] "2017-05-29" "2017-06-05" "2017-06-12" "2017-06-19" "2017-06-26"
## [131] "2017-07-03" "2017-07-10" "2017-07-17" "2017-07-24" "2017-07-31"
## [136] "2017-08-07" "2017-08-14" "2017-08-21" "2017-08-28" "2017-09-04"
## [141] "2017-09-11" "2017-09-18" "2017-09-25" "2017-10-02" "2017-10-09"
## [146] "2017-10-16" "2017-10-23" "2017-10-30" "2017-11-06" "2017-11-13"
## [151] "2017-11-20" "2017-11-27" "2017-12-04" "2017-12-11" "2017-12-18"
## [156] "2017-12-25" "2018-01-01" "2018-01-08" "2018-01-15" "2018-01-22"
## [161] "2018-01-29" "2018-02-05" "2018-02-12" "2018-02-19" "2018-02-26"
## [166] "2018-03-05" "2018-03-12" "2018-03-19" "2018-03-26" "2018-04-02"
## [171] "2018-04-09" "2018-04-16" "2018-04-23" "2018-04-30" "2018-05-07"
## [176] "2018-05-14" "2018-05-21" "2018-05-28" "2018-06-04" "2018-06-11"
## [181] "2018-06-18" "2018-06-25" "2018-07-02" "2018-07-09" "2018-07-16"
## [186] "2018-07-23" "2018-07-30" "2018-08-06" "2018-08-13" "2018-08-20"
## [191] "2018-08-27" "2018-09-03" "2018-09-10" "2018-09-17" "2018-09-24"
## [196] "2018-10-01" "2018-10-08" "2018-10-15" "2018-10-22" "2018-10-29"
## [201] "2018-11-05" "2018-11-12" "2018-11-19" "2018-11-26" "2018-12-03"
## [206] "2018-12-10" "2018-12-17" "2018-12-24" "2018-12-31" "2019-01-07"
## [211] "2019-01-14" "2019-01-21" "2019-01-28" "2019-02-04" "2019-02-11"
## [216] "2019-02-18" "2019-02-25" "2019-03-04" "2019-03-11" "2019-03-18"
## [221] "2019-03-25" "2019-04-01" "2019-04-08" "2019-04-15" "2019-04-22"
## [226] "2019-04-29" "2019-05-06" "2019-05-13" "2019-05-20" "2019-05-27"
## [231] "2019-06-03" "2019-06-10" "2019-06-17" "2019-06-24" "2019-07-01"
## [236] "2019-07-08" "2019-07-15" "2019-07-22" "2019-07-29" "2019-08-05"
## [241] "2019-08-12" "2019-08-19" "2019-08-26" "2019-09-02" "2019-09-09"
## [246] "2019-09-16" "2019-09-23" "2019-09-30" "2019-10-07" "2019-10-14"
## [251] "2019-10-21" "2019-10-28" "2019-11-04" "2019-11-11" "2019-11-18"
## [256] "2019-11-25" "2019-12-02" "2019-12-09" "2019-12-16" "2019-12-23"
## [261] "2019-12-30" "2020-01-06" "2020-01-13" "2020-01-20" "2020-01-27"
## [266] "2020-02-03" "2020-02-10" "2020-02-17" "2020-02-24" "2020-03-02"
## [271] "2020-03-09" "2020-03-16" "2020-03-23" "2020-03-30" "2020-04-06"
## [276] "2020-04-13" "2020-04-20" "2020-04-27" "2020-05-04" "2020-05-11"
## [281] "2020-05-18" "2020-05-25" "2020-06-01" "2020-06-08" "2020-06-15"
## [286] "2020-06-22" "2020-06-29" "2020-07-06" "2020-07-13" "2020-07-20"
## [291] "2020-07-27" "2020-08-03" "2020-08-10" "2020-08-17" "2020-08-24"
## [296] "2020-08-31" "2020-09-07" "2020-09-14" "2020-09-21" "2020-09-28"
## [301] "2020-10-05" "2020-10-12" "2020-10-19" "2020-10-26" "2020-11-02"
## [306] "2020-11-09" "2020-11-16" "2020-11-23" "2020-11-30" "2020-12-07"
## [311] "2020-12-14" "2020-12-21" "2020-12-28" "2021-01-04" "2021-01-11"
## [316] "2021-01-18" "2021-01-25" "2021-02-01" "2021-02-08" "2021-02-15"
## [321] "2021-02-22" "2021-03-01" "2021-03-08" "2021-03-15" "2021-03-22"
## [326] "2021-03-29" "2021-04-05" "2021-04-12" "2021-04-19" "2021-04-26"
## [331] "2021-05-03" "2021-05-10" "2021-05-17" "2021-05-24" "2021-05-31"
## [336] "2021-06-07" "2021-06-14" "2021-06-21" "2021-06-28" "2021-07-05"
## [341] "2021-07-12" "2021-07-19" "2021-07-26" "2021-08-02" "2021-08-09"
## [346] "2021-08-16" "2021-08-23" "2021-08-30" "2021-09-06" "2021-09-13"
## [351] "2021-09-20" "2021-09-27" "2021-10-04" "2021-10-11" "2021-10-18"
## [356] "2021-10-25" "2021-11-01" "2021-11-08" "2021-11-15" "2021-11-22"
## [361] "2021-11-29" "2021-12-06" "2021-12-13" "2021-12-20" "2021-12-27"
## [366] "2022-01-03" "2022-01-10" "2022-01-17" "2022-01-24" "2022-01-31"
## [371] "2022-02-07" "2022-02-14" "2022-02-21" "2022-02-28" "2022-03-07"
## [376] "2022-03-14" "2022-03-21" "2022-03-28" "2022-04-04" "2022-04-11"
## [381] "2022-04-18" "2022-04-25" "2022-05-02" "2022-05-09" "2022-05-16"
## [386] "2022-05-23" "2022-05-30" "2022-06-06" "2022-06-13" "2022-06-20"
## [391] "2022-06-27" "2022-07-04" "2022-07-11" "2022-07-18" "2022-07-25"
## [396] "2022-08-01" "2022-08-08" "2022-08-15" "2022-08-22" "2022-08-29"
## [401] "2022-09-05" "2022-09-12" "2022-09-19" "2022-09-26" "2022-10-03"
## [406] "2022-10-10" "2022-10-17" "2022-10-24" "2022-10-31" "2022-11-07"
## [411] "2022-11-14" "2022-11-21" "2022-11-28" "2022-12-05" "2022-12-12"
## [416] "2022-12-19" "2022-12-26"
For better data handling, we convert our dates to the appropriate data type
CSIQ$Date <- as.Date(CSIQ$Date)
summary(CSIQ$Adj.Close)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 11.07 16.19 20.41 24.26 32.36 63.00
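After a conversion like the one above, it can be worth confirming that the dates parsed correctly and are evenly spaced. The sketch below does this on four date strings copied from the listing; the variable names are introduced here for illustration only.

```r
# Date strings in the same "YYYY-MM-DD" format as the Date column.
date_strings <- c("2015-01-05", "2015-01-12", "2015-01-19", "2015-01-26")

dates <- as.Date(date_strings, format = "%Y-%m-%d")

# Gaps between consecutive observations, in days; weekly data should give 7.
gaps <- as.numeric(diff(dates))

is_sorted <- all(gaps > 0)     # dates strictly increasing
is_weekly <- all(gaps == 7)    # every gap is exactly one week
has_na    <- anyNA(dates)      # TRUE would signal a parsing failure

print(c(is_sorted = is_sorted, is_weekly = is_weekly, has_na = has_na))
```

Applied to the full column, the same checks would flag missing weeks or unparsed dates before the series is passed to `ts()`, which assumes regular spacing.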
• Plot the stock price / variable using a time series format.
We will verify the correct type of data
class(CSIQ$Date)
## [1] "Date"
summary(CSIQ)
##       Date                 Open            High            Low
##  Min.   :2015-01-05   Min.   :11.02   Min.   :11.70   Min.   :10.25
##  1st Qu.:2017-01-02   1st Qu.:16.07   1st Qu.:16.80   1st Qu.:15.40
##  Median :2018-12-31   Median :20.46   Median :21.83   Median :19.40
##  Mean   :2018-12-31   Mean   :24.26   Mean   :25.75   Mean   :22.83
##  3rd Qu.:2020-12-28   3rd Qu.:32.45   3rd Qu.:34.01   3rd Qu.:30.10
##  Max.   :2022-12-26   Max.   :63.90   Max.   :67.39   Max.   :58.34
##      Close         Adj.Close          Volume
##  Min.   :11.07   Min.   :11.07   Min.   : 1223100
##  1st Qu.:16.19   1st Qu.:16.19   1st Qu.: 3677000
##  Median :20.41   Median :20.41   Median : 5744100
##  Mean   :24.26   Mean   :24.26   Mean   : 7034394
##  3rd Qu.:32.36   3rd Qu.:32.36   3rd Qu.: 9306200
##  Max.   :63.00   Max.   :63.00   Max.   :31584500
# Time series plot 1
plot(CSIQ$Date,CSIQ$Adj.Close,type="l",col="blue", lwd=2, xlab ="Date",ylab ="Adjusted Close Price", main = "Canadian Solar Inc Stock Price")
In this graph we can identify several increasing and decreasing trends
across the period, and in the most recent years there is a clear
downward trend.
# Time series plot 2
CSIQxts <- xts(CSIQ$Adj.Close, order.by = CSIQ$Date)
plot(CSIQxts)
In this graph we can see two main trends in the price of Canadian Solar Inc. shares. The first seems to have started in the second half of 2018, when the shares began to rise sharply, and lasted until the first half of 2021. From then on, as mentioned above, the shares have shown a negative trend up to the latest values in our dataset, which correspond to the second half of 2022.
We investigated the context of the company in those periods, focusing on the political and social environment, because the nature of the company means its shares are strongly influenced by political factors. The positive trend in the first period coincides with a boom in renewable energy, during which many governments, including the United States government, implemented policies to promote renewables; this could explain the growth of the stock in the last part of the decade. On the other hand, the drop in the stock price from the second half of 2021 could have been due to macroeconomic factors and market fluctuations. For example, global economic events such as the COVID-19 pandemic, financial crises, or changes in interest rates may negatively affect stock markets and cause investors to be more cautious.
# In this graph, we mark the parts of the graph with a marked trend
dygraph(CSIQxts, main = "Canadian Solar Inc") %>%
dyOptions(colors = RColorBrewer::brewer.pal(4, "Dark2")) %>%
dyShading(from = "2018-07-02",
to = "2022-07-04",
color = "#FFE6E6")
• Decompose the time series data in observed, trend, seasonality, and random. i. Do the time series data show a trend? ii. Do the time series data show seasonality? How is the change of the seasonal component over time?
CSIQts <- ts(CSIQ$Adj.Close, frequency = 52, start = c(2015, 1))
CSIQ_ts_decompose <- decompose(CSIQts)
plot(CSIQ_ts_decompose)
In this graph we can see the time series decomposed into its trend,
seasonal and random (residual) components. The random component
fluctuates around a constant level with a roughly constant dispersion
for most of the observations. The exception is the last part of the
series, where the dispersion changes and the values resemble noise,
fluctuating without any discernible trend or pattern. The trend
component matches the general trends identified in the earlier plots of
the data. In the seasonal component, there is a period in each year in
which the value rises sharply and then falls, which suggests that the
shares have a seasonal pattern at a specific time of the year. A
possible explanation is that demand for renewable energy is often
seasonal. In many regions, for example, demand for solar energy systems
tends to increase in spring and summer because of longer hours of
sunshine and preparation for the high-temperature, air-conditioning
season. This could boost Canadian Solar Inc. stock prices in the months
leading up to the peak demand season.
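To make the decomposition concrete, the sketch below runs `decompose()` on a synthetic weekly series (not the CSIQ data) and checks two properties: the components add back to the observed values, and the seasonal component is identical in every year. The second point bears on the question of how seasonality changes over time, since with this method it cannot change.

```r
set.seed(42)
n_weeks <- 52 * 4
t_index <- seq_len(n_weeks)

# Synthetic weekly series: linear trend + yearly sine pattern + noise.
x <- 20 + 0.05 * t_index + 3 * sin(2 * pi * t_index / 52) + rnorm(n_weeks, sd = 1)
x_ts <- ts(x, frequency = 52, start = c(2015, 1))

parts <- decompose(x_ts, type = "additive")

# In an additive decomposition the three components add back to the data
# (the trend and remainder are NA at both ends because of the moving average).
rebuilt <- parts$trend + parts$seasonal + parts$random
ok <- !is.na(rebuilt)
max_gap <- max(abs(rebuilt[ok] - x_ts[ok]))

# decompose() estimates one seasonal figure per week and repeats it every year,
# so the seasonal component is constant over time by construction.
seasonal_matrix <- matrix(parts$seasonal, nrow = 52)   # one column per year
same_every_year <- all(apply(seasonal_matrix, 1, function(r) diff(range(r)) < 1e-12))

cat("Largest reconstruction gap:", max_gap, "\n")
cat("Seasonal pattern identical each year:", same_every_year, "\n")
```

If an evolving seasonal pattern is of interest, a method such as `stl()` allows the seasonal component to vary between years; that would be a different technique from the one used in this report.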
• Detect if the time series data is stationary.
# It is important to determine whether the time series is stationary; this can be checked with the Augmented Dickey-Fuller test.
adf.test(CSIQ$Adj.Close) ### H0: Non-stationary and HA: Stationary. p-values < 0.05 reject the H0.
##
## Augmented Dickey-Fuller Test
##
## data: CSIQ$Adj.Close
## Dickey-Fuller = -2.5048, Lag order = 7, p-value = 0.3642
## alternative hypothesis: stationary
### P-Value > 0.05. Fails to Reject the H0. Time series data is non-stationary.
The result indicates that with a p-value of 0.3642, there is insufficient evidence to reject the null hypothesis that the Canadian Solar Inc (CSIQ) adjusted closing series is non-stationary.
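The logic behind the test can be sketched with a plain (non-augmented) Dickey-Fuller regression in base R. This is a simplified illustration on simulated data, not a reimplementation of `adf.test()`, which also includes a linear trend and lagged differences. The helper `df_statistic` and the critical value of about -2.86 (5% level, constant-only case, large samples) are assumptions of this sketch.

```r
set.seed(123)
n <- 400

# Simulated log-price random walk (non-stationary) and its first difference.
log_price  <- log(20) + cumsum(rnorm(n, mean = 0, sd = 0.07))
log_return <- diff(log_price)

# Dickey-Fuller regression: delta_y[t] = a + g * y[t-1] + e[t].
# The test statistic is the t-ratio of g; H0 (unit root) corresponds to g = 0.
df_statistic <- function(y) {
  dy    <- diff(y)
  y_lag <- y[-length(y)]
  fit   <- lm(dy ~ y_lag)
  unname(coef(summary(fit))["y_lag", "t value"])
}

stat_level <- df_statistic(log_price)
stat_diff  <- df_statistic(log_return)

# Approximate 5% critical value for the constant-only case (large samples).
crit_5pct <- -2.86

cat("levels:     ", round(stat_level, 2), " reject H0:", stat_level < crit_5pct, "\n")
cat("differences:", round(stat_diff, 2),  " reject H0:", stat_diff  < crit_5pct, "\n")
```

The differenced series produces a strongly negative statistic, which is why differencing the log price is a natural next step when the level series fails the test.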
• Detect if the time series data shows serial autocorrelation.
acf(CSIQ$Adj.Close,main="Significant Autocorrelations")
The serial autocorrelation analysis shows clear evidence of serial
autocorrelation in the time series. The ACF (Autocorrelation Function)
exceeds the significance bounds at every lag shown in the graph. This
suggests a strong relationship between current and past values
throughout the series, that is, a persistent dependence of each value
on its own history.
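The thresholds in an ACF plot can also be computed directly. The sketch below, on a simulated random walk rather than the CSIQ data, counts how many lags exceed the approximate 95% bounds before and after differencing. All variable names are introduced here for illustration.

```r
set.seed(7)
n <- 400
log_price  <- log(20) + cumsum(rnorm(n, sd = 0.07))  # simulated random walk
log_return <- diff(log_price)

acf_level <- acf(log_price,  lag.max = 20, plot = FALSE)
acf_diff  <- acf(log_return, lag.max = 20, plot = FALSE)

# Approximate 95% bounds drawn by acf(): +/- 1.96 / sqrt(number of observations).
bound_level <- qnorm(0.975) / sqrt(acf_level$n.used)
bound_diff  <- qnorm(0.975) / sqrt(acf_diff$n.used)

# Drop lag 0 (always 1) before counting how many lags exceed the bounds.
n_sig_level <- sum(abs(acf_level$acf[-1]) > bound_level)
n_sig_diff  <- sum(abs(acf_diff$acf[-1])  > bound_diff)

cat("Lags outside the bounds, levels:     ", n_sig_level, "of 20\n")
cat("Lags outside the bounds, differences:", n_sig_diff,  "of 20\n")
```

A slowly decaying ACF with every lag outside the bounds is typical of a non-stationary series, which is consistent with the Dickey-Fuller result for the price in levels.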
• Estimate 3 different time series regression models. You might want to consider ARMA (p,q) and / or ARIMA (p,d,q).
# MODEL1: ARMA(1,1) on the log of the adjusted closing price
summary(MODEL1<-arma(log(CSIQ$Adj.Close),order=c(1,1)))
##
## Call:
## arma(x = log(CSIQ$Adj.Close), order = c(1, 1))
##
## Model:
## ARMA(1,1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.350680 -0.040645 0.002824 0.041949 0.227205
##
## Coefficient(s):
## Estimate Std. Error t value Pr(>|t|)
## ar1 0.982798 0.009389 104.67 <2e-16 ***
## ma1 0.013266 0.049155 0.27 0.7873
## intercept 0.054055 0.029372 1.84 0.0657 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Fit:
## sigma^2 estimated as 0.005732, Conditional Sum-of-Squares = 2.38, AIC = -963.03
plot(MODEL1)
The estimated coefficients show that the autoregressive (AR) term is
highly significant (p < 2e-16), with an estimate of 0.98, close to 1,
while the moving average (MA) term is not significant (p = 0.7873) and
the intercept is significant only at the 10% level (p = 0.0657). The
model residuals have a low estimated variance (0.005732), suggesting
that the model fits the observed data closely. The Akaike Information
Criterion (AIC) is -963.03. Taken together, these results indicate that
most of the explanatory power of the ARMA(1,1) model comes from the AR
term, and an AR coefficient this close to 1 is consistent with the
non-stationarity found by the Dickey-Fuller test.
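The significance columns in the coefficient table can be reproduced by hand from the estimates and standard errors. The sketch below copies those numbers from the MODEL1 summary; using the normal distribution for the p-values is an assumption of this sketch, chosen because it matches the printed values.

```r
# Estimates and standard errors copied from the MODEL1 summary above.
coefs <- data.frame(
  term      = c("ar1", "ma1", "intercept"),
  estimate  = c(0.982798, 0.013266, 0.054055),
  std_error = c(0.009389, 0.049155, 0.029372)
)

# t value = estimate / standard error; two-sided p-value from the normal
# approximation (an assumption of this sketch that reproduces the table).
coefs$t_value <- coefs$estimate / coefs$std_error
coefs$p_value <- 2 * pnorm(-abs(coefs$t_value))
coefs$significant_5pct <- coefs$p_value < 0.05

print(coefs, digits = 4)
```

Only `ar1` passes the 5% threshold, which is the basis for reading the MA term as not significant.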
# Fit an ARIMA(p, d, q) model
MODEL2 <- arima(log(CSIQ$Adj.Close), order = c(1, 1, 1))
# Model summary
summary(MODEL2)
##
## Call:
## arima(x = log(CSIQ$Adj.Close), order = c(1, 1, 1))
##
## Coefficients:
## ar1 ma1
## 0.2496 -0.2489
## s.e. NaN NaN
##
## sigma^2 estimated as 0.005766: log likelihood = 482.12, aic = -958.24
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set 0.0006885942 0.07584362 0.05628651 -0.009334677 1.820745 0.9977407
## ACF1
## Training set 0.003718469
# Model plot
plot(MODEL2)
The ARIMA(1, 1, 1) model fitted to the Canadian Solar Inc. time series
data combines an autoregressive (AR) component of order 1, first-order
differencing (the integrated, I, component) and a moving average (MA)
component of order 1. Although the model shows an acceptable fit to the
training data, with relatively low prediction errors (RMSE = 0.0758 on
the log scale), the standard errors (s.e.) of the coefficients are NaN,
which points to problems in parameter estimation. The two estimates are
also almost equal in magnitude with opposite signs (0.2496 and
-0.2489), so the AR and MA terms largely offset each other.
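One possible reading of the NaN standard errors, offered as an interpretation rather than something the output proves, follows from writing the fitted model out. With B the backshift operator, y_t the log price and ε_t the error term (notation introduced here), and using R's sign convention for the MA coefficient:

```latex
(1 - 0.2496\,B)\,(1 - B)\,y_t = (1 - 0.2489\,B)\,\varepsilon_t
\quad\Longrightarrow\quad
(1 - B)\,y_t \approx \varepsilon_t
```

The AR factor and the MA factor are almost identical, so they approximately cancel and what remains is close to a random walk in log prices. When that happens, many different (ar1, ma1) pairs fit almost equally well, which can make the standard errors impossible to compute.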
summary(MODEL3<-arma(diff(log(CSIQ$Adj.Close)),order=c(1,1)))
##
## Call:
## arma(x = diff(log(CSIQ$Adj.Close)), order = c(1, 1))
##
## Model:
## ARMA(1,1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.352932 -0.037966 0.002234 0.041585 0.230148
##
## Coefficient(s):
## Estimate Std. Error t value Pr(>|t|)
## ar1 0.2517553 0.3502978 0.719 0.472
## ma1 -0.2526828 0.3620715 -0.698 0.485
## intercept 0.0007144 0.0027958 0.256 0.798
##
## Fit:
## sigma^2 estimated as 0.005775, Conditional Sum-of-Squares = 2.39, AIC = -957.6
plot(MODEL3)
In terms of model fit, the estimated error variance (sigma^2) is
0.005775 and the Akaike information criterion (AIC) is -957.6. This AIC
is slightly higher than those of the other two models (-963.03 and
-958.24), and since a lower AIC indicates a better fit, it does not
favor this model. It is also important to note the lack of statistical
significance of the AR and MA coefficients (p = 0.472 and p = 0.485).
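AIC values are easiest to compare when every candidate is fitted to the same series with the same estimation method. The sketch below shows that pattern on simulated log returns; the data and the three candidate orders are hypothetical choices for illustration, not the models in this report.

```r
set.seed(2023)
# Simulated weekly log returns standing in for diff(log(CSIQ$Adj.Close)).
log_return <- rnorm(416, mean = 0.0007, sd = 0.076)

# Candidate (p, d, q) orders, all fitted to the same differenced series.
orders <- list(
  "ARMA(0,0)" = c(0, 0, 0),
  "ARMA(1,0)" = c(1, 0, 0),
  "ARMA(0,1)" = c(0, 0, 1)
)

fits <- lapply(orders, function(ord) arima(log_return, order = ord, method = "ML"))

# Lower (more negative) AIC is better; the comparison is only meaningful
# because every model uses the same response and the same likelihood.
aic_table <- data.frame(
  model = names(fits),
  aic   = sapply(fits, AIC),
  row.names = NULL
)
aic_table <- aic_table[order(aic_table$aic), ]
print(aic_table)
```

In the report, MODEL1 is fitted to log prices while MODEL2 and MODEL3 work with differences, so their AIC values are best treated as a rough guide rather than a strict ranking.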
MODEL1_residuals <- MODEL1$residuals
Box.test(MODEL1_residuals, lag = 1, type = "Ljung-Box")
##
## Box-Ljung test
##
## data: MODEL1_residuals
## X-squared = 5.3776e-06, df = 1, p-value = 0.9981
The calculated p-value is 0.9981, which means that there is not enough evidence to reject the null hypothesis that there is no autocorrelation in the residuals. There is no significant evidence of autocorrelation in the residuals.
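The test above checks lag 1 only. A common complement is to repeat it at several lags, as in the sketch below, which uses a simulated AR(1) series so that it runs on its own. The chosen lags and the `fitdf` adjustment are illustrative assumptions.

```r
set.seed(99)
# Simulated AR(1) series used only to have a model whose residuals can be tested.
x   <- arima.sim(model = list(ar = 0.5), n = 400)
fit <- arima(x, order = c(1, 0, 0))
res <- residuals(fit)

# Ljung-Box test at several lags; fitdf removes one degree of freedom per
# estimated ARMA coefficient (here 1), so each lag must be larger than fitdf.
lags <- c(5, 10, 20)
p_values <- sapply(lags, function(k) {
  Box.test(res, lag = k, type = "Ljung-Box", fitdf = 1)$p.value
})
names(p_values) <- paste0("lag_", lags)
print(round(p_values, 4))
```

Large p-values at all lags would strengthen the conclusion that no autocorrelation is left in the residuals.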
#Testing residuals
suppressWarnings({
MODEL1$residuals <- na.omit(MODEL1$residuals)
adf.test(MODEL1$residuals)
})
##
## Augmented Dickey-Fuller Test
##
## data: MODEL1$residuals
## Dickey-Fuller = -7.777, Lag order = 7, p-value = 0.01
## alternative hypothesis: stationary
The calculated p-value is 0.01, which is lower than the commonly used significance level (0.05). Therefore, there is sufficient evidence to reject the null hypothesis in favor of the alternative hypothesis, which indicates that the residuals are stationary.
suppressWarnings({
MODEL1$fitted.values <- na.omit(MODEL1$fitted.values)
adf.test(MODEL1$fitted.values)
})
##
## Augmented Dickey-Fuller Test
##
## data: MODEL1$fitted.values
## Dickey-Fuller = -2.4311, Lag order = 7, p-value = 0.3953
## alternative hypothesis: stationary
The calculated p-value is 0.3953, which is higher than the commonly used significance level (0.05). This means that there is not enough evidence to reject the null hypothesis, so the fitted values of Model 1 cannot be considered stationary.
hist(MODEL1$residuals)
# The residuals appear approximately normally distributed
The distribution of the residuals appears approximately normal, although it is somewhat skewed toward the right.
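A histogram is a visual check only. As a possible complement, the sketch below applies the Shapiro-Wilk test and computes the sample skewness on simulated residuals; the vector `res` is a hypothetical stand-in for `MODEL1$residuals`.

```r
set.seed(5)
# Simulated residuals standing in for MODEL1$residuals (hypothetical data).
res <- rnorm(416, mean = 0, sd = 0.076)

# Shapiro-Wilk test: H0 is that the sample comes from a normal distribution.
sw <- shapiro.test(res)

# Sample skewness: negative values indicate a longer left tail,
# positive values a longer right tail.
skewness <- mean((res - mean(res))^3) / sd(res)^3

cat("Shapiro-Wilk p-value:", round(sw$p.value, 4), "\n")
cat("Sample skewness:     ", round(skewness, 3), "\n")
```

A p-value above 0.05 would be consistent with normality, and the sign of the skewness shows the direction of any asymmetry seen in the histogram.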
MODEL2_residuals <- MODEL2$residuals
Box.test(MODEL2_residuals, lag = 1, type = "Ljung-Box")
##
## Box-Ljung test
##
## data: MODEL2_residuals
## X-squared = 0.0058074, df = 1, p-value = 0.9393
The calculated p-value is 0.9393, which means that there is not enough evidence to reject the null hypothesis that there is no autocorrelation in the residuals.
#Testing residuals
suppressWarnings({
MODEL2$residuals <- na.omit(MODEL2$residuals)
adf.test(MODEL2$residuals)
})
##
## Augmented Dickey-Fuller Test
##
## data: MODEL2$residuals
## Dickey-Fuller = -7.6869, Lag order = 7, p-value = 0.01
## alternative hypothesis: stationary
The calculated p-value is 0.01, which is lower than the commonly used significance level (0.05). Therefore, there is sufficient evidence to reject the null hypothesis in favor of the alternative hypothesis, which indicates that the residuals are stationary.
hist(MODEL2$residuals)
# The residuals appear approximately normally distributed
The distribution of the residuals appears approximately normal.
MODEL3_residuals <- MODEL3$residuals
Box.test(MODEL3_residuals, lag = 1, type = "Ljung-Box")
##
## Box-Ljung test
##
## data: MODEL3_residuals
## X-squared = 0.0040866, df = 1, p-value = 0.949
The calculated p-value is 0.949, which means that there is not enough evidence to reject the null hypothesis that there is no autocorrelation in the residuals.
#Testing residuals
suppressWarnings({
MODEL3$residuals <- na.omit(MODEL3$residuals)
adf.test(MODEL3$residuals)
})
##
## Augmented Dickey-Fuller Test
##
## data: MODEL3$residuals
## Dickey-Fuller = -7.7324, Lag order = 7, p-value = 0.01
## alternative hypothesis: stationary
The calculated p-value is 0.01, which is lower than the commonly used significance level (0.05). Therefore, there is sufficient evidence to reject the null hypothesis in favor of the alternative hypothesis, which indicates that the residuals are stationary.
suppressWarnings({
MODEL3$fitted.values <- na.omit(MODEL3$fitted.values)
adf.test(MODEL3$fitted.values)
})
##
## Augmented Dickey-Fuller Test
##
## data: MODEL3$fitted.values
## Dickey-Fuller = -8.5552, Lag order = 7, p-value = 0.01
## alternative hypothesis: stationary
The calculated p-value is 0.01, which is lower than the commonly used significance level (0.05). Therefore, there is sufficient evidence to reject the null hypothesis in favor of the alternative hypothesis, which indicates that the fitted values of Model 3 are stationary.
hist(MODEL3$residuals)
# The residuals appear approximately normally distributed
The distribution of the residuals appears approximately normal.
Model 3 is chosen because it works with a stationary (differenced) series, its residuals show no serial autocorrelation, it is a simple model, and, unlike Model 2, its coefficients have estimable standard errors.
MODEL 3
suppressWarnings({
model3_return <- exp(MODEL3$fitted.values) # The fitted values are transformed back to the original scale (since log was applied, its inverse, exp, must be applied)
vector <- c(model3_return) # Reverting the "log" operation using "exp"
original <- c(CSIQ$Adj.Close) # Converting into a vector
restauracion_CSIQ <- vector + original # Reverting the "diff" operation by summing the original values and the differences
ts_csiq <- ts(restauracion_CSIQ, start = 1, end = length(restauracion_CSIQ), frequency = 1)
})
# Forecast
CSIQ_ARMA_forecast <- forecast(ts_csiq, h = 5)
CSIQ_ARMA_forecast
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 418 31.43699 28.51961 34.35437 26.97524 35.89874
## 419 31.43699 27.34067 35.53331 25.17221 37.70177
## 420 31.43699 26.42795 36.44602 23.77633 39.09765
## 421 31.43699 25.65398 37.22000 22.59264 40.28133
## 422 31.43699 24.96874 37.90524 21.54466 41.32932
plot(CSIQ_ARMA_forecast)
autoplot(CSIQ_ARMA_forecast)
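For comparison, the sketch below shows one alternative way to undo a log-and-difference transformation and to forecast on the price scale, using base R only and simulated prices. It is not the procedure used above. As far as we understand the forecast package, `forecast()` applied directly to a `ts` object selects a model automatically rather than reusing MODEL3. The ARIMA(0,1,1) order is an arbitrary choice for the illustration.

```r
set.seed(11)
# Simulated weekly prices standing in for CSIQ$Adj.Close (hypothetical data).
price <- 20 * exp(cumsum(rnorm(200, mean = 0, sd = 0.07)))

# Forward transformation used in the models: log, then first difference.
dlog <- diff(log(price))

# Inverse transformation: cumulate the differences from the first log price,
# then exponentiate. This recovers the original prices.
rebuilt   <- exp(log(price[1]) + c(0, cumsum(dlog)))
max_error <- max(abs(rebuilt - price))

# Fit an ARIMA(0,1,1) on the log scale (an arbitrary order for this sketch);
# the d = 1 term handles the differencing inside the model.
fit  <- arima(log(price), order = c(0, 1, 1))
pred <- predict(fit, n.ahead = 5)

# Map point forecasts and an approximate 95% interval back to the price scale.
forecast_table <- data.frame(
  h     = 1:5,
  point = exp(as.numeric(pred$pred)),
  lo95  = exp(as.numeric(pred$pred - 1.96 * pred$se)),
  hi95  = exp(as.numeric(pred$pred + 1.96 * pred$se))
)
print(round(forecast_table, 2))
```

Exponentiating a forecast made on the log scale gives the median of the implied price distribution rather than its mean, which is one reason the back-transformed intervals are not symmetric around the point forecast.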
FORECAST INTERPRETATION
According to our forecast, the following values are expected…
Period 1: The forecasted value is 31.43699; at the 95% confidence level, the value is expected to lie between 26.97524 and 35.89874.
Period 2: The forecasted value is 31.43699; at the 95% confidence level, the value is expected to lie between 25.17221 and 37.70177.
Period 3: The forecasted value is 31.43699; at the 95% confidence level, the value is expected to lie between 23.77633 and 39.09765.
Period 4: The forecasted value is 31.43699; at the 95% confidence level, the value is expected to lie between 22.59264 and 40.28133.
Period 5: The forecasted value is 31.43699; at the 95% confidence level, the value is expected to lie between 21.54466 and 41.32932.
The point forecast is the same for all five periods, while the interval widens with each step, reflecting the growing uncertainty further into the future.
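The 80% and 95% bounds in the forecast table are linked through a single standard error. The sketch below recovers that standard error from the period 418 row, assuming the intervals are built as point forecast plus or minus a normal quantile times the standard error. The interval figures are copied from the output; the normal construction is an assumption of this sketch.

```r
# One-step-ahead figures copied from the forecast table above (period 418).
point <- 31.43699
lo80 <- 28.51961; hi80 <- 34.35437
lo95 <- 26.97524; hi95 <- 35.89874

# Under a normal approximation, interval = point +/- z * se,
# so the half-width divided by the z quantile gives the implied se.
se_from_80 <- ((hi80 - lo80) / 2) / qnorm(0.90)
se_from_95 <- ((hi95 - lo95) / 2) / qnorm(0.975)

print(round(c(se_from_80 = se_from_80, se_from_95 = se_from_95), 4))
```

Both intervals imply the same standard error of roughly 2.28, so the 95% band is simply a wider view of the same uncertainty shown by the 80% band.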
This work carried out a time series analysis of the shares of Canadian Solar Inc. (CSIQ). Different models, such as ARMA and ARIMA, were explored to model and predict stock price behavior. Diagnostic tests were performed to evaluate the quality of the models, and short-term forecasts were generated.
CSIQ stock price forecasting has potential applications in financial decision making. Investors and companies can use these forecasts to make informed decisions about buying or selling stocks, risk management, and short-term financial planning. Additionally, time series analysis can provide valuable information on historical patterns and trends in the market, which can be useful for formulating trading strategies.
References
Canadian Solar Inc. (2023). Investors. Canadian Solar Inc. https://www.canadiansolar.com/