Patterns are the backbone of time series analysis. As in other fields of statistics, and in machine learning in particular, one of the primary goals of time series analysis is to identify patterns in data. Those patterns can then be used to provide meaningful insights about both past and future events, such as seasonal effects, outliers, or unique events.
Patterns in time series analysis can be categorized into one of the following:
Structural patterns: These are also known as series components and, as the name implies, represent the core structure of the series. There are three types of structural patterns: trend, cycle, and seasonal. You can think of each of these patterns as a binary event, which may or may not exist in the data. This helps to classify the characteristics of the series and to identify the best approach for analyzing it.
Non-structural patterns: These are also known as the irregular component, and refer to any other patterns in the data that are not related to the structural patterns.
We can use these two groups of patterns (structural and non-structural) to express time series data using the following equation, when the series has an additive structure:
\[ Y_t = T_t + S_t + C_t + I_t \]
And when the series has a multiplicative structure:
\[ Y_t = T_t \times S_t \times C_t \times I_t \]
Here, \(Y_t\) represents the series observation at time \(t\), and \(T_t\), \(S_t\), \(C_t\), and \(I_t\) represent the values of the trend, seasonal, cycle, and irregular components of the series at time \(t\), respectively. We shall define additive and multiplicative models in a later part of this class.
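The difference between the two structures can be illustrated with a small sketch. The component values below are hypothetical and chosen only for illustration (a linear trend, a 12-period sine-shaped seasonal component, and no cycle or irregular component); the function names are likewise illustrative rather than part of any library.

```python
import math

def compose_additive(trend, seasonal, cycle, irregular):
    """Y_t = T_t + S_t + C_t + I_t"""
    return [t + s + c + i for t, s, c, i in zip(trend, seasonal, cycle, irregular)]

def compose_multiplicative(trend, seasonal, cycle, irregular):
    """Y_t = T_t * S_t * C_t * I_t"""
    return [t * s * c * i for t, s, c, i in zip(trend, seasonal, cycle, irregular)]

n = 24  # two years of hypothetical monthly observations
trend = [100.0 + 2.0 * t for t in range(n)]

# Additive case: the seasonal component is expressed in the units of the series.
seasonal_add = [10.0 * math.sin(2 * math.pi * t / 12) for t in range(n)]
# Multiplicative case: the seasonal component is a factor around 1.
seasonal_mult = [1.0 + 0.1 * math.sin(2 * math.pi * t / 12) for t in range(n)]

# Neutral cycle and irregular components: 0 when adding, 1 when multiplying.
y_add = compose_additive(trend, seasonal_add, [0.0] * n, [0.0] * n)
y_mult = compose_multiplicative(trend, seasonal_mult, [1.0] * n, [1.0] * n)

# Seasonal peaks fall at t = 3 and t = 15 (one year apart).
add_peaks = (y_add[3] - trend[3], y_add[15] - trend[15])
mult_peaks = (y_mult[3] - trend[3], y_mult[15] - trend[15])
print("additive seasonal swing at the two peaks:      ", add_peaks)
print("multiplicative seasonal swing at the two peaks:", mult_peaks)
```

In the additive series the seasonal swing stays the same size as the trend rises, whereas in the multiplicative series it grows in proportion to the level of the series.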
A trend, if it exists in time series data, represents the general direction of the series, either up or down, over time. Furthermore, a trend could have either linear or exponential growth (or close to either one), depending on the series characteristics.
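One rough way to tell the two growth types apart is to fit a straight line to the series and to its logarithm, and compare the fits: an exponential trend becomes linear after a log transformation. The sketch below is a heuristic under stated assumptions (strictly positive values, no seasonality, and a comparison of R-squared values computed on different scales), not a formal test; `fit_line` and `classify_growth` are illustrative names.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares fit of ys on xs; returns (slope, intercept, r_squared)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1.0 - ss_res / ss_tot

def classify_growth(series):
    """Heuristic: label the trend by whichever scale gives the better linear fit."""
    ts = list(range(len(series)))
    _, _, r2_linear = fit_line(ts, series)
    # Assumes all values are positive so that the logarithm is defined.
    _, _, r2_log = fit_line(ts, [math.log(y) for y in series])
    return "linear" if r2_linear >= r2_log else "exponential"

# Hypothetical series used only to exercise the functions.
linear_series = [5.0 + 2.0 * t for t in range(30)]
exponential_series = [5.0 * 1.1 ** t for t in range(30)]

print(classify_growth(linear_series))
print(classify_growth(exponential_series))
```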
These simple examples represent non-complex time series data with a clear trend component, and it is therefore simple to identify the trend and classify its growth type. Typically, your data could have additional components and patterns, such as seasonality.
The seasonal component (or seasonality) is another common pattern in time series data. If it exists, it represents a repeated variation in the series that is related to the frequency units of the series (for example, the months of the year for a monthly series). A common example of a series with a strong seasonal pattern is the demand for electricity or natural gas. In those cases, the seasonal pattern is derived from a variety of seasonal events, such as weather patterns, the season of the year, and sunlight hours.
In addition, a series could have more than one seasonal pattern. A classic example of this is the hourly demand for electricity, which could potentially have three different seasonality patterns:
Hourly seasonality, which is derived from parameters such as sunlight hours and temperatures throughout the day
Weekly seasonality, which depends on the day of the week (weekdays versus the weekend)
Monthly seasonality, which is related to the season of the year (high consumption during the winter months versus low consumption during the summer months, assuming that the heating system is powered by electricity)
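The three seasonal patterns listed above can be made visible by averaging the series over each frequency unit. The following is a minimal sketch on synthetic data: the `synthetic_demand` function and its effect sizes are hypothetical, built only so that each of the three patterns is present, and `group_means` is an illustrative helper rather than a library function.

```python
import math
from collections import defaultdict
from datetime import datetime, timedelta

def synthetic_demand(ts):
    """Hypothetical hourly demand with three seasonal effects (illustrative only)."""
    hourly = 10.0 * math.sin(2 * math.pi * (ts.hour - 6) / 24)    # peaks at midday
    weekly = -15.0 if ts.weekday() >= 5 else 0.0                   # lower on weekends
    monthly = 20.0 * math.cos(2 * math.pi * (ts.month - 1) / 12)   # peaks in January
    return 100.0 + hourly + weekly + monthly

def group_means(timestamps, values, key):
    """Average the values within each group defined by key(timestamp)."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for ts, value in zip(timestamps, values):
        group = key(ts)
        sums[group] += value
        counts[group] += 1
    return {group: sums[group] / counts[group] for group in sorted(sums)}

# One non-leap year of hourly timestamps.
start = datetime(2021, 1, 1)
timestamps = [start + timedelta(hours=h) for h in range(365 * 24)]
demand = [synthetic_demand(ts) for ts in timestamps]

by_hour = group_means(timestamps, demand, key=lambda ts: ts.hour)
by_weekday = group_means(timestamps, demand, key=lambda ts: ts.weekday())  # 0 = Monday
by_month = group_means(timestamps, demand, key=lambda ts: ts.month)

print("midday vs midnight:", by_hour[12], by_hour[0])
print("Wednesday vs Sunday:", by_weekday[2], by_weekday[6])
print("January vs July:", by_month[1], by_month[7])
```

Group averages of this kind are only a descriptive first look; they do not separate the seasonal effects from a trend if one is present.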
This is a simplistic example of a series with a seasonal pattern. However, unless you are very lucky, it is most likely that your series will be a combination of multiple patterns, which will create a more complex data structure. A common example of a mixture of patterns is a series with both seasonal and trend patterns.
The definition of the cycle in a time series is derived from the broad definition of a cycle in macroeconomics. A cycle can be described as a sequence of repeatable events over time, where the starting point of a cycle is at a local minimum of the series and the ending point is at the next one, and the ending point of one cycle is the starting point of the following cycle.
Moreover, unlike the seasonal pattern, cycles do not necessarily occur at equally spaced time intervals, and their length could change from cycle to cycle. The US monthly unemployment rate series is an example of a series with a cycle pattern.
Looking at the time series plot, you can easily observe that the series has had three cycles since 1990:
The first cycle occurred between 1990 and 2000, which was roughly a 10-year cycle
The second cycle started in 2000 and ended in 2007, which was a 7-year cycle
The third cycle began in 2007 and, as of May 2019, has not yet been completed, which means that it has continued for more than 12 years.
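Following the definition above, in which a cycle runs from one local minimum to the next, cycle lengths can be measured by locating the troughs of a series. The sketch below uses a short synthetic series (not the unemployment data) and a hypothetical `window` parameter that controls how many neighbors on each side a point must undercut to count as a trough.

```python
def find_troughs(series, window=2):
    """Return indices whose value is the minimum within +/- window observations."""
    troughs = []
    for i in range(window, len(series) - window):
        neighborhood = series[i - window:i + window + 1]
        if series[i] == min(neighborhood):
            troughs.append(i)
    return troughs

def cycle_lengths(troughs):
    """Length of each completed cycle: the distance between consecutive troughs."""
    return [later - earlier for earlier, later in zip(troughs, troughs[1:])]

# Synthetic series with troughs that are not equally spaced.
series = [5, 6, 7, 6, 5, 4, 5, 6, 8, 9, 7, 5, 4.5, 5, 6,
          7, 8, 9, 10, 8, 7, 6, 5, 6, 7]

troughs = find_troughs(series)
lengths = cycle_lengths(troughs)
print("trough positions:", troughs)
print("cycle lengths:", lengths)
```

Unlike a seasonal period, the resulting lengths are not constant, which is the property that distinguishes a cycle from seasonality. On noisy data a larger window, or smoothing the series first, would likely be needed.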
The irregular component, which is what remains of the series after the structural components are removed, provides an indication of irregular events in the series. This includes non-systematic patterns or events in the data, which cause irregular fluctuations. In addition, the irregular component can indicate how well the other components were fitted when using a decomposition method. High autocorrelation in this component is an indication that some patterns related to one of the other components were left over due to an inaccurate estimate.
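Rearranging the two equations given earlier makes the remainder explicit. Assuming the structural components are known (or have been estimated), the irregular component under the additive structure is:
\[ I_t = Y_t - T_t - S_t - C_t \]
And under the multiplicative structure, assuming none of the structural components is zero:
\[ I_t = \frac{Y_t}{T_t \times S_t \times C_t} \]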
When the irregular component is not correlated with its own lags, it is white noise. A series is defined as white noise when there is no correlation between its observations and no systematic pattern in them. In other words, the relationship between different observations is random. Many applications of white noise in time series make some assumptions about the distribution of the white noise series. Typically, unless mentioned otherwise, we assume that white noise is a sequence of independent and identically distributed (i.i.d.) random variables, with a mean of 0 and a variance of \(\sigma^2\).
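A simple way to check whether a remainder behaves like white noise is to compute its sample autocorrelation at several lags and compare each value with the approximate 95% band of \(\pm 1.96/\sqrt{n}\). The sketch below assumes Gaussian white noise with \(\sigma = 1\) and a fixed random seed; the "leftover trend" series is a hypothetical remainder from which a trend was not fully removed, and the function names are illustrative.

```python
import math
import random

def autocorrelation(series, lag):
    """Sample autocorrelation of the series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    denominator = sum((x - mean) ** 2 for x in series)
    numerator = sum((series[t] - mean) * (series[t - lag] - mean)
                    for t in range(lag, n))
    return numerator / denominator

def count_significant_lags(series, max_lag=10):
    """Number of lags whose autocorrelation falls outside +/- 1.96 / sqrt(n)."""
    bound = 1.96 / math.sqrt(len(series))
    return sum(1 for lag in range(1, max_lag + 1)
               if abs(autocorrelation(series, lag)) > bound)

rng = random.Random(42)  # fixed seed so the run is reproducible
n = 2000
white_noise = [rng.gauss(0.0, 1.0) for _ in range(n)]

# Hypothetical remainder that still contains a small linear trend.
leftover_trend = [e + 0.01 * t for t, e in enumerate(white_noise)]

print("white noise, lag-1 autocorrelation:", autocorrelation(white_noise, 1))
print("leftover trend, lag-1 autocorrelation:", autocorrelation(leftover_trend, 1))
print("significant lags (white noise):", count_significant_lags(white_noise))
print("significant lags (leftover trend):", count_significant_lags(leftover_trend))
```

Because the band is a 95% band, an occasional lag outside it is expected even for true white noise; many lags outside it, as in the leftover-trend case, point to structure that the other components failed to capture.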