Since 2014, I have been interested in the combat sport
that is Muay Thai. The intensified version of kickboxing brings a
fascinating mix of strategy and athleticism that cannot be found in most
other sports. Like any fan, I want the sport to grow in participation
and popularity. For that reason, I have downloaded Google-provided data
on the number of monthly searches and attempted to forecast in which
direction the sport will develop in the coming years.
I
believe that with the increase in popularity of MMA and other combat
sports, Muay Thai will grow at an accelerated pace as well. To test my
hypothesis, I utilized an ETS, ARIMA, NNETAR, and an ensemble model,
which performance I tested utilizing an adjusted R2 metric, as well as
the traditional performance measures such as AIC and BIC.
The data I utilized comes from Google Data. Specifically,
it reflects the number of weekly searches of “Muay Thai,” between April
2019 and April 2024, which I later aggregated to reflect the number of
monthly searches rather than weekly. To better test the performance of
my models, I created an 80/20 split of the data, with 20% of the dataset
being left alone to be later utilized for testing while the 80% being
used to train my models.
ETS does exceptionally well with data that possesses heavy
seasonality, as well as a trend. A simple eye test of the data in the
above graph shows that the data does not demonstrate any obvious
seasonality. However, there is a clear upward trend, with possible
cyclical downturns. These downturns can be seen in early 2020 and 2024.
I engage the Multiplicative Holt-Winter’s method for my
forecast. This is before there is no visible dampening in the upward
trend of searches, meaning the Holt-Winter’s damped method might not be
the most efficient. Also, while there is no obvious seasonal aspect, the
multiplicative seasonal aspect of the Multiplicative Holt-Winter’s will
capture anything we cannot immediately see.
ETS <- Train %>%
model("ETS" = ETS(Searches_Sum~ error("M") + trend("A") + season("M")))
ETSforc <- ETS%>%
forecast(h=13)
Train %>%
ggplot(aes(x = yearmonth, y = Searches_Sum)) +
geom_line(color = "black", size = 1) + # Plot actual observations in blue
geom_line(data = Test, aes(x = yearmonth, y = Searches_Sum), color = "green", size = 1) + # Plot real observations from Test dataframe in green
geom_line(data = ETSforc, aes(y = .mean), color = "red", linetype = "dashed") + # Plot forecast in red dashed line
labs(y = "Searches of Muay Thai",
x = "Time",
title = "ETS Forecast of Monthly Searches of Muay Thai",
subtitle = "Source: Google Data") +
theme_minimal()
## Warning: Using `size` aesthetic for lines was deprecated in ggplot2 3.4.0.
## ℹ Please use `linewidth` instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.
## Warning: The output of `fortify(<fable>)` has changed to better suit usage with the ggdist package.
## If you're using it to extract intervals, consider using `hilo()` to compute intervals, and `unpack_hilo()` to obtain values.
report(ETS)
## Series: Searches_Sum
## Model: ETS(M,A,M)
## Smoothing parameters:
## alpha = 0.1434019
## beta = 0.0001015461
## gamma = 0.001544671
##
## Initial states:
## l[0] b[0] s[0] s[-1] s[-2] s[-3] s[-4] s[-5]
## 208.4861 1.208277 0.8971978 0.9812704 0.984331 0.7604929 1.045519 1.049487
## s[-6] s[-7] s[-8] s[-9] s[-10] s[-11]
## 1.023337 1.159614 1.178053 1.163743 0.956842 0.8001136
##
## sigma^2: 0.0654
##
## AIC AICc BIC
## 565.8328 586.9363 597.2853
accuracy(ETS)
## # A tibble: 1 × 10
## .model .type ME RMSE MAE MPE MAPE MASE RMSSE ACF1
## <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 ETS Training -2.15 40.5 30.6 -8.47 20.3 0.576 0.584 0.128
#R2
squared_residualsETS <- (Test$Searches_Sum-ETSforc$.mean)^2
SSRETS <- sum(squared_residualsETS,na.rm=TRUE)
# SST
total_sumETS <- (Test$Searches_Sum - mean(Train$Searches_Sum, na.rm=TRUE))
squared_sumETS<- total_sumETS^2
SSTETS <- sum(squared_sumETS,na.rm=TRUE)
#R2
R2ETS <- 1-(SSRETS/SSTETS)
R2ETS
## [1] 0.6672194
While the AIC and BIC values provide us with no
information until we generate the following models, the R2 presents us
with an interesting challenge. When calculating R2, traditionally, one
uses the \(1-SSR/SST\) formula.
However, since we are using the 80/20 data split, we are operating with
two different means. In this case, the mean of the testing set is much
higher than that of the training set, leading to potential issues, such
as a negative R2 value.
To avoid these issues, we calculate
the SSR from the testing set, as that is the only case where that is
possible. However, we calculate SST by taking the real values of the
testing set and subtracting from them the training set’s mean. This,
however, changes the interpretation of the metric. Where in traditional
regression analysis, R2 is a measurement of how much variation in the
dependent variable can be explained by right-hand side variables, this
R2 value shows how many percentage points superior/inferior the forecast
we created is, compared to simply using the mean of the training set. As
I developed this metric myself and have not found any literature on it,
I will continue to refer to it as R2 despite it not resembling R2 in
reality. That said, this is a reasonably good measure of model
performance. According to R2, the model performs at 66.72 percentage
points better than a simple mean forecast, which is a respectable value.
The ARIMA model generated by the automatic ARIMA()
function is an one order moving average model, with one degree of first
differencing. As mentioned, the data does not manifest with heavy levels
of seasonality, so the model using first differencing to make the data
stationary is no surprise.
## Warning: The output of `fortify(<fable>)` has changed to better suit usage with the ggdist package.
## If you're using it to extract intervals, consider using `hilo()` to compute intervals, and `unpack_hilo()` to obtain values.
## Series: Searches_Sum
## Model: ARIMA(0,1,1)
##
## Coefficients:
## ma1
## -0.7338
## s.e. 0.1024
##
## sigma^2 estimated as 2159: log likelihood=-241.74
## AIC=487.47 AICc=487.75 BIC=491.13
## # A tibble: 1 × 10
## .model .type ME RMSE MAE MPE MAPE MASE RMSSE ACF1
## <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 ARIMA Training 5.99 45.5 33.8 -5.48 21.9 0.637 0.656 0.0668
## [1] 0.5874579
The ARIMA model forecast performs better relative to the
ETS forecast, specifically due to its AIC and BIC values being lower.
However, its RMSE value, as well as its MAE and R2 values, are inferior
to the ETS model. While these values send mixed messages, I lean towards
saying that the ETS model’s performance is superior to the ARIMA model.
A third model I use is a neural network auto-regressive
model. Created by the NNETAR function, the model contains two lagged
values in two neurons in its hidden layer. The hidden layer is
particularly valuable, as it allows the model to adapt to non-linear
data, a large advantage over traditional linear regression models.
## Warning: The output of `fortify(<fable>)` has changed to better suit usage with the ggdist package.
## If you're using it to extract intervals, consider using `hilo()` to compute intervals, and `unpack_hilo()` to obtain values.
## Series: Searches_Sum
## Model: NNAR(2,1,2)[12]
##
## Average of 20 networks, each of which is
## a 3-2-1 network with 11 weights
## options were - linear output units
##
## sigma^2 estimated as 942.4
## # A tibble: 1 × 10
## .model .type ME RMSE MAE MPE MAPE MASE RMSSE ACF1
## <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 NNETAR Training 0.0110 30.7 23.0 -5.26 14.3 0.434 0.443 -0.137
## [1] 0.4539324
The NNAR model boasts a superior MAE and RMSE than even
the ETS model. Low MAE and RMSE values are a common feature of NNAR
models; however, its R2 is much inferior to both previous models.
Additionally, the visual test of the forecast shows that the model needs
to perform better as well as the previous two.
Finally, I created an Ensemble model, which combines the
three models with hopes of creating a complete, superior combination of
its parts. To ensure as good a performance as possible, I assigned
different weights to each model, specifically, 60% to ETS, 10% to ARIMA,
and 30% to NNAR. With this, I hope to capture the low values for MAE and
RMSE from the NNEA model and the superior R2 from the ETS model.
ensamble <- ((0.3)*NNETARforc$.mean+(0.1)*ARIMAforc$.mean+(0.6)*ETSforc$.mean)
Test$ensamble <- ensamble
Train %>%
ggplot(aes(x = yearmonth, y = Searches_Sum)) +
geom_line(color = "black", size = 1) + # Plot actual observations in blue
geom_line(data = Test, aes(x = yearmonth, y = Searches_Sum), color = "green", size = 1) + # Plot real observations from Test dataframe in green
geom_line(data = Test, aes(y = ensamble), color = "red", linetype = "dashed") + # Plot forecast in red dashed line
labs(y = "Searches of Muay Thai",
x = "Time",
title = "Ensemble Forecast of Monthly Searches of Muay Thai",
subtitle = "Source: Google Data") +
theme_minimal()
#R2
squared_residualsensamble <- (Test$Searches_Sum-Test$ensamble)^2
SSRensamble <- sum(squared_residualsensamble,na.rm=TRUE)
# SST
total_sumensamble <- (Test$Searches_Sum - mean(Train$Searches_Sum, na.rm=TRUE))
squared_sumensamble<- total_sumensamble^2
SSTensamble <- sum(squared_sumensamble,na.rm=TRUE)
#R2
R2ensamble <- 1-(SSRensamble/SSTensamble)
R2ensamble
## [1] 0.6218467
VarianceETSforc <- var(ETSforc$.mean)
VarianceETSforc
## [1] 1186.097
VarianceEnsamble <- var(Test$ensamble)
VarianceEnsamble
## [1] 427.4003
Despite the slightly lower R2 of the ensemble model, the
model seems to be a better tool for forecasting relative to the ETS
model, as the variance of the forecast is much lower. Since both
forecasts underestimate the growth of the actual variable in question,
the lower variance is an asset in forecasting the number of searches
despite the ~5 percentage points lower R2.
By utilizing the three different models, as well as an
ensemble model, we see that the interest in Muay Thai is expected to
grow in the coming years. To improve my model further, I would likely
create an ensemble model of an ETS model and an NNAR model with a longer
time series dataset. This paper could provide valuable information to
some Muay Thai organizations and promotional companies when allocating
assets and calculating PPV purses for its fighters. With some luck and
the fall of interest in boxing, Muay Thai will become one of the world’s
foremost and most followed combat sports.