Introduction


      Since 2014, I have been interested in the combat sport that is Muay Thai. The intensified version of kickboxing brings a fascinating mix of strategy and athleticism that cannot be found in most other sports. Like any fan, I want the sport to grow in participation and popularity. For that reason, I have downloaded Google-provided data on the number of monthly searches and attempted to forecast in which direction the sport will develop in the coming years.
      I believe that with the increase in popularity of MMA and other combat sports, Muay Thai will grow at an accelerated pace as well. To test my hypothesis, I utilized an ETS, ARIMA, NNETAR, and an ensemble model, which performance I tested utilizing an adjusted R2 metric, as well as the traditional performance measures such as AIC and BIC.

Data


      The data I utilized comes from Google Data. Specifically, it reflects the number of weekly searches of “Muay Thai,” between April 2019 and April 2024, which I later aggregated to reflect the number of monthly searches rather than weekly. To better test the performance of my models, I created an 80/20 split of the data, with 20% of the dataset being left alone to be later utilized for testing while the 80% being used to train my models.

Models

ETS


      ETS does exceptionally well with data that possesses heavy seasonality, as well as a trend. A simple eye test of the data in the above graph shows that the data does not demonstrate any obvious seasonality. However, there is a clear upward trend, with possible cyclical downturns. These downturns can be seen in early 2020 and 2024.
      I engage the Multiplicative Holt-Winter’s method for my forecast. This is before there is no visible dampening in the upward trend of searches, meaning the Holt-Winter’s damped method might not be the most efficient. Also, while there is no obvious seasonal aspect, the multiplicative seasonal aspect of the Multiplicative Holt-Winter’s will capture anything we cannot immediately see.

ETS <- Train %>%
  model("ETS" = ETS(Searches_Sum~ error("M") + trend("A") + season("M")))

ETSforc <- ETS%>%
  forecast(h=13)

Train %>%
  ggplot(aes(x = yearmonth, y = Searches_Sum)) +
  geom_line(color = "black", size = 1) + # Plot actual observations in blue
  geom_line(data = Test, aes(x = yearmonth, y = Searches_Sum), color = "green", size = 1) + # Plot real observations from Test dataframe in green
  geom_line(data = ETSforc, aes(y = .mean), color = "red", linetype = "dashed") + # Plot forecast in red dashed line
  labs(y = "Searches of Muay Thai",
       x = "Time",
       title = "ETS Forecast of Monthly Searches of Muay Thai",
       subtitle = "Source: Google Data") +
  theme_minimal()
## Warning: Using `size` aesthetic for lines was deprecated in ggplot2 3.4.0.
## ℹ Please use `linewidth` instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.
## Warning: The output of `fortify(<fable>)` has changed to better suit usage with the ggdist package.
## If you're using it to extract intervals, consider using `hilo()` to compute intervals, and `unpack_hilo()` to obtain values.

report(ETS)
## Series: Searches_Sum 
## Model: ETS(M,A,M) 
##   Smoothing parameters:
##     alpha = 0.1434019 
##     beta  = 0.0001015461 
##     gamma = 0.001544671 
## 
##   Initial states:
##      l[0]     b[0]      s[0]     s[-1]    s[-2]     s[-3]    s[-4]    s[-5]
##  208.4861 1.208277 0.8971978 0.9812704 0.984331 0.7604929 1.045519 1.049487
##     s[-6]    s[-7]    s[-8]    s[-9]   s[-10]    s[-11]
##  1.023337 1.159614 1.178053 1.163743 0.956842 0.8001136
## 
##   sigma^2:  0.0654
## 
##      AIC     AICc      BIC 
## 565.8328 586.9363 597.2853
accuracy(ETS)
## # A tibble: 1 × 10
##   .model .type       ME  RMSE   MAE   MPE  MAPE  MASE RMSSE  ACF1
##   <chr>  <chr>    <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 ETS    Training -2.15  40.5  30.6 -8.47  20.3 0.576 0.584 0.128
#R2

squared_residualsETS <- (Test$Searches_Sum-ETSforc$.mean)^2

SSRETS <- sum(squared_residualsETS,na.rm=TRUE)

# SST

total_sumETS <- (Test$Searches_Sum - mean(Train$Searches_Sum, na.rm=TRUE))

squared_sumETS<- total_sumETS^2

SSTETS <- sum(squared_sumETS,na.rm=TRUE)

#R2

R2ETS <- 1-(SSRETS/SSTETS)

R2ETS
## [1] 0.6672194


      While the AIC and BIC values provide us with no information until we generate the following models, the R2 presents us with an interesting challenge. When calculating R2, traditionally, one uses the \(1-SSR/SST\) formula. However, since we are using the 80/20 data split, we are operating with two different means. In this case, the mean of the testing set is much higher than that of the training set, leading to potential issues, such as a negative R2 value.
      To avoid these issues, we calculate the SSR from the testing set, as that is the only case where that is possible. However, we calculate SST by taking the real values of the testing set and subtracting from them the training set’s mean. This, however, changes the interpretation of the metric. Where in traditional regression analysis, R2 is a measurement of how much variation in the dependent variable can be explained by right-hand side variables, this R2 value shows how many percentage points superior/inferior the forecast we created is, compared to simply using the mean of the training set. As I developed this metric myself and have not found any literature on it, I will continue to refer to it as R2 despite it not resembling R2 in reality. That said, this is a reasonably good measure of model performance. According to R2, the model performs at 66.72 percentage points better than a simple mean forecast, which is a respectable value.

ARIMA


      The ARIMA model generated by the automatic ARIMA() function is an one order moving average model, with one degree of first differencing. As mentioned, the data does not manifest with heavy levels of seasonality, so the model using first differencing to make the data stationary is no surprise.

## Warning: The output of `fortify(<fable>)` has changed to better suit usage with the ggdist package.
## If you're using it to extract intervals, consider using `hilo()` to compute intervals, and `unpack_hilo()` to obtain values.

## Series: Searches_Sum 
## Model: ARIMA(0,1,1) 
## 
## Coefficients:
##           ma1
##       -0.7338
## s.e.   0.1024
## 
## sigma^2 estimated as 2159:  log likelihood=-241.74
## AIC=487.47   AICc=487.75   BIC=491.13
## # A tibble: 1 × 10
##   .model .type       ME  RMSE   MAE   MPE  MAPE  MASE RMSSE   ACF1
##   <chr>  <chr>    <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>  <dbl>
## 1 ARIMA  Training  5.99  45.5  33.8 -5.48  21.9 0.637 0.656 0.0668
## [1] 0.5874579


      The ARIMA model forecast performs better relative to the ETS forecast, specifically due to its AIC and BIC values being lower. However, its RMSE value, as well as its MAE and R2 values, are inferior to the ETS model. While these values send mixed messages, I lean towards saying that the ETS model’s performance is superior to the ARIMA model.

NNAR


      A third model I use is a neural network auto-regressive model. Created by the NNETAR function, the model contains two lagged values in two neurons in its hidden layer. The hidden layer is particularly valuable, as it allows the model to adapt to non-linear data, a large advantage over traditional linear regression models.

## Warning: The output of `fortify(<fable>)` has changed to better suit usage with the ggdist package.
## If you're using it to extract intervals, consider using `hilo()` to compute intervals, and `unpack_hilo()` to obtain values.

## Series: Searches_Sum 
## Model: NNAR(2,1,2)[12] 
## 
## Average of 20 networks, each of which is
## a 3-2-1 network with 11 weights
## options were - linear output units 
## 
## sigma^2 estimated as 942.4
## # A tibble: 1 × 10
##   .model .type        ME  RMSE   MAE   MPE  MAPE  MASE RMSSE   ACF1
##   <chr>  <chr>     <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>  <dbl>
## 1 NNETAR Training 0.0110  30.7  23.0 -5.26  14.3 0.434 0.443 -0.137
## [1] 0.4539324


      The NNAR model boasts a superior MAE and RMSE than even the ETS model. Low MAE and RMSE values are a common feature of NNAR models; however, its R2 is much inferior to both previous models. Additionally, the visual test of the forecast shows that the model needs to perform better as well as the previous two.

Ensemble Model


      Finally, I created an Ensemble model, which combines the three models with hopes of creating a complete, superior combination of its parts. To ensure as good a performance as possible, I assigned different weights to each model, specifically, 60% to ETS, 10% to ARIMA, and 30% to NNAR. With this, I hope to capture the low values for MAE and RMSE from the NNEA model and the superior R2 from the ETS model.

ensamble <- ((0.3)*NNETARforc$.mean+(0.1)*ARIMAforc$.mean+(0.6)*ETSforc$.mean)
Test$ensamble <- ensamble

Train %>%
  ggplot(aes(x = yearmonth, y = Searches_Sum)) +
  geom_line(color = "black", size = 1) + # Plot actual observations in blue
  geom_line(data = Test, aes(x = yearmonth, y = Searches_Sum), color = "green", size = 1) + # Plot real observations from Test dataframe in green
  geom_line(data = Test, aes(y = ensamble), color = "red", linetype = "dashed") + # Plot forecast in red dashed line
  labs(y = "Searches of Muay Thai",
       x = "Time",
       title = "Ensemble Forecast of Monthly Searches of Muay Thai",
       subtitle = "Source: Google Data") +
  theme_minimal()

#R2

squared_residualsensamble <- (Test$Searches_Sum-Test$ensamble)^2

SSRensamble <- sum(squared_residualsensamble,na.rm=TRUE)

# SST

total_sumensamble <- (Test$Searches_Sum - mean(Train$Searches_Sum, na.rm=TRUE))

squared_sumensamble<- total_sumensamble^2

SSTensamble <- sum(squared_sumensamble,na.rm=TRUE)

#R2

R2ensamble <- 1-(SSRensamble/SSTensamble)

R2ensamble
## [1] 0.6218467
VarianceETSforc <- var(ETSforc$.mean)
VarianceETSforc
## [1] 1186.097
VarianceEnsamble <- var(Test$ensamble)
VarianceEnsamble
## [1] 427.4003


      Despite the slightly lower R2 of the ensemble model, the model seems to be a better tool for forecasting relative to the ETS model, as the variance of the forecast is much lower. Since both forecasts underestimate the growth of the actual variable in question, the lower variance is an asset in forecasting the number of searches despite the ~5 percentage points lower R2.

Conclusion


      By utilizing the three different models, as well as an ensemble model, we see that the interest in Muay Thai is expected to grow in the coming years. To improve my model further, I would likely create an ensemble model of an ETS model and an NNAR model with a longer time series dataset. This paper could provide valuable information to some Muay Thai organizations and promotional companies when allocating assets and calculating PPV purses for its fighters. With some luck and the fall of interest in boxing, Muay Thai will become one of the world’s foremost and most followed combat sports.