A dissertation submitted in partial fulfillment of the requirements for the degree of M.Sc. Data Science (School of Engineering & Informatics)
Author: Rabin Thapa
Affiliation: University of Wolverhampton
Published: 02/28/2025 03:10:00 AM +0545
1 ABSTRACT
This dissertation explores the application of deep learning models, particularly different types of neural networks, to forecasting stock market trends from historical data. Focusing on six internationally relevant indices, the London Stock Exchange (^FTSE), New York Stock Exchange (^NYA), National Stock Exchange of India (^NSEI), Euronext 100 Index (^N100), Gold Future (^GC=F) and Crude Oil Future (^CL=F), the research integrates historical market data with important macroeconomic variables. Three models were selected from the literature review, and each was evaluated on its architecture and training process. After comparison of the evaluation metrics, the Transformer-based model proved best at capturing market patterns and was used to predict trends for the period 2024-2030. During data processing, normalization and time-series sequence generation are applied to prepare the dataset for the selected model. After each index prediction, root mean square error, mean square error, R-squared, model accuracy and directional accuracy were analysed to assess the reliability of the predicted trend shown in the visualization. For long-term stock index forecasting, the findings indicate that Transformer models outperform the other approaches considered, such as LSTM, hybrid and Random Forest models, on the evaluation metrics used. After analysis of each predicted trend, the dissertation ends with recommendations for future work in stock market prediction, such as using real-time data, XAI and sentiment analysis to improve prediction accuracy.
2 ACKNOWLEDGEMENT
Throughout my research and studies, no combination of words would be enough to thank Mr. Lawrence Krukrubo for his motivation, continuous support and encouragement. It is my great pleasure to express my wholehearted thanks to him as my supervisor and lecturer, whose expertise, advice and insightful input during regular classes and workshops have played a significant role in crafting this piece of art.
I shall always cherish all the professors, lecturers and administrative staff who have helped me in my academic journey and provided the knowledge needed for this research. It has been inspirational to have a sound academic space in which to grow through their teachings, encouragement and togetherness.
For their unconditional love, support and sacrifices, I would like to remember my parents, Resham Bahadur Thapa and Hela Thapa. The spark of light that I have received from them has always directed me towards knowledge in a journey to being a good human for society as a foundation in my life.
With unconditional love, I would like to thank my wife, Meera Gopali, for her understanding and steadfast support during this journey. Her patience and encouragement are a constant source of energy. A big thank you to my newborn daughter, Reeva Thapa, whose presence has filled my life with immense joy and purpose.
I also wish to thank all the teachers and my students from my school who are the best source of knowledge and information for this research. Their help has been essential in making this project possible as a process of learning, unlearning and relearning.
Foremost, I would like to express my soulful thankfulness to the lord, who has been the spark of my life. Thank you for the blessing of patience during the scarcity and brightness of wisdom during abundance. Your presence in my life has been full of wonders. Thank you, Lord.
3 INTRODUCTION
3.1 Neural Networks(NN) in Stock Market Prediction
Applying ML techniques to complex problems in order to support sound financial decisions is a rapidly evolving field (Chang and team, 2024). Stock market prediction represents one such challenge, given its dynamic and volatile nature, which is affected by many variables ranging from stock market data to macroeconomics. This project aims to develop a neural network-based ML model to forecast the market trend of six different stock indices using datasets from 2008 to 2023 from an authentic source. As advocated by Shastry (2024), by addressing this challenge the project demonstrates a systematic understanding of an advanced area in Data Science while critically evaluating methodological processes and their effectiveness.
Predicting stock market trends has become an important task, with applications ranging from financial forecasting to managing investment portfolios (Front Matter, 2024). Data science coupled with artificial intelligence can predict stock market movements with acceptable accuracy. However, for major indices like the FTSE, NYA, NSEI, N100, GC=F and CL=F, this is a complex task because of the dynamic nature of the financial market. To address this complexity, models are explored and developed using the historical data available from the 2008 financial crisis to 2023. After evaluating the candidate models, the one with the best performance is selected for forecasting. The selected model is expected to predict future trends from 2024 to 2030 for each selected index, using historical stock data, macroeconomic factors and neural networks (Zhan, 2020).
To build a model, this study uses machine learning techniques, particularly neural networks. Several models were considered during the research, including traditional time series models like ARIMA, as well as LSTM, Transformer-based and hybrid models, as suggested by Shi and team (2024). To obtain the best forecast, the performance of these models must be compared using evaluation metrics such as Mean Absolute Error (MAE), Mean Squared Error (MSE), R-squared (R²), directional accuracy and model accuracy (Ican and Celik, 2017). Through this comparison, the best model, its architecture and its training process are finalized and can then be applied independently to each index.
In keeping with data ethics and integrity under the GDPR (GDPR, 2018), the datasets are accessed, processed and normalised, followed by training, testing, visualization and evaluation. The results are then examined to assess the model’s strengths and limitations in predicting stock market trends and its usefulness for financial decision-making.
3.2 Objective
The main objective of this project is to investigate historical stock market data from 2008 to 2023, the period since the 2008 global financial crisis, using modern data science techniques. By analysing this data, which is used to train, test and visualise a suitable neural network-based ML model (Aszemi and Dominic, 2019), the project aims to uncover patterns and trends that influence market movements over time. Understanding the patterns of these variables within the dataset helps in constructing a reliable ML model that can capture trends in complex stock market indices.
Creating and applying a deep learning model, particularly a neural network, to explore market trends is the main aim of this project. Neural networks (NN) are well suited to this task because they can handle complicated data patterns and adapt to changes. Different models will be developed and analyzed. As suggested by Hubara and Courbariaux (2018), common measures like MAE and RMSE will be used to evaluate their performance. This will help to determine how accurately the model predicts market trends and identify areas where it may need improvement.
The aim is to obtain the best possible model type, architecture and training process for our data, which is then deployed for training, testing, prediction, visualisation and data analytics.
Finally, the model applied to each index will be evaluated through the evaluation metrics, after visualizing the future trend, to understand its strengths and weaknesses. This includes reflecting on the limitations of the neural network model and the challenges faced during the process, ensuring a well-rounded understanding of its practical application in the space of financial investment.
3.3 Significance of the research
Combining creativity and technical knowledge to design a deep learning system that recognises hidden patterns in the stock market reflects the core principles of the developed module. The project integrates knowledge of neural networks, data preprocessing and financial analysis, offering a new approach to solving complex challenges. It requires a solid understanding of financial data variables and prediction, and of the issues encountered when applying ML models to the financial market. Through independent research, the project involves building a neural network from scratch, testing various methods and refining the model based on the results.
The data analytics carried out on the visualisations obtained from the best model are expected to support financial decisions during or after investment. A critical evaluation of the model’s performance and limitations ensures a thoughtful analysis of its potential and areas for improvement, pointing toward further knowledge of deep learning models and data analytics.
4 Literature Review
Research on stock market prediction aims at developing models that can forecast market movements and support the decision-making process before investment. Financial markets are complex and are influenced by factors like past stock prices, global economic conditions and political events, which makes prediction difficult (Park and Shin, 2013).
In this section, we review the most commonly used methods for financial forecasting and examine how deep learning has been used in financial predictions. We analyse different models and their architectures that are being applied to predict the closing prices of global stock markets, and select a few suitable ones for the final comparison, followed by model development and final forecasting.
4.1 Traditional Time Series Models
In the past, stock market prediction mostly relied on traditional statistical time series models (Du, Yu. 2024), which use past data and patterns to predict future values. One of the most commonly used is the Auto Regressive Integrated Moving Average (ARIMA) model, which combines autoregressive (AR) and moving average (MA) components with differencing to model time series data, as illustrated in Fig 2.1 (Chen, Muhammad and team, 2023).
Fig 2.1. Basic ARIMA architecture for the prediction.
This model works well when the data is linear but cannot capture non-linear patterns and long-term dependencies. It therefore has limitations when handling complex and unpredictable financial data (Box & Jenkins, 1976).
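For reference, the general ARIMA(p, d, q) model can be written compactly with the backshift operator B, where B y_t = y_{t-1}. The notation below follows the common textbook form and is not taken from the figures of this dissertation.

```latex
\left(1 - \sum_{i=1}^{p} \phi_i B^{i}\right)(1 - B)^{d}\, y_t
  = c + \left(1 + \sum_{j=1}^{q} \theta_j B^{j}\right)\varepsilon_t
```

Here the \phi_i are the autoregressive coefficients, the \theta_j are the moving average coefficients, d is the order of differencing, c is a constant and \varepsilon_t is white noise. Because the differenced series is a linear combination of its own past values and past errors, the model cannot represent non-linear relationships.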
4.2 Machine Learning Models
For nonlinear patterns, support vector machines (SVM), k-nearest neighbours (KNN) and decision trees are also used to predict index trends (Chen and Hao, 2017). Although these models have shown good results in stock price prediction, they often require manual selection of relevant features and can overfit when dealing with large volumes of data.
One successful machine learning method is Random Forest (Breiman, 2001), which is an ensemble learning approach. The predictions of several decision trees are combined to improve accuracy and reliability.
Fig 2.2 Basic GBM architecture for the prediction of nonlinear data.
Another well-known method, the Gradient Boosting Machine (GBM), builds decision trees one after another, as illustrated in Fig 2.2, which improves the modelling of nonlinearity in prediction. In this model, each new tree corrects the errors of the previous trees. GBM has proven to be more reliable than simpler methods like decision trees and SVMs (Natekin and Knoll, 2013).
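The idea that each new tree corrects the errors left by the previous ones can be illustrated with a minimal sketch. It assumes squared-error loss, one-split trees (stumps) and a synthetic sine curve as data. The function names, the learning rate and the data are illustrative assumptions and are not part of the models developed in this dissertation.

```python
import numpy as np

def fit_stump(x, residual):
    """Find the single threshold on x whose two-region mean best fits the residuals."""
    best = None
    # The largest x value is skipped so that the right-hand region is never empty.
    for threshold in np.unique(x)[:-1]:
        left = residual[x <= threshold]
        right = residual[x > threshold]
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if best is None or sse < best[0]:
            best = (sse, threshold, left.mean(), right.mean())
    _, threshold, left_value, right_value = best
    return threshold, left_value, right_value

def predict_stump(stump, x):
    threshold, left_value, right_value = stump
    return np.where(x <= threshold, left_value, right_value)

def boost(x, y, n_trees=50, learning_rate=0.1):
    """Fit stumps one after another, each on the residuals of the current ensemble."""
    base = y.mean()
    prediction = np.full_like(y, base, dtype=float)
    stumps, errors = [], []
    for _ in range(n_trees):
        residual = y - prediction              # errors left by the previous trees
        stump = fit_stump(x, residual)         # the next tree is trained on those errors
        prediction = prediction + learning_rate * predict_stump(stump, x)
        stumps.append(stump)
        errors.append(float(np.mean((y - prediction) ** 2)))
    return base, stumps, errors

# Synthetic nonlinear data (illustrative only, not market data).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 6.0, 200)
y = np.sin(x) + 0.1 * rng.normal(size=x.size)

base, stumps, errors = boost(x, y)
print(f"training MSE after 1 tree: {errors[0]:.4f}, after {len(stumps)} trees: {errors[-1]:.4f}")
```

With squared-error loss the residuals are exactly the negative gradient of the loss, which is why fitting each stump to the residuals is a form of gradient boosting.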
4.3 Deep Learning Models
Recent research in deep learning has advanced the prediction of stock market trends. Artificial Neural Networks (ANNs) were among the first techniques applied to forecast the market (Guresen, Kayakutlu and Daim, 2011). However, traditional ANNs cannot capture temporal dependencies, which is a key drawback in forecasting financial trends.
To overcome this limitation, Recurrent Neural Networks (RNN) and Long Short-Term Memory (LSTM) networks were developed, which have a simple yet effective architecture, as illustrated in Fig 2.3.
The LSTM network, a type of RNN, was introduced by Hochreiter and Schmidhuber (1997) to handle long-term dependencies. Recently, LSTM networks have been applied in stock market prediction due to their potential to capture nonlinear patterns over extended periods. Fischer and Krauss (2018) propose that LSTM is one of the best model architectures for a huge volume of financial data variables, including stock prices and stock returns. However, LSTMs are computationally expensive and can still suffer from issues like vanishing gradients, especially when dealing with long sequences.
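The gating mechanism that lets an LSTM retain information over long sequences can be summarised by the commonly used formulation with a forget gate. The symbols below (weight matrices W and U, biases b, input x_t, hidden state h_t and cell state c_t) follow standard textbook notation and are assumptions of this sketch rather than notation used elsewhere in the dissertation.

```latex
\begin{aligned}
f_t &= \sigma\left(W_f x_t + U_f h_{t-1} + b_f\right) \\
i_t &= \sigma\left(W_i x_t + U_i h_{t-1} + b_i\right) \\
o_t &= \sigma\left(W_o x_t + U_o h_{t-1} + b_o\right) \\
\tilde{c}_t &= \tanh\left(W_c x_t + U_c h_{t-1} + b_c\right) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
h_t &= o_t \odot \tanh\left(c_t\right)
\end{aligned}
```

The cell state c_t is updated additively, so when the forget gate f_t stays close to 1 information and gradients can pass through many time steps, which is what eases the long-term dependency problem of plain RNNs.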
Fig 2.3 Basic architecture of neural network (NN) model.
Hence, Gated Recurrent Units (GRU), with a similar but simpler architecture, were developed to address the long-sequence and gradient issues, offering faster training times with comparable performance in time series prediction tasks (Chung, 2014).
4.4 Transformer-Based Models
Gillioz and his team (2020) illustrate that Transformer models, a type of deep learning model, have gained traction in the field of time-based prediction. A Transformer uses a self-attention mechanism to capture complex dependencies across time steps, which makes it effective at handling long-range dependencies.
Fig 2.4 Basic architecture of Transformer-Based Models.
Lim and Zohren (2021) suggest that the Time Series Transformer (TST) has shown better performance than both LSTMs and GRUs in time series prediction tasks. The model’s ability to learn from historical data with high accuracy comes from its attention mechanisms. These models are increasingly being applied to financial prediction tasks with large datasets, where they provide more accurate forecasts than classical ML and deep learning models.
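A minimal sketch of single-head scaled dot-product self-attention, written in NumPy, shows how every time step can attend to every other time step in a sequence. The random inputs, the weight matrices and the dimensions (30 time steps, 8 features) are illustrative assumptions. This is not the PyTorch encoder used later in the dissertation.

```python
import numpy as np

def softmax(scores, axis=-1):
    # Subtract the row maximum for numerical stability before exponentiating.
    shifted = scores - scores.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over one sequence.

    x has shape (time_steps, d_model); the returned weights have shape
    (time_steps, time_steps), one row of attention weights per time step.
    """
    q = x @ w_q
    k = x @ w_k
    v = x @ w_v
    d_k = q.shape[-1]
    weights = softmax(q @ k.T / np.sqrt(d_k))
    return weights @ v, weights

# Illustrative random sequence and projection matrices (assumed values).
rng = np.random.default_rng(0)
time_steps, d_model = 30, 8
x = rng.normal(size=(time_steps, d_model))
w_q, w_k, w_v = (rng.normal(size=(d_model, d_model)) for _ in range(3))

output, weights = self_attention(x, w_q, w_k, w_v)
print(output.shape, weights.shape)
```

Because each row of the weight matrix spans the whole sequence, the distance between two time steps does not limit how strongly one can influence the other, which is the property that helps with long-range dependencies.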
4.5 Hybrid Models
Recently, hybrid models, which combine two or more ML algorithms, have also gained attention (Xu and Zhang, 2021). Examples include using traditional time series models in conjunction with machine learning algorithms, or employing ensemble methods that combine the strengths of each model to improve prediction accuracy. As illustrated in Fig 2.5, an architecture in which a Transformer-based model is combined with an LSTM is expected to give highly reliable predictions.
Fig 2.5 Basic architecture of hybrid model (LSTM + Transformer).
This model uses LSTM layers combined with an attention mechanism to prioritise important parts of the time-series data. Dropout is used to avoid overfitting, while embedding layers turn categorical data into continuous values. The softmax layer gives probabilities, making this approach suitable for time-series prediction and forecasting.
Nti and Weyori (2020) claim that ensemble learning methods, such as stacking, are effective in combining predictions from various models to improve overall prediction accuracy.
4.6 Evaluation Metrics for Stock Market Prediction
Chicco and his team (2021) recommend that MAE, MSE, R², model accuracy and directional accuracy be analysed to assess the reliability of each model’s performance after the model type and architecture have been selected. This is necessary so that a predictive model not only minimizes errors but also generalizes to unseen data (Goldberg and Mouti, 2022). The ability to handle unseen data efficiently is a major property of a good predictive model, which means choosing the right model, using cross-validation and applying techniques to prevent overfitting.
Furthermore, using different evaluation measures and testing the model on various datasets gives a stronger evaluation, helping to ensure that the model not only works well with past data but also predicts future market trends accurately.
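The error metrics named in this section can be computed with a few lines of NumPy, as the sketch below shows. Directional accuracy is assumed here to be the share of steps where the predicted change has the same sign as the actual change, since the section does not give its formula. Model accuracy is left out for the same reason. The function name and the five-point example series are illustrative.

```python
import numpy as np

def evaluate_forecast(y_true, y_pred):
    """Return MAE, MSE, RMSE, R-squared and directional accuracy for two aligned series."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    errors = y_true - y_pred

    mae = np.mean(np.abs(errors))
    mse = np.mean(errors ** 2)
    rmse = np.sqrt(mse)

    # R-squared: 1 minus the ratio of residual to total sum of squares.
    ss_res = np.sum(errors ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot

    # Directional accuracy (assumed definition): fraction of consecutive steps
    # where the predicted move and the actual move have the same sign.
    true_direction = np.sign(np.diff(y_true))
    pred_direction = np.sign(np.diff(y_pred))
    directional_accuracy = np.mean(true_direction == pred_direction)

    return {
        "MAE": float(mae),
        "MSE": float(mse),
        "RMSE": float(rmse),
        "R2": float(r2),
        "DirectionalAccuracy": float(directional_accuracy),
    }

# Illustrative values only, not results from this study.
actual = [100.0, 102.0, 101.0, 105.0, 107.0]
predicted = [101.0, 103.0, 102.0, 104.0, 108.0]
metrics = evaluate_forecast(actual, predicted)
print(metrics)
```

In this toy example every prediction is off by exactly one unit, so MAE, MSE and RMSE all equal 1, while the direction of every move is still predicted correctly. This shows why an error metric and a directional metric answer different questions.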
4.7 Challenges and Future Directions
Although predictive models are advancing rapidly, several problems remain in the field of stock market prediction. Financial data are inherently noisy and volatile, and are affected by many socio-economic variables such as politics and market sentiment (Kumar, Rao and Dhochak, 2025). This makes it difficult to develop an efficient NN model with high accuracy. To address this, Taherdoost (2023) suggests supporting the model with market sentiment analysis using artificial intelligence, as illustrated in Fig 2.7.
Fig 2.7. Architecture of sentiment based Model (Silpa, 2021).
Furthermore, access to a large volume of reliable data has always been a problem. Training and testing a good model requires a large volume of authentic data, which may not always be available for certain financial products or markets. Future work can therefore be expected to integrate additional variables as inputs to the selected model, such as sentiment analysis from news articles or social media, coupled with explainable AI (XAI), to improve the accuracy of predictions (Silpa, 2021).
5 Methodology
The methodology starts with Data Collection and Preprocessing, where relevant data is gathered and cleaned. Then exploratory data analysis (EDA) is used to identify patterns and trends (Krebs, Denton and Wark, 2006). In Model Design and Training, a neural network model is constructed and trained on data from reliable sources, following data ethics. Model Evaluation follows, using performance metrics to assess accuracy. Finally, Deployment and Visualization involve integrating the model into a platform and visualizing the predictions.
5.1 Determining the Model type and architecture
As illustrated in Fig 3.1, this research combines data management, data exploration, literature review, data preprocessing, feature engineering, and model type and architecture selection. The selected model and its architecture will be applied to predict the FTSE, NYA, NSEI and N100 indices by training on the 4033 available entries from the last 16 years for each index.
Fig 3.1. Methodology to obtain the best Model type and Architecture.
Sixteen datasets, one for each year from 2008 to 2023, were collected from a publicly accessible dataset (Pavan Narne, 2023). They include basic global stock price variables such as closing price, volume and adjusted price. Macroeconomic factors such as GDP growth, inflation rates and interest rates, which are publicly available from the Office for National Statistics, are then integrated to enhance the prediction. These datasets are freely accessible and are used in line with the GDPR. They are further explored in order to understand the variables, their types, their number and the collinearity between them. This directs our literature review towards models that are being used in financial market prediction (Snyder, Hannah. 2019).
As illustrated in Fig 3.1, three models are shortlisted on the basis of the literature review. Each model is developed and evaluated on a sample index, the London Stock Exchange (^FTSE), solely to select the model with the best architecture. Based on our literature review, we have selected three model types to be developed and evaluated: a Transformer-based model, an LSTM model and a hybrid model.
5.2 Developing the Model
Once the model type and its best architecture have been selected for our cleaned data, the model will be trained and tested, and its predictions will be visualized and evaluated, as shown in Fig 3.2.
Fig 3.2. Methodology to obtain the best market trend prediction for each index.
For each index, its own data will be used to train, test, predict, visualize and evaluate the forecasted market trend from 2024 to 2030, with the expectation of high accuracy.
A radar diagram is selected for visualization and is used to compare the model accuracy of the predicted trend for each index.
6 Data Processing
The data preprocessing for this project involves several steps to prepare the historical stock market and macroeconomic data for the selected models. After the literature review, we have selected three models to be developed for the evaluation before the prediction (Kabir, 2025).
After importing the necessary tools as seen in the code below, the data is imported from CSV files and then combined based on the date as shown in Fig 5.0. The variables and data are processed as required by our selected model.
Code
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from datetime import timedelta
from IPython.display import display
import sys
import os

# Use the GPU when one is available, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
Fig. 4.0 Codes to import necessary tools for models.
Once the data from all 16 datasets has been combined for each index, it is normalized and cleaned as required by the model, as suggested by Nayak (2021).
6.1 Data Exploration:
The data exploration phase involves examining the key variables as well as some socio-economic variables which are likely to influence the stock market predictions.
The dataset includes several key variables in stock market: Ticker is the unique symbol for a stock; Date is when the data was recorded; Open is the price at the start of the trading day; High and Low are the highest and lowest prices reached during the day; Close is the price at the end of the day; AdjClose is the closing price after adjustments; and Volume is the total number of shares bought and sold. These details help us look at how stock prices change and how much trading happens.
Other stock market indices, such as the NYSE Composite (NYA), Nifty 50 (NSEI) and Euronext 100 (N100), will later be used for their individual predictions after the model is selected. There are a total of seven variables and 4037 entries for each index.
⇒ total number of entries for each index (N) = 4037
The primary target ticker (index) is the FTSE, which serves as the key stock market index for model type selection. The closing price is chosen as the primary feature, since the model is trained to predict future trends in closing price.
The macroeconomic factors incorporated in the study during model architecture selection are GDP growth, inflation rates and interest rates for the UK from 2008 to 2023. These factors are crucial for understanding the broader economic environment and its effect on the market, and they are also used to train the model (Hay, Colin. 2009). All the datasets are imported as shown in Fig. 4.1.
The data tidying process includes several important steps to prepare the datasets for analysis. The date columns in all the relevant data are converted to the correct datetime format to make sure everything aligns properly.
The FTSE 100 index data is filtered, keeping Date as the index. Only closing prices are kept, and any missing values are removed.
The macroeconomic variables are also indexed by date. These datasets are then combined using an outer join, making sure all available data is included, even if there are some missing values.
Finally, the FTSE 100 data is merged with the macroeconomic data and any remaining missing values are filled using linear interpolation.
Rarcia (2015) mentions that rows with null values are removed, leaving clean and organised data. These data can now be further transformed before being used to train each model independently.
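The tidying steps described above can be sketched with pandas on a few placeholder rows. The column names (Ticker, Date, Close, GDP_Growth, Inflation, Interest_Rate), the toy values and the use of a left join onto the FTSE trading dates are assumptions made for illustration. The actual CSV layout and join type used in this study may differ.

```python
import pandas as pd

# Placeholder rows standing in for the CSV contents (illustrative values, not real quotes).
stocks = pd.DataFrame({
    "Ticker": ["^FTSE", "^FTSE", "^FTSE", "^FTSE", "^NYA"],
    "Date": ["2008-01-02", "2008-01-03", "2008-01-04", "2008-01-07", "2008-01-02"],
    "Close": [100.0, 102.0, None, 101.0, 200.0],
})
gdp = pd.DataFrame({"Date": ["2008-01-02", "2008-01-07"], "GDP_Growth": [0.5, 0.4]})
inflation = pd.DataFrame({"Date": ["2008-01-02", "2008-01-03", "2008-01-07"],
                          "Inflation": [2.2, 2.3, 2.5]})
rates = pd.DataFrame({"Date": ["2008-01-02", "2008-01-07"], "Interest_Rate": [5.5, 5.25]})

# 1. Convert every date column to datetime so that the tables align.
for frame in (stocks, gdp, inflation, rates):
    frame["Date"] = pd.to_datetime(frame["Date"])

# 2. Keep only FTSE closing prices, indexed by date, and drop missing closes.
ftse = stocks[stocks["Ticker"] == "^FTSE"].set_index("Date")[["Close"]].dropna()

# 3. Combine the macroeconomic tables with an outer join so no date is lost.
macro = (gdp.set_index("Date")
            .join(inflation.set_index("Date"), how="outer")
            .join(rates.set_index("Date"), how="outer"))

# 4. Merge onto the FTSE dates (left join assumed), fill gaps by linear
#    interpolation, then drop any rows that still contain nulls.
combined = ftse.join(macro, how="left").interpolate(method="linear").dropna()
print(combined)
```

In this toy example the GDP growth value for 2008-01-03 is missing after the merge and is filled with the midpoint of its two neighbours, which is what linear interpolation does for a single gap.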
6.3 Data Refining
To prepare clean data for training, the features are first normalised using a MinMaxScaler (Cabello-Solorzano, 2023). This scales all values to lie between 0 and 1, so that no variable dominates simply because of its scale.
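With its default feature range of [0, 1], min-max scaling transforms each column independently using that column's minimum and maximum. The symbols below are introduced here for illustration only.

```latex
x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}},
\qquad
x = x'\,(x_{\max} - x_{\min}) + x_{\min}
```

The second expression is the inverse transform, which maps a scaled value (for example a predicted closing price) back to its original units.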
The fit_transform() method is applied to the combined dataset to carry out this normalisation. Then, a function called create_sequences is applied to convert the time series into sequences of 30 consecutive days (time_steps=30).
This function creates two sets: the feature set X, which contains the input data for each sequence, and the target set y, which contains the stock price for the day following each sequence.
When splitting the data, 80% is used for training and 20% for testing, as illustrated in Fig 4.3. The dataset is now ready to be used in the model, which will learn the time-based patterns and predict future stock prices.
Fig. 4.3 Codes to apply transform and sequence to the data.
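The windowing and the chronological 80/20 split can be sketched as follows, to show the array shapes they produce. The helper name make_windows, the dummy data and the assumption of four feature columns (closing price plus three macroeconomic variables, with the closing price in column 0) are illustrative. The row count of 4037 is the number of entries per index stated in Section 6.1.

```python
import numpy as np

def make_windows(data, time_steps=30, target_col=0):
    """Turn a (rows, features) array into overlapping windows and next-day targets."""
    data = np.asarray(data, dtype=float)
    # Window i covers rows i .. i+time_steps-1; its target is row i+time_steps.
    X = np.stack([data[i:i + time_steps] for i in range(len(data) - time_steps)])
    y = data[time_steps:, target_col]
    return X, y

# Dummy data with the stated number of rows; the values themselves are placeholders.
n_rows, n_features = 4037, 4
data = np.arange(n_rows * n_features, dtype=float).reshape(n_rows, n_features)

X, y = make_windows(data, time_steps=30)

# Chronological split: the first 80% of windows for training, the rest for testing.
train_size = int(len(X) * 0.8)
X_train, X_test = X[:train_size], X[train_size:]
y_train, y_test = y[:train_size], y[train_size:]

print(X.shape, y.shape)            # (4007, 30, 4) (4007,)
print(len(X_train), len(X_test))   # 3205 802
```

The split is made by position rather than at random, so every test window lies later in time than every training window.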
The code in Fig 4.4 converts the training and testing data into PyTorch tensors with a data type of float32.
Input features (X) are paired with the corresponding target values (y) using TensorDataset. Batches of training and testing data are then created using DataLoader, with a batch size of 32. The training data is shuffled, while the test data is not, allowing for efficient training and evaluation of the model.
Fig. 4.4 Codes to convert training and testing functions to PyTorch.
7 Model Development:
7.1 Transformer-Based Model:
The Transformer-based model is implemented in the StockTransformer class. In the __init__ method, the model is built with a total of 6 layers. An embedding layer (nn.Linear) projects the input features into the model’s hidden dimension. A Transformer encoder is then used to learn patterns across the time steps of the training data.
Code
class StockTransformer(nn.Module):
    def __init__(self, input_dim, hidden_dim=128, output_dim=1, num_heads=4, num_layers=2):
        super(StockTransformer, self).__init__()
        # Linear "embedding" that projects each time step's features to hidden_dim
        self.embedding = nn.Linear(input_dim, hidden_dim)
        encoder_layers = nn.TransformerEncoderLayer(d_model=hidden_dim, nhead=num_heads, dim_feedforward=256, batch_first=True)
        self.transformer = nn.TransformerEncoder(encoder_layers, num_layers=num_layers)
        self.fc = nn.Linear(hidden_dim, output_dim)

    def forward(self, x):
        x = self.embedding(x)
        x = self.transformer(x)
        x = x[:, -1, :]  # keep only the last time step
        return self.fc(x)

model = StockTransformer(input_dim=X.shape[2]).to(device)
criterion = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Training loop (anything printed is redirected to os.devnull)
from contextlib import redirect_stdout
epochs = 50
with open(os.devnull, 'w') as fnull:
    with redirect_stdout(fnull):
        for epoch in range(epochs):
            model.train()
            train_loss = 0.0
            for X_batch, y_batch in train_loader:
                optimizer.zero_grad()
                # squeeze(-1) drops only the output dimension, so a batch of size 1 keeps its shape
                y_pred = model(X_batch).squeeze(-1)
                loss = criterion(y_pred, y_batch)
                loss.backward()
                optimizer.step()
                train_loss += loss.item()

# Predictions on the test set (eval mode switches off dropout for inference)
model.eval()
with torch.no_grad():
    y_pred = model(X_test_torch).detach().cpu().numpy()
Fig. 5.1(a) Codes for transformer based model development.
As shown in Fig 5.1, the embedding layer follows the initial input layer and transforms the input into a higher-dimensional space. The data then passes through two Transformer encoder layers, which use self-attention and a feedforward network to process it. A fully connected layer is added to make the prediction, which is returned as the final output.
The encoder is configured through the num_heads setting (the number of attention heads) and the num_layers setting (the number of encoder layers). A fully connected layer (fc) then produces the predicted stock price. In the forward method, the input data first goes through the embedding layer to create a more detailed representation. Next, it moves through the Transformer encoder, which looks at the data over time.
Only the final output (the last time step) is selected as the prediction, which is common in time series forecasting. For training, the Mean Squared Error (MSE) loss function is used, as this is a regression problem and it helps in minimizing the difference between predicted and actual stock prices. The Adam optimizer is used for parameter updates with a learning rate of 0.001. The training loop runs for 50 epochs, processing data in batches provided by train_loader.
The loss is computed for each batch. When training is finished, predictions are made using the test set, and the model’s performance is checked by comparing the predicted stock prices with the real ones, as shown in the code in Fig 5.1(b).
Fig. 5.1(b) Model evaluation for the transformer based model.
7.2 LSTM Model:
This model is built to forecast stock prices using an LSTM (Long Short-Term Memory) network.
As seen in Fig 5.2, the model has four main layers. First, the Input Layer takes in the raw data, which includes things like past stock prices or other details over time. Then, the LSTM Layer processes this data, learning patterns and returning a set of hidden states that shows the relation between each data point over time. After that, the Fully Connected (FC) Layer uses these hidden states and turns them into the final output. Finally, the Output Layer gives the predicted value, such as the predicted stock price for that sequence.
7.3 Hybrid Model:
Following the approach introduced by Chen (2023), LSTM (Long Short-Term Memory) and Transformer models are combined to form a hybrid model. The models were defined using PyTorch, with both the LSTM and Transformer architectures consisting of key layers that process the input data and output predictions.
This hybrid has two parts, the LSTM and the Transformer, as shown in Fig 5.3, each with its own layers. The LSTM model starts with an Input Layer where the raw data enters the model. The LSTM Layer then processes the sequence data using three stacked LSTM layers, with a dropout rate of 0.3 to reduce overfitting. A Fully Connected (FC) Layer maps the LSTM output to a single predicted value, and the Output Layer produces the final predicted data point. The predicted data points can be used to recognise hidden patterns, which can be analysed to predict possible future trends in the market.
In the second model, the Transformer, the sequence data also enters through the Input Layer. It then passes through an Embedding Layer, which transforms the data into a higher-dimensional space. The core of the model consists of Transformer Encoder Layers that use self-attention to capture complex patterns in the data. After two encoder layers, the Fully Connected Layer maps the output to the prediction, followed by the Output Layer, which gives the final predicted value. Although both models have four layers each, the Transformer’s self-attention mechanism and embedding layer allow it to handle long-range dependencies more effectively than the LSTM.
Code
from contextlib import redirect_stdout

def create_sequences(data, time_steps=30):
    X, y = [], []
    for i in range(len(data) - time_steps):
        X.append(data[i:i+time_steps])
        y.append(data[i+time_steps][0])  # next-day value of the first column
    return np.array(X), np.array(y)

time_steps = 30
X, y = create_sequences(df_scaled, time_steps)

# Chronological 80/20 split
train_size = int(len(X) * 0.8)
X_train, X_test = X[:train_size], X[train_size:]
y_train, y_test = y[:train_size], y[train_size:]

X_train_torch = torch.tensor(X_train, dtype=torch.float32).to(device)
X_test_torch = torch.tensor(X_test, dtype=torch.float32).to(device)
y_train_torch = torch.tensor(y_train, dtype=torch.float32).to(device)
y_test_torch = torch.tensor(y_test, dtype=torch.float32).to(device)

train_dataset = TensorDataset(X_train_torch, y_train_torch)
test_dataset = TensorDataset(X_test_torch, y_test_torch)
train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=32, shuffle=False)

# LSTM Model definition
class StockLSTM(nn.Module):
    def __init__(self, input_dim, hidden_dim=128, output_dim=1, num_layers=3, dropout=0.3):
        super(StockLSTM, self).__init__()
        self.lstm = nn.LSTM(input_dim, hidden_dim, num_layers, batch_first=True, dropout=dropout)
        self.fc = nn.Linear(hidden_dim, output_dim)

    def forward(self, x):
        lstm_out, _ = self.lstm(x)
        out = lstm_out[:, -1, :]  # Last time step output
        return self.fc(out)

# Transformer Model definition
class StockTransformer(nn.Module):
    def __init__(self, input_dim, hidden_dim=128, output_dim=1, num_heads=4, num_layers=2):
        super(StockTransformer, self).__init__()
        self.embedding = nn.Linear(input_dim, hidden_dim)
        encoder_layers = nn.TransformerEncoderLayer(d_model=hidden_dim, nhead=num_heads, dim_feedforward=256, batch_first=True)
        self.transformer = nn.TransformerEncoder(encoder_layers, num_layers=num_layers)
        self.fc = nn.Linear(hidden_dim, output_dim)

    def forward(self, x):
        x = self.embedding(x)
        x = self.transformer(x)
        x = x[:, -1, :]
        return self.fc(x)

# Initialize models
lstm_model = StockLSTM(input_dim=X.shape[2]).to(device)
transformer_model = StockTransformer(input_dim=X.shape[2]).to(device)

# Loss and optimizers (one optimizer per model)
criterion = nn.MSELoss()
lstm_optimizer = optim.Adam(lstm_model.parameters(), lr=0.0005)
transformer_optimizer = optim.Adam(transformer_model.parameters(), lr=0.001)

# Redirect all print output to os.devnull during training and prediction
with open(os.devnull, 'w') as fnull:
    with redirect_stdout(fnull):
        # Train both models on the same batches
        epochs = 50
        for epoch in range(epochs):
            lstm_model.train()
            transformer_model.train()
            train_loss_lstm, train_loss_transformer = 0.0, 0.0
            for X_batch, y_batch in train_loader:
                lstm_optimizer.zero_grad()
                y_pred_lstm = lstm_model(X_batch).squeeze(-1)
                loss_lstm = criterion(y_pred_lstm, y_batch)
                loss_lstm.backward()
                lstm_optimizer.step()
                train_loss_lstm += loss_lstm.item()

                transformer_optimizer.zero_grad()
                y_pred_transformer = transformer_model(X_batch).squeeze(-1)
                loss_transformer = criterion(y_pred_transformer, y_batch)
                loss_transformer.backward()
                transformer_optimizer.step()
                train_loss_transformer += loss_transformer.item()

        # Predictions from both models on the test set
        lstm_model.eval()
        transformer_model.eval()
        with torch.no_grad():
            y_pred_lstm = lstm_model(X_test_torch).detach().cpu().numpy()
            y_pred_transformer = transformer_model(X_test_torch).detach().cpu().numpy()

        # Combine the predictions with equal weights
        y_pred_combined = 0.5 * y_pred_lstm + 0.5 * y_pred_transformer
Fig. 5.3. Code for the hybrid (LSTM + Transformer) model architecture.
After training, the predictions of the two models were combined by averaging their outputs with equal weights, with the aim of improving accuracy. The combined predictions were then compared against the test data to assess how well the hybrid model forecasts stock market trends.
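The averaging step can be isolated as a small helper. The following is a minimal sketch, assuming both models return one value per test window; the function name `combine_predictions` and the sample numbers are hypothetical and are not results from this dissertation. Flattening both arrays first avoids accidental broadcasting when one array has shape (N, 1) and the other has shape (N,).

```python
import numpy as np

def combine_predictions(pred_a, pred_b, weight_a=0.5):
    """Weighted average of two prediction arrays, returned as a 1-D array."""
    a = np.asarray(pred_a, dtype=float).reshape(-1)
    b = np.asarray(pred_b, dtype=float).reshape(-1)
    if a.shape != b.shape:
        raise ValueError("prediction arrays must have the same length")
    return weight_a * a + (1.0 - weight_a) * b

# Illustrative values only (not outputs of the trained models)
lstm_out = np.array([[0.2], [0.4], [0.6]])         # shape (3, 1)
transformer_out = np.array([[0.4], [0.4], [0.2]])  # shape (3, 1)
combined = combine_predictions(lstm_out, transformer_out)
```

With `weight_a=0.5` this reproduces the equal-weight average used for the hybrid model; other weights would be a design choice that needs its own validation.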
Now, calculating the evaluation metrics for the above hybrid model:
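The metrics used in this comparison can be computed along the following lines. This is a minimal sketch rather than the exact evaluation code: the function name `evaluate_forecast` is hypothetical, "model accuracy" is assumed here to mean 100 minus the mean absolute percentage error, and directional accuracy is assumed to be the share of steps in which the predicted and actual series move in the same direction. The sample values are illustrative only.

```python
import numpy as np

def evaluate_forecast(y_true, y_pred):
    """Return MSE, RMSE, MAE, R², model accuracy (%) and directional accuracy (%)."""
    y_true = np.asarray(y_true, dtype=float).reshape(-1)
    y_pred = np.asarray(y_pred, dtype=float).reshape(-1)
    err = y_true - y_pred

    mse = float(np.mean(err ** 2))
    rmse = float(np.sqrt(mse))
    mae = float(np.mean(np.abs(err)))

    # R² = 1 - residual sum of squares / total sum of squares
    ss_res = float(np.sum(err ** 2))
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))
    r2 = 1.0 - ss_res / ss_tot

    # Assumption: model accuracy = 100 - MAPE (requires non-zero actual values)
    mape = float(np.mean(np.abs(err / y_true))) * 100.0
    accuracy = 100.0 - mape

    # Assumption: a step counts as correct when both series change in the same direction
    same_direction = np.sign(np.diff(y_true)) == np.sign(np.diff(y_pred))
    directional = float(np.mean(same_direction)) * 100.0

    return {"mse": mse, "rmse": rmse, "mae": mae, "r2": r2,
            "model_accuracy": accuracy, "directional_accuracy": directional}

# Illustrative prices in index points (not dissertation results)
metrics = evaluate_forecast([100, 110, 105, 120], [102, 108, 107, 118])
```

The metrics should be computed on prices that have been transformed back to the original scale; on min-max scaled values, MAE and RMSE are not expressed in index points.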
When the positive evaluation metrics are plotted in a radar diagram, as shown in Fig 6.1, we obtain one profile for each tested model, which is used to select the model type and its architecture.
Fig 6.1 Radar diagram of the positive metrics for model comparison.
Comparing the metrics of the tested model types and architectures, the Transformer-based model is the best choice for predicting future trends in the global stock market because it outperforms the other models on the key measures. Its R-squared score (R²) of 0.93 means that it explains 93% of the variation in the data, so it captures the stock market trends in our data effectively. It also has much lower MAE and RMSE values (109 and 139, respectively) than the other models. Overall, the Transformer model offers the best balance of accuracy and explanatory power, making it the most reliable model for forecasting stock market trends.
9 Data Analytics and prediction:
For each selected stock index, a transformer-based neural network model was designed, trained and tested, and its predictions were plotted in the visualizations that follow, as per our model development methodology.
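The listings in this chapter scale the whole series and then cut it into 30-step windows before the 80/20 split. As a point of comparison, the sketch below fits the min-max statistics on the training rows only, so that the test period does not influence the scaling. This is an alternative shown under stated assumptions, not the procedure used for the reported results; the helper names `make_windows` and `minmax_fit` and the synthetic data are hypothetical.

```python
import numpy as np

def make_windows(data, time_steps=30, target_col=0):
    """Sliding windows of length time_steps; target is target_col of the next row."""
    X, y = [], []
    for i in range(len(data) - time_steps):
        X.append(data[i:i + time_steps])
        y.append(data[i + time_steps, target_col])
    return np.array(X), np.array(y)

def minmax_fit(train):
    """Per-column minimum and range, computed from the training rows only."""
    lo = train.min(axis=0)
    hi = train.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # guard against constant columns
    return lo, span

# Synthetic stand-in for the combined price + macroeconomic table (200 rows, 4 columns)
rng = np.random.default_rng(0)
raw = 100.0 + np.cumsum(rng.normal(size=(200, 4)), axis=0)

time_steps = 30
split = int(len(raw) * 0.8)               # chronological 80/20 split
lo, span = minmax_fit(raw[:split])        # statistics from the training period
scaled = (raw - lo) / span                # test rows may fall outside [0, 1]

X_train, y_train = make_windows(scaled[:split], time_steps)
# The test windows reuse the last 30 training rows as context for the first test target
X_test, y_test = make_windows(scaled[split - time_steps:], time_steps)
```

Each window has shape (30, number of features), which matches the `batch_first=True` layout expected by the model.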
9.1 London Stock Exchange(^FTSE):
After the application of the selected transformer model and its architecture for the prediction of the London Stock Exchange (^FTSE), we obtain the prediction as in Fig 7.1.
Code
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from datetime import timedelta
import os
from IPython.display import display

# Ensure the device is either CPU or CUDA (GPU)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Define file paths and load datasets (one CSV file per year, 2014-2023)
file_directory = r"C:\Users\Dell\OneDrive\Desktop\University\Course\9. M.Sc. Project\3. Dataset\archive"
file_names = [f"{year}_Global_Markets_Data.csv" for year in range(2014, 2024)]
file_paths = [os.path.join(file_directory, file) for file in file_names]
dfs = [pd.read_csv(file) for file in file_paths]
df_stock = pd.concat(dfs, ignore_index=True)

# Macroeconomic data files
file_directory_macroeconomics = r"C:\Users\Dell\OneDrive\Desktop\University\Course\9. M.Sc. Project\3. Dataset\archive\Macroeconomic_factors"
df_gdp = pd.read_csv(os.path.join(file_directory_macroeconomics, "GDP_growth_uk.csv"))
df_inflation = pd.read_csv(os.path.join(file_directory_macroeconomics, "Inflation_rate_uk.csv"))
df_interest = pd.read_csv(os.path.join(file_directory_macroeconomics, "Interest_rate_uk.csv"))

# Convert Date columns to datetime
df_stock['Date'] = pd.to_datetime(df_stock['Date'])
df_gdp['Date'] = pd.to_datetime(df_gdp['Date'])
df_inflation['Date'] = pd.to_datetime(df_inflation['Date'])
df_interest['Date'] = pd.to_datetime(df_interest['Date'])

# Filter FTSE 100 (^FTSE) data
df_ftse = df_stock[df_stock['Ticker'] == '^FTSE'].copy()
df_ftse.set_index('Date', inplace=True)
df_ftse = df_ftse[['Close']].dropna()

# Set Date as index for macroeconomic data
df_gdp.set_index('Date', inplace=True)
df_inflation.set_index('Date', inplace=True)
df_interest.set_index('Date', inplace=True)

# Merge macroeconomic indicators into stock data
df_macro = df_gdp.join([df_inflation, df_interest], how='outer')
df_combined = df_ftse.join(df_macro, how='outer')
df_combined = df_combined.interpolate(method='linear').dropna()

# Normalize the combined data (column 0 is the closing price)
scaler = MinMaxScaler()
df_scaled = scaler.fit_transform(df_combined)

# Function to create sequences for the model
def create_sequences(data, time_steps=30):
    X, y = [], []
    for i in range(len(data) - time_steps):
        X.append(data[i:i+time_steps])
        y.append(data[i+time_steps][0])
    return np.array(X), np.array(y)

time_steps = 30
X, y = create_sequences(df_scaled, time_steps)
train_size = int(len(X) * 0.8)
X_train, X_test = X[:train_size], X[train_size:]
y_train, y_test = y[:train_size], y[train_size:]

# Convert to PyTorch tensors
X_train_torch = torch.tensor(X_train, dtype=torch.float32).to(device)
X_test_torch = torch.tensor(X_test, dtype=torch.float32).to(device)
y_train_torch = torch.tensor(y_train, dtype=torch.float32).to(device)
y_test_torch = torch.tensor(y_test, dtype=torch.float32).to(device)
train_dataset = TensorDataset(X_train_torch, y_train_torch)
test_dataset = TensorDataset(X_test_torch, y_test_torch)
train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=32, shuffle=False)

# Define the Transformer model
class StockTransformer(nn.Module):
    def __init__(self, input_dim, hidden_dim=128, output_dim=1, num_heads=4, num_layers=2):
        super(StockTransformer, self).__init__()
        self.embedding = nn.Linear(input_dim, hidden_dim)
        encoder_layers = nn.TransformerEncoderLayer(d_model=hidden_dim, nhead=num_heads, dim_feedforward=256, batch_first=True)
        self.transformer = nn.TransformerEncoder(encoder_layers, num_layers=num_layers)
        self.fc = nn.Linear(hidden_dim, output_dim)

    def forward(self, x):
        x = self.embedding(x)
        x = self.transformer(x)
        x = x[:, -1, :]  # Keep the last time step
        return self.fc(x)

model = StockTransformer(input_dim=X.shape[2]).to(device)
criterion = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Training loop
epochs = 50
for epoch in range(epochs):
    model.train()
    train_loss = 0.0
    for X_batch, y_batch in train_loader:
        optimizer.zero_grad()
        y_pred = model(X_batch).squeeze(-1)  # drop only the output dimension
        loss = criterion(y_pred, y_batch)
        loss.backward()
        optimizer.step()
        train_loss += loss.item()

# Predictions for the test data (evaluation mode switches off dropout)
model.eval()
with torch.no_grad():
    y_pred = model(X_test_torch).detach().cpu().numpy()

# Predict future values: 7 * 12 = 84 steps, plotted below on month-end dates
future_steps = 7 * 12
predicted_future = []

# Start from the last test window and feed each prediction back into the window
input_sequence = X_test[-1].reshape(1, time_steps, X_test.shape[2])
for _ in range(future_steps):
    with torch.no_grad():
        predicted_value = model(torch.tensor(input_sequence, dtype=torch.float32).to(device)).detach().cpu().numpy()
    predicted_future.append(predicted_value[0])
    # Shift the window by one row; np.roll wraps the oldest row round to the end
    input_sequence = np.roll(input_sequence, -1, axis=1)
    # Overwrite the closing price (column 0) of that last row with the prediction
    input_sequence[0, -1, 0] = predicted_value.item()
predicted_future = np.array(predicted_future)

# Reverse the scaling: pad the other columns with zeros, then keep column 0
predicted_future_rescaled = scaler.inverse_transform(np.hstack((predicted_future, np.zeros((predicted_future.shape[0], df_combined.shape[1] - 1)))))
predicted_future_rescaled = predicted_future_rescaled[:, 0]

# Visualization
plt.figure(figsize=(10, 5))

# Plot Historical Data (2014-2023)
plt.plot(df_ftse.index, df_ftse['Close'], label="Historical Data", color='blue')

# Plot the predicted data on month-end dates ('ME' replaces the deprecated 'M' alias)
future_dates = pd.date_range(df_ftse.index[-1], periods=future_steps, freq='ME')
plt.plot(future_dates, predicted_future_rescaled, label="Predicted Data (2025-2030)", linestyle='--', color='darkred')

# Shade a fixed band of +/-50 index points around the forecast
plt.fill_between(future_dates, predicted_future_rescaled - 50, predicted_future_rescaled + 50, color='red', alpha=0.2)
plt.xlabel("Date", fontweight='bold')
plt.ylabel("FTSE 100 Closing Price", fontweight='bold')
plt.title("London Stock Exchange (^FTSE) Predicted Closing Prices (2025-2030)", fontweight='bold')
plt.legend()
plt.grid(True, linestyle='-', linewidth=0.5)
plt.tight_layout()
plt.show()
Fig 7.1 Prediction for the London Stock Exchange from 2023 to 2030.
As illustrated in Fig 7.1, the prediction for the London Stock Exchange (FTSE 100) from 2023 to 2030 shows an overall upward trend in the predicted data (shown in red). The predicted values suggest growth during 2025, followed by a decline with some fluctuations until 2028. A sudden market rise can be expected in the early months of 2028.
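The forecast in Fig 7.1 is produced recursively: each predicted value is fed back into the input window to predict the next one. In the listing above, `np.roll` moves the oldest row of the window to the end and only its closing-price column is overwritten, so the macroeconomic columns of the newest row hold values from 30 steps earlier. The sketch below shows one possible variant that carries the most recent macroeconomic values forward instead. It is an illustration under stated assumptions, not the code behind the reported figures; `recursive_forecast` and the stand-in predictor `mean_of_recent_closes` are hypothetical names, and the stand-in replaces the trained Transformer so that the example runs on its own.

```python
import numpy as np

def recursive_forecast(predict_next, last_window, steps, target_col=0):
    """Roll a window forward, appending each prediction as the newest row."""
    window = np.array(last_window, dtype=float, copy=True)  # (time_steps, n_features)
    forecasts = []
    for _ in range(steps):
        next_value = float(predict_next(window))
        new_row = window[-1].copy()        # carry the latest exogenous values forward
        new_row[target_col] = next_value   # replace the target with the prediction
        window = np.vstack([window[1:], new_row])
        forecasts.append(next_value)
    return np.array(forecasts)

def mean_of_recent_closes(window):
    """Hypothetical stand-in for the trained model: mean of the last three closes."""
    return window[-3:, 0].mean()

# Scaled window with two columns: closing price and one constant macro feature
window0 = np.column_stack([np.linspace(0.1, 0.4, 30), np.full(30, 0.5)])
path = recursive_forecast(mean_of_recent_closes, window0, steps=5)
```

Because every step builds on earlier predictions, errors accumulate along the path, so the later part of a long recursive forecast is less certain than the first few steps.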
Now printing the performance metrics for this prediction of ^FTSE.
9.2 New York Stock Exchange(^NYA):
Similarly, after the application of the selected transformer model and its architecture for the prediction of the New York Stock Exchange (^NYA), we obtain the prediction as in Fig 7.2.
Code
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from datetime import timedelta
import os

# Ensure the device is either CPU or CUDA (GPU)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Define file paths and load datasets (one CSV file per year, 2014-2023)
file_directory = r"C:\Users\Dell\OneDrive\Desktop\University\Course\9. M.Sc. Project\3. Dataset\archive"
file_names = [f"{year}_Global_Markets_Data.csv" for year in range(2014, 2024)]
file_paths = [os.path.join(file_directory, file) for file in file_names]
dfs = [pd.read_csv(file) for file in file_paths]
df_stock = pd.concat(dfs, ignore_index=True)

# Macroeconomic data files (this listing loads the UK series, as in the ^FTSE listing)
file_directory_macroeconomics = r"C:\Users\Dell\OneDrive\Desktop\University\Course\9. M.Sc. Project\3. Dataset\archive\Macroeconomic_factors"
df_gdp = pd.read_csv(os.path.join(file_directory_macroeconomics, "GDP_growth_uk.csv"))
df_inflation = pd.read_csv(os.path.join(file_directory_macroeconomics, "Inflation_rate_uk.csv"))
df_interest = pd.read_csv(os.path.join(file_directory_macroeconomics, "Interest_rate_uk.csv"))

# Convert Date columns to datetime
df_stock['Date'] = pd.to_datetime(df_stock['Date'])
df_gdp['Date'] = pd.to_datetime(df_gdp['Date'])
df_inflation['Date'] = pd.to_datetime(df_inflation['Date'])
df_interest['Date'] = pd.to_datetime(df_interest['Date'])

# Filter New York Stock Exchange (^NYA) data (the variable name df_ftse is kept from the ^FTSE listing)
df_ftse = df_stock[df_stock['Ticker'] == '^NYA'].copy()
df_ftse.set_index('Date', inplace=True)
df_ftse = df_ftse[['Close']].dropna()

# Set Date as index for macroeconomic data
df_gdp.set_index('Date', inplace=True)
df_inflation.set_index('Date', inplace=True)
df_interest.set_index('Date', inplace=True)

# Merge macroeconomic indicators into stock data
df_macro = df_gdp.join([df_inflation, df_interest], how='outer')
df_combined = df_ftse.join(df_macro, how='outer')
df_combined = df_combined.interpolate(method='linear').dropna()

# Normalize the combined data (column 0 is the closing price)
scaler = MinMaxScaler()
df_scaled = scaler.fit_transform(df_combined)

# Function to create sequences for the model
def create_sequences(data, time_steps=30):
    X, y = [], []
    for i in range(len(data) - time_steps):
        X.append(data[i:i+time_steps])
        y.append(data[i+time_steps][0])
    return np.array(X), np.array(y)

time_steps = 30
X, y = create_sequences(df_scaled, time_steps)
train_size = int(len(X) * 0.8)
X_train, X_test = X[:train_size], X[train_size:]
y_train, y_test = y[:train_size], y[train_size:]

# Convert to PyTorch tensors
X_train_torch = torch.tensor(X_train, dtype=torch.float32).to(device)
X_test_torch = torch.tensor(X_test, dtype=torch.float32).to(device)
y_train_torch = torch.tensor(y_train, dtype=torch.float32).to(device)
y_test_torch = torch.tensor(y_test, dtype=torch.float32).to(device)
train_dataset = TensorDataset(X_train_torch, y_train_torch)
test_dataset = TensorDataset(X_test_torch, y_test_torch)
train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=32, shuffle=False)

# Define the Transformer model
class StockTransformer(nn.Module):
    def __init__(self, input_dim, hidden_dim=128, output_dim=1, num_heads=4, num_layers=2):
        super(StockTransformer, self).__init__()
        self.embedding = nn.Linear(input_dim, hidden_dim)
        encoder_layers = nn.TransformerEncoderLayer(d_model=hidden_dim, nhead=num_heads, dim_feedforward=256, batch_first=True)
        self.transformer = nn.TransformerEncoder(encoder_layers, num_layers=num_layers)
        self.fc = nn.Linear(hidden_dim, output_dim)

    def forward(self, x):
        x = self.embedding(x)
        x = self.transformer(x)
        x = x[:, -1, :]  # Keep the last time step
        return self.fc(x)

model = StockTransformer(input_dim=X.shape[2]).to(device)
criterion = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Training loop
epochs = 50
for epoch in range(epochs):
    model.train()
    train_loss = 0.0
    for X_batch, y_batch in train_loader:
        optimizer.zero_grad()
        y_pred = model(X_batch).squeeze(-1)  # drop only the output dimension
        loss = criterion(y_pred, y_batch)
        loss.backward()
        optimizer.step()
        train_loss += loss.item()

# Predictions for the test data (evaluation mode switches off dropout)
model.eval()
with torch.no_grad():
    y_pred = model(X_test_torch).detach().cpu().numpy()

# Predict future values: 7 * 12 = 84 steps, plotted below on month-end dates
future_steps = 7 * 12
predicted_future = []

# Start from the last test window and feed each prediction back into the window
input_sequence = X_test[-1].reshape(1, time_steps, X_test.shape[2])
for _ in range(future_steps):
    with torch.no_grad():
        predicted_value = model(torch.tensor(input_sequence, dtype=torch.float32).to(device)).detach().cpu().numpy()
    predicted_future.append(predicted_value[0])
    # Shift the window by one row; np.roll wraps the oldest row round to the end
    input_sequence = np.roll(input_sequence, -1, axis=1)
    # Overwrite the closing price (column 0) of that last row with the prediction
    input_sequence[0, -1, 0] = predicted_value.item()
predicted_future = np.array(predicted_future)

# Reverse the scaling: pad the other columns with zeros, then keep column 0
predicted_future_rescaled = scaler.inverse_transform(np.hstack((predicted_future, np.zeros((predicted_future.shape[0], df_combined.shape[1] - 1)))))
predicted_future_rescaled = predicted_future_rescaled[:, 0]

# Visualization
plt.figure(figsize=(10, 5))

# Plot Historical Data (2014-2023)
plt.plot(df_ftse.index, df_ftse['Close'], label="Historical Data", color='blue')

# Plot the predicted data on month-end dates ('ME' replaces the deprecated 'M' alias)
future_dates = pd.date_range(df_ftse.index[-1], periods=future_steps, freq='ME')
plt.plot(future_dates, predicted_future_rescaled, label="Predicted Data (2025-2030)", linestyle='--', color='darkred')

# Shade a fixed band of +/-50 index points around the forecast
plt.fill_between(future_dates, predicted_future_rescaled - 50, predicted_future_rescaled + 50, color='red', alpha=0.2)
plt.xlabel("Date", fontweight='bold')
plt.ylabel("NYA Closing Price", fontweight='bold')
plt.title("NEW YORK (^NYA) Predicted Closing Prices(2025-2030)", fontweight='bold')
plt.legend()
plt.grid(True, linestyle='-', linewidth=0.5)
plt.tight_layout()
plt.show()
Fig 7.2 Prediction of the New York Stock Exchange from 2023 to 2030.
As illustrated in Fig 7.2, the prediction for the New York Stock Exchange (^NYA) from 2023 to 2030 shows a consistent upward trend with some fluctuations in the predicted data (shown in red). According to the visualization, ^NYA is expected to rise suddenly at the end of 2025 and in the first quarter of 2028.
The model forecasts a steady increase in market value, with some fluctuations between 2025 and 2030. However, the average slope of the forecast trend is noticeably low. Overall, the New York Stock Exchange is expected to grow at a slow pace in the coming years.
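The average slope of a forecast can be quantified by fitting a straight line through the predicted values. The following is a minimal sketch of that calculation; the helper name `average_slope` is hypothetical and the sample series is illustrative, not the ^NYA forecast.

```python
import numpy as np

def average_slope(values):
    """Least-squares slope of the values against their step index (units per step)."""
    values = np.asarray(values, dtype=float).reshape(-1)
    steps = np.arange(len(values))
    slope, _intercept = np.polyfit(steps, values, deg=1)
    return float(slope)

# Illustrative forecast path in index points (not dissertation output)
forecast = np.array([100.0, 102.0, 101.0, 104.0, 103.0, 106.0])
slope_per_step = average_slope(forecast)
pct_per_step = 100.0 * slope_per_step / forecast[0]  # slope relative to the starting level
```

Expressing the slope as a percentage of the starting level makes the pace of growth comparable across indices that trade at very different levels.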
Now printing the performance metrics for this prediction of ^NYA.
9.3 National Stock Exchange of India (^NSEI):
Again, after the application of the selected transformer model and its architecture for the prediction of the National Stock Exchange of India (^NSEI), we obtain the prediction as in Fig 7.3.
Code
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from datetime import timedelta
import os

# Ensure the device is either CPU or CUDA (GPU)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Define file paths and load datasets (one CSV file per year, 2014-2023)
file_directory = r"C:\Users\Dell\OneDrive\Desktop\University\Course\9. M.Sc. Project\3. Dataset\archive"
file_names = [f"{year}_Global_Markets_Data.csv" for year in range(2014, 2024)]
file_paths = [os.path.join(file_directory, file) for file in file_names]
dfs = [pd.read_csv(file) for file in file_paths]
df_stock = pd.concat(dfs, ignore_index=True)

# Macroeconomic data files (this listing loads the UK series, as in the ^FTSE listing)
file_directory_macroeconomics = r"C:\Users\Dell\OneDrive\Desktop\University\Course\9. M.Sc. Project\3. Dataset\archive\Macroeconomic_factors"
df_gdp = pd.read_csv(os.path.join(file_directory_macroeconomics, "GDP_growth_uk.csv"))
df_inflation = pd.read_csv(os.path.join(file_directory_macroeconomics, "Inflation_rate_uk.csv"))
df_interest = pd.read_csv(os.path.join(file_directory_macroeconomics, "Interest_rate_uk.csv"))

# Convert Date columns to datetime
df_stock['Date'] = pd.to_datetime(df_stock['Date'])
df_gdp['Date'] = pd.to_datetime(df_gdp['Date'])
df_inflation['Date'] = pd.to_datetime(df_inflation['Date'])
df_interest['Date'] = pd.to_datetime(df_interest['Date'])

# Filter National Stock Exchange of India (^NSEI) data (the variable name df_ftse is kept from the ^FTSE listing)
df_ftse = df_stock[df_stock['Ticker'] == '^NSEI'].copy()
df_ftse.set_index('Date', inplace=True)
df_ftse = df_ftse[['Close']].dropna()

# Set Date as index for macroeconomic data
df_gdp.set_index('Date', inplace=True)
df_inflation.set_index('Date', inplace=True)
df_interest.set_index('Date', inplace=True)

# Merge macroeconomic indicators into stock data
df_macro = df_gdp.join([df_inflation, df_interest], how='outer')
df_combined = df_ftse.join(df_macro, how='outer')
df_combined = df_combined.interpolate(method='linear').dropna()

# Normalize the combined data (column 0 is the closing price)
scaler = MinMaxScaler()
df_scaled = scaler.fit_transform(df_combined)

# Function to create sequences for the model
def create_sequences(data, time_steps=30):
    X, y = [], []
    for i in range(len(data) - time_steps):
        X.append(data[i:i+time_steps])
        y.append(data[i+time_steps][0])
    return np.array(X), np.array(y)

time_steps = 30
X, y = create_sequences(df_scaled, time_steps)
train_size = int(len(X) * 0.8)
X_train, X_test = X[:train_size], X[train_size:]
y_train, y_test = y[:train_size], y[train_size:]

# Convert to PyTorch tensors
X_train_torch = torch.tensor(X_train, dtype=torch.float32).to(device)
X_test_torch = torch.tensor(X_test, dtype=torch.float32).to(device)
y_train_torch = torch.tensor(y_train, dtype=torch.float32).to(device)
y_test_torch = torch.tensor(y_test, dtype=torch.float32).to(device)
train_dataset = TensorDataset(X_train_torch, y_train_torch)
test_dataset = TensorDataset(X_test_torch, y_test_torch)
train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=32, shuffle=False)

# Define the Transformer model
class StockTransformer(nn.Module):
    def __init__(self, input_dim, hidden_dim=128, output_dim=1, num_heads=4, num_layers=2):
        super(StockTransformer, self).__init__()
        self.embedding = nn.Linear(input_dim, hidden_dim)
        encoder_layers = nn.TransformerEncoderLayer(d_model=hidden_dim, nhead=num_heads, dim_feedforward=256, batch_first=True)
        self.transformer = nn.TransformerEncoder(encoder_layers, num_layers=num_layers)
        self.fc = nn.Linear(hidden_dim, output_dim)

    def forward(self, x):
        x = self.embedding(x)
        x = self.transformer(x)
        x = x[:, -1, :]  # Keep the last time step
        return self.fc(x)

model = StockTransformer(input_dim=X.shape[2]).to(device)
criterion = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Training loop
epochs = 50
for epoch in range(epochs):
    model.train()
    train_loss = 0.0
    for X_batch, y_batch in train_loader:
        optimizer.zero_grad()
        y_pred = model(X_batch).squeeze(-1)  # drop only the output dimension
        loss = criterion(y_pred, y_batch)
        loss.backward()
        optimizer.step()
        train_loss += loss.item()

# Predictions for the test data (evaluation mode switches off dropout)
model.eval()
with torch.no_grad():
    y_pred = model(X_test_torch).detach().cpu().numpy()

# Predict future values: 7 * 12 = 84 steps, plotted below on month-end dates
future_steps = 7 * 12
predicted_future = []

# Start from the last test window and feed each prediction back into the window
input_sequence = X_test[-1].reshape(1, time_steps, X_test.shape[2])
for _ in range(future_steps):
    with torch.no_grad():
        predicted_value = model(torch.tensor(input_sequence, dtype=torch.float32).to(device)).detach().cpu().numpy()
    predicted_future.append(predicted_value[0])
    # Shift the window by one row; np.roll wraps the oldest row round to the end
    input_sequence = np.roll(input_sequence, -1, axis=1)
    # Overwrite the closing price (column 0) of that last row with the prediction
    input_sequence[0, -1, 0] = predicted_value.item()
predicted_future = np.array(predicted_future)

# Reverse the scaling: pad the other columns with zeros, then keep column 0
predicted_future_rescaled = scaler.inverse_transform(np.hstack((predicted_future, np.zeros((predicted_future.shape[0], df_combined.shape[1] - 1)))))
predicted_future_rescaled = predicted_future_rescaled[:, 0]

# Visualization
plt.figure(figsize=(10, 5))

# Plot Historical Data (2014-2023)
plt.plot(df_ftse.index, df_ftse['Close'], label="Historical Data", color='blue')

# Plot the predicted data on month-end dates ('ME' replaces the deprecated 'M' alias)
future_dates = pd.date_range(df_ftse.index[-1], periods=future_steps, freq='ME')
plt.plot(future_dates, predicted_future_rescaled, label="Predicted Data (2025-2030)", linestyle='--', color='darkred')

# Shade a fixed band of +/-50 index points around the forecast
plt.fill_between(future_dates, predicted_future_rescaled - 50, predicted_future_rescaled + 50, color='red', alpha=0.2)
plt.xlabel("Date", fontweight='bold')
plt.ylabel("NSEI Closing Price", fontweight='bold')
plt.title(" INDIA (^NSEI) Predicted Closing Prices (2025-2030)", fontweight='bold')
plt.legend()
plt.grid(True, linestyle='-', linewidth=0.5)
plt.tight_layout()
plt.show()
Fig 7.3 Prediction of the National Stock Exchange of India from 2023 to 2030.
As shown in Fig 7.3, the prediction for the National Stock Exchange of India (^NSEI) from 2023 to 2030 indicates consistent growth with some fluctuation in the predicted data (shown in red).
^NSEI is also predicted to reach a new peak at the beginning of 2026, followed by another in 2030.
The forecast suggests that the market will continue to rise steadily, with regular fluctuations expected, particularly after 2025. Overall, on average, ^NSEI is projected to remain stable with slight growth in the coming years.
Now printing the performance metrics for this prediction of ^NSEI.
9.4 Euronext 100 Index (^N100):
Lastly, after the application of the selected transformer model and its architecture for the prediction of the Euronext 100 Index (^N100), we obtain the prediction as in Fig 7.4.
Code
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from datetime import timedelta
import os

# Ensure the device is either CPU or CUDA (GPU)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Define file paths and load datasets (one CSV file per year, 2014-2023)
file_directory = r"C:\Users\Dell\OneDrive\Desktop\University\Course\9. M.Sc. Project\3. Dataset\archive"
file_names = [f"{year}_Global_Markets_Data.csv" for year in range(2014, 2024)]
file_paths = [os.path.join(file_directory, file) for file in file_names]
dfs = [pd.read_csv(file) for file in file_paths]
df_stock = pd.concat(dfs, ignore_index=True)

# Macroeconomic data files (this listing loads the UK series, as in the ^FTSE listing)
file_directory_macroeconomics = r"C:\Users\Dell\OneDrive\Desktop\University\Course\9. M.Sc. Project\3. Dataset\archive\Macroeconomic_factors"
df_gdp = pd.read_csv(os.path.join(file_directory_macroeconomics, "GDP_growth_uk.csv"))
df_inflation = pd.read_csv(os.path.join(file_directory_macroeconomics, "Inflation_rate_uk.csv"))
df_interest = pd.read_csv(os.path.join(file_directory_macroeconomics, "Interest_rate_uk.csv"))

# Convert Date columns to datetime
df_stock['Date'] = pd.to_datetime(df_stock['Date'])
df_gdp['Date'] = pd.to_datetime(df_gdp['Date'])
df_inflation['Date'] = pd.to_datetime(df_inflation['Date'])
df_interest['Date'] = pd.to_datetime(df_interest['Date'])

# Filter Euronext 100 (^N100) data (the variable name df_ftse is kept from the ^FTSE listing)
df_ftse = df_stock[df_stock['Ticker'] == '^N100'].copy()
df_ftse.set_index('Date', inplace=True)
df_ftse = df_ftse[['Close']].dropna()

# Set Date as index for macroeconomic data
df_gdp.set_index('Date', inplace=True)
df_inflation.set_index('Date', inplace=True)
df_interest.set_index('Date', inplace=True)

# Merge macroeconomic indicators into stock data
df_macro = df_gdp.join([df_inflation, df_interest], how='outer')
df_combined = df_ftse.join(df_macro, how='outer')
df_combined = df_combined.interpolate(method='linear').dropna()

# Normalize the combined data (column 0 is the closing price)
scaler = MinMaxScaler()
df_scaled = scaler.fit_transform(df_combined)

# Function to create sequences for the model
def create_sequences(data, time_steps=30):
    X, y = [], []
    for i in range(len(data) - time_steps):
        X.append(data[i:i+time_steps])
        y.append(data[i+time_steps][0])
    return np.array(X), np.array(y)

time_steps = 30
X, y = create_sequences(df_scaled, time_steps)
train_size = int(len(X) * 0.8)
X_train, X_test = X[:train_size], X[train_size:]
y_train, y_test = y[:train_size], y[train_size:]

# Convert to PyTorch tensors
X_train_torch = torch.tensor(X_train, dtype=torch.float32).to(device)
X_test_torch = torch.tensor(X_test, dtype=torch.float32).to(device)
y_train_torch = torch.tensor(y_train, dtype=torch.float32).to(device)
y_test_torch = torch.tensor(y_test, dtype=torch.float32).to(device)
train_dataset = TensorDataset(X_train_torch, y_train_torch)
test_dataset = TensorDataset(X_test_torch, y_test_torch)
train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=32, shuffle=False)

# Define the Transformer model
class StockTransformer(nn.Module):
    def __init__(self, input_dim, hidden_dim=128, output_dim=1, num_heads=4, num_layers=2):
        super(StockTransformer, self).__init__()
        self.embedding = nn.Linear(input_dim, hidden_dim)
        encoder_layers = nn.TransformerEncoderLayer(d_model=hidden_dim, nhead=num_heads, dim_feedforward=256, batch_first=True)
        self.transformer = nn.TransformerEncoder(encoder_layers, num_layers=num_layers)
        self.fc = nn.Linear(hidden_dim, output_dim)

    def forward(self, x):
        x = self.embedding(x)
        x = self.transformer(x)
        x = x[:, -1, :]  # Keep the last time step
        return self.fc(x)

model = StockTransformer(input_dim=X.shape[2]).to(device)
criterion = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Training loop
epochs = 50
for epoch in range(epochs):
    model.train()
    train_loss = 0.0
    for X_batch, y_batch in train_loader:
        optimizer.zero_grad()
        y_pred = model(X_batch).squeeze(-1)  # drop only the output dimension
        loss = criterion(y_pred, y_batch)
        loss.backward()
        optimizer.step()
        train_loss += loss.item()

# Predictions for the test data (evaluation mode switches off dropout)
model.eval()
with torch.no_grad():
    y_pred = model(X_test_torch).detach().cpu().numpy()

# Predict future values: 7 * 12 = 84 steps, plotted below on month-end dates
future_steps = 7 * 12
predicted_future = []

# Start from the last test window and feed each prediction back into the window
input_sequence = X_test[-1].reshape(1, time_steps, X_test.shape[2])
for _ in range(future_steps):
    with torch.no_grad():
        predicted_value = model(torch.tensor(input_sequence, dtype=torch.float32).to(device)).detach().cpu().numpy()
    predicted_future.append(predicted_value[0])
    # Shift the window by one row; np.roll wraps the oldest row round to the end
    input_sequence = np.roll(input_sequence, -1, axis=1)
    # Overwrite the closing price (column 0) of that last row with the prediction
    input_sequence[0, -1, 0] = predicted_value.item()
predicted_future = np.array(predicted_future)

# Reverse the scaling: pad the other columns with zeros, then keep column 0
predicted_future_rescaled = scaler.inverse_transform(np.hstack((predicted_future, np.zeros((predicted_future.shape[0], df_combined.shape[1] - 1)))))
predicted_future_rescaled = predicted_future_rescaled[:, 0]

# Visualization
plt.figure(figsize=(10, 5))

# Plot Historical Data (2014-2023)
plt.plot(df_ftse.index, df_ftse['Close'], label="Historical Data", color='blue')

# Plot the predicted data on month-end dates ('ME' replaces the deprecated 'M' alias)
future_dates = pd.date_range(df_ftse.index[-1], periods=future_steps, freq='ME')
plt.plot(future_dates, predicted_future_rescaled, label="Predicted Data (2025-2030)", linestyle='--', color='darkred')

# Shade a fixed band of +/-50 index points around the forecast
plt.fill_between(future_dates, predicted_future_rescaled - 50, predicted_future_rescaled + 50, color='red', alpha=0.2)
plt.xlabel("Date", fontweight='bold')
plt.ylabel(" Closing Price", fontweight='bold')
plt.title("European 100 Index (^N100) Predicted Closing Prices (2025-2030)", fontweight='bold')
plt.legend()
plt.grid(True, linestyle='-', linewidth=0.5)
plt.tight_layout()
plt.show()
Fig 7.4 Prediction of the Euronext 100 Index from 2023 to 2030.
As shown in Fig 7.4, the prediction for the Euronext 100 Index (^N100) from 2023 to 2030 fluctuates consistently within the same range. The forecast suggests steady movement on average, with some fluctuations expected, particularly around 2025.
From the predicted visualization, we can anticipate continued volatility in the European stock market, which is not likely to see any significant rise until 2030.
The plot also shades a band of ±50 index points around the predicted values. This band is a fixed margin chosen for the visualization rather than a statistically derived confidence interval, and it indicates that the market could vary around the predicted path in the coming years.
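A band that reflects the model's observed errors could be derived from the residuals on the test set instead of a fixed margin. The sketch below is one simple way to do this, assuming the residuals are representative of future errors; the function name `empirical_band` is hypothetical and all numbers are synthetic, not taken from the ^N100 results.

```python
import numpy as np

def empirical_band(y_true, y_pred, forecast, coverage=0.9):
    """Shift a forecast by the lower and upper quantiles of the test residuals."""
    residuals = (np.asarray(y_true, dtype=float).reshape(-1)
                 - np.asarray(y_pred, dtype=float).reshape(-1))
    alpha = (1.0 - coverage) / 2.0
    lo_q, hi_q = np.quantile(residuals, [alpha, 1.0 - alpha])
    forecast = np.asarray(forecast, dtype=float).reshape(-1)
    return forecast + lo_q, forecast + hi_q

# Synthetic example: a random-walk "index" and a noisy stand-in for model output
rng = np.random.default_rng(0)
actual = 7000.0 + np.cumsum(rng.normal(0.0, 10.0, size=500))
fitted = actual + rng.normal(0.0, 40.0, size=500)
future = np.full(12, actual[-1])  # flat placeholder forecast
lower, upper = empirical_band(actual, fitted, future, coverage=0.9)
```

The returned arrays can be passed to `plt.fill_between` in place of the fixed offsets. The band describes one-step-ahead errors only; in a recursive multi-year forecast the uncertainty grows with the horizon, so a constant-width band is likely to understate it.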
Now printing the performance metrics for this prediction of ^N100.
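The metrics referred to here (MSE, RMSE, R², model accuracy and directional accuracy) can be computed along the following lines. This is a sketch rather than the code used in the study: in particular, it assumes that "model accuracy" means 100 minus the mean absolute percentage error, and that directional accuracy compares the sign of consecutive changes in the actual and predicted series. Both definitions, the function name `regression_metrics` and the sample values are assumptions.

```python
import math


def regression_metrics(y_true, y_pred):
    """Compute MSE, RMSE, R^2, an accuracy figure and directional accuracy."""
    n = len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    mse = ss_res / n
    rmse = math.sqrt(mse)

    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot

    # ASSUMPTION: "model accuracy" is taken as 100 - MAPE (in percent).
    mape = 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / n
    accuracy = 100.0 - mape

    # ASSUMPTION: a "hit" is a step where the actual and predicted
    # series move in the same direction (both up or both down).
    hits = 0
    for i in range(1, n):
        actual_move = y_true[i] - y_true[i - 1]
        predicted_move = y_pred[i] - y_pred[i - 1]
        if actual_move * predicted_move > 0:
            hits += 1
    directional = 100.0 * hits / (n - 1)

    return {"mse": mse, "rmse": rmse, "r2": r2,
            "accuracy": accuracy, "directional": directional}


# Made-up prices, for illustration only.
y_true = [100.0, 102.0, 101.0, 105.0, 107.0]
y_pred = [101.0, 101.0, 103.0, 104.0, 108.0]
metrics = regression_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With these definitions a series can score a high R² and accuracy while its directional accuracy stays near 50%, because the first two measure closeness of price levels and the last measures only the sign of each change.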
10 Result:
This section presents a comparison of model performance across various stock markets, using a transformer-based model trained on historical data from 2008 to 2014 for each index.
10.1 Models comparison for each index
Regarding the reliability of the predictions, the high Model Accuracy (95% or higher) across all markets, coupled with R² values close to or above 0.8, suggests that the model is relatively robust and explains a significant portion of the variance in stock prices. However, the Directional Accuracy of around 48% is close to what guessing between up and down moves at random would achieve, so while the model tracks the level of prices well, it struggles to predict the direction of price movements, which is a common limitation of financial prediction models. The predictions are therefore generally reliable in terms of fitting the actual data, but investors and analysts should be cautious about the directional aspect. Overall, the model can be considered strong for trend estimation and performance evaluation, but it needs further refinement before it can predict precise directional shifts.
Fig 8.1 Comparison of the performance metric for each index’s prediction.
The performance of the transformer-based model varies across different stock markets, as it was trained on each market’s own historical data from 2008 to 2014. The European market performed the best in terms of R², Model Accuracy, and Directional Accuracy, likely due to more stable trends during the training period. The London and New York markets performed well but had lower R² and Model Accuracy compared to Europe, likely because of more volatile data. The Indian market also performed slightly worse than Europe but still had similar accuracy. These differences in performance show that the model’s effectiveness can be affected by the quality and consistency of the data used for training each market.
We therefore observe that when the same model architecture is trained on different datasets with similar long-run patterns, model performance and predictability do not differ significantly.
10.2 Data analysis of predicted visualization
The analysis of stock market predictions for various global indices and commodities for the period 2025-2030 shows a clear view of expected trends. The models presented predict the future performance of Crude Oil Futures, Gold Futures, European 100 Index, India NSEI, New York Stock Exchange and the London Stock Exchange.
Each of these markets is projected based on historical data and future predictions, with significant trends observed across all selected indices.
The Crude Oil Futures prediction shows a general increase in price, peaking around 2025 before showing some fluctuations through 2030. While growth is expected, it is accompanied by a certain degree of unpredictability in the forecast, particularly after 2025.
Similarly, the Gold Futures market shows a sharp decline after reaching a peak around 2020, followed by a steady drop, with the predicted prices stabilising at lower levels by 2025-2030. According to our selected model, this also points towards a possible gold market crash between 2025 and 2030.
The European 100 Index is expected to maintain growth until 2025, after which the prices will stabilise with less intense fluctuations through 2030.
The India NSEI, on the other hand, shows a strong upward trend, indicating that the Indian stock market will likely experience significant growth over the next decade. The New York Stock Exchange (NYSE) indicates a similar upward trend until 2025, with increased volatility in the latter part of the forecast.
The London Stock Exchange (FTSE 100) prediction shows a steady growth trajectory until 2025, with more fluctuation expected afterward. This suggests that while the FTSE will continue to rise, it will experience some unpredictability towards 2030.
The Transformer-based models used for forecasting provide useful insights into these markets, offering a reasonable degree of confidence in predicting future trends. These predictions are likely to be valuable for strategic financial planning and decision-making.
11 Limitation and discussion:
The model provides useful insights into future market trends, but it has some limitations.
It primarily relies on historical data, which assumes that past market behaviour will continue in a similar way. However, financial markets can be affected by unpredictable events, such as political instability or global crises. This gap must be addressed before high-accuracy forecasting can be achieved.
Despite these challenges, the model’s results still offer valuable insights. The R-squared values and other metrics are promising. To improve long-term forecasting, the model could be enhanced by integrating real-time data, exploring additional algorithms and adapting more quickly to shifts in market conditions.
Market sentiment analysis using deep learning models has become an important approach to improving forecasting accuracy in financial markets.
These models assess the overall mood or sentiment of market participants, which often influences market movements, by analyzing social media, news articles, financial reports and other text sources. Doing so requires large-scale database management and large model architectures. Sentiment analysis models, such as Recurrent Neural Networks (RNNs), Long Short-Term Memory (LSTM) networks and Transformer-based models, are capable of processing large volumes of unstructured text data and extracting meaningful insights (Heydarian, 2024). In addition, Ali (2023) argues that enhancing model transparency through Explainable AI (XAI) techniques would help improve trust in these predictions.
These models can identify patterns and trends related to investor sentiment, such as optimism or fear, which can then be used to enhance traditional forecasting models. Incorporating sentiment data allows for a more comprehensive view of market dynamics, as it captures the emotional and psychological factors that often drive market decisions, potentially leading to more accurate and timely predictions.
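To make the idea of incorporating sentiment data concrete, the sketch below scores news headlines with a tiny word list and attaches the daily average score to each price row as an extra feature. It is a deliberately simplified illustration: the word lists, function names, dates, prices and headlines are all hypothetical, and a real system would replace the word-count scorer with one of the deep learning sentiment models described above.

```python
# Hypothetical word lists, for illustration only.
POSITIVE = {"gain", "growth", "rally", "optimism"}
NEGATIVE = {"loss", "crash", "fear", "decline"}


def headline_score(headline):
    """Score a headline in [-1, 1] from counts of positive and negative words."""
    words = headline.lower().split()
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total


def add_sentiment_feature(price_rows, headlines_by_date):
    """Turn (date, close) rows into (date, close, sentiment) rows.
    Days without headlines get a neutral score of 0.0."""
    enriched = []
    for date, close in price_rows:
        scores = [headline_score(h) for h in headlines_by_date.get(date, [])]
        sentiment = sum(scores) / len(scores) if scores else 0.0
        enriched.append((date, close, sentiment))
    return enriched


# Made-up prices and headlines.
prices = [("2024-01-02", 7700.0), ("2024-01-03", 7650.0), ("2024-01-04", 7680.0)]
headlines = {
    "2024-01-02": ["Markets rally on growth optimism"],
    "2024-01-03": ["Investors fear sharp decline", "Tech shares gain"],
}
enriched = add_sentiment_feature(prices, headlines)
for row in enriched:
    print(row)
```

The resulting third column could then be scaled and fed into the sequence generation step alongside the price and macroeconomic features, so that the forecasting model sees sentiment as one more input dimension.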
12 Conclusion and Future works:
This research highlights the potential of neural network models and data analytics, particularly Transformer-based architectures, in predicting global stock market trends. The Transformer model achieved strong results in terms of accuracy and variance explanation, surpassing other models like LSTM by effectively using historical data and key macroeconomic indicators. Its success in forecasting movements in stock indices such as FTSE 100 and NYSE and in commodities like Gold and Crude Oil demonstrates the considerable scope for ML in financial prediction.
Even when each index is trained on its own historical dataset under the same model architecture and parameters, prediction accuracy differs little across indices according to the evaluation metrics. However, there is still room to improve the model’s directional accuracy.
The analysis of stock market predictions for global indices offers valuable insights for making informed investment decisions. Crude Oil Futures are expected to rise with some fluctuations, while Gold Futures may decline and stabilize at lower levels. The European 100 Index is predicted to grow steadily until 2025, followed by stability, while the India NSEI is expected to see significant growth. The New York Stock Exchange and London Stock Exchange show positive trends, with the latter experiencing increased volatility towards 2030.
Despite these promising results, there are still challenges and limitations. The models heavily rely on historical data, which may not always reflect future market conditions, particularly during unforeseen economic events or geopolitical crises. Additionally, the models cannot fully capture real-time market sentiment or sudden shocks.
Future research could focus on incorporating more dynamic datasets, such as market sentiment from news and social media and real-time economic indicators. Enhancing model transparency through Explainable AI (XAI) techniques would also help improve trust in these predictions. Combining hybrid models, real-time data integration, and sentiment analysis will be vital for further improving prediction accuracy and reliability in future studies.
[8] Chicco, (2021). The Coefficient of Determination R-Squared Is More Informative Than SMAPE, MAE, MAPE, MSE and RMSE in Regression Analysis Evaluation. https://doi.org/10.7717/peerj-cs.623.
[9] Chen and Hao, (2017). A Feature Weighted Support Vector Machine and K-Nearest Neighbor Algorithm for Stock Market Indices Prediction. https://doi.org/10.1016/j.eswa.2017.02.044.
[10] Chang, V., Xu, Q.A., Chidozie, A. & Wang, H. (2024). Predicting Economic Trends and Stock Market Prices with Deep Learning and Advanced Machine Learning Techniques. https://www.mdpi.com/2079-9292/13/17/3396
[11] European Commission. Directorate General for Justice and Consumers, (2018). The GDPR: new opportunities, new obligations: what every business needs to know about the EU’s General Data Protection Regulation. LU: Publications Office. https://doi.org/10.2838/6725.
[12] Fernando, Antonette. 2017. Macroeconomic Impact on Stock Market Returns and Volatility: Evidence from Sri Lanka. https://doi.org/10.2139/ssrn.3238532.
[13] Fischer, Thomas, and Christopher Krauss, (2018). Deep Learning with Long Short-Term Memory Networks for Financial Market Predictions. https://doi.org/10.1016/j.ejor.2017.11.054.
[15] Goldberg, Lisa R., and Saad Mouti, (2022). Sustainable Investing and the Cross-Section of Returns and Maximum Drawdown. https://doi.org/10.1016/j.jfds.2022.11.002.
[16] Guresen, Erkam, Gulgun Kayakutlu, and Tugrul U. Daim, (2011). Using Artificial Neural Network Models in Stock Market Index Prediction. https://doi.org/10.1016/j.eswa.2011.02.068.
[17] Heydarian, Peyman, Albert Bifet, and Shaen Corbet, (2024). Understanding Market Sentiment Analysis: A Survey. https://doi.org/10.1111/joes.12645.
[19] Ican, Özgür, and Taha Bugra Çelik, (2017). Stock Market Prediction Performance of Neural Networks: A Literature Review. https://doi.org/10.5539/ijef.v9n11p100.
[20] Kabir, Md R., Dipayan Bhadra, Moinul Ridoy, and Mariofanna Milanova, (2025). LSTMTransformer-Based Robust Hybrid Deep Learning Model for Financial Time Series Forecasting.
[22] Kumar, Satish, Amar Rao, and Monika Dhochak, (2025). Hybrid ML Models for Volatility Prediction in Financial Risk Management. https://doi.org/10.1016/j.iref.2025.103915.
[26] Nayak, S. C., B. B. Misra, and H. S. Behera, (2012). Evaluation of Normalization Methods on Neuro-Genetic Models for Stock Index Forecasting. https://doi.org/10.1109/wict.2012.6409147.
[27] Taherdoost, Hamed, and Mitra Madanchian, (2023). Artificial Intelligence and Sentiment Analysis: A Review in Competitive Research. https://doi.org/10.3390/computers12020037.
[29] Xu, Dehe, Qi Zhang, Yan Ding, and De Zhang, (2021). Application of a Hybrid ARIMA-LSTM Model Based on the SPEI for Drought Forecasting. https://doi.org/10.1007/s11356-021-15325-z.
[29] Moghaddam, A.H., Hedayati, M., & Esfandyari, M. (2024). Stock market index prediction using artificial neural networks.
[31] Shilpa, (2021). Explainable stock prices prediction from financial news articles using sentiment analysis. Artificial Intelligence: Data Mining and Machine Learning. https://peerj.com/articles/cs-340/
Citation
BibTeX citation:
@online{thapa2025,
author = {Thapa, Rabin},
title = {Integrative {Application} of {Neural} {Networks} for
{Predicting} {Global} {Stock} {Market} {Trends:} {A} {Data}
{Science} {Investigation} {Using} {Historical} {Data}},
date = {2025-02-28},
url = {https://www.researchgate.net/profile/Rabin-Thapa-8},
langid = {en}
}
For attribution, please cite this work as:
Thapa, Rabin. 2025. “Integrative Application of Neural Networks
for Predicting Global Stock Market Trends: A Data Science Investigation
Using Historical Data.” February 28, 2025. https://www.researchgate.net/profile/Rabin-Thapa-8.