## This post combines an LSTM network, technical indicators, and PCA to model and forecast an example of a financial time series: the S&P 500 over a medium-term period. The model yields a very low MSE. However, several other considerations should also be addressed when working with the S&P 500:

# Market Dynamics: The S&P 500 is influenced by various factors, including economic conditions, geopolitical events, and market sentiment. The performance of the proposed model should be considered in the context of the complexity and unpredictability of financial markets.

# Volatility of the S&P 500: The S&P 500 can exhibit periods of both high and low volatility. If a model performs well during a less volatile period, it is essential to assess how it generalizes to more volatile periods. Financial markets can experience sudden changes, and models that work well in one regime may face challenges in another.

# Benchmarking Against Baseline Models: Compare the performance of the LSTM+PCA+technical-indicators model against baseline models or simple forecasting approaches. Understanding how the model performs relative to simpler methods helps gauge its effectiveness (a minimal sketch follows this list).

# Evaluation Period: Given that only a medium-term data set is used here, consider the temporal aspects of financial data and assess the model's performance across different and longer time periods.

# Robustness Testing: Consider conducting robustness testing by evaluating the model on different subsets of the data or incorporating rolling validation periods, as illustrated in the sketch below. This can provide insight into the stability of the model's performance.

# Risk Management Considerations: In financial modeling, risk management is crucial. While evaluating predictive accuracy is important, also consider the implications of model errors on decision-making and potential financial risks.
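## As a concrete illustration of the benchmarking and robustness points above, here is a minimal, hedged sketch. It compares a model's predicted returns against a naive zero-return baseline (a surprisingly strong benchmark for daily returns in MSE terms) and shows how scikit-learn's TimeSeriesSplit can replace a shuffled split for rolling validation. The arrays here are illustrative stand-ins, not the model's actual output.

import numpy as np
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import TimeSeriesSplit

# Illustrative stand-ins for realized daily returns and model predictions
rng = np.random.default_rng(0)
y_true = rng.normal(0, 0.01, 500)
y_hat = rng.normal(0, 0.01, 500)

# Naive baseline: predict a zero return every day. For daily equity
# returns this is hard to beat in MSE terms, so a model should at
# least match it before a "very low MSE" means much.
mse_model = mean_squared_error(y_true, y_hat)
mse_naive = mean_squared_error(y_true, np.zeros_like(y_true))
print(f"Model MSE: {mse_model:.6e}  Naive-zero MSE: {mse_naive:.6e}")

# Rolling-origin validation: every fold trains on the past and tests
# on the future, unlike a shuffled train/test split
tscv = TimeSeriesSplit(n_splits=5)
for fold, (train_idx, test_idx) in enumerate(tscv.split(y_true)):
    print(f"Fold {fold}: train ends at {train_idx[-1]}, test {test_idx[0]}..{test_idx[-1]}")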

## There is a vast and diverse literature on the theory and application of deep learning algorithms to financial time series.
## Here are some ways machine learning algorithms are currently applied to model and forecast financial time series:

## Long Short-Term Memory (LSTM) networks are a type of recurrent neural network that can be used for modelling and forecasting time series. LSTM networks are capable of learning long-term dependencies and can model complex nonlinear relationships between variables. There are many types of LSTM models, each suited to a specific kind of time series forecasting problem. Many online resources provide detailed information on how to use LSTM networks for financial time series forecasting in Python; one such resource is Machine Learning Mastery.

## The Machine Learning Mastery web site provides tutorials on time series prediction with LSTM recurrent neural networks in Python with Keras. Some tutorials provide standalone examples of each model on each type of time series problem, as templates that you can copy and adapt for your specific forecasting problem. The tutorials cover the following types of LSTM models and their structure (a minimal univariate example follows the list):

## Univariate LSTM Models
## Multivariate LSTM Models
## Multi-Step LSTM Models
## Multivariate Multi-Step LSTM Models
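## To make the first category concrete, here is a minimal univariate LSTM sketch in Keras: it learns to predict the next value of a toy sequence from the previous three. This is a hedged illustration of the pattern, not the model used later in this post.

import numpy as np
from keras.models import Sequential
from keras.layers import Dense, LSTM

# Toy univariate series: predict the next value from the previous 3
series = np.arange(100, dtype=float)
n_steps = 3
X = np.array([series[i:i + n_steps] for i in range(len(series) - n_steps)])
y = series[n_steps:]
X = X.reshape((X.shape[0], n_steps, 1))   # (samples, timesteps, features)

model = Sequential()
model.add(LSTM(32, input_shape=(n_steps, 1)))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(X, y, epochs=50, verbose=0)

# Predict the value following [97, 98, 99]
print(model.predict(np.array([97., 98., 99.]).reshape(1, n_steps, 1)))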

## Here are just a few among the numerous works on deep learning applied to financial time series (see also the references within them). These works cover a broad range of applications to financial data. However, convincing evidence that deep learning models improve on much simpler ones is still limited.

## 1. S. Hochreiter and J. Schmidhuber. Long Short-Term Memory. Neural Computation, 9(8):1735–1780, 11 1997.
## This paper introduces a novel, efficient, gradient-based method called long short-term memory (LSTM), together with an appropriate gradient-based learning algorithm, capable of learning so-called 'long-term dependencies'. LSTM can learn to bridge minimal time lags in excess of 1,000 discrete-time steps by enforcing constant error flow through constant error carousels within special units.

## 2. C. Olah. Understanding LSTM Networks, 2015.
## This blog post explains the concepts behind LSTM networks.

## 3. A deep learning framework for financial time series using stacked autoencoders and long-short term memory (LSTM).
## PLoS One. By Wei Bao, Jun Yue, Yulei Rao.
## This study presents a novel deep learning framework in which wavelet transforms (WT), stacked autoencoders (SAEs), and long-short term memory (LSTM) are combined for stock price forecasting.
  
## 4. Deep Learning Model for Financial Time Series Prediction.
## IEEE. By Omnia Kelany, Sherin Aly, Mohamed A. Ismail.
## This paper employs the Long Short-Term Memory (LSTM) deep learning approach to predict future prices for low-, medium-, and high-risk stocks.
  
## 5. Using LSTM in Stock Prediction and Quantitative Trading.
## By Zhichao Zou, Zihao Qu. Stanford CS230.
## This study examines predictive power across different financial time series and builds three deep learning models as well as one traditional time series model: 1) a time series model (ARIMA); 2) an RNN with LSTM (LSTM); 3) an RNN with stacked LSTM (Stacked-LSTM); 4) an RNN with LSTM + attention (Attention-LSTM).
  
## 6. Deep learning models for price forecasting of financial time series: A review of recent advancements: 2020-2022.
## Cheng Zhang, Nilam Nur Amir Sjarif, Roslina Ibrahim.
## First published: 28 September 2023.
## This review highlights the advantages of deep learning models over traditional statistical and machine learning models. It delves deeply into deep learning-based forecasting models, presenting information on model architectures, practical applications, and their respective advantages and disadvantages. In particular, detailed information is provided on advanced models for price forecasting, such as Transformers, generative adversarial networks (GANs), graph neural networks (GNNs), and deep quantum neural networks (DQNNs). The review also includes potential directions for future research, such as examining the effectiveness of deep learning models with complex structures for price forecasting, extending from point prediction to interval prediction using deep learning models, scrutinizing the reliability and validity of decomposition ensembles, and exploring the influence of data volume on model performance.
  
## 7. Financial Time Series Forecasting with the Deep Learning Ensemble Model.
## by Kaijian He, Qian Yang, Lei Ji, Jingcheng Pan, and Yingchao Zou.
  
## 8. Long Short-Term Memory Neural Network for Financial Time Series.
## by Carmina Fjellström. arXiv:2201.08218 [q-fin].
## The paper presents a different approach to predicting stock price movements using an ensemble of independent and parallel LSTM neural networks. The results suggest that the LSTM ensemble model combined with median-based binary classification outperforms a randomly chosen portfolio, a portfolio containing all stocks in Stockholm's OMX30 index, and the other models considered, in terms of average daily returns and cumulative returns over time. The study also shows that the LSTM portfolio exhibits lower volatility, leading to higher risk-return ratios.

## 9. Financial time series forecasting model based on CEEMDAN and LSTM.
## J. Cao, Z. Li, and J. Li. Physica A: Statistical Mechanics and its Applications, 519:127-139, 2019.
## This paper combines complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) and a long short-term memory (LSTM) network. The model is designed to forecast non-stationary and nonlinear financial series, which are difficult to predict using existing models. The authors use a sliding time window to decompose the original sequence into a cluster of equal-length sequences. After CEEMDAN decomposition, wavelet threshold denoising and reconstruction, the denoised signal is obtained. Using the denoised signal instead of the original signal as the input of the LSTM network yields a more accurate final prediction. The proposed CEEMDAN-LSTM model has better prediction performance than the standard LSTM model and common prediction methods combined with empirical mode decomposition.
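## The decompose-denoise-predict pipeline above can be sketched with the PyEMD package (pip package EMD-signal). This is a hedged illustration of the idea, not the authors' code; the signal, the choice to drop the first IMF, and the window settings are all assumptions standing in for the paper's wavelet thresholding step.

import numpy as np
from PyEMD import CEEMDAN

# Illustrative signal standing in for a price series
t = np.linspace(0, 1, 512)
signal = np.sin(8 * np.pi * t) + 0.3 * np.random.randn(512)

# Decompose into intrinsic mode functions (IMFs)
ceemdan = CEEMDAN()
imfs = ceemdan(signal)          # shape: (n_imfs, len(signal))

# Crude denoising (an assumption, in place of the paper's wavelet
# thresholding): drop the highest-frequency IMF and reconstruct
denoised = imfs[1:].sum(axis=0)

# 'denoised' would then be windowed and fed to an LSTM as in this post
print(imfs.shape, denoised.shape)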

## 10. T. Fischer and C. Krauss. Deep learning with long short-term memory networks for financial market predictions. European Journal of Operational Research, 270(2):654-669, 2018.
## In this paper the authors deploy LSTM networks to predict out-of-sample directional movements for the constituent stocks of the S&P 500 from 1992 to 2015. With daily returns of 0.46 percent and a Sharpe ratio of 5.8 prior to transaction costs, the LSTM outperforms memory-free classification methods such as random forests, standard deep networks, and logistic regression.

## 11. A. Tsantekidis, N. Passalis, A. Tefas, J. Kanniainen, M. Gabbouj, and A. Iosifidis. Using deep learning to detect price change indications in financial markets. In 2017 25th European Signal Processing Conference (EUSIPCO), pages 2511-2515, 2017.
## The proposed method aims at predicting future price movements from large-scale, high-frequency limit order book data.

## 12. Jump detection in financial time series using machine learning algorithms. Soft Computing, 24(3):1789-1801, 2020.
## J. F. A. Yeung, Z.-k. Wei, K. Y. Chan, H. Y. Lau, and K.-F. C. Yiu.
## This paper proposes a new hybrid method based on machine learning algorithms to detect short-term market instability (a jump up or down) rather than making a directional prediction. The model combines a long short-term memory (LSTM) neural network with a machine learning pattern recognition model. The LSTM model is applied for time series prediction, predicting the next data point; the historical sequence of prediction errors then serves as the input of the jump detection module. The machine learning pattern recognition model is applied for jump detection. The proposed hybrid model is effective at detecting jumps in terms of accuracy compared with classical jump detection methods.
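## A minimal sketch of the error-based idea, assuming a simple threshold rule in place of the paper's pattern recognition model: flag a jump whenever the one-step prediction error exceeds a multiple of its rolling standard deviation. The error series and thresholds here are illustrative assumptions.

import numpy as np
import pandas as pd

# Illustrative one-step-ahead prediction errors (in practice these would
# be actual-minus-LSTM-forecast residuals)
rng = np.random.default_rng(1)
errors = pd.Series(rng.normal(0, 0.01, 500))
errors.iloc[250] = 0.08   # inject an artificial jump

# Flag a jump wherever |error| exceeds k rolling standard deviations
k, window = 4, 50
rolling_std = errors.rolling(window).std()
jumps = errors.abs() > k * rolling_std
print("Detected jump indices:", errors[jumps].index.tolist())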

## 13. F. Chollet et al. Keras. https://keras.io, 2015.

## 14. L. Di Persio and O. Honchar. Artificial neural networks architectures for stock price prediction: Comparisons and applications.
## International Journal of Circuits, Systems and Signal Processing, 10:403-413, 2016.
## This paper considers the Multi-Layer Perceptron (MLP), Convolutional Neural Network (CNN), and Long Short-Term Memory (LSTM) recurrent neural network techniques. The authors show that neural networks can predict financial time series movements even when trained only on plain time series data, and they propose further ways to improve the results.

## 15. J. B. Heaton, N. G. Polson, and J. H. Witte. Deep learning in finance. CoRR, abs/1602.06561, 2016.
## This paper explores the use of deep learning hierarchical models for problems in financial prediction and classification. The authors propose a general framework of optimal investment and a collection of trading ideas, which combine probability and statistical theory with, potentially, machine learning techniques such as regression, classification, and reinforcement learning. They show that applying deep learning methods to financial prediction problems can produce more useful results than standard methods in finance. In particular, deep learning can detect and exploit interactions in the data that are, at least currently, invisible to any existing financial economic theory.

## 16. Learning to Forget: Continual Prediction with LSTM.
## F. A. Gers, J. Schmidhuber, and F. Cummins. Neural Computation, 12(10):2451-2471, 2000.
## This paper identifies a weakness of LSTM networks processing continual input streams without explicitly marked sequence ends: without resets, the internal state values may grow indefinitely and eventually cause the network to break down. The remedy is an adaptive 'forget gate' that enables an LSTM cell to learn to reset itself at appropriate times, thus releasing internal resources. The authors review an illustrative benchmark problem on which standard LSTM outperforms other RNN algorithms; all algorithms (including LSTM) fail on a continual version of that problem, while LSTM with forget gates solves it elegantly.
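## In equation form (standard LSTM notation, not reproduced from the paper), the forget gate f_t scales the previous cell state, letting the cell discard stale information:

f_t = \sigma(W_f x_t + U_f h_{t-1} + b_f)            % forget gate
i_t = \sigma(W_i x_t + U_i h_{t-1} + b_i)            % input gate
\tilde{c}_t = \tanh(W_c x_t + U_c h_{t-1} + b_c)     % candidate state
c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t      % cell update

## With f_t fixed at 1 (the original LSTM), c_t can only accumulate, which is precisely the unbounded internal state the paper describes; a learned f_t < 1 lets the cell reset itself.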

## Some web sites:
## 1. machinelearningmastery.com
## 2. arxiv.org
## 3. hindawi.com
# Import necessary packages
import numpy as np
import pandas as pd
import yfinance as yf
import datetime as dt
import pandas_ta as ta
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error
from keras.models import Sequential
from keras.layers import Dense, LSTM, Dropout
from sklearn.model_selection import train_test_split
from sklearn.decomposition import PCA
import matplotlib.pyplot as plt
import warnings

# Suppress pandas user warnings
warnings.filterwarnings("ignore", category=UserWarning, module="pandas")

# Load financial data
ticker = '^GSPC'
start_date = dt.date.today() - dt.timedelta(days=365*3)
end_date = dt.date.today() + dt.timedelta(days=1)
df = yf.download(ticker, start_date, end_date)

[*********************100%%**********************]  1 of 1 completed
# Calculate daily returns
df['Returns'] = df['Adj Close'].pct_change()

# Add the technical indicators (pandas_ta expects `length`, not TA-Lib's `timeperiod`)
df['RSI'] = ta.rsi(df['Close'], length=14)

df['ATR'] = ta.atr(df['High'], df['Low'], df['Close'], length=14)
# Calculate the stochastic oscillator (%K/%D with the standard 14-3-3 settings)
stoch_results = ta.stoch(df['High'], df['Low'], df['Close'], k=14, d=3, smooth_k=3)
df['%K'] = stoch_results['STOCHk_14_3_3']
df['%D'] = stoch_results['STOCHd_14_3_3']

# Compute MACD and Bollinger Bands (their columns are assigned manually below)
df_macd = df.ta.macd(fast=12, slow=26, signal=9, append=False)
df_bollinger = df.ta.bbands(length=20, append=False)
# Add the MACD line to df
df['MACD'] = df_macd['MACD_12_26_9']

# Add the MACD signal line to df
df['MACD_signal'] = df_macd['MACDs_12_26_9']

# Add the MACD histogram to df
df['MACD_histogram'] = df_macd['MACDh_12_26_9']

# Add the upper Bollinger Band to df
df['Bollinger_upper'] = df_bollinger['BBU_20_2.0']

# Add the middle Bollinger Band to df
df['Bollinger_middle'] = df_bollinger['BBM_20_2.0']

# Add the lower Bollinger Band to df
df['Bollinger_lower'] = df_bollinger['BBL_20_2.0']
# Calculate On-Balance Volume
df['OBV'] = ta.obv(df['Close'], df['Volume'])
# Calculate the Relative Volatility Index (pandas_ta's `rvi`)
df['RVI'] = ta.rvi(df['Close'], length=10)
# Calculate Williams %R
df['WR'] = ta.willr(df['High'], df['Low'], df['Close'], length=14)
# Assuming df contains 'High', 'Low', 'Close', and 'Volume' columns

# Calculate the Accumulation/Distribution Line (ADL)
df_ad = df.ta.ad(high='High', low='Low', close='Close', volume='Volume', append=False)

# Calculate the Chaikin Oscillator: EMA(3) of the ADL minus EMA(10) of the ADL
df['chaikin'] = ta.ema(df_ad, length=3) - ta.ema(df_ad, length=10)
# Calculate additional trend and momentum indicators

df_adx = df.ta.adx(length=14, append=False)
df_cci = df.ta.cci(length=20, c=0.015, append=False)
df_psar = df.ta.psar(af=0.02, max_af=0.2, append=False)
df_trix = df.ta.trix(length=14, append=False)
df_vortex = df.ta.vortex(length=14, append=False)
df['CCI'] = df_cci
df['VTX'] = df_vortex['VTXM_14']
df['ADX'] = df_adx['ADX_14']
df['PSAR'] = df_psar['PSARr_0.02_0.2']  # PSARr is the reversal flag (1 on reversal bars)
df['TRIX'] = df_trix['TRIX_14_9']

# Define features and signal features
features = ['RSI', 'ATR', '%K', '%D', 'MACD_signal', 'Bollinger_middle', 'OBV', 'RVI', 'WR','TRIX','VTX','CCI','ADX','PSAR']
# Extract features and signal features
X = df[features]
y = df['Returns']

# Calculate the correlation matrix
correlation_matrix = df[features].corr()

# Find highly correlated features

threshold = 0.6  # Adjust the threshold as needed
highly_correlated = (correlation_matrix.abs() > threshold).sum()

# Inspect the count of highly correlated features per column
#print("Highly Correlated Features:")
#print(highly_correlated)
# Find pairs of highly correlated features

highly_correlated_pairs = (correlation_matrix.abs() > threshold).stack().reset_index()
highly_correlated_pairs = highly_correlated_pairs[highly_correlated_pairs[0] & (highly_correlated_pairs['level_0'] != highly_correlated_pairs['level_1'])]

# Print the highly correlated feature pairs
#print("Highly Correlated Feature Pairs:")
#print(highly_correlated_pairs)

# Combine features and target, then drop rows with missing indicator values
df_combined = pd.concat([X, y], axis=1)
df_combined = df_combined.dropna()
X = df_combined.drop('Returns', axis=1)
y = df_combined['Returns']

# Some of these features are highly correlated, so Principal Component Analysis (PCA) is justified to reduce correlation among the features.

# Standardize the data using MinMaxScaler
# (note: fitting the scaler and PCA on the full sample before the
# train/test split leaks test information; acceptable for illustration)
scaler = MinMaxScaler()
X_scaled = scaler.fit_transform(X)

# Apply PCA to reduce correlation among features
pca = PCA(n_components=5)
X_pca = pca.fit_transform(X_scaled)

# Plot the scree plot

plt.plot(range(1, len(pca.explained_variance_ratio_) + 1), pca.explained_variance_ratio_, marker='o')
plt.xlabel('Number of Components')
plt.ylabel('Explained Variance Ratio')
plt.title('Scree Plot')
plt.show()

# Create a DataFrame with the transformed data
X_pca_df = pd.DataFrame(X_pca, columns=['PCA_1', 'PCA_2', 'PCA_3','PCA_4','PCA_5'])

# Split the data into training and testing sets
# (train_test_split shuffles by default, which breaks temporal ordering;
# a chronological or TimeSeriesSplit split would be more appropriate)
X_train, X_test, y_train, y_test = train_test_split(X_pca_df, y, test_size=0.2, random_state=5)


# Reshape the transformed data into the shape expected by the LSTM model
timesteps = 1
X_train_reshaped = X_train.values.reshape((X_train.shape[0], timesteps, X_train.shape[1]))
X_test_reshaped = X_test.values.reshape((X_test.shape[0], timesteps, X_test.shape[1]))

# Create a new LSTM model
model = Sequential()

# Add an LSTM layer with 64 units
model.add(LSTM(64, input_shape=(timesteps, X_train.shape[1]), return_sequences=True))

# Add another LSTM layer with 32 units
model.add(LSTM(32, return_sequences=True))

# Add a final LSTM layer with 16 units (return_sequences=False so the
# output collapses to one vector per sample before the Dense layer)
model.add(LSTM(16, return_sequences=False))
model.add(Dropout(0.2))

# Add a dense layer with 1 unit
model.add(Dense(1))

# Compile the model with Mean Squared Error loss; note that 'accuracy'
# is a classification metric and is not meaningful for this regression
# (hence the 0.0 accuracy reported below)
model.compile(loss='mean_squared_error', optimizer='adam', metrics=['accuracy'])

# Fit the model to the reshaped training data
model.fit(X_train_reshaped, y_train, epochs=30, batch_size=32,verbose=0)
<keras.src.callbacks.History object at 0x00000083F9CEA8D0>
# Make predictions on the reshaped testing data
y_pred = model.predict(X_test_reshaped)

5/5 [==============================] - 3s 5ms/step
# Evaluate the model on the testing data
loss, accuracy = model.evaluate(X_test_reshaped, y_test)

5/5 [==============================] - 3s 5ms/step - loss: 9.8667e-05 - accuracy: 0.0000e+00
print(f'Mean Squared Error (MSE) Loss: {loss}')
Mean Squared Error (MSE) Loss: 9.866656182566658e-05
print(f'Accuracy: {accuracy}')
Accuracy: 0.0
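# As a sanity check (an addition, not part of the original pipeline), the
# mean_squared_error function imported earlier can confirm the Keras loss
mse = mean_squared_error(y_test, y_pred.squeeze())
print(f'Mean Squared Error (sklearn): {mse}')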
# Print the explained variance ratio for each principal component
explained_variance_ratio = pca.explained_variance_ratio_
for i, component in enumerate(['PCA_1', 'PCA_2', 'PCA_3','PCA_4','PCA_5']):
    print(f"{component} Explained Variance Ratio: {explained_variance_ratio[i]}")
PCA_1 Explained Variance Ratio: 0.4942145478311793
PCA_2 Explained Variance Ratio: 0.14338653416386807
PCA_3 Explained Variance Ratio: 0.10928301857669297
PCA_4 Explained Variance Ratio: 0.10426514060962777
PCA_5 Explained Variance Ratio: 0.0471101304823444
# Compute the cumulative explained variance across the principal components
cumulative_explained_variance = np.cumsum(explained_variance_ratio)

# Print the cumulative explained variance
print("\nCumulative Explained Variance:")

Cumulative Explained Variance:
for i, component in enumerate(['PCA_1', 'PCA_2', 'PCA_3', 'PCA_4', 'PCA_5']):
    print(f"{component} Cumulative Explained Variance: {cumulative_explained_variance[i]}")
PCA_1 Cumulative Explained Variance: 0.4942145478311793
PCA_2 Cumulative Explained Variance: 0.6376010819950474
PCA_3 Cumulative Explained Variance: 0.7468841005717404
PCA_4 Cumulative Explained Variance: 0.8511492411813681
PCA_5 Cumulative Explained Variance: 0.8982593716637125
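# Alternative (a hedged sketch, not used above): passing a float to
# n_components keeps the smallest number of components whose cumulative
# explained variance reaches that fraction
pca_auto = PCA(n_components=0.90)
X_pca_auto = pca_auto.fit_transform(X_scaled)
print(f"Components retained for 90% variance: {pca_auto.n_components_}")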
# Create a bar plot to visualize the explained variance
plt.bar(range(1, len(explained_variance_ratio) + 1), explained_variance_ratio, alpha=0.7)
plt.xlabel('Principal Components')
plt.ylabel('Explained Variance Ratio')
plt.title('Explained Variance of Principal Components')
plt.show()

# Plot y against y_pred
plt.figure(figsize=(10, 6))
plt.scatter(y_test, y_pred, alpha=0.5)
plt.xlabel('True Values')
plt.ylabel('Predictions')
plt.title('True Values vs. Predictions')
plt.show()

# Align predictions with their dates: because train_test_split shuffled
# the rows, sort the test set back into chronological order for plotting
y_test_sorted = y_test.sort_index()
y_pred_series = pd.Series(y_pred.squeeze(), index=y_test.index).sort_index()

# Plot y_test and y_pred against the time index for the testing set
plt.figure(figsize=(10, 6))
plt.plot(y_test_sorted.index, y_test_sorted, label='True Returns', marker='o', linestyle='-', color='blue', alpha=0.7)
plt.plot(y_pred_series.index, y_pred_series, label='Predictions', marker='o', linestyle='-', color='orange', alpha=0.7)
plt.xlabel('Time')
plt.ylabel('Returns')
plt.title('LSTM Model Predictions')
plt.legend()
plt.show()
