📈 Ensemble Strategy for Backtesting Stock Price

This repository implements an ensemble strategy for backtesting stock prices that combines Bollinger Bands with an LSTM (neural network) model.

📚 Table of Contents

🛠️ How it Works
📖 User Manual
🏗️ Project Structure
🛠️ Our Approach
🚀 Performance Analysis
🔮 Drawbacks & Future Works
📂 Asset Categories

How it Works

There are two important files in the repository 📁.

The run.ipynb notebook is the main file; run it 🏃‍♂️ to backtest the ensemble strategy.
The generate_signal.py file is a module that generates buy/sell/hold signals 👍👎🔁 as 1, -1, and 0 respectively and returns them to the notebook. You can customize its input parameters.
At the end, the notebook generates a quantstats full report 📊 to evaluate the performance of the strategy.
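As a rough illustration of the 1 / -1 / 0 encoding, the sketch below names the three codes and tracks a long-only position from a list of signals. The `Signal` enum, the `simulate_position` helper, and the `shares_per_buy` parameter are hypothetical names used only for this example, and treating a sell signal as closing the whole position is an assumption, not necessarily how the notebook handles orders.

```python
from enum import IntEnum


class Signal(IntEnum):
    """Signal codes as described above: buy = 1, sell = -1, hold = 0."""
    SELL = -1
    HOLD = 0
    BUY = 1


def simulate_position(signals, shares_per_buy=10):
    """Return the number of shares held after each signal (long-only sketch)."""
    held = 0
    history = []
    for code in signals:
        signal = Signal(code)
        if signal is Signal.BUY:
            held += shares_per_buy
        elif signal is Signal.SELL:
            held = 0  # assumption: a sell signal closes the whole long position
        history.append(held)
    return history


if __name__ == "__main__":
    print([Signal(code).name for code in [1, 0, 1, -1, 0]])
    print(simulate_position([1, 0, 1, -1, 0]))
```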

📖 User Manual

1. Manual Installation

Follow these steps for a manual installation. (Steps 1, 2, and 4 alone are enough to run the project.)

1. Clone the Repository: Get a copy of this repository on your local machine with the following command:
   git clone https://github.com/YeakubSadlil/Ensemble_backtesting_stock_market.git
2. Install Dependencies: Make sure you have Python 3.10 installed. Then install the required dependencies with the following command:
   pip install -r requirements.txt
3. Data Ingestion: Ingest your custom data with Zipline. A default data ingestion is already available in the notebook. Note that the strategy accepts multiple assets.
4. Run the Notebook: Open the run.ipynb file in the repository and customize your backtesting parameters, such as stock symbols, time period, and investment settings like the amount and the number of stocks to buy at each buy signal.
5. Interpret Results: A quantstats report is generated automatically at the end by gs.plots(results). Analyze the generated plots and results to assess the strategy's performance on your selected or default assets.
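Before installing the dependencies, it can help to confirm that the active interpreter really is Python 3.10. The snippet below is a minimal sketch of such a check; the `is_python_310` helper is a hypothetical name for this example and is not part of the repository.

```python
import sys


def is_python_310(version_info=sys.version_info):
    """Return True when the given version tuple is a Python 3.10.x release."""
    return tuple(version_info[:2]) == (3, 10)


if __name__ == "__main__":
    if is_python_310():
        print("Python 3.10 detected.")
    else:
        found = ".".join(str(part) for part in sys.version_info[:3])
        print(f"Python 3.10 is expected, but this interpreter is {found}.")
```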

N.B.

If you don't want to customize any input parameters or the data ingestion, you can run the run.ipynb notebook directly without any changes.
If step 2 fails because of ta-lib, install ta-lib first by following its installation doc.

2. Installation Using Docker

If you have Docker installed, you can use it to run the project without setting up the environment or installing dependencies:

1. Verify Docker Installation: Make sure Docker is installed and running on your machine.
2. Clone the Repository: Get a copy of this repository on your local machine with the following command:
   git clone https://github.com/YeakubSadlil/Ensemble_backtesting_stock_market.git
3. Build the Docker Image: Run the following command to build the Docker image:
   docker build -t ensemble-backtest-stockprice .
4. Run the Docker Container: Start a new Docker container from the image with the following command:
   docker run -d -p 8888:8888 --name my_backtest_container ensemble-backtest-stockprice
5. Access Jupyter Notebook: Open your web browser and go to http://localhost:8888. If you are prompted for a token, paste the token obtained in step 6. (If no token is required, you can skip step 6.)
6. Get the Jupyter Notebook Token: Open a shell inside the running container:
   docker exec -it my_backtest_container /bin/bash
   Then run the following command inside the container:
   jupyter server list
   The output shows a link that contains a token. Copy only the token from the link and paste it into the Jupyter Notebook prompt in the browser.

From the Jupyter Notebook, run run.ipynb to start the project.
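If copying the token out of the link by hand is error-prone, it can be extracted with the standard library. The sketch below assumes the link carries the token as a `token=` query parameter; the `extract_token` helper and the sample line are illustrative only, and the real output of `jupyter server list` on your machine may be laid out differently.

```python
from urllib.parse import parse_qs, urlparse


def extract_token(server_line):
    """Return the token query parameter from a line that starts with a URL."""
    parts = server_line.split()
    if not parts:
        return None
    # Only the first whitespace-separated field is treated as the URL.
    query = parse_qs(urlparse(parts[0]).query)
    tokens = query.get("token", [])
    return tokens[0] if tokens else None


if __name__ == "__main__":
    # Illustrative line; replace it with the line printed in your container.
    sample = "http://localhost:8888/?token=abc123 :: /app"
    print(extract_token(sample))
```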

3. Google Colab

If you don't want to install anything on your local machine, or you don't have enough time to set up the environment, you can run the project on Google Colab.
Go to the Colab Notebook and follow the instructions there. After uploading the necessary files, you are ready to go with a single click on 'Run All'.

🏗️ Project Structure

├── 📂 Data                <- Folder for all the data used for model training
│   └── sp50/daily     
│   
├── 📂 ML Models           <- Folder for all the machine learning models used for the project
│       └──LSTM_Stock_Price_Prediction.ipynb
│
├── 📓 run.ipynb           <- Jupyter notebook from which the user can run the backtesting
│
├── 📄 generate_signal.py  <- Module to generate the buy/sell/hold signals
│
├── 📝 requirements.txt    <- List of required python packages
│
├── 🐳 Dockerfile          <- Dockerfile for building the Docker image
│
└── 📄 lstm_12_p50_ckp_13_24_e150.h5   <- LSTM model weights file

🛠️ Our Approach

Ensemble Strategy: We combined Bollinger Bands and an LSTM model to predict stock prices and generate signals.
- When the LSTM model predicts that tomorrow's price will be higher than the current price and the lower Bollinger Band is also above the current price, we generate a buy signal (long position).
- When the LSTM model predicts that tomorrow's price will be lower than the current price and the upper Bollinger Band is also below the current price, we generate a sell signal.
- Otherwise, we generate a hold signal.
- We chose to take only long positions because the market was a bull market.
Tuning Models:
- We tried trend-following and mean-reversion strategies with different technical indicators such as MACD, RSI, and Bollinger Bands, and checked their individual performance.
- We then combined the best-performing strategies with the LSTM to create an ensemble strategy.
- We found that the ensemble strategy performs better than the individual strategies and the S&P 500 benchmark.
- We also tested in-sample and out-of-sample performance and found that the ensemble performed better. See the test notebooks in the ML Models folder.
Bollinger Bands: We used Bollinger Bands to generate buy/sell signals based on the stock's price volatility, with a default window of 20 days.
LSTM Model: We developed an LSTM model that predicts the next day's stock price from the previous 50 days of prices.
- We used it as a filter on the Bollinger Bands signals because the predicted price tended to be higher than the current price during downtrends and lower during uptrends.
- We trained the model on S&P 500 data from 2013 to 2020 and tested it on data from 2023 to 2024. 📅
Asset Categorization: We backtested the strategy on 50 assets from 10 different sectors (2018-2022) to add diversification and evaluate its performance. See Asset Lists or the Data section for the list of assets.
Module Development: We developed a module that generates the signals (generate_signal.py) and is imported into run.ipynb. It returns buy/sell/hold signals as 1, -1, and 0 respectively.
Backtesting: We used the zipline library to backtest the strategy and quantstats to evaluate its performance. 🧪
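The buy/sell/hold rule described above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the code in the repository: the function names are hypothetical, the band width of 2 standard deviations and the use of the population standard deviation are conventional choices that the text does not specify, and `predicted_next` stands in for the LSTM output.

```python
from statistics import fmean, pstdev


def bollinger_bands(prices, window=20, num_std=2.0):
    """Return (lower, upper) bands computed from the last `window` prices."""
    recent = prices[-window:]
    if len(recent) < window:
        raise ValueError("not enough prices for the requested window")
    middle = fmean(recent)
    spread = num_std * pstdev(recent)  # assumption: population std, 2 sigma
    return middle - spread, middle + spread


def ensemble_signal(prices, predicted_next, window=20, num_std=2.0):
    """Return 1 (buy), -1 (sell) or 0 (hold) for the latest price."""
    current = prices[-1]
    lower, upper = bollinger_bands(prices, window, num_std)
    if predicted_next > current and lower > current:
        return 1   # price below the lower band and predicted to rise
    if predicted_next < current and upper < current:
        return -1  # price above the upper band and predicted to fall
    return 0


if __name__ == "__main__":
    dip = [100.0] * 19 + [90.0]
    print(ensemble_signal(dip, predicted_next=92.0))
```

Both conditions must agree before a signal fires, so the prediction acts as a filter on the band signal: a price below the lower band with a falling prediction still yields a hold.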

LSTM Model Architecture:

🚀 Performance Analysis

Our ensemble strategy performs close to the standalone Bollinger Bands strategy, but it outperformed the benchmark (S&P 500) in CAGR, Sharpe ratio, and portfolio value when backtested with 50 assets from 2018 to 2022.

It could not beat the benchmark when backtested on some single assets with out-of-sample data, but it performed well for AAPL.
Although the strategy beat the benchmark when going long only in bull conditions, it suffered during periods of high drawdown, which indicates that it is not robust enough to handle market downturns.

Ensemble Notebook: We tuned our ensemble model in Google Colab for faster training. The notebook is available here; alternatively, see Test_ensemble_InSample.ipynb in the ML Models folder.

The table below shows the performance comparison on in-sample data

Metric	Benchmark	Bollinger Bands	LSTM + Bollinger Ensemble
Start Period	2018-03-19	2018-03-19	2018-03-19
End Period	2022-12-30	2022-12-30	2022-12-30
Risk-Free Rate	0.0%	0.0%	0.0%
Time in Market	100.0%	100.0%	100.0%
Cumulative Return	39.52%	50.92%	54.21%
CAGR	4.92%	6.12%	6.45%
Sharpe	0.43	0.46	0.49
Max Drawdown	-33.92%	-43.2%	-44.06%
Avg. Drawdown	-2.18%	-2.79%	-2.79%
Volatility (ann.)	22.01%	25.56%	25.21%
Calmar	0.15	0.14	0.15
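As a quick consistency check on the table, the Calmar row can be reproduced from the CAGR and Max Drawdown rows. The sketch assumes the usual definition of the Calmar ratio (CAGR divided by the absolute maximum drawdown); the `calmar_ratio` helper is a name chosen for this example.

```python
# CAGR and Max Drawdown values copied from the table above, in percent.
rows = {
    "Benchmark": {"cagr": 4.92, "max_drawdown": -33.92},
    "Bollinger Bands": {"cagr": 6.12, "max_drawdown": -43.2},
    "LSTM + Bollinger Ensemble": {"cagr": 6.45, "max_drawdown": -44.06},
}


def calmar_ratio(cagr, max_drawdown):
    """Calmar ratio, assumed here to be CAGR / |max drawdown|."""
    return cagr / abs(max_drawdown)


calmar = {name: round(calmar_ratio(**values), 2) for name, values in rows.items()}

if __name__ == "__main__":
    for name, value in calmar.items():
        print(f"{name}: {value}")
```

The ensemble earns a higher CAGR than the benchmark but also has a deeper maximum drawdown, which is why the two Calmar ratios end up nearly identical.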

The plot below shows the performance of the ensemble strategy

🔮 Drawbacks & Future Works

Dataset Choice: We trained the LSTM model on S&P 500 data, but a market index could instead be built from the 50 assets we used for backtesting.
Order Strategy: Because the market was a bull market, we took only long positions; a proper short-selling strategy could generate more profit.
Fine-Tuning Models: Continuously refine and optimize the Bollinger Bands window size and the LSTM model for better prediction accuracy. The LSTM model underperformed when predicting from the past 100 and 150 days. LSTMs may suffer from vanishing gradients and could be improved with attention mechanisms, more stacked layers, or bidirectional LSTMs. 🔧
Risk Management: Implement risk management measures such as stop-loss and take-profit orders to minimize potential losses.
Meta-Labeling Strategy: In his book Advances in Financial Machine Learning, Dr. Marcos López de Prado describes a meta-labeling technique that uses an array of new ensemble learning techniques to enhance machine learning strategies. Hudson & Thames, a financial research group, expanded on these techniques and showed some implementation ideas in a YouTube video.
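To experiment with the lookback sizes mentioned under Fine-Tuning Models (50, 100, and 150 days), the price series has to be cut into fixed-length input windows with the next day's price as the target. The sketch below shows one simple way to do that in plain Python; `make_windows` is a hypothetical helper, and the actual preprocessing in the training notebook may differ (for example, it may scale the prices first).

```python
def make_windows(prices, lookback=50):
    """Split a price series into (window, next-day target) training pairs."""
    if lookback <= 0:
        raise ValueError("lookback must be positive")
    samples, targets = [], []
    for end in range(lookback, len(prices)):
        samples.append(prices[end - lookback:end])  # previous `lookback` days
        targets.append(prices[end])                 # price on the following day
    return samples, targets


if __name__ == "__main__":
    prices = [float(day) for day in range(200)]  # placeholder series
    for lookback in (50, 100, 150):
        samples, _ = make_windows(prices, lookback)
        print(lookback, len(samples))
```

A longer lookback leaves fewer training pairs from the same history, which is one practical cost to weigh when comparing window sizes.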

📂 Asset Categories

We backtested our strategy on 50 assets from 10 different sectors. If you want to test our model on your own data, please choose tickers from here. The list of assets is as follows:

Industrials	Health Care	Information Technology	Financials	Materials	Consumer Staples	Energy	Communication Services	Utilities	Real Estate
MMM	ABT	ADBE	AFL	FMC	BG	TRGP	DIS	AES	ARE
AOS	BAX	AMD	BAC	IFF	MO	VLO	WBD	LNT	BXP
BA	BDX	AAPL	BRK-B	KLAC	CPB	WMB	GOOGLE	AEP	CPT
AXON	TECH	CDNS	BX	APD	STZ	APA	FOX	AWK	AMT
CAT	ALGN	NVDA	COF	CE	WMT	BKR	EA	CEG	CCI

Summary

                                      Data dances in time's rapid stream  💃🕺🌊⏳
                                      Patterns prediction, a trader's dream 🔮💰💤😴
                                      Bollinger's Bands, our measuring guide  📏📈📊🔍
                                      LSTM whispers where profits reside  🤫💰💵🏠
                                      The ensemble dances with a symphony bright 🌟🎭💃🎶
                                      Forecasting markets, with endless sight  🧐📉📈👀
                                      --------------------> An Anonymous Quant
