Skip to content

Latest commit

 

History

History
 
 

08_ml4t_workflow

Putting it all together: The ML4T Workflow

This chapter integrates the various building blocks of the machine learning for trading (ML4T) workflow and presents an end-to-end perspective on the process of designing, simulating, and evaluating an ML-driven trading strategy. Most importantly, it demonstrates in more detail how to prepare, design, run and evaluate a backtest using the Python libraries backtrader and Zipline.

The ultimate goal of the ML4T workflow is to gather evidence from historical data that helps decide whether to deploy a candidate strategy in a live market and put financial resources at risk. This process builds on the skills you developed in the previous chapters because it relies on your ability to

  • work with a diverse set of data sources to engineer informative factors,
  • design ML models that generate predictive signals to inform your trading strategy, and
  • optimize the resulting portfolio from a risk-return perspective.

A realistic simulation of your strategy also needs to faithfully represent how security markets operate and how trades are executed. Therefore, the institutional details of exchanges, such as which order types are available and how prices are determined (see Chapter 2, Market and Fundamental Data, also matter when you design a backtest or evaluate whether a backtesting engine includes the requisite features for accurate performance measurements. Finally, there are several methodological aspects that require attention to avoid biased results and false discoveries that will lead to poor investment decisions.

More specifically, after working through this chapter you will be able to:

  • Plan and implement end-to-end strategy backtesting
  • Understand and avoid critical pitfalls in the process
  • Discuss the advantages and disadvantages of vectorized vs event-driven backtesting engines
  • Identify and evaluate the key components of an event-driven backtester
  • Design and execute the ML4T workflow using data sources at minute and daily frequencies, with ML models trained separately or as part of the backtest
  • Know how to use Zipline and backtrader

The data used for some of the backtest simulations are generated by the script data_prep.py in the data directory and are based on the linear regression return predictions in Chapter 7, Linear Models.

How to backtest an ML-driven strategy

In a nutshell, the ML4T workflow, illustrated in Figure 8.1, is about backtesting a trading strategy that leverages machine learning to generate trading signals, select and size positions, or optimize the execution of trades. It involves the following steps, with a specific investment universe and horizon in mind:

  • Source and prepare market, fundamental, and alternative data
  • Engineer predictive alpha factors and features
  • Design, tune, and evaluate ML models to generate trading signals
  • Decide on trades based on these signals, e.g. by applying rules
  • Size individual positions in the portfolio context
  • Simulate the resulting trades triggered using historical market data
  • Evaluate how the resulting positions would have performed

The ML4T Workflow

The pitfalls of backtesting and how to avoid them

Backtesting simulates an algorithmic strategy based on historical data with the goal of producing performance results that generalize to new market conditions. In addition to the generic uncertainty around predictions in the context of ever-changing markets, several implementation aspects can bias the results and increase the risk of mistaking in-sample performance for patterns that will hold out-of-sample.

Getting the data right

Data issues that undermine the validity of a backtest include

  • look-ahead bias,
  • survivorship bias,
  • outlier control, as well as
  • the selection of the sample period.

Getting the simulation right

Practical issues related to the implementation of the historical simulation include:

  • a failure to mark to market to accurately reflect market prices and account for drawdowns;
  • unrealistic assumptions about the availability, cost, or market impact of trades; or
  • incorrect timing of signals and trade execution.

Getting the statistics right: Data-snooping and backtest-overfitting

The most prominent challenge to backtest validity, including to published results, relates to the discovery of spurious patterns due to multiple testing during the strategy-selection process. Selecting a strategy after testing different candidates on the same data will likely bias the choice because a positive outcome is more likely to be due to the stochastic nature of the performance measure itself. In other words, the strategy is overly tailored, or overfit, to the data at hand and produces deceptively positive results.

Marcos Lopez de Prado has published extensively on the risks of backtesting, and how to detect or avoid it. This includes an online simulator of backtest-overfitting.

Code Example: The deflated Sharpe Ratio

De Lopez Prado and David Bailey derived a deflated SR to compute the probability that the SR is statistically significant while controlling for the inflationary effect of multiple testing, non-normal returns, and shorter sample lengths.

The pyton script deflated_sharpe_ratio in the directory multiple_testing contains the Python implementation with references for the derivation of the related formulas.

References

How a backtesting engine works

Put simply, a backtesting engine iterates over historical prices (and other data), passes the current values to your algorithm, receives orders in return, and keeps track of the resulting positions and their value. In practice, there are numerous requirements to create a realistic and robust simulation of the ML4T workflow depicted above. The difference between vectorized and event-driven approaches illustrates how the faithful reproduction of the actual trading environment adds significant complexity.

Vectorized vs event-driven backtesting

A vectorized backtest is the most basic way to evaluate a strategy. It simply multiplies a signal vector that represents the target position size with a vector of returns for the investment horizon to compute the period performance.

Code example

We illustrate the vectorized approach using the daily return predictions that we created using ridge regression in Chapter 7

Key Implementation Aspects

The requirements for a realistic simulation may be met by a single platform that supports all steps of the process in an end-to-end fashion, or by multiple tools that each specialize in different aspects. For instance, you could handle the design and testing of ML models that generate signals using generic ML libraries like scikit-learn or others that we will encounter in this book and feed the model outputs into a separate backtesting engine. Alternatively, you could run the entire ML4T workflow end-to-end on a single platform like Quantopian and QuantConnect.

The following implementation details need to be addressed to put this process in action, and are discussed in more detail in this section of the book:

  • Data ingestion: Format, frequency, and timing
  • Factor engineering: Built-in computations vs third-party libraries
  • ML models, predictions, and signals
  • Trading rules and execution
  • Performance evaluation

backtrader: a flexible tool for local backtests

backtrader is a popular, flexible, and user-friendly Python library for local backtests with great documentation, developed since 2015 by Daniel Rodriguez. In addition to a large and active community of individual traders, there are several banks and trading houses that use backtrader to prototype and test new strategies before porting them to a production-ready platform using, e.g., Java. You can also use backtrader for live trading with several brokers of your choice (see the backtrader documentation and Chapter 23, Next Steps)).

Key concepts of backtrader’s Cerebro architecture

How to use backtrader in practice

Resources

zipline: production-ready backtesting by Quantopian

The open source Zipline library is an event-driven backtesting system maintained and used in production by the crowd-sourced quantitative investment fund Quantopian to facilitate algorithm-development and live-trading. It automates the algorithm's reaction to trade events and provides it with current and historical point-in-time data that avoids look-ahead bias.

In Chapter 4, we introduced zipline to simulate the computation of alpha factors, and in Chapter 5 we added trades to simulate a simple strategy and measure its performance as well as optimize portfolio holdings using different techniques.

The code for this section is in the subdirectory ml4t_workflow_with_zipline:

  • the notebook backtesting_with_zipline demonstrates the use of the Pipeline interface while loading ML predictions from another local (HDF5) data source
  • the notebook ml4t_with_zipline shows how to train an ML model locally as part of a Pipeline using a CustomFactor and various technical indicators as features for daily bundle data.
  • the notebook ml4t_quantopian shows how to train an ML model on the Quantopian platform to utilize the broad range of data sources available there.