04_alpha_factor_research/README.md
+42-11
@@ -1,25 +1,25 @@
1
-
# Chapter 04: Alpha Factor Research & Evaluation
1
+
# Chapter 04: Financial Feature Engineering: How to research Alpha Factors
2
2
3
3
Alpha factors aim to predict the price movements of assets in the investment universe based on the available market, fundamental, or alternative data. A factor may combine one or several input variables, but assumes a single value for each asset every time the strategy evaluates the factor.
4
4
5
5
Trade decisions typically rely on relative values across assets. Trading strategies are often based on signals emitted by multiple factors, and we will see that machine learning (ML) models are particularly well suited to integrate the various signals efficiently to make more accurate predictions.
6
6
7
7
This chapter provides a framework for understanding how factors work and how to measure their performance, for example using the information coefficient (IC). It demonstrates how to engineer alpha factors from data using Python libraries offline and on the Quantopian platform. It also introduces the `zipline` library to backtest factors and the `alphalens` library to evaluate their predictive power. More specifically, this chapter covers:
8
8
9
-
- How to characterize, justify, and measure key types of alpha factors
10
-
- How to create alpha factors using financial feature engineering
11
-
- How to use `zipline` offline to test individual alpha factors
12
-
- How to use `zipline` on Quantopian to combine alpha factors and identify more sophisticated signals
9
+
- Which categories of factors exist, why they work, and how to measure them
10
+
- How to create alpha factors using NumPy, pandas, and TA-Lib
11
+
- How to denoise data using wavelets and the Kalman filter
12
+
- How to use `zipline` offline and on Quantopian to test individual and multiple alpha factors
13
13
- How the information coefficient (IC) measures an alpha factor's predictive performance
14
-
- How to use `alphalens` to evaluate predictive performance and turnover
14
+
- How to use `alphalens` to evaluate predictive performance and turnover using, among other metrics, the information coefficient (IC)
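
To make the IC concrete: it is commonly computed as the Spearman rank correlation between factor values observed at one point in time and the returns realized over the following period. The snippet below is a minimal, dependency-free sketch under that assumption. The function names and the sample numbers are illustrative and do not come from `alphalens` or the chapter's notebooks.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values receive their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def information_coefficient(factor_values, forward_returns):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(factor_values), ranks(forward_returns))


# Hypothetical cross-section: one factor value and one forward return per asset.
factor = [0.8, -0.3, 1.5, 0.1, -1.2]
fwd = [0.02, -0.01, 0.01, 0.03, -0.02]
ic = information_coefficient(factor, fwd)
print(f"IC: {ic:.2f}")
```

An IC near zero suggests the factor ranking carries little information about subsequent returns, while values closer to 1 (or -1) indicate that the ranking lines up with (or against) realized performance.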
15
15
16
-
## Engineering Alpha Factor
16
+
## Alpha Factors in practice: from data to signals
17
17
18
18
Alpha factors are transformations of market, fundamental, and alternative data that contain predictive signals. They are designed to capture risks that drive asset returns. One set of factors describes fundamental, economy-wide variables such as growth, inflation, volatility, productivity, and demographic risk. Another set consists of tradeable investment styles such as the market portfolio, value-growth investing, and momentum investing.
19
19
20
20
There are also factors that explain price movements based on the economics or institutional setting of financial markets, or on investor behavior, including known biases of this behavior. The economic theory behind factors can be rational, where the factors have high returns over the long run to compensate for their low returns during bad times, or behavioral, where factor risk premiums result from the possibly biased or not entirely rational behavior of agents that is not arbitraged away.
21
21
22
-
### Important Factor Categories
22
+
### On the shoulders of giants: meet the factor establishment
23
23
24
24
In an idealized world, categories of risk factors should be independent of each other (orthogonal), yield positive risk premia, and form a complete set that spans all dimensions of risk and explains the systematic risks for assets in a given class. In practice, these requirements will hold only approximately.
25
25
@@ -31,9 +31,36 @@ In an idealized world, categories of risk factors should be independent of each
31
31
- [Anomalies and Market Efficiency](https://www.nber.org/papers/w9277.pdf), by G. William Schwert (Ch. 15 in the "Handbook of the Economics of Finance", edited by Constantinides, Harris, and Stulz, 2003)
32
32
- [Investor Psychology and Asset Pricing](https://papers.ssrn.com/sol3/papers.cfm?abstract_id=265132), by David Hirshleifer (2001)
33
33
34
-
### How to transform Data into Factors
34
+
## Engineering alpha factors that predict returns
35
+
36
+
37
+
38
+
### How to engineer factors using pandas and NumPy
35
39
36
40
- The notebook [feature_engineering.ipynb](00_data/feature_engineering.ipynb) in the [data](00_data) directory illustrates how to engineer basic factors.
41
+
- [Fama French](https://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html) Data Library
- [10 minutes to pandas](https://pandas.pydata.org/pandas-docs/stable/getting_started/10min.html)
47
+
- [Python Pandas Tutorial: A Complete Introduction for Beginners](https://www.learndatasci.com/tutorials/python-pandas-tutorial-complete-introduction-for-beginners/)
48
+
- [alphatools](https://github.com/marketneutral/alphatools) - Quantitative finance research tools in Python
49
+
- [mlfinlab](https://github.com/hudson-and-thames/mlfinlab) - Package based on Dr. Marcos Lopez de Prado's research in Advances in Financial Machine Learning
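
As a rough illustration of what engineering a factor with pandas can look like, the sketch below computes a simple momentum factor (the return over a lookback window), standardizes it across assets on each date, and aligns it with next-period returns. The tickers, prices, and `lookback` value are made up for this example and are not taken from the notebook.

```python
import pandas as pd

# Hypothetical daily closing prices: rows are dates, columns are tickers.
dates = pd.date_range("2020-01-01", periods=6, freq="D")
prices = pd.DataFrame(
    {
        "AAA": [100, 101, 102, 103, 104, 105],
        "BBB": [50, 50, 49, 48, 48, 47],
        "CCC": [20, 21, 21, 22, 24, 25],
    },
    index=dates,
    dtype=float,
)

lookback = 3  # illustrative window length, in trading days

# Momentum: percentage price change over the lookback window.
momentum = prices.pct_change(lookback)

# Cross-sectional z-score: on each date, compare every asset to its peers,
# since trade decisions typically rely on relative values across assets.
zscore = momentum.sub(momentum.mean(axis=1), axis=0).div(
    momentum.std(axis=1), axis=0
)

# One-day forward returns, shifted so that each row holds the return
# realized after the factor value on that date was observed.
forward_returns = prices.pct_change().shift(-1)

print(zscore.dropna())
```

Shifting the returns rather than the factor keeps each row free of look-ahead bias: the factor on a given date only uses prices up to that date, while the paired return comes from the following day.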
50
+
51
+
#### How to denoise your Alpha Factors with the Kalman Filter
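
For intuition, the following is a minimal sketch of a one-dimensional Kalman filter that treats the observed series as a slowly drifting hidden level plus measurement noise (a local-level model). It is written in plain Python for illustration only. The function name, the two variance parameters, and the sample data are assumptions, and the chapter's own notebook may rely on a dedicated library instead.

```python
def kalman_filter_1d(observations, process_var=0.01, measurement_var=1.0):
    """Filter a noisy series with a scalar local-level Kalman filter.

    process_var: assumed variance of the hidden level's random drift.
    measurement_var: assumed variance of the observation noise.
    """
    estimate = observations[0]  # initialize the state at the first observation
    error_var = 1.0             # initial uncertainty about that estimate
    filtered = []
    for z in observations:
        # Predict: the level is assumed to persist, so only uncertainty grows.
        error_var += process_var
        # Update: blend prediction and observation according to the Kalman gain.
        gain = error_var / (error_var + measurement_var)
        estimate += gain * (z - estimate)
        error_var *= 1 - gain
        filtered.append(estimate)
    return filtered


def roughness(series):
    """Sum of absolute step-to-step changes, used here as a crude noise gauge."""
    return sum(abs(b - a) for a, b in zip(series, series[1:]))


# Hypothetical noisy observations fluctuating around a level near 10.5.
raw = [10.0, 12.0, 9.0, 11.0, 10.0, 13.0, 9.0, 11.0]
smooth = kalman_filter_1d(raw)
print(roughness(raw), round(roughness(smooth), 3))
```

A larger `measurement_var` relative to `process_var` makes the filter trust its running estimate more and the incoming observations less, which yields a smoother but slower-reacting series.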
## From signals to trades: backtesting with `zipline`
48
75
49
76
The open-source [zipline](http://www.zipline.io/index.html) library is an event-driven backtesting system maintained and used in production by the crowd-sourced quantitative investment fund [Quantopian](https://www.quantopian.com/) to facilitate algorithm development and live trading. It automates the algorithm's reaction to trade events and provides it with current and historical point-in-time data that avoids look-ahead bias.
50
77
51
-
- `zipline` installation: see [docs](http://www.zipline.io/index.html) and the introduction to `zipline` in [Chapter 2](../02_market_and_fundamental_data/02_data_providers/04_zipline) for more detail.
78
+
### Installation
79
+
80
+
- The current release 1.3 has a few shortcomings, such as the [dependency on benchmark data from the IEX exchange](https://github.com/quantopian/zipline/issues/2480) and limitations for importing features beyond the basic OHLCV data points.
81
+
- To enable the use of `zipline`, I've provided a [patched version](https://github.com/stefan-jansen/zipline) that works for the purposes of this book.
82
+
- Install by cloning the repo, `cd` into the package's root folder, and, after activating the `ml4t` environment, run `pip install -e .`
52
83
53
84
## Separating signal and noise – how to use alphalens