|Project Name||Stars||Downloads||Repos Using This||Packages Using This||Most Recent Commit||Total Releases||Latest Release||Open Issues||License||Language|
|Openbbterminal||24,896||5||10 hours ago||34||November 30, 2023||272||mit||Python|
|Investment Research for Everyone, Everywhere.|
|Qlib||12,931||1||6 days ago||32||July 18, 2023||181||mit||Python|
|Qlib is an AI-oriented quantitative investment platform that aims to realize the potential, empower research, and create value using AI technologies in quantitative investment, from exploring ideas to implementing productions. Qlib supports diverse machine learning modeling paradigms, including supervised learning, market dynamics modeling, and RL.|
|Machine Learning For Trading||9,469||5 months ago||13||Jupyter Notebook|
|Code for Machine Learning for Algorithmic Trading, 2nd edition.|
|Pybroker||1,322||a month ago||7||other||Python|
|Algorithmic Trading in Python with Machine Learning|
|Surpriver||1,275||2 years ago||11||gpl-3.0||Python|
|Find big moving stocks before they move using machine learning and anomaly detection|
|Sgx Full Orderbook Tick Data Trading Strategy||1,151||a year ago||3||Jupyter Notebook|
|Providing the solutions for high-frequency trading (HFT) strategies using data science approaches (Machine Learning) on Full Orderbook Tick Data.|
|My Data Competition Experience||271||3 years ago||2||Python|
|Quantitative Notebooks||233||3 years ago||apache-2.0||Jupyter Notebook|
|Educational notebooks on quantitative finance, algorithmic trading, financial modelling and investment strategy|
|Stockperformanceclassification||113||5 years ago||3||Jupyter Notebook|
|Keras 1D CNN on Azure ML Workbench to classify 4 week stock performance based on text in public earnings statements|
|Awesome Financial Nlp||98||4 years ago|
|Researches for Natural Language Processing for Financial Domain|
Recent released features

| Feature | Status |
| -- | ------ |
| KRNN and Sandwich models | 📈 Released on May 26, 2023 |
| Release Qlib v0.9.0 | Released on Dec 9, 2022 |
| RL Learning Framework | 🔨 📈 Released on Nov 10, 2022. #1332, #1322, #1316, #1299, #1263, #1244, #1169, #1125, #1076 |
| HIST and IGMTF models | 📈 Released on Apr 10, 2022 |
| Qlib notebook tutorial | Released on Apr 7, 2022 |
| Ibovespa index data | 🍚 Released on Apr 6, 2022 |
| Point-in-Time database | 🔨 Released on Mar 10, 2022 |
| Arctic Provider Backend & Orderbook data example | 🔨 Released on Jan 17, 2022 |
| Meta-Learning-based framework & DDG-DA | 📈 🔨 Released on Jan 10, 2022 |
| Planning-based portfolio optimization | 🔨 Released on Dec 28, 2021 |
| Release Qlib v0.8.0 | Released on Dec 8, 2021 |
| ADD model | 📈 Released on Nov 22, 2021 |
| ADARNN model | 📈 Released on Nov 14, 2021 |
| TCN model | 📈 Released on Nov 4, 2021 |
| Nested Decision Framework | 🔨 Released on Oct 1, 2021. Example and Doc |
| Temporal Routing Adaptor (TRA) | 📈 Released on July 30, 2021 |
| Transformer & Localformer | 📈 Released on July 22, 2021 |
| Release Qlib v0.7.0 | Released on July 12, 2021 |
| TCTS Model | 📈 Released on July 1, 2021 |
| Online serving and automatic model rolling | 🔨 Released on May 17, 2021 |
| DoubleEnsemble Model | 📈 Released on Mar 2, 2021 |
| High-frequency data processing example | 🔨 Released on Feb 5, 2021 |
| High-frequency trading example | 📈 Part of code released on Jan 28, 2021 |
| High-frequency data(1min) | 🍚 Released on Jan 27, 2021 |
| Tabnet Model | 📈 Released on Jan 22, 2021 |
Features released before 2021 are not listed here.
Qlib is an open-source, AI-oriented quantitative investment platform that aims to realize the potential, empower research, and create value using AI technologies in quantitative investment, from exploring ideas to implementing productions. Qlib supports diverse machine learning modeling paradigms, including supervised learning, market dynamics modeling, and reinforcement learning.
An increasing number of SOTA Quant research works/papers in diverse paradigms are being released in Qlib to collaboratively solve key challenges in quantitative investment. For example, 1) using supervised learning to mine the market's complex non-linear patterns from rich and heterogeneous financial data, 2) modeling the dynamic nature of the financial market using adaptive concept drift technology, and 3) using reinforcement learning to model continuous investment decisions and assist investors in optimizing their trading strategies.
It contains the full ML pipeline of data processing, model training, and back-testing, and covers the entire chain of quantitative investment: alpha seeking, risk modeling, portfolio optimization, and order execution. For more details, please refer to our paper "Qlib: An AI-oriented Quantitative Investment Platform".
|Frameworks, Tutorial, Data & DevOps||Main Challenges & Solutions in Quant Research|
New features under development (ordered by estimated release time). Your feedback about the features is very important.
The high-level framework of Qlib can be found above (users can find the detailed framework of Qlib's design when getting into the nitty-gritty). The components are designed as loosely coupled modules, and each component can be used stand-alone.
Qlib provides a strong infrastructure to support Quant research. Data is always an important part. A strong learning framework is designed to support diverse learning paradigms (e.g. reinforcement learning, supervised learning) and patterns at different levels (e.g. market dynamic modeling). By modeling the market, trading strategies will generate trade decisions that will be executed. Multiple trading strategies and executors at different levels or granularities can be nested to be optimized and run together. Finally, a comprehensive analysis will be provided and the model can be served online at low cost.
This quick start guide tries to demonstrate
This table demonstrates the supported Python version of
| | install with pip | install from source | plot |
| ------------- |:---------------------:|:--------------------:|:----:|
| Python 3.7 | ✔️ | ✔️ | ✔️ |
| Python 3.8 | ✔️ | ✔️ | ✔️ |
| Python 3.9 | ❌ | ✔️ | ❌ |
conda environment may result in missing header files, causing the installation failure of certain packages.
Qlib from source. If users use Python 3.6 on their machines, it is recommended to upgrade Python to version 3.7 or use
conda's Python to install
Qlib supports running workflows such as training models, doing backtest and plotting most of the related figures (those included in notebook). However, plotting for the model performance is not supported for now and we will fix this when the dependent packages are upgraded in the future.
hdf5 in tables does not support Python 3.9.
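The compatibility table above can be encoded as a small lookup for use in a setup or CI script. This is a minimal sketch: the `SUPPORT` dictionary simply transcribes the table, and `check_support` is a hypothetical helper that is not part of Qlib.

```python
import sys

# Transcription of the compatibility table above (not queried from Qlib itself).
SUPPORT = {
    (3, 7): {"pip": True, "source": True, "plot": True},
    (3, 8): {"pip": True, "source": True, "plot": True},
    (3, 9): {"pip": False, "source": True, "plot": False},
}


def check_support(version=None):
    """Return the table row for a (major, minor) version, or None if it is not listed."""
    if version is None:
        version = sys.version_info[:2]
    return SUPPORT.get(tuple(version))


if __name__ == "__main__":
    row = check_support()
    if row is None:
        print("This Python version is not listed in the table.")
    else:
        print(row)
```

Versions that the table does not mention return `None` rather than a guess.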
Users can easily install Qlib with pip using the following command.
pip install pyqlib
Note: pip will install the latest stable qlib. However, the main branch of qlib is in active development. If you want to test the latest scripts or functions in the main branch, please install qlib with the methods below.
Also, users can install the latest dev version of Qlib from the source code with the following steps:
To install Qlib from source, users need to install some dependencies:
pip install numpy
pip install --upgrade cython
Clone the repository and install Qlib as follows.
git clone https://github.com/microsoft/qlib.git && cd qlib
pip install .
Note: You can install Qlib with `python setup.py install` as well, but it is not the recommended approach. It will skip pip and cause obscure problems. For example, only the command `pip install .` can overwrite the stable version installed by `pip install pyqlib`, while the command `python setup.py install` can't.
Tips: If you fail to install Qlib or run the examples in your environment, comparing your steps and the CI workflow may help you find the problem.
Load and prepare data by running the following code:
# get 1d data
python -m qlib.run.get_data qlib_data --target_dir ~/.qlib/qlib_data/cn_data --region cn
# get 1min data
python -m qlib.run.get_data qlib_data --target_dir ~/.qlib/qlib_data/cn_data_1min --region cn --interval 1min

# get 1d data
python scripts/get_data.py qlib_data --target_dir ~/.qlib/qlib_data/cn_data --region cn
# get 1min data
python scripts/get_data.py qlib_data --target_dir ~/.qlib/qlib_data/cn_data_1min --region cn --interval 1min
Please pay ATTENTION that the data is collected from Yahoo Finance, and the data might not be perfect. We recommend that users prepare their own data if they have a high-quality dataset. For more information, users can refer to the related document.
This step is optional if users only want to try their models and strategies on historical data.
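When the download needs to be scripted, the module-based command shown above can be assembled as an argument list. This is a minimal sketch: `build_get_data_command` is a hypothetical helper (not part of Qlib) that only reuses the flags appearing in the commands above, and it expands `~` itself because no shell is involved when a list is passed to `subprocess.run`.

```python
import os


def build_get_data_command(target_dir, region="cn", interval=None):
    """Build the argument list for `python -m qlib.run.get_data qlib_data ...`."""
    cmd = [
        "python", "-m", "qlib.run.get_data", "qlib_data",
        "--target_dir", os.path.expanduser(target_dir),
        "--region", region,
    ]
    # The 1min dataset in the commands above is selected with --interval.
    if interval is not None:
        cmd += ["--interval", interval]
    return cmd


if __name__ == "__main__":
    print(build_get_data_command("~/.qlib/qlib_data/cn_data"))
    print(build_get_data_command("~/.qlib/qlib_data/cn_data_1min", interval="1min"))
    # To actually run a download (requires Qlib to be installed):
    # import subprocess
    # subprocess.run(build_get_data_command("~/.qlib/qlib_data/cn_data"), check=True)
```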
It is recommended that users update the data manually once (--trading_date 2021-05-25) and then set it to update automatically.
NOTE: Users can't incrementally update data based on the offline data provided by Qlib (some fields are removed to reduce the data size). Users should use yahoo collector to download Yahoo data from scratch and then incrementally update it.
For more information, please refer to: yahoo collector
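Automatic update of data to the "qlib" directory each trading day (Linux)
set up timed tasks:
* * * * 1-5 python <script path> update_data_to_bin --qlib_data_1d_dir <user data dir>
Note: with the minute and hour fields both set to `*`, this crontab entry fires every minute from Monday to Friday. Set explicit minute and hour values so that the update runs once per trading day.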
Manual update of data
python scripts/data_collector/yahoo/collector.py update_data_to_bin --qlib_data_1d_dir <user data dir> --trading_date <start date> --end_date <end date>
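The day-of-week field `1-5` in the crontab entry above selects Monday through Friday. The sketch below mirrors that rule in Python, which can be handy for guarding a manual update script. `cron_weekday_matches` is a hypothetical helper, and like cron it knows nothing about exchange holidays.

```python
from datetime import date


def cron_weekday_matches(day):
    """True if `day` is Monday-Friday, i.e. matches a cron day-of-week field of 1-5."""
    # isoweekday() numbers Monday as 1 and Sunday as 7.
    return 1 <= day.isoweekday() <= 5


if __name__ == "__main__":
    for d in (date(2021, 5, 25), date(2021, 5, 29)):
        print(d.isoformat(), cron_weekday_matches(d))
```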
Qlib provides a tool named `qrun` to run the whole workflow automatically (including building dataset, training models, backtest and evaluation). You can start an auto quant research workflow and get a graphical report analysis with the following steps:
Quant Research Workflow: Run `qrun` with the lightgbm workflow config (workflow_config_lightgbm_Alpha158.yaml) as follows.
cd examples  # Avoid running the program in a directory that contains `qlib`
qrun benchmarks/LightGBM/workflow_config_lightgbm_Alpha158.yaml
If users want to use `qrun` under debug mode, please use the following command:
python -m pdb qlib/workflow/cli.py examples/benchmarks/LightGBM/workflow_config_lightgbm_Alpha158.yaml
The result of `qrun` is as follows; please refer to Intraday Trading for more details about the result.
'The following are analysis results of the excess return without cost.'
                       risk
mean               0.000708
std                0.005626
annualized_return  0.178316
information_ratio  1.996555
max_drawdown      -0.081806
'The following are analysis results of the excess return with cost.'
                       risk
mean               0.000512
std                0.005626
annualized_return  0.128982
information_ratio  1.444287
max_drawdown      -0.091078
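To make the five reported quantities concrete, here is a minimal sketch of how such metrics can be computed from a series of daily excess returns. It is not taken from Qlib's source: it assumes 252 trading periods per year and additive (cumulative-sum) accumulation of returns, and `risk_metrics` is a hypothetical name. The reported figures are roughly consistent with the first assumption (0.000708 × 252 ≈ 0.1784).

```python
import math
import statistics


def risk_metrics(daily_returns, periods_per_year=252):
    """Summarize daily excess returns (assumes additive accumulation of returns)."""
    mean = statistics.mean(daily_returns)
    std = statistics.stdev(daily_returns)  # sample standard deviation

    # Max drawdown: largest drop of the cumulative-sum curve below its running peak.
    cumulative, peak, max_drawdown = 0.0, float("-inf"), 0.0
    for r in daily_returns:
        cumulative += r
        peak = max(peak, cumulative)
        max_drawdown = min(max_drawdown, cumulative - peak)

    return {
        "mean": mean,
        "std": std,
        "annualized_return": mean * periods_per_year,
        "information_ratio": mean / std * math.sqrt(periods_per_year),
        "max_drawdown": max_drawdown,
    }


if __name__ == "__main__":
    for name, value in risk_metrics([0.01, -0.02, 0.03, -0.01]).items():
        print(f"{name:<18} {value: .6f}")
```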
Here are detailed documents for `qrun` and workflow.
Graphical Reports Analysis: Run `jupyter notebook` to get graphical reports
Forecasting signal (model prediction) analysis
Explanation of above results
The automatic workflow may not suit the research workflow of all Quant researchers. To support a flexible Quant research workflow, Qlib also provides a modularized interface to allow researchers to build their own workflow by code. Here is a demo for customized Quant research workflow by code.
Quant investment is a unique scenario with many key challenges to be solved. Currently, Qlib provides solutions for several of them.
Accurate forecasting of stock price trends is a very important part of constructing profitable portfolios. However, the huge amount of data in various formats in the financial market makes it challenging to build forecasting models.
An increasing number of SOTA Quant research works/papers, which focus on building forecasting models to mine valuable signals/patterns in complex financial data, are released in
Here is a list of models built on
Your PR of new Quant models is highly welcomed.
The performance of each model on the
Alpha360 datasets can be found here.
All the models listed above are runnable with Qlib. Users can find the config files we provide and some details about the models in the benchmarks folder. More information can be retrieved from the model files listed above.
Qlib provides three different ways to run a single model; users can pick the one that best fits their case:
Users can use the tool `qrun` mentioned above to run a model's workflow based on a config file.
Users can create a
workflow_by_code python script based on the one listed in the
Users can use the script `run_all_model.py` listed in the examples folder to run a model. Here is an example of the specific shell command to be used: `python run_all_model.py run --models=lightgbm`, where the `--models` argument can take any number of the models listed above (the available models can be found in benchmarks). For more use cases, please refer to the file's docstrings.
Qlib also provides a script `run_all_model.py` which can run multiple models for several iterations. (Note: the script only supports Linux for now; other operating systems will be supported in the future. It also does not support running the same model in parallel multiple times; this will be fixed in future development.)
The script will create a unique virtual environment for each model, and delete the environments after training. Thus, only experiment results such as backtest results will be generated and stored.
Here is an example of running all the models for 10 iterations:
python run_all_model.py run 10
It also provides the API to run specific models at once. For more use cases, please refer to the file's docstrings.
Because the financial market environment is non-stationary, the data distribution may change across periods, which makes the performance of models built on training data decay on future test data. Adapting the forecasting models/strategies to market dynamics is therefore very important to their performance.
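One simple and widely used way to cope with a shifting data distribution is to retrain periodically on a rolling window, so that each model only sees recent history. The sketch below generates such walk-forward train/test index ranges. It illustrates the general idea only and is not one of the solutions provided by Qlib; `rolling_windows` and its parameters are hypothetical names.

```python
def rolling_windows(n_periods, train_size, test_size, step=None):
    """Yield (train_range, test_range) index pairs for walk-forward retraining."""
    if step is None:
        step = test_size  # by default, test segments do not overlap
    start = 0
    while start + train_size + test_size <= n_periods:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        start += step


if __name__ == "__main__":
    # 10 periods, train on 4, test on the next 2, then slide forward by 2.
    for train, test in rolling_windows(10, train_size=4, test_size=2):
        print(list(train), "->", list(test))
```

Each test segment is strictly later than its training segment, so no future data leaks into training.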
Here is a list of solutions built on
Qlib now supports reinforcement learning, a feature designed to model continuous investment decisions. This functionality assists investors in optimizing their trading strategies by learning from interactions with the environment to maximize some notion of cumulative reward.
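A common choice for the "notion of cumulative reward" is the discounted return, G = r_0 + γ·r_1 + γ²·r_2 + ..., where γ weights near-term rewards more heavily. The sketch below computes it for a list of per-step rewards. It is a generic illustration; the reward definition and discount factor used by any particular Qlib solution are not specified here.

```python
def discounted_return(rewards, gamma=0.99):
    """Compute r_0 + gamma * r_1 + gamma**2 * r_2 + ... for a reward sequence."""
    total = 0.0
    # Work backwards so each step is one multiply-add: G_t = r_t + gamma * G_{t+1}.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


if __name__ == "__main__":
    print(discounted_return([1.0, 1.0, 1.0], gamma=0.5))
```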
Here is a list of solutions built on
Qlib categorized by scenarios.
Dataset plays a very important role in Quant. Here is a list of the datasets built on
|Dataset||US Market||China Market|
Here is a tutorial to build dataset with
Your PR to build new Quant dataset is highly welcomed.
Qlib is highly customizable and a lot of its components are learnable.
The learnable components are instances of
Forecast Model and
Trading Agent. They are learned based on the
Learning Framework layer and then applied to multiple scenarios in
The learning framework leverages the
Workflow layer as well(e.g. sharing
Information Extractor, creating environments based on
Based on learning paradigms, they can be categorized into reinforcement learning and supervised learning.
Workflow layer to create environments. It's worth noting that
NestedExecutor is supported as well. This empowers users to optimize different levels of strategies/models/agents together (e.g. optimizing an order execution strategy for a specific portfolio management strategy).
If you want to have a quick glance at the most frequently used components of qlib, you can try notebooks here.
cd docs/
conda install sphinx sphinx_rtd_theme -y
# Otherwise, you can install them with pip
# pip install sphinx sphinx_rtd_theme
make html
You can also view the latest document online directly.
Qlib is in active and continuing development. Our plan is in the roadmap, which is managed as a GitHub project.
The data server of Qlib can be deployed in either Offline mode or Online mode. The default mode is offline mode.
In Offline mode, the data will be deployed locally.
In Online mode, the data will be deployed as a shared data service. The data and their cache will be shared by all the clients. The data retrieval performance is expected to improve due to a higher rate of cache hits. It will consume less disk space, too. The documents of the online mode can be found in Qlib-Server. The online mode can be deployed automatically with Azure CLI based scripts. The source code of the online data server can be found in the Qlib-Server repository.
The performance of data processing is important to data-driven methods like AI technologies. As an AI-oriented platform, Qlib provides a solution for data storage and data processing. To demonstrate the performance of Qlib data server, we compare it with several other data storage solutions.
We evaluate the performance of several storage solutions by finishing the same task, which creates a dataset (14 features/factors) from the basic OHLCV daily data of a stock market (800 stocks each day from 2007 to 2020). The task involves data queries and processing.
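| | HDF5 | MySQL | MongoDB | InfluxDB | Qlib -E -D | Qlib +E -D | Qlib +E +D |
| ------------- |:----:|:-----:|:-------:|:--------:|:----------:|:----------:|:----------:|
| Total (1CPU) (seconds) | 184.4±3.7 | 365.3±7.5 | 253.6±6.7 | 368.2±3.6 | 147.0±8.8 | 47.6±1.0 | 7.4±0.3 |
| Total (64CPU) (seconds) | | | | | | 8.8±0.6 | 4.2±0.2 |

+(-)E indicates with (out)
+(-)D indicates with (out)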
Most general-purpose databases take too much time to load data. After looking into the underlying implementation, we find that data go through too many layers of interfaces and unnecessary format transformations in general-purpose database solutions. Such overheads greatly slow down the data loading process. Qlib data are stored in a compact format, which is efficient to be combined into arrays for scientific computation.
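To run a comparison of this kind on your own storage backend, one option is to time the same task several times and report a summary statistic. The sketch below is an assumption about methodology rather than the benchmark code used here: `time_task` and `summarize` are hypothetical helpers, and the mean with sample standard deviation is one reasonable choice of summary.

```python
import statistics
import time


def time_task(task, repeats=5):
    """Run `task` several times and return the wall-clock durations in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        task()
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Format durations as 'mean±std' with one decimal place."""
    mean = statistics.mean(durations)
    std = statistics.stdev(durations) if len(durations) > 1 else 0.0
    return f"{mean:.1f}±{std:.1f}"


if __name__ == "__main__":
    # Stand-in workload; replace with the dataset-building task being measured.
    print(summarize(time_task(lambda: sum(range(100_000)))))
```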
Qlib, please create pull requests.
Join IM discussion groups: Gitter
Before we released Qlib as an open-source project on GitHub in Sep 2020, Qlib was an internal project in our group. Unfortunately, the internal commit history is not kept. Many members of our group have also contributed a lot to Qlib, including Ruihua Wang, Yinda Zhang, Haisu Yu, Shuyu Wang, Bochen Pang, and Dong Zhou. Special thanks to Dong Zhou for his initial version of Qlib.
This project welcomes contributions and suggestions.
Here are some code standards and development guidance for submitting a pull request.
Making contributions is not a hard thing. Solving an issue (maybe just answering a question raised in the issues list or on Gitter), fixing/issuing a bug, improving the documents and even fixing a typo are important contributions to Qlib.
For example, if you want to contribute to Qlib's document/code, you can follow the steps in the figure below.
If you don't know how to start to contribute, you can refer to the following examples.

| Type | Examples |
| -- | -- |
| Solving issues | Answer a question; issuing or fixing a bug |
| Docs | Improve docs quality; Fix a typo |
| Feature | Implement a requested feature like this; Refactor interfaces |
| Dataset | Add a dataset |
| Models | Implement a new model, some instructions to contribute models |

Issues labelled as good first issues are easy starting points for your contributions.
You can find some imperfect implementations in Qlib by running
rg 'TODO|FIXME' qlib
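If ripgrep is not available, a similar search can be approximated with the Python standard library. This is a rough sketch rather than an exact equivalent: `find_markers` is a hypothetical helper that only scans `*.py` files and does not honor `.gitignore` the way `rg` does.

```python
import re
from pathlib import Path

PATTERN = re.compile(r"TODO|FIXME")


def find_markers(root):
    """Yield (path, line_number, line) for each line under `root` matching PATTERN."""
    for path in sorted(Path(root).rglob("*.py")):
        try:
            text = path.read_text(encoding="utf-8", errors="replace")
        except OSError:
            continue  # unreadable entry (or a directory named *.py): skip it
        for number, line in enumerate(text.splitlines(), start=1):
            if PATTERN.search(line):
                yield path, number, line.strip()


if __name__ == "__main__":
    for path, number, line in find_markers("qlib"):
        print(f"{path}:{number}: {line}")
```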
If you would like to become one of Qlib's maintainers to contribute more (e.g. help merge PRs, triage issues), please contact us by email ([email protected]). We are glad to help upgrade your permissions.
Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the right to use your contribution. For details, visit https://cla.opensource.microsoft.com.
When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.