Pyraformer: Low-complexity Pyramidal Attention for Long-range Time Series Modeling and Forecasting

This is the PyTorch implementation of Pyraformer (Pyramidal Attention based Transformer) from the ICLR 2022 paper: Pyraformer: Low-Complexity Pyramidal Attention for Long-Range Time Series Modeling and Forecasting.


Figure 1. The network architecture of Pyraformer.

Pyramidal Attention

As demonstrated in Figure 2, we leverage a pyramidal graph to describe the temporal dependencies of the observed time series in a multiresolution fashion. We can decompose the pyramidal graph into two parts: the inter-scale and the intra-scale connections. The inter-scale connections form a C-ary tree, in which each parent has C children. For example, if we associate the finest scale of the pyramidal graph with hourly observations of the original time series, the nodes at coarser scales can be regarded as the daily, weekly, and even monthly features of the time series. As a consequence, the pyramidal graph offers a multiresolution representation of the original time series. Furthermore, it is easier to capture long-range dependencies (e.g., monthly dependence) at the coarser scales by simply connecting neighboring nodes via the intra-scale connections. In other words, the coarser scales describe long-range correlations far more parsimoniously than a model that operates only on the single, finest scale.


Figure 2. The Pyramidal Attention Mechanism.
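The pyramidal graph above can be made concrete as a boolean attention mask over all nodes at all scales. The sketch below is our own illustration, not the repo's exact implementation: the function name, the defaults, and the rule that each coarser scale has 1/C as many nodes are assumptions. It builds intra-scale neighbor links (within a small window) plus C-ary parent-child inter-scale links:

```python
import numpy as np

def get_pyramidal_mask(seq_len, c=4, num_scales=3, window=1):
    """Boolean mask: mask[i, j] is True iff node i may attend to node j.

    Nodes are laid out scale by scale, finest first. Each coarser scale
    has ceil-free `prev // c` nodes (an assumption for illustration).
    """
    sizes = [seq_len]
    for _ in range(num_scales - 1):
        sizes.append(max(1, sizes[-1] // c))
    starts = np.cumsum([0] + sizes)          # offset of each scale in the layout
    total = starts[-1]
    mask = np.zeros((total, total), dtype=bool)

    # Intra-scale connections: each node attends to neighbors within `window`.
    for s, n in enumerate(sizes):
        base = starts[s]
        for i in range(n):
            lo, hi = max(0, i - window), min(n, i + window + 1)
            mask[base + i, base + lo:base + hi] = True

    # Inter-scale connections: parent j at scale s+1 <-> its c children at scale s.
    for s in range(num_scales - 1):
        cb, pb = starts[s], starts[s + 1]
        for j in range(sizes[s + 1]):
            for i in range(j * c, min((j + 1) * c, sizes[s])):
                mask[pb + j, cb + i] = True
                mask[cb + i, pb + j] = True
    return mask
```

With `seq_len=16, c=4, num_scales=3`, the layout has 16 + 4 + 1 = 21 nodes, and every attention pattern in Figure 2 (neighbors, parents, children) corresponds to a True entry; everything else is masked out, which is where the sub-quadratic cost comes from.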


Requirements

  • Ubuntu OS
  • Python 3.7
  • pytorch 1.8.0
  • CUDA 11.1
  • TVM 0.8.0 (optional)

Dependencies can be installed by:

pip install -r requirements.txt

If you are using CUDA 11.1, you can use the compiled TVM runtime version included in our code to run PAM-TVM. Due to the short history length in the experiments, PAM-TVM does not provide a speed increase. If you want to compile our PAM-TVM kernel yourself, see here to compile TVM 0.8.0 first.

Data preparation

The four datasets (Electricity, Wind, ETT and App Flow) used in this paper can be downloaded from the following links:

The downloaded datasets can be put in the 'data' directory. For single-step forecasting, we preprocess Electricity, Wind and App Flow using the corresponding preprocessing scripts. You can also download the preprocessed data here and put it in the 'data' directory. The directory structure looks like:

    |-- data
        |-- elect
            |-- test_data_elect.npy
            |-- train_data_elect.npy
        |-- flow
        |-- wind
        |-- ETT
            |-- ETTh1.csv
            |-- ETTh2.csv
            |-- ETTm1.csv
            |-- ETTm2.csv
        |-- LD2011_2014.txt
        |-- synthetic.npy
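Assuming the preprocessed single-step splits (e.g. train_data_elect.npy) are plain NumPy arrays — the exact array layout is not documented here, so the shape below is a stand-in — a quick sanity check after downloading might look like:

```python
import numpy as np
from pathlib import Path

def load_split(path):
    """Load a preprocessed .npy split and report its shape and dtype."""
    arr = np.load(Path(path), allow_pickle=False)
    print(f"{Path(path).name}: shape={arr.shape}, dtype={arr.dtype}")
    return arr

# Stand-in demo: write a dummy split, then load it the same way you would
# load data/elect/train_data_elect.npy after downloading the real data.
np.save("demo_split.npy", np.random.randn(100, 24).astype(np.float32))
train = load_split("demo_split.npy")
```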

Where synthetic.npy is generated by running:



To perform long-range forecasting, run:

sh scripts/

To perform single step forecasting, run:

sh scripts/

The meaning of each command line argument is explained in the corresponding main scripts.


Evaluation can be done by adding the -eval option to the command line. We provide pretrained models here. The downloaded models should be put in the 'models' directory. The directory structure is as follows:

    |-- models
        |-- LongRange
            |-- elect
                |-- 168
                    |-- best_iter0.pth
                    |-- best_iter1.pth
                    |-- best_iter2.pth
                    |-- best_iter3.pth
                    |-- best_iter4.pth
                |-- 336
                |-- 720
            |-- ETTh1
            |-- ETTm1
        |-- SingleStep
            |-- elect
                |-- best_model.pth
            |-- flow
                |-- best_model.pth
            |-- wind
                |-- best_model.pth

Below are evaluation examples:

python -data ETTh1 -input_size 168 -predict_step 168 -n_head 6 -eval

python -data_path data/elect/ -dataset elect -eval


Citation

If you find this repository useful, please cite our paper:

@inproceedings{liu2022pyraformer,
  title={Pyraformer: Low-Complexity Pyramidal Attention for Long-Range Time Series Modeling and Forecasting},
  author={Liu, Shizhan and Yu, Hang and Liao, Cong and Li, Jianguo and Lin, Weiyao and Liu, Alex X and Dustdar, Schahram},
  booktitle={International Conference on Learning Representations},
  year={2022}
}

For any questions w.r.t. Pyraformer, please submit them to GitHub Issues.

If you are interested in business cooperation with us or in using our time-series forecasting products, please scan the QR code below and join our DingTalk customer group. A Chinese introduction to Pyraformer and its practical applications can be found at