
Pyramidal Convolution

This is the PyTorch implementation of our paper "Pyramidal Convolution: Rethinking Convolutional Neural Networks for Visual Recognition". Note that this is the code for image recognition on ImageNet; for semantic image segmentation/parsing, refer to the separate segmentation repository.

Pyramidal Convolution: PyConv

The models trained on ImageNet can be found here.

PyConv improves recognition accuracy over the ResNet baseline at every depth listed below (see the paper for details).

The accuracy on ImageNet (using the default training settings):

| Network | 50 layers | 101 layers | 152 layers |
|---|---|---|---|
| ResNet | 76.12% (model) | 78.00% (model) | 78.45% (model) |
| PyConvHGResNet | 78.48% (model) | 79.22% (model) | 79.36% (model) |
| PyConvResNet | 77.88% (model) | 79.01% (model) | 79.52% (model) |
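To make the comparison with the baseline concrete, the short sketch below computes the accuracy gain of each PyConv variant over ResNet at the same depth. The numbers are copied from the table above; the dictionary layout and the function name `gains_over_baseline` are only illustrative.

```python
# Accuracies (%) copied from the table above, keyed by network and depth.
ACCURACY = {
    "ResNet": {50: 76.12, 101: 78.00, 152: 78.45},
    "PyConvHGResNet": {50: 78.48, 101: 79.22, 152: 79.36},
    "PyConvResNet": {50: 77.88, 101: 79.01, 152: 79.52},
}


def gains_over_baseline(accuracy, baseline="ResNet"):
    """Return the gain in percentage points over the baseline, per depth."""
    base = accuracy[baseline]
    return {
        name: {depth: round(value - base[depth], 2) for depth, value in depths.items()}
        for name, depths in accuracy.items()
        if name != baseline
    }


if __name__ == "__main__":
    for name, per_depth in gains_over_baseline(ACCURACY).items():
        for depth, gain in per_depth.items():
            print(f"{name}-{depth}: +{gain:.2f} points over ResNet-{depth}")
```

With these figures the gain is largest at 50 layers and shrinks as the networks get deeper.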

The accuracy on ImageNet can be significantly improved using more complex training settings, for instance additional data augmentation (CutMix), a batch size of 1024, a learning rate of 0.4, a cosine scheduler over 300 epochs, and mixed precision to speed up training:

| Network | Test crop 224×224 | Test crop 320×320 |
|---|---|---|
| PyConvResNet-50 (+augment) | 79.44% | 80.59% (model) |
| PyConvResNet-101 (+augment) | 80.58% | 81.49% (model) |
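Two of the ingredients mentioned above, the cosine schedule and CutMix, can be sketched in plain Python. This is a minimal illustration, not the repository's training code. It assumes the learning rate is annealed once per epoch from 0.4 down to 0 with no warm-up, and it uses the commonly cited CutMix default of `alpha = 1.0`. The function names `cosine_lr` and `cutmix_box` are hypothetical.

```python
import math
import random


def cosine_lr(epoch, base_lr=0.4, total_epochs=300, min_lr=0.0):
    """Cosine-annealed learning rate for a 0-indexed epoch.

    Assumption: per-epoch annealing to min_lr without warm-up.
    """
    if not 0 <= epoch <= total_epochs:
        raise ValueError("epoch must lie in [0, total_epochs]")
    cosine = 0.5 * (1.0 + math.cos(math.pi * epoch / total_epochs))
    return min_lr + (base_lr - min_lr) * cosine


def cutmix_box(height, width, alpha=1.0, rng=random):
    """Sample a CutMix patch and the resulting label-mixing weight.

    Returns ((top, left, bottom, right), lam), where lam is the fraction of
    the image that keeps its original content (and original label weight).
    """
    lam = rng.betavariate(alpha, alpha)
    cut_ratio = math.sqrt(1.0 - lam)  # patch area is roughly (1 - lam) of the image
    cut_h, cut_w = int(height * cut_ratio), int(width * cut_ratio)
    center_y, center_x = rng.randrange(height), rng.randrange(width)
    top = max(center_y - cut_h // 2, 0)
    bottom = min(center_y + cut_h // 2, height)
    left = max(center_x - cut_w // 2, 0)
    right = min(center_x + cut_w // 2, width)
    # Recompute lam from the clipped box so it matches the pixels actually replaced.
    lam = 1.0 - (bottom - top) * (right - left) / (height * width)
    return (top, left, bottom, right), lam


if __name__ == "__main__":
    for epoch in (0, 75, 150, 225, 300):
        print(f"epoch {epoch:3d}: lr = {cosine_lr(epoch):.4f}")
    print(cutmix_box(224, 224, rng=random.Random(0)))
```

In a training loop, the pixels inside the sampled box would be copied from a second image in the batch, and the loss would be weighted by `lam` for the original label and `1 - lam` for the second label.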


Install PyTorch and prepare the ImageNet dataset by following the official PyTorch ImageNet training code.

A fast alternative (without the need to install PyTorch and other deep learning libraries) is to use NVIDIA-Docker; we used this container image.


To train a model (for instance, PyConvResNet with 50 layers) using DataParallel, run the command below. You also need to provide --result_path (the directory where the results and logs are saved) and --data (the path to the ImageNet dataset):

mkdir -p ${result_path}
python \
--data /your/path/to/ImageNet/dataset/ \
--result_path ${result_path} \
--arch pyconvresnet \
--model_depth 50
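For readers who want to see how the four flags in the command above fit together, here is a hypothetical sketch of an argument parser that accepts them and creates the results directory, mirroring the `mkdir -p` step. It is not the repository's actual parser: the flag names come from the command above, while the defaults, help strings, and the function names `build_parser` and `prepare_run` are assumptions made for illustration.

```python
import argparse
import os


def build_parser():
    """Hypothetical parser for the flags shown in the training command."""
    parser = argparse.ArgumentParser(description="Sketch of the training flags")
    parser.add_argument("--data", required=True,
                        help="path to the ImageNet dataset")
    parser.add_argument("--result_path", required=True,
                        help="directory where results and logs are saved")
    # Defaults below are assumptions, chosen to match the example command.
    parser.add_argument("--arch", default="pyconvresnet",
                        help="network architecture name")
    parser.add_argument("--model_depth", type=int, default=50,
                        help="number of layers, e.g. 50, 101 or 152")
    return parser


def prepare_run(argv):
    """Parse the flags and create the results directory (like `mkdir -p`)."""
    args = build_parser().parse_args(argv)
    os.makedirs(args.result_path, exist_ok=True)
    return args
```

Calling `prepare_run(["--data", "/your/path/to/ImageNet/dataset/", "--result_path", "results/pyconvresnet50"])` would return a namespace with `arch="pyconvresnet"` and `model_depth=50`, and would create the results directory if it does not exist yet.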

To train using Multi-processing Distributed Data Parallel Training, follow the instructions in the official PyTorch ImageNet training code.


If you find our work useful, please consider citing:

@article{duta2020pyramidal,
  author  = {Ionut Cosmin Duta and Li Liu and Fan Zhu and Ling Shao},
  title   = {Pyramidal Convolution: Rethinking Convolutional Neural Networks for Visual Recognition},
  journal = {arXiv preprint arXiv:2006.11538},
  year    = {2020},
}
