Project Name | Stars | Most Recent Commit | Open Issues | License | Language | Description
---|---|---|---|---|---|---
Segmentation_models | 4,095 | 5 months ago | 237 | mit | Python | Segmentation models with pretrained backbones, for Keras and TensorFlow Keras.
Resnest | 3,070 | 4 months ago | 58 | apache-2.0 | Python | ResNeSt: Split-Attention Networks.
Detectron.pytorch | 2,695 | 4 years ago | 121 | mit | Python | A PyTorch implementation of Detectron; supports both training from scratch and inference directly from pretrained Detectron weights.
Pvt | 1,289 | 5 months ago | 26 | apache-2.0 | Python | Official implementation of PVTv1 & PVTv2 (this repository).
Fastfcn | 636 | 3 years ago | 1 | other | Python | FastFCN: Rethinking Dilated Convolution in the Backbone for Semantic Segmentation.
Fishnet | 511 | 4 years ago | | | Python | Implementation of FishNet: A Versatile Backbone for Image, Region, and Pixel Level Prediction (NeurIPS 2018).
Centermask | 449 | 3 years ago | 12 | other | Python | CenterMask: Real-Time Anchor-Free Instance Segmentation (CVPR 2020).
Panoptic Deeplab | 333 | 2 years ago | 7 | apache-2.0 | Python | PyTorch re-implementation of the CVPR 2020 paper "Panoptic-DeepLab: A Simple, Strong, and Fast Baseline for Bottom-Up Panoptic Segmentation" (https://arxiv.org/abs/1911.10194).
Iccv2021 Papers With Code Demo | 212 | a year ago | 1 | | | ICCV 2021 papers with code.
Ds Net | 181 | a year ago | 6 | mit | Python | [CVPR 2021] Rank 1st in the public leaderboard of SemanticKITTI Panoptic Segmentation (2020-11-16).
The image is from Transformers: Revenge of the Fallen.
This repository contains the official implementation of PVTv1 & PVTv2 in image classification, object detection, and semantic segmentation tasks.
For classification configs & weights, see >>>here<<<. A minimal loading sketch follows the classification tables below.
Method | Size | Acc@1 (%) | #Params (M) |
---|---|---|---|
PVTv2-B0 | 224 | 70.5 | 3.7 |
PVTv2-B1 | 224 | 78.7 | 14.0 |
PVTv2-B2-Linear | 224 | 82.1 | 22.6 |
PVTv2-B2 | 224 | 82.0 | 25.4 |
PVTv2-B3 | 224 | 83.1 | 45.2 |
PVTv2-B4 | 224 | 83.6 | 62.6 |
PVTv2-B5 | 224 | 83.8 | 82.0 |
Method | Size | Acc@1 (%) | #Params (M) |
---|---|---|---|
PVT-Tiny | 224 | 75.1 | 13.2 |
PVT-Small | 224 | 79.8 | 24.5 |
PVT-Medium | 224 | 81.2 | 44.2 |
PVT-Large | 224 | 81.7 | 61.4 |
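For a quick sanity check of the classification backbones listed above, the sketch below loads a PVTv2 model through timm, which ships a port of these architectures. The model name `pvt_v2_b2`, the use of pretrained ImageNet-1K weights, and the 224x224 input are assumptions based on the tables, not instructions from this repository.

```python
# Minimal sketch (not from this repo): load a PVTv2 classification model via timm.
# "pvt_v2_b2" and the 224x224 input size are assumptions taken from the table above.
import torch
import timm

model = timm.create_model("pvt_v2_b2", pretrained=True)  # ImageNet-1K classification head
model.eval()

# Dummy forward pass at the 224x224 resolution reported in the table.
x = torch.randn(1, 3, 224, 224)
with torch.no_grad():
    logits = model(x)
print(logits.shape)  # torch.Size([1, 1000])
```

The other variants in the tables should follow the same pattern (e.g. `pvt_v2_b0` through `pvt_v2_b5`), assuming the corresponding names are registered in your timm version.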
For detection configs & weights, see >>>here<<<. An illustrative config sketch follows the detection tables below.
Method | Backbone | Pretrain | Lr schd | Aug | box AP | mask AP |
---|---|---|---|---|---|---|
RetinaNet | PVTv2-b0 | ImageNet-1K | 1x | No | 37.2 | - |
RetinaNet | PVTv2-b1 | ImageNet-1K | 1x | No | 41.2 | - |
RetinaNet | PVTv2-b2 | ImageNet-1K | 1x | No | 44.6 | - |
RetinaNet | PVTv2-b3 | ImageNet-1K | 1x | No | 45.9 | - |
RetinaNet | PVTv2-b4 | ImageNet-1K | 1x | No | 46.1 | - |
RetinaNet | PVTv2-b5 | ImageNet-1K | 1x | No | 46.2 | - |
Mask R-CNN | PVTv2-b0 | ImageNet-1K | 1x | No | 38.2 | 36.2 |
Mask R-CNN | PVTv2-b1 | ImageNet-1K | 1x | No | 41.8 | 38.8 |
Mask R-CNN | PVTv2-b2 | ImageNet-1K | 1x | No | 45.3 | 41.2 |
Mask R-CNN | PVTv2-b3 | ImageNet-1K | 1x | No | 47.0 | 42.5 |
Mask R-CNN | PVTv2-b4 | ImageNet-1K | 1x | No | 47.5 | 42.7 |
Mask R-CNN | PVTv2-b5 | ImageNet-1K | 1x | No | 47.4 | 42.5 |
Method | Backbone | Pretrain | Lr schd | Aug | box AP | mask AP |
---|---|---|---|---|---|---|
Cascade Mask R-CNN | PVTv2-b2-Linear | ImageNet-1K | 3x | Yes | 50.9 | 44.0 |
Cascade Mask R-CNN | PVTv2-b2 | ImageNet-1K | 3x | Yes | 51.1 | 44.4 |
ATSS | PVTv2-b2-Linear | ImageNet-1K | 3x | Yes | 48.9 | - |
ATSS | PVTv2-b2 | ImageNet-1K | 3x | Yes | 49.9 | - |
GFL | PVTv2-b2-Linear | ImageNet-1K | 3x | Yes | 49.2 | - |
GFL | PVTv2-b2 | ImageNet-1K | 3x | Yes | 50.2 | - |
Sparse R-CNN | PVTv2-b2-Linear | ImageNet-1K | 3x | Yes | 48.9 | - |
Sparse R-CNN | PVTv2-b2 | ImageNet-1K | 3x | Yes | 50.1 | - |
Detector | Backbone | Pretrain | Lr schd | box AP | mask AP |
---|---|---|---|---|---|
RetinaNet | PVT-Tiny | ImageNet-1K | 1x | 36.7 | - |
RetinaNet | PVT-Small | ImageNet-1K | 1x | 40.4 | - |
Mask R-CNN | PVT-Tiny | ImageNet-1K | 1x | 36.7 | 35.1 |
Mask R-CNN | PVT-Small | ImageNet-1K | 1x | 40.4 | 37.8 |
DETR | PVT-Small | ImageNet-1K | 50ep | 34.7 | - |
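The "1x" and "3x" learning-rate schedules in the tables above are the usual mmdetection conventions (roughly 12 and 36 COCO epochs). As a rough illustration of how a PVTv2 backbone can be swapped into one of these detectors, here is a hedged mmdetection-style config sketch; the backbone registry name, checkpoint placeholder, and `_base_` paths are assumptions rather than files from this repository, so use the detection configs linked above for the exact definitions.

```python
# Illustrative mmdetection-style override (a sketch, not a config from this repo):
# RetinaNet with its default ResNet-50 backbone replaced by PVTv2-b2.
_base_ = [
    '../_base_/models/retinanet_r50_fpn.py',
    '../_base_/datasets/coco_detection.py',
    '../_base_/schedules/schedule_1x.py',   # "1x" = 12-epoch COCO schedule
    '../_base_/default_runtime.py',
]

model = dict(
    backbone=dict(
        _delete_=True,             # drop the ResNet-50 settings inherited from _base_
        type='pvt_v2_b2',          # assumed registry name for the PVTv2-b2 backbone
        pretrained='<path-to-imagenet-1k-pretrained-pvt_v2_b2>',  # placeholder
    ),
    # The FPN neck must read the PVTv2-b2 stage widths instead of ResNet-50's.
    neck=dict(in_channels=[64, 128, 320, 512]),
)
```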
For segmentation configs & weights, see >>>here<<<.
For PVTv2 + segmentation, see >>>here<<<. A simplified sketch of an FPN-style segmentation head on a PVT feature pyramid follows the table below.
Method | Backbone | Pretrain | Iters | mIoU |
---|---|---|---|---|
Semantic FPN | PVT-Tiny | ImageNet-1K | 40K | 35.7 |
Semantic FPN | PVT-Small | ImageNet-1K | 40K | 39.8 |
Semantic FPN | PVT-Medium | ImageNet-1K | 40K | 41.6 |
Semantic FPN | PVT-Large | ImageNet-1K | 40K | 42.1 |
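To make the "Semantic FPN + PVT backbone" pairing above concrete, here is a small self-contained sketch of how an FPN-style segmentation head can consume the four-stage feature pyramid such a backbone produces. The channel widths (64, 128, 320, 512), the strides (4, 8, 16, 32), the 150-class output, and the single-conv head are assumptions for illustration; in the real configuration an FPN neck and a deeper head sit between the backbone and the logits.

```python
# Self-contained sketch (assumptions, not the repo's code): a toy FPN-style
# segmentation head fusing a PVT-like 4-stage feature pyramid into per-pixel logits.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinySemanticFPNHead(nn.Module):
    def __init__(self, in_channels=(64, 128, 320, 512), mid_channels=128, num_classes=150):
        super().__init__()
        # 1x1 projections bring every pyramid level to a common width.
        self.lateral = nn.ModuleList(nn.Conv2d(c, mid_channels, 1) for c in in_channels)
        self.classifier = nn.Conv2d(mid_channels, num_classes, 1)

    def forward(self, feats):
        # Upsample every level to the stride-4 resolution and sum the projections.
        target = feats[0].shape[-2:]
        fused = sum(
            F.interpolate(lat(f), size=target, mode="bilinear", align_corners=False)
            for lat, f in zip(self.lateral, feats)
        )
        return self.classifier(fused)  # logits at 1/4 of the input resolution

# Fake pyramid features for a 512x512 image, shaped like a PVT backbone's outputs.
feats = [torch.randn(1, c, 512 // s, 512 // s)
         for c, s in zip((64, 128, 320, 512), (4, 8, 16, 32))]
print(TinySemanticFPNHead()(feats).shape)  # torch.Size([1, 150, 128, 128])
```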
Polyp-PVT: Polyp Segmentation with Pyramid Vision Transformers. pdf | code
Masked Vision-Language Transformer in Fashion. pdf | code
This repository is released under the Apache 2.0 license as found in the LICENSE file.
If you use this code for a paper, please cite:
PVTv1
@inproceedings{wang2021pyramid,
title={Pyramid vision transformer: A versatile backbone for dense prediction without convolutions},
author={Wang, Wenhai and Xie, Enze and Li, Xiang and Fan, Deng-Ping and Song, Kaitao and Liang, Ding and Lu, Tong and Luo, Ping and Shao, Ling},
booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
pages={568--578},
year={2021}
}
PVTv2
@article{wang2021pvtv2,
title={Pvtv2: Improved baselines with pyramid vision transformer},
author={Wang, Wenhai and Xie, Enze and Li, Xiang and Fan, Deng-Ping and Song, Kaitao and Liang, Ding and Lu, Tong and Luo, Ping and Shao, Ling},
journal={Computational Visual Media},
volume={8},
number={3},
pages={1--10},
year={2022},
publisher={Springer}
}
This repo is currently maintained by Wenhai Wang (@whai362), Enze Xie (@xieenze), and Zhe Chen (@czczup).