Project Name | Stars | Downloads | Repos Using This | Packages Using This | Most Recent Commit | Total Releases | Latest Release | Open Issues | License | Language |
---|---|---|---|---|---|---|---|---|---|---|
Deep Learning For Image Processing | 14,599 | 3 months ago | 28 | gpl-3.0 | Python | |||||
deep learning for image processing including classification and object-detection etc. | ||||||||||
Segmentation_models.pytorch | 7,404 | 2 | 34 | 2 days ago | 10 | November 18, 2021 | 28 | mit | Python | |
Segmentation models with pretrained backbones. PyTorch. | ||||||||||
Pytorch Unet | 6,910 | 6 days ago | 56 | gpl-3.0 | Python | |||||
PyTorch implementation of the U-Net for image semantic segmentation with high quality images | ||||||||||
Pytorch Cnn Visualizations | 6,883 | 8 months ago | 3 | mit | Python | |||||
Pytorch implementation of convolutional neural network visualization techniques | ||||||||||
Mmsegmentation | 5,931 | 2 | 15 hours ago | 30 | July 01, 2022 | 433 | apache-2.0 | Jupyter Notebook | ||
OpenMMLab Semantic Segmentation Toolbox and Benchmark. | ||||||||||
Gluon Cv | 5,422 | 15 | 44 | 5 months ago | 1,514 | July 07, 2022 | 61 | apache-2.0 | Python | |
Gluon CV Toolkit | ||||||||||
Yolact | 4,432 | 10 months ago | 366 | mit | Python | |||||
A simple, fully convolutional model for real-time instance segmentation. | ||||||||||
Awesome Domain Adaptation | 4,326 | 2 months ago | 1 | mit | ||||||
A collection of AWESOME things about domian adaptation | ||||||||||
Nnunet | 3,720 | 6 | 8 days ago | 1 | May 28, 2021 | 307 | apache-2.0 | Python | ||
Catalyst | 3,106 | 19 | 10 | 2 months ago | 108 | April 29, 2022 | 5 | apache-2.0 | Python | |
Accelerated deep learning R&D |
Python library with Neural Networks for Image
Segmentation based on PyTorch.
The main features of this library are:
Visit Read The Docs Project Page or read following README to know more about Segmentation Models Pytorch (SMP for short) library
Segmentation model is just a PyTorch nn.Module, which can be created as easy as:
import segmentation_models_pytorch as smp
model = smp.Unet(
encoder_name="resnet34", # choose encoder, e.g. mobilenet_v2 or efficientnet-b7
encoder_weights="imagenet", # use `imagenet` pre-trained weights for encoder initialization
in_channels=1, # model input channels (1 for gray-scale images, 3 for RGB, etc.)
classes=3, # model output channels (number of classes in your dataset)
)
All encoders have pretrained weights. Preparing your data the same way as during weights pre-training may give you better results (higher metric score and faster convergence). It is not necessary in case you train the whole model, not only decoder.
from segmentation_models_pytorch.encoders import get_preprocessing_fn
preprocess_input = get_preprocessing_fn('resnet18', pretrained='imagenet')
Congratulations! You are done! Now you can train your model with your favorite framework!
The following is a list of supported encoders in the SMP. Select the appropriate family of encoders and click to expand the table and select a specific encoder and its pre-trained weights (encoder_name
and encoder_weights
parameters).
Encoder | Weights | Params, M |
---|---|---|
resnet18 | imagenet / ssl / swsl | 11M |
resnet34 | imagenet | 21M |
resnet50 | imagenet / ssl / swsl | 23M |
resnet101 | imagenet | 42M |
resnet152 | imagenet | 58M |
Encoder | Weights | Params, M |
---|---|---|
resnext50_32x4d | imagenet / ssl / swsl | 22M |
resnext101_32x4d | ssl / swsl | 42M |
resnext101_32x8d | imagenet / instagram / ssl / swsl | 86M |
resnext101_32x16d | instagram / ssl / swsl | 191M |
resnext101_32x32d | 466M | |
resnext101_32x48d | 826M |
Encoder | Weights | Params, M |
---|---|---|
timm-resnest14d | imagenet | 8M |
timm-resnest26d | imagenet | 15M |
timm-resnest50d | imagenet | 25M |
timm-resnest101e | imagenet | 46M |
timm-resnest200e | imagenet | 68M |
timm-resnest269e | imagenet | 108M |
timm-resnest50d_4s2x40d | imagenet | 28M |
timm-resnest50d_1s4x24d | imagenet | 23M |
Encoder | Weights | Params, M |
---|---|---|
timm-res2net50_26w_4s | imagenet | 23M |
timm-res2net101_26w_4s | imagenet | 43M |
timm-res2net50_26w_6s | imagenet | 35M |
timm-res2net50_26w_8s | imagenet | 46M |
timm-res2net50_48w_2s | imagenet | 23M |
timm-res2net50_14w_8s | imagenet | 23M |
timm-res2next50 | imagenet | 22M |
Encoder | Weights | Params, M |
---|---|---|
timm-regnetx_002 | imagenet | 2M |
timm-regnetx_004 | imagenet | 4M |
timm-regnetx_006 | imagenet | 5M |
timm-regnetx_008 | imagenet | 6M |
timm-regnetx_016 | imagenet | 8M |
timm-regnetx_032 | imagenet | 14M |
timm-regnetx_040 | imagenet | 20M |
timm-regnetx_064 | imagenet | 24M |
timm-regnetx_080 | imagenet | 37M |
timm-regnetx_120 | imagenet | 43M |
timm-regnetx_160 | imagenet | 52M |
timm-regnetx_320 | imagenet | 105M |
timm-regnety_002 | imagenet | 2M |
timm-regnety_004 | imagenet | 3M |
timm-regnety_006 | imagenet | 5M |
timm-regnety_008 | imagenet | 5M |
timm-regnety_016 | imagenet | 10M |
timm-regnety_032 | imagenet | 17M |
timm-regnety_040 | imagenet | 19M |
timm-regnety_064 | imagenet | 29M |
timm-regnety_080 | imagenet | 37M |
timm-regnety_120 | imagenet | 49M |
timm-regnety_160 | imagenet | 80M |
timm-regnety_320 | imagenet | 141M |
Encoder | Weights | Params, M |
---|---|---|
timm-gernet_s | imagenet | 6M |
timm-gernet_m | imagenet | 18M |
timm-gernet_l | imagenet | 28M |
Encoder | Weights | Params, M |
---|---|---|
senet154 | imagenet | 113M |
se_resnet50 | imagenet | 26M |
se_resnet101 | imagenet | 47M |
se_resnet152 | imagenet | 64M |
se_resnext50_32x4d | imagenet | 25M |
se_resnext101_32x4d | imagenet | 46M |
Encoder | Weights | Params, M |
---|---|---|
timm-skresnet18 | imagenet | 11M |
timm-skresnet34 | imagenet | 21M |
timm-skresnext50_32x4d | imagenet | 25M |
Encoder | Weights | Params, M |
---|---|---|
densenet121 | imagenet | 6M |
densenet169 | imagenet | 12M |
densenet201 | imagenet | 18M |
densenet161 | imagenet | 26M |
Encoder | Weights | Params, M |
---|---|---|
inceptionresnetv2 | imagenet / imagenet+background | 54M |
inceptionv4 | imagenet / imagenet+background | 41M |
xception | imagenet | 22M |
Encoder | Weights | Params, M |
---|---|---|
efficientnet-b0 | imagenet | 4M |
efficientnet-b1 | imagenet | 6M |
efficientnet-b2 | imagenet | 7M |
efficientnet-b3 | imagenet | 10M |
efficientnet-b4 | imagenet | 17M |
efficientnet-b5 | imagenet | 28M |
efficientnet-b6 | imagenet | 40M |
efficientnet-b7 | imagenet | 63M |
timm-efficientnet-b0 | imagenet / advprop / noisy-student | 4M |
timm-efficientnet-b1 | imagenet / advprop / noisy-student | 6M |
timm-efficientnet-b2 | imagenet / advprop / noisy-student | 7M |
timm-efficientnet-b3 | imagenet / advprop / noisy-student | 10M |
timm-efficientnet-b4 | imagenet / advprop / noisy-student | 17M |
timm-efficientnet-b5 | imagenet / advprop / noisy-student | 28M |
timm-efficientnet-b6 | imagenet / advprop / noisy-student | 40M |
timm-efficientnet-b7 | imagenet / advprop / noisy-student | 63M |
timm-efficientnet-b8 | imagenet / advprop | 84M |
timm-efficientnet-l2 | noisy-student | 474M |
timm-efficientnet-lite0 | imagenet | 4M |
timm-efficientnet-lite1 | imagenet | 5M |
timm-efficientnet-lite2 | imagenet | 6M |
timm-efficientnet-lite3 | imagenet | 8M |
timm-efficientnet-lite4 | imagenet | 13M |
Encoder | Weights | Params, M |
---|---|---|
mobilenet_v2 | imagenet | 2M |
timm-mobilenetv3_large_075 | imagenet | 1.78M |
timm-mobilenetv3_large_100 | imagenet | 2.97M |
timm-mobilenetv3_large_minimal_100 | imagenet | 1.41M |
timm-mobilenetv3_small_075 | imagenet | 0.57M |
timm-mobilenetv3_small_100 | imagenet | 0.93M |
timm-mobilenetv3_small_minimal_100 | imagenet | 0.43M |
Encoder | Weights | Params, M |
---|---|---|
dpn68 | imagenet | 11M |
dpn68b | imagenet+5k | 11M |
dpn92 | imagenet+5k | 34M |
dpn98 | imagenet | 58M |
dpn107 | imagenet+5k | 84M |
dpn131 | imagenet | 76M |
Encoder | Weights | Params, M |
---|---|---|
vgg11 | imagenet | 9M |
vgg11_bn | imagenet | 9M |
vgg13 | imagenet | 9M |
vgg13_bn | imagenet | 9M |
vgg16 | imagenet | 14M |
vgg16_bn | imagenet | 14M |
vgg19 | imagenet | 20M |
vgg19_bn | imagenet | 20M |
Backbone from SegFormer pretrained on Imagenet! Can be used with other decoders from package, you can combine Mix Vision Transformer with Unet, FPN and others!
Limitations:
Encoder | Weights | Params, M |
---|---|---|
mit_b0 | imagenet | 3M |
mit_b1 | imagenet | 13M |
mit_b2 | imagenet | 24M |
mit_b3 | imagenet | 44M |
mit_b4 | imagenet | 60M |
mit_b5 | imagenet | 81M |
Apple's "sub-one-ms" Backbone pretrained on Imagenet! Can be used with all decoders.
Note: In the official github repo the s0 variant has additional num_conv_branches, leading to more params than s1.
Encoder | Weights | Params, M |
---|---|---|
mobileone_s0 | imagenet | 4.6M |
mobileone_s1 | imagenet | 4.0M |
mobileone_s2 | imagenet | 6.5M |
mobileone_s3 | imagenet | 8.8M |
mobileone_s4 | imagenet | 13.6M |
* ssl
, swsl
- semi-supervised and weakly-supervised learning on ImageNet (repo).
Pytorch Image Models (a.k.a. timm) has a lot of pretrained models and interface which allows using these models as encoders in smp, however, not all models are supported
features_only
functionality implemented that is required for encoderTotal number of supported encoders: 549
model.encoder
- pretrained backbone to extract features of different spatial resolutionmodel.decoder
- depends on models architecture (Unet
/Linknet
/PSPNet
/FPN
)model.segmentation_head
- last block to produce required number of mask channels (include also optional upsampling and activation)model.classification_head
- optional block which create classification head on top of encodermodel.forward(x)
- sequentially pass x
through model`s encoder, decoder and segmentation head (and classification head if specified)Input channels parameter allows you to create models, which process tensors with arbitrary number of channels.
If you use pretrained weights from imagenet - weights of first convolution will be reused. For
1-channel case it would be a sum of weights of first convolution layer, otherwise channels would be
populated with weights like new_weight[:, i] = pretrained_weight[:, i % 3]
and than scaled with new_weight * 3 / new_in_channels
.
model = smp.FPN('resnet34', in_channels=1)
mask = model(torch.ones([1, 1, 64, 64]))
All models support aux_params
parameters, which is default set to None
.
If aux_params = None
then classification auxiliary output is not created, else
model produce not only mask
, but also label
output with shape NC
.
Classification head consists of GlobalPooling->Dropout(optional)->Linear->Activation(optional) layers, which can be
configured by aux_params
as follows:
aux_params=dict(
pooling='avg', # one of 'avg', 'max'
dropout=0.5, # dropout ratio, default is None
activation='sigmoid', # activation function, default is None
classes=4, # define number of output labels
)
model = smp.Unet('resnet34', classes=4, aux_params=aux_params)
mask, label = model(x)
Depth parameter specify a number of downsampling operations in encoder, so you can make
your model lighter if specify smaller depth
.
model = smp.Unet('resnet34', encoder_depth=4)
PyPI version:
$ pip install segmentation-models-pytorch
Latest version from source:
$ pip install git+https://github.com/qubvel/segmentation_models.pytorch
Segmentation Models
package is widely used in the image segmentation competitions.
Here you can find competitions, names of the winners and links to their solutions.
make install_dev # create .venv, install SMP in dev mode
make all # run flake8, black, tests
make table # generate table with encoders and print to stdout
@misc{Iakubovskii:2019,
Author = {Pavel Iakubovskii},
Title = {Segmentation Models Pytorch},
Year = {2019},
Publisher = {GitHub},
Journal = {GitHub repository},
Howpublished = {\url{https://github.com/qubvel/segmentation_models.pytorch}}
}
Project is distributed under MIT License