Once for All: Train One Network and Specialize it for Efficient Deployment [arXiv] [Slides] [Video]

@inproceedings{cai2020once,
  title={Once for All: Train One Network and Specialize it for Efficient Deployment},
  author={Han Cai and Chuang Gan and Tianzhe Wang and Zhekai Zhang and Song Han},
  booktitle={International Conference on Learning Representations},
  year={2020}
}

[News] The hands-on tutorial of OFA is released!

[News] OFA is available via pip! Run pip install ofa to install the whole OFA codebase.

[News] First place in the 4th Low-Power Computer Vision Challenge, in both the classification and detection tracks.

[News] First place in the 3rd Low-Power Computer Vision Challenge, DSP track at ICCV’19 using the Once-for-all Network.

Train once, specialize for many deployment scenarios

80% top-1 ImageNet accuracy under the mobile setting

Consistently outperforms MobileNetV3 on diverse hardware platforms

How to use / evaluate OFA Specialized Networks


""" OFA Specialized Networks.
Example: net, image_size = ofa_specialized('[email protected][email protected][email protected]', pretrained=True)
"""
from ofa.model_zoo import ofa_specialized
# net_id is the name of one of the specialized sub-nets listed in the table below
net, image_size = ofa_specialized(net_id, pretrained=True)

If the above script fails to download the weights, download them manually from Google Drive and put them under $HOME/.torch/ofa_specialized/.
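
To confirm that manually downloaded files ended up where the note above says they should, a small check like the following may help. This is only a sketch: the helper names `manual_checkpoint_dir` and `list_downloaded` are hypothetical and are not part of the ofa package, and the sketch assumes `$HOME` is the directory returned by `os.path.expanduser("~")`.

```python
import os


def manual_checkpoint_dir(kind="ofa_specialized", home=None):
    """Return the directory where manually downloaded files are expected.

    `kind` is the sub-directory under $HOME/.torch/ (hypothetical helper,
    not an ofa API).
    """
    home = home or os.path.expanduser("~")
    return os.path.join(home, ".torch", kind)


def list_downloaded(kind="ofa_specialized", home=None):
    """List files already present in that directory (empty list if it is missing)."""
    path = manual_checkpoint_dir(kind, home)
    return sorted(os.listdir(path)) if os.path.isdir(path) else []


if __name__ == "__main__":
    print(manual_checkpoint_dir())
    print(list_downloaded())
```

Passing `kind="ofa_nets"` checks the directory used for the full OFA networks instead.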


python --path 'Your path to imagenet' --net [email protected][email protected][email protected]
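
The table below reports Top-1 and Top-5 accuracy. For readers unfamiliar with these metrics, here is a minimal pure-Python sketch of how top-k accuracy is commonly computed. It is an illustration only and is not taken from the OFA evaluation script; the function name `topk_accuracy` and the toy scores are made up for this example.

```python
def topk_accuracy(scores, labels, k=1):
    """Percentage of samples whose true label is among the k highest scores."""
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        hits += label in ranked[:k]
    return 100.0 * hits / len(labels)


# Toy data: 4 samples, 3 classes (illustrative values only).
scores = [
    [0.10, 0.70, 0.20],  # highest score: class 1
    [0.50, 0.30, 0.20],  # highest score: class 0
    [0.20, 0.30, 0.50],  # highest score: class 2
    [0.40, 0.35, 0.25],  # highest score: class 0
]
labels = [1, 1, 2, 2]

if __name__ == "__main__":
    print(topk_accuracy(scores, labels, k=1))
    print(topk_accuracy(scores, labels, k=2))
```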

OFA Specialized Sub-nets Top-1 (%) Top-5 (%) #Params #MACs
[email protected][email protected][email protected] 80.0 94.9 9.1M 595M
[email protected][email protected][email protected] 79.6 94.8 9.1M 482M
[email protected][email protected][email protected] 79.1 94.5 8.4M 389M
[email protected][email protected][email protected] 76.4 93.0 5.8M 230M
[email protected][email protected][email protected] 74.7 92.0 5.8M 151M
[email protected][email protected][email protected] 73.0 91.1 5.0M 103M
[email protected][email protected][email protected] 71.1 89.7 4.1M 74M
Samsung S7 Edge
[email protected][email protected][email protected] 76.3 92.9 6.4M 219M
[email protected][email protected][email protected] 74.7 92.0 4.6M 145M
[email protected][email protected][email protected] 73.1 91.0 4.7M 96M
[email protected][email protected][email protected] 70.5 89.5 3.8M 66M
Samsung Note8
[email protected][email protected][email protected] 76.1 92.7 5.3M 220M
[email protected][email protected][email protected] 74.9 92.1 6.0M 164M
[email protected][email protected][email protected] 72.8 90.8 4.6M 101M
[email protected][email protected][email protected] 70.4 89.3 4.3M 67M
Samsung Note10
[email protected][email protected][email protected] 80.2 95.1 9.1M 743M
[email protected][email protected][email protected] 79.7 94.9 9.1M 554M
[email protected][email protected][email protected] 79.3 94.5 9.0M 457M
[email protected][email protected][email protected] 78.4 94.2 7.5M 339M
[email protected][email protected][email protected] 76.6 93.1 5.9M 237M
[email protected][email protected][email protected] 75.5 92.3 4.9M 163M
[email protected][email protected][email protected] 73.6 91.2 4.3M 110M
[email protected][email protected][email protected] 71.4 89.8 3.8M 79M
Google Pixel1
[email protected][email protected][email protected] 80.1 95.0 9.2M 642M
[email protected][email protected][email protected] 79.8 94.9 9.2M 593M
[email protected][email protected][email protected] 78.7 94.2 8.2M 356M
[email protected][email protected][email protected] 76.9 93.3 5.8M 230M
[email protected][email protected][email protected] 74.9 92.1 6.0M 162M
[email protected][email protected][email protected] 73.3 91.0 5.2M 109M
[email protected][email protected][email protected] 71.4 89.8 4.3M 77M
Google Pixel2
[email protected][email protected][email protected] 75.8 92.7 5.8M 208M
[email protected][email protected][email protected] 74.7 91.9 4.7M 166M
[email protected][email protected][email protected] 73.4 91.1 5.1M 113M
[email protected][email protected][email protected] 71.5 90.1 4.1M 79M
1080ti GPU (Batch Size 64)
[email protected][email protected][email protected] 76.4 93.0 6.5M 397M
[email protected][email protected][email protected] 75.3 92.4 5.2M 313M
[email protected][email protected][email protected] 73.8 91.3 6.0M 226M
[email protected][email protected][email protected] 72.6 90.9 5.9M 165M
V100 GPU (Batch Size 64)
[email protected][email protected][email protected] 76.1 92.7 6.2M 352M
[email protected][email protected][email protected] 75.3 92.4 5.2M 313M
[email protected][email protected][email protected] 73.0 91.1 4.9M 179M
[email protected][email protected][email protected] 71.6 90.3 5.2M 141M
Jetson TX2 GPU (Batch Size 16)
[email protected][email protected][email protected] 75.8 92.7 6.2M 349M
[email protected][email protected][email protected] 75.4 92.4 5.2M 313M
[email protected][email protected][email protected] 72.9 91.1 4.9M 179M
[email protected][email protected][email protected] 70.3 89.4 4.3M 121M
Intel Xeon CPU with MKL-DNN (Batch Size 1)
[email protected][email protected][email protected] 75.7 92.6 4.9M 365M
[email protected][email protected][email protected] 74.6 92.0 4.9M 301M
[email protected][email protected][email protected] 72.0 90.4 4.4M 160M
[email protected][email protected][email protected] 71.1 89.9 4.2M 143M
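
One way to read the table is as a lookup: given a compute budget, pick the most accurate listed sub-net that fits. The sketch below does this for the first seven rows of the table above, using only their Top-1 and #MACs columns. It is a plain table lookup and not the search procedure used to produce the specialized sub-nets; the function name `best_under_budget` is hypothetical.

```python
# (Top-1 %, #MACs in millions), copied from the first seven rows of the table.
ROWS = [
    (80.0, 595),
    (79.6, 482),
    (79.1, 389),
    (76.4, 230),
    (74.7, 151),
    (73.0, 103),
    (71.1, 74),
]


def best_under_budget(rows, max_macs_m):
    """Return the (top1, macs) row with the highest Top-1 within the MACs budget.

    Returns None when no listed row fits the budget.
    """
    feasible = [row for row in rows if row[1] <= max_macs_m]
    return max(feasible, key=lambda row: row[0]) if feasible else None


if __name__ == "__main__":
    print(best_under_budget(ROWS, 400))
```

The same lookup works for the latency-oriented device groups if the second element of each tuple is replaced by a measured latency, which the table itself does not list as a column.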

How to use / evaluate OFA Networks


""" OFA Networks.
    Example: ofa_network = ofa_net('ofa_mbv3_d234_e346_k357_w1.0', pretrained=True)
"""
from ofa.model_zoo import ofa_net
ofa_network = ofa_net(net_id, pretrained=True)

# Randomly sample a sub-network from the OFA network
ofa_network.sample_active_subnet()
random_subnet = ofa_network.get_active_subnet(preserve_weight=True)

# Manually set the sub-network (kernel size, expand ratio, depth)
ofa_network.set_active_subnet(ks=7, e=6, d=4)
manual_subnet = ofa_network.get_active_subnet(preserve_weight=True)

If the above script fails to download the weights, download them manually from Google Drive and put them under $HOME/.torch/ofa_nets/.
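
The network id `ofa_mbv3_d234_e346_k357_w1.0` appears to encode the search space: the values passed to `set_active_subnet(ks=7, e=6, d=4)` each occur in the matching group of digits. The sketch below parses an id under that assumption and estimates how many sub-networks the space contains. Both helper names are hypothetical, the digit-per-choice reading is an assumption, and the default of 5 units per network is an assumption taken from the paper's description rather than from this README.

```python
import re


def parse_ofa_net_id(net_id):
    """Split an id like 'ofa_mbv3_d234_e346_k357_w1.0' into its choice lists.

    Assumes each digit after d/e/k is one allowed value (hypothetical reading).
    """
    m = re.fullmatch(r"ofa_(\w+?)_d(\d+)_e(\d+)_k(\d+)_w([\d.]+)", net_id)
    if m is None:
        raise ValueError(f"unrecognized net id: {net_id!r}")
    backbone, depth, expand, kernel, width = m.groups()
    return {
        "backbone": backbone,
        "depth": [int(c) for c in depth],
        "expand": [int(c) for c in expand],
        "kernel": [int(c) for c in kernel],
        "width": float(width),
    }


def count_subnets(cfg, num_units=5):
    """Count sub-networks: each layer picks a (kernel, expand) pair,
    each unit picks a depth, and units are chosen independently."""
    per_layer = len(cfg["kernel"]) * len(cfg["expand"])
    per_unit = sum(per_layer ** d for d in cfg["depth"])
    return per_unit ** num_units


if __name__ == "__main__":
    cfg = parse_ofa_net_id("ofa_mbv3_d234_e346_k357_w1.0")
    print(cfg)
    print(f"{count_subnets(cfg):.2e}")
```

Under these assumptions the count is on the order of 10^19, which gives a sense of why sub-networks are sampled or selected rather than enumerated.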


python --path 'Your path to imagenet' --net ofa_mbv3_d234_e346_k357_w1.0

How to train OFA Networks

mpirun -np 32 -H <server1_ip>:8,<server2_ip>:8,<server3_ip>:8,<server4_ip>:8 \
    -bind-to none -map-by slot \


horovodrun -np 32 -H <server1_ip>:8,<server2_ip>:8,<server3_ip>:8,<server4_ip>:8 \
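
Both launch commands above pair `-np 32` with four hosts of 8 slots each, so the process count has to equal the sum of the slots. The sketch below builds the `-np` and `-H` arguments from a host list so the two stay consistent. The helper names and the example IP addresses are hypothetical, and the training script to append is not shown because it is not given above.

```python
def host_arg(hosts, slots_per_host=8):
    """Format hosts as 'host1:8,host2:8,...' for the -H flag."""
    return ",".join(f"{host}:{slots_per_host}" for host in hosts)


def total_processes(hosts, slots_per_host=8):
    """Total number of processes, i.e. the value for -np."""
    return len(hosts) * slots_per_host


def launch_prefix(launcher, hosts, slots_per_host=8):
    """Return the launcher part of the command as an argument list."""
    return [
        launcher,
        "-np", str(total_processes(hosts, slots_per_host)),
        "-H", host_arg(hosts, slots_per_host),
    ]


if __name__ == "__main__":
    # Placeholder addresses; substitute the real server IPs.
    servers = ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"]
    print(" ".join(launch_prefix("horovodrun", servers)))
```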

Introduction Video

Watch the video

Hands-on Tutorial Video

Watch the video


Requirements

  • Python 3.6+
  • PyTorch 1.4.0+
  • ImageNet Dataset
  • Horovod

Related work on automated and efficient deep learning:

ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware (ICLR’19)

AutoML for Architecting Efficient and Specialized Neural Networks (IEEE Micro)

AMC: AutoML for Model Compression and Acceleration on Mobile Devices (ECCV’18)

HAQ: Hardware-Aware Automated Quantization (CVPR’19, oral)

