Awesome Open Source

MMDet to tensorrt

This project aims to convert MMDetection models to TensorRT models end to end. It focuses on object detection for now; mask support is experimental.


  • fp16
  • int8 (experimental)
  • batched input
  • dynamic input shape
  • combination of different modules
  • deepstream support

Any advice, bug reports, and stars are welcome.


This project is released under the Apache 2.0 license.



Set the environment variable (in ~/.bashrc):

export AMIRSTAN_LIBRARY_PATH=${amirstan_plugin_root}/build/lib
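
Before converting, it can help to confirm that the variable is actually visible to Python. The following is a minimal sketch: `check_plugin_path` is a hypothetical helper, not part of mmdet2trt, and it only checks that AMIRSTAN_LIBRARY_PATH is set and points to an existing directory.

```python
import os


def check_plugin_path(env=None):
    """Return the plugin directory if AMIRSTAN_LIBRARY_PATH looks usable.

    Hypothetical helper for illustration; it does not load any plugin.
    """
    env = os.environ if env is None else env
    path = env.get("AMIRSTAN_LIBRARY_PATH", "")
    if not path:
        raise RuntimeError("AMIRSTAN_LIBRARY_PATH is not set")
    if not os.path.isdir(path):
        raise RuntimeError(f"AMIRSTAN_LIBRARY_PATH is not a directory: {path}")
    return path
```

Calling `check_plugin_path()` with no arguments reads the real environment; passing a dict makes the check easy to try without touching ~/.bashrc.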



git clone
cd mmdetection-to-tensorrt
python setup.py develop


Build the docker image (note that TensorRT 7.0 might have a memory leak, so it is better to upgrade to 7.1+):

# cuda10.2 tensorrt7.0 pytorch1.6
sudo docker build -t mmdet2trt_docker:v1.0 docker/

Run (will show the help for the CLI entrypoint)

sudo docker run --gpus all -it --rm -v ${your_data_path}:${bind_path} mmdet2trt_docker:v1.0

Or, if you want to open a terminal inside the container:

sudo docker run --gpus all -it --rm -v ${your_data_path}:${bind_path} --entrypoint bash mmdet2trt_docker:v1.0

Example conversion:

sudo docker run --gpus all -it --rm -v ${your_data_path}:${bind_path} mmdet2trt_docker:v1.0 ${bind_path}/ ${bind_path}/checkpoint.pth ${bind_path}/output.trt


How to create a TensorRT model from an MMDetection model (conversion might take a few minutes and may print some warnings). Details can be found in



Run mmdet2trt -h for help on optional arguments.


import torch
from mmdet2trt import mmdet2trt
opt_shape_param = [[
    [1,3,320,320],      # min shape
    [1,3,800,1344],     # optimize shape
    [1,3,1344,1344],    # max shape
]]
max_workspace_size = 1<<30    # some modules and tactics need a large workspace
trt_model = mmdet2trt(cfg_path, weight_path, opt_shape_param=opt_shape_param, fp16_mode=True, max_workspace_size=max_workspace_size)
torch.save(trt_model.state_dict(), save_path)
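
The min, optimize, and max shapes describe a dynamic-shape range, and TensorRT expects min <= optimize <= max in every dimension. A small sketch of that check is shown here; `validate_shape_range` is a hypothetical helper written for illustration and is not part of mmdet2trt.

```python
def validate_shape_range(min_shape, opt_shape, max_shape):
    """Check that min <= opt <= max holds in every dimension of one input.

    Hypothetical helper; raises ValueError on the first violation.
    """
    if not (len(min_shape) == len(opt_shape) == len(max_shape)):
        raise ValueError("min, opt and max shapes must have the same rank")
    for dim, (lo, opt, hi) in enumerate(zip(min_shape, opt_shape, max_shape)):
        if not lo <= opt <= hi:
            raise ValueError(f"dimension {dim}: expected {lo} <= {opt} <= {hi}")
    return True


# One [min, opt, max] triple per model input, as in the snippet above.
opt_shape_param = [
    [
        [1, 3, 320, 320],    # min shape
        [1, 3, 800, 1344],   # optimize shape
        [1, 3, 1344, 1344],  # max shape
    ]
]
for shapes in opt_shape_param:
    validate_shape_range(*shapes)
```

Running a check like this before conversion catches a swapped min/max pair early, rather than partway through a conversion that can take minutes.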

How to use the converted model

from mmdet2trt.apis import inference_detector, init_detector
trt_model = init_detector(save_path)
num_detections, trt_bbox, trt_score, trt_cls = inference_detector(trt_model, image_path, cfg_path, "cuda:0")
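
The four outputs usually need a little post-processing before use. The sketch below assumes, for a single image, that only the first `num_detections` entries are valid and that boxes, scores, and labels are index-aligned sequences. That layout is an assumption, so check it against the scripts in demo/. `select_detections` is a hypothetical helper and works on plain Python sequences, such as tensors converted with `.tolist()`.

```python
def select_detections(num_detections, boxes, scores, labels, score_thr=0.3):
    """Keep valid detections whose score reaches score_thr.

    Hypothetical helper: assumes entries beyond num_detections are padding.
    Returns a list of (box, score, label) tuples.
    """
    kept = []
    for i in range(int(num_detections)):
        if scores[i] >= score_thr:
            kept.append((list(boxes[i]), float(scores[i]), int(labels[i])))
    return kept


# Toy data standing in for one image's outputs (third entry is padding).
detections = select_detections(
    2,
    [[0, 0, 10, 10], [5, 5, 20, 20], [0, 0, 0, 0]],
    [0.9, 0.1, 0.0],
    [1, 2, 0],
)
```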

How to save the TensorRT engine

with open(engine_path, mode='wb') as f:
    f.write(trt_model.state_dict()['engine'])

Note that the bbox inference results are not divided by the scale factor; divide them yourself if needed.
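
Mapping boxes back to the original image size can be sketched as follows. This assumes boxes are in (x1, y1, x2, y2) order and that the scale factor is either a single number or a four-element (w, h, w, h) sequence, as produced by MMDetection's resize step. `rescale_boxes` is a hypothetical helper, not part of mmdet2trt.

```python
def rescale_boxes(boxes, scale_factor):
    """Divide (x1, y1, x2, y2) boxes by the resize scale factor.

    Hypothetical helper: scale_factor may be a scalar or a 4-element
    (w_scale, h_scale, w_scale, h_scale) sequence.
    """
    if isinstance(scale_factor, (int, float)):
        scale_factor = [scale_factor] * 4
    return [[coord / s for coord, s in zip(box, scale_factor)] for box in boxes]


# A box predicted on an image resized by 2x in width and 0.5x in height.
original_boxes = rescale_boxes([[100.0, 50.0, 200.0, 150.0]], [2.0, 0.5, 2.0, 0.5])
```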

Try the demo in demo/ for more details.

How does it work?

Most other projects use the PyTorch => ONNX => TensorRT route. This repo converts PyTorch => TensorRT directly, avoiding the unnecessary ONNX IR. Read for detail.

Supported Models/Modules

  • [x] Faster R-CNN
  • [x] Cascade R-CNN
  • [x] Double-Head R-CNN
  • [x] Group Normalization
  • [x] Weight Standardization
  • [x] DCN
  • [x] SSD
  • [x] RetinaNet
  • [x] Libra R-CNN
  • [x] FCOS
  • [x] Fovea
  • [x] CARAFE
  • [x] FreeAnchor
  • [x] RepPoints
  • [x] NAS-FPN
  • [x] ATSS
  • [x] PAFPN
  • [x] FSAF
  • [x] GCNet
  • [x] Guided Anchoring
  • [x] Generalized Attention
  • [x] Dynamic R-CNN
  • [x] Hybrid Task Cascade
  • [x] DetectoRS
  • [x] Side-Aware Boundary Localization
  • [x] YOLOv3
  • [x] PAA
  • [ ] CornerNet (WIP)
  • [x] Generalized Focal Loss
  • [x] Grid RCNN
  • [x] VFNet
  • [x] GROIE
  • [x] Mask R-CNN (experimental)
  • [x] Cascade Mask R-CNN (experimental)
  • [x] Cascade RPN
  • [x] DETR

Tested on:

  • torch=1.6.0
  • tensorrt=
  • mmdetection=2.10.0
  • cuda=10.2
  • cudnn=

If you find any errors, please report them in the issues.


Read this page if you run into any problems.


This repo is maintained by @grimoire

Discussion group: QQ:1107959378
