TensorRT Deployment¶
DeprecationWarning¶
TensorRT support will be deprecated in a future release. We recommend using the unified model deployment toolbox MMDeploy instead: https://github.com/open-mmlab/mmdeploy
Introduction¶
NVIDIA TensorRT is a software development kit (SDK) for high-performance inference of deep learning models. It includes a deep learning inference optimizer and a runtime that delivers low latency and high throughput for deep learning inference applications. Please check the developer's website for more information.
To ease the deployment of trained models with custom operators from mmcv.ops using TensorRT, a series of TensorRT plugins are included in MMCV.
List of TensorRT plugins supported in MMCV¶
| ONNX Operator | TensorRT Plugin | MMCV Releases |
|---|---|---|
| MMCVRoiAlign | MMCVRoiAlign | 1.2.6 |
| ScatterND | ScatterND | 1.2.6 |
| NonMaxSuppression | NonMaxSuppression | 1.3.0 |
| MMCVDeformConv2d | MMCVDeformConv2d | 1.3.0 |
| grid_sampler | grid_sampler | 1.3.1 |
| cummax | cummax | 1.3.5 |
| cummin | cummin | 1.3.5 |
| MMCVInstanceNormalization | MMCVInstanceNormalization | 1.3.5 |
| MMCVModulatedDeformConv2d | MMCVModulatedDeformConv2d | 1.3.8 |
Notes
All plugins listed above are developed on TensorRT-7.2.1.6.Ubuntu-16.04.x86_64-gnu.cuda-10.2.cudnn8.0.
How to build TensorRT plugins in MMCV¶
Prerequisite¶
Clone repository
git clone https://github.com/open-mmlab/mmcv.git
Install TensorRT
Download the corresponding TensorRT build from NVIDIA Developer Zone.
For example, for Ubuntu 16.04 on x86-64 with cuda-10.2, the downloaded file is TensorRT-7.2.1.6.Ubuntu-16.04.x86_64-gnu.cuda-10.2.cudnn8.0.tar.gz.
Then, install as below:
cd ~/Downloads
tar -xvzf TensorRT-7.2.1.6.Ubuntu-16.04.x86_64-gnu.cuda-10.2.cudnn8.0.tar.gz
export TENSORRT_DIR=`pwd`/TensorRT-7.2.1.6
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$TENSORRT_DIR/lib
Install the Python packages: tensorrt, graphsurgeon, onnx-graphsurgeon
pip install $TENSORRT_DIR/python/tensorrt-7.2.1.6-cp37-none-linux_x86_64.whl
pip install $TENSORRT_DIR/onnx_graphsurgeon/onnx_graphsurgeon-0.2.6-py2.py3-none-any.whl
pip install $TENSORRT_DIR/graphsurgeon/graphsurgeon-0.4.5-py2.py3-none-any.whl
For more detailed information on installing TensorRT from the tar package, please refer to the NVIDIA website.
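To confirm the wheel installed into the right environment, a quick hedged check can be run. This sketch assumes nothing beyond the standard library; the graceful `None` fallback is added here for illustration and is not part of the TensorRT API:

```python
def tensorrt_version():
    """Return the installed TensorRT version string, or None if the
    tensorrt Python package is not importable from this interpreter."""
    try:
        import tensorrt
    except ImportError:
        return None  # wheel not installed, or installed into another environment
    return tensorrt.__version__

print(tensorrt_version())  # '7.2.1.6' if the wheel above is installed, otherwise None
```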
Install cuDNN
Install cuDNN 8 following the NVIDIA website.
Build on Linux¶
cd mmcv  # to MMCV root directory
MMCV_WITH_OPS=1 MMCV_WITH_TRT=1 pip install -e .
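After the build, you can check whether the plugin library was actually compiled in. A minimal sketch: `is_tensorrt_plugin_loaded` is the real mmcv API used later in this document, while the `ImportError` fallback is an assumption added here so the check degrades gracefully when mmcv is absent:

```python
def trt_plugins_available():
    """Return True only if mmcv is installed and was built with
    MMCV_WITH_TRT=1 so the TensorRT plugin library can be loaded."""
    try:
        from mmcv.tensorrt import is_tensorrt_plugin_loaded
    except ImportError:
        return False  # mmcv missing, or built without TensorRT plugin support
    return is_tensorrt_plugin_loaded()

print(trt_plugins_available())
```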
Create TensorRT engine and run inference in python¶
Here is an example.
import torch
import onnx
from mmcv.tensorrt import (TRTWrapper, onnx2trt, save_trt_engine,
                           is_tensorrt_plugin_loaded)

assert is_tensorrt_plugin_loaded(), 'Requires to compile TensorRT plugins in mmcv'

onnx_file = 'sample.onnx'
trt_file = 'sample.trt'
onnx_model = onnx.load(onnx_file)

# Model input
inputs = torch.rand(1, 3, 224, 224).cuda()
# Model input shape info
opt_shape_dict = {
    'input': [list(inputs.shape),
              list(inputs.shape),
              list(inputs.shape)]
}

# Create TensorRT engine
max_workspace_size = 1 << 30
trt_engine = onnx2trt(
    onnx_model,
    opt_shape_dict,
    max_workspace_size=max_workspace_size)

# Save TensorRT engine
save_trt_engine(trt_engine, trt_file)

# Run inference with TensorRT
trt_model = TRTWrapper(trt_file, ['input'], ['output'])

with torch.no_grad():
    trt_outputs = trt_model({'input': inputs})
    output = trt_outputs['output']
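The three shape lists in `opt_shape_dict` are the minimum, optimum, and maximum shapes of each input. Using identical values, as in the example above, yields a static-shape engine; differing values enable dynamic shapes. Below is a plain-Python sketch of a dynamic-batch configuration; the shape numbers are illustrative assumptions:

```python
# min / opt / max shapes per input name; only the batch dimension varies here
opt_shape_dict = {
    'input': [
        [1, 3, 224, 224],   # minimum shape the engine must support
        [4, 3, 224, 224],   # optimum shape the engine is tuned for
        [8, 3, 224, 224],   # maximum shape the engine must support
    ]
}

# Sanity-check the min <= opt <= max ordering, dimension by dimension
for name, (mins, opts, maxs) in opt_shape_dict.items():
    assert all(lo <= mid <= hi for lo, mid, hi in zip(mins, opts, maxs)), name

print(opt_shape_dict['input'][1])  # optimum shape: [4, 3, 224, 224]
```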
How to add a TensorRT plugin for custom op in MMCV¶
Main procedures¶
Below are the main steps:

1. Add a C++ header file
2. Add a C++ source file
3. Add a CUDA kernel file
4. Register the plugin in `trt_plugin.cpp`
5. Add a unit test in `tests/test_ops/test_tensorrt.py`

Take the RoIAlign plugin `roi_align` for example.

1. Add header `trt_roi_align.hpp` to the TensorRT include directory `mmcv/ops/csrc/tensorrt/`

2. Add source `trt_roi_align.cpp` to the TensorRT source directory `mmcv/ops/csrc/tensorrt/plugins/`

3. Add CUDA kernel `trt_roi_align_kernel.cu` to the TensorRT source directory `mmcv/ops/csrc/tensorrt/plugins/`

4. Register the `roi_align` plugin in `trt_plugin.cpp`

   ```cpp
   #include "trt_plugin.hpp"

   #include "trt_roi_align.hpp"

   REGISTER_TENSORRT_PLUGIN(RoIAlignPluginDynamicCreator);

   extern "C" {
   bool initLibMMCVInferPlugins() { return true; }
   }  // extern "C"
   ```

5. Add a unit test into `tests/test_ops/test_tensorrt.py`. Check here for examples.
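A unit test typically runs the op through both the native PyTorch implementation and the TensorRT engine, then checks that the outputs agree within tolerance. A framework-free sketch of that comparison step follows; the `allclose` helper mirrors the `torch.allclose` formula and is illustrative, not mmcv API, and the output values are made up for the example:

```python
def allclose(a, b, rtol=1e-5, atol=1e-5):
    """True if every element pair satisfies |x - y| <= atol + rtol * |y|,
    mirroring the torch.allclose formula used in the real unit tests."""
    return len(a) == len(b) and all(
        abs(x - y) <= atol + rtol * abs(y) for x, y in zip(a, b))

pytorch_out = [0.25, 0.5, 0.75]          # reference output from the PyTorch op
trt_out = [0.2500001, 0.5000002, 0.75]   # output from the TensorRT plugin
assert allclose(pytorch_out, trt_out)
print('outputs match')
```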
Reminders¶
Please note that this feature is experimental and may change in the future. We strongly suggest that users always try the latest master branch first.
Some of the custom ops in mmcv have their own CUDA implementations, which can serve as a reference when writing the TensorRT kernel.
Known Issues¶
None