Commit 88690c7

Author: 张仕洋
Commit message: Update images (更新镜像)
1 parent 5fb892d

2 files changed: +24 −18 lines

docs/installation.mdx

Lines changed: 14 additions & 11 deletions
@@ -16,7 +16,7 @@ The basic environment is as follows:
 | `cuda` | >= 11.0 | - The CUDA version must be consistent with the version that PyTorch depends on (for the convenience of `torch.utils.cpp_extension` to compile code on the fly). <br />- Compatibility testing with CUDA 10.2 (no C++17 support) is no longer performed. |
 | `pytorch` | >= 1.10.2(cuda11) | - Compatibility testing is no longer performed for `c++14/cuda-10.2/pytorch==1.10.2`, but you may still be able to run it with simple modifications. |
 | `opencv` | 4.x | At least the core, imgproc, imgcodecs, and highgui modules are included. |
-| `tensorrt` | >= 7.2<br /><= 9.2 | - There is a [memory leak](https://github.com/NVIDIA/TensorRT/issues/351) in TensorRT 7.0 with dynamic inputs. |
+| `tensorrt` | >= 7.2<br /><= 9.3 | - There is a [memory leak](https://github.com/NVIDIA/TensorRT/issues/351) in TensorRT 7.0 with dynamic inputs. |


 :::note
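The TensorRT bounds in the table above (>= 7.2, <= 9.3) can be expressed as a small version check. This is a sketch using only the standard library; `supported` is a hypothetical helper name, not part of torchpipe:

```python
# Hypothetical helper: test a TensorRT version string against the
# documented bounds (>= 7.2, <= 9.3); not part of torchpipe itself.
SUPPORTED_MIN, SUPPORTED_MAX = (7, 2), (9, 3)

def supported(version: str) -> bool:
    """Compare only major.minor, e.g. '8.6.1' -> (8, 6)."""
    major_minor = tuple(int(p) for p in version.split(".")[:2])
    return SUPPORTED_MIN <= major_minor <= SUPPORTED_MAX

print(supported("8.6.1"))  # True
print(supported("7.0"))    # False: below 7.2, the memory-leak-affected range
```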
@@ -26,7 +26,7 @@ All dependencies mentioned above come from a specific default backend. The const
 ## Using NGC base image {#NGC}
 The easiest way is to compile the source code inside an NGC base image (official images may still run on lower-version drivers through Forward Compatibility or Minor Version Compatibility).

-- Minimum support nvcr.io/nvidia/pytorch:21.07-py3 (Starting from 0.3.2rc3)
+- Minimum support nvcr.io/nvidia/pytorch:21.07-py3
 - Maximum support nvcr.io/nvidia/pytorch:23.08-py3
 - Latest tested version: nvcr.io/nvidia/pytorch:22.12-py3

@@ -54,9 +54,7 @@ If you are using a transformer-like model, it is strongly recommended to use Ten
 ```
 *Release 23.05 is based on CUDA 12.1.1, which requires NVIDIA Driver release 530 or later. However, if you are running on a data center GPU (for example, T4 or any other data center GPU), you can use NVIDIA driver release 450.51 (or later R450), 470.57 (or later R470), 510.47 (or later R510), 515.65 (or later R515), 525.85 (or later R525), or 530.30 (or later R530).*
 ```
-This can be overcome by:
-- Using the custom image described in the [next section](#selfdocker)
-- For online deployment, only considering support for data center cards such as Tesla T4 for the time being.
+
 :::
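The driver matrix quoted in the note above can be encoded as a quick local check. This is a sketch transcribed only from the versions listed there; `driver_ok` is a hypothetical helper name, not an NVIDIA API:

```python
# Minimum data-center-GPU driver per release branch for NGC release 23.05,
# transcribed from the note above; a hypothetical helper, not an NVIDIA API.
MIN_BY_BRANCH = {450: (450, 51), 470: (470, 57), 510: (510, 47),
                 515: (515, 65), 525: (525, 85), 530: (530, 30)}

def driver_ok(version: str) -> bool:
    """Check a driver string such as '525.105.17' against the table."""
    major, minor = (int(p) for p in version.split(".")[:2])
    floor = MIN_BY_BRANCH.get(major)
    if floor is None:
        # Branches not listed: accept anything at or above R530's floor.
        return (major, minor) >= (530, 30)
    return (major, minor) >= floor

print(driver_ok("525.105"))  # True
print(driver_ok("450.36"))   # False
```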

 Next, you can compile the source code:
@@ -127,13 +125,13 @@ torch.onnx.export(resnet18, data_bchw, model_path,
 ```python
 import torch, torchpipe
 model = torchpipe.pipe({'model': model_path,
-    'backend': "Sequential[cvtColorTensor,TensorrtTensor,SyncTensor]", # Backend engine, as explained in the "Overview" section.
+    'backend': "Sequential[CvtColorTensor,TensorrtTensor,SyncTensor]", # Backend engine, as explained in the "Overview" section.
     'instance_num': 2, 'batching_timeout': '5', # Number of instances and batching timeout.
     'max': 4, # Maximum value for the model optimization range; can also be '4x3x224x224'.
     'mean': '123.675, 116.28, 103.53', # 255 * [0.485, 0.456, 0.406]
     'std': '58.395, 57.120, 57.375', # 255 * [0.229, 0.224, 0.225]; merged into the TensorRT network.
     'color': 'rgb'}) # CvtColorTensor backend parameter: target color-space order.
-data = torch.zeros((1, 3, 224, 224)) # or torch.from_numpy(...)
+data = torch.zeros((1, 3, 224, 224)).cuda() # or torch.from_numpy(...)
 input = {"data": data, 'color': 'bgr'}
 model(input) # Concurrency can be utilized.
 # "result" is used as the data output identifier, although other key values can be customized as well.
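The `mean`/`std` strings in the config above are the standard ImageNet normalization constants scaled from the [0, 1] range to [0, 255], as the inline comment hints; a quick check:

```python
# The ImageNet normalization constants, scaled from [0, 1] to [0, 255];
# these reproduce the 'mean' and 'std' strings passed to torchpipe.pipe.
imagenet_mean = [0.485, 0.456, 0.406]
imagenet_std = [0.229, 0.224, 0.225]

mean_255 = [round(255 * m, 3) for m in imagenet_mean]
std_255 = [round(255 * s, 3) for s in imagenet_std]

print(mean_255)  # [123.675, 116.28, 103.53]
print(std_255)   # [58.395, 57.12, 57.375]
```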
@@ -148,13 +146,18 @@ For more examples, see [Showcase](./showcase/showcase.mdx).

 ## Customizing Dockerfile {#selfdocker}

-Refer to the [example Dockerfile](https://github.com/torchpipe/torchpipe/blob/main/docker/trt9.1.base). After downloading [TensorRT](https://github.com/NVIDIA/TensorRT/tree/release/9.1#optional---if-not-using-tensorrt-container-specify-the-tensorrt-ga-release-build-path) in advance, you can build the corresponding base image.
+Refer to the [example Dockerfile](https://github.com/torchpipe/torchpipe/blob/main/docker/Dockerfile).
+
 ```bash
-# put TensorRT-9.*.Linux.x86_64-gnu.cuda-11.8.tar.gz into thirdparty/
+
+docker build --network=host -f ./docker/Dockerfile -t torchpipe thirdparty/

-# docker build --network=host -f docker/trt9.base -t torchpipe:base_trt-9 .
+docker run --rm --network=host --gpus=all --ulimit memlock=-1 --ulimit stack=67108864 --privileged=true -v `pwd`:/workspace -it torchpipe:latest /bin/bash
+
+cd /workspace/ && python setup.py install
+
+cd examples/resnet18 && python resnet18.py

-# docker run --rm --network=host --gpus=all --ulimit memlock=-1 --ulimit stack=67108864 --privileged=true -v `pwd`:/workspace -it torchpipe:base_trt-9 /bin/bash

 ```
 Base images compiled this way are smaller than the NGC PyTorch images. Please note that `_GLIBCXX_USE_CXX11_ABI==0`.
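Regarding the `_GLIBCXX_USE_CXX11_ABI==0` note above: any C++ extension compiled against such an image must use the same libstdc++ ABI as the installed PyTorch build. PyTorch exposes a helper to inspect this locally:

```python
import torch

# True if this PyTorch build uses the new libstdc++ ABI
# (_GLIBCXX_USE_CXX11_ABI=1); the base image described above uses 0 (False).
print(torch.compiled_with_cxx11_abi())
```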

i18n/zh/docusaurus-plugin-content-docs/current/installation.mdx

Lines changed: 10 additions & 7 deletions
@@ -17,7 +17,7 @@ type: explainer
 | `cuda` | >= 11.0 | - Must match the CUDA version that PyTorch depends on (so that `torch.utils.cpp_extension` can compile code on the fly) <br />- Compatibility with CUDA 10.2 (no C++17 support) is no longer tested. |
 | `pytorch` | >= 1.10.2(cuda11) | - `c++14/cuda-10.2/pytorch==1.10.2` is no longer tested for compatibility; however, it may still run with simple modifications. |
 | `opencv` | 3.x, 4.x | Includes at least the core, imgproc, imgcodecs, and highgui modules |
-| `tensorrt` | >= 7.2 (Starting from 0.3.2rc3)<br /><= 9.2 | - TensorRT 7.0 has a [memory leak](https://github.com/NVIDIA/TensorRT/issues/351) with dynamic inputs |
+| `tensorrt` | >= 7.2<br /><= 9.3 | - TensorRT 7.0 has a [memory leak](https://github.com/NVIDIA/TensorRT/issues/351) with dynamic inputs |

 :::note
 All of the above dependencies come from specific compute backends that are present by default. Building the C++ core does not depend on any of them.
@@ -50,7 +50,6 @@ docker run --rm --gpus=all --ipc=host --network=host -v `pwd`:/workspace --shm
 :::note
 If you are using a Transformer-like model, TensorRT >= 8.6.1 (`nvcr.io/nvidia/pytorch:23.05-py3`) is strongly recommended, for opset 17 LayerNormalization and opset 18 GroupNormalization support, as well as deeper support for such models. However, its NGC image [has requirements](https://docs.nvidia.com/deeplearning/frameworks/pytorch-release-notes/rel-23-05.html#rel-23-05) on the GPU driver version.
 You can use the [custom image](#selfdocker) from the next section.
-Due to TensorRT optimization issues, some Transformer-like models with larger parameter counts are faster under opset 13.


 :::
@@ -115,7 +114,7 @@ torch.onnx.export(resnet18, data_bchw, model_path,
 ```python
 import torch, torchpipe
 model = torchpipe.pipe({'model': model_path,
-    'backend': "Sequential[cvtColorTensor,TensorrtTensor,SyncTensor]", # Backend engine; see the Overview chapter for details.
+    'backend': "Sequential[CvtColorTensor,TensorrtTensor,SyncTensor]", # Backend engine; see the Overview chapter for details.
     'instance_num': 2, 'batching_timeout': '5', # Number of instances and batching timeout
     'max': 4, # Maximum value of the model optimization range; can also be '4x3x224x224'
     'mean': '123.675, 116.28, 103.53', # 255 * [0.485, 0.456, 0.406]
@@ -136,15 +135,19 @@ print(input["result"].shape) # On failure this key is guaranteed not to exist, even if the input

 ## Customizing Dockerfile {#selfdocker}

-Refer to the [example Dockerfile](https://github.com/torchpipe/torchpipe/blob/main/docker/trt9.base); after downloading [TensorRT](https://github.com/NVIDIA/TensorRT/tree/release/9.1#optional---if-not-using-tensorrt-container-specify-the-tensorrt-ga-release-build-path) in advance, you can build the corresponding base-environment image.
+Refer to the [example Dockerfile](https://github.com/torchpipe/torchpipe/blob/main/docker/Dockerfile).

 ```bash
-# put TensorRT-9.*.Linux.x86_64-gnu.cuda-11.8.tar.gz into thirdparty/
+
+docker build --network=host -f ./docker/Dockerfile -t torchpipe thirdparty/

-# docker build --network=host -f docker/trt9.base -t torchpipe:base_trt-9 .
+docker run --rm --network=host --gpus=all --ulimit memlock=-1 --ulimit stack=67108864 --privileged=true -v `pwd`:/workspace -it torchpipe:latest /bin/bash

-# docker run --rm --network=host --gpus=all --ulimit memlock=-1 --ulimit stack=67108864 --privileged=true -v `pwd`:/workspace -it torchpipe:base_trt-9 /bin/bash
+cd /workspace/ && python setup.py install
+
+cd examples/resnet18 && python resnet18.py

 ```
+
 Base images built this way are smaller than the NGC PyTorch image. Note that `_GLIBCXX_USE_CXX11_ABI==0`.