Commit 88690c7

Author: 张仕洋
Commit message: Update images (更新镜像)
1 parent 5fb892d

2 files changed: +24 −18 lines

docs/installation.mdx

Lines changed: 14 additions & 11 deletions
@@ -16,7 +16,7 @@ The basic environment is as follows:
 | `cuda` | >= 11.0 | - The CUDA version must be consistent with the version that PyTorch depends on (for the convenience of `torch.utils.cpp_extension` to compile code on the fly). <br />- Compatibility testing with CUDA 10.2 (no C++17 support) is no longer performed. |
 | `pytorch` | >= 1.10.2(cuda11) | - Compatibility testing is no longer performed for `c++14/cuda-10.2/pytorch==1.10.2`, but you may still be able to run it with simple modifications. |
 | `opencv` | 4.x | At least the core, imgproc, imgcodecs, and highgui modules are included. |
-| `tensorrt` | >= 7.2<br /><= 9.2 | - There is a [memory leak](https://github.com/NVIDIA/TensorRT/issues/351) in TensorRT 7.0 with dynamic inputs. |
+| `tensorrt` | >= 7.2<br /><= 9.3 | - There is a [memory leak](https://github.com/NVIDIA/TensorRT/issues/351) in TensorRT 7.0 with dynamic inputs. |


 :::note
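The TensorRT bounds in the table above (>= 7.2, <= 9.3) can be expressed as a small version check. This is a sketch using only the standard library; `supported` is a hypothetical helper name, not part of torchpipe:

```python
# Hypothetical helper: test a TensorRT version string against the
# documented bounds (>= 7.2, <= 9.3); not part of torchpipe itself.
SUPPORTED_MIN, SUPPORTED_MAX = (7, 2), (9, 3)

def supported(version: str) -> bool:
    """Compare only major.minor, e.g. '8.6.1' -> (8, 6)."""
    major_minor = tuple(int(p) for p in version.split(".")[:2])
    return SUPPORTED_MIN <= major_minor <= SUPPORTED_MAX

print(supported("8.6.1"))  # True
print(supported("7.0"))    # False: below 7.2, the memory-leak-affected range
```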
@@ -26,7 +26,7 @@ All dependencies mentioned above come from a specific default backend. The const
 ## Using NGC base image {#NGC}
 The easiest way is to compile the source code inside an NGC base image (official images may still run on lower-version drivers through Forward Compatibility or Minor Version Compatibility).

-- Minimum support nvcr.io/nvidia/pytorch:21.07-py3 (Starting from 0.3.2rc3)
+- Minimum support nvcr.io/nvidia/pytorch:21.07-py3
 - Maximum support nvcr.io/nvidia/pytorch:23.08-py3
 - Latest tested version: nvcr.io/nvidia/pytorch:22.12-py3

@@ -54,9 +54,7 @@ If you are using a transformer-like model, it is strongly recommended to use Ten
 ```
 *Release 23.05 is based on CUDA 12.1.1, which requires NVIDIA Driver release 530 or later. However, if you are running on a data center GPU (for example, T4 or any other data center GPU), you can use NVIDIA driver release 450.51 (or later R450), 470.57 (or later R470), 510.47 (or later R510), 515.65 (or later R515), 525.85 (or later R525), or 530.30 (or later R530).*
 ```
-This can be overcome by:
-- Using the custom image described in the [next section](#selfdocker)
-- For online deployment, only considering support for data center cards such as Tesla T4 for the time being.
+
 :::
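The driver matrix quoted in the note above can be encoded as a quick local check. This is a sketch transcribed only from the versions listed there; `driver_ok` is a hypothetical helper name, not an NVIDIA API:

```python
# Minimum data-center-GPU driver per release branch for NGC release 23.05,
# transcribed from the note above; a hypothetical helper, not an NVIDIA API.
MIN_BY_BRANCH = {450: (450, 51), 470: (470, 57), 510: (510, 47),
                 515: (515, 65), 525: (525, 85), 530: (530, 30)}

def driver_ok(version: str) -> bool:
    """Check a driver string such as '525.105.17' against the table."""
    major, minor = (int(p) for p in version.split(".")[:2])
    floor = MIN_BY_BRANCH.get(major)
    if floor is None:
        # Branches not listed: accept anything at or above R530's floor.
        return (major, minor) >= (530, 30)
    return (major, minor) >= floor

print(driver_ok("525.105"))  # True
print(driver_ok("450.36"))   # False
```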

 Next, you can compile the source code:
@@ -127,13 +125,13 @@ torch.onnx.export(resnet18, data_bchw, model_path,
 ```python
 import torch, torchpipe
 model = torchpipe.pipe({'model': model_path,
-    'backend': "Sequential[cvtColorTensor,TensorrtTensor,SyncTensor]", # Backend engine, as explained in the "Overview" section.
+    'backend': "Sequential[CvtColorTensor,TensorrtTensor,SyncTensor]", # Backend engine, as explained in the "Overview" section.
     'instance_num': 2, 'batching_timeout': '5', # Number of instances and batching timeout.
     'max': 4, # Maximum value for the model optimization range; can also be '4x3x224x224'.
     'mean': '123.675, 116.28, 103.53', # 255 * [0.485, 0.456, 0.406]
     'std': '58.395, 57.120, 57.375', # 255 * [0.229, 0.224, 0.225]; merged into the TensorRT network.
     'color': 'rgb'}) # CvtColorTensor backend parameter: target color-space order.
-data = torch.zeros((1, 3, 224, 224)) # or torch.from_numpy(...)
+data = torch.zeros((1, 3, 224, 224)).cuda() # or torch.from_numpy(...)
 input = {"data": data, 'color': 'bgr'}
 model(input) # Concurrency can be utilized.
 # "result" is used as the data output identifier, although other key values can be customized as well.
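The `mean`/`std` strings in the config above are the standard ImageNet normalization constants scaled from the [0, 1] range to [0, 255], as the inline comment hints; a quick check:

```python
# The ImageNet normalization constants, scaled from [0, 1] to [0, 255];
# these reproduce the 'mean' and 'std' strings passed to torchpipe.pipe.
imagenet_mean = [0.485, 0.456, 0.406]
imagenet_std = [0.229, 0.224, 0.225]

mean_255 = [round(255 * m, 3) for m in imagenet_mean]
std_255 = [round(255 * s, 3) for s in imagenet_std]

print(mean_255)  # [123.675, 116.28, 103.53]
print(std_255)   # [58.395, 57.12, 57.375]
```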
@@ -148,13 +146,18 @@ For more examples, see [Showcase](./showcase/showcase.mdx).

 ## Customizing Dockerfile {#selfdocker}

-Refer to the [example Dockerfile](https://github.com/torchpipe/torchpipe/blob/main/docker/trt9.1.base). After downloading [TensorRT](https://github.com/NVIDIA/TensorRT/tree/release/9.1#optional---if-not-using-tensorrt-container-specify-the-tensorrt-ga-release-build-path) in advance, you can build the corresponding base image.
+Refer to the [example Dockerfile](https://github.com/torchpipe/torchpipe/blob/main/docker/Dockerfile).
+
 ```bash
-# put TensorRT-9.*.Linux.x86_64-gnu.cuda-11.8.tar.gz into thirdparty/
+
+docker build --network=host -f ./docker/Dockerfile -t torchpipe thirdparty/

-# docker build --network=host -f docker/trt9.base -t torchpipe:base_trt-9 .
+docker run --rm --network=host --gpus=all --ulimit memlock=-1 --ulimit stack=67108864 --privileged=true -v `pwd`:/workspace -it torchpipe:latest /bin/bash
+
+cd /workspace/ && python setup.py install
+
+cd examples/resnet18 && python resnet18.py

-# docker run --rm --network=host --gpus=all --ulimit memlock=-1 --ulimit stack=67108864 --privileged=true -v `pwd`:/workspace -it torchpipe:base_trt-9 /bin/bash

 ```
 Base images compiled this way are smaller than the NGC PyTorch images. Please note that `_GLIBCXX_USE_CXX11_ABI==0`.
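Regarding the `_GLIBCXX_USE_CXX11_ABI==0` note above: any C++ extension compiled against such an image must use the same libstdc++ ABI as the installed PyTorch build. PyTorch exposes a helper to inspect this locally:

```python
import torch

# True if this PyTorch build uses the new libstdc++ ABI
# (_GLIBCXX_USE_CXX11_ABI=1); the base image described above uses 0 (False).
print(torch.compiled_with_cxx11_abi())
```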

i18n/zh/docusaurus-plugin-content-docs/current/installation.mdx

Lines changed: 10 additions & 7 deletions
@@ -17,7 +17,7 @@ type: explainer
 | `cuda` | >= 11.0 | - Must match the CUDA version that PyTorch depends on (so that `torch.utils.cpp_extension` can compile code on the fly) <br />- Compatibility with CUDA 10.2 (no C++17 support) is no longer tested. |
 | `pytorch` | >= 1.10.2(cuda11) | - `c++14/cuda-10.2/pytorch==1.10.2` is no longer tested for compatibility; however, it may still run with simple modifications. |
 | `opencv` | 3.x, 4.x | Includes at least the core, imgproc, imgcodecs, and highgui modules |
-| `tensorrt` | >= 7.2 (Starting from 0.3.2rc3)<br /><= 9.2 | - TensorRT 7.0 has a [memory leak](https://github.com/NVIDIA/TensorRT/issues/351) with dynamic inputs |
+| `tensorrt` | >= 7.2<br /><= 9.3 | - TensorRT 7.0 has a [memory leak](https://github.com/NVIDIA/TensorRT/issues/351) with dynamic inputs |

 :::note
 All of the above dependencies come from specific compute backends that are present by default. Building the C++ core does not depend on any of them.
@@ -50,7 +50,6 @@ docker run --rm --gpus=all --ipc=host --network=host -v `pwd`:/workspace --shm
 :::note
 If you are using a Transformer-like model, TensorRT >= 8.6.1 (`nvcr.io/nvidia/pytorch:23.05-py3`) is strongly recommended, for opset 17 LayerNormalization and opset 18 GroupNormalization support, as well as deeper support for such models. However, its NGC image [has requirements](https://docs.nvidia.com/deeplearning/frameworks/pytorch-release-notes/rel-23-05.html#rel-23-05) on the GPU driver version.
 You can use the [custom image](#selfdocker) from the next section.
-Due to TensorRT optimization issues, some Transformer-like models with larger parameter counts are faster under opset 13.


 :::
@@ -115,7 +114,7 @@ torch.onnx.export(resnet18, data_bchw, model_path,
 ```python
 import torch, torchpipe
 model = torchpipe.pipe({'model': model_path,
-    'backend': "Sequential[cvtColorTensor,TensorrtTensor,SyncTensor]", # Backend engine; see the Overview chapter for details.
+    'backend': "Sequential[CvtColorTensor,TensorrtTensor,SyncTensor]", # Backend engine; see the Overview chapter for details.
     'instance_num': 2, 'batching_timeout': '5', # Number of instances and batching timeout
     'max': 4, # Maximum value of the model optimization range; can also be '4x3x224x224'
     'mean': '123.675, 116.28, 103.53', # 255 * [0.485, 0.456, 0.406]
@@ -136,15 +135,19 @@ print(input["result"].shape) # On failure this key is guaranteed not to exist, even if the input

 ## Customizing Dockerfile {#selfdocker}

-Refer to the [example Dockerfile](https://github.com/torchpipe/torchpipe/blob/main/docker/trt9.base); after downloading [TensorRT](https://github.com/NVIDIA/TensorRT/tree/release/9.1#optional---if-not-using-tensorrt-container-specify-the-tensorrt-ga-release-build-path) in advance, you can build the corresponding base-environment image.
+Refer to the [example Dockerfile](https://github.com/torchpipe/torchpipe/blob/main/docker/Dockerfile).

 ```bash
-# put TensorRT-9.*.Linux.x86_64-gnu.cuda-11.8.tar.gz into thirdparty/
+
+docker build --network=host -f ./docker/Dockerfile -t torchpipe thirdparty/

-# docker build --network=host -f docker/trt9.base -t torchpipe:base_trt-9 .
+docker run --rm --network=host --gpus=all --ulimit memlock=-1 --ulimit stack=67108864 --privileged=true -v `pwd`:/workspace -it torchpipe:latest /bin/bash

-# docker run --rm --network=host --gpus=all --ulimit memlock=-1 --ulimit stack=67108864 --privileged=true -v `pwd`:/workspace -it torchpipe:base_trt-9 /bin/bash
+cd /workspace/ && python setup.py install
+
+cd examples/resnet18 && python resnet18.py

 ```
+
 Base images built this way are smaller than the NGC PyTorch image. Note that `_GLIBCXX_USE_CXX11_ABI==0`.