This repository was archived by the owner on Feb 3, 2025. It is now read-only.
Local rendezvous is aborting with status: NOT_FOUND: TRTEngineCacheResource not yet created while converting a saved model to trt engine #336
Open
Description
I am trying to convert a TensorFlow SavedModel to a TensorRT engine using the Python script below.
from tensorflow.python.compiler.tensorrt import trt_convert as trt
# Conversion Parameters
conversion_params = trt.TrtConversionParams(precision_mode=trt.TrtPrecisionMode.FP16)
input_saved_model_dir = "/home/administrator/Documents/penguin_behavior_detection/tf_onnx_trt_stuff/seagate_exported_model/saved_model"
output_saved_model_dir = "/home/administrator/Documents/penguin_behavior_detection/tf_onnx_trt_stuff/"
converter = trt.TrtGraphConverterV2(input_saved_model_dir=input_saved_model_dir, conversion_params=conversion_params)
# Converter method used to partition and optimize TensorRT compatible segments
converter.convert()
converter.summary()
# Save the model to the disk
converter.save(output_saved_model_dir)
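The deprecation warning in the terminal output says to use individual converter parameters instead of `conversion_params`. A minimal sketch of that change follows. The helper names `converter_kwargs` and `make_converter` are hypothetical and not part of TensorFlow, and the string `"FP16"` is used on the understanding that it is the value of `trt.TrtPrecisionMode.FP16`. This only addresses the deprecation warning; it is not expected to change the NOT_FOUND messages.

```python
def converter_kwargs(input_saved_model_dir, precision_mode="FP16"):
    """Keyword arguments for TrtGraphConverterV2 without conversion_params.

    Hypothetical helper: it only assembles the arguments so they can be
    inspected before TensorFlow is imported.
    """
    return {
        "input_saved_model_dir": input_saved_model_dir,
        # Assumed to be the string value of trt.TrtPrecisionMode.FP16.
        "precision_mode": precision_mode,
    }


def make_converter(input_saved_model_dir, precision_mode="FP16"):
    """Create the converter; TensorFlow is imported only when this is called."""
    from tensorflow.python.compiler.tensorrt import trt_convert as trt

    return trt.TrtGraphConverterV2(
        **converter_kwargs(input_saved_model_dir, precision_mode)
    )
```

With this in place, `converter = make_converter(input_saved_model_dir)` would replace the `TrtConversionParams` and `TrtGraphConverterV2(...)` lines of the script.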
This is the structure of the seagate_exported_model directory:
.
├── checkpoint
│   ├── checkpoint
│   ├── ckpt-0.data-00000-of-00001
│   └── ckpt-0.index
├── pipeline.config
└── saved_model
    ├── assets
    ├── fingerprint.pb
    ├── saved_model.pb
    └── variables
        ├── variables.data-00000-of-00001
        └── variables.index
I get the following output on the terminal:
2024-04-01 20:36:22.993604: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
WARNING:tensorflow:From /home/administrator/Documents/penguin_behavior_detection/tf_onnx_trt_stuff/convert_to_trt.py:23: calling TrtGraphConverterV2.__init__ (from tensorflow.python.compiler.tensorrt.trt_convert) with conversion_params is deprecated and will be removed in a future version.
Instructions for updating:
Use individual converter parameters instead
2024-04-01 20:36:25.663756: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1928] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 9528 MB memory: -> device: 0, name: NVIDIA TITAN V, pci bus id: 0000:c1:00.0, compute capability: 7.0
2024-04-01 20:36:25.664265: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1928] Created device /job:localhost/replica:0/task:0/device:GPU:1 with 10431 MB memory: -> device: 1, name: NVIDIA TITAN V, pci bus id: 0000:e1:00.0, compute capability: 7.0
2024-04-01 20:36:42.381116: I tensorflow/core/grappler/devices.cc:66] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 2
2024-04-01 20:36:42.381230: I tensorflow/core/grappler/clusters/single_machine.cc:361] Starting new session
2024-04-01 20:36:42.382733: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1928] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 9528 MB memory: -> device: 0, name: NVIDIA TITAN V, pci bus id: 0000:c1:00.0, compute capability: 7.0
2024-04-01 20:36:42.382881: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1928] Created device /job:localhost/replica:0/task:0/device:GPU:1 with 10431 MB memory: -> device: 1, name: NVIDIA TITAN V, pci bus id: 0000:e1:00.0, compute capability: 7.0
2024-04-01 20:36:47.126394: I tensorflow/core/grappler/devices.cc:66] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 2
2024-04-01 20:36:47.126499: I tensorflow/core/grappler/clusters/single_machine.cc:361] Starting new session
2024-04-01 20:36:47.127907: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1928] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 9528 MB memory: -> device: 0, name: NVIDIA TITAN V, pci bus id: 0000:c1:00.0, compute capability: 7.0
2024-04-01 20:36:47.128046: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1928] Created device /job:localhost/replica:0/task:0/device:GPU:1 with 10431 MB memory: -> device: 1, name: NVIDIA TITAN V, pci bus id: 0000:e1:00.0, compute capability: 7.0
2024-04-01 20:36:47.655835: W tensorflow/compiler/tf2tensorrt/convert/trt_optimization_pass.cc:186] Calibration with FP32 or FP16 is not implemented. Falling back to use_calibration = False.Note that the default value of use_calibration is True.
2024-04-01 20:36:47.730476: W tensorflow/compiler/tf2tensorrt/segment/segment.cc:970]
################################################################################
TensorRT unsupported/non-converted OP Report:
- GatherV2 -> 46x
- StridedSlice -> 35x
- Sub -> 30x
- Shape -> 24x
- Cast -> 22x
- ConcatV2 -> 19x
- Mul -> 19x
- ExpandDims -> 18x
- Pack -> 17x
- Identity -> 17x
- Select -> 16x
- Fill -> 15x
- Reshape -> 15x
- Placeholder -> 14x
- Less -> 10x
- Unpack -> 10x
- Greater -> 9x
- AddV2 -> 8x
- Pad -> 8x
- Switch -> 8x
- NonMaxSuppressionV5 -> 7x
- Minimum -> 7x
- Merge -> 7x
- NextIteration -> 6x
- Enter -> 6x
- Split -> 5x
- Slice -> 5x
- RealDiv -> 4x
- Maximum -> 4x
- Round -> 4x
- Transpose -> 3x
- Range -> 3x
- NoOp -> 3x
- Reciprocal -> 2x
- Squeeze -> 2x
- ResizeBilinear -> 2x
- Exit -> 2x
- Exp -> 2x
- TopKV2 -> 2x
- Tile -> 2x
- TensorListStack -> 2x
- TensorListReserve -> 2x
- TensorListSetItem -> 2x
- Where -> 1x
- TensorListGetItem -> 1x
- TensorListFromTensor -> 1x
- GreaterEqual -> 1x
- Sum -> 1x
- LogicalAnd -> 1x
- LoopCond -> 1x
--------------------------------------------------------------------------------
- Total nonconverted OPs: 451
- Total nonconverted OP Types: 50
For more information see https://docs.nvidia.com/deeplearning/frameworks/tf-trt-user-guide/index.html#supported-ops.
################################################################################
2024-04-01 20:36:48.146815: W tensorflow/compiler/tf2tensorrt/segment/segment.cc:1298] The environment variable TF_TRT_MAX_ALLOWED_ENGINES=20 has no effect since there are only 10 TRT Engines with at least minimum_segment_size=3 nodes.
2024-04-01 20:36:48.182719: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:799] Number of TensorRT candidate segments: 10
2024-04-01 20:36:48.224087: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:913] Replaced segment 0 consisting of 6 nodes by TRTEngineOp_000_000.
2024-04-01 20:36:48.224163: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:913] Replaced segment 1 consisting of 1993 nodes by TRTEngineOp_000_001.
2024-04-01 20:36:48.227138: W tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:916] TF-TRT Warning: Cannot replace segment 2 consisting of 6 nodes by TRTEngineOp_000_002 reason: Segment has no inputs (possible constfold failure) (keeping original segment).
2024-04-01 20:36:48.227395: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:913] Replaced segment 3 consisting of 5 nodes by TRTEngineOp_000_003.
2024-04-01 20:36:48.227442: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:913] Replaced segment 4 consisting of 4 nodes by TRTEngineOp_000_004.
2024-04-01 20:36:48.227482: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:913] Replaced segment 5 consisting of 4 nodes by TRTEngineOp_000_005.
2024-04-01 20:36:48.227535: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:913] Replaced segment 6 consisting of 25 nodes by TRTEngineOp_000_006.
2024-04-01 20:36:48.227592: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:913] Replaced segment 7 consisting of 4 nodes by TRTEngineOp_000_007.
2024-04-01 20:36:48.227635: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:913] Replaced segment 8 consisting of 4 nodes by TRTEngineOp_000_008.
2024-04-01 20:36:48.227671: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:913] Replaced segment 9 consisting of 3 nodes by TRTEngineOp_000_009.
TRTEngineOP Name Device # Nodes # Inputs # Outputs Input DTypes Output Dtypes Input Shapes Output Shapes
================================================================================================================================================================
----------------------------------------
TRTEngineOp_000_000 device:GPU:0 6 1 1 ['float32'] ['float32'] [[1, -1, -1, 3]] [[1, -1, -1, 3]]
- Const: 3x
- Mul: 2x
- Sub: 1x
----------------------------------------
TRTEngineOp_000_001 device:GPU:0 1931 1 2 ['float32'] ['float32', 'f ... [[1, 640, 640, 3]] [[1, 76725, 4] ...
- AddV2: 15x
- BatchMatMulV2: 32x
- BiasAdd: 122x
- ConcatV2: 2x
- Const: 864x
- Conv2D: 165x
- DepthwiseConv2dNative: 94x
- FusedBatchNormV3: 133x
- MaxPool: 18x
- Mean: 22x
- Mul: 149x
- Pack: 64x
- Reshape: 69x
- Sigmoid: 150x
- Squeeze: 32x
----------------------------------------
TRTEngineOp_000_003 device:GPU:0 7 4 1 ['float32', 'f ... ['float32'] [[57600, 1], [ ... [[57600, 4]]
- ConcatV2: 1x
- Const: 2x
- Mul: 4x
----------------------------------------
TRTEngineOp_000_004 device:GPU:0 4 4 1 ['float32', 'f ... ['float32'] [[-1, 1], [-1, ... [[-1]]
- Mul: 1x
- Squeeze: 1x
- Sub: 2x
----------------------------------------
TRTEngineOp_000_005 device:GPU:0 4 4 1 ['float32', 'f ... ['float32'] [[-1, 1], [-1, ... [[-1]]
- Mul: 1x
- Squeeze: 1x
- Sub: 2x
----------------------------------------
TRTEngineOp_000_006 device:GPU:0 25 1 8 ['float32'] ['float32', 'f ... [[76725, 7]] [[76725, 7], [ ...
- Const: 10x
- Reshape: 8x
- Slice: 7x
----------------------------------------
TRTEngineOp_000_007 device:GPU:0 6 2 1 ['float32', 'f ... ['float32'] [[57600, 2], [ ... [[57600, 4]]
- AddV2: 1x
- ConcatV2: 1x
- Const: 2x
- Mul: 1x
- Sub: 1x
----------------------------------------
TRTEngineOp_000_008 device:GPU:0 3 1 1 ['float32'] ['float32'] [[76725, 1, 4]] [[76725, 4]]
- Const: 1x
- Reshape: 1x
- Unpack: 1x
----------------------------------------
TRTEngineOp_000_009 device:GPU:0 3 1 2 ['float32'] ['float32', 'f ... [[1, 76725, 4]] [[1, 76725, 1, ...
- Const: 1x
- ExpandDims: 1x
- Squeeze: 1x
================================================================================================================================================================
[*] Total number of TensorRT engines: 9
[*] % of OPs Converted: 78.87% [1989/2522]
2024-04-01 20:36:49.217361: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: NOT_FOUND: TRTEngineCacheResource not yet created
2024-04-01 20:36:49.217649: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: NOT_FOUND: TRTEngineCacheResource not yet created
2024-04-01 20:36:49.217857: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: NOT_FOUND: TRTEngineCacheResource not yet created
2024-04-01 20:36:49.218032: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: NOT_FOUND: TRTEngineCacheResource not yet created
2024-04-01 20:36:49.218200: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: NOT_FOUND: TRTEngineCacheResource not yet created
2024-04-01 20:36:49.218390: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: NOT_FOUND: TRTEngineCacheResource not yet created
2024-04-01 20:36:49.218555: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: NOT_FOUND: TRTEngineCacheResource not yet created
2024-04-01 20:36:49.218810: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: NOT_FOUND: TRTEngineCacheResource not yet created
2024-04-01 20:36:49.218981: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: NOT_FOUND: TRTEngineCacheResource not yet created
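The log above contains nine "Replaced segment" lines and nine NOT_FOUND warnings, which suggests (but does not prove) one warning per TRTEngineOp. A small sketch for checking this on a captured log is shown here. It uses only the Python standard library, and the function name `tally_trt_log` is a hypothetical helper.

```python
import re

# Matches lines such as:
#   "... Replaced segment 0 consisting of 6 nodes by TRTEngineOp_000_000."
REPLACED_RE = re.compile(
    r"Replaced segment \d+ consisting of \d+ nodes by (TRTEngineOp_\d+_\d+)"
)
NOT_FOUND_MSG = "NOT_FOUND: TRTEngineCacheResource not yet created"


def tally_trt_log(log_text):
    """Return (engine names that replaced a segment, count of NOT_FOUND warnings)."""
    engines = REPLACED_RE.findall(log_text)
    warnings = sum(1 for line in log_text.splitlines() if NOT_FOUND_MSG in line)
    return engines, warnings


if __name__ == "__main__":
    import sys

    # Usage: python tally_trt_log.py < captured_terminal_output.txt
    engines, warnings = tally_trt_log(sys.stdin.read())
    print(f"engines created: {len(engines)}, NOT_FOUND warnings: {warnings}")
```

"Cannot replace segment" lines (such as the one for segment 2) are intentionally not counted, because no engine is created for them.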
My environment is as follows:
Python: 3.10.13
TensorFlow: 2.16.1
OS: Ubuntu 20.04
TensorRT: 8.6.1
CUDA: 12.1
NVIDIA driver: 530.30.02
Any help is highly appreciated. Thanks in advance!
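One experiment that may help narrow this down: the NOT_FOUND warnings are logged right after the summary, at the point where the script calls `save()`, and the script never calls `converter.build()`. A possible explanation, not confirmed here, is that the engine cache is only created once engines are built. The sketch below calls `build()` between `convert()` and `save()`. The names `make_input_fn` and `convert_build_save` are hypothetical helpers. The input shape `(1, 640, 640, 3)` is taken from the TRTEngineOp_000_001 row of the summary, while the `uint8` dtype is a guess that should be checked against the model's serving signature.

```python
def make_input_fn(shape=(1, 640, 640, 3), dtype_name="uint8", num_batches=1):
    """Return a generator function that yields dummy input batches.

    The default shape comes from the converter summary; the dtype is an
    assumption and must match the SavedModel's serving signature.
    """
    def input_fn():
        import tensorflow as tf  # imported lazily, only when batches are needed

        dtype = tf.dtypes.as_dtype(dtype_name)
        for _ in range(num_batches):
            # build() expects each yielded item to hold the model inputs.
            yield (tf.zeros(shape, dtype=dtype),)

    return input_fn


def convert_build_save(converter, output_saved_model_dir, input_fn=None):
    """Run convert(), optionally build() the engines, then save()."""
    converter.convert()
    if input_fn is not None:
        # Builds TensorRT engines ahead of time for the yielded input shapes.
        converter.build(input_fn=input_fn)
    converter.save(output_saved_model_dir)
```

For example, `convert_build_save(converter, output_saved_model_dir, make_input_fn())` would replace the `convert()` and `save()` calls in the script. Whether this removes the warnings is untested.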