Add support for TensorRT-RTX #3753
base: main
Conversation
@lanluo-nvidia just remove the PTQ Calibrator feature from Python and C++ and put in deprecation errors.
There are some changes that do not conform to Python style guidelines:
--- /home/runner/work/TensorRT/TensorRT/tests/py/ts/integrations/test_trt_intercompatibility.py 2025-08-27 00:39:48.227435+00:00
+++ /home/runner/work/TensorRT/TensorRT/tests/py/ts/integrations/test_trt_intercompatibility.py 2025-08-27 00:40:20.832194+00:00
@@ -34,10 +34,11 @@
        trt_engine = torchtrt.ts.convert_method_to_trt_engine(
            self.ts_model, "forward", **compile_spec
        )
        import tensorrt as trt
+
        TRT_LOGGER = trt.Logger(trt.Logger.WARNING)
        with trt.Runtime(TRT_LOGGER) as rt:
            engine = rt.deserialize_cuda_engine(trt_engine)
            with engine.create_execution_context() as ctx:
                out = torch.empty(
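For context, the pattern this test exercises — handing a Torch-TensorRT-exported engine to the standalone TensorRT runtime — looks roughly like the sketch below. This is a minimal sketch, assuming TensorRT 10's name-based I/O API; the input shape and the input/output tensor ordering are illustrative assumptions, not taken from the test.

import tensorrt as trt
import torch

# Minimal sketch: execute a serialized engine (trt_engine, the bytes returned
# by convert_method_to_trt_engine) through the standalone TensorRT runtime.
# Assumes TensorRT 10 name-based I/O; shapes and tensor order are illustrative.
TRT_LOGGER = trt.Logger(trt.Logger.WARNING)
with trt.Runtime(TRT_LOGGER) as rt:
    engine = rt.deserialize_cuda_engine(trt_engine)
    with engine.create_execution_context() as ctx:
        in_name = engine.get_tensor_name(0)   # assumed: input tensor first
        out_name = engine.get_tensor_name(1)  # assumed: output tensor second
        inp = torch.randn((1, 3, 224, 224), device="cuda")
        out = torch.empty(tuple(engine.get_tensor_shape(out_name)), device="cuda")
        # Bind device pointers by tensor name, then launch on a CUDA stream.
        ctx.set_tensor_address(in_name, inp.data_ptr())
        ctx.set_tensor_address(out_name, out.data_ptr())
        stream = torch.cuda.Stream()
        ctx.execute_async_v3(stream.cuda_stream)
        stream.synchronize()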
@@ -77,6 +78,17 @@ def quantize(
    dtype = trt.DataType.FP8
    max_bound = 448

    if (
        dtype == trt.DataType.INT8
        and ".input_quantizer" in name
Can we write a validator for this case? (As an aside, have we tried quantization where not all of the model is running in TensorRT?)
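A hypothetical sketch of that validator idea, assuming the capability_validator hook of dynamo_tensorrt_converter; the op key, the import path, and the kwargs-based dtype lookup are placeholders for illustration, not the actual schema:

import torch
from torch.fx.node import Node
from torch_tensorrt.dynamo.conversion import dynamo_tensorrt_converter

def int8_quantize_validator(node: Node) -> bool:
    # Reject INT8 input quantizers so the node falls back to PyTorch
    # execution instead of failing inside the converter
    # (the kwargs-based dtype lookup is illustrative).
    return node.kwargs.get("dtype") != torch.int8

@dynamo_tensorrt_converter(
    torch.ops.aten.fake_quantize_per_tensor_affine.default,  # placeholder key
    capability_validator=int8_quantize_validator,
)
def quantize_converter(ctx, target, args, kwargs, name):
    ...  # conversion body elided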
args[0],
# currently nonzero is not supported for tensorrt_rtx
# TODO: lan to remove this once rtx team has fixed the bug
if not ENABLED_FEATURES.tensorrt_rtx:
Could just stack the decorator here
I think

@needs_tensorrt_rtx
@dynamo_tensorrt_converter(...)
def aten...

should do what we want.
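Spelled out for the nonzero case above, the stacked form would look something like this; needs_tensorrt_rtx is the guard named in this thread, and the converter body is elided:

@needs_tensorrt_rtx  # registration guard keyed on the tensorrt_rtx feature flag
@dynamo_tensorrt_converter(torch.ops.aten.nonzero.default)
def aten_ops_nonzero(ctx, target, args, kwargs, name):
    ...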
Just add the decorator, then you are good to merge.
Description
Add initial support for TensorRT-RTX.
The following are the currently identified issues:
RTX team side:
5439176
5400490
5407733
5402295
5459328
5481821
Our side:
PRs in progress:
fix: atan2 strong type support & bug fix for integer dynamic shape #3751
add strong typing fix #3749
🐛 [Bug] TensorRT-RTX BatchNorm constant fold got nan #3699
🐛 [Bug] TensorRT-RTX Refitter test failed when constant fold is disabled #3752
🐛 [Bug] TensorRT-RTX: Cuda graph test failed #3781
🐛 [Bug] TensorRT - RTX: torch-script test failure #3782
Fixes # (issue)