Torch-TensorRT v2.2.0 #2646
narendasan started this conversation in Show and tell
Dynamo Frontend for Torch-TensorRT, PyTorch 2.2, CUDA 12.1, TensorRT 8.6
Torch-TensorRT 2.2.0 targets PyTorch 2.2, CUDA 12.1 (builds for CUDA 11.8 are available via the PyTorch package index: https://download.pytorch.org/whl/cu118) and TensorRT 8.6. This release is the second major release of Torch-TensorRT: the default frontend has changed from TorchScript to Dynamo, allowing users to more easily control and customize the compiler in Python.
The Dynamo frontend supports both JIT workflows, through `torch.compile`, and AOT workflows, through `torch.export` + `torch_tensorrt.compile`. It targets the Core ATen Opset (https://pytorch.org/docs/stable/torch.compiler_ir.html#core-aten-ir) and currently has 82% coverage. Just like in TorchScript, graphs will be partitioned based on the ability to map operators to TensorRT, in addition to any graph surgery done in Dynamo.
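The two workflows can be sketched as follows. This is a minimal illustration, not code from the release: `MyModel` is a placeholder module, and the TensorRT calls are guarded because they require a CUDA GPU and the `torch_tensorrt` package.

```python
import torch
import torch.nn as nn

# Placeholder model for illustration; any nn.Module works.
class MyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, kernel_size=3)

    def forward(self, x):
        return torch.relu(self.conv(x))

model = MyModel().eval()
inputs = [torch.randn(1, 3, 32, 32)]

# Guard: compilation requires torch_tensorrt and a CUDA device.
try:
    import torch_tensorrt
    trt_available = torch.cuda.is_available()
except ImportError:
    trt_available = False

if trt_available:
    model = model.cuda()
    inputs = [i.cuda() for i in inputs]

    # JIT workflow: TensorRT engines are built lazily on the first call
    jit_model = torch.compile(model, backend="torch_tensorrt")
    jit_model(*inputs)

    # AOT workflow: ir="dynamo" traces via torch.export, then compiles
    trt_model = torch_tensorrt.compile(model, ir="dynamo", inputs=inputs)
```

The JIT path keeps PyTorch in the loop and recompiles on graph breaks, while the AOT path produces a standalone compiled artifact up front.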
Output Format
Through the Dynamo frontend, different output formats can be selected for AOT workflows via the `output_format` kwarg. The choices are `torchscript`, where the resulting compiled module will be traced with `torch.jit.trace` and is suitable for Python-less deployments; `exported_program`, a new serializable format for PyTorch models; or, if you would like to run further graph transformations on the resultant model, `graph_module`, which will return a `torch.fx.GraphModule`.
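A minimal sketch of selecting each format, assuming a CUDA GPU and `torch_tensorrt` are available (the model and the guard are placeholders; the kwarg values are as listed above):

```python
import torch
import torch.nn as nn

# Placeholder model for illustration.
model = nn.Sequential(nn.Linear(4, 2)).eval()
inputs = [torch.randn(1, 4)]

# Guard: compilation requires torch_tensorrt and a CUDA device.
try:
    import torch_tensorrt
    trt_available = torch.cuda.is_available()
except ImportError:
    trt_available = False

if trt_available:
    model, inputs = model.cuda(), [i.cuda() for i in inputs]

    # Traced with torch.jit.trace internally; suits Python-less deployment
    trt_ts = torch_tensorrt.compile(
        model, ir="dynamo", inputs=inputs, output_format="torchscript"
    )
    torch.jit.save(trt_ts, "model_trt.ts")

    # New serializable ExportedProgram format
    trt_ep = torch_tensorrt.compile(
        model, ir="dynamo", inputs=inputs, output_format="exported_program"
    )

    # torch.fx.GraphModule, for further graph transformations
    trt_gm = torch_tensorrt.compile(
        model, ir="dynamo", inputs=inputs, output_format="graph_module"
    )
```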
Multi-GPU Safety
To address a long-standing source of overhead, single-GPU systems will now operate without the typically required device checks. These checks can be re-added when multiple GPUs are available to the host process using `torch_tensorrt.runtime.set_multi_device_safe_mode`. More information can be found here: https://pytorch.org/TensorRT/user_guide/runtime.html
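For example, a minimal sketch of re-enabling the checks (the call is taken from the linked runtime documentation; the import guard is ours, so the snippet is a no-op where `torch_tensorrt` is not installed):

```python
# Re-enable the per-call device validation that v2.2.0 skips on
# single-GPU systems; useful when multiple GPUs are visible to the process.
try:
    import torch_tensorrt

    torch_tensorrt.runtime.set_multi_device_safe_mode(True)
    safe_mode_set = True
except ImportError:
    safe_mode_set = False  # torch_tensorrt not installed in this environment
```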
Capability Validators
In the Dynamo frontend, tests can be written and associated with converters to dynamically enable or disable them based on conditions in the target graph.
For example, the convolution converter in Dynamo only supports 1D, 2D, and 3D convolution. We can therefore create a lambda which, given a convolution FX `Node`, determines whether the convolution is supported:
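A minimal sketch of such a validator (the function name and argument layout are illustrative, not the exact registry code; for `aten.convolution` the weight tensor at `args[1]` is shaped `(out_channels, in_channels, *kernel_dims)`, so its rank is the number of spatial dimensions plus 2):

```python
# Illustrative capability validator: accept only 1D/2D/3D convolutions.
# A real validator receives a torch.fx.Node; here we rely only on the
# node's args and the fake tensor stored in the weight's meta["val"].
def convolution_validator(conv_node) -> bool:
    weight = conv_node.args[1]
    # Weight rank minus the (out_channels, in_channels) dims gives the
    # number of spatial dimensions of the convolution.
    spatial_dims = len(weight.meta["val"].shape) - 2
    return spatial_dims in (1, 2, 3)
```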
In cases where the `Node` is not supported, it will be partitioned out and run in PyTorch. All capability validators are run prior to partitioning, after the lowering phase.
More information on writing converters for the Dynamo frontend can be found here: https://pytorch.org/TensorRT/contributors/dynamo_converters.html
Breaking Changes
`torch.nn.Module`s or `torch.fx.GraphModule`s provided to `torch_tensorrt.compile` will by default be exported using `torch.export` and then compiled. This default can be overridden by setting the `ir=[torchscript|fx]` kwarg. Bugs reported against the compiler will first be addressed in the Dynamo stack before other frontends are tried; however, community pull requests adding functionality to the TorchScript and FX frontends will still be accepted.
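A minimal sketch of opting out of the new default (the model is a placeholder and the calls are guarded, since compilation requires a CUDA GPU and `torch_tensorrt`):

```python
import torch
import torch.nn as nn

# Placeholder model for illustration.
model = nn.Sequential(nn.Linear(8, 4)).eval()
inputs = [torch.randn(2, 8)]

# Guard: compilation requires torch_tensorrt and a CUDA device.
try:
    import torch_tensorrt
    trt_available = torch.cuda.is_available()
except ImportError:
    trt_available = False

if trt_available:
    model, inputs = model.cuda(), [i.cuda() for i in inputs]

    # Default in v2.2.0: the torch.export-based Dynamo frontend
    trt_default = torch_tensorrt.compile(model, inputs=inputs)

    # Opt back into the legacy frontends explicitly
    trt_ts = torch_tensorrt.compile(model, ir="torchscript", inputs=inputs)
    trt_fx = torch_tensorrt.compile(model, ir="fx", inputs=inputs)
```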
What's Changed
* chore: Update Torch and Torch-TRT versions and docs on `main` by @gs-olive in #1784
* fix: Allow full model compilation with collection inputs (`input_signature`) by @gs-olive in #1656
* fix: Error caused by invalid binding name in `TRTEngine.to_str()` method by @gs-olive in #1846
* fix: Implement `aten.mean.default` and `aten.mean.dim` converters by @gs-olive in #1810
* fix: Add version checking for `torch._dynamo` import in `__init__` by @gs-olive in #1881
* fix: Improve input weight handling to `acc_ops` convolution layers in FX by @gs-olive in #1886
* fix: Upgrade `main` to TRT 8.6, CUDA 11.8, CuDNN 8.8, Torch Dev by @gs-olive in #1852
* fix: Improve partitioning + lowering systems in `torch.compile` path by @gs-olive in #1879
* fix: Add support for default dimension in `aten.cat` by @gs-olive in #1863
* fix: Address `.numpy()` issue on fake tensors by @gs-olive in #1949
* fix/feat: Add lowering pass to resolve most `aten::Int.Tensor` uses by @gs-olive in #1937
* fix: Add decomposition for `aten.addmm` by @gs-olive in #1953
* fix: Add lowering pass to remove output repacking in `convert_method_to_trt_engine` calls by @gs-olive in #1945
* chore/fix: Update `TRTInterpreter` impl in Dynamo compile [1 / x] by @gs-olive in #2002
* feat: Add `options` kwargs for Torch compile [3 / x] by @gs-olive in #2005
* feat: Add support for output data types in `TRTInterpreter` [2 / x] by @gs-olive in #2004
* chore: Upgrade Torch nightly to `2.1.0.dev20230605` [4 / x] by @gs-olive in #1975
* fix/feat: Move convolution core to `impl` + add feature (FX converter refactor) by @gs-olive in #1972
* feat: Add support for `TorchTensorRTModule` in Dynamo [1 / x] by @gs-olive in #2003
* fix: Add support for `truncate_long_and_double` in Dynamo [8 / x] by @gs-olive in #1983
* fix: Move all `aten` PRs to Dynamo converter registry by @gs-olive in #2070
* examples: Add example usage scripts for `torch_tensorrt.dynamo.compile` path [1.1 / x] by @gs-olive in #1966
* ci: Add automatic GHA job to build + push Docker Container on `main` by @gs-olive in #2129
* chore: Add `pyyaml` import to GHA Docker job by @gs-olive in #2170
* fix: Update `aten.embedding` to reflect schema by @gs-olive in #2182
* feat: Add `_to_copy`, `operator.get` and `clone` ATen converters by @gs-olive in #2161
* fix: Repair broadcasting utility for `aten.where` by @gs-olive in #2228
* fix: Set `dynamic=False` in `torch.compile` call by @gs-olive in #2240
* fix: Allow rank differences in `aten.expand` by @gs-olive in #2234
* fix: Legacy CI `pip` installation by @gs-olive in #2239
* feat: Add support for `require_full_compilation` in Dynamo by @gs-olive in #2138
* fix: Add special cases for `clone` and `to_copy` where input of graph is output by @gs-olive in #2265
* minor fix: Update `get_ir` prefixes by @gs-olive in #2369
* fix: Repair `aten.where` with Numpy + Broadcast by @gs-olive in #2372
* cherry-pick: Key converters and documentation to `release/2.1` by @gs-olive in #2387
* cherry-pick: Transformer XL fix to `release/2.1` by @gs-olive in #2414
* chore: Upgrade `release` to Torch 2.1.1 by @gs-olive in #2472
* fix: `release/2.1` CI Repair by @gs-olive in #2528
* cherry-pick: Port most changes from `main` by @gs-olive in #2574
* cherry-pick: Docker fixes `release/2.2` by @gs-olive in #2628
* cherry-pick: Remove extraneous argument in `compile` (#2635) by @gs-olive in #2638
New Contributors
Full Changelog: v1.4.0...v2.2.0
This discussion was created from the release Torch-TensorRT v2.2.0.