[TRT RTX EP] Implement GetEPContextNodes() #24901
base: main
Conversation
cc @gedoensmax @ankan-ban @ishwar-raut1 @chilo-ms @jywu-msft to review

/azp run Linux QNN CI Pipeline,Win_TRT_Minimal_CUDA_Test_CI,Windows ARM64 QNN CI Pipeline,Windows GPU Doc Gen CI Pipeline,Windows x64 QNN CI Pipeline

Azure Pipelines successfully started running 5 pipeline(s).
@@ -266,6 +268,7 @@ class NvExecutionProvider : public IExecutionProvider {
  std::string cache_prefix_;
  std::string op_types_to_exclude_;
  int nv_profile_index_ = 0;
  std::vector<std::unique_ptr<onnxruntime::Model>> ep_context_models_;
@@ -72,7 +72,8 @@ ONNX_NAMESPACE::ModelProto* CreateCtxModel(const GraphViewer& graph_viewer,
                                           const int64_t embed_mode,
                                           const std::string compute_capability,
                                           const std::string onnx_model_path,
                                           const logging::Logger* logger,
                                           std::vector<std::unique_ptr<onnxruntime::Model>>& ep_context_models,
                                           const std::string& ep_context_node_name) {
  auto model_build = graph_viewer.CreateModel(*logger);
Seems weird to me. Why create a model from an existing graph_viewer? How is its lifecycle controlled relative to the existing graph_viewer? The ep_context_model instance needs to stay valid as long as the EP instance is alive.
All the EP needs to do for GetEpContextNodes() is create a model instance that holds all the EPContext nodes for the graph partitioner to query. The EP only needs to add all EPContext nodes into that model instance.
onnxruntime/onnxruntime/core/providers/qnn/qnn_execution_provider.cc
Lines 1179 to 1191 in 9705b17

qnn_ep_context_model_ = Factory<Model>::Create(std::string{"qnn_ep_context_model"}, false, logger);
ORT_RETURN_IF_ERROR(qnn::CreateEPContextNodes(qnn_ep_context_model_.get(),
                                              context_buffer.get(),
                                              buffer_size,
                                              qnn_backend_manager_->GetSdkVersion(),
                                              fused_nodes_and_graphs,
                                              qnn_models_,
                                              context_model_path,
                                              qnn_context_embed_mode_,
                                              max_spill_fill_buffer_size,
                                              logger,
                                              share_ep_contexts_,
                                              stop_share_ep_contexts_));
The name of CreateCtxModel() might be a bit confusing. It has some historic reasons:
- At the beginning, when the TRT EP first implemented the EP Context feature, it didn't follow the rule of implementing GetEpContextNodes().
- It only supported a model containing a single EPContext node, meaning the whole graph could be run by TRT; there was no partitioning.
- It didn't implement the EP API's GetEpContextNodes(); instead it directly created the ONNX model, which is why this function has this name: it really creates an ONNX model and dumps it to a file.
Right now, since the RTX EP and TRT EP are going to follow the EP Context implementation rule and implement GetEpContextNodes(), we could rename it to avoid confusion.
Sharing some source between TRT and TRT RTX seems like a good idea but will be kind of awkward. Any opinion on this? Otherwise let's duplicate the changes and source for now.

There was a fix for the Web CI pipeline; please merge the code from the latest main branch.
When running with the TRT RTX EP, will it also handle the case where the model contains contrib ops that fall back to the CUDA EP or CPU? I assume it will. If that's the case, the following change needs to be added to this PR as well.
If we will be filling the
Implements GetEPContextNodes()