Add DecomposeGruPass for ARM backend (#17137) #19463
Summary:
Adds quantizable versions of GRU and RNN modules that can be used with
PyTorch quantization-aware training (QAT) for the ARM backend.
The standard nn.GRU and nn.RNN are opaque composite ops that the quantizer
cannot annotate. These modules decompose the RNN operations into
nn.Linear + FloatFunctional so that QAT observers can be inserted at
each arithmetic boundary.
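The decomposition pattern can be sketched as follows. This is an illustrative example (the class name `ObservableRNNCell` is hypothetical, not the module added by this PR): every matmul becomes an `nn.Linear` and every elementwise add goes through a `FloatFunctional`, so QAT observers have a concrete module to attach to at each arithmetic boundary.

```python
import torch
import torch.nn as nn
import torch.ao.nn.quantized as nnq

class ObservableRNNCell(nn.Module):
    """Hypothetical sketch: one tanh-RNN cell built from nn.Linear plus
    FloatFunctional, instead of the opaque composite nn.RNN op."""

    def __init__(self, input_size: int, hidden_size: int):
        super().__init__()
        self.ih = nn.Linear(input_size, hidden_size)   # x_t @ W_ih.T + b_ih
        self.hh = nn.Linear(hidden_size, hidden_size)  # h_{t-1} @ W_hh.T + b_hh
        self.add = nnq.FloatFunctional()               # observable add

    def forward(self, x_t, h_prev):
        # h_t = tanh(Linear(x_t) + Linear(h_prev)); the FloatFunctional add
        # lets an observer sit on the sum before the nonlinearity.
        return torch.tanh(self.add.add(self.ih(x_t), self.hh(h_prev)))
```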
## New modules:
- `GRUCell`, `_GRUSingleLayer`, `_GRULayer`, `GRU`
- `RNNCell`, `_RNNSingleLayer`, `_RNNLayer`, `RNN`
## Features:
- `from_float()` class method to convert from nn.GRU/nn.RNN
- Multi-layer support
- Bidirectional support
- Both tanh and relu nonlinearities (for RNN)
## Usage:
```python
from executorch.backends.arm.quantizable import GRU, RNN
# Create quantizable GRU
model = GRU(input_size=10, hidden_size=20, num_layers=2)
# Or convert from existing nn.GRU
eager_model = torch.nn.GRU(10, 20, 2)
eager_model.qconfig = torch.ao.quantization.get_default_qat_qconfig("fbgemm")
quantizable_model = GRU.from_float(eager_model)
```
Differential Revision: D92059608
Summary:
Adds a decomposition pass that transforms aten.gru.input into elementary
ops supported by TOSA (matmul, sigmoid, tanh, mul, add, slice, cat).
GRU cell equations per timestep:
r_t = sigmoid(x_t @ W_ir.T + b_ir + h_{t-1} @ W_hr.T + b_hr)
z_t = sigmoid(x_t @ W_iz.T + b_iz + h_{t-1} @ W_hz.T + b_hz)
n_t = tanh(x_t @ W_in.T + b_in + r_t * (h_{t-1} @ W_hn.T + b_hn))
h_t = n_t + z_t * (h_{t-1} - n_t)
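The equations above translate directly into the elementary ops the pass targets. A minimal reference sketch (not the pass itself; weights follow the `torch.nn.GRUCell` per-gate split):

```python
import torch

def gru_step_reference(x_t, h_prev,
                       W_ir, W_iz, W_in, W_hr, W_hz, W_hn,
                       b_ir, b_iz, b_in, b_hr, b_hz, b_hn):
    """Literal translation of the GRU cell equations, using only
    matmul, sigmoid, tanh, mul, and add."""
    r = torch.sigmoid(x_t @ W_ir.T + b_ir + h_prev @ W_hr.T + b_hr)
    z = torch.sigmoid(x_t @ W_iz.T + b_iz + h_prev @ W_hz.T + b_hz)
    n = torch.tanh(x_t @ W_in.T + b_in + r * (h_prev @ W_hn.T + b_hn))
    # h_t = (1 - z) * n + z * h_{t-1}, rewritten as n + z * (h_{t-1} - n)
    return n + z * (h_prev - n)
```

The per-gate weights can be obtained from an `nn.GRUCell` by chunking `weight_ih`/`weight_hh` (and biases) into three rows ordered (reset, update, new).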
Features:
- Multi-layer GRU support
- Bidirectional GRU support
- With/without bias
- batch_first support
- Batched gate computation (2 mm ops per timestep instead of 6)
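The batched gate computation in the last bullet can be sketched like this (an illustrative example, not the pass implementation): the three gates share one matmul against the stacked `(3H, *)` weight per input, giving 2 mm ops per timestep instead of 6.

```python
import torch

def gru_step_batched(x_t, h_prev, w_ih, w_hh, b_ih, b_hh):
    """One GRU timestep with batched gates. Weight layout follows
    torch.nn.GRU: w_ih is (3H, input_size), rows ordered
    (reset, update, new); likewise for w_hh and the biases."""
    H = h_prev.shape[-1]
    gi = x_t @ w_ih.T + b_ih      # all three input-side gates in one mm
    gh = h_prev @ w_hh.T + b_hh   # all three hidden-side gates in one mm
    i_r, i_z, i_n = gi.split(H, dim=-1)
    h_r, h_z, h_n = gh.split(H, dim=-1)
    r = torch.sigmoid(i_r + h_r)
    z = torch.sigmoid(i_z + h_z)
    n = torch.tanh(i_n + r * h_n)
    return n + z * (h_prev - n)
```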
Differential Revision: D92058313
🔗 Helpful Links: 🧪 see artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/19463
Note: links to docs will display an error until the docs builds have been completed.
❌ 6 new failures, 1 unrelated failure as of commit 60d56eb with merge base 126507c.
FLAKY: one job failed, but likely due to flakiness present on trunk.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
cc @digantdesai @freddan80 @per @zingo @oscarandersson8218 @mansnils @Sebastian-Larsson @robell @rascani