[ARM] GRU fails to fallback to CPU (Non-passthrough operation could not run on NPU)

I am exporting a streaming GRU model with torch.export and PT2E quantization, then lowering it to the ExecuTorch Ethos‑U backend. Export and quantization succeed, but lowering fails during Vela compilation with:

Traceback (most recent call last):
  File "/mnt/data/sarah/quantize.py", line 98, in <module>
    edge_program_manager = to_edge_transform_and_lower(
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/data/sarah/executorch/exir/program/_program.py", line 114, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/data/sarah/executorch/exir/program/_program.py", line 1371, in to_edge_transform_and_lower
    edge_manager = edge_manager.to_backend(method_to_partitioner)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/data/sarah/executorch/exir/program/_program.py", line 114, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/data/sarah/executorch/exir/program/_program.py", line 1672, in to_backend
    new_edge_programs = to_backend(method_to_programs_and_partitioners)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/functools.py", line 909, in wrapper
    return dispatch(args[0].__class__)(*args, **kw)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/data/sarah/executorch/exir/backend/backend_api.py", line 762, in _
    lower_all_submodules_to_backend(
  File "/mnt/data/sarah/executorch/exir/backend/backend_api.py", line 591, in lower_all_submodules_to_backend
    backend_name_to_subclass[backend_id].preprocess_multimethod(
  File "/mnt/data/sarah/executorch/exir/backend/backend_details.py", line 129, in preprocess_multimethod
    preprocess_result = cls.preprocess(program, compile_spec_for_program)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/data/sarah/executorch/backends/arm/ethosu/backend.py", line 81, in preprocess
    binary = EthosUBackend._compile_tosa_flatbuffer(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/data/sarah/executorch/backends/arm/ethosu/backend.py", line 55, in _compile_tosa_flatbuffer
    binary = vela_compile(
             ^^^^^^^^^^^^^
  File "/mnt/data/sarah/executorch/backends/arm/arm_vela.py", line 70, in vela_compile
    vela.main(" ".join(args).split(" "))
  File "/mnt/data/sarah/executorch-venv/lib/python3.12/site-packages/ethosu/vela/vela.py", line 1266, in main
    process_regor(
  File "/mnt/data/sarah/executorch-venv/lib/python3.12/site-packages/ethosu/vela/vela.py", line 142, in process_regor
    compiled_model = regor.compile(accelerator, network, fmt, system_config, options=options, verbose=True)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Non-passthrough operation could not run on NPU.

The exported graph contains:
- aten.gru.input (unsupported on Ethos‑U)
- aten.linear, aten.tanh, aten.sigmoid (supported)

I expected the partitioner to:
- leave the GRU on CPU
- offload the linear/tanh/sigmoid heads to the NPU

Instead, the Ethos‑U backend attempts to compile a segment that still contains the GRU, and Vela rejects it with the error above.
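
To make the expected split concrete, here is a toy sketch in plain Python. It is not the ExecuTorch partitioner, which works on graph dependencies rather than a flat op list. The names `NPU_SUPPORTED` and `partition` and the op order in `graph_ops` are assumptions for illustration only. The support status is taken from the lists in this issue, not queried from the backend.

```python
from itertools import groupby

# Assumption: support status as listed in this issue, not queried from Vela.
NPU_SUPPORTED = {"aten.linear", "aten.tanh", "aten.sigmoid"}


def partition(ops):
    """Group consecutive ops into (target, ops) segments.

    target is "npu" when every op in the segment is in NPU_SUPPORTED,
    otherwise "cpu". This is a flat, order-based toy model.
    """
    segments = []
    for on_npu, group in groupby(ops, key=lambda op: op in NPU_SUPPORTED):
        segments.append(("npu" if on_npu else "cpu", list(group)))
    return segments


# Hypothetical op order for a GRU followed by linear/tanh/sigmoid heads.
graph_ops = [
    "aten.gru.input",
    "aten.linear",
    "aten.tanh",
    "aten.linear",
    "aten.sigmoid",
]

segments = partition(graph_ops)
for target, ops in segments:
    print(target, ops)
```

In this model no "npu" segment ever contains aten.gru.input, which is the property I expected from the real partitioner before anything is handed to Vela.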

cc @digantdesai @SS-JIA @freddan80 @per @zingo @oscarandersson8218 @mansnils @Sebastian-Larsson @robell

[ARM] GRU fails to fallback to CPU (Non-passthrough operation could not run on NPU) #17753
