
[RFC] Move Transducer (RNN-T/TDT) support to extension/asr/runner/ #17686

@kirklandsign


πŸš€ The feature, motivation and pitch

Motivation

extension/asr/runner/ currently provides AsrRunner, which only supports Seq2Seq (encoder-decoder) models like Whisper. The decode loop assumes a standard autoregressive pattern: encoder β†’ text_decoder(input_ids, encoder_output, cache_position) β†’ logits β†’ sample β†’ next_token.

Transducer-based ASR models (RNN-T, TDT, HAT) use a fundamentally different decode paradigm β€” frame-by-frame scanning with a joint network β€” and cannot reuse AsrRunner. As a result, the Parakeet TDT runner (examples/models/parakeet/main.cpp) implements the entire decode algorithm inline (~200 lines of greedy decode + LSTM state management), making it hard to reuse for other transducer models.
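For concreteness, the transducer decode loop looks roughly like the following. This is a minimal, self-contained sketch with a stubbed joint network; `greedy_decode`, `kBlankId`, and the stub are illustrative only, not the ExecuTorch API or the exact Parakeet logic:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Illustrative greedy transducer decode (not the ExecuTorch API).
// The joint network is stubbed: it "emits" each non-blank frame's label
// once and then predicts blank, which is roughly how a trained joint behaves.
constexpr int64_t kBlankId = 0;
constexpr int kMaxSymbolsPerStep = 10;

std::vector<int64_t> greedy_decode(const std::vector<int64_t>& encoder_frames) {
  std::vector<int64_t> tokens;
  for (int64_t frame : encoder_frames) {  // scan encoder output frame by frame
    for (int s = 0; s < kMaxSymbolsPerStep; ++s) {
      // Stub for joint(encoder_frame, predictor_state): emit once, then blank.
      int64_t pred = (s == 0) ? frame : kBlankId;
      if (pred == kBlankId) {
        break;  // blank: advance to the next frame, predictor state unchanged
      }
      tokens.push_back(pred);  // non-blank: emit and stay on the same frame
      // A real runner would update the prediction-network (LSTM) state here.
    }
  }
  return tokens;
}
```

Note the inner loop: unlike the seq2seq pattern, the model may emit zero or several tokens per encoder frame, which is why the loop structure cannot be expressed through `AsrRunner`'s one-token-per-step decode.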

Proposal

Restructure extension/asr/runner/ to support both architectures:

  1. Rename AsrRunner β†’ Seq2SeqRunner to clarify that it's Seq2Seq-specific
  2. Add TransducerRunner for RNN-T/TDT models, extracting the core decode logic from Parakeet's main.cpp
  3. Keep both in the same flat directory (no subdirectories)

Proposed file layout

```
extension/asr/runner/
├── CMakeLists.txt
├── seq2seq_runner.h         # renamed from runner.h
├── seq2seq_runner.cpp       # renamed from runner.cpp
├── transducer_runner.h      # new
└── transducer_runner.cpp    # new
```

TransducerRunner sketch

```cpp
namespace executorch::extension::asr {

struct TransducerConfig {
  int64_t blank_id = 0;
  int64_t num_rnn_layers = 2;
  int64_t pred_hidden = 640;
  int64_t max_symbols_per_step = 10;
  // TDT duration values; empty = standard RNN-T (duration always 1)
  std::vector<int> durations = {};
};

class TransducerRunner {
 public:
  TransducerRunner(
      const std::string& module_path,
      const std::string& tokenizer_path,
      TransducerConfig config);

  Error load();

  // Returns decoded token IDs with frame offsets
  Result<std::vector<Token>> transcribe(
      TensorPtr preprocessed_features,
      std::function<void(const std::string&)> token_callback = {});
};

}  // namespace executorch::extension::asr
```

Expected module methods: encoder, decoder_step, joint (+ optional preprocessor).
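To illustrate how `durations` would be consumed: in TDT, the joint additionally predicts a duration index, and the frame pointer advances by the corresponding duration instead of always by 1. The sketch below is a hedged stand-in for how `transcribe` might drive the decode; `tdt_decode` and the `joint` callable are hypothetical placeholders for the module's `joint`/`decoder_step` calls, not a proposed signature:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Illustrative TDT decode skeleton (not the ExecuTorch API). The joint
// stand-in returns {token_id, duration_index}; `durations` maps that index
// to a frame advance. An empty `durations` degrades to standard RNN-T,
// where blank advances exactly one frame.
std::vector<int64_t> tdt_decode(
    size_t num_frames,
    const std::vector<int>& durations,  // e.g. {0, 1, 2, 3, 4} for TDT
    const std::function<std::pair<int64_t, int>(size_t)>& joint,  // stub
    int64_t blank_id = 0,
    int max_symbols_per_step = 10) {
  std::vector<int64_t> tokens;
  size_t t = 0;
  while (t < num_frames) {
    int emitted = 0;
    int advance = 1;  // RNN-T default: move one frame on blank
    while (emitted < max_symbols_per_step) {
      auto [token, dur_idx] = joint(t);
      if (!durations.empty()) {
        advance = durations[dur_idx];  // TDT: joint also predicts a skip
      }
      if (token == blank_id) break;
      tokens.push_back(token);
      ++emitted;
      if (advance > 0) break;  // nonzero duration ends this frame's emissions
    }
    t += static_cast<size_t>(advance > 0 ? advance : 1);
  }
  return tokens;
}
```

Keeping this frame-advance logic behind `TransducerConfig::durations` is what lets one runner serve both standard RNN-T and TDT checkpoints.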

What stays in examples/models/parakeet/

Model-specific post-processing (timestamp computation at token/word/segment level) remains in the example β€” it's not general enough for a shared runner.

Migration

  • Whisper main.cpp: AsrRunner β†’ Seq2SeqRunner (one-line rename)
  • Parakeet main.cpp: replace inline decode with TransducerRunner::transcribe()
  • Downstream consumers of AsrRunner: update include path and class name

Alternatives

No response

Additional context

No response

RFC (Optional)

No response

cc @larryliu0820 @mergennachin @cccclai @helunwencser @jackzhxng
