Add support for stop_words in Ray MBridge deployment by athitten · Pull Request #605 · NVIDIA-NeMo/Export-Deploy

athitten · 2026-02-18T05:38:00Z

Extracts stop_words from the incoming request and exposes them in nemo_deploy/llm/megatronllm_deployable_ray.py and nemo_deploy/llm/megatronllm_deployable.py so they can be passed along to the mcore inference engine. Previously, generation continued past the stop_words passed in incoming eval requests and produced many unnecessary tokens. Honoring the stop_words cuts that extra generation and improves speed.

Speed improvement on a 10% gsm8k eval with Llama 3.2 1B:

Before stop_words support in deployment: 10 min
After stop_words support in deployment: 4 min 37 s
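
To illustrate why honoring stop_words shortens generation, here is a minimal, self-contained sketch. It is not the code from this PR and does not use the NeMo or mcore APIs: `extract_stop_words`, `generate_until_stop`, the request dictionary layout, and the toy token stream are all hypothetical names and data chosen for illustration only.

```python
from typing import Iterable, List, Tuple


def extract_stop_words(request: dict) -> List[str]:
    """Hypothetical helper: read stop words from a request payload.

    Accepts a single string or a list of strings under the
    "stop_words" key and drops empty entries.
    """
    raw = request.get("stop_words")
    if raw is None:
        return []
    if isinstance(raw, str):
        raw = [raw]
    return [word for word in raw if word]


def generate_until_stop(
    tokens: Iterable[str], stop_words: List[str], max_tokens: int
) -> Tuple[str, int]:
    """Toy decode loop: consume tokens until a stop word shows up.

    Returns the text cut just before the earliest stop word, together
    with the number of tokens that were consumed.
    """
    text = ""
    used = 0
    for token in tokens:
        if used >= max_tokens:
            break
        text += token
        used += 1
        # Positions of every stop word currently present in the text.
        hits = [i for i in (text.find(w) for w in stop_words) if i != -1]
        if hits:
            return text[: min(hits)], used
    return text, used


# Toy stand-in for a model's token stream (assumed data, not real output).
stream = ["A:", " 4", "\n\n", "Q:", " 3+3?", "\n", "A:", " 6"]
request = {"prompt": "Q: 2+2?", "stop_words": ["\n\n"], "max_tokens": 8}

with_stop = generate_until_stop(stream, extract_stop_words(request), 8)
without_stop = generate_until_stop(stream, [], 8)
print(with_stop)     # text ends at the stop word after 3 tokens
print(without_stop)  # all 8 tokens are consumed
```

In this toy run the loop stops after 3 of the 8 available tokens once the stop word appears, which is the same effect, in miniature, as the reduction in eval time reported above.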

Signed-off-by: Abhishree <abhishreetm@gmail.com>

copy-pr-bot · 2026-02-18T05:38:03Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

athitten · 2026-02-18T05:46:17Z

/ok to test 7937894

Signed-off-by: Abhishree <abhishreetm@gmail.com>

athitten · 2026-02-18T20:10:29Z

/ok to test 1fedc31

Add support for stop_words in Ray deployment

7937894

Signed-off-by: Abhishree <abhishreetm@gmail.com>

athitten requested review from meatybobby, oyilmaz-nvidia and pthombre as code owners February 18, 2026 05:38

github-actions bot added deploy LLM tests labels Feb 18, 2026

athitten mentioned this pull request Feb 18, 2026

Stop word support for mcore inference #564

Open

copy-pr-bot bot temporarily deployed to test February 18, 2026 05:46 Inactive

copy-pr-bot bot temporarily deployed to nemo-ci February 18, 2026 06:03 Inactive

copy-pr-bot bot had a problem deploying to nemo-ci February 18, 2026 06:03 Failure

copy-pr-bot bot temporarily deployed to nemo-ci February 18, 2026 06:03 Inactive

Fix tests for stop_words

1fedc31

Signed-off-by: Abhishree <abhishreetm@gmail.com>

copy-pr-bot bot temporarily deployed to test February 18, 2026 20:11 Inactive

copy-pr-bot bot temporarily deployed to nemo-ci February 18, 2026 20:23 Inactive

oyilmaz-nvidia approved these changes Feb 18, 2026

oyilmaz-nvidia enabled auto-merge (squash) February 18, 2026 20:47

copy-pr-bot bot had a problem deploying to nemo-ci February 18, 2026 21:02 Failure

copy-pr-bot bot had a problem deploying to nemo-ci February 18, 2026 21:05 Failure

copy-pr-bot bot temporarily deployed to nemo-ci February 18, 2026 21:05 Inactive

copy-pr-bot bot had a problem deploying to nemo-ci February 18, 2026 22:42 Failure

copy-pr-bot bot temporarily deployed to nemo-ci February 18, 2026 22:42 Inactive

copy-pr-bot bot had a problem deploying to nemo-ci February 18, 2026 23:33 Failure

athitten wants to merge 2 commits into main from athitten/support_stop_word
