Skip to content

[vllm] test out ray v2 executor #1660

Open
erictang000 wants to merge 1 commit into
NovaSky-AI:mainfrom
erictang000:test_ray_v2_executor
Open

[vllm] test out ray v2 executor #1660
erictang000 wants to merge 1 commit into
NovaSky-AI:mainfrom
erictang000:test_ray_v2_executor

Conversation

@erictang000
Copy link
Copy Markdown
Collaborator

Testing for #1653 to see if we need to fix anything on vllm side

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request enables the Ray V2 executor backend for vLLM by setting the VLLM_USE_RAY_V2_EXECUTOR_BACKEND environment variable in both the runtime environment utility and the test configuration. The reviewer suggests avoiding hardcoding this value to allow users to override it via environment variables and recommends moving the assignment into a backend-specific block to prevent environment pollution for other backends.

Comment on lines +608 to +609
# manually set this for testing everywhere
env_vars["VLLM_USE_RAY_V2_EXECUTOR_BACKEND"] = "1"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Hardcoding VLLM_USE_RAY_V2_EXECUTOR_BACKEND to "1" unconditionally prevents users from overriding this setting via environment variables. It is better to check if the variable is already set in os.environ and only apply the default if it is missing. This ensures that users can explicitly disable the V2 executor if they encounter issues. Additionally, since this is a vLLM-specific setting, it would ideally be placed within the vLLM backend check block (around line 629) to avoid polluting the environment for other backends.

Suggested change
# manually set this for testing everywhere
env_vars["VLLM_USE_RAY_V2_EXECUTOR_BACKEND"] = "1"
# Use Ray V2 executor for vLLM by default, but allow override from environment
env_vars["VLLM_USE_RAY_V2_EXECUTOR_BACKEND"] = os.environ.get("VLLM_USE_RAY_V2_EXECUTOR_BACKEND", "1")

@erictang000
Copy link
Copy Markdown
Collaborator Author

regression - train gpu ci all passes but runs a little slower

on main at commit 4e0ba0edac8229dc2e9e77aaf86dee792bb61aa8 - 2 hours 9 mins: `https://console.anyscale.com/cld_hxkifz7xa22mwicp21nzkds1lw/prj_4b6c498rypyq6g7yhk6vzgjevt/jobs/prodjob_cd3xhxltgtnnbm27y82vxbcr2r?job-logs-section-tabs=application_logs&job-tab=overview

this commit - 2 hours 31 mins: https://console.anyscale.com/cld_hxkifz7xa22mwicp21nzkds1lw/prj_4b6c498rypyq6g7yhk6vzgjevt/jobs/prodjob_6uukbqbz2xe4ss84446xlm1p7g?job-logs-section-tabs=application_logs&job-tab=overview

maybe new ray executor takes longer to spin down compared to before?

megatron models ci passes as before

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant