Update guidellm #6

Chibukach · 2025-07-03T09:19:50Z

This PR updates the guidellm automation in ClearML to use the latest guidellm Pythonic interface with benchmarking scenarios.
The standard research benchmarking scenarios have been updated to reflect this change.
It also adds support for development using custom branches in a single location.

anmarques · 2025-07-18T19:28:30Z

src/automation/vllm/server.py

 ):
    task = Task.current_task()

+    print("Inside start vllm server")


Is this a debugging print?
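
If the output is worth keeping, one option is to route it through the standard `logging` module so it can be silenced by level instead of removed. A minimal sketch, assuming a hypothetical logger name and helper function (neither is in the PR):

```python
import logging

# Hypothetical logger name; the PR does not define one.
logger = logging.getLogger("automation.vllm.server")


def describe_server_start(vllm_path, server_command):
    """Log startup details at DEBUG level instead of printing them."""
    logger.debug("Inside start vllm server")
    logger.debug("vllm path is: %s", vllm_path)
    logger.debug("server command: %s", " ".join(server_command))
```

With the logger left at INFO or above, these messages are dropped; setting it to DEBUG brings them back without touching the code.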

anmarques · 2025-07-18T19:30:05Z

src/automation/vllm/server.py

    executable_path = os.path.dirname(sys.executable)
    vllm_path = os.path.join(executable_path, "vllm")

-    num_gpus = torch.cuda.device_count()
+    available_gpus = list(range(torch.cuda.device_count()))


Is there any reason lines 27-31 are necessary? As far as I know, using tensor-parallel-size will do the same with or without this.
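
To make the question concrete, here is a rough sketch of how the two mechanisms could relate. It assumes a hypothetical helper, `build_gpu_settings`, and assumes devices are indexed from 0 with no pre-existing device mask; it is not the code in the PR. Only `--tensor-parallel-size` and `CUDA_VISIBLE_DEVICES` are real names here.

```python
import os


def build_gpu_settings(total_gpus, gpu_count=None):
    """Return (env, extra_args) for a vLLM server subprocess.

    gpu_count=None means "use every visible GPU".
    """
    if total_gpus < 1:
        raise ValueError("no GPUs available")
    count = total_gpus if gpu_count is None else gpu_count
    if not 1 <= count <= total_gpus:
        raise ValueError(f"gpu_count must be between 1 and {total_gpus}")

    env = os.environ.copy()
    if count < total_gpus:
        # Only mask devices when fewer than all GPUs are requested.
        env["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in range(count))
    extra_args = ["--tensor-parallel-size", str(count)]
    return env, extra_args
```

In this sketch the device mask only matters when fewer than all GPUs are requested; when all GPUs are used, passing `--tensor-parallel-size` alone is enough.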

anmarques · 2025-07-18T19:30:15Z

src/automation/vllm/server.py


    parsed_target = urlparse(target)
+    print(f"vllm path is: {vllm_path}")


Another debugging print?

anmarques · 2025-07-18T19:30:25Z

src/automation/vllm/server.py

    ]

+    print(server_command)


Debugging print?

anmarques · 2025-07-18T19:30:46Z

src/automation/vllm/server.py

    server_process = subprocess.Popen(server_command, stdout=server_log_file, stderr=server_log_file, shell=False, env=subprocess_env)

    delay = 5
    server_initialized = False
    for _ in range(server_wait_time // delay):
        try:
            response = requests.get(target + "/models")
+            print(f"response: {response}")


Debugging print?
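
For reference, the readiness loop in this hunk could be factored into a small helper that needs no print to be observable. This is a sketch under assumptions: `wait_until_ready` and its `probe` callable are hypothetical names, and it catches `OSError` on the assumption that connection failures surface as `OSError` subclasses.

```python
import time


def wait_until_ready(probe, wait_time, delay=5, sleep=time.sleep):
    """Call probe() every `delay` seconds until it returns True or time runs out."""
    for _ in range(wait_time // delay):
        try:
            if probe():
                return True
        except OSError:
            # Server is not accepting connections yet; retry after the delay.
            pass
        sleep(delay)
    return False
```

The probe could then wrap the existing `requests.get(target + "/models")` call and report whether the response was successful, and the caller can log the final boolean once instead of printing every response.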

anmarques · 2025-07-18T19:47:27Z

src/automation/tasks/scripts/guidellm_script.py


-
-def main(configurations=None):
+def main():


The reason I passed configurations as an argument is to enable execute_locally. In my experience, when executing the task locally, the process doesn't see the configuration_object for some reason, so I pass the config as a dict directly to the process. This only works for local processes.

Also, I see that you either try to fetch the configuration object or assume that it will be replaced by get_parameters_dict. In ClearML, parameters and configs are different things, and you cannot replace one with the other.
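
A minimal sketch of the fallback described above, assuming a hypothetical helper `load_configuration` and assuming the task object exposes ClearML's `get_configuration_object_as_dict(name)`; the configuration name used by the caller is also an assumption:

```python
def load_configuration(task, name, configurations=None):
    """Return the named configuration as a dict.

    A dict passed in directly (local execution) takes precedence over
    the configuration object stored on the ClearML task.
    """
    if configurations is not None and name in configurations:
        return dict(configurations[name])
    # Assumed ClearML call: Task.get_configuration_object_as_dict(name).
    stored = task.get_configuration_object_as_dict(name)
    return dict(stored) if stored else {}
```

Keeping `main(configurations=None)` and calling a helper like this inside it would preserve local execution while still reading the configuration object, not the parameters, when running remotely.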

anmarques · 2025-07-18T19:49:45Z

src/automation/vllm/server.py

@@ -14,44 +14,58 @@ def start_vllm_server(
    vllm_args, 
    model_id, 
    target, 
-    server_wait_time, 
+    server_wait_time,
+    gpu_count,


Why add gpu count as an argument? In general, if you are running a vLLM server on a remote server, why would you not use all GPUs?

anmarques · 2025-07-18T19:50:20Z

src/automation/tasks/scripts/guidellm_script.py

    # Resolve model_id
    model_id = resolve_model_id(args["Args"]["model"], clearml_model, force_download)

+    gpu_count = int(guidellm_args.get("gpu_count", 1)) 


Why add gpu count as an argument? Why would someone use fewer GPUs than all that are available?

anmarques · 2025-07-18T20:18:44Z

src/automation/tasks/scripts/guidellm_script.py

+    else:
+        filepath = Path(os.path.join(".", "src", "automation", "standards", "benchmarking", f"{DEFAULT_GUIDELLM_SCENARIO}.json"))
+        current_scenario = GenerativeTextScenario.from_file(filepath, dict(guidellm_args))
+    print(current_scenario.model_fields)


Is this a debugging print?

anmarques · 2025-07-18T20:29:40Z

src/automation/tasks/scripts/guidellm_script.py

+    from guidellm.benchmark.scenario import GenerativeTextScenario, get_builtin_scenarios
+
+    user_scenario = guidellm_args.get("scenario", "")
+    if user_scenario:


Let's add a comment here to clarify that all these scenarios are to be defined in guidellm, not here. This is a temporary solution.
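
Something along these lines might do, shown here on a simplified path lookup. This is a sketch only: `resolve_scenario_file` is a hypothetical helper, and the rule that a user scenario maps to `<name>.json` in the standards directory is an assumption, not the PR's logic.

```python
from pathlib import Path


def resolve_scenario_file(user_scenario, standards_dir, default_name):
    """Return the JSON file for the requested scenario, or for the default one."""
    # NOTE: temporary solution. Benchmarking scenarios are meant to be
    # defined in guidellm itself, not in this repository.
    name = user_scenario or default_name
    path = Path(standards_dir) / f"{name}.json"
    if not path.is_file():
        raise FileNotFoundError(f"scenario file not found: {path}")
    return path
```

The returned path could then be handed to `GenerativeTextScenario.from_file`, as the diff already does for the default scenario.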

chibu added 30 commits June 27, 2025 10:45

simple change

f6aa8fe

test lmeval change

18f41b1

update branch

a425d43

use main

6fc29f4

remove gcs

956a12b

readd gc

5e09fb7

remove gc

655f00e

back to guidellm

ba703b0

simplified

b4deac8

simple vllm

6ed6862

skip vllm

b3f55bc

pause vllm

3a709da

update benchmark report

02cac57

update ip

a85bb4f

update branch

c3af0cf

added base task param

ede7482

retry branch name

87496ea

repo branch

b64ffd8

readd branch

7dc5e48

branch in base task

2d05c64

optional branch

60e6e9e

add branch choice

ee4d7c9

include benchmark

998a8bc

refactor default

6944cb4

moved generate text

6e4a5d5

test

41f3f21

add debug

850fd21

add os lib

5e87674

use default scenario

c9b63a8

benchmark with scenario

4d68ea8

chibu added 28 commits July 2, 2025 12:22

remove vllm

10874d3

cleanup

629d195

back to base

768d135

readd

09c3978

readd start vllm server

e64fb12

use guidellm branch

873c222

base complete

16b83bc

test rag

432031e

clean up

e9117ea

base package as variable

9984a8c

test default branch change

b8b51e9

update branch names

b99afec

use main branch in config

b2c2918

print the scenario

d1e686b

modify tokens

5d3e3ff

revert lmeval and setup.py, update vllm server log

3b0d86c

readd default scenarios

a2d6eb5

change default guidellm json

81f5199

add config examples json

1550333

use original default

420137d

add log

9d284c9

include user scenario

e863516

revert lmeval example

3703e62

add file error handling

d1b985a

removed package prints

e60aab1

default config

515a1db

readd output path

ac9ef63

onpremise settings

69638ea

Chibukach requested a review from anmarques July 3, 2025 14:52

anmarques requested changes Jul 18, 2025
