Commit 35e9f39

address review comments
1 parent d0777a8 commit 35e9f39

2 files changed: +4 -2 lines changed

.github/workflows/sglang-benchmark.yml

Lines changed: 1 addition & 1 deletion
@@ -186,7 +186,7 @@ jobs:
           # Verify installations
           echo "$(pwd)/sgl_server_env/bin" >> $GITHUB_PATH
 
-      - name: Install NVCC
+      - name: Install NVCC #TODO: Use docker image (nvidia/cuda:12.8.1-devel-ubuntu22.04) instead of locally specifying the variables
         if: env.DEVICE_NAME == 'cuda'
         shell: bash
         run: |
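
The TODO on the `Install NVCC` step suggests running the job inside the CUDA devel image rather than installing NVCC and setting its variables by hand. A minimal sketch of what that might look like, assuming the runner has Docker with the NVIDIA container toolkit available. The job name `benchmark` and the `Check NVCC` step are hypothetical and not part of this commit:

```yaml
# Hypothetical sketch only; not part of commit 35e9f39.
jobs:
  benchmark:                       # hypothetical job name
    runs-on: linux.aws.a100        # runner label taken from the other workflow in this commit
    container:
      image: nvidia/cuda:12.8.1-devel-ubuntu22.04   # image named in the TODO
      options: --gpus all          # assumes the runner's Docker can expose GPUs
    steps:
      - name: Check NVCC           # hypothetical step replacing the manual install
        shell: bash
        run: nvcc --version        # the devel image ships the CUDA toolkit
```

With a job-level `container`, every step runs inside the image, so the `DEVICE_NAME == 'cuda'` condition would need to move to the job or matrix level rather than guarding a single step.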

.github/workflows/vllm-profiling.yml

Lines changed: 3 additions & 1 deletion
@@ -1,3 +1,4 @@
+# TODO: Refactor the workflows to extract the common parts into a GHA reusable module
 name: vLLM Profiling
 
 on:
@@ -14,6 +15,7 @@ on:
         description: vLLM commit (optional, default to the latest commit in the branch that has not yet been benchmarked)
         required: false
         type: string
+      # TODO: add support for profiling on a specific model and runner
   pull_request:
     paths:
       - .github/workflows/vllm-profiling.yml
@@ -39,7 +41,7 @@ jobs:
       fail-fast: false
       matrix:
         include:
-          - runs-on: linux.aws.h100.4
+          - runs-on: linux.aws.a100
             device-name: cuda
     runs-on: ${{ matrix.runs-on }}
     environment: pytorch-x-vllm
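
The two TODOs in this file (extracting common parts into a reusable module, and profiling on a specific model and runner) could plausibly be addressed together with a `workflow_call` workflow that takes the runner and model as inputs. The sketch below is an assumption about one possible shape, not the authors' plan. The file name `profiling-common.yml`, the input names `runner` and `model`, and the job and step names are all hypothetical:

```yaml
# Hypothetical file: .github/workflows/profiling-common.yml (not part of this commit)
on:
  workflow_call:
    inputs:
      runner:                      # hypothetical input name
        description: Runner label to profile on
        required: false
        type: string
        default: linux.aws.a100
      model:                       # hypothetical input name
        description: Model to profile (empty means all configured models)
        required: false
        type: string
        default: ''

jobs:
  profile:
    runs-on: ${{ inputs.runner }}
    environment: pytorch-x-vllm
    steps:
      - name: Show selection
        shell: bash
        env:                       # pass inputs through env to avoid shell injection
          MODEL: ${{ inputs.model }}
          RUNNER: ${{ inputs.runner }}
        run: echo "Profiling '${MODEL}' on ${RUNNER}"
```

A caller such as `vllm-profiling.yml` would then reference it at the job level:

```yaml
jobs:
  vllm:
    uses: ./.github/workflows/profiling-common.yml   # hypothetical path
    with:
      runner: linux.aws.a100
      model: ''
```

Passing the inputs through `env` rather than interpolating them directly into the script keeps user-supplied values out of the shell command text.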
