Commit df11d7a

[ds fp4] set block-size to 16
1 parent ad2e8b9 commit df11d7a

1 file changed: +5 -5 lines changed

evaluation/deepseek_fp4/launch_deepseekr1_fp4_TP.sh

Lines changed: 5 additions & 5 deletions
@@ -30,16 +30,16 @@ echo "running $model_path"
 
 vllm serve $model_path \
     --host localhost \
-    --port 6789 \
+    --port 9000 \
     --tensor-parallel-size 8 \
     --max-num-batched-tokens 32768 \
     --trust-remote-code \
     --no-enable-prefix-caching \
     --disable-log-requests \
-    --compilation-config '{"cudagraph_mode": "FULL_AND_PIECEWISE"}' \
+    --enforce-eager \
     --gpu_memory_utilization 0.7 \
-    --block-size 1 \
-    --seed 123 2>&1 | tee log.server.log
+    --block-size 16 \
+    --seed 123 2>&1 | tee log.server.log &
 
-# --enforce-eager \
+# --compilation-config '{"cudagraph_mode": "FULL_AND_PIECEWISE"}' \
 # --enable-expert-parallel \
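
Note on the trailing `&`: the launch pipeline now runs in the background, so anything placed after it in the script starts before the server has finished loading. A minimal sketch of a readiness wait follows. The helper name `wait_for_server` and its timeout defaults are hypothetical and not part of this commit. The `/health` path is assumed from vLLM's OpenAI-compatible server, and the port matches the `--port 9000` set above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the commit): poll a URL until it answers
# with a successful HTTP status, or give up after a timeout.
#   $1  URL to probe
#   $2  total timeout in seconds (default 600, an assumed value)
#   $3  seconds to sleep between probes (default 5, an assumed value)
wait_for_server() {
    local url="$1" timeout_s="${2:-600}" interval_s="${3:-5}"
    local waited=0
    # curl --fail returns non-zero on connection errors and HTTP >= 400.
    until curl --silent --fail --output /dev/null "$url"; do
        if [ "$waited" -ge "$timeout_s" ]; then
            echo "server at $url not ready after ${timeout_s}s" >&2
            return 1
        fi
        sleep "$interval_s"
        waited=$((waited + interval_s))
    done
}

# Example use after the backgrounded `vllm serve ... &` line:
# wait_for_server "http://localhost:9000/health" && echo "server ready"
```

Polling rather than a fixed `sleep` is a design choice: load time varies with model size and tensor-parallel degree, so a fixed delay is either too short or wasteful.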
