Conversation

Contributor

@qwes5s5 qwes5s5 commented Nov 7, 2025

Motivation

  • LLM.generate() does not yet support logprobs or prompt_logprobs. The engine side already provides the underlying functionality, so we need to add parameter input along the call chain and result-output handling logic to support the end-to-end path.
  • When the Engine is created, max_logprobs is currently not supported and is fixed at 20. We need to support a max_logprobs parameter that can be -1, meaning the vocabulary size, and also support logprobs and prompt_logprobs being -1 in LLM.generate().

Modifications

  • Modify SamplingParams to support prompt_logprobs
  • Modify ModelConfig to support max_logprobs (the -1 semantics are illustrated in the sketch after this list)
  • Add command-line argument support for max_logprobs in engine, work_process, and async_llm
  • Add support for prompt_logprobs output in RequestOutput
  • In TokenProcessor, add handling of logprobs and prompt_logprobs and result construction after receiving ZMQ data
  • In LLM, add support for logprobs and prompt_logprobs output in generate()
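
The sketch below illustrates the intended -1 semantics described above (for max_logprobs as well as the per-request logprobs/prompt_logprobs values): -1 resolves to the vocabulary size. The helper name and signature are illustrative assumptions, not the code added by this PR.

from typing import Optional

def resolve_num_logprobs(requested: Optional[int], vocab_size: int, max_logprobs: int) -> Optional[int]:
    """Illustrative helper (not part of this PR): map the -1 sentinel to the full vocab size."""
    if requested is None:
        return None
    if requested < -1:
        raise ValueError(f"logprobs/prompt_logprobs can't be less than -1, got {requested}.")
    resolved = vocab_size if requested == -1 else requested
    # max_logprobs == -1 likewise means "up to vocab_size" entries per token.
    limit = vocab_size if max_logprobs == -1 else max_logprobs
    if resolved > limit:
        raise ValueError(f"Number of logprobs requested ({resolved}) exceeds maximum allowed value ({limit}).")
    return resolved

# Example: with max_logprobs=-1, requesting -1 returns the full vocabulary size.
assert resolve_num_logprobs(-1, vocab_size=103424, max_logprobs=-1) == 103424
assert resolve_num_logprobs(5, vocab_size=103424, max_logprobs=20) == 5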

Usage or Command

import json
import os

from fastdeploy.engine.sampling_params import SamplingParams
from fastdeploy.entrypoints.llm import LLM

os.environ["CUDA_VISIBLE_DEVICES"] = "2"


prompts = ["User: 床前明月光下语句是什么?仅回答我诗句,不要回答别的。\nAssistant: 好的。"]
os.environ["FD_USE_GET_SAVE_OUTPUT_V1"] = "1"

# Sampling parameters
sampling_params = SamplingParams(top_p=0.95, max_tokens=6400, logprobs=-1, prompt_logprobs=-1)
# sampling_params = SamplingParams(top_p=0.95, max_tokens=6400, logprobs=-1)

# Load the model
llm = LLM(
    model="/workspace/model/ERNIE-4.5-0.3B-Paddle",
    tensor_parallel_size=1,
    max_model_len=8192,
    max_logprobs=-1,
    enable_logprob=True,
    enable_prefix_caching=False,
)

# Run batched inference (the LLM internally queues requests and dynamically inserts them based on available resources)
while True:
    outputs = llm.generate(prompts=prompts, sampling_params=sampling_params)
    print(json.dumps(outputs[0].to_dict()))
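    # Illustrative addition (not part of the original example): inspect which
    # logprobs-related fields the serialized RequestOutput exposes. The
    # "prompt_logprobs" key name follows this PR's description and is an
    # assumption, not a confirmed schema.
    result = outputs[0].to_dict()
    print(sorted(result.keys()))            # confirm where logprobs / prompt_logprobs land
    print(result.get("prompt_logprobs"))    # per-prompt-token logprob data, if populated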

Checklist

  • Add at least one tag in the PR title.
    • Tag list: [[FDConfig],[APIServer],[Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]]
    • You can add new tags based on the PR content, but the semantics must be clear.
  • Format your code and run pre-commit before committing.
  • Add unit tests. If no unit tests are added, please explain why in this PR.
  • Provide accuracy results.
  • If the current PR targets a release branch, make sure it has first been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.


paddle-bot bot commented Nov 7, 2025

Thanks for your contribution!

@paddle-bot paddle-bot bot added the contributor External developers label Nov 7, 2025
gongshaotian previously approved these changes Nov 7, 2025
Collaborator

@gongshaotian gongshaotian left a comment

LGTM

ckl117 previously approved these changes Nov 7, 2025
@Jiang-Jia-Jun Jiang-Jia-Jun requested a review from Copilot November 7, 2025 11:43
Contributor

Copilot AI left a comment

Pull Request Overview

This pull request adds support for prompt logprobs functionality to the FastDeploy inference system, allowing users to retrieve log probabilities for prompt tokens in addition to generated tokens.

  • Adds prompt_logprobs parameter support to SamplingParams with validation
  • Implements _build_prompt_logprobs method to process prompt logprobs tensors (see the illustrative sketch after this list)
  • Enhances error handling in token processing with try-catch blocks for logprobs parsing
  • Adds comprehensive test coverage for logprobs and prompt_logprobs validation
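
As a rough illustration of the kind of post-processing a _build_prompt_logprobs step performs (recover the (num_prompt_tokens, k) shape, then pythonize the per-position top-k values into dicts), here is a hedged sketch. The function name, argument layout, and the {token_id: logprob} output format are assumptions for illustration and do not reproduce the PR's actual implementation.

import numpy as np

def build_prompt_logprobs_sketch(topk_token_ids: np.ndarray, topk_logprobs: np.ndarray):
    """Illustrative only: turn per-position top-k arrays of shape
    (num_prompt_tokens, k) into a list of {token_id: logprob} dicts."""
    num_prompt_tokens, num_logprobs = topk_logprobs.shape  # recover shapes
    ids = topk_token_ids.tolist()                          # "pythonize" the tensors
    lps = topk_logprobs.tolist()
    return [dict(zip(ids[pos], lps[pos])) for pos in range(num_prompt_tokens)]

# Example: 3 prompt positions, top-2 candidates each.
ids = np.array([[11, 7], [3, 9], [5, 2]])
lps = np.array([[-0.1, -2.3], [-0.5, -1.7], [-0.2, -3.0]])
print(build_prompt_logprobs_sketch(ids, lps))  # [{11: -0.1, 7: -2.3}, ...]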

Reviewed Changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 12 comments.

Summary per file:
  • fastdeploy/worker/output.py: Adds PromptLogprobs type alias for prompt logprobs data structure
  • fastdeploy/engine/sampling_params.py: Updates validation logic for logprobs parameters to support the -1 value and adds prompt_logprobs validation
  • fastdeploy/engine/request.py: Adds prompt_logprobs and prompt_logprobs_tensors fields to the RequestOutput class
  • fastdeploy/entrypoints/llm.py: Implements prompt logprobs processing with the _build_prompt_logprobs and _make_logprob_dict methods, adds validation for max_logprobs
  • fastdeploy/output/token_processor.py: Adds error handling for logprobs and prompt_logprobs parsing in batch output processing
  • fastdeploy/config.py: Adds validation for the max_logprobs configuration parameter
  • tests/output/test_process_batch_output_use_zmq.py: Adds test coverage for logprobs and prompt_logprobs processing in the token processor
  • tests/engine/test_sampling_params.py: Adds a comprehensive test suite for SamplingParams validation, including logprobs and prompt_logprobs
  • tests/entrypoints/test_vllm_run_engine.py: Duplicate test file for SamplingParams validation (should be consolidated)

self.line_break_id = args.get("line_break_id", -1)
if self.max_logprobs == -1 and hasattr(self, "vocab_size"):
    self.max_logprobs = self.vocab_size
if self.max_logprobs < -1 and self.max_logprobs > self.ori_vocab_size:

Copilot AI Nov 7, 2025

Logic error in validation condition. The expression self.max_logprobs < -1 and self.max_logprobs > self.ori_vocab_size can never be true because a value cannot simultaneously be less than -1 AND greater than ori_vocab_size. This should use or instead of and: if self.max_logprobs < -1 or self.max_logprobs > self.ori_vocab_size:

Suggested change
if self.max_logprobs < -1 and self.max_logprobs > self.ori_vocab_size:
if self.max_logprobs < -1 or self.max_logprobs > self.ori_vocab_size:

Comment on lines +230 to +231
if self.max_logprobs < -1 and self.max_logprobs > self.ori_vocab_size:
    raise ValueError(" The possible values for max_logprobs are -1 and [0, vocab_size] ")

Copilot AI Nov 7, 2025

Inconsistent error message with validation logic. The error message says "The possible values for max_logprobs are -1 and [0, vocab_size]", but the validation condition checks for values less than -1 and greater than ori_vocab_size. The message should match the actual validation: "max_logprobs must be -1 or in the range [0, vocab_size]".

Suggested change
if self.max_logprobs < -1 and self.max_logprobs > self.ori_vocab_size:
    raise ValueError(" The possible values for max_logprobs are -1 and [0, vocab_size] ")
if self.max_logprobs != -1 and not (0 <= self.max_logprobs <= self.ori_vocab_size):
    raise ValueError("max_logprobs must be -1 or in the range [0, vocab_size]")

params._verify_args()

error_msg = str(cm.exception)
self.assertIn("prompt_logprobs must can't be less than -1", error_msg)

Copilot AI Nov 7, 2025

Grammatical error in error message: "prompt_logprobs must can't be less than -1" should be "prompt_logprobs must be greater than or equal to -1" or "prompt_logprobs can't be less than -1".

Suggested change
self.assertIn("prompt_logprobs must can't be less than -1", error_msg)
self.assertIn("prompt_logprobs must be greater than or equal to -1", error_msg)

params = SamplingParams(prompt_logprobs=-2)
params._verify_args()

self.assertIn("prompt_logprobs must can't be less than -1", str(cm.exception))

Copilot AI Nov 7, 2025

Grammatical error in error message: "prompt_logprobs must can't be less than -1" should be "prompt_logprobs must be greater than or equal to -1" or "prompt_logprobs can't be less than -1".

Suggested change
self.assertIn("prompt_logprobs must can't be less than -1", str(cm.exception))
self.assertIn("prompt_logprobs must be greater than or equal to -1", str(cm.exception))

# Recover shapes.
num_prompt_tokens, num_logprobs = logprobs.shape

# Pythonize the torch tensors.

Copilot AI Nov 7, 2025

Potential inconsistency: the comment in _build_prompt_logprobs mentions "torch tensors", but the codebase uses PaddlePaddle (paddle.Tensor). The comment on line 436 should be corrected to refer to "paddle tensors" instead of "torch tensors" for consistency with the framework being used.

Suggested change
# Pythonize the torch tensors.
# Pythonize the paddle tensors.

if self.logprobs is not None and self.logprobs > 20 and os.getenv("FD_USE_GET_SAVE_OUTPUT_V1", "0") == "0":
    raise ValueError("Invalid value for 'top_logprobs': must be less than or equal to 20.")
if self.prompt_logprobs is not None and self.prompt_logprobs < -1:
    raise ValueError(f"prompt_logprobs must can't be less than -1, got {self.prompt_logprobs}.")

Copilot AI Nov 7, 2025

Grammatical error in error message: "prompt_logprobs must can't be less than -1" should be "prompt_logprobs must be greater than or equal to -1" or "prompt_logprobs can't be less than -1".

Suggested change
raise ValueError(f"prompt_logprobs must can't be less than -1, got {self.prompt_logprobs}.")
raise ValueError(f"prompt_logprobs must be greater than or equal to -1, got {self.prompt_logprobs}.")

params = SamplingParams(prompt_logprobs=-2)
params._verify_args()

self.assertIn("prompt_logprobs must can't be less than -1", str(cm.exception))

Copilot AI Nov 7, 2025

Grammatical error in error message: "prompt_logprobs must can't be less than -1" should be "prompt_logprobs must be greater than or equal to -1" or "prompt_logprobs can't be less than -1".

Suggested change
self.assertIn("prompt_logprobs must can't be less than -1", str(cm.exception))
self.assertIn("prompt_logprobs must be greater than or equal to -1", str(cm.exception))

params._verify_args()

error_msg = str(cm.exception)
self.assertIn("prompt_logprobs must can't be less than -1", error_msg)

Copilot AI Nov 7, 2025

Grammatical error in error message: "prompt_logprobs must can't be less than -1" should be "prompt_logprobs must be greater than or equal to -1" or "prompt_logprobs can't be less than -1".

Suggested change
self.assertIn("prompt_logprobs must can't be less than -1", error_msg)
self.assertIn("prompt_logprobs must be greater than or equal to -1", error_msg)

Comment on lines +297 to +300
try:
    result.prompt_logprobs_tensors = stream_data.prompt_logprobs
except Exception as e:
    llm_logger.warning(f"Failed to parse prompt_logprobs from StreamTransferData: {e}")

Copilot AI Nov 7, 2025

Overly broad exception handling. The try-except block on lines 297-300 catches any Exception when trying to assign stream_data.prompt_logprobs to result.prompt_logprobs_tensors, but this is a simple assignment that shouldn't fail. If an exception occurs here, it likely indicates a deeper issue that should be investigated rather than silently logged. Consider either removing the try-except or being more specific about what exceptions are expected.

Suggested change
try:
    result.prompt_logprobs_tensors = stream_data.prompt_logprobs
except Exception as e:
    llm_logger.warning(f"Failed to parse prompt_logprobs from StreamTransferData: {e}")
result.prompt_logprobs_tensors = stream_data.prompt_logprobs

num_prompt_logprobs = self.llm_engine.cfg.model_config.ori_vocab_size
if num_prompt_logprobs > max_logprobs:
    raise ValueError(
        f"Number of logprobs requested ({num_prompt_logprobs}) exceeds maximum allowed value ({max_logprobs})."

Copilot AI Nov 7, 2025

Inconsistent error messages for logprobs validation. Line 323 says "Number of logprobs requested ({num_logprobs})" while line 331 says "Number of logprobs requested ({num_prompt_logprobs})". The second message should clarify it's about prompt_logprobs: "Number of prompt_logprobs requested ({num_prompt_logprobs})".

Suggested change
f"Number of logprobs requested ({num_prompt_logprobs}) exceeds maximum allowed value ({max_logprobs})."
f"Number of prompt_logprobs requested ({num_prompt_logprobs}) exceeds maximum allowed value ({max_logprobs})."

@qwes5s5 qwes5s5 dismissed stale reviews from ckl117 and gongshaotian via 2c2b9c2 November 7, 2025 13:14
@qwes5s5 qwes5s5 force-pushed the new_add_prompt_logprobs branch from 131aae9 to 2c2b9c2 on November 7, 2025 13:14
Contributor Author

qwes5s5 commented Nov 10, 2025

/re-run run_tests_with_coverage

@qwes5s5 qwes5s5 force-pushed the new_add_prompt_logprobs branch 2 times, most recently from 2a93b61 to fbd6840 on November 10, 2025 06:21