[Model] Classification models support logit_bias / sigmoid_normalize #24031

noooop · 2025-09-01T08:09:53Z

TL;DR

Use --override_pooler_config to set logit_bias, which reproduces the sigmoid_normalize step of mxbai-rerank:

vllm serve mixedbread-ai/mxbai-rerank-base-v2 \
  --runner pooling \
  --hf_overrides '{"architectures":["Qwen2ForSequenceClassification"],"classifier_from_token":["0","1"],"method":"from_2_way_softmax"}' \
  --override_pooler_config '{"logit_bias": 4.5}'
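
Both JSON arguments are easy to mis-quote on a shell command line. As a minimal sketch, the same command can be assembled in Python so that json.dumps and shlex.join handle the quoting. The variable names are illustrative; only the flags and values shown above are used.

```python
import json
import shlex

MODEL_NAME = "mixedbread-ai/mxbai-rerank-base-v2"

# Values copied from the command above.
hf_overrides = {
    "architectures": ["Qwen2ForSequenceClassification"],
    "classifier_from_token": ["0", "1"],
    "method": "from_2_way_softmax",
}
pooler_config = {"logit_bias": 4.5}

cmd = [
    "vllm", "serve", MODEL_NAME,
    "--runner", "pooling",
    "--hf_overrides", json.dumps(hf_overrides),
    "--override_pooler_config", json.dumps(pooler_config),
]

# shlex.join quotes each argument so the line can be pasted into a POSIX shell.
print(shlex.join(cmd))
```

The list form of cmd can also be passed directly to subprocess.run, which avoids shell quoting altogether.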

logit_bias is half of the estimated_max value that mxbai-rerank uses for sigmoid normalization:

https://github.com/mixedbread-ai/mxbai-rerank/blob/21d9e79f181298b8dd436bef20d7ac3d80643c9a/mxbai_rerank/mxbai_rerank_v2.py#L20-L25

https://github.com/mixedbread-ai/mxbai-rerank/blob/21d9e79f181298b8dd436bef20d7ac3d80643c9a/mxbai_rerank/utils.py#L8-L21

mixedbread-ai/mxbai-rerank-base-v2: {"logit_bias": 4.5}
mixedbread-ai/mxbai-rerank-large-v2: {"logit_bias": 6.0}
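
To make the relationship concrete, here is a minimal sketch of the normalization. It assumes the bias is subtracted from the raw logit before the sigmoid is applied; that direction is an assumption for illustration, and the function name is hypothetical rather than part of vLLM or mxbai-rerank. The estimated_max values are derived from the logit_bias values above by doubling them.

```python
import math

# Derived from the values above: logit_bias = estimated_max / 2.
ESTIMATED_MAX = {
    "mixedbread-ai/mxbai-rerank-base-v2": 9.0,
    "mixedbread-ai/mxbai-rerank-large-v2": 12.0,
}


def sigmoid_with_bias(logit: float, logit_bias: float) -> float:
    """Shift the raw logit by logit_bias, then squash it into (0, 1).

    Assumption: the bias is subtracted, so a logit equal to logit_bias
    maps to a score of 0.5.
    """
    return 1.0 / (1.0 + math.exp(-(logit - logit_bias)))


for name, estimated_max in ESTIMATED_MAX.items():
    bias = estimated_max / 2
    # A logit at the midpoint scores 0.5; one at estimated_max scores close to 1.
    print(name, bias, sigmoid_with_bias(bias, bias), sigmoid_with_bias(estimated_max, bias))
```

Under this assumption, a raw logit of 0 falls well below 0.5 and a logit near estimated_max approaches 1, which spreads the scores across the (0, 1) range.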

Demo:

import requests

url = "http://127.0.0.1:8000/score"
MODEL_NAME = "mixedbread-ai/mxbai-rerank-base-v2"

# Please use the query_template and document_template to format the query and
# document for better reranker results.

prefix = "<|im_start|>system\nYou are Qwen, created by Alibaba Cloud. You are a helpful assistant.<|im_end|>\n<|im_start|>user\n"
suffix = "<|im_end|>\n<|im_start|>assistant\n"

query_template = "{prefix}query: {query}\n"
document_template = "document: {doc}\n{instruction}{suffix}"

instruction = "You are a search relevance expert who evaluates how well documents match search queries. For each query-document pair, carefully analyze the semantic relationship between them, then provide your binary relevance judgment (0 for not relevant, 1 for relevant).\nRelevance:"

queries = [
    "Who wrote To Kill a Mockingbird?"
]

documents = [
    "'To Kill a Mockingbird' is a novel by Harper Lee published in 1960. It was immediately successful, winning the Pulitzer Prize, and has become a classic of modern American literature.",
    "The novel 'Moby-Dick' was written by Herman Melville and first published in 1851. It is considered a masterpiece of American literature and deals with complex themes of obsession, revenge, and the conflict between good and evil.",
    "Harper Lee, an American novelist widely known for her novel 'To Kill a Mockingbird', was born in 1926 in Monroeville, Alabama. She received the Pulitzer Prize for Fiction in 1961.",
    "Jane Austen was an English novelist known primarily for her six major novels, which interpret, critique and comment upon the British landed gentry at the end of the 18th century.",
    "The 'Harry Potter' series, which consists of seven fantasy novels written by British author J.K. Rowling, is among the most popular and critically acclaimed books of the modern era.",
    "'The Great Gatsby', a novel written by American author F. Scott Fitzgerald, was published in 1925. The story is set in the Jazz Age and follows the life of millionaire Jay Gatsby and his pursuit of Daisy Buchanan."
]

queries = [
    query_template.format(prefix=prefix, query=query)
    for query in queries
]
documents = [
    document_template.format(doc=doc, suffix=suffix, instruction=instruction) for doc in documents
]


response = requests.post(url,
                         json={
                             "model": MODEL_NAME,
                             "text_1": queries,
                             "text_2": documents,
                             "truncate_prompt_tokens": -1,
                         }).json()
for i, r in enumerate(response["data"]):
    print(i, r["score"])

0 0.9945342540740967
1 0.0470464788377285
2 0.9746929407119751
3 0.12403740733861923
4 0.026046426966786385
5 0.02023334428668022

These scores are similar to the output of model.rank(query, documents, normalize=True) from the mxbai_rerank package:


import torch
from mxbai_rerank import MxbaiRerankV2

model = MxbaiRerankV2("mixedbread-ai/mxbai-rerank-base-v2", torch_dtype=torch.float32)

query = "Who wrote 'To Kill a Mockingbird'?"
documents = [
    "'To Kill a Mockingbird' is a novel by Harper Lee published in 1960. It was immediately successful, winning the Pulitzer Prize, and has become a classic of modern American literature.",
    "The novel 'Moby-Dick' was written by Herman Melville and first published in 1851. It is considered a masterpiece of American literature and deals with complex themes of obsession, revenge, and the conflict between good and evil.",
    "Harper Lee, an American novelist widely known for her novel 'To Kill a Mockingbird', was born in 1926 in Monroeville, Alabama. She received the Pulitzer Prize for Fiction in 1961.",
    "Jane Austen was an English novelist known primarily for her six major novels, which interpret, critique and comment upon the British landed gentry at the end of the 18th century.",
    "The 'Harry Potter' series, which consists of seven fantasy novels written by British author J.K. Rowling, is among the most popular and critically acclaimed books of the modern era.",
    "'The Great Gatsby', a novel written by American author F. Scott Fitzgerald, was published in 1925. The story is set in the Jazz Age and follows the life of millionaire Jay Gatsby and his pursuit of Daisy Buchanan."
]

# Let's get the scores
results = model.rank(query, documents, normalize=True)

# Restore the original document order so the scores line up with the /score output
results.sort(key=lambda x: x.index)

for i, r in enumerate(results):
    print(i, r.score)


"""
0 0.9941529631614685
1 0.0521107017993927
2 0.9704784750938416
3 0.2976154386997223
4 0.06989647448062897
5 0.028927486389875412
"""
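
As a rough check of how close the two outputs are, the sketch below compares the two score lists printed above. The helper name is illustrative. Note that the two demos use slightly different query strings, and the mxbai_rerank run pins torch.float32 while the serve command does not specify a dtype, so exact agreement is not expected.

```python
# Scores copied from the two outputs above, in document order.
vllm_scores = [
    0.9945342540740967, 0.0470464788377285, 0.9746929407119751,
    0.12403740733861923, 0.026046426966786385, 0.02023334428668022,
]
mxbai_scores = [
    0.9941529631614685, 0.0521107017993927, 0.9704784750938416,
    0.2976154386997223, 0.06989647448062897, 0.028927486389875412,
]


def ranking(scores):
    """Return document indices ordered from highest to lowest score."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


abs_diff = [abs(a - b) for a, b in zip(vllm_scores, mxbai_scores)]
worst = max(range(len(abs_diff)), key=lambda i: abs_diff[i])

print("vLLM ranking: ", ranking(vllm_scores))
print("mxbai ranking:", ranking(mxbai_scores))
print("largest gap at document", worst, "with difference", round(abs_diff[worst], 4))
```

Both runs place documents 0 and 2 (the two that mention Harper Lee) at the top. The largest gap is on document 3, and the lower-ranked documents swap places between the two runs.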

Purpose

Add logit_bias / sigmoid_normalize support for classification models.

Fixes #22983
Addresses #19675 (comment)

Test Plan

pytest -s -vvv tests/models/multimodal/pooling/test_jinavl_reranker.py

Test Result

Passed.

Signed-off-by: wang.yuqi <[email protected]>

gemini-code-assist

Code Review

This pull request introduces support for logit_bias in classification models, which is useful for models like mxbai-rerank. The implementation correctly adds the logit_bias parameter to the PoolerConfig and applies it within the ClassifierPooler. Additionally, this PR includes a significant and beneficial refactoring of JinaVLForSequenceClassification, correcting its integration with the pooling mechanism by properly using the ClassifierPooler and removing hardcoded logic. This makes the implementation cleaner and more robust. I've found one critical issue in the implementation that needs to be addressed.

vllm/model_executor/layers/pooler.py

noooop · 2025-09-02T10:50:29Z

cc @DarkLight1337

vllm/config/__init__.py

devang-sifthub · 2025-09-02T17:51:32Z

Thanks for the update, really appreciate it!
When will this be released?

noooop · 2025-09-03T00:22:01Z

> Thanks for the update, really appreciate it! When will this be released?

vLLM provides wheels for Linux on x86 with CUDA 12 for every commit, so you can install the latest code at any time:

https://docs.vllm.ai/en/latest/getting_started/installation/gpu.html#install-the-latest-code_1

vLLM releases a new version approximately every four weeks.

a96534c

Signed-off-by: wang.yuqi <[email protected]>

noooop requested review from simon-mo, WoosukKwon, youkaichao, robertgshaw2-redhat, mgoin, tlrmchlsmth, houseroad, hmellor, yewentao256 and ProExpertProg as code owners September 1, 2025 08:09

noooop force-pushed the sigmoid_normalize branch from d5cb15a to a96534c Compare September 1, 2025 08:09

gemini-code-assist bot reviewed Sep 1, 2025

vllm/model_executor/layers/pooler.py (outdated, resolved)

noooop and others added 3 commits September 1, 2025 16:13

Update vllm/model_executor/layers/pooler.py

75103de

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: wang.yuqi <[email protected]>

fix

196eac3

Signed-off-by: wang.yuqi <[email protected]>

Merge branch 'main' into sigmoid_normalize

6d95c6a

DarkLight1337 reviewed Sep 2, 2025

vllm/config/__init__.py (resolved)

DarkLight1337 approved these changes Sep 2, 2025

DarkLight1337 enabled auto-merge (squash) September 2, 2025 11:55

DarkLight1337 added the ready label Sep 2, 2025

DarkLight1337 merged commit e0653f6 into vllm-project:main Sep 2, 2025
52 checks passed

noooop deleted the sigmoid_normalize branch September 3, 2025 00:25
