Conversation

@WorldExplored (Contributor) commented Oct 25, 2025

issue: #27413

Modified the API server to accept multi-modal inputs:

  1. Added chat-style multimodal support to /classify: made input optional and wired the chat template configs into ServingClassification, for parity with /pooling and /chat (see the sketch below).
  2. Chat-style and multimodal inputs (e.g., video_url) now work end-to-end without 400 errors.
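
A minimal sketch of the resulting request handling, with hypothetical helper and field names (the actual ServingClassification changes in this PR differ in detail):

from typing import Any


def build_classification_prompt(request: dict[str, Any]) -> dict[str, Any]:
    """Normalize a /classify request body: chat-style vs. plain-text input."""
    messages = request.get("messages")
    if messages:
        # Chat path: pass the messages through so the chat template and
        # multimodal parts (e.g. video_url) are resolved downstream, as for /chat.
        return {"type": "chat", "messages": messages}
    texts = request.get("input")
    if texts is None:
        raise ValueError("either 'messages' or 'input' must be provided")
    if isinstance(texts, str):
        texts = [texts]
    # Completion-style path: plain text inputs, exactly as before this change.
    return {"type": "text", "texts": texts}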

Smoke test:


"""Smoke test for the /classify endpoint (plain-text or chat-style video input)."""

import argparse
import json
import sys
import urllib.error
import urllib.request


def build_payload(
    model: str,
    prompt: str,
    *,
    video_url: str | None,
) -> dict:
    if video_url:
        messages = [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {
                        "type": "video_url",
                        "video_url": {"url": video_url},
                    },
                ],
            }
        ]
        return {"model": model, "messages": messages}

    return {"model": model, "input": [prompt]}


def run_smoke_test(args: argparse.Namespace) -> int:
    url = f"http://{args.host}:{args.port}/classify"
    payload = build_payload(
        model=args.model,
        prompt=args.prompt,
        video_url=args.video_url,
    )

    request = urllib.request.Request(
        url=url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "User-Agent": "classification-smoke-test/0.1",
        },
        method="POST",
    )

    try:
        with urllib.request.urlopen(request, timeout=args.timeout) as response:
            print(f"{response.status} {response.reason}")
            body = response.read().decode("utf-8")
            print(body)
            return 0
    except urllib.error.HTTPError as exc:
        print(f"HTTP {exc.code}: {exc.reason}", file=sys.stderr)
        print(exc.read().decode("utf-8"), file=sys.stderr)
    except urllib.error.URLError as exc:
        print(f"Request failed: {exc}", file=sys.stderr)

    return 1


def parse_args() -> argparse.Namespace:
    parser = argparse.ArgumentParser(description=__doc__)
    parser.add_argument("--host", default="localhost")
    parser.add_argument("--port", type=int, default=8000)
    parser.add_argument("--model", required=True)
    parser.add_argument(
        "--prompt",
        # Default prompt (Chinese): "Determine whether this video has quality
        # issues; return 0 if it does, 1 if it does not."
        default="请判断该视频是否存在质量问题,存在返回0,不存在返回1。",
    )
    parser.add_argument("--video-url", dest="video_url")
    parser.add_argument("--timeout", type=float, default=30.0)
    return parser.parse_args()


def main() -> None:
    args = parse_args()
    raise SystemExit(run_smoke_test(args))


if __name__ == "__main__":
    main()

Output:

python vllm/examples/online_serving/classification_smoke_test.py \
  --host 127.0.0.1 \
  --port 8080 \
  --model test-model \
  --prompt "classify this input"

200 OK
{"error":{"message":"The model does not support Classification API","type":"BadRequestError","param":null,"code":400}}

cc @noooop @DarkLight1337

WorldExplored and others added 3 commits October 24, 2025 23:43
Signed-off-by: WorldExplored <[email protected]>
adds support to the ServingClassification class and its initialization.
Updates preprocessing logic to handle chat messages in classification requests.
Co-Authored-By: vnadathur <[email protected]>
Signed-off-by: WorldExplored <[email protected]>
Signed-off-by: WorldExplored <[email protected]>
Co-Authored-By: vnadathur <[email protected]>
@gemini-code-assist (bot, Contributor) left a comment

Code Review

This pull request modifies the API server to support multi-modal inputs for the /classify endpoint by enabling chat-style message input. The changes are logical and align with the goal of achieving feature parity with other endpoints such as /chat. My review identified one high-severity issue in the handling of empty messages lists, which could lead to unexpected behavior for users of this new feature; a code suggestion is provided to address it. Otherwise, the changes are well implemented.
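
For context, a hedged sketch of the kind of guard such an issue calls for (hypothetical names; this is not the suggestion actually made in the review):

def validate_classify_request(body: dict) -> None:
    # Reject an explicitly empty "messages" list instead of silently falling
    # back to text-only handling, and require at least one of the two fields.
    messages = body.get("messages")
    texts = body.get("input")
    if messages is not None and len(messages) == 0:
        raise ValueError("'messages' must contain at least one message")
    if messages is None and texts is None:
        raise ValueError("either 'messages' or 'input' must be provided")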

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Signed-off-by: Srreyansh Sethi <[email protected]>
Signed-off-by: WorldExplored <[email protected]>
@mergify mergify bot added the ci/build label Oct 26, 2025
@DarkLight1337 (Member) left a comment

I also suggest splitting this out into a separate protocol class, just like EmbeddingChatRequest vs. EmbeddingCompletionRequest.
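
A rough sketch of what such a split could look like, using illustrative pydantic models; the real protocol classes in vllm/entrypoints/openai/protocol.py carry many more fields and validators:

from typing import Union

from pydantic import BaseModel


class ClassificationCompletionRequest(BaseModel):
    # Plain-text input, as /classify accepted before this PR.
    model: str
    input: Union[str, list[str]]


class ClassificationChatRequest(BaseModel):
    # Chat-style messages, which may carry multimodal parts such as video_url.
    model: str
    messages: list[dict]


# The endpoint would then accept either request shape.
ClassificationRequest = Union[ClassificationCompletionRequest, ClassificationChatRequest]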

Signed-off-by: WorldExplored <[email protected]>
@github-actions github-actions bot added the ready ONLY add when PR is ready to merge/full CI is needed label Nov 10, 2025
auto-merge was automatically disabled November 11, 2025 05:41

Head branch was pushed to by a user without write access

@WorldExplored (Contributor, Author) commented:

Thanks for the prompt reviews! The issue has been addressed, and auto-merge is disabled. @noooop

@noooop (Collaborator) commented Nov 11, 2025

@muziyongshixin

Are there any issues with running the vllm local test now?

Look forward to your feedback on this new feature.

@muziyongshixin commented:

> @muziyongshixin
>
> Are there any issues with running the vllm local test now?
>
> Look forward to your feedback on this new feature.

Sorry for the late reply. I'm still having problems preparing the environment. I can follow the commands below to install the vllm package, but flash_attn is incompatible. When I uninstalled flash_attn and used xformers as the backend, a new error occurred, and when I tried to reinstall flash_attn, the process still failed.

conda create -n vllm_main python=3.12 anaconda
conda activate vllm_main

git clone https://github.com/vllm-project/vllm.git vllm_main

cd vllm_main/

pip install uv
pip install numpy==2.2.6

# You may need to manually remove xformers and flashinfer-python from requirements/cuda.txt
VLLM_USE_PRECOMPILED=1 uv pip install -v --editable .

Also, it seems you don't have a VLM model trained for classification, so I can upload one to Hugging Face. Can you help me test it?
I have been really busy these days and may not have enough time to fix the environment issues, so it would be very helpful if you could test it.
The model checkpoint is at: https://huggingface.co/muziyongshixin/Qwen2.5-VL-7B-for-VideoCls

@noooop noooop enabled auto-merge (squash) November 13, 2025 05:19
@noooop (Collaborator) commented Nov 13, 2025

@WorldExplored

Please fix the failed CI.

Signed-off-by: WorldExplored <[email protected]>
Co-Authored-By: vnadathur <[email protected]>
auto-merge was automatically disabled November 13, 2025 23:04

Head branch was pushed to by a user without write access

@WorldExplored (Contributor, Author) commented:

I don't believe the CI failures are caused by my changes; the failing files aren't ones I touched. @noooop

Signed-off-by: WorldExplored <[email protected]>
Co-Authored-By: vnadathur <[email protected]>
@noooop (Collaborator) commented Nov 14, 2025

> I don't believe the CI failures are caused by my changes; the failing files aren't ones I touched. @noooop

https://buildkite.com/vllm/ci/builds/38782/steps/canvas?sid=019a7ba8-0df5-421e-82e3-a3d47208f960

ValueError: At most 0 video(s) may be provided in one prompt.

google/gemma-3-4b-it does not support video input. Please select a model that supports video input for testing.

@muziyongshixin commented:

> > I don't believe the CI failures are caused by my changes; the failing files aren't ones I touched. @noooop
>
> https://buildkite.com/vllm/ci/builds/38782/steps/canvas?sid=019a7ba8-0df5-421e-82e3-a3d47208f960
>
> ValueError: At most 0 video(s) may be provided in one prompt.
>
> google/gemma-3-4b-it does not support video input. Please select a model that supports video input for testing.

This model can be used for testing:
https://huggingface.co/muziyongshixin/Qwen2.5-VL-7B-for-VideoCls
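
A hedged sketch of wiring that checkpoint into a test-server fixture (RemoteOpenAIServer is the existing helper in vLLM's tests/utils.py; the CLI flags, fixture name, and scope here are assumptions, not the PR's actual test setup):

import pytest

from tests.utils import RemoteOpenAIServer

MODEL_NAME = "muziyongshixin/Qwen2.5-VL-7B-for-VideoCls"


@pytest.fixture(scope="module")
def server_vlm_classify():
    # Launch an OpenAI-compatible vLLM server around the video classifier.
    # Flags are illustrative and may need adjusting for the local GPU setup.
    args = ["--task", "classify", "--max-model-len", "8192"]
    with RemoteOpenAIServer(MODEL_NAME, args) as server:
        yield server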

@noooop (Collaborator) commented Nov 14, 2025

> > > I don't believe the CI failures are caused by my changes; the failing files aren't ones I touched. @noooop
> >
> > https://buildkite.com/vllm/ci/builds/38782/steps/canvas?sid=019a7ba8-0df5-421e-82e3-a3d47208f960
> > ValueError: At most 0 video(s) may be provided in one prompt.
> > google/gemma-3-4b-it does not support video input. Please select a model that supports video input for testing.
>
> This model can be used for testing: https://huggingface.co/muziyongshixin/Qwen2.5-VL-7B-for-VideoCls

Working on it...

Signed-off-by: wang.yuqi <[email protected]>
@noooop noooop changed the title [bugfix] modify api server for multi-modal inputs [Frontend] Added chat-style multimodal support to /classify. Nov 14, 2025
@noooop noooop enabled auto-merge (squash) November 14, 2025 06:58
@noooop (Collaborator) commented Nov 14, 2025

> > I don't believe the CI failures are caused by my changes; the failing files aren't ones I touched. @noooop
> >
> > https://buildkite.com/vllm/ci/builds/38782/steps/canvas?sid=019a7ba8-0df5-421e-82e3-a3d47208f960
> > ValueError: At most 0 video(s) may be provided in one prompt.
> > google/gemma-3-4b-it does not support video input. Please select a model that supports video input for testing.
> >
> > This model can be used for testing: https://huggingface.co/muziyongshixin/Qwen2.5-VL-7B-for-VideoCls
>
> Working on it...

I am not 100% sure that this PR satisfies your use case, so it's best for you to test it locally:

tests/entrypoints/pooling/openai/test_vision_classification.py

I find that installing vllm from main is smoother:

conda create -n vllm_main python=3.12 anaconda
conda activate vllm_main

git clone https://github.com/vllm-project/vllm.git vllm_main

cd vllm_main/

pip install uv
pip install numpy==2.2.6

VLLM_USE_PRECOMPILED=1 uv pip install -v --editable .

@muziyongshixin commented Nov 14, 2025

> > > I don't believe the CI failures are caused by my changes; the failing files aren't ones I touched. @noooop
> > >
> > > https://buildkite.com/vllm/ci/builds/38782/steps/canvas?sid=019a7ba8-0df5-421e-82e3-a3d47208f960
> > > ValueError: At most 0 video(s) may be provided in one prompt.
> > > google/gemma-3-4b-it does not support video input. Please select a model that supports video input for testing.
> > >
> > > This model can be used for testing: https://huggingface.co/muziyongshixin/Qwen2.5-VL-7B-for-VideoCls
> >
> > Working on it...
>
> I am not 100% sure that this PR satisfies your use case, so it's best for you to test it locally:
>
> tests/entrypoints/pooling/openai/test_vision_classification.py
>
> I find that installing vllm from main is smoother:
>
> conda create -n vllm_main python=3.12 anaconda
> conda activate vllm_main
>
> git clone https://github.com/vllm-project/vllm.git vllm_main
>
> cd vllm_main/
>
> pip install uv
> pip install numpy==2.2.6
>
> VLLM_USE_PRECOMPILED=1 uv pip install -v --editable .


Thanks for your patient efforts.
If the code below works correctly, then this PR will satisfy my requirements.

# Assumes fixtures and helpers from vLLM's test suite: server_vlm_classify
# (a RemoteOpenAIServer), TEST_VIDEO_URL, requests, and ClassificationResponse.
def test_classify_accepts_chat_video_url(
    server_vlm_classify: RemoteOpenAIServer, model_name: str
) -> None:
    messages = [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Please classify this video."},
                {"type": "video_url", "video_url": {"url": TEST_VIDEO_URL}},
            ],
        }
    ]

    response = requests.post(
        server_vlm_classify.url_for("classify"),
        json={"model": model_name, "messages": messages},
    )
    response.raise_for_status()

    output = ClassificationResponse.model_validate(response.json())

    assert output.object == "list"
    assert output.model == model_name
    assert len(output.data) == 1
    assert len(output.data[0].probs) == 2
    assert output.usage.prompt_tokens == 4807

@noooop (Collaborator) commented Nov 14, 2025

> If the code below works correctly, then this PR will satisfy my requirements.

The test passes locally.

I think this PR can make it into the upcoming vllm 0.11.1 release.

vllm still needs a lot of improvement in multimodal usage. Feel free to raise issues and let us know more about your user scenarios.

@DarkLight1337 (Member) commented:

We have already cut the release branch, so it won't be in the upcoming release.

Signed-off-by: wang.yuqi <[email protected]>
@noooop noooop merged commit 360bd87 into vllm-project:main Nov 14, 2025
45 checks passed
geodavic pushed a commit to geodavic/vllm that referenced this pull request Nov 16, 2025
…roject#27516)

Signed-off-by: WorldExplored <[email protected]>
Signed-off-by: Srreyansh Sethi <[email protected]>
Signed-off-by: vnadathur <[email protected]>
Signed-off-by: wang.yuqi <[email protected]>
Co-authored-by: vnadathur <[email protected]>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: vnadathur <[email protected]>
Co-authored-by: wang.yuqi <[email protected]>
Co-authored-by: wang.yuqi <[email protected]>
Signed-off-by: George D. Torres <[email protected]>
bwasti pushed a commit to bwasti/vllm that referenced this pull request Nov 17, 2025
…roject#27516)

Signed-off-by: WorldExplored <[email protected]>
Signed-off-by: Srreyansh Sethi <[email protected]>
Signed-off-by: vnadathur <[email protected]>
Signed-off-by: wang.yuqi <[email protected]>
Co-authored-by: vnadathur <[email protected]>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: vnadathur <[email protected]>
Co-authored-by: wang.yuqi <[email protected]>
Co-authored-by: wang.yuqi <[email protected]>
Signed-off-by: Bram Wasti <[email protected]>
Labels: ci/build, frontend, ready