
Conversation

@alhridoy (Contributor) commented Oct 25, 2025

Fixes #25450

Summary

- add a standalone multimodal encoder benchmark (`vllm/benchmarks/encoder.py`) that loads dummy-weight models, builds HF-processor inputs, and measures `get_multimodal_embeddings` latency across configurable batch/image sizes
- wire the runner into the CLI as `vllm bench encoder` and keep a legacy shim at `benchmarks/benchmark_encoder.py`
- support JSON export, optional preprocessing timing, and sane defaults for batch sizes and image resolutions

Signed-off-by: Al-Ekram Elahee Hridoy <[email protected]>
@mergify bot added the `frontend` and `performance` (Performance-related issues) labels Oct 25, 2025
@gemini-code-assist bot left a comment


Code Review

This pull request introduces a valuable standalone benchmark for multimodal encoders, integrating it into the vLLM CLI. The implementation is well-structured and uses appropriate techniques for performance measurement, such as torch.cuda.Event for accurate GPU timing. My review focuses on improving robustness by handling a couple of edge cases in user input that could lead to unexpected crashes. The suggested changes will make the benchmark more resilient to varied inputs.
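
For reference, here is a minimal sketch of the `torch.cuda.Event` timing pattern the review refers to (`encoder_fn` is a hypothetical stand-in, not the PR's actual code):

```python
import torch

def time_gpu_ms(encoder_fn, *args) -> float:
    """Time one GPU call in milliseconds using CUDA events."""
    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)
    start.record()
    encoder_fn(*args)
    end.record()
    # Block until the recorded work has completed before reading the timer.
    torch.cuda.synchronize()
    return start.elapsed_time(end)
```

Unlike wrapping an async CUDA launch in `time.perf_counter()`, event timestamps are recorded on the GPU stream itself, so the measurement reflects actual device execution time.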

Comment on lines 32 to 37
if "x" in size_str:
width_str, height_str = size_str.lower().split("x", maxsplit=1)
width = int(width_str)
height = int(height_str)
else:
width = height = int(size_str)

Severity: high

The current implementation for parsing image sizes does not handle malformed inputs like "224x" or "x224". In such cases, split would produce an empty string, and int('') would raise a ValueError, causing the benchmark to crash. It's better to handle this with a try-except block to provide a more informative error message to the user.

Suggested change
if "x" in size_str:
width_str, height_str = size_str.lower().split("x", maxsplit=1)
width = int(width_str)
height = int(height_str)
else:
width = height = int(size_str)
try:
if "x" in size_str:
width_str, height_str = size_str.lower().split("x", maxsplit=1)
width = int(width_str)
height = int(height_str)
else:
width = height = int(size_str)
except ValueError as exc:
raise ValueError(f"Invalid image size format: {size_str!r}") from exc

Comment on lines 265 to 273
def _summarize(times_ms: Sequence[float]) -> dict[str, float]:
    arr = np.array(times_ms, dtype=np.float64)
    return {
        "avg": float(arr.mean()),
        "p50": float(np.percentile(arr, 50.0)),
        "p90": float(np.percentile(arr, 90.0)),
        "p99": float(np.percentile(arr, 99.0)),
        "stdev": float(arr.std(ddof=0)) if arr.size > 1 else 0.0,
    }

Severity: high

If a user provides --num-iters 0, the _time_encoder function will return an empty list. This causes the _summarize function to crash with an IndexError when np.percentile is called on an empty array. The function should be made more robust to handle an empty input sequence, for instance, by returning zero for all statistics.

def _summarize(times_ms: Sequence[float]) -> dict[str, float]:
    if not times_ms:
        return {
            "avg": 0.0,
            "p50": 0.0,
            "p90": 0.0,
            "p99": 0.0,
            "stdev": 0.0,
        }

    arr = np.array(times_ms, dtype=np.float64)
    return {
        "avg": float(arr.mean()),
        "p50": float(np.percentile(arr, 50.0)),
        "p90": float(np.percentile(arr, 90.0)),
        "p99": float(np.percentile(arr, 99.0)),
        "stdev": float(arr.std(ddof=0)) if arr.size > 1 else 0.0,
    }
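
A quick sanity check of the guarded version (illustrative; assumes `numpy` is imported as `np` and `Sequence` comes from `collections.abc`, as in the snippet above):

```python
print(_summarize([]))
# -> {'avg': 0.0, 'p50': 0.0, 'p90': 0.0, 'p99': 0.0, 'stdev': 0.0}
print(_summarize([1.0, 2.0, 3.0]))
# -> avg 2.0, p50 2.0, p90 2.8, p99 2.98, stdev ~0.816
```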

@alhridoy (Author) commented:

Hi @ywang96, could you take a look at this PR?

@chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.


Comment on lines +10 to +22
class BenchmarkEncoderSubcommand(BenchmarkSubcommandBase):
    """The `encoder` subcommand for vllm bench."""

    name = "encoder"
    help = "Benchmark standalone multimodal encoder forward latency."

    @classmethod
    def add_cli_args(cls, parser: argparse.ArgumentParser) -> None:
        add_cli_args(parser)

    @staticmethod
    def cmd(args: argparse.Namespace) -> None:
        main(args)


P1: Register encoder subcommand with vllm CLI

The new BenchmarkEncoderSubcommand is defined here, but nothing imports this module when the CLI starts. vllm.entrypoints.cli.benchmark.main discovers bench commands via BenchmarkSubcommandBase.__subclasses__() after loading vllm.entrypoints.cli, whose __init__ currently only imports latency/throughput/serve. Because encoder.py is never imported, the class is absent from __subclasses__() and vllm bench will not expose the encoder command at all, defeating the purpose of the new benchmark. Import the module in vllm.entrypoints.cli.__init__ or otherwise ensure it is loaded before command discovery.
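
A minimal, self-contained sketch of the discovery mechanism being described (simplified stand-ins, not vLLM's real classes):

```python
class BenchmarkSubcommandBase:
    """Simplified stand-in for vLLM's benchmark subcommand base."""
    name: str

def discover() -> list[str]:
    # Mirrors discovery via __subclasses__(): only classes whose defining
    # module has actually been imported are visible here.
    return [cls.name for cls in BenchmarkSubcommandBase.__subclasses__()]

print(discover())  # [] -- the encoder module was never imported

class BenchmarkEncoderSubcommand(BenchmarkSubcommandBase):
    name = "encoder"

print(discover())  # ['encoder'] -- visible once the class is defined
```

This is why a side-effect import of the encoder module in `vllm.entrypoints.cli.__init__` (alongside latency/throughput/serve) is enough to make the command appear.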


Signed-off-by: Al-Ekram Elahee Hridoy <[email protected]>
@@ -0,0 +1,26 @@
# SPDX-License-Identifier: Apache-2.0
A Member commented:

No need to add this deprecation since it's a new script

    tokenization_kwargs=inputs.tokenization_kwargs,
)
preprocess_ms = None
if include_preproc:
A Member commented:

I think we should just always include this, since we can get this metric essentially for free.
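
A minimal sketch of what always-on preprocessing timing could look like (illustrative only; `processor` and `prompt` are hypothetical stand-ins for the benchmark's real objects):

```python
import time
from typing import Any, Callable

def time_preprocess(processor: Callable[[Any], Any], prompt: Any) -> tuple[Any, float]:
    """Run preprocessing and always report its wall-clock cost in ms."""
    start = time.perf_counter()
    inputs = processor(prompt)
    return inputs, (time.perf_counter() - start) * 1000.0

# Trivial stand-in processor just to show the call shape:
inputs, ms = time_preprocess(lambda p: p.upper(), "a photo of a cat")
print(inputs, f"{ms:.3f} ms")
```

Since preprocessing runs on the CPU, `time.perf_counter()` is sufficient and the extra bookkeeping is negligible per iteration.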

@DarkLight1337 requested a review from @ywang96 on October 26, 2025 at 07:26

Labels

frontend, performance (Performance-related issues)


Development

Successfully merging this pull request may close these issues.

[Feature]: Standalone Encoder Benchmark
