
Conversation

@wangshangsam (Contributor) commented Dec 24, 2025

For launching the VLM benchmark, we currently have:

  • mlperf-inf-mm-q3vl benchmark endpoint: Benchmarks against a generic endpoint that follows the OpenAI API spec. This lets the submitter benchmark an arbitrary inference system, but it requires more manual (or bash-scripting) effort to set up.
  • mlperf-inf-mm-q3vl benchmark vllm: Deploys and launches vLLM, waits for it to become healthy, then runs the same benchmarking routine. For a submitter who only wants to benchmark vLLM, this is a very convenient command that does everything for them.

But what if the submitter wants to benchmark an inference system other than out-of-the-box vLLM, while still getting the same convenience that mlperf-inf-mm-q3vl benchmark vllm provides? This PR introduces a plugin system that lets the submitter implement their own subcommand of mlperf-inf-mm-q3vl benchmark from a third-party Python package (i.e., without directly modifying the mlperf-inf-mm-q3vl source code).
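
To make the idea concrete, here is a minimal sketch of what such a plugin module might look like. All names (the mlperf_inf_mm_q3vl_foo package, register_foo_benchmark, the foo subcommand) are hypothetical, taken from the example discussed in the review below; the body only outlines the intended behavior.

# plugin.py in a hypothetical third-party package such as mlperf_inf_mm_q3vl_foo.
from typing import Callable


def register_foo_benchmark() -> Callable:
    """Return the CLI command function to be mounted as
    `mlperf-inf-mm-q3vl benchmark foo`."""

    def foo() -> None:
        # 1. Deploy the Foo inference backend.
        # 2. Wait for its endpoint to become healthy.
        # 3. Run the shared benchmarking routine against it, just like
        #    `mlperf-inf-mm-q3vl benchmark vllm` does for vLLM.
        ...

    return foo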

wangshangsam requested a review from a team as a code owner on December 24, 2025 at 23:56
github-actions bot commented Dec 24, 2025

MLCommons CLA bot: All contributors have signed the MLCommons CLA ✍️ ✅

@wangshangsam (Contributor, Author) commented

@soodoshll @johncalesp Could you help to review this PR?

@soodoshll left a comment

LGTM. Thanks!


from .schema import FooEndpoint

def register_foo_benchmark() -> Callable[[Settings, Dataset, FooEndpoint, int, int, Verbosity], None]:

@soodoshll: This return type annotation seems a little verbose. Is it a must-have?

@wangshangsam (Contributor, Author) replied:

It's not a must-have. I just wanted to highlight that the return value should be the CLI command function.
I'll reduce it to just a Callable.
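
For reference, the reduced form would be roughly the following (a sketch; only the annotation changes, and the body is elided here):

from typing import Callable


def register_foo_benchmark() -> Callable:
    ...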

from mlperf_inf_mm_q3vl.schema import Settings, Dataset, Endpoint, Verbosity
from mlperf_inf_mm_q3vl.log import setup_loguru_for_benchmark

from .schema import FooEndpoint

Commenting on the diff excerpt above, a contributor wrote:

Seems like the user would need to follow a similar structure to the Endpoint from mlperf_inf_mm_q3vl.schema. Should we put the schema.py file into the package structure? For example:

mlperf-inf-mm-q3vl-foo/
├── pyproject.toml
└── src/
    └── mlperf_inf_mm_q3vl_foo/
        ├── __init__.py
        ├── schema.py
        └── plugin.py

@wangshangsam (Contributor, Author) replied:

For the Endpoint pydantic BaseModel, yes, the user would likely need to follow a similar class structure. However, in terms of the package structure, not necessarily; the user can put everything in __init__.py if they want (even though that's generally bad software engineering practice).
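
As an illustration of that class structure, a third-party endpoint model could look roughly like this (the field names are hypothetical; only the pydantic BaseModel shape is implied by the discussion):

# Hypothetical FooEndpoint mirroring the shape of mlperf_inf_mm_q3vl.schema.Endpoint.
from pydantic import BaseModel


class FooEndpoint(BaseModel):
    # Illustrative fields only; the real Endpoint schema defines the required fields.
    base_url: str = "http://localhost:8000/v1"
    api_key: str | None = None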

This command deploys a model using the Foo backend
and runs the MLPerf benchmark against it.
"""
from .deploy import FooDeployer

Commenting on the diff excerpt above, a contributor wrote:

Same as with schema.py: should we put the deploy.py file into the package structure? For example:

mlperf-inf-mm-q3vl-foo/
├── pyproject.toml
└── src/
    └── mlperf_inf_mm_q3vl_foo/
        ├── __init__.py
        ├── deploy.py
        └── plugin.py
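
For completeness, a rough sketch of what could live in such a deploy.py, given only the from .deploy import FooDeployer line shown in the diff excerpt above (the class body and the foo-server command are entirely hypothetical):

# Hypothetical deploy.py; only the FooDeployer name appears in the PR snippet above.
import subprocess
import time
import urllib.request


class FooDeployer:
    """Launch the Foo inference server and wait for it to become healthy."""

    def __init__(self, health_url: str = "http://localhost:8000/health") -> None:
        self.health_url = health_url
        self.proc: subprocess.Popen | None = None

    def deploy(self) -> None:
        # Start the backend process (the command is illustrative).
        self.proc = subprocess.Popen(["foo-server", "--port", "8000"])

    def wait_until_healthy(self, timeout_s: float = 300.0) -> None:
        # Poll the health endpoint until it responds or the timeout expires.
        deadline = time.monotonic() + timeout_s
        while time.monotonic() < deadline:
            try:
                with urllib.request.urlopen(self.health_url, timeout=5):
                    return
            except OSError:
                time.sleep(2)
        raise TimeoutError("Foo backend did not become healthy in time")

    def teardown(self) -> None:
        # Stop the backend process once the benchmark run is finished.
        if self.proc is not None:
            self.proc.terminate()
            self.proc.wait()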
