[VLM] Add a CLI plugin system for mlperf-inf-mm-q3vl benchmark
#2420
base: master
Conversation
MLCommons CLA bot: All contributors have signed the MLCommons CLA ✍️ ✅

@soodoshll @johncalesp Could you help to review this PR?
soodoshll
left a comment
LGTM. Thanks!
from .schema import FooEndpoint

def register_foo_benchmark() -> Callable[[Settings, Dataset, FooEndpoint, int, int, Verbosity], None]:
This return type annotation seems a little verbose. Is it a must-have?
It's not a must-have. I just wanted to highlight that the return value should be the CLI command function.
I'll reduce it to just a `Callable`.
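A simplified version might look like the following sketch. The inner parameter names and the command body are illustrative; only the parameter types come from the original annotation:

```python
from typing import Callable

def register_foo_benchmark() -> Callable:
    """Return the CLI command function for the Foo benchmark.

    The original annotation spelled out
    Callable[[Settings, Dataset, FooEndpoint, int, int, Verbosity], None];
    here it is reduced to a bare Callable, as discussed.
    """
    def foo_benchmark(settings, dataset, endpoint, arg_a, arg_b, verbosity) -> None:
        ...  # deploy the Foo backend and run the benchmark (elided)

    return foo_benchmark
```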
from mlperf_inf_mm_q3vl.schema import Settings, Dataset, Endpoint, Verbosity
from mlperf_inf_mm_q3vl.log import setup_loguru_for_benchmark

from .schema import FooEndpoint
It seems the user would need to follow a structure similar to the Endpoint from mlperf_inf_mm_q3vl.schema. Should we include the schema.py file in the package structure?
mlperf-inf-mm-q3vl-foo/
├── pyproject.toml
└── src/
└── mlperf_inf_mm_q3vl_foo/
├── __init__.py
├── schema.py
└── plugin.py
For the Endpoint pydantic BaseModel, yes, the user would likely need to follow a similar class structure. In terms of the package structure, however, not necessarily: the user can put everything in __init__.py if they want (even though that's generally bad software engineering practice).
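To illustrate that "similar class structure", a backend-specific endpoint can simply subclass the base one. The sketch below uses plain dataclasses to stay dependency-free; the real project uses a pydantic BaseModel, and every field name here is hypothetical:

```python
from dataclasses import dataclass

@dataclass
class Endpoint:
    # Stand-in for mlperf_inf_mm_q3vl.schema.Endpoint; field is hypothetical.
    url: str = "http://localhost:8000/v1"

@dataclass
class FooEndpoint(Endpoint):
    # Backend-specific extension, mirroring `from .schema import FooEndpoint`.
    foo_api_key: str = ""  # hypothetical field
```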
    This command deploys a model using the Foo backend
    and runs the MLPerf benchmark against it.
    """
    from .deploy import FooDeployer
Same question as for schema.py:
mlperf-inf-mm-q3vl-foo/
├── pyproject.toml
└── src/
└── mlperf_inf_mm_q3vl_foo/
├── __init__.py
├── deploy.py
└── plugin.py
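Correspondingly, the deploy.py in that layout would hold a deployer class. A minimal sketch, where everything beyond the FooDeployer name is hypothetical:

```python
class FooDeployer:
    """Illustrative deployer for the Foo backend.

    A real implementation would spawn the backend server (e.g. via
    subprocess) and poll its health endpoint until it is ready; the
    constructor arguments and method names here are placeholders.
    """

    def __init__(self, model: str, port: int = 8000):
        self.model = model
        self.port = port

    def deploy(self) -> str:
        # Hypothetical: start the server, wait for health, return its URL.
        return f"http://localhost:{self.port}/v1"
```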
For launching the VLM benchmark, we currently have:

- `mlperf-inf-mm-q3vl benchmark endpoint`: benchmark against a generic endpoint that follows the OpenAI API spec. This allows the submitter to benchmark a generic inference system, but it requires more manual (or bash-scripting) effort to set up.
- `mlperf-inf-mm-q3vl benchmark vllm`: deploy and launch vLLM, wait for it to be healthy, then run the same benchmarking routine. For a submitter who only wants to benchmark vLLM, this is a very convenient command that does everything for them.

But what if the submitter wants to benchmark an inference system different from the out-of-the-box vLLM, yet still wants the same convenience that `mlperf-inf-mm-q3vl benchmark vllm` provides? This PR introduces a plugin system that allows the submitter to implement their own subcommand of `mlperf-inf-mm-q3vl benchmark` from a third-party Python package (i.e., without directly modifying the `mlperf-inf-mm-q3vl` source code).