@soodoshll soodoshll commented Jan 27, 2026

Motivation: every request carries the same weight in the final score, so we prioritize shorter jobs (sjf) instead of serving requests first-come, first-served (fcfs).

====================================================================================================
Job         All2All Backend                QPS      99% Lat (ms)    99% Lat (s)  Result
====================================================================================================
fcfs        flashinfer_all2allv            12.07    13,267.77       13.27        INVALID
sjf         flashinfer_all2allv            12.07    11,809.73       11.81        VALID
====================================================================================================

Best 99% latency: sjf, flashinfer_all2allv (11.81s)
Worst 99% latency: fcfs, flashinfer_all2allv (13.27s)
Improvement: 11.0%
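
For readers unfamiliar with the two policies in the table, here is a minimal sketch of the ordering difference. It is not the code from this PR: the `Request` type, the `est_cost` field, and the `schedule` helper are hypothetical names, and it assumes a single worker with every request already queued and a known cost estimate per request.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    req_id: str
    est_cost: float  # hypothetical estimate of service time, in seconds


def schedule(requests, policy):
    """Return the requests in the order a single worker would serve them."""
    if policy == "fcfs":
        # First-come, first-served: keep arrival order.
        return list(requests)
    if policy == "sjf":
        # Shortest-job-first: pop by estimated cost. The arrival index
        # breaks ties, so equal-cost requests stay in arrival order.
        heap = [(r.est_cost, i, r) for i, r in enumerate(requests)]
        heapq.heapify(heap)
        return [heapq.heappop(heap)[2] for _ in range(len(heap))]
    raise ValueError(f"unknown policy: {policy}")


def mean_completion_time(ordered):
    """Average time from start until each request finishes."""
    clock = 0.0
    total = 0.0
    for r in ordered:
        clock += r.est_cost
        total += clock
    return total / len(ordered)


queue = [Request("a", 8.0), Request("b", 1.0), Request("c", 3.0)]
fcfs_order = schedule(queue, "fcfs")
sjf_order = schedule(queue, "sjf")
```

In this toy queue, fcfs serves a, b, c and sjf serves b, c, a, so the short requests no longer wait behind the long one. The sketch only illustrates the ordering; it does not model the benchmark above or its 99% latency measurement.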

@wangshangsam

Thanks a lot, @soodoshll! Which flag enables this?

@wangshangsam wangshangsam merged commit 6b26207 into CentML:mlperf-inf-mm-q3vl-v6.0 Jan 30, 2026
1 check passed
@wangshangsam wangshangsam added the enhancement New feature or request label Jan 30, 2026