Pull requests: vllm-project/vllm
Revert "[Fix]Load kv-cache dtype from hf_quant_config.json automatically"
ci-failure
Issue about an unexpected test failure in CI
ready
ONLY add when PR is ready to merge/full CI is needed
#30653
opened Dec 14, 2025 by
robertgshaw2-redhat
Strengthen input validation and tests for 'parse_raw_prompts'.
#30652
opened Dec 14, 2025 by
mivehk
3 of 5 tasks
[Bugfix] CustomAR + TritonAttn[AMPERE] + FULL_CG - gpt-oss
gpt-oss
Related to GPT-OSS models
nvidia
#30650
opened Dec 14, 2025 by
bbrowning
additional protection for CVE-2025-62164
frontend
multi-modality
Related to multi-modality (#4194)
#30649
opened Dec 14, 2025 by
wenqiglantz
[Bugfix] Drop empty tool_calls lists to keep assistant replies in chat template
frontend
#30648
opened Dec 14, 2025 by
seokhyunan
3 of 5 tasks
[Perf] Eliminate padding and slicing op for GPT-OSS with Flashinfer MXFP4 MXFP8 MoE
ci/build
gpt-oss
Related to GPT-OSS models
#30647
opened Dec 14, 2025 by
elvischenv
Draft
5 tasks
fix: unsatisfiable testing dependencies caused by a version conflict
ci/build
#30646
opened Dec 14, 2025 by
leejianwoo-collab
fix: fix engine initialization fails with ValueError
#30645
opened Dec 14, 2025 by
leejianwoo-collab
5 tasks done
Auto-rebase PRs older than 40 commits compared to main
ci/build
#30643
opened Dec 14, 2025 by
khluu
Update docs README.md to add NVFP4 quantization support
documentation
Improvements or additions to documentation
#30634
opened Dec 14, 2025 by
omrialmog
[MoE][Refactor 1/N] Separate Online Quantization
#30627
opened Dec 13, 2025 by
robertgshaw2-redhat
5 tasks
[docker] Restructure Dockerfile for more efficient and cache-friendly builds
ci/build
documentation
Improvements or additions to documentation
ready
ONLY add when PR is ready to merge/full CI is needed
#30626
opened Dec 13, 2025 by
amrmahdi
fix: prevent reasoning output when enable_thinking is false
frontend
#30625
opened Dec 13, 2025 by
llsj14
5 tasks
[CI/Build] Ignore max transformers version skipping for initialization tests
ready
ONLY add when PR is ready to merge/full CI is needed
#30619
opened Dec 13, 2025 by
Isotr0py
1 of 5 tasks
[BugFix][Hybrid] Fix prefill chunk incorrectly including draft tokens
v1
#30618
opened Dec 13, 2025 by
peakcrosser7
3 of 5 tasks
[Docs] Add FlashInfer environment variables to env_vars documentation
documentation
Improvements or additions to documentation
#30616
opened Dec 13, 2025 by
majiayu000
2 tasks done
[Feature] Default EPLB num_redundant_experts to minimum valid value
#30614
opened Dec 13, 2025 by
majiayu000
2 tasks done
[Bugfix] Add validation for tool requests when tool_parser is unavailable
documentation
Improvements or additions to documentation
frontend
ready
ONLY add when PR is ready to merge/full CI is needed
#30613
opened Dec 13, 2025 by
majiayu000
2 tasks done
[ROCm][Perf] Replace cat to bmm's inplace write when aiter enabled
rocm
Related to AMD ROCm
v1
#30611
opened Dec 13, 2025 by
ganyi1996ppo
5 tasks
[FixBug]fix gpt-oss v1/completions response bug
frontend
gpt-oss
Related to GPT-OSS models
tool-calling
#30608
opened Dec 13, 2025 by
princepride
3 of 5 tasks
[Bugfix] Improve DCP error hint in cp_utils
v1
#30607
opened Dec 13, 2025 by
jliu9515
3 of 5 tasks