feat: add GLM-4.7+ thinking mode support for LM Studio and OpenAI-compatible endpoints (#11071) #11090
base: main
Conversation
Reviewed the GLM-4.7+ thinking mode implementation. Found 1 issue that affects both providers:
Mention @roomote in a comment to request specific changes to this pull request or fix all unresolved issues.
In lm-studio.ts:

```typescript
// For GLM-4.7+ models, add thinking mode support similar to Z.ai
if (glmOptions?.supportsThinking) {
	const useReasoning = shouldUseReasoningEffort({ model: modelInfo, settings: this.options })
	params.thinking = useReasoning ? { type: "enabled" } : { type: "disabled" }
}
```
`shouldUseReasoningEffort()` will always return false for LM Studio models because `modelInfo` comes from either the model cache or `openAiModelInfoSaneDefaults`, neither of which has `supportsReasoningEffort` or `reasoningEffort` properties. The function's fallback logic checks `!!modelDefaultEffort`, and `modelDefaultEffort` is undefined for these models, so the check evaluates to false. This means thinking mode will never be enabled - the code will always send `thinking: { type: "disabled" }`. Consider either: (1) not relying on `shouldUseReasoningEffort()` for detected GLM-4.7+ models and instead defaulting to enabled unless the user explicitly disables it via `enableReasoningEffort: false`, or (2) having `getGlmModelOptions()` return synthetic reasoning capability info that can be used instead of, or in addition to, the model info.
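A minimal sketch of option (1), assuming the names quoted in the diff above (`glmOptions.supportsThinking`, `params.thinking`) and the `enableReasoningEffort` setting mentioned in this comment; this is an illustration, not the PR's implementation:

```typescript
// Option (1) sketch: treat detected GLM-4.7+ models as thinking-capable by default,
// and only disable when the user has explicitly turned reasoning off.
// `enableReasoningEffort` is assumed to be the existing user setting name.
if (glmOptions?.supportsThinking) {
	const explicitlyDisabled = this.options.enableReasoningEffort === false
	params.thinking = explicitlyDisabled ? { type: "disabled" } : { type: "enabled" }
}
```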
In base-openai-compatible-provider.ts:

```typescript
// For GLM-4.7+ models, add thinking mode support similar to Z.ai
if (glmOptions?.supportsThinking) {
	const useReasoning = shouldUseReasoningEffort({ model: info, settings: this.options })
	params.thinking = useReasoning ? { type: "enabled" } : { type: "disabled" }
}
```
Same issue as in lm-studio.ts: `shouldUseReasoningEffort()` will always return false for models that use default model info (e.g., dynamic providers) because `openAiModelInfoSaneDefaults` lacks the `supportsReasoningEffort` and `reasoningEffort` properties. The `thinking` parameter will always be set to disabled for GLM-4.7+ models using generic OpenAI-compatible providers.
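A rough sketch of option (2) from the first comment, assuming `getGlmModelOptions()` is the detection helper this PR builds on; the exact field names are assumptions:

```typescript
// Option (2) sketch: have the GLM detection helper report synthetic reasoning
// capability so shouldUseReasoningEffort() can see it even when the base model
// info comes from openAiModelInfoSaneDefaults. Field names are assumptions.
const effectiveInfo = glmOptions?.supportsThinking ? { ...info, supportsReasoningEffort: true } : info
const useReasoning = shouldUseReasoningEffort({ model: effectiveInfo, settings: this.options })
params.thinking = useReasoning ? { type: "enabled" } : { type: "disabled" }
```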
Related GitHub Issue
Closes: #11071
Roo Code Task Context (Optional)
No Roo Code task context for this PR
Description
This PR extends the GLM model detection implementation to apply GLM-4.7 specific parameters that were added in release 3.44.1, making GLM models on LM Studio and OpenAI-compatible endpoints behave more like they do on the Z.ai provider.
Key implementation details:
- New `isGlm47Plus()` function to detect GLM-4.7 and higher versions using regex pattern matching (see the sketch after this list)
- Adds the `thinking: { type: "enabled" | "disabled" }` parameter, similar to how the Z.ai provider handles it
- Uses the `shouldUseReasoningEffort()` utility to determine whether to enable thinking mode based on user settings
- Keeps the existing GLM parameters (`mergeToolResultText`, `disableParallelToolCalls`) for all GLM models
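A minimal sketch of the detection helper, using the regex quoted in the design choices below; the exported name and exact signature in `glm-model-detection.ts` are assumptions:

```typescript
// Sketch of the GLM-4.7+ detection described above. The regex is the one quoted
// in the design choices; the helper's exact signature is an assumption.
const GLM_47_PLUS_PATTERN = /glm[-_]4\.([7-9]|\d{2,})|glm[-_][5-9]\./i

export function isGlm47Plus(modelId: string): boolean {
	return GLM_47_PLUS_PATTERN.test(modelId)
}

// isGlm47Plus("mlx-community/GLM-4.7-4bit") // true
// isGlm47Plus("glm-4.6")                    // false (older version)
// isGlm47Plus("chatglm-6b")                 // false (avoids the false positive)
```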
Design choices:
- Version detection uses a strict regex (`/glm[-_]4\.([7-9]|\d{2,})|glm[-_][5-9]\./i`) to avoid false positives on ambiguous formats like `glm47` or `chatglm-6b`
- The `thinking` parameter is only added for GLM-4.7+, not for older versions like GLM-4.5 or GLM-4.6
What reviewers should focus on:
- The use of `shouldUseReasoningEffort()` - does it correctly respect user settings?
- Whether the `thinking` parameter should be applied to all GLM models or only 4.7+
Test Procedure
Unit Tests:
Created a comprehensive test suite with 101 test cases covering model detection and option generation.
Ran tests:
`cd src && npx vitest run api/providers/utils/__tests__/glm-model-detection.spec.ts`
Ran Z.ai provider tests to ensure no regression:
`cd src && npx vitest run api/providers/__tests__/zai.spec.ts`
Ran full linting and type checking via pre-push hooks
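As an illustration of the kind of detection case such a suite covers (not copied from the actual spec file; the import path assumes the module layout described above):

```typescript
import { describe, it, expect } from "vitest"
// Illustrative only: assumes isGlm47Plus is exported from the detection module.
import { isGlm47Plus } from "../glm-model-detection"

describe("isGlm47Plus", () => {
	it("detects GLM-4.7 and newer", () => {
		expect(isGlm47Plus("glm-4.7")).toBe(true)
		expect(isGlm47Plus("mlx-community/GLM-4.7-4bit")).toBe(true)
	})

	it("rejects older or ambiguous model ids", () => {
		expect(isGlm47Plus("glm-4.6")).toBe(false)
		expect(isGlm47Plus("chatglm-6b")).toBe(false)
	})
})
```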
How to verify:
- Load a GLM-4.7+ model (e.g. `mlx-community/GLM-4.7-4bit`) in LM Studio or via an OpenAI-compatible endpoint
- Confirm that requests for that model include the `thinking` parameter
Pre-Submission Checklist
Screenshots / Videos
No UI changes in this PR
Documentation Updates
Additional Notes
Implementation builds on PR #11082:
This PR assumes that PR #11082 (basic GLM detection) will be merged first, or can be incorporated into this PR if needed. The implementation is designed to be backward compatible and can work independently if the basic GLM detection is not yet merged.
Questions for reviewers:
- Should the `thinking` parameter also be applied to GLM-4.6 models, or only 4.7+?
Relationship to issue context:
The user asked whether the GLM model detection would also apply model-specific parameters like those added for GLM-4.7 in release 3.44.1. This PR ensures that yes, GLM-4.7 specific parameters (thinking mode) are now applied to GLM models regardless of whether they're accessed via Z.ai, LM Studio, or OpenAI-compatible endpoints.
Get in Touch
@roomote
Important
Adds GLM-4.7+ thinking mode support for LM Studio and OpenAI-compatible endpoints with new model detection and parameter handling.
- Adds `isGlm47Plus()` in `glm-model-detection.ts` to detect GLM-4.7+ versions using regex.
- Adds the `thinking` parameter in `base-openai-compatible-provider.ts` and `lm-studio.ts`.
- Uses `shouldUseReasoningEffort()` to enable thinking mode based on user settings.
- Adds tests in `glm-model-detection.spec.ts` for model detection and option generation.
This description was created by
for 470c845.