feat(ollama): add diagnostic logging for debugging request hangs #11057
Draft: roomote wants to merge 1 commit into `main` from `fix/ollama-timeout-and-logging`
Changes from all commits
```diff
@@ -14,6 +14,10 @@ interface OllamaChatOptions {
 	num_ctx?: number
 }

+// Default timeout for Ollama requests (5 minutes to accommodate slow model loading)
+// Ollama models can take 30-60+ seconds to load into memory on first use
+const DEFAULT_OLLAMA_TIMEOUT_MS = 300_000 // 5 minutes
+
 function convertToOllamaMessages(anthropicMessages: Anthropic.Messages.MessageParam[]): Message[] {
 	const ollamaMessages: Message[] = []

```
```diff
@@ -158,9 +162,11 @@ export class NativeOllamaHandler extends BaseProvider implements SingleCompletio
 	private ensureClient(): Ollama {
 		if (!this.client) {
 			try {
+				const baseUrl = this.options.ollamaBaseUrl || "http://localhost:11434"
+				console.log(`[Ollama] Creating client for host: ${baseUrl}`)
+
 				const clientOptions: OllamaOptions = {
-					host: this.options.ollamaBaseUrl || "http://localhost:11434",
-					// Note: The ollama npm package handles timeouts internally
+					host: baseUrl,
 				}

 				// Add API key if provided (for Ollama cloud or authenticated instances)
```
```diff
@@ -172,6 +178,7 @@ export class NativeOllamaHandler extends BaseProvider implements SingleCompletio
 			this.client = new Ollama(clientOptions)
 		} catch (error: any) {
+			console.error(`[Ollama] Error creating client: ${error.message}`)
 			throw new Error(`Error creating Ollama client: ${error.message}`)
 		}
 	}

```
```diff
@@ -205,8 +212,29 @@ export class NativeOllamaHandler extends BaseProvider implements SingleCompletio
 		messages: Anthropic.Messages.MessageParam[],
 		metadata?: ApiHandlerCreateMessageMetadata,
 	): ApiStream {
+		const baseUrl = this.options.ollamaBaseUrl || "http://localhost:11434"
+		const requestStartTime = Date.now()
+
+		console.log(`[Ollama] createMessage: Starting request at ${new Date().toISOString()}`)
+
 		const client = this.ensureClient()
-		const { id: modelId } = await this.fetchModel()
+		console.log(`[Ollama] createMessage: Fetching model info...`)
+		const { id: modelId, info: modelInfo } = await this.fetchModel()
+
+		console.log(
+			`[Ollama] createMessage: Model '${modelId}' fetched in ${Date.now() - requestStartTime}ms, ` +
+				`found in cache: ${!!this.models[modelId]}`,
+		)
+
+		// Warn if model wasn't found in the tool-capable models list
+		if (!this.models[modelId]) {
+			console.warn(
+				`[Ollama] Warning: Model '${modelId}' was not found in the list of tool-capable models. ` +
+					`This may indicate the model does not support native tool calling, or your Ollama version ` +
+					`does not report capabilities. Check with: ollama show ${modelId}`,
+			)
+		}
+
 		const useR1Format = modelId.toLowerCase().includes("deepseek-r1")

 		const ollamaMessages: Message[] = [
```
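The warning above relies on `this.models` being populated with tool-capable models. For confirming tool support for a single model directly, a minimal standalone check along these lines should work; this is a sketch, assuming an Ollama server new enough to report capabilities via `/api/show` (older versions omit the field), and `supportsTools` is a hypothetical helper name, not part of this PR:

```typescript
import { Ollama } from "ollama"

// Hypothetical helper: asks the Ollama server whether a model reports the
// "tools" capability. Older Ollama versions do not return `capabilities`,
// so a missing field is treated here as "unknown" (false).
async function supportsTools(host: string, modelId: string): Promise<boolean> {
	const client = new Ollama({ host })
	const info = await client.show({ model: modelId })
	const capabilities = (info as { capabilities?: string[] }).capabilities
	return capabilities?.includes("tools") ?? false
}

// Usage: supportsTools("http://localhost:11434", "llama3.1").then(console.log)
```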
```diff
@@ -234,15 +262,25 @@ export class NativeOllamaHandler extends BaseProvider implements SingleCompletio
 			chatOptions.num_ctx = this.options.ollamaNumCtx
 		}

+		const toolsToSend = this.convertToolsToOllama(metadata?.tools)
+		console.log(
+			`[Ollama] createMessage: Sending chat request to ${baseUrl}/api/chat with model '${modelId}', ` +
+				`${ollamaMessages.length} messages, ${toolsToSend?.length ?? 0} tools`,
+		)
+
+		const chatStartTime = Date.now()
+
+		// Create the actual API request promise
 		const stream = await client.chat({
 			model: modelId,
 			messages: ollamaMessages,
 			stream: true,
 			options: chatOptions,
-			tools: this.convertToolsToOllama(metadata?.tools),
+			tools: toolsToSend,
 		})

+		console.log(`[Ollama] createMessage: Stream started after ${Date.now() - chatStartTime}ms`)
+
 		let totalInputTokens = 0
 		let totalOutputTokens = 0
 		// Track tool calls across chunks (Ollama may send complete tool_calls in final chunk)
```
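One caveat for the "Stream started" log: `client.chat()` resolving only means the stream object was created, and the first token can still be delayed by model loading. A minimal sketch (not part of this diff) that would also surface time-to-first-chunk inside the existing consumption loop:

```typescript
// Sketch: log how long the first streamed chunk takes to arrive, since slow
// model loading typically shows up between stream creation and first token.
let firstChunkAt: number | undefined
for await (const chunk of stream) {
	if (firstChunkAt === undefined) {
		firstChunkAt = Date.now()
		console.log(`[Ollama] createMessage: First chunk after ${firstChunkAt - chatStartTime}ms`)
	}
	// ... existing chunk handling continues here, using `chunk` ...
}
```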
Review comment: `DEFAULT_OLLAMA_TIMEOUT_MS` is defined but never used. The constant and its comments suggest timeout handling exists for slow model loading, but it's not passed to the Ollama client anywhere. Either remove this dead code or apply the timeout to the client configuration if that was the intent.

Fix it with Roo Code or mention @roomote and request a fix.
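If the intent is to actually enforce the timeout, one option is the `fetch` override that the ollama npm package accepts in its constructor options. A minimal sketch, assuming Node 18+ for `AbortSignal.timeout` and the `baseUrl` variable from `ensureClient` above; this is not the PR's implementation:

```typescript
// Sketch only: wire DEFAULT_OLLAMA_TIMEOUT_MS into the client via a custom
// fetch. Note the signal covers the entire request, so a stream that
// legitimately runs longer than 5 minutes would also be aborted; a
// time-to-first-byte timeout would need a more involved implementation.
const clientOptions: OllamaOptions = {
	host: baseUrl,
	fetch: (input, init) =>
		fetch(input, { ...init, signal: AbortSignal.timeout(DEFAULT_OLLAMA_TIMEOUT_MS) }),
}
```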