feat: Add reasoning support for Cohere chat generator#2266
feat: Add reasoning support for Cohere chat generator#2266ChinmayBansal wants to merge 8 commits intodeepset-ai:mainfrom
Conversation
9bce185 to
52ae02b
Compare
...rations/cohere/src/haystack_integrations/components/generators/cohere/chat/chat_generator.py
Outdated
Show resolved
Hide resolved
|
@sjrl I have addressed your feedback. The changes I have made are using the native method while keeping the fallback method if the native method does not apply reasoning content. I have also updated the test cases. Let me know if more changes are needed! |
...rations/cohere/src/haystack_integrations/components/generators/cohere/chat/chat_generator.py
Outdated
Show resolved
Hide resolved
|
Hey @ChinmayBansal I noticed in your PR description you also mention adding support for reasoning content when streaming as well, but I don't see any changes in the code to reflect this. Is this something you could also add? |
|
@sjrl I had included it in the text-based fallback extraction thinking it was needed for streaming. After doing some more research, I found that streaming API does support thinking with |
I'd recommend checking out how it's used in the OpenAIRresponsesChatGenerator here https://github.com/deepset-ai/haystack/blob/0100a4849abbe1cbaa6d2b1cdbb3e1d2a471c3ff/haystack/components/generators/chat/openai_responses.py#L579 |
|
@sjrl I have added streaming support for reasoning content. I handled the content-deltas now and created reasoning content objects for thinking deltas. Since the StreamingChunk can have one of content/tool_calls/reasoning set at a time, I prioritized thinking content over text when both are there. I also added test cases and followed the OpenAI implementation. |
Related Issues
Proposed Changes:
This PR implements reasoning support for Cohere's chat generator following the parent issue #9700 requirements. The implementation adds the ability to extract and store model reasoning content in the
ReasoningContentfield ofChatMessage.Key Features Added:
_extract_reasoning_from_response) thatsupports multiple formats:
<thinking>,<reasoning>## Reasoning,## Thinking,## My reasoningcall scenarios
_convert_streaming_chunks_to_chat_message_with_reasoningfunctionwithout conflicts
text
positives
The implementation follows the established patterns from Ollama (#2200) and
Google GenAI (#2212) PRs, ensuring consistency across providers.
How did you test it?
TestReasoningExtraction)covering:
<thinking>,<reasoning>,step-by-step)
TestCohereChatGeneratorReasoningclass with:(lightweight)
rate limiting (API working correctly)
hatch run fmt) and type checking (hatch run test:types) passNotes for the reviewer
_extract_reasoning_from_response()function (lines ~355-435)
_parse_response()for both regular and toolcall paths (lines ~194-220)
reasoning in response text
command-a-reasoning-111b-2024-10-03model asrecommended
thinkingparameter viageneration_kwargs={"thinking": True}Checklist
fix:,feat:,build:,chore:,ci:,docs:,style:,refactor:,perf:,test:.