Description
📋 Prerequisites
- I have searched the existing issues to avoid creating a duplicate
- By submitting this issue, you agree to follow our Code of Conduct
- I am using the latest version of the software
- I have tried to clear cache/cookies or used incognito mode (if ui-related)
- I can consistently reproduce this issue
🎯 Affected Service(s)
Multiple services / System-wide issue
🚦 Impact/Severity
Blocker
🐛 Bug Description
When using kagent 0.7.13 in a Kubernetes environment with a multi-agent setup (an orchestrator agent invoking other agents that use MCP tools), kagent intermittently crashes during MCP session cleanup.
The failure manifests as:
Warning: Error during MCP session cleanup for session_no_headers:
Attempted to exit a cancel scope that isn't the current task's current cancel scope
followed by a CancelledError from an asyncio queue during event stream shutdown, ultimately resulting in a 500 Internal Server Error from the kagent API.
The problem appears when:
- An orchestrator agent delegates work to another agent that uses MCP tools, and
- The tool call completes (or is cancelled), triggering MCP session teardown.
This gives me the impression that this is a bug in how MCP cancel scopes and task lifecycles are managed during cleanup, most likely when nested agents and MCP tools are involved.
As a result, agent calls in the UI return an empty response:
{"result":""}
Any tips on how this could be resolved would be greatly appreciated.
🔄 Steps To Reproduce
- Deploy kagent v0.7.13 on a kubeadm Kubernetes cluster (v1.34.3).
- Configure Azure OpenAI gpt-5-mini as the default model for all agents.
- Create an orchestrator agent that calls another agent as a tool:
apiVersion: kagent.dev/v1alpha2
kind: Agent
metadata:
  name: orchestrator-agent
spec:
  description: "(...)"
  type: Declarative
  declarative:
    modelConfig: default-model-config
    systemMessage: |
      (...)
    tools:
      - type: Agent
        agent:
          name: discovery-agent
- Create a secondary agent that uses an MCP server:
apiVersion: kagent.dev/v1alpha2
kind: Agent
metadata:
  name: discovery-agent
spec:
  description: "(...)"
  type: Declarative
  declarative:
    a2aConfig:
      skills:
        (...)
    modelConfig: default-model-config
    systemMessage: |
      (...)
    tools:
      - type: McpServer
        mcpServer:
          apiGroup: kagent.dev
          kind: RemoteMCPServer
          name: grafana-mcpserver
          toolNames:
            - list_prometheus_metric_names
            - list_prometheus_metric_metadata
            - list_prometheus_label_names
            - list_prometheus_label_values
- Issue a request to the orchestrator agent that causes it to call the discovery-agent, which then calls one of the Grafana MCP tools.
- Observe logs during tool execution.
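Because the failure is intermittent, it may help to scan saved agent logs for the three signatures quoted in this report rather than watching them live. The following is a minimal sketch of such a scan; the function name `scan_log` and the signature labels are my own, and the patterns are copied from the log excerpts in this issue rather than from any kagent log schema.

```python
import re

# Patterns taken from the log excerpts in this report (not an official list).
# "tasks?'s" accepts both spellings that appear in the messages above.
SIGNATURES = {
    "cancel_scope_mismatch": re.compile(
        r"Attempted to exit a cancel scope that isn't the current tasks?'s "
        r"current cancel scope"
    ),
    "cancelled_by_scope": re.compile(r"CancelledError: Cancelled by cancel scope"),
    "http_500": re.compile(r"\b500 Internal Server Error\b"),
}


def scan_log(lines):
    """Return {signature_name: [1-based line numbers]} for matching lines."""
    hits = {name: [] for name in SIGNATURES}
    for number, line in enumerate(lines, start=1):
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                hits[name].append(number)
    return hits
```

Any iterable of strings works as input, so a saved log file opened with `open(path)` or `sys.stdin` can be passed directly.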
🤔 Expected Behavior
- MCP sessions should close cleanly after tool execution.
- No warnings about cancel scopes.
📱 Actual Behavior
- kagent logs emit:
Warning: Error during MCP session cleanup for session_no_headers:
Attempted to exit a cancel scope that isn't the current task's current cancel scope
- This is followed by:
asyncio.exceptions.CancelledError: Cancelled by cancel scope ...
- The HTTP request ultimately fails with:
500 Internal Server Error
- The crash happens during event queue shutdown inside MCP cleanup (event_queue.close()), indicating incorrect handling of cancel scopes across tasks.
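To illustrate the second symptom in isolation: a task that is blocked in `asyncio.Queue.join()` raises `CancelledError` as soon as something cancels it, which is consistent with the last frames of the traceback in the Logs section. The snippet below is a generic asyncio sketch, not the a2a or kagent code; the function name and the queued item are made up, and a plain `Task.cancel()` stands in for whatever cancel scope fires in the real system.

```python
import asyncio


async def close_queue_under_cancellation():
    """Cancel a task that is blocked in Queue.join() and report the outcome."""
    queue = asyncio.Queue()
    queue.put_nowait("pending-event")  # never marked done, so join() blocks

    closer = asyncio.create_task(queue.join())
    await asyncio.sleep(0)  # give the closer task a chance to start waiting
    closer.cancel()         # stand-in for a stray cancel scope firing
    try:
        await closer
    except asyncio.CancelledError:
        return "cancelled"
    return "closed cleanly"
```

Running it with `asyncio.run(close_queue_under_cancellation())` takes the `CancelledError` branch, because the queue still has an unfinished item when the cancellation arrives.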
💻 Environment
| Component | Version / Details |
|---|---|
| kagent | 0.7.13 |
| Kubernetes | kubeadm v1.34.3 |
| Cloud / Infra | Self-managed k8s cluster |
| LLM provider | Azure OpenAI |
| Model | gpt-5-mini |
| MCP Servers Used | kagent-tools + Grafana MCP Server |
| Grafana MCP Server | 0.9.0 |
| Agent topology | Orchestrator → secondary agent → MCP tools |
🔧 CLI Bug Report
No response
🔍 Additional Context
- The stack trace suggests a mismatch between where a cancel scope is entered vs. exited during MCP session cleanup.
- Similar issues have been reported in other MCP-based projects involving cancel-scope lifecycles during teardown (e.g., mcp-agent and ADK Python).
- This may indicate a deeper issue in how kagent integrates MCP session management with asyncio task groups.
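The enter/exit mismatch described above can be sketched with the standard library alone. `TaskBoundScope` below is a toy stand-in that only imitates the "must be exited by the task that entered it" rule; it is not anyio's cancel scope, and none of these names come from kagent, the MCP SDK, or ADK. The second function shows one commonly used mitigation, where a single long-lived task owns both setup and teardown. Whether that pattern fits kagent's session handling is an assumption on my part.

```python
import asyncio


class TaskBoundScope:
    """Toy scope that must be exited by the task that entered it."""

    def __init__(self):
        self._owner = None

    async def __aenter__(self):
        self._owner = asyncio.current_task()
        return self

    async def __aexit__(self, exc_type, exc, tb):
        if asyncio.current_task() is not self._owner:
            raise RuntimeError("scope exited from a task that did not enter it")
        return False


async def exit_in_other_task():
    """Enter in a helper task, exit in the caller: triggers the mismatch."""
    scope = TaskBoundScope()
    await asyncio.create_task(scope.__aenter__())
    try:
        await scope.__aexit__(None, None, None)
    except RuntimeError as err:
        return f"failed: {err}"
    return "ok"


async def exit_in_owner_task():
    """One owner task both enters and exits the scope."""
    ready, stop = asyncio.Event(), asyncio.Event()

    async def owner():
        async with TaskBoundScope():
            ready.set()        # the session would be usable from here on
            await stop.wait()  # stay open until teardown is requested

    task = asyncio.create_task(owner())
    await ready.wait()
    stop.set()   # request teardown from the caller
    await task   # cleanup runs inside the owner task, so no mismatch
    return "ok"
```

With this toy scope, `asyncio.run(exit_in_other_task())` returns a string starting with `failed:`, while `asyncio.run(exit_in_owner_task())` returns `ok`.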
📋 Logs
Warning: Error during MCP session cleanup for session_no_headers: Attempted to exit a cancel scope that isn't the current tasks's current cancel scope
ERROR: Exception in ASGI application
Traceback (most recent call last):
File "/.kagent/.venv/lib/python3.13/site-packages/uvicorn/protocols/http/h11_impl.py", line 403, in run_asgi
result = await app( # type: ignore[func-returns-value]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
self.scope, self.receive, self.send
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
)
^
File "/.kagent/.venv/lib/python3.13/site-packages/uvicorn/middleware/proxy_headers.py", line 60, in __call__
return await self.app(scope, receive, send)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/.kagent/.venv/lib/python3.13/site-packages/fastapi/applications.py", line 1139, in __call__
INFO: 172.17.103.57:37580 - "POST / HTTP/1.1" 500 Internal Server Error
await super().__call__(scope, receive, send)
File "/.kagent/.venv/lib/python3.13/site-packages/starlette/applications.py", line 107, in __call__
await self.middleware_stack(scope, receive, send)
File "/.kagent/.venv/lib/python3.13/site-packages/starlette/middleware/errors.py", line 164, in __call__
await self.app(scope, receive, _send)
File "/.kagent/.venv/lib/python3.13/site-packages/opentelemetry/instrumentation/asgi/__init__.py", line 795, in __call__
await self.app(scope, otel_receive, otel_send)
File "/.kagent/.venv/lib/python3.13/site-packages/starlette/middleware/errors.py", line 164, in __call__
await self.app(scope, receive, _send)
File "/.kagent/.venv/lib/python3.13/site-packages/opentelemetry/instrumentation/fastapi/__init__.py", line 307, in __call__
await self.app(scope, receive, send)
File "/.kagent/.venv/lib/python3.13/site-packages/starlette/middleware/exceptions.py", line 63, in __call__
await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
File "/.kagent/.venv/lib/python3.13/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
await app(scope, receive, sender)
File "/.kagent/.venv/lib/python3.13/site-packages/fastapi/middleware/asyncexitstack.py", line 18, in __call__
await self.app(scope, receive, send)
File "/.kagent/.venv/lib/python3.13/site-packages/starlette/routing.py", line 716, in __call__
await self.middleware_stack(scope, receive, send)
File "/.kagent/.venv/lib/python3.13/site-packages/starlette/routing.py", line 736, in app
await route.handle(scope, receive, send)
File "/.kagent/.venv/lib/python3.13/site-packages/starlette/routing.py", line 290, in handle
await self.app(scope, receive, send)
File "/.kagent/.venv/lib/python3.13/site-packages/fastapi/routing.py", line 119, in app
await wrap_app_handling_exceptions(app, request)(scope, receive, send)
File "/.kagent/.venv/lib/python3.13/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
await app(scope, receive, sender)
File "/.kagent/.venv/lib/python3.13/site-packages/fastapi/routing.py", line 105, in app
response = await f(request)
^^^^^^^^^^^^^^^^
File "/.kagent/.venv/lib/python3.13/site-packages/fastapi/routing.py", line 385, in app
raw_response = await run_endpoint_function(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
...<3 lines>...
)
^
File "/.kagent/.venv/lib/python3.13/site-packages/fastapi/routing.py", line 284, in run_endpoint_function
return await dependant.call(**values)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/.kagent/.venv/lib/python3.13/site-packages/a2a/server/apps/jsonrpc/jsonrpc_app.py", line 368, in _handle_requests
return await self._process_non_streaming_request(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
request_id, a2a_request, call_context
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
)
^
File "/.kagent/.venv/lib/python3.13/site-packages/a2a/server/apps/jsonrpc/jsonrpc_app.py", line 448, in _process_non_streaming_request
handler_result = await self.handler.on_message_send(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
request_obj, context
^^^^^^^^^^^^^^^^^^^^
)
^
File "/.kagent/.venv/lib/python3.13/site-packages/a2a/utils/telemetry.py", line 196, in async_wrapper
result = await func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/.kagent/.venv/lib/python3.13/site-packages/a2a/server/request_handlers/jsonrpc_handler.py", line 106, in on_message_send
task_or_message = await self.request_handler.on_message_send(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
request.params, context
^^^^^^^^^^^^^^^^^^^^^^^
)
^
File "/.kagent/.venv/lib/python3.13/site-packages/a2a/utils/telemetry.py", line 196, in async_wrapper
result = await func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/.kagent/.venv/lib/python3.13/site-packages/a2a/server/request_handlers/default_request_handler.py", line 342, in on_message_send
await self._cleanup_producer(producer_task, task_id)
File "/.kagent/.venv/lib/python3.13/site-packages/a2a/utils/telemetry.py", line 196, in async_wrapper
result = await func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/.kagent/.venv/lib/python3.13/site-packages/a2a/server/request_handlers/default_request_handler.py", line 438, in _cleanup_producer
await producer_task
File "/.kagent/.venv/lib/python3.13/site-packages/a2a/utils/telemetry.py", line 196, in async_wrapper
result = await func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/.kagent/.venv/lib/python3.13/site-packages/a2a/server/request_handlers/default_request_handler.py", line 197, in _run_event_stream
await queue.close()
File "/.kagent/.venv/lib/python3.13/site-packages/a2a/utils/telemetry.py", line 196, in async_wrapper
result = await func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/.kagent/.venv/lib/python3.13/site-packages/a2a/server/events/event_queue.py", line 175, in close
await asyncio.gather(
self.queue.join(), *(child.close() for child in self._children)
)
File "/python/cpython-3.13.11-linux-x86_64-gnu/lib/python3.13/asyncio/queues.py", line 239, in join
async def join(self):
asyncio.exceptions.CancelledError: Cancelled by cancel scope 7f3e5ba34c00
📷 Screenshots
🙋 Are you willing to contribute?
- I am willing to submit a PR to fix this issue