fix: set StatusCode.ERROR on tool spans when tool result status is error#2027
fix: set StatusCode.ERROR on tool spans when tool result status is error#2027nicolas-bezdolya wants to merge 1 commit intostrands-agents:mainfrom
Conversation
When a tool raises an exception, Strands catches it and converts it to a ToolResult dict with status='error'. The original exception object is lost before end_tool_call_span is called, so the span always gets StatusCode.OK. This fix synthesizes an Exception from the error content in the tool result when no explicit error is passed, ensuring the span correctly gets StatusCode.ERROR. This is critical for observability backends like Langfuse that rely on span status codes for error detection. Fixes strands-agents#2016
| if error is None and tool_result is not None and tool_result.get("status") == "error": | ||
| content = tool_result.get("content", []) | ||
| error_message = content[0].get("text", "Tool call failed") if content else "Tool call failed" | ||
| error = Exception(error_message) |
There was a problem hiding this comment.
Issue: Missing test coverage for this new error-handling logic.
Suggestion: Add unit tests to verify:
end_tool_call_spansetsStatusCode.ERRORwhentool_result["status"] == "error"- The error message is correctly extracted from
content[0]["text"] - The default message "Tool call failed" is used when content is empty
Example test structure:
def test_end_tool_call_span_with_error_status(mock_span):
"""Test that tool spans get StatusCode.ERROR when tool result status is error."""
tracer = Tracer()
tool_result = {
"status": "error",
"content": [{"text": "Error: ValueError - something failed"}],
"toolUseId": "test-id"
}
tracer.end_tool_call_span(mock_span, tool_result)
mock_span.set_status.assert_called_once_with(
StatusCode.ERROR,
"Error: ValueError - something failed"
)
mock_span.record_exception.assert_called_once()
def test_end_tool_call_span_with_error_status_empty_content(mock_span):
"""Test default error message when content is empty."""
tracer = Tracer()
tool_result = {"status": "error", "content": [], "toolUseId": "test-id"}
tracer.end_tool_call_span(mock_span, tool_result)
mock_span.set_status.assert_called_once_with(StatusCode.ERROR, "Tool call failed")|
Assessment: Request Changes This PR correctly addresses the issue where tool spans receive Review Details
The fix itself looks correct - please add tests to verify the behavior before merging. |
| }, | ||
| ) | ||
|
|
||
| # If no explicit error but tool_result indicates an error, synthesize one so the span gets StatusCode.ERROR |
There was a problem hiding this comment.
instead of fixing it here, shouldn't we fix it downstream so that the exception is propagated up the chain? Otherwise, we are doing exception -> text in the tool, and then text -> exception here.
Problem
When a tool raises an exception, Strands catches it in
_stream()and converts it to aToolResultdict withstatus='error'. The original exception object is lost beforeend_tool_call_spanis called, so_end_span()always falls into theelsebranch and marks the span asStatusCode.OK.The
gen_ai.tool.status: "error"attribute IS correctly set, but the spanStatusCoderemainsOK. This is critical because observability backends like Langfuse rely onStatusCode.ERRORto raise alarms/statistics on failed tool calls.Solution
In
end_tool_call_span(), before calling_end_span(), check iftool_result["status"] == "error"and synthesize anExceptionfrom the error content. This ensures the span correctly getsStatusCode.ERRORandrecord_exception()is called.Testing
Fixes #2016