feat: add token-aware conversation manager with proactive compaction #2038
Open
FlorentLa wants to merge 1 commit into strands-agents:main from
Conversation
Token-based context management that uses actual inputTokens from model responses to decide when to compact, instead of counting messages.

Four-pass compaction strategy:
1. Sanitize — strip ANSI escape codes, collapse repeated lines
2. Truncate — replace oversized tool results with placeholders
3. Summarize — use model.stream() to summarize older messages
4. Trim — remove oldest messages as last resort

The first user message is always preserved so the agent never loses sight of its original task. Summarization calls model.stream() directly, avoiding re-entrant agent invocation and deadlocks on _invocation_lock.
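The sanitize pass described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the PR's actual code: the function name, the ANSI regex, and the "collapse consecutive duplicate lines" behavior are inferred from the one-line description.

```python
import re

# Assumption: "sanitize" strips ANSI escape sequences and collapses runs of
# identical consecutive lines in noisy tool output. Names are illustrative.
ANSI_RE = re.compile(r"\x1b\[[0-9;]*[A-Za-z]")

def sanitize(text: str) -> str:
    """Strip ANSI escapes, then collapse consecutive duplicate lines."""
    cleaned = ANSI_RE.sub("", text)
    out: list[str] = []
    for line in cleaned.splitlines():
        if not out or out[-1] != line:
            out.append(line)
    return "\n".join(out)
```

Because this pass is lossless for distinct content, it is a reasonable first step before the destructive truncate/summarize/trim passes.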
4d15ba5 to fd785cd
FlorentLa pushed a commit to FlorentLa/sdk-python that referenced this pull request on Apr 2, 2026
35 tests covering all four compaction passes, hook callbacks, state persistence, role alternation after summarization, and edge cases (too few messages, summarization failure fallback). Depends on strands-agents#2038 being merged first (imports TokenAwareConversationManager).
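The "role alternation after summarization" property those tests cover can be sketched as a simple predicate. The helper name and message shape are illustrative assumptions, not taken from the PR's test suite.

```python
# Hypothetical check: after compaction, no two adjacent messages should
# share the same role (user/assistant must alternate).
def roles_alternate(messages: list[dict]) -> bool:
    """True if no two adjacent messages have the same role."""
    roles = [m["role"] for m in messages]
    return all(a != b for a, b in zip(roles, roles[1:]))
```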
Motivation
Autonomous agent workloads with long tool-call cycles (web browsing, code generation, research) accumulate context rapidly. The existing conversation managers either react only to context overflow (SummarizingConversationManager) or count messages without regard to actual token usage (SlidingWindowConversationManager). Neither proactively manages context based on the real token pressure the model experiences. TokenAwareConversationManager reads actual inputTokens from model response metrics and triggers compaction before hitting the context window limit, using a four-pass strategy that preserves as much useful context as possible.

Public API Changes
New class TokenAwareConversationManager exported from strands.agent.conversation_manager.

Four-pass compaction strategy (sanitize, truncate, summarize, trim) when the threshold is exceeded; the summarize pass calls model.stream() directly to summarize older messages into a concise assistant message. The first user message (original task) is always preserved. The summary is inserted as an assistant message to maintain proper role alternation.

Use Cases
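The preservation and role-alternation behavior described above can be sketched in isolation. This is an illustrative sketch, not the PR's implementation: the function name, the keep_last parameter, and the message shape ({"role", "content"} dicts) are assumptions.

```python
# Illustrative sketch: replace older messages with a single assistant-role
# summary, always keeping the first user message (the original task) and
# repairing user/assistant alternation at the seam.
def compact_with_summary(messages: list[dict], keep_last: int, summary: str) -> list[dict]:
    """Keep the first user message, insert the summary as an assistant
    message, then append the most recent keep_last messages."""
    first = messages[0]            # original task, always preserved
    tail = messages[-keep_last:]   # most recent context
    compacted = [first, {"role": "assistant", "content": summary}]
    # If the tail starts with an assistant message, drop it so the inserted
    # assistant summary does not sit next to another assistant turn.
    if tail and tail[0]["role"] == "assistant":
        tail = tail[1:]
    return compacted + tail
```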
Testing
- Integration test (test_token_aware_100k.py) verified against Bedrock Haiku 4.5 with a 100k+ token threshold — compaction triggered correctly and the agent remained coherent after summarization
- hatch fmt --formatter + hatch fmt --linter clean