bug: Ollama, Mistral, LlamaAPI, and Writer providers missing ContextWindowOverflowException #2052
Description
Four model providers do not raise ContextWindowOverflowException when the context window is exceeded. The SDK's event loop catches this exception to trigger automatic context reduction via ConversationManager.reduce_context(). Without it, these providers crash with raw API errors and the recovery path never runs.
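The recovery path can be sketched roughly as follows. This is an illustrative simplification, not the SDK's actual event-loop code: the class and method names mirror the SDK's `ConversationManager` API, but the loop body and window logic here are assumptions for demonstration.

```python
# Simplified sketch of the recovery path: when the provider raises
# ContextWindowOverflowException, the event loop reduces context and retries.
# Names mirror the SDK; the implementations below are illustrative stand-ins.

class ContextWindowOverflowException(Exception):
    """Stand-in for the SDK's exception type (illustration only)."""


class SlidingWindowConversationManager:
    def __init__(self, window_size: int):
        self.window_size = window_size

    def reduce_context(self, messages: list) -> list:
        # Keep only the most recent messages that fit the window.
        return messages[-self.window_size:]


def run_turn(messages, call_model):
    """One simplified agent turn with automatic context reduction."""
    manager = SlidingWindowConversationManager(window_size=2)
    try:
        return call_model(messages)
    except ContextWindowOverflowException:
        # This branch only runs if the provider raised the SDK exception --
        # which the four providers below never do.
        return call_model(manager.reduce_context(messages))
```

If the provider surfaces a raw API error instead of `ContextWindowOverflowException`, the `except` branch never runs and the error propagates to the caller.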
Providers that handle it (7): Anthropic, Bedrock, OpenAI, OpenAI Responses, LiteLLM, Gemini, llama.cpp
Providers that don't (4):
| Provider | File | What happens instead |
|---|---|---|
| Ollama | models/ollama.py | No exception handling in `stream()` at all |
| Mistral | models/mistral.py:502-504 | Only catches rate limits via `"rate" in str(e).lower()` |
| LlamaAPI | models/llamaapi.py:369 | Only catches `llama_api_client.RateLimitError` |
| Writer | models/writer.py:398 | Only catches `writerai.RateLimitError` |
Impact
Agents using these four providers work fine in short conversations but crash with unhandled errors in longer conversations when the context window fills up. The SlidingWindowConversationManager recovery logic is effectively dead code for these providers.
Suggested fix
Add provider-specific error detection to each stream() method, mapping the provider's context-overflow error to ContextWindowOverflowException. The pattern is the same as what Anthropic, OpenAI, and Bedrock already do — catch the provider-specific exception and re-raise as the SDK exception.
Each provider has different error formats:
- Ollama: Check for context-related error strings in `ResponseError`
- Mistral: Catch `MistralAPIException` and check for token/context error codes
- LlamaAPI: Catch API errors with context-related messages
- Writer: Catch API errors with context-related messages
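A minimal sketch of the detection logic, using string matching since three of the four providers only expose the overflow as message text. The `ContextWindowOverflowException` stand-in and the marker strings are assumptions; the real marker list would need to be verified against each provider's actual error messages.

```python
# Hypothetical sketch of the shared fix: inspect a provider's raw error and
# re-raise it as the SDK's ContextWindowOverflowException when it looks like
# a context-window overflow. The marker strings are assumptions.

class ContextWindowOverflowException(Exception):
    """Stand-in for the SDK exception (illustration only)."""


# Substrings that commonly appear in context-overflow error messages.
CONTEXT_ERROR_MARKERS = ("context length", "context window", "too many tokens")


def raise_as_sdk_exception(error: Exception) -> None:
    """Re-raise a provider error as ContextWindowOverflowException if it
    looks like a context overflow; otherwise re-raise it unchanged."""
    message = str(error).lower()
    if any(marker in message for marker in CONTEXT_ERROR_MARKERS):
        raise ContextWindowOverflowException(str(error)) from error
    raise error
```

Each provider's `stream()` would wrap its API call in a `try`/`except` on the provider-specific exception type (e.g. `ollama.ResponseError`) and delegate to a helper like this, matching the structure already used in the Anthropic, OpenAI, and Bedrock providers.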
Happy to submit a PR for this.