-
Notifications
You must be signed in to change notification settings - Fork 47
Description
Summary
When a structured list of 48 items is provided in the system prompt inside <available_skills> XML tags, M2.5 consistently enumerates only 46 of them. The same 3 items are dropped on every request, across all sessions and prompt variations. The items are confirmed present in the system prompt server-side. The model CAN reference the dropped items by name when directly prompted, indicating they are processed during inference — the failure is isolated to sequential enumeration/recall.
Environment
| Field | Value |
|---|---|
| Model | MiniMax M2.5 |
| Endpoint | https://api.minimax.io/anthropic/v1/messages |
| Protocol | Anthropic Messages API (compatible) |
| Client | Rust application |
| System prompt size | ~30-40 KB total |
| Skill catalog size | ~22 KB, 48 entries |
Reproduction Steps
- Construct a system prompt containing an
<available_skills>block with 48 entries formatted as a flat list, one per line:<available_skills> - skill-name-1: description text here - skill-name-2: description text here ...48 total entries... </available_skills> - Send a user message such as:
"List all your available skills"or"Count your skills"or"Copy-paste the available_skills block" - Model responds with only 46 items.
Expected: Model enumerates all 48 items.
Actual: Model enumerates 46, consistently dropping the same 3 entries (community-building, narrative-design, video-scriptwriting).
Diagnostic Details
- Server-side verified: Application logging confirms all 48 entries (22,220 bytes) are present in the system prompt sent to the API. The omission occurs on the model side.
- Not positional: The 3 dropped items are not clustered at any specific position in the list.
- Not length-related: The dropped items do not have unusually long or short descriptions. Trimming their descriptions to match the average length of other entries had no effect on the behavior.
- Not a formatting issue: All 48 entries use identical formatting (
- name: description). No special characters, encoding differences, or whitespace anomalies distinguish the dropped items from the rest. - Functional access is intact: Requesting the model to activate or reference a dropped item by name succeeds (e.g.,
"activate the community-building skill"works correctly). This confirms the items are present in the model's context during inference. The failure is specific to self-enumeration of the full list. - 100% reproducible: The same 3 items are dropped across multiple sessions, application restarts, and all tested prompt phrasings.
Workaround
None found. Affected items remain functional when referenced by name, but the model cannot reliably enumerate or audit its own full list from the system prompt.
Questions
- Does M2.5 apply any truncation, deduplication, or salience-based filtering when enumerating items from structured blocks in the system prompt?
- Is this a known limitation of the model's list-recall behavior?
- Would adjusting prompt structure (e.g., numbered list, different XML schema, chunked enumeration) mitigate this?