Skip to content

Fix/cron agent tool call not triggered#6112

Closed
NayukiChiba wants to merge 1 commit intoAstrBotDevs:masterfrom
NayukiChiba:fix/cron-agent-tool-call-not-triggered
Closed

Fix/cron agent tool call not triggered#6112
NayukiChiba wants to merge 1 commit intoAstrBotDevs:masterfrom
NayukiChiba:fix/cron-agent-tool-call-not-triggered

Conversation

@NayukiChiba
Copy link
Contributor

@NayukiChiba NayukiChiba commented Mar 12, 2026

问题描述

修复 #6103

定时任务触发后,Agent 不调用 send_message_to_user 工具,finish_reasonstop(纯文本输出),消息从未实际发送给用户。

根本原因

  1. req.prompt 误导:末尾指令要求 LLM「总结并输出执行结果」,LLM 直接生成文本即认为任务完成,不调用工具。
  2. 系统提示语义模糊PROACTIVE_AGENT_CRON_WOKE_SYSTEM_PROMPT 规则 5 写的是「if needed」,LLM 不知道文本对用户完全不可见;存在换行符缺失导致格式混乱。
  3. 历史污染反馈循环:失败执行(未调用工具)也写入对话历史,LLM 从历史中学到「文本输出 = 完成」,形成恶性循环。

修复方案

  • req.prompt:明确说明文本不可见,必须调用工具才能传达消息给用户。
  • PROACTIVE_AGENT_CRON_WOKE_SYSTEM_PROMPT:强制要求工具调用,修复换行符缺失。
  • _woke_main_agent:仅在 tool_was_called 时写入历史,否则记录 warning 并跳过,打破反馈循环。

Summary by Sourcery

确保由 cron 触发的代理通过工具调用可靠地向用户发送主动消息,并在未发生投递时避免污染会话历史。

Bug Fixes:

  • 强制 cron 代理调用 send_message_to_user 工具,而不是发出不可见的纯文本响应,从而确保计划任务产生的消息实际能够发送给用户。
  • 当某次 cron 运行中没有调用任何工具或不存在 LLM 响应时,不再持久化会话历史,避免在后续运行中产生误导性的上下文。
  • 处理来自不支持 /models 端点的 Anthropic 兼容提供商的模型列出失败问题,改为返回空列表而不是抛出异常。

Enhancements:

  • 明确并强化主动 cron 代理的系统提示词,清楚说明文本对用户不可见且 send_message_to_user 工具是唯一的投递机制,并修复模板中的格式问题以提高可读性。

Tests:

  • 新增单元测试,覆盖 cron 代理提示词构造、系统提示词的可见性保证、基于工具使用情况和 LLM 响应的历史持久化行为、缺失工具调用的日志记录,以及在工具集中包含 send_message_to_user 等场景。
Original summary in English

Summary by Sourcery

Ensure cron-triggered agents reliably deliver proactive messages via tool calls and avoid polluting history when no delivery occurred.

Bug Fixes:

  • Force cron agents to call the send_message_to_user tool instead of emitting invisible plain-text responses so scheduled messages are actually delivered to users.
  • Avoid persisting conversation history for cron runs where no tools were called or no LLM response exists, preventing misleading context in future runs.
  • Handle failures when listing models from Anthropic-compatible providers that do not support the /models endpoint by returning an empty list instead of raising.

Enhancements:

  • Clarify and strengthen the proactive cron agent system prompt so that it explicitly states text is not user-visible and the send_message_to_user tool is the only delivery mechanism, and fix formatting issues in the template for better readability.

Tests:

  • Add unit tests covering cron agent prompt construction, system prompt visibility guarantees, history persistence behavior based on tool usage and LLM responses, logging of missing tool calls, and inclusion of send_message_to_user in the toolset.

@dosubot dosubot bot added the size:L This PR changes 100-499 lines, ignoring generated files. label Mar 12, 2026
@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

此拉取请求旨在解决定时任务代理在触发后无法有效向用户发送消息的关键问题。通过精确调整代理的系统提示和运行时指令,明确强制其使用 send_message_to_user 工具进行通信,同时优化了对话历史记录的存储机制,确保只有成功的工具调用才会被记录,从而避免了代理从无效交互中学习。这些改进确保了定时任务代理能够可靠地执行其通信职责,并提升了系统的整体健壮性。

Highlights

  • 修复定时任务代理工具调用问题: 解决了定时任务触发后,Agent 未能调用 send_message_to_user 工具,导致消息未实际发送给用户的问题。
  • 优化代理提示词: 明确了 req.promptPROACTIVE_AGENT_CRON_WOKE_SYSTEM_PROMPT,强制要求代理通过工具而非纯文本输出与用户交互,并修复了系统提示中的格式问题。
  • 改进历史记录持久化逻辑: 修改了 _woke_main_agent,仅当代理实际调用了工具时才将执行记录写入对话历史,避免了因未发送消息的失败执行而污染历史上下文,打破了恶性反馈循环。
  • 增强模型提供商兼容性: 为 Anthropic 模型提供商添加了错误处理,使其在部分第三方提供商不支持 /models 端点时能优雅降级。
  • 新增单元测试: 为 _woke_main_agent 的提示词构造和历史记录持久化逻辑添加了全面的单元测试,确保修复的有效性。
Changelog
  • astrbot/core/astr_main_agent_resources.py
    • 在系统提示词“You are given:”后添加了换行符,改善了可读性。
    • 更新了 PROACTIVE_AGENT_CRON_WOKE_SYSTEM_PROMPT 中的第5条规则,明确指出代理的文本输出对用户不可见,并强制要求使用 send_message_to_user 工具来传递消息。
  • astrbot/core/cron/manager.py
    • 修改了 _woke_main_agent 函数中的 req.prompt,使其更明确地指示代理必须调用 send_message_to_user 工具,并强调直接文本响应对用户不可见。
    • 重构了 _woke_main_agent 中的历史记录持久化逻辑,现在仅当代理实际调用了工具时才将执行结果写入对话历史,否则记录警告并跳过,以防止误导性记录。
  • astrbot/core/provider/sources/anthropic_source.py
    • get_models 方法中添加了 try-except 块,用于捕获 httpx.HTTPStatusErroranthropic.APIStatusError,以兼容不支持 /models 端点的第三方提供商,并在失败时记录警告。
  • tests/unit/test_cron_manager.py
    • 新增了 TestWokeMainAgentPrompt 类,包含测试用例以验证 _woke_main_agent 生成的 req.prompt 和系统提示词 PROACTIVE_AGENT_CRON_WOKE_SYSTEM_PROMPT 是否包含正确的工具调用指令和可见性警告,并检查了系统提示词的格式。
    • 新增了 TestWokeMainAgentHistoryPersistence 类,包含测试用例以验证 _woke_main_agent 在工具被调用、未被调用或无 LLM 响应时,历史记录持久化行为的正确性,并确保在未调用工具时记录警告日志。
    • 添加了测试用例以确认 req.func_tool 中包含了 send_message_to_user 工具。
Activity
  • 此拉取请求的活动信息未在上下文中提供。
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@dosubot dosubot bot added the area:core The bug / feature is about astrbot's core, backend label Mar 12, 2026
Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

此拉取请求有效地解决了一个关键问题,即由 cron 触发的代理未能使用必要的工具发送消息,从而导致静默失败。所做的更改是全面的:

  1. 提示词工程:针对 cron 代理的系统提示词和用户提示词都得到了显著加强,明确且强制性地指示 LLM 使用 send_message_to_user 工具,并阐明了直接的文本输出对用户不可见。
  2. 历史记录管理:一项关键的逻辑变更现在可以防止在没有调用工具的失败运行污染对话历史。只有在实际使用了工具的情况下,历史记录才会被持久化,这打破了负反馈循环,并提高了代理未来的性能。
  3. 错误处理:通过优雅地处理从不支持模型列表端点的提供商处列出模型时的 API 错误,anthropic_source 的稳健性得到了增强。
  4. 测试:增加了广泛的单元测试来验证新行为,包括提示词构建、有条件的历史持久化和日志记录,确保了修复的质量和正确性。

实现清晰,逻辑严谨,并且修复措施精确地针对已识别的根本原因。在审查过程中未发现任何问题。

Copy link
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - 我发现了 1 个问题,并给出了一些高层面的反馈:

  • _woke_main_agent 中的 tool_was_called = any(msg.role == "tool" for msg in runner.run_context.messages) 检查假设工具总是以 "tool" 角色的消息形式呈现;如果 runner 或 provider 使用不同方式表示工具调用(例如通过 tool_calls 字段或其他角色),这个启发式可能会误判运行结果并跳过历史持久化,因此建议改为基于 runner 的结构化工具调用元数据来判断。
  • 若干新增测试对提示词和日志消息中的特定英文子串("MUST""NOT visible"、warning 文本)做断言,这会使它们对无伤大雅的文案修改非常脆弱;更稳健的方式是针对结构性质做断言(例如在 func_tool.tools 中存在 send_message_to_user、布尔标志,或关键短语组),而不是对精确措辞做断言。
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- `_woke_main_agent` 中的 `tool_was_called = any(msg.role == "tool" for msg in runner.run_context.messages)` 检查假设工具总是以 `"tool"` 角色的消息形式呈现;如果 runner 或 provider 使用不同方式表示工具调用(例如通过 `tool_calls` 字段或其他角色),这个启发式可能会误判运行结果并跳过历史持久化,因此建议改为基于 runner 的结构化工具调用元数据来判断。
- 若干新增测试对提示词和日志消息中的特定英文子串(`"MUST"``"NOT visible"`、warning 文本)做断言,这会使它们对无伤大雅的文案修改非常脆弱;更稳健的方式是针对结构性质做断言(例如在 `func_tool.tools` 中存在 `send_message_to_user`、布尔标志,或关键短语组),而不是对精确措辞做断言。

## Individual Comments

### Comment 1
<location path="tests/unit/test_cron_manager.py" line_range="739-741" />
<code_context>
+                extras={"cron_job": {"id": "j4", "name": "test"}, "cron_payload": {}},
+            )
+
+        # 必须有相关警告
+        warning_calls = [str(c) for c in mock_logger.warning.call_args_list]
+        assert any("tool" in w.lower() or "NOT" in w for w in warning_calls), (
+            "未找到关于工具未被调用的 warning 日志"
+        )
</code_context>
<issue_to_address>
**suggestion (testing):** 收紧 logger warning 的断言,使测试对无关日志调用更加稳健

当前断言可能会因为不相关的 warning 恰好包含 “tool” 或 “NOT” 而通过,即便关键的 warning 消息已经退化。更好的方式是对该场景发出的特定 warning 进行断言,例如:

```python
mock_logger.warning.assert_called_once()
msg, *rest = mock_logger.warning.call_args[0]
assert "did not call any tools" in msg
assert "NOT delivered" in msg
```

这样可以让测试更精确,并且对无关日志变更更具韧性。
</issue_to_address>

Sourcery 对开源项目免费——如果你喜欢我们的评审,请考虑分享它们 ✨
帮我变得更有用!请在每条评论上点 👍 或 👎,我会根据你的反馈改进评审质量。
Original comment in English

Hey - I've found 1 issue, and left some high level feedback:

  • The tool_was_called = any(msg.role == "tool" for msg in runner.run_context.messages) check in _woke_main_agent assumes tools are always surfaced as messages with role "tool"; if the runner or provider represents tool calls differently (e.g., via tool_calls fields or other roles), this heuristic may misclassify runs and skip history persistence, so consider basing this on the runner's structured tool-call metadata instead.
  • Several new tests assert against specific English substrings in prompts and log messages ("MUST", "NOT visible", warning text), which makes them brittle to harmless copy edits; it would be more robust to assert on structural properties (e.g., presence of send_message_to_user in func_tool.tools, boolean flags, or keyphrase groups) rather than exact phrasing.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- The `tool_was_called = any(msg.role == "tool" for msg in runner.run_context.messages)` check in `_woke_main_agent` assumes tools are always surfaced as messages with role `"tool"`; if the runner or provider represents tool calls differently (e.g., via `tool_calls` fields or other roles), this heuristic may misclassify runs and skip history persistence, so consider basing this on the runner's structured tool-call metadata instead.
- Several new tests assert against specific English substrings in prompts and log messages (`"MUST"`, `"NOT visible"`, warning text), which makes them brittle to harmless copy edits; it would be more robust to assert on structural properties (e.g., presence of `send_message_to_user` in `func_tool.tools`, boolean flags, or keyphrase groups) rather than exact phrasing.

## Individual Comments

### Comment 1
<location path="tests/unit/test_cron_manager.py" line_range="739-741" />
<code_context>
+                extras={"cron_job": {"id": "j4", "name": "test"}, "cron_payload": {}},
+            )
+
+        # 必须有相关警告
+        warning_calls = [str(c) for c in mock_logger.warning.call_args_list]
+        assert any("tool" in w.lower() or "NOT" in w for w in warning_calls), (
+            "未找到关于工具未被调用的 warning 日志"
+        )
</code_context>
<issue_to_address>
**suggestion (testing):** Tighten the logger warning assertion to make the test more robust against unrelated log calls

The current assertion could pass due to unrelated warnings that happen to contain “tool” or “NOT”, and would still pass if the key warning message were degraded. Instead, assert on the specific warning emitted by this scenario, e.g.:

```python
mock_logger.warning.assert_called_once()
msg, *rest = mock_logger.warning.call_args[0]
assert "did not call any tools" in msg
assert "NOT delivered" in msg
```

This makes the test more precise and resilient to unrelated log changes.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Comment on lines +739 to +741
# 必须有相关警告
warning_calls = [str(c) for c in mock_logger.warning.call_args_list]
assert any("tool" in w.lower() or "NOT" in w for w in warning_calls), (
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (testing): 收紧 logger warning 的断言,使测试对无关日志调用更加稳健

当前断言可能会因为不相关的 warning 恰好包含 “tool” 或 “NOT” 而通过,即便关键的 warning 消息已经退化。更好的方式是对该场景发出的特定 warning 进行断言,例如:

mock_logger.warning.assert_called_once()
msg, *rest = mock_logger.warning.call_args[0]
assert "did not call any tools" in msg
assert "NOT delivered" in msg

这样可以让测试更精确,并且对无关日志变更更具韧性。

Original comment in English

suggestion (testing): Tighten the logger warning assertion to make the test more robust against unrelated log calls

The current assertion could pass due to unrelated warnings that happen to contain “tool” or “NOT”, and would still pass if the key warning message were degraded. Instead, assert on the specific warning emitted by this scenario, e.g.:

mock_logger.warning.assert_called_once()
msg, *rest = mock_logger.warning.call_args[0]
assert "did not call any tools" in msg
assert "NOT delivered" in msg

This makes the test more precise and resilient to unrelated log changes.

@dosubot
Copy link

dosubot bot commented Mar 12, 2026

Related Documentation

1 document(s) may need updating based on files changed in this PR:

AstrBotTeam's Space

pr4697的改动
View Suggested Changes
@@ -327,6 +327,20 @@
 - 任务管理:通过工具或 Web UI 创建、修改、删除定时任务。主代理可为每个任务添加描述,便于后续管理。
 - 触发机制:到达设定时间后,CronJobManager 调用 MainAgent 执行任务循环,自动完成预设操作。
 
+**定时任务工具调用修复(PR #6112)**
+
+[PR #6112](https://github.com/AstrBotDevs/AstrBot/pull/6112) 修复了定时代理不调用 `send_message_to_user` 工具的关键问题([Issue #6103](https://github.com/AstrBotDevs/AstrBot/issues/6103))。修复前,定时任务触发后,Agent 会生成纯文本输出(`finish_reason: stop`),但消息从未实际发送给用户,导致用户无法收到定时推送。
+
+**根本原因:**
+1. **误导性提示词**:`req.prompt` 末尾要求 LLM「总结并输出执行结果」,导致 LLM 直接生成文本即认为任务完成,不调用工具
+2. **系统提示语义模糊**:`PROACTIVE_AGENT_CRON_WOKE_SYSTEM_PROMPT` 规则 5 写的是「if needed」,LLM 不知道文本对用户完全不可见;此外,存在换行符缺失导致格式混乱
+3. **历史污染反馈循环**:失败执行(未调用工具)也写入对话历史,LLM 从历史中学到「文本输出 = 完成」,形成恶性循环
+
+**修复方案:**
+- **req.prompt 增强**:明确说明「你的文本响应对用户不可见——只有工具调用才会生效。你必须调用 `send_message_to_user` 工具才能将消息传递给用户。」
+- **系统提示强化**:规则 5 现在强制要求:「重要:你的文本输出对用户不可见。向用户传递消息的唯一方式是调用 `send_message_to_user` 工具。你必须调用此工具发送任何消息——不要只生成文本。」同时修复了换行符缺失问题。
+- **历史持久化逻辑改进**:现在仅在 Agent 实际调用了工具时(`tool_was_called` 检查)才将对话历史持久化,防止失败执行污染上下文,打破负反馈循环。
+
 #### 使用场景示例
 用户可通过对话指令或 UI 创建定时任务,例如“每天早上8点发送今日早报”。主代理会注册一个 ActiveAgentCronJob,定时触发后自动执行相关脚本并通过 send_message_to_user 工具推送内容。
 
@@ -344,6 +358,39 @@
 
 - 支持多平台:Telegram、Slack、Discord 等
 - 支持多媒体类型:文本、图片、语音消息(record)、视频、文件等
+
+#### 提示词强制要求(PR #6112)
+在定时任务场景中,系统通过以下机制确保 Agent 正确调用 `send_message_to_user` 工具:
+
+**用户提示(req.prompt):**
+```
+A scheduled task has been triggered. 
+Proceed according to your system instructions. 
+You MUST call the `send_message_to_user` tool to deliver any message to the user. 
+Your direct text response is NOT visible to the user — only tool calls take effect. 
+Use the same language as the previous conversation.
+```
+
+**系统提示(PROACTIVE_AGENT_CRON_WOKE_SYSTEM_PROMPT 规则 5):**
+```
+IMPORTANT: Your text output is NOT visible to the user. The ONLY way to deliver a message 
+to the user is by calling the `send_message_to_user` tool. You MUST call this tool to send 
+any message — do NOT just generate text.
+```
+
+这些提示词明确告知 LLM:文本响应对用户完全不可见,必须通过工具调用才能传递消息。
+
+#### 历史持久化逻辑
+为防止失败执行污染上下文,定时任务的对话历史持久化逻辑已改进(`_woke_main_agent` 方法):
+
+- **仅在工具被调用时持久化**:系统通过检查 `runner.run_context.messages` 中是否存在 `role == "tool"` 的消息来判断工具是否被调用
+- **跳过失败执行**:如果 Agent 完成执行但未调用任何工具,系统会记录警告日志并跳过历史持久化:
+  ```
+  Cron job agent did not call any tools. 
+  The message was likely NOT delivered to the user. 
+  Skipping history persistence to avoid misleading future context.
+  ```
+- **打破负反馈循环**:防止 LLM 从失败执行的历史中学到错误行为(即"生成文本即任务完成")
 
 #### 使用示例
 在定时任务触发后,主动调用 `send_message_to_user` 工具,将生成的日报、图片或其他内容直接发送给用户。

[Accept] [Decline]

Note: You must be authenticated to accept/decline updates.

How did I do? Any feedback?  Join Discord

@NayukiChiba NayukiChiba marked this pull request as draft March 12, 2026 09:12
原因:
- req.prompt 末尾要求 LLM 总结输出,导致 LLM 直接生成文本而不调用工具
- 系统提示规则模糊,未说明文本输出对用户不可见,存在换行符缺失
- 失败执行也写入对话历史,形成 LLM 误以为文本即完成的反馈循环

修复:
- req.prompt 明确要求必须调用工具,说明直接文本不可见
- 更新 PROACTIVE_AGENT_CRON_WOKE_SYSTEM_PROMPT,强制工具调用,修复换行
- 仅在 tool_was_called 时写入历史,否则记录 warning 并跳过

closes AstrBotDevs#6103
@NayukiChiba NayukiChiba force-pushed the fix/cron-agent-tool-call-not-triggered branch from d102685 to 40bd282 Compare March 12, 2026 09:16
@NayukiChiba NayukiChiba deleted the fix/cron-agent-tool-call-not-triggered branch March 12, 2026 09:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:core The bug / feature is about astrbot's core, backend size:L This PR changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant