Skip to content

fix: normalize invalid MCP required flags in MCP schemas#6077

Open
stablegenius49 wants to merge 1 commit intoAstrBotDevs:masterfrom
stablegenius49:fix/5972-normalize-mcp-required
Open

fix: normalize invalid MCP required flags in MCP schemas#6077
stablegenius49 wants to merge 1 commit intoAstrBotDevs:masterfrom
stablegenius49:fix/5972-normalize-mcp-required

Conversation

@stablegenius49
Copy link
Contributor

@stablegenius49 stablegenius49 commented Mar 11, 2026

Summary

  • normalize non-standard MCP property-level required: true/false flags into the parent object's required array before validating tool schemas
  • keep validation strict for real schema errors instead of disabling it entirely
  • add focused regression coverage for top-level and nested MCP input schemas

Testing

  • PYTHONPATH=. pytest -q tests/unit/test_mcp_client_schema.py

Closes #5972

Summary by Sourcery

规范化 MCP 工具的输入模式,以在保留严格校验的同时处理非标准的 required 标记。

Bug 修复:

  • 支持那些错误地在各个属性上使用布尔值标记必填属性的 MCP 模式,通过将这些属性提升到父级的 required 数组中进行修正。

测试:

  • 为模式规范化行为添加单元测试,包括嵌套对象,以及在使用非标准 required 标记时与 MCPTool 的集成测试。
Original summary in English

Summary by Sourcery

Normalize MCP tool input schemas to handle non-standard required flags while preserving strict validation.

Bug Fixes:

  • Support MCP schemas that incorrectly mark required properties with boolean flags on individual properties by lifting them into the parent required array.

Tests:

  • Add unit tests for schema normalization behavior, including nested objects and MCPTool integration with non-standard required flags.

@auto-assign auto-assign bot requested review from Fridemn and Raven95676 March 11, 2026 20:06
@dosubot dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. area:core The bug / feature is about astrbot's core, backend labels Mar 11, 2026
Copy link
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - 我发现了 1 个问题,并且给出了一些整体性的反馈:

  • _normalize_mcp_input_schema 中使用的 seen: set[str] = set() / if not (name in seen or seen.add(name)) 这种模式有点晦涩;建议采用更易读的去重方式(例如在构建新列表时使用 if name not in seen: seen.add(name); result.append(name)),以使意图更加清晰。
给 AI Agent 的提示
Please address the comments from this code review:

## Overall Comments
- `_normalize_mcp_input_schema` 中使用的 `seen: set[str] = set()` / `if not (name in seen or seen.add(name))` 这种模式有点晦涩;建议采用更易读的去重方式(例如在构建新列表时使用 `if name not in seen: seen.add(name); result.append(name)`),以使意图更加清晰。

## Individual Comments

### Comment 1
<location path="astrbot/core/agent/mcp_client.py" line_range="146-150" />
<code_context>
+                        required_list.append(prop_name)
+
+            if required_list:
+                seen: set[str] = set()
+                normalized["required"] = [
+                    name
+                    for name in required_list
+                    if not (name in seen or seen.add(name))
+                ]
+            elif isinstance(required, list):
</code_context>
<issue_to_address>
**nitpick:** 在推导式中使用 `seen.add` 做去重的小技巧会影响可读性。

这种习惯用法对很多读者来说不太好理解,而 `required_list` 的规模很小且使用频率不高,因此这类微优化没有必要。建议采用更清晰的方式——例如显式循环,或者使用 `normalized["required"] = list(dict.fromkeys(required_list))`——在保持顺序的同时完成去重,对性能影响可以忽略不计。
</issue_to_address>

Sourcery 对开源项目是免费的——如果你觉得我们的评审有帮助,欢迎分享 ✨
帮我变得更有用!请在每条评论上点 👍 或 👎,我会根据你的反馈改进后续评审。
Original comment in English

Hey - I've found 1 issue, and left some high level feedback:

  • The seen: set[str] = set() / if not (name in seen or seen.add(name)) pattern in _normalize_mcp_input_schema is a bit opaque; consider a more readable deduplication approach (e.g., building a new list while checking if name not in seen: seen.add(name); result.append(name)) to make the intent clearer.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- The `seen: set[str] = set()` / `if not (name in seen or seen.add(name))` pattern in `_normalize_mcp_input_schema` is a bit opaque; consider a more readable deduplication approach (e.g., building a new list while checking `if name not in seen: seen.add(name); result.append(name)`) to make the intent clearer.

## Individual Comments

### Comment 1
<location path="astrbot/core/agent/mcp_client.py" line_range="146-150" />
<code_context>
+                        required_list.append(prop_name)
+
+            if required_list:
+                seen: set[str] = set()
+                normalized["required"] = [
+                    name
+                    for name in required_list
+                    if not (name in seen or seen.add(name))
+                ]
+            elif isinstance(required, list):
</code_context>
<issue_to_address>
**nitpick:** The deduplication trick using `seen.add` in a comprehension impacts readability.

This idiom is hard to parse for many readers, and the small size / infrequent use of `required_list` means the micro-optimization isn’t needed. Consider a clearer approach—e.g. an explicit loop or `normalized["required"] = list(dict.fromkeys(required_list))`—to deduplicate while preserving order with negligible performance impact.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Comment on lines +146 to +150
seen: set[str] = set()
normalized["required"] = [
name
for name in required_list
if not (name in seen or seen.add(name))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: 在推导式中使用 seen.add 做去重的小技巧会影响可读性。

这种习惯用法对很多读者来说不太好理解,而 required_list 的规模很小且使用频率不高,因此这类微优化没有必要。建议采用更清晰的方式——例如显式循环,或者使用 normalized["required"] = list(dict.fromkeys(required_list))——在保持顺序的同时完成去重,对性能影响可以忽略不计。

Original comment in English

nitpick: The deduplication trick using seen.add in a comprehension impacts readability.

This idiom is hard to parse for many readers, and the small size / infrequent use of required_list means the micro-optimization isn’t needed. Consider a clearer approach—e.g. an explicit loop or normalized["required"] = list(dict.fromkeys(required_list))—to deduplicate while preserving order with negligible performance impact.

@dosubot
Copy link

dosubot bot commented Mar 11, 2026

Related Documentation

1 document(s) may need updating based on files changed in this PR:

AstrBotTeam's Space

pr4697的改动
View Suggested Changes
@@ -413,6 +413,20 @@
 - 环境变量 `ASTRBOT_MCP_INIT_STRICT` 可配置为 `1`、`true`、`yes` 或 `on` 启用严格模式
 - 严格模式下,如果所有 MCP 服务初始化失败,系统会中止启动
 - 非严格模式下(默认),MCP 初始化失败仅记录警告,系统继续启动
+
+**工具 Schema 规范化(PR #6077):**
+
+[PR #6077](https://github.com/AstrBotDevs/AstrBot/pull/6077) 新增了 MCP 工具 JSON Schema 规范化功能,提升与非标准 MCP 服务器的兼容性([Issue #5972](https://github.com/AstrBotDevs/AstrBot/issues/5972)):
+
+- **问题背景**:部分 MCP 服务器提供的 JSON Schema 不符合 Draft 2020-12 规范,在属性级别使用布尔值 `required: true/false` 标记必填字段,而非在父对象中声明 `required` 数组。这种非标准格式会导致 schema 验证失败。
+- **规范化逻辑**:新增 `_normalize_mcp_input_schema()` 函数,在 `MCPTool` 对象初始化前自动规范化 JSON Schema:
+  - 递归检测属性级别的 `required` 布尔标志
+  - 将 `required: true` 的属性名提升到父对象的 `required` 数组
+  - 移除属性自身的 `required` 布尔值
+  - 自动去重,确保 `required` 数组中无重复项
+  - 支持嵌套对象的 schema 规范化
+- **验证策略**:保持严格的 schema 验证,而非完全禁用验证,确保对真正的 schema 错误仍能正确报告
+- **兼容性提升**:修复后,系统可以正常处理来自非标准 MCP 服务器的工具定义,无需手动修改 schema 或放宽验证规则
 
 #### 后台任务执行流程(PR #5722)
 

[Accept] [Decline]

Note: You must be authenticated to accept/decline updates.

How did I do? Any feedback?  Join Discord

@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request resolves an issue where certain MCP servers provide JSON schemas that use a non-standard boolean required flag directly on properties, which can cause validation failures. A new utility function has been introduced to automatically normalize these schemas by lifting the required property names into the parent object's required array, aligning them with the JSON Schema Draft 2020-12 specification. This ensures that the system can correctly process and validate these schemas without compromising strictness, improving compatibility with various MCP server implementations.

Highlights

  • Schema Normalization: Implemented a function to normalize non-standard required: true/false flags found directly on properties within MCP JSON schemas, converting them to the standard parent object's required array format.
  • Validation Integrity: Ensured that schema validation remains strict for actual errors by correctly interpreting and transforming non-standard schemas, rather than disabling validation.
  • Regression Testing: Added comprehensive unit tests to cover the schema normalization logic, including scenarios with top-level and nested MCP input schemas.
Changelog
  • astrbot/core/agent/mcp_client.py
    • Added _normalize_mcp_input_schema function to convert property-level required booleans to the parent object's required array.
    • Modified MCPTool initialization to use the new normalization function on inputSchema.
  • tests/unit/test_mcp_client_schema.py
    • Added new test cases for _normalize_mcp_input_schema to verify correct handling of property-level required flags.
    • Included tests for nested objects and MCPTool integration to ensure end-to-end schema normalization.
Activity
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a normalization step for non-standard required flags in MCP input schemas, which is a good enhancement for compatibility. The implementation uses a recursive function to traverse and correct the schema, and it's well-supported by new unit tests. My review includes a couple of suggestions to improve the efficiency and readability of the new normalization function.

Comment on lines +146 to +151
seen: set[str] = set()
normalized["required"] = [
name
for name in required_list
if not (name in seen or seen.add(name))
]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The method used for de-duplicating required_list while preserving order is a bit verbose. For Python 3.7+, you can use dict.fromkeys() which is a more modern, concise, and idiomatic way to achieve the same result.

Suggested change
seen: set[str] = set()
normalized["required"] = [
name
for name in required_list
if not (name in seen or seen.add(name))
]
normalized["required"] = list(dict.fromkeys(required_list))


return normalized

return _normalize(copy.deepcopy(schema))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The _normalize function is pure and does not mutate its input; it returns a new, modified copy of the schema. Therefore, the copy.deepcopy() call here is redundant. Removing it will improve performance by avoiding an unnecessary full copy of the schema object, which can be significant for large schemas.

Suggested change
return _normalize(copy.deepcopy(schema))
return _normalize(schema)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:core The bug / feature is about astrbot's core, backend size:L This PR changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]无法添加部分mcp服务器的相关bug

1 participant