fix: use infer_single for GSVI TTS API#6117
fix: use infer_single for GSVI TTS API#6117stablegenius49 wants to merge 1 commit intoAstrBotDevs:masterfrom
Conversation
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request modernizes the GSVI Text-to-Speech (TTS) provider by transitioning from a basic, direct Highlights
Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Hey - 我发现了 1 个问题,并给出了一些整体性的反馈:
- 在
_auth_headers中,建议解析self.api_base来检查主机名(例如通过urllib.parse.urlparse),而不是使用对 "acgnai.top" 的子字符串匹配,以避免在查询路径或用户名中出现意外匹配。 - 旧版的
/tts路径在_download_legacy_tts中重新实现了响应处理;你可以复用_download_binary(或一个共享的辅助函数)来执行下载,从而在旧版流程和infer_single流程之间保持错误处理和行为的一致性。
给 AI 代理的提示词
请根据这次代码评审中的评论进行修改:
## 总体评论
- 在 `_auth_headers` 中,建议解析 `self.api_base` 来检查主机名(例如通过 `urllib.parse.urlparse`),而不是使用对 "acgnai.top" 的子字符串匹配,以避免在查询路径或用户名中出现意外匹配。
- 旧版的 `/tts` 路径在 `_download_legacy_tts` 中重新实现了响应处理;你可以复用 `_download_binary`(或一个共享的辅助函数)来执行下载,从而在旧版流程和 `infer_single` 流程之间保持错误处理和行为的一致性。
## 单条评论
### 评论 1
<location path="astrbot/core/provider/sources/gsvi_tts_source.py" line_range="33" />
<code_context>
-
- async def get_audio(self, text: str) -> str:
- temp_dir = get_astrbot_temp_path()
- path = os.path.join(temp_dir, f"gsvi_tts_{uuid.uuid4()}.wav")
- params = {"text": text}
-
</code_context>
<issue_to_address>
**suggestion (bug_risk):** 输出文件扩展名被硬编码为 .wav,可能与配置的 media_type 不一致。
现在 `media_type` 已经可以配置,文件名应该从 `self.media_type` 推导出扩展名,或者以其他方式确保扩展名与实际格式匹配。当格式发生变更时,如果扩展名不匹配,可能会导致下游依赖扩展名进行路由或解码的消费者错误处理文件。
建议实现:
```python
import os
import mimetypes
```
```python
async def get_audio(self, text: str) -> str:
temp_dir = get_astrbot_temp_path()
# 基于 media_type 推导文件扩展名,如果未知则回退到 .wav
extension = mimetypes.guess_extension(getattr(self, "media_type", "") or "") or ".wav"
if not extension.startswith("."):
extension = f".{extension}"
path = os.path.join(temp_dir, f"gsvi_tts_{uuid.uuid4()}{extension}")
params = {"text": text}
```
1. 确保在该类的其他位置正确设置了 `self.media_type`(例如 `"audio/wav"`、`"audio/mpeg"` 等),以便 `mimetypes.guess_extension` 能返回正确的扩展名。
2. 如果你的项目已经有集中维护的 media type 到扩展名的映射,为了更严格的控制,你可以用该映射替换 `mimetypes.guess_extension` 的调用。
</issue_to_address>帮我变得更有用!请在每条评论上点击 👍 或 👎,我会根据反馈改进后续的评审。
Original comment in English
Hey - I've found 1 issue, and left some high level feedback:
- In
_auth_headers, consider parsingself.api_baseto check the hostname (e.g., viaurllib.parse.urlparse) instead of using a substring match for"acgnai.top", to avoid accidental matches in query paths or usernames. - The legacy
/ttspath reimplements response handling in_download_legacy_tts; you could reuse_download_binary(or a shared helper) for the download to keep error handling and behavior consistent between legacy andinfer_singleflows.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `_auth_headers`, consider parsing `self.api_base` to check the hostname (e.g., via `urllib.parse.urlparse`) instead of using a substring match for `"acgnai.top"`, to avoid accidental matches in query paths or usernames.
- The legacy `/tts` path reimplements response handling in `_download_legacy_tts`; you could reuse `_download_binary` (or a shared helper) for the download to keep error handling and behavior consistent between legacy and `infer_single` flows.
## Individual Comments
### Comment 1
<location path="astrbot/core/provider/sources/gsvi_tts_source.py" line_range="33" />
<code_context>
-
- async def get_audio(self, text: str) -> str:
- temp_dir = get_astrbot_temp_path()
- path = os.path.join(temp_dir, f"gsvi_tts_{uuid.uuid4()}.wav")
- params = {"text": text}
-
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Output file extension is hardcoded to .wav and may not match the configured media_type.
Now that `media_type` is configurable, the filename should either derive its extension from `self.media_type` or otherwise ensure the extension matches the actual format. A mismatch can cause downstream consumers that use the extension for routing or decoding to mis-handle the file when the format changes.
Suggested implementation:
```python
import os
import mimetypes
```
```python
async def get_audio(self, text: str) -> str:
temp_dir = get_astrbot_temp_path()
# Derive file extension from media_type, fall back to .wav if unknown
extension = mimetypes.guess_extension(getattr(self, "media_type", "") or "") or ".wav"
if not extension.startswith("."):
extension = f".{extension}"
path = os.path.join(temp_dir, f"gsvi_tts_{uuid.uuid4()}{extension}")
params = {"text": text}
```
1. Ensure that `self.media_type` is set appropriately elsewhere in this class (e.g., `"audio/wav"`, `"audio/mpeg"`, etc.) so that `mimetypes.guess_extension` returns a correct extension.
2. If your project already has a centralized mapping from media types to extensions, you may want to replace the `mimetypes.guess_extension` call with that mapping for stricter control.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| self.emotion = provider_config.get("emotion") | ||
| self.version = provider_config.get("version") | ||
| self.api_key = provider_config.get("api_key", "") | ||
| self.timeout = int(provider_config.get("timeout", 20)) |
There was a problem hiding this comment.
suggestion (bug_risk): 输出文件扩展名被硬编码为 .wav,可能与配置的 media_type 不一致。
现在 media_type 已经可以配置,文件名应该从 self.media_type 推导出扩展名,或者以其他方式确保扩展名与实际格式匹配。当格式发生变更时,如果扩展名不匹配,可能会导致下游依赖扩展名进行路由或解码的消费者错误处理文件。
建议实现:
import os
import mimetypes async def get_audio(self, text: str) -> str:
temp_dir = get_astrbot_temp_path()
# 基于 media_type 推导文件扩展名,如果未知则回退到 .wav
extension = mimetypes.guess_extension(getattr(self, "media_type", "") or "") or ".wav"
if not extension.startswith("."):
extension = f".{extension}"
path = os.path.join(temp_dir, f"gsvi_tts_{uuid.uuid4()}{extension}")
params = {"text": text}- 确保在该类的其他位置正确设置了
self.media_type(例如"audio/wav"、"audio/mpeg"等),以便mimetypes.guess_extension能返回正确的扩展名。 - 如果你的项目已经有集中维护的 media type 到扩展名的映射,为了更严格的控制,你可以用该映射替换
mimetypes.guess_extension的调用。
Original comment in English
suggestion (bug_risk): Output file extension is hardcoded to .wav and may not match the configured media_type.
Now that media_type is configurable, the filename should either derive its extension from self.media_type or otherwise ensure the extension matches the actual format. A mismatch can cause downstream consumers that use the extension for routing or decoding to mis-handle the file when the format changes.
Suggested implementation:
import os
import mimetypes async def get_audio(self, text: str) -> str:
temp_dir = get_astrbot_temp_path()
# Derive file extension from media_type, fall back to .wav if unknown
extension = mimetypes.guess_extension(getattr(self, "media_type", "") or "") or ".wav"
if not extension.startswith("."):
extension = f".{extension}"
path = os.path.join(temp_dir, f"gsvi_tts_{uuid.uuid4()}{extension}")
params = {"text": text}- Ensure that
self.media_typeis set appropriately elsewhere in this class (e.g.,"audio/wav","audio/mpeg", etc.) so thatmimetypes.guess_extensionreturns a correct extension. - If your project already has a centralized mapping from media types to extensions, you may want to replace the
mimetypes.guess_extensioncall with that mapping for stricter control.
There was a problem hiding this comment.
Code Review
This pull request effectively refactors the GSVI TTS provider to use the modern infer_single endpoint and adds corresponding tests. The implementation is solid. I've identified a minor typo in an API parameter and a small regression in the legacy fallback logic. Addressing these points will make the changes even better.
| "batch_size": 1, | ||
| "batch_threshold": 0.75, | ||
| "split_bucket": True, | ||
| "speed_facter": 1, |
There was a problem hiding this comment.
| encoded_text = urllib.parse.quote(str(text)) | ||
| url = f"{self.api_base}/tts?text={encoded_text}" |
There was a problem hiding this comment.
The fallback to the legacy /tts endpoint does not include the emotion parameter, even if it's configured. The previous implementation did include it, which makes this a regression in functionality for the fallback path. Using urllib.parse.urlencode can simplify parameter encoding and ensure all relevant parameters are included.
params = {"text": text}
if self.emotion:
params["emotion"] = self.emotion
url = f"{self.api_base}/tts?{urllib.parse.urlencode(params)}"
Summary
/ttsrequest path with remote catalog discovery +infer_singledefault->默认Testing
PYTHONPATH=. pytest -q tests/test_gsvi_tts_source.pyCloses #5638
Summary by Sourcery
更新 GSVI TTS provider,使其使用远程目录发现(remote catalog discovery)和
infer_single端点,而不是硬编码的/tts路径,并为新的流程添加测试。新功能:
错误修复:
/tts端点。增强优化:
测试:
infer_single的新音频生成与下载流程。Original summary in English
Summary by Sourcery
Update the GSVI TTS provider to use remote catalog discovery and the infer_single endpoint instead of a hard-coded /tts path, and add tests for the new flow.
New Features:
Bug Fixes:
Enhancements:
Tests: