Implement data URI handling for audio references #8620

Open
tjc6666666666666 wants to merge 4 commits into AstrBotDevs:master from tjc6666666666666:patch-12

Contributor

@tjc6666666666666 tjc6666666666666 commented Jun 6, 2026

Add support for data URI audio references by decoding base64 audio data and saving it to a temporary file.

Fixes
The `_audio_ref_to_local_path` method (line 358) does not handle the data: URI format. When it receives data:audio/wav;base64,..., it returns the value unchanged. `_resolve_audio_part` then calls Path("data:audio/wav;base64,..."). On Windows, Path converts / to \, which yields data:audio\wav;base64,... and raises an Invalid argument error.
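
The separator conversion described above can be reproduced on any operating system with `PureWindowsPath`, which applies Windows path rules without touching the filesystem. This is only a minimal sketch of the symptom; the sample URI is made up for illustration and the snippet is not code from the PR.

```python
from pathlib import PureWindowsPath

# PureWindowsPath applies Windows path rules on any OS, so this runs anywhere.
# The URI below is an illustrative sample, not data from the PR.
data_uri = "data:audio/wav;base64,UklGRg=="

# Forward slashes are parsed as path separators and rendered as backslashes,
# so the MIME type "audio/wav" is split into two path components.
as_windows_path = str(PureWindowsPath(data_uri))
print(as_windows_path)
```

Any `/` characters inside the base64 payload itself would be rewritten the same way, so the string is no longer a usable data URI after the round trip.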

Modifications

  • This is NOT a breaking change.

Screenshots or Test Results


Checklist

  • 😊 If there are new features added in the PR, I have discussed it with the authors through issues/emails, etc.

  • 👀 My changes have been well-tested, and "Verification Steps" and "Screenshots" have been provided above.

  • 🤓 I have ensured that no new dependencies are introduced, OR if new dependencies are introduced, they have been added to the appropriate locations in requirements.txt and pyproject.toml.

  • 😮 My changes do not introduce malicious code.

Summary by Sourcery

Add handling for data URI audio references by decoding and storing them as temporary audio files before further processing.

New Features:

  • Support data: URI audio references for audio inputs by decoding base64 content into temporary files.

Bug Fixes:

  • Prevent invalid path errors on Windows caused by treating data: audio URIs as filesystem paths.

@dosubot (bot) added the labels size:S (this PR changes 10-29 lines, ignoring generated files) and area:provider (the bug / feature is about AI Provider, Models, LLM Agent, LLM Agent Runner) on Jun 6, 2026
Contributor

@sourcery-ai sourcery-ai Bot left a comment

Hey - I've found 2 issues, and left some high level feedback:

  • When handling data: URIs, consider explicitly validating that the header includes ;base64 (and using base64.b64decode(..., validate=True)) so that non‑base64 or malformed data URIs fail fast and don't produce confusing partial decodes.
  • The data: branch currently returns the original audio_ref on exception, which will re‑expose the original Windows Path("data:...") issue; it may be safer to raise a more explicit error or return None so callers can handle the failure without reusing the problematic URI.
  • You may want to restrict accepted data: URIs to audio MIME types (e.g., data:audio/...) before writing to a temp file to avoid handling unexpected or potentially unsafe media types through this code path.
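
The three points above could be combined into one strict parser, sketched below. The function name `parse_audio_data_uri` is hypothetical and not part of the PR. It assumes callers prefer a `ValueError` over silently falling back to the original URI.

```python
import base64
import binascii


def parse_audio_data_uri(uri: str) -> tuple[str, bytes]:
    """Return (mime_type, decoded_bytes) or raise ValueError.

    Hypothetical helper for illustration; not the PR's implementation.
    """
    if not uri.startswith("data:"):
        raise ValueError("not a data URI")
    header, sep, payload = uri[len("data:"):].partition(",")
    if not sep:
        raise ValueError("data URI has no comma separator")
    params = header.split(";")
    mime_type = params[0].strip().lower()
    # Restrict this code path to audio media types.
    if not mime_type.startswith("audio/"):
        raise ValueError(f"unsupported MIME type: {mime_type!r}")
    # Require an explicit ";base64" parameter in the header.
    if "base64" not in (p.strip().lower() for p in params[1:]):
        raise ValueError("only base64-encoded data URIs are supported")
    try:
        # validate=True rejects characters outside the base64 alphabet
        # instead of silently discarding them.
        return mime_type, base64.b64decode(payload, validate=True)
    except binascii.Error as exc:
        raise ValueError(f"invalid base64 payload: {exc}") from exc
```

Raising instead of returning `audio_ref` means a malformed URI can never reach `Path(...)` further down the call chain.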
Prompt for AI Agents
Please address the comments from this code review:

## Individual Comments

### Comment 1
<location path="astrbot/core/provider/sources/openai_source.py" line_range="382" />
<code_context>
+                suffix = "." + mime_type.split("/")[-1] if "/" in mime_type else ".wav"
+                if suffix not in (".wav", ".mp3", ".ogg", ".m4a", ".aac", ".flac"):
+                    suffix = ".wav"
+                audio_bytes = base64.b64decode(base64_data)
+                temp_dir = Path(get_astrbot_temp_path())
+                temp_dir.mkdir(parents=True, exist_ok=True)
</code_context>
<issue_to_address>
**🚨 issue (security):** Consider validating or bounding the size of the base64 payload before decoding to avoid very large in-memory allocations.

Because this handles arbitrary `data:` URIs, an unbounded `base64_data` string could cause excessive memory use during `b64decode`, potentially enabling a DoS. Please enforce a reasonable maximum length (or estimated decoded size) before decoding, and reject inputs that exceed it (optionally logging and treating them as invalid audio).
</issue_to_address>
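
One way to bound the payload before decoding is to estimate the decoded size from the length of the base64 string, as sketched below. `MAX_AUDIO_BYTES` and both function names are assumptions made for illustration; the PR does not define a limit.

```python
# Hypothetical limit for illustration; the right value depends on the deployment.
MAX_AUDIO_BYTES = 25 * 1024 * 1024


def estimated_decoded_size(b64_payload: str) -> int:
    """Estimate the decoded byte count from the base64 length, without decoding."""
    if b64_payload.endswith("=="):
        padding = 2
    elif b64_payload.endswith("="):
        padding = 1
    else:
        padding = 0
    # Every 4 base64 characters encode 3 bytes; padding removes up to 2 bytes.
    return max(len(b64_payload) * 3 // 4 - padding, 0)


def ensure_within_limit(b64_payload: str, limit: int = MAX_AUDIO_BYTES) -> None:
    """Raise ValueError when the payload would decode to more than `limit` bytes."""
    size = estimated_decoded_size(b64_payload)
    if size > limit:
        raise ValueError(f"audio payload too large: ~{size} bytes (limit {limit})")
```

The check only inspects the string length and its last two characters, so it costs no extra allocation even for very large inputs.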

### Comment 2
<location path="astrbot/core/provider/sources/openai_source.py" line_range="389" />
<code_context>
+                target_path.write_bytes(audio_bytes)
+                cleanup_paths.append(target_path)
+                return str(target_path), cleanup_paths
+            except Exception as exc:
+                logger.warning("解析 data URI 音频失败: %s,错误: %s", audio_ref[:100], exc)
+                return audio_ref, cleanup_paths
</code_context>
<issue_to_address>
**issue (bug_risk):** The broad `except Exception` may hide unexpected programming errors.

Catching `Exception` here will also swallow unrelated bugs (e.g., filesystem or type errors), making them harder to diagnose. Please narrow this to the specific parse/decode and I/O exceptions you expect (such as `ValueError`, `binascii.Error`, and possibly `OSError`) so only malformed `data:` URIs are handled and real programming errors still surface.
</issue_to_address>
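
A narrowed handler might look like the sketch below. `write_decoded_audio` is a hypothetical stand-in for the decode-and-write step, and returning `None` on failure is one of the options raised in the review, not the PR's current behavior. Note that `binascii.Error` is already a subclass of `ValueError`; it is listed separately only to make the intent explicit.

```python
import base64
import binascii
import logging
from pathlib import Path
from typing import Optional

logger = logging.getLogger(__name__)


def write_decoded_audio(b64_payload: str, target_path: Path) -> Optional[Path]:
    """Decode and write audio; return None on malformed input or I/O failure.

    Hypothetical helper for illustration; not the PR's implementation.
    """
    try:
        audio_bytes = base64.b64decode(b64_payload, validate=True)
        target_path.write_bytes(audio_bytes)
    except (binascii.Error, ValueError, OSError) as exc:
        # Only decode and filesystem errors are handled here; anything else
        # (for example a TypeError from a wrong argument type) still propagates.
        logger.warning("Failed to decode data URI audio: %s", exc)
        return None
    return target_path
```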

Comment thread astrbot/core/provider/sources/openai_source.py
Comment thread astrbot/core/provider/sources/openai_source.py Outdated
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces support for parsing and decoding base64-encoded data: URIs for audio references in openai_source.py, saving them to temporary files. The review feedback suggests mapping common audio MIME types (such as audio/mpeg to .mp3) using an explicit dictionary to prevent unnecessary fallbacks to .wav and subsequent conversions. Additionally, the reviewer recommends adding unit tests to verify the new parsing and decoding functionality.
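
The explicit mapping suggested here could be sketched as follows. The dictionary contents and the name `suffix_for_mime` are assumptions for illustration; the review only names the `audio/mpeg` to `.mp3` case. With the PR's current `split("/")` logic, `audio/mpeg` yields `.mpeg`, which is not in the allowed list and therefore falls back to `.wav`.

```python
# Illustrative mapping; the exact set of MIME types is an assumption.
AUDIO_MIME_SUFFIXES = {
    "audio/wav": ".wav",
    "audio/x-wav": ".wav",
    "audio/mpeg": ".mp3",
    "audio/mp3": ".mp3",
    "audio/ogg": ".ogg",
    "audio/mp4": ".m4a",
    "audio/aac": ".aac",
    "audio/flac": ".flac",
}


def suffix_for_mime(mime_type: str, default: str = ".wav") -> str:
    """Map an audio MIME type to a file suffix, ignoring any parameters."""
    base_type = mime_type.split(";", 1)[0].strip().lower()
    return AUDIO_MIME_SUFFIXES.get(base_type, default)
```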

Comment thread astrbot/core/provider/sources/openai_source.py
        return str(target_path), cleanup_paths
    if audio_ref.startswith("file://"):
        return self._file_uri_to_path(audio_ref), cleanup_paths
    if audio_ref.startswith("data:"):
Contributor

medium

According to the general rules, new functionality such as handling attachments should be accompanied by corresponding unit tests. Please add unit tests to verify that data URIs are correctly parsed, decoded, and saved to temporary files.

References
  1. New functionality, such as handling attachments, should be accompanied by corresponding unit tests.
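
The requested tests might take a shape like the sketch below. Because the real method lives on the provider class and depends on `get_astrbot_temp_path()`, the example tests a hypothetical stand-in, `save_data_uri_audio`, rather than the PR's code; the test names and the sample payload are also assumptions.

```python
import base64
import tempfile
import unittest
from pathlib import Path


def save_data_uri_audio(uri: str, temp_dir: Path) -> Path:
    """Hypothetical stand-in for the PR's data: branch, not the real method."""
    header, _, payload = uri.partition(",")
    if not header.startswith("data:audio/") or not header.endswith(";base64"):
        raise ValueError("expected a base64 audio data URI")
    subtype = header[len("data:audio/"):-len(";base64")]
    # Reject subtypes that could smuggle separators into the file name.
    if not subtype.isalnum():
        raise ValueError(f"unexpected audio subtype: {subtype!r}")
    target = temp_dir / f"audio.{subtype}"
    target.write_bytes(base64.b64decode(payload, validate=True))
    return target


class DataUriAudioTest(unittest.TestCase):
    def test_decodes_and_saves(self):
        uri = "data:audio/wav;base64," + base64.b64encode(b"RIFF").decode()
        with tempfile.TemporaryDirectory() as tmp:
            path = save_data_uri_audio(uri, Path(tmp))
            self.assertEqual(path.suffix, ".wav")
            self.assertEqual(path.read_bytes(), b"RIFF")

    def test_rejects_non_audio(self):
        with tempfile.TemporaryDirectory() as tmp:
            with self.assertRaises(ValueError):
                save_data_uri_audio("data:text/plain;base64,AAAA", Path(tmp))

    def test_rejects_invalid_base64(self):
        with tempfile.TemporaryDirectory() as tmp:
            with self.assertRaises(ValueError):
                save_data_uri_audio("data:audio/wav;base64,@@@@", Path(tmp))
```

The cases can be run with `python -m unittest` against the module that contains them.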

@dosubot (bot) added the label size:M (this PR changes 30-99 lines, ignoring generated files) and removed size:S (this PR changes 10-29 lines, ignoring generated files) on Jun 6, 2026