Summary
PromptMessageContent does not include an Audio variant, so it diverges from the MCP specification's ContentBlock union used by PromptMessage.content. As a result, rmcp cannot deserialize audio content in prompts/get responses (and cannot construct it server-side). This is the audio analogue of the embedded-resource bug fixed in #842 / #843.
Affected versions: rmcp-v1.4.0 through rmcp-v1.7.0, and the current main branch.
Actual (current) behavior
PromptMessageContent has four variants (crates/rmcp/src/model/prompt.rs):
pub enum PromptMessageContent {
    Text { text: String },
    Image { #[serde(flatten)] image: ImageContent },
    Resource { #[serde(flatten)] resource: EmbeddedResource },
    ResourceLink { #[serde(flatten)] link: super::resource::Resource },
}
There is no Audio variant, and the enum is #[serde(tag = "type")] with no catch-all, so an {"type":"audio",...} content block fails to deserialize:
Error("unknown variant `audio`, expected one of `text`, `image`, `resource`, `resource_link`")
Expected (MCP spec)
Per the spec, PromptMessage.content is a ContentBlock, whose union includes AudioContent:
export type ContentBlock =
  | TextContent
  | ImageContent
  | AudioContent // ← missing from rmcp's PromptMessageContent
  | ResourceLink
  | EmbeddedResource;
(MCP spec, Server → Prompts: https://modelcontextprotocol.io/specification/2025-06-18/server/prompts — ContentBlock is unchanged through 2025-11-25.)
Root cause
The Audio variant was simply never added to PromptMessageContent. The supporting type already exists — AudioContent (Annotated<RawAudioContent>, crates/rmcp/src/model/content.rs) — and Audio is already a variant of the general RawContent enum (used by tool results and sampling). Only PromptMessageContent omits it.
Impact
A spec-conformant MCP server that returns audio content in a prompts/get response breaks rmcp clients end-to-end: Peer::get_prompt fails with ServiceError::UnexpectedResponse because the GetPromptResult body cannot be deserialized. The other content types (text, image, resource, resource_link) are unaffected. This mirrors the conformance break in #842, where the same class of mismatch was treated as a bug and fixed.
Server-side, there is also no way to construct an audio prompt message (no PromptMessage::new_audio, unlike new_image).
Minimal reproducer
use rmcp::model::PromptMessageContent;

fn main() {
    // A spec-valid audio content block from prompts/get:
    let json = r#"{"type":"audio","data":"<base64>","mimeType":"audio/wav"}"#;
    // Deserialization fails because the enum has no `audio` variant.
    let result = serde_json::from_str::<PromptMessageContent>(json);
    println!("{result:?}"); // Err: unknown variant `audio`
}
Proposed fix
Add the variant, mirroring Image (which already flattens ImageContent):
// in the content import:
content::{AudioContent, EmbeddedResource, ImageContent},
// in PromptMessageContent:
/// Audio content with base64-encoded data
Audio { #[serde(flatten)] audio: AudioContent },
A round-trip test (in the style of #843's regression test) and a PromptMessage::new_audio constructor, for symmetry with new_image, would round out the change.
Related
PromptMessageContent enum, same class of spec divergence), fixed by flattening the Resource variant.