Bug: PromptMessageContent is missing the Audio variant (diverges from MCP spec ContentBlock) #864

@bricef

Summary

PromptMessageContent does not include an Audio variant, so it diverges from the MCP specification's ContentBlock union used by PromptMessage.content. As a result, rmcp cannot deserialize audio content in prompts/get responses (and cannot construct it server-side). This is the audio analogue of the embedded-resource bug fixed in #842 / #843.

Affected versions: rmcp-v1.4.0 through rmcp-v1.7.0, and the current main branch.

Actual (current) behavior

PromptMessageContent has four variants (crates/rmcp/src/model/prompt.rs):

pub enum PromptMessageContent {
    Text { text: String },
    Image { #[serde(flatten)] image: ImageContent },
    Resource { #[serde(flatten)] resource: EmbeddedResource },
    ResourceLink { #[serde(flatten)] link: super::resource::Resource },
}

There is no Audio variant, and the enum is #[serde(tag = "type")] with no catch-all, so an {"type":"audio",...} content block fails to deserialize:

Error("unknown variant `audio`, expected one of `text`, `image`, `resource`, `resource_link`")
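The failure mode can be modelled with the standard library alone. The sketch below is not rmcp or serde code: ContentTag and parse_tag are hypothetical stand-ins that only imitate how a closed, tag-dispatched set of variants with no fallback arm rejects any tag it does not list.

```rust
/// Hypothetical stand-in for the `type` tags that PromptMessageContent
/// accepts today. It is not rmcp's real enum and carries no payload.
#[derive(Debug, PartialEq)]
enum ContentTag {
    Text,
    Image,
    Resource,
    ResourceLink,
}

/// Resolve a `type` tag against a closed set of variants. As with an
/// internally tagged enum that has no catch-all variant, every tag that
/// is not listed explicitly is reported as an error.
fn parse_tag(tag: &str) -> Result<ContentTag, String> {
    match tag {
        "text" => Ok(ContentTag::Text),
        "image" => Ok(ContentTag::Image),
        "resource" => Ok(ContentTag::Resource),
        "resource_link" => Ok(ContentTag::ResourceLink),
        // No arm for "audio": a spec-valid tag ends up here.
        other => Err(format!(
            "unknown variant `{other}`, expected one of `text`, `image`, `resource`, `resource_link`"
        )),
    }
}
```

In this model, parse_tag("image") succeeds while parse_tag("audio") returns an error, which is the same shape as the deserialization failure quoted above. Adding an "audio" arm is the model's equivalent of adding the missing variant.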

Expected (MCP spec)

Per the spec, PromptMessage.content is a ContentBlock, whose union includes AudioContent:

export type ContentBlock =
  | TextContent
  | ImageContent
  | AudioContent      // ← missing from rmcp's PromptMessageContent
  | ResourceLink
  | EmbeddedResource;

(MCP spec, Server → Prompts: https://modelcontextprotocol.io/specification/2025-06-18/server/prompts. ContentBlock is unchanged through 2025-11-25.)

Root cause

The Audio variant was simply never added to PromptMessageContent. The supporting type already exists — AudioContent (Annotated<RawAudioContent>, crates/rmcp/src/model/content.rs) — and Audio is already a variant of the general RawContent enum (used by tool results and sampling). Only PromptMessageContent omits it.

Impact

A spec-conformant MCP server that returns audio content in a prompts/get response breaks rmcp clients end-to-end: Peer::get_prompt fails with ServiceError::UnexpectedResponse because the GetPromptResult body cannot be deserialized. The other content types (text, image, resource, resource_link) are unaffected. This mirrors the conformance break in #842, where the same class of mismatch was treated as a bug and fixed.

Server-side, there is also no way to construct an audio prompt message (no PromptMessage::new_audio, unlike new_image).

Minimal reproducer

use rmcp::model::PromptMessageContent;

fn main() {
    // A spec-valid audio content block from prompts/get:
    let json = r#"{"type":"audio","data":"<base64>","mimeType":"audio/wav"}"#;
    let result = serde_json::from_str::<PromptMessageContent>(json);
    println!("{result:?}"); // Err: unknown variant `audio`
}

Proposed fix

Add the variant, mirroring Image (which already flattens ImageContent):

// in the content import:
content::{AudioContent, EmbeddedResource, ImageContent},

// in PromptMessageContent:
    /// Audio content with base64-encoded data
    Audio { #[serde(flatten)] audio: AudioContent },

A round-trip test (in the style of #843's regression test) and a PromptMessage::new_audio constructor, for symmetry with new_image, would round out the change.
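A rough sketch of what the constructor could look like, using only the standard library. Every type here is a simplified, hypothetical stand-in for the rmcp type of the same name (no serde attributes, no annotations), and the new_audio signature is an assumption rather than rmcp's actual API. The real constructor would presumably follow whatever new_image does for encoding and optional arguments.

```rust
/// Hypothetical stand-in for rmcp's role type.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum PromptMessageRole {
    User,
    Assistant,
}

/// Hypothetical stand-in for AudioContent, reduced to the two fields
/// that appear in the reproducer's JSON (`data` and `mimeType`).
#[derive(Debug, Clone, PartialEq)]
struct AudioContent {
    data: String,
    mime_type: String,
}

/// Hypothetical stand-in for PromptMessageContent with the proposed
/// Audio variant; the other variants are omitted except Text.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum PromptMessageContent {
    Text { text: String },
    Audio { audio: AudioContent },
}

/// Hypothetical stand-in for PromptMessage.
#[derive(Debug, Clone, PartialEq)]
struct PromptMessage {
    role: PromptMessageRole,
    content: PromptMessageContent,
}

impl PromptMessage {
    /// Assumed signature: takes data that is already base64-encoded
    /// and stores it unchanged alongside the MIME type.
    fn new_audio(
        role: PromptMessageRole,
        data: impl Into<String>,
        mime_type: impl Into<String>,
    ) -> Self {
        PromptMessage {
            role,
            content: PromptMessageContent::Audio {
                audio: AudioContent {
                    data: data.into(),
                    mime_type: mime_type.into(),
                },
            },
        }
    }
}
```

With these stand-ins, PromptMessage::new_audio(PromptMessageRole::User, "<base64>", "audio/wav") yields a message whose content is the Audio variant holding the same data and MIME type as the reproducer's JSON block.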

Related
