MLE-5321: docs(audio): broaden supported file formats to include ogg, opus, aac #269

Merged
rishabh-bhargava merged 1 commit into main from feat/audio-supported-formats
May 7, 2026

@rishabh-bhargava
Contributor

Summary

Adds .ogg, .opus, and .aac to the documented supported-format list for /audio/transcriptions and /audio/translations. The decoder is shared across STT models, so these formats were already supported in practice — just not in the docs.

Verification

End-to-end on api.together.ai, covering both endpoints, three models, and the three new formats: all 12 endpoint/model/format combinations in the table below return HTTP 200 with the correct LibriSpeech reference transcript:

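| Endpoint | Model | .ogg | .opus | .aac |
| --- | --- | --- | --- | --- |
| /audio/transcriptions | nvidia/parakeet-tdt-0.6b-v3 | ✅ | ✅ | ✅ |
| /audio/transcriptions | openai/whisper-large-v3 | ✅ | ✅ | ✅ |
| /audio/transcriptions | mistralai/Voxtral-Mini-3B-2507 | ✅ | ✅ | ✅ |
| /audio/translations | openai/whisper-large-v3 | ✅ | ✅ | ✅ |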

Reproduction:

curl -sS -X POST https://api.together.ai/v1/audio/transcriptions \
  -H "Authorization: Bearer $TOGETHER_API_KEY" \
  -F "model=nvidia/parakeet-tdt-0.6b-v3" \
  -F "file=@audio.opus"
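
The single reproduction command can be extended to every row of the verification table. The following is a minimal sketch, not the script used for this PR: `list_probes` and `probe` are hypothetical helper names, and `sample` stands for a local clip that is assumed to exist as sample.ogg, sample.opus, and sample.aac. It checks only the HTTP status, not whether the transcript is correct.

```shell
#!/bin/sh
# Endpoint/model pairs copied from the verification table.
PAIRS="
audio/transcriptions nvidia/parakeet-tdt-0.6b-v3
audio/transcriptions openai/whisper-large-v3
audio/transcriptions mistralai/Voxtral-Mini-3B-2507
audio/translations openai/whisper-large-v3
"
# The three extensions this PR adds to the docs.
FORMATS="ogg opus aac"

# list_probes PREFIX
# Prints one "endpoint model file" line per table row and format.
# PREFIX is a hypothetical base name for the local test clips.
list_probes() {
  prefix="$1"
  printf '%s\n' "$PAIRS" | while read -r endpoint model; do
    # Skip the blank lines at the start and end of PAIRS.
    [ -n "$endpoint" ] || continue
    for ext in $FORMATS; do
      printf '%s %s %s.%s\n' "$endpoint" "$model" "$prefix" "$ext"
    done
  done
}

# probe ENDPOINT MODEL FILE
# Sends the same request as the reproduction command and prints only the
# HTTP status code; the response body is discarded.
probe() {
  curl -sS -o /dev/null -w '%{http_code}\n' -X POST \
    "https://api.together.ai/v1/$1" \
    -H "Authorization: Bearer $TOGETHER_API_KEY" \
    -F "model=$2" \
    -F "file=@$3"
}

# Usage (sends real requests, so it is left commented out):
# list_probes sample | while read -r endpoint model file; do
#   printf '%s %s %s -> ' "$endpoint" "$model" "$file"
#   probe "$endpoint" "$model" "$file"
# done
```

Keeping the enumeration separate from the request makes it possible to review the planned combinations before any traffic is sent.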

Test plan

  • OpenAPI YAML lints cleanly (no schema breakage — only description string changes)
  • Mintlify replication picks up the description change automatically (no action needed in mintlify-docs/openapi.yaml per the team's replication setup)

How this was found

Surfaced by the formats.unsupported test in the new STT feature-test framework (benchmarks/stt/features/ in the together-voice repo). The test originally expected a 4xx on .ogg (per the docs) but got a 200 with a correct transcription, so I expanded the probe to .opus and .aac across all three STT models and both endpoints to confirm.
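
The finding comes down to how the status code of a single probe is read. A minimal sketch of that interpretation follows. `classify_probe` is a hypothetical name, and this is not the actual formats.unsupported test from the together-voice repo, only an illustration of the logic.

```shell
# classify_probe STATUS
# Interprets the HTTP status returned when uploading a format that the docs
# do not list as supported.
classify_probe() {
  case "$1" in
    4[0-9][0-9]) echo "rejected: API matches the docs" ;;
    2[0-9][0-9]) echo "accepted: docs are narrower than the API" ;;
    *)           echo "inconclusive: unexpected status $1" ;;
  esac
}

# The .ogg case from this PR: prints "accepted: docs are narrower than the API".
classify_probe 200
```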

A follow-up commit will need to update the hand-written guide at mintlify-docs/docs/speech-to-text.mdx:189-193 (separate PR on the mintlify-docs repo) since that page lists the formats independently of the OpenAPI spec.

🤖 Generated with Claude Code

Verified end-to-end on api.together.ai for both /audio/transcriptions and
/audio/translations across nvidia/parakeet-tdt-0.6b-v3, openai/whisper-large-v3,
and mistralai/Voxtral-Mini-3B-2507: HTTP 200 + correct LibriSpeech transcript
on .ogg, .opus, and .aac. The decoder is shared across STT models, so the
broader format set was already supported in practice — just not documented.

Surfaced via the STT feature-test framework (formats.unsupported test on
together-voice repo).
@github-actions

github-actions Bot commented May 7, 2026

✱ Stainless preview builds

This PR will update the togetherai SDKs with the following commit messages.

go

docs(api): update supported formats in audio transcription/translation file params

openapi

feat(api): add ogg/opus/aac format support to audio transcription and translation

python

docs(api): add .ogg, .opus, .aac to supported formats in audio transcriptions/translations

terraform

chore(internal): regenerate SDK with no functional changes

typescript

docs(api): update supported file formats in audio transcriptions/translations

togetherai-openapi studio · code

Your SDK build had at least one "note" diagnostic.
generate ✅

⚠️ togetherai-go studio · code

Your SDK build had a failure in the test CI job, which is a regression from the base state.
generate ✅ build ⏭️ lint ✅ test ❗

go get github.com/stainless-sdks/togetherai-go@3bf7862520e1abdc30136f58dc89dcff44ddad66

⚠️ togetherai-python studio · code

Your SDK build had at least one "warning" diagnostic.
generate ⚠️ build ✅ lint ✅ test ⏭️

pip install https://pkg.stainless.com/s/togetherai-python/9c1211a8143dc435ef351dd57b9553d39ae06b8e/together-2.12.0-py3-none-any.whl

⚠️ togetherai-typescript studio · conflict

Your SDK build had at least one "warning" diagnostic.

togetherai-terraform studio · code

Your SDK build had at least one "note" diagnostic.
generate ✅ lint ✅ test ✅


This comment is auto-generated by GitHub Actions and is automatically kept up to date as you push.
If you push custom code to the preview branch, re-run this workflow to update the comment.
Last updated: 2026-05-07 22:06:20 UTC

@rishabh-bhargava changed the title from "docs(audio): broaden supported file formats to include ogg, opus, aac" to "MLE-5321: docs(audio): broaden supported file formats to include ogg, opus, aac" on May 7, 2026
@rishabh-bhargava merged commit 1790e98 into main on May 7, 2026
6 checks passed
@rishabh-bhargava deleted the feat/audio-supported-formats branch on May 7, 2026 22:03