| title | Language release process |
|---|---|
| description | Here's what API users can expect when DeepL adds translation support for a new language or language variant. |
On a regular basis, DeepL adds translation support for new languages or language variants. In this article, we describe the process we'll follow with a new language or variant release.
DeepL language codes follow BCP 47. A language code always includes a base language subtag (e.g. en, zh), and may include additional subtags for script, region, or variant where needed to distinguish variants. For example:
- `EN-US`, `PT-BR` -- region subtag to distinguish regional variants.
- `ZH-HANS`, `ZH-HANT` -- script subtag to distinguish writing systems.
BCP 47 is an expansive standard, and language codes can vary significantly in structure and length. As DeepL adds support for more languages and variants, new codes may use any combination of subtags permitted by the spec. For example, codes like sr-Cyrl-RS or sr-Latn-RS (Serbian in Cyrillic vs. Latin script, as used in Serbia) are valid BCP 47 codes -- while DeepL does not support these today, your integration should be able to handle codes of this form if they are added in the future.
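One way to prepare an integration for codes of this form is to split each code into subtags rather than assume a fixed length or a fixed `xx-YY` pattern. The following is a minimal sketch, not a full BCP 47 parser: the `LanguageTag` type and `parse_language_tag` function are hypothetical names, and the sketch assumes it is enough to recognize an optional 4-letter script subtag followed by an optional region subtag (2 letters or 3 digits). Extended language subtags, private-use tags, and grandfathered tags are not handled specially, and anything unrecognized is collected in `rest`.

```python
import re
from typing import NamedTuple, Optional, Tuple


class LanguageTag(NamedTuple):
    language: str             # base language subtag, lowercased (e.g. "zh")
    script: Optional[str]     # script subtag, title-cased (e.g. "Hans"), if present
    region: Optional[str]     # region subtag, uppercased (e.g. "US"), if present
    rest: Tuple[str, ...]     # any remaining subtags, lowercased and left uninterpreted


def parse_language_tag(code: str) -> LanguageTag:
    """Split a language code into subtags (simplified, case-insensitive)."""
    parts = code.strip().split("-")

    # The first subtag must be the base language.
    if not re.fullmatch(r"[A-Za-z]{2,8}", parts[0]):
        raise ValueError(f"not a language subtag: {parts[0]!r}")
    language = parts[0].lower()

    script = None
    region = None
    i = 1

    # Optional script subtag: exactly four letters.
    if i < len(parts) and re.fullmatch(r"[A-Za-z]{4}", parts[i]):
        script = parts[i].title()
        i += 1

    # Optional region subtag: two letters or three digits.
    if i < len(parts) and re.fullmatch(r"[A-Za-z]{2}|[0-9]{3}", parts[i]):
        region = parts[i].upper()
        i += 1

    return LanguageTag(language, script, region, tuple(p.lower() for p in parts[i:]))
```

With this approach, `parse_language_tag("EN-US")`, `parse_language_tag("ZH-HANS")`, and `parse_language_tag("sr-Cyrl-RS")` all go through the same code path, so a newly added code with an extra subtag does not need special handling.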
- We will add the language code for the newly supported language or variant to the "Source languages" and "Target languages" lists on the Supported languages page in the API documentation. We'll include a note on that page if the language or variant does not support both text and document translation.
- If a newly added language or variant supports both text and document translation, we will add the language or variant to the `/languages` endpoint response. The variant code used depends on the characteristics of the variant:
  - In some cases, a variant is primarily used in a specific region, and so a region subtag is the best way to identify it (e.g. `EN-US`, `PT-BR`).
  - In other cases, a variant is used widely across multiple regions, and so a script subtag is more appropriate (e.g. `ZH-HANS`, `ZH-HANT`).

  The subtag structure will be selected by DeepL on a case-by-case basis following BCP 47 conventions.
- In cases where a new language code with a variant duplicates the behavior of an existing language code without a variant (e.g. `ZH-HANS` was recently added as a language code for translating into simplified Chinese, along with `ZH`):
  - In the `/languages` endpoint response, we will continue to return both language codes in two separate dicts with the same value in the `"name"` field.
  - For backwards compatibility, we will continue to support the original language code (in this example, `ZH`) for text and document translation.
- We will add the language code for the newly supported language or variant to our OpenAPI spec.
This will allow us to specify whether a language supports both text and document translation, whether a language code is considered deprecated because it's been duplicated by a variant language code, and so on.
The additional metadata would also allow us, for example, to add languages like AR and ZH-HANT to the languages endpoint even before document translation is supported.
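Because duplicate codes such as `ZH` and `ZH-HANS` are returned as separate entries with the same `"name"`, a client that shows a language picker may want to group entries by name so the same language is not listed twice. The sketch below illustrates one way to do that. It assumes each entry stores its code under a `"language"` key; the `group_codes_by_name` function and the sample data are hypothetical and are not taken from an actual API response.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Mapping


def group_codes_by_name(languages: Iterable[Mapping[str, str]]) -> Dict[str, List[str]]:
    """Map each display name to every language code that shares it."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for entry in languages:
        # Assumption: the code is under "language" and the display name under "name".
        groups[entry["name"]].append(entry["language"])
    return dict(groups)


# Illustrative data only, shaped like the response described above.
sample = [
    {"language": "ZH", "name": "Chinese (simplified)"},
    {"language": "ZH-HANS", "name": "Chinese (simplified)"},
    {"language": "PT-BR", "name": "Portuguese (Brazilian)"},
]

grouped = group_codes_by_name(sample)
```

Any name that maps to more than one code marks a duplicate pair. Both codes remain usable for translation, so the client is free to choose which one to display.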