diff --git a/AGENTS.md b/AGENTS.md
new file mode 100644
index 000000000..f67d3705f
--- /dev/null
+++ b/AGENTS.md
@@ -0,0 +1,31 @@
+
+
+# Agent Guide for Apache Synapse
+
+This file is read by automated agents (security scanners, code analyzers,
+AI assistants) operating on this repository.
+
+## Security
+
+Security model: [SECURITY.md](./SECURITY.md) -> [THREAT_MODEL.md](./THREAT_MODEL.md)
+
+Agents that scan this repository should consult `SECURITY.md` and the linked
+`THREAT_MODEL.md` before reporting issues. Synapse is a mediation engine / ESB:
+the untrusted message sender is the adversary, while the integration
+configuration (sequences, scripts, XSLT/XQuery bodies, endpoint addresses) is
+trusted operator input. The model covers transports, XSLT/XQuery transforms
+(XXE/SSRF), dynamic endpoint routing (SSRF), and message DoS.
diff --git a/SECURITY.md b/SECURITY.md
new file mode 100644
index 000000000..0f6ea2dad
--- /dev/null
+++ b/SECURITY.md
@@ -0,0 +1,14 @@
+# Security Policy
+
+## Reporting a Vulnerability
+
+Apache Synapse follows the [Apache Software Foundation security process](https://www.apache.org/security/).
+Please report suspected vulnerabilities **privately** to `security@apache.org` (the Synapse PMC is reachable
+at `private@synapse.apache.org`). Do **not** open public GitHub issues or pull requests for security reports.
+
+## Threat Model
+
+What Synapse treats as in/out of scope, the security properties it provides and disclaims (safe-by-default
+XML transforms, mediation-level security, secret protection), the adversary model (the untrusted message
+sender vs. the trusted integration configuration), and how findings are triaged are documented in
+[THREAT_MODEL.md](./THREAT_MODEL.md).
diff --git a/THREAT_MODEL.md b/THREAT_MODEL.md
new file mode 100644
index 000000000..d78bdf606
--- /dev/null
+++ b/THREAT_MODEL.md
@@ -0,0 +1,262 @@
+
+
+# Threat Model — Apache Synapse
+
+## §1 Header
+
+- **Project:** Apache Synapse — a lightweight, high-performance **Enterprise Service Bus (ESB) / mediation
+  engine**. It accepts messages over pluggable transports (HTTP/S, JMS, VFS, Mail, …), runs them through
+  operator-defined **mediation sequences / proxy services** (mediators: XSLT, XQuery, script, filter,
+  switch, send-to-endpoint, …), and routes/transforms them toward backend **endpoints** *(documented — README;
+  source `org.apache.synapse.mediators`, `config.xml`)*.
+- **Modelled against:** `apache/synapse` `master`/HEAD (2026-05-31).
+- **Status:** **DRAFT — v0, not yet reviewed by the Synapse PMC.** Produced by the ASF Security team via the
+  `threat-model-producer` rubric ().
+- **Reporting / version-binding / legend** as in the sibling models. **Draft confidence:** ~12 documented /
+  0 maintainer / ~48 inferred. Each *(inferred)* routes to §14.
+
+**Framing note (as for any framework):** Synapse is a *mediation engine*, not a finished application. The
+**integration developer** authors the synapse configuration — sequences, mediators, scripts, XSLT/XQuery,
+endpoints, and security policies. That configuration is **trusted input** (§3); the **inbound message from a
+network client is the untrusted adversary input** (§7). Most properties are conditional on how the
+integration is configured, so §9/§10 carry a lot of weight.
+
+## §2 Scope and intended use
+
+Intended use *(documented)*: deploy Synapse as a message broker/mediator in front of or between services —
+clients send messages to a Synapse proxy/API; Synapse mediates (transform, route, secure, throttle) and
+forwards to backend endpoints.
+
+Caller roles:
+
+- **Message client (untrusted)** — any peer that can send a message to a Synapse listener/proxy/API.
+- **Backend endpoint** — a service Synapse calls; semi-trusted (its responses re-enter mediation).
+- **Integration developer / operator** — authors the synapse config (mediation logic, scripts, XSLT,
+  endpoints, secure-vault secrets, transport + WS-Security policy). **Trusted; out of model as adversary (§3).**
+
+**Component-family table:**
+
+| Family | Entry point | Touches outside process | In model? |
+| --- | --- | --- | --- |
+| Transport listeners | HTTP/S (NHTTP/passthrough), JMS, VFS, Mail | network / fs / mail | **Yes** |
+| Mediation engine | sequences / proxy services / APIs | — | **Yes** |
+| XML transform mediators | **XSLT**, **XQuery**, payload factory | XML; **external refs** | **Yes (high-value)** |
+| Script mediators | JS/Groovy/… (operator-authored) | runs config code over message data | **Yes (data-in surface)** |
+| Endpoints (outbound) | send/call mediators, address/WSDL/loadbalance | **network egress** | **Yes (SSRF surface)** |
+| Eventing | WS-Eventing subscriptions | network | **Yes** |
+| Secrets / secure-vault | encrypted config secrets | keystore | **Yes** |
+| Samples / docs / build | `modules/documentation`, samples, tests | — | No → §3 |
+
+## §3 Out of scope (explicit non-goals)
+
+- **The integration developer / operator as adversary**, and the **synapse configuration** itself (sequences,
+  scripts, XSLT/XQuery bodies, endpoint addresses, secrets). Config is authored by a trusted party; a script
+  mediator running operator-authored code is not an adversary surface — the message *data flowing into* it is
+  *(inferred)*.
+- **Misconfiguration** (enabling external-entity resolution, routing to an attacker-derived endpoint without
+  validation, disabling TLS) — Synapse provides the controls; using them is the operator's job (§10/§11).
+- **Backend services** Synapse mediates to, and the message producers' own security.
+- **Samples, documentation, and tests** *(inferred)*.
+- **The underlying XML/crypto stacks** (the JAXP/StAX provider, Rampart/WSS4J) except as Synapse configures
+  and invokes them.
+
+## §4 Trust boundaries and data flow
+
+The trust boundary is the **transport listener + the mediation entry**: bytes arriving on a listener are
+untrusted until mediation (and any configured WS-Security/transport auth) has processed them *(inferred)*.
+
+Trust transitions:
+
+1. **Wire → message build:** the transport builds a message (SOAP/XML/JSON/binary). XML building is the
+   XXE / entity-expansion / large-message DoS surface *(inferred — wave-1)*.
+2. **Message → XSLT/XQuery mediator:** transforms may resolve external entities, `document()` / `doc()`
+   references, or extension functions — an **XXE / SSRF / file-read** surface if external resolution is enabled
+   *(inferred — `XSLTMediator`; high-value, §14)*.
+3. **Message → script mediator:** operator-authored JS/Groovy runs with message data as input. The *code* is
+   trusted (config); the risk is unsafe handling of message data inside it *(inferred)*.
+4. **Message → endpoint resolution:** static endpoints are config (trusted); **dynamic / content-based
+   routing** that derives an endpoint address from message content is an **SSRF** surface *(inferred)*.
+5. **Endpoint response → mediation:** backend responses re-enter mediation as semi-trusted input.
+
+**Reachability precondition:** a finding is in-model if reachable from an inbound message *before* the
+mediation auth/validation the integration configured; a finding requiring a malicious **config** (script,
+XSLT body, endpoint address chosen by the operator) is `OUT-OF-MODEL: trusted-input` (§3/§6).
+
+## §5 Assumptions about the environment
+
+- JVM host running the Synapse runtime; operator-managed `synapse.xml` config, keystores, and transport setup.
+- Transports reachable per operator network config; TLS provided by the transport configuration *(inferred)*.
+- Secrets via secure-vault are protected by an operator-managed keystore/password *(inferred)*.
+- **What Synapse does to its host (*(inferred)* — wave-2):** binds transport listeners; opens **outbound**
+  connections to configured (and possibly dynamically-resolved) endpoints; reads config + keystores; XSLT/
+  XQuery may fetch external references if enabled. Not assumed to spawn host processes beyond configured
+  command/script mediators.
+
+## §5a Build-time and configuration variants
+
+| Knob (names *(inferred)*) | Effect | Ruling needed |
+| --- | --- | --- |
+| XML secure-processing / DTD + external-entity resolution in builders & XSLT/XQuery | XXE / SSRF / file-read on inbound transforms | **Open (wave-1):** are external entities/`document()` off by default? |
+| Message size / element-depth / streaming limits | XML/large-message DoS | **Open (wave-1)** |
+| Dynamic / content-based endpoint resolution | SSRF if endpoint derived from message | Open — validated/allow-listed? |
+| Transport TLS (HTTPS listener + outbound) | Confidentiality/integrity | Operator (§10) |
+| WS-Security (Rampart) on a proxy | Message-level auth/sig/enc | Integration choice |
+| Script-mediator languages enabled | Operator-code surface | Operator config |
+
+## §6 Assumptions about inputs
+
+| Entry point | Parameter | Attacker-controllable? | Caller/operator must enforce |
+| --- | --- | --- | --- |
+| transport listener | message body (SOAP/XML/JSON/binary), headers, SOAPAction | **yes** | XML limits; transport/WS-Security; size caps |
+| XSLT/XQuery mediator | message payload (the transform *input*) | **yes** | disable external entity/`document()` resolution |
+| script mediator | message payload passed to the script | **yes** | safe handling of message data in the script |
+| dynamic endpoint | endpoint address *derived from message* (if used) | **yes (if configured)** | validate/allow-list resolved addresses |
+| synapse config (sequences, scripts, XSLT, endpoints, secrets) | all | **no — operator-trusted** | never sourced from a message |
+
+## §7 Adversary model
+
+- **Primary adversary:** an untrusted client sending messages to a Synapse listener/proxy/API. Capabilities:
+  craft SOAP/XML/JSON payloads (XXE, entity-expansion, oversized), drive content that influences XSLT/XQuery
+  resolution, supply data that a dynamic route turns into an endpoint address (SSRF), or that a script
+  mishandles.
+- **Secondary:** a malicious backend endpoint returning hostile responses into mediation.
+- **Goals:** XXE/file-read/SSRF via transforms or routing; XML/message DoS; bypass of a configured
+  mediation-level auth; exfiltration of secrets reachable through a transform.
+- **Out of model:** the integration developer/operator; the config (scripts, XSLT bodies, endpoint
+  addresses); keystore/secret holders.
+
+## §8 Security properties the project provides
+
+*(Conditional on configuration; *(inferred)* pending §14.)*
+
+1. **Robust message building/parsing.** Malformed/oversized inbound messages yield a fault, not memory
+   corruption or unbounded resource use (subject to configured limits) *(inferred)*. *Symptom:* crash/hang/OOM
+   from crafted input. *Severity:* high.
+2. **Safe-by-default XML transforms.** XSLT/XQuery and message builders do not resolve external entities/
+   `document()` against untrusted input unless explicitly enabled *(inferred — load-bearing; wave-1)*.
+   *Symptom:* XXE read / SSRF / file disclosure via a transform. *Severity:* critical.
+3. **Mediation-level security mechanisms.** When configured, transport security and WS-Security (Rampart)
+   authenticate/sign/encrypt messages *(inferred)*. *Symptom:* accepted unauthenticated/forged message where
+   policy required otherwise. *Severity:* critical.
+4. **Secret protection.** Secure-vault keeps configured secrets encrypted at rest, not in plaintext config
+   *(inferred)*. *Symptom:* plaintext secret exposure. *Severity:* high.
+5. **Transport security support.** TLS on HTTPS listeners and outbound calls with cert validation when
+   configured *(inferred)*. *Symptom:* MITM where TLS expected. *Severity:* high.
+
+## §9 Security properties the project does NOT provide
+
+- **No security without configuration** — a proxy with no transport/WS-Security and permissive transforms is
+  only as protected as the integration wired it *(inferred)*.
+- **No defence against the integration developer** — scripts, XSLT/XQuery bodies, and endpoint addresses are
+  trusted config (§3).
+- **No intrinsic SSRF protection for dynamic/content-based routing** — if an endpoint is derived from message
+  content, validating it is the integration's job *(inferred)*.
+
+**False friends:**
+
+- *An XSLT/XQuery transform looks like pure data transformation but can read files / fetch URLs* via external
+  entities, `document()`/`doc()`, or extension functions if external resolution is left enabled.
+- *A script mediator looks sandboxed but runs with the engine's privileges* — it is operator code, not a
+  security boundary for message data.
+- *Content-based routing looks like internal plumbing but can become SSRF* when the route target is
+  attacker-influenced.
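
Editorial sketch (not part of the patch): the first false friend above, an XSLT transform that can read files or fetch URLs, can be made concrete with the standard JAXP hardening settings. The Java fragment below is a minimal sketch that assumes the JDK's built-in XSLT implementation. It is not taken from Synapse's `XSLTMediator`, and the class name `HardenedTransformerFactory` is hypothetical. Whether Synapse applies equivalent settings by default is the open wave-1 question in §14.

```java
import javax.xml.XMLConstants;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerFactory;

/** Hypothetical helper: builds a TransformerFactory with external resolution switched off. */
final class HardenedTransformerFactory {

    private HardenedTransformerFactory() {
    }

    static TransformerFactory create() {
        TransformerFactory factory = TransformerFactory.newInstance();
        try {
            // Standard JAXP feature; in the JDK implementation it also disables
            // XSLT extension functions unless they are explicitly re-enabled.
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        } catch (TransformerConfigurationException e) {
            throw new IllegalStateException("secure processing is not supported", e);
        }
        // An empty protocol list denies all access to external DTDs and entities.
        factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "");
        // An empty protocol list denies xsl:import, xsl:include and document() fetches.
        factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_STYLESHEET, "");
        return factory;
    }
}
```

Other XSLT processors may not recognise the two `ACCESS_EXTERNAL_*` attributes and can throw `IllegalArgumentException` from `setAttribute`, so a real integration would need to check the behaviour of whichever processor is on the classpath.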
+
+**Well-known attack classes to keep in view:** XXE and XML entity-expansion DoS; SSRF via XSLT `document()`/
+external entities and via dynamic endpoint resolution; oversized-message / streaming DoS; injection into a
+downstream system via an unsanitized transform; secret exposure through an over-broad transform; XML
+signature-wrapping where WS-Security is used (see the CXF/WSS4J model).
+
+## §10 Downstream (integrator/operator) responsibilities
+
+- **Keep external-entity / DTD / `document()` resolution disabled** in message builders and XSLT/XQuery on
+  untrusted inbound paths; keep message-size/depth limits on.
+- **Validate or allow-list** any endpoint address derived from message content (anti-SSRF).
+- Configure transport TLS (with cert validation) and WS-Security where the integration requires
+  authentication/integrity.
+- Treat script/XSLT/XQuery mediator bodies as code you own; don't accept them from untrusted sources.
+- Protect the secure-vault keystore/password; don't commit plaintext secrets.
+
+## §11 Known misuse patterns
+
+- Exposing a proxy with no transport/message security and assuming the ESB "is secure".
+- Enabling external-entity / `document()` resolution in XSLT/XQuery over untrusted messages.
+- Deriving an endpoint address from message content without validation (SSRF).
+- Embedding secrets in plaintext config instead of secure-vault.
+- Routing untrusted message content into a script mediator that then executes/concatenates it unsafely.
+
+## §11a Known non-findings (recurring false positives)
+
+*(v0 seed — the PMC will own the authoritative list — §14.)*
+
+- **A script/XSLT/XQuery mediator "executes code"** — operator-authored config (§3/§9); not a finding unless a
+  *default* path lets an untrusted message reach unsafe resolution.
+- **XXE/SSRF reachable only when the operator enabled external resolution** — `OUT-OF-MODEL: non-default-build`
+  unless the *default* resolves external entities (then `VALID` — wave-1).
+- **SSRF via an endpoint address the operator configured statically** — trusted input (§6).
+- **Findings in samples / documentation / tests** — out of scope (§3).
+- **Use of a weak algorithm explicitly configured** in a WS-Security policy — integration choice.
+
+## §12 Conditions that would change this model
+
+- A change to default XML/transform external-resolution or size-limit posture.
+- A new transport, mediator, or default that resolves untrusted references.
+- Dynamic endpoint resolution becoming on/permissive by default.
+- A change in secure-vault or WS-Security defaults.
+- Any report not cleanly routable to a §13 disposition.
+
+## §13 Triage dispositions
+
+| Disposition | Meaning | Licensed by |
+| --- | --- | --- |
+| `VALID` | Violates a claimed property via an in-scope adversary/input in a default config. | §8, §6, §7 |
+| `VALID-HARDENING` | No §8 property broken, but a §11 misuse warrants a safer default/guard. | §11 |
+| `OUT-OF-MODEL: trusted-input` | Requires control of the synapse config (script/XSLT/endpoint/secret). | §6, §3 |
+| `OUT-OF-MODEL: adversary-not-in-scope` | Requires operator/keystore capability. | §7, §3 |
+| `OUT-OF-MODEL: unsupported-component` | Lands in samples/docs/tests. | §3 |
+| `OUT-OF-MODEL: non-default-build` | Only when an insecure non-default transform/resolution option was enabled. | §5a |
+| `BY-DESIGN: property-disclaimed` | Concerns a §9-disclaimed property (no security without config; scripts are operator code). | §9 |
+| `KNOWN-NON-FINDING` | Matches a §11a entry. | §11a |
+| `MODEL-GAP` | Routes to none of the above → revise the model. | §12 |
+
+## §14 Open questions for the maintainers
+
+**Wave 1 — transform/parse defaults (decide VALID-vs-misconfig; §5a/§8):**
+1. By default, do the **message builders and XSLT/XQuery mediators disable DTD / external-entity / `document()`
+   resolution** on untrusted inbound messages, so an XXE/SSRF-via-transform report against defaults is `VALID`?
+   *Proposed:* external resolution off by default; enabling it is operator opt-in.
+2. Are there **default message-size / element-depth / streaming limits** that bound XML/large-message DoS?
+   *Proposed:* configurable limits; sensible defaults.
+
+**Wave 2 — routing & scripts (§4/§9):**
+3. Is **dynamic / content-based endpoint resolution** something an untrusted message can influence by default,
+   and is the resolved address validated/allow-listed? *Proposed:* static endpoints are the norm; dynamic
+   resolution is opt-in and the integration validates it (SSRF = integration responsibility).
+4. Confirm **script / XSLT / XQuery mediator bodies are trusted config** (operator-authored), so "code
+   execution in a mediator" is `OUT-OF-MODEL: trusted-input` rather than a framework finding. *Proposed:* yes.
+
+**Wave 3 — secrets, WS-Security, §11a (§8/§11a):**
+5. How does **secure-vault** protect secrets, and what does Synapse claim about secret exposure through
+   transforms/logging? *Proposed:* encrypted at rest; avoid logging secrets.
+6. What do scanners most often (re)report here that the PMC considers a **non-finding**? (Seeds §11a.)
+
+**Meta:**
+7. Confirm this model lives as root `THREAT_MODEL.md` referenced from a new `SECURITY.md`. *Proposed:* yes.
+
+## §15 Machine-readable companion
+
+Deferred for v0; a `threat-model.yaml` can later encode the §6 trust table, §2/§3 scoping, §8 rows, §9 false
+friends, §11a non-findings, and §13 dispositions.
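
Editorial sketch (not part of the patch): §10 asks integrators to validate or allow-list any endpoint address derived from message content. The Java class below is a minimal illustration of that check using only `java.net.URI`. The class name `EndpointAllowList`, the HTTPS-only rule, and the example host names are assumptions made for illustration, not Synapse APIs or project policy.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical allow-list for endpoint addresses derived from message content. */
final class EndpointAllowList {

    private final Set<String> allowedHosts = new HashSet<>();

    EndpointAllowList(Set<String> hosts) {
        // Host names are compared case-insensitively, so store them lower-cased.
        for (String host : hosts) {
            allowedHosts.add(host.toLowerCase(Locale.ROOT));
        }
    }

    /** Returns true only for well-formed https URIs whose host is on the list. */
    boolean isAllowed(String candidate) {
        if (candidate == null) {
            return false;
        }
        final URI uri;
        try {
            uri = new URI(candidate.trim());
        } catch (URISyntaxException e) {
            return false; // unparseable input is rejected rather than repaired
        }
        if (!"https".equalsIgnoreCase(uri.getScheme())) {
            return false; // assumption: only TLS endpoints are acceptable
        }
        if (uri.getUserInfo() != null) {
            return false; // rejects "https://trusted@attacker.example/" tricks
        }
        String host = uri.getHost();
        return host != null && allowedHosts.contains(host.toLowerCase(Locale.ROOT));
    }
}
```

A caller would build the list from operator configuration, for example `new EndpointAllowList(Collections.singleton("orders.internal.example"))`, and refuse to route when `isAllowed` returns false. The check covers only the address string. It does not cover DNS rebinding or redirects issued by the backend, which would need separate controls.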