
Custom memory persistence silently no-ops on Workflow (BaseNode) roots in 2.x — no working post-run hook #5282

@surfai

Description

Edit 2026-04-11 (after @surajksharma07's triage + source re-verification): scope is narrower than this body originally claimed. Only plugin_manager.run_after_run_callback is missing dispatch on the BaseNode path. run_on_event_callback does fire via _consume_event_queue at runners.py:619, and after_model_callback is dispatched by LlmAgent internals on the model-call boundary independently of the Runner. The primary ask stands — land the runners.py:427 TODO for run_after_run_callback — but it's a one-dispatch-call fix, not the broad "plugin lifecycle is incomplete" re-architecture this body originally framed. Overclaims in the sections below are struck through. Verified dispatch matrix + working workaround + drop-in regression test in this comment below.


Environment: google-adk==2.0.0a3 (released 2026-04-09), Python 3.14, Linux/macOS. Feature flags NEW_WORKFLOW, V1_LLM_AGENT, PLUGGABLE_AUTH default-on (the a3 defaults).


The problem we ran into

We have a multi-turn chat agent backed by a custom SSE server (FastAPI + google.adk.Runner). The agent is a graph workflow — App(root_agent=Workflow(edges=[...])) — classifying, guarding, and routing requests, and we persist each completed session to a custom MyMemoryService(BaseMemoryService) for long-term recall across sessions. The implementation details of the memory backend are not relevant here; the issue reproduces identically with InMemoryMemoryService.

We followed the canonical pattern from the memory docs:

async def save_to_memory(callback_context):
    await callback_context.add_session_to_memory()

root_agent = Workflow(
    name="FrontDoorWorkflow",
    after_agent_callback=save_to_memory,   # ← docs pattern
    edges=[...],
)

Sessions and events were persisting normally (via DatabaseSessionService), but the memory store's write count was stuck at its pre-upgrade value — no new rows after the a3 upgrade. No exception, no warning, no log line. It looked like the store had stopped accepting writes; it hadn't.

The next thing we tried was the BasePlugin.after_run_callback pattern, which reads as the more ADK-native approach:

class MemoryPersistPlugin(BasePlugin):
    async def after_run_callback(self, *, invocation_context):
        if invocation_context.memory_service is None:
            return
        await invocation_context.memory_service.add_session_to_memory(
            invocation_context.session
        )

app = App(root_agent=root_agent, plugins=[MemoryPersistPlugin()])

This also silently no-ops. We traced it through the installed source, and the real story is that ~~BaseNode root paths in 2.0.0a3 have no post-run lifecycle hook surface exposed to users — not via callbacks, not via plugins~~. [Edit: overclaim. The precise gap is run_after_run_callback. See top-of-issue note.]

This is a blocker for any cross-cutting post-run concern on Workflow roots: memory persistence, metrics emission, audit logging, final telemetry, cleanup. ~~Users migrating toward the BaseNode-rooted architecture that 2.0 is steering them onto silently lose every after-hook they had on LlmAgent.~~ [Edit: overclaim. The blocker is specifically on after_run_callback-based patterns, which still rules out most ADK-native cross-cutting wiring, but on_event_callback-based observability works today.]


Root cause (source trace)

1. Workflow is a BaseNode, not a BaseAgent

  • google/adk/workflow/_workflow_class.py:143 — class Workflow(BaseNode).
  • BaseNode has no after_agent_callback field. Workflow doesn't either.
  • Pydantic v2 default extra="ignore" → passing after_agent_callback=save_to_memory to Workflow(...) is silently dropped at construction. No error, no warning.
  • The dispatch mechanism for the field, BaseAgent._handle_after_agent_callback at google/adk/agents/base_agent.py:491, is agent-only and unreachable from a BaseNode root.
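
The silent drop in the third bullet can be reproduced in isolation. NodeStub below is a hypothetical stand-in, not the real BaseNode; the only assumption carried over from the trace is that BaseNode is a Pydantic v2 model left on the default extra="ignore" configuration:

```python
from pydantic import BaseModel, ConfigDict

class NodeStub(BaseModel):
    """Hypothetical stand-in for BaseNode; NOT the real ADK class."""
    model_config = ConfigDict(extra="ignore")  # Pydantic v2 default behavior
    name: str

# The unknown kwarg is accepted at construction and silently discarded:
node = NodeStub(name="FrontDoorWorkflow", after_agent_callback=lambda ctx: None)
print(hasattr(node, "after_agent_callback"))  # False; no error, no warning
```

Any typo'd or unsupported kwarg disappears the same way, which is why the construction site gives no hint that the callback was never registered.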

This alone produces the symptom. But the deeper issue is what happens if users try the ADK-native alternative:

2. run_after_run_callback is not dispatched on the BaseNode path

Runner.run_async at google/adk/runners.py:760 branches on root type:

  • Branch A (isinstance(self.agent, LlmAgent), line 807): wraps in _V1LlmAgentWrapper, calls _run_node_async, return at line 834.
  • Branch B (isinstance(self.agent, BaseNode) and not LlmAgent, line 838): calls _run_node_async, return at line 848. This is the path Workflow roots take.
  • Branch C (legacy fallthrough _run_with_trace, line 851+): uses _exec_with_plugin which dispatches run_after_run_callback at runners.py:1230. Neither root shape reaches this path.

So the only code path that dispatches after_run_callback is the one that nobody hits in 2.0. Both LlmAgent and Workflow roots funnel into _run_node_async.

3. _run_node_async has not yet wired run_after_run_callback

google/adk/runners.py:413 _run_node_async():

  • Line 427: # TODO: Add tracing and plugin lifecycle for the node runtime path. — explicit acknowledgement this is incomplete.
  • Line 467: run_on_user_message_callback
  • Line 482: run_before_run_callback
  • Line 506: async for event in self._consume_event_queue(ic, done_sentinel) — drains events via _consume_event_queue, which dispatches run_on_event_callback at runners.py:619. [Edit: this was wrong in the original text below; on_event_callback does fire for Workflow roots.]
  • No call to run_after_run_callback anywhere in the function. ~~Nor to run_after_agent_callback, run_after_model_callback, or run_on_event_callback.~~ [Edit: run_on_event_callback fires via _consume_event_queue:619; after_model_callback is dispatched by LlmAgent on the model-call boundary, not the Runner; after_agent_callback is agent-only and not a _run_node_async concern.]

Pre-run plugin hooks (on_user_message, before_run) and per-event hooks (on_event_callback) DO fire for BaseNode roots. run_after_run_callback does not. A BasePlugin overriding after_run_callback loads successfully, registers with the Runner, and then never executes.


Minimal reproducer

# pip install google-adk==2.0.0a3
import asyncio
from google.adk import Event, Workflow
from google.adk.apps import App
from google.adk.events import EventActions  # noqa: F401
from google.adk.plugins.base_plugin import BasePlugin
from google.adk.runners import Runner
from google.adk.sessions.in_memory_session_service import InMemorySessionService
from google.adk.memory.in_memory_memory_service import InMemoryMemoryService
from google.genai import types


def terminal_node(ctx) -> Event:
    return Event(state={"done": True})


class TracerPlugin(BasePlugin):
    def __init__(self):
        super().__init__(name="tracer")
        self.fired = {
            "on_user_message_callback": 0,
            "before_run_callback": 0,
            "after_run_callback": 0,
            "after_agent_callback": 0,
            "on_event_callback": 0,
        }

    async def on_user_message_callback(self, *, invocation_context, user_message):
        self.fired["on_user_message_callback"] += 1

    async def before_run_callback(self, *, invocation_context):
        self.fired["before_run_callback"] += 1

    async def after_run_callback(self, *, invocation_context):
        self.fired["after_run_callback"] += 1

    async def after_agent_callback(self, *, agent, callback_context):
        self.fired["after_agent_callback"] += 1

    async def on_event_callback(self, *, invocation_context, event):
        self.fired["on_event_callback"] += 1


async def main():
    plugin = TracerPlugin()
    workflow = Workflow(name="Demo", edges=[("START", terminal_node)])
    app = App(name="demo", root_agent=workflow, plugins=[plugin])
    runner = Runner(
        app_name="demo",
        app=app,
        session_service=InMemorySessionService(),
        memory_service=InMemoryMemoryService(),
    )
    session = await runner.session_service.create_session(app_name="demo", user_id="u1")
    async for _ in runner.run_async(
        user_id="u1",
        session_id=session.id,
        new_message=types.Content(parts=[types.Part(text="hi")], role="user"),
    ):
        pass
    print(plugin.fired)


asyncio.run(main())

[Edit: the reproducer above has a bug — terminal_node yields Event(state=...) with no content, which skews on_event_callback counts, and the original "Actual" output below reflected that. A corrected drop-in test case with a content-bearing terminal node and a working WorkaroundRunner is in this comment.]

Expected: every hook fires at least once.
Actual (on 2.0.0a3):
~~{'on_user_message_callback': 1, 'before_run_callback': 1, 'after_run_callback': 0, 'after_agent_callback': 0, 'on_event_callback': 0}~~

Corrected actual (2.0.0a3, content-bearing terminal event):

{'on_user_message_callback': 1, 'before_run_callback': 1,
 'on_event_callback': 1, 'after_run_callback': 0}

Only after_run_callback stays at 0.


Impact

~~Every cross-cutting post-invocation concern silently no-ops on Workflow roots.~~ [Edit: narrower than the table originally claimed. Corrected:]

| Concern | Typical wiring | ADK 2.0.0a3 Workflow root status |
| --- | --- | --- |
| Long-term memory (`add_session_to_memory`) | `after_agent_callback` or `after_run_callback` | Silent no-op (this issue) |
| Metrics emission (latency, token counts, success rate) | `after_run_callback` | Silent no-op (this issue) |
| Audit logging / compliance trails | `after_run_callback` | Silent no-op via `after_run_callback` (this issue); `on_event_callback`-based audit works today |
| Final state cleanup, resource release | `after_run_callback` | Silent no-op (this issue) |
| Post-run token/cost accounting | `after_model_callback` | ~~Silent no-op~~ [Edit: `after_model_callback` is dispatched by LlmAgent on the model-call boundary, not by the Runner. Fires on Workflow roots that contain LlmAgent nodes. Not part of this issue.] |
| Per-event telemetry / event rewriting | `on_event_callback` | Works via `_consume_event_queue:619` |

And because there is no warning, teams discover the problem only when empty dashboards, missing memory rows, or absent audit trails are flagged downstream — often after shipping.

Our workaround was to embed memory persistence as a terminal FunctionNode inside the graph (async def persist_memory(ctx) wired as a final edge after every terminal specialist). It works, but it's specific to memory and doesn't generalize to token accounting, metrics, or audit hooks, which can't live as graph nodes cleanly. [Edit: a cleaner interim — subclass Runner and wrap run_async to dispatch run_after_run_callback after the generator drains — also works for teams instantiating Runner directly. Not viable when adk api_server owns the Runner instantiation (hardcoded at adk_web_server.py:737). Working example in the comment below.]
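
For teams that own the Runner instantiation, the subclass-and-wrap shape looks roughly like the stub below. Every class here (BaseRunnerStub, CountingPlugin, the event dicts) is a local stand-in for the real google.adk.runners.Runner and its plugin manager, not the actual API; the only load-bearing idea is dispatching the after-run hook once the inherited generator drains.

```python
import asyncio

class BaseRunnerStub:
    """Local stand-in for google.adk.runners.Runner (hypothetical shape)."""
    def __init__(self, plugins):
        self.plugins = plugins

    async def run_async(self, invocation_context):
        # Stands in for the real event stream the Runner yields.
        yield {"author": "workflow", "text": "done"}

class WorkaroundRunner(BaseRunnerStub):
    async def run_async(self, invocation_context):
        # Re-yield everything the parent produces...
        async for event in super().run_async(invocation_context):
            yield event
        # ...then dispatch the missing after-run hook after the stream drains.
        for plugin in self.plugins:
            await plugin.after_run_callback(invocation_context=invocation_context)

class CountingPlugin:
    def __init__(self):
        self.after_run_calls = 0

    async def after_run_callback(self, *, invocation_context):
        self.after_run_calls += 1

async def demo():
    plugin = CountingPlugin()
    runner = WorkaroundRunner(plugins=[plugin])
    events = [e async for e in runner.run_async(invocation_context=object())]
    return events, plugin.after_run_calls

events, calls = asyncio.run(demo())
print(len(events), calls)  # 1 1
```

The same wrapping works against the real Runner because run_async is an async generator either way; the callers downstream see an identical event stream.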


Open question — how should custom memory persistence work on Workflow roots today?

The memory docs page shows one canonical pattern: Agent(..., after_agent_callback=auto_save_session_to_memory_callback). With Workflow(BaseNode) being a flagship 2.0 feature and the intended future root type for non-LLM graph agents (per the runners.py:836-837 TODO to collapse LlmAgent into the BaseNode path), it would be very helpful if maintainers could confirm the supported answer to one of:

  1. Is there a hook we missed? If there's a canonical surface we haven't found for "run code at end of invocation on a BaseNode root," please point at it — we searched BasePlugin, BaseNode, Workflow, App, and Runner methods and could not find one that dispatches. Docs/example link welcome.
  2. Is the terminal-FunctionNode-in-graph pattern the intended interim answer? If so, it's worth documenting on the memory page next to the Agent example, so new users land on it instead of the silently-dropped after_agent_callback kwarg.
  3. If neither — what's the recommended path? We would like to build correctly against a path maintainers endorse, not against whichever private API happens to work.

Worth flagging: we believe every team building custom memory persistence in 2.x will hit this, not just us. The direction of travel in 2.0 is explicit — Workflow(BaseNode) is the flagship graph root type, and the TODO at runners.py:836-837 says LlmAgent itself will be refactored to inherit from BaseNode, at which point _run_node_async becomes the single dispatch path for all Runner invocations. Any team that:

  • subclasses BaseMemoryService (the documented extension point for custom memory backends), AND
  • uses a Workflow root (or, post-LlmAgent-migration, any 2.x root at all), AND
  • follows the memory docs page to wire persistence via after_agent_callback OR the ADK-native BasePlugin.after_run_callback

…will land in exactly the silent no-op we did. There is no "lucky" configuration that avoids it on 2.0.0a3 with a BaseNode root. And because the failure mode is silent (no exception, no warning, no log), downstream symptoms — empty memory store, missing cross-session recall, flat retrieval quality — are easy to attribute to embedding tuning, retrieval scoring, or chunk sizing instead of "the write never happened."

If the runners.py:427 TODO sits for multiple release cycles, which is entirely plausible given Runner-dispatch completeness does not currently appear to be a top-priority area, then custom memory persistence on Workflow roots is effectively unsupported in 2.x until either the TODO lands or the docs adopt the in-graph terminal-node pattern as an interim answer. The current situation — "docs pattern silently drops, Plugin alternative silently no-ops, no documented workaround, open TODO of unknown priority" — leaves every new adopter either guessing or reverse-engineering the Runner internals. A one-paragraph note on the memory docs page would save each of them the same multi-hour source trace.


Related issues

  • #4181 (open) — "before_model_callback and after_model_callback not invoked for live streaming sessions (run_live)". Same structural pattern: a specific Runner code path bypasses the callback dispatch that the standard path calls. This issue is the BaseNode-root analogue, with a different Runner branch (_run_node_async instead of run_live). A fix for one does not automatically cover the other; each alternate dispatch path has to be wired individually.
  • #4774 (open) — "Add Lifecycle Error Callbacks (on_agent_error, on_run_error) to ADK Framework". Motivates why after_run_callback/after_agent_callback are load-bearing for enterprise observability: "AGENT_COMPLETED and INVOCATION_COMPLETED events are never emitted to observability sinks… failed runs disappear from the denominator in standard reports." Same concern applies when after_run_callback doesn't fire at all on BaseNode roots — telemetry plugins depending on it miss every successful run, not just failed ones.

The ask

Primary — land the runners.py:427 TODO

Wire run_after_run_callback into _run_node_async, mirroring the dispatch that _exec_with_plugin already performs on the legacy path at runners.py:1230. Concretely: after the async for event in self._consume_event_queue(...) loop at line 506 drains, before the function returns, call await ic.plugin_manager.run_after_run_callback(invocation_context=ic).

- run_after_run_callback at the end of the function (after the event queue drains, before return) — mirrors runners.py:1230
- ~~run_after_agent_callback when a node finishes~~
- ~~run_after_model_callback at the model-call boundary~~
- ~~run_on_event_callback per yielded event — mirrors runners.py:1216~~

[Edit: the other three bullets are either already handled (run_on_event_callback via _consume_event_queue:619) or out of scope for _run_node_async (after_agent_callback is BaseAgent-only; after_model_callback is dispatched by LlmAgent on the model-call boundary). Scope is one dispatch call.]
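
In stub form, the proposed dispatch ordering is sketched below. All names here (PluginManagerStub, run_node_async) are local stand-ins mirroring the hook names from the trace, not the real runners.py internals; the point is only where the one new dispatch call sits relative to the existing ones.

```python
import asyncio

calls = []  # records dispatch order

class PluginManagerStub:
    """Stand-in for the real plugin manager; method names mirror the hooks."""
    async def run_before_run_callback(self, *, invocation_context):
        calls.append("before_run")

    async def run_on_event_callback(self, *, invocation_context, event):
        calls.append("on_event")

    async def run_after_run_callback(self, *, invocation_context):
        calls.append("after_run")

async def run_node_async(plugin_manager, ic, queued_events):
    await plugin_manager.run_before_run_callback(invocation_context=ic)
    for event in queued_events:  # stands in for _consume_event_queue
        await plugin_manager.run_on_event_callback(invocation_context=ic, event=event)
        yield event
    # The proposed fix: one dispatch call after the queue drains, before return.
    await plugin_manager.run_after_run_callback(invocation_context=ic)

async def demo():
    pm = PluginManagerStub()
    async for _ in run_node_async(pm, object(), [{"text": "done"}]):
        pass
    return calls

order = asyncio.run(demo())
print(order)  # ['before_run', 'on_event', 'after_run']
```

Placing the call after the drain loop (rather than inside a finally) matches the semantics of the legacy _exec_with_plugin path, where after_run fires only on normal completion; error-path hooks are #4774's territory.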

This is structurally the same ask as #4181: identify an alternate Runner dispatch path that the current plugin wiring skipped, and add the callback call. The fix pattern is well-understood.

This lets a BasePlugin work the same on BaseNode roots as it does on the legacy _exec_with_plugin path today, and unblocks the whole class of after_run_callback-based cross-cutting concerns (memory persistence, metrics, audit, cleanup).

Related: the TODO at runners.py:836-837 ("remove not isinstance(self.agent, LlmAgent) after LLM agent is refactored to inherit from BaseNode") shows _run_node_async is the intended single dispatch path going forward. Closing this gap removes the last pre-condition to collapsing Branch A and Branch B into one.

Secondary — defensive UX while the TODO is open

Log a one-time warning at Runner.__init__ (or first run_async call) when:

  1. the root is a BaseNode (not a BaseAgent subclass), AND
  2. app.plugins contains any plugin that overrides after_run_callback.

Something like:

WARNING: Plugin 'MemoryPersistPlugin' defines after_run_callback, but the
BaseNode root path in Runner does not currently dispatch it (see #5282
and runners.py:427). This callback will not fire. For memory writes on
Workflow roots, use a terminal FunctionNode until this is resolved.

This turns a silent data-loss bug into a startup-time warning. Users find out immediately, not after shipping and checking empty dashboards.
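
One cheap way to implement condition (2), detecting whether a plugin actually overrides after_run_callback, is an identity comparison against the base-class attribute. The classes below are stubs standing in for google.adk.plugins.base_plugin.BasePlugin and a user plugin; only the comparison itself is the suggestion.

```python
class BasePlugin:
    """Stand-in for google.adk.plugins.base_plugin.BasePlugin."""
    async def after_run_callback(self, *, invocation_context):
        return None

class MemoryPersistPlugin(BasePlugin):
    async def after_run_callback(self, *, invocation_context):
        ...  # would persist the session here

class PassivePlugin(BasePlugin):
    pass  # inherits the no-op; should not trigger the warning

def overrides_after_run(plugin: BasePlugin) -> bool:
    # An override replaces the function object on the subclass; identity
    # comparison against the base attribute detects that without calling it.
    return type(plugin).after_run_callback is not BasePlugin.after_run_callback

print(overrides_after_run(MemoryPersistPlugin()))  # True
print(overrides_after_run(PassivePlugin()))        # False
```

Gating the warning on this check keeps it quiet for plugins that only use the hooks that do fire (on_user_message, before_run, on_event) on BaseNode roots.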

Tertiary — documentation

Update adk.dev/sessions/memory/. The current memory page shows only the Agent(..., after_agent_callback=auto_save_session_to_memory_callback) pattern, with no mention that Workflow roots behave differently. Given Workflow(BaseNode) is a flagship 2.0 feature, the page should:

  • either add a Workflow section with the interim terminal-node workaround
  • or, ideally, after the primary fix lands, show one unified BasePlugin pattern that works for both root shapes

Non-ask (for the record)

We considered two alternative framings and explicitly do NOT recommend them:

  • "Make Workflow accept after_agent_callback" — patches deprecated API. BaseAgent is on the way out per runners.py:836-837.
  • "Make BaseNode use extra='forbid'" — would have caught our silent-drop trap at construction, but likely breaks kwargs forwarded from subclasses and is an API break. The secondary warning above gets the same DX benefit without the compatibility cost.

The primary ask — finishing _run_node_async's run_after_run_callback dispatch — is aligned with the direction 2.0 is already headed.

Metadata

Labels

core [Component]: This issue is related to the core interface and implementation
