Skip to content

fix(cli): reconcile hybrid stitch+stamp identity with shared LegacyIdentityStitch service#5607

Merged
jgoux merged 7 commits into
developfrom
sean/growth-891-reconcile-stitch-shared-service
Jun 17, 2026
Merged

fix(cli): reconcile hybrid stitch+stamp identity with shared LegacyIdentityStitch service#5607
jgoux merged 7 commits into
developfrom
sean/growth-891-reconcile-stitch-shared-service

Conversation

@seanoliver

Copy link
Copy Markdown
Contributor

Problem

The #5366 gate stopped the ephemeral-env $identify spike, but at the cost of attribution: in CI, Docker, and npx supabase, cli_* events stay orphaned on throwaway device IDs and never link to the authenticated user. GROWTH-891 (#5559) fixes that with a hybrid stitch+stamp model.

While #5559 was in review, #5579 (db lint/advisors port) landed on develop and independently extracted the legacy identity stitch into a shared LegacyIdentityStitch service — one per-command stitchAttempted guard so the advisor transports alias at most once. That's the architecture we want, but it's a port of the pre-891 behavior: it only stamps when it aliases (persistent, first login), so it doesn't restore CI/Docker/npx attribution; it sets stitchAttempted after the file-read yield; and it reads the runtime.distinctId field that 891 replaced with a mutable identity slot. Merging #5559 on top as-is would silently drop the attribution feature and reintroduce the race.

This PR reconciles the two: keep #5579's shared-service architecture, fold the hybrid stitch+stamp behavior into it. Supersedes #5559.

Changes

  • The shared LegacyIdentityStitch now stamps everywhere. On the first authenticated response the user UUID is stamped into runtime.identity in every runtime, so captures in CI/Docker/npx carry the real user. The $create_alias (pre-login history merge) and the telemetry.json write still only happen on a persistent machine.
  • Hardening preserved: stitchAttempted is set before the first yield (no double-stitch race); when an identity already exists we stamp without aliasing (never merge two person graphs); alias fires at most once across all transports sharing the service.
  • stitchedDistinctId() returns runtime.identity.current() so the post-run cli_command_executed is attributed to the real user in every runtime, including steady state.
  • legacy-analytics.layer.ts resolves distinctId from the identity slot while keeping develop's already-keyed groups map.
  • Stitch behavior tests live in legacy-identity-stitch.integration.test.ts (CI-stamp-no-alias, stale-identity-stamp-no-alias, concurrent-alias-once); the platform-api layer test keeps develop's service-mocked wiring. A few command test runtimes still using the removed distinctId field were updated to makeTelemetryIdentity.
  • Brings the Go + next-TS 891 changes (logout identity reset + device-id rotation, the redundant $identify removal, ADR 0013) along through the merge.

Testing

Typecheck clean, full unit suite (1318) green, and the affected integration suites (identity-stitch, platform-api, login, logout, advisors, lint, services, gen/types, issue, linked-project-cache) pass under bun. Also ran an independent Codex review focused on the spike-regression risk — it confirmed no alias in ephemeral runtimes, alias-at-most-once across transports, the pre-yield race guard, and the no-cross-graph-merge invariant, with no findings.

GROWTH-891

seanoliver and others added 7 commits June 11, 2026 17:10
Two concurrent writers in the same millisecond shared a tmp path and raced
the rename into ENOENT. Surfaced as a cross-file flake in the login/logout
integration tests, but two real CLI processes could hit it the same way.
After the first authenticated API call, the user UUID is stashed in process
memory and used as distinct_id on all subsequent capture events (stamping,
zero extra PostHog events). The $create_alias merge and the telemetry.json
write now only happen in persistent runtimes, and the gate lives inside the
stitch owner on every surface (Go StitchLogin, legacy TS stitchLogin +
platform-api layer, next TS login) so no call site can forget it. Logout
clears both the in-memory and persisted identity on all surfaces, and the
redundant $identify that GROWTH-890 missed in next/login is gone.

This restores per-user attribution for cli_* events in CI/Docker/npx (lost
to the #5366 gate) without re-introducing the identify/alias volume spike.
See docs/adr/0013-hybrid-stitch-stamp-identity-attribution.md.

GROWTH-891
Three fixes from an independent review of the hybrid stitch+stamp change:

- Stamp the in-memory identity from the first authenticated response even
  when a persisted distinct_id already exists (stale disk identity + a
  different live token previously kept attributing events to the old user).
  Alias only the first identity a device ever sees — re-aliasing on
  re-login, or on the login command's direct StitchLogin call after the
  response hook already stitched, would merge a second user into the
  device's existing person graph.
- Mark the legacy TS stitch attempt before its first yield point so
  concurrent authenticated responses cannot double-stitch.
- Clear the telemetry identity on logout even when no token exists; a stale
  distinct_id can outlive the token.
Second-round review findings:

- Login as A, logout, login as B re-aliased the same device — a device
  already merged into A's person graph. Logout now resets the identity
  entirely (new ResetIdentity / resetIdentity): forget the user AND rotate
  the device id, so the next login aliases a fresh device. Transient
  failure paths keep ClearDistinctID and preserve the device id.
- A failed alias enqueue left the in-memory identity set, so the login
  command's follow-up StitchLogin treated the identity as already aliased
  and permanently skipped the history merge. The identity is now un-stashed
  on enqueue failure, keeping the first-identity gate retryable.
…red service

develop's #5579 (db lint/advisors port) extracted the legacy identity stitch
into a shared LegacyIdentityStitch service with a single per-command
stitchAttempted guard — exactly the architecture we want. But it was a port of
the pre-891 behavior: it only set a distinct_id when it aliased (persistent,
first-login), never stamped in CI/Docker/npx, set stitchAttempted after the
file-read yield, and read the now-removed runtime.distinctId.

This merge keeps #5579's shared-service architecture and folds the hybrid
stitch+stamp behavior into it:

- Stamp the in-memory identity (runtime.identity) on the first authenticated
  response in every runtime, so captures in CI/Docker/npx carry the real user
  id; alias + telemetry.json write still only happen on a persistent machine.
- Set stitchAttempted before the first yield (concurrency guard).
- Stamp without aliasing when an identity already exists (avoids merging two
  person graphs).
- stitchedDistinctId() now returns runtime.identity.current() so the post-run
  cli_command_executed event is attributed in every runtime.

legacy-analytics resolves to runtime.identity.current() for distinctId while
keeping develop's already-keyed groups map. Stitch behavior assertions moved to
legacy-identity-stitch.integration.test.ts (added CI-stamp-no-alias,
stale-identity-stamp-no-alias, concurrent-alias-once); the platform-api layer
test keeps develop's service-mocked wiring. Fixed test runtimes still using the
removed distinctId field to use makeTelemetryIdentity.

GROWTH-891
@seanoliver seanoliver requested a review from a team as a code owner June 17, 2026 19:50
@github-actions

Copy link
Copy Markdown

Supabase CLI preview

npx --yes https://pkg.pr.new/supabase/cli/supabase@509c563f965335ba421b15cfdacda88491ffb44d

Preview package for commit 509c563.

@seanoliver

Copy link
Copy Markdown
Contributor Author

Heads up on CI: the only red check here is Codegen, and it's not from this PR — this branch touches zero codegen/API files.

go generate pulls the live staging spec from api.supabase.green/api/v1-yaml, and that spec currently has a schema oapi-codegen v2.4.1 can't handle:

ProjectUpgradeEligibilityResponse.warnings[] uses a discriminator (propertyName: type) with inline oneOf branches and no mapping → "discriminator: not all schemas were mapped". The three warning types (pg_graphql_introspection_change, ltree_reindex_required, operator_estimator_gate) look like a recent platform addition, so this is breaking Codegen on all current CLI PRs.

Fix is upstream: add the discriminator mapping (or switch the oneOf branches to $refs) in the API spec, or adjust the cli's types.cfg.yaml to tolerate it. Everything else on this PR is green (unit + integration, all 3 e2e shards, lint, code quality).

@seanoliver seanoliver requested review from Coly010 and jgoux June 17, 2026 20:11
@jgoux jgoux added this pull request to the merge queue Jun 17, 2026
Merged via the queue into develop with commit cbbcd06 Jun 17, 2026
32 of 34 checks passed
@jgoux jgoux deleted the sean/growth-891-reconcile-stitch-shared-service branch June 17, 2026 20:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants