Skip to content

[codex] Rewrite client connection architecture#2978

Open
juliusmarminge wants to merge 13 commits into
codex/hosted-web-tracing-pocfrom
codex/connection-state-audit
Open

[codex] Rewrite client connection architecture#2978
juliusmarminge wants to merge 13 commits into
codex/hosted-web-tracing-pocfrom
codex/connection-state-audit

Conversation

@juliusmarminge

@juliusmarminge juliusmarminge commented Jun 6, 2026

Copy link
Copy Markdown
Member

Summary

  • replace the legacy web and mobile connection runtimes with the shared Effect client-runtime architecture
  • centralize connection supervision, retry/backoff, online/offline handling, RPC session ownership, persistence, and projections
  • migrate web, mobile, and desktop integration points to the new connection catalog and runtime
  • add focused tests for transport lifecycle, supervision, registry ownership, subscriptions, authorization, and storage

Stack

Validation

  • vp check
  • vp run typecheck
  • vp run lint:mobile
  • client-runtime suite: 193 tests passed

The stacked tree is byte-for-byte identical to the original pre-split PR tip.


Note

High Risk
Touches encrypted credential/catalog persistence, cloud account transitions, and broad mobile connection/auth paths; incorrect migration or cleanup could drop saved environments or leave stale relay sessions.

Overview
This slice wires desktop, mobile, and supporting client-runtime integration for the shared connection catalog and supervision model.

Desktop adds an encrypted DesktopConnectionCatalogStore (Electron safe storage, atomic writes, discard on decrypt failure) with IPC/preload handlers and splits ElectronSafeStorage types into ElectronSafeStorageService. Cloud auth and saved-environment code import the new service module.

Mobile introduces a connection/ stack: SecureStore-backed catalog with legacy migration, platform layers for connectivity (expo-network), wakeups, cloud session, and full ConnectionTarget / credential / shell-thread cache persistence. UI moves from ad-hoc remote bootstrap to useWorkspaceState, entity atoms, and useConnectionController; settings environments get switch-based connect/disconnect, shared status phases (available / connected / offline / error), and trace-ID copy affordances. Cloud sign-in now serializes account cleanup (relay token cache, relay env removal, APNs unregister); managed relay DPoP tokens persist in SecureStore; agent-awareness device registration is queued and coalesced.

Imports shift to explicit @t3tools/client-runtime/* subpaths; connection status dots and empty states align with the new phase vocabulary. Minor fixes: cloud auth HTTP headers as plain objects, favicons via DPoP-aware remote HTTP headers, thread outbox drain at root layout.

Reviewed by Cursor Bugbot for commit 95fa2be. Bugbot is set up for automated code reviews on this repo. Configure here.

Note

Rewrite client connection architecture to use atom-backed environment registry and supervisor

  • Introduces a new client-runtime connection layer (packages/client-runtime/src/connection/) with an EnvironmentRegistry, EnvironmentSupervisor, ConnectionDriver, and ConnectionResolver that manage environment lifecycle, retry/backoff, and prepared connections (Bearer, Relay, SSH, Primary) via Effect services.
  • Replaces direct RPC client calls, Zustand store selectors, and imperative environment bootstrap hooks across web, mobile, and desktop with environment-scoped atom families (createEnvironmentQueryAtomFamily, createEnvironmentSubscriptionAtomFamily) for all major domains: threads, shell, VCS, terminal, server config, projects, source control, auth, and review.
  • Removes large imperative bootstrap hooks (useRemoteEnvironmentBootstrap, connectSavedEnvironment, disconnectEnvironment, ServerStateBootstrap, EnvironmentConnectionManagerBootstrap) in favor of the registry's connectionLayer and per-environment atoms.
  • Adds a persistent, schema-validated ConnectionCatalogDocument with platform-specific storage on web (IndexedDB), mobile (expo-secure-store), and desktop (DesktopConnectionCatalogStore with encrypted file), including one-time migration from legacy storage.
  • Introduces a ManagedRelayClient with DPoP token caching, serialized/coalesced device registration queue, and non-interfering relay tracing; relay environments are discovered via RelayEnvironmentDiscovery with status validation.
  • Splits @t3tools/client-runtime into explicit subpath exports (/connection, /authorization, /environment, /rpc, /relay, /state/*); bare root imports are now a lint error.
  • Adds a persistent thread outbox (state/thread-outbox) on mobile with retry backoff and background drain hook (useThreadOutboxDrain).
  • Risk: @t3tools/client-runtime root imports now fail at runtime; all consumers must use explicit subpaths. LocalApi browser implementation no longer proxies to a live backend — all server.* and shell.openInEditor calls reject immediately until a backend is paired via the new registry.

Macroscope summarized 95fa2be.

@coderabbitai

coderabbitai Bot commented Jun 6, 2026

Copy link
Copy Markdown

Important

Review skipped

Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: cdf61b50-ba96-4a5c-afc9-4a10efa286ef

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch codex/connection-state-audit

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions github-actions Bot added vouch:trusted PR author is trusted by repo permissions or the VOUCHED list. size:XXL 1,000+ changed lines (additions + deletions). labels Jun 6, 2026
@github-actions

github-actions Bot commented Jun 6, 2026

Copy link
Copy Markdown
Contributor

🚀 Expo continuous deployment is ready!

  • Project → t3-code
  • Platforms → android, ios
  • Scheme → t3code-preview
  🤖 Android 🍎 iOS
Fingerprint 30eb942851c2693b81dc607a4de2651c7bf7cc4c 092a8bb81d8fc363a745d8a7d891ae056a15f49b
Build Details Build Permalink
DetailsDistribution: INTERNAL
Build profile: preview:dev
Runtime version: 30eb942851c2693b81dc607a4de2651c7bf7cc4c
App version: 0.1.0
Git commit: 4352388e37307ecbc2c302774fb9c9d91061148d
Build Permalink
DetailsDistribution: INTERNAL
Build profile: preview:dev
Runtime version: 092a8bb81d8fc363a745d8a7d891ae056a15f49b
App version: 0.1.0
Git commit: 4352388e37307ecbc2c302774fb9c9d91061148d
Update Details Update Permalink
DetailsBranch: pr-2978
Runtime version: 30eb942851c2693b81dc607a4de2651c7bf7cc4c
Git commit: 4352388e37307ecbc2c302774fb9c9d91061148d
Update Permalink
DetailsBranch: pr-2978
Runtime version: 092a8bb81d8fc363a745d8a7d891ae056a15f49b
Git commit: 4352388e37307ecbc2c302774fb9c9d91061148d
Update QR

Comment thread apps/web/src/environments/runtime/service.ts Outdated
Comment thread apps/web/src/connection/appQueries.ts Outdated
Comment thread apps/web/src/connection/storage.ts
Comment thread apps/mobile/src/app/settings/environments.tsx
Comment thread apps/web/src/state/terminalSessions.ts
Comment thread packages/client-runtime/src/connection/resolver.ts Outdated
Comment thread packages/client-runtime/src/state/threads.ts
@juliusmarminge juliusmarminge force-pushed the codex/connection-state-audit branch from 16c9aba to 2a29e34 Compare June 7, 2026 19:53
@juliusmarminge juliusmarminge changed the title Harden remote connection state handling across mobile, web, and relay [codex] Rewrite client connection architecture Jun 7, 2026
@juliusmarminge juliusmarminge changed the base branch from main to codex/connection-infra-otel-base June 7, 2026 19:53
@juliusmarminge juliusmarminge force-pushed the codex/connection-state-audit branch from 2a29e34 to b976d5f Compare June 7, 2026 20:09
Comment on lines +87 to +113
const openDatabase = Effect.fn("web.connectionStorage.openDatabase")(function* () {
return yield* Effect.callback<IDBDatabase, ConnectionTransientError>((resume) => {
if (typeof indexedDB === "undefined") {
resume(
Effect.fail(catalogError("open", "IndexedDB is unavailable in this browser context.")),
);
return;
}
const request = indexedDB.open(DATABASE_NAME, DATABASE_VERSION);
request.addEventListener("upgradeneeded", () => {
if (!request.result.objectStoreNames.contains(CATALOG_STORE_NAME)) {
request.result.createObjectStore(CATALOG_STORE_NAME);
}
if (!request.result.objectStoreNames.contains(SHELL_STORE_NAME)) {
request.result.createObjectStore(SHELL_STORE_NAME);
}
if (!request.result.objectStoreNames.contains(THREAD_STORE_NAME)) {
request.result.createObjectStore(THREAD_STORE_NAME);
}
});
request.addEventListener("error", () => {
resume(Effect.fail(catalogError("open", request.error ?? "Unknown IndexedDB error")));
});
request.addEventListener("success", () => {
resume(Effect.succeed(request.result));
});
});

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Medium connection/webConnectionStorage.ts:87

When indexedDB.open() needs to upgrade the schema but another tab holds an older version, the blocked event fires and the Effect never completes — neither success nor error handlers execute until the blocking tab closes. If that tab never closes, openDatabase hangs indefinitely. Consider handling the blocked event by resuming with a descriptive error or triggering a retry with timeout.

+    request.addEventListener("blocked", () => {
+      resume(Effect.fail(catalogError("open", "Database upgrade blocked by another tab")));
+    });
🚀 Reply "fix it for me" or copy this AI Prompt for your agent:
In file @apps/web/src/connection/webConnectionStorage.ts around lines 87-113:

When `indexedDB.open()` needs to upgrade the schema but another tab holds an older version, the `blocked` event fires and the Effect never completes — neither `success` nor `error` handlers execute until the blocking tab closes. If that tab never closes, `openDatabase` hangs indefinitely. Consider handling the `blocked` event by resuming with a descriptive error or triggering a retry with timeout.

Evidence trail:
apps/web/src/connection/webConnectionStorage.ts lines 87-114 at REVIEWED_COMMIT: `openDatabase` registers handlers for `upgradeneeded` (line 96), `error` (line 107), and `success` (line 110), but not for `blocked`. IndexedDB spec: https://w3c.github.io/IndexedDB/#request-api — the `blocked` event fires on IDBOpenDBRequest when the open operation is blocked by existing connections, and `success`/`error` don't fire until the block is resolved.

@juliusmarminge juliusmarminge force-pushed the codex/connection-state-audit branch 2 times, most recently from fb36d5e to 8de82fb Compare June 7, 2026 20:23
Comment thread apps/web/src/connection/webConnectionPlatform.ts Outdated
Comment thread apps/mobile/src/features/cloud/CloudAuthProvider.tsx Outdated
@juliusmarminge juliusmarminge force-pushed the codex/connection-state-audit branch 4 times, most recently from 9ba3552 to a01b7f9 Compare June 7, 2026 20:45
@juliusmarminge juliusmarminge force-pushed the codex/connection-infra-otel-base branch from ced8bdf to 22b4b24 Compare June 7, 2026 20:50
@juliusmarminge juliusmarminge force-pushed the codex/connection-state-audit branch 4 times, most recently from 409fd6c to ecd84a7 Compare June 7, 2026 21:06
Base automatically changed from codex/connection-infra-otel-base to main June 8, 2026 03:21
juliusmarminge and others added 2 commits June 7, 2026 21:02
Co-authored-by: codex <codex@users.noreply.github.com>
Co-authored-by: codex <codex@users.noreply.github.com>
@juliusmarminge juliusmarminge force-pushed the codex/connection-state-audit branch from ecd84a7 to f5c0afe Compare June 8, 2026 04:09
@juliusmarminge juliusmarminge changed the base branch from main to codex/hosted-web-tracing-poc June 8, 2026 04:09
…t and tracing

- Updated the EnvironmentSupervisor to handle connection states more effectively, including new states for connecting and available.
- Introduced tracing for connection attempts to capture detailed error information and improve debugging.
- Removed the environment runtime state management code as it was deemed unnecessary.
- Adjusted tests to reflect changes in connection state handling and ensure proper functionality.
- Enhanced error handling in relay tracing to ensure safe error propagation without affecting application behavior.
Comment thread packages/client-runtime/src/connection/registry.ts Outdated
- Refactored EnvironmentRegistry to utilize new EnvironmentServices and EnvironmentServicesFactory.
- Replaced runtime-related methods with service-based methods for better abstraction.
- Introduced new layers for environment services, including commands and threads.
- Removed deprecated rpcGenerationChanges and other unused methods from EnvironmentRegistryService.
- Updated tests to reflect changes in the registry and runtime structure.
- Cleaned up unused imports and adjusted types accordingly.
…ment

- Deleted filesystemBrowseState and sourceControlDiscoveryState modules along with their associated tests.
- Removed related imports from index.ts and knownEnvironment.ts.
- Updated knownEnvironment tests to remove unused HTTP base URL checks.
- Refactored shellSnapshotState and threadDetailState to simplify interfaces.
- Cleaned up vcsRefState and vcsStatusState by removing unused types and functions.
);
if (!isSignedIn || !userId) {
setManagedRelaySession(appAtomRegistry, null);
if (previousAccount !== null) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Medium cloud/managedAuth.tsx:59

On first render when signed out, previousAccount is undefined, so undefined !== null evaluates to true and queueAccountCleanup() runs unnecessarily. This triggers removeRelayEnvironments() and relay client token cache reset even though no account was ever established. Change the condition to if (previousAccount) so cleanup only happens when transitioning from an actual signed-in state.

-      if (previousAccount !== null) {
+      if (previousAccount) {
🚀 Reply "fix it for me" or copy this AI Prompt for your agent:
In file @apps/web/src/cloud/managedAuth.tsx around line 59:

On first render when signed out, `previousAccount` is `undefined`, so `undefined !== null` evaluates to `true` and `queueAccountCleanup()` runs unnecessarily. This triggers `removeRelayEnvironments()` and relay client token cache reset even though no account was ever established. Change the condition to `if (previousAccount)` so cleanup only happens when transitioning from an actual signed-in state.

Evidence trail:
apps/web/src/cloud/managedAuth.tsx lines 26, 35, 37, 57-61, 63 at REVIEWED_COMMIT. Line 26: ref initialized to `undefined`. Line 35: `previousAccount = observedAccountRef.current` (undefined on first render). Line 59: condition `previousAccount !== null` doesn't account for `undefined`. Line 63: correctly checks all three states (`!== undefined && !== null && !== userId`).

? (activeSavedEnvironmentRuntime?.connectionState ?? "disconnected")
: "connected";
const primaryEnvironmentId = primaryEnvironment?.environmentId ?? null;
const activeEnvironment =

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Medium components/ChatView.tsx:1119

The refactored code removed the guard that excluded the primary environment from unavailability checks. Now activeEnvironmentUnavailable is computed for all environments including the primary, so if the primary's connection.phase is temporarily "connecting" or "available" during startup, the UI shows an unavailable banner and blocks message dispatch — behavior that never occurred before.

🚀 Reply "fix it for me" or copy this AI Prompt for your agent:
In file @apps/web/src/components/ChatView.tsx around line 1119:

The refactored code removed the guard that excluded the primary environment from unavailability checks. Now `activeEnvironmentUnavailable` is computed for all environments including the primary, so if the primary's `connection.phase` is temporarily `"connecting"` or `"available"` during startup, the UI shows an unavailable banner and blocks message dispatch — behavior that never occurred before.

Evidence trail:
Old guard at MERGE_BASE in ChatView.tsx: `activeThread.environmentId !== primaryEnvironmentId` check in `activeSavedEnvironmentRecord` assignment; new code at REVIEWED_COMMIT ChatView.tsx lines 1119-1123 with no primary guard; `AVAILABLE_CONNECTION_STATE` defined at packages/client-runtime/src/connection/model.ts:148-154 with `phase: "available"`; `activeEnvironmentUnavailable` used to block sends at ChatView.tsx:2806, ChatView.tsx:3427; used to block revert at ChatView.tsx:2746; used for banner at ChatView.tsx:1353; `isConnecting` always false at ChatView.tsx:904 (`_setIsConnecting` unused).

(patch: Partial<UnifiedSettings>) => {
const { serverPatch, clientPatch } = splitPatch(patch);

if (Object.keys(serverPatch).length > 0) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Medium hooks/useSettings.ts:204

When primaryEnvironment is null, server settings are optimistically applied locally via applySettingsUpdated() but the RPC call to serverActions.updateSettings() is skipped. The user sees the change in the UI, but on refresh the setting reverts because it was never persisted to the server.

🚀 Reply "fix it for me" or copy this AI Prompt for your agent:
In file @apps/web/src/hooks/useSettings.ts around line 204:

When `primaryEnvironment` is `null`, server settings are optimistically applied locally via `applySettingsUpdated()` but the RPC call to `serverActions.updateSettings()` is skipped. The user sees the change in the UI, but on refresh the setting reverts because it was never persisted to the server.

Evidence trail:
apps/web/src/hooks/useSettings.ts lines 197-225 (REVIEWED_COMMIT) — independent guards for optimistic update (line 206: checks `currentServerConfig`) vs RPC persist (line 209: checks `primaryEnvironment`). apps/web/src/connection/useWebEnvironments.ts lines 59-68 — `useWebPrimaryEnvironment()` returns `WebEnvironmentPresentation | null`. apps/web/src/rpc/serverState.ts line 58 — `serverConfigAtom` initialized to `null`. apps/web/src/rpc/serverState.ts line 176-178 — `resolveServerConfig` sets the atom. apps/web/src/connection/WebServerStateProjection.tsx lines 27-43 — sets config snapshot from primary environment data, but never clears atom when `environmentId` becomes null. apps/web/src/rpc/serverState.ts lines 169-172 — only test reset exists, no production clearing of stale config.

Comment thread apps/web/src/state/query.ts

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Medium

When the toast is created while primaryEnvironment is loading (null), clicking "Update" silently does nothing because runUpdates captures the stale null value in its closure. The effect re-runs when primaryEnvironment resolves, but it early-returns at line 174 because activeToastRef.current is already set, so the callback is never recreated with the resolved environment value.

🚀 Reply "fix it for me" or copy this AI Prompt for your agent:
In file @apps/web/src/components/ProviderUpdateLaunchNotification.tsx around line 186:

When the toast is created while `primaryEnvironment` is loading (`null`), clicking "Update" silently does nothing because `runUpdates` captures the stale `null` value in its closure. The effect re-runs when `primaryEnvironment` resolves, but it early-returns at line 174 because `activeToastRef.current` is already set, so the callback is never recreated with the resolved environment value.

Evidence trail:
apps/web/src/components/ProviderUpdateLaunchNotification.tsx lines 163-305 (REVIEWED_COMMIT): The useEffect at line 163 has primaryEnvironment in its deps (line 303). runUpdates at line 190 captures primaryEnvironment from closure; line 191 checks `!primaryEnvironment` and returns early if null. Line 179 adds notificationKey to seenProviderUpdateNotificationKeys. Line 173 checks seenProviderUpdateNotificationKeys.has(notificationKey), and line 174 checks activeToastRef.current — both are truthy on re-run, causing early return. The toast callback at line 269 (onClick: runUpdates) is never recreated with the resolved primaryEnvironment value.

Comment thread apps/mobile/src/features/threads/use-project-actions.ts
Split connection, authorization, RPC, state, and platform concerns into composable modules. Migrate web and mobile consumers to narrow subpath exports and remove legacy stores, barrels, and compatibility directories.

Co-authored-by: codex <codex@users.noreply.github.com>
Comment thread apps/mobile/src/connection/storage.ts
Comment on lines +86 to +94
function exitUnlessInterrupted<A, E, R>(
effect: Effect.Effect<A, E, R>,
): Effect.Effect<Exit.Exit<A, E>, never, R> {
return Effect.matchCauseEffect(effect, {
onFailure: (cause) =>
Cause.hasInterrupts(cause) ? Effect.interrupt : Effect.succeed(Exit.failCause(cause)),
onSuccess: (value) => Effect.succeed(Exit.succeed(value)),
});
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Medium connection/supervisor.ts:86

exitUnlessInterrupted uses Cause.hasInterrupts(cause) on line 91, which returns true when the cause contains any interrupt reason even if it also contains typed failures. When a Cause has both an interrupt and a typed failure (e.g., from race conditions), the function calls Effect.interrupt and discards the typed failure. The caller in failureFromExit then receives { _tag: "Interrupted" } instead of the actual error. Per the Cause documentation, hasInterruptsOnly should be used when you want to propagate interrupts only when there are no other failure reasons.

function exitUnlessInterrupted<A, E, R>(
  effect: Effect.Effect<A, E, R>,
): Effect.Effect<Exit.Exit<A, E>, never, R> {
  return Effect.matchCauseEffect(effect, {
    onFailure: (cause) =>
-      Cause.hasInterrupts(cause) ? Effect.interrupt : Effect.succeed(Exit.failCause(cause)),
+      Cause.hasInterruptsOnly(cause) ? Effect.interrupt : Effect.succeed(Exit.failCause(cause)),
    onSuccess: (value) => Effect.succeed(Exit.succeed(value)),
  });
}
🚀 Reply "fix it for me" or copy this AI Prompt for your agent:
In file @packages/client-runtime/src/connection/supervisor.ts around lines 86-94:

`exitUnlessInterrupted` uses `Cause.hasInterrupts(cause)` on line 91, which returns `true` when the cause contains *any* interrupt reason even if it also contains typed failures. When a `Cause` has both an interrupt and a typed failure (e.g., from race conditions), the function calls `Effect.interrupt` and discards the typed failure. The caller in `failureFromExit` then receives `{ _tag: "Interrupted" }` instead of the actual error. Per the `Cause` documentation, `hasInterruptsOnly` should be used when you want to propagate interrupts only when there are no other failure reasons.

Evidence trail:
packages/client-runtime/src/connection/supervisor.ts line 91 — `Cause.hasInterrupts(cause)` used in `exitUnlessInterrupted`
packages/client-runtime/src/connection/supervisor.ts line 177 — `Cause.hasInterruptsOnly(exit.cause)` used in `failureFromExit`
packages/client-runtime/src/connection/supervisor.ts line 438 — `!Cause.hasInterruptsOnly(establishment.exit.cause)` used in defect check
packages/client-runtime/src/state/shell.ts line 134 — `Cause.hasInterruptsOnly(cause)` in identical pattern
packages/client-runtime/src/state/threads.ts line 199 — `Cause.hasInterruptsOnly(cause)` in identical pattern
pnpm-workspace.yaml line 19 — `effect: 4.0.0-beta.78`

- Implemented `useEnvironmentQuery` for handling environment data fetching and error management.
- Created state management for relay, review, server, session, shell, source control, terminal, and threads.
- Introduced hooks for managing terminal sessions and actions, including attaching, writing, resizing, and clearing terminals.
- Added support for source control actions such as pulling, publishing repositories, and managing threads.
- Enhanced VCS integration with actions for listing refs and managing worktrees.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Medium

const loadBranches = useCallback(async () => {

The loadBranches callback no longer handles errors from branchState.refresh(). If the branch query fails, setPendingConnectionError is never called and users receive no feedback about the failure. Consider wrapping branchState.refresh() in a try-catch and calling setPendingConnectionError on error, or verify that branchState surfaces errors through a different mechanism that renders this handling unnecessary.

🚀 Reply "fix it for me" or copy this AI Prompt for your agent:
In file @apps/mobile/src/features/threads/new-task-flow-provider.tsx around line 388:

The `loadBranches` callback no longer handles errors from `branchState.refresh()`. If the branch query fails, `setPendingConnectionError` is never called and users receive no feedback about the failure. Consider wrapping `branchState.refresh()` in a try-catch and calling `setPendingConnectionError` on error, or verify that `branchState` surfaces errors through a different mechanism that renders this handling unnecessary.

Evidence trail:
- apps/mobile/src/features/threads/new-task-flow-provider.tsx lines 388-416 (REVIEWED_COMMIT): new `loadBranches` with no error handling
- git_diff MERGE_BASE..REVIEWED_COMMIT on same file: removed try-catch with `setPendingConnectionError("Failed to load branches.")` catch block
- apps/mobile/src/state/query.ts lines 24-36 (REVIEWED_COMMIT): `useEnvironmentQuery` returns `{ error, refresh }` where `refresh` is `useAtomRefresh` (returns `void`, line 29)
- git_grep for `branchState\.error` in apps/mobile: zero results — never consumed
- apps/mobile/src/state/use-remote-environment-registry.ts lines 26-32, 114-200: `pendingConnectionError` atom is the UI error surface, only set from `setPendingConnectionError`

(activeThread.session !== null && activeThread.session.status !== "stopped")),

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Medium components/ChatView.tsx:1808

The envLocked check at line 1808 only excludes status !== "stopped", but the new session model has two additional terminal states "interrupted" and "error". Threads with these statuses will incorrectly have envLocked = true, blocking users from changing branch/env-mode settings on effectively terminated threads. Consider updating the condition to also exclude "interrupted" and "error".

-      (activeThread.session !== null && activeThread.session.status !== "stopped")),
+      (activeThread.session !== null &&
+        activeThread.session.status !== "stopped" &&
+        activeThread.session.status !== "interrupted" &&
+        activeThread.session.status !== "error")),
🚀 Reply "fix it for me" or copy this AI Prompt for your agent:
In file @apps/web/src/components/ChatView.tsx around line 1808:

The `envLocked` check at line 1808 only excludes `status !== "stopped"`, but the new session model has two additional terminal states `"interrupted"` and `"error"`. Threads with these statuses will incorrectly have `envLocked = true`, blocking users from changing branch/env-mode settings on effectively terminated threads. Consider updating the condition to also exclude `"interrupted"` and `"error"`.

Evidence trail:
packages/contracts/src/orchestration.ts:249-257 (OrchestrationSessionStatus defines: idle, starting, running, ready, interrupted, stopped, error); apps/web/src/session-logic.ts:1252-1260 (derivePhase treats stopped/interrupted/error all as 'disconnected' terminal states); apps/web/src/components/ChatView.tsx:1805-1809 (envLocked check only excludes 'stopped'); apps/web/src/components/ChatView.tsx:1816 (envLocked blocks onEnvironmentChange); apps/web/src/components/ChatView.tsx:2497 (envLocked blocks canOverrideServerThreadEnvMode); apps/web/src/components/ChatView.tsx:3820 (envLocked passed to child component)

juliusmarminge and others added 4 commits June 9, 2026 18:41
- keep subscriptions alive across transport and supervisor replacement
- bound probes and improve retry, authorization, and state projections
- add deterministic regression coverage for connection lifecycle failures

Co-authored-by: codex <codex@users.noreply.github.com>
- self-heal persisted connection catalogs and clean stale temporary data
- preserve bearer authorization across edits and attachment requests
- centralize mobile runtime reuse and background outbox draining

Co-authored-by: codex <codex@users.noreply.github.com>
- project server snapshots and request latency into focused atoms
- surface slow RPC requests without coupling shared runtime to React
- reuse the web runtime and make hosted connection discovery resilient

Co-authored-by: codex <codex@users.noreply.github.com>
- recover desktop connection catalogs from partial writes
- remove stale temporary catalog files after recovery
- trust incoming relay trace context only after authentication

Co-authored-by: codex <codex@users.noreply.github.com>
@juliusmarminge juliusmarminge marked this pull request as ready for review June 10, 2026 02:43

@cursor cursor Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Autofix Details

Bugbot Autofix prepared fixes for both issues found in the latest run.

  • ✅ Fixed: Wrong empty state during sync
    • Added a check for hasPendingShellSnapshot && !hasLoadedShellSnapshot before the final fallthrough, returning a loading state instead of the false 'No projects found' message when shell data is still synchronizing on a connected environment.
  • ✅ Fixed: Session activates after failed cleanup
    • Changed queueAccountCleanup to return a Promise<boolean> indicating success (while keeping the transition chain alive via a separate void promise), and gated activateSession on the boolean so it only runs when all cleanup operations succeeded.

Create PR

Or push these changes by commenting:

@cursor push 98b896a5eb
Preview (98b896a5eb)
diff --git a/apps/mobile/src/app/new/index.tsx b/apps/mobile/src/app/new/index.tsx
--- a/apps/mobile/src/app/new/index.tsx
+++ b/apps/mobile/src/app/new/index.tsx
@@ -62,6 +62,14 @@
     };
   }
 
+  if (catalogState.hasPendingShellSnapshot && !catalogState.hasLoadedShellSnapshot) {
+    return {
+      title: "Loading projects",
+      detail: "Synchronizing shell data from the connected environment.",
+      loading: true,
+    };
+  }
+
   return {
     title: "No projects found",
     detail: "The connected environment did not report any projects.",

diff --git a/apps/mobile/src/features/cloud/CloudAuthProvider.tsx b/apps/mobile/src/features/cloud/CloudAuthProvider.tsx
--- a/apps/mobile/src/features/cloud/CloudAuthProvider.tsx
+++ b/apps/mobile/src/features/cloud/CloudAuthProvider.tsx
@@ -48,8 +48,8 @@
         readonly userId: string;
         readonly provider: () => Promise<string | null>;
       } | null,
-    ) => {
-      accountTransitionRef.current = accountTransitionRef.current.then(async () => {
+    ): Promise<boolean> => {
+      const transition = accountTransitionRef.current.then(async () => {
         const cleanup = [
           resetManagedRelayTokenCache(),
           removeRelayEnvironments(),
@@ -58,13 +58,17 @@
             : []),
         ];
         const results = await Promise.allSettled(cleanup);
+        let succeeded = true;
         for (const result of results) {
           if (result.status === "rejected") {
             console.warn("[t3-cloud] cloud account cleanup failed", result.reason);
+            succeeded = false;
           }
         }
+        return succeeded;
       });
-      return accountTransitionRef.current;
+      accountTransitionRef.current = transition.then(() => {});
+      return transition;
     };
 
     if (!isSignedIn || !userId) {
@@ -102,7 +106,9 @@
       previousTokenProviderRef.current = null;
       setAgentAwarenessRelayTokenProvider(null);
       setManagedRelaySession(appAtomRegistry, null);
-      void queueAccountCleanup(previous).then(activateSession);
+      void queueAccountCleanup(previous).then((succeeded) => {
+        if (succeeded) activateSession();
+      });
     } else {
       void accountTransitionRef.current.then(activateSession);
     }

You can send follow-ups to the cloud agent here.

Comment thread apps/mobile/src/app/new/index.tsx
void queueAccountCleanup(previous).then(activateSession);
} else {
void accountTransitionRef.current.then(activateSession);
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Session activates after failed cleanup

Medium Severity

On cloud account change or sign-in after sign-out, activateSession runs after queueAccountCleanup finishes, but cleanup only logs rejected work from Promise.allSettled. If removeRelayEnvironments or related cleanup fails, the new Clerk session and relay token provider still activate, leaving prior relay registrations on the device under the new account context.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 4b2e001. Configure here.

@macroscopeapp

macroscopeapp Bot commented Jun 10, 2026

Copy link
Copy Markdown
Contributor

Approvability

Verdict: Needs human review

8 blocking correctness issues found. Diff is too large for automated approval analysis. A human reviewer should evaluate this PR.

You can customize Macroscope's approvability policy. Learn more.

@cursor cursor Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 2 potential issues.

There are 3 total unresolved issues (including 1 from previous review).

Fix All in Cursor

Bugbot Autofix prepared fixes for both issues found in the latest run.

  • ✅ Fixed: Corrupt catalog skips legacy fallback
    • Changed decodeCatalog error handling to use Effect.option and fall through to the legacy migration path instead of immediately returning EMPTY_CONNECTION_CATALOG_DOCUMENT.
  • ✅ Fixed: Cloud switch blocks error retry
    • Changed switch value from connectionState !== "available" to only be true for "connected", "connecting", or "reconnecting" states, allowing users to toggle ON from error/offline states to trigger reconnect.

Create PR

Or push these changes by commenting:

@cursor push 0369ca79fd
Preview (0369ca79fd)
diff --git a/apps/mobile/src/app/settings/environments.tsx b/apps/mobile/src/app/settings/environments.tsx
--- a/apps/mobile/src/app/settings/environments.tsx
+++ b/apps/mobile/src/app/settings/environments.tsx
@@ -259,7 +259,11 @@
         props.onDisconnect();
       }}
       onToggleError={props.onToggleError}
-      value={props.environment.connectionState !== "available"}
+      value={
+        props.environment.connectionState === "connected" ||
+        props.environment.connectionState === "connecting" ||
+        props.environment.connectionState === "reconnecting"
+      }
     />
   );
 }

diff --git a/apps/mobile/src/connection/catalog-store.ts b/apps/mobile/src/connection/catalog-store.ts
--- a/apps/mobile/src/connection/catalog-store.ts
+++ b/apps/mobile/src/connection/catalog-store.ts
@@ -66,17 +66,25 @@
       return cached.value;
     }
     const raw = yield* storage.getItem(CONNECTION_CATALOG_KEY);
-    let catalog: ConnectionCatalogDocumentType;
+    let catalog: ConnectionCatalogDocumentType = EMPTY_CONNECTION_CATALOG_DOCUMENT;
+    let catalogLoaded = false;
     if (raw !== null && raw.trim() !== "") {
-      catalog = yield* decodeCatalog(raw).pipe(
-        Effect.catch((error) =>
-          Effect.logWarning("Discarding corrupt mobile connection catalog", error).pipe(
-            Effect.andThen(storage.deleteItem(CONNECTION_CATALOG_KEY)),
-            Effect.as(EMPTY_CONNECTION_CATALOG_DOCUMENT),
-          ),
+      const decoded = yield* decodeCatalog(raw).pipe(
+        Effect.option,
+        Effect.tap((result) =>
+          Option.isNone(result)
+            ? Effect.logWarning("Discarding corrupt mobile connection catalog").pipe(
+                Effect.andThen(storage.deleteItem(CONNECTION_CATALOG_KEY)),
+              )
+            : Effect.void,
         ),
       );
-    } else {
+      if (Option.isSome(decoded)) {
+        catalog = decoded.value;
+        catalogLoaded = true;
+      }
+    }
+    if (!catalogLoaded) {
       const legacyRaw = yield* storage.getItem(LEGACY_CONNECTIONS_KEY);
       catalog =
         legacyRaw === null || legacyRaw.trim() === ""

You can send follow-ups to the cloud agent here.

Reviewed by Cursor Bugbot for commit 95fa2be. Configure here.

yield* storage.setItem(CONNECTION_CATALOG_KEY, encoded);
yield* storage.deleteItem(LEGACY_CONNECTIONS_KEY);
}
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Corrupt catalog skips legacy fallback

High Severity

When CONNECTION_CATALOG_KEY holds non-empty but invalid data, loadUnlocked discards it and returns an empty catalog without ever reading LEGACY_CONNECTIONS_KEY. A partially migrated device can lose saved connections even when legacy storage is still intact.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 95fa2be. Configure here.

props.onDisconnect();
}}
onToggleError={props.onToggleError}
value={props.environment.connectionState !== "available"}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cloud switch blocks error retry

Medium Severity

ConnectedCloudEnvironmentRow drives the switch with connectionState !== "available" and only calls reconnect when the switch turns on. In error, connecting, reconnecting, or offline, the switch is already on, so users cannot trigger onReconnectEnvironment without first turning it off, which invokes disconnect.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 95fa2be. Configure here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size:XXL 1,000+ changed lines (additions + deletions). vouch:trusted PR author is trusted by repo permissions or the VOUCHED list.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant