Workflow boards: kanban state machines that drive coding agents #3032
ccdwyer wants to merge 298 commits into
Constraint: Task 7 needed drill-in UI on the current workflow detail contract without inventing missing step thread fields.
Rejected: Building a new diff renderer | The app already has getRenderablePatch and FileDiff wiring for checkpoint diffs.
Confidence: high
Scope-risk: moderate
Directive: Extend WorkflowTicketDetailView before adding live per-step thread activity; do not infer thread IDs in the UI.
Tested: pnpm --dir apps/web exec vp test run src/components/board/TicketDrawer.test.tsx; pnpm --dir apps/web exec vp test run src/components/board; pnpm --filter @t3tools/web typecheck; pnpm exec vp check
Not-tested: Live provider question/approval resume and diff refresh against a running board.

Constraint: Task 8 required a board registration path and sample board while staying within v1 scope.
Rejected: Server startup board discovery | The plan allowed a register-board action, and that avoids speculative project discovery at boot.
Confidence: high
Scope-risk: moderate
Directive: Keep the sample board lintable on a default install; update the decode/lint test when provider naming changes.
Tested: pnpm --dir apps/web exec vp test run src/components/board; pnpm --dir apps/server exec vp test run src/workflow/sampleBoardFile.test.ts; pnpm --filter @t3tools/web typecheck; pnpm --filter t3 typecheck; pnpm exec vp check
Not-tested: Full live board run-through with a real provider and screenshot.
… tokens M5 verification showed the token endpoint still parsed the pre-workflow scope whitelist while clients and pairing defaults had moved to the exported contract scope lists. This keeps token exchange and tests coupled to the contract rather than stale literals. Constraint: Workflow scopes are now part of AuthEnvironmentScope and must be requestable through OAuth token exchange. Rejected: Leaving tests on hand-written scope arrays | They hid drift when workflow scopes became standard/admin contract members. Confidence: high Scope-risk: narrow Directive: Prefer AuthStandardClientScopes/AuthAdministrativeScopes in auth tests instead of duplicated scope literals. Tested: pnpm --dir apps/server exec vp test run src/bin.test.ts src/auth/EnvironmentAuth.test.ts src/auth/EnvironmentAuthAdmin.test.ts src/auth/PairingGrantStore.test.ts src/auth/SessionStore.test.ts src/server.test.ts; pnpm --dir apps/web exec vp test run src/localApi.test.ts; pnpm exec vp run typecheck; pnpm exec vp check; pnpm --filter @t3tools/contracts test; pnpm --filter @t3tools/web test; pnpm --filter t3 test; pnpm --dir apps/server exec vp test run src/server.test.ts Not-tested: none
Real workflow execution was green only through stubs; the live path needed durable recovery, provider-question waits, repo-root worktrees, and hard supersede handling to satisfy the v1 invariants. Constraint: Fixes were driven by docs/superpowers/reviews/2026-06-07-workflow-boards-v1-adversarial-review.md and the v1 design invariants. Rejected: Stub-only fixes | they would preserve the broken end-to-end runtime path. Confidence: high Scope-risk: broad Directive: Keep future workflow changes covered by real-path tests with temp git repos and provider waits, not only stubbed unit tests. Tested: pnpm exec vp run typecheck; pnpm exec vp check; cd apps/server && pnpm exec vp test run src/workflow; pnpm --filter @t3tools/contracts test Not-tested: full non-workflow server/web suites beyond the required workflow and contracts gates
Recovery now interrupts stale pre-restart projection turns, restarts provider dispatches without rebinding to dead turns, and hands terminal results back to the engine so autonomous pipelines route after restart. Provider question waits now come from real user-input activity projection and the engine re-awaits terminal completion after answers before starting the next step. Constraint: Fix reviewed residuals without adding v2 workflow features or weakening stub-free real-path tests. Rejected: Stubbing TurnProjectionPort or pending approvals in real-path coverage | Those were the exact seams hiding the restart and user-input bugs. Confidence: high Scope-risk: moderate Directive: Keep provider wait completion tied to dispatch terminal state; do not confirm a step merely because the user answered a provider question. Tested: cd apps/server && pnpm exec vp test run src/workflow; pnpm exec vp run typecheck; pnpm exec vp check Not-tested: Full application browser workflow; changes are server workflow runtime only.
Outer pipeline failures should leave a durable blocked-ticket record and an operator-visible warning instead of disappearing behind recovery behavior. The regression now asserts the real event store contains the error detail and the warning is emitted. Constraint: Low-severity polish only; step-level failures continue to use StepFailed routing. Rejected: Routing orchestration failures through step failure handling | There may be no started step when the pipeline wrapper fails. Confidence: high Scope-risk: narrow Directive: Keep the outer pipeline catch interrupt-aware; manual supersede interrupts must not block tickets. Tested: cd apps/server && pnpm exec vp test run src/workflow Not-tested: Full repo typecheck/check will run after the polish pass.
Recovered provider monitors intentionally fork independently so workflow startup is not held open by provider terminal waits. The comment records that these restart-window continuations are not tracked as live pipeline fibers and therefore cannot be interrupted by manual moves. Constraint: Optional low-risk polish only; do not destabilize passing restart recovery behavior. Rejected: Reworking recovery continuations through live pipeline fiber tracking | That changes concurrency and interruption semantics beyond the requested low-risk pass. Confidence: high Scope-risk: narrow Directive: Revisit this only with dedicated recovery concurrency tests. Tested: cd apps/server && pnpm exec vp test run src/workflow Not-tested: Full repo typecheck/check will run after the polish pass.
The TurnStateReader tests now include one focused path through ProjectionTurnRepositoryLive and TurnProjectionPortLive so running, completed, and error states are covered without the TurnProjectionPort stub seam. Constraint: Optional low-risk test coverage only; no production behavior change. Rejected: Broad real-path workflow expansion | The requested seam is covered by a focused projection test. Confidence: high Scope-risk: narrow Directive: Keep state-mapping tests close to TurnStateReader rather than relying only on workflow runtime scenarios. Tested: pnpm exec vp test run src/workflow/Layers/TurnStateReader.test.ts; cd apps/server && pnpm exec vp test run src/workflow Not-tested: Full repo typecheck/check will run after the polish pass.
Make adding a board to a project as easy as a new chat: multiple named file-backed boards listed in the sidebar (board icon), one-click "Add board" that writes a templated .t3/boards/<slug>.json defaulting to the user's most recent agent, and server-side discovery/registration of board files. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Server-resolved workspace root (drop client repoRoot/filePath; remove trust-prone registerBoardFromFile); board list as a separate source from the registry (invalid files surface as error entries); net-new per-project board-list store slice; grouped-project member picker for "Add board"; modelSelection from full threads with availability-filtered providers; loader path split; file-deletion unregister; exclusive-create writes; BoardSnapshot.projectId; full RPC layer wiring enumerated. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
14 TDD tasks: default template + slug helpers, BoardListEntry/createBoard contracts, registry unregister + read-model list/delete, loader path split, project workspace resolver, BoardDiscovery, listBoards/createBoard handlers (server-resolved root; registerBoardFromFile removed), runtime+watcher wiring, client RPC wiring, resolveRecentAgent, board-list store slice, sidebar board rows + Add board, board route cleanup. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- Task 5: keep repo compiling (shim the existing registerBoardFromFile caller to the new loader signature until Task 8 removes it).
- Task 3: BoardSnapshot.projectId is top-level; remove registerBoardFromFile everywhere (ipc.ts, client runtime, EnvironmentApi, scope, mocks).
- Task 6: resolver returns a dedicated tagged error; unwrap Option from getProjectShellById; don't mistype projection errors.
- Task 8: add WorkspaceFileSystem.createFileExclusive (wx); createBoard uses it.
- Task 9: discovery is on-demand via listBoards; server file-watcher/boardsChanged push explicitly deferred (no generic project-visible hook exists).
- Task 11: read recent agent from thread SHELLS; filter providers by installed.
- Task 12: refetch on mount + after createBoard (no push).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
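The "(wx)" in Task 8 refers to Node's exclusive-create file flag, which fails instead of overwriting when the target already exists. A minimal sketch of that behavior follows; the function below is a hypothetical stand-in, not the PR's WorkspaceFileSystem.createFileExclusive, and the file names are made up for illustration.

```typescript
import { mkdtempSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical stand-in for an exclusive-create write.
// The "wx" flag makes the write fail with EEXIST if the path already exists.
function createFileExclusive(path: string, contents: string): "created" | "exists" {
  try {
    writeFileSync(path, contents, { flag: "wx" });
    return "created";
  } catch (error) {
    if ((error as { code?: string }).code === "EEXIST") return "exists";
    throw error; // anything else (permissions, missing directory) is a real failure
  }
}

// Usage: the second write does not clobber the first board file.
const dir = mkdtempSync(join(tmpdir(), "boards-"));
const file = join(dir, "delivery.json");
const first = createFileExclusive(file, '{"name":"Delivery"}');
const second = createFileExclusive(file, '{"name":"Other"}');
const onDisk = readFileSync(file, "utf8");
```

Because the existence check and the write happen in one system call, two concurrent "Add board" requests for the same slug cannot both succeed.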
Constraint: Board Creation UX plan requires the default board template before create/discovery wiring.
Rejected: Hand-authoring ad hoc test fixtures | the default board should be generated by the same helper createBoard will use.
Confidence: high
Scope-risk: narrow
Directive: Keep this template pure; provider validation belongs to loader/discovery layers.
Tested: cd apps/server && pnpm exec vp test run src/workflow/defaultBoard.test.ts; pnpm exec vp run typecheck; pnpm exec vp check; cd apps/server && pnpm exec vp test run src/workflow
Not-tested: Full app manual create-board flow is not implemented yet.

Constraint: Board creation needs deterministic, collision-safe file slugs before createBoard writes files.
Rejected: Inline slug logic in RPC handlers | shared pure helpers keep discovery and createBoard aligned.
Confidence: high
Scope-risk: narrow
Directive: Keep slug generation deterministic; idempotent discovery depends on it.
Tested: cd apps/server && pnpm exec vp test run src/workflow/boardSlug.test.ts; pnpm exec vp run typecheck; pnpm exec vp check; cd apps/server && pnpm exec vp test run src/workflow
Not-tested: createBoard slug collision flow is not wired yet.
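As a rough illustration of "deterministic, collision-safe" slugs, here is one way such pure helpers could look. The function names, the "board" fallback, and the numeric-suffix scheme are assumptions for this sketch; the PR's actual boardSlug module is not shown in this thread.

```typescript
// Hypothetical helpers, not the PR's boardSlug implementation.

// Lowercase the name and collapse every run of non-alphanumerics into one dash.
function slugify(name: string): string {
  const slug = name
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, ""); // trim leading and trailing dashes
  return slug.length > 0 ? slug : "board"; // assumed fallback for empty names
}

// Append the lowest free numeric suffix, so the result depends only on the inputs.
function uniqueSlug(name: string, taken: readonly string[]): string {
  const base = slugify(name);
  if (!taken.includes(base)) return base;
  let n = 2;
  while (taken.includes(`${base}-${n}`)) n++;
  return `${base}-${n}`;
}
```

Keeping both functions free of I/O means discovery and createBoard can call the same code and agree on the file name for a given board name.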
…omFile
Constraint: New workflow RPC entries must be in WsRpcGroup now, but the plan keeps the legacy register handler alive until Task 8 for the loader-path transition.
Rejected: Removing registerBoardFromFile handlers in Task 3 | Task 5 and Task 8 explicitly sequence that removal later.
Confidence: medium
Scope-risk: moderate
Directive: Replace the temporary listBoards/createBoard handler failures with real BoardDiscovery/createBoard wiring in Task 8, then remove registerBoardFromFile everywhere.
Tested: pnpm --filter @t3tools/contracts test -- workflow.test.ts; pnpm exec vp run typecheck; pnpm exec vp check; cd apps/server && pnpm exec vp test run src/workflow; pnpm --filter @t3tools/contracts test
Not-tested: listBoards/createBoard runtime behavior is intentionally not implemented until later tasks.

Constraint: Discovery must unregister deleted board files and list boards per project from projection rows.
Rejected: Keeping stale projection rows on delete | deleted files must disappear from listBoards results.
Confidence: high
Scope-risk: narrow
Directive: Do not cascade into ticket runtime state from deleteBoard; this method removes board-list projection rows only.
Tested: cd apps/server && pnpm exec vp test run src/workflow/Layers/BoardRegistry.test.ts src/workflow/Layers/WorkflowReadModel.test.ts; pnpm exec vp run typecheck; cd apps/server && pnpm exec vp test run src/workflow; pnpm exec vp check
Not-tested: File-discovery delete flow is not wired until Task 7.

Constraint: create/list discovery must resolve board files under a project workspace but persist workspace-relative paths.
Rejected: Continuing to store absolute or client-provided paths | sidebar board rows need portable relative file paths and server-side root resolution.
Confidence: high
Scope-risk: moderate
Directive: Task 8 should remove the temporary registerBoardFromFile shim after createBoard/listBoards are real.
Tested: cd apps/server && pnpm exec vp test run src/workflow/Layers/WorkflowFileLoader.test.ts; cd apps/server && pnpm exec vp test run src/workflow/Layers/WorkflowRpcHandlers.test.ts; pnpm exec vp run typecheck; cd apps/server && pnpm exec vp test run src/workflow; pnpm exec vp check
Not-tested: BoardDiscovery scanning is not implemented until Task 7.

Constraint: Board creation must resolve workspace roots server-side from projectId, never from client paths.
Rejected: Passing repo paths from the web client | this preserves the invariant that server projections own workspace roots.
Confidence: high
Scope-risk: narrow
Directive: Use this port in BoardDiscovery and createBoard instead of accepting paths in RPC payloads.
Tested: cd apps/server && pnpm exec vp test run src/workflow/Layers/ProjectWorkspaceResolver.test.ts; pnpm exec vp run typecheck; cd apps/server && pnpm exec vp test run src/workflow; pnpm exec vp check
Not-tested: Real projection SQL lookup is covered by existing ProjectionSnapshotQuery tests, not duplicated here.

Constraint: listBoards must discover file-backed boards on demand and unregister rows for files that disappear.
Rejected: Server push/watchers for board changes | v1 explicitly defers watchers and uses on-demand list refresh.
Confidence: high
Scope-risk: moderate
Directive: Keep discovery idempotent and keep invalid files as error entries without registering them.
Tested: cd apps/server && pnpm exec vp test run src/workflow/Layers/BoardDiscovery.test.ts; pnpm exec vp run typecheck; cd apps/server && pnpm exec vp test run src/workflow; pnpm exec vp check
Not-tested: RPC list/create handlers are not wired until Task 8.
…oardFromFile Constraint: Removing registerBoardFromFile everywhere required client/runtime/test-harness cleanup and workflow provider wiring in the same compile boundary. Rejected: Leaving temporary registerBoardFromFile shims until later tasks | the plan invariant requires grep-clean removal in Task 8. Confidence: high Scope-risk: moderate Directive: Keep board creation server-resolved; do not reintroduce client-supplied board file paths. Tested: cd apps/server && pnpm exec vp test run src/workspace/Layers/WorkspaceFileSystem.test.ts; cd apps/server && pnpm exec vp test run src/workflow/Layers/WorkflowRpcHandlers.test.ts; cd apps/server && pnpm exec vp test run src/workflow; pnpm --filter @t3tools/contracts test; pnpm exec vp run typecheck; pnpm exec vp check; rg registerBoardFromFile packages/contracts apps/server/src apps/web/src packages/client-runtime/src -n Not-tested: end-to-end browser sidebar creation flow is covered in later web tasks.
Constraint: Task 8's grep-clean handler removal made the runtime wiring necessary before that compile boundary, so this commit records the verified Task 9 boundary. Rejected: Reintroducing a temporary registerBoardFromFile shim to defer runtime wiring | it violates the Task 8 removal invariant. Confidence: high Scope-risk: narrow Directive: Keep discovery on-demand through listBoards; do not add a watcher or boardsChanged push for v1. Tested: pnpm exec vp run typecheck; cd apps/server && pnpm exec vp test run src/workflow; pnpm exec vp check Not-tested: no additional runtime code changed after Task 8.
Constraint: registerBoardFromFile client/runtime entries were removed at the Task 8 compile boundary; this task completes the web helper surface. Rejected: Passing a board file path through the helper | createBoard is server-resolved by projectId/name/agent only. Confidence: high Scope-risk: narrow Directive: Keep board creation callers using EnvironmentApi.workflow.createBoard with no client path field. Tested: cd apps/web && pnpm exec vp test run src/workflow/boardRpc.test.ts; cd apps/web && pnpm exec vp test run src/workflow; pnpm exec vp run typecheck; pnpm exec vp check Not-tested: sidebar create button flow is covered in later tasks.
Constraint: Board creation must default to the user's most recent available agent without reading full thread details. Rejected: Reading sidebar summaries for modelSelection | summaries intentionally omit modelSelection, so the resolver uses thread shells. Confidence: high Scope-risk: narrow Directive: Keep availability gated by enabled, installed, and isAvailable provider entries. Tested: cd apps/web && pnpm exec vp test run src/workflow/resolveRecentAgent.test.ts; cd apps/web && pnpm exec vp test run src/workflow; pnpm exec vp run typecheck; pnpm exec vp check Not-tested: sidebar create-button integration is covered in later tasks.
Constraint: Board discovery is on-demand in v1, so the store needs explicit per-project replacement rather than live push merging. Rejected: Appending board list entries incrementally | listBoards is a snapshot and deleted files must disappear on the next fetch. Confidence: high Scope-risk: narrow Directive: Treat setProjectBoards/applyBoardList as replacement semantics for each project. Tested: cd apps/web && pnpm exec vp test run src/workflow/boardListState.test.ts; cd apps/web && pnpm exec vp test run src/workflow; pnpm exec vp run typecheck; pnpm exec vp check Not-tested: sidebar fetch/refetch wiring is covered in Task 13.
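The replacement semantics described for setProjectBoards/applyBoardList can be pictured with a small reducer. This is only a sketch: the BoardListEntry fields and the state shape are assumptions, since the real contract and store slice are not reproduced in this thread.

```typescript
// Assumed entry shape; the real BoardListEntry contract may carry more fields.
interface BoardListEntry {
  boardId: string;
  name: string;
}

type BoardListState = { readonly [projectId: string]: readonly BoardListEntry[] };

// Replace the whole list for one project with the latest listBoards snapshot.
// Entries missing from the snapshot (for example, deleted files) disappear.
function setProjectBoards(
  state: BoardListState,
  projectId: string,
  snapshot: readonly BoardListEntry[],
): BoardListState {
  return { ...state, [projectId]: [...snapshot] };
}
```

An append-style merge would keep rows for deleted board files alive until a reload, which is the behavior the commit rejects.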
Add the board creation action in the same project header flow as new threads, so discovered boards are visible and navigable from the sidebar without a separate registration step. Constraint: Task 13 requires a one-click project Add board affordance, sidebar board rows, and a manual running-app verification path. Rejected: Client-supplied board paths | createBoard is server-resolved by projectId and agent only. Confidence: high Scope-risk: moderate Directive: Keep board creation coupled to listBoards refetches until a future watcher/push design is explicitly introduced. Tested: cd apps/web && pnpm exec vp test run src/components/Sidebar.logic.test.ts; cd apps/web && pnpm exec vp test run src/workflow; pnpm exec vp run typecheck; pnpm exec vp check; Playwright against http://localhost:5734 created a Workflow board row and navigated to /board?boardId=... Not-tested: Multi-project context-menu branch was not clicked manually; implementation mirrors the existing new-thread member picker.
Remove the manual board registration affordance now that board creation and discovery provide real board ids from the sidebar path. Constraint: Task 14 requires the board route to open sidebar-provided boardId values without a Register step and to render an explicit missing-board state. Rejected: Keeping a disabled Register button | the v1 creation path no longer has a manual registration workflow. Confidence: high Scope-risk: narrow Directive: Keep board runtime actions routed through subscribeBoard/createTicket/moveTicket/resolveApproval/runLane; this commit only changes route presentation and board lookup state. Tested: cd apps/web && pnpm exec vp test run src/components/board/BoardHeaderControls.test.tsx 'src/routes/-boardRouteState.test.ts'; cd apps/web && pnpm exec vp test run src/workflow; pnpm exec vp run typecheck; pnpm exec vp check; Playwright opened sidebar board with no Register button and showed Board not found for a deleted/missing board id. Not-tested: Browser test did not create a ticket from the cleaned route; ticket flows were left unchanged and covered by existing workflow tests.
Agent steps with captureOutput gain an optional panel size (lint
enforces 2..5 and requires captureOutput). The executor runs that many
independent turns of the same step concurrently — each on its own
dispatch thread, titled "reviewer N/panel" — and takes the strict
majority of the captured verdicts. A member that fails, asks a
question, or returns unusable output contributes no vote; without a
strict majority the step fails rather than silently picking a side.
The combined output is {verdict, votes[]} so routing predicates keep
reading output.verdict unchanged, and usage is summed across members.
The step editor gains a "Reviewers" selector on capture-output steps.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
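The voting rule above can be sketched as a small pure function. Two assumptions are made here and are not confirmed by the commit message: verdicts are plain strings, and the strict majority is measured against the full panel size rather than only the votes that were captured. The function name is hypothetical.

```typescript
interface PanelResult {
  verdict: string;
  votes: string[];
}

// memberVotes has one entry per panel member; null means the member failed,
// asked a question, or returned unusable output, and therefore cast no vote.
function tallyPanel(memberVotes: readonly (string | null)[]): PanelResult | null {
  const votes = memberVotes.filter((v): v is string => v !== null);
  const counts = new Map<string, number>();
  let winner: string | null = null;
  for (const vote of votes) {
    const next = (counts.get(vote) ?? 0) + 1;
    counts.set(vote, next);
    // Assumption: "strict majority" means more than half of the whole panel.
    if (next * 2 > memberVotes.length) winner = vote;
  }
  // null signals "no strict majority", which the step would treat as a failure.
  return winner === null ? null : { verdict: winner, votes };
}
```

At most one verdict can exceed half the panel, so the winner is unambiguous, and a routing predicate can keep reading the verdict field of the combined output.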
Review findings on the panel feature:
- Restart recovery assumed one dispatch per step, so the first panel
member to reach a terminal state after a restart would complete the
whole step without a majority. Recovery now detects multi-dispatch
steps, settles all their outbox rows, and fails the step honestly as
retryable ("review panel interrupted by restart").
- A member that stalled on a question was counted as no-vote but left a
live provider session and an unconfirmed outbox row forever. The
panel now interrupts and stops non-completed members' sessions and
confirms every member row once the verdict is decided.
- Members ran concurrently with full access in the shared ticket
worktree, able to corrupt each other's view; they now run
sequentially.
- Unchecking "Capture output" in the editor stranded a hidden panel
value that lint then rejected — it clears the panel with it.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Hidden orchestration threads created for workflow dispatches outlived their tickets forever. A thread janitor now collects dispatch thread ids before the workflow cascade removes the outbox rows that know them (board deletion, ticket deletion, retention sweeps) and deletes each through the real thread.delete command path afterwards — best-effort per thread so one failure cannot abort the rest. Intake deletes its one-shot thread as part of session cleanup. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The intake agent can now mark a proposal as depending on earlier ones
("build the API, then the UI on it") via zero-based dependsOn indices.
Backward references only — forward, self, and junk indices are dropped
during parsing, so a proposed set can never contain a cycle. The
dialog shows "After #N" on dependent proposals, approval remaps edges
onto the approved list (edges to excluded rows disappear), and tickets
are created sequentially so each dependent passes the real TicketIds
of its predecessors. A braindump with implied ordering becomes a
self-executing pipeline: dependents queue until their prerequisites
land, then start automatically.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
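The two index operations described above, dropping invalid references at parse time and remapping edges at approval time, can be sketched as follows. Function names are hypothetical, and deduplicating repeated indices is an extra assumption not stated in the commit message.

```typescript
// Keep only backward references for the proposal at position `index`:
// integers in [0, index). Forward, self, and junk values are dropped.
function sanitizeDependsOn(raw: unknown, index: number): number[] {
  if (!Array.isArray(raw)) return [];
  const kept: number[] = [];
  for (const value of raw) {
    const valid =
      typeof value === "number" && Number.isInteger(value) && value >= 0 && value < index;
    if (valid && !kept.includes(value)) kept.push(value); // dedupe is an assumption
  }
  return kept;
}

// `approved` lists the original indices the user kept, in order. Edges are
// rewritten to positions in the approved list; edges to excluded rows vanish.
function remapEdges(dependsOn: readonly number[][], approved: readonly number[]): number[][] {
  const newIndex = new Map<number, number>();
  approved.forEach((original, position) => newIndex.set(original, position));
  return approved.map((original) =>
    (dependsOn[original] ?? [])
      .map((dep) => newIndex.get(dep))
      .filter((dep): dep is number => dep !== undefined),
  );
}
```

Since every surviving edge points to a strictly smaller index, the proposals are already in a valid creation order, which is why sequential ticket creation is enough to supply real predecessor ids.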
Agent steps now offer "View agent session" in the drawer — a read-only transcript of the hidden orchestration thread behind the step: the exact instruction sent, every assistant reply, and a collapsible activity log (tool calls, status changes). Available for completed runs, not just live ones, via the existing subscribeThread snapshot (by-id lookups intentionally resolve hidden threads). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The drawer gains an Artifacts section that lazily lists and renders the scratch documents pipeline steps write under .t3/ticket/<id>/ (PLAN.md, SPEC.md, REVIEW.md, ...) — each expandable, capped at 20 files and 64k characters with a truncation marker. Reads go through the worktree resolved for the ticket and the workspace filesystem's new listFiles, which keeps the same realpath containment as every other workspace read. Combined with route history and discussion, every ticket is now a self-documenting record of how the work happened. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
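The file and size caps can be illustrated with a small pure function. The exact character limit (64,000 here, where "64k" could also mean 65,536), the marker text, and all names are assumptions for this sketch.

```typescript
const MAX_FILES = 20;
const MAX_CHARS = 64_000; // assumed reading of "64k characters"
const TRUNCATION_MARKER = "\n[truncated]"; // assumed marker text

interface Artifact {
  path: string;
  content: string;
  truncated: boolean;
}

// Keep at most MAX_FILES documents and cut each one at MAX_CHARS,
// appending a marker so the reader can tell content was dropped.
function capArtifacts(files: readonly { path: string; content: string }[]): Artifact[] {
  return files.slice(0, MAX_FILES).map(({ path, content }) => {
    const truncated = content.length > MAX_CHARS;
    return {
      path,
      content: truncated ? content.slice(0, MAX_CHARS) + TRUNCATION_MARKER : content,
      truncated,
    };
  });
}
```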
Boards can now react to the world. Lanes gain onEvent matchers
({name, when?, to}) whose predicates see only {event: {name, payload}}
(lint enforces the allowlist and validates targets). A per-board
webhook — POST /hooks/workflow/:boardId with an x-t3-webhook-token
header — correlates events to tickets by explicit ticketId XOR a
workflow/<ticketId> branch name, dedupes optional deliveryIds
race-free, bounds payloads with a JSON-aware sanitizer, and answers
404 identically for unknown boards and bad tokens.
Matching events move the ticket through the engine's new
ingestExternalEvent: matchers are evaluated against the current lane,
the move commits a TicketRouteDecided with source "external_event"
under the board admission lock with a stale-lane guard (a concurrent
move makes the event a no-op), supersedes in-flight work like a manual
move, and reports moved/queued/noop precisely (enterLane now returns
what it did). Route history explains these moves with the event name.
Tokens are stored hashed (sha256) with an 8-char prefix; the plaintext
appears only in the create/rotate response of the new
workflow.getWebhookConfig RPC. Cron, PR-mode merges, and broadcast
events stay out of scope per the design review.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
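Two of the webhook rules above lend themselves to a short sketch: storing only a SHA-256 hash plus an 8-character prefix of the token, and correlating an event by ticketId XOR branch name. The type names, the field name branch, and the choice to take the prefix from the plaintext are assumptions; only Node's createHash is a real API here.

```typescript
import { createHash } from "node:crypto";

interface StoredToken {
  prefix: string; // first 8 characters, safe to show in the UI
  sha256: string; // hex digest; the plaintext itself is never persisted
}

function toStoredToken(plaintext: string): StoredToken {
  return {
    prefix: plaintext.slice(0, 8),
    sha256: createHash("sha256").update(plaintext, "utf8").digest("hex"),
  };
}

// Note: a production check would likely use a constant-time comparison.
function tokenMatches(stored: StoredToken, presented: string): boolean {
  return toStoredToken(presented).sha256 === stored.sha256;
}

type Correlation = { ok: true; ticketId: string } | { ok: false; reason: string };

// Exactly one of ticketId or a workflow/<ticketId> branch must be present.
function correlateTicket(event: { ticketId?: string; branch?: string }): Correlation {
  const ticketId = event.ticketId ?? "";
  const branch = event.branch ?? "";
  if ((ticketId !== "") === (branch !== "")) {
    return { ok: false, reason: "provide exactly one of ticketId or branch" };
  }
  if (ticketId !== "") return { ok: true, ticketId };
  const fromBranch = /^workflow\/(.+)$/.exec(branch)?.[1];
  return fromBranch !== undefined
    ? { ok: true, ticketId: fromBranch }
    : { ok: false, reason: "branch is not workflow/<ticketId>" };
}
```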
- Supersede a ticket's running work only inside the admission lock, after the stale-lane guard and a matcher/target revalidation pass; stale events can no longer interrupt the current pipeline
- Webhook route: byte-accurate body cap (content-length precheck + byteLength), fail-closed 503 when delivery dedupe cannot be recorded, safe decodeURIComponent
- Sanitizer drops __proto__/prototype/constructor keys
- Board deletion revokes webhook tokens and delivery logs (RPC delete, recovery sweep, and discovery sweep paths)
- onEvent predicate allowlist no longer accepts event.payloadX typos
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
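The key-dropping rule for the sanitizer can be sketched as a recursive copy. This is an illustrative version only: it omits the size bounds the webhook applies, and the function name is hypothetical.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

const FORBIDDEN_KEYS = ["__proto__", "prototype", "constructor"];

// Copy a parsed payload into plain JSON, skipping keys that could be used
// for prototype pollution when the result is merged or indexed later.
function sanitizePayload(value: unknown): Json {
  if (value === null || typeof value === "boolean" || typeof value === "string") return value;
  if (typeof value === "number") return Number.isFinite(value) ? value : null;
  if (Array.isArray(value)) return value.map((item) => sanitizePayload(item));
  if (typeof value === "object") {
    // A null-prototype object has no inherited keys to collide with.
    const out: { [key: string]: Json } = Object.create(null);
    for (const key of Object.keys(value as object)) {
      if (FORBIDDEN_KEYS.includes(key)) continue;
      out[key] = sanitizePayload((value as Record<string, unknown>)[key]);
    }
    return out;
  }
  return null; // functions, symbols, undefined, bigint
}
```

JSON.parse creates "__proto__" as an ordinary own property, so the filter has to run on the parsed keys rather than rely on the parser.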
- getBoardDigest RPC: created/shipped counts, token + agent-time totals, and tickets waiting on a human, over a clamped 1-168h window
- BoardTicketView.updatedAt threads through projections to the web store
- Ticket cards show a warn/alert aging badge once a ticket has been waiting_on_user/blocked for 30min/2h
- Digest dialog in the board header with a needs-attention count badge
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Selecting a lane (or one of its steps/transitions) fades every edge that neither leaves nor enters that lane to 15% opacity and renders the connected edges last so they sit on top. Slot allocation is unchanged, so edge geometry stays put while selecting. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
- The digest dialog's load fires from the controlled open click (onOpenChange only covers internal closes, so the fetch never ran)
- In-flight digest fetches are invalidated on close so stale responses cannot repopulate the dialog
- Aging badges and the needs-attention count recompute on a 60s tick instead of waiting for an unrelated re-render
- getBoardDigest clamps windowHours into 1..168 instead of falling back to 24
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
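The window clamp and the aging thresholds from the digest commits reduce to a few lines of arithmetic. The sketch below assumes timestamps in milliseconds and hypothetical function names; treating a non-finite input as the lower bound is also an assumption.

```typescript
// Clamp the digest window into 1..168 hours instead of substituting a default.
function clampWindowHours(hours: number): number {
  if (!Number.isFinite(hours)) return 1; // assumption: the commit does not say
  return Math.min(168, Math.max(1, hours));
}

type AgingBadge = "none" | "warn" | "alert";

const WARN_AFTER_MS = 30 * 60 * 1000; // 30 minutes
const ALERT_AFTER_MS = 2 * 60 * 60 * 1000; // 2 hours

// Only tickets waiting on a human age; others never show a badge.
function agingBadge(status: string, updatedAtMs: number, nowMs: number): AgingBadge {
  if (status !== "waiting_on_user" && status !== "blocked") return "none";
  const waitedMs = nowMs - updatedAtMs;
  if (waitedMs >= ALERT_AFTER_MS) return "alert";
  if (waitedMs >= WARN_AFTER_MS) return "warn";
  return "none";
}
```

Because the badge depends on the current time, it has to be recomputed on a timer; a value derived only at render time would go stale between unrelated re-renders.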
simulateBoardRoute walks a definition with every step forced to a chosen outcome (success/failure/blocked), mirroring the engine's route precedence (step.on → transitions → lane.on). Transition predicates run against a synthetic context, so lane.runCount loop bounds behave as they would live. Ends classify as terminal / manual / no_route / cycle_cap — surfacing dead ends and unbounded loops before an agent burns tokens on them. Exposed as workflow.dryRunBoard (read scope) over the definition currently in the editor — unsaved changes included — with a "Dry run" panel in the workflow editor explaining every hop. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
- "Webhook" board-header dialog: endpoint URL, token shown exactly once (on first provision or rotate, prefix-only afterwards, cleared from memory on close), copyable curl example built from the page origin
- "External events" section in the lane routing editor: name, optional predicate JSON over event.name/event.payload.*, and target lane, backed by add/update/remove editorModel mutations
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The intake dialog now has the same provider/model + effort pickers as agent steps, defaulting to the recent-agent heuristic that previously chose silently. The selection (including effort options) rides the existing intakeTickets AgentSelection; proposing is disabled when no provider is available. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Review findings on the dry-run batch:
- lane.runCount now mirrors countLanePipelineRuns: a consecutive streak that resets when another lane runs a pipeline; alternating loops now correctly dry-run as unbounded instead of bounded
- Empty auto lanes no longer route (the engine returns before starting a pipeline with no steps); the dry run ends with an explanatory note
- Synthetic ticket status matches what the routing-context builder would read (running / blocked), with a note whenever a predicate reads status so the approximation is visible
- Predicate evaluation errors stop the walk (live routing errors there) instead of falling through to later transitions
- dryRunBoard rejects oversized definitions (256k chars, 200 lanes, 100 steps/transitions/events per lane); it is read-scoped and takes caller-supplied definitions
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
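The "consecutive streak" reading of lane.runCount can be made concrete with a short function. This is a sketch of the idea only, assuming the history is a chronological list of lane keys, one per pipeline run; it is not the engine's countLanePipelineRuns.

```typescript
// Count how many of the most recent pipeline runs happened in `lane`
// without a run in any other lane in between.
function consecutiveRunCount(history: readonly string[], lane: string): number {
  let count = 0;
  for (let i = history.length - 1; i >= 0 && history[i] === lane; i--) {
    count++;
  }
  return count;
}
```

With this definition a loop that alternates build and review never pushes either lane's count above 1, so a predicate such as "runCount < 3" cannot bound it; that is the case the dry run now reports as unbounded.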
# Conflicts: # apps/web/src/components/Sidebar.tsx
Captured from a live run on a mock project (Snackbase): a delivery board whose agent steps run GPT-5.5 at low/medium/high/xhigh reasoning, with a real ticket driven through plan → build → review. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Live-run notes:
- Captured output now falls back to earlier assistant messages in the same turn (newest first) when the final message has no fenced json block; multi-message agents (skill-driven review formats, progress notes) no longer read as "no vote"
- The auto-appended captureOutput suffix states it overrides any skill/workflow output format
Macroscope findings on the PR:
- ApprovalGate.getOrCreate registers the deferred atomically via Ref.modify; concurrent callers can no longer wait on an orphaned deferred
- TurnProjectionPort treats interrupted turns as completed, matching toTurnState's terminal classification
- Predicate path lint rejects steps.<key>.status.<extra> nesting
- Failed ticket badge says "failed", not "blocked"
- Canvas lane-height equality checks key presence, not just value
- Version history preview guards against stale async responses
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
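The newest-first fallback for captured output can be sketched as a scan over the turn's assistant messages. The names are hypothetical, and taking the last parseable fenced block within a message is an assumption. The fence string is built with repeat() only so that this example does not contain a literal triple backtick.

```typescript
const FENCE = "`".repeat(3);

// Return the last fenced json block in a message that parses, if any.
function lastFencedJson(message: string): unknown {
  const pattern = new RegExp(FENCE + "json\\s*([\\s\\S]*?)" + FENCE, "g");
  let found: unknown = undefined;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(message)) !== null) {
    try {
      found = JSON.parse(match[1] ?? "");
    } catch {
      // Not valid JSON: ignore this block and keep scanning.
    }
  }
  return found;
}

// Walk the turn's assistant messages from newest to oldest and take the
// first message that yields a parseable block.
function captureOutput(assistantMessages: readonly string[]): unknown {
  for (let i = assistantMessages.length - 1; i >= 0; i--) {
    const parsed = lastFencedJson(assistantMessages[i] ?? "");
    if (parsed !== undefined) return parsed;
  }
  return undefined; // no usable output anywhere in the turn
}
```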
Addressed all 6 Macroscope findings in 87db8cc:
Same commit also hardens review-panel verdict capture (found during the live demo run): captured output now falls back to earlier assistant messages in the turn when the final message lacks the fenced json block, and the auto-appended captureOutput suffix explicitly overrides skill-driven output formats.
🤖 Generated with Claude Code
Approvability
Verdict: Needs human review. Diff is too large for automated approval analysis. A human reviewer should evaluate this PR.
Workflow Boards
Per-project kanban boards as event-sourced state machines that drive coding agents. Lanes hold pipelines of steps (agent / script / approval / merge); routing between lanes is decided by step outcomes, JSONLogic predicates over captured output, lane fallbacks, manual actions, or external webhook events. Every ticket gets its own git worktree, every move is audited and explained.
All screenshots below are from a live run on a mock project ("Snackbase"): the board's agent steps run GPT-5.5 at different reasoning levels per lane (planning = low, implementation = medium with escalation to extra-high on retry, review = high ×3 reviewers), and the "Fix off-by-one" ticket was driven through the pipeline by real agents.
The board
Lanes with per-lane colors and WIP limits, tickets with status stripes, dependency badges ("waiting on 1 dependency"), token budgets ("0 tok / 250k"), and usage roll-ups.
Creating a ticket: description, blocked-by dependencies, and an optional token budget that halts agent steps once spent.
Intake: braindump → tickets
Paste a braindump, pick the agent (provider/model + reasoning effort), and it proposes structured tickets — including dependency edges ("After #1") — which you edit and approve before anything is created. These proposals came from a real GPT-5.5 run.
The workflow editor
Canvas view: lanes as cards, steps typed and colored, routing edges colored by outcome (success/failure/blocked), numbered transitions, dotted action edges, routing-precedence legend. Edits are drag-to-connect or via the inspector; explicit Save lints and writes the board file (`.t3/boards/*.json` is the source of truth).
Selecting a lane dims every edge that doesn't touch it, so dense graphs stay readable:
Agent steps, fully configurable
The implement step: GPT-5.5 · Medium reasoning, 2 retry attempts, and "Escalate on retry" to GPT-5.5 · Extra High — a failed attempt automatically reruns on the stronger configuration.
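One way to read "Escalate on retry" is as a per-attempt choice of configuration. The sketch below is an assumption-laden illustration, not the engine's code: `AgentConfig`, `AgentStep`, and `configForAttempt` are hypothetical names, and it assumes "2 retry attempts" means two retries after the first run, with every retry using the escalated configuration.

```typescript
// Hypothetical types; the board file's real step schema is not shown in this PR text.
interface AgentConfig {
  model: string;
  reasoning: "low" | "medium" | "high" | "extra-high";
}

interface AgentStep {
  base: AgentConfig;
  retryAttempts: number; // assumed: number of retries after the first run
  escalateOnRetry?: AgentConfig;
}

// attempt 0 is the first run; attempts 1..retryAttempts are retries.
// Returns undefined once the step has no attempts left.
function configForAttempt(step: AgentStep, attempt: number): AgentConfig | undefined {
  if (attempt < 0 || attempt > step.retryAttempts) return undefined;
  if (attempt > 0 && step.escalateOnRetry !== undefined) return step.escalateOnRetry;
  return step.base;
}

// The implement step described above, expressed in these hypothetical types.
const implementStep: AgentStep = {
  base: { model: "GPT-5.5", reasoning: "medium" },
  retryAttempts: 2,
  escalateOnRetry: { model: "GPT-5.5", reasoning: "extra-high" },
};
```

Under these assumptions the first run uses Medium, both retries use Extra High, and a fourth attempt is refused.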
The review step: GPT-5.5 · High, captured output (the agent ends with a fenced JSON verdict that routing predicates can read), and a 3-reviewer panel — three independent sessions vote, strict majority wins.
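The "strict majority wins" rule can be sketched as a small tally. This is an illustration only: `strictMajority` is a hypothetical name, verdicts are treated as opaque strings, and it assumes the majority is measured against the full panel size, so a reviewer with no captured verdict counts against every option.

```typescript
// Returns the verdict held by more than half of the panel, or undefined
// when no verdict reaches a strict majority.
function strictMajority(
  votes: Array<string | undefined>,
  panelSize: number,
): string | undefined {
  const counts = new Map<string, number>();
  for (const vote of votes) {
    if (vote === undefined) continue; // reviewer produced no captured verdict
    counts.set(vote, (counts.get(vote) ?? 0) + 1);
  }
  let winner: string | undefined;
  counts.forEach((count, verdict) => {
    // Strictly more than half; at most one verdict can satisfy this.
    if (count * 2 > panelSize) winner = verdict;
  });
  return winner;
}
```

For a 3-reviewer panel, two matching verdicts decide the outcome, while a 1/1 split with one missing vote decides nothing and leaves routing to whatever fallback the lane defines.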
Lane form: merge steps, routing, external events
The Land lane in form view: a merge step (commits the ticket worktree and merges it into the checked-out branch; conflicts block instead of failing), lane success/failure/blocked routes, and external event matchers: a `ci.passed` webhook with a payload predicate moves the ticket to Done.
Dry run
Simulate a hypothetical ticket through the definition you're editing (unsaved changes included) under all-succeed / all-fail / all-block scenarios. It mirrors the engine's exact routing semantics and explains every hop — here it correctly flags that the success path stalls in Review unless a verdict transition matches.
Version history
Every save is snapshotted per board with diffs and non-destructive revert.
External events
Each board gets a webhook endpoint with a rotating token (shown exactly once) and a copyable curl example. CI, PR automation, or cron can move correlated tickets (by `ticketId` or `workflow/<id>` branch) through their lane's event matchers, with delivery dedupe.
The board reports to you
A digest of the last 24h: shipped/created counts, tokens spent, agent time, and which tickets are waiting on a human.
Living with a ticket
The drawer: "Why is this ticket here?" route explainability (every hop with the rule that caused it), a discussion thread whose comments reach the next agent step as context, per-step status/duration/token usage, and one-click lane actions ("Retry build", "Back to backlog").
Script steps are gated by per-project trust; the first `node --test` run blocks until you allow it:
Every ticket has a case file (`.t3/ticket/<id>/`) the agents write into (here the PLAN.md the planning agent produced), plus the script output and reviewer sessions:
And any agent step's full session is one click away, read-only:
Boards live in the sidebar with hover rename/delete (delete cascades tickets, events, versions, worktrees, and webhook tokens):
Not shown but included
Durable restart recovery (pipelines, retries, merges, approvals resume safely), WIP queueing with FIFO auto-admission, dependency auto-release, terminal-lane retention TTL with full state cleanup, aging badges and waiting-on-you toasts, ticket search, multi-environment boards, and an event-sourced audit trail under everything.
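As an illustration of "WIP queueing with FIFO auto-admission", the sketch below models a single lane as pure state transitions. It is a simplified assumption, not the engine's implementation: `LaneState`, `arrive`, `leave`, and `admitWaiting` are hypothetical names, and the real engine is event-sourced and also handles dependencies and retention.

```typescript
// Hypothetical in-memory view of one lane.
interface LaneState {
  wipLimit: number;
  active: string[]; // ticket ids currently counted against the WIP limit
  queue: string[]; // ticket ids waiting for a slot, oldest first
}

// Moves as many queued tickets into the lane as free slots allow, oldest first.
function admitWaiting(lane: LaneState): LaneState {
  const free = Math.max(0, lane.wipLimit - lane.active.length);
  return {
    wipLimit: lane.wipLimit,
    active: lane.active.concat(lane.queue.slice(0, free)),
    queue: lane.queue.slice(free),
  };
}

// A ticket arriving at the lane joins the back of the queue and is admitted
// immediately if a slot is free.
function arrive(lane: LaneState, ticketId: string): LaneState {
  return admitWaiting({ ...lane, queue: lane.queue.concat(ticketId) });
}

// A ticket leaving the lane frees a slot, which auto-admits the queue head.
function leave(lane: LaneState, ticketId: string): LaneState {
  return admitWaiting({
    ...lane,
    active: lane.active.filter((id) => id !== ticketId),
  });
}
```

Routing every arrival through the queue keeps ordering fair: a new ticket can never jump ahead of one that was already waiting.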
Notes
- `apps/server/src/workflow/**` (Effect TS, event-sourced over SQLite) with contracts in `packages/contracts/src/workflow.ts` and the web UI under `apps/web/src/components/board/**`.
- `docs/workflow-demo/` (these screenshots) is demo material and can be dropped before merge.
🤖 Generated with Claude Code
Note
Add kanban workflow boards that drive coding agents through configurable lane pipelines
- `WorkflowEngine` with durable agent dispatch (`ProviderDispatchOutbox`), worktree leasing, WIP enforcement, retention sweeping, and crash recovery via `WorkflowRecovery`.
- `WORKFLOW_WS_METHODS` and two new auth scopes (`workflow:read`, `workflow:operate`) enforced at the HTTP and WS layers.
- `BoardView` with drag-and-drop lanes, `TicketDrawer` for messaging/approvals/diffs, a visual `WorkflowEditor` with canvas and form views, version history diffing, and a dry-run simulator.
- `Add board` creation using the most-recently-used agent.
- `attachHistoryStream` to `TerminalManager` so the ticket drawer can stream script output without spawning a new shell.
Macroscope summarized 03c09f4.
Note
High Risk
Large new subsystem spanning auth, SQLite migrations, worktree leases, dispatch recovery, and orchestration visibility; correctness and security (webhooks, scopes) depend on many interacting paths.
Overview
Adds per-project kanban workflow boards backed by a new event-sourced engine: boards are defined in `.t3/boards/*.json` (sample `delivery.json` included) with lanes, WIP limits, and pipelines of agent / script / approval / merge steps, plus JSONLogic-based routing, webhooks, ticket dependencies, token budgets, and durable dispatch/recovery tables (migrations 033–054).
The server wires `WorkflowServerRuntimeLive` into startup (recovery + terminal retention sweeper), exposes workflow HTTP hooks, and depends on `json-logic-js` for predicates. Auth gains `workflow:read` / `workflow:operate` scopes (including token exchange); tests now assert against shared scope constants.
Orchestration gains hidden threads so workflow agent runs stay out of public thread lists, snapshots, and agent-awareness relay; checkpoint/diff callers implement `isThreadHidden`. Terminals add `attachHistoryStream` to replay persisted script output without opening a new PTY.
Reviewed by Cursor Bugbot for commit 03c09f4.