This file is no longer the canonical changelog for compound-engineering releases.
Historical entries are preserved below, but new release history is recorded in the root CHANGELOG.md.
All notable changes to the compound-engineering plugin will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
2.62.0 (2026-04-10)
- Add pluggable orchestrators with /ce:run, agent shims, and /lfg refactor (ea822a4)
- ce-compound: add discoverability check for docs/solutions/ in instruction files (#456) (5ac8a2c)
- ce-compound: add track-based schema for bug vs knowledge learnings (#445) (739109c)
- ce-plan: add interactive deepening mode for on-demand plan strengthening (#443) (ca78057)
- ce-plan: reduce token usage by extracting conditional references (#489) (fd562a0)
- ce-review: enforce table format, require question tool, fix autofix_class calibration (#454) (847ce3f)
- ce-work: suggest branch rename when worktree name is meaningless (#451) (e872e15)
- cli-agent-readiness-reviewer: add smart output defaults criterion (#448) (a01a8aa)
- cli-readiness-reviewer: add conditional review persona for CLI agent readiness (#471) (c56c766)
- git-commit-push-pr: add conditional visual aids to PR descriptions (#444) (44e3e77)
- git-commit-push-pr: pre-resolve context to reduce bash calls (#488) (bbd4f6d)
- git-commit-push-pr: precompute shield badge version via skill preprocessing (#464) (6ca7aef)
- product-lens-reviewer: domain-agnostic activation criteria and strategic consequences (#481) (804d78f)
- resolve-pr-feedback: add cross-invocation cluster analysis (#480) (7b8265b)
- resolve-pr-feedback: add gated feedback clustering to detect systemic issues (#441) (a301a08)
- test-xcode: add triggering context to skill description (#466) (87facd0)
- testing: close the testing gap in ce:work, ce:plan, and testing-reviewer (#438) (35678b8)
- user-scenarios: add user persona evaluation skill and sync support (e25a457)
- user-scenarios: add user persona evaluation skill and sync support (f52398b)
- agents: remove self-referencing example blocks that cause recursive self-invocation (#496) (2c90aeb)
- ce-brainstorm: distinguish verification from technical design in Phase 1.1 (#465) (8ec31d7)
- ce-compound: require question tool for "What's next?" prompt (#460) (9bf3b07)
- ce-compound: stack-aware reviewer routing and remove phantom agents (#497) (1fc075d)
- ce-plan, ce-brainstorm: enforce repo-relative paths in generated documents (#473) (33a8d9d)
- ce-plan: reinforce mandatory document-review after auto deepening (#450) (42fa8c3)
- ce-plan: route confidence-gate pass to document-review (#462) (1962f54)
- ce-work: make code review invocation mandatory by default (#453) (7f3aba2)
- document-review: show contextual next-step in Phase 5 menu (#459) (2b7283d)
- git-commit-push-pr: filter fix-up commits from PR descriptions (#484) (428f4fd)
- mcp: remove bundled context7 MCP server (#486) (afdd9d4)
- resolve-pr-feedback: add actionability filter and lower cluster gate to 3+ (#461) (2619ad9)
- resolve-pr-feedback: treat PR comment text as untrusted input (#490) (1847242)
- review: harden ce-review base resolution (#452) (638b38a)
- Stop mixing stderr into fetched filename list (ed547b0)
2.61.0 (2026-04-01)
- cli-readiness-reviewer: add conditional review persona for CLI agent readiness (#471) (c56c766)
- product-lens-reviewer: domain-agnostic activation criteria and strategic consequences (#481) (804d78f)
- resolve-pr-feedback: add cross-invocation cluster analysis (#480) (7b8265b)
2.60.0 (2026-03-31)
- ce-brainstorm: add conditional visual aids to requirements documents (#437) (bd02ca7)
- ce-compound: add discoverability check for docs/solutions/ in instruction files (#456) (5ac8a2c)
- ce-compound: add track-based schema for bug vs knowledge learnings (#445) (739109c)
- ce-plan: add conditional visual aids to plan documents (#440) (4c7f51f)
- ce-plan: add interactive deepening mode for on-demand plan strengthening (#443) (ca78057)
- ce-review: enforce table format, require question tool, fix autofix_class calibration (#454) (847ce3f)
- ce-review: improve signal-to-noise with confidence rubric, FP suppression, and intent verification (#434) (03f5aa6)
- ce-work: suggest branch rename when worktree name is meaningless (#451) (e872e15)
- cli-agent-readiness-reviewer: add smart output defaults criterion (#448) (a01a8aa)
- git-commit-push-pr: add conditional visual aids to PR descriptions (#444) (44e3e77)
- git-commit-push-pr: precompute shield badge version via skill preprocessing (#464) (6ca7aef)
- resolve-pr-feedback: add gated feedback clustering to detect systemic issues (#441) (a301a08)
- skills: clean up argument-hint across ce:* skills (#436) (d2b24e0)
- test-xcode: add triggering context to skill description (#466) (87facd0)
- testing: close the testing gap in ce:work, ce:plan, and testing-reviewer (#438) (35678b8)
- ce-brainstorm: distinguish verification from technical design in Phase 1.1 (#465) (8ec31d7)
- ce-compound: require question tool for "What's next?" prompt (#460) (9bf3b07)
- ce-plan: reinforce mandatory document-review after auto deepening (#450) (42fa8c3)
- ce-plan: route confidence-gate pass to document-review (#462) (1962f54)
- ce-work: make code review invocation mandatory by default (#453) (7f3aba2)
- document-review: show contextual next-step in Phase 5 menu (#459) (2b7283d)
- git-commit-push-pr: quiet expected no-pr gh exit (#439) (1f49948)
- resolve-pr-feedback: add actionability filter and lower cluster gate to 3+ (#461) (2619ad9)
- review: harden ce-review base resolution (#452) (638b38a)
2.59.0 (2026-03-29)
- ce-review: add headless mode for programmatic callers (#430) (3706a97)
- ce-work: accept bare prompts and add test discovery (#423) (6dabae6)
- document-review: collapse batch_confirm tier into auto (#432) (0f5715d)
- review: make review mandatory across pipeline skills (#433) (9caaf07)
2.58.1 (2026-03-28)
- compound-engineering: Synchronize compound-engineering versions
2.57.0 (2026-03-28)
2.56.1 (2026-03-28)
2.56.0 (2026-03-28)
- cli-agent-readiness-reviewer: remove top-5 cap on improvements (#419) (16eb8b6)
- document-review: enforce interactive questions and fix autofix classification (#415) (d447296)
2.55.0 (2026-03-27)
- add adversarial review agents for code and documents (#403) (5e6cd5c)
- add CLI agent-readiness reviewer and principles guide (#391) (13aa3fa)
- add project-standards-reviewer as always-on ce:review persona (#402) (b30288c)
- ce-brainstorm: group requirements by logical concern, tighten autofix classification (#412) (90684c4)
- ce-plan: strengthen test scenario guidance across plan and work skills (#410) (615ec5d)
- ce-review: add base: and plan: arguments, extract scope detection (#405) (914f9b0)
- document-review: smarter autofix, batch-confirm, and error/omission classification (#401) (0863cfa)
- onboarding: add consumer perspective and split architecture diagrams (#413) (31326a5)
- add strict YAML validation for plugin frontmatter (#399) (0877b69)
- consolidate compound-docs into ce-compound skill (#390) (daddb7d)
- document SwiftUI Text link tap limitation in test-xcode skill (#400) (6ddaec3)
- harden git workflow skills with better state handling (#406) (f83305e)
- improve agent-native-reviewer with triage, prioritization, and stack-aware search (#387) (e792166)
- replace broken markdown link refs in skills (#392) (506ad01)
2.54.1 (2026-03-26)
2.54.0 (2026-03-26)
- add new onboarding skill to create onboarding guide for repo (#384) (27b9831)
- replace manual review agent config with ce:review delegation (#381) (fed9fd6)
- add default-branch guard to commit skills (#386) (31f07c0)
- scope commit-push-pr descriptions to full branch diff (#385) (355e739)
2.53.0 (2026-03-25)
- add git commit and branch helper skills (#378) (fe08af2)
- improve resolve-pr-feedback skill (#379) (2ba4f3f)
- improve commit-push-pr skill with net-result focus and badging (#380) (efa798c)
- integrate orphaned stack-specific reviewers into ce:review (#375) (ce9016f)
2.52.0 (2026-03-25)
- add consolidation support and overlap detection to ce:compound and ce:compound-refresh skills (#372) (fe27f85)
- optimize ce:compound speed and effectiveness (#370) (4e3af07)
- promote ce:review-beta to stable ce:review (#371) (7c5ff44)
- rationalize todo skill names and optimize skills (#368) (2612ed6)
2.51.0 (2026-03-24)
- add ce:review-beta with structured persona pipeline (#348) (e932276)
- promote ce:plan-beta and deepen-plan-beta to stable (#355) (169996a)
- redesign document-review skill with persona-based review (#359) (18d22af)
2.50.0 (2026-03-23)
- ce-work: add Codex delegation mode (#328) (341c379)
- improve feature-video skill with GitHub native video upload (#344) (4aa50e1)
- rewrite frontend-design skill with layered architecture and visual verification (#343) (423e692)
2.49.0 (2026-03-22)
- add execution mode toggle and context pressure bounds to parallel skills (#336) (216d6df)
- fix skill transformation pipeline across all targets (#334) (4087e1d)
- improve reproduce-bug skill, sync agent-browser, clean up redundant skills (#333) (affba1a)
2.48.0 (2026-03-22)
- git-worktree: auto-trust mise and direnv configs in new worktrees (#312) (cfbfb67)
- make skills platform-agnostic across coding agents (#330) (52df90a)
2.47.0 (2026-03-20)
2.46.0 (2026-03-20)
2.45.0 (2026-03-19)
- edit resolve_todos_parallel skill for complete todo lifecycle (#292) (88c89bc)
- integrate claude code auto memory as supplementary data source for ce:compound and ce:compound-refresh (#311) (5c1452d)
2.44.0 (2026-03-18)
- ce:compound context budget precheck - Warns when context is constrained and offers compact-safe mode to avoid compaction mid-compound (#235)
- ce:plan daily sequence numbers - Plan filenames now include a 3-digit daily sequence number (e.g., 2026-03-10-001-feat-...) to prevent collisions (#238)
- ce:review serial mode - Pass the --serial flag (or let it auto-detect when 6+ agents are configured) to run review agents sequentially, preventing context limit crashes (#237)
- agent-browser inspection & debugging commands - Added JS eval, console/errors, network, storage, device emulation, element debugging, recording/tracing, tabs, and advanced mouse commands to agent-browser skill (#236)
- test-browser port detection - Auto-detects dev server port from CLAUDE.md, package.json, or .env files; supports --port flag (#233)
- lfg phase gating - Added explicit GATE checks between /lfg steps to enforce plan-before-work ordering (#231)
- Context7 API key auth - MCP server config now passes CONTEXT7_API_KEY via x-api-key header to avoid anonymous rate limits (#232)
- CLI: MCP server merge order - sync now correctly overwrites same-named MCP servers with plugin values instead of preserving stale entries
- every-style-editor agent - Removed duplicate agent; functionality already exists as every-style-editor skill (#234)
- Matt Van Horn (@mvanhorn) — PRs #231–#238
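The ce:plan daily sequence numbers entry above can be illustrated with a short sketch. This is not the plugin's implementation: the helper name next_plan_prefix and the sample filenames are hypothetical, and it assumes the next number is simply the highest existing sequence for that day plus one.

```python
import re
from datetime import date


# Hypothetical helper; the real ce:plan logic is not shown in this changelog.
def next_plan_prefix(existing_names, today):
    """Return 'YYYY-MM-DD-NNN' using the next free 3-digit sequence for today."""
    day = today.isoformat()
    pattern = re.compile(rf"^{re.escape(day)}-(\d{{3}})-")
    used = []
    for name in existing_names:
        match = pattern.match(name)
        if match:
            used.append(int(match.group(1)))
    # Assumption: sequences start at 001 and grow by one per plan per day.
    return f"{day}-{max(used, default=0) + 1:03d}"


# Sample filenames are made up for illustration.
existing = [
    "2026-03-10-001-feat-login.md",
    "2026-03-10-002-fix-cache.md",
    "2026-03-09-001-feat-old.md",
]
prefix = next_plan_prefix(existing, date(2026, 3, 10))
```

Because the sequence is scoped to the date, two plans created on the same day with similar titles still get distinct filenames.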
- Cross-platform AskUserQuestion fallback - setup skill and create-new-skill/add-workflow workflows now include an "Interaction Method" preamble that instructs non-Claude LLMs (Codex, Gemini, Copilot, Kiro) to use numbered lists instead of AskUserQuestion, preventing silent auto-configuration. (#204)
- Codex AGENTS.md AskUserQuestion mapping - Strengthened from "ask the user in chat" to structured numbered-list guidance with multi-select support and a "never skip or auto-configure" rule.
- Skill compliance checklist - Added AskUserQuestion lint rule to CLAUDE.md to prevent recurrence.
- workflows:plan, workflows:work, workflows:review, workflows:brainstorm, workflows:compound renamed to ce:plan, ce:work, ce:review, ce:brainstorm, ce:compound for clarity. The ce: prefix unambiguously identifies these as compound-engineering commands.
- workflows:* commands - All five remain functional as aliases that forward to their ce:* equivalents with a deprecation notice. They will be removed in a future version.
- CLI: auto-detect install targets - Running bunx @every-env/compound-plugin install compound-engineering --to all auto-detects installed AI coding tools and installs to all of them in one command. (#191)
- CLI: Gemini sync - Running sync --target gemini symlinks personal skills to .gemini/skills/ and merges MCP servers into .gemini/settings.json. (#191)
- CLI: sync defaults to --target all - Running sync with no target now syncs to all detected tools automatically. (#191)
- /workflows:review rendering - Fixed broken markdown output: "Next Steps" items 3 & 4 and Severity Breakdown no longer leak outside the Summary Report template, section numbering fixed (was jumping 5→7, now correct), removed orphaned fenced code block delimiters that caused the entire End-to-End Testing section to render as a code block, and fixed unclosed quoted string in section 1. (#214) Thanks @XSAM!
- .worktrees gitignore - Added .worktrees/ to .gitignore to prevent worktree directories created by the git-worktree skill from being tracked. (#213) Thanks @XSAM!
- proof skill - Create, edit, comment on, and share markdown documents via Proof's web API and local bridge. Supports document creation, track-changes suggestions, comments, and bulk rewrites. No authentication required for creating shared documents.
- Optional Proof sharing in /workflows:brainstorm - "Share to Proof" is now a menu option in Phase 4 handoff, letting you upload the brainstorm document when you want to, rather than automatically on every run.
- Optional Proof sharing in /workflows:plan - "Share to Proof" is now a menu option in Post-Generation Options, letting you upload the plan file on demand rather than automatically.
- OpenClaw install target - Running bunx @every-env/compound-plugin install compound-engineering --to openclaw now installs the plugin to OpenClaw's extensions directory. (#217) Thanks @TrendpilotAI!
- Qwen Code install target - Running bunx @every-env/compound-plugin install compound-engineering --to qwen now installs the plugin to Qwen Code's extensions directory. (#220) Thanks @rlam3!
- Windsurf install target - Running bunx @every-env/compound-plugin install compound-engineering --to windsurf converts plugins to Windsurf format. Agents become Windsurf skills, commands become flat workflows, and MCP servers write to mcp_config.json. Defaults to global scope (~/.codeium/windsurf/); use --scope workspace for project-level output. (#202) Thanks @rburnham52!
- create-agent-skill / heal-skill YAML crash - argument-hint values containing special characters are now properly quoted to prevent YAML parse errors in the Claude Code TUI. (#219) Thanks @solon!
- resolve-pr-parallel skill name - Renamed from resolve_pr_parallel (underscore) to resolve-pr-parallel (hyphen) to match the standard naming convention. (#202) Thanks @rburnham52!
- /workflows:plan brainstorm integration - When plan finds a brainstorm document, it now heavily references it throughout. Added origin: frontmatter field to plan templates, brainstorm cross-check in final review, and "Sources" section at the bottom of all three plan templates (MINIMAL, MORE, A LOT). Brainstorm decisions are carried forward with explicit references (see brainstorm: <path>) and a mandatory scan before finalizing ensures nothing is dropped.
- /workflows:work system-wide test check - Added "System-Wide Test Check" to the task execution loop. Before marking a task done, forces five questions: What callbacks/middleware fire when this runs? Do tests exercise the real chain or just mocked isolation? Can failure leave orphaned state? What other interfaces need the same change? Do error strategies align across layers? Includes skip criteria for leaf-node changes. Also added integration test guidance to the "Test Continuously" section.
- /workflows:plan system-wide impact templates - Added "System-Wide Impact" section to MORE and A LOT plan templates (interaction graph, error propagation, state lifecycle, API surface parity, integration test scenarios) as lightweight prompts to flag risks during planning.
- /lfg and /slfg first-run failures - Made ralph-loop step optional with graceful fallback when ralph-wiggum skill is not installed (#154). Added explicit "do not stop" instruction across all steps (#134).
- /workflows:plan not writing file in pipeline - Added mandatory "Write Plan File" step with explicit Write tool instructions before Post-Generation Options. The file is now always written to disk before any interactive prompts (#155). Also adds pipeline-mode note to skip AskUserQuestion calls when invoked from LFG/SLFG (#134).
- Agent namespace typo in /workflows:plan - Task spec-flow-analyzer(...) now uses the full qualified name Task compound-engineering:workflow:spec-flow-analyzer(...) to prevent Claude from prepending the wrong workflows: prefix (#193).
- Gemini CLI target - New converter target for Gemini CLI. Install with --to gemini to convert agents to .gemini/skills/*/SKILL.md, commands to .gemini/commands/*.toml (TOML format with description + prompt), and MCP servers to .gemini/settings.json. Skills pass through unchanged (identical SKILL.md standard). Namespaced commands create directory structure (workflows:plan → commands/workflows/plan.toml). 29 new tests. (#190)
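As an illustration of the namespaced-command mapping noted above (workflows:plan → commands/workflows/plan.toml), here is a small sketch. The function name gemini_command_path is hypothetical, and the handling of commands without a namespace is an assumption rather than documented converter behavior.

```python
from pathlib import PurePosixPath


# Hypothetical helper illustrating the mapping only; not the converter's code.
def gemini_command_path(command_name):
    """Map a colon-namespaced command name to a .toml path under commands/."""
    *namespaces, leaf = command_name.split(":")
    # Assumption: a command with no namespace lands directly in commands/.
    return PurePosixPath("commands", *namespaces, f"{leaf}.toml")


namespaced = str(gemini_command_path("workflows:plan"))
flat = str(gemini_command_path("lfg"))
```

Each colon-separated namespace becomes one directory level, so the command name and its file location stay in step.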
- /workflows:plan command - All plan templates now include status: active in YAML frontmatter. Plans are created with status: active and marked status: completed when work finishes.
- /workflows:work command - Phase 4 now updates plan frontmatter from status: active to status: completed after shipping. Agents can grep for status to distinguish current vs historical plans.
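The status field described above lends itself to a simple lookup. The following is a minimal sketch, assuming the frontmatter is a leading block delimited by --- lines with one key: value pair per line; plan_status and mark_completed are hypothetical helper names, not functions from the plugin.

```python
# Hypothetical helpers; the plugin's own frontmatter handling is not shown here.
def plan_status(text):
    """Return the frontmatter `status` value, or None if there is none."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return None
    for line in lines[1:]:
        if line.strip() == "---":  # end of frontmatter
            break
        key, sep, value = line.partition(":")
        if sep and key.strip() == "status":
            return value.strip()
    return None


def mark_completed(text):
    """Flip `status: active` to `status: completed` inside the frontmatter only."""
    lines = text.splitlines(keepends=True)
    if not lines or lines[0].strip() != "---":
        return text
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            break
        if lines[i].strip() == "status: active":
            lines[i] = lines[i].replace("status: active", "status: completed")
            break
    return "".join(lines)


# Sample plan content is made up for illustration.
plan = "---\ntitle: Add login\nstatus: active\n---\n# Plan\nstatus: active in body\n"
done = mark_completed(plan)
```

Limiting the rewrite to the frontmatter block keeps any mention of "status: active" in the plan body untouched.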
- setup skill - Interactive configurator for review agents
  - Auto-detects project type (Rails, Python, TypeScript, etc.)
  - Two paths: "Auto-configure" (one click) or "Customize" (pick stack, focus areas, depth)
  - Writes compound-engineering.local.md in project root (tool-agnostic; works for Claude, Codex, OpenCode)
  - Invoked automatically by /workflows:review when no settings file exists
- learnings-researcher in /workflows:review - Always-run agent that searches docs/solutions/ for past issues related to the PR
- schema-drift-detector wired into /workflows:review - Conditional agent for PRs with migrations
- /workflows:review - Now reads review agents from compound-engineering.local.md settings file. Falls back to invoking setup skill if no file exists.
- /workflows:work - Review agents now configurable via settings file
- /release-docs command - Moved from plugin to local .claude/commands/ (repo maintenance, not distributed)
- /technical_review command - Superseded by configurable review agents
- Factory Droid target - New converter target for Factory Droid. Install with --to droid to output agents, commands, and skills to ~/.factory/. Includes tool name mapping (Claude → Factory), namespace prefix stripping, Task syntax conversion, and agent reference rewriting. 13 new tests (9 converter + 4 writer). (#174)
- dspy-ruby skill - Complete rewrite to DSPy.rb v0.34.3 API: .call() / result.field patterns, T::Enum classes, DSPy::Tools::Base / Toolset. Added events system, lifecycle callbacks, fiber-local LM context, GEPA optimization, evaluation framework, typed context pattern, BAML/TOON schema formats, storage system, score reporting, RubyLLM adapter. 5 reference files (2 new: toolsets, observability), 3 asset templates rewritten.
- document-review skill - Brainstorm and plan refinement through structured review (@Trevin Chow)
- /sync command - Sync Claude Code personal config across machines (@Terry Li)
- Context token optimization (79% reduction) - Plugin was consuming 316% of the context description budget, causing Claude Code to silently exclude components. Now at 65% with room to grow:
  - All 29 agent descriptions trimmed from ~1,400 to ~180 chars avg (examples moved to agent body)
  - 18 manual commands marked disable-model-invocation: true (side-effect commands like /lfg, /deploy-docs, /triage, etc.)
  - 6 manual skills marked disable-model-invocation: true (orchestrating-swarms, git-worktree, skill-creator, compound-docs, file-todos, resolve-pr-parallel)
- git-worktree: Remove confirmation prompt for worktree creation (@Sam Xie)
- Prevent subagents from writing intermediary files in compound workflow (@Trevin Chow)
- Fix crash when hook entries have no matcher (@Roberto Mello)
- Fix git-worktree detection where .git is a file, not a directory (@David Alley)
- Back up existing config files before overwriting in sync (@Zac Williams)
- Note new repository URL (@Aarni Koskela)
- Plugin component counts corrected: 29 agents, 24 commands, 18 skills
- orchestrating-swarms skill - Comprehensive guide to multi-agent orchestration
  - Covers primitives: Agent, Team, Teammate, Leader, Task, Inbox, Message, Backend
  - Documents two spawning methods: subagents vs teammates
  - Explains all 13 TeammateTool operations
  - Includes orchestration patterns: Parallel Specialists, Pipeline, Self-Organizing Swarm
  - Details spawn backends: in-process, tmux, iterm2
  - Provides complete workflow examples
- /slfg command - Swarm-enabled variant of /lfg that uses swarm mode for parallel execution
- /workflows:work command - Added optional Swarm Mode section for parallel execution with coordinated agents
- schema-drift-detector agent - Detects unrelated schema.rb changes in PRs
  - Compares schema.rb diff against migrations in the PR
  - Catches columns, indexes, and tables from other branches
  - Prevents accidental inclusion of local database state
  - Provides clear fix instructions (checkout + migrate)
  - Essential pre-merge check for any PR with database changes
- /workflows:brainstorm command - Guided ideation flow to expand options quickly (#101)
- /workflows:plan command - Smarter research decision logic before deep dives (#100)
- Research checks - Mandatory API deprecation validation in research flows (#102)
- Docs - Call out experimental OpenCode/Codex providers and install defaults
- CLI defaults - install pulls from GitHub by default and writes OpenCode/Codex output to global locations
- #102 feat(research): add mandatory API deprecation validation
- #101 feat: Add /workflows:brainstorm command and skill
- #100 feat(workflows:plan): Add smart research decision logic
Huge thanks to the community contributors who made this release possible! 🙌
- @tmchow - Brainstorm workflow, research decision logic (2 PRs)
- @jaredmorgenstern - API deprecation validation
- /workflows:plan command - Interactive Q&A refinement phase (#88)
  - After generating initial plan, now offers to refine with targeted questions
  - Asks up to 5 questions about ambiguous requirements, edge cases, or technical decisions
  - Incorporates answers to strengthen the plan before finalization
- /workflows:work command - Incremental commits and branch safety (#93)
  - Now commits after each completed task instead of batching at end
  - Added branch protection checks before starting work
  - Better progress tracking with per-task commits
- dhh-rails-style skill - Fixed broken markdown table formatting (#96)
- Documentation - Updated hardcoded year references from 2025 to 2026 (#86, #91)
Huge thanks to the community contributors who made this release possible! 🙌
- @tmchow - Interactive Q&A for plans, incremental commits, year updates (3 PRs!)
- @ashwin47 - Markdown table fix
- @rbouschery - Documentation year update
- 27 agents, 23 commands, 14 skills, 1 MCP server
- /workflows:work command - Now marks off checkboxes in plan document as tasks complete
  - Added step to update original plan file ([ ] → [x]) after each task
  - Ensures no checkboxes are left unchecked when work is done
  - Keeps plan as living document showing progress
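The checkbox update described above ([ ] → [x]) can be sketched as a small text transformation. This is only an illustration under stated assumptions: the helper names check_off and unchecked_count are hypothetical, and it assumes tasks are standard markdown checkbox items whose label matches the task title exactly.

```python
import re


# Hypothetical helper; the changelog only states that [ ] becomes [x].
def check_off(plan_text, task_title):
    """Mark the first unchecked checkbox whose label equals task_title."""
    pattern = re.compile(
        rf"^([ \t]*[-*] )\[ \]( {re.escape(task_title)}[ \t]*)$",
        re.MULTILINE,
    )
    return pattern.sub(r"\1[x]\2", plan_text, count=1)


def unchecked_count(plan_text):
    """Count checkbox items that are still open."""
    return len(re.findall(r"^[ \t]*[-*] \[ \]", plan_text, re.MULTILINE))


# Sample plan content is made up for illustration.
plan = "- [ ] Add model\n- [ ] Add controller\n"
after_first = check_off(plan, "Add model")
```

A final unchecked_count check of zero corresponds to the "no checkboxes are left unchecked" guarantee in the entry above.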
- /workflows:work command - PRs now include Compound Engineered badge
  - Updated PR template to include badge at bottom linking to plugin repo
  - Added badge requirement to quality checklist
  - Badge provides attribution and link to the plugin that created the PR
- design-iterator agent - Now auto-loads design skills at start of iterations
  - Added "Step 0: Discover and Load Design Skills (MANDATORY)" section
  - Discovers skills from ~/.claude/skills/, .claude/skills/, and plugin cache
  - Maps user context to relevant skills (Swiss design → swiss-design skill, etc.)
  - Reads SKILL.md files to load principles into context before iterating
  - Extracts key principles: grid specs, typography rules, color philosophy, layout principles
  - Skills are applied throughout ALL iterations for consistent design language
- /test-browser command - Clarified to use agent-browser CLI exclusively
  - Added explicit "CRITICAL: Use agent-browser CLI Only" section
  - Added warning: "DO NOT use Chrome MCP tools (mcp__claude-in-chrome__*)"
  - Added Step 0: Verify agent-browser installation before testing
  - Added full CLI reference section at bottom
  - Added Next.js route mapping patterns
- best-practices-researcher agent - Now checks skills before going online
  - Phase 1: Discovers and reads relevant SKILL.md files from plugin, global, and project directories
  - Phase 2: Only goes online for additional best practices if skills don't provide enough coverage
  - Phase 3: Synthesizes all findings with clear source attribution (skill-based > official docs > community)
  - Skill mappings: Rails → dhh-rails-style, Frontend → frontend-design, AI → agent-native-architecture, etc.
  - Prioritizes curated skill knowledge over external sources for trivial/common patterns
- /lfg command - Full autonomous engineering workflow
  - Orchestrates complete feature development from plan to PR
  - Runs: plan → deepen-plan → work → review → resolve todos → test-browser → feature-video
  - Uses ralph-loop for autonomous completion
  - Migrated from local command, updated to use /test-browser instead of /playwright-test
- 27 agents, 21 commands, 14 skills, 1 MCP server
- agent-browser skill - Browser automation using Vercel's agent-browser CLI
  - Navigate, click, fill forms, take screenshots
  - Uses ref-based element selection (simpler than Playwright)
  - Works in headed or headless mode
- Replaced Playwright MCP with agent-browser - Simpler browser automation across all browser-related features:
  - /test-browser command - Now uses agent-browser CLI with headed/headless mode option
  - /feature-video command - Uses agent-browser for screenshots
  - design-iterator agent - Browser automation via agent-browser
  - design-implementation-reviewer agent - Screenshot comparison
  - figma-design-sync agent - Design verification
  - bug-reproduction-validator agent - Bug reproduction
  - /review workflow - Screenshot capabilities
  - /work workflow - Browser testing
- /test-browser command - Added "Step 0" to ask user if they want headed (visible) or headless browser mode
- Playwright MCP server - Replaced by agent-browser CLI (simpler, no MCP overhead)
- /playwright-test command - Renamed to /test-browser
- 27 agents, 20 commands, 14 skills, 1 MCP server
- /reproduce-bug command - Enhanced with Playwright visual reproduction:
  - Added Phase 2 for visual bug reproduction using browser automation
  - Step-by-step guide for navigating to affected areas
  - Screenshot capture at each reproduction step
  - Console error checking
  - User flow reproduction with clicks, typing, and snapshots
  - Better documentation structure with 4 clear phases
- 27 agents, 21 commands, 13 skills, 2 MCP servers
- Agent model inheritance - All 26 agents now use model: inherit so they match the user's configured model. Only lint keeps model: haiku for cost efficiency. (fixes #69)
- 27 agents, 21 commands, 13 skills, 2 MCP servers
- /agent-native-audit command - Comprehensive agent-native architecture review
  - Launches 8 parallel sub-agents, one per core principle
  - Principles: Action Parity, Tools as Primitives, Context Injection, Shared Workspace, CRUD Completeness, UI Integration, Capability Discovery, Prompt-Native Features
  - Each agent produces specific score (X/Y format with percentage)
  - Generates summary report with overall score and top 10 recommendations
  - Supports single principle audit via argument
- 27 agents, 21 commands, 13 skills, 2 MCP servers
- rclone skill - Upload files to S3, Cloudflare R2, Backblaze B2, and other cloud storage providers
- /feature-video command - Enhanced with:
  - Better ffmpeg commands for video/GIF creation (proper scaling, framerate control)
  - rclone integration for cloud uploads
  - Screenshot copying to project folder
  - Improved upload options workflow
- 27 agents, 20 commands, 13 skills, 2 MCP servers
- Version history cleanup after merge conflict resolution
This release consolidates all recent work:
- /feature-video command for recording PR demos
- /deepen-plan command for enhanced planning
- create-agent-skills skill rewrite (official spec compliance)
- agent-native-architecture skill major expansion
- dhh-rails-style skill consolidation (merged dhh-ruby-style)
- 27 agents, 20 commands, 12 skills, 2 MCP servers
- /feature-video command - Record video walkthroughs of features using Playwright
- create-agent-skills skill - Complete rewrite to match Anthropic's official skill specification
- dhh-ruby-style skill - Merged into dhh-rails-style skill
- /deepen-plan command - Power enhancement for plans. Takes an existing plan and runs parallel research sub-agents for each major section to add:
  - Best practices and industry patterns
  - Performance optimizations
  - UI/UX improvements (if applicable)
  - Quality enhancements and edge cases
  - Real-world implementation examples
The result is a deeply grounded, production-ready plan with concrete implementation details.
- /workflows:plan command - Added /deepen-plan as option 2 in post-generation menu. Added note: if running with ultrathink enabled, automatically run deepen-plan for maximum depth.
- agent-native-architecture skill - Added Dynamic Capability Discovery pattern and Architecture Review Checklist:
New Patterns in mcp-tool-design.md:
- Dynamic Capability Discovery - For external APIs (HealthKit, HomeKit, GraphQL), build a discovery tool (list_*) that returns available capabilities at runtime, plus a generic access tool that takes strings (not enums). The API validates, not your code. This means agents can use new API capabilities without code changes.
- CRUD Completeness - Every entity the agent can create must also be readable, updatable, and deletable. Incomplete CRUD = broken action parity.
New in SKILL.md:
- Architecture Review Checklist - Pushes reviewer findings earlier into the design phase. Covers tool design (dynamic vs static, CRUD completeness), action parity (capability map, edit/delete), UI integration (agent → UI communication), and context injection.
- Option 11: API Integration - New intake option for connecting to external APIs like HealthKit, HomeKit, GraphQL
- New anti-patterns: Static Tool Mapping (building individual tools for each API endpoint), Incomplete CRUD (create-only tools)
- Tool Design Criteria section added to success criteria checklist
New in shared-workspace-architecture.md:
- iCloud File Storage for Multi-Device Sync - Use iCloud Documents for your shared workspace to get free, automatic multi-device sync without building a sync layer. Includes implementation pattern, conflict handling, entitlements, and when NOT to use it.
This update codifies a key insight for agent-native apps: when integrating with external APIs where the agent should have the same access as the user, use Dynamic Capability Discovery instead of static tool mapping. Instead of building read_steps, read_heart_rate, read_sleep... build list_health_types + read_health_data(dataType: string). The agent discovers what's available, the API validates the type.
Note: This pattern is specifically for agent-native apps following the "whatever the user can do, the agent can do" philosophy. For constrained agents with intentionally limited capabilities, static tool mapping may be appropriate.
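To make the contrast with static tool mapping concrete, here is a minimal sketch of the two-tool shape described above. It is an assumption-laden illustration: FAKE_HEALTH_API is a made-up in-memory stand-in for an external API such as HealthKit, the sample values are invented, and the function names simply follow the list_health_types / read_health_data naming used in the note.

```python
# Stand-in for an external API; names and values are hypothetical and exist
# only to make the sketch runnable.
FAKE_HEALTH_API = {
    "steps": [8200, 10450],
    "heart_rate": [61, 64],
    "sleep": [7.5, 6.8],
}


def list_health_types():
    """Discovery tool: report what the API currently exposes."""
    return sorted(FAKE_HEALTH_API)


def read_health_data(data_type):
    """Generic access tool: takes a plain string and lets the API validate it."""
    try:
        return FAKE_HEALTH_API[data_type]
    except KeyError:
        raise ValueError(f"unknown health type: {data_type!r}") from None


types_before = list_health_types()

# A new capability appears on the API side; neither tool needs to change.
FAKE_HEALTH_API["blood_oxygen"] = [98, 97]
types_after = list_health_types()
```

With static mapping, supporting blood_oxygen would have required a new read_blood_oxygen tool; here the agent discovers it through the existing pair.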
- agent-native-architecture skill - Major expansion based on real-world learnings from building the Every Reader iOS app. Added 5 new reference documents and expanded existing ones:
New References:
- dynamic-context-injection.md - How to inject runtime app state into agent system prompts. Covers context injection patterns, what context to inject (resources, activity, capabilities, vocabulary), implementation patterns for Swift/iOS and TypeScript, and context freshness.
- action-parity-discipline.md - Workflow for ensuring agents can do everything users can do. Includes capability mapping templates, parity audit process, PR checklists, tool design for parity, and context parity guidelines.
- shared-workspace-architecture.md - Patterns for agents and users working in the same data space. Covers directory structure, file tools, UI integration (file watching, shared stores), agent-user collaboration patterns, and security considerations.
- agent-native-testing.md - Testing patterns for agent-native apps. Includes "Can Agent Do It?" tests, the Surprise Test, automated parity testing, integration testing, and CI/CD integration.
- mobile-patterns.md - Mobile-specific patterns for iOS/Android. Covers background execution (checkpoint/resume), permission handling, cost-aware design (model tiers, token budgets, network awareness), offline handling, and battery awareness.
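The capability mapping and automated parity testing mentioned in these references might look something like the sketch below. The map contents, the tool registry, and the `find_parity_gaps` helper are all hypothetical and are not taken from the reference documents themselves.

```python
# Hypothetical capability map: each user-facing action and the agent tool that
# mirrors it (None marks a known gap). Action and tool names are illustrative.
CAPABILITY_MAP = {
    "create_note": "write_file",
    "edit_note": "write_file",
    "delete_note": "delete_file",
    "publish_to_feed": None,  # the user can do this in the UI, the agent cannot yet
}

# Hypothetical set of tools currently registered with the agent.
REGISTERED_TOOLS = {"read_file", "write_file", "delete_file"}


def find_parity_gaps(capability_map, registered_tools):
    """Return user actions that have no registered agent tool backing them."""
    return sorted(
        action
        for action, tool in capability_map.items()
        if tool is None or tool not in registered_tools
    )
```

A check such as `assert find_parity_gaps(CAPABILITY_MAP, REGISTERED_TOOLS) == []` could then run in CI, so a new UI action without a matching tool fails the build rather than surfacing later as an orphan feature.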
Updated References:
- architecture-patterns.md - Added 3 new patterns: Unified Agent Architecture (one orchestrator, many agent types), Agent-to-UI Communication (shared data store, file watching, event bus), and Model Tier Selection (fast/balanced/powerful).
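One possible reading of Model Tier Selection is a small routing function like the one below. The inputs (`estimated_steps`, `needs_deep_reasoning`) and the threshold of two steps are assumptions made for illustration; the reference document may use different criteria.

```python
from enum import Enum


class ModelTier(Enum):
    FAST = "fast"
    BALANCED = "balanced"
    POWERFUL = "powerful"


def select_tier(estimated_steps: int, needs_deep_reasoning: bool) -> ModelTier:
    """Pick the cheapest tier that plausibly fits the task (hypothetical rules)."""
    if needs_deep_reasoning:
        return ModelTier.POWERFUL
    if estimated_steps <= 2:  # assumed cutoff for "simple" tasks
        return ModelTier.FAST
    return ModelTier.BALANCED
```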
Updated Skill Root:
- SKILL.md - Expanded intake menu (now 10 options including context injection, action parity, shared workspace, testing, mobile patterns). Added 5 new agent-native anti-patterns (Context Starvation, Orphan Features, Sandbox Isolation, Silent Actions, Capability Hiding). Expanded success criteria with agent-native and mobile-specific checklists.
- agent-native-reviewer agent - Significantly enhanced with a comprehensive review process covering all new patterns. Now checks for action parity, context parity, shared workspace, tool design (primitives vs workflows), dynamic context injection, and mobile-specific concerns. Includes detailed anti-patterns, output format template, quick checks ("Write to Location" test, Surprise test), and mobile-specific verification.
These updates operationalize a key insight from building agent-native mobile apps: "The agent should be able to do anything the user can do, through tools that mirror UI capabilities, with full context about the app state." The failure case that prompted these changes: an agent asked "what reading feed?" when a user said "write something in my reading feed"—because it had no publish_to_feed tool and no context about what "feed" meant.
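The failure case above suggests what dynamic context injection needs to supply. The following is a minimal sketch, assuming a plain-text system prompt assembled at request time; the `build_system_prompt` helper, its section headings, and its parameters are hypothetical rather than the skill's actual implementation.

```python
def build_system_prompt(base_prompt, resources, capabilities, vocabulary):
    """Append runtime app state to a base prompt (illustrative layout only)."""
    lines = [base_prompt, "", "## Available resources"]
    lines += [f"- {resource}" for resource in resources]
    lines += ["", "## What you can do"]
    lines += [f"- {capability}" for capability in capabilities]
    lines += ["", "## Vocabulary"]
    lines += [f"- {term}: {meaning}" for term, meaning in vocabulary.items()]
    return "\n".join(lines)


# Example values are made up for illustration.
prompt = build_system_prompt(
    "You are the in-app assistant.",
    resources=["3 saved articles"],
    capabilities=["publish_to_feed: post text to the user's reading feed"],
    vocabulary={"reading feed": "the in-app list where the user publishes notes"},
)
```

With the tool and the term both present in the prompt, a request like "write something in my reading feed" has something to resolve against.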
- dhh-rails-style skill - Massively expanded reference documentation incorporating patterns from Marc Köhlbrugge's Unofficial 37signals Coding Style Guide:
- controllers.md - Added authorization patterns, rate limiting, Sec-Fetch-Site CSRF protection, request context concerns
- models.md - Added validation philosophy, let it crash philosophy (bang methods), default values with lambdas, Rails 7.1+ patterns (normalizes, delegated types, store accessor), concern guidelines with touch chains
- frontend.md - Added Turbo morphing best practices, Turbo frames patterns, 6 new Stimulus controllers (auto-submit, dialog, local-time, etc.), Stimulus best practices, view helpers, caching with personalization, broadcasting patterns
- architecture.md - Added path-based multi-tenancy, database patterns (UUIDs, state as records, hard deletes, counter caches), background job patterns (transaction safety, error handling, batch processing), email patterns, security patterns (XSS, SSRF, CSP), Active Storage patterns
- gems.md - Added expanded what-they-avoid section (service objects, form objects, decorators, CSS preprocessors, React/Vue), testing philosophy with Minitest/fixtures patterns
- Reference patterns derived from Marc Köhlbrugge's Unofficial 37signals Coding Style Guide
- All skills - Fixed spec compliance issues across 12 skills:
- Reference files now use proper markdown links ([file.md](./references/file.md)) instead of backtick text
- Descriptions now use third person ("This skill should be used when...") per skill-creator spec
- Affected skills: agent-native-architecture, andrew-kane-gem-writer, compound-docs, create-agent-skills, dhh-rails-style, dspy-ruby, every-style-editor, file-todos, frontend-design, gemini-imagegen
- CLAUDE.md - Added Skill Compliance Checklist with validation commands for ensuring new skills meet spec requirements
- /workflows:review command - Section 7 now detects project type (Web, iOS, or Hybrid) and offers appropriate testing. Web projects get /playwright-test, iOS projects get /xcode-test, and hybrid projects can run both.
- /xcode-test command - Build and test iOS apps on simulator using XcodeBuildMCP. Automatically detects the Xcode project, builds the app, launches the simulator, and runs the test suite. Includes retries for flaky tests.
- /playwright-test command - Run Playwright browser tests on pages affected by the current PR or branch. Detects changed files, maps them to affected routes, generates and runs targeted tests, and reports results with screenshots.
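The project-type detection that /workflows:review performs before offering these two commands could be approximated as shown below. This is only a sketch: the marker files it checks (`*.xcodeproj`, `*.xcworkspace`, `package.json`, `Gemfile`) and the `detect_project_type` helper are assumptions, since the changelog does not say how the command decides.

```python
from pathlib import Path

# Which test commands to offer for each detected project type.
TEST_COMMANDS = {
    "web": ["/playwright-test"],
    "ios": ["/xcode-test"],
    "hybrid": ["/playwright-test", "/xcode-test"],
}


def detect_project_type(root) -> str:
    """Classify a checkout as web, ios, hybrid, or unknown (assumed markers)."""
    root = Path(root)
    has_ios = any(root.glob("*.xcodeproj")) or any(root.glob("*.xcworkspace"))
    has_web = (root / "package.json").exists() or (root / "Gemfile").exists()
    if has_ios and has_web:
        return "hybrid"
    if has_ios:
        return "ios"
    if has_web:
        return "web"
    return "unknown"
```

Looking up `TEST_COMMANDS.get(detect_project_type("."), [])` would then give the list of commands to offer, with an empty list when no marker is found.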