From 2d859587aa84add0cc30b55269fe4117693a992b Mon Sep 17 00:00:00 2001 From: baskduf Date: Tue, 9 Jun 2026 11:39:18 +0900 Subject: [PATCH 1/4] Add harness engineering skill --- docs/README.skills.md | 1 + skills/harness-engineering/SKILL.md | 218 ++++++++++++++++++++++++++++ 2 files changed, 219 insertions(+) create mode 100644 skills/harness-engineering/SKILL.md diff --git a/docs/README.skills.md b/docs/README.skills.md index 9c448fc5c..705bb13fe 100644 --- a/docs/README.skills.md +++ b/docs/README.skills.md @@ -199,6 +199,7 @@ See [CONTRIBUTING.md](../CONTRIBUTING.md#adding-skills) for guidelines on how to | [gtm-positioning-strategy](../skills/gtm-positioning-strategy/SKILL.md)
`gh skills install github/awesome-copilot gtm-positioning-strategy` | Find and own a defensible market position. Use when messaging sounds like competitors, conversion is weak despite awareness, repositioning a product, or testing positioning claims. Includes Crawl-Walk-Run rollout methodology and the word change that improved enterprise deal progression. | None | | [gtm-product-led-growth](../skills/gtm-product-led-growth/SKILL.md)
`gh skills install github/awesome-copilot gtm-product-led-growth` | Build self-serve acquisition and expansion motions. Use when deciding PLG vs sales-led, optimizing activation, driving freemium conversion, building growth equations, or recognizing when product complexity demands human touch. Includes the parallel test where sales-led won 10x on revenue. | None | | [gtm-technical-product-pricing](../skills/gtm-technical-product-pricing/SKILL.md)
`gh skills install github/awesome-copilot gtm-technical-product-pricing` | Pricing strategy for technical products. Use when choosing usage-based vs seat-based, designing freemium thresholds, structuring enterprise pricing conversations, deciding when to raise prices, or using price as a positioning signal. | None | +| [harness-engineering](../skills/harness-engineering/SKILL.md)
`gh skills install github/awesome-copilot harness-engineering` | Adopt repository-level harness engineering for coding agents. Use when a user wants to prevent repeated AI coding-agent mistakes by turning failures into durable instructions, drift checks, regression tests, failure memory, and adoption reports tailored to the target repository. | None | | [image-annotations](../skills/image-annotations/SKILL.md)
`gh skills install github/awesome-copilot image-annotations` | Annotate screenshots, diagrams, and images with callout rectangles, arrows, labels, and color-coded highlights using PIL. Includes rules for animated GIF annotations with timing and pacing. | None | | [image-manipulation-image-magick](../skills/image-manipulation-image-magick/SKILL.md)
`gh skills install github/awesome-copilot image-manipulation-image-magick` | Process and manipulate images using ImageMagick. Supports resizing, format conversion, batch processing, and retrieving image metadata. Use when working with images, creating thumbnails, resizing wallpapers, or performing batch image operations. | None | | [impediment-prioritization](../skills/impediment-prioritization/SKILL.md)
`gh skills install github/awesome-copilot impediment-prioritization` | Ranks any list of impediments and their countermeasures using a value-stream scoring model (ROI, Cost to Implement, Ease of Deployment, Risk Factor) and a fixed prioritization formula. Use when someone asks to prioritize, rank, sequence, or triage impediments, countermeasures, remediation items, risks, findings, gaps, action items, or backlog entries; or mentions value-stream prioritization, A3 / lean countermeasure ranking, ROI vs. effort scoring, or building a remediation / improvement backlog. Works with GHQR findings, audit results, retrospective action items, risk registers, architecture review gaps, or any free-form `{impediment, countermeasure}` list. | `references/scoring-rubric.md` | diff --git a/skills/harness-engineering/SKILL.md b/skills/harness-engineering/SKILL.md new file mode 100644 index 000000000..8175f6e0a --- /dev/null +++ b/skills/harness-engineering/SKILL.md @@ -0,0 +1,218 @@ +--- +name: harness-engineering +description: 'Adopt repository-level harness engineering for coding agents. Use when a user wants to prevent repeated AI coding-agent mistakes by turning failures into durable instructions, drift checks, regression tests, failure memory, and adoption reports tailored to the target repository.' 
+--- + +# Harness Engineering + +Harness engineering turns repeated coding-agent mistakes into durable +repository artifacts: + +```text +Harness = Instructions + Constraints + Feedback + Memory + Evaluation + Governance +``` + +Use this skill when the user asks to: + +- make a repository more reliable for GitHub Copilot or other coding agents +- add durable agent instructions, repository rules, or guardrails +- prevent repeated AI coding-agent mistakes +- record known failure paths and the checks that prevent recurrence +- add lightweight drift checks for project rules +- review, refresh, or update an existing agent harness + +Do not use this skill for ordinary feature implementation unless the user asks +to improve the repository's agent operating environment. + +## Core Principles + +- Treat the target repository as the source of truth. +- Inspect before editing. Preserve the existing stack, package manager, CI, + docs, naming, and architecture. +- Add the smallest useful harness. Prefer updating existing files over adding + duplicate guidance. +- Make important rules enforceable where practical through tests, linters, + type checks, CI, pre-commit hooks, or drift scripts. +- Use manual review points only when automation would be brittle or misleading. +- Record high-risk failures that should not recur, and name the check or review + point that catches recurrence. +- Do not copy generic templates blindly. Adapt every artifact to real evidence + in the target repository. + +## Discovery + +Before proposing or making harness changes, inspect the repository for existing +rules and evidence. 
+ +Read these files and folders when they exist: + +- `README.md` +- `AGENTS.md` +- `.github/copilot-instructions.md` +- `.github/instructions/` +- `.github/workflows/` +- `CONTRIBUTING.md` +- package manifests such as `package.json`, `pyproject.toml`, `go.mod`, + `Cargo.toml`, `pom.xml`, or `build.gradle` +- existing docs under `docs/` +- existing scripts under `scripts/` +- existing tests and CI checks + +Then summarize: + +- stack, package manager, and entry points +- existing development and verification commands +- current agent instructions or repository conventions +- known failures, incidents, flaky paths, or repeated review comments +- gaps where project rules are not enforced + +## Adoption Workflow + +### 1. Choose the Harness Surface + +Pick only the surfaces that fit the target repository: + +| Need | Preferred artifact | +| --- | --- | +| Always-on agent behavior | `AGENTS.md` or `.github/copilot-instructions.md` | +| File-scoped guidance | `.github/instructions/*.instructions.md` | +| Recurring project checks | `scripts/check_*.py`, shell scripts, or package scripts | +| CI enforcement | existing workflow files or a small new workflow | +| Known failures | `docs/failures/*.md` | +| Architecture or process decisions | `docs/decisions/*.md` | +| Adoption evidence | `docs/harness/adoption-report.md` or similar | + +If the repository already has an equivalent location, update it instead of +creating a parallel system. + +### 2. Write Agent Instructions + +Agent instructions should be concrete and operational. 
Include: + +- project purpose and major ownership boundaries +- setup, test, lint, build, and verification commands +- package manager and dependency rules +- safe editing rules, generated file rules, and forbidden paths +- testing expectations for changed code +- PR and commit conventions if the repo has them +- how to record new failures or decisions + +Avoid broad personality guidance, generic best practices, and rules that cannot +be checked or reviewed. + +### 3. Add Enforceable Checks + +Convert high-value rules into checks. Good harness checks are: + +- narrow enough to avoid false positives +- fast enough to run locally and in CI +- named clearly so agents can run them before finishing +- documented with the rule they protect + +Examples: + +```text +Rule: Do not edit generated API clients. +Check: script scans diffs for generated paths and fails with a clear message. + +Rule: Every failure memory note names a regression check. +Check: script validates docs/failures/*.md for a "Prevention" section. + +Rule: Profile docs and templates must stay aligned. +Check: test compares profile README files to expected template files. +``` + +### 4. Record Failure Memory + +Record failures when they are user-visible, high-risk, or likely to recur. +Use a new file under `docs/failures/` unless an existing note already covers +the same root cause. + +Recommended structure: + +```markdown +# Short Failure Title + +## Summary + +What failed, who saw it, and why it matters. + +## Root Cause + +The technical or process cause. Avoid blame. + +## Prevention + +Instruction, test, drift check, CI gate, fixture, or manual review point that +prevents or detects recurrence. + +## Evidence + +Links to issue, PR, test, log, command output, or file paths. +``` + +If no automated check is practical, record the manual review point and why +automation would be unsafe or misleading. + +### 5. Add Drift Checks + +Use drift checks for guidance that can silently become stale. 
Common examples: + +- docs mention commands that no longer exist +- profile snippets and generated examples diverge +- failure notes omit regression checks +- decision records are missing for structural changes +- CI references stale scripts or package commands + +Prefer small scripts using the repository's existing language. If the repo has +no scripting convention, Python with only the standard library is a portable +default. + +### 6. Report the Adoption + +Finish substantial harness work with an adoption report that includes: + +- files changed +- rules added or updated +- checks added or reused +- commands run and results +- assumptions and manual follow-up +- failure memory created or intentionally skipped +- how effectiveness will be measured + +## Review Workflow + +When asked to review a harness change, take an opposing perspective. Look for: + +- generic rules copied without evidence from the target repository +- duplicate or conflicting instruction files +- broad checks that are likely to fail on valid changes +- unenforced high-risk rules +- missing failure memory for repeated mistakes or runtime failures +- generated docs not refreshed after source changes +- CI gates that do not run the relevant checks +- target repository conventions being overwritten by harness defaults + +Report findings first, ordered by severity, with file and line references when +available. Do not modify files during a review unless the user explicitly asks +for fixes. 
+ +## Output Contract + +Before finishing harness adoption work, verify: + +- the target repository was inspected before edits +- new guidance is specific to the target repository +- changed checks can be run locally or have a documented manual substitute +- failure memory was recorded when required, or the final response explains why + it was skipped +- generated docs or indexes are refreshed +- the final report names every command run and its result + +## Optional Reference + +The prompt-first workflow in +`https://github.com/baskduf/harness-starter-kit` is a reference implementation +of these ideas. Use it as reference material only when the user asks for it or +when the repository already includes it. The target repository remains the +source of truth. From 6d5ab9d5ec8fa0e0b68f6a560ff3044c82c80895 Mon Sep 17 00:00:00 2001 From: baskduf Date: Tue, 9 Jun 2026 11:42:23 +0900 Subject: [PATCH 2/4] Add numbered workflow to harness skill --- skills/harness-engineering/SKILL.md | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/skills/harness-engineering/SKILL.md b/skills/harness-engineering/SKILL.md index 8175f6e0a..7dcaad853 100644 --- a/skills/harness-engineering/SKILL.md +++ b/skills/harness-engineering/SKILL.md @@ -68,6 +68,15 @@ Then summarize: ## Adoption Workflow +Follow this sequence: + +1. Choose the harness surface that fits the target repository. +2. Write target-specific agent instructions. +3. Add enforceable checks for high-value rules. +4. Record failure memory for high-risk or recurring failures. +5. Add drift checks for guidance that can silently become stale. +6. Report the adoption with evidence, assumptions, and follow-up. + ### 1. 
Choose the Harness Surface Pick only the surfaces that fit the target repository: From 9eeefa925f8124bcd817c63dd8b70141d4bdbdb1 Mon Sep 17 00:00:00 2001 From: baskduf Date: Tue, 9 Jun 2026 11:55:32 +0900 Subject: [PATCH 3/4] Remove external reference from harness skill --- skills/harness-engineering/SKILL.md | 8 -------- 1 file changed, 8 deletions(-) diff --git a/skills/harness-engineering/SKILL.md b/skills/harness-engineering/SKILL.md index 7dcaad853..305f51cbd 100644 --- a/skills/harness-engineering/SKILL.md +++ b/skills/harness-engineering/SKILL.md @@ -217,11 +217,3 @@ Before finishing harness adoption work, verify: it was skipped - generated docs or indexes are refreshed - the final report names every command run and its result - -## Optional Reference - -The prompt-first workflow in -`https://github.com/baskduf/harness-starter-kit` is a reference implementation -of these ideas. Use it as reference material only when the user asks for it or -when the repository already includes it. The target repository remains the -source of truth. From 3a28b11cfe8f7cca93b4b729c887facbb46cc50c Mon Sep 17 00:00:00 2001 From: baskduf Date: Tue, 9 Jun 2026 12:31:10 +0900 Subject: [PATCH 4/4] Revert "Remove external reference from harness skill" This reverts commit 9eeefa925f8124bcd817c63dd8b70141d4bdbdb1. --- skills/harness-engineering/SKILL.md | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/skills/harness-engineering/SKILL.md b/skills/harness-engineering/SKILL.md index 305f51cbd..7dcaad853 100644 --- a/skills/harness-engineering/SKILL.md +++ b/skills/harness-engineering/SKILL.md @@ -217,3 +217,11 @@ Before finishing harness adoption work, verify: it was skipped - generated docs or indexes are refreshed - the final report names every command run and its result + +## Optional Reference + +The prompt-first workflow in +`https://github.com/baskduf/harness-starter-kit` is a reference implementation +of these ideas. 
Use it as reference material only when the user asks for it or +when the repository already includes it. The target repository remains the +source of truth.
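The skill describes drift checks only in prose, so a minimal sketch of one may help reviewers picture the intended shape. The example below assumes failure notes live under `docs/failures/` and follow the recommended structure from the patch (`Summary`, `Root Cause`, `Prevention`, `Evidence`). The function names `missing_sections` and `check_failure_notes` are hypothetical and are not part of the skill or of any existing tool. It uses only the Python standard library, which the skill names as a portable default.

```python
import re
from pathlib import Path

# Section names assumed from the recommended failure-note structure in the patch.
REQUIRED_SECTIONS = ("Summary", "Root Cause", "Prevention", "Evidence")

# Matches level-2 Markdown headings only ("## Title"), not "#" or "###".
HEADING_RE = re.compile(r"^##[ \t]+(.+?)[ \t]*$", re.MULTILINE)


def missing_sections(markdown_text, required=REQUIRED_SECTIONS):
    """Return the required level-2 headings that are absent from one note."""
    found = {match.group(1) for match in HEADING_RE.finditer(markdown_text)}
    return [name for name in required if name not in found]


def check_failure_notes(directory):
    """Map each note path to its missing sections; notes without gaps are omitted."""
    problems = {}
    for path in sorted(Path(directory).glob("*.md")):
        missing = missing_sections(path.read_text(encoding="utf-8"))
        if missing:
            problems[str(path)] = missing
    return problems
```

A repository could wrap `check_failure_notes("docs/failures")` in a small script that prints each path with its missing sections and exits non-zero when the result is non-empty, so the same check runs locally and in CI. Headings inside fenced code blocks are also counted by this sketch, which a real check might want to exclude.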