Commit 81c5e7d

Authored by ChrisJBurns and claude

Add dry-run doc validation skill (#724)
* Add dry-run doc validation skill

  New /test-docs-dryrun skill for fast CRD schema validation of all YAML blocks in K8s and vMCP documentation. Extracts YAML from docs, runs kubectl apply --dry-run=server, and reports pass/fail per doc. No resources created. Completes in under 5 minutes for all docs.

  Includes:
  - SKILL.md with scope selection (single page, section, all)
  - Python extraction script for YAML blocks from .mdx files
  - Lightweight prereqs checker (only needs CRDs, not operator)

* Improve dry-run skill with modular procedures

  - Split skill into SKILL.md (workflow) + 4 procedure files: cluster-setup.md, extract.md, validate.md, report.md
  - Fix duplicate filename bug: extraction script now takes --prefix flag to namespace output files by section (k8s-, vmcp-)
  - Fix cluster lifecycle: check for existing cluster first, default to keeping it after the run
  - Fix result tracking: use CSV file instead of bash associative arrays for zsh compatibility
  - Require per-doc breakdown table as primary output every run

* Fix validation procedure to use script file

  Long-running for loops with nested if/elif and subshells get backgrounded or time out when run as inline Bash tool calls. The procedure now instructs writing the loop to a .sh file first, then executing with bash. Also uses || true after kubectl and checks the output string instead of the exit code for reliability.

* Address PR review feedback for dry-run skill

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 2d1a73a commit 81c5e7d

7 files changed

Lines changed: 479 additions & 0 deletions


Lines changed: 56 additions & 0 deletions
@@ -0,0 +1,56 @@
---
name: test-docs-dryrun
description: >
  Fast CRD schema validation for ToolHive documentation. Extracts all YAML blocks from K8s and vMCP docs, runs kubectl apply --dry-run=server to catch field name errors, type mismatches, and schema drift. No cluster resources are created. Use for: "dry-run the docs", "validate the YAML", "check for schema issues", "run a quick doc check", or after any CRD/API change to catch doc rot. Requires a Kubernetes cluster with ToolHive CRDs installed.
---

# Test Docs - Dry Run

Fast CRD schema validation for all ToolHive documentation YAML blocks. Extracts every YAML block containing `toolhive.stacklok.dev` resources, runs `kubectl apply --dry-run=server` on each, and reports pass/fail in a per-doc table. No resources are created. Completes in under 5 minutes for all docs.

## Workflow

Follow these steps in order. Each step references a procedure file in the `procedures/` directory - read that file and follow its instructions exactly. Do NOT improvise inline bash when a procedure exists.

### 1. Determine scope

Ask what to validate:

- **Kubernetes Operator** - all `.mdx` in `docs/toolhive/guides-k8s/`
- **Virtual MCP Server** - all `.mdx` in `docs/toolhive/guides-vmcp/`
- **All** - both sections

### 2. Cluster setup

Follow `procedures/cluster-setup.md`. Key points:

- Check for an existing `toolhive` kind cluster FIRST
- Only create one if missing
- Only CRDs are needed, not the operator
- Default to KEEPING the cluster after the run

### 3. Extract YAML

Follow `procedures/extract.md`. Key points:

- Always use `scripts/extract-yaml.py` with the `--prefix` flag
- Use `--prefix k8s` for K8s docs, `--prefix vmcp` for vMCP docs
- This prevents filename collisions between sections

### 4. Validate

Follow `procedures/validate.md`. Key points:

- Write results to a CSV file (not bash variables)
- Classify as pass/fail/expected/skip per the procedure
- One CSV line per YAML block

### 5. Report

Follow `procedures/report.md`. Key points:

- Always output the per-doc breakdown table to the user
- The table is the PRIMARY output of this skill
- Bold non-zero fail counts
- List real failures with error messages and fix suggestions
- Write results to `TEST_DRYRUN_*.md`
Lines changed: 42 additions & 0 deletions
@@ -0,0 +1,42 @@
# Cluster setup

How to manage the kind cluster for dry-run validation. The goal is to avoid unnecessary cluster creation and deletion between runs.

## Check for existing cluster

Always check first. Do NOT create a cluster without checking.

```bash
kind get clusters 2>/dev/null | grep -q "^toolhive$"
```

If the cluster exists, verify that CRDs are installed:

```bash
kubectl get crd --context kind-toolhive 2>/dev/null | grep -c toolhive.stacklok.dev
```

If CRDs are present (count > 0), skip straight to extraction. No setup is needed.

## Create cluster only if missing

If no `toolhive` cluster exists:

```bash
kind create cluster --name toolhive
helm upgrade --install toolhive-operator-crds \
  oci://ghcr.io/stacklok/toolhive/toolhive-operator-crds \
  -n toolhive-system --create-namespace
```

The operator is NOT needed for dry-run validation. Only install the CRDs.

## After the run: keep the cluster

Default to keeping the cluster after the run. Do NOT delete it unless the user explicitly asks. This avoids the 2-minute cluster creation penalty on the next run.

If the user asks to clean up, delete with:

```bash
kind delete cluster --name toolhive
```
Lines changed: 61 additions & 0 deletions
@@ -0,0 +1,61 @@
# Extract YAML blocks

How to extract ToolHive CRD YAML blocks from documentation files. Always use the extraction script - do NOT write inline bash to parse YAML.

## Commands

Use the `--prefix` flag to avoid filename collisions between sections. This is critical because both sections have files with the same name (e.g., `telemetry-and-metrics.mdx`).

```bash
SKILL_PATH="<skill-path>"
YAML_DIR="$(mktemp -d)/yaml"
mkdir -p "$YAML_DIR"

# K8s Operator docs
for f in docs/toolhive/guides-k8s/*.mdx; do
  python3 "$SKILL_PATH/scripts/extract-yaml.py" "$f" "$YAML_DIR" --prefix k8s
done

# vMCP docs
for f in docs/toolhive/guides-vmcp/*.mdx; do
  python3 "$SKILL_PATH/scripts/extract-yaml.py" "$f" "$YAML_DIR" --prefix vmcp
done
```

Replace `<skill-path>` with the absolute path to this skill's directory.

## Output

The script writes one file per YAML block:

```text
k8s-auth-k8s_0.yaml
k8s-auth-k8s_1.yaml
k8s-run-mcp-k8s_0.yaml
vmcp-authentication_0.yaml
vmcp-telemetry-and-metrics_0.yaml
```

The prefix ensures `k8s-telemetry-and-metrics_0.yaml` and `vmcp-telemetry-and-metrics_0.yaml` don't collide.

## For a single section

If only validating one section, run just the relevant loop:

```bash
# K8s only
for f in docs/toolhive/guides-k8s/*.mdx; do
  python3 "$SKILL_PATH/scripts/extract-yaml.py" "$f" "$YAML_DIR" --prefix k8s
done

# vMCP only
for f in docs/toolhive/guides-vmcp/*.mdx; do
  python3 "$SKILL_PATH/scripts/extract-yaml.py" "$f" "$YAML_DIR" --prefix vmcp
done
```

## For a single page

```bash
python3 "$SKILL_PATH/scripts/extract-yaml.py" <file-path> "$YAML_DIR" --prefix single
```
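The `extract-yaml.py` script itself is not part of this diff hunk. As a rough sketch of the behavior described above - pull fenced YAML blocks that mention `toolhive.stacklok.dev` out of an `.mdx` file and write them as `<prefix>-<doc>_<N>.yaml` - the core logic could look like this; function names and the regex are illustrative, not the real script:

```python
"""Sketch of the extraction logic described above (illustrative, not the
shipped scripts/extract-yaml.py)."""
import re
from pathlib import Path

# Match fenced ```yaml / ```yml blocks; non-greedy body, DOTALL for newlines.
FENCE_RE = re.compile(r"```(?:yaml|yml)[^\n]*\n(.*?)```", re.DOTALL)

def extract_blocks(text: str) -> list:
    """Return fenced YAML blocks that reference ToolHive CRDs."""
    return [m for m in FENCE_RE.findall(text) if "toolhive.stacklok.dev" in m]

def write_blocks(mdx_path: Path, out_dir: Path, prefix: str) -> int:
    """Write each block as <prefix>-<docname>_<i>.yaml; return the count."""
    stem = mdx_path.stem  # e.g. 'telemetry-and-metrics'
    blocks = extract_blocks(mdx_path.read_text())
    out_dir.mkdir(parents=True, exist_ok=True)
    for i, block in enumerate(blocks):
        (out_dir / f"{prefix}-{stem}_{i}.yaml").write_text(block)
    return len(blocks)
```

This naming scheme is what makes the `--prefix` deduplication work: the section prefix and the source doc's stem are both baked into every output filename.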
Lines changed: 75 additions & 0 deletions
@@ -0,0 +1,75 @@
# Report results

How to produce the per-doc breakdown table from the results CSV.

## Build the table from CSV

The validate step writes a CSV at `$RESULTS` with columns: `doc,filename,result,detail`

Aggregate it into the per-doc table using this bash:

```bash
RESULTS="<path-to-results.csv>"

echo "| Doc | Section | Blocks | Pass | Fail | Expected | Skip |"
echo "|-----|---------|--------|------|------|----------|------|"

TBLOCKS=0; TPASS=0; TFAIL=0; TEXPECTED=0; TSKIP=0

for doc in $(cut -d',' -f1 "$RESULTS" | sort -u); do
  blocks=$(grep -c "^$doc," "$RESULTS")
  pass=$(grep -c "^$doc,[^,]*,pass" "$RESULTS" || true)
  fail=$(grep -c "^$doc,[^,]*,fail" "$RESULTS" || true)
  expected=$(grep -c "^$doc,[^,]*,expected" "$RESULTS" || true)
  skip=$(grep -c "^$doc,[^,]*,skip" "$RESULTS" || true)

  # Determine section from prefix
  section="K8s"
  echo "$doc" | grep -q "^vmcp-" && section="vMCP"

  # Bold non-zero fail counts
  fail_display="$fail"
  [ "$fail" -gt 0 ] && fail_display="**$fail**"

  echo "| ${doc} | ${section} | ${blocks} | ${pass} | ${fail_display} | ${expected} | ${skip} |"

  TBLOCKS=$((TBLOCKS + blocks))
  TPASS=$((TPASS + pass))
  TFAIL=$((TFAIL + fail))
  TEXPECTED=$((TEXPECTED + expected))
  TSKIP=$((TSKIP + skip))
done

echo "| **TOTAL** | | **$TBLOCKS** | **$TPASS** | **$TFAIL** | **$TEXPECTED** | **$TSKIP** |"
```

## Table rules

- Include every doc that had at least 1 YAML block extracted
- Omit docs with 0 blocks
- Group K8s docs first (prefix `k8s-`), then vMCP docs (prefix `vmcp-`)
- Sort alphabetically within each section
- Bold the TOTAL row
- Bold any non-zero Fail count

## After the table

List each real failure with:

```text
### Real failures

**k8s-mcp-server-entry_1.yaml**: `strict decoding error: unknown field "spec.remoteURL"`
Fix: Change `remoteURL` to `remoteUrl`

**vmcp-optimizer_3.yaml**: `strict decoding error: unknown field "spec.modelCache.storageSize"`
Fix: Change `storageSize` to `size`
```

## Write to file

Always write the table and failure details to a markdown file:

- Single page: `TEST_DRYRUN_<PAGE_NAME>.md`
- Section: `TEST_DRYRUN_<SECTION_NAME>.md`
- All: `TEST_DRYRUN_ALL.md`
Lines changed: 76 additions & 0 deletions
@@ -0,0 +1,76 @@
# Validate YAML blocks

How to dry-run validate extracted YAML blocks efficiently.

## Important: use a script file, not inline bash

Long-running `for` loops with nested `if/elif/else` and subshells do NOT work reliably as inline Bash tool calls - they get backgrounded or time out. Always write the validation loop to a temporary `.sh` file and execute it with `bash`.

## Step 1: Write the validation script

Write the following to a temp file (e.g., `$YAML_DIR/../validate.sh`). Replace `<yaml-dir>` with the actual YAML directory path.

```bash
#!/bin/bash
set -euo pipefail

YAML_DIR="<yaml-dir>"
RESULTS="$YAML_DIR/../results.csv"
> "$RESULTS"

for f in "$YAML_DIR"/*.yaml; do
  bn=$(basename "$f")
  doc=$(echo "$bn" | sed 's/_[0-9]*\.yaml$//')

  # Skip incomplete resources (snippets without kind/apiVersion/name)
  if ! grep -q 'kind:' "$f" || ! grep -q 'apiVersion:' "$f" || ! grep -q 'name:' "$f"; then
    echo "$doc,$bn,skip,incomplete fragment" >> "$RESULTS"
    continue
  fi

  OUT=$(kubectl apply --dry-run=server -f "$f" 2>&1) || true

  if echo "$OUT" | grep -q 'created (server dry run)'; then
    echo "$doc,$bn,pass," >> "$RESULTS"
  elif echo "$OUT" | grep -q 'namespaces.*not found'; then
    echo "$doc,$bn,pass,namespace missing but schema valid" >> "$RESULTS"
  elif echo "$OUT" | grep -qE '<SERVER_NAME>|<NAMESPACE>|<SERVER_URL>'; then
    echo "$doc,$bn,expected,placeholder name" >> "$RESULTS"
  elif echo "$OUT" | grep -qE 'incomingAuth: Required|compositeTools.*steps: Required'; then
    echo "$doc,$bn,expected,partial snippet" >> "$RESULTS"
  else
    ERR=$(echo "$OUT" | tr '\n' ' ' | head -c 200)
    echo "$doc,$bn,fail,$ERR" >> "$RESULTS"
  fi
done

echo "Done: $(wc -l < "$RESULTS") lines"
```

## Step 2: Run the script

Use the Write tool to create the script file, then execute it:

```bash
bash "$YAML_DIR/../validate.sh"
```

This runs in a single process and completes reliably regardless of how many YAML blocks are validated.

## Result classification

| Result     | Meaning                                  | Action needed?    |
| ---------- | ---------------------------------------- | ----------------- |
| `pass`     | Schema valid (or only namespace missing) | No                |
| `expected` | Placeholder name or partial snippet      | No                |
| `fail`     | Real schema error                        | Yes - fix the doc |
| `skip`     | Incomplete fragment (no kind/name)       | No                |

## Important notes

- ALWAYS write to a script file first, then execute with `bash`
- Use `|| true` after `kubectl apply` to prevent `set -e` from exiting on validation failures
- Check for `created (server dry run)` in the output rather than the exit code, because some versions of kubectl return non-zero even on dry-run success with warnings
- Use a CSV file for results, not bash variables or associative arrays
- Write one line per YAML block with: doc name, filename, result, detail
- The CSV is consumed by the report procedure to build the table
Lines changed: 76 additions & 0 deletions
@@ -0,0 +1,76 @@
#!/bin/bash
# SPDX-FileCopyrightText: Copyright 2026 Stacklok, Inc.
# SPDX-License-Identifier: Apache-2.0

# Check prerequisites for dry-run documentation validation.
# Only requires kubectl and a cluster with CRDs installed (no operator needed).

set -euo pipefail

RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[0;33m'
NC='\033[0m'

ERRORS=0
WARNINGS=0

echo "=== Dry-Run Validation Prerequisites ==="
echo ""

# Check kubectl
if command -v kubectl &>/dev/null; then
  echo -e "${GREEN}[OK]${NC} kubectl available"
else
  echo -e "${RED}[MISSING]${NC} kubectl (required)"
  ERRORS=$((ERRORS + 1))
fi

# Check cluster connectivity
echo ""
echo "--- Kubernetes cluster ---"
CLUSTER_REACHABLE=false
if kubectl cluster-info &>/dev/null; then
  CONTEXT=$(kubectl config current-context 2>/dev/null || echo "unknown")
  echo -e "${GREEN}[OK]${NC} Cluster reachable (context: $CONTEXT)"
  CLUSTER_REACHABLE=true
else
  echo -e "${RED}[MISSING]${NC} No Kubernetes cluster reachable"
  ERRORS=$((ERRORS + 1))
fi

# Check ToolHive CRDs installed (only if cluster is reachable)
echo ""
echo "--- ToolHive CRDs ---"
if [ "$CLUSTER_REACHABLE" = true ]; then
  CRD_COUNT=$(kubectl get crd 2>/dev/null | grep -c toolhive.stacklok.dev || true)
  if [ "$CRD_COUNT" -gt 0 ]; then
    echo -e "${GREEN}[OK]${NC} $CRD_COUNT ToolHive CRDs installed"
  else
    echo -e "${RED}[MISSING]${NC} No ToolHive CRDs found"
    echo "  Install with: helm upgrade --install toolhive-operator-crds oci://ghcr.io/stacklok/toolhive/toolhive-operator-crds -n toolhive-system --create-namespace"
    ERRORS=$((ERRORS + 1))
  fi
else
  echo -e "${YELLOW}[SKIPPED]${NC} Cannot check CRDs (cluster unreachable)"
fi

# Check python3 for extraction script
echo ""
echo "--- Tools ---"
if command -v python3 &>/dev/null; then
  echo -e "${GREEN}[OK]${NC} python3 available"
else
  echo -e "${RED}[MISSING]${NC} python3 (required for YAML extraction)"
  ERRORS=$((ERRORS + 1))
fi

echo ""
echo "=== Summary ==="
if [ "$ERRORS" -gt 0 ]; then
  echo -e "${RED}$ERRORS error(s)${NC}"
  exit 1
else
  echo -e "${GREEN}All prerequisites met${NC}"
  exit 0
fi
