Commit 5d4438b

Add issue proposal: improve complex mechanical/illustrative diagrams
Defines 5 concrete problems with current diagram generation (rough shapes, floating components, disconnected controls, thin skill guidance, oversimplified sub-diagrams) and proposes 5 PRs to fix them:

1. Expand illustrative diagram rules in svg-diagram-skill.txt
2. Add diagram planning protocol (pre-generation thinking step)
3. Add progressive rendering pattern for interactive diagrams
4. Create dedicated mechanical-illustration-skill.txt
5. Add SVG shape library with reusable path fragments

All changes are skill/prompt file modifications only, with no code changes.

https://claude.ai/code/session_0116PYmVGknEBVfUU8xZT78B
1 parent fada403 commit 5d4438b

1 file changed

Lines changed: 152 additions & 0 deletions

@@ -0,0 +1,152 @@
# Improve complex mechanical/illustrative diagram quality

## Problem

When users ask questions like "how do cars work?" or "how do airplanes work?", the agent generates SVG/HTML diagrams via the `widgetRenderer` component that are **functional but lack precision and intentionality**.

### Screenshots

**Car diagram** (rough silhouette, floating component boxes, crude annotations, oversimplified pedal sub-diagram):

![car-diagram](./assets/car-diagram-screenshot.png)

**Airplane diagram** (imprecise plane shape, rough control surfaces mini-diagram, adequate force arrows but weak airfoil representation):

![airplane-diagram](./assets/airplane-diagram-screenshot.png)

### Root causes

1. **No planning step**: the agent generates complex illustrative diagrams in a single pass with no composition planning (what to emphasize, where components sit spatially, how controls map to visuals)
2. **Thin illustrative guidance**: the existing skill files give strong rules for flowcharts and structural diagrams but devote only **9 lines** to illustrative diagrams (`svg-diagram-skill.txt` lines 157-165)
3. **No shape construction method**: the LLM generates freehand SVG paths that produce rough, unrecognizable silhouettes
4. **No control-to-visual binding pattern**: interactive elements (sliders, presets) are architecturally disconnected from the SVG elements they should modify

### Specific deficiencies

- Object silhouettes (car body, airplane fuselage) are rough freehand paths rather than recognizable shapes
- Internal components (engine, transmission, control surfaces) are colored boxes floating over the outline without spatial accuracy
- Annotations use crude arrows/lines without a clear visual hierarchy (primary vs secondary labels)
- Sub-diagrams (pedal arrangement, control surface detail) are oversimplified rectangles
- Interactive controls don't visually update the diagram, so there is no feedback loop

---

## Proposed PRs

All changes are to **skill/prompt files only**; no runtime code changes are needed. New `.txt` files are auto-discovered by `load_all_skills()` in `apps/agent/skills/__init__.py`.

### PR 1: Expand illustrative diagram rules in `svg-diagram-skill.txt`

**Scope:** `apps/agent/skills/svg-diagram-skill.txt` lines 157-165 (+ MCP mirror)

The current "Illustrative Diagram" section is only 9 lines. Expand with:
- **Composition grid rule**: divide the 680px-wide viewBox into zones: main illustration (x=40-480), annotation margin (x=490-640), optional sub-diagram zone (bottom 25%)
- **Depth layering order**: background -> silhouette -> internals -> connectors -> labels
- **Shape construction rule**: build recognizable objects from 4-8 geometric primitives, not a single complex path
- **Force/motion arrow conventions**: larger arrowheads (`markerWidth=8`), color-coded by type (warm = resistance, cool = propulsion, gray = gravity)
- **Cross-section vs side-view guidance**: when to use each and how to indicate cut planes
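
To make the grid and layering rules concrete, here is a minimal sketch of how they could translate into an SVG skeleton. The helper names (`compositionZones`, `svgSkeleton`), the `layer-*` group IDs, and the default 480px height are assumptions for illustration; the proposal only fixes the 680px width, the x ranges, and the layer order.

```js
// Hypothetical helpers: names, IDs, and the default height are assumptions.
const VIEWBOX_WIDTH = 680;

// Layer order from the rule above: later groups paint on top of earlier ones.
const LAYERS = ['background', 'silhouette', 'internals', 'connectors', 'labels'];

// Split the viewBox into the three zones named in the composition grid rule.
function compositionZones(height = 480) {
  const subHeight = Math.round(height * 0.25); // optional sub-diagram zone: bottom 25%
  const mainHeight = height - subHeight;
  return {
    illustration: { x: 40, y: 0, width: 480 - 40, height: mainHeight },
    annotations: { x: 490, y: 0, width: 640 - 490, height: mainHeight },
    subDiagram: { x: 40, y: mainHeight, width: 640 - 40, height: subHeight },
  };
}

// Emit an empty SVG whose <g> elements appear in depth order.
function svgSkeleton(height = 480) {
  const groups = LAYERS.map((name) => `  <g id="layer-${name}"></g>`).join('\n');
  return `<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 ${VIEWBOX_WIDTH} ${height}">\n${groups}\n</svg>`;
}
```

Because SVG paints in document order, putting the groups in this sequence is enough to enforce the layering without any `z-index` equivalent.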

**Impact:** Immediate quality improvement for all illustrative diagrams. Smallest change, do first.

---

### PR 2: Add diagram planning protocol (pre-generation thinking step)

**Scope:** `apps/agent/skills/svg-diagram-skill.txt` (+ MCP mirror)

Before generating any illustrative diagram, the agent must complete a structured plan:

1. **Subject decomposition**: identify 3-5 key subsystems, classify as primary (prominent) vs secondary (annotation-level)
2. **Spatial layout**: assign components to a coordinate grid before writing SVG
3. **Educational priority**: what is the single most important thing to convey?
4. **Composition sketch**: define bounding rectangles for illustration, annotations, controls, sub-diagrams
5. **Control-to-visual mapping**: list which SVG element IDs each interactive control will modify
6. **Shape fidelity check**: list 4-6 defining visual features that make the subject recognizable (e.g., for a car: wheel wells, hood slope, windshield angle, roof line)
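
Although the protocol is a prompt-level instruction, it helps to see what a completed plan might look like as data. The sketch below is one possible encoding, with a checker for the numeric limits in the six steps. The field names (`subsystems`, `zones`, `controls`, `definingFeatures`) and `validateDiagramPlan` itself are hypothetical, not something the skill files define.

```js
// Hypothetical plan shape and validator; all field names are assumptions.
function validateDiagramPlan(plan) {
  const problems = [];
  const subsystems = Array.isArray(plan.subsystems) ? plan.subsystems : [];

  // Step 1: 3-5 subsystems, each classified as primary or secondary.
  if (subsystems.length < 3 || subsystems.length > 5) problems.push('expected 3-5 subsystems');
  for (const s of subsystems) {
    if (s.role !== 'primary' && s.role !== 'secondary') problems.push(`${s.name}: missing role`);
    // Step 2: every component has a grid position before any SVG is written.
    if (!s.position) problems.push(`${s.name}: missing grid position`);
  }
  // Step 3: one stated educational priority.
  if (!plan.educationalPriority) problems.push('missing educational priority');
  // Step 4: a bounding rectangle per region.
  for (const zone of ['illustration', 'annotations', 'controls', 'subDiagrams']) {
    if (!plan.zones?.[zone]) problems.push(`missing zone: ${zone}`);
  }
  // Step 5: each control lists the element IDs it modifies.
  for (const [control, ids] of Object.entries(plan.controls || {})) {
    if (!Array.isArray(ids) || ids.length === 0) problems.push(`${control}: no target element IDs`);
  }
  // Step 6: 4-6 defining visual features.
  const features = Array.isArray(plan.definingFeatures) ? plan.definingFeatures.length : 0;
  if (features < 4 || features > 6) problems.push('expected 4-6 defining features');
  return problems;
}

const rect = (x, y, width, height) => ({ x, y, width, height });
const carPlan = {
  subsystems: [
    { name: 'engine', role: 'primary', position: { x: 120, y: 200 } },
    { name: 'transmission', role: 'primary', position: { x: 240, y: 220 } },
    { name: 'fuel tank', role: 'secondary', position: { x: 400, y: 210 } },
  ],
  educationalPriority: 'how engine power reaches the wheels',
  zones: {
    illustration: rect(40, 0, 440, 360),
    annotations: rect(490, 0, 150, 360),
    controls: rect(40, 480, 600, 60),
    subDiagrams: rect(40, 360, 600, 120),
  },
  controls: { throttle: ['viz-engine-fill', 'viz-speed-text'] },
  definingFeatures: ['wheel wells', 'hood slope', 'windshield angle', 'roof line'],
};
```

Calling `validateDiagramPlan(carPlan)` returns an empty array for this plan; any returned strings name the step that still needs work.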

Pure prompt enhancement with no code changes. **Highest single-item impact.**

---

### PR 3: Add progressive rendering pattern for interactive diagrams

**Scope:** `apps/agent/skills/master-agent-playbook.txt`, `apps/agent/skills/svg-diagram-skill.txt` (+ MCP mirrors)

Replace scattered `oninput` handlers with a structured architecture:

- **Explicit HTML sections** via comments: `<!-- SECTION 1: Main SVG -->`, `<!-- SECTION 2: Sub-diagrams -->`, `<!-- SECTION 3: Stats -->`, `<!-- SECTION 4: Controls -->`, `<!-- SECTION 5: JS state + bindings -->`
- **Centralized state-and-render pattern**:
  ```js
  // Single source of truth for every interactive control.
  const state = { throttle: 50, gear: 3, mode: 'city' };
  // All writes go through here so each change triggers a re-render.
  function updateState(key, value) { state[key] = value; render(); }
  function render() { /* update ALL visual elements from state */ }
  ```
- **Element ID convention**: `viz-{component}-{property}` (e.g., `viz-engine-fill`, `viz-speed-text`)
- **Preset pattern**: each preset sets several state values through `updateState`

This ensures controls and visuals are architecturally connected rather than accidentally coupled.
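
One way the state-and-render pattern and the preset pattern could fit together is sketched below. It departs from the three-line snippet above in two labeled ways: `updateState` takes a patch object so a preset triggers a single render, and DOM writes go through an injected `applyToElement` callback. `createDiagramStore`, the bindings table, and the preset values are all assumptions for illustration.

```js
// Sketch only: createDiagramStore, bindings, and the patch-object signature
// are assumptions layered on the pattern described above.
function createDiagramStore(initialState, bindings, applyToElement) {
  const state = { ...initialState };

  // Re-derive every bound element from the current state.
  function render() {
    for (const [elementId, derive] of Object.entries(bindings)) {
      applyToElement(elementId, derive(state));
    }
  }

  // Accepts a patch so a preset can change several values with one render.
  function updateState(patch) {
    Object.assign(state, patch);
    render();
  }

  render(); // initial paint
  return { updateState, getState: () => ({ ...state }) };
}

// Presets are plain patches (values here are made up).
const presets = {
  city: { throttle: 30, gear: 2, mode: 'city' },
  highway: { throttle: 70, gear: 5, mode: 'highway' },
};

// Stand-in for the DOM so the sketch runs anywhere; in a widget the callback
// would write to document.getElementById(id) instead.
const painted = new Map();
const store = createDiagramStore(
  { throttle: 50, gear: 3, mode: 'city' },
  {
    'viz-engine-fill': (s) => `${s.throttle}%`,
    'viz-gear-text': (s) => `Gear ${s.gear}`,
  },
  (id, value) => painted.set(id, value)
);
store.updateState(presets.highway);
```

Keying the bindings table by element ID doubles as the control-to-visual mapping: every ID that can change is listed in one place.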

---

### PR 4: Create dedicated `mechanical-illustration-skill.txt`

**Scope:** New file `apps/agent/skills/mechanical-illustration-skill.txt` (+ MCP mirror)

Comprehensive skill for mechanical/illustrative diagrams:

- **Shape construction method**: decompose objects into geometric primitives (body = rounded rect, wheels = circles, windows = trapezoids)
- **Spatial accuracy rules**: grid overlay technique for component placement
- **Annotation hierarchy**: primary (14px/500 weight, solid leader lines), secondary (12px/400, dashed leaders), tertiary (11px/400, inline)
- **Sub-diagram quality standards**: minimum 150x100px, visually connected to main diagram via callout
- **Interactive control binding template**: reusable JS pattern for slider/button -> SVG element updates
- **Two worked reference compositions**: car drivetrain and airfoil cross-section with full SVG examples

Auto-loaded by the existing `load_all_skills()` glob, so no code changes are needed.
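
As a rough illustration of the shape construction method and the annotation hierarchy, the sketch below builds a car side view from four primitives and encodes the three label tiers as data. All coordinates, the `4 3` dash pattern, and the helper names (`label`, `leaderLine`, `carFromPrimitives`) are assumptions; only the font sizes, weights, and leader styles come from the list above.

```js
// Tiers from the annotation hierarchy above.
const ANNOTATION_TIERS = {
  primary: { fontSize: 14, fontWeight: 500, leader: 'solid' },
  secondary: { fontSize: 12, fontWeight: 400, leader: 'dashed' },
  tertiary: { fontSize: 11, fontWeight: 400, leader: 'inline' },
};

// Escape the characters that would break SVG text content.
const escapeXml = (s) => s.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;');

function label(text, tier, x, y) {
  const t = ANNOTATION_TIERS[tier];
  return `<text x="${x}" y="${y}" font-size="${t.fontSize}" font-weight="${t.fontWeight}">${escapeXml(text)}</text>`;
}

// Inline (tertiary) labels sit next to their target and get no leader line.
function leaderLine(tier, x1, y1, x2, y2) {
  const style = ANNOTATION_TIERS[tier].leader;
  if (style === 'inline') return '';
  const dash = style === 'dashed' ? ' stroke-dasharray="4 3"' : ''; // dash pattern is an assumption
  return `<line x1="${x1}" y1="${y1}" x2="${x2}" y2="${y2}" stroke="currentColor"${dash}/>`;
}

// Car side view from primitives; coordinates are illustrative only.
function carFromPrimitives() {
  return [
    '<rect x="60" y="150" width="360" height="70" rx="18"/>', // body: rounded rect
    '<polygon points="150,150 190,105 310,105 350,150"/>', // windows: trapezoid
    '<circle cx="140" cy="220" r="32"/>', // front wheel
    '<circle cx="340" cy="220" r="32"/>', // rear wheel
  ];
}
```

Keeping each primitive as its own element, rather than one merged path, also leaves every part individually addressable for interactive updates.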

---

### PR 5: Add SVG shape library with reusable path fragments

**Scope:** New file `apps/agent/skills/svg-shape-library.txt` (+ MCP mirror)

Pre-designed SVG path fragments the agent can reference and adapt:

- **Vehicles**: sedan side profile, airplane side profile, airplane top-down
- **Mechanical parts**: gear/cog, piston, spring, valve, wheel with spokes
- **Physics shapes**: airfoil cross-section (NACA-style), force arrow with proper head
- **Annotation elements**: zoom callout box, dimension line with tick marks
- **Control surfaces**: rudder, aileron, elevator (neutral and deflected)

Each entry includes normalized coordinates (0-100 units), transform instructions for positioning, and customization notes.

**Note:** Adds ~30-50KB to the system prompt. Monitor the context budget. Consider including only the 6-8 most common shapes initially.
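
To show how a normalized entry might be positioned, here is a minimal sketch of one library entry and a placement helper. The entry format, the `force-arrow` path data, and `placeShape` are hypothetical; the library file itself would be plain text read by the agent, not executed.

```js
// Hypothetical library entry; the path is a simple placeholder arrow.
const shapeLibrary = {
  'force-arrow': {
    // Drawn in a 0-100 unit box, pointing right.
    path: 'M0 45 H70 V30 L100 50 L70 70 V55 H0 Z',
    notes: 'Rotate for direction; recolor by force type.',
  },
};

// Build a <g> that maps the 0-100 unit box onto a target rectangle.
function placeShape(name, { x, y, width, height, rotate = 0 }) {
  const entry = shapeLibrary[name];
  if (!entry) throw new Error(`Unknown shape: ${name}`);
  const sx = width / 100;
  const sy = height / 100;
  // The rightmost transform applies first: rotate about the unit-box centre,
  // then scale to the target size, then translate into place.
  const transform = `translate(${x} ${y}) scale(${sx} ${sy}) rotate(${rotate} 50 50)`;
  return `<g transform="${transform}"><path d="${entry.path}"/></g>`;
}
```

For example, `placeShape('force-arrow', { x: 40, y: 20, width: 200, height: 50 })` wraps the path in a group with `transform="translate(40 20) scale(2 0.5) rotate(0 50 50)"`.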

---

## Recommended sequencing

| Order | PR | Effort | Impact |
|-------|-----|--------|--------|
| 1 | PR 1: Expand illustrative rules | Small | Medium |
| 2 | PR 2: Diagram planning protocol | Small | **High** |
| 3 | PR 3: Progressive rendering | Medium | Medium |
| 4 | PR 4: Mechanical illustration skill | Medium | High |
| 5 | PR 5: SVG shape library | Large | High |

PRs 1 and 2 can be merged independently. PR 4 builds on conventions from PR 1. PR 3 and PR 5 are independent of each other.

## Verification

After each PR, test with these prompts and compare before/after:
- "how do cars work?"
- "can you explain how airplanes fly? I'm a visual person"
- "explain how a combustion engine works"

For each run, also:
- Verify dark mode compatibility
- Check that the system prompt token count stays within the model context budget
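
For the token budget check, a quick estimate can flag problems before running a real tokenizer. The sketch below uses a rule of thumb of roughly 4 characters per token, which is an assumption and will vary by model and content; `estimatePromptTokens` and `checkBudget` are hypothetical helpers, and the skill texts are passed in as strings rather than read from disk.

```js
// Rough heuristic only: ~4 characters per token is an assumption, not a tokenizer.
function estimatePromptTokens(skills, charsPerToken = 4) {
  return skills.map(({ name, text }) => ({
    name,
    tokens: Math.ceil(text.length / charsPerToken),
  }));
}

// Sum the per-file estimates and compare against a caller-supplied budget.
function checkBudget(skills, budgetTokens) {
  const perFile = estimatePromptTokens(skills);
  const total = perFile.reduce((sum, file) => sum + file.tokens, 0);
  return { total, withinBudget: total <= budgetTokens, perFile };
}
```

The per-file breakdown makes it easy to see whether a single large addition, such as the shape library, dominates the total.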

## Key files

- `apps/agent/skills/svg-diagram-skill.txt`: primary skill to expand (illustrative section at lines 157-165)
- `apps/agent/skills/master-agent-playbook.txt`: interactive widget templates (Part 2)
- `apps/agent/skills/__init__.py`: auto-discovers new `.txt` skill files (line 23, glob pattern)
- `apps/agent/main.py`: injects skills into system prompt (line 49)
- `apps/mcp/skills/`: MCP mirrors of all skill files
