Drive and inspect long-lived terminal sessions from the CLI, with reviewable snapshots, screenshots, and recordings.
agent-tty keeps a real PTY-backed terminal session alive across separate CLI invocations. You run a command in it, wait for the screen to reach a condition instead of sleeping, then capture what happened as a semantic text snapshot, a PNG screenshot, an asciinema-compatible .cast, or a WebM. The recording is the point: a human — or an AI coding agent — can replay and verify exactly what the terminal did, instead of trusting a blind script.
It started as a way to reproduce and verify TUI bug reports (see Why it exists), and it's equally useful for shell automation, CI smoke tests, and driving interactive CLIs.
Requires Node >=24 <27. Screenshots and WebM export also need a Playwright Chromium install (npx playwright install chromium).
npm install -g agent-tty
# Sessions, logs, and artifacts live under ~/.agent-tty by default.
# Optionally point AGENT_TTY_HOME at a throwaway dir for an isolated run:
export AGENT_TTY_HOME="$(mktemp -d)"
agent-tty doctor --json # check your environment
# Open a session, do something, wait for it, look at the result.
SID=$(agent-tty create --json -- /bin/bash | jq -r '.result.sessionId')
agent-tty run "$SID" 'printf "hello from agent-tty\n"' --json
agent-tty wait "$SID" --text 'hello from agent-tty' --json
agent-tty snapshot "$SID" --format text --json
agent-tty screenshot "$SID" --json
agent-tty destroy "$SID" --json
Driving an interactive TUI is the same loop with key chords and a stability wait:
agent-tty run "$SID" 'nvim --clean' --no-wait --json
agent-tty wait "$SID" --screen-stable-ms 1000 --json
agent-tty send-keys "$SID" Down Down Enter --json
agent-tty screenshot "$SID" --json
agent-tty record export "$SID" --format webm --json
More workflows in docs/USAGE.md. Other install paths (tarballs, prerelease channels, source checkouts) in docs/INSTALL.md.
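In scripts, the create / act / destroy loop is easy to leave half-finished when a step fails. The sketch below wraps it in a function that always destroys the session. `with_session` is a hypothetical helper name, not part of agent-tty; it uses only the `create`, `destroy`, and `.result.sessionId` shapes from the quick start, and it assumes `jq` is installed.

```shell
#!/usr/bin/env bash
# with_session CALLBACK [ARGS...]
# Hypothetical helper: open a session, call CALLBACK with the session id
# appended as its last argument, then destroy the session whatever happened.
with_session() {
  local sid status=0
  # Same create + jq extraction as in the quick start.
  sid=$(agent-tty create --json -- /bin/bash | jq -r '.result.sessionId') || return 1
  # An empty or "null" id means create did not return the expected envelope.
  [ -n "$sid" ] && [ "$sid" != "null" ] || return 1
  "$@" "$sid" || status=$?
  # Best-effort cleanup; the callback's status is what the caller sees.
  agent-tty destroy "$sid" --json >/dev/null || true
  return "$status"
}
```

Call it as `with_session smoke`, where `smoke` is any shell function that takes the session id as its final argument and runs `agent-tty run` / `wait` / `snapshot` against it.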
tmux, expect, asciinema / VHS, and Playwright are good tools, and you can get partway with any of them. agent-tty exists because driving a terminal and getting reviewable evidence back is awkward with each one:
| If you reach for… | You get | What agent-tty adds |
|---|---|---|
| `tmux` + `send-keys` / `capture-pane` | drive a pane, scrape raw bytes | a wait-for-condition primitive (stop sleeping and grepping), semantic snapshots, and PNG / .cast / WebM artifacts a process or human can review |
| `expect` | scripted input/output matching on a byte stream | a model of the rendered screen (cursor, alt-screen, colors), plus shareable visual artifacts |
| asciinema / VHS | a recording to watch later | programmatic drive + wait + inspect: act on terminal state, not just record it (and it still exports asciinema-compatible .cast) |
| Playwright | this exact stateful loop, for browsers | the same drive → wait → inspect → snapshot loop, applied to terminals and TUIs |
agent-tty is an automation-and-inspection layer, not a tmux replacement.
agent-tty CLI → per-session host → PTY + append-only event log → Ghostty renderer → artifacts
Every session is backed by a real PTY (node-pty) and an append-only event log. The log is the source of truth, so snapshots, screenshots, and recordings can be regenerated deterministically by replaying it — even after the session has exited.
Rendering uses Ghostty's terminal engine through two interchangeable backends (--renderer):
- `libghostty-vt`: Ghostty's native VT engine, bound into Node. Fast, browser-free semantic snapshots and `wait` checks.
- `ghostty-web` (default): a headless web build of Ghostty driven by Playwright/Chromium. Adds pixel PNG screenshots and WebM video.
ghostty-web is a reference renderer: it shows what a pinned Ghostty build draws, not a pixel-for-pixel guarantee of any particular native terminal window. That tradeoff is deliberate — the renderer sits behind an adapter, so native backends can be added later without changing the CLI contract.
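As a rough illustration of choosing a backend per task, the sketch below sets `AGENT_TTY_RENDERER` for a single invocation. The function names are hypothetical, and it assumes the renderer can be chosen per command against an existing session; check docs/USAGE.md before relying on that.

```shell
#!/usr/bin/env bash
# Hypothetical wrappers; only the env var, subcommands, and flags come from this README.

# Text-only inspection: use the browser-free backend.
snapshot_text() {
  AGENT_TTY_RENDERER=libghostty-vt agent-tty snapshot "$1" --format text --json
}

# Pixel artifacts: use the Playwright/Chromium-backed renderer.
screenshot_png() {
  AGENT_TTY_RENDERER=ghostty-web agent-tty screenshot "$1" --json
}
```

Keeping the choice inside small wrappers means a CI job that only asserts on text never has to install Chromium, under the assumption stated above.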
I maintain coder/claudecode.nvim and was drowning in issues and PRs I couldn't easily reproduce — Neovim is a TUI, and "reproduce this, configure that, screenshot the result" is painful to script with sleeps and capture-pane. agent-tty lets me spin up an isolated, reproducible terminal environment, hand it to a coding agent to attempt a fix, and then verify the fix with a fresh session and a recording I can actually look at.
A colleague then used agent-tty to build an experimental TUI for Coder agents almost entirely by letting coding agents drive it — checking the screenshots and recordings it produced instead of watching over their shoulder. That's the loop it's built for: an agent acts, agent-tty captures reviewable evidence, a human (or another agent) verifies.
Global flags: --home <path> (or AGENT_TTY_HOME; defaults to ~/.agent-tty), --renderer <ghostty-web|libghostty-vt> (or AGENT_TTY_RENDERER), --json, --timeout-ms <n>, --no-color, --log-level <level>, --profile <name>.
- Environment: `version`, `doctor`, `skills list|get|path`
- Lifecycle: `create`, `list`, `inspect`, `destroy`, `gc`
- Input & control: `run`, `type`, `paste`, `send-keys`, `resize`, `signal`, `mark`
- Observe & capture: `wait`, `snapshot`, `screenshot`, `record export`
Every user-facing command takes --json and returns a stable, machine-readable envelope. See docs/USAGE.md for details and docs/TROUBLESHOOTING.md for renderer/environment issues.
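For CI smoke tests, it helps to see what was on screen when a wait fails. The sketch below is one way to combine `wait`, `--timeout-ms`, and `snapshot`. `wait_for_text` is a hypothetical helper; it assumes `wait` exits non-zero when the condition is not met in time and that `--timeout-ms` is accepted after the subcommand, neither of which this README states.

```shell
#!/usr/bin/env bash
# wait_for_text SESSION_ID TEXT [TIMEOUT_MS]
# Hypothetical helper: wait for TEXT, and on failure dump the screen to stderr.
wait_for_text() {
  local sid=$1 text=$2 timeout_ms=${3:-5000}
  if agent-tty wait "$sid" --text "$text" --timeout-ms "$timeout_ms" --json >/dev/null; then
    return 0
  fi
  # Assumed failure path: show the current screen so the CI log explains the miss.
  echo "timed out waiting for: $text" >&2
  agent-tty snapshot "$sid" --format text --json >&2
  return 1
}
```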
Real Codex and Claude TUIs discovering the agent-tty skill, driving nvim --clean, writing a file, and exporting inner proof artifacts. (GitHub renders these as click-to-play H.264 players.)
| Codex | Claude |
|---|---|
| codex-outer-h264.mp4 | claude-outer-h264.mp4 |
Full reproducer, transcripts, and proof bundles in dogfood/agent-uses-agent-tty/ and dogfood/CATALOG.md.
agent-tty ships a bootstrap skill so coding agents can load current usage instructions at runtime:
agent-tty skills list
agent-tty skills get agent-tty
See docs/AGENT-SKILLS.md.
agent-tty is 0.2.x and focused on reliable, isolated, reviewable terminal/TUI automation through a stable CLI.
- Linux and macOS are tier-1; Windows is tier-2 and not CI-tested.
- Screenshots and WebM export depend on Playwright/Chromium and the `ghostty-web` backend.
- `run` is best for shell setup and command injection; it does not capture a child command's structured output or exit status.
- Apache-2.0, runs entirely locally, no account or SaaS.
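Because `run` does not report exit status, one workaround is to have the shell inside the session print the status and then read it back from the screen. The sketch below does that with a unique marker. `run_with_status` is a hypothetical helper, and it assumes `wait --text` matches a plain substring and that the marker line is still visible in the text snapshot.

```shell
#!/usr/bin/env bash
# run_with_status SESSION_ID COMMAND
# Hypothetical helper: returns COMMAND's exit status, or 125 if the status
# could not be recovered (a command that itself exits 125 is indistinguishable).
run_with_status() {
  local sid=$1 cmd=$2 marker status
  marker="__EXIT_$$_$(date +%s)_${RANDOM:-0}"
  # The session shell prints "<marker>:<status>__". The marker and the
  # ":<status>__" tail are separate printf arguments, so the echoed command
  # line never contains the contiguous text "<marker>:" that we wait for.
  agent-tty run "$sid" "$cmd; printf '%s%s\n' '$marker' \":\$?__\"" --json >/dev/null || return 125
  agent-tty wait "$sid" --text "${marker}:" --json >/dev/null || return 125
  # Pull the digits between "<marker>:" and the closing "__" out of the snapshot.
  status=$(agent-tty snapshot "$sid" --format text --json |
    grep -o "${marker}:[0-9]*__" | tail -n 1 | sed 's/.*:\([0-9]*\)__$/\1/')
  [ -n "$status" ] || return 125
  return "$status"
}
```

The status is scraped from rendered text, so a very narrow terminal that wraps the marker line would break the match; this is a workaround, not a substitute for running the command outside the session when you need structured output.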
Deferred work (native renderers, mouse input, remote control, an MCP wrapper) is tracked in ROADMAP.md. The supported contract is in RELEASE.md; the architecture is in design/ARCHITECTURE.md.
Issues and PRs welcome — see docs/CONTRIBUTING.md and the good first issue / help wanted labels.
mise install && mise run bootstrap # preferred
npm run cli -- --help
npm run verify