Atlas foundation: codebase-knowledge layer in Pathfinder (off-by-default) #94
Open
jpr5 wants to merge 13 commits into
A transient failure in markAtlasCachePagesStaleForSources caused executeJob to reject, so onReindexComplete never fired for a reindex that actually succeeded, suppressing bash-instance refresh, llms.txt/faq.txt cache clearing, and the reindex audit. Wrap the Atlas cache invalidation call in its own try/catch that logs and continues, keeping it before the callback but unable to suppress it.
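The isolation described in this commit can be sketched as follows. This is a minimal illustration, not the PR's code: `executeJobSketch` and the `ReindexHooks` shape are hypothetical stand-ins, and the real `executeJob` almost certainly takes different arguments.

```typescript
// Hypothetical hook bundle; the names mirror the commit message but the
// signatures are assumptions made for this sketch.
type ReindexHooks = {
  reindex: () => Promise<void>;
  markAtlasCachePagesStale: () => Promise<void>;
  onReindexComplete: () => void;
  logError: (msg: string, err: unknown) => void;
};

async function executeJobSketch(hooks: ReindexHooks): Promise<void> {
  // A failure here is a genuine reindex failure and still rejects.
  await hooks.reindex();

  // Best-effort step: a failure is logged and swallowed so it cannot
  // prevent the completion callback from running.
  try {
    await hooks.markAtlasCachePagesStale();
  } catch (err) {
    hooks.logError("Atlas cache invalidation failed; continuing", err);
  }

  // Reached whenever the reindex itself succeeded.
  hooks.onReindexComplete();
}
```

The invalidation still runs before the callback, as the commit requires; only its ability to reject the whole job is removed.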
The per-page catch in gardenAtlasCachePages persisted the generation error to the DB but never logged it, leaving operators blind. Worse, the recordAtlasCachePageGenerationError call was unguarded: if it threw (e.g. "Atlas cache page not found" on a concurrently deleted/re-keyed row, or any transient DB error), the rejection escaped the loop and aborted the entire gardening pass, losing all prior progress and never returning a summary. Now the generation failure is logged via console.error, and the bookkeeping call is wrapped in its own try/catch that logs and continues so a single page's bookkeeping failure can't poison the batch. Adds a red-green test covering the bookkeeping-throws case.
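A rough sketch of the hardened loop, assuming a simplified dependency shape: `gardenPagesSketch`, `GardenDeps`, and `PageResult` are hypothetical names for illustration, and the real `gardenAtlasCachePages` works against the database provider rather than injected callbacks.

```typescript
type PageResult = { key: string; ok: boolean };

// Hypothetical dependencies standing in for page generation and the
// recordAtlasCachePageGenerationError bookkeeping call.
type GardenDeps = {
  generatePage: (key: string) => Promise<void>;
  recordGenerationError: (key: string, err: unknown) => Promise<void>;
};

async function gardenPagesSketch(
  keys: string[],
  deps: GardenDeps,
): Promise<PageResult[]> {
  const summary: PageResult[] = [];
  for (const key of keys) {
    try {
      await deps.generatePage(key);
      summary.push({ key, ok: true });
    } catch (genErr) {
      // Make the generation failure visible to operators.
      console.error(`Atlas page generation failed for ${key}:`, genErr);
      try {
        await deps.recordGenerationError(key, genErr);
      } catch (bookErr) {
        // A bookkeeping failure is logged too, but must not abort the pass.
        console.error(`Could not record generation error for ${key}:`, bookErr);
      }
      summary.push({ key, ok: false });
    }
  }
  return summary;
}
```

With this structure a page whose generation and bookkeeping both fail still yields a summary entry, and later pages are processed normally.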
- parseSseMessages now skips empty/whitespace `data:` frames (keepalives) and wraps per-event JSON.parse so unparseable frames are skipped instead of crashing the search command with an opaque "Unexpected end of JSON input" error.
- DEFAULT_TOOL is now "atlas-search" to match the Atlas tool name in pathfinder.example.yaml, so `atlas search "x"` targets Atlas by default instead of the docs search tool.

Adds red-green tests: an empty `data:` SSE frame that previously crashed, and a default-tool assertion pinned to "atlas-search".
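The tolerant parsing described above might look roughly like this. It is a simplified sketch, not the CLI's actual `parseSseMessages`: the function name and return type are assumptions, and trimming each `data:` value is a simplification of the SSE rule that strips a single leading space.

```typescript
// Sketch: parse a buffered SSE body into JSON payloads, skipping
// keepalive frames and frames whose data is not valid JSON.
function parseSseMessagesSketch(body: string): unknown[] {
  const messages: unknown[] = [];
  // Events are separated by a blank line.
  for (const event of body.split(/\r?\n\r?\n/)) {
    const data = event
      .split(/\r?\n/)
      .filter((line) => line.startsWith("data:"))
      .map((line) => line.slice("data:".length).trim())
      .join("\n");
    // Empty or whitespace-only data is a keepalive; nothing to parse.
    if (data.trim() === "") continue;
    try {
      messages.push(JSON.parse(data));
    } catch {
      // Unparseable frame: skip it rather than crash the command.
    }
  }
  return messages;
}
```

Calling `JSON.parse("")` is what throws "Unexpected end of JSON input", so the empty-data check removes the crash for keepalives and the try/catch covers any other malformed frame.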
…path keys

Finding 1 (HIGH): approveAtlasCandidate silently returned 200 without queuing a reindex when no orchestrator was wired (Atlas sources but no search/knowledge tools). Now log a loud, actionable error and surface reindexQueued:boolean in the JSON response. The orchestrator-present 200 path is unchanged.

Finding 2: the path-param approve/reject routes used :canonicalKey, which a literal "/" in a real key (e.g. "github-pr:atlas:owner/repo:42") would truncate, addressing the wrong key. Switch to an Express 5 wildcard param (*canonicalKey) and reconstruct/decode the full key in atlasCanonicalKey, so both %2F-escaped and literal-slash keys round-trip. Body-based routes are untouched.

Tests: add red-green coverage for both findings in atlas-ratification-endpoints.test.ts.
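To illustrate Finding 1, here is a framework-free sketch of surfacing `reindexQueued` to the caller. Everything except the `reindexQueued` field name is hypothetical: the `Orchestrator` type, the `queueReindex` method, the `approved` field, and the function itself are assumptions for illustration, not the handler in `src/server.ts`.

```typescript
// Hypothetical orchestrator interface for this sketch only.
type Orchestrator = { queueReindex: (source: string) => void };

type ApproveResponse = {
  status: number;
  body: { approved: boolean; reindexQueued: boolean };
};

function approveResponseSketch(
  source: string,
  orchestrator?: Orchestrator,
): ApproveResponse {
  if (!orchestrator) {
    // The approval is stored, but the caller is told explicitly that
    // no reindex was queued instead of receiving a silent 200.
    console.error(
      `Atlas candidate approved for "${source}" but no orchestrator is wired; reindex NOT queued`,
    );
    return { status: 200, body: { approved: true, reindexQueued: false } };
  }
  orchestrator.queueReindex(source);
  return { status: 200, body: { approved: true, reindexQueued: true } };
}
```

Both paths return 200, matching the commit's note that the orchestrator-present path is unchanged; the boolean is what lets a client distinguish them.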
The path-param wildcard routes (POST /api/atlas/candidates/*canonicalKey/approve and /reject) double-decoded the key (Express 5 decodes wildcard segments, then decodeURIComponent ran again, corrupting %XX keys), were body/path-inconsistent, and were fully redundant with the working body-based routes. Drop both registrations and the now-unused atlasCanonicalKey(req) helper. Keep the body routes, atlasCanonicalKeyFromBody, and the approve-without-orchestrator fix. Convert the surviving tests to the body route and remove the path-param-only slash-key test.
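The double-decoding problem can be reproduced without a server. In this sketch the first `decodeURIComponent` call stands in for the decoding the router already performs (per the commit message), and the second is the redundant call in the removed helper; the function names and the example key are made up for illustration.

```typescript
// Stand-in for the router's own decoding of the path segment.
function decodeOnce(encodedSegment: string): string {
  return decodeURIComponent(encodedSegment);
}

// The bug pattern: decoding a value that was already decoded.
function decodeTwice(encodedSegment: string): string {
  return decodeURIComponent(decodeOnce(encodedSegment));
}

// A key that legitimately contains the literal characters "%2F".
const literalKey = "atlas:report%2Fdraft";
const encoded = encodeURIComponent(literalKey); // "atlas%3Areport%252Fdraft"

console.log(decodeOnce(encoded));  // "atlas:report%2Fdraft" (round-trips)
console.log(decodeTwice(encoded)); // "atlas:report/draft"   (a different key)
```

A key containing a bare `%` fails differently: the second decode throws a `URIError` instead of silently addressing another key. Sending the key in the request body avoids both cases because no URL decoding is involved.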
Summary
Adds the Atlas foundation to Pathfinder: an agent-maintained codebase-knowledge layer (the "codebase-memory" quadrant alongside auto-memory, handoffs, and episodic memory). This PR lands the foundation dormant and off by default: the schema, providers, gardener, ratification endpoints, webhook ingestion, analytics, and a thin `atlas` CLI are all present and tested, but no operational loop is scheduled and no behavior changes for existing Pathfinder users.

What's included
- Schema: `atlas_seed_entries` (durable inputs: decisions/corrections/inbox/schema) and `atlas_cache_pages` (regenerable derived pages). Durability attaches to inputs, not the wiki.
- `AtlasDataProvider` (`src/db/atlas.ts`): seed + cache persistence.
- Gardener (`src/indexing/atlas-gardener.ts`): regenerates cache pages from seed; hardened error path (logs failures, guards bookkeeping).
- Ratification endpoints (`src/server.ts`): body-param routes for approving/rejecting seed entries (path-param variants intentionally dropped, see below).
- Webhook ingestion (`src/webhooks/`): capture is webhook-driven server-side, not agent-driven.
- Analytics (`src/db/analytics.ts`): Atlas retrievals excluded from standard `/analytics`.
- `atlas-cli.ts`: thin stateless MCP client so agents (esp. Codex, which struggles with MCP reconnect) get a first-class `atlas search "<question>"` access path without configuring an MCP server.

Scope notes
Deferred wiring follow-up (pilot prerequisite — NOT in this PR)
The operational loop is a deliberate follow-up, required before the pilot:
- service:session tagging on retrievals
- `seed_path` wiring (seed lives in a private `backoffice` sidecar for the pilot, not in-repo)

Pilot repos: `copilotkit/copilotkit` + `ag-ui-protocol/ag-ui`.

Test plan
- `tsc --noEmit`: 0 errors