Pathfinder retrieval upgrade + query-source observability #93

Open
jpr5 wants to merge 4 commits into main from blitz/pathfinder-gaps/integration-engine
@jpr5 (Contributor) commented Jun 6, 2026

Summary

  • Repoint docs indexing to the live shell-docs tree. Pathfinder was indexing the retired docs/content/docs/ tree; the docs source now points at showcase/shell-docs/src/content/docs/ (path, URL derivation, webhook path-triggers) so search returns current pages.
  • Inline MDX @/snippets/* imports before chunking so snippet-composed pages index with real content instead of near-empty stubs.
  • Embed chunk title + heading path into the vector text so precise symbol/prop/heading queries keep their strongest anchor (plus an embedding count-mismatch guard and a metadata-precedence fix).
  • Hybrid (vector + keyword) search with min_score: 0.3 on all four tools; scope code/ag-ui-code indexes (exclude examples/showcase/.next/*.d.ts/generated/proto; broaden ag-ui-code to the Python + community SDKs); sharpen tool descriptions.
  • Query-log observability: tag each row with request_source (X-Pathfinder-Source) + session_id, and add a low-confidence windowed metric so our own analysis traffic is separable from real users.
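As an illustration of the third bullet, here is a minimal sketch of how a chunk's title and heading path might be folded into the text sent to the embedder, together with a count-mismatch guard. The `Chunk` shape, the `" > "` separator, and the function names `buildEmbeddingText` and `pairEmbeddings` are assumptions made for this example; they are not taken from the Pathfinder code in this PR.

```typescript
// Hypothetical chunk shape; field names are illustrative only.
interface Chunk {
  title: string;
  headingPath: string[]; // headings from the page root down to this chunk
  content: string;
}

// Prefix the chunk body with its title and heading path so that both
// end up in the embedded text, not only in the stored metadata.
function buildEmbeddingText(chunk: Chunk): string {
  const anchor = [chunk.title, ...chunk.headingPath]
    .map((part) => part.trim())
    .filter((part) => part.length > 0)
    .join(" > ");
  return anchor.length > 0 ? `${anchor}\n\n${chunk.content}` : chunk.content;
}

// Guard: refuse to pair vectors with chunks when the counts differ,
// instead of silently attaching vectors to the wrong chunks.
function pairEmbeddings(
  chunks: Chunk[],
  vectors: number[][],
): Array<{ chunk: Chunk; vector: number[] }> {
  if (chunks.length !== vectors.length) {
    throw new Error(
      `embedding count mismatch: ${chunks.length} chunks, ${vectors.length} vectors`,
    );
  }
  return chunks.map((chunk, i) => ({ chunk, vector: vectors[i]! }));
}
```

Failing loudly on a count mismatch is one possible design choice; the actual guard in this PR may skip the batch or retry instead.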

Why

From a 30-day analytics gap analysis: the top empty-result drivers were the indexer pointing at the retired docs tree, snippet-composed pages indexing empty, and no way to distinguish synthetic/analysis traffic from real users.
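To make the last gap concrete, the sketch below shows one way a query-log row could be tagged from the `X-Pathfinder-Source` header and then filtered so analysis traffic is excluded from user-facing metrics. The `QueryLogTags` shape, the `"unknown"` fallback, the `"analysis"` source value, and the helper names are all assumptions for illustration; only the header name and the `request_source` / `session_id` fields come from the Summary.

```typescript
// Hypothetical tag shape for a query-log row.
interface QueryLogTags {
  request_source: string;
  session_id: string | null;
}

// HTTP header names are case-insensitive, so compare in lowercase.
function readHeader(
  headers: Record<string, string | undefined>,
  name: string,
): string | undefined {
  const wanted = name.toLowerCase();
  for (const key of Object.keys(headers)) {
    if (key.toLowerCase() === wanted) return headers[key];
  }
  return undefined;
}

// Derive the tags for one request. Falling back to "unknown" when the
// header is missing or blank is an assumption of this sketch.
function tagsFromRequest(
  headers: Record<string, string | undefined>,
  sessionId: string | null,
): QueryLogTags {
  const raw = (readHeader(headers, "X-Pathfinder-Source") ?? "").trim();
  return {
    request_source: raw.length > 0 ? raw : "unknown",
    session_id: sessionId,
  };
}

// Keep only rows that did not come from the given internal sources.
function excludeSources<T extends QueryLogTags>(rows: T[], internal: string[]): T[] {
  return rows.filter((row) => internal.indexOf(row.request_source) === -1);
}
```

With tags like these, a dashboard query could call `excludeSources(rows, ["analysis"])` before computing empty-result rates, assuming the analysis tooling sets that header value.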

Test plan

  • tsc --noEmit clean; prettier --check "src/**/*.ts" clean
  • full suite green (3608 tests / 242 files)
  • CI green on this PR

Follow-up (separate PR)

Pre-existing indexing-durability and dashboard-consistency issues surfaced during review (delete-before-upsert chunk loss, "All time" 366-day cap, ag-ui-docs .md 404s, code-fence heading-path attribution) will be handled in a dedicated PR. None of them are introduced here.

🤖 Generated with Claude Code
