Add llmwiki live provider with lazy registration, entities API, and Langflow plugin #25

Merged
ethanj merged 1 commit into main from sync/monorepo-9f7ed3d on Jun 11, 2026

ethanj commented Jun 11, 2026

Summary

Introduces three major new surfaces: the @atomicmemory/llmwiki package (a bridge from llmwiki JSON exports into AtomicMemory verbatim records), a Core entities API with external-ID idempotency, and the Langflow plugin. Also ships the CLI import --type llmwiki command, SDK entities client, capability-profile conformance corpus, a retrieval-receipt service, verbatim deduplication, an offline-mode doc, and a CI-enforced release pipeline with publish guards.

Changes

@atomicmemory/llmwiki (new package)

  • Converts an llmwiki export --target json envelope into one verbatim AtomicMemory record per wiki page, preserving memory.metadata.llmwiki.* advisory metadata (kind, citations, confidence, provenance state, contradictions, aliases, freshness).
  • Deterministic external IDs (llmwiki/<projectId>/<path>), nesting guard, pagination, capability check, and a fail-safe re-import detection gate.
  • Comprehensive unit and live integration test suites.
  • LiveLLMWikiProvider behind the /live subpath: writable source-backed CRUD, search, package, and explicit compile against a local llm-wiki-compiler checkout (optional peer dependency), with scope guards on every operation and trust-fenced packaged output.
  • Light /register subpath: liveLlmwikiLazyEntry() defers loading the live provider until first use and maps a missing optional peer to the stable E_LLMWIKI_COMPILER_MISSING error code.
  • Snapshot bridge renamed SnapshotLLMWikiProvider; package version 1.1.0 with peer @atomicmemory/sdk: ^1.1.0.
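
The page-to-record mapping described above can be sketched roughly as follows. This is a minimal illustration, not the package's implementation: the envelope shape (`projectId`, `pages[].path`, `pages[].content`, `pages[].kind`, `pages[].confidence`) and the `toRecords` and `externalIdFor` helpers are hypothetical names. Only the `llmwiki/<projectId>/<path>` ID format and the `metadata.llmwiki.*` namespace come from this description.

```typescript
// Hypothetical shapes for illustration; the real llmwiki envelope and
// AtomicMemory record types may differ.
interface WikiPage {
  path: string;
  content: string;
  kind?: string;
  confidence?: number;
}

interface ExportEnvelope {
  projectId: string;
  pages: WikiPage[];
}

interface VerbatimRecord {
  externalId: string;
  content: string;
  metadata: { llmwiki: Record<string, unknown> };
}

// Builds the deterministic ID format named in the PR: llmwiki/<projectId>/<path>.
// Stripping leading slashes is an assumption made here to avoid "//" in the ID.
function externalIdFor(projectId: string, path: string): string {
  return `llmwiki/${projectId}/${path.replace(/^\/+/, "")}`;
}

// One verbatim record per wiki page; advisory fields go under metadata.llmwiki.
function toRecords(envelope: ExportEnvelope): VerbatimRecord[] {
  return envelope.pages.map((page) => ({
    externalId: externalIdFor(envelope.projectId, page.path),
    content: page.content,
    metadata: { llmwiki: { kind: page.kind, confidence: page.confidence } },
  }));
}

const records = toRecords({
  projectId: "docs",
  pages: [{ path: "concepts/memory.md", content: "# Memory", kind: "concept" }],
});
console.log(records.map((r) => r.externalId));
```

Because the ID depends only on the project and the page path, importing the same export twice yields the same IDs, which is what lets a re-import be detected.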

CLI import --type llmwiki

  • New packages/cli/src/commands/memory/import-llmwiki.ts handler dispatched from import.ts.
  • Dry-run mode (no adapter required), verbatim ingest routing, --project-id override, --allow-append-only / --accept-duplicates / --yes opt-in flags for re-import.
  • Subprocess tests drive the built binary end-to-end; unit tests cover all guard and error paths.
  • --content-class summary|redacted|raw flag added to ingest and reflected in cli-spec.json.

Core entities API

  • routes/entities.ts (CRUD + list), db/entity-settings-repository.ts, db/entity-cards-repository.ts, and schemas/entities.ts.
  • DB migrations: 0002_entity_settings.sql, 0002_memories_external_id_index.sql, 0003_memories_external_id_unique.sql.
  • External-ID field on memories with idempotency: duplicate verbatim writes with the same externalId are deduplicated by content hash.
  • middleware/asserted-user.ts for request-scoped user assertion.
  • app/capabilities-descriptor.ts and /capabilities route with OpenAPI spec served at /openapi.json.
  • services/retrieval-receipt.ts — attaches a receipt to search responses documenting what was retrieved and why.
  • Verbatim deduplication service (services/__tests__/verbatim-dedup.test.ts).
  • Trusted-proxy default, raw-content policy, and offline-mode tests.
  • packages/core/docs/OFFLINE.md — offline/air-gapped deployment guide.
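
The external-ID idempotency rule can be illustrated with a small in-memory sketch. The `writeVerbatim` function, the `Map` standing in for the memories table, and the `conflict` outcome for a same-ID write with different content are all assumptions made for illustration; this description only states that duplicate verbatim writes with the same `externalId` are deduplicated by content hash.

```typescript
import { createHash } from "node:crypto";

type WriteStatus = "created" | "deduplicated" | "conflict";

interface StoredMemory {
  externalId: string;
  contentHash: string;
  content: string;
}

// In-memory stand-in for the memories table and its unique external-ID index.
const byExternalId = new Map<string, StoredMemory>();

function sha256(text: string): string {
  return createHash("sha256").update(text, "utf8").digest("hex");
}

function writeVerbatim(externalId: string, content: string): WriteStatus {
  const contentHash = sha256(content);
  const existing = byExternalId.get(externalId);
  if (existing === undefined) {
    byExternalId.set(externalId, { externalId, contentHash, content });
    return "created";
  }
  // Same ID and same hash: a repeat of an earlier write, so nothing new is stored.
  if (existing.contentHash === contentHash) return "deduplicated";
  // Same ID but different content: reported as a conflict here (an assumption).
  return "conflict";
}

console.log(writeVerbatim("llmwiki/docs/a.md", "hello"));
console.log(writeVerbatim("llmwiki/docs/a.md", "hello"));
```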

SDK

  • src/entities/: EntitiesClient, types, and index re-export.
  • src/memory/capability-profiles.ts — canonical capability sets for providers.
  • schema/v1/ — JSON Schema for provider contract, ingest input, search result page, and capabilities descriptor; conformance corpus with golden fixture files.
  • src/memory/__tests__/conformance-corpus.test.ts validates all conformance fixtures against their schemas.
  • AtomicMemoryProvider retrieval-receipt mapping and ingest-content-class support.
  • packages/sdk/CONTRACT.md — public provider contract documentation.
  • Provider registry accepts async factories (MemoryProviderEntry.create may return a promise), awaited at the single service-initialization call site.
  • MemoryClient.initialize is memoized and concurrency-safe; a failed initialization is sticky — retrying re-throws the original error, so callers construct a new client after resolving the cause.
  • MemoryService.initialize stages provider registrations atomically with best-effort teardown, so a mid-initialization failure never leaves partial provider state observable.
  • Version 1.0.3 → 1.1.0; @atomicmemory/llmwiki pins peer ^1.1.0 because older SDKs would store an async factory's promise as the provider.
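
The memoized, sticky-failure initialization and the awaited async factory can be sketched as below. `SketchClient`, `Provider`, and `ProviderFactory` are hypothetical stand-ins, not the SDK's `MemoryClient` or `MemoryProviderEntry` types. The sketch only shows one plausible way to get the behaviour described: concurrent callers share one attempt, and a failed attempt keeps re-throwing the original error.

```typescript
interface Provider {
  name: string;
}

// A factory may return a provider directly or a promise of one.
type ProviderFactory = () => Provider | Promise<Provider>;

class SketchClient {
  private readonly create: ProviderFactory;
  private initPromise: Promise<Provider> | undefined;

  constructor(create: ProviderFactory) {
    this.create = create;
  }

  // Memoized: every caller, concurrent or later, shares the first attempt.
  // A rejection is cached as well, so a retry re-throws the original error.
  initialize(): Promise<Provider> {
    if (this.initPromise === undefined) {
      // The async wrapper awaits async factories and turns a synchronous
      // throw into a rejected promise.
      this.initPromise = (async () => this.create())();
    }
    return this.initPromise;
  }
}

async function demo(): Promise<string[]> {
  const log: string[] = [];
  let calls = 0;
  const ok = new SketchClient(async () => {
    calls += 1;
    return { name: "llmwiki-live" };
  });
  const [a, b] = await Promise.all([ok.initialize(), ok.initialize()]);
  log.push(`same=${a === b} calls=${calls}`);

  const failure = new Error("E_LLMWIKI_COMPILER_MISSING");
  const bad = new SketchClient(() => {
    throw failure;
  });
  for (let i = 0; i < 2; i++) {
    try {
      await bad.initialize();
    } catch (err) {
      log.push(`sticky=${err === failure}`);
    }
  }
  return log;
}

demo().then((log) => console.log(log.join("\n")));
```

Caching the promise rather than the resolved provider is what makes the method safe under concurrency: a second caller arriving before the factory settles receives the same pending promise instead of starting a second initialization.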

Langflow plugin (new)

  • Four Langflow components: ChatMemory, StoreMessage, SearchContext, Delete.
  • _sdk.py bridge against the AtomicMemory HTTP API; Python test suite covering all components, SDK bridge, and scope/message helpers.
  • install.mjs installer, plugin.yaml manifest, and three example flow JSONs.

OpenClaw / Codex / Cursor / Claude Code plugins

  • Skill and instructions files updated to reference --content-class and the revised verbatim ingest guidance.
  • Claude Code smoke lib: ATOMICMEMORY_API_KEY env var aligned to canonical name.

CI and release infrastructure

  • .github/workflows/publish-packages.yml — new manifest-driven publish pipeline (preflight pack-dry-run → npm Trusted Publishing → registry verification → Core Docker trigger).
  • publish-core-docker.yml — immutable-tag guard (refuses to overwrite an existing semver tag with a different digest); new retag_aliases_only path to repair missing siblings without rebuilding.
  • ci.yml: release-policy job, core-docker-smoke job (boot-only, conditional on core-affecting changes).
  • scripts/ci/release-policy.mjs + unit tests — enforces publish guardrails.
  • scripts/guards/guard-npm-publish.mjs and guard-public-push.mjs — pre-publish and pre-push safety guards; wired into all adapter and package prepublishOnly scripts.
  • scripts/git-hooks/install-hooks.mjs — auto-installs pre-push hook via prepare.
  • sync-to-private.yml — guard: fires only from the public repository to prevent phantom syncs from the internal mirror.
  • publishConfig (access: public, registry URL) added to all adapter package.json files.
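
The immutable-tag guard can be expressed as a small decision function. This is a sketch under stated assumptions: `decideTagPush`, the `skip` outcome for an unchanged digest, and the rule that non-semver aliases may move are illustrative choices. The workflow itself is only described here as refusing to overwrite an existing semver tag with a different digest.

```typescript
type TagDecision = "push" | "skip" | "refuse";

// Strict x.y.z match with an optional "v" prefix; pre-release and build
// suffixes are ignored in this sketch.
const SEMVER_TAG = /^v?\d+\.\d+\.\d+$/;

function decideTagPush(
  tag: string,
  newDigest: string,
  existingDigest: string | undefined,
): TagDecision {
  // Tag not published yet: nothing can be overwritten.
  if (existingDigest === undefined) return "push";
  // Tag already points at this exact image.
  if (existingDigest === newDigest) return "skip";
  // Mutable aliases (for example "latest") may move; semver tags may not.
  return SEMVER_TAG.test(tag) ? "refuse" : "push";
}

console.log(decideTagPush("1.1.0", "sha256:bbb", "sha256:aaa"));
```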

Why

  • llmwiki bridge: lets teams compile a knowledge base with llmwiki (standalone) and optionally load it into AtomicMemory for runtime semantic recall without duplicating source management.
  • Entities API + external ID: gives callers a stable identity handle for memories so verbatim re-imports are idempotent rather than silently duplicating content.
  • Retrieval receipt: surfaces provenance in search responses so agents and developers can trace which records influenced a result.
  • Langflow plugin: extends AtomicMemory to the Langflow visual-builder ecosystem without changes to Core.
  • SDK async factories + lazy llmwiki registration: applications can declare the llmwiki provider without paying its import cost up front; sdk 1.1.0 and llmwiki 1.1.0 version together because the lazy entry depends on the awaited-factory contract.
  • Release pipeline: replaces ad-hoc publish steps with a manifest-driven workflow that verifies tarball shape, version alignment, and registry visibility before the Docker image is triggered; publish guards block accidental npm publish outside the pipeline.

Validation

  • pnpm run build && pnpm run typecheck && pnpm run lint
  • pnpm run test (self-contained packages; Core DB tests require Postgres/pgvector provisioning)
  • pnpm run pack-dry-run && pnpm run package-metadata
  • pnpm run repo-hygiene && pnpm run security-compliance && pnpm run docs-contract
  • pnpm run release-policy && pnpm run test:guards && pnpm run test:release-policy
  • pnpm run public-integration-smoke

ethanj merged commit 74d798f into main on Jun 11, 2026
11 checks passed
ethanj deleted the sync/monorepo-9f7ed3d branch on June 11, 2026 01:55