feat: add Vector Search plugin by adamgurary · Pull Request #200 · databricks/appkit

adamgurary · 2026-03-20T17:06:33Z

Summary

Adds @databricks/appkit-vector-search — a plugin that gives Databricks Apps built with AppKit instant vector search query capabilities
Ships backend (Express routes, VS REST API client, service principal + OBO auth) and frontend (React hook, styled components with Radix UI)
Developer experience target: ~45 lines for a full search page with search box, results, filters, and keyword highlighting
82 tests included; validated against real VS index on dogfood

What's included

Backend plugin (src/plugin/):

VectorSearchPlugin.ts — plugin class with lifecycle, config, route injection
VectorSearchClient.ts — REST API client for VS endpoints
routes.ts — Express route handlers
auth.ts — service principal default, OBO opt-in per index

Frontend UI (src/ui/):

useVectorSearch React hook
SearchBox, SearchResults, SearchResultCard, SearchLoadMore components

Design decisions

| Decision | Choice | Rationale |
| --- | --- | --- |
| Package structure | Single package (backend + UI) | Shared types, single dependency, matches Lakebase plugin pattern |
| Default search mode | Hybrid (ANN + keyword) | Best out-of-the-box quality |
| Reranking | Off by default, opt-in per index | Adds latency; too slow for interactive search by default |
| Auth | Service principal default, OBO opt-in | Simple default, secure option when needed |
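The defaults in the table above could be resolved per index roughly as follows. This is a minimal sketch, not the plugin's actual configuration API: the `IndexOptions` type, its field names, the `resolveIndexOptions` helper, and the index names are all hypothetical and exist only for illustration.

```typescript
// Hypothetical option names; the real plugin config may differ.
type QueryType = "hybrid" | "ann";
type AuthMode = "service-principal" | "on-behalf-of-user";

interface IndexOptions {
  indexName: string;
  queryType?: QueryType; // defaults to hybrid (ANN + keyword)
  reranker?: boolean; // off unless opted in per index
  auth?: AuthMode; // service principal unless OBO is requested
}

type ResolvedIndexOptions = Required<IndexOptions>;

// Fill in the defaults described in the design-decision table.
function resolveIndexOptions(options: IndexOptions): ResolvedIndexOptions {
  return {
    indexName: options.indexName,
    queryType: options.queryType ?? "hybrid",
    reranker: options.reranker ?? false,
    auth: options.auth ?? "service-principal",
  };
}

// An index that accepts every default.
const products = resolveIndexOptions({
  indexName: "catalog.schema.products_index",
});

// An index that opts in to reranking and on-behalf-of-user auth.
const tickets = resolveIndexOptions({
  indexName: "catalog.schema.tickets_index",
  reranker: true,
  auth: "on-behalf-of-user",
});

console.log(products, tickets);
```

Keeping the opt-ins per index means one slow, reranked index does not change the latency profile of the others.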

Test plan

Review plugin structure against existing AppKit plugin patterns (Lakebase)
Run test suite (vitest run in packages/vector-search/)
Validate against live VS index on dogfood
Review API surface and types

Brainstorm: add PR #166 (Agent plugin) and PR #200 (Vector Search plugin) as future extension references. Rename future enhancement section to cover both Vector Search and Lakebase pgvector options.

Plan: address findings from multi-agent code review (architecture, security, performance, spec flow, pattern recognition):
- Fix cache infrastructure: use shared CacheManager pool, not fictional maxEntries config
- Clarify error contract: programmatic API errors propagate, HTTP handlers use execute() for interceptors
- Separate _chatCollect()/_embed() from HTTP handlers
- Add SSE buffer max size (1MB) to prevent OOM
- Restrict response_format to text/json_object (no json_schema v1)
- Add runtime role validation against known set
- Add model to parameter allowlist for Foundation Model API
- Add stop parameter bounds (4 entries, 256 chars)
- Standardize connection pool at 100 (was contradictory 50/100)
- Add retry on 503 for chatCollect() (cold-start resilience)
- Specify setup() throws on missing endpoint, shutdown() cleanup
- Extract SSE parser to stream/sse-parser.ts in Phase 2
- Add per-route body-parser middleware (not global)
- Update acceptance criteria and security checklist

Signed-off-by: Pawel Kosiec <pawel.kosiec@databricks.com>

Brainstorm:
- Added chatCollect() for non-streaming programmatic API
- Scoped out vision/multimodal, thinking/budget_tokens, tools/tool_choice as v2 items with specific rationale
- Added reasoning_effort to v1 scope
- Referenced PRs #166 (agent plugin) and #200 (vector search)
- Updated references with query/vision/reasoning/function-calling docs

Plan:
- Cross-referenced Databricks Query API spec vs OpenAI conventions
- Documented type sourcing decision (hand-write for v1, sourced from OpenAI API reference)
- Added SDK comparison table (OpenAI vs Anthropic vs AppKit)
- Fixed id: string | null in response types
- Noted served-model-name header for telemetry
- Documented extra_params vs top-level field convention

Signed-off-by: Pawel Kosiec <pawel.kosiec@databricks.com>

calvarjorge

These comments from the AI review make sense to me:

  #: 1
  Severity: critical
  File: plugins/vector-search/vector-search.ts
  Line: 175-189
  Issue: Missing execute() interceptor
  Description: Every other plugin wraps API calls in this.execute() for cache/retry/timeout/telemetry. This plugin calls
    the connector directly, bypassing the entire interceptor chain.
  ────────────────────────────────────────
  #: 2
  Severity: high
  File: plugins/vector-search/defaults.ts
  Line: 1-7
  Issue: Defaults defined but never used
  Description: vectorSearchDefaults is exported but never imported anywhere — dead code.
  ────────────────────────────────────────
  #: 3
  Severity: high
  File: connectors/vector-search/client.ts
  Line: 24-26
  Issue: timeout config stored but never applied
  Description: Constructor stores this.config.timeout but no AbortSignal or timeout mechanism is ever created from it.
  ────────────────────────────────────────
  #: 4
  Severity: high
  File: plugins/vector-search/vector-search.ts
  Line: 140
  Issue: No request body validation
  Description: req.body is cast directly to SearchRequest without Zod validation. The repo design philosophy states
    "Type-safe — Heavy TypeScript usage with runtime validation (Zod)". Other plugins validate at system boundaries.
  ────────────────────────────────────────
  #: 5
  Severity: medium
  File: plugins/vector-search/vector-search.ts
  Line: 349
  Issue: fromCache: false hardcoded
  Description: Since execute() isn't used, caching never happens — this field is always false, making it misleading in the
    response type.
  ────────────────────────────────────────
  #: 6
  Severity: medium
  File: plugins/vector-search/vector-search.ts
  Line: 296-298
  Issue: shutdown() calls streamManager.abortAll() but plugin never uses streaming
  Description: The plugin has no SSE streams. This won't error (base class provides streamManager), but it's misleading
    about the plugin's capabilities.
  ────────────────────────────────────────
  #: 7
  Severity: medium
  File: plugins/vector-search/vector-search.ts
  Line: 234
  Issue: Non-null assertion on endpointName!
  Description: indexConfig.endpointName! uses ! operator. Validated at setup time, but if config changes dynamically or
    setup is skipped, this will throw an untyped error.
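Finding #3 above (a stored timeout that never produces an AbortSignal) can be illustrated with a small wrapper. This is only a sketch under stated assumptions: `withTimeout` and `fakeQuery` are hypothetical names, and `fakeQuery` stands in for a connector call that honours the signal. It is not the code in connectors/vector-search/client.ts.

```typescript
// Hypothetical helper: runs `operation` with a signal that aborts after `timeoutMs`.
async function withTimeout<T>(
  operation: (signal: AbortSignal) => Promise<T>,
  timeoutMs: number,
): Promise<T> {
  const controller = new AbortController();
  const timer = setTimeout(
    () => controller.abort(new Error(`Timed out after ${timeoutMs} ms`)),
    timeoutMs,
  );
  try {
    return await operation(controller.signal);
  } finally {
    // Always clear the timer so a finished call does not keep the process alive.
    clearTimeout(timer);
  }
}

// Stand-in for a connector request that resolves after `delayMs` unless aborted.
function fakeQuery(delayMs: number, signal: AbortSignal): Promise<string[]> {
  return new Promise((resolve, reject) => {
    if (signal.aborted) {
      reject(signal.reason);
      return;
    }
    const timer = setTimeout(() => resolve(["doc-1"]), delayMs);
    signal.addEventListener(
      "abort",
      () => {
        clearTimeout(timer);
        reject(signal.reason);
      },
      { once: true },
    );
  });
}

// Usage: the first call finishes in time, the second is aborted by the timeout.
withTimeout((signal) => fakeQuery(5, signal), 200).then((rows) =>
  console.log("fast query:", rows),
);
withTimeout((signal) => fakeQuery(500, signal), 10).catch((err) =>
  console.log("slow query:", err instanceof Error ? err.message : err),
);
```

The timeout only takes effect if the underlying request actually observes the signal, which is why the reviewers treated a stored-but-unused timeout value as a defect.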

atilafassina

Great work!

I left a few comments; some are minor, but I think there are 3 critical ones:

not wrapping the plugin connector calls in the executor
returning plain objects as error
not wrapping the embeddingFn in a try/catch

The AbortController in the connector would be nice to have too, but that's not a big issue if we have the plugin calls wrapped in the executor.

The "hybrid" hardcode in the response is mostly a question from me, maybe it's a non-issue.

Once we land on the right API, we need to add docs and a route in the dev-playground. But I'd defer that until after the reviews to avoid rework. Of course, you do what you prefer :)

pkosiec

Thanks a lot for your contribution! I think most have been already said by Atila and Jorge (e.g. about the dev playground addition), just a few small comments from my side.

IMO the most important point is to figure out the apps init flow (configure required resource(s) for the plugin, to be selectable interactively as a part of the init flow), and also make sure that the API is designed around the resource. See the Files, Genie or Lakebase plugins as an example. Thanks!

Adds @databricks/appkit-vector-search — a plugin that gives Databricks Apps built with AppKit instant vector search query capabilities. Ships backend (Express routes, VS REST API client, auth) and frontend (React hook, styled components with Radix UI). Developer experience target: ~45 lines for a full search page with search box, results, filters, and keyword highlighting. 82 tests included. Validated against real VS index on dogfood.

- Move from packages/vector-search/ into packages/appkit/src/plugins/vector-search/
- Replace custom auth (ServicePrincipalTokenProvider, OboTokenExtractor) with AppKit's built-in asUser(req) and getWorkspaceClient() context
- Add VectorSearchConnector using workspaceClient.apiClient.request() instead of raw fetch with manual token management
- Plugin now extends Plugin base class with proper manifest.json, defaults.ts, this.route(), this.execute(), and toPlugin() factory
- Remove standalone package.json, tsconfig.json, and vitest config
- Register plugin and connector in index barrel exports

Addresses review feedback:
- Plugin lives under plugins/ folder alongside analytics, genie, files
- No custom auth handling: uses AppKit's built-in mechanisms
- Follows create-core-plugin patterns (manifest, defaults, connector)

Signed-off-by: Adam Gurary <adam.gurary@databricks.com>

- Connector: wrap VS API calls in telemetry spans with index name, query type, result count, and latency attributes
- Connector: check AbortSignal before executing requests
- Connector: add WideEvent context logging with query metadata
- Plugin: replace this.execute() in route handlers with direct try/catch; preserves actual error details (code, message, status) instead of swallowing them into undefined
- Remove unused SearchFilters import

Signed-off-by: Adam Gurary <adam.gurary@databricks.com>

- Wrap all connector calls in this.execute() for retry/cache/timeout/telemetry
- Use vectorSearchDefaults (was dead code, now feeds execute())
- Pass AbortSignal from execute() to connector methods
- Change error responses to {error, plugin} shape matching Files/Genie
- Throw Error instances instead of plain objects in programmatic API
- Wrap user-provided embeddingFn in try/catch
- Extract shared query prep logic into _prepareQuery() helper
- Remove fromCache (execute handles caching transparently)
- Remove streamManager.abortAll() from shutdown (no streams)
- Replace endpointName! assertion with runtime guard
- Use index queryType for next-page response instead of hardcoded "hybrid"
- Add vector_search_index resource to manifest for apps init flow
- Add dev-playground integration (server, route, nav)
- Add plugin documentation page
- Add template integration (App.tsx, VectorSearchPage, plugins.json)
- Export vectorSearch from top-level package index
- Update tests for new patterns (execute, error shapes, embeddingFn)

Co-authored-by: Isaac
Signed-off-by: Adam Gurary <adam.gurary@databricks.com>
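To make the "wrap connector calls in an executor and pass it an AbortSignal" idea concrete, here is a deliberately small retry wrapper. It is a sketch only: AppKit's real this.execute() also covers cache, timeout, and telemetry, and its signature is not shown in this thread. `executeWithRetry`, `RetryOptions`, and `flakyQuery` are hypothetical names.

```typescript
// Hypothetical options; assumes attempts >= 1.
interface RetryOptions {
  attempts: number;
  delayMs: number;
}

// Runs `call` up to `attempts` times, handing the same signal to every attempt.
async function executeWithRetry<T>(
  call: (signal: AbortSignal) => Promise<T>,
  options: RetryOptions,
  signal: AbortSignal,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= options.attempts; attempt++) {
    if (signal.aborted) {
      throw new Error(`Request aborted before attempt ${attempt}`);
    }
    try {
      return await call(signal);
    } catch (err) {
      lastError = err;
      if (attempt < options.attempts) {
        // Fixed delay between attempts; a real executor might back off instead.
        await new Promise<void>((resolve) => setTimeout(resolve, options.delayMs));
      }
    }
  }
  // Always surface an Error instance, never a plain object.
  throw lastError instanceof Error ? lastError : new Error(String(lastError));
}

// Stand-in connector call that fails twice, then succeeds.
let calls = 0;
async function flakyQuery(_signal: AbortSignal): Promise<string[]> {
  calls += 1;
  if (calls < 3) throw new Error("503 Service Unavailable");
  return ["doc-1", "doc-2"];
}

const done = executeWithRetry(
  flakyQuery,
  { attempts: 3, delayMs: 1 },
  new AbortController().signal,
);
done.then((rows) => console.log(`succeeded after ${calls} calls:`, rows));
```

Routing every connector call through one wrapper like this is what lets retry and cancellation behaviour stay consistent across plugins.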

adamgurary · 2026-04-14T23:20:27Z

Addressed all review feedback from @atilafassina, @calvarjorge, and @pkosiec. Rebased onto current main.

Blockers (Atila)

this.execute() wrapping — All connector calls now go through the interceptor chain (retry/cache/timeout/telemetry). vectorSearchDefaults feeds into execute(), no longer dead code.
Error instances — Programmatic query() throws new Error(...) instead of plain objects.
embeddingFn try/catch — User-provided embedding function wrapped in try/catch with clear error message.
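The second and third blockers can be sketched together: a user-supplied embedding function is wrapped in try/catch, and any failure is rethrown as an Error instance with a clear message. This is an illustration under assumptions, not the merged code; `EmbeddingFn`, `embedQuery`, and the message wording are hypothetical.

```typescript
// Hypothetical shape of a user-provided embedding function.
type EmbeddingFn = (text: string) => Promise<number[]>;

// Calls the user's function and converts any failure into an Error instance.
async function embedQuery(
  embeddingFn: EmbeddingFn,
  text: string,
): Promise<number[]> {
  let vector: number[];
  try {
    vector = await embeddingFn(text);
  } catch (cause) {
    // User code may throw anything, including plain objects or strings.
    const detail = cause instanceof Error ? cause.message : String(cause);
    throw new Error(`embeddingFn failed: ${detail}`);
  }
  if (!Array.isArray(vector) || vector.length === 0) {
    throw new Error("embeddingFn returned an empty vector");
  }
  return vector;
}

// Usage: one well-behaved function, one that rejects with a non-Error value.
const goodFn: EmbeddingFn = async (text) => [text.length, 0.5];
const badFn: EmbeddingFn = async () => {
  throw "model endpoint unavailable";
};

embedQuery(goodFn, "hello").then((v) => console.log("vector:", v));
embedQuery(badFn, "hello").catch((err) =>
  console.log("caught:", err instanceof Error ? err.message : err),
);
```

Callers can then rely on `err instanceof Error` and a readable message regardless of what the user's function threw.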

Pattern alignment

Error responses now use { error, plugin } shape matching Files/Genie
Extracted _prepareQuery() helper to DRY shared logic between query() and route handlers
Removed streamManager.abortAll() from shutdown() (plugin has no streams)
Removed fromCache from response type (execute handles caching transparently)
Replaced endpointName! non-null assertion with runtime guard
Next-page response uses indexConfig.queryType instead of hardcoded "hybrid"
Removed unused Context import from connector; AbortSignal passed from execute() to connector
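Two of the items above, the runtime guard that replaces `endpointName!` and the `{ error, plugin }` response shape, might look roughly like this. It is a sketch with assumed names: `IndexConfig`, `requireEndpointName`, `toErrorBody`, and the plugin identifier "vector-search" are hypothetical, and only the `{ error, plugin }` shape itself comes from the thread.

```typescript
// Hypothetical per-index config; endpointName may be missing at runtime.
interface IndexConfig {
  indexName: string;
  endpointName?: string;
}

// Runtime guard: fails with a descriptive Error instead of relying on `!`.
function requireEndpointName(alias: string, config: IndexConfig): string {
  if (!config.endpointName) {
    throw new Error(`Index "${alias}" has no endpointName configured`);
  }
  return config.endpointName;
}

// Error body in the { error, plugin } shape; the plugin name is an assumption.
function toErrorBody(err: unknown): { error: string; plugin: string } {
  return {
    error: err instanceof Error ? err.message : String(err),
    plugin: "vector-search",
  };
}

// Usage: a misconfigured index produces a typed, readable error body.
let body: { error: string; plugin: string } | undefined;
try {
  requireEndpointName("docs", { indexName: "catalog.schema.docs_index" });
} catch (err) {
  body = toErrorBody(err);
}
console.log(body);
```

Unlike the non-null assertion, the guard narrows the type to `string` and still behaves predictably if setup-time validation was skipped.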

New content

manifest.json — Added vector_search_index required resource with env fields for apps init flow
Dev playground — Server config, nav link, search route with form + results display
Docs — docs/docs/plugins/vector-search.md with usage, config options, API reference
Template — App.tsx routes, VectorSearchPage.tsx, appkit.plugins.json entry
Package export — vectorSearch added to top-level index.ts

Skipped

Zod request body validation (Jorge's AI review #4): No existing plugin uses Zod for req.body validation. All use simple if (!field) guards. Matched existing patterns.
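For context, a "simple guard" approach to an untyped request body might look like the sketch below. The `SearchRequest` shape and the `queryText` field are assumptions made for illustration; the plugin's actual request type is not shown in this thread.

```typescript
// Hypothetical request shape; the real SearchRequest may differ.
interface SearchRequest {
  queryText: string;
}

type ParseResult =
  | { ok: true; value: SearchRequest }
  | { ok: false; error: string };

// Guard-style validation of an untyped body, without a schema library.
function parseSearchRequest(body: unknown): ParseResult {
  if (typeof body !== "object" || body === null) {
    return { ok: false, error: "Request body must be a JSON object" };
  }
  const { queryText } = body as Record<string, unknown>;
  if (typeof queryText !== "string" || queryText.trim() === "") {
    return { ok: false, error: "queryText is required" };
  }
  return { ok: true, value: { queryText } };
}

console.log(parseSearchRequest({ queryText: "wireless headphones" }));
console.log(parseSearchRequest({ queryText: 42 }));
```

Guards like this keep the plugin consistent with its neighbours; a schema library would mainly pay off once request bodies grow nested filters or many optional fields.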

Note on CI

CI doesn't appear to run on fork PRs (protected runner group). Could not run lint/typecheck/tests locally either — the Databricks npm proxy (npm-proxy.dev.databricks.com) is returning empty replies / ECONNRESET. If a maintainer can trigger CI or approve the workflow run, that would help verify. Happy to fix anything CI catches.

atilafassina

🏆 Amazing work, @adamgurary
thank you very much

What's next:
I'll QA and update the docs, then we'll prepare a new release with the plugin available.

I'll ping you for review when writing the skills for it

Co-authored-by: Isaac
Signed-off-by: Atila Fassina <atila@fassina.eu>

varunrao mentioned this pull request Mar 27, 2026

[Enhancement] Leverage AppKit Genie Plugin and Agent Plugin for live in-app Agent & Genie integration databricks-solutions/vibe-coding-workshop-app#4

Open

jamesbroadhead requested review from MarioCadenas and pkosiec April 1, 2026 14:23

calvarjorge reviewed Apr 2, 2026

atilafassina requested changes Apr 2, 2026

pkosiec reviewed Apr 2, 2026

Comment thread packages/appkit/src/plugins/vector-search/manifest.json Outdated

Comment thread packages/appkit/src/plugins/vector-search/manifest.json

Comment thread packages/appkit/src/plugins/index.ts

Adam Gurary added 4 commits April 14, 2026 16:15

adamgurary force-pushed the feat/vector-search-plugin branch from d6bd39a to 2f89bdf Compare April 14, 2026 23:19

atilafassina changed the title ~~Add Vector Search plugin~~ feat: add Vector Search plugin Apr 15, 2026

atilafassina approved these changes Apr 15, 2026

MarioCadenas enabled auto-merge (squash) April 15, 2026 14:10

MarioCadenas disabled auto-merge April 15, 2026 14:12

atilafassina force-pushed the feat/vector-search-plugin branch from 1b20b49 to 0081ab1 Compare April 15, 2026 14:16

chore: fix linting, removing export

1ec2e11

Co-authored-by: Isaac
Signed-off-by: Atila Fassina <atila@fassina.eu>

atilafassina force-pushed the feat/vector-search-plugin branch from 0081ab1 to 1ec2e11 Compare April 15, 2026 14:22

MarioCadenas merged commit 279954e into databricks:main Apr 15, 2026

MarioCadenas merged 5 commits into databricks:main from adamgurary:feat/vector-search-plugin

5 participants
