diff --git a/skills/vector-search/SKILL.md b/skills/vector-search/SKILL.md
index 92c2a81..28570e9 100644
--- a/skills/vector-search/SKILL.md
+++ b/skills/vector-search/SKILL.md
@@ -1,6 +1,6 @@
 ---
 name: documentdb-vector-search
-description: Vector search best practices for Azure DocumentDB using `cosmosSearch` — choosing between DiskANN / HNSW / IVF, creating indexes, tuning `lBuild` / `lSearch` / `maxDegree`, Product Quantization (up to 16,000 dims), half-precision (fp16) indexing, and normalizing embeddings for cosine similarity. Use when building RAG / semantic-search applications, creating a vector index, tuning recall/latency, or reducing vector-index memory footprint.
+description: Vector search best practices for Azure DocumentDB using `cosmosSearch` — choosing between DiskANN / HNSW / IVF, creating indexes via the raw `db.command({ createIndexes, ... })` shape (because typed driver `createIndex` wrappers silently drop `cosmosSearchOptions`), tuning `lBuild` / `lSearch` / `maxDegree`, Product Quantization (up to 16,000 dims), half-precision (fp16) indexing, and normalizing embeddings for cosine similarity. Use when building RAG / semantic-search applications, creating a vector index, tuning recall/latency, or reducing vector-index memory footprint.
 license: MIT
 ---
 
@@ -19,7 +19,7 @@ Similarity options: `COS` (cosine), `L2` (Euclidean), `IP` (inner product).
 ## Rules
 
 - [vector-choose-index-type](vector-choose-index-type.md) — Prefer DiskANN for production; use HNSW up to 50k, IVF under 10k.
-- [vector-create-diskann-index](vector-create-diskann-index.md) — Create a `vector-diskann` index with correct `dimensions`, `similarity`, `maxDegree`, and `lBuild`.
+- [vector-create-diskann-index](vector-create-diskann-index.md) — Create a `vector-diskann` index with correct `dimensions`, `similarity`, `maxDegree`, and `lBuild`. **Use `db.command({ createIndexes, ... })` from driver code** — the Node.js / PyMongo / .NET typed `createIndex` wrappers silently drop `cosmosSearchOptions`.
 - [vector-knn-query](vector-knn-query.md) — Query with `$search` + `cosmosSearch`; tune `lSearch` and `k`; combine with pre-filters.
 - [vector-product-quantization](vector-product-quantization.md) — Shrink high-dimensional vectors (up to 16,000 dims) while preserving recall.
 - [vector-half-precision](vector-half-precision.md) — Halve vector memory with fp16 indexing and minimal recall loss.
diff --git a/skills/vector-search/vector-create-diskann-index.md b/skills/vector-search/vector-create-diskann-index.md
index 57c12f1..7b531b0 100644
--- a/skills/vector-search/vector-create-diskann-index.md
+++ b/skills/vector-search/vector-create-diskann-index.md
@@ -6,6 +6,12 @@
 
 DiskANN is the recommended vector index in Azure DocumentDB for production-scale workloads. Its `maxDegree`, `lBuild`, and the query-time `lSearch` parameters trade off build time, memory, recall, and query latency. The `dimensions` and `similarity` values **must match** the embeddings you insert — mismatches produce wrong results silently.
 
+> ### ⚠️ Driver-safety: use `db.command(...)`, not `collection.createIndex(...)`
+>
+> The typed `createIndex` wrappers in the official **Node.js**, **PyMongo (sync `create_index`)**, and **.NET** drivers serialize against a fixed `IndexDescription` / `CreateIndexOptions` schema and **silently drop unknown option keys** — including `cosmosSearchOptions`. The wire message goes out without it, the server creates a plain B-tree index on the embedding field, and your app falls back to brute-force scans with **no error**. Upgrading the driver does not fix this; it is a typed-API limitation, not a bug.
+>
+> `mongosh` works because the shell forwards arbitrary keys straight to the underlying `createIndexes` command. To get the same behavior from a driver, bypass the typed wrapper and issue the raw command via `db.command(...)` (Node / PyMongo) or `RunCommand` (.NET). See the per-driver examples below.
+
 Parameter guide:
 
 | Parameter | Range | Default | Notes |
@@ -18,8 +24,9 @@ Parameter guide:
 
 ## Incorrect
 
+### Wrong dimensions vs. the embedding model — silently-incorrect results
+
 ```javascript
-// Wrong dimensions vs. the embedding model -> silently-incorrect results
 db.products.createIndex(
   { embedding: "cosmosSearch" },
   { cosmosSearchOptions: { kind: "vector-diskann", dimensions: 768, similarity: "L2" } }
@@ -27,8 +34,40 @@ db.products.createIndex(
 // ...but the app uses 1536-dim OpenAI text-embedding-3-small with cosine similarity.
 ```
 
+### Using the Node.js / PyMongo typed `createIndex` wrapper — silently creates a plain index
+
+```javascript
+// Node.js driver — DO NOT use for cosmosSearch indexes.
+// The driver's IndexDescription type strips `cosmosSearchOptions`
+// before sending. No error is raised. The resulting index is NOT a
+// vector index, and queries fall back to brute-force scans.
+await db.collection("products").createIndex(
+  { embedding: "cosmosSearch" },
+  {
+    name: "products_embedding_diskann",
+    cosmosSearchOptions: { // ← dropped on the wire
+      kind: "vector-diskann",
+      dimensions: 1536,
+      similarity: "COS"
+    }
+  }
+);
+```
+
+```python
+# PyMongo sync — same problem. `create_index` ignores cosmosSearchOptions.
+db.products.create_index(
+    [("embedding", "cosmosSearch")],
+    name="products_embedding_diskann",
+    cosmosSearchOptions={"kind": "vector-diskann", "dimensions": 1536,
+                         "similarity": "COS"},
+)
+```
+
 ## Correct
 
+### `mongosh` (or any shell that forwards unknown keys)
+
 ```javascript
 db.products.createIndex(
   { embedding: "cosmosSearch" },
@@ -45,8 +84,92 @@ db.products.createIndex(
 );
 ```
 
+### Node.js driver — use `db.command({ createIndexes, ... })`
+
+```javascript
+await db.command({
+  createIndexes: "products",
+  indexes: [
+    {
+      name: "products_embedding_diskann",
+      key: { embedding: "cosmosSearch" },
+      cosmosSearchOptions: {
+        kind: "vector-diskann",
+        dimensions: 1536,
+        similarity: "COS",
+        maxDegree: 32,
+        lBuild: 50
+      }
+    }
+  ]
+});
+```
+
+### PyMongo — same shape via `db.command(...)`
+
+```python
+db.command({
+    "createIndexes": "products",
+    "indexes": [
+        {
+            "name": "products_embedding_diskann",
+            "key": {"embedding": "cosmosSearch"},
+            "cosmosSearchOptions": {
+                "kind": "vector-diskann",
+                "dimensions": 1536,
+                "similarity": "COS",
+                "maxDegree": 32,
+                "lBuild": 50,
+            },
+        }
+    ],
+})
+```
+
+### .NET driver — use `IMongoDatabase.RunCommand`
+
+```csharp
+var cmd = new BsonDocument
+{
+    { "createIndexes", "products" },
+    { "indexes", new BsonArray
+        {
+            new BsonDocument
+            {
+                { "name", "products_embedding_diskann" },
+                { "key", new BsonDocument("embedding", "cosmosSearch") },
+                { "cosmosSearchOptions", new BsonDocument
+                    {
+                        { "kind", "vector-diskann" },
+                        { "dimensions", 1536 },
+                        { "similarity", "COS" },
+                        { "maxDegree", 32 },
+                        { "lBuild", 50 },
+                    }
+                }
+            }
+        }
+    }
+};
+await db.RunCommandAsync<BsonDocument>(cmd);
+```
+
 If you change embedding models, **rebuild the index** — mixing dimensions or similarities corrupts results.
 
+## Verifying the index was actually created as a vector index
+
+Because the typed-wrapper failure is silent, always verify the index shape right after creation:
+
+```javascript
+db.products.getIndexes()
+  .find(i => i.name === "products_embedding_diskann");
+// Expect the result to include a `cosmosSearchOptions` block with
+// `kind: "vector-diskann"`. If that block is missing, the driver
+// dropped it — re-create via `db.command(...)` above.
+```
+
 ## References
 
 - [Vector search — DiskANN index creation](https://learn.microsoft.com/azure/documentdb/vector-search)
+- [MongoDB `createIndexes` command](https://www.mongodb.com/docs/manual/reference/command/createIndexes/)
+
diff --git a/skills/vector-search/vector-half-precision.md b/skills/vector-search/vector-half-precision.md
index 1dadc6a..1cb3d6d 100644
--- a/skills/vector-search/vector-half-precision.md
+++ b/skills/vector-search/vector-half-precision.md
@@ -44,6 +44,8 @@ db.products.createIndex(
 
 Run an A/B recall check against the prior fp32 index with a representative query set; for most production text embeddings the delta is well under 1%.
 
+> When creating this index from Node.js, PyMongo, or .NET driver code, issue it via `db.command({ createIndexes, ... })` rather than `collection.createIndex(...)` — the typed wrappers strip `cosmosSearchOptions`. See [vector-create-diskann-index](vector-create-diskann-index.md) for the driver-safe shape.
+
 ## References
 
 - [Half-Precision Vector Indexing](https://learn.microsoft.com/azure/documentdb/half-precision)
diff --git a/skills/vector-search/vector-product-quantization.md b/skills/vector-search/vector-product-quantization.md
index 068ffd8..7b39a72 100644
--- a/skills/vector-search/vector-product-quantization.md
+++ b/skills/vector-search/vector-product-quantization.md
@@ -50,6 +50,8 @@ db.products.createIndex(
 
 Validate recall after enabling PQ with a held-out query set; raise `lSearch` at query time to recover any recall lost to quantization.
 
+> When creating this index from Node.js, PyMongo, or .NET driver code, issue it via `db.command({ createIndexes, ... })` rather than `collection.createIndex(...)` — the typed wrappers strip `cosmosSearchOptions`. See [vector-create-diskann-index](vector-create-diskann-index.md) for the driver-safe shape.
+
 ## References
 
 - [Product Quantization for DiskANN in Azure DocumentDB](https://learn.microsoft.com/azure/documentdb/product-quantization)