4 changes: 2 additions & 2 deletions skills/vector-search/SKILL.md
@@ -1,6 +1,6 @@
---
name: documentdb-vector-search
description: Vector search best practices for Azure DocumentDB using `cosmosSearch` — choosing between DiskANN / HNSW / IVF, creating indexes, tuning `lBuild` / `lSearch` / `maxDegree`, Product Quantization (up to 16,000 dims), half-precision (fp16) indexing, and normalizing embeddings for cosine similarity. Use when building RAG / semantic-search applications, creating a vector index, tuning recall/latency, or reducing vector-index memory footprint.
description: Vector search best practices for Azure DocumentDB using `cosmosSearch` — choosing between DiskANN / HNSW / IVF, creating indexes via the raw `db.command({ createIndexes, ... })` shape (because typed driver `createIndex` wrappers silently drop `cosmosSearchOptions`), tuning `lBuild` / `lSearch` / `maxDegree`, Product Quantization (up to 16,000 dims), half-precision (fp16) indexing, and normalizing embeddings for cosine similarity. Use when building RAG / semantic-search applications, creating a vector index, tuning recall/latency, or reducing vector-index memory footprint.
license: MIT
---

@@ -19,7 +19,7 @@ Similarity options: `COS` (cosine), `L2` (Euclidean), `IP` (inner product).
## Rules

- [vector-choose-index-type](vector-choose-index-type.md) — Prefer DiskANN for production; use HNSW up to 50k, IVF under 10k.
- [vector-create-diskann-index](vector-create-diskann-index.md) — Create a `vector-diskann` index with correct `dimensions`, `similarity`, `maxDegree`, and `lBuild`.
- [vector-create-diskann-index](vector-create-diskann-index.md) — Create a `vector-diskann` index with correct `dimensions`, `similarity`, `maxDegree`, and `lBuild`. **Use `db.command({ createIndexes, ... })` from driver code** — the Node.js / PyMongo / .NET typed `createIndex` wrappers silently drop `cosmosSearchOptions`.
- [vector-knn-query](vector-knn-query.md) — Query with `$search` + `cosmosSearch`; tune `lSearch` and `k`; combine with pre-filters.
- [vector-product-quantization](vector-product-quantization.md) — Shrink high-dimensional vectors (up to 16,000 dims) while preserving recall.
- [vector-half-precision](vector-half-precision.md) — Halve vector memory with fp16 indexing and minimal recall loss.
125 changes: 124 additions & 1 deletion skills/vector-search/vector-create-diskann-index.md
@@ -6,6 +6,12 @@

DiskANN is the recommended vector index in Azure DocumentDB for production-scale workloads. Its `maxDegree`, `lBuild`, and the query-time `lSearch` parameters trade off build time, memory, recall, and query latency. The `dimensions` and `similarity` values **must match** the embeddings you insert — mismatches produce wrong results silently.

> ### ⚠️ Driver-safety: use `db.command(...)`, not `collection.createIndex(...)`
>
> The typed `createIndex` wrappers in the official **Node.js**, **PyMongo (sync `create_index`)**, and **.NET** drivers serialize against a fixed `IndexDescription` / `CreateIndexOptions` schema and **silently drop unknown option keys** — including `cosmosSearchOptions`. The wire message goes out without it, the server creates a plain B-tree index on the embedding field, and your app falls back to brute-force scans with **no error**. Upgrading the driver does not fix this; it is a typed-API limitation, not a bug.
>
> `mongosh` works because the shell forwards arbitrary keys straight to the underlying `createIndexes` command. To get the same behavior from a driver, bypass the typed wrapper and issue the raw command via `db.command(...)` (Node / PyMongo) or `RunCommand` (.NET). See the per-driver examples below.

Parameter guide:

| Parameter | Range | Default | Notes |
@@ -18,17 +24,50 @@ Parameter guide:

## Incorrect

### Wrong dimensions vs. the embedding model — silently-incorrect results

```javascript
// Wrong dimensions vs. the embedding model -> silently-incorrect results
db.products.createIndex(
{ embedding: "cosmosSearch" },
{ cosmosSearchOptions: { kind: "vector-diskann", dimensions: 768, similarity: "L2" } }
);
// ...but the app uses 1536-dim OpenAI text-embedding-3-small with cosine similarity.
```

### Using the Node.js / PyMongo typed `createIndex` wrapper — silently creates a plain index

```javascript
// Node.js driver — DO NOT use for cosmosSearch indexes.
// The driver's IndexDescription type strips `cosmosSearchOptions`
// before sending. No error is raised. The resulting index is NOT a
// vector index, and queries fall back to brute-force scans.
await db.collection("products").createIndex(
{ embedding: "cosmosSearch" },
{
name: "products_embedding_diskann",
cosmosSearchOptions: { // ← dropped on the wire
kind: "vector-diskann",
dimensions: 1536,
similarity: "COS"
}
}
);
```

```python
# PyMongo sync — same problem. `create_index` ignores cosmosSearchOptions.
db.products.create_index(
[("embedding", "cosmosSearch")],
name="products_embedding_diskann",
cosmosSearchOptions={"kind": "vector-diskann", "dimensions": 1536,
"similarity": "COS"},
)
```

## Correct

### `mongosh` (or any shell that forwards unknown keys)

```javascript
db.products.createIndex(
{ embedding: "cosmosSearch" },
@@ -45,8 +84,92 @@ db.products.createIndex(
);
```

### Node.js driver — use `db.command({ createIndexes, ... })`

```javascript
await db.command({
createIndexes: "products",
indexes: [
{
name: "products_embedding_diskann",
key: { embedding: "cosmosSearch" },
cosmosSearchOptions: {
kind: "vector-diskann",
dimensions: 1536,
similarity: "COS",
maxDegree: 32,
lBuild: 50
}
}
]
});
```

### PyMongo — same shape via `db.command(...)`

```python
db.command({
"createIndexes": "products",
"indexes": [
{
"name": "products_embedding_diskann",
"key": {"embedding": "cosmosSearch"},
"cosmosSearchOptions": {
"kind": "vector-diskann",
"dimensions": 1536,
"similarity": "COS",
"maxDegree": 32,
"lBuild": 50,
},
}
],
})
```
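
If several collections need the same index shape, the raw command document can be produced by a small helper. The following is a minimal sketch, not part of the skill itself: the function name `build_diskann_index_command` and its defaults are hypothetical, and the `maxDegree` / `lBuild` values simply mirror the example above. It relies on Python dicts preserving insertion order so that `createIndexes` stays the first key, which `db.command(...)` needs when given a plain dict.

```python
from typing import Any, Dict, Optional

# Similarity options named in SKILL.md: COS (cosine), L2 (Euclidean), IP (inner product).
ALLOWED_SIMILARITIES = {"COS", "L2", "IP"}


def build_diskann_index_command(
    collection: str,
    field: str,
    dimensions: int,
    similarity: str = "COS",
    max_degree: int = 32,   # illustrative default copied from the example above
    l_build: int = 50,      # illustrative default copied from the example above
    name: Optional[str] = None,
) -> Dict[str, Any]:
    """Build the raw `createIndexes` command document (hypothetical helper)."""
    if dimensions <= 0:
        raise ValueError("dimensions must be a positive integer")
    if similarity not in ALLOWED_SIMILARITIES:
        raise ValueError(f"similarity must be one of {sorted(ALLOWED_SIMILARITIES)}")
    return {
        # The command name must be the first key of the document.
        "createIndexes": collection,
        "indexes": [
            {
                "name": name or f"{collection}_{field}_diskann",
                "key": {field: "cosmosSearch"},
                "cosmosSearchOptions": {
                    "kind": "vector-diskann",
                    "dimensions": dimensions,
                    "similarity": similarity,
                    "maxDegree": max_degree,
                    "lBuild": l_build,
                },
            }
        ],
    }


# Usage, assuming `db` is an existing pymongo Database handle:
# db.command(build_diskann_index_command("products", "embedding", 1536))
```

Keeping the builder free of any driver call makes the command shape easy to unit-test without a live cluster.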

### .NET driver — use `IMongoDatabase.RunCommand`

```csharp
var cmd = new BsonDocument
{
{ "createIndexes", "products" },
{ "indexes", new BsonArray
{
new BsonDocument
{
{ "name", "products_embedding_diskann" },
{ "key", new BsonDocument("embedding", "cosmosSearch") },
{ "cosmosSearchOptions", new BsonDocument
{
{ "kind", "vector-diskann" },
{ "dimensions", 1536 },
{ "similarity", "COS" },
{ "maxDegree", 32 },
{ "lBuild", 50 },
}
}
}
}
}
};
await db.RunCommandAsync<BsonDocument>(cmd);
```

If you change embedding models, **rebuild the index** — mixing dimensions or similarities corrupts results.
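
Because a dimension mismatch produces wrong results without raising an error, one option is to guard writes on the application side. The sketch below is an assumption about how such a guard might look, not something the skill prescribes: `check_embedding` and `INDEX_DIMENSIONS` are hypothetical names, and the value 1536 mirrors the examples above.

```python
import math
from typing import Sequence

# Hypothetical constant: must equal the `dimensions` value used when the index was created.
INDEX_DIMENSIONS = 1536


def check_embedding(vector: Sequence[float], index_dimensions: int = INDEX_DIMENSIONS) -> None:
    """Raise ValueError if the vector cannot belong to the index (hypothetical helper)."""
    if len(vector) != index_dimensions:
        raise ValueError(
            f"embedding has {len(vector)} dimensions, index expects {index_dimensions}"
        )
    if not all(math.isfinite(x) for x in vector):
        raise ValueError("embedding contains NaN or infinite values")


# Usage: call before every insert or update that writes the embedding field.
# check_embedding(doc["embedding"])
```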

## Verifying the index was actually created as a vector index

Because the typed-wrapper failure is silent, always verify the index shape right after creation:

```javascript
db.products.getIndexes()
.find(i => i.name === "products_embedding_diskann");
// Expect the result to include a `cosmosSearchOptions` block with
// `kind: "vector-diskann"`. If that block is missing, the driver
// dropped it — re-create via `db.command(...)` above.
```
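
The same check can be run from driver code, for example in a deployment script. This is a minimal sketch under one assumption taken from the note above: that the server's index listing includes the `cosmosSearchOptions` block for a real vector index. The function name `assert_vector_index` is hypothetical; it only inspects a list of index documents, so it works on whatever `list_indexes()` returns.

```python
from typing import Any, Iterable, Mapping


def assert_vector_index(
    indexes: Iterable[Mapping[str, Any]],
    name: str,
    expected_kind: str = "vector-diskann",
) -> None:
    """Raise if the named index is missing or is not a vector index (hypothetical helper)."""
    for index in indexes:
        if index.get("name") != name:
            continue
        options = index.get("cosmosSearchOptions")
        if not options:
            # Assumed symptom of the typed-wrapper problem: the options block is absent.
            raise RuntimeError(
                f"index {name!r} has no cosmosSearchOptions; re-create it with db.command(...)"
            )
        if options.get("kind") != expected_kind:
            raise RuntimeError(
                f"index {name!r} has kind {options.get('kind')!r}, expected {expected_kind!r}"
            )
        return
    raise LookupError(f"index {name!r} not found")


# Usage, assuming `db` is an existing pymongo Database handle:
# assert_vector_index(db.products.list_indexes(), "products_embedding_diskann")
```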

## References

- [Vector search — DiskANN index creation](https://learn.microsoft.com/azure/documentdb/vector-search)
- [MongoDB `createIndexes` command](https://www.mongodb.com/docs/manual/reference/command/createIndexes/)

2 changes: 2 additions & 0 deletions skills/vector-search/vector-half-precision.md
@@ -44,6 +44,8 @@ db.products.createIndex(

Run an A/B recall check against the prior fp32 index with a representative query set; for most production text embeddings the delta is well under 1%.
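
One way to score such an A/B check is mean recall@k: for each query, the fraction of the baseline (fp32) top-k document IDs that also appear in the candidate (fp16) top-k. The sketch below assumes you have already collected the ranked ID lists from both indexes; the function names are hypothetical, and treating the fp32 results as ground truth is itself an approximation unless they come from an exact search.

```python
from typing import Hashable, Sequence


def recall_at_k(baseline: Sequence[Hashable], candidate: Sequence[Hashable], k: int) -> float:
    """Fraction of the baseline top-k IDs that also appear in the candidate top-k."""
    expected = set(baseline[:k])
    if not expected:
        return 1.0  # nothing to recall for this query
    return len(expected & set(candidate[:k])) / len(expected)


def mean_recall_at_k(
    baseline_runs: Sequence[Sequence[Hashable]],
    candidate_runs: Sequence[Sequence[Hashable]],
    k: int,
) -> float:
    """Average recall@k over a query set; both inputs hold one ranked ID list per query."""
    if len(baseline_runs) != len(candidate_runs):
        raise ValueError("baseline and candidate must cover the same queries")
    if not baseline_runs:
        raise ValueError("query set is empty")
    scores = [recall_at_k(b, c, k) for b, c in zip(baseline_runs, candidate_runs)]
    return sum(scores) / len(scores)


# Usage: a recall delta of 1 - mean_recall_at_k(fp32_ids, fp16_ids, k=10)
# can then be compared against the tolerance your application accepts.
```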

> When creating this index from Node.js, PyMongo, or .NET driver code, issue it via `db.command({ createIndexes, ... })` rather than `collection.createIndex(...)` — the typed wrappers strip `cosmosSearchOptions`. See [vector-create-diskann-index](vector-create-diskann-index.md) for the driver-safe shape.

## References

- [Half-Precision Vector Indexing](https://learn.microsoft.com/azure/documentdb/half-precision)
2 changes: 2 additions & 0 deletions skills/vector-search/vector-product-quantization.md
@@ -50,6 +50,8 @@ db.products.createIndex(

Validate recall after enabling PQ with a held-out query set; raise `lSearch` at query time to recover any recall lost to quantization.

> When creating this index from Node.js, PyMongo, or .NET driver code, issue it via `db.command({ createIndexes, ... })` rather than `collection.createIndex(...)` — the typed wrappers strip `cosmosSearchOptions`. See [vector-create-diskann-index](vector-create-diskann-index.md) for the driver-safe shape.

## References

- [Product Quantization for DiskANN in Azure DocumentDB](https://learn.microsoft.com/azure/documentdb/product-quantization)