Document db.command() as the driver-safe way to create cosmosSearch indexes #26

Open: khelanmodi wants to merge 1 commit into Azure:main from khelanmodi:fix/vector-index-db-command

@khelanmodi
Collaborator

Summary

The Node.js, PyMongo (sync create_index), and .NET MongoDB drivers serialize index specs against a fixed IndexDescription / CreateIndexOptions schema and silently strip unknown option keys before sending the wire message. cosmosSearchOptions is one of those unknown keys.

Effect: the createIndexes command goes out without cosmosSearchOptions, the server creates a plain B-tree index on the embedding field, and the app silently falls back to brute-force scans on every vector query. No error is raised, and upgrading the driver does not fix it (it is a typed-API limitation, not a bug).
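To make the failure mode concrete, here is a toy model of the behavior described above. It is not driver code: `KNOWN_INDEX_OPTIONS` and `typed_index_spec` are hypothetical names standing in for a typed wrapper that keeps only the option keys in its schema and raises nothing for the rest.

```python
from typing import Any, Dict

# Hypothetical whitelist standing in for a driver's typed index-option schema.
# Real drivers define their own schema; this set is for illustration only.
KNOWN_INDEX_OPTIONS = {"name", "unique", "sparse", "expireAfterSeconds"}


def typed_index_spec(key: Dict[str, Any], options: Dict[str, Any]) -> Dict[str, Any]:
    """Toy typed wrapper: copy known options, drop unknown ones without error."""
    spec: Dict[str, Any] = {"key": key}
    for name, value in options.items():
        if name in KNOWN_INDEX_OPTIONS:
            spec[name] = value
        # Unknown keys (such as cosmosSearchOptions) are skipped silently.
    return spec


wire_spec = typed_index_spec(
    {"embedding": "cosmosSearch"},
    {
        "name": "vector_idx",
        "cosmosSearchOptions": {"kind": "vector-diskann", "dimensions": 1536},
    },
)
print("cosmosSearchOptions" in wire_spec)
```

In this sketch the spec that would reach the wire still has a name and a key, so nothing looks wrong to the caller, yet the vector options are gone.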

mongosh works because the shell forwards arbitrary option keys straight to the underlying createIndexes command. To get the same behavior from a driver, bypass the typed wrapper and issue the raw command via db.command({ createIndexes, indexes: [...] }) (Node / PyMongo) or RunCommand (.NET).
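A minimal PyMongo-flavored sketch of the raw-command route follows. The helper names, the default index name, and the option values (`vector-diskann`, `COS`) are assumptions for illustration; check them against the rule file and your embedding model. The only driver call used is `Database.command`, which accepts a command document as a dict.

```python
from typing import Any, Dict


def build_create_diskann_index_command(
    collection: str,
    field: str,
    dimensions: int,
    index_name: str = "vector_diskann_idx",  # hypothetical default name
) -> Dict[str, Any]:
    """Build a raw createIndexes command document for a cosmosSearch index."""
    return {
        # The command name must be the first key; dicts keep insertion order.
        "createIndexes": collection,
        "indexes": [
            {
                "name": index_name,
                "key": {field: "cosmosSearch"},
                # Assumed option values; adjust to your data and model.
                "cosmosSearchOptions": {
                    "kind": "vector-diskann",
                    "dimensions": dimensions,
                    "similarity": "COS",
                },
            }
        ],
    }


def create_diskann_index(db: Any, collection: str, field: str, dimensions: int) -> Any:
    """Send the raw command via db.command, bypassing the typed wrapper.

    `db` is expected to be a pymongo.database.Database.
    """
    command = build_create_diskann_index_command(collection, field, dimensions)
    return db.command(command)
```

With a connected client this would be called as `create_diskann_index(client["mydb"], "docs", "embedding", 1536)`, where the database, collection, and field names are placeholders.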

The kit's current vector-create-diskann-index.md shows only the mongosh-style collection.createIndex(spec, opts) call. That makes the antipattern look universal, and it is the form the agent emits by default.

Changes

  • skills/vector-search/vector-create-diskann-index.md
    • Driver-safety callout at the top of the rule.
    • Keeps the mongosh example, but adds the db.command(...) equivalent for Node.js, PyMongo, and .NET as the canonical pattern for application code.
    • Adds an Incorrect example showing the typed collection.createIndex call (Node + Python) so the agent learns to avoid it.
    • Adds a getIndexes() verification snippet so the silent failure is caught at index-creation time.
  • skills/vector-search/SKILL.md: updates the rule's description line and the skill's description: frontmatter so routing surfaces the db.command requirement.
  • vector-half-precision.md, vector-product-quantization.md: adds a short cross-reference to the driver-safe shape so the agent doesn't relearn the antipattern from these sibling rules.
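The verification idea from the list above could look roughly like this in PyMongo. It is a sketch under one assumption that should be confirmed against a real deployment: that the index documents returned by `list_indexes()` echo `cosmosSearchOptions` for a correctly created vector index. The helper names are hypothetical.

```python
from typing import Any, Iterable, Mapping, Optional


def find_vector_index(
    indexes: Iterable[Mapping[str, Any]],
    field: str,
    expected_kind: str,
) -> Optional[Mapping[str, Any]]:
    """Return the first index on `field` whose cosmosSearchOptions has the expected kind."""
    for index in indexes:
        key = index.get("key", {})
        options = index.get("cosmosSearchOptions")
        if field in key and options and options.get("kind") == expected_kind:
            return index
    return None


def assert_vector_index(
    collection: Any,
    field: str,
    expected_kind: str = "vector-diskann",  # assumed kind string
) -> None:
    """Raise right after index creation if no matching vector index exists.

    `collection` is expected to be a pymongo Collection; its list_indexes()
    method yields one document per index.
    """
    if find_vector_index(collection.list_indexes(), field, expected_kind) is None:
        raise RuntimeError(
            f"No {expected_kind} index with cosmosSearchOptions found on '{field}'; "
            "the options may have been stripped before reaching the server."
        )
```

Calling `assert_vector_index(collection, "embedding")` immediately after creating the index turns the silent fallback into a loud failure at deploy time instead of a slow query later.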

Reported by

@khelanmodi, while creating a DiskANN vector index from the Node.js driver: the index was created, but as a plain B-tree index. The application passed cosmosSearchOptions, but the driver stripped it before the command reached the wire. The same behavior reproduces with PyMongo's create_index and .NET's typed Indexes.CreateOneAsync.

Validation

  • pwsh scripts/validate-skills.ps1 still passes (17/17).
  • Diff stat: 4 files, +130 / -3 lines.

…ndexes

The Node.js, PyMongo (sync `create_index`), and .NET drivers serialize
index specs against a fixed IndexDescription / CreateIndexOptions schema
and silently strip unknown option keys before sending the wire message.
`cosmosSearchOptions` is one of those unknown keys. Result: the wire
`createIndexes` command goes out without it, the server creates a
plain B-tree index on the embedding field, and the app silently falls
back to brute-force scans on every query. There is no error, and
upgrading the driver does not fix this (it is a typed-API limitation,
not a bug).

`mongosh` works because the shell forwards arbitrary option keys
straight to the underlying command. To get the same behavior from a
driver, you have to bypass the typed wrapper and issue the raw command
via `db.command({ createIndexes, indexes: [...] })` (Node / PyMongo)
or `RunCommand` (.NET).

This patch:

- Rewrites `skills/vector-search/vector-create-diskann-index.md`:
  - Adds a driver-safety callout at the top explaining the silent failure.
  - Keeps the `mongosh` example, but adds the `db.command(...)`
    equivalent for Node.js, PyMongo, and .NET as the canonical pattern
    for driver code.
  - Adds an Incorrect example showing the typed `collection.createIndex`
    call so the agent recognizes and avoids it.
  - Adds a verification snippet (`getIndexes()`) so the silent failure
    is caught at index-creation time.
- Adds short cross-references to `vector-half-precision.md` and
  `vector-product-quantization.md` pointing at the driver-safe shape.
- Updates the skill `description:` so routing surfaces the
  `db.command` requirement.

Reported by @khelanmodi while creating a DiskANN vector index from the
Node.js driver — the index was created, but as a plain index, so queries
silently fell back to scans. Reproducible across PyMongo and .NET.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>