Skip to content

feat: Index Migrator (Pre-release)#583

Draft
nkanu17 wants to merge 65 commits into
mainfrom
feat/index-migrator-pre-release
Draft

feat: Index Migrator (Pre-release)#583
nkanu17 wants to merge 65 commits into
mainfrom
feat/index-migrator-pre-release

Conversation

@nkanu17
Copy link
Copy Markdown
Collaborator

@nkanu17 nkanu17 commented Apr 14, 2026

feat: Index Migrator (Pre-release)

Zero-downtime, crash-safe index migration for RedisVL. Plan, apply, and rollback schema changes, including vector quantization, field renames, prefix changes, and algorithm swaps, through a single CLI or programmatic API.

Summary

This PR adds a complete index migration system to RedisVL, enabling users to evolve their index schemas without data loss. The migrator handles the full lifecycle: plan → review → apply → validate → rollback.

Key Capabilities

Category Operations
Index-only Change algorithm (FLAT ↔ HNSW ↔ SVS-VAMANA), distance metric, HNSW params (M, EF_CONSTRUCTION), make fields sortable
Schema + Data Add/remove fields, rename fields, rename index, change key prefix, change field options (separator, stemming)
Vector Quantization float32 → float16, bfloat16, int8, uint8 with automatic re-encoding and crash-safe backup

Architecture

┌─────────────────────────────────────────────────────────────┐
│                    CLI: rvl migrate                          │
│  plan │ wizard │ apply │ rollback │ estimate │ validate     │
├───────┴────────┴───────┴──────────┴──────────┴──────────────┤
│                  MigrationPlanner                           │
│  Schema diffing, change classification, plan generation     │
├─────────────────────────────────────────────────────────────┤
│            MigrationExecutor (sync + async)                 │
│  enumerate → field-rename → dump → drop → key-rename →     │
│  quantize → create → index → validate                      │
├─────────────────────────────────────────────────────────────┤
│  VectorBackup    │  Quantize Pipeline  │  Reliability Layer │
│  Crash-safe dump │  Batched R/W/Convert│  Validation checks │
└──────────────────┴─────────────────────┴────────────────────┘

Usage

CLI: Interactive Wizard

# List available indexes
rvl migrate list --url redis://localhost:6379

# Interactive wizard: walks through changes step by step
rvl migrate wizard --url redis://localhost:6379

# Generate a plan from a target schema YAML
rvl migrate plan --index my_index --schema target_schema.yaml --url redis://localhost:6379

# Apply with crash-safe backup and multi-worker quantization
rvl migrate apply --plan migration_plan.yaml \
  --backup-dir /tmp/migration_backup \
  --workers 4 --batch-size 1000 \
  --url redis://localhost:6379

# Estimate disk space (dry-run, no mutations)
rvl migrate estimate --plan migration_plan.yaml --url redis://localhost:6379

# Rollback if needed
rvl migrate rollback --backup-dir /tmp/migration_backup \
  --index my_index --url redis://localhost:6379

# Validate post-migration
rvl migrate validate --plan migration_plan.yaml --url redis://localhost:6379

CLI: Batch Migration (Multiple Indexes)

rvl migrate batch-plan --config batch_config.yaml --url redis://localhost:6379
rvl migrate batch-apply --plan batch_plan.yaml --url redis://localhost:6379
rvl migrate batch-status --plan batch_plan.yaml

Programmatic API

from redisvl.migration import MigrationPlanner, MigrationExecutor

# Plan
planner = MigrationPlanner()
plan = planner.create_plan(
    index_name="my_index",
    target_schema=target_schema,
    redis_url="redis://localhost:6379",
)

# Apply
executor = MigrationExecutor()
report = executor.apply(
    plan,
    redis_url="redis://localhost:6379",
    backup_dir="/tmp/migration_backup",
    num_workers=4,
    batch_size=1000,
)

print(f"Result: {report.result}")  # "succeeded"
print(f"Duration: {report.timings.total_duration}s")

Async API

from redisvl.migration import AsyncMigrationExecutor

executor = AsyncMigrationExecutor()
report = await executor.apply(
    plan,
    redis_url="redis://localhost:6379",
    backup_dir="/tmp/migration_backup",
    num_workers=4,
)

Crash Safety & Resume

Migrations are crash-safe by default when --backup-dir is provided:

  1. Before drop: Original vectors are dumped to a binary backup file on disk
  2. On crash: Re-running the same command detects the backup and resumes from the last completed batch
  3. Rollback: rvl migrate rollback restores original vectors from the backup at any time

The backup file tracks phase (dump → ready → active → completed) and batch progress, so resume skips already-completed work.

Performance

  • Pipelined reads/writes: Batch HGET/HSET operations (configurable --batch-size)
  • Multi-worker quantization: --workers N parallelizes vector re-encoding via ThreadPoolExecutor (sync) or asyncio.gather (async)
  • Redis Cluster support: Batched DUMP/RESTORE/DEL for cross-slot key renames (100 keys/pipeline)
  • Disk space estimation: rvl migrate estimate calculates RDB + AOF impact before any mutations

What's Blocked

Change Why Workaround
Change vector dimensions Requires re-embedding Re-embed with new model, reload data
Change storage type (hash ↔ JSON) Different data format Export, transform, reload
Add a new vector field Requires vectors for all docs Add vectors first, then migrate

New Files

redisvl/migration/ (core module)

File Description
models.py Data models: MigrationPlan, MigrationReport, MigrationTimings, etc.
planner.py Schema diffing, change classification, plan generation
executor.py Sync migration executor, full apply lifecycle
async_executor.py Async migration executor
async_planner.py Async planner
validation.py / async_validation.py Pre/post-migration validation
backup.py VectorBackup: crash-safe binary backup format
quantize.py Pipelined vector quantization + multi-worker orchestration
reliability.py Dtype conversion safety checks, width analysis
wizard.py Interactive migration wizard
batch_planner.py / batch_executor.py Multi-index batch migration
utils.py Shared utilities (disk estimation, key enumeration, etc.)

redisvl/cli/migrate.py

Full CLI with 12 subcommands: list, wizard, plan, apply, estimate, rollback, validate, helper, batch-plan, batch-apply, batch-resume, batch-status

Tests

Suite Count Description
test_migration_planner.py 67 Schema diffing, change classification
test_migration_wizard.py 76 Interactive wizard, adversarial inputs
test_vector_backup.py 32 Backup create/load/resume/rollback/cleanup
test_pipeline_quantize.py 12 Pipelined read/write/convert
test_executor_backup_quantize.py 7 Executor backup integration
test_multi_worker_quantize.py 22 Multi-worker, resume, deprecation
test_async_migration_executor.py 13 Async executor
test_async_migration_planner.py 11 Async planner
test_batch_migration.py 44 Batch planner/executor
Integration tests 6 files Full end-to-end with live Redis

Total: 178+ unit tests passing, all pre-commit checks clean.

Review Notes

  • This is a pre-release: API surface is stable but may evolve based on feedback
  • 6 rounds of automated code review (nkode-review) have been applied, addressing correctness, security, performance, and backward compatibility findings
  • The branch includes removal of the MCP module (previously merged separately). Those deletions are unrelated to the migrator
  • Documentation is in docs/user_guide/how_to_guides/migrate-indexes.md

Copilot AI review requested due to automatic review settings April 14, 2026 05:46
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds a pre-release RedisVL index migration system (sync + async) with planning, execution, validation, batch workflows, and extensive documentation/tests to enable crash-safe, document-preserving schema evolution.

Changes:

  • Introduces migration core modules (planner/executor/validator/backup/quantize/reliability) plus batch migration support.
  • Adds async equivalents and a new rvl migrate CLI entry point + docs.
  • Adds comprehensive unit/integration tests and benchmark/e2e helper scripts.

Reviewed changes

Copilot reviewed 45 out of 53 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
tests/unit/test_async_migration_planner.py Adds async planner unit coverage mirroring sync planner tests.
tests/unit/test_async_migration_executor.py Adds async executor + disk space estimator unit tests and dtype-detection tests.
tests/integration/test_migration_v1.py End-to-end integration test for sync plan/apply/validate flow.
tests/integration/test_migration_routes.py Integration coverage for supported migration “routes” (algo/metric/dtype/params).
tests/integration/test_field_modifier_ordering_integration.py Adds integration tests for new/related field modifiers (INDEXEMPTY/UNF/NOINDEX).
tests/integration/test_batch_migration_integration.py Adds integration tests for batch plan/apply/resume/progress callback.
tests/integration/test_async_migration_v1.py End-to-end integration test for async plan/apply/validate flow.
tests/benchmarks/visualize_results.py Adds benchmark visualization script for retrieval/memory/latency charts.
scripts/verify_data_correctness.py Adds manual script to verify float32→float16 migration correctness.
scripts/test_migration_e2e.py Adds large-scale e2e migration benchmark script.
scripts/test_crash_resume_e2e.py Adds crash/resume robustness test script for quantization checkpointing.
redisvl/redis/connection.py Enhances vector attribute parsing to include HNSW params (m, ef_construction).
redisvl/migration/validation.py Adds sync migration validation (schema/doc counts/key samples/functional checks).
redisvl/migration/utils.py Adds YAML helpers, schema canonicalization, readiness polling, disk estimation utilities.
redisvl/migration/reliability.py Adds crash-safety utilities: dtype detection, checkpointing, BGSAVE helpers, undo buffer.
redisvl/migration/quantize.py Adds pipelined (and multi-worker) quantization for vector dtype conversions.
redisvl/migration/models.py Adds migration/batch models and disk space estimate models/helpers.
redisvl/migration/batch_planner.py Adds batch planner for applying a shared patch across many indexes.
redisvl/migration/batch_executor.py Adds batch executor with checkpointing/resume and reporting.
redisvl/migration/backup.py Adds crash-safe on-disk backup format for vector dumps and resume.
redisvl/migration/async_validation.py Adds async validator parity with sync validation checks.
redisvl/migration/async_planner.py Adds async planner wrapper over sync diff/classification logic.
redisvl/migration/init.py Exposes new migration/batch APIs at package boundary.
redisvl/cli/utils.py Fixes redis URL scheme building and refactors CLI option helpers.
redisvl/cli/main.py Wires in new migrate CLI command group.
docs/user_guide/index.md Adds migration to user guide landing page highlights.
docs/user_guide/how_to_guides/index.md Adds “Migrate an Index” how-to link and toctree entry.
docs/user_guide/cli.ipynb Updates CLI notebook with rvl migrate commands and reorganizes connection section.
docs/concepts/search-and-indexing.md Updates concept docs to point to new migration workflow and docs.
docs/concepts/index.md Adds “Index Migrations” concept card and toctree entry.
docs/concepts/index-migrations.md New concept doc describing migration modes, supported changes, and sync/async behavior.
docs/concepts/field-attributes.md Expands vector datatype docs + migration support notes for modifiers.
docs/api/cli.rst Adds a full CLI reference including rvl migrate command group.
CLAUDE.md Adds protected directory note (local_docs/).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread redisvl/migration/validation.py Outdated
Comment thread redisvl/migration/async_validation.py Outdated
Comment thread redisvl/migration/validation.py Outdated
Comment on lines +69 to +73
if plan.rename_operations.change_prefix is not None:
old_prefix = plan.source.keyspace.prefixes[0]
new_prefix = plan.rename_operations.change_prefix
# Mirror executor logic exactly:
# new_key = new_prefix + key[len(old_prefix):]
Copy link

Copilot AI Apr 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prefix-change validation assumes plan.source.keyspace.prefixes[0] is the only (or correct) source prefix. If the index has multiple prefixes, translating sampled keys using only the first prefix can produce incorrect keys_to_check and false validation failures. Consider translating each sampled key by matching against all configured prefixes (or storing the actual matched prefix per sampled key in the snapshot).

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resolved — key translation now iterates all configured prefixes in commit a9b3972.

Comment thread redisvl/migration/async_validation.py Outdated
Comment on lines +136 to +139
return []
n = len(keys)
chunk_size = math.ceil(n / num_workers)
return [keys[i : i + chunk_size] for i in range(0, n, chunk_size)]
Copy link

Copilot AI Apr 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docstring states the returned list has num_workers slices and 'some may be empty if keys < workers', but the implementation returns fewer slices (never empty slices). Either update the docstring to reflect the current behavior, or adjust the function to always return exactly num_workers slices (padding with empty lists) if that invariant is important for worker bookkeeping and stable shard naming.

Suggested change
return []
n = len(keys)
chunk_size = math.ceil(n / num_workers)
return [keys[i : i + chunk_size] for i in range(0, n, chunk_size)]
return [[] for _ in range(num_workers)]
n = len(keys)
chunk_size = math.ceil(n / num_workers)
slices = [keys[i : i + chunk_size] for i in range(0, n, chunk_size)]
slices.extend([[] for _ in range(num_workers - len(slices))])
return slices

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resolved — docstring corrected in commit a9b3972.

Comment thread redisvl/migration/quantize.py Outdated
Comment thread tests/unit/test_async_migration_executor.py
@nkanu17 nkanu17 force-pushed the feat/index-migrator-pre-release branch from 3fe4972 to 3f03bb7 Compare April 14, 2026 05:49
@jit-ci
Copy link
Copy Markdown

jit-ci Bot commented Apr 14, 2026

🛡️ Jit Security Scan Results

CRITICAL HIGH MEDIUM

✅ No security findings were detected in this PR


Security scan by Jit

Copilot AI review requested due to automatic review settings May 1, 2026 15:05
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot wasn't able to review this pull request because it exceeds the maximum number of lines (20,000). Try reducing the number of changed lines and requesting a review from Copilot again.

@nkanu17 nkanu17 added auto:release Create a release when this PR is merged auto:minor Increment the minor version when merged labels May 1, 2026
@nkanu17 nkanu17 requested a review from tylerhutcherson May 1, 2026 17:03
@nkanu17
Copy link
Copy Markdown
Collaborator Author

nkanu17 commented May 1, 2026

@codex review

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5671e6fe69

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread redisvl/migration/executor.py
Comment thread redisvl/migration/async_executor.py
Copilot AI review requested due to automatic review settings May 1, 2026 18:40
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot wasn't able to review this pull request because it exceeds the maximum number of lines (20,000). Try reducing the number of changed lines and requesting a review from Copilot again.

@nkanu17 nkanu17 requested a review from abrookins May 1, 2026 18:48
nkanu17 added 16 commits May 1, 2026 15:04
…, validator, and utilities

Adds the core data structures and planning engine for the Index Migrator:
- models.py: Pydantic models for MigrationPlan, DiffClassification, ValidationResult, MigrationReport
- planner.py: MigrationPlanner with schema introspection, diffing, and change classification
- validation.py: MigrationValidator for post-migration checks
- utils.py: shared helpers for YAML I/O, disk estimation, index listing, timestamps
- connection.py: HNSW parameter extraction for schema introspection
- 15 unit tests for planner logic
- Fix import ordering in utils.py (isort compliance)
- Simplify validation prefix key rewriting to mirror executor logic
- Normalize single-element list prefixes in normalize_target_schema_to_patch
Adds the migration executor and CLI subcommands for plan/apply/validate:
- executor.py: MigrationExecutor with sync apply, key enumeration, index drop/create, quantization, field/key rename
- reliability.py: BatchUndoBuffer, QuantizationCheckpoint, BGSAVE helpers
- cli/migrate.py: CLI with plan, apply, validate, list, helper, estimate subcommands
- cli/main.py: register migrate command
- cli/utils.py: add_redis_connection_options helper
- Integration tests for comprehensive migration, v1, routes, and field modifier ordering
- Fix CLI step labels to match executor order
- Fix GEO coordinates to lat,lon order in integration tests
- Move JSON path to top-level field property in tests
- Use sys.exit() instead of exit() in CLI
- Use transaction=False for quantize pipeline
Adds guided migration builder for interactive plan creation:
- wizard.py: MigrationWizard with index selection, field operations, vector tuning, quantization, and preview
- cli/migrate.py: adds 'wizard' subcommand (rvl migrate wizard --index <name>)
- Unit tests for wizard logic (41 tests)
- Improve field removal to clean up renames by both old_name and new_name
- Resolve update names through rename map in working schema preview
- Add multi-prefix guard to reject indexes with multiple prefixes
- Fix dependent prompts (UNF, no_index) when field is already sortable
- Pass existing field attrs to common attrs prompts for update mode
…CLI flag

Adds non-blocking async migration support:
- async_executor.py: AsyncMigrationExecutor with async apply, BGSAVE, quantization
- async_planner.py: AsyncMigrationPlanner with async create_plan
- async_validation.py: AsyncMigrationValidator with async validate
- async_utils.py: async Redis helpers
- cli/migrate.py: adds --async flag to 'apply' subcommand
- Unit tests for async executor and planner
- Fix SVS client leak in async_planner check_svs_requirements
- Remove dead async_utils.py (functions duplicated in async_executor)
…ands

- batch_planner.py: multi-index plan generation with pattern/list support
- batch_executor.py: checkpointed batch execution with resume capability
- CLI: batch-plan, batch-apply, batch-resume, batch-status subcommands
- 32 unit tests for batch migration logic
… CLI

- Refactor _check_index_applicability to return Tuple[BatchIndexEntry, bool]
  where bool indicates quantization, avoiding redundant create_plan_from_patch
- Replace exit(1) with sys.exit(1) in batch-apply and batch-resume CLI commands
- Sanitize report filenames (colons to underscores) for Windows compat
- docs/concepts/index-migrations.md: migration concepts and architecture
- docs/user_guide/how_to_guides/migrate-indexes.md: step-by-step migration guide
- docs/api/cli.rst: CLI reference for rvl migrate commands
- tests/benchmarks/: migration benchmark scripts and visualization
- Updated field-attributes, search-and-indexing, and user guide indexes
- Remove 13_sql_query_exercises.ipynb (unrelated to migration feature)
- Replace ' -- ' emdashes with colons in crash-safe resume docs
- Fix async executor readiness check to handle missing percent_indexed
- Fix benchmark wait_for_index_ready masking percent_indexed=0
- Fix wizard showing dependent prompts when sortable explicitly False
- Fix CLI docs: --patch→--schema-patch, --output/-o→--plan-out
- Fix migration docs: field renames now listed as supported
- Fix batch resume not forwarding batch_plan_path to apply()
- Fix batch-resume CLI missing quantization safety gate
- Fix --query-check → --query-check-file in cli.rst
- Fix --target-schema mutually-exclusive ref to --schema-patch in cli.rst
- Fix async validation functional check to match sync (>0 not ==)
- Fix async quantize pipeline to use transaction=False
- Fix test checkpoint status 'succeeded' → 'success'
…e order

- Update wildcard_search details to say 'expected >0, source had N' instead of misleading 'expected N'
- Change doc examples from 'succeeded' to 'success' matching BatchIndexState values
- Add --accept-data-loss note for batch-resume in migration guide
- Fix geo coordinate order (lon,lat) in test sample data
nkanu17 added 17 commits May 1, 2026 15:09
…orker guards, overcount

- Rollback CLI: add --index filter to scope restore to specific index
- Rollback CLI: remove unused --batch-size flag
- --resume: fail fast if value looks like a checkpoint file (old semantics)
- --workers > 1: enforce --backup-dir at CLI level
- Direct quantize overcount: count len(converted) not len(batch_keys) (sync+async)
…ollback

Findings addressed:
- Rollback gates on backup phase: refuses incomplete (phase='dump') backups
  unless --force is passed
- Rollback requires --index or --yes when multiple indexes detected (no
  interactive input() prompt that blocks CI)
- Backup cleanup uses exact .header/.data extensions and boundary check
  (prevents deleting unrelated files with shared prefix)
- Async/sync direct quantize counts len(converted) not len(batch_keys)

Tests added:
- test_rollback_skips_incomplete_backup_phase
- test_rollback_index_filter
- test_rollback_multi_index_requires_flag
- test_cleanup_only_removes_known_extensions
- test_cleanup_does_not_match_similar_prefix
…mpat note

Findings addressed:
- Add TestWorkerResume: resume from dump/ready/completed phases, async load
  vs create, FileExistsError on double-create
- Replace Unicode checkmark with ASCII in rollback CLI output
- Finding #2 (--resume compat): intentional breaking change, already has
  clear error message - accepted as-is

Tests: 174 passed (4 new resume tests)
…e CLI tests

Findings addressed:
- Add checkpoint_path as deprecated kwarg to MigrationExecutor.apply and
  AsyncMigrationExecutor.apply - maps to backup_dir with DeprecationWarning
- Add TestResumeDeprecation: verify checkpoint_path in signatures, --resume
  YAML rejection, directory acceptance

Accepted as-is:
- Rollback non-recursive dir search: intentional flat layout
- redis.asyncio import: redis>=5.0 already required

Tests: 178 passed (4 new)
…lisions

Critical fix:
- Sync and async executor resume paths now perform key prefix renames
  after quantize (renames happen after drop in normal path, so they
  may not have completed before crash)

Backup naming:
- Add sha256[:8] hash of index_name to backup filenames to prevent
  collisions between distinct names that sanitize identically
  (e.g., 'a/b' and 'a:b' both become 'a_b' but have different hashes)
- Applied to single-worker, multi-worker, and cleanup paths

Accepted as-is:
- --resume backward compat: intentional, clear error message
- Rollback non-recursive: flat layout matches how backups are written

Tests: 178 passed
…llback

Performance fix:
- Sync and async _rename_keys_cluster now batch DUMP+PTTL reads and
  RESTORE+DEL writes in groups of 100 (pipeline_size), reducing from
  5 RTTs/key to ~3 RTTs per batch of 100 keys. ~50x fewer round trips
  for large key renames in Redis Cluster mode.

Backward compat fix:
- Both executors now probe for legacy backup filenames (pre-hash naming
  convention: migration_backup_<safe_name>) when the hashed path is not
  found. This ensures crash-resume works across library upgrades.

Tests: 178 passed
- Remove QuantizationCheckpoint, BatchUndoBuffer, trigger_bgsave_and_wait,
  async_trigger_bgsave_and_wait from reliability.py (dead code)
- Remove checkpoint_path parameter from MigrationExecutor.apply() and
  AsyncMigrationExecutor.apply()
- Remove deprecated --resume CLI flag and legacy_resume handling
- Remove TestResumeDeprecation test class
- Update docs to use --backup-dir as primary crash-safe recovery mechanism
- Update cli.rst to document current flags (--backup-dir, --workers, etc.)
- Replace 'checkpoint' wording with 'state file/tracking' in batch commands
- Validation: replace multi-key EXISTS with per-key calls (cluster-safe)
- Validation: multi-prefix key translation matches against all prefixes
- Wizard: block rename to a name staged for removal
- Batch executor: broad regex filename sanitization for report files
- CLI: empty-string plan_path no longer bypasses batch-resume safety gate
- Quantize: fix split_keys docstring, remove dead sum() code
- Add 10 TDD tests covering all fixes
… executors

If a migration crashes mid-rename and is resumed, RENAMENX returns False
for already-renamed keys. Previously this was treated as a collision,
aborting the entire resume. Now the executor checks EXISTS on both source
and destination: if source is gone and destination exists, the key is
counted as already-renamed and skipped.

Applies to both standalone (RENAMENX) and cluster (DUMP/RESTORE) paths.
Includes 7 unit tests covering resume, collision, and edge cases.
@nkanu17 nkanu17 force-pushed the feat/index-migrator-pre-release branch from 01d3025 to 037c2d8 Compare May 1, 2026 19:20
nkanu17 added 6 commits May 7, 2026 17:58
Move the per-vector dtype conversion logic out of the convert_vectors
loop into a dedicated _quantize_array helper. Float-to-float casts stay
a simple astype, while float-to-integer conversions apply per-vector
min-max scaling so int8/uint8 targets fill the full integer range
instead of truncating embedding values in [-1, 1] to zero.

Use TYPE_CHECKING + lazy_import so numpy is loaded lazily at runtime
while mypy still resolves np.ndarray annotations.

Adds unit coverage for float64-to-float32 and float64-to-float16 paths
and for the int8 scaling behavior across constant/varying vectors.
Quantization migrations now always write a vector backup before
mutating data. The previous opt-out path (--keep-backup / no
--backup-dir) is gone; instead the executor auto-defaults backup_dir
to ./migration_backups when none is supplied and refuses to run if
backup_dir is explicitly set to an empty value.

Backup files are retained on disk after a successful migration so the
operator can audit or roll back; cleanup is now a manual step.

Changes:
- executor / async_executor: auto-default backup_dir, drop keep_backup,
  drop the post-success cleanup step, expose DEFAULT_BACKUP_DIR.
- batch_executor: thread backup_dir / batch_size / num_workers through
  apply_batch and resume_batch so per-index migrations inherit them.
- cli migrate: drop --keep-backup, drop the '--workers > 1 requires
  --backup-dir' guard (backup is now always present), and add
  --backup-dir / --batch-size / --workers to batch-apply and
  batch-resume. Refresh the BGSAVE warnings to point at the automatic
  backup instead.
- docs/migrate-indexes.md: update the backup section, the rollback
  section, and the CLI flag reference to match the new behavior.
- scripts/test_migration_e2e.py, scripts/test_crash_resume_e2e.py:
  drop the keep_backup kwarg from executor.apply() calls.
- tests: add TestMandatoryBackupEnforcement covering the auto-default,
  the empty-string rejection, and that backups are no longer cleaned
  up on success.
The Migrate dispatcher logged exceptions through logger.error and then
exited with status 1, but the package logger uses NullHandler by
default so the user's terminal saw a silent non-zero exit. Print the
exception message to stderr in addition to logging it so failures
(e.g. plan-time validation errors) are always visible.
Two RediSearch indexes whose key prefixes overlap (one is a literal
string-prefix of the other, matching FT.CREATE PREFIX semantics) cover
the same Redis keyspace. Running a batch quantization migration over
them re-reads vectors that an earlier index in the batch has already
quantized, producing garbage bytes and corrupting data.

Reject these batches at plan time so the operator must split the
migration into disjoint groups before any vectors are touched.

- utils: add normalize_prefixes and find_overlapping_index_groups, and
  align build_scan_match_patterns with FT.CREATE PREFIX semantics
  (literal prefix + glob, no injected key separator) so the scan
  pattern matches the same keyspace the planner reasons about.
- planner: same scan-pattern fix in MigrationPlanner; bump the SCAN
  COUNT hint while we're touching the loop.
- batch_planner: collect prefixes during applicability checks and
  raise ValueError listing the overlapping (index, index, [pairs])
  before create_batch_plan returns.
- tests/unit: TestBatchMigrationPlannerOverlapDetection covers
  identical, nested, multi-prefix, and disjoint cases plus the
  applicable-only filtering rule.
- tests/integration: TestBatchMigrationOverlapDetectionIntegration
  exercises the same paths against a real Redis with FT.CREATE'd
  indexes.
The CLI command-table cell contained raw newline characters inside
JSON string literals (e.g. "| Command | Purpose |<LF>",) which is
invalid JSON. nbformat / nbval failed to parse the notebook, breaking
the docs and notebook test jobs:

  nbformat.reader.NotJSONError: Notebook does not appear to be JSON
  Invalid control character at: line 44 column 27

Replace the embedded newlines with the proper \n escape so each
markdown table row is a valid JSON string. nbval now collects the
notebook (13 cells) and make docs-build succeeds.
Add user-facing coverage for the new plan-time overlap check that
batch-plan now performs:

- Batch Migration > Batch Plan Review: short note explaining that
  batch-plan refuses overlapping prefixes and pointing at the
  troubleshooting entry.
- Troubleshooting: new 'overlapping indexes detected' entry with the
  verbatim error output the operator will see and the resolution
  (split into prefix-disjoint groups, one batch-plan per group).
@nkanu17 nkanu17 requested a review from vishal-bala May 7, 2026 22:12
nkanu17 added 3 commits May 7, 2026 18:13
…rmat

- docs/concepts/index-migrations.md: add a 'Vector backups (mandatory
  for quantization)' section that states backups are unconditional for
  any quantization step, documents the default location
  (./migration_backups), the on-disk layout, what backups enable
  (crash-safe resume + manual rollback), and the new manual retention
  rule. Add a short 'Overlapping indexes' section that mirrors the
  troubleshooting entry in the how-to guide so the concept is reachable
  from the conceptual docs as well.
- tests/unit/test_batch_migration.py: add
  test_overlap_error_matches_documented_format which calls
  BatchMigrationPlanner._format_overlap_error directly and asserts the
  exact substrings that the user-facing docs reproduce, so the docs
  cannot drift from the implementation without the test failing.
BatchMigrationExecutor.apply/resume accepted num_workers but only
forwarded redis_client into the single-index executor. The multi-worker
quantization path in MigrationExecutor.apply requires redis_url so each
worker can open its own connection, and raises ValueError when it is
None. As a result rvl migrate batch-apply --workers >1 against a
quantizing index would fail before doing any work.

Thread redis_url through _migrate_single_index and into
self._single_executor.apply so the batch path now matches the
single-index contract.
Copilot AI review requested due to automatic review settings May 8, 2026 19:59
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot wasn't able to review this pull request because it exceeds the maximum number of lines (20,000). Try reducing the number of changed lines and requesting a review from Copilot again.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

auto:minor Increment the minor version when merged auto:release Create a release when this PR is merged

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants