
fix(mm): close TOCTOU race in _restore_incomplete_installs #9142

Open
lstein wants to merge 1 commit into main from fix/model-install-restore-race

lstein (Collaborator) commented May 9, 2026

Summary

  • _restore_incomplete_installs snapshotted active sources under the lock, then released it and iterated tmpdirs against the stale snapshot. A foreground import_model call landing during iteration could be missed, so the same source was enqueued twice and the second download failed with FileNotFoundError when it tried to rename the .downloading file the first download had already moved.
  • Fix: move the membership check inside the lock immediately before _install_jobs.append(job) so check-and-append is atomic. Duplicate-tmpdir suppression within a single restore pass stays outside the lock since it doesn't race against import_model.
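
To illustrate the check-and-append pattern described above, here is a minimal, self-contained sketch. It is not the code from this PR: `InstallQueue`, `Job`, `enqueue_if_absent`, and `restore` are hypothetical names made up for the example, and only `_install_jobs` is borrowed from the description. It assumes a plain `threading.Lock` guards the job list.

```python
import threading
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    # Hypothetical stand-in for an install job; only the source matters here.
    source: str


class InstallQueue:
    """Hypothetical stand-in for the service's job list and its lock."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._install_jobs: list[Job] = []

    def enqueue_if_absent(self, source: str) -> bool:
        """Append a job for `source` unless one is already queued.

        The membership check and the append share one critical section,
        so no other thread can add the same source in between.
        """
        with self._lock:
            if any(job.source == source for job in self._install_jobs):
                return False
            self._install_jobs.append(Job(source))
            return True

    def sources(self) -> list[str]:
        with self._lock:
            return [job.source for job in self._install_jobs]


def restore(queue: InstallQueue, tmpdir_sources: list[str]) -> int:
    """Re-enqueue sources found in tmpdirs; return how many were added."""
    # Duplicate-tmpdir suppression is local to this pass, so it needs no lock.
    seen: set[str] = set()
    added = 0
    for source in tmpdir_sources:
        if source in seen:
            continue
        seen.add(source)
        # Checked against the live job list, not an earlier snapshot.
        if queue.enqueue_if_absent(source):
            added += 1
    return added
```

In this sketch, a foreground caller and the restore pass both go through `enqueue_if_absent`, so whichever thread takes the lock first wins and the other sees the existing job and skips it.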

Fixes #9141.

Test plan

  • pytest tests/app/services/model_install/test_model_install.py::test_simple_download — ran 10× consecutively, all pass (was flaky ~30–50% on main)
  • pytest tests/app/services/model_install/ — 31/31 pass

🤖 Generated with Claude Code

The restore path snapshotted active sources under the lock, then
released the lock and iterated tmpdirs against the stale snapshot.
A foreground import_model call landing during iteration could append
a job the loop never saw, leading to a duplicate enqueue and a
FileNotFoundError when the second download tried to rename the
.downloading file the first had already moved.

Move the membership check inside the lock immediately before the
append so check-and-append is atomic. Fixes #9141.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
github-actions (bot) added the python (PRs that change python files) and services (PRs that change app services) labels May 9, 2026
lstein added the 6.14.x label May 9, 2026
lstein moved this to "6.14.x Theme: LIBRARY UPDATES" in Invoke - Community Roadmap May 9, 2026

Labels

6.14.x, python (PRs that change python files), services (PRs that change app services)

Projects

Status: 6.14.x Theme: LIBRARY UPDATES

Development

Successfully merging this pull request may close these issues.

Race condition in ModelInstallService._restore_incomplete_installs causes duplicate download enqueue

2 participants