feat: size-based download integrity + idle timeout#22
Merged
roziscoding merged 3 commits intoJun 7, 2026
Merged
Conversation
Resolve the expected size from the transfer header (Content-Length / Content-Range total) falling back to the *arr release size, and make the final size check unconditional so a truncated stream is no longer renamed into place and imported as complete. When no size is known at all, fail fast with UnknownSizeError before writing any bytes. Adds 'release_size' as an ExpectedBytesSource (schema + migration 0003) for accurate telemetry, and handles resume edge cases via releaseSize: a .part larger than the file is discarded and restarted; a .part already equal to the file is finalized without re-downloading.
The 30-min whole-request deadline both killed large active downloads and was slow to notice a stalled connection. Replace it with an inactivity timeout (downloads.idleTimeoutMs, default 60s) armed only around the fetch and each read, so slow disk writes never trip it. The abort carries a sentinel reason so only it — not a later real error — is reclassified as a retryable IdleTimeoutError; the .part is preserved so retry resumes. Also tears down the reader (cancel) on error and cancels the body if the .part file can't be opened.
- Label a 206 resume's size as 'content_range' (not 'content_length') for accurate telemetry; adds it to ExpectedBytesSource (migration 0004). - Release the stream reader only after the completeness check, so the error path's cancel+releaseLock runs against a still-locked reader on the IncompleteDownloadError (retried) path instead of being a swallowed no-op.
4bc6eaf
into
feat/qbittorrent-api-emulation
8 of 10 checks passed
roziscoding
added a commit
that referenced
this pull request
Jun 7, 2026
Squash of feat/download-integrity-monitoring onto the size-based integrity / idle-timeout work already landed via #22, so this adds only the net-new changes on top of that branch: Primary fix — failed qB downloads no longer silently report 'ok': addTorrent ignored startQbDownload's result, so an unknown peer or an unsafe peer-supplied filename created no row and was invisible to both jack and *arr, which waited out its full stuck-download grace period. startQbDownload now returns 'started' | 'duplicate' | 'failed'; 'failed' propagates HTTP 503 so *arr retries promptly, while an already-active 'duplicate' stays a success so a re-add of an in-flight release is not turned into a needless retry. Also: bump TypeScript to 6 and resolve the resulting type errors (getReader cast, fetch-spy typings, tsconfig types: "bun").
roziscoding
added a commit
that referenced
this pull request
Jun 7, 2026
…adarr (#21) * feat: emulate qBittorrent auth + app endpoints Add /api/v2 routes for qBittorrent WebUI API emulation so Sonarr/Radarr can register jack as a qBittorrent download client. Mounts before the global requireApiKey middleware with its own SID-cookie auth. - POST /auth/login validates username against configured server names - GET /app/{version,webapiVersion,preferences} return contract values - GET /torrents/{info,categories} return empty/stub for connection test - Stub infohash helper for Phase 2 torrent mapping * feat: map jack downloads to qBittorrent torrents Add qb_category/qb_source_server columns (migration 0002) and the download→qBittorrent-torrent mapper so Sonarr/Radarr see real progress. - toQbTorrent: real stub infohash, pausedUP on completion, content_path != save_path, byte-based progress - torrents/info filters by category + hashes - torrents/properties (404 unknown) and torrents/files - chore: untrack accidentally-committed .claude lock + e2e sqlite db * feat: qBittorrent add/delete/setCategory with *arr-pull import switch Add the qBittorrent write API and switch qB-added downloads to *arr-pull imports (blackhole-added downloads keep the existing push). - DownloadsService: shared createDownload core + startQbDownload; skip triggerImport when qbSourceServer is set (per-download pull switch) - repository delete/setQbCategory - torrents/add accepts only jack stubs / jack download URLs (rejects magnets and foreign torrents); delete/setCategory scoped to the session - wire a single DownloadsService through getApp + the blackhole watcher * feat: auto-register jack as a qBittorrent download client Replace the TorrentBlackhole auto-registration with a qBittorrent client pointing at ${jack.baseUrl}/api/v2, keeping the indexer↔client binding so only the Jack client grabs the Jack indexer's releases. - registerDownloadClient builds a QBittorrentSettings client (host/port/ useSsl/urlBase/username/password + per-app category field); matches an existing "Jack" client by name to upgrade a prior blackhole client - best-effort setShareLimits/topPrio/setForceStart no-ops - e2e qBittorrent-flow test (login → add → poll → import) * refactor: clearer qB add failure + session sweep - torrents/add returns 503 (not 415) when the download pipeline is unconfigured (no downloads config) — a server misconfig, not a bad torrent; 415 stays for magnets/foreign torrents - QbSessionStore sweeps expired sessions on login so abandoned SIDs don't accumulate (get() only evicted lazily) * test(e2e): update auto-registration for qBittorrent client; drop blackhole flow - auto-registration e2e now asserts the qBittorrent download client (name "Jack", implementation QBittorrent, host/port = jack-beta) - remove download-flow.test.ts: the blackhole-via-stack flow is superseded by qbittorrent-flow.test.ts (real *arr integration) and conflicted with it on the shared fixture; blackhole watcher logic stays covered by downloads-service unit tests * refactor: remove blackhole watcher; jack is qBittorrent-only now *arr hands grabs to jack via POST /api/v2/torrents/add (parsed in-memory), so nothing writes to a watch folder anymore. Drop the folder-watching path entirely for a pure qB-driven, *arr-pull download service. - delete BlackholeWatcher; remove DownloadsService.processTorrentFile, the reenqueue bookkeeping, and the stub-file cleanup - remove the now-dead push-import path: DownloadsService.triggerImport + the destinations dependency, and ArrServerConnector.triggerImport / importCommandName (qB downloads import via *arr pull) - drop the watchPath config option (completedPath stays) - stop constructing/starting the watcher in index.ts (resume-on-startup and stale-row reconciliation are retained) - update tests, e2e setup/compose, and example config accordingly * fix: address review — refresh qB session TTL on access, gate qB mount on downloads config - QbSessionStore.get() now extends expiry on each successful access so an actively-polling *arr SID never expires mid-session (matches real qB) - mount the /api/v2 qB API only when config.downloads is set, so the reported save_path is always the real completedPath (never "") * chore(e2e): drop unused blackhole-watch volume dir from e2e task * fix: always register the Jack indexer + qB client (forceSave) The Jack indexer was registered unbound: registerDownloadClient ran with forceSave=false, so *arr's connection test to jack's qB API at registration time could fail and throw, leaving downloadClientId undefined — the indexer then registered with no Download Client ("Any"). - register both the indexer and the qBittorrent client with forceSave=true, so the client always saves (returns its id → indexer gets bound) and the indexer saves even when its test query returns no results - always run auto-registration (drop the no-peers skip); with forceSave the indexer is accepted with an empty catalog and starts working once peers exist - e2e: assert the Jack indexer's downloadClientId equals the Jack qB client id (regression guard for the binding) Verified against a real Radarr: in-place TorrentBlackhole→QBittorrent PUT and top-level downloadClientId binding both persist on create and update. * feat: tag auto-registered qBittorrent download client *arr round-robins torrent grabs across all enabled torrent clients, so real torrents from other indexers could be routed to jack's qB client (which rejects them). Tag the client with a configurable label (default "jack-internal") that no media carries, so *arr keeps those grabs out of it; Jack grabs still arrive via the indexer's explicit downloadClientId binding, which bypasses tag filtering. Adds autoregister.tag to the config (set "" to disable) and an ensureTag helper that creates the tag in each destination if missing. * docs: align README and compose examples with qBittorrent-only flow Rewrite the download flow, mounts, and troubleshooting for the qBittorrent API model (no watch folder); add commented depends_on service_healthy examples and document autoregister.tag. * docs: clarify ETA_UNKNOWN sentinel comment * fix: isolate Jack download client via lowest priority, not a tag The tag approach was broken: Radarr/Sonarr filter the indexer-bound download client by the *movie's* tags too (DownloadClientProvider builds the tag-filtered pool before resolving the indexer's DownloadClientId), so a "jack-internal" tag no movie carries made every grab fail with "Indexer specified download client does not exist". Instead, register the Jack qBittorrent client at *arr's lowest priority (50). The general client pool only round-robins the best-priority group, so real torrents from other indexers never land on Jack — while the indexer→client binding still routes Jack grabs to it (resolved before priority grouping). Tags are explicitly cleared so an upgrade heals a client previously broken by the tag. Reverts the autoregister.tag config/ensureTag added earlier. * feat: size-based download integrity + idle timeout (#22) * feat: verify peer downloads by size with fail-fast on unknown size Resolve the expected size from the transfer header (Content-Length / Content-Range total) falling back to the *arr release size, and make the final size check unconditional so a truncated stream is no longer renamed into place and imported as complete. When no size is known at all, fail fast with UnknownSizeError before writing any bytes. Adds 'release_size' as an ExpectedBytesSource (schema + migration 0003) for accurate telemetry, and handles resume edge cases via releaseSize: a .part larger than the file is discarded and restarted; a .part already equal to the file is finalized without re-downloading. * feat: replace total download timeout with a network-scoped idle timeout The 30-min whole-request deadline both killed large active downloads and was slow to notice a stalled connection. Replace it with an inactivity timeout (downloads.idleTimeoutMs, default 60s) armed only around the fetch and each read, so slow disk writes never trip it. The abort carries a sentinel reason so only it — not a later real error — is reclassified as a retryable IdleTimeoutError; the .part is preserved so retry resumes. Also tears down the reader (cancel) on error and cancels the body if the .part file can't be opened. * fix: address review — content_range source label + reader cleanup order - Label a 206 resume's size as 'content_range' (not 'content_length') for accurate telemetry; adds it to ExpectedBytesSource (migration 0004). - Release the stream reader only after the completeness check, so the error path's cancel+releaseLock runs against a still-locked reader on the IncompleteDownloadError (retried) path instead of being a swallowed no-op. * fix: surface failed qB downloads to *arr instead of silent 'ok' Squash of feat/download-integrity-monitoring onto the size-based integrity / idle-timeout work already landed via #22, so this adds only the net-new changes on top of that branch: Primary fix — failed qB downloads no longer silently report 'ok': addTorrent ignored startQbDownload's result, so an unknown peer or an unsafe peer-supplied filename created no row and was invisible to both jack and *arr, which waited out its full stuck-download grace period. startQbDownload now returns 'started' | 'duplicate' | 'failed'; 'failed' propagates HTTP 503 so *arr retries promptly, while an already-active 'duplicate' stays a success so a re-add of an in-flight release is not turned into a needless retry. Also: bump TypeScript to 6 and resolve the resulting type errors (getReader cast, fetch-spy typings, tsconfig types: "bun"). * fix: serve peer files with native backpressure to prevent OOM Serving a peer file via a hand-pumped Bun.file().stream() does not apply TCP backpressure: when the consumer stalls or aborts mid-download, the origin reads the entire file into memory as fast as it can, starving the event loop and exhausting RAM. On a swapless host a single stalled stream of a large file is enough to hard-freeze the process until the kernel OOM-kills it. Return the BunFile (or its sliced view) directly from streamFile and hand it to new Response, so Bun.serve streams it with native backpressure and sendfile. Verified: peak RSS under a stalled consumer drops from ~2x the file size to flat, while range serving stays byte-correct. * feat: widen peer download retry backoff to survive a peer outage A peer can be unreachable for ~15-20min (restart, tunnel hiccup). The old schedule (5 attempts, 60s max backoff) gave up in ~3min, terminally failing a resumable download long before the peer recovered. Raise maxDownloadAttempts to 13 and retryMaxDelayMs to 30min so the exponential schedule spans a worst-case ~64min window (~32min with jitter); early retries stay ~1s for ordinary network blips. The download keeps retrying and resumes from its .part once the peer is back.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Hardens peer downloads against silent corruption and stalls. Targets
feat/qbittorrent-api-emulation(stacked on top of PR #21), not main.Phase 1 — Size-based integrity
Content-Range total) falling back to the *arr
releaseSize, so a peer thatdoesn't advertise a size no longer yields
expectedBytes: null.downloadedBytes === expectedBytescheck is now unconditional — atruncated stream is no longer renamed into place and imported as complete.
UnknownSizeError(permanent, non-retried) when no size isknown at all, before any byte is written.
release_sizeas anExpectedBytesSource(schema + migration0003) foraccurate telemetry.
releaseSize: a.partlarger than the file is discardedand restarted (
part_oversize); a.partalready equal to the file is finalizedwithout re-downloading.
Phase 2 — Idle timeout
AbortSignal.timeoutwith an inactivitytimeout (
downloads.idleTimeoutMs, default 60s), armed only around the fetch andeach read so slow disk I/O never trips it.
a retryable
IdleTimeoutError; the.partis preserved so retry resumes..partcan't be opened.Why not "Part 2" of the original design
The brainstorm proposed making the peer send
Content-Length, but research foundthe peer already does (
peer.router.ts). TheexpectedBytes: nullseen inprod was an older peer / a stripped header — handled by the releaseSize fallback.
Content hashing was considered and dropped (HTTPS covers transport integrity;
size-only is the agreed bar).
Testing
bun test apps/backend/src/— 172 pass; lint clean.idle stall + slow-but-active,
release_sizeDB round-trip, retry-policyclassification (IdleTimeoutError retryable, plain AbortError not).
Greptile Summary
Hardens peer downloads with two independent layers: size-based integrity (fail-fast on unknown size, unconditional truncation check,
releaseSizefallback, and pre-fetch oversize/exact-finalize resume guards) and per-chunk inactivity timeout (replacing the old whole-requestAbortSignal.timeoutwith anarmIdle/clearIdlepattern so disk I/O doesn't count as idle). Two stacked SQLite migrations addrelease_sizeas a validexpected_bytes_sourceand thencontent_range, keeping the DB schema in lockstep.transferSize(fromContent-RangeorContent-Length) now falls back toreleaseSize, and thedownloadedBytes === expectedBytescheck is unconditional;UnknownSizeErrorfires before any bytes are written if no size source exists.AbortControlleris armed only duringfetch()and eachreader.read(), cleared immediately after, so slow disk writes never trip it;IdleTimeoutError(retryable,.partpreserved) is distinguished from a plainAbortError(permanent) via a sentinel abort reason..partlarger thanreleaseSizeis discarded before the request; one already equal is finalized without a network round-trip, emitting the expectedheaders+completedevents for DB persistence.Confidence Score: 5/5
The change is safe to merge — the download integrity and timeout logic are well-structured, thoroughly tested, and both new error types are correctly wired into the retry policy.
The idle-timeout design (per-chunk armIdle/clearIdle with a sentinel abort reason) correctly isolates network stalls from disk I/O, and the unconditional size check closes the silent-truncation gap. The two stacked migrations sequence correctly since both are applied before the service starts. The only findings are minor observability gaps (original errors not forwarded as cause to IdleTimeoutError), with no impact on correctness or retry behavior.
apps/backend/src/lib/servers/peer.ts — the idleTimeout() factory in both catch sites drops the original error rather than chaining it via cause.
Important Files Changed
Flowchart
%%{init: {'theme': 'neutral'}}%% flowchart TD A[downloadFile called] --> B{existingBytes > 0 AND releaseSize known?} B -->|yes, part > releaseSize| C[unlink part → emitRestart part_oversize → existingBytes=0] B -->|yes, part == releaseSize| D[rename part→dest\nemit headers+completed\nreturn] B -->|no / unknown size| E[doFetch with idle timer] C --> E E --> F{response status} F -->|206 valid| G[resuming=true] F -->|206 invalid / 416| H[restartFresh → doFetch fresh] F -->|200 ok| I[resuming=false] H --> I G --> J[Parse Content-Range total as transferSize] I --> K[Parse Content-Length as transferSize] J & K --> L{transferSize != null?} L -->|yes| M[expectedBytes = transferSize\nexpectedBytesSource = content_range or content_length] L -->|no, releaseSize known| N[expectedBytes = releaseSize\nexpectedBytesSource = release_size] L -->|no, no releaseSize| O[cancel body → throw UnknownSizeError permanent] M & N --> P[Read loop\narmIdle per chunk\nclearIdle after each read] P -->|chunk received| Q[write to .part\nupdate progress] Q --> P P -->|idle timer fires| R[AbortController aborts\nreader.read rejects AbortError\ncatch: throw IdleTimeoutError retryable] P -->|done=true| S{downloadedBytes == expectedBytes?} S -->|no| T[throw IncompleteDownloadError retryable\ncatch: reader.cancel → releaseLock] S -->|yes| U[releaseLock → rename part→dest → emit completed]Reviews (2): Last reviewed commit: "fix: address review — content_range sour..." | Re-trigger Greptile