Commit 325439a

Authored by nuglifeleoji (Leo Ji) and d-v-b
DOC: document threading.max_workers config option (#3852)
Closes #3492

Made-with: Cursor
Co-authored-by: Leo Ji <nuglifeleoji@gmail.com>
Co-authored-by: Davis Bennett <davis.v.bennett@gmail.com>
1 parent 8f14d67 commit 325439a

2 files changed

Lines changed: 23 additions & 0 deletions

changes/3492.doc.md

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
Document the `threading.max_workers` configuration option in the performance guide.

docs/user-guide/performance.md

Lines changed: 22 additions & 0 deletions
@@ -217,6 +217,28 @@ Lower concurrency values may be beneficial when:
- Memory is constrained (each concurrent operation requires buffer space)
- Using Zarr within a parallel computing framework (see below)

### Thread pool size (`threading.max_workers`)

When synchronous Zarr code calls async operations internally, Zarr uses a
`ThreadPoolExecutor` to run those coroutines. The `threading.max_workers`
configuration option controls the maximum number of worker threads in that pool.
By default it is `None`, which lets Python choose the pool size (typically
`min(32, os.cpu_count() + 4)`).

You can set it explicitly when you want more predictable resource usage:

```python
import zarr

zarr.config.set({'threading.max_workers': 8})
```

Reducing this value can help avoid overloading the event loop when Zarr is used
inside a parallel computing framework such as Dask that already manages its own
thread pool (see the Dask section below). Increasing it may improve throughput
in CPU-bound workloads where many synchronous-to-async dispatches happen
concurrently.
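
To make the default sizing rule quoted above concrete, here is a minimal sketch using only the standard library. It is an illustration, not Zarr's implementation: `default_pool_size` and `effective_pool_size` are hypothetical helper names, and the formula is the commonly documented CPython default, which may differ between Python versions.

```python
import os


def default_pool_size(cpu_count):
    """Return min(32, cpu_count + 4), treating an unknown CPU count as 1."""
    return min(32, (cpu_count or 1) + 4)


def effective_pool_size(max_workers=None):
    """Hypothetical helper: an explicit setting wins, otherwise use the default rule."""
    if max_workers is None:
        return default_pool_size(os.cpu_count())
    return max_workers


# Illustrative CPU counts only; the real value depends on the machine.
for cpus in (2, 8, 64):
    print(cpus, "CPUs ->", default_pool_size(cpus), "threads by default")
```

Under this rule a 64-core machine is capped at 32 threads by default, so an explicit `threading.max_workers` of 8 would shrink the pool to a quarter of that size.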

### Using Zarr with Dask

[Dask](https://www.dask.org/) is a popular parallel computing library that works well with Zarr for processing large arrays. When using Zarr with Dask, it's important to consider the interaction between Dask's thread pool and Zarr's concurrency settings.