feat: Added flexible backends with the Zarr(s)-Python and Tensorstore support. #393

Merged
srivarra merged 14 commits into main from zarrs-python
Apr 2, 2026

Collaborator

@srivarra srivarra commented Mar 25, 2026

Previously, iohub relied directly on Zarr-Python internals throughout the NGFF layer, and TensorStore was used only for a couple of one-off features: Position.to_tensorstore(), which manually opens a FOV as a TensorStore handle, and downsampling.

We'd like to use TensorStore or Zarr-Python "automatically", depending on the use case, without having to call .to_tensorstore().

We've replaced the direct Zarr-Python usage and the one-off TensorStore calls with a pluggable backend protocol.

A new abstraction layer in src/iohub/core defines a Protocol, ZarrImplementation[G, A], composed of GroupBackend[G] (open/navigate groups), ArrayBackend[G, A] (create/open arrays), and ArrayIO[A] (read, write, downsample, convert).
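To make the composition concrete, here is a minimal sketch of how three small protocols could be combined into one generic protocol. The method signatures, the `InMemoryImplementation` toy backend, and the `roundtrip` helper are assumptions for illustration only; the real definitions live in src/iohub/core and will differ.

```python
from typing import Protocol, TypeVar

G = TypeVar("G")  # backend-specific group handle
A = TypeVar("A")  # backend-specific array handle


class GroupBackend(Protocol[G]):
    def open_group(self, path: str, mode: str = "r") -> G: ...
    def array_keys(self, group: G) -> list[str]: ...


class ArrayBackend(Protocol[G, A]):
    def create_array(self, group: G, name: str, shape: tuple[int, ...]) -> A: ...
    def open_array(self, group: G, name: str) -> A: ...


class ArrayIO(Protocol[A]):
    def read(self, array: A, key: slice) -> list[float]: ...
    def write(self, array: A, key: slice, value: list[float]) -> None: ...


class ZarrImplementation(
    GroupBackend[G], ArrayBackend[G, A], ArrayIO[A], Protocol[G, A]
):
    """Composite protocol: anything with all of the methods above conforms."""


class InMemoryImplementation:
    """Hypothetical toy backend: groups are dicts, arrays are flat lists."""

    def __init__(self) -> None:
        self._roots: dict[str, dict[str, list[float]]] = {}

    def open_group(self, path: str, mode: str = "r") -> dict[str, list[float]]:
        # The toy ignores `mode` and creates the group on first access.
        return self._roots.setdefault(path, {})

    def array_keys(self, group: dict[str, list[float]]) -> list[str]:
        return sorted(group)

    def create_array(self, group, name: str, shape: tuple[int, ...]) -> list[float]:
        size = 1
        for dim in shape:
            size *= dim
        group[name] = [0.0] * size
        return group[name]

    def open_array(self, group, name: str) -> list[float]:
        return group[name]

    def read(self, array: list[float], key: slice) -> list[float]:
        return array[key]

    def write(self, array: list[float], key: slice, value: list[float]) -> None:
        array[key] = value


def roundtrip(impl: ZarrImplementation[G, A], path: str) -> list[float]:
    """Caller code only sees the protocol, never the concrete backend."""
    group = impl.open_group(path, mode="a")
    array = impl.create_array(group, "0", shape=(4,))
    impl.write(array, slice(0, 2), [1.0, 2.0])
    return impl.read(array, slice(None))
```

Because the protocols are structural, `InMemoryImplementation` does not need to inherit from anything; a type checker accepts it wherever a `ZarrImplementation` is expected as long as the methods line up.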

There are two implementations: ZarrPythonImplementation (which auto-detects the Rust zarrs codec pipeline) and TensorStoreImplementation (which needed some light shims to handle the filesystem and Zarr groups the way Zarr-Python does; it's a bit messy, but I couldn't find a better way).
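As a rough illustration of what auto-detecting the zarrs codec pipeline could look like, the sketch below checks whether the `zarrs` package is importable and picks the pipeline path accordingly. The `codec_pipeline.path` config key and the `zarrs.ZarrsCodecPipeline` value come from the zarrs-python documentation; the function names and the fallback logic are assumptions, not the code in this PR.

```python
from importlib.util import find_spec

# Pipeline path documented by zarrs-python.
ZARRS_PIPELINE = "zarrs.ZarrsCodecPipeline"
# Default pipeline shipped with Zarr-Python 3.
DEFAULT_PIPELINE = "zarr.core.codec_pipeline.BatchedCodecPipeline"


def pick_codec_pipeline() -> str:
    """Prefer the Rust pipeline when the `zarrs` package is installed."""
    return ZARRS_PIPELINE if find_spec("zarrs") is not None else DEFAULT_PIPELINE


def configure_zarr() -> str:
    """Apply the chosen pipeline to Zarr-Python 3, if it is available."""
    pipeline = pick_codec_pipeline()
    if find_spec("zarr") is not None:
        import zarr

        # Zarr-Python 2 has no `zarr.config`, so guard before setting.
        if hasattr(zarr, "config"):
            zarr.config.set({"codec_pipeline.path": pipeline})
    return pipeline
```

Checking with `find_spec` avoids importing the Rust extension just to test for its presence.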

NGFFNDArray wraps any backend handle. All I/O (reads, writes, orthogonal indexing), dask conversion, downsampling, and appends delegate to the bound ZarrImplementation. ImageArray now subclasses NGFFNDArray instead of zarr.Array.
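The delegation pattern can be sketched roughly as follows. `WrappedArray` is a hypothetical stand-in for NGFFNDArray, and `ListImplementation` is a toy backend whose handle is a plain list; neither reflects the actual class bodies in iohub.

```python
from typing import Any


class ListImplementation:
    """Toy stand-in for a ZarrImplementation; the array handle is a list."""

    def get_shape(self, array: list) -> tuple[int, ...]:
        return (len(array),)

    def read(self, array: list, key: Any) -> Any:
        return array[key]

    def write(self, array: list, key: Any, value: Any) -> None:
        array[key] = value


class WrappedArray:
    """Hypothetical NGFFNDArray-like wrapper with no I/O logic of its own."""

    def __init__(self, handle: Any, impl: Any) -> None:
        self._handle = handle  # opaque backend handle
        self._impl = impl      # bound implementation that knows the handle

    @property
    def shape(self) -> tuple[int, ...]:
        return self._impl.get_shape(self._handle)

    def __getitem__(self, key: Any) -> Any:
        return self._impl.read(self._handle, key)

    def __setitem__(self, key: Any, value: Any) -> None:
        self._impl.write(self._handle, key, value)


class ImageArrayLike(WrappedArray):
    """Subclasses the wrapper, not the backend's array type."""
```

Swapping the backend then only means binding a different `impl`; code that indexes the wrapper is unchanged.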

Here's a diagram to show how it's wired up:

                      open_ome_zarr(path, implementation="tensorstore") # or "zarr"
                              │
                              ▼
                      ┌───────────────┐
                      │   NGFFNode    │  (Plate / Well / Position)
                      │  self._impl ──┼──────────────────────────┐
                      └───────┬───────┘                          │
                              │                                  │
                      ┌───────▼───────┐                          │
                      │  NGFFNDArray  │  (ImageArray)            │
                      │  self._impl ──┼──────────────────────────┤
                      └───────────────┘                          │
                                                                 │
                      ┌──────────────────────────────────────────▼──┐
                      │          ZarrImplementation[G, A]           │
                      │              (Protocol)                     │
                      │                                             │
                      │  open_group  · create_array · read · write  │
                      │  group_keys  · open_array   · oindex        │
                      │  array_keys  · get_shape    · to_dask       │
                      │  close       · get_chunks   · downsample    │
                      └──────────────────┬───────────────┬──────────┘
                                         │               │
                 ┌───────────────────────┘               └─────────────────────┐
                 │                                                             │
      ┌──────────▼──────────────┐                            ┌────────────────▼──────────────┐
      │  ZarrPythonImpl         │                            │  TensorStoreImpl              │
      │  G = zarr.Group         │                            │  G = _TsGroup                 │
      │  A = zarr.Array         │                            │  A = ts.TensorStore           │
      │                         │                            │                               │
      │  auto-detects zarrs     │                            │  _detect_zarr_driver()        │
      │  Rust codec pipeline    │                            │  once per open_group          │
      │  shard-region iteration │                            │  → "zarr2" or "zarr3"         │
      │  NumPy downsampling     │                            │  ts.downsample() virtual view │
      └─────────────────────────┘                            └───────────────────────────────┘
                 │                                                             │
      ┌──────────▼──────────────┐                            ┌────────────────▼──────────────┐
      │      zarr.Array         │                            │        _TsGroup shim          │
      │   (zarr-python store)   │                            │  _TsAttrs (reads .zattrs/     │
      └─────────────────────────┘                            │   zarr.json, writes zarr.json)│
                                                             └───────────────────────────────┘

This PR also makes OME-Zarr v0.5 the default for writing and removes the ability to write OME-Zarr v0.4 (reading OME-Zarr v0.4 is still supported).

You can specify the Zarr implementation within open_ome_zarr like so:

open_ome_zarr(..., implementation=Literal["zarr", "tensorstore"])
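One way a string such as "zarr" or "tensorstore" might be turned into a backend object is a small registry lookup. Everything in this sketch (the stub classes, `_IMPLEMENTATIONS`, and `resolve_implementation`) is hypothetical and only shows the dispatch idea, not how open_ome_zarr actually resolves the argument.

```python
from typing import Callable


class ZarrPythonStub:
    """Placeholder for the Zarr-Python backed implementation."""

    name = "zarr"


class TensorStoreStub:
    """Placeholder for the TensorStore backed implementation."""

    name = "tensorstore"


# Hypothetical registry mapping the user-facing name to a factory.
_IMPLEMENTATIONS: dict[str, Callable[[], object]] = {
    "zarr": ZarrPythonStub,
    "tensorstore": TensorStoreStub,
}


def resolve_implementation(name: str) -> object:
    """Return a backend instance for `name`, or fail with a clear message."""
    try:
        factory = _IMPLEMENTATIONS[name]
    except KeyError:
        valid = ", ".join(sorted(_IMPLEMENTATIONS))
        raise ValueError(
            f"Unknown implementation {name!r}; expected one of: {valid}"
        ) from None
    return factory()
```

Failing early with the list of valid names keeps a typo in the `implementation` argument from surfacing later as an obscure I/O error.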

… support.

Writing OME-Zarr v0.4 is unsupported; only OME-Zarr v0.5 can be written now.

BREAKING CHANGE:

Signed-off-by: Sricharan Reddy Varra <sricharan.varra@biohub.org>
@srivarra
Collaborator Author

While the performance benefits of TensorStore are pretty nice, supporting both Zarr implementations adds a lot of complexity.

I think this is useful in the short term, but in a future release we should drop TensorStore entirely in favor of just Zarr-Python with the Zarrs-Python codec.

See:

Very exciting stuff going on over there!

srivarra added 10 commits March 27, 2026 14:47
Signed-off-by: Sricharan Reddy Varra <sricharan.varra@biohub.org>
@srivarra srivarra marked this pull request as ready for review April 2, 2026 17:11
@srivarra srivarra merged commit 8f0c02c into main Apr 2, 2026
7 checks passed
Development

Successfully merging this pull request may close these issues.

Flexible Zarr implementations
Fix Tests for CI - Windows Issues
Add more ruff rules
