Add support for structured dtypes to zarr3 driver, open structs as void#271
Add support for structured dtypes to zarr3 driver, open structs as void#271BrianMichell wants to merge 140 commits into
zarr3 driver, open structs as void#271Conversation
…t for raw bits dtype
Implement shim for `open_as_void` driver level flag
* Begin removing void field shim * Fully removed void string shim * Cleanup debug prints * Remove shimmed validation * Remove unnecessary comment * Prefer false over zero for ternary clarity
* Implement a more general and portable example set * Fix driver cache bug * Update example for template * Cleanup example * Remove testing examples from source
* Use the appropriate fill value for open_as_void structured data * Cleanup
|
I'll try to get to this in about a week, before I look this one over, please double check that the prior PR works for you. Also look over this one and see if any of the suggestions from the other one applies. |
Matches the pattern from zarr v2 driver (PR google#272). When both "field" and "open_as_void" are specified in the spec, return an error since these options are mutually exclusive - field selects a specific field from a structured array, while open_as_void provides raw byte access to the entire structure.
The zarr3 URL syntax cannot represent field selection or void access mode. Following the pattern from zarr v2 driver (PR google#272), ToUrl() now returns an error when either of these options is specified instead of silently ignoring them.
…trip Following the pattern from zarr v2 driver (PR google#272), override GetBoundSpecData in ZarrDataCache to set spec.open_as_void from ChunkCacheImpl::open_as_void_. This ensures that when you open a store with open_as_void=true and then call spec(), the resulting spec correctly has open_as_void=true set. Without this fix, opening a store with open_as_void=true and then getting its spec would lose the open_as_void flag, causing incorrect behavior if the spec is used to re-open the store.
* Update to follow google#287 and implement general feedback
|
Looks like development has picked back up on Tensorstore. I think I've addressed any outstanding issues and made sure everything is following the defined spec. Please let me know if there's anything else I can do to help move this forward any more! |
|
@jbms and I talked briefly about this and he has some comments; briefly there are a few concerns regarding endian encoding/decoding when used with the zarr3 dtype and with field selection. edit: ... making the "bytes" codec understand the new ZarrDType might be a somewhat invasive change to how the codecs work. |
Use iota instead of looping
| const auto& field = zarr_dtype_.fields[field_i]; | ||
| const auto& component_shape = grid().components[field_i].shape(); | ||
| auto result_array = | ||
| AllocateArray(component_shape, c_order, default_init, field.dtype); |
There was a problem hiding this comment.
This should take preferred inner order into account when allocating here. Also should try to avoid copying at all, and just reference the original array with the correct stride, if it is already suitable aligned, as for zarr v2.
There was a problem hiding this comment.
(Suitably aligned and suitable endian)
| decoded.fill_value = metadata->fill_value[0]; | ||
| } | ||
| } else { | ||
| // Zero-filled scalar byte fill_value (broadcast to chunk shape at use). |
There was a problem hiding this comment.
The fill value handling here seems problematic --- the zero-filled fill_value surely isn't going to work. Plus in some cases fill_value seems to be unset.
| {0, 0}, {2, 2})) | ||
| .result()); | ||
| EXPECT_EQ(b_data, b_read); | ||
| } |
There was a problem hiding this comment.
I think fill values for structured fields aren't handled correctly in the case of sharding --- please add tests for that.
Supersedes #264
Resolves comments 1 and 2