fix: array creation in tests #169
That's correct, and we should spell this out more clearly in the docs. The basic problem we had to solve was making sharding look like a top-level array field, instead of a codec that contains other codecs. But this simplification is not free.
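To make the "top-level field versus nested codec" distinction concrete, here is a minimal sketch that reads a Zarr v3 style metadata dict and reports the two shapes the way a flat API would name them. The helper name `top_level_view` and the example shapes are hypothetical, chosen only for illustration; the sketch assumes the sharding codec appears under the name `sharding_indexed` and that the outer chunk grid is regular.

```python
def top_level_view(metadata: dict) -> dict:
    """Return the flat (chunks, shards) view of a nested v3-style metadata dict.

    Hypothetical helper: with sharding, the chunk grid holds the shard shape
    and the sharding codec's configuration holds the inner chunk shape.
    """
    grid_shape = tuple(metadata["chunk_grid"]["configuration"]["chunk_shape"])
    for codec in metadata.get("codecs", []):
        if codec.get("name") == "sharding_indexed":
            inner = tuple(codec["configuration"]["chunk_shape"])
            # Each shard must hold a whole number of inner chunks per axis.
            if len(inner) != len(grid_shape) or any(
                s % c for s, c in zip(grid_shape, inner)
            ):
                raise ValueError(f"shard shape {grid_shape} is not a multiple of {inner}")
            return {"chunks": inner, "shards": grid_shape}
    # No sharding codec: the chunk grid shape is the chunk shape itself.
    return {"chunks": grid_shape, "shards": None}


# Illustrative metadata only; the shapes are assumptions, not taken from the tests.
nested = {
    "chunk_grid": {"name": "regular", "configuration": {"chunk_shape": [64, 64]}},
    "codecs": [
        {
            "name": "sharding_indexed",
            "configuration": {
                "chunk_shape": [16, 16],
                "codecs": [{"name": "bytes", "configuration": {"endian": "little"}}],
            },
        }
    ],
}
view = top_level_view(nested)
```

In the nested form the name "chunk shape" refers to the shard at the outer level and to the inner chunk inside the codec, which is the ambiguity the flat form removes.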
OK, if I understand correctly, one has to change from
    Array.create(
        chunk_shape=shard_shape,  # !
        codecs=[ShardingCodec(chunk_shape=chunk_shape)],
    )
to
    create_array(
        chunk_shape=chunk_shape,
        shards=shard_shape,
    )
right? If I do that, all tests pass except for this one. IDK why the number of stored bytes goes down, any ideas?
$ open …/pytest-165/test_delete_empty_shards_local0/delete_empty_shards/c/0/0 | into binary
Length: 53 (0x35) bytes
00000000: 28 b5 2f fd 20 80 45 00 00 10 00 00 01 00 3b 05 (×/× ×E00•00•0;•
00000010: 58 00 00 00 00 00 00 00 00 11 00 00 00 00 00 00 X00000000•000000
00000020: 00 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 0×××××××××××××××
00000030: ff a5 26 2a 18 ××&*•
zarrs-python/tests/test_sharding.py Lines 267 to 296 in 698a11c
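One way to see what these 53 bytes contain is to parse them by hand. The sketch below copies the bytes from the dump and makes three assumptions that the dump itself does not state: the shard index sits at the end of the file, it is followed by a 4-byte checksum, and the shard holds two inner chunks (which is what the length works out to if the first chunk is 17 bytes long). The zstd magic number `28 b5 2f fd` is the standard frame marker; the function name is hypothetical.

```python
import struct

# Bytes copied from the hexdump above.
SHARD_HEX = """
28 b5 2f fd 20 80 45 00 00 10 00 00 01 00 3b 05
58 00 00 00 00 00 00 00 00 11 00 00 00 00 00 00
00 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
ff a5 26 2a 18
"""
shard = bytes.fromhex(SHARD_HEX)

ZSTD_MAGIC = bytes.fromhex("28b52ffd")  # standard zstd frame magic
EMPTY = 2**64 - 1   # marker for an inner chunk that is not stored
ENTRY_LEN = 16      # two little-endian uint64 values: offset, nbytes
CHECKSUM_LEN = 4    # assumption: a 4-byte checksum trails the index


def parse_shard_index(data: bytes, n_inner_chunks: int) -> list:
    """Return one (offset, nbytes) tuple per inner chunk, or None if empty.

    Assumes the index is stored at the end of the shard.
    """
    index_len = n_inner_chunks * ENTRY_LEN + CHECKSUM_LEN
    index = data[-index_len:-CHECKSUM_LEN]
    entries = []
    for i in range(n_inner_chunks):
        offset, nbytes = struct.unpack_from("<QQ", index, i * ENTRY_LEN)
        is_empty = offset == EMPTY and nbytes == EMPTY
        entries.append(None if is_empty else (offset, nbytes))
    return entries


entries = parse_shard_index(shard, n_inner_chunks=2)  # two chunks is an assumption
first_offset, first_nbytes = entries[0]
first_chunk = shard[first_offset:first_offset + first_nbytes]
is_zstd = first_chunk.startswith(ZSTD_MAGIC)
```

Under those assumptions the file decomposes into one 17-byte zstd frame, two 16-byte index entries (the second marked empty), and a 4-byte checksum, which accounts for all 53 bytes. A compressed frame in the stored chunk would be consistent with a compressor being applied where previously none was.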
Offhand I have no idea! The first thing I would check is the array metadata document -- maybe the
Ah, of course, the default compressor is zstd, so before, by overriding I think that means all is well now. I still don’t understand everything enough to confidently say that my assumption in #169 (comment) is true, but I’m relatively confident.
chunk_shape=(16, 16),
def test_invalid_metadata_chunk_shape(store: Store) -> None:
I think this and the next test (test_invalid_metadata_inner_chunk_shape) are the most likely candidates for me messing up:
Since both the parameter semantics and the error messages change between the two APIs, it’s possible that these tests no longer check what they should.
But if my assumption about the semantics is accurate, this should be fine.
so what I thought happens is that they split up codecs into
- shards: if present, this contains the parameters for an implicitly created ArrayToArrayCodec that wraps all other given codecs and applies them for each shard individually
- filters: list of ArrayToArrayCodecs
- serializer: a single ArrayToBytesCodec
- compressors: a list of BytesToBytesCodecs
this assumption seems to hold for e.g. tests/test_codecs.py::test_order, but other tests fail, so apparently I didn’t understand this.
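The split described above can be sketched in plain Python. This only illustrates the assumption as stated in the comment, not the library's actual behavior: the three codec classes are stand-ins defined here (not imported from any package), `split_codecs` is a hypothetical name, and the `shards` part is left out because it carries a shape rather than a codec.

```python
from dataclasses import dataclass


@dataclass
class Codec:
    name: str


# Stand-in classes named after the categories in the comment above.
class ArrayToArrayCodec(Codec):
    pass


class ArrayToBytesCodec(Codec):
    pass


class BytesToBytesCodec(Codec):
    pass


def split_codecs(codecs):
    """Split one ordered codec list into (filters, serializer, compressors).

    Requires array-to-array codecs first, then exactly one array-to-bytes
    codec, then bytes-to-bytes codecs.
    """
    stages = (ArrayToArrayCodec, ArrayToBytesCodec, BytesToBytesCodec)
    buckets = ([], [], [])
    current = 0
    for codec in codecs:
        for stage, cls in enumerate(stages):
            if isinstance(codec, cls):
                break
        else:
            raise TypeError(f"unknown codec kind: {codec!r}")
        if stage < current:
            raise ValueError(f"{codec.name} appears out of order")
        current = stage
        buckets[stage].append(codec)
    filters, serializers, compressors = buckets
    if len(serializers) != 1:
        raise ValueError("expected exactly one array-to-bytes codec")
    return filters, serializers[0], compressors


# Codec names here are placeholders for illustration.
filters, serializer, compressors = split_codecs(
    [ArrayToArrayCodec("transpose"), ArrayToBytesCodec("bytes"), BytesToBytesCodec("zstd")]
)
```

Keeping the order check explicit makes it easy to see which inputs the flat parameters can represent and which a free-form list would have allowed.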