Implement support for structured and struct zarr-extension defined dtypes#3781
Implement support for structured and struct zarr-extension defined dtypes#3781BrianMichell wants to merge 14 commits intozarr-developers:mainfrom
structured and struct zarr-extension defined dtypes#3781Conversation
|
this looks great, thanks for this. Instead of modifying the existing |
Should be good to go now! |
|
@d-v-b Is there anything I can do to move this PR forward? |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #3781 +/- ##
==========================================
- Coverage 92.98% 92.89% -0.09%
==========================================
Files 87 87
Lines 11246 11309 +63
==========================================
+ Hits 10457 10506 +49
- Misses 789 803 +14
🚀 New features to boost your workflow:
|
hi @BrianMichell thanks for the ping, I will get a review in soon. since numpy is handling the actual data type, I think we will ask much less of you than the tensorstore devs :) |
…plement_canonical_structs
| raise DataTypeValidationError( | ||
| f"Invalid data type: {dtype}. Expected an instance of {cls.dtype_cls}" | ||
| ) | ||
| raise DataTypeValidationError(f"Use 'Struct' for native dtype matching. Got: {dtype}") |
There was a problem hiding this comment.
why do we need to modify the Structured data type at all? I would think we can keep it exactly as-is, but do not register it in the data type registry. Users who need backwards compatibility can explicitly unregister the new data type, and register the old one.
There was a problem hiding this comment.
I made this design choice to support transparent r+ of Structured dtype and also r+/w for Struct. This implementation should support my usecase where we have petabyte scale Structured legacy data and expect to move forward with generating new data using Struct without requiring either a remaster or client code to register/unregister on the fly.
There was a problem hiding this comment.
if we remove Structured from the registry, and Struct accepts data_type metadata that would have worked for Structured, then doesn't this achieve your goal? The idea is that the input type for Struct is wider than the input type for Structured, so any valid Structured would also be a valid Struct data type. But if for some reason someone needs to use the old Structured data type, it is still there, untouched. They just have to opt in to using it.
edit, I mixed up Struct and Structured
There was a problem hiding this comment.
I see what you're getting at. Made that update!
…ow it to appropriately pick up `structured` dtype
Updates for consistency with the zarr-extension for structured types.
Implements legacy read-only for
structuredtype and support for newstructdefinition.TODO:
docs/user-guide/*.mdchanges/