Description
Feature Request
Problem
Many existing DataJoint pipelines rely on the blob attribute type to store arbitrary serialized Python objects (e.g., NumPy arrays, dictionaries, custom class instances). While DataJoint 2.0 is moving towards a more structured approach with object types and specialized adaptors, there is a critical need for a backward-compatible solution. Without a standardized <dj_blob> adaptor, migrating legacy pipelines becomes a significant challenge, requiring users to manually update their code to handle serialization, which defeats the purpose of a seamless upgrade path.
Requirements
A successful implementation of this improvement should provide a built-in Custom Type Adaptor that replicates the functionality of the legacy blob type for storing general Python objects. This implementation must adhere to the DataJoint 2.0 Specification.
The core requirements are:
- Create a dj.CustomType adaptor for classic DataJoint blobs.
  - A new class must be implemented that inherits from dj.CustomType.
- Implement the standard interface:
  - The input_type property must return the type name to be used in the definition: dj_blob.
  - The stored_type property must return the type name to convert into for storage (can be another CustomType).
  - The put method must accept an arbitrary Python object and serialize it into a binary format suitable for storage in a blob database field.
  - The get method must accept the binary data from the blob field and deserialize it back into the original Python object.
- The <dj_blob> adaptor should be registered by default with the DataJoint client, making it available out-of-the-box without requiring user intervention.
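To make the interface above concrete, here is a minimal sketch of what such an adaptor could look like. It is only an illustration under several assumptions: the CustomType base class below is a hypothetical stand-in for dj.CustomType (whose final signature is not specified here), the stored_type value "longblob" is an assumed choice, pickle stands in for DataJoint's own blob serialization format, and the DEFAULT_ADAPTORS registry and register helper are invented for this example.

```python
import pickle
from abc import ABC, abstractmethod
from typing import Any, Dict


class CustomType(ABC):
    """Hypothetical stand-in for dj.CustomType; the real base class may differ."""

    @property
    @abstractmethod
    def input_type(self) -> str:
        """Type name used in table definitions."""

    @property
    @abstractmethod
    def stored_type(self) -> str:
        """Type name the value is converted into for storage."""

    @abstractmethod
    def put(self, obj: Any) -> Any:
        """Convert a Python object into its stored form."""

    @abstractmethod
    def get(self, stored: Any) -> Any:
        """Convert the stored form back into a Python object."""


class DJBlobType(CustomType):
    """Sketch of a <dj_blob> adaptor for arbitrary Python objects."""

    @property
    def input_type(self) -> str:
        return "dj_blob"

    @property
    def stored_type(self) -> str:
        # Assumption: the underlying column is a native blob type.
        return "longblob"

    def put(self, obj: Any) -> bytes:
        # Assumption: pickle is a placeholder for the legacy blob format.
        return pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)

    def get(self, stored: bytes) -> Any:
        return pickle.loads(stored)


# Hypothetical default registry, keyed by input_type.
DEFAULT_ADAPTORS: Dict[str, CustomType] = {}


def register(adaptor: CustomType) -> CustomType:
    """Make an adaptor available under its input_type name."""
    DEFAULT_ADAPTORS[adaptor.input_type] = adaptor
    return adaptor


# Registered at import time, so no user action is needed.
register(DJBlobType())

# Round trip: the object that comes out equals the one that went in.
adaptor = DEFAULT_ADAPTORS["dj_blob"]
payload = {"trial": 3, "spikes": [0.1, 0.25, 0.9]}
assert adaptor.get(adaptor.put(payload)) == payload
```

A real implementation would presumably reuse the existing DataJoint blob encoding so that data written by legacy pipelines stays readable. pickle is used here only to keep the sketch self-contained, and it should not be used to load untrusted data.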
Justification
The primary justification for this feature is backward compatibility. It provides a direct and simple migration path for countless existing pipelines that depend on the ability to store serialized Python objects in blob fields. Furthermore, it offers a standardized, safe, and officially supported method for handling generic Python objects, ensuring consistency and preventing the proliferation of ad-hoc serialization solutions in new pipelines.