FEAT: dj_blob type plugin for DataJoint 2.0 #1262

@dimitri-yatsenko

Feature Request

Problem

Many existing DataJoint pipelines rely on the blob attribute type to store arbitrary serialized Python objects (e.g., NumPy arrays, dictionaries, custom class instances). While DataJoint 2.0 is moving towards a more structured approach with object types and specialized adaptors, there is a critical need for a backward-compatible solution. Without a standardized <dj_blob> adaptor, migrating legacy pipelines becomes a significant challenge, requiring users to manually update their code to handle serialization, which defeats the purpose of a seamless upgrade path.

Requirements

A successful implementation of this improvement should provide a built-in Custom Type Adaptor that replicates the functionality of the legacy blob type for storing general Python objects. This implementation must adhere to the DataJoint 2.0 Specification.

The core requirements are:

  1. Create a dj.CustomType adaptor for classic DataJoint blobs.
  • A new class must be implemented that inherits from dj.CustomType.
  2. Implement the standard interface:
  • The input_type property must return the type name to be used in the table definition: dj_blob.
  • The stored_type property must return the name of the type the data is converted into for storage (this can be another CustomType).
  • The put method must accept an arbitrary Python object and serialize it into a binary format suitable for storage in a blob database field.
  • The get method must accept the binary data from the blob field and deserialize it back into the original Python object.
  3. Register the <dj_blob> adaptor by default with the DataJoint client, so that it is available out of the box without user intervention.
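
The requirements above can be sketched roughly as follows. This is only an illustration of the interface shape, not a proposed implementation: the `CustomType` base class, the `DJBlobType` class name, the `REGISTRY` dictionary, and the `register` helper are hypothetical stand-ins, `longblob` is an assumed choice for stored_type, and `pickle` is swapped in for the legacy blob serialization purely to keep the example self-contained.

```python
import pickle
from abc import ABC, abstractmethod


class CustomType(ABC):
    """Hypothetical stand-in for dj.CustomType, modeled on the interface listed above."""

    @property
    @abstractmethod
    def input_type(self) -> str:
        """Type name used in the table definition."""

    @property
    @abstractmethod
    def stored_type(self) -> str:
        """Type name the data is converted into for storage."""

    @abstractmethod
    def put(self, obj) -> bytes:
        """Serialize a Python object for storage."""

    @abstractmethod
    def get(self, data: bytes):
        """Deserialize stored bytes back into a Python object."""


class DJBlobType(CustomType):
    """Sketch of a dj_blob adaptor (class name is an assumption)."""

    @property
    def input_type(self) -> str:
        return "dj_blob"

    @property
    def stored_type(self) -> str:
        return "longblob"  # assumption: the issue does not name the storage type

    def put(self, obj) -> bytes:
        # Assumption: pickle stands in for the legacy blob encoding.
        return pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)

    def get(self, data: bytes):
        # Only unpickle data from a trusted source.
        return pickle.loads(data)


# Hypothetical registry illustrating "registered by default".
REGISTRY = {}


def register(adaptor: CustomType) -> None:
    """Make an adaptor discoverable by its input_type name."""
    REGISTRY[adaptor.input_type] = adaptor


register(DJBlobType())
```

With such a registry in place, a definition that names `<dj_blob>` could be resolved by looking up `REGISTRY["dj_blob"]` and routing inserts through `put` and fetches through `get`.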

Justification

The primary justification for this feature is backward compatibility. It provides a direct and simple migration path for countless existing pipelines that depend on the ability to store serialized Python objects in blob fields. Furthermore, it offers a standardized, safe, and officially supported method for handling generic Python objects, ensuring consistency and preventing the proliferation of ad-hoc serialization solutions in new pipelines.

Metadata

Labels

breaking: Not backward compatible changes
feature: Indicates new features
