Closed
Labels
bug: Indicates an unexpected problem or unintended behavior
Bug Report
Description
Attempting to insert a file into a File part table as external storage triggers a ValueError due to an incompatibility between DataJoint-Python (v0.14.3) and numpy (v2.2.*). numpy 2.2.* raises an error for truth-testing on empty arrays, whereas earlier versions issued a DeprecationWarning. This affects any truth-testing of empty arrays within DataJoint-Python.
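The behavior change can be observed without DataJoint. The snippet below is a minimal sketch, not code from the library; the helper name `truth_test_empty_array` is made up for illustration. It reports how the installed numpy reacts when an empty array is truth-tested.

```python
import warnings

import numpy as np


def truth_test_empty_array():
    """Return a label describing how bool() treats an empty numpy array.

    Hypothetical helper for illustration only; it is not part of DataJoint.
    """
    empty = np.array([])
    with warnings.catch_warnings():
        # Turn the old DeprecationWarning into an exception so it can be detected.
        warnings.simplefilter("error", DeprecationWarning)
        try:
            bool(empty)
        except ValueError:
            # Newer numpy: truth-testing an empty array is an error.
            return "ValueError"
        except DeprecationWarning:
            # Older numpy: truth-testing an empty array only warned.
            return "DeprecationWarning"
    return "no error"


if __name__ == "__main__":
    print(np.__version__, truth_test_empty_array())
```

With numpy 2.2.* this should report the ValueError path, matching the traceback in this report.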
Reproducibility
The issue is reproducible during file insertion using the following definition:
class File(dj.Part):
    definition = """
    -> master
    ---
    file_name: varchar(255)
    file: filepath@seq-raw
    """
- OS: macOS Sequoia 15.2
- Python Version: 3.11.10
- DataJoint Version: 0.14.3
- Minimum number of steps to reliably reproduce the issue:
  - Define a dj.Part table as shown above.
  - Attempt to insert a file using the insert1 method with numpy 2.2.* installed.
- Complete error stack as a result of evaluating the above steps
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[38], line 8
6 if gtf_file.is_file():
7 file_info = dict(file_name="2021-04-23.mm10.ncbiRefSeq.gtf", file=gtf_file)
----> 8 rna_seq.GTFAnnotation.File.insert1(
9 {**stored_ref_gen, **file_info}, skip_duplicates=True
10 )
File ~/miniconda/envs/ucsf_cadwell-lab/lib/python3.11/site-packages/datajoint/table.py:347, in Table.insert1(self, row, **kwargs)
340 def insert1(self, row, **kwargs):
341 """
342 Insert one data record into the table. For ``kwargs``, see ``insert()``.
343
344 :param row: a numpy record, a dict-like object, or an ordered sequence to be inserted
345 as one row.
346 """
--> 347 self.insert((row,), **kwargs)
File ~/miniconda/envs/ucsf_cadwell-lab/lib/python3.11/site-packages/datajoint/table.py:430, in Table.insert(self, rows, replace, skip_duplicates, ignore_extra_fields, allow_direct_insert)
428 # collects the field list from first row (passed by reference)
429 field_list = []
--> 430 rows = list(
431 self.__make_row_to_insert(row, field_list, ignore_extra_fields)
432 for row in rows
433 )
434 if rows:
435 try:
File ~/miniconda/envs/ucsf_cadwell-lab/lib/python3.11/site-packages/datajoint/table.py:431, in <genexpr>(.0)
428 # collects the field list from first row (passed by reference)
429 field_list = []
430 rows = list(
--> 431 self.__make_row_to_insert(row, field_list, ignore_extra_fields)
432 for row in rows
433 )
434 if rows:
435 try:
File ~/miniconda/envs/ucsf_cadwell-lab/lib/python3.11/site-packages/datajoint/table.py:913, in Table.__make_row_to_insert(self, row, field_list, ignore_extra_fields)
911 elif isinstance(row, collections.abc.Mapping): # dict-based
912 check_fields(row)
--> 913 attributes = [
914 self.__make_placeholder(name, row[name], ignore_extra_fields)
915 for name in self.heading
916 if name in row
917 ]
918 else: # positional
919 try:
File ~/miniconda/envs/ucsf_cadwell-lab/lib/python3.11/site-packages/datajoint/table.py:914, in <listcomp>(.0)
911 elif isinstance(row, collections.abc.Mapping): # dict-based
912 check_fields(row)
913 attributes = [
--> 914 self.__make_placeholder(name, row[name], ignore_extra_fields)
915 for name in self.heading
916 if name in row
917 ]
918 else: # positional
919 try:
File ~/miniconda/envs/ucsf_cadwell-lab/lib/python3.11/site-packages/datajoint/table.py:873, in Table.__make_placeholder(self, name, value, ignore_extra_fields)
867 value = (
868 str.encode(attachment_path.name)
869 + b"\0"
870 + attachment_path.read_bytes()
871 )
872 elif attr.is_filepath:
--> 873 value = self.external[attr.store].upload_filepath(value).bytes
874 elif attr.numeric:
875 value = str(int(value) if isinstance(value, bool) else value)
File ~/miniconda/envs/ucsf_cadwell-lab/lib/python3.11/site-packages/datajoint/external.py:281, in ExternalTable.upload_filepath(self, local_filepath)
279 # check if the remote file already exists and verify that it matches
280 check_hash = (self & {"hash": uuid}).fetch("contents_hash")
--> 281 if check_hash:
282 # the tracking entry exists, check that it's the same file as before
283 if contents_hash != check_hash[0]:
284 raise DataJointError(
285 f"A different version of '{relative_filepath}' has already been placed."
286 )
ValueError: The truth value of an empty array is ambiguous. Use `array.size > 0` to check that an array is not empty.
Expected Behavior
The file should be inserted into the table without error, as it works with numpy versions prior to 2.2.
Additional Research and Context
- numpy/numpy#9583: Deprecation of truth-testing on empty arrays.
- numpy 2.2.0rc1: Introduced changes causing this error.
Solution
Patch DataJoint-Python so that empty arrays are checked explicitly (for example, with `array.size > 0`, as the error message suggests) rather than truth-tested.
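One possible shape for such a check is sketched below. This is an assumption about how the fix could look, not the actual patch: `is_different_version` is a hypothetical helper that mirrors the logic at `external.py:281-283` in the traceback, with a plain numpy array standing in for the result of `fetch("contents_hash")`.

```python
import numpy as np


def is_different_version(check_hash, contents_hash):
    """Return True if a tracked hash exists and differs from contents_hash.

    Hypothetical helper for illustration; not part of DataJoint.
    `check_hash` stands in for the array returned by fetch("contents_hash").
    """
    check_hash = np.asarray(check_hash)
    # Explicit size check instead of `if check_hash:`, which raises
    # ValueError on an empty array under numpy 2.2.*.
    if check_hash.size > 0:
        return bool(contents_hash != check_hash[0])
    # No tracking entry exists yet, so there is nothing to conflict with.
    return False


if __name__ == "__main__":
    print(is_different_version(np.array([]), "abc"))
    print(is_different_version(np.array(["abc"]), "xyz"))
```

Checking `size` (or `len`) behaves the same way across numpy versions, so it does not rely on the deprecated truth-testing behavior.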