BUG/API: setting behavior inserting negative numbers in np.uint/nullable UInt Series #48867

mroeschke · 2022-09-29T18:02:12Z

Pandas version checks

I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of pandas.
I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

In [22]: np.__version__
Out[22]: '1.23.1'

In [23]: arr = np.array([1], dtype=np.uint8)

In [24]: arr[0] = -1

In [25]: arr
Out[25]: array([255], dtype=uint8)

In [26]: ser =  pd.Series([1], dtype=np.uint8)

In [27]: ser.iloc[0] = -1

In [28]: ser
Out[28]:
0   -1
dtype: int16

In [30]: ser_nullable = pd.Series([1], dtype="UInt8")

In [31]: ser_nullable.iloc[0] = -1

In [32]: ser_nullable
Out[32]:
0    255
dtype: UInt8

Issue Description

NumPy currently returns an overflowed value but may raise in the future with NEP 50 (cc @seberg)
A Series with np.uint dtype appears to upcast the dtype to support -1
A Series with nullable UInt dtype appears to return an overflowed value
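
For readers wondering where 255 comes from, here is a minimal sketch of the wraparound, assuming two's-complement modular arithmetic on 8 bits. The helper `wrap_to_unsigned` is a hypothetical name used only for illustration; it is not part of NumPy or pandas.

```python
import numpy as np


def wrap_to_unsigned(value: int, bits: int = 8) -> int:
    """Hypothetical helper: reduce an integer modulo 2**bits."""
    return value % (1 << bits)


# Pure-Python view of the overflow: -1 mod 256.
wrapped = wrap_to_unsigned(-1)

# An explicit (unsafe) cast of an int64 array to uint8 wraps the same way.
cast = int(np.array([-1], dtype=np.int64).astype(np.uint8)[0])

print(wrapped, cast)
```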

Here is the corresponding construction behavior:

In [33]: np.array([-1], dtype=np.uint8)
Out[33]: array([255], dtype=uint8)

In [34]: pd.Series([-1], dtype=np.uint8)
OverflowError: Trying to coerce negative values to unsigned integers

In [35]: pd.Series([-1], dtype="UInt8")
TypeError: Cannot cast array data from dtype('int64') to dtype('uint8') according to the rule 'safe'

The above exception was the direct cause of the following exception:

TypeError: cannot safely cast non-equivalent int64 to uint8

Expected Behavior

Not sure if the existing rules here are established, but given that construction raises, maybe setting should raise too?

Installed Versions

Numpy Version: '1.23.1'

phofl · 2022-09-29T18:50:31Z

Isn’t the setting behavior similar to when you set a float into an integer column? E.g., we try to find a common dtype.
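
A rough sketch of how a "common dtype" lookup could arrive at int16 for the uint8 example above. It assumes the new value is first reduced to its smallest NumPy dtype and then promoted against the existing dtype; this only illustrates NumPy's promotion rules and is not pandas' actual code path.

```python
import numpy as np

# dtype of the existing Series values in the example
current = np.dtype(np.uint8)

# Smallest dtype that can hold the value being set (-1)
value_dtype = np.min_scalar_type(-1)

# Smallest dtype that can represent both uint8 and int8 values
common = np.promote_types(current, value_dtype)

print(value_dtype, common)
```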

mroeschke · 2022-09-29T19:00:27Z

Ah yeah that's analogous behavior.

Here I'm pointing out the inconsistency between non-nullable and nullable uint, and what @seberg is considering in the future for NEP 50.

seberg · 2022-09-29T19:21:22Z

At this point, I actually don't care much about what we do here in NumPy (which may be my problem). Doing it should simplify the code/logic in the long term, though ;).

If pandas typically (or even sometimes) has the behavior of promoting columns on insertion then it would seem to me that an error in NumPy is preferable over doing the unsafe cast (from a pandas perspective at least).

phofl · 2022-09-29T19:30:58Z

The inconsistency on our side is a bit related to #47577 too.

pllim · 2023-10-25T18:15:30Z

I think this is biting us downstream now that NEP 50 is implemented in numpy 2.0.dev.

Example log: https://github.com/spacetelescope/jdaviz/actions/runs/6643820714/job/18051683584 (that is a lot to read through so I repeated the relevant bits below)

Numpy: 2.0.0.dev0+git20231025.9f6789c
Pandas: 2.2.0.dev0+447.gaae33c036c
...
.../pandas/core/dtypes/cast.py:594: in maybe_promote
    dtype, fill_value = _maybe_promote(dtype, fill_value)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

dtype = dtype('int64'), fill_value = -1
...
            elif issubclass(dtype.type, np.integer):
>               if not np.can_cast(fill_value, dtype):
E               TypeError: can_cast() does not support Python ints, floats, and complex because the result used to depend on the value.
E               This change was part of adopting NEP 50, we may explicitly allow them again in the future.

.../pandas/core/dtypes/cast.py:702: TypeError

pandas/pandas/core/dtypes/cast.py

Line 612 in 074ab2f

def _maybe_promote(dtype: np.dtype, fill_value=np.nan):
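
As an illustration of the kind of check that does not depend on `np.can_cast` accepting Python ints, here is a minimal sketch that compares the value against the dtype's bounds via `np.iinfo`. The function name `int_fits_dtype` is hypothetical, and this is not necessarily how pandas addresses the failure.

```python
import numpy as np


def int_fits_dtype(value: int, dtype) -> bool:
    """Hypothetical helper: True if a Python int is representable in an integer dtype."""
    info = np.iinfo(dtype)
    return info.min <= value <= info.max


# The case from the traceback: -1 fits in int64, so no promotion is needed.
print(int_fits_dtype(-1, np.dtype("int64")))

# The case from the original report: -1 does not fit in uint8.
print(int_fits_dtype(-1, np.uint8))
```

Because the comparison uses plain Python integers, it behaves the same whether or not NEP 50 promotion rules are active.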

mroeschke · 2023-10-26T18:20:04Z

Thanks for the report. Addressing this in #55707 and will hopefully get this in today

mroeschke added the Bug, Indexing, API - Consistency, and NA - MaskedArrays labels Sep 29, 2022

mroeschke mentioned this issue Oct 25, 2023

CI: Debug timeouts #55687

Closed

2 tasks
