Concat empty series with mixed TZ doesn't coerce to object #22186

Closed
TomAugspurger opened this issue Aug 3, 2018 · 5 comments · Fixed by #52532
Labels
Bug · Dtype Conversions (Unexpected or buggy dtype conversions) · Reshaping (Concat, Merge/Join, Stack/Unstack, Explode) · Timezones (Timezone data dtype)

Comments

TomAugspurger (Contributor) commented Aug 3, 2018

Do we intentionally special-case concatenating an empty series with a non-empty series here? I don't think we should. It'd be nice to statically know the output dtype of a concat without having to look at the values.

In [2]: import pandas as pd

In [3]: pd.concat([pd.Series([], dtype='datetime64[ns, US/Central]'), pd.Series(['2018'], dtype='datetime64[ns, US/Eastern]')])
Out[3]:
0   2018-01-01 00:00:00-05:00
dtype: datetime64[ns, US/Eastern]

In [4]: pd.concat([pd.Series([], dtype='datetime64[ns, US/Central]'), pd.Series([], dtype='datetime64[ns, US/Eastern]')])
Out[4]: Series([], dtype: object)

I'd expect both of those to be object dtype.
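Spelled out as assertions, the expectation above would look roughly like this (a sketch of the desired behaviour, not what master currently does; the first assert fails today):

import pandas as pd

central = pd.Series([], dtype='datetime64[ns, US/Central]')
eastern = pd.Series(['2018'], dtype='datetime64[ns, US/Eastern]')

# Expected: mixed timezones coerce to object whether or not either side is empty.
assert pd.concat([central, eastern]).dtype == object           # currently datetime64[ns, US/Eastern]
assert pd.concat([central, eastern.iloc[:0]]).dtype == object  # already object today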

TomAugspurger added the Reshaping and Dtype Conversions labels on Aug 3, 2018
mroeschke added the Timezones label on Aug 3, 2018
TomAugspurger (Contributor, Author) commented

Taking this over as a general concat issue

Currently concat[SparseSeries] uses the fill_value of the first array.

In [12]: import numpy as np

In [13]: pd.concat([
    ...:     pd.SparseSeries([0, 0, None], fill_value=0),
    ...:     pd.SparseSeries([0, None, 0], fill_value=np.nan)
    ...: ]).fill_value
Out[13]: 0

In [14]: pd.concat([
    ...:     pd.SparseSeries([0, 0, None], fill_value=np.nan),
    ...:     pd.SparseSeries([0, None, 0], fill_value=0)
    ...: ]).fill_value
Out[14]: nan

Are we comfortable calling that a bug? What's the expected behavior, raising?

TomAugspurger (Contributor, Author) commented Aug 3, 2018

I guess this raises a general design question: what dtype coercion are we comfortable with concat doing?

  1. We're clearly OK with concatenating a mix of ints and floats and getting floats (see the snippet after this list).
  2. There's the open issue of concat[categorical] using union_categoricals.
  3. We'll need to figure out concat of Period with different freqs (unless we already do that?).
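For point 1, a quick illustration of the numeric upcasting we already accept (nothing new, just the baseline to compare the other cases against):

import pandas as pd

ints = pd.Series([1, 2], dtype='int64')
floats = pd.Series([1.5], dtype='float64')

# int64 concatenated with float64 gives float64: the coercion everyone is comfortable with.
print(pd.concat([ints, floats]).dtype)  # float64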

TomAugspurger (Contributor, Author) commented

For concat([sparse, dense]) I'd propose that the result always be sparse.

Currently, we try to keep it sparse; e.g. concat([Sparse[float64], float64]) is Sparse[float64].
But concat([Sparse[float64], object]) is object. IMO it should be Sparse[object].
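A minimal sketch of that inconsistency, using the SparseSeries API from this era (it has since been deprecated, and exact dtype reprs may vary by version):

import numpy as np
import pandas as pd

sparse = pd.SparseSeries([0.0, np.nan, 1.0])
dense_float = pd.Series([1.0, 2.0])
dense_object = pd.Series(['a', 'b'])

# Per the behaviour described above:
print(pd.concat([sparse, dense_float]).dtype)   # stays sparse (Sparse[float64])
print(pd.concat([sparse, dense_object]).dtype)  # falls back to dense object; proposal: Sparse[object]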

jreback (Contributor) commented Aug 3, 2018

prob should densify if sparse and dense and both float

TomAugspurger (Contributor, Author) commented

> prob should densify if sparse and dense and both float

Hmm, I don't think so. I think implicitly going from sparse to dense is always problematic. And on master we currently sparsify the dense series, but we're inconsistent about it:

  • Sparse[float] & float -> Sparse[float]
  • Sparse[float] & object -> object (dense)
