BUG: factorize_from_iterables applies type to empty iterables #16844

Closed
drudd opened this issue Jul 6, 2017 · 1 comment · Fixed by #29211
Labels: good first issue, Needs Tests (unit test(s) needed to prevent regressions)

Comments

@drudd
Contributor

drudd commented Jul 6, 2017

Code Sample, a copy-pastable example if possible

In [1]: import pandas as pd

In [2]: A = pd.MultiIndex(levels=[[], []], labels=[[], []], names=['a', 'b'])

In [3]: B = pd.MultiIndex.from_arrays(arrays=[[], []], names=['a', 'b'])

In [4]: A
Out[4]:
MultiIndex(levels=[[], []],
           labels=[[], []],
           names=[u'a', u'b'])

In [5]: B
Out[5]:
MultiIndex(levels=[[], []],
           labels=[[], []],
           names=[u'a', u'b'])

In [6]: pd.testing.assert_index_equal(A, B)
...
AssertionError: MultiIndex level [0] are different

MultiIndex level [0] classes are not equivalent
[left]:  Index([], dtype='object', name=u'a')
[right]: Float64Index([], dtype='float64', name=u'a')

Problem description

Empty iterables should be factorized to an object-dtype Index rather than a Float64Index. Because of the current behavior, a MultiIndex created with the constructor does not compare equal to one created with .from_arrays, which relies on factorization (ref PR #16782).
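
To see which construction path assigns which dtype, the level dtypes can be inspected directly. This is only a sketch, not part of the original report: the helper name `empty_level_dtypes` is hypothetical, and it assumes a pandas version where the constructor keyword is `codes` (the newer name for `labels`). The printed dtypes depend on the installed pandas version, so no expected output is shown.

```python
import pandas as pd


def empty_level_dtypes():
    """Return the dtype of level 0 for each way of building an empty MultiIndex."""
    # Assumption: `codes` is the current spelling of the `labels` keyword
    # used in the report above.
    via_constructor = pd.MultiIndex(
        levels=[[], []], codes=[[], []], names=["a", "b"]
    )
    # from_arrays factorizes each (empty) array to build its levels.
    via_from_arrays = pd.MultiIndex.from_arrays([[], []], names=["a", "b"])
    return via_constructor.levels[0].dtype, via_from_arrays.levels[0].dtype


ctor_dtype, arrays_dtype = empty_level_dtypes()
print("constructor:", ctor_dtype)
print("from_arrays:", arrays_dtype)
```

If the two printed dtypes differ, `assert_index_equal` will fail on the level class check shown in the traceback above.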

drudd pushed a commit to drudd/pandas that referenced this issue Jul 6, 2017
jreback added the Bug, Difficulty Intermediate, Dtype Conversions (unexpected or buggy dtype conversions), Internals (related to non-user accessible pandas implementation), and MultiIndex labels Jul 6, 2017
jreback modified the milestones: Admin, Next Major Release Jul 6, 2017
jreback changed the title from "factorize_from_iterables applies type to empty iterables" to "BUG: factorize_from_iterables applies type to empty iterables" Jul 6, 2017
jreback pushed a commit to drudd/pandas that referenced this issue Jul 7, 2017
@mroeschke
Member

This looks to work on master. Could use a test.

In [276]: pd.__version__
Out[276]: '0.26.0.dev0+593.g9d45934af'

In [277]: In [1]: import pandas as pd
     ...:
     ...: In [2]: A = pd.MultiIndex(levels=[[], []], labels=[[], []], names=['a', 'b'])
     ...:
     ...: In [3]: B = pd.MultiIndex.from_arrays(arrays=[[], []], names=['a', 'b'])
/anaconda3/envs/pandas-dev/bin/ipython:3: FutureWarning: the 'labels' keyword is deprecated, use 'codes' instead
  # -*- coding: utf-8 -*-

In [278]: In [6]: pd.testing.assert_index_equal(A, B)
     ...:

In [279]: A
Out[279]: MultiIndex([], names=['a', 'b'])

In [280]: B
Out[280]: MultiIndex([], names=['a', 'b'])
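
A regression test along these lines might cover the case. It is a minimal sketch rather than the test that was actually merged: the test name is hypothetical, and it uses the `codes` keyword in place of the deprecated `labels` flagged by the FutureWarning above.

```python
import pandas as pd


def test_multiindex_equal_for_empty_iterables():
    # Hypothetical test name; mirrors the reproduction from the report.
    # Both construction paths should give equal MultiIndexes when every
    # input array is empty.
    a = pd.MultiIndex(levels=[[], []], codes=[[], []], names=["a", "b"])
    b = pd.MultiIndex.from_arrays(arrays=[[], []], names=["a", "b"])
    # Raises AssertionError if the level classes or dtypes differ.
    pd.testing.assert_index_equal(a, b)
```

Run under pytest, this fails with the AssertionError from the original report on affected versions and passes once both paths produce object-dtype levels.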

mroeschke added the good first issue and Needs Tests (unit test(s) needed to prevent regressions) labels and removed the Bug, Difficulty Intermediate, Dtype Conversions (unexpected or buggy dtype conversions), Internals (related to non-user accessible pandas implementation), and MultiIndex labels Oct 21, 2019