Unexpected results creating an empty Series #16737

Closed
AllenDowney opened this issue Jun 20, 2017 · 9 comments
Labels: Bug; Constructors (Series/DataFrame/Index/pd.array Constructors); Indexing (Related to indexing on series/frames, not to indexes themselves)

Comments

@AllenDowney
Contributor

Code Sample, a copy-pastable example if possible

Python 3.5.1 |Anaconda custom (64-bit)| (default, Dec  7 2015, 11:16:01) 
[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from pandas import Series
>>> s1 = Series([])
>>> s1[0] = 100
>>> s1
0    100
dtype: int64
>>> s2 = Series()
>>> s2[0] = 100
Traceback (most recent call last):
  File "/home/downey/anaconda2/envs/py3k/lib/python3.5/site-packages/pandas/core/series.py", line 778, in _set_with_engine
    self.index._engine.set_value(values, key, value)
  File "pandas/_libs/index.pyx", line 116, in pandas._libs.index.IndexEngine.set_value (pandas/_libs/index.c:4649)
  File "pandas/_libs/index.pyx", line 124, in pandas._libs.index.IndexEngine.set_value (pandas/_libs/index.c:4475)
  File "pandas/_libs/index.pyx", line 154, in pandas._libs.index.IndexEngine.get_loc (pandas/_libs/index.c:5085)
  File "pandas/_libs/hashtable_class_helper.pxi", line 1207, in pandas._libs.hashtable.PyObjectHashTable.get_item (pandas/_libs/hashtable.c:20405)
  File "pandas/_libs/hashtable_class_helper.pxi", line 1215, in pandas._libs.hashtable.PyObjectHashTable.get_item (pandas/_libs/hashtable.c:20359)
KeyError: 0

Problem description

When I create an empty series like this model = Series([]), I get the expected behavior when I try to set an element.

When I create an empty series like this model = Series(), I was hoping for the same behavior. It seems like a Series created with data=None isn't good for much. Or maybe there's a reason I would want one?

I realize that providing [] as an argument is not a big deal, but since I am using it as a teaching example, it would be nice to have one less thing to explain.

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.5.1.final.0
python-bits: 64
OS: Linux
OS-release: 3.19.0-32-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.utf8
LOCALE: en_US.UTF-8

pandas: 0.20.1
pytest: 2.8.5
pip: 8.1.1
setuptools: 20.3
Cython: 0.23.4
numpy: 1.11.0
scipy: 0.17.0
xarray: None
IPython: 4.1.2
sphinx: 1.3.5
patsy: 0.4.1
dateutil: 2.5.1
pytz: 2016.2
blosc: None
bottleneck: 1.2.1
tables: 3.3.0
numexpr: 2.5.2
feather: None
matplotlib: 1.5.1
openpyxl: 2.3.2
xlrd: 0.9.4
xlwt: 1.0.0
xlsxwriter: 0.8.4
lxml: 3.6.0
bs4: 4.4.1
html5lib: 0.999
sqlalchemy: 1.0.12
pymysql: None
psycopg2: None
jinja2: 2.8
s3fs: None
pandas_gbq: None
pandas_datareader: None

@chris-b1
Contributor

For what it's worth, from the point of view of teaching, my suggestion would be to act as if there is no such thing as an empty Series. Almost always better to build things up with python data structures, then convert.
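A minimal sketch of the build-then-convert approach suggested above, assuming pandas is imported as pd. The variable names are illustrative and do not come from the thread:

```python
import pandas as pd

# Collect values in a plain Python list first.
values = []
values.append(100)
values.append(200)

# Convert to a Series once, after all values are known.
s = pd.Series(values)
print(s)
```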

That said, this is strange - cause seems to be:

In [90]: s1.index
Out[90]: RangeIndex(start=0, stop=0, step=1)

In [91]: s2.index
Out[91]: Index([], dtype='object')
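One way to avoid depending on which default index the constructor picks is to pass both a dtype and an empty RangeIndex explicitly, then assign by label through .loc. This is only a sketch: the helper name empty_series is hypothetical, and the choice of float64 and .loc is an assumption, not something the thread prescribes.

```python
import pandas as pd

def empty_series(dtype="float64"):
    """Hypothetical helper: empty Series with an explicit dtype and RangeIndex."""
    return pd.Series(dtype=dtype, index=pd.RangeIndex(0))

s = empty_series()
print(type(s.index).__name__)  # the index class that was passed in explicitly

# Label-based assignment; .loc enlarges the Series when the label is missing.
s.loc[0] = 100
print(s)
```

The dtype of the result after enlargement may differ between pandas versions, so the sketch makes no claim about it.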

chris-b1 added the Bug and Indexing labels Jun 20, 2017
chris-b1 added this to the Next Major Release milestone Jun 20, 2017
@AllenDowney
Contributor Author

Thanks for looking into this.

I see your point about building a Python data structure first, but (in case you are interested) I am working on a new book that teaches modeling and simulation for people who have not programmed before, and I am taking a top-down approach where I teach high-level tools, like Pandas, before lower-level tools, like Python data structures. So at this point, the reader has not yet seen lists or dictionaries, which is why I would prefer to make the empty list disappear.

@BranYang
Contributor

It seems that in our test, we do expect this mismatch between two empty Series
Test Code here

# these are Index() and RangeIndex(), which don't compare type equal
# but are just .equals
assert_series_equal(empty, empty2, check_index_type=False)

With the current behavior, the test fails if we remove check_index_type=False.
If we think we should align these two cases, I can give it a try.
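The effect of check_index_type can be sketched outside the pandas test suite. The example below assumes the public pandas.testing.assert_series_equal function and builds the two index types explicitly, so the mismatch is reproduced regardless of what Series() returns by default. The variable names are illustrative.

```python
import pandas as pd
from pandas.testing import assert_series_equal

# Two empty Series whose indexes differ only in type.
left = pd.Series(dtype="float64", index=pd.Index([], dtype=object))
right = pd.Series(dtype="float64", index=pd.RangeIndex(0))

# Passes: the index class is not compared.
assert_series_equal(left, right, check_index_type=False)

# With a strict index type check, the differing index classes are reported.
try:
    assert_series_equal(left, right, check_index_type=True)
    strict_passed = True
except AssertionError:
    strict_passed = False

print("strict comparison passed:", strict_passed)
```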

@jreback
Contributor

jreback commented Jun 21, 2017

@BranYang yeah there is a difference in construction.

In [3]: Series([]).index
Out[3]: RangeIndex(start=0, stop=0, step=1)

In [4]: Series().index
Out[4]: Index([], dtype='object')

This should probably not be the case. Changing it might cause a number of test failures, but it would be good to investigate.

@SaturnFromTitan
Contributor

take

@phofl
Member

phofl commented Nov 15, 2020

Can we do this immediately or should we issue a FutureWarning before changing this? Or will this be done when changing the dtype of empty Series from float to object?

mroeschke added the Constructors label Jun 12, 2021
@jbrockmendel
Member

If this needs a deprecation, it'd be nice to get it in so we can change it in 2.0.

mroeschke removed this from the Contributions Welcome milestone Oct 13, 2022
@AllenDowney
Contributor Author

It looks like this behavior changed at some point, so the example works now.

I'm not sure if I should close the issue, but maybe someone should.

Maybe @mroeschke , since you had the last touch on this issue.

@mroeschke
Member

Yeah this looks to work and I think we have tests for this so closing
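For readers who want to check their own installation, here is a small sketch that tries the assignment from the original report on both kinds of empty Series. The helper name can_set_label_zero is hypothetical, and the explicit dtype=object is an assumption added so that no dtype is left to be inferred. The outcome for the second Series depends on the installed pandas version, so no output is claimed here.

```python
import pandas as pd

def can_set_label_zero(s):
    """Hypothetical helper: report whether s[0] = 100 succeeds on this Series."""
    try:
        s[0] = 100
    except (KeyError, IndexError, ValueError):
        return False
    return bool(s[0] == 100)

# Empty Series built from an empty list, as in s1 from the report.
from_list = can_set_label_zero(pd.Series([], dtype=object))

# Empty Series built with no data, as in s2 from the report.
from_default = can_set_label_zero(pd.Series(dtype=object))

print("Series([]) accepts s[0] = 100:", from_list)
print("Series()   accepts s[0] = 100:", from_default)
```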

8 participants