new row not created upon assignment to nonexistent row label in DataFrame with MultiIndex #6699

lebedov · 2014-03-24T19:31:57Z

According to the documentation for pandas 0.13+, assigning to a non-existent key via the .ix or .loc operations should create a new row in the DataFrame. Using pandas 0.13.1 with Python 2.7.5, I noticed that this doesn't seem to be the case when the DataFrame has a MultiIndex. To illustrate, consider a DataFrame created as follows:

import itertools
import numpy as np
import pandas

rows = 2
cols = 3
# Random 0/1 integers, and the same values scaled by random floats.
a = np.random.randint(0, 2, rows*cols).reshape((rows, cols))
b = a*np.random.rand(rows, cols)
# Two-level index covering every (row, col) pair.
idx = pandas.MultiIndex.from_tuples(
    list(itertools.product(range(rows), range(cols))),
    names=['rows', 'cols'])
df = pandas.DataFrame({'a': a.flatten(),
                       'b': b.flatten()}, index=idx)

If df is

           a         b
rows cols             
0    0     0  0.000000
     1     0  0.000000
     2     1  0.315458
1    0     0  0.000000
     1     0  0.000000
     2     1  0.061142

then running df.loc[(0, 1), :] = (10, 20.0) modifies df as follows:

            a          b
rows cols               
0    0      0   0.000000
     1     10  20.000000
     2      1   0.315458
1    0      0   0.000000
     1      0   0.000000
     2      1   0.061142

However, running df.loc[(0, 3), :] = (10, 20.0) on the original df does not add a row for (0, 3). Instead, it overwrites the existing rows (0, 0) and (1, 0):

            a          b
rows cols               
0    0     10  20.000000
     1      0   0.000000
     2      1   0.315458
1    0     10  20.000000
     1      0   0.000000
     2      1   0.061142

Is this behavior expected?


jreback · 2014-03-24T19:42:24Z

It's not implemented with a MultiIndex; it should probably just raise an error for now.

Want to do a PR to raise NotImplementedError?
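
Until something like that exists inside pandas, the same idea can be approximated on the user side. The following is only a rough sketch: `assign_existing_row` is a hypothetical helper, not a pandas function, and it assumes the key is a full tuple with one entry per index level.

```python
import pandas as pd


def assign_existing_row(df, key, values):
    """Hypothetical helper (not part of pandas).

    Assign to a row only if its label already exists; refuse to
    enlarge a frame whose index is a MultiIndex.
    """
    if isinstance(df.index, pd.MultiIndex) and key not in df.index:
        raise NotImplementedError(
            "setting a new row on a MultiIndex is not supported: %r" % (key,))
    df.loc[key, :] = values


# Small deterministic frame with the same shape as the one in the report.
idx = pd.MultiIndex.from_product([range(2), range(3)], names=["rows", "cols"])
df = pd.DataFrame({"a": [0, 0, 1, 0, 0, 1],
                   "b": [0.0, 0.0, 0.3, 0.0, 0.0, 0.1]}, index=idx)

# Existing label: the row is updated in place.
assign_existing_row(df, (0, 1), (10, 20.0))

# Missing label: the helper raises instead of touching other rows.
try:
    assign_existing_row(df, (0, 3), (10, 20.0))
    raised = False
except NotImplementedError:
    raised = True
```

The membership test runs before any assignment, so a rejected call leaves df untouched.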

toobaz · 2017-02-22T01:08:52Z

This seems to be fixed (at least, the proposed example behaves fine in git):

In [2]: df.loc[(0,3), :] = (10, 20.0)

In [3]: df
Out[3]: 
              a          b
rows cols                 
0    0      1.0   0.274632
     1      0.0   0.000000
     2      1.0   0.056103
1    0      1.0   0.941639
     1      1.0   0.546224
     2      1.0   0.866284
0    3     10.0  20.000000

lebedov · 2017-02-22T18:00:30Z

I concur - looks good in pandas 0.19.2. Closing.
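
For anyone who wants to re-check this on their own installation, here is a minimal sketch of the check. It assumes a pandas version where the fix is present, swaps the random data from the original report for fixed values so the result is reproducible, and uses Python 3 syntax. The names `before`, `row_added` and `existing_unchanged` are just illustrative.

```python
import pandas as pd

# Same 2 x 3 MultiIndex layout as in the report, with fixed values.
idx = pd.MultiIndex.from_product([range(2), range(3)], names=["rows", "cols"])
df = pd.DataFrame({"a": [0, 0, 1, 0, 0, 1],
                   "b": [0.0, 0.0, 0.3, 0.0, 0.0, 0.1]}, index=idx)
before = df.copy()

# Assign to a label that is not in the index yet.
df.loc[(0, 3), :] = (10, 20.0)

# The new label should now exist ...
row_added = (0, 3) in df.index
# ... and the six original rows should hold the same values as before.
existing_unchanged = (df.loc[before.index].to_numpy() == before.to_numpy()).all()

print(df)
print(row_added, existing_unchanged)
```

The comparison is done on values only, because the dtype of column a after enlargement may differ between pandas versions (the output above shows it as float).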

jreback added MultiIndex labels Mar 24, 2014

jreback added this to the 0.14.0 milestone Mar 24, 2014

jreback modified the milestones: 0.15.0, 0.14.0 Apr 6, 2014

jreback modified the milestones: 0.16.0, Next Major Release Mar 3, 2015

jreback mentioned this issue Jul 12, 2015

BUG: DataFrame.loc silently drops non-existent elements when using MultiIndex #10549

Closed

jreback mentioned this issue Feb 13, 2016

Assigning from other NDFrame broken on multiple MultiIndex columns #12313

Closed

jreback added Indexing (Related to indexing on series/frames, not to indexes themselves) and Difficulty Intermediate labels Feb 13, 2016

jreback mentioned this issue Feb 16, 2016

Weird assignment behaviour with MultiIndex #12343

Closed

lebedov closed this as completed Feb 22, 2017
