df.to_dict() accepts orient which are not in the list of options #32515

elmonsomiat · 2020-03-07T12:07:45Z

Code Sample, a copy-pastable example if possible

import pandas as pd

df = pd.DataFrame(
    {'name': ['alice', 'bob'],
     'age': [30, 28]})

df.to_dict(orient='racoon')

returns:

[{'age': 30, 'name': 'alice'}, {'age': 28, 'name': 'bob'}]

instead of raising a ValueError.

Problem description

The current version of pandas accepts as orient any string that starts with d, l, sp, s, r or i (matched case-insensitively), instead of only the options listed in the documentation.

The function should raise a ValueError if orient is not one of the following: {'dict', 'list', 'series', 'split', 'records', 'index'}.
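To make the prefix matching concrete, here is a minimal pure-Python sketch that mimics the order of the startswith() checks quoted further down in this issue. The helper name resolve_orient_by_prefix is hypothetical and not part of pandas; it only illustrates which documented option a given string would be routed to.

```python
def resolve_orient_by_prefix(orient: str) -> str:
    """Return the documented option that a prefix-matching chain would select.

    Hypothetical helper for illustration only; it is not a pandas function.
    """
    lowered = orient.lower()
    # The order matters: "sp" has to be tested before "s",
    # exactly as in the chain of elif branches quoted in this issue.
    prefix_to_option = (
        ("d", "dict"),
        ("l", "list"),
        ("sp", "split"),
        ("s", "series"),
        ("r", "records"),
        ("i", "index"),
    )
    for prefix, option in prefix_to_option:
        if lowered.startswith(prefix):
            return option
    raise ValueError(f"orient '{orient}' not understood")


if __name__ == "__main__":
    # Each of these undocumented strings is silently mapped to a real option.
    for value in ("racoon", "spam", "salad", "Index", "rows"):
        print(value, "->", resolve_orient_by_prefix(value))
```

Under this scheme only strings that begin with none of the six prefixes reach the ValueError branch.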

Leaving this unfixed can lead to mistakes: a mistyped orient may silently select an unintended output format. I would not consider this a bug, but the behaviour should be consistent with other methods such as df.to_json(). The part of the code that needs to be fixed in master is:

        if orient.lower().startswith("d"):
            return into_c((k, v.to_dict(into)) for k, v in self.items())
        elif orient.lower().startswith("l"):
            return into_c((k, v.tolist()) for k, v in self.items())
        elif orient.lower().startswith("sp"):
            return into_c(
                (
                    ("index", self.index.tolist()),
                    ("columns", self.columns.tolist()),
                    (
                        "data",
                        [
                            list(map(com.maybe_box_datetimelike, t))
                            for t in self.itertuples(index=False, name=None)
                        ],
                    ),
                )
            )
        elif orient.lower().startswith("s"):
            return into_c((k, com.maybe_box_datetimelike(v)) for k, v in self.items())
        elif orient.lower().startswith("r"):
            columns = self.columns.tolist()
            rows = (
                dict(zip(columns, row))
                for row in self.itertuples(index=False, name=None)
            )
            return [
                into_c((k, com.maybe_box_datetimelike(v)) for k, v in row.items())
                for row in rows
            ]
        elif orient.lower().startswith("i"):
            if not self.index.is_unique:
                raise ValueError("DataFrame index must be unique for orient='index'.")
            return into_c(
                (t[0], dict(zip(self.columns, t[1:])))
                for t in self.itertuples(name=None)
            )
        else:
            raise ValueError(f"orient '{orient}' not understood")

The fix is to replace each .lower().startswith() call with an == comparison against the corresponding full option string.
I will now open a PR to fix the issue. The bug is still present in master.
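The proposal above amounts to an exact-match check against the documented options. The following is a minimal sketch of that idea, not the change that was eventually merged in pandas; VALID_ORIENTS and check_orient are hypothetical names, and the comparison is assumed to be case-sensitive, which is stricter than the .lower() calls in the quoted code.

```python
# The six options listed in the problem description of this issue.
VALID_ORIENTS = frozenset({"dict", "list", "series", "split", "records", "index"})


def check_orient(orient: str) -> str:
    """Return orient unchanged if it is a documented option, else raise.

    Hypothetical helper for illustration only; it is not a pandas function.
    """
    if orient not in VALID_ORIENTS:
        # Same message format as the final else branch in the quoted code.
        raise ValueError(f"orient '{orient}' not understood")
    return orient


if __name__ == "__main__":
    print(check_orient("records"))
    try:
        check_orient("racoon")
    except ValueError as exc:
        print("rejected:", exc)
```

With a check like this, short names such as "r" are rejected as well, so code that relied on them would need to switch to the full option names.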


Expected Output

Output of `pd.show_versions()`

INSTALLED VERSIONS

commit: None
python: 3.5.2.final.0
python-bits: 64
OS: Linux
OS-release: 4.15.0-88-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_GB.UTF-8
LOCALE: en_GB.UTF-8

pandas: 0.24.2
pytest: None
pip: 19.3.1
setuptools: 41.6.0
Cython: None
numpy: 1.17.4
scipy: None
pyarrow: None
xarray: None
IPython: 7.9.0
sphinx: None
patsy: None
dateutil: 2.8.1
pytz: 2019.3
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml.etree: None
bs4: None
html5lib: None
sqlalchemy: 1.3.11
pymysql: 0.9.3
psycopg2: None
jinja2: 2.10.3
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
gcsfs: None


ShaharNaveh · 2020-03-07T12:13:01Z

take

elmonsomiat · 2020-03-07T12:21:00Z

@MomIsBestFriend will put a PR up now

ShaharNaveh · 2020-03-07T12:31:20Z

@elmonsomiat You want to take a shot at it?

You can assign yourself by commenting the word take (and nothing else, just take).

elmonsomiat · 2020-03-07T12:32:04Z

take

elmonsomiat · 2020-03-07T12:32:32Z

@MomIsBestFriend Thanks! this is my first contribution to pandas, let's try it out! :)

ShaharNaveh · 2020-03-07T12:35:30Z

@elmonsomiat Good luck!

I'm here if you need a hint :)

MillanSharma · 2020-03-13T18:59:58Z

take

elmonsomiat · 2020-03-13T20:05:45Z

take

elmonsomiat · 2020-03-13T20:06:08Z

@MillanSharma there is a PR (#32516) awaiting approval for this issue

luggie · 2020-12-07T18:03:09Z

df.to_dict(orient='records')
df.to_dict('records')
both raise the following with pandas 1.1.4:
Using short name for 'orient' is deprecated. Only the options: ('dict', list, 'series', 'split', 'records', 'index') will be used in a future version. Use one of the above to silence this warning.

simonjayhawkins · 2020-12-12T17:59:19Z

@luggie have you got a minimal reproducible example? If so please open a new issue.

I'm not seeing any warnings or exceptions raised.

>>> import pandas as pd
>>> pd.__version__
'1.1.4'
>>>
>>> df = pd.DataFrame(
...     {'name': ['alice', 'bob'],
...      'age': [30, 28]})
>>>
>>> df.to_dict('records')
[{'name': 'alice', 'age': 30}, {'name': 'bob', 'age': 28}]
>>>
>>> df.to_dict(orient='records')
[{'name': 'alice', 'age': 30}, {'name': 'bob', 'age': 28}]
>>>
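One way to turn a report like this into a minimal reproducible example is to record every warning a call emits and inspect the result. The sketch below uses only the standard-library warnings module; call_and_collect_warnings and the stand-in function legacy_to_dict are hypothetical names introduced for illustration, and the stand-in only imitates a deprecation warning rather than reproducing pandas behaviour.

```python
import warnings


def call_and_collect_warnings(func, *args, **kwargs):
    """Call func and return (result, [(category name, message), ...])."""
    with warnings.catch_warnings(record=True) as caught:
        # Record every warning, even ones that default filters would hide.
        warnings.simplefilter("always")
        result = func(*args, **kwargs)
    return result, [(w.category.__name__, str(w.message)) for w in caught]


def legacy_to_dict(orient):
    """Hypothetical stand-in that warns on anything but a full option name."""
    full_names = {"dict", "list", "series", "split", "records", "index"}
    if orient not in full_names:
        warnings.warn(
            "Using short name for 'orient' is deprecated.",
            FutureWarning,
            stacklevel=2,
        )
    return orient


if __name__ == "__main__":
    print(call_and_collect_warnings(legacy_to_dict, "records"))
    print(call_and_collect_warnings(legacy_to_dict, "r"))
    # With pandas installed, the same helper can wrap the real method:
    # call_and_collect_warnings(df.to_dict, "records")
```

An empty list of recorded warnings for the full option name, together with the installed version, is the kind of evidence a new issue would need.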


github-actions bot assigned ShaharNaveh Mar 7, 2020

ShaharNaveh added the DataFrame (DataFrame data structure) and good first issue labels Mar 7, 2020

ShaharNaveh removed their assignment Mar 7, 2020

github-actions bot assigned elmonsomiat Mar 7, 2020

elmonsomiat mentioned this issue Mar 7, 2020

Deprecate Aliases as orient Argument in DataFrame.to_dict #32516

Merged

github-actions bot assigned MillanSharma Mar 13, 2020

jreback added this to the 1.1 milestone Mar 14, 2020

WillAyd closed this as completed in #32516 Mar 14, 2020

jennydaman added a commit to jennydaman/codecarbon that referenced this issue Jul 23, 2023

Fix to_dict("records")

87d67b1

to_dict("rows") is not a valid value for parameter orient. pandas-dev/pandas#32515

jennydaman mentioned this issue Jul 23, 2023

Fix to_dict("records") mlco2/codecarbon#432

Closed
