Enable to restrict columns for pandas.read_parquet #18154

Closed
hoffmann opened this issue Nov 7, 2017 · 3 comments · Fixed by #18155

Comments

@hoffmann
Contributor

hoffmann commented Nov 7, 2017

Problem description

In pandas 0.21 the top-level function read_parquet() was introduced. Both available engines, fastparquet and pyarrow, support specifying which columns to read. If you are only interested in certain columns of a DataFrame, this reduces the I/O.

It should also be possible to specify the columns in pandas.read_parquet().

@gfyoung
Member

gfyoung commented Nov 7, 2017

Sounds good to me!

@jreback
Contributor

jreback commented Nov 7, 2017

This is actually quite trivial: we just need to pass kwargs through on the read. Then you can specify columns=, which we could document as a formal kwarg.

PRs welcome!
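
A minimal sketch of the pass-through idea, using a stand-in engine rather than pandas' real internals. `FakeEngine` and this `read_parquet` wrapper are hypothetical names invented for the example; the actual change may be structured differently.

```python
class FakeEngine:
    """Stand-in for a parquet engine wrapper; not pandas' actual class."""

    def __init__(self, table):
        self.table = table  # dict mapping column name -> list of values

    def read(self, path, columns=None, **kwargs):
        # A real engine would read `path`; this one only filters an
        # in-memory table so the example stays self-contained.
        names = list(self.table) if columns is None else columns
        return {name: self.table[name] for name in names}


def read_parquet(path, engine, columns=None, **kwargs):
    """Hypothetical top-level reader that forwards options to the engine."""
    # `columns` is a documented keyword; any other engine-specific
    # options travel through **kwargs untouched.
    return engine.read(path, columns=columns, **kwargs)


engine = FakeEngine({"a": [1, 2], "b": [3, 4], "c": [5, 6]})
full = read_parquet("ignored.parquet", engine)
subset = read_parquet("ignored.parquet", engine, columns=["a", "c"])
```

Here `columns` is an explicit keyword of the wrapper, while anything else is handed to the engine as-is, which is the design choice the comment above suggests.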

jreback modified the milestones: Next Major Release, 0.21.1 Nov 7, 2017
@jorisvandenbossche
Member

More generally, should we pass **kwargs through to the actual engine call?
