Implement .collect() as alias for .clone() for DataFrame #13549

Closed
erikamundson opened this issue Jan 8, 2024 · 3 comments
Labels
enhancement New feature or an improvement of an existing feature

Comments


erikamundson commented Jan 8, 2024

Description

It would be convenient to allow calls to .collect() on a DataFrame to return a copy (same behavior as .clone()).

Currently, both DataFrame and LazyFrame expose .lazy(), so we can write a function that accepts either a LazyFrame or a materialized DataFrame and returns a lazy version of that object.

from polars import DataFrame, LazyFrame

def ensure_lazy(frame: DataFrame | LazyFrame) -> LazyFrame:
    return frame.lazy()

It would be useful to also be able to write a similar function that always returns a materialized DataFrame.

def ensure_materialized(frame: DataFrame | LazyFrame) -> DataFrame:
    return frame.collect()

We can work around this today by calling frame.lazy().collect(), but converting a DataFrame to a LazyFrame and back may add overhead compared with cloning the DataFrame directly.
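For reference, a helper that skips the lazy round trip can already be written in user code. The sketch below is duck-typed rather than tied to polars types: it assumes that only lazy frames expose .collect() and that eager frames expose .clone(), as described above. The Any annotations are a simplification so the snippet runs without importing polars, and whether it is actually faster than frame.lazy().collect() has not been measured here.

```python
from typing import Any


def ensure_materialized(frame: Any) -> Any:
    """Return an eager frame: collect lazy inputs, clone eager ones.

    Duck-typed sketch. Assumes LazyFrame-like objects expose .collect()
    and DataFrame-like objects expose .clone() but not .collect().
    """
    collect = getattr(frame, "collect", None)
    if callable(collect):
        # Lazy input: execute the query plan and return the result.
        return collect()
    # Eager input: copy directly, without converting to lazy and back.
    return frame.clone()
```

With polars installed, this could be called as ensure_materialized(df) or ensure_materialized(df.lazy()); either call should yield a DataFrame.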

Happy to give this a go myself if approved.

erikamundson added the enhancement label on Jan 8, 2024
Contributor

Wainberg commented Jan 9, 2024

I mean, if lf.lazy() is a lazyframe, then df.collect() should be a dataframe.

Collaborator

MarcoGorelli commented Jan 9, 2024

I think this was discussed and rejected, could you search the issue tracker please?

EDIT: here it is #7882 (comment)

@erikamundson
Author

Ah sorry @MarcoGorelli I missed that when searching. Will close this one.

erikamundson closed this as not planned on Jan 9, 2024