feat: add cancellable queries #13178

Merged
merged 7 commits into from
Dec 23, 2023

ritchie46
Member

@ritchie46 ritchie46 commented Dec 21, 2023

This gives you an InProcessQuery handle that allows you to cancel a running query:

import polars as pl
import time

def sleep(df: pl.DataFrame) -> pl.DataFrame:
    time.sleep(1)
    return df
    
ipq = pl.LazyFrame({"a": [1]}).map_batches(sleep).map_batches(sleep).collect(background=True)

# do some other work in the meantime

# check if query is finished
# Note that if you use Python UDFs, you actually have to call this sometimes (or fetch_blocking)
# to give the background query a chance to acquire the GIL. As long as you hold it, we cannot run Python UDFs.
# So minimize lambdas. :)
out = ipq.fetch()

if out is not None:
    print("query finished", out)

# we can cancel the query if we need to
# note that deleting the `InProcessQuery` object will also cancel it, so keep it alive
if condition():  # `condition` is a placeholder for your own check
    ipq.cancel()
    
# do more work

# nothing left to do, let's block the thread until the query is finished
ipq.fetch_blocking()
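
The cancel step above can be tied to a deadline. The following is a minimal sketch, not Polars code: `ToyQuery` is a hypothetical stand-in that mimics only the `fetch()` and `cancel()` behaviour described in this PR, using a plain Python thread so the snippet runs without Polars. `run_with_deadline` is likewise an invented helper name.

```python
import threading
import time


class ToyQuery:
    """Hypothetical stand-in for InProcessQuery (not the Polars implementation)."""

    def __init__(self, work_seconds: float) -> None:
        self._cancelled = threading.Event()
        self._lock = threading.Lock()
        self._result = None
        self._thread = threading.Thread(
            target=self._run, args=(work_seconds,), daemon=True
        )
        self._thread.start()

    def _run(self, work_seconds: float) -> None:
        # Event.wait returns True if cancel() was called before the timeout,
        # in which case no result is stored.
        if not self._cancelled.wait(timeout=work_seconds):
            with self._lock:
                self._result = "result"

    def fetch(self):
        """Return the result if it is ready, otherwise None."""
        with self._lock:
            return self._result

    def cancel(self) -> None:
        self._cancelled.set()


def run_with_deadline(query, deadline_seconds: float, poll_interval: float = 0.01):
    """Poll query.fetch(); cancel and return None once the deadline passes."""
    start = time.monotonic()
    while time.monotonic() - start < deadline_seconds:
        out = query.fetch()
        if out is not None:
            return out
        time.sleep(poll_interval)  # real code would do other work here
    query.cancel()
    return None


fast = run_with_deadline(ToyQuery(work_seconds=0.05), deadline_seconds=2.0)
slow = run_with_deadline(ToyQuery(work_seconds=5.0), deadline_seconds=0.1)
print(fast, slow)
```

With Polars, the object returned by `collect(background=True)` would presumably take the place of `ToyQuery`, assuming its `fetch()` and `cancel()` behave as described above.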

@ritchie46 ritchie46 marked this pull request as draft December 21, 2023 15:01
@github-actions github-actions bot added enhancement New feature or an improvement of an existing feature python Related to Python Polars rust Related to Rust Polars labels Dec 21, 2023
@ritchie46 ritchie46 changed the title feat: add cancelable queries feat: add cancellable queries Dec 22, 2023
@ritchie46 ritchie46 marked this pull request as ready for review December 22, 2023 19:05
@ritchie46
Member Author

@maartenbreddels I think this should work.

@maartenbreddels

Nice! I’ll take a look after the holidays, certainly makes it fit well with solara!

@stinodego
Contributor

stinodego commented Dec 22, 2023

Perhaps an even clearer parameter name for this would be run_in_background, rather than background. A few characters longer, but immediately obvious.

Maybe the returned object could be named QueryInProcess rather than InProcessQuery - I like having Query be the first/main thing in the name.

@deanm0000
Collaborator

Suppose you did sleep 60. What's ipq return before a minute is up?

I'm imagining something like

while True:
    # do something else
    if ipq.done():
        break

@ritchie46
Member Author

ritchie46 commented Dec 23, 2023

> Suppose you did sleep 60. What's ipq return before a minute is up?

fetch will return None if it isn't ready. Note that it doesn't work super nicely with Python UDFs because of the GIL, but normal Polars queries will actually run in the background.
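
Since `fetch` returns `None` until the result is ready, the polling loop imagined earlier in the thread can be written without a `done()` method. A minimal sketch, assuming only that behaviour: `StubQuery` is a hypothetical stand-in (it becomes ready after a fixed number of `fetch` calls) so the example runs without Polars, and `poll` is an invented helper name.

```python
from typing import Callable, Optional, Tuple


class StubQuery:
    """Hypothetical stand-in: fetch() returns None for the first `ready_after` calls."""

    def __init__(self, ready_after: int, result: object) -> None:
        self._remaining = ready_after
        self._result = result

    def fetch(self) -> Optional[object]:
        if self._remaining > 0:
            self._remaining -= 1
            return None
        return self._result


def poll(query, other_work: Callable[[], None]) -> Tuple[object, int]:
    """Interleave other_work() with fetch() until a result arrives."""
    rounds = 0
    while True:
        out = query.fetch()
        if out is not None:
            return out, rounds
        other_work()
        rounds += 1


log = []
result, rounds = poll(StubQuery(ready_after=3, result="done"), lambda: log.append("tick"))
print(result, rounds, len(log))
```

Checking `fetch()` on every iteration also fits the note about Python UDFs above, since the query is given regular opportunities to make progress.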

@ritchie46
Member Author

> Perhaps an even clearer parameter name for this would be run_in_background, rather than background

I feel that that's almost a sentence that belongs in the docstring.

@ritchie46 ritchie46 merged commit 9617dc5 into main Dec 23, 2023
26 checks passed
@ritchie46 ritchie46 deleted the background branch December 23, 2023 10:21
@c-peters c-peters added the accepted Ready for implementation label Dec 26, 2023