Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SharedFd doesn't call wake() #122

Closed
FrankReh opened this issue Sep 28, 2022 · 5 comments
Closed

SharedFd doesn't call wake() #122

FrankReh opened this issue Sep 28, 2022 · 5 comments

Comments

@FrankReh
Copy link
Collaborator

FrankReh commented Sep 28, 2022

There's some plumbing in shared_fd.rs for tracking a waker but I don't see a call to wake(). There's a good chance a call to file.close().await would hang if the SharedFd was being shared with something else at the time.

I was going to submit a PR for this, but I took the route of supporting multiple close calls, so keeping a list of wakers, and it was causing lots of changes to the file and then I still hadn't figured out how to let multiple calls block on a close().await but already have the RC strong count not count them ... and I decided to punt on that and just log this issue. Maybe someone closer to the original code will see how to fix it up with minimal changes.

If you do want a File to be clonable and closable multiple times, let me know and I'll finish what I had started. Otherwise the fix can wait for a single call to close to be supported.

This isn't a blocker for my work so no great rush. I just ran into it while reading through more of the code base.

@FrankReh
Copy link
Collaborator Author

FrankReh commented Feb 7, 2023

Here's a reproducer.

use tokio_uring::fs;

fn main() {
    tokio_uring::start(async {
        let path = "/tmp/debug-122";
        let data = vec![0; 128 * 1024 * 1024];

        let file = fs::File::create(path).await.unwrap();

        let future = file.write_at(data, 0);

        // Create a way to get the first part of the future run before having it dropped.
        // This can then show the hang when trying to close.
        tokio::select! {
            biased;
            _ = future => {
                println!("future finished");
            }
            _ = async {} => {
                println!("easy finished");
            }
        }

        println!("calling file.close().await");
        let _ = file.close().await;
        println!("done waiting for file.close().await");
    });
}

@FrankReh
Copy link
Collaborator Author

FrankReh commented Feb 7, 2023

I think I know enough to offer a fix tonight or tomorrow. If someone wants to jump in in the next few hours, be my guest.

@thomasbarrett
Copy link

thomasbarrett commented Feb 8, 2023

Interesting @FrankReh . So this example does not currently work with my "Simplify SharedFd close logic branch". I get a panic! because there is in-flight IO (which I have been asserting is impossible). It appears that the tokio::select! macro requires selected futures to be "Cancellation Safe". As far as I can tell, this means that tokio::select! drops the future rather than waiting for it to complete. It does not appear that tokio_uring operations are currently "cancellation safe" (in either the current implementation for SharedFd or my proposed implementation).

@thomasbarrett
Copy link

I have tracked the root of this "Cancellation un-Safety" down to the behavior of Driver::remove_op. When an Op is dropped, if it is still in progress, it is moved to a Lifecycle::Ignored state rather than deleted. This means that a deleted Op still holds a reference to a SharedFd after it is dropped. This is kind of weird behavior that I didn't anticipate, but not necessarily incorrect.

@FrankReh
Copy link
Collaborator Author

FrankReh commented Feb 8, 2023

I have a fix for this. Will submit the PR for this in the morning.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants