Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GH-15233: [C++] Correct thread contention in s3fs #34671

Closed
wants to merge 2 commits into from

Conversation

EpsilonPrime
Copy link
Contributor

@EpsilonPrime EpsilonPrime commented Mar 22, 2023

Avoids thread contention which can occur when using CopyFiles to copy files from a filesystem other than S3 to the S3FS. This is accomplished by not trying to complete the merging of a potentially multipart file asynchronously while already in an I/O thread (instead that work is completed in the already active thread).

@github-actions
Copy link

@github-actions
Copy link

⚠️ GitHub issue #15233 has been automatically assigned in GitHub to PR creator.

@github-actions github-actions bot added awaiting committer review Awaiting committer review and removed awaiting review Awaiting review labels Mar 22, 2023
@westonpace
Copy link
Member

I think CloseAsync is expected to act as an async method. By pushing the writes to the foreground it is no longer acting as an async method.

This is similar to the issues we ran into on the read path. The solution was to make everything async by default and the only time we wrap async with sync is at the API layer (e.g. on that first call from the user to a sync method). This allows us to keep one async path for all the "internals" and only mirror the two paths at the edges.

Instead, could we change CopyFile (and CopyFiles) to use async methods? Or create CopyFileAysnc and CopyFilesAsync (and change CopyFiles to return CopyFilesAsync.result()?

@amol-
Copy link
Member

amol- commented Apr 5, 2023

@EpsilonPrime are you still interested in moving this forward? It looks like there are some failing checks and some review feedbacks to address

@EpsilonPrime
Copy link
Contributor Author

@EpsilonPrime are you still interested in moving this forward? It looks like there are some failing checks and some review feedbacks to address

@amol- Yes, I'm still working on this but my alternate version of the solution (using futures throughout) is not yet working.

@zeroshade
Copy link
Member

@EpsilonPrime Just checking back in on this, are you still working on this and we should keep it open? Do you need any assistance in getting it working?

@EpsilonPrime
Copy link
Contributor Author

Solved by another PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Python] pyarrow.fs.copy_files hangs indefinitely
5 participants