Zarr Sink Multiprocessing #1551

Merged: 8 commits merged into master from zarr-sink-concurrency on Jul 16, 2024


annehaley
Collaborator

In addition to the zarr sink being compatible with multithreading, this PR makes the zarr sink compatible with multiprocessing. This is accomplished by the following:

  • splitting the addLock into threadLock and processLock (each a different class)
  • changing how a zarr sink is initialized from a path; the location is now more predictable because the store is rooted in the system temp directory (tempfile.gettempdir()) rather than a generated tempfile.TemporaryDirectory(), and it is opened with "append" mode via zarr
  • removing the clause to set self.unpickleable = True
  • adding a pytest test that checks the behavior of both the multithreading and multiprocessing cases
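
The lock split in the first bullet can be pictured with a minimal sketch. This is not the sink's actual code: the helper name `TwoLevelLock`, its `held()` method, and the choice of `threading.RLock` and `multiprocessing.Lock` as the two lock classes are assumptions made for illustration.

```python
import multiprocessing
import threading
from contextlib import contextmanager


class TwoLevelLock:
    """Hypothetical helper: one lock per concurrency level."""

    def __init__(self):
        # Guards against other threads in the same process.
        self.thread_lock = threading.RLock()
        # Guards against other processes that share this lock object.
        self.process_lock = multiprocessing.Lock()

    @contextmanager
    def held(self):
        # Acquire both locks explicitly, thread lock first.
        with self.thread_lock:
            with self.process_lock:
                yield


lock = TwoLevelLock()
counter = {"value": 0}


def bump(n):
    """Increment the shared counter n times inside the critical section."""
    for _ in range(n):
        with lock.held():
            counter["value"] += 1


threads = [threading.Thread(target=bump, args=(1000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Only the thread-level case is exercised here. Sharing the process lock across worker processes depends on how those processes are started, which this sketch leaves out.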

One caveat remains: the size of the whole image must be known before any multiprocessing or multithreading begins, because the image size cannot be changed within a thread or process. The test reflects this: the last tile in the tileset (at the maximum extents) is added first, then all other tiles are added within threads or processes.
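
The ordering described above can be sketched with a toy stand-in. The example below uses a plain binary file instead of a zarr store and does not call large_image at all; the file name, the tile size, and the `write_tile` helper are hypothetical. It only illustrates the pattern: write the tile at the maximum extent first so the full size is fixed, then fill in the remaining tiles concurrently.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

TILE = 4   # bytes per tile (toy size, assumption)
TILES = 8  # number of tiles in the toy "image" (assumption)

# Predictable location under the system temp directory (hypothetical name).
path = os.path.join(tempfile.gettempdir(), "zarr_sink_sketch.bin")


def write_tile(index):
    """Write one tile at its fixed offset in the backing file."""
    with open(path, "r+b") as f:
        f.seek(index * TILE)
        f.write(bytes([index]) * TILE)


# Step 1: create the store, then add the LAST tile so the full extent is set.
with open(path, "wb"):
    pass
write_tile(TILES - 1)
full_size = os.path.getsize(path)

# Step 2: add all other tiles concurrently; none of them changes the size.
with ThreadPoolExecutor(max_workers=4) as pool:
    list(pool.map(write_tile, range(TILES - 1)))

with open(path, "rb") as f:
    data = f.read()
os.remove(path)
```

Threads are used to keep the sketch short; per the description above, the same last-tile-first ordering applies when the remaining tiles are added from separate processes.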

@annehaley requested a review from @manthey on July 11, 2024, 19:08
Member

@manthey left a comment


Excellent. @annehaley I know you are working on it, but the next step is making sure we document how to use this. I'd like to see an example either in the example_usage document (or somewhere similar) or as a Jupyter notebook.

@annehaley merged commit 9f33a36 into master on Jul 16, 2024
16 checks passed
@annehaley deleted the zarr-sink-concurrency branch on July 16, 2024, 17:31