Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add task/worker pool similar to Domainslib task pool #512

Closed
Tracked by #388
darrenldl opened this issue May 9, 2023 · 9 comments
Closed
Tracked by #388

Add task/worker pool similar to Domainslib task pool #512

darrenldl opened this issue May 9, 2023 · 9 comments
Milestone

Comments

@darrenldl
Copy link

Thanks for your work first of all, it's been nice to use.

The Domainslib integration as described in README works fine, but working with mutexes then becomes a bit complicated.

  • Invoking Eio.Mutex.lock etc within a Domainslib (or just non-Eio I guess) domain causes it to stop progressing
    • Not unexpected, but was not straightforward to track down, and Eio data structure documentation doesn't seem warn about this
  • Swapping to stdlib's Mutex works, but then suffers from the problem of locking the entire domain needlessly if used on the Eio side

Right now it seems that there are enough components to replicate task pool

  • Run a few background workers upon pool construction using Fiber.fork and Domain_manager.run
  • Construct promises, embed within a closure and pass it onto work stream (unit -> unit) Eio.Stream.t (same as domainslib example in readme)

And it would be very nice to be able to stay in Eio domains for parallel worker pools.

I'm happy to draft a PR in next few days if the addition is not deemed problematic.

@Sudha247
Copy link
Contributor

Sudha247 commented May 9, 2023

Hi @darrenldl, thanks for bringing this up. Would domain-local-await work for you, for domainslib integration?

@talex5 talex5 mentioned this issue May 9, 2023
25 tasks
@talex5
Copy link
Collaborator

talex5 commented May 9, 2023

I'm happy to draft a PR in next few days if the addition is not deemed problematic.

Yes, that would be good. We have an item in the 1.0 roadmap for this (which I've now linked to this issue).

As @Sudha247 mentioned, it is possible to make mutexes work over multiple schedulers (this support will be in Eio 0.10), but we do want a worker pool in Eio too. There's an example showing how to do this in the README already (https://github.com/ocaml-multicore/eio#example-worker-pool) but we should agree on a proper API for it.

We might want to make an Eio.Pool abstraction first, and then build worker pools on top of that, but we can prototype one that starts all domains immediately without that.

@darrenldl
Copy link
Author

Thanks @Sudha247 @talex5 !

We have an item in the 1.0 roadmap for this (which I've now linked to this issue).

Ah I must have missed it, oops, my bad.

We might want to make an Eio.Pool abstraction first, and then build worker pools on top of that, but we can prototype one that starts all domains immediately without that.

Could you elaborate on that? I'm confused by what Eio.Pool should do by itself - is it a pool of fibers?

@talex5
Copy link
Collaborator

talex5 commented May 9, 2023

Eio.Pool would be similar to Lwt_pool - a generic way to hold resources, creating them as needed (up to a limit). But we can add that later.

@darrenldl
Copy link
Author

darrenldl commented May 9, 2023

I see, seems like a good skeleton to base the problem on indeed. Thanks for the pointer!

@patricoferris
Copy link
Collaborator

Eio.Pool would be similar to Lwt_pool - a generic way to hold resources, creating them as needed (up to a limit). But we can add that later.

I think we can do a nicer implementation with the low-level primitives in Eio, but I did hastily port this whilst porting Irmin in case it is useful https://github.com/patricoferris/irmin/blob/direct-style/src/irmin-fs/unix/eio_pool.ml

@darrenldl
Copy link
Author

Ooo neat! Will take a closer look when I actually begin the PR (if no one has beaten me to it by then).

At first glance it seems that acquire has a time-of-check time-of-use race condition if environment is parallel.

@polytypic
Copy link
Contributor

Related to making synchronization primitives work across schedulers, you might be interested in this example of how one might setup interop between Eio and Domainslib using the kcas library (that uses the domain-local-await mentioned previously).

@balat balat added this to the Eio 1.0 milestone May 12, 2023
@talex5
Copy link
Collaborator

talex5 commented Nov 23, 2023

Added by #639.

@talex5 talex5 closed this as completed Nov 23, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants