isready() blocks #7006

Closed
goszlanyi opened this issue May 28, 2014 · 15 comments
Labels
parallelism Parallel or distributed computation

Comments

@goszlanyi

isready(RemoteRef) is there for checking whether a long calculation is ready.

However, the following call to isready() blocks until the calculation is ready:

julia> r = @spawnat 2 (n=4000; a=rand(n,n); b=inv(a);)
julia> isready(r)
true

This makes some of my parallel code useless.
(tested using 0.3 prerelease on both 32-bit Windows and Ubuntu Nightly)

@ViralBShah
Member

Yes, isready should really just check and return a Bool immediately.

@amitmurthy
Contributor

The issue is that if the other worker is computation bound, no other code can execute, even the isready call. Till we have a background thread, at least for such checks (and also maybe heartbeats), the workaround is to have a local RemoteRef and have the computation populate that. The following should work.

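r=RemoteRef()    # local reference, so checking it never waits on the busy worker
@async put!(r, remotecall_fetch(2, ()->(n=4000; a=rand(n,n); inv(a))))
isready(r)       # returns immediately; false until the async task has stored the result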

@amitmurthy
Contributor

Or just have isready take a timeout value.

Just trolling @StefanKarpinski

:-)

@goszlanyi
Author

Thank you very much for the workaround, Amit!

(At the same time, I think the issue is still open.)

@JeffBezanson
Member

isready is a race condition and should not exist (by the time you get its answer, it might not be correct anymore). This should be done by waiting for the computation in an async task as @amitmurthy shows, but you can do anything you need to when it finishes: pinging another RemoteRef as shown, or setting a flag, printing a message, whatever.
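
A minimal sketch of the async-waiting idea described above, written against the current Distributed standard library rather than the 0.3-era API used in this thread. In the current API the function is the first argument to remotecall_fetch, and a local Channel plays the role of the local RemoteRef. The computation, the names `w` and `done`, and the printed message are illustrative assumptions.

```julia
using Distributed

# With no workers added (e.g. via addprocs), workers() returns [1],
# so this sketch also runs in a single local process.
w = first(workers())
done = Channel{Any}(1)      # local channel: checking it never waits on a busy worker

@async begin
    # Stand-in for a long computation on worker w (hypothetical workload)
    result = remotecall_fetch(() -> sum(abs2, 1:1000), w)
    println("computation on process $w finished")   # any reaction can go here
    put!(done, result)
end

value = take!(done)         # blocks only this task until the result arrives
```

The notification is whatever the async task does after remotecall_fetch returns, so nothing has to poll the remote side.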

@goszlanyi
Author

So you plan to remove isready completely?

The suggested workaround is already subtle and slightly verbose for me.
What is the best practice that is brief and works without isready?

Maybe a new macro?

@JeffBezanson
Member

Well, isready does not really fit with our model, which is event-driven. You should get notifications when something happens, not ask whether it has happened. Perhaps say a bit more about how you use it, and why you need it.

@goszlanyi
Author

I am not arguing for isready, I started using it because it was there.
I did not understand that its use means a race condition and is considered bad practice.
Now that I learned it, I will try to switch from polling to something event driven.

I would greatly appreciate good examples. The use case?
Fairly simple: hundreds of calculations taking different times on several cores,
trying to keep all cores busy by noticing when a given core is available.

@JeffBezanson
Member

For that, I recommend the dynamic scheduling pattern used by pmap. You have a task per worker, each of which waits for a worker to finish its current task and then gives it the next item.
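
The pattern can be sketched roughly as follows. This is an illustration in the current Distributed standard library, not the actual pmap source; `dynamic_map` is a hypothetical helper name, and the squaring function is only a placeholder workload.

```julia
using Distributed

# Hypothetical helper: apply f to every item, handing each worker a new
# item as soon as it finishes its previous one.
function dynamic_map(f, items)
    n = length(items)
    results = Vector{Any}(undef, n)
    next = 1                        # index of the next unclaimed item
    @sync for p in workers()        # one feeder task per worker
        @async while true
            idx = next              # claim an item; there is no yield point
            next += 1               # here, so two tasks never claim the same index
            idx > n && break
            # Blocks this feeder task (not the others) until worker p is done
            results[idx] = remotecall_fetch(f, p, items[idx])
        end
    end
    return results
end

# With no workers added, workers() returns [1] and everything runs locally.
squares = dynamic_map(x -> x^2, 1:10)
```

Because each feeder task waits on its own worker, a slow item delays only that worker, and the remaining items flow to whichever workers become free first.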

I think for this release we should deprecate isready.

@goszlanyi
Author

Thank you for the pmap suggestion.

As for the fate of isready, let the experts decide.
It seemed just so simple and straightforward.

@amitmurthy
Contributor

I am not for deprecating isready. The event-driven model is fine when everything is working. But when tasks take too long, for whatever reason (bugs, external dependencies, etc.), we do need a mechanism that makes it easy to detect that.
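
One possible way to detect a task that takes too long, sketched in the current Distributed standard library under the assumption that the result is delivered into a local Channel as in the workaround earlier in this thread. Base's `timedwait` polls a callback until it returns true or the timeout expires. The names `w` and `done`, the 5-second limit, and the sleeping workload are illustrative.

```julia
using Distributed

# With no workers added, workers() returns [1] and this runs locally.
w = first(workers())
done = Channel{Any}(1)      # local channel, so isready(done) returns immediately

# Stand-in for a computation that might hang (hypothetical workload)
@async put!(done, remotecall_fetch(() -> (sleep(0.2); :finished), w))

# Poll the local channel every 0.05 s, giving up after 5 s
status = timedwait(() -> isready(done), 5.0; pollint = 0.05)

if status == :ok
    println("result: ", take!(done))
else
    println("no result after 5 seconds; worker $w may be stuck")
end
```

This relies on the channel being local to the polling process. Polling the remote reference directly would run into the blocking behaviour this issue reports.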

The race condition with isready exists only when there are more than 2 readers/writers on the same RemoteRef. The application just needs to be aware of that.

As and when #6741 is ready, we should actually have IO (libuv) running in its own thread and all computation in their own. Like ZeroMQ does - it starts an independent internal I/O thread and all application 0MQ sockets communicate with it via an internal mechanism.

@goszlanyi
Author

So because my use case of isready was a single reader/single writer of the same RemoteRef,
I did not do anything wrong?

@amitmurthy
Contributor

Yes.
A simple example of the possible race condition with isready:

isready(r)    # say, returns true in process 1
isready(r)    # say, returns true in process 2
take!(r)      # gets value immediately in process 1
take!(r)      # blocks in process 2: between its isready and take!,
              # process 1 has already taken the value

@JeffBezanson
Member

Ok, we can keep it. We can warn in the docs that it might block, and suggest async waiting instead.

@ViralBShah
Member

I like keeping it and warning in the docs. How about putting the recipe described here in the manual for future users with the same question?


4 participants