
Break out inner loop of optimize_layout so joblib can parallelize it (0.4dev version) #292

Closed

Conversation

tomwhite
Collaborator

This is the 0.4dev version of #255. Probably best to review this one.

If no random seed is set then optimize_layout will use all cores on the machine, otherwise it will use a single core for reproducibility.
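A minimal sketch of that policy, assuming joblib's `Parallel`/`delayed` API; the function and chunking below are illustrative stand-ins, not the PR's actual `optimize_layout` code:

```python
from joblib import Parallel, delayed
import numpy as np

def optimize_layout_sketch(data, random_seed=None):
    # Policy described above: a set seed forces a single worker
    # (deterministic order of work); no seed uses all cores (n_jobs=-1).
    n_jobs = 1 if random_seed is not None else -1
    chunks = np.array_split(data, 4)

    def inner(chunk):
        # Hypothetical stand-in for the per-chunk optimization step.
        return chunk.sum()

    results = Parallel(n_jobs=n_jobs)(delayed(inner)(c) for c in chunks)
    return sum(results)
```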

@tomwhite tomwhite requested a review from lmcinnes September 11, 2019 11:11
@pep8speaks

Hello @tomwhite! Thanks for opening this PR. We checked the lines you've touched for PEP 8 issues, and found:

Line 262:80: E501 line too long (86 > 79 characters)
Line 299:80: E501 line too long (116 > 79 characters)

Line 1892:80: E501 line too long (99 > 79 characters)

parallel(parallel_calls(inner_fn, n_tasks))


@numba.njit(fastmath=True, nogil=True)
Collaborator

I am not super deep into what you are doing here, so bear with me if my question makes no sense.
Why can't you use @numba.njit(parallel=True) at line 129 instead of using nogil=True and joblib?

Collaborator Author

Your question makes a lot of sense. The answer is to do with random number seeding and reproducibility: basically Numba parallel doesn't allow controlled seeding across threads so it's not possible to get a reproducible result. With this code (using joblib), if you don't set a seed it will use all cores, otherwise it will use a single core and the result will be deterministic. See discussion in #231 for why that is not achievable with Numba parallel.

Collaborator

You could just toggle the decorator.

import numba
import numpy as np

def searchsorted(a, v):
    indices = np.empty(v.shape, dtype=np.int64)
    for i in numba.prange(v.shape[0]):
        indices[i] = np.searchsorted(a[i], v[i])
    return indices

# No numba, no parallelism: prange falls back to a plain range
searchsorted(np.array([[4, 3, 2, 5]]), np.array([[1]]))

# Compiled with numba; parallel=True is needed for prange to run in parallel
parallel_searchsorted = numba.njit(parallel=True)(searchsorted)

parallel_searchsorted(np.array([[4, 3, 2, 5]]), np.array([[1]]))

Collaborator

This way you can also toggle parallel on or off programmatically by passing it to njit.
See https://github.com/numba/numba/blob/master/numba/decorators.py#L41

Collaborator Author

@sleighsoft you are right, this would be a better approach, since it's shorter and doesn't need a new dependency (joblib). I actually investigated this approach a while ago, but it got lost in the discussion in #231. I think the key point (which I missed before) was that we can use numba parallel if no seed is set (to get speed), but disable parallel if a seed is set (to get reproducibility).

I've opened #294 with the suggested approach.

@tomwhite
Collaborator Author

Closing in favour of #294

@tomwhite tomwhite closed this Sep 12, 2019