Fix algorithms "newton", "gradient_descent" and "proximal_grad" with cupy inputs #87
Conversation
What do you mean by "it doesn't work"? Does it throw errors, or is the performance lower than expected?
It throws an error in this function, at line 148 specifically (see lines 137 to 149 at commit 69a947e).

Error message:
I'll wait for this PR to get in first: dask/dask#6680
Why wait? I don't think my draft PR impacts this one; plus, my change causes a bunch of other stuff to break, so there's likely to be a long discussion about what we should do instead. If this is working, I'd encourage you to get it in.
Thank you for the comment. This PR depends on this line:
Where is this line, sorry? (I didn't find it with a quick search across the dask repository.)
It is line 143 of the snippet referenced above (lines 137 to 149 at commit 69a947e).
Ah, thank you!
dask_glm/algorithms.py

```diff
@@ -97,7 +97,7 @@ def gradient_descent(X, y, max_iter=100, tol=1e-14, family=Logistic, **kwargs):
     stepSize = 1.0
     recalcRate = 10
     backtrackMult = firstBacktrackMult
-    beta = np.zeros(p)
+    beta = np.zeros_like(X._meta, shape=(p))
```
Suggested change:

```diff
- beta = np.zeros_like(X._meta, shape=(p))
+ beta = np.zeros_like(X._meta, shape=p)
```
I think we probably don't need the extra parens around `p` here and below.
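For context on why allocating via the meta array matters: `np.zeros_like` dispatches on the type of its first argument, and `X._meta` on a dask array is a zero-size array of the same type as X's chunks, so the allocation matches the input's backend. A minimal sketch, assuming dask is installed (the CuPy-backed case behaves analogously):

```python
import numpy as np
import dask.array as da

X = da.random.random((10, 3), chunks=(5, 3))
p = X.shape[1]

# X._meta is a zero-size array with the same type as X's chunks:
# numpy.ndarray here, cupy.ndarray for a CuPy-backed dask array.
beta = np.zeros_like(X._meta, shape=p)
print(type(beta))  # <class 'numpy.ndarray'>; cupy.ndarray for CuPy chunks
```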
```diff
- beta = regularizer.proximal_operator(obeta - stepSize * gradient, stepSize * lamduh)
- step = obeta - beta
+ beta = regularizer.proximal_operator(-stepSize * gradient + obeta, stepSize * lamduh)
+ step = -beta + obeta
```
I'm curious, why this change?
When `obeta` is a cupy array, cupy's `+`/`-` operators can't handle a dask array as the second operand in those cases. After the change, dask's `+`/`-` operators are used instead, and they work with a cupy array as the second operand.
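To illustrate the dispatch difference, a minimal sketch (assumes cupy and a GPU are available; variable names mirror the diff above):

```python
import cupy as cp
import dask.array as da

obeta = cp.zeros(4)  # plain cupy array
beta = da.from_array(cp.ones(4), chunks=2, asarray=False)  # cupy-backed dask array

# obeta - beta: Python calls cupy's __sub__ first, which (at the time of
# this PR) raised an error instead of returning NotImplemented and
# deferring to dask.
# -beta + obeta: dask's __neg__ and __add__ run instead, and dask handles
# the cupy array on the right-hand side.
step = -beta + obeta
```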
Maybe you could do `obeta = da.from_array(obeta)` before the computation?
It doesn't matter either way; either way is fine. Can we report this upstream to cupy? Presumably they need to return `NotImplemented` somewhere?
> It doesn't matter either way; either way is fine.

I know, I was just providing a more Dask-like solution.

> Can we report this upstream to cupy? Presumably they need to return `NotImplemented` somewhere?

I don't know exactly what error we get, but would reporting it change anything? Would returning `NotImplemented` really make things clear enough to be worth the bother?
If a Python object doesn't know what to do with the other operand in a binary operator, it can return `NotImplemented` (return, not raise), and then Python will try the reflected order (calling `other.__rsub__(self)`).
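A minimal, self-contained sketch of that fallback protocol (class names are illustrative, not from dask or cupy):

```python
class A:
    def __sub__(self, other):
        if not isinstance(other, A):
            return NotImplemented  # tell Python to try other.__rsub__(self)
        return "A.__sub__"

class B:
    def __rsub__(self, other):
        # Runs because A.__sub__ returned NotImplemented.
        return "B.__rsub__"

print(A() - A())  # A.__sub__
print(A() - B())  # B.__rsub__  (Python fell back to the reflected method)
```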
Thanks @mrocklin, I didn't know about that. I filed cupy/cupy#4118 to start the discussion on the CuPy side. I'm not sure, but maybe there's a technical reason not to do that, so let's wait to hear from the maintainers.
I've removed the WIP label. This PR seems good to me. Any objections to merging?
None from me.
It's great to see it took "just" some 17 months to get everything in place for this, since #75. :)
Thanks @daxiongshu! This is in.
Sometimes you plant a seed and it just takes a while to grow :)
This PR introduces minor changes that should have no effect on existing functionality, but they immediately enable `cupy` inputs for the `dask_glm.algorithms` functions `newton`, `gradient_descent`, and `proximal_grad`.