
Fix algorithms "newton", "gradient_descen" and "proximal_grad" with cupy inputs #87

Merged
merged 6 commits into from
Oct 9, 2020

Conversation

daxiongshu
Contributor

This PR introduces minor changes that should have no effect on existing functionality, but they immediately enable cupy inputs for the dask-glm.algorithms "newton", "gradient_descen" and "proximal_grad".

@daxiongshu daxiongshu changed the title Fix algorithms "newton", "gradient_descen" and "proximal_grad" with cupy inputs [WIP] Fix algorithms "newton", "gradient_descen" and "proximal_grad" with cupy inputs Sep 26, 2020
@daxiongshu
Contributor Author

fit_intercept=True doesn't work with cupy for now; I will work on it.

@stsievert
Member

What do you mean by "it doesn't work"? Does it throw errors, or is performance lower than expected?

@daxiongshu
Contributor Author

It throws an error in this function, at line 148 specifically. When X is a dask array with cupy chunks, o is a dask array with numpy chunks, so da.concatenate([X, o], axis=1) throws an error.

dask-glm/dask_glm/utils.py

Lines 137 to 149 in 69a947e

@dispatch(da.Array)
def add_intercept(X):
    if np.isnan(np.sum(X.shape)):
        raise NotImplementedError("Can not add intercept to array with "
                                  "unknown chunk shape")
    j, k = X.chunks
    o = da.ones((X.shape[0], 1), chunks=(j, 1))
    if is_dask_array_sparse(X):
        o = o.map_blocks(sparse.COO)
    # TODO: Needed this `.rechunk` for the solver to work
    # Is this OK / correct?
    X_i = da.concatenate([X, o], axis=1).rechunk((j, (k[0] + 1,)))
    return X_i

Error message:

~/rapids/daskml_cupy/dask-glm/dask_glm/utils.py in add_intercept(X)
    146     # TODO: Needed this `.rechunk` for the solver to work
    147     # Is this OK / correct?
--> 148     X_i = da.concatenate([X, o], axis=1).rechunk((j, (k[0] + 1,)))
    149     return X_i
    150 

~/anaconda3/envs/rapids0.15/lib/python3.7/site-packages/dask/array/core.py in concatenate(seq, axis, allow_unknown_chunksizes)
   3497         type(max(seq_metas, key=lambda x: getattr(x, "__array_priority__", 0)))
   3498     )
-> 3499     meta = _concatenate(seq_metas, axis=axis)
   3500 
   3501     # Promote types to match meta

~/anaconda3/envs/rapids0.15/lib/python3.7/site-packages/cupy/manipulation/join.py in concatenate(tup, axis)
     52         tup = [m.ravel() for m in tup]
     53         axis = 0
---> 54     return core.concatenate_method(tup, axis)
     55 
     56 

cupy/core/_routines_manipulation.pyx in cupy.core._routines_manipulation.concatenate_method()

cupy/core/_routines_manipulation.pyx in cupy.core._routines_manipulation.concatenate_method()

TypeError: Only cupy arrays can be concatenated

@daxiongshu
Contributor Author

daxiongshu commented Sep 28, 2020

I'll wait for this PR to get in first: dask/dask#6680

@GenevieveBuckley

I'll wait for this PR to get in first: dask/dask#6680

Why wait? I don't think my draft PR impacts this one, and mine breaks a bunch of other things, so there's likely to be a long discussion about what we should do instead. If this is working, I'd encourage you to get it in.

@daxiongshu
Contributor Author

Thank you for the comment. This PR depends on da.ones_like working correctly with cupy input to replace the line o = da.ones((X.shape[0], 1), chunks=(j, 1)). If da.ones_like with cupy is a long shot, I'll try to work around it. But I still hope your PR gets in. Thank you!

@GenevieveBuckley

This PR depends on da.ones_like working correctly with cupy input to replace the line o = da.ones((X.shape[0], 1), chunks=(j, 1)).

Where is this line, sorry? (I didn't find it with a quick search across the dask repository.)

@daxiongshu
Contributor Author

It is line 143 of utils.py:

dask-glm/dask_glm/utils.py

Lines 137 to 149 in 69a947e

@dispatch(da.Array)
def add_intercept(X):
    if np.isnan(np.sum(X.shape)):
        raise NotImplementedError("Can not add intercept to array with "
                                  "unknown chunk shape")
    j, k = X.chunks
    o = da.ones((X.shape[0], 1), chunks=(j, 1))
    if is_dask_array_sparse(X):
        o = o.map_blocks(sparse.COO)
    # TODO: Needed this `.rechunk` for the solver to work
    # Is this OK / correct?
    X_i = da.concatenate([X, o], axis=1).rechunk((j, (k[0] + 1,)))
    return X_i

@GenevieveBuckley

Ah, thank you!
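As a rough illustration of why a `da.ones_like`-style replacement for that line would help: the NumPy `*_like` creators accept a `shape` keyword (NumPy >= 1.17) and build a new array of the same type and dtype as the template, so dispatching on a cupy meta array would yield cupy chunks instead of numpy ones. A minimal numpy-only sketch (numpy stands in for cupy here; the dispatch mechanism is the same):

```python
import numpy as np

# Template ("meta") array: zero-size, but it carries the array type and dtype.
meta = np.empty((0, 0), dtype=np.float32)

# `ones_like` with the `shape` keyword builds a new array of the same type
# and dtype as the template, but with the requested shape. With a cupy meta
# array, the same call would return a cupy array.
o = np.ones_like(meta, shape=(4, 1))

assert o.shape == (4, 1)
assert o.dtype == np.float32
assert (o == 1).all()
```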

@@ -97,7 +97,7 @@ def gradient_descent(X, y, max_iter=100, tol=1e-14, family=Logistic, **kwargs):
     stepSize = 1.0
     recalcRate = 10
     backtrackMult = firstBacktrackMult
-    beta = np.zeros(p)
+    beta = np.zeros_like(X._meta, shape=(p))
Member

Suggested change:
-    beta = np.zeros_like(X._meta, shape=(p))
+    beta = np.zeros_like(X._meta, shape=p)

I think that we probably don't need the extra parens around p here and below
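For what it's worth, `shape=(p)` is not a tuple at all; the parentheses are a no-op and it is exactly `shape=p`, which numpy then normalizes to a 1-tuple. A quick numpy check (my own illustration, not part of the PR):

```python
import numpy as np

p = 5
meta = np.empty(0, dtype=np.float64)

a = np.zeros_like(meta, shape=(p))  # (p) is just p; (p,) would be a tuple
b = np.zeros_like(meta, shape=p)

assert (p) == p                  # plain grouping parens, not a tuple
assert a.shape == b.shape == (5,)
```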

-    beta = regularizer.proximal_operator(obeta - stepSize * gradient, stepSize * lamduh)
-    step = obeta - beta
+    beta = regularizer.proximal_operator(- stepSize * gradient + obeta, stepSize * lamduh)
+    step = - beta + obeta
Member

I'm curious, why this change?

Contributor Author

@daxiongshu daxiongshu Oct 9, 2020

When obeta is a cupy array, cupy's +/- operators can't handle a dask array as the second operand in those cases. After the change, dask's +/- operators are used instead, and they work with a cupy array as the second operand.

Member

Maybe you could do obeta = da.from_array(obeta) before the computation?

Member

It doesn't matter either way. Either way is fine. Can we report this upstream to cupy? Presumably they need to return NotImplemented somewhere?

Member

It doesn't matter either way. Either way is fine.

I know, just providing a more Dask-like solution.

Can we report this upstream to cupy? Presumably they need to return NotImplemented somwhere?

I don't know exactly what error we get, but would reporting it change anything? Would returning NotImplemented really make things clear enough to be worth it?

Member

If a Python object doesn't know what to do with another operand in a binary operator, it can return NotImplemented (or maybe it's raise?) and then Python will try the other order (calling other.__rsub__(self)).
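A minimal sketch of that protocol with toy classes (not cupy or dask internals): when the left operand's __sub__ returns NotImplemented, Python retries the operation with the right operand's __rsub__; raising an error instead, as cupy apparently does here, short-circuits that fallback.

```python
class Chunked:
    """Stands in for a wrapper type like a dask array."""
    def __init__(self, value):
        self.value = value

    def __sub__(self, other):
        other = other.value if hasattr(other, "value") else other
        return Chunked(self.value - other)

    def __rsub__(self, other):
        # Called by Python when the left operand's __sub__ returned NotImplemented.
        other = other.value if hasattr(other, "value") else other
        return Chunked(other - self.value)


class Raw:
    """Stands in for a type that only understands its own kind."""
    def __init__(self, value):
        self.value = value

    def __sub__(self, other):
        if not isinstance(other, Raw):
            return NotImplemented   # let Python try other.__rsub__(self)
        return Raw(self.value - other.value)


# Raw.__sub__ returns NotImplemented, so Python falls back to Chunked.__rsub__.
result = Raw(10) - Chunked(3)
assert isinstance(result, Chunked) and result.value == 7
```

If Raw.__sub__ raised a TypeError instead of returning NotImplemented, the fallback would never run, which matches the behavior described in this thread.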

Member

Thanks @mrocklin, I didn't know about that. I filed cupy/cupy#4118 to start the discussion on the CuPy side. I'm not sure, but maybe there's a technical reason not to do that, so let's wait to hear from the maintainers.

@mrocklin mrocklin changed the title [WIP] Fix algorithms "newton", "gradient_descen" and "proximal_grad" with cupy inputs Fix algorithms "newton", "gradient_descen" and "proximal_grad" with cupy inputs Oct 9, 2020
@mrocklin
Member

mrocklin commented Oct 9, 2020

I've removed the WIP label. This PR seems good to me. Any objections to merging?

@pentschev
Member

Any objections to merging?

None from me.

@pentschev
Member

It's great to see it took "just" some 17 months to get everything in place for this, since #75 . :)

@mrocklin mrocklin merged commit 7b2f85f into dask:master Oct 9, 2020
@mrocklin
Member

mrocklin commented Oct 9, 2020

Thanks @daxiongshu! This is in.

@mrocklin
Member

mrocklin commented Oct 9, 2020

It's great to see it took "just" some 17 months to get everything in place for this, since #75 . :)

Sometimes you plant a seed and it just takes a while to grow :)
