Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allowing more flexible cost functions for optimizers #959
Allowing more flexible cost functions for optimizers #959
Changes from 17 commits
fe79cc6
9362b82
5609f5d
2c14788
24d30e1
21e9e01
a820b55
ad0906a
ed4ac91
b30a552
cb2b328
5c03e65
df0a7a8
4cf358c
b9a447d
7624f24
5e0c40b
12598c8
4f35668
193cb53
095de9e
9f189f9
b4b0d71
83dcf94
a3c2534
b4ed411
b3c3857
29416d2
dfc0092
6cc787c
703942d
6ca34cb
4854c6b
16c6bde
7ddbcbc
c53adb7
d9d03a9
afd0cfa
751a030
90257ba
a35d782
033af1b
3c58644
f39a839
0eb4133
bfe0a4b
c3d1e49
761bfed
f7e9d67
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't forget to add your name to contributors! (unless you have done that already, and I missed it)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are a few
x
s in some docstrings that should be updated toargs
withrequires_grad=True
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So
*args
, with the star, is neither a tuple nor a list, its an unpacked tuple.I've looked up examples of how to document
*args
and**kwargs
in docstrings. At least for this "google style", they don't specify a particular type for*args
and**kwargs
, just what they do/ get used for.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, right. My bad! Then this is probably the right way to do it. Feel free to mark all of these comments as resolved. 🙂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be nicer to clarify a bit better what shape means in this case. E.g. "will always be a tuple containing ...".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
args
itself is a tuple, but*args
itself is an unwrapped tuple and its components are not confined to a particular type.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than saying 'must match shape of...', perhaps it's better to explicitly say what the output of a custom grad function should look like?
E.g., "The provided grad function must have output of shape ..."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than saying 'must match shape of...', perhaps it's better to explicitly say what the output of a custom grad function should look like?
E.g., "The provided grad function must have output of shape ..."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm just struggling with a good way to describe the form. (df/d(trainable argument 0), df/d(trainable argument 1), ... ) just doesn't look very nice
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just some feedback from the hackathon based upon a submission that uses
compute_grad
andapply_grad
: they had some confusion because they needed to switch from passingx=params
toargs=[params]
and also (forcompute_grad
) they needed to explicitly passkwargs={}
.Should we consider making these methods be similar to
step(*args, **kwargs)
? Or, alternatively just make these hidden methods.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@trbromley I tend to agree, especially if this would be more of a user-facing method (not sure how much of an issue it would be otherwise). There was a brief discussion re. this when this was under review. Perhaps just making them hidden by preceding with an underscore would be good here. 🤔 @albi3ro @josh146
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm somewhat ambivalent here. We definitely considered
compute_grad
andapply_grad
'private' methods when first designing the optimizers, as they were not intended to be used as the public optimizer API.But it probably makes more sense to keep them public, and standardize the signatures, since the fact that users were finding and using them suggests they add value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unless there's a larger demand from users for this function, it'd be inefficient to unpack and repack the values. It would add operations for no real purpose.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm in favour of 'doing nothing' for now unless there is more feedback and demand.