
Fixes #17304 Flaky Test -> test_higher_order_grad.test_tanh #17321

Merged
4 commits merged into apache:master on Jan 15, 2020

Conversation

kshitij12345
Contributor

@apeforest @reminisce

Going through the code for _backward_tanh, it is implemented as 1 - tanh^2(x), which is mathematically equivalent to sech^2(x), i.e. (1/cosh(x))^2.

However, for the failed seed I have verified that 1 - tanh^2(x) does not evaluate to the same values as (1/cosh(x))^2 on the randomly generated 4-dimensional array (I will investigate the cause further).

For now, replacing grad_op with the equivalent 1 - tanh^2(x) works without needing to relax tolerances.
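For illustration (not part of the PR diff), here is a minimal NumPy sketch of how the two mathematically equivalent forms can disagree under float32 rounding; the array shape and value range are assumptions:

```python
import numpy as np

# Two mathematically equivalent forms of d/dx tanh(x):
#   f1(x) = 1 - tanh(x)^2      (what _backward_tanh computes)
#   f2(x) = (1 / cosh(x))^2    (sech^2(x))
# In float32 they typically differ by rounding error on the order of 1e-7,
# so an exact (or very tight-tolerance) comparison can fail intermittently.
x = np.random.uniform(-5, 5, size=(2, 3, 4, 5)).astype('float32')  # 4-dim array, as in the test
f1 = 1 - np.tanh(x) ** 2
f2 = (1 / np.cosh(x)) ** 2
print(np.abs(f1 - f2).max())  # small but non-zero
```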

@codecov-io

codecov-io commented Jan 15, 2020

Codecov Report

Merging #17321 into master will not change coverage.
The diff coverage is n/a.


@@           Coverage Diff           @@
##           master   #17321   +/-   ##
=======================================
  Coverage    67.5%    67.5%           
=======================================
  Files         275      275           
  Lines       31227    31227           
  Branches     4721     4721           
=======================================
  Hits        21080    21080           
  Misses       8780     8780           
  Partials     1367     1367

Powered by Codecov. Last update 05a0e5b...ecac9a8.

@haojin2
Contributor

haojin2 commented Jan 15, 2020

@kshitij12345 Have you tried to run this new version with > 500 trials?

@kshitij12345
Contributor Author

@haojin2 Hi, can you tell me how I can do that?

@haojin2
Contributor

haojin2 commented Jan 15, 2020

@kshitij12345 like this:
MXNET_TEST_COUNT=500 nosetests tests/python/unittest/test_higher_order_grad.py:test_tanh

@kshitij12345
Contributor Author

@haojin2 Thank you.

That helped: the first-order test failed once. I have relaxed the tolerance for the first-order check to rtol=1e-6 and atol=1e-6.

After that I ran
MXNET_TEST_COUNT=500 nosetests tests/python/unittest/test_higher_order_grad.py:test_tanh
multiple times, and it succeeded each time.
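For context, a hedged sketch of what such a tolerance relaxation looks like in an MXNet unit test; this is not the actual diff, and the helper name and variables are placeholders:

```python
# Hypothetical sketch of relaxing the first-order tolerance in
# tests/python/unittest/test_higher_order_grad.py; names are illustrative.
from mxnet.test_utils import assert_almost_equal

def check_first_order(expected_grad, computed_grad):
    # Tolerance relaxed to rtol=1e-6, atol=1e-6 so that float32 rounding
    # differences between mathematically equivalent analytic forms
    # (e.g. 1 - tanh^2(x) vs. sech^2(x)) do not cause spurious failures.
    assert_almost_equal(expected_grad, computed_grad, rtol=1e-6, atol=1e-6)
```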

@haojin2
Contributor

haojin2 left a comment

LGTM, will merge after CI passes

@apeforest
Contributor

apeforest left a comment

Looks like there is some numerical instability due to precision. Please re-run your tests multiple times as @haojin2 suggested. Otherwise, LGTM. Thanks for your prompt response.

@apeforest
Contributor

@kshitij12345 Could you please also do so for other operators? I suspect they might have similar issues as well. Thanks!

@kshitij12345
Contributor Author

@apeforest @haojin2 Yes, I have seen it happening for arcsin and arctanh as well; see #16739.

haojin2 merged commit be9e17e into apache:master on Jan 15, 2020
@haojin2
Contributor

haojin2 commented Jan 15, 2020

Merged, please continue with fixing other flaky higher-order gradient tests @kshitij12345
