Major problem: float32's rounding error is so large that it may dominate the difference between the numerical gradients and the analytical gradients, which causes a large relative error in gradient checking. As a consequence, the gradient checker used in unit tests may be unreliable.
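A rough estimate of why rounding can dominate, assuming the checker uses a central difference with step size ε (the issue does not say which scheme is used, so this is an illustrative assumption). Here u denotes the unit roundoff of the floating-point type and e_i the i-th unit vector; both symbols are introduced only for this sketch.

```latex
% Central-difference estimate of the i-th gradient component (assumed scheme)
\hat{g}_i = \frac{f(x + \epsilon e_i) - f(x - \epsilon e_i)}{2\epsilon}

% Each evaluation of f carries an absolute rounding error of roughly u\,|f(x)|,
% so the rounding error in \hat{g}_i is on the order of
\left|\hat{g}_i - g_i\right|_{\text{rounding}} \;\sim\; \frac{u\,|f(x)|}{\epsilon},
\qquad
u_{\text{float32}} = 2^{-24} \approx 6.0 \times 10^{-8},
\quad
u_{\text{float64}} = 2^{-53} \approx 1.1 \times 10^{-16}
```

For example, with |f(x)| around 1 and a hypothetical ε = 1e-4, this estimate gives an error of about 6e-4 in float32 but only about 1e-12 in float64. The estimate also grows as ε shrinks, so a small step makes float32 rounding worse, not better.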
Experiments
The differences between the numerical and analytical gradients of the linear function f(x, y) = x^T * y are shown below. We can conclude that:
Although a linear function is very simple, the absolute error and relative error are unacceptably large if float32 is used.
The errors are very small if float64 is used.
If the scale of epsilon is comparable to the scale of x and y, the errors will be small. But I'm not sure whether this conclusion generalizes to more complicated functions.
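The three observations above can be reproduced with a minimal NumPy sketch. It is not the code from the original experiment: the central-difference scheme, the vector length `n`, the random seed, and the step sizes `1e-4` and `1.0` are all assumptions chosen for illustration, and the helper names `numerical_grad_x` and `relative_error` are hypothetical.

```python
import numpy as np


def numerical_grad_x(x, y, eps):
    """Central-difference gradient of f(x, y) = x^T y w.r.t. x, computed in x's dtype."""
    scalar = x.dtype.type          # np.float32 or np.float64 constructor
    eps = scalar(eps)              # keep the step in the working precision
    grad = np.zeros_like(x)
    for i in range(x.size):
        x_plus = x.copy()
        x_plus[i] += eps
        x_minus = x.copy()
        x_minus[i] -= eps
        f_plus = np.dot(x_plus, y)
        f_minus = np.dot(x_minus, y)
        grad[i] = (f_plus - f_minus) / (scalar(2) * eps)
    return grad


def relative_error(dtype, eps, n=100, seed=0):
    """Norm-wise relative error between numerical and analytical gradients."""
    rng = np.random.default_rng(seed)   # assumed test data: standard normal vectors
    x = rng.standard_normal(n).astype(dtype)
    y = rng.standard_normal(n).astype(dtype)
    analytic = y.astype(np.float64)     # d(x^T y)/dx = y exactly
    numeric = numerical_grad_x(x, y, eps).astype(np.float64)
    return np.linalg.norm(numeric - analytic) / np.linalg.norm(analytic)


if __name__ == "__main__":
    for dtype in (np.float32, np.float64):
        for eps in (1e-4, 1.0):
            err = relative_error(dtype, eps)
            print(f"{np.dtype(dtype).name:8s} eps={eps:<7g} relative error={err:.3e}")
```

Because f is linear, the central difference has no truncation error, so whatever error this prints comes from rounding alone. That also means the benefit of a large epsilon seen here may not carry over to nonlinear functions, where truncation error grows with the step.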
Potential solution: