empirical_sinkhorn_divergence doesn't have a grad_fn #393

Closed
gabrielsantosrv opened this issue Aug 9, 2022 · 5 comments · Fixed by #394

@gabrielsantosrv

Describe the bug

I am trying to use empirical_sinkhorn_divergence as a loss function in PyTorch, but the returned tensor does not have a grad_fn, so the gradient cannot be propagated.

Code sample

loss = ot.bregman.empirical_sinkhorn_divergence(source, target, 1)
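
For context, a minimal reproduction along these lines (the tensor shapes and the requires_grad setup are my own assumptions, not part of the original report):

import torch
import ot

# Two small point clouds; gradients should flow back to source
source = torch.randn(50, 2, requires_grad=True)
target = torch.randn(60, 2)

loss = ot.bregman.empirical_sinkhorn_divergence(source, target, 1)
print(loss.grad_fn)  # None in POT 0.8.2, so loss.backward() cannot propagate gradients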

Expected behavior

Return a tensor with a grad_fn.

Environment (please complete the following information):

  • OS (e.g. MacOS, Windows, Linux): Linux
  • Python version: 3.8.13
  • How was POT installed (source, pip, conda): pip
  • Build command you used (if compiling from source):
  • Only for GPU related bugs:
    • CUDA version: 11.2
    • GPU models and configuration: Quadro RTX 8000

Output of the following code snippet:

import platform; print(platform.platform())
import sys; print("Python", sys.version)
import numpy; print("NumPy", numpy.__version__)
import scipy; print("SciPy", scipy.__version__)
import ot; print("POT", ot.__version__)
Linux-5.4.0-73-generic-x86_64-with-glibc2.17
Python 3.8.13 (default, Mar 28 2022, 11:38:47) 
[GCC 7.5.0]
NumPy 1.21.6
SciPy 1.8.1
POT 0.8.2
@rflamary (Collaborator)

Good catch! The backend for the Sinkhorn divergence is not properly handled, so the computation goes through NumPy and back, losing the gradient information (and a lot of its interest ;)).

We will look into it. In the meantime, I suggest that you write a function that uses ot.dist to compute the cost matrices and ot.sinkhorn2 to obtain differentiable losses, and sums the three terms; see the sketch below. It can be done in a few lines of code.
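
A minimal sketch of that workaround (the helper name sinkhorn_divergence and the uniform weights are assumptions on my part, not something prescribed by POT):

import torch
import ot

def sinkhorn_divergence(source, target, reg):
    # Uniform weights on both empirical distributions (assumption)
    n, m = source.shape[0], target.shape[0]
    a = torch.ones(n, dtype=source.dtype, device=source.device) / n
    b = torch.ones(m, dtype=target.dtype, device=target.device) / m

    # Cost matrices for the three terms (squared Euclidean by default)
    M_st = ot.dist(source, target)
    M_ss = ot.dist(source, source)
    M_tt = ot.dist(target, target)

    # ot.sinkhorn2 returns a differentiable loss with the torch backend
    loss_st = ot.sinkhorn2(a, b, M_st, reg)
    loss_ss = ot.sinkhorn2(a, a, M_ss, reg)
    loss_tt = ot.sinkhorn2(b, b, M_tt, reg)

    # Sinkhorn divergence: S(a, b) = W(a, b) - 0.5 * W(a, a) - 0.5 * W(b, b)
    return loss_st - 0.5 * (loss_ss + loss_tt)

Calling loss = sinkhorn_divergence(source, target, 1) followed by loss.backward() should then propagate gradients back to source.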

@gabrielsantosrv (Author)

gabrielsantosrv commented Aug 10, 2022

Since I am using a GPU to run my experiments, do you think it is better to use ot.sinkhorn2 with the parameter method="sinkhorn_log" to compute the Sinkhorn divergence?

@rflamary (Collaborator)

Yes, it should be more numerically stable (though slightly slower than the traditional Sinkhorn).
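
Concretely, with the sketch above this only means passing the method keyword to each ot.sinkhorn2 call, e.g. (same variables as in the sketch):

# Log-domain solver: more numerically stable for small reg, slightly slower
loss_st = ot.sinkhorn2(a, b, M_st, reg, method="sinkhorn_log")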

@gabrielsantosrv (Author)

That's ok, thank you :D

@rflamary (Collaborator)

Hello @gabrielsantosrv

The function now preserves the gradients on the master branch (we added a test to check this so that it does not happen again).
