[MRG] Tensorflow backend & Benchmarker & Myst_parser #316

Merged
merged 48 commits on Dec 9, 2021
Commits
48 commits
2e944b0
First batch of tf methods (to be continued)
ncassereau Nov 29, 2021
1653b70
Second batch of method (yet to debug)
ncassereau Nov 30, 2021
2fe3441
tensorflow for cpu
ncassereau Nov 30, 2021
5955643
add tf requirement
ncassereau Nov 30, 2021
d988df0
pep8 + bug
ncassereau Nov 30, 2021
86cdbc6
small changes
ncassereau Dec 1, 2021
93bc691
attempt to solve pymanopt bug with tf2
ncassereau Dec 1, 2021
8baeae9
attempt #2
ncassereau Dec 1, 2021
1881f85
attempt #3
ncassereau Dec 1, 2021
249ea2f
attempt 4
ncassereau Dec 1, 2021
b887746
docstring
ncassereau Dec 1, 2021
2d90386
Merge branch 'master' into tensorflow_backend
ncassereau Dec 3, 2021
a34aeff
correct pep8 violation introduced in merge conflicts resolution
ncassereau Dec 3, 2021
eae1c9a
attempt 5
ncassereau Dec 3, 2021
01fce56
attempt 6
ncassereau Dec 3, 2021
8223e76
just a random try
ncassereau Dec 3, 2021
aaac0ee
Revert "just a random try"
ncassereau Dec 3, 2021
0feea71
GPU tests for tensorflow
ncassereau Dec 6, 2021
32c2838
pep8
ncassereau Dec 6, 2021
3cadd11
attempt to solve issue with m2r2
ncassereau Dec 6, 2021
aaa7e4a
Remove transpose backend method
ncassereau Dec 6, 2021
2d77a37
Merge branch 'master' into tensorflow_backend
ncassereau Dec 6, 2021
245a3c2
first draft of benchmarker (need to correct time measurement)
ncassereau Dec 6, 2021
f269fda
prettier bench table
ncassereau Dec 7, 2021
bdba755
Bitsize and prettier device methods
ncassereau Dec 7, 2021
30e7ba7
prettified table bench
ncassereau Dec 7, 2021
689ae01
Bug corrected (results were mixed up in the final table)
ncassereau Dec 7, 2021
1347387
Better perf counter (for GPU support)
ncassereau Dec 7, 2021
96488ce
pep8
ncassereau Dec 7, 2021
59ea42e
EMD bench
ncassereau Dec 7, 2021
22c4d0c
solve bug if no GPU available
ncassereau Dec 7, 2021
eca3f80
pep8
ncassereau Dec 7, 2021
aa5257a
warning about tensorflow numpy api being required in the backend.py d…
ncassereau Dec 7, 2021
1f62660
Bug solve in backend docstring
ncassereau Dec 7, 2021
0f1d299
not covering code which requires a GPU
ncassereau Dec 7, 2021
0984000
Tensorflow gradients manipulation tested
ncassereau Dec 7, 2021
9e56ccd
Number of warmup runs is now customizable
ncassereau Dec 7, 2021
84fb002
typo
ncassereau Dec 7, 2021
9a7fd03
Remove some warnings while building docs
ncassereau Dec 7, 2021
acc9474
Change prettier_device to device_type in backend
ncassereau Dec 7, 2021
0593f87
Correct JAX mistakes preventing to see the CPU if a GPU is present
ncassereau Dec 7, 2021
321597f
Attempt to solve JAX bug in case no GPU is found
ncassereau Dec 7, 2021
ec72f30
Reworked benchmarks order and results storage & clear GPU after usage…
ncassereau Dec 8, 2021
9a80c7a
Add bench to backend docstring
ncassereau Dec 8, 2021
21d34ea
better benchs
ncassereau Dec 8, 2021
c507d3b
remove useless stuff
ncassereau Dec 8, 2021
d02f71a
Better device_type
ncassereau Dec 8, 2021
86faaa4
Now using MYST_PARSER and solving links issue in the README.md / onli…
ncassereau Dec 9, 2021
2 changes: 1 addition & 1 deletion .github/requirements_test_windows.txt
@@ -4,7 +4,7 @@ cython
matplotlib
autograd
pymanopt==0.2.4; python_version <'3'
pymanopt; python_version >= '3'
pymanopt==0.2.6rc1; python_version >= '3'
cvxopt
scikit-learn
pytest
8 changes: 4 additions & 4 deletions README.md
@@ -35,7 +35,7 @@ POT provides the following generic OT solvers (links to examples):
* [Partial Wasserstein and Gromov-Wasserstein](https://pythonot.github.io/auto_examples/unbalanced-partial/plot_partial_wass_and_gromov.html) (exact [29] and entropic [3]
formulations).
* [Sliced Wasserstein](https://pythonot.github.io/auto_examples/sliced-wasserstein/plot_variance.html) [31, 32] and Max-sliced Wasserstein [35] that can be used for gradient flows [36].
* [Several backends](https://pythonot.github.io/quickstart.html#solving-ot-with-multiple-backends) for easy use of POT with [Pytorch](https://pytorch.org/)/[jax](https://github.com/google/jax)/[Numpy](https://numpy.org/) arrays.
* [Several backends](https://pythonot.github.io/quickstart.html#solving-ot-with-multiple-backends) for easy use of POT with [Pytorch](https://pytorch.org/)/[jax](https://github.com/google/jax)/[Numpy](https://numpy.org/)/[Cupy](https://cupy.dev/)/[Tensorflow](https://www.tensorflow.org/) arrays.

POT provides the following Machine Learning related solvers:

@@ -202,12 +202,12 @@ This toolbox benefit a lot from open source research and we would like to thank

* [Gabriel Peyré](http://gpeyre.github.io/) (Wasserstein Barycenters in Matlab)
* [Mathieu Blondel](https://mblondel.org/) (original implementation smooth OT)
* [Nicolas Bonneel](http://liris.cnrs.fr/~nbonneel/) ( C++ code for EMD)
* [Nicolas Bonneel](http://liris.cnrs.fr/~nbonneel/) (C++ code for EMD)
* [Marco Cuturi](http://marcocuturi.net/) (Sinkhorn Knopp in Matlab/Cuda)

## Contributions and code of conduct

Every contribution is welcome and should respect the [contribution guidelines](https://pythonot.github.io/contributing.html). Each member of the project is expected to follow the [code of conduct](https://pythonot.github.io/code_of_conduct.html).
Every contribution is welcome and should respect the [contribution guidelines](.github/CONTRIBUTING.md). Each member of the project is expected to follow the [code of conduct](.github/CODE_OF_CONDUCT.md).

## Support

@@ -217,7 +217,7 @@ You can ask questions and join the development discussion:
* On the POT [gitter channel](https://gitter.im/PythonOT/community)
* On the POT [mailing list](https://mail.python.org/mm3/mailman3/lists/pot.python.org/)

You can also post bug reports and feature requests in Github issues. Make sure to read our [guidelines](https://pythonot.github.io/contributing.html) first.
You can also post bug reports and feature requests in Github issues. Make sure to read our [guidelines](.github/CONTRIBUTING.md) first.

## References

5 changes: 5 additions & 0 deletions benchmarks/__init__.py
@@ -0,0 +1,5 @@
from . import benchmark
from . import sinkhorn_knopp
from . import emd

__all__= ["benchmark", "sinkhorn_knopp", "emd"]
105 changes: 105 additions & 0 deletions benchmarks/benchmark.py
@@ -0,0 +1,105 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-

from ot.backend import get_backend_list, jax, tf
import gc


def setup_backends():
if jax:
from jax.config import config
config.update("jax_enable_x64", True)

if tf:
from tensorflow.python.ops.numpy_ops import np_config
np_config.enable_numpy_behavior()


def exec_bench(setup, tested_function, param_list, n_runs, warmup_runs):
backend_list = get_backend_list()
for i, nx in enumerate(backend_list):
if nx.__name__ == "tf" and i < len(backend_list) - 1:
# Tensorflow should be the last one to be benchmarked because
# as far as I'm aware, there is no way to force it to release
# GPU memory. Hence, if any other backend is benchmarked after
# Tensorflow and requires the usage of a GPU, it will not have the
# full memory available and you may have a GPU Out Of Memory error
# even though your GPU can technically hold your tensors in memory.
backend_list.pop(i)
backend_list.append(nx)
break

inputs = [setup(param) for param in param_list]
results = dict()
for nx in backend_list:
for i in range(len(param_list)):
print(nx, param_list[i])
args = inputs[i]
results_nx = nx._bench(
tested_function,
*args,
n_runs=n_runs,
warmup_runs=warmup_runs
)
gc.collect()
results_nx_with_param_in_key = dict()
for key in results_nx:
new_key = (param_list[i], *key)
results_nx_with_param_in_key[new_key] = results_nx[key]
results.update(results_nx_with_param_in_key)
return results


def convert_to_html_table(results, param_name, main_title=None, comments=None):
string = "<table>\n"
keys = list(results.keys())
params, names, devices, bitsizes = zip(*keys)

devices_names = sorted(list(set(zip(devices, names))))
params = sorted(list(set(params)))
bitsizes = sorted(list(set(bitsizes)))
length = len(devices_names) + 1
cpus_cols = list(devices).count("CPU") / len(bitsizes) / len(params)
gpus_cols = list(devices).count("GPU") / len(bitsizes) / len(params)
assert cpus_cols + gpus_cols == len(devices_names)

if main_title is not None:
string += f'<tr><th align="center" colspan="{length}">{str(main_title)}</th></tr>\n'

for i, bitsize in enumerate(bitsizes):

if i != 0:
string += f'<tr><td colspan="{length}">&nbsp;</td></tr>\n'

# make bitsize header
text = f"{bitsize} bits"
if comments is not None:
text += " - "
if isinstance(comments, (tuple, list)) and len(comments) == len(bitsizes):
text += str(comments[i])
else:
text += str(comments)
string += f'<tr><th align="center">Bitsize</th>'
string += f'<th align="center" colspan="{length - 1}">{text}</th></tr>\n'

# make device header
string += f'<tr><th align="center">Device</th>'
string += f'<th align="center" colspan="{cpus_cols}">CPU</th>'
string += f'<th align="center" colspan="{gpus_cols}">GPU</th></tr>\n'

# make param_name / backend header
string += f'<tr><th align="center">{param_name}</th>'
for device, name in devices_names:
string += f'<th align="center">{name}</th>'
string += "</tr>\n"

# make results rows
for param in params:
string += f'<tr><td align="center">{param}</td>'
for device, name in devices_names:
key = (param, name, device, bitsize)
string += f'<td align="center">{results[key]:.4f}</td>'
string += "</tr>\n"

string += "</table>"
return string
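The per-run timings that `exec_bench` consumes come from each backend's `_bench` method, which is not part of this diff. A backend-free sketch of such a timing loop — assuming, as the `n_runs`/`warmup_runs` parameters above suggest, that `_bench` discards warm-up runs and averages `time.perf_counter` over the rest — could look like:

```python
import time


def bench(fn, *args, n_runs=100, warmup_runs=10):
    # Warm-up runs are discarded so one-time costs (JIT tracing,
    # caches, lazy initialization) do not pollute the measurement.
    for _ in range(warmup_runs):
        fn(*args)
    start = time.perf_counter()
    for _ in range(n_runs):
        fn(*args)
    # Average wall-clock time per run, in seconds.
    return (time.perf_counter() - start) / n_runs


avg = bench(sum, range(1000), n_runs=20, warmup_runs=2)
```

Warm-up runs matter especially for JAX and TensorFlow, where the first call typically triggers tracing and compilation; note that for GPU backends an extra synchronization step is needed before reading the clock, which a plain `perf_counter` loop like this one does not capture.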
40 changes: 40 additions & 0 deletions benchmarks/emd.py
@@ -0,0 +1,40 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-

import numpy as np
import ot
from .benchmark import (
setup_backends,
exec_bench,
convert_to_html_table
)


def setup(n_samples):
rng = np.random.RandomState(789465132)
x = rng.randn(n_samples, 2)
y = rng.randn(n_samples, 2)

a = ot.utils.unif(n_samples)
M = ot.dist(x, y)
return a, M


if __name__ == "__main__":
n_runs = 100
warmup_runs = 10
param_list = [50, 100, 500, 1000, 2000, 5000]

setup_backends()
results = exec_bench(
setup=setup,
tested_function=lambda a, M: ot.emd(a, a, M),
param_list=param_list,
n_runs=n_runs,
warmup_runs=warmup_runs
)
print(convert_to_html_table(
results,
param_name="Sample size",
main_title=f"EMD - Averaged on {n_runs} runs"
))
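For context, `ot.dist(x, y)` defaults to the squared Euclidean metric, so `setup` above builds an `n_samples × n_samples` cost matrix. A numpy-only sketch of that matrix (assuming the default `'sqeuclidean'` metric — check `ot.dist`'s docstring if in doubt) is:

```python
import numpy as np


def sqeuclidean_cost(x, y):
    # Pairwise squared Euclidean distances, M[i, j] = ||x_i - y_j||^2,
    # expanded as ||x_i||^2 + ||y_j||^2 - 2 <x_i, y_j>.
    xx = (x ** 2).sum(axis=1)[:, None]
    yy = (y ** 2).sum(axis=1)[None, :]
    return xx + yy - 2.0 * x @ y.T


rng = np.random.RandomState(789465132)
x = rng.randn(5, 2)
y = rng.randn(5, 2)
M = sqeuclidean_cost(x, y)  # shape (5, 5)
```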
42 changes: 42 additions & 0 deletions benchmarks/sinkhorn_knopp.py
@@ -0,0 +1,42 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-

import numpy as np
import ot
from .benchmark import (
setup_backends,
exec_bench,
convert_to_html_table
)


def setup(n_samples):
rng = np.random.RandomState(123456789)
a = rng.rand(n_samples // 4, 100)
b = rng.rand(n_samples, 100)

wa = ot.unif(n_samples // 4)
wb = ot.unif(n_samples)

M = ot.dist(a.copy(), b.copy())
return wa, wb, M


if __name__ == "__main__":
n_runs = 100
warmup_runs = 10
param_list = [50, 100, 500, 1000, 2000, 5000]

setup_backends()
results = exec_bench(
setup=setup,
tested_function=lambda *args: ot.bregman.sinkhorn(*args, reg=1, stopThr=1e-7),
param_list=param_list,
n_runs=n_runs,
warmup_runs=warmup_runs
)
print(convert_to_html_table(
results,
param_name="Sample size",
main_title=f"Sinkhorn Knopp - Averaged on {n_runs} runs"
))
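The benchmarked `ot.bregman.sinkhorn` call runs the Sinkhorn-Knopp matrix scaling iteration. As a reference point, a simplified numpy sketch of that iteration (not POT's implementation — it omits POT's numerical-stability safeguards, logging, and backend dispatch) is:

```python
import numpy as np


def sinkhorn_knopp(wa, wb, M, reg=1.0, stop_thr=1e-7, max_iter=1000):
    # Rescale the Gibbs kernel K = exp(-M / reg) until the transport
    # plan diag(u) @ K @ diag(v) matches both marginals wa and wb.
    K = np.exp(-M / reg)
    u = np.ones_like(wa)
    for _ in range(max_iter):
        v = wb / (K.T @ u)
        u = wa / (K @ v)
        plan = u[:, None] * K * v[None, :]
        # After the u-update the row marginals match wa exactly,
        # so convergence is measured on the column marginals.
        if np.abs(plan.sum(axis=0) - wb).max() < stop_thr:
            break
    return plan


n = 4
wa = np.full(n, 1.0 / n)
wb = np.full(n, 1.0 / n)
M = np.abs(np.arange(n)[:, None] - np.arange(n)[None, :]).astype(float)
plan = sinkhorn_knopp(wa, wb, M)
```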
2 changes: 1 addition & 1 deletion docs/requirements.txt
@@ -4,4 +4,4 @@ numpydoc
memory_profiler
pillow
networkx
m2r2
myst-parser
2 changes: 1 addition & 1 deletion docs/requirements_rtd.txt
@@ -3,7 +3,7 @@ numpydoc
memory_profiler
pillow
networkx
m2r2
myst-parser
numpy
scipy>=1.0
cython
6 changes: 6 additions & 0 deletions docs/source/.github/CODE_OF_CONDUCT.rst
@@ -0,0 +1,6 @@
Code of Conduct
===============

.. include:: ../../../.github/CODE_OF_CONDUCT.md
:parser: myst_parser.sphinx_
:start-line: 2
6 changes: 6 additions & 0 deletions docs/source/.github/CONTRIBUTING.rst
@@ -0,0 +1,6 @@
Contributing to POT
===================

.. include:: ../../../.github/CONTRIBUTING.md
:parser: myst_parser.sphinx_
:start-line: 3
1 change: 0 additions & 1 deletion docs/source/code_of_conduct.rst

This file was deleted.

2 changes: 1 addition & 1 deletion docs/source/conf.py
@@ -69,7 +69,7 @@ def __getattr__(cls, name):
'sphinx.ext.viewcode',
'sphinx.ext.napoleon',
'sphinx_gallery.gen_gallery',
'm2r2'
'myst_parser'
]

autosummary_generate = True
1 change: 0 additions & 1 deletion docs/source/contributing.rst

This file was deleted.

9 changes: 4 additions & 5 deletions docs/source/index.rst
@@ -17,12 +17,11 @@ Contents
all
auto_examples/index
releases
contributing
Code of Conduct <code_of_conduct>

.. mdinclude:: ../../README.md
:start-line: 2
.github/CONTRIBUTING
.github/CODE_OF_CONDUCT

.. include:: ../../README.md
:parser: myst_parser.sphinx_


Indices and tables