From 2e0a0b322cc909cf29cdd7f89f583729a70bb5a9 Mon Sep 17 00:00:00 2001 From: Thomas VINCENT Date: Wed, 23 Jun 2021 11:55:39 +0200 Subject: [PATCH 01/24] add detailed installation documentation --- README.rst | 2 ++ doc/install.rst | 72 +++++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 74 insertions(+) create mode 100644 doc/install.rst diff --git a/README.rst b/README.rst index 39d6db37..eb966db4 100644 --- a/README.rst +++ b/README.rst @@ -36,6 +36,8 @@ To install from source and recompile the HDF5 plugins, run:: Installing from source can achieve better performances by enabling AVX2 and OpenMP if available. +For more details, see the `installation documentation`_. + Documentation ------------- diff --git a/doc/install.rst b/doc/install.rst new file mode 100644 index 00000000..8854dcfc --- /dev/null +++ b/doc/install.rst @@ -0,0 +1,72 @@ +============== + Installation +============== + +Pre-built packages +------------------ + +Pre-built binaries of `hdf5plugin` are available from: + +- `pypi`_, to install run: + `pip install hdf5plugin [--user]` +- `conda-forge`_, to install run: + `conda install -c conda-forge hdf5plugin` + +To maximize compatibility, those binaries are built without optimization options (such as `AVX2`_ and `OpenMP`_). +`Installation from source`_ can achieve better performances than pre-built binaries. + +Installation from source +------------------------ + +The build process enables compilation optimizations that are supported by the host machine. + +To install from source and recompile the HDF5 plugins, run:: + + pip install hdf5plugin --no-binary hdf5plugin [--user] + +To override the defaults that are probed from the machine, it is possible to specify build options. 
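The boolean build options follow a simple environment-variable convention (values passed as the strings ``True`` or ``False``). A minimal sketch of how such probing could work — ``get_build_option`` is a hypothetical helper written for illustration, not the project's actual ``setup.py`` logic:

```python
import os

def get_build_option(name, default):
    """Read a boolean build option such as HDF5PLUGIN_OPENMP from the environment.

    Illustrative sketch only: options are passed as the strings "True" or
    "False"; anything else is rejected, and an unset variable falls back to
    the probed default.
    """
    value = os.environ.get(name)
    if value is None:
        return default
    if value not in ("True", "False"):
        raise ValueError("%s must be 'True' or 'False', got %r" % (name, value))
    return value == "True"

# Simulate `HDF5PLUGIN_OPENMP=False pip install hdf5plugin --no-binary hdf5plugin`
os.environ["HDF5PLUGIN_OPENMP"] = "False"
print(get_build_option("HDF5PLUGIN_OPENMP", True))  # the environment override wins
print(get_build_option("HDF5PLUGIN_SSE2", True))    # unset: the probed default is kept
```

The same precedence (explicit setting over probed default) applies to the ``python setup.py build`` command-line options described next.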
+This is achieved by either setting environment variables or passing options to `python setup.py build`, for example: + +- `HDF5PLUGIN_OPENMP=False pip install hdf5plugin --no-binary hdf5plugin` +- From the source directory: `python setup.py build --openmp=False` + +Available options +................. + +.. list-table:: + :widths: 1 1 4 + :header-rows: 1 + + * - Environment variable + - Command line option + - Description + * - `HDF5PLUGIN_HDF5_DIR` + - `--hdf5` + - Custom path to HDF5 (as in h5py). + * - `HDF5PLUGIN_HDF5_DIR` + - `--openmp` + - Whether or not to compile with OpenMP. + Default: True if probed (always False on macOS). + * - `HDF5PLUGIN_NATIVE` + - `--native` + - True to compile specifically for the host, False for generic support (For unix compilers only). + Default: True on supported architectures, False otherwise + * - `HDF5PLUGIN_SSE2` + - `--sse2` + - Whether or not to compile with SSE2 support. + Default: True on ppc64le and when probed on x86, False otherwise + * - `HDF5PLUGIN_AVX2` + - `--avx2` + - Whether or not to compile with AVX2 support. avx2=True requires sse2=True. + Default: True on x86 when probed, False otherwise + * - `HDF5PLUGIN_CPP11` + - `--cpp11` + - Whether or not to compile C++11 code if available. + Default: True if probed. + +Note: Boolean options are passed as `True` or `False`. + + +.. AVX2: https://en.wikipedia.org/wiki/Advanced_Vector_Extensions#Advanced_Vector_Extensions_2 +.. SSE2: https://en.wikipedia.org/wiki/SSE2 +.. 
OpenMP: https://www.openmp.org/ From 28ca23e4f3f35f1beacc43bba1212d4bca8ca24b Mon Sep 17 00:00:00 2001 From: Thomas VINCENT Date: Wed, 23 Jun 2021 11:59:22 +0200 Subject: [PATCH 02/24] rst typos --- doc/install.rst | 42 +++++++++++++++++++++--------------------- 1 file changed, 21 insertions(+), 21 deletions(-) diff --git a/doc/install.rst b/doc/install.rst index 8854dcfc..db77e521 100644 --- a/doc/install.rst +++ b/doc/install.rst @@ -7,9 +7,9 @@ Pre-built packages Pre-built binaries of `hdf5plugin` are available from: -- `pypi`_, to install run: +- `pypi `_, to install run: `pip install hdf5plugin [--user]` -- `conda-forge`_, to install run: +- `conda-forge `_, to install run: `conda install -c conda-forge hdf5plugin` To maximize compatibility, those binaries are built without optimization options (such as `AVX2`_ and `OpenMP`_). @@ -25,10 +25,10 @@ To install from source and recompile the HDF5 plugins, run:: pip install hdf5plugin --no-binary hdf5plugin [--user] To override the defaults that are probed from the machine, it is possible to specify build options. -This is achieved by either setting environment variables or passing options to `python setup.py build`, for example: +This is achieved by either setting environment variables or passing options to ``python setup.py build``, for example: -- `HDF5PLUGIN_OPENMP=False pip install hdf5plugin --no-binary hdf5plugin` -- From the source directory: `python setup.py build --openmp=False` +- ``HDF5PLUGIN_OPENMP=False pip install hdf5plugin --no-binary hdf5plugin`` +- From the source directory: ``python setup.py build --openmp=False`` Available options ................. @@ -40,33 +40,33 @@ Available options * - Environment variable - Command line option - Description - * - `HDF5PLUGIN_HDF5_DIR` - - `--hdf5` + * - ``HDF5PLUGIN_HDF5_DIR`` + - ``--hdf5`` - Custom path to HDF5 (as in h5py). - * - `HDF5PLUGIN_HDF5_DIR` - - `--openmp` + * - ``HDF5PLUGIN_HDF5_DIR`` + - ``--openmp`` - Whether or not to compile with OpenMP. 
Default: True if probed (always False on macOS). - * - `HDF5PLUGIN_NATIVE` - - `--native` + * - ``HDF5PLUGIN_NATIVE`` + - ``--native`` - True to compile specifically for the host, False for generic support (For unix compilers only). Default: True on supported architectures, False otherwise - * - `HDF5PLUGIN_SSE2` - - `--sse2` + * - ``HDF5PLUGIN_SSE2`` + - ``--sse2`` - Whether or not to compile with SSE2 support. Default: True on ppc64le and when probed on x86, False otherwise - * - `HDF5PLUGIN_AVX2` - - `--avx2` + * - ``HDF5PLUGIN_AVX2`` + - ``--avx2`` - Whether or not to compile with AVX2 support. avx2=True requires sse2=True. Default: True on x86 when probed, False otherwise - * - `HDF5PLUGIN_CPP11` - - `--cpp11` + * - ``HDF5PLUGIN_CPP11`` + - ``--cpp11`` - Whether or not to compile C++11 code if available. Default: True if probed. -Note: Boolean options are passed as `True` or `False`. +Note: Boolean options are passed as ``True`` or ``False``. -.. AVX2: https://en.wikipedia.org/wiki/Advanced_Vector_Extensions#Advanced_Vector_Extensions_2 -.. SSE2: https://en.wikipedia.org/wiki/SSE2 -.. OpenMP: https://www.openmp.org/ +.. _AVX2: https://en.wikipedia.org/wiki/Advanced_Vector_Extensions#Advanced_Vector_Extensions_2 +.. _SSE2: https://en.wikipedia.org/wiki/SSE2 +.. 
_OpenMP: https://www.openmp.org/ From 1c63677285b2feedeabc07af02dfceec9092ddc1 Mon Sep 17 00:00:00 2001 From: Thomas VINCENT Date: Wed, 23 Jun 2021 12:00:04 +0200 Subject: [PATCH 03/24] update array --- doc/install.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/doc/install.rst b/doc/install.rst index db77e521..70cf3623 100644 --- a/doc/install.rst +++ b/doc/install.rst @@ -38,7 +38,7 @@ Available options :header-rows: 1 * - Environment variable - - Command line option + - ``python setup.py build`` option - Description * - ``HDF5PLUGIN_HDF5_DIR`` - ``--hdf5`` From b608b1c8cd8acdd5ac4dd542895d36238ff2dde3 Mon Sep 17 00:00:00 2001 From: Thomas VINCENT Date: Wed, 23 Jun 2021 12:01:15 +0200 Subject: [PATCH 04/24] more rst typos added links --- doc/install.rst | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/doc/install.rst b/doc/install.rst index 70cf3623..64e00658 100644 --- a/doc/install.rst +++ b/doc/install.rst @@ -8,9 +8,9 @@ Pre-built packages Pre-built binaries of `hdf5plugin` are available from: - `pypi `_, to install run: - `pip install hdf5plugin [--user]` + ``pip install hdf5plugin [--user]`` - `conda-forge `_, to install run: - `conda install -c conda-forge hdf5plugin` + ``conda install -c conda-forge hdf5plugin`` To maximize compatibility, those binaries are built without optimization options (such as `AVX2`_ and `OpenMP`_). `Installation from source`_ can achieve better performances than pre-built binaries. @@ -45,7 +45,7 @@ Available options - Custom path to HDF5 (as in h5py). * - ``HDF5PLUGIN_HDF5_DIR`` - ``--openmp`` - - Whether or not to compile with OpenMP. + - Whether or not to compile with `OpenMP`_. Default: True if probed (always False on macOS). * - ``HDF5PLUGIN_NATIVE`` - ``--native`` @@ -53,11 +53,11 @@ Available options Default: True on supported architectures, False otherwise * - ``HDF5PLUGIN_SSE2`` - ``--sse2`` - - Whether or not to compile with SSE2 support. 
+ - Whether or not to compile with `SSE2`_ support. Default: True on ppc64le and when probed on x86, False otherwise * - ``HDF5PLUGIN_AVX2`` - ``--avx2`` - - Whether or not to compile with AVX2 support. avx2=True requires sse2=True. + - Whether or not to compile with `AVX2`_ support. avx2=True requires sse2=True. Default: True on x86 when probed, False otherwise * - ``HDF5PLUGIN_CPP11`` - ``--cpp11`` From 5abe10d6bfbd0cdea9af0f50c941f5982160c22a Mon Sep 17 00:00:00 2001 From: Thomas VINCENT Date: Wed, 23 Jun 2021 12:02:30 +0200 Subject: [PATCH 05/24] rst typo --- README.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.rst b/README.rst index eb966db4..38449875 100644 --- a/README.rst +++ b/README.rst @@ -36,7 +36,7 @@ To install from source and recompile the HDF5 plugins, run:: Installing from source can achieve better performances by enabling AVX2 and OpenMP if available. -For more details, see the `installation documentation`_. +For more details, see the `installation documentation `_. Documentation ------------- From a513e229d8cf8c500ab492b2b2e85355a20b3757 Mon Sep 17 00:00:00 2001 From: Thomas VINCENT Date: Wed, 23 Jun 2021 15:58:54 +0200 Subject: [PATCH 06/24] add sphinx config files --- doc/Makefile | 20 ++++++++++++++++++++ doc/conf.py | 46 ++++++++++++++++++++++++++++++++++++++++++++++ doc/make.bat | 35 +++++++++++++++++++++++++++++++++++ 3 files changed, 101 insertions(+) create mode 100644 doc/Makefile create mode 100644 doc/conf.py create mode 100644 doc/make.bat diff --git a/doc/Makefile b/doc/Makefile new file mode 100644 index 00000000..d4bb2cbb --- /dev/null +++ b/doc/Makefile @@ -0,0 +1,20 @@ +# Minimal makefile for Sphinx documentation +# + +# You can set these variables from the command line, and also +# from the environment for the first two. +SPHINXOPTS ?= +SPHINXBUILD ?= sphinx-build +SOURCEDIR = . +BUILDDIR = _build + +# Put it first so that "make" without argument is like "make help". 
+help: + @$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O) + +.PHONY: help Makefile + +# Catch-all target: route all unknown targets to Sphinx using the new +# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS). +%: Makefile + @$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O) diff --git a/doc/conf.py b/doc/conf.py new file mode 100644 index 00000000..718b31a5 --- /dev/null +++ b/doc/conf.py @@ -0,0 +1,46 @@ +# Configuration file for the Sphinx documentation builder. +# +# This file only contains a selection of the most common options. For a full +# list see the documentation: +# https://www.sphinx-doc.org/en/master/usage/configuration.html + +# -- Path setup -------------------------------------------------------------- + +# If extensions (or modules to document with autodoc) are in another directory, +# add these directories to sys.path here. If the directory is relative to the +# documentation root, use os.path.abspath to make it absolute, like shown here. +# +# import os +# import sys +# sys.path.insert(0, os.path.abspath('.')) + + +# -- Project information ----------------------------------------------------- + +project = 'hdf5plugin' +from hdf5plugin import strictversion, version, __date__ as _date +release = strictversion +year = _date.split("/")[-1] +copyright = u'2016-%s, Data analysis unit, European Synchrotron Radiation Facility, Grenoble' % year +author = 'ESRF - Data Analysis Unit' + +# -- General configuration --------------------------------------------------- + +# Add any Sphinx extension module names here, as strings. They can be +# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom +# ones. +extensions = [ +] + +# List of patterns, relative to source directory, that match files and +# directories to ignore when looking for source files. +# This pattern also affects html_static_path and html_extra_path. 
+exclude_patterns = ['_build', 'Thumbs.db', '.DS_Store'] + + +# -- Options for HTML output ------------------------------------------------- + +# The theme to use for HTML and HTML Help pages. See the documentation for +# a list of builtin themes. +# +html_theme = 'default' diff --git a/doc/make.bat b/doc/make.bat new file mode 100644 index 00000000..922152e9 --- /dev/null +++ b/doc/make.bat @@ -0,0 +1,35 @@ +@ECHO OFF + +pushd %~dp0 + +REM Command file for Sphinx documentation + +if "%SPHINXBUILD%" == "" ( + set SPHINXBUILD=sphinx-build +) +set SOURCEDIR=. +set BUILDDIR=_build + +if "%1" == "" goto help + +%SPHINXBUILD% >NUL 2>NUL +if errorlevel 9009 ( + echo. + echo.The 'sphinx-build' command was not found. Make sure you have Sphinx + echo.installed, then set the SPHINXBUILD environment variable to point + echo.to the full path of the 'sphinx-build' executable. Alternatively you + echo.may add the Sphinx directory to PATH. + echo. + echo.If you don't have Sphinx installed, grab it from + echo.http://sphinx-doc.org/ + exit /b 1 +) + +%SPHINXBUILD% -M %1 %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O% +goto end + +:help +%SPHINXBUILD% -M help %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O% + +:end +popd From b35124f26b0a9a8b2f8d9e158a3b04287bae89c5 Mon Sep 17 00:00:00 2001 From: Thomas VINCENT Date: Wed, 23 Jun 2021 15:59:58 +0200 Subject: [PATCH 07/24] split README into sphinx documentation --- README.rst | 263 ++------------------------------------------ doc/changelog.rst | 5 + doc/contribute.rst | 30 ++++- doc/index.rst | 36 ++++++ doc/information.rst | 45 ++++++++ doc/usage.rst | 207 ++++++++++++++++++++++++++++++++++ 6 files changed, 327 insertions(+), 259 deletions(-) create mode 100644 doc/changelog.rst create mode 100644 doc/index.rst create mode 100644 doc/information.rst create mode 100644 doc/usage.rst diff --git a/README.rst b/README.rst index 38449875..a95b9195 100644 --- a/README.rst +++ b/README.rst @@ -3,21 +3,9 @@ hdf5plugin This module provides HDF5 compression 
filters (namely: blosc, bitshuffle, lz4, FCIDECOMP, ZFP, zstd) and registers them to the HDF5 library used by `h5py `_. -* Supported operating systems: Linux, Windows, macOS. -* Supported versions of Python: >= 3.4 - `hdf5plugin` provides a generic way to enable the use of the provided HDF5 compression filters with `h5py` that can be installed via `pip` or `conda`. -Alternatives to install HDF5 compression filters are: system-wide installation on Linux or other conda packages: `blosc-hdf5-plugin `_, `hdf5-lz4 `_. - -The HDF5 plugin sources were obtained from: - -* LZ4 plugin (v0.1.0) and lz4 (v1.3.0, tag r122): https://github.com/nexusformat/HDF5-External-Filter-Plugins, https://github.com/lz4/lz4 -* bitshuffle plugin (0.3.5): https://github.com/kiyo-masui/bitshuffle -* hdf5-blosc plugin (v1.0.0), c-blosc (v1.20.1) and snappy (v1.1.1): https://github.com/Blosc/hdf5-blosc, https://github.com/Blosc/c-blosc and https://github.com/Blosc/c-blosc/tree/v1.17.0/internal-complibs/snappy-1.1.1 -* FCIDECOMP plugin (v1.0.2) and CharLS (branch 1.x-master SHA1 ID:25160a42fb62e71e4b0ce081f5cb3f8bb73938b5): ftp://ftp.eumetsat.int/pub/OPS/out/test-data/Test-data-for-External-Users/MTG_FCI_Test-Data/FCI_Decompression_Software_V1.0.2/ and https://github.com/team-charls/charls.git -* HDF5-ZFP plugin (v1.0.1) and zfp (v0.5.5): https://github.com/LLNL/H5Z-ZFP and https://github.com/LLNL/zfp -* HDF5Plugin-Zstandard (commit d5afdb5) and zstd (v1.4.5): https://github.com/aparamon/HDF5Plugin-Zstandard and https://github.com/Blosc/c-blosc/tree/v1.20.1/internal-complibs/zstd-1.4.5 +See `documentation `_. Installation ------------ @@ -38,254 +26,17 @@ Installing from source can achieve better performances by enabling AVX2 and Open For more details, see the `installation documentation `_. -Documentation -------------- +How-to use +---------- To use it, just use ``import hdf5plugin`` and supported compression filters are available from `h5py `_. -Sample code: - -.. 
code-block:: python - - import numpy - import h5py - import hdf5plugin - - # Compression - f = h5py.File('test.h5', 'w') - f.create_dataset('data', data=numpy.arange(100), **hdf5plugin.LZ4()) - f.close() - - # Decompression - f = h5py.File('test.h5', 'r') - data = f['data'][()] - f.close() - -``hdf5plugin`` provides: - -* Compression option helper classes to prepare arguments to provide to ``h5py.Group.create_dataset``: - - - `Bitshuffle(nelems=0, lz4=True)`_ - - `Blosc(cname='lz4', clevel=5, shuffle=SHUFFLE)`_ - - `FciDecomp()`_ - - `LZ4(nbytes=0)`_ - - `Zfp()`_ - - -* The HDF5 filter ID of embedded plugins: - - - ``BLOSC_ID`` - - ``BSHUF_ID`` - - ``FCIDECOMP_ID`` - - ``LZ4_ID`` - - ``ZFP_ID`` - - ``ZSTD_ID`` - -* ``FILTERS``: A dictionary mapping provided filters to their ID -* ``PLUGINS_PATH``: The directory where the provided filters library are stored. - -Bitshuffle(nelems=0, lz4=True) -****************************** - -This class takes the following arguments and returns the compression options to feed into ``h5py.Group.create_dataset`` for using the bitshuffle filter: - -* **nelems** the number of elements per block, needs to be divisible by eight (default is 0, about 8kB per block) -* **lz4** if True the elements get compressed using lz4 (default is True) - -It can be passed as keyword arguments. - -Sample code: - -.. 
code-block:: python - - f = h5py.File('test.h5', 'w') - f.create_dataset('bitshuffle_with_lz4', data=numpy.arange(100), - **hdf5plugin.Bitshuffle(nelems=0, lz4=True)) - f.close() - -Blosc(cname='lz4', clevel=5, shuffle=SHUFFLE) -********************************************* - -This class takes the following arguments and returns the compression options to feed into ``h5py.Group.create_dataset`` for using the blosc filter: - -* **cname** the compression algorithm, one of: - - * 'blosclz' - * 'lz4' (default) - * 'lz4hc' - * 'snappy' (optional, requires C++11) - * 'zlib' - * 'zstd' - -* **clevel** the compression level, from 0 to 9 (default is 5) -* **shuffle** the shuffling mode, in: - - * `Blosc.NOSHUFFLE` (0): No shuffle - * `Blosc.SHUFFLE` (1): byte-wise shuffle (default) - * `Blosc.BITSHUFFLE` (2): bit-wise shuffle - -It can be passed as keyword arguments. - -Sample code: - -.. code-block:: python - - f = h5py.File('test.h5', 'w') - f.create_dataset('blosc_byte_shuffle_blosclz', data=numpy.arange(100), - **hdf5plugin.Blosc(cname='blosclz', clevel=9, shuffle=hdf5plugin.Blosc.SHUFFLE)) - f.close() - -FciDecomp() -*********** - -This class returns the compression options to feed into ``h5py.Group.create_dataset`` for using the FciDecomp filter: - -It can be passed as keyword arguments. - -Sample code: - -.. code-block:: python - - f = h5py.File('test.h5', 'w') - f.create_dataset('fcidecomp', data=numpy.arange(100), - **hdf5plugin.FciDecomp()) - f.close() - -LZ4(nbytes=0) -************* - -This class takes the number of bytes per block as argument and returns the compression options to feed into ``h5py.Group.create_dataset`` for using the lz4 filter: - -* **nbytes** number of bytes per block needs to be in the range of 0 < nbytes < 2113929216 (1,9GB). - The default value is 0 (for 1GB). - -It can be passed as keyword arguments. - -Sample code: - -.. 
code-block:: python - - f = h5py.File('test.h5', 'w') - f.create_dataset('lz4', data=numpy.arange(100), - **hdf5plugin.LZ4(nbytes=0)) - f.close() - -Zfp() -***** - -This class returns the compression options to feed into ``h5py.Group.create_dataset`` for using the zfp filter: - -It can be passed as keyword arguments. - -Sample code: - -.. code-block:: python - - f = h5py.File('test.h5', 'w') - f.create_dataset('zfp', data=numpy.random.random(100), - **hdf5plugin.Zfp()) - f.close() - -The zfp filter compression mode is defined by the provided arguments. -The following compression modes are supported: - -- **Fixed-rate** mode: - For details, see `zfp fixed-rate mode `_. - - .. code-block:: python - - f.create_dataset('zfp_fixed_rate', data=numpy.random.random(100), - **hdf5plugin.Zfp(rate=10.0)) - -- **Fixed-precision** mode: - For details, see `zfp fixed-precision mode `_. - - .. code-block:: python - - f.create_dataset('zfp_fixed_precision', data=numpy.random.random(100), - **hdf5plugin.Zfp(precision=10)) - -- **Fixed-accuracy** mode: - For details, see `zfp fixed-accuracy mode `_. - - .. code-block:: python - - f.create_dataset('zfp_fixed_accuracy', data=numpy.random.random(100), - **hdf5plugin.Zfp(accuracy=0.001)) - -- **Reversible** (i.e., lossless) mode: - For details, see `zfp reversible mode `_. - - .. code-block:: python - - f.create_dataset('zfp_reversible', data=numpy.random.random(100), - **hdf5plugin.Zfp(reversible=True)) - -- **Expert** mode: - For details, see `zfp expert mode `_. - - .. code-block:: python - - f.create_dataset('zfp_expert', data=numpy.random.random(100), - **hdf5plugin.Zfp(minbits=1, maxbits=16657, maxprec=64, minexp=-1074)) - -Zstd() -****** - -This class returns the compression options to feed into ``h5py.Group.create_dataset`` for using the Zstd filter: - -It can be passed as keyword arguments. - -Sample code: - -.. 
code-block:: python - - f = h5py.File('test.h5', 'w') - f.create_dataset('zstd', data=numpy.arange(100), - **hdf5plugin.Zstd()) - f.close() - - -Dependencies ------------- - -* `h5py `_ - -Testing -------- - -To run self-contained tests, from Python: - -.. code-block:: python - - import hdf5plugin.test - hdf5plugin.test.run_tests() - -Or, from the command line:: - - python -m hdf5plugin.test - -To also run tests relying on actual HDF5 files, run from the source directory:: - - python test/test.py - -This tests the installed version of `hdf5plugin`. +For details, see `Usage documentation `_ License ------- -The source code of *hdf5plugin* itself is licensed under the MIT license. -Use it at your own risk. -See `LICENSE `_ - -The source code of the embedded HDF5 filter plugin libraries is licensed under different open-source licenses. -Please read the different licenses: - -* bitshuffle: See `src/bitshuffle/LICENSE `_ -* blosc: See `src/hdf5-blosc/LICENSES/ `_ and `src/c-blosc/LICENSES/ `_ -* lz4: See `src/LZ4/COPYING `_ and `src/lz4-r122/LICENSE `_ -* FCIDECOMP: See `src/fcidecomp/LICENSE `_ and `src/charls/src/License.txt `_ -* zfp: See `src/H5Z-ZFP/LICENSE `_ and `src/zfp/LICENSE `_ -* zstd: See `src/HDF5Plugin-Zstandard/LICENSE` +The source code of *hdf5plugin* itself is licensed under the `MIT license `_. -The HDF5 v1.10.5 headers (and Windows .lib file) used to build the filters are stored for convenience in the repository. The license is available here: `src/hdf5/COPYING `_. +Embedded HDF5 compression filters are licensed under different open-source licenses: +see the `license documentation `_. diff --git a/doc/changelog.rst b/doc/changelog.rst new file mode 100644 index 00000000..010444b3 --- /dev/null +++ b/doc/changelog.rst @@ -0,0 +1,5 @@ +=========== + Changelog +=========== + +.. 
include:: ../CHANGELOG.rst \ No newline at end of file diff --git a/doc/contribute.rst b/doc/contribute.rst index d29396c0..0d9dec27 100644 --- a/doc/contribute.rst +++ b/doc/contribute.rst @@ -1,9 +1,29 @@ -============== - Contributing -============== +============ + Contribute +============ This project follows the standard open-source project github workflow, which is described in other projects like `matplotlib `_ or `scikit-image `_. +Testing +======= + +To run self-contained tests, from Python: + +.. code-block:: python + + import hdf5plugin.test + hdf5plugin.test.run_tests() + +Or, from the command line:: + + python -m hdf5plugin.test + +To also run tests relying on actual HDF5 files, run from the source directory:: + + python test/test.py + +This tests the installed version of `hdf5plugin`. + Guidelines to add a compression filter ====================================== @@ -46,3 +66,7 @@ This briefly describes the steps to add a HDF5 compression filter to the zoo. * Update ``doc/compression_opts.rst`` to document the format of ``compression_opts`` expected by the filter. +Low-level compression filter arguments +====================================== + +See :doc:`compression_opts`. diff --git a/doc/index.rst b/doc/index.rst new file mode 100644 index 00000000..aa578300 --- /dev/null +++ b/doc/index.rst @@ -0,0 +1,36 @@ +hdf5plugin |version| +==================== + +*hdf5plugin* provides HDF5 compression filters (namely: blosc, bitshuffle, lz4, FCIDECOMP, ZFP, zstd) and registers them to the HDF5 library used by `h5py `_. + +* Supported operating systems: Linux, Windows, macOS. +* Supported versions of Python: >= 3.4 + +*hdf5plugin* provides a generic way to enable the use of the provided HDF5 compression filters with `h5py` that can be installed via `pip` or `conda`. + +Alternatives to install HDF5 compression filters are: system-wide installation on Linux or other conda packages: `blosc-hdf5-plugin `_, `hdf5-lz4 `_. + +.. 
toctree:: + :hidden: + + changelog.rst + compression_opts.rst + contribute.rst + information.rst + usage.rst + +:doc:`usage` + How-to use *hdf5plugin* + +:doc:`information` + Releases, changelog, repository, license + +:doc:`contribute` + How-to contribute to *hdf5plugin* + +Indices and tables +================== + +* :ref:`genindex` +* :ref:`modindex` +* :ref:`search` diff --git a/doc/information.rst b/doc/information.rst new file mode 100644 index 00000000..f2168258 --- /dev/null +++ b/doc/information.rst @@ -0,0 +1,45 @@ +===================== + Project information +===================== + +Releases +-------- + +Source code and pre-built binaries (aka Python wheels) for Windows, MacOS and +ManyLinux are available at the following places: + +- `Wheels and source code on PyPi `_ +- `Packages on conda-forge `_ + +For the history of modifications, see the :doc:`changelog`. + +Project resources +----------------- + +- `Source repository `_ +- `Issue tracker `_ +- Continuous integration: *hdf5plugin* is continuously tested on all three major + operating systems: + + - Linux, MacOS, Windows: `GitHub Actions `_ + - Windows: `AppVeyor `_ +- `Weekly builds `_ + +License +------- + +The source code of *hdf5plugin* itself is licensed under the MIT license. +Use it at your own risk. +See `LICENSE `_. + +The source code of the embedded HDF5 filter plugin libraries is licensed under different open-source licenses. +Please read the different licenses: + +* bitshuffle: See `src/bitshuffle/LICENSE `_ +* blosc: See `src/hdf5-blosc/LICENSES/ `_ and `src/c-blosc/LICENSES/ `_ +* lz4: See `src/LZ4/COPYING `_ and `src/lz4-r122/LICENSE `_ +* FCIDECOMP: See `src/fcidecomp/LICENSE `_ and `src/charls/src/License.txt `_ +* zfp: See `src/H5Z-ZFP/LICENSE `_ and `src/zfp/LICENSE `_ +* zstd: See `src/HDF5Plugin-Zstandard/LICENSE `_ + +The HDF5 v1.10.5 headers (and Windows .lib file) used to build the filters are stored for convenience in the repository. 
The license is available here: `src/hdf5/COPYING `_. diff --git a/doc/usage.rst b/doc/usage.rst new file mode 100644 index 00000000..a4212593 --- /dev/null +++ b/doc/usage.rst @@ -0,0 +1,207 @@ +======= + Usage +======= + +To use it, just use ``import hdf5plugin`` and supported compression filters are available from `h5py `_. + +Sample code: + +.. code-block:: python + + import numpy + import h5py + import hdf5plugin + + # Compression + f = h5py.File('test.h5', 'w') + f.create_dataset('data', data=numpy.arange(100), **hdf5plugin.LZ4()) + f.close() + + # Decompression + f = h5py.File('test.h5', 'r') + data = f['data'][()] + f.close() + +``hdf5plugin`` provides: + +* Compression option helper classes to prepare arguments to provide to ``h5py.Group.create_dataset``: + + - `Bitshuffle(nelems=0, lz4=True)`_ + - `Blosc(cname='lz4', clevel=5, shuffle=SHUFFLE)`_ + - `FciDecomp()`_ + - `LZ4(nbytes=0)`_ + - `Zfp()`_ + + +* The HDF5 filter ID of embedded plugins: + + - ``BLOSC_ID`` + - ``BSHUF_ID`` + - ``FCIDECOMP_ID`` + - ``LZ4_ID`` + - ``ZFP_ID`` + - ``ZSTD_ID`` + +* ``FILTERS``: A dictionary mapping provided filters to their ID +* ``PLUGINS_PATH``: The directory where the provided filters library are stored. + +Bitshuffle(nelems=0, lz4=True) +============================== + +This class takes the following arguments and returns the compression options to feed into ``h5py.Group.create_dataset`` for using the bitshuffle filter: + +* **nelems** the number of elements per block, needs to be divisible by eight (default is 0, about 8kB per block) +* **lz4** if True the elements get compressed using lz4 (default is True) + +It can be passed as keyword arguments. + +Sample code: + +.. 
code-block:: python + + f = h5py.File('test.h5', 'w') + f.create_dataset('bitshuffle_with_lz4', data=numpy.arange(100), + **hdf5plugin.Bitshuffle(nelems=0, lz4=True)) + f.close() + +Blosc(cname='lz4', clevel=5, shuffle=SHUFFLE) +============================================= + +This class takes the following arguments and returns the compression options to feed into ``h5py.Group.create_dataset`` for using the blosc filter: + +* **cname** the compression algorithm, one of: + + * 'blosclz' + * 'lz4' (default) + * 'lz4hc' + * 'snappy' (optional, requires C++11) + * 'zlib' + * 'zstd' + +* **clevel** the compression level, from 0 to 9 (default is 5) +* **shuffle** the shuffling mode, in: + + * `Blosc.NOSHUFFLE` (0): No shuffle + * `Blosc.SHUFFLE` (1): byte-wise shuffle (default) + * `Blosc.BITSHUFFLE` (2): bit-wise shuffle + +It can be passed as keyword arguments. + +Sample code: + +.. code-block:: python + + f = h5py.File('test.h5', 'w') + f.create_dataset('blosc_byte_shuffle_blosclz', data=numpy.arange(100), + **hdf5plugin.Blosc(cname='blosclz', clevel=9, shuffle=hdf5plugin.Blosc.SHUFFLE)) + f.close() + +FciDecomp() +=========== + +This class returns the compression options to feed into ``h5py.Group.create_dataset`` for using the FciDecomp filter: + +It can be passed as keyword arguments. + +Sample code: + +.. code-block:: python + + f = h5py.File('test.h5', 'w') + f.create_dataset('fcidecomp', data=numpy.arange(100), + **hdf5plugin.FciDecomp()) + f.close() + +LZ4(nbytes=0) +============= + +This class takes the number of bytes per block as argument and returns the compression options to feed into ``h5py.Group.create_dataset`` for using the lz4 filter: + +* **nbytes** number of bytes per block needs to be in the range of 0 < nbytes < 2113929216 (1,9GB). + The default value is 0 (for 1GB). + +It can be passed as keyword arguments. + +Sample code: + +.. 
code-block:: python + + f = h5py.File('test.h5', 'w') + f.create_dataset('lz4', data=numpy.arange(100), + **hdf5plugin.LZ4(nbytes=0)) + f.close() + +Zfp() +===== + +This class returns the compression options to feed into ``h5py.Group.create_dataset`` for using the zfp filter: + +It can be passed as keyword arguments. + +Sample code: + +.. code-block:: python + + f = h5py.File('test.h5', 'w') + f.create_dataset('zfp', data=numpy.random.random(100), + **hdf5plugin.Zfp()) + f.close() + +The zfp filter compression mode is defined by the provided arguments. +The following compression modes are supported: + +- **Fixed-rate** mode: + For details, see `zfp fixed-rate mode `_. + + .. code-block:: python + + f.create_dataset('zfp_fixed_rate', data=numpy.random.random(100), + **hdf5plugin.Zfp(rate=10.0)) + +- **Fixed-precision** mode: + For details, see `zfp fixed-precision mode `_. + + .. code-block:: python + + f.create_dataset('zfp_fixed_precision', data=numpy.random.random(100), + **hdf5plugin.Zfp(precision=10)) + +- **Fixed-accuracy** mode: + For details, see `zfp fixed-accuracy mode `_. + + .. code-block:: python + + f.create_dataset('zfp_fixed_accuracy', data=numpy.random.random(100), + **hdf5plugin.Zfp(accuracy=0.001)) + +- **Reversible** (i.e., lossless) mode: + For details, see `zfp reversible mode `_. + + .. code-block:: python + + f.create_dataset('zfp_reversible', data=numpy.random.random(100), + **hdf5plugin.Zfp(reversible=True)) + +- **Expert** mode: + For details, see `zfp expert mode `_. + + .. code-block:: python + + f.create_dataset('zfp_expert', data=numpy.random.random(100), + **hdf5plugin.Zfp(minbits=1, maxbits=16657, maxprec=64, minexp=-1074)) + +Zstd() +====== + +This class returns the compression options to feed into ``h5py.Group.create_dataset`` for using the Zstd filter: + +It can be passed as keyword arguments. + +Sample code: + +.. 
code-block:: python + + f = h5py.File('test.h5', 'w') + f.create_dataset('zstd', data=numpy.arange(100), + **hdf5plugin.Zstd()) + f.close() From a5c6a5f35f97d90efe1d16d0aa62b061b03d486b Mon Sep 17 00:00:00 2001 From: Thomas VINCENT Date: Wed, 23 Jun 2021 16:04:55 +0200 Subject: [PATCH 08/24] Fix link --- README.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/README.rst b/README.rst index a95b9195..f732b93e 100644 --- a/README.rst +++ b/README.rst @@ -31,7 +31,7 @@ How-to use To use it, just use ``import hdf5plugin`` and supported compression filters are available from `h5py `_. -For details, see `Usage documentation `_ +For details, see `Usage documentation `_. License ------- @@ -39,4 +39,4 @@ License The source code of *hdf5plugin* itself is licensed under the `MIT license `_. Embedded HDF5 compression filters are licensed under different open-source licenses: -see the `license documentation `_. +see the `license documentation `_. From 4448bfb3052711e6c97acfefe2dfbd28b103245d Mon Sep 17 00:00:00 2001 From: Thomas VINCENT Date: Wed, 23 Jun 2021 16:16:01 +0200 Subject: [PATCH 09/24] rework abstract sentence --- README.rst | 4 +--- doc/index.rst | 2 +- 2 files changed, 2 insertions(+), 4 deletions(-) diff --git a/README.rst b/README.rst index f732b93e..44fb1f87 100644 --- a/README.rst +++ b/README.rst @@ -1,9 +1,7 @@ hdf5plugin ========== -This module provides HDF5 compression filters (namely: blosc, bitshuffle, lz4, FCIDECOMP, ZFP, zstd) and registers them to the HDF5 library used by `h5py `_. - -`hdf5plugin` provides a generic way to enable the use of the provided HDF5 compression filters with `h5py` that can be installed via `pip` or `conda`. +*hdf5plugin* provides HDF5 compression filters (namely: blosc, bitshuffle, lz4, FCIDECOMP, ZFP, zstd) and makes them usable from `h5py `_. See `documentation `_. 
diff --git a/doc/index.rst b/doc/index.rst index aa578300..d600a042 100644 --- a/doc/index.rst +++ b/doc/index.rst @@ -1,7 +1,7 @@ hdf5plugin |version| ==================== -*hdf5plugin* provides HDF5 compression filters (namely: blosc, bitshuffle, lz4, FCIDECOMP, ZFP, zstd) and registers them to the HDF5 library used by `h5py `_. +*hdf5plugin* provides HDF5 compression filters (namely: blosc, bitshuffle, lz4, FCIDECOMP, ZFP, zstd) and makes them usable from `h5py `_. * Supported operating systems: Linux, Windows, macOS. * Supported versions of Python: >= 3.4 From eb42421aa8de5447647110a93c209fd46680f336 Mon Sep 17 00:00:00 2001 From: Thomas VINCENT Date: Wed, 23 Jun 2021 16:18:39 +0200 Subject: [PATCH 10/24] remove documentation dependency on hdf5plugin --- doc/conf.py | 5 +---- doc/index.rst | 4 ++-- 2 files changed, 3 insertions(+), 6 deletions(-) diff --git a/doc/conf.py b/doc/conf.py index 718b31a5..885b6e12 100644 --- a/doc/conf.py +++ b/doc/conf.py @@ -18,10 +18,7 @@ # -- Project information ----------------------------------------------------- project = 'hdf5plugin' -from hdf5plugin import strictversion, version, __date__ as _date -release = strictversion -year = _date.split("/")[-1] -copyright = u'2016-%s, Data analysis unit, European Synchrotron Radiation Facility, Grenoble' % year +copyright = u'2016-2021, Data analysis unit, European Synchrotron Radiation Facility, Grenoble' author = 'ESRF - Data Analysis Unit' # -- General configuration --------------------------------------------------- diff --git a/doc/index.rst b/doc/index.rst index d600a042..c72ce339 100644 --- a/doc/index.rst +++ b/doc/index.rst @@ -1,5 +1,5 @@ -hdf5plugin |version| -==================== +hdf5plugin +========== *hdf5plugin* provides HDF5 compression filters (namely: blosc, bitshuffle, lz4, FCIDECOMP, ZFP, zstd) and makes them usable from `h5py `_. 
From c5e91d1eb3ea21b9fb55a8f48ed51d8abad709cf Mon Sep 17 00:00:00 2001 From: Thomas VINCENT Date: Wed, 23 Jun 2021 16:23:22 +0200 Subject: [PATCH 11/24] add build doc to documentation --- doc/contribute.rst | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/doc/contribute.rst b/doc/contribute.rst index 0d9dec27..1656f04b 100644 --- a/doc/contribute.rst +++ b/doc/contribute.rst @@ -24,6 +24,15 @@ To also run tests relying on actual HDF5 files, run from the source directory:: This tests the installed version of `hdf5plugin`. +Building documentation +====================== + +Documentation relies on `Sphinx `_. + +To build documentation, run from the project root directory:: + + sphinx-build -b html doc/ build/html + Guidelines to add a compression filter ====================================== From 06c5848a912f8de97e2a1314c7b01667d4f8fcbe Mon Sep 17 00:00:00 2001 From: Thomas VINCENT Date: Wed, 23 Jun 2021 16:31:48 +0200 Subject: [PATCH 12/24] Add reference of used source code --- doc/information.rst | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/doc/information.rst b/doc/information.rst index f2168258..df006560 100644 --- a/doc/information.rst +++ b/doc/information.rst @@ -25,6 +25,18 @@ Project resources - Windows: `AppVeyor `_ - `Weekly builds `_ +HDF5 filters and compression libraries +-------------------------------------- + +HDF5 compression filters and compression libraries sources were obtained from: + +* LZ4 plugin (v0.1.0) and lz4 (v1.3.0, tag r122): https://github.com/nexusformat/HDF5-External-Filter-Plugins, https://github.com/lz4/lz4 +* bitshuffle plugin (0.3.5): https://github.com/kiyo-masui/bitshuffle +* hdf5-blosc plugin (v1.0.0), c-blosc (v1.20.1) and snappy (v1.1.1): https://github.com/Blosc/hdf5-blosc, https://github.com/Blosc/c-blosc and https://github.com/Blosc/c-blosc/tree/v1.17.0/internal-complibs/snappy-1.1.1 +* FCIDECOMP plugin (v1.0.2) and CharLS (branch 1.x-master SHA1 ID:25160a42fb62e71e4b0ce081f5cb3f8bb73938b5): 
ftp://ftp.eumetsat.int/pub/OPS/out/test-data/Test-data-for-External-Users/MTG_FCI_Test-Data/FCI_Decompression_Software_V1.0.2/ and https://github.com/team-charls/charls.git +* HDF5-ZFP plugin (v1.0.1) and zfp (v0.5.5): https://github.com/LLNL/H5Z-ZFP and https://github.com/LLNL/zfp +* HDF5Plugin-Zstandard (commit d5afdb5) and zstd (v1.4.5): https://github.com/aparamon/HDF5Plugin-Zstandard and https://github.com/Blosc/c-blosc/tree/v1.20.1/internal-complibs/zstd-1.4.5 + License ------- From 6af842ae0658137cb748dd39a9ae23954350643d Mon Sep 17 00:00:00 2001 From: Thomas VINCENT Date: Wed, 23 Jun 2021 16:32:13 +0200 Subject: [PATCH 13/24] update contribute doc --- doc/contribute.rst | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/doc/contribute.rst b/doc/contribute.rst index 1656f04b..9bdbdcb6 100644 --- a/doc/contribute.rst +++ b/doc/contribute.rst @@ -66,10 +66,13 @@ This briefly describes the steps to add a HDF5 compression filter to the zoo. - In ``test/test.py`` for testing reading a compressed file that was produced with another software. - In ``src/hdf5plugin/test.py`` for tests that writes data using the compression filter and the compression options helper function and reads back the data. -* Update the ``README.rst`` file to document: +* Update the ``information.rst`` file to document: - The version of the HDF5 filter that is embedded in ``hdf5plugin``. - The license of the filter (by adding a link to the license file). + +* Update the ``usage.rst`` file to document: + - The ``hdf5plugin.`` filter ID "CONSTANT". - The ``hdf5plugin._options`` compression helper function. 
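The compression helper functions mentioned in the contribution guidelines above are, in the source, small mapping classes that unpack via ``**`` into the ``compression``/``compression_opts`` arguments of ``create_dataset``. A rough, self-contained sketch of that pattern — the filter ID ``32008`` is purely illustrative, and the option layout mirrors the bitshuffle description (``(nelems, lz4_enabled)`` with lz4 enabled as ``2``) — might look like:

```python
from collections.abc import Mapping


class FilterOptions(Mapping):
    """Sketch of a compression-options helper: unpacks via ** into the
    compression/compression_opts arguments of h5py.Group.create_dataset."""

    filter_id = 32008  # illustrative HDF5 filter ID, not taken from the source

    def __init__(self, nelems=0, lz4=True):
        # bitshuffle blocks are counted in elements and must be a multiple of 8
        assert nelems % 8 == 0
        self.filter_options = (nelems, 2 if lz4 else 0)
        self._kwargs = {'compression': self.filter_id,
                        'compression_opts': self.filter_options}

    def __len__(self):
        return len(self._kwargs)

    def __iter__(self):
        return iter(self._kwargs)

    def __getitem__(self, item):
        return self._kwargs[item]


opts = FilterOptions(nelems=0, lz4=True)
assert dict(opts) == {'compression': 32008, 'compression_opts': (0, 2)}
```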
From a636619f5b8c348ea70edc5a4a98eb4f3ef25376 Mon Sep 17 00:00:00 2001 From: Thomas VINCENT Date: Thu, 24 Jun 2021 11:26:23 +0200 Subject: [PATCH 14/24] Add install doc to index --- doc/index.rst | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/doc/index.rst b/doc/index.rst index c72ce339..44161ded 100644 --- a/doc/index.rst +++ b/doc/index.rst @@ -17,8 +17,12 @@ Alternatives to install HDF5 compression filters are: system-wide installation o compression_opts.rst contribute.rst information.rst + install.rst usage.rst +:doc:`install` + How-to install *hdf5plugin* + :doc:`usage` How-to use *hdf5plugin* From 3b9d0e656ebb58eb7fa7615bb65e7d5fdae8bee1 Mon Sep 17 00:00:00 2001 From: Thomas VINCENT Date: Fri, 25 Jun 2021 10:18:47 +0200 Subject: [PATCH 15/24] Move low level compression options doc to contribute --- doc/compression_opts.rst | 88 -------------------- doc/contribute.rst | 172 ++++++++++++++++++++++++++++++++++++++- doc/index.rst | 19 ++--- 3 files changed, 177 insertions(+), 102 deletions(-) delete mode 100644 doc/compression_opts.rst diff --git a/doc/compression_opts.rst b/doc/compression_opts.rst deleted file mode 100644 index 4c01aee9..00000000 --- a/doc/compression_opts.rst +++ /dev/null @@ -1,88 +0,0 @@ -===================== - Compression options -===================== - -Compression filters can be configured with the ``compression_opts`` argument of `h5py.Group.create_dataset `_ method by providing a tuple of integers. - -The meaning of those integers is filter dependent and is described below. - -bitshuffle -.......... - -compression_opts: (**block_size**, **lz4 compression**) - -- **block size**: Number of elements (not bytes) per block. - It MUST be a mulitple of 8. - Default: 0 for a block size of about 8 kB. -- **lz4 compression**: 0: disabled (default), 2: enabled. - -By default the filter uses bitshuffle, but does NOT compress with LZ4. - -blosc -..... 
- -compression_opts: (0, 0, 0, 0, **compression level**, **shuffle**, **compression**) - -- First 4 values are reserved. -- **compression level**: - From 0 (no compression) to 9 (maximum compression). - Default: 5. -- **shuffle**: Shuffle filter: - - * 0: no shuffle - * 1: byte shuffle - * 2: bit shuffle - -- **compression**: The compressor blosc ID: - - * 0: blosclz (default) - * 1: lz4 - * 2: lz4hc - * 3: snappy - * 4: zlib - * 5: zstd - -By default the filter uses byte shuffle and blosclz. - -lz4 -... - -compression_opts: (**block_size**,) - -- **block size**: Number of bytes per block. - Default 0 for a block size of 1GB. - It MUST be < 1.9 GB. - -zfp -... - -For more information, see `zfp modes `_ and `hdf5-zfp generic interface `_. - -The first value of *compression_opts* is **mode**. -The following values depends on the value of **mode**: - -- *Fixed-rate* mode: (1, 0, **rateHigh**, **rateLow**, 0, 0) - Rate, i.e., number of compressed bits per value, as a double stored as: - - - **rateHigh**: High 32-bit word of the rate double. - - **rateLow**: Low 32-bit word of the rate double. - -- *Fixed-precision* mode: (2, 0, **prec**, 0, 0, 0) - - - **prec**: Number of uncompressed bits per value. - -- *Fixed-accuracy* mode: (3, 0, **accHigh**, **accLow**, 0, 0) - Accuracy, i.e., absolute error tolerance, as a double stored as: - - - **accHigh**: High 32-bit word of the accuracy double. - - **accLow**: Low 32-bit word of the accuracy double. - -- *Expert* mode: (4, 0, **minbits**, **maxbits**, **maxprec**, **minexp**) - - - **minbits**: Minimum number of compressed bits used to represent a block. - - **maxbits**: Maximum number of bits used to represent a block. - - **maxprec**: Maximum number of bit planes encoded. - - **minexp**: Smallest absolute bit plane number encoded. 
- -- *Reversible* mode: (5, 0, 0, 0, 0, 0) - diff --git a/doc/contribute.rst b/doc/contribute.rst index 9bdbdcb6..7bb98220 100644 --- a/doc/contribute.rst +++ b/doc/contribute.rst @@ -66,19 +66,183 @@ This briefly describes the steps to add a HDF5 compression filter to the zoo. - In ``test/test.py`` for testing reading a compressed file that was produced with another software. - In ``src/hdf5plugin/test.py`` for tests that writes data using the compression filter and the compression options helper function and reads back the data. -* Update the ``information.rst`` file to document: +* Update the ``doc/information.rst`` file to document: - The version of the HDF5 filter that is embedded in ``hdf5plugin``. - The license of the filter (by adding a link to the license file). -* Update the ``usage.rst`` file to document: +* Update the ``doc/usage.rst`` file to document: - The ``hdf5plugin.`` filter ID "CONSTANT". - The ``hdf5plugin._options`` compression helper function. -* Update ``doc/compression_opts.rst`` to document the format of ``compression_opts`` expected by the filter. +* Update ``doc/contribute.rst`` to document the format of ``compression_opts`` expected by the filter (see `Low-level compression filter arguments`_ below).
Low-level compression filter arguments ====================================== -See :doc:`compression_opts`. +Compression filters can be configured with the ``compression_opts`` argument of `h5py.Group.create_dataset `_ method by providing a tuple of integers. 
+ +The meaning of those integers is filter dependent and is described below. + +bitshuffle +.......... + +compression_opts: (**block_size**, **lz4 compression**) + +- **block size**: Number of elements (not bytes) per block. + It MUST be a multiple of 8. + Default: 0 for a block size of about 8 kB. +- **lz4 compression**: 0: disabled (default), 2: enabled. + +By default the filter uses bitshuffle, but does NOT compress with LZ4. + +blosc +..... + +compression_opts: (0, 0, 0, 0, **compression level**, **shuffle**, **compression**) + +- First 4 values are reserved. +- **compression level**: + From 0 (no compression) to 9 (maximum compression). + Default: 5. +- **shuffle**: Shuffle filter: + + * 0: no shuffle + * 1: byte shuffle + * 2: bit shuffle + +- **compression**: The compressor blosc ID: + + * 0: blosclz (default) + * 1: lz4 + * 2: lz4hc + * 3: snappy + * 4: zlib + * 5: zstd + +By default the filter uses byte shuffle and blosclz. + +lz4 +... + +compression_opts: (**block_size**,) + +- **block size**: Number of bytes per block. + Default 0 for a block size of 1GB. + It MUST be < 1.9 GB. + +zfp +... + +For more information, see `zfp modes `_ and `hdf5-zfp generic interface `_. + +The first value of *compression_opts* is **mode**. +The following values depend on the value of **mode**: + +- *Fixed-rate* mode: (1, 0, **rateHigh**, **rateLow**, 0, 0) + Rate, i.e., number of compressed bits per value, as a double stored as: + + - **rateHigh**: High 32-bit word of the rate double. + - **rateLow**: Low 32-bit word of the rate double. + +- *Fixed-precision* mode: (2, 0, **prec**, 0, 0, 0) + + - **prec**: Number of uncompressed bits per value. + +- *Fixed-accuracy* mode: (3, 0, **accHigh**, **accLow**, 0, 0) + Accuracy, i.e., absolute error tolerance, as a double stored as: + + - **accHigh**: High 32-bit word of the accuracy double. + - **accLow**: Low 32-bit word of the accuracy double. 
+ +- *Expert* mode: (4, 0, **minbits**, **maxbits**, **maxprec**, **minexp**) + + - **minbits**: Minimum number of compressed bits used to represent a block. + - **maxbits**: Maximum number of bits used to represent a block. + - **maxprec**: Maximum number of bit planes encoded. + - **minexp**: Smallest absolute bit plane number encoded. + +- *Reversible* mode: (5, 0, 0, 0, 0, 0) diff --git a/doc/index.rst b/doc/index.rst index 44161ded..5da672f3 100644 --- a/doc/index.rst +++ b/doc/index.rst @@ -10,16 +10,6 @@ hdf5plugin Alternatives to install HDF5 compression filters are: system-wide installation on Linux or other conda packages: `blosc-hdf5-plugin `_, `hdf5-lz4 `_. -.. toctree:: - :hidden: - - changelog.rst - compression_opts.rst - contribute.rst - information.rst - install.rst - usage.rst - :doc:`install` How-to install *hdf5plugin* @@ -32,6 +22,15 @@ Alternatives to install HDF5 compression filters are: system-wide installation o :doc:`contribute` How-to contribute to *hdf5plugin* +.. 
toctree:: + :hidden: + + install.rst + usage.rst + information.rst + contribute.rst + changelog.rst + Indices and tables ================== From 3f970e869609b413190f88900cacd9e099d1a841 Mon Sep 17 00:00:00 2001 From: Thomas VINCENT Date: Fri, 25 Jun 2021 10:21:23 +0200 Subject: [PATCH 16/24] use read_the_docs theme --- doc/conf.py | 10 +++++++++- setup.py | 1 + 2 files changed, 10 insertions(+), 1 deletion(-) diff --git a/doc/conf.py b/doc/conf.py index 885b6e12..4a3f01e2 100644 --- a/doc/conf.py +++ b/doc/conf.py @@ -14,6 +14,11 @@ # import sys # sys.path.insert(0, os.path.abspath('.')) +import os + +# See https://docs.readthedocs.io/en/stable/builds.html#build-environment +on_rtd = os.environ.get('READTHEDOCS') == 'True' + # -- Project information ----------------------------------------------------- @@ -29,6 +34,9 @@ extensions = [ ] +if not on_rtd: + extensions.append('sphinx_rtd_theme') + # List of patterns, relative to source directory, that match files and # directories to ignore when looking for source files. # This pattern also affects html_static_path and html_extra_path. @@ -40,4 +48,4 @@ # The theme to use for HTML and HTML Help pages. See the documentation for # a list of builtin themes. 
# -html_theme = 'default' +html_theme = 'default' if on_rtd else 'sphinx_rtd_theme' diff --git a/setup.py b/setup.py index 193f01f7..7736ca3a 100644 --- a/setup.py +++ b/setup.py @@ -752,6 +752,7 @@ def make_distribution(self): ext_modules=extensions, install_requires=['h5py'], setup_requires=['setuptools'], + extras_require={'dev': ['sphinx', 'sphinx_rtd_theme']}, cmdclass=cmdclass, libraries=libraries, zip_safe=False, From 9e6091bd37be0dcfd57ce162defb5630cce2fc58 Mon Sep 17 00:00:00 2001 From: Thomas VINCENT Date: Fri, 25 Jun 2021 11:22:21 +0200 Subject: [PATCH 17/24] use sphinx autodoc to avoid duplicating docstring/doc --- doc/conf.py | 1 + doc/usage.rst | 183 ++++++----------------------------- src/hdf5plugin/__init__.py | 191 ++++++++++++++++++++++++++++--------- 3 files changed, 179 insertions(+), 196 deletions(-) diff --git a/doc/conf.py b/doc/conf.py index 4a3f01e2..9a925c72 100644 --- a/doc/conf.py +++ b/doc/conf.py @@ -32,6 +32,7 @@ # extensions coming with Sphinx (named 'sphinx.ext.*') or your custom # ones. extensions = [ + 'sphinx.ext.autodoc', ] if not on_rtd: diff --git a/doc/usage.rst b/doc/usage.rst index a4212593..7146ae5f 100644 --- a/doc/usage.rst +++ b/doc/usage.rst @@ -2,6 +2,8 @@ Usage ======= +.. currentmodule:: hdf5plugin + To use it, just use ``import hdf5plugin`` and supported compression filters are available from `h5py `_. Sample code: @@ -26,11 +28,11 @@ Sample code: * Compression option helper classes to prepare arguments to provide to ``h5py.Group.create_dataset``: - - `Bitshuffle(nelems=0, lz4=True)`_ - - `Blosc(cname='lz4', clevel=5, shuffle=SHUFFLE)`_ - - `FciDecomp()`_ - - `LZ4(nbytes=0)`_ - - `Zfp()`_ + - `Bitshuffle`_ + - `Blosc`_ + - `FciDecomp`_ + - `LZ4`_ + - `Zfp`_ * The HDF5 filter ID of embedded plugins: @@ -45,163 +47,38 @@ Sample code: * ``FILTERS``: A dictionary mapping provided filters to their ID * ``PLUGINS_PATH``: The directory where the provided filters library are stored. 
-Bitshuffle(nelems=0, lz4=True) -============================== - -This class takes the following arguments and returns the compression options to feed into ``h5py.Group.create_dataset`` for using the bitshuffle filter: - -* **nelems** the number of elements per block, needs to be divisible by eight (default is 0, about 8kB per block) -* **lz4** if True the elements get compressed using lz4 (default is True) - -It can be passed as keyword arguments. - -Sample code: - -.. code-block:: python - - f = h5py.File('test.h5', 'w') - f.create_dataset('bitshuffle_with_lz4', data=numpy.arange(100), - **hdf5plugin.Bitshuffle(nelems=0, lz4=True)) - f.close() - -Blosc(cname='lz4', clevel=5, shuffle=SHUFFLE) -============================================= - -This class takes the following arguments and returns the compression options to feed into ``h5py.Group.create_dataset`` for using the blosc filter: - -* **cname** the compression algorithm, one of: - - * 'blosclz' - * 'lz4' (default) - * 'lz4hc' - * 'snappy' (optional, requires C++11) - * 'zlib' - * 'zstd' - -* **clevel** the compression level, from 0 to 9 (default is 5) -* **shuffle** the shuffling mode, in: - - * `Blosc.NOSHUFFLE` (0): No shuffle - * `Blosc.SHUFFLE` (1): byte-wise shuffle (default) - * `Blosc.BITSHUFFLE` (2): bit-wise shuffle - -It can be passed as keyword arguments. - -Sample code: - -.. code-block:: python - - f = h5py.File('test.h5', 'w') - f.create_dataset('blosc_byte_shuffle_blosclz', data=numpy.arange(100), - **hdf5plugin.Blosc(cname='blosclz', clevel=9, shuffle=hdf5plugin.Blosc.SHUFFLE)) - f.close() - -FciDecomp() -=========== - -This class returns the compression options to feed into ``h5py.Group.create_dataset`` for using the FciDecomp filter: - -It can be passed as keyword arguments. - -Sample code: - -.. 
code-block:: python - - f = h5py.File('test.h5', 'w') - f.create_dataset('fcidecomp', data=numpy.arange(100), - **hdf5plugin.FciDecomp()) - f.close() - -LZ4(nbytes=0) -============= +Bitshuffle +========== -This class takes the number of bytes per block as argument and returns the compression options to feed into ``h5py.Group.create_dataset`` for using the lz4 filter: +.. autoclass:: Bitshuffle + :undoc-members: filter_id -* **nbytes** number of bytes per block needs to be in the range of 0 < nbytes < 2113929216 (1,9GB). - The default value is 0 (for 1GB). - -It can be passed as keyword arguments. - -Sample code: - -.. code-block:: python - - f = h5py.File('test.h5', 'w') - f.create_dataset('lz4', data=numpy.arange(100), - **hdf5plugin.LZ4(nbytes=0)) - f.close() - -Zfp() +Blosc ===== -This class returns the compression options to feed into ``h5py.Group.create_dataset`` for using the zfp filter: - -It can be passed as keyword arguments. - -Sample code: - -.. code-block:: python - - f = h5py.File('test.h5', 'w') - f.create_dataset('zfp', data=numpy.random.random(100), - **hdf5plugin.Zfp()) - f.close() - -The zfp filter compression mode is defined by the provided arguments. -The following compression modes are supported: - -- **Fixed-rate** mode: - For details, see `zfp fixed-rate mode `_. +.. autoclass:: Blosc + :undoc-members: filter_id - .. code-block:: python +FciDecomp +========= - f.create_dataset('zfp_fixed_rate', data=numpy.random.random(100), - **hdf5plugin.Zfp(rate=10.0)) +.. autoclass:: FciDecomp + :undoc-members: filter_id -- **Fixed-precision** mode: - For details, see `zfp fixed-precision mode `_. +LZ4 +=== - .. code-block:: python +.. autoclass:: LZ4 + :undoc-members: filter_id - f.create_dataset('zfp_fixed_precision', data=numpy.random.random(100), - **hdf5plugin.Zfp(precision=10)) +Zfp +=== -- **Fixed-accuracy** mode: - For details, see `zfp fixed-accuracy mode `_. +.. autoclass:: Zfp + :undoc-members: filter_id - .. 
code-block:: python - - f.create_dataset('zfp_fixed_accuracy', data=numpy.random.random(100), - **hdf5plugin.Zfp(accuracy=0.001)) - -- **Reversible** (i.e., lossless) mode: - For details, see `zfp reversible mode `_. - - .. code-block:: python - - f.create_dataset('zfp_reversible', data=numpy.random.random(100), - **hdf5plugin.Zfp(reversible=True)) - -- **Expert** mode: - For details, see `zfp expert mode `_. - - .. code-block:: python - - f.create_dataset('zfp_expert', data=numpy.random.random(100), - **hdf5plugin.Zfp(minbits=1, maxbits=16657, maxprec=64, minexp=-1074)) - -Zstd() -====== - -This class returns the compression options to feed into ``h5py.Group.create_dataset`` for using the Zstd filter: - -It can be passed as keyword arguments. - -Sample code: - -.. code-block:: python +Zstd +==== - f = h5py.File('test.h5', 'w') - f.create_dataset('zstd', data=numpy.arange(100), - **hdf5plugin.Zstd()) - f.close() +.. autoclass:: Zstd + :undoc-members: filter_id diff --git a/src/hdf5plugin/__init__.py b/src/hdf5plugin/__init__.py index d3e42e50..fc18a859 100644 --- a/src/hdf5plugin/__init__.py +++ b/src/hdf5plugin/__init__.py @@ -117,14 +117,57 @@ def __getitem__(self, item): return self._kwargs[item] +class Bitshuffle(_FilterRefClass): + """``h5py.Group.create_dataset``'s compression arguments for using bitshuffle filter. + + It can be passed as keyword arguments: + + .. code-block:: python + + f = h5py.File('test.h5', 'w') + f.create_dataset( + 'bitshuffle_with_lz4', + data=numpy.arange(100), + **hdf5plugin.Bitshuffle(nelems=0, lz4=True)) + f.close() + + :param int nelems: + The number of elements per block. + It needs to be divisible by eight (default is 0, about 8kB per block) + Default: 0 (for about 8kB per block). + :param bool lz4: + Whether to use lz4 compression or not as part of the filter. 
+ Default: True + """ + filter_id = BSHUF_ID + + def __init__(self, nelems=0, lz4=True): + nelems = int(nelems) + assert nelems % 8 == 0 + + lz4_enabled = 2 if lz4 else 0 + self.filter_options = (nelems, lz4_enabled) + + class Blosc(_FilterRefClass): - """h5py.Group.create_dataset's compression and compression_opts arguments for using blosc filter. + """``h5py.Group.create_dataset``'s compression arguments for using blosc filter. + + It can be passed as keyword arguments: + + .. code-block:: python + + f = h5py.File('test.h5', 'w') + f.create_dataset( + 'blosc_byte_shuffle_blosclz', + data=numpy.arange(100), + **hdf5plugin.Blosc(cname='blosclz', clevel=9, shuffle=hdf5plugin.Blosc.SHUFFLE)) + f.close() :param str cname: `blosclz`, `lz4` (default), `lz4hc`, `zlib`, `zstd` Optional: `snappy`, depending on compilation (requires C++11). :param int clevel: - Compression level from 0 no compression to 9 maximum compression. + Compression level from 0 (no compression) to 9 (maximum compression). Default: 5. :param int shuffle: One of: - Blosc.NOSHUFFLE (0): No shuffle @@ -160,31 +203,45 @@ def __init__(self, cname='lz4', clevel=5, shuffle=SHUFFLE): self.filter_options = (0, 0, 0, 0, clevel, shuffle, compression) -class Bitshuffle(_FilterRefClass): - """h5py.Group.create_dataset's compression and compression_opts arguments for using bitshuffle filter. +class FciDecomp(_FilterRefClass): + """``h5py.Group.create_dataset``'s compression arguments for using FciDecomp filter. - :param int nelems: - The number of elements per block. - Default: 0 (for about 8kB per block). - :param bool lz4: - Whether to use LZ4_ID compression or not as part of the filter. - Default: True - """ - filter_id = BSHUF_ID + It can be passed as keyword arguments: - def __init__(self, nelems=0, lz4=True): - nelems = int(nelems) - assert nelems % 8 == 0 + .. 
code-block:: python - lz4_enabled = 2 if lz4 else 0 - self.filter_options = (nelems, lz4_enabled) + f = h5py.File('test.h5', 'w') + f.create_dataset( + 'fcidecomp', + data=numpy.arange(100), + **hdf5plugin.FciDecomp()) + f.close() + """ + filter_id = FCIDECOMP_ID + + def __init__(self, *args, **kwargs): + super().__init__(*args, **kwargs) + if not config.cpp11: + _logger.error( + "The FciDecomp filter is not available as hdf5plugin was not built with C++11.\n" + "You may need to reinstall hdf5plugin with a recent version of pip, or rebuild it with a newer compiler.") class LZ4(_FilterRefClass): - """h5py.Group.create_dataset's compression and compression_opts arguments for using lz4 filter. + """``h5py.Group.create_dataset``'s compression arguments for using lz4 filter. - :param int nelems: + It can be passed as keyword arguments: + + .. code-block:: python + + f = h5py.File('test.h5', 'w') + f.create_dataset('lz4', data=numpy.arange(100), + **hdf5plugin.LZ4(nbytes=0)) + f.close() + + :param int nbytes: The number of bytes per block. + It needs to be in the range of 0 < nbytes < 2113929216 (1,9GB). Default: 0 (for 1GB per block). """ filter_id = LZ4_ID @@ -196,20 +253,70 @@ def __init__(self, nbytes=0): class Zfp(_FilterRefClass): - """h5py.Group.create_dataset's compression and compression_opts arguments for using ZFP filter. + """``h5py.Group.create_dataset``'s compression arguments for using ZFP filter. + + It can be passed as keyword arguments: + + .. code-block:: python + + f = h5py.File('test.h5', 'w') + f.create_dataset( + 'zfp', + data=numpy.random.random(100), + **hdf5plugin.Zfp()) + f.close() This filter provides different modes: - - **Fixed-rate** mode: To use, set the `rate` argument. - For details, see https://zfp.readthedocs.io/en/latest/modes.html#fixed-rate-mode. - - **Fixed-precision** mode: To use, set the `precision` argument. - For details, see https://zfp.readthedocs.io/en/latest/modes.html#fixed-precision-mode. 
- - **Fixed-accuracy** mode: To use, set the `accuracy` argument - For details, see https://zfp.readthedocs.io/en/latest/modes.html#fixed-accuracy-mode. - - **Reversible** (i.e., lossless) mode: To use, set the `reversible` argument to True - For details, see https://zfp.readthedocs.io/en/latest/modes.html#reversible-mode. - - **Expert** mode: To use, set the `minbits`, `maxbits`, `maxprec` and ``minexp` arguments. - For details, see https://zfp.readthedocs.io/en/latest/modes.html#expert-mode. + - **Fixed-rate** mode: To use, set the ``rate`` argument. + For details, see `zfp fixed-rate mode `_. + + .. code-block:: python + + f.create_dataset( + 'zfp_fixed_rate', + data=numpy.random.random(100), + **hdf5plugin.Zfp(rate=10.0)) + + - **Fixed-precision** mode: To use, set the ``precision`` argument. + For details, see `zfp fixed-precision mode `_. + + .. code-block:: python + + f.create_dataset( + 'zfp_fixed_precision', + data=numpy.random.random(100), + **hdf5plugin.Zfp(precision=10)) + + - **Fixed-accuracy** mode: To use, set the ``accuracy`` argument + For details, see `zfp fixed-accuracy mode `_. + + .. code-block:: python + + f.create_dataset( + 'zfp_fixed_accuracy', + data=numpy.random.random(100), + **hdf5plugin.Zfp(accuracy=0.001)) + + - **Reversible** (i.e., lossless) mode: To use, set the ``reversible`` argument to True + For details, see `zfp reversible mode `_. + + .. code-block:: python + + f.create_dataset( + 'zfp_reversible', + data=numpy.random.random(100), + **hdf5plugin.Zfp(reversible=True)) + + - **Expert** mode: To use, set the ``minbits``, ``maxbits``, ``maxprec`` and ``minexp`` arguments. + For details, see `zfp expert mode `_. + + .. code-block:: python + + f.create_dataset( + 'zfp_expert', + data=numpy.random.random(100), + **hdf5plugin.Zfp(minbits=1, maxbits=16657, maxprec=64, minexp=-1074)) :param float rate: Use fixed-rate mode and set the number of compressed bits per value. 
@@ -271,22 +378,20 @@ def __init__(self,
 
 
 class Zstd(_FilterRefClass):
-    """h5py.Group.create_dataset's compression and compression_opts arguments for using FciDecomp filter.
-    """
-    filter_id = ZSTD_ID
+    """``h5py.Group.create_dataset``'s compression arguments for using Zstd filter.
 
+    It can be passed as keyword arguments:
 
-class FciDecomp(_FilterRefClass):
-    """h5py.Group.create_dataset's compression and compression_opts arguments for using FciDecomp filter.
-    """
-    filter_id = FCIDECOMP_ID
+    .. code-block:: python
 
-    def __init__(self, *args, **kwargs):
-        super().__init__(*args, **kwargs)
-        if not config.cpp11:
-            _logger.error(
-                "The FciDecomp filter is not available as hdf5plugin was not built with C++11.\n"
-                "You may need to reinstall hdf5plugin with a recent version of pip, or rebuild it with a newer compiler.")
+        f = h5py.File('test.h5', 'w')
+        f.create_dataset(
+            'zstd',
+            data=numpy.arange(100),
+            **hdf5plugin.Zstd())
+        f.close()
+    """
+    filter_id = ZSTD_ID
 
 
 def _init_filters():

From 87f13caf549e0c4ec6bb2c6534c49be14d622ce1 Mon Sep 17 00:00:00 2001
From: Thomas VINCENT
Date: Fri, 25 Jun 2021 11:50:00 +0200
Subject: [PATCH 18/24] rework usage documentation

---
 doc/usage.rst | 86 ++++++++++++++++++++++++++++++---------------------
 1 file changed, 51 insertions(+), 35 deletions(-)

diff --git a/doc/usage.rst b/doc/usage.rst
index 7146ae5f..a3bd3106 100644
--- a/doc/usage.rst
+++ b/doc/usage.rst
@@ -4,81 +4,97 @@
 
 .. currentmodule:: hdf5plugin
 
-To use it, just use ``import hdf5plugin`` and supported compression filters are available from `h5py <https://www.h5py.org>`_.
+``hdf5plugin`` allows using additional HDF5 compression filters with `h5py`_ for reading and writing compressed datasets.
 
-Sample code:
+Available constants:
+
+* ``hdf5plugin.FILTERS``: A dictionary mapping provided filters to their ID
+* ``hdf5plugin.PLUGINS_PATH``: The directory where the provided filter libraries are stored.
+
+Read compressed datasets
+++++++++++++++++++++++++
+
+In order to read compressed datasets with `h5py`_, use:
 
 .. code-block:: python
 
-    import numpy
-    import h5py
-    import hdf5plugin
+    import hdf5plugin
 
-    # Compression
-    f = h5py.File('test.h5', 'w')
-    f.create_dataset('data', data=numpy.arange(100), **hdf5plugin.LZ4())
-    f.close()
+It registers the ``hdf5plugin`` supported compression filters with the HDF5 library used by `h5py`_.
+Hence, HDF5 compressed datasets can be read as any other dataset (see `h5py documentation <https://docs.h5py.org/en/stable/high/dataset.html#reading-writing-data>`_).
 
-    # Decompression
-    f = h5py.File('test.h5', 'r')
-    data = f['data'][()]
-    f.close()
+Write compressed datasets
++++++++++++++++++++++++++
 
-``hdf5plugin`` provides:
+As for reading compressed datasets, ``import hdf5plugin`` is required to enable the supported compression filters.
 
-* Compression option helper classes to prepare arguments to provide to ``h5py.Group.create_dataset``:
+To create a compressed dataset use `h5py.Group.create_dataset`_ and set the ``compression`` and ``compression_opts`` arguments.
 
-  - `Bitshuffle`_
-  - `Blosc`_
-  - `FciDecomp`_
-  - `LZ4`_
-  - `Zfp`_
+``hdf5plugin`` provides helpers to prepare those compression options: `Bitshuffle`_, `Blosc`_, `FciDecomp`_, `LZ4`_, `Zfp`_, `Zstd`_.
 
+Sample code:
 
-* The HDF5 filter ID of embedded plugins:
+.. code-block:: python
 
-  - ``BLOSC_ID``
-  - ``BSHUF_ID``
-  - ``FCIDECOMP_ID``
-  - ``LZ4_ID``
-  - ``ZFP_ID``
-  - ``ZSTD_ID``
+    import numpy
+    import h5py
+    import hdf5plugin
+
+    # Compression
+    f = h5py.File('test.h5', 'w')
+    f.create_dataset('data', data=numpy.arange(100), **hdf5plugin.LZ4())
+    f.close()
+
+    # Decompression
+    f = h5py.File('test.h5', 'r')
+    data = f['data'][()]
+    f.close()
+
+Relevant `h5py`_ documentation: `Filter pipeline <https://docs.h5py.org/en/stable/high/dataset.html#filter-pipeline>`_ and `Chunked Storage <https://docs.h5py.org/en/stable/high/dataset.html#chunked-storage>`_.
 
-* ``FILTERS``: A dictionary mapping provided filters to their ID
-* ``PLUGINS_PATH``: The directory where the provided filters library are stored.
 
 Bitshuffle
 ==========
 
 ..
autoclass:: Bitshuffle
-   :undoc-members: filter_id
+   :members:
+   :undoc-members:
 
 Blosc
 =====
 
 .. autoclass:: Blosc
-   :undoc-members: filter_id
+   :members:
+   :undoc-members:
 
 FciDecomp
 =========
 
 .. autoclass:: FciDecomp
-   :undoc-members: filter_id
+   :members:
+   :undoc-members:
 
 LZ4
 ===
 
 .. autoclass:: LZ4
-   :undoc-members: filter_id
+   :members:
+   :undoc-members:
 
 Zfp
 ===
 
 .. autoclass:: Zfp
-   :undoc-members: filter_id
+   :members:
+   :undoc-members:
 
 Zstd
 ====
 
 .. autoclass:: Zstd
-   :undoc-members: filter_id
+   :members:
+   :undoc-members:
+
+
+.. _h5py: https://www.h5py.org
+.. _h5py.Group.create_dataset: https://docs.h5py.org/en/stable/high/group.html#h5py.Group.create_dataset
\ No newline at end of file

From 3d1d22def1b1a4dc56198c83bd623ee058a4ac76 Mon Sep 17 00:00:00 2001
From: Thomas VINCENT
Date: Fri, 25 Jun 2021 11:52:42 +0200
Subject: [PATCH 19/24] update and clean contribute doc

---
 doc/contribute.rst | 85 +---------------------------------------------
 1 file changed, 1 insertion(+), 84 deletions(-)

diff --git a/doc/contribute.rst b/doc/contribute.rst
index 7bb98220..0c4e2541 100644
--- a/doc/contribute.rst
+++ b/doc/contribute.rst
@@ -73,93 +73,10 @@ This briefly describes the steps to add a HDF5 compression filter to the zoo.
 
 * Update the ``doc/usage.rst`` file to document:
 
-  - The ``hdf5plugin.`` filter ID "CONSTANT".
-  - The ``hdf5plugin._options`` compression helper function.
+  - The ``hdf5plugin.`` compression argument helper class.
 
 * Update ``doc/contribute.rst`` to document the format of ``compression_opts``
   expected by the filter (see `Low-level compression filter arguments`_ below).
 
-Compression filters can be configured with the ``compression_opts`` argument of
-`h5py.Group.create_dataset <https://docs.h5py.org/en/stable/high/group.html#h5py.Group.create_dataset>`_ method by providing a tuple of integers.
-The meaning of those integers is filter dependent and is described below.
-
-bitshuffle
-..........
-
-compression_opts: (**block_size**, **lz4 compression**)
-
-- **block size**: Number of elements (not bytes) per block.
-  It MUST be a mulitple of 8.
- Default: 0 for a block size of about 8 kB. -- **lz4 compression**: 0: disabled (default), 2: enabled. - -By default the filter uses bitshuffle, but does NOT compress with LZ4. - -blosc -..... - -compression_opts: (0, 0, 0, 0, **compression level**, **shuffle**, **compression**) - -- First 4 values are reserved. -- **compression level**: - From 0 (no compression) to 9 (maximum compression). - Default: 5. -- **shuffle**: Shuffle filter: - - * 0: no shuffle - * 1: byte shuffle - * 2: bit shuffle - -- **compression**: The compressor blosc ID: - - * 0: blosclz (default) - * 1: lz4 - * 2: lz4hc - * 3: snappy - * 4: zlib - * 5: zstd - -By default the filter uses byte shuffle and blosclz. - -lz4 -... - -compression_opts: (**block_size**,) - -- **block size**: Number of bytes per block. - Default 0 for a block size of 1GB. - It MUST be < 1.9 GB. - -zfp -... - -For more information, see `zfp modes `_ and `hdf5-zfp generic interface `_. - -The first value of *compression_opts* is **mode**. -The following values depends on the value of **mode**: - -- *Fixed-rate* mode: (1, 0, **rateHigh**, **rateLow**, 0, 0) - Rate, i.e., number of compressed bits per value, as a double stored as: - - - **rateHigh**: High 32-bit word of the rate double. - - **rateLow**: Low 32-bit word of the rate double. - -- *Fixed-precision* mode: (2, 0, **prec**, 0, 0, 0) - - - **prec**: Number of uncompressed bits per value. - -- *Fixed-accuracy* mode: (3, 0, **accHigh**, **accLow**, 0, 0) - Accuracy, i.e., absolute error tolerance, as a double stored as: - - - **accHigh**: High 32-bit word of the accuracy double. - - **accLow**: Low 32-bit word of the accuracy double. - -- *Expert* mode: (4, 0, **minbits**, **maxbits**, **maxprec**, **minexp**) - - - **minbits**: Minimum number of compressed bits used to represent a block. - - **maxbits**: Maximum number of bits used to represent a block. - - **maxprec**: Maximum number of bit planes encoded. - - **minexp**: Smallest absolute bit plane number encoded. 
-
-- *Reversible* mode: (5, 0, 0, 0, 0, 0)
 
 
 Low-level compression filter arguments
 ======================================
 

From e8edf4e571380aeb911fccca25b2cae5ca35094e Mon Sep 17 00:00:00 2001
From: Thomas VINCENT
Date: Fri, 25 Jun 2021 11:54:25 +0200
Subject: [PATCH 20/24] update build doc

---
 doc/contribute.rst | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/doc/contribute.rst b/doc/contribute.rst
index 0c4e2541..b06e704d 100644
--- a/doc/contribute.rst
+++ b/doc/contribute.rst
@@ -31,7 +31,8 @@ Documentation relies on `Sphinx <http://www.sphinx-doc.org>`_.
 
 To build documentation, run from the project root directory::
 
-    sphinx-build -b html doc/ build/html
+    python setup.py build
+    PYTHONPATH=build/lib.<platform>-<architecture>-<python version>/ sphinx-build -b html doc/ build/html
 
 Guidelines to add a compression filter
 ======================================

From 09a6855a2f3c560428c7d77bdcaa6313beb2fbf1 Mon Sep 17 00:00:00 2001
From: Thomas VINCENT
Date: Fri, 25 Jun 2021 12:03:54 +0200
Subject: [PATCH 21/24] Make README point to online doc

---
 README.rst | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/README.rst b/README.rst
index 44fb1f87..67273dcc 100644
--- a/README.rst
+++ b/README.rst
@@ -3,7 +3,7 @@ hdf5plugin
 
 *hdf5plugin* provides HDF5 compression filters (namely: blosc, bitshuffle, lz4, FCIDECOMP, ZFP, zstd) and makes them usable from `h5py <https://www.h5py.org>`_.
 
-See `documentation `_.
+See `documentation `_.
 
 Installation
 ------------
@@ -22,14 +22,14 @@ To install from source and recompile the HDF5 plugins, run::
 
 Installing from source can achieve better performances by enabling AVX2 and OpenMP if available.
 
-For more details, see the `installation documentation `_.
+For more details, see the `installation documentation `_.
 
 How-to use
 ----------
 
 To use it, just use ``import hdf5plugin`` and supported compression filters are available from `h5py <https://www.h5py.org>`_.
 
-For details, see `Usage documentation `_.
+For details, see `Usage documentation `_.
License ------- @@ -37,4 +37,4 @@ License The source code of *hdf5plugin* itself is licensed under the `MIT license `_. Embedded HDF5 compression filters are licensed under different open-source licenses: -see the `license documentation `_. +see the `license documentation `_. From dd37cd1b2355c92af6ea0012d6fc73a40b485b68 Mon Sep 17 00:00:00 2001 From: Thomas VINCENT Date: Fri, 25 Jun 2021 14:06:29 +0200 Subject: [PATCH 22/24] Add supported architectures --- doc/index.rst | 2 ++ 1 file changed, 2 insertions(+) diff --git a/doc/index.rst b/doc/index.rst index 5da672f3..1783024a 100644 --- a/doc/index.rst +++ b/doc/index.rst @@ -5,6 +5,8 @@ hdf5plugin * Supported operating systems: Linux, Windows, macOS. * Supported versions of Python: >= 3.4 +* Supported architectures: All. + Specific optimizations are available for *x86* family and *ppc64le*. *hdf5plugin* provides a generic way to enable the use of the provided HDF5 compression filters with `h5py` that can be installed via `pip` or `conda`. 
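The supported-architectures note in the patch above (generic support everywhere, with extra optimizations on the *x86* family and *ppc64le*) can be illustrated with a short, standard-library-only sketch. The ``OPTIMIZED`` mapping below is an assumption drawn from the build options documented in this series, not part of hdf5plugin's API:

```python
import platform

# Illustrative mapping (an assumption, not hdf5plugin API): architectures for
# which this patch series documents specific build optimizations.
OPTIMIZED = {
    "x86_64": "SSE2/AVX2 when probed",
    "AMD64": "SSE2/AVX2 when probed",  # Windows reports x86_64 as AMD64
    "ppc64le": "SSE2 enabled by default",
}

# Any other value of platform.machine() falls back to a generic build.
machine = platform.machine()
print(machine, "->", OPTIMIZED.get(machine, "generic build, no specific optimizations"))
```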
From ac61ada0f85f46188f1d91a67c484a2131093e78 Mon Sep 17 00:00:00 2001
From: Thomas VINCENT
Date: Fri, 25 Jun 2021 14:47:45 +0200
Subject: [PATCH 23/24] add extra usage of plugins

---
 doc/usage.rst | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/doc/usage.rst b/doc/usage.rst
index a3bd3106..d32b28d5 100644
--- a/doc/usage.rst
+++ b/doc/usage.rst
@@ -95,6 +95,20 @@ Zstd
    :members:
    :undoc-members:
 
+Use HDF5 filters in other applications
+++++++++++++++++++++++++++++++++++++++
+
+Non `h5py`_ or non-Python users can also benefit from the supplied HDF5 compression filters for reading compressed datasets by setting the ``HDF5_PLUGIN_PATH`` environment variable to the value of ``hdf5plugin.PLUGINS_PATH``, which can be retrieved from the command line with::
+
+    python -c "import hdf5plugin; print(hdf5plugin.PLUGINS_PATH)"
+
+For instance::
+
+    export HDF5_PLUGIN_PATH=$(python -c "import hdf5plugin; print(hdf5plugin.PLUGINS_PATH)")
+
+should allow MATLAB or IDL users to read data compressed using the supported plugins.
+
+Setting the ``HDF5_PLUGIN_PATH`` environment variable allows already existing programs or Python code to read compressed data without any modification.
 
 .. _h5py: https://www.h5py.org
-.. _h5py.Group.create_dataset: https://docs.h5py.org/en/stable/high/group.html#h5py.Group.create_dataset
\ No newline at end of file
+..
_h5py.Group.create_dataset: https://docs.h5py.org/en/stable/high/group.html#h5py.Group.create_dataset From 3de2f3c19e2927b78f1f7db3f7b0fb44c011ab8d Mon Sep 17 00:00:00 2001 From: Thomas VINCENT Date: Fri, 25 Jun 2021 15:29:18 +0200 Subject: [PATCH 24/24] fixed fcidecomp links in doc --- doc/information.rst | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/doc/information.rst b/doc/information.rst index df006560..da728c15 100644 --- a/doc/information.rst +++ b/doc/information.rst @@ -33,7 +33,9 @@ HDF5 compression filters and compression libraries sources were obtained from: * LZ4 plugin (v0.1.0) and lz4 (v1.3.0, tag r122): https://github.com/nexusformat/HDF5-External-Filter-Plugins, https://github.com/lz4/lz4 * bitshuffle plugin (0.3.5): https://github.com/kiyo-masui/bitshuffle * hdf5-blosc plugin (v1.0.0), c-blosc (v1.20.1) and snappy (v1.1.1): https://github.com/Blosc/hdf5-blosc, https://github.com/Blosc/c-blosc and https://github.com/Blosc/c-blosc/tree/v1.17.0/internal-complibs/snappy-1.1.1 -* FCIDECOMP plugin (v1.0.2) and CharLS (branch 1.x-master SHA1 ID:25160a42fb62e71e4b0ce081f5cb3f8bb73938b5): ftp://ftp.eumetsat.int/pub/OPS/out/test-data/Test-data-for-External-Users/MTG_FCI_Test-Data/FCI_Decompression_Software_V1.0.2/ and https://github.com/team-charls/charls.git +* FCIDECOMP plugin (v1.0.2) and CharLS (branch 1.x-master SHA1 ID: 25160a42fb62e71e4b0ce081f5cb3f8bb73938b5): + ftp://ftp.eumetsat.int/pub/OPS/out/test-data/Test-data-for-External-Users/MTG_FCI_Test-Data/FCI_Decompression_Software_V1.0.2 and + https://github.com/team-charls/charls * HDF5-ZFP plugin (v1.0.1) and zfp (v0.5.5): https://github.com/LLNL/H5Z-ZFP and https://github.com/LLNL/zfp * HDF5Plugin-Zstandard (commit d5afdb5) and zstd (v1.4.5): https://github.com/aparamon/HDF5Plugin-Zstandard and https://github.com/Blosc/c-blosc/tree/v1.20.1/internal-complibs/zstd-1.4.5
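The low-level zfp ``compression_opts`` layout quoted in the ``doc/contribute.rst`` diff above packs the fixed-rate value as ``(1, 0, rateHigh, rateLow, 0, 0)``, where ``rateHigh`` and ``rateLow`` are the high and low 32-bit words of the rate double. A minimal sketch of that encoding in plain Python (an illustrative helper following the documented layout, not part of hdf5plugin's API):

```python
import struct

def zfp_fixed_rate_opts(rate):
    """Build the (mode, 0, rateHigh, rateLow, 0, 0) tuple for zfp fixed-rate
    mode, following the layout documented in doc/contribute.rst.
    """
    # Pack the rate as a big-endian IEEE 754 double, then split it into
    # its high and low 32-bit words.
    packed = struct.pack(">d", rate)
    rate_high, rate_low = struct.unpack(">II", packed)
    return (1, 0, rate_high, rate_low, 0, 0)

# 10.0 is 0x4024000000000000 as a double: high word 0x40240000, low word 0.
print(zfp_fixed_rate_opts(10.0))  # (1, 0, 1076101120, 0, 0, 0)
```

The same word-splitting applies to the fixed-accuracy tolerance; the other modes use plain integers directly.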