Adding / importing JPEG filter #85

Open
petoor opened this issue Sep 11, 2020 · 19 comments

Comments

@petoor

petoor commented Sep 11, 2020

Hi.

I've been looking at this lossy JPEG filter, which I'd very much like to try: https://github.com/CARS-UChicago/jpegHDF5
Is it possible to add it / import it with the hdf5plugin module?

Best regards
Peter

@t20100
Member

t20100 commented Sep 11, 2020

Hi,

It sounds possible to add it to hdf5plugin.
The only issue I see is that jpegHDF5 seems to have no license, but that can hopefully be tackled by contacting the author.

If you feel like it, a pull request is welcome!

See https://github.com/silx-kit/hdf5plugin/blob/master/doc/contribute.rst and of course, we can provide support.

@vasole
Member

vasole commented Sep 11, 2020

Besides the license, which should not be a big issue, the main question I see is whether we want to build the libjpeg library ourselves or rely on the system's. System libjpeg libraries seem to be libjpeg-turbo, which has plenty of compilation options for the different architectures, and it would be a nightmare to try to handle that on our side.

@t20100
Member

t20100 commented Sep 11, 2020

Indeed, that can be a bit complicated.
So far we've been embedding the source of the codec libs in the repository to ease installation from source.

For generating the wheels, libjpeg-turbo is in the manylinux docker, so it should be possible to make wheels which will embed the libjpeg.
For building from source, either we embed the source of libjpeg or libjpeg-turbo and write the correct extension/lib in setup.py, or we leave this filter optional and require libjpeg to be already installed on the system.
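
As a rough illustration of the "optional filter" route, a build script could probe for a system libjpeg and enable the JPEG filter only when one is found. This is a minimal sketch, not hdf5plugin's actual build logic: `system_libjpeg` and `optional_filters` are hypothetical helper names, and `ctypes.util.find_library` only locates a shared library, so it says nothing about whether the development headers are installed.

```python
from ctypes.util import find_library


def system_libjpeg():
    """Return the name or path of a system libjpeg, or None if none is found."""
    # find_library looks for a shared library roughly the way the platform
    # linker would; it does not check for jpeglib.h or other headers.
    return find_library("jpeg")


def optional_filters():
    """List the optional filters this hypothetical build would enable."""
    enabled = []
    if system_libjpeg() is not None:
        enabled.append("jpeg")
    return enabled


print("optional filters to build:", optional_filters())
```

A real setup.py would additionally need to find the headers and pass the library to the extension's link options.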

@vasole
Member

vasole commented Sep 11, 2020

I guess the simplest solution is to provide the old reference implementation, with the possibility for the user to compile and link against the system libjpeg. That way the default implementation would be 2x to 6x slower, but it would still work.

That filter seems to work only for 8-bit integers.

@petoor
Author

petoor commented Sep 13, 2020

I reached out to the author, and he has now added a license to the file (Apache 2).
Is there an easy way to link libjpeg to hdf5plugin? I don't mind giving this a go myself, but C is not really my strong suit.

@vasole
Member

vasole commented Sep 13, 2020

I have tried to incorporate the plugin into our hdf5plugin building chain.

It is straightforward if one can rely on an installed version of libjpeg-turbo.

We have to make a decision. After all, it is a filter just for uint8 data.

@petoor
Author

petoor commented Sep 13, 2020

Alright, that sounds good.
I would love to check it out; how do I do that?
uint8 data is used a lot in machine learning, especially computer vision. I think it would be useful for a lot of people to have a uint8 filter.

@MarkRivers

The only thing you should need to use this filter is a shareable library or DLL with that filter in the directory pointed to by the HDF5_PLUGIN_PATH environment variable.

Here is an example where I have an existing HDF5 file that I saved with the JPEG plugin. I am using a very old version of h5dump (1.8.12), so it cannot possibly know about the JPEG filter.

(base) corvette:~/scratch>/usr/bin/h5dump --version
h5dump: Version 1.8.12

This is the contents of the HDF5 file:

(base) corvette:~/scratch>/usr/bin/h5dump --contents test_hdf5_mono_jpeg_q90_326.h5
HDF5 "test_hdf5_mono_jpeg_q90_326.h5" {
FILE_CONTENTS {
 group      /
 group      /entry
 group      /entry/data
 dataset    /entry/data/data
 group      /entry/instrument
 group      /entry/instrument/NDAttributes
 dataset    /entry/instrument/NDAttributes/AcquireTime
 dataset    /entry/instrument/NDAttributes/AttributesFileNative
 dataset    /entry/instrument/NDAttributes/AttributesFileParam
 dataset    /entry/instrument/NDAttributes/AttributesFileString
 dataset    /entry/instrument/NDAttributes/CameraManufacturer
 dataset    /entry/instrument/NDAttributes/CameraModel
 dataset    /entry/instrument/NDAttributes/E
 dataset    /entry/instrument/NDAttributes/Gettysburg
 dataset    /entry/instrument/NDAttributes/ID_Energy
 dataset    /entry/instrument/NDAttributes/ID_Energy_EGU
 dataset    /entry/instrument/NDAttributes/ImageCounter
 dataset    /entry/instrument/NDAttributes/MaxSizeX
 dataset    /entry/instrument/NDAttributes/MaxSizeY
 dataset    /entry/instrument/NDAttributes/NDArrayEpicsTSSec
 dataset    /entry/instrument/NDAttributes/NDArrayEpicsTSnSec
 dataset    /entry/instrument/NDAttributes/NDArrayTimeStamp
 dataset    /entry/instrument/NDAttributes/NDArrayUniqueId
 dataset    /entry/instrument/NDAttributes/Pi
 dataset    /entry/instrument/NDAttributes/RingCurrent
 dataset    /entry/instrument/NDAttributes/RingCurrent_EGU
 dataset    /entry/instrument/NDAttributes/Ten
 group      /entry/instrument/detector
 group      /entry/instrument/detector/NDAttributes
 dataset    /entry/instrument/detector/NDAttributes/ColorMode
 dataset    /entry/instrument/detector/data -> /entry/data/data
 group      /entry/instrument/performance
 dataset    /entry/instrument/performance/timestamp
 }
}

This is h5dump -p which shows the filter information.

(base) corvette:~/scratch>/usr/bin/h5dump -p -d /entry/data/data test_hdf5_mono_jpeg_q90_326.h5 | more
HDF5 "test_hdf5_mono_jpeg_q90_326.h5" {
DATASET "/entry/data/data" {
   DATATYPE  H5T_STD_U8LE
   DATASPACE  SIMPLE { ( 1024, 1024 ) / ( 1024, 1024 ) }
   STORAGE_LAYOUT {
      CHUNKED ( 1024, 1024 )
      SIZE 107505 (9.754:1 COMPRESSION)
   }
   FILTERS {
      USER_DEFINED_FILTER {
         FILTER_ID 32019
         COMMENT jpeg; see https://github.com/CARS-UChicago/jpegHDF5
         PARAMS { 90 1024 1024 0 }
      }
   }
   FILLVALUE {
      FILL_TIME H5D_FILL_TIME_IFSET
      VALUE  0
   }
   ALLOCATION_TIME {
      H5D_ALLOC_TIME_INCR
   }
   DATA {
   (0,0): 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204,
   (0,13): 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217,
   (0,26): 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230,
   (0,39): 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243,
   (0,52): 244, 245, 246, 247, 248, 250, 252, 248, 252, 253, 255, 253, 0, 1,
   (0,66): 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19,
   (0,84): 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35,
   (0,100): 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51,
   (0,116): 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67,
   (0,132): 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83,
   (0,148): 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99,
   (0,164): 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112,
   (0,177): 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125,
   (0,190): 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138,
   (0,203): 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151,
   (0,216): 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164,
   (0,229): 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177,
   (0,242): 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190,
   (0,255): 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203,
   (0,268): 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216,
   (0,281): 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229,
   (0,294): 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242,
(base) corvette:~/scratch>

So it finds the JPEG plugin, prints the correct information, and decodes the data correctly.

This is a 1024x1024 UInt8 image, so it would be about 1MB if it were not compressed. It was compressed with Quality=90, and this is the actual size:

(base) corvette:~/scratch>ls -lh test_hdf5_mono_jpeg_q90_326.h5
-rw-rw-r-- 1 epics domain users 182K May  8  2019 test_hdf5_mono_jpeg_q90_326.h5
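
The ratio reported by h5dump can be checked by hand from the numbers in the output above. The sketch below only redoes that arithmetic; it reads no file. Note that the 182K file is larger than the 107505 bytes stored for the image dataset, presumably because the file also holds the attribute datasets listed earlier plus HDF5 metadata (that explanation is an assumption, not something stated in this thread).

```python
# Figures taken from the h5dump -p output above.
rows, cols = 1024, 1024
bytes_per_pixel = 1      # H5T_STD_U8LE is one byte per pixel
stored_bytes = 107505    # SIZE reported for /entry/data/data

raw_bytes = rows * cols * bytes_per_pixel
ratio = raw_bytes / stored_bytes

print(f"raw size: {raw_bytes} bytes ({raw_bytes / 2**20:.0f} MiB)")
print(f"compression ratio: {ratio:.3f}:1")
```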

This is my HDF5_PLUGIN_PATH

(base) corvette:~/scratch>echo $HDF5_PLUGIN_PATH
/home/epics/devel/areaDetector/ADSupport/lib/linux-x86_64

These are the jpeg files in that directory.

(base) corvette:~/scratch>ls -lh $HDF5_PLUGIN_PATH/*jpeg*
-r-xr-xr-x 1 epics domain users  14K Jul  7 12:11 /home/epics/devel/areaDetector/ADSupport/lib/linux-x86_64/libHDF5_jpeg_plugin.so
-r--r--r-- 1 epics domain users 386K Jul  7 12:10 /home/epics/devel/areaDetector/ADSupport/lib/linux-x86_64/libjpeg.a
-r-xr-xr-x 1 epics domain users 311K Jul  7 12:10 /home/epics/devel/areaDetector/ADSupport/lib/linux-x86_64/libjpeg.so

In my case I am building libjpeg from source and putting it in that directory. I do that so I can ensure that libjpeg is available and is a working version on all architectures that I build for. This includes 32 and 64-bit Windows, 32 and 64-bit Linux, MacOS, vxWorks and a number of others. It saves my users from needing to find and install the libjpeg library. But this is not necessary; it should also work fine with the system version of libjpeg.
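
The same HDF5_PLUGIN_PATH setup can be done from Python before the HDF5 library is loaded. This is a minimal sketch under a few assumptions: `add_plugin_dir` and `read_dataset` are hypothetical helper names, the directory must contain the plugin shared library as in the listing above, and the variable is set before h5py is imported to be on the safe side, since the point at which HDF5 reads it may depend on the HDF5 version.

```python
import os


def add_plugin_dir(plugin_dir, env=None):
    """Prepend plugin_dir to HDF5_PLUGIN_PATH in env (default: os.environ)."""
    env = os.environ if env is None else env
    current = env.get("HDF5_PLUGIN_PATH", "")
    # Entries are separated by ':' on Unix and ';' on Windows (os.pathsep).
    parts = [p for p in current.split(os.pathsep) if p]
    if plugin_dir not in parts:
        parts.insert(0, plugin_dir)
    env["HDF5_PLUGIN_PATH"] = os.pathsep.join(parts)
    return env["HDF5_PLUGIN_PATH"]


def read_dataset(path, name="/entry/data/data"):
    """Read a whole dataset; HDF5 loads the filter plugin on demand."""
    import h5py  # imported late, after the environment variable is set
    with h5py.File(path, "r") as f:
        return f[name][()]
```

Typical use would be `add_plugin_dir("/path/to/plugins")` followed by `read_dataset("test_hdf5_mono_jpeg_q90_326.h5")`, where the plugin directory is a placeholder.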

@vasole
Member

vasole commented Sep 13, 2020

@MarkRivers, what sources of libjpeg are you using?

@vasole
Member

vasole commented Sep 13, 2020

@petoor

The branch jpeg https://github.com/silx-kit/hdf5plugin/tree/jpeg shows what things would look like when adding the plugin using a static version of libjpeg-turbo as the JPEG library. That code is just for illustration purposes and is only for Windows. I do not think we'll integrate the filter.

The main interest of hdf5plugin is that it decouples the version of the HDF5 library used when building the plugin from the HDF5 version available when using the plugin. If the plugins supplied by Mark can be used with multiple versions of HDF5, then there is little interest in adding this filter to our list. You can just take the plugin from him.

@MarkRivers

@MarkRivers, what sources of libjpeg are you using?

The repository where I build libjpeg is here:
https://github.com/areaDetector/ADsupport

It builds the following libraries:

  • Bitshuffle and lz4
  • Blosc
  • CBF
  • GraphicsMagick
  • HDF5
  • JPEG
  • netCDF
  • NeXus
  • SZIP
  • TIFF
  • XML2
  • ZLIB

Each directory has a README.epics that says what version of the source code is used and any modifications made. The Makefile is always new, because the builds are done using the EPICS build system for OS-independence. In many cases minor changes were made to the source code to allow it to be built on vxWorks, etc.

When building areaDetector one can select whether to use the system version of any library, or to use the version built in ADSupport. The ADSupport versions have the following advantages:

  • Version is known to work with areaDetector plugins.
  • Additional operating systems are supported compared to the original version.
  • No need to involve system administrators in installing packages to allow areaDetector to be built.

@vasole
Member

vasole commented Sep 13, 2020

@MarkRivers Thank you.

I have seen that you are using compatibility with JPEG version 9, whereas the default compatibility mode for libjpeg-turbo is 6.2.

https://github.com/areaDetector/ADSupport/blob/4767b1afcaa676045d4bf9ee68a25448bb8a0b58/supportApp/jpegSrc/os/default/jpeglib.h#L40

Clearly if one wants to remain compatible with the source (you), only your sources have to be used.

@MarkRivers

Clearly if one wants to remain compatible with the source (you), only your sources have to be used.

I am not sure that is true. I shared my JPEG plugin with the HDF Group, but not my version of libjpeg built from source. They added it to the HDF5 distribution, both the HDF5 source and plugin binaries, and they tested it. But they must have tested with some system version of libjpeg, because they did not use my libjpeg source. Maybe the API for the functions the plugin uses has not changed between 6.2 and 9?

@vasole
Member

vasole commented Sep 13, 2020

It could well be. Not being an expert, I do not know whether the incompatibilities can affect the output.

I have been able to build your plugin against libjpeg-turbo built with 6.2 compatibility mode (its default). However, unless you perform a systematic check to verify it, I would not take the risk to use libraries built with different compatibility settings.

The modern JPEG libraries can be built with different compatibility settings; perhaps it is enough that you specify your targeted compatibility.

@MarkRivers

Maybe the API for the functions the plugin uses has not changed between 6.2 and 9?

6.2 and 9 are maintained by different organizations. 9 is maintained by the Independent JPEG Group: https://www.ijg.org/. 6.2 appears to be a dead end with no further development.

In October 2016 I updated ADSupport from JPEG 6.2 to 9b. However, at that time I did not need to make any changes to the areaDetector JPEG file writing plugin: https://github.com/areaDetector/ADCore/blob/master/ADApp/pluginSrc/NDFileJPEG.cpp. This tells me that the API did not change between 6.2 and 9b.

@vasole
Member

vasole commented Sep 13, 2020

If you do not use any extension mentioned in https://en.wikipedia.org/wiki/Libjpeg it should be fine.

@petoor
Author

petoor commented Sep 14, 2020

I managed to use the JPEG filter following your guide, @MarkRivers.
It also compresses the file when called through h5py with the compression=32019 argument.
Does anyone know how to call the filter with cd_values in order to change the compression quality?

With **hdf5plugin.jpeg I guess they would have been arguments, but I can't really find that in the jpeg branch.
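
For readers with the same question: h5py passes cd_values to a third-party filter through the `compression_opts` argument of `create_dataset`. The sketch below builds those arguments for filter 32019. It assumes the cd_values layout is (quality, number of columns, number of rows, color mode with 0 for mono), which is consistent with the `PARAMS { 90 1024 1024 0 }` line in the h5dump output above but should be checked against the jpegHDF5 documentation. It also assumes one chunk holds exactly one image, and that quality lies in 1 to 100. `jpeg_filter_kwargs` and `write_image` are hypothetical helper names, not part of h5py or hdf5plugin.

```python
JPEG_FILTER_ID = 32019  # filter ID shown in the h5dump output in this thread


def jpeg_filter_kwargs(quality, shape, color_mode=0):
    """Build create_dataset keyword arguments for the JPEG filter.

    shape is (rows, columns) of a single mono image. The cd_values order
    used here is an assumption; see the note above the code.
    """
    if not 1 <= quality <= 100:
        raise ValueError("quality is assumed to lie in the range 1..100")
    nrows, ncols = shape
    return {
        "compression": JPEG_FILTER_ID,
        "compression_opts": (quality, ncols, nrows, color_mode),
        "chunks": (nrows, ncols),  # one chunk per image
    }


def write_image(path, image, quality=90):
    """Write a 2D uint8 array with the JPEG filter (plugin must be loadable)."""
    import h5py  # imported here so the helper above works without h5py
    with h5py.File(path, "w") as f:
        f.create_dataset("data", data=image,
                         **jpeg_filter_kwargs(quality, image.shape))
```

Lower quality values should give smaller files at the cost of larger reconstruction error, since the filter is lossy.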

@t20100
Member

t20100 commented Sep 14, 2020

I added a commit (777f14a) to the jpeg branch with the handling of arguments.

@petoor
Author

petoor commented Sep 15, 2020

Thank you Thomas.

It seems to be working :-)
The images compressed with this filter are much smaller than those compressed with gzip (no wonder, since it is a lossy compression). It makes the h5 format competitive, storage-wise, with storing the raw JPEG files.

t20100 added a commit to t20100/hdf5plugin that referenced this issue Oct 21, 2022
59b7f38a updating for 1.0.0 (silx-kit#85)
3802c0f7 Tests: (fix) Makefile - CFLAGS and clean target (silx-kit#84)
111e5a19 Fix Silo doc section and warnings (silx-kit#83)
9dd45462 Use ZFP's version string (silx-kit#80)
557a4f61 Pin zfp version in windows ci (silx-kit#82)
f51ac59f added parameter to fortran const (silx-kit#76)
35a08e06 [github] add windows build config (silx-kit#72)
1844b5b5 Fixed detection of CFP availability (silx-kit#74)
c8544d3e [windows] enable compilation in windows with ClangCL (silx-kit#71)
7b34cacf Update installation.rst (silx-kit#66)
e153bf9d Feature direct write zfp array (silx-kit#43)
20b0f1f3 Added missing HDF5 include (CMake) (silx-kit#63)
20b893a1 Test and fix working with HDF5-1.12 (silx-kit#62)
497e9420 CMake: (fix) Autotools build HDF5 (silx-kit#59)
8f63f7d3 Add missing headers for string functions and fix printfs for cd_nelmts (silx-kit#60)
8de12f77 Added generic interface fortran wrappers with tests (silx-kit#58)
983a1870 find zfp and hdf5 in lib64 instead of lib (silx-kit#56)
e6c9c14c Fix typos
f7670c43 Fix second typo in h5repack docs (silx-kit#55)
57a849de handle optional/mandatory flag (silx-kit#54)
48126d3c CMake: (feature) Added CMake build configuration (silx-kit#52)
bee74347 fix link to travis badge (silx-kit#51)
00146c12 update to download from github (silx-kit#49)
259feb8f Remove include for HDF5 header file (silx-kit#48)
3d0b1768 adding missing call to h5z_zfp_finaliz() (silx-kit#45)
b642fe8a minor fixes to docs and h5repack utility (silx-kit#42)

git-subtree-dir: src/H5Z-ZFP
git-subtree-split: 59b7f38a063b6adce3db074a945ee47bc50856da
t20100 added a commit to t20100/hdf5plugin that referenced this issue Nov 8, 2022
cd5422c Compatibility with Visual Studio 2019 (silx-kit#93)
bcff4d2 CMake: (fix) Missing header file. (silx-kit#90)
59b7f38 updating for 1.0.0 (silx-kit#85)
3802c0f Tests: (fix) Makefile - CFLAGS and clean target (silx-kit#84)
111e5a1 Fix Silo doc section and warnings (silx-kit#83)
9dd4546 Use ZFP's version string (silx-kit#80)
557a4f6 Pin zfp version in windows ci (silx-kit#82)
f51ac59 added parameter to fortran const (silx-kit#76)
35a08e0 [github] add windows build config (silx-kit#72)
1844b5b Fixed detection of CFP availability (silx-kit#74)
c8544d3 [windows] enable compilation in windows with ClangCL (silx-kit#71)
7b34cac Update installation.rst (silx-kit#66)
e153bf9 Feature direct write zfp array (silx-kit#43)
20b0f1f Added missing HDF5 include (CMake) (silx-kit#63)
20b893a Test and fix working with HDF5-1.12 (silx-kit#62)
497e942 CMake: (fix) Autotools build HDF5 (silx-kit#59)
8f63f7d Add missing headers for string functions and fix printfs for cd_nelmts (silx-kit#60)
8de12f7 Added generic interface fortran wrappers with tests (silx-kit#58)
983a187 find zfp and hdf5 in lib64 instead of lib (silx-kit#56)
e6c9c14 Fix typos
f7670c4 Fix second typo in h5repack docs (silx-kit#55)
57a849d handle optional/mandatory flag (silx-kit#54)
48126d3 CMake: (feature) Added CMake build configuration (silx-kit#52)
bee7434 fix link to travis badge (silx-kit#51)
00146c1 update to download from github (silx-kit#49)
259feb8 Remove include for HDF5 header file (silx-kit#48)
3d0b176 adding missing call to h5z_zfp_finaliz() (silx-kit#45)
b642fe8 minor fixes to docs and h5repack utility (silx-kit#42)

git-subtree-dir: src/H5Z-ZFP
git-subtree-split: cd5422c146836e17c7a0380bfb05cf52d0c4467c