README

This project implements a complete(!) JPEG (Rec. ITU-T T.81 | ISO/IEC
10918-1) codec, plus a library that can be used to encode and decode
JPEG streams.  It also implements ISO/IEC 18477 aka JPEG XT which is
an extension towards intermediate, high-dynamic-range lossy and
lossless coding of JPEG. In specific, it supports ISO/IEC
18477-3/-6/-7/-8/-9 encoding.

--------------------------------------------------------------------------

Unlike many other implementations, libjpeg also implements:

- 12 bpp image coding for the lossy DCT process,
- the predictive lossless mode of Rec. ITU-T T.81 | ISO/IEC 10918-1,
- the hierarchical process of Rec. ITU-T T.81 | ISO/IEC 10918-1,
- the arithmetic coding option of Rec. ITU-T T.81 | ISO/IEC 10918-1,
- coding of up to 256 component images
- upsampling of images for all factors from 1x1 to 4x4

Standard features are of course also supported, such as
sequential and progressive mode in 8bpp.

--------------------------------------------------------------------------

In addition, this codec provides methods to encode images

- with a bit depth between 8 and 16 bits per sample, fully backwards
  compatible to Rec. ITU-T T.81 | ISO/IEC 10918 baseline coding.

- consisting of floating point samples, specifically images with 
  high dynamic range.

- to encode images without loss, regardless of their bit-depth and their
  sample data type.

--------------------------------------------------------------------------

Example usage:

Standard JPEG compression, with 444 (aka "no") subsampling:

$ jpeg -q <quality> infile.ppm outfile.jpg

Standard JPEG compression, with 420 subsampling:

$ jpeg -q <quality> -s 1x1,2x2,2x2 infile.ppm outfile.jpg

Intermediate dynamic range compression, i.e. compression of images
of a bit-depth between 8 and 16 bits:

$ jpeg -r -q <base-quality> -Q <extension-quality> -h -r12 infile.ppm outfile.jpg

This type of encoding uses a technology known as "residual scans" which 
increase the bit-depths in the spatial domain which is enabled by the -r
command line switch. The -Q parameter sets the quality of the residual image. 
To improve the precision in the frequency domain, "refinement scans" can be used. 
The following encodes a 12-bit image with  four additional refinement scans,
enabled by the "-R 4" parameter.

$ jpeg -q <quality> -R 4 -h infile.ppm outfile.jpg

Both technologies can be combined, and the precision of the residual scan
can also be enlarged by using residual refinement scans with the -rR option.
The following command line with use a 12-bit residual scan with four refinement
scans:

$ jpeg -r -q <base-quality> -Q <extension-quality> -h -rR 4 infile.ppm outfile.jpg

High-dynamic range compression allows three different profiles of varying
complexity and performance. The profiles are denoted by "-profile <X>" where
<X> is a,b or c. The following encodes an HDR image in profile C:

$ jpeg -r -q <base-quality> -Q <extension-quality> -h -profile c -rR 4 infile.pfm outfile.jpg

HDR images here have to be fed into the command line in "pfm" format. 
exr or hdr is not supported as input format and requires conversion to 
pfm first. pfm is the floating-point equivalent of ppm and encodes each
pixel by three 32-bit floating point numbers.

Encoding in profiles a and b works likewise, though it is generally advisable to
use "open loop" rather than "closed loop" coding for these two profiles by
additionally providing the "-ol" parameter. This also works for profile C:

$ jpeg -ol -r -profile a -q <base-quality> -Q <extension-quality> -h infile.pfm out.jpg

similar for profile B.

What is common to profiles A and C is that you may optionally also specify 
the LDR image, i.e. the image that a legacy JPEG decoder will show. By default, 
a simple tone mapping algorithm ("global Reinhard") will be used to derive a
suitable LDR image from the input image:

$ jpeg -ldr infile.ppm -q <base-quality> -Q <extension-quality> -h -rR 4 infile.pfm out.jpg

The profile is by default profile c, but it also works for profile a:

$ jpeg -ol profile a -ldr infile.ppm -q <base-quality> -Q <extension-quality> infile.pfm out.jpg

It is in general advisable for profile c encoding to enable residual refinement scans,
profiles a or b do not require them.


The following options exist for lossless coding integer:

predictive Rec. ITU-T T.81 | ISO/IEC 10918-1 coding. Note, however,
that not many implementations are capable of decoding such stream,
thus this is probably not a good option for all-day purposes.

$ jpeg -p -c infile.ppm out.jpg

While the result is a valid Rec. ITU-T T.81 | ISO/IEC 10918-1 stream,
most other implementations will hick up and break, thus it is not
advisable to use it.

A second option for lossless coding is residual coding within profile c:

$ jpeg -q <quality> -Q 100 -h -r infile.ppm out.jpg

This also works for floating point coding. Note that lossless coding is enabled
by setting the extension quality to 100.

$ jpeg -q <quality> -Q 100 -h -r infile.pfm out.jpg

However, this is only lossless for 16 bit input samples, i.e. there is a precision
loss due to down-converting the 32-bit input to 16 bit. If samples are out of the
601 gamut, the problem also exists that clamping will happen. To avoid that,
encode in the XYZ color space (profile C only, currently):

$ jpeg -xyz -q <quality> -Q 100 -h -r infile.pfm out.jpg

A second option for lossless integer coding is to use a lossless 1-1 DCT
process. This is enabled with the -l command line option:

$ jpeg -l -q 100 -c infile.ppm out.jpg

Refinement scans can be used to increase the sample precision to up to 12
bits. The "-c" command line option disables the lossy color transformation.

Additionally, this implementation also supports JPEG LS, which is
outside of Rec. ITU-T T.81 | ISO/IEC 10918-1 and ISO/IEC 18477. For
that, use the command line option -ls:

$ jpeg -ls -c infile.ppm out.jpg

The "-c" command line switch is necessary to disable the color transformation
as JPEG LS typically encodes in RGB and not YCbCr space.

Optionally, you may specify the JPEG LS "near" parameter (maximum error) with
the -m command line switch:

$ jpeg -ls -m 2 -c infile.ppm out.jpg

JPEG LS also specifies a lossless color transformation that is enabled with
-cls:

$ jpeg -ls -cls infile.ppm out.jpg


To encode images with an alpha channel, specify the source image that 
contains the alpha channel with -al. The alpha channel is a one-component
grey-scale image, either integer or floating point. The quality of the
alpha channel is specified with -aq, that of the regular image with -q:

$ jpeg -al alpha.pgm -aq 80 -q 85 input.ppm output.jpg

Alpha channels can be larger than 8bpp or can be floating point. In both
cases, residual coding is required. To enable residual coding in the alpha
channel, use the -ar command line option. Similar to the regular image,
where residual coding requires two parameters, -q for the base quality and
-Q for the extension quality, an alpha channel that uses residual coding
also requires a base and extension quality, the former is given by -aq,
the latter with -aQ:

$ jpeg -ar -al alphahigh.pgm -q 85 -Q 90 -aq 80 -aQ 90 input.ppm out.jpg

The alpha channel can be encoded without loss if desired. For that, enable
residual coding with -ar and specify an extension quality of 100:

$ jpeg -ar -al alphahigh.pgm -q 85 -Q 90 -aq 80 -aQ 100 input.ppm out.jpg

The alpha channel can use the same technology extensions as the image,
namely refinement scans in the base or extension image, or 12-bit residual
images. The number of refinement scans is selected with -aR and -arR for
the base and residual image, a 12-bit residual image is selected with -ar12.

--------------------------------------------------------------------------

Decoding is much simpler:

$ jpeg infile.jpg out.ppm

or, for floating point images:

$ jpeg infile.jpg out.pfm


If you want to decode a JPEG LS image, then you may want to tell the
decoder explicitly to disable the color transformation even though the
corresponding marker signalling coding in RGB space is typically missing
for JPEG LS:

$ jpeg -c infile.jpg out.ppm


If an alpha channel is included in the image, the decoder does not
reconstruct this automatically, nor does it attempt to merge the alpha
image into the file. Instead, it may optionally be instructed to write the
alpha channel into a separate 1-component (grey-scale) file:

$ jpeg -al alpha.pgm infile.jpg outfile.ppm

The -al option for the decoder provides the target file for the alpha
channel.

--------------------------------------------------------------------------

Starting with release 1.30, libjpeg will include a couple of optimization
parameters to improve the performance of JPEG and JPEG XT. In this
release, the following additional command line switches are available:

-qt <n> : Selects a different quantization table. The default table,
also enabled by -qt 0, is the one in the legacy JPEG standard
(Rec. ITU-T T.81 | ISO/IEC 10918-1). -qt 1 is the "flat" table for
PSNR-optimal performance. It is not recommended for real-life usage as
its visual performance is non-ideal, it just generates "nice
numbers". -qt 2 is MS-SSIM ideal, but similarly, not necessarily a
good recommendation for all-day use. -qt 3 is a good compromize and
usually works better than -qt 0.

-dz : This option enables a deadzone quantizer that shifts the buckets
by 1/8th of their size to the outside. This is (almost) the ideal choice
for Laplacian sources which would require a shift of 1/12th. Nevertheless,
this option improves the rate-distortion performance by about 0.3dB on
average and works pretty consistent over many images.

Additional options are planned for future releases.
-------------------------------------------------------------------------------------

Release 1.40:

In this release, we included additional support for "full profile" encoding, i.e.
encoding parameters that do not fit any of the four profiles specified in 18477-7.
Using such encoding parameters will generate a warning on the command line, but
encoding will proceed anyhow, generating a bitstream that conforms to 18477-7, but
not to any of the profiles in this standard.

With "-profile a -g 0" or "-profile b -g 0" the encoder will generate a file that
uses an inverse TMO lookup similar to profile C with other encoding parameters
identical to those defined by profiles A and B.

The command line option "-lr" will use a logarithmic encoding instead of the gamma
encoding for profile B. Again, this will leave the profile, but will be within the
bounds of 18477-7.

Other than that, a couple of bug fixes have been made. Profile A and B setup could
not reset the toe value for the inverse gamma map, due to a typo of one of the
parameters. Profile B accepted a different gamma value than the default, but never
communicated it to the core code, i.e. it was simply ignored. Profile B setup ignored
the epsilon values for numerator and denomiator, and they were communicated wrongly
into the core code. This was corrected, and epsilons can now be specified on the
command line.

--------------------------------------------------------------------------

Release 1.50:

This release fixes encoding of ISO/IEC 18477-8 if the IDCT was selected as
transformation in the extension layer and refinement scans were added, i.e.
the command line options -rl -rR 4 created invalid codestreams. Previous
releases used the wrong type of refinement scan (dct bypass refinement instead
of regular refinement) and hence broke reconstruction. Furthermore, previous
releases no longer allowed near lossless coding with DCT bypass. Instead, regular
DCT coding conforming to ISO/IEC 18477-7 was used. To enable the near-lossless
DCT bypass mode, use the new option "-ro" now.

Profile B encoding could potentially create codestreams that run into
clipping of the extension channel; this always happens if the denominator is
larger than 1, and has to happen according to Annex C of ISO/IEC 18477-3.
This release avoids this issue by adjusting the exposure value such that
the denominator always remains smaller than 1.

--------------------------------------------------------------------------

Release 1.51:

If the JPEG-XT markers were delayed to the frame-header intead the global
header, the previous code did not built up the necessary infrastructure
to compute the checksum and hence could not verify the checksum in such
a condition. The 1.51 release fixes this problem.

--------------------------------------------------------------------------

Release 1.52:

This file is an updated/enhanced version of the 1.51 release of
the JPEG XT demo software found on https://github.com/thorfdbg/. It
includes additional features presented in the paper
"JPEG on Steroids : Common Optimization Techniques for JPEG Image Compression"
by the same author.

In specific, the following command line flags are *NEW* to this version and
are available only as a contribution to ICIP 2016:

-oz:          This enables the dynamic programming algorithm to enhance
the rate-distortion performance by soft-threshold quantization. It has been
used for the tests in section 3.3 of the paper.

-dr:         This enables the smart de-ringing algorithm that has been used
in section 3.6.

Additionally, the following switches have been used for other subsections
of the paper; they are not new to this distribution but available as
part of the regular libjpeg distribution at github or www.jpeg.org:

-s 1x1,2x2,2x2:     Enable 420 subsampling (444 is default)
-s 1x1,2x1,2x1:     Enable 422 subsampling (444 is default)
-qt n (n=0..8)      Use quantization matrix n.
                    In the paper, n=1 (flat) was used for PSNR-optimized
                    coding, unless otherwise noted.
-dz                 The deadzone quantizer in section 3.3
                    (simpler than -oz)
-v                  Enable coding in processive mode (section 3.5)
-v -qv              Optimized progressive mode (section 3.5)
-h                  Optimized Huffman coding (always used, unless noted
                    otherwise, see section 3.4)
		    
--------------------------------------------------------------------------

Release 1.53:

This release includes additional functionality to inject markers, or
retrieve markers from a codestream while reading. For that, set
the JPGTAG_ENCODER_STOP tag of the JPEG::Write() call to a bitmask
where the encoder should interrupt writing data (this flag already
existed before) then write custom data with JPEG::WriteMarker(), then
continue with JPEG::Write(). On decoding, set JPGTAG_DECODER_STOP to
a bitmask where to stop for markers, then identify markers with
JPEG::PeekMarker(), and retrieve them with JPEG::ReadMarker(). Details
can be found in cmd/encodec.cpp for encoding, and cmd/reconstruct.cpp.

Otherwise, no functional changes.

--------------------------------------------------------------------------

Release 1.54:

In this release, upsampling has been made conforming to the latest
corrigendum of 18477-1 and 18477-8. In particular, upsampling is now
by design always centered and never co-sited. The co-sited upsampling
procedure is still included in the source code, but never executed.

--------------------------------------------------------------------------

Release 1.55:

This release only addresses some minor formulation issues of the
command line such that references are formatted properly to make this
software package acceptable as a JPEG reference software.
No functional changes.

--------------------------------------------------------------------------

Release 1.56:

Encoding and reconstruction of 2-component images was actually never
supported, as it was considered a rather exotic use-case. Now that a
request was made, support for 2-components was added and should
hopefully work ok.

--------------------------------------------------------------------------

Release 1.57:

Newer g++ compiler versions warned about implicit fall-throughs in switch/
case constructs that are actually harmless. This release adds an autoconf
detection of such compiler versions, adds consistent comments throughout
the code.

--------------------------------------------------------------------------

Release 1.58:

This release fixes multiple spelling errors in the file, thanks to
Mathieu Mmalaterre for finding and fixing them. The release also
addresses multiple race conditions and improves stability and robustness
on invalid streams. Thanks to seviezhou for providing codestreams that
triggered these defects. In particular, the following defects have
been found:

- when a codestream with unsupported upsampling specification (beyond
  18477-1) was found, the code crashed.
- JPEG LS single component scans did not check whether there is actually
  only a single component referenced in the scan.
- An invalid DC category in the sequential scan could have caused a
  crash in the follow-up decoding.
- AC-coded lossless JPEG scans with horizontal subsampling factors
  trashed memory.
- MCU sizes of 0 remained undetected and caused crashes due to a
  division-by-zero exception.
- The code did not check whether a scan references the same component
  more than once and could have failed with strange effects then.
- The code did not handle EOF conditions in the frame header
  gracefully.

--------------------------------------------------------------------------

Release 1.59:

This release addresses a defect in the MCU handling for JPEG LS scans.
The previous code forgot to reset the JPEG LS state variables on MCU
scan boundaries, thus defeating the independent decodability of MCUs
if restart markers are inserted into the stream. Thanks to Spyros for
detecting this defect.

--------------------------------------------------------------------------

Release 1.60:

A specially crafted bitstream depending on line-based JPEG processes
could trigger a segfault because source data the reconstruction
process depended upon were not available. This has been fixed.

--------------------------------------------------------------------------

Release 1.61:

The restart interval for JPEG LS streams, specifically, is allowed to
be larger than 2^16. Modified the DRI marker accordingly. Unfortunately,
as the initial tables section of a codestream of JPEG and JPEG LS is
identical, JPEG files with an invalid DRI marker size will also be
accepted as valid.

--------------------------------------------------------------------------

Release 1.62:

The quantization table could contain entries larger than 255 for the 8-bit
DCT process, even though the standard prohibits this. Now the quantization
table entries are clipped to the allowed range.
Added an option -bl to force encoding in the baseline sequential process.
Added options to read the quantization tables from files rather than using
the built-in defaults.

--------------------------------------------------------------------------

Release 1.63:

In case the decoder was started with an image containing an alpha channel,
i.e. a 18477-9 image, and no output file for the alpha channel was
provided, the decoder crashed. This issue was fixed, the alpha channel is
now in this case simply disregarded. Note that you can define the output
file for the alpha channel with the "-al" command line option.

--------------------------------------------------------------------------

Release 1.64:

The lossless scan, the arithmetically coded lossless scan and the
arithmetically coded sequential scan could run into cases where an
out-of-bounds symbol triggered and out-of-bounds array access and could
have crashed the decoder. The code is now more carefully changing the
validity of the symbols and aborts with an error if it finds illegal
codes.
The code now also checks the consistency of the MCU sizes in the
hierarchical process and fails if they differ across levels.

--------------------------------------------------------------------------

Release 1.65:

The components requested through DisplayRectangle() are always codestream
component (i.e. Y, U, V) and not RGB components in the target image.
This led to some confusion and lack of initialization of bitmap descriptors
if less components were requested than present in the image.
The large-range scan did not properly check whether the DCT precision exceeds
the range decodable by the Huffman encoder. It now aborts faithfully if such
coefficients are detected.
The code did not check if the number of components present in a JPEG LS-2
transformation are identical to the number of components indicated in the
frame header. The code now aborts with an error if such a condition is
detected.

--------------------------------------------------------------------------

Release 1.66:

Previous releases had no means to signal an error from within the BitMapHook.
Now the return code of the BitMapHook call back function, if non-zero, is
used as an error, and such an error is then reported upstream to the caller.

Note that the supplied bitmap hook in the "cmd" directory is only an example
and does not validate inputs. The encoder, in particular, operates on a
"garbadge-in garbadge-out" basis. If the samples in the bitmap hook exceed
the range indicated to the library, bad things will happen. That is, if the
encoder is supposed to operate within an unknown environment with unknown
input data, input validation is required within the bitmap hook.

THE SAMPLE CODE DOES NOT ATTEMPT TO VALIDATE INPUT

Similar restrictions applies to the "PNM" helper code (or any code) in the
cmd directory. It does not attempt to validate input. At least some
minimal attempts to ensure that input files are valid is now in place, but
real life deployment of the code should contain more checks.

--------------------------------------------------------------------------

Release 1.67:

- Fixed a potential memory leak where the code did not release residual
or alpha tables on some error conditions.
- Fixed missing valiation for hidden residual DCT scans where start and
stop bitplane must be one bitplane appart.
- Fixed the box parser where boxes were attempted to be parsed that were
not yet completely loaded.
- Invalid PPM files that indicate a bitdepth > 16 are now detected and
generate an early error condition.
- Encoding images with an alpha channel was not working due to a lack
of enabling the alpha channel with a suitable tag.
- The Adobe marker with version 101 is now also decoded, though despite
the color transformation, flags are disregarded. Smoothing is not
supported by this implementation.

--------------------------------------------------------------------------

Release 1.68:
- Fixed a missing initialization of the table owner holder introduced
  in 1.67.
  
--------------------------------------------------------------------------

Release 1.69:
- The DC Huffman table for the lossless predictive process also contained
  entries for symbols of 17 and above for unknown reasons that might have
  confused some other implementations. As these symbols are never needed,
  they have been removed.
  
--------------------------------------------------------------------------

Release 1.70:
- The decoding of predictive JPEGs aborted premature if the DNL marker was
  found, but the AC or Huffman decoder contained sufficient state information
  to cover at least one line of data. In such a case, the DNL marker was
  hit before the end of the next line was hit, even though parts of the
  current line and the next line were still covered by the data in the AC
  or Huffman decoder. I want to thank Robert Ancell for reporting this defect
  and providing a fix.

--------------------------------------------------------------------------

For license conditions, please check the file README.license in this
directory.

Finally, I want to thank Accusoft and the Computing Center of the University of
Stuttgart and Fraunhofer IIS for sponsoring this project.

Thomas Richter, November 2024

---------------------------------------------------------------------------