This project implements a complete(!) JPEG (Rec. ITU-T T.81 | ISO/IEC
10918-1) codec, plus a library that can be used to encode and decode
JPEG streams. It also implements ISO/IEC 18477, aka JPEG XT, which is
an extension towards intermediate, high-dynamic-range lossy and
lossless coding of JPEG. Specifically, it supports ISO/IEC
18477-3/-6/-7/-8/-9 encoding.
--------------------------------------------------------------------------
Unlike many other implementations, libjpeg also implements:
- 12 bpp image coding for the lossy DCT process,
- the predictive lossless mode of Rec. ITU-T T.81 | ISO/IEC 10918-1,
- the hierarchical process of Rec. ITU-T T.81 | ISO/IEC 10918-1,
- the arithmetic coding option of Rec. ITU-T T.81 | ISO/IEC 10918-1,
- coding of images with up to 256 components,
- upsampling of images for all factors from 1x1 to 4x4.
Standard features are of course also supported, such as
sequential and progressive mode in 8bpp.
--------------------------------------------------------------------------
In addition, this codec provides methods to encode images
- with a bit depth between 8 and 16 bits per sample, fully backwards
compatible with Rec. ITU-T T.81 | ISO/IEC 10918 baseline coding,
- consisting of floating point samples, specifically images with
high dynamic range,
- without loss, regardless of their bit-depth and their
sample data type.
--------------------------------------------------------------------------
Example usage:
Standard JPEG compression, with 444 (aka "no") subsampling:
$ jpeg -q <quality> infile.ppm outfile.jpg
Standard JPEG compression, with 420 subsampling:
$ jpeg -q <quality> -s 1x1,2x2,2x2 infile.ppm outfile.jpg
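As a rough illustration of what the -s argument expresses, the following
Python sketch parses a subsampling string and computes the plane size of
each component. The function names and the rounding-up rule are assumptions
made for this example; they are not taken from the libjpeg sources.

```python
import math

def parse_subsampling(spec):
    """Parse a string like '1x1,2x2,2x2' into [(1, 1), (2, 2), (2, 2)]."""
    factors = []
    for item in spec.split(","):
        sx, sy = item.lower().split("x")
        factors.append((int(sx), int(sy)))
    return factors

def plane_sizes(width, height, spec):
    """Return (width, height) of each component plane after subsampling.

    Assumption of this sketch: dimensions are rounded up, so a row of
    5 pixels subsampled by 2 keeps 3 chroma samples.
    """
    return [(math.ceil(width / sx), math.ceil(height / sy))
            for sx, sy in parse_subsampling(spec)]

if __name__ == "__main__":
    print(plane_sizes(640, 481, "1x1,2x2,2x2"))
```

With 420 subsampling, the two chroma planes therefore hold roughly a
quarter of the samples of the luma plane each.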
Intermediate dynamic range compression, i.e. compression of images
of a bit-depth between 8 and 16 bits:
$ jpeg -r -q <base-quality> -Q <extension-quality> -h -r12 infile.ppm outfile.jpg
This type of encoding uses a technology known as "residual scans", which
increase the bit depth in the spatial domain and are enabled by the -r
command line switch. The -Q parameter sets the quality of the residual image.
To improve the precision in the frequency domain, "refinement scans" can be used.
The following encodes a 12-bit image with four additional refinement scans,
enabled by the "-R 4" parameter.
$ jpeg -q <quality> -R 4 -h infile.ppm outfile.jpg
Both technologies can be combined, and the precision of the residual scan
can also be enlarged by using residual refinement scans with the -rR option.
The following command line will use a 12-bit residual scan with four refinement
scans:
$ jpeg -r -q <base-quality> -Q <extension-quality> -h -rR 4 infile.ppm outfile.jpg
High-dynamic range compression allows three different profiles of varying
complexity and performance. The profiles are denoted by "-profile <X>" where
<X> is a, b or c. The following encodes an HDR image in profile C:
$ jpeg -r -q <base-quality> -Q <extension-quality> -h -profile c -rR 4 infile.pfm outfile.jpg
HDR images have to be fed into the command line in "pfm" format.
exr and hdr are not supported as input formats; such files must be
converted to pfm first. pfm is the floating-point equivalent of ppm and
encodes each pixel by three 32-bit floating point numbers.
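For readers who need to produce such a file, here is a minimal Python
sketch that serializes RGB float data in the commonly used pfm layout. The
header fields ("PF", width and height, a negative scale to signal
little-endian data) and the bottom-to-top row order follow the usual pfm
convention; that this matches every detail the jpeg tool expects is an
assumption, and encode_pfm is a name made up for this example.

```python
import struct

def encode_pfm(width, height, pixels):
    """Serialize RGB float pixels to PFM bytes.

    pixels: list of rows (top row first), each row a list of (r, g, b).
    Assumptions of this sketch: header 'PF', a scale of -1.0 to signal
    little-endian samples, and rows stored bottom-to-top.
    """
    if len(pixels) != height or any(len(row) != width for row in pixels):
        raise ValueError("pixel array does not match width/height")
    header = f"PF\n{width} {height}\n-1.0\n".encode("ascii")
    body = bytearray()
    for row in reversed(pixels):                 # bottom row first
        for r, g, b in row:
            body += struct.pack("<3f", r, g, b)  # three 32-bit floats
    return header + bytes(body)

if __name__ == "__main__":
    data = encode_pfm(2, 1, [[(0.0, 0.5, 1.0), (2.0, 4.0, 8.0)]])
    print(len(data), "bytes")
```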
Encoding in profiles a and b works likewise, though it is generally advisable to
use "open loop" rather than "closed loop" coding for these two profiles by
additionally providing the "-ol" parameter. This option also works for profile C.
For profile A:
$ jpeg -ol -r -profile a -q <base-quality> -Q <extension-quality> -h infile.pfm out.jpg
Profile B works similarly.
What is common to profiles A and C is that you may optionally also specify
the LDR image, i.e. the image that a legacy JPEG decoder will show. By default,
a simple tone mapping algorithm ("global Reinhard") is used to derive a
suitable LDR image from the input image. To supply your own LDR image instead,
use the -ldr option:
$ jpeg -ldr infile.ppm -q <base-quality> -Q <extension-quality> -h -rR 4 infile.pfm out.jpg
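For orientation, a global Reinhard operator in its textbook form scales
the luminance by a key value over the log-average luminance and then maps
L to L/(1+L). The Python sketch below shows this generic form only; the
key value of 0.18 and the function name are assumptions, and the tone
mapper built into the encoder may differ in detail.

```python
import math

def reinhard_global(luminances, key=0.18, eps=1e-6):
    """Map HDR luminances to [0, 1) with the textbook global Reinhard curve."""
    if not luminances:
        return []
    # Log-average luminance of the image; eps avoids log(0).
    log_avg = math.exp(
        sum(math.log(eps + lum) for lum in luminances) / len(luminances))
    out = []
    for lum in luminances:
        scaled = key * lum / log_avg          # expose to the chosen key
        out.append(scaled / (1.0 + scaled))   # compress the highlights
    return out

if __name__ == "__main__":
    print(reinhard_global([0.01, 1.0, 100.0]))
```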
The profile is by default profile c, but it also works for profile a:
$ jpeg -ol -profile a -ldr infile.ppm -q <base-quality> -Q <extension-quality> infile.pfm out.jpg
It is in general advisable for profile c encoding to enable residual refinement scans;
profiles a and b do not require them.
The following options exist for lossless integer coding. The first is
predictive Rec. ITU-T T.81 | ISO/IEC 10918-1 coding. Note, however,
that not many implementations are capable of decoding such streams,
thus this is probably not a good option for everyday purposes.
$ jpeg -p -c infile.ppm out.jpg
While the result is a valid Rec. ITU-T T.81 | ISO/IEC 10918-1 stream,
most other implementations will hiccup and break, thus it is not
advisable to use it.
A second option for lossless coding is residual coding within profile c:
$ jpeg -q <quality> -Q 100 -h -r infile.ppm out.jpg
This also works for floating point coding. Note that lossless coding is enabled
by setting the extension quality to 100.
$ jpeg -q <quality> -Q 100 -h -r infile.pfm out.jpg
However, this is only lossless for 16-bit input samples, i.e. there is a precision
loss due to down-converting the 32-bit input to 16 bits. In addition, samples
outside of the 601 gamut will be clamped. To avoid that,
encode in the XYZ color space (profile C only, currently):
$ jpeg -xyz -q <quality> -Q 100 -h -r infile.pfm out.jpg
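The precision loss mentioned above can be made tangible with a small
Python sketch. It assumes that the 16-bit representation is IEEE 754 half
precision, which this README does not state explicitly; the helper names
are made up for the example.

```python
import struct

def through_half(x):
    """Round-trip a float through IEEE 754 half precision (16 bit)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def through_single(x):
    """Round-trip a float through IEEE 754 single precision (32 bit)."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

if __name__ == "__main__":
    sample = through_single(0.1)          # a value as stored in a pfm file
    print(sample, through_half(sample))   # the two values differ
    print(through_half(0.5) == 0.5)       # exactly representable, no loss
```

Values that already fit into 16 bits survive unchanged, which is why the
coding is lossless relative to the down-converted samples.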
Another option for lossless integer coding is to use a lossless 1-1 DCT
process. This is enabled with the -l command line option:
$ jpeg -l -q 100 -c infile.ppm out.jpg
Refinement scans can be used to increase the sample precision to up to 12
bits. The "-c" command line option disables the lossy color transformation.
Additionally, this implementation also supports JPEG LS, which is
outside of Rec. ITU-T T.81 | ISO/IEC 10918-1 and ISO/IEC 18477. For
that, use the command line option -ls:
$ jpeg -ls -c infile.ppm out.jpg
The "-c" command line switch is necessary to disable the color transformation
as JPEG LS typically encodes in RGB and not YCbCr space.
Optionally, you may specify the JPEG LS "near" parameter (maximum error) with
the -m command line switch:
$ jpeg -ls -m 2 -c infile.ppm out.jpg
JPEG LS also specifies a lossless color transformation that is enabled with
-cls:
$ jpeg -ls -cls infile.ppm out.jpg
To encode images with an alpha channel, specify the source image that
contains the alpha channel with -al. The alpha channel is a one-component
grey-scale image, either integer or floating point. The quality of the
alpha channel is specified with -aq, that of the regular image with -q:
$ jpeg -al alpha.pgm -aq 80 -q 85 input.ppm output.jpg
Alpha channels can be larger than 8bpp or can be floating point. In both
cases, residual coding is required. To enable residual coding in the alpha
channel, use the -ar command line option. Similar to the regular image,
where residual coding requires two parameters, -q for the base quality and
-Q for the extension quality, an alpha channel that uses residual coding
also requires a base and extension quality, the former is given by -aq,
the latter with -aQ:
$ jpeg -ar -al alphahigh.pgm -q 85 -Q 90 -aq 80 -aQ 90 input.ppm out.jpg
The alpha channel can be encoded without loss if desired. For that, enable
residual coding with -ar and specify an extension quality of 100:
$ jpeg -ar -al alphahigh.pgm -q 85 -Q 90 -aq 80 -aQ 100 input.ppm out.jpg
The alpha channel can use the same technology extensions as the image,
namely refinement scans in the base or extension image, or 12-bit residual
images. The number of refinement scans is selected with -aR and -arR for
the base and residual image, a 12-bit residual image is selected with -ar12.
--------------------------------------------------------------------------
Decoding is much simpler:
$ jpeg infile.jpg out.ppm
or, for floating point images:
$ jpeg infile.jpg out.pfm
If you want to decode a JPEG LS image, you may want to tell the
decoder explicitly to disable the color transformation, because the
marker that signals coding in RGB space is typically missing
for JPEG LS:
$ jpeg -c infile.jpg out.ppm
If an alpha channel is included in the image, the decoder does not
reconstruct this automatically, nor does it attempt to merge the alpha
image into the file. Instead, it may optionally be instructed to write the
alpha channel into a separate 1-component (grey-scale) file:
$ jpeg -al alpha.pgm infile.jpg outfile.ppm
The -al option for the decoder provides the target file for the alpha
channel.
--------------------------------------------------------------------------
Starting with release 1.30, libjpeg includes a couple of optimization
parameters to improve the performance of JPEG and JPEG XT. In this
release, the following additional command line switches are available:
-qt <n> : Selects a different quantization table. The default table,
also enabled by -qt 0, is the one in the legacy JPEG standard
(Rec. ITU-T T.81 | ISO/IEC 10918-1). -qt 1 is the "flat" table for
PSNR-optimal performance. It is not recommended for real-life usage as
its visual performance is non-ideal; it just generates "nice
numbers". -qt 2 is MS-SSIM ideal, but similarly, not necessarily a
good recommendation for everyday use. -qt 3 is a good compromise and
usually works better than -qt 0.
-dz : This option enables a deadzone quantizer that shifts the buckets
by 1/8th of their size to the outside. This is (almost) the ideal choice
for Laplacian sources, which would require a shift of 1/12th. Nevertheless,
this option improves the rate-distortion performance by about 0.3dB on
average and works pretty consistently over many images.
Additional options are planned for future releases.
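To illustrate the idea behind the -dz switch, the Python sketch below
compares plain rounding with a deadzone quantizer whose decision
thresholds are moved outwards by 1/8th of the bucket size. It is one
possible reading of the description above, not the code used by the
library; the function names are hypothetical.

```python
import math

def quantize_round(coeff, step):
    """Plain quantizer: round to the nearest multiple of step."""
    sign = -1 if coeff < 0 else 1
    return sign * math.floor(abs(coeff) / step + 0.5)

def quantize_deadzone(coeff, step, shift=1.0 / 8.0):
    """Deadzone quantizer: thresholds moved outwards by `shift`
    (a fraction of the bucket size), which widens the zero bucket."""
    sign = -1 if coeff < 0 else 1
    return sign * math.floor(abs(coeff) / step + 0.5 - shift)

if __name__ == "__main__":
    for c in (9, 10, 25, -26):
        print(c, quantize_round(c, 16), quantize_deadzone(c, 16))
```

Coefficients just above half a bucket are pushed to the smaller index,
which costs little distortion but saves rate.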
-------------------------------------------------------------------------------------
Release 1.40:
In this release, we included additional support for "full profile" encoding, i.e.
encoding parameters that do not fit any of the four profiles specified in 18477-7.
Using such encoding parameters will generate a warning on the command line, but
encoding will proceed anyhow, generating a bitstream that conforms to 18477-7, but
not to any of the profiles in this standard.
With "-profile a -g 0" or "-profile b -g 0" the encoder will generate a file that
uses an inverse TMO lookup similar to profile C with other encoding parameters
identical to those defined by profiles A and B.
The command line option "-lr" will use a logarithmic encoding instead of the gamma
encoding for profile B. Again, this will leave the profile, but will be within the
bounds of 18477-7.
Other than that, a couple of bug fixes have been made. Profile A and B setup could
not reset the toe value for the inverse gamma map, due to a typo of one of the
parameters. Profile B accepted a different gamma value than the default, but never
communicated it to the core code, i.e. it was simply ignored. Profile B setup ignored
the epsilon values for numerator and denominator, and they were communicated wrongly
into the core code. This was corrected, and epsilons can now be specified on the
command line.
--------------------------------------------------------------------------
Release 1.50:
This release fixes encoding of ISO/IEC 18477-8 if the IDCT was selected as
transformation in the extension layer and refinement scans were added, i.e.
the command line options -rl -rR 4 created invalid codestreams. Previous
releases used the wrong type of refinement scan (dct bypass refinement instead
of regular refinement) and hence broke reconstruction. Furthermore, previous
releases no longer allowed near lossless coding with DCT bypass. Instead, regular
DCT coding conforming to ISO/IEC 18477-7 was used. To enable the near-lossless
DCT bypass mode, use the new option "-ro" now.
Profile B encoding could potentially create codestreams that run into
clipping of the extension channel; this always happens if the denominator is
larger than 1, and has to happen according to Annex C of ISO/IEC 18477-3.
This release avoids this issue by adjusting the exposure value such that
the denominator always remains smaller than 1.
--------------------------------------------------------------------------
Release 1.51:
If the JPEG-XT markers were delayed to the frame-header instead of the global
header, the previous code did not build up the necessary infrastructure
to compute the checksum and hence could not verify the checksum in such
a condition. The 1.51 release fixes this problem.
--------------------------------------------------------------------------
Release 1.52:
This file is an updated/enhanced version of the 1.51 release of
the JPEG XT demo software found on https://github.com/thorfdbg/. It
includes additional features presented in the paper
"JPEG on Steroids : Common Optimization Techniques for JPEG Image Compression"
by the same author.
Specifically, the following command line flags are *NEW* to this version and
are available only as a contribution to ICIP 2016:
-oz: This enables the dynamic programming algorithm to enhance
the rate-distortion performance by soft-threshold quantization. It has been
used for the tests in section 3.3 of the paper.
-dr: This enables the smart de-ringing algorithm that has been used
in section 3.6.
Additionally, the following switches have been used for other subsections
of the paper; they are not new to this distribution but available as
part of the regular libjpeg distribution at github or www.jpeg.org:
-s 1x1,2x2,2x2 : Enable 420 subsampling (444 is default)
-s 1x1,2x1,2x1 : Enable 422 subsampling (444 is default)
-qt n (n=0..8) : Use quantization matrix n. In the paper, n=1 (flat) was
                 used for PSNR-optimized coding, unless otherwise noted.
-dz            : The deadzone quantizer in section 3.3 (simpler than -oz)
-v             : Enable coding in progressive mode (section 3.5)
-v -qv         : Optimized progressive mode (section 3.5)
-h             : Optimized Huffman coding (always used, unless noted
                 otherwise, see section 3.4)
--------------------------------------------------------------------------
Release 1.53:
This release includes additional functionality to inject markers, or
retrieve markers from a codestream while reading. For that, set
the JPGTAG_ENCODER_STOP tag of the JPEG::Write() call to a bitmask
indicating where the encoder should interrupt writing data (this flag
already existed before), then write custom data with JPEG::WriteMarker(),
then continue with JPEG::Write(). On decoding, set JPGTAG_DECODER_STOP to
a bitmask indicating where to stop for markers, then identify markers with
JPEG::PeekMarker(), and retrieve them with JPEG::ReadMarker(). Details
can be found in cmd/encodec.cpp for encoding, and cmd/reconstruct.cpp
for decoding.
Otherwise, no functional changes.
--------------------------------------------------------------------------
Release 1.54:
In this release, upsampling has been made conforming to the latest
corrigendum of 18477-1 and 18477-8. In particular, upsampling is now
by design always centered and never co-sited. The co-sited upsampling
procedure is still included in the source code, but never executed.
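The difference between the two sample positions can be sketched in
Python for the 1-D, factor-2 case. The linear interpolation filter used
here is an assumption chosen for illustration; the filters mandated by
18477-1 and 18477-8 are not reproduced in this README, and the function
names are hypothetical.

```python
def upsample_cosited(samples):
    """2x upsampling where every even output coincides with an input."""
    out = []
    n = len(samples)
    for i, s in enumerate(samples):
        nxt = samples[min(i + 1, n - 1)]    # replicate the last sample
        out.append(float(s))
        out.append((s + nxt) / 2.0)
    return out

def upsample_centered(samples):
    """2x upsampling where outputs lie a quarter sample to the left and
    right of each input (linear interpolation, edges replicated)."""
    out = []
    n = len(samples)
    for i, s in enumerate(samples):
        prev = samples[max(i - 1, 0)]
        nxt = samples[min(i + 1, n - 1)]
        out.append(0.75 * s + 0.25 * prev)
        out.append(0.75 * s + 0.25 * nxt)
    return out

if __name__ == "__main__":
    print(upsample_cosited([0, 4]))
    print(upsample_centered([0, 4]))
```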
--------------------------------------------------------------------------
Release 1.55:
This release only addresses some minor formulation issues of the
command line such that references are formatted properly to make this
software package acceptable as a JPEG reference software.
No functional changes.
--------------------------------------------------------------------------
Release 1.56:
Encoding and reconstruction of 2-component images was actually never
supported, as it was considered a rather exotic use-case. Now that a
request was made, support for 2-components was added and should
hopefully work ok.
--------------------------------------------------------------------------
Release 1.57:
Newer g++ compiler versions warned about implicit fall-throughs in switch/
case constructs that are actually harmless. This release adds an autoconf
detection of such compiler versions and adds consistent comments throughout
the code.
--------------------------------------------------------------------------
Release 1.58:
This release fixes multiple spelling errors in the file, thanks to
Mathieu Malaterre for finding and fixing them. The release also
addresses multiple race conditions and improves stability and robustness
on invalid streams. Thanks to seviezhou for providing codestreams that
triggered these defects. In particular, the following defects have
been found:
- when a codestream with unsupported upsampling specification (beyond
18477-1) was found, the code crashed.
- JPEG LS single component scans did not check whether there is actually
only a single component referenced in the scan.
- An invalid DC category in the sequential scan could have caused a
crash in the follow-up decoding.
- AC-coded lossless JPEG scans with horizontal subsampling factors
trashed memory.
- MCU sizes of 0 remained undetected and caused crashes due to a
division-by-zero exception.
- The code did not check whether a scan references the same component
more than once and could have failed with strange effects then.
- The code did not handle EOF conditions in the frame header
gracefully.
--------------------------------------------------------------------------
Release 1.59:
This release addresses a defect in the MCU handling for JPEG LS scans.
The previous code forgot to reset the JPEG LS state variables on MCU
scan boundaries, thus defeating the independent decodability of MCUs
if restart markers are inserted into the stream. Thanks to Spyros for
detecting this defect.
--------------------------------------------------------------------------
Release 1.60:
A specially crafted bitstream depending on line-based JPEG processes
could trigger a segfault because source data the reconstruction
process depended upon were not available. This has been fixed.
--------------------------------------------------------------------------
Release 1.61:
The restart interval for JPEG LS streams, specifically, is allowed to
be larger than 2^16. Modified the DRI marker accordingly. Unfortunately,
as the initial tables section of a codestream of JPEG and JPEG LS is
identical, JPEG files with an invalid DRI marker size will also be
accepted as valid.
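As an illustration, the Python sketch below builds a DRI marker segment
whose interval field grows with the restart interval. The 2-byte field
for intervals below 2^16 is the classic Rec. ITU-T T.81 layout; the
3- and 4-byte variants reflect how this sketch understands the JPEG LS
extension and should be checked against the standard. build_dri is a
hypothetical helper, not a library function.

```python
import struct

def build_dri(restart_interval):
    """Build a DRI marker segment (marker 0xFFDD) for the given interval.

    Assumption: intervals below 2^16 use a 2-byte field, larger ones a
    3- or 4-byte field.
    """
    if restart_interval < 0 or restart_interval >= 1 << 32:
        raise ValueError("restart interval out of range")
    if restart_interval < 1 << 16:
        size = 2
    elif restart_interval < 1 << 24:
        size = 3
    else:
        size = 4
    payload = restart_interval.to_bytes(size, "big")
    # The length field counts itself (2 bytes) plus the payload.
    return b"\xff\xdd" + struct.pack(">H", 2 + size) + payload

if __name__ == "__main__":
    print(build_dri(8).hex())
    print(build_dri(70000).hex())
```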
--------------------------------------------------------------------------
Release 1.62:
The quantization table could contain entries larger than 255 for the 8-bit
DCT process, even though the standard prohibits this. Now the quantization
table entries are clipped to the allowed range.
Added an option -bl to force encoding in the baseline sequential process.
Added options to read the quantization tables from files rather than using
the built-in defaults.
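The clipping of quantization table entries described in this release
can be pictured with a short Python sketch. The upper bound follows from
the table precision; the lower bound of 1 and the function name are
assumptions of this example rather than a description of the library
code.

```python
def clip_quant_table(entries, precision_bits=8):
    """Clamp quantization table entries to the range allowed for the
    given table precision: 1..255 for 8 bit, 1..65535 for 16 bit."""
    if precision_bits not in (8, 16):
        raise ValueError("precision must be 8 or 16 bits")
    upper = (1 << precision_bits) - 1
    return [min(max(int(q), 1), upper) for q in entries]

if __name__ == "__main__":
    print(clip_quant_table([0, 16, 255, 300]))
```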
--------------------------------------------------------------------------
Release 1.63:
In case the decoder was started with an image containing an alpha channel,
i.e. a 18477-9 image, and no output file for the alpha channel was
provided, the decoder crashed. This issue was fixed, the alpha channel is
now in this case simply disregarded. Note that you can define the output
file for the alpha channel with the "-al" command line option.
--------------------------------------------------------------------------
Release 1.64:
The lossless scan, the arithmetically coded lossless scan and the
arithmetically coded sequential scan could run into cases where an
out-of-bounds symbol triggered an out-of-bounds array access and could
have crashed the decoder. The code now checks the validity of the
symbols more carefully and aborts with an error if it finds illegal
codes.
The code now also checks the consistency of the MCU sizes in the
hierarchical process and fails if they differ across levels.
--------------------------------------------------------------------------
Release 1.65:
The components requested through DisplayRectangle() are always codestream
components (i.e. Y, U, V) and not RGB components in the target image.
This led to some confusion and lack of initialization of bitmap descriptors
if fewer components were requested than present in the image.
The large-range scan did not properly check whether the DCT precision exceeds
the range decodable by the Huffman encoder. It now aborts faithfully if such
coefficients are detected.
The code did not check whether the number of components present in a JPEG LS-2
transformation is identical to the number of components indicated in the
frame header. The code now aborts with an error if such a condition is
detected.
--------------------------------------------------------------------------
Release 1.66:
Previous releases had no means to signal an error from within the BitMapHook.
Now the return code of the BitMapHook call back function, if non-zero, is
used as an error, and such an error is then reported upstream to the caller.
Note that the supplied bitmap hook in the "cmd" directory is only an example
and does not validate inputs. The encoder, in particular, operates on a
"garbage-in garbage-out" basis. If the samples in the bitmap hook exceed
the range indicated to the library, bad things will happen. That is, if the
encoder is supposed to operate within an unknown environment with unknown
input data, input validation is required within the bitmap hook.
THE SAMPLE CODE DOES NOT ATTEMPT TO VALIDATE INPUT
Similar restrictions apply to the "PNM" helper code (or any code) in the
cmd directory. It does not attempt to validate input. At least some
minimal attempts to ensure that input files are valid are now in place, but
real-life deployment of the code should contain more checks.
--------------------------------------------------------------------------
Release 1.67:
- Fixed a potential memory leak where the code did not release residual
or alpha tables on some error conditions.
- Fixed missing validation for hidden residual DCT scans where start and
stop bitplane must be one bitplane apart.
- Fixed the box parser, which attempted to parse boxes that were
not yet completely loaded.
- Invalid PPM files that indicate a bitdepth > 16 are now detected and
generate an early error condition.
- Encoding images with an alpha channel did not work because the alpha
channel was not enabled with a suitable tag.
- The Adobe marker with version 101 is now also decoded, though apart from
the color transformation, its flags are disregarded. Smoothing is not
supported by this implementation.
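The kind of early check mentioned above for PPM files with a bit depth
above 16 could look like the following Python sketch. It is a
stand-alone illustration, not the helper code from the cmd directory, and
read_pnm_header is a hypothetical name.

```python
def read_pnm_header(data):
    """Parse the header of a binary PGM/PPM file (P5/P6) from bytes.

    Returns (magic, width, height, maxval) and raises ValueError if the
    header is malformed or declares more than 16 bits per sample.
    """
    tokens = []
    pos = 0
    while len(tokens) < 4:
        if pos >= len(data):
            raise ValueError("truncated header")
        ch = data[pos:pos + 1]
        if ch == b"#":                      # comment: skip to end of line
            while pos < len(data) and data[pos:pos + 1] != b"\n":
                pos += 1
        elif ch.isspace():
            pos += 1
        else:
            start = pos
            while pos < len(data) and not data[pos:pos + 1].isspace():
                pos += 1
            tokens.append(data[start:pos])
    magic = tokens[0].decode("ascii", "replace")
    if magic not in ("P5", "P6"):
        raise ValueError("not a binary PGM/PPM file")
    try:
        width, height, maxval = (int(t) for t in tokens[1:])
    except ValueError:
        raise ValueError("non-numeric header field")
    if width <= 0 or height <= 0:
        raise ValueError("invalid image dimensions")
    if not 1 <= maxval <= 65535:
        raise ValueError("maxval outside 1..65535 (more than 16 bits)")
    return magic, width, height, maxval

if __name__ == "__main__":
    print(read_pnm_header(b"P6\n# comment\n4 2\n255\n"))
```

Rejecting such files before any sample data is read keeps invalid input
away from the encoder, which itself does not validate its samples.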
--------------------------------------------------------------------------
Release 1.68:
- Fixed a missing initialization of the table owner holder introduced
in 1.67.
--------------------------------------------------------------------------
Release 1.69:
- The DC Huffman table for the lossless predictive process also contained
entries for symbols of 17 and above for unknown reasons that might have
confused some other implementations. As these symbols are never needed,
they have been removed.
--------------------------------------------------------------------------
Release 1.70:
- The decoding of predictive JPEGs aborted prematurely if the DNL marker was
found, but the AC or Huffman decoder contained sufficient state information
to cover at least one line of data. In such a case, the DNL marker was
hit before the end of the next line was hit, even though parts of the
current line and the next line were still covered by the data in the AC
or Huffman decoder. I want to thank Robert Ancell for reporting this defect
and providing a fix.
--------------------------------------------------------------------------
For license conditions, please check the file README.license in this
directory.
Finally, I want to thank Accusoft and the Computing Center of the University of
Stuttgart and Fraunhofer IIS for sponsoring this project.
Thomas Richter, November 2024
---------------------------------------------------------------------------