Image prints with black background #1128

Closed
wesleyi23 opened this issue May 27, 2020 · 20 comments · Fixed by #2179
Labels
bug Existing features not working as expected

@wesleyi23

I am trying to use WeasyPrint to generate PDF reports for a Django web application. The HTML file the report is based on embeds images as base64, and some of them render with a black background instead of a white one. I am trying to understand where this issue might come from and how to go about correcting it. Does anyone have any ideas where to start?

@liZe
Member

liZe commented May 27, 2020

Hello!

Could you please provide an example?

@wesleyi23
Author

Sure. Here is an example HTML file (saved as a txt file):
example.txt

@liZe
Member

liZe commented May 27, 2020

Looks like a problem with CMYK, really like #315. It was supposed to be fixed in Cairo 1.15.4, I don’t know why it’s still there with 1.16.

@wesleyi23
Author

Hey, thanks! I will double-check that I have the latest version of Cairo and post back what I find tomorrow.

@liZe
Member

liZe commented May 27, 2020

Looks like a problem with CMYK, really like #315. It was supposed to be fixed in Cairo 1.15.4, I don’t know why it’s still there with 1.16.

Here is why: the fix has been reverted.

Hopefully Cairo is soon gone forever.

A quick workaround is to save your JPG file as RGB.
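
For reference, the RGB workaround could look like the following minimal sketch. It assumes Pillow is installed; the helper name `cmyk_jpeg_to_rgb` is made up for illustration, and the conversion ignores any embedded ICC profile, so colors may shift slightly compared to a color-managed conversion.

```python
from io import BytesIO

from PIL import Image


def cmyk_jpeg_to_rgb(data: bytes, quality: int = 95) -> bytes:
    """Return JPEG bytes in RGB mode (hypothetical helper, not WeasyPrint API).

    Images that are not CMYK are returned unchanged.
    """
    with Image.open(BytesIO(data)) as image:
        if image.mode != "CMYK":
            return data
        # convert() loads the pixel data, so the result stays usable
        # after the source image is closed.
        rgb = image.convert("RGB")
    out = BytesIO()
    rgb.save(out, format="JPEG", quality=quality)
    return out.getvalue()
```

For images embedded as base64 in the HTML, the same helper could presumably be applied to the decoded bytes before they are re-encoded and handed to WeasyPrint.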

@liZe liZe added the bug Existing features not working as expected label May 27, 2020
@liZe
Member

liZe commented Sep 16, 2021

The bug remains, even without Cairo.

@germinator1512

This still happens. We are also converting client pictures, so a solution would be much appreciated.

@liZe
Member

liZe commented May 13, 2024

so a solution would be much appreciated

Some help would be really appreciated on this topic. If you could spend some time to understand what’s wrong in the generated PDF, that would be wonderful. Thank you!

@germinator1512

germinator1512 commented May 13, 2024

There's nothing wrong with the PDF itself, just the color. If you use this image as the background:
background
the result is the following PDF:
result.pdf

@liZe
Member

liZe commented May 13, 2024

There's nothing wrong with the PDF itself, just the color.

I mean, there’s something in the generated PDF that makes the colors look inverted. Maybe there’s a place where we can tell the PDF that this image is CMYK? Reading the PDF specification and playing with this code could help to find what’s wrong and how to fix this.

@endaxi

endaxi commented May 30, 2024

Hi, I was having the same problem where CMYK JPEG images were getting inverted in the PDF output.

I was able to get it working by adding the following to images.py:

            if self.mode == 'CMYK':
                extra['Decode'] = '[1 0 1 0 1 0 1 0]'

I'm not sure whether this should be done for all CMYK JPEGs (probably not), but I hope it helps.

@liZe
Member

liZe commented Jun 1, 2024

I was able to get it working by adding the following to images.py

That’s really interesting, thanks!

I'm not sure whether this should be done for all CMYK JPEGs (probably not), but I hope it helps.

That’s a good question. If someone could find some useful technical information about CMYK JPEGs, that would greatly help understand what we can do to fix this bug.

@endaxi

endaxi commented Jun 6, 2024

What led me to try the Decode trick was this response to an old post about CMYK JPEGs being inverted when extracted from PDF.

Section 8.9.5.2 of the PDF spec describes Decode arrays and lists the default for DeviceCMYK as [0 1 0 1 0 1 0 1], followed by a note: "It is possible to specify a mapping that inverts sample colour intensities by specifying a Dmin value greater than Dmax". This kind of explains why the trick works, but not why it needs to be done in the first place.
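
To illustrate the quoted note, here is a small sketch of the linear mapping a Decode array appears to describe: each sample in the range 0 to 2^n - 1 is interpolated between Dmin and Dmax for its component. The function names are hypothetical and this is not WeasyPrint code.

```python
DEFAULT_CMYK_DECODE = [0, 1, 0, 1, 0, 1, 0, 1]
INVERTED_CMYK_DECODE = [1, 0, 1, 0, 1, 0, 1, 0]


def apply_decode(sample, d_min, d_max, bits_per_component=8):
    """Map one raw sample to the range delimited by d_min and d_max."""
    max_sample = (1 << bits_per_component) - 1
    return d_min + sample * (d_max - d_min) / max_sample


def decode_pixel(samples, decode, bits_per_component=8):
    """Apply a flat Decode array (one Dmin/Dmax pair per component) to a pixel."""
    pairs = zip(decode[0::2], decode[1::2])
    return [
        apply_decode(sample, d_min, d_max, bits_per_component)
        for sample, (d_min, d_max) in zip(samples, pairs)
    ]


# A pixel stored with all four samples at their maximum value.
stored = (255, 255, 255, 255)
print(decode_pixel(stored, DEFAULT_CMYK_DECODE))   # [1.0, 1.0, 1.0, 1.0]
print(decode_pixel(stored, INVERTED_CMYK_DECODE))  # [0.0, 0.0, 0.0, 0.0]
```

With the default array the stored pixel means full ink on every channel; with Dmin greater than Dmax the same bytes mean no ink at all, which is consistent with a black background turning white.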

One thing the "dentex" image posted above and all the other CMYK JPEGs I have tried that exhibit the inversion problem have in common is the following attributes in the EXIF data:

  • APP14 Flags 0 is present
  • APP14 Flags 1 is present
  • Color Transform = YCCK

This data can be viewed using exif.tools and the tags are described here. The APP14 tags seem to be specific to Adobe. Perhaps these attributes can be used to determine whether to apply the inversion decode, though I haven't seen a CMYK JPEG that doesn't have these attributes.
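
The same attributes can also be read without an external tool. The sketch below walks the JPEG segments with the standard library only and decodes an Adobe APP14 segment if it finds one. The assumed payload layout (the 5-byte "Adobe" identifier, then a 2-byte version, 2-byte Flags 0, 2-byte Flags 1 and a 1-byte color transform) and the transform values follow the linked tag description; the function name is hypothetical.

```python
import struct

# Color transform values as listed in the Adobe APP14 tag description.
TRANSFORM_NAMES = {0: "Unknown (RGB or CMYK)", 1: "YCbCr", 2: "YCCK"}


def find_adobe_app14(data: bytes):
    """Return the Adobe APP14 fields as a dict, or None if the segment is absent."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG stream: missing SOI marker")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            break  # not at a marker: give up rather than guess
        marker = data[pos + 1]
        if marker == 0xFF:  # fill byte before a marker
            pos += 1
            continue
        if marker == 0x01 or 0xD0 <= marker <= 0xD7:  # markers without a length
            pos += 2
            continue
        if marker in (0xDA, 0xD9):  # start of scan or end of image
            break
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        payload = data[pos + 4:pos + 2 + length]
        if marker == 0xEE and len(payload) >= 12 and payload[:5] == b"Adobe":
            version, flags0, flags1, transform = struct.unpack(
                ">HHHB", payload[5:12])
            return {
                "version": version,
                "flags0": flags0,
                "flags1": flags1,
                "transform": transform,
                "transform_name": TRANSFORM_NAMES.get(transform, "Other"),
            }
        pos += 2 + length
    return None
```

Calling `find_adobe_app14(open("image.jpg", "rb").read())` on one of the affected files should then report the YCCK transform, assuming the file matches the description above.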

@liZe
Member

liZe commented Jun 6, 2024

@endaxi Thanks a lot for this investigation.

This kind of explains why the trick works, but not why it needs to be done in the first place.

That’s a great first step!

I think that ImageMagick/ImageMagick#6094 is the same problem. The root of the problem seems to be a bug in Photoshop:

https://github.com/libjpeg-turbo/libjpeg-turbo/blob/3c17063ef1ab43f5877f19d670dc39497c5cd036/libjpeg.txt#L1569-L1582

I suppose that searching for APP14 flags could be a reliable way to know it’s generated by Adobe products, and then add the filter in this case. I think that EXIF tags can be read with Pillow too. Interested in opening a pull request?
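
As a rough sketch of that idea: Pillow's JPEG plugin stores the Adobe version and color transform in `image.info` under the `adobe` and `adobe_transform` keys when it finds an Adobe APP14 segment. The helper names below are hypothetical, and whether the mere presence of that segment is a sufficient criterion is an assumption, not something established in this thread.

```python
from io import BytesIO

from PIL import Image


def looks_like_adobe_cmyk_jpeg(image: Image.Image) -> bool:
    """Heuristic (assumption): a CMYK JPEG carrying an Adobe APP14 segment."""
    return (
        image.format == "JPEG"
        and image.mode == "CMYK"
        and "adobe" in image.info
    )


def describe(data: bytes) -> dict:
    """Summarize the fields relevant to the inversion question."""
    with Image.open(BytesIO(data)) as image:
        return {
            "mode": image.mode,
            "adobe_version": image.info.get("adobe"),
            "adobe_transform": image.info.get("adobe_transform"),
            "invert": looks_like_adobe_cmyk_jpeg(image),
        }
```

A renderer could then add the inverted Decode array only when `invert` is true and leave every other image untouched.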

@endaxi

endaxi commented Jun 7, 2024

Interested in opening a pull request?

Sure, I can do that.

endaxi pushed a commit to endaxi/WeasyPrint that referenced this issue Jun 7, 2024
Addresses issue [Kozea#1128](Kozea#1128)

According to
[libjpeg](https://github.com/libjpeg-turbo/libjpeg-turbo/blob/3c17063ef1ab43f5877f19d670dc39497c5cd036/libjpeg.txt#L1569-L1582)
"it appears that Adobe Photoshop writes inverted data in CMYK JPEG
files"

An Adobe JPEG can be identified by the presence of the
[APP14](https://exiftool.org/TagNames/JPEG.html#Adobe)
segment.

The code now checks for the `APP14` segment in `RasterImage` and adds a
Decode Array to the XObject when rendering the CMYK JPEG. The value of
the Decode Array is the inverse of the default value for DeviceCMYK
according to the PDF spec. This has the effect of inverting the inverted
image back to normal.
@liZe liZe linked a pull request Jun 8, 2024 that will close this issue
@liZe liZe closed this as completed in #2179 Jun 8, 2024
@alessandroghigi

Hi, I'm still having issues with this. A JPG image seems to be inverted (black background).
This is the HTML code:

<html>
   <head>
      <title>PDF</title>
      <style id="62c475c7-8ef3-4807-a043-292d42571b44_style" type="text/css">@page { size: 100.00mm 40.00mm; margin: 0mm; }
         img.company-logo-00d56a09-3652-44ea-af8c-308a175e8e84 {width:100%;height:100%;position:relative;}
      </style>
   </head>
   <body style="margin: 0px;">
      <div id="62c475c7-8ef3-4807-a043-292d42571b44_container000" style="transform: rotate(0deg) scale(1, 1); width: 100%; height: 100%; left: 0mm; top: 0mm; position: absolute; user-select: none; pointer-events: none;">
         <div style="display: block; position: absolute; left: 0px; top: 0px; width: 100%; height: 100%;"></div>
         <div style="position:absolute;width:141.74px;left:250px;top:10px;height:165.36px;"><img class="company-logo-00d56a09-3652-44ea-af8c-308a175e8e84" src="test_image.jpg"></div>
      </div>
   </body>
</html>

Please find attached image and generated pdf file.

Thanks!

test_image.pdf
test_image

@liZe liZe added this to the 63.0 milestone Jun 27, 2024
@liZe
Member

liZe commented Jun 27, 2024

Hi @alessandroghigi,

The fix will be included in version 63.0, it’s not in 62.x. Your image works with the current main branch.

@alessandroghigi

Ok, thanks.

@CosaroLisa

Hi @alessandroghigi,

The fix will be included in version 63.0, it’s not in 62.x. Your image works with the current main branch.

Hello, do you plan to release version 63 any time soon?

@liZe
Member

liZe commented Jul 15, 2024

Hello, do you plan to release version 63 any time soon?

We’d like to have submit buttons and Color level 4 (#1630) supported before the release. I think that it will take at least 1 month before 63 is released.

If you really want to use the main branch before it’s stable, you can install it with pip install --upgrade git+https://github.com/Kozea/WeasyPrint.

youen-dev added a commit to youen-dev/WeasyPrint that referenced this issue Sep 5, 2024