PDF support #97

Merged
merged 38 commits into from
Sep 13, 2023

@jonmmease jonmmease commented Sep 9, 2023

Closes #91

Overview

This PR adds dependency-free PDF export support to VlConvert. It's been a journey to get to this point, but I'm really happy with the end result.

How it works

This PR uses VlConvert's SVG export path and then converts the resulting SVG image to a PDF. The bulk of the work is done by the wonderful svg2pdf crate. svg2pdf relies on usvg to convert the original SVG image to a simplified collection of paths, and then converts these paths to PDF.

Text

It's possible to render text using svg2pdf by using usvg to convert text to paths before the SVG tree is passed to svg2pdf. But this approach is suboptimal as the resulting text cannot be selected or searched in a PDF viewer like Adobe Acrobat. I opened an svg2pdf issue in January to talk about embedding text. The typst team (who developed svg2pdf, and the pdf-writer crate it depends on) have been really helpful through this process.

It turned out to be possible to accomplish text embedding on top of svg2pdf without changes to the core library. This PR uses pdf-writer to construct a new PDF document and then uses svg2pdf to convert everything in an SVG file except text to a PDF XObject. Then it traverses the SVG tree again and overlays PDF text on top of the XObject.
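The two-pass idea above can be illustrated with a small standard-library Python sketch. This is not the PR's implementation, which works in Rust on the usvg tree with svg2pdf and pdf-writer. The `split_text` helper and its output format are hypothetical, and the sketch ignores group transforms, fonts, and styling. It only shows the split: one pass yields the graphics with text stripped out (what would become the XObject), and the collected text items are what a second pass would overlay as real PDF text.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
# Serialize SVG elements without an "ns0:" prefix.
ET.register_namespace("", SVG_NS)


def split_text(svg_source: str):
    """Hypothetical helper: return (svg_without_text, text_items).

    svg_without_text stands in for the part rendered as a PDF XObject;
    text_items stands in for the text overlaid afterwards.
    Assumption: transforms on parent groups are ignored for brevity.
    """
    root = ET.fromstring(svg_source)
    text_items = []

    def visit(node):
        # Iterate over a copy because children are removed while walking.
        for child in list(node):
            if child.tag == f"{{{SVG_NS}}}text":
                text_items.append(
                    {
                        "x": float(child.get("x", "0")),
                        "y": float(child.get("y", "0")),
                        "content": "".join(child.itertext()),
                    }
                )
                node.remove(child)
            else:
                visit(child)

    visit(root)
    return ET.tostring(root, encoding="unicode"), text_items


if __name__ == "__main__":
    svg = (
        '<svg xmlns="http://www.w3.org/2000/svg">'
        '<g><rect width="10" height="10"/><text x="5" y="8">Hello</text></g>'
        "</svg>"
    )
    graphics, texts = split_text(svg)
    print(graphics)  # graphics only, no <text> elements
    print(texts)     # positions and strings to overlay as selectable text
```

Keeping the text out of the vector pass is what lets the overlay be emitted as real, selectable text rather than as outlines.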

The logic for using pdf-writer to embed fonts in the resulting PDF file was taken from the typst project repository. It would be nice to eventually find a way to avoid duplicating this logic, but the duplication is worth it for the time being.

Testing

This logic is tested from Python using pypdfium2 to convert the PDF to a PNG image and comparing it to our existing PNG baselines. The comparison tolerance needs to be a little larger due to the slight differences in text rendering between pdfium and resvg, but they still match really well!
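As a rough illustration of what a looser comparison tolerance means, here is a minimal standard-library sketch. The metric is an assumption: the PR does not say which similarity measure the Python tests use (the Rust tests in the diff below use dssim), so a plain mean absolute difference stands in for it. The function names, the nested-list image representation, and the threshold values are all hypothetical.

```python
def mean_abs_diff(img_a, img_b):
    """Mean absolute per-channel difference, scaled to the range [0, 1].

    Images are nested lists: rows of pixels, each pixel a tuple of
    0-255 channel values. This is a stand-in metric, not the one the
    PR's tests necessarily use.
    """
    if len(img_a) != len(img_b) or any(
        len(row_a) != len(row_b) for row_a, row_b in zip(img_a, img_b)
    ):
        raise ValueError("images must have the same dimensions")
    total = 0
    count = 0
    for row_a, row_b in zip(img_a, img_b):
        for px_a, px_b in zip(row_a, row_b):
            for ch_a, ch_b in zip(px_a, px_b):
                total += abs(ch_a - ch_b)
                count += 1
    return total / (255 * count) if count else 0.0


def images_match(img_a, img_b, tolerance):
    """True when the difference is within the given tolerance."""
    return mean_abs_diff(img_a, img_b) <= tolerance


if __name__ == "__main__":
    baseline = [[(255, 255, 255), (0, 0, 0)]]
    # Slightly different rendering, e.g. from text anti-aliasing.
    rendered = [[(255, 255, 255), (10, 10, 10)]]
    print(images_match(baseline, rendered, tolerance=0.01))
    print(images_match(baseline, rendered, tolerance=0.05))
```

A small rendering difference fails the strict threshold but passes the looser one, which is the trade-off the paragraph above describes for pdfium versus resvg text rendering.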

TODO

  • Update to svg2pdf 0.7 once it is released

@@ -274,7 +274,7 @@ mod test_vl2png {
 let output_png = dssim::load_image(&Dssim::new(), &output).unwrap();

 let attr = Dssim::new();
-let (diff, _) = attr.compare(&expected_png, &output_png);
+let (diff, _) = attr.compare(&expected_png, output_png);

clippy --fix caught this

@jonmmease jonmmease marked this pull request as draft September 9, 2023 15:38
@jonmmease jonmmease mentioned this pull request Sep 9, 2023
@jonmmease jonmmease marked this pull request as ready for review September 13, 2023 12:33
@jonmmease jonmmease merged commit 416b592 into main Sep 13, 2023
@jonmmease jonmmease deleted the jonmmease/pdf-embed-2 branch September 13, 2023 15:04
@domoritz

Very cool.
