Render PDF from multiple HTML files #214

Open
sake92 opened this issue Apr 22, 2018 · 10 comments

@sake92

sake92 commented Apr 22, 2018

Hi all, is it possible to render a PDF from multiple input files/strings, like in this example from flyingsaucer?
I had a problem with flyingsaucer: it throws something like "Page 21 was requested but document has only 20 pages"... 😞
There, I could set a base URL for every page via the setDocumentFromString method.

With openhtmltopdf I should concatenate all the HTML files into one, right? But the base folder is not the same for all of them... 😄 In some of them the stylesheet path is "../styles/main.css", but in others it is "../../styles/main.css" (deeper folder).
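
To illustrate the base-folder problem: if each file's base URI is derived from its own directory, the two different relative links end up pointing at the same stylesheet. This is a minimal sketch using only the JDK; the class name, the `baseUriFor` and `resolve` helpers, and the `/site/...` paths are hypothetical and not part of flyingsaucer or openhtmltopdf.

```java
import java.net.URI;
import java.nio.file.Path;
import java.nio.file.Paths;

public class BaseUriSketch {

    /** Returns the URI of the directory containing htmlFile, always ending in "/". */
    public static String baseUriFor(Path htmlFile) {
        Path dir = htmlFile.toAbsolutePath().normalize().getParent();
        String uri = dir.toUri().toString();
        // Path.toUri() only appends the slash when the directory exists on disk,
        // and relative resolution needs it, so add it when missing.
        return uri.endsWith("/") ? uri : uri + "/";
    }

    /** Resolves a relative link (as written in the HTML) against the file's base URI. */
    public static URI resolve(Path htmlFile, String relativeLink) {
        return URI.create(baseUriFor(htmlFile)).resolve(relativeLink);
    }

    public static void main(String[] args) {
        // Hypothetical layout: two pages at different depths sharing one stylesheet.
        Path shallow = Paths.get("/site/posts/a.html");
        Path deep = Paths.get("/site/posts/2018/b.html");
        System.out.println(resolve(shallow, "../styles/main.css"));
        System.out.println(resolve(deep, "../../styles/main.css"));
    }
}
```

The string returned by `baseUriFor` is the kind of value that could be supplied as the per-document base URL when each HTML file is rendered separately.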

@danfickle
Owner

This will work, sort of. The only problem is that counter(page) and counter(pages) will be wrong on subsequent documents. I'll try to get these working again in the fast renderer that I'm working on as part of #180.

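	// Issue #214: render several HTML documents into one shared PDDocument.
	import java.io.FileOutputStream;
	import java.io.OutputStream;
	import org.apache.pdfbox.pdmodel.PDDocument;
	import com.openhtmltopdf.pdfboxout.PdfBoxRenderer;
	import com.openhtmltopdf.pdfboxout.PdfRendererBuilder;

	public static void main(String...args) throws Exception {
		String[] uris = new String[] {
				"file:///Users/me/Documents/pdf-issues/issue-206.htm",
				"file:///Users/me/Documents/pdf-issues/issue-180-p.htm"
		};

		// All renderers append their pages to this one document.
		PDDocument doc = new PDDocument();

		for (String uri : uris) {
			// A fresh builder and renderer per input document.
			PdfRendererBuilder builder = new PdfRendererBuilder();
			builder.withUri(uri);
			builder.usePDDocument(doc);
			PdfBoxRenderer renderer = builder.buildPdfRenderer();
			// Lay out and draw the pages without saving or closing the shared document.
			renderer.createPDFWithoutClosing();
			renderer.close();
		}

		// Save once, after every input has been rendered.
		try (OutputStream os = new FileOutputStream("/Users/me/Documents/pdf-issues/output/mytest-214.pdf")) {
			doc.save(os);
		}
	}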

@sake92
Author

sake92 commented May 6, 2018

Thanks @danfickle, it works!!! 😄

However, I have some issues with characters like č, ć, etc.: they get turned into the # character.

Also, code highlighting with http://prismjs.com/ doesn't work the same as in the browser.
It seems to "see" just the <code> markup, not the prismjs goodies that get inserted after the HTML is parsed.

@danfickle
Owner

Hi @sake92

For the characters, you will still have to embed a valid font for most languages other than English. See the template author's guide on the readme for tips.

Regarding prismjs, this project doesn't run JavaScript, so you would probably need to find a Java syntax highlighter (see link below) or somehow get prismjs running in a JavaScript engine available from Java (Nashorn or Rhino).

https://stackoverflow.com/questions/1853419/syntax-highlighter-for-java
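
As a rough illustration of highlighting on the Java side before the HTML reaches the renderer, the sketch below escapes a source snippet and wraps a handful of keywords in spans that a stylesheet could colour. It is not prismjs and not one of the libraries from the link: the class name, the `keyword` CSS class, and the short keyword list are all assumptions, and keywords inside strings or comments are highlighted too, which a real lexer would avoid.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class KeywordHighlighterSketch {

    // A deliberately tiny keyword list; a real highlighter needs a proper lexer.
    private static final Pattern KEYWORDS = Pattern.compile(
            "\\b(public|private|static|final|class|void|int|new|return|for|if|else)\\b");

    /** Escapes the characters that are significant in HTML text content. */
    public static String escapeHtml(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Escapes the source first, then wraps each keyword in a span. */
    public static String highlight(String source) {
        Matcher m = KEYWORDS.matcher(escapeHtml(source));
        return m.replaceAll("<span class=\"keyword\">$1</span>");
    }

    public static void main(String[] args) {
        String code = "public static int size = new Box<String>().size();";
        System.out.println("<pre><code>" + highlight(code) + "</code></pre>");
    }
}
```

The generated markup is static, so it looks the same whether or not JavaScript runs.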

@sake92
Author

sake92 commented May 23, 2018

@danfickle as far as I'm concerned, you can close this issue. The proposed solution works! 😌

If anyone is interested, here is my implementation, from my static site generator: https://github.com/sake92/hepek/blob/master/src/main/scala/ba/sake/hepek/pdf/PdfGenerator.scala
I used Selenium ChromeDriver to wait for the JS to load, etc.

Example of PDF with some math: https://blog.sake.ba/pdfs/Matematika.pdf

@taras19921

Dear @danfickle, how can I generate one PDF file with 2 pages from 2 HTML templates?

@danfickle
Owner

I strongly recommend combining the templates if possible. Other than that, the sample above in this thread should work. What are you having trouble with?

@taras19921

taras19921 commented Nov 14, 2018

Thank you for the quick reply. Yes, I've used the example above, but with some modifications:

	try (OutputStream os = new FileOutputStream(filePath)) {
		PDDocument doc = new PDDocument();
		for (String html : htmlPagesWithValues) {
			PdfRendererBuilder builder = new PdfRendererBuilder();
			builder.defaultTextDirection(BaseRendererBuilder.TextDirection.LTR);
			builder.useDefaultPageSize(210, 297, BaseRendererBuilder.PageSizeUnits.MM);
			builder.useProtocolsStreamImplementation(new InternalFSStreamFactory(), "localProtocol");
			builder.withHtmlContent(html, "");
			builder.useSVGDrawer(new BatikSVGDrawer());
			builder.usePDDocument(doc);
			PdfBoxRenderer renderer = builder.buildPdfRenderer();
			renderer.createPDFWithoutClosing();
		}
		doc.save(os);
	} catch (Exception ex) {
	}

It works, but only without the renderer.close(); line. With that line, opening the PDF fails with "Format error: Not a PDF or corrupted", and the file size is 0 KB.
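
One possible reading of the 0 KB symptom, not confirmed anywhere in this thread, is that an exception is thrown before doc.save(os) and then discarded by the empty catch block, leaving behind the empty file that FileOutputStream created on opening. The JDK-only sketch below reproduces just that pattern; `failingStep` is a hypothetical stand-in for whatever might fail, and no PDF library is involved.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class SwallowedExceptionSketch {

    /** Stands in for any step that fails before doc.save(os) is reached. */
    static void failingStep() throws IOException {
        throw new IOException("simulated failure before save");
    }

    /** Repeats the try / empty-catch shape on a temp file and returns the file's size. */
    public static long sizeAfterSwallowedFailure() {
        try {
            Path target = Files.createTempFile("swallowed", ".pdf");
            try (OutputStream os = new FileOutputStream(target.toFile())) {
                failingStep();            // throws, so nothing is written
                os.write(new byte[] {1}); // never reached
            } catch (Exception ex) {
                // Empty on purpose: the failure is silently discarded.
            }
            long size = Files.size(target);
            Files.delete(target);
            return size;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sizeAfterSwallowedFailure()); // prints 0
    }
}
```

Logging or rethrowing `ex` in the snippet above would show whether something like this is actually happening.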

"I strongly recommend combining the templates" - do you mean concatenate the two html templates?

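If "combining" is taken to mean concatenation, one way to picture it is to pull the body out of each HTML string and join the bodies into a single document, optionally forcing a page break between them with the standard CSS `page-break-after` property. This is only a sketch under stated assumptions: the class and method names are hypothetical, the regex-based body extraction is naive, each document's `<head>` (including stylesheet links) is dropped, and relative links would still need to agree on one base folder.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CombineHtmlSketch {

    // Naive body extraction; assumes well-formed input with a single <body> element.
    private static final Pattern BODY = Pattern.compile(
            "<body[^>]*>(.*?)</body>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    /** Returns the inner HTML of <body>, or the whole input if no <body> is found. */
    public static String bodyOf(String html) {
        Matcher m = BODY.matcher(html);
        return m.find() ? m.group(1) : html;
    }

    /** Wraps each body in a <div>; optionally breaks the page after all but the last. */
    public static String combine(List<String> documents, boolean pageBreakBetween) {
        StringBuilder out = new StringBuilder("<html><head></head><body>");
        for (int i = 0; i < documents.size(); i++) {
            boolean last = i == documents.size() - 1;
            String style = (pageBreakBetween && !last)
                    ? " style=\"page-break-after: always\"" : "";
            out.append("<div").append(style).append('>')
               .append(bodyOf(documents.get(i)))
               .append("</div>");
        }
        return out.append("</body></html>").toString();
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList(
                "<html><body><p>First template</p></body></html>",
                "<html><body><p>Second template</p></body></html>");
        System.out.println(combine(docs, true));
    }
}
```

With `pageBreakBetween` set to false, the second body simply follows the first in the same flow, and because the result is a single document, page counters are computed over the whole of it.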
@roldevg

roldevg commented Sep 10, 2020

It works for me too. I implemented the integration of multiple processed templates (Thymeleaf) into one PDF file.

In Kotlin:


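// The original signature omitted the parameter type; List<String> is an assumption
// (withHtmlContent takes the HTML as a String). resourcesBaseUri is defined elsewhere.
fun convertHtmlToPdf(processedTemplateFiles: List<String>, outputStream: OutputStream) {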
	val doc = PDDocument()
	val builder = PdfRendererBuilder()
	for (processedTemplateContent in processedTemplateFiles) {
		builder.useFastMode()
		builder.withHtmlContent(processedTemplateContent, resourcesBaseUri)
		builder.usePDDocument(doc)
		val buildPdfRenderer = builder.buildPdfRenderer()
		buildPdfRenderer.layout()
		buildPdfRenderer.createPDFWithoutClosing()
		buildPdfRenderer.close()
	}
	doc.save(outputStream)
}

But if you need logic that spans all of these files, it will be hard to implement, for example if you need continuous page numbering across them.

@OJHOCY

OJHOCY commented Jul 26, 2022

	public static void main(String...args) throws Exception {
		String[] uris = new String[] {
				"file:///Users/me/Documents/pdf-issues/issue-206.htm",
				"file:///Users/me/Documents/pdf-issues/issue-180-p.htm"
		};

		PDDocument doc = new PDDocument();

		for (String uri : uris) {
			PdfRendererBuilder builder = new PdfRendererBuilder();
			builder.withUri(uri);
			builder.usePDDocument(doc);
			PdfBoxRenderer renderer = builder.buildPdfRenderer();
			renderer.createPDFWithoutClosing();
			renderer.close();
		}

		OutputStream os = new FileOutputStream("/Users/me/Documents/pdf-issues/output/mytest-214.pdf");
		doc.save(os);
		os.close();
	}

Does doc (the PDDocument instance) need to have its close method called?

@varshaldavda

Hi @danfickle, the code you provided works for me to generate a PDF from multiple HTML files, but it starts each HTML file on a new page instead of continuing where the previous one ended.
Could you please guide me?
