HTML to PDF converter for Java and .NET


PD4ML: Multiple HTML source documents

If you need to produce a single PDF document from multiple HTML documents, the straightforward approach of merging the HTML files into one will not work: the closing </body> or </html> tag of the first document signals to the HTML parser that the rest of the input must be ignored.

One possible solution is to preprocess the merged document and remove all occurrences of </body> and </html> before it is passed to PD4ML. PD4ML's HTML normalizer module should then fix the inconsistency and auto-close the tags, even though they are also removed from the last chained document.
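
The preprocessing step described above could look roughly like the following sketch. It uses only the Java standard library; the class name HtmlMerger and the method merge() are hypothetical names chosen for illustration and are not part of PD4ML. The sketch removes only the closing tags and leaves the repeated opening <html> and <body> tags in place, on the assumption that the normalizer copes with them.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the PD4ML API.
public class HtmlMerger {

    // Matches </body> and </html> in any letter case, allowing whitespace before '>'.
    private static final Pattern CLOSING_TAGS =
            Pattern.compile("</(?:body|html)\\s*>", Pattern.CASE_INSENSITIVE);

    /** Concatenates the documents in order after stripping every </body> and </html> tag. */
    public static String merge(List<String> htmlDocuments) {
        StringBuilder merged = new StringBuilder();
        for (String html : htmlDocuments) {
            merged.append(CLOSING_TAGS.matcher(html).replaceAll(""));
            merged.append('\n'); // keep the documents visually separated in the merged source
        }
        return merged.toString();
    }

    public static void main(String[] args) {
        String first = "<html><body><p>First</p></body></html>";
        String second = "<HTML><BODY><p>Second</p></BODY ></HTML>";
        // The merged string is what would then be handed to PD4ML for conversion.
        System.out.println(merge(Arrays.asList(first, second)));
    }
}
```

A regular expression is enough here because only two fixed closing tags are targeted. Note that it would also strip those tags if they appeared as literal text inside a script or comment.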

The recommended solution is to use the special versions of the render() method that PD4ML defines, which accept multiple HTML documents for conversion into a single PDF: render(URL[],...) and render(StringReader[],...). This approach gives a more predictable result, because the CSS styles of the individual documents cannot intermix as they can when the documents are merged. It has one limitation: each source HTML document starts a new PDF page, so there is no way to continue a half-blank page with the next document.
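
As a minimal sketch of preparing input for the render(StringReader[],...) variant, the following standard-library code wraps each HTML string in its own StringReader. The class name MultiSourceInput and the method toReaders() are hypothetical. The remaining arguments of render() are not listed on this page, so the call itself appears only as a comment and should be checked against the PD4ML API reference.

```java
import java.io.StringReader;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the PD4ML API.
public class MultiSourceInput {

    /** Wraps each HTML document in its own StringReader, preserving the given order. */
    public static StringReader[] toReaders(List<String> htmlDocuments) {
        StringReader[] readers = new StringReader[htmlDocuments.size()];
        for (int i = 0; i < readers.length; i++) {
            readers[i] = new StringReader(htmlDocuments.get(i));
        }
        return readers;
    }

    public static void main(String[] args) {
        StringReader[] readers = toReaders(Arrays.asList(
                "<html><body><h1>Chapter 1</h1></body></html>",
                "<html><body><h1>Chapter 2</h1></body></html>"));

        // The array would then be passed to PD4ML's render(StringReader[], ...);
        // the other arguments are omitted because this page does not list them.
        System.out.println(readers.length + " HTML sources prepared");
    }
}
```

Because each document keeps its own complete markup, no tags need to be stripped, and each one begins on a new PDF page as noted above.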

Copyright ©2004-23 zefer|org. All rights reserved.