HTML to PDF converter for Java and .NET


PD4ML: HTML to raster image conversion

PD4ML, as an HTML-to-PDF converter, consists of two relatively independent modules: an HTML rendering engine and a PDF output pseudo-device derived from java.awt.Graphics. This separation makes sending a rendered HTML document to an image (or to any other Graphics device) a fairly trivial task.
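To illustrate why a renderer that paints through the Graphics API can target an image so easily, here is a minimal sketch that uses only the standard java.awt classes. The Painter interface and the paintToImage() helper are hypothetical names invented for this example and are not part of the PD4ML API; they stand in for any engine that draws into a Graphics object.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

final class GraphicsTargetSketch {

    /** Hypothetical stand-in for a layout engine that paints through the Graphics API. */
    interface Painter {
        void paint(Graphics2D g, int width, int height);
    }

    /** Runs the painter against the Graphics2D of a freshly allocated in-memory image. */
    static BufferedImage paintToImage(Painter painter, int width, int height) {
        BufferedImage image = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = image.createGraphics();
        try {
            painter.paint(g, width, height);
        } finally {
            g.dispose(); // release the graphics context even if painting fails
        }
        return image;
    }

    public static void main(String[] args) {
        BufferedImage img = paintToImage((g, w, h) -> {
            g.setColor(Color.WHITE);
            g.fillRect(0, 0, w, h);      // page background
            g.setColor(Color.BLACK);
            g.fillRect(10, 10, 50, 20);  // a block standing in for rendered content
        }, 200, 100);
        System.out.println(img.getWidth() + "x" + img.getHeight());
    }
}
```

The painter never learns what kind of device it draws on, which is the property the paragraph above relies on.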

To make the task even easier, we added an image output mode to the PD4ML API. With a simple output format switch you can produce a PNG or a multipage TIFF:

pd4ml.outputFormat(PD4Constants.PNG8);
// or 
pd4ml.outputFormat(PD4Constants.PNG24);
// or 
pd4ml.outputFormat(PD4Constants.TIFF);
The equivalents in the JSP taglib:
<pd4ml:transform ... outputFormat="png8"> ... </pd4ml:transform>

<pd4ml:transform ... outputFormat="png24"> ... </pd4ml:transform>

<pd4ml:transform ... outputFormat="tiff"> ... </pd4ml:transform>
(in this case the transform tag automatically sets the corresponding Content-Type HTTP header, "image/png" or "image/tiff")

In the command line tool:

java -Xmx512m -Djava.awt.headless=true -cp ./pd4ml.jar Pd4Cmd <URL> 1200 -out thumbnail.png -outformat png8

java -Xmx512m -Djava.awt.headless=true -cp ./pd4ml.jar Pd4Cmd <URL> 1200 -out thumbnail.png -outformat png24

java -Xmx512m -Djava.awt.headless=true -cp ./pd4ml.jar Pd4Cmd <URL> 1200 -out thumbnail.tiff -outformat tiff
For PNG output PD4ML ignores page breaks, but it respects them when generating a multipage TIFF.

The HTML-to-image conversion mode has some other limitations, including:

  • Headers and footers are not supported
  • Footnotes are not supported
  • No hyperlinks (of course)
  • A generated TOC has no page numbering
  • No page insets are applied (however, document body margins are there)

Also, if you need to process the image data further, the PD4ML API provides a couple of specialized renderAsImages() methods, which return an array of BufferedImage objects representing the document pages.
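As an example of such further processing, the sketch below scales an array of page images down to thumbnails with the standard java.awt classes. It assumes you already hold a BufferedImage[] (for instance, one returned by renderAsImages(), whose exact signatures are not shown here); the PageThumbnails class and its scaleToWidth() method are hypothetical helpers written for this example.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;

final class PageThumbnails {

    /** Scales every page to the given width, preserving each page's aspect ratio. */
    static BufferedImage[] scaleToWidth(BufferedImage[] pages, int targetWidth) {
        BufferedImage[] result = new BufferedImage[pages.length];
        for (int i = 0; i < pages.length; i++) {
            BufferedImage page = pages[i];
            int targetHeight = Math.max(1,
                    Math.round(page.getHeight() * (float) targetWidth / page.getWidth()));
            BufferedImage thumb =
                    new BufferedImage(targetWidth, targetHeight, BufferedImage.TYPE_INT_RGB);
            Graphics2D g = thumb.createGraphics();
            try {
                // Bilinear interpolation gives smoother text than the default nearest-neighbor.
                g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                        RenderingHints.VALUE_INTERPOLATION_BILINEAR);
                g.drawImage(page, 0, 0, targetWidth, targetHeight, null);
            } finally {
                g.dispose();
            }
            result[i] = thumb;
        }
        return result;
    }

    public static void main(String[] args) {
        // Blank stand-in pages; in real use these would come from the converter.
        BufferedImage[] pages = {
            new BufferedImage(1200, 2400, BufferedImage.TYPE_INT_RGB),
            new BufferedImage(1200, 600, BufferedImage.TYPE_INT_RGB)
        };
        for (BufferedImage t : scaleToWidth(pages, 300)) {
            System.out.println(t.getWidth() + "x" + t.getHeight());
        }
    }
}
```

Each thumbnail is a new image, so the full-size pages can be released for garbage collection as soon as they are no longer referenced.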

The biggest source of trouble with image output is memory allocation. Even a relatively small 1000x5000 px HTML layout requires at least 20 MB for the raw image bytes (1000 × 5000 pixels at 4 bytes per pixel), plus the BufferedImage class infrastructure overhead.
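The arithmetic behind that figure can be checked before rendering. The following is a rough sketch, assuming 4 bytes per pixel (as in BufferedImage.TYPE_INT_ARGB); the ImageMemoryEstimate class and its methods are hypothetical helpers for this example, and the result is only a lower bound that ignores object overhead and any intermediate buffers.

```java
final class ImageMemoryEstimate {

    // Assumption: 4 bytes per pixel, as in BufferedImage.TYPE_INT_ARGB.
    static final int BYTES_PER_PIXEL = 4;

    /** Returns width * height * 4, computed in long arithmetic to avoid int overflow. */
    static long rawBytes(int widthPx, int heightPx) {
        if (widthPx <= 0 || heightPx <= 0) {
            throw new IllegalArgumentException("dimensions must be positive");
        }
        return (long) widthPx * heightPx * BYTES_PER_PIXEL;
    }

    /** True if the raw pixel data is smaller than the heap the JVM could still provide. */
    static boolean fitsInHeap(int widthPx, int heightPx) {
        Runtime rt = Runtime.getRuntime();
        long used = rt.totalMemory() - rt.freeMemory();
        long available = rt.maxMemory() - used;
        return rawBytes(widthPx, heightPx) < available;
    }

    public static void main(String[] args) {
        System.out.println(rawBytes(1000, 5000)); // prints 20000000
        System.out.println(fitsInHeap(1000, 5000));
    }
}
```

If fitsInHeap() returns false, raising the -Xmx value or reducing the layout width is likely needed before attempting the conversion.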

Copyright ©2004-24 zefer|org. All rights reserved.