Pd4Cmd is a Java command line tool built on top of the PD4ML HTML-to-PDF
converter library. The tool offers access to virtually all PD4ML API
functionality and makes it possible to use the PD4ML converter as a standalone
application or as part of non-Java environments/applications.
- HTML-to-PDF conversion with the absolute minimum of parameters
Win32:
java -Xmx512m -cp .\pd4ml.jar Pd4Cmd "http://old.pd4ml.com" 1200
UNIX-derived operating systems:
java -Xmx512m -Djava.awt.headless=true -cp ./pd4ml.jar Pd4Cmd 'http://old.pd4ml.com' 1200
The -Xmx512m option overrides the default Java heap size
limit; here it is set to 512 MB.
On UNIX platforms, -Djava.awt.headless=true allows the application
to run on servers without a graphics environment or from remote ssh/telnet
sessions.
"http://old.pd4ml.com" and 1200 are the HTML source URL and
the htmlWidth (virtual "browser" frame width) parameters.
Please note: if quoting is needed, enclose the URL in double quotes on Win32
and in single quotes on UNIX.
The default PDF document format is A4 / PORTRAIT.
In the example, the 1200px width of the rendered document is
mapped to the 595pt width of the A4 page format.
If the output file path is omitted, the output is sent to
STDOUT and can be piped to another application.
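The width mapping described above can be worked through numerically. The following is a minimal sketch, assuming the scale factor is simply the page width in points divided by htmlWidth, as the note states. It ignores page margins, and the function name is illustrative rather than part of PD4ML.

```python
A4_WIDTH_PT = 595  # A4 portrait width in points, as stated in the note above


def scale_factor(html_width_px, page_width_pt=A4_WIDTH_PT):
    """Points of PDF page per pixel of HTML layout (margins ignored: an assumption)."""
    if html_width_px <= 0:
        raise ValueError("htmlWidth must be positive")
    return page_width_pt / html_width_px


factor = scale_factor(1200)
# Under this assumption, an element laid out 600 px wide takes about half the page width.
print(round(factor, 3), round(600 * factor, 1))
```

A larger htmlWidth therefore gives a smaller scale factor, so the same content is rendered smaller on the page.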
- Customized HTML-to-PDF conversion
Win32:
java -Xmx512m -cp .\pd4ml.jar Pd4Cmd "http://old.pd4ml.com" 1200 LETTER -bookmarks HEADINGS -pdfforms -debug -out c:\pd4ml.pdf
UNIX-derived operating systems:
java -Xmx512m -Djava.awt.headless=true -cp ./pd4ml.jar Pd4Cmd 'http://old.pd4ml.com' 1200 LETTER -bookmarks HEADINGS -pdfforms -debug -out /tmp/pd4ml.pdf
In the examples, the generated PDF is written to the file specified
with the -out parameter. That makes it possible to use STDOUT
for debug output (-debug parameter).
The examples also force PD4ML to produce PDF outlines (bookmarks)
from the <h1>-<h6> structure of the document (-bookmarks
HEADINGS) and to convert HTML forms to interactive PDF
forms (-pdfforms).
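For use from a non-Java environment, the command lines above can be assembled and launched programmatically. The sketch below is one possible Python wrapper, not an official PD4ML binding. The helper names `build_pd4cmd` and `convert` and their keyword arguments are hypothetical, and the flags are limited to those shown in the examples. Because the arguments are passed as a list without a shell, no platform-specific quoting of the URL is needed.

```python
import subprocess


def build_pd4cmd(url, html_width, page_format=None, options=(), out_path=None,
                 jar="./pd4ml.jar", heap="512m", headless=True):
    """Assemble the argument list for a Pd4Cmd HTML-to-PDF run."""
    cmd = ["java", f"-Xmx{heap}"]
    if headless:
        # Used in the UNIX examples above for servers without a graphics environment.
        cmd.append("-Djava.awt.headless=true")
    cmd += ["-cp", jar, "Pd4Cmd", url, str(html_width)]
    if page_format:
        cmd.append(page_format)
    cmd += list(options)
    if out_path:
        cmd += ["-out", out_path]
    return cmd


def convert(url, html_width, **kwargs):
    """Run Pd4Cmd (requires java and pd4ml.jar to be available).

    Without out_path the PDF bytes arrive in result.stdout.
    """
    return subprocess.run(build_pd4cmd(url, html_width, **kwargs),
                          capture_output=True, check=True)


cmd = build_pd4cmd("http://old.pd4ml.com", 1200, "LETTER",
                   options=["-bookmarks", "HEADINGS", "-pdfforms", "-debug"],
                   out_path="/tmp/pd4ml.pdf")
print(" ".join(cmd))
```

Keeping command construction separate from execution makes the argument list easy to log or inspect before Java is started.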
Below is a list of all supported parameters with brief descriptions.
- PDF meta data reporting
java -Xmx512m -cp .\pd4ml.jar Pd4Cmd -tools file:c:/docs/test.pdf -printpermissions -printauthor -printtitle -printpagenum
The call prints basic PDF info to STDOUT:
document permissions (as a hex number), document
author, document title and the number of document pages (as a decimal number).
The info can also be requested during HTML-to-PDF conversion,
PDF document merge or PDF page removal.
- PDF page removal (Tools mode)
java -Xmx512m -cp .\pd4ml.jar Pd4Cmd -tools file:c:/docs/test.pdf -pagerange 2-3,5+ -out c:/docs/newdoc.pdf
The call reduces the document to the given range of pages.
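To make the range syntax concrete, the sketch below expands a specification such as 2-3,5+ into page numbers. It is an illustration only, not PD4ML code: it assumes that comma-separated parts are combined as a union, that a bare number selects a single page, and it deliberately leaves out the "even"/"odd" keywords, whose combination rules are not spelled out here. The function name is hypothetical.

```python
def expand_page_range(spec, total_pages):
    """Expand a spec like '2-3,5+' into a sorted list of 1-based page numbers."""
    pages = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if part.endswith("+"):      # "5+": page 5 through the last page
            start, end = int(part[:-1]), total_pages
        elif "-" in part:           # "2-3": inclusive range
            low, high = part.split("-", 1)
            start, end = int(low), int(high)
        else:                       # "4": single page (assumed form)
            start = end = int(part)
        # Clamp to the pages that actually exist in the document.
        pages.update(range(max(start, 1), min(end, total_pages) + 1))
    return sorted(pages)


print(expand_page_range("2-3,5+", 7))
```

Under these assumptions, a 7-page document processed with 2-3,5+ would lose pages 1 and 4.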
- PDF documents merge (Tools mode)
java -Xmx512m -cp .\pd4ml.jar Pd4Cmd -tools file:c:/docs/test.pdf -merge file:c:/docs/tomerge.pdf after -out c:/docs/newdoc.pdf
Note: the -pagerange option is not available for a PDF merge.
- PDF permissions update (Tools mode)
java -Xmx512m -cp .\pd4ml.jar Pd4Cmd -tools file:c:/docs/test.pdf -permissions 28 -out c:/docs/newdoc.pdf
-permissions 28 is the sum of the permission values AllowDegradedPrint = 4, AllowModify = 8 and AllowCopy = 16. See
the API reference for more details.
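The arithmetic behind the -permissions number can be sketched as follows. The values are copied from the -permissions entry of the parameter table in this document. The dictionary and the function name are illustrative and not part of the PD4ML API.

```python
# Permission values as listed in the -permissions parameter description.
PERMISSIONS = {
    "AllowDegradedPrint": 4,
    "AllowModify": 8,
    "AllowCopy": 16,
    "AllowAnnotate": 32,
    "AllowFillingForms": 256,
    "AllowContentExtraction": 512,
    "AllowAssembly": 1024,
    "AllowPrint": 2052,  # bit 12 + bit 3, so it already includes AllowDegradedPrint
}


def permissions_value(*names):
    """Combine named permissions into the number passed to -permissions."""
    value = 0
    for name in names:
        value |= PERMISSIONS[name]
    return value


print(permissions_value("AllowDegradedPrint", "AllowModify", "AllowCopy"))
print(permissions_value("AllowCopy", "AllowPrint"))
```

A bitwise OR is used instead of a plain sum so that combining AllowPrint with AllowDegradedPrint does not count the shared bit twice. For permissions with distinct bits the result equals the sum.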
Pd4Cmd parameter | Description
|
'<url>' |
(mandatory) URL of HTML source.
- Supported protocols: file, http and https (https
may not work under some JDKs)
- If needed, enclose the URL in single quotes on UNIX-derived
platforms and in double quotes on Windows.
- Due to the specifics of Java, the file protocol requires fewer slashes than
usual when addressing absolute paths on Windows:
"file:c:/path/file.html"
Examples:
'http://old.pd4ml.com'
'http://host/doc.htm;jsessionid=873465837'
'file:c:/path/file.htm'
'file:docs/doc1.htm' (relative to the current directory)
(on Windows platform use double quotes) |
<htmlWidth> |
(mandatory) Width of "virtual
browser" frame. Base for relative width calculations. |
pageFormatName|WxH | Target page
format: either one of the predefined names or WIDTHxHEIGHT dimensions given in
typographical points. Default value: A4. Predefined page formats:
- A0 - 2384x3370 points
- A1 - 1684x2384 points
- A2 - 1190x1684 points
- A3 - 842x1190 points
- A4 - 595x842 points
- A5 - 421x595 points
- A6 - 297x421 points
- A7 - 210x297 points
- A8 - 148x210 points
- A9 - 105x148 points
- A10 - 74x105 points
- HALFLETTER - 396x612 points
- ISOB0 - 2836x4008 points
- ISOB1 - 2004x2836 points
- ISOB2 - 1418x2004 points
- ISOB3 - 1002x1418 points
- ISOB4 - 709x1002 points
- ISOB5 - 501x709 points
- LEDGER - 1224x792 points
- LEGAL - 612x1008 points
- LETTER - 612x792 points
- NOTE - 540x720 points
- TABLOID - 792x1224 points
Examples:
A3
400x400 |
-addstyle <CSS code> | Applies
additional styles to the source document. The parameter may occur
multiple times in the Pd4Cmd command line. Example:
-addstyle 'TH {background-color: tomato}
TR {page-break-inside: avoid}'
(on Windows
platform use double quotes) |
-adjustwidth | Sets htmlWidth to the rightmost edge of the HTML block content.
The option forces PD4ML to build the HTML layout with the given htmlWidth, determine the rightmost edge of the rendered content and use that value for PDF mapping
(in other words, to virtually cut off any blank area on the right side).
Notes:
- To use the option effectively, set htmlWidth to a value greater than the expected
maximal right edge offset.
- If the source document contains HTML objects whose width is set to 100%, the option
is meaningless.
- Because htmlWidth affects the HTML-to-PDF scale factor, using the option makes
font/object sizes in the resulting PDF inconsistent from document to document.
|
-author <author name> | Defines
the document author in PDF properties. Example:
-author 'Max Mustermann'
(on Windows platform use double quotes) |
-bgcolor '<#RGB>' | Defines
the background color for PDF pages. Examples:
-bgcolor '#FFFCFE'
-bgcolor 0xFFFCFE
(on Windows platform use
double quotes) |
-bgimage '<url>' | Defines
a background image for PDF pages. The image is stretched to cover the
entire page, so it makes sense to choose images with dimensions
proportional to the target page format. Examples:
-bgimage 'http://old.pd4ml.com/i/blank.jpg'
-bgimage 'file:/resources/images/blank.jpg'
(on Windows platform use double quotes) |
-bookmarks <HEADINGS|ANCHORS> |
Forces PD4ML to generate PDF bookmarks (aka outlines).
- If set to ANCHORS, PD4ML creates PDF bookmarks from
<a name="destination"> Label</a> tags. If such a tag is
empty (Label is not defined), the
destination string is used as the visible label.
- If set to HEADINGS, PD4ML creates a PDF bookmark tree structure derived from the
<H1>-<H6> structure.
Examples:
-bookmarks HEADINGS
-bookmarks ANCHORS |
-cookie <name> <value> | Defines
a cookie to be sent with the source HTML HTTP request (and all
subsequent resource requests). The parameter may occur multiple times in the
Pd4Cmd command line. Example:
-cookie JSESSIONID '9034657927465;path=/'
(on Windows platform use double quotes) |
-debug | Enables PD4ML debug output to
STDOUT. The parameter has no effect if the -out parameter
is omitted. |
-encoding <HTML encoding> |
Document encoding override |
-fitapage | Forces
PD4ML to downscale the entire HTML layout, if needed, to fit a single PDF page
vertically |
-footer '<footer HTML code>' |
(PD4ML Pro only) Defines PDF page footer
in HTML. $[page], $[total] and $[title] placeholders are supported.
Example:
-footer
'<div width=100% align=right>$[page] of $[total]</div>'
(on Windows platform use double quotes) |
-header '<header HTML code>' |
(PD4ML Pro only) Defines PDF page header
in HTML. $[page], $[total] and $[title] placeholders are supported.
Example:
-header
'<div width=100% align=right>$[page] of $[total]</div>'
(on Windows platform use double quotes) |
-insets <T,L,B,R,><mm|pt> |
Defines page margins (Top,Left,Bottom,Right). Defaults: 25,50,25,25,pt
Examples:
-insets 10,20,10,10,mm
-insets 20,40,20,20,pt |
-merge <path> <after|before> |
(PD4ML Pro only) Merges the conversion result with an existing PDF document.
after - append the existing document to the conversion result, before - prepend the document |
-multicolumn <nr,gap> |
(PD4ML Pro only) Outputs a multicolumn PDF document. nr - number of columns, gap - column padding |
-nohyperlinks | Disables conversion of
external HTML hyperlinks into PDF hyperlinks |
-noimagesplit | Disables image splitting at page breaks.
By default the splitting is enabled. If the parameter is set, PD4ML tries to place page breaks so that images are not split. If an image
height (in screen pixels) is greater than the computed page height (in screen pixels),
the image is split regardless of the option.
A similar behavior may be achieved with the IMG{page-break-inside: avoid}
CSS style |
-orientation <PORTRAIT|LANDSCAPE> |
LANDSCAPE rotates the target page format (A4 by default) by 90°.
Examples:
-orientation PORTRAIT
-orientation LANDSCAPE |
-out <output_file_path> | Defines
target file path/name. Pd4Cmd must have permissions to write the file.
Examples:
-out c:\tmp\out.pdf
-out /tmp/out.pdf |
-outformat <pdf|pdfa|rtf|rtfwmf> |
(PD4ML Pro only) Specifies output file format. pdfa duplicates -pdfa parameter.
rtf forces PD4ML to output RTF instead of PDF. rtfwmf outputs RTF and converts images to WMF file format for
a better viewer compatibility. |
-pagerange <page> | Limits the set of generated pages. Examples:
"2+" - skip the first page,
"1-2" - output only the first and the second pages, "even" or
"odd" - output only even or only odd pages. The rules may
be combined: "3-7,odd"
Example:
-pagerange '2-3,7+'
(on Windows platform use double quotes) |
-param <name> <value> | Sets a key/value pair to dynamically substitute placeholders in an HTML template (like
$[key]).
The key names "page", "total" and "title" are reserved for PDF headers and footers.
The parameter can also be used to pass PD4ML tweaking parameters. It may occur
multiple times in the Pd4Cmd command line. Examples:
-param date 'Feb 18, 2010'
-param pd4ml.basic.authentication usr:pwd
(on Windows platform use double quotes) |
-password <password> | Protects
the resulting document with a password. Example:
-password geheim |
-pdfa | (PD4ML
Volume DMS edition only) Forces PD4ML to output PDF compliant with the PDF/A specification.
The PDF/A specification requires all used fonts to be embedded in the resulting document,
so the option cannot guarantee that the resulting document is PDF/A if, for example,
TTF embedding (-ttf) is disabled or not configured. Place
pd4ml_rc.jar in the same directory as pd4ml.jar; this helps to
avoid most font embedding problems. |
-pdfforms | Forces PD4ML to convert
HTML forms into interactive PDF forms |
-permissions <NUMBER> | Defines
document access permissions. NUMBER is a sum of permission values:
- AllowAnnotate - (bit 6, value = 32)
- AllowAssembly - (bit 11, value = 1024)
- AllowContentExtraction - (bit 10, value = 512)
- AllowCopy - (bit 5, value = 16)
- AllowDegradedPrint - (bit 3, value = 4)
- AllowFillingForms - (bit 9, value = 256)
- AllowModify - (bit 4, value = 8)
- AllowPrint - (bit 12 + bit 3, value = 2052)
Examples:
-permissions 2068 - allows copying and printing
the resulting document |
-protectpud | Makes PD4ML output PDF objects respecting dimensions/font sizes given in "in", "pt", "cm" etc.
By default the physical sizes are converted to pixel equivalents (using 72dpi)
and scaled up or down with the entire document layout.
Use the feature carefully: when it is switched on, there is no single HTML-to-PDF scale
factor for all HTML objects, and the resulting PDF layout may appear visually corrupted. |
-smarttablesplit | Inserts page breaks between table rows
to make the table portions fit the PDF page height.
If the table has a header (the first rows with <th> cells only), the header
is replicated in each table section.
A similar behavior (excluding the header replication) may be achieved with the
TR, TABLE {page-break-inside: avoid} CSS style |
-title <title override> | Defines
(or overrides) the document title Example:
-title 'New title'
(on Windows platform use double quotes) |
-ttf <ttf_fonts_dir> |
(PD4ML Pro only) Specifies TTF fonts
directory. See reference
Examples:
-ttf c:\windows\fonts
-ttf fonts/ (relative to the current dir) |
-tools |
Switches Pd4Cmd to tools mode. In this mode it expects a PDF rather than HTML
as input, and some HTML conversion-specific features have
no effect. Examples:
-tools file:/docs/test.pdf
-tools file:c:\docs\test.pdf
-tools file:c:/docs/test.pdf
-tools c:\docs\test.pdf
-tools http://pdfcloud.com/test.pdf
|
-readpassword <password> |
Specifies the password of the input PDF document in case the document is
password protected (Tools mode). Example:
-readpassword segretto |
-mergepassword <password> |
Specifies the password of the merged PDF document in case the
document is password protected (Tools mode). Example:
-mergepassword segretto |
-printpermissions |
Reads the PDF document permissions and prints their numeric value in
hex form to STDOUT |
-printpagenum |
Reads and prints the number of PDF pages to STDOUT |
-printauthor |
Reads and prints PDF document author |
-printtitle |
Reads and prints PDF document title |
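The pageFormatName|WxH and -orientation parameters from the table above can be combined into effective page dimensions. The sketch below shows one way to do that lookup in a calling script. It assumes that LANDSCAPE simply swaps width and height and that format names are matched exactly as listed. The dictionary and the function name are illustrative, not part of PD4ML.

```python
# Predefined page formats in points, copied from the parameter table.
PAGE_FORMATS = {
    "A0": (2384, 3370), "A1": (1684, 2384), "A2": (1190, 1684),
    "A3": (842, 1190), "A4": (595, 842), "A5": (421, 595),
    "A6": (297, 421), "A7": (210, 297), "A8": (148, 210),
    "A9": (105, 148), "A10": (74, 105), "HALFLETTER": (396, 612),
    "ISOB0": (2836, 4008), "ISOB1": (2004, 2836), "ISOB2": (1418, 2004),
    "ISOB3": (1002, 1418), "ISOB4": (709, 1002), "ISOB5": (501, 709),
    "LEDGER": (1224, 792), "LEGAL": (612, 1008), "LETTER": (612, 792),
    "NOTE": (540, 720), "TABLOID": (792, 1224),
}


def resolve_page_format(name_or_wxh="A4", orientation="PORTRAIT"):
    """Return (width, height) in points for a format name or a 'WxH' string."""
    if name_or_wxh in PAGE_FORMATS:
        width, height = PAGE_FORMATS[name_or_wxh]
    else:
        w, sep, h = name_or_wxh.partition("x")
        if not sep:
            raise ValueError(f"unknown page format: {name_or_wxh!r}")
        width, height = int(w), int(h)
    if orientation == "LANDSCAPE":
        # Assumption: a 90 degree rotation is equivalent to swapping the sides.
        width, height = height, width
    return width, height


print(resolve_page_format("A3"))
print(resolve_page_format("400x400"))
print(resolve_page_format("LETTER", "LANDSCAPE"))
```

The resulting width is the value that htmlWidth is mapped to, so it can be used to estimate the scale of a conversion before running it.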