PD4ML: HTML-to-RTF conversion
One of PD4ML's popular features is the generation of RTF documents from
HTML templates.
To switch to RTF output, use one of the following API calls:
pd4ml.outputFormat(PD4Constants.RTF);
// or optionally...
pd4ml.outputFormat(PD4Constants.RTF_WMF);
The equivalents in the JSP taglib:
<pd4ml:transform ... outputFormat="rtf"> ... </pd4ml:transform>
<pd4ml:transform ... outputFormat="rtfwmf"> ... </pd4ml:transform>
(in this case the transform tag automatically sets the corresponding Content-Type
HTTP header, "application/rtf")
In the command line tool:
java -Xmx512m -Djava.awt.headless=true -cp ./pd4ml.jar Pd4Cmd <URL> 1200 -out doc.rtf -outformat rtf
java -Xmx512m -Djava.awt.headless=true -cp ./pd4ml.jar Pd4Cmd <URL> 1200 -out doc.rtf -outformat rtfwmf
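When the command line tool is launched from another Java program, the argument list above can be assembled programmatically. The following is only a sketch: the class `Pd4CmdLine` and its `build` method are hypothetical helpers, not part of PD4ML, and the numeric argument (1200 above) is assumed to be the HTML width. The flags themselves are copied from the commands shown above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of PD4ML): assembles the Pd4Cmd command line shown above.
class Pd4CmdLine {

    // wmfImages selects "-outformat rtfwmf" instead of "-outformat rtf".
    // htmlWidth is assumed to be the meaning of the numeric argument (1200 in the text).
    static List<String> build(String jarPath, String url, int htmlWidth,
                              String outFile, boolean wmfImages) {
        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        cmd.add("-Xmx512m");
        cmd.add("-Djava.awt.headless=true");
        cmd.add("-cp");
        cmd.add(jarPath);
        cmd.add("Pd4Cmd");
        cmd.add(url);
        cmd.add(Integer.toString(htmlWidth));
        cmd.add("-out");
        cmd.add(outFile);
        cmd.add("-outformat");
        cmd.add(wmfImages ? "rtfwmf" : "rtf");
        return cmd;
    }

    public static void main(String[] args) {
        List<String> cmd = build("./pd4ml.jar",
                "http://old.pd4ml.com/i/rtf/invoice.htm", 1200, "doc.rtf", true);
        // Print the command instead of running it, so the sketch works without pd4ml.jar.
        System.out.println(String.join(" ", cmd));
        // To run it for real: new ProcessBuilder(cmd).inheritIO().start().waitFor();
    }
}
```

Passing the list to `ProcessBuilder` rather than concatenating a single string avoids quoting problems when the URL or the output path contains spaces.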
Java PD4ML API example:
package samples;

import java.awt.Insets;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.net.MalformedURLException;
import java.net.URL;
import java.security.InvalidParameterException;

import org.zefer.pd4ml.PD4Constants;
import org.zefer.pd4ml.PD4ML;

public class GettingStarted2 {
    // page margins, in millimeters
    protected int topValue = 10;
    protected int leftValue = 20;
    protected int rightValue = 10;
    protected int bottomValue = 10;
    // frame width of the "virtual web browser"
    protected int userSpaceWidth = 1300;

    public static void main(String[] args) {
        try {
            GettingStarted2 jt = new GettingStarted2();
            jt.doConversion("http://old.pd4ml.com/i/rtf/invoice.htm", "c:/invoice.rtf");
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    public void doConversion(String url, String outputPath)
            throws InvalidParameterException, MalformedURLException, IOException {
        File output = new File(outputPath);
        // try-with-resources closes the stream even if rendering fails
        try (FileOutputStream fos = new FileOutputStream(output)) {
            PD4ML pd4ml = new PD4ML();
            pd4ml.setHtmlWidth(userSpaceWidth); // set frame width of "virtual web browser"
            // choose target paper format and "rotate" it to landscape orientation
            pd4ml.setPageSize(pd4ml.changePageOrientation(PD4Constants.A4));
            // define page margins
            pd4ml.setPageInsetsMM(new Insets(topValue, leftValue, bottomValue, rightValue));
            // force RTF generation instead of PDF
            pd4ml.outputFormat(PD4Constants.RTF_WMF);
            pd4ml.render(new URL(url), fos); // actual document conversion from URL to RTF file
        }
        System.out.println(outputPath + "\ndone.");
    }
}
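After a conversion like the one above, it can be useful to confirm that the output really is RTF and not PDF. RTF documents begin with the characters `{\rtf`, so a quick signature check is enough for a sanity test. The sketch below uses only the Java standard library; the class `RtfSignatureCheck` and its methods are hypothetical helpers, not part of PD4ML.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper (not part of PD4ML): checks for the RTF file signature.
class RtfSignatureCheck {
    // Every RTF document starts with "{\rtf"
    private static final byte[] RTF_MAGIC = "{\\rtf".getBytes(StandardCharsets.US_ASCII);

    // Returns true if the given leading bytes start with the RTF signature.
    static boolean looksLikeRtf(byte[] head) {
        if (head.length < RTF_MAGIC.length) {
            return false;
        }
        for (int i = 0; i < RTF_MAGIC.length; i++) {
            if (head[i] != RTF_MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    // Reads just enough bytes from the file to test the signature.
    static boolean fileLooksLikeRtf(String path) throws IOException {
        try (InputStream in = Files.newInputStream(Paths.get(path))) {
            byte[] head = new byte[RTF_MAGIC.length];
            int n = 0;
            while (n < head.length) {
                int r = in.read(head, n, head.length - n);
                if (r < 0) {
                    break; // file is shorter than the signature
                }
                n += r;
            }
            return n == head.length && looksLikeRtf(head);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.out.println("usage: java RtfSignatureCheck <file>");
            return;
        }
        System.out.println(fileLooksLikeRtf(args[0]) ? "RTF signature found" : "not an RTF file");
    }
}
```

The check only inspects the first five bytes; it does not validate the rest of the document.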
The only difference between RTF and RTF_WMF is how images are embedded. In RTF
mode, PD4ML embeds images "as is" (PNG, JPEG, etc.). In RTF_WMF mode, it converts
all images to WMF format for compatibility with WordPad.exe. The drawback of
this image compatibility is a significantly larger output file.
PD4ML can convert the following elements of the rendered HTML layout to RTF:
- Page margins
- Text styles and fonts
- Text backgrounds
- Text indentation
- Tables (with correct table nesting). Column and row spans, table and
cell backgrounds, and cell padding are supported. Border style (width)
is not supported for the time being.
- Images
- Hyperlinks (external and internal), image hyperlinks
- Headers / footers. An individual header and footer can be defined
for the title page.
Although the RTF format is quite old and standardized, only a few viewers
implement all of its features. For example, on the macOS platform tables appear
corrupted (as a set of text paragraphs) and images are not shown at all. MS
Word is probably the most feature-rich RTF viewer/editor application.
RTF conversion samples:
|