HTML to PDF converter for Java and .NET


See also: PD4ML v4 - How to configure PDF fonts

PDF and TrueType fonts

As a rule, PDF viewers include a set of built-in Type1 fonts such as Times, Helvetica and Courier. PDFs that use only these fonts are guaranteed to display correctly on any platform. Although the fonts include most UNICODE glyphs, PD4ML addresses only their Latin subset (ISO-8859-1); there are technical reasons behind this restriction.

The built-in Type1 fonts are the only fonts you can use with PD4ML Std.

PD4ML Pro and derived licenses allow you to use the entire UNICODE range of custom TTF/OTF fonts in PDFs (provided the fonts are not defective and are compatible with the Java runtime environment).

To do that, you need to configure and use PD4ML's TTF embedding feature (http://old.pd4ml.com/reference.htm#7.1).

The way PD4ML implements TTF embedding may look overcomplicated at first glance. In practice it is not, and there are reasons why TTF usage is not as transparent as in regular Java applications.

In Java you can easily instantiate a Font object for any font face name and use it for text output. For PDF generation, however, PD4ML needs access not only to java.awt.Font objects but also to the corresponding physical .ttf files, which it parses to extract the subset of glyphs actually used. Unfortunately, Java does not offer a way to locate the TTF file behind a particular java.awt.Font object.

For that reason we introduced the font face -> font file mapping approach (with pd4fonts.properties).

The required steps are straightforward:

  1. Create a fonts directory (e.g. /path/to/my/fonts/) and copy the needed TTFs into it.
  2. Run the pd4fonts.properties generation command:
    java -jar pd4ml.jar -configure.fonts /path/to/my/fonts/
    (as a result it should produce /path/to/my/fonts/pd4fonts.properties)
  3. Reference the /path/to/my/fonts/ directory from your Java/JSP/... code.
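
To make the mapping idea concrete, here is a minimal sketch of a face name -> font file lookup over text in the pd4fonts.properties format. It is not PD4ML code: the class FontMappingSketch and its methods are hypothetical names used only for illustration, and it assumes the mapping file uses plain java.util.Properties syntax with file names relative to the fonts directory.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;
import java.util.Properties;

/** Illustrative sketch only; not PD4ML source code. */
final class FontMappingSketch {

    /** Parses mapping text in java.util.Properties syntax, e.g. "Arial=arial.ttf". */
    static Properties parse(String mappingText) {
        Properties mapping = new Properties();
        try {
            mapping.load(new StringReader(mappingText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return mapping;
    }

    /** Returns the font file for a face name, or empty if the face is not mapped. */
    static Optional<Path> resolve(Properties mapping, Path fontsDir, String faceName) {
        String fileName = mapping.getProperty(faceName);
        return fileName == null
                ? Optional.empty()
                : Optional.of(fontsDir.resolve(fileName));
    }

    public static void main(String[] args) {
        Properties mapping = parse("Arial=arial.ttf\n");
        Path fontsDir = Paths.get("/path/to/my/fonts");
        System.out.println(resolve(mapping, fontsDir, "Arial"));          // mapped face
        System.out.println(resolve(mapping, fontsDir, "Comic Sans MS"));  // unmapped face
    }
}
```

A face that has no entry simply resolves to nothing; what PD4ML does in that situation is covered by the questions further down the page.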


If you want to avoid binding to a local directory, you can pack the fonts/ directory into a JAR, place the JAR on the classpath and access the fonts via the classloader.

http://old.pd4ml.com/examples.zip (~2MB) contains the chinese_ttf sample, which illustrates how to do that.

 

pd4fonts.properties supports a special entry, ttf.fonts.dir, which makes it possible to store pd4fonts.properties separately from the font files. This is useful, for example, when you have no permission to write to the system font directory:

ttf.fonts.dir=c:/windows/fonts

You do not need to add the entry manually. You can pass a second parameter to -configure.fonts with the desired pd4fonts.properties location:

java -jar pd4ml.jar -configure.fonts   c:/windows/fonts   c:/work/fontmappings/
or, more specifically:
java -jar pd4ml.jar -configure.fonts   c:/windows/fonts   c:/work/fontmappings/myfontmapping.properties

In that case PD4ML automatically inserts a ttf.fonts.dir entry that points to the given font directory.
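
The effect of ttf.fonts.dir can be sketched as a simple directory choice. The code below is an illustration, not PD4ML's implementation: FontDirSketch is a hypothetical name, and the rule that font files are looked up next to the mapping file when the entry is absent is an assumption based on the basic setup, where the file is generated into the font directory itself.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;

/** Illustrative sketch only; not PD4ML source code. */
final class FontDirSketch {

    /**
     * Picks the directory that font file names are resolved against:
     * the ttf.fonts.dir entry if present, otherwise (assumption) the
     * directory that contains the mapping file.
     */
    static Path fontsDir(Properties mapping, Path mappingFile) {
        String override = mapping.getProperty("ttf.fonts.dir");
        if (override != null && !override.trim().isEmpty()) {
            return Paths.get(override.trim());
        }
        return mappingFile.toAbsolutePath().getParent();
    }

    public static void main(String[] args) {
        Path mappingFile = Paths.get("/work/fontmappings/pd4fonts.properties");

        Properties withEntry = new Properties();
        withEntry.setProperty("ttf.fonts.dir", "c:/windows/fonts");
        System.out.println(fontsDir(withEntry, mappingFile));

        // No ttf.fonts.dir entry: fall back to the mapping file's directory.
        System.out.println(fontsDir(new Properties(), mappingFile));
    }
}
```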

Note: the second command-line example above produces a .properties file with an arbitrary name, myfontmapping.properties. That requires an explicit file name in the useTTF() call (or its equivalents):

pd4ml.useTTF( "/work/fontmappings/myfontmapping.properties", true );

 

Very often there is no need to support multiple font faces, but missing support for special characters (like δ, « ...) or charsets (like Cyrillic, Arabic) is critical.

For that case we created a "quick hack" solution with easy-to-use TTF embedding.

There is a JAR with 3 fonts for the serif, sansserif and monospaced types (the fonts do not contain CJK glyphs):

http://old.pd4ml.com/i/easyfonts/fonts.jar (~2MB)

Add the JAR to the application's classpath (or put it in WEB-INF/lib in webapp scenarios), address the fonts via the Java classloader and specify that the 3 fonts should be used as defaults:

pd4ml.useTTF( "java:fonts", true );
pd4ml.setDefaultTTFs("Times New Roman", "Arial", "Courier New");


(Full Java API example: http://old.pd4ml.com/i/easyfonts/EasyFonts.java)
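
If the fonts are not picked up, it can help to check that the font directory is really visible to the classloader. The following sketch does that with plain JDK calls. It assumes that the mapping file inside the JAR is named pd4fonts.properties and sits in the directory named after the java: prefix (fonts in the example above); that layout is an assumption, and ClasspathFontsCheck is a hypothetical helper, not part of PD4ML.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Optional;
import java.util.Properties;

/** Illustrative diagnostic only; not PD4ML source code. */
final class ClasspathFontsCheck {

    /**
     * Tries to load <classpathDir>/pd4fonts.properties through the classloader.
     * Returns empty if the resource is not on the classpath.
     */
    static Optional<Properties> loadMapping(String classpathDir) {
        String resource = classpathDir + "/pd4fonts.properties";
        ClassLoader loader = ClasspathFontsCheck.class.getClassLoader();
        try (InputStream in = loader.getResourceAsStream(resource)) {
            if (in == null) {
                return Optional.empty();
            }
            Properties mapping = new Properties();
            mapping.load(in);
            return Optional.of(mapping);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Optional<Properties> mapping = loadMapping("fonts");
        if (mapping.isPresent()) {
            System.out.println("Mapped font faces: " + mapping.get().stringPropertyNames());
        } else {
            System.out.println("fonts/pd4fonts.properties is not on the classpath");
        }
    }
}
```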

JSP equivalent:

<pd4ml:usettf from="java:fonts" serif="Times New Roman" sansserif="Arial" monospace="Courier New">


The same for the PHP wrapper (assuming that fonts.jar is in the same directory as pd4ml(_demo).jar):

passthru('java -Xmx512m -Djava.awt.headless=true -cp .:pd4ml_demo.jar:fonts.jar Pd4Php \'' . 
                   $_POST['url'] . '\' 800 A4 -ttf java:fonts 2>>stderr.txt');
 
// Win32 version
// passthru('java -Xmx512m -cp .;pd4ml_demo.jar;fonts.jar Pd4Cmd ' . 
//                 $_POST['url'] . ' 800 A4 -ttf java:fonts');
 

1. What happens if HTML references a font like Comic Sans MS, but the font name-to-file mapping is missing from pd4fonts.properties?

In that case PD4ML tries to determine which group the font belongs to: Serif, SansSerif or Monospace. (For Comic Sans MS it is SansSerif.)

After that it tries to load the default font of the group: Times New Roman, Arial or Courier New (or the corresponding fonts set with pd4ml.setDefaultTTFs()).

If those fonts are missing as well, PD4ML walks the so-called fallback tables to find any available font of the group to use as a substitution.

Here are the substitution tables:

private static String[] serifFallback = new String[] {
	"Times New Roman",
	"MS Mincho",
	"MingLiU",
	"SimSun",
	"Mangal",
	"David",
	"Batang",
	"Wingdings",
	"Symbol",
	"Lucida Sans Regular",
};
		
private static String[] sansFallback = new String[] {
	"Arial",
	"MS Gothic",
	"MingLiU",
	"SimSun",
	"Mangal",
	"David",
	"Gulim",
	"Wingdings",
	"Symbol",
	"Lucida Sans Regular",
};
		
private static String[] monoFallback = new String[] {
	"Courier New",
	"MS Gothic",
	"MingLiU",
	"SimSun",
	"Mangal",
	"David",
	"GulimChe",
	"Wingdings",
	"Symbol",
	"Lucida Sans Regular",
};

If nothing is found, it uses the corresponding built-in Type1 font.
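
The lookup order described in this answer can be sketched as follows. This is an illustration of the described order only, not PD4ML's actual code: FontFallbackSketch and choose() are hypothetical names, the set of mapped faces stands in for the entries of pd4fonts.properties, and pairing the sans-serif group with the built-in Helvetica font is an assumption.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Illustrative sketch only; not PD4ML source code. */
final class FontFallbackSketch {

    // Copied from the sansFallback table above.
    static final List<String> SANS_FALLBACK = Collections.unmodifiableList(Arrays.asList(
            "Arial", "MS Gothic", "MingLiU", "SimSun", "Mangal",
            "David", "Gulim", "Wingdings", "Symbol", "Lucida Sans Regular"));

    /**
     * Lookup order: requested face, then the group default,
     * then the group's fallback table, then the built-in Type1 font.
     */
    static String choose(String requested, String groupDefault, List<String> fallbackTable,
                         String builtInType1, Set<String> mappedFaces) {
        if (mappedFaces.contains(requested)) {
            return requested;
        }
        if (mappedFaces.contains(groupDefault)) {
            return groupDefault;
        }
        for (String face : fallbackTable) {
            if (mappedFaces.contains(face)) {
                return face;
            }
        }
        return builtInType1;
    }

    public static void main(String[] args) {
        // Only two CJK/Indic fonts are mapped; Comic Sans MS and Arial are not.
        Set<String> mapped = new HashSet<>(Arrays.asList("SimSun", "Mangal"));
        System.out.println(choose("Comic Sans MS", "Arial", SANS_FALLBACK, "Helvetica", mapped));
    }
}
```

With only SimSun and Mangal mapped, the example picks SimSun, because it is the first entry of the table that is available.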

2. What happens when Chinese text is styled to be shown with Arial, but the Arial font does not define CJK glyphs?

PD4ML tests whether the particular content can be displayed with the chosen font. If not, it splits the text into smaller portions by UNICODE group. It then checks each portion again to see whether it can be displayed with Arial; if not, it walks the sansFallback table to find a font that has the needed glyphs and applies that font.
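
The splitting step can be pictured with the sketch below. It treats a "UNICODE group" as a java.lang.Character.UnicodeBlock, which is an assumption; PD4ML may group characters differently. UnicodeRunSketch is a hypothetical name used only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; not PD4ML source code. */
final class UnicodeRunSketch {

    /** Splits text into consecutive runs whose characters share one Unicode block. */
    static List<String> splitByBlock(String text) {
        List<String> runs = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        Character.UnicodeBlock currentBlock = null;

        for (int i = 0; i < text.length(); ) {
            int codePoint = text.codePointAt(i);
            Character.UnicodeBlock block = Character.UnicodeBlock.of(codePoint);
            // A change of block closes the current run and starts a new one.
            if (current.length() > 0 && block != currentBlock) {
                runs.add(current.toString());
                current.setLength(0);
            }
            current.appendCodePoint(codePoint);
            currentBlock = block;
            i += Character.charCount(codePoint);
        }
        if (current.length() > 0) {
            runs.add(current.toString());
        }
        return runs;
    }

    public static void main(String[] args) {
        // "PDF " is Basic Latin, the two ideographs are CJK Unified Ideographs.
        for (String run : splitByBlock("PDF \u4e2d\u6587")) {
            System.out.println("[" + run + "]");
        }
    }
}
```

Each run would then be checked against the styled font and, if that font lacks the glyphs, against the fonts of the fallback table.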

3. Can I override a fallback table with font faces of my choice?

No. But there is a workaround. The names in the fallback tables refer to font face names in pd4fonts.properties. So you may override the mapping in the .properties file.

For example, Arial is usually mapped to arial.ttf:

Arial=arial.ttf

If you change the entry in the .properties file to

Arial=comic.ttf

PD4ML will always use another font (Comic Sans MS) instead of Arial.

4. How do font styles (bold/italic) correlate with the TTF embedding logic?

As a rule, a TTF font file implements only one font style: for example, arial.ttf is regular Arial and arialbd.ttf is Arial Bold. That means there is a separate mapping in pd4fonts.properties for each existing font style.

Arial=arial.ttf
Arial\ Italic=ariali.ttf
Arial\ Bold=arialbd.ttf
Arial\ Bold\ Italic=arialbi.ttf
(spaces in font face names or style separating spaces are escaped with a backslash)

The above notation means that for <font face="Arial"><b>Text</b></font> PD4ML looks up the "Arial Bold" entry in pd4fonts.properties,
and for <font face="Arial"><i><b>Text</b></i></font> it looks up "Arial Bold Italic".
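
A minimal sketch of how such a lookup key could be built from a face name and its style flags is shown below. StyleKeySketch is a hypothetical name and the code only mirrors the naming convention of the entries above; it is not PD4ML's implementation. It also shows that java.util.Properties turns the escaped key Arial\ Bold into the plain key "Arial Bold".

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Illustrative sketch only; not PD4ML source code. */
final class StyleKeySketch {

    /** Builds the mapping key: face name, then " Bold", then " Italic". */
    static String key(String face, boolean bold, boolean italic) {
        StringBuilder key = new StringBuilder(face);
        if (bold) {
            key.append(" Bold");
        }
        if (italic) {
            key.append(" Italic");
        }
        return key.toString();
    }

    /** Parses mapping text in java.util.Properties syntax. */
    static Properties parse(String mappingText) {
        Properties mapping = new Properties();
        try {
            mapping.load(new StringReader(mappingText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return mapping;
    }

    public static void main(String[] args) {
        // Same entries as above; "\\ " in Java source is "\ " in the file.
        Properties mapping = parse(
                "Arial=arial.ttf\n"
              + "Arial\\ Italic=ariali.ttf\n"
              + "Arial\\ Bold=arialbd.ttf\n"
              + "Arial\\ Bold\\ Italic=arialbi.ttf\n");

        System.out.println(mapping.getProperty(key("Arial", true, false)));
        System.out.println(mapping.getProperty(key("Arial", true, true)));
    }
}
```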

(Note: <font face="Arial Bold">Text</font> and <font face="Arial Bold Italic">Text</font> would have the same effect, but this is not a portable solution and is not recommended in general.)

There are some special cases.

For example, Copperplate Gothic Bold has only one known style: bold. That means <font face="Copperplate Gothic"><b>Text</b></font> correctly refers to the font, but <font face="Copperplate Gothic">Text</font> will fail, and a substitution font will be used.

Some fonts have only the regular (not Bold, not Italic) style; this is true for most CJK fonts. If there is no matching substitution, PD4ML emulates the missing styles: Italic is emulated by slanting the text area, Bold by stroking the glyph outlines.

 

 

Copyright ©2004-24 zefer|org. All rights reserved.