[Pkg-fonts-devel] Fwd: Draining the font swamp

Michael Vogt michael.vogt at ubuntu.com
Mon Jun 4 16:07:48 UTC 2007


On Tue, May 29, 2007 at 10:30:10AM +0800, "Arne Götje (高盛華)" wrote:
> Matt Zimmerman wrote:
> >> 2. A list of default fonts should be made for certain languages (this is
> >> only interesting for screen display):
> >>  * For Latin script DejaVu Sans and the SIL fonts for sans and serif
> >> respectively should be on the top of the list. Both are smart fonts
> >> which can compose almost any diacritical combination, as for Vietnamese,
> >> European languages and African languages based on Latin script. Also
> >> both fonts include the full list of IPA characters...
> > 
> > Which one of the SIL fonts do you mean?  Is it packaged and available in
> > Debian and Ubuntu?
> ttf-sil-doulos
> ttf-sil-charis
> both are smart serif fonts. Doulos is the more popular one.

Both are available in Ubuntu universe, but we currently do nothing to
give them a high priority in fontconfig.

> >>   b) there is currently no font available which covers all CJK glyphs in
> >> Unicode
> >>   c) we don't have any acceptable sans-serif font for CJK
> >>   d) currently we can only use the ming/mincho style for screen display,
> >> the Kochi Mincho and AR PL ShanHeiSun Uni fonts contain embedded bitmaps
> >> already.
> >>   e) CJK glyphs in China, Hong Kong, Taiwan, Japan and Korea have
> >> different shapes which share the same Unicode codepoints. The available
> >> fonts (ttf-arphic-uming, ttf-kochi-mincho and ttf-unfonts) overlap each
> >> other in the CJK range, which confuses fontconfig! Chinese users usually
> >> prefer the ttf-arphic-uming package, while Japanese users might prefer
> >> the ttf-kochi-mincho or ttf-sazanami-mincho fonts and Korean users stick
> >> with the ttf-unfonts package. What makes it worse is, that fontconfig
> >> comes with a predefined list of fonts which should be preferred. This
> >> does not suit all CJK users, as they have different preferences.
> > 
> > This seems like a real mess.  In Ubuntu, we try to work around this by
> > changing font preferences depending on which supported languages are
> > selected by the user, but it is not ideal.  What if the system needs to
> > support more than one of these?
> That's exactly the problem. It is a real mess.
> See below for the explanation of what we need to change in fontconfig
> (or better, upstream should change it!)

We added a script called fontconfig-voodoo as part of
language-selector that tries to work around problem (e). If you use
language-selector and change your default locale, a new configuration
called language-selector.conf is added to /etc/fonts/. It points
to a configuration in /usr/share/language-selector/fontconfig/ that
contains locale-specific preferences. It is not an ideal solution, but
given the current limitations it looked like the best option. The same
logic is applied after installation, so a fresh install of, e.g., zh_TW
gets a configuration that points to
/usr/share/language-selector/fontconfig/zh_TW. We got the
configurations from the different communities.
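
To make the mechanism concrete, here is a rough Python sketch of the idea,
not the actual fontconfig-voodoo code. The two paths are the ones mentioned
above; the function names, the fallback from "zh_TW" to "zh", and the use of
a symlink for "points to" are assumptions made for illustration.

```python
import os

# Paths named in the message above.
CONFIG_DIR = "/usr/share/language-selector/fontconfig"
LINK_PATH = "/etc/fonts/language-selector.conf"


def pick_config(locale, available):
    """Return the config name to use for `locale`, or None if none matches.

    Tries the full locale first (e.g. "zh_TW"), then the bare language
    (e.g. "zh"). This fallback order is an assumption of the sketch.
    """
    name = locale.split(".")[0].split("@")[0]  # drop ".UTF-8" / "@modifier"
    for candidate in (name, name.split("_")[0]):
        if candidate in available:
            return candidate
    return None


def apply_config(locale, config_dir=CONFIG_DIR, link_path=LINK_PATH):
    """Point `link_path` at the config for `locale`; return the target or None."""
    choice = pick_config(locale, set(os.listdir(config_dir)))
    if choice is None:
        return None
    target = os.path.join(config_dir, choice)
    if os.path.lexists(link_path):
        os.remove(link_path)  # replace the configuration of the previous locale
    os.symlink(target, link_path)
    return target
```

Note that only one locale can win with this approach, which is exactly the
limitation for systems that need more than one CJK language.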

It seems to me that the only way to solve this problem properly is to
add support for it directly to fontconfig: either a match rule
that checks the global locale currently in use or (better) a way for
the application to ask for a font for a given locale, to support
the case where zh_CN and ja_JP are mixed in a single
document. AFAIK there is no such match rule for the configuration
file currently.
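
As an illustration of the second option, the lookup an application would
want might behave roughly like the Python sketch below. This is not a
fontconfig API; `font_for` and the preference table are hypothetical, and
the font names are simply taken from the suggestions in this thread.

```python
# Hypothetical per-language serif preferences, using font names from this thread.
PREFERRED_SERIF = {
    "zh": ["AR PL ShanHeiSun Uni", "Kochi Mincho", "Sazanami Mincho"],
    "ja": ["Kochi Mincho", "Sazanami Mincho", "AR PL ShanHeiSun Uni"],
}


def font_for(locale, installed, table=PREFERRED_SERIF):
    """Return the first installed family preferred for the locale's language.

    `locale` is something like "zh_CN" or "ja_JP"; `installed` is the set of
    family names present on the system. Returns None if nothing matches.
    """
    lang = locale.split("_")[0]
    for family in table.get(lang, []):
        if family in installed:
            return family
    return None


# A mixed document: each run of text carries its own locale tag, so the same
# Unicode code point can be rendered with a different font per run.
installed = {"AR PL ShanHeiSun Uni", "Kochi Mincho"}
runs = [("zh_CN", "\u76f4"), ("ja_JP", "\u76f4")]
chosen = [(text, font_for(locale, installed)) for locale, text in runs]
```

The point is that the choice depends on the locale of each piece of text,
not on one system-wide setting.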

> >> For all these CJK issues I'm working on a solution. But it takes time
> >> until it's ready.
> > 
> > Where can we learn more about your work?
> I didn't put any web page up for that. But I can tell you now. :)
> I'm the font maintainer of the ttf-arphic-{ukai|uming} packages, my
> project is CJK-Unifonts and it aims to provide a free set of fonts
> covering all CJK glyphs currently in Unicode for all CJK regions (China,
> Hong Kong/Macao, Taiwan, Japan, Korea).
> Currently it's still a mess, but I'm getting somewhere... slowly...
> For now, the fonts support Big5, GB2312 and HKSCS (Hong Kong
> supplemental charset), but the glyph shapes are those which came with
> the font and do not follow any standard... this is a problem many users
> have with the fonts.
> Currently the fonts come in two styles, Unicode and MBE (MBE is only of
> interest for Taiwanese users and even then optional).
> For the next release I plan to distribute the fonts as ttc (truetype
> collection), which allows me to shrink the font size dramatically.
> For now, each font is about 20MB in size, but the difference between the
> Unicode and MBE styles is only 12 glyphs. So, it's actually a waste of
> space to have two full size fonts around. A single TTC file would save
> about 50% of space in this case.
> For the future I plan to include glyphs for the different CJK regions
> (if they differ). Compared to providing separate fonts for each region
> (at the current size of 20MB, that would be at least 6 times as much:
> China, Hong Kong/Macao, Taiwan, Japan, Korea, Taiwan MBE), a
> single TTC would be only around 22 MB total or so...
> The user however still sees 6 different fonts in his system and would
> have to choose which style he wants to use as default for CJK glyphs.
> That one would have to be configured in fontconfig.
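
To put rough numbers on the sizes quoted above, here is the arithmetic as a
small Python sketch. The 20 MB, 22 MB and "6 fonts" figures are the
estimates from the quoted message; that a two-style TTC is about the size of
one font is an assumption implied by the "about 50%" figure.

```python
FONT_MB = 20   # approximate size of one full font today
REGIONS = 6    # China, Hong Kong/Macao, Taiwan, Japan, Korea, Taiwan MBE
TTC_MB = 22    # estimated size of one TTC holding all six styles


def saving(separate_mb, combined_mb):
    """Fraction of disk space saved by shipping `combined_mb` instead."""
    return 1 - combined_mb / separate_mb


# Today: Unicode + MBE as two full fonts versus one TTC of roughly one
# font's size (the styles differ in only 12 glyphs).
two_styles_mb = 2 * FONT_MB
two_style_saving = saving(two_styles_mb, FONT_MB)

# Planned: six regional styles as separate fonts versus one TTC.
six_fonts_mb = REGIONS * FONT_MB
six_style_saving = saving(six_fonts_mb, TTC_MB)

print(f"two styles: {two_styles_mb} MB separate, saving {two_style_saving:.0%}")
print(f"six styles: {six_fonts_mb} MB separate, saving {six_style_saving:.0%}")
```

So six separate fonts would come to about 120 MB, against roughly 22 MB for
a single TTC.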

That sounds very good! As I understand it, we would still need something
like fontconfig-voodoo, or better support inside fontconfig?

> These defaults do more harm than good.
> For Latin script a good default would be:
>  * serif: Doulos SIL, Charis SIL, DejaVu Serif, Bitstream Vera Serif
>  * sans: DejaVu Sans, Bitstream Vera Sans
>  * monospace: DejaVu Sans Mono, Bitstream Vera Sans Mono
> For CJK (KR and ZH locales):
>  Latin plus the following:
>  * serif: AR PL ShanHeiSun Uni MBE, AR PL ShanHeiSun Uni, Kochi Mincho,
> Sazanami Mincho, UnBatang
>  * sans: AR PL ShanHeiSun Uni MBE, AR PL ShanHeiSun Uni, Kochi Mincho,
> Sazanami Mincho, UnDotum
>  * monospace: AR PL ShanHeiSun Uni MBE, AR PL ShanHeiSun Uni, Kochi
> Mincho, Sazanami Mincho, UnDotum
> For CJK (JP locale):
>  Latin plus the following:
>  * serif: Kochi Mincho, Sazanami Mincho, AR PL ShanHeiSun Uni MBE, AR PL
> ShanHeiSun Uni, UnBatang
>  * sans: Kochi Gothic, Sazanami Gothic, AR PL ShanHeiSun Uni MBE, AR PL
> ShanHeiSun Uni, UnDotum
>  * monospace: Kochi Mincho, Sazanami Mincho, AR PL ShanHeiSun Uni MBE,
> AR PL ShanHeiSun Uni, UnDotum
> The following entries should be removed altogether, as the fonts are
> either non-free (and not available in Debian), or outdated and not
> preferred:
>  MS 明朝, Baekmuk *, AR PL KaitiM *, MS ゴシック, SimSun, NSimSun,
> AR PL SungtiL GB, AR PL Mingti2L Big5

If there is general agreement about this, I will be happy to update
the language-selector fontconfig rules with the above suggestions.
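
For discussion, here is a rough Python sketch of how the suggested lists
could be turned into fontconfig alias/prefer rules. It is not the existing
language-selector code; the helper names are made up, and I am assuming the
list headed "sans" maps to fontconfig's generic family name "sans-serif".
Only the Latin and JP lists are shown.

```python
import xml.etree.ElementTree as ET

# Latin defaults as suggested above.
LATIN = {
    "serif": ["Doulos SIL", "Charis SIL", "DejaVu Serif", "Bitstream Vera Serif"],
    "sans-serif": ["DejaVu Sans", "Bitstream Vera Sans"],
    "monospace": ["DejaVu Sans Mono", "Bitstream Vera Sans Mono"],
}

# Additions for the JP locale as suggested above ("Latin plus the following").
JA_EXTRA = {
    "serif": ["Kochi Mincho", "Sazanami Mincho", "AR PL ShanHeiSun Uni MBE",
              "AR PL ShanHeiSun Uni", "UnBatang"],
    "sans-serif": ["Kochi Gothic", "Sazanami Gothic", "AR PL ShanHeiSun Uni MBE",
                   "AR PL ShanHeiSun Uni", "UnDotum"],
    "monospace": ["Kochi Mincho", "Sazanami Mincho", "AR PL ShanHeiSun Uni MBE",
                  "AR PL ShanHeiSun Uni", "UnDotum"],
}

HEADER = '<?xml version="1.0"?>\n<!DOCTYPE fontconfig SYSTEM "fonts.dtd">\n'


def merge(base, extra):
    """Append the locale-specific families after the Latin ones."""
    return {generic: base[generic] + extra.get(generic, []) for generic in base}


def build_fontconfig(preferences):
    """Build a <fontconfig> element with one <alias>/<prefer> per generic family."""
    root = ET.Element("fontconfig")
    for generic, families in preferences.items():
        alias = ET.SubElement(root, "alias")
        ET.SubElement(alias, "family").text = generic
        prefer = ET.SubElement(alias, "prefer")
        for name in families:
            ET.SubElement(prefer, "family").text = name
    return root


def render(preferences):
    """Return the configuration as text (not pretty-printed)."""
    body = ET.tostring(build_fontconfig(preferences), encoding="unicode")
    return HEADER + body + "\n"


ja_conf = render(merge(LATIN, JA_EXTRA))
```

The entries to be removed would simply not appear in any of the generated
prefer lists.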

