[debiandoc-sgml-pkgs] [Fwd: Re: UTF-8 transition (debiandoc-sgml) and request for help with LaTeX (CJK)]

Osamu Aoki osamu at debian.org
Sun Aug 26 00:31:27 UTC 2007


On Sun, Aug 26, 2007 at 01:11:28AM +0200, "Danai SAE-HAN (韓達耐)" wrote:
> [Resending]
> 
> -------- Original Message --------
> Subject: Re: UTF-8 transition (debiandoc-sgml) and request for help with LaTeX
> (CJK)
> Date: Fri, 17 Aug 2007 02:39:29 +0200
> From: "Danai SAE-HAN (韓達耐)" <danai.sae-han at edpnet.be>
> To: Osamu Aoki <osamu at debian.org>
> References: <20070811070653.GA16465 at debian.org> <46BD8D7B.2030506 at edpnet.be>
> <20070811111450.GA24216 at debian.org>
> 
> [Sorry for this messy email; I'm a bit tired.]
> 
> I've checked DD-SGML version 1.2.4, and here's what I recommend changing in
> order to get ja_JP.UTF-8 working.
> 
> In /tools/lib/Locale/ja_JP.UTF-8/LaTeX, the following changes need to be made:
> 
> ## ----------------------------------------------------------------------
> %locale = (
>            'babel' => '',
>            'inputenc' => '',
>            'abstract' => '概要',
>            'copyright notice' => '著作権表示',
>            'before begin document' => '\\usepackage{CJKutf8}
> \\usepackage[CJK, overlap]{ruby}
> \\renewcommand{\rubysep}{-0.2ex}',

I assume this should be \\rubysep, since this is part of the Perl code.

I added these to my local git repository.

Do I need \\usepackage{indentfirst} here too?

> You could add [T1] in \usepackage[T1]{CJKutf8}; it is the same as
> \usepackage[T1]{fontenc}, so this line is actually two \usepackage commands in
> one.  But since debiandoc-sgml already provides a fontenc line, the [T1] isn't
> necessary.
> 
> The line after it allows ruby text (furigana).  Pretty cool, I'd say, but I'm
> not sure if that's available in SGML.  I think that for XML you need to
> load an extra Ruby module, but I'm not sure.  You could leave the two lines
> out if you don't intend to support Ruby tags in DD-SGML.
> 
> \rubysep (re)defines the space between the kanji and the furigana above.
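
For reference, a minimal standalone sketch of what the ruby lines enable,
written as plain LaTeX rather than as the Perl locale string.  It assumes the
ruby.sty shipped with the CJK package, whose \ruby{base}{reading} macro sets
the reading above the base text, and that the Wadalab "min" font is
installed.  The sample word is made up for illustration.

```latex
\documentclass{article}
\usepackage{CJKutf8}
\usepackage[CJK, overlap]{ruby}
% Gap between the base text and the ruby above it
% (value taken from the proposal above).
\renewcommand{\rubysep}{-0.2ex}

\begin{document}
\begin{CJK*}{UTF8}{min}
% \ruby{base}{reading}: furigana set over the kanji.
\ruby{漢字}{かんじ}を読む。
\end{CJK*}
\end{document}
```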

What about the font to use?
'after begin document' => '\\begin{CJK}{UTF8}{song}
\\renewcommand{\\vpageref}[1]{on page \\pageref{#1}}
...

Also, do I need all the \renewcommand lines for the title etc. in this
package?  Aren't they taken care of by CJK?


> 
> Then:
> 
>            'after begin document' => '\\begin{CJK*}{UTF8}{min}
> 
> {CJK*} instead of {CJK} will make sure that the Japanese text contains no
> spaces between the kanji.  Use * if the core of the text is a Chinese,
> Japanese or Korean text; it's much prettier this way.  To get spaces, e.g.
> when you have an English word in the middle of a sentence, you can use the
> tilde (~) to get a space.
> 
> Example: ...韓達耐~Han Danai~韓達耐...
> 
> To get a non-breakable space, the original use of ~ by TeX, use \nbs.
> 
> When you have a block of English text, you could switch it off with
> \CJKnospace.  When the next Japanese paragraph starts again, use \CJKspace to
> activate it again.
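
A minimal sketch of the spacing behaviour described above, as a standalone
file.  One assumption that is not in the message: as far as I recall from the
CJK documentation, the tilde only takes on its CJK meaning after \CJKtilde has
been issued, so the sketch calls it explicitly.  The sample sentence is made
up, and the "min" font requires the Wadalab fonts to be installed.

```latex
\documentclass{article}
\usepackage{CJKutf8}

\begin{document}
% Starred form: spaces after CJK characters are ignored.
\begin{CJK*}{UTF8}{min}
% Assumption: \CJKtilde switches ~ to the CJK/Latin spacing glue.
\CJKtilde
% ~ separates the Latin word from the surrounding Japanese text.
これは~Debian~の文書です。

% \nbs gives TeX's original non-breakable space.
Debian\nbs GNU/Linux
\end{CJK*}
\end{document}
```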
> 
> 
> When you use CJKutf8, you don't need [dnp] to get the Wadalab fonts.
> 
> 
> For PDF hyperreferences, use:
>   \usepackage[unicode]{hyperref}
> if you want to use "latex+dvipdfmx".
> 
> Or use this line if you want to use "pdflatex" instead:
>   \usepackage[pdftex,unicode]{hyperref}
> 
> I'm not sure if the [pdftex] option is necessary.  But I don't see why you
> have this \ifpdf clause, where you only use [unicode] for PDF output.
> 
> And here are a few interesting options you can set with hyperref:
> 
> % Just a few test strings I found on the net.
> \hypersetup{pdfauthor={李果正 Edward G.J. Lee},
>             pdftitle={中文 PDF outline 測試},
>             pdfsubject={Title},
>             a4paper=true,
>             colorlinks=true}
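
A minimal sketch of the pdflatex variant as a complete file, assuming CJKutf8
and the Wadalab "min" font are installed.  The title string is hypothetical,
and I have not checked this against every driver; for latex+dvipdfmx the
[pdftex] option would be dropped as described above.

```latex
\documentclass{article}
\usepackage{CJKutf8}
% For latex + dvipdfmx, use \usepackage[unicode]{hyperref} instead.
\usepackage[pdftex,unicode]{hyperref}
\hypersetup{pdftitle={PDF outline test},  % hypothetical title
            colorlinks=true}

\begin{document}
\begin{CJK*}{UTF8}{min}
% The section title should show up as a Unicode PDF bookmark.
\section{はじめに}
本文。
\end{CJK*}
\end{document}
```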
> 
> To get translations for things like Part, Chapter, Section and the TOC, add:
>            'after begin document' => '\\begin{CJK*}{UTF8}{min}
> \\CJKcaption{ja}
> 
> There's one catch though: they only work with the KOMA-Script classes.  It's a
> bundle that I would really recommend, because it makes life so much easier when
> creating LaTeX documents.
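
A minimal sketch of \CJKcaption with a KOMA-Script class, as a standalone
file.  The choice of scrartcl and the sample section are assumptions for
illustration; debiandoc-sgml's generated preamble would differ.  Run LaTeX
twice so the table of contents is filled in.

```latex
\documentclass{scrartcl}  % KOMA-Script article class (assumed choice)
\usepackage{CJKutf8}

\begin{document}
\begin{CJK*}{UTF8}{min}
% Loads ja.cpx: Japanese names for the contents, parts, chapters, etc.
\CJKcaption{ja}
\tableofcontents
\section{はじめに}
本文。
\end{CJK*}
\end{document}
```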
> 
> The CJKcaptions that exist (leave out the suffix .cpx):
> Bg5.cpx (zh_TW.Big5)
> GB.cpx (zh_CN.GB2312)
> JIS.cpx (ja_JP.EUCJP)
> ja.cpx (ja_JP.UTF-8)
> zh-Hans.cpx (zh_CN.UTF-8)
> zh-Hant.cpx (zh_TW.UTF-8)
> 
> Other files (not of any use in the current DD-SGML build):
> hangul2.cpx
> hangul.cpx
> hanja.cpx
> SJIS.cpx
> ko-Hang2.cpx (UTF-8)
> ko-Hang.cpx (UTF-8)
> ko-Hani.cpx (UTF-8)
> 
> Of course, the implementation of KOMA and the CJKcaptions is totally up to the
> DD-SGML developers.
> 
> 
> I'm afraid that ko_KR.UTF-8 will have to wait, because I haven't yet figured
> out how to get all the Korean fonts in Unicode using CJK.  Most build alright,
> but a few show some problems.
> 
> WRT zh_*.UTF-8, I think I can manage to get it working by this weekend.
> 
> 
> Cheerio
> 
> 
> 
> -- 
> Danai SAE-HAN (韓達耐)
> --
> 題目:《春居雜興》二首
> 作者:王禹稱偁(954-1001)
> 
>             其一
> 
> 兩株桃杏映籬斜,妝點商山副使家,
> 何事春風容不得,和鶯吹折數枝花。
> 
>             其二
> 
> 春云如獸復如禽,日照風吹淺又深。
> 誰道無心便容與,亦同翻覆小人心。
> 




