[Debian-handbook-translators] XML markup problems
hertzog at debian.org
Mon Nov 12 21:54:59 UTC 2012
On Fri, 09 Nov 2012, Raphael Hertzog wrote:
> Unfortunately several translations are not buildable due to markup errors
> (invalid XML syntax). The translations which have errors are the
> * de-DE
> * it-IT
> * pt-BR
I fixed the markup errors so that the document gets built. They are
Unfortunately, the number of mistakes in it-IT and pt-BR was very high. I
believe that those translation teams should make sure to explain
to contributors how XML is supposed to be translated.
Here are some important points where I saw lots of errors:
* you should not add spaces where there's no space in the original string,
in particular within the XML markup.
If the original is “<emphasis>foo</emphasis>”, the translation
“<emphasis>bar</emphasis>” is OK but
“<emphasis> bar </emphasis>” is NOT OK and
“<emphasis>bar</ emphasis>” is NOT OK
If original is “<xref linkend="foo" />”, you should not change
anything. In particular “<xref linkend="foo" / >” is NOT OK.
If original is “<filename>/etc/foo/config</filename>”, you should not
change anything. “<filename> / Etc / foo / config </filename>” is NOT
OK.
* you should not translate the XML tags (neither the opening tag, nor the
closing tag): it's "<command>" not "<comando>", it's "<primary>" not
"<principal>", it's "<filename>" not "<nomefile>".
* you should not translate attribute names and values that are not displayed to
the user (it's “<ulink type="block"…” not “<ulink tipo = "block"…”).
* you should not translate filenames, URLs, etc.
* you should not invert the order of tags in
“<primary>foo</primary><secondary>bar</secondary>”. Those are index
entries, where you look up the first word first and find the second word
as a sub-entry of it. You should instead reword to
accommodate the inverted order in languages where the order of words
would be the opposite (usually there are two entries with the two
orders when it's relevant).
* you should not insert all those "zero width space" characters (Unicode
U+200B). I don't know why translators included them, but I don't believe
that there are legitimate reasons for those characters here. Furthermore,
they are often doubled.
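
To make the points above concrete, here is a minimal sketch of how a
translated string could be compared against its original. It is only an
illustration: the function name check_translation and its messages are
made up for this example, and the check is deliberately strict (it
requires the exact same tags in the exact same order, so it may flag a
few legitimate translations too).

```python
import re

# Matches anything that looks like a tag, including malformed ones
# such as "</ emphasis>" or "<xref linkend="foo" / >".
TAG_RE = re.compile(r"<[^<>]*>")
ZERO_WIDTH_SPACE = "\u200b"


def check_translation(original, translation):
    """Return a list of problems found in the translation (hypothetical helper)."""
    problems = []

    orig_tags = TAG_RE.findall(original)
    trans_tags = TAG_RE.findall(translation)
    if orig_tags != trans_tags:
        # Catches translated tag names, translated attributes,
        # spaces inside tags and inverted tag order.
        problems.append("tags differ from the original")
    else:
        # Same tags: compare the text segments between them, pairwise.
        orig_text = TAG_RE.split(original)
        trans_text = TAG_RE.split(translation)
        for o, t in zip(orig_text, trans_text):
            if t[:1].isspace() and not o[:1].isspace():
                problems.append("leading whitespace added in a text segment")
            if t[-1:].isspace() and not o[-1:].isspace():
                problems.append("trailing whitespace added in a text segment")

    if ZERO_WIDTH_SPACE in translation and ZERO_WIDTH_SPACE not in original:
        problems.append("zero width space (U+200B) added")

    return problems
```

For example, check_translation("<emphasis>foo</emphasis>",
"<emphasis>bar</emphasis>") returns an empty list, while the variants
with "<emphasis> bar </emphasis>" or "</ emphasis>" each return at least
one problem. The sketch does not verify that filenames and URLs were
left untranslated.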
Note that I fixed only the mistakes that broke the build. Many other
mistakes (in particular extraneous spaces) do not generate build
errors but will render incorrectly in the book.
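
As a rough illustration of that difference, the sketch below uses
Python's standard XML parser to test whether a translated string is
well-formed. The helper name is_well_formed and the <root> wrapper
element are assumptions made for this example, and this is not
necessarily the same check that the book's build performs. Strings that
use entities defined only in the DocBook DTD would be rejected by it.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment):
    """Return True if the fragment parses as XML once wrapped in a root element."""
    try:
        # The wrapper gives the parser the single root element it requires.
        ET.fromstring(f"<root>{fragment}</root>")
    except ET.ParseError:
        return False
    return True


samples = [
    "<emphasis>bar</emphasis>",    # fine
    "<emphasis>bar</ emphasis>",   # space inside the closing tag: rejected
    '<xref linkend="foo" / >',     # space inside "/>": rejected
    "<emphasis> bar </emphasis>",  # accepted, yet renders with stray spaces
]
for sample in samples:
    print(is_well_formed(sample), sample)
```

The last sample shows why a clean parse is not enough: the extra spaces
are perfectly valid XML, so only a comparison with the original string
can reveal them.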
It would be nice if all the translators could be more careful from now on
so that all the translations keep being buildable and can be regularly
updated on the website (I'll do it once a month). I know it's a bit more
difficult with Weblate since you can't easily verify whether you have
introduced errors. That's why I requested a new feature to ensure
that translations are valid XML:
Raphaël Hertzog ◈ Debian Developer
Get the Debian Administrator's Handbook: