Bug#978315: xgettext complains about UTF conformance of strings not marked for translation
Bruno Haible
bruno at clisp.org
Sun Dec 27 18:40:41 GMT 2020
Hi Santiago, Samuel,
> The upload of gettext 0.21 for Debian unstable has made package "dasher",
> maintained by Samuel Thibault (in Cc), not to build anymore, as reported here
> by Lucas Nussbaum:
>
> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=978315
>
> We are not sure where is exactly the problem (either "dasher" or "gettext").
>
> In short: xgettext seems to parse and complain about UTF conformance
> of strings even if they are not marked for translation.
>
> Here is a minimal test case provided by Samuel:
>
> ----- Begin forwarded message -----
>
> € cat test.c
>
> #include <wchar.h>
>
> void f(const wchar_t *str) { }
>
> void g(void) {
> f(L"\xABCDFF");
> }
>
>
> € xgettext test.c
> xgettext: x-c.c:1666: phase5_get: Assertion `UNICODE_VALUE (c) >= 0 && UNICODE_VALUE (c) < 0x110000' failed.
>
> Samuel
>
> ----- End forwarded message -----

This behaviour was introduced in gettext 0.20, with the ability to grok
C11 and C++11 string literals.

In the next gettext release, functions like 'f' (which take a 'const wchar_t *'
argument) can be designated as gettext-like functions, for which the argument
needs to be extracted and put into the POT file. For this, it must be possible
to convert it to UTF-8.

The assertion could be converted to a reasonable error message, sure.
Having a reasonable error message (with line number) *and* emitting this error
message only when the string actually gets extracted would make xgettext more
complex.
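
To illustrate why the value in the test case cannot be converted: 0xABCDFF
lies above the Unicode range (which ends below 0x110000), so no UTF-8 encoding
exists for it. The following is only a minimal sketch, not xgettext code; the
names encode_utf8 and check_code_point are hypothetical, and the rejection of
surrogates is an extra assumption beyond the range check in the assertion. It
also shows roughly what reporting an error with file and line, instead of
aborting, might look like.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper (not gettext code): encode one code point as UTF-8.
   Returns the number of bytes written (1 to 4), or 0 when the value cannot
   be represented, i.e. it is >= 0x110000 or a surrogate (assumption). */
static size_t encode_utf8(uint32_t cp, unsigned char out[4])
{
    if (cp >= 0x110000 || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;
    if (cp < 0x80) {
        out[0] = (unsigned char) cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char) (0xC0 | (cp >> 6));
        out[1] = (unsigned char) (0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        out[0] = (unsigned char) (0xE0 | (cp >> 12));
        out[1] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char) (0x80 | (cp & 0x3F));
        return 3;
    }
    out[0] = (unsigned char) (0xF0 | (cp >> 18));
    out[1] = (unsigned char) (0x80 | ((cp >> 12) & 0x3F));
    out[2] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
    out[3] = (unsigned char) (0x80 | (cp & 0x3F));
    return 4;
}

/* Hypothetical: report an unconvertible value together with its location
   instead of aborting. Returns true when the code point is convertible. */
static bool check_code_point(const char *file, int line, uint32_t cp)
{
    unsigned char buf[4];

    if (encode_utf8(cp, buf) == 0) {
        fprintf(stderr, "%s:%d: invalid Unicode value 0x%lX in string literal\n",
                file, line, (unsigned long) cp);
        return false;
    }
    return true;
}
```

With this sketch, check_code_point ("test.c", line, 0xABCDFF) prints a
diagnostic and returns false, whereas a value such as 0x20AC is accepted.
Emitting the diagnostic only for strings that actually get extracted would
need additional bookkeeping that is not shown here.
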
Since Samuel says:

> ... the file that poses problem is Testing/gtest/test/gtest_unittest.cc
> This is not something that contains anything to be translated, we'd need
> some option to just ignore Testing/ entirely.

this looks like the better option.

Bruno