Bug#496266: UTF-8 string characters not properly recognized
Adam Majer
adamm at zombino.com
Tue Sep 2 18:19:11 UTC 2008
Christian Perrier wrote:
>> On Saturday 23 August 2008 at 19:59 -0500, Adam Majer wrote:
>>> Package: gedit
>>> Version: 2.22.3-1
>>> Severity: normal
>>>
>>> The following UTF-8 string is not correctly handled in gedit,
>>>
>>> const char *unicode_insert = "?Э";
>>>
>>> The " and the ? characters are viewed as one character, making the
>>> entire thing next to impossible to copy/paste/edit.
>> Looks like an issue in pango, since it is not specific to gedit.
>>
>> Such things seem to happen a lot when using Tibetan characters, so this
>> may or may not be intentional. I’d prefer to have the input of someone
>> who uses them. Is there anyone on debian-i18n who’s more knowledgeable
>> about Tibetan glyphs?
>
>
> Adding Pema Geyleg and Tenzin Dendup, our fellow Dzongkha translation
> coordinators, who certainly have skills about Tibetan-family scripts
> (Dzongkha is one of these) and could maybe point you to people with
> needed knowledge.

I'm sorry, but aren't we missing the entire point here? This is not
about bad handling of some Tibetan characters. It is about bad handling
of 3-byte UTF-8 characters.

http://en.wikipedia.org/wiki/UTF-8

So, the following characters should have the same problems,

"ऄक
"ঈউঊ
"ਜਗਏ
"ଜଁଂ
"ஔ
"ంఁః
"ಂಖ
"ഈഃ
etc.

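To make "3-byte" concrete, here is a minimal sketch of how the length of
a UTF-8 sequence can be read off its first byte. The function name
utf8_seq_len is my own, not something taken from gedit or pango, and the
sketch does not reject overlong or out-of-range lead bytes.

```c
#include <stddef.h>

/* Length in bytes of the UTF-8 sequence introduced by `lead`.
   Returns 0 for a continuation byte (10xxxxxx) or a byte that
   matches none of the lead-byte patterns below.
   Hypothetical helper for illustration; no full validation. */
size_t utf8_seq_len(unsigned char lead)
{
    if (lead < 0x80)           return 1;  /* 0xxxxxxx: ASCII, e.g. '"' */
    if ((lead & 0xE0) == 0xC0) return 2;  /* 110xxxxx */
    if ((lead & 0xF0) == 0xE0) return 3;  /* 1110xxxx */
    if ((lead & 0xF8) == 0xF0) return 4;  /* 11110xxx */
    return 0;
}
```

For example, the ASCII " is byte 0x22 and gives 1, while the characters
listed above all start with byte 0xE0 and give 3.
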
I've put an ASCII " in front of each group of characters. In emacs,
I'm able to select the " in front of these characters and copy it. vim
under a UTF-8 gnome terminal also allows the " to be selected. In the
second-to-last line above (using icedove), I can't select the "
independently, but I can select the " and ಂ together and then remove
the second character.

Maybe I am just misunderstanding UTF-8; I'm not sure. But I expected
to be able to select one UTF-8 character at a time, even if that does
not make any sense linguistically.
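
What I mean by "one UTF-8 character at a time" can be sketched as
follows: advance one byte, then skip any continuation bytes. The names
utf8_next and utf8_count are made up for this example, and the code
assumes valid, NUL-terminated UTF-8. It only steps by code point; it
says nothing about how a toolkit decides where the cursor may stop.

```c
#include <stddef.h>

/* Return a pointer to the start of the code point following the one
   at `p`, by skipping continuation bytes (10xxxxxx).
   Assumes valid UTF-8 and that `p` is not at the terminating NUL.
   Hypothetical helper for illustration. */
const char *utf8_next(const char *p)
{
    do {
        p++;
    } while (((unsigned char)*p & 0xC0) == 0x80);
    return p;
}

/* Count the code points in a NUL-terminated UTF-8 string. */
size_t utf8_count(const char *s)
{
    size_t n = 0;
    while (*s != '\0') {
        s = utf8_next(s);
        n++;
    }
    return n;
}
```

With the second-to-last line above, i.e. the bytes 22 E0 B2 82 E0 B2 96,
utf8_count gives 3, and the first step moves exactly one byte, past
the ".
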
- Adam