[Pkg-zsh-devel] Bug#654225: zsh: Multibyte fails when $LANG.utf variable is not set

Frank Terbeck ft at bewatermyfriend.org
Mon Jan 2 17:52:09 UTC 2012


Morten Bo Johansen wrote:
[...]
> I attach my rather small zshrc.

Nothing in that file is relevant to this issue.

> ~/.zshenv is a symlink to ~/.environment
> which just sets a lot of environment variables. It is there that I now
> specify the LANG variable which makes zsh behave correctly.

That would be relevant then. What are you doing in there? Are you
exporting the parameters you're setting or not? Which parameters are you
setting (only locale-related parameters matter)?

> zprofile and
> zlogin are empty. On a side note, I did specify LANG=da_DK.utf8 in
> /etc/default/locale along with LC_ALL=da_DK.utf8. This is supposedly the
> right way to set your locale in Debian, but only the first line with
> LC_ALL is read.

Doesn't the default setup only set `LANGUAGE' and `LANG'?

That would make sense, because `LC_ALL' is a blunt instrument. If it's
set, you can't deviate from that setting. If you only set `LANG', then
that has the same effect as setting `LC_ALL' until you choose to modify
some aspect of your locale settings. You change `LC_COLLATE' or
`LC_MESSAGES' or whatever and only that setting is changed. All other
unset values default to the value of `LANG'.
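
To make that precedence concrete, here is a minimal sketch in portable
shell. The helper name `effective_locale' is made up for illustration
and is not part of zsh or the locales package. It assumes its argument
is a valid category name such as LC_COLLATE, and it only mimics the
lookup order (LC_ALL, then the category itself, then LANG, then
"POSIX"). It does not ask the C library anything.

```shell
# Hypothetical helper: print the value that should win for one locale
# category. Assumes $1 is a plain variable name like LC_COLLATE.
# Note: "category" and "value" are ordinary global variables here,
# because plain POSIX sh has no "local".
effective_locale() {
    category=$1
    # LC_ALL, when non-empty, overrides everything else.
    if [ -n "${LC_ALL:-}" ]; then
        printf '%s\n' "$LC_ALL"
        return
    fi
    # Indirect lookup of the category variable itself.
    eval "value=\${$category:-}"
    if [ -n "$value" ]; then
        printf '%s\n' "$value"
    elif [ -n "${LANG:-}" ]; then
        # Unset categories fall back to LANG.
        printf '%s\n' "$LANG"
    else
        # Nothing set at all: the default locale.
        printf 'POSIX\n'
    fi
}
```

With only LANG set, `effective_locale LC_COLLATE' prints the value of
LANG. Once LC_COLLATE is set as well, only that category changes, and
once LC_ALL is set, every category reports the LC_ALL value.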

> Maybe I should file a bug report for that against the
> locales package?

I don't think it's a bug in their package. (Well, not yet anyway; so
far it looks unlikely.)


Let's take a look first. I'm running the following in a "zsh -f" shell,
which means that there's no setup except for the global zshenv file.
That file is empty in my case, so that is a shell with just defaults.

I am only defining the following function:

l() {
    local i
    # One line per name reported by locale(1): name, zsh type, value.
    for i in ${${(f)"$(locale)"}%%\=*}; do
        printf '%s (type: "%s"): "%s"\n' "$i" "${(Pt)i}" "${(P)i}"
    done
}

That function prints *all* locale variables with their respective type
(as far as zsh is concerned) and value, to be able to compare those
settings with the values from the `locale' command.

Zsh handles a few variables itself even if they are not exported into
the environment, as documented in zshparam(1).

So, what's going on when zsh is freshly started?

zsh% l     
LANG (type: "scalar-export-special"): "en_GB.UTF-8"
LANGUAGE (type: "scalar-export"): "en_GB:en"
LC_CTYPE (type: ""): ""
LC_NUMERIC (type: ""): ""
LC_TIME (type: ""): ""
LC_COLLATE (type: ""): ""
LC_MONETARY (type: ""): ""
LC_MESSAGES (type: ""): ""
LC_PAPER (type: ""): ""
LC_NAME (type: ""): ""
LC_ADDRESS (type: ""): ""
LC_TELEPHONE (type: ""): ""
LC_MEASUREMENT (type: ""): ""
LC_IDENTIFICATION (type: ""): ""
LC_ALL (type: ""): ""

zsh% locale
LANG=en_GB.UTF-8
LANGUAGE=en_GB:en
LC_CTYPE="en_GB.UTF-8"
LC_NUMERIC="en_GB.UTF-8"
LC_TIME="en_GB.UTF-8"
LC_COLLATE="en_GB.UTF-8"
LC_MONETARY="en_GB.UTF-8"
LC_MESSAGES="en_GB.UTF-8"
LC_PAPER="en_GB.UTF-8"
LC_NAME="en_GB.UTF-8"
LC_ADDRESS="en_GB.UTF-8"
LC_TELEPHONE="en_GB.UTF-8"
LC_MEASUREMENT="en_GB.UTF-8"
LC_IDENTIFICATION="en_GB.UTF-8"
LC_ALL=

So there. `LANG' and `LANGUAGE' are defined and exported (as configured
in /etc/default/locale). `LANG' is "special" to the shell, as documented
in zshparam(1).

UTF-8 works like this. Now I do this:

zsh% LANG=POSIX

...which leads to:

zsh% l             
LANG (type: "scalar-export-special"): "POSIX"
LANGUAGE (type: "scalar-export"): "en_GB:en"
LC_CTYPE (type: ""): ""
LC_NUMERIC (type: ""): ""
LC_TIME (type: ""): ""
LC_COLLATE (type: ""): ""
LC_MONETARY (type: ""): ""
LC_MESSAGES (type: ""): ""
LC_PAPER (type: ""): ""
LC_NAME (type: ""): ""
LC_ADDRESS (type: ""): ""
LC_TELEPHONE (type: ""): ""
LC_MEASUREMENT (type: ""): ""
LC_IDENTIFICATION (type: ""): ""
LC_ALL (type: ""): ""

zsh% locale    
LANG=POSIX
LANGUAGE=en_GB:en
LC_CTYPE="POSIX"
LC_NUMERIC="POSIX"
LC_TIME="POSIX"
LC_COLLATE="POSIX"
LC_MONETARY="POSIX"
LC_MESSAGES="POSIX"
LC_PAPER="POSIX"
LC_NAME="POSIX"
LC_ADDRESS="POSIX"
LC_TELEPHONE="POSIX"
LC_MEASUREMENT="POSIX"
LC_IDENTIFICATION="POSIX"
LC_ALL=

See how all the values that defaulted to "en_GB.UTF-8" (the value of
`LANG') before now default to "POSIX" (the new value of `LANG').

UTF-8 stops working now, because zsh rightfully thinks that it is not
in a Unicode environment.
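
As a rough illustration of why "POSIX" and "en_GB.UTF-8" differ here,
the following sketch checks whether a locale *name* advertises a UTF-8
codeset. The function name `names_utf8_codeset' is made up, and this is
only a heuristic on the string. The real decision is made by the C
library for the locale in effect, and `locale charmap' reports the
codeset it actually uses.

```shell
# Hypothetical helper: succeed if the locale name in $1 carries a UTF-8
# codeset suffix (".UTF-8" or ".utf8", optionally followed by an
# "@modifier"). Names without a codeset, such as "POSIX", fail.
names_utf8_codeset() {
    case $1 in
        *.[Uu][Tt][Ff]-8|*.[Uu][Tt][Ff]8) return 0 ;;
        *.[Uu][Tt][Ff]-8@*|*.[Uu][Tt][Ff]8@*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, `names_utf8_codeset en_GB.UTF-8' and
`names_utf8_codeset da_DK.utf8' succeed, while
`names_utf8_codeset POSIX' fails.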

Let's experiment a bit:

Let's unset `LANG' to get closer to your setup:

zsh% unset LANG

Now we're getting this:

zsh% l
LANG (type: ""): ""
LANGUAGE (type: "scalar-export"): "en_GB:en"
LC_CTYPE (type: ""): ""
LC_NUMERIC (type: ""): ""
LC_TIME (type: ""): ""
LC_COLLATE (type: ""): ""
LC_MONETARY (type: ""): ""
LC_MESSAGES (type: ""): ""
LC_PAPER (type: ""): ""
LC_NAME (type: ""): ""
LC_ADDRESS (type: ""): ""
LC_TELEPHONE (type: ""): ""
LC_MEASUREMENT (type: ""): ""
LC_IDENTIFICATION (type: ""): ""
LC_ALL (type: ""): ""

zsh% locale
LANG=
LANGUAGE=en_GB:en
LC_CTYPE="POSIX"
LC_NUMERIC="POSIX"
LC_TIME="POSIX"
LC_COLLATE="POSIX"
LC_MONETARY="POSIX"
LC_MESSAGES="POSIX"
LC_PAPER="POSIX"
LC_NAME="POSIX"
LC_ADDRESS="POSIX"
LC_TELEPHONE="POSIX"
LC_MEASUREMENT="POSIX"
LC_IDENTIFICATION="POSIX"
LC_ALL=

Everything as expected. And UTF-8 does not work. Which is correct,
because, again, nothing tells zsh it's in a Unicode environment.

Now let's set `LC_ALL':

zsh% LC_ALL=en_GB.UTF-8

And the result:

zsh% l   
LANG (type: ""): ""
LANGUAGE (type: "scalar-export"): "en_GB:en"
LC_CTYPE (type: ""): ""
LC_NUMERIC (type: ""): ""
LC_TIME (type: ""): ""
LC_COLLATE (type: ""): ""
LC_MONETARY (type: ""): ""
LC_MESSAGES (type: ""): ""
LC_PAPER (type: ""): ""
LC_NAME (type: ""): ""
LC_ADDRESS (type: ""): ""
LC_TELEPHONE (type: ""): ""
LC_MEASUREMENT (type: ""): ""
LC_IDENTIFICATION (type: ""): ""
LC_ALL (type: "scalar-special"): "en_GB.UTF-8"

zsh% locale
LANG=
LANGUAGE=en_GB:en
LC_CTYPE="POSIX"
LC_NUMERIC="POSIX"
LC_TIME="POSIX"
LC_COLLATE="POSIX"
LC_MONETARY="POSIX"
LC_MESSAGES="POSIX"
LC_PAPER="POSIX"
LC_NAME="POSIX"
LC_ADDRESS="POSIX"
LC_TELEPHONE="POSIX"
LC_MEASUREMENT="POSIX"
LC_IDENTIFICATION="POSIX"
LC_ALL=

Everything as expected. And UTF-8 works. As expected.

"What?" I hear you say. The `LC_ALL' setting in the `locale' output
appears empty. And yes, it is. I didn't export the parameter to the
environment, so the external application can't know about it. (The
parameter's type is "scalar-special", with no "export" in there. The
reason it still works is that zsh handles this parameter specially: it
does not have to be exported, although applications launched from that
shell will not see it, which they will probably not like.)
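
The difference between a parameter that is merely set and one that is
exported can be shown with any variable, not just the locale ones. This
is a small sketch in portable shell. The names `child_sees' and
`DEMO_SETTING' are made up for illustration, and the helper assumes its
argument is a valid variable name.

```shell
# Hypothetical helper: start a child sh and print the value it sees for
# the variable named in $1, or "unexported" if the variable is missing
# from the child's environment.
child_sees() {
    sh -c 'eval "v=\${$1-unexported}"; printf "%s\n" "$v"' sh "$1"
}

DEMO_SETTING=hello          # set in the shell only, not exported
child_sees DEMO_SETTING     # the child does not see it

export DEMO_SETTING         # now part of the environment
child_sees DEMO_SETTING     # the child sees the value
```

The `locale' command above is in the same position as the child in the
first call: the shell knows the value, but the external program does
not.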

If I export `LC_ALL', I am getting the following:

zsh% export LC_ALL

zsh% l
LANG (type: ""): ""
LANGUAGE (type: "scalar-export"): "en_GB:en"
LC_CTYPE (type: ""): ""
LC_NUMERIC (type: ""): ""
LC_TIME (type: ""): ""
LC_COLLATE (type: ""): ""
LC_MONETARY (type: ""): ""
LC_MESSAGES (type: ""): ""
LC_PAPER (type: ""): ""
LC_NAME (type: ""): ""
LC_ADDRESS (type: ""): ""
LC_TELEPHONE (type: ""): ""
LC_MEASUREMENT (type: ""): ""
LC_IDENTIFICATION (type: ""): ""
LC_ALL (type: "scalar-export-special"): "en_GB.UTF-8"

zsh% locale
LANG=
LANGUAGE=en_GB:en
LC_CTYPE="en_GB.UTF-8"
LC_NUMERIC="en_GB.UTF-8"
LC_TIME="en_GB.UTF-8"
LC_COLLATE="en_GB.UTF-8"
LC_MONETARY="en_GB.UTF-8"
LC_MESSAGES="en_GB.UTF-8"
LC_PAPER="en_GB.UTF-8"
LC_NAME="en_GB.UTF-8"
LC_ADDRESS="en_GB.UTF-8"
LC_TELEPHONE="en_GB.UTF-8"
LC_MEASUREMENT="en_GB.UTF-8"
LC_IDENTIFICATION="en_GB.UTF-8"
LC_ALL=en_GB.UTF-8

And UTF-8 still works - again, as I would expect.

So, nope, sorry. I can't reproduce what you're seeing. Granted, I tested
this on my laptop where I'm running an up-to-date development snapshot
of zsh.

But the code handling LANG etc. was last changed in 2007, which was
between the 4.3.4 and 4.3.5 releases of zsh.


If you still think this is a bug and not an error in your setup, then
please provide a concise way to reproduce the issue, because so far I've
invested about an hour into seeing that everything works as expected. ;)

If I am missing something, tell me what it is exactly.

Regards, Frank




