[Nut-upsdev] Character-encoding in code and docs

Fri Feb 10 13:40:45 UTC 2006

On 2/10/06, Arjen de Korte <nut+devel at de-korte.org> wrote:
>
> > On Fri, Feb 10, 2006 at 07:37:35AM -0500, Charles Lepple wrote:
> >> I think I have seen a few files with latin-1 characters in the
> >> comments (changelog as well), and that's probably what Niels is
> >> referring to.
> >
> > Yes, this was personal names in comments and copyrights.
> > Also, the CREDITS file could easily end up containing names with
> > characters outside the ASCII repertoire.
>
> I understand you wish to have your name written as it is supposed to be
> written, but unfortunately there is no portable way to do so, unless we
> start using generated files (based on the localization) for all of these.
> In that case we need to agree on an encoding standard and do some magic in
> the makefiles to convert these according to the localization on the system
> we're building on (if possible at all).

That's a little extreme... UTF-8 has been used to represent all kinds
of names in Debian changelogs for a while, and I think it could work
here as well. AFAIK, the only place where UTF-8 loses is for scripts
where most characters have code points above U+07FF (CJK, for
example): each of those takes three bytes in UTF-8 versus two in
UTF-16, so the file grows by about half (which is when you would
consider UTF-16).
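
To put rough numbers on that size trade-off, here is a small Python
sketch. The sample strings and the helper name encoded_sizes are my own
picks for illustration; nothing here comes from the NUT tree.

```python
# Compare how many bytes the same text needs in UTF-8 and UTF-16.
# The sample strings below are illustrative assumptions.

def encoded_sizes(text):
    """Return (utf8_bytes, utf16_bytes) for text, with no BOM counted."""
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))

samples = {
    "ascii": "Charles",
    "latin": "M\u00fcller",            # u-umlaut, one precomposed code point
    "cjk": "\u65e5\u672c\u8a9e",       # three CJK ideographs
}

for label, text in samples.items():
    utf8_len, utf16_len = encoded_sizes(text)
    print(label, utf8_len, utf16_len)
```

For names written in Latin script with the odd accented letter, UTF-8
stays smaller than UTF-16; only the CJK sample comes out larger.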

A compromise might be to keep the credits file in UTF-8, but only
allow ASCII transliterations in the individual source files.
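
If we went that way, a check along these lines could enforce it. This is
only a sketch: the function names are hypothetical, and how it would be
hooked into the build or a commit check is left open.

```python
# Sketch of a policy check: source files stay pure ASCII, while the
# credits file only has to be valid UTF-8. Names are hypothetical.

def non_ascii_lines(data):
    """Return 1-based numbers of lines in data (bytes) holding a byte >= 0x80."""
    return [
        number
        for number, line in enumerate(data.splitlines(), 1)
        if any(byte >= 0x80 for byte in line)
    ]

def is_valid_utf8(data):
    """Return True if data (bytes) decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True

# Example input: the second line carries a UTF-8 encoded e-acute.
source = b"/* plain comment */\n/* Ren\xc3\xa9 */\n"
print(non_ascii_lines(source))
print(is_valid_utf8(b"Ren\xc3\xa9"), is_valid_utf8(b"Ren\xe9"))
```

The second call shows why the credits file needs its own check: a stray
Latin-1 byte such as 0xE9 is not valid UTF-8 on its own.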

> That doesn't need to stop you from using whatever kind of encoding in your
> sourcefiles, as long as you make sure that they are between /* and */, you
> should be fine. Of course there is no guarantee that the person viewing
> these files is seeing what you intended to be seen.

...or that someone's editor will not try to perform automatic encoding
conversion.
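
For what it's worth, this is roughly what such a conversion does to a
name when an editor guesses Latin-1 for a UTF-8 file and then saves it
back as UTF-8. The helper name is hypothetical and the example is just a
sketch of the failure mode, not of any particular editor.

```python
# Simulate an editor that misreads UTF-8 bytes as Latin-1 and then
# writes the result back out as UTF-8. Helper name is hypothetical.

def misread_as_latin1(text):
    """Return what text looks like after its UTF-8 bytes are decoded as Latin-1."""
    return text.encode("utf-8").decode("latin-1")

original = "\u00e9"                      # e-acute, 2 bytes in UTF-8
garbled = misread_as_latin1(original)    # now two unrelated characters
resaved = garbled.encode("utf-8")        # 4 bytes once saved again

print(len(original.encode("utf-8")), len(garbled), len(resaved))
```

Each pass through a wrong guess doubles the damage, which is another
argument for agreeing on one encoding up front.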

--
- Charles Lepple