[Pkg-postgresql-public] Locale sanitizing [was: Re: Changing default encoding to unicode?]

Martin Pitt mpitt@debian.org
Mon, 8 Nov 2004 12:38:24 +0100


Hi!

Peter Eisentraut [2004-11-07 23:37 +0100]:
> The man page I am reading says that in 7.4 the encoding defaults to
> SQL_ASCII, which is what it actually does.

Right, sorry for mixing that up.

> Right.  You would only need to devise a way to match the encoding names
> provided by the locale to the encoding names used by PostgreSQL.  You
> can steal that mapping table from PG 8.0.

I took this table as a basis to implement a similar autodetection in
the postinst. Thanks for that hint!

My local postgresql tree no longer asks the encoding question; it
determines the encoding from the chosen locale (the locale question now
explains this) and runs the postmaster under that locale (bugs #254058,
#257117).
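
A rough sketch of such a lookup, for illustration only (this is not the
code from the attached patch). The function name charmap_to_pg_encoding
is made up, the table is abbreviated, and the ANSI_X3.4-1968 entry (the
charmap glibc reports for the C locale) is an assumption that the patch
itself does not make:

```shell
#!/bin/sh
# Illustrative only: charmap_to_pg_encoding is a hypothetical helper, not
# part of the attached patch. It maps a charmap name (as printed by
# `locale charmap`) to a PostgreSQL encoding name.
charmap_to_pg_encoding() {
    case "$1" in
        UTF-8|utf8)                        echo UTF8 ;;
        ISO-8859-1|ISO8859-1|iso88591)     echo LATIN1 ;;
        ISO-8859-2|ISO8859-2|iso88592)     echo LATIN2 ;;
        ISO-8859-15|ISO8859-15|iso885915)  echo LATIN9 ;;
        EUC-JP|eucJP)                      echo EUC_JP ;;
        KOI8-R)                            echo KOI8 ;;
        ANSI_X3.4-1968)                    echo SQL_ASCII ;;  # assumption: C/POSIX locale
        *)                                 echo UTF8 ;;       # same fallback as the patch
    esac
}

# Example with a fixed charmap name; in a maintainer script the argument
# would come from something like `env LANG=$PGLANG locale charmap`.
charmap_to_pg_encoding ISO-8859-15
```

A case statement compares the whole name, so ISO-8859-1 cannot
accidentally match ISO-8859-15 as well.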

I also fixed #263503 (postgresql uses aptitude's locale instead of the
debconf selected one). I attach the current changeset here. If nobody
objects, I will commit it tomorrow.
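
Why exporting LANG alone may not be enough follows from the usual lookup
order for a locale category: LC_ALL overrides the category variable,
which in turn overrides LANG. A minimal sketch for LC_CTYPE, where
effective_ctype is a made-up name used only for illustration and the
three values are passed as arguments so the example does not touch the
real environment:

```shell
#!/bin/sh
# Hypothetical helper: report which locale LC_CTYPE would resolve to.
# $1 = LC_ALL, $2 = LC_CTYPE, $3 = LANG; an empty string means "unset".
# Lookup order: LC_ALL, then LC_CTYPE, then LANG, else "C".
effective_ctype() {
    echo "${1:-${2:-${3:-C}}}"
}

# An inherited LC_ALL masks the LANG chosen in debconf ...
effective_ctype "en_US.UTF-8" "" "de_DE.UTF-8"

# ... and only once the LC_* variables are cleared does LANG take effect.
effective_ctype "" "" "de_DE.UTF-8"
```

This is presumably why the generated initdb script in the attached
changeset unsets the LC_* variables before exporting LANG.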

Thanks in advance for proofreading,

Martin

-- 
Martin Pitt                       http://www.piware.de
Ubuntu Developer            http://www.ubuntulinux.org
Debian GNU/Linux Developer       http://www.debian.org

Attachment: psql-locale.diff

* looking for pkg-postgresql-private@lists.alioth.debian.org--postgresql/postgresql--unstable--1--patch-70 to compare with
* comparing to pkg-postgresql-private@lists.alioth.debian.org--postgresql/postgresql--unstable--1--patch-70
M  postgresql-7.4.6/debian/changelog
M  postgresql-7.4.6/debian/postgresql-startup.in
M  postgresql-7.4.6/debian/postgresql.config.in
M  postgresql-7.4.6/debian/postgresql.templates
M  postgresql-7.4.6/debian/postinst.in
M  postgresql-7.4.6/debian/po/templates.pot

* modified files

--- orig/postgresql-7.4.6/debian/changelog
+++ mod/postgresql-7.4.6/debian/changelog
@@ -28,8 +28,14 @@
     kernel variables, this should be done by the admin in /etc/sysctl.conf.
   * postinst.in: do chmod'ing of conffiles as root (before calling initdb) to
     avoid permission errors
+  * Dropped debconf question for default database encoding; this is now
+    automatically determined from the locale. Locale debconf question now also
+    explains the impact on encoding. Closes: #254058, #257117
+  * postgresql.config.in: Fixed a debconf logic flaw: previously, initdb used
+    the locale of the package installation process, not the one chosen in
+    debconf. Closes: #263503
 
- -- Martin Pitt <mpitt@debian.org>  Mon,  8 Nov 2004 00:09:32 +0100
+ -- Martin Pitt <mpitt@debian.org>  Mon,  8 Nov 2004 12:05:40 +0100
 
 postgresql (7.4.6-2) unstable; urgency=medium
 


--- orig/postgresql-7.4.6/debian/po/templates.pot
+++ mod/postgresql-7.4.6/debian/po/templates.pot
@@ -16,7 +16,7 @@
 msgstr ""
 "Project-Id-Version: PACKAGE VERSION\n"
 "Report-Msgid-Bugs-To: \n"
-"POT-Creation-Date: 2004-06-08 17:30+0200\n"
+"POT-Creation-Date: 2004-11-08 01:15+0100\n"
 "PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
 "Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
 "Language-Team: LANGUAGE <LL@li.org>\n"
@@ -166,10 +166,20 @@
 #. Description
 #: ../postgresql.templates:65
 msgid ""
-"The locales available to you can be configured with locale-gen.  The default "
-"locale, C, is always available; any others must be specifically configured "
-"for your system.  Only locales that are currently configured appear in the "
-"list of choices."
+"This setting also determines the default encoding of newly created "
+"databases. The default can be overridden when creating a particular "
+"database, but if the database encoding does not match the encoding of the "
+"backend's locale you might encounter (nonfatal) errors."
+msgstr ""
+
+#. Type: select
+#. Description
+#: ../postgresql.templates:65
+msgid ""
+"The locales available to you can be configured with 'dpkg-reconfigure "
+"locales'.  The default locale, C, is always available; any others must be "
+"specifically configured for your system.  Only locales that are currently "
+"configured appear in the list of choices."
 msgstr ""
 
 #. Type: select
@@ -178,24 +188,25 @@
 msgid ""
 "Use of any locale but C will somewhat reduce the efficiency of index access, "
 "because  sorting by national collating order is rather less efficient than "
-"sorting by ASCII sequence."
+"sorting by ASCII sequence. But 'C' is not capable of representing any "
+"characters outside the 7-bit ASCII range."
 msgstr ""
 
 #. Type: select
 #. Choices
-#: ../postgresql.templates:83
+#: ../postgresql.templates:89
 msgid "European, US"
 msgstr ""
 
 #. Type: select
 #. Description
-#: ../postgresql.templates:85
+#: ../postgresql.templates:91
 msgid "Choose European or US day/month order in dates."
 msgstr ""
 
 #. Type: select
 #. Description
-#: ../postgresql.templates:85
+#: ../postgresql.templates:91
 msgid ""
 "Do you expect dates to be in European format (day before month) or in US "
 "format (month before day)?"
@@ -203,126 +214,28 @@
 
 #. Type: select
 #. Description
-#: ../postgresql.templates:85
+#: ../postgresql.templates:91
 msgid ""
 "This setting affects the interpretation of all dates on input, and the "
 "presentation of dates in all styles except ISO.  A user can override this "
 "setting for a particular session."
 msgstr ""
 
-#. Type: select
-#. Choices
-#: ../postgresql.templates:95
-msgid ""
-"per_locale, SQL_ASCII, UNICODE, EUC_JP, EUC_CN, EUC_KR, JOHAB, EUC_TW, "
-"MULE_INTERNAL, LATIN1, LATIN2, LATIN3, LATIN4, LATIN5, LATIN6, LATIN7, "
-"LATIN8, LATIN9, LATIN10, ISO-8859-5, ISO-8859-6, ISO-8859-7, ISO-8859-8, "
-"KOI8, WIN, ALT, WIN1256, TCVN, WIN874"
-msgstr ""
-
-#. Type: select
-#. Description
-#: ../postgresql.templates:97
-msgid "What character encoding should the database use by default?"
-msgstr ""
-
-#. Type: select
-#. Description
-#: ../postgresql.templates:97
-msgid ""
-"The database character encoding determines which internal codes are used to "
-"represent written characters.  The traditional form is SQL_ASCII, which uses "
-"the ASCII character set.  This is not capable of representing non-English "
-"alphabets, so you will need to use a different character set if your "
-"databases are to hold non-English characters."
-msgstr ""
-
-#. Type: select
-#. Description
-#: ../postgresql.templates:97
-msgid ""
-"These are the available encodings:\n"
-" per_locale         SQL_ASCII if chosen locale is C, otherwise UNICODE\n"
-" SQL_ASCII          ASCII - standard English characters only\n"
-" UNICODE            Unicode (UTF-8)\n"
-" EUC_JP             Japanese EUC\n"
-" EUC_CN             Chinese EUC\n"
-" EUC_KR             Korean EUC\n"
-" JOHAB              Korean EUC (Hangle base)\n"
-" EUC_TW             Taiwan EUC\n"
-" MULE_INTERNAL      Mule internal code (Emacs)\n"
-" LATIN1             ISO 8859-1 ECMA-94 Latin Alphabet No.1\n"
-" LATIN2             ISO 8859-2 ECMA-94 Latin Alphabet No.2\n"
-" LATIN3             ISO 8859-3 ECMA-94 Latin Alphabet No.3\n"
-" LATIN4             ISO 8859-4 ECMA-94 Latin Alphabet No.4\n"
-" LATIN5             ISO 8859-9 ECMA-128 Latin Alphabet No.5\n"
-" LATIN6             ISO 8859-10 ECMA-144 Latin Alphabet No.6\n"
-" LATIN7             ISO 8859-13 Latin Alphabet No.7\n"
-" LATIN8             ISO 8859-14 Latin Alphabet No.8\n"
-" LATIN9             ISO 8859-15 Latin Alphabet No.9\n"
-" LATIN10            ISO 8859-16 ASRO SR 14111 Latin Alphabet No.10\n"
-" ISO-8859-5         ECMA-113 Latin/Cyrillic\n"
-" ISO-8859-6         ECMA-114 Latin/Arabic\n"
-" ISO-8859-7         ECMA-118 Latin/Greek\n"
-" ISO-8859-8         ECMA-121 Latin/Hebrew\n"
-" KOI8               KOI8-R(U)\n"
-" WIN                Windows CP1251\n"
-" ALT                Windows CP866\n"
-" WIN1256            Arabic Windows CP1256\n"
-" TCVN               Vietnamese TCVN-5712 (Windows CP1258)\n"
-" WIN874             Thai Windows CP874"
-msgstr ""
-
-#. Type: select
-#. Description
-#: ../postgresql.templates:97
-msgid ""
-"Important: Before PostgreSQL 7.2, LATIN5 mistakenly meant ISO 8859-5. From "
-"7.2 on, LATIN5 means ISO 8859-9. If you have a LATIN5 database created on "
-"7.1 or earlier and want to migrate to 7.2 (or later), you should be very "
-"careful about this change."
-msgstr ""
-
-#. Type: select
-#. Description
-#: ../postgresql.templates:97
-msgid ""
-"Important: Not all APIs supports all the encodings listed above. For "
-"example, the PostgreSQL JDBC driver does not support MULE_INTERNAL, LATIN6, "
-"LATIN8, and LATIN10."
-msgstr ""
-
-#. Type: select
-#. Description
-#: ../postgresql.templates:97
-msgid ""
-"We suggest that UNICODE is the best encoding to use if you cannot use "
-"SQL_ASCII, unless you have a particular requirement for some other encoding."
-msgstr ""
-
-#. Type: select
-#. Description
-#: ../postgresql.templates:97
-msgid ""
-"This encoding is the default to use when no other is specified. You can "
-"create a particular database with any encoding you wish."
-msgstr ""
-
 #. Type: string
 #. Default
-#: ../postgresql.templates:153
+#: ../postgresql.templates:101
 msgid "/var/lib/postgres/data"
 msgstr ""
 
 #. Type: string
 #. Description
-#: ../postgresql.templates:154
+#: ../postgresql.templates:102
 msgid "Where should the PostgreSQL database be created?"
 msgstr ""
 
 #. Type: string
 #. Description
-#: ../postgresql.templates:154
+#: ../postgresql.templates:102
 msgid ""
 "The database is built as a directory tree under the directory which you "
 "specify here, which must be empty or non-existent.  It will contain all the "
@@ -333,7 +246,7 @@
 
 #. Type: string
 #. Description
-#: ../postgresql.templates:154
+#: ../postgresql.templates:102
 msgid ""
 "Tables or databases can be located elsewhere and referenced by symbolic "
 "links. However, this structure is not preserved during an upgrade; therefore "
@@ -343,19 +256,19 @@
 
 #. Type: string
 #. Description
-#: ../postgresql.templates:154
+#: ../postgresql.templates:102
 msgid "The default location is /var/lib/postgres/data."
 msgstr ""
 
 #. Type: boolean
 #. Description
-#: ../postgresql.templates:171
+#: ../postgresql.templates:119
 msgid "Cancel upgrade from undumpable version?"
 msgstr ""
 
 #. Type: boolean
 #. Description
-#: ../postgresql.templates:171
+#: ../postgresql.templates:119
 msgid ""
 "You are upgrading from a PostgreSQL release older than 7.2. These releases "
 "are not supported any more and upgrading from them will most likely fail "
@@ -364,13 +277,13 @@
 
 #. Type: boolean
 #. Description
-#: ../postgresql.templates:171
+#: ../postgresql.templates:119
 msgid "Please upgrade your system to Debian 3.0 (\"Woody\") first."
 msgstr ""
 
 #. Type: boolean
 #. Description
-#: ../postgresql.templates:171
+#: ../postgresql.templates:119
 msgid ""
 "IF YOU CHOOSE TO IGNORE THIS WARNING AND PROCEED WITH THE UPGRADE, YOU MAY "
 "FIND THAT YOUR DATA IS IRRECOVERABLE.  At the least, you may need to edit "
@@ -379,19 +292,19 @@
 
 #. Type: boolean
 #. Description
-#: ../postgresql.templates:171
+#: ../postgresql.templates:119
 msgid "Do you want to cancel the upgrade? (Highly recommended)"
 msgstr ""
 
 #. Type: boolean
 #. Description
-#: ../postgresql.templates:187
+#: ../postgresql.templates:135
 msgid "Should the data be purged as well as the package files?"
 msgstr ""
 
 #. Type: boolean
 #. Description
-#: ../postgresql.templates:187
+#: ../postgresql.templates:135
 msgid ""
 "A request to purge PostgreSQL might imply removal of the database files "
 "under /var/lib/postgres/data, which contain the actual database data (unless "
@@ -400,7 +313,7 @@
 
 #. Type: boolean
 #. Description
-#: ../postgresql.templates:187
+#: ../postgresql.templates:135
 msgid ""
 "When a purge is requested, these files can be removed and any data that may "
 "be there can be destroyed."
@@ -408,13 +321,13 @@
 
 #. Type: boolean
 #. Description
-#: ../postgresql.templates:198
+#: ../postgresql.templates:146
 msgid "Should pg_hba.conf be converted to the new format?"
 msgstr ""
 
 #. Type: boolean
 #. Description
-#: ../postgresql.templates:198
+#: ../postgresql.templates:146
 msgid ""
 "The format of /etc/postgresql/pg_hba.conf has been changed by the addition "
 "of a username field.  It needs to be changed in order to allow access to "
@@ -423,7 +336,7 @@
 
 #. Type: boolean
 #. Description
-#: ../postgresql.templates:198
+#: ../postgresql.templates:146
 msgid ""
 "If you allow, it will be automatically updated to allow access to all users "
 "on each type of connection.  This was always the behaviour in release 7.2 "
@@ -432,13 +345,13 @@
 
 #. Type: boolean
 #. Description
-#: ../postgresql.templates:210
+#: ../postgresql.templates:158
 msgid "Should PL/PGSQL procedural language be enabled in all databases?"
 msgstr ""
 
 #. Type: boolean
 #. Description
-#: ../postgresql.templates:210
+#: ../postgresql.templates:158
 msgid ""
 "PL/PGSQL is a procedural language that can be used to define functions for "
 "use in SQL queries."
@@ -446,7 +359,7 @@
 
 #. Type: boolean
 #. Description
-#: ../postgresql.templates:210
+#: ../postgresql.templates:158
 msgid ""
 "The installation script can make sure that PL/PGSQL is available in all your "
 "databases."
@@ -454,7 +367,7 @@
 
 #. Type: boolean
 #. Description
-#: ../postgresql.templates:210
+#: ../postgresql.templates:158
 msgid ""
 "If you don't want that, you can add the language to individual databases "
 "later with the enable_lang or create_lang scripts."


--- orig/postgresql-7.4.6/debian/postgresql-startup.in
+++ mod/postgresql-7.4.6/debian/postgresql-startup.in
@@ -218,6 +218,10 @@
 check_shm
 check_filemax
 
+# set locale that is used by the backend
+export LANG=`pg_controldata "$PGDATA" | grep LC_CTYPE | awk '{print $2}'`
+export LC_ALL=$LANG
+
 # Ready to go: stand clear...
 cd ${POSTGRES_HOME}
 eval /usr/lib/postgresql/bin/pg_ctl start -s -D ${PGDATA} ${LOG_OPT} ${OPTIONS}


--- orig/postgresql-7.4.6/debian/postgresql.config.in
+++ mod/postgresql-7.4.6/debian/postgresql.config.in
@@ -48,20 +48,6 @@
 	db_go
     fi
 
-    # Available locales:
-    langs=`locale -a`
-    if [ -n "$langs" ]
-    then
-	CHOICES=C
-	for lng in $langs
-	do
-	    if [ "$lng" != C -a "$lng" != POSIX ]; then
-		CHOICES="$CHOICES, $lng"
-	    fi
-	done
-	db_subst postgresql/settings/locale CURRENT_LOCALE_LIST "$CHOICES"
-    fi
-
     # on new installs, ask user for database directory $PGDATA
     if [ ! -f /etc/postgresql/postmaster.conf ]
     then
@@ -81,26 +67,58 @@
     db_input medium postgresql/purge_data_too || true
     db_go
 
-    # LANG has to be set before we run initdb
-    db_get postgresql/settings/locale
-    PGLANG="$RET"
-    if [ -z "$RET" -a -r "$PGDATA/global/pg_control" -a -x /usr/lib/postgresql/bin/pg_controldata ]
-    then
-	PGLANG=`/usr/lib/postgresql/bin/pg_controldata | grep LC_COLLATE | awk '{print $2}'`
+    # determine default value of PGLANG
+    db_fget postgresql/settings/day_month_order seen || true
+    if [ "$RET" = "true" ]; then
+	db_get postgresql/settings/locale
+	PGLANG="$RET"
+    else
+	if [ -r "$PGDATA/global/pg_control" -a -x /usr/lib/postgresql/bin/pg_controldata ]
+	then
+	    PGLANG=`/usr/lib/postgresql/bin/pg_controldata "$PGDATA" | grep LC_COLLATE | awk '{print $2}'`
+	else
+	    if [ -r /etc/environment ]
+	    then
+		. /etc/environment
+		PGLANG=${LC_COLLATE:-${LC_TYPE:-${LC_ALL:-${LANG}}}}
+	    else
+		PGLANG=C
+	    fi
+	fi
+
+	# set default value
+	db_set postgresql/settings/locale "$PGLANG"
     fi
 
-    # database settings
+    # Available locales
+    CHOICES=C
+    langs=`locale -a` || true
+    if [ -n "$langs" ]
+    then
+	for lng in $langs
+	do
+	    if [ "$lng" != C -a "$lng" != POSIX ]; then
+		CHOICES="$CHOICES, $lng"
+	    fi
+	done
 
-    db_set postgresql/settings/locale ${PGLANG:=C}
+	# Ensure that PGLANG is in choices; it might be written differently
+	if ! echo "$CHOICES" | grep -Fq "$PGLANG"; then
+	    CHOICES="$CHOICES, $PGLANG"
+	fi
+    fi
+
+    db_subst postgresql/settings/locale CURRENT_LOCALE_LIST "$CHOICES"
     db_input medium postgresql/settings/locale || true
-    db_input medium postgresql/settings/encoding || true
+    db_go
 
-    # Guess the postgresql date order from current locale setting if question
+    # Guess the postgresql date order from the PGLANG setting if question
     # is new
     db_fget postgresql/settings/day_month_order seen || true
-    if [ ! "$RET" = "true" ]
+    if [ "$RET" != "true" ]
     then
-	D=`su -c 'date -d "January 30 2000" +%x' postgres | cut -c 1-2` || true
+	db_get postgresql/settings/locale || true
+	D=`env LC_TIME=$RET date -d "January 30 2000" +%x | cut -c 1-2` || true
 	if [ "$D" = "01" ]
 	then
 	    db_set postgresql/settings/day_month_order US


--- orig/postgresql-7.4.6/debian/postgresql.templates
+++ mod/postgresql-7.4.6/debian/postgresql.templates
@@ -69,14 +69,20 @@
  particular locale, that cannot be changed without destroying and
  recreating the database.
  .
- The locales available to you can be configured with locale-gen.  The
- default locale, C, is always available; any others must be specifically
- configured for your system.  Only locales that are currently configured
- appear in the list of choices.
+ This setting also determines the default encoding of newly created
+ databases. The default can be overridden when creating a particular
+ database, but if the database encoding does not match the encoding of
+ the backend's locale you might encounter (nonfatal) errors.
+ .
+ The locales available to you can be configured with 'dpkg-reconfigure
+ locales'.  The default locale, C, is always available; any others
+ must be specifically configured for your system.  Only locales that
+ are currently configured appear in the list of choices.
  .
  Use of any locale but C will somewhat reduce the efficiency of index
  access, because  sorting by national collating order is rather less
- efficient than sorting by ASCII sequence.
+ efficient than sorting by ASCII sequence. But 'C' is not capable of
+ representing any characters outside the 7-bit ASCII range.
 
 Template: postgresql/settings/day_month_order
 Type: select
@@ -90,64 +96,6 @@
  presentation of dates in all styles except ISO.  A user can override this
  setting for a particular session.
 
-Template: postgresql/settings/encoding
-Type: select
-_Choices: per_locale, SQL_ASCII, UNICODE, EUC_JP, EUC_CN, EUC_KR, JOHAB, EUC_TW, MULE_INTERNAL, LATIN1, LATIN2, LATIN3, LATIN4, LATIN5, LATIN6, LATIN7, LATIN8, LATIN9, LATIN10, ISO-8859-5, ISO-8859-6, ISO-8859-7, ISO-8859-8, KOI8, WIN, ALT, WIN1256, TCVN, WIN874
-Default: per_locale
-_Description: What character encoding should the database use by default?
- The database character encoding determines which internal codes are used
- to represent written characters.  The traditional form is SQL_ASCII, which
- uses the ASCII character set.  This is not capable of representing
- non-English alphabets, so you will need to use a different character set
- if your databases are to hold non-English characters.
- .
- These are the available encodings:
-  per_locale         SQL_ASCII if chosen locale is C, otherwise UNICODE
-  SQL_ASCII          ASCII - standard English characters only
-  UNICODE            Unicode (UTF-8)
-  EUC_JP             Japanese EUC
-  EUC_CN             Chinese EUC
-  EUC_KR             Korean EUC
-  JOHAB              Korean EUC (Hangle base)
-  EUC_TW             Taiwan EUC
-  MULE_INTERNAL      Mule internal code (Emacs)
-  LATIN1             ISO 8859-1 ECMA-94 Latin Alphabet No.1
-  LATIN2             ISO 8859-2 ECMA-94 Latin Alphabet No.2
-  LATIN3             ISO 8859-3 ECMA-94 Latin Alphabet No.3
-  LATIN4             ISO 8859-4 ECMA-94 Latin Alphabet No.4
-  LATIN5             ISO 8859-9 ECMA-128 Latin Alphabet No.5
-  LATIN6             ISO 8859-10 ECMA-144 Latin Alphabet No.6
-  LATIN7             ISO 8859-13 Latin Alphabet No.7
-  LATIN8             ISO 8859-14 Latin Alphabet No.8
-  LATIN9             ISO 8859-15 Latin Alphabet No.9
-  LATIN10            ISO 8859-16 ASRO SR 14111 Latin Alphabet No.10
-  ISO-8859-5         ECMA-113 Latin/Cyrillic
-  ISO-8859-6         ECMA-114 Latin/Arabic
-  ISO-8859-7         ECMA-118 Latin/Greek
-  ISO-8859-8         ECMA-121 Latin/Hebrew
-  KOI8               KOI8-R(U)
-  WIN                Windows CP1251
-  ALT                Windows CP866
-  WIN1256            Arabic Windows CP1256
-  TCVN               Vietnamese TCVN-5712 (Windows CP1258)
-  WIN874             Thai Windows CP874
- .
- Important: Before PostgreSQL 7.2, LATIN5 mistakenly meant ISO 8859-5. From
- 7.2 on, LATIN5 means ISO 8859-9. If you have a LATIN5 database created on
- 7.1 or earlier and want to migrate to 7.2 (or later), you should be very
- careful about this change.
- .
- Important: Not all APIs supports all the encodings listed above. For
- example, the PostgreSQL JDBC driver does not support MULE_INTERNAL,
- LATIN6, LATIN8, and LATIN10.
- .
- We suggest that UNICODE is the best encoding to use if you cannot use
- SQL_ASCII, unless you have a particular requirement for some other
- encoding.
- .
- This encoding is the default to use when no other is specified. You can
- create a particular database with any encoding you wish.
-
 Template: postgresql/initdb/location
 Type: string
 _Default: /var/lib/postgres/data


--- orig/postgresql-7.4.6/debian/postinst.in
+++ mod/postgresql-7.4.6/debian/postinst.in
@@ -140,6 +140,88 @@
     fi
 }
 
+
+# return matching database encoding for PGLANG
+get_encoding () {
+    CHARSET=`env LANG=$PGLANG locale charmap`
+    MAP=$(grep -i $CHARSET <<EOF | cut -d' '  -f 1
+EUC_JP EUC-JP
+EUC_JP eucJP
+EUC_JP IBM-eucJP
+EUC_JP sdeckanji
+EUC_CN EUC-CN
+EUC_CN eucCN
+EUC_CN IBM-eucCN
+EUC_CN GB2312
+EUC_CN dechanzi
+EUC_KR EUC-KR
+EUC_KR eucKR
+EUC_KR IBM-eucKR
+EUC_KR deckorean
+EUC_KR 5601
+EUC_TW EUC-TW
+EUC_TW eucTW
+EUC_TW IBM-eucTW
+EUC_TW cns11643
+UTF8 UTF-8
+UTF8 utf8
+LATIN1 ISO-8859-1
+LATIN1 ISO8859-1
+LATIN1 iso88591
+LATIN2 ISO-8859-2
+LATIN2 ISO8859-2
+LATIN2 iso88592
+LATIN3 ISO-8859-3
+LATIN3 ISO8859-3
+LATIN3 iso88593
+LATIN4 ISO-8859-4
+LATIN4 ISO8859-4
+LATIN4 iso88594
+LATIN5 ISO-8859-9
+LATIN5 ISO8859-9
+LATIN5 iso88599
+LATIN6 ISO-8859-10
+LATIN6 ISO8859-10
+LATIN6 iso885910
+LATIN7 ISO-8859-13
+LATIN7 ISO8859-13
+LATIN7 iso885913
+LATIN8 ISO-8859-14
+LATIN8 ISO8859-14
+LATIN8 iso885914
+LATIN9 ISO-8859-15
+LATIN9 ISO8859-15
+LATIN9 iso885915
+LATIN10 ISO-8859-16
+LATIN10 ISO8859-16
+LATIN10 iso885916
+ISO_8859_5 ISO-8859-5
+ISO_8859_5 ISO8859-5
+ISO_8859_5 iso88595
+ISO_8859_6 ISO-8859-6
+ISO_8859_6 ISO8859-6
+ISO_8859_6 iso88596
+ISO_8859_7 ISO-8859-7
+ISO_8859_7 ISO8859-7
+ISO_8859_7 iso88597
+ISO_8859_8 ISO-8859-8
+ISO_8859_8 ISO8859-8
+ISO_8859_8 iso88598
+WIN CP1251
+WIN1256 CP1256
+TCVN CP1258
+KOI8 KOI8-R
+ALT CP866
+EOF)
+    if [ -n "$MAP" ]; then
+	export ENCODING=$MAP
+    else
+	# fallback if no mapping is found
+	export ENCODING=UTF8
+    fi
+}
+
+
 # *** EXECUTION STARTS HERE ***
 SHELL=/bin/sh
 case "$1" in
@@ -310,20 +392,6 @@
     # There is no existing database structure
     db_get postgresql/settings/locale
     PGLANG=${RET:-${LANG:-C}}
-    db_get postgresql/settings/encoding
-    case ${RET:-per_locale} in
-    	per_locale)
-	    if [ $PGLANG = C ]
-	    then
-	    	ENCODING=SQL_ASCII
-	    else
-	        ENCODING=UNICODE
-	    fi
-	    ;;
-	*)
-	    ENCODING=$RET
-	    ;;
-    esac
 
     # initdb needs write permissions of the configuration files
     for f in /etc/postgresql/pg_hba.conf /etc/postgresql/pg_ident.conf /etc/postgresql/postgresql.conf /etc/postgresql/postmaster.conf
@@ -333,11 +401,15 @@
 	fi
     done
 
+    # Determine matching encoding
+    get_encoding
+
     # Install the PostgreSQL database files in ${PGDATA}
     cat <<EOI > $SCRIPTFILE
 #!/bin/sh
 cd ${PGHOME}
 . ./${PROFILE}
+unset LC_ALL LC_CTYPE LC_NUMERIC LC_TIME LC_COLLATE LC_MONETARY LC_MESSAGES LC_PAPER LC_NAME LC_ADDRESS LC_TELEPHONE LC_MEASUREMENT LC_IDENTIFICATION
 export LANG=$PGLANG
 
 initdb --encoding ${ENCODING} --pgdata ${PGDATA}



