Lists: | pgsql-bugs |
---|
From: | "Scott V" <datagenic(at)gmail(dot)com> |
---|---|
To: | pgsql-bugs(at)postgresql(dot)org |
Subject: | BUG #4451: initcap() function capitalizes incorrectly |
Date: | 2008-10-06 04:01:09 |
Message-ID: | 200810060401.m96419cn021991@wwwmaster.postgresql.org |
The following bug has been logged online:
Bug reference: 4451
Logged by: Scott V
Email address: datagenic(at)gmail(dot)com
PostgreSQL version: 8.3.1
Operating system: Mac OS X 10.5.4
Description: initcap() function capitalizes incorrectly
Details:
initcap() capitalizes incorrectly when passing strings containing certain
two-byte utf-8 characters. E.g., when argument = 'mātūrāte', initcap
returns 'MāTūRāTe'. Correct result should be 'Mātūrāte'.
The function appears to be incorrectly interpreting the two-byte chars as
non-alphanumeric characters. They are in fact alphanumeric; they just have
diacritical markings.
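To make the "two-byte" claim in the report concrete, here is a small Python sketch (not part of the original report) that inspects the two accented letters from the example. It assumes they are the precomposed code points U+0101 and U+016B, which is what a two-byte UTF-8 encoding implies; the helper name `describe` is made up for illustration.

```python
import unicodedata


def describe(ch):
    """Return (UTF-8 byte length, Unicode general category, is alphabetic)."""
    encoded = ch.encode("utf-8")
    return len(encoded), unicodedata.category(ch), ch.isalpha()


# U+0101 is a-macron and U+016B is u-macron, the accented letters in the example.
for ch in ("\u0101", "\u016b"):
    print(repr(ch), ch.encode("utf-8"), describe(ch))
```

Both letters are lowercase letters by Unicode's classification, and each occupies two bytes at or above 0x80 in UTF-8, so any classifier that only understands ASCII bytes will not see them as letters.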
From: | Magnus Hagander <magnus(at)hagander(dot)net> |
---|---|
To: | Scott V <datagenic(at)gmail(dot)com> |
Cc: | pgsql-bugs(at)postgresql(dot)org |
Subject: | Re: BUG #4451: initcap() function capitalizes incorrectly |
Date: | 2008-10-06 08:03:23 |
Message-ID: | 48E9C64B.7080306@hagander.net |
Scott V wrote:
> The following bug has been logged online:
>
> Bug reference: 4451
> Logged by: Scott V
> Email address: datagenic(at)gmail(dot)com
> PostgreSQL version: 8.3.1
> Operating system: Mac OS X 10.5.4
> Description: initcap() function capitalizes incorrectly
> Details:
>
> initcap() capitalizes incorrectly when passing strings containing certain
> two-byte utf-8 characters. E.g., when argument = 'mātūrāte', initcap
> returns 'MāTūRāTe'. Correct result should be 'Mātūrāte'.
>
> The function appears to be incorrectly interpreting the two-byte chars as
> non-alphanumeric characters. They are in fact alphanumeric; they just have
> diacritical markings.
What's your setting for lc_collate?
//Magnus
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Magnus Hagander <magnus(at)hagander(dot)net> |
Cc: | Scott V <datagenic(at)gmail(dot)com>, pgsql-bugs(at)postgresql(dot)org |
Subject: | Re: BUG #4451: initcap() function capitalizes incorrectly |
Date: | 2008-10-06 12:37:08 |
Message-ID: | 18902.1223296628@sss.pgh.pa.us |
Magnus Hagander <magnus(at)hagander(dot)net> writes:
> Scott V wrote:
>> PostgreSQL version: 8.3.1
>> Operating system: Mac OS X 10.5.4
>> initcap() capitalizes incorrectly when passing strings containing certain
>> two-byte utf-8 characters. E.g., when argument = 'mātūrāte', initcap
>> returns 'MāTūRāTe'. Correct result should be 'Mātūrāte'.
> What's your setting for lc_collate?
I think actually it's lc_ctype that determines case-folding. But the
current theory is that Apple's locale support is simply broken for
utf-8:
http://archives.postgresql.org/pgsql-general/2008-02/msg01072.php
which means that even if Scott had all his settings right, it wouldn't
work :-( A quick test on OS X here seems to confirm this.
regards, tom lane
From: | "Scott Vanderbilt" <datagenic(at)gmail(dot)com> |
---|---|
To: | "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | "Magnus Hagander" <magnus(at)hagander(dot)net>, pgsql-bugs(at)postgresql(dot)org |
Subject: | Re: BUG #4451: initcap() function capitalizes incorrectly |
Date: | 2008-10-06 15:50:11 |
Message-ID: | cac40f10810060850i66b74557ob07aad3315f98857@mail.gmail.com |
Not sure what the correct settings should be, but output from SHOW
ALL in psql says:
lc_collate C
lc_ctype C
On Mon, Oct 6, 2008 at 5:37 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Magnus Hagander <magnus(at)hagander(dot)net> writes:
>> Scott V wrote:
>>> PostgreSQL version: 8.3.1
>>> Operating system: Mac OS X 10.5.4
>
>>> initcap() capitalizes incorrectly when passing strings containing certain
>>> two-byte utf-8 characters. E.g., when argument = 'mātūrāte', initcap
>>> returns 'MāTūRāTe'. Correct result should be 'Mātūrāte'.
>
>> What's your setting for lc_collate?
>
> I think actually it's lc_ctype that determines case-folding. But the
> current theory is that Apple's locale support is simply broken for
> utf-8:
> http://archives.postgresql.org/pgsql-general/2008-02/msg01072.php
> which means that even if Scott had all his settings right, it wouldn't
> work :-( A quick test on OS X here seems to confirm this.
>
> regards, tom lane
>
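The lc_ctype value of C reported above is consistent with the observed output on its own, because a C-locale classifier recognizes only ASCII letters and digits. The Python sketch below is an illustration under that assumption, not PostgreSQL's actual implementation: `initcap`, `ascii_only_alnum`, and `unicode_alnum` are hypothetical names, and the function simply uppercases the first alphanumeric character after any non-alphanumeric one.

```python
def initcap(text, is_alnum):
    """Capitalize the first character of each alphanumeric run, lowercase the rest."""
    out = []
    prev_alnum = False
    for ch in text:
        if is_alnum(ch):
            out.append(ch.lower() if prev_alnum else ch.upper())
            prev_alnum = True
        else:
            # Anything the classifier rejects is copied through and ends the word.
            out.append(ch)
            prev_alnum = False
    return "".join(out)


def ascii_only_alnum(ch):
    """Assumed stand-in for a C-locale classifier: ASCII letters and digits only."""
    return ord(ch) < 128 and ch.isalnum()


def unicode_alnum(ch):
    """Classifier that accepts any Unicode letter or digit."""
    return ch.isalnum()


word = "m\u0101t\u016br\u0101te"  # the example word from the bug report
print(initcap(word, ascii_only_alnum))
print(initcap(word, unicode_alnum))
```

With the ASCII-only classifier every accented letter acts as a word break, so the letter after it is capitalized, which reproduces the reported result. With the Unicode-aware classifier the whole string is one word and only the first letter is capitalized.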
From: | Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> |
---|---|
To: | Scott Vanderbilt <datagenic(at)gmail(dot)com> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Magnus Hagander <magnus(at)hagander(dot)net>, pgsql-bugs(at)postgresql(dot)org |
Subject: | Re: BUG #4451: initcap() function capitalizes incorrectly |
Date: | 2008-10-06 16:01:46 |
Message-ID: | 48EA366A.8050601@enterprisedb.com |
Scott Vanderbilt wrote:
> Not sure what the correct settings should be, but output from SHOW
> ALL in psql says:
>
> lc_collate C
> lc_ctype C
There's a chapter on locale support in the user manual:
http://www.postgresql.org/docs/8.3/interactive/locale.html
The right setting depends on what language's collation rules you want to
follow. "locale -a" in a shell should list the available options.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com