Lists: | pgsql-bugs |
---|
From: | pgsql-bugs(at)postgresql(dot)org |
---|---|
To: | pgsql-bugs(at)postgresql(dot)org |
Subject: | Bug #676: lower(), upper(), & initcap() do not work on utf-8 chars |
Date: | 2002-05-25 04:23:51 |
Message-ID: | 20020525042351.927C7475A77@postgresql.org |
Henry House (hajhouse(at)houseag(dot)com) reports a bug with a severity of 3
The lower the number the more severe it is.
Short Description
lower(), upper(), & initcap() do not work on utf-8 chars
Long Description
The string case manipulation functions lower(), upper(), & initcap()
have no effect on non-ASCII characters in the argument, such as , ,
, , etc. ASCII chars in the argument are properly up- or down-cased.
The database encoding is utf-8.
Sample Code
SELECT upper('');
No file was uploaded with this report
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | hajhouse(at)houseag(dot)com, pgsql-bugs(at)postgresql(dot)org |
Subject: | Re: Bug #676: lower(), upper(), & initcap() do not work on utf-8 chars |
Date: | 2002-05-25 04:56:06 |
Message-ID: | 23560.1022302566@sss.pgh.pa.us |
pgsql-bugs(at)postgresql(dot)org writes:
> The string case manipulation functions lower(), upper(), & initcap()
> have no effect on non-ASCII characters in the argument, such as , ,
> , , etc. ASCII chars in the argument are properly up- or down-cased.
> The database encoding is utf-8.
lower/upper-casing is driven by locale, not encoding.
Unfortunately you didn't mention anything about your locale setup...
regards, tom lane
From: | Henry House <hajhouse(at)houseag(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-bugs(at)postgresql(dot)org |
Subject: | Re: Bug #676: lower(), upper(), & initcap() do not work on utf-8 chars |
Date: | 2002-05-25 22:06:46 |
Message-ID: | 20020525220646.GA29258@wotan.hajhouse.org |
On Sat, May 25, 2002 at 12:56:06AM -0400, Tom Lane wrote:
> pgsql-bugs(at)postgresql(dot)org writes:
> > The string case manipulation functions lower(), upper(), & initcap()
> > have no effect on non-ASCII characters in the argument, such as �, �,
> > �, �, etc. ASCII chars in the argument are properly up- or down-cased.
> > The database encoding is utf-8.
>
> lower/upper-casing is driven by locale, not encoding.
>
> Unfortunately you didn't mention anything about your locale setup...
The server locale is en_US.utf-8. (At least I set it up as such when
installing PostgreSQL; I know no way to verify.) The server version is 7.2.1,
running on a IA32 and a DEC Alpha; both machines show the same behavior. Both
are Debian Linux. Perhaps the bug lies in the locale definition supplied by
Debian?
--
Henry House
The attached file is a digital signature. See <http://romana.hajhouse.org/pgp>
for information. My OpenPGP key: <http://romana.hajhouse.org/hajhouse.asc>.
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Henry House <hajhouse(at)houseag(dot)com> |
Cc: | pgsql-bugs(at)postgresql(dot)org |
Subject: | Re: Bug #676: lower(), upper(), & initcap() do not work on utf-8 chars |
Date: | 2002-05-26 03:25:04 |
Message-ID: | 2999.1022383504@sss.pgh.pa.us |
Henry House <hajhouse(at)houseag(dot)com> writes:
>> Unfortunately you didn't mention anything about your locale setup...
> The server locale is en_US.utf-8. (At least I set it up as such when
> installing PostgreSQL; I know no way to verify.) The server version is 7.2.1,
> running on a IA32 and a DEC Alpha; both machines show the same behavior. Both
> are Debian Linux. Perhaps the bug lies in the locale definition supplied by
> Debian?
Offhand I'd not necessarily expect an en_US locale to upcase/downcase
anything except a-z/A-Z. Perhaps you need to use a different locale.
I'd suggest taking this up with a locale expert, which I surely am
not.
regards, tom lane
From: | Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp> |
---|---|
To: | hajhouse(at)houseag(dot)com |
Cc: | tgl(at)sss(dot)pgh(dot)pa(dot)us, pgsql-bugs(at)postgresql(dot)org |
Subject: | Re: Bug #676: lower(), upper(), & initcap() do not work on |
Date: | 2002-05-28 01:07:59 |
Message-ID: | 20020528.100759.52176594.t-ishii@sra.co.jp |
> > lower/upper-casing is driven by locale, not encoding.
> >
> > Unfortunately you didn't mention anything about your locale setup...
>
> The server locale is en_US.utf-8. (At least I set it up as such when
> installing PostgreSQL; I know no way to verify.) The server version is 7.2.1,
> running on a IA32 and a DEC Alpha; both machines show the same behavior. Both
> are Debian Linux. Perhaps the bug lies in the locale definition supplied by
> Debian?
I don't think the current locale support code works with multibyte
encodings such as utf-8. See the thread titled "Bug #659:
lower()/upper() bug on" on pgsql-bugs and pgsql-hackers.
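
A minimal sketch of why a byte-at-a-time case routine leaves UTF-8 text untouched. It is an illustration in Python, not PostgreSQL's actual code: `bytewise_upper` is a hypothetical helper, and Python's `bytes.upper()` (which maps only ASCII a-z) stands in for a single-byte case function. A real locale-aware routine might also map some bytes above 127, which would still not be correct for multibyte sequences.

```python
# Illustration only: bytes.upper() maps just ASCII a-z, which mimics a
# per-byte, single-byte-oriented case routine. This is NOT PostgreSQL's code.

def bytewise_upper(text: str) -> str:
    """Upper-case the UTF-8 bytes of `text` one byte at a time."""
    raw = text.encode("utf-8")          # U+00E9 becomes the two bytes C3 A9
    return raw.upper().decode("utf-8")  # only bytes in a-z are changed


sample = "h\u00e9llo"  # "hello" with an accented e (U+00E9)

# ASCII letters are upcased; the two-byte sequence for U+00E9 is left as-is.
print(bytewise_upper(sample))

# Character-aware mapping, for comparison: U+00E9 becomes U+00C9.
print(sample.upper())
```

The first result matches the behavior reported in this thread: ASCII characters change case, non-ASCII characters do not.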
In the meantime, a workaround would be something like:
select convert(lower(convert('X', 'LATIN1')),'LATIN1','UNICODE');
That will convert utf-8 'X' to its lower case if you are sure that 'X'
could be converted to ISO-8859-1.
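
The same round trip can be sketched outside the database. This is a rough Python analogue under stated assumptions, not the server's implementation: `lower_via_latin1` is a hypothetical name, and Python's own case mapping stands in for a single-byte, locale-driven lower(). It shows the precondition mentioned above: the conversion step fails for any character that has no ISO-8859-1 representation.

```python
def lower_via_latin1(text: str) -> str:
    """Lower-case `text` by passing through a single-byte Latin-1 form."""
    # Step 1: convert to ISO-8859-1. Raises UnicodeEncodeError if `text`
    # contains a character outside Latin-1 (the workaround's precondition).
    latin1_bytes = text.encode("latin-1")

    # Step 2: case-map in the single-byte form. Every Latin-1 character
    # lower-cases to another Latin-1 character, so re-encoding is safe here.
    lowered = latin1_bytes.decode("latin-1").lower().encode("latin-1")

    # Step 3: convert the result back to Unicode text.
    return lowered.decode("latin-1")


print(lower_via_latin1("\u00c9COLE"))  # accented capital E (U+00C9) + "COLE"

try:
    lower_via_latin1("\u0141")  # U+0141 is not in ISO-8859-1
except UnicodeEncodeError as exc:
    print("cannot convert:", exc.reason)
```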
Of course the problem with this method is:
Someone has suggested a fix to me that uses utf-8 locales, but I'm worried
about the usage of utf-8 and am waiting for the test results with my
Japanese data.
--
Tatsuo Ishii