From: | Marko Kreen <markokr(at)gmail(dot)com> |
---|---|
To: | Scott Ribe <scott_ribe(at)elevated-dev(dot)com> |
Cc: | PostgreSQL general <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: finding bogus utf-8 |
Date: | 2011-02-15 21:21:16 |
Message-ID: | AANLkTi==bd28k_J2=Dg0kcLD_mMTrLByCGrS+PHk1U-s@mail.gmail.com |
Lists: | pgsql-general |
On Thu, Feb 10, 2011 at 9:02 PM, Scott Ribe <scott_ribe(at)elevated-dev(dot)com> wrote:
> I know that I have at least one instance of a varchar that is not valid utf-8, imported from a source with errors (AMA CPT files, actually) before PG's checking was as stringent as it is today. Can anybody suggest a query to find such values?
CREATE OR REPLACE FUNCTION is_utf8(text)
RETURNS bool AS $$
    try:
        args[0].decode('utf8')
        return True
    except UnicodeDecodeError:
        return False
$$ LANGUAGE plpythonu STRICT;
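
To try the same check outside the database first, here is a minimal standalone sketch in Python 3. It assumes the values are available as raw bytes. The sample byte strings are made up for illustration and are not taken from the CPT data.

```python
def is_utf8(raw: bytes) -> bool:
    """Return True if raw decodes cleanly as UTF-8, False otherwise."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Hypothetical sample values: valid UTF-8, Latin-1 bytes, plain ASCII,
# and a truncated multi-byte sequence.
samples = [b"caf\xc3\xa9", b"caf\xe9", b"plain ascii", b"\xc3"]

# Keep only the values that fail to decode.
bad = [s for s in samples if not is_utf8(s)]
print(bad)  # [b'caf\xe9', b'\xc3']
```

Inside the database, the function above would presumably be used along the lines of SELECT id FROM some_table WHERE NOT is_utf8(some_column); where some_table, some_column and id are placeholder names. Because the function is declared STRICT, NULL values yield NULL and are not reported.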
--
marko