AW: Yet another LIKE-indexing scheme

Lists:	pgsql-hackers

From:	Zeugswetter Andreas SB <ZeugswetterA(at)wien(dot)spardat(dot)at>
To:	"'Jules Bean'" <jules(at)jellybean(dot)co(dot)uk>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	AW: Yet another LIKE-indexing scheme
Date:	2000-09-06 15:19:46
Message-ID:	11C1E6749A55D411A9670001FA687963368069@sdexcsrv1.f000.d0188.sd.spardat.at

> On Sat, Sep 02, 2000 at 01:39:47PM -0400, Tom Lane wrote:
> > > So what happens with "WHERE name like 'Czec%'" ?
> >
> > Our existing code fails because it generates WHERE name >= 'Czec' AND
> > name < 'Czed'; it will therefore not find names beginning 'Czech'
> > because those are in another part of the index, between 'Czeh' and
> > 'Czei'. But WHERE name >= 'Cze' AND name < 'Czf' would work.
>
> (OK, I haven't read the previous discussion. Guilty, m'lud)
>
> Why should it? If 'ch' is one letter, then surely 'czech' isn't LIKE
> 'czec%'. Because 'czec%' has a second c, whereas 'czech' only has one
> 'c' and one 'ch'?

Indeed an interesting interpretation, but what I guess makes it bogus is
that words can exist that have an h after the c that do not represent the
ch character.

Andreas

From:	Jules Bean <jules(at)jellybean(dot)co(dot)uk>
To:	Zeugswetter Andreas SB <ZeugswetterA(at)wien(dot)spardat(dot)at>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Yet another LIKE-indexing scheme
Date:	2000-09-06 15:22:18
Message-ID:	20000906162217.H31824@grommit.office.vi.net

On Wed, Sep 06, 2000 at 05:19:46PM +0200, Zeugswetter Andreas SB wrote:
>
> > On Sat, Sep 02, 2000 at 01:39:47PM -0400, Tom Lane wrote:
> > > > So what happens with "WHERE name like 'Czec%'" ?
> > >
> > > Our existing code fails because it generates WHERE name >= 'Czec' AND
> > > name < 'Czed'; it will therefore not find names beginning 'Czech'
> > > because those are in another part of the index, between 'Czeh' and
> > > 'Czei'. But WHERE name >= 'Cze' AND name < 'Czf' would work.
> >
> > (OK, I haven't read the previous discussion. Guilty, m'lud)
> >
> > Why should it? If 'ch' is one letter, then surely 'czech' isn't LIKE
> > 'czec%'. Because 'czec%' has a second c, whereas 'czech' only has one
> > 'c' and one 'ch'?
>
> Indeed an interesting interpretation, but what I guess makes it bogus is
> that words can exist that have an h after the c that do not represent the
> ch character.

This is an excellent point.

But in that case, how is the collating system to cope? How can the
computer know which 'ch's are 'ch's and which are 'c''h's (IYSWIM)?

Jules
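
The range problem quoted above from Tom Lane can be illustrated with a small toy sketch. It is not PostgreSQL code: the alphabet, `collation_key`, and `in_range` are hypothetical names invented here, and the sketch assumes a simplified collation in which every 'ch' is one letter sorting between 'h' and 'i' and input is plain ASCII letters only. It deliberately does not address the question of 'c' followed by 'h' that is not the digraph.

```python
# Toy collation (an assumption for illustration, not a real locale):
# 'ch' is treated as a single letter that sorts between 'h' and 'i'.
ALPHABET = list("abcdefgh") + ["ch"] + list("ijklmnopqrstuvwxyz")
RANK = {letter: i for i, letter in enumerate(ALPHABET)}


def collation_key(s):
    """Map a string to a tuple of letter ranks, folding case.

    Every 'c' followed by 'h' is taken as the digraph; this toy cannot
    tell a real 'ch' from a separate 'c' and 'h'.
    """
    s = s.lower()
    key = []
    i = 0
    while i < len(s):
        if s[i:i + 2] == "ch":
            key.append(RANK["ch"])
            i += 2
        else:
            key.append(RANK[s[i]])
            i += 1
    return tuple(key)


def in_range(name, lo, hi):
    """True if lo <= name < hi under the toy collation."""
    return collation_key(lo) <= collation_key(name) < collation_key(hi)


# Range derived from LIKE 'Czec%' by bumping the last character:
print(in_range("Czech", "Czec", "Czed"))
# Where 'Czech' actually sorts under this collation:
print(in_range("Czech", "Czeh", "Czei"))
# The wider range obtained by dropping the last pattern character:
print(in_range("Czech", "Cze", "Czf"))
```

Under these assumptions the first check fails and the other two succeed, which matches the behaviour described in the quoted text: the narrow range misses 'Czech', while the range built from 'Cze' covers it.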