Quick Links

Re: Proposed Query Planner TODO items

Lists:	pgsql-hackers

From:	Josh Berkus <josh(at)agliodbs(dot)com>
To:	pgsql-hackers(at)postgresql(dot)org
Subject:	Proposed Query Planner TODO items
Date:	2003-12-05 19:08:05
Message-ID:	200312051108.05868.josh@agliodbs.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

PG Folks,

What follows are a couple of proposed TODO items to make up for some of the
places our planner is weak compared to other leading databases.
Particularly, I'm personally concerned that as of 7.4.0 we would "fail" the
TPC benchmark even if someone sponsored us for it (see Issue #2 below).

I freely admit that I don't have the skill to implement either of these;
instead, I want them on the TODO list just so we don't lose track of them,
and just in case some new brilliant coder jumps into our community looking
for something to do.

1) MAINTAIN CORROLARY STATS ON FORIEGN KEYS

Summary: Keep correspondance statistics between FK columns.

Description: One of the areas of ongoing planner estimation problems
estimation of cross-table correspondence of column values. Indeed, as late
a 7.2.4 the WHERE EXISTS code just estimated a flat 50%.
While it would be practically impossible to maintain statistics between all
columns in a database that might possibly be compared, there is one class of
cross-table column comparisons which is both used heavily and is readily
identifiable: foriegn keys.
My proposal is to keep statistics on the correlation of values between the
key and foriegn key values in order to arrive at better estimates. Adapting
the newly committed pg_indexstats to track this as well seems to me to be the
easiest method, but I'll admit to not really understanding Manfried's code.

NOTE: This suggestion was dicussed on Hackers early last summer and received
general approval but somehow never ended up on the TODO list.

2) DEVELOP BETTER PLANS FOR "OR GROUP" QUERIES

Summary: Currently, queries with complex "or group" criteria get devolved by
the planner into canonical and-or filters resulting in very poor execution on
large data sets. We should find better ways of dealing with these queries,
for example UNIONing.

Description: While helping OSDL with their derivative TPC-R benchmark, we ran
into a query (#19) which took several hours to complete on PostgreSQL. It
was in the general form:

SELECT t1.a, t2.b
FROM t1, t2
WHERE t1.a = t2.a
AND (
( t1.c = x
AND t1.f IN (m, n, o)
AND t2.d = v
AND t2.e BETWEEN j AND k
)
OR
( t1.c = y
AND t1.f IN (n, o, p)
AND t2.d = v
AND t2.e BETWEEN k AND h
)
OR
( t1.c = z
AND t1.f IN (p, q)
AND t2.d = w
AND t2.e BETWEEN k AND h
)
)

The reason why this query is included in the TPC benchmarks is the reason I've
run into problems with similar querys before; it is the kind of query
produced by many 3rd-party decision-support and reporting applications. Its
distinguishing feature is the same thing which gives PG indigestion; the
distinct OR groups with a complex set of criteria for each.

Or planner's approach to this sort of query is to devolve the criteria into a
3-page long set of canonical and-or filters, and seq scan the entire
underlying data set. This is fine if the data set is small, but if it's
several times the size of RAM, a full-table seq scan is fatal, as it is for
TPC-R which seems specifically designed to test for this kind of failure.

One solution which suggests itself is that the following query form runs in a
couple of seconds:

SELECT t1.a, t2.b
FROM t1, t2
WHERE t1.a = t2.a
AND t1.c = x
AND t1.f IN (m, n, o)
AND t2.d = v
AND t2.e BETWEEN j AND k
UNION ALL
SELECT t1.a, t2.b
FROM t1, t2
WHERE t1.a = t2.a
AND t1.c = y
AND t1.f IN (n, o, p)
AND t2.d = v
AND t2.e BETWEEN k AND h
UNION ALL
SELECT t1.a, t2.b
FROM t1, t2
AND t1.c = z
AND t1.f IN (p, q)
AND t2.d = w
AND t2.e BETWEEN k AND h

So the trick would be teaching the planner to:
a) recognize an "or group query" when it sees one;
b) break down that query into a multi-part union and estimate the cost

However, I'm sure there are other possible solutions. Oracle and MSSQL have
solved this particular query problem; anybody know how they do it?

--
Josh Berkus
Aglio Database Solutions
San Francisco

From:	Josh Berkus <josh(at)agliodbs(dot)com>
To:	johnnnnnn <john(at)phaedrusdeinus(dot)org>
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Proposed Query Planner TODO items
Date:	2003-12-05 20:14:37
Message-ID:	200312051214.37204.josh@agliodbs.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

John,

> > SELECT t1.a, t2.b
> > FROM t1, t2
> > WHERE t1.a = t2.a
> > AND t1.c = x
> > AND t1.f IN (m, n, o)
> > AND t2.d = v
> > AND t2.e BETWEEN j AND k
> > UNION ALL

> Shouldn't that be "UNION" instead of "UNION ALL"? You don't want
> duplicate rows, if i'm not mistaken.

Yes, you're correct; I copied UNION ALL from a test case which was not
generic. In general, one would want UNION.

--
-Josh Berkus
Aglio Database Solutions
San Francisco

From:	Greg Stark <gsstark(at)mit(dot)edu>
To:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Proposed Query Planner TODO items
Date:	2003-12-05 20:41:26
Message-ID:	87y8tr0zgp.fsf@stark.dyndns.tv
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

I know Oracle is capable of producing the UNION plan. but I don't know if
that's the only option. I'm curious what indexes the rewritten union-based
query used.

Josh Berkus <josh(at)agliodbs(dot)com> writes:

> SELECT t1.a, t2.b
> FROM t1, t2
> WHERE t1.a = t2.a
> AND (
> ( t1.c = x
> AND t1.f IN (m, n, o)
> AND t2.d = v
> AND t2.e BETWEEN j AND k
> )
> OR
> ( t1.c = y
> AND t1.f IN (n, o, p)
> AND t2.d = v
> AND t2.e BETWEEN k AND h
> )
> OR
> ( t1.c = z
> AND t1.f IN (p, q)
> AND t2.d = w
> AND t2.e BETWEEN k AND h
> )
> )

In this case it seems like it might be possible to look for a covering set
that is guaranteed to include all the records and doesn't include any ORs. If
that covering set can be scanned quickly then the complex conditions could be
tested on the resulting records individually.

In this case it would be something like

select t1.a,t2.b from t1,t2 where t1.a = t2.a
and ( t1.c in (x,y,z)
and t1.f in (m,n,o,p,q)
and t2.d in (v,w)
and t2.e between min(j,k) and max(k,h)
)
and (.... the above constraints...)

It seems like it would be a lot of work and only help in narrow cases though.

--
greg

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Josh Berkus <josh(at)agliodbs(dot)com>
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Proposed Query Planner TODO items
Date:	2003-12-09 15:50:59
Message-ID:	29947.1070985059@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Josh Berkus <josh(at)agliodbs(dot)com> writes:
> Summary: Currently, queries with complex "or group" criteria get devolved by
> the planner into canonical and-or filters resulting in very poor execution on
> large data sets. We should find better ways of dealing with these queries,
> for example UNIONing.

Could we see the actual present query plans for both the TPC-R query
and the UNION version? (I'll settle for "explain" on the slow
version, but "explain analyze" on the other, please.)

In general I am suspicious of proposals to rewrite queries into UNION
"equivalents", because the "equivalent" usually isn't exactly
equivalent, at least not without conditions that the planner can't
easily prove. This proposal looks a lot like the KSQO optimization that
we put in and then took out again several years ago --- it also rewrote
queries into a UNION form, only the UNION didn't necessarily produce
identical results.

I am thinking that the guys who do this query fast are probably
extracting single-relation subsets of the big OR/AND clause, so that
they can do some filtering of the input tables before the join. Our
existing planner would think that the OR/AND clause is only usable at
the join step, which is why it's seqscanning. But if we pulled out
subsets, we could have for instance

WHERE t1.a = t2.a
AND (
( t1.c = x
AND t1.f IN (m, n, o)
AND t2.d = v
AND t2.e BETWEEN j AND k
)
OR
( t1.c = y
AND t1.f IN (n, o, p)
AND t2.d = v
AND t2.e BETWEEN k AND h
)
OR
( t1.c = z
AND t1.f IN (p, q)
AND t2.d = w
AND t2.e BETWEEN k AND h
)
)
AND ( t1.c = x OR t1.c = y OR t1.c = z )

which is redundant, but that last clause could enable an indexscan on t1.c.

However ... the planner has code in it already that should do something
close to that, so there may be something I am missing. Again, could we
see EXPLAIN results?

regards, tom lane

From:	Josh Berkus <josh(at)agliodbs(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Proposed Query Planner TODO items
Date:	2003-12-09 17:25:54
Message-ID:	200312090925.54413.josh@agliodbs.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Tom,

> In general I am suspicious of proposals to rewrite queries into UNION
> "equivalents", because the "equivalent" usually isn't exactly
> equivalent, at least not without conditions that the planner can't
> easily prove.

As I said, I'm not sure that UNIONing the query is the solution, we just need
something other than what the planner currently does, which does not
complete.

Explains later today.

--
Josh Berkus
Aglio Database Solutions
San Francisco

From:	Josh Berkus <josh(at)agliodbs(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Proposed Query Planner TODO items
Date:	2003-12-19 18:14:29
Message-ID:	200312191014.29558.josh@agliodbs.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Tom,

> Could we see the actual present query plans for both the TPC-R query
> and the UNION version? (I'll settle for "explain" on the slow
> version, but "explain analyze" on the other, please.)

I'm not going to be able to set this up. I just had to put my server into
cold storage due to dismantling my office, and running the TPC stuff on my
laptop is a joke.

I'll contact the OSDL folks to see if they can run it.

--
Josh Berkus
Aglio Database Solutions
San Francisco

From:	"Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To:	Josh Berkus <josh(at)agliodbs(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Proposed Query Planner TODO items
Date:	2003-12-19 21:06:37
Message-ID:	3FE3685D.3060304@commandprompt.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

>I'm not going to be able to set this up. I just had to put my server into
>cold storage due to dismantling my office, and running the TPC stuff on my
>laptop is a joke.
>
>I'll contact the OSDL folks to see if they can run it.
>
>
>
We can... depending on what you need for a server.

--
Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC and S/JDBC
Postgresql support, programming shared hosting and dedicated hosting.
+1-503-667-4564 - jd(at)commandprompt(dot)com - http://www.commandprompt.com
Mammoth PostgreSQL Replicator. Integrated Replication for PostgreSQL

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Josh Berkus <josh(at)agliodbs(dot)com>
Cc:	pgsql-hackers(at)postgresql(dot)org, Mark Wong <markw(at)osdl(dot)org>
Subject:	Re: Proposed Query Planner TODO items
Date:	2004-01-05 16:00:30
Message-ID:	10185.1073318430@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Josh Berkus <josh(at)agliodbs(dot)com> writes:
> 2) DEVELOP BETTER PLANS FOR "OR GROUP" QUERIES

> Summary: Currently, queries with complex "or group" criteria get devolved by
> the planner into canonical and-or filters resulting in very poor execution on
> large data sets. We should find better ways of dealing with these queries,
> for example UNIONing.

> Description: While helping OSDL with their derivative TPC-R benchmark, we ran
> into a query (#19) which took several hours to complete on PostgreSQL.

I've made some progress on this over the last week or two. Would it be
possible to retry that benchmark with CVS tip?

regards, tom lane

From:	markw(at)osdl(dot)org
To:	tgl(at)sss(dot)pgh(dot)pa(dot)us
Cc:	josh(at)agliodbs(dot)com, pgsql-hackers(at)postgresql(dot)org, jenny zhang <jenny(at)osdl(dot)org>, Mary Edie Meredith <maryedie(at)osdl(dot)org>
Subject:	Re: Proposed Query Planner TODO items
Date:	2004-01-05 16:17:02
Message-ID:	200401051617.i05GH5M14999@mail.osdl.org
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 5 Jan, Tom Lane wrote:
> Josh Berkus <josh(at)agliodbs(dot)com> writes:
>> 2) DEVELOP BETTER PLANS FOR "OR GROUP" QUERIES
>
>> Summary: Currently, queries with complex "or group" criteria get devolved by
>> the planner into canonical and-or filters resulting in very poor execution on
>> large data sets. We should find better ways of dealing with these queries,
>> for example UNIONing.
>
>> Description: While helping OSDL with their derivative TPC-R benchmark, we ran
>> into a query (#19) which took several hours to complete on PostgreSQL.
>
> I've made some progress on this over the last week or two. Would it be
> possible to retry that benchmark with CVS tip?

Yeah, no problem. We'll pull the code from CVS and give it a try.

Mark

From:	Josh Berkus <josh(at)agliodbs(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Proposed Query Planner TODO items
Date:	2004-01-06 19:21:01
Message-ID:	200401061121.01937.josh@agliodbs.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Tom,

> I've made some progress on this over the last week or two. Would it be
> possible to retry that benchmark with CVS tip?

Yes! I'll just need some time to get my laptop set up for running it. My
server is, alas, in storage due to me being "between offices".

--
-Josh Berkus
Aglio Database Solutions
San Francisco

From:	markw(at)osdl(dot)org
To:	tgl(at)sss(dot)pgh(dot)pa(dot)us
Cc:	josh(at)agliodbs(dot)com, pgsql-hackers(at)postgresql(dot)org, jenny zhang <jenny(at)osdl(dot)org>
Subject:	Re: Proposed Query Planner TODO items
Date:	2004-02-06 19:35:08
Message-ID:	200402061935.i16JZMN16731@mail.osdl.org
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Sorry it's taking so long. I tried to take a export from CVS today and
the database appears not to be able to connect to the postmaster when I
attempt to create the database. Let me know if getting a trace of
anything will help, if you guys already aren't already aware of the
problem.

Mark

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	markw(at)osdl(dot)org
Cc:	josh(at)agliodbs(dot)com, pgsql-hackers(at)postgresql(dot)org, jenny zhang <jenny(at)osdl(dot)org>
Subject:	Re: Proposed Query Planner TODO items
Date:	2004-02-06 19:58:52
Message-ID:	9174.1076097532@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

markw(at)osdl(dot)org writes:
> Sorry it's taking so long. I tried to take a export from CVS today and
> the database appears not to be able to connect to the postmaster when I
> attempt to create the database. Let me know if getting a trace of
> anything will help, if you guys already aren't already aware of the
> problem.

CVS tip is not broken to my knowledge. Details please?

regards, tom lane

From:	markw(at)osdl(dot)org
To:	tgl(at)sss(dot)pgh(dot)pa(dot)us
Cc:	josh(at)agliodbs(dot)com, pgsql-hackers(at)postgresql(dot)org, jenny(at)osdl(dot)org
Subject:	Re: Proposed Query Planner TODO items
Date:	2004-02-06 20:56:20
Message-ID:	200402062056.i16KuNN31407@mail.osdl.org
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 6 Feb, Tom Lane wrote:
> markw(at)osdl(dot)org writes:
>> Sorry it's taking so long. I tried to take a export from CVS today and
>> the database appears not to be able to connect to the postmaster when I
>> attempt to create the database. Let me know if getting a trace of
>> anything will help, if you guys already aren't already aware of the
>> problem.
>
> CVS tip is not broken to my knowledge. Details please?

I ran this:

$ strace -o /tmp/initdb-7.5.out initdb -D /opt/pgdb/dbt2
The files belonging to this database system will be owned by user "markw".
This user must also own the server process.

The database cluster will be initialized with locale C.

creating directory /opt/pgdb/dbt2 ... ok
creating directory /opt/pgdb/dbt2/global ... ok
creating directory /opt/pgdb/dbt2/pg_xlog ... ok
creating directory /opt/pgdb/dbt2/pg_clog ... ok
creating directory /opt/pgdb/dbt2/base ... ok
creating directory /opt/pgdb/dbt2/base/1 ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 1000
creating configuration files ... ok
creating template1 database in /opt/pgdb/dbt2/base/1 ... ERROR: relnatts disagrees with indnatts for index 16601
initdb: child process exited with exit code 1
initdb: failed
initdb: removing data directory "/opt/pgdb/dbt2"

I've never seen this relnatts and indnatts disagreements message before.
I'll attach a compressed strace.

Thanks,
Mark

Attachment	Content-Type	Size
initdb-7.5.out.gz	application/octet-stream	32.0 KB

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	markw(at)osdl(dot)org
Cc:	josh(at)agliodbs(dot)com, pgsql-hackers(at)postgresql(dot)org, jenny(at)osdl(dot)org
Subject:	Re: Proposed Query Planner TODO items
Date:	2004-02-06 21:08:15
Message-ID:	12485.1076101695@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

markw(at)osdl(dot)org writes:
> creating template1 database in /opt/pgdb/dbt2/base/1 ... ERROR: relnatts disagrees with indnatts for index 16601

Wow, that's a bizarre one. Are you sure you did a clean rebuild?
I usually like to do "make distclean" before or after "cvs update";
it tends to save me a lot of wasted time chasing build inconsistencies.
Which is what I suspect this is.

FWIW, my last CVS pull was yesterday about 15:00 EST, and it works fine.

regards, tom lane

From:	markw(at)osdl(dot)org
To:	tgl(at)sss(dot)pgh(dot)pa(dot)us
Cc:	josh(at)agliodbs(dot)com, pgsql-hackers(at)postgresql(dot)org, jenny(at)osdl(dot)org
Subject:	Re: Proposed Query Planner TODO items
Date:	2004-02-06 21:26:39
Message-ID:	200402062126.i16LQgN05986@mail.osdl.org
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 6 Feb, Tom Lane wrote:
> markw(at)osdl(dot)org writes:
>> creating template1 database in /opt/pgdb/dbt2/base/1 ... ERROR: relnatts disagrees with indnatts for index 16601
>
> Wow, that's a bizarre one. Are you sure you did a clean rebuild?
> I usually like to do "make distclean" before or after "cvs update";
> it tends to save me a lot of wasted time chasing build inconsistencies.
> Which is what I suspect this is.
>
> FWIW, my last CVS pull was yesterday about 15:00 EST, and it works fine.

Well, that make distclean did the trick. I actually did an export this
morning, not a checkout, but not like that should matter. Ok, will
hopefully get back with results soon.

Thanks,
Mark