Quick Links

Re: BUG #15160: planner overestimates number of rows in join when there are more than 200 rows coming from CTE

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Maxim Boguk <maxim(dot)boguk(at)gmail(dot)com>
Cc:	Alexey Ermakov <alexey(dot)ermakov(at)dataegret(dot)com>, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject:	Re: BUG #15160: planner overestimates number of rows in join when there are more than 200 rows coming from CTE
Date:	2018-09-25 20:11:15
Message-ID:	18632.1537906275@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-bugs pgsql-hackers

Maxim Boguk <maxim(dot)boguk(at)gmail(dot)com> writes:
> On Tue, Apr 17, 2018 at 5:15 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> =?utf-8?q?PG_Bug_reporting_form?= <noreply(at)postgresql(dot)org> writes:
>>> I'm wondering how planner estimates number of rows in that case:

>> See eqjoinsel_semi, particularly the change in behavior when it thinks
>> nd2 is or is not a default estimate.

> There are similar issue without CTE which look pretty weird:

Yeah, this is exactly the same case as Alexey's: as soon as eqjoinsel_semi
decides it's dealing with a default ndistinct estimate, it chickens out
and delivers a very middle-of-the-road selectivity (0.5, it looks like).
It's somewhat luck that the non-default path is giving you an accurate
estimate, but certainly there's no surprise in the default case being
way off.

I don't particularly want to make that logic more aggressive about
assuming it's calculating something real. The existing behavior was
put in to fix a clear bug in the other direction, see
/message-id/flat/201104112029.14738.uwe%40oss4u.com

However, while looking at this I had a bit of an epiphany. The inner-join
selectivity in the same cases is pretty much on-target, so is there any
way we could factor that in? Yes, there is: the size of the semijoin
output could not be more than the output of a plain inner join of the
same two relations. So it'd be legitimate to clamp our selectivity
estimate for the semijoin case to make it not more than the inner-join
estimate.

A little bit of hacking later, I have the attached patch. The bulk of
the patch is just refactoring to avoid repetitive information lookup
when we call both eqjoinsel_semi and eqjoinsel_inner. The actual
change is just to clamp eqjoinsel_semi's result, like this:

/*
* We should never estimate the output of a semijoin to be more
* rows than the equivalent inner join; it's obviously impossible
* for that to happen. The former is N1 * Psemi while the latter
* is N1 * N2 * Pinner, so we may clamp Psemi <= N2 * Pinner.
* Doing this is worthwhile because of the shakier estimation
* rules we use in eqjoinsel_semi, particularly in cases where it
* has to punt entirely.
*/
selec = Min(selec, inner_rel->rows * selec_inner);

That makes the funny behavior go away in both test cases shown in this
thread. I find one plan change in the regression tests, but it looks
reasonable enough (and checking the actual row counts shows that the
estimate moved closer to reality, not further away).

Now, there's a certain amount of garbage-in-garbage-out to this: if for
some reason the innerjoin selectivity is way off, this could do more to
hurt the semijoin estimate than to help it. But I think that generally
the semijoin numbers are much less reliable than the innerjoin numbers,
so mostly it ought to be no-change or a win.

I'll queue this up for review in the next CF.

regards, tom lane

Attachment	Content-Type	Size
clamp-semijoin-selectivity-1.patch	text/x-diff	21.8 KB

In response to

Re: BUG #15160: planner overestimates number of rows in join when there are more than 200 rows coming from CTE at 2018-09-25 11:24:39 from Maxim Boguk

Responses

Re: BUG #15160: planner overestimates number of rows in join when there are more than 200 rows coming from CTE at 2018-11-16 00:51:45 from Melanie Plageman

Browse pgsql-bugs by date

	From	Date	Subject
Next Message	Michael Paquier	2018-09-26 00:09:29	Re: BUG #15238: Sequence owner not updated when owning table is foreign
Previous Message	Dilip Kumar	2018-09-25 16:45:23	Re: BUG #15401: pg_dump error

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Andres Freund	2018-09-25 20:17:23	Re: Query is over 2x slower with jit=on
Previous Message	Andres Freund	2018-09-25 19:50:34	Re: Query is over 2x slower with jit=on