Lists: | Postg토토 사이트 순위SQL : Postg토토 사이트 순위SQL 메일 링리스트 : 2022-09-28 이후 PGSQL-BUGS 12:22 |
---|
From: | PG Bug reporting form <noreply(at)postgresql(dot)org> |
---|---|
To: | pgsql-bugs(at)lists(dot)postgresql(dot)org |
Cc: | zabetak(at)gmail(dot)com |
Subject: | BUG #17623: WHERE should be evaluated after FROM clause when operators may throw |
Date: | 2022-09-28 12:22:50 |
Message-ID: | 17623-2bef2f4ff7fe8051@postgresql.org |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | Postg토토 사이트 순위SQL : Postg토토 사이트 순위SQL 메일 링리스트 : 2022-09-28 이후 PGSQL-BUGS 12:22 |
The following bug has been logged on the website:
Bug reference: 17623
Logged by: Stamatis Zampetakis
Email address: zabetak(at)gmail(dot)com
PostgreSQL version: 14.5
Operating system: Debian 10.2.1-6
Description:
Steps to reproduce:
CREATE TABLE emp (empno INT, name VARCHAR, deptno INT);
INSERT INTO emp VALUES (0, 'Alex', 0);
INSERT INTO emp VALUES (10, 'Bob', 1);
CREATE TABLE dept (deptno INT);
INSERT INTO dept VALUES (1);
SELECT e.name
FROM emp e
INNER JOIN dept d ON e.deptno = d.deptno
WHERE (10 / e.empno) = 1
Actual output:
ERROR: division by zero
Expected output:
Bob
The error is caused since the filter condition in the WHERE clause is
evaluated before the join. Filter push-down is a very common and powerful
optimization but when there are operators in the WHERE clause that may throw
(such as division, cast, etc) this optimization is unsafe.
The SQL standard (Section 7.4 general rule 1) mandates that WHERE should be
applied to the result of FROM so in the case above pushing filters below the
join seems to violate the standard.
Citing the standard:
"If all optional clauses are omitted, then the result of the <table
expression> is the same as the result of the
<from clause>. Otherwise, each specified clause is applied to the result of
the previously specified clause
and the result of the <table expression> is the result of the application of
the last specified clause."
One of the optional clauses mentioned in the previous paragraph is the
<where clause>. There seems to be a clearly defined order between the <from
clause>, which includes inner joins, and the <where clause>.
From: | "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com> |
---|---|
To: | zabetak(at)gmail(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org |
Subject: | Re: BUG #17623: WHERE should be evaluated after FROM clause when operators may throw |
Date: | 2022-09-28 13:56:26 |
Message-ID: | CAKFQuwbRN3ji2jQzPqGeSpLyYCGg3WZnheQUjvXSN52VW-YKGg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-bugs |
On Wed, Sep 28, 2022 at 5:29 AM PG Bug reporting form <
noreply(at)postgresql(dot)org> wrote:
> The following bug has been logged on the website:
>
> Bug reference: 17623
> Logged by: Stamatis Zampetakis
> Email address: zabetak(at)gmail(dot)com
> PostgreSQL version: 14.5
> Operating system: Debian 10.2.1-6
> Description:
>
> Steps to reproduce:
> CREATE TABLE emp (empno INT, name VARCHAR, deptno INT);
> INSERT INTO emp VALUES (0, 'Alex', 0);
> INSERT INTO emp VALUES (10, 'Bob', 1);
>
> CREATE TABLE dept (deptno INT);
> INSERT INTO dept VALUES (1);
>
> SELECT e.name
> FROM emp e
> INNER JOIN dept d ON e.deptno = d.deptno
> WHERE (10 / e.empno) = 1
>
> Actual output:
> ERROR: division by zero
>
> Expected output:
> Bob
>
> The error is caused since the filter condition in the WHERE clause is
> evaluated before the join. Filter push-down is a very common and powerful
> optimization but when there are operators in the WHERE clause that may
> throw
> (such as division, cast, etc) this optimization is unsafe.
>
>
> The SQL standard (Section 7.4 general rule 1) mandates that WHERE should be
> applied to the result of FROM so in the case above pushing filters below
> the
> join seems to violate the standard.
The failure to document such a deviation from the standard can be
considered a bug but not the deviation itself. That is intentional. In
terms of trade-offs the current behavior seems reasonable. You'd need a
real example motivating a desire to make a change that will likely add
complexity and cost to every query most of which work just fine with
relevant clauses pushed down to restrict the volume of data that needs to
be joined.
David J.
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com> |
Cc: | zabetak(at)gmail(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org |
Subject: | Re: BUG #17623: WHERE should be evaluated after FROM clause when operators may throw |
Date: | 2022-09-28 14:37:38 |
Message-ID: | 3945739.1664375858@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-bugs |
"David G. Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com> writes:
>> The SQL standard (Section 7.4 general rule 1) mandates that WHERE should be
>> applied to the result of FROM so in the case above pushing filters below
>> the join seems to violate the standard.
> The failure to document such a deviation from the standard can be
> considered a bug but not the deviation itself. That is intentional. In
> terms of trade-offs the current behavior seems reasonable.
Enforcing such a restriction would be absolutely disastrous for the
performance of a large number of real-world queries. I doubt that
any other RDBMS takes that spec wording literally either.
I'd suggest something like
CASE WHEN e.empno = 0 THEN NULL ELSE 10 / e.empno END
if you need to program around the lack of guarantees about the
order of evaluation. A WITH clause can also be used as an
optimization fence when you need one.
BTW, this *is* documented, see for example
/docs/current/sql-expressions.html#SYNTAX-EXPRESS-EVAL
regards, tom lane
From: | Stamatis Zampetakis <zabetak(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, pgsql-bugs(at)lists(dot)postgresql(dot)org |
Subject: | Re: BUG #17623: WHERE should be evaluated after FROM clause when operators may throw |
Date: | 2022-09-28 15:44:25 |
Message-ID: | CAFQnWdYji9ShU-uaEip7_qQ4d1fvW-Q1Dpfon9HgY=xJDLL4eQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-bugs |
I agree that brutally disabling filter pushdown is out of question.
However, it may not be so bad in terms of performance if done selectively
only for operators that may raise an error.
If for instance you have a big conjunction and only one conjunct contains a
cast you can push everything below the join and keep only the cast on the
top.
There may be other ways to retain performance while being inline with
standard SQL semantics; I haven't given too much though about it.
I didn't find a relevant discussion in the archives so I thought it was
worth putting this down for people who may bump into this in the future.
I don't have a strong opinion if it is worth trying to fix this or not
cause I don't really know much about Postgres internals.
In any case, it may be good to document the current behavior somewhere.
Note that the workaround with CASE WHEN can address this simplistic example
but it can easily get very complicated if you want to perform a cast based
on the result of a complex join; every join condition has to become part of
the WHEN clause.
There are probably other workarounds such as wrapping the join in a sub
query, putting the risky computation in project, and the filtering in a
outer query but these syntactic rewritings may become meaningless in the
eyes of the query optimizer. I don't know what Postgres will do in this
case but I know other optimizers that will happily pushdown the filter all
the way down and on top of the scan.
All that to say that the SQL semantics that say that WHERE is applied on
the result of FROM are quite well-known so when exceptions come into play
things get tricky.
Apologies for the long message and many thanks for taking the time to look
into this.
On Wed, Sep 28, 2022, 4:37 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> "David G. Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com> writes:
> >> The SQL standard (Section 7.4 general rule 1) mandates that WHERE
> should be
> >> applied to the result of FROM so in the case above pushing filters below
> >> the join seems to violate the standard.
>
> > The failure to document such a deviation from the standard can be
> > considered a bug but not the deviation itself. That is intentional. In
> > terms of trade-offs the current behavior seems reasonable.
>
> Enforcing such a restriction would be absolutely disastrous for the
> performance of a large number of real-world queries. I doubt that
> any other RDBMS takes that spec wording literally either.
>
> I'd suggest something like
>
> CASE WHEN e.empno = 0 THEN NULL ELSE 10 / e.empno END
>
> if you need to program around the lack of guarantees about the
> order of evaluation. A WITH clause can also be used as an
> optimization fence when you need one.
>
> BTW, this *is* documented, see for example
>
>
> /docs/current/sql-expressions.html#SYNTAX-EXPRESS-EVAL
>
> regards, tom lane
>
From: | Stamatis Zampetakis <zabetak(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, pgsql-bugs(at)lists(dot)postgresql(dot)org |
Subject: | Re: BUG #17623: WHERE should be evaluated after FROM clause when operators may throw |
Date: | 2022-10-07 16:06:03 |
Message-ID: | CAFQnWdZdbQK8H1Rw=CHPFioSQne4v5vpj8Yc7yjRE6Vu1ETkLQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-bugs |
Just for the sake of completeness, (in case it was not clear in my previous
message), I would like to highlight that this problem is different from the
evaluation order of predicates in the WHERE clause, which is implementation
specific. Due to that, the documentation below (shared by Tom) does not
apply here.
/docs/current/sql-expressions.html#SYNTAX-EXPRESS-EVAL
If we want to document this deviation from the standard it should be done
in the page below:
/docs/current/queries-table-expressions.html#QUERIES-WHERE
Best,
Stamatis
On Wed, Sep 28, 2022 at 5:44 PM Stamatis Zampetakis <zabetak(at)gmail(dot)com>
wrote:
> I agree that brutally disabling filter pushdown is out of question.
> However, it may not be so bad in terms of performance if done selectively
> only for operators that may raise an error.
>
> If for instance you have a big conjunction and only one conjunct contains
> a cast you can push everything below the join and keep only the cast on the
> top.
>
> There may be other ways to retain performance while being inline with
> standard SQL semantics; I haven't given too much though about it.
>
> I didn't find a relevant discussion in the archives so I thought it was
> worth putting this down for people who may bump into this in the future.
>
> I don't have a strong opinion if it is worth trying to fix this or not
> cause I don't really know much about Postgres internals.
>
> In any case, it may be good to document the current behavior somewhere.
>
> Note that the workaround with CASE WHEN can address this simplistic
> example but it can easily get very complicated if you want to perform a
> cast based on the result of a complex join; every join condition has to
> become part of the WHEN clause.
>
> There are probably other workarounds such as wrapping the join in a sub
> query, putting the risky computation in project, and the filtering in a
> outer query but these syntactic rewritings may become meaningless in the
> eyes of the query optimizer. I don't know what Postgres will do in this
> case but I know other optimizers that will happily pushdown the filter all
> the way down and on top of the scan.
>
> All that to say that the SQL semantics that say that WHERE is applied on
> the result of FROM are quite well-known so when exceptions come into play
> things get tricky.
>
> Apologies for the long message and many thanks for taking the time to look
> into this.
>
>
>
> On Wed, Sep 28, 2022, 4:37 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
>> "David G. Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com> writes:
>> >> The SQL standard (Section 7.4 general rule 1) mandates that WHERE
>> should be
>> >> applied to the result of FROM so in the case above pushing filters
>> below
>> >> the join seems to violate the standard.
>>
>> > The failure to document such a deviation from the standard can be
>> > considered a bug but not the deviation itself. That is intentional. In
>> > terms of trade-offs the current behavior seems reasonable.
>>
>> Enforcing such a restriction would be absolutely disastrous for the
>> performance of a large number of real-world queries. I doubt that
>> any other RDBMS takes that spec wording literally either.
>>
>> I'd suggest something like
>>
>> CASE WHEN e.empno = 0 THEN NULL ELSE 10 / e.empno END
>>
>> if you need to program around the lack of guarantees about the
>> order of evaluation. A WITH clause can also be used as an
>> optimization fence when you need one.
>>
>> BTW, this *is* documented, see for example
>>
>>
>> /docs/current/sql-expressions.html#SYNTAX-EXPRESS-EVAL
>>
>> regards, tom lane
>>
>