Lists: | pgsql-hackers |
---|
From: | Zeugswetter Andreas SB <ZeugswetterA(at)wien(dot)spardat(dot)at> |
---|---|
To: | "'Tom Lane'" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgreSQL(dot)org |
Subject: | AW: A fine point about OUTER JOIN semantics |
Date: | 2000-09-05 14:56:59 |
Message-ID: | 11C1E6749A55D411A9670001FA687963368065@sdexcsrv1.f000.d0188.sd.spardat.at |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
I can tell you the results Informix produces:
> Am I right in thinking that the WHERE clause of a query must logically
> be applied *after* any joins specified in the FROM clause?
>
> For example, suppose that we have table t1 (x int) containing the
> values 1, 2, 3, 4, and table t2 (y int) containing the values 1, 2, 4.
> It's clear that the result of
> SELECT * FROM t1 LEFT JOIN t2 ON (x = y);
> should be
> x y
>
> 1 1
> 2 2
> 3 NULL
> 4 4
same
>
> But suppose we make the query
> SELECT * FROM t1 LEFT JOIN t2 ON (x = y) WHERE y <> 2;
> It seems to me this should yield
> x y
>
> 1 1
> 3 NULL
> 4 4
>
> and not
> x y
>
> 1 1
> 2 NULL
> 3 NULL
> 4 4
x y
1 1
4 4
>
> which is what you'd get if the y=2 tuple were filtered out before
> reaching the left-join stage. Does anyone read the spec differently,
> or get the latter result from another implementation?
>
> The reason this is interesting is that this example breaks a rather
> fundamental assumption in our planner/optimizer, namely that WHERE
> conditions can be pushed down to the lowest level at which all the
> variables they mention are available. Thus the planner would normally
> apply "y <> 2" during its bottom-level scan of t2, which
> would cause the
> LEFT JOIN to decide that x = 2 is an unmatched value, and thus produce
> a "2 NULL" output row.
>
> An even more interesting example is
> SELECT * FROM t1 FULL JOIN t2 ON (x = y AND y <> 2);
> My interpretation is that this should produce
> x y
>
> 1 1
> 2 NULL
> NULL 2
> 3 NULL
> 4 4
x y
1 1
4 4
> since both t1's x=2 and t2's y=2 tuple will appear "unmatched".
> This is *not* the same output you'd get from
> SELECT * FROM t1 FULL JOIN t2 ON (x = y) WHERE y <> 2;
> which I think should yield
> x y
>
> 1 1
> 3 NULL
> 4 4
> This shows that JOIN/ON conditions for outer joins are not
> semantically
> interchangeable with WHERE conditions.
x y
1 1
4 4
>
> This is going to be a bit of work to fix, so I thought I'd better
> confirm that I'm reading the spec correctly before I dive into it.
No idea if they interpret correctly, but seems they hand it interchangeably.
Someone want to check Oracle and MS Sql ?
Andreas
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Zeugswetter Andreas SB <ZeugswetterA(at)wien(dot)spardat(dot)at> |
Cc: | pgsql-hackers(at)postgreSQL(dot)org |
Subject: | Re: AW: A fine point about OUTER JOIN semantics |
Date: | 2000-09-05 15:30:15 |
Message-ID: | 1251.968167815@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
Zeugswetter Andreas SB <ZeugswetterA(at)wien(dot)spardat(dot)at> writes:
>> But suppose we make the query
>> SELECT * FROM t1 LEFT JOIN t2 ON (x = y) WHERE y <> 2;
>> It seems to me this should yield
>> x y
>>
>> 1 1
>> 3 NULL
>> 4 4
> x y
> 1 1
> 4 4
Oh, my mistake, I forgot that the WHERE clause would filter out NULLs.
Try
SELECT * FROM t1 LEFT JOIN t2 ON (x = y) WHERE y <> 2 OR y IS NULL;
>> An even more interesting example is
>> SELECT * FROM t1 FULL JOIN t2 ON (x = y AND y <> 2);
>> My interpretation is that this should produce
>> x y
>>
>> 1 1
>> 2 NULL
>> NULL 2
>> 3 NULL
>> 4 4
> x y
> 1 1
> 4 4
Here I believe Informix is broken. Their result clearly does not
agree with the spec's definition of a FULL JOIN ... indeed it looks
exactly like an inner join.
regards, tom lane