Re: UNIQUE null treatment option

Lists: Postg토토 사이트SQL
From: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: UNIQUE null treatment option
Date: 2021-08-27 12:38:34
Message-ID: 84e5ee1b-387e-9a54-c326-9082674bde78@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: 503 토토 베이 페치 실패


The SQL standard has been ambiguous about whether null values in
unique constraints should be considered equal or not. Different
implementations have different behaviors. In the SQL:202x draft, this
has been formalized by making this implementation-defined and adding
an option on unique constraint definitions UNIQUE [ NULLS [NOT]
DISTINCT ] to choose a behavior explicitly.

This patch adds this option to PostgreSQL. The default behavior
remains UNIQUE NULLS DISTINCT. Making this happen in the btree code
is apparently pretty easy; most of the patch is just to carry the flag
around to all the places that need it.

The CREATE UNIQUE INDEX syntax extension is not from the standard,
it's my own invention.

(I named all the internal flags, catalog columns, etc. in the
negative ("nulls not distinct") so that the default PostgreSQL
behavior is the default if the flag is false. But perhaps the double
negatives make some code harder to read.)

Attachment Content-Type Size
0001-Add-UNIQUE-null-treatment-option.patch text/plain 52.8 KB

From: Marko Tiikkaja <marko(at)joh(dot)to>
To: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: UNIQUE null treatment option
Date: 2021-08-27 12:44:37
Message-ID: CAL9smLAP5D3mJ6ijSO-M=459_pSS2d2Eo1vLxAMQi0rD1U1jUg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Aug 27, 2021 at 3:38 PM Peter Eisentraut <
peter(dot)eisentraut(at)enterprisedb(dot)com> wrote:

> In the SQL:202x draft, this
> has been formalized by making this implementation-defined and adding
> an option on unique constraint definitions UNIQUE [ NULLS [NOT]
> DISTINCT ] to choose a behavior explicitly.
>
> The CREATE UNIQUE INDEX syntax extension is not from the standard,
> it's my own invention.
>

For the unique index syntax, should this be selectable per
column/expression, rather than for the entire index as a whole?

.m


From: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>
To: Marko Tiikkaja <marko(at)joh(dot)to>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: UNIQUE null treatment option
Date: 2021-09-07 11:17:40
Message-ID: 29632a11-25da-c073-4e89-3eec6315b102@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 27.08.21 14:44, Marko Tiikkaja wrote:
> On Fri, Aug 27, 2021 at 3:38 PM Peter Eisentraut
> <peter(dot)eisentraut(at)enterprisedb(dot)com
> <mailto:peter(dot)eisentraut(at)enterprisedb(dot)com>> wrote:
>
> In the SQL:202x draft, this
> has been formalized by making this implementation-defined and adding
> an option on unique constraint definitions UNIQUE [ NULLS [NOT]
> DISTINCT ] to choose a behavior explicitly.
>
> The CREATE UNIQUE INDEX syntax extension is not from the standard,
> it's my own invention.
>
>
> For the unique index syntax, should this be selectable per
> column/expression, rather than for the entire index as a whole?

Semantically, this would be possible, but the bookkeeping to make it
work seems out of proportion with the utility. And you'd have the
unique index syntax out of sync with the unique constraint syntax, which
would be pretty confusing.


From: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: UNIQUE null treatment option
Date: 2021-12-29 10:06:41
Message-ID: ac7f9f56-acf6-dfd2-2295-56eff1bbc935@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Here is a rebased version of this patch.

On 27.08.21 14:38, Peter Eisentraut wrote:
> The SQL standard has been ambiguous about whether null values in
> unique constraints should be considered equal or not.  Different
> implementations have different behaviors.  In the SQL:202x draft, this
> has been formalized by making this implementation-defined and adding
> an option on unique constraint definitions UNIQUE [ NULLS [NOT]
> DISTINCT ] to choose a behavior explicitly.
>
> This patch adds this option to PostgreSQL.  The default behavior
> remains UNIQUE NULLS DISTINCT.  Making this happen in the btree code
> is apparently pretty easy; most of the patch is just to carry the flag
> around to all the places that need it.
>
> The CREATE UNIQUE INDEX syntax extension is not from the standard,
> it's my own invention.
>
> (I named all the internal flags, catalog columns, etc. in the
> negative ("nulls not distinct") so that the default PostgreSQL
> behavior is the default if the flag is false.  But perhaps the double
> negatives make some code harder to read.)

Attachment Content-Type Size
v2-0001-Add-UNIQUE-null-treatment-option.patch text/plain 52.6 KB

From: Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com>
To: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: UNIQUE null treatment option
Date: 2022-01-13 13:21:06
Message-ID: CALT9ZEF_X_OkKnN0x4dbTbJtvkhM29h3LCStSYGUaBcTTfmMDw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

+1 for commiting this feature. Consider this useful.

--
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com <http://www.postgrespro.com>


From: Maxim Orlov <orlovmg(at)gmail(dot)com>
To: Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: UNIQUE null treatment option
Date: 2022-01-13 15:51:18
Message-ID: CACG=ezYbFD0mODDkJZNpMPKKBjf9y_KuA8E142imSAnRwvRjnw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I find this patch useful. It includes changes in documentation and tests.
Code itself looks reasonable to me. Since, unique constraint check is done
by corresponding btree index, it makes this feature implementation
elegant and lightweight.

In my view, it is sufficient that heap relation can have different nulls
treatment in unique constraints for different unique columns. For example:
CREATE TABLE t (i INT UNIQUE NULLS DISTINCT, a INT UNIQUE NULLS NOT
DISTINCT);

All the tests are running ok on Linux and MacOS X.

Although, patch doesn't apply with default git apply options. Only with the
"three way merge" option (-3). Consider rebasing it, please. Then, in my
view, it can be "Ready for committer".
--
Best regards,
Maxim Orlov.


From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: UNIQUE null treatment option
Date: 2022-01-13 18:36:45
Message-ID: CAH2-Wz=K67yJSuF-VF+FHA7MhGdht2WohFyTmSJeFD0C9bOtSQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Dec 29, 2021 at 2:06 AM Peter Eisentraut
<peter(dot)eisentraut(at)enterprisedb(dot)com> wrote:
> Here is a rebased version of this patch.

BTScanInsertData.anynullkeys already effectively means "if the index
is a unique index, then we don't actually need to go through
_bt_check_unique(), or perform any other checkingunique steps". This
is really an instruction about what to do (or not do), based on the
specifics of the values for the insertion scan key plus the index
definition. In other words, the code in _bt_mkscankey() that sets up
BTScanInsertData (an insertion scankey) was written with the exact
requirements of btinsert() in mind -- nothing more.

I wonder if the logic for setting BTScanInsertData.anynullkeys inside
_bt_mkscankey() is the place to put your test for
rel->rd_index->indnullsnotdistinct -- not inside _bt_doinsert(). That
would probably necessitate renaming anynullkeys, but that's okay. This
feels more natural to me because a NULL key column in a NULLS NOT
DISTINCT unique constraint is very similar to a NULL non-key column in
an INCLUDE index, as far as our requirements go -- and so both cases
should probably be dealt with at the same point.

--
Peter Geoghegan


From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: UNIQUE null treatment option
Date: 2022-01-13 18:47:09
Message-ID: CAH2-Wzm+LwPHX=pB7i6PoTrQi=C_WbF-6yNYqrotX5mSc7s4nQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: Postg메이저 토토 사이트SQL

On Thu, Jan 13, 2022 at 10:36 AM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> I wonder if the logic for setting BTScanInsertData.anynullkeys inside
> _bt_mkscankey() is the place to put your test for
> rel->rd_index->indnullsnotdistinct -- not inside _bt_doinsert(). That
> would probably necessitate renaming anynullkeys, but that's okay. This
> feels more natural to me because a NULL key column in a NULLS NOT
> DISTINCT unique constraint is very similar to a NULL non-key column in
> an INCLUDE index, as far as our requirements go -- and so both cases
> should probably be dealt with at the same point.

Correction: I meant to write "...a NULL key column in a NULLS DISTINCT
unique constraint is very similar...".

--
Peter Geoghegan


From: Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: UNIQUE null treatment option
Date: 2022-01-13 19:01:49
Message-ID: CALT9ZEGAi4Sn_qvCVrdYAtZA=9uwAOXv2ft9R-iBJ2abRS7uew@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: Postg토토 결과SQL

>
> I wonder if the logic for setting BTScanInsertData.anynullkeys inside
> _bt_mkscankey() is the place to put your test for
> rel->rd_index->indnullsnotdistinct -- not inside _bt_doinsert(). That
> would probably necessitate renaming anynullkeys, but that's okay. This
> feels more natural to me because a NULL key column in a NULLS NOT
> DISTINCT unique constraint is very similar to a NULL non-key column in
> an INCLUDE index, as far as our requirements go -- and so both cases
> should probably be dealt with at the same point.
>

A good point, indeed!

--
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com <http://www.postgrespro.com>


From: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: UNIQUE null treatment option
Date: 2022-01-24 15:50:17
Message-ID: 87d6b906-d1cb-75c1-7f90-24b17c724369@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 13.01.22 19:36, Peter Geoghegan wrote:
> I wonder if the logic for setting BTScanInsertData.anynullkeys inside
> _bt_mkscankey() is the place to put your test for
> rel->rd_index->indnullsnotdistinct -- not inside _bt_doinsert(). That
> would probably necessitate renaming anynullkeys, but that's okay. This
> feels more natural to me because a NULL key column in a NULLS NOT
> DISTINCT unique constraint is very similar to a NULL non-key column in
> an INCLUDE index, as far as our requirements go -- and so both cases
> should probably be dealt with at the same point.

Makes sense. Here is an updated patch with this change.

I didn't end up renaming anynullkeys. I came up with names like
"anyalwaysdistinctkeys", but in the end that felt too abstract, and
moreover, it would require rewriting a bunch of code comments that refer
to null values in this context. Since as you wrote, anynullkeys is just
a local concern between two functions, this slight inaccuracy is perhaps
better than some highly general but unclear terminology.

Attachment Content-Type Size
v3-0001-Add-UNIQUE-null-treatment-option.patch text/plain 52.6 KB

From: Maxim Orlov <orlovmg(at)gmail(dot)com>
To: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>
Cc: Peter Geoghegan <pg(at)bowt(dot)ie>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: UNIQUE null treatment option
Date: 2022-01-25 10:05:17
Message-ID: CACG=ezaJj58_QGgrgg2MdTHRON6sXi7RVaW0KVhFxh6q7haPeg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: Postg윈 토토SQL :

Since cfbot did failed with error, probably, unrelated to the patch itself
(see https://cirrus-ci.com/task/5330150500859904)
and repeated check did not start automatically, I reattach patch v3 to
restart cfbot on this patch.

--
Best regards,
Maxim Orlov.

Attachment Content-Type Size
v3-0001-Add-UNIQUE-null-treatment-option.patch application/octet-stream 52.6 KB

From: Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com>
To: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>
Cc: Peter Geoghegan <pg(at)bowt(dot)ie>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: UNIQUE null treatment option
Date: 2022-01-28 12:56:11
Message-ID: CALT9ZEHV1uQZzLY=adSF5Bhfj3DaSLCZRmUJCSF=5F0EjV5cUw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

>
> Makes sense. Here is an updated patch with this change.
>
> I didn't end up renaming anynullkeys. I came up with names like
> "anyalwaysdistinctkeys", but in the end that felt too abstract, and
> moreover, it would require rewriting a bunch of code comments that refer
> to null values in this context. Since as you wrote, anynullkeys is just
> a local concern between two functions, this slight inaccuracy is perhaps
> better than some highly general but unclear terminology.

Agree with that. With the comment it is clear how it works.

I've looked at the patch v3. It seems good enough for me. CFbot tests have
also come green.
Suggest it is RFC now.

--
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com <http://www.postgrespro.com>


From: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>
To: Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com>
Cc: Peter Geoghegan <pg(at)bowt(dot)ie>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: UNIQUE null treatment option
Date: 2022-02-03 10:54:10
Message-ID: 12facb9f-541b-5a5c-040e-a5ca3c1dd4a3@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: Postg토토 사이트SQL

On 28.01.22 13:56, Pavel Borisov wrote:
> Makes sense.  Here is an updated patch with this change.
>
> I didn't end up renaming anynullkeys.  I came up with names like
> "anyalwaysdistinctkeys", but in the end that felt too abstract, and
> moreover, it would require rewriting a bunch of code comments that
> refer
> to null values in this context.  Since as you wrote, anynullkeys is
> just
> a local concern between two functions, this slight inaccuracy is
> perhaps
> better than some highly general but unclear terminology.
>
> Agree with that. With the comment it is clear how it works.
>
> I've looked at the patch v3. It seems good enough for me. CFbot tests
> have also come green.
> Suggest it is RFC now.

Committed. Thanks.