Re: Insert works but fails for merge

Lists: pgsql-general
From: yudhi s <learnerdatabase99(at)gmail(dot)com>
To: pgsql-general <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Insert works but fails for merge
Date: 2024-08-09 21:13:28
Message-ID: CAEzWdqc=QKpfFU3jNSoBrZ37y727GK2w2rSQ2qJCEGJDT5Pm6Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Hello,
It's version 15.4 postgres. Where we have an insert working fine, but then
a similar insert with the same 'timestamp' value, when trying to be
executed through merge , it fails stating "You will need to rewrite or cast
the expression.". Why so?

*Example:-*
https://dbfiddle.uk/j5S7br-q

CREATE TABLE tab1 (
id varchar(100) ,
mid INT,
txn_timestamp TIMESTAMPTZ NOT NULL,
cre_ts TIMESTAMPTZ NOT NULL
) PARTITION BY RANGE (txn_timestamp);

CREATE TABLE tab1_2024_08_09 PARTITION OF tab1
FOR VALUES FROM ('2024-08-09 00:00:00') TO ('2024-08-10 00:00:00');

-- Below insert works fine
INSERT INTO tab1
(id, mid, txn_timestamp, cre_ts)
VALUES
('5efd4c91-ef93-4477-840c-a723ae212d84', 123,
'2024-08-09T11:33:49.402585600Z', '2024-08-09T11:33:49.402585600Z');

-- Below merge , which trying to insert similar row but failing

WITH source_data (id, mid, txn_timestamp, cre_ts) AS (
VALUES ('5efd4c91-ef93-4477-840c-a723ae212d66', 123,
'2024-08-09T11:33:49.402585600Z', '2024-08-09T11:33:49.402585600Z')
)
INSERT INTO tab1 (id, mid, txn_timestamp, cre_ts)
SELECT id, mid, txn_timestamp, cre_ts
FROM source_data
ON CONFLICT (id) DO UPDATE
SET mid = EXCLUDED.mid,
txn_timestamp = EXCLUDED.txn_timestamp,
cre_ts = EXCLUDED.cre_ts;

ERROR: column "txn_timestamp" is of type timestamp with time zone but
expression is of type text LINE 24: SELECT id, mid, txn_timestamp, cre_ts ^
HINT: You will need to rewrite or cast the expression.


From: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
To: yudhi s <learnerdatabase99(at)gmail(dot)com>
Cc: pgsql-general <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: Insert works but fails for merge
Date: 2024-08-09 21:24:46
Message-ID: CAKFQuwY3itH6rZ8xxjyZOgm6wbNCB2Rt+z24vpb4dZYD_9YypQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

On Fri, Aug 9, 2024 at 2:14 PM yudhi s <learnerdatabase99(at)gmail(dot)com> wrote:

>
> Why so?
>

Because you stuck a CTE in between the column list of the insert - where
types are known - and the values command - where types are unknown since
you didn't specify them. As the row comes out of the CTE every column must
have a known type, and so in the absence of context they get typed as text.

David J.


From: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
To: yudhi s <learnerdatabase99(at)gmail(dot)com>, pgsql-general <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: Insert works but fails for merge
Date: 2024-08-09 21:26:50
Message-ID: 37e09717-f121-4192-b152-18df17713414@aklaver.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

On 8/9/24 14:13, yudhi s wrote:
> Hello,
> It's version 15.4 postgres. Where we have an insert working fine, but
> then a similar insert with the same 'timestamp' value, when trying to be
> executed through merge , it fails stating "You will need to rewrite or
> cast the expression.". Why so?
>
> *Example:-*
> https://dbfiddle.uk/j5S7br-q <https://dbfiddle.uk/j5S7br-q>*
> *
>
> CREATE TABLE tab1 (
>     id varchar(100) ,
>     mid INT,
>     txn_timestamp TIMESTAMPTZ NOT NULL,
>     cre_ts TIMESTAMPTZ NOT NULL
> ) PARTITION BY RANGE (txn_timestamp);
>
> CREATE TABLE tab1_2024_08_09 PARTITION OF tab1
>     FOR VALUES FROM ('2024-08-09 00:00:00') TO ('2024-08-10 00:00:00');
>
> -- Below insert works fine
> INSERT INTO tab1
>     (id, mid, txn_timestamp, cre_ts)
> VALUES
>     ('5efd4c91-ef93-4477-840c-a723ae212d84', 123,
> '2024-08-09T11:33:49.402585600Z', '2024-08-09T11:33:49.402585600Z');
>
> -- Below merge , which trying to insert similar row but failing
>
> WITH source_data (id, mid, txn_timestamp, cre_ts) AS (
>     VALUES ('5efd4c91-ef93-4477-840c-a723ae212d66', 123,
> '2024-08-09T11:33:49.402585600Z', '2024-08-09T11:33:49.402585600Z')
> )
> INSERT INTO tab1 (id, mid, txn_timestamp, cre_ts)
> SELECT id, mid, txn_timestamp, cre_ts
> FROM source_data
> ON CONFLICT (id) DO UPDATE
> SET    mid = EXCLUDED.mid,
>     txn_timestamp = EXCLUDED.txn_timestamp,
>     cre_ts = EXCLUDED.cre_ts;
>
> ERROR: column "txn_timestamp" is of type timestamp with time zone but
> expression is of type text LINE 24: SELECT id, mid, txn_timestamp,
> cre_ts ^ HINT: You will need to rewrite or cast the expression.

VALUES:

/docs/current/sql-values.html

"When VALUES is used in INSERT, the values are all automatically coerced
to the data type of the corresponding destination column. When it's used
in other contexts, it might be necessary to specify the correct data
type. If the entries are all quoted literal constants, coercing the
first is sufficient to determine the assumed type for all:

SELECT * FROM machines
WHERE ip_address IN (VALUES('192.168.0.1'::inet), ('192.168.0.10'),
('192.168.1.43'));
"

The VALUES is not directly attached to the INSERT, you will need to do
explicit casts:

VALUES ('5efd4c91-ef93-4477-840c-a723ae212d66', 123,
'2024-08-09T11:33:49.402585600Z'::timestamptz,
'2024-08-09T11:33:49.402585600Z'::timestamptz)

--
Adrian Klaver
adrian(dot)klaver(at)aklaver(dot)com


From: yudhi s <learnerdatabase99(at)gmail(dot)com>
To: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>, "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
Cc: pgsql-general <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: Insert works but fails for merge
Date: 2024-08-10 12:07:31
Message-ID: CAEzWdqeW+yVwb2v8Vy-t9DC69_cMhUWTQqVBoL7_9qDzhBe8=Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

On Sat, Aug 10, 2024 at 2:56 AM Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
wrote:

> On 8/9/24 14:13, yudhi s wrote:
> > Hello,
> > It's version 15.4 postgres. Where we have an insert working fine, but
> > then a similar insert with the same 'timestamp' value, when trying to be
> > executed through merge , it fails stating "You will need to rewrite or
> > cast the expression.". Why so?
> >
> > *Example:-*
> > https://dbfiddle.uk/j5S7br-q <https://dbfiddle.uk/j5S7br-q>*
> > *
> >
> > CREATE TABLE tab1 (
> > id varchar(100) ,
> > mid INT,
> > txn_timestamp TIMESTAMPTZ NOT NULL,
> > cre_ts TIMESTAMPTZ NOT NULL
> > ) PARTITION BY RANGE (txn_timestamp);
> >
> > CREATE TABLE tab1_2024_08_09 PARTITION OF tab1
> > FOR VALUES FROM ('2024-08-09 00:00:00') TO ('2024-08-10 00:00:00');
> >
> > -- Below insert works fine
> > INSERT INTO tab1
> > (id, mid, txn_timestamp, cre_ts)
> > VALUES
> > ('5efd4c91-ef93-4477-840c-a723ae212d84', 123,
> > '2024-08-09T11:33:49.402585600Z', '2024-08-09T11:33:49.402585600Z');
> >
> > -- Below merge , which trying to insert similar row but failing
> >
> > WITH source_data (id, mid, txn_timestamp, cre_ts) AS (
> > VALUES ('5efd4c91-ef93-4477-840c-a723ae212d66', 123,
> > '2024-08-09T11:33:49.402585600Z', '2024-08-09T11:33:49.402585600Z')
> > )
> > INSERT INTO tab1 (id, mid, txn_timestamp, cre_ts)
> > SELECT id, mid, txn_timestamp, cre_ts
> > FROM source_data
> > ON CONFLICT (id) DO UPDATE
> > SET mid = EXCLUDED.mid,
> > txn_timestamp = EXCLUDED.txn_timestamp,
> > cre_ts = EXCLUDED.cre_ts;
> >
> > ERROR: column "txn_timestamp" is of type timestamp with time zone but
> > expression is of type text LINE 24: SELECT id, mid, txn_timestamp,
> > cre_ts ^ HINT: You will need to rewrite or cast the expression.
>
>
> VALUES:
>
> /docs/current/sql-values.html
>
> "When VALUES is used in INSERT, the values are all automatically coerced
> to the data type of the corresponding destination column. When it's used
> in other contexts, it might be necessary to specify the correct data
> type. If the entries are all quoted literal constants, coercing the
> first is sufficient to determine the assumed type for all:
>
> SELECT * FROM machines
> WHERE ip_address IN (VALUES('192.168.0.1'::inet), ('192.168.0.10'),
> ('192.168.1.43'));
> "
>
> The VALUES is not directly attached to the INSERT, you will need to do
> explicit casts:
>
> VALUES ('5efd4c91-ef93-4477-840c-a723ae212d66', 123,
> '2024-08-09T11:33:49.402585600Z'::timestamptz,
> '2024-08-09T11:33:49.402585600Z'::timestamptz)
>
>
>
Thank You Adrian and David.

Even converting the merge avoiding the WITH clause/CTE as below , is still
making it fail with the same error. So it seems , only direct "insert into
values" query can be auto converted/casted but not the other queries.

In our case , we were using this merge query in application code(in Java)
as a framework to dynamically take these values as bind values and do the
merge of input data/message. But it seems we have to now cast each and
every field which we get from the incoming message to make this merge work
in a correct way. I am wondering if the only way now is to get the data
types from information_schema.columns and then use the cast function to
write the values of the merge query dynamically casted/converted for each
of the fields in the application code. Please correct me if my
understanding is wrong.

MERGE INTO tab1 AS target
USING (VALUES ('5efd4c91-ef93-4477-840c-a723ae212d99', 123,
'2024-08-09T11:33:49.402585600Z','2024-08-09T11:33:49.402585600Z')) AS
source(id, mid,txn_timestamp, cre_ts)
ON target.id = source.id
WHEN MATCHED THEN
UPDATE SET mid = source.mid
WHEN NOT MATCHED THEN
INSERT (id, mid, txn_timestamp, cre_ts)
VALUES (source.id,source.mid, source.txn_timestamp, source.cre_ts);


From: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
To: yudhi s <learnerdatabase99(at)gmail(dot)com>
Cc: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>, pgsql-general <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: Insert works but fails for merge
Date: 2024-08-10 13:49:30
Message-ID: CAKFQuwZBRWuVE5mz4ZTZ3Y2t=RMkSdxG7-UXV7aHLwFuM2-KOA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

On Saturday, August 10, 2024, yudhi s <learnerdatabase99(at)gmail(dot)com> wrote:

>
> In our case , we were using this merge query in application code(in Java)
> as a framework to dynamically take these values as bind values and do the
> merge of input data/message.
>

I’d do most anything before resorting to dynamic SQL. Usually one can
simply write: values ($1::timestamptz) without resorting to a catalog
lookup.

Or maybe write a function to do the merge. Or maybe insert to a temporary
table then merge that.

David J.


From: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
To: yudhi s <learnerdatabase99(at)gmail(dot)com>, "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
Cc: pgsql-general <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: Insert works but fails for merge
Date: 2024-08-10 14:52:31
Message-ID: 9dc702ca-8bb3-442e-bd2b-12abe81a84d0@aklaver.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

On 8/10/24 05:07, yudhi s wrote:
>
>

> Thank You Adrian and David.
>
> Even converting the merge avoiding the WITH clause/CTE as below , is
> still making it fail with the same error. So it seems , only
> direct "insert into values" query can be auto converted/casted but not
> the other queries.
>
> In our case , we were using this merge query in application code(in
> Java) as a framework to dynamically take these values as bind values and
> do the merge of input data/message. But it seems we have to now cast
> each and every field which we get from the incoming message to make
> this merge work in a correct way. I am wondering if the only way now is
> to get the data types from information_schema.columns and then use the
> cast function to write the values of the merge query dynamically
> casted/converted for each of the fields in the application code. Please
> correct me if my understanding is wrong.

Why not use INSERT ... ON CONFLICT instead of MERGE?

>
> MERGE INTO tab1 AS target
> USING (VALUES ('5efd4c91-ef93-4477-840c-a723ae212d99', 123,
> '2024-08-09T11:33:49.402585600Z','2024-08-09T11:33:49.402585600Z')) AS
> source(id, mid,txn_timestamp, cre_ts)
> ON target.id <http://target.id> = source.id <http://source.id>
> WHEN MATCHED THEN
> UPDATE SET mid  = source.mid
> WHEN NOT MATCHED THEN
> INSERT (id, mid, txn_timestamp, cre_ts)
>     VALUES (source.id <http://source.id>,source.mid,
>  source.txn_timestamp, source.cre_ts);

--
Adrian Klaver
adrian(dot)klaver(at)aklaver(dot)com


From: yudhi s <learnerdatabase99(at)gmail(dot)com>
To: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
Cc: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, pgsql-general <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: Insert works but fails for merge
Date: 2024-08-10 20:23:11
Message-ID: CAEzWdqfLi-JtgZPhArq-6hAcSe9RnjN62LbeS8rVVSAiQOiOtg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

On Sat, Aug 10, 2024 at 8:22 PM Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
wrote:

>
>
> Why not use INSERT ... ON CONFLICT instead of MERGE?
>
> >
> > MERGE INTO tab1 AS target
> > USING (VALUES ('5efd4c91-ef93-4477-840c-a723ae212d99', 123,
> > '2024-08-09T11:33:49.402585600Z','2024-08-09T11:33:49.402585600Z')) AS
> > source(id, mid,txn_timestamp, cre_ts)
> > ON target.id <http://target.id> = source.id <http://source.id>
> > WHEN MATCHED THEN
> > UPDATE SET mid = source.mid
> > WHEN NOT MATCHED THEN
> > INSERT (id, mid, txn_timestamp, cre_ts)
> > VALUES (source.id <http://source.id>,source.mid,
> > source.txn_timestamp, source.cre_ts);
>
>
>
Actually , as per the business logic , we need to merge on a column which
is not unique or having any unique index on it. It's the leading column of
a composite unique key though. And in such scenarios the "INSERT.... ON
CONFLICT" will give an error. So we are opting for a merge statement here,
which will work fine with the column being having duplicate values in it.


From: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
To: yudhi s <learnerdatabase99(at)gmail(dot)com>
Cc: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, pgsql-general <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: Insert works but fails for merge
Date: 2024-08-10 21:27:37
Message-ID: a301dd68-fd01-43b5-a22e-ac8f38855488@aklaver.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

On 8/10/24 13:23, yudhi s wrote:
>
>
> On Sat, Aug 10, 2024 at 8:22 PM Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com
> <mailto:adrian(dot)klaver(at)aklaver(dot)com>> wrote:
>
>
>
> Why not use INSERT ... ON CONFLICT instead of MERGE?
>
> >
> > MERGE INTO tab1 AS target
> > USING (VALUES ('5efd4c91-ef93-4477-840c-a723ae212d99', 123,
> >
> '2024-08-09T11:33:49.402585600Z','2024-08-09T11:33:49.402585600Z')) AS
> > source(id, mid,txn_timestamp, cre_ts)
> > ON target.id <http://target.id> <http://target.id
> <http://target.id>> = source.id <http://source.id> <http://source.id
> <http://source.id>>
> > WHEN MATCHED THEN
> > UPDATE SET mid  = source.mid
> > WHEN NOT MATCHED THEN
> > INSERT (id, mid, txn_timestamp, cre_ts)
> >      VALUES (source.id <http://source.id> <http://source.id
> <http://source.id>>,source.mid,
> >   source.txn_timestamp, source.cre_ts);
>
>
>
> Actually , as per the business logic , we need to merge on a column
> which is not unique or having any unique index on it. It's the leading
> column of a composite unique key though. And in such scenarios the
> "INSERT.... ON CONFLICT" will give an error. So we are opting for a
> merge statement here, which will work fine with the column being
> having duplicate values in it.

Alright it's official I am confused.

You started with:

WITH source_data (id, mid, txn_timestamp, cre_ts) AS (
VALUES ('5efd4c91-ef93-4477-840c-a723ae212d66', 123,
'2024-08-09T11:33:49.402585600Z', '2024-08-09T11:33:49.402585600Z')
)
INSERT INTO tab1 (id, mid, txn_timestamp, cre_ts)
SELECT id, mid, txn_timestamp, cre_ts
FROM source_data
ON CONFLICT (id) DO UPDATE
SET mid = EXCLUDED.mid,
txn_timestamp = EXCLUDED.txn_timestamp,
cre_ts = EXCLUDED.cre_ts;

That implied that id was unique in of itself. As side note you called it
a merge, which it is not as in MERGE. At this point I got off track
thinking of MERGE.

Then you went to the below which is a merge:

MERGE INTO tab1 AS target
USING (VALUES ('5efd4c91-ef93-4477-840c-a723ae212d99', 123,
'2024-08-09T11:33:49.402585600Z','2024-08-09T11:33:49.402585600Z')) AS
source(id, mid,txn_timestamp, cre_ts)
ON target.id = source.id
WHEN MATCHED THEN
UPDATE SET mid = source.mid
WHEN NOT MATCHED THEN
INSERT (id, mid, txn_timestamp, cre_ts)
VALUES (source.id,source.mid, source.txn_timestamp, source.cre_ts);

The question I have now is if id is part of a composite UNIQUE index on
this:

CREATE TABLE tab1 (
id varchar(100) ,
mid INT,
txn_timestamp TIMESTAMPTZ NOT NULL,
cre_ts TIMESTAMPTZ NOT NULL
) PARTITION BY RANGE (txn_timestamp);

Then what is the other column in the UNIQUE index?

--
Adrian Klaver
adrian(dot)klaver(at)aklaver(dot)com


From: yudhi s <learnerdatabase99(at)gmail(dot)com>
To: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
Cc: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, pgsql-general <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: Insert works but fails for merge
Date: 2024-08-10 21:40:52
Message-ID: CAEzWdqcHthQef8aAine9dUCrH-A+__xunt273KrwA7hv1LGQTA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Apology for the confusion. The other column is the txn_timestamp in the
composite unique key, which is also the partition key.

But yes we cant use both in the ON clause because of certain business
requirements. We realized it late. And that's why "on conflict " We are
unable to use.

On Sun, 11 Aug, 2024, 2:57 am Adrian Klaver, <adrian(dot)klaver(at)aklaver(dot)com>
wrote:

> On 8/10/24 13:23, yudhi s wrote:
> >
> >
> > On Sat, Aug 10, 2024 at 8:22 PM Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com
> > <mailto:adrian(dot)klaver(at)aklaver(dot)com>> wrote:
> >
> >
> >
> > Why not use INSERT ... ON CONFLICT instead of MERGE?
> >
> > >
> > > MERGE INTO tab1 AS target
> > > USING (VALUES ('5efd4c91-ef93-4477-840c-a723ae212d99', 123,
> > >
> > '2024-08-09T11:33:49.402585600Z','2024-08-09T11:33:49.402585600Z'))
> AS
> > > source(id, mid,txn_timestamp, cre_ts)
> > > ON target.id <http://target.id> <http://target.id
> > <http://target.id>> = source.id <http://source.id> <http://source.id
> > <http://source.id>>
> > > WHEN MATCHED THEN
> > > UPDATE SET mid = source.mid
> > > WHEN NOT MATCHED THEN
> > > INSERT (id, mid, txn_timestamp, cre_ts)
> > > VALUES (source.id <http://source.id> <http://source.id
> > <http://source.id>>,source.mid,
> > > source.txn_timestamp, source.cre_ts);
> >
> >
> >
> > Actually , as per the business logic , we need to merge on a column
> > which is not unique or having any unique index on it. It's the leading
> > column of a composite unique key though. And in such scenarios the
> > "INSERT.... ON CONFLICT" will give an error. So we are opting for a
> > merge statement here, which will work fine with the column being
> > having duplicate values in it.
>
>
> Alright it's official I am confused.
>
> You started with:
>
> WITH source_data (id, mid, txn_timestamp, cre_ts) AS (
> VALUES ('5efd4c91-ef93-4477-840c-a723ae212d66', 123,
> '2024-08-09T11:33:49.402585600Z', '2024-08-09T11:33:49.402585600Z')
> )
> INSERT INTO tab1 (id, mid, txn_timestamp, cre_ts)
> SELECT id, mid, txn_timestamp, cre_ts
> FROM source_data
> ON CONFLICT (id) DO UPDATE
> SET mid = EXCLUDED.mid,
> txn_timestamp = EXCLUDED.txn_timestamp,
> cre_ts = EXCLUDED.cre_ts;
>
> That implied that id was unique in of itself. As side note you called it
> a merge, which it is not as in MERGE. At this point I got off track
> thinking of MERGE.
>
> Then you went to the below which is a merge:
>
> MERGE INTO tab1 AS target
> USING (VALUES ('5efd4c91-ef93-4477-840c-a723ae212d99', 123,
> '2024-08-09T11:33:49.402585600Z','2024-08-09T11:33:49.402585600Z')) AS
> source(id, mid,txn_timestamp, cre_ts)
> ON target.id = source.id
> WHEN MATCHED THEN
> UPDATE SET mid = source.mid
> WHEN NOT MATCHED THEN
> INSERT (id, mid, txn_timestamp, cre_ts)
> VALUES (source.id,source.mid, source.txn_timestamp, source.cre_ts);
>
> The question I have now is if id is part of a composite UNIQUE index on
> this:
>
> CREATE TABLE tab1 (
> id varchar(100) ,
> mid INT,
> txn_timestamp TIMESTAMPTZ NOT NULL,
> cre_ts TIMESTAMPTZ NOT NULL
> ) PARTITION BY RANGE (txn_timestamp);
>
> Then what is the other column in the UNIQUE index?
>
>
> --
> Adrian Klaver
> adrian(dot)klaver(at)aklaver(dot)com
>
>


From: Alban Hertroys <haramrae(at)gmail(dot)com>
To: yudhi s <learnerdatabase99(at)gmail(dot)com>
Cc: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>, David G Johnston <david(dot)g(dot)johnston(at)gmail(dot)com>, pgsql-general <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: Insert works but fails for merge
Date: 2024-08-11 10:09:44
Message-ID: A0C7725A-C82D-4B62-A485-D1C4CE7C5A8E@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general


> On 10 Aug 2024, at 22:23, yudhi s <learnerdatabase99(at)gmail(dot)com> wrote:
> On Sat, Aug 10, 2024 at 8:22 PM Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com> wrote:
> > MERGE INTO tab1 AS target
> > USING (VALUES ('5efd4c91-ef93-4477-840c-a723ae212d99', 123,
> > '2024-08-09T11:33:49.402585600Z','2024-08-09T11:33:49.402585600Z')) AS
> > source(id, mid,txn_timestamp, cre_ts)
> > ON target.id <http://target.id> = source.id <http://source.id>
> > WHEN MATCHED THEN
> > UPDATE SET mid = source.mid
> > WHEN NOT MATCHED THEN
> > INSERT (id, mid, txn_timestamp, cre_ts)
> > VALUES (source.id <http://source.id>,source.mid,
> > source.txn_timestamp, source.cre_ts);
>
> Actually , as per the business logic , we need to merge on a column which is not unique or having any unique index on it.

Then how is the database supposed to determine which of those duplicate rows it should update? In the best case, it would update all of the duplicates with the same values, which usually is not what you want.

> It's the leading column of a composite unique key though.

Which could be unique of itself, I suppose that isn’t the case here?

In that case, IMHO your best course of action is to do something about those duplicates first.

> And in such scenarios the "INSERT.... ON CONFLICT" will give an error. So we are opting for a merge statement here, which will work fine with the column being having duplicate values in it.

I’m not so sure about that claim…

At least on MSSQL, MERGE has this requirement: "A MERGE statement cannot UPDATE/DELETE the same row of the target table multiple times.”. I’ve seen that as an error message on occasion.

The MERGE documentation for PostgreSQL says this: "You should ensure that the join produces at most one candidate change row for each target row.”, which also seems to imply that you shouldn’t have duplicates.

Alban Hertroys
--
If you can't see the forest for the trees,
cut the trees and you'll find there is no forest.


From: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
To: Alban Hertroys <haramrae(at)gmail(dot)com>, yudhi s <learnerdatabase99(at)gmail(dot)com>
Cc: David G Johnston <david(dot)g(dot)johnston(at)gmail(dot)com>, pgsql-general <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: Insert works but fails for merge
Date: 2024-08-11 15:11:49
Message-ID: accc0da1-6be2-4a02-a2c3-fbe45f8da8d8@aklaver.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

On 8/11/24 03:09, Alban Hertroys wrote:
>
>> On 10 Aug 2024, at 22:23, yudhi s <learnerdatabase99(at)gmail(dot)com> wrote:
>> On Sat, Aug 10, 2024 at 8:22 PM Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com> wrote:
>>> MERGE INTO tab1 AS target
>>> USING (VALUES ('5efd4c91-ef93-4477-840c-a723ae212d99', 123,
>>> '2024-08-09T11:33:49.402585600Z','2024-08-09T11:33:49.402585600Z')) AS
>>> source(id, mid,txn_timestamp, cre_ts)
>>> ON target.id <http://target.id> = source.id <http://source.id>
>>> WHEN MATCHED THEN
>>> UPDATE SET mid = source.mid
>>> WHEN NOT MATCHED THEN
>>> INSERT (id, mid, txn_timestamp, cre_ts)
>>> VALUES (source.id <http://source.id>,source.mid,
>>> source.txn_timestamp, source.cre_ts);
>>
>> Actually , as per the business logic , we need to merge on a column which is not unique or having any unique index on it.
>
> Then how is the database supposed to determine which of those duplicate rows it should update? In the best case, it would update all of the duplicates with the same values, which usually is not what you want.
>
>> It's the leading column of a composite unique key though.
>
> Which could be unique of itself, I suppose that isn’t the case here?
>
> In that case, IMHO your best course of action is to do something about those duplicates first.
>
>> And in such scenarios the "INSERT.... ON CONFLICT" will give an error. So we are opting for a merge statement here, which will work fine with the column being having duplicate values in it.
>
> I’m not so sure about that claim…
>
> At least on MSSQL, MERGE has this requirement: "A MERGE statement cannot UPDATE/DELETE the same row of the target table multiple times.”. I’ve seen that as an error message on occasion.
>
> The MERGE documentation for PostgreSQL says this: "You should ensure that the join produces at most one candidate change row for each target row.”, which also seems to imply that you shouldn’t have duplicates.

The next sentence says:

"In other words, a target row shouldn't join to more than one data
source row."

In this case the OP's data source is a single VALUES(). As it is written
I don't it tripping that rule, though it would not take much to change that.

>
> Alban Hertroys
> --
> If you can't see the forest for the trees,
> cut the trees and you'll find there is no forest.
>

--
Adrian Klaver
adrian(dot)klaver(at)aklaver(dot)com


From: Greg Sabino Mullane <htamfids(at)gmail(dot)com>
To: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
Cc: Alban Hertroys <haramrae(at)gmail(dot)com>, yudhi s <learnerdatabase99(at)gmail(dot)com>, David G Johnston <david(dot)g(dot)johnston(at)gmail(dot)com>, pgsql-general <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: Insert works but fails for merge
Date: 2024-08-12 01:18:00
Message-ID: CAKAnmm+UgOUpGeurzeFg6r-q7RwhwuY-rQLd2nhSMk2sFfTJQA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

So it looks like the OP does not mind updating more than one row. If you
want to keep it simple and not do a lot of casting, consider using a CTE to
do a reverse-upsert and use a prepared statement. Prepare and cast once,
and have your app send the raw uncasted strings many, many times:

prepare foo(text,int,timestamptz) as with x as (update tab1 set mid=$2
where id=$1 returning 1)
insert into tab1 select $1,$2,$3 where not exists (select 1 from x);

execute foo('5efd4c91-ef93-4477-840c-a723ae212d99', 123,
'2024-08-09T11:33:49.402585600Z');
execute foo('some_other_id', 456, '2024-08-11T21:44:55.8675309Z');
etc.

Your app/driver may or may not already do protocol-level statement
prepare/execute automagically, so test that way first.

It's version 15.4 postgres.

Keep on the latest revision. Right now, that's 15.8. Upgrading revisions is
quick and painless.

Cheers,
Greg


From: Greg Sabino Mullane <htamfids(at)gmail(dot)com>
To: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
Cc: Alban Hertroys <haramrae(at)gmail(dot)com>, yudhi s <learnerdatabase99(at)gmail(dot)com>, David G Johnston <david(dot)g(dot)johnston(at)gmail(dot)com>, pgsql-general <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: Insert works but fails for merge
Date: 2024-08-13 13:55:18
Message-ID: CAKAnmmK2iECtmatmiOq_JE03K5dFPmugy_CqbGAs-SR-3U1bbw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

I just remembered that one of the complaints was not wanting to worry about
looking up the data types. In my previous example, you can also leave out
the types and Postgres will do the right thing. I prefer the explicit data
type version for clarity, but though I would provide this one for
completeness:

prepare foo as with x as (update tab1 set mid=$2 where id=$1 returning 1)
insert into tab1 select $1,$2,$3 where not exists (select 1 from x);

Cheers,
Greg