Re: Long text values destroys logical replication slots

Lists: pgsql-bugs
From: Adam Dratwiński <adam(dot)dratwinski(at)gmail(dot)com>
To: pgsql-bugs(at)postgresql(dot)org
Cc: Krzysiek Kotlarski <kotlarski(dot)krzysztof(at)gmail(dot)com>
Subject: Long text values destroys logical replication slots
Date: 2015-10-28 18:51:18
Message-ID: CAKg6ypLd7773AOX4DiOGRwQk1TVOQKhNwjYiVjJnpq8Wo+i62Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

I recently encountered a bug on my production postgres which results in
killing replication slot. It happens when there is an UPDATE statement
called on a record having text column with a very long value (about 1mb).
The problem can be easily reproduced by this query:

https://gist.github.com/adwinsky/f7a6dc4382776fb9ea60

Here I am using decoderbufs decoder https://github.com/krzychk/decoderbufs
but it doesn't seems to be related with this plugin as it also happen on
the postgres test decoder.


From: Andres Freund <andres(at)anarazel(dot)de>
To: Adam Dratwiński <adam(dot)dratwinski(at)gmail(dot)com>, pgsql-bugs(at)postgresql(dot)org
Cc: Krzysiek Kotlarski <kotlarski(dot)krzysztof(at)gmail(dot)com>
Subject: Re: Long text values destroys logical replication slots
Date: 2015-10-28 20:54:34
Message-ID: 853C54D9-7168-4852-99CF-49DCA1A7B48C@anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

On October 28, 2015 7:51:18 PM GMT+01:00, "Adam Dratwiński" <adam(dot)dratwinski(at)gmail(dot)com> wrote:
>I recently encountered a bug on my production postgres which results in
>killing replication slot. It happens when there is an UPDATE statement
>called on a record having text column with a very long value (about
>1mb).
>The problem can be easily reproduced by this query:
>
>https://gist.github.com/adwinsky/f7a6dc4382776fb9ea60
>
>Here I am using decoderbufs decoder
>https://github.com/krzychk/decoderbufs
>but it doesn't seems to be related with this plugin as it also happen
>on
>the postgres test decoder.

Did you try the last minor release?

Andres
Hi
---
Please excuse brevity and formatting - I am writing this on my mobile phone.


From: Adam Dratwiński <adam(dot)dratwinski(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-bugs(at)postgresql(dot)org, Krzysiek Kotlarski <kotlarski(dot)krzysztof(at)gmail(dot)com>
Subject: Re: Long text values destroys logical replication slots
Date: 2015-10-28 21:14:28
Message-ID: CAKg6ypJcXr9x4p10gWp217SWG9q=MvB1d702BhAKu9keHrf2iw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

I managed to reproduce this problem on postgres 9.4.1 and 9.4.5 on osx and
gentoo.

On Wed, Oct 28, 2015 at 9:54 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:

> On October 28, 2015 7:51:18 PM GMT+01:00, "Adam Dratwiński" <
> adam(dot)dratwinski(at)gmail(dot)com> wrote:
> >I recently encountered a bug on my production postgres which results in
> >killing replication slot. It happens when there is an UPDATE statement
> >called on a record having text column with a very long value (about
> >1mb).
> >The problem can be easily reproduced by this query:
> >
> >https://gist.github.com/adwinsky/f7a6dc4382776fb9ea60
> >
> >Here I am using decoderbufs decoder
> >https://github.com/krzychk/decoderbufs
> >but it doesn't seems to be related with this plugin as it also happen
> >on
> >the postgres test decoder.
>
> Did you try the last minor release?
>
> Andres
> Hi
> ---
> Please excuse brevity and formatting - I am writing this on my mobile
> phone.
>


From: Andres Freund <andres(at)anarazel(dot)de>
To: Adam Dratwiński <adam(dot)dratwinski(at)gmail(dot)com>
Cc: pgsql-bugs(at)postgresql(dot)org, Krzysiek Kotlarski <kotlarski(dot)krzysztof(at)gmail(dot)com>
Subject: Re: Long text values destroys logical replication slots
Date: 2015-10-28 21:17:11
Message-ID: 4B354135-2D94-48B2-8346-C1DF0CDD5DA7@anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Can you reproduce it with test_decoding as the output plugin?
---
Please excuse brevity and formatting - I am writing this on my mobile phone.


From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Adam Dratwiński <adam(dot)dratwinski(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)postgresql(dot)org>, Krzysiek Kotlarski <kotlarski(dot)krzysztof(at)gmail(dot)com>
Subject: Re: Long text values destroys logical replication slots
Date: 2015-10-28 22:00:53
Message-ID: CAB7nPqRk4-rdKnXb=qTynyw1JgM9txDdg__XciZ6nOyQxjY+ng@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

On Wed, Oct 28, 2015 at 10:17 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> Can you reproduce it with test_decoding as the output plugin?

You can just use that for example to get an assertion failure:
CREATE TABLE a (b text);
ALTER TABLE ONLY a REPLICA IDENTITY FULL;
SELECT * FROM pg_create_logical_replication_slot('new', 'test_decoding');
INSERT INTO a (b) VALUES (repeat('k', 2000000));
UPDATE a SET b = 'c';
select * from pg_logical_slot_peek_changes('new', NULL, NULL); -- boom

frame #3: 0x0000000100458ca9
postgres`DecodeXLogTuple(data=0x00007fb7e2126046, len=22910,
tuple=0x000000010a32e038) + 137 at decode.c:856
853 int datalen = len - SizeOfHeapHeader;
854
855 Assert(datalen >= 0);
-> 856 Assert(datalen <= MaxHeapTupleSize);
(lldb) p datalen
(int) $0 = 22905
--
Michael


From: Andres Freund <andres(at)anarazel(dot)de>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Adam Dratwiński <adam(dot)dratwinski(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)postgresql(dot)org>, Krzysiek Kotlarski <kotlarski(dot)krzysztof(at)gmail(dot)com>
Subject: Re: Long text values destroys logical replication slots
Date: 2015-10-28 22:23:56
Message-ID: 20151028222302.GK29811@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

On 2015-10-28 23:00:53 +0100, Michael Paquier wrote:
> On Wed, Oct 28, 2015 at 10:17 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> > Can you reproduce it with test_decoding as the output plugin?
>
> You can just use that for example to get an assertion failure:
> CREATE TABLE a (b text);
> ALTER TABLE ONLY a REPLICA IDENTITY FULL;
> SELECT * FROM pg_create_logical_replication_slot('new', 'test_decoding');
> INSERT INTO a (b) VALUES (repeat('k', 2000000));
> UPDATE a SET b = 'c';
> select * from pg_logical_slot_peek_changes('new', NULL, NULL); -- boom
>
> frame #3: 0x0000000100458ca9
> postgres`DecodeXLogTuple(data=0x00007fb7e2126046, len=22910,
> tuple=0x000000010a32e038) + 137 at decode.c:856
> 853 int datalen = len - SizeOfHeapHeader;
> 854
> 855 Assert(datalen >= 0);
> -> 856 Assert(datalen <= MaxHeapTupleSize);
> (lldb) p datalen
> (int) $0 = 22905

Ugh, that's some monumental stupidity on my side in the handling of
oldtuple values. Newtuple datums never can be bigger than 8k (all
relevant columns point into toasted datums)- but that's absolutely not
the case for the "old" values. That can't happen for primary keys (due
to the btree limitations), which is why this wasn't noticed so far.
Gotta think about this one.

Greetings,

Andres Freund


From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Adam Dratwiński <adam(dot)dratwinski(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)postgresql(dot)org>, Krzysiek Kotlarski <kotlarski(dot)krzysztof(at)gmail(dot)com>
Subject: Re: Long text values destroys logical replication slots
Date: 2016-01-29 07:07:29
Message-ID: CAB7nPqQZ8eBX_wMkXqCHty+4s=HsN-na+xVtPCOf0RikcAKVDw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

On Thu, Oct 29, 2015 at 7:23 AM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> On 2015-10-28 23:00:53 +0100, Michael Paquier wrote:
>> On Wed, Oct 28, 2015 at 10:17 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
>> > Can you reproduce it with test_decoding as the output plugin?
>>
>> You can just use that for example to get an assertion failure:
>> CREATE TABLE a (b text);
>> ALTER TABLE ONLY a REPLICA IDENTITY FULL;
>> SELECT * FROM pg_create_logical_replication_slot('new', 'test_decoding');
>> INSERT INTO a (b) VALUES (repeat('k', 2000000));
>> UPDATE a SET b = 'c';
>> select * from pg_logical_slot_peek_changes('new', NULL, NULL); -- boom
>>
>> frame #3: 0x0000000100458ca9
>> postgres`DecodeXLogTuple(data=0x00007fb7e2126046, len=22910,
>> tuple=0x000000010a32e038) + 137 at decode.c:856
>> 853 int datalen = len - SizeOfHeapHeader;
>> 854
>> 855 Assert(datalen >= 0);
>> -> 856 Assert(datalen <= MaxHeapTupleSize);
>> (lldb) p datalen
>> (int) $0 = 22905
>
> Ugh, that's some monumental stupidity on my side in the handling of
> oldtuple values. Newtuple datums never can be bigger than 8k (all
> relevant columns point into toasted datums)- but that's absolutely not
> the case for the "old" values. That can't happen for primary keys (due
> to the btree limitations), which is why this wasn't noticed so far.
> Gotta think about this one.

FWIW, I am going to play with that..
--
Michael


From: Andres Freund <andres(at)anarazel(dot)de>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Adam Dratwiński <adam(dot)dratwinski(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)postgresql(dot)org>, Krzysiek Kotlarski <kotlarski(dot)krzysztof(at)gmail(dot)com>
Subject: Re: Long text values destroys logical replication slots
Date: 2016-01-29 07:37:09
Message-ID: CDBBC7C8-5152-4EA0-A202-51B3BBACE12D@anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

On January 29, 2016 8:07:29 AM GMT+01:00, Michael Paquier <michael(dot)paquier(at)gmail(dot)com> wrote:
>On Thu, Oct 29, 2015 at 7:23 AM, Andres Freund <andres(at)anarazel(dot)de>
>wrote:
>> On 2015-10-28 23:00:53 +0100, Michael Paquier wrote:
>>> On Wed, Oct 28, 2015 at 10:17 PM, Andres Freund <andres(at)anarazel(dot)de>
>wrote:
>>> > Can you reproduce it with test_decoding as the output plugin?
>>>
>>> You can just use that for example to get an assertion failure:
>>> CREATE TABLE a (b text);
>>> ALTER TABLE ONLY a REPLICA IDENTITY FULL;
>>> SELECT * FROM pg_create_logical_replication_slot('new',
>'test_decoding');
>>> INSERT INTO a (b) VALUES (repeat('k', 2000000));
>>> UPDATE a SET b = 'c';
>>> select * from pg_logical_slot_peek_changes('new', NULL, NULL); --
>boom
>>>
>>> frame #3: 0x0000000100458ca9
>>> postgres`DecodeXLogTuple(data=0x00007fb7e2126046, len=22910,
>>> tuple=0x000000010a32e038) + 137 at decode.c:856
>>> 853 int datalen = len - SizeOfHeapHeader;
>>> 854
>>> 855 Assert(datalen >= 0);
>>> -> 856 Assert(datalen <= MaxHeapTupleSize);
>>> (lldb) p datalen
>>> (int) $0 = 22905
>>
>> Ugh, that's some monumental stupidity on my side in the handling of
>> oldtuple values. Newtuple datums never can be bigger than 8k (all
>> relevant columns point into toasted datums)- but that's absolutely
>not
>> the case for the "old" values. That can't happen for primary keys
>(due
>> to the btree limitations), which is why this wasn't noticed so far.
>> Gotta think about this one.
>
>FWIW, I am going to play with that..

I've a patch for this, in just not yet happy with the API changes...

---
Please excuse brevity and formatting - I am writing this on my mobile phone.


From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Adam Dratwiński <adam(dot)dratwinski(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)postgresql(dot)org>, Krzysiek Kotlarski <kotlarski(dot)krzysztof(at)gmail(dot)com>
Subject: Re: Long text values destroys logical replication slots
Date: 2016-01-29 08:06:19
Message-ID: CAB7nPqRps+nn25QRAaKqn8Y7yZxANbKY_zwGZdiANuo8AZVVkA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

On Fri, Jan 29, 2016 at 4:37 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> On January 29, 2016 8:07:29 AM GMT+01:00, Michael Paquier <michael(dot)paquier(at)gmail(dot)com> wrote:
>>On Thu, Oct 29, 2015 at 7:23 AM, Andres Freund <andres(at)anarazel(dot)de>
>>wrote:
>>> On 2015-10-28 23:00:53 +0100, Michael Paquier wrote:
>>>> On Wed, Oct 28, 2015 at 10:17 PM, Andres Freund <andres(at)anarazel(dot)de>
>>wrote:
>>>> > Can you reproduce it with test_decoding as the output plugin?
>>>>
>>>> You can just use that for example to get an assertion failure:
>>>> CREATE TABLE a (b text);
>>>> ALTER TABLE ONLY a REPLICA IDENTITY FULL;
>>>> SELECT * FROM pg_create_logical_replication_slot('new',
>>'test_decoding');
>>>> INSERT INTO a (b) VALUES (repeat('k', 2000000));
>>>> UPDATE a SET b = 'c';
>>>> select * from pg_logical_slot_peek_changes('new', NULL, NULL); --
>>boom
>>>>
>>>> frame #3: 0x0000000100458ca9
>>>> postgres`DecodeXLogTuple(data=0x00007fb7e2126046, len=22910,
>>>> tuple=0x000000010a32e038) + 137 at decode.c:856
>>>> 853 int datalen = len - SizeOfHeapHeader;
>>>> 854
>>>> 855 Assert(datalen >= 0);
>>>> -> 856 Assert(datalen <= MaxHeapTupleSize);
>>>> (lldb) p datalen
>>>> (int) $0 = 22905
>>>
>>> Ugh, that's some monumental stupidity on my side in the handling of
>>> oldtuple values. Newtuple datums never can be bigger than 8k (all
>>> relevant columns point into toasted datums)- but that's absolutely
>>not
>>> the case for the "old" values. That can't happen for primary keys
>>(due
>>> to the btree limitations), which is why this wasn't noticed so far.
>>> Gotta think about this one.
>>
>>FWIW, I am going to play with that..
>
> I've a patch for this, in just not yet happy with the API changes...

Oh, OK, good to know. If you want to have extra eyes on it, just ping
me here then.
--
Michael


From: Andres Freund <andres(at)anarazel(dot)de>
To: Adam Dratwiński <adam(dot)dratwinski(at)gmail(dot)com>
Cc: pgsql-bugs(at)postgresql(dot)org, Krzysiek Kotlarski <kotlarski(dot)krzysztof(at)gmail(dot)com>
Subject: Re: Long text values destroys logical replication slots
Date: 2016-03-06 02:18:11
Message-ID: 20160306021811.n73abbpuu4rqeviu@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Hi Adam,

On 2015-10-28 19:51:18 +0100, Adam Dratwiński wrote:
> I recently encountered a bug on my production postgres which results in
> killing replication slot. It happens when there is an UPDATE statement
> called on a record having text column with a very long value (about 1mb).
> The problem can be easily reproduced by this query:
>
> https://gist.github.com/adwinsky/f7a6dc4382776fb9ea60
>
> Here I am using decoderbufs decoder https://github.com/krzychk/decoderbufs
> but it doesn't seems to be related with this plugin as it also happen on
> the postgres test decoder.

I pushed a fix for this; it'll be included in the next maintenance
releases which are coming up soon-ish. Sorry that it took so long.

Thanks for the report!

Regards,

Andres