Re: BUG #16369: Segmentation Faults and Data Corruption with Generated Columns

Lists: PostgreSQL : PostgreSQL 메일 링리스트 : 2020-04-16 00:40 이후 PGSQL 토토 커뮤니티
From: PG Bug reporting form <noreply(at)postgresql(dot)org>
To: pgsql-bugs(at)lists(dot)postgresql(dot)org
Cc: cameron(dot)ezell(at)clearcapital(dot)com
Subject: BUG #16369: Segmentation Faults and Data Corruption with Generated Columns
Date: 2020-04-15 20:20:35
Message-ID: 16369-5845a6f1bef59884@postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

The following bug has been logged on the website:

Bug reference: 16369
Logged by: Cameron Ezell
Email address: cameron(dot)ezell(at)clearcapital(dot)com
PostgreSQL version: 12.2
Operating system: CentOS 8, Red Hat 8, Mac OS X 10.14.6
Description:

It seems that there are a few bugs that are throwing segmentation faults and
causing data corruption. These issues keep appearing for our team when using
tables that contain generated columns introduced in PostgreSQL 12. This has
been tested on 12.2 & 12.1 on CentOS 8 as well as 12.1 on MacOS 10.14.6:

select version();
-- PostgreSQL 12.2 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.3
20140911 (Red Hat 4.8.3-9), 64-bit

CREATE SCHEMA if not exists test;commit;

CREATE TABLE if not exists test.bug_report
(
id bigint generated by default as identity,
hostname varchar,
hostname_short varchar GENERATED ALWAYS AS (split_part(hostname, '.',
1)) STORED,
device text,
mount text,
used_space_bytes bigint,
used_space_gb numeric GENERATED ALWAYS AS (ROUND(used_space_bytes /
1073741824.0,2)) STORED,
avail_space_bytes bigint,
avail_space_gb numeric GENERATED ALWAYS AS (ROUND(avail_space_bytes /
1073741824.0,2)) STORED,
inserted_dts timestamp with time zone NOT NULL DEFAULT
clock_timestamp(),
inserted_by text NOT NULL DEFAULT session_user
);commit;

-- No problems on the following insert
INSERT INTO test.bug_report(hostname, device, mount, used_space_bytes,
avail_space_bytes) VALUES ('123456789', 'devtmpfs', '/dev', 0,
6047076131313);
select * from test.bug_report;commit;

-- On CentOS 8, this bug is triggered with a hostname with 10+ characters.
On MacOS 10.14.6, 19+ characters.
INSERT INTO test.bug_report(hostname, device, mount, used_space_bytes,
avail_space_bytes) VALUES ('12345678901234567890', 'devtmpfs', '/dev', 0,
6047076131313);commit;
-- This should immediately crash the postgres service

-- Inserting some strings below that character threshold will insert just
fine, but a select statement on the table will now throw an error
INSERT INTO test.bug_report(hostname, device, mount, used_space_bytes,
avail_space_bytes) VALUES ('abc', 'devtmpfs', '/dev', 0,
6047076131313);commit;
select * from test.bug_report;commit;
--ERROR: XX000: invalid memory alloc request size 18446744073709551613
--LOCATION: palloc, mcxt.c:934

-- Once the "hostname_short" column no longer references any other column, I
am unable to reproduce this error
CREATE TABLE if not exists test.bug_report2
(
id bigint generated by default as identity,
hostname varchar,
hostname_short varchar GENERATED ALWAYS AS ('static_string') STORED,
device text,
mount text,
used_space_bytes bigint,
used_space_gb numeric GENERATED ALWAYS AS (ROUND(used_space_bytes /
1073741824.0,2)) STORED,
avail_space_bytes bigint,
avail_space_gb numeric GENERATED ALWAYS AS (ROUND(avail_space_bytes /
1073741824.0,2)) STORED,
inserted_dts timestamp with time zone NOT NULL DEFAULT
clock_timestamp(),
inserted_by text NOT NULL DEFAULT session_user
);commit;

INSERT INTO test.bug_report2(hostname, device, mount, used_space_bytes,
avail_space_bytes) VALUES ('12345678901234567890', 'devtmpfs', '/dev', 0,
6047076131313);commit;
select * from test.bug_report2;commit;

INSERT INTO test.bug_report2(hostname, device, mount, used_space_bytes,
avail_space_bytes) VALUES ('abc', 'devtmpfs', '/dev', 0,
6047076131313);commit;
select * from test.bug_report2;commit;

-- simply referencing another column in the generated column will cause a
crash
CREATE TABLE if not exists test.bug_report3
(
id bigint generated by default as identity,
hostname varchar,
hostname_short varchar GENERATED ALWAYS AS (hostname) STORED,
device text,
mount text,
used_space_bytes bigint,
used_space_gb numeric GENERATED ALWAYS AS (ROUND(used_space_bytes /
1073741824.0,2)) STORED,
avail_space_bytes bigint,
avail_space_gb numeric GENERATED ALWAYS AS (ROUND(avail_space_bytes /
1073741824.0,2)) STORED,
inserted_dts timestamp with time zone NOT NULL DEFAULT
clock_timestamp(),
inserted_by text NOT NULL DEFAULT session_user
);commit;

-- immediate crash
INSERT INTO test.bug_report3(hostname, device, mount, used_space_bytes,
avail_space_bytes) VALUES ('12345678901234567890', 'devtmpfs', '/dev', 0,
6047076131313);commit;

-- no crash on insert
INSERT INTO test.bug_report3(hostname, device, mount, used_space_bytes,
avail_space_bytes) VALUES ('abc', 'devtmpfs', '/dev', 0,
6047076131313);commit;
-- error thrown on select
select * from test.bug_report3;commit;
--ERROR: XX000: invalid memory alloc request size 18446744073709551613
--LOCATION: palloc, mcxt.c:934


From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: cameron(dot)ezell(at)clearcapital(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org, peter(dot)eisentraut(at)2ndquadrant(dot)com
Subject: Re: BUG #16369: Segmentation Faults and Data Corruption with Generated Columns
Date: 2020-04-15 23:35:57
Message-ID: CAApHDvr7xeMcfGRWQ0rxtagw7CKbjEcjPKfKQAU4KdEcZm6XkQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

On Thu, 16 Apr 2020 at 08:53, PG Bug reporting form
<noreply(at)postgresql(dot)org> wrote:
> -- On CentOS 8, this bug is triggered with a hostname with 10+ characters.
> On MacOS 10.14.6, 19+ characters.
> INSERT INTO test.bug_report(hostname, device, mount, used_space_bytes,
> avail_space_bytes) VALUES ('12345678901234567890', 'devtmpfs', '/dev', 0,
> 6047076131313);commit;
> -- This should immediately crash the postgres service
--LOCATION: palloc, mcxt.c:934

Thanks for the report. This is certainly a bug.

I slightly simplified the test case to become:

drop table if exists crash;

CREATE TABLE crash (
id bigint generated by default as identity,
hostname varchar,
hostname_short varchar GENERATED ALWAYS AS (hostname) STORED,
device text,
mount text,
used_space_bytes bigint,
used_space_gb bigint GENERATED ALWAYS AS (used_space_bytes) STORED,
avail_space_bytes bigint,
avail_space_gb bigint GENERATED ALWAYS AS (avail_space_bytes) STORED,
inserted_dts timestamp with time zone NOT NULL DEFAULT clock_timestamp(),
inserted_by text NOT NULL DEFAULT session_user
);

INSERT INTO crash (hostname, device, mount, used_space_bytes,
avail_space_bytes) VALUES ('12345678901234567', 'devtmpfs', '/dev', 0,
6047076131313); -- no crash
INSERT INTO crash (hostname, device, mount, used_space_bytes,
avail_space_bytes) VALUES ('123456789012345678', 'devtmpfs', '/dev',
0, 6047076131313); -- crash

The crash occurs during ExecComputeStoredGenerated() when calculating
the hostname_short column. When we call ExecEvalExpr() for that
column, the eval function just returns the value of the Datum in the
"hostname" column, which is stored in "slot". This happens to be a
pointer into the tuple. This results in a crash because, later in that
function, we do ExecClearTuple(slot), which frees the memory for that
tuple.

I suppose it normally works because generally the calculation will be
using some SQL function which will return some calculated value which
is allocated, but in this case, we just return the pointer to the
memory in the tuple slot.

I'd say the fix should be to just datumCopy() the result of the
calculation. i.e the attached.

David

Attachment Content-Type Size
fix_generated_column_crash.patch application/octet-stream 661 bytes

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: cameron(dot)ezell(at)clearcapital(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org, peter(dot)eisentraut(at)2ndquadrant(dot)com
Subject: Re: BUG #16369: Segmentation Faults and Data Corruption with Generated Columns
Date: 2020-04-15 23:52:21
Message-ID: 1047.1586994741@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: Postg토토 사이트 추천SQL : Postg토토 사이트 추천SQL 메일 링리스트 : 2020-04-15 이후 PGSQL-BUGS 23:52

David Rowley <dgrowleyml(at)gmail(dot)com> writes:
> I'd say the fix should be to just datumCopy() the result of the
> calculation. i.e the attached.

That looks like it'll fail on a null result.

regards, tom lane


From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: cameron(dot)ezell(at)clearcapital(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org, peter(dot)eisentraut(at)2ndquadrant(dot)com
Subject: Re: BUG #16369: Segmentation Faults and Data Corruption with Generated Columns
Date: 2020-04-16 00:40:24
Message-ID: CAApHDvqFZZOwCi35xi02G=19df7idzminAhibrHEAz4McEnX=w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: PostgreSQL : PostgreSQL 메일 링리스트 : 2020-04-16 00:40 이후 PGSQL 토토 커뮤니티

On Thu, 16 Apr 2020 at 11:52, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> David Rowley <dgrowleyml(at)gmail(dot)com> writes:
> > I'd say the fix should be to just datumCopy() the result of the
> > calculation. i.e the attached.
>
> That looks like it'll fail on a null result.

Thanks. I'll have another go then.

Attachment Content-Type Size
fix_generated_column_crash2.patch application/octet-stream 666 bytes

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, cameron(dot)ezell(at)clearcapital(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org, peter(dot)eisentraut(at)2ndquadrant(dot)com
Subject: Re: BUG #16369: Segmentation Faults and Data Corruption with Generated Columns
Date: 2020-04-16 05:30:36
Message-ID: 20200416053036.GA81957@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: Postg토토 결과SQL : Postg토토 결과SQL 메일 링리스트 : 2020-04-16 이후 PGSQL-BUGS

On Thu, Apr 16, 2020 at 12:40:24PM +1200, David Rowley wrote:
> On Thu, 16 Apr 2020 at 11:52, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> That looks like it'll fail on a null result.
>
> Thanks. I'll have another go then.

Shouldn't you have more regression test cases here? I guess one for
the non-NULL case, and one for the NULL case?
--
Michael


From: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>, David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, cameron(dot)ezell(at)clearcapital(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #16369: Segmentation Faults and Data Corruption with Generated Columns
Date: 2020-04-17 09:04:46
Message-ID: 8e6df41a-d0a3-681f-1ac3-b0e2227103cc@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

On 2020-04-16 07:30, Michael Paquier wrote:
> On Thu, Apr 16, 2020 at 12:40:24PM +1200, David Rowley wrote:
>> On Thu, 16 Apr 2020 at 11:52, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> That looks like it'll fail on a null result.
>>
>> Thanks. I'll have another go then.
>
> Shouldn't you have more regression test cases here? I guess one for
> the non-NULL case, and one for the NULL case?

I've tried creating a smaller test case, but it's difficult. I haven't
fully understood the reason why it sometimes crashes and sometimes not.
The smallest I could come up with so far is something like this:

CREATE TABLE crash (
hostname varchar,
hostname_short varchar GENERATED ALWAYS AS (hostname) STORED,
device text,
mount text,
used_space_bytes bigint,
avail_space_bytes bigint
);
INSERT INTO crash (hostname, device, mount, used_space_bytes,
avail_space_bytes) VALUES (repeat('1234567890', 100), 'devtmpfs',
'/dev', 0, 6047076131313);

Would this be easier to reproduce with one of the memory-cleaning
#defines enabled?

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, cameron(dot)ezell(at)clearcapital(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #16369: Segmentation Faults and Data Corruption with Generated Columns
Date: 2020-04-17 10:01:26
Message-ID: CAApHDvrDuAngJCAjhBEsPtWAgm4RzQ67T5shvATKgyaysvX0wA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

On Fri, 17 Apr 2020 at 21:04, Peter Eisentraut
<peter(dot)eisentraut(at)2ndquadrant(dot)com> wrote:
>
> I've tried creating a smaller test case, but it's difficult. I haven't
> fully understood the reason why it sometimes crashes and sometimes not.
>
> The smallest I could come up with so far is something like this:
>
> CREATE TABLE crash (
> hostname varchar,
> hostname_short varchar GENERATED ALWAYS AS (hostname) STORED,
> device text,
> mount text,
> used_space_bytes bigint,
> avail_space_bytes bigint
> );
> INSERT INTO crash (hostname, device, mount, used_space_bytes,
> avail_space_bytes) VALUES (repeat('1234567890', 100), 'devtmpfs',
> '/dev', 0, 6047076131313);
>
> Would this be easier to reproduce with one of the memory-cleaning
> #defines enabled?

I've done a bit more work on this and have something close. The case I
came up with for it was:

+CREATE TABLE gtest_varlena (a varchar, b varchar GENERATED ALWAYS AS
(a) STORED);
+INSERT INTO gtest_varlena (a) VALUES('01234567890123456789');
+INSERT INTO gtest_varlena (a) VALUES(NULL);
+SELECT * FROM gtest_varlena ORDER BY a;
+ a | b
+----------------------+----------------------
+ 01234567890123456789 | 01234567890123456789
+ |
+(2 rows)
+
+DROP TABLE gtest_varlena;

With that, on an unpatched server, there's no crash, but there's
clearly junk in the b column.


From: Terry Schmitt <tschmitt(at)schmittworks(dot)com>
To: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, David Rowley <dgrowleyml(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, cameron(dot)ezell(at)clearcapital(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #16369: Segmentation Faults and Data Corruption with Generated Columns
Date: 2020-04-17 17:06:23
Message-ID: CAOOcyszqNBRSR66koLAaM+snYc=N-RgRRnsPXqnB7vw0ZKk3PQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: Postg토토 커뮤니티SQL : Postg토토 커뮤니티SQL 메일 링리스트 : 2020-04-17 이후 PGSQL-BUGS 17:06

I've tried creating a smaller test case, but it's difficult. I haven't
> fully understood the reason why it sometimes crashes and sometimes not.
> The smallest I could come up with so far is something like this:
>
>
For context, I work with Cameron, who submitted this bug.
I totally agree with Peter on this. We tried to simplify the test case on
our end, but the symptoms would change in odd ways. Changing the column
order would change the symptoms slightly and at times the columns with
DEFAULT values would be corrupt. It also seemed like string text based
columns would behave differently than integers.


From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, cameron(dot)ezell(at)clearcapital(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
Subject: Re: BUG #16369: Segmentation Faults and Data Corruption with Generated Columns
Date: 2020-04-18 02:13:17
Message-ID: CAApHDvpuQBiMCJsyfuqWWLCa2++QKkqttxJuh8F+98xVzNv14Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: Postg토토 꽁 머니SQL : Postg토토 꽁 머니SQL 메일 링리스트 : 2020-04-18 이후 PGSQL-BUGS.

On Thu, 16 Apr 2020 at 17:30, Michael Paquier <michael(at)paquier(dot)xyz> wrote:
> Shouldn't you have more regression test cases here? I guess one for
> the non-NULL case, and one for the NULL case?

Yeah. I've just pushed a fix that included a fairly simple regression test.

David


From: Michael Paquier <michael(at)paquier(dot)xyz>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, cameron(dot)ezell(at)clearcapital(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
Subject: Re: BUG #16369: Segmentation Faults and Data Corruption with Generated Columns
Date: 2020-04-18 09:28:56
Message-ID: 20200418092856.GE350229@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

On Sat, Apr 18, 2020 at 02:13:17PM +1200, David Rowley wrote:
> On Thu, 16 Apr 2020 at 17:30, Michael Paquier <michael(at)paquier(dot)xyz> wrote:
> > Shouldn't you have more regression test cases here? I guess one for
> > the non-NULL case, and one for the NULL case?
>
> Yeah. I've just pushed a fix that included a fairly simple regression test.

Thanks for the commit, David! I was wondering if it would have been
better to add a "\pset null", but that's a nit and the result looks
fine anyway.
--
Michael