Quick Links

Re: dsa_allocate() faliure

Lists:	pgsql-hackerspgsql-performance

From:	Jakub Glapa <jakub(dot)glapa(at)gmail(dot)com>
To:	pgsql-performance(at)postgresql(dot)org
Subject:	Re: dsa_allocate() faliure
Date:	2018-11-13 13:08:24
Message-ID:	CAJk1zg0x881aTwbEM8d-5EWVPfVWxN+Z56+parZ1WFc5DVcgmA@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers pgsql-performance

Hi, I'm also experiencing the problem: dsa_allocate could not find 7 free
pages CONTEXT: parallel worker

I'm running: PostgreSQL 10.5 (Ubuntu 10.5-1.pgdg16.04+1) on
x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 5.4.0-6ubuntu1~16.04.10) 5.4.0
20160609, 64-bit

query plan: (select statement over parent table to many partitions):
select ...
from fa
where c_id in (<ID_LIST>) and
datetime >= '2018/01/01'
and ((dims ? 'p' and dims ? 'mcp')
or (datasource in (FA', 'GA')))
and not datasource = 'm'
GROUP BY datasource, dims ->'ct', dims ->'mcp', dims -> 'p', dims -> 'sp':

Finalize GroupAggregate (cost=31514757.77..31519357.77 rows=40000 width=223)
Group Key: fa.datasource, ((fa.dims -> 'ct'::text)), ((fa.dims ->
'mcp'::text)), ((fa.dims -> 'p'::text)), ((fa.dims -> 'sp'::text))
-> Sort (cost=31514757.77..31515057.77 rows=120000 width=223)
Sort Key: fa.datasource, ((fa.dims -> 'ct'::text)), ((fa.dims
-> 'mcp'::text)), ((fa.dims -> 'p'::text)), ((fa.dims -> 'sp'::text))
-> Gather (cost=31491634.17..31504634.17 rows=120000 width=223)
Workers Planned: 3
-> Partial HashAggregate
(cost=31490634.17..31491634.17 rows=40000 width=223)
Group Key: fa.datasource, (fa.dims ->
'ct'::text), (fa.dims -> 'mcp'::text), (fa.dims -> 'p'::text),
(fa.dims -> 'sp'::text)
-> Result (cost=0.00..31364713.39 rows=5596479 width=175)
-> Append (cost=0.00..31252783.81
rows=5596479 width=659)
-> Parallel Seq Scan on fa
(cost=0.00..0.00 rows=1 width=580)
Filter: ((datetime >=
'2018-01-01 00:00:00+01'::timestamp with time zone) AND
((datasource)::text <> 'M'::text) AND (((dims ? 'p'::text) AND (dims ?
'mcp'::text)) OR ((datasource)::text =
ANY ('{"FA","GA"}'::text[]))) AND (c_id = ANY ('{<ID_LIST>}'::bigint[])))
-> Parallel Bitmap Heap Scan on
fa_10 (cost=1226.36..53641.49 rows=1 width=1290)
Recheck Cond: (datetime >=
'2018-01-01 00:00:00+01'::timestamp with time zone)
Filter: (((datasource)::text <>
'M'::text) AND (((dims ? 'p'::text) AND (dims ? 'mcp'::text)) OR
((datasource)::text = ANY ('<ID_LIST>'::bigint[])))
-> Bitmap Index Scan on
fa_10_rangestart (cost=0.00..1226.36 rows=32259 width=0)
Index Cond: (datetime >=
'2018-01-01 00:00:00+01'::timestamp with time zone)
-> Parallel Seq Scan on fa_105
(cost=0.00..11.99 rows=1 width=580)
Filter: ((datetime >=
'2018-01-01 00:00:00+01'::timestamp with time zone) AND
((datasource)::text <> 'M'::text) AND (((dims ? 'p'::text) AND (dims ?
'mcp'::text)) OR ((datasource)::text =
ANY ('{"FA","GA"}'::text[]))) AND (c_id = ANY ('<ID_LIST>'::bigint[])))
-> Parallel Seq Scan on fa_106
(cost=0.00..11.99 rows=1 width=580)
Filter: ((datetime >=
'2018-01-01 00:00:00+01'::timestamp with time zone) AND
((datasource)::text <> 'M'::text) AND (((dims ? 'p'::text) AND (dims ?
'mcp'::text)) OR ((datasource)::text =
ANY ('{"FA","GA"}'::text[]))) AND (c_id = ANY ('<ID_LIST>..........

--
regards,
Jakub Glapa

From:	Jakub Glapa <jakub(dot)glapa(at)gmail(dot)com>
To:	pgsql-performance(at)postgresql(dot)org
Subject:	Re: dsa_allocate() faliure
Date:	2018-11-21 14:26:42
Message-ID:	CAJk1zg10iCNsxFvQ4pgKe1B0rdjNG9iELA7AzLXjXnQm5T=KzQ@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers pgsql-performance

Looks like my email didn't match the right thread:
/message-id/flat/CAMAYy4%2Bw3NTBM5JLWFi8twhWK4%3Dk_5L4nV5%2BbYDSPu8r4b97Zg%40mail.gmail.com
Any chance to get some feedback on this?
--
regards,
Jakub Glapa

On Tue, Nov 13, 2018 at 2:08 PM Jakub Glapa <jakub(dot)glapa(at)gmail(dot)com> wrote:

> Hi, I'm also experiencing the problem: dsa_allocate could not find 7 free
> pages CONTEXT: parallel worker
>
> I'm running: PostgreSQL 10.5 (Ubuntu 10.5-1.pgdg16.04+1) on
> x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 5.4.0-6ubuntu1~16.04.10) 5.4.0
> 20160609, 64-bit
>
> query plan: (select statement over parent table to many partitions):
> select ...
> from fa
> where c_id in (<ID_LIST>) and
> datetime >= '2018/01/01'
> and ((dims ? 'p' and dims ? 'mcp')
> or (datasource in (FA', 'GA')))
> and not datasource = 'm'
> GROUP BY datasource, dims ->'ct', dims ->'mcp', dims -> 'p', dims -> 'sp':
>
> Finalize GroupAggregate (cost=31514757.77..31519357.77 rows=40000 width=223)
> Group Key: fa.datasource, ((fa.dims -> 'ct'::text)), ((fa.dims -> 'mcp'::text)), ((fa.dims -> 'p'::text)), ((fa.dims -> 'sp'::text))
> -> Sort (cost=31514757.77..31515057.77 rows=120000 width=223)
> Sort Key: fa.datasource, ((fa.dims -> 'ct'::text)), ((fa.dims -> 'mcp'::text)), ((fa.dims -> 'p'::text)), ((fa.dims -> 'sp'::text))
> -> Gather (cost=31491634.17..31504634.17 rows=120000 width=223)
> Workers Planned: 3
> -> Partial HashAggregate (cost=31490634.17..31491634.17 rows=40000 width=223)
> Group Key: fa.datasource, (fa.dims -> 'ct'::text), (fa.dims -> 'mcp'::text), (fa.dims -> 'p'::text), (fa.dims -> 'sp'::text)
> -> Result (cost=0.00..31364713.39 rows=5596479 width=175)
> -> Append (cost=0.00..31252783.81 rows=5596479 width=659)
> -> Parallel Seq Scan on fa (cost=0.00..0.00 rows=1 width=580)
> Filter: ((datetime >= '2018-01-01 00:00:00+01'::timestamp with time zone) AND ((datasource)::text <> 'M'::text) AND (((dims ? 'p'::text) AND (dims ? 'mcp'::text)) OR ((datasource)::text =
> ANY ('{"FA","GA"}'::text[]))) AND (c_id = ANY ('{<ID_LIST>}'::bigint[])))
> -> Parallel Bitmap Heap Scan on fa_10 (cost=1226.36..53641.49 rows=1 width=1290)
> Recheck Cond: (datetime >= '2018-01-01 00:00:00+01'::timestamp with time zone)
> Filter: (((datasource)::text <> 'M'::text) AND (((dims ? 'p'::text) AND (dims ? 'mcp'::text)) OR ((datasource)::text = ANY ('<ID_LIST>'::bigint[])))
> -> Bitmap Index Scan on fa_10_rangestart (cost=0.00..1226.36 rows=32259 width=0)
> Index Cond: (datetime >= '2018-01-01 00:00:00+01'::timestamp with time zone)
> -> Parallel Seq Scan on fa_105 (cost=0.00..11.99 rows=1 width=580)
> Filter: ((datetime >= '2018-01-01 00:00:00+01'::timestamp with time zone) AND ((datasource)::text <> 'M'::text) AND (((dims ? 'p'::text) AND (dims ? 'mcp'::text)) OR ((datasource)::text =
> ANY ('{"FA","GA"}'::text[]))) AND (c_id = ANY ('<ID_LIST>'::bigint[])))
> -> Parallel Seq Scan on fa_106 (cost=0.00..11.99 rows=1 width=580)
> Filter: ((datetime >= '2018-01-01 00:00:00+01'::timestamp with time zone) AND ((datasource)::text <> 'M'::text) AND (((dims ? 'p'::text) AND (dims ? 'mcp'::text)) OR ((datasource)::text =
> ANY ('{"FA","GA"}'::text[]))) AND (c_id = ANY ('<ID_LIST>..........
>
>
>
> --
> regards,
> Jakub Glapa
>

From:	Justin Pryzby <pryzby(at)telsasoft(dot)com>
To:	Jakub Glapa <jakub(dot)glapa(at)gmail(dot)com>
Cc:	pgsql-performance(at)postgresql(dot)org
Subject:	Re: dsa_allocate() faliure
Date:	2018-11-22 16:09:58
Message-ID:	20181122160958.GJ10913@telsasoft.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers Postg무지개 토토SQL : Postg무지개

On Wed, Nov 21, 2018 at 03:26:42PM +0100, Jakub Glapa wrote:
> Looks like my email didn't match the right thread:
> /message-id/flat/CAMAYy4%2Bw3NTBM5JLWFi8twhWK4%3Dk_5L4nV5%2BbYDSPu8r4b97Zg%40mail.gmail.com
> Any chance to get some feedback on this?

In the related thread, it looks like Thomas backpatched a fix to v10, and so I
guess this should be resolved in 10.6, which was released couple weeks ago.
/message-id/CAEepm%3D0QxoUSkFqYbvmxi2eNvvU6BkqH6fTOu4oOzc1MRAT4Dw%40mail.gmail.com

Could you upgrade and check ?

38763d67784c6563d08dbea5c9f913fa174779b8 in master

|commit ba20d392584cdecc2808fe936448d127f43f2c07
|Author: Thomas Munro <tmunro(at)postgresql(dot)org>
|Date: Thu Sep 20 15:52:39 2018 +1200
|
| Fix segment_bins corruption in dsa.c.

Justin

From:	Jakub Glapa <jakub(dot)glapa(at)gmail(dot)com>
To:	pryzby(at)telsasoft(dot)com
Cc:	pgsql-performance(at)postgresql(dot)org
Subject:	Re: dsa_allocate() faliure
Date:	2018-11-23 14:31:41
Message-ID:	CAJk1zg3nCqG2fo_vg2RwOW6s4TjKnnnqDTrFN4FQMVAT5XEbMQ@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers Postg범퍼카 토토SQL : Postg범퍼카

Hi Justin, I've upgrade to 10.6 but the error still shows up:

psql db(at)host as user => select version();
version

─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
PostgreSQL 10.6 (Ubuntu 10.6-1.pgdg16.04+1) on x86_64-pc-linux-gnu,
compiled by gcc (Ubuntu 5.4.0-6ubuntu1~16.04.10) 5.4.0 20160609, 64-bit
(1 row)

Time: 110.512 ms

psql db(at)host as user => select <COLUMNS> from fa where client_id in
(<IDS>) and datetime >= '2018/01/01' and ((dims ? 'p' and dimensions ?
'mcp') or (datasource in ('FA', 'GA'))) and not datasource = 'M' GROUP BY
datasource, dims ->'ct', dimensions ->'mct', dims -> 'p', dims -> 'sp';
ERROR: XX000: dsa_allocate could not find 7 free pages
CONTEXT: parallel worker
LOCATION: dsa_allocate_extended, dsa.c:729
Time: 131400.831 ms (02:11.401)

the above is execute with max_parallel_workers=8
If I set it to max_parallel_workers=0 I also get and my connection is being
closed (but the server is alive):

psql db(at)host as user => set max_parallel_workers=0;
SET
Time: 89.542 ms
psql db(at)host as user => SELECT <QUERY>;
FATAL: XX000: dsa_allocate could not find 7 free pages
LOCATION: dsa_allocate_extended, dsa.c:729
SSL connection has been closed unexpectedly
The connection to the server was lost. Attempting reset: Succeeded.
Time: 200390.466 ms (03:20.390)

--
regards,
Jakub Glapa

On Thu, Nov 22, 2018 at 5:10 PM Justin Pryzby <pryzby(at)telsasoft(dot)com> wrote:

> On Wed, Nov 21, 2018 at 03:26:42PM +0100, Jakub Glapa wrote:
> > Looks like my email didn't match the right thread:
> >
> /message-id/flat/CAMAYy4%2Bw3NTBM5JLWFi8twhWK4%3Dk_5L4nV5%2BbYDSPu8r4b97Zg%40mail.gmail.com
> > Any chance to get some feedback on this?
>
> In the related thread, it looks like Thomas backpatched a fix to v10, and
> so I
> guess this should be resolved in 10.6, which was released couple weeks ago.
>
> /message-id/CAEepm%3D0QxoUSkFqYbvmxi2eNvvU6BkqH6fTOu4oOzc1MRAT4Dw%40mail.gmail.com
>
> Could you upgrade and check ?
>
> 38763d67784c6563d08dbea5c9f913fa174779b8 in master
>
> |commit ba20d392584cdecc2808fe936448d127f43f2c07
> |Author: Thomas Munro <tmunro(at)postgresql(dot)org>
> |Date: Thu Sep 20 15:52:39 2018 +1200
> |
> | Fix segment_bins corruption in dsa.c.
>
> Justin
>

From:	Justin Pryzby <pryzby(at)telsasoft(dot)com>
To:	Jakub Glapa <jakub(dot)glapa(at)gmail(dot)com>
Cc:	pgsql-performance(at)postgresql(dot)org
Subject:	Re: dsa_allocate() faliure
Date:	2018-11-23 16:10:28
Message-ID:	20181123161028.GL10913@telsasoft.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	Postg스포츠 토토 사이트SQL Postg토토 핫SQL : Postg토토

On Fri, Nov 23, 2018 at 03:31:41PM +0100, Jakub Glapa wrote:
> Hi Justin, I've upgrade to 10.6 but the error still shows up:
>
> If I set it to max_parallel_workers=0 I also get and my connection is being
> closed (but the server is alive):
>
> psql db(at)host as user => set max_parallel_workers=0;

Can you show the plan (explain without analyze) for the nonparallel case?

Also, it looks like the server crashed in that case (even if it restarted
itself quickly). Can you confirm ?

For example: dmesg |tail might show "postmaster[8582]: segfault [...]" or
similar. And other clients would've been disconnected. (For example, you'd
get an error in another, previously-connected session the next time you run:
SELECT 1).

In any case, could you try to find a minimal way to reproduce the problem ? I
mean, is the dataset and query small and something you can publish, or can you
reproduce with data generated from (for example) generate_series() ?

Thanks,
Justin

From:	Jakub Glapa <jakub(dot)glapa(at)gmail(dot)com>
To:	pryzby(at)telsasoft(dot)com
Cc:	pgsql-performance(at)postgresql(dot)org
Subject:	Re: dsa_allocate() faliure
Date:	2018-11-26 15:26:45
Message-ID:	CAJk1zg0pdKJzH6=gDOK52SfLkbNTT=jBkRneCEzDNCEPPBVo6g@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers pgsql-performance

So, the issue occurs only on production db an right now I cannot reproduce
it.
I had a look at dmesg and indeed I see something like:

--
regards,
pozdrawiam,
Jakub Glapa

On Fri, Nov 23, 2018 at 5:10 PM Justin Pryzby <pryzby(at)telsasoft(dot)com> wrote:

> On Fri, Nov 23, 2018 at 03:31:41PM +0100, Jakub Glapa wrote:
> > Hi Justin, I've upgrade to 10.6 but the error still shows up:
> >
> > If I set it to max_parallel_workers=0 I also get and my connection is
> being
> > closed (but the server is alive):
> >
> > psql db(at)host as user => set max_parallel_workers=0;
>
> Can you show the plan (explain without analyze) for the nonparallel case?
>
> Also, it looks like the server crashed in that case (even if it restarted
> itself quickly). Can you confirm ?
>
> For example: dmesg |tail might show "postmaster[8582]: segfault [...]" or
> similar. And other clients would've been disconnected. (For example,
> you'd
> get an error in another, previously-connected session the next time you
> run:
> SELECT 1).
>
> In any case, could you try to find a minimal way to reproduce the problem
> ? I
> mean, is the dataset and query small and something you can publish, or can
> you
> reproduce with data generated from (for example) generate_series() ?
>
> Thanks,
> Justin
>

From:	Jakub Glapa <jakub(dot)glapa(at)gmail(dot)com>
To:	pryzby(at)telsasoft(dot)com
Cc:	pgsql-performance(at)postgresql(dot)org
Subject:	Re: dsa_allocate() faliure
Date:	2018-11-26 15:38:35
Message-ID:	CAJk1zg3ZXhDsFg7tQGJ3ZD6N9dp+Q1_DU2N3=s3Ywb-u6Lhc5A@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers pgsql-performance

sorry, the message was sent out to early.

So, the issue occurs only on production db an right now I cannot reproduce
it.
I had a look at dmesg and indeed I see something like:

postgres[30667]: segfault at 0 ip 0000557834264b16 sp 00007ffc2ce1e030
error 4 in postgres[557833db7000+6d5000]

and AFAIR other sessions I had opened at that time were indeed disconnected.

When it comes to the execution plan for max_parallel_workers=0.
There is no real difference.
I guess *max_parallel_workers *has no effect and
*max_parallel_workers_per_gather
*should have been used.
Why it caused a server crash is unknown right now.

I cannot really give a reproducible recipe.
My case is that I have a parent table with ~300 partitions.
And I initiate a select on ~100 of them with select [...] from fa where
client_id(<IDS>) and [filters].
I know this is not effective. Every partition has several indexes and this
query acquires a lot of locks... even for relations not used in the query.
PG11 should have better partition pruning mechanism but I'm not there yet
to upgrade.
Some of the partitions have millions of rows.

I'll keep observing maybe I'l find a pattern when this occurs.

--
regards,
pozdrawiam,
Jakub Glapa

On Mon, Nov 26, 2018 at 4:26 PM Jakub Glapa <jakub(dot)glapa(at)gmail(dot)com> wrote:

> So, the issue occurs only on production db an right now I cannot reproduce
> it.
> I had a look at dmesg and indeed I see something like:
>
>
> --
> regards,
> Jakub Glapa
>
>
> On Fri, Nov 23, 2018 at 5:10 PM Justin Pryzby <pryzby(at)telsasoft(dot)com>
> wrote:
>
>> On Fri, Nov 23, 2018 at 03:31:41PM +0100, Jakub Glapa wrote:
>> > Hi Justin, I've upgrade to 10.6 but the error still shows up:
>> >
>> > If I set it to max_parallel_workers=0 I also get and my connection is
>> being
>> > closed (but the server is alive):
>> >
>> > psql db(at)host as user => set max_parallel_workers=0;
>>
>> Can you show the plan (explain without analyze) for the nonparallel case?
>>
>> Also, it looks like the server crashed in that case (even if it restarted
>> itself quickly). Can you confirm ?
>>
>> For example: dmesg |tail might show "postmaster[8582]: segfault [...]" or
>> similar. And other clients would've been disconnected. (For example,
>> you'd
>> get an error in another, previously-connected session the next time you
>> run:
>> SELECT 1).
>>
>> In any case, could you try to find a minimal way to reproduce the problem
>> ? I
>> mean, is the dataset and query small and something you can publish, or
>> can you
>> reproduce with data generated from (for example) generate_series() ?
>>
>> Thanks,
>> Justin
>>
>

From:	Justin Pryzby <pryzby(at)telsasoft(dot)com>
To:	Jakub Glapa <jakub(dot)glapa(at)gmail(dot)com>
Cc:	pgsql-performance(at)postgresql(dot)org, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Subject:	Re: dsa_allocate() faliure
Date:	2018-11-26 15:52:08
Message-ID:	20181126155207.GZ10913@telsasoft.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers pgsql-performance

Hi, thanks for following through.

On Mon, Nov 26, 2018 at 04:38:35PM +0100, Jakub Glapa wrote:
> I had a look at dmesg and indeed I see something like:
>
> postgres[30667]: segfault at 0 ip 0000557834264b16 sp 00007ffc2ce1e030
> error 4 in postgres[557833db7000+6d5000]

That's useful, I think "at 0" means a null pointer dereferenced.

Can you check /var/log/messages (or ./syslog or similar) and verify the
timestamp matches the time of the last crash (and not an unrelated crash) ?

The logs might also indicate if the process dumped a core file anywhere.

I don't know what distribution/OS you're using, but it might be good to install
abrt (RHEL) or apport (ubuntu) or other mechanism to save coredumps, or to
manually configure /proc/sys/kernel/core_pattern.

On centos, I usually set:
/etc/abrt/abrt-action-save-package-data.conf
OpenGPGCheck = no

Also, it might be good to install debug symbols, in case you do find a core
dump now or get one later.

On centos: yum install postgresql10-debuginfo or debuginfo-install postgresql10-server
Make sure this exactly matches the debug symbols exactly match the server version.

Justin

From:	Jakub Glapa <jakub(dot)glapa(at)gmail(dot)com>
To:	pryzby(at)telsasoft(dot)com
Cc:	pgsql-performance(at)postgresql(dot)org, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Subject:	Re: dsa_allocate() faliure
Date:	2018-11-26 16:00:30
Message-ID:	CAJk1zg0GxWzkh1RESY0=RfoZ_20WaFyt2g7Me=cVr3VYngPthA@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers pgsql-performance

Justin thanks for the information!
I'm running Ubuntu 16.04.
I'll try to prepare for the next crash.
Couldn't find anything this time.

--
regards,
Jakub Glapa

On Mon, Nov 26, 2018 at 4:52 PM Justin Pryzby <pryzby(at)telsasoft(dot)com> wrote:

> Hi, thanks for following through.
>
> On Mon, Nov 26, 2018 at 04:38:35PM +0100, Jakub Glapa wrote:
> > I had a look at dmesg and indeed I see something like:
> >
> > postgres[30667]: segfault at 0 ip 0000557834264b16 sp 00007ffc2ce1e030
> > error 4 in postgres[557833db7000+6d5000]
>
> That's useful, I think "at 0" means a null pointer dereferenced.
>
> Can you check /var/log/messages (or ./syslog or similar) and verify the
> timestamp matches the time of the last crash (and not an unrelated crash) ?
>
> The logs might also indicate if the process dumped a core file anywhere.
>
> I don't know what distribution/OS you're using, but it might be good to
> install
> abrt (RHEL) or apport (ubuntu) or other mechanism to save coredumps, or to
> manually configure /proc/sys/kernel/core_pattern.
>
> On centos, I usually set:
> /etc/abrt/abrt-action-save-package-data.conf
> OpenGPGCheck = no
>
> Also, it might be good to install debug symbols, in case you do find a core
> dump now or get one later.
>
> On centos: yum install postgresql10-debuginfo or debuginfo-install
> postgresql10-server
> Make sure this exactly matches the debug symbols exactly match the server
> version.
>
> Justin
>

From:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To:	Jakub Glapa <jakub(dot)glapa(at)gmail(dot)com>
Cc:	pryzby(at)telsasoft(dot)com, pgsql-performance(at)postgresql(dot)org, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Subject:	Re: dsa_allocate() faliure
Date:	2018-11-26 18:45:09
Message-ID:	20181126184509.gge2vqmlu3twf72h@alvherre.pgsql
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers pgsql-performance

On 2018-Nov-26, Jakub Glapa wrote:

> Justin thanks for the information!
> I'm running Ubuntu 16.04.
> I'll try to prepare for the next crash.
> Couldn't find anything this time.

As I recall, the appport stuff in Ubuntu is terrible ... I've seen it
take 40 minutes to write the crash dump to disk, during which the
database was "down". I don't know why it is so slow (it's a rather
silly python script that apparently processes the core dump one byte at
a time, and you can imagine that with a few gigabytes of shared memory
that takes a while). Anyway my recommendation was to *remove* that
stuff from the server and make sure the core file is saved by normal
means.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

From:	Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
To:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc:	Jakub Glapa <jakub(dot)glapa(at)gmail(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: dsa_allocate() faliure
Date:	2018-11-27 03:00:34
Message-ID:	CAEepm=2McVb9t3cS0yfoKoxBXFxhbJrn5rApq6CSMDgQ0OUGww@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers pgsql-performance

On Tue, Nov 27, 2018 at 7:45 AM Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote:
> On 2018-Nov-26, Jakub Glapa wrote:
> > Justin thanks for the information!
> > I'm running Ubuntu 16.04.
> > I'll try to prepare for the next crash.
> > Couldn't find anything this time.
>
> As I recall, the appport stuff in Ubuntu is terrible ... I've seen it
> take 40 minutes to write the crash dump to disk, during which the
> database was "down". I don't know why it is so slow (it's a rather
> silly python script that apparently processes the core dump one byte at
> a time, and you can imagine that with a few gigabytes of shared memory
> that takes a while). Anyway my recommendation was to *remove* that
> stuff from the server and make sure the core file is saved by normal
> means.

Thanks for CC-ing me. I didn't see this thread earlier because I'm
not subscribed to -performance. Let's move it over to -hackers since
it looks like it's going to be a debugging exercise. So, reading
through the thread[1], I think there might be two independent problems
here:

1. Jakub has a many-partition Parallel Bitmap Heap Scan query that
segfaults when run with max_parallel_workers = 0. That sounds
suspiciously like an instance of a class of bug we've run into before.
We planned a parallel query, but were unable to launch one due to lack
of DSM slots or process slots, so we run the parallel plan in a kind
of degraded non-parallel mode that needs to cope with various pointers
into shared memory being NULL. A back trace from a core file should
hopefully make it very obvious what's going on.

2. The same query when run in real parallel query mode occasionally
reaches an error "dsa_allocate could not find 7 free pages", which
should not happen. This is on 10.6, so it has the commit "Fix
segment_bins corruption in dsa.c.".

Hmm. I will see if I can come up with a many-partition torture test
reproducer for this.

[1] /message-id/flat/CAJk1zg10iCNsxFvQ4pgKe1B0rdjNG9iELA7AzLXjXnQm5T%3DKzQ%40mail.gmail.com

--
Thomas Munro
http://www.enterprisedb.com

From:	Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
To:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc:	Jakub Glapa <jakub(dot)glapa(at)gmail(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: dsa_allocate() faliure
Date:	2018-11-27 08:02:29
Message-ID:	CAEepm=0_7TkVJq6z=-gw2236_mYgY4rRczrtf3ozFKcHW4FHEQ@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers pgsql-performance

On Tue, Nov 27, 2018 at 4:00 PM Thomas Munro
<thomas(dot)munro(at)enterprisedb(dot)com> wrote:
> Hmm. I will see if I can come up with a many-partition torture test
> reproducer for this.

No luck. I suppose one theory that could link both failure modes
would a buffer overrun, where in the non-shared case it trashes a
pointer that is later dereferenced, and in the shared case it writes
past the end of allocated 4KB pages and corrupts the intrusive btree
that lives in spare pages to track available space.

--
Thomas Munro
http://www.enterprisedb.com

From:	Jakub Glapa <jakub(dot)glapa(at)gmail(dot)com>
To:	Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Cc:	alvherre(at)2ndquadrant(dot)com, pryzby(at)telsasoft(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: dsa_allocate() faliure
Date:	2018-11-30 19:20:49
Message-ID:	CAJk1zg2kgnAbWAuM3oG20EN_Fvin2Z5OtWJHTR711S2jbNwQQA@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers pgsql-performance

Hi, just a small update.
I've configured the OS for taking crash dumps on Ubuntu 16.04 with the
following (maybe somebody will find it helpful):
I've added LimitCORE=infinity to /lib/systemd/system/postgresql(at)(dot)service
under [Service] section
I've reloaded the service config with sudo systemctl daemon-reload
Changed the core pattern to: sudo echo
/var/lib/postgresql/core.%p.sig%s.%ts | tee -a /proc/sys/kernel/core_pattern
I had tested it with kill -ABRT pidofbackend and it behaved correctly. A
crash dump was written.

In the last days I've been monitoring no segfault occurred but the
das_allocation did.
I'm starting to doubt if the segfault I've found in dmesg was actually
related.

I've grepped the postgres log for dsa_allocated:
Why do the messages occur sometimes as FATAL and sometimes as ERROR?

2018-11-29 07:59:06 CET::@:[20584]: FATAL: dsa_allocate could not find 7
free pages
2018-11-29 07:59:06 CET:127.0.0.1(40846):user(at)db:[19507]: ERROR:
dsa_allocate could not find 7 free pages
2018-11-30 09:04:13 CET::@:[27341]: FATAL: dsa_allocate could not find 13
free pages
2018-11-30 09:04:13 CET:127.0.0.1(41782):user(at)db:[25417]: ERROR:
dsa_allocate could not find 13 free pages
2018-11-30 09:28:38 CET::@:[30215]: FATAL: dsa_allocate could not find 4
free pages
2018-11-30 09:28:38 CET:127.0.0.1(45980):user(at)db:[29924]: ERROR:
dsa_allocate could not find 4 free pages
2018-11-30 16:37:16 CET::@:[14385]: FATAL: dsa_allocate could not find 7
free pages
2018-11-30 16:37:16 CET::@:[14375]: FATAL: dsa_allocate could not find 7
free pages
2018-11-30 16:37:16 CET:212.186.105.45(55004):user(at)db:[14386]: FATAL:
dsa_allocate could not find 7 free pages
2018-11-30 16:37:16 CET:212.186.105.45(54964):user(at)db:[14379]: ERROR:
dsa_allocate could not find 7 free pages
2018-11-30 16:37:16 CET:212.186.105.45(54916):user(at)db:[14370]: ERROR:
dsa_allocate could not find 7 free pages
2018-11-30 16:45:11 CET:212.186.105.45(55356):user(at)db:[14555]: FATAL:
dsa_allocate could not find 7 free pages
2018-11-30 16:49:13 CET::@:[15359]: FATAL: dsa_allocate could not find 7
free pages
2018-11-30 16:49:13 CET::@:[15363]: FATAL: dsa_allocate could not find 7
free pages
2018-11-30 16:49:13 CET:212.186.105.45(54964):user(at)db:[14379]: FATAL:
dsa_allocate could not find 7 free pages
2018-11-30 16:49:13 CET:212.186.105.45(54916):user(at)db:[14370]: ERROR:
dsa_allocate could not find 7 free pages
2018-11-30 16:49:13 CET:212.186.105.45(55842):user(at)db:[14815]: ERROR:
dsa_allocate could not find 7 free pages
2018-11-30 16:56:11 CET:212.186.105.45(57076):user(at)db:[15638]: FATAL:
dsa_allocate could not find 7 free pages

There's quite a bit errors from today but I was launching the problematic
query in parallel from 2-3 sessions.
Sometimes it was breaking sometimes not.
Couldn't find any pattern.
The workload on this db is not really constant, rather bursting.

--
regards,
Jakub Glapa

On Tue, Nov 27, 2018 at 9:03 AM Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
wrote:

> On Tue, Nov 27, 2018 at 4:00 PM Thomas Munro
> <thomas(dot)munro(at)enterprisedb(dot)com> wrote:
> > Hmm. I will see if I can come up with a many-partition torture test
> > reproducer for this.
>
> No luck. I suppose one theory that could link both failure modes
> would a buffer overrun, where in the non-shared case it trashes a
> pointer that is later dereferenced, and in the shared case it writes
> past the end of allocated 4KB pages and corrupts the intrusive btree
> that lives in spare pages to track available space.
>
> --
> Thomas Munro
> http://www.enterprisedb.com
>

From:	Justin Pryzby <pryzby(at)telsasoft(dot)com>
To:	Jakub Glapa <jakub(dot)glapa(at)gmail(dot)com>
Cc:	Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, alvherre(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: dsa_allocate() faliure
Date:	2018-11-30 20:46:47
Message-ID:	20181130204647.GK24746@telsasoft.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers pgsql-performance

On Fri, Nov 30, 2018 at 08:20:49PM +0100, Jakub Glapa wrote:
> In the last days I've been monitoring no segfault occurred but the
> das_allocation did.
> I'm starting to doubt if the segfault I've found in dmesg was actually
> related.

The dmesg looks like a real crash, not just OOM. You can hopefully find the
timestamp of the segfaults in /var/log/syslog, and compare with postgres logs
if they go back far enough. All the postgres processes except the parent
would've been restarted at that time.

> I've grepped the postgres log for dsa_allocated:
> Why do the messages occur sometimes as FATAL and sometimes as ERROR?

I believe it may depend if it happens in a parallel worker or the leader.

You may get more log detail if you enable CSV logging (although unfortunately
as I recall it doesn't indicate it's a parallel worker).

You could force it to dump core if you recompile postgres with an assert() (see
patch below).

You could build an .deb by running dpkg-buildpackage -rfakeroot or similar (i
haven't done this in awhile), or you could compile, install, and launch
debugging binaries from your homedir (or similar)

You'd want to compile the same version (git checkout REL_10_6) and with the
proper configure flags..perhaps starting with:
./configure --with-libxml --with-libxslt --enable-debug --prefix=$HOME/src/postgresql.bin --enable-cassert && time make && make install

Be careful if you have extensions installed that they still work.

Justin

--- a/src/backend/utils/mmgr/dsa.c
+++ b/src/backend/utils/mmgr/dsa.c
@@ -727,4 +727,7 @@ dsa_allocate_extended(dsa_area *area, size_t size, int flags)
if (!FreePageManagerGet(segment_map->fpm, npages, &first_page))
+ {
elog(FATAL,
"dsa_allocate could not find %zu free pages", npages);
+ abort()
+ }

From:	Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
To:	Justin Pryzby <pryzby(at)telsasoft(dot)com>
Cc:	Jakub Glapa <jakub(dot)glapa(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: dsa_allocate() faliure
Date:	2018-12-02 22:45:00
Message-ID:	CAEepm=1-Lo+98n7s1jXftEO2BhxFbpKSbPEhNiFkOooxe+ZBWg@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers pgsql-performance

On Sat, Dec 1, 2018 at 9:46 AM Justin Pryzby <pryzby(at)telsasoft(dot)com> wrote:
> elog(FATAL,
> "dsa_allocate could not find %zu free pages", npages);
> + abort()

If anyone can reproduce this problem with a debugger, it'd be
interesting to see the output of dsa_dump(area), and
FreePageManagerDump(segment_map->fpm). This error condition means
that get_best_segment() selected a segment from a segment bin that
holds segments with a certain minimum number of contiguous free pages
>= the requested number npages, but then FreePageManagerGet() found
that it didn't have npages of contiguous free memory after all when it
consulted the segment's btree of free space. Possible explanations
include: the segment bin lists are somehow messed up, the FPM in the
segment was corrupted by someone scribbling on free pages (which hold
the btree), the btree was corrupted by an incorrect sequence of
allocate/free calls (for example double frees, allocating from one
area and freeing to another etc), freepage.c fails to track its
largest size correctly.

There is a macro FPM_EXTRA_ASSERTS that can be defined to double-check
the largest contiguous page tracking. I have also been wondering
about a debug mode that would mprotect(PROT_READ) free pages when they
aren't being modified to detect unexpected writes, which should work
on systems that have 4k pages.

One thing I noticed is that it is failing on a "large" allocation,
where we go straight to the btree of 4k pages, but the equivalent code
where we allocate a superblock for "small" allocations doesn't report
the same kind of FATAL this-can't-happen error, it just fails the
allocation via the regular error path without explanation. I also
spotted a path that doesn't respect the DSA_ALLOC_NO_OOM flag (you get
a null pointer instead of an error). I should fix those
inconsistencies (draft patch attached), but those are incidental
problems AFAIK.

--
Thomas Munro
http://www.enterprisedb.com

Attachment	Content-Type	Size
fix-dsa-area-handling.patch	application/octet-stream	1.4 KB

From:	Justin Pryzby <pryzby(at)telsasoft(dot)com>
To:	Jakub Glapa <jakub(dot)glapa(at)gmail(dot)com>
Cc:	pgsql-performance(at)postgresql(dot)org, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Subject:	Re: dsa_allocate() faliure
Date:	2019-02-17 22:21:29
Message-ID:	20190217222129.GE28750@telsasoft.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers pgsql-performance

Hi,

On Mon, Nov 26, 2018 at 09:52:07AM -0600, Justin Pryzby wrote:
> Hi, thanks for following through.
>
> On Mon, Nov 26, 2018 at 04:38:35PM +0100, Jakub Glapa wrote:
> > I had a look at dmesg and indeed I see something like:
> >
> > postgres[30667]: segfault at 0 ip 0000557834264b16 sp 00007ffc2ce1e030
> > error 4 in postgres[557833db7000+6d5000]
>
> That's useful, I think "at 0" means a null pointer dereferenced.

Thomas fixed several bugs in DSA, which will be in next release, postgres 10.8
and 11.3.

However that doesn't explain the segfault you saw, and I don't see anything
which looks relevant changed since in 10.5.

If you still see that using the latest minor release (10.7), please try to
capture a core file and send a backtrace with a new thread on pgsql-hackers.

Thanks,
Justin

From:	Jakub Glapa <jakub(dot)glapa(at)gmail(dot)com>
To:	Justin Pryzby <pryzby(at)telsasoft(dot)com>
Cc:	pgsql-performance(at)postgresql(dot)org, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Subject:	Re: dsa_allocate() faliure
Date:	2019-02-18 09:11:10
Message-ID:	CAJk1zg2uLmyq7FfD2fDWsHV=E8UBA=aaAMvjTLuoFHCB92X6CA@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers pgsql-performance

Hi I just checked the dmesg.
The segfault I wrote about is the only one I see, dated Nov 24 last year.
Since then no other segfaults happened although dsa_allocated failures
happen daily.
I'll report if anything occurs.
I have the core dumping setup in place.

--
regards,
pozdrawiam,
Jakub Glapa

On Sun, Feb 17, 2019 at 11:21 PM Justin Pryzby <pryzby(at)telsasoft(dot)com> wrote:

> Hi,
>
> On Mon, Nov 26, 2018 at 09:52:07AM -0600, Justin Pryzby wrote:
> > Hi, thanks for following through.
> >
> > On Mon, Nov 26, 2018 at 04:38:35PM +0100, Jakub Glapa wrote:
> > > I had a look at dmesg and indeed I see something like:
> > >
> > > postgres[30667]: segfault at 0 ip 0000557834264b16 sp 00007ffc2ce1e030
> > > error 4 in postgres[557833db7000+6d5000]
> >
> > That's useful, I think "at 0" means a null pointer dereferenced.
>
> Thomas fixed several bugs in DSA, which will be in next release, postgres
> 10.8
> and 11.3.
>
> However that doesn't explain the segfault you saw, and I don't see anything
> which looks relevant changed since in 10.5.
>
> If you still see that using the latest minor release (10.7), please try to
> capture a core file and send a backtrace with a new thread on
> pgsql-hackers.
>
> Thanks,
> Justin
>