Re: RFC: Logging plan of the running query

From: Ekaterina Sokolova <e(dot)sokolova(at)postgrespro(dot)ru>
To: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: RFC: Logging plan of the running query
Date: 2021-11-02 11:32:59
Message-ID: d68c3ae31672664876b22d2dcbb526d2@postgrespro.ru
Lists: pgsql-hackers

Hi!

I'm here to answer your questions about contrib/pg_query_state.
>> I only took a quick look at pg_query_state, so I have some questions.

>> pg_query_state seems to use shm_mq to expose the plan information, but
>> there was a discussion that this kind of architecture would be tricky
>> to do properly [1].
>> Does pg_query_state handle difficulties listed on the discussion?
> [1]
> /message-id/9a50371e15e741e295accabc72a41df1%40oss.nttdata.com

I doubt that it was the right link.
But on the topic I will say that the extension really does use shared
memory; the interaction is implemented by sending/receiving messages. This
architecture provides the required reliability and convenience.
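
For illustration, the sender side of such a message passing looks roughly
like this; a simplified sketch using the core shm_mq API (PG14-era
signatures), not the exact pg_query_state code, and the DSM setup and
plan_text are assumed:

shm_mq *mq;
shm_mq_handle *mqh;

/* carve a ring buffer out of a DSM segment both processes have mapped */
mq = shm_mq_create(dsm_segment_address(seg), 16384);
shm_mq_set_sender(mq, MyProc);
mqh = shm_mq_attach(mq, seg, NULL);

/* nowait = true: never block if the receiver is not draining the queue */
if (shm_mq_send(mqh, strlen(plan_text) + 1, plan_text, true) ==
    SHM_MQ_WOULD_BLOCK)
    elog(LOG, "receiver not ready, giving up instead of blocking");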

>> It seems the caller of pg_query_state() has to wait until the
>> target process pushes the plan information into shared memory; can it
>> lead to deadlock situations?
>> I came up with this question because when trying to make a view for
>> memory contexts of other backends, we encountered deadlock situations.
>> In the end, we gave up on the view design and adopted sending a signal
>> and logging.
>
> Discussion at the following URL.
> /message-id/9a50371e15e741e295accabc72a41df1%40oss.nttdata.com

Before extracting information about another process we check its state.
Information will only be retrieved from a process that is willing to
provide it. Otherwise, we receive an error message saying that the query
execution statistics and process status cannot be obtained. There is also
a check against extracting your own status. This is even verified in
tests.

Thanks for your attention.
Just in case, I am ready to discuss this topic in more detail.

About overhead:
> I haven't measured it yet, but I believe that the overhead for backends
> on which pg_log_current_plan() is not called would be slight, since the
> patch just adds the logic for saving the QueryDesc in ExecutorRun().
> The overhead for backends on which pg_log_current_plan() is called might
> not be slight, but since the target process is assumed to be dealing
> with a long-running query whose plan the user wants to know, the
> overhead would be worth the cost.
I think it would be useful for us to have a couple of examples with
different numbers of rows, compared to running without this functionality.

Hope this helps.

--
Ekaterina Sokolova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


From: Daniel Gustafsson <daniel(at)yesql(dot)se>
To: Ekaterina Sokolova <e(dot)sokolova(at)postgrespro(dot)ru>
Cc: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: RFC: Logging plan of the running query
Date: 2021-11-03 12:48:51
Message-ID: A5B5A615-A055-42E6-A575-B83938A8EA9E@yesql.se
Lists: pgsql-hackers

This patch no longer applies on top of HEAD; please submit a rebased version.

--
Daniel Gustafsson https://vmware.com/


From: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
To: Ekaterina Sokolova <e(dot)sokolova(at)postgrespro(dot)ru>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: RFC: Logging plan of the running query
Date: 2021-11-04 12:49:21
Message-ID: 577370a1bd9f622e41f6cda8ed895534@oss.nttdata.com
Lists: pgsql-hackers

On 2021-11-02 20:32, Ekaterina Sokolova wrote:
Thanks for your response!

> Hi!
>
> I'm here to answer your questions about contrib/pg_query_state.
>>> I only took a quick look at pg_query_state, so I have some questions.
>
>>> pg_query_state seems to use shm_mq to expose the plan information, but
>>> there was a discussion that this kind of architecture would be tricky
>>> to do properly [1].
>>> Does pg_query_state handle difficulties listed on the discussion?
>> [1]
>> /message-id/9a50371e15e741e295accabc72a41df1%40oss.nttdata.com
>
> I doubt that it was the right link.

Sorry for making you confused; here is the link.


/message-id/CA%2BTgmobkpFV0UB67kzXuD36--OFHwz1bs%3DL_6PZbD4nxKqUQMw%40mail.gmail.com

> But on the topic I will say that the extension really does use shared
> memory; the interaction is implemented by sending/receiving messages. This
> architecture provides the required reliability and convenience.

As described in the link, using shared memory for this kind of work
would need DSM, and it would also be necessary to exchange information
between the requestor and the responder.

For example, when I took a quick look at the pg_query_state code, it
looks like the size of the queue is fixed at QUEUE_SIZE, and I wonder
how plans that exceed QUEUE_SIZE are handled.

>>> It seems the caller of pg_query_state() has to wait until the
>>> target process pushes the plan information into shared memory; can it
>>> lead to deadlock situations?
>>> I came up with this question because when trying to make a view for
>>> memory contexts of other backends, we encountered deadlock situations.
>>> In the end, we gave up on the view design and adopted sending a signal
>>> and logging.
>>
>> Discussion at the following URL.
>> /message-id/9a50371e15e741e295accabc72a41df1%40oss.nttdata.com
>
> Before extracting information about another process we check its state.
> Information will only be retrieved from a process that is willing to
> provide it. Otherwise, we receive an error message saying that the query
> execution statistics and process status cannot be obtained. There is
> also a check against extracting your own status. This is even verified
> in tests.
>
> Thanks for your attention.
> Just in case, I am ready to discuss this topic in more detail.

I imagined the following procedure.
Does it cause a deadlock in pg_query_state?

- session1
BEGIN; TRUNCATE t;

- session2
BEGIN; TRUNCATE t; -- wait

- session1
SELECT * FROM pg_query_state(<pid of session2>); -- wait and deadlocked?

> About overhead:
>> I haven't measured it yet, but I believe that the overhead for backends
>> on which pg_log_current_plan() is not called would be slight, since the
>> patch just adds the logic for saving the QueryDesc in ExecutorRun().
>> The overhead for backends on which pg_log_current_plan() is called might
>> not be slight, but since the target process is assumed to be dealing
>> with a long-running query whose plan the user wants to know, the
>> overhead would be worth the cost.
> I think it would be useful for us to have a couple of examples with
> different numbers of rows, compared to running without this functionality.

Do you have any expectation that the number of rows would affect the
performance of this functionality?
This patch adds some code to ExecutorRun(), but I thought the number of
rows would not impact the performance.
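
To be concrete, what the patch adds to ExecutorRun() amounts to a pointer
assignment, along these lines (a sketch of the idea, not the exact hunk;
ActiveQueryDesc is the patch's variable):

void
ExecutorRun(QueryDesc *queryDesc, ScanDirection direction,
            uint64 count, bool execute_once)
{
    /* O(1): just remember which query is running; no per-row work */
    ActiveQueryDesc = queryDesc;

    if (ExecutorRun_hook)
        (*ExecutorRun_hook) (queryDesc, direction, count, execute_once);
    else
        standard_ExecutorRun(queryDesc, direction, count, execute_once);
}

So the cost is paid once per ExecutorRun() call and is independent of how
many rows the query processes.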

--
Regards,

--
Atsushi Torikoshi
NTT DATA CORPORATION


From: Ekaterina Sokolova <e(dot)sokolova(at)postgrespro(dot)ru>
To: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: RFC: Logging plan of the running query
Date: 2021-11-17 13:44:51
Message-ID: d5b390ba34e9914a3f73a8be936000a8@postgrespro.ru
Lists: pgsql-hackers

Hi!

You forgot my last fix to build correctly on Mac. I have added it.

About our discussion of pg_query_state:

torikoshia wrote on 2021-11-04 15:49:
>> I doubt that it was the right link.
> Sorry for making you confused; here is the link.
> /message-id/CA%2BTgmobkpFV0UB67kzXuD36--OFHwz1bs%3DL_6PZbD4nxKqUQMw%40mail.gmail.com

Thank you. I'll see it soon.

> I imagined the following procedure.
> Does it cause a deadlock in pg_query_state?
>
> - session1
> BEGIN; TRUNCATE t;
>
> - session2
> BEGIN; TRUNCATE t; -- wait
>
> - session1
> SELECT * FROM pg_query_state(<pid of session2>); -- wait and deadlocked?

As far as I know, pg_query_state uses non-blocking reads and writes. I
have written a few tests trying to deadlock it (on version 14), but all
finished correctly.
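
For what it's worth, the non-blocking receive loop I mean looks roughly
like this (a sketch, not the exact pg_query_state code; mqh is an
already-attached shm_mq handle):

for (;;)
{
    Size        nbytes;
    void       *data;
    shm_mq_result res;

    /* nowait = true, so a stuck peer cannot block us forever */
    res = shm_mq_receive(mqh, &nbytes, &data, true);
    if (res == SHM_MQ_SUCCESS)
        break;                  /* got the whole message */
    if (res == SHM_MQ_DETACHED)
        ereport(ERROR, (errmsg("peer detached from the queue")));

    /* SHM_MQ_WOULD_BLOCK: wait on the latch, staying interruptible */
    CHECK_FOR_INTERRUPTS();
    (void) WaitLatch(MyLatch, WL_LATCH_SET | WL_EXIT_ON_PM_DEATH, 0,
                     PG_WAIT_EXTENSION);
    ResetLatch(MyLatch);
}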

Have a nice day. Please feel free to contact me if you need any further
information.

--
Ekaterina Sokolova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment Content-Type Size
v13-0002-log-running-query-plan.patch text/x-diff 28.4 KB

From: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
To: Ekaterina Sokolova <e(dot)sokolova(at)postgrespro(dot)ru>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: RFC: Logging plan of the running query
Date: 2021-11-26 03:39:48
Message-ID: 2995aecc0414faf3df3e6fe437d5fe55@oss.nttdata.com
Lists: pgsql-hackers

On 2021-11-17 22:44, Ekaterina Sokolova wrote:
> Hi!
>
> You forgot my last fix to build correctly on Mac. I have added it.

Thanks for the notification!
Since the patch could no longer be applied to HEAD, I also updated it.

>
> About our discussion of pg_query_state:
>
> torikoshia wrote on 2021-11-04 15:49:
>>> I doubt that it was the right link.
>> Sorry for making you confused; here is the link.
>> /message-id/CA%2BTgmobkpFV0UB67kzXuD36--OFHwz1bs%3DL_6PZbD4nxKqUQMw%40mail.gmail.com
>
> Thank you. I'll see it soon.
>
>> I imagined the following procedure.
>> Does it cause a deadlock in pg_query_state?
>>
>> - session1
>> BEGIN; TRUNCATE t;
>>
>> - session2
>> BEGIN; TRUNCATE t; -- wait
>>
>> - session1
>> SELECT * FROM pg_query_state(<pid of session2>); -- wait and deadlocked?
>
> As far as I know, pg_query_state uses non-blocking reads and writes. I
> have written a few tests trying to deadlock it (on version 14), but all
> finished correctly.
>
> Have a nice day. Please feel free to contact me if you need any
> further information.

Thanks for your information and help!

--
Regards,

--
Atsushi Torikoshi
NTT DATA CORPORATION

Attachment Content-Type Size
v14-0001-log-running-query-plan.patch text/x-diff 28.4 KB

From: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: RFC: Logging plan of the running query
Date: 2022-01-07 05:30:37
Message-ID: 1e6252dc1b1fd734b3e79505341bf039@oss.nttdata.com
Lists: pgsql-hackers

On 2021-11-26 12:39, torikoshia wrote:
> Since the patch could no longer be applied to HEAD, I also updated it.

Updated the patch to fix a compiler warning about the format on
Windows.

--
Regards,

--
Atsushi Torikoshi
NTT DATA CORPORATION

Attachment Content-Type Size
v15-0001-log-running-query-plan.patch text/x-diff 28.4 KB

From: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: RFC: Logging plan of the running query
Date: 2022-01-07 11:58:19
Message-ID: 41578d623a4bb0454fe66fc4c638d32b@oss.nttdata.com
Lists: pgsql-hackers

On 2022-01-07 14:30, torikoshia wrote:

> Updated the patch to fix a compiler warning about the format on
> Windows.

I got another compiler warning, updated the patch again.

--
Regards,

--
Atsushi Torikoshi
NTT DATA CORPORATION

Attachment Content-Type Size
v16-0001-log-running-query-plan.patch text/x-diff 28.4 KB

From: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
To: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: RFC: Logging plan of the running query
Date: 2022-01-07 12:54:31
Message-ID: 57408d96-ff7e-340b-f0d6-fba06ec95500@oss.nttdata.com
Lists: pgsql-hackers

On 2022/01/07 20:58, torikoshia wrote:
> On 2022-01-07 14:30, torikoshia wrote:
>
>> Updated the patch to fix a compiler warning about the format on Windows.
>
> I got another compiler warning, updated the patch again.

Thanks for updating the patch!

I ran the following query every 0.1s using psql's \watch command from three different sessions while the make installcheck test was running.

SELECT pg_log_query_plan(pid) FROM pg_stat_activity;

And then I got the segmentation fault as follows.

2022-01-07 21:40:32 JST [postmaster] LOG: server process (PID 51017) was terminated by signal 11: Segmentation fault: 11
2022-01-07 21:40:32 JST [postmaster] DETAIL: Failed process was running: select description, (test_conv(inbytes, 'utf8', 'utf8')).* from utf8_verification_inputs;
2022-01-07 21:40:32 JST [postmaster] LOG: terminating any other active server processes
2022-01-07 21:40:32 JST [postmaster] LOG: all server processes terminated; reinitializing

The backtrace I got from the core file was:

(lldb) target create --core "/cores/core.51017"
Core file '/cores/core.51017' (x86_64) was loaded.
(lldb) bt
* thread #1, stop reason = signal SIGSTOP
* frame #0: 0x00007fff20484552 libsystem_platform.dylib`_platform_strlen + 18
frame #1: 0x000000011076e36c postgres`dopr(target=0x00007ffedfd1b1d8, format="\n", args=0x00007ffedfd1b450) at snprintf.c:444:20
frame #2: 0x000000011076e133 postgres`pg_vsnprintf(str="Query Text: \x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7
f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f"..., count=1024, fmt="%s: %s\n", args=0x00007ffedfd1b450) at snprintf.c:195:2
frame #3: 0x000000011075e51d postgres`pvsnprintf(buf="Query Text: \x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\
x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f"..., len=1024, fmt="%s: %s\n", args=0x00007ffedfd1b450) at psprintf.c:110:13
frame #4: 0x000000011076052d postgres`appendStringInfoVA(str=0x00007fe56804bec0, fmt="%s: %s\n", args=0x00007ffedfd1b450) at stringinfo.c:149:13
frame #5: 0x0000000110760449 postgres`appendStringInfo(str=0x00007fe56804bec0, fmt="%s: %s\n") at stringinfo.c:103:12
frame #6: 0x000000011016f6cf postgres`ExplainProperty(qlabel="Query Text", unit=0x0000000000000000, value="", numeric=false, es=0x00007fe56804be28) at explain.c:4390:5
frame #7: 0x000000011016eace postgres`ExplainPropertyText(qlabel="Query Text", value="", es=0x00007fe56804be28) at explain.c:4435:2
frame #8: 0x000000011016ea97 postgres`ExplainQueryText(es=0x00007fe56804be28, queryDesc=0x00007fe568022320) at explain.c:969:3
frame #9: 0x000000011016faeb postgres`ProcessLogCurrentPlanInterrupt at explain.c:5009:2
frame #10: 0x00000001104dda84 postgres`ProcessInterrupts at postgres.c:3373:3
frame #11: 0x0000000110c81c57 plpgsql.so`exec_stmts(estate=0x00007ffedfd1bab8, stmts=0x00007fe568050e00) at pl_exec.c:1994:3
frame #12: 0x0000000110c80a78 plpgsql.so`exec_stmt_block(estate=0x00007ffedfd1bab8, block=0x00007fe56801c570) at pl_exec.c:1777:9
frame #13: 0x0000000110c81c91 plpgsql.so`exec_stmts(estate=0x00007ffedfd1bab8, stmts=0x00007fe56801c5c8) at pl_exec.c:1999:10
frame #14: 0x0000000110c80f21 plpgsql.so`exec_stmt_block(estate=0x00007ffedfd1bab8, block=0x00007fe56801c678) at pl_exec.c:1926:8
frame #15: 0x0000000110c7ed92 plpgsql.so`exec_toplevel_block(estate=0x00007ffedfd1bab8, block=0x00007fe56801c678) at pl_exec.c:1617:7
frame #16: 0x0000000110c7d3e4 plpgsql.so`plpgsql_exec_function(func=0x00007fe56a00aa20, fcinfo=0x00007fe56803c888, simple_eval_estate=0x0000000000000000, simple_eval_resowner=0x0000000000000000, procedure_resowner=0x0000000000000000, atomic=true) at pl_exec.c:611:7
frame #17: 0x0000000110c9bf12 plpgsql.so`plpgsql_call_handler(fcinfo=0x00007fe56803c888) at pl_handler.c:277:13
frame #18: 0x0000000110228b06 postgres`ExecInterpExpr(state=0x00007fe56803bf88, econtext=0x00007fe56803ba70, isnull=0x00007ffedfd1bf7f) at execExprInterp.c:725:8
frame #19: 0x000000011024858b postgres`ExecEvalExprSwitchContext(state=0x00007fe56803bf88, econtext=0x00007fe56803ba70, isNull=0x00007ffedfd1bf7f) at executor.h:339:13
frame #20: 0x00000001102482aa postgres`ExecProject(projInfo=0x00007fe56803bf80) at executor.h:373:9
frame #21: 0x0000000110247e2f postgres`ExecScan(node=0x00007fe56803b958, accessMtd=(postgres`SeqNext at nodeSeqscan.c:51), recheckMtd=(postgres`SeqRecheck at nodeSeqscan.c:90)) at execScan.c:238:12
frame #22: 0x000000011028c575 postgres`ExecSeqScan(pstate=0x00007fe56803b958) at nodeSeqscan.c:112:9
frame #23: 0x000000011023bbe2 postgres`ExecProcNode(node=0x00007fe56803b958) at executor.h:257:9
frame #24: 0x0000000110237471 postgres`ExecutePlan(estate=0x00007fe56803b720, planstate=0x00007fe56803b958, use_parallel_mode=false, operation=CMD_SELECT, sendTuples=true, numberTuples=0, direction=ForwardScanDirection, dest=0x00007fe568046ac8, execute_once=true) at execMain.c:1561:10
frame #25: 0x0000000110237320 postgres`standard_ExecutorRun(queryDesc=0x00007fe568037320, direction=ForwardScanDirection, count=0, execute_once=true) at execMain.c:371:3
frame #26: 0x00000001102370b8 postgres`ExecutorRun(queryDesc=0x00007fe568037320, direction=ForwardScanDirection, count=0, execute_once=true) at execMain.c:313:3
frame #27: 0x00000001104e6bdb postgres`PortalRunSelect(portal=0x00007fe567825920, forward=true, count=0, dest=0x00007fe568046ac8) at pquery.c:921:4
frame #28: 0x00000001104e65ba postgres`PortalRun(portal=0x00007fe567825920, count=9223372036854775807, isTopLevel=true, run_once=true, dest=0x00007fe568046ac8, altdest=0x00007fe568046ac8, qc=0x00007ffedfd1c350) at pquery.c:765:18
frame #29: 0x00000001104e1861 postgres`exec_simple_query(query_string="select description, (test_conv(inbytes, 'utf8', 'utf8')).* from utf8_verification_inputs;") at postgres.c:1216:10
frame #30: 0x00000001104e099c postgres`PostgresMain(dbname="regression", username="postgres") at postgres.c:4505:7
frame #31: 0x00000001103ee2a2 postgres`BackendRun(port=0x00007fe567704080) at postmaster.c:4594:2
frame #32: 0x00000001103ed868 postgres`BackendStartup(port=0x00007fe567704080) at postmaster.c:4322:3
frame #33: 0x00000001103ec64c postgres`ServerLoop at postmaster.c:1802:7
frame #34: 0x00000001103e9e29 postgres`PostmasterMain(argc=3, argv=0x00007fe567405e70) at postmaster.c:1474:11
frame #35: 0x00000001102c8b69 postgres`main(argc=3, argv=0x00007fe567405e70) at main.c:198:3
frame #36: 0x00007fff2045cf3d libdyld.dylib`start + 1

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION


From: Julien Rouhaud <rjuju123(at)gmail(dot)com>
To: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
Cc: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: RFC: Logging plan of the running query
Date: 2022-01-14 06:38:46
Message-ID: 20220114063846.7jygvulyu6g23kdv@jrouhaud
Lists: pgsql-hackers

Hi,

On Fri, Jan 07, 2022 at 09:54:31PM +0900, Fujii Masao wrote:
>
> On 2022/01/07 20:58, torikoshia wrote:
> > On 2022-01-07 14:30, torikoshia wrote:
> >
> > > Updated the patch to fix a compiler warning about the format on Windows.
> >
> > I got another compiler warning, updated the patch again.
>
> Thanks for updating the patch!
>
> I ran the following query every 0.1s using psql's \watch command from three different sessions while the make installcheck test was running.
>
> SELECT pg_log_query_plan(pid) FROM pg_stat_activity;
>
>
> And then I got the segmentation fault as follows.
>
> 2022-01-07 21:40:32 JST [postmaster] LOG: server process (PID 51017) was terminated by signal 11: Segmentation fault: 11
> 2022-01-07 21:40:32 JST [postmaster] DETAIL: Failed process was running: select description, (test_conv(inbytes, 'utf8', 'utf8')).* from utf8_verification_inputs;
> 2022-01-07 21:40:32 JST [postmaster] LOG: terminating any other active server processes
> 2022-01-07 21:40:32 JST [postmaster] LOG: all server processes terminated; reinitializing
>
>
> The backtrace I got from the core file was:
>
> (lldb) target create --core "/cores/core.51017"
> Core file '/cores/core.51017' (x86_64) was loaded.
> (lldb) bt
> * thread #1, stop reason = signal SIGSTOP
> * frame #0: 0x00007fff20484552 libsystem_platform.dylib`_platform_strlen + 18
> frame #1: 0x000000011076e36c postgres`dopr(target=0x00007ffedfd1b1d8, format="\n", args=0x00007ffedfd1b450) at snprintf.c:444:20
> frame #2: 0x000000011076e133 postgres`pg_vsnprintf(str="Query Text: \x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\
x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f"..., count=1024, fmt="%s: %s\n", args=0x00007ffedfd1b450) at snprintf.c:195:2

On top of the visibly use-after-pfree memory issue, the patch doesn't apply
anymore:

http://cfbot.cputube.org/patch_36_3142.log
=== Applying patches on top of PostgreSQL commit ID a18b6d2dc288dfa6e7905ede1d4462edd6a8af47 ===
=== applying patch ./v16-0001-log-running-query-plan.patch
[...]
patching file src/test/regress/expected/misc_functions.out
Hunk #1 FAILED at 134.
Hunk #2 succeeded at 163 (offset 7 lines).
Hunk #3 succeeded at 180 (offset 7 lines).
1 out of 3 hunks FAILED -- saving rejects to file src/test/regress/expected/misc_functions.out.rej
patching file src/test/regress/sql/misc_functions.sql
Hunk #1 FAILED at 31.
1 out of 1 hunk FAILED -- saving rejects to file src/test/regress/sql/misc_functions.sql.rej

I'm switching the patch to Waiting on Author.


From: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
To: Julien Rouhaud <rjuju123(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: RFC: Logging plan of the running query
Date: 2022-01-24 05:33:05
Message-ID: fb6360e4c0ffbdd0a6698b92bb2617e2@oss.nttdata.com
Lists: pgsql-hackers

On 2022-01-14 15:38, Julien Rouhaud wrote:
> Hi,
>
> On Fri, Jan 07, 2022 at 09:54:31PM +0900, Fujii Masao wrote:

>> I ran the following query every 0.1s using psql's \watch command from
>> three different sessions while the make installcheck test was running.
>>
>> SELECT pg_log_query_plan(pid) FROM pg_stat_activity;
>>
>>
>> And then I got the segmentation fault as follows.
>>
>> 2022-01-07 21:40:32 JST [postmaster] LOG: server process (PID 51017)
>> was terminated by signal 11: Segmentation fault: 11
>> 2022-01-07 21:40:32 JST [postmaster] DETAIL: Failed process was
>> running: select description, (test_conv(inbytes, 'utf8', 'utf8')).*
>> from utf8_verification_inputs;
>> 2022-01-07 21:40:32 JST [postmaster] LOG: terminating any other
>> active server processes
>> 2022-01-07 21:40:32 JST [postmaster] LOG: all server processes
>> terminated; reinitializing
>>
>>
>> The backtrace I got from the core file was:
>>
>> (lldb) target create --core "/cores/core.51017"
>> Core file '/cores/core.51017' (x86_64) was loaded.
>> (lldb) bt
>> * thread #1, stop reason = signal SIGSTOP
>> * frame #0: 0x00007fff20484552
>> libsystem_platform.dylib`_platform_strlen + 18
>> frame #1: 0x000000011076e36c
>> postgres`dopr(target=0x00007ffedfd1b1d8, format="\n",
>> args=0x00007ffedfd1b450) at snprintf.c:444:20
>> frame #2: 0x000000011076e133 postgres`pg_vsnprintf(str="Query
>> Text:
>>
\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x
7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f
\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x
7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f
\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f"...,
>> count=1024, fmt="%s: %s\n", args=0x00007ffedfd1b450) at
>> snprintf.c:195:2

Thanks for testing!
I could reproduce the situation and confirmed almost the same backtrace.

The cause seems to be that the previous patch didn't do cleanup when a
transaction was aborted. That led to segmentation faults when
referencing ActiveQueryDesc after the transaction was aborted.

The attached patch adds cleanups when transactions and subtransactions
are aborted.
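
Concretely, the cleanup amounts to resetting the saved pointer in the
abort paths, along these lines (an illustrative hunk, not the exact
patch):

--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ AbortTransaction(void)
+	/*
+	 * The saved QueryDesc lives in memory the abort is about to release,
+	 * so clear the pointer to keep the interrupt handler from
+	 * dereferencing freed memory.
+	 */
+	ActiveQueryDesc = NULL;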

BTW, the latest patch for pg_log_backtrace() added a new function that
deduplicates code shared by the pg_log_*() functions [1].
It seems better to use it for pg_log_query_plan() after that patch is
merged.

[1]
/message-id/flat/20211115194251.GP17618%40telsasoft.com#4e681d04c78867b2cce44b3cb1791bf5
(that patch moves duplicate code to a new function CheckPostgresProcessId,
which can be used by pg_log_backtrace)

> On top of the visibly use-after-pfree memory issue, the patch doesn't
> apply

Thanks for letting me know!
As mentioned above, I updated the patch.

Regards,

--
Atsushi Torikoshi
NTT DATA CORPORATION

Attachment Content-Type Size
v17-0001-log-running-query-plan.patch text/x-diff 25.1 KB

From: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: RFC: Logging plan of the running query
Date: 2022-01-27 11:18:37
Message-ID: d3313477-2d51-ea2d-b08c-7d62ee3e12b9@oss.nttdata.com
Lists: pgsql-hackers

On 2022/01/24 14:33, torikoshia wrote:
> As mentioned above, I updated the patch.

Thanks for updating the patch!

Here are some more review comments:

+LOG: plan of the query running on backend with PID 17793 is:

This seems not the same as what is actually logged.

+ ereport(WARNING,
+ (errmsg("PID %d is not a PostgreSQL server process", pid)));

Like commit 7fa945b857 changed, this warning message should be "PID %d is not a PostgreSQL backend process"?

+ if (SendProcSignal(pid, PROCSIG_LOG_CURRENT_PLAN, InvalidBackendId) < 0)

proc->backendId should be specified instead of InvalidBackendId, to speed up the processing in SendProcSignal()?
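
I.e., something like (sketch):

PGPROC *proc = BackendPidGetProc(pid);

/* pass the target's BackendId so SendProcSignal() can find its
 * ProcSignal slot directly instead of scanning every slot */
if (proc == NULL ||
    SendProcSignal(pid, PROCSIG_LOG_CURRENT_PLAN, proc->backendId) < 0)
    ereport(WARNING, (errmsg("could not send signal to process %d", pid)));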

+ PROCSIG_LOG_CURRENT_PLAN, /* ask backend to log plan of the current query */
+volatile sig_atomic_t LogCurrentPlanPending = false;
+extern void HandleLogCurrentPlanInterrupt(void);
+extern void ProcessLogCurrentPlanInterrupt(void);

Isn't it better to use the names that are more consistent with the function name, i.e., pg_log_query_plan? For example, PROCSIG_LOG_QUERY_PLAN instead of PROCSIG_LOG_CURRENT_PLAN?

+ ereport(LOG_SERVER_ONLY,
+ errmsg("backend with PID %d is not running a query",
+ MyProcPid));

errhidestmt(true) and errhidecontext(true) need to be added, don't they? Otherwise, for example, if pg_log_query_plan() is executed after debug_query_string is set but before ActiveQueryDesc is set, STATEMENT message would be output even though the message saying "not running a query" is output. Which seems confusing.
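
That is, something like (sketch):

ereport(LOG_SERVER_ONLY,
        (errmsg("backend with PID %d is not running a query",
                MyProcPid),
         errhidestmt(true),        /* suppress the STATEMENT: line */
         errhidecontext(true)));   /* suppress the CONTEXT: line */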

+ hash_seq_init(&status, GetLockMethodLocalHash());
+ while ((locallock = (LOCALLOCK *) hash_seq_search(&status)) != NULL)
+ {
+ if (LOCALLOCK_LOCKTAG(*locallock) == LOCKTAG_PAGE)

Why did you use the search for local lock hash instead of IsPageLockHeld flag variable, to check whether a page lock is held or not? Because there is the corner case where the interrupt is processed after the local lock is registered into the hash but before IsPageLockHeld is enabled?

There is the case where the request to log a query plan is skipped even while the target backend is running a query. If this happens, users can just retry pg_log_query_plan(). These things should be documented?

+ ereport(LOG_SERVER_ONLY,
+ errmsg("backend with PID %d is holding a page lock. Try again",
+ MyProcPid));

It seems more proper to output this message in DETAIL or HINT, instead. So how about something like the following messages?

LOG: could not log the query plan
DETAIL: query plan cannot be logged while page level lock is being held
HINT: Try pg_log_query_plan() after a few ....

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION


From: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
To: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: RFC: Logging plan of the running query
Date: 2022-01-28 08:45:49
Message-ID: e954285e4f6907fd479ebddf682acf52@oss.nttdata.com
Lists: pgsql-hackers

On 2022-01-27 20:18, Fujii Masao wrote:
> Here are some more review comments:

Thanks for reviewing!

> +LOG: plan of the query running on backend with PID 17793 is:
>
> This seems not the same as what is actually logged.

Modified.

> + ereport(WARNING,
> + (errmsg("PID %d is not a PostgreSQL server process", pid)));
>
> Like commit 7fa945b857 changed, this warning message should be "PID %d
> is not a PostgreSQL backend process"?

Modified.

> + if (SendProcSignal(pid, PROCSIG_LOG_CURRENT_PLAN, InvalidBackendId) <
> 0)
>
> proc->backendId should be specified instead of InvalidBackendId, to
> speed up the processing in SendProcSignal()?

Agreed. Modified.

> + PROCSIG_LOG_CURRENT_PLAN, /* ask backend to log plan of the current
> query */
> +volatile sig_atomic_t LogCurrentPlanPending = false;
> +extern void HandleLogCurrentPlanInterrupt(void);
> +extern void ProcessLogCurrentPlanInterrupt(void);
>
> Isn't it better to use the names that are more consistent with the
> function name, i.e., pg_log_query_plan? For example,
> PROCSIG_LOG_QUERY_PLAN instead of PROCSIG_LOG_CURRENT_PLAN?

Agreed.
I removed 'current' from the variable and function names and used
'query' instead.

> + ereport(LOG_SERVER_ONLY,
> + errmsg("backend with PID %d is not running a query",
> + MyProcPid));
>
> errhidestmt(true) and errhidecontext(true) need to be added, don't
> they? Otherwise, for example, if pg_log_query_plan() is executed after
> debug_query_string is set but before ActiveQueryDesc is set, STATEMENT
> message would be output even though the message saying "not running a
> query" is output. Which seems confusing.

Agreed. Added errhidestmt(true) and errhidecontext(true).

> + hash_seq_init(&status, GetLockMethodLocalHash());
> + while ((locallock = (LOCALLOCK *) hash_seq_search(&status)) != NULL)
> + {
> + if (LOCALLOCK_LOCKTAG(*locallock) == LOCKTAG_PAGE)
>
> Why did you use the search for local lock hash instead of
> IsPageLockHeld flag variable, to check whether a page lock is held or
> not? Because there is the corner case where the interrupt is processed
> after the local lock is registered into the hash but before
> IsPageLockHeld is enabled?

As far as I read CheckAndSetLockHeld(), IsPageLockHeld can be used only
when USE_ASSERT_CHECKING is enabled.
Since removing USE_ASSERT_CHECKING from CheckAndSetLockHeld() would have
a performance impact on every grant/release of a local lock, I used the
search of the local lock hash.
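
For reference, CheckAndSetLockHeld() in lock.c looks roughly like this
(paraphrased from memory, so the exact code may differ):

static inline void
CheckAndSetLockHeld(LOCALLOCK *locallock, bool acquired)
{
#ifdef USE_ASSERT_CHECKING
	if (LOCALLOCK_LOCKTAG(*locallock) == LOCKTAG_RELATION_EXTEND)
		IsRelationExtensionLockHeld = acquired;
	else if (LOCALLOCK_LOCKTAG(*locallock) == LOCKTAG_PAGE)
		IsPageLockHeld = acquired;
#endif
}

So outside of assert-enabled builds the flags are never maintained, which
is why I fell back to scanning the local lock hash.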

> There is the case where the request to log a query plan is skipped
> even while the target backend is running a query. If this happens,
> users can just retry pg_log_query_plan(). These things should be
> documented?

Agreed.
Added following:

+ Note that there is the case where the request to log a query
+ plan is skipped even while the target backend is running a
+ query due to lock conflict avoidance.
+ If this happens, users can just retry pg_log_query_plan().


> + ereport(LOG_SERVER_ONLY,
> + errmsg("backend with PID %d is holding a page lock. Try again",
> + MyProcPid));
>
> It seems more proper to output this message in DETAIL or HINT,
> instead. So how about something like the following messages?
>
> LOG: could not log the query plan
> DETAIL: query plan cannot be logged while page level lock is being held
> HINT: Try pg_log_query_plan() after a few ....

Agreed.
I felt the HINT message 'after a few ...' was difficult to phrase, so I
wrote it as below.

| HINT: Retrying pg_log_query_plan() might succeed since the lock
duration of page level locks are usually short

What do you think?

--
Regards,

--
Atsushi Torikoshi
NTT DATA CORPORATION

Attachment Content-Type Size
v18-0001-log-running-query-plan.patch text/x-diff 25.2 KB

From: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
To: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: RFC: Logging plan of the running query
Date: 2022-01-31 16:51:11
Message-ID: 69a08ae9-2077-9ef4-2adf-055d1aac2bde@oss.nttdata.com
Lists: pgsql-hackers

On 2022/01/28 17:45, torikoshia wrote:
>> There is the case where the request to log a query plan is skipped
>> even while the target backend is running a query. If this happens,
>> users can just retry pg_log_query_plan(). These things should be
>> documented?
>
> Agreed.
> Added following:
>
>   +        Note that there is the case where the request to log a query
>   +        plan is skipped even while the target backend is running a
>   +        query due to lock conflict avoidance.
>   +        If this happens, users can just retry pg_log_query_plan().

This may cause users to misunderstand that pg_log_query_plan() fails while the target backend is holding *any* locks? Isn't it better to mention "page-level locks", instead? So how about the following?

--------------------------
Note that the request to log the query plan will be ignored if it's received during a short period while the target backend is holding a page-level lock.
--------------------------

>> +            ereport(LOG_SERVER_ONLY,
>> +                errmsg("backend with PID %d is holding a page lock. Try again",
>> +                    MyProcPid));
>>
>> It seems more proper to output this message in DETAIL or HINT,
>> instead. So how about something like the following messages?
>>
>> LOG: could not log the query plan
>> DETAIL: query plan cannot be logged while page level lock is being held
>> HINT: Try pg_log_query_plan() after a few ....
>
> Agreed.
> I felt the HINT message 'after a few ...' was difficult to phrase, so I wrote it as below.
>
> | HINT: Retrying pg_log_query_plan() might succeed since the lock duration of page level locks are usually short
>
> What do you think?

Or we don't need HINT message?

+ errmsg("could not log the query plan"),
+ errdetail("query plan cannot be logged while page level lock is being held"),

In detail message, the first word of sentences should be capitalized. How about "Cannot log the query plan while holding page-level lock.", instead?

Thanks for updating the patch! Here are some review comments.

+ <row>
+ <entry role="func_table_entry"><para role="func_signature">
+ <indexterm>
+ <primary>pg_log_query_plan</primary>
+ </indexterm>

This entry is placed before one for pg_log_backend_memory_contexts(). But it should be *after* that since those entries seem to be placed in alphabetical order in the table?

+ Requests to log the plan of the query currently running on the
+ backend with specified process ID along with the untruncated
+ query string.

Other descriptions about logging of query string seem not to mention something like "untruncated query string". For example, auto_explain, log_statement, etc. Why do we need to mention "along with the untruncated query string" specially for pg_log_query_plan()?

+ Note that nested statements (statements executed inside a function) are not
+ considered for logging. Only the plan of the most deeply nested query is logged.

Now the plan of even a nested statement can be logged. So this description needs to be updated?

@@ -440,6 +450,7 @@ standard_ExecutorFinish(QueryDesc *queryDesc)

MemoryContextSwitchTo(oldcontext);

+ ActiveQueryDesc = NULL;

ActiveQueryDesc seems unnecessary. Why does ActiveQueryDesc need to be reset to NULL in standard_ExecutorFinish()?

Currently, even during ProcessLogQueryPlanInterrupt(), pg_log_query_plan() can be called and another ProcessLogQueryPlanInterrupt() can be executed. So repeated re-entry into ProcessLogQueryPlanInterrupt() could cause a "stack depth limit exceeded" error. To avoid this, shouldn't we make ProcessLogQueryPlanInterrupt() do nothing and return immediately if it's called during another ProcessLogQueryPlanInterrupt()?

pg_log_backend_memory_contexts() also might have the same issue.
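
For example, a simple static guard would prevent the recursion (a sketch;
the flag name is made up):

void
ProcessLogQueryPlanInterrupt(void)
{
    static bool in_progress = false;    /* hypothetical guard */

    if (in_progress)
        return;                 /* already logging a plan; don't recurse */
    in_progress = true;

    /* ... the existing EXPLAIN/ereport work, which may reach CFI ... */

    /* NB: a real version must also reset this on error (PG_TRY/PG_FINALLY) */
    in_progress = false;
}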

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION


From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: masao(dot)fujii(at)oss(dot)nttdata(dot)com
Cc: torikoshia(at)oss(dot)nttdata(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: RFC: Logging plan of the running query
Date: 2022-02-01 08:27:57
Message-ID: 20220201.172757.1480996662235658750.horikyota.ntt@gmail.com
Lists: pgsql-hackers

At Tue, 1 Feb 2022 01:51:11 +0900, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote in
>
>
> On 2022/01/28 17:45, torikoshia wrote:
> >> There is the case where the request to log a query plan is skipped
> >> even while the target backend is running a query. If this happens,
> >> users can just retry pg_log_query_plan(). These things should be
> >> documented?
> > Agreed.
> > Added following:
> >   +        Note that there is the case where the request to log a
> > query
> >   +        plan is skipped even while the target backend is running a
> >   +        query due to lock conflict avoidance.
> >   +        If this happens, users can just retry pg_log_query_plan().
>
> This may cause users to misunderstand that pg_log_query_plan() fails
> while the target backend is holding *any* locks? Isn't it better to
> mention "page-level locks", instead? So how about the following?
>
> --------------------------
> Note that the request to log the query plan will be ignored if it's
> received during a short period while the target backend is holding a
> page-level lock.
> --------------------------
>
>
> >> +            ereport(LOG_SERVER_ONLY,
> >> +                errmsg("backend with PID %d is holding a page
> >> lock. Try again",
> >> +                    MyProcPid));
> >>
> >> It seems more proper to output this message in DETAIL or HINT,
> >> instead. So how about something like the following messages?
> >>
> >> LOG: could not log the query plan
> >> DETAIL: query plan cannot be logged while page level lock is being
> >> held
> >> HINT: Try pg_log_query_plan() after a few ....
> > Agreed.
> > I felt the HINT message 'after a few ...' was difficult to phrase, so
> > I wrote it as below.
> > | HINT: Retrying pg_log_query_plan() might succeed since the lock
> > | duration of page level locks are usually short
> > What do you think?
>
> Or we don't need HINT message?
>
>
> + errmsg("could not log the query plan"),
> + errdetail("query plan cannot be logged while page level lock is
> being held"),
>
> In detail message, the first word of sentences should be
> capitalized. How about "Cannot log the query plan while holding
> page-level lock.", instead?
>
>
> Thanks for updating the patch! Here are some review comments.
>
> + <row>
> + <entry role="func_table_entry"><para role="func_signature">
> + <indexterm>
> + <primary>pg_log_query_plan</primary>
> + </indexterm>
>
> This entry is placed before one for
> pg_log_backend_memory_contexts(). But it should be *after* that since
> those entries seem to be placed in alphabetical order in the table?
>
>
> + Requests to log the plan of the query currently running on the
> + backend with specified process ID along with the untruncated
> + query string.
>
> Other descriptions about logging of query string seem not to mention
> something like "untruncated query string". For example, auto_explain,
> log_statement, etc. Why do we need to mention "along with the
> untruncated query string" specially for pg_log_query_plan()?
>
>
> + Note that nested statements (statements executed inside a function)
> are not
> + considered for logging. Only the plan of the most deeply nested
> query is logged.
>
> Now the plan of even a nested statement can be logged. So this
> description needs to be updated?
>
>
> @@ -440,6 +450,7 @@ standard_ExecutorFinish(QueryDesc *queryDesc)
> MemoryContextSwitchTo(oldcontext);
> + ActiveQueryDesc = NULL;
>
> ActiveQueryDesc seems unnecessary. Why does ActiveQueryDesc need to be
> reset to NULL in standard_ExecutorFinish()?
>
>
> Currently, even during ProcessLogQueryPlanInterrupt(),
> pg_log_query_plan() can be called and another
> ProcessLogQueryPlanInterrupt() can be executed. So repeated
> re-entry into ProcessLogQueryPlanInterrupt() could cause a "stack
> depth limit exceeded" error. To avoid this, shouldn't we make
> ProcessLogQueryPlanInterrupt() do nothing and return immediately if
> it's called during another ProcessLogQueryPlanInterrupt()?
>
> pg_log_backend_memory_contexts() also might have the same issue.

Good catch.

By the way, I'm concerned about the following part and I'd like to
remove it.

+ * Ensure no page lock is held on this process.

It seems to me that what is wrong is ginInsertCleanup(), not this feature.
As I read the comment for the assertion, I don't believe we want to
allow CFI while holding a page lock. And AFAICS the function is the
only place that does that. (It is the sole user of LockPage()...)

This is the assertion.

lock.c: 902
> /*
> * We don't acquire any other heavyweight lock while holding the page lock
> * except for relation extension.
> */
> Assert(!IsPageLockHeld ||
> (locktag->locktag_type == LOCKTAG_RELATION_EXTEND));

It was added by recent commit 72e78d831ab5550c39f2dcc7cc5d44c406ec3dc2.

> * Similar to relation extension, page locks are also held for a short
> * duration, so imposing such a restriction won't hurt.

I don't believe a path involving vacuum_delay_point() calls is
short-lived.

/message-id/CAH2-WznzCPUKnOV%2Bre30_rHLCkqQWm2JTSVjTCAei9LySTc2pw%40mail.gmail.com

> One thing that really bothers me about commit e2c79e14 is that
> LockPage() is called, not LockBuffer(). GIN had no LockPage() calls
> before that commit, and is now the only code in the entire system that
> calls LockPage()/ConditionalLockPage() (the hash am no longer uses
> page heavyweight locks following recent work there).

I agree with that discussion. Can't we use some other mechanism here to
get rid of the LockPage()?

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center


From: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: torikoshia(at)oss(dot)nttdata(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: RFC: Logging plan of the running query
Date: 2022-02-01 14:11:03
Message-ID: a398dab8-486c-4a54-5aa6-f9faab2ea1fa@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2022/02/01 17:27, Kyotaro Horiguchi wrote:
>> * Similar to relation extension, page locks are also held for a short
>> * duration, so imposing such a restriction won't hurt.
>
> I don't believe a path involving vacuum_delay_point() calls is
> short-lived.

Yes.

>> One thing that really bothers me about commit e2c79e14 is that
>> LockPage() is called, not LockBuffer(). GIN had no LockPage() calls
>> before that commit, and is now the only code in the entire system that
>> calls LockPage()/ConditionalLockPage() (the hash am no longer uses
>> page heavyweight locks following recent work there).
>
> I agree with that discussion. Can't we use some other mechanism here
> to get rid of the LockPage()?

I have no good idea for that yet, but I agree it's better to get rid of the page-level lock.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION


From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: masao(dot)fujii(at)oss(dot)nttdata(dot)com
Cc: torikoshia(at)oss(dot)nttdata(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: RFC: Logging plan of the running query
Date: 2022-02-02 07:49:57
Message-ID: 20220202.164957.824320913353996109.horikyota.ntt@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

At Tue, 1 Feb 2022 23:11:03 +0900, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote in
> >> One thing that really bothers me about commit e2c79e14 is that
> >> LockPage() is called, not LockBuffer(). GIN had no LockPage() calls
> >> before that commit, and is now the only code in the entire system that
> >> calls LockPage()/ConditionalLockPage() (the hash am no longer uses
> >> page heavyweight locks following recent work there).
> > I agree with that discussion. Can't we use some other mechanism here
> > to get rid of the LockPage()?
>
> I have no good idea for that yet, but I agree it's better to get rid
> of the page-level lock.

It's my turn?

The page lock is used to hold off simultaneous cleanups on the same
index. ShareUpdateExclusive lock on the index relation works that
way. In that path it seems like we are always holding a RowExclusive
lock, so it seems to me we can use ShareUpdateExclusive for our
purpose.

There might be a false blocking case when another backend is holding a
conflicting lock on the index. Those are Share, ShareRowExclusive,
Exclusive and AccessExclusive. The last three cases don't seem worth
discussing. I'm not sure about the Share and ShareRowExclusive cases.
AFAICS a Share lock is taken on an index in ATExecReplicaIdentity, and
there is no use of a ShareRowExclusive lock on an index. Explicit
locking is not worth discussing here.

So aren't we able to use ShareUpdateExclusive lock for that?

In the attached patch, ginInsertCleanup has an extra check for such
stronger locks not being held. At least "make check" doesn't cause
the extra assertion to fire.
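
To illustrate the direction (a rough sketch only, not the attached
patch itself; whether ConditionalLockRelation() is usable at this spot
is an assumption):

    /* In ginInsertCleanup(), instead of the page heavyweight lock: */
    if (!ConditionalLockRelation(index, ShareUpdateExclusiveLock))
        return;     /* another backend is already cleaning up */

    /* ... clean up the pending list ... */

    UnlockRelation(index, ShareUpdateExclusiveLock);

Since ShareUpdateExclusiveLock conflicts with itself, this would
serialize concurrent cleanups the same way the page lock does.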

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center


From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: masao(dot)fujii(at)oss(dot)nttdata(dot)com
Cc: torikoshia(at)oss(dot)nttdata(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: RFC: Logging plan of the running query
Date: 2022-02-02 08:04:21
Message-ID: 20220202.170421.204740592565249296.horikyota.ntt@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

At Wed, 02 Feb 2022 16:49:57 +0900 (JST), Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> wrote in
> So aren't we able to use ShareUpdateExclusive lock for that?
>
> In the attached patch, ginInsertCleanup has an extra check for such
> stronger locks not being held. At least "make check" doesn't cause
> the extra assertion to fire.

Actually, the discussion is a bit dubious. What we really need to
check is that such locks are not held on an index *elsewhere*.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center


From: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, masao(dot)fujii(at)oss(dot)nttdata(dot)com
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: RFC: Logging plan of the running query
Date: 2022-02-02 12:59:12
Message-ID: 3bb7bacbc5ef9e58e2ece0557adac98b@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2022-02-01 01:51, Fujii Masao wrote:
Thanks for reviewing and suggestions!

>>   +        Note that there is the case where the request to log a
>> query
>>   +        plan is skipped even while the target backend is running a
>>   +        query due to lock conflict avoidance.
>>   +        If this happens, users can just retry pg_log_query_plan().
>
> This may cause users to misunderstand that pg_log_query_plan() fails
> while the target backend is holding *any* locks? Isn't it better to
> mention "page-level locks", instead? So how about the following?
>
> --------------------------
> Note that the request to log the query plan will be ignored if it's
> received during a short period while the target backend is holding a
> page-level lock.
> --------------------------

Agreed.

>>> +            ereport(LOG_SERVER_ONLY,
>>> +                errmsg("backend with PID %d is holding a page lock.
>>> Try again",
>>> +                    MyProcPid));
>>>
>>> It seems more proper to output this message in DETAIL or HINT,
>>> instead. So how about something like the following messages?
>>>
>>> LOG: could not log the query plan
>>> DETAIL: query plan cannot be logged while page level lock is being
>>> held
>>> HINT: Try pg_log_query_plan() after a few ....
>>
>> Agreed.
>> I felt the HINT message 'after a few ...' was difficult to word,
>> and wrote it as below.
>>
>> | HINT: Retrying pg_log_query_plan() might succeed since the lock
>> | duration of page-level locks is usually short
>>
>> What do you think?
>
> Or we don't need HINT message?

Removed the HINT message.

> + errmsg("could not log the query plan"),
> + errdetail("query plan cannot be logged
> while page level lock is
> being held"),
>
> In the detail message, the first word of a sentence should be capitalized.
> How about "Cannot log the query plan while holding page-level lock.",
> instead?

Agreed.

> Thanks for updating the patch! Here are some review comments.
>
> + <row>
> + <entry role="func_table_entry"><para role="func_signature">
> + <indexterm>
> + <primary>pg_log_query_plan</primary>
> + </indexterm>
>
> This entry is placed before one for pg_log_backend_memory_contexts().
> But it should be *after* that since those entries seem to be placed in
> alphabetical order in the table?

Modified it.

> + Requests to log the plan of the query currently running on the
> + backend with specified process ID along with the untruncated
> + query string.
>
> Other descriptions about logging of query string seem not to mention
> something like "untruncated query string". For example, auto_explain,
> log_statement, etc. Why do we need to mention "along with the
> untruncated query string" specially for pg_log_query_plan()?

Modified it as below:

         Requests to log the plan of the query currently running on the
-        backend with specified process ID along with the untruncated
-        query string.
-        They will be logged at <literal>LOG</literal> message level and
+        backend with specified process ID.
+        It will be logged at <literal>LOG</literal> message level and

> + Note that nested statements (statements executed inside a
> function) are not
> + considered for logging. Only the plan of the most deeply nested
> query is logged.
>
> Now the plan of even nested statement can be logged. So this
> description needs to be updated?

Modified it as below:

-        Note that nested statements (statements executed inside a function) are not
-        considered for logging. Only the plan of the most deeply nested query is logged.
+        Note that when the statements are executed inside a function, only the
+        plan of the most deeply nested query is logged.

> @@ -440,6 +450,7 @@ standard_ExecutorFinish(QueryDesc *queryDesc)
> MemoryContextSwitchTo(oldcontext);
> + ActiveQueryDesc = NULL;
>
> ActiveQueryDesc seems unnecessary. Why does ActiveQueryDesc need to be
> reset to NULL in standard_ExecutorFinish()?

ActiveQueryDesc should not be reset in standard_ExecutorFinish().
Removed it.

> Currently even during ProcessLogQueryPlanInterrupt(),
> pg_log_query_plan() can be called and another
> ProcessLogQueryPlanInterrupt() can be executed. So repeatable
> re-entrances to ProcessLogQueryPlanInterrupt() could cause "stack
> depth limit exceeded" error. To avoid this, shouldn't we make
> ProcessLogQueryPlanInterrupt() do nothing and return immediately, if
> it's called during another ProcessLogQueryPlanInterrupt()?
>
> pg_log_backend_memory_contexts() also might have the same issue.

As you pointed out offlist, the issue could occur even when both
pg_log_backend_memory_contexts() and pg_log_query_plan() are called, so
it may be better to address it in another patch.

You also pointed out offlist that it would be necessary to call
LockErrorCleanup() if ProcessLogQueryPlanInterrupt() can incur ERROR.
AFAICS it can call ereport(ERROR), i.e., palloc0() in NewExplainState(),
so I added PG_TRY/CATCH and made it call LockErrorCleanup() when an ERROR
occurs.
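
Roughly like the following (a simplified sketch of that change, not the
exact patch hunk):

    PG_TRY();
    {
        ExplainState *es = NewExplainState();

        /* ... build and log the plan of the currently running query ... */
    }
    PG_CATCH();
    {
        /* Clean up lock-wait state before propagating the error. */
        LockErrorCleanup();
        PG_RE_THROW();
    }
    PG_END_TRY();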

On 2022-02-01 17:27, Kyotaro Horiguchi wrote:

Thanks for reviewing Horiguchi-san!

> By the way, I'm concerned about the following part and I'd like to
> remove it.

I also think it would be nice if it's possible.
>
> + * Ensure no page lock is held on this process.
>
> It seems to me that what is wrong is ginInsertCleanup(), not this feature.

> Actually, the discussion is a bit dubious. What we really need to
> check is that such locks are not held on an index *elsewhere*.

Since I'm not sure how long it will take to discuss this point, the
attached patch is based on the current HEAD at this time.
I also think it may be better to discuss it on another thread.

--
Regards,

--
Atsushi Torikoshi
NTT DATA CORPORATION

Attachment Content-Type Size
v19-0001-log-running-query-plan.patch text/x-diff 24.9 KB

From: Robert Treat <rob(at)xzilla(dot)net>
To: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
Cc: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, masao(dot)fujii(at)oss(dot)nttdata(dot)com, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: RFC: Logging plan of the running query
Date: 2022-02-03 08:09:07
Message-ID: CABV9wwPHtN6NSo6nrx3REhTpYewU=j9__uroS-B=wTW-1K4yXA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Feb 2, 2022 at 7:59 AM torikoshia <torikoshia(at)oss(dot)nttdata(dot)com> wrote:
>
> 2022-02-01 01:51, Fujii Masao wrote:
<snip>
> > + Note that nested statements (statements executed inside a
> > function) are not
> > + considered for logging. Only the plan of the most deeply nested
> > query is logged.
> >
> > Now the plan of even nested statement can be logged. So this
> > description needs to be updated?
>
> Modified it as below:
>
> -        Note that nested statements (statements executed inside a function) are not
> -        considered for logging. Only the plan of the most deeply nested query is logged.
> +        Note that when the statements are executed inside a function, only the
> +        plan of the most deeply nested query is logged.
>

Minor nit, but I think the "the" is superfluous, i.e.:

"Note that when statements are executed inside a function,
only the plan of the most deeply nested query is logged"

<snip>
> On 2022-02-01 17:27, Kyotaro Horiguchi wrote:
>
> Thanks for reviewing Horiguchi-san!
>
> > By the way, I'm anxious about the following part and I'd like to
> > remove it.
>
> I also think it would be nice if it's possible.
> >
> > + * Ensure no page lock is held on this process.
> >
> > It seems to me what is wrong is ginInsertCleanup(), not this feature.
>
> > Actually, the discussion is a bit dubious. What we really need to
> > check is that such locks are not held on an index *elsewhere*.
>
> Since I'm not sure how long it will take to discuss this point, the
> attached patch is based on the current HEAD at this time.
> I also think it may be better to discuss it on another thread.
>

While I agree on the above points, IMHO I don't believe it should be a
show-stopper for adding this functionality to v15, but we have a few
more commitments before we get to that point.

Robert Treat
https://xzilla.net


From: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
To: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: RFC: Logging plan of the running query
Date: 2022-02-07 16:13:44
Message-ID: 03c8dcbb-72d2-d3ad-3ad1-9efe69676529@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2022/02/02 21:59, torikoshia wrote:
>> This may cause users to misunderstand that pg_log_query_plan() fails
>> while the target backend is holding *any* locks? Isn't it better to
>> mention "page-level locks", instead? So how about the following?
>>
>> --------------------------
>> Note that the request to log the query plan will be ignored if it's
>> received during a short period while the target backend is holding a
>> page-level lock.
>> --------------------------
>
> Agreed.

On second thought, this note is confusing rather than helpful? Because the users don't know when and what operation needs page-level lock. So now I'm thinking it's better to remove this note.

> As you pointed out offlist, the issue could occur even when both pg_log_backend_memory_contexts() and pg_log_query_plan() are called, so it may be better to address it in another patch.

OK.

> You also pointed out offlist that it would be necessary to call LockErrorCleanup() if ProcessLogQueryPlanInterrupt() can incur ERROR.
> AFAICS it can call ereport(ERROR), i.e., palloc0() in NewExplainState(), so I added PG_TRY/CATCH and made it call LockErrorCleanup() when an ERROR occurs.

As we discussed off-list, this treatment might not be necessary, because when an ERROR or FATAL error happens, AbortTransaction() is called and LockErrorCleanup() is also called inside it.

> Since I'm not sure how long it will take to discuss this point, the attached patch is based on the current HEAD at this time.

Thanks for updating the patch!

@@ -5048,6 +5055,12 @@ AbortSubTransaction(void)
*/
PG_SETMASK(&UnBlockSig);

+ /*
+ * When ActiveQueryDesc is referenced after abort, some of its elements
+ * are freed. To avoid accessing them, reset ActiveQueryDesc here.
+ */
+ ActiveQueryDesc = NULL;

AbortSubTransaction() should reset ActiveQueryDesc to save_ActiveQueryDesc that ExecutorRun() set, instead of NULL? Otherwise ActiveQueryDesc of top-level statement will be unavailable after subtransaction is aborted in the nested statements.

For example, no plan is logged while the following "select pg_sleep(test())" is running, because the exception inside test() function aborts the subtransaction and resets ActiveQueryDesc to NULL.

create or replace function test () returns int as $$
begin
perform 1/0;
exception when others then
return 30;
end;
$$ language plpgsql;

select pg_sleep(test());

-CREATE ROLE regress_log_memory;
+CREATE ROLE regress_log;

Isn't this name too generic? How about regress_log_function, for example?

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION


From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: masao(dot)fujii(at)oss(dot)nttdata(dot)com
Cc: torikoshia(at)oss(dot)nttdata(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: RFC: Logging plan of the running query
Date: 2022-02-08 08:18:52
Message-ID: 20220208.171852.1699766364147189900.horikyota.ntt@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

At Tue, 8 Feb 2022 01:13:44 +0900, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote in
>
>
> On 2022/02/02 21:59, torikoshia wrote:
> >> This may cause users to misunderstand that pg_log_query_plan() fails
> >> while the target backend is holding *any* locks? Isn't it better to
> >> mention "page-level locks", instead? So how about the following?
> >>
> >> --------------------------
> >> Note that the request to log the query plan will be ignored if it's
> >> received during a short period while the target backend is holding a
> >> page-level lock.
> >> --------------------------
> > Agreed.
>
> On second thought, this note is confusing rather than helpful? Because
> the users don't know when and what operation needs page-level lock. So
> now I'm thinking it's better to remove this note.

*I* agree with removing the note. And the following error message looks
as mysterious as the note, and the DETAIL doesn't help...

ereport(LOG_SERVER_ONLY,
+ errmsg("could not log the query plan"),
+ errdetail("Cannot log the query plan while holding page-level lock."));
+ hash_seq_term(&status);

We should tell users that the command can be retried soon, like this:

"LOG: ignored request for logging query plan due to lock conflicts"
"HINT: You can try again in a moment."

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center


From: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, masao(dot)fujii(at)oss(dot)nttdata(dot)com
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: RFC: Logging plan of the running query
Date: 2022-02-08 15:12:50
Message-ID: ab761e68f3fbcee80cf1ca19b1db85f1@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2022-02-03 17:09, Robert Treat wrote:
> On Wed, Feb 2, 2022 at 7:59 AM torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
> wrote:
>>
>> 2022-02-01 01:51, Fujii Masao wrote:
> <snip>
>> > + Note that nested statements (statements executed inside a
>> > function) are not
>> > + considered for logging. Only the plan of the most deeply nested
>> > query is logged.
>> >
>> > Now the plan of even nested statement can be logged. So this
>> > description needs to be updated?
>>
>> Modified it as below:
>>
>> -        Note that nested statements (statements executed inside a function) are not
>> -        considered for logging. Only the plan of the most deeply nested query is logged.
>> +        Note that when the statements are executed inside a function, only the
>> +        plan of the most deeply nested query is logged.
>>
>
> Minor nit, but I think the "the" is superfluous.. ie.
>
> "Note that when statements are executed inside a function,
> only the plan of the most deeply nested query is logged"

Thanks! Modified it.

On 2022-02-08 01:13, Fujii Masao wrote:
Thanks for the comments!

> On 2022/02/02 21:59, torikoshia wrote:
>>> This may cause users to misunderstand that pg_log_query_plan() fails
>>> while the target backend is holding *any* locks? Isn't it better to
>>> mention "page-level locks", instead? So how about the following?
>>>
>>> --------------------------
>>> Note that the request to log the query plan will be ignored if it's
>>> received during a short period while the target backend is holding a
>>> page-level lock.
>>> --------------------------
>>
>> Agreed.
>
> On second thought, this note is confusing rather than helpful? Because
> the users don't know when and what operation needs page-level lock. So
> now I'm thinking it's better to remove this note.

Removed it.
>
>
>> As you pointed out offlist, the issue could occur even when both
>> pg_log_backend_memory_contexts() and pg_log_query_plan() are called,
>> so it may be better to address it in another patch.
>
> OK.
>
>
>> You also pointed out offlist that it would be necessary to call
>> LockErrorCleanup() if ProcessLogQueryPlanInterrupt() can incur ERROR.
>> AFAICS it can call ereport(ERROR), i.e., palloc0() in
>> NewExplainState(), so I added PG_TRY/CATCH and made it call
>> LockErrorCleanup() when an ERROR occurs.
>
> As we discussed off-list, this treatment might not be necessary,
> because when an ERROR or FATAL error happens, AbortTransaction() is
> called and LockErrorCleanup() is also called inside it.

Agreed.

>> Since I'm not sure how long it will take to discuss this point, the
>> attached patch is based on the current HEAD at this time.
>
> Thanks for updating the patch!
>
> @@ -5048,6 +5055,12 @@ AbortSubTransaction(void)
> */
> PG_SETMASK(&UnBlockSig);
> + /*
> + * When ActiveQueryDesc is referenced after abort, some of its
> elements
> + * are freed. To avoid accessing them, reset ActiveQueryDesc here.
> + */
> + ActiveQueryDesc = NULL;
>
> AbortSubTransaction() should reset ActiveQueryDesc to
> save_ActiveQueryDesc that ExecutorRun() set, instead of NULL?
> Otherwise ActiveQueryDesc of top-level statement will be unavailable
> after subtransaction is aborted in the nested statements.
>
> For example, no plan is logged while the following "select
> pg_sleep(test())" is running, because the exception inside test()
> function aborts the subtransaction and resets ActiveQueryDesc to NULL.
>
> create or replace function test () returns int as $$
> begin
> perform 1/0;
> exception when others then
> return 30;
> end;
> $$ language plpgsql;
>
> select pg_sleep(test());

Agreed.

BTW, since the above example results in calling ExecutorRun() only once,
the output didn't differ even after ActiveQueryDesc is reset to
save_ActiveQueryDesc.

The below definition of test() worked as expected.

create or replace function test () returns int as $$
begin
perform 1;
perform 1/0;
exception when others then
return 30;
end;
$$ language plpgsql;

>
>
> -CREATE ROLE regress_log_memory;
> +CREATE ROLE regress_log;
>
> Isn't this name too generic? How about regress_log_function, for
> example?

Agreed.

On 2022-02-08 17:18, Kyotaro Horiguchi wrote:
> At Tue, 8 Feb 2022 01:13:44 +0900, Fujii Masao
> <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote in
>>
>>
>> On 2022/02/02 21:59, torikoshia wrote:
>> >> This may cause users to misunderstand that pg_log_query_plan() fails
>> >> while the target backend is holding *any* locks? Isn't it better to
>> >> mention "page-level locks", instead? So how about the following?
>> >>
>> >> --------------------------
>> >> Note that the request to log the query plan will be ignored if it's
>> >> received during a short period while the target backend is holding a
>> >> page-level lock.
>> >> --------------------------
>> > Agreed.
>>
>> On second thought, this note is confusing rather than helpful? Because
>> the users don't know when and what operation needs page-level lock. So
>> now I'm thinking it's better to remove this note.
>
> *I* agree with removing the note. And the following error message looks
> as mysterious as the note, and the DETAIL doesn't help...
>
> ereport(LOG_SERVER_ONLY,
> + errmsg("could not log the query plan"),
> + errdetail("Cannot log the query plan while holding page-level
> lock."));
> + hash_seq_term(&status);
>
> We should tell users that the command can be retried soon, like this:
>
> "LOG: ignored request for logging query plan due to lock conflicts"
> "HINT: You can try again in a moment."

Thanks, it seems better.

--
Regards,

--
Atsushi Torikoshi
NTT DATA CORPORATION

Attachment Content-Type Size
v20-0001-log-running-query-plan.patch text/x-diff 24.9 KB

From: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
To: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: RFC: Logging plan of the running query
Date: 2022-02-21 17:39:16
Message-ID: 3e94e9b3-83ef-c328-0c5c-f93aa3224790@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2022/02/09 0:12, torikoshia wrote:
> BTW, since the above example results in calling ExecutorRun() only once, the output didn't differ even after ActiveQueryDesc is reset to save_ActiveQueryDesc.
>
> The below definition of test() worked as expected.
>
>  create or replace function test () returns int as $$
>  begin
>      perform 1;
>      perform 1/0;
>  exception when others then
>      return 30;
>  end;
>  $$ language plpgsql;
So in this case the ActiveQueryDesc set by the single ExecutorRun() call is still valid even when AbortSubTransaction() is called. That is, that ActiveQueryDesc should NOT be reset to save_ActiveQueryDesc in this case, should it?

OTOH, in your example, the ActiveQueryDesc set by the second call to ExecutorRun() should be reset in AbortSubTransaction(). Then the ActiveQueryDesc set by the first call remains valid.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION


From: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
To: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: RFC: Logging plan of the running query
Date: 2022-03-09 10:04:49
Message-ID: 3d121ed5f81cef588bac836b43f5d1f9@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2022-02-08 01:13, Fujii Masao wrote:
> AbortSubTransaction() should reset ActiveQueryDesc to
> save_ActiveQueryDesc that ExecutorRun() set, instead of NULL?
> Otherwise ActiveQueryDesc of top-level statement will be unavailable
> after subtransaction is aborted in the nested statements.

I once agreed with the above suggestion and made the v20 patch making
save_ActiveQueryDesc a global variable, but it caused a segfault when
calling pg_log_query_plan() after FreeQueryDesc().

OTOH, doing some kind of reset of ActiveQueryDesc seems necessary since
it also caused a segfault when running pg_log_query_plan() during
installcheck.

There may be a better way, but resetting ActiveQueryDesc to NULL seems
safe and simple.
Of course it makes pg_log_query_plan() useless after a subtransaction is
aborted.
However, if it does not often happen that people want to know the plan
of a running query whose subtransaction was aborted, resetting
ActiveQueryDesc to NULL would be acceptable.

Attached is a patch that sets ActiveQueryDesc to NULL when a
subtransaction is aborted.

What do you think?

--
Regards,

--
Atsushi Torikoshi
NTT DATA CORPORATION

Attachment Content-Type Size
v21-0001-log-running-query-plan.patch text/x-diff 25.1 KB

From: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
To: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: RFC: Logging plan of the running query
Date: 2022-05-16 08:02:40
Message-ID: 379d69c0b3d058997550f88e73ffddd6@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2022-03-09 19:04, torikoshia wrote:
> On 2022-02-08 01:13, Fujii Masao wrote:
>> AbortSubTransaction() should reset ActiveQueryDesc to
>> save_ActiveQueryDesc that ExecutorRun() set, instead of NULL?
>> Otherwise ActiveQueryDesc of top-level statement will be unavailable
>> after subtransaction is aborted in the nested statements.
>
> I once agreed with the above suggestion and made the v20 patch making
> save_ActiveQueryDesc a global variable, but it caused a segfault when
> calling pg_log_query_plan() after FreeQueryDesc().
>
> OTOH, doing some kind of reset of ActiveQueryDesc seems necessary
> since it also caused a segfault when running pg_log_query_plan()
> during installcheck.
>
> There may be a better way, but resetting ActiveQueryDesc to NULL seems
> safe and simple.
> Of course it makes pg_log_query_plan() useless after a subtransaction
> is aborted.
> However, if it does not often happen that people want to know the plan
> of a running query whose subtransaction was aborted, resetting
> ActiveQueryDesc to NULL would be acceptable.
>
> Attached is a patch that sets ActiveQueryDesc to NULL when a
> subtransaction is aborted.
>
> What do you think?

Attached new patch to fix patch apply failures.

--
Regards,

--
Atsushi Torikoshi
NTT DATA CORPORATION

Attachment Content-Type Size
v22-0001-log-running-query-plan.patch text/x-diff 25.0 KB

From: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: RFC: Logging plan of the running query
Date: 2022-09-08 10:19:09
Message-ID: 727712bcaa8d449615820c16d15e452e@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2022-05-16 17:02, torikoshia wrote:
> On 2022-03-09 19:04, torikoshia wrote:
>> On 2022-02-08 01:13, Fujii Masao wrote:
>>> AbortSubTransaction() should reset ActiveQueryDesc to
>>> save_ActiveQueryDesc that ExecutorRun() set, instead of NULL?
>>> Otherwise ActiveQueryDesc of top-level statement will be unavailable
>>> after subtransaction is aborted in the nested statements.
>>
>> I once agreed with the above suggestion and made the v20 patch making
>> save_ActiveQueryDesc a global variable, but it caused a segfault when
>> calling pg_log_query_plan() after FreeQueryDesc().
>>
>> OTOH, doing some kind of reset of ActiveQueryDesc seems necessary
>> since it also caused a segfault when running pg_log_query_plan()
>> during installcheck.
>>
>> There may be a better way, but resetting ActiveQueryDesc to NULL seems
>> safe and simple.
>> Of course it makes pg_log_query_plan() useless after a subtransaction
>> is aborted.
>> However, if it does not often happen that people want to know the plan
>> of a running query whose subtransaction was aborted, resetting
>> ActiveQueryDesc to NULL would be acceptable.
>>
>> Attached is a patch that sets ActiveQueryDesc to NULL when a
>> subtransaction is aborted.
>>
>> What do you think?
Attached new patch to fix patch apply failures again.

--
Regards,

--
Atsushi Torikoshi
NTT DATA CORPORATION

Attachment Content-Type Size
v23-0001-log-running-query-plan.patch text/x-diff 25.0 KB

From: Алена Рыбакина <lena(dot)ribackina(at)yandex(dot)ru>
To: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: RFC: Logging plan of the running query
Date: 2022-09-19 08:47:21
Message-ID: 598171663576373@mail.yandex.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

I'm sorry if this message duplicates the previous one, but I'm not
sure the previous message was sent correctly. I sent it from the email
address a(dot)rybakina(at)postgrespro(dot)ru and could not send this one
from that address.

I like the idea of creating a patch for logging the query plan. After
reviewing this code I noticed some points and would like to ask you
some questions.

Firstly, I suggest some edits to the commit message. I think it turns
out more laconic and just as clear. I wrote it below since I can't
think of any other way to add it.

```
Currently, we have to wait for the query execution to finish to check
its plan. This is not so convenient when investigating long-running
queries on production environments where we cannot use debuggers.

To improve this situation, this patch adds the pg_log_query_plan()
function, which requests logging the plan of the specified backend
process.

By default, only superusers are allowed to request logging of the plan;
otherwise, allowing any user to issue this request could cause lots of
log messages and could lead to denial of service.

On the next CHECK_FOR_INTERRUPTS(), the target backend logs its plan at
LOG_SERVER_ONLY level, so the plan will appear in the server log only
and will not be sent to the client.
```

Secondly, I have a question about deleting USE_ASSERT_CHECKING in
lock.h. It is supposed to have been checked elsewhere in the code by
matching values. I am worried about skipping errors, because the places
where GetLockMethodLocalHash() participates would go untested with the
assert option and we won't be able to get a core file in segfault
cases. I might not understand something; could you please explain it to
me?

Thirdly, I don't understand why save_ActiveQueryDesc is declared in
pquery.h. save_ActiveQueryDesc seems to be used only once, in the
ExecutorRun function, so its declaration there is superfluous. I
addressed it in the attached patch.

Fourthly, it seems to me there are not enough explanatory comments in
the code. I also added them in the attached patch.

Lastly, I don't fully understand the signal handling, since this signal
has not been used before. Could another signal prevent this signal from
logging the query plan? I noticed this signal is checked last in the
procsignal_sigusr1_handler function.

--
Regards,
Alena Rybakina
Postgres Professional

19.09.2022, 11:01, "torikoshia" <torikoshia(at)oss(dot)nttdata(dot)com>:
> On 2022-05-16 17:02, torikoshia wrote:
>> On 2022-03-09 19:04, torikoshia wrote:
>>> On 2022-02-08 01:13, Fujii Masao wrote:
>>>> AbortSubTransaction() should reset ActiveQueryDesc to
>>>> save_ActiveQueryDesc that ExecutorRun() set, instead of NULL?
>>>> Otherwise ActiveQueryDesc of top-level statement will be unavailable
>>>> after subtransaction is aborted in the nested statements.
>>>
>>> I once agreed with the above suggestion and made the v20 patch making
>>> save_ActiveQueryDesc a global variable, but it caused a segfault when
>>> calling pg_log_query_plan() after FreeQueryDesc().
>>>
>>> OTOH, doing some kind of reset of ActiveQueryDesc seems necessary
>>> since it also caused a segfault when running pg_log_query_plan()
>>> during installcheck.
>>>
>>> There may be a better way, but resetting ActiveQueryDesc to NULL
>>> seems safe and simple.
>>> Of course it makes pg_log_query_plan() useless after a subtransaction
>>> is aborted.
>>> However, if it does not often happen that people want to know the
>>> plan of a running query whose subtransaction was aborted, resetting
>>> ActiveQueryDesc to NULL would be acceptable.
>>>
>>> Attached is a patch that sets ActiveQueryDesc to NULL when a
>>> subtransaction is aborted.
>>>
>>> What do you think?
>
> Attached new patch to fix patch apply failures again.
>
> --
> Regards,
>
> --
> Atsushi Torikoshi
> NTT DATA CORPORATION

Attachment Content-Type Size
unknown_filename text/html 4.3 KB

From: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
To: Алена Рыбакина <lena(dot)ribackina(at)yandex(dot)ru>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: RFC: Logging plan of the running query
Date: 2022-09-20 14:41:57
Message-ID: 38f4e19323b264e5975c60010b512b15@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2022-09-19 17:47, Алена Рыбакина wrote:
Thanks for your review and comments!

> Hi,
>
> I'm sorry if this message duplicates the previous one, but I'm not
> sure the previous message was sent correctly. I sent it from the email
> address a(dot)rybakina(at)postgrespro(dot)ru and could not send this one
> from that address.

I've successfully received your mail from both a(dot)rybakina(at)postgrespro(dot)ru
and lena(dot)ribackina(at)yandex(dot)ru(dot)

> I like the idea of creating a patch for logging the query plan. After
> reviewing this code I noticed some points and would like to ask you
> some questions.
>
> Firstly, I suggest some edits to the commit message. I think it turns
> out more laconic and just as clear. I wrote it below since I can't
> think of any other way to add it.

> ```
> Currently, we have to wait for the query execution to finish to check
> its plan. This is not so convenient when investigating long-running
> queries on production environments where we cannot use debuggers.
>
> To improve this situation, this patch adds the pg_log_query_plan()
> function, which requests logging the plan of the specified backend
> process.
>
> By default, only superusers are allowed to request logging of the plan;
> otherwise, allowing any user to issue this request could cause lots of
> log messages and could lead to denial of service.
>
> On the next CHECK_FOR_INTERRUPTS(), the target backend logs its plan at
> LOG_SERVER_ONLY level, so the plan will appear in the server log only
> and will not be sent to the client.
> ```
Thanks, I have incorporated your comments.
Since the latter part of the original message comes from the commit
message of pg_log_backend_memory_contexts() (43620e328617c), I left it
as it was for consistency.

> Secondly, I have a question about deleting USE_ASSERT_CHECKING in
> lock.h. It is supposed to have been checked elsewhere in the code by
> matching values. I am worried about skipping errors, because the places
> where GetLockMethodLocalHash() participates would go untested with the
> assert option and we won't be able to get a core file in segfault
> cases. I might not understand something; could you please explain it to
> me?
Since GetLockMethodLocalHash() is only used for assertions, it is only
defined when USE_ASSERT_CHECKING is enabled.
This patch uses GetLockMethodLocalHash() not only for assertion
purposes, so I removed the "ifdef USE_ASSERT_CHECKING" around this
function.
I believe it does not lead to errors being skipped.
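
For context, the non-assertion use is scanning the backend's own local
lock table for page-level locks, along these lines (simplified from the
patch; the exact loop may differ):

    HASH_SEQ_STATUS status;
    LOCALLOCK  *locallock;

    hash_seq_init(&status, GetLockMethodLocalHash());
    while ((locallock = (LOCALLOCK *) hash_seq_search(&status)) != NULL)
    {
        if (LOCALLOCK_LOCKTAG(*locallock) == LOCKTAG_PAGE)
        {
            /* Skip plan logging while a page-level lock is held. */
            hash_seq_term(&status);
            return;
        }
    }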

> Thirdly, I don't understand why save_ActiveQueryDesc is declared in
> pquery.h. save_ActiveQueryDesc seems to be used only once, in the
> ExecutorRun function, so its declaration there is superfluous. I
> addressed it in the attached patch.
Exactly.

> Fourthly, it seems to me there are not enough explanatory comments in
> the code. I also added them in the attached patch.
Thanks!

| + /*
| + * Save value of ActiveQueryDesc before having called
| + * ExecutorRun_hook function due to having reset by
| + * AbortSubTransaction.
| + */
| +
| save_ActiveQueryDesc = ActiveQueryDesc;
| ActiveQueryDesc = queryDesc;
|
| @@ -314,6 +320,7 @@ ExecutorRun(QueryDesc *queryDesc,
| else
| standard_ExecutorRun(queryDesc, direction, count, execute_once);
|
| + /* We set the actual value of ActiveQueryDesc */
| ActiveQueryDesc = save_ActiveQueryDesc;

Since these processes are needed for nested queries, not only for
AbortSubTransaction [1], I added comments on that.

| +/* Function to handle the signal to output the query plan. */
| extern void HandleLogQueryPlanInterrupt(void);

I feel this comment is unnecessary since the explanation of
HandleLogQueryPlanInterrupt() is written in explain.c and no functions
in explain.h have comments in it.

> Lastly, I don't fully understand the signal handling, since this
> signal has not been used before. Could another signal prevent this
> signal from logging the query plan? I noticed this signal is checked
> last in the procsignal_sigusr1_handler function.
Are you concerned that one signal will not be processed when multiple
signals are sent in succession?
AFAIU both of them are processed since SendProcSignal sets
pss_signalFlags for each signal.

```
SendProcSignal(pid_t pid, ProcSignalReason reason, BackendId backendId)
{
volatile ProcSignalSlot *slot;
...(snip)...
278 if (slot->pss_pid == pid)
279 {
280 /* Atomically set the proper flag */
281 slot->pss_signalFlags[reason] = true;
282 /* Send signal */
283 return kill(pid, SIGUSR1);
```

The comments on ProcSignalReason also say 'We can cope with concurrent
signals for different reasons'.

```C
/*
 * Reasons for signaling a Postgres child process (a backend or an auxiliary
 * process, like checkpointer). We can cope with concurrent signals for
 * different reasons. However, if the same reason is signaled multiple times
 * in quick succession, the process is likely to observe only one
 * notification of it. This is okay for the present uses.
...
typedef enum
{
    PROCSIG_CATCHUP_INTERRUPT,      /* sinval catchup interrupt */
    PROCSIG_NOTIFY_INTERRUPT,       /* listen/notify interrupt */
    PROCSIG_PARALLEL_MESSAGE,       /* message from cooperating parallel backend */
    PROCSIG_WALSND_INIT_STOPPING,   /* ask walsenders to prepare for shutdown */
    PROCSIG_BARRIER,                /* global barrier interrupt */
    PROCSIG_LOG_MEMORY_CONTEXT,     /* ask backend to log the memory contexts */
    PROCSIG_LOG_QUERY_PLAN,         /* ask backend to log plan of the current query */
...
} ProcSignalReason;
```

[1]
/message-id/8b53b32f-26cc-0531-4ac0-27310e0bef4b%40oss.nttdata.com

--
Regards,

--
Atsushi Torikoshi
NTT DATA CORPORATION

Attachment Content-Type Size
v24-0001-log-running-query-plan.patch text/x-diff 25.1 KB

From: Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>
To: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: RFC: Logging plan of the running query
Date: 2022-09-21 08:30:25
Message-ID: d4a5ea75-d6a3-7f02-890e-8e3d1a43aeab@yandex.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Ok, I get it.

> Since GetLockMethodLocalHash() is only used for assertions, it is
> only defined when USE_ASSERT_CHECKING is enabled. This patch uses
> GetLockMethodLocalHash() not only for assertion purposes, so I
> removed the "ifdef USE_ASSERT_CHECKING" around this function. I
> believe it does not lead to errors being skipped.
Agree.
> Since these processes are needed for nested queries, not only for
> AbortSubTransaction [1], I added comments on that.

I also noticed it. However, I also discovered that the above function
declarations apply to the EXPLAIN command and are used to print details
of the executed query.

We have a similar task: printing the plan of an interrupted process on
request for a specific pid.

In short, I think this task is different, and I added this comment to
separate these parts.

> I feel this comment is unnecessary since the explanation of
> HandleLogQueryPlanInterrupt() is written in explain.c and no functions
> in explain.h have comments in it.

Yes, I was worried about it. I understand it now; thanks for explaining.

>
> AFAIU both of them are processed since SendProcSignal sets
> pss_signalFlags for each signal.
>
> ```
> SendProcSignal(pid_t pid, ProcSignalReason reason, BackendId backendId)
> {
> volatile ProcSignalSlot *slot;
> ...(snip)...
> 278 if (slot->pss_pid == pid)
> 279 {
> 280 /* Atomically set the proper flag */
> 281 slot->pss_signalFlags[reason] = true;
> 282 /* Send signal */
> 283 return kill(pid, SIGUSR1);
> ```
>
> The comments on ProcSignalReason also say 'We can cope with concurrent
> signals for different reasons'.
>
> ```C
> /*
> * Reasons for signaling a Postgres child process (a backend or an auxiliary
> * process, like checkpointer). We can cope with concurrent signals for
> * different reasons. However, if the same reason is signaled multiple times
> * in quick succession, the process is likely to observe only one
> * notification of it. This is okay for the present uses.
> ...
> typedef enum
> {
> PROCSIG_CATCHUP_INTERRUPT, /* sinval catchup interrupt */
> PROCSIG_NOTIFY_INTERRUPT, /* listen/notify interrupt */
> PROCSIG_PARALLEL_MESSAGE, /* message from cooperating parallel backend */
> PROCSIG_WALSND_INIT_STOPPING, /* ask walsenders to prepare for shutdown */
> PROCSIG_BARRIER, /* global barrier interrupt */
> PROCSIG_LOG_MEMORY_CONTEXT, /* ask backend to log the memory contexts */
> PROCSIG_LOG_QUERY_PLAN, /* ask backend to log plan of the current query */
> ...
> } ProcSignalReason;
> ```
>
>
> [1]
> /message-id/8b53b32f-26cc-0531-4ac0-27310e0bef4b%40oss.nttdata.com


From: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
To: Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: RFC: Logging plan of the running query
Date: 2022-09-22 14:12:16
Message-ID: 0cc0608d89cc61a29e18c4ba8f8a03a6@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2022-09-21 17:30, Alena Rybakina wrote:

Thanks for your reply!

> I also noticed it. However, I also discovered that the above function
> declarations apply to the EXPLAIN command and are used to print details
> of the executed query.
>
> We have a similar task: printing the plan of an interrupted process on
> request for a specific pid.
>
> In short, I think this task is different, and I added this comment to
> separate these parts.

I'm not sure I understand your comment correctly; do you mean
HandleLogQueryPlanInterrupt() should not be placed in explain.c?

It may be so.
However, given that ProcessLogMemoryContextInterrupt(), which similarly
handles interrupts for pg_log_backend_memory_contexts(), is located in
mcxt.c, I also think the current location might be acceptable.

--
Regards,

--
Atsushi Torikoshi
NTT DATA CORPORATION


From: Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>
To: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: RFC: Logging plan of the running query
Date: 2022-09-23 09:43:19
Message-ID: f2b2481d-f017-b005-d2c6-e1ff0f611353@yandex.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Sorry, I wrote confusingly at that time.

No, I suggested adding a comment with the explanation of
HandleLogQueryPlanInterrupt() only in explain.h, not removing it from
explain.c.

It seemed necessary to me to separate the function declaration for the
'explain feature' of an executed query from our function that handles
interrupts for logging the plan of the running query. But now I doubt
it.

> I'm not sure I understand your comment correctly; do you mean
> HandleLogQueryPlanInterrupt() should not be placed in explain.c?

Thank you for reminding me about this function; I looked at the
ProcessLogMemoryContextInterrupt() declaration. I noticed the comments
in memutils.h are missing too.

The description of this function is written only in mcxt.c.
> However, given that ProcesLogMemoryContextInterrupt(), which similarly
> handles interrupts for pg_log_backend_memory_contexts(), is located in
> mcxt.c, I also think current location might be acceptable.

So I think you are right, and a comment explaining
HandleLogQueryPlanInterrupt() in explain.h would be redundant.

> I feel this comment is unnecessary since the explanation of
> HandleLogQueryPlanInterrupt() is written in explain.c and no functions
> in explain.h have comments in it.

Regards,

--
Alena Rybakina
Postgres Professional


From: Andres Freund <andres(at)anarazel(dot)de>
To: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
Cc: Алена Рыбакина <lena(dot)ribackina(at)yandex(dot)ru>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: RFC: Logging plan of the running query
Date: 2022-12-06 18:41:01
Message-ID: 20221206184101.exj2zyhmt7ta5fco@awork3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

This patch does not currently build, due to a conflicting oid:

https://cirrus-ci.com/task/4638460594618368?logs=build#L108
[17:26:44.602] /usr/bin/perl ../src/include/catalog/../../backend/catalog/genbki.pl --include-path=../src/include --set-version=16 --output=src/include/catalog ../src/include/catalog/pg_proc.h ../src/include/catalog/pg_type.h ../src/include/catalog/pg_attribute.h ../src/include/catalog/pg_class.h ../src/include/catalog/pg_attrdef.h ../src/include/catalog/pg_constraint.h ../src/include/catalog/pg_inherits.h ../src/include/catalog/pg_index.h ../src/include/catalog/pg_operator.h ../src/include/catalog/pg_opfamily.h ../src/include/catalog/pg_opclass.h ../src/include/catalog/pg_am.h ../src/include/catalog/pg_amop.h ../src/include/catalog/pg_amproc.h ../src/include/catalog/pg_language.h ../src/include/catalog/pg_largeobject_metadata.h ../src/include/catalog/pg_largeobject.h ../src/include/catalog/pg_aggregate.h ../src/include/catalog/pg_statistic.h ../src/include/catalog/pg_statistic_ext.h ../src/include/catalog/pg_statistic_ext_data.h ../src/include/catalog/pg_rewrite.h ../src/include/catalog/pg_trigger.h ../src/include/catalog/pg_event_trigger.h ../src/include/catalog/pg_description.h ../src/include/catalog/pg_cast.h ../src/include/catalog/pg_enum.h ../src/include/catalog/pg_namespace.h ../src/include/catalog/pg_conversion.h ../src/include/catalog/pg_depend.h ../src/include/catalog/pg_database.h ../src/include/catalog/pg_db_role_setting.h ../src/include/catalog/pg_tablespace.h ../src/include/catalog/pg_authid.h ../src/include/catalog/pg_auth_members.h ../src/include/catalog/pg_shdepend.h ../src/include/catalog/pg_shdescription.h ../src/include/catalog/pg_ts_config.h ../src/include/catalog/pg_ts_config_map.h ../src/include/catalog/pg_ts_dict.h ../src/include/catalog/pg_ts_parser.h ../src/include/catalog/pg_ts_template.h ../src/include/catalog/pg_extension.h ../src/include/catalog/pg_foreign_data_wrapper.h ../src/include/catalog/pg_foreign_server.h ../src/include/catalog/pg_user_mapping.h ../src/include/catalog/pg_foreign_table.h ../src/include/catalog/pg_policy.h ../src/include/catalog/pg_replication_origin.h ../src/include/catalog/pg_default_acl.h ../src/include/catalog/pg_init_privs.h ../src/include/catalog/pg_seclabel.h ../src/include/catalog/pg_shseclabel.h ../src/include/catalog/pg_collation.h ../src/include/catalog/pg_parameter_acl.h ../src/include/catalog/pg_partitioned_table.h ../src/include/catalog/pg_range.h ../src/include/catalog/pg_transform.h ../src/include/catalog/pg_sequence.h ../src/include/catalog/pg_publication.h ../src/include/catalog/pg_publication_namespace.h ../src/include/catalog/pg_publication_rel.h ../src/include/catalog/pg_subscription.h ../src/include/catalog/pg_subscription_rel.h
[17:26:44.602] Duplicate OIDs detected:
[17:26:44.602] 4550
[17:26:44.602] found 1 duplicate OID(s) in catalog data

I suggest you choose a random oid out of the "development purposes" range:

* OIDs 1-9999 are reserved for manual assignment (see .dat files in
* src/include/catalog/). Of these, 8000-9999 are reserved for
* development purposes (such as in-progress patches and forks);
* they should not appear in released versions.

Greetings,

Andres Freund


From: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: RFC: Logging plan of the running query
Date: 2022-12-08 05:10:32
Message-ID: c51098ac613f908776e5564a7c7a0e16@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2022-12-07 03:41, Andres Freund wrote:
> Hi,
>
> This patch does not currently build, due to a conflicting oid:
>
> I suggest you choose a random oid out of the "development purposes"
> range:

Thanks for your advice!
Attached updated patch.

BTW, since this patch depends on ProcessInterrupts() and the EXPLAIN
code which is used in auto_explain, I feel that the following discussion
also applies to this patch.

> --
> /message-id/CA%2BTgmoYW_rSOW4JMQ9_0Df9PKQ%3DsQDOKUGA4Gc9D8w4wui8fSA%40mail.gmail.com
>
> explaining a query is a pretty
> complicated operation that involves catalog lookups and lots of
> complicated stuff, and I don't think that it would be safe to do all
> of that at any arbitrary point in the code where ProcessInterrupts()
> happened to fire.

If I can't come up with some workaround during the next Commitfest, I'm
going to withdraw this proposal.

--
Regards,

--
Atsushi Torikoshi
NTT DATA CORPORATION

Attachment Content-Type Size
v25-0001-log-running-query-plan.patch text/x-diff 24.9 KB

From: James Coleman <jtc331(at)gmail(dot)com>
To: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Greg Stark <stark(at)mit(dot)edu>, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>, david(dot)christensen(at)crunchydata(dot)com, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Subject: Re: RFC: Logging plan of the running query
Date: 2023-06-02 17:51:22
Message-ID: CAAaqYe9euUZD8bkjXTVcD9e4n5c7kzHzcvuCJXt-xds9X4c7Fw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hello,

Thanks for working on this patch!

On Thu, Dec 8, 2022 at 12:10 AM torikoshia <torikoshia(at)oss(dot)nttdata(dot)com> wrote:
>
> BTW, since this patch depends on ProcessInterrupts() and the EXPLAIN
> code which is used in auto_explain, I feel that the following
> discussion also applies to this patch.
>
> > --
> > /message-id/CA%2BTgmoYW_rSOW4JMQ9_0Df9PKQ%3DsQDOKUGA4Gc9D8w4wui8fSA%40mail.gmail.com
> >
> > explaining a query is a pretty
> > complicated operation that involves catalog lookups and lots of
> > complicated stuff, and I don't think that it would be safe to do all
> > of that at any arbitrary point in the code where ProcessInterrupts()
> > happened to fire.
>
> If I can't come up with some workaround during the next Commitfest, I'm
> going to withdraw this proposal.

While at PGCon this week I'd brought up this idea with a few people
without realizing you'd already worked on it previously, and then
after I discovered this thread several of us (Greg, Ronan, David,
Heikki, and myself -- all cc'd) discussed the safety concerns over
dinner last night.

Our conclusion was that all of the concerns we could come up with (for
example, walking through the code for ExplainTargetRel() and discussing
the relevant catalog and syscache accesses) all applied specifically
to Robert's concerns about running explain inside an aborted
transaction. After all, I'd originally started that thread to ask
about running auto-explain after a query timeout.

To put it positively: we believe that, for example, catalog accesses
inside CHECK_FOR_INTERRUPTS() -- assuming that the CFI call is inside
an existing valid transaction/query state, as it would be for this
patch -- are safe. If there were problems, then those problems are
likely bugs we already have in other CFI cases.

Another concern Robert raised may apply here: what if a person tries
to cancel a query when we're explaining? I believe that's a reasonable
question to ask, but I believe it's a trade-off that's worth making
for the (significant) introspection benefits this patch would provide.

On to the patch itself: overall I think it looks like it's in pretty
good shape. I also noticed we don't have any tests (I assume it'd have
to be TAP tests) of the actual output happening, and I think it would
be worth adding that.

Are you interested in re-opening this patch? I'd be happy to provide
further review and help to try to push this along.

I've rebased the patch and attached as v26.

Thanks,
James Coleman

Attachment Content-Type Size
v26-0001-Add-function-to-log-the-plan-of-the-query.patch application/octet-stream 25.0 KB

From: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
To: James Coleman <jtc331(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Greg Stark <stark(at)mit(dot)edu>, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>, david(dot)christensen(at)crunchydata(dot)com, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Subject: Re: RFC: Logging plan of the running query
Date: 2023-06-05 08:30:18
Message-ID: 2f466f82de025f745dff9377a368cee4@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2023-06-03 02:51, James Coleman wrote:
> Hello,
>
> Thanks for working on this patch!
>
> On Thu, Dec 8, 2022 at 12:10 AM torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
> wrote:
>>
>> BTW, since this patch depends on ProcessInterrupts() and the EXPLAIN code
>> which is used in auto_explain, I feel that the following discussion
>> also applies to this patch.
>>
>> > --
>> > /message-id/CA%2BTgmoYW_rSOW4JMQ9_0Df9PKQ%3DsQDOKUGA4Gc9D8w4wui8fSA%40mail.gmail.com
>> >
>> > explaining a query is a pretty
>> > complicated operation that involves catalog lookups and lots of
>> > complicated stuff, and I don't think that it would be safe to do all
>> > of that at any arbitrary point in the code where ProcessInterrupts()
>> > happened to fire.
>>
>> If I can't come up with some workaround during the next Commitfest,
>> I'm going to withdraw this proposal.
>
> While at PGCon this week I'd brought up this idea with a few people
> without realizing you'd already worked on it previously, and then
> after I discovered this thread several of us (Greg, Ronan, David,
> Heikki, and myself -- all cc'd) discussed the safety concerns over
> dinner last night.
>
> Our conclusion was that all of the concerns we could come up with (for
> example, walking through the code for ExplainTargetRel() and discussing
> the relevant catalog and syscache accesses) applied specifically
> to Robert's concerns about running explain inside an aborted
> transaction. After all, I'd originally started that thread to ask
> about running auto-explain after a query timeout.
>
> To put it positively: we believe that, for example, catalog accesses
> inside CHECK_FOR_INTERRUPTS() -- assuming that the CFI call is inside
> an existing valid transaction/query state, as it would be for this
> patch -- are safe. If there were problems, then those problems are
> likely bugs we already have in other CFI cases.

Thanks a lot for the discussion and sharing it!
I really appreciate it.

BTW, I'm not sure whether all the CFIs are called within a valid transaction;
do you think we should check each of them?

> Another concern Robert raised may apply here: what if a person tries
> to cancel a query when we're explaining? I believe that's a reasonable
> question to ask, but I believe it's a trade-off that's worth making
> for the (significant) introspection benefits this patch would provide.

Is the concern here limited to the case where explain code goes crazy
as Robert pointed out?
If so, this may be a trade-off worth doing.
I am a little concerned about whether there will be cases where the
explain code is not problematic but just takes a long time.

> On to the patch itself: overall I think it looks like it's in pretty
> good shape. I also noticed we don't have any tests (I assume they'd have
> to be TAP tests) verifying the actual output, and I think it would
> be worth adding that.

Checking the log output may be better, but I didn't add it since there
is only a little content that can be checked, and the similar function
pg_log_backend_memory_contexts() does not have such tests.

> Are you interested in re-opening this patch? I'd be happy to provide
> further review and help to try to push this along.
Sure, I'm going to re-open this.

> I've rebased the patch and attached as v26.
Thanks again for your work!

--
Regards,

--
Atsushi Torikoshi
NTT DATA CORPORATION


From: James Coleman <jtc331(at)gmail(dot)com>
To: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Greg Stark <stark(at)mit(dot)edu>, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>, david(dot)christensen(at)crunchydata(dot)com, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Subject: Re: RFC: Logging plan of the running query
Date: 2023-06-05 18:26:25
Message-ID: CAAaqYe8LXVXQhYy3yT0QOHUymdM=uha0dJ0=BEPzVAx2nG1gsw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Jun 5, 2023 at 4:30 AM torikoshia <torikoshia(at)oss(dot)nttdata(dot)com> wrote:
>
> On 2023-06-03 02:51, James Coleman wrote:
> > Hello,
> >
> > Thanks for working on this patch!

Sure thing! I'm *very interested* in seeing this available, and I
think it paves the way for some additional features later on...

> > On Thu, Dec 8, 2022 at 12:10 AM torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
> ...
> > To put it positively: we believe that, for example, catalog accesses
> > inside CHECK_FOR_INTERRUPTS() -- assuming that the CFI call is inside
> > an existing valid transaction/query state, as it would be for this
> > patch -- are safe. If there were problems, then those problems are
> > likely bugs we already have in other CFI cases.
>
> Thanks a lot for the discussion and sharing it!
> I really appreciate it.
>
> BTW, I'm not sure whether all the CFIs are called within a valid transaction;
> do you think we should check each of them?

I kicked off the regression tests with a call to
ProcessLogQueryPlanInterrupt() in every single CHECK_FOR_INTERRUPTS()
call. Several hours and 52 GB of logs later I have confirmed that
(with the attached revision) at the very least the regression test
suite can't trigger any kind of failure regardless of when we trigger
this. The existing code in the patch that only runs the explain when
there's an active query handles that.
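
For reference, the hack amounts to something like this (a reconstruction
from memory, not the attached 0002 patch itself):

```c
/*
 * Hypothetical reconstruction of the testing hack (the real 0002 patch is
 * attached): make every CHECK_FOR_INTERRUPTS() also attempt plan logging,
 * so the regression suite exercises the handler at every CFI site.
 */
#define CHECK_FOR_INTERRUPTS() \
do { \
    if (INTERRUPTS_PENDING_CONDITION()) \
        ProcessInterrupts(); \
    ProcessLogQueryPlanInterrupt(); /* testing only: log plan at every CFI */ \
} while (0)
```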

> > Another concern Robert raised may apply here: what if a person tries
> > to cancel a query when we're explaining? I believe that's a reasonable
> > question to ask, but I believe it's a trade-off that's worth making
> > for the (significant) introspection benefits this patch would provide.
>
> Is the concern here limited to the case where explain code goes crazy
> as Robert pointed out?
> If so, this may be a trade-off worth doing.
> I am a little concerned about whether there will be cases where the
> explain code is not problematic but just takes a long time.

The explain code could take a long time in some rare cases (e.g., we
discovered a bug a few years back with the planning code that actually
descends an index to find the max value), but I think the trade-off is
worth it.

> > On to the patch itself: overall I think it looks like it's in pretty
> > good shape. I also noticed we don't have any tests (I assume they'd have
> > to be TAP tests) verifying the actual output, and I think it would
> > be worth adding that.
>
> Checking the log output may be better, but I didn't add it since there
> is only a little content that can be checked, and the similar function
> pg_log_backend_memory_contexts() does not have such tests.

Fair enough. I still think that would be an improvement here, but
others could also weigh in.

> > Are you interested in re-opening this patch? I'd be happy to provide
> > further review and help to try to push this along.
> Sure, I'm going to re-open this.

I've attached v27. The important change here in 0001 is that it
guarantees the interrupt handler is re-entrant, since that was a bug
exposed by my testing. I've also included 0002 which is only meant for
testing (it attempts to log the plan in every
CHECK_FOR_INTERRUPTS() call).
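
The guard itself is roughly this shape (an illustrative sketch, not the
actual diff):

```c
/* Sketch of the v27 re-entrancy guard (illustrative, not the real hunk). */
static bool ProcessLogQueryPlanInterruptActive = false;

void
ProcessLogQueryPlanInterrupt(void)
{
    if (ProcessLogQueryPlanInterruptActive)
        return;             /* already logging a plan; don't recurse */

    ProcessLogQueryPlanInterruptActive = true;

    /*
     * ... run EXPLAIN on the active query and emit it to the log; any
     * CHECK_FOR_INTERRUPTS() reached from in here now returns
     * immediately instead of recursing into this function ...
     */

    ProcessLogQueryPlanInterruptActive = false;
}
```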

Regards,
James

Attachment Content-Type Size
v27-0001-Add-function-to-log-the-plan-of-the-query.patch application/octet-stream 25.4 KB
v27-0002-Testing-attempt-logging-plan-on-ever-CFI-call.patch application/octet-stream 891 bytes

From: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
To: James Coleman <jtc331(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Greg Stark <stark(at)mit(dot)edu>, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>, david(dot)christensen(at)crunchydata(dot)com, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Subject: Re: RFC: Logging plan of the running query
Date: 2023-06-12 04:07:34
Message-ID: 788e5db60b6c5de4b73fc9810b635d44@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2023-06-06 03:26, James Coleman wrote:
> On Mon, Jun 5, 2023 at 4:30 AM torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
> wrote:
>>
>> On 2023-06-03 02:51, James Coleman wrote:
>> > Hello,
>> >
>> > Thanks for working on this patch!
>
> Sure thing! I'm *very interested* in seeing this available, and I
> think it paves the way for some additional features later on...
>
>> > On Thu, Dec 8, 2022 at 12:10 AM torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
>> ...
>> > To put it positively: we believe that, for example, catalog accesses
>> > inside CHECK_FOR_INTERRUPTS() -- assuming that the CFI call is inside
>> > an existing valid transaction/query state, as it would be for this
>> > patch -- are safe. If there were problems, then those problems are
>> > likely bugs we already have in other CFI cases.
>>
>> Thanks a lot for the discussion and sharing it!
>> I really appreciate it.
>>
>> BTW, I'm not sure whether all the CFIs are called within a valid transaction;
>> do you think we should check each of them?
>
> I kicked off the regression tests with a call to
> ProcessLogQueryPlanInterrupt() in every single CHECK_FOR_INTERRUPTS()
> call. Several hours and 52 GB of logs later I have confirmed that
> (with the attached revision) at the very least the regression test
> suite can't trigger any kind of failure regardless of when we trigger
> this. The existing code in the patch that only runs the explain when
> there's an active query handles that.

Thanks for the testing!

> I've attached v27. The important change here in 0001 is that it
> guarantees the interrupt handler is re-entrant, since that was a bug
> exposed by my testing. I've also included 0002 which is only meant for
> testing (it attempts to log the plan in every
> CHECK_FOR_INTERRUPTS() call).

When SIGINT is sent during ProcessLogQueryPlanInterrupt(),
ProcessLogQueryPlanInterruptActive can remain true.
After that, pg_log_query_plan() does nothing but return.

AFAIU, v26 logs the plan for each pg_log_query_plan() even when another
pg_log_query_plan() is running, but that doesn't cause errors or any
critical problem.

Considering the problems solved and introduced by v27, I think v26 might
be fine.
What do you think?

>
> Regards,
> James

--
Regards,

--
Atsushi Torikoshi
NTT DATA CORPORATION


From: James Coleman <jtc331(at)gmail(dot)com>
To: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Greg Stark <stark(at)mit(dot)edu>, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>, david(dot)christensen(at)crunchydata(dot)com, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Subject: Re: RFC: Logging plan of the running query
Date: 2023-06-12 15:52:01
Message-ID: CAAaqYe9u-ywvRX32iBj1SkQrLf1hqMhR9HFCc0dAk4rGDJpyFw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, Jun 11, 2023 at 11:07 PM torikoshia <torikoshia(at)oss(dot)nttdata(dot)com> wrote:
>
> On 2023-06-06 03:26, James Coleman wrote:
> > On Mon, Jun 5, 2023 at 4:30 AM torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
> > wrote:
> >>
> >> On 2023-06-03 02:51, James Coleman wrote:
> >> > Hello,
> >> >
> >> > Thanks for working on this patch!
> >
> > Sure thing! I'm *very interested* in seeing this available, and I
> > think it paves the way for some additional features later on...
> >
> >> > On Thu, Dec 8, 2022 at 12:10 AM torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
> >> ...
> >> > To put it positively: we believe that, for example, catalog accesses
> >> > inside CHECK_FOR_INTERRUPTS() -- assuming that the CFI call is inside
> >> > an existing valid transaction/query state, as it would be for this
> >> > patch -- are safe. If there were problems, then those problems are
> >> > likely bugs we already have in other CFI cases.
> >>
> >> Thanks a lot for the discussion and sharing it!
> >> I really appreciate it.
> >>
> >> BTW, I'm not sure whether all the CFIs are called within a valid transaction;
> >> do you think we should check each of them?
> >
> > I kicked off the regression tests with a call to
> > ProcessLogQueryPlanInterrupt() in every single CHECK_FOR_INTERRUPTS()
> > call. Several hours and 52 GB of logs later I have confirmed that
> > (with the attached revision) at the very least the regression test
> > suite can't trigger any kind of failure regardless of when we trigger
> > this. The existing code in the patch that only runs the explain when
> > there's an active query handles that.
>
> Thanks for the testing!
>
> > I've attached v27. The important change here in 0001 is that it
> > guarantees the interrupt handler is re-entrant, since that was a bug
> > exposed by my testing. I've also included 0002 which is only meant for
> > testing (it attempts to log the plan in every
> > CHECK_FOR_INTERRUPTS() call).
>
> When SIGINT is sent during ProcessLogQueryPlanInterrupt(),
> ProcessLogQueryPlanInterruptActive can remain true.
> After that, pg_log_query_plan() does nothing but return.
>
> AFAIU, v26 logs the plan for each pg_log_query_plan() even when another
> pg_log_query_plan() is running, but that doesn't cause errors or any
> critical problem.
>
> Considering the problems solved and introduced by v27, I think v26 might
> be fine.
> What do you think?

The testing I did with calling this during every CFI is what uncovered
the re-entrancy problem. IIRC (without running that test again) the
problem was a stack overflow. Now, to be sure, this is a particularly
degenerate case, because in real-world usage it'd be impossible in
practice, I think, to trigger that many calls to this function (and by
extension the interrupt handler).

If SIGINT is the only concern we could reset
ProcessLogQueryPlanInterruptActive in error handling code. I admit
that part of my thought process here is thinking ahead to an
additional patch I'd like to see on top of this, which is logging a
query plan before cleaning up when statement timeout occurs. The
re-entrancy issue becomes more interesting then, I think, since we
would then have automated calling of the logging code. BTW: I'd
thought that would make a nice follow-up patch for this, but if you'd
prefer I could add it as another patch in the series here.

What do you think about resetting the flag versus just not having it?

Regards,
James Coleman


From: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
To: James Coleman <jtc331(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Greg Stark <stark(at)mit(dot)edu>, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>, david(dot)christensen(at)crunchydata(dot)com, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Subject: Re: RFC: Logging plan of the running query
Date: 2023-06-13 15:22:14
Message-ID: 9c3445ab0f695dbd0a879cf380f313c5@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2023-06-13 00:52, James Coleman wrote:
>>
>> > I've attached v27. The important change here in 0001 is that it
>> > guarantees the interrupt handler is re-entrant, since that was a bug
>> > exposed by my testing. I've also included 0002 which is only meant for
>> > testing (it attempts to log the plan in every
>> > CHECK_FOR_INTERRUPTS() call).
>>
>> When SIGINT is sent during ProcessLogQueryPlanInterrupt(),
>> ProcessLogQueryPlanInterruptActive can remain true.
>> After that, pg_log_query_plan() does nothing but return.
>>
>> AFAIU, v26 logs the plan for each pg_log_query_plan() even when another
>> pg_log_query_plan() is running, but that doesn't cause errors or any
>> critical problem.
>>
>> Considering the problems solved and introduced by v27, I think v26
>> might be fine.
>> What do you think?
>
> The testing I did with calling this during every CFI is what uncovered
> the re-entrancy problem. IIRC (without running that test again) the
> problem was a stack overflow. Now, to be sure, this is a particularly
> degenerate case, because in real-world usage it'd be impossible in
> practice, I think, to trigger that many calls to this function (and by
> extension the interrupt handler).

Yeah. In addition, since currently only superusers are allowed to execute
pg_log_query_plan(), I think we don't need to worry about cases
where users are malicious.

> If SIGINT is the only concern we could reset
> ProcessLogQueryPlanInterruptActive in error handling code. I admit
> that part of my thought process here is thinking ahead to an
> additional patch I'd like to see on top of this, which is logging a
> query plan before cleaning up when statement timeout occurs.

I remember this is what you wanted to do. [1]

> The
> re-entrancy issue becomes more interesting then, I think, since we
> would then have automated calling of the logging code. BTW: I'd
> thought that would make a nice follow-up patch for this, but if you'd
> prefer I could add it as another patch in the series here.
>
> What do you think about resetting the flag versus just not having it?

If I understand you correctly, adding the flag is not necessary for this
proposal.
To keep the patch simple, I prefer not having it.

[1]
/message-id/flat/CA%2BTgmoYW_rSOW4JMQ9_0Df9PKQ%3DsQDOKUGA4Gc9D8w4wui8fSA%40mail.gmail.com#b57432077f8045be8588049269f7a8dd

--
Regards,

--
Atsushi Torikoshi
NTT DATA CORPORATION


From: James Coleman <jtc331(at)gmail(dot)com>
To: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Greg Stark <stark(at)mit(dot)edu>, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>, david(dot)christensen(at)crunchydata(dot)com, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Subject: Re: RFC: Logging plan of the running query
Date: 2023-06-13 15:53:39
Message-ID: CAAaqYe-5jKTuuZB5StBwa2YRiqpnN=q9buKADP8A_RPd09f_Ug@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Jun 13, 2023 at 11:22 AM torikoshia <torikoshia(at)oss(dot)nttdata(dot)com> wrote:
>
> On 2023-06-13 00:52, James Coleman wrote:
> >>
> >> > I've attached v27. The important change here in 0001 is that it
> >> > guarantees the interrupt handler is re-entrant, since that was a bug
> >> > exposed by my testing. I've also included 0002 which is only meant for
> >> > testing (it attempts to log the plan in every
> >> > CHECK_FOR_INTERRUPTS() call).
> >>
> >> When SIGINT is sent during ProcessLogQueryPlanInterrupt(),
> >> ProcessLogQueryPlanInterruptActive can remain true.
> >> After that, pg_log_query_plan() does nothing but return.
> >>
> >> AFAIU, v26 logs the plan for each pg_log_query_plan() even when another
> >> pg_log_query_plan() is running, but that doesn't cause errors or any
> >> critical problem.
> >>
> >> Considering the problems solved and introduced by v27, I think v26
> >> might be fine.
> >> What do you think?
> >
> > The testing I did with calling this during every CFI is what uncovered
> > the re-entrancy problem. IIRC (without running that test again) the
> > problem was a stack overflow. Now, to be sure, this is a particularly
> > degenerate case, because in real-world usage it'd be impossible in
> > practice, I think, to trigger that many calls to this function (and by
> > extension the interrupt handler).
>
> Yeah. In addition, since currently only superusers are allowed to execute
> pg_log_query_plan(), I think we don't need to worry about cases
> where users are malicious.
>
> > If SIGINT is the only concern we could reset
> > ProcessLogQueryPlanInterruptActive in error handling code. I admit
> > that part of my thought process here is thinking ahead to an
> > additional patch I'd like to see on top of this, which is logging a
> > query plan before cleaning up when statement timeout occurs.
>
> I remember this is what you wanted to do. [1]
>
> > The
> > re-entrancy issue becomes more interesting then, I think, since we
> > would then have automated calling of the logging code. BTW: I'd
> > thought that would make a nice follow-up patch for this, but if you'd
> > prefer I could add it as another patch in the series here.
> >
> > What do you think about resetting the flag versus just not having it?
>
> If I understand you correctly, adding the flag is not necessary for this
> proposal.
> To keep the patch simple, I prefer not having it.
>

I'm going to re-run tests with my patch version + resetting the flag
on SIGINT (and any other error condition) to be certain that the issue
you uncovered (where backends get stuck after a SIGINT not responding
to the requested plan logging) wasn't masking any other issues.

As long as that run is clean also then I believe the patch is safe
as-is even without the re-entrancy guard.

I'll report back with the results of that testing.

Regards,
James Coleman


From: James Coleman <jtc331(at)gmail(dot)com>
To: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Greg Stark <stark(at)mit(dot)edu>, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>, david(dot)christensen(at)crunchydata(dot)com, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Subject: Re: RFC: Logging plan of the running query
Date: 2023-06-14 16:48:52
Message-ID: CAAaqYe-M63VNMpgq0sEw3+t2KaJyhXfu3uBU7w_jtfbGAm4LoA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Jun 13, 2023 at 11:53 AM James Coleman <jtc331(at)gmail(dot)com> wrote:
>
> ...
> I'm going to re-run tests with my patch version + resetting the flag
> on SIGINT (and any other error condition) to be certain that the issue
> you uncovered (where backends get stuck after a SIGINT not responding
> to the requested plan logging) wasn't masking any other issues.
>
> As long as that run is clean also then I believe the patch is safe
> as-is even without the re-entrancy guard.
>
> I'll report back with the results of that testing.

The tests have been running since last night, but have been apparently
hung now for many hours. I haven't been able to fully look into it,
but so far I know the hung (100% CPU) backend last logged this:

2023-06-14 02:00:30.045 UTC client backend[84461]
pg_regress/updatable_views LOG: query plan running on backend with
PID 84461 is:
Query Text: SELECT table_name, column_name, is_updatable
FROM information_schema.columns
WHERE table_name LIKE E'r_\\_view%'
ORDER BY table_name, ordinal_position;

The last output from the regression test harness was:

# parallel group (5 tests): index_including create_view
index_including_gist create_index create_index_spgist
ok 66 + create_index 36508 ms
ok 67 + create_index_spgist 38588 ms
ok 68 + create_view 1394 ms
ok 69 + index_including 654 ms
ok 70 + index_including_gist 1701 ms
# parallel group (16 tests): errors create_cast drop_if_exists
create_aggregate roleattributes constraints hash_func typed_table
infinite_recurse

Attaching gdb to the hung backend shows this:

#0 0x00005601ab1f9529 in ProcLockWakeup
(lockMethodTable=lockMethodTable(at)entry=0x5601ab6484e0
<default_lockmethod>, lock=lock(at)entry=0x7f5325c913f0) at proc.c:1655
#1 0x00005601ab1e99dc in CleanUpLock (lock=lock(at)entry=0x7f5325c913f0,
proclock=proclock(at)entry=0x7f5325d40d60,
lockMethodTable=lockMethodTable(at)entry=0x5601ab6484e0
<default_lockmethod>,
hashcode=hashcode(at)entry=573498161, wakeupNeeded=<optimized out>)
at lock.c:1673
#2 0x00005601ab1e9e21 in LockRefindAndRelease
(lockMethodTable=lockMethodTable(at)entry=0x5601ab6484e0
<default_lockmethod>, proc=<optimized out>,
locktag=locktag(at)entry=0x5601ac3d7998, lockmode=lockmode(at)entry=1,
decrement_strong_lock_count=decrement_strong_lock_count(at)entry=false)
at lock.c:3150
#3 0x00005601ab1edb27 in LockReleaseAll
(lockmethodid=lockmethodid(at)entry=1, allLocks=false) at lock.c:2295
#4 0x00005601ab1f8599 in ProcReleaseLocks
(isCommit=isCommit(at)entry=true) at proc.c:781
#5 0x00005601ab37f1f4 in ResourceOwnerReleaseInternal
(owner=<optimized out>, phase=phase(at)entry=RESOURCE_RELEASE_LOCKS,
isCommit=isCommit(at)entry=true, isTopLevel=isTopLevel(at)entry=true) at
resowner.c:618
#6 0x00005601ab37f7b7 in ResourceOwnerRelease (owner=<optimized out>,
phase=phase(at)entry=RESOURCE_RELEASE_LOCKS,
isCommit=isCommit(at)entry=true, isTopLevel=isTopLevel(at)entry=true) at
resowner.c:494
#7 0x00005601aaec1d84 in CommitTransaction () at xact.c:2334
#8 0x00005601aaec2b22 in CommitTransactionCommand () at xact.c:3067
#9 0x00005601ab200a66 in finish_xact_command () at postgres.c:2783
#10 0x00005601ab20338f in exec_simple_query (
query_string=query_string(at)entry=0x5601ac3b0858 "SELECT table_name,
column_name, is_updatable\n FROM information_schema.columns\n WHERE
table_name LIKE E'r_\\\\_view%'\n ORDER BY table_name,
ordinal_position;") at postgres.c:1300

I am unable to connect to the regression test Postgres instance --
psql just hangs, so the lock seems to have affected the postmaster
also.

I'm wondering if this might represent a bug in the current patch.

Regards,
James Coleman


From: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
To: James Coleman <jtc331(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Greg Stark <stark(at)mit(dot)edu>, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>, david(dot)christensen(at)crunchydata(dot)com, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Subject: Re: RFC: Logging plan of the running query
Date: 2023-06-15 13:00:46
Message-ID: d81f73616aa5164a68033bc96662dfbc@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2023-06-15 01:48, James Coleman wrote:
> On Tue, Jun 13, 2023 at 11:53 AM James Coleman <jtc331(at)gmail(dot)com>
> wrote:
>>
>> ...
>> I'm going to re-run tests with my patch version + resetting the flag
>> on SIGINT (and any other error condition) to be certain that the issue
>> you uncovered (where backends get stuck after a SIGINT not responding
>> to the requested plan logging) wasn't masking any other issues.
>>
>> As long as that run is clean also then I believe the patch is safe
>> as-is even without the re-entrancy guard.
>>
>> I'll report back with the results of that testing.
>
> The tests have been running since last night, but have been apparently
> hung now for many hours. I haven't been able to fully look into it,
> but so far I know the hung (100% CPU) backend last logged this:
>
> 2023-06-14 02:00:30.045 UTC client backend[84461]
> pg_regress/updatable_views LOG: query plan running on backend with
> PID 84461 is:
> Query Text: SELECT table_name, column_name, is_updatable
> FROM information_schema.columns
> WHERE table_name LIKE E'r_\\_view%'
> ORDER BY table_name, ordinal_position;
>
> The last output from the regression test harness was:
>
> # parallel group (5 tests): index_including create_view
> index_including_gist create_index create_index_spgist
> ok 66 + create_index 36508 ms
> ok 67 + create_index_spgist 38588 ms
> ok 68 + create_view 1394 ms
> ok 69 + index_including 654 ms
> ok 70 + index_including_gist 1701 ms
> # parallel group (16 tests): errors create_cast drop_if_exists
> create_aggregate roleattributes constraints hash_func typed_table
> infinite_recurse
>
> Attaching gdb to the hung backend shows this:
>
> #0 0x00005601ab1f9529 in ProcLockWakeup
> (lockMethodTable=lockMethodTable(at)entry=0x5601ab6484e0
> <default_lockmethod>, lock=lock(at)entry=0x7f5325c913f0) at proc.c:1655
> #1 0x00005601ab1e99dc in CleanUpLock (lock=lock(at)entry=0x7f5325c913f0,
> proclock=proclock(at)entry=0x7f5325d40d60,
> lockMethodTable=lockMethodTable(at)entry=0x5601ab6484e0
> <default_lockmethod>,
> hashcode=hashcode(at)entry=573498161, wakeupNeeded=<optimized out>)
> at lock.c:1673
> #2 0x00005601ab1e9e21 in LockRefindAndRelease
> (lockMethodTable=lockMethodTable(at)entry=0x5601ab6484e0
> <default_lockmethod>, proc=<optimized out>,
> locktag=locktag(at)entry=0x5601ac3d7998, lockmode=lockmode(at)entry=1,
>
> decrement_strong_lock_count=decrement_strong_lock_count(at)entry=false)
> at lock.c:3150
> #3 0x00005601ab1edb27 in LockReleaseAll
> (lockmethodid=lockmethodid(at)entry=1, allLocks=false) at lock.c:2295
> #4 0x00005601ab1f8599 in ProcReleaseLocks
> (isCommit=isCommit(at)entry=true) at proc.c:781
> #5 0x00005601ab37f1f4 in ResourceOwnerReleaseInternal
> (owner=<optimized out>, phase=phase(at)entry=RESOURCE_RELEASE_LOCKS,
> isCommit=isCommit(at)entry=true, isTopLevel=isTopLevel(at)entry=true) at
> resowner.c:618
> #6 0x00005601ab37f7b7 in ResourceOwnerRelease (owner=<optimized out>,
> phase=phase(at)entry=RESOURCE_RELEASE_LOCKS,
> isCommit=isCommit(at)entry=true, isTopLevel=isTopLevel(at)entry=true) at
> resowner.c:494
> #7 0x00005601aaec1d84 in CommitTransaction () at xact.c:2334
> #8 0x00005601aaec2b22 in CommitTransactionCommand () at xact.c:3067
> #9 0x00005601ab200a66 in finish_xact_command () at postgres.c:2783
> #10 0x00005601ab20338f in exec_simple_query (
> query_string=query_string(at)entry=0x5601ac3b0858 "SELECT table_name,
> column_name, is_updatable\n FROM information_schema.columns\n WHERE
> table_name LIKE E'r_\\\\_view%'\n ORDER BY table_name,
> ordinal_position;") at postgres.c:1300
>
> I am unable to connect to the regression test Postgres instance --
> psql just hangs, so the lock seems to have affected the postmaster
> also.
>
> I'm wondering if this might represent a bug in the current patch.

Thanks for running and analyzing the test!

Could you share how you are running the test?

I imagined something like below, but so far I couldn't reproduce it.
- apply both v26-0001 and v27-0002 and build
- run PostgreSQL with default GUCs
- make installcheck-world
- run 'SELECT pg_log_query_plan(pid) FROM pg_stat_activity \watch 0.1'
during make installcheck-world

--
Regards,

--
Atsushi Torikoshi
NTT DATA CORPORATION


From: James Coleman <jtc331(at)gmail(dot)com>
To: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Greg Stark <stark(at)mit(dot)edu>, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>, david(dot)christensen(at)crunchydata(dot)com, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Subject: Re: RFC: Logging plan of the running query
Date: 2023-06-15 16:34:23
Message-ID: CAAaqYe9QB_CUjXrvNh-cpzW6Hz7S4o5pZnR1U1TujLGptUrS1A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Jun 15, 2023 at 9:00 AM torikoshia <torikoshia(at)oss(dot)nttdata(dot)com> wrote:
>
> On 2023-06-15 01:48, James Coleman wrote:
> > On Tue, Jun 13, 2023 at 11:53 AM James Coleman <jtc331(at)gmail(dot)com>
> > wrote:
> >>
> >> ...
> >> I'm going to re-run tests with my patch version + resetting the flag
> >> on SIGINT (and any other error condition) to be certain that the issue
> >> you uncovered (where backends get stuck after a SIGINT not responding
> >> to the requested plan logging) wasn't masking any other issues.
> >>
> >> As long as that run is clean also then I believe the patch is safe
> >> as-is even without the re-entrancy guard.
> >>
> >> I'll report back with the results of that testing.
> >
> > The tests have been running since last night, but have been apparently
> > hung now for many hours. I haven't been able to fully look into it,
> > but so far I know the hung (100% CPU) backend last logged this:
> >
> > 2023-06-14 02:00:30.045 UTC client backend[84461]
> > pg_regress/updatable_views LOG: query plan running on backend with
> > PID 84461 is:
> > Query Text: SELECT table_name, column_name, is_updatable
> > FROM information_schema.columns
> > WHERE table_name LIKE E'r_\\_view%'
> > ORDER BY table_name, ordinal_position;
> >
> > The last output from the regression test harness was:
> >
> > # parallel group (5 tests): index_including create_view
> > index_including_gist create_index create_index_spgist
> > ok 66 + create_index 36508 ms
> > ok 67 + create_index_spgist 38588 ms
> > ok 68 + create_view 1394 ms
> > ok 69 + index_including 654 ms
> > ok 70 + index_including_gist 1701 ms
> > # parallel group (16 tests): errors create_cast drop_if_exists
> > create_aggregate roleattributes constraints hash_func typed_table
> > infinite_recurse
> >
> > Attaching gdb to the hung backend shows this:
> >
> > #0 0x00005601ab1f9529 in ProcLockWakeup
> > (lockMethodTable=lockMethodTable(at)entry=0x5601ab6484e0
> > <default_lockmethod>, lock=lock(at)entry=0x7f5325c913f0) at proc.c:1655
> > #1 0x00005601ab1e99dc in CleanUpLock (lock=lock(at)entry=0x7f5325c913f0,
> > proclock=proclock(at)entry=0x7f5325d40d60,
> > lockMethodTable=lockMethodTable(at)entry=0x5601ab6484e0
> > <default_lockmethod>,
> > hashcode=hashcode(at)entry=573498161, wakeupNeeded=<optimized out>)
> > at lock.c:1673
> > #2 0x00005601ab1e9e21 in LockRefindAndRelease
> > (lockMethodTable=lockMethodTable(at)entry=0x5601ab6484e0
> > <default_lockmethod>, proc=<optimized out>,
> > locktag=locktag(at)entry=0x5601ac3d7998, lockmode=lockmode(at)entry=1,
> >
> > decrement_strong_lock_count=decrement_strong_lock_count(at)entry=false)
> > at lock.c:3150
> > #3 0x00005601ab1edb27 in LockReleaseAll
> > (lockmethodid=lockmethodid(at)entry=1, allLocks=false) at lock.c:2295
> > #4 0x00005601ab1f8599 in ProcReleaseLocks
> > (isCommit=isCommit(at)entry=true) at proc.c:781
> > #5 0x00005601ab37f1f4 in ResourceOwnerReleaseInternal
> > (owner=<optimized out>, phase=phase(at)entry=RESOURCE_RELEASE_LOCKS,
> > isCommit=isCommit(at)entry=true, isTopLevel=isTopLevel(at)entry=true) at
> > resowner.c:618
> > #6 0x00005601ab37f7b7 in ResourceOwnerRelease (owner=<optimized out>,
> > phase=phase(at)entry=RESOURCE_RELEASE_LOCKS,
> > isCommit=isCommit(at)entry=true, isTopLevel=isTopLevel(at)entry=true) at
> > resowner.c:494
> > #7 0x00005601aaec1d84 in CommitTransaction () at xact.c:2334
> > #8 0x00005601aaec2b22 in CommitTransactionCommand () at xact.c:3067
> > #9 0x00005601ab200a66 in finish_xact_command () at postgres.c:2783
> > #10 0x00005601ab20338f in exec_simple_query (
> > query_string=query_string(at)entry=0x5601ac3b0858 "SELECT table_name,
> > column_name, is_updatable\n FROM information_schema.columns\n WHERE
> > table_name LIKE E'r_\\\\_view%'\n ORDER BY table_name,
> > ordinal_position;") at postgres.c:1300
> >
> > I am unable to connect to the regression test Postgres instance --
> > psql just hangs, so the lock seems to have affected the postmaster
> > also.
> >
> > I'm wondering if this might represent a bug in the current patch.
>
> Thanks for running and analyzing the test!

Sure thing!

> Could you share how you are running the test?
>
> I imagined something like below, but so far I couldn't reproduce it.
> - apply both v26-0001 and v27-0002 and build
> - run PostgreSQL with default GUCs
> - make installcheck-world
> - run 'SELECT pg_log_query_plan(pid) FROM pg_stat_activity \watch 0.1'
> during make installcheck-world

Apologies, I should have attached my updated patch (with the fix for
the bug you'd reported with the re-entrancy guard). Attached is v28
which sets ProcessLogQueryPlanInterruptActive to false in errfinish
when necessary. Once built with those two patches I'm simply running
`make check`.
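
The errfinish() piece is essentially this shape (a sketch; the attached
v28 patch has the real hunk):

```c
/*
 * Sketch of the v28 change (illustrative; see the attached patch for the
 * real hunk).  At the point in errfinish() where the error level is known:
 */
if (elevel >= ERROR)
{
    /*
     * An ERROR may have aborted ProcessLogQueryPlanInterrupt() before it
     * could clear its re-entrancy flag, so reset it here; otherwise later
     * pg_log_query_plan() requests would be silently ignored.
     */
    ProcessLogQueryPlanInterruptActive = false;
}
```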

Regards,
James Coleman

Attachment Content-Type Size
v28-0001-Add-function-to-log-the-plan-of-the-query.patch application/octet-stream 26.8 KB
v28-0002-Testing-attempt-logging-plan-on-ever-CFI-call.patch application/octet-stream 891 bytes

From: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
To: James Coleman <jtc331(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Greg Stark <stark(at)mit(dot)edu>, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>, david(dot)christensen(at)crunchydata(dot)com, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Subject: Re: RFC: Logging plan of the running query
Date: 2023-06-27 10:03:21
Message-ID: 00dac323d6732d59ba2c03be729966dc@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2023-06-16 01:34, James Coleman wrote:
> On Thu, Jun 15, 2023 at 9:00 AM torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
> wrote:
>>
>> On 2023-06-15 01:48, James Coleman wrote:
>> > On Tue, Jun 13, 2023 at 11:53 AM James Coleman <jtc331(at)gmail(dot)com>
>> > wrote:
>> >>
>> >> ...
>> >> I'm going to re-run tests with my patch version + resetting the flag
>> >> on SIGINT (and any other error condition) to be certain that the issue
>> >> you uncovered (where backends get stuck after a SIGINT not responding
>> >> to the requested plan logging) wasn't masking any other issues.
>> >>
>> >> As long as that run is clean also then I believe the patch is safe
>> >> as-is even without the re-entrancy guard.
>> >>
>> >> I'll report back with the results of that testing.
>> >
>> > The tests have been running since last night, but have been apparently
>> > hung now for many hours. I haven't been able to fully look into it,
>> > but so far I know the hung (100% CPU) backend last logged this:
>> >
>> > 2023-06-14 02:00:30.045 UTC client backend[84461]
>> > pg_regress/updatable_views LOG: query plan running on backend with
>> > PID 84461 is:
>> > Query Text: SELECT table_name, column_name, is_updatable
>> > FROM information_schema.columns
>> > WHERE table_name LIKE E'r_\\_view%'
>> > ORDER BY table_name, ordinal_position;
>> >
>> > The last output from the regression test harness was:
>> >
>> > # parallel group (5 tests): index_including create_view
>> > index_including_gist create_index create_index_spgist
>> > ok 66 + create_index 36508 ms
>> > ok 67 + create_index_spgist 38588 ms
>> > ok 68 + create_view 1394 ms
>> > ok 69 + index_including 654 ms
>> > ok 70 + index_including_gist 1701 ms
>> > # parallel group (16 tests): errors create_cast drop_if_exists
>> > create_aggregate roleattributes constraints hash_func typed_table
>> > infinite_recurse
>> >
>> > Attaching gdb to the hung backend shows this:
>> >
>> > #0 0x00005601ab1f9529 in ProcLockWakeup
>> > (lockMethodTable=lockMethodTable(at)entry=0x5601ab6484e0
>> > <default_lockmethod>, lock=lock(at)entry=0x7f5325c913f0) at proc.c:1655
>> > #1 0x00005601ab1e99dc in CleanUpLock (lock=lock(at)entry=0x7f5325c913f0,
>> > proclock=proclock(at)entry=0x7f5325d40d60,
>> > lockMethodTable=lockMethodTable(at)entry=0x5601ab6484e0
>> > <default_lockmethod>,
>> > hashcode=hashcode(at)entry=573498161, wakeupNeeded=<optimized out>)
>> > at lock.c:1673
>> > #2 0x00005601ab1e9e21 in LockRefindAndRelease
>> > (lockMethodTable=lockMethodTable(at)entry=0x5601ab6484e0
>> > <default_lockmethod>, proc=<optimized out>,
>> > locktag=locktag(at)entry=0x5601ac3d7998, lockmode=lockmode(at)entry=1,
>> >
>> > decrement_strong_lock_count=decrement_strong_lock_count(at)entry=false)
>> > at lock.c:3150
>> > #3 0x00005601ab1edb27 in LockReleaseAll
>> > (lockmethodid=lockmethodid(at)entry=1, allLocks=false) at lock.c:2295
>> > #4 0x00005601ab1f8599 in ProcReleaseLocks
>> > (isCommit=isCommit(at)entry=true) at proc.c:781
>> > #5 0x00005601ab37f1f4 in ResourceOwnerReleaseInternal
>> > (owner=<optimized out>, phase=phase(at)entry=RESOURCE_RELEASE_LOCKS,
>> > isCommit=isCommit(at)entry=true, isTopLevel=isTopLevel(at)entry=true) at
>> > resowner.c:618
>> > #6 0x00005601ab37f7b7 in ResourceOwnerRelease (owner=<optimized out>,
>> > phase=phase(at)entry=RESOURCE_RELEASE_LOCKS,
>> > isCommit=isCommit(at)entry=true, isTopLevel=isTopLevel(at)entry=true) at
>> > resowner.c:494
>> > #7 0x00005601aaec1d84 in CommitTransaction () at xact.c:2334
>> > #8 0x00005601aaec2b22 in CommitTransactionCommand () at xact.c:3067
>> > #9 0x00005601ab200a66 in finish_xact_command () at postgres.c:2783
>> > #10 0x00005601ab20338f in exec_simple_query (
>> > query_string=query_string(at)entry=0x5601ac3b0858 "SELECT table_name,
>> > column_name, is_updatable\n FROM information_schema.columns\n WHERE
>> > table_name LIKE E'r_\\\\_view%'\n ORDER BY table_name,
>> > ordinal_position;") at postgres.c:1300
>> >
>> > I am unable to connect to the regression test Postgres instance --
>> > psql just hangs, so the lock seems to have affected the postmaster
>> > also.
>> >
>> > I'm wondering if this might represent a bug in the current patch.
>>
>> Thanks for running and analyzing the test!
>
> Sure thing!
>
>> Could you share how you are running the test?
>>
>> I imagined something like below, but so far I couldn't reproduce it.
>> - apply both v26-0001 and v27-0002 and build
>> - run PostgreSQL with default GUCs
>> - make installcheck-world
>> - run 'SELECT pg_log_query_plan(pid) FROM pg_stat_activity \watch 0.1'
>> during make installcheck-world
>
> Apologies, I should have attached my updated patch (with the fix for
> the bug you'd reported with the re-entrancy guard). Attached is v28
> which sets ProcessLogQueryPlanInterruptActive to false in errfinish
> when necessary. Once built with those two patches I'm simply running
> `make check`.

Thanks!

However, I haven't succeeded in reproducing the problem as below:

> The tests have been running since last night, but have been apparently
> hung now for many hours. I haven't been able to fully look into it,
> but so far I know the hung (100% CPU) backend last logged this:

Did you do something like this?

$ ./configure --prefix=/home/ubuntu/pgsql/master --enable-cassert
$ git apply
../patch/v28-0001-Add-function-to-log-the-plan-of-the-query.patch
../patch/v28-0002-Testing-attempt-logging-plan-on-ever-CFI-call.patch
$ make
$ make check

--
Regards,

--
Atsushi Torikoshi
NTT DATA CORPORATION


From: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
To: James Coleman <jtc331(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Greg Stark <stark(at)mit(dot)edu>, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>, david(dot)christensen(at)crunchydata(dot)com, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Subject: Re: RFC: Logging plan of the running query
Date: 2023-08-17 14:02:22
Message-ID: af8b4c4752dcee3ca7284e7c15922d5d@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2023-06-16 01:34, James Coleman wrote:
> Attached is v28
> which sets ProcessLogQueryPlanInterruptActive to false in errfinish
> when necessary. Once built with those two patches I'm simply running
> `make check`.

With the v28-0001 and v28-0002 patches, I confirmed that backend processes
consume a huge amount of memory, and under some environments they were
terminated by the OOM killer.

This was because memory was allocated from existing memory contexts and
was not freed after ProcessLogQueryPlanInterrupt().
I updated the patch to use a dedicated memory context for
ProcessLogQueryPlanInterrupt().
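
Concretely, it follows the usual pattern of doing the work in a throwaway
context, roughly like this (a sketch; the attached v29 patch has the real
code):

```c
/*
 * Sketch of the v29 approach (illustrative): run the EXPLAIN machinery in
 * a dedicated short-lived context so repeated interrupts can't accumulate
 * allocations in the backend's long-lived contexts.
 */
MemoryContext cxt = AllocSetContextCreate(CurrentMemoryContext,
                                          "log plan of the running query",
                                          ALLOCSET_DEFAULT_SIZES);
MemoryContext oldcxt = MemoryContextSwitchTo(cxt);

/* ... build the ExplainState and emit the plan via ereport() ... */

MemoryContextSwitchTo(oldcxt);
MemoryContextDelete(cxt);   /* frees everything EXPLAIN allocated */
```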

Applying the attached patch and the v28-0002 patch, `make check` successfully
completed after 20 min and 50 GB of logs on my environment.

>>> On 2023-06-15 01:48, James Coleman wrote:
>>> > The tests have been running since last night, but have been apparently
>>> > hung now for many hours.

I don't know if this has anything to do with the hang you faced, but I
thought it might be possible that the large amount of memory usage resulted
in swapping, which caused a significant delay in processing.

If possible, I would be very grateful if you could try to reproduce this
with the v29 patch.

--
Regards,

--
Atsushi Torikoshi
NTT DATA CORPORATION

Attachment Content-Type Size
v29-0001-Add-function-to-log-the-plan-of-the-query.patch text/x-diff 27.0 KB

From: James Coleman <jtc331(at)gmail(dot)com>
To: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Greg Stark <stark(at)mit(dot)edu>, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>, david(dot)christensen(at)crunchydata(dot)com, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Subject: Re: RFC: Logging plan of the running query
Date: 2023-08-25 11:43:26
Message-ID: CAAaqYe9TPyWe9yoOm5Jz_39gSnhCAYXozc698jkyegM1x74Qjg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Aug 17, 2023 at 10:02 AM torikoshia <torikoshia(at)oss(dot)nttdata(dot)com> wrote:
>
> On 2023-06-16 01:34, James Coleman wrote:
> > Attached is v28
> > which sets ProcessLogQueryPlanInterruptActive to false in errfinish
> > when necessary. Once built with those two patches I'm simply running
> > `make check`.
>
> With the v28-0001 and v28-0002 patches, I confirmed that backend processes
> consume a huge amount of memory, and under some environments they were
> terminated by the OOM killer.
>
> This was because memory was allocated from existing memory contexts and
> was not freed after ProcessLogQueryPlanInterrupt().
> I updated the patch to use a dedicated memory context for
> ProcessLogQueryPlanInterrupt().
>
> Applying the attached patch and the v28-0002 patch, `make check` successfully
> completed after 20 min and 50 GB of logs on my environment.
>
> >>> On 2023-06-15 01:48, James Coleman wrote:
> >>> > The tests have been running since last night, but have been apparently
> >>> > hung now for many hours.
>
> I don't know if this has anything to do with the hang you faced, but I
> thought it might be possible that the large amount of memory usage resulted
> in swapping, which caused a significant delay in processing.

Ah, yes, I think that could be a possible explanation. I was delaying
on this thread because I wasn't comfortable with having caused an
issue once (even if I couldn't easily reproduce it) without at least some
theory as to the cause (and a fix).

> If possible, I would be very grateful if you could try to reproduce this
> with the v29 patch.

I'll kick off some testing.

Thanks,
James Coleman


From: James Coleman <jtc331(at)gmail(dot)com>
To: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Greg Stark <stark(at)mit(dot)edu>, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>, david(dot)christensen(at)crunchydata(dot)com, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Subject: Re: RFC: Logging plan of the running query
Date: 2023-08-26 12:03:26
Message-ID: CAAaqYe9Mse7JC78DgT97sXoB6U+E5OL3zJAYRSsgy-TPP9PXmw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Aug 25, 2023 at 7:43 AM James Coleman <jtc331(at)gmail(dot)com> wrote:
>
> On Thu, Aug 17, 2023 at 10:02 AM torikoshia <torikoshia(at)oss(dot)nttdata(dot)com> wrote:
> >
> > On 2023-06-16 01:34, James Coleman wrote:
> > > Attached is v28
> > > which sets ProcessLogQueryPlanInterruptActive to false in errfinish
> > > when necessary. Once built with those two patches I'm simply running
> > > `make check`.
> >
> > With the v28-0001 and v28-0002 patches, I confirmed that backend processes
> > consume a huge amount of memory, and under some environments they were
> > terminated by the OOM killer.
> >
> > This was because memory was allocated from existing memory contexts and
> > was not freed after ProcessLogQueryPlanInterrupt().
> > I updated the patch to use a dedicated memory context for
> > ProcessLogQueryPlanInterrupt().
> >
> > Applying the attached patch and the v28-0002 patch, `make check` successfully
> > completed after 20 min and 50 GB of logs on my environment.
> >
> > >>> On 2023-06-15 01:48, James Coleman wrote:
> > >>> > The tests have been running since last night, but have been apparently
> > >>> > hung now for many hours.
> >
> > I don't know if this has anything to do with the hang you faced, but I
> > thought it might be possible that the large amount of memory usage resulted
> > in swapping, which caused a significant delay in processing.
>
> Ah, yes, I think that could be a possible explanation. I was delaying
> on this thread because I wasn't comfortable with having caused an
> issue once (even if I couldn't easily reproduce it) without at least some
> theory as to the cause (and a fix).
>
> > If possible, I would be very grateful if you could try to reproduce this
> > with the v29 patch.
>
> I'll kick off some testing.
>

I don't have time to investigate what's happening here, but 24 hours
later the first "make check" is still running, and at first glance it
seems to have the same behavior I'd seen that first time. The test
output is to this point:

# parallel group (5 tests): index_including create_view
index_including_gist create_index create_index_spgist
ok 66 + create_index 26365 ms
ok 67 + create_index_spgist 27675 ms
ok 68 + create_view 1235 ms
ok 69 + index_including 1102 ms
ok 70 + index_including_gist 1633 ms
# parallel group (16 tests): create_aggregate create_cast errors
roleattributes drop_if_exists hash_func typed_table create_am
infinite_recurse

and it hasn't progressed past that point since at least ~16 hours ago
(I wasn't monitoring it for the first several hours of the run).

I haven't connected up gdb yet, and won't be able to until maybe
tomorrow, but here's the ps output for postgres processes that are
running:

admin 3213249 0.0 0.0 196824 20552 ? Ss Aug25 0:00
/home/admin/postgresql-test/bin/postgres -D
/home/admin/postgresql-test-data
admin 3213250 0.0 0.0 196964 7284 ? Ss Aug25 0:00
postgres: checkpointer
admin 3213251 0.0 0.0 196956 4276 ? Ss Aug25 0:00
postgres: background writer
admin 3213253 0.0 0.0 196956 8600 ? Ss Aug25 0:00
postgres: walwriter
admin 3213254 0.0 0.0 198424 7316 ? Ss Aug25 0:00
postgres: autovacuum launcher
admin 3213255 0.0 0.0 198412 5488 ? Ss Aug25 0:00
postgres: logical replication launcher
admin 3237967 0.0 0.0 2484 516 pts/4 S+ Aug25 0:00
/bin/sh -c echo "# +++ regress check in src/test/regress +++" &&
PATH="/home/admin/postgres/tmp_install/home/admin/postgresql-test/bin:/home/admin/postgres/src/test/regress:$PATH"
LD_LIBRARY_PATH="/home/admin/postgres/tmp_install/home/admin/postgresql-test/lib"
INITDB_TEMPLATE='/home/admin/postgres'/tmp_install/initdb-template
../../../src/test/regress/pg_regress --temp-instance=./tmp_check
--inputdir=. --bindir= --dlpath=. --max-concurrent-tests=20
--schedule=./parallel_schedule
admin 3237973 0.0 0.0 197880 20688 pts/4 S+ Aug25 0:00
postgres -D /home/admin/postgres/src/test/regress/tmp_check/data -F -c
listen_addresses= -k /tmp/pg_regress-7mmGUa
admin 3237976 0.0 0.1 198332 44608 ? Ss Aug25 0:00
postgres: checkpointer
admin 3237977 0.0 0.0 198032 4640 ? Ss Aug25 0:00
postgres: background writer
admin 3237979 0.0 0.0 197880 8580 ? Ss Aug25 0:00
postgres: walwriter
admin 3237980 0.0 0.0 199484 7608 ? Ss Aug25 0:00
postgres: autovacuum launcher
admin 3237981 0.0 0.0 199460 5488 ? Ss Aug25 0:00
postgres: logical replication launcher
admin 3243644 0.0 0.2 252400 74656 ? Ss Aug25 0:01
postgres: admin regression [local] SELECT waiting
admin 3243645 0.0 0.1 205480 33992 ? Ss Aug25 0:00
postgres: admin regression [local] SELECT waiting
admin 3243654 99.9 0.0 203156 31504 ? Rs Aug25 1437:49
postgres: admin regression [local] VACUUM
admin 3243655 0.0 0.1 212036 38504 ? Ss Aug25 0:00
postgres: admin regression [local] SELECT waiting
admin 3243656 0.0 0.0 206024 30892 ? Ss Aug25 0:00
postgres: admin regression [local] DELETE waiting
admin 3243657 0.0 0.1 205568 32232 ? Ss Aug25 0:00
postgres: admin regression [local] ALTER TABLE waiting
admin 3243658 0.0 0.0 203740 21532 ? Ss Aug25 0:00
postgres: admin regression [local] ANALYZE waiting
admin 3243798 0.0 0.0 199884 8464 ? Ss Aug25 0:00
postgres: autovacuum worker
admin 3244733 0.0 0.0 199492 5956 ? Ss Aug25 0:00
postgres: autovacuum worker
admin 3245652 0.0 0.0 199884 8468 ? Ss Aug25 0:00
postgres: autovacuum worker

As you can see there are a bunch of backends presumably waiting, and
also the VACUUM process has been pegging a single CPU core for at
least the last ~16 hours.

I hope to be able to do more investigation later, but I wanted to at
least give you this information now.

James Coleman


From: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
To: James Coleman <jtc331(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Greg Stark <stark(at)mit(dot)edu>, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>, david(dot)christensen(at)crunchydata(dot)com, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Subject: Re: RFC: Logging plan of the running query
Date: 2023-08-28 07:01:53
Message-ID: 4e1e4e5d022964953e84468fe4e511b0@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2023-08-26 21:03, James Coleman wrote:
> On Fri, Aug 25, 2023 at 7:43 AM James Coleman <jtc331(at)gmail(dot)com> wrote:
>>
>> On Thu, Aug 17, 2023 at 10:02 AM torikoshia
>> <torikoshia(at)oss(dot)nttdata(dot)com> wrote:
>> >
>> > On 2023-06-16 01:34, James Coleman wrote:
>> > > Attached is v28
>> > > which sets ProcessLogQueryPlanInterruptActive to false in errfinish
>> > > when necessary. Once built with those two patches I'm simply running
>> > > `make check`.
>> >
>> > With the v28-0001 and v28-0002 patches, I confirmed that backend processes
>> > consume a huge amount of memory, and under some environments they were
>> > terminated by the OOM killer.
>> >
>> > This was because memory was allocated from existing memory contexts and
>> > was not freed after ProcessLogQueryPlanInterrupt().
>> > I updated the patch to use a dedicated memory context for
>> > ProcessLogQueryPlanInterrupt().
>> >
>> > Applying the attached patch and the v28-0002 patch, `make check` successfully
>> > completed after 20 min and 50 GB of logs on my environment.
>> >
>> > >>> On 2023-06-15 01:48, James Coleman wrote:
>> > >>> > The tests have been running since last night, but have been apparently
>> > >>> > hung now for many hours.
>> >
>> > I don't know if this has anything to do with the hang you faced, but I
>> > thought it might be possible that the large amount of memory usage resulted
>> > in swapping, which caused a significant delay in processing.
>>
>> Ah, yes, I think that could be a possible explanation. I was delaying
>> on this thread because I wasn't comfortable with having caused an
>> issue once (even if I couldn't easily reproduce it) without at least some
>> theory as to the cause (and a fix).
>>
>> > If possible, I would be very grateful if you could try to reproduce this
>> > with the v29 patch.
>>
>> I'll kick off some testing.
>>
>
> I don't have time to investigate what's happening here, but 24 hours
> later the first "make check" is still running, and at first glance it
> seems to have the same behavior I'd seen that first time. The test
> output is to this point:
>
> # parallel group (5 tests): index_including create_view
> index_including_gist create_index create_index_spgist
> ok 66 + create_index 26365 ms
> ok 67 + create_index_spgist 27675 ms
> ok 68 + create_view 1235 ms
> ok 69 + index_including 1102 ms
> ok 70 + index_including_gist 1633 ms
> # parallel group (16 tests): create_aggregate create_cast errors
> roleattributes drop_if_exists hash_func typed_table create_am
> infinite_recurse
>
> and it hasn't progressed past that point since at least ~16 hours ago
> (I wasn't monitoring it for the first several hours of the run).
>
> I haven't connected up gdb yet, and won't be able to until maybe
> tomorrow, but here's the ps output for postgres processes that are
> running:
>
> admin 3213249 0.0 0.0 196824 20552 ? Ss Aug25 0:00
> /home/admin/postgresql-test/bin/postgres -D
> /home/admin/postgresql-test-data
> admin 3213250 0.0 0.0 196964 7284 ? Ss Aug25 0:00
> postgres: checkpointer
> admin 3213251 0.0 0.0 196956 4276 ? Ss Aug25 0:00
> postgres: background writer
> admin 3213253 0.0 0.0 196956 8600 ? Ss Aug25 0:00
> postgres: walwriter
> admin 3213254 0.0 0.0 198424 7316 ? Ss Aug25 0:00
> postgres: autovacuum launcher
> admin 3213255 0.0 0.0 198412 5488 ? Ss Aug25 0:00
> postgres: logical replication launcher
> admin 3237967 0.0 0.0 2484 516 pts/4 S+ Aug25 0:00
> /bin/sh -c echo "# +++ regress check in src/test/regress +++" &&
> PATH="/home/admin/postgres/tmp_install/home/admin/postgresql-test/bin:/home/admin/postgres/src/test/regress:$PATH"
> LD_LIBRARY_PATH="/home/admin/postgres/tmp_install/home/admin/postgresql-test/lib"
> INITDB_TEMPLATE='/home/admin/postgres'/tmp_install/initdb-template
> ../../../src/test/regress/pg_regress --temp-instance=./tmp_check
> --inputdir=. --bindir= --dlpath=. --max-concurrent-tests=20
> --schedule=./parallel_schedule
> admin 3237973 0.0 0.0 197880 20688 pts/4 S+ Aug25 0:00
> postgres -D /home/admin/postgres/src/test/regress/tmp_check/data -F -c
> listen_addresses= -k /tmp/pg_regress-7mmGUa
> admin 3237976 0.0 0.1 198332 44608 ? Ss Aug25 0:00
> postgres: checkpointer
> admin 3237977 0.0 0.0 198032 4640 ? Ss Aug25 0:00
> postgres: background writer
> admin 3237979 0.0 0.0 197880 8580 ? Ss Aug25 0:00
> postgres: walwriter
> admin 3237980 0.0 0.0 199484 7608 ? Ss Aug25 0:00
> postgres: autovacuum launcher
> admin 3237981 0.0 0.0 199460 5488 ? Ss Aug25 0:00
> postgres: logical replication launcher
> admin 3243644 0.0 0.2 252400 74656 ? Ss Aug25 0:01
> postgres: admin regression [local] SELECT waiting
> admin 3243645 0.0 0.1 205480 33992 ? Ss Aug25 0:00
> postgres: admin regression [local] SELECT waiting
> admin 3243654 99.9 0.0 203156 31504 ? Rs Aug25 1437:49
> postgres: admin regression [local] VACUUM
> admin 3243655 0.0 0.1 212036 38504 ? Ss Aug25 0:00
> postgres: admin regression [local] SELECT waiting
> admin 3243656 0.0 0.0 206024 30892 ? Ss Aug25 0:00
> postgres: admin regression [local] DELETE waiting
> admin 3243657 0.0 0.1 205568 32232 ? Ss Aug25 0:00
> postgres: admin regression [local] ALTER TABLE waiting
> admin 3243658 0.0 0.0 203740 21532 ? Ss Aug25 0:00
> postgres: admin regression [local] ANALYZE waiting
> admin 3243798 0.0 0.0 199884 8464 ? Ss Aug25 0:00
> postgres: autovacuum worker
> admin 3244733 0.0 0.0 199492 5956 ? Ss Aug25 0:00
> postgres: autovacuum worker
> admin 3245652 0.0 0.0 199884 8468 ? Ss Aug25 0:00
> postgres: autovacuum worker
>
> As you can see there are a bunch of backends presumably waiting, and
> also the VACUUM process has been pegging a single CPU core for at
> least since that ~16 hour ago mark.
>
> I hope to be able to do more investigation later, but I wanted to at
> least give you this information now.

Thanks a lot for testing the patch!
I really appreciate your cooperation.

Hmm, I also tested on the current HEAD (165d581f146b09) again on Ubuntu
22.04 and macOS, but unfortunately (fortunately?) they succeeded as
below:

```
$ git apply v29-0001-Add-function-to-log-the-plan-of-the-query.patch
$ git apply v28-0002-Testing-attempt-logging-plan-on-ever-CFI-call.patch
$ ./configure --enable-debug --enable-cassert
$ make
$ make check

...(snip)...

# parallel group (5 tests): index_including index_including_gist
create_view create_index create_index_spgist
ok 66 + create_index 25033 ms
ok 67 + create_index_spgist 26144 ms
ok 68 + create_view 3061 ms
ok 69 + index_including 976 ms
ok 70 + index_including_gist 2998 ms
# parallel group (16 tests): create_cast errors create_aggregate
roleattributes drop_if_exists hash_func typed_table
create_am select constraints updatable_views inherit triggers vacuum
create_function_sql infinite_recurse
ok 71 + create_aggregate 225 ms
ok 72 + create_function_sql 18874 ms
ok 73 + create_cast 168 ms

...(snip)...

# All 215 tests passed.
```

If you notice any difference, I would be grateful if you could let me
know.

--
Regards,

--
Atsushi Torikoshi
NTT DATA Group Corporation


From: James Coleman <jtc331(at)gmail(dot)com>
To: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Greg Stark <stark(at)mit(dot)edu>, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>, david(dot)christensen(at)crunchydata(dot)com, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Subject: Re: RFC: Logging plan of the running query
Date: 2023-08-28 13:47:19
Message-ID: CAAaqYe_5p_MkBCgSYX=S7G3ZKi=2=_6bZQFzTgVYyv=1NRikpw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Aug 28, 2023 at 3:01 AM torikoshia <torikoshia(at)oss(dot)nttdata(dot)com> wrote:
>
> On 2023-08-26 21:03, James Coleman wrote:
> > On Fri, Aug 25, 2023 at 7:43 AM James Coleman <jtc331(at)gmail(dot)com> wrote:
> >>
> >> On Thu, Aug 17, 2023 at 10:02 AM torikoshia
> >> <torikoshia(at)oss(dot)nttdata(dot)com> wrote:
> >> >
> >> > On 2023-06-16 01:34, James Coleman wrote:
> >> > > Attached is v28
> >> > > which sets ProcessLogQueryPlanInterruptActive to false in errfinish
> >> > > when necessary. Once built with those two patches I'm simply running
> >> > > `make check`.
> >> >
> >> > With v28-0001 and v28-0002 patch, I confirmed backend processes consume
> >> > huge
> >> > amount of memory and under some environments they were terminated by OOM
> >> > killer.
> >> >
> >> > This was because memory was allocated from existing memory contexts and
> >> > they
> >> > were not freed after ProcessLogQueryPlanInterrupt().
> >> > Updated the patch to use dedicated memory context for
> >> > ProcessLogQueryPlanInterrupt().
> >> >
> >> > Applying attached patch and v28-0002 patch, `make check` successfully
> >> > completed after 20min and 50GB of logs on my environment.
> >> >
> >> > >>> On 2023-06-15 01:48, James Coleman wrote:
> >> > >>> > The tests have been running since last night, but have been apparently
> >> > >>> > hung now for many hours.
> >> >
> >> > I don't know if this has anything to do with the hung you faced, but I
> >> > thought
> >> > it might be possible that the large amount of memory usage resulted in
> >> > swapping, which caused a significant delay in processing.
> >>
> >> Ah, yes, I think that could be a possible explanation. I was delaying
> >> on this thread because I wasn't comfortable with having caused an
> >> issue once (even if I couldn't easily reproduce) without at least some
> >> theory as to the cause (and a fix).
> >>
> >> > If possible, I would be very grateful if you could try to reproduce this
> >> > with
> >> > the v29 patch.
> >>
> >> I'll kick off some testing.
> >>
> >
> > I don't have time to investigate what's happening here, but 24 hours
> > later the first "make check" is still running, and at first glance it
> > seems to have the same behavior I'd seen that first time. The test
> > output is to this point:
> >
> > # parallel group (5 tests): index_including create_view
> > index_including_gist create_index create_index_spgist
> > ok 66 + create_index 26365 ms
> > ok 67 + create_index_spgist 27675 ms
> > ok 68 + create_view 1235 ms
> > ok 69 + index_including 1102 ms
> > ok 70 + index_including_gist 1633 ms
> > # parallel group (16 tests): create_aggregate create_cast errors
> > roleattributes drop_if_exists hash_func typed_table create_am
> > infinite_recurse
> >
> > and it hasn't progressed past that point since at least ~16 hours ago
> > (the first several hours of the run I wasn't monitoring it).
> >
> > I haven't connected up gdb yet, and won't be able to until maybe
> > tomorrow, but here's the ps output for postgres processes that are
> > running:
> >
> > admin 3213249 0.0 0.0 196824 20552 ? Ss Aug25 0:00
> > /home/admin/postgresql-test/bin/postgres -D
> > /home/admin/postgresql-test-data
> > admin 3213250 0.0 0.0 196964 7284 ? Ss Aug25 0:00
> > postgres: checkpointer
> > admin 3213251 0.0 0.0 196956 4276 ? Ss Aug25 0:00
> > postgres: background writer
> > admin 3213253 0.0 0.0 196956 8600 ? Ss Aug25 0:00
> > postgres: walwriter
> > admin 3213254 0.0 0.0 198424 7316 ? Ss Aug25 0:00
> > postgres: autovacuum launcher
> > admin 3213255 0.0 0.0 198412 5488 ? Ss Aug25 0:00
> > postgres: logical replication launcher
> > admin 3237967 0.0 0.0 2484 516 pts/4 S+ Aug25 0:00
> > /bin/sh -c echo "# +++ regress check in src/test/regress +++" &&
> > PATH="/home/admin/postgres/tmp_install/home/admin/postgresql-test/bin:/home/admin/postgres/src/test/regress:$PATH"
> > LD_LIBRARY_PATH="/home/admin/postgres/tmp_install/home/admin/postgresql-test/lib"
> > INITDB_TEMPLATE='/home/admin/postgres'/tmp_install/initdb-template
> > ../../../src/test/regress/pg_regress --temp-instance=./tmp_check
> > --inputdir=. --bindir= --dlpath=. --max-concurrent-tests=20
> > --schedule=./parallel_schedule
> > admin 3237973 0.0 0.0 197880 20688 pts/4 S+ Aug25 0:00
> > postgres -D /home/admin/postgres/src/test/regress/tmp_check/data -F -c
> > listen_addresses= -k /tmp/pg_regress-7mmGUa
> > admin 3237976 0.0 0.1 198332 44608 ? Ss Aug25 0:00
> > postgres: checkpointer
> > admin 3237977 0.0 0.0 198032 4640 ? Ss Aug25 0:00
> > postgres: background writer
> > admin 3237979 0.0 0.0 197880 8580 ? Ss Aug25 0:00
> > postgres: walwriter
> > admin 3237980 0.0 0.0 199484 7608 ? Ss Aug25 0:00
> > postgres: autovacuum launcher
> > admin 3237981 0.0 0.0 199460 5488 ? Ss Aug25 0:00
> > postgres: logical replication launcher
> > admin 3243644 0.0 0.2 252400 74656 ? Ss Aug25 0:01
> > postgres: admin regression [local] SELECT waiting
> > admin 3243645 0.0 0.1 205480 33992 ? Ss Aug25 0:00
> > postgres: admin regression [local] SELECT waiting
> > admin 3243654 99.9 0.0 203156 31504 ? Rs Aug25 1437:49
> > postgres: admin regression [local] VACUUM
> > admin 3243655 0.0 0.1 212036 38504 ? Ss Aug25 0:00
> > postgres: admin regression [local] SELECT waiting
> > admin 3243656 0.0 0.0 206024 30892 ? Ss Aug25 0:00
> > postgres: admin regression [local] DELETE waiting
> > admin 3243657 0.0 0.1 205568 32232 ? Ss Aug25 0:00
> > postgres: admin regression [local] ALTER TABLE waiting
> > admin 3243658 0.0 0.0 203740 21532 ? Ss Aug25 0:00
> > postgres: admin regression [local] ANALYZE waiting
> > admin 3243798 0.0 0.0 199884 8464 ? Ss Aug25 0:00
> > postgres: autovacuum worker
> > admin 3244733 0.0 0.0 199492 5956 ? Ss Aug25 0:00
> > postgres: autovacuum worker
> > admin 3245652 0.0 0.0 199884 8468 ? Ss Aug25 0:00
> > postgres: autovacuum worker
> >
> > As you can see there are a bunch of backends presumably waiting, and
> > also the VACUUM process has been pegging a single CPU core for at
> > least since that ~16 hour ago mark.
> >
> > I hope to be able to do more investigation later, but I wanted to at
> > least give you this information now.
>
> Thanks a lot for testing the patch!
> I really appreciate your cooperation.
>
> Hmm, I also tested on the current HEAD(165d581f146b09) again on Ubuntu
> 22.04 and macOS, but unfortunately(fortunately?) they succeeded as
> below:
>
> ```
> $ git apply v29-0001-Add-function-to-log-the-plan-of-the-query.patch
> $ git apply v28-0002-Testing-attempt-logging-plan-on-ever-CFI-call.patch
> $ ./configure --enable-debug --enable-cassert
> $ make
> $ make check
>
> ...(snip)...
>
> # parallel group (5 tests): index_including index_including_gist
> create_view create_index create_index_spgist
> ok 66 + create_index 25033 ms
> ok 67 + create_index_spgist 26144 ms
> ok 68 + create_view 3061 ms
> ok 69 + index_including 976 ms
> ok 70 + index_including_gist 2998 ms
> # parallel group (16 tests): create_cast errors create_aggregate
> roleattributes drop_if_exists hash_func typed_table
> create_am select constraints updatable_views inherit triggers vacuum
> create_function_sql infinite_recurse
> ok 71 + create_aggregate 225 ms
> ok 72 + create_function_sql 18874 ms
> ok 73 + create_cast 168 ms
>
> ...(snip)...
>
> # All 215 tests passed.
> ```
>
> If you notice any difference, I would be grateful if you could let me
> know.

I've never been able to reproduce it (haven't tested the new version,
but v28 at least) on my M1 Mac; where I've reproduced it is on Debian
(first buster and now bullseye).

I'm attaching several stack traces in the hope that they provide some
clues. These all match the ps output I sent earlier, though note that
the output shows both the regress instance and my test instance
(pid 3213249) running (on different ports, of course; they are from
the exact same compilation run). I've attached ps output for the
postgres processes under the make check process to simplify cross
referencing.

A few interesting things:
- There's definitely a lock on a relation that seems to be what's
blocking the processes.
- When I try to connect with psql the process forks but then hangs
(see the ps output with task names stuck in "authentication"). I've
also included a trace from one of these.

If you think a core file for any of these processes would be helpful
for debugging I'd be happy to try to figure out a way to get that to
you.

Regards,
James Coleman

Attachment Content-Type Size
ps.txt text/plain 2.0 KB
delete.trace application/octet-stream 5.7 KB
altertable.trace application/octet-stream 6.3 KB
analyze.trace application/octet-stream 5.0 KB
postmaster.trace application/octet-stream 1.1 KB
new_psql.trace application/octet-stream 3.6 KB
vacuum.trace application/octet-stream 4.3 KB
select3.trace application/octet-stream 6.4 KB
select1.trace application/octet-stream 19.2 KB
select2.trace application/octet-stream 5.3 KB

From: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
To: James Coleman <jtc331(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Greg Stark <stark(at)mit(dot)edu>, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>, david(dot)christensen(at)crunchydata(dot)com, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Subject: Re: RFC: Logging plan of the running query
Date: 2023-09-05 13:58:59
Message-ID: 4af7712833d4ce989cf5e658106e42da@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2023-08-28 22:47, James Coleman wrote:
> On Mon, Aug 28, 2023 at 3:01 AM torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
> wrote:
>>
>> On 2023-08-26 21:03, James Coleman wrote:
>> > On Fri, Aug 25, 2023 at 7:43 AM James Coleman <jtc331(at)gmail(dot)com> wrote:
>> >>
>> >> On Thu, Aug 17, 2023 at 10:02 AM torikoshia
>> >> <torikoshia(at)oss(dot)nttdata(dot)com> wrote:
>> >> >
>> >> > On 2023-06-16 01:34, James Coleman wrote:
>> >> > > Attached is v28
>> >> > > which sets ProcessLogQueryPlanInterruptActive to false in errfinish
>> >> > > when necessary. Once built with those two patches I'm simply running
>> >> > > `make check`.
>> >> >
>> >> > With v28-0001 and v28-0002 patch, I confirmed backend processes consume
>> >> > huge
>> >> > amount of memory and under some environments they were terminated by OOM
>> >> > killer.
>> >> >
>> >> > This was because memory was allocated from existing memory contexts and
>> >> > they
>> >> > were not freed after ProcessLogQueryPlanInterrupt().
>> >> > Updated the patch to use dedicated memory context for
>> >> > ProcessLogQueryPlanInterrupt().
>> >> >
>> >> > Applying attached patch and v28-0002 patch, `make check` successfully
>> >> > completed after 20min and 50GB of logs on my environment.
>> >> >
>> >> > >>> On 2023-06-15 01:48, James Coleman wrote:
>> >> > >>> > The tests have been running since last night, but have been apparently
>> >> > >>> > hung now for many hours.
>> >> >
>> >> > I don't know if this has anything to do with the hung you faced, but I
>> >> > thought
>> >> > it might be possible that the large amount of memory usage resulted in
>> >> > swapping, which caused a significant delay in processing.
>> >>
>> >> Ah, yes, I think that could be a possible explanation. I was delaying
>> >> on this thread because I wasn't comfortable with having caused an
>> >> issue once (even if I couldn't easily reproduce) without at least some
>> >> theory as to the cause (and a fix).
>> >>
>> >> > If possible, I would be very grateful if you could try to reproduce this
>> >> > with
>> >> > the v29 patch.
>> >>
>> >> I'll kick off some testing.
>> >>
>> >
>> > I don't have time to investigate what's happening here, but 24 hours
>> > later the first "make check" is still running, and at first glance it
>> > seems to have the same behavior I'd seen that first time. The test
>> > output is to this point:
>> >
>> > # parallel group (5 tests): index_including create_view
>> > index_including_gist create_index create_index_spgist
>> > ok 66 + create_index 26365 ms
>> > ok 67 + create_index_spgist 27675 ms
>> > ok 68 + create_view 1235 ms
>> > ok 69 + index_including 1102 ms
>> > ok 70 + index_including_gist 1633 ms
>> > # parallel group (16 tests): create_aggregate create_cast errors
>> > roleattributes drop_if_exists hash_func typed_table create_am
>> > infinite_recurse
>> >
>> > and it hasn't progressed past that point since at least ~16 hours ago
>> > (the first several hours of the run I wasn't monitoring it).
>> >
>> > I haven't connected up gdb yet, and won't be able to until maybe
>> > tomorrow, but here's the ps output for postgres processes that are
>> > running:
>> >
>> > admin 3213249 0.0 0.0 196824 20552 ? Ss Aug25 0:00
>> > /home/admin/postgresql-test/bin/postgres -D
>> > /home/admin/postgresql-test-data
>> > admin 3213250 0.0 0.0 196964 7284 ? Ss Aug25 0:00
>> > postgres: checkpointer
>> > admin 3213251 0.0 0.0 196956 4276 ? Ss Aug25 0:00
>> > postgres: background writer
>> > admin 3213253 0.0 0.0 196956 8600 ? Ss Aug25 0:00
>> > postgres: walwriter
>> > admin 3213254 0.0 0.0 198424 7316 ? Ss Aug25 0:00
>> > postgres: autovacuum launcher
>> > admin 3213255 0.0 0.0 198412 5488 ? Ss Aug25 0:00
>> > postgres: logical replication launcher
>> > admin 3237967 0.0 0.0 2484 516 pts/4 S+ Aug25 0:00
>> > /bin/sh -c echo "# +++ regress check in src/test/regress +++" &&
>> > PATH="/home/admin/postgres/tmp_install/home/admin/postgresql-test/bin:/home/admin/postgres/src/test/regress:$PATH"
>> > LD_LIBRARY_PATH="/home/admin/postgres/tmp_install/home/admin/postgresql-test/lib"
>> > INITDB_TEMPLATE='/home/admin/postgres'/tmp_install/initdb-template
>> > ../../../src/test/regress/pg_regress --temp-instance=./tmp_check
>> > --inputdir=. --bindir= --dlpath=. --max-concurrent-tests=20
>> > --schedule=./parallel_schedule
>> > admin 3237973 0.0 0.0 197880 20688 pts/4 S+ Aug25 0:00
>> > postgres -D /home/admin/postgres/src/test/regress/tmp_check/data -F -c
>> > listen_addresses= -k /tmp/pg_regress-7mmGUa
>> > admin 3237976 0.0 0.1 198332 44608 ? Ss Aug25 0:00
>> > postgres: checkpointer
>> > admin 3237977 0.0 0.0 198032 4640 ? Ss Aug25 0:00
>> > postgres: background writer
>> > admin 3237979 0.0 0.0 197880 8580 ? Ss Aug25 0:00
>> > postgres: walwriter
>> > admin 3237980 0.0 0.0 199484 7608 ? Ss Aug25 0:00
>> > postgres: autovacuum launcher
>> > admin 3237981 0.0 0.0 199460 5488 ? Ss Aug25 0:00
>> > postgres: logical replication launcher
>> > admin 3243644 0.0 0.2 252400 74656 ? Ss Aug25 0:01
>> > postgres: admin regression [local] SELECT waiting
>> > admin 3243645 0.0 0.1 205480 33992 ? Ss Aug25 0:00
>> > postgres: admin regression [local] SELECT waiting
>> > admin 3243654 99.9 0.0 203156 31504 ? Rs Aug25 1437:49
>> > postgres: admin regression [local] VACUUM
>> > admin 3243655 0.0 0.1 212036 38504 ? Ss Aug25 0:00
>> > postgres: admin regression [local] SELECT waiting
>> > admin 3243656 0.0 0.0 206024 30892 ? Ss Aug25 0:00
>> > postgres: admin regression [local] DELETE waiting
>> > admin 3243657 0.0 0.1 205568 32232 ? Ss Aug25 0:00
>> > postgres: admin regression [local] ALTER TABLE waiting
>> > admin 3243658 0.0 0.0 203740 21532 ? Ss Aug25 0:00
>> > postgres: admin regression [local] ANALYZE waiting
>> > admin 3243798 0.0 0.0 199884 8464 ? Ss Aug25 0:00
>> > postgres: autovacuum worker
>> > admin 3244733 0.0 0.0 199492 5956 ? Ss Aug25 0:00
>> > postgres: autovacuum worker
>> > admin 3245652 0.0 0.0 199884 8468 ? Ss Aug25 0:00
>> > postgres: autovacuum worker
>> >
>> > As you can see there are a bunch of backends presumably waiting, and
>> > also the VACUUM process has been pegging a single CPU core for at
>> > least since that ~16 hour ago mark.
>> >
>> > I hope to be able to do more investigation later, but I wanted to at
>> > least give you this information now.
>>
>> Thanks a lot for testing the patch!
>> I really appreciate your cooperation.
>>
>> Hmm, I also tested on the current HEAD(165d581f146b09) again on Ubuntu
>> 22.04 and macOS, but unfortunately(fortunately?) they succeeded as
>> below:
>>
>> ```
>> $ git apply v29-0001-Add-function-to-log-the-plan-of-the-query.patch
>> $ git apply
>> v28-0002-Testing-attempt-logging-plan-on-ever-CFI-call.patch
>> $ ./configure --enable-debug --enable-cassert
>> $ make
>> $ make check
>>
>> ...(snip)...
>>
>> # parallel group (5 tests): index_including index_including_gist
>> create_view create_index create_index_spgist
>> ok 66 + create_index 25033 ms
>> ok 67 + create_index_spgist 26144 ms
>> ok 68 + create_view 3061 ms
>> ok 69 + index_including 976 ms
>> ok 70 + index_including_gist 2998 ms
>> # parallel group (16 tests): create_cast errors create_aggregate
>> roleattributes drop_if_exists hash_func typed_table
>> create_am select constraints updatable_views inherit triggers vacuum
>> create_function_sql infinite_recurse
>> ok 71 + create_aggregate 225 ms
>> ok 72 + create_function_sql 18874 ms
>> ok 73 + create_cast 168 ms
>>
>> ...(snip)...
>>
>> # All 215 tests passed.
>> ```
>>
>> If you notice any difference, I would be grateful if you could let me
>> know.
>
> I've never been able to reproduce it (haven't tested the new version,
> but v28 at least) on my M1 Mac; where I've reproduced it is on Debian
> (first buster and now bullseye).
>
> I'm attaching several stacktraces in the hope that they provide some
> clues. These all match the ps output I sent earlier, though note in
> that output there is both the regress instance and my test instance
> (pid 3213249) running (different ports, of course, and they are from
> the exact same compilation run). I've attached ps output for the
> postgres processes under the make check process to simplify cross
> referencing.
>
> A few interesting things:
> - There's definitely a lock on a relation that seems to be what's
> blocking the processes.
> - When I try to connect with psql the process forks but then hangs
> (see the ps output with task names stuck in "authentication"). I've
> also included a trace from one of these.

Thanks for sharing them!

Many processes are waiting to acquire the LW lock, including the process
trying to output the plan (select1.trace).

I suspect that this is due to a lock that was acquired prior to being
interrupted by ProcessLogQueryPlanInterrupt(), but I have not been able
to reproduce the same situation.

--
Regards,

--
Atsushi Torikoshi
NTT DATA Group Corporation


From: James Coleman <jtc331(at)gmail(dot)com>
To: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Greg Stark <stark(at)mit(dot)edu>, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>, david(dot)christensen(at)crunchydata(dot)com, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Subject: Re: RFC: Logging plan of the running query
Date: 2023-09-06 02:17:08
Message-ID: CAAaqYe_hyWLs-ObFQfu54yA-ZOjNCd-L_Z=1TnYjYNqrLuGiXQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Sep 5, 2023 at 9:59 AM torikoshia <torikoshia(at)oss(dot)nttdata(dot)com> wrote:
>
> On 2023-08-28 22:47, James Coleman wrote:
> > On Mon, Aug 28, 2023 at 3:01 AM torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
> > wrote:
> >>
> >> On 2023-08-26 21:03, James Coleman wrote:
> >> > On Fri, Aug 25, 2023 at 7:43 AM James Coleman <jtc331(at)gmail(dot)com> wrote:
> >> >>
> >> >> On Thu, Aug 17, 2023 at 10:02 AM torikoshia
> >> >> <torikoshia(at)oss(dot)nttdata(dot)com> wrote:
> >> >> >
> >> >> > On 2023-06-16 01:34, James Coleman wrote:
> >> >> > > Attached is v28
> >> >> > > which sets ProcessLogQueryPlanInterruptActive to false in errfinish
> >> >> > > when necessary. Once built with those two patches I'm simply running
> >> >> > > `make check`.
> >> >> >
> >> >> > With v28-0001 and v28-0002 patch, I confirmed backend processes consume
> >> >> > huge
> >> >> > amount of memory and under some environments they were terminated by OOM
> >> >> > killer.
> >> >> >
> >> >> > This was because memory was allocated from existing memory contexts and
> >> >> > they
> >> >> > were not freed after ProcessLogQueryPlanInterrupt().
> >> >> > Updated the patch to use dedicated memory context for
> >> >> > ProcessLogQueryPlanInterrupt().
> >> >> >
> >> >> > Applying attached patch and v28-0002 patch, `make check` successfully
> >> >> > completed after 20min and 50GB of logs on my environment.
> >> >> >
> >> >> > >>> On 2023-06-15 01:48, James Coleman wrote:
> >> >> > >>> > The tests have been running since last night, but have been apparently
> >> >> > >>> > hung now for many hours.
> >> >> >
> >> >> > I don't know if this has anything to do with the hung you faced, but I
> >> >> > thought
> >> >> > it might be possible that the large amount of memory usage resulted in
> >> >> > swapping, which caused a significant delay in processing.
> >> >>
> >> >> Ah, yes, I think that could be a possible explanation. I was delaying
> >> >> on this thread because I wasn't comfortable with having caused an
> >> >> issue once (even if I couldn't easily reproduce) without at least some
> >> >> theory as to the cause (and a fix).
> >> >>
> >> >> > If possible, I would be very grateful if you could try to reproduce this
> >> >> > with
> >> >> > the v29 patch.
> >> >>
> >> >> I'll kick off some testing.
> >> >>
> >> >
> >> > I don't have time to investigate what's happening here, but 24 hours
> >> > later the first "make check" is still running, and at first glance it
> >> > seems to have the same behavior I'd seen that first time. The test
> >> > output is to this point:
> >> >
> >> > # parallel group (5 tests): index_including create_view
> >> > index_including_gist create_index create_index_spgist
> >> > ok 66 + create_index 26365 ms
> >> > ok 67 + create_index_spgist 27675 ms
> >> > ok 68 + create_view 1235 ms
> >> > ok 69 + index_including 1102 ms
> >> > ok 70 + index_including_gist 1633 ms
> >> > # parallel group (16 tests): create_aggregate create_cast errors
> >> > roleattributes drop_if_exists hash_func typed_table create_am
> >> > infinite_recurse
> >> >
> >> > and it hasn't progressed past that point since at least ~16 hours ago
> >> > (the first several hours of the run I wasn't monitoring it).
> >> >
> >> > I haven't connected up gdb yet, and won't be able to until maybe
> >> > tomorrow, but here's the ps output for postgres processes that are
> >> > running:
> >> >
> >> > admin 3213249 0.0 0.0 196824 20552 ? Ss Aug25 0:00
> >> > /home/admin/postgresql-test/bin/postgres -D
> >> > /home/admin/postgresql-test-data
> >> > admin 3213250 0.0 0.0 196964 7284 ? Ss Aug25 0:00
> >> > postgres: checkpointer
> >> > admin 3213251 0.0 0.0 196956 4276 ? Ss Aug25 0:00
> >> > postgres: background writer
> >> > admin 3213253 0.0 0.0 196956 8600 ? Ss Aug25 0:00
> >> > postgres: walwriter
> >> > admin 3213254 0.0 0.0 198424 7316 ? Ss Aug25 0:00
> >> > postgres: autovacuum launcher
> >> > admin 3213255 0.0 0.0 198412 5488 ? Ss Aug25 0:00
> >> > postgres: logical replication launcher
> >> > admin 3237967 0.0 0.0 2484 516 pts/4 S+ Aug25 0:00
> >> > /bin/sh -c echo "# +++ regress check in src/test/regress +++" &&
> >> > PATH="/home/admin/postgres/tmp_install/home/admin/postgresql-test/bin:/home/admin/postgres/src/test/regress:$PATH"
> >> > LD_LIBRARY_PATH="/home/admin/postgres/tmp_install/home/admin/postgresql-test/lib"
> >> > INITDB_TEMPLATE='/home/admin/postgres'/tmp_install/initdb-template
> >> > ../../../src/test/regress/pg_regress --temp-instance=./tmp_check
> >> > --inputdir=. --bindir= --dlpath=. --max-concurrent-tests=20
> >> > --schedule=./parallel_schedule
> >> > admin 3237973 0.0 0.0 197880 20688 pts/4 S+ Aug25 0:00
> >> > postgres -D /home/admin/postgres/src/test/regress/tmp_check/data -F -c
> >> > listen_addresses= -k /tmp/pg_regress-7mmGUa
> >> > admin 3237976 0.0 0.1 198332 44608 ? Ss Aug25 0:00
> >> > postgres: checkpointer
> >> > admin 3237977 0.0 0.0 198032 4640 ? Ss Aug25 0:00
> >> > postgres: background writer
> >> > admin 3237979 0.0 0.0 197880 8580 ? Ss Aug25 0:00
> >> > postgres: walwriter
> >> > admin 3237980 0.0 0.0 199484 7608 ? Ss Aug25 0:00
> >> > postgres: autovacuum launcher
> >> > admin 3237981 0.0 0.0 199460 5488 ? Ss Aug25 0:00
> >> > postgres: logical replication launcher
> >> > admin 3243644 0.0 0.2 252400 74656 ? Ss Aug25 0:01
> >> > postgres: admin regression [local] SELECT waiting
> >> > admin 3243645 0.0 0.1 205480 33992 ? Ss Aug25 0:00
> >> > postgres: admin regression [local] SELECT waiting
> >> > admin 3243654 99.9 0.0 203156 31504 ? Rs Aug25 1437:49
> >> > postgres: admin regression [local] VACUUM
> >> > admin 3243655 0.0 0.1 212036 38504 ? Ss Aug25 0:00
> >> > postgres: admin regression [local] SELECT waiting
> >> > admin 3243656 0.0 0.0 206024 30892 ? Ss Aug25 0:00
> >> > postgres: admin regression [local] DELETE waiting
> >> > admin 3243657 0.0 0.1 205568 32232 ? Ss Aug25 0:00
> >> > postgres: admin regression [local] ALTER TABLE waiting
> >> > admin 3243658 0.0 0.0 203740 21532 ? Ss Aug25 0:00
> >> > postgres: admin regression [local] ANALYZE waiting
> >> > admin 3243798 0.0 0.0 199884 8464 ? Ss Aug25 0:00
> >> > postgres: autovacuum worker
> >> > admin 3244733 0.0 0.0 199492 5956 ? Ss Aug25 0:00
> >> > postgres: autovacuum worker
> >> > admin 3245652 0.0 0.0 199884 8468 ? Ss Aug25 0:00
> >> > postgres: autovacuum worker
> >> >
> >> > As you can see there are a bunch of backends presumably waiting, and
> >> > also the VACUUM process has been pegging a single CPU core for at
> >> > least since that ~16 hour ago mark.
> >> >
> >> > I hope to be able to do more investigation later, but I wanted to at
> >> > least give you this information now.
> >>
> >> Thanks a lot for testing the patch!
> >> I really appreciate your cooperation.
> >>
> >> Hmm, I also tested on the current HEAD(165d581f146b09) again on Ubuntu
> >> 22.04 and macOS, but unfortunately(fortunately?) they succeeded as
> >> below:
> >>
> >> ```
> >> $ git apply v29-0001-Add-function-to-log-the-plan-of-the-query.patch
> >> $ git apply
> >> v28-0002-Testing-attempt-logging-plan-on-ever-CFI-call.patch
> >> $ ./configure --enable-debug --enable-cassert
> >> $ make
> >> $ make check
> >>
> >> ...(snip)...
> >>
> >> # parallel group (5 tests): index_including index_including_gist
> >> create_view create_index create_index_spgist
> >> ok 66 + create_index 25033 ms
> >> ok 67 + create_index_spgist 26144 ms
> >> ok 68 + create_view 3061 ms
> >> ok 69 + index_including 976 ms
> >> ok 70 + index_including_gist 2998 ms
> >> # parallel group (16 tests): create_cast errors create_aggregate
> >> roleattributes drop_if_exists hash_func typed_table
> >> create_am select constraints updatable_views inherit triggers vacuum
> >> create_function_sql infinite_recurse
> >> ok 71 + create_aggregate 225 ms
> >> ok 72 + create_function_sql 18874 ms
> >> ok 73 + create_cast 168 ms
> >>
> >> ...(snip)...
> >>
> >> # All 215 tests passed.
> >> ```
> >>
> >> If you notice any difference, I would be grateful if you could let me
> >> know.
> >
> > I've never been able to reproduce it (haven't tested the new version,
> > but v28 at least) on my M1 Mac; where I've reproduced it is on Debian
> > (first buster and now bullseye).
> >
> > I'm attaching several stacktraces in the hope that they provide some
> > clues. These all match the ps output I sent earlier, though note in
> > that output there is both the regress instance and my test instance
> > (pid 3213249) running (different ports, of course, and they are from
> > the exact same compilation run). I've attached ps output for the
> > postgres processes under the make check process to simplify cross
> > referencing.
> >
> > A few interesting things:
> > - There's definitely a lock on a relation that seems to be what's
> > blocking the processes.
> > - When I try to connect with psql the process forks but then hangs
> > (see the ps output with task names stuck in "authentication"). I've
> > also included a trace from one of these.
>
> Thanks for sharing them!
>
> Many processes are waiting to acquire the LW lock, including the process
> trying to output the plan(select1.trace).
>
> I suspect that this is due to a lock that was acquired prior to being
> interrupted by ProcessLogQueryPlanInterrupt(), but have not been able to
> reproduce the same situation..
>

I don't have time immediately to dig into this, but perhaps loading up
the core dumps would allow us to see what query is running in each
backend process (if it hasn't already been discarded by that point)
and thereby determine what point in each test process led to the error
condition?

Alternatively we might be able to apply the same trick to the test
client instead...

BTW, for my own easy reference in this thread: relid 1259 is pg_class
if I'm not mistaken.

Thoughts?
James Coleman


From: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
To: James Coleman <jtc331(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Greg Stark <stark(at)mit(dot)edu>, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>, david(dot)christensen(at)crunchydata(dot)com, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Subject: Re: RFC: Logging plan of the running query
Date: 2023-09-07 06:09:12
Message-ID: cbe6eb783affb3e9b5068f5cf019ee62@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2023-09-06 11:17, James Coleman wrote:

>> > I've never been able to reproduce it (haven't tested the new version,
>> > but v28 at least) on my M1 Mac; where I've reproduced it is on Debian
>> > (first buster and now bullseye).
>> >
>> > I'm attaching several stacktraces in the hope that they provide some
>> > clues. These all match the ps output I sent earlier, though note in
>> > that output there is both the regress instance and my test instance
>> > (pid 3213249) running (different ports, of course, and they are from
>> > the exact same compilation run). I've attached ps output for the
>> > postgres processes under the make check process to simplify cross
>> > referencing.
>> >
>> > A few interesting things:
>> > - There's definitely a lock on a relation that seems to be what's
>> > blocking the processes.
>> > - When I try to connect with psql the process forks but then hangs
>> > (see the ps output with task names stuck in "authentication"). I've
>> > also included a trace from one of these.
>>
>> Thanks for sharing them!
>>
>> Many processes are waiting to acquire the LW lock, including the
>> process
>> trying to output the plan(select1.trace).
>>
>> I suspect that this is due to a lock that was acquired prior to being
>> interrupted by ProcessLogQueryPlanInterrupt(), but have not been able
>> to
>> reproduce the same situation..
>>
>
> I don't have time immediately to dig into this, but perhaps loading up
> the core dumps would allow us to see what query is running in each
> backend process (if it hasn't already been discarded by that point)
> and thereby determine what point in each test process led to the error
> condition?

Thanks for the suggestion.
I am concerned that core dumps may not be readable on different
operating systems or other environments. (Unfortunately, I do not have
Debian on hand.)

It seems we can tell what queries were running from the stack traces
you shared.
As described above, I suspect a lock which was acquired prior to
ProcessLogQueryPlanInterrupt() caused the issue.
I will try a little more to see if I can devise a way to create the
same situation.

> Alternatively we might be able to apply the same trick to the test
> client instead...
>
> BTW, for my own easy reference in this thread: relid 1259 is pg_class
> if I'm not mistaken.

Yeah, and I think it's suspicious that the lock on relation 1259
appears twice in the trace: the interrupt handler, entered from inside
lock acquisition, ends up trying to take the same lock again, which
must be avoided.

#10 0x0000559d61d8ee6e in LockRelationOid (relid=1259, lockmode=1) at
lmgr.c:117
..
#49 0x0000559d61b4989d in ProcessLogQueryPlanInterrupt () at
explain.c:5158
..
#53 0x0000559d61d8ee6e in LockRelationOid (relid=1259, lockmode=1) at
lmgr.c:117
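
To make the suspected failure mode concrete, here is a minimal,
self-contained C analogy (plain pthreads, not PostgreSQL internals; all
names are stand-ins chosen for illustration). The hazard is that the
interrupt is processed synchronously at a CHECK_FOR_INTERRUPTS() which
may sit inside lock acquisition, and the handler then needs the very
lock being acquired:

```c
/* Analogy only: a non-recursive lock re-acquired from interrupt
 * processing that runs while the lock is already held. On typical
 * implementations the second lock call blocks forever. */
#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t catalog_lock = PTHREAD_MUTEX_INITIALIZER;
static int log_plan_pending = 1; /* stands in for the pending-interrupt flag */

static void process_log_plan_interrupt(void)
{
    /* like the explain code: needs catalog access, hence the same lock */
    pthread_mutex_lock(&catalog_lock);  /* self-deadlock here */
    puts("logging plan...");
    pthread_mutex_unlock(&catalog_lock);
}

static void check_for_interrupts(void)
{
    if (log_plan_pending)
    {
        log_plan_pending = 0;
        process_log_plan_interrupt();
    }
}

int main(void)
{
    pthread_mutex_lock(&catalog_lock); /* the outer "LockRelationOid(1259)" */
    check_for_interrupts();            /* reached while the lock is held */
    pthread_mutex_unlock(&catalog_lock);
    puts("never reached");
    return 0;
}
```

In the real trace the outer acquisition is itself blocked, so every
later requester queues behind it, which would match the pile of
"waiting" backends in the ps output.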

--
Regards,

--
Atsushi Torikoshi
NTT DATA Group Corporation


From: "Lepikhov Andrei" <a(dot)lepikhov(at)postgrespro(dot)ru>
To: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>, "James Coleman" <jtc331(at)gmail(dot)com>
Cc: "'Andres Freund'" <andres(at)anarazel(dot)de>, pgsql-hackers(at)lists(dot)postgresql(dot)org, "Greg Stark" <stark(at)mit(dot)edu>, "Ronan Dunklau" <ronan(dot)dunklau(at)aiven(dot)io>, david(dot)christensen(at)crunchydata(dot)com, "Heikki Linnakangas" <hlinnaka(at)iki(dot)fi>
Subject: Re: RFC: Logging plan of the running query
Date: 2023-09-15 06:21:27
Message-ID: b1b110ae-61f6-4fd9-9b94-f967db9b53d4@app.fastmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Sep 7, 2023, at 1:09 PM, torikoshia wrote:
> On 2023-09-06 11:17, James Coleman wrote:
> It seems that we can know what queries were running from the stack
> traces you shared.
> As described above, I suspect a lock which was acquired prior to
> ProcessLogQueryPlanInterrupt() caused the issue.
> I will try a little more to see if I can devise a way to create the same
> situation.
Hi,
I have explored this patch and, by and large, agree with the code. But there are some questions I want to discuss:
1. Maybe add a hook to give extensions, like pg_query_state, a chance to do their job?
2. In this implementation, ProcessInterrupts does a lot of work while building the explain output: memory allocations, a pass over the plan tree, system-table locks, syscache access, etc. I guess this can change the semantics of a 'safe place' where CHECK_FOR_INTERRUPTS can be called - I can imagine a SELECT query which calls a stored procedure, which executes some DDL and acquires a row-exclusive lock at the time of the interruption. So, in my mind, it is too unpredictable to build the explain output right at the point of interrupt processing. Maybe it is better to think about some hook at the end of ExecProcNode, where a pending explain could be created?
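
A rough sketch of that alternative against the PostgreSQL tree (the
flag and helper names below are hypothetical, not from the posted
patch): the interrupt path merely sets a flag, and the plan is emitted
at the next node boundary, a known-safe point:

```c
/* Hypothetical sketch: LogQueryPlanPending, LogRunningQueryPlan and
 * ExecProcNodeWithLogging are illustrative names only. */
static volatile bool LogQueryPlanPending = false;

/* interrupt path: just remember the request; no allocations, locks,
 * or syscache access here */
void
ProcessLogQueryPlanInterrupt(void)
{
    LogQueryPlanPending = true;
}

/* executor side, in the spirit of the existing ExecProcNode wrappers */
static TupleTableSlot *
ExecProcNodeWithLogging(PlanState *node)
{
    TupleTableSlot *slot = node->ExecProcNodeReal(node);

    if (unlikely(LogQueryPlanPending))
    {
        LogQueryPlanPending = false;
        LogRunningQueryPlan(node->state);  /* EState -> plan; hypothetical */
    }
    return slot;
}
```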

Also, I suggest minor code change with the diff in attachment.

--
Regards,
Andrei Lepikhov

Attachment Content-Type Size
improve.diff application/octet-stream 772 bytes

From: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
To: Lepikhov Andrei <a(dot)lepikhov(at)postgrespro(dot)ru>
Cc: James Coleman <jtc331(at)gmail(dot)com>, 'Andres Freund' <andres(at)anarazel(dot)de>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Greg Stark <stark(at)mit(dot)edu>, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>, david(dot)christensen(at)crunchydata(dot)com, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Subject: Re: RFC: Logging plan of the running query
Date: 2023-09-19 13:39:59
Message-ID: 8068eb19f732169a1a79fdeadf104adf@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2023-09-15 15:21, Lepikhov Andrei wrote:
> On Thu, Sep 7, 2023, at 1:09 PM, torikoshia wrote:
>> On 2023-09-06 11:17, James Coleman wrote:
>> It seems that we can know what queries were running from the stack
>> traces you shared.
>> As described above, I suspect a lock which was acquired prior to
>> ProcessLogQueryPlanInterrupt() caused the issue.
>> I will try a little more to see if I can devise a way to create the
>> same
>> situation.
> Hi,
> I have explored this patch and, by and large, agree with the code. But
> some questions I want to discuss:
> 1. Maybe add a hook to give a chance for extensions, like
> pg_query_state, to do their job?

Do you imagine adding a hook which enables adding custom interrupt
codes, like below?

https://github.com/postgrespro/pg_query_state/blob/master/patches/custom_signals_15.0.patch

If so, that would be possible, but this patch doesn't require that
functionality, and I feel it'd be better done in an independent patch.

> 2. In this implementation, ProcessInterrupts does a lot of work during
> the explain creation: memory allocations, pass through the plan,
> systables locks, syscache access, etc. I guess it can change the
> semantic meaning of 'safe place' where CHECK_FOR_INTERRUPTS can be
> called - I can imagine a SELECT query, which would call a stored
> procedure, which executes some DDL and acquires row exclusive lock at
> the time of interruption. So, in my mind, it is too unpredictable to
> make the explain in the place of interruption processing. Maybe it is
> better to think about some hook at the end of ExecProcNode, where a
> pending explain could be created?

Yeah, I withdrew this patch once for that reason, but we are resuming
development in response to the results of a discussion by James and
others at this year's PGCon about the safety of doing this work in CFI:

/message-id/CAAaqYe9euUZD8bkjXTVcD9e4n5c7kzHzcvuCJXt-xds9X4c7Fw%40mail.gmail.com

BTW, it seems pg_query_state also enables users to get running query
plans via CFI.
Does pg_query_state do anything about this safety concern?

> Also, I suggest minor code change with the diff in attachment.

Thanks!

This might be a biased opinion and objections are welcome, but I feel
the original patch is easier to read since each PG_RETURN_BOOL(true/false)
is located close to its corresponding case.
Also, the existing function pg_log_backend_memory_contexts(), which does
the same kind of thing, has the same form as the original patch.

--
Regards,

--
Atsushi Torikoshi
NTT DATA Group Corporation


From: "Lepikhov Andrei" <a(dot)lepikhov(at)postgrespro(dot)ru>
To: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
Cc: "James Coleman" <jtc331(at)gmail(dot)com>, "'Andres Freund'" <andres(at)anarazel(dot)de>, pgsql-hackers(at)lists(dot)postgresql(dot)org, "Greg Stark" <stark(at)mit(dot)edu>, "Ronan Dunklau" <ronan(dot)dunklau(at)aiven(dot)io>, david(dot)christensen(at)crunchydata(dot)com, "Heikki Linnakangas" <hlinnaka(at)iki(dot)fi>
Subject: Re: RFC: Logging plan of the running query
Date: 2023-09-20 05:39:13
Message-ID: 43bc4f5a-2fdc-46c8-ba4f-77ec5d7485e5@app.fastmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Sep 19, 2023, at 8:39 PM, torikoshia wrote:
> On 2023-09-15 15:21, Lepikhov Andrei wrote:
>> On Thu, Sep 7, 2023, at 1:09 PM, torikoshia wrote:
>> I have explored this patch and, by and large, agree with the code. But
>> some questions I want to discuss:
>> 1. Maybe add a hook to give a chance for extensions, like
>> pg_query_state, to do their job?
>
> Do you imagine adding a hook which enables adding custom interrupt codes
> like below?
>
> https://github.com/postgrespro/pg_query_state/blob/master/patches/custom_signals_15.0.patch

No, I am thinking of a hook which would allow us to rewrite the pg_query_state extension, without additional core patches, on top of the functionality provided by your patch. I mean, an extension could provide a console UI, not only server logging, and obtain from the target backend as much information in the explain output as the instrumentation level of the current query can give.
It might make pg_query_state shorter and more stable.
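
A hedged sketch of the kind of hook I mean, following the usual
PostgreSQL hook convention (every name below is hypothetical; nothing
like this exists in the posted patch):

```c
/* core side, e.g. in explain.c: a pluggable handler for the
 * "log the plan" interrupt */
typedef void (*LogQueryPlanInterrupt_hook_type) (QueryDesc *queryDesc);
LogQueryPlanInterrupt_hook_type LogQueryPlanInterrupt_hook = NULL;

/* in the interrupt-processing path */
if (LogQueryPlanInterrupt_hook)
    (*LogQueryPlanInterrupt_hook) (ActiveQueryDesc); /* extension takes over */
else
    LogQueryPlanToServerLog(ActiveQueryDesc);        /* default behavior */

/* extension side (e.g. pg_query_state), in _PG_init():
 *     LogQueryPlanInterrupt_hook = send_plan_via_shm_mq;
 * so the plan could go to a shm_mq for a console UI instead of the
 * server log. */
```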

>> 2. In this implementation, ProcessInterrupts does a lot of work during
>> the explain creation: memory allocations, pass through the plan,
>> systables locks, syscache access, etc. I guess it can change the
>> semantic meaning of 'safe place' where CHECK_FOR_INTERRUPTS can be
>> called - I can imagine a SELECT query, which would call a stored
>> procedure, which executes some DDL and acquires row exclusive lock at
>> the time of interruption. So, in my mind, it is too unpredictable to
>> make the explain in the place of interruption processing. Maybe it is
>> better to think about some hook at the end of ExecProcNode, where a
>> pending explain could be created?
>
> Yeah, I withdrew this patch once for that reason, but we are resuming
> development in response to the results of a discussion by James and
> others at this year's pgcon about the safety of this process in CFI:
>
> /message-id/CAAaqYe9euUZD8bkjXTVcD9e4n5c7kzHzcvuCJXt-xds9X4c7Fw%40mail.gmail.com

I can't trace the logic behind the decision made at that conference. But my anxiety relates to a specific situation, where ActiveQueryDesc points to the top-level query and the interruption comes during DDL wrapped up in a stored procedure. For example:
CREATE TABLE test();
CREATE OR REPLACE FUNCTION ddl() RETURNS void AS $$
BEGIN
EXECUTE format('ALTER TABLE test ADD COLUMN x integer;');
...
END; $$ LANGUAGE plpgsql VOLATILE;
SELECT ddl(), ... FROM ...;

> BTW it seems pg_query_state also enables users to get running query
> plans using CFI.
> Does pg_query_state do something for this safety concern?

No, and I'm looking for a solution which could help rewrite pg_query_state as a clean extension, without patches.

>> Also, I suggest minor code change with the diff in attachment.
>
> Thanks!
>
> This might be biased opinion and objections are welcomed, but I feel the
> original patch is easier to read since PG_RETURN_BOOL(true/false) is
> located in near place to each cases.
> Also the existing function pg_log_backend_memory_contexts(), which does
> the same thing, has the same form as the original patch.
I got it, thank you.

--
Regards,
Andrei Lepikhov


From: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
To: Lepikhov Andrei <a(dot)lepikhov(at)postgrespro(dot)ru>
Cc: James Coleman <jtc331(at)gmail(dot)com>, 'Andres Freund' <andres(at)anarazel(dot)de>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Greg Stark <stark(at)mit(dot)edu>, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>, david(dot)christensen(at)crunchydata(dot)com, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Subject: Re: RFC: Logging plan of the running query
Date: 2023-09-25 07:21:31
Message-ID: ff2380c93362868c0894e3c56ed96184@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2023-09-20 14:39, Lepikhov Andrei wrote:
Thanks for your reply.

> On Tue, Sep 19, 2023, at 8:39 PM, torikoshia wrote:
>> On 2023-09-15 15:21, Lepikhov Andrei wrote:
>>> On Thu, Sep 7, 2023, at 1:09 PM, torikoshia wrote:
>>> I have explored this patch and, by and large, agree with the code.
>>> But
>>> some questions I want to discuss:
>>> 1. Maybe add a hook to give a chance for extensions, like
>>> pg_query_state, to do their job?
>>
>> Do you imagine adding a hook which enables adding custom interrupt
>> codes
>> like below?
>>
>> https://github.com/postgrespro/pg_query_state/blob/master/patches/custom_signals_15.0.patch
>
> No, I think around the hook, which would allow us to rewrite the
> pg_query_state extension without additional patches by using the
> functionality provided by your patch. I mean, an extension could
> provide console UI, not only server logging. And obtain from target
> backend so much information in the explain as the instrumentation
> level of the current query can give.
> It may make pg_query_state shorter and more stable.
>
>>> 2. In this implementation, ProcessInterrupts does a lot of work
>>> during
>>> the explain creation: memory allocations, pass through the plan,
>>> systables locks, syscache access, etc. I guess it can change the
>>> semantic meaning of 'safe place' where CHECK_FOR_INTERRUPTS can be
>>> called - I can imagine a SELECT query, which would call a stored
>>> procedure, which executes some DDL and acquires row exclusive lock at
>>> the time of interruption. So, in my mind, it is too unpredictable to
>>> make the explain in the place of interruption processing. Maybe it is
>>> better to think about some hook at the end of ExecProcNode, where a
>>> pending explain could be created?
>>
>> Yeah, I withdrew this patch once for that reason, but we are resuming
>> development in response to the results of a discussion by James and
>> others at this year's pgcon about the safety of this process in CFI:
>>
>> /message-id/CAAaqYe9euUZD8bkjXTVcD9e4n5c7kzHzcvuCJXt-xds9X4c7Fw%40mail.gmail.com
>
> I can't track the logic path of the decision provided at this
> conference. But my anxiety related to specific place, where
> ActiveQueryDesc points top-level query and interruption comes during
> DDL, wrapped up in stored procedure. For example:
> CREATE TABLE test();
> CREATE OR REPLACE FUNCTION ddl() RETURNS void AS $$
> BEGIN
> EXECUTE format('ALTER TABLE test ADD COLUMN x integer;');
> ...
> END; $$ LANGUAGE plpgsql VOLATILE;
> SELECT ddl(), ... FROM ...;

Hmm, as a test, I made sure ProcessLogQueryPlanInterrupt() was called on
every CFI call using
v28-0002-Testing-attempt-logging-plan-on-ever-CFI-call.patch, and then
ran the following query, but it did not cause any problems.

```
=# CREATE TABLE test();
=# CREATE OR REPLACE FUNCTION ddl() RETURNS void AS $$
BEGIN
EXECUTE format('ALTER TABLE test ADD COLUMN x integer;');
PERFORM pg_sleep(5);
END; $$ LANGUAGE plpgsql VOLATILE;
=# SELECT ddl();
```

Is this the case you're worrying about?
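
As a side note, with only the 0001 patch applied (i.e. without the
test patch that fires the interrupt on every CFI), the same window
could be exercised manually from a second session, something like the
following (assuming the SQL-callable function this patch series
provides, pg_log_query_plan()):

```
=# SELECT pid FROM pg_stat_activity WHERE query = 'SELECT ddl();';
=# SELECT pg_log_query_plan(<pid from the previous step>);
-- the plan of the running query should then show up in the server log
```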

--
Regards,

--
Atsushi Torikoshi
NTT DATA Group Corporation


From: Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>
To: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
Cc: James Coleman <jtc331(at)gmail(dot)com>, 'Andres Freund' <andres(at)anarazel(dot)de>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Greg Stark <stark(at)mit(dot)edu>, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>, david(dot)christensen(at)crunchydata(dot)com, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Subject: Re: RFC: Logging plan of the running query
Date: 2023-09-25 09:49:18
Message-ID: c8322c0c-1f75-4e6b-ab91-b1469d1c778e@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 25/9/2023 14:21, torikoshia wrote:
> On 2023-09-20 14:39, Lepikhov Andrei wrote:
> Hmm, as a test, I made sure to call ProcessLogQueryPlanInterrupt() on
> all CFI using
> v28-0002-Testing-attempt-logging-plan-on-ever-CFI-call.patch, and then
> ran the following query but did not cause any problems.
>
> ```
> =# CREATE TABLE test();
> =# CREATE OR REPLACE FUNCTION ddl() RETURNS void AS $$
> BEGIN
>   EXECUTE format('ALTER TABLE test ADD COLUMN x integer;');
>   PERFORM pg_sleep(5);
> END; $$ LANGUAGE plpgsql VOLATILE;
> =# SELECT ddl();
> ```
>
> Is this the case you're worrying about?

I didn't find a problem either. I just feel uncomfortable if, at the
moment of interruption, we have a descriptor for a different query than
the one that has been executing and holding resources.

--
regards,
Andrey Lepikhov
Postgres Professional


From: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
To: Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>
Cc: James Coleman <jtc331(at)gmail(dot)com>, 'Andres Freund' <andres(at)anarazel(dot)de>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Greg Stark <stark(at)mit(dot)edu>, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>, david(dot)christensen(at)crunchydata(dot)com, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Subject: Re: RFC: Logging plan of the running query
Date: 2023-09-28 02:04:09
Message-ID: da0d42ad12efd9cf13ddbe4c04dbb978@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2023-09-25 18:49, Andrey Lepikhov wrote:
> On 25/9/2023 14:21, torikoshia wrote:
>> On 2023-09-20 14:39, Lepikhov Andrei wrote:
>> Hmm, as a test, I made sure to call ProcessLogQueryPlanInterrupt() on
>> all CFI using
>> v28-0002-Testing-attempt-logging-plan-on-ever-CFI-call.patch, and then
>> ran the following query but did not cause any problems.
>>
>> ```
>> =# CREATE TABLE test();
>> =# CREATE OR REPLACE FUNCTION ddl() RETURNS void AS $$
>> BEGIN
>>   EXECUTE format('ALTER TABLE test ADD COLUMN x integer;');
>>   PERFORM pg_sleep(5);
>> END; $$ LANGUAGE plpgsql VOLATILE;
>> =# SELECT ddl();
>> ```
>>
>> Is this the case you're worrying about?
>
> I didn't find a problem either. I just feel uncomfortable if, at the
> moment of interruption, we have a descriptor of another query than the
> query have been executing and holding resources.

I think that "descriptor" here refers to ActiveQueryDesc, in which case
it is updated at the beginning of ExecutorRun(), so I am wondering if
the situation you're worried about would not occur.

---------
@@ -303,10 +306,21 @@ ExecutorRun(QueryDesc *queryDesc,
ScanDirection direction, uint64 count,
bool execute_once)
{
+ /*
+ * Update ActiveQueryDesc here to enable retrieval of the currently
+ * running queryDesc for nested queries.
+ */
+ QueryDesc *save_ActiveQueryDesc;
+
+ save_ActiveQueryDesc = ActiveQueryDesc;
+ ActiveQueryDesc = queryDesc;
---------
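
(For completeness: presumably the saved descriptor is restored once the
run finishes, including on error paths. A sketch of what I would expect,
not the literal patch text:)

```c
PG_TRY();
{
    /* ... existing ExecutorRun body ... */
}
PG_FINALLY();
{
    /* reinstate the caller's descriptor even if the query errored out */
    ActiveQueryDesc = save_ActiveQueryDesc;
}
PG_END_TRY();
```
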
--
Regards,

--
Atsushi Torikoshi
NTT DATA Group Corporation


From: Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>
To: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: RFC: Logging plan of the running query
Date: 2023-09-28 02:32:46
Message-ID: d4d59177-2eba-4d0f-ab99-e9977b19f195@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 28/9/2023 09:04, torikoshia wrote:
> On 2023-09-25 18:49, Andrey Lepikhov wrote:
>> On 25/9/2023 14:21, torikoshia wrote:
>>> On 2023-09-20 14:39, Lepikhov Andrei wrote:
>>> Hmm, as a test, I made sure to call ProcessLogQueryPlanInterrupt() on
>>> all CFI using
>>> v28-0002-Testing-attempt-logging-plan-on-ever-CFI-call.patch, and
>>> then ran the following query but did not cause any problems.
>>>
>>> ```
>>> =# CREATE TABLE test();
>>> =# CREATE OR REPLACE FUNCTION ddl() RETURNS void AS $$
>>> BEGIN
>>>    EXECUTE format('ALTER TABLE test ADD COLUMN x integer;');
>>>    PERFORM pg_sleep(5);
>>> END; $$ LANGUAGE plpgsql VOLATILE;
>>> =# SELECT ddl();
>>> ```
>>>
>>> Is this the case you're worrying about?
>>
>> I didn't find a problem either. I just feel uncomfortable if, at the
>> moment of interruption, we have a descriptor of another query than the
>> query have been executing and holding resources.
>
> I think that "descriptor" here refers to ActiveQueryDesc, in which case
> it is updated at the beginning of ExecutorRun(), so I am wondering if
> the situation you're worried about would not occur.
As you can see, in my example there is only DDL and no queries with
plans. In this case postgres doesn't call ExecutorRun() at all, simply
because there is no plan to run, but locks will still be obtained.
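
For instance (an illustration, not output captured from this thread), a
second session could confirm this while ddl() sleeps after its EXECUTE:

```
=# SELECT locktype, relation::regclass AS rel, mode, granted
     FROM pg_locks
    WHERE relation = 'test'::regclass;
-- expect a granted AccessExclusiveLock on "test", even though the DDL
-- never went through ExecutorRun()
```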

--
regards,
Andrey Lepikhov
Postgres Professional


From: James Coleman <jtc331(at)gmail(dot)com>
To: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Greg Stark <stark(at)mit(dot)edu>, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>, david(dot)christensen(at)crunchydata(dot)com, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Subject: Re: RFC: Logging plan of the running query
Date: 2023-10-03 18:00:11
Message-ID: CAAaqYe_p-849MtsVgN9_y-YL-uFzLFMpGw38T3Omb11gw-=K1w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Sep 7, 2023 at 2:09 AM torikoshia <torikoshia(at)oss(dot)nttdata(dot)com> wrote:
>
> On 2023-09-06 11:17, James Coleman wrote:
>
> >> > I've never been able to reproduce it (haven't tested the new version,
> >> > but v28 at least) on my M1 Mac; where I've reproduced it is on Debian
> >> > (first buster and now bullseye).
> >> >
> >> > I'm attaching several stacktraces in the hope that they provide some
> >> > clues. These all match the ps output I sent earlier, though note in
> >> > that output there is both the regress instance and my test instance
> >> > (pid 3213249) running (different ports, of course, and they are from
> >> > the exact same compilation run). I've attached ps output for the
> >> > postgres processes under the make check process to simplify cross
> >> > referencing.
> >> >
> >> > A few interesting things:
> >> > - There's definitely a lock on a relation that seems to be what's
> >> > blocking the processes.
> >> > - When I try to connect with psql the process forks but then hangs
> >> > (see the ps output with task names stuck in "authentication"). I've
> >> > also included a trace from one of these.
> >>
> >> Thanks for sharing them!
> >>
> >> Many processes are waiting to acquire the LW lock, including the
> >> process
> >> trying to output the plan(select1.trace).
> >>
> >> I suspect that this is due to a lock that was acquired prior to being
> >> interrupted by ProcessLogQueryPlanInterrupt(), but have not been able
> >> to
> >> reproduce the same situation..
> >>
> >
> > I don't have time immediately to dig into this, but perhaps loading up
> > the core dumps would allow us to see what query is running in each
> > backend process (if it hasn't already been discarded by that point)
> > and thereby determine what point in each test process led to the error
> > condition?
>
> Thanks for the suggestion.
> I am concerned that core dumps may not be readable on different
> operating systems or other environments. (Unfortunately, I do not have
> Debian on hand)
>
> It seems that we can know what queries were running from the stack
> traces you shared.
> As described above, I suspect a lock which was acquired prior to
> ProcessLogQueryPlanInterrupt() caused the issue.
> I will try a little more to see if I can devise a way to create the same
> situation.
>
> > Alternatively we might be able to apply the same trick to the test
> > client instead...
> >
> > BTW, for my own easy reference in this thread: relid 1259 is pg_class
> > if I'm not mistaken.
>
> Yeah, and I think it's strange that the lock to 1259 appears twice and
> must be avoided.
>
> #10 0x0000559d61d8ee6e in LockRelationOid (relid=1259, lockmode=1) at
> lmgr.c:117
> ..
> #49 0x0000559d61b4989d in ProcessLogQueryPlanInterrupt () at
> explain.c:5158
> ..
> #53 0x0000559d61d8ee6e in LockRelationOid (relid=1259, lockmode=1) at
> lmgr.c:117

I chatted with Andres and David about this at PGConf.NYC, and I think
what we need to do is explicitly disallow running this code any time
we are inside of lock acquisition code.

Regards,
James Coleman


From: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
To: James Coleman <jtc331(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Greg Stark <stark(at)mit(dot)edu>, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>, david(dot)christensen(at)crunchydata(dot)com, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Subject: Re: RFC: Logging plan of the running query
Date: 2023-10-06 12:58:46
Message-ID: 7ab57ab0ba3416e2dbf54927752bc310@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2023-10-04 03:00, James Coleman wrote:
> On Thu, Sep 7, 2023 at 2:09 AM torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
> wrote:
>>
>> On 2023-09-06 11:17, James Coleman wrote:
>>
>> >> > I've never been able to reproduce it (haven't tested the new version,
>> >> > but v28 at least) on my M1 Mac; where I've reproduced it is on Debian
>> >> > (first buster and now bullseye).
>> >> >
>> >> > I'm attaching several stacktraces in the hope that they provide some
>> >> > clues. These all match the ps output I sent earlier, though note in
>> >> > that output there is both the regress instance and my test instance
>> >> > (pid 3213249) running (different ports, of course, and they are from
>> >> > the exact same compilation run). I've attached ps output for the
>> >> > postgres processes under the make check process to simplify cross
>> >> > referencing.
>> >> >
>> >> > A few interesting things:
>> >> > - There's definitely a lock on a relation that seems to be what's
>> >> > blocking the processes.
>> >> > - When I try to connect with psql the process forks but then hangs
>> >> > (see the ps output with task names stuck in "authentication"). I've
>> >> > also included a trace from one of these.
>> >>
>> >> Thanks for sharing them!
>> >>
>> >> Many processes are waiting to acquire the LW lock, including the
>> >> process
>> >> trying to output the plan(select1.trace).
>> >>
>> >> I suspect that this is due to a lock that was acquired prior to being
>> >> interrupted by ProcessLogQueryPlanInterrupt(), but have not been able
>> >> to
>> >> reproduce the same situation..
>> >>
>> >
>> > I don't have time immediately to dig into this, but perhaps loading up
>> > the core dumps would allow us to see what query is running in each
>> > backend process (if it hasn't already been discarded by that point)
>> > and thereby determine what point in each test process led to the error
>> > condition?
>>
>> Thanks for the suggestion.
>> I am concerned that core dumps may not be readable on different
>> operating systems or other environments. (Unfortunately, I do not have
>> Debian on hand)
>>
>> It seems that we can know what queries were running from the stack
>> traces you shared.
>> As described above, I suspect a lock which was acquired prior to
>> ProcessLogQueryPlanInterrupt() caused the issue.
>> I will try a little more to see if I can devise a way to create the
>> same
>> situation.
>>
>> > Alternatively we might be able to apply the same trick to the test
>> > client instead...
>> >
>> > BTW, for my own easy reference in this thread: relid 1259 is pg_class
>> > if I'm not mistaken.
>>
>> Yeah, and I think it's strange that the lock on 1259 appears twice; this
>> must be avoided.
>>
>> #10 0x0000559d61d8ee6e in LockRelationOid (relid=1259, lockmode=1)
>> at
>> lmgr.c:117
>> ..
>> #49 0x0000559d61b4989d in ProcessLogQueryPlanInterrupt () at
>> explain.c:5158
>> ..
>> #53 0x0000559d61d8ee6e in LockRelationOid (relid=1259, lockmode=1)
>> at
>> lmgr.c:117
>
> I chatted with Andres and David about this at PGConf.NYC,
Thanks again for the discussion with hackers!

> and I think
> what we need to do is explicitly disallow running this code any time
> we are inside of lock acquisition code.

Yeah, I think it's a straightforward workaround.
And I'm also concerned that the condition of being in the process
of acquiring some kind of lock is too strict and will make it almost
impossible to output a running plan.

Anyway I'm going to implement it and run pg_log_query_plan()
while the regression test is running to see how successful the plan
output is.

--
Regards,

--
Atsushi Torikoshi
NTT DATA Group Corporation


From: James Coleman <jtc331(at)gmail(dot)com>
To: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Greg Stark <stark(at)mit(dot)edu>, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>, david(dot)christensen(at)crunchydata(dot)com, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Subject: Re: RFC: Logging plan of the running query
Date: 2023-10-06 13:10:15
Message-ID: CAAaqYe_Xz_bU09dZxPEP5C63N+V_j2qy7Xj2kW22BX2hcwGfGw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Oct 6, 2023 at 8:58 AM torikoshia <torikoshia(at)oss(dot)nttdata(dot)com> wrote:
>
> On 2023-10-04 03:00, James Coleman wrote:
> > On Thu, Sep 7, 2023 at 2:09 AM torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
> > wrote:
> >>
> >> On 2023-09-06 11:17, James Coleman wrote:
> >>
> >> >> > I've never been able to reproduce it (haven't tested the new version,
> >> >> > but v28 at least) on my M1 Mac; where I've reproduced it is on Debian
> >> >> > (first buster and now bullseye).
> >> >> >
> >> >> > I'm attaching several stacktraces in the hope that they provide some
> >> >> > clues. These all match the ps output I sent earlier, though note in
> >> >> > that output there is both the regress instance and my test instance
> >> >> > (pid 3213249) running (different ports, of course, and they are from
> >> >> > the exact same compilation run). I've attached ps output for the
> >> >> > postgres processes under the make check process to simplify cross
> >> >> > referencing.
> >> >> >
> >> >> > A few interesting things:
> >> >> > - There's definitely a lock on a relation that seems to be what's
> >> >> > blocking the processes.
> >> >> > - When I try to connect with psql the process forks but then hangs
> >> >> > (see the ps output with task names stuck in "authentication"). I've
> >> >> > also included a trace from one of these.
> >> >>
> >> >> Thanks for sharing them!
> >> >>
> >> >> Many processes are waiting to acquire the LW lock, including the
> >> >> process
> >> >> trying to output the plan(select1.trace).
> >> >>
> >> >> I suspect that this is due to a lock that was acquired prior to being
> >> >> interrupted by ProcessLogQueryPlanInterrupt(), but have not been able
> >> >> to
> >> >> reproduce the same situation..
> >> >>
> >> >
> >> > I don't have time immediately to dig into this, but perhaps loading up
> >> > the core dumps would allow us to see what query is running in each
> >> > backend process (if it hasn't already been discarded by that point)
> >> > and thereby determine what point in each test process led to the error
> >> > condition?
> >>
> >> Thanks for the suggestion.
> >> I am concerned that core dumps may not be readable on different
> >> operating systems or other environments. (Unfortunately, I do not have
> >> Debian on hand)
> >>
> >> It seems that we can know what queries were running from the stack
> >> traces you shared.
> >> As described above, I suspect a lock which was acquired prior to
> >> ProcessLogQueryPlanInterrupt() caused the issue.
> >> I will try a little more to see if I can devise a way to create the
> >> same
> >> situation.
> >>
> >> > Alternatively we might be able to apply the same trick to the test
> >> > client instead...
> >> >
> >> > BTW, for my own easy reference in this thread: relid 1259 is pg_class
> >> > if I'm not mistaken.
> >>
> >> Yeah, and I think it's strange that the lock on 1259 appears twice; this
> >> must be avoided.
> >>
> >> #10 0x0000559d61d8ee6e in LockRelationOid (relid=1259, lockmode=1)
> >> at
> >> lmgr.c:117
> >> ..
> >> #49 0x0000559d61b4989d in ProcessLogQueryPlanInterrupt () at
> >> explain.c:5158
> >> ..
> >> #53 0x0000559d61d8ee6e in LockRelationOid (relid=1259, lockmode=1)
> >> at
> >> lmgr.c:117
> >
> > I chatted with Andres and David about this at PGConf.NYC,
> Thanks again for the discussion with hackers!
>
> > and I think
> > what we need to do is explicitly disallow running this code any time
> > we are inside of lock acquisition code.
>
> Yeah, I think it's a straightforward workaround.
> And I'm also concerned that the condition of being in the process
> of acquiring some kind of lock is too strict and will make it almost
> impossible to output a running plan.

I was concerned about this too, but I was wondering if we might be
able to cheat a bit by making such a case not clear the flag but
instead just skip it for now.

Regards,
James


From: Andres Freund <andres(at)anarazel(dot)de>
To: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
Cc: James Coleman <jtc331(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Greg Stark <stark(at)mit(dot)edu>, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>, david(dot)christensen(at)crunchydata(dot)com, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Subject: Re: RFC: Logging plan of the running query
Date: 2023-10-06 15:58:50
Message-ID: 20231006155850.kjjb2vwbkawj7moq@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

On 2023-10-06 21:58:46 +0900, torikoshia wrote:
> Yeah, I think it's a straightforward workaround.
> And I'm also concerned that the condition of being in the process
> of acquiring some kind of lock is too strict and will make it almost
> impossible to output a running plan.

How so? We shouldn't commonly acquire relevant locks while executing a query?
With a few exceptions, they should instead be acquired at the start of query
processing. We do acquire a lot of lwlocks, obviously, but we don't process
interrupts during the acquisition / holding of lwlocks.

And presumably the interrupt would just be processed the next time interrupt
processing is happening?
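
Roughly, sketched below with assumed names (the patch's actual flag and
handler may differ):

```c
/* signal handler side: just latch a flag; nothing heavy happens here */
void
HandleLogQueryPlanInterrupt(void)
{
    InterruptPending = true;
    LogQueryPlanPending = true;     /* assumed flag name */
    SetLatch(MyLatch);
}

/* and in ProcessInterrupts(), reached via CHECK_FOR_INTERRUPTS():
 *
 *     if (LogQueryPlanPending)
 *     {
 *         LogQueryPlanPending = false;
 *         ProcessLogQueryPlanInterrupt();
 *     }
 */
```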

Greetings,

Andres Freund


From: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>, jtc331(at)gmail(dot)com
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: RFC: Logging plan of the running query
Date: 2023-10-10 13:27:08
Message-ID: 0e0e7ca08dff077a625c27a5e0c2ef0a@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2023-10-04 03:00, James Coleman wrote:
> and I think
> what we need to do is explicitly disallow running this code any time
> we are inside of lock acquisition code.

Updated the patch to check whether any locks have already been acquired,
by examining MyProc->heldLocks.

I'm not sure this change can "disallow running this code `any time` we
are inside of lock acquisition code", but as far as the select1.trace you
shared is concerned, I believe it can prevent running the explain code,
since MyProc->heldLocks must have been set in LockAcquireExtended() before
WaitOnLock():

```
/*
* Set bitmask of locks this process already holds on this
object.
*/
MyProc->heldLocks = proclock->holdMask;

..(snip)..

WaitOnLock(locallock, owner);
```
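
So a guard at the top of ProcessLogQueryPlanInterrupt() along the
following lines should be enough. This is a rough sketch of the v30
change, not the patch text itself; per James's earlier suggestion it
could also leave the request pending instead of discarding it:

```c
void
ProcessLogQueryPlanInterrupt(void)
{
    /*
     * Refuse to run any EXPLAIN code if we were interrupted during
     * heavyweight lock acquisition: LockAcquireExtended() sets
     * MyProc->heldLocks before WaitOnLock(), so a nonzero value means
     * we are inside that window.
     */
    if (MyProc->heldLocks != 0)
    {
        ereport(LOG_SERVER_ONLY,
                (errmsg("ignored request for logging query plan due to lock conflicts")));
        return;
    }

    /* ... safe to assemble and log the plan of the active query ... */
}
```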

On 2023-10-07 00:58, Andres Freund wrote:

> How so? We shouldn't commonly acquire relevant locks while executing a
> query?
> With a few exceptions, they should instead be acquired at the start of
> query
> processing. We do acquire a lot of lwlocks, obviously, but we don't
> process
> interrupts during the acquisition / holding of lwlocks.
>
> And presumably the interrupt would just be processed the next time
> interrupt
> processing is happening?

Thanks for your comments!

I tested the v30 patch with
v28-0002-Testing-attempt-logging-plan-on-ever-CFI-call.patch, which makes
CFI() call ProcessLogQueryPlanInterrupt() internally, and confirmed that
very few logging queries failed with the v30 patch.

This is a result in line with your prediction.

```
$ rg -c'ignored request for logging query plan due to lock conflicts'
postmaster.log
8
```
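
For reference, the testing patch is roughly equivalent to something like
this (a sketch based on the patch name; the actual v28-0002 mechanism may
differ):

```c
/*
 * Testing only: make every CHECK_FOR_INTERRUPTS() also attempt to log
 * the current plan, so that lock-related breakage surfaces while the
 * regression tests run.
 */
#define CHECK_FOR_INTERRUPTS() \
do { \
    if (INTERRUPTS_PENDING_CONDITION()) \
        ProcessInterrupts(); \
    ProcessLogQueryPlanInterrupt(); \
} while (0)
```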

--
Regards,

--
Atsushi Torikoshi
NTT DATA Group Corporation

Attachment Content-Type Size
v30-0001-Add-function-to-log-the-plan-of-the-query.patch text/x-diff 27.4 KB

From: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
To: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, jtc331(at)gmail(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: RFC: Logging plan of the running query
Date: 2023-10-11 07:22:00
Message-ID: CAExHW5ujL=mzbjxf6xtyU-Ft6LMBMEmt8H4ZSDiiZre0Fef0jg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

On Tue, Oct 10, 2023 at 7:00 PM torikoshia <torikoshia(at)oss(dot)nttdata(dot)com> wrote:

> Thanks for your comments!
>
> I tested v30 patch with
> v28-0002-Testing-attempt-logging-plan-on-ever-CFI-call.patch which makes
> CFI() call ProcessLogQueryPlanInterrupt() internally, and confirmed that
> very few logging queries failed with v30 patch.
>
> This is a result in line with your prediction.
>
>
> ```
> $ rg -c'ignored request for logging query plan due to lock conflicts'
> postmaster.log
> 8

Like many others I think this feature is useful for debugging a long-running query.

Sorry for jumping into this late.

I have a few high-level comments.

There is a lot of similarity between what this feature does and what
auto_explain does. I see the code is also duplicated. There is some
merit in avoiding this duplication:
1. we will get all the features of auto_explain automatically, like
choosing a format (this was expressed by somebody earlier in this
thread), settings, etc.
2. avoid bugs. E.g. your code switches context after ExplainState has
been allocated. These states may leak depending upon when this
function gets called.
3. building features on top, as James envisions, will be easier.

Considering the similarity with auto_explain, I wondered whether this
function should be part of the auto_explain contrib module itself. If we
do that, users will need to load the auto_explain extension and thus
install executor hooks even though this function doesn't need those. So it
may not be such a good idea. I didn't see any discussion on this.

I tried the following query to pass the PID of a non-client backend to this function.
#select pg_log_query_plan(pid), application_name, backend_type from
pg_stat_activity where backend_type = 'autovacuum launcher';
 pg_log_query_plan | application_name |    backend_type
-------------------+------------------+---------------------
 t                 |                  | autovacuum launcher
(1 row)
I see "LOG: backend with PID 2733631 is not running a query or a
subtransaction is aborted" in server logs. That's ok. But may be we
should not send signal to these kinds of backends at all, thus
avoiding some system calls.

I am also wondering whether it's better to report the WARNING as a
status column in the output. E.g. instead of
#select pg_log_query_plan(100);
WARNING: PID 100 is not a PostgreSQL backend process
 pg_log_query_plan
-------------------
 f
(1 row)
we output
#select pg_log_query_plan(100);
 pg_log_query_plan |                   status
-------------------+---------------------------------------------
 f                 | PID 100 is not a PostgreSQL backend process
(1 row)

That looks neater and can easily be handled by scripts, applications
and such. But it will be inconsistent with other functions like
pg_terminate_backend() and pg_log_backend_memory_contexts().

I do share a concern that was discussed earlier. If a query is running
long, there's something problematic with it. A diagnostic
intervention breaking it further would be unwelcome. James has run
experiments to shake this code for any loose breakages. He has not
found any. So maybe we are good. And we wouldn't know about very rare
corner cases so easily without using it in the field. So I am fine with it.
If we could add some safety net that would be great, but it may not be
necessary for the first cut.

--
Best Wishes,
Ashutosh Bapat


From: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
To: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, jtc331(at)gmail(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: RFC: Logging plan of the running query
Date: 2023-10-12 13:21:26
Message-ID: 2d6d7846b6b794c0e727c431852a113c@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2023-10-11 16:22, Ashutosh Bapat wrote:

> Like many others I think this feature is useful for debugging a long-running
> query.
>
> Sorry for jumping into this late.
>
> I have a few high-level comments.

Thanks for your comments!

> There is a lot of similarity between what this feature does and what
> auto_explain does. I see the code is also duplicated. There is some
> merit in avoiding this duplication:
> 1. we will get all the features of auto_explain automatically, like
> choosing a format (this was expressed by somebody earlier in this
> thread), settings, etc.
> 2. avoid bugs. E.g. your code switches context after ExplainState has
> been allocated. These states may leak depending upon when this
> function gets called.
> 3. building features on top, as James envisions, will be easier.
>
> Considering the similarity with auto_explain, I wondered whether this
> function should be part of the auto_explain contrib module itself. If we
> do that, users will need to load the auto_explain extension and thus
> install executor hooks even though this function doesn't need those. So it
> may not be such a good idea. I didn't see any discussion on this.

I once thought about adding this to auto_explain, but I left it as is for
the reasons below:

- One of the typical use cases of pg_log_query_plan() would be analyzing a
slow query on customer environments. In such environments, we cannot
always control what extensions to install.
Of course auto_explain is a major extension and it is quite possible
that they installed auto_explain, but it is also possible they did
not.
- It seems a bit counter-intuitive that pg_log_query_plan() is in an
extension called auto_explain, since it `manually` logs plans.

> I tried the following query to pass the PID of a non-client backend to this
> function.
> #select pg_log_query_plan(pid), application_name, backend_type from
> pg_stat_activity where backend_type = 'autovacuum launcher';
> pg_log_query_plan | application_name | backend_type
> -------------------+------------------+---------------------
> t | | autovacuum launcher
> (1 row)
> I see "LOG: backend with PID 2733631 is not running a query or a
> subtransaction is aborted" in server logs. That's ok. But maybe we
> should not send a signal to these kinds of backends at all, thus
> avoiding some system calls.

Agreed, it seems better.
The attached patch checks whether the backendType of the target process is
'client backend'.

=# select pg_log_query_plan(pid), application_name, backend_type from
pg_stat_activity where backend_type = 'autovacuum launcher';
WARNING: PID 63323 is not a PostgreSQL client backend process
 pg_log_query_plan | application_name |    backend_type
-------------------+------------------+---------------------
 f                 |                  | autovacuum launcher
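
Roughly speaking, the added check looks like the following sketch.
get_backend_type_for_pid() is an assumed helper here, not an existing
API, and the patch may implement the lookup differently:

```c
Datum
pg_log_query_plan(PG_FUNCTION_ARGS)
{
    int pid = PG_GETARG_INT32(0);

    /*
     * Bail out before signalling anything that is not a client backend
     * (e.g. the autovacuum launcher), avoiding a useless system call.
     */
    if (get_backend_type_for_pid(pid) != B_BACKEND)     /* assumed helper */
    {
        ereport(WARNING,
                (errmsg("PID %d is not a PostgreSQL client backend process", pid)));
        PG_RETURN_BOOL(false);
    }

    /* ... signal the target process as before ... */
    PG_RETURN_BOOL(true);
}
```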

> I am also wondering whether it's better to report the WARNING as
> status column in the output. E.g. instead of
> #select pg_log_query_plan(100);
> WARNING: PID 100 is not a PostgreSQL backend process
> pg_log_query_plan
> -------------------
> f
> (1 row)
> we output
> #select pg_log_query_plan(100);
> pg_log_query_plan | status
> -------------------+---------------------------------------------
> f | PID 100 is not a PostgreSQL backend process
> (1 row)
>
> That looks neater and can easily be handled by scripts, applications
> and such. But it will be inconsistent with other functions like
> pg_terminate_backend() and pg_log_backend_memory_contexts().

It seems neater, but it might be inconvenient because we can no longer
use it in a select list, as in the following query you wrote:

#select pg_log_query_plan(pid), application_name, backend_type from
pg_stat_activity where backend_type = 'autovacuum launcher';

> I do share a concern that was discussed earlier. If a query is running
> long, there's something problematic with it. A diagnostic
> intervention breaking it further would be unwelcome. James has run
> experiments to shake this code for any loose breakages. He has not
> found any. So maybe we are good. And we wouldn't know about very rare
> corner cases so easily without using it in the field. So I am fine with it.
> If we could add some safety net that would be great, but it may not be
> necessary for the first cut.

If there are candidates for the safety net, I'm willing to add them.

--
Regards,

--
Atsushi Torikoshi
NTT DATA Group Corporation

Attachment Content-Type Size
v31-0001-Add-function-to-log-the-plan-of-the-query.patch text/x-diff 27.7 KB

From: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
To: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, jtc331(at)gmail(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: RFC: Logging plan of the running query
Date: 2023-10-16 09:46:28
Message-ID: CAExHW5s+T0Ds-h85y92MsryLqup8fuAT8bovgGGaNnLt=OrQOg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Oct 12, 2023 at 6:51 PM torikoshia <torikoshia(at)oss(dot)nttdata(dot)com> wrote:
>
> On 2023-10-11 16:22, Ashutosh Bapat wrote:
> >
> > Considering the similarity with auto_explain, I wondered whether this
> > function should be part of the auto_explain contrib module itself. If we
> > do that, users will need to load the auto_explain extension and thus
> > install executor hooks even though this function doesn't need those. So it
> > may not be such a good idea. I didn't see any discussion on this.
>
> I once thought about adding this to auto_explain, but I left it as is for
> the reasons below:
>
> - One of the typical use cases of pg_log_query_plan() would be analyzing a
> slow query on customer environments. In such environments, we cannot
> always control what extensions to install.

The same argument applies to auto_explain functionality as well. But
it's not part of the core.

> Of course auto_explain is a major extension and it is quite possible
> that they installed auto_explain, but it is also possible they did
> not.
> - It seems a bit counter-intuitive that pg_log_query_plan() is in an
> extension called auto_explain, since it `manually` logs plans.
>

pg_log_query_plan() may not fit auto_explain, but
pg_explain_backend_query() does. What we are logging is more than just
the plan of the query; it might expand to be closer to full explain output.
While the 'auto' in auto_explain refers to its automatically logging
explain outputs, the module can provide an additional function which offers
similar functionality triggered manually.

But we can defer this to a committer, if you want.

I am more interested in avoiding the duplication of code, esp. the
first comment in my reply
>> There is a lot of similarity between what this feature does and what
>> auto_explain does. I see the code is also duplicated. There is some
>> merit in avoiding this duplication:
>> 1. we will get all the features of auto_explain automatically, like
>> choosing a format (this was expressed by somebody earlier in this
>> thread), settings, etc.
>> 2. avoid bugs. E.g. your code switches context after ExplainState has
>> been allocated. These states may leak depending upon when this
>> function gets called.
>> 3. building features on top, as James envisions, will be easier.

>
> =# select pg_log_query_plan(pid), application_name, backend_type from
> pg_stat_activity where backend_type = 'autovacuum launcher';
> WARNING: PID 63323 is not a PostgreSQL client backend process
>  pg_log_query_plan | application_name |    backend_type
> -------------------+------------------+---------------------
>  f                 |                  | autovacuum launcher
>
>
> > I am also wondering whether it's better to report the WARNING as
> > status column in the output. E.g. instead of
> > #select pg_log_query_plan(100);
> > WARNING: PID 100 is not a PostgreSQL backend process
> > pg_log_query_plan
> > -------------------
> > f
> > (1 row)
> > we output
> > #select pg_log_query_plan(100);
> > pg_log_query_plan | status
> > -------------------+---------------------------------------------
> > f | PID 100 is not a PostgreSQL backend process
> > (1 row)
> >
> > That looks neater and can easily be handled by scripts, applications
> > and such. But it will be inconsistent with other functions like
> > pg_terminate_backend() and pg_log_backend_memory_contexts().
>
> It seems neater, but it might be inconvenient because we can no longer
> use it in select list like the following query as you wrote:
>
> #select pg_log_query_plan(pid), application_name, backend_type from
> pg_stat_activity where backend_type = 'autovacuum launcher';

Why is that?

--
Best Wishes,
Ashutosh Bapat


From: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
To: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, jtc331(at)gmail(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: RFC: Logging plan of the running query
Date: 2023-10-18 06:09:38
Message-ID: 0eb66e8c72f23695957466db5c76c69c@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2023-10-16 18:46, Ashutosh Bapat wrote:
> On Thu, Oct 12, 2023 at 6:51 PM torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
> wrote:
>>
>> On 2023-10-11 16:22, Ashutosh Bapat wrote:
>> >
>> > Considering the similarity with auto_explain, I wondered whether this
>> > function should be part of the auto_explain contrib module itself. If we
>> > do that, users will need to load the auto_explain extension and thus
>> > install executor hooks even though this function doesn't need those. So it
>> > may not be such a good idea. I didn't see any discussion on this.
>>
>> I once thought about adding this to auto_explain, but I left it as is
>> for
>> the reasons below:
>>
>> - One of the typical use cases of pg_log_query_plan() would be
>> analyzing a
>> slow query on customer environments. In such environments, we cannot
>> always control what extensions to install.
>
> The same argument applies to auto_explain functionality as well. But
> it's not part of the core.

Yeah, and when we have a situation where we want to run
pg_log_query_plan(), we can run it in any environment as long as it is
bundled with the core.
On the other hand, if it is built into auto_explain, we need to start by
installing auto_explain if it is not already installed, which is often
difficult to do in production environments.

>> Of course auto_explain is a major extension and it is quite
>> possible
>> that they installed auto_explain, but it is also possible they did
>> not.
>> - It seems a bit counter-intuitive that pg_log_query_plan() is in an
>> extension called auto_explain, since it `manually` logs plans.
>>
>
> pg_log_query_plan() may not fit auto_explain but
> pg_explain_backend_query() does. What we are logging is more than just
> plan of the query, it might expand to be closer to explain output.
> While auto in auto_explain would refer to its automatically logging
> explain outputs, it can provide an additional function which provides
> similar functionality by manually triggering it.
>
> But we can defer this to a committer, if you want.
>
> I am more interested in avoiding the duplication of code, esp. the
> first comment in my reply

If there are no objections, I will try porting it to auto_explain and
see its feasibility.

>>> There is a lot of similarity between what this feature does and what
>>> auto_explain does. I see the code is also duplicated. There is some
>>> merit in avoiding this duplication:
>>> 1. we will get all the features of auto_explain automatically, like
>>> choosing a format (this was expressed by somebody earlier in this
>>> thread), settings, etc.
>>> 2. avoid bugs. E.g. your code switches context after ExplainState has
>>> been allocated. These states may leak depending upon when this
>>> function gets called.
>>> 3. building features on top, as James envisions, will be easier.
>
>>
>> =# select pg_log_query_plan(pid), application_name, backend_type
>> from
>> pg_stat_activity where backend_type = 'autovacuum launcher';
>> WARNING: PID 63323 is not a PostgreSQL client backend process
>>  pg_log_query_plan | application_name |    backend_type
>> -------------------+------------------+---------------------
>>  f                 |                  | autovacuum launcher
>>
>>
>> > I am also wondering whether it's better to report the WARNING as
>> > status column in the output. E.g. instead of
>> > #select pg_log_query_plan(100);
>> > WARNING: PID 100 is not a PostgreSQL backend process
>> > pg_log_query_plan
>> > -------------------
>> > f
>> > (1 row)
>> > we output
>> > #select pg_log_query_plan(100);
>> > pg_log_query_plan | status
>> > -------------------+---------------------------------------------
>> > f | PID 100 is not a PostgreSQL backend process
>> > (1 row)
>> >
>> > That looks neater and can easily be handled by scripts, applications
>> > and such. But it will be inconsistent with other functions like
>> > pg_terminate_backend() and pg_log_backend_memory_contexts().
>>
>> It seems neater, but it might be inconvenient because we can no longer
>> use it in a select list, as in the following query you wrote:
>>
>> #select pg_log_query_plan(pid), application_name, backend_type from
>> pg_stat_activity where backend_type = 'autovacuum launcher';
>
> Why is that?

Sorry, I misunderstood and confirmed we can run queries like below:

```
=# CREATE OR REPLACE FUNCTION pg_log_query_plan_stab(i int)
   RETURNS TABLE(
       pg_log_query_plan bool,
       status text
   ) AS $$
   BEGIN
       RETURN QUERY SELECT false::bool,
           'PID 100 is not a PostgreSQL backend process'::text;
   END;
   $$ LANGUAGE plpgsql;

=# select pg_log_query_plan_stab(pid), application_name, backend_type
from pg_stat_activity where backend_type = 'autovacuum launcher';
              pg_log_query_plan_stab               | application_name |    backend_type
---------------------------------------------------+------------------+---------------------
 (f,"PID 100 is not a PostgreSQL backend process") |                  | autovacuum launcher
```
--
Regards,

--
Atsushi Torikoshi
NTT DATA Group Corporation


From: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
To: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, jtc331(at)gmail(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: RFC: Logging plan of the running query
Date: 2023-10-18 10:13:01
Message-ID: CAExHW5tqFqPT-GSvxBskc9Odrm_VgWru2fWvPX1BfjymcrGbjA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Oct 18, 2023 at 11:39 AM torikoshia <torikoshia(at)oss(dot)nttdata(dot)com> wrote:
> >
> > I am more interested in avoiding the duplication of code, esp. the
> > first comment in my reply
>
> If there are no objections, I will try porting it to auto_explain and
> see its feasibility.

If you want it in core, you will need to port relevant parts of
auto_explain code to core.

--
Best Wishes,
Ashutosh Bapat


From: James Coleman <jtc331(at)gmail(dot)com>
To: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
Cc: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: RFC: Logging plan of the running query
Date: 2023-10-18 16:34:36
Message-ID: CAAaqYe_1EuoTudAz8mr8-qtN5SoNtYRm4JM2J9CqeverpE3B2A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Oct 18, 2023 at 2:09 AM torikoshia <torikoshia(at)oss(dot)nttdata(dot)com> wrote:
>
> On 2023-10-16 18:46, Ashutosh Bapat wrote:
> > On Thu, Oct 12, 2023 at 6:51 PM torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
> > wrote:
> >>
> >> On 2023-10-11 16:22, Ashutosh Bapat wrote:
> >> >
> >> > Considering the similarity with auto_explain, I wondered whether this
> >> > function should be part of the auto_explain contrib module itself. If we
> >> > do that, users will need to load the auto_explain extension and thus
> >> > install executor hooks even though this function doesn't need those. So it
> >> > may not be such a good idea. I didn't see any discussion on this.
> >>
> >> I once thought about adding this to auto_explain, but I left it as is
> >> for
> >> the reasons below:
> >>
> >> - One of the typical use cases of pg_log_query_plan() would be
> >> analyzing a
> >> slow query on customer environments. In such environments, we cannot
> >> always control what extensions to install.
> >
> > The same argument applies to auto_explain functionality as well. But
> > it's not part of the core.
>
> Yeah, and when we have a situation where we want to run
> pg_log_query_plan(), we can run it in any environment as long as it is
> bundled with the core.
> On the other hand, if it is built into auto_explain, we need to start by
> installing auto_explain if we do not have auto_explain, which is often
> difficult to do in production environments.
>
> >> Of course auto_explain is a major extension and it is quite
> >> possible
> >> that they installed auto_explain, but it is also possible they did
> >> not.
> >> - It seems a bit counter-intuitive that pg_log_query_plan() is in an
> >> extension called auto_explain, since it `manually` logs plans.
> >>
> >
> > pg_log_query_plan() may not fit auto_explain but
> > pg_explain_backend_query() does. What we are logging is more than just
> > plan of the query, it might expand to be closer to explain output.
> > While auto in auto_explain would refer to its automatically logging
> > explain outputs, it can provide an additional function which provides
> > similar functionality by manually triggering it.
> >
> > But we can defer this to a committer, if you want.
> >
> > I am more interested in avoiding the duplication of code, esp. the
> > first comment in my reply
>
> If there are no objections, I will try porting it to auto_explain and
> see its feasibility.
>
> >>> There is a lot of similarity between what this feature does and what
> >>> auto_explain does. I see the code is also duplicated. There is some
> >>> merit in avoiding this duplication:
> >>> 1. we will get all the features of auto_explain automatically, like
> >>> choosing a format (this was expressed by somebody earlier in this
> >>> thread), settings, etc.
> >>> 2. avoid bugs. E.g. your code switches context after ExplainState has
> >>> been allocated. These states may leak depending upon when this
> >>> function gets called.
> >>> 3. building features on top, as James envisions, will be easier.

In my view the fact that auto_explain is itself not part of core is a
mistake, and I know there are more prominent hackers than myself who
hold that view.

While that decision as regards auto_explain has long since been made
(and there would be work to undo it), I don't believe that we should
repeat that choice here. I'm -10 on moving this into auto_explain.

However perhaps there is still an opportunity for moving some of the
auto_explain code into core so as to enable deduplicating the code.

Regards,
James Coleman


From: Étienne BERSAC <etienne(dot)bersac(at)dalibo(dot)com>
To: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, jtc331(at)gmail(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: RFC: Logging plan of the running query
Date: 2023-10-24 07:12:46
Message-ID: 9c36599db62ef7466a297843fe812fc60e827cd2.camel@dalibo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> Hi,
>
> Yeah, and when we have a situation where we want to run
> pg_log_query_plan(), we can run it in any environment as long as it
> is bundled with the core.

Is it possible to get the plan of a backend programmatically without
access to the logs?

Something like pg_get_query_plan(pid, format), whose output would be the
same as EXPLAIN's.

Regards,
Étienne BERSAC
Dalibo


From: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
To: Étienne BERSAC <etienne(dot)bersac(at)dalibo(dot)com>
Cc: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, jtc331(at)gmail(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: RFC: Logging plan of the running query
Date: 2023-10-24 12:30:49
Message-ID: 66d29d2c10ca9181c7e183eeeb4525f8@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2023-10-24 16:12, Étienne BERSAC wrote:
>> Hi,
>>
>> Yeah, and when we have a situation where we want to run
>> pg_log_query_plan(), we can run it in any environment as long as it
>> is bundled with the core.
>
> Is it possible to get the plan of a backend programmatically without
> access to the logs?
>
> Something like pg_get_query_plan(pid, format), whose output would be the
> same as EXPLAIN's.

Do you imagine a function like the one below?

=# select pg_get_query_plan(100, 'plain');
                             QUERY PLAN
-------------------------------------------------------------------
 Limit  (cost=0.00..0.04 rows=1 width=273)
   ->  Seq Scan on pg_class  (cost=0.00..17.14 rows=414 width=273)

If so, we once tried to implement such a function for getting memory
contexts.
However, this attempt didn't succeed because it could lead to a deadlock
situation [1].

I think the same problem can occur when implementing
pg_get_query_plan().

[1]
/message-id/9a50371e15e741e295accabc72a41df1%40oss.nttdata.com
>
> Regards,
> Étienne BERSAC
> Dalibo

--
Regards,

--
Atsushi Torikoshi
NTT DATA Group Corporation


From: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
To: James Coleman <jtc331(at)gmail(dot)com>
Cc: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: RFC: Logging plan of the running query
Date: 2023-10-25 03:40:06
Message-ID: CAExHW5siNFMWqDxUCtfbwT10bvcpKZ0GshNE5z-NjgPTQsHV_g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Oct 18, 2023 at 10:04 PM James Coleman <jtc331(at)gmail(dot)com> wrote:
>
> While that decision as regards auto_explain has long since been made
> (and there would be work to undo it), I don't believe that we should
> repeat that choice here. I'm -10 on moving this into auto_explain.
>

Right.

> However perhaps there is still an opportunity for moving some of the
> auto_explain code into core so as to enable deduplicating the code.
>

Right. That's what I also think.

--
Best Wishes,
Ashutosh Bapat


From: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
To: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
Cc: James Coleman <jtc331(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: RFC: Logging plan of the running query
Date: 2023-10-26 06:53:22
Message-ID: e452507ea282ef996c2207bff1bff582@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2023-10-25 12:40, Ashutosh Bapat wrote:
> On Wed, Oct 18, 2023 at 10:04 PM James Coleman <jtc331(at)gmail(dot)com>
> wrote:
>>
>> While that decision as regards auto_explain has long since been made
>> (and there would be work to undo it), I don't believe that we should
>> repeat that choice here. I'm -10 on moving this into auto_explain.
>>
>
> Right.
>
>> However perhaps there is still an opportunity for moving some of the
>> auto_explain code into core so as to enable deduplicating the code.
>>
>
> Right. That's what I also think.

Thanks for your comments.

The attached patch adds a new function which assembles es->str for logging
according to the specified contents and format.
This is called from both auto_explain and pg_log_query_plan().
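
Its signature is along these lines (a sketch; the exact parameter list
in v32 may differ):

```c
/*
 * Assemble es->str for queryDesc in the requested format, optionally
 * including settings and the duration in msec, so that auto_explain's
 * ExecutorEnd hook and ProcessLogQueryPlanInterrupt() can share the code.
 */
void
ExplainAssembleLogOutput(ExplainState *es, QueryDesc *queryDesc,
                         int logFormat, bool logSettings, double msec);
```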

On 2023-10-11 16:22, Ashutosh Bapat wrote:
> I am also wondering whether it's better to report the WARNING as
> status column in the output.

The attached patch leaves this as it was, given the inconsistency with
pg_terminate_backend() and pg_log_backend_memory_contexts() that you
pointed out.

--
Regards,

--
Atsushi Torikoshi
NTT DATA Group Corporation

Attachment Content-Type Size
v32-0001-Add-function-to-log-the-plan-of-the-query.patch text/x-diff 30.3 KB

From: Étienne BERSAC <etienne(dot)bersac(at)dalibo(dot)com>
To: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
Cc: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, jtc331(at)gmail(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: RFC: Logging plan of the running query
Date: 2023-10-27 07:06:31
Message-ID: 7bc1cc1bcc3289e9354ebb00ac1c8440c01b736c.camel@dalibo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


Hi Torikoshia,

>> If so, we once tried to implement such a function for getting memory
>> contexts.
>> However, this attempt didn't succeed because it could lead to a deadlock
>> situation [1].

Thanks for the pointer. Why not use a client log message to allow the
client to get the plan without locking backend memory contexts and without
access to the server log? I missed the rationale for not sending the plan
to the client.

Regards,
Étienne


From: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
To: Étienne BERSAC <etienne(dot)bersac(at)dalibo(dot)com>
Cc: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, jtc331(at)gmail(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: RFC: Logging plan of the running query
Date: 2023-10-27 12:30:02
Message-ID: 0eafbfbe8dc39e80bce358b48a258510@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2023-10-27 16:06, Étienne BERSAC wrote:
> Hi Torikoshia,
>
>> If so, we once tried to implement such function for getting memory
>> contexts.
>> However, this attempt didn't succeed because it could lead dead lock
>> situation[1].
>
> Thanks for the pointer. Why not use a client log message to allow the
> client to get the plan without locking backend memory contexts and without
> access to the server log? I missed the rationale for not sending the plan
> to the client.

If we use a client log message, the message is shown on the target
process whose pid is specified by the parameter of pg_log_query_plan():
(pid:1000)=# select pg_sleep(60);
(pid:1001)=# select pg_log_query_plan(1000);
(pid:1000)=# LOG:  query plan running on backend with PID 1000 is:
             Query Text: select pg_sleep(1000);
             Result  (cost=0.00..0.01 rows=1 width=4)
               Output: pg_sleep('1000'::double precision)

I think this is not the expected behavior, so we set elevel to
LOG_SERVER_ONLY.
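
In other words, the plan is emitted along these lines (a sketch; the
errhidestmt() usage is an assumption):

```c
/* send the plan to the server log only, never to a connected client */
ereport(LOG_SERVER_ONLY,
        (errmsg("query plan running on backend with PID %d is:\n%s",
                MyProcPid, es->str->data),
         errhidestmt(true)));
```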

--
Regards,

--
Atsushi Torikoshi
NTT DATA Group Corporation


From: Étienne BERSAC <etienne(dot)bersac(at)dalibo(dot)com>
To: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
Cc: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, jtc331(at)gmail(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: RFC: Logging plan of the running query
Date: 2023-10-27 12:54:48
Message-ID: 6d13f7485091f51ad7d11cb7c65f75038c10d747.camel@dalibo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> Hi,
>

> If we use client log message, that message is shown on the target
> process whose pid is specified by the parameter of
> pg_log_query_plan():
>
>    (pid:1000)=# select pg_sleep(60);
>    (pid:1001)=# select pg_log_query_plan(1000);
>    (pid:1000)=# LOG:  query plan running on backend with PID 1000 is:
>                 Query Text: select pg_sleep(1000);
>                 Result  (cost=0.00..0.01 rows=1 width=4)
>                   Output: pg_sleep('1000'::double precision)
>
> I think this is not an expected behavior and we set elevel to
> LOG_SERVER_ONLY.

Makes sense. Thanks.


From: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
To: Étienne BERSAC <etienne(dot)bersac(at)dalibo(dot)com>, torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, jtc331(at)gmail(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: RFC: Logging plan of the running query
Date: 2023-11-03 14:01:14
Message-ID: CAExHW5vnoyd750bpjOJr58F33tN+hROAqVqv1nYRVPwd9XtgSg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I have the following questions related to the functionality. (Please point
me to the relevant discussion if this has already been discussed.)

1. When a backend is running nested queries, we will see the plan of
the innermost query. That query may not be the actual culprit if the
user query is running slowly, e.g. a query being run as the inner
side of a nested loop when the nested loop itself is the bottleneck. I
think it will be useful to print the plans of the whole query stack.

2. When a query is running in a parallel worker, do we want to print that
query? It may or may not be interesting given the situation. If the
overall plan itself is faulty, the output of the parallel worker query is
not useful. If the plan is fine but a given worker's leg is running
slowly, it may be interesting.

As a side note, you may want to fix the indentation in
ExplainAssembleLogOutput().

On Fri, Oct 27, 2023 at 6:24 PM Étienne BERSAC
<etienne(dot)bersac(at)dalibo(dot)com> wrote:
>
> > Hi,
> >
>
> > If we use client log message, that message is shown on the target
> > process whose pid is specified by the parameter of
> > pg_log_query_plan():
> >
> > (pid:1000)=# select pg_sleep(60);
> > (pid:1001)=# select pg_log_query_plan(1000);
> > (pid:1000)=# LOG: query plan running on backend with PID 1000 is:
> > Query Text: select pg_sleep(1000);
> > Result (cost=0.00..0.01 rows=1 width=4)
> > Output: pg_sleep('1000'::double precision)
> >
> > I think this is not an expected behavior and we set elevel to
> > LOG_SERVER_ONLY.
>
>
> Makes sense. Thanks.

--
Best Wishes,
Ashutosh Bapat


From: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
To: Étienne BERSAC <etienne(dot)bersac(at)dalibo(dot)com>, torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, jtc331(at)gmail(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: RFC: Logging plan of the running query
Date: 2023-11-06 06:32:12
Message-ID: CAExHW5t+xh95_LZjkUZARDbV=gS5pBJghG8x=Xnjb2psWr0R6g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Nov 3, 2023 at 7:31 PM Ashutosh Bapat
<ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> wrote:
>
> I have the following questions related to the functionality. (Please point
> me to the relevant discussion if this has already been discussed.)
>
> 1. When a backend is running nested queries, we will see the plan of
> the innermost query. That query may not be the actual culprit if the
> user query is running slowly, e.g. a query being run as the inner
> side of a nested loop when the nested loop itself is the bottleneck. I
> think it will be useful to print the plans of the whole query stack.

To further explain this point, consider the following scenario

-- A completely useless function which executes two SQL statements
-- which individually take a small amount of time
#create function multstmt() returns int
language sql as $$
select count(*) from pg_class where reltype = 12345;
select count(*) from pg_type limit 10;
$$;

-- force a suboptimal plan
#set enable_hashjoin to false;
#set enable_mergejoin to false;

-- A completely useless but long running query
#select c.oid, t.oid from pg_class c, pg_type t where multstmt(c.oid)
= multstmt(t.oid);

This takes a few minutes on my laptop.

In a separate session, query pg_stat_activity. We will see the original query
#select pid, query, backend_type from pg_stat_activity where pid = 349330;
  pid   |                                          query                                          |  backend_type
--------+-----------------------------------------------------------------------------------------+----------------
 349330 | select c.oid, t.oid from pg_class c, pg_type t where multstmt(c.oid) = multstmt(t.oid); | client backend
(1 row)

Run the plan output function a few times
#select pg_log_query_plan(349330);

You will observe different plans based on which innermost query is
running.
LOG:  query plan running on backend with PID 349330 is:
        Query Text:
        select count(*) from pg_class where reltype = typeid;
        select count(*) from pg_type where oid = typeid;

        Query Parameters: $1 = '600'
        Aggregate  (cost=18.18..18.19 rows=1 width=8)
          Output: count(*)
          ->  Seq Scan on pg_catalog.pg_class  (cost=0.00..18.18 rows=2 width=0)
                Output: oid, relname, relnamespace, reltype, reloftype,
                  relowner, relam, relfilenode, reltablespace, relpages,
                  reltuples, relallvisible, reltoastrelid, relhasindex,
                  relisshared, relpersistence, relkind, relnatts, relchecks,
                  relhasrules, relhastriggers, relhassubclass, relrowsecurity,
                  relforcerowsecurity, relispopulated, relreplident,
                  relispartition, relrewrite, relfrozenxid, relminmxid,
                  relacl, reloptions, relpartbound
                Filter: (pg_class.reltype = $1)
        Settings: enable_hashjoin = 'off', enable_mergejoin = 'off'
2023-11-06 11:52:25.610 IST [349330] LOG:  query plan running on backend with PID 349330 is:
        Query Text:
        select count(*) from pg_class where reltype = typeid;
        select count(*) from pg_type where oid = typeid;

        Query Parameters: $1 = '2203'
        Aggregate  (cost=4.30..4.31 rows=1 width=4)
          Output: count(*)
          ->  Index Only Scan using pg_type_oid_index on pg_catalog.pg_type  (cost=0.28..4.29 rows=1 width=0)
                Output: oid
                Index Cond: (pg_type.oid = $1)
        Settings: enable_hashjoin = 'off', enable_mergejoin = 'off'

Individual pieces are confusing here since the query run by the
backend is not the one being EXPLAINed. A user may not know that the
queries being EXPLAINed arise from the function call multstmt(). So
they won't be able to stitch the pieces together unless they see the plan
of the outermost query with loops and costs. What might help is if we
explain each query in the hierarchy.

I think we can start with what auto_explain is doing. Always print the
plan of the outermost query; the query found in pg_stat_activity. In a
later version we might find a way to print plans of all the queries in
the stack and do so in a readable manner.

This makes tracking activeQueryDesc a bit tricky. My guess is that the
outermost query's descriptor will be available through ActivePortal
most of the time. But there are cases when ExecutorRun is called by
passing a queryDesc directly. So maybe there are some cases where
that's not true.
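
One way to track only the outermost query, sketched with assumed
variable names (the actual patch may handle this differently):

```c
static QueryDesc *activeQueryDesc = NULL;   /* what pg_log_query_plan() reports */
static int        exec_nest_level = 0;

void
ExecutorRun(QueryDesc *queryDesc, ScanDirection direction,
            uint64 count, bool execute_once)
{
    /* remember only the outermost QueryDesc */
    if (exec_nest_level++ == 0)
        activeQueryDesc = queryDesc;

    PG_TRY();
    {
        if (ExecutorRun_hook)
            (*ExecutorRun_hook) (queryDesc, direction, count, execute_once);
        else
            standard_ExecutorRun(queryDesc, direction, count, execute_once);
    }
    PG_FINALLY();
    {
        if (--exec_nest_level == 0)
            activeQueryDesc = NULL;
    }
    PG_END_TRY();
}
```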

--
Best Wishes,
Ashutosh Bapat


From: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
To: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
Cc: Étienne BERSAC <etienne(dot)bersac(at)dalibo(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, jtc331(at)gmail(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: RFC: Logging plan of the running query
Date: 2023-11-09 06:33:36
Message-ID: c25ae6015be96a1964eddd964657660b@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2023-11-06 15:32, Ashutosh Bapat wrote:

Thanks for the suggestion and example.

> On Fri, Nov 3, 2023 at 7:31 PM Ashutosh Bapat
> <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> wrote:
>>
>> I have the following questions related to the functionality. (Please point
>> me to the relevant discussion if this has already been discussed.)
>>
>> 1. When a backend is running nested queries, we will see the plan of
>> the innermost query. That query may not be the actual culprit if the
>> user query is running slowly, e.g. a query being run as the inner
>> side of a nested loop when the nested loop itself is the bottleneck. I
>> think it will be useful to print the plans of the whole query stack.

This was discussed in previous threads [1] and we thought it'd be useful,
but since it needed some extra work, we stopped widening the scope.

> To further explain this point, consider the following scenario
>
> -- A completely useless function which executes two SQL statements
> -- which individually take a small amount of time
> #create function multstmt() returns int
> language sql as $$
> select count(*) from pg_class where reltype = 12345;
> select count(*) from pg_type limit 10;
> $$;
>
> -- force a suboptimal plan
> #set enable_hashjoin to false;
> #set enable_mergejoin to false;
>
> -- A completely useless but long running query
> #select c.oid, t.oid from pg_class c, pg_type t where multstmt(c.oid)
> = multstmt(t.oid);
>
> This takes a few minutes on my laptop.
>
> In a separate session, query pg_stat_activity. We will see the original
> query
> #select pid, query, backend_type from pg_stat_activity where pid =
> 349330;
>   pid   |                                          query                                          |  backend_type
> --------+-----------------------------------------------------------------------------------------+----------------
>  349330 | select c.oid, t.oid from pg_class c, pg_type t where multstmt(c.oid) = multstmt(t.oid); | client backend
> (1 row)
>
> Run the plan output function a few times
> #select pg_log_query_plan(349330);
>
> You will observe different plans based on which innermost query
> is running.
> LOG:  query plan running on backend with PID 349330 is:
>         Query Text:
>         select count(*) from pg_class where reltype = typeid;
>         select count(*) from pg_type where oid = typeid;
>
>         Query Parameters: $1 = '600'
>         Aggregate  (cost=18.18..18.19 rows=1 width=8)
>           Output: count(*)
>           ->  Seq Scan on pg_catalog.pg_class  (cost=0.00..18.18 rows=2 width=0)
>                 Output: oid, relname, relnamespace, reltype, reloftype,
>                   relowner, relam, relfilenode, reltablespace, relpages,
>                   reltuples, relallvisible, reltoastrelid, relhasindex,
>                   relisshared, relpersistence, relkind, relnatts, relchecks,
>                   relhasrules, relhastriggers, relhassubclass, relrowsecurity,
>                   relforcerowsecurity, relispopulated, relreplident,
>                   relispartition, relrewrite, relfrozenxid, relminmxid,
>                   relacl, reloptions, relpartbound
>                 Filter: (pg_class.reltype = $1)
>         Settings: enable_hashjoin = 'off', enable_mergejoin = 'off'
> 2023-11-06 11:52:25.610 IST [349330] LOG:  query plan running on backend
> with PID 349330 is:
>         Query Text:
>         select count(*) from pg_class where reltype = typeid;
>         select count(*) from pg_type where oid = typeid;
>
>         Query Parameters: $1 = '2203'
>         Aggregate  (cost=4.30..4.31 rows=1 width=4)
>           Output: count(*)
>           ->  Index Only Scan using pg_type_oid_index on pg_catalog.pg_type  (cost=0.28..4.29 rows=1 width=0)
>                 Output: oid
>                 Index Cond: (pg_type.oid = $1)
>         Settings: enable_hashjoin = 'off', enable_mergejoin = 'off'
>
> Individual pieces are confusing here since the query run by the
> backend is not the one being EXPLAINed. A user may not know that the
> queries being EXPLAINed arise from the function call multstmt(). So
> they won't be able to stitch the pieces together unless they see the plan
> of the outermost query with loops and costs. What might help is if we
> explain each query in the hierarchy.
>
> I think we can start with what auto_explain is doing. Always print the
> plan of the outermost query; the query found in pg_stat_activity. In a
> later version we might find a way to print plans of all the queries in
> the stack and do so in a readable manner.

Agreed, there are cases where printing the plan of the outermost query is
more useful.

>
> This makes tracking activeQueryDesc a bit tricky. My guess is that the
> outermost query's descriptor will be available through ActivePortal
> most of the time. But there are cases when ExecutorRun is called by
> passing a queryDesc directly. So maybe there are some cases where
> that's not true.

Yeah, actually the original version of the patch got the plan from
ActivePortal, but it failed to log the plan when the query was something
like this [2]:

DO $$
BEGIN
  PERFORM pg_sleep(100);
END$$;

> 2. When a query is running in parallel worker do we want to print that
> query? It may or may not be interesting given the situation. If the
> overall plan itself is faulty, output of the parallel worker query is
> not useful. If the plan is fine but a given worker's leg is running
> slowly it may be interesting.

I think it can be useful.
I'm wondering if we can add this after the first patch for this feature
is committed.

> As a side note, you may want to fix the indentation in
> ExplainAssembleLogOutput().

Thanks, I modified it.
Since the documentation had become outdated, the attached patch also
updates it.

--
Regards,

--
Atsushi Torikoshi
NTT DATA Group Corporation

Attachment Content-Type Size
v33-0001-Add-function-to-log-the-plan-of-the-query.patch text/x-diff 30.3 KB

From: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
To: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
Cc: Étienne BERSAC <etienne(dot)bersac(at)dalibo(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, jtc331(at)gmail(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: RFC: Logging plan of the running query
Date: 2023-11-09 07:11:49
Message-ID: CAExHW5utSXjEMh6BWBPEe3Ks70T12VcGZqR8YH1owhZoROv4fw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Nov 9, 2023 at 12:03 PM torikoshia <torikoshia(at)oss(dot)nttdata(dot)com> wrote:
> >>
> >> 1. When a backend is running nested queries, we will see the plan of
> >> the innermost query. That query may not be the actual culprit if the
> >> user query is running slowly, e.g. a query being run as part of the
> >> inner side of a nested loop when the nested loop itself is the
> >> bottleneck. I think it will be useful to print plans of the whole
> >> query stack.
>
> This was discussed in previous threads[1] and we thought it'd be useful,
> but since it needed some extra work, we stopped widening the scope.
>
> >
> > I think we can start with what auto_explain is doing. Always print the
> > plan of the outermost query; the query found in pg_stat_activity. In a
> > later version we might find a way to print plans of all the queries in
> > the stack and do so in a readable manner.
>
> Agreed there are cases printing plan of the outermost query is more
> useful.
>

I am fine with printing the plan of the outermost query; this will help
in many cases. Printing plans of the whole query stack can be added as
an add-on later.

> >
> > This makes tracking activeQueryDesc a bit tricky. My guess is that the
> > outermost query's descriptor will be available through ActivePortal
> > most of the time. But there are cases when ExecutorRun is called by
> > passing a queryDesc directly. So may be there are some cases where
> > that's not true.
>
> Yeah, actually the original version of the patch got the plan from
> ActivePortal, but it failed logging plan when the query was something
> like this[2]:
>
> DO $$
> BEGIN
> PERFORM pg_sleep(100);
> END$$;

References [1] and [2] are not listed in your email.

Is that because there was no ActivePortal created, or because the
ActivePortal pointed to the DO block instead of the PERFORM pg_sleep?

>
> > 2. When a query is running in parallel worker do we want to print that
> > query? It may or may not be interesting given the situation. If the
> > overall plan itself is faulty, output of the parallel worker query is
> > not useful. If the plan is fine but a given worker's leg is running
> > slowly it may be interesting.
>
> I think it can be useful.
> I'm wondering if we can add this after the first patch for this feature
> is committed.

With the current patches, it will print the query from a parallel
worker backend. If that's not desirable, we should at least prohibit
that case.

--
Best Wishes,
Ashutosh Bapat


From: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
To: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
Cc: Étienne BERSAC <etienne(dot)bersac(at)dalibo(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, jtc331(at)gmail(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: RFC: Logging plan of the running query
Date: 2023-11-09 11:26:27
Message-ID: df4c8c0b088c773edcbff8dbf790ae9c@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2023-11-09 16:11, Ashutosh Bapat wrote:
> On Thu, Nov 9, 2023 at 12:03 PM torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
> wrote:
>> >>
>> >> 1. When a backend is running nested queries, we will see the plan of
>> >> the innermost query. That query may not be the actual culprit if the
>> >> user query is running slowly. E.g a query being run as part of inner
>> >> side nested loop when the nested loop itself is the bottleneck. I
>> >> think it will be useful to print plans of all the whole query stack.
>>
>> This was discussed in previous threads[1] and we thought it'd be
>> useful
>> but since it needed some extra works, we stopped widening the scope.
>>
>> >
>> > I think we can start with what auto_explain is doing. Always print the
>> > plan of the outermost query; the query found in pg_stat_activity. In a
>> > later version we might find a way to print plans of all the queries in
>> > the stack and do so in a readable manner.
>>
>> Agreed there are cases printing plan of the outermost query is more
>> useful.
>>
>
> I am fine printing the plan of the outermost query. This will help
> many cases. Printing plans of the whole query stack can be added as an
> add on later.
>
>> >
>> > This makes tracking activeQueryDesc a bit tricky. My guess is that the
>> > outermost query's descriptor will be available through ActivePortal
>> > most of the time. But there are cases when ExecutorRun is called by
>> > passing a queryDesc directly. So may be there are some cases where
>> > that's not true.
>>
>> Yeah, actually the original version of the patch got the plan from
>> ActivePortal, but it failed logging plan when the query was something
>> like this[2]:
>>
>> DO $$
>> BEGIN
>> PERFORM pg_sleep(100);
>> END$$;
>
> References [1] and [2] are not listed in your email.

Oops, sorry. Here are links:

[1]
/message-id/64f716c44629e303b66e6c24502147cc%40oss.nttdata.com
[2]
/message-id/632e99eb-8090-53e6-1b1a-101601904cbd%40oss.nttdata.com

> Is that because there was no ActivePortal created or the ActivePortal
> pointed to DO block instead of PERFORM pg_sleep?

ActivePortal is created but ActivePortal->queryDesc is null.

>> > 2. When a query is running in parallel worker do we want to print that
>> > query? It may or may not be interesting given the situation. If the
>> > overall plan itself is faulty, output of the parallel worker query is
>> > not useful. If the plan is fine but a given worker's leg is running
>> > slowly it may be interesting.
>>
>> I think it can be useful.
>> I'm wondering if we can add this after the first patch for this
>> feature
>> is committed.
>
> With the current patches, it will print the query from a parallel
> backend. If that's not desirable we should prohibit that case at
> least.

The current patch prohibits printing the plan if the backend type is
parallel worker, as below:

=# select pg_log_query_plan(pid), backend_type from pg_stat_activity
where backend_type = 'parallel worker';

 pg_log_query_plan |  backend_type
-------------------+-----------------
 f                 | parallel worker
 f                 | parallel worker
(2 rows)

WARNING: PID 4618 is not a PostgreSQL client backend process
WARNING: PID 4617 is not a PostgreSQL client backend process

Is this the behavior you expect?

--
Regards,

--
Atsushi Torikoshi
NTT DATA Group Corporation


From: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
To: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
Cc: Étienne BERSAC <etienne(dot)bersac(at)dalibo(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, jtc331(at)gmail(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: RFC: Logging plan of the running query
Date: 2023-11-09 14:58:48
Message-ID: CAExHW5sh4ahrJgmMAGfptWVmESt1JLKCNm148XVxTunRr+-6gA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Nov 9, 2023 at 4:56 PM torikoshia <torikoshia(at)oss(dot)nttdata(dot)com> wrote:
>
> On 2023-11-09 16:11, Ashutosh Bapat wrote:
> > On Thu, Nov 9, 2023 at 12:03 PM torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
> > wrote:
> >> >>
> >> >> 1. When a backend is running nested queries, we will see the plan of
> >> >> the innermost query. That query may not be the actual culprit if the
> >> >> user query is running slowly. E.g a query being run as part of inner
> >> >> side nested loop when the nested loop itself is the bottleneck. I
> >> >> think it will be useful to print plans of all the whole query stack.
> >>
> >> This was discussed in previous threads[1] and we thought it'd be
> >> useful
> >> but since it needed some extra works, we stopped widening the scope.
> >>
> >> >
> >> > I think we can start with what auto_explain is doing. Always print the
> >> > plan of the outermost query; the query found in pg_stat_activity. In a
> >> > later version we might find a way to print plans of all the queries in
> >> > the stack and do so in a readable manner.
> >>
> >> Agreed there are cases printing plan of the outermost query is more
> >> useful.
> >>
> >
> > I am fine printing the plan of the outermost query. This will help
> > many cases. Printing plans of the whole query stack can be added as an
> > add on later.
> >
> >> >
> >> > This makes tracking activeQueryDesc a bit tricky. My guess is that the
> >> > outermost query's descriptor will be available through ActivePortal
> >> > most of the time. But there are cases when ExecutorRun is called by
> >> > passing a queryDesc directly. So may be there are some cases where
> >> > that's not true.
> >>
> >> Yeah, actually the original version of the patch got the plan from
> >> ActivePortal, but it failed logging plan when the query was something
> >> like this[2]:
> >>
> >> DO $$
> >> BEGIN
> >> PERFORM pg_sleep(100);
> >> END$$;
> >
> > References [1] and [2] are not listed in your email.
>
> Oops, sorry. Here are links:
>
> [1]
> /message-id/64f716c44629e303b66e6c24502147cc%40oss.nttdata.com
> [2]
> /message-id/632e99eb-8090-53e6-1b1a-101601904cbd%40oss.nttdata.com
>
> > Is that because there was no ActivePortal created or the ActivePortal
> > pointed to DO block instead of PERFORM pg_sleep?
>
> ActivePortal is created but ActivePortal->queryDesc is null.

Thanks.

>
> >> > 2. When a query is running in parallel worker do we want to print that
> >> > query? It may or may not be interesting given the situation. If the
> >> > overall plan itself is faulty, output of the parallel worker query is
> >> > not useful. If the plan is fine but a given worker's leg is running
> >> > slowly it may be interesting.
> >>
> >> I think it can be useful.
> >> I'm wondering if we can add this after the first patch for this
> >> feature
> >> is committed.
> >
> > With the current patches, it will print the query from a parallel
> > backend. If that's not desirable we should prohibit that case at
> > least.
>
> Current patch prohibits printing plan if backend type is parallel worker
> as below:
>
> =# select pg_log_query_plan(pid), backend_type from pg_stat_activity
> where backend_type = 'parallel worker';
>
> pg_log_query_plan | backend_type
> -------------------+-----------------
> f | parallel worker
> f | parallel worker
> (2 rows)
>
> WARNING: PID 4618 is not a PostgreSQL client backend process
> WARNING: PID 4617 is not a PostgreSQL client backend process
>
> Is this the behavior you expect?
>

I misread then. Thanks for correcting me. We could consider plans from
parallel workers in v2 of this feature.

--
Best Wishes,
Ashutosh Bapat


From: Rafael Thofehrn Castro <rafaelthca(at)gmail(dot)com>
To: torikoshia(at)oss(dot)nttdata(dot)com
Cc: Étienne BERSAC <etienne(dot)bersac(at)dalibo(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, jtc331(at)gmail(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org, ashutosh(dot)bapat(dot)oss(at)gmail(dot)com
Subject: Re: RFC: Logging plan of the running query
Date: 2023-12-06 23:33:56
Message-ID: CAG0ozMqmjpUG65sDQ+RA41mhr_1+Wf2Kk88V4_3MDCy9Xk4-+w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hello hackers,

Last Saturday I submitted a patch to the pgsql-hackers list titled
"Proposal: In-flight explain logging", proposing a feature very similar
to the one being worked on in this thread. I should have done a better
search in the commitfest before implementing something from scratch.

So, as recommended by Ashutosh, I am sending an incremental patch containing
an additional feature I personally think we should include: logging the plan
with instrumentation details if enabled.

When targeting a query with instrumentation, PG should log the complete
EXPLAIN ANALYZE plan with current row counts and, if enabled, timing for
each node. This gives the user not only the ability to see what the
plan is but also what has been executed so far, which is super useful
when troubleshooting queries that never finish.

Considering that the query is in progress, the output will include the
label (never executed) for nodes that haven't been touched yet (or may
never be). This capability is already present in the current
ExplainNode implementation.

I added a new label (in progress) for nodes currently being executed,
i.e., InstrStartNode was called and the clock is ticking there.

Back-reading this thread, I saw the discussion about whether to extend
pg_log_query_plan to parallel workers. I added it in my incremental
patch, as the new capability of logging instrumentation makes it useful
to see what parallel workers are currently doing.

# DEMONSTRATION:

postgres=# select pid, backend_type, pg_log_query_plan(pid)
postgres-# from pg_stat_activity
postgres-# where (backend_type = 'client backend' and pid != pg_backend_pid())
postgres-# or backend_type = 'parallel worker';
  pid  |  backend_type   | pg_log_query_plan
-------+-----------------+-------------------
 33833 | client backend  | t
 47202 | parallel worker | t
 47203 | parallel worker | t
(3 rows)

2023-12-06 23:14:41.756 UTC [33833] LOG: query plan running on backend
with PID 33833 is:
Query Text: explain (analyze, buffers) select *
from t2 a
inner join t1 b on a.c1=b.c1
inner join t1 c on a.c1=c.c1
inner join t1 d on a.c1=d.c1
inner join t1 e on a.c1=e.c1;
Gather (cost=70894.63..202643.27 rows=1000000 width=20) (never executed)
Output: a.c1, b.c1, c.c1, d.c1, e.c1
Workers Planned: 2
Workers Launched: 2
-> Parallel Hash Join (cost=69894.63..101643.27 rows=416667 width=20)
(never executed)
Output: a.c1, b.c1, c.c1, d.c1, e.c1
Hash Cond: (a.c1 = e.c1)
-> Parallel Hash Join (cost=54466.62..77218.65 rows=416667
width=16) (never executed)
Output: a.c1, b.c1, c.c1, d.c1
Hash Cond: (a.c1 = c.c1)
-> Parallel Hash Join (cost=15428.00..29997.42 rows=416667
width=8) (actual time=2500.914..2625.922 rows=250412 loops=1) (in progress)
Output: a.c1, b.c1
Inner Unique: true
Hash Cond: (b.c1 = a.c1)
Buffers: shared hit=755 read=2175, temp read=1473
written=1860
-> Parallel Seq Scan on public.t1 b
(cost=0.00..8591.67 rows=416667 width=4) (actual time=0.022..20.904
rows=331492 loops=1)
Output: b.c1
Buffers: shared hit=602 read=865
-> Parallel Hash (cost=8591.67..8591.67 rows=416667
width=4) (actual time=1745.107..1745.107 rows=330638 loops=1)
Output: a.c1
Buffers: shared hit=153 read=1310, temp written=868
-> Parallel Seq Scan on public.t2 a
(cost=0.00..8591.67 rows=416667 width=4) (actual time=0.042..27.695
rows=330638 loops=1)
Output: a.c1
Buffers: shared hit=153 read=1310
-> Parallel Hash (cost=32202.28..32202.28 rows=416667
width=8) (actual time=2450.489..2450.489 rows=407941 loops=1)
Output: c.c1, d.c1
Buffers: shared hit=1141 read=1833, temp read=1938
written=2836
-> Parallel Hash Join (cost=15428.00..32202.28
rows=416667 width=8) (actual time=1323.422..1575.245 rows=407941 loops=1)
Output: c.c1, d.c1
Hash Cond: (c.c1 = d.c1)
Buffers: shared hit=1141 read=1833, temp read=1938
written=1864
-> Parallel Seq Scan on public.t1 c
(cost=0.00..8591.67 rows=416667 width=4) (actual time=0.026..22.223
rows=336238 loops=1)
Output: c.c1
Buffers: shared hit=590 read=898
-> Parallel Hash (cost=8591.67..8591.67
rows=416667 width=4) (actual time=653.306..653.306 rows=335836 loops=1)
Output: d.c1
Buffers: shared hit=551 read=935, temp
written=872
-> Parallel Seq Scan on public.t1 d
(cost=0.00..8591.67 rows=416667 width=4) (actual time=0.022..23.127
rows=335836 loops=1)
Output: d.c1
Buffers: shared hit=551 read=935
-> Parallel Hash (cost=8591.67..8591.67 rows=416667 width=4)
(actual time=590.086..590.086 rows=343696 loops=1)
Output: e.c1
Buffers: shared hit=519 read=1002, temp written=896
-> Parallel Seq Scan on public.t1 e (cost=0.00..8591.67
rows=416667 width=4) (actual time=0.066..21.797 rows=343696 loops=1)
Output: e.c1
Buffers: shared hit=519 read=1002
2023-12-06 23:14:41.757 UTC [47203] LOG: query plan running on backend
with PID 47203 is:
Query Text: explain (analyze, buffers) select *
from t2 a
inner join t1 b on a.c1=b.c1
inner join t1 c on a.c1=c.c1
inner join t1 d on a.c1=d.c1
inner join t1 e on a.c1=e.c1;
Parallel Hash Join (cost=69894.63..101643.27 rows=416667 width=20) (never
executed)
Output: a.c1, b.c1, c.c1, d.c1, e.c1
Hash Cond: (a.c1 = e.c1)
-> Parallel Hash Join (cost=54466.62..77218.65 rows=416667 width=16)
(never executed)
Output: a.c1, b.c1, c.c1, d.c1
Hash Cond: (a.c1 = c.c1)
-> Parallel Hash Join (cost=15428.00..29997.42 rows=416667
width=8) (actual time=2464.367..2628.476 rows=258293 loops=1)
Output: a.c1, b.c1
Inner Unique: true
Hash Cond: (b.c1 = a.c1)
Buffers: shared hit=704 read=2213, temp read=1497 written=1860
-> Parallel Seq Scan on public.t1 b (cost=0.00..8591.67
rows=416667 width=4) (actual time=0.032..22.247 rows=330412 loops=1)
Output: b.c1
Buffers: shared hit=594 read=868
-> Parallel Hash (cost=8591.67..8591.67 rows=416667 width=4)
(actual time=1745.093..1745.093 rows=328830 loops=1)
Output: a.c1
Buffers: shared hit=110 read=1345, temp written=868
-> Parallel Seq Scan on public.t2 a
(cost=0.00..8591.67 rows=416667 width=4) (actual time=1.141..30.128
rows=328830 loops=1)
Output: a.c1
Buffers: shared hit=110 read=1345
-> Parallel Hash (cost=32202.28..32202.28 rows=416667 width=8)
(actual time=2449.694..2449.694 rows=295462 loops=1)
Output: c.c1, d.c1
Buffers: shared hit=1161 read=1844, temp read=1971 written=2872
-> Parallel Hash Join (cost=15428.00..32202.28 rows=416667
width=8) (actual time=1378.678..1577.182 rows=295462 loops=1)
Output: c.c1, d.c1
Hash Cond: (c.c1 = d.c1)
Buffers: shared hit=1161 read=1844, temp read=1971
written=1856
-> Parallel Seq Scan on public.t1 c
(cost=0.00..8591.67 rows=416667 width=4) (actual time=0.031..22.459
rows=336288 loops=1)
Output: c.c1
Buffers: shared hit=590 read=898
-> Parallel Hash (cost=8591.67..8591.67 rows=416667
width=4) (actual time=651.048..651.048 rows=328378 loops=1)
Output: d.c1
Buffers: shared hit=507 read=946, temp written=860
-> Parallel Seq Scan on public.t1 d
(cost=0.00..8591.67 rows=416667 width=4) (actual time=0.034..24.462
rows=328378 loops=1)
Output: d.c1
Buffers: shared hit=507 read=946
-> Parallel Hash (cost=8591.67..8591.67 rows=416667 width=4) (actual
time=575.417..575.417 rows=327926 loops=1)
Output: e.c1
Buffers: shared hit=522 read=929, temp written=860
-> Parallel Seq Scan on public.t1 e (cost=0.00..8591.67
rows=416667 width=4) (actual time=0.786..23.114 rows=327926 loops=1)
Output: e.c1
Buffers: shared hit=522 read=929
2023-12-06 23:14:41.758 UTC [47202] LOG: query plan running on backend
with PID 47202 is:
Query Text: explain (analyze, buffers) select *
from t2 a
inner join t1 b on a.c1=b.c1
inner join t1 c on a.c1=c.c1
inner join t1 d on a.c1=d.c1
inner join t1 e on a.c1=e.c1;
Parallel Hash Join (cost=69894.63..101643.27 rows=416667 width=20) (never
executed)
Output: a.c1, b.c1, c.c1, d.c1, e.c1
Hash Cond: (a.c1 = e.c1)
-> Parallel Hash Join (cost=54466.62..77218.65 rows=416667 width=16)
(never executed)
Output: a.c1, b.c1, c.c1, d.c1
Hash Cond: (a.c1 = c.c1)
-> Parallel Hash Join (cost=15428.00..29997.42 rows=416667
width=8) (actual time=2500.138..2631.855 rows=254125 loops=1)
Output: a.c1, b.c1
Inner Unique: true
Hash Cond: (b.c1 = a.c1)
Buffers: shared hit=749 read=2254, temp read=1483 written=1896
-> Parallel Seq Scan on public.t1 b (cost=0.00..8591.67
rows=416667 width=4) (actual time=0.027..22.224 rows=338096 loops=1)
Output: b.c1
Buffers: shared hit=628 read=868
-> Parallel Hash (cost=8591.67..8591.67 rows=416667 width=4)
(actual time=1744.616..1744.616 rows=340532 loops=1)
Output: a.c1
Buffers: shared hit=121 read=1386, temp written=892
-> Parallel Seq Scan on public.t2 a
(cost=0.00..8591.67 rows=416667 width=4) (actual time=1.138..30.003
rows=340532 loops=1)
Output: a.c1
Buffers: shared hit=121 read=1386
-> Parallel Hash (cost=32202.28..32202.28 rows=416667 width=8)
(actual time=2450.470..2450.470 rows=296597 loops=1)
Output: c.c1, d.c1
Buffers: shared hit=1186 read=1813, temp read=1610 written=2872
-> Parallel Hash Join (cost=15428.00..32202.28 rows=416667
width=8) (actual time=1380.286..1561.412 rows=296597 loops=1)
Output: c.c1, d.c1
Hash Cond: (c.c1 = d.c1)
Buffers: shared hit=1186 read=1813, temp read=1610
written=1848
-> Parallel Seq Scan on public.t1 c
(cost=0.00..8591.67 rows=416667 width=4) (actual time=0.032..22.170
rows=327474 loops=1)
Output: c.c1
Buffers: shared hit=548 read=901
-> Parallel Hash (cost=8591.67..8591.67 rows=416667
width=4) (actual time=653.297..653.297 rows=335786 loops=1)
Output: d.c1
Buffers: shared hit=574 read=912, temp written=868
-> Parallel Seq Scan on public.t1 d
(cost=0.00..8591.67 rows=416667 width=4) (actual time=0.023..24.210
rows=335786 loops=1)
Output: d.c1
Buffers: shared hit=574 read=912
-> Parallel Hash (cost=8591.67..8591.67 rows=416667 width=4) (actual
time=574.553..574.553 rows=328378 loops=1)
Output: e.c1
Buffers: shared hit=495 read=958, temp written=864
-> Parallel Seq Scan on public.t1 e (cost=0.00..8591.67
rows=416667 width=4) (actual time=0.717..23.270 rows=328378 loops=1)
Output: e.c1
Buffers: shared hit=495 read=958

The thread of my initial patch contains all important implementation
details. Sharing the ones relevant to my incremental patch here:

- Safely printing signaled plans with instrumentation

A plan string is built in the function ExplainNode here (
https://github.com/postgres/postgres/blob/REL_16_STABLE/src/backend/commands/explain.c#L1178
),
which is called at the end of query execution when EXPLAIN is used.
That function performs its logic using a PlanState (part of QueryDesc)
of the running query and an ExplainState.

The main challenge there is that ExplainNode calls InstrEndLoop, which
changes values in Instrumentation. This is ok for a regular EXPLAIN,
where the query is already complete, but not ok for the new feature
with signaled plan logging.

So the new code has custom logic to clone the Instrumentation instance
of the current node. The cloned object can be safely written.

The function InstrEndLoop has a safety rule here (
https://github.com/postgres/postgres/blob/REL_16_STABLE/src/backend/executor/instrument.c#L148
)
that prevents adjusting instrumentation details in a running node. This
never happens in the current code logic, but with the new feature it
will happen very often.

I didn't want to remove this safety rule, as InstrEndLoop gets called in
other places too (even in auto_explain), so the solution was to keep
InstrEndLoop and add a new InstrEndLoopForce for the signaled plan
logging with instrumentation. Both InstrEndLoop and InstrEndLoopForce
call a new internal InstrEndLoopInternal to avoid duplicating the code.

- Memory management

Considering that pg_log_query_plan() creates its own memory context before
calling ExplainAssembleLogOutput, the new logic that allocates memory
to clone instrumentation doesn't need to free anything.

# FINAL CONSIDERATIONS

Let me know what you think about this incremental patch. If you think it
is useful but needs adjustments, I will be happy to change it as needed.

Kind Regards,

Rafael Castro.

Attachment Content-Type Size
v34-0001-Add-function-to-log-the-plan-of-the-query.patch application/octet-stream 30.3 KB
v34-0002-Log-plan-along-with-instrumentation-details.patch application/octet-stream 11.2 KB

From: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
To: Rafael Thofehrn Castro <rafaelthca(at)gmail(dot)com>
Cc: Étienne BERSAC <etienne(dot)bersac(at)dalibo(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, jtc331(at)gmail(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org, ashutosh(dot)bapat(dot)oss(at)gmail(dot)com
Subject: Re: RFC: Logging plan of the running query
Date: 2023-12-11 05:46:23
Message-ID: c161b5e7e1888eb9c9eb182a7d9dcf89@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2023-12-07 08:33, Rafael Thofehrn Castro wrote:
> Hello hackers,
>
> Last Saturday I submitted a patch to the pgsql-hackers list with the
> title
> "Proposal: In-flight explain logging" with a patch proposing a feature
> very
> similar to the one being worked on in this thread. I should have done
> a better
> search in the commitfest before implementing something from scratch.
>
> So, as recommended by Ashutosh, I am sending an incremental patch
> containing
> an additional feature I personally think we should include: logging
> the plan
> with instrumentation details if enabled.

Thanks for the proposal and making the patch!

> When targeting a query with instrumentation PG should log the complete
> EXPLAIN ANALYZE plan with current row count and, if enabled, timing
> for each
> node. This gives the user not only the ability to see what the plan is
> but also what was executed so far, which is super useful when
> troubleshooting queries that never finish.

I think showing the progress of the query execution would be useful.

OTOH it seems to need at least some modifications around
Instrumentation, as in your patch.
As a first step, I think it would be better to minimize the scope and
focus on the fundamental functionality.
For the same reason, getting plans from parallel workers is also
prohibited in the current patch, as discussed here[1].

[1]
/message-id/c25ae6015be96a1964eddd964657660b%40oss.nttdata.com

So I think the below steps would be better than pushing all the
functionality into the 1st commit:

- First, develop the function to log the query plan
(v34-0001-Add-function-to-log-the-plan-of-the-query.patch).
- Then enhance the function, e.g. by showing the progress of the query
execution (v34-0002-Log-plan-along-with-instrumentation-details.patch),
etc.

> /message-id/CAG0ozMp3g3drnkDa6RZxXO_OmnisL2sD9vBrmpu5fOBoYpC-3w%40mail.gmail.com
> - ExplainState customization
>
> An ExplainState is allocated and customized for the in-flight logging.
> Instrumentation-related settings are enabled based on how the target
> query started, which is usually via EXPLAIN ANALYZE or with
> auto_explain.

Does this mean the progress can be obtained only when the target query
was run with EXPLAIN ANALYZE or auto_explain.log_analyze?

If so, there might be limited situations in which we can get the
progress: I imagine EXPLAIN ANALYZE is used when the user wants to get
the plan from the beginning, and auto_explain.log_analyze can have a
negative impact on performance as described in the manual, so there may
not be many environments which enable it.
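
For reference, a minimal sketch of the session settings that make a
query carry instrumentation via auto_explain (the values are
illustrative; log_analyze is the one with the documented overhead):

LOAD 'auto_explain';
SET auto_explain.log_min_duration = 0;
SET auto_explain.log_analyze = on;
SET auto_explain.log_timing = on;  -- optional per-node timing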

--
Regards,

--
Atsushi Torikoshi
NTT DATA Group Corporation


From: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Cc: Étienne BERSAC <etienne(dot)bersac(at)dalibo(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, jtc331(at)gmail(dot)com, ashutosh(dot)bapat(dot)oss(at)gmail(dot)com, rafaelthca(at)gmail(dot)com
Subject: Re: RFC: Logging plan of the running query
Date: 2024-01-29 13:02:29
Message-ID: 61b4add9a01f21011789b6ee04085751@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

Updated the patch to fix typos and move
ProcessLogQueryPlanInterruptActive from errfinish() to AbortTransaction.

BTW since the thread is getting long, I list some points of the
discussion so far:

# Safety concern
## Catalog access inside CFI
- it seems safe if the CFI call is inside an existing valid
transaction/query state[1]

- We did some tests, for example calling ProcessLogQueryPlanInterrupt()
in every single CHECK_FOR_INTERRUPTS()[2]. This test passed on my env
but got stuck on James's env, so I modified it to exit
ProcessLogQueryPlanInterrupt() when the target process is inside lock
acquisition code[3]

## Risk of calling EXPLAIN code in CFI
- EXPLAIN is not simple code, and there is a risk in calling it from
CFI. For example, if there is a bug, we may find ourselves in a
situation where we can't cancel the query

- it's a trade-off that's worth making for the introspection benefits
this patch would provide?[4]

# Design
- Although some suggested it should be in auto_explain, the current
patch introduces this feature into core[5]

- When the target query is nested, only the innermost query's plan is
explained. In the future, all the nested queries' plans are expected to
be explained optionally, like auto_explain.log_nested_statements[6]

- When the target process is a parallel worker, the plan is not shown[6]

- When the target query is nested and its subtransaction is aborted,
pg_log_query_plan cannot log the parent query's plan after the abort,
even though the parent query is still running[7]

- The output corresponds to EXPLAIN with VERBOSE, COSTS, SETTINGS and
FORMAT TEXT. It doesn't do ANALYZE or show the progress of the query
execution. Future work proposed by Rafael Thofehrn Castro may realize
this[8]

- To prevent an assertion error, this patch ensures no page lock is held
by checking all the LocalLock entries before running the explain code,
but there is a discussion that ginInsertCleanup() should be modified
instead[9]

It may not be so difficult to lift some of the restrictions in "Design",
but I'd like to limit the scope of the 1st patch to keep it simple.

[1]
/message-id/CAAaqYe9euUZD8bkjXTVcD9e4n5c7kzHzcvuCJXt-xds9X4c7Fw%40mail.gmail.com
[2]
/message-id/CAAaqYe8LXVXQhYy3yT0QOHUymdM%3Duha0dJ0%3DBEPzVAx2nG1gsw%40mail.gmail.com
[3]
/message-id/0e0e7ca08dff077a625c27a5e0c2ef0a%40oss.nttdata.com
[4]
/message-id/CAAaqYe8LXVXQhYy3yT0QOHUymdM%3Duha0dJ0%3DBEPzVAx2nG1gsw%40mail.gmail.com
[5]
/message-id/CAAaqYe_1EuoTudAz8mr8-qtN5SoNtYRm4JM2J9CqeverpE3B2A%40mail.gmail.com
[6]
/message-id/CAExHW5sh4ahrJgmMAGfptWVmESt1JLKCNm148XVxTunRr%2B-6gA%40mail.gmail.com
[7]
/message-id/3d121ed5f81cef588bac836b43f5d1f9%40oss.nttdata.com
[8]
/message-id/c161b5e7e1888eb9c9eb182a7d9dcf89%40oss.nttdata.com
[9]
/message-id/20220201.172757.1480996662235658750.horikyota.ntt%40gmail.com

--
Regards,

--
Atsushi Torikoshi
NTT DATA Group Corporation

Attachment Content-Type Size
v35-0001-Add-function-to-log-the-plan-of-the-query.patch text/x-diff 30.1 KB

From: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
To: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, Étienne BERSAC <etienne(dot)bersac(at)dalibo(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, jtc331(at)gmail(dot)com, rafaelthca(at)gmail(dot)com
Subject: Re: RFC: Logging plan of the running query
Date: 2024-02-06 10:51:58
Message-ID: CAExHW5vEi_8RJC2X2mp+RSAvTXhh4NLuNd+7tQGb7P7jxgdKSg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi Atsushi,

On Mon, Jan 29, 2024 at 6:32 PM torikoshia <torikoshia(at)oss(dot)nttdata(dot)com> wrote:
>
> Hi,
>
> Updated the patch to fix typos and move
> ProcessLogQueryPlanInterruptActive from errfinish() to AbortTransaction.
>
>
> BTW since the thread is getting long, I list the some points of the
> discussion so far:
>
> # Safety concern
> ## Catalog access inside CFI
> - it seems safe if the CFI call is inside an existing valid
> transaction/query state[1]
>
> - We did some tests, for example calling ProcessLogQueryPlanInterrupt()
> in every single CHECK_FOR_INTERRUPTS()[2]. This test passed on my env
> but was stucked on James's env, so I modified to exit
> ProcessLogQueryPlanInterrupt() when target process is inside of lock
> acquisition code[3]
>
> ## Risk of calling EXPLAIN code in CFI
> - EXPLAIN is not a simple logic code, and there exists risk calling it
> from CFI. For example, if there is a bug, we may find ourselves in a
> situation where we can't cancel the query
>
> - it's a trade-off that's worth making for the introspection benefits
> this patch would provide?[4]
>
> # Design
> - Although some suggested it should be in auto_explain, current patch
> introduces this feature to the core[5]
>
> - When the target query is nested, only the most inner query's plan is
> explained. In future, all the nested queries' plans are expected to
> explained optionally like auto_explain.log_nested_statements[6]
>
> - When the target process is a parallel worker, the plan is not shown[6]
>
> - When the target query is nested and its subtransaction is aborted,
> pg_log_query_plan cannot log the parental query plan after the abort
> even parental query is running[7]
>
> - The output corresponds to EXPLAIN with VERBOSE, COST, SETTINGS and
> FORMAT text. It doesn't do ANALYZE or show the progress of the query
> execution. Future work proposed by Rafael Thofehrn Castro may realize
> this[8]
>
> - To prevent assertion error, this patch ensures no page lock is held by
> checking all the LocalLock entries before running explain code, but
> there is a discussion that ginInsertCleanup() should be modified[9]
>
>
> It may be not so difficult to improve some of restrictions in "Design",
> but I'd like to limit the scope of the 1st patch to make it simpler.

Thanks for the summary. It is helpful. I think the patch is also getting better.

I have a few questions and suggestions
1. The prologue of GetLockMethodLocalHash() mentions
 * NOTE: When there are many entries in LockMethodLocalHash, calling this
 * function and looking into all of them can lead to performance problems.
 */
How bad could this performance hit be? Let's assume that a query is
taking time and pg_log_query_plan() is invoked to examine the plan of
this query. Is it possible that the looping over all the locks itself
takes a lot of time, delaying the query execution further?

2. What happens if auto_explain is enabled in the backend and
pg_log_query_plan() is called on the same backend? Will they conflict?
I think we should add a test for the same.

Using injection point support, we should be able to add tests for
pg_log_query_plan's behaviour when there are page locks held, or when
auto_explain (with instrumentation) and pg_log_query_plan() work on the
same query plan. Use an injection point to make the backend running the
query wait at a suitable point to delay its execution, and fire
pg_log_query_plan() from another backend. Maybe the same test could
examine the server log file to see if the plan is indeed output there;
a sketch follows.
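
One way such a test might be driven from SQL (a rough sketch, assuming
a build with injection points, the injection_points test module, and a
hypothetical injection point added somewhere in the executor; the point
name below is made up):

-- session 1
CREATE EXTENSION injection_points;
SELECT injection_points_attach('executor-run-mid-query', 'wait');
SELECT count(*) FROM pg_class;  -- blocks at the injection point

-- session 2
SELECT pg_log_query_plan(12345);  -- 12345 being session 1's PID
-- check the server log for the plan, then release session 1:
SELECT injection_points_wakeup('executor-run-mid-query');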

Given that the feature will be used when things have already gone
wrong, it should not make things more serious. So more testing,
especially automated testing, would help.

--
Best Wishes,
Ashutosh Bapat


From: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
To: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, Étienne BERSAC <etienne(dot)bersac(at)dalibo(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, jtc331(at)gmail(dot)com, rafaelthca(at)gmail(dot)com
Subject: Re: RFC: Logging plan of the running query
Date: 2024-02-07 04:08:24
Message-ID: ab2099c9aa64690851b444b6ea562988@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi Ashutosh,

On 2024-02-06 19:51, Ashutosh Bapat wrote:

> Thanks for the summary. It is helpful. I think patch is also getting
> better.
>
> I have a few questions and suggestions

Thanks for your comments.

> 1. Prologue of GetLockMethodLocalHash() mentions
> * NOTE: When there are many entries in LockMethodLocalHash, calling
> this
> * function and looking into all of them can lead to performance
> problems.
> */
> How bad this performance could be. Let's assume that a query is taking
> time and pg_log_query_plan() is invoked to examine the plan of this
> query. Is it possible that the looping over all the locks itself takes
> a lot of time delaying the query execution further?

I think it depends on the number of local locks, but I've measured the
cpu time for this page lock check, just for your information, by adding
the code below together with
v27-0002-Testing-attempt-logging-plan-on-ever-CFI-call.patch[1], which
calls ProcessLogQueryPlanInterrupt() in every CFI, on my laptop:

diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 5f7d77d567..65b7cb4925 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -44,6 +44,8 @@

+#include "time.h"
...
@@ -5287,6 +5292,7 @@ ProcessLogQueryPlanInterrupt(void)
      * we check all the LocalLock entries and when finding even one, give up
      * logging the plan.
      */
+    start = clock();
     hash_seq_init(&status, GetLockMethodLocalHash());
     while ((locallock = (LOCALLOCK *) hash_seq_search(&status)) != NULL)
     {
         if (LOCALLOCK_LOCKTAG(*locallock) == LOCKTAG_PAGE)
         {
             ereport(LOG_SERVER_ONLY,
                     errmsg("ignored request for logging query plan due to page lock conflicts"),
                     errdetail("You can try again in a moment."));
             hash_seq_term(&status);

             ProcessLogQueryPlanInterruptActive = false;
             return;
         }
     }
+    end = clock();
+    cpu_time_used = ((double) (end - start)) / CLOCKS_PER_SEC;
+
+    ereport(LOG,
+            errmsg("all locallock entry search took: %f", cpu_time_used));
+

There were about 3 million log lines which recorded the cpu time, and
the duration was quite short:

=# -- Extracted cpu_time_used from the log and loaded it into cpu_time.d.
=# select max(d), min(d), avg(d) from cpu_time;
   max    | min |          avg
----------+-----+-----------------------
 0.000116 |   0 | 4.706274625332238e-07

I'm not certain that this is valid for actual use cases, but these
results seem to suggest that it will not take that long.

> 2. What happens if auto_explain is enabled in the backend and
> pg_log_query_plan() is called on the same backend? Will they conflict?
> I think we should add a test for the same.

Hmm, I think they don't conflict since they just refer to the QueryDesc
without modifying it and don't use the same objects for locking.
(I imagine 'conflict' here means something like a 'hard conflict' in
replication[2].)

Actually, using both auto_explain and pg_log_query_plan() outputs each
log separately:

(pid:62835)=# select pg_sleep(10);
(pid:70000)=# select pg_log_query_plan(62835);

(pid:70000)=# \! cat data/log/postgres.log
...
2024-02-06 21:44:17.837 JST [62835:4:0] LOG: 00000: query plan
running on backend with PID 62835 is:
Query Text: select pg_sleep(10);
Result (cost=0.00..0.01 rows=1 width=4)
Output: pg_sleep('10'::double precision)
Query Identifier: 3506829283127886044
2024-02-06 21:44:17.837 JST [62835:5:0] LOCATION:
ProcessLogQueryPlanInterrupt, explain.c:5336
2024-02-06 21:44:26.974 JST [62835:6:0] LOG: 00000: duration:
10000.868 ms plan:
Query Text: select pg_sleep(10);
Result (cost=0.00..0.01 rows=1 width=4) (actual
time=10000.802..10000.804 rows=1 loops=1)

> Using injection point support we should be able to add tests for
> testing pg_log_query_plan behaviour when there are page locks held or
> when auto_explain (with instrumentation) and pg_log_query_plan() work
> on the same query plan. Use injection point to make the backend
> running query wait at a suitable point to delay its execution and fire
> pg_log_query_plan() from other backend. May be the same test could
> examine the server log file to see if the plan is indeed output to the
> server log file.
>
> Given that the feature will be used when the things have already gone
> wrong, it should not make things more serious. So more testing and
> especially automated would help.

Thanks for the advice, it seems like a good idea.
I'm going to try to add tests using injection points.

[1]
/message-id/CAAaqYe8LXVXQhYy3yT0QOHUymdM%3Duha0dJ0%3DBEPzVAx2nG1gsw%40mail.gmail.com
[2]
/docs/devel/hot-standby.html#HOT-STANDBY-CONFLICT

--
Regards,

--
Atsushi Torikoshi
NTT DATA Group Corporation


From: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
To: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, Étienne BERSAC <etienne(dot)bersac(at)dalibo(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, jtc331(at)gmail(dot)com, rafaelthca(at)gmail(dot)com
Subject: Re: RFC: Logging plan of the running query
Date: 2024-02-07 04:58:18
Message-ID: CAExHW5ukAdmnq2B8gVJ4ufWFUrQvFYtoMf_fGgpsDkTtEF4oZw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Feb 7, 2024 at 9:38 AM torikoshia <torikoshia(at)oss(dot)nttdata(dot)com> wrote:
>
> Hi Ashutosh,
>
> On 2024-02-06 19:51, Ashutosh Bapat wrote:
>
> > Thanks for the summary. It is helpful. I think patch is also getting
> > better.
> >
> > I have a few questions and suggestions
>
> Thanks for your comments.
>
> > 1. Prologue of GetLockMethodLocalHash() mentions
> > * NOTE: When there are many entries in LockMethodLocalHash, calling
> > this
> > * function and looking into all of them can lead to performance
> > problems.
> > */
> > How bad this performance could be. Let's assume that a query is taking
> > time and pg_log_query_plan() is invoked to examine the plan of this
> > query. Is it possible that the looping over all the locks itself takes
> > a lot of time delaying the query execution further?
>
> I think it depends on the number of local locks, but I've measured cpu
> time for this page lock check by adding below codes and
> v27-0002-Testing-attempt-logging-plan-on-ever-CFI-call.patch[1], which
> calls ProcessLogQueryPlanInterrupt() in every CFI on my laptop just for
> your information:
>
> diff --git a/src/backend/commands/explain.c
> b/src/backend/commands/explain.c
> index 5f7d77d567..65b7cb4925 100644
> --- a/src/backend/commands/explain.c
> +++ b/src/backend/commands/explain.c
> @@ -44,6 +44,8 @@
>
> +#include "time.h"
> ...
> @@ -5287,6 +5292,7 @@ ProcessLogQueryPlanInterrupt(void)
> * we check all the LocalLock entries and when finding even
> one, give up
> * logging the plan.
> */
> + start = clock();
> hash_seq_init(&status, GetLockMethodLocalHash());
> while ((locallock = (LOCALLOCK *) hash_seq_search(&status)) !=
> NULL)
> {
> if (LOCALLOCK_LOCKTAG(*locallock) == LOCKTAG_PAGE)
> {
> ereport(LOG_SERVER_ONLY,
> errmsg("ignored request for logging query plan due
> to page lock conflicts"),
> errdetail("You can try again in a moment."));
> hash_seq_term(&status);
>
> ProcessLogQueryPlanInterruptActive = false;
> return;
> }
> }
> + end = clock();
> + cpu_time_used = ((double) (end - start)) / CLOCKS_PER_SEC;
> +
> + ereport(LOG,
> + errmsg("all locallock entry search took: %f",
> cpu_time_used));
> +
>
> There were about 3 million log lines which recorded the cpu time, and
> the duration was quite short:
>
> =# -- Extracted cpu_time_used from log and loaded it to cpu_time.d.
> =# select max(d), min(d), avg(d) from cpu_time ;
> max | min | avg
> ----------+-----+-----------------------
> 0.000116 | 0 | 4.706274625332238e-07
>
> I'm not certain that this is valid for actual use cases, but these
> results seem to suggest that it will not take that long.

What load did you run? I don't think any query in make check would
take, say, thousands of locks. The prologue refers to a very populated
lock hash table. I think that will happen if thousands of tables are
queried in a single query OR a query runs on a partitioned table with
thousands of partitions. Maybe we want to try that scenario (a sketch
follows).
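
For instance (the table and partition names here are made up):

CREATE TABLE pt (id int) PARTITION BY RANGE (id);
SELECT format('CREATE TABLE pt_%s PARTITION OF pt FOR VALUES FROM (%s) TO (%s)',
              i, i * 1000, (i + 1) * 1000)
FROM generate_series(0, 4999) i \gexec

-- scanning every partition takes a lock per partition; while this runs,
-- fire pg_log_query_plan() from another session
SELECT count(*) FROM pt, pg_sleep(60);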

>
>
> > 2. What happens if auto_explain is enabled in the backend and
> > pg_log_query_plan() is called on the same backend? Will they conflict?
> > I think we should add a test for the same.
>
> Hmm, I think they don't conflict since they just refer QueryDesc and
> don't modify it and don't use same objects for locking.
> (I imagine 'conflict' here is something like 'hard conflict' in
> replication[2].)

By conflict, I mean the two features behaving weirdly when used
together, e.g. giving wrong results or crashing.

>
> Actually using both auto_explain and pg_log_query_plan() output each
> logs separately:
>
> (pid:62835)=# select pg_sleep(10);
> (pid:70000)=# select pg_log_query_plan(62835);
>
> (pid:70000)=# \! cat data/log/postgres.log
> ...
> 2024-02-06 21:44:17.837 JST [62835:4:0] LOG: 00000: query plan
> running on backend with PID 62835 is:
> Query Text: select pg_sleep(10);
> Result (cost=0.00..0.01 rows=1 width=4)
> Output: pg_sleep('10'::double precision)
> Query Identifier: 3506829283127886044
> 2024-02-06 21:44:17.837 JST [62835:5:0] LOCATION:
> ProcessLogQueryPlanInterrupt, explain.c:5336
> 2024-02-06 21:44:26.974 JST [62835:6:0] LOG: 00000: duration:
> 10000.868 ms plan:
> Query Text: select pg_sleep(10);
> Result (cost=0.00..0.01 rows=1 width=4) (actual
> time=10000.802..10000.804 rows=1 loops=1)
>
> > Using injection point support we should be able to add tests for
> > testing pg_log_query_plan behaviour when there are page locks held or
> > when auto_explain (with instrumentation) and pg_log_query_plan() work
> > on the same query plan. Use injection point to make the backend
> > running query wait at a suitable point to delay its execution and fire
> > pg_log_query_plan() from other backend. May be the same test could
> > examine the server log file to see if the plan is indeed output to the
> > server log file.
> >
> > Given that the feature will be used when the things have already gone
> > wrong, it should not make things more serious. So more testing and
> > especially automated would help.
>
> Thanks for the advice, it seems a good idea.
> I'm going to try to add tests using injection point.

Your test with pg_sleep() is a good basic test. But more involved
testing might need something like injection points.

--
Best Wishes,
Ashutosh Bapat


From: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
To: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, Étienne BERSAC <etienne(dot)bersac(at)dalibo(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, jtc331(at)gmail(dot)com, rafaelthca(at)gmail(dot)com
Subject: Re: RFC: Logging plan of the running query
Date: 2024-02-07 10:14:40
Message-ID: a82b1a3f677d2a3aa95083c394e12843@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2024-02-07 13:58, Ashutosh Bapat wrote:
> On Wed, Feb 7, 2024 at 9:38 AM torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
> wrote:
>>
>> Hi Ashutosh,
>>
>> On 2024-02-06 19:51, Ashutosh Bapat wrote:
>>
>> > Thanks for the summary. It is helpful. I think patch is also getting
>> > better.
>> >
>> > I have a few questions and suggestions
>>
>> Thanks for your comments.
>>
>> > 1. Prologue of GetLockMethodLocalHash() mentions
>> > * NOTE: When there are many entries in LockMethodLocalHash, calling
>> > this
>> > * function and looking into all of them can lead to performance
>> > problems.
>> > */
>> > How bad this performance could be. Let's assume that a query is taking
>> > time and pg_log_query_plan() is invoked to examine the plan of this
>> > query. Is it possible that the looping over all the locks itself takes
>> > a lot of time delaying the query execution further?
>>
>> I think it depends on the number of local locks, but I've measured cpu
>> time for this page lock check by adding below codes and
>> v27-0002-Testing-attempt-logging-plan-on-ever-CFI-call.patch[1], which
>> calls ProcessLogQueryPlanInterrupt() in every CFI on my laptop just
>> for
>> your information:
>>
>> diff --git a/src/backend/commands/explain.c
>> b/src/backend/commands/explain.c
>> index 5f7d77d567..65b7cb4925 100644
>> --- a/src/backend/commands/explain.c
>> +++ b/src/backend/commands/explain.c
>> @@ -44,6 +44,8 @@
>>
>> +#include "time.h"
>> ...
>> @@ -5287,6 +5292,7 @@ ProcessLogQueryPlanInterrupt(void)
>> * we check all the LocalLock entries and when finding even
>> one, give up
>> * logging the plan.
>> */
>> + start = clock();
>> hash_seq_init(&status, GetLockMethodLocalHash());
>> while ((locallock = (LOCALLOCK *) hash_seq_search(&status))
>> !=
>> NULL)
>> {
>> if (LOCALLOCK_LOCKTAG(*locallock) == LOCKTAG_PAGE)
>> {
>> ereport(LOG_SERVER_ONLY,
>> errmsg("ignored request for logging query plan
>> due
>> to page lock conflicts"),
>> errdetail("You can try again in a moment."));
>> hash_seq_term(&status);
>>
>> ProcessLogQueryPlanInterruptActive = false;
>> return;
>> }
>> }
>> + end = clock();
>> + cpu_time_used = ((double) (end - start)) / CLOCKS_PER_SEC;
>> +
>> + ereport(LOG,
>> + errmsg("all locallock entry search took: %f",
>> cpu_time_used));
>> +
>>
>> There were about 3 million log lines which recorded the cpu time, and
>> the duration was quite short:
>>
>> =# -- Extracted cpu_time_used from log and loaded it to cpu_time.d.
>> =# select max(d), min(d), avg(d) from cpu_time ;
>> max | min | avg
>> ----------+-----+-----------------------
>> 0.000116 | 0 | 4.706274625332238e-07
>>
>> I'm not certain that this is valid for actual use cases, but these
>> results seem to suggest that it will not take that long.
>
> What load did you run? I don't think any query in make check would
> take say thousands of locks.

Sorry, I forgot to write it, but I ran make check, as you imagined.

> The prologue refers to a very populated
> lock hash table. I think that will happen if thousands of tables are
> queried in a single query OR a query runs on a partitioned table with
> thousands of partitions. May be we want to try that scenario.

OK, I'll try such cases.

>> > 2. What happens if auto_explain is enabled in the backend and
>> > pg_log_query_plan() is called on the same backend? Will they conflict?
>> > I think we should add a test for the same.
>>
>> Hmm, I think they don't conflict since they just refer QueryDesc and
>> don't modify it and don't use same objects for locking.
>> (I imagine 'conflict' here is something like 'hard conflict' in
>> replication[2].)
>
> By conflict, I mean the two features behave weird when used together
> e.g give wrong results or crash etc.
>
>>
>> Actually using both auto_explain and pg_log_query_plan() output each
>> logs separately:
>>
>> (pid:62835)=# select pg_sleep(10);
>> (pid:70000)=# select pg_log_query_plan(62835);
>>
>> (pid:70000)=# \! cat data/log/postgres.log
>> ...
>> 2024-02-06 21:44:17.837 JST [62835:4:0] LOG: 00000: query plan
>> running on backend with PID 62835 is:
>> Query Text: select pg_sleep(10);
>> Result (cost=0.00..0.01 rows=1 width=4)
>> Output: pg_sleep('10'::double precision)
>> Query Identifier: 3506829283127886044
>> 2024-02-06 21:44:17.837 JST [62835:5:0] LOCATION:
>> ProcessLogQueryPlanInterrupt, explain.c:5336
>> 2024-02-06 21:44:26.974 JST [62835:6:0] LOG: 00000: duration:
>> 10000.868 ms plan:
>> Query Text: select pg_sleep(10);
>> Result (cost=0.00..0.01 rows=1 width=4) (actual
>> time=10000.802..10000.804 rows=1 loops=1)
>>
>> > Using injection point support we should be able to add tests for
>> > testing pg_log_query_plan behaviour when there are page locks held or
>> > when auto_explain (with instrumentation) and pg_log_query_plan() work
>> > on the same query plan. Use injection point to make the backend
>> > running query wait at a suitable point to delay its execution and fire
>> > pg_log_query_plan() from other backend. May be the same test could
>> > examine the server log file to see if the plan is indeed output to the
>> > server log file.
>> >
>> > Given that the feature will be used when the things have already gone
>> > wrong, it should not make things more serious. So more testing and
>> > especially automated would help.
>>
>> Thanks for the advice, it seems a good idea.
>> I'm going to try to add tests using injection point.
>
> Your test with pg_sleep() is a good basic test. But more involved
> testing might need something like injection points.

It might be so; I will consider whether there are any subtle timing
issues, etc.

--
Regards,

--
Atsushi Torikoshi
NTT DATA Group Corporation


From: jian he <jian(dot)universality(at)gmail(dot)com>
To: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, Étienne BERSAC <etienne(dot)bersac(at)dalibo(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, jtc331(at)gmail(dot)com, ashutosh(dot)bapat(dot)oss(at)gmail(dot)com, rafaelthca(at)gmail(dot)com
Subject: Re: RFC: Logging plan of the running query
Date: 2024-02-12 00:00:00
Message-ID: CACJufxGjtoFxLhXy9JOrfYsXnFyCAMrdu2DTyLPx5Js3qY3J-A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Jan 29, 2024 at 9:02 PM torikoshia <torikoshia(at)oss(dot)nttdata(dot)com> wrote:
>
> Hi,
>
> Updated the patch to fix typos and move
> ProcessLogQueryPlanInterruptActive from errfinish() to AbortTransaction.
>

+ <row>
+ <entry role="func_table_entry"><para role="func_signature">
+ <indexterm>
+ <primary>pg_log_query_plan</primary>
+ </indexterm>
+ <function>pg_log_query_plan</function> (
<parameter>pid</parameter> <type>integer</type> )
+ <returnvalue>boolean</returnvalue>
+ </para>
+ <para>
+ Requests to log the plan of the query currently running on the
+ backend with specified process ID.
+ It will be logged at <literal>LOG</literal> message level and
+ will appear in the server log based on the log
+ configuration set (See <xref linkend="runtime-config-logging"/>
+ for more information), but will not be sent to the client
+ regardless of <xref linkend="guc-client-min-messages"/>.
+ </para></entry>
+ </row>
It would be better to explain the meaning of the TRUE/FALSE return value.
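
For instance, the behavior to document (PIDs hypothetical):

SELECT pg_log_query_plan(12345);  -- t: the logging request was sent
SELECT pg_log_query_plan(99999);  -- f plus a WARNING, e.g. when no such
                                  -- backend exists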

+# logging plan of the running query on the specified backend
+{ oid => '8000', descr => 'log plan of the running query on the
specified backend',
+ proname => 'pg_log_query_plan',
+ provolatile => 'v', prorettype => 'bool',
you can add
`proargnames => '{pid}'`

+ if (proc == NULL)
+ {
+ /*
+ * This is just a warning so a loop-through-resultset will not abort
+ * if one backend terminated on its own during the run.
+ */
+ ereport(WARNING,
+ (errmsg("PID %d is not a PostgreSQL backend process", pid)));
+ PG_RETURN_BOOL(false);
+ }
+
+ be_status = pgstat_get_beentry_by_backend_id(proc->backendId);
+ if (be_status->st_backendType != B_BACKEND)
+ {
+ ereport(WARNING,
+ (errmsg("PID %d is not a PostgreSQL client backend process", pid)));
+ PG_RETURN_BOOL(false);
+ }
+
+ if (SendProcSignal(pid, PROCSIG_LOG_QUERY_PLAN, proc->backendId) < 0)
+ {
+ /* Again, just a warning to allow loops */
+ ereport(WARNING,
+ (errmsg("could not send signal to process %d: %m", pid)));
+ PG_RETURN_BOOL(false);
+ }

I found that pg_log_query_plan's comments look like
pg_log_backend_memory_contexts's. pg_log_backend_memory_contexts will
iterate through many memory contexts, but pg_log_query_plan for one
specific pid will only output one plan, so I am a little bit confused
by the comments.

+ /*
+ * Ensure no page lock is held on this process.
+ *
+ * If page lock is held at the time of the interrupt, we can't acquire any
+ * other heavyweight lock, which might be necessary for explaining the plan
+ * when retrieving column names.
+ *
+ * This may be overkill, but since page locks are held for a short duration
+ * we check all the LocalLock entries and when finding even one, give up
+ * logging the plan.
+ */
+ hash_seq_init(&status, GetLockMethodLocalHash());
+ while ((locallock = (LOCALLOCK *) hash_seq_search(&status)) != NULL)
+ {
+ if (LOCALLOCK_LOCKTAG(*locallock) == LOCKTAG_PAGE)
maybe it's not that self-evident; the above comments still don't explain
why we need to ensure that no PAGE lock is held by this process.

In the commit message, can you add all the discussion links?
My gmail account doesn't have the previous discussion history.
I am not sure whether this
(/message-id/flat/d68c3ae31672664876b22d2dcbb526d2%40postgrespro.ru)
is the only discussion link.

I found a bug:
src8=# select *, pg_sleep(10) from tenk1 for update;
2024-02-11 15:54:17.944 CST [48602] LOG: query plan running on
backend with PID 48602 is:
Query Text: select *, pg_sleep(10) from tenk1 for update;
LockRows (cost=0.00..570.00 rows=10000 width=254)
Output: unique1, unique2, two, four, ten, twenty, hundred,
thousand, twothousand, fivethous, tenthous, odd, even, stringu1,
stringu2, string4, (pg_sleep('10'::double precision)), ctid
-> Seq Scan on public.tenk1 (cost=0.00..470.00 rows=10000 width=254)
Output: unique1, unique2, two, four, ten, twenty,
hundred, thousand, twothousand, fivethous, tenthous, odd, even,
stringu1, stringu2, string4, pg_sleep('10'::double precision), ctid

another session executes `SELECT pg_log_query_plan(48602);` in
the meantime.
pg_log_query_plan returns true successfully, but PID 48602 gets stuck.

I have a problem using git apply:
error: patch failed: src/include/commands/explain.h:94
error: src/include/commands/explain.h: patch does not apply

`patch -p1 < /v35-0001-Add-function-to-log-the-plan-of-the-query.patch` works


From: jian he <jian(dot)universality(at)gmail(dot)com>
To: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
Cc: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Étienne BERSAC <etienne(dot)bersac(at)dalibo(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, jtc331(at)gmail(dot)com, rafaelthca(at)gmail(dot)com
Subject: Re: RFC: Logging plan of the running query
Date: 2024-02-12 00:00:00
Message-ID: CACJufxHZ5sf1a6zuCmpH3NP6EnVE34py_Ou1jPF6pUhTOkEQOA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Feb 7, 2024 at 12:58 PM Ashutosh Bapat
<ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> wrote:
>
> >
> > > */
> > > How bad this performance could be. Let's assume that a query is taking
> > > time and pg_log_query_plan() is invoked to examine the plan of this
> > > query. Is it possible that the looping over all the locks itself takes
> > > a lot of time delaying the query execution further?
> >
A corner-case test:
pgbench --initialize --partition-method=range --partitions=20000
Somehow in my setup pgbench didn't populate the data, but the
20000 partitions are there.
(All my other settings are default.)

Some interesting things happen when a query touches so many partitions, like:
select abalance, aid from public.pgbench_accounts,pg_sleep(4) where aid > 1;

In another session, if you immediately call SELECT pg_log_query_plan(9482);
then the output is
`
LOG: backend with PID 9482 is not running a query or a subtransaction
is aborted
`
However, if you wait a little (like 1 second), then
the LOG will emit the plan with lots of text (I'm not sure the plan is right).

I think the reason is that the `InitPlan` within
standard_ExecutorStart takes more time to finish
when your query touches a lot of partitions.
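
To illustrate this window, here is a minimal standalone C sketch (mine, not
PostgreSQL code; all names are invented): the "plan" only becomes visible to
the signal handler after a slow initialization phase, so a request that
arrives too early finds nothing to report, just like above.

#include <signal.h>
#include <unistd.h>

static volatile sig_atomic_t have_plan = 0; /* stands in for ActiveQueryDesc */

static void
on_request(int signo)
{
    (void) signo;
    /* only async-signal-safe calls in a handler, hence write() */
    if (have_plan)
        write(1, "plan logged\n", sizeof "plan logged\n" - 1);
    else
        write(1, "not running a query\n", sizeof "not running a query\n" - 1);
}

int
main(void)
{
    signal(SIGUSR1, on_request);
    raise(SIGUSR1);     /* arrives during "InitPlan": nothing to log yet */
    sleep(1);           /* stands in for opening 20000 partitions */
    have_plan = 1;      /* ExecutorRun() would publish the plan here */
    raise(SIGUSR1);     /* the same request now succeeds */
    return 0;
}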


From: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
To: jian he <jian(dot)universality(at)gmail(dot)com>
Cc: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Étienne BERSAC <etienne(dot)bersac(at)dalibo(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, jtc331(at)gmail(dot)com, rafaelthca(at)gmail(dot)com
Subject: Re: RFC: Logging plan of the running query
Date: 2024-02-12 04:39:18
Message-ID: CAExHW5uT471q6D=MHvxHZ3zLhgZXDJiVG3Jhm-ngPkv0pk4uBw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Feb 12, 2024 at 5:31 AM jian he <jian(dot)universality(at)gmail(dot)com> wrote:
>
> I found a bug:
> src8=# select *, pg_sleep(10) from tenk1 for update;
> 2024-02-11 15:54:17.944 CST [48602] LOG: query plan running on
> backend with PID 48602 is:
> Query Text: select *, pg_sleep(10) from tenk1 for update;
> LockRows (cost=0.00..570.00 rows=10000 width=254)
> Output: unique1, unique2, two, four, ten, twenty, hundred,
> thousand, twothousand, fivethous, tenthous, odd, even, stringu1,
> stringu2, string4, (pg_sleep('10'::double precision)), ctid
> -> Seq Scan on public.tenk1 (cost=0.00..470.00 rows=10000 width=254)
> Output: unique1, unique2, two, four, ten, twenty,
> hundred, thousand, twothousand, fivethous, tenthous, odd, even,
> stringu1, stringu2, string4, pg_sleep('10'::double precision), ctid
>
> another session (PID) executes `SELECT pg_log_query_plan(48602);` in
> the meantime.
> pg_log_query_plan returns true successfully, but PID 48602 was stuck.

What do you mean by PID 48602 was stuck?

--
Best Wishes,
Ashutosh Bapat


From: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
To: jian he <jian(dot)universality(at)gmail(dot)com>
Cc: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Étienne BERSAC <etienne(dot)bersac(at)dalibo(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, jtc331(at)gmail(dot)com, rafaelthca(at)gmail(dot)com
Subject: Re: RFC: Logging plan of the running query
Date: 2024-02-12 04:42:21
Message-ID: CAExHW5sWiMZjDpbhSLpgo6prV2J_SdMx6_mK_vxDrwPwNust7A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Feb 12, 2024 at 5:31 AM jian he <jian(dot)universality(at)gmail(dot)com> wrote:
>
> On Wed, Feb 7, 2024 at 12:58 PM Ashutosh Bapat
> <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> wrote:
> >
> > >
> > > > */
> > > > How bad this performance could be. Let's assume that a query is taking
> > > > time and pg_log_query_plan() is invoked to examine the plan of this
> > > > query. Is it possible that the looping over all the locks itself takes
> > > > a lot of time delaying the query execution further?
> > >
> corner case test:
> pgbench --initialize --partition-method=range --partitions=20000
> Somehow my setup, the pg_bench didn't populate the data but there are
> 20000 partitions there.
> (all my other settings are default)
>
> some interesting things happened when a query touch so many partitions like:
> select abalance, aid from public.pgbench_accounts,pg_sleep(4) where aid > 1;
>
> in another session, if you immediate call SELECT pg_log_query_plan(9482);
> then output be
> `
> LOG: backend with PID 9482 is not running a query or a subtransaction
> is aborted
> `
> however if you delay a little bit of time (like 1 second), then
> LOG will emit the plan with lots of text (not sure the plan is right).
>
> I think the reason is that the `InitPlan` within
> standard_ExecutorStart takes more time to finish
> when your query touches a lot of partitions.

That's probably expected unless we make the ActiveQueryDesc available
before ExecutorRun.

How much time did it take between issuing SELECT
pg_log_query_plan(9482); and the plan getting output to the server
error log? How does this compare with the same time difference for a
simple query, and, if the difference is large, how much of it can be
attributed to the lock table hash scan?

--
Best Wishes,
Ashutosh Bapat


From: jian he <jian(dot)universality(at)gmail(dot)com>
To: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
Cc: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Étienne BERSAC <etienne(dot)bersac(at)dalibo(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, jtc331(at)gmail(dot)com, rafaelthca(at)gmail(dot)com
Subject: Re: RFC: Logging plan of the running query
Date: 2024-02-12 07:08:00
Message-ID: CACJufxHa6VFi1KPTKaOeCs6Yg=chyRGAwFwPiftiGkFKShRfCg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Feb 12, 2024 at 12:42 PM Ashutosh Bapat
<ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> wrote:
>
> On Mon, Feb 12, 2024 at 5:31 AM jian he <jian(dot)universality(at)gmail(dot)com> wrote:
> >
> > On Wed, Feb 7, 2024 at 12:58 PM Ashutosh Bapat
> > <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> wrote:
> > >
> > > >
> > > > > */
> > > > > How bad this performance could be. Let's assume that a query is taking
> > > > > time and pg_log_query_plan() is invoked to examine the plan of this
> > > > > query. Is it possible that the looping over all the locks itself takes
> > > > > a lot of time delaying the query execution further?
> > > >
> > corner case test:
> > pgbench --initialize --partition-method=range --partitions=20000
> > Somehow my setup, the pg_bench didn't populate the data but there are
> > 20000 partitions there.
> > (all my other settings are default)
> >
> > some interesting things happened when a query touch so many partitions like:
> > select abalance, aid from public.pgbench_accounts,pg_sleep(4) where aid > 1;
> >
> > in another session, if you immediate call SELECT pg_log_query_plan(9482);
> > then output be
> > `
> > LOG: backend with PID 9482 is not running a query or a subtransaction
> > is aborted
> > `
> > however if you delay a little bit of time (like 1 second), then
> > LOG will emit the plan with lots of text (not sure the plan is right).
> >
> > I think the reason is that the `InitPlan` within
> > standard_ExecutorStart takes more time to finish
> > when your query touches a lot of partitions.
>
> That's probably expected unless we make the ActiveQueryDesc available
> before ExecutorRun.

So the error message should be something like:
errmsg("backend with PID %d is not running a query or a subtransaction
is aborted or the plan is not generated",

>
> How much time did it took between issuing SELECT
> pg_log_query_plan(9482); and plan getting output to the server error
> logs?

It either says errmsg("backend with PID %d is not running a query,.....)
or outputs the plan immediately, if I wait one or two seconds before
calling pg_log_query_plan.
That is because of what I mentioned previously: with lots of partitions,
InitPlan takes longer to finish.
Setup: 2 sessions, one runs the query (select abalance, aid from
public.pgbench_accounts,pg_sleep(4) where aid > 1;),
the other calls pg_log_query_plan.


From: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
To: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, Étienne BERSAC <etienne(dot)bersac(at)dalibo(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, jtc331(at)gmail(dot)com, rafaelthca(at)gmail(dot)com
Subject: Re: RFC: Logging plan of the running query
Date: 2024-02-13 00:19:03
Message-ID: 1b2b247530f3ff3afab4ddc2df222e8b@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2024-02-07 19:14, torikoshia wrote:
> On 2024-02-07 13:58, Ashutosh Bapat wrote:

>> The prologue refers to a very populated
>> lock hash table. I think that will happen if thousands of tables are
>> queried in a single query OR a query runs on a partitioned table with
>> thousands of partitions. May be we want to try that scenario.
>
> OK, I'll try such cases.

I measured this using a partitioned pgbench_accounts with some
modifications to v36[1].
The results[2] show that CPU time increases in proportion to the number
of partitions, and the increase is not that large.

However, I've noticed that this ensure-no-page-lock logic should not be
necessary anymore, since cc32ec24fdf3b98 removed the assertion which
caused an error[3].

$ git show cc32ec24fdf3b98
..
diff --git a/src/backend/storage/lmgr/lock.c b/src/backend/storage/lmgr/lock.c
index 0a692ee0a6..f595bce31b 100644
--- a/src/backend/storage/lmgr/lock.c
+++ b/src/backend/storage/lmgr/lock.c
@@ -186,18 +186,6 @@ static int FastPathLocalUseCount = 0;
  */
 static bool IsRelationExtensionLockHeld PG_USED_FOR_ASSERTS_ONLY = false;

-    /*
-     * We don't acquire any other heavyweight lock while holding the page lock
-     * except for relation extension.
-     */
-    Assert(!IsPageLockHeld ||
-           (locktag->locktag_type == LOCKTAG_RELATION_EXTEND));

I'm going to remove the ensure-no-page-lock logic after some testing.

[1]
$ git diff _submission/log_running_query-v36
+#include "time.h"
+
 bool ProcessLogQueryPlanInterruptActive = false;

 /* Hook for plugins to get control in ExplainOneQuery() */
@@ -5258,6 +5260,10 @@ ProcessLogQueryPlanInterrupt(void)
     MemoryContext old_cxt;
     LogQueryPlanPending = false;

+    clock_t start, end;
+    double cpu_time_used;
+    int num_hash_entry = 0;
+
     /* Cannot re-enter. */
     if (ProcessLogQueryPlanInterruptActive)
         return;
@@ -5287,9 +5293,11 @@ ProcessLogQueryPlanInterrupt(void)
      * we check all the LocalLock entries and when finding even one, give up
      * logging the plan.
      */
+    start = clock();
     hash_seq_init(&status, GetLockMethodLocalHash());
     while ((locallock = (LOCALLOCK *) hash_seq_search(&status)) != NULL)
     {
+        num_hash_entry++;
         if (LOCALLOCK_LOCKTAG(*locallock) == LOCKTAG_PAGE)
         {
             ereport(LOG_SERVER_ONLY,
@@ -5301,6 +5309,12 @@ ProcessLogQueryPlanInterrupt(void)
             return;
         }
     }
+    end = clock();
+    cpu_time_used = ((double) (end - start)) / CLOCKS_PER_SEC;
+
+    ereport(LOG,
+            errmsg("locallock entry search took: %f for %d entries",
+                   cpu_time_used, num_hash_entry));

[2]
# partition number: 512
locallock entry search took: 0.000029 for 1026 entries
locallock entry search took: 0.000030 for 1026 entries
locallock entry search took: 0.000036 for 1026 entries

# partition number: 1024
locallock entry search took: 0.000070 for 2050 entries
locallock entry search took: 0.000059 for 2050 entries
locallock entry search took: 0.000049 for 2050 entries

# partition number: 2048
locallock entry search took: 0.000100 for 4098 entries
locallock entry search took: 0.000103 for 4098 entries
locallock entry search took: 0.000101 for 4098 entries

# partition number: 4096
locallock entry search took: 0.000197 for 8194 entries
locallock entry search took: 0.000193 for 8194 entries
locallock entry search took: 0.000192 for 8194 entries
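
A rough back-of-the-envelope reading of these numbers (my arithmetic, not
part of the measurement): 0.000197 s / 8194 entries is about 24 ns per
LocalLock entry, so even an extreme 100,000-entry local lock table would
add only around 2.4 ms to the interrupt processing.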

[3]
/message-id/0642712f-1298-960a-a3ba-e256d56040ac%40oss.nttdata.com

--
Regards,

--
Atsushi Torikoshi
NTT DATA Group Corporation


From: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
To: jian he <jian(dot)universality(at)gmail(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, Étienne BERSAC <etienne(dot)bersac(at)dalibo(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, jtc331(at)gmail(dot)com, ashutosh(dot)bapat(dot)oss(at)gmail(dot)com, rafaelthca(at)gmail(dot)com
Subject: Re: RFC: Logging plan of the running query
Date: 2024-02-13 02:30:53
Message-ID: 3d5fd6d955944b5e8e4b8d1917f14885@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2024-02-12 09:00, jian he wrote:

Thanks for your comments.

> On Mon, Jan 29, 2024 at 9:02 PM torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
> wrote:
>>
>> Hi,
>>
>> Updated the patch to fix typos and move
>> ProcessLogQueryPlanInterruptActive from errfinish() to
>> AbortTransaction.
>>
>
> + <row>
> + <entry role="func_table_entry"><para role="func_signature">
> + <indexterm>
> + <primary>pg_log_query_plan</primary>
> + </indexterm>
> + <function>pg_log_query_plan</function> (
> <parameter>pid</parameter> <type>integer</type> )
> + <returnvalue>boolean</returnvalue>
> + </para>
> + <para>
> + Requests to log the plan of the query currently running on the
> + backend with specified process ID.
> + It will be logged at <literal>LOG</literal> message level and
> + will appear in the server log based on the log
> + configuration set (See <xref
> linkend="runtime-config-logging"/>
> + for more information), but will not be sent to the client
> + regardless of <xref linkend="guc-client-min-messages"/>.
> + </para></entry>
> + </row>
> it would be better to explain the meaning of return value TRUE/FALSE?

Yeah, but I've noticed that this should be located in 'Table Server
Signaling Functions', not 'Table Control Data Functions'.
Since 'Table Server Signaling Functions' already describes the return
value as below, just relocating it seems fine.

Each of these functions returns true if the signal was successfully
sent and false if sending the signal failed.

> +# logging plan of the running query on the specified backend
> +{ oid => '8000', descr => 'log plan of the running query on the
> specified backend',
> + proname => 'pg_log_query_plan',
> + provolatile => 'v', prorettype => 'bool',
> you can add
> `proargnames => '{pid}'`

Hmm, pg_log_query_plan() takes only one argument, so I'm not sure how
much sense proargnames makes here.
Other functions which take one argument, such as pg_cancel_backend(), do
not have proargnames.

> + if (proc == NULL)
> + {
> + /*
> + * This is just a warning so a loop-through-resultset will not abort
> + * if one backend terminated on its own during the run.
> + */
> + ereport(WARNING,
> + (errmsg("PID %d is not a PostgreSQL backend process", pid)));
> + PG_RETURN_BOOL(false);
> + }
> +
> + be_status = pgstat_get_beentry_by_backend_id(proc->backendId);
> + if (be_status->st_backendType != B_BACKEND)
> + {
> + ereport(WARNING,
> + (errmsg("PID %d is not a PostgreSQL client backend process", pid)));
> + PG_RETURN_BOOL(false);
> + }
> +
> + if (SendProcSignal(pid, PROCSIG_LOG_QUERY_PLAN, proc->backendId) < 0)
> + {
> + /* Again, just a warning to allow loops */
> + ereport(WARNING,
> + (errmsg("could not send signal to process %d: %m", pid)));
> + PG_RETURN_BOOL(false);
> + }
>
> I found out that pg_log_query_plan's comments look like
> pg_log_backend_memory_contexts.
> pg_log_backend_memory_contexts will iterate through many memory
> contexts.
> but pg_log_query_plan for one specific pid will only output one plan?
> so I am a little bit confused by the comments.

These "loop" mean backend can run pg_log_query_plan() repeatedly even
when failing sending signals.
pg_signal_backend() also have such comments.

> + /*
> + * Ensure no page lock is held on this process.
> + *
> + * If page lock is held at the time of the interrupt, we can't acquire
> any
> + * other heavyweight lock, which might be necessary for explaining the
> plan
> + * when retrieving column names.
> + *
> + * This may be overkill, but since page locks are held for a short
> duration
> + * we check all the LocalLock entries and when finding even one, give
> up
> + * logging the plan.
> + */
> + hash_seq_init(&status, GetLockMethodLocalHash());
> + while ((locallock = (LOCALLOCK *) hash_seq_search(&status)) != NULL)
> + {
> + if (LOCALLOCK_LOCKTAG(*locallock) == LOCKTAG_PAGE)
> maybe not that self evident, the above comments still not explained
> why we need to ensure only
> PAGE lock was held on this process?

This was for preventing an assertion error, and it seems unnecessary now,
as described in [1].
I'm going to remove it.

>
> In the commit message, can you add all the discussion links?
> My gmail account doesn't have a previous discussion history.

Sure.

> I am not sure this
> (/message-id/flat/d68c3ae31672664876b22d2dcbb526d2%40postgrespro.ru)
> is the only discussion link?

This is the original one:
/message-id/cf8501bcd95ba4d727cbba886ba9eea8%40oss.nttdata.com

> I found a bug:
> src8=# select *, pg_sleep(10) from tenk1 for update;
> 2024-02-11 15:54:17.944 CST [48602] LOG: query plan running on
> backend with PID 48602 is:
> Query Text: select *, pg_sleep(10) from tenk1 for update;
> LockRows (cost=0.00..570.00 rows=10000 width=254)
> Output: unique1, unique2, two, four, ten, twenty, hundred,
> thousand, twothousand, fivethous, tenthous, odd, even, stringu1,
> stringu2, string4, (pg_sleep('10'::double precision)), ctid
> -> Seq Scan on public.tenk1 (cost=0.00..470.00 rows=10000
> width=254)
> Output: unique1, unique2, two, four, ten, twenty,
> hundred, thousand, twothousand, fivethous, tenthous, odd, even,
> stringu1, stringu2, string4, pg_sleep('10'::double precision), ctid
>
> another session (PID) executes `SELECT pg_log_query_plan(48602);` in
> the meantime.
> pg_log_query_plan returns true successfully, but PID 48602 was stuck.

Hmm, isn't it simply sleeping rather than stuck?
I'm a bit concerned about this because the estimated number of rows in
tenk1 is 10000, and pg_sleep() in the target list is evaluated per row.
If so, the query will take 10000 * 10 seconds.
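(The arithmetic, for the record: 10000 rows * 10 s/row = 100,000 s, i.e.
roughly 28 hours, so the backend would look "stuck" while merely sleeping.)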

> I have problem using git apply:
> error: patch failed: src/include/commands/explain.h:94
> error: src/include/commands/explain.h: patch does not apply
>
> `patch -p1 < /v35-0001-Add-function-to-log-the-plan-of-the-query.patch`
> works

I'll update the patch, including other points such as removing the
ensure-no-page-lock code.

[1]
/message-id/1b2b247530f3ff3afab4ddc2df222e8b%40oss.nttdata.com

--
Regards,

--
Atsushi Torikoshi
NTT DATA Group Corporation


From: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Cc: Étienne BERSAC <etienne(dot)bersac(at)dalibo(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, jtc331(at)gmail(dot)com, ashutosh(dot)bapat(dot)oss(at)gmail(dot)com, rafaelthca(at)gmail(dot)com, jian(dot)universality(at)gmail(dot)com
Subject: Re: RFC: Logging plan of the running query
Date: 2024-02-14 06:29:35
Message-ID: 67fac921c935d7b9db51609a7599520f@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2024-02-13 11:30, torikoshia wrote:

> I'll update the patch including other points such as removing ensuring
> no page lock code.

Updated the patch.

> Using injection point support we should be able to add tests for
> testing pg_log_query_plan behaviour when there are page locks held or
> when auto_explain (with instrumentation) and pg_log_query_plan() work
> on the same query plan. Use injection point to make the backend
> running query wait at a suitable point to delay its execution and fire
> pg_log_query_plan() from other backend. May be the same test could
> examine the server log file to see if the plan is indeed output to the
> server log file.

The attached patch uses an injection point as below:

- There may be more points to inject, but I added an injection point at
ExecutorRun(), which seems to be the first interruption point where
plans can be reliably displayed.
- At the injection point, it'd be possible to wait for some duration and
fire pg_log_query_plan() as you suggested. However, I'm not sure what
duration is appropriate considering the variety of testing
environments. Instead, the attached patch calls
HandleLogQueryPlanInterrupt() directly and sets InterruptPending.
- It tests both the pg_log_query_plan() and auto_explain log output, and
checks that the logged plans are the same.

--
Regards,

--
Atsushi Torikoshi
NTT DATA Group Corporation

Attachment Content-Type Size
v36-0001-Add-function-to-log-the-plan-of-the-query.patch text/x-diff 33.4 KB

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, Étienne BERSAC <etienne(dot)bersac(at)dalibo(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, jtc331(at)gmail(dot)com, ashutosh(dot)bapat(dot)oss(at)gmail(dot)com, rafaelthca(at)gmail(dot)com, jian(dot)universality(at)gmail(dot)com
Subject: Re: RFC: Logging plan of the running query
Date: 2024-02-15 09:12:11
Message-ID: CA+TgmobH+Uto9MCD+vWc71bVbOnd7d8zeYjRT8nXaeLe5hsNJQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

I've just been catching up on this thread.

+    if (MyProc->heldLocks)
+    {
+        ereport(LOG_SERVER_ONLY,
+                errmsg("ignored request for logging query plan due to lock conflicts"),
+                errdetail("You can try again in a moment."));
+        return;
+    }

I don't like this for several reasons.

First, I think it's not nice to have a request just get ignored. A
user will expect that if we cannot immediately respond to some
request, we'll respond later at the first time that it's safe to do
so, rather than just ignoring it and telling them to retry.

Second, I don't think that the error message is very good. It talks
about lock conflicts, but we don't know that there is any real
problem. We know that, if we enter this block, the server is in the
middle of trying to acquire some lock, and we also know that we could
attempt to acquire a lock as part of generating the EXPLAIN output,
and maybe that's an issue. But that's not a lock conflict. That's a
re-entrancy problem. I don't know that we want to talk about
re-entrancy problems in an error message, and I don't really think
this error message should exist in the first place, but if we're going
to error out in this case surely we shouldn't do so with an error
message that describes a problem we don't have.

Third, I think that the re-entrancy problems with this patch may
extend well beyond lock acquisition. This is one example of how it can
be unsafe to do something as complicated as EXPLAIN at any arbitrary
CHECK_FOR_INTERRUPTS(). It is not correct to say, as
http://postgr.es/m/CAAaqYe9euUZD8bkjXTVcD9e4n5c7kzHzcvuCJXt-xds9X4c7Fw@mail.gmail.com
does, that the problems with running EXPLAIN at an arbitrary point are
specific to running this code in an aborted transaction. The existence
of this code shows that there is at least one hazard even if the
current transaction is not aborted, and I see no analysis on this
thread indicating whether there are any more such hazards, or how we
could go about finding them all.

I think the issue is very general. We have lots of subsystems that
both (a) use global variables and (b) contain CHECK_FOR_INTERRUPTS().
If we process an interrupt while that code is in the middle of
manipulating its global variables and then again re-enter that code,
bad things might happen. We could try to rule that out by analyzing
all such subsystems and all places where CHECK_FOR_INTERRUPTS() may
appear, but that's very difficult. Suppose we took the alternative
approach recommended by Andrey Lepikhov in
http://postgr.es/m/b1b110ae-61f6-4fd9-9b94-f967db9b53d4@app.fastmail.com
and instead set a flag that gets handled in some suitable place in the
executor code, like ExecProcNode(). If we did that, then we would know
that we're not in the middle of a syscache lookup, a catcache lookup,
a heavyweight lock acquisition, an ereport, or any other low-level
subsystem call that might create problems for the EXPLAIN code.

In that design, the hack shown above would go away, and we could be
much more certain that we don't need any other similar hacks for other
subsystems. The only downside is that we might experience a slightly
longer delay before a requested EXPLAIN plan actually shows up, but
that seems like a pretty small price to pay for being able to reason
about the behavior of the system. I don't *think* there are any cases
where we run in the executor for a particularly long time without a
new call to ExecProcNode().

I think this case is a bit like vacuum_delay_point(). You might think
that vacuum_delay_point() could be moved inside of
CHECK_FOR_INTERRUPTS(), but we've made the opposite decision: every
vacuum_delay_point() calls CHECK_FOR_INTERRUPTS() but not every
CHECK_FOR_INTERRUPTS() calls vacuum_delay_point(). That means that we
can allow vacuum_delay_point() only at cases where we know it's safe,
rather than at every CHECK_FOR_INTERRUPTS(). I think that's a pretty
smart decision, even for vacuum_delay_point(), and AFAICS the code
we're proposing to run here does things that are substantially more
complicated than what vacuum_delay_point() does. That code just does a
bit of reading of shared memory, reports a wait event, and sleeps.
That's really simple compared to code that could try to do relcache
builds, which can trigger syscache lookups, which can both trigger
heavyweight lock acquisition, lightweight lock acquisition, bringing
pages into shared_buffers possibly through physical I/O, processing of
invalidation messages, and a bunch of other stuff.

It's really hard for me to accept that the heavyweight lock problem
for which the patch contains a workaround is the only one that exists.
I can't see any reason why that should be true.

...Robert


From: Andres Freund <andres(at)anarazel(dot)de>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Étienne BERSAC <etienne(dot)bersac(at)dalibo(dot)com>, jtc331(at)gmail(dot)com, ashutosh(dot)bapat(dot)oss(at)gmail(dot)com, rafaelthca(at)gmail(dot)com, jian(dot)universality(at)gmail(dot)com
Subject: Re: RFC: Logging plan of the running query
Date: 2024-02-15 18:59:11
Message-ID: 20240215185911.v4o6fo444md6a3w7@awork3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

On 2024-02-15 14:42:11 +0530, Robert Haas wrote:
> I think the issue is very general. We have lots of subsystems that
> both (a) use global variables and (b) contain CHECK_FOR_INTERRUPTS().
> If we process an interrupt while that code is in the middle of
> manipulating its global variables and then again re-enter that code,
> bad things might happen. We could try to rule that out by analyzing
> all such subsystems and all places where CHECK_FOR_INTERRUPTS() may
> appear, but that's very difficult. Suppose we took the alternative
> approach recommended by Andrey Lepikhov in
> http://postgr.es/m/b1b110ae-61f6-4fd9-9b94-f967db9b53d4@app.fastmail.com
> and instead set a flag that gets handled in some suitable place in the
> executor code, like ExecProcNode(). If we did that, then we would know
> that we're not in the middle of a syscache lookup, a catcache lookup,
> a heavyweight lock acquisition, an ereport, or any other low-level
> subsystem call that might create problems for the EXPLAIN code.
>
> In that design, the hack shown above would go away, and we could be
> much more certain that we don't need any other similar hacks for other
> subsystems. The only downside is that we might experience a slightly
> longer delay before a requested EXPLAIN plan actually shows up, but
> that seems like a pretty small price to pay for being able to reason
> about the behavior of the system.

I am very wary of adding overhead to ExecProcNode() - I'm quite sure that
adding code there would trigger visible overhead for query times.

If we went with something like that approach, I think we'd have to do something
like redirecting node->ExecProcNode to a wrapper, presumably from within a
CFI. That wrapper could then implement the explain support, without slowing
down the normal execution path.
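
As a standalone illustration of that pattern (my sketch, not PostgreSQL
code; in the real tree the analogous fields are PlanState's ExecProcNode
and ExecProcNodeReal): the hot path keeps calling through a function
pointer, and the interrupt merely swaps that pointer to a wrapper which
does the one-off work and then restores the fast path.

#include <stdio.h>

struct node;
typedef int (*exec_fn) (struct node *);

struct node
{
    exec_fn     exec;           /* like PlanState->ExecProcNode */
    exec_fn     exec_real;      /* like PlanState->ExecProcNodeReal */
};

static int
exec_plain(struct node *n)
{
    return 42;                  /* normal per-tuple work, no extra branch */
}

static int
exec_with_explain(struct node *n)
{
    printf("log the plan here, once\n");
    n->exec = n->exec_real;     /* restore the fast path */
    return n->exec(n);
}

int
main(void)
{
    struct node n = {exec_plain, exec_plain};

    n.exec(&n);                 /* normal execution */
    n.exec = exec_with_explain; /* what the CFI would do on request */
    n.exec(&n);                 /* logs once, then resumes */
    n.exec(&n);                 /* back on the plain path */
    return 0;
}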

> I don't *think* there are any cases where we run in the executor for a
> particularly long time without a new call to ExecProcNode().

I guess it depends on what you call a long time. A large sort, for example,
could spend a fair amount of time inside tuplesort; similarly, a Gather node
might need to wait for a worker for a while, etc.

> It's really hard for me to accept that the heavyweight lock problem
> for which the patch contains a workaround is the only one that exists.
> I can't see any reason why that should be true.

I suspect you're right.

Greetings,

Andres Freund


From: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>, robertmhaas(at)gmail(dot)com
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, Étienne BERSAC <etienne(dot)bersac(at)dalibo(dot)com>, jtc331(at)gmail(dot)com, ashutosh(dot)bapat(dot)oss(at)gmail(dot)com, rafaelthca(at)gmail(dot)com, jian(dot)universality(at)gmail(dot)com
Subject: Re: RFC: Logging plan of the running query
Date: 2024-02-16 14:42:36
Message-ID: 056f59efaf78a5469ee639a8a6d341db@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Feb 15, 2024 at 6:12 PM Robert Haas <robertmhaas(at)gmail(dot)com>
wrote:
> Hi,
>
> I've just been catching up on this thread.
>
> + if (MyProc->heldLocks)
> + {
> + ereport(LOG_SERVER_ONLY,
> + errmsg("ignored request for logging query plan due to lock
> conflicts"),
> + errdetail("You can try again in a moment."));
> + return;
> + }
>
> I don't like this for several reasons.
>
> First, I think it's not nice to have a request just get ignored. A
> user will expect that if we cannot immediately respond to some
> request, we'll respond later at the first time that it's safe to do
> so, rather than just ignoring it and telling them to retry.
>
> Second, I don't think that the error message is very good. It talks
> about lock conflicts, but we don't know that there is any real
> problem. We know that, if we enter this block, the server is in the
> middle of trying to acquire some lock, and we also know that we could
> attempt to acquire a lock as part of generating the EXPLAIN output,
> and maybe that's an issue. But that's not a lock conflict. That's a
> re-entrancy problem. I don't know that we want to talk about
> re-entrancy problems in an error message, and I don't really think
> this error message should exist in the first place, but if we're going
> to error out in this case surely we shouldn't do so with an error
> message that describes a problem we don't have.
>
> Third, I think that the re-entrancy problems with this patch may
> extend well beyond lock acquisition. This is one example of how it can
> be unsafe to do something as complicated as EXPLAIN at any arbitrary
> CHECK_FOR_INTERRUPTS(). It is not correct to say, as
> http://postgr.es/m/CAAaqYe9euUZD8bkjXTVcD9e4n5c7kzHzcvuCJXt-xds9X4c7Fw@mail.gmail.com
> does, that the problems with running EXPLAIN at an arbitrary point are
> specific to running this code in an aborted transaction. The existence
> of this code shows that there is at least one hazard even if the
> current transaction is not aborted, and I see no analysis on this
> thread indicating whether there are any more such hazards, or how we
> could go about finding them all.
>
> I think the issue is very general. We have lots of subsystems that
> both (a) use global variables and (b) contain CHECK_FOR_INTERRUPTS().
> If we process an interrupt while that code is in the middle of
> manipulating its global variables and then again re-enter that code,
> bad things might happen. We could try to rule that out by analyzing
> all such subsystems and all places where CHECK_FOR_INTERRUPTS() may
> appear, but that's very difficult. Suppose we took the alternative
> approach recommended by Andrey Lepikhov in
> http://postgr.es/m/b1b110ae-61f6-4fd9-9b94-f967db9b53d4@app.fastmail.com
> and instead set a flag that gets handled in some suitable place in the
> executor code, like ExecProcNode(). If we did that, then we would know
> that we're not in the middle of a syscache lookup, a catcache lookup,
> a heavyweight lock acquisition, an ereport, or any other low-level
> subsystem call that might create problems for the EXPLAIN code.
>
> In that design, the hack shown above would go away, and we could be
> much more certain that we don't need any other similar hacks for other
> subsystems. The only downside is that we might experience a slightly
> longer delay before a requested EXPLAIN plan actually shows up, but
> that seems like a pretty small price to pay for being able to reason
> about the behavior of the system. I don't *think* there are any cases
> where we run in the executor for a particularly long time without a
> new call to ExecProcNode().
>
> I think this case is a bit like vacuum_delay_point(). You might think
> that vacuum_delay_point() could be moved inside of
> CHECK_FOR_INTERRUPTS(), but we've made the opposite decision: every
> vacuum_delay_point() calls CHECK_FOR_INTERRUPTS() but not every
> CHECK_FOR_INTERRUPTS() calls vacuum_delay_point(). That means that we
> can allow vacuum_delay_point() only at cases where we know it's safe,
> rather than at every CHECK_FOR_INTERRUPTS(). I think that's a pretty
> smart decision, even for vacuum_delay_point(), and AFAICS the code
> we're proposing to run here does things that are substantially more
> complicated than what vacuum_delay_point() does. That code just does a
> bit of reading of shared memory, reports a wait event, and sleeps.
> That's really simple compared to code that could try to do relcache
> builds, which can trigger syscache lookups, which can both trigger
> heavyweight lock acquisition, lightweight lock acquisition, bringing
> pages into shared_buffers possibly through physical I/O, processing of
> invalidation messages, and a bunch of other stuff.
>
> It's really hard for me to accept that the heavyweight lock problem
> for which the patch contains a workaround is the only one that exists.
> I can't see any reason why that should be true.

Thanks for the review and the very detailed explanation!

I'm convinced that it's unsafe to execute the EXPLAIN code during
CHECK_FOR_INTERRUPTS() and that we need to execute it in some other,
safe place; I also agree with your first and second points.

On 2024-02-16 03:59, Andres Freund wrote:
> Hi,
>
> On 2024-02-15 14:42:11 +0530, Robert Haas wrote:
>> I think the issue is very general. We have lots of subsystems that
>> both (a) use global variables and (b) contain CHECK_FOR_INTERRUPTS().
>> If we process an interrupt while that code is in the middle of
>> manipulating its global variables and then again re-enter that code,
>> bad things might happen. We could try to rule that out by analyzing
>> all such subsystems and all places where CHECK_FOR_INTERRUPTS() may
>> appear, but that's very difficult. Suppose we took the alternative
>> approach recommended by Andrey Lepikhov in
>> http://postgr.es/m/b1b110ae-61f6-4fd9-9b94-f967db9b53d4@app.fastmail.com
>> and instead set a flag that gets handled in some suitable place in the
>> executor code, like ExecProcNode(). If we did that, then we would know
>> that we're not in the middle of a syscache lookup, a catcache lookup,
>> a heavyweight lock acquisition, an ereport, or any other low-level
>> subsystem call that might create problems for the EXPLAIN code.
>>
>> In that design, the hack shown above would go away, and we could be
>> much more certain that we don't need any other similar hacks for other
>> subsystems. The only downside is that we might experience a slightly
>> longer delay before a requested EXPLAIN plan actually shows up, but
>> that seems like a pretty small price to pay for being able to reason
>> about the behavior of the system.
>
> I am very wary of adding overhead to ExecProcNode() - I'm quite sure
> that
> adding code there would trigger visible overhead for query times.
>
> If we went with something like that approach, I think we'd have to do
> something
> like redirecting node->ExecProcNode to a wrapper, presumably from
> within a
> CFI. That wrapper could then implement the explain support, without
> slowing
> down the normal execution path.

Thanks for the idea!
I'm not yet sure about the implementation, i.e., finding the next node
to be executed from the planstate tree, but I'm going to try this
approach.

>> I don't *think* there are any cases where we run in the executor for a
>> particularly long time without a new call to ExecProcNode().
>
> I guess it depends on what you call a long time. A large sort, for
> example,
> could spend a fair amount of time inside tuplesort, similarly, a gather
> node
> might need to wait for a worker for a while etc.
>
>
>> It's really hard for me to accept that the heavyweight lock problem
>> for which the patch contains a workaround is the only one that exists.
>> I can't see any reason why that should be true.
>
> I suspect you're right.
>
> Greetings,
>
> Andres Freund

--
Regards,

--
Atsushi Torikoshi
NTT DATA Group Corporation


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Étienne BERSAC <etienne(dot)bersac(at)dalibo(dot)com>, jtc331(at)gmail(dot)com, ashutosh(dot)bapat(dot)oss(at)gmail(dot)com, rafaelthca(at)gmail(dot)com, jian(dot)universality(at)gmail(dot)com
Subject: Re: RFC: Logging plan of the running query
Date: 2024-02-20 04:53:04
Message-ID: CA+TgmoZ2Hsb39zzofasjOHXnjsNfJy84YjFF=-KNOO7FsvAi9Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Feb 16, 2024 at 12:29 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
> If we went with something like that approach, I think we'd have to do something
> like redirecting node->ExecProcNode to a wrapper, presumably from within a
> CFI. That wrapper could then implement the explain support, without slowing
> down the normal execution path.

That's an annoying complication; maybe there's some better way to
handle this. But I think we need to do something different than what
the patch does currently because...

> > It's really hard for me to accept that the heavyweight lock problem
> > for which the patch contains a workaround is the only one that exists.
> > I can't see any reason why that should be true.
>
> I suspect you're right.

...I think the current approach is just plain dead, because of this
issue. We can't take an approach that creates an unbounded number of
unclear reentrancy issues and then insert hacks one by one to cure
them (or hack around them, more to the point) as they're discovered.

The premise has to be that we only allow logging the query plan at
points where we know it's safe, rather than, as at present, allowing
it in places that are unsafe and then trying to compensate with code
elsewhere. That's not likely to ever be as stable as we want
PostgreSQL to be.

--
Robert Haas
EDB: http://www.enterprisedb.com


From: James Coleman <jtc331(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Étienne BERSAC <etienne(dot)bersac(at)dalibo(dot)com>, ashutosh(dot)bapat(dot)oss(at)gmail(dot)com, rafaelthca(at)gmail(dot)com, jian(dot)universality(at)gmail(dot)com
Subject: Re: RFC: Logging plan of the running query
Date: 2024-02-22 00:54:57
Message-ID: CAAaqYe-gMkdL=M4v47=H0F3+-zi2qL9zFqAv3QsizkRjFiQR0w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Feb 19, 2024 at 11:53 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> On Fri, Feb 16, 2024 at 12:29 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > If we went with something like that approach, I think we'd have to do something
> > like redirecting node->ExecProcNode to a wrapper, presumably from within a
> > CFI. That wrapper could then implement the explain support, without slowing
> > down the normal execution path.
>
> That's an annoying complication; maybe there's some better way to
> handle this. But I think we need to do something different than what
> the patch does currently because...
>
> > > It's really hard for me to accept that the heavyweight lock problem
> > > for which the patch contains a workaround is the only one that exists.
> > > I can't see any reason why that should be true.
> >
> > I suspect you're right.
>
> ...I think the current approach is just plain dead, because of this
> issue. We can't take an approach that creates an unbounded number of
> unclear reentrancy issues and then insert hacks one by one to cure
> them (or hack around them, more to the point) as they're discovered.
>
> The premise has to be that we only allow logging the query plan at
> points where we know it's safe, rather than, as at present, allowing
> it in places that are unsafe and then trying to compensate with code
> elsewhere. That's not likely to ever be as stable as we want
> PostgreSQL to be.

This is potentially a bit of a wild idea, but I wonder if having some
kind of argument to CHECK_FOR_INTERRUPTS() signifying we're in
"normal" as opposed to "critical" (using that word differently than
the existing critical sections) would be worth it.

Regards,
James Coleman


From: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
To: James Coleman <jtc331(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Étienne BERSAC <etienne(dot)bersac(at)dalibo(dot)com>, rafaelthca(at)gmail(dot)com, jian(dot)universality(at)gmail(dot)com
Subject: Re: RFC: Logging plan of the running query
Date: 2024-02-22 04:06:06
Message-ID: CAExHW5tq=tMqNsse5QjXxcZshopra7xD46_1ctcREDqAfPXP6w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Feb 22, 2024 at 6:25 AM James Coleman <jtc331(at)gmail(dot)com> wrote:
> >
> > ...I think the current approach is just plain dead, because of this
> > issue. We can't take an approach that creates an unbounded number of
> > unclear reentrancy issues and then insert hacks one by one to cure
> > them (or hack around them, more to the point) as they're discovered.
> >
> > The premise has to be that we only allow logging the query plan at
> > points where we know it's safe, rather than, as at present, allowing
> > it in places that are unsafe and then trying to compensate with code
> > elsewhere. That's not likely to ever be as stable as we want
> > PostgreSQL to be.
>
> This is potentially a bit of a wild idea, but I wonder if having some
> kind of argument to CHECK_FOR_INTERRUPTS() signifying we're in
> "normal" as opposed to "critical" (using that word differently than
> the existing critical sections) would be worth it.

My hunch is that this will end up being a maintenance burden, since every
caller has to decide (carefully) whether the call happens under normal
conditions or not. Developers will tend to take the safe approach and
flag calls as critical. But importantly, what's normal for one
interrupt action may be critical for another and vice versa. The approach
would be useful depending upon how easy it is to comprehend the
definition of "normal".

If a query executes for longer than a user-defined threshold (a
session-level GUC? or the same value as the auto_explain parameter), the
executor proactively prepares an EXPLAIN output and keeps it handy in
case it is asked for. It can do so at a "known" safe place and time,
rather than at a random time and location. The extra time spent creating
the EXPLAIN output may not be noticeable in a long-running query. The
EXPLAIN output could be saved in pg_stat_activity or a similar place.
This would avoid signaling the backend.
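
A minimal standalone sketch of that idea (mine; the names and the plan
string are invented): the capture happens only at known-safe points once
the threshold has passed, and serving the plan later is just a cheap read.

#include <stdio.h>
#include <time.h>

static char saved_plan[256];    /* stands in for a slot in pg_stat_activity */

static void
maybe_capture_plan(time_t query_start, int threshold_secs)
{
    /* called only at known-safe points in the executor */
    if (saved_plan[0] == '\0' && time(NULL) - query_start >= threshold_secs)
        snprintf(saved_plan, sizeof(saved_plan), "Seq Scan on t (...)");
}

int
main(void)
{
    time_t      query_start = time(NULL) - 10;  /* pretend: running 10 s */

    maybe_capture_plan(query_start, 5);         /* threshold exceeded */
    printf("plan on request: %s\n", saved_plan);    /* no signal needed */
    return 0;
}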

--
Best Wishes,
Ashutosh Bapat


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: James Coleman <jtc331(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Étienne BERSAC <etienne(dot)bersac(at)dalibo(dot)com>, ashutosh(dot)bapat(dot)oss(at)gmail(dot)com, rafaelthca(at)gmail(dot)com, jian(dot)universality(at)gmail(dot)com
Subject: Re: RFC: Logging plan of the running query
Date: 2024-02-23 04:52:32
Message-ID: CA+TgmoYkztFSryutQMNv9DnQ__NDwGE8XhhgnbdCG8e_o4T7yQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Feb 22, 2024 at 6:25 AM James Coleman <jtc331(at)gmail(dot)com> wrote:
> This is potentially a bit of a wild idea, but I wonder if having some
> kind of argument to CHECK_FOR_INTERRUPTS() signifying we're in
> "normal" as opposed to "critical" (using that word differently than
> the existing critical sections) would be worth it.

It's worth considering, but the definition of "normal" vs. "critical"
might be hard to pin down. Or, we might end up with a definition that
is specific to this particular case and not generalizable to others.

--
Robert Haas
EDB: http://www.enterprisedb.com


From: Julien Rouhaud <rjuju123(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: James Coleman <jtc331(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Étienne BERSAC <etienne(dot)bersac(at)dalibo(dot)com>, ashutosh(dot)bapat(dot)oss(at)gmail(dot)com, rafaelthca(at)gmail(dot)com, jian(dot)universality(at)gmail(dot)com
Subject: Re: RFC: Logging plan of the running query
Date: 2024-02-23 14:20:51
Message-ID: hzicr3kidnmwbcqwmwortldzq4fijrfca7c3prncoepgt6rpwf@uydrxvi7qhpk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

On Fri, Feb 23, 2024 at 10:22:32AM +0530, Robert Haas wrote:
> On Thu, Feb 22, 2024 at 6:25 AM James Coleman <jtc331(at)gmail(dot)com> wrote:
> > This is potentially a bit of a wild idea, but I wonder if having some
> > kind of argument to CHECK_FOR_INTERRUPTS() signifying we're in
> > "normal" as opposed to "critical" (using that word differently than
> > the existing critical sections) would be worth it.
>
> It's worth considering, but the definition of "normal" vs. "critical"
> might be hard to pin down. Or, we might end up with a definition that
> is specific to this particular case and not generalizable to others.

But it doesn't have to be all or nothing, right? I mean, each call could say
what the situation is like in its context, like
CHECK_FOR_INTERRUPTS(GUARANTEE_NO_HEAVYWEIGHT_LOCK | GUARANTEE_WHATEVER), and
we could slowly tag calls as needed, similarly to how we already add CFIs
based on user reports.


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Julien Rouhaud <rjuju123(at)gmail(dot)com>
Cc: James Coleman <jtc331(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Étienne BERSAC <etienne(dot)bersac(at)dalibo(dot)com>, ashutosh(dot)bapat(dot)oss(at)gmail(dot)com, rafaelthca(at)gmail(dot)com, jian(dot)universality(at)gmail(dot)com
Subject: Re: RFC: Logging plan of the running query
Date: 2024-02-23 15:23:22
Message-ID: CA+Tgmoa6QniTueRJ1rKcp_-TpVc0c8FZxOtu4ehDop+R=pEcpA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Feb 23, 2024 at 7:50 PM Julien Rouhaud <rjuju123(at)gmail(dot)com> wrote:
> On Fri, Feb 23, 2024 at 10:22:32AM +0530, Robert Haas wrote:
> > On Thu, Feb 22, 2024 at 6:25 AM James Coleman <jtc331(at)gmail(dot)com> wrote:
> > > This is potentially a bit of a wild idea, but I wonder if having some
> > > kind of argument to CHECK_FOR_INTERRUPTS() signifying we're in
> > > "normal" as opposed to "critical" (using that word differently than
> > > the existing critical sections) would be worth it.
> >
> > It's worth considering, but the definition of "normal" vs. "critical"
> > might be hard to pin down. Or, we might end up with a definition that
> > is specific to this particular case and not generalizable to others.
>
> But it doesn't have to be all or nothing right? I mean each call could say
> what the situation is like in their context, like
> CHECK_FOR_INTERRUPTS(GUARANTEE_NO_HEAVYWEIGHT_LOCK | GUARANTEE_WHATEVER), and
> slowly tag calls as needed, similarly to how we add already CFI based on users
> report.

Absolutely. My gut feeling is that it's going to be simpler to pick a
small number of places that are safe and sufficient for this
particular feature and add an extra call there, similar to how we do
vacuum_delay_point(). The reason I think that's likely to be better is
that it will likely require changing only a relatively small number of
places. If we instead start annotating CFIs, well, we've got hundreds
of those. That's a lot more to change, and it also inconveniences
third-party extension authors and people doing back-patching. I'm not
here to say it can't work; I just think it's likely not the easiest
path.

--
Robert Haas
EDB: http://www.enterprisedb.com


From: James Coleman <jtc331(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Julien Rouhaud <rjuju123(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Étienne BERSAC <etienne(dot)bersac(at)dalibo(dot)com>, ashutosh(dot)bapat(dot)oss(at)gmail(dot)com, rafaelthca(at)gmail(dot)com, jian(dot)universality(at)gmail(dot)com
Subject: Re: RFC: Logging plan of the running query
Date: 2024-02-24 13:56:41
Message-ID: CAAaqYe9sS+FXZBugNEEjb89XLny8KGpfhrVHdsOug2yOeC3now@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Feb 23, 2024 at 10:23 AM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> On Fri, Feb 23, 2024 at 7:50 PM Julien Rouhaud <rjuju123(at)gmail(dot)com> wrote:
> > On Fri, Feb 23, 2024 at 10:22:32AM +0530, Robert Haas wrote:
> > > On Thu, Feb 22, 2024 at 6:25 AM James Coleman <jtc331(at)gmail(dot)com> wrote:
> > > > This is potentially a bit of a wild idea, but I wonder if having some
> > > > kind of argument to CHECK_FOR_INTERRUPTS() signifying we're in
> > > > "normal" as opposed to "critical" (using that word differently than
> > > > the existing critical sections) would be worth it.
> > >
> > > It's worth considering, but the definition of "normal" vs. "critical"
> > > might be hard to pin down. Or, we might end up with a definition that
> > > is specific to this particular case and not generalizable to others.
> >
> > But it doesn't have to be all or nothing right? I mean each call could say
> > what the situation is like in their context, like
> > CHECK_FOR_INTERRUPTS(GUARANTEE_NO_HEAVYWEIGHT_LOCK | GUARANTEE_WHATEVER), and
> > slowly tag calls as needed, similarly to how we add already CFI based on users
> > report.
>
> Absolutely. My gut feeling is that it's going to be simpler to pick a
> small number of places that are safe and sufficient for this
> particular feature and add an extra call there, similar to how we do
> vacuum_delay_point(). The reason I think that's likely to be better is
> that it will likely require changing only a relatively small number of
> places. If we instead start annotating CFIs, well, we've got hundreds
> of those. That's a lot more to change, and it also inconveniences
> third-party extension authors and people doing back-patching. I'm not
> here to say it can't work; I just think it's likely not the easiest
> path.

Yes, I suspect it's not the easiest path. I have a small hunch it
might end up paying more dividends in the future: there isn't just one
of these things that is regularly a thorny discussion for the same
reasons each time (basically "no way to trigger this safely from
another backend interrupting another one at an arbitrary point"), and
if we were able to generalize a solution we may have multiple wins (a
very obvious one in my mind is the inability of auto explain to run an
explain at the precise time it's most useful: when statement timeout
fires).

But it's also possible there are simpler ways that get us more than
just this scenario, so I might be wrong; it's merely a hunch.

Regards,
James Coleman


From: Julien Rouhaud <rjuju123(at)gmail(dot)com>
To: James Coleman <jtc331(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Étienne BERSAC <etienne(dot)bersac(at)dalibo(dot)com>, ashutosh(dot)bapat(dot)oss(at)gmail(dot)com, rafaelthca(at)gmail(dot)com, jian(dot)universality(at)gmail(dot)com
Subject: Re: RFC: Logging plan of the running query
Date: 2024-02-25 11:30:37
Message-ID: ryv7bxrkuycet2rm6yh2igcfdm6cpza3bmu6zlpdx4x5mvgd2r@yjys5ktjz7hv
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, Feb 24, 2024 at 08:56:41AM -0500, James Coleman wrote:
> On Fri, Feb 23, 2024 at 10:23 AM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> >
> > On Fri, Feb 23, 2024 at 7:50 PM Julien Rouhaud <rjuju123(at)gmail(dot)com> wrote:
> > >
> > > But it doesn't have to be all or nothing right? I mean each call could say
> > > what the situation is like in their context, like
> > > CHECK_FOR_INTERRUPTS(GUARANTEE_NO_HEAVYWEIGHT_LOCK | GUARANTEE_WHATEVER), and
> > > slowly tag calls as needed, similarly to how we add already CFI based on users
> > > report.
> >
> > Absolutely. My gut feeling is that it's going to be simpler to pick a
> > small number of places that are safe and sufficient for this
> > particular feature and add an extra call there, similar to how we do
> > vacuum_delay_point(). The reason I think that's likely to be better is
> > that it will likely require changing only a relatively small number of
> > places. If we instead start annotating CFIs, well, we've got hundreds
> > of those. That's a lot more to change, and it also inconveniences
> > third-party extension authors and people doing back-patching. I'm not
> > here to say it can't work; I just think it's likely not the easiest
> > path.
>
> Yes, I suspect it's not the easiest path. I have a small hunch it
> might end up paying more dividends in the future: there isn't just one
> of these things that is regularly a thorny discussion for the same
> reasons each time (basically "no way to trigger this safely from
> another backend interrupting another one at an arbitrary point"), and
> if we were able to generalize a solution we may have multiple wins (a
> very obvious one in my mind is the inability of auto explain to run an
> explain at the precise time it's most useful: when statement timeout
> fires).

Yeah, trying to find a generalized solution seems worth investing some
time in, to avoid having a bunch of CHECK_FOR_XXX() calls scattered in the
code a few years down the road.

I might be missing something, but since we already have a ton of macro hacks,
why not add another to allow CFI() overloading rather than modifying every
single call? Something like the following should do the trick (CFIFlagHandler()
is just a naive example using a function call to avoid multiple evaluation;
it should be replaced with anything that requires more than 10 seconds of
thought):

#define CHECK_FOR_INTERRUPTS_0() \
do { \
    if (INTERRUPTS_PENDING_CONDITION()) \
        ProcessInterrupts(); \
} while(0)

#define CHECK_FOR_INTERRUPTS_1(f) \
do { \
    if (INTERRUPTS_PENDING_CONDITION()) \
        ProcessInterrupts(); \
\
    CFIFlagHandler(f); \
} while(0)

#define CHECK_FOR_INTERRUPTS_X(x, f, CFI_IMPL, ...) CFI_IMPL

#define CHECK_FOR_INTERRUPTS(...) \
    CHECK_FOR_INTERRUPTS_X(, ##__VA_ARGS__, \
                           CHECK_FOR_INTERRUPTS_1(__VA_ARGS__), \
                           CHECK_FOR_INTERRUPTS_0(__VA_ARGS__) \
    )

We would have to duplicate the current CFI body, but it should never really
change, and we shouldn't end up with more than 2 overloads anyway so I don't
see it being much of a problem.
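
For illustration, a minimal sketch of how the dispatch resolves (the flag
name is just the hypothetical one from earlier in the thread, not an
actual PostgreSQL symbol):

    /* no argument: ##__VA_ARGS__ swallows the comma, so
     * CHECK_FOR_INTERRUPTS_0() lands in the CFI_IMPL slot */
    CHECK_FOR_INTERRUPTS();

    /* one argument: everything shifts one slot to the right, so
     * CHECK_FOR_INTERRUPTS_1(GUARANTEE_NO_HEAVYWEIGHT_LOCK) is selected */
    CHECK_FOR_INTERRUPTS(GUARANTEE_NO_HEAVYWEIGHT_LOCK);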


From: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
To: Julien Rouhaud <rjuju123(at)gmail(dot)com>
Cc: James Coleman <jtc331(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Étienne BERSAC <etienne(dot)bersac(at)dalibo(dot)com>, rafaelthca(at)gmail(dot)com, jian(dot)universality(at)gmail(dot)com
Subject: Re: RFC: Logging plan of the running query
Date: 2024-02-26 06:49:42
Message-ID: CAExHW5u6UFMHbZd2KW=Ri=6FedUPD75+rKfzmDr804F6ho9Yig@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, Feb 25, 2024 at 5:00 PM Julien Rouhaud <rjuju123(at)gmail(dot)com> wrote:
>
> On Sat, Feb 24, 2024 at 08:56:41AM -0500, James Coleman wrote:
> > On Fri, Feb 23, 2024 at 10:23 AM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> > >
> > > On Fri, Feb 23, 2024 at 7:50 PM Julien Rouhaud <rjuju123(at)gmail(dot)com> wrote:
> > > >
> > > > But it doesn't have to be all or nothing right? I mean each call could say
> > > > what the situation is like in their context, like
> > > > CHECK_FOR_INTERRUPTS(GUARANTEE_NO_HEAVYWEIGHT_LOCK | GUARANTEE_WHATEVER), and
> > > > slowly tag calls as needed, similarly to how we already add CFIs based on user
> > > > reports.

That has some potential ...

>
> Yeah, trying to find a generalized solution seems worth investing some
> time in, to avoid having a bunch of CHECK_FOR_XXX() calls scattered in the code a
> few years down the road.
>
> I might be missing something, but since we already have a ton of macro hacks,
> why not add another to allow CFI() overloading rather than modifying every
> single call? Something like this should do the trick (CFIFlagHandler() is
> just a naive example using a function call to avoid multiple evaluation; it
> should be replaced with anything that requires more than 10 seconds of thought):
>
> #define CHECK_FOR_INTERRUPTS_0() \
>     do { \
>         if (INTERRUPTS_PENDING_CONDITION()) \
>             ProcessInterrupts(); \
>     } while(0)
>
> #define CHECK_FOR_INTERRUPTS_1(f) \
>     do { \
>         if (INTERRUPTS_PENDING_CONDITION()) \
>             ProcessInterrupts(); \
>         \
>         CFIFlagHandler(f); \
>     } while(0)

From your earlier description I thought you were talking about flags
that can be ORed. We need only the two macros above. Why are we calling
CFIFlagHandler() after ProcessInterrupts()? Shouldn't we pass flags to
ProcessInterrupts() itself?

>
> #define CHECK_FOR_INTERRUPTS_X(x, f, CFI_IMPL, ...) CFI_IMPL
>
> #define CHECK_FOR_INTERRUPTS(...) \
>     CHECK_FOR_INTERRUPTS_X(, ##__VA_ARGS__, \
>         CHECK_FOR_INTERRUPTS_1(__VA_ARGS__), \
>         CHECK_FOR_INTERRUPTS_0(__VA_ARGS__) \
>     )
>
> We would have to duplicate the current CFI body, but it should never really
> change, and we shouldn't end up with more than 2 overloads anyway so I don't
> see it being much of a problem.

Why do we need this complex macro?

--
Best Wishes,
Ashutosh Bapat


From: Julien Rouhaud <rjuju123(at)gmail(dot)com>
To: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
Cc: James Coleman <jtc331(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Étienne BERSAC <etienne(dot)bersac(at)dalibo(dot)com>, rafaelthca(at)gmail(dot)com, jian(dot)universality(at)gmail(dot)com
Subject: Re: RFC: Logging plan of the running query
Date: 2024-02-26 07:55:04
Message-ID: owy4bhxatr6gucxb4fhitkikvfcfubrw2rjhuo3q24yfge2m6w@nmofutbx5ltw
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Feb 26, 2024 at 12:19:42PM +0530, Ashutosh Bapat wrote:
> On Sun, Feb 25, 2024 at 5:00 PM Julien Rouhaud <rjuju123(at)gmail(dot)com> wrote:
> >
> > > > On Fri, Feb 23, 2024 at 7:50 PM Julien Rouhaud <rjuju123(at)gmail(dot)com> wrote:
> > > > >
> > > > > But it doesn't have to be all or nothing right? I mean each call could say
> > > > > what the situation is like in their context, like
> > > > > CHECK_FOR_INTERRUPTS(GUARANTEE_NO_HEAVYWEIGHT_LOCK | GUARANTEE_WHATEVER), and
> > > > > slowly tag calls as needed, similarly to how we already add CFIs based on user
> > > > > reports.
>
> That has some potential ...
>
> >
> > I might be missing something, but since we already have a ton of macro hacks,
> > why not add another to allow CFI() overloading rather than modifying every
> > single call? Something like this should do the trick (CFIFlagHandler() is
> > just a naive example using a function call to avoid multiple evaluation; it
> > should be replaced with anything that requires more than 10 seconds of thought):
> >
> > #define CHECK_FOR_INTERRUPTS_0() \
> >     do { \
> >         if (INTERRUPTS_PENDING_CONDITION()) \
> >             ProcessInterrupts(); \
> >     } while(0)
> >
> > #define CHECK_FOR_INTERRUPTS_1(f) \
> >     do { \
> >         if (INTERRUPTS_PENDING_CONDITION()) \
> >             ProcessInterrupts(); \
> >         \
> >         CFIFlagHandler(f); \
> >     } while(0)
>
> From your earlier description I thought you were talking about flags
> that can be ORed. We need only the two macros above. Why are we calling
> CFIFlagHandler() after ProcessInterrupts()? Shouldn't we pass flags to
> ProcessInterrupts() itself?

Yes, I'm still talking about ORed flags passed to CFI(). That CFIFlagHandler()
call is just an example of a generalized function that would act on those
flags, rather than having the handling coded inside the macro.
>
> >
> > #define CHECK_FOR_INTERRUPTS_X(x, f, CFI_IMPL, ...) CFI_IMPL
> >
> > #define CHECK_FOR_INTERRUPTS(...) \
> >     CHECK_FOR_INTERRUPTS_X(, ##__VA_ARGS__, \
> >         CHECK_FOR_INTERRUPTS_1(__VA_ARGS__), \
> >         CHECK_FOR_INTERRUPTS_0(__VA_ARGS__) \
> >     )
> >
> > We would have to duplicate the current CFI body, but it should never really
> > change, and we shouldn't end up with more than 2 overloads anyway so I don't
> > see it being much of a problem.
>
> Why do we need this complex macro?

So that client code can use either CHECK_FOR_INTERRUPTS() or
CHECK_FOR_INTERRUPTS(flag) rather than transforming every single
CHECK_FOR_INTERRUPTS() to CHECK_FOR_INTERRUPTS(0), which was Robert's
complaint about this approach.


From: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
To: Julien Rouhaud <rjuju123(at)gmail(dot)com>
Cc: James Coleman <jtc331(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Étienne BERSAC <etienne(dot)bersac(at)dalibo(dot)com>, rafaelthca(at)gmail(dot)com, jian(dot)universality(at)gmail(dot)com
Subject: Re: RFC: Logging plan of the running query
Date: 2024-02-26 08:15:08
Message-ID: CAExHW5t49R=J6Tf=yctvi7M07-Z-o1vY68hy_Ub2aF=kbTXeeQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Feb 26, 2024 at 1:25 PM Julien Rouhaud <rjuju123(at)gmail(dot)com> wrote:
> >
> > >
> > > #define CHECK_FOR_INTERRUPTS_X(x, f, CFI_IMPL, ...) CFI_IMPL
> > >
> > > #define CHECK_FOR_INTERRUPTS(...) \
> > >     CHECK_FOR_INTERRUPTS_X(, ##__VA_ARGS__, \
> > >         CHECK_FOR_INTERRUPTS_1(__VA_ARGS__), \
> > >         CHECK_FOR_INTERRUPTS_0(__VA_ARGS__) \
> > >     )
> > >
> > > We would have to duplicate the current CFI body, but it should never really
> > > change, and we shouldn't end up with more than 2 overloads anyway so I don't
> > > see it being much of a problem.
> >
> > Why do we need this complex macro?
>
> So that client code can use either CHECK_FOR_INTERRUPTS() or
> CHECK_FOR_INTERRUPTS(flag) rather than transforming every single
> CHECK_FOR_INTERRUPTS() to CHECK_FOR_INTERRUPTS(0), which was Robert's
> complaint about this approach.

It might be better to just create two macros (with names like
CHECK_FOR_INTERRUPTS() and CHECK_FOR_INTERRUPTS_SAFE()) that call
ProcessInterrupts() directly, and modify ProcessInterrupts() to accept
the flag (and, if required, call CFIFlagHandler() additionally). The last
macro is hard to understand.
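
A minimal sketch of that two-macro alternative, assuming ProcessInterrupts()
were changed to take a flags argument (the names here are illustrative only):

    #define CHECK_FOR_INTERRUPTS() \
        do { \
            if (INTERRUPTS_PENDING_CONDITION()) \
                ProcessInterrupts(0); \
        } while(0)

    #define CHECK_FOR_INTERRUPTS_SAFE(flags) \
        do { \
            if (INTERRUPTS_PENDING_CONDITION()) \
                ProcessInterrupts(flags); \
        } while(0)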

--
Best Wishes,
Ashutosh Bapat


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Julien Rouhaud <rjuju123(at)gmail(dot)com>
Cc: James Coleman <jtc331(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Étienne BERSAC <etienne(dot)bersac(at)dalibo(dot)com>, ashutosh(dot)bapat(dot)oss(at)gmail(dot)com, rafaelthca(at)gmail(dot)com, jian(dot)universality(at)gmail(dot)com
Subject: Re: RFC: Logging plan of the running query
Date: 2024-02-26 08:26:44
Message-ID: CA+TgmoZ-YjXougke=jej09RW2ZWJ8+hfVuBwzQvmjcD9OnObLg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, Feb 25, 2024 at 5:00 PM Julien Rouhaud <rjuju123(at)gmail(dot)com> wrote:
> Yeah, trying to find a generalized solution seems worth investing some
> time in, to avoid having a bunch of CHECK_FOR_XXX() calls scattered in the code a
> few years down the road.

I just don't really see how to do it. I suspect that every task that
wants to run at some CFIs but not others is going to have slightly
different requirements, and we probably can't foresee what all of
those requirements are.

Said another way, if in the future we want to call
DoSomethingOrOther() from the CFI handler, well then we need to know
that we're not already in the middle of using any subsystem that
DoSomethingOrOther() might also try to use ... and we also need to
know that we're not in the middle of doing anything that's more
critical than DoSomethingOrOther(). But both of these are likely to
vary in each case.

EXPLAIN might be one member of a general class of things that require
catalog access (and thus might take locks or lwlocks, access the
catalogs, trigger invalidations, etc.) but it's not clear how general
that class really is. Also, I think if we try to do too many different
kinds of things at CFIs, the whole thing is going to fall apart.
You'll end up failing to foresee some interactions, or the stack will
get too deep, or ... something.

--
Robert Haas
EDB: http://www.enterprisedb.com


From: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Julien Rouhaud <rjuju123(at)gmail(dot)com>, James Coleman <jtc331(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Étienne BERSAC <etienne(dot)bersac(at)dalibo(dot)com>, ashutosh(dot)bapat(dot)oss(at)gmail(dot)com, rafaelthca(at)gmail(dot)com, jian(dot)universality(at)gmail(dot)com
Subject: Re: RFC: Logging plan of the running query
Date: 2024-02-26 12:01:45
Message-ID: 2c9a3fd390d653f1829b795d0bad4653@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2024-02-24 00:23, Robert Haas wrote:
> On Fri, Feb 23, 2024 at 7:50 PM Julien Rouhaud <rjuju123(at)gmail(dot)com>
> wrote:
>> On Fri, Feb 23, 2024 at 10:22:32AM +0530, Robert Haas wrote:
>> > On Thu, Feb 22, 2024 at 6:25 AM James Coleman <jtc331(at)gmail(dot)com> wrote:
>> > > This is potentially a bit of a wild idea, but I wonder if having some
>> > > kind of argument to CHECK_FOR_INTERRUPTS() signifying we're in
>> > > "normal" as opposed to "critical" (using that word differently than
>> > > the existing critical sections) would be worth it.
>> >
>> > It's worth considering, but the definition of "normal" vs. "critical"
>> > might be hard to pin down. Or, we might end up with a definition that
>> > is specific to this particular case and not generalizable to others.
>>
>> But it doesn't have to be all or nothing right? I mean each call
>> could say
>> what the situation is like in their context, like
>> CHECK_FOR_INTERRUPTS(GUARANTEE_NO_HEAVYWEIGHT_LOCK |
>> GUARANTEE_WHATEVER), and
>> slowly tag calls as needed, similarly to how we already add CFIs based
>> on user
>> reports.
>
> Absolutely. My gut feeling is that it's going to be simpler to pick a
> small number of places that are safe and sufficient for this
> particular feature and add an extra call there

Hmm, whether we extend CHECK_FOR_INTERRUPTS() or add extra calls
directly, I'm currently not sure where the good 'places' are, i.e. ones
that don't cause a performance impact.

In the attached PoC patch, I experimentally added extra calls in
ExecScanFetch(), which is called less often than ExecProcNode() [1].
When running a sequential scan on pgbench_accounts, which fits in memory,
there seems to be a performance degradation.

- Executed "select * from pgbench_accounts" for 20 times
- Compared the elapsed time between the patch applied and not applied
on 874d817baa160ca7e68
- there were no heap_blks_read during the query
- pgbench_accounts has 3000000 rows

patch NOT applied:
- average: 335.88 ms
- max: 367.313 ms
- min: 309.609 ms

patch applied:
- average: 342.57 ms
- max: 380.099 ms
- min: 324.270 ms

It would be nice if there was a place accessed once every few seconds or
so..

[1]
/message-id/20240215185911.v4o6fo444md6a3w7%40awork3.anarazel.de

--
Regards,

--
Atsushi Torikoshi
NTT DATA Group Corporation

Attachment Content-Type Size
v37-0001-Add-function-to-log-the-plan-of-the-query.patch text/x-diff 32.4 KB

From: Julien Rouhaud <rjuju123(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: James Coleman <jtc331(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Étienne BERSAC <etienne(dot)bersac(at)dalibo(dot)com>, ashutosh(dot)bapat(dot)oss(at)gmail(dot)com, rafaelthca(at)gmail(dot)com, jian(dot)universality(at)gmail(dot)com
Subject: Re: RFC: Logging plan of the running query
Date: 2024-02-26 13:54:41
Message-ID: ZdyYIXZDEsAuKiw9@jrouhaud
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Feb 26, 2024 at 01:56:44PM +0530, Robert Haas wrote:
> On Sun, Feb 25, 2024 at 5:00 PM Julien Rouhaud <rjuju123(at)gmail(dot)com> wrote:
> > Yeah, trying to find a generalized solution seems worth investing some
> > time in, to avoid having a bunch of CHECK_FOR_XXX() calls scattered in the code a
> > few years down the road.
>
> I just don't really see how to do it. I suspect that every task that
> wants to run at some CFIs but not others is going to have slightly
> different requirements, and we probably can't foresee what all of
> those requirements are.
>
> Said another way, if in the future we want to call
> DoSomethingOrOther() from the CFI handler, well then we need to know
> that we're not already in the middle of using any subsystem that
> DoSomethingOrOther() might also try to use ... and we also need to
> know that we're not in the middle of doing anything that's more
> critical than DoSomethingOrOther(). But both of these are likely to
> vary in each case.
>
> EXPLAIN might be one member of a general class of things that require
> catalog access (and thus might take locks or lwlocks, access the
> catalogs, trigger invalidations, etc.) but it's not clear how general
> that class really is. Also, I think if we try to do too many different
> kinds of things at CFIs, the whole thing is going to fall apart.
> You'll end up failing to foresee some interactions, or the stack will
> get too deep, or ... something.

I still fail to understand your point. So you say we might want to check for
safe conditions to run EXPLAIN or DoSomethingOrOther or even DoMeh, each with
different requirements. IIUC what you're saying is that we should have
CHECK_FOR_EXPLAIN(), CHECK_FOR_DOSOMETHINGOROTHER() and CHECK_FOR_MEH()?

And so in some code path A we could have

CHECK_FOR_INTERRUPTS();
CHECK_FOR_EXPLAIN();

In another

CHECK_FOR_INTERRUPTS();
CHECK_FOR_DOSOMETHINGOROTHER();

and in one happy path

CHECK_FOR_INTERRUPTS();
CHECK_FOR_EXPLAIN();
CHECK_FOR_DOSOMETHINGOROTHER();
CHECK_FOR_MEH();

Or that we should still have all of those but they shouldn't be anywhere close
to a CHECK_FOR_INTERRUPTS(), or something totally different?

In the first case, I'm not sure why having CHECK_FOR_INTERRUPTS(EXPLAIN|...),
with a combination of flags and with the flag handling being done after
ProcessInterrupts(), would be any different apart from having fewer boilerplate
lines. Similarly for the 2nd case, but relying on a single more general
CHECK_FOR_CONDITION(EXPLAIN | ...) rather than N CHECK_FOR_XXX?

If you just want something totally different then sure.
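
For comparison, a minimal sketch of that single generalized form (all names
here are hypothetical, just to make the two options concrete):

    /* each bit declares a condition the call site considers safe to handle */
    #define CFC_EXPLAIN              0x01
    #define CFC_DOSOMETHINGOROTHER   0x02
    #define CFC_MEH                  0x04

    #define CHECK_FOR_CONDITION(flags) \
        do { \
            if (INTERRUPTS_PENDING_CONDITION()) \
                ProcessInterrupts(); \
            HandlePendingConditions(flags); \
        } while(0)

A call site would then write CHECK_FOR_CONDITION(CFC_EXPLAIN | CFC_MEH)
instead of stacking N separate CHECK_FOR_XXX() lines.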


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
Cc: Julien Rouhaud <rjuju123(at)gmail(dot)com>, James Coleman <jtc331(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Étienne BERSAC <etienne(dot)bersac(at)dalibo(dot)com>, ashutosh(dot)bapat(dot)oss(at)gmail(dot)com, rafaelthca(at)gmail(dot)com, jian(dot)universality(at)gmail(dot)com
Subject: Re: RFC: Logging plan of the running query
Date: 2024-02-28 06:18:32
Message-ID: CA+TgmoYC3_KG8i1WfwGKK6+3YAS7Z0wP0gaTUC4ggEQ=1DrEEg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Feb 26, 2024 at 5:31 PM torikoshia <torikoshia(at)oss(dot)nttdata(dot)com> wrote:
> It would be nice if there was a place accessed once every few seconds or
> so..

I think this comment earlier from Andres deserves close attention:

# If we went with something like that approach, I think we'd have to do something
# like redirecting node->ExecProcNode to a wrapper, presumably from within a
# CFI. That wrapper could then implement the explain support, without slowing
# down the normal execution path.

If this is correctly implemented, the overhead in the case where the
feature isn't used should be essentially zero, I believe.

--
Robert Haas
EDB: http://www.enterprisedb.com


From: James Coleman <jtc331(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Étienne BERSAC <etienne(dot)bersac(at)dalibo(dot)com>, ashutosh(dot)bapat(dot)oss(at)gmail(dot)com, rafaelthca(at)gmail(dot)com, jian(dot)universality(at)gmail(dot)com
Subject: Re: RFC: Logging plan of the running query
Date: 2024-03-02 15:46:36
Message-ID: CAAaqYe-7QiAufDmfExAOZKcZTOR5dkuVSMxXRPEooaCaPfFqtA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Feb 28, 2024 at 1:18 AM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> On Mon, Feb 26, 2024 at 5:31 PM torikoshia <torikoshia(at)oss(dot)nttdata(dot)com> wrote:
> > It would be nice if there was a place accessed once every few seconds or
> > so..
>
> I think this comment earlier from Andres deserves close attention:
>
> # If we went with something like that approach, I think we'd have to do something
> # like redirecting node->ExecProcNode to a wrapper, presumably from within a
> # CFI. That wrapper could then implement the explain support, without slowing
> # down the normal execution path.
>
> If this is correctly implemented, the overhead in the case where the
> feature isn't used should be essentially zero, I believe.

If I can rephrase this idea: it's basically "delay this interrupt so
it's handled inline in the next ExecProcNode execution".

That seems pretty promising to me as well.

Regards,
James Coleman


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: James Coleman <jtc331(at)gmail(dot)com>
Cc: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Étienne BERSAC <etienne(dot)bersac(at)dalibo(dot)com>, ashutosh(dot)bapat(dot)oss(at)gmail(dot)com, rafaelthca(at)gmail(dot)com, jian(dot)universality(at)gmail(dot)com
Subject: Re: RFC: Logging plan of the running query
Date: 2024-03-04 18:19:45
Message-ID: CA+TgmoZVYG+Ot=yPg3HAr2H=2fPvQOyahrx5qwaRf5cqkM7PAQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, Mar 2, 2024 at 10:46 AM James Coleman <jtc331(at)gmail(dot)com> wrote:
> If I can rephrase this idea: it's basically "delay this interrupt so
> it's handled inline in the next ExecProcNode execution".

Yes, but it's not just that. It also means that the code which would
handle the interrupt doesn't need to be called at every ExecProcNode.
Only when the interrupt actually arrives do we enable the code that
handles it.
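
A rough sketch of that arm-on-demand idea (the variable and function names
are hypothetical, not taken from the actual patch):

    /* in ProcessInterrupts(), run only when the signal actually arrived */
    if (LogQueryPlanPending && ActiveQueryDesc != NULL)
    {
        LogQueryPlanPending = false;

        /*
         * Swap each node's ExecProcNode for an EXPLAIN-emitting wrapper;
         * until this point the normal execution path pays nothing.
         */
        WrapExecProcNodesWithExplain(ActiveQueryDesc->planstate);
    }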

--
Robert Haas
EDB: http://www.enterprisedb.com


From: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Cc: James Coleman <jtc331(at)gmail(dot)com>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Étienne BERSAC <etienne(dot)bersac(at)dalibo(dot)com>, ashutosh(dot)bapat(dot)oss(at)gmail(dot)com, rafaelthca(at)gmail(dot)com, jian(dot)universality(at)gmail(dot)com, robertmhaas(at)gmail(dot)com
Subject: Re: RFC: Logging plan of the running query
Date: 2024-03-13 05:28:42
Message-ID: ac6c51071316279bf903078cf264c37a@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Feb 16, 2024 at 11:42 PM torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
wrote:
> I'm not so sure about the implementation now, i.e. finding the next
> node
> to be executed from the planstate tree, but I'm going to try this
> approach.

Attached a patch which takes this approach.

- I saw no way to find the next node to be executed from the planstate
tree, so the patch wraps all the ExecProcNode callbacks of the planstate
tree at CHECK_FOR_INTERRUPTS().
- To avoid the overhead of the wrapped function calls, they are unwrapped
at the end of the EXPLAIN code execution.
- I first tried to use ExecSetExecProcNode() for wrapping, but it
'changes' the ExecProcNodeMtd of a node rather than 'adding' some
processing to it. I'm not sure this is the right approach, but the
attached patch adds a new member, ExecProcNodeOriginal, to PlanState to
preserve the original ExecProcNodeMtd (see the sketch below).
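
To make the mechanism concrete, here is a simplified sketch of what such a
wrapper could look like (LogQueryPlan() is a placeholder name; the actual
patch differs in its details):

    static TupleTableSlot *
    ExecProcNodeWithExplain(PlanState *node)
    {
        /* put back the preserved callback so the cost is paid only once */
        node->ExecProcNode = node->ExecProcNodeOriginal;

        LogQueryPlan(node->state);      /* emit the EXPLAIN output */

        /* continue normal execution through the original function */
        return node->ExecProcNode(node);
    }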

Any comments are welcome.

--
Regards,

--
Atsushi Torikoshi
NTT DATA Group Corporation

Attachment Content-Type Size
v38-0001-Add-function-to-log-the-plan-of-the-query.patch text/x-diff 33.5 KB

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, James Coleman <jtc331(at)gmail(dot)com>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Étienne BERSAC <etienne(dot)bersac(at)dalibo(dot)com>, ashutosh(dot)bapat(dot)oss(at)gmail(dot)com, rafaelthca(at)gmail(dot)com, jian(dot)universality(at)gmail(dot)com
Subject: Re: RFC: Logging plan of the running query
Date: 2024-03-13 19:33:02
Message-ID: CA+TgmoaaEQtuope6za=3GSCZ+WJFT4DbF5Cnv0QXFmDnZ_PqFw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Mar 13, 2024 at 1:28 AM torikoshia <torikoshia(at)oss(dot)nttdata(dot)com> wrote:
> - I saw no way to find the next node to be executed from the planstate
> tree, so the patch wraps all the ExecProcNode callbacks of the planstate
> tree at CHECK_FOR_INTERRUPTS().

I don't think it does this correctly, because some node types have
children other than the left and right node. See /* special child
plans */ in ExplainNode().
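
For reference, one existing way to visit every child, including the
"special" ones, is planstate_tree_walker() from nodeFuncs.h, which
ExplainPreScanNode() already uses. A sketch, reusing the patch's
ExecProcNodeOriginal member and a hypothetical wrapper function:

    static bool
    WrapExecProcNodeWalker(PlanState *ps, void *context)
    {
        ps->ExecProcNodeOriginal = ps->ExecProcNode;
        ps->ExecProcNode = ExecProcNodeWithExplain;
        return planstate_tree_walker(ps, WrapExecProcNodeWalker, context);
    }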

But also ... having to wrap the entire plan tree like this seems
pretty awful. I don't really like the idea of a large-scale plan
modification like this in the middle of the query. I also wonder
whether it interacts properly with JIT. But at the same time, I wonder
how you're supposed to avoid it.

Andres, did you have some clever idea for this feature that would
avoid the need to do this?

--
Robert Haas
EDB: http://www.enterprisedb.com


From: jian he <jian(dot)universality(at)gmail(dot)com>
To: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, James Coleman <jtc331(at)gmail(dot)com>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Étienne BERSAC <etienne(dot)bersac(at)dalibo(dot)com>, ashutosh(dot)bapat(dot)oss(at)gmail(dot)com, rafaelthca(at)gmail(dot)com, robertmhaas(at)gmail(dot)com
Subject: Re: RFC: Logging plan of the running query
Date: 2024-03-14 01:02:47
Message-ID: CACJufxFr_7c1nTyBKNK1A_Ks0a7N6zT7VEd=QQ8-W_0jMLp1=g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Mar 13, 2024 at 1:28 PM torikoshia <torikoshia(at)oss(dot)nttdata(dot)com> wrote:
>
> On Fri, Feb 16, 2024 at 11:42 PM torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
> wrote:
> > I'm not so sure about the implementation now, i.e. finding the next
> > node
> > to be executed from the planstate tree, but I'm going to try this
> > approach.
>
> Attached a patch which takes this approach.
>

One minor issue:
I understand that the regress test compares the expected outcome with the
test-run outcome, but can you enlighten me on how you checked that the
change you made in contrib/auto_explain/t/001_auto_explain.pl is correct?
(I am not familiar with Perl.)

diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index cf195f1359..2d06bf297e 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -17,6 +17,8 @@
#include "lib/stringinfo.h"
#include "parser/parse_node.h"
+extern PGDLLIMPORT bool ProcessLogQueryPlanInterruptActive;

diff --git a/src/include/utils/elog.h b/src/include/utils/elog.h
index 054dd2bf62..3c6bd1ea7c 100644
--- a/src/include/utils/elog.h
+++ b/src/include/utils/elog.h
@@ -167,6 +167,7 @@ struct Node;
extern bool message_level_is_interesting(int elevel);
+extern PGDLLIMPORT bool ProcessLogQueryPlanInterruptActive;

utils/elog.h is already included in src/include/postgres.h, so
you don't need to declare ProcessLogQueryPlanInterruptActive in
include/commands/explain.h?
Generally, a variable should be declared only once?


From: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
To: jian he <jian(dot)universality(at)gmail(dot)com>, robertmhaas(at)gmail(dot)com
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, James Coleman <jtc331(at)gmail(dot)com>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Étienne BERSAC <etienne(dot)bersac(at)dalibo(dot)com>, ashutosh(dot)bapat(dot)oss(at)gmail(dot)com, rafaelthca(at)gmail(dot)com
Subject: Re: RFC: Logging plan of the running query
Date: 2024-03-18 05:39:48
Message-ID: b1a41d0d32d70fac75a3ffb7cae8cef7@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2024-03-14 04:33, Robert Haas wrote:
Thanks for your review!

> On Wed, Mar 13, 2024 at 1:28 AM torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
> wrote:
>> - I saw no way to find the next node to be executed from the planstate
>> tree, so the patch wraps all the ExecProcNode callbacks of the planstate
>> tree at CHECK_FOR_INTERRUPTS().
>
> I don't think it does this correctly, because some node types have
> children other than the left and right node. See /* special child
> plans */ in ExplainNode().

Agreed.

> But also ... having to wrap the entire plan tree like this seems
>> pretty awful. I don't really like the idea of a large-scale plan
> modification like this in the middle of the query.

Yeah, but I haven't come up with other ideas to select the appropriate
node to wrap.

> Andres, did you have some clever idea for this feature that would
> avoid the need to do this?

On 2024-03-14 10:02, jian he wrote:
> On Wed, Mar 13, 2024 at 1:28 PM torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
> wrote:
>>
>> On Fri, Feb 16, 2024 at 11:42 PM torikoshia
>> <torikoshia(at)oss(dot)nttdata(dot)com>
>> wrote:
>> > I'm not so sure about the implementation now, i.e. finding the next
>> > node
>> > to be executed from the planstate tree, but I'm going to try this
>> > approach.
>>
>> Attached a patch which takes this approach.
>>
>
> One minor issue:
> I understand that the regress test compares the expected outcome with the
> test-run outcome, but can you enlighten me on how you checked that the
> change you made in contrib/auto_explain/t/001_auto_explain.pl is correct?
> (I am not familiar with Perl.)

$pg_log_plan_query_output saves the output plan of pg_log_query_plan()
and $auto_explain_output saves the output plan of auto_explain.
The test checks that both are the same using cmp_ok().

Did I answer your question?

> diff --git a/src/include/commands/explain.h
> b/src/include/commands/explain.h
> index cf195f1359..2d06bf297e 100644
> --- a/src/include/commands/explain.h
> +++ b/src/include/commands/explain.h
> @@ -17,6 +17,8 @@
> #include "lib/stringinfo.h"
> #include "parser/parse_node.h"
> +extern PGDLLIMPORT bool ProcessLogQueryPlanInterruptActive;
>
> diff --git a/src/include/utils/elog.h b/src/include/utils/elog.h
> index 054dd2bf62..3c6bd1ea7c 100644
> --- a/src/include/utils/elog.h
> +++ b/src/include/utils/elog.h
> @@ -167,6 +167,7 @@ struct Node;
> extern bool message_level_is_interesting(int elevel);
> +extern PGDLLIMPORT bool ProcessLogQueryPlanInterruptActive;
>
> utils/elog.h is already included in src/include/postgres.h, so
> you don't need to declare ProcessLogQueryPlanInterruptActive in
> include/commands/explain.h?
> Generally, a variable should be declared only once?

Yeah, this declaration is not necessary, and we should instead include
commands/explain.h in src/backend/access/transam/xact.c.

--
Regards,

--
Atsushi Torikoshi
NTT DATA Group Corporation


From: Andres Freund <andres(at)anarazel(dot)de>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, James Coleman <jtc331(at)gmail(dot)com>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Étienne BERSAC <etienne(dot)bersac(at)dalibo(dot)com>, ashutosh(dot)bapat(dot)oss(at)gmail(dot)com, rafaelthca(at)gmail(dot)com, jian(dot)universality(at)gmail(dot)com
Subject: Re: RFC: Logging plan of the running query
Date: 2024-03-26 02:34:47
Message-ID: 20240326023447.yj5fi5qcbz5ocyrt@awork3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

On 2024-03-13 15:33:02 -0400, Robert Haas wrote:
> But also ... having to wrap the entire plan tree like this seems
> pretty awful. I don't really like the idea of a large-scale plan
> modification like this in the middle of the query.

It's not great. But I also don't really see an alternative with this approach.

I guess we could invent a new CFI version that takes the current PlanState,
use that in all of src/backend/executor/node*, passing the PlanState to it -
but then we could just as well directly process the interrupt there.
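
A sketch of that variant, just to illustrate (the names are made up):

    /* usable only in src/backend/executor/node*.c, where a PlanState is at hand */
    #define EXEC_CHECK_FOR_INTERRUPTS(node) \
        do { \
            if (INTERRUPTS_PENDING_CONDITION()) \
                ProcessExecutorInterrupts((PlanState *) (node)); \
        } while(0)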

> I also wonder whether it interacts properly with JIT.

I don't think there's a problem unless somebody invests a lot of time in
JITing much more of the query. Which will require a lot more work, basically
redesigning the executor...

> Andres, did you have some clever idea for this feature that would
> avoid the need to do this?

No. I think it's acceptable though.

However, it might be worth inventing an executor tree walker in a preliminary
step. We already have quite a few switches over all plan nodes, which we could
largely replace with a helper.

Greetings,

Andres Freund


From: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, James Coleman <jtc331(at)gmail(dot)com>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Étienne BERSAC <etienne(dot)bersac(at)dalibo(dot)com>, ashutosh(dot)bapat(dot)oss(at)gmail(dot)com, rafaelthca(at)gmail(dot)com, jian(dot)universality(at)gmail(dot)com, andres(at)anarazel(dot)de
Subject: Re: RFC: Logging plan of the running query
Date: 2024-10-28 13:05:03
Message-ID: 8a8f2432cf5f196f2732ce9e4da7fb89@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi

I've recently resumed work on this proposal.

I was revising the tests that use injection points, but given that there
was a recent talk at PGConf.EU about further features based on this
proposal [1], I thought it might be of interest to some people. Therefore,
I'll share a patch that removes the injection-point section for now.

Due to both the lack of an alternative implementation and the following
comment, I’ve taken the approach of wrapping all plan nodes:

On Tue, Mar 26, 2024 at 11:35 AM Andres Freund andres(at)anarazel(dot)de wrote:
>> Andres, did you have some clever idea for this feature that would
>> avoid the need to do this?
> No. I think it's acceptable though.

As the primary change since the last version, I have added wrapping
for child nodes other than the left/right tree nodes, in response
to the feedback in [2].

For nodes other than CustomScanState, I have confirmed via DEBUG logs
that the wrap/unwrap code is being executed. However, I'm still somewhat
uncertain whether CustomScanState should be included in this process.

For testing, I ran the queries below in three sessions concurrently during
make installcheck and encountered no issues.

=# select pg_log_query_plan(pid) from pg_stat_activity;
=# \watch 0.1

[1]
https://www.postgresql.eu/events/pgconfeu2024/sessions/session/5689-debugging-active-queries-with-mid-flight-instrumented-explain-plans/
[2]
/message-id/CA%2BTgmoaaEQtuope6za%3D3GSCZ%2BWJFT4DbF5Cnv0QXFmDnZ_PqFw%40mail.gmail.com

Regards,

--
Atsushi Torikoshi
Seconded from NTT DATA GROUP CORPORATION to SRA OSS K.K.

Attachment Content-Type Size
v39-0001-Add-function-to-log-the-plan-of-the-currently-ru.patch text/x-diff 32.8 KB

From: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, James Coleman <jtc331(at)gmail(dot)com>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Étienne BERSAC <etienne(dot)bersac(at)dalibo(dot)com>, ashutosh(dot)bapat(dot)oss(at)gmail(dot)com, rafaelthca(at)gmail(dot)com, jian(dot)universality(at)gmail(dot)com, andres(at)anarazel(dot)de
Subject: Re: RFC: Logging plan of the running query
Date: 2025-02-03 12:46:29
Message-ID: 2aedcb5e8120bfdee994775e660f8718@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

Rebased the patch since it couldn't be applied anymore.

--
Regards,

--
Atsushi Torikoshi
Seconded from NTT DATA GROUP CORPORATION to SRA OSS K.K.

Attachment Content-Type Size
v40-0001-Add-function-to-log-the-plan-of-the-currently-ru.patch text/x-diff 32.8 KB

From: Akshat Jaimini <destrex271(at)gmail(dot)com>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Cc: Atsushi Torikoshi <torikoshia(at)oss(dot)nttdata(dot)com>
Subject: Re: RFC: Logging plan of the running query
Date: 2025-03-08 15:42:21
Message-ID: 174144854199.294105.9417032645816235284.pgcf@coridan.postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,
I think there is still some problem with the patch. I am not able to apply it to the master branch.

Can you please take another look at it?

Thanks,
Akshat Jaimini


From: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
To: Akshat Jaimini <destrex271(at)gmail(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: RFC: Logging plan of the running query
Date: 2025-03-10 05:10:27
Message-ID: ffdfab063ac6f3d3d97763c9d744b70b@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2025-03-09 00:42, Akshat Jaimini wrote:
> Hi,
> I think there is still some problem with the patch. I am not able to
> apply it to the master branch.
>
> Can you please take another look at it?

Thanks for pointing it out!
Modified it.

BTW the patch adds about 400 lines to explain.c and it may be better to
split the file, as was done in 9173e8b6046, but I'll leave it as is for now.

--
Regards,

--
Atsushi Torikoshi
Seconded from NTT DATA GROUP CORPORATION to SRA OSS K.K.

Attachment Content-Type Size
v41-0001-Add-function-to-log-the-plan-of-the-currently-ru.patch text/x-diff 32.9 KB

From: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Cc: destrex271(at)gmail(dot)com
Subject: Re: RFC: Logging plan of the running query
Date: 2025-03-21 12:40:00
Message-ID: 784224198089cbe3cc3825332f195475@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

Rebased it again.

On 2025-03-10 14:10, torikoshia wrote:
> BTW the patch adds about 400 lines to explain.c and it may be better
> to split the file, as was done in 9173e8b6046, but I'll leave it as is
> for now.

This part remains unchanged.

--
Regards,

--
Atsushi Torikoshi
Seconded from NTT DATA GROUP CORPORATION to SRA OSS K.K.

Attachment Content-Type Size
v42-0001-Add-function-to-log-the-plan-of-the-currently-ru.patch text/x-diff 33.4 KB

From: Sadeq Dousti <msdousti(at)gmail(dot)com>
To: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, Sergey Dudoladov <sergey(dot)dudoladov(at)gmail(dot)com>
Subject: Re: RFC: Logging plan of the running query
Date: 2025-03-31 18:51:28
Message-ID: CADE6Lvi1yJYF8KgK9ofin50bRKxByqSHufxshj0pLC7dfoZD9g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

Sergey and I reviewed this patch and have a few comments, listed below.

> BTW the patch adds about 400 lines to explain.c and it may be better
> to split the file, as was done in 9173e8b6046, but I'll leave it as is
> for now.

1. As rightly described by the OP above, explain.c has grown too big. In
addition, explain.h is included in quite a number of other files.

2. We wanted to entertain the possibility of a GUC variable to enable the
feature. That is, it would be disabled by default, and the DBA could
selectively enable it on the fly. The reasoning is that it can affect some
workloads in an unforeseen way, so this choice can make the next Postgres
release safer. In future releases, we can make the default "enabled",
assuming enough real-world scenarios have proven the feature safe (i.e.,
not affecting the DB performance noticeably).

3. In the email thread for this patch, we saw some attempts to measure the
performance overhead of the feature. We suggest a more rigorous study,
including memory overhead in addition to time overhead. We came to the
conclusion that analytical workloads - with complicated query plans - and a
high query rate are the best candidates for such benchmarks.

4. In PGConf EU 2024, there was a talk by Rafael Castro titled "Debugging
active queries" [1], and he also submitted a recent patch titled "Proposal:
Progressive explain" [2]. We see a lot of similarities in the idea behind
that patch and this one, and would like to suggest joining forces for an
ideal outcome.

[1] https://www.youtube.com/watch?v=6ahTb-7C05c
[2] https://commitfest.postgresql.org/patch/5473/

Best Regards,
Sadeq Dousti & Sergey Dudoladov


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, destrex271(at)gmail(dot)com, Rafael Thofehrn Castro <rafaelthca(at)gmail(dot)com>
Subject: Re: RFC: Logging plan of the running query
Date: 2025-03-31 19:24:51
Message-ID: CA+TgmoYcnAj2WTiNP_wtLTf6DabhF384nR8wvDZ_o8C5AG-K3w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Mar 21, 2025 at 8:40 AM torikoshia <torikoshia(at)oss(dot)nttdata(dot)com> wrote:
> Rebased it again.

Hi,

I apologize for not having noticed this thread sooner. I just became
aware of it as a result of a discussion in the hacking Discord server.
I think this has got a lot of overlap with the progressive EXPLAIN
patch from Rafael Castro, which I have been reviewing, but I am not
sure why we have two different patch sets here. Why is that?

Thanks,

--
Robert Haas
EDB: http://www.enterprisedb.com


From: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>, rafaelthca(at)gmail(dot)com
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, destrex271(at)gmail(dot)com
Subject: Re: RFC: Logging plan of the running query
Date: 2025-04-01 01:43:10
Message-ID: 067675dd73b5485ecf0159d9a986846d@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2025-04-01 04:24, Robert Haas wrote:
> On Fri, Mar 21, 2025 at 8:40 AM torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
> wrote:
>> Rebased it again.
>
> Hi,
>
> I apologize for not having noticed this thread sooner.

Thank you for your checking! No worries.

> I just became
> aware of it as a result of a discussion in the hacking Discord server.

> I think this has got a lot of overlap with the progressive EXPLAIN
> patch from Rafael Castro, which I have been reviewing, but I am not
> sure why we have two different patch sets here. Why is that?

This thread proposes a feature to log the results of a plain EXPLAIN
(without ANALYZE). From Rafael Castro's presentation materials at
pgconf.eu 2024 [1], I understand that he built a new patch on top of
this one, adding functionality to track execution progress and display
results in a view.
He also mentioned that he used this patch's way of wrapping plan nodes
during development to solve a problem [2].
I think that's why there are overlaps between the two.

Previously, Rafael proposed a patch in this thread that added execution
progress tracking. However, I felt that expanding the scope could make
it harder to get the patch reviewed or committed. So, I suggested first
completing a feature that only retrieves the execution plan of a running
query, and then developing execution progress tracking afterward[3].

As far as I remember, he did not respond to this thread after my
suggestion but instead started a new thread later [4]. Based on that, I
assume he did not agree with my proposed approach.

Rafael, please let us know if there are any misunderstandings in the
above.

In this thread, I aimed to output the plan without requiring prior
configuration if possible. However, based on your comment on Rafael's
thread[5], it seems that this approach would be difficult:

One way in which this proposal seems safer than previous proposals is
that previous proposals have involved session A poking session B and
trying to get session B to emit an EXPLAIN on the fly with no prior
setup. That would be very useful, but I think it's more difficult and
more risky than this proposal, where all the configuration happens in
the session that is going to emit the EXPLAIN output.

I was considering whether to introduce a GUC in this patch to allow for
prior setup before outputting the plan or to switch to Rafael's patch
after reviewing its details. However, since there isn’t much time left
before the feature freeze, if you have already reviewed Rafael's patch
and there is a chance it could be committed, it would be better to focus
on that.

[1] https://www.youtube.com/watch?v=6ahTb-7C05c (mentions this patch
after the 12-minute mark)
[2]
/message-id/CAG0ozMo30smtXXOR8bSCbhaZAQHo4%3DezerLitpERk85Q0ga%2BFw%40mail.gmail.com
[3]
/message-id/c161b5e7e1888eb9c9eb182a7d9dcf89%40oss.nttdata.com
[4]
/message-id/flat/CAG0ozMo30smtXXOR8bSCbhaZAQHo4%3DezerLitpERk85Q0ga%2BFw%40mail.gmail.com#4b261994a0d44649ded6e1434a00e134
[5]
/message-id/CA%2BTgmoaD985%2BVLwR93c8PjSaoBqxw72Eu7pfBJcArzhjJ71aRw%40mail.gmail.com

--
Regards,

--
Atsushi Torikoshi
Seconded from NTT DATA GROUP CORPORATION to SRA OSS K.K.


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
Cc: rafaelthca(at)gmail(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org, destrex271(at)gmail(dot)com
Subject: Re: RFC: Logging plan of the running query
Date: 2025-04-01 12:32:04
Message-ID: CA+TgmoZ8dm2-19icz+Lta2+e_DRNb5ZkoOz_8pNDGm49ofEyGg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Mar 31, 2025 at 9:43 PM torikoshia <torikoshia(at)oss(dot)nttdata(dot)com> wrote:
> Previously, Rafael proposed a patch in this thread that added execution
> progress tracking. However, I felt that expanding the scope could make
> it harder to get the patch reviewed or committed. So, I suggested first
> completing a feature that only retrieves the execution plan of a running
> query, and then developing execution progress tracking afterward[3].

That's reasonable. Comparing the two patches, I think that you have
correctly solved a couple of key problems that Rafael did not handle
correctly. Specifically, I believe you have correctly implemented what
Andres described i.e. use CFI to get control to do the wrapping and
then from the ExecProcNode wrapper do the actual EXPLAIN, whereas I
believe Rafael was doing the wrapping directly from the signal
handler, which did not seem safe to me:

http://postgr.es/m/CA+TgmobrzeDep+Z1BPQqGNsCqTQ8M58wNNKJB_8Lwpwbqbz3GQ@mail.gmail.com

I also like your version of (sub)transaction abort handling much
better than the memory-context cleanup that Rafael initially chose.
He's since revised that.

That said, I think the view that Rafael proposes, with a
periodically-updating version of the EXPLAIN ANALYZE output up to that
point, could be extremely useful in some situations. To make that
work, he mostly just hacked InstrStopNode(), although as I'm thinking
about it, that probably doesn't handle all of the parallel query cases
correctly. I'm somewhat inclined to hope that we eventually end up
with both interfaces.

> I was considering whether to introduce a GUC in this patch to allow for
> prior setup before outputting the plan or to switch to Rafael's patch
> after reviewing its details. However, since there isn’t much time left
> before the feature freeze, if you have already reviewed Rafael's patch
> and there is a chance it could be committed, it would be better to focus
> on that.

I wasn't feeling very confident about my ability to get that patch
committed before feature freeze. I don't want to rush into something
that we might later regret. I'm going to spend a bit more time
studying your patch next.

--
Robert Haas
EDB: http://www.enterprisedb.com


From: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: rafaelthca(at)gmail(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org, destrex271(at)gmail(dot)com
Subject: Re: RFC: Logging plan of the running query
Date: 2025-04-01 13:57:44
Message-ID: b6bac5dabf27016f5f2b939fa44e0f2e@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2025-04-01 21:32, Robert Haas wrote:
Thanks for your comment.

> On Mon, Mar 31, 2025 at 9:43 PM torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
> wrote:
>> Previously, Rafael proposed a patch in this thread that added
>> execution
>> progress tracking. However, I felt that expanding the scope could make
>> it harder to get the patch reviewed or committed. So, I suggested
>> first
>> completing a feature that only retrieves the execution plan of a
>> running
>> query, and then developing execution progress tracking afterward[3].
>
> That's reasonable. Comparing the two patches, I think that you have
> correctly solved a couple of key problems that Rafael did not handle
> correctly. Specifically, I believe you have correctly implemented what
> Andres described, i.e. using CFI to get control to do the wrapping and
> then doing the actual EXPLAIN from the ExecProcNode wrapper, whereas I
> believe Rafael was doing the wrapping directly from the signal
> handler, which did not seem safe to me:
>
> http://postgr.es/m/CA+TgmobrzeDep+Z1BPQqGNsCqTQ8M58wNNKJB_8Lwpwbqbz3GQ@mail.gmail.com
>
> I also like your version of (sub)transaction abort handling much
> better than the memory-context cleanup that Rafael initially chose.
> He's since revised that.
>
> That said, I think the view that Rafael proposes, with a
> periodically-updating version of the EXPLAIN ANALYZE output up to that
> point, could be extremely useful in some situations.
+1.

> To make that
> work, he mostly just hacked InstrStopNode(), although as I'm thinking
> about it, that probably doesn't handle all of the parallel query cases
> correctly. I'm somewhat inclined to hope that we eventually end up
> with both interfaces.

>> I was considering whether to introduce a GUC in this patch to allow
>> for
>> prior setup before outputting the plan or to switch to Rafael's patch
>> after reviewing its details. However, since there isn’t much time left
>> before the feature freeze, if you have already reviewed Rafael's patch
>> and there is a chance it could be committed, it would be better to
>> focus
>> on that.
>
> I wasn't feeling very confident about my ability to get that patch
> committed before feature freeze. I don't want to rush into something
> that we might later regret. I'm going to spend a bit more time
> studying your patch next.

That’s really appreciated!
I believe some of the comments in Rafael's thread should be reflected in
this one as well, but I haven’t incorporated them yet. Apologies for
that.

--
Regards,

--
Atsushi Torikoshi
Seconded from NTT DATA GROUP CORPORATION to SRA OSS K.K.


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, destrex271(at)gmail(dot)com
Subject: Re: RFC: Logging plan of the running query
Date: 2025-04-01 18:52:30
Message-ID: CA+TgmoatMzMedKRYAK5s3euxUohd6m00szL4Q6GGQj+NhF813A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Mar 21, 2025 at 8:40 AM torikoshia <torikoshia(at)oss(dot)nttdata(dot)com> wrote:
> Rebased it again.
>
> On 2025-03-10 14:10, torikoshia wrote:
> > BTW the patch adds about 400 lines to explain.c and it may be better
> > to split the file, as was done in 9173e8b6046, but I'll leave it as is
> > for now.
>
> This part remains unchanged.

Looking at ExplainAssembleLogOutput() is making me realize that
auto_explain is in serious need of some cleanup. That's not really the
fault of this patch, but the hack whereby we overwrite the [] that
would have surrounded the JSON output with {} is not very nice. I also
think that the auto_explain GUCs need rethinking. In theory, new
EXPLAIN options should be mirrored into auto_explain, but if you
compare ExplainAssembleLogOutput() to ExplainOnePlan(), you can see
that they are diverging. The PLANNING, SUMMARY, and SERIALIZE options
that are known to regular EXPLAIN aren't known to auto_explain, and
any customizable options that use explain_per_plan_hook won't be able
to work with auto_explain, either. Changing this is non-trivial
because SERIALIZE, for example, can't work the same way for
auto_explain as it does for EXPLAIN, and a full solution might also
require user-visible changes like replacing
auto_explain.log_<whatever> with auto_explain.explain, so I don't
really know. Maybe we should just live with it the way it is for now,
but it doesn't look very nice.

Rafael and I were just discussing off-list to what extent the parallel
query problems with his patch also apply to yours. His design makes
the problem easier to hit, I think, because setting
progressive_explain = on makes every backend attempt to dump a query
plan, whereas you'd have to hit the worker process with a signal at
just the right time. But the fundamental problem appears to be the
same. Another interesting wrinkle is that he settled on showing the
outermost running query whereas you settled on the innermost. If you
were showing the outermost, perhaps you could just prohibit
dump-the-query-plan on a parallel worker, but you don't really want to
prohibit it categorically, because the worker could be running an
inner query that could be dumped. So maybe you want to avoid setting
ActiveQueryDesc at the toplevel and then set/clear it for inner
queries. However, that's a bit weird, too. If we wanted to undertake a
bigger redesign here, we could try pushing the entire query plan
(including all subqueries) down to the worker, and just tell it which
part it's actually supposed to execute. However, that would have some
overhead and would arguably open up some risk of future bugs, since
passing subqueries as NULL is, I think, intended partly as a guard
against accidentally executing the wrong subqueries.

I don't think it's a good idea to put the logic to update
ActiveQueryDesc into ExecutorRun. I think that function should really
just call the hook, or standard_ExecutorRun. I don't know for sure
whether this work should be moved down into standard_ExecutorRun or up into the
caller, but I don't think it should stay where it is.
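
For context, a simplified sketch of the dispatch-only shape being asked for
(modeled on executor/execMain.c; the argument list is abbreviated here):

    void
    ExecutorRun(QueryDesc *queryDesc, ScanDirection direction, uint64 count)
    {
        if (ExecutorRun_hook)
            (*ExecutorRun_hook) (queryDesc, direction, count);
        else
            standard_ExecutorRun(queryDesc, direction, count);
    }

Any ActiveQueryDesc bookkeeping would then live in standard_ExecutorRun()
or in the caller, so extensions overriding the hook see the same behavior.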

My comment about the naming of WrapMultiExecProcNodesWithExplain() on
the other thread also applies here: MultiExecProcNode() is an
unrelated function. Likewise, WrapExecProcNodeWithExplain() misses
walking the node's subplan list. Also, I don't think an
ereport(DEBUG1, ...) is appropriate here.

Do we really need es->signaled? Couldn't we test es->analyze instead?

Do we really need ExecProcNodeOriginal? Can we find some way to reuse
ExecProcNodeReal instead of making the structure bigger?
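
For reference, PlanState already carries two callback fields (from
include/nodes/execnodes.h):

    ExecProcNodeMtd ExecProcNode;       /* function to return next tuple */
    ExecProcNodeMtd ExecProcNodeReal;   /* actual function, if above is a
                                         * wrapper */

so the question is whether the EXPLAIN wrapper could stash the original
callback there instead of adding a third field.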

I do think we should try to move some of this new code out into a
separate source file, but I'm not yet sure what we should call it. We
might want to share infrastructure with Rafael's patch,
which he called progressive EXPLAIN but which is really closer to a
general query progress facility, albeit perhaps too expensive to have
enabled by default.

Does the documentation typically mention when a function is
superuser-only? If so, it should do so here, too, using similar
wording.

It seems unnecessary to remark that you can't log a query plan after a
subtransaction abort, because the query wouldn't be running any more.
It's also true that you can't log a query after a toplevel abort, even
if you're still waiting for the aborted transaction to roll back. But
it also seems like once the aborted subtransaction is popped off the
stack and we return to the parent transaction, we should be able to
log any query running at the outer level. If this is meant to imply
that something doesn't work in this scenario, perhaps there is
something about the design that needs fixing.

Does ActiveQueryDesc really need to be exposed to the whole system?
Could it be a file-level variable?

--
Robert Haas
EDB: http://www.enterprisedb.com


From: Sami Imseih <samimseih(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, destrex271(at)gmail(dot)com
Subject: Re: RFC: Logging plan of the running query
Date: 2025-04-01 19:05:02
Message-ID: CAA5RZ0uTBLX6=M3y6zEmOrQtmS5=UDWD3-=vRFuaLpg1v6bpiw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> Looking at ExplainAssembleLogOutput() is making me realize that
> auto_explain is in serious need of some cleanup. That's not really the
> fault of this patch, but the hack whereby we overwrite the [] that
> would have surrounded the JSON output with {} is not very nice. I also
> think that the auto_explain GUCs need rethinking. In theory, new
> EXPLAIN options should be mirrored into auto_explain, but if you
> compare ExplainAssembleLogOutput() to ExplainOnePlan(), you can see
> that they are diverging. The PLANNING, SUMMARY, and SERIALIZE options
> that are known to regular EXPLAIN aren't known to auto_explain, and
> any customizable options that use explain_per_plan_hook won't be able
> to work with auto_explain, either. Changing this is non-trivial
> because SERIALIZE, for example, can't work the same way for
> auto_explain as it does for EXPLAIN, and a full solution might also
> require user-visible changes like replacing
> auto_explain.log_<whatever> with auto_explain.explain, so I don't
> really know. Maybe we should just live with it the way it is for now,
> but it doesn't look very nice.

FWIW, I have been thinking about auto_explain for another task,
remote plans for fdw [0], and perhaps there are now other good
reasons, some of which you mention, why things could be simplified if
"auto_explain" became a core feature. This could be a proposal taken up in 19.

[0] /message-id/flat/CAP%2BB4TD%3Diy-C2EnsrJgjpwSc7_4pd3Xh-gFzA0bwsw3q8u860g%40mail.gmail.com

--
Sami Imseih
Amazon Web Services (AWS)


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Sami Imseih <samimseih(at)gmail(dot)com>
Cc: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, destrex271(at)gmail(dot)com
Subject: Re: RFC: Logging plan of the running query
Date: 2025-04-01 19:29:27
Message-ID: CA+TgmobQCaYFQyrXLBeTvdeUugCiEcOiBEB2AwVhXfpbMe5fEw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Apr 1, 2025 at 3:05 PM Sami Imseih <samimseih(at)gmail(dot)com> wrote:
> > Looking at ExplainAssembleLogOutput() is making me realize that
> > auto_explain is in serious need of some cleanup. That's not really the
> > fault of this patch, but the hack whereby we overwrite the [] that
> > would have surrounded the JSON output with {} is not very nice. I also
> > think that the auto_explain GUCs need rethinking. In theory, new
> > EXPLAIN options should be mirrored into auto_explain, but if you
> > compare ExplainAssembleLogOutput() to ExplainOnePlan(), you can see
> > that they are diverging. The PLANNING, SUMMARY, and SERIALIZE options
> > that are known to regular EXPLAIN aren't known to auto_explain, and
> > any customizable options that use explain_per_plan_hook won't be able
> > to work with auto_explain, either. Changing this is non-trivial
> > because SERIALIZE, for example, can't work the same way for
> > auto_explain as it does for EXPLAIN, and a full solution might also
> > require user-visible changes like replacing
> > auto_explain.log_<whatever> with auto_explain.explain, so I don't
> > really know. Maybe we should just live with it the way it is for now,
> > but it doesn't look very nice.
>
> FWIW, I have been thinking about auto_explain for another task,
> remote plans for fdw [0], and perhaps there are now other good
> reasons, some of which you mention, why things could be simplified if
> "auto_explain" became a core feature. This could be a proposal taken up in 19.

For what we're talking about here, I don't think we would need to go
that far -- maybe put a few functions in core but no real need to move
the whole module into core. However, I don't rule out that there are
other reasons to do as you suggest.

--
Robert Haas
EDB: http://www.enterprisedb.com


From: Sami Imseih <samimseih(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, destrex271(at)gmail(dot)com
Subject: Re: RFC: Logging plan of the running query
Date: 2025-04-02 16:22:54
Message-ID: CAA5RZ0vyREmzjh4-PgGUSLSvvjR2TZf3g4AjhACepbA5qfrL5Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> > FWIW, I have been thinking about auto_explain for another task,
> > remote plans for fdw [0], and perhaps there are now other good
> > reasons, some of which you mention, why things could be simplified if
> > "auto_explain" became a core feature. This could be a proposal taken up in 19.
>
> For what we're talking about here, I don't think we would need to go
> that far -- maybe put a few functions in core but no real need to move
> the whole module into core. However, I don't rule out that there are
> other reasons to do as you suggest.

This is the first core feature that will allow users to log explain plans for
a live workload. EXPLAIN is not really good for this purpose, so this
proposal is a step in the right direction. I think the appetite for more
plan logging/visibility options will likely increase, and auto_explain-like
features in core will be desired IMO. We will see.

As far as this patch goes, I took a look and I have some comments:

1/
The name of ExplainAssembleLogOutput is not appropriate as
it does not really do anything with logs. Maybe ExplainStringAssemble
is more appropriate?

2/
It should be noted that the plan will not print to the log until
the query begins executing the next plan node. Depending on the
operation, that could take some time (e.g., a long seq scan of a table).
Does this behavior need to be called out in the docs?

3/
+ * By default, only superusers are allowed to signal to log the plan because
+ * allowing any users to issue this request at an unbounded rate would

Allowing only superusers to do this is very restrictive. Many shops,
for good reasons, do not want DBAs or monitoring tools to connect as
superuser. Why not allow this functionality with "pg_monitor"?

4/
nit: line removed here unnecessarily

extern PGDLLIMPORT Portal ActivePortal;
-
+extern PGDLLIMPORT QueryDesc *ActiveQueryDesc;

5/
WrapCustomPlanChildExecProcNodesWithExplain

This function name is too long; can it be simplified?

6/
Are such DEBUG1 messages really necessary, considering this is
a manually triggered function?

case T_Append:
ereport(DEBUG1, errmsg("wrapping Append"));

7/
Why do we only support TEXT format? I tried it with JSON
and it looks fine in the log as well. I can imagine automated
tools will want to be able to retrieve the plans using the
structured formats.

8/
+ es->format = EXPLAIN_FORMAT_TEXT;
+ es->settings = true;
+ es->verbose = true;
+ es->signaled = true;
+
+ ExplainAssembleLogOutput(es, ActiveQueryDesc,
+                          EXPLAIN_FORMAT_TEXT, 0, -1);
+
Can we just pass es->format to ExplainAssembleLogOutput as the 3rd argument?
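
That is, something like this (sketch):

    ExplainAssembleLogOutput(es, ActiveQueryDesc, es->format, 0, -1);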

--
Sami Imseih
Amazon Web Services (AWS)


From: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, destrex271(at)gmail(dot)com
Subject: Re: RFC: Logging plan of the running query
Date: 2025-04-03 05:32:16
Message-ID: 8de51267059a43f53848c69a2bd9bc3b@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2025-04-02 03:52, Robert Haas wrote:
Thank you for the review!
> On Fri, Mar 21, 2025 at 8:40 AM torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
> wrote:
>> Rebased it again.
>>
>> On 2025-03-10 14:10, torikoshia wrote:
>> > BTW the patch adds about 400 lines to explain.c and it may be better
>> > to split the file as well as 9173e8b6046, but I leave it as it is for
>> > now.
>>
>> This part remains unchanged.
>
> Looking at ExplainAssembleLogOutput() is making me realize that
> auto_explain is in serious need of some cleanup. That's not really the
> fault of this patch, but the hack whereby we overwrite the [] that
> would have surrounded the JSON output with {} is not very nice. I also
> think that the auto_explain GUCs need rethinking. In theory, new
> EXPLAIN options should be mirrored into auto_explain, but if you
> compare ExplainAssembleLogOutput() to ExplainOnePlan(), you can see
> that they are diverging. The PLANNING, SUMMARY, and SERIALIZE options
> that are known to regular EXPLAIN aren't known to auto_explain, and
> any customizable options that use explain_per_plan_hook won't be able
> to work with auto_explain, either. Changing this is non-trivial
> because SERIALIZE, for example, can't work the same way for
> auto_explain as it does for EXPLAIN, and a full solution might also
> require user-visible changes like replacing
> auto_explain.log_<whatever> with auto_explain.explain, so I don't
> really know. Maybe we should just live with it the way it is for now,
> but it doesn't look very nice.
>
> Rafael and I were just discussing off-list to what extent the parallel
> query problems with his patch also apply to yours. His design makes
> the problem easier to hit, I think, because setting
> progressive_explain = on makes every backend attempt to dump a query
> plan, whereas you'd have to hit the worker process with a signal at
> just the right time. But the fundamental problem appears to be the
> same.

In this patch, plan output is restricted to processes of type B_BACKEND.
When a parallel worker is targeted, no attempt is made to output the
plan, as shown below:

=# select pg_log_query_plan(pid) from pg_stat_activity where
backend_type = 'parallel worker';

pg_log_query_plan
-------------------
f
f
(2 rows)

WARNING: PID 22720 is not a PostgreSQL client backend process
WARNING: PID 22719 is not a PostgreSQL client backend process

Given that the goal of this thread is simply to output the plan, I think
this is sufficient. Users can view the plan by targeting the leader
process instead, as sketched below.
However, I do agree that retrieving this information is necessary for
Rafael's work on tracking execution progress.
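
For example, something like the following should reach the leaders (a
sketch; leader_pid is set on parallel-worker rows in pg_stat_activity):

    SELECT pg_log_query_plan(leader_pid)
      FROM (SELECT DISTINCT leader_pid
              FROM pg_stat_activity
             WHERE backend_type = 'parallel worker') AS leaders;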

I think tracking execution progress involves more challenges to solve
than simply outputting the plan.
For this reason, I believe an incremental approach -- first completing
the basic plan output functionality in this thread and then extending it
to support progress tracking -- would be a good way forward.

> Does ActiveQueryDesc really need to be exposed to the whole system?
> Could it be a file-level variable?

Up to the latest version of the patch, my goal has been to output the
plan without requiring prior configuration, and I haven't seen any other
viable approach.

However, for the next patch, I'm considering introducing a GUC to allow
prior setup before outputting the plan, in response to the previously
quoted comment:

One way in which this proposal seems safer than previous proposals is
that previous proposals have involved session A poking session B and
trying to get session B to emit an EXPLAIN on the fly with no prior
setup. That would be very useful, but I think it's more difficult and
more risky than this proposal, where all the configuration happens in
the session that is going to emit the EXPLAIN output.

With this change, it should be possible to use a file-level variable
instead.

I'm going to try to improve other points you raised.

--
Regards,

--
Atsushi Torikoshi
Seconded from NTT DATA GROUP CORPORATION to SRA OSS K.K.


From: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
To: Sadeq Dousti <msdousti(at)gmail(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, Sergey Dudoladov <sergey(dot)dudoladov(at)gmail(dot)com>
Subject: Re: RFC: Logging plan of the running query
Date: 2025-04-03 05:34:27
Message-ID: e04b16ddfad83bab3632617e8cda9b42@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2025-04-01 03:51, Sadeq Dousti wrote:
> Hi,
>
> Sergey and I reviewed this patch and have a few comments, listed
> below.
Thanks for your review!

>> BTW the patch adds about 400 lines to explain.c and it may be better
>> to split the file as well as 9173e8b6046, but I leave it as it is for
>> now.
>
> 1. As rightfully described by the OP above, explain.c has grown too
> big. In addition, explain.h is included by quite a number of other files.
Agreed.

>
> 2. We wanted to entertain the possibility of a GUC variable to enable
> the feature. That is, having it disabled by default, and the DBA can
> selectively enable it on the fly. The reasoning is that it can affect
> some workloads in an unforeseen way, so this choice can make the next
> Postgres release safer. In future releases, we can make the default
> "enabled", assuming enough real-world scenarios have proven the
> feature safe (i.e., not affecting the DB performance noticeably).
Agreed.

> 3. In the email thread for this patch, we saw some attempts to measure
> the performance overhead of the feature. We suggest a more rigorous
> study, including memory overhead in addition to time overhead. We came
> to the conclusion that analytical workloads - with complicated query
> plans - and a high query rate are the best fit for such benchmarks.
Agreed.

> 4. In PGConf EU 2024, there was a talk by Rafael Castro titled
> "Debugging active queries" [1], and he also submitted a recent patch
> titled "Proposal: Progressive explain" [2]. We see a lot of
> similarities in the idea behind that patch and this one, and would
> like to suggest joining forces for an ideal outcome.
As we have been discussing in this thread recently, I believe that since
this patch has a narrower scope, it makes sense to complete it first and
then extend it to support progress tracking in a step-by-step manner.

--
Regards,

--
Atsushi Torikoshi
Seconded from NTT DATA GROUP CORPORATION to SRA OSS K.K.


From: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
To: Sami Imseih <samimseih(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, destrex271(at)gmail(dot)com
Subject: Re: RFC: Logging plan of the running query
Date: 2025-04-03 05:59:02
Message-ID: fc65c5df60509009183c28b840a6a609@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2025-04-03 01:22, Sami Imseih wrote:
>> > FWIW, I have been thinking about auto_explain for another task,
>> > remote plans for fdw [0], and perhaps there are now other good
>> > reasons, some of which you mention, why things could be simplified if
>> > "auto_explain" became a core feature. This could be a proposal taken up in 19.
>>
>> For what we're talking about here, I don't think we would need to go
>> that far -- maybe put a few functions in core but no real need to move
>> the whole module into core. However, I don't rule out that there are
>> other reasons to do as you suggest.
>
> This is the first core feature that will allow users to log explain
> plans for a live workload. EXPLAIN is not really good for this purpose,
> so this proposal is a step in the right direction. I think the appetite
> for more plan logging/visibility options will likely increase, and
> auto_explain-like features in core will be desired IMO. We will see.
>
> As far as this patch goes, I took a look and I have some comments:

Thanks for your comments!

> 2/
> It should be noted that the plan will not print to the log until
> the query begins executing the next plan node. Depending on the
> operation, that could take some time (e.g., a long seq scan of a table).
> Does this behavior need to be called out in the docs?

Seems reasonable, but a long seq scan of a table would not cause this
case, since ExecProcNode() is called at least once per row in the
table, as far as I remember.
Of course, there can be cases where a long time elapses before
ExecProcNode() is executed, so I agree with adding documentation about this.

I'm going to address this, along with the other comments, in the next
patch.

--
Regards,

--
Atsushi Torikoshi
Seconded from NTT DATA GROUP CORPORATION to SRA OSS K.K.


From: Sami Imseih <samimseih(at)gmail(dot)com>
To: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, destrex271(at)gmail(dot)com
Subject: Re: RFC: Logging plan of the running query
Date: 2025-04-03 12:40:14
Message-ID: CAA5RZ0v=j+YN6+Of2q3BiHHh=LOBVNN3fNWaRf_1iK+5h989gA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> > 2/
> > It should be noted that the plan will not print to the log until
> > the query begins executing the next plan node. Depending on the
> > operation, that could take some time (e.g., a long seq scan of a table).
> > Does this behavior need to be called out in the docs?
>
> Seems reasonable, but a long seq scan of a table would not cause this
> case, since ExecProcNode() is called at least once per row in the
> table, as far as I remember.
> Of course, there can be cases where a long time elapses before
> ExecProcNode() is executed, so I agree with adding documentation about this.

Correct. Seq Scan is a bad example, since ExecProcNode is called for
every tuple, as you mention. MultiExecProcNode, functions doing
time-consuming computations, pg_sleep, etc. can all delay the request
from being acted on.
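
A minimal way to see the delay (a sketch, assuming the interrupt is
serviced while the node is still running):

    -- session 1: a single ExecProcNode() call that runs for a long time
    SELECT pg_sleep(600);

    -- session 2: the request itself returns immediately
    SELECT pg_log_query_plan(<pid of session 1>);

The plan only shows up in the log once session 1's executor reaches its
next ExecProcNode() call, which here is after the sleep completes.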

I also realized that the extended query protocol may have its own caveats.
A query running under the extended protocol will alternate between
"active" and "idle in transaction" as it transitions through parse, bind,
and execute. If a user calls pg_log_query_plan while the state is "idle
in transaction", it will result in a "backend with PID ... is not running
a query" log message. This is more likely if the query repeatedly
reissues an "execute" message (e.g., JDBC fetchSize). Of course, if the
user executes pg_log_query_plan again, it will (eventually) log the plan,
but it may be very confusing to see the "is not running a query" message
in the logs. The chances of hitting this behavior are low, but not zero,
so it's probably worth calling out in the documentation.

> 3/
> + * By default, only superusers are allowed to signal to log the plan because
> + * allowing any users to issue this request at an unbounded rate would

> Allowing only superusers to do this is very restrictive. Many shops,
> for good reasons, do not want DBAs or monitoring tools to connect as
> superuser. Why not allow this functionality with "pg_monitor"?

I just realized that my comment above is unwarranted. A superuser can
simply GRANT EXECUTE ON FUNCTION to pg_monitor, or to whatever
monitoring role they choose. You can ignore this.
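
For example (assuming the function's final signature takes a pid):

    GRANT EXECUTE ON FUNCTION pg_log_query_plan(integer) TO pg_monitor;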

--
Sami Imseih
Amazon Web Services (AWS)


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, destrex271(at)gmail(dot)com
Subject: Re: RFC: Logging plan of the running query
Date: 2025-04-03 14:10:26
Message-ID: CA+TgmoZNYetpFSgVXpE0UdH_J7cPS=8UxxzuV6ubWWwX-8Gpag@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Apr 3, 2025 at 1:32 AM torikoshia <torikoshia(at)oss(dot)nttdata(dot)com> wrote:
> I think tracking execution progress involves more challenges to solve
> than simply outputting the plan.
> For this reason, I believe an incremental approach -- first completing
> the basic plan output functionality in this thread and then extending it
> to support progress tracking -- would be a good way forward.

I think you might be right, although I am not totally certain yet.

> However, for the next patch, I'm considering introducing a GUC to allow
> prior setup before outputting the plan, in response to the previously
> quoted comment:
>
> One way in which this proposal seems safer than previous proposals is
> that previous proposals have involved session A poking session B and
> trying to get session B to emit an EXPLAIN on the fly with no prior
> setup. That would be very useful, but I think it's more difficult and
> more risky than this proposal, where all the configuration happens in
> the session that is going to emit the EXPLAIN output.

When I wrote this comment, I was unaware of Andres's proposal to use
ProcessInterrupts() to install the ExecProcNode() wrapper. With that
approach, which you have already implemented, I don't see a reason to
require prior configuration.

> With this change, it should be possible to use a file-level variable
> instead.

I think the question of whether ActiveQueryDesc can be file-level is
separate from whether prior configuration is needed. If it is
important to touch this from multiple source files, then it is fine
for it to be global. However, if we have a new source file, say
dynamic_explain.c, then you could have functions
ProcessDynamicExplainInterrupt() and DynamicExplainCleanup() in that
file to set, use, and clear ActiveQueryDesc, and the rest of the system
might not need to know about it.
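
Roughly like this (a sketch only; the first two function names are the
ones I suggest above, while the setter and the bodies are illustrative):

    /* in dynamic_explain.c */
    static QueryDesc *ActiveQueryDesc = NULL;

    /* hypothetical setter, called around query execution */
    void
    SetDynamicExplainQueryDesc(QueryDesc *queryDesc)
    {
        ActiveQueryDesc = queryDesc;
    }

    /* called from ProcessInterrupts() when a log-plan request arrives */
    void
    ProcessDynamicExplainInterrupt(void)
    {
        if (ActiveQueryDesc != NULL)
            WrapExecProcNodeWithExplain(ActiveQueryDesc->planstate);
    }

    /* called when the query finishes or aborts */
    void
    DynamicExplainCleanup(void)
    {
        ActiveQueryDesc = NULL;
    }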

--
Robert Haas
EDB: http://www.enterprisedb.com


From: Atsushi Torikoshi <torikoshia(dot)tech(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>, samimseih(at)gmail(dot)com
Cc: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, destrex271(at)gmail(dot)com
Subject: Re: RFC: Logging plan of the running query
Date: 2025-04-05 06:13:47
Message-ID: CAM6-o=BecZ-FOOHZXLhMU=JGpGQBPS5cA=ASoLUr92fFgjvQqg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Apr 3, 2025 at 11:10 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> Looking at ExplainAssembleLogOutput() is making me realize that
> auto_explain is in serious need of some cleanup. That's not really the
> fault of this patch, but the hack whereby we overwrite the [] that
> would have surrounded the JSON output with {} is not very nice. I also
> think that the auto_explain GUCs need rethinking. In theory, new
> EXPLAIN options should be mirrored into auto_explain, but if you
> compare ExplainAssembleLogOutput() to ExplainOnePlan(), you can see
> that they are diverging. The PLANNING, SUMMARY, and SERIALIZE options
> that are known to regular EXPLAIN aren't known to auto_explain, and
> any customizable options that use explain_per_plan_hook won't be able
> to work with auto_explain, either. Changing this is non-trivial
> because SERIALIZE, for example, can't work the same way for
> auto_explain as it does for EXPLAIN, and a full solution might also
> require user-visible changes like replacing
> auto_explain.log_<whatever> with auto_explain.explain, so I don't
> really know. Maybe we should just live with it the way it is for now,
> but it doesn't look very nice.

I agree that the current state isn’t ideal, and that addressing it wouldn’t
be trivial.
While I do think it would be better to tackle this before proceeding with
the current patch, I don’t think it's a prerequisite.
Also, since I’m not yet sure what the best way to fix it would be, I’ve
left it as-is for now.

> I don't think it's a good idea to put the logic to update
> ActiveQueryDesc into ExecutorRun. I think that function should really
> just call the hook, or standard_ExecutorRun. I don't know for sure
> whether this work should be moved down into ExecutorRun or up into the
> caller, but I don't think it should stay where it is.

Moved the logic from ExecutorRun() to standard_ExecutorRun(), since the
latter performs the necessary setup for collecting information during
plan execution, i.e., calling InstrStartNode().

> My comment about the naming of WrapMultiExecProcNodesWithExplain() on
> the other thread also applies here: MultiExecProcNode() is an
> unrelated function.

Renamed WrapMultiExecProcNodesWithExplain to WrapPlanStatesWithExplain.

> Likewise, WrapExecProcNodeWithExplain() misses
> walking the node's subplan list.

Added logic for subplan to WrapExecProcNodeWithExplain() and
UnwrapExecProcNodeWithExplain().
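
The added walk looks roughly like this (illustrative; the actual patch
also recurses into the child plan nodes):

    ListCell   *lc;

    foreach(lc, node->subPlan)
    {
        SubPlanState *sps = (SubPlanState *) lfirst(lc);

        WrapExecProcNodeWithExplain(sps->planstate);
    }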

> Also, I don't think an
> ereport(DEBUG1, ...) is appropriate here.

Removed these ereport(DEBUG1).

> Do we really need es->signaled? Couldn't we test es->analyze instead?

I think it's necessary.
Although I believe we could use es->analyze instead when implementing
progressive EXPLAIN, this patch aims to retrieve plans for queries that
were started with neither EXPLAIN nor ANALYZE specified.

> Do we really need ExecProcNodeOriginal? Can we find some way to reuse
> ExecProcNodeReal instead of making the structure bigger?

I also wanted to implement this without adding new fields to PlanState if
possible, but I haven't found a good solution, so the patch uses
ExecSetExecProcNode.
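
The wrapper installed that way looks roughly as follows (a sketch; the
LogQueryPlan() helper name is illustrative, and the actual patch unwraps
the whole tree rather than a single node):

    static TupleTableSlot *
    ExecProcNodeWithExplain(PlanState *node)
    {
        /* emit the plan once */
        LogQueryPlan(ActiveQueryDesc);

        /* restore the saved callback, then continue execution */
        ExecSetExecProcNode(node, node->ExecProcNodeOriginal);
        return node->ExecProcNode(node);
    }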

> I do think we should try to move some of this new code out into a
> separate source file, but I'm not yet sure what we should call it. We
> might want to share infrastructure with something like Rafael's patch,
> which he called progressive EXPLAIN but which is really closer to a
> general query progress facility, albeit perhaps too expensive to have
> enabled by default.

Made new files dynamic_explain.h and dynamic_explain.c, using the names
you suggested (which seem fine to me), and moved most of the code there.

> Does the documentation typically mention when a function is
> superuser-only? If so, it should do so here, too, using similar
> wording.

Added the following explanation, with wording similar to other functions:

This function is restricted to superusers by default, but other
users can be granted EXECUTE to run the function.

> It seems unnecessary to remark that you can't log a query plan after a
> subtransaction abort, because the query wouldn't be running any more.

Removed the description.

> It's also true that you can't log a query after a toplevel abort, even
> if you're still waiting for the aborted transaction to roll back. But
> it also seems like once the aborted subtransaction is popped off the
> stack and we return to the parent transaction, we should be able to
> log any query running at the outer level.

It works as shown below:

session-1
begin;
savepoint s1;
err;
ERROR: syntax error at or near "err"
rollback to s1;
select pg_sleep(5);

session-2
select pg_log_query_plan(pid of session-1);

log
LOG: query plan running on backend with PID 40025 is:
Query Text: select pg_sleep(5);
Result (cost=0.00..0.01 rows=1 width=4)
Output: pg_sleep('5'::double precision)

> Does ActiveQueryDesc really need to be exposed to the whole system?
> Could it be a file-level variable?

On Thu, Apr 3, 2025 at 11:10 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> I think the question of whether ActiveQueryDesc can be file-level is
> separate from whether prior configuration is needed. If it is
> important to touch this from multiple source files, then it is fine
> for it to be global. However, if we have a new source file, say
> dynamic_explain.c, then you could have functions
> ProcessDynamicExplainInterrupt() and DynamicExplainCleanup() in that
> file to set, use, and clear ActiveQueryDesc, and the rest of the system
> might not need to know about it.

Thanks for the advice. I took this approach and made ActiveQueryDesc a
file-level variable.

On Thu, Apr 3, 2025 at 1:23 AM Sami Imseih <samimseih(at)gmail(dot)com> wrote:
> 7/
> Why do we only support TEXT format? I tried it with JSON
> and it looks fine in the log as well. I can imagine automated
> tools will want to be able to retrieve the plans using the
> structured formats.

I agree that allowing a choice of format would be useful.
However, I believe implementing this would require introducing a new GUC
or creating a mechanism to pass the format from the caller of
pg_log_query_plan() to the target process.
For now, I think it would be better to minimize the scope of the initial
patch.

I think I have reflected your other comments in the attached patch.

Regards,
Atsushi Torikoshi

Attachment Content-Type Size
v43-0001-Add-function-to-log-the-plan-of-the-currently-ru.patch application/octet-stream 35.5 KB