BUG #17552: pg_stat_statements tracks internal FK check queries when COPY used to load data

Lists: pgsql-bugs
From: PG Bug reporting form <noreply(at)postgresql(dot)org>
To: pgsql-bugs(at)lists(dot)postgresql(dot)org
Cc: maxim(dot)boguk(at)gmail(dot)com
Subject: BUG #17552: pg_stat_statements tracks internal FK check queries when COPY used to load data
Date: 2022-07-17 19:23:36
Message-ID: 17552-213b534c56ab5d02@postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

The following bug has been logged on the website:

Bug reference: 17552
Logged by: Maxim Boguk
Email address: maxim(dot)boguk(at)gmail(dot)com
PostgreSQL version: 14.4
Operating system: Linux
Description:

pg_stat_statements track internal FK trigger check queries during data load
via COPY
( pg_stat_statements.track=top (default) and
pg_stat_statements.track_utility=off (not default) )
Tested on latest 13 and 14 versions.

Test script (test.sql):

drop table if exists t1 CASCADE;
drop table if exists t2;

create table t1 (id integer primary key);
insert into t1 values (1);
vacuum ANALYZE t1;

create table t2 (id integer primary key, fk integer references t1(id));

create extension IF NOT EXISTS pg_stat_statements;
select e.extname AS "Name", e.extversion AS "Version" FROM
pg_catalog.pg_extension e where e.extname='pg_stat_statements';
show pg_stat_statements.track;
show pg_stat_statements.track_utility;

select pg_stat_statements_reset();

COPY public.t2 (id, fk) FROM stdin;
1 1
2 1
3 1
4 1
5 1
6 1
7 1
8 1
9 1
10 1
11 1
12 1
13 1
14 1
15 1
16 1
17 1
18 1
19 1
20 1
21 1
22 1
23 1
24 1
25 1
26 1
27 1
28 1
29 1
30 1
31 1
32 1
33 1
34 1
35 1
36 1
37 1
38 1
39 1
40 1
41 1
42 1
43 1
44 1
45 1
46 1
47 1
48 1
49 1
50 1
51 1
52 1
53 1
54 1
55 1
56 1
57 1
58 1
59 1
60 1
61 1
62 1
63 1
64 1
65 1
66 1
67 1
68 1
69 1
70 1
71 1
72 1
73 1
74 1
75 1
76 1
77 1
78 1
79 1
80 1
81 1
82 1
83 1
84 1
85 1
86 1
87 1
88 1
89 1
90 1
91 1
92 1
93 1
94 1
95 1
96 1
97 1
98 1
99 1
100 1
\.

select query, calls from public.pg_stat_statements where dbid=(select oid
from pg_database where datname=current_database()) order by calls desc limit
2;

run:
psql -e -f test.sql -d sometestdb

Final output rows:
select query, calls from public.pg_stat_statements where dbid=(select oid
from pg_database where datname=current_database()) order by calls desc limit
2;
query
| calls
---------------------------------------------------------------------------------------------+-------
SELECT $2 FROM ONLY "public"."t1" x WHERE "id" OPERATOR(pg_catalog.=) $1
FOR KEY SHARE OF x | 100
select pg_stat_statements_reset()
| 1
(2 rows)

Expected output:
query
| calls
---------------------------------------------------------------------------------------------+-------
select pg_stat_statements_reset()
| 1
(1 rows)

PS: with pg_stat_statements.track_utility=on test work as expected and
provides expected
select query, calls from public.pg_stat_statements where dbid=(select oid
from pg_database where datname=current_database()) order by calls desc limit
2;
query | calls
------------------------------------+-------
COPY public.t2 (id, fk) FROM stdin | 1
select pg_stat_statements_reset() | 1

Regards,
Maxim


From: Julien Rouhaud <rjuju123(at)gmail(dot)com>
To: maxim(dot)boguk(at)gmail(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #17552: pg_stat_statements tracks internal FK check queries when COPY used to load data
Date: 2022-07-18 10:19:55
Message-ID: 20220718101955.5erjttxbbjupm2xr@jrouhaud
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Hi,

On Sun, Jul 17, 2022 at 07:23:36PM +0000, PG Bug reporting form wrote:
>
> Bug reference: 17552
> Logged by: Maxim Boguk
> Email address: maxim(dot)boguk(at)gmail(dot)com
> PostgreSQL version: 14.4
> Operating system: Linux
> Description:
>
> pg_stat_statements track internal FK trigger check queries during data load
> via COPY
> ( pg_stat_statements.track=top (default) and
> pg_stat_statements.track_utility=off (not default) )
> Tested on latest 13 and 14 versions.

Reproduced on current master too.

> Test script (test.sql):
> [...]
> select query, calls from public.pg_stat_statements where dbid=(select oid
> from pg_database where datname=current_database()) order by calls desc limit
> 2;
>
> run:
> psql -e -f test.sql -d sometestdb
>
> Final output rows:
> select query, calls from public.pg_stat_statements where dbid=(select oid
> from pg_database where datname=current_database()) order by calls desc limit
> 2;
> query
> | calls
> ---------------------------------------------------------------------------------------------+-------
> SELECT $2 FROM ONLY "public"."t1" x WHERE "id" OPERATOR(pg_catalog.=) $1
> FOR KEY SHARE OF x | 100
> select pg_stat_statements_reset()
> | 1
> (2 rows)
>
>
> Expected output:
> query
> | calls
> ---------------------------------------------------------------------------------------------+-------
> select pg_stat_statements_reset()
> | 1
> (1 rows)

I agree with you that the current behavior seems wrong. If we ignore a utility
statement, it doesn't mean that any standard query executed underneath should
become a top-level statement as it's the case now.

I see that there's an existing test that actually relies on that behavior,
which is an anonymous block code like:

DO LANGUAGE plpgsql $$
BEGIN
-- this is a SELECT
PERFORM 'hello world'::TEXT;
END;
$$;

which is expected to reports a "SELECT $1::text" with pg_stat_statements.track
= top and track_utility = off, and also doesn't seem right.

That test was added in 83f2061dd03, apparently covered by "Also, significantly
expand pg_stat_statements' regression test script", but without any discussion
on that specific behavior as far as I can see after a quick look.

Assuming that we want to fix it, simply increasing the exec_nested_level in the
fallback calls to prev_ProcessUtility/standard_ProcessUtility for handled
utility commands should get the correct behavior.


From: Sergei Kornilov <sk(at)zsrv(dot)org>
To: Julien Rouhaud <rjuju123(at)gmail(dot)com>
Cc: maxim(dot)boguk(at)gmail(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #17552: pg_stat_statements tracks internal FK check queries when COPY used to load data
Date: 2023-03-22 12:21:07
Message-ID: 430591679487667@lma6oifqwmuo4pt7.myt.yp-c.yandex.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Hello

I have a similar problem with exactly the same reason. Queries are counted as top level, although they are not called at the top level. Another small example:

set pg_stat_statements.track_utility to off;
set pg_stat_statements.track to 'top';
do $$ declare i int; begin select 1 as n into i; end $$ language plpgsql;
select toplevel, query from pg_stat_statements order by query;
toplevel | query
----------+-----------------------------------
t | select $1 as n
t | select pg_stat_statements_reset()
(2 rows)

But "select $1 as n" was not a top-level query.

In the first patch, a testcase showing the behavior now. The second patch adds a nesting level increment if track_utility is disabled. I need to duplicate the PGSS_HANDLED_UTILITY check here because it's documented above:

> If it's an EXECUTE statement, we don't track it and don't increment the nesting level.

regards, Sergei

Attachment Content-Type Size
v1-0001-testcase-wrong-toplevel.patch text/x-diff 3.4 KB
v1-0002-increment-nested-level-utility.patch text/x-diff 1.2 KB

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Sergei Kornilov <sk(at)zsrv(dot)org>, Julien Rouhaud <rjuju123(at)gmail(dot)com>
Cc: maxim(dot)boguk(at)gmail(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #17552: pg_stat_statements tracks internal FK check queries when COPY used to load data
Date: 2023-07-16 15:19:49
Message-ID: 8dd1ef3e-378c-83dc-229b-e3d3e251ac11@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

On 3/22/23 13:21, Sergei Kornilov wrote:
> Hello
>
> I have a similar problem with exactly the same reason. Queries are counted as top level, although they are not called at the top level. Another small example:
>
> set pg_stat_statements.track_utility to off;
> set pg_stat_statements.track to 'top';
> do $$ declare i int; begin select 1 as n into i; end $$ language plpgsql;
> select toplevel, query from pg_stat_statements order by query;
> toplevel | query
> ----------+-----------------------------------
> t | select $1 as n
> t | select pg_stat_statements_reset()
> (2 rows)
>
> But "select $1 as n" was not a top-level query.
>
> In the first patch, a testcase showing the behavior now. The second patch adds a nesting level increment if track_utility is disabled. I need to duplicate the PGSS_HANDLED_UTILITY check here because it's documented above:
>
>> If it's an EXECUTE statement, we don't track it and don't increment the nesting level.
>

I took a look, and the patch seems correct to me. It'll need better
comments explaining the change, but I'll take care of that. Barring
objections, I'll get this committed and backpatched.

FWIW it's probably a good idea to post patches on pgsql-hackers - not
everyone watches pgsql-bugs and/or CF app. Furthermore, when a patch is
switched to "RFC" it's customary to mention that in the thread, along
with a review. Otherwise these silent changes are quite puzzling.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Sergei Kornilov <sk(at)zsrv(dot)org>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: maxim(dot)boguk(at)gmail(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org, Julien Rouhaud <rjuju123(at)gmail(dot)com>
Subject: Re: BUG #17552: pg_stat_statements tracks internal FK check queries when COPY used to load data
Date: 2023-07-18 16:19:25
Message-ID: 2044761689697165@zh4mfevp4ifcvtxn.vla.yp-c.yandex.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Hello

> I took a look, and the patch seems correct to me. It'll need better
> comments explaining the change, but I'll take care of that. Barring
> objections, I'll get this committed and backpatched.

Great, thank you!

> FWIW it's probably a good idea to post patches on pgsql-hackers - not
> everyone watches pgsql-bugs and/or CF app. Furthermore, when a patch is
> switched to "RFC" it's customary to mention that in the thread, along
> with a review. Otherwise these silent changes are quite puzzling.

ok, I'll keep that in mind.
yeah, RFC status was a surprise.

regards, Sergei


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: Sergei Kornilov <sk(at)zsrv(dot)org>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, maxim(dot)boguk(at)gmail(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #17552: pg_stat_statements tracks internal FK check queries when COPY used to load data
Date: 2023-11-01 17:15:27
Message-ID: 2831679.1698858927@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com> writes:
> I took a look, and the patch seems correct to me. It'll need better
> comments explaining the change, but I'll take care of that. Barring
> objections, I'll get this committed and backpatched.

Since Tomas hadn't done anything with this, I picked it up with
intent to commit, and got as far as revising the comments and
eliminating some duplicative condition checks (see attached).

However, I then started to wonder whether it's really a good idea
that the code treats PREPARE and EXECUTE alike for this purpose.
In the case of EXECUTE, the idea is that we'll still be at top
level when we reach the executor hooks, and they will do the right
thing and then increment exec_nested_level before any lower-level
statement can be reached. But how does that concept apply to
PREPARE, which won't reach the executor? ISTM the net result
is that if any subsidiary statements are reached during PREPARE
(perhaps via const-simplification of some SQL or PL function),
we will erroneously treat them as top-level.

It's worth noting that the comments here explicitly mention that
suppressing the nesting level bump is desired for EXECUTE, while
*not* saying that for PREPARE. I've not dug in the commit history,
but I wonder if this isn't a bug somebody introduced during
careless refactoring to invent the PGSS_HANDLED_UTILITY macro.
It's not clear to me that that macro is anything but an
invitation to sloppy thinking.

In short, I think we probably want to bump the nest level
for everything except EXECUTE, and I'm not sure that
PGSS_HANDLED_UTILITY is worth keeping.

Thoughts?

BTW, I'm inclined not to back-patch this. While it seems like
a bug fix, it's also a behavioral change that might break
someone's expectations.

regards, tom lane

Attachment Content-Type Size
v2-0001-increment-nest-level-more-often.patch text/x-diff 7.5 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: Sergei Kornilov <sk(at)zsrv(dot)org>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, maxim(dot)boguk(at)gmail(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #17552: pg_stat_statements tracks internal FK check queries when COPY used to load data
Date: 2023-11-01 21:20:21
Message-ID: 2869953.1698873621@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

I wrote:
> However, I then started to wonder whether it's really a good idea
> that the code treats PREPARE and EXECUTE alike for this purpose.
> In the case of EXECUTE, the idea is that we'll still be at top
> level when we reach the executor hooks, and they will do the right
> thing and then increment exec_nested_level before any lower-level
> statement can be reached. But how does that concept apply to
> PREPARE, which won't reach the executor? ISTM the net result
> is that if any subsidiary statements are reached during PREPARE
> (perhaps via const-simplification of some SQL or PL function),
> we will erroneously treat them as top-level.

Poking at that, I found that (a) PREPARE doesn't run the planner,
so probably nested statements are impossible, and (b) if we
remove the check so that it bumps the nesting level, then some
existing regression test outputs change. PREPARE does run
parse analysis of the contained statement, and it seems that
that's sensitive to the is-top-level state. So we do need to
keep the exclusion for PREPARE. However, I'm still fairly down
on PGSS_HANDLED_UTILITY, because I think the rationale for
treating PREPARE and EXECUTE specially is different in each
place where that's being used, so tying them together is more
likely to cause future bugs than prevent bugs.

So v3-0001 attached revises the patch per those ideas. It's only
cosmetically different from before, but I think the explanatory
comments are better.

Meanwhile, I realized that we have a second set of bugs. As I
said above, it's possible to reach nested statements during
planning, as a consequence of const-simplification of an immutable
or stable function. pg_stat_statements generally does the wrong
thing here, incorrectly treating such statements as top-level if
we weren't already nested. This needs to be fixed by including
plan_nest_level as a reason to consider execution to be nested too.
What's more, testing this showed that pgss_planner has the same bug
as pgss_ProcessUtility: it needs to bump the nesting level even
if it chooses not to track planning time.

v3-0002 fixes this in a minimally invasive way by replacing each
check of exec_nested_level by plan_nested_level + exec_nested_level.
That's just for review/testing though. What I think we actually
ought to do is fold plan_nested_level and exec_nested_level into
a single variable nested_level (or nesting_level would be better
English). There's no reason to keep them separate; it just adds
cycles, complexity, and more risk of mistakes.

Comments?

regards, tom lane

Attachment Content-Type Size
v3-0001-increment-exec-nest-level-more-often.patch text/x-diff 7.8 KB
v3-0002-handle-plan-nesting-better.patch text/x-diff 7.8 KB

From: Sergei Kornilov <sk(at)zsrv(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Julien Rouhaud <rjuju123(at)gmail(dot)com>, maxim(dot)boguk(at)gmail(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Subject: Re:BUG #17552: pg_stat_statements tracks internal FK check queries when COPY used to load data
Date: 2023-11-07 20:23:17
Message-ID: 8152021699388597@fttg4r4wmjvthz4h.vla.yp-c.yandex.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Hello

Thannk you! I reviewed v3 patches. Looks good for me. I experimented with different options (track_planning, track top/all, track_utility, simple query and extended query protocols) and did not find any unexpected results in toplevel field.

> BTW, I'm inclined not to back-patch this. While it seems like a bug fix, it's also a behavioral change that might break someone's expectations.

I'm fine with patches only on HEAD.

regards, Sergei


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Sergei Kornilov <sk(at)zsrv(dot)org>
Cc: Julien Rouhaud <rjuju123(at)gmail(dot)com>, maxim(dot)boguk(at)gmail(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Subject: Re: BUG #17552: pg_stat_statements tracks internal FK check queries when COPY used to load data
Date: 2023-11-08 17:02:59
Message-ID: 729929.1699462979@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Sergei Kornilov <sk(at)zsrv(dot)org> writes:
> Thannk you! I reviewed v3 patches. Looks good for me. I experimented with different options (track_planning, track top/all, track_utility, simple query and extended query protocols) and did not find any unexpected results in toplevel field.

Cool. I merged the counters into one as I mentioned earlier,
and pushed it.

regards, tom lane