Re: BUG #16109: Postgres planning time is high across version (Expose buffer usage during planning in EXPLAIN)

Lists: pgsql-bugsPostg사설 토토 사이트SQL
From: PG Bug reporting form <noreply(at)postgresql(dot)org>
To: pgsql-bugs(at)lists(dot)postgresql(dot)org
Cc: chhatani(dot)mukesh(at)gmail(dot)com
Subject: BUG #16109: Postgres planning time is high across version - 10.6 vs 10.10
Date: 2019-11-12 20:34:35
Message-ID: 16109-26a1a88651e90608@postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

The following bug has been logged on the website:

Bug reference: 16109
Logged by: Mukesh Chhatani
Email address: chhatani(dot)mukesh(at)gmail(dot)com
PostgreSQL version: 10.10
Operating system: AWS RDS
Description:

Hello Team,

I am experiencing weird issue around planning time for the queries across
couple of environments below is the sample of the execution plan

Fast Execution Plan
transformations=> explain (analyze, buffers)
SELECT x2.x3, x2.x4, x2.x5, x2.x6, x2.x7, x2.x8, x2.x9, x2.x10, x2.x11,
x2.x12, x2.x13, x2.x14, x2.x15, x2.x16, x2.x17, x2.x18, x2.x19, x2.x20,
x2.x21, x2.x22, x2.x23, x2.x24, x2.x25, x2.x26, x2.x27, x2.x28, x2.x29,
x2.x30, x2.x31, x2.x32, x2.x33, x2.x34, x35. "provider_id", x35.
"provider_phone_id", x35. "provider_id", x35. "address_id", x35.
"prod_code", x35. "phone_number", x35. "phone_type", x36. "provider_id",
x36. "provider_id", x36. "address_id", x36. "language_code", x36.
"language_used_by" FROM ( SELECT x37.x38 AS x14, x37.x39 AS x6, x37.x40 AS
x26, x37.x41 AS x9, x37.x42 AS x20, x37.x43 AS x16, x37.x44 AS x8, x37.x45
AS x19, x37.x46 AS x3, x37.x47 AS x13, x37.x48 AS x12, x37.x49 AS x18,
x37.x50 AS x17, x37.x51 AS x11, x37.x52 AS x22, x37.x53 AS x21, x37.x54 AS
x10, x37.x55 AS x5, x37.x56 AS x4, x37.x57 AS x25, x37.x58 AS x7, x37.x59 AS
x15, x37.x60 AS x24, x37.x61 AS x23, ( CASE WHEN (x62. "attribute_value" IS
NULL) THEN NULL ELSE 1 END) AS x27, x62. "paid" AS x28, x62.
"attribute_value" AS x34, x62. "attribute_id" AS x33, x62. "provider_id" AS
x29, x62. "attribute_group_id" AS x32, x62. "parent_paid" AS x31, x62.
"address_id" AS x30 FROM ( SELECT "provider_id" AS x46, "zip" AS x38,
"first_name" AS x39, "provider_name_id" AS x40, "degree" AS x41,
"preferred_flag" AS x42, "county" AS x43, "suffix" AS x44, "individual_id"
AS x45, "state" AS x47, "city" AS x48, "latitude" AS x49, "longitude" AS
x50, "address" AS x51, "exclusion_type_id" AS x52, "quality_score" AS x53,
"gender" AS x54, "last_name" AS x55, "address_id" AS x56, "hi_q_hospital_id"
AS x57, "middle_name" AS x58, "zip4" AS x59, "handicap_accessible" AS x60,
"sour_address" AS x61 FROM "provider" WHERE "provider_id" =
'03563735-3798-441a-aea6-4e561ea347f7') x37 LEFT OUTER JOIN
"provider_attribute" x62 ON (x37.x46 = x62. "provider_id") AND (x37.x56 =
x62. "address_id")) x2 LEFT OUTER JOIN "provider_phone" x35 ON (x2.x3 = x35.
"provider_id") AND (x2.x4 = x35. "address_id") LEFT OUTER JOIN
"provider_language" x36 ON (x2.x3 = x36. "provider_id") AND (x2.x4 = x36.
"address_id");

QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Merge Left Join (cost=15.87..16.25 rows=13 width=920) (actual
time=0.021..0.022 rows=0 loops=1)
Merge Cond: ((provider.address_id)::text = (x36.address_id)::text)
Join Filter: ((provider.provider_id)::text = (x36.provider_id)::text)
Buffers: shared hit=3
-> Merge Left Join (cost=12.37..12.69 rows=13 width=754) (actual
time=0.021..0.021 rows=0 loops=1)
Merge Cond: ((provider.address_id)::text =
(x35.address_id)::text)
Join Filter: ((provider.provider_id)::text =
(x35.provider_id)::text)
Buffers: shared hit=3
-> Merge Left Join (cost=8.38..8.59 rows=13 width=584) (actual
time=0.021..0.021 rows=0 loops=1)
Merge Cond: ((provider.address_id)::text =
(x62.address_id)::text)
Join Filter: ((provider.provider_id)::text =
(x62.provider_id)::text)
Buffers: shared hit=3
-> Sort (cost=3.89..3.93 rows=13 width=387) (actual
time=0.020..0.021 rows=0 loops=1)
Sort Key: provider.address_id
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=3
-> Index Scan using provider_provider_id_idx on
provider (cost=0.42..3.65 rows=13 width=387) (actual time=0.017..0.017
rows=0 loops=1)
Index Cond: ((provider_id)::text =
'03563735-3798-441a-aea6-4e561ea347f7'::text)
Buffers: shared hit=3
-> Sort (cost=4.49..4.56 rows=26 width=197) (never
executed)
Sort Key: x62.address_id
-> Append (cost=0.42..3.88 rows=26 width=197) (never
executed)
-> Index Scan using
provider_attribute_sub_0_provider_id_idx on provider_attribute_sub_0 x62
(cost=0.42..3.88 rows=26 width=197) (never executed)
Index Cond: ((provider_id)::text =
'03563735-3798-441a-aea6-4e561ea347f7'::text)
-> Sort (cost=3.98..4.02 rows=15 width=170) (never executed)
Sort Key: x35.address_id
-> Index Scan using provider_phone_provider_id_idx on
provider_phone x35 (cost=0.43..3.69 rows=15 width=170) (never executed)
Index Cond: ((provider_id)::text =
'03563735-3798-441a-aea6-4e561ea347f7'::text)
-> Sort (cost=3.50..3.51 rows=3 width=88) (never executed)
Sort Key: x36.address_id
-> Index Scan using provider_language_provider_id_idx on
provider_language x36 (cost=0.42..3.47 rows=3 width=88) (never executed)
Index Cond: ((provider_id)::text =
'03563735-3798-441a-aea6-4e561ea347f7'::text)
Planning time: 7.416 ms
Execution time: 0.096 ms
(34 rows)

Slow Execution Plan
transformations_uhc_medicaid=> explain (analyze, buffers)
SELECT x2.x3, x2.x4, x2.x5, x2.x6, x2.x7, x2.x8, x2.x9, x2.x10, x2.x11,
x2.x12, x2.x13, x2.x14, x2.x15, x2.x16, x2.x17, x2.x18, x2.x19, x2.x20,
x2.x21, x2.x22, x2.x23, x2.x24, x2.x25, x2.x26, x2.x27, x2.x28, x2.x29,
x2.x30, x2.x31, x2.x32, x2.x33, x2.x34, x35. "provider_id", x35.
"provider_phone_id", x35. "provider_id", x35. "address_id", x35.
"prod_code", x35. "phone_number", x35. "phone_type", x36. "provider_id",
x36. "provider_id", x36. "address_id", x36. "language_code", x36.
"language_used_by" FROM ( SELECT x37.x38 AS x14, x37.x39 AS x6, x37.x40 AS
x26, x37.x41 AS x9, x37.x42 AS x20, x37.x43 AS x16, x37.x44 AS x8, x37.x45
AS x19, x37.x46 AS x3, x37.x47 AS x13, x37.x48 AS x12, x37.x49 AS x18,
x37.x50 AS x17, x37.x51 AS x11, x37.x52 AS x22, x37.x53 AS x21, x37.x54 AS
x10, x37.x55 AS x5, x37.x56 AS x4, x37.x57 AS x25, x37.x58 AS x7, x37.x59 AS
x15, x37.x60 AS x24, x37.x61 AS x23, ( CASE WHEN (x62. "attribute_value" IS
NULL) THEN NULL ELSE 1 END) AS x27, x62. "paid" AS x28, x62.
"attribute_value" AS x34, x62. "attribute_id" AS x33, x62. "provider_id" AS
x29, x62. "attribute_group_id" AS x32, x62. "parent_paid" AS x31, x62.
"address_id" AS x30 FROM ( SELECT "provider_id" AS x46, "zip" AS x38,
"first_name" AS x39, "provider_name_id" AS x40, "degree" AS x41,
"preferred_flag" AS x42, "county" AS x43, "suffix" AS x44, "individual_id"
AS x45, "state" AS x47, "city" AS x48, "latitude" AS x49, "longitude" AS
x50, "address" AS x51, "exclusion_type_id" AS x52, "quality_score" AS x53,
"gender" AS x54, "last_name" AS x55, "address_id" AS x56, "hi_q_hospital_id"
AS x57, "middle_name" AS x58, "zip4" AS x59, "handicap_accessible" AS x60,
"sour_address" AS x61 FROM "provider" WHERE "provider_id" =
'03563735-3798-441a-aea6-4e561ea347f7') x37 LEFT OUTER JOIN
"provider_attribute" x62 ON (x37.x46 = x62. "provider_id") AND (x37.x56 =
x62. "address_id")) x2 LEFT OUTER JOIN "provider_phone" x35 ON (x2.x3 = x35.
"provider_id") AND (x2.x4 = x35. "address_id") LEFT OUTER JOIN
"provider_language" x36 ON (x2.x3 = x36. "provider_id") AND (x2.x4 = x36.
"address_id");

QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Nested Loop Left Join (cost=5.14..15.03 rows=2 width=944) (actual
time=0.039..0.039 rows=0 loops=1)
Join Filter: (((provider.provider_id)::text = (x36.provider_id)::text)
AND ((provider.address_id)::text = (x36.address_id)::text))
Buffers: shared hit=4
-> Nested Loop Left Join (cost=4.72..11.56 rows=2 width=777) (actual
time=0.039..0.039 rows=0 loops=1)
Join Filter: (((provider.provider_id)::text =
(x35.provider_id)::text) AND ((provider.address_id)::text =
(x35.address_id)::text))
Buffers: shared hit=4
-> Hash Right Join (cost=4.17..7.78 rows=2 width=607) (actual
time=0.038..0.038 rows=0 loops=1)
Hash Cond: (((x62.provider_id)::text =
(provider.provider_id)::text) AND ((x62.address_id)::text =
(provider.address_id)::text))
Buffers: shared hit=4
-> Append (cost=0.55..3.94 rows=22 width=171) (never
executed)
-> Index Scan using
provider_attribute_sub_0_provider_id_idx on provider_attribute_sub_0 x62
(cost=0.55..3.94 rows=22 width=171) (never executed)
Index Cond: ((provider_id)::text =
'03563735-3798-441a-aea6-4e561ea347f7'::text)
-> Hash (cost=3.59..3.59 rows=2 width=436) (actual
time=0.031..0.031 rows=0 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 8kB
Buffers: shared hit=4
-> Index Scan using provider_provider_id_idx on
provider (cost=0.55..3.59 rows=2 width=436) (actual time=0.030..0.030
rows=0 loops=1)
Index Cond: ((provider_id)::text =
'03563735-3798-441a-aea6-4e561ea347f7'::text)
Buffers: shared hit=4
-> Materialize (cost=0.56..3.65 rows=4 width=170) (never
executed)
-> Index Scan using provider_phone_provider_id_idx on
provider_phone x35 (cost=0.56..3.62 rows=4 width=170) (never executed)
Index Cond: ((provider_id)::text =
'03563735-3798-441a-aea6-4e561ea347f7'::text)
-> Materialize (cost=0.42..3.44 rows=1 width=89) (never executed)
-> Index Scan using provider_language_provider_id_idx on
provider_language x36 (cost=0.42..3.44 rows=1 width=89) (never executed)
Index Cond: ((provider_id)::text =
'03563735-3798-441a-aea6-4e561ea347f7'::text)
Planning time: 7195.110 ms
Execution time: 0.143 ms

Some details around table structure
provider_attribute is partitioned tables as below while other tables are
normal tables
transformations=> \d+ provider_attribute
Table "public.provider_attribute"
Column | Type | Collation | Nullable | Default |
Storage | Stats target | Description
--------------------+-------------------+-----------+----------+---------+----------+--------------+-------------
paid | character varying | | | |
extended | |
provider_id | character varying | | not null | |
extended | |
address_id | character varying | | not null | |
extended | |
parent_paid | character varying | | | |
extended | |
attribute_group_id | character varying | | | |
extended | |
attribute_id | character varying | | not null | |
extended | |
attribute_value | character varying | | not null | |
extended | |
Partition key: RANGE (provider_id)
Partitions: provider_attribute_sub_0 FOR VALUES FROM ('0') TO ('1'),
provider_attribute_sub_1 FOR VALUES FROM ('1') TO ('2'),
provider_attribute_sub_2 FOR VALUES FROM ('2') TO ('3'),
provider_attribute_sub_3 FOR VALUES FROM ('3') TO ('4'),
provider_attribute_sub_4 FOR VALUES FROM ('4') TO ('5'),
provider_attribute_sub_5 FOR VALUES FROM ('5') TO ('6'),
provider_attribute_sub_6 FOR VALUES FROM ('6') TO ('7'),
provider_attribute_sub_7 FOR VALUES FROM ('7') TO ('8'),
provider_attribute_sub_8 FOR VALUES FROM ('8') TO ('9'),
provider_attribute_sub_9 FOR VALUES FROM ('9') TO ('a'),
provider_attribute_sub_a FOR VALUES FROM ('a') TO ('b'),
provider_attribute_sub_b FOR VALUES FROM ('b') TO ('c'),
provider_attribute_sub_c FOR VALUES FROM ('c') TO ('d'),
provider_attribute_sub_d FOR VALUES FROM ('d') TO ('e'),
provider_attribute_sub_e FOR VALUES FROM ('e') TO ('f'),
provider_attribute_sub_f FOR VALUES FROM ('f') TO ('g')

transformations=> \d+ provider_attribute_sub_0
Table
"public.provider_attribute_sub_0"
Column | Type | Collation | Nullable | Default |
Storage | Stats target | Description
--------------------+-------------------+-----------+----------+---------+----------+--------------+-------------
paid | character varying | | | |
extended | |
provider_id | character varying | | not null | |
extended | |
address_id | character varying | | not null | |
extended | |
parent_paid | character varying | | | |
extended | |
attribute_group_id | character varying | | | |
extended | |
attribute_id | character varying | | not null | |
extended | |
attribute_value | character varying | | not null | |
extended | |
Partition of: provider_attribute FOR VALUES FROM ('0') TO ('1')
Partition constraint: ((provider_id IS NOT NULL) AND ((provider_id)::text >=
'0'::character varying) AND ((provider_id)::text < '1'::character
varying))
Indexes:
"provider_attribute_sub_0_provider_id_idx" btree (provider_id) CLUSTER

I have tried to vacuum analyze the whole database still queries are slow in
1 of the environment while faster in another environment.

I have also compared and matched all postgres parameters and still no luck
in speeding the queries.

Any help would be greatly appreciated.

Thanks & Regards,
Mukesh Chhatani


From: Andres Freund <andres(at)anarazel(dot)de>
To: chhatani(dot)mukesh(at)gmail(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #16109: Postgres planning time is high across version - 10.6 vs 10.10
Date: 2019-11-12 20:55:06
Message-ID: 20191112205506.rvadbx2dnku3paaw@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

Hi,

On 2019-11-12 20:34:35 +0000, PG Bug reporting form wrote:
> I am experiencing weird issue around planning time for the queries across
> couple of environments below is the sample of the execution plan

Just to confirm, these are the same queries, but executed in different
databases / environments?

> Fast Execution Plan
> transformations=> explain (analyze, buffers)
> SELECT x2.x3, x2.x4, x2.x5, x2.x6, x2.x7, x2.x8, x2.x9, x2.x10, x2.x11,
> x2.x12, x2.x13, x2.x14, x2.x15, x2.x16, x2.x17, x2.x18, x2.x19, x2.x20,
> x2.x21, x2.x22, x2.x23, x2.x24, x2.x25, x2.x26, x2.x27, x2.x28, x2.x29,
> x2.x30, x2.x31, x2.x32, x2.x33, x2.x34, x35. "provider_id", x35.
> "provider_phone_id", x35. "provider_id", x35. "address_id", x35.
> "prod_code", x35. "phone_number", x35. "phone_type", x36. "provider_id",
> x36. "provider_id", x36. "address_id", x36. "language_code", x36.
> "language_used_by" FROM ( SELECT x37.x38 AS x14, x37.x39 AS x6, x37.x40 AS
> x26, x37.x41 AS x9, x37.x42 AS x20, x37.x43 AS x16, x37.x44 AS x8, x37.x45
> AS x19, x37.x46 AS x3, x37.x47 AS x13, x37.x48 AS x12, x37.x49 AS x18,
> x37.x50 AS x17, x37.x51 AS x11, x37.x52 AS x22, x37.x53 AS x21, x37.x54 AS
> x10, x37.x55 AS x5, x37.x56 AS x4, x37.x57 AS x25, x37.x58 AS x7, x37.x59 AS
> x15, x37.x60 AS x24, x37.x61 AS x23, ( CASE WHEN (x62. "attribute_value" IS
> NULL) THEN NULL ELSE 1 END) AS x27, x62. "paid" AS x28, x62.
> "attribute_value" AS x34, x62. "attribute_id" AS x33, x62. "provider_id" AS
> x29, x62. "attribute_group_id" AS x32, x62. "parent_paid" AS x31, x62.
> "address_id" AS x30 FROM ( SELECT "provider_id" AS x46, "zip" AS x38,
> "first_name" AS x39, "provider_name_id" AS x40, "degree" AS x41,
> "preferred_flag" AS x42, "county" AS x43, "suffix" AS x44, "individual_id"
> AS x45, "state" AS x47, "city" AS x48, "latitude" AS x49, "longitude" AS
> x50, "address" AS x51, "exclusion_type_id" AS x52, "quality_score" AS x53,
> "gender" AS x54, "last_name" AS x55, "address_id" AS x56, "hi_q_hospital_id"
> AS x57, "middle_name" AS x58, "zip4" AS x59, "handicap_accessible" AS x60,
> "sour_address" AS x61 FROM "provider" WHERE "provider_id" =
> '03563735-3798-441a-aea6-4e561ea347f7') x37 LEFT OUTER JOIN
> "provider_attribute" x62 ON (x37.x46 = x62. "provider_id") AND (x37.x56 =
> x62. "address_id")) x2 LEFT OUTER JOIN "provider_phone" x35 ON (x2.x3 = x35.
> "provider_id") AND (x2.x4 = x35. "address_id") LEFT OUTER JOIN
> "provider_language" x36 ON (x2.x3 = x36. "provider_id") AND (x2.x4 = x36.
> "address_id");

This is really hard to read for a human...

Here's a automatically reformatted version:

SELECT x2.x3,
x2.x4,
x2.x5,
x2.x6,
x2.x7,
x2.x8,
x2.x9,
x2.x10,
x2.x11,
x2.x12,
x2.x13,
x2.x14,
x2.x15,
x2.x16,
x2.x17,
x2.x18,
x2.x19,
x2.x20,
x2.x21,
x2.x22,
x2.x23,
x2.x24,
x2.x25,
x2.x26,
x2.x27,
x2.x28,
x2.x29,
x2.x30,
x2.x31,
x2.x32,
x2.x33,
x2.x34,
x35. "provider_id",
x35. "provider_phone_id",
x35. "provider_id",
x35. "address_id",
x35. "prod_code",
x35. "phone_number",
x35. "phone_type",
x36. "provider_id",
x36. "provider_id",
x36. "address_id",
x36. "language_code",
x36. "language_used_by"
FROM
(SELECT x37.x38 AS x14,
x37.x39 AS x6,
x37.x40 AS x26,
x37.x41 AS x9,
x37.x42 AS x20,
x37.x43 AS x16,
x37.x44 AS x8,
x37.x45 AS x19,
x37.x46 AS x3,
x37.x47 AS x13,
x37.x48 AS x12,
x37.x49 AS x18,
x37.x50 AS x17,
x37.x51 AS x11,
x37.x52 AS x22,
x37.x53 AS x21,
x37.x54 AS x10,
x37.x55 AS x5,
x37.x56 AS x4,
x37.x57 AS x25,
x37.x58 AS x7,
x37.x59 AS x15,
x37.x60 AS x24,
x37.x61 AS x23,
(CASE
WHEN (x62. "attribute_value" IS NULL) THEN NULL
ELSE 1
END) AS x27,
x62. "paid" AS x28,
x62. "attribute_value" AS x34,
x62. "attribute_id" AS x33,
x62. "provider_id" AS x29,
x62. "attribute_group_id" AS x32,
x62. "parent_paid" AS x31,
x62. "address_id" AS x30
FROM
(SELECT "provider_id" AS x46,
"zip" AS x38,
"first_name" AS x39,
"provider_name_id" AS x40,
"degree" AS x41,
"preferred_flag" AS x42,
"county" AS x43,
"suffix" AS x44,
"individual_id" AS x45,
"state" AS x47,
"city" AS x48,
"latitude" AS x49,
"longitude" AS x50,
"address" AS x51,
"exclusion_type_id" AS x52,
"quality_score" AS x53,
"gender" AS x54,
"last_name" AS x55,
"address_id" AS x56,
"hi_q_hospital_id" AS x57,
"middle_name" AS x58,
"zip4" AS x59,
"handicap_accessible" AS x60,
"sour_address" AS x61
FROM "provider"
WHERE "provider_id" = '03563735-3798-441a-aea6-4e561ea347f7') x37
LEFT OUTER JOIN "provider_attribute" x62 ON (x37.x46 = x62. "provider_id")
AND (x37.x56 = x62. "address_id")) x2
LEFT OUTER JOIN "provider_phone" x35 ON (x2.x3 = x35. "provider_id")
AND (x2.x4 = x35. "address_id")
LEFT OUTER JOIN "provider_language" x36 ON (x2.x3 = x36. "provider_id")
AND (x2.x4 = x36. "address_id");

> Slow Execution Plan
> transformations_uhc_medicaid=> explain (analyze, buffers)
> SELECT x2.x3, x2.x4, x2.x5, x2.x6, x2.x7, x2.x8, x2.x9, x2.x10, x2.x11,
> x2.x12, x2.x13, x2.x14, x2.x15, x2.x16, x2.x17, x2.x18, x2.x19, x2.x20,
> x2.x21, x2.x22, x2.x23, x2.x24, x2.x25, x2.x26, x2.x27, x2.x28, x2.x29,
> x2.x30, x2.x31, x2.x32, x2.x33, x2.x34, x35. "provider_id", x35.
> "provider_phone_id", x35. "provider_id", x35. "address_id", x35.
> "prod_code", x35. "phone_number", x35. "phone_type", x36. "provider_id",
> x36. "provider_id", x36. "address_id", x36. "language_code", x36.
> "language_used_by" FROM ( SELECT x37.x38 AS x14, x37.x39 AS x6, x37.x40 AS
> x26, x37.x41 AS x9, x37.x42 AS x20, x37.x43 AS x16, x37.x44 AS x8, x37.x45
> AS x19, x37.x46 AS x3, x37.x47 AS x13, x37.x48 AS x12, x37.x49 AS x18,
> x37.x50 AS x17, x37.x51 AS x11, x37.x52 AS x22, x37.x53 AS x21, x37.x54 AS
> x10, x37.x55 AS x5, x37.x56 AS x4, x37.x57 AS x25, x37.x58 AS x7, x37.x59 AS
> x15, x37.x60 AS x24, x37.x61 AS x23, ( CASE WHEN (x62. "attribute_value" IS
> NULL) THEN NULL ELSE 1 END) AS x27, x62. "paid" AS x28, x62.
> "attribute_value" AS x34, x62. "attribute_id" AS x33, x62. "provider_id" AS
> x29, x62. "attribute_group_id" AS x32, x62. "parent_paid" AS x31, x62.
> "address_id" AS x30 FROM ( SELECT "provider_id" AS x46, "zip" AS x38,
> "first_name" AS x39, "provider_name_id" AS x40, "degree" AS x41,
> "preferred_flag" AS x42, "county" AS x43, "suffix" AS x44, "individual_id"
> AS x45, "state" AS x47, "city" AS x48, "latitude" AS x49, "longitude" AS
> x50, "address" AS x51, "exclusion_type_id" AS x52, "quality_score" AS x53,
> "gender" AS x54, "last_name" AS x55, "address_id" AS x56, "hi_q_hospital_id"
> AS x57, "middle_name" AS x58, "zip4" AS x59, "handicap_accessible" AS x60,
> "sour_address" AS x61 FROM "provider" WHERE "provider_id" =
> '03563735-3798-441a-aea6-4e561ea347f7') x37 LEFT OUTER JOIN
> "provider_attribute" x62 ON (x37.x46 = x62. "provider_id") AND (x37.x56 =
> x62. "address_id")) x2 LEFT OUTER JOIN "provider_phone" x35 ON (x2.x3 = x35.
> "provider_id") AND (x2.x4 = x35. "address_id") LEFT OUTER JOIN
> "provider_language" x36 ON (x2.x3 = x36. "provider_id") AND (x2.x4 = x36.
> "address_id");

Based on a quick skim, this is the same query as before.

> I have tried to vacuum analyze the whole database still queries are slow in
> 1 of the environment while faster in another environment.

Is there a chance that one database has longrunning queries, slow
replication, or something like that, leading to a very bloated database?
Even if you VACUUM FULL, if there's still long-running sessions, the
bloat could not necessarily be removed, because those sessions might
still need to see the already superseded data.

Do the table / index sizes differ between the environment? Are the
databases expected to have the same content?

This last point is more oriented towards other PG developers: I wonder
if we ought to display buffer statistics for plan time, for EXPLAIN
(BUFFERS). That'd surely make it easier to discern cases where we
e.g. access the index and scan a lot of the index from cases where we
hit some CPU time issue. We should easily be able to get that data, I
think, we already maintain it, we'd just need to compute the diff
between pgBufferUsage before / after planning.

Greetings,

Andres Freund


From: Mukesh Chhatani <chhatani(dot)mukesh(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #16109: Postgres planning time is high across version - 10.6 vs 10.10
Date: 2019-11-12 21:01:28
Message-ID: CACnmOYLj6PsfVKe6+mgmM5vsZtQQjx8HdogYB4d5uCiNPjXmQw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

Thanks for getting back to me so quickly

Queries are same and executed in 2 different environments. There are no
long running queries on any of the environments since they are idle right
away for my testing.

I can try full vacuum if that is recommended since I tried vacuum analyze
on the whole database in both environments.

Datases have the same content and size is also same.

Sorry but I am never seen this before , if the sizes vary or if the content
varies I have seen execution time being slow and not the planning time.

I am working on a dataset which I will share for further investigation and
analysis.

Regards,
Mukesh

On Tue, Nov 12, 2019 at 2:55 PM Andres Freund <andres(at)anarazel(dot)de> wrote:

> Hi,
>
> On 2019-11-12 20:34:35 +0000, PG Bug reporting form wrote:
> > I am experiencing weird issue around planning time for the queries across
> > couple of environments below is the sample of the execution plan
>
>
> Just to confirm, these are the same queries, but executed in different
> databases / environments?
>
>
> > Fast Execution Plan
> > transformations=> explain (analyze, buffers)
> > SELECT x2.x3, x2.x4, x2.x5, x2.x6, x2.x7, x2.x8, x2.x9, x2.x10, x2.x11,
> > x2.x12, x2.x13, x2.x14, x2.x15, x2.x16, x2.x17, x2.x18, x2.x19, x2.x20,
> > x2.x21, x2.x22, x2.x23, x2.x24, x2.x25, x2.x26, x2.x27, x2.x28, x2.x29,
> > x2.x30, x2.x31, x2.x32, x2.x33, x2.x34, x35. "provider_id", x35.
> > "provider_phone_id", x35. "provider_id", x35. "address_id", x35.
> > "prod_code", x35. "phone_number", x35. "phone_type", x36. "provider_id",
> > x36. "provider_id", x36. "address_id", x36. "language_code", x36.
> > "language_used_by" FROM ( SELECT x37.x38 AS x14, x37.x39 AS x6, x37.x40
> AS
> > x26, x37.x41 AS x9, x37.x42 AS x20, x37.x43 AS x16, x37.x44 AS x8,
> x37.x45
> > AS x19, x37.x46 AS x3, x37.x47 AS x13, x37.x48 AS x12, x37.x49 AS x18,
> > x37.x50 AS x17, x37.x51 AS x11, x37.x52 AS x22, x37.x53 AS x21, x37.x54
> AS
> > x10, x37.x55 AS x5, x37.x56 AS x4, x37.x57 AS x25, x37.x58 AS x7,
> x37.x59 AS
> > x15, x37.x60 AS x24, x37.x61 AS x23, ( CASE WHEN (x62. "attribute_value"
> IS
> > NULL) THEN NULL ELSE 1 END) AS x27, x62. "paid" AS x28, x62.
> > "attribute_value" AS x34, x62. "attribute_id" AS x33, x62. "provider_id"
> AS
> > x29, x62. "attribute_group_id" AS x32, x62. "parent_paid" AS x31, x62.
> > "address_id" AS x30 FROM ( SELECT "provider_id" AS x46, "zip" AS x38,
> > "first_name" AS x39, "provider_name_id" AS x40, "degree" AS x41,
> > "preferred_flag" AS x42, "county" AS x43, "suffix" AS x44,
> "individual_id"
> > AS x45, "state" AS x47, "city" AS x48, "latitude" AS x49, "longitude" AS
> > x50, "address" AS x51, "exclusion_type_id" AS x52, "quality_score" AS
> x53,
> > "gender" AS x54, "last_name" AS x55, "address_id" AS x56,
> "hi_q_hospital_id"
> > AS x57, "middle_name" AS x58, "zip4" AS x59, "handicap_accessible" AS
> x60,
> > "sour_address" AS x61 FROM "provider" WHERE "provider_id" =
> > '03563735-3798-441a-aea6-4e561ea347f7') x37 LEFT OUTER JOIN
> > "provider_attribute" x62 ON (x37.x46 = x62. "provider_id") AND (x37.x56 =
> > x62. "address_id")) x2 LEFT OUTER JOIN "provider_phone" x35 ON (x2.x3 =
> x35.
> > "provider_id") AND (x2.x4 = x35. "address_id") LEFT OUTER JOIN
> > "provider_language" x36 ON (x2.x3 = x36. "provider_id") AND (x2.x4 = x36.
> > "address_id");
>
> This is really hard to read for a human...
>
> Here's a automatically reformatted version:
>
> SELECT x2.x3,
> x2.x4,
> x2.x5,
> x2.x6,
> x2.x7,
> x2.x8,
> x2.x9,
> x2.x10,
> x2.x11,
> x2.x12,
> x2.x13,
> x2.x14,
> x2.x15,
> x2.x16,
> x2.x17,
> x2.x18,
> x2.x19,
> x2.x20,
> x2.x21,
> x2.x22,
> x2.x23,
> x2.x24,
> x2.x25,
> x2.x26,
> x2.x27,
> x2.x28,
> x2.x29,
> x2.x30,
> x2.x31,
> x2.x32,
> x2.x33,
> x2.x34,
> x35. "provider_id",
> x35. "provider_phone_id",
> x35. "provider_id",
> x35. "address_id",
> x35. "prod_code",
> x35. "phone_number",
> x35. "phone_type",
> x36. "provider_id",
> x36. "provider_id",
> x36. "address_id",
> x36. "language_code",
> x36. "language_used_by"
> FROM
> (SELECT x37.x38 AS x14,
> x37.x39 AS x6,
> x37.x40 AS x26,
> x37.x41 AS x9,
> x37.x42 AS x20,
> x37.x43 AS x16,
> x37.x44 AS x8,
> x37.x45 AS x19,
> x37.x46 AS x3,
> x37.x47 AS x13,
> x37.x48 AS x12,
> x37.x49 AS x18,
> x37.x50 AS x17,
> x37.x51 AS x11,
> x37.x52 AS x22,
> x37.x53 AS x21,
> x37.x54 AS x10,
> x37.x55 AS x5,
> x37.x56 AS x4,
> x37.x57 AS x25,
> x37.x58 AS x7,
> x37.x59 AS x15,
> x37.x60 AS x24,
> x37.x61 AS x23,
> (CASE
> WHEN (x62. "attribute_value" IS NULL) THEN NULL
> ELSE 1
> END) AS x27,
> x62. "paid" AS x28,
> x62. "attribute_value" AS x34,
> x62. "attribute_id" AS x33,
> x62. "provider_id" AS x29,
> x62. "attribute_group_id" AS x32,
> x62. "parent_paid" AS x31,
> x62. "address_id" AS x30
> FROM
> (SELECT "provider_id" AS x46,
> "zip" AS x38,
> "first_name" AS x39,
> "provider_name_id" AS x40,
> "degree" AS x41,
> "preferred_flag" AS x42,
> "county" AS x43,
> "suffix" AS x44,
> "individual_id" AS x45,
> "state" AS x47,
> "city" AS x48,
> "latitude" AS x49,
> "longitude" AS x50,
> "address" AS x51,
> "exclusion_type_id" AS x52,
> "quality_score" AS x53,
> "gender" AS x54,
> "last_name" AS x55,
> "address_id" AS x56,
> "hi_q_hospital_id" AS x57,
> "middle_name" AS x58,
> "zip4" AS x59,
> "handicap_accessible" AS x60,
> "sour_address" AS x61
> FROM "provider"
> WHERE "provider_id" = '03563735-3798-441a-aea6-4e561ea347f7') x37
> LEFT OUTER JOIN "provider_attribute" x62 ON (x37.x46 = x62.
> "provider_id")
> AND (x37.x56 = x62. "address_id")) x2
> LEFT OUTER JOIN "provider_phone" x35 ON (x2.x3 = x35. "provider_id")
> AND (x2.x4 = x35. "address_id")
> LEFT OUTER JOIN "provider_language" x36 ON (x2.x3 = x36. "provider_id")
> AND (x2.x4 = x36. "address_id");
>
>
> > Slow Execution Plan
> > transformations_uhc_medicaid=> explain (analyze, buffers)
> > SELECT x2.x3, x2.x4, x2.x5, x2.x6, x2.x7, x2.x8, x2.x9, x2.x10, x2.x11,
> > x2.x12, x2.x13, x2.x14, x2.x15, x2.x16, x2.x17, x2.x18, x2.x19, x2.x20,
> > x2.x21, x2.x22, x2.x23, x2.x24, x2.x25, x2.x26, x2.x27, x2.x28, x2.x29,
> > x2.x30, x2.x31, x2.x32, x2.x33, x2.x34, x35. "provider_id", x35.
> > "provider_phone_id", x35. "provider_id", x35. "address_id", x35.
> > "prod_code", x35. "phone_number", x35. "phone_type", x36. "provider_id",
> > x36. "provider_id", x36. "address_id", x36. "language_code", x36.
> > "language_used_by" FROM ( SELECT x37.x38 AS x14, x37.x39 AS x6, x37.x40
> AS
> > x26, x37.x41 AS x9, x37.x42 AS x20, x37.x43 AS x16, x37.x44 AS x8,
> x37.x45
> > AS x19, x37.x46 AS x3, x37.x47 AS x13, x37.x48 AS x12, x37.x49 AS x18,
> > x37.x50 AS x17, x37.x51 AS x11, x37.x52 AS x22, x37.x53 AS x21, x37.x54
> AS
> > x10, x37.x55 AS x5, x37.x56 AS x4, x37.x57 AS x25, x37.x58 AS x7,
> x37.x59 AS
> > x15, x37.x60 AS x24, x37.x61 AS x23, ( CASE WHEN (x62. "attribute_value"
> IS
> > NULL) THEN NULL ELSE 1 END) AS x27, x62. "paid" AS x28, x62.
> > "attribute_value" AS x34, x62. "attribute_id" AS x33, x62. "provider_id"
> AS
> > x29, x62. "attribute_group_id" AS x32, x62. "parent_paid" AS x31, x62.
> > "address_id" AS x30 FROM ( SELECT "provider_id" AS x46, "zip" AS x38,
> > "first_name" AS x39, "provider_name_id" AS x40, "degree" AS x41,
> > "preferred_flag" AS x42, "county" AS x43, "suffix" AS x44,
> "individual_id"
> > AS x45, "state" AS x47, "city" AS x48, "latitude" AS x49, "longitude" AS
> > x50, "address" AS x51, "exclusion_type_id" AS x52, "quality_score" AS
> x53,
> > "gender" AS x54, "last_name" AS x55, "address_id" AS x56,
> "hi_q_hospital_id"
> > AS x57, "middle_name" AS x58, "zip4" AS x59, "handicap_accessible" AS
> x60,
> > "sour_address" AS x61 FROM "provider" WHERE "provider_id" =
> > '03563735-3798-441a-aea6-4e561ea347f7') x37 LEFT OUTER JOIN
> > "provider_attribute" x62 ON (x37.x46 = x62. "provider_id") AND (x37.x56 =
> > x62. "address_id")) x2 LEFT OUTER JOIN "provider_phone" x35 ON (x2.x3 =
> x35.
> > "provider_id") AND (x2.x4 = x35. "address_id") LEFT OUTER JOIN
> > "provider_language" x36 ON (x2.x3 = x36. "provider_id") AND (x2.x4 = x36.
> > "address_id");
>
> Based on a quick skim, this is the same query as before.
>
>
> > I have tried to vacuum analyze the whole database still queries are slow
> in
> > 1 of the environment while faster in another environment.
>
> Is there a chance that one database has longrunning queries, slow
> replication, or something like that, leading to a very bloated database?
> Even if you VACUUM FULL, if there's still long-running sessions, the
> bloat could not necessarily be removed, because those sessions might
> still need to see the already superseded data.
>
> Do the table / index sizes differ between the environment? Are the
> databases expected to have the same content?
>
>
> This last point is more oriented towards other PG developers: I wonder
> if we ought to display buffer statistics for plan time, for EXPLAIN
> (BUFFERS). That'd surely make it easier to discern cases where we
> e.g. access the index and scan a lot of the index from cases where we
> hit some CPU time issue. We should easily be able to get that data, I
> think, we already maintain it, we'd just need to compute the diff
> between pgBufferUsage before / after planning.
>
> Greetings,
>
> Andres Freund
>


From: legrand legrand <legrand_legrand(at)hotmail(dot)com>
To: pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #16109: Postgres planning time is high across version - 10.6 vs 10.10
Date: 2019-11-12 21:41:15
Message-ID: 1573594875394-0.post@n3.nabble.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

Hello,

there was a very similar post a few days ago:
https://www.postgresql-archive.org/Slow-planning-fast-execution-for-particular-3-table-query-tt6109879.html#none

the root cause was a modification of default_statistics_target

Maybe you are in the same situation ?
or maybe tables have different SET STATISTICS values set ?

Regards
PAscal

--
Sent from: https://www.postgresql-archive.org/PostgreSQL-bugs-f2117394.html


From: Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>
To: Mukesh Chhatani <chhatani(dot)mukesh(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #16109: Postgres planning time is high across version - 10.6 vs 10.10
Date: 2019-11-13 06:58:28
Message-ID: 25c7ebae-a3e9-5c79-23f6-cb183045d79b@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

13.11.2019 00:01, Mukesh Chhatani пишет:
> Thanks for getting back to me so quickly
>
> Queries are same and executed in 2 different environments. There are no
> long running queries on any of the environments since they are idle
> right away for my testing.
>
> I can try full vacuum if that is recommended since I tried vacuum
> analyze on the whole database in both environments.
>
> Datases have the same content and size is also same.
>
> Sorry but I am never seen this before , if the sizes vary or if the
> content varies I have seen execution time being slow and not the
> planning time.
>
> I am working on a dataset which I will share for further investigation
> and analysis.
Interesting. I will be waiting for your data set.

--
Andrey Lepikhov
Postgres Professional
https://postgrespro.com
The Russian Postgres Company


From: Julien Rouhaud <rjuju123(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: BUG #16109: Postgres planning time is high across version - 10.6 vs 10.10
Date: 2019-11-13 10:39:04
Message-ID: CAOBaU_ZfSUrU7Cuno_ycJDFDfqJH3jC0=2ou2GkcNy-P0UPy3A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

(moved to -hackers)

On Tue, Nov 12, 2019 at 9:55 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
>
> This last point is more oriented towards other PG developers: I wonder
> if we ought to display buffer statistics for plan time, for EXPLAIN
> (BUFFERS). That'd surely make it easier to discern cases where we
> e.g. access the index and scan a lot of the index from cases where we
> hit some CPU time issue. We should easily be able to get that data, I
> think, we already maintain it, we'd just need to compute the diff
> between pgBufferUsage before / after planning.

That would be quite interesting to have. I attach as a reference a
quick POC patch to implement it:

# explain (analyze, buffers) select * from pg_stat_activity;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------
Hash Left Join (cost=2.25..3.80 rows=100 width=440) (actual
time=0.259..0.276 rows=6 loops=1)
Hash Cond: (s.usesysid = u.oid)
Buffers: shared hit=5
-> Hash Left Join (cost=1.05..2.32 rows=100 width=376) (actual
time=0.226..0.236 rows=6 loops=1)
Hash Cond: (s.datid = d.oid)
Buffers: shared hit=4
-> Function Scan on pg_stat_get_activity s (cost=0.00..1.00
rows=100 width=312) (actual time=0.148..0.151 rows=6 loop
-> Hash (cost=1.02..1.02 rows=2 width=68) (actual
time=0.034..0.034 rows=5 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
Buffers: shared hit=1
-> Seq Scan on pg_database d (cost=0.00..1.02 rows=2
width=68) (actual time=0.016..0.018 rows=5 loops=1)
Buffers: shared hit=1
-> Hash (cost=1.09..1.09 rows=9 width=68) (actual
time=0.015..0.015 rows=9 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
Buffers: shared hit=1
-> Seq Scan on pg_authid u (cost=0.00..1.09 rows=9
width=68) (actual time=0.004..0.008 rows=9 loops=1)
Buffers: shared hit=1
Planning Time: 1.902 ms
Buffers: shared hit=37 read=29
I/O Timings: read=0.506
Execution Time: 0.547 ms
(21 rows)

Note that there's a related discussion in the "Planning counters in
pg_stat_statements" thread, on whether to also compute buffers from
planning or not.

Attachment Content-Type Size
show_planning_buffers-v1.diff application/octet-stream 8.2 KB

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Julien Rouhaud <rjuju123(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: BUG #16109: Postgres planning time is high across version - 10.6 vs 10.10
Date: 2019-11-13 10:49:58
Message-ID: CAFj8pRB2RX5YsxzTaLq2Hq6nLXL5t=sCovgDxC3VrhJwDBigcw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

st 13. 11. 2019 v 11:39 odesílatel Julien Rouhaud <rjuju123(at)gmail(dot)com>
napsal:

> (moved to -hackers)
>
> On Tue, Nov 12, 2019 at 9:55 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> >
> > This last point is more oriented towards other PG developers: I wonder
> > if we ought to display buffer statistics for plan time, for EXPLAIN
> > (BUFFERS). That'd surely make it easier to discern cases where we
> > e.g. access the index and scan a lot of the index from cases where we
> > hit some CPU time issue. We should easily be able to get that data, I
> > think, we already maintain it, we'd just need to compute the diff
> > between pgBufferUsage before / after planning.
>
> That would be quite interesting to have. I attach as a reference a
> quick POC patch to implement it:
>
> # explain (analyze, buffers) select * from pg_stat_activity;
> QUERY PLAN
>
> --------------------------------------------------------------------------------------------------------------------------------
> Hash Left Join (cost=2.25..3.80 rows=100 width=440) (actual
> time=0.259..0.276 rows=6 loops=1)
> Hash Cond: (s.usesysid = u.oid)
> Buffers: shared hit=5
> -> Hash Left Join (cost=1.05..2.32 rows=100 width=376) (actual
> time=0.226..0.236 rows=6 loops=1)
> Hash Cond: (s.datid = d.oid)
> Buffers: shared hit=4
> -> Function Scan on pg_stat_get_activity s (cost=0.00..1.00
> rows=100 width=312) (actual time=0.148..0.151 rows=6 loop
> -> Hash (cost=1.02..1.02 rows=2 width=68) (actual
> time=0.034..0.034 rows=5 loops=1)
> Buckets: 1024 Batches: 1 Memory Usage: 9kB
> Buffers: shared hit=1
> -> Seq Scan on pg_database d (cost=0.00..1.02 rows=2
> width=68) (actual time=0.016..0.018 rows=5 loops=1)
> Buffers: shared hit=1
> -> Hash (cost=1.09..1.09 rows=9 width=68) (actual
> time=0.015..0.015 rows=9 loops=1)
> Buckets: 1024 Batches: 1 Memory Usage: 9kB
> Buffers: shared hit=1
> -> Seq Scan on pg_authid u (cost=0.00..1.09 rows=9
> width=68) (actual time=0.004..0.008 rows=9 loops=1)
> Buffers: shared hit=1
> Planning Time: 1.902 ms
> Buffers: shared hit=37 read=29
> I/O Timings: read=0.506
> Execution Time: 0.547 ms
> (21 rows)
>
> Note that there's a related discussion in the "Planning counters in
> pg_stat_statements" thread, on whether to also compute buffers from
> planning or not.
>

+1

Pavel


From: Mukesh Chhatani <chhatani(dot)mukesh(at)gmail(dot)com>
To: pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #16109: Postgres planning time is high across version - 10.6 vs 10.10
Date: 2019-11-13 18:37:30
Message-ID: 1573670250461-0.post@n3.nabble.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

All,

Update I was able to resolve the problem by changing - partitioned tables to
single table and changing data type of 2 columns used in the joins from
varchar to varchar(50).

FYI.. default_statistics_target was set to 10000 but I changed it 100 and
even to 1000 and still planning time was high.

Still working on the dataset so that more people can investigate the issues.

--
Sent from: https://www.postgresql-archive.org/PostgreSQL-bugs-f2117394.html


From: Andres Freund <andres(at)anarazel(dot)de>
To: Mukesh Chhatani <chhatani(dot)mukesh(at)gmail(dot)com>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #16109: Postgres planning time is high across version - 10.6 vs 10.10
Date: 2019-11-13 18:42:13
Message-ID: 20191113184213.yexnib6cbxpkmmwq@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

Hi,

On 2019-11-13 11:37:30 -0700, Mukesh Chhatani wrote:
> FYI.. default_statistics_target was set to 10000 but I changed it 100 and
> even to 1000 and still planning time was high.

Note that you'd need to ANALYZE the involved tables before that change
actually would effect planning time.

> Update I was able to resolve the problem by changing - partitioned tables to
> single table and changing data type of 2 columns used in the joins from
> varchar to varchar(50).

That's not going to be the fix. There's no efficiency difference between
those. It's more likely that, that the different statistics target would
have taken effect after the alter table etc.

- Andres


From: Mukesh Chhatani <chhatani(dot)mukesh(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #16109: Postgres planning time is high across version - 10.6 vs 10.10
Date: 2019-11-13 18:52:13
Message-ID: CACnmOYJQVHhuaYe70z0yFr2rKy8jqe=8Fx92c7-HXShQOEy8qw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

Thanks for response.

I analyzed after changing default_statistics_target but no improvement, are
there any recommendations around this parameter because as far as I have
seen increasing this parameter in the past lets optimizer choose better
plans and has never caused me this problem related to high planning time,
but I am open to suggestions since every problem is a new problem.

I know changing partitioned to unpartitioned and then changing columns
types makes no sense in terms of resolving the issue but that is what is
working now.

I will go ahead and change the parameter - default_statistics_target to 100
and analyze whole database and then wait for couple of hours and then run
my queries. Let me know if this approach is good.

Regards,
Mukesh

On Wed, Nov 13, 2019 at 12:42 PM Andres Freund <andres(at)anarazel(dot)de> wrote:

> Hi,
>
> On 2019-11-13 11:37:30 -0700, Mukesh Chhatani wrote:
> > FYI.. default_statistics_target was set to 10000 but I changed it 100 and
> > even to 1000 and still planning time was high.
>
> Note that you'd need to ANALYZE the involved tables before that change
> actually would effect planning time.
>
>
> > Update I was able to resolve the problem by changing - partitioned
> tables to
> > single table and changing data type of 2 columns used in the joins from
> > varchar to varchar(50).
>
> That's not going to be the fix. There's no efficiency difference between
> those. It's more likely that, that the different statistics target would
> have taken effect after the alter table etc.
>
> - Andres
>


From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: Julien Rouhaud <rjuju123(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #16109: Postgres planning time is high across version (Expose buffer usage during planning in EXPLAIN)
Date: 2020-01-24 05:55:34
Message-ID: 20200124055534.GM13621@telsasoft.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

On Wed, Nov 13, 2019 at 11:39:04AM +0100, Julien Rouhaud wrote:
> (moved to -hackers)
>
> On Tue, Nov 12, 2019 at 9:55 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> >
> > This last point is more oriented towards other PG developers: I wonder
> > if we ought to display buffer statistics for plan time, for EXPLAIN
> > (BUFFERS). That'd surely make it easier to discern cases where we
> > e.g. access the index and scan a lot of the index from cases where we
> > hit some CPU time issue. We should easily be able to get that data, I
> > think, we already maintain it, we'd just need to compute the diff
> > between pgBufferUsage before / after planning.
>
> That would be quite interesting to have. I attach as a reference a
> quick POC patch to implement it:

+1

+ result.shared_blks_hit = stop->shared_blks_hit - start->shared_blks_hit;
+ result.shared_blks_read = stop->shared_blks_read - start->shared_blks_read;
+ result.shared_blks_dirtied = stop->shared_blks_dirtied -
+ start->shared_blks_dirtied;
[...]

I think it would be more readable and maintainable using a macro:

#define CALC_BUFF_DIFF(x) result.##x = stop->##x - start->##x
CALC_BUFF_DIFF(shared_blks_hit);
CALC_BUFF_DIFF(shared_blks_read);
CALC_BUFF_DIFF(shared_blks_dirtied);
...
#undefine CALC_BUFF_DIFF


From: Julien Rouhaud <rjuju123(at)gmail(dot)com>
To: Justin Pryzby <pryzby(at)telsasoft(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: BUG #16109: Postgres planning time is high across version (Expose buffer usage during planning in EXPLAIN)
Date: 2020-01-24 21:06:11
Message-ID: CAOBaU_ZhmG7FZBH9z28nb7++zQ--FsqX_GcTefXa7oj1yrWnRg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

On Fri, Jan 24, 2020 at 6:55 AM Justin Pryzby <pryzby(at)telsasoft(dot)com> wrote:
>
> On Wed, Nov 13, 2019 at 11:39:04AM +0100, Julien Rouhaud wrote:
> > (moved to -hackers)
> >
> > On Tue, Nov 12, 2019 at 9:55 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > >
> > > This last point is more oriented towards other PG developers: I wonder
> > > if we ought to display buffer statistics for plan time, for EXPLAIN
> > > (BUFFERS). That'd surely make it easier to discern cases where we
> > > e.g. access the index and scan a lot of the index from cases where we
> > > hit some CPU time issue. We should easily be able to get that data, I
> > > think, we already maintain it, we'd just need to compute the diff
> > > between pgBufferUsage before / after planning.
> >
> > That would be quite interesting to have. I attach as a reference a
> > quick POC patch to implement it:
>
> +1
>
> + result.shared_blks_hit = stop->shared_blks_hit - start->shared_blks_hit;
> + result.shared_blks_read = stop->shared_blks_read - start->shared_blks_read;
> + result.shared_blks_dirtied = stop->shared_blks_dirtied -
> + start->shared_blks_dirtied;
> [...]
>
> I think it would be more readable and maintainable using a macro:
>
> #define CALC_BUFF_DIFF(x) result.##x = stop->##x - start->##x
> CALC_BUFF_DIFF(shared_blks_hit);
> CALC_BUFF_DIFF(shared_blks_read);
> CALC_BUFF_DIFF(shared_blks_dirtied);
> ...
> #undefine CALC_BUFF_DIFF

Good idea. Note that you can't use preprocessor concatenation to
generate something else than a token or a number, so the ## can just
be removed here.

Attachment Content-Type Size
show_planning_buffers-v2.diff application/octet-stream 7.8 KB

From: Julien Rouhaud <rjuju123(at)gmail(dot)com>
To: Justin Pryzby <pryzby(at)telsasoft(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: BUG #16109: Postgres planning time is high across version (Expose buffer usage during planning in EXPLAIN)
Date: 2020-01-29 11:15:59
Message-ID: CAOBaU_YAN6PW4perkROgXWtRmDNX8tvKwjgDpsR6wYetEUahGQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

On Fri, Jan 24, 2020 at 10:06 PM Julien Rouhaud <rjuju123(at)gmail(dot)com> wrote:
>
> On Fri, Jan 24, 2020 at 6:55 AM Justin Pryzby <pryzby(at)telsasoft(dot)com> wrote:
> >
> > On Wed, Nov 13, 2019 at 11:39:04AM +0100, Julien Rouhaud wrote:
> > > (moved to -hackers)
> > >
> > > On Tue, Nov 12, 2019 at 9:55 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > > >
> > > > This last point is more oriented towards other PG developers: I wonder
> > > > if we ought to display buffer statistics for plan time, for EXPLAIN
> > > > (BUFFERS). That'd surely make it easier to discern cases where we
> > > > e.g. access the index and scan a lot of the index from cases where we
> > > > hit some CPU time issue. We should easily be able to get that data, I
> > > > think, we already maintain it, we'd just need to compute the diff
> > > > between pgBufferUsage before / after planning.
> > >
> > > That would be quite interesting to have. I attach as a reference a
> > > quick POC patch to implement it:
> >
> > +1
> >
> > + result.shared_blks_hit = stop->shared_blks_hit - start->shared_blks_hit;
> > + result.shared_blks_read = stop->shared_blks_read - start->shared_blks_read;
> > + result.shared_blks_dirtied = stop->shared_blks_dirtied -
> > + start->shared_blks_dirtied;
> > [...]
> >
> > I think it would be more readable and maintainable using a macro:
> >
> > #define CALC_BUFF_DIFF(x) result.##x = stop->##x - start->##x
> > CALC_BUFF_DIFF(shared_blks_hit);
> > CALC_BUFF_DIFF(shared_blks_read);
> > CALC_BUFF_DIFF(shared_blks_dirtied);
> > ...
> > #undefine CALC_BUFF_DIFF
>
> Good idea. Note that you can't use preprocessor concatenation to
> generate something else than a token or a number, so the ## can just
> be removed here.

Rebase due to conflict with 3ec20c7091e97.

Attachment Content-Type Size
show_planning_buffers-v3.diff application/octet-stream 19.4 KB

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: Julien Rouhaud <rjuju123(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #16109: Postgres planning time is high across version (Expose buffer usage during planning in EXPLAIN)
Date: 2020-03-31 01:31:46
Message-ID: 20200331013146.GA14618@telsasoft.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

On Wed, Jan 29, 2020 at 12:15:59PM +0100, Julien Rouhaud wrote:
> Rebase due to conflict with 3ec20c7091e97.

This is failing to apply probably since 4a539a25ebfc48329fd656a95f3c1eb2cda38af3.
Could you rebase? (Also, not sure if this can be set as RFC?)

--
Justin


From: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
To: Justin Pryzby <pryzby(at)telsasoft(dot)com>, Julien Rouhaud <rjuju123(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #16109: Postgres planning time is high across version (Expose buffer usage during planning in EXPLAIN)
Date: 2020-04-01 17:51:36
Message-ID: 7bdd1e99-335f-a24a-ffbd-8a578a761970@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

On 2020/03/31 10:31, Justin Pryzby wrote:
> On Wed, Jan 29, 2020 at 12:15:59PM +0100, Julien Rouhaud wrote:
>> Rebase due to conflict with 3ec20c7091e97.
>
> This is failing to apply probably since 4a539a25ebfc48329fd656a95f3c1eb2cda38af3.
> Could you rebase? (Also, not sure if this can be set as RFC?)

I updated the patch. Attached.

+/* Compute the difference between two BufferUsage */
+BufferUsage
+ComputeBufferCounters(BufferUsage *start, BufferUsage *stop)

Since BufferUsageAccumDiff() was exported, ComputeBufferCounters() is
no longer necessary. In the patched version, BufferUsageAccumDiff() is
used to calculate the difference of buffer usage.

+ if (es->summary && (planduration || es->buffers))
+ ExplainOpenGroup("Planning", "Planning", true, es);

Isn't it more appropriate to check "bufusage" instead of "es->buffers" here?
The patch changes the code so that "bufusage" is checked.

+ "Planning Time": N.N, +
+ "Shared Hit Blocks": N, +
+ "Shared Read Blocks": N, +
+ "Shared Dirtied Blocks": N,+

Doesn't this indent look strange? IMO no indent for buffer usage is
necessary when the format is either json, xml, and yaml. This looks
better at least for me. OTOH, in text format, it seems better to indent
the buffer usage for more readability. Thought?
The patch changes the code so that "es->indent" is
increment/decrement only when the format is text.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

Attachment Content-Type Size
show_planning_buffers-v4.patch text/plain 8.7 KB

From: Julien Rouhaud <rjuju123(at)gmail(dot)com>
To: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
Cc: Justin Pryzby <pryzby(at)telsasoft(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: BUG #16109: Postgres planning time is high across version (Expose buffer usage during planning in EXPLAIN)
Date: 2020-04-01 18:47:12
Message-ID: CAOBaU_ZrKOMXLxxeqUvuFN1Xy2Kfmz7GFfT47ksCuecC8N2AOQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: Postg스포츠 토토 사이트SQL : Postg스포츠 토토 사이트SQL 메일 링리스트 : 2020-04-01 이후 PGSQL-BUGS 18:47 pgsql-hackers

On Wed, Apr 1, 2020 at 7:51 PM Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote:
>
>
> On 2020/03/31 10:31, Justin Pryzby wrote:
> > On Wed, Jan 29, 2020 at 12:15:59PM +0100, Julien Rouhaud wrote:
> >> Rebase due to conflict with 3ec20c7091e97.
> >
> > This is failing to apply probably since 4a539a25ebfc48329fd656a95f3c1eb2cda38af3.
> > Could you rebase? (Also, not sure if this can be set as RFC?)
>
> I updated the patch. Attached.

Thanks a lot! I'm sorry I missed Justin's ping, and it I just
realized that my cron job that used to warn me about cfbot failure was
broken :(

> +/* Compute the difference between two BufferUsage */
> +BufferUsage
> +ComputeBufferCounters(BufferUsage *start, BufferUsage *stop)
>
> Since BufferUsageAccumDiff() was exported, ComputeBufferCounters() is
> no longer necessary. In the patched version, BufferUsageAccumDiff() is
> used to calculate the difference of buffer usage.

Indeed, exposing BufferUsageAccumDiff wa definitely a good thing!

> + if (es->summary && (planduration || es->buffers))
> + ExplainOpenGroup("Planning", "Planning", true, es);
>
> Isn't it more appropriate to check "bufusage" instead of "es->buffers" here?
> The patch changes the code so that "bufusage" is checked.

AFAICS not unless ExplainOneQuery is also changed to pass a NULL
pointer if the BUFFER option wasn't provided (and maybe also
optionally skip the planning buffer computation). With this version
you now get:

=# explain (analyze, buffers off) update t1 set id = id;
QUERY PLAN
-------------------------------------------------------------------------------------------------------
Update on t1 (cost=0.00..22.70 rows=1270 width=42) (actual
time=0.170..0.170 rows=0 loops=1)
-> Seq Scan on t1 (cost=0.00..22.70 rows=1270 width=42) (actual
time=0.050..0.054 rows=1 loops=1)
Planning Time: 1.461 ms
Buffers: shared hit=25
Execution Time: 1.071 ms
(5 rows)

which seems wrong to me.

I reused the es->buffers to avoid having needing something like:

diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index b1f3fe13c6..9dbff97a32 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -375,7 +375,9 @@ ExplainOneQuery(Query *query, int cursorOptions,
BufferUsage bufusage_start,
bufusage;

- bufusage_start = pgBufferUsage;
+ if (ex->buffers)
+ bufusage_start = pgBufferUsage;
+
INSTR_TIME_SET_CURRENT(planstart);

/* plan the query */
@@ -384,13 +386,16 @@ ExplainOneQuery(Query *query, int cursorOptions,
INSTR_TIME_SET_CURRENT(planduration);
INSTR_TIME_SUBTRACT(planduration, planstart);

- /* calc differences of buffer counters. */
- memset(&bufusage, 0, sizeof(BufferUsage));
- BufferUsageAccumDiff(&bufusage, &pgBufferUsage, &bufusage_start);
+ if (es->buffers)
+ {
+ /* calc differences of buffer counters. */
+ memset(&bufusage, 0, sizeof(BufferUsage));
+ BufferUsageAccumDiff(&bufusage, &pgBufferUsage, &bufusage_start);
+ }

/* run it (if needed) and produce output */
ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
- &planduration, &bufusage);
+ &planduration, (es->buffers ? &bufusage : NULL));
}

which seemed like a win, but I'm not opposed to do that if you prefer.

>
> + "Planning Time": N.N, +
> + "Shared Hit Blocks": N, +
> + "Shared Read Blocks": N, +
> + "Shared Dirtied Blocks": N,+
>
> Doesn't this indent look strange? IMO no indent for buffer usage is
> necessary when the format is either json, xml, and yaml. This looks
> better at least for me. OTOH, in text format, it seems better to indent
> the buffer usage for more readability. Thought?
> The patch changes the code so that "es->indent" is
> increment/decrement only when the format is text.

Indeed, that's way better!


From: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
To: Julien Rouhaud <rjuju123(at)gmail(dot)com>
Cc: Justin Pryzby <pryzby(at)telsasoft(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: BUG #16109: Postgres planning time is high across version (Expose buffer usage during planning in EXPLAIN)
Date: 2020-04-02 04:05:56
Message-ID: 045faedf-108d-26e6-7307-ab1ac2d64624@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs Postg토토 사이트SQL

On 2020/04/02 3:47, Julien Rouhaud wrote:
> On Wed, Apr 1, 2020 at 7:51 PM Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote:
>>
>>
>> On 2020/03/31 10:31, Justin Pryzby wrote:
>>> On Wed, Jan 29, 2020 at 12:15:59PM +0100, Julien Rouhaud wrote:
>>>> Rebase due to conflict with 3ec20c7091e97.
>>>
>>> This is failing to apply probably since 4a539a25ebfc48329fd656a95f3c1eb2cda38af3.
>>> Could you rebase? (Also, not sure if this can be set as RFC?)
>>
>> I updated the patch. Attached.
>
> Thanks a lot! I'm sorry I missed Justin's ping, and it I just
> realized that my cron job that used to warn me about cfbot failure was
> broken :(
>
>> +/* Compute the difference between two BufferUsage */
>> +BufferUsage
>> +ComputeBufferCounters(BufferUsage *start, BufferUsage *stop)
>>
>> Since BufferUsageAccumDiff() was exported, ComputeBufferCounters() is
>> no longer necessary. In the patched version, BufferUsageAccumDiff() is
>> used to calculate the difference of buffer usage.
>
> Indeed, exposing BufferUsageAccumDiff wa definitely a good thing!
>
>> + if (es->summary && (planduration || es->buffers))
>> + ExplainOpenGroup("Planning", "Planning", true, es);
>>
>> Isn't it more appropriate to check "bufusage" instead of "es->buffers" here?
>> The patch changes the code so that "bufusage" is checked.
>
> AFAICS not unless ExplainOneQuery is also changed to pass a NULL
> pointer if the BUFFER option wasn't provided (and maybe also
> optionally skip the planning buffer computation). With this version
> you now get:
>
> =# explain (analyze, buffers off) update t1 set id = id;
> QUERY PLAN
> -------------------------------------------------------------------------------------------------------
> Update on t1 (cost=0.00..22.70 rows=1270 width=42) (actual
> time=0.170..0.170 rows=0 loops=1)
> -> Seq Scan on t1 (cost=0.00..22.70 rows=1270 width=42) (actual
> time=0.050..0.054 rows=1 loops=1)
> Planning Time: 1.461 ms
> Buffers: shared hit=25
> Execution Time: 1.071 ms
> (5 rows)
>
> which seems wrong to me.
>
> I reused the es->buffers to avoid having needing something like:

Yes, you're right! So I updated the patch as you suggested.
Attached is the updated version of the patch.
Thanks for the review!

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

Attachment Content-Type Size
show_planning_buffers-v5.patch text/plain 8.8 KB

From: Julien Rouhaud <rjuju123(at)gmail(dot)com>
To: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
Cc: Justin Pryzby <pryzby(at)telsasoft(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: BUG #16109: Postgres planning time is high across version (Expose buffer usage during planning in EXPLAIN)
Date: 2020-04-02 06:01:54
Message-ID: 20200402060154.gylqmvwdbes5s5ni@nol
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: Postg범퍼카 토토SQL : Postg범퍼카 토토SQL 메일 링리스트 : 2020-04-02 이후 PGSQL-BUGS 06:01 pgsql-hackers

On Thu, Apr 02, 2020 at 01:05:56PM +0900, Fujii Masao wrote:
>
>
> On 2020/04/02 3:47, Julien Rouhaud wrote:
> > On Wed, Apr 1, 2020 at 7:51 PM Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote:
> > >
> > >
> > > On 2020/03/31 10:31, Justin Pryzby wrote:
> > > > On Wed, Jan 29, 2020 at 12:15:59PM +0100, Julien Rouhaud wrote:
> > > > > Rebase due to conflict with 3ec20c7091e97.
> > > >
> > > > This is failing to apply probably since 4a539a25ebfc48329fd656a95f3c1eb2cda38af3.
> > > > Could you rebase? (Also, not sure if this can be set as RFC?)
> > >
> > > I updated the patch. Attached.
> >
> > Thanks a lot! I'm sorry I missed Justin's ping, and it I just
> > realized that my cron job that used to warn me about cfbot failure was
> > broken :(
> >
> > > +/* Compute the difference between two BufferUsage */
> > > +BufferUsage
> > > +ComputeBufferCounters(BufferUsage *start, BufferUsage *stop)
> > >
> > > Since BufferUsageAccumDiff() was exported, ComputeBufferCounters() is
> > > no longer necessary. In the patched version, BufferUsageAccumDiff() is
> > > used to calculate the difference of buffer usage.
> >
> > Indeed, exposing BufferUsageAccumDiff wa definitely a good thing!
> >
> > > + if (es->summary && (planduration || es->buffers))
> > > + ExplainOpenGroup("Planning", "Planning", true, es);
> > >
> > > Isn't it more appropriate to check "bufusage" instead of "es->buffers" here?
> > > The patch changes the code so that "bufusage" is checked.
> >
> > AFAICS not unless ExplainOneQuery is also changed to pass a NULL
> > pointer if the BUFFER option wasn't provided (and maybe also
> > optionally skip the planning buffer computation). With this version
> > you now get:
> >
> > =# explain (analyze, buffers off) update t1 set id = id;
> > QUERY PLAN
> > -------------------------------------------------------------------------------------------------------
> > Update on t1 (cost=0.00..22.70 rows=1270 width=42) (actual
> > time=0.170..0.170 rows=0 loops=1)
> > -> Seq Scan on t1 (cost=0.00..22.70 rows=1270 width=42) (actual
> > time=0.050..0.054 rows=1 loops=1)
> > Planning Time: 1.461 ms
> > Buffers: shared hit=25
> > Execution Time: 1.071 ms
> > (5 rows)
> >
> > which seems wrong to me.
> >
> > I reused the es->buffers to avoid having needing something like:
>
> Yes, you're right! So I updated the patch as you suggested.
> Attached is the updated version of the patch.
> Thanks for the review!

Thanks a lot, it all looks good to me!


From: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
To: Julien Rouhaud <rjuju123(at)gmail(dot)com>
Cc: Justin Pryzby <pryzby(at)telsasoft(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: BUG #16109: Postgres planning time is high across version (Expose buffer usage during planning in EXPLAIN)
Date: 2020-04-02 06:52:17
Message-ID: 0abcbb8f-5b96-da04-ad7d-b84c36d09dd0@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs Postg사설 토토 사이트SQL

On 2020/04/02 15:01, Julien Rouhaud wrote:
> On Thu, Apr 02, 2020 at 01:05:56PM +0900, Fujii Masao wrote:
>>
>>
>> On 2020/04/02 3:47, Julien Rouhaud wrote:
>>> On Wed, Apr 1, 2020 at 7:51 PM Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote:
>>>>
>>>>
>>>> On 2020/03/31 10:31, Justin Pryzby wrote:
>>>>> On Wed, Jan 29, 2020 at 12:15:59PM +0100, Julien Rouhaud wrote:
>>>>>> Rebase due to conflict with 3ec20c7091e97.
>>>>>
>>>>> This is failing to apply probably since 4a539a25ebfc48329fd656a95f3c1eb2cda38af3.
>>>>> Could you rebase? (Also, not sure if this can be set as RFC?)
>>>>
>>>> I updated the patch. Attached.
>>>
>>> Thanks a lot! I'm sorry I missed Justin's ping, and it I just
>>> realized that my cron job that used to warn me about cfbot failure was
>>> broken :(
>>>
>>>> +/* Compute the difference between two BufferUsage */
>>>> +BufferUsage
>>>> +ComputeBufferCounters(BufferUsage *start, BufferUsage *stop)
>>>>
>>>> Since BufferUsageAccumDiff() was exported, ComputeBufferCounters() is
>>>> no longer necessary. In the patched version, BufferUsageAccumDiff() is
>>>> used to calculate the difference of buffer usage.
>>>
>>> Indeed, exposing BufferUsageAccumDiff wa definitely a good thing!
>>>
>>>> + if (es->summary && (planduration || es->buffers))
>>>> + ExplainOpenGroup("Planning", "Planning", true, es);
>>>>
>>>> Isn't it more appropriate to check "bufusage" instead of "es->buffers" here?
>>>> The patch changes the code so that "bufusage" is checked.
>>>
>>> AFAICS not unless ExplainOneQuery is also changed to pass a NULL
>>> pointer if the BUFFER option wasn't provided (and maybe also
>>> optionally skip the planning buffer computation). With this version
>>> you now get:
>>>
>>> =# explain (analyze, buffers off) update t1 set id = id;
>>> QUERY PLAN
>>> -------------------------------------------------------------------------------------------------------
>>> Update on t1 (cost=0.00..22.70 rows=1270 width=42) (actual
>>> time=0.170..0.170 rows=0 loops=1)
>>> -> Seq Scan on t1 (cost=0.00..22.70 rows=1270 width=42) (actual
>>> time=0.050..0.054 rows=1 loops=1)
>>> Planning Time: 1.461 ms
>>> Buffers: shared hit=25
>>> Execution Time: 1.071 ms
>>> (5 rows)
>>>
>>> which seems wrong to me.
>>>
>>> I reused the es->buffers to avoid having needing something like:
>>
>> Yes, you're right! So I updated the patch as you suggested.
>> Attached is the updated version of the patch.
>> Thanks for the review!
>
>
> Thanks a lot, it all looks good to me!

Thanks! Barring any objection, I will commit the latest version of the patch.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION


From: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
To: Julien Rouhaud <rjuju123(at)gmail(dot)com>
Cc: Justin Pryzby <pryzby(at)telsasoft(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: BUG #16109: Postgres planning time is high across version (Expose buffer usage during planning in EXPLAIN)
Date: 2020-04-03 02:31:20
Message-ID: 1ee339d1-790f-a0a9-38f8-99d4e70630d2@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs Postg롤 토토SQL :

On 2020/04/02 15:52, Fujii Masao wrote:
>
>
> On 2020/04/02 15:01, Julien Rouhaud wrote:
>> On Thu, Apr 02, 2020 at 01:05:56PM +0900, Fujii Masao wrote:
>>>
>>>
>>> On 2020/04/02 3:47, Julien Rouhaud wrote:
>>>> On Wed, Apr 1, 2020 at 7:51 PM Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote:
>>>>>
>>>>>
>>>>> On 2020/03/31 10:31, Justin Pryzby wrote:
>>>>>> On Wed, Jan 29, 2020 at 12:15:59PM +0100, Julien Rouhaud wrote:
>>>>>>> Rebase due to conflict with 3ec20c7091e97.
>>>>>>
>>>>>> This is failing to apply probably since 4a539a25ebfc48329fd656a95f3c1eb2cda38af3.
>>>>>> Could you rebase?   (Also, not sure if this can be set as RFC?)
>>>>>
>>>>> I updated the patch. Attached.
>>>>
>>>> Thanks a lot!  I'm sorry I missed Justin's ping, and it I just
>>>> realized that my cron job that used to warn me about cfbot failure was
>>>> broken :(
>>>>
>>>>> +/* Compute the difference between two BufferUsage */
>>>>> +BufferUsage
>>>>> +ComputeBufferCounters(BufferUsage *start, BufferUsage *stop)
>>>>>
>>>>> Since BufferUsageAccumDiff() was exported, ComputeBufferCounters() is
>>>>> no longer necessary. In the patched version, BufferUsageAccumDiff() is
>>>>> used to calculate the difference of buffer usage.
>>>>
>>>> Indeed, exposing BufferUsageAccumDiff wa definitely a good thing!
>>>>
>>>>> +       if (es->summary && (planduration || es->buffers))
>>>>> +               ExplainOpenGroup("Planning", "Planning", true, es);
>>>>>
>>>>> Isn't it more appropriate to check "bufusage" instead of "es->buffers" here?
>>>>> The patch changes the code so that "bufusage" is checked.
>>>>
>>>> AFAICS not unless ExplainOneQuery is also changed to pass a NULL
>>>> pointer if the BUFFER option wasn't provided (and maybe also
>>>> optionally skip the planning buffer computation).  With this version
>>>> you now get:
>>>>
>>>> =# explain (analyze, buffers off) update t1 set id = id;
>>>>                                                 QUERY PLAN
>>>> -------------------------------------------------------------------------------------------------------
>>>>    Update on t1  (cost=0.00..22.70 rows=1270 width=42) (actual
>>>> time=0.170..0.170 rows=0 loops=1)
>>>>      ->  Seq Scan on t1  (cost=0.00..22.70 rows=1270 width=42) (actual
>>>> time=0.050..0.054 rows=1 loops=1)
>>>>    Planning Time: 1.461 ms
>>>>      Buffers: shared hit=25
>>>>    Execution Time: 1.071 ms
>>>> (5 rows)
>>>>
>>>> which seems wrong to me.
>>>>
>>>> I reused the es->buffers to avoid having needing something like:
>>>
>>> Yes, you're right! So I updated the patch as you suggested.
>>> Attached is the updated version of the patch.
>>> Thanks for the review!
>>
>>
>> Thanks a lot, it all looks good to me!
>
> Thanks! Barring any objection, I will commit the latest version of the patch.

Pushed! Thanks a lot!!

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION