Re: stats_ext test fails with -DCATCACHE_FORCE_RELEASE

Lists: pgsql-hackers
From: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
To: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: stats_ext test fails with -DCATCACHE_FORCE_RELEASE
Date: 2018-05-01 07:16:35
Message-ID: 1349aabb-3a1f-6675-9fc0-65e2ce7491dd@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi.

While playing around with a -DCATCACHE_FORCE_RELEASE build, I noticed that
stats_ext test failed with errors for multiple statements that looked like
this:

ERROR: invalid ndistinct magic 7f7f7f7f (expected a352bfa4)

I figured it's because statext_dependencies_load() and
statext_ndistinct_build() both return a pointer that points directly into
a pg_statistics_ext tuple obtained by using SearchSysCache1 that has
uncertain lifetime after subsequent call to ReleaseSysCache before returning.

I think we should be calling statext_dependencies_deserialize() and
statext_ndistinct_deserialize(), respectively, *before* we perform
ReleaseSysCache on the tuple. Attached patch does that.

Thanks,
Amit

Attachment Content-Type Size
extended-stat-catalog-datum-1.patch text/plain 1.6 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: stats_ext test fails with -DCATCACHE_FORCE_RELEASE
Date: 2018-05-01 15:33:35
Message-ID: 22200.1525188815@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp> writes:
> While playing around with a -DCATCACHE_FORCE_RELEASE build, I noticed that
> stats_ext test failed with errors for multiple statements that looked like
> this:
> ERROR: invalid ndistinct magic 7f7f7f7f (expected a352bfa4)

Interesting. How come the buildfarm hasn't noticed this? I should
think that the CLOBBER_CACHE_ALWAYS animals, as well as the one(s)
using -DCATCACHE_FORCE_RELEASE, would have shown failures.

regards, tom lane


From: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: stats_ext test fails with -DCATCACHE_FORCE_RELEASE
Date: 2018-05-02 01:40:11
Message-ID: 9afdefec-770c-ebff-5422-51ebeec7c494@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2018/05/02 0:33, Tom Lane wrote:
> Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp> writes:
>> While playing around with a -DCATCACHE_FORCE_RELEASE build, I noticed that
>> stats_ext test failed with errors for multiple statements that looked like
>> this:
>> ERROR: invalid ndistinct magic 7f7f7f7f (expected a352bfa4)
>
> Interesting. How come the buildfarm hasn't noticed this? I should
> think that the CLOBBER_CACHE_ALWAYS animals, as well as the one(s)
> using -DCATCACHE_FORCE_RELEASE, would have shown failures.

I too wondered why. Fwiw, similar failure occurs in PG 10 branch.

Thanks,
Amit


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Andrew Dunstan <andrew(at)dunslane(dot)net>
Subject: Re: stats_ext test fails with -DCATCACHE_FORCE_RELEASE
Date: 2018-05-02 15:48:29
Message-ID: 32282.1525276109@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp> writes:
> On 2018/05/02 0:33, Tom Lane wrote:
>> Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp> writes:
>>> While playing around with a -DCATCACHE_FORCE_RELEASE build, I noticed that
>>> stats_ext test failed with errors for multiple statements that looked like
>>> this:
>>> ERROR: invalid ndistinct magic 7f7f7f7f (expected a352bfa4)

>> Interesting. How come the buildfarm hasn't noticed this? I should
>> think that the CLOBBER_CACHE_ALWAYS animals, as well as the one(s)
>> using -DCATCACHE_FORCE_RELEASE, would have shown failures.

> I too wondered why. Fwiw, similar failure occurs in PG 10 branch.

Ah, after looking closer I understand that. First, there isn't any
buildfarm member using CATCACHE_FORCE_RELEASE --- what I was thinking
of is Andrew's prion, which uses RELCACHE_FORCE_RELEASE. Not the
same thing.

Second, the nature of the bug is that these functions are reading
from a catcache entry immediately after ReleaseSysCache, when they
should do that immediately before. CLOBBER_CACHE_ALWAYS does not
trigger the problem because it clobbers cache only at invalidation
opportunities. In the current implementation, ReleaseSysCache per
se is not an invalidation opportunity.

CATCACHE_FORCE_RELEASE does model a real-world hazard, which is
that if we get an invalidation signal for a catcache item *while
it's pinned*, it'd go away as soon as the last pin is released.
Evidently, these code paths do not contain any invalidation
opportunities occurring while the pin on the stats_ext catcache
entry is already held, so CCA can't trigger the problem. I think
this means that there's no production hazard here, just a violation
of coding convention. Nonetheless, we certainly should fix it,
since it's easy to imagine future changes that would create a live
hazard of the tuple going away during the ReleaseSysCache call.

tl;dr: we lack buildfarm coverage of CATCACHE_FORCE_RELEASE.
This is probably bad. It might be okay to just add that to
prion's configuration; I'm not sure whether there's any value
in testing it separately from RELCACHE_FORCE_RELEASE.

regards, tom lane


From: Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Andrew Dunstan <andrew(at)dunslane(dot)net>
Subject: Re: stats_ext test fails with -DCATCACHE_FORCE_RELEASE
Date: 2018-05-02 19:19:08
Message-ID: CAA8=A78SaxVkovbJxRu4Pj8UuXk+J4RAmTf9C4KFYML_uNkoLA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, May 2, 2018 at 11:48 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> tl;dr: we lack buildfarm coverage of CATCACHE_FORCE_RELEASE.
> This is probably bad. It might be okay to just add that to
> prion's configuration;

Will do that pronto.

cheers

andre

--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services