Quick Links

Re: ResourceOwner refactoring

Lists:	pgsql-hackers

From:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To:	pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Cc:	Michael Paquier <michael(at)paquier(dot)xyz>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Subject:	ResourceOwner refactoring
Date:	2020-11-17 14:21:29
Message-ID:	cbfabeb0-cd3c-e951-a572-19b365ed314d@iki.fi
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Two recent patches that I have reviewed have reminded me that the
ResourceOwner interface is a bit clunky. There are two issues:

1. Performance. It's fast enough in most use, but when I was testing
Kyotaro's catcache patches [1], the Resowner functions to track catcache
references nevertheless showed up, accounting for about 20% of the CPU
time used by catcache lookups.

2. It's difficult for extensions to use. There is a callback mechanism
for extensions, but it's much less convenient to use than the built-in
functions. The pgcrypto code uses the callbacks currently, and Michael's
patch [2] would move that support for tracking OpenSSL contexts to the
core, which makes it a lot more convenient for pgcrypto. Wouldn't it be
nice if extensions could have the same ergonomics as built-in code, if
they need to track external resources?

Attached patch refactors the ResourceOwner internals to do that.

The current code in HEAD has a separate array for each "kind" of object
that can be tracked. The number of different kinds of objects has grown
over the years, currently there is support for tracking buffers, files,
catcache, relcache and plancache references, tupledescs, snapshots, DSM
segments and LLVM JIT contexts. And locks, which are a bit special.

In the patch, I'm using the same array to hold all kinds of objects, and
carry another field with each tracked object, to tell what kind of an
object it is. An extension can define a new object kind, by defining
Release and PrintLeakWarning callback functions for it. The code in
resowner.c is now much more generic, as it doesn't need to know about
all the different object kinds anymore (with a couple of exceptions). In
the new scheme, there is one small fixed-size array to hold a few most
recently-added references, that is linear-searched, and older entries
are moved to a hash table.

I haven't done thorough performance testing of this, but with some quick
testing with Kyotaro's "catcachebench" to stress-test syscache lookups,
this performs a little better. In that test, all the activity happens in
the small array portion, but I believe that's true for most use cases.

Thoughts? Can anyone suggest test scenarios to verify the performance of
this?

[1]
/message-id/20201106.172958.1103727352904717607.horikyota.ntt%40gmail.com

[2] /message-id/20201113031429.GB1631@paquier.xyz

- Heikki

Attachment	Content-Type	Size
v1-resowner-refactor.patch	text/x-patch	76.2 KB

From:	Michael Paquier <michael(at)paquier(dot)xyz>
To:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc:	pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>
Subject:	Re: ResourceOwner refactoring
Date:	2020-11-18 08:06:48
Message-ID:	20201118080648.GL19692@paquier.xyz
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Tue, Nov 17, 2020 at 04:21:29PM +0200, Heikki Linnakangas wrote:
> 2. It's difficult for extensions to use. There is a callback mechanism for
> extensions, but it's much less convenient to use than the built-in
> functions. The pgcrypto code uses the callbacks currently, and Michael's
> patch [2] would move that support for tracking OpenSSL contexts to the core,
> which makes it a lot more convenient for pgcrypto. Wouldn't it be nice if
> extensions could have the same ergonomics as built-in code, if they need to
> track external resources?

+1. True that the current interface is a bit hard to grasp, one can
easily miss that showing reference leaks is very important if an
allocation happens out of the in-core palloc machinery at commit time.
(The patch you are referring to is not really popular, but that does
not prevent this thread to move on on its own.)

> Attached patch refactors the ResourceOwner internals to do that.

+ * Size of the small fixed-size array to hold most-recently remembered resources.
*/
-#define RESARRAY_INIT_SIZE 16
+#define RESOWNER_ARRAY_SIZE 8
Why did you choose this size for the initial array?

+extern const char *ResourceOwnerGetName(ResourceOwner owner);
This is only used in resowner.c, at one place.

@@ -140,7 +139,6 @@ jit_release_context(JitContext *context)
if (provider_successfully_loaded)
provider.release_context(context);

- ResourceOwnerForgetJIT(context->resowner, PointerGetDatum(context));
[...]
+ ResourceOwnerForget(context->resowner, PointerGetDatum(context), &jit_funcs);
This moves the JIT context release from jit.c to llvm.c. I think
that's indeed more consistent, and correct. It would be better to
check this one with Andres, though.

> I haven't done thorough performance testing of this, but with some quick
> testing with Kyotaro's "catcachebench" to stress-test syscache lookups, this
> performs a little better. In that test, all the activity happens in the
> small array portion, but I believe that's true for most use cases.

> Thoughts? Can anyone suggest test scenarios to verify the performance of
> this?

Playing with catcache.c is the first thing that came to my mind.
After that some micro-benchmarking with DSM or snapshots? I am not
sure if we would notice a difference in a real-life scenario, say even
a pgbench -S with prepared queries.
--
Michael

From:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To:	Michael Paquier <michael(at)paquier(dot)xyz>
Cc:	pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>
Subject:	Re: ResourceOwner refactoring
Date:	2020-11-18 08:50:08
Message-ID:	aabd6a93-c110-4ba2-fa2d-d6f7c44e49dd@iki.fi
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 18/11/2020 10:06, Michael Paquier wrote:
> On Tue, Nov 17, 2020 at 04:21:29PM +0200, Heikki Linnakangas wrote:
>> Attached patch refactors the ResourceOwner internals to do that.
>
> + * Size of the small fixed-size array to hold most-recently remembered resources.
> */
> -#define RESARRAY_INIT_SIZE 16
> +#define RESOWNER_ARRAY_SIZE 8
> Why did you choose this size for the initial array?

Just a guess. The old init size was 16 Datums. The entries in the new
array are twice as large, Datum+pointer.

The "RESOWNER_STATS" #ifdef blocks can be enabled to check how many
lookups fit in the array. With pgbench, RESOWNER_ARRAY_SIZE 8:

RESOWNER STATS: lookups: array 235, hash 32

If RESOWNER_ARRAY_STATS is increased to 16, all the lookups fit in the
array. But I haven't done any benchmarking to see which is faster.

BTW, I think there would be an easy win in the hashing codepath, by
changing to a cheaper hash function. Currently, with or without this
patch, we use hash_any(). Changing that to murmurhash32() or something
similar would be a drop-in replacement.

- Heikki

From:	Michael Paquier <michael(at)paquier(dot)xyz>
To:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc:	pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>
Subject:	Re: ResourceOwner refactoring
Date:	2020-11-19 02:16:05
Message-ID:	20201119021605.GC26172@paquier.xyz
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Wed, Nov 18, 2020 at 10:50:08AM +0200, Heikki Linnakangas wrote:
> If RESOWNER_ARRAY_STATS is increased to 16, all the lookups fit in the
> array. But I haven't done any benchmarking to see which is faster.

My gut tells me that your guess is right, but it would be better to be
sure.

> BTW, I think there would be an easy win in the hashing codepath, by changing
> to a cheaper hash function. Currently, with or without this patch, we use
> hash_any(). Changing that to murmurhash32() or something similar would be a
> drop-in replacement.

Good idea.
--
Michael

From:	Julien Rouhaud <rjuju123(at)gmail(dot)com>
To:	Michael Paquier <michael(at)paquier(dot)xyz>
Cc:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>
Subject:	Re: ResourceOwner refactoring
Date:	2020-11-19 09:27:18
Message-ID:	CAOBaU_a1Rt_Vn0XxWU1fiGrWWzEZMrCEjxDDsEePGc=i9EYVmQ@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Thu, Nov 19, 2020 at 10:16 AM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>
> On Wed, Nov 18, 2020 at 10:50:08AM +0200, Heikki Linnakangas wrote:
> > If RESOWNER_ARRAY_STATS is increased to 16, all the lookups fit in the
> > array. But I haven't done any benchmarking to see which is faster.
>
> My gut tells me that your guess is right, but it would be better to be
> sure.
>
> > BTW, I think there would be an easy win in the hashing codepath, by changing
> > to a cheaper hash function. Currently, with or without this patch, we use
> > hash_any(). Changing that to murmurhash32() or something similar would be a
> > drop-in replacement.
>
> Good idea.

+1, and +1 for this refactoring.

I just saw a minor issue in a comment while reviewing the patch:

[...]
+ /* Is there space in the hash? If not, enlarge it. */

/* Double the capacity of the array (capacity must stay a power of 2!) */
- newcap = (oldcap > 0) ? oldcap * 2 : RESARRAY_INIT_SIZE;
[...]

The existing comment is kept as-is, but it should mention that it's
now the hash capacity that is increased.

From:	Craig Ringer <craig(dot)ringer(at)enterprisedb(dot)com>
To:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc:	pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Michael Paquier <michael(at)paquier(dot)xyz>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Subject:	Re: ResourceOwner refactoring
Date:	2020-11-19 10:40:19
Message-ID:	CAGRY4nx+RrrcC+qxs10PAbnpeKC4we8Bf+E1qz+c-L3G6UkdUQ@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

[off-list for now]

On Tue, Nov 17, 2020 at 10:21 PM Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:

>
> 2. It's difficult for extensions to use. There is a callback mechanism
> for extensions, but it's much less convenient to use than the built-in
> functions. The pgcrypto code uses the callbacks currently, and Michael's
> patch [2] would move that support for tracking OpenSSL contexts to the
> core, which makes it a lot more convenient for pgcrypto. Wouldn't it be
> nice if extensions could have the same ergonomics as built-in code, if
> they need to track external resources?
>

Yes, in general I'm a big fan of making life easier for extensions (see
postscript), and this looks helpful. It's certainly fairly clear to read.
I'm in-principle in favour, though I haven't read the old resowner code
closely enough to have a strong opinion.

I haven't done thorough performance testing of this, but with some quick
> testing with Kyotaro's "catcachebench" to stress-test syscache lookups,
> this performs a little better. In that test, all the activity happens in
> the small array portion, but I believe that's true for most use cases.
>
> Thoughts? Can anyone suggest test scenarios to verify the performance of
> this?
>

I didn't think of much really.

I have tossed together a patch on top of yours that adds some
systemtap/dtrace tracepoints and provides a demo systemtap script that
shows some basic stats collection done using them. My intention was to make
it easier to see where the work in the resource owner code was being done.
But after I did the initial version I realised that in this case it
probably doesn't add a lot of value over using targeted profiling of only
functions in resowner.c and their callees which is easy enough using perf
and regular DWARF debuginfo. So I didn't land up polishing it and adding
stats for the other counters. It's attached anyway, in case it's
interesting or useful to anyone.

P.S. At some point I want to tackle some things similar to what you've done
here for other kinds of backend resources. In particular, to allow
extensions to register their own heavyweight lock types - with the ability
to handle lock timeouts, and add callbacks in the deadlock detector to
allow extensions to register dependency edges that are not natively visible
to the deadlock detector and to choose victim priorities. Also some
enhancements to how pg_depend tracking can be used by extensions. And I
want to help get the proposed custom ProcSignal changes in too. I think
there's a whole lot to be done to make extensions easier and safer to
author in general, like providing a simple way to do error trapping and
recovery in an extension mainloop without copy/pasting a bunch of
PostgresMain code, better default signal handlers, startup/shutdown that
shares more with user backends, etc. Right now it's quite tricky to get
bgworkers to behave well. </wildhandwaving>

Attachment	Content-Type	Size
0001-Poc-of-systemtap-probes-for-resowner-timings.patch	text/x-patch	14.5 KB

From:	Michael Paquier <michael(at)paquier(dot)xyz>
To:	Craig Ringer <craig(dot)ringer(at)enterprisedb(dot)com>
Cc:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Subject:	Re: ResourceOwner refactoring
Date:	2020-11-26 07:29:47
Message-ID:	X79Za8w6t/6WVsW1@paquier.xyz
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Thu, Nov 19, 2020 at 06:40:19PM +0800, Craig Ringer wrote:
> [off-list for now]

Looks like you have missed something here.
--
Michael

From:	Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To:	rjuju123(at)gmail(dot)com
Cc:	michael(at)paquier(dot)xyz, hlinnaka(at)iki(dot)fi, pgsql-hackers(at)postgresql(dot)org, andres(at)anarazel(dot)de
Subject:	Re: ResourceOwner refactoring
Date:	2020-11-26 08:52:53
Message-ID:	20201126.175253.151474327206611616.horikyota.ntt@gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

At Thu, 19 Nov 2020 17:27:18 +0800, Julien Rouhaud <rjuju123(at)gmail(dot)com> wrote in
> On Thu, Nov 19, 2020 at 10:16 AM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
> >
> > On Wed, Nov 18, 2020 at 10:50:08AM +0200, Heikki Linnakangas wrote:
> > > If RESOWNER_ARRAY_STATS is increased to 16, all the lookups fit in the
> > > array. But I haven't done any benchmarking to see which is faster.
> >
> > My gut tells me that your guess is right, but it would be better to be
> > sure.
> >
> > > BTW, I think there would be an easy win in the hashing codepath, by changing
> > > to a cheaper hash function. Currently, with or without this patch, we use
> > > hash_any(). Changing that to murmurhash32() or something similar would be a
> > > drop-in replacement.
> >
> > Good idea.
>
> +1, and +1 for this refactoring.

+1 for making the interface more generic. I thought a similar thing
when I added an resowner array for WaitEventSet (not committed yet).
About performance, though I'm not sure about, there's no reason not to
do this as far as the resowner mechanism doesn't get slower, and +2 if
gets faster.

> I just saw a minor issue in a comment while reviewing the patch:
>
> [...]
> + /* Is there space in the hash? If not, enlarge it. */
>
> /* Double the capacity of the array (capacity must stay a power of 2!) */
> - newcap = (oldcap > 0) ? oldcap * 2 : RESARRAY_INIT_SIZE;
> [...]
>
> The existing comment is kept as-is, but it should mention that it's
> now the hash capacity that is increased.

+ /* And release old array. */
+ pfree(oldhash);

+ for (int i = 0; i < owner->narr; i++)
{
+ if (owner->arr[i].kind->phase == phase)
+ {
+ Datum value = owner->arr[i].item;
+ ResourceOwnerFuncs *kind = owner->arr[i].kind;
+
+ if (printLeakWarnings)
+ kind->PrintLeakWarning(value);
+ kind->ReleaseResource(value);
+ found = true;
+ }
+ /*
+ * If any resources were released, check again because some of the
+ * elements might have moved by the callbacks. We don't want to
+ * miss them.
+ */
+ } while (found && owner->narr > 0);

Coundn't that missing be avoided by just not incrementing i if found?

+ /*
+ * Like with the array, we must check again after we reach the
+ * end, if any callbacks were called. XXX: We could probably
+ * stipulate that the callbacks may not do certain thing, like
+ * remember more references in the same resource owner, to avoid
+ * that.
+ */

If I read this patch correctly, ResourceOwnerForget doesn't seem to do
such a thing for hash?

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

From:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To:	Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, rjuju123(at)gmail(dot)com
Cc:	michael(at)paquier(dot)xyz, pgsql-hackers(at)postgresql(dot)org, andres(at)anarazel(dot)de
Subject:	Re: ResourceOwner refactoring
Date:	2020-12-16 15:46:33
Message-ID:	20fe4d27-a948-9674-dccd-b0567f155f38@iki.fi
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 26/11/2020 10:52, Kyotaro Horiguchi wrote:
> + for (int i = 0; i < owner->narr; i++)
> {
> + if (owner->arr[i].kind->phase == phase)
> + {
> + Datum value = owner->arr[i].item;
> + ResourceOwnerFuncs *kind = owner->arr[i].kind;
> +
> + if (printLeakWarnings)
> + kind->PrintLeakWarning(value);
> + kind->ReleaseResource(value);
> + found = true;
> + }
> + /*
> + * If any resources were released, check again because some of the
> + * elements might have moved by the callbacks. We don't want to
> + * miss them.
> + */
> + } while (found && owner->narr > 0);
>
> Coundn't that missing be avoided by just not incrementing i if found?

Hmm, perhaps. ResourceOwnerForget() can move an entry from the end of
the array to the beginning, but if it's removing the entry that we're
just looking at, it probably can't move anything before that entry. I'm
not very comfortable with that, though. What if the callback releases
something else as a side effect?

This isn't super-critical for performance, and given that the array is
very small, it's very cheap to loop through it. So I'm inclined to keep
it safe.

> + /*
> + * Like with the array, we must check again after we reach the
> + * end, if any callbacks were called. XXX: We could probably
> + * stipulate that the callbacks may not do certain thing, like
> + * remember more references in the same resource owner, to avoid
> + * that.
> + */
>
> If I read this patch correctly, ResourceOwnerForget doesn't seem to do
> such a thing for hash?

Hmm, true. I tried removing the loop and hit another issue, however: if
the same resource has been remembered twice in the same resource owner,
the callback might remove different reference to it than what we're
looking at. So we need to loop anyway to handle that. Also, what if the
callback remembers some other resource in the same resowner, causing the
hash to grow? I'm not sure if any of the callbacks currently do that,
but better safe than sorry. I updated the code and the comments accordingly.

Here's an updated version of this patch. I fixed the bit rot, and
addressed all the other comment issues and such that people pointed out
(thanks!).

TODO:

1. Performance testing. We discussed this a little bit, but I haven't
done any more testing.

2. Before this patch, the ResourceOwnerEnlarge() calls enlarged the
array for the particular "kind" of resource, but now they are all stored
in the same structure. That could lead to trouble if we do something
like this:

ResourceOwnerEnlargeAAA()
ResourceOwnerEnlargeBBB()
ResourceOwnerRememberAAA()
ResourceOwnerRememberBBB()

Previously, this was OK, because resources AAA and BBB were kept in
separate arrays. But after this patch, it's not guaranteed that the
ResourceOwnerRememberBBB() will find an empty slot.

I don't think we do that currently, but to be sure, I'm planning to grep
ResourceOwnerRemember and look at each call carefullly. And perhaps we
can add an assertion for this, although I'm not sure where.

- Heikki

Attachment	Content-Type	Size
v2-0001-Make-resowners-more-easily-extensible.patch	text/x-patch	82.0 KB

From:	"kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>
To:	'Heikki Linnakangas' <hlinnaka(at)iki(dot)fi>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, "rjuju123(at)gmail(dot)com" <rjuju123(at)gmail(dot)com>
Cc:	"michael(at)paquier(dot)xyz" <michael(at)paquier(dot)xyz>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, "andres(at)anarazel(dot)de" <andres(at)anarazel(dot)de>
Subject:	RE: ResourceOwner refactoring
Date:	2021-01-13 01:55:15
Message-ID:	OSBPR01MB31575F85F7363FDA04CCE264F5A90@OSBPR01MB3157.jpnprd01.prod.outlook.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	Postg사설 토토SQL

Dear Heikki,

I'm also interested in this patch, but it cannot be applied to the current HEAD...

$ git apply ~/v2-0001-Make-resowners-more-easily-extensible.patch
error: patch failed: src/common/cryptohash_openssl.c:57
error: src/common/cryptohash_openssl.c: patch does not apply
error: patch failed: src/include/utils/resowner_private.h:1
error: src/include/utils/resowner_private.h: patch does not apply

Best Regards,
Hayato Kuroda
FUJITSU LIMITED

From:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To:	"kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, "rjuju123(at)gmail(dot)com" <rjuju123(at)gmail(dot)com>
Cc:	"michael(at)paquier(dot)xyz" <michael(at)paquier(dot)xyz>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, "andres(at)anarazel(dot)de" <andres(at)anarazel(dot)de>
Subject:	Re: ResourceOwner refactoring
Date:	2021-01-13 07:18:57
Message-ID:	da1f43a9-eb0f-97e9-98c7-6f923705d5c4@iki.fi
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 13/01/2021 03:55, kuroda(dot)hayato(at)fujitsu(dot)com wrote:
> Dear Heikki,
>
> I'm also interested in this patch, but it cannot be applied to the current HEAD...
>
> $ git apply ~/v2-0001-Make-resowners-more-easily-extensible.patch
> error: patch failed: src/common/cryptohash_openssl.c:57
> error: src/common/cryptohash_openssl.c: patch does not apply
> error: patch failed: src/include/utils/resowner_private.h:1
> error: src/include/utils/resowner_private.h: patch does not apply

Here's a rebased version. Thanks!

- Heikki

Attachment	Content-Type	Size
v3-0001-Make-resowners-more-easily-extensible.patch	text/x-patch	81.9 KB

From:	Michael Paquier <michael(at)paquier(dot)xyz>
To:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc:	"kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, "rjuju123(at)gmail(dot)com" <rjuju123(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, "andres(at)anarazel(dot)de" <andres(at)anarazel(dot)de>
Subject:	Re: ResourceOwner refactoring
Date:	2021-01-13 07:22:36
Message-ID:	X/6fvGM/RxBrn2zy@paquier.xyz
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Wed, Jan 13, 2021 at 09:18:57AM +0200, Heikki Linnakangas wrote:
> --- a/src/common/cryptohash_openssl.c
> +++ b/src/common/cryptohash_openssl.c
> +static ResourceOwnerFuncs cryptohash_funcs =
> +{
> + /* relcache references */
> + .name = "LLVM JIT context",
> + .phase = RESOURCE_RELEASE_BEFORE_LOCKS,
> + .ReleaseResource = ResOwnerReleaseCryptoHash,
> + .PrintLeakWarning = ResOwnerPrintCryptoHashLeakWarning,
> +};
> +#endif

Looks like a copy-paste error here.
--
Michael

From:	"kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>
To:	'Heikki Linnakangas' <hlinnaka(at)iki(dot)fi>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, "rjuju123(at)gmail(dot)com" <rjuju123(at)gmail(dot)com>
Cc:	"michael(at)paquier(dot)xyz" <michael(at)paquier(dot)xyz>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, "andres(at)anarazel(dot)de" <andres(at)anarazel(dot)de>
Subject:	RE: ResourceOwner refactoring
Date:	2021-01-13 07:39:38
Message-ID:	OSBPR01MB3157C3D612008968FE7BB055F5A90@OSBPR01MB3157.jpnprd01.prod.outlook.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Dear Heikki,

Thank you for rebasing it, I confirmed it can be applied.
I will check the source.

Now I put the very elementary comment.
ResourceOwnerEnlarge(), ResourceOwnerRemember(), and ResourceOwnerForget()
are exported routines.
They should put below L418.

Best regards,
Hayato Kuroda
FUJITSU LIMITED

From:	"kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>
To:	'Heikki Linnakangas' <hlinnaka(at)iki(dot)fi>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, "rjuju123(at)gmail(dot)com" <rjuju123(at)gmail(dot)com>
Cc:	"michael(at)paquier(dot)xyz" <michael(at)paquier(dot)xyz>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, "andres(at)anarazel(dot)de" <andres(at)anarazel(dot)de>
Subject:	RE: ResourceOwner refactoring
Date:	2021-01-14 10:15:35
Message-ID:	OSBPR01MB3157BD946853F6FC1FD213B0F5A80@OSBPR01MB3157.jpnprd01.prod.outlook.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Hi,

I put some comments.

Throughout, some components don’t have helper functions.
(e.g. catcache has ResOwnerReleaseCatCache, but tupdesc doesn't.)
I think it should be unified.

[catcache.c]
> +/* support for catcache refcount management */
> +static inline void
> +ResourceOwnerEnlargeCatCacheRefs(ResourceOwner owner)
> +{
> + ResourceOwnerEnlarge(owner);
> +}

This function is not needed.

[syscache.c]
> -static CatCache *SysCache[SysCacheSize];
> + CatCache *SysCache[SysCacheSize];

Is it right? Compilation is done even if this variable is static...

[fd.c, dsm.c]
In these files helper functions are implemented as the define directive.
Could you explain the reason? For the performance?

> Previously, this was OK, because resources AAA and BBB were kept in
> separate arrays. But after this patch, it's not guaranteed that the
> ResourceOwnerRememberBBB() will find an empty slot.
>
> I don't think we do that currently, but to be sure, I'm planning to grep
> ResourceOwnerRemember and look at each call carefullly. And perhaps we
> can add an assertion for this, although I'm not sure where.

Indeed, but I think this line works well, isn't it?

> Assert(owner->narr < RESOWNER_ARRAY_SIZE);

Best regards,
Hayato Kuroda
FUJITSU LIMITED

From:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To:	"kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>
Cc:	"michael(at)paquier(dot)xyz" <michael(at)paquier(dot)xyz>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, "andres(at)anarazel(dot)de" <andres(at)anarazel(dot)de>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, "rjuju123(at)gmail(dot)com" <rjuju123(at)gmail(dot)com>
Subject:	Re: ResourceOwner refactoring
Date:	2021-01-14 13:34:56
Message-ID:	3b6c2c31-0e2a-3656-c442-48bca31cc079@iki.fi
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 14/01/2021 12:15, kuroda(dot)hayato(at)fujitsu(dot)com wrote:
> I put some comments.

Thanks for the review!

> Throughout, some components don’t have helper functions.
> (e.g. catcache has ResOwnerReleaseCatCache, but tupdesc doesn't.)
> I think it should be unified.

Hmm. ResOwnerReleaseTupleDesc() does exist, those functions are needed
for the callbacks. I think you meant the wrappers around
ResourceOwnerRemember and ResourceOwnerForget, like
ResourceOwnerRememberCatCacheRef(). I admit those are not fully
consistent: I didn't bother with the wrapper functions when there is
only one caller.

> [catcache.c]
>> +/* support for catcache refcount management */
>> +static inline void
>> +ResourceOwnerEnlargeCatCacheRefs(ResourceOwner owner)
>> +{
>> + ResourceOwnerEnlarge(owner);
>> +}
>
> This function is not needed.

Removed.

> [syscache.c]
>> -static CatCache *SysCache[SysCacheSize];
>> + CatCache *SysCache[SysCacheSize];
>
> Is it right? Compilation is done even if this variable is static...

Fixed. (It was a leftover from when I was playing with Kyotaro's
"catcachebench" tool from another thread).

> [fd.c, dsm.c]
> In these files helper functions are implemented as the define directive.
> Could you explain the reason? For the performance?

No particular reason. I turned them all into macros for consistency.

>> Previously, this was OK, because resources AAA and BBB were kept in
>> separate arrays. But after this patch, it's not guaranteed that the
>> ResourceOwnerRememberBBB() will find an empty slot.
>>
>> I don't think we do that currently, but to be sure, I'm planning to grep
>> ResourceOwnerRemember and look at each call carefullly. And perhaps we
>> can add an assertion for this, although I'm not sure where.
>
> Indeed, but I think this line works well, isn't it?
>
>> Assert(owner->narr < RESOWNER_ARRAY_SIZE);

That catches cases where you actually overrun the array, but it doesn't
catch unsafe patterns when there happens to be enough space left in the
array. For example, if you have code like this:

/* Make sure there's room for one more entry, but remember *two* things */
ResourceOwnerEnlarge();
ResourceOwnerRemember(foo);
ResourceOwnerRemember(bar);

That is not safe, but it would only fail the assertion if the first
ResourceOwnerRemember() call happens to consume the last remaining slot
in the array.

I checked all the callers of ResourceOwnerEnlarge() to see if they're
safe. A couple of places seemed a bit suspicious. I fixed them by moving
the ResourceOwnerEnlarge() calls closer to the ResourceOwnerRemember()
calls, so that it's now easier to see that they are correct. See first
attached patch.

The second patch is an updated version of the main patch, fixing all the
little things you and Michael pointed out since the last patch version.

I've been working on performance testing too. I'll post more numbers
later, but preliminary result from some micro-benchmarking suggests that
the new code is somewhat slower, except in the common case that the
object to remember and forget fits in the array. When running the
regression test suite, about 96% of ResourceOwnerForget() calls fit in
the array. I think that's acceptable.

- Heikki

Attachment	Content-Type	Size
v4-0001-Move-a-few-ResourceOwnerEnlarge-calls-for-safety-.patch	text/x-patch	8.6 KB
v4-0002-Make-resowners-more-easily-extensible.patch	text/x-patch	83.9 KB

From:	"kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>
To:	'Heikki Linnakangas' <hlinnaka(at)iki(dot)fi>
Cc:	"michael(at)paquier(dot)xyz" <michael(at)paquier(dot)xyz>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, "andres(at)anarazel(dot)de" <andres(at)anarazel(dot)de>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, "rjuju123(at)gmail(dot)com" <rjuju123(at)gmail(dot)com>
Subject:	RE: ResourceOwner refactoring
Date:	2021-01-15 09:10:34
Message-ID:	OSBPR01MB31575D0055E4B75F836CA16CF5A70@OSBPR01MB3157.jpnprd01.prod.outlook.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Dear Heikki,

> Hmm. ResOwnerReleaseTupleDesc() does exist, those functions are needed
> for the callbacks. I think you meant the wrappers around
> ResourceOwnerRemember and ResourceOwnerForget, like
> ResourceOwnerRememberCatCacheRef(). I admit those are not fully
> consistent: I didn't bother with the wrapper functions when there is
> only one caller.

Yeah, I meant it. And I prefer your policy.

Good job. I confirmed your fixes, and I confirmed it looks fine.
I will check another ResourceOwnerEnlarge() if I have a time.

> I've been working on performance testing too.

I'm looking forward to seeing it.

Best regards,
Hayato Kuroda
FUJITSU LIMITED

From:	"kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>
To:	'Heikki Linnakangas' <hlinnaka(at)iki(dot)fi>
Cc:	"pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	RE: ResourceOwner refactoring
Date:	2021-01-18 07:49:20
Message-ID:	OSBPR01MB315736FEC412E3E711B45293F5A40@OSBPR01MB3157.jpnprd01.prod.outlook.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	Postg토토 캔SQL :

Dear Heikki,

I apologize for sending again.

> I will check another ResourceOwnerEnlarge() if I have a time.

I checked all ResourceOwnerEnlarge() types after applying patch 0001.
(searched by "grep -rI ResourceOwnerEnlarge")
No problem was found.

Hayato Kuroda
FUJITSU LIMITED

From:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To:	"kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>
Cc:	"pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: ResourceOwner refactoring
Date:	2021-01-18 12:22:33
Message-ID:	7ba71639-dbb7-5eef-e3a2-5f7f0dd22078@iki.fi
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 18/01/2021 09:49, kuroda(dot)hayato(at)fujitsu(dot)com wrote:
> Dear Heikki,
>
> I apologize for sending again.
>
>> I will check another ResourceOwnerEnlarge() if I have a time.
>
> I checked all ResourceOwnerEnlarge() types after applying patch 0001.
> (searched by "grep -rI ResourceOwnerEnlarge")
> No problem was found.

Great, thanks!

Here are more details on the performance testing I did:

Unpatched
----------

postgres=# \i snaptest.sql
numkeep | numsnaps | lifo_time_ns | fifo_time_ns
---------+----------+--------------+--------------
0 | 1 | 38.2 | 34.8
0 | 5 | 35.7 | 35.4
0 | 10 | 37.6 | 37.6
0 | 60 | 35.4 | 39.2
0 | 70 | 55.0 | 53.8
0 | 100 | 48.6 | 48.9
0 | 1000 | 54.8 | 57.0
0 | 10000 | 63.9 | 67.1
9 | 10 | 39.3 | 38.8
9 | 100 | 44.0 | 42.5
9 | 1000 | 45.8 | 47.1
9 | 10000 | 64.2 | 66.8
65 | 70 | 45.7 | 43.7
65 | 100 | 42.9 | 43.7
65 | 1000 | 46.9 | 45.7
65 | 10000 | 65.0 | 64.5
(16 rows)

Patched
--------

postgres=# \i snaptest.sql
numkeep | numsnaps | lifo_time_ns | fifo_time_ns
---------+----------+--------------+--------------
0 | 1 | 35.3 | 33.8
0 | 5 | 34.8 | 34.6
0 | 10 | 49.8 | 51.4
0 | 60 | 53.1 | 57.2
0 | 70 | 53.2 | 59.6
0 | 100 | 53.1 | 58.2
0 | 1000 | 63.1 | 69.7
0 | 10000 | 83.3 | 83.5
9 | 10 | 35.0 | 35.1
9 | 100 | 55.4 | 54.1
9 | 1000 | 56.8 | 60.3
9 | 10000 | 88.6 | 82.0
65 | 70 | 36.4 | 35.1
65 | 100 | 52.4 | 53.0
65 | 1000 | 55.8 | 59.4
65 | 10000 | 79.3 | 80.3
(16 rows)

About the test:

Each test call Register/UnregisterSnapshot in a loop. numsnaps is the
number of snapshots that are registered in each iteration. For exmaple,
on the second line with numkeep=0 and numnaps=5, each iteration in the
LIFO test does essentially:

rs[0] = RegisterSnapshot()
rs[1] = RegisterSnapshot()
rs[2] = RegisterSnapshot()
rs[3] = RegisterSnapshot()
rs[4] = RegisterSnapshot()
UnregisterSnapshot(rs[4]);
UnregisterSnapshot(rs[3]);
UnregisterSnapshot(rs[2]);
UnregisterSnapshot(rs[1]);
UnregisterSnapshot(rs[0]);

In the FIFO tests, the Unregister calls are made in reverse order.

When numkeep is non zero, that many snapshots are registered once at the
beginning of the test, and released only after the test loop. So this
simulates the scenario that you remember a bunch of resources and hold
them for a long time, and while holding them, you do many more
remember/forget calls.

Based on this test, this patch makes things slightly slower overall.
However, *not* in the cases that matter the most I believe. That's the
cases where the (numsnaps - numkeep) is small, so that the hot action
happens in the array and the hashing is not required. In my testing
earlier, about 95% of the ResourceOwnerRemember calls in the regression
tests fall into that category.

There are a few simple things we could do speed this up, if needed. A
different hash function might be cheaper, for example. And we could
inline the fast paths of the ResourceOwnerEnlarge and
ResourceOwnerRemember() into the callers. But I didn't do those things
as part of this patch, because it wouldn't be a fair comparison; we
could do those with the old implementation too. And unless we really
need the speed, it's more readable this way.

I'm OK with these results. Any objections?

Attached are the patches again. Same as previous version, except for a
couple of little comment changes. And a third patch containing the
source for the performance test.

- Heikki

Attachment	Content-Type	Size
v5-0001-Move-a-few-ResourceOwnerEnlarge-calls-for-safety-.patch	text/x-patch	8.6 KB
v5-0002-Make-resowners-more-easily-extensible.patch	text/x-patch	84.8 KB
0001-resownerbench.patch	text/x-patch	8.2 KB

From:	Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
To:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc:	"kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: ResourceOwner refactoring
Date:	2021-01-18 14:34:43
Message-ID:	20210118143443.GA7276@alvherre.pgsql
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 2021-Jan-18, Heikki Linnakangas wrote:

> +static ResourceOwnerFuncs jit_funcs =
> +{
> + /* relcache references */
> + .name = "LLVM JIT context",
> + .phase = RESOURCE_RELEASE_BEFORE_LOCKS,
> + .ReleaseResource = ResOwnerReleaseJitContext,
> + .PrintLeakWarning = ResOwnerPrintJitContextLeakWarning
> +};

I think you mean jit_resowner_funcs here; "jit_funcs" is a bit
excessively vague. Also, why did you choose not to define
ResourceOwnerRememberJIT? You do that in other modules and it seems
better.

> +static ResourceOwnerFuncs catcache_funcs =
> +{
> + /* catcache references */
> + .name = "catcache reference",
> + .phase = RESOURCE_RELEASE_AFTER_LOCKS,
> + .ReleaseResource = ResOwnerReleaseCatCache,
> + .PrintLeakWarning = ResOwnerPrintCatCacheLeakWarning
> +};
> +
> +static ResourceOwnerFuncs catlistref_funcs =
> +{
> + /* catcache-list pins */
> + .name = "catcache list reference",
> + .phase = RESOURCE_RELEASE_AFTER_LOCKS,
> + .ReleaseResource = ResOwnerReleaseCatCacheList,
> + .PrintLeakWarning = ResOwnerPrintCatCacheListLeakWarning
> +};

Similar naming issue here, I say.

> diff --git a/src/common/cryptohash_openssl.c b/src/common/cryptohash_openssl.c
> index 551ec392b60..642e72d8c0f 100644
> --- a/src/common/cryptohash_openssl.c
> +++ b/src/common/cryptohash_openssl.c
> @@ -60,6 +59,21 @@ struct pg_cryptohash_ctx
> #endif
> };
>
> +/* ResourceOwner callbacks to hold JitContexts */
> +#ifndef FRONTEND
> +static void ResOwnerReleaseCryptoHash(Datum res);
> +static void ResOwnerPrintCryptoHashLeakWarning(Datum res);

The comment is wrong -- "Crypohashes", not "JitContexts".

So according to your performance benchmark, we're willing to accept a
30% performance loss on an allegedly common operation -- numkeep=0
numsnaps=10 becomes 49.8ns from 37.6ns. That seems a bit shocking.
Maybe you can claim that these operations aren't exactly hot spots, and
so the fact that we remain in the same power-of-ten is sufficient. Is
that the argument?

--
Álvaro Herrera 39°49'30"S 73°17'W
"The Postgresql hackers have what I call a "NASA space shot" mentality.
Quite refreshing in a world of "weekend drag racer" developers."
(Scott Marlowe)

From:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To:	Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc:	"kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: ResourceOwner refactoring
Date:	2021-01-18 15:19:04
Message-ID:	113f69b8-04f4-93be-1bae-0c2dfa40426d@iki.fi
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 18/01/2021 16:34, Alvaro Herrera wrote:
> So according to your performance benchmark, we're willing to accept a
> 30% performance loss on an allegedly common operation -- numkeep=0
> numsnaps=10 becomes 49.8ns from 37.6ns. That seems a bit shocking.
> Maybe you can claim that these operations aren't exactly hot spots, and
> so the fact that we remain in the same power-of-ten is sufficient. Is
> that the argument?

That's right. The fast path is fast, and that's important. The slow path
becomes 30% slower, but that's acceptable.

- Heikki

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc:	Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, "kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: ResourceOwner refactoring
Date:	2021-01-18 16:11:10
Message-ID:	CA+TgmobYE99Ka5rc+-Z_LQnWFWWhmeEDEtEm_f4-Av_47swjcg@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	Postg사설 토토SQL

On Mon, Jan 18, 2021 at 10:19 AM Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
>
> On 18/01/2021 16:34, Alvaro Herrera wrote:
> > So according to your performance benchmark, we're willing to accept a
> > 30% performance loss on an allegedly common operation -- numkeep=0
> > numsnaps=10 becomes 49.8ns from 37.6ns. That seems a bit shocking.
> > Maybe you can claim that these operations aren't exactly hot spots, and
> > so the fact that we remain in the same power-of-ten is sufficient. Is
> > that the argument?
>
> That's right. The fast path is fast, and that's important. The slow path
> becomes 30% slower, but that's acceptable.
>
> - Heikki
>
>

--
Robert Haas
EDB: http://www.enterprisedb.com

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc:	Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, "kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: ResourceOwner refactoring
Date:	2021-01-18 16:11:57
Message-ID:	CA+TgmoZNk2cn_vkYzw7dxCf=ECDZFW15VB_5MFfK5e_8-a57+w@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	Postg사설 토토SQL

On Mon, Jan 18, 2021 at 11:11 AM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Mon, Jan 18, 2021 at 10:19 AM Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
> > On 18/01/2021 16:34, Alvaro Herrera wrote:
> > > So according to your performance benchmark, we're willing to accept a
> > > 30% performance loss on an allegedly common operation -- numkeep=0
> > > numsnaps=10 becomes 49.8ns from 37.6ns. That seems a bit shocking.
> > > Maybe you can claim that these operations aren't exactly hot spots, and
> > > so the fact that we remain in the same power-of-ten is sufficient. Is
> > > that the argument?
> >
> > That's right. The fast path is fast, and that's important. The slow path
> > becomes 30% slower, but that's acceptable.

Sorry for the empty message.

I don't know whether a 30% slowdown will hurt anybody, but it seems
like kind of a lot, and I'm not sure I understand what corresponding
benefit we're getting.

--
Robert Haas
EDB: http://www.enterprisedb.com

From:	Michael Paquier <michael(at)paquier(dot)xyz>
To:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc:	"kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: ResourceOwner refactoring
Date:	2021-01-19 06:48:08
Message-ID:	YAaAqF93S5JzxGNO@paquier.xyz
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Mon, Jan 18, 2021 at 02:22:33PM +0200, Heikki Linnakangas wrote:
> diff --git a/src/common/cryptohash_openssl.c b/src/common/cryptohash_openssl.c
> index 551ec392b60..642e72d8c0f 100644
> --- a/src/common/cryptohash_openssl.c
> +++ b/src/common/cryptohash_openssl.c
[...]
> +/* ResourceOwner callbacks to hold JitContexts */

Slight copy-paste error here.

> /*
> * Ensure, while the spinlock's not yet held, that there's a free
> - * refcount entry.
> + * refcount entry and that the resoure owner has room to remember the
> + * pin.
s/resoure/resource/.
--
Michael

From:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, "kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: ResourceOwner refactoring
Date:	2021-01-19 09:09:16
Message-ID:	76a0b012-14ed-bcfd-cf40-af084dba8a11@iki.fi
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 18/01/2021 18:11, Robert Haas wrote:
> On Mon, Jan 18, 2021 at 11:11 AM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> On Mon, Jan 18, 2021 at 10:19 AM Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
>>> On 18/01/2021 16:34, Alvaro Herrera wrote:
>>>> So according to your performance benchmark, we're willing to accept a
>>>> 30% performance loss on an allegedly common operation -- numkeep=0
>>>> numsnaps=10 becomes 49.8ns from 37.6ns. That seems a bit shocking.
>>>> Maybe you can claim that these operations aren't exactly hot spots, and
>>>> so the fact that we remain in the same power-of-ten is sufficient. Is
>>>> that the argument?
>>>
>>> That's right. The fast path is fast, and that's important. The slow path
>>> becomes 30% slower, but that's acceptable.
>
> I don't know whether a 30% slowdown will hurt anybody, but it seems
> like kind of a lot, and I'm not sure I understand what corresponding
> benefit we're getting.

The benefit is to make it easy for extensions to use resource owners to
track whatever resources they need to track. And arguably, the new
mechanism is nicer for built-in code, too.

I'll see if I can optimize the slow paths, to make it more palatable.

- Heikki

From:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To:	Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc:	"kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: ResourceOwner refactoring
Date:	2021-01-19 14:45:57
Message-ID:	21f73bd6-d1d5-560f-5a6c-7224c3e870fc@iki.fi
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 18/01/2021 16:34, Alvaro Herrera wrote:
> On 2021-Jan-18, Heikki Linnakangas wrote:
>
>> +static ResourceOwnerFuncs jit_funcs =
>> +{
>> + /* relcache references */
>> + .name = "LLVM JIT context",
>> + .phase = RESOURCE_RELEASE_BEFORE_LOCKS,
>> + .ReleaseResource = ResOwnerReleaseJitContext,
>> + .PrintLeakWarning = ResOwnerPrintJitContextLeakWarning
>> +};
>
> I think you mean jit_resowner_funcs here; "jit_funcs" is a bit
> excessively vague. Also, why did you choose not to define
> ResourceOwnerRememberJIT? You do that in other modules and it seems
> better.

I did it in modules that had more than one ResourceOwnerRemeber/Forget
call. Didn't seem worth it in functions like IncrTupleDescRefCount(),
for example.

Hayato Kuroda also pointed that out, though. So perhaps it's better to
be consistent, to avoid the confusion. I'll add the missing wrappers.

- Heikki

From:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, "kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: ResourceOwner refactoring
Date:	2021-01-20 22:11:37
Message-ID:	d746cead-a1ef-7efe-fb47-933311e876a3@iki.fi
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 19/01/2021 11:09, Heikki Linnakangas wrote:
> On 18/01/2021 18:11, Robert Haas wrote:
>> On Mon, Jan 18, 2021 at 11:11 AM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>>> On Mon, Jan 18, 2021 at 10:19 AM Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
>>>> On 18/01/2021 16:34, Alvaro Herrera wrote:
>>>>> So according to your performance benchmark, we're willing to accept a
>>>>> 30% performance loss on an allegedly common operation -- numkeep=0
>>>>> numsnaps=10 becomes 49.8ns from 37.6ns. That seems a bit shocking.
>>>>> Maybe you can claim that these operations aren't exactly hot spots, and
>>>>> so the fact that we remain in the same power-of-ten is sufficient. Is
>>>>> that the argument?
>>>>
>>>> That's right. The fast path is fast, and that's important. The slow path
>>>> becomes 30% slower, but that's acceptable.
>>
>> I don't know whether a 30% slowdown will hurt anybody, but it seems
>> like kind of a lot, and I'm not sure I understand what corresponding
>> benefit we're getting.
>
> The benefit is to make it easy for extensions to use resource owners to
> track whatever resources they need to track. And arguably, the new
> mechanism is nicer for built-in code, too.
>
> I'll see if I can optimize the slow paths, to make it more palatable.

Ok, here's a new set of patches, and new test results. I replaced the
hash function with a cheaper one. I also added the missing wrappers that
Alvaro and Hayato Kuroda commented on, and fixed the typos that Michael
Paquier pointed out.

In the test script, I increased the number of iterations used in the
tests, to make them run longer and produce more stable results. There is
still a fair amount of jitter in the results, so take any particular
number with a grain of salt, but the overall trend is repeatable.

The results now look like this:

Unpatched
---------

postgres=# \i snaptest.sql
numkeep | numsnaps | lifo_time_ns | fifo_time_ns
---------+----------+--------------+--------------
0 | 1 | 32.7 | 32.3
0 | 5 | 33.0 | 32.9
0 | 10 | 34.1 | 35.5
0 | 60 | 32.5 | 38.3
0 | 70 | 47.6 | 48.9
0 | 100 | 47.9 | 49.3
0 | 1000 | 52.9 | 52.7
0 | 10000 | 61.7 | 62.4
9 | 10 | 38.4 | 37.6
9 | 100 | 42.3 | 42.3
9 | 1000 | 43.9 | 45.0
9 | 10000 | 62.2 | 62.5
65 | 70 | 42.4 | 42.9
65 | 100 | 43.2 | 43.9
65 | 1000 | 44.0 | 45.1
65 | 10000 | 62.4 | 62.6
(16 rows)

Patched
-------

postgres=# \i snaptest.sql
numkeep | numsnaps | lifo_time_ns | fifo_time_ns
---------+----------+--------------+--------------
0 | 1 | 32.8 | 34.2
0 | 5 | 33.8 | 32.2
0 | 10 | 37.2 | 40.2
0 | 60 | 40.2 | 45.1
0 | 70 | 40.9 | 48.4
0 | 100 | 41.9 | 45.2
0 | 1000 | 49.0 | 55.6
0 | 10000 | 62.4 | 67.2
9 | 10 | 33.6 | 33.0
9 | 100 | 39.6 | 39.7
9 | 1000 | 42.2 | 45.0
9 | 10000 | 60.7 | 63.9
65 | 70 | 32.5 | 33.6
65 | 100 | 37.5 | 39.7
65 | 1000 | 42.3 | 46.3
65 | 10000 | 61.9 | 64.8
(16 rows)

For easier side-by-side comparison, here are the same numbers with the
patched and unpatched results side by side, and their ratio (ratio < 1
means the patched version is faster):

LIFO tests:
numkeep | numsnaps | unpatched | patched | ratio
---------+----------+-----------+---------+-------
0 | 1 | 32.7 | 32.8 | 1.00
0 | 5 | 33.0 | 33.8 | 1.02
0 | 10 | 34.1 | 37.2 | 1.09
0 | 60 | 32.5 | 40.2 | 1.24
0 | 70 | 47.6 | 40.9 | 0.86
0 | 100 | 47.9 | 41.9 | 0.87
0 | 1000 | 52.9 | 49.0 | 0.93
0 | 10000 | 61.7 | 62.4 | 1.01
9 | 10 | 38.4 | 33.6 | 0.88
9 | 100 | 42.3 | 39.6 | 0.94
9 | 1000 | 43.9 | 42.2 | 0.96
9 | 10000 | 62.2 | 60.7 | 0.98
65 | 70 | 42.4 | 32.5 | 0.77
65 | 100 | 43.2 | 37.5 | 0.87
65 | 1000 | 44.0 | 42.3 | 0.96
65 | 10000 | 62.4 | 61.9 | 0.99
(16 rows)

FIFO tests:
numkeep | numsnaps | unpatched | patched | ratio
---------+----------+-----------+---------+-------
0 | 1 | 32.3 | 34.2 | 1.06
0 | 5 | 32.9 | 32.2 | 0.98
0 | 10 | 35.5 | 40.2 | 1.13
0 | 60 | 38.3 | 45.1 | 1.18
0 | 70 | 48.9 | 48.4 | 0.99
0 | 100 | 49.3 | 45.2 | 0.92
0 | 1000 | 52.7 | 55.6 | 1.06
0 | 10000 | 62.4 | 67.2 | 1.08
9 | 10 | 37.6 | 33.0 | 0.88
9 | 100 | 42.3 | 39.7 | 0.94
9 | 1000 | 45.0 | 45.0 | 1.00
9 | 10000 | 62.5 | 63.9 | 1.02
65 | 70 | 42.9 | 33.6 | 0.78
65 | 100 | 43.9 | 39.7 | 0.90
65 | 1000 | 45.1 | 46.3 | 1.03
65 | 10000 | 62.6 | 64.8 | 1.04
(16 rows)

Summary: In the the worst scenario, the patched version is still 24%
slower than unpatched. But many other scenarios are now faster with the
patch.

- Heikki

Attachment	Content-Type	Size
v6-0001-Move-a-few-ResourceOwnerEnlarge-calls-for-safety-.patch	text/x-patch	8.6 KB
v6-0002-Make-resowners-more-easily-extensible.patch	text/x-patch	85.8 KB
v6-0003-Optimize-hash-function.patch	text/x-patch	2.3 KB
v2-0001-resownerbench.patch	text/x-patch	8.2 KB

From:	"kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>
To:	'Heikki Linnakangas' <hlinnaka(at)iki(dot)fi>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	RE: ResourceOwner refactoring
Date:	2021-01-21 02:22:20
Message-ID:	OSBPR01MB315778F8FDC6C4B55497D72FF5A10@OSBPR01MB3157.jpnprd01.prod.outlook.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	Postg스포츠 토토 사이트SQL

Dear Heikki,

I tested in the same situation, and I confirmed that almost same results are returned.
(results are at the end of the e-mail)

You think that these results are acceptable
because resowners own many resources(more than 64) in general
and it's mainly faster in such a situation, isn't it?

I cannot distinguish correctly, but sounds good.

====test result====

unpatched

numkeep | numsnaps | lifo_time_ns | fifo_time_ns
---------+----------+--------------+--------------
0 | 1 | 68.5 | 69.9
0 | 5 | 73.2 | 76.8
0 | 10 | 70.6 | 74.7
0 | 60 | 68.0 | 75.6
0 | 70 | 91.3 | 94.8
0 | 100 | 89.0 | 89.1
0 | 1000 | 97.9 | 98.9
0 | 10000 | 116.0 | 115.9
9 | 10 | 74.7 | 76.6
9 | 100 | 80.8 | 80.1
9 | 1000 | 86.0 | 86.2
9 | 10000 | 116.1 | 116.8
65 | 70 | 84.7 | 85.3
65 | 100 | 80.5 | 80.3
65 | 1000 | 86.3 | 86.2
65 | 10000 | 115.4 | 115.9
(16 rows)

patched

numkeep | numsnaps | lifo_time_ns | fifo_time_ns
---------+----------+--------------+--------------
0 | 1 | 62.4 | 62.6
0 | 5 | 68.0 | 66.9
0 | 10 | 73.6 | 78.1
0 | 60 | 82.3 | 87.2
0 | 70 | 83.0 | 89.1
0 | 100 | 82.8 | 87.9
0 | 1000 | 88.2 | 96.6
0 | 10000 | 119.6 | 124.5
9 | 10 | 62.0 | 62.8
9 | 100 | 75.3 | 78.0
9 | 1000 | 82.6 | 89.3
9 | 10000 | 116.6 | 122.6
65 | 70 | 66.7 | 66.4
65 | 100 | 74.6 | 77.2
65 | 1000 | 82.1 | 88.2
65 | 10000 | 118.0 | 124.1
(16 rows)

Best Regards,
Hayato Kuroda
FUJITSU LIMITED

From:	Michael Paquier <michael(at)paquier(dot)xyz>
To:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, "kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: ResourceOwner refactoring
Date:	2021-01-21 04:14:12
Message-ID:	YAj/lHU54pzQ6cSl@paquier.xyz
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Thu, Jan 21, 2021 at 12:11:37AM +0200, Heikki Linnakangas wrote:
> Summary: In the the worst scenario, the patched version is still 24% slower
> than unpatched. But many other scenarios are now faster with the patch.

Is there a reason explaining the sudden drop for numsnaps within the
[10,60] range? The gap looks deeper with a low numkeep.
--
Michael

From:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To:	Michael Paquier <michael(at)paquier(dot)xyz>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, "kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: ResourceOwner refactoring
Date:	2021-01-21 10:14:10
Message-ID:	ad52c002-821c-e0e3-b406-9a0ab72a07bf@iki.fi
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 21/01/2021 06:14, Michael Paquier wrote:
> On Thu, Jan 21, 2021 at 12:11:37AM +0200, Heikki Linnakangas wrote:
>> Summary: In the the worst scenario, the patched version is still 24% slower
>> than unpatched. But many other scenarios are now faster with the patch.
>
> Is there a reason explaining the sudden drop for numsnaps within the
> [10,60] range? The gap looks deeper with a low numkeep.

It's the switch from array to hash table. With the patch, the array
holds 8 entries. Without the patch, it's 64 entries. So you see a drop
around those points. I added more data points in that range to get a
better picture:

LIFO tests:
numkeep | numsnaps | unpatched | patched | ratio
---------+----------+-----------+---------+-------
0 | 1 | 32.8 | 32.9 | 1.00
0 | 2 | 31.6 | 32.8 | 1.04
0 | 4 | 32.7 | 32.0 | 0.98
0 | 6 | 34.1 | 33.9 | 0.99
0 | 8 | 35.1 | 32.4 | 0.92
0 | 10 | 34.0 | 37.1 | 1.09
0 | 15 | 33.1 | 35.9 | 1.08
0 | 20 | 33.0 | 38.8 | 1.18
0 | 25 | 32.9 | 42.3 | 1.29
0 | 30 | 32.9 | 40.5 | 1.23
0 | 35 | 33.1 | 39.9 | 1.21
0 | 40 | 33.0 | 39.0 | 1.18
0 | 45 | 35.3 | 41.1 | 1.16
0 | 50 | 33.0 | 40.8 | 1.24
0 | 55 | 32.8 | 40.6 | 1.24
0 | 58 | 33.0 | 41.5 | 1.26
0 | 60 | 33.1 | 41.6 | 1.26
0 | 62 | 32.8 | 41.7 | 1.27
0 | 64 | 46.8 | 40.9 | 0.87
0 | 66 | 47.0 | 42.5 | 0.90
0 | 68 | 47.1 | 41.8 | 0.89
0 | 70 | 47.8 | 41.7 | 0.87
(22 rows)

FIFO tests:
numkeep | numsnaps | unpatched | patched | ratio
---------+----------+-----------+---------+-------
0 | 1 | 32.3 | 32.1 | 0.99
0 | 2 | 33.4 | 31.6 | 0.95
0 | 4 | 34.0 | 31.4 | 0.92
0 | 6 | 35.4 | 33.2 | 0.94
0 | 8 | 34.8 | 31.9 | 0.92
0 | 10 | 35.4 | 40.2 | 1.14
0 | 15 | 35.4 | 40.3 | 1.14
0 | 20 | 35.6 | 43.8 | 1.23
0 | 25 | 35.4 | 42.4 | 1.20
0 | 30 | 36.0 | 43.3 | 1.20
0 | 35 | 36.4 | 45.1 | 1.24
0 | 40 | 36.9 | 46.6 | 1.26
0 | 45 | 37.7 | 45.3 | 1.20
0 | 50 | 37.2 | 43.9 | 1.18
0 | 55 | 38.4 | 46.8 | 1.22
0 | 58 | 37.6 | 45.0 | 1.20
0 | 60 | 37.7 | 46.6 | 1.24
0 | 62 | 38.4 | 46.5 | 1.21
0 | 64 | 48.7 | 47.6 | 0.98
0 | 66 | 48.2 | 45.8 | 0.95
0 | 68 | 48.5 | 47.5 | 0.98
0 | 70 | 48.4 | 47.3 | 0.98
(22 rows)

Let's recap the behavior:

Without patch
-------------

For each different kind of resource, there's an array that holds up to
64 entries. In ResourceOwnerForget(), the array is scanned linearly. If
the array fills up, we replace the array with a hash table. After
switching, all operations use the hash table.

With patch
----------

There is one array that holds up to 8 entries. It is shared by all
resources. In ResourceOwnerForget(), the array is always scanned.

If the array fills up, all the entries are moved to a hash table,
freeing up space in the array, and the new entry is added to the array.
So the array is used together with the hash table, like an LRU cache of
the most recently remembered entries.

Why this change? I was afraid that now that all different resources
share the same data structure, remembering e.g. a lot of locks at the
beginning of a transaction would cause the switch to the hash table,
making all subsequent remember/forget operations, like for buffer pins,
slower. That kind of interference seems bad. By continuing to use the
array for the recently-remembered entries, we avoid that problem. The
[numkeep, numsnaps] = [65, 70] test is in that regime, and the patched
version was significantly faster.

Because the array is now always scanned, I felt that it needs to be
small, to avoid wasting much time scanning for entries that have already
been moved to the hash table. That's why I made it just 8 entries.

Perhaps 8 entries is too small, after all? Linearly scanning an array is
very fast. To test that, I bumped up RESOWNER_ARRAY_SIZE to 64, and ran
the test again:

numkeep | numsnaps | lifo_time_ns | fifo_time_ns
---------+----------+--------------+--------------
0 | 1 | 35.4 | 31.5
0 | 2 | 32.3 | 32.3
0 | 4 | 32.8 | 31.0
0 | 6 | 34.5 | 33.7
0 | 8 | 33.9 | 32.7
0 | 10 | 33.7 | 33.0
0 | 15 | 34.8 | 33.1
0 | 20 | 35.0 | 32.6
0 | 25 | 36.9 | 33.0
0 | 30 | 38.7 | 33.2
0 | 35 | 39.9 | 34.5
0 | 40 | 40.8 | 35.5
0 | 45 | 42.6 | 36.4
0 | 50 | 44.9 | 37.8
0 | 55 | 45.4 | 38.5
0 | 58 | 47.7 | 39.6
0 | 60 | 45.9 | 40.2
0 | 62 | 47.6 | 40.9
0 | 64 | 51.4 | 41.2
0 | 66 | 38.2 | 39.8
0 | 68 | 39.7 | 40.3
0 | 70 | 38.1 | 40.9
(22 rows)

Here you can see that as numsnaps increases, the test becomes slower,
but then it becomes faster again at 64-66, when it switches to the hash
table. So 64 seems too much. 32 seems to be the sweet spot here, that's
where scanning the hash and scanning the array are about the same speed.

- Heikki

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc:	Michael Paquier <michael(at)paquier(dot)xyz>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, "kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: ResourceOwner refactoring
Date:	2021-01-25 17:14:42
Message-ID:	CA+TgmoZyi5VOh85fdNfadVfLckFy199x9_Q3Kg_eo7sTq2iwxw@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Thu, Jan 21, 2021 at 5:14 AM Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
> Here you can see that as numsnaps increases, the test becomes slower,
> but then it becomes faster again at 64-66, when it switches to the hash
> table. So 64 seems too much. 32 seems to be the sweet spot here, that's
> where scanning the hash and scanning the array are about the same speed.

That sounds good. I mean, it could be that not all hardware behaves
the same here. But trying to get it into the right ballpark makes
sense.

I also like the fact that this now has some cases where it wins by a
significant margin. That's pretty cool; thanks for working on it!

--
Robert Haas
EDB: http://www.enterprisedb.com

From:	Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>
To:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc:	Michael Paquier <michael(at)paquier(dot)xyz>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, "kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject:	Re: ResourceOwner refactoring
Date:	2021-03-08 16:47:42
Message-ID:	CALtqXTfD9CWEBz3=SGnePZMouLX2UdqANT4TnTnsiY1WqpAU6g@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Mon, Jan 25, 2021 at 10:15 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> On Thu, Jan 21, 2021 at 5:14 AM Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
> wrote:
> > Here you can see that as numsnaps increases, the test becomes slower,
> > but then it becomes faster again at 64-66, when it switches to the hash
> > table. So 64 seems too much. 32 seems to be the sweet spot here, that's
> > where scanning the hash and scanning the array are about the same speed.
>
> That sounds good. I mean, it could be that not all hardware behaves
> the same here. But trying to get it into the right ballpark makes
> sense.
>
> I also like the fact that this now has some cases where it wins by a
> significant margin. That's pretty cool; thanks for working on it!
>
> --
> Robert Haas
> EDB: http://www.enterprisedb.com
>
>
>
The patchset does not apply successfully, there are some hunk failures.

http://cfbot.cputube.org/patch_32_2834.log

v6-0002-Make-resowners-more-easily-extensible.patch

1 out of 6 hunks FAILED -- saving rejects to file
src/backend/utils/cache/plancache.c.rej
2 out of 15 hunks FAILED -- saving rejects to file
src/backend/utils/resowner/resowner.c.rej

Can we get a rebase?

I am marking the patch "Waiting on Author"

--
Ibrar Ahmed

From:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To:	Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>
Cc:	Michael Paquier <michael(at)paquier(dot)xyz>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, "kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject:	Re: ResourceOwner refactoring
Date:	2021-03-09 12:40:42
Message-ID:	b8c5d745-a92f-6a89-26dc-773585672019@iki.fi
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	503 사설 토토 페치 실패

On 08/03/2021 18:47, Ibrar Ahmed wrote:
> The patchset does not apply successfully, there are some hunk failures.
>
> http://cfbot.cputube.org/patch_32_2834.log
> <http://cfbot.cputube.org/patch_32_2834.log>
>
> v6-0002-Make-resowners-more-easily-extensible.patch
>
> 1 out of 6 hunks FAILED -- saving rejects to file
> src/backend/utils/cache/plancache.c.rej
> 2 out of 15 hunks FAILED -- saving rejects to file
> src/backend/utils/resowner/resowner.c.rej
>
>
> Can we get a rebase?

Here you go.

- Heikki

Attachment	Content-Type	Size
v7-0001-Move-a-few-ResourceOwnerEnlarge-calls-for-safety-.patch	text/x-patch	8.6 KB
v7-0002-Make-resowners-more-easily-extensible.patch	text/x-patch	85.7 KB
v7-0003-Optimize-hash-function.patch	text/x-patch	2.3 KB

From:	vignesh C <vignesh21(at)gmail(dot)com>
To:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc:	Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, "kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject:	Re: ResourceOwner refactoring
Date:	2021-07-14 12:14:42
Message-ID:	CALDaNm3xAcy+cPi7pdM91mtzfJZPu3etU8Tm9bVEcJ7vWWSXyA@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Tue, Mar 9, 2021 at 6:10 PM Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
>
> On 08/03/2021 18:47, Ibrar Ahmed wrote:
> > The patchset does not apply successfully, there are some hunk failures.
> >
> > http://cfbot.cputube.org/patch_32_2834.log
> > <http://cfbot.cputube.org/patch_32_2834.log>
> >
> > v6-0002-Make-resowners-more-easily-extensible.patch
> >
> > 1 out of 6 hunks FAILED -- saving rejects to file
> > src/backend/utils/cache/plancache.c.rej
> > 2 out of 15 hunks FAILED -- saving rejects to file
> > src/backend/utils/resowner/resowner.c.rej
> >
> >
> > Can we get a rebase?
>
> Here you go.

The patch does not apply on Head anymore, could you rebase and post a
patch. I'm changing the status to "Waiting for Author".

Regards,
Vignesh

From:	Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
To:	vignesh C <vignesh21(at)gmail(dot)com>
Cc:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, "kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject:	Re: ResourceOwner refactoring
Date:	2021-07-14 14:07:07
Message-ID:	202107141407.aydgrhuxskhj@alvherre.pgsql
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 2021-Jul-14, vignesh C wrote:

> On Tue, Mar 9, 2021 at 6:10 PM Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:

> > Here you go.
>
> The patch does not apply on Head anymore, could you rebase and post a
> patch. I'm changing the status to "Waiting for Author".

Support for hmac was added by e6bdfd9700eb so the rebase is not trivial.

--
Álvaro Herrera Valdivia, Chile — https://www.EnterpriseDB.com/
"Ellos andaban todos desnudos como su madre los parió, y también las mujeres,
aunque no vi más que una, harto moza, y todos los que yo vi eran todos
mancebos, que ninguno vi de edad de más de XXX años" (Cristóbal Colón)

From:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To:	Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, vignesh C <vignesh21(at)gmail(dot)com>
Cc:	Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, "kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject:	Re: ResourceOwner refactoring
Date:	2021-07-14 14:40:27
Message-ID:	363e1ca2-0797-6023-2865-2d30121ac568@iki.fi
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 14/07/2021 17:07, Alvaro Herrera wrote:
> On 2021-Jul-14, vignesh C wrote:
>
>> On Tue, Mar 9, 2021 at 6:10 PM Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
>
>>> Here you go.
>>
>> The patch does not apply on Head anymore, could you rebase and post a
>> patch. I'm changing the status to "Waiting for Author".
>
> Support for hmac was added by e6bdfd9700eb so the rebase is not trivial.

Yeah, needed some manual fixing, but here you go.

- Heikki

Attachment	Content-Type	Size
v8-0001-Move-a-few-ResourceOwnerEnlarge-calls-for-safety-.patch	text/x-patch	8.6 KB
v8-0002-Make-resowners-more-easily-extensible.patch	text/x-patch	91.5 KB
v8-0003-Optimize-hash-function.patch	text/x-patch	2.3 KB

From:	Zhihong Yu <zyu(at)yugabyte(dot)com>
To:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc:	Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, vignesh C <vignesh21(at)gmail(dot)com>, Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, "kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject:	Re: ResourceOwner refactoring
Date:	2021-07-14 15:18:52
Message-ID:	CALNJ-vSjN9fsLApo8qxr-NtGGSBkfsHOQV5n4jgEdJwMHgxvEA@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Wed, Jul 14, 2021 at 7:40 AM Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:

> On 14/07/2021 17:07, Alvaro Herrera wrote:
> > On 2021-Jul-14, vignesh C wrote:
> >
> >> On Tue, Mar 9, 2021 at 6:10 PM Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
> wrote:
> >
> >>> Here you go.
> >>
> >> The patch does not apply on Head anymore, could you rebase and post a
> >> patch. I'm changing the status to "Waiting for Author".
> >
> > Support for hmac was added by e6bdfd9700eb so the rebase is not trivial.
>
> Yeah, needed some manual fixing, but here you go.
>
> - Heikki
>
Hi,
For the loop over the hash:

+ for (int idx = 0; idx < capacity; idx++)
{
- if (olditemsarr[i] != resarr->invalidval)
- ResourceArrayAdd(resarr, olditemsarr[i]);
+ while (owner->hash[idx].kind != NULL &&
+ owner->hash[idx].kind->phase == phase)
...
+ } while (capacity != owner->capacity);

Since the phase variable doesn't seem to change for the while loop, I
wonder what benefit the while loop has (since the release is governed by
phase).

Cheers

From:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To:	Zhihong Yu <zyu(at)yugabyte(dot)com>
Cc:	Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, vignesh C <vignesh21(at)gmail(dot)com>, Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, "kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject:	Re: ResourceOwner refactoring
Date:	2021-07-14 15:26:13
Message-ID:	908491f9-57ff-5061-30c6-d15d3dbf0e67@iki.fi
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Thanks for having a look!

On 14/07/2021 18:18, Zhihong Yu wrote:
> For the loop over the hash:
>
> + for (int idx = 0; idx < capacity; idx++)
> {
> - if (olditemsarr[i] != resarr->invalidval)
> - ResourceArrayAdd(resarr, olditemsarr[i]);
> + while (owner->hash[idx].kind != NULL &&
> + owner->hash[idx].kind->phase == phase)
> ...
> + } while (capacity != owner->capacity);
>
> Since the phase variable doesn't seem to change for the while loop, I
> wonder what benefit the while loop has (since the release is governed by
> phase).

Hmm, the phase variable doesn't change, but could the element at
'owner->hash[idx]' change? I'm not sure about that. The loop is supposed
to handle the case that the hash table grows; could that replace the
element at 'owner->hash[idx]' with something else, with different phase?
The check is very cheap, so I'm inclined to keep it to be sure.

- Heikki

From:	Zhihong Yu <zyu(at)yugabyte(dot)com>
To:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc:	Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, vignesh C <vignesh21(at)gmail(dot)com>, Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, "kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject:	Re: ResourceOwner refactoring
Date:	2021-07-14 15:41:43
Message-ID:	CALNJ-vTh=O0O0ENOMC4uGd1gbzcnHaXTi8W4W=qri5Na3-LMKw@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Wed, Jul 14, 2021 at 8:26 AM Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:

> Thanks for having a look!
>
> On 14/07/2021 18:18, Zhihong Yu wrote:
> > For the loop over the hash:
> >
> > + for (int idx = 0; idx < capacity; idx++)
> > {
> > - if (olditemsarr[i] != resarr->invalidval)
> > - ResourceArrayAdd(resarr, olditemsarr[i]);
> > + while (owner->hash[idx].kind != NULL &&
> > + owner->hash[idx].kind->phase == phase)
> > ...
> > + } while (capacity != owner->capacity);
> >
> > Since the phase variable doesn't seem to change for the while loop, I
> > wonder what benefit the while loop has (since the release is governed by
> > phase).
>
> Hmm, the phase variable doesn't change, but could the element at
> 'owner->hash[idx]' change? I'm not sure about that. The loop is supposed
> to handle the case that the hash table grows; could that replace the
> element at 'owner->hash[idx]' with something else, with different phase?
> The check is very cheap, so I'm inclined to keep it to be sure.
>
> - Heikki
>
Hi,
Agreed that ```owner->hash[idx].kind->phase == phase``` can be kept.

I just wonder if we should put limit on the number of iterations for the
while loop.
Maybe add a bool variable indicating that kind->ReleaseResource(value) is
called in the inner loop.
If there is no releasing when the inner loop finishes, we can come out of
the while loop.

Cheers

From:	Aleksander Alekseev <aleksander(at)timescale(dot)com>
To:	"pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Cc:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Subject:	Re: ResourceOwner refactoring
Date:	2021-09-08 10:19:19
Message-ID:	CAJ7c6TP8v=2Ym5S1C4YenaCMgwu5FrFTGi40Nn5TyDS4ig-gRA@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	Postg토토 꽁 머니SQL

Hi Heikki,

> Yeah, needed some manual fixing, but here you go.

Thanks for working on this!

v8-0002 didn't apply to the current master, so I rebased it. See
attached v9-* patches. I also included v9-0004 with some minor tweaks
from me. I have several notes regarding the code.

1. Not sure if I understand this code of ResourceOwnerReleaseAll():
```
/* First handle all the entries in the array. */
do
{
found = false;
for (int i = 0; i < owner->narr; i++)
{
if (owner->arr[i].kind->phase == phase)
{
Datum value = owner->arr[i].item;
ResourceOwnerFuncs *kind = owner->arr[i].kind;

if (printLeakWarnings)
kind->PrintLeakWarning(value);
kind->ReleaseResource(value);
found = true;
}
}

/*
* If any resources were released, check again because some of the
* elements might have been moved by the callbacks. We don't want to
* miss them.
*/
} while (found && owner->narr > 0);
```

Shouldn't we mark the resource as released and/or decrease narr and/or
save the last processed i? Why this will not call ReleaseResource()
endlessly on the same resource (array item)? Same question for the
following code that iterates over the hash table.

2. Just an idea/observation. It's possible that the performance of
ResourceOwnerEnlarge() can be slightly improved. Since the size of the
hash table is always a power of 2 and the code always doubles the size
of the hash table, (idx & mask) value will get one extra bit, which
can be 0 or 1. If it's 0, the value is already in its place,
otherwise, it should be moved on the known distance. In other words,
it's possible to copy `oldhash` to `newhash` and then move only half
of the items. I don't claim that this code necessarily should be
faster, or that this should be checked in the scope of this work.

--
Best regards,
Aleksander Alekseev

Attachment	Content-Type	Size
v9-0003-Optimize-hash-function.patch	application/octet-stream	2.3 KB
v9-0001-Move-a-few-ResourceOwnerEnlarge-calls-for-safety.patch	application/octet-stream	8.6 KB
v9-0002-Make-resowners-more-easily-extensible.patch	application/octet-stream	90.1 KB
v9-0004-Minor-tweaks.patch	application/octet-stream	1.4 KB

From:	Michael Paquier <michael(at)paquier(dot)xyz>
To:	Aleksander Alekseev <aleksander(at)timescale(dot)com>
Cc:	"pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Subject:	Re: ResourceOwner refactoring
Date:	2021-10-01 07:18:31
Message-ID:	YVa2R620cUc6oEB5@paquier.xyz
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Wed, Sep 08, 2021 at 01:19:19PM +0300, Aleksander Alekseev wrote:
> Thanks for working on this!

The latest patch does not apply anymore, and has been waiting on
author for a couple of weeks now, so switched as RwF for now.
--
Michael

From:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To:	Aleksander Alekseev <aleksander(at)timescale(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: ResourceOwner refactoring
Date:	2021-10-18 11:17:30
Message-ID:	1377f062-13ea-ebfb-ef8a-97739c156a05@iki.fi
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	Postg무지개 토토SQL

On 08/09/2021 13:19, Aleksander Alekseev wrote:
> Hi Heikki,
>
>> Yeah, needed some manual fixing, but here you go.
>
> Thanks for working on this!
>
> v8-0002 didn't apply to the current master, so I rebased it. See
> attached v9-* patches. I also included v9-0004 with some minor tweaks
> from me. I have several notes regarding the code.

Thanks!

Attached is a newly rebased version. It includes your tweaks, and a few
more comment and indentation tweaks.

> 1. Not sure if I understand this code of ResourceOwnerReleaseAll():
> ```
> /* First handle all the entries in the array. */
> do
> {
> found = false;
> for (int i = 0; i < owner->narr; i++)
> {
> if (owner->arr[i].kind->phase == phase)
> {
> Datum value = owner->arr[i].item;
> ResourceOwnerFuncs *kind = owner->arr[i].kind;
>
> if (printLeakWarnings)
> kind->PrintLeakWarning(value);
> kind->ReleaseResource(value);
> found = true;
> }
> }
>
> /*
> * If any resources were released, check again because some of the
> * elements might have been moved by the callbacks. We don't want to
> * miss them.
> */
> } while (found && owner->narr > 0);
> ```
>
> Shouldn't we mark the resource as released and/or decrease narr and/or
> save the last processed i? Why this will not call ReleaseResource()
> endlessly on the same resource (array item)? Same question for the
> following code that iterates over the hash table.

ReleaseResource() must call ResourceOwnerForget(), which removes the
item from the or hash table. This works the same as the code in 'master':

> /*
> * Release buffer pins. Note that ReleaseBuffer will remove the
> * buffer entry from our array, so we just have to iterate till there
> * are none.
> *
> * During a commit, there shouldn't be any remaining pins --- that
> * would indicate failure to clean up the executor correctly --- so
> * issue warnings. In the abort case, just clean up quietly.
> */
> while (ResourceArrayGetAny(&(owner->bufferarr), &foundres))
> {
> Buffer res = DatumGetBuffer(foundres);
>
> if (isCommit)
> PrintBufferLeakWarning(res);
> ReleaseBuffer(res);
> }

That comment explains how it works. I added a comment like that in this
patch, too.

> 2. Just an idea/observation. It's possible that the performance of
> ResourceOwnerEnlarge() can be slightly improved. Since the size of the
> hash table is always a power of 2 and the code always doubles the size
> of the hash table, (idx & mask) value will get one extra bit, which
> can be 0 or 1. If it's 0, the value is already in its place,
> otherwise, it should be moved on the known distance. In other words,
> it's possible to copy `oldhash` to `newhash` and then move only half
> of the items. I don't claim that this code necessarily should be
> faster, or that this should be checked in the scope of this work.

Hmm, the hash table uses open addressing, I think we want to also
rearrange any existing entries that might not be at their right place
because of collisions. We don't actually do that currently when an
element is removed (even before this patch), but enlarging the array is
a good opportunity for it. In any case, I haven't seen
ResourceOwnerEnlarge() show up while profiling, so I'm going to leave it
as it is.

- Heikki

Attachment	Content-Type	Size
v10-0001-Move-a-few-ResourceOwnerEnlarge-calls-for-safety.patch	text/x-patch	8.6 KB
v10-0002-Make-resowners-more-easily-extensible.patch	text/x-patch	91.8 KB
v10-0003-Optimize-hash-function.patch	text/x-patch	2.3 KB

From:	Aleksander Alekseev <aleksander(at)timescale(dot)com>
To:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc:	"pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: ResourceOwner refactoring
Date:	2021-11-23 15:37:03
Message-ID:	CAJ7c6TOj8RsUkfnVZnAPZ6jgWxqTLuv-o5mHYTc5goEhfBXaTA@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Hi Heikki,

> Attached is a newly rebased version. It includes your tweaks, and a few
> more comment and indentation tweaks.

v10-0002 rotted a little so I rebased it. The new patch set passed `make
installcheck-world` locally, but let's see if cfbot has a second opinion.

I'm a bit busy with various things at the moment, so (hopefully) I will
submit the actual code review in the follow-up email.

--
Best regards,
Aleksander Alekseev

Attachment	Content-Type	Size
v11-0001-Move-a-few-ResourceOwnerEnlarge-calls-for-safety.patch	application/octet-stream	8.6 KB
v11-0002-Make-resowners-more-easily-extensible.patch	application/octet-stream	90.3 KB
v11-0003-Optimize-hash-function.patch	application/octet-stream	2.3 KB

From:	Aleksander Alekseev <aleksander(at)timescale(dot)com>
To:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc:	"pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: ResourceOwner refactoring
Date:	2021-11-26 12:40:57
Message-ID:	CAJ7c6TNqFTFBw5KF55TPQAMHYuct5NZP7afohVxReBLXfUOyOQ@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Hi Heikki,

> I will submit the actual code review in the follow-up email

The patchset is in a good shape. I'm changing the status to "Ready for
Committer".

--
Best regards,
Aleksander Alekseev

From:	Julien Rouhaud <rjuju123(at)gmail(dot)com>
To:	Aleksander Alekseev <aleksander(at)timescale(dot)com>
Cc:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: ResourceOwner refactoring
Date:	2022-01-12 05:57:14
Message-ID:	CAOBaU_Z-NaqzAv=9YntFjbJhgWkJDAwCVyzvKUQTJ2UKgK4AOw@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Hi,

On Fri, Nov 26, 2021 at 8:41 PM Aleksander Alekseev
<aleksander(at)timescale(dot)com> wrote:
>
> > I will submit the actual code review in the follow-up email
>
> The patchset is in a good shape. I'm changing the status to "Ready for
> Committer".

The 2nd patch doesn't apply anymore due to a conflict on
resowner_private.h: http://cfbot.cputube.org/patch_36_3364.log.

I'm switching the entry to Waiting on Author.

From:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To:	Julien Rouhaud <rjuju123(at)gmail(dot)com>, Aleksander Alekseev <aleksander(at)timescale(dot)com>
Cc:	"pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: ResourceOwner refactoring
Date:	2022-10-31 09:51:36
Message-ID:	2e10b71b-352e-b97b-1e47-658e2669cecb@iki.fi
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 12/01/2022 07:57, Julien Rouhaud wrote:
> On Fri, Nov 26, 2021 at 8:41 PM Aleksander Alekseev
> <aleksander(at)timescale(dot)com> wrote:
>>
>> The patchset is in a good shape. I'm changing the status to "Ready for
>> Committer".
>
> The 2nd patch doesn't apply anymore due to a conflict on
> resowner_private.h: http://cfbot.cputube.org/patch_36_3364.log.

Rebased version attached. Given that Aleksander marked this as Ready for
Committer earlier, I'll add this to the next commitfest in that state,
and will commit in the next few days, barring any new objections.

- Heikki

Attachment	Content-Type	Size
v12-0001-Move-a-few-ResourceOwnerEnlarge-calls-for-safety.patch	text/x-patch	8.9 KB
v12-0002-Make-resowners-more-easily-extensible.patch	text/x-patch	94.0 KB
v12-0003-Use-a-faster-hash-function-in-resource-owners.patch	text/x-patch	2.6 KB

From:	Aleksander Alekseev <aleksander(at)timescale(dot)com>
To:	"pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Cc:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Julien Rouhaud <rjuju123(at)gmail(dot)com>
Subject:	Re: ResourceOwner refactoring
Date:	2022-10-31 12:49:37
Message-ID:	CAJ7c6TM1kTOi4jBuMFR4Okd8VthnsT=q6jhgGiuPQ3fD2jXKCA@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Hi Heikki,

> Rebased version attached. Given that Aleksander marked this as Ready for
> Committer earlier, I'll add this to the next commitfest in that state,
> and will commit in the next few days, barring any new objections.

Thanks for resurrecting this patch.

While taking a fresh look at the code I noticed a few things.

In 0002 we have:

```
+ .name = "buffer"
...
+ .name = "File",
```

Not sure why "File" starts with an uppercase letter while "buffer"
starts with a lowercase one. This is naturally not a big deal but
could be worth changing for consistency.

In 0003:

```
+#if SIZEOF_DATUM == 8
+ return hash_combine64(murmurhash64((uint64) value), (uint64) kind);
+#else
+ return hash_combine(murmurhash32((uint32) value), (uint32) kind);
+#endif
```

Maybe it's worth using PointerGetDatum() + DatumGetInt32() /
DatumGetInt64() inline functions instead of casting Datums and
pointers directly.

These are arguably nitpicks though and shouldn't stop you from merging
the patches as is.

--
Best regards,
Aleksander Alekseev

From:	Aleksander Alekseev <aleksander(at)timescale(dot)com>
To:	"pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Cc:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Julien Rouhaud <rjuju123(at)gmail(dot)com>
Subject:	Re: ResourceOwner refactoring
Date:	2022-10-31 15:51:41
Message-ID:	CAJ7c6TNK5Ei02Fr8whptYDzaEnUo8Cx-vsgKAP3u_V7xWtcANQ@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Hi hackers,

> > Rebased version attached. Given that Aleksander marked this as Ready for
> > Committer earlier, I'll add this to the next commitfest in that state,
> > and will commit in the next few days, barring any new objections.
>
> Thanks for resurrecting this patch.

Additionally I decided to run `make installcheck-world` on Raspberry
Pi 3. It just terminated successfully.

--
Best regards,
Aleksander Alekseev

From:	Andres Freund <andres(at)anarazel(dot)de>
To:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc:	Julien Rouhaud <rjuju123(at)gmail(dot)com>, Aleksander Alekseev <aleksander(at)timescale(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: ResourceOwner refactoring
Date:	2022-11-01 00:15:23
Message-ID:	20221101001523.lsgmohl6fjwppbgx@awork3.anarazel.de
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Hi,

On 2022-10-31 10:51:36 +0100, Heikki Linnakangas wrote:
> These are functions where quite a lot of things happen between the
> ResourceOwnerEnlarge and ResourceOwnerRemember calls. It's important that
> there are no unrelated ResourceOwnerRemember() calls in the code in
> between, otherwise the entry reserved by the ResourceOwnerEnlarge() call
> might be used up by the intervening ResourceOwnerRemember() and not be
> available at the intended ResourceOwnerRemember() call anymore. The longer
> the code path between them is, the harder it is to verify that.

This seems to work towards a future where only one kind of resource can be
reserved ahead of time. That doesn't strike me as great.

> Instead of having a separate array/hash for each resource kind, use a
> single array and hash to hold all kinds of resources. This makes it
> possible to introduce new resource "kinds" without having to modify the
> ResourceOwnerData struct. In particular, this makes it possible for
> extensions to register custom resource kinds.

As a goal I like this.

However, I'm not quite sold on the implementation. Two main worries:

1) As far as I can tell, the way ResourceOwnerReleaseAll() now works seems to
assume that within a phase the reset order does not matter. I don't think
that's a good assumption. I e.g. have a patch to replace InProgressBuf with
resowner handling, and in-progress IO errors should be processed before
before pins are released.

2) There's quite a few resource types where we actually don't need an entry in
an array, because we can instead add a dlist_node to the resource -
avoiding memory overhead and making removal cheap. I have a few pending
patches that use that approach, and this doesn't really provide a path for
that anymore.

I did try out the benchmark from
https://postgr.es/m/20221029200025.w7bvlgvamjfo6z44%40awork3.anarazel.de and
the patches performed well, slightly better than my approach of allocating
some initial memory for each resarray.

Greetings,

Andres Freund

From:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To:	Andres Freund <andres(at)anarazel(dot)de>
Cc:	Julien Rouhaud <rjuju123(at)gmail(dot)com>, Aleksander Alekseev <aleksander(at)timescale(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: ResourceOwner refactoring
Date:	2022-11-01 10:39:39
Message-ID:	9a3245db-ca03-4a47-5706-73404cfe5108@iki.fi
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 01/11/2022 02:15, Andres Freund wrote:
> On 2022-10-31 10:51:36 +0100, Heikki Linnakangas wrote:
>> These are functions where quite a lot of things happen between the
>> ResourceOwnerEnlarge and ResourceOwnerRemember calls. It's important that
>> there are no unrelated ResourceOwnerRemember() calls in the code in
>> between, otherwise the entry reserved by the ResourceOwnerEnlarge() call
>> might be used up by the intervening ResourceOwnerRemember() and not be
>> available at the intended ResourceOwnerRemember() call anymore. The longer
>> the code path between them is, the harder it is to verify that.
>
> This seems to work towards a future where only one kind of resource can be
> reserved ahead of time. That doesn't strike me as great.

True. It is enough for all the current callers AFAICS, though. I don't
remember wanting to reserve multiple resources at the same time.
Usually, the distance between the Enlarge and Remember calls is very
short. If it's not, it's error-prone anyway, so we should try to keep
the distance short.

While we're at it, who says that it's enough to reserve space for only
one resource of a kind either? For example, what if you want to pin two
buffers?

If we really need to support that, I propose something like this:

/*
* Reserve a slot in the resource owner.
*
* This is separate from actually inserting an entry because if we run out
* of memory, it's critical to do so *before* acquiring the resource.
*/
ResOwnerReservation *
ResourceOwnerReserve(ResourceOwner owner)

/*
* Remember that an object is owner by a ResourceOwner
*
* 'reservation' is a slot in the resource owner that was pre-reserved
* by ResOwnerReservation.
*/
void
ResOwnerRemember(ResOwnerReservaton *reservation, Datum value,
ResourceOwnerFuncs *kind)

>> Instead of having a separate array/hash for each resource kind, use a
>> single array and hash to hold all kinds of resources. This makes it
>> possible to introduce new resource "kinds" without having to modify the
>> ResourceOwnerData struct. In particular, this makes it possible for
>> extensions to register custom resource kinds.
>
> As a goal I like this.
>
> However, I'm not quite sold on the implementation. Two main worries:
>
> 1) As far as I can tell, the way ResourceOwnerReleaseAll() now works seems to
> assume that within a phase the reset order does not matter. I don't think
> that's a good assumption. I e.g. have a patch to replace InProgressBuf with
> resowner handling, and in-progress IO errors should be processed before
> before pins are released.

Hmm. Currently, you're not supposed to hold any resources at commit. You
get warnings about resource leaks if a resource owner is not empty on
ResourceOwnerReleaseAll(). On abort, does the order matter? I'm not
familiar with your InProgressBuf patch, but I guess you could handle the
in-progress IO errors in ReleaseBuffer().

If we do need to worry about release order, perhaps add a "priority" or
"phase" to each resource kind, and release them in priority order. We
already have before- and after-locks as phases, but we could generalize
that.

However, I feel that trying to enforce a particular order moves the
goalposts. If we need that, let's add it as a separate patch later.

> 2) There's quite a few resource types where we actually don't need an entry in
> an array, because we can instead add a dlist_node to the resource -
> avoiding memory overhead and making removal cheap. I have a few pending
> patches that use that approach, and this doesn't really provide a path for
> that anymore.

Is that materially better than using the array? The fast path with an
array is very fast. If it is better, perhaps we should bite the bullet
and require a dlist node and use that mechanism for all resource types?

> I did try out the benchmark from
> https://postgr.es/m/20221029200025.w7bvlgvamjfo6z44%40awork3.anarazel.de and
> the patches performed well, slightly better than my approach of allocating
> some initial memory for each resarray.

Thank you, glad to hear that!

- Heikki

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc:	Andres Freund <andres(at)anarazel(dot)de>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Aleksander Alekseev <aleksander(at)timescale(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: ResourceOwner refactoring
Date:	2022-11-01 12:43:09
Message-ID:	CA+TgmoYEVaq8-F9x71AL00bkAkYVawmb625HYciFfm-GtF4Tzw@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Tue, Nov 1, 2022 at 6:39 AM Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
> However, I feel that trying to enforce a particular order moves the
> goalposts. If we need that, let's add it as a separate patch later.

I don't really see it that way, because with the current
implementation, we do all resources of a particular type together,
before moving on to the next type. That seems like a valuable property
to preserve, and I think we should.

--
Robert Haas
EDB: http://www.enterprisedb.com

From:	Andres Freund <andres(at)anarazel(dot)de>
To:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc:	Julien Rouhaud <rjuju123(at)gmail(dot)com>, Aleksander Alekseev <aleksander(at)timescale(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: ResourceOwner refactoring
Date:	2022-11-01 15:42:51
Message-ID:	20221101154251.64vyq7wy6w7aykrn@awork3.anarazel.de
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Hi,

On 2022-11-01 12:39:39 +0200, Heikki Linnakangas wrote:
> > 1) As far as I can tell, the way ResourceOwnerReleaseAll() now works seems to
> > assume that within a phase the reset order does not matter. I don't think
> > that's a good assumption. I e.g. have a patch to replace InProgressBuf with
> > resowner handling, and in-progress IO errors should be processed before
> > before pins are released.
>
> Hmm. Currently, you're not supposed to hold any resources at commit. You get
> warnings about resource leaks if a resource owner is not empty on
> ResourceOwnerReleaseAll(). On abort, does the order matter? I'm not familiar
> with your InProgressBuf patch, but I guess you could handle the in-progress
> IO errors in ReleaseBuffer().

I was thinking about doing that as well, but it's not really trivial to know
about the in-progress IO at that time, without additional tracking (which
isn't free).

> If we do need to worry about release order, perhaps add a "priority" or
> "phase" to each resource kind, and release them in priority order. We
> already have before- and after-locks as phases, but we could generalize
> that.
>
> However, I feel that trying to enforce a particular order moves the
> goalposts. If we need that, let's add it as a separate patch later.

Like Robert, I think that the patch is moving the goalpost...

> > 2) There's quite a few resource types where we actually don't need an entry in
> > an array, because we can instead add a dlist_node to the resource -
> > avoiding memory overhead and making removal cheap. I have a few pending
> > patches that use that approach, and this doesn't really provide a path for
> > that anymore.
>
> Is that materially better than using the array?

It's safe in critical sections. I have a, not really baked but promising,
patch to make WAL writes use AIO. There's no way to know the number of
"asynchronous IOs" needed before entering the critical section.

> The fast path with an array is very fast. If it is better, perhaps we should
> bite the bullet and require a dlist node and use that mechanism for all
> resource types?

I don't think it's suitable for all - you need an exclusively owned region of
memory to embed a list in. That works nicely for some things, but not others
(e.g. buffer pins).

Greetings,

Andres Freund

From:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To:	Andres Freund <andres(at)anarazel(dot)de>
Cc:	Julien Rouhaud <rjuju123(at)gmail(dot)com>, Aleksander Alekseev <aleksander(at)timescale(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject:	Re: ResourceOwner refactoring
Date:	2023-02-21 11:40:57
Message-ID:	bf88d2a9-a7bc-3d68-57e3-abfba87832e9@iki.fi
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 01/11/2022 17:42, Andres Freund wrote:
> On 2022-11-01 12:39:39 +0200, Heikki Linnakangas wrote:
>> If we do need to worry about release order, perhaps add a "priority" or
>> "phase" to each resource kind, and release them in priority order. We
>> already have before- and after-locks as phases, but we could generalize
>> that.
>>
>> However, I feel that trying to enforce a particular order moves the
>> goalposts. If we need that, let's add it as a separate patch later.
>
> Like Robert, I think that the patch is moving the goalpost...

Ok, I added a release-priority mechanism to this. New patchset attached.
I added it as a separate patch on top of the previous patches, as
v13-0003-Release-resources-in-priority-order.patch, to make it easier to
see what changed after the previous patchset version. But it should be
squashed with patch 0002 before committing.

In this implementation, when you call ResourceOwnerRelease(), the
array/hash table is sorted by phase and priority, and the callbacks are
called in that order. To make that feasible, I had to add one limitation:

After you call ResourceOwnerRelease(RESOURCE_RELEASE_BEFORE_LOCKS), you
cannot continue using the resource owner. You now get an error if you
try to call ResourceOwnerRemember() or ResourceOwnerForget() after
ResourceOwnerRelease(). Previously, it was allowed, but if you
remembered any more before-locks resources after calling
ResourceOwnerRelease(RESOURCE_RELEASE_BEFORE_LOCKS), you had to be
careful to release them manually, or you hit an assertion failure when
deleting the ResourceOwner.

We did that sometimes with relcache references in AbortSubTransaction(),
in AtEOSubXact_Inval(). Regression tests caught that case. I worked
around that in AbortSubTransaction() by switching to the parent
subxact's resource owner between the phases. Other end-of-transaction
routines also perform some tasks between the phases, but AFAICS they
don't try to remember any extra resources. It's a bit hard to tell,
though, there's a lot going on in AtEOXact_Inval().

A more general fix would be to have special ResourceOwner for use at
end-of-transaction, similar to the TransactionAbortContext memory
context. But in general, end-of-transaction activities should be kept
simple, especially between the release phases, so I feel that having to
remember extra resources there is a bad code smell and we shouldn't
encourage that. Perhaps there is a better fix or workaround for the
known case in AbortSubTransaction(), too, that would avoid having to
remember any resources there.

I added a test module in src/test/modules/test_resowner to test those cases.

Also, I changed the ReleaseResource callback's contract so that the
callback no longer needs to call ResourceOwnerForget. That required some
changes in bufmgr.c and some other files, to avoid calling
ResourceOwnerForget when the resource is released by the callback.

There are no changes to the performance-critical Remember/Forget
codepaths, the performance is the same as before.

- Heikki

Attachment	Content-Type	Size
v13-0001-Move-a-few-ResourceOwnerEnlarge-calls-for-safety.patch	text/x-patch	8.9 KB
v13-0002-Make-resowners-more-easily-extensible.patch	text/x-patch	94.0 KB
v13-0003-Release-resources-in-priority-order.patch	text/x-patch	82.4 KB
v13-0004-Use-a-faster-hash-function-in-resource-owners.patch	text/x-patch	2.6 KB
v13-0005-Change-pgcrypto-to-use-the-new-ResourceOwner-mec.patch	text/x-patch	7.9 KB

From:	Aleksander Alekseev <aleksander(at)timescale(dot)com>
To:	"pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Cc:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Andres Freund <andres(at)anarazel(dot)de>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject:	Re: ResourceOwner refactoring
Date:	2023-03-22 14:23:40
Message-ID:	CAJ7c6TNijrJ+0hnTiAPDBn-tCiJsk-yvpF3Y8rXG4gTw09JYbg@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Hi,

I noticed that the patchset didn't make much progress since February
and decided to give it another round of code review.

> [...]. But in general, end-of-transaction activities should be kept
> simple, especially between the release phases, so I feel that having to
> remember extra resources there is a bad code smell and we shouldn't
> encourage that.

> I added a test module in src/test/modules/test_resowner to test those cases.

This is much appreciated, as well as extensive changes made to READMEs
and the comments.

> [...] New patchset attached.
> I added it as a separate patch on top of the previous patches, as
> v13-0003-Release-resources-in-
> priority-order.patch, to make it easier to
> see what changed after the previous patchset version. But it should be
> squashed with patch 0002 before committing.

My examination, which besides reading the code included running it
under sanitizer and checking the code coverage, didn't reveal any
major problems with the patchset.

Certain "should never happen in practice" scenarios seem not to be
test-covered in resowner.c, particularly:

```
elog(ERROR, "ResourceOwnerEnlarge called after release started");
elog(ERROR, "ResourceOwnerRemember called but array was full");
elog(ERROR, "ResourceOwnerForget called for %s after release started",
kind->name);
elog(ERROR, "%s %p is not owned by resource owner %s",
elog(ERROR, "ResourceOwnerForget called for %s after release started",
kind->name);
elog(ERROR, "lock reference %p is not owned by resource owner %s"
```

I didn't check whether these or similar code paths were tested before
the patch and I don't have a strong opinion on whether we should test
these scenarios. Personally I'm OK with the fact that these few lines
are not covered with tests.

The following procedures are never executed:

* RegisterResourceReleaseCallback()
* UnregisterResourceReleaseCallback()

And are never actually called anymore due to changes in 0005.
Previously they were used by contrib/pgcrypto. I suggest dropping this
part of the API since it seems to be redundant now. This will simplify
the implementation even further.

This patch, although moderately complicated, was moved between several
commitfests. I think it would be great if it made it to PG16. I'm
inclined to change the status of the patchset to RfC in a bit, unless
anyone has a second opinion.

--
Best regards,
Aleksander Alekseev

From:	Aleksander Alekseev <aleksander(at)timescale(dot)com>
To:	"pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Cc:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Andres Freund <andres(at)anarazel(dot)de>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject:	Re: ResourceOwner refactoring
Date:	2023-03-30 19:17:07
Message-ID:	CAJ7c6TNwH75E=hq6pP_gibHawU+Avqed=tUQ0FTzdkCQXfoJZg@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Hi,

> This patch, although moderately complicated, was moved between several
> commitfests. I think it would be great if it made it to PG16. I'm
> inclined to change the status of the patchset to RfC in a bit, unless
> anyone has a second opinion.

> I added a test module in src/test/modules/test_resowner to test those cases.

Hmm... it looks like a file is missing in the patchset. When building
with Autotools:

```
============== running regression test queries ==============
test test_resowner ... /bin/sh: 1: cannot open
/home/eax/projects/postgresql/src/test/modules/test_resowner/sql/test_resowner.sql:
No such file
```

I wonder why the tests pass when using Meson.

--
Best regards,
Aleksander Alekseev

From:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To:	Aleksander Alekseev <aleksander(at)timescale(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Cc:	Andres Freund <andres(at)anarazel(dot)de>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject:	Re: ResourceOwner refactoring
Date:	2023-05-22 09:40:23
Message-ID:	d17d4d6b-c39c-7e40-9f3d-b09f19e5ccec@iki.fi
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Here's another version of this patchset:

- I squashed the resource release priority changes with the main patch.

- In 0001-Move-a-few-ResourceOwnerEnlarge-calls-for-safety.patch, I
moved things around a little differently. In previous version, I changed
PinBuffer() so that it does ResourceOwnerEnlargeBuffers, so that the
callers don't need to do it. In this version, I went the other
direction, and made the caller responsible for calling both
ResourceOwnerEnlargeBuffers and ReservePrivateRefCountEntry. There were
some callers that had to call those functions before calling PinBuffer()
anyway, because they wanted to avoid the possibility of elog(ERROR), so
it seems better to always make it the caller's responsibility.

More comments below:

On 22/03/2023 16:23, Aleksander Alekseev wrote:
> Certain "should never happen in practice" scenarios seem not to be
> test-covered in resowner.c, particularly:
>
> ```
> elog(ERROR, "ResourceOwnerEnlarge called after release started");
> elog(ERROR, "ResourceOwnerRemember called but array was full");
> elog(ERROR, "ResourceOwnerForget called for %s after release started",
> kind->name);
> elog(ERROR, "%s %p is not owned by resource owner %s",
> elog(ERROR, "ResourceOwnerForget called for %s after release started",
> kind->name);
> elog(ERROR, "lock reference %p is not owned by resource owner %s"
> ```
>
> I didn't check whether these or similar code paths were tested before
> the patch and I don't have a strong opinion on whether we should test
> these scenarios. Personally I'm OK with the fact that these few lines
> are not covered with tests.

They were not covered before. Might make sense to add coverage, I'll
look into that.

> The following procedures are never executed:
>
> * RegisterResourceReleaseCallback()
> * UnregisterResourceReleaseCallback()
>
> And are never actually called anymore due to changes in 0005.
> Previously they were used by contrib/pgcrypto. I suggest dropping this
> part of the API since it seems to be redundant now. This will simplify
> the implementation even further.

There might be extensions out there that use those callbacks, that's why
I left them in place. I'll add a test for those functions in
test_resowner now, so that we still have some coverage for them in core.

> Hmm... it looks like a file is missing in the patchset. When building
> with Autotools:
>
> ```
> ============== running regression test queries ==============
> test test_resowner ... /bin/sh: 1: cannot open
> /home/eax/projects/postgresql/src/test/modules/test_resowner/sql/test_resowner.sql:
> No such file
> ```

Oops, added.

> I wonder why the tests pass when using Meson.

A-ha, it's because I didn't add the new test_resowner directory to the
src/test/modules/meson.build file. Fixed.

Thanks for the review!

--
Heikki Linnakangas
Neon (https://neon.tech)

Attachment	Content-Type	Size
v14-0001-Move-a-few-ResourceOwnerEnlarge-calls-for-safety.patch	text/x-patch	9.8 KB
v14-0002-Make-resowners-more-easily-extensible.patch	text/x-patch	148.1 KB
v14-0003-Use-a-faster-hash-function-in-resource-owners.patch	text/x-patch	2.6 KB
v14-0004-Change-pgcrypto-to-use-the-new-ResourceOwner-mec.patch	text/x-patch	7.9 KB

From:	Aleksander Alekseev <aleksander(at)timescale(dot)com>
To:	"pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Cc:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Andres Freund <andres(at)anarazel(dot)de>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject:	Re: ResourceOwner refactoring
Date:	2023-05-23 13:41:55
Message-ID:	CAJ7c6TM_9XtQX6xAGkH2u2-b20z-1rii9szMdW-aMHJZC_NOsA@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Hi,

> [...]
> A-ha, it's because I didn't add the new test_resowner directory to the
> src/test/modules/meson.build file. Fixed.
>
> Thanks for the review!

You probably already noticed, but for the record: cfbot seems to be
extremely unhappy with the patch.

--
Best regards,
Aleksander Alekseev

From:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To:	Aleksander Alekseev <aleksander(at)timescale(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Cc:	Andres Freund <andres(at)anarazel(dot)de>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject:	Re: ResourceOwner refactoring
Date:	2023-05-25 17:14:23
Message-ID:	527af829-4a03-03b0-121c-46ff79c61887@iki.fi
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 23/05/2023 16:41, Aleksander Alekseev wrote:
> Hi,
>
>> [...]
>> A-ha, it's because I didn't add the new test_resowner directory to the
>> src/test/modules/meson.build file. Fixed.
>>
>> Thanks for the review!
>
> You probably already noticed, but for the record: cfbot seems to be
> extremely unhappy with the patch.

Thanks, here's a fixed version. (ResourceOwner resource release
callbacks mustn't call ResourceOwnerForget anymore, but AbortBufferIO
was still doing that)

--
Heikki Linnakangas
Neon (https://neon.tech)

Attachment	Content-Type	Size
v15-0001-Move-a-few-ResourceOwnerEnlarge-calls-for-safety.patch	text/x-patch	9.8 KB
v15-0002-Make-resowners-more-easily-extensible.patch	text/x-patch	151.9 KB
v15-0003-Use-a-faster-hash-function-in-resource-owners.patch	text/x-patch	2.6 KB
v15-0004-Change-pgcrypto-to-use-the-new-ResourceOwner-mec.patch	text/x-patch	7.9 KB

From:	Aleksander Alekseev <aleksander(at)timescale(dot)com>
To:	"pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Cc:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Andres Freund <andres(at)anarazel(dot)de>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject:	Re: ResourceOwner refactoring
Date:	2023-07-04 14:18:24
Message-ID:	CAJ7c6TMxDcJOCAnKcn81r3BrR6GoKT_jdpwxi0+KU4F4dhKQmw@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Hi,

> Thanks, here's a fixed version. (ResourceOwner resource release
> callbacks mustn't call ResourceOwnerForget anymore, but AbortBufferIO
> was still doing that)

I believe v15 is ready to be merged. I suggest we merge it early in
the PG17 release cycle.

--
Best regards,
Aleksander Alekseev

From:	Peter Eisentraut <peter(at)eisentraut(dot)org>
To:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Aleksander Alekseev <aleksander(at)timescale(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Cc:	Andres Freund <andres(at)anarazel(dot)de>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject:	Re: ResourceOwner refactoring
Date:	2023-07-10 12:37:23
Message-ID:	a54c746c-d956-e712-b4e7-4df2bfd61821@eisentraut.org
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

A few suggestions on the API:

> +static ResourceOwnerFuncs tupdesc_resowner_funcs =

These aren't all "functions", so maybe another word like "info" or
"description" would be more appropriate?

> + .release_phase = RESOURCE_RELEASE_AFTER_LOCKS,
> + .release_priority = RELEASE_PRIO_TUPDESC_REFS,

I suggest assigning explicit non-zero numbers to the RESOURCE_RELEASE_*
constants, as well as changing RELEASE_PRIO_FIRST to 1 and adding an
assertion that the priority is not zero. Otherwise, someone might
forget to assign these fields and would implicitly get phase 0 and
priority 0, which would probably work in most cases, but wouldn't be the
explicit intention.

Then again, you are using RELEASE_PRIO_FIRST for the pgcrypto patch. Is
it the recommendation that if there are no other requirements, external
users should use that? If we are going to open this up to external
users, we should probably have some documented guidance around this.

> + .PrintLeakWarning = ResOwnerPrintTupleDescLeakWarning

I think this can be refactored further. We really just need a function
to describe a resource, like maybe

static char *
ResOwnerDescribeTupleDesc(Datum res)
{
TupleDesc tupdesc = (TupleDesc) DatumGetPointer(res);

return psprintf("TupleDesc %p (%u,%d)",
tupdesc, tupdesc->tdtypeid, tupdesc->tdtypmod);
}

and then the common code can generate the warnings from that like

elog(WARNING,
"%s leak: %s still referenced",
kind->name, kind->describe(value));

That way, we could more easily develop further options, such as turning
this warning into an error (useful for TAP tests).

Also, maybe, you could supply a default description that just prints
"%p", which appears to be applicable in about half the places.

Possible bonus project: I'm not sure these descriptions are so useful
anyway. It doesn't help me much in debugging if "File 55 was not
released" or some such. If would be cool if ResourceOwnerRemember() in
some debug mode could remember the source code location, and then print
that together with the leak warning.

> +#define ResourceOwnerRememberTupleDesc(owner, tupdesc) \
> + ResourceOwnerRemember(owner, PointerGetDatum(tupdesc),
&tupdesc_resowner_funcs)
> +#define ResourceOwnerForgetTupleDesc(owner, tupdesc) \
> + ResourceOwnerForget(owner, PointerGetDatum(tupdesc),
&tupdesc_resowner_funcs)

I would prefer that these kinds of things be made inline functions, so
that we maintain the type safety of the previous interface. And it's
also easier when reading code to see type decorations.

(But I suspect the above bonus project would require these to be macros?)

> + if (context->resowner)
> + ResourceOwnerForgetJIT(context->resowner, context);

It seems most ResourceOwnerForget*() calls have this kind of check
before them. Could this be moved inside the function?

From:	Andres Freund <andres(at)anarazel(dot)de>
To:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc:	Aleksander Alekseev <aleksander(at)timescale(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject:	Re: ResourceOwner refactoring
Date:	2023-07-10 19:14:46
Message-ID:	20230710191446.pn5ltm73t2i3xtxz@awork3.anarazel.de
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	503 롤 토토 페치 실패

Hi,

On 2023-05-25 20:14:23 +0300, Heikki Linnakangas wrote:
> @@ -2491,9 +2495,6 @@ BufferSync(int flags)
> int mask = BM_DIRTY;
> WritebackContext wb_context;
>
> - /* Make sure we can handle the pin inside SyncOneBuffer */
> - ResourceOwnerEnlargeBuffers(CurrentResourceOwner);
> -
> /*
> * Unless this is a shutdown checkpoint or we have been explicitly told,
> * we write only permanent, dirty buffers. But at shutdown or end of
> @@ -2970,9 +2971,6 @@ BgBufferSync(WritebackContext *wb_context)
> * requirements, or hit the bgwriter_lru_maxpages limit.
> */
>
> - /* Make sure we can handle the pin inside SyncOneBuffer */
> - ResourceOwnerEnlargeBuffers(CurrentResourceOwner);
> -
> num_to_scan = bufs_to_lap;
> num_written = 0;
> reusable_buffers = reusable_buffers_est;
> @@ -3054,8 +3052,6 @@ BgBufferSync(WritebackContext *wb_context)
> *
> * (BUF_WRITTEN could be set in error if FlushBuffer finds the buffer clean
> * after locking it, but we don't care all that much.)
> - *
> - * Note: caller must have done ResourceOwnerEnlargeBuffers.
> */
> static int
> SyncOneBuffer(int buf_id, bool skip_recently_used, WritebackContext *wb_context)

> @@ -3065,7 +3061,9 @@ SyncOneBuffer(int buf_id, bool skip_recently_used, WritebackContext *wb_context)
> uint32 buf_state;
> BufferTag tag;
>
> + /* Make sure we can handle the pin */
> ReservePrivateRefCountEntry();
> + ResourceOwnerEnlargeBuffers(CurrentResourceOwner);
>
> /*
> * Check whether buffer needs writing.
> @@ -4110,9 +4108,6 @@ FlushRelationBuffers(Relation rel)
> return;
> }
>
> - /* Make sure we can handle the pin inside the loop */
> - ResourceOwnerEnlargeBuffers(CurrentResourceOwner);
> -
> for (i = 0; i < NBuffers; i++)
> {
> uint32 buf_state;
> @@ -4126,7 +4121,9 @@ FlushRelationBuffers(Relation rel)
> if (!BufTagMatchesRelFileLocator(&bufHdr->tag, &rel->rd_locator))
> continue;
>
> + /* Make sure we can handle the pin */
> ReservePrivateRefCountEntry();
> + ResourceOwnerEnlargeBuffers(CurrentResourceOwner);
>
> buf_state = LockBufHdr(bufHdr);
> if (BufTagMatchesRelFileLocator(&bufHdr->tag, &rel->rd_locator) &&
> @@ -4183,9 +4180,6 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
> if (use_bsearch)
> pg_qsort(srels, nrels, sizeof(SMgrSortArray), rlocator_comparator);
>
> - /* Make sure we can handle the pin inside the loop */
> - ResourceOwnerEnlargeBuffers(CurrentResourceOwner);
> -
> for (i = 0; i < NBuffers; i++)
> {
> SMgrSortArray *srelent = NULL;

Hm, a tad worried that increasing the number of ResourceOwnerEnlargeBuffers()
that drastically could have a visible overhead.

> +static ResourceOwnerFuncs tupdesc_resowner_funcs =
> +{
> + .name = "tupdesc reference",
> + .release_phase = RESOURCE_RELEASE_AFTER_LOCKS,
> + .release_priority = RELEASE_PRIO_TUPDESC_REFS,
> + .ReleaseResource = ResOwnerReleaseTupleDesc,
> + .PrintLeakWarning = ResOwnerPrintTupleDescLeakWarning
> +};

I think these should all be marked const, there's no need to have them be in a
mutable section of the binary. Of course that will require adjusting a bunch
of the signatures too, but that seems fine.

> --- a/src/backend/access/transam/xact.c
> +++ b/src/backend/access/transam/xact.c
> @@ -5171,9 +5171,24 @@ AbortSubTransaction(void)
> ResourceOwnerRelease(s->curTransactionOwner,
> RESOURCE_RELEASE_BEFORE_LOCKS,
> false, false);
> +
> AtEOSubXact_RelationCache(false, s->subTransactionId,
> s->parent->subTransactionId);
> +
> +
> + /*
> + * AtEOSubXact_Inval sometimes needs to temporarily
> + * bump the refcount on the relcache entries that it processes.
> + * We cannot use the subtransaction's resource owner anymore,
> + * because we've already started releasing it. But we can use
> + * the parent resource owner.
> + */
> + CurrentResourceOwner = s->parent->curTransactionOwner;
> +
> AtEOSubXact_Inval(false);
> +
> + CurrentResourceOwner = s->curTransactionOwner;
> +
> ResourceOwnerRelease(s->curTransactionOwner,
> RESOURCE_RELEASE_LOCKS,
> false, false);

I might be a bit slow this morning, but why did this need to change as part of
this?

> @@ -5182,8 +5221,8 @@ TerminateBufferIO(BufferDesc *buf, bool clear_dirty, uint32 set_flag_bits)
> * If I/O was in progress, we always set BM_IO_ERROR, even though it's
> * possible the error condition wasn't related to the I/O.
> */
> -void
> -AbortBufferIO(Buffer buffer)
> +static void
> +AbortBufferIO(Buffer buffer, bool forget_owner)

What is the forget_owner argument for? It's always false, afaics?

> +/* Convenience wrappers over ResourceOwnerRemember/Forget */
> +#define ResourceOwnerRememberCatCacheRef(owner, tuple) \
> + ResourceOwnerRemember(owner, PointerGetDatum(tuple), &catcache_resowner_funcs)
> +#define ResourceOwnerForgetCatCacheRef(owner, tuple) \
> + ResourceOwnerForget(owner, PointerGetDatum(tuple), &catcache_resowner_funcs)
> +#define ResourceOwnerRememberCatCacheListRef(owner, list) \
> + ResourceOwnerRemember(owner, PointerGetDatum(list), &catlistref_resowner_funcs)
> +#define ResourceOwnerForgetCatCacheListRef(owner, list) \
> + ResourceOwnerForget(owner, PointerGetDatum(list), &catlistref_resowner_funcs)

Not specific to this type of resource owner, but I wonder if it'd not be
better to use inline functions for these - the PointerGetDatum() cast removes
nearly all type safety.

> +Releasing
> +---------
> +
> +Releasing the resources of a ResourceOwner happens in three phases:
> +
> +1. "Before-locks" resources
> +
> +2. Locks
> +
> +3. "After-locks" resources

Given that we now have priorites, I wonder if we shouldn't merge the phase and
the priorities? We have code like this:

ResourceOwnerRelease(TopTransactionResourceOwner,
RESOURCE_RELEASE_BEFORE_LOCKS,
true, true);
...
ResourceOwnerRelease(TopTransactionResourceOwner,
RESOURCE_RELEASE_LOCKS,
true, true);
ResourceOwnerRelease(TopTransactionResourceOwner,
RESOURCE_RELEASE_AFTER_LOCKS,
true, true);

Now, admittedly locks currently are "special" anyway, but we have a plan to
get rid of that. If we did, then we could do the last two as one as
ResourceOwnerRelease(from = RELEASE_PRIO_LOCKS, to: RELEASE_PRIO_LAST, ...)
or such.

> +Normally, you are expected to call ResourceOwnerForget on every resource so
> +that at commit, the ResourceOwner is empty (locks are an exception). If there
> +are any resources still held at commit, ResourceOwnerRelease will call the
> +PrintLeakWarning callback on each such resource. At abort, however, we truly
> +rely on the ResourceOwner mechanism and it is normal that there are resources
> +to be released.

I am not sure it's a good idea to encode that all kinds of resources ought to
have been released prior on commit. I can see that not always making sense -
in fact we already don't do it for locks, no? Perhaps just add a "commonly"?

> +typedef struct ResourceElem
> {
> ...
> + Datum item;
> + ResourceOwnerFuncs *kind; /* NULL indicates a free hash table slot */
> +} ResourceElem;

> /*
> - * Initially allocated size of a ResourceArray. Must be power of two since
> - * we'll use (arraysize - 1) as mask for hashing.
> + * Size of the fixed-size array to hold most-recently remembered resources.
> */
> -#define RESARRAY_INIT_SIZE 16
> +#define RESOWNER_ARRAY_SIZE 32

That's 512 bytes, pretty large to be searched sequentially. It's somewhat sad
that the array needs to include 8 byte of ResourceOwnerFuncs...

> + bool releasing; /* has ResourceOwnerRelease been called? */
> +
> + /*
> + * The fixed-size array for recent resources.
> + *
> + * If 'releasing' is set, the contents are sorted by release priority.
> + */
> + uint32 narr; /* how many items are stored in the array */
> + ResourceElem arr[RESOWNER_ARRAY_SIZE];
> +
> + /*
> + * The hash table. Uses open-addressing. 'nhash' is the number of items
> + * present; if it would exceed 'grow_at', we enlarge it and re-hash.
> + * 'grow_at' should be rather less than 'capacity' so that we don't waste
> + * too much time searching for empty slots.
> + *
> + * If 'releasing' is set, the contents are no longer hashed, but sorted by
> + * release priority. The first 'nhash' elements are occupied, the rest
> + * are empty.
> + */
> + ResourceElem *hash;
> + uint32 nhash; /* how many items are stored in the hash */
> + uint32 capacity; /* allocated length of hash[] */
> + uint32 grow_at; /* grow hash when reach this */
> +
> + /* The local locks cache. */
> + uint32 nlocks; /* number of owned locks */
> LOCALLOCK *locks[MAX_RESOWNER_LOCKS]; /* list of owned locks */

Due to 'narr' being before the large 'arr', 'nhash' is always on a separate
cacheline. I.e. the number of cachelines accessed in happy paths is higher
than necessary, also relevant because there's more dependencies on narr, nhash
than on the contents of those.

I'd probably order it so that both of the large elements (arr, locks) are at
the end.

It's hard to know with changes this small, but it does seem to yield a small
and repeatable performance benefit (pipelined pgbench -S).

> -} ResourceOwnerData;
> +} ResourceOwnerData;

That one pgindent will likely undo again ;)

> +/*
> + * Forget that an object is owned by a ResourceOwner
> + *
> + * Note: if same resource ID is associated with the ResourceOwner more than
> + * once, one instance is removed.
> + */
> +void
> +ResourceOwnerForget(ResourceOwner owner, Datum value, ResourceOwnerFuncs *kind)
> +{
> +#ifdef RESOWNER_TRACE
> + elog(LOG, "FORGET %d: owner %p value " UINT64_FORMAT ", kind: %s",
> + resowner_trace_counter++, owner, DatumGetUInt64(value), kind->name);
> +#endif
> +
> + /*
> + * Mustn't call this after we have already started releasing resources.
> + * (Release callback functions are not allowed to release additional
> + * resources.)
> + */
> + if (owner->releasing)
> + elog(ERROR, "ResourceOwnerForget called for %s after release started", kind->name);
> +
> + /* Search through all items in the array first. */
> + for (uint32 i = 0; i < owner->narr; i++)
> + {

Hm, I think we oftend up releasing resources in a lifo order. Would it
possibly make sense to search in reverse order?

Separately, I wonder if it'd be worth to check if there are any other matches
when assertions are enabled.

> ResourceOwnerRelease(ResourceOwner owner,
> @@ -492,6 +631,15 @@ ResourceOwnerRelease(ResourceOwner owner,
> {
> /* There's not currently any setup needed before recursing */
> ResourceOwnerReleaseInternal(owner, phase, isCommit, isTopLevel);
> +
> +#ifdef RESOWNER_STATS
> + if (isTopLevel)
> + {
> + elog(LOG, "RESOWNER STATS: lookups: array %d, hash %d", narray_lookups, nhash_lookups);
> + narray_lookups = 0;
> + nhash_lookups = 0;
> + }
> +#endif
> }

Why do we have ResourceOwnerRelease() vs ResourceOwnerReleaseInternal()? Looks
like that's older than your patch, but still confused.

> + /* Let add-on modules get a chance too */
> + for (item = ResourceRelease_callbacks; item; item = next)
> + {
> + /* allow callbacks to unregister themselves when called */
> + next = item->next;
> + item->callback(phase, isCommit, isTopLevel, item->arg);
> + }

I wonder if it's really a good idea to continue having this API. What you are
allowed to do in a resowner changed

> +/*
> + * ResourceOwnerReleaseAllOfKind
> + * Release all resources of a certain type held by this owner.
> + */
> +void
> +ResourceOwnerReleaseAllOfKind(ResourceOwner owner, ResourceOwnerFuncs *kind)
> +{
> + /* Mustn't call this after we have already started releasing resources. */
> + if (owner->releasing)
> + elog(ERROR, "ResourceOwnerForget called for %s after release started", kind->name);
>
> - /* Ditto for plancache references */
> - while (ResourceArrayGetAny(&(owner->planrefarr), &foundres))
> + /*
> + * Temporarily set 'releasing', to prevent calls to ResourceOwnerRemember
> + * while we're scanning the owner. Enlarging the hash would cause us to
> + * lose track of the point we're scanning.
> + */
> + owner->releasing = true;

Is it a problem than a possible error would lead to ->releasing = true to be
"leaked"? I think it might be, because we haven't called ResourceOwnerSort(),
which means we'd not process resources in the right order during abort
processing.

> +/*
> + * Hash function for value+kind combination.
> + */
> static inline uint32
> hash_resource_elem(Datum value, ResourceOwnerFuncs *kind)
> {
> - Datum data[2];
> -
> - data[0] = value;
> - data[1] = PointerGetDatum(kind);
> -
> - return hash_bytes((unsigned char *) &data, 2 * SIZEOF_DATUM);
> + /*
> + * Most resource kinds store a pointer in 'value', and pointers are unique
> + * all on their own. But some resources store plain integers (Files and
> + * Buffers as of this writing), so we want to incorporate the 'kind' in
> + * the hash too, otherwise those resources will collide a lot. But
> + * because there are only a few resource kinds like that - and only a few
> + * resource kinds to begin with - we don't need to work too hard to mix
> + * 'kind' into the hash. Just add it with hash_combine(), it perturbs the
> + * result enough for our purposes.
> + */
> +#if SIZEOF_DATUM == 8
> + return hash_combine64(murmurhash64((uint64) value), (uint64) kind);
> +#else
> + return hash_combine(murmurhash32((uint32) value), (uint32) kind);
> +#endif
> }

Why are we using a 64bit hash to then just throw out the high bits?

Greetings,

Andres Freund

From:	Andres Freund <andres(at)anarazel(dot)de>
To:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc:	Aleksander Alekseev <aleksander(at)timescale(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject:	Re: ResourceOwner refactoring
Date:	2023-07-10 19:40:00
Message-ID:	20230710194000.fh5scmicma5snbcc@awork3.anarazel.de
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Hi,

On 2023-07-10 12:14:46 -0700, Andres Freund wrote:
> > /*
> > - * Initially allocated size of a ResourceArray. Must be power of two since
> > - * we'll use (arraysize - 1) as mask for hashing.
> > + * Size of the fixed-size array to hold most-recently remembered resources.
> > */
> > -#define RESARRAY_INIT_SIZE 16
> > +#define RESOWNER_ARRAY_SIZE 32
>
> That's 512 bytes, pretty large to be searched sequentially. It's somewhat sad
> that the array needs to include 8 byte of ResourceOwnerFuncs...

On that note:

It's perhaps worth noting that this change actually *increases* the size of
ResourceOwnerData. Previously:

/* size: 576, cachelines: 9, members: 19 */
/* sum members: 572, holes: 1, sum holes: 4 */

now:
/* size: 696, cachelines: 11, members: 13 */
/* sum members: 693, holes: 1, sum holes: 3 */

It won't make a difference memory-usage wise, given aset.c's rounding
behaviour. But it does mean that more memory needs to be zeroed, as
ResourceOwnerCreate() uses MemoryContextAllocZero.

As there's really no need to initialize the long ->arr, ->locks, it seems
worth to just initialize explicitly instead of using MemoryContextAllocZero().

With the new representation, is there any point in having ->locks? We still
need ->nlocks to be able to switch to the lossy behaviour, but there doesn't
seem to be much point in having the separate array.

Greetings,

Andres Freund

From:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To:	Peter Eisentraut <peter(at)eisentraut(dot)org>, Aleksander Alekseev <aleksander(at)timescale(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Cc:	Andres Freund <andres(at)anarazel(dot)de>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject:	Re: ResourceOwner refactoring
Date:	2023-10-25 11:34:39
Message-ID:	07b7c6f2-b392-4823-b914-53f73bf9bc05@iki.fi
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 10/07/2023 15:37, Peter Eisentraut wrote:
> A few suggestions on the API:
>
> > +static ResourceOwnerFuncs tupdesc_resowner_funcs =
>
> These aren't all "functions", so maybe another word like "info" or
> "description" would be more appropriate?
>
>
> > + .release_phase = RESOURCE_RELEASE_AFTER_LOCKS,
> > + .release_priority = RELEASE_PRIO_TUPDESC_REFS,
>
> I suggest assigning explicit non-zero numbers to the RESOURCE_RELEASE_*
> constants, as well as changing RELEASE_PRIO_FIRST to 1 and adding an
> assertion that the priority is not zero. Otherwise, someone might
> forget to assign these fields and would implicitly get phase 0 and
> priority 0, which would probably work in most cases, but wouldn't be the
> explicit intention.

Done.

> Then again, you are using RELEASE_PRIO_FIRST for the pgcrypto patch. Is
> it the recommendation that if there are no other requirements, external
> users should use that? If we are going to open this up to external
> users, we should probably have some documented guidance around this.

I added a brief comment about that in the ResourceReleasePhase typedef.

I also added a section to the README on how to define your own resource
kind. (The README doesn't go into any detail on how to choose the
release priority though).

> > + .PrintLeakWarning = ResOwnerPrintTupleDescLeakWarning
>
> I think this can be refactored further. We really just need a function
> to describe a resource, like maybe
>
> static char *
> ResOwnerDescribeTupleDesc(Datum res)
> {
> TupleDesc tupdesc = (TupleDesc) DatumGetPointer(res);
>
> return psprintf("TupleDesc %p (%u,%d)",
> tupdesc, tupdesc->tdtypeid, tupdesc->tdtypmod);
> }
>
> and then the common code can generate the warnings from that like
>
> elog(WARNING,
> "%s leak: %s still referenced",
> kind->name, kind->describe(value));
>
> That way, we could more easily develop further options, such as turning
> this warning into an error (useful for TAP tests).
>
> Also, maybe, you could supply a default description that just prints
> "%p", which appears to be applicable in about half the places.

Refactored it that way.

> Possible bonus project: I'm not sure these descriptions are so useful
> anyway. It doesn't help me much in debugging if "File 55 was not
> released" or some such. If would be cool if ResourceOwnerRemember() in
> some debug mode could remember the source code location, and then print
> that together with the leak warning.

Yeah, that would be useful. I remember I've hacked something like that
as a one-off thing in the past, when debugging a leak.

> > +#define ResourceOwnerRememberTupleDesc(owner, tupdesc) \
> > + ResourceOwnerRemember(owner, PointerGetDatum(tupdesc),
> &tupdesc_resowner_funcs)
> > +#define ResourceOwnerForgetTupleDesc(owner, tupdesc) \
> > + ResourceOwnerForget(owner, PointerGetDatum(tupdesc),
> &tupdesc_resowner_funcs)
>
> I would prefer that these kinds of things be made inline functions, so
> that we maintain the type safety of the previous interface. And it's
> also easier when reading code to see type decorations.
>
> (But I suspect the above bonus project would require these to be macros?)
>
>
> > + if (context->resowner)
> > + ResourceOwnerForgetJIT(context->resowner, context);
>
> It seems most ResourceOwnerForget*() calls have this kind of check
> before them. Could this be moved inside the function?

Many do, but I don't know if it's the majority. We could make
ResurceOwnerForget(NULL) do nothing, but I think it's better to have the
check in the callers. You wouldn't want to silently do nothing when the
resource owner is not expected to be NULL.

(I'm attaching new patch version in my reply to Andres shortly)

--
Heikki Linnakangas
Neon (https://neon.tech)

From:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To:	Andres Freund <andres(at)anarazel(dot)de>
Cc:	Aleksander Alekseev <aleksander(at)timescale(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject:	Re: ResourceOwner refactoring
Date:	2023-10-25 12:43:36
Message-ID:	20607b96-b275-4a57-8509-29a3ed320d3f@iki.fi
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 10/07/2023 22:14, Andres Freund wrote:
> On 2023-05-25 20:14:23 +0300, Heikki Linnakangas wrote:
>> @@ -4183,9 +4180,6 @@ FlushRelationsAllBuffers(SMgrRelation *smgrs, int nrels)
>> if (use_bsearch)
>> pg_qsort(srels, nrels, sizeof(SMgrSortArray), rlocator_comparator);
>>
>> - /* Make sure we can handle the pin inside the loop */
>> - ResourceOwnerEnlargeBuffers(CurrentResourceOwner);
>> -
>> for (i = 0; i < NBuffers; i++)
>> {
>> SMgrSortArray *srelent = NULL;
>
> Hm, a tad worried that increasing the number of ResourceOwnerEnlargeBuffers()
> that drastically could have a visible overhead.

You mean in FlushRelationsAllBuffers() in particular? Seems pretty cheap
compared to all the other work involved, and it's not that performance
critical anyway.

ExtendBufferedRelShared() gave me a pause, as it is in the critical
path. But there too, the ResourceOwnerEnlarge() call pales in comparison
to all the other work that it does for each buffer.

Other than that, I don't think I'm introducing new
ResourceOwnerEnlarge() calls to hot codepaths, just moving them around.

>> +static ResourceOwnerFuncs tupdesc_resowner_funcs =
>> +{
>> + .name = "tupdesc reference",
>> + .release_phase = RESOURCE_RELEASE_AFTER_LOCKS,
>> + .release_priority = RELEASE_PRIO_TUPDESC_REFS,
>> + .ReleaseResource = ResOwnerReleaseTupleDesc,
>> + .PrintLeakWarning = ResOwnerPrintTupleDescLeakWarning
>> +};
>
> I think these should all be marked const, there's no need to have them be in a
> mutable section of the binary. Of course that will require adjusting a bunch
> of the signatures too, but that seems fine.

Done.

>> --- a/src/backend/access/transam/xact.c
>> +++ b/src/backend/access/transam/xact.c
>> @@ -5171,9 +5171,24 @@ AbortSubTransaction(void)
>> ResourceOwnerRelease(s->curTransactionOwner,
>> RESOURCE_RELEASE_BEFORE_LOCKS,
>> false, false);
>> +
>> AtEOSubXact_RelationCache(false, s->subTransactionId,
>> s->parent->subTransactionId);
>> +
>> +
>> + /*
>> + * AtEOSubXact_Inval sometimes needs to temporarily
>> + * bump the refcount on the relcache entries that it processes.
>> + * We cannot use the subtransaction's resource owner anymore,
>> + * because we've already started releasing it. But we can use
>> + * the parent resource owner.
>> + */
>> + CurrentResourceOwner = s->parent->curTransactionOwner;
>> +
>> AtEOSubXact_Inval(false);
>> +
>> + CurrentResourceOwner = s->curTransactionOwner;
>> +
>> ResourceOwnerRelease(s->curTransactionOwner,
>> RESOURCE_RELEASE_LOCKS,
>> false, false);
>
> I might be a bit slow this morning, but why did this need to change as part of
> this?

I tried to explain it in the comment. AtEOSubXact_Inval() sometimes
needs to bump the refcount on a relcache entry, which is tracked in the
current resource owner. But we have already started to release it. On
master, you can allocate new resources in a ResourceOwner after you have
already called ResourceOwnerRelease(RESOURCE_RELEASE_BEFORE_LOCKS), but
with this patch, that's no longer allowed and you get an assertion
failure. I think it was always a bit questionable; it's not clear that
the resource would've been released correctly if an error occurred. In
practice, I don't think an error could actually occur while
AtEOXSubXact_Inval() briefly holds the relcache entry.

>> @@ -5182,8 +5221,8 @@ TerminateBufferIO(BufferDesc *buf, bool clear_dirty, uint32 set_flag_bits)
>> * If I/O was in progress, we always set BM_IO_ERROR, even though it's
>> * possible the error condition wasn't related to the I/O.
>> */
>> -void
>> -AbortBufferIO(Buffer buffer)
>> +static void
>> +AbortBufferIO(Buffer buffer, bool forget_owner)
>
> What is the forget_owner argument for? It's always false, afaics?

Removed. There used to be more callers of AbortBufferIO() which needed
that, but not anymore.

>> +/* Convenience wrappers over ResourceOwnerRemember/Forget */
>> +#define ResourceOwnerRememberCatCacheRef(owner, tuple) \
>> + ResourceOwnerRemember(owner, PointerGetDatum(tuple), &catcache_resowner_funcs)
>> +#define ResourceOwnerForgetCatCacheRef(owner, tuple) \
>> + ResourceOwnerForget(owner, PointerGetDatum(tuple), &catcache_resowner_funcs)
>> +#define ResourceOwnerRememberCatCacheListRef(owner, list) \
>> + ResourceOwnerRemember(owner, PointerGetDatum(list), &catlistref_resowner_funcs)
>> +#define ResourceOwnerForgetCatCacheListRef(owner, list) \
>> + ResourceOwnerForget(owner, PointerGetDatum(list), &catlistref_resowner_funcs)
>
> Not specific to this type of resource owner, but I wonder if it'd not be
> better to use inline functions for these - the PointerGetDatum() cast removes
> nearly all type safety.

Peter just said the same thing. I guess you're right, changed.

>> +Releasing
>> +---------
>> +
>> +Releasing the resources of a ResourceOwner happens in three phases:
>> +
>> +1. "Before-locks" resources
>> +
>> +2. Locks
>> +
>> +3. "After-locks" resources
>
> Given that we now have priorities, I wonder if we shouldn't merge the phase and
> the priorities? We have code like this:
>
>
> ResourceOwnerRelease(TopTransactionResourceOwner,
> RESOURCE_RELEASE_BEFORE_LOCKS,
> true, true);
> ...
> ResourceOwnerRelease(TopTransactionResourceOwner,
> RESOURCE_RELEASE_LOCKS,
> true, true);
> ResourceOwnerRelease(TopTransactionResourceOwner,
> RESOURCE_RELEASE_AFTER_LOCKS,
> true, true);
>
> Now, admittedly locks currently are "special" anyway, but we have a plan to
> get rid of that. If we did, then we could do the last two as one as
> ResourceOwnerRelease(from = RELEASE_PRIO_LOCKS, to: RELEASE_PRIO_LAST, ...)
> or such.

Currently, we always release parent's resources before a child's, within
each phase. I'm not sure if we rely on that parent-child ordering
anywhere though. I'd like to leave it as it is for now, to limit the
scope of this, and maybe revisit later.

>> +Normally, you are expected to call ResourceOwnerForget on every resource so
>> +that at commit, the ResourceOwner is empty (locks are an exception). If there
>> +are any resources still held at commit, ResourceOwnerRelease will call the
>> +PrintLeakWarning callback on each such resource. At abort, however, we truly
>> +rely on the ResourceOwner mechanism and it is normal that there are resources
>> +to be released.
>
> I am not sure it's a good idea to encode that all kinds of resources ought to
> have been released prior on commit. I can see that not always making sense -
> in fact we already don't do it for locks, no? Perhaps just add a "commonly"?

Hmm, I'd still like to print the warnings for resources that we expect
to be released before commit, though. We could have a flag in the
ResourceOwnerDesc to give flexibility, but I'm inclined to wait until we
get the first case of someone actually needing it.

>> +typedef struct ResourceElem
>> {
>> ...
>> + Datum item;
>> + ResourceOwnerFuncs *kind; /* NULL indicates a free hash table slot */
>> +} ResourceElem;
>
>> /*
>> - * Initially allocated size of a ResourceArray. Must be power of two since
>> - * we'll use (arraysize - 1) as mask for hashing.
>> + * Size of the fixed-size array to hold most-recently remembered resources.
>> */
>> -#define RESARRAY_INIT_SIZE 16
>> +#define RESOWNER_ARRAY_SIZE 32
>
> That's 512 bytes, pretty large to be searched sequentially. It's somewhat sad
> that the array needs to include 8 byte of ResourceOwnerFuncs...

Yeah. Early on I considered having a RegisterResourceKind() function
that you need to call first, and then each resource kind could be
assigned a smaller ID. If we wanted to keep the old mechanism of
separate arrays for each different resource kind, that'd be the way to
go. But overall this seems simpler, and the performance is decent
despite that linear scan.

Note that the current code in master also uses a plain array, until it
grows to 512 bytes, when it switches to a hash table. Lot of the details
with the array/hash combo are different, but that threshold was the same.

>> + bool releasing; /* has ResourceOwnerRelease been called? */
>> +
>> + /*
>> + * The fixed-size array for recent resources.
>> + *
>> + * If 'releasing' is set, the contents are sorted by release priority.
>> + */
>> + uint32 narr; /* how many items are stored in the array */
>> + ResourceElem arr[RESOWNER_ARRAY_SIZE];
>> +
>> + /*
>> + * The hash table. Uses open-addressing. 'nhash' is the number of items
>> + * present; if it would exceed 'grow_at', we enlarge it and re-hash.
>> + * 'grow_at' should be rather less than 'capacity' so that we don't waste
>> + * too much time searching for empty slots.
>> + *
>> + * If 'releasing' is set, the contents are no longer hashed, but sorted by
>> + * release priority. The first 'nhash' elements are occupied, the rest
>> + * are empty.
>> + */
>> + ResourceElem *hash;
>> + uint32 nhash; /* how many items are stored in the hash */
>> + uint32 capacity; /* allocated length of hash[] */
>> + uint32 grow_at; /* grow hash when reach this */
>> +
>> + /* The local locks cache. */
>> + uint32 nlocks; /* number of owned locks */
>> LOCALLOCK *locks[MAX_RESOWNER_LOCKS]; /* list of owned locks */
>
> Due to 'narr' being before the large 'arr', 'nhash' is always on a separate
> cacheline. I.e. the number of cachelines accessed in happy paths is higher
> than necessary, also relevant because there's more dependencies on narr, nhash
> than on the contents of those.
>
> I'd probably order it so that both of the large elements (arr, locks) are at
> the end.
>
> It's hard to know with changes this small, but it does seem to yield a small
> and repeatable performance benefit (pipelined pgbench -S).

Swapped the 'hash' and 'arr' parts of the struct, so that 'arr' and
'locks' are at the end.

>> +/*
>> + * Forget that an object is owned by a ResourceOwner
>> + *
>> + * Note: if same resource ID is associated with the ResourceOwner more than
>> + * once, one instance is removed.
>> + */
>> +void
>> +ResourceOwnerForget(ResourceOwner owner, Datum value, ResourceOwnerFuncs *kind)
>> +{
>> +#ifdef RESOWNER_TRACE
>> + elog(LOG, "FORGET %d: owner %p value " UINT64_FORMAT ", kind: %s",
>> + resowner_trace_counter++, owner, DatumGetUInt64(value), kind->name);
>> +#endif
>> +
>> + /*
>> + * Mustn't call this after we have already started releasing resources.
>> + * (Release callback functions are not allowed to release additional
>> + * resources.)
>> + */
>> + if (owner->releasing)
>> + elog(ERROR, "ResourceOwnerForget called for %s after release started", kind->name);
>> +
>> + /* Search through all items in the array first. */
>> + for (uint32 i = 0; i < owner->narr; i++)
>> + {
>
> Hm, I think we oftend up releasing resources in a lifo order. Would it
> possibly make sense to search in reverse order?

Changed. I can see a small speedup in a micro benchmark, when the array
is almost full and you repeatedly remember / forget one more resource.

> Separately, I wonder if it'd be worth to check if there are any other matches
> when assertions are enabled.

It is actually allowed to remember the same resource twice. There's a
comment in ResourceOwnerForget (previously in ResourceArrayRemove) that
each call releases one instance in that case. I'm not sure if we rely on
that anywhere currently.

>> ResourceOwnerRelease(ResourceOwner owner,
>> @@ -492,6 +631,15 @@ ResourceOwnerRelease(ResourceOwner owner,
>> {
>> /* There's not currently any setup needed before recursing */
>> ResourceOwnerReleaseInternal(owner, phase, isCommit, isTopLevel);
>> +
>> +#ifdef RESOWNER_STATS
>> + if (isTopLevel)
>> + {
>> + elog(LOG, "RESOWNER STATS: lookups: array %d, hash %d", narray_lookups, nhash_lookups);
>> + narray_lookups = 0;
>> + nhash_lookups = 0;
>> + }
>> +#endif
>> }
>
> Why do we have ResourceOwnerRelease() vs ResourceOwnerReleaseInternal()? Looks
> like that's older than your patch, but still confused.

Dunno. It provides a place to do things on the top-level resource owner
that you don't want to do when you recurse to the children, as hinted by
the comment, but I don't know if we've ever had such things. It was
handy for printing the RESOWNER_STATS now, though :-).

>> + /* Let add-on modules get a chance too */
>> + for (item = ResourceRelease_callbacks; item; item = next)
>> + {
>> + /* allow callbacks to unregister themselves when called */
>> + next = item->next;
>> + item->callback(phase, isCommit, isTopLevel, item->arg);
>> + }
>
> I wonder if it's really a good idea to continue having this API. What you are
> allowed to do in a resowner changed

Hmm, I don't think anything changed for users of this API. I'm not in a
hurry to remove this and force extensions to adapt, as it's not hard to
maintain this.

>> +/*
>> + * ResourceOwnerReleaseAllOfKind
>> + * Release all resources of a certain type held by this owner.
>> + */
>> +void
>> +ResourceOwnerReleaseAllOfKind(ResourceOwner owner, ResourceOwnerFuncs *kind)
>> +{
>> + /* Mustn't call this after we have already started releasing resources. */
>> + if (owner->releasing)
>> + elog(ERROR, "ResourceOwnerForget called for %s after release started", kind->name);
>>
>> - /* Ditto for plancache references */
>> - while (ResourceArrayGetAny(&(owner->planrefarr), &foundres))
>> + /*
>> + * Temporarily set 'releasing', to prevent calls to ResourceOwnerRemember
>> + * while we're scanning the owner. Enlarging the hash would cause us to
>> + * lose track of the point we're scanning.
>> + */
>> + owner->releasing = true;
>
> Is it a problem than a possible error would lead to ->releasing = true to be
> "leaked"? I think it might be, because we haven't called ResourceOwnerSort(),
> which means we'd not process resources in the right order during abort
> processing.

Good point. I added a separate 'sorted' flag to handle that gracefully.

>> +/*
>> + * Hash function for value+kind combination.
>> + */
>> static inline uint32
>> hash_resource_elem(Datum value, ResourceOwnerFuncs *kind)
>> {
>> - Datum data[2];
>> -
>> - data[0] = value;
>> - data[1] = PointerGetDatum(kind);
>> -
>> - return hash_bytes((unsigned char *) &data, 2 * SIZEOF_DATUM);
>> + /*
>> + * Most resource kinds store a pointer in 'value', and pointers are unique
>> + * all on their own. But some resources store plain integers (Files and
>> + * Buffers as of this writing), so we want to incorporate the 'kind' in
>> + * the hash too, otherwise those resources will collide a lot. But
>> + * because there are only a few resource kinds like that - and only a few
>> + * resource kinds to begin with - we don't need to work too hard to mix
>> + * 'kind' into the hash. Just add it with hash_combine(), it perturbs the
>> + * result enough for our purposes.
>> + */
>> +#if SIZEOF_DATUM == 8
>> + return hash_combine64(murmurhash64((uint64) value), (uint64) kind);
>> +#else
>> + return hash_combine(murmurhash32((uint32) value), (uint32) kind);
>> +#endif
>> }
>
> Why are we using a 64bit hash to then just throw out the high bits?

It was just expedient when the inputs are 64-bit. Implementation of
murmurhash64 is tiny, and we already use murmurhash32 elsewhere, so it
seemed like an OK choice. I'm sure there are better algorithms out
there, though.

--
Heikki Linnakangas
Neon (https://neon.tech)

Attachment	Content-Type	Size
v16-0001-Move-a-few-ResourceOwnerEnlarge-calls-for-safety.patch	text/x-patch	9.8 KB
v16-0002-Make-resowners-more-easily-extensible.patch	text/x-patch	158.8 KB
v16-0003-Use-a-faster-hash-function-in-resource-owners.patch	text/x-patch	2.8 KB
v16-0004-Change-pgcrypto-to-use-the-new-ResourceOwner-mec.patch	text/x-patch	7.9 KB

From:	Andres Freund <andres(at)anarazel(dot)de>
To:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc:	Aleksander Alekseev <aleksander(at)timescale(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject:	Re: ResourceOwner refactoring
Date:	2023-10-25 18:07:29
Message-ID:	20231025180729.dmsohuolcv4d5rqa@awork3.anarazel.de
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Hi,

On 2023-10-25 15:43:36 +0300, Heikki Linnakangas wrote:
> On 10/07/2023 22:14, Andres Freund wrote:
> > > /*
> > > - * Initially allocated size of a ResourceArray. Must be power of two since
> > > - * we'll use (arraysize - 1) as mask for hashing.
> > > + * Size of the fixed-size array to hold most-recently remembered resources.
> > > */
> > > -#define RESARRAY_INIT_SIZE 16
> > > +#define RESOWNER_ARRAY_SIZE 32
> >
> > That's 512 bytes, pretty large to be searched sequentially. It's somewhat sad
> > that the array needs to include 8 byte of ResourceOwnerFuncs...
>
> Yeah. Early on I considered having a RegisterResourceKind() function that
> you need to call first, and then each resource kind could be assigned a
> smaller ID. If we wanted to keep the old mechanism of separate arrays for
> each different resource kind, that'd be the way to go. But overall this
> seems simpler, and the performance is decent despite that linear scan.

Realistically, on a 64bit platform, the compiler / we want 64bit alignment for
ResourceElem->item, so making "kind" smaller isn't going to do much...

> > Due to 'narr' being before the large 'arr', 'nhash' is always on a separate
> > cacheline. I.e. the number of cachelines accessed in happy paths is higher
> > than necessary, also relevant because there's more dependencies on narr, nhash
> > than on the contents of those.
> >
> > I'd probably order it so that both of the large elements (arr, locks) are at
> > the end.
> >
> > It's hard to know with changes this small, but it does seem to yield a small
> > and repeatable performance benefit (pipelined pgbench -S).
>
> Swapped the 'hash' and 'arr' parts of the struct, so that 'arr' and 'locks'
> are at the end.

I don't remember what the layout was then, but now there are 10 bytes of holes
on x86-64 (and I think most other 64bit architectures).

/* XXX 6 bytes hole, try to pack */

ResourceElem * hash; /* 40 8 */
uint32 nhash; /* 48 4 */
uint32 capacity; /* 52 4 */
uint32 grow_at; /* 56 4 */
uint32 narr; /* 60 4 */
/* --- cacheline 1 boundary (64 bytes) --- */
ResourceElem arr[32]; /* 64 512 */
/* --- cacheline 9 boundary (576 bytes) --- */
uint32 nlocks; /* 576 4 */

/* XXX 4 bytes hole, try to pack */

LOCALLOCK * locks[15]; /* 584 120 */

/* size: 704, cachelines: 11, members: 14 */
/* sum members: 694, holes: 2, sum holes: 10 */
};

While somewhat annoying from a human pov, it seems like it would make sense to
rearrange a bit? At least moving nlocks into the first cacheline would be good
(not that we guarantee cacheline alignment rn, though it sometimes works out
to be aligned).

We also can make narr, nlocks uint8s.

With that narrowing and reordering, we end up with 688 bytes. Not worth for
most structs, but here perhaps worth doing?

Moving the lock length to the start of the struct would make sense to keep
things that are likely to be used in a "stalling" manner at the start of the
struct / in one cacheline.

It's not too awful to have it be in this order:

struct ResourceOwnerData {
ResourceOwner parent; /* 0 8 */
ResourceOwner firstchild; /* 8 8 */
ResourceOwner nextchild; /* 16 8 */
const char * name; /* 24 8 */
_Bool releasing; /* 32 1 */
_Bool sorted; /* 33 1 */
uint8 narr; /* 34 1 */
uint8 nlocks; /* 35 1 */
uint32 nhash; /* 36 4 */
uint32 capacity; /* 40 4 */
uint32 grow_at; /* 44 4 */
ResourceElem arr[32]; /* 48 512 */
/* --- cacheline 8 boundary (512 bytes) was 48 bytes ago --- */
ResourceElem * hash; /* 560 8 */
LOCALLOCK * locks[15]; /* 568 120 */

/* size: 688, cachelines: 11, members: 14 */
/* last cacheline: 48 bytes */
};

Requires rephrasing a few comments to document that the lenghts are separate
from the array / hashtable / locks, but otherwise...

This reliably shows a decent speed improvement in my stress test [1], on the
order of 5%.

At that point, the first array entry fits into the first cacheline. If we were
to move parent, firstchild, nextchild, name further down the struct, three
array entries would be on the first line. Just using the array presumably is a
very common case, so that might be worth it?

I got less clear performance results with this one, and it's also quite
possible it could hurt, if resowners aren't actually "used", just
released. Therefore it's probably not worth it for now.

Greetings,

Andres Freund

[1] pgbench running
DO $$ BEGIN FOR i IN 1..5000 LOOP PERFORM abalance FROM pgbench_accounts WHERE aid = 17;END LOOP;END;$$;

From:	Peter Eisentraut <peter(at)eisentraut(dot)org>
To:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Andres Freund <andres(at)anarazel(dot)de>
Cc:	Aleksander Alekseev <aleksander(at)timescale(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject:	Re: ResourceOwner refactoring
Date:	2023-11-06 10:43:32
Message-ID:	48b16f57-4204-4bd2-ab42-fb24144649c4@eisentraut.org
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

It looks like this patch set needs a bit of surgery to adapt to the LLVM
changes in 9dce22033d. The cfbot is reporting compiler warnings about
this, and also some crashes, which might also be caused by this.

I do like the updated APIs. (Maybe the repeated ".DebugPrint = NULL,
/* default message is fine */" lines could be omitted?)

I like that one can now easily change the elog(WARNING) in
ResourceOwnerReleaseAll() to a PANIC or something to get automatic
verification during testing. I wonder if we should make this the
default if assertions are on? This would need some adjustments to
src/test/modules/test_resowner because it would then fail.

From:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To:	Andres Freund <andres(at)anarazel(dot)de>
Cc:	Aleksander Alekseev <aleksander(at)timescale(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject:	Re: ResourceOwner refactoring
Date:	2023-11-07 11:24:48
Message-ID:	bbb3ebeb-cde9-48b4-8433-36d7ed41648e@iki.fi
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	503 사설 토토 사이트

On 25/10/2023 21:07, Andres Freund wrote:
> It's not too awful to have it be in this order:
>
> struct ResourceOwnerData {
> ResourceOwner parent; /* 0 8 */
> ResourceOwner firstchild; /* 8 8 */
> ResourceOwner nextchild; /* 16 8 */
> const char * name; /* 24 8 */
> _Bool releasing; /* 32 1 */
> _Bool sorted; /* 33 1 */
> uint8 narr; /* 34 1 */
> uint8 nlocks; /* 35 1 */
> uint32 nhash; /* 36 4 */
> uint32 capacity; /* 40 4 */
> uint32 grow_at; /* 44 4 */
> ResourceElem arr[32]; /* 48 512 */
> /* --- cacheline 8 boundary (512 bytes) was 48 bytes ago --- */
> ResourceElem * hash; /* 560 8 */
> LOCALLOCK * locks[15]; /* 568 120 */
>
> /* size: 688, cachelines: 11, members: 14 */
> /* last cacheline: 48 bytes */
> };
>
> Requires rephrasing a few comments to document that the lenghts are separate
> from the array / hashtable / locks, but otherwise...

Hmm, let's move 'capacity' and 'grow_at' after 'arr', too. They are only
needed together with 'hash'.

> This reliably shows a decent speed improvement in my stress test [1], on the
> order of 5%.
>
> [1] pgbench running
> DO $$ BEGIN FOR i IN 1..5000 LOOP PERFORM abalance FROM pgbench_accounts WHERE aid = 17;END LOOP;END;$$;

I'm seeing similar results, although there's enough noise in the test
that I'm sure how real the would be across different tests.

> At that point, the first array entry fits into the first cacheline. If we were
> to move parent, firstchild, nextchild, name further down the struct, three
> array entries would be on the first line. Just using the array presumably is a
> very common case, so that might be worth it?
>
> I got less clear performance results with this one, and it's also quite
> possible it could hurt, if resowners aren't actually "used", just
> released. Therefore it's probably not worth it for now.

You're assuming that the ResourceOwner struct begins at a cacheline
boundary. That's not usually true, we don't try to cacheline-align it.
So while it's helpful to avoid padding and to keep frequently-accessed
fields close to each other, there's no benefit in keeping them at the
beginning of the struct.

--
Heikki Linnakangas
Neon (https://neon.tech)

From:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To:	Peter Eisentraut <peter(at)eisentraut(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Cc:	Aleksander Alekseev <aleksander(at)timescale(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject:	Re: ResourceOwner refactoring
Date:	2023-11-07 11:28:28
Message-ID:	bf814230-4172-47ce-a47c-6212ee6d298e@iki.fi
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 06/11/2023 12:43, Peter Eisentraut wrote:
> It looks like this patch set needs a bit of surgery to adapt to the LLVM
> changes in 9dce22033d. The cfbot is reporting compiler warnings about
> this, and also some crashes, which might also be caused by this.

Updated patch set attached. I fixed those LLVM crashes, and reordered
the fields in the ResourceOwner struct per Andres' suggestion.

> I do like the updated APIs. (Maybe the repeated ".DebugPrint = NULL,
> /* default message is fine */" lines could be omitted?)
>
> I like that one can now easily change the elog(WARNING) in
> ResourceOwnerReleaseAll() to a PANIC or something to get automatic
> verification during testing. I wonder if we should make this the
> default if assertions are on? This would need some adjustments to
> src/test/modules/test_resowner because it would then fail.

Yeah, perhaps, but I'll leave that to possible a follow-up patch.

I feel pretty good about this overall. Barring objections or new cfbot
failures, I will commit this in the next few days.

--
Heikki Linnakangas
Neon (https://neon.tech)

Attachment	Content-Type	Size
v17-0001-Move-a-few-ResourceOwnerEnlarge-calls-for-safety.patch	text/x-patch	9.8 KB
v17-0002-Make-resowners-more-easily-extensible.patch	text/x-patch	158.9 KB
v17-0003-Use-a-faster-hash-function-in-resource-owners.patch	text/x-patch	2.8 KB
v17-0004-Change-pgcrypto-to-use-the-new-ResourceOwner-mec.patch	text/x-patch	7.9 KB

From:	Alexander Lakhin <exclusion(at)gmail(dot)com>
To:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Cc:	Aleksander Alekseev <aleksander(at)timescale(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject:	Re: ResourceOwner refactoring
Date:	2023-11-07 14:00:00
Message-ID:	9cb1277d-aee4-6511-44ee-7924676f7756@gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Hello Heikki,

07.11.2023 14:28, Heikki Linnakangas wrote:
>
> I feel pretty good about this overall. Barring objections or new cfbot failures, I will commit this in the next few days.
>

A script, that I published in [1], detects several typos in the patches:
betwen
ther
ReourceOwner
ResouceOwnerSort
ReleaseOwnerRelease

Maybe it's worth to fix them now, or just accumulate and clean all such
defects once a year...

[1] /message-id/27a7998b-d67f-e32a-a28d-e659a2e390c6%40gmail.com

Best regards,
Alexander

From:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To:	Alexander Lakhin <exclusion(at)gmail(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Cc:	Aleksander Alekseev <aleksander(at)timescale(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject:	Re: ResourceOwner refactoring
Date:	2023-11-08 11:37:42
Message-ID:	599c6674-d40b-4b31-9251-c5c073ff9fdc@iki.fi
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 07/11/2023 16:00, Alexander Lakhin wrote:
> 07.11.2023 14:28, Heikki Linnakangas wrote:
>>
>> I feel pretty good about this overall. Barring objections or new cfbot failures, I will commit this in the next few days.
>
> A script, that I published in [1], detects several typos in the patches:
> betwen
> ther
> ReourceOwner
> ResouceOwnerSort
> ReleaseOwnerRelease
>
> Maybe it's worth to fix them now, or just accumulate and clean all such
> defects once a year...

Fixed these, and pushed. Thanks everyone for reviewing!

--
Heikki Linnakangas
Neon (https://neon.tech)

From:	Alexander Lakhin <exclusion(at)gmail(dot)com>
To:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Cc:	Aleksander Alekseev <aleksander(at)timescale(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject:	Re: ResourceOwner refactoring
Date:	2023-11-08 18:00:00
Message-ID:	be58d565-9e95-d417-4e47-f6bd408dea4b@gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	Postg토토 사이트SQL

Hello Heikki,

08.11.2023 14:37, Heikki Linnakangas wrote:
>
> Fixed these, and pushed. Thanks everyone for reviewing!
>

Please look at a new assertion failure, I've managed to trigger with this script:
CREATE TABLE t(
i01 int, i02 int, i03 int, i04 int, i05 int, i06 int, i07 int, i08 int, i09 int, i10 int,
i11 int, i12 int, i13 int, i14 int, i15 int, i16 int, i17 int, i18 int, i19 int, i20 int,
i21 int, i22 int, i23 int, i24 int, i25 int, i26 int
);
CREATE TABLE tp PARTITION OF t FOR VALUES IN (1);

(gdb) bt
...
#5 0x0000560dd4e42f77 in ExceptionalCondition (conditionName=0x560dd5059fbc "owner->narr == 0", fileName=0x560dd5059e31
"resowner.c", lineNumber=362) at assert.c:66
#6 0x0000560dd4e93cbd in ResourceOwnerReleaseAll (owner=0x560dd69cebd0, phase=RESOURCE_RELEASE_BEFORE_LOCKS,
printLeakWarnings=false) at resowner.c:362
#7 0x0000560dd4e92e22 in ResourceOwnerReleaseInternal (owner=0x560dd69cebd0, phase=RESOURCE_RELEASE_BEFORE_LOCKS,
isCommit=false, isTopLevel=true) at resowner.c:725
#8 0x0000560dd4e92d42 in ResourceOwnerReleaseInternal (owner=0x560dd69ce3f8, phase=RESOURCE_RELEASE_BEFORE_LOCKS,
isCommit=false, isTopLevel=true) at resowner.c:678
#9 0x0000560dd4e92cdb in ResourceOwnerRelease (owner=0x560dd69ce3f8, phase=RESOURCE_RELEASE_BEFORE_LOCKS,
isCommit=false, isTopLevel=true) at resowner.c:652
#10 0x0000560dd47316ef in AbortTransaction () at xact.c:2848
#11 0x0000560dd47329ac in AbortCurrentTransaction () at xact.c:3339
#12 0x0000560dd4c37284 in PostgresMain (dbname=0x560dd69779f0 "regression", username=0x560dd69779d8 "law") at
postgres.c:4370
...

Best regards,
Alexander

From:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To:	Alexander Lakhin <exclusion(at)gmail(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Cc:	Aleksander Alekseev <aleksander(at)timescale(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject:	Re: ResourceOwner refactoring
Date:	2023-11-08 23:48:59
Message-ID:	d1bb4cfc-ce2b-4e20-bad7-7ae31c7b9cd0@iki.fi
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 08/11/2023 20:00, Alexander Lakhin wrote:
> Please look at a new assertion failure, I've managed to trigger with this script:
> CREATE TABLE t(
> i01 int, i02 int, i03 int, i04 int, i05 int, i06 int, i07 int, i08 int, i09 int, i10 int,
> i11 int, i12 int, i13 int, i14 int, i15 int, i16 int, i17 int, i18 int, i19 int, i20 int,
> i21 int, i22 int, i23 int, i24 int, i25 int, i26 int
> );
> CREATE TABLE tp PARTITION OF t FOR VALUES IN (1);
>
> (gdb) bt
> ...
> #5 0x0000560dd4e42f77 in ExceptionalCondition (conditionName=0x560dd5059fbc "owner->narr == 0", fileName=0x560dd5059e31
> "resowner.c", lineNumber=362) at assert.c:66
> #6 0x0000560dd4e93cbd in ResourceOwnerReleaseAll (owner=0x560dd69cebd0, phase=RESOURCE_RELEASE_BEFORE_LOCKS,
> printLeakWarnings=false) at resowner.c:362
> #7 0x0000560dd4e92e22 in ResourceOwnerReleaseInternal (owner=0x560dd69cebd0, phase=RESOURCE_RELEASE_BEFORE_LOCKS,
> isCommit=false, isTopLevel=true) at resowner.c:725
> #8 0x0000560dd4e92d42 in ResourceOwnerReleaseInternal (owner=0x560dd69ce3f8, phase=RESOURCE_RELEASE_BEFORE_LOCKS,
> isCommit=false, isTopLevel=true) at resowner.c:678
> #9 0x0000560dd4e92cdb in ResourceOwnerRelease (owner=0x560dd69ce3f8, phase=RESOURCE_RELEASE_BEFORE_LOCKS,
> isCommit=false, isTopLevel=true) at resowner.c:652
> #10 0x0000560dd47316ef in AbortTransaction () at xact.c:2848
> #11 0x0000560dd47329ac in AbortCurrentTransaction () at xact.c:3339
> #12 0x0000560dd4c37284 in PostgresMain (dbname=0x560dd69779f0 "regression", username=0x560dd69779d8 "law") at
> postgres.c:4370
> ...

Thanks for the testing! Fixed. I introduced that bug in one of the last
revisions of the patch: I changed ResourceOwnerSort() to not bother
moving all the elements from the array to the hash table if the hash
table is allocated but empty. However, ResourceOwnerReleas() didn't get
the memo, and still checked "owner->hash != NULL" rather than
"owner->nhash == 0".

--
Heikki Linnakangas
Neon (https://neon.tech)

From:	Alexander Lakhin <exclusion(at)gmail(dot)com>
To:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Cc:	Aleksander Alekseev <aleksander(at)timescale(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject:	Re: ResourceOwner refactoring
Date:	2023-11-10 09:00:00
Message-ID:	11b70743-c5f3-3910-8e5b-dd6c115ff829@gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Hello Heikki,

09.11.2023 02:48, Heikki Linnakangas wrote:
>
> Thanks for the testing! Fixed. ...

Thank you for the fix!

Please look at one more failure caused be the new implementation of
ResourceOwners:
numdbs=80
for ((i=1;i<=10;i++)); do
echo "ITERATION $i"

for ((d=1;d<=$numdbs;d++)); do createdb db$d; done

for ((d=1;d<=$numdbs;d++)); do
echo "
create table t(t1 text);
drop table t;
" | psql -d db$d >psql-$d.log 2>&1 &
done
wait
grep 'PANIC' server.log && break;

for ((d=1;d<=$numdbs;d++)); do dropdb db$d; done
grep 'PANIC' server.log && break;
done

I could see two failure modes:
2023-11-10 08:42:28.870 UTC [1163274] ERROR: ResourceOwnerEnlarge called after release started
2023-11-10 08:42:28.870 UTC [1163274] STATEMENT: drop table t;
2023-11-10 08:42:28.870 UTC [1163274] WARNING: AbortTransaction while in COMMIT state
2023-11-10 08:42:28.870 UTC [1163274] PANIC: cannot abort transaction 906, it was already committed

2023-11-10 08:43:27.897 UTC [1164148] ERROR: ResourceOwnerEnlarge called after release started
2023-11-10 08:43:27.897 UTC [1164148] STATEMENT: DROP DATABASE db69;
2023-11-10 08:43:27.897 UTC [1164148] WARNING: AbortTransaction while in COMMIT state
2023-11-10 08:43:27.897 UTC [1164148] PANIC: cannot abort transaction 1043, it was already committed

The stack trace for the second ERROR (ResourceOwnerEnlarge called ...) is:
...
#6 0x0000558af5b2f35c in ResourceOwnerEnlarge (owner=0x558af716f3c8) at resowner.c:455
#7 0x0000558af5888f18 in dsm_create_descriptor () at dsm.c:1207
#8 0x0000558af5889205 in dsm_attach (h=3172038420) at dsm.c:697
#9 0x0000558af5b1ebed in get_segment_by_index (area=0x558af711da18, index=2) at dsa.c:1764
#10 0x0000558af5b1ea4b in dsa_get_address (area=0x558af711da18, dp=2199023329568) at dsa.c:970
#11 0x0000558af5669366 in dshash_seq_next (status=0x7ffdd5912fd0) at dshash.c:687
#12 0x0000558af5901998 in pgstat_drop_database_and_contents (dboid=16444) at pgstat_shmem.c:830
#13 0x0000558af59016f0 in pgstat_drop_entry (kind=PGSTAT_KIND_DATABASE, dboid=16444, objoid=0) at pgstat_shmem.c:888
#14 0x0000558af59044eb in AtEOXact_PgStat_DroppedStats (xact_state=0x558af7111ee0, isCommit=true) at pgstat_xact.c:88
#15 0x0000558af59043c7 in AtEOXact_PgStat (isCommit=true, parallel=false) at pgstat_xact.c:55
#16 0x0000558af53c782e in CommitTransaction () at xact.c:2371
#17 0x0000558af53c709e in CommitTransactionCommand () at xact.c:306
...

Best regards,
Alexander

From:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To:	Alexander Lakhin <exclusion(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc:	Aleksander Alekseev <aleksander(at)timescale(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject:	Re: ResourceOwner refactoring
Date:	2023-11-10 14:26:51
Message-ID:	10399503-9a92-4107-8c36-698da29438db@iki.fi
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Thanks for the testing again!

On 10/11/2023 11:00, Alexander Lakhin wrote:
> I could see two failure modes:
> 2023-11-10 08:42:28.870 UTC [1163274] ERROR: ResourceOwnerEnlarge called after release started
> 2023-11-10 08:42:28.870 UTC [1163274] STATEMENT: drop table t;
> 2023-11-10 08:42:28.870 UTC [1163274] WARNING: AbortTransaction while in COMMIT state
> 2023-11-10 08:42:28.870 UTC [1163274] PANIC: cannot abort transaction 906, it was already committed
>
> 2023-11-10 08:43:27.897 UTC [1164148] ERROR: ResourceOwnerEnlarge called after release started
> 2023-11-10 08:43:27.897 UTC [1164148] STATEMENT: DROP DATABASE db69;
> 2023-11-10 08:43:27.897 UTC [1164148] WARNING: AbortTransaction while in COMMIT state
> 2023-11-10 08:43:27.897 UTC [1164148] PANIC: cannot abort transaction 1043, it was already committed
>
> The stack trace for the second ERROR (ResourceOwnerEnlarge called ...) is:
> ...
> #6 0x0000558af5b2f35c in ResourceOwnerEnlarge (owner=0x558af716f3c8) at resowner.c:455
> #7 0x0000558af5888f18 in dsm_create_descriptor () at dsm.c:1207
> #8 0x0000558af5889205 in dsm_attach (h=3172038420) at dsm.c:697
> #9 0x0000558af5b1ebed in get_segment_by_index (area=0x558af711da18, index=2) at dsa.c:1764
> #10 0x0000558af5b1ea4b in dsa_get_address (area=0x558af711da18, dp=2199023329568) at dsa.c:970
> #11 0x0000558af5669366 in dshash_seq_next (status=0x7ffdd5912fd0) at dshash.c:687
> #12 0x0000558af5901998 in pgstat_drop_database_and_contents (dboid=16444) at pgstat_shmem.c:830
> #13 0x0000558af59016f0 in pgstat_drop_entry (kind=PGSTAT_KIND_DATABASE, dboid=16444, objoid=0) at pgstat_shmem.c:888
> #14 0x0000558af59044eb in AtEOXact_PgStat_DroppedStats (xact_state=0x558af7111ee0, isCommit=true) at pgstat_xact.c:88
> #15 0x0000558af59043c7 in AtEOXact_PgStat (isCommit=true, parallel=false) at pgstat_xact.c:55
> #16 0x0000558af53c782e in CommitTransaction () at xact.c:2371
> #17 0x0000558af53c709e in CommitTransactionCommand () at xact.c:306
> ...

The quick, straightforward fix is to move the "CurrentResourceOwner =
NULL" line earlier in CommitTransaction, per attached
0003-Clear-CurrentResourceOwner-earlier-in-CommitTransact.patch. You're
not allowed to use the resource owner after you start to release it; it
was a bit iffy even before the ResourceOwner rewrite but now it's
explicitly forbidden. By clearing CurrentResourceOwner as soon as we
start releasing it, we can prevent any accidental use.

When CurrentResourceOwner == NULL, dsm_attach() returns a handle that's
not associated with any ResourceOwner. That's appropriate for the pgstat
case. The DSA is "pinned", so the handle is forgotten from the
ResourceOwner right after calling dsm_attach() anyway.

Looking closer at dsa.c, I think this is a wider problem though. The
comments don't make it very clear how it's supposed to interact with
ResourceOwners. There's just this brief comment in dsa_pin_mapping():

> * By default, areas are owned by the current resource owner, which means they
> * are detached automatically when that scope ends.

The dsa_area struct isn't directly owned by any ResourceOwner though.
The DSM segments created by dsa_create() or dsa_attach() are.

But the functions dsa_allocate() and dsa_get_address() can create or
attach more DSM segments to the area, and they will be owned by the by
the current resource owner *at the time of the call*. So if you call
dsa_get_address() while in a different resource owner, things get very
confusing. The attached new test module demonstrates that
(0001-Add-test_dsa-module.patch), here's a shortened version:

a = dsa_create(tranche_id);

/* Switch to new resource owner */
oldowner = CurrentResourceOwner;
childowner = ResourceOwnerCreate(oldowner, "temp owner");
CurrentResourceOwner = childowner;

/* make a bunch of allocations */
for (int i = 0; i < 10000; i++)
p[i] = dsa_allocate(a, 1000);

/* Release the child resource owner */
CurrentResourceOwner = oldowner;
ResourceOwnerRelease(childowner,
RESOURCE_RELEASE_BEFORE_LOCKS,
true, false);
ResourceOwnerRelease(childowner,
RESOURCE_RELEASE_LOCKS,
true, false);
ResourceOwnerRelease(childowner,
RESOURCE_RELEASE_AFTER_LOCKS,
true, false);
ResourceOwnerDelete(childowner);

dsa_detach(a);

This first prints warnings on The ResourceOwnerRelease() calls:

2023-11-10 13:57:21.475 EET [745346] WARNING: resource was not closed:
dynamic shared memory segment 2395813396
2023-11-10 13:57:21.475 EET [745346] WARNING: resource was not closed:
dynamic shared memory segment 3922992700
2023-11-10 13:57:21.475 EET [745346] WARNING: resource was not closed:
dynamic shared memory segment 1155452762
2023-11-10 13:57:21.475 EET [745346] WARNING: resource was not closed:
dynamic shared memory segment 4045183168
2023-11-10 13:57:21.476 EET [745346] WARNING: resource was not closed:
dynamic shared memory segment 1529990480

And a segfault at the dsm_detach() call:

2023-11-10 13:57:21.480 EET [745246] LOG: server process (PID 745346)
was terminated by signal 11: Segmentation fault

#0 0x000055a5148f64ee in slist_pop_head_node (head=0x55a516d06bf8) at
../../../../src/include/lib/ilist.h:1034
#1 0x000055a5148f5eba in dsm_detach (seg=0x55a516d06bc0) at dsm.c:822
#2 0x000055a514b86db0 in dsa_detach (area=0x55a516d64dc8) at dsa.c:1939
#3 0x00007fd5dcaee5e0 in test_dsa_resowners (fcinfo=0x55a516d56a28) at
test_dsa.c:112

I think that is surprising behavior from the DSA facility. When you make
allocations with dsa_allocate() or just call dsa_get_address() on an
existing dsa_pointer, you wouldn't expect the current resource owner to
matter. I think dsa_create/attach() should store the current resource
owner in the dsa_area, for use in subsequent operations on the DSA, per
attached patch (0002-Fix-dsa.c-with-different-resource-owners.patch).

--
Heikki Linnakangas
Neon (https://neon.tech)

Attachment	Content-Type	Size
0001-Add-test_dsa-module.patch	text/x-patch	8.8 KB
0002-Fix-dsa.c-with-different-resource-owners.patch	text/x-patch	3.3 KB
0003-Clear-CurrentResourceOwner-earlier-in-CommitTransact.patch	text/x-patch	1.5 KB

From:	Andres Freund <andres(at)anarazel(dot)de>
To:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc:	Alexander Lakhin <exclusion(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Aleksander Alekseev <aleksander(at)timescale(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>
Subject:	Re: ResourceOwner refactoring
Date:	2023-11-10 23:34:25
Message-ID:	20231110233425.q3fqbtep65zgzkma@alap3.anarazel.de
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Hi,

On 2023-11-10 16:26:51 +0200, Heikki Linnakangas wrote:
> The quick, straightforward fix is to move the "CurrentResourceOwner = NULL"
> line earlier in CommitTransaction, per attached
> 0003-Clear-CurrentResourceOwner-earlier-in-CommitTransact.patch. You're not
> allowed to use the resource owner after you start to release it; it was a
> bit iffy even before the ResourceOwner rewrite but now it's explicitly
> forbidden. By clearing CurrentResourceOwner as soon as we start releasing
> it, we can prevent any accidental use.
>
> When CurrentResourceOwner == NULL, dsm_attach() returns a handle that's not
> associated with any ResourceOwner. That's appropriate for the pgstat case.
> The DSA is "pinned", so the handle is forgotten from the ResourceOwner right
> after calling dsm_attach() anyway.

Yea - I wonder if we should have a version of dsm_attach() that pins at the
same time. It's pretty silly to first attach (remembering the dsm) and then
immediately pin (forgetting the dsm).

> Looking closer at dsa.c, I think this is a wider problem though. The
> comments don't make it very clear how it's supposed to interact with
> ResourceOwners. There's just this brief comment in dsa_pin_mapping():
>
> > * By default, areas are owned by the current resource owner, which means they
> > * are detached automatically when that scope ends.
>
> The dsa_area struct isn't directly owned by any ResourceOwner though. The
> DSM segments created by dsa_create() or dsa_attach() are.

But there's a relationship from the dsa too, via on_dsm_detach().

> But the functions dsa_allocate() and dsa_get_address() can create or attach
> more DSM segments to the area, and they will be owned by the by the current
> resource owner *at the time of the call*.

Yea, that doesn't seem great. It shouldn't affect the stats could though, due
the dsa being pinned.

> I think that is surprising behavior from the DSA facility. When you make
> allocations with dsa_allocate() or just call dsa_get_address() on an
> existing dsa_pointer, you wouldn't expect the current resource owner to
> matter. I think dsa_create/attach() should store the current resource owner
> in the dsa_area, for use in subsequent operations on the DSA, per attached
> patch (0002-Fix-dsa.c-with-different-resource-owners.patch).

Yea, that seems the right direction from here.

Greetings,

Andres Freund

From:	Alexander Lakhin <exclusion(at)gmail(dot)com>
To:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Robert Haas <robertmhaas(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc:	Aleksander Alekseev <aleksander(at)timescale(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject:	Re: ResourceOwner refactoring
Date:	2023-11-11 12:00:00
Message-ID:	4160bf8f-3101-2e5f-cca1-a75ca7c522f3@gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Hello Heikki,

10.11.2023 17:26, Heikki Linnakangas wrote:
>
> I think that is surprising behavior from the DSA facility. When you make allocations with dsa_allocate() or just call
> dsa_get_address() on an existing dsa_pointer, you wouldn't expect the current resource owner to matter. I think
> dsa_create/attach() should store the current resource owner in the dsa_area, for use in subsequent operations on the
> DSA, per attached patch (0002-Fix-dsa.c-with-different-resource-owners.patch).
>

With the patch 0002 applied, I'm observing another anomaly:
CREATE TABLE t(t1 text, t2 text);
INSERT INTO t SELECT md5(g::text), '12345678901234567890' FROM generate_series(1, 100000) g;
CREATE INDEX tidx ON t(t1);

CREATE FUNCTION f() RETURNS TABLE (t1 text, t2 text) AS 'BEGIN END' LANGUAGE plpgsql;

SELECT * FROM f();

gives me under Valgrind:
2023-11-11 11:54:18.964 UTC|law|regression|654f5d2e.3c7a92|LOG: statement: SELECT * FROM f();
==00:00:00:49.589 3963538== Invalid read of size 1
==00:00:00:49.589 3963538==    at 0x9C8785: ResourceOwnerEnlarge (resowner.c:454)
==00:00:00:49.589 3963538==    by 0x7507C4: dsm_create_descriptor (dsm.c:1207)
==00:00:00:49.589 3963538==    by 0x74EF71: dsm_create (dsm.c:538)
==00:00:00:49.589 3963538==    by 0x9B7BEA: make_new_segment (dsa.c:2171)
==00:00:00:49.589 3963538==    by 0x9B6E28: ensure_active_superblock (dsa.c:1696)
==00:00:00:49.589 3963538==    by 0x9B67DD: alloc_object (dsa.c:1487)
==00:00:00:49.589 3963538==    by 0x9B5064: dsa_allocate_extended (dsa.c:816)
==00:00:00:49.589 3963538==    by 0x978A4C: share_tupledesc (typcache.c:2742)
==00:00:00:49.589 3963538==    by 0x978C07: find_or_make_matching_shared_tupledesc (typcache.c:2796)
==00:00:00:49.589 3963538==    by 0x977652: assign_record_type_typmod (typcache.c:1995)
==00:00:00:49.589 3963538==    by 0x9885A2: internal_get_result_type (funcapi.c:462)
==00:00:00:49.589 3963538==    by 0x98800D: get_expr_result_type (funcapi.c:299)
==00:00:00:49.589 3963538== Address 0x72f9bf0 is 2,096 bytes inside a recently re-allocated block of size 16,384 alloc'd
==00:00:00:49.589 3963538==    at 0x4848899: malloc (vg_replace_malloc.c:381)
==00:00:00:49.589 3963538==    by 0x9B1F94: AllocSetAlloc (aset.c:928)
==00:00:00:49.589 3963538==    by 0x9C1162: MemoryContextAllocZero (mcxt.c:1076)
==00:00:00:49.589 3963538==    by 0x9C872C: ResourceOwnerCreate (resowner.c:423)
==00:00:00:49.589 3963538==    by 0x2CC522: AtStart_ResourceOwner (xact.c:1211)
==00:00:00:49.589 3963538==    by 0x2CD52A: StartTransaction (xact.c:2084)
==00:00:00:49.589 3963538==    by 0x2CE442: StartTransactionCommand (xact.c:2948)
==00:00:00:49.589 3963538==    by 0x992D69: InitPostgres (postinit.c:860)
==00:00:00:49.589 3963538==    by 0x791E6E: PostgresMain (postgres.c:4209)
==00:00:00:49.589 3963538==    by 0x6B13C4: BackendRun (postmaster.c:4423)
==00:00:00:49.589 3963538==    by 0x6B09EB: BackendStartup (postmaster.c:4108)
==00:00:00:49.589 3963538==    by 0x6AD0A6: ServerLoop (postmaster.c:1767)

Best regards,
Alexander

From:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To:	Alexander Lakhin <exclusion(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc:	Aleksander Alekseev <aleksander(at)timescale(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject:	Re: ResourceOwner refactoring
Date:	2023-11-12 22:16:50
Message-ID:	009634e6-762f-4be4-b04a-aeb735ba7b11@iki.fi
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	Postg사설 토토 사이트SQL

On 11/11/2023 14:00, Alexander Lakhin wrote:
> 10.11.2023 17:26, Heikki Linnakangas wrote:
>>
>> I think that is surprising behavior from the DSA facility. When you make allocations with dsa_allocate() or just call
>> dsa_get_address() on an existing dsa_pointer, you wouldn't expect the current resource owner to matter. I think
>> dsa_create/attach() should store the current resource owner in the dsa_area, for use in subsequent operations on the
>> DSA, per attached patch (0002-Fix-dsa.c-with-different-resource-owners.patch).
>>
>
> With the patch 0002 applied, I'm observing another anomaly:

Ok, so the ownership of a dsa was still muddled :-(. Attached is a new
version that goes a little further. It replaces the whole
'mapping_pinned' flag in dsa_area with the 'resowner'. When a mapping is
pinned, it means that 'resowner == NULL'. This is now similar to how the
resowner field and pinning works in dsm.c.

--
Heikki Linnakangas
Neon (https://neon.tech)

Attachment	Content-Type	Size
0001-Add-test_dsa-module.patch	text/x-patch	8.8 KB
0002-Fix-dsa.c-with-different-resource-owners.patch	text/x-patch	5.0 KB
0003-Clear-CurrentResourceOwner-earlier-in-CommitTransact.patch	text/x-patch	1.5 KB

From:	Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc:	Alexander Lakhin <exclusion(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Aleksander Alekseev <aleksander(at)timescale(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject:	Re: ResourceOwner refactoring
Date:	2023-11-12 23:08:57
Message-ID:	CA+hUKGL_RzOe5Nj1eu7jkMWwKRDV8X=-bkX7BEPgiaMY6FFiKQ@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	Postg토토 사이트 순위SQL

On Mon, Nov 13, 2023 at 11:16 AM Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
> On 11/11/2023 14:00, Alexander Lakhin wrote:
> > 10.11.2023 17:26, Heikki Linnakangas wrote:
> >> I think that is surprising behavior from the DSA facility. When you make allocations with dsa_allocate() or just call
> >> dsa_get_address() on an existing dsa_pointer, you wouldn't expect the current resource owner to matter. I think
> >> dsa_create/attach() should store the current resource owner in the dsa_area, for use in subsequent operations on the
> >> DSA, per attached patch (0002-Fix-dsa.c-with-different-resource-owners.patch).

Yeah, I agree that it is surprising and dangerous that the DSM
segments can have different owners if you're not careful, and it's not
clear that you have to be. Interesting that no one ever reported
breakage in parallel query due to this thinko, presumably because the
current resource owner always turns out to be either the same one or
something with longer life.

> > With the patch 0002 applied, I'm observing another anomaly:
>
> Ok, so the ownership of a dsa was still muddled :-(. Attached is a new
> version that goes a little further. It replaces the whole
> 'mapping_pinned' flag in dsa_area with the 'resowner'. When a mapping is
> pinned, it means that 'resowner == NULL'. This is now similar to how the
> resowner field and pinning works in dsm.c.

This patch makes sense to me.

It might be tempting to delete the dsa_pin_mapping() interface
completely and say that if CurrentResourceOwner == NULL, then it's
effectively (what we used to call) pinned, but I think it's useful for
exception-safe construction of multiple objects to be able to start
out with a resource owner and then later 'commit' by clearing it. As
seen in GetSessionDsmHandle().

I don't love the way this stuff is not very explicit, and if we're
going to keep doing this 'dynamic scope' stuff then I wish we could
find a scoping notation that looks more like the stuff one sees in
other languages that say something like
"with-resource-owner(area->resowner) { block of code }".

From:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To:	Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc:	Alexander Lakhin <exclusion(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Aleksander Alekseev <aleksander(at)timescale(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject:	Re: ResourceOwner refactoring
Date:	2023-11-15 10:11:40
Message-ID:	9a3cb7eb-f996-45bb-b803-bc562a8d06aa@iki.fi
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 13/11/2023 01:08, Thomas Munro wrote:
> On Mon, Nov 13, 2023 at 11:16 AM Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
>> On 11/11/2023 14:00, Alexander Lakhin wrote:
>>> 10.11.2023 17:26, Heikki Linnakangas wrote:
>>>> I think that is surprising behavior from the DSA facility. When you make allocations with dsa_allocate() or just call
>>>> dsa_get_address() on an existing dsa_pointer, you wouldn't expect the current resource owner to matter. I think
>>>> dsa_create/attach() should store the current resource owner in the dsa_area, for use in subsequent operations on the
>>>> DSA, per attached patch (0002-Fix-dsa.c-with-different-resource-owners.patch).
>
> Yeah, I agree that it is surprising and dangerous that the DSM
> segments can have different owners if you're not careful, and it's not
> clear that you have to be. Interesting that no one ever reported
> breakage in parallel query due to this thinko, presumably because the
> current resource owner always turns out to be either the same one or
> something with longer life.

Yep. I ran check-world with an extra assertion that
"CurrentResourceOwner == area->resowner || area->resowner == NULL" to
verify that. No failures.

>>> With the patch 0002 applied, I'm observing another anomaly:
>>
>> Ok, so the ownership of a dsa was still muddled :-(. Attached is a new
>> version that goes a little further. It replaces the whole
>> 'mapping_pinned' flag in dsa_area with the 'resowner'. When a mapping is
>> pinned, it means that 'resowner == NULL'. This is now similar to how the
>> resowner field and pinning works in dsm.c.
>
> This patch makes sense to me.
>
> It might be tempting to delete the dsa_pin_mapping() interface
> completely and say that if CurrentResourceOwner == NULL, then it's
> effectively (what we used to call) pinned, but I think it's useful for
> exception-safe construction of multiple objects to be able to start
> out with a resource owner and then later 'commit' by clearing it. As
> seen in GetSessionDsmHandle().

Yeah that's useful. I don't like the "pinned mapping" term here. It's
confusing because we also have dsa_pin(), which means something
different: the area will continue to exist even after all backends have
detached from it. I think we should rename 'dsa_pin_mapping()' to
'dsa_forget_resowner()' or something like that.

I pushed these fixes, without that renaming. Let's do that as a separate
patch if we like that.

I didn't backpatch this for now, because we don't seem to have a live
bug in back branches. You could argue for backpatching because this
could bite us in the future if we backpatch something else that uses
dsa's, or if there are extensions using it. We can do that later after
we've had this in 'master' for a while.

--
Heikki Linnakangas
Neon (https://neon.tech)

From:	Andres Freund <andres(at)anarazel(dot)de>
To:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc:	Peter Eisentraut <peter(at)eisentraut(dot)org>, Aleksander Alekseev <aleksander(at)timescale(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject:	Re: ResourceOwner refactoring
Date:	2023-11-17 20:44:41
Message-ID:	20231117204441.7ff37sw53udg34lc@awork3.anarazel.de
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Hi,

On 2023-11-07 13:28:28 +0200, Heikki Linnakangas wrote:
> I feel pretty good about this overall. Barring objections or new cfbot
> failures, I will commit this in the next few days.

I am working on rebasing the AIO patch over this. I think I found a crash
that's unrelated to AIO.

#4 0x0000000000ea7631 in ExceptionalCondition (conditionName=0x4ba6f1 "owner->narr == 0",
fileName=0x4ba628 "../../../../../home/andres/src/postgresql/src/backend/utils/resowner/resowner.c", lineNumber=354)
at ../../../../../home/andres/src/postgresql/src/backend/utils/error/assert.c:66
#5 0x0000000000ef13b5 in ResourceOwnerReleaseAll (owner=0x3367d48, phase=RESOURCE_RELEASE_BEFORE_LOCKS, printLeakWarnings=false)
at ../../../../../home/andres/src/postgresql/src/backend/utils/resowner/resowner.c:354
#6 0x0000000000ef1d9c in ResourceOwnerReleaseInternal (owner=0x3367d48, phase=RESOURCE_RELEASE_BEFORE_LOCKS, isCommit=false, isTopLevel=true)
at ../../../../../home/andres/src/postgresql/src/backend/utils/resowner/resowner.c:717
#7 0x0000000000ef1c89 in ResourceOwnerRelease (owner=0x3367d48, phase=RESOURCE_RELEASE_BEFORE_LOCKS, isCommit=false, isTopLevel=true)
at ../../../../../home/andres/src/postgresql/src/backend/utils/resowner/resowner.c:644
#8 0x00000000008c1f87 in AbortTransaction () at ../../../../../home/andres/src/postgresql/src/backend/access/transam/xact.c:2851
#9 0x00000000008c4ae0 in AbortOutOfAnyTransaction () at ../../../../../home/andres/src/postgresql/src/backend/access/transam/xact.c:4761
#10 0x0000000000ec1502 in ShutdownPostgres (code=1, arg=0) at ../../../../../home/andres/src/postgresql/src/backend/utils/init/postinit.c:1357
#11 0x0000000000c942e5 in shmem_exit (code=1) at ../../../../../home/andres/src/postgresql/src/backend/storage/ipc/ipc.c:243

I think the problem is that ResourceOwnerSort() looks at owner->nhash == 0,
whereas ResourceOwnerReleaseAll() looks at !owner->hash. Therefore it's
possible to reach ResourceOwnerReleaseAll() with owner->hash != NULL while
also having owner->narr > 0.

I think both checks for !owner->hash in ResourceOwnerReleaseAll() need to
instead check owner->nhash == 0.

I'm somewhat surprised that this is only hit with the AIO branch. I guess one
needs to be somewhat unlucky to end up with a hashtable but 0 elements in it
at the time of resowner release. But still somewhat surprising.

Seperately, I see that resowner_private.h still exists in the repo, I think
that should be deleted, as many of the functions don't exist anymore?

Lastly, I think it'd be good to assert that we're not releasing the resowner
in ResourceOwnerRememberLock(). It's currently not strictly required, but that
seems like it's just leaking an implementation detail out?

Greetings,

Andres Freund

From:	Andres Freund <andres(at)anarazel(dot)de>
To:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc:	Peter Eisentraut <peter(at)eisentraut(dot)org>, Aleksander Alekseev <aleksander(at)timescale(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject:	Re: ResourceOwner refactoring
Date:	2023-11-17 20:52:10
Message-ID:	20231117205210.kaedtrstwi3zl5pc@awork3.anarazel.de
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	Postg사설 토토SQL

Hi,

On 2023-11-17 12:44:41 -0800, Andres Freund wrote:
> On 2023-11-07 13:28:28 +0200, Heikki Linnakangas wrote:
> > I feel pretty good about this overall. Barring objections or new cfbot
> > failures, I will commit this in the next few days.
>
> I am working on rebasing the AIO patch over this. I think I found a crash
> that's unrelated to AIO.
>
> #4 0x0000000000ea7631 in ExceptionalCondition (conditionName=0x4ba6f1 "owner->narr == 0",
> fileName=0x4ba628 "../../../../../home/andres/src/postgresql/src/backend/utils/resowner/resowner.c", lineNumber=354)
> at ../../../../../home/andres/src/postgresql/src/backend/utils/error/assert.c:66
> #5 0x0000000000ef13b5 in ResourceOwnerReleaseAll (owner=0x3367d48, phase=RESOURCE_RELEASE_BEFORE_LOCKS, printLeakWarnings=false)
> at ../../../../../home/andres/src/postgresql/src/backend/utils/resowner/resowner.c:354
> #6 0x0000000000ef1d9c in ResourceOwnerReleaseInternal (owner=0x3367d48, phase=RESOURCE_RELEASE_BEFORE_LOCKS, isCommit=false, isTopLevel=true)
> at ../../../../../home/andres/src/postgresql/src/backend/utils/resowner/resowner.c:717
> #7 0x0000000000ef1c89 in ResourceOwnerRelease (owner=0x3367d48, phase=RESOURCE_RELEASE_BEFORE_LOCKS, isCommit=false, isTopLevel=true)
> at ../../../../../home/andres/src/postgresql/src/backend/utils/resowner/resowner.c:644
> #8 0x00000000008c1f87 in AbortTransaction () at ../../../../../home/andres/src/postgresql/src/backend/access/transam/xact.c:2851
> #9 0x00000000008c4ae0 in AbortOutOfAnyTransaction () at ../../../../../home/andres/src/postgresql/src/backend/access/transam/xact.c:4761
> #10 0x0000000000ec1502 in ShutdownPostgres (code=1, arg=0) at ../../../../../home/andres/src/postgresql/src/backend/utils/init/postinit.c:1357
> #11 0x0000000000c942e5 in shmem_exit (code=1) at ../../../../../home/andres/src/postgresql/src/backend/storage/ipc/ipc.c:243
>
> I think the problem is that ResourceOwnerSort() looks at owner->nhash == 0,
> whereas ResourceOwnerReleaseAll() looks at !owner->hash. Therefore it's
> possible to reach ResourceOwnerReleaseAll() with owner->hash != NULL while
> also having owner->narr > 0.
>
> I think both checks for !owner->hash in ResourceOwnerReleaseAll() need to
> instead check owner->nhash == 0.
>
> I'm somewhat surprised that this is only hit with the AIO branch. I guess one
> needs to be somewhat unlucky to end up with a hashtable but 0 elements in it
> at the time of resowner release. But still somewhat surprising.

Oops - I hadn't rebased far enough at this point... That was already fixed in
8f4a1ab471e.

Seems like it'd be good to cover that path in one of the tests?

- Andres

From:	"Anton A(dot) Melnikov" <a(dot)melnikov(at)postgrespro(dot)ru>
To:	Andres Freund <andres(at)anarazel(dot)de>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc:	Peter Eisentraut <peter(at)eisentraut(dot)org>, Aleksander Alekseev <aleksander(at)timescale(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject:	Re: ResourceOwner refactoring
Date:	2024-01-16 11:23:56
Message-ID:	0b78546c-ffef-4cd9-9ba1-d1e6aab88cea@postgrespro.ru
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Hello!

Found possibly missing PGDLLIMPORT in the definition of the

extern const ResourceOwnerDesc buffer_io_resowner_desc;
and
extern const ResourceOwnerDesc buffer_pin_resowner_desc;

callbacks in the src/include/storage/buf_internals.h

Please take a look on it.

With the best regards,

--
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

From:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To:	"Anton A(dot) Melnikov" <a(dot)melnikov(at)postgrespro(dot)ru>
Cc:	Peter Eisentraut <peter(at)eisentraut(dot)org>, Aleksander Alekseev <aleksander(at)timescale(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>
Subject:	Re: ResourceOwner refactoring
Date:	2024-01-16 11:54:53
Message-ID:	31ad8b51-1590-49c2-acd3-2d7e0313c3ee@iki.fi
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 16/01/2024 13:23, Anton A. Melnikov wrote:
> Hello!
>
> Found possibly missing PGDLLIMPORT in the definition of the
>
> extern const ResourceOwnerDesc buffer_io_resowner_desc;
> and
> extern const ResourceOwnerDesc buffer_pin_resowner_desc;
>
> callbacks in the src/include/storage/buf_internals.h
>
> Please take a look on it.

Fixed, thanks. mark_pgdllimport.pl also highlighted two new variables in
walsummarizer.h, fixed those too.

--
Heikki Linnakangas
Neon (https://neon.tech)

From:	"Anton A(dot) Melnikov" <a(dot)melnikov(at)postgrespro(dot)ru>
To:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc:	Peter Eisentraut <peter(at)eisentraut(dot)org>, Aleksander Alekseev <aleksander(at)timescale(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>
Subject:	Re: ResourceOwner refactoring
Date:	2024-01-16 12:07:13
Message-ID:	2888e358-0106-4622-bbf9-ce242d0b0e76@postgrespro.ru
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	Postg젠 토토SQL :

On 16.01.2024 14:54, Heikki Linnakangas wrote:
> Fixed, thanks. mark_pgdllimport.pl also highlighted two new variables in walsummarizer.h, fixed those too.
>

Thanks!

Have a nice day!

--
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

From:	Alexander Lakhin <exclusion(at)gmail(dot)com>
To:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc:	Aleksander Alekseev <aleksander(at)timescale(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject:	Re: ResourceOwner refactoring
Date:	2024-02-02 09:00:00
Message-ID:	dc574fea-c83e-a600-08cd-10881762e4fa@gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Hello Heikki,

08.11.2023 14:37, Heikki Linnakangas wrote:
>
> Fixed these, and pushed. Thanks everyone for reviewing!
>

Please try the following script:
mkdir /tmp/50m
sudo mount -t tmpfs -o size=50M tmpfs /tmp/50m
export PGDATA=/tmp/50m/tmpdb

initdb
pg_ctl -l server.log start

cat << 'EOF' | psql
CREATE TEMP TABLE t (a name, b name, c name, d name);
INSERT INTO t SELECT 'a', 'b', 'c', 'd' FROM generate_series(1, 1000) g;

COPY t TO '/tmp/t.data';
SELECT 'COPY t FROM ''/tmp/t.data''' FROM generate_series(1, 100)
\gexec
EOF

which produces an unexpected error, a warning, and an assertion failure,
starting from b8bff07da:
..
COPY 1000
ERROR: could not extend file "base/5/t2_16386" with FileFallocate(): No space left on device
HINT: Check free disk space.
CONTEXT: COPY t, line 1000
ERROR: ResourceOwnerRemember called but array was full
CONTEXT: COPY t, line 1000
WARNING: local buffer refcount leak: [-499] (rel=base/5/t2_16386, blockNum=1483, flags=0x2000000, refcount=0 1)
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
connection to server was lost
2024-02-02 08:29:24.434 UTC [2052831] LOG: server process (PID 2052839) was terminated by signal 6: Aborted

server.log contains:
TRAP: failed Assert("RefCountErrors == 0"), File: "localbuf.c", Line: 804, PID: 2052839

Best regards,
Alexander

From:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To:	Alexander Lakhin <exclusion(at)gmail(dot)com>
Cc:	Aleksander Alekseev <aleksander(at)timescale(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject:	Re: ResourceOwner refactoring
Date:	2024-02-02 19:32:36
Message-ID:	cd41ee65-ff35-49ef-a7bd-f8f15947df97@iki.fi
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 02/02/2024 11:00, Alexander Lakhin wrote:
> Please try the following script:
> mkdir /tmp/50m
> sudo mount -t tmpfs -o size=50M tmpfs /tmp/50m
> export PGDATA=/tmp/50m/tmpdb
>
> initdb
> pg_ctl -l server.log start
>
> cat << 'EOF' | psql
> CREATE TEMP TABLE t (a name, b name, c name, d name);
> INSERT INTO t SELECT 'a', 'b', 'c', 'd' FROM generate_series(1, 1000) g;
>
> COPY t TO '/tmp/t.data';
> SELECT 'COPY t FROM ''/tmp/t.data''' FROM generate_series(1, 100)
> \gexec
> EOF
>
> which produces an unexpected error, a warning, and an assertion failure,
> starting from b8bff07da:

Fixed, thanks for the report!

Comparing ExtendBufferedRelLocal() and ExtendBufferedRelShared(), it's
easy to see that ExtendBufferedRelLocal() was missing a
ResourceOwnerEnlarge() call in the loop. But it's actually a bit more
subtle: it was correct without the ResourceOwnerEnlarge() call until
commit b8bff07da, because ExtendBufferedRelLocal() unpins the old buffer
pinning the new one, while ExtendBufferedRelShared() does it the other
way 'round. The implicit assumption was that unpinning the old buffer
ensures that you can pin a new one. That no longer holds with commit
b8bff07da. Remembering a new resource expects there to be a free slot in
the fixed-size array, but if the forgotten resource was in the hash,
rather than the array, forgetting it doesn't make space in the array.

We also make that assumption here in BufferAlloc:

> /*
> * Got a collision. Someone has already done what we were about to do.
> * We'll just handle this as if it were found in the buffer pool in
> * the first place. First, give up the buffer we were planning to
> * use.
> *
> * We could do this after releasing the partition lock, but then we'd
> * have to call ResourceOwnerEnlarge() & ReservePrivateRefCountEntry()
> * before acquiring the lock, for the rare case of such a collision.
> */
> UnpinBuffer(victim_buf_hdr);

It turns out to be OK in that case, because it unpins the buffer that
was the last one pinned. That does ensure that you have one free slot in
the array, but forgetting anything other than the most recently
remembered resource does not.

I've added a note to that in ResourceOwnerForget. I read through the
other callers of ResourceOwnerRemember and PinBuffer, but didn't find
any other unsafe uses. I'm not too happy with this subtlety, but at
least it's documented now.

--
Heikki Linnakangas
Neon (https://neon.tech)

From:	Alexander Lakhin <exclusion(at)gmail(dot)com>
To:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc:	Aleksander Alekseev <aleksander(at)timescale(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject:	Re: ResourceOwner refactoring
Date:	2024-06-01 08:00:00
Message-ID:	e56be7d9-14b1-664d-0bfc-00ce9772721c@gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Hello Heikki,

I've stumbled upon yet another anomaly introduced with b8bff07da.

Please try the following script:
SELECT 'init' FROM pg_create_logical_replication_slot('slot',
'test_decoding');
CREATE TABLE tbl(t text);
BEGIN;
TRUNCATE table tbl;

SELECT data FROM pg_logical_slot_get_changes('slot', NULL, NULL,
'include-xids', '0', 'skip-empty-xacts', '1', 'stream-changes', '1');

On b8bff07da (and the current HEAD), it ends up with:
ERROR: ResourceOwnerEnlarge called after release started

But on the previous commit, I get no error:
data
------
(0 rows)

Best regards,
Alexander

From:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To:	Alexander Lakhin <exclusion(at)gmail(dot)com>
Cc:	Aleksander Alekseev <aleksander(at)timescale(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject:	Re: ResourceOwner refactoring
Date:	2024-06-03 22:49:52
Message-ID:	7d9fa5cf-07e3-4691-b40a-802048a0b998@iki.fi
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Thanks for the testing!

On 01/06/2024 11:00, Alexander Lakhin wrote:
> Hello Heikki,
>
> I've stumbled upon yet another anomaly introduced with b8bff07da.
>
> Please try the following script:
> SELECT 'init' FROM pg_create_logical_replication_slot('slot',
> 'test_decoding');
> CREATE TABLE tbl(t text);
> BEGIN;
> TRUNCATE table tbl;
>
> SELECT data FROM pg_logical_slot_get_changes('slot', NULL, NULL,
> 'include-xids', '0', 'skip-empty-xacts', '1', 'stream-changes', '1');
>
> On b8bff07da (and the current HEAD), it ends up with:
> ERROR: ResourceOwnerEnlarge called after release started
>
> But on the previous commit, I get no error:
> data
> ------
> (0 rows)

This is another case of the issue I tried to fix in commit b8bff07da
with this:

> --- b/src/backend/access/transam/xact.c
> +++ a/src/backend/access/transam/xact.c
> @@ -5279,7 +5279,20 @@
>
> AtEOSubXact_RelationCshould be ache(false, s->subTransactionId,
> s->parent->subTransactionId);
> +
> +
> + /*
> + * AtEOSubXact_Inval sometimes needs to temporarily bump the refcount
> + * on the relcache entries that it processes. We cannot use the
> + * subtransaction's resource owner anymore, because we've already
> + * started releasing it. But we can use the parent resource owner.
> + */
> + CurrentResourceOwner = s->parent->curTransactionOwner;
> +
> AtEOSubXact_Inval(false);
> +
> + CurrentResourceOwner = s->curTransactionOwner;
> +
> ResourceOwnerRelease(s->curTransactionOwner,
> RESOURCE_RELEASE_LOCKS,
> false, false);

AtEOSubXact_Inval calls LocalExecuteInvalidationMessage(), which
ultimately calls RelationFlushRelation(). Your test case reaches
ReorderBufferExecuteInvalidations(), which also calls
LocalExecuteInvalidationMessage(), in an already aborted subtransaction.

Looking closer at relcache.c, I think we need to make
RelationFlushRelation() safe to call without a valid resource owner.
There are already IsTransactionState() checks in
RelationClearRelation(), and a comment noting that it does not touch
CurrentResourceOwner. So we should make sure RelationFlushRelation()
doesn't touch it either, and revert the above change to
AbortTransaction(). I didn't feel very good about it in the first place,
and now I see what the proper fix is.

A straightforward fix is to modify RelationFlushRelation() so that if
!IsTransactionState(), it just marks the entry as invalid instead of
calling RelationClearRelation(). That's what RelationClearRelation()
would do anyway, if it didn't hit the assertion first.

RelationClearRelation() is complicated. Depending on the circumstances,
it removes the relcache entry, rebuilds it, marks it as invalid, or
performs a partial reload of a nailed relation. So before going for the
straightforward fix, I'm going to see if I can refactor
RelationClearRelation() to be less complicated.

--
Heikki Linnakangas
Neon (https://neon.tech)

From:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To:	Alexander Lakhin <exclusion(at)gmail(dot)com>
Cc:	Aleksander Alekseev <aleksander(at)timescale(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject:	Re: ResourceOwner refactoring
Date:	2024-06-05 13:58:45
Message-ID:	ae623024-3ae7-44f5-bede-4faafdbb280a@iki.fi
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 04/06/2024 01:49, Heikki Linnakangas wrote:
> A straightforward fix is to modify RelationFlushRelation() so that if
> !IsTransactionState(), it just marks the entry as invalid instead of
> calling RelationClearRelation(). That's what RelationClearRelation()
> would do anyway, if it didn't hit the assertion first.

Here's a patch with that straightforward fix. Your test case hit the
"rd_createSubid != InvalidSubTransactionId" case, I expanded it to also
cover the "rd_firstRelfilelocatorSubid != InvalidSubTransactionId" case.
Barring objections, I'll commit this later today or tomorrow. Thanks for
the report!

I started a new thread with the bigger refactorings I had in mind:
/message-id/9c9e8908-7b3e-4ce7-85a8-00c0e165a3d6%40iki.fi

--
Heikki Linnakangas
Neon (https://neon.tech)

Attachment	Content-Type	Size
0001-Make-RelationFlushRelation-work-without-ResourceOwne.patch	text/x-patch	9.4 KB

From:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To:	Alexander Lakhin <exclusion(at)gmail(dot)com>
Cc:	Aleksander Alekseev <aleksander(at)timescale(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject:	Re: ResourceOwner refactoring
Date:	2024-06-06 11:32:38
Message-ID:	45bc67c6-85f6-4c5b-a701-9b59d81afcf7@iki.fi
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 05/06/2024 16:58, Heikki Linnakangas wrote:
> On 04/06/2024 01:49, Heikki Linnakangas wrote:
>> A straightforward fix is to modify RelationFlushRelation() so that if
>> !IsTransactionState(), it just marks the entry as invalid instead of
>> calling RelationClearRelation(). That's what RelationClearRelation()
>> would do anyway, if it didn't hit the assertion first.
>
> Here's a patch with that straightforward fix. Your test case hit the
> "rd_createSubid != InvalidSubTransactionId" case, I expanded it to also
> cover the "rd_firstRelfilelocatorSubid != InvalidSubTransactionId" case.

For the record, I got the above backwards: your test case covered the
rd_firstRelfilelocatorSubid case and I expanded it to also cover the
rd_createSubid case.

> Barring objections, I'll commit this later today or tomorrow. Thanks for
> the report!

Committed.

--
Heikki Linnakangas
Neon (https://neon.tech)

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc:	Alexander Lakhin <exclusion(at)gmail(dot)com>, Aleksander Alekseev <aleksander(at)timescale(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject:	Re: ResourceOwner refactoring
Date:	2024-06-06 15:27:40
Message-ID:	CA+Tgmoae8b0sGMVhaZgEtmG1HJzi3Uina4KwoSyr-bh7ReMLww@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Thu, Jun 6, 2024 at 7:32 AM Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
> > Barring objections, I'll commit this later today or tomorrow. Thanks for
> > the report!
>
> Committed.

I think you may have forgotten to push.

--
Robert Haas
EDB: http://www.enterprisedb.com

From:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Alexander Lakhin <exclusion(at)gmail(dot)com>, Aleksander Alekseev <aleksander(at)timescale(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject:	Re: ResourceOwner refactoring
Date:	2024-06-06 15:57:21
Message-ID:	0716de28-f19d-4076-b6f1-d7c1187c2a14@iki.fi
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 06/06/2024 18:27, Robert Haas wrote:
> On Thu, Jun 6, 2024 at 7:32 AM Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
>>> Barring objections, I'll commit this later today or tomorrow. Thanks for
>>> the report!
>>
>> Committed.
>
> I think you may have forgotten to push.

Huh, you're right. I could swear I did...

Pushed now thanks!

--
Heikki Linnakangas
Neon (https://neon.tech)