Quick Links

Re: pgsql: Add kqueue(2) support to the WaitEventSet API.

Lists:	pgsql-committerspgsql-hackers

From:	Thomas Munro <tmunro(at)postgresql(dot)org>
To:	pgsql-committers(at)lists(dot)postgresql(dot)org
Subject:	pgsql: Add kqueue(2) support to the WaitEventSet API.
Date:	2020-02-05 04:59:10
Message-ID:	E1izCmM-0005pV-Co@gemulon.postgresql.org
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	PostgreSQL : PostgreSQL Postg무지개 토토SQL

Add kqueue(2) support to the WaitEventSet API.

Use kevent(2) to wait for events on the BSD family of operating
systems and macOS. This is similar to the epoll(2) support added
for Linux by commit 98a64d0bd.

Author: Thomas Munro
Reviewed-by: Andres Freund, Marko Tiikkaja, Tom Lane
Tested-by: Mateusz Guzik, Matteo Beccati, Keith Fiske, Heikki Linnakangas, Michael Paquier, Peter Eisentraut, Rui DeSousa, Tom Lane, Mark Wong
Discussion: https://postgr.es/m/CAEepm%3D37oF84-iXDTQ9MrGjENwVGds%2B5zTr38ca73kWR7ez_tA%40mail.gmail.com

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/815c2f0972c8386aba7c606f1ee6690d13b04db2

Modified Files
--------------
configure | 4 +-
configure.in | 2 +
src/backend/storage/ipc/latch.c | 300 +++++++++++++++++++++++++++++++++++++++-
src/include/pg_config.h.in | 6 +
src/tools/msvc/Solution.pm | 2 +
5 files changed, 311 insertions(+), 3 deletions(-)

From:	Michael Paquier <michael(at)paquier(dot)xyz>
To:	Thomas Munro <tmunro(at)postgresql(dot)org>
Cc:	pgsql-committers(at)lists(dot)postgresql(dot)org
Subject:	Re: pgsql: Add kqueue(2) support to the WaitEventSet API.
Date:	2020-02-20 07:24:35
Message-ID:	20200220072435.GM2288@paquier.xyz
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	PostgreSQL : PostgreSQL Postg토토 사이트SQL

Hi Thomas,

On Wed, Feb 05, 2020 at 04:59:10AM +0000, Thomas Munro wrote:
> Add kqueue(2) support to the WaitEventSet API.
>
> Use kevent(2) to wait for events on the BSD family of operating
> systems and macOS. This is similar to the epoll(2) support added
> for Linux by commit 98a64d0bd.

Worth noting this issue with the test suite of postgres_fdw for
buildfarm animal coypu, running on NetBSD:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=coypu&dt=2020-02-19%2023%3A01%3A01
+ERROR: kqueue failed: Too many open files
--
Michael

From:	Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To:	Michael Paquier <michael(at)paquier(dot)xyz>
Cc:	Thomas Munro <tmunro(at)postgresql(dot)org>, pgsql-committers <pgsql-committers(at)lists(dot)postgresql(dot)org>, remi_zara(at)mac(dot)com
Subject:	Re: pgsql: Add kqueue(2) support to the WaitEventSet API.
Date:	2020-02-20 11:15:59
Message-ID:	CA+hUKG+yGQDPdwcH5euDb_pE7OcnbrvGDLoOi2JMaCwj4G+ttg@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	PostgreSQL : PostgreSQL Postg토토 꽁 머니SQL

On Thu, Feb 20, 2020 at 8:24 PM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
> On Wed, Feb 05, 2020 at 04:59:10AM +0000, Thomas Munro wrote:
> > Add kqueue(2) support to the WaitEventSet API.
> >
> > Use kevent(2) to wait for events on the BSD family of operating
> > systems and macOS. This is similar to the epoll(2) support added
> > for Linux by commit 98a64d0bd.
>
> Worth noting this issue with the test suite of postgres_fdw for
> buildfarm animal coypu, running on NetBSD:
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=coypu&dt=2020-02-19%2023%3A01%3A01
> +ERROR: kqueue failed: Too many open files

Hmm. So coypu just came back after 48 days, and the new kqueue() code
fails for process 19829 after successfully running 265 log lines'
worth of postgres_fdw tests, because it's run out of file
descriptors. I can see that WaitLatchOrSocket() actually could leak
an epoll/kqueue socket if WaitEventSetWait() raises an error, which is
interesting, but apparently not the explanation here because we don't
see a preceding error report. Another theory would be that this
machine has a low max_safe_fds, and NUM_RESERVED_FDS is only just
enough to handle the various sockets that postgres_fdw.sql creates and
at some point kqueue()'s demand for just one more pushed it over the
edge. From the error text and a look at the man page for errno, this
error is EMFILE (per process limit, which could be as low as 64)
rather then ENFILE (system limit).

Remi, any chance you could run gmake installcheck under
contrib/postgres_fdw on that host, to see if this is repeatable? Can
you tell us about the relevant limits? Maybe ulimit -n (for the user
that runs the build farm), and also sysctl -a | grep descriptors,
sysctl -a | grep maxfiles?

From:	Rémi Zara <remi_zara(at)mac(dot)com>
To:	Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc:	Michael Paquier <michael(at)paquier(dot)xyz>, Thomas Munro <tmunro(at)postgresql(dot)org>, pgsql-committers <pgsql-committers(at)lists(dot)postgresql(dot)org>
Subject:	Re: pgsql: Add kqueue(2) support to the WaitEventSet API.
Date:	2020-02-20 18:03:59
Message-ID:	08364CCB-07F5-423E-8CE0-ABF0195C4CA2@mac.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	PostgreSQL : PostgreSQL pgsql-hackers

> Le 20 févr. 2020 à 12:15, Thomas Munro <thomas(dot)munro(at)gmail(dot)com> a écrit :
>
> On Thu, Feb 20, 2020 at 8:24 PM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>> On Wed, Feb 05, 2020 at 04:59:10AM +0000, Thomas Munro wrote:
>>> Add kqueue(2) support to the WaitEventSet API.
>>>
>>> Use kevent(2) to wait for events on the BSD family of operating
>>> systems and macOS. This is similar to the epoll(2) support added
>>> for Linux by commit 98a64d0bd.
>>
>> Worth noting this issue with the test suite of postgres_fdw for
>> buildfarm animal coypu, running on NetBSD:
>> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=coypu&dt=2020-02-19%2023%3A01%3A01
>> +ERROR: kqueue failed: Too many open files
>
> Hmm. So coypu just came back after 48 days, and the new kqueue() code
> fails for process 19829 after successfully running 265 log lines'
> worth of postgres_fdw tests, because it's run out of file
> descriptors. I can see that WaitLatchOrSocket() actually could leak
> an epoll/kqueue socket if WaitEventSetWait() raises an error, which is
> interesting, but apparently not the explanation here because we don't
> see a preceding error report. Another theory would be that this
> machine has a low max_safe_fds, and NUM_RESERVED_FDS is only just
> enough to handle the various sockets that postgres_fdw.sql creates and
> at some point kqueue()'s demand for just one more pushed it over the
> edge. From the error text and a look at the man page for errno, this
> error is EMFILE (per process limit, which could be as low as 64)
> rather then ENFILE (system limit).
>
> Remi, any chance you could run gmake installcheck under
> contrib/postgres_fdw on that host, to see if this is repeatable? Can
> you tell us about the relevant limits? Maybe ulimit -n (for the user
> that runs the build farm), and also sysctl -a | grep descriptors,
> sysctl -a | grep maxfiles?

Hi,

Unfortunalty, coypu went offline again. I will run tests as soon as I can bring it back up.

Rémi

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Rémi Zara <remi_zara(at)mac(dot)com>
Cc:	Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Thomas Munro <tmunro(at)postgresql(dot)org>, pgsql-committers <pgsql-committers(at)lists(dot)postgresql(dot)org>
Subject:	Re: pgsql: Add kqueue(2) support to the WaitEventSet API.
Date:	2020-02-20 18:40:47
Message-ID:	17106.1582224047@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	PostgreSQL : PostgreSQL Postg사설 토토 사이트SQL

=?utf-8?Q?R=C3=A9mi_Zara?= <remi_zara(at)mac(dot)com> writes:
>> Le 20 févr. 2020 à 12:15, Thomas Munro <thomas(dot)munro(at)gmail(dot)com> a écrit :
>> Remi, any chance you could run gmake installcheck under
>> contrib/postgres_fdw on that host, to see if this is repeatable? Can
>> you tell us about the relevant limits? Maybe ulimit -n (for the user
>> that runs the build farm), and also sysctl -a | grep descriptors,
>> sysctl -a | grep maxfiles?

> Unfortunalty, coypu went offline again. I will run tests as soon as I can bring it back up.

I have a working NetBSD 8/ppc installation, will try to reproduce there.

regards, tom lane

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc:	Rémi Zara <remi_zara(at)mac(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject:	Re: pgsql: Add kqueue(2) support to the WaitEventSet API.
Date:	2020-02-20 19:44:04
Message-ID:	19710.1582227844@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-committers Postg메이저 토토 사이트SQL

[ redirecting to -hackers ]

I wrote:
> =?utf-8?Q?R=C3=A9mi_Zara?= <remi_zara(at)mac(dot)com> writes:
> Le 20 févr. 2020 à 12:15, Thomas Munro <thomas(dot)munro(at)gmail(dot)com> a écrit :
>>> Remi, any chance you could run gmake installcheck under
>>> contrib/postgres_fdw on that host, to see if this is repeatable? Can
>>> you tell us about the relevant limits? Maybe ulimit -n (for the user
>>> that runs the build farm), and also sysctl -a | grep descriptors,
>>> sysctl -a | grep maxfiles?

> I have a working NetBSD 8/ppc installation, will try to reproduce there.

Yup, it reproduces fine here. I see

$ ulimit -a
...
nofiles (-n descriptors) 128

which squares with the sysctl values:

proc.curproc.rlimit.descriptors.soft = 128
proc.curproc.rlimit.descriptors.hard = 1772
kern.maxfiles = 1772

and also with set_max_safe_fds' results:

2020-02-20 14:29:38.610 EST [2218] DEBUG: max_safe_fds = 115, usable_fds = 125, already_open = 3

It seems fairly obvious now that I look at it, but: the epoll and kqueue
variants of CreateWaitEventSet are both *fundamentally* unsafe, because
they assume that they can always get a FD when they want one, which is
not a property that we generally want backend code to have. The only
reason we've not seen this before with epoll is a lack of testing
under lots-of-FDs stress.

The fact that they'll likely leak those FDs on subsequent failures is
another not-very-nice property.

I think we ought to redesign this so that those FDs are handed out by
fd.c, which can ReleaseLruFile() and retry if it gets EMFILE or ENFILE.
fd.c could also be responsible for the resource tracking needed to
prevent leakage.

regards, tom lane

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc:	Rémi Zara <remi_zara(at)mac(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject:	Re: pgsql: Add kqueue(2) support to the WaitEventSet API.
Date:	2020-02-20 19:56:57
Message-ID:	20263.1582228617@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-committers Postg토토 핫SQL :

I wrote:
> It seems fairly obvious now that I look at it, but: the epoll and kqueue
> variants of CreateWaitEventSet are both *fundamentally* unsafe, because
> they assume that they can always get a FD when they want one, which is
> not a property that we generally want backend code to have. The only
> reason we've not seen this before with epoll is a lack of testing
> under lots-of-FDs stress.
> The fact that they'll likely leak those FDs on subsequent failures is
> another not-very-nice property.

Hmmm ... actually, there's a third problem, which is the implicit
assumption that we can have as many concurrently-active WaitEventSets
as we like. We can't, if they depend on FDs --- that's a precious
resource. It doesn't look like we actually ever have more than about
two per process right now, but I'm concerned about what will happen
as the usage of the feature increases.

regards, tom lane

From:	Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Rémi Zara <remi_zara(at)mac(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject:	Re: pgsql: Add kqueue(2) support to the WaitEventSet API.
Date:	2020-02-20 20:50:29
Message-ID:	CA+hUKGLh5NSUhPmipF+3n2-y_ncuVWu3DSg7vKhvY2HwxEBiNQ@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	PostgreSQL : PostgreSQL Postg윈 토토SQL :

On Fri, Feb 21, 2020 at 8:56 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> I wrote:
> > It seems fairly obvious now that I look at it, but: the epoll and kqueue
> > variants of CreateWaitEventSet are both *fundamentally* unsafe, because
> > they assume that they can always get a FD when they want one, which is
> > not a property that we generally want backend code to have. The only
> > reason we've not seen this before with epoll is a lack of testing
> > under lots-of-FDs stress.
> > The fact that they'll likely leak those FDs on subsequent failures is
> > another not-very-nice property.
>
> Hmmm ... actually, there's a third problem, which is the implicit
> assumption that we can have as many concurrently-active WaitEventSets
> as we like. We can't, if they depend on FDs --- that's a precious
> resource. It doesn't look like we actually ever have more than about
> two per process right now, but I'm concerned about what will happen
> as the usage of the feature increases.

One thing I've been planning to do for 13 is to get rid of all the
temporary create/destroy WaitEventSets from the main backend loops.
My goal was cutting down on stupid system calls, but this puts a new
spin on it. I have a patch set to do a bunch of that[1], but now I'm
thinking that perhaps I need to be even more aggressive about it and
set up the 'common' long lived WES up front at backend startup, rather
than doing it on demand, so that there is no chance of failure due to
lack of fds once you've started up. I also recently figured out how
to handle some more places with the common WES. I'll post a new patch
set over on that thread shortly.

That wouldn't mean that the postgres_fdw.sql can't fail on a ulimit -n
= 128 system, though, it might just mean that it's postgres_fdw's
socket() call that hits EMFILE rather than WES's kqueue() call while
running that test. I suppose there are two kinds of system: those
where ulimit -n is higher than max_files_per_process (defaults, on
Linux: 1024 vs 1000) so you have more allowance for sockets and the
like, and those where it isn't, like coypu, where NUM_RESERVED_FDS is
the only thing ensuring we have some spare fds. I don't know the
history but it looks like NUM_RESERVED_FDS was deliberately or
accidentally tuned to be just enough to be able to run the regression
tests (the interesting ones being the ones that use sockets, like
postgres_fdw.sql), but with a new long lived kqueue() fd in the
picture, it might have to be increased to cover it, no?

About the potential for leaks, Horiguchi-san realised this hazard and
posted a patch[2] to allow WaitEventSets to be cleaned up by the
resource owner machinery. That's useful for the temporary
WaitEventSet objects that we'd genuinely need in the patch set that's
part of: that's for creating a query-lifetime WES to manage N
connections to remote shards, and it needs to be cleaned up on
failure. For the temporary ones created by WaitLatch(), I suspect
they don't really belong in a resource owner: instead we should get
rid of it using my WaitMyLatch() patch set, and if there are any
places where we can't for some reason (I hope not), perhaps a
try/catch block should be used to fix that.

[1] /message-id/flat/CA%2BhUKGJAC4Oqao%3DqforhNey20J8CiG2R%3DoBPqvfR0vOJrFysGw%40mail.gmail.com
[2] /message-id/20191206.171211.1119526746053895900.horikyota.ntt%40gmail.com

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc:	Rémi Zara <remi_zara(at)mac(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject:	Re: pgsql: Add kqueue(2) support to the WaitEventSet API.
Date:	2020-02-20 22:05:14
Message-ID:	26174.1582236314@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-committers Postg토토 사이트SQL

Thomas Munro <thomas(dot)munro(at)gmail(dot)com> writes:
> One thing I've been planning to do for 13 is to get rid of all the
> temporary create/destroy WaitEventSets from the main backend loops.
> My goal was cutting down on stupid system calls, but this puts a new
> spin on it. I have a patch set to do a bunch of that[1], but now I'm
> thinking that perhaps I need to be even more aggressive about it and
> set up the 'common' long lived WES up front at backend startup, rather
> than doing it on demand, so that there is no chance of failure due to
> lack of fds once you've started up.

> That wouldn't mean that the postgres_fdw.sql can't fail on a ulimit -n
> = 128 system, though, it might just mean that it's postgres_fdw's
> socket() call that hits EMFILE rather than WES's kqueue() call while
> running that test.

Good point.

> I suppose there are two kinds of system: those
> where ulimit -n is higher than max_files_per_process (defaults, on
> Linux: 1024 vs 1000) so you have more allowance for sockets and the
> like, and those where it isn't, like coypu, where NUM_RESERVED_FDS is
> the only thing ensuring we have some spare fds. I don't know the
> history but it looks like NUM_RESERVED_FDS was deliberately or
> accidentally tuned to be just enough to be able to run the regression
> tests (the interesting ones being the ones that use sockets, like
> postgres_fdw.sql), but with a new long lived kqueue() fd in the
> picture, it might have to be increased to cover it, no?

No. NUM_RESERVED_FDS was set decades ago, long before any of those tests
existed, and it has never been changed AFAIK. It is a bit striking that
we just started seeing it be insufficient with this patch. Maybe that's
just happenstance, but I wonder whether there is a plain old FD leak
involved in addition to the design issue? I'll take a closer look at
exactly what's open when we hit the error.

The point about possibly hitting EMFILE in libpq's socket() call is
an interesting one. libpq of course can't do anything to recover
from that (and even if it could, there are lower levels such as a
possible DNS lookup that we're not going to be able to modify).
I'm speculating about having postgres_fdw ask fd.c to forcibly
free one LRU file before it attempts to open a new libpq connection.
That would prevent EMFILE (process-level exhaustion) and it would
also provide some small protection against ENFILE (system-wide
exhaustion), though of course there's no guarantee that someone
else doesn't snap up the FD you so graciously relinquished.

regards, tom lane

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc:	Rémi Zara <remi_zara(at)mac(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject:	Re: pgsql: Add kqueue(2) support to the WaitEventSet API.
Date:	2020-02-20 23:55:43
Message-ID:	6610.1582242943@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-committers Postg토토 결과SQL

I wrote:
> Thomas Munro <thomas(dot)munro(at)gmail(dot)com> writes:
>> ... like coypu, where NUM_RESERVED_FDS is
>> the only thing ensuring we have some spare fds. I don't know the
>> history but it looks like NUM_RESERVED_FDS was deliberately or
>> accidentally tuned to be just enough to be able to run the regression
>> tests (the interesting ones being the ones that use sockets, like
>> postgres_fdw.sql), but with a new long lived kqueue() fd in the
>> picture, it might have to be increased to cover it, no?

> No. NUM_RESERVED_FDS was set decades ago, long before any of those tests
> existed, and it has never been changed AFAIK. It is a bit striking that
> we just started seeing it be insufficient with this patch. Maybe that's
> just happenstance, but I wonder whether there is a plain old FD leak
> involved in addition to the design issue? I'll take a closer look at
> exactly what's open when we hit the error.

Hmm ... looks like I'm wrong: we've been skating way too close to the edge
for awhile. Here's a breakdown of the open FDs in the backend at the time
of the failure, excluding stdin/stdout/stderr (which set_max_safe_fds
accounted for) and FDs pointing to database files:

COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
postmaste 2657 postgres 3r FIFO 0,8 0t0 20902158 pipe postmaster_alive_fds[0]
postmaste 2657 postgres 4u REG 0,9 0 3877 [eventpoll] FeBeWaitSet's epoll_fd
postmaste 2657 postgres 7u unix 0xffff880878e21880 0t0 20902664 socket socket for a PGconn owned by postgres_fdw
postmaste 2657 postgres 8u IPv6 20902171 0t0 UDP localhost:40795->localhost:40795 pgStatSock
postmaste 2657 postgres 9u unix 0xffff880876903c00 0t0 20902605 /tmp/.s.PGSQL.5432 MyProcPort->sock
postmaste 2657 postgres 10r FIFO 0,8 0t0 20902606 pipe selfpipe_readfd
postmaste 2657 postgres 11w FIFO 0,8 0t0 20902606 pipe selfpipe_writefd
postmaste 2657 postgres 105u unix 0xffff880878e21180 0t0 20902647 socket socket for a PGconn owned by postgres_fdw
postmaste 2657 postgres 118u unix 0xffff8804772c88c0 0t0 20902650 socket socket for a PGconn owned by postgres_fdw

One thing to notice is that there are only nine, though NUM_RESERVED_FDS
should have allowed ten. That's because there are 116 open FDs pointing
at database files, though max_safe_fds is 115. I'm not sure whether
that's OK or an off-by-one error in fd.c's accounting. One of the 116
is pointing at a WAL segment, and I think we might not be sending that
through the normal VFD path, so it might be "expected".

But anyway, what this shows is that over time we've eaten enough of
the "reserved" FDs that only three are available for random usage like
postgres_fdw's, if the process's back is against the wall FD-wise.
The postgres_fdw regression test is using all three, meaning there's
exactly no daylight in that test.

Clearly, we gotta do something about that too. Maybe bumping up
NUM_RESERVED_FDS would be a good idea, but I feel like more-honest
accounting for permanently-eaten FDs would be a better idea. And
in any case we can't suppose that there's a fixed upper limit on
the number of postgres_fdw connections. I'm liking the idea I floated
earlier of letting postgres_fdw forcibly close the oldest LRU entry.

BTW, you don't need anything very exotic to provoke this error.
The results above are from a Linux box; I just did "ulimit -n 128"
before starting the postmaster.

regards, tom lane

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc:	Rémi Zara <remi_zara(at)mac(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject:	Re: pgsql: Add kqueue(2) support to the WaitEventSet API.
Date:	2020-02-21 04:30:07
Message-ID:	18091.1582259407@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-committers pgsql-hackers

I wrote:
> Clearly, we gotta do something about that too. Maybe bumping up
> NUM_RESERVED_FDS would be a good idea, but I feel like more-honest
> accounting for permanently-eaten FDs would be a better idea. And
> in any case we can't suppose that there's a fixed upper limit on
> the number of postgres_fdw connections. I'm liking the idea I floated
> earlier of letting postgres_fdw forcibly close the oldest LRU entry.

A late-night glimmer of an idea: suppose we make fd.c keep count,
not just of the open FDs under its control, but also of open FDs
not under its control. The latter count would include the initial
FDs (stdin/stdout/stderr), and FDs allocated by OpenTransientFile
et al, and we could provide functions for other callers to report
that they just allocated or released a FD. So postgres_fdw could
report, when it opens or closes a PGconn, that the count of external
FDs should be increased or decreased. fd.c would then forcibly
close VFDs as needed to keep NUM_RESERVED_FDS worth of slop. We
still need slop, to provide some daylight for code that isn't aware
of this mechanism. But we could certainly get all these known
long-lived FDs to be counted, and that would allow fd.c to reduce
the number of open VFDs enough to ensure that some slop remains.

This idea doesn't fix every possible problem. For instance, if you
have a plperlu function that wants to open a bunch of files, I don't
see any easy way to ensure we release VFDs to make that possible.
But it's sure an improvement on the status quo.

regards, tom lane

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc:	Rémi Zara <remi_zara(at)mac(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject:	Re: pgsql: Add kqueue(2) support to the WaitEventSet API.
Date:	2020-02-23 21:49:12
Message-ID:	6230.1582494552@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-committers Postg메이저 토토 사이트SQL

I wrote:
>> Clearly, we gotta do something about that too. Maybe bumping up
>> NUM_RESERVED_FDS would be a good idea, but I feel like more-honest
>> accounting for permanently-eaten FDs would be a better idea. And
>> in any case we can't suppose that there's a fixed upper limit on
>> the number of postgres_fdw connections. I'm liking the idea I floated
>> earlier of letting postgres_fdw forcibly close the oldest LRU entry.

> A late-night glimmer of an idea: suppose we make fd.c keep count,
> not just of the open FDs under its control, but also of open FDs
> not under its control.

Here's a draft patch that does it like that. There are undoubtedly
more places that need to be taught to report their FD consumption;
one obvious candidate that I didn't touch is dblink. But this is
enough to account for all the long-lived "extra" FDs that are currently
open in a default build, and it passes check-world even with ulimit -n
set to the minimum that set_max_safe_fds will allow.

One thing I noticed is that if you open enough postgres_fdw connections
to cause a failure, that tends to come out as an unintelligible-to-
the-layman "epoll_create1 failed: Too many open files" error. That's
because after postgres_fdw has consumed the last available "external
FD" slot, it tries to use CreateWaitEventSet to wait for input from
the remote server, and now that needs to get another external FD.
I left that alone for the moment, because if you do rejigger the
WaitEventSet code to avoid dynamically creating/destroying epoll sockets,
it will stop being a problem. If that doesn't happen, another
possibility is to reclassify CreateWaitEventSet as a caller that should
ignore "failure" from ReserveExternalFD --- but that would open us up
to problems if a lot of WaitEventSets get created, so it's not a great
answer. It'd be okay perhaps if we added a distinction between
long-lived and short-lived WaitEventSets (ignoring "failure" only for
the latter). But I didn't want to go there in this patch.

Thoughts?

regards, tom lane

Attachment	Content-Type	Size
account-honestly-for-external-FD-usage-1.patch	text/x-diff	21.4 KB

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc:	Rémi Zara <remi_zara(at)mac(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject:	Re: pgsql: Add kqueue(2) support to the WaitEventSet API.
Date:	2020-02-23 23:24:07
Message-ID:	27649.1582500247@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-committers Postg윈 토토SQL :

I wrote:
> Here's a draft patch that does it like that.

On reflection, trying to make ReserveExternalFD serve two different
use-cases was pretty messy. Here's a version that splits it into two
functions. I also took the trouble to fix dblink.

regards, tom lane

Attachment	Content-Type	Size
account-honestly-for-external-FD-usage-2.patch	text/x-diff	24.5 KB

From:	Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Rémi Zara <remi_zara(at)mac(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject:	Re: pgsql: Add kqueue(2) support to the WaitEventSet API.
Date:	2020-02-24 06:42:52
Message-ID:	CA+hUKGJ5+5qWtLGhq-HXJ1UUF2rOtmrCo1rtb+F_-UHTRmdmGg@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	PostgreSQL : PostgreSQL pgsql-hackers

On Mon, Feb 24, 2020 at 12:24 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> On reflection, trying to make ReserveExternalFD serve two different
> use-cases was pretty messy. Here's a version that splits it into two
> functions. I also took the trouble to fix dblink.

+ /*
+ * We don't want more than max_safe_fds / 3 FDs to be consumed for
+ * "external" FDs.
+ */
+ if (numExternalFDs < max_safe_fds / 3)

This looks pretty reasonable to me.

I'll have a new patch set to create a common WES at startup over on
that other thread in a day or two.

From:	Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Rémi Zara <remi_zara(at)mac(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject:	Re: pgsql: Add kqueue(2) support to the WaitEventSet API.
Date:	2020-02-24 06:53:34
Message-ID:	CA+hUKGJeL9XP8p5c4fGoAgPU9ktuxpiCt6An4N9ytO=qSh9WgA@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-committers pgsql-hackers

On Mon, Feb 24, 2020 at 7:42 PM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
> On Mon, Feb 24, 2020 at 12:24 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > On reflection, trying to make ReserveExternalFD serve two different
> > use-cases was pretty messy. Here's a version that splits it into two
> > functions. I also took the trouble to fix dblink.
>
> + /*
> + * We don't want more than max_safe_fds / 3 FDs to be consumed for
> + * "external" FDs.
> + */
> + if (numExternalFDs < max_safe_fds / 3)

I suppose there may be users who have set ulimit -n high enough to
support an FDW workload that connects to very many hosts, who will now
need to set max_files_per_process higher to avoid the new error now
that we're doing this accounting. That doesn't seem to be a problem
in itself, but I wonder if the error message should make it clearer
that it's our limit they hit here.

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc:	Rémi Zara <remi_zara(at)mac(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject:	Re: pgsql: Add kqueue(2) support to the WaitEventSet API.
Date:	2020-02-24 14:44:29
Message-ID:	16353.1582555469@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	PostgreSQL : PostgreSQL Postg토토 커뮤니티SQL

Thomas Munro <thomas(dot)munro(at)gmail(dot)com> writes:
> I suppose there may be users who have set ulimit -n high enough to
> support an FDW workload that connects to very many hosts, who will now
> need to set max_files_per_process higher to avoid the new error now
> that we're doing this accounting. That doesn't seem to be a problem
> in itself, but I wonder if the error message should make it clearer
> that it's our limit they hit here.

I struggled with the wording of that message, actually. The patch
proposes

+ ereport(ERROR,
+ (errcode(ERRCODE_SQLCLIENT_UNABLE_TO_ESTABLISH_SQLCONNECTION),
+ errmsg("could not connect to server \"%s\"",
+ server->servername),
+ errdetail("There are too many open files.")));

I wanted to say "The server has too many open files." but in context
it would seem to be talking about the remote server, so I'm not sure
how to fix that.

We could also consider a HINT, along the lines of "Raise the server's
max_files_per_process and/or \"ulimit -n\" limits." This again has
the ambiguity about which server, and it also seems dangerously
platform-specific.

regards, tom lane

From:	Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Rémi Zara <remi_zara(at)mac(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject:	Re: pgsql: Add kqueue(2) support to the WaitEventSet API.
Date:	2020-02-24 18:29:51
Message-ID:	13F92C31-0E60-45C9-9031-3B1ECFDD5261@enterprisedb.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	PostgreSQL : PostgreSQL pgsql-hackers

> On Feb 20, 2020, at 8:30 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> This idea doesn't fix every possible problem. For instance, if you
> have a plperlu function that wants to open a bunch of files, I don't
> see any easy way to ensure we release VFDs to make that possible.
> But it's sure an improvement on the status quo.

I understand that you were using plperlu just as an example, but it got me thinking. Could we ship a wrapper using perl's tie() mechanism to call a new spi function to report when a file handle is opened and when it is closed? Most plperlu functions would not participate, since developers will not necessarily know to use the wrapper, but at least they could learn about the wrapper and use it as a work-around if this becomes a problem for them. Perhaps the same spi function could be used by other procedural languages.

I can't see this solution working unless the backend can cleanup properly under exceptional conditions, and decrement the counter of used file handles appropriately. But that's the same requirement that postgres_fdw would also have, right? Would the same mechanism work for both?

I imagine something like <PgPerluSafe>::IO::File and <PgPerluSafe>::File::Temp which could be thin wrappers around IO::File and File::Temp that automatically do the tie()ing for you. (Replace <PgPerluSafe> with whichever name seems best.)

Is this too convoluted to be worth the bother?

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

From:	Andres Freund <andres(at)anarazel(dot)de>
To:	Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Rémi Zara <remi_zara(at)mac(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject:	Re: pgsql: Add kqueue(2) support to the WaitEventSet API.
Date:	2020-02-24 18:41:21
Message-ID:	20200224184121.725gnltgycvnw3xu@alap3.anarazel.de
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-committers pgsql-hackers

Hi,

On 2020-02-24 10:29:51 -0800, Mark Dilger wrote:
> > On Feb 20, 2020, at 8:30 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >
> > This idea doesn't fix every possible problem. For instance, if you
> > have a plperlu function that wants to open a bunch of files, I don't
> > see any easy way to ensure we release VFDs to make that possible.
> > But it's sure an improvement on the status quo.
>
> I understand that you were using plperlu just as an example, but it
> got me thinking. Could we ship a wrapper using perl's tie() mechanism
> to call a new spi function to report when a file handle is opened and
> when it is closed? Most plperlu functions would not participate,
> since developers will not necessarily know to use the wrapper, but at
> least they could learn about the wrapper and use it as a work-around
> if this becomes a problem for them. Perhaps the same spi function
> could be used by other procedural languages.

While we're thinking a bit outside of the box: We could just dup() a
bunch of fds within fd.c to protect fd.c's fd "quota". And then close
them when actually needing the fds.

Not really suggesting that we go for that, but it does have some appeal.

> I can't see this solution working unless the backend can cleanup properly under exceptional conditions, and decrement the counter of used file handles appropriately. But that's the same requirement that postgres_fdw would also have, right? Would the same mechanism work for both?

We can just throw an error, and all fdw state should get cleaned up. We
can't generally rely on that for plperl, as it IIRC can global state. So
I don't think they're in the same boat.

Greetings,

Andres Freund

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>
Cc:	Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Rémi Zara <remi_zara(at)mac(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject:	Re: pgsql: Add kqueue(2) support to the WaitEventSet API.
Date:	2020-02-24 18:48:36
Message-ID:	14227.1582570116@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	PostgreSQL : PostgreSQL pgsql-hackers

Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com> writes:
>> On Feb 20, 2020, at 8:30 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> This idea doesn't fix every possible problem. For instance, if you
>> have a plperlu function that wants to open a bunch of files, I don't
>> see any easy way to ensure we release VFDs to make that possible.
>> But it's sure an improvement on the status quo.

> I understand that you were using plperlu just as an example, but it got
> me thinking. Could we ship a wrapper using perl's tie() mechanism to
> call a new spi function to report when a file handle is opened and when
> it is closed? Most plperlu functions would not participate, since
> developers will not necessarily know to use the wrapper, but at least
> they could learn about the wrapper and use it as a work-around if this
> becomes a problem for them. Perhaps the same spi function could be used
> by other procedural languages.

Hmm. I had thought briefly about getting plperl to do that automatically
and had concluded that I didn't see a way (though there might be one;
I'm not much of a Perl expert). But if we accept that changes in the
plperl function's source code might be needed, it gets a lot easier,
for sure.

Anyway, the point of the current patch is to provide the mechanism and
use it in a couple of places where we know there's an issue. Improving
the PLs is something that could be added later.

> I can't see this solution working unless the backend can cleanup
> properly under exceptional conditions, and decrement the counter of used
> file handles appropriately. But that's the same requirement that
> postgres_fdw would also have, right? Would the same mechanism work for
> both?

The hard part is to tie into whatever is responsible for closing the
kernel FD. If you can ensure that the FD gets closed, you can put
the ReleaseExternalFD() call at the same place(s).

> Is this too convoluted to be worth the bother?

So far we've not seen field reports of PL functions running out of FDs;
and there's always the ad-hoc solution of making sure the server's
ulimit -n limit is sufficiently larger than max_files_per_process.
So I wouldn't put a lot of effort into it right now. But it's nice
to have an idea about what to do if it does become a hot issue for
somebody.

regards, tom lane

From:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Rémi Zara <remi_zara(at)mac(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject:	Re: pgsql: Add kqueue(2) support to the WaitEventSet API.
Date:	2020-02-24 19:14:53
Message-ID:	20200224191453.GA711@alvherre.pgsql
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	PostgreSQL : PostgreSQL pgsql-hackers

On 2020-Feb-24, Tom Lane wrote:

> We could also consider a HINT, along the lines of "Raise the server's
> max_files_per_process and/or \"ulimit -n\" limits." This again has
> the ambiguity about which server, and it also seems dangerously
> platform-specific.

Maybe talk about "the local server" to distinguish from the other one.

As for the platform dependencies, I see two main options: make the hint
platform-specific (i.e have #ifdef branches per platform) or make it
even more generic, such as "file descriptor limits".

A quick search suggests that current Windows itself doesn't typically
have such problems:
https://stackoverflow.com/questions/31108693/increasing-no-of-file-handles-in-windows-7-64-bit
https://docs.microsoft.com/es-es/archive/blogs/markrussinovich/pushing-the-limits-of-windows-handles

But the C runtime does:
https://docs.microsoft.com/en-us/cpp/c-runtime-library/file-handling?view=vs-2019
I suppose we do use the C runtime. There's a reference to
_setmaxstdio() being able to raise the limit from the default of 512 to
up to 8192 open files. We don't currently invoke that.
https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/setmaxstdio?view=vs-2019

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc:	Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Rémi Zara <remi_zara(at)mac(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject:	Re: pgsql: Add kqueue(2) support to the WaitEventSet API.
Date:	2020-02-24 19:27:48
Message-ID:	15886.1582572468@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	PostgreSQL : PostgreSQL Postg젠 토토SQL :

Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
> On 2020-Feb-24, Tom Lane wrote:
>> We could also consider a HINT, along the lines of "Raise the server's
>> max_files_per_process and/or \"ulimit -n\" limits." This again has
>> the ambiguity about which server, and it also seems dangerously
>> platform-specific.

> Maybe talk about "the local server" to distinguish from the other one.

OK by me.

> As for the platform dependencies, I see two main options: make the hint
> platform-specific (i.e have #ifdef branches per platform) or make it
> even more generic, such as "file descriptor limits".

I thought about platform-specific messages, but it's not clear to me
whether our translation infrastructure will find messages that are
inside #ifdefs ... anyone know? If that does work, I'd be inclined
to mention ulimit -n on non-Windows and just say nothing about that
on Windows. "File descriptor limits" seems too unhelpful here.

regards, tom lane

From:	Andres Freund <andres(at)anarazel(dot)de>
To:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Rémi Zara <remi_zara(at)mac(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject:	Re: pgsql: Add kqueue(2) support to the WaitEventSet API.
Date:	2020-02-24 20:01:19
Message-ID:	20200224200119.x27tfdybwpzde3cy@alap3.anarazel.de
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-committers Postg토토 커뮤니티SQL

Hi,

On 2020-02-24 16:14:53 -0300, Alvaro Herrera wrote:
> But the C runtime does:
> https://docs.microsoft.com/en-us/cpp/c-runtime-library/file-handling?view=vs-2019
> I suppose we do use the C runtime. There's a reference to
> _setmaxstdio() being able to raise the limit from the default of 512 to
> up to 8192 open files. We don't currently invoke that.
> https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/setmaxstdio?view=vs-2019

If we ever go for that, we should also consider raising the limit on
unix systems up to the hard limit when hitting the fd ceiling. I.e. get
the current limit with getrlimit(RLIMIT_NOFILE) and raise rlim_cur
[closer] to rlim_max with setrlimit.

Perhaps it'd even be worthwhile to just always raise the limit, if
possible, in set_max_safe_fds(), by max_safe_fds +
NUM_RESERVED_FDS. That way PLs, other shared libs, would have a more
usualy amount of FDs available. Rather than a fairly small number, but
only when the backend has been running for a while in the right
workload.

Greetings,

Andres Freund

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Andres Freund <andres(at)anarazel(dot)de>
Cc:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Rémi Zara <remi_zara(at)mac(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject:	Re: pgsql: Add kqueue(2) support to the WaitEventSet API.
Date:	2020-02-24 20:07:37
Message-ID:	17681.1582574857@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	PostgreSQL : PostgreSQL pgsql-hackers

Andres Freund <andres(at)anarazel(dot)de> writes:
> On 2020-02-24 16:14:53 -0300, Alvaro Herrera wrote:
>> I suppose we do use the C runtime. There's a reference to
>> _setmaxstdio() being able to raise the limit from the default of 512 to
>> up to 8192 open files. We don't currently invoke that.
>> https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/setmaxstdio?view=vs-2019

> If we ever go for that, we should also consider raising the limit on
> unix systems up to the hard limit when hitting the fd ceiling. I.e. get
> the current limit with getrlimit(RLIMIT_NOFILE) and raise rlim_cur
> [closer] to rlim_max with setrlimit.

I'm disinclined to think we should override the user's wishes in this way.
Especially given PG's proven ability to run the kernel completely out of
file descriptors ...

regards, tom lane

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc:	Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Rémi Zara <remi_zara(at)mac(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject:	Re: pgsql: Add kqueue(2) support to the WaitEventSet API.
Date:	2020-02-24 20:30:21
Message-ID:	18625.1582576221@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-committers Postg토토 꽁 머니SQL

I wrote:
> Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
>> As for the platform dependencies, I see two main options: make the hint
>> platform-specific (i.e have #ifdef branches per platform) or make it
>> even more generic, such as "file descriptor limits".

> I thought about platform-specific messages, but it's not clear to me
> whether our translation infrastructure will find messages that are
> inside #ifdefs ... anyone know?

Oh, but of course it does. So let's do

errdetail("There are too many open files on the local server."),
#ifndef WIN32
errhint("Raise the server's max_files_per_process and/or \"ulimit -n\" limits.")
#else
errhint("Raise the server's max_files_per_process setting.")
#endif

I don't think there's much point in telling Windows users about
_setmaxstdio() here.

regards, tom lane

From:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Rémi Zara <remi_zara(at)mac(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject:	Re: pgsql: Add kqueue(2) support to the WaitEventSet API.
Date:	2020-02-24 20:55:09
Message-ID:	20200224205509.GA5686@alvherre.pgsql
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	PostgreSQL : PostgreSQL Postg토토 사이트 추천SQL

On 2020-Feb-24, Tom Lane wrote:

> I wrote:

> > I thought about platform-specific messages, but it's not clear to me
> > whether our translation infrastructure will find messages that are
> > inside #ifdefs ... anyone know?
>
> Oh, but of course it does. So let's do
>
> errdetail("There are too many open files on the local server."),
> #ifndef WIN32
> errhint("Raise the server's max_files_per_process and/or \"ulimit -n\" limits.")
> #else
> errhint("Raise the server's max_files_per_process setting.")
> #endif

WFM.

> I don't think there's much point in telling Windows users about
> _setmaxstdio() here.

Yeah, telling users to _setmaxstdio() themselves is useless, because
they can't do it; that's something *we* should do. I think the 512
limit is a bit low; why not increase that a little bit? Maybe just to
the Linux default of 1024.

Then again, that would be akin to setrlimit() on Linux. Maybe we can
consider that a separate GUC, in a separate patch, with a
platform-specific default value that just corresponds to the OS's
default, and the user can set to whatever suits them; then we call
either _setmaxstdio() or setrlimit().

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc:	Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Rémi Zara <remi_zara(at)mac(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject:	Re: pgsql: Add kqueue(2) support to the WaitEventSet API.
Date:	2020-02-24 21:01:41
Message-ID:	19960.1582578101@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	PostgreSQL : PostgreSQL pgsql-hackers

Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
> On 2020-Feb-24, Tom Lane wrote:
>> I don't think there's much point in telling Windows users about
>> _setmaxstdio() here.

> Yeah, telling users to _setmaxstdio() themselves is useless, because
> they can't do it; that's something *we* should do. I think the 512
> limit is a bit low; why not increase that a little bit? Maybe just to
> the Linux default of 1024.

> Then again, that would be akin to setrlimit() on Linux. Maybe we can
> consider that a separate GUC, in a separate patch, with a
> platform-specific default value that just corresponds to the OS's
> default, and the user can set to whatever suits them; then we call
> either _setmaxstdio() or setrlimit().

Why not just drive it off max_files_per_process? On Unix, that
largely exists to override the ulimit setting anyway. With no
comparable knob on a Windows system, we might as well just say
that's what you set.

regards, tom lane

From:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Rémi Zara <remi_zara(at)mac(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject:	Re: pgsql: Add kqueue(2) support to the WaitEventSet API.
Date:	2020-02-24 21:17:09
Message-ID:	20200224211709.GA6996@alvherre.pgsql
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	PostgreSQL : PostgreSQL Postg토토 꽁 머니SQL

On 2020-Feb-24, Tom Lane wrote:

> Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:

> > Then again, that would be akin to setrlimit() on Linux. Maybe we can
> > consider that a separate GUC, in a separate patch, with a
> > platform-specific default value that just corresponds to the OS's
> > default, and the user can set to whatever suits them; then we call
> > either _setmaxstdio() or setrlimit().
>
> Why not just drive it off max_files_per_process? On Unix, that
> largely exists to override the ulimit setting anyway. With no
> comparable knob on a Windows system, we might as well just say
> that's what you set.

That makes sense to me -- but if we do that, then maybe we should be
doing the setrlimit() dance on it too, on Linux^W^W where supported.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc:	Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Rémi Zara <remi_zara(at)mac(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject:	Re: pgsql: Add kqueue(2) support to the WaitEventSet API.
Date:	2020-02-24 21:44:27
Message-ID:	5195.1582580667@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	PostgreSQL : PostgreSQL pgsql-hackers

Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
> On 2020-Feb-24, Tom Lane wrote:
>> Why not just drive it off max_files_per_process? On Unix, that
>> largely exists to override the ulimit setting anyway. With no
>> comparable knob on a Windows system, we might as well just say
>> that's what you set.

> That makes sense to me -- but if we do that, then maybe we should be
> doing the setrlimit() dance on it too, on Linux^W^W where supported.

Yeah, arguably we could try to setrlimit if max_files_per_process is
larger than the ulimit. We should definitely not reduce the ulimit
if max_files_per_process is smaller, though, since the DBA might
intentionally be leaving daylight for purposes such as FD-hungry PL
functions. On the whole I'm inclined to leave well enough alone on
the Unix side --- there's nothing there that the DBA can't set if
she wishes.

regards, tom lane

From:	Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
To:	Thomas Munro <tmunro(at)postgresql(dot)org>
Cc:	pgsql-committers <pgsql-committers(at)lists(dot)postgresql(dot)org>
Subject:	Re: pgsql: Add kqueue(2) support to the WaitEventSet API.
Date:	2020-03-16 11:54:43
Message-ID:	CAPpHfdu4XVQxhqyO697U5idP8w8=MrPo=cWah+4DFhmAu018dA@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	PostgreSQL : PostgreSQL pgsql-hackers

On Wed, Feb 5, 2020 at 7:59 AM Thomas Munro <tmunro(at)postgresql(dot)org> wrote:
> Add kqueue(2) support to the WaitEventSet API.
>
> Use kevent(2) to wait for events on the BSD family of operating
> systems and macOS. This is similar to the epoll(2) support added
> for Linux by commit 98a64d0bd.

I'm not sure if it was already reported in this thread (it seems to be
not at the first glance), but I've discovered following issue on macos
10.13.6. If backend is under lldb and does XactLockTableWait(), then
it does proc_exit(1).

The full reproduction case is following.

s1# create table test (id serial primary key, value int);
s1# insert into test values (1,1);
s1# begin;
s1# update test set value = value + 1 where id = 1;

lldb attached to s2: b proc_exit
s2# update test set value = value + 1 where id = 1;

(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 2.1
* frame #0: 0x000000010f9a7f5b postgres`proc_exit(code=1) at ipc.c:107
frame #1: 0x000000010f9aa3b9
postgres`WaitEventSetWaitBlock(set=0x00007fabdd847c90, cur_timeout=-1,
occurred_events=0x00007ffee0796c10, nevents=1) at latch.c:1427
frame #2: 0x000000010f9a9a43
postgres`WaitEventSetWait(set=0x00007fabdd847c90, timeout=-1,
occurred_events=0x00007ffee0796c10, nevents=1,
wait_event_info=50331652) at latch.c:1237
frame #3: 0x000000010f9a93b5
postgres`WaitLatchOrSocket(latch=0x00000001197eacc4, wakeEvents=33,
sock=-1, timeout=-1, wait_event_info=50331652) at latch.c:428
frame #4: 0x000000010f9a91c1
postgres`WaitLatch(latch=0x00000001197eacc4, wakeEvents=33, timeout=0,
wait_event_info=50331652) at latch.c:368
frame #5: 0x000000010f9d65b6
postgres`ProcSleep(locallock=0x00007fabdd01d5d8,
lockMethodTable=0x000000010fdb5cf8) at proc.c:1286
frame #6: 0x000000010f9c2af9
postgres`WaitOnLock(locallock=0x00007fabdd01d5d8,
owner=0x00007fabdf0056d0) at lock.c:1766
frame #7: 0x000000010f9c13a1
postgres`LockAcquireExtended(locktag=0x00007ffee0797140, lockmode=5,
sessionLock=false, dontWait=false, reportMemoryError=true,
locallockp=0x0000000000000000) at lock.c:1048
frame #8: 0x000000010f9c08b5
postgres`LockAcquire(locktag=0x00007ffee0797140, lockmode=5,
sessionLock=false, dontWait=false) at lock.c:713
frame #9: 0x000000010f9bef32 postgres`XactLockTableWait(xid=511,
rel=0x000000011031e148, ctid=0x00007ffee0797394, oper=XLTW_Update) at
lmgr.c:658
frame #10: 0x000000010f4e9cab
postgres`heap_update(relation=0x000000011031e148,
otid=0x00007ffee0797818, newtup=0x00007fabdd847a48, cid=0,
crosscheck=0x0000000000000000, wait=true, tmfd=0x00007ffee07976f0,
lockmode=0x00007ffee07976d8) at heapam.c:3239
frame #11: 0x000000010f4f9353
postgres`heapam_tuple_update(relation=0x000000011031e148,
otid=0x00007ffee0797818, slot=0x00007fabdc828558, cid=0,
snapshot=0x00007fabdc818170, crosscheck=0x0000000000000000, wait=true,
tmfd=0x00007ffee07976f0, lockmode=0x00007ffee07976d8,
update_indexes=0x00007ffee07976d6) at heapam_handler.c:326
frame #12: 0x000000010f7ba73d
postgres`table_tuple_update(rel=0x000000011031e148,
otid=0x00007ffee0797818, slot=0x00007fabdc828558, cid=0,
snapshot=0x00007fabdc818170, crosscheck=0x0000000000000000, wait=true,
tmfd=0x00007ffee07976f0, lockmode=0x00007ffee07976d8,
update_indexes=0x00007ffee07976d6) at tableam.h:1293
frame #13: 0x000000010f7b8952
postgres`ExecUpdate(mtstate=0x00007fabdc826ca8,
tupleid=0x00007ffee0797818, oldtuple=0x0000000000000000,
slot=0x00007fabdc828558, planSlot=0x00007fabdc828408,
epqstate=0x00007fabdc826da0, estate=0x00007fabdc826920,
canSetTag=true) at nodeModifyTable.c:1336
frame #14: 0x000000010f7b6d5a
postgres`ExecModifyTable(pstate=0x00007fabdc826ca8) at
nodeModifyTable.c:2246
frame #15: 0x000000010f780e82
postgres`ExecProcNodeFirst(node=0x00007fabdc826ca8) at
execProcnode.c:444
frame #16: 0x000000010f779332
postgres`ExecProcNode(node=0x00007fabdc826ca8) at executor.h:245
frame #17: 0x000000010f7751b1
postgres`ExecutePlan(estate=0x00007fabdc826920,
planstate=0x00007fabdc826ca8, use_parallel_mode=false,
operation=CMD_UPDATE, sendTuples=false, numberTuples=0,
direction=ForwardScanDirection, dest=0x00007fabdd843840,
execute_once=true) at execMain.c:1646
frame #18: 0x000000010f775072
postgres`standard_ExecutorRun(queryDesc=0x00007fabdd81ff20,
direction=ForwardScanDirection, count=0, execute_once=true) at
execMain.c:364
frame #19: 0x000000010f774e42
postgres`ExecutorRun(queryDesc=0x00007fabdd81ff20,
direction=ForwardScanDirection, count=0, execute_once=true) at
execMain.c:308
frame #20: 0x000000010f9eb63e
postgres`ProcessQuery(plan=0x00007fabdd8447a8, sourceText="update test
set value = value + 1 where id = 1;", params=0x0000000000000000,
queryEnv=0x0000000000000000, dest=0x00007fabdd843840,
qc=0x00007ffee0797d70) at pquery.c:160
frame #21: 0x000000010f9ea71d
postgres`PortalRunMulti(portal=0x00007fabdd823720, isTopLevel=true,
setHoldSnapshot=false, dest=0x00007fabdd843840,
altdest=0x00007fabdd843840, qc=0x00007ffee0797d70) at pquery.c:1265
frame #22: 0x000000010f9e9d92
postgres`PortalRun(portal=0x00007fabdd823720,
count=9223372036854775807, isTopLevel=true, run_once=true,
dest=0x00007fabdd843840, altdest=0x00007fabdd843840,
qc=0x00007ffee0797d70) at pquery.c:779
frame #23: 0x000000010f9e5279
postgres`exec_simple_query(query_string="update test set value = value
+ 1 where id = 1;") at postgres.c:1236
frame #24: 0x000000010f9e43b8 postgres`PostgresMain(argc=1,
argv=0x00007fabdd01fe78, dbname="postgres", username="smagen") at
postgres.c:4295
frame #25: 0x000000010f9147a0
postgres`BackendRun(port=0x00007fabde000320) at postmaster.c:4510
frame #26: 0x000000010f913b9a
postgres`BackendStartup(port=0x00007fabde000320) at postmaster.c:4202
frame #27: 0x000000010f912aea postgres`ServerLoop at postmaster.c:1727
frame #28: 0x000000010f9104fa postgres`PostmasterMain(argc=3,
argv=0x00007fabdbd009b0) at postmaster.c:1400
frame #29: 0x000000010f7fae19 postgres`main(argc=3,
argv=0x00007fabdbd009b0) at main.c:210
frame #30: 0x00007fff69069015 libdyld.dylib`start + 1

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

From:	Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To:	Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
Cc:	Thomas Munro <tmunro(at)postgresql(dot)org>, pgsql-committers <pgsql-committers(at)lists(dot)postgresql(dot)org>
Subject:	Re: pgsql: Add kqueue(2) support to the WaitEventSet API.
Date:	2020-03-16 20:11:22
Message-ID:	CA+hUKGKhAxJ8V8RVwCo6zJaeVrdOG1kFBHGZOOjf6DzW_omeMA@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	PostgreSQL : PostgreSQL pgsql-hackers

On Tue, Mar 17, 2020 at 12:55 AM Alexander Korotkov
<a(dot)korotkov(at)postgrespro(dot)ru> wrote:
> On Wed, Feb 5, 2020 at 7:59 AM Thomas Munro <tmunro(at)postgresql(dot)org> wrote:
> > Add kqueue(2) support to the WaitEventSet API.
> >
> > Use kevent(2) to wait for events on the BSD family of operating
> > systems and macOS. This is similar to the epoll(2) support added
> > for Linux by commit 98a64d0bd.
>
> I'm not sure if it was already reported in this thread (it seems to be
> not at the first glance), but I've discovered following issue on macos
> 10.13.6. If backend is under lldb and does XactLockTableWait(), then
> it does proc_exit(1).

/me digs out a Macintosh

Reproduced here. The problem seems to be that macOS's getppid()
returns the debugger's PID, while the debugger is attached. This
doesn't happen on FreeBSD (even though the debugger does internally
become the parent, getppid() is careful to return the "real" parent
PID so that user space doesn't notice this trickery; apparently Apple
made a different choice).

The getppid() check is there to close a vanishingly rare race
condition: when creating a WaitEventSet, we ask the kernel to tell us
when the postmaster exits, but there is a possibility that the
postmaster has already exited; normally that causes an error with
errno == ESRCH (no such PID, it's already gone), but another unrelated
process might have started that has the same PID, so we check if our
ppid has changed after a successful return code. That's not going to
work under a debugger on this OS.

Looking into some options.

From:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To:	Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc:	Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Thomas Munro <tmunro(at)postgresql(dot)org>, pgsql-committers <pgsql-committers(at)lists(dot)postgresql(dot)org>
Subject:	Re: pgsql: Add kqueue(2) support to the WaitEventSet API.
Date:	2020-03-16 20:30:40
Message-ID:	20200316203040.GA4019@alvherre.pgsql
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-committers Postg스포츠 토토SQL

On 2020-Mar-17, Thomas Munro wrote:

> Reproduced here. The problem seems to be that macOS's getppid()
> returns the debugger's PID, while the debugger is attached. This
> doesn't happen on FreeBSD (even though the debugger does internally
> become the parent, getppid() is careful to return the "real" parent
> PID so that user space doesn't notice this trickery; apparently Apple
> made a different choice).

Wow ... Yeah, that was a known problem with FreeBSD, see
https://postgr.es/m/1292851036-sup-5399@alvh.no-ip.org
Evidently FreeBSD must have fixed it, but macOS has not caught up with
that ...

> The getppid() check is there to close a vanishingly rare race
> condition: when creating a WaitEventSet, we ask the kernel to tell us
> when the postmaster exits, but there is a possibility that the
> postmaster has already exited; normally that causes an error with
> errno == ESRCH (no such PID, it's already gone), but another unrelated
> process might have started that has the same PID, so we check if our
> ppid has changed after a successful return code. That's not going to
> work under a debugger on this OS.

Irk.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

From:	Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc:	Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Thomas Munro <tmunro(at)postgresql(dot)org>, pgsql-committers <pgsql-committers(at)lists(dot)postgresql(dot)org>
Subject:	Re: pgsql: Add kqueue(2) support to the WaitEventSet API.
Date:	2020-03-16 21:21:29
Message-ID:	CA+hUKGKsTnvROYqx4hUjGpwvCTYi+=+da7sTPCw2Mf2Yxfdi4Q@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	PostgreSQL : PostgreSQL Postg스포츠 토토 사이트SQL

On Tue, Mar 17, 2020 at 9:30 AM Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote:
> On 2020-Mar-17, Thomas Munro wrote:
> > Reproduced here. The problem seems to be that macOS's getppid()
> > returns the debugger's PID, while the debugger is attached. This
> > doesn't happen on FreeBSD (even though the debugger does internally
> > become the parent, getppid() is careful to return the "real" parent
> > PID so that user space doesn't notice this trickery; apparently Apple
> > made a different choice).
>
> Wow ... Yeah, that was a known problem with FreeBSD, see
> https://postgr.es/m/1292851036-sup-5399@alvh.no-ip.org
> Evidently FreeBSD must have fixed it, but macOS has not caught up with
> that ...

Oh, interesting. Sorry to bring a variant of this problem back.

> > The getppid() check is there to close a vanishingly rare race
> > condition: when creating a WaitEventSet, we ask the kernel to tell us
> > when the postmaster exits, but there is a possibility that the
> > postmaster has already exited; normally that causes an error with
> > errno == ESRCH (no such PID, it's already gone), but another unrelated
> > process might have started that has the same PID, so we check if our
> > ppid has changed after a successful return code. That's not going to
> > work under a debugger on this OS.
>
> Irk.

I'm now far away from my home Mac so I can't test until later but I
think we can fix this by double checking with the pipe:

- else if (event->events == WL_POSTMASTER_DEATH && PostmasterPid
!= getppid())
+ else if (event->events == WL_POSTMASTER_DEATH &&
+ PostmasterPid != getppid() &&
+ !PostmasterIsAliveInternal())
+ {
+ /*
+ * The extra PostmasterIsAliveInternal() check
prevents false alarms
+ * from systems where getppid() returns a debugger PID
while being
+ * traced.
+ */
set->report_postmaster_not_running = true;
+ }

The fast getppid() check will prevent the slow and redundant
PostmasterIsAliveInternal() check from being reached on production
systems, until the postmaster really is gone in the race scenario
described.

(Note that all of this per-lock-wait work will go away with
https://commitfest.postgresql.org/27/2452/, so I'm glad Alexander
found this now).

From:	Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc:	Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Thomas Munro <tmunro(at)postgresql(dot)org>, pgsql-committers <pgsql-committers(at)lists(dot)postgresql(dot)org>
Subject:	Re: pgsql: Add kqueue(2) support to the WaitEventSet API.
Date:	2020-03-18 00:07:10
Message-ID:	CA+hUKGLEAi_QHzuW=jQeHiO3jfjmMec6jvFWTOhXhQiMO-G7hQ@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	PostgreSQL : PostgreSQL pgsql-hackers

On Tue, Mar 17, 2020 at 10:21 AM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
> I'm now far away from my home Mac so I can't test until later but I
> think we can fix this by double checking with the pipe:

Pushed.

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Thomas Munro <tmunro(at)postgresql(dot)org>, pgsql-committers <pgsql-committers(at)lists(dot)postgresql(dot)org>
Subject:	Re: pgsql: Add kqueue(2) support to the WaitEventSet API.
Date:	2020-03-28 18:43:23
Message-ID:	27779.1585421003@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-committers Postg토토 꽁 머니SQL

Thomas Munro <thomas(dot)munro(at)gmail(dot)com> writes:
> Pushed.

prairiedog just turned up a different issue in this area [1].
I wondered why it hadn't reported in for awhile, and upon
investigation I found that the test run was stuck in the
final pg_dump step of the pg_upgrade test. pg_dump itself
was waiting for a query result, while the connected backend
was sitting here:

(gdb) bt
#0 0x9002ec88 in kevent ()
#1 0x0039cff8 in WaitEventSetWait (set=0x20c502c, timeout=-1, occurred_events=0xbfffdd4c, nevents=1, wait_event_info=100663296) at latch.c:1443
#2 0x00261d98 in secure_read (port=0x2401ba0, ptr=0x713558, len=8192) at be-secure.c:184
#3 0x00269d34 in pq_recvbuf () at pqcomm.c:950
#4 0x00269e24 in pq_getbyte () at pqcomm.c:993
#5 0x003cec2c in PostgresMain (argc=1, argv=0x38060ac, dbname=0x20c5154 "regression", username=0x20c5138 "buildfarm") at postgres.c:337
#6 0x0032de0c in BackendStartup (port=0x2401ba0) at postmaster.c:4510
#7 0x0032fcf8 in PostmasterMain (argc=1585338749, argv=0x5e7e59b9) at postmaster.c:1727
#8 0x0026f32c in main (argc=6, argv=0x24009b0) at main.c:210

It'd appear that we dropped an input-is-available condition.

Now prairiedog is running a museum-grade macOS release, so
it's hardly impossible that this is a kernel bug not a
Postgres bug. But we shouldn't jump to that conclusion,
either, given that our kevent support is so new.

regards, tom lane

[1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=prairiedog&dt=2020-03-27%2018%3A55%3A51
The log shows a SIGABRT trap, but that's because I manually did "kill
-ABRT" to unblock the buildfarm animal.

From:	Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Thomas Munro <tmunro(at)postgresql(dot)org>, pgsql-committers <pgsql-committers(at)lists(dot)postgresql(dot)org>, Rémi Zara <remi_zara(at)mac(dot)com>
Subject:	Re: pgsql: Add kqueue(2) support to the WaitEventSet API.
Date:	2020-03-28 22:25:12
Message-ID:	CA+hUKGLzaR5cV0EmZWoVXJDO_XwZpmpQX_sYwCBRE1qLBEcGPQ@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-committers pgsql-hackers

On Sun, Mar 29, 2020 at 7:43 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Thomas Munro <thomas(dot)munro(at)gmail(dot)com> writes:
> > Pushed.
>
> prairiedog just turned up a different issue in this area [1].
> I wondered why it hadn't reported in for awhile, and upon
> investigation I found that the test run was stuck in the
> final pg_dump step of the pg_upgrade test. pg_dump itself
> was waiting for a query result, while the connected backend
> was sitting here:
>
> (gdb) bt
> #0 0x9002ec88 in kevent ()
> #1 0x0039cff8 in WaitEventSetWait (set=0x20c502c, timeout=-1, occurred_events=0xbfffdd4c, nevents=1, wait_event_info=100663296) at latch.c:1443
> #2 0x00261d98 in secure_read (port=0x2401ba0, ptr=0x713558, len=8192) at be-secure.c:184
> #3 0x00269d34 in pq_recvbuf () at pqcomm.c:950
> #4 0x00269e24 in pq_getbyte () at pqcomm.c:993
> #5 0x003cec2c in PostgresMain (argc=1, argv=0x38060ac, dbname=0x20c5154 "regression", username=0x20c5138 "buildfarm") at postgres.c:337
> #6 0x0032de0c in BackendStartup (port=0x2401ba0) at postmaster.c:4510
> #7 0x0032fcf8 in PostmasterMain (argc=1585338749, argv=0x5e7e59b9) at postmaster.c:1727
> #8 0x0026f32c in main (argc=6, argv=0x24009b0) at main.c:210
>
> It'd appear that we dropped an input-is-available condition.
>
> Now prairiedog is running a museum-grade macOS release, so
> it's hardly impossible that this is a kernel bug not a
> Postgres bug. But we shouldn't jump to that conclusion,
> either, given that our kevent support is so new.

My first thought was that it might have been due to the EV_CLEAR flag
problem discussed elsewhere, but the failing build has commit 9b8aa092
so that's not it.

About the kernel bug hypothesis: I see that the libevent project
doesn't use kqueue on early macOS versions due to some bug that it
tests for that apparently fails on 10.4/kernel 8.11 (what you have
there). Kqueue was added to macOS 10.3 (which pulled a bunch of code
from FreeBSD 5 including this), so in 10.4 I suppose it was still
somewhat new. I also found a few other vague complaints about bugs
from that era including some claims of missing events, but without
conclusions. The kernel source is mirrored on github with change
history[1], but without commit log messages or a public bug tracker
it's practically impossible for a drive-by reader to figure out what
was broken and fixed. That seems like a bit of a wild dino-goose
chase.

Hmm, I see that Remi also runs an ancient PowerPC Mac on macOS
10.5/Darwin 9.8. His build farm animal "locust" hasn't reported in 22
days. Remi, is that animal down for other reasons, or could it be
stuck like this?

Further evidence for a version-specific problem is that there are
surely many in our hacker community working on modern Macs, and I
haven't heard of any problems so far. Of course that doesn't rule
anything out.

[1] https://github.com/apple/darwin-xnu/blob/master/bsd/kern/kern_event.c

From:	Rémi Zara <remi_zara(at)mac(dot)com>
To:	Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Thomas Munro <tmunro(at)postgresql(dot)org>, pgsql-committers <pgsql-committers(at)lists(dot)postgresql(dot)org>
Subject:	Re: pgsql: Add kqueue(2) support to the WaitEventSet API.
Date:	2020-03-30 13:42:59
Message-ID:	7E9D5928-C9AC-4DC6-B7E7-62F40560873D@mac.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-committers pgsql-hackers

> Le 28 mars 2020 à 23:25, Thomas Munro <thomas(dot)munro(at)gmail(dot)com> a écrit :
>
> Hmm, I see that Remi also runs an ancient PowerPC Mac on macOS
> 10.5/Darwin 9.8. His build farm animal "locust" hasn't reported in 22
> days. Remi, is that animal down for other reasons, or could it be
> stuck like this?

Hi,

locust is down, and due to circulation restrictions, I cannot access it for the moment. Sorry.

Rémi