Obsolete coding in fork_process.c

Lists: pgsql-hackers
From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Subject: Obsolete coding in fork_process.c
Date: 2014-05-01 16:13:28
Message-ID: 30040.1398960808@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

fork_process.c quoth:

/*
* Flush stdio channels just before fork, to avoid double-output problems.
* Ideally we'd use fflush(NULL) here, but there are still a few non-ANSI
* stdio libraries out there (like SunOS 4.1.x) that coredump if we do.
* Presently stdout and stderr are the only stdio output channels used by
* the postmaster, so fflush'ing them should be sufficient.
*/
fflush(stdout);
fflush(stderr);

Is there any reason not to change this to just fflush(NULL)? We dropped
support for SunOS 4.1 quite some time ago ...

While it's still true that the postmaster proper doesn't need to fflush
anything but stdout and stderr, this coding seems a bit less than safe
when you consider the possibility of third-party libraries loaded into the
postmaster process.

regards, tom lane


From: Noah Misch <noah(at)leadboat(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: Obsolete coding in fork_process.c
Date: 2014-05-01 18:32:31
Message-ID: 20140501183231.GA1190179@tornado.leadboat.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, May 01, 2014 at 12:13:28PM -0400, Tom Lane wrote:
> fork_process.c quoth:
>
> /*
> * Flush stdio channels just before fork, to avoid double-output problems.
> * Ideally we'd use fflush(NULL) here, but there are still a few non-ANSI
> * stdio libraries out there (like SunOS 4.1.x) that coredump if we do.
> * Presently stdout and stderr are the only stdio output channels used by
> * the postmaster, so fflush'ing them should be sufficient.
> */
> fflush(stdout);
> fflush(stderr);
>
> Is there any reason not to change this to just fflush(NULL)? We dropped
> support for SunOS 4.1 quite some time ago ...

Modern systems have other fflush(NULL) problems:

http://www.nntp.perl.org/group/perl.perl5.porters/2013/09/msg207692.html
http://perl5.git.perl.org/metaconfig.git/blob/master:/U/perl/fflushall.U

--
Noah Misch
EnterpriseDB http://www.enterprisedb.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: Obsolete coding in fork_process.c
Date: 2014-05-01 19:16:28
Message-ID: 4013.1398971788@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Noah Misch <noah(at)leadboat(dot)com> writes:
> On Thu, May 01, 2014 at 12:13:28PM -0400, Tom Lane wrote:
>> Is there any reason not to change this to just fflush(NULL)? We dropped
>> support for SunOS 4.1 quite some time ago ...

> Modern systems have other fflush(NULL) problems:

> http://www.nntp.perl.org/group/perl.perl5.porters/2013/09/msg207692.html
> http://perl5.git.perl.org/metaconfig.git/blob/master:/U/perl/fflushall.U

Fun. I doubt that the postmaster's stdin would ever be a pipe, but
maybe we'd better leave well enough alone. Should update the comment
though.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: Obsolete coding in fork_process.c
Date: 2014-05-01 21:59:08
Message-ID: 28165.1398981548@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I wrote:
> Noah Misch <noah(at)leadboat(dot)com> writes:
>> On Thu, May 01, 2014 at 12:13:28PM -0400, Tom Lane wrote:
>>> Is there any reason not to change this to just fflush(NULL)? We dropped
>>> support for SunOS 4.1 quite some time ago ...

>> Modern systems have other fflush(NULL) problems:

>> http://www.nntp.perl.org/group/perl.perl5.porters/2013/09/msg207692.html
>> http://perl5.git.perl.org/metaconfig.git/blob/master:/U/perl/fflushall.U

> Fun. I doubt that the postmaster's stdin would ever be a pipe, but
> maybe we'd better leave well enough alone. Should update the comment
> though.

However ... after looking around I notice that fflush(NULL) is already
being used in parallel pg_dump and pg_upgrade; and at least in the latter
case I'm afraid to change that because it looks like there are probably
other stdio output files open in the process.

I think we should probably go ahead and switch over to using
fflush(NULL). The capability is required by C89 for crissake;
are we really going to rely on hearsay evidence that there are
current platforms that are still broken? Also, I found at least
one claim on the net that this has been fixed in Solaris for awhile:
http://grokbase.com/t/perl/perl5-porters/021hn4z9pq/fflush-null-on-solaris-9

Attached is a draft patch that I was working on before I decided that
making the indicated changes in pg_upgrade was probably a bad idea.
This is mainly to memorialize the places we should look at if we
change it.

BTW, while working on this I noticed that there are a boatload of places
where we use system() or popen() without a prior fflush. I suspect most
of them are safe, or we'd have heard more complaints --- but shouldn't
we clamp down on that?

regards, tom lane

Attachment Content-Type Size
fflush-null-update.patch text/x-diff 9.0 KB

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Noah Misch <noah(at)leadboat(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Obsolete coding in fork_process.c
Date: 2014-05-01 23:43:45
Message-ID: CA+TgmoawJqZizjowi+QcUCXsrKMvhud=iw7zQpH6AjAZ=yWiOw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, May 1, 2014 at 5:59 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> I wrote:
>> Noah Misch <noah(at)leadboat(dot)com> writes:
>>> On Thu, May 01, 2014 at 12:13:28PM -0400, Tom Lane wrote:
>>>> Is there any reason not to change this to just fflush(NULL)? We dropped
>>>> support for SunOS 4.1 quite some time ago ...
>
>>> Modern systems have other fflush(NULL) problems:
>
>>> http://www.nntp.perl.org/group/perl.perl5.porters/2013/09/msg207692.html
>>> http://perl5.git.perl.org/metaconfig.git/blob/master:/U/perl/fflushall.U
>
>> Fun. I doubt that the postmaster's stdin would ever be a pipe, but
>> maybe we'd better leave well enough alone. Should update the comment
>> though.
>
> However ... after looking around I notice that fflush(NULL) is already
> being used in parallel pg_dump and pg_upgrade; and at least in the latter
> case I'm afraid to change that because it looks like there are probably
> other stdio output files open in the process.
>
> I think we should probably go ahead and switch over to using
> fflush(NULL). The capability is required by C89 for crissake;
> are we really going to rely on hearsay evidence that there are
> current platforms that are still broken? Also, I found at least
> one claim on the net that this has been fixed in Solaris for awhile:
> http://grokbase.com/t/perl/perl5-porters/021hn4z9pq/fflush-null-on-solaris-9

I am having a hard time understanding what the real impetus to change
this is. IIUC, we have zero reports of the current coding being a
problem, so I'm not sure why we want to go make it different. Sure,
there are hypothetical problems with the current code, but there are
also hypothetical problems with the proposed change, and the current
code has survived quite a bit of real-world deployment.

I guess it's hard for me to be dead-set against this if we're using
that pattern elsewhere, but I won't be very surprised if more weird
cases where it doesn't work turn up down the road; and I'm a bit
worried that if we let it proliferate we'll find it hard to get rid of
if and when those reports turn up.

> Attached is a draft patch that I was working on before I decided that
> making the indicated changes in pg_upgrade was probably a bad idea.
> This is mainly to memorialize the places we should look at if we
> change it.
>
> BTW, while working on this I noticed that there are a boatload of places
> where we use system() or popen() without a prior fflush. I suspect most
> of them are safe, or we'd have heard more complaints --- but shouldn't
> we clamp down on that?

I think that would probably be a good idea. I wouldn't be shocked if
there are problems there that we've failed to notice.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Noah Misch <noah(at)leadboat(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: Obsolete coding in fork_process.c
Date: 2014-05-02 00:00:53
Message-ID: 20140502000053.GA1191730@tornado.leadboat.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, May 01, 2014 at 05:59:08PM -0400, Tom Lane wrote:
> I wrote:
> > Noah Misch <noah(at)leadboat(dot)com> writes:
> >> Modern systems have other fflush(NULL) problems:
>
> >> http://www.nntp.perl.org/group/perl.perl5.porters/2013/09/msg207692.html
> >> http://perl5.git.perl.org/metaconfig.git/blob/master:/U/perl/fflushall.U
>
> > Fun. I doubt that the postmaster's stdin would ever be a pipe, but
> > maybe we'd better leave well enough alone. Should update the comment
> > though.
>
> However ... after looking around I notice that fflush(NULL) is already
> being used in parallel pg_dump and pg_upgrade; and at least in the latter
> case I'm afraid to change that because it looks like there are probably
> other stdio output files open in the process.

Do those programs, operating in those modes, read from stdin or some other
long-lived, pipe-backed FILE*?

> BTW, while working on this I noticed that there are a boatload of places
> where we use system() or popen() without a prior fflush. I suspect most
> of them are safe, or we'd have heard more complaints --- but shouldn't
> we clamp down on that?

You need the fflush() when forking and then using stdio in the child before
any exec(). Have you caught wind of any system() or popen() implementation
having that property?

--
Noah Misch
EnterpriseDB http://www.enterprisedb.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Obsolete coding in fork_process.c
Date: 2014-05-02 00:44:46
Message-ID: 8749.1398991486@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Noah Misch <noah(at)leadboat(dot)com> writes:
> On Thu, May 01, 2014 at 05:59:08PM -0400, Tom Lane wrote:
>> However ... after looking around I notice that fflush(NULL) is already
>> being used in parallel pg_dump and pg_upgrade; and at least in the latter
>> case I'm afraid to change that because it looks like there are probably
>> other stdio output files open in the process.

> Do those programs, operating in those modes, read from stdin or some other
> long-lived, pipe-backed FILE*?

Probably not. However, if that's your criterion, I'd say that I'd much
rather assume that the postmaster doesn't have a pipe connected to stdin
than that it has no stdio outputs besides stdout/stderr.

>> BTW, while working on this I noticed that there are a boatload of places
>> where we use system() or popen() without a prior fflush. I suspect most
>> of them are safe, or we'd have heard more complaints --- but shouldn't
>> we clamp down on that?

> You need the fflush() when forking and then using stdio in the child before
> any exec(). Have you caught wind of any system() or popen() implementation
> having that property?

You're only considering one aspect of the problem. Yeah, you might not
get duplicated output unless system() prints something before exec(),
but we would also like to have causality: that is, whatever we sent to
stdout before calling system() should appear there before anything the
child process sends to stdout.

regards, tom lane


From: Noah Misch <noah(at)leadboat(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Obsolete coding in fork_process.c
Date: 2014-05-02 01:24:56
Message-ID: 20140502012456.GB1191730@tornado.leadboat.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, May 01, 2014 at 08:44:46PM -0400, Tom Lane wrote:
> Noah Misch <noah(at)leadboat(dot)com> writes:
> > On Thu, May 01, 2014 at 05:59:08PM -0400, Tom Lane wrote:
> >> However ... after looking around I notice that fflush(NULL) is already
> >> being used in parallel pg_dump and pg_upgrade; and at least in the latter
> >> case I'm afraid to change that because it looks like there are probably
> >> other stdio output files open in the process.
>
> > Do those programs, operating in those modes, read from stdin or some other
> > long-lived, pipe-backed FILE*?
>
> Probably not.

Then the use of fflush(NULL) in those programs tells us nothing about the
status of the aforementioned HP-UX bug.

> However, if that's your criterion, I'd say that I'd much
> rather assume that the postmaster doesn't have a pipe connected to stdin
> than that it has no stdio outputs besides stdout/stderr.

Either way, the beneficiary is theoretical; who can tell which one deserves
priority? Let's do what we usually do in the absence of a clear improvement:
keep the longstanding behavior.

> >> BTW, while working on this I noticed that there are a boatload of places
> >> where we use system() or popen() without a prior fflush. I suspect most
> >> of them are safe, or we'd have heard more complaints --- but shouldn't
> >> we clamp down on that?
>
> > You need the fflush() when forking and then using stdio in the child before
> > any exec(). Have you caught wind of any system() or popen() implementation
> > having that property?
>
> You're only considering one aspect of the problem. Yeah, you might not
> get duplicated output unless system() prints something before exec(),
> but we would also like to have causality: that is, whatever we sent to
> stdout before calling system() should appear there before anything the
> child process sends to stdout.

Good point. I suppose a couple of fflush() calls have negligible cost next to
a system() or popen(). Introduce pg_popen()/pg_system(), and adopt a rule
that they are [almost] our only callers of raw popen()/system()?

--
Noah Misch
EnterpriseDB http://www.enterprisedb.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Obsolete coding in fork_process.c
Date: 2014-05-02 03:07:51
Message-ID: 11846.1399000071@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Noah Misch <noah(at)leadboat(dot)com> writes:
> On Thu, May 01, 2014 at 08:44:46PM -0400, Tom Lane wrote:
>> You're only considering one aspect of the problem. Yeah, you might not
>> get duplicated output unless system() prints something before exec(),
>> but we would also like to have causality: that is, whatever we sent to
>> stdout before calling system() should appear there before anything the
>> child process sends to stdout.

> Good point. I suppose a couple of fflush() calls have negligible cost next to
> a system() or popen(). Introduce pg_popen()/pg_system(), and adopt a rule
> that they are [almost] our only callers of raw popen()/system()?

Meh. I'm not usually in favor of adopting nonstandard notation, and
this doesn't seem like a place to start. In particular, if you don't
want to use fflush(NULL) in these proposed wrappers, then call sites
are still going to have an issue with needing to do manual fflushes;
pg_regress.c's spawn_process is an example:

/*
* Must flush I/O buffers before fork. Ideally we'd use fflush(NULL) here
* ... does anyone still care about systems where that doesn't work?
*/
fflush(stdout);
fflush(stderr);
if (logfile)
fflush(logfile);

pid = fork();

I think that removing the need for fflush(stdout) and fflush(stderr)
in this context would mostly result in people forgetting to fflush
other output files. I'd rather have the two lines of boilerplate
(and a comment about why we're refusing to depend on fflush(NULL))
than take that risk.

Now, if you were willing to put fflush(NULL) into the wrappers,
I could get on board with that ;-)

regards, tom lane


From: Noah Misch <noah(at)leadboat(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Obsolete coding in fork_process.c
Date: 2014-05-02 14:28:49
Message-ID: 20140502142849.GA1219884@tornado.leadboat.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, May 01, 2014 at 11:07:51PM -0400, Tom Lane wrote:
> Noah Misch <noah(at)leadboat(dot)com> writes:
> > On Thu, May 01, 2014 at 08:44:46PM -0400, Tom Lane wrote:
> >> You're only considering one aspect of the problem. Yeah, you might not
> >> get duplicated output unless system() prints something before exec(),
> >> but we would also like to have causality: that is, whatever we sent to
> >> stdout before calling system() should appear there before anything the
> >> child process sends to stdout.
>
> > Good point. I suppose a couple of fflush() calls have negligible cost next to
> > a system() or popen(). Introduce pg_popen()/pg_system(), and adopt a rule
> > that they are [almost] our only callers of raw popen()/system()?
>
> Meh. I'm not usually in favor of adopting nonstandard notation, and
> this doesn't seem like a place to start. In particular, if you don't
> want to use fflush(NULL) in these proposed wrappers, then call sites
> are still going to have an issue with needing to do manual fflushes;
> pg_regress.c's spawn_process is an example:
>
> /*
> * Must flush I/O buffers before fork. Ideally we'd use fflush(NULL) here
> * ... does anyone still care about systems where that doesn't work?
> */
> fflush(stdout);
> fflush(stderr);
> if (logfile)
> fflush(logfile);
>
> pid = fork();
>
> I think that removing the need for fflush(stdout) and fflush(stderr)
> in this context would mostly result in people forgetting to fflush
> other output files. I'd rather have the two lines of boilerplate
> (and a comment about why we're refusing to depend on fflush(NULL))
> than take that risk.

Works for me.

--
Noah Misch
EnterpriseDB http://www.enterprisedb.com