Re: Leaking regexp_replace in 9.3.1 ? (was: [HACKERSUninterruptable regexp_replace in 9.3.1 ?)

Lists: pgsql-bugspgsql-hackers
From: Sandro Santilli <strk(at)keybit(dot)net>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Uninterruptable regexp_replace in 9.3.1 ?
Date: 2014-02-21 09:17:59
Message-ID: 20140221091759.GK6924@localhost
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

The following snippet reveals that 9.3.1 has a bug
in regexp_matches, which uninterruptably keeps CPU
spinning for minutes:

-----8<---------------------------------------------------

\timing
SET statement_timeout = 2;
-- this is only to show statement_timeout is effective here
SELECT count(*) from generate_series(1, 100000);
-- this one is uninterruptable!
SELECT regexp_matches($INPUT$

/a
$b$
$c$d
;
$INPUT$,
$REG$((?:[^'"$;]+|"[^"]*"|'(?:[^']*|'')*'|(\$[^$]*\$).*?\2)+)$REG$, 'g' );

-----8<---------------------------------------------------

The above has been tested to be harmless with PostgreSQL 9.1.11
in that the regexp_matches call is interrupted, but it is NOT
with PostgreSQL 9.3.1.

Is it a known bug ?

Please include my address in replies as I don't get notified
of list activity. Thanks.

--strk;

()  ASCII ribbon campaign -- Keep it simple !
/\  http://strk.keybit.net/rants/ascii_mails.txt


From: Sandro Santilli <strk(at)keybit(dot)net>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Uninterruptable regexp_replace in 9.3.1 ?
Date: 2014-02-21 10:27:13
Message-ID: 20140221102713.GN6924@localhost
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

I've just tested 9.3.3 and it is _also_ affected.
Should I report the regression somewhere else ?

--strk;

On Fri, Feb 21, 2014 at 10:17:59AM +0100, Sandro Santilli wrote:
> The following snippet reveals that 9.3.1 has a bug
> in regexp_matches, which uninterruptably keeps CPU
> spinning for minutes:
>
> -----8<---------------------------------------------------
>
> \timing
> SET statement_timeout = 2;
> -- this is only to show statement_timeout is effective here
> SELECT count(*) from generate_series(1, 100000);
> -- this one is uninterruptable!
> SELECT regexp_matches($INPUT$
>
>
>
>
>
>
>
>
> /a
> $b$
> $c$d
> ;
> $INPUT$,
> $REG$((?:[^'"$;]+|"[^"]*"|'(?:[^']*|'')*'|(\$[^$]*\$).*?\2)+)$REG$, 'g' );
>
> -----8<---------------------------------------------------
>
> The above has been tested to be harmless with PostgreSQL 9.1.11
> in that the regexp_matches call is interrupted, but it is NOT
> with PostgreSQL 9.3.1.
>
> Is it a known bug ?
>
> Please include my address in replies as I don't get notified
> of list activity. Thanks.
>
> --strk;
>
> ()  ASCII ribbon campaign -- Keep it simple !
> /\  http://strk.keybit.net/rants/ascii_mails.txt


From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org, strk(at)keybit(dot)net
Subject: Re: Uninterruptable regexp_replace in 9.3.1 ?
Date: 2014-02-21 15:46:45
Message-ID: 530774E5.5040105@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

On 02/21/2014 05:17 PM, Sandro Santilli wrote:
> The following snippet reveals that 9.3.1 has a bug
> in regexp_matches, which uninterruptably keeps CPU
> spinning for minutes:

Huh. So it does. That's interesting.

(You should generally report things to pgsql-bugs(at)postgresql(dot)org btw,
not -hackers)

Looks like it's busily looping within the regex.c code, never hitting a
CHECK_FOR_INTERRUPTS.

The real question IMO is why it's taking so long. It looks like
cfindloop(...) is being called multiple times, with each call taking a
couple of seconds.

A profile of the run is attached. I don't expect to have a chance to dig
into this right away, as I haven't touched the regexp code before and
would need to spend a bit of time studying it to achieve anything.
Hopefully the test, confirmation, and profile is useful.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
perf.report text/plain 10.6 KB

From: Florian Pflug <fgp(at)phlo(dot)org>
To: Craig Ringer <craig(at)2ndquadrant(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Sandro Santilli <strk(at)keybit(dot)net>
Subject: Re: Uninterruptable regexp_replace in 9.3.1 ?
Date: 2014-02-21 16:04:04
Message-ID: D4594E29-6A28-43C0-BD33-FC3EE8A23683@phlo.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

On Feb21, 2014, at 16:46 , Craig Ringer <craig(at)2ndquadrant(dot)com> wrote:
> The real question IMO is why it's taking so long. It looks like
> cfindloop(...) is being called multiple times, with each call taking a
> couple of seconds.

Yeah, I wondered about this too. I've shortened the example a bit - here
are a few observations

postgres=# select regexp_matches(' $a$b$c$',
$REG$((?:[^'"$;]+|"[^"]*"|'(?:[^']*|'')*'|(\$[^$]*\$).*\2)+)$REG$, 'g');
Time: 33.026 ms

postgres=# select regexp_matches(' $a$b$c$',
$REG$((?:[^'"$;]+|"[^"]*"|'(?:[^']*|'')*'|(\$[^$]*\$).*\2)+)$REG$, 'g');
Time: 60.594 ms

postgres=# select regexp_matches(' $a$b$c$',
$REG$((?:[^'"$;]+|"[^"]*"|'(?:[^']*|'')*'|(\$[^$]*\$).*\2)+)$REG$, 'g');
Time: 114.410 ms

postgres=# select regexp_matches(' $a$b$c$',
$REG$((?:[^'"$;]+|"[^"]*"|'(?:[^']*|'')*'|(\$[^$]*\$).*\2)+)$REG$, 'g');
Time: 227.467 ms

postgres=# select regexp_matches(' $a$b$c$',
$REG$((?:[^'"$;]+|"[^"]*"|'(?:[^']*|'')*'|(\$[^$]*\$).*\2)+)$REG$, 'g');
Time: 452.739 ms

postgres=# select regexp_matches(' $a$b$c$',
$REG$((?:[^'"$;]+|"[^"]*"|'(?:[^']*|'')*'|(\$[^$]*\$).*\2)+)$REG$, 'g');
Time: 943.098 ms

postgres=# select regexp_matches(' $a$b$c$d$e$',
$REG$((?:[^'"$;]+|"[^"]*"|'(?:[^']*|'')*'|(\$[^$]*\$).*\2)+)$REG$, 'g');
Time: 200.795 ms

postgres=# select regexp_matches(' $a$b$c$d$e$f$',
$REG$((?:[^'"$;]+|"[^"]*"|'(?:[^']*|'')*'|(\$[^$]*\$).*\2)+)$REG$, 'g');
Time: 298.264 ms

postgres=# select regexp_matches(' $a$b$c$d$e$f$g$',
$REG$((?:[^'"$;]+|"[^"]*"|'(?:[^']*|'')*'|(\$[^$]*\$).*\2)+)$REG$, 'g');
Time: 444.219 ms

postgres=# select regexp_matches(' $a$b$c$d$e$f$g$h$',
$REG$((?:[^'"$;]+|"[^"]*"|'(?:[^']*|'')*'|(\$[^$]*\$).*\2)+)$REG$, 'g');
Time: 696.137 ms

postgres=# select regexp_matches(' $a$b$c$d$e$f$g$h$i$',
$REG$((?:[^'"$;]+|"[^"]*"|'(?:[^']*|'')*'|(\$[^$]*\$).*\2)+)$REG$, 'g');
Time: 974.502 ms

postgres=# select regexp_matches(' $a$b$c$d$e$f$g$h$i$j$',
$REG$((?:[^'"$;]+|"[^"]*"|'(?:[^']*|'')*'|(\$[^$]*\$).*\2)+)$REG$, 'g');
Time: 1369.703 ms

postgres=# select regexp_matches(' $a$b$c$d$e$f$g$h$i$j$',
$REG$((?:[^'"$;]+|"[^"]*"|'(?:[^']*|'')*'|(\$[^$]*\$).*\2)+)$REG$, 'g');
Time: 2747.766 ms

In other words, the observes runtime is roughly 2^i * 1.5^j for inputs
consiting of i leading spaces (any character will probably do) followed
by j substring of the form $X$ (X is an arbitrary character).

best regards,
Florian Pflug


From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: Florian Pflug <fgp(at)phlo(dot)org>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Sandro Santilli <strk(at)keybit(dot)net>
Subject: Re: Uninterruptable regexp_replace in 9.3.1 ?
Date: 2014-02-21 16:29:33
Message-ID: 53077EED.2000907@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

On 02/22/2014 12:04 AM, Florian Pflug wrote:
> On Feb21, 2014, at 16:46 , Craig Ringer <craig(at)2ndquadrant(dot)com> wrote:
>> The real question IMO is why it's taking so long. It looks like
>> cfindloop(...) is being called multiple times, with each call taking a
>> couple of seconds.
>
> Yeah, I wondered about this too. I've shortened the example a bit - here
> are a few observations
>
> [snip]
>
> In other words, the observes runtime is roughly 2^i * 1.5^j for inputs
> consiting of i leading spaces (any character will probably do) followed
> by j substring of the form $X$ (X is an arbitrary character).

The biggest change in regexp support in the introduction of proper
unicode support, but that was before 9.1.

The problem report claims that the issue does not occur on 9.1, but yet:

git diff REL9_1_STABLE master -- ./src/backend/utils/adt/regexp.c

is utterly trivial; a copyright date line change, and 1609797c which
just tweaks the includes. 9.0 has a much bigger diff.

So I'd like to confirm that this issue doesn't affect 9.1. I can
reproduce the issue againts 9.2. I don't have 9.1 or older lying around
to test against right this second.

Sandro, can you please provide the output of "SELECT version()" from a
PostgreSQL version that is not slow with this query?

(BTW, I'm highly amused by how the development style has changed around
here. From "git blame", this is from 1997:

"I haven't actually measured the speed improvement, but it `looks' a lot
quicker visually when watching regression test output."

Ha.)

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Florian Pflug <fgp(at)phlo(dot)org>
To: Craig Ringer <craig(at)2ndQuadrant(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Sandro Santilli <strk(at)keybit(dot)net>
Subject: Re: Uninterruptable regexp_replace in 9.3.1 ?
Date: 2014-02-21 16:56:04
Message-ID: 6EF2A961-79C6-470B-BC2C-AA6061D07CE4@phlo.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

On Feb21, 2014, at 17:29 , Craig Ringer <craig(at)2ndQuadrant(dot)com> wrote:
> The problem report claims that the issue does not occur on 9.1, but yet:
>
> git diff REL9_1_STABLE master -- ./src/backend/utils/adt/regexp.c
>
> is utterly trivial; a copyright date line change, and 1609797c which
> just tweaks the includes. 9.0 has a much bigger diff.

On 9.1.12:

postgres=# select regexp_matches(' $a$b$c$d$e$f$g$h$i$j$',
$REG$((?:[^'"$;]+|"[^"]*"|'(?:[^']*|'')*'|(\$[^$]*\$).*\2)+)$REG$, 'g');
regexp_matches
---------------------------------------------
{" $a$b$c$d$e$f$g$h$i$j",NULL}
(1 row)
Time: 1.048 ms

On HEAD

postgres=# select regexp_matches(' $a$b$c$d$e$f$g$h$i$j$',
$REG$((?:[^'"$;]+|"[^"]*"|'(?:[^']*|'')*'|(\$[^$]*\$).*\2)+)$REG$, 'g');
regexp_matches
-------------------------
{" ",NULL}
{a,NULL}
{b,NULL}
{c,NULL}
{d,NULL}
{e,NULL}
{f,NULL}
{g,NULL}
{h,NULL}
{i,NULL}
{j,NULL}
(11 rows)
Time: 4787.239 ms

Aha! Since we go rid of regex_flavor pre-9.1, I don't have an immediate suspect...

best regards,
Florian Pflug


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Craig Ringer <craig(at)2ndquadrant(dot)com>
Cc: Florian Pflug <fgp(at)phlo(dot)org>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Sandro Santilli <strk(at)keybit(dot)net>
Subject: Re: Uninterruptable regexp_replace in 9.3.1 ?
Date: 2014-02-21 17:03:33
Message-ID: 3663.1393002213@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

Craig Ringer <craig(at)2ndquadrant(dot)com> writes:
> So I'd like to confirm that this issue doesn't affect 9.1.

It doesn't. I suspect it has something to do with 173e29aa5 or one
of the nearby commits in backend/regex/.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Craig Ringer <craig(at)2ndquadrant(dot)com>, Florian Pflug <fgp(at)phlo(dot)org>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Sandro Santilli <strk(at)keybit(dot)net>
Subject: Re: Uninterruptable regexp_replace in 9.3.1 ?
Date: 2014-02-22 01:17:36
Message-ID: 646.1393031856@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

I wrote:
> Craig Ringer <craig(at)2ndquadrant(dot)com> writes:
>> So I'd like to confirm that this issue doesn't affect 9.1.

> It doesn't. I suspect it has something to do with 173e29aa5 or one
> of the nearby commits in backend/regex/.

Indeed, git bisect fingers that commit as introducing the problem.

What seems to be happening is that citerdissect() is trying some
combinatorially large number of ways to split the input string into
substrings that can satisfy the argument of the outer "+" iterator.
It keeps failing on the substring starting with the first '$', and
then vainly looking for other splits that dodge the problem.

I'm not entirely sure how come the previous coding didn't fall into
the same problem. It's quite possible Henry Spencer is smarter than
I am, but there was certainly nothing there before that was obviously
avoiding this hazard.

Worthy of note is that I think pre-9.2 is actually giving the wrong
answer --- it's claiming the whole string matches the regex,
which it does not if I'm reading it right. This may be related to
the problem that commit 173e29aa5 was trying to fix, ie failure to
enforce backref matches in some cases. So one possible theory is
that by failing to notice that it *didn't* have a valid match,
the old code accidentally failed to go down the rabbit hole of trying
zillions of other ways to match.

regards, tom lane


From: Sandro Santilli <strk(at)keybit(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Craig Ringer <craig(at)2ndquadrant(dot)com>, Florian Pflug <fgp(at)phlo(dot)org>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Uninterruptable regexp_replace in 9.3.1 ?
Date: 2014-02-26 15:01:13
Message-ID: 20140226150113.GD9740@localhost
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

On Fri, Feb 21, 2014 at 08:17:36PM -0500, Tom Lane wrote:
> I wrote:
> > Craig Ringer <craig(at)2ndquadrant(dot)com> writes:
> >> So I'd like to confirm that this issue doesn't affect 9.1.
>
> > It doesn't. I suspect it has something to do with 173e29aa5 or one
> > of the nearby commits in backend/regex/.
>
> Indeed, git bisect fingers that commit as introducing the problem.

Is there anything I can do for this to not fade to oblivion ?
Will a mail to pgsql-bugs help ?

--strk;

()  ASCII ribbon campaign -- Keep it simple !
/\  http://strk.keybit.net/rants/ascii_mails.txt


From: Sandro Santilli <strk(at)keybit(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Florian Pflug <fgp(at)phlo(dot)org>, pgsql-bugs(at)postgresql(dot)org
Subject: Leaking regexp_replace in 9.3.1 ? (was: [HACKERSUninterruptable regexp_replace in 9.3.1 ?)
Date: 2014-03-18 17:04:01
Message-ID: 20140318170401.GE29859@localhost
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

On Wed, Feb 26, 2014 at 04:01:13PM +0100, Sandro Santilli wrote:
> On Fri, Feb 21, 2014 at 08:17:36PM -0500, Tom Lane wrote:
> > I wrote:
> > > Craig Ringer <craig(at)2ndquadrant(dot)com> writes:
> > >> So I'd like to confirm that this issue doesn't affect 9.1.
> >
> > > It doesn't. I suspect it has something to do with 173e29aa5 or one
> > > of the nearby commits in backend/regex/.
> >
> > Indeed, git bisect fingers that commit as introducing the problem.
>
> Is there anything I can do for this to not fade to oblivion ?
> Will a mail to pgsql-bugs help ?

Tom: I saw you pushed a fix for the interruptability with this:
https://github.com/postgres/postgres/commit/f5f21315d25ffcbfe7c6a3fa6ffaad54d31bcde0

But we also noticed a memory leak in the regepx_replace call, did you notice
that in your tests ? Same regexp as reported. Do you need another testcase ?

--strk;

()  ASCII ribbon campaign -- Keep it simple !
/\  http://strk.keybit.net/rants/ascii_mails.txt


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Sandro Santilli <strk(at)keybit(dot)net>
Cc: Craig Ringer <craig(at)2ndquadrant(dot)com>, Florian Pflug <fgp(at)phlo(dot)org>, pgsql-bugs(at)postgresql(dot)org
Subject: Re: Leaking regexp_replace in 9.3.1 ? (was: [HACKERSUninterruptable regexp_replace in 9.3.1 ?)
Date: 2014-03-18 23:41:32
Message-ID: 32100.1395186092@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

Sandro Santilli <strk(at)keybit(dot)net> writes:
> Tom: I saw you pushed a fix for the interruptability with this:
> https://github.com/postgres/postgres/commit/f5f21315d25ffcbfe7c6a3fa6ffaad54d31bcde0

> But we also noticed a memory leak in the regepx_replace call, did you notice
> that in your tests ? Same regexp as reported. Do you need another testcase ?

I see no memory growth while running your original example against HEAD.

regards, tom lane


From: Sandro Santilli <strk(at)keybit(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Craig Ringer <craig(at)2ndquadrant(dot)com>, Florian Pflug <fgp(at)phlo(dot)org>, pgsql-bugs(at)postgresql(dot)org
Subject: Re: Leaking regexp_replace in 9.3.1 ? (was: [HACKERSUninterruptable regexp_replace in 9.3.1 ?)
Date: 2014-03-19 07:53:59
Message-ID: 20140319075359.GA4042@localhost
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

On Tue, Mar 18, 2014 at 07:41:32PM -0400, Tom Lane wrote:
> Sandro Santilli <strk(at)keybit(dot)net> writes:
> > Tom: I saw you pushed a fix for the interruptability with this:
> > https://github.com/postgres/postgres/commit/f5f21315d25ffcbfe7c6a3fa6ffaad54d31bcde0
>
> > But we also noticed a memory leak in the regepx_replace call, did you notice
> > that in your tests ? Same regexp as reported. Do you need another testcase ?
>
> I see no memory growth while running your original example against HEAD.

This is what I'm running:

SELECT regexp_matches('test', $REG$((?:[^'"$;]+|"[^"]*"|'(?:[^']*|'')*'|(\$[^$]*\$).*?\2)+)$REG$, 'g' );

And this is what I see as the process memory:

PID TTY STAT TIME MAJFL TRS DRS RSS %MEM COMMAND
5359 ? Ss 0:00 0 5693 172446 2908 0.0 postgres: strk strk [local] idle
5359 ? Ss 0:00 0 5693 172718 4224 0.0 postgres: strk strk [local] idle
5359 ? Ss 0:00 0 5693 173250 4488 0.0 postgres: strk strk [local] idle
5359 ? Ss 0:00 0 5693 173670 5016 0.0 postgres: strk strk [local] idle
5359 ? Ss 0:00 0 5693 174070 5280 0.0 postgres: strk strk [local] idle
5359 ? Ss 0:00 0 5693 174494 5544 0.0 postgres: strk strk [local] idle
5359 ? Ss 0:00 0 5693 174762 5808 0.0 postgres: strk strk [local] idle
5359 ? Ss 0:00 0 5693 175690 6600 0.0 postgres: strk strk [local] idle
5359 ? Ss 0:00 0 5693 179006 9240 0.0 postgres: strk strk [local] idle

This is as of commit 63817f86b57fc3d29b57787bca9d786218b7ee25

Or, simpler, from this state:

PID TTY STAT TIME MAJFL TRS DRS RSS %MEM COMMAND
5359 ? Ss 0:00 0 5693 1274962 869832 5.3 postgres: strk strk [local] idle

I run this:

SELECT count(*) from (
select regexp_matches( 'test', $REG$((?:[^'"$;]+|"[^"]*"|'(?:[^']*|'')*'|(\$[^$]*\$).*?\2)+)$REG$, 'g' )
FROM generate_series(1, 10000)
) as f;

And get here:

PID TTY STAT TIME MAJFL TRS DRS RSS %MEM COMMAND
5359 ? Ss 0:00 0 5693 1274962 869832 5.3 postgres: strk strk [local] idle
5359 ? Rs 0:00 0 5693 1555894 1090536 6.6 postgres: strk strk [local] SELECT
5359 ? Ss 0:00 0 5693 2269682 1650744 10.0 postgres: strk strk [local] idle

The memory is only released on session close.
This is with generate_series going up to 1e5:

PID TTY STAT TIME MAJFL TRS DRS RSS %MEM COMMAND
5359 ? Ss 0:00 0 5693 2269682 1650744 10.0 postgres: strk strk [local] idle
5359 ? Rs 0:00 0 5693 2300726 1675560 10.2 postgres: strk strk [local] SELECT
5359 ? Rs 0:01 0 5693 6438626 4923024 30.0 postgres: strk strk [local] SELECT
5359 ? Rs 0:02 0 5693 10591778 8182368 50.0 postgres: strk strk [local] SELECT
5359 ? Ss 0:03 0 5693 12216566 9457488 57.8 postgres: strk strk [local] idle

Up to 1e6 the story ends with a crash, after:

PID TTY STAT TIME MAJFL TRS DRS RSS %MEM COMMAND
5359 ? Ss 0:03 0 5693 12216566 9457488 57.8 postgres: strk strk [local] idle
5359 ? Rs 0:04 0 5693 15666674 12166392 74.3 postgres: strk strk [local] SELECT
5359 ? Rs 0:05 0 5693 19403254 15099056 92.3 postgres: strk strk [local] SELECT
5359 ? Rs 0:06 0 5693 20284546 15366412 93.9 postgres: strk strk [local] SELECT
5359 ? Rs 0:07 1 5693 21665438 15333312 93.7 postgres: strk strk [local] SELECT
5359 ? Rs 0:08 1 5693 23065426 15295472 93.5 postgres: strk strk [local] SELECT
5359 ? Rs 0:09 1 5693 24468334 15256592 93.2 postgres: strk strk [local] SELECT
5359 ? Rs 0:10 1 5693 25869382 15225544 93.0 postgres: strk strk [local] SELECT
5359 ? Rs 0:11 1 5693 27280514 15210012 92.9 postgres: strk strk [local] SELECT
5359 ? Rs 0:12 1 5693 28684350 15173964 92.7 postgres: strk strk [local] SELECT
5359 ? Ds 0:12 1 5693 29683014 15100680 92.3 postgres: strk strk [local] SELECT
5359 ? Rs 0:13 1 5693 30298130 15051580 92.0 postgres: strk strk [local] SELECT
5359 ? Rs 0:13 1 5693 30979690 15088992 92.2 postgres: strk strk [local] SELECT
5359 ? Rs 0:13 1 5693 31758466 15139792 92.5 postgres: strk strk [local] SELECT
5359 ? Ds 0:14 1 5693 32406206 15111128 92.3 postgres: strk strk [local] SELECT
5359 ? Ds 0:14 1 5693 33045722 15131676 92.5 postgres: strk strk [local] SELECT
5359 ? Ds 0:14 1 5693 33479010 15135036 92.5 postgres: strk strk [local] SELECT
5359 ? Rs 0:15 1 5693 33847174 15146748 92.6 postgres: strk strk [local] SELECT
5359 ? Rs 0:15 1 5693 34186830 15131648 92.5 postgres: strk strk [local] SELECT
5359 ? Rs 0:15 1 5693 34339878 15119460 92.4 postgres: strk strk [local] SELECT
5359 ? Ds 0:15 1 5693 34502474 15139816 92.5 postgres: strk strk [local] SELECT
5359 ? Ds 0:15 1 5693 34625554 15143068 92.5 postgres: strk strk [local] SELECT
5359 ? Rs 0:15 1833 0 0 0 0.0 [postgres]

The connection to the server was lost. Attempting reset: Failed.
!>

The configure switches:

./configure \
--prefix=/home/postgresql-9.3.4 \
--with-libxml \
--with-python \
--enable-nls \
--enable-cassert \
--enable-debug \
--with-ossp-uuid

--strk;

()  ASCII ribbon campaign -- Keep it simple !
/\  http://strk.keybit.net/rants/ascii_mails.txt


From: Sandro Santilli <strk(at)keybit(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Craig Ringer <craig(at)2ndquadrant(dot)com>, Florian Pflug <fgp(at)phlo(dot)org>, pgsql-bugs(at)postgresql(dot)org
Subject: Re: Leaking regexp_replace in 9.3.1 ? (was: [HACKERSUninterruptable regexp_replace in 9.3.1 ?)
Date: 2014-03-19 08:57:02
Message-ID: 20140319085702.GF4042@localhost
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

A valgrind view on the matter (over a single call, in --single mode):

==21240== 6,748 (120 direct, 6,628 indirect) bytes in 1 blocks are definitely lost in loss record 203 of 239
==21240== at 0x4C2A2DB: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==21240== by 0x64A200: newdfa.isra.4 (rege_dfa.c:307)
==21240== by 0x64A4C3: getsubdfa (regexec.c:285)
==21240== by 0x64B80A: cdissect (regexec.c:673)
==21240== by 0x64C802: pg_regexec (regexec.c:473)
==21240== by 0x6EBB3A: RE_wchar_execute (regexp.c:269)
==21240== by 0x6EBDC8: setup_regexp_matches (regexp.c:926)
==21240== by 0x6ECB10: regexp_matches (regexp.c:809)
==21240== by 0x5A641D: ExecMakeFunctionResult (execQual.c:1795)
==21240== by 0x5A8D6C: ExecProject (execQual.c:5261)
==21240== by 0x5BB471: ExecResult (nodeResult.c:155)
==21240== by 0x5A1E77: ExecProcNode (execProcnode.c:373)
==21240==
==21240== 6,748 (120 direct, 6,628 indirect) bytes in 1 blocks are definitely lost in loss record 204 of 239
==21240== at 0x4C2A2DB: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==21240== by 0x64A200: newdfa.isra.4 (rege_dfa.c:307)
==21240== by 0x64A4C3: getsubdfa (regexec.c:285)
==21240== by 0x64B7EE: cdissect (regexec.c:671)
==21240== by 0x64BDE9: cdissect (regexec.c:692)
==21240== by 0x64C802: pg_regexec (regexec.c:473)
==21240== by 0x6EBB3A: RE_wchar_execute (regexp.c:269)
==21240== by 0x6EBDC8: setup_regexp_matches (regexp.c:926)
==21240== by 0x6ECB10: regexp_matches (regexp.c:809)
==21240== by 0x5A641D: ExecMakeFunctionResult (execQual.c:1795)
==21240== by 0x5A8D6C: ExecProject (execQual.c:5261)
==21240== by 0x5BB471: ExecResult (nodeResult.c:155)
==21240==
==21240== 6,748 (120 direct, 6,628 indirect) bytes in 1 blocks are definitely lost in loss record 205 of 239
==21240== at 0x4C2A2DB: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==21240== by 0x64A200: newdfa.isra.4 (rege_dfa.c:307)
==21240== by 0x64A4C3: getsubdfa (regexec.c:285)
==21240== by 0x64B80A: cdissect (regexec.c:673)
==21240== by 0x64B46B: cdissect (regexec.c:633)
==21240== by 0x64B8AA: cdissect (regexec.c:689)
==21240== by 0x64BDE9: cdissect (regexec.c:692)
==21240== by 0x64C802: pg_regexec (regexec.c:473)
==21240== by 0x6EBB3A: RE_wchar_execute (regexp.c:269)
==21240== by 0x6EBDC8: setup_regexp_matches (regexp.c:926)
==21240== by 0x6ECB10: regexp_matches (regexp.c:809)
==21240== by 0x5A641D: ExecMakeFunctionResult (execQual.c:1795)
==21240==
==21240== 6,748 (120 direct, 6,628 indirect) bytes in 1 blocks are definitely lost in loss record 206 of 239
==21240== at 0x4C2A2DB: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==21240== by 0x64A200: newdfa.isra.4 (rege_dfa.c:307)
==21240== by 0x64A4C3: getsubdfa (regexec.c:285)
==21240== by 0x64B7EE: cdissect (regexec.c:671)
==21240== by 0x64BDE9: cdissect (regexec.c:692)
==21240== by 0x64B46B: cdissect (regexec.c:633)
==21240== by 0x64B8AA: cdissect (regexec.c:689)
==21240== by 0x64BDE9: cdissect (regexec.c:692)
==21240== by 0x64C802: pg_regexec (regexec.c:473)
==21240== by 0x6EBB3A: RE_wchar_execute (regexp.c:269)
==21240== by 0x6EBDC8: setup_regexp_matches (regexp.c:926)
==21240== by 0x6ECB10: regexp_matches (regexp.c:809)
==21240==
==21240== 6,748 (120 direct, 6,628 indirect) bytes in 1 blocks are definitely lost in loss record 207 of 239
==21240== at 0x4C2A2DB: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==21240== by 0x64A200: newdfa.isra.4 (rege_dfa.c:307)
==21240== by 0x64A4C3: getsubdfa (regexec.c:285)
==21240== by 0x64B97C: cdissect (regexec.c:975)
==21240== by 0x64B8AA: cdissect (regexec.c:689)
==21240== by 0x64BDE9: cdissect (regexec.c:692)
==21240== by 0x64B46B: cdissect (regexec.c:633)
==21240== by 0x64B8AA: cdissect (regexec.c:689)
==21240== by 0x64BDE9: cdissect (regexec.c:692)
==21240== by 0x64C802: pg_regexec (regexec.c:473)
==21240== by 0x6EBB3A: RE_wchar_execute (regexp.c:269)
==21240== by 0x6EBDC8: setup_regexp_matches (regexp.c:926)
==21240==
==21240== 16,928 bytes in 1 blocks are definitely lost in loss record 227 of 239
==21240== at 0x4C2A2DB: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==21240== by 0x64A41E: newdfa.isra.4 (rege_dfa.c:289)
==21240== by 0x64A4C3: getsubdfa (regexec.c:285)
==21240== by 0x64B80A: cdissect (regexec.c:673)
==21240== by 0x64BDE9: cdissect (regexec.c:692)
==21240== by 0x64C802: pg_regexec (regexec.c:473)
==21240== by 0x6EBB3A: RE_wchar_execute (regexp.c:269)
==21240== by 0x6EBDC8: setup_regexp_matches (regexp.c:926)
==21240== by 0x6ECB10: regexp_matches (regexp.c:809)
==21240== by 0x5A641D: ExecMakeFunctionResult (execQual.c:1795)
==21240== by 0x5A8D6C: ExecProject (execQual.c:5261)
==21240== by 0x5BB471: ExecResult (nodeResult.c:155)
==21240==
==21240== 16,928 bytes in 1 blocks are definitely lost in loss record 228 of 239
==21240== at 0x4C2A2DB: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==21240== by 0x64A41E: newdfa.isra.4 (rege_dfa.c:289)
==21240== by 0x64A4C3: getsubdfa (regexec.c:285)
==21240== by 0x64B7EE: cdissect (regexec.c:671)
==21240== by 0x64B46B: cdissect (regexec.c:633)
==21240== by 0x64B8AA: cdissect (regexec.c:689)
==21240== by 0x64BDE9: cdissect (regexec.c:692)
==21240== by 0x64C802: pg_regexec (regexec.c:473)
==21240== by 0x6EBB3A: RE_wchar_execute (regexp.c:269)
==21240== by 0x6EBDC8: setup_regexp_matches (regexp.c:926)
==21240== by 0x6ECB10: regexp_matches (regexp.c:809)
==21240== by 0x5A641D: ExecMakeFunctionResult (execQual.c:1795)
==21240==
==21240== 16,928 bytes in 1 blocks are definitely lost in loss record 229 of 239
==21240== at 0x4C2A2DB: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==21240== by 0x64A41E: newdfa.isra.4 (rege_dfa.c:289)
==21240== by 0x64A4C3: getsubdfa (regexec.c:285)
==21240== by 0x64B80A: cdissect (regexec.c:673)
==21240== by 0x64BDE9: cdissect (regexec.c:692)
==21240== by 0x64B46B: cdissect (regexec.c:633)
==21240== by 0x64B8AA: cdissect (regexec.c:689)
==21240== by 0x64BDE9: cdissect (regexec.c:692)
==21240== by 0x64C802: pg_regexec (regexec.c:473)
==21240== by 0x6EBB3A: RE_wchar_execute (regexp.c:269)
==21240== by 0x6EBDC8: setup_regexp_matches (regexp.c:926)
==21240== by 0x6ECB10: regexp_matches (regexp.c:809)
==21240==
==21240== 16,928 bytes in 1 blocks are definitely lost in loss record 230 of 239
==21240== at 0x4C2A2DB: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==21240== by 0x64A41E: newdfa.isra.4 (rege_dfa.c:289)
==21240== by 0x64A4C3: getsubdfa (regexec.c:285)
==21240== by 0x64B512: cdissect (regexec.c:903)
==21240== by 0x64BAF8: cdissect (regexec.c:1050)
==21240== by 0x64B8AA: cdissect (regexec.c:689)
==21240== by 0x64BDE9: cdissect (regexec.c:692)
==21240== by 0x64B46B: cdissect (regexec.c:633)
==21240== by 0x64B8AA: cdissect (regexec.c:689)
==21240== by 0x64BDE9: cdissect (regexec.c:692)
==21240== by 0x64C802: pg_regexec (regexec.c:473)
==21240== by 0x6EBB3A: RE_wchar_execute (regexp.c:269)
==21240==
==21240== LEAK SUMMARY:
==21240== definitely lost: 69,064 bytes in 12 blocks
==21240== indirectly lost: 38,143 bytes in 102 blocks
==21240== possibly lost: 0 bytes in 0 blocks
==21240== still reachable: 1,317,683 bytes in 342 blocks

--strk;

On Wed, Mar 19, 2014 at 08:53:59AM +0100, Sandro Santilli wrote:
> On Tue, Mar 18, 2014 at 07:41:32PM -0400, Tom Lane wrote:
> > Sandro Santilli <strk(at)keybit(dot)net> writes:
> > > Tom: I saw you pushed a fix for the interruptability with this:
> > > https://github.com/postgres/postgres/commit/f5f21315d25ffcbfe7c6a3fa6ffaad54d31bcde0
> >
> > > But we also noticed a memory leak in the regepx_replace call, did you notice
> > > that in your tests ? Same regexp as reported. Do you need another testcase ?
> >
> > I see no memory growth while running your original example against HEAD.
>
> This is what I'm running:
>
> SELECT regexp_matches('test', $REG$((?:[^'"$;]+|"[^"]*"|'(?:[^']*|'')*'|(\$[^$]*\$).*?\2)+)$REG$, 'g' );
>
> And this is what I see as the process memory:
>
> PID TTY STAT TIME MAJFL TRS DRS RSS %MEM COMMAND
> 5359 ? Ss 0:00 0 5693 172446 2908 0.0 postgres: strk strk [local] idle
> 5359 ? Ss 0:00 0 5693 172718 4224 0.0 postgres: strk strk [local] idle
> 5359 ? Ss 0:00 0 5693 173250 4488 0.0 postgres: strk strk [local] idle
> 5359 ? Ss 0:00 0 5693 173670 5016 0.0 postgres: strk strk [local] idle
> 5359 ? Ss 0:00 0 5693 174070 5280 0.0 postgres: strk strk [local] idle
> 5359 ? Ss 0:00 0 5693 174494 5544 0.0 postgres: strk strk [local] idle
> 5359 ? Ss 0:00 0 5693 174762 5808 0.0 postgres: strk strk [local] idle
> 5359 ? Ss 0:00 0 5693 175690 6600 0.0 postgres: strk strk [local] idle
> 5359 ? Ss 0:00 0 5693 179006 9240 0.0 postgres: strk strk [local] idle
>
> This is as of commit 63817f86b57fc3d29b57787bca9d786218b7ee25
>
> Or, simpler, from this state:
>
> PID TTY STAT TIME MAJFL TRS DRS RSS %MEM COMMAND
> 5359 ? Ss 0:00 0 5693 1274962 869832 5.3 postgres: strk strk [local] idle
>
> I run this:
>
> SELECT count(*) from (
> select regexp_matches( 'test', $REG$((?:[^'"$;]+|"[^"]*"|'(?:[^']*|'')*'|(\$[^$]*\$).*?\2)+)$REG$, 'g' )
> FROM generate_series(1, 10000)
> ) as f;
>
> And get here:
>
> PID TTY STAT TIME MAJFL TRS DRS RSS %MEM COMMAND
> 5359 ? Ss 0:00 0 5693 1274962 869832 5.3 postgres: strk strk [local] idle
> 5359 ? Rs 0:00 0 5693 1555894 1090536 6.6 postgres: strk strk [local] SELECT
> 5359 ? Ss 0:00 0 5693 2269682 1650744 10.0 postgres: strk strk [local] idle
>
> The memory is only released on session close.
> This is with generate_series going up to 1e5:
>
> PID TTY STAT TIME MAJFL TRS DRS RSS %MEM COMMAND
> 5359 ? Ss 0:00 0 5693 2269682 1650744 10.0 postgres: strk strk [local] idle
> 5359 ? Rs 0:00 0 5693 2300726 1675560 10.2 postgres: strk strk [local] SELECT
> 5359 ? Rs 0:01 0 5693 6438626 4923024 30.0 postgres: strk strk [local] SELECT
> 5359 ? Rs 0:02 0 5693 10591778 8182368 50.0 postgres: strk strk [local] SELECT
> 5359 ? Ss 0:03 0 5693 12216566 9457488 57.8 postgres: strk strk [local] idle
>
> Up to 1e6 the story ends with a crash, after:
>
> PID TTY STAT TIME MAJFL TRS DRS RSS %MEM COMMAND
> 5359 ? Ss 0:03 0 5693 12216566 9457488 57.8 postgres: strk strk [local] idle
> 5359 ? Rs 0:04 0 5693 15666674 12166392 74.3 postgres: strk strk [local] SELECT
> 5359 ? Rs 0:05 0 5693 19403254 15099056 92.3 postgres: strk strk [local] SELECT
> 5359 ? Rs 0:06 0 5693 20284546 15366412 93.9 postgres: strk strk [local] SELECT
> 5359 ? Rs 0:07 1 5693 21665438 15333312 93.7 postgres: strk strk [local] SELECT
> 5359 ? Rs 0:08 1 5693 23065426 15295472 93.5 postgres: strk strk [local] SELECT
> 5359 ? Rs 0:09 1 5693 24468334 15256592 93.2 postgres: strk strk [local] SELECT
> 5359 ? Rs 0:10 1 5693 25869382 15225544 93.0 postgres: strk strk [local] SELECT
> 5359 ? Rs 0:11 1 5693 27280514 15210012 92.9 postgres: strk strk [local] SELECT
> 5359 ? Rs 0:12 1 5693 28684350 15173964 92.7 postgres: strk strk [local] SELECT
> 5359 ? Ds 0:12 1 5693 29683014 15100680 92.3 postgres: strk strk [local] SELECT
> 5359 ? Rs 0:13 1 5693 30298130 15051580 92.0 postgres: strk strk [local] SELECT
> 5359 ? Rs 0:13 1 5693 30979690 15088992 92.2 postgres: strk strk [local] SELECT
> 5359 ? Rs 0:13 1 5693 31758466 15139792 92.5 postgres: strk strk [local] SELECT
> 5359 ? Ds 0:14 1 5693 32406206 15111128 92.3 postgres: strk strk [local] SELECT
> 5359 ? Ds 0:14 1 5693 33045722 15131676 92.5 postgres: strk strk [local] SELECT
> 5359 ? Ds 0:14 1 5693 33479010 15135036 92.5 postgres: strk strk [local] SELECT
> 5359 ? Rs 0:15 1 5693 33847174 15146748 92.6 postgres: strk strk [local] SELECT
> 5359 ? Rs 0:15 1 5693 34186830 15131648 92.5 postgres: strk strk [local] SELECT
> 5359 ? Rs 0:15 1 5693 34339878 15119460 92.4 postgres: strk strk [local] SELECT
> 5359 ? Ds 0:15 1 5693 34502474 15139816 92.5 postgres: strk strk [local] SELECT
> 5359 ? Ds 0:15 1 5693 34625554 15143068 92.5 postgres: strk strk [local] SELECT
> 5359 ? Rs 0:15 1833 0 0 0 0.0 [postgres]
>
> The connection to the server was lost. Attempting reset: Failed.
> !>
>
> The configure switches:
>
> ./configure \
> --prefix=/home/postgresql-9.3.4 \
> --with-libxml \
> --with-python \
> --enable-nls \
> --enable-cassert \
> --enable-debug \
> --with-ossp-uuid
>
> --strk;
>
> ()  ASCII ribbon campaign -- Keep it simple !
> /\  http://strk.keybit.net/rants/ascii_mails.txt

--

()  ASCII ribbon campaign -- Keep it simple !
/\  http://strk.keybit.net/rants/ascii_mails.txt


From: Greg Stark <stark(at)mit(dot)edu>
To: Sandro Santilli <strk(at)keybit(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Florian Pflug <fgp(at)phlo(dot)org>, PostgreSQL Bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: Leaking regexp_replace in 9.3.1 ? (was: [HACKERSUninterruptable regexp_replace in 9.3.1 ?)
Date: 2014-03-19 10:57:10
Message-ID: CAM-w4HNWVuNqo0d8DUNg5Ucaw6wnx7yzWv6zKnK5RRFN=_rT4Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

On Wed, Mar 19, 2014 at 8:57 AM, Sandro Santilli <strk(at)keybit(dot)net> wrote:
> ==21240== by 0x64A200: newdfa.isra.4 (rege_dfa.c:307)
> ==21240== by 0x64A4C3: getsubdfa (regexec.c:285)
> ==21240== by 0x64B80A: cdissect (regexec.c:673)
> ==21240== by 0x64C802: pg_regexec (regexec.c:473)

It looks like a simple bug pg_regexec where it's reusing a loop
counter "n" to mean two different things. When it goes to free all the
subdfas it looks like the code was written based "n" still being the
number of trees but in fact it's been reused to be the number of
matches.

--
greg


From: Sandro Santilli <strk(at)keybit(dot)net>
To: Greg Stark <stark(at)mit(dot)edu>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Florian Pflug <fgp(at)phlo(dot)org>, PostgreSQL Bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: Leaking regexp_replace in 9.3.1 ? (was: [HACKERSUninterruptable regexp_replace in 9.3.1 ?)
Date: 2014-03-19 11:43:25
Message-ID: 20140319114325.GG4042@localhost
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

On Wed, Mar 19, 2014 at 10:57:10AM +0000, Greg Stark wrote:
> On Wed, Mar 19, 2014 at 8:57 AM, Sandro Santilli <strk(at)keybit(dot)net> wrote:
> > ==21240== by 0x64A200: newdfa.isra.4 (rege_dfa.c:307)
> > ==21240== by 0x64A4C3: getsubdfa (regexec.c:285)
> > ==21240== by 0x64B80A: cdissect (regexec.c:673)
> > ==21240== by 0x64C802: pg_regexec (regexec.c:473)
>
>
> It looks like a simple bug pg_regexec where it's reusing a loop
> counter "n" to mean two different things. When it goes to free all the
> subdfas it looks like the code was written based "n" still being the
> number of trees but in fact it's been reused to be the number of
> matches.

I confirm this patch fixes the leak:

diff --git a/src/backend/regex/regexec.c b/src/backend/regex/regexec.c
index 0edb83c..2e97662 100644
--- a/src/backend/regex/regexec.c
+++ b/src/backend/regex/regexec.c
@@ -259,6 +259,7 @@ pg_regexec(regex_t *re,
/* clean up */
if (v->pmatch != pmatch && v->pmatch != mat)
FREE(v->pmatch);
+ n = (size_t) v->g->ntree;
for (i = 0; i < n; i++)
{
if (v->subdfas[i] != NULL)

--strk;


From: Sandro Santilli <strk(at)keybit(dot)net>
To: Greg Stark <stark(at)mit(dot)edu>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Florian Pflug <fgp(at)phlo(dot)org>, PostgreSQL Bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: Leaking regexp_replace in 9.3.1 ? (was: [HACKERSUninterruptable regexp_replace in 9.3.1 ?)
Date: 2014-03-19 12:11:02
Message-ID: 20140319121102.GJ4042@localhost
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: Postg메이저 토토 사이트SQL : Postg메이저 토토 사이트SQL 메일 링리스트 : 2014-03-19 이후 PGSQL-BUGS Postg토토 사이트SQL

On Wed, Mar 19, 2014 at 12:43:25PM +0100, Sandro Santilli wrote:
> On Wed, Mar 19, 2014 at 10:57:10AM +0000, Greg Stark wrote:
> > On Wed, Mar 19, 2014 at 8:57 AM, Sandro Santilli <strk(at)keybit(dot)net> wrote:
> > > ==21240== by 0x64A200: newdfa.isra.4 (rege_dfa.c:307)
> > > ==21240== by 0x64A4C3: getsubdfa (regexec.c:285)
> > > ==21240== by 0x64B80A: cdissect (regexec.c:673)
> > > ==21240== by 0x64C802: pg_regexec (regexec.c:473)
> >
> >
> > It looks like a simple bug pg_regexec where it's reusing a loop
> > counter "n" to mean two different things. When it goes to free all the
> > subdfas it looks like the code was written based "n" still being the
> > number of trees but in fact it's been reused to be the number of
> > matches.
>
> I confirm this patch fixes the leak:

I've encoded another version of it as a pull request:
https://github.com/postgres/postgres/pull/5

--strk;


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Greg Stark <stark(at)mit(dot)edu>
Cc: Sandro Santilli <strk(at)keybit(dot)net>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Florian Pflug <fgp(at)phlo(dot)org>, PostgreSQL Bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: Re: Leaking regexp_replace in 9.3.1 ? (was: [HACKERSUninterruptable regexp_replace in 9.3.1 ?)
Date: 2014-03-19 14:55:51
Message-ID: 30516.1395240951@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

Greg Stark <stark(at)mit(dot)edu> writes:
> It looks like a simple bug pg_regexec where it's reusing a loop
> counter "n" to mean two different things. When it goes to free all the
> subdfas it looks like the code was written based "n" still being the
> number of trees but in fact it's been reused to be the number of
> matches.

Looks like it's my bug (how embarrassing). Thanks!

regards, tom lane