Lists: | pgsql-bugspgsql-hackers |
---|
From: | Marc Cousin <cousinmarc(at)gmail(dot)com> |
---|---|
To: | pgsql-bugs(at)postgresql(dot)org |
Subject: | cannot restore schema with is not distinct from on hstore since PG 9.6.8 |
Date: | 2018-07-09 13:52:26 |
Message-ID: | ffefc172-a487-aa87-a0e7-472bf29735c8@gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-bugs pgsql-hackers |
Hi,
This is a really simple test case, I think it's an unintended
consequence of CVE-2018-1058:
demo=# create extension hstore;
CREATE EXTENSION
demo=# create table test (a hstore);
CREATE TABLE
demo=# create index idx_test_not_distinct on test(a) where a is not
distinct from '';
CREATE INDEX
Then dump this database with a simple pg_dump and try to restore it in
another database:
9.6.9/0 [marc(at)marco-portable pg_dump]$ psql -e demo2 -f /tmp/demo_dump
SET statement_timeout = 0;
SET
SET lock_timeout = 0;
SET
SET idle_in_transaction_session_timeout = 0;
SET
SET client_encoding = 'UTF8';
SET
SET standard_conforming_strings = on;
SET
SELECT pg_catalog.set_config('search_path', '', false);
set_config
------------
(1 row)
SET check_function_bodies = false;
SET
SET client_min_messages = warning;
SET
SET row_security = off;
SET
CREATE EXTENSION IF NOT EXISTS plpgsql WITH SCHEMA pg_catalog;
CREATE EXTENSION
COMMENT ON EXTENSION plpgsql IS 'PL/pgSQL procedural language';
COMMENT
CREATE EXTENSION IF NOT EXISTS hstore WITH SCHEMA public;
CREATE EXTENSION
COMMENT ON EXTENSION hstore IS 'data type for storing sets of (key,
value) pairs';
COMMENT
SET default_tablespace = '';
SET
SET default_with_oids = false;
SET
CREATE TABLE public.test (
a public.hstore
);
CREATE TABLE
ALTER TABLE public.test OWNER TO marc;
ALTER TABLE
COPY public.test (a) FROM stdin;
COPY 0
CREATE INDEX idx_test_not_distinct ON public.test USING btree (a) WHERE
(NOT (a IS DISTINCT FROM ''::public.hstore));
psql:/tmp/demo_bug:73: ERROR: operator does not exist: public.hstore =
public.hstore
LINE 1: ...inct ON public.test USING btree (a) WHERE (NOT (a IS DISTINC...
^
HINT: No operator matches the given name and argument type(s). You
might need to add explicit type casts.
As we are working on the 9.6 branch, it appeared after upgrading to
9.6.8 and 9.6.9 following CVE-2018-1058 I presume. Removing the
set_config fixes it (but removes the security fix...)
Of course, my use case is not this simple example, it's failing on a
trigger WHEN ( old.hstorecol is distinct from new.hstorecol )
Regards
Marc
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Marc Cousin <cousinmarc(at)gmail(dot)com> |
Cc: | pgsql-bugs(at)postgresql(dot)org, pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: cannot restore schema with is not distinct from on hstore since PG 9.6.8 |
Date: | 2018-07-09 15:34:26 |
Message-ID: | 15487.1531150466@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-bugs pgsql-hackers |
Marc Cousin <cousinmarc(at)gmail(dot)com> writes:
> This is a really simple test case, I think it's an unintended
> consequence of CVE-2018-1058:
> demo=# create extension hstore;
> CREATE EXTENSION
> demo=# create table test (a hstore);
> CREATE TABLE
> demo=# create index idx_test_not_distinct on test(a) where a is not
> distinct from '';
> CREATE INDEX
> [ whereupon dump/restore fails with ]
> CREATE INDEX idx_test_not_distinct ON public.test USING btree (a) WHERE
> (NOT (a IS DISTINCT FROM ''::public.hstore));
> psql:/tmp/demo_bug:73: ERROR: operator does not exist: public.hstore =
> public.hstore
Yeah, the core of the problem here is that there's no way to
schema-qualify IS [NOT] DISTINCT FROM's choice of underlying operator.
It was possible to ignore that as long as the operator you wanted
was in the search path, but now that we've tightened up pg_dump's
search path settings, we can't play fast and loose anymore.
I think the most practical way to deal with this probably is to change
the parser so that the lookup works by finding a default btree or hash
opclass rather than by looking for "=" by name. We've made similar
changes in the past to get rid of implicit dependencies on operator
names, but those efforts never reached IS [NOT] DISTINCT FROM.
I have a nasty feeling that there are still operator name dependencies
elsewhere, notably in CASE expressions, but haven't researched it yet.
Although this doesn't seem like an outlandish change to make in HEAD,
back-patching it might cause some issues. On the other hand, I don't
see what choice we have. Leaving things as they stand isn't very
workable, and inventing some kind of schema-qualification syntax for
IS [NOT] DISTINCT FROM is surely even worse from a backwards
compatibility standpoint.
Thoughts?
regards, tom lane
From: | Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Marc Cousin <cousinmarc(at)gmail(dot)com> |
Cc: | pgsql-bugs(at)postgresql(dot)org, pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: cannot restore schema with is not distinct from on hstore since PG 9.6.8 |
Date: | 2018-07-10 00:06:00 |
Message-ID: | e89644f6-5472-1713-68d3-84da792d5821@2ndQuadrant.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-bugs pgsql-hackers |
On 07/09/2018 11:34 AM, Tom Lane wrote:
> Marc Cousin <cousinmarc(at)gmail(dot)com> writes:
>> This is a really simple test case, I think it's an unintended
>> consequence of CVE-2018-1058:
>> demo=# create extension hstore;
>> CREATE EXTENSION
>> demo=# create table test (a hstore);
>> CREATE TABLE
>> demo=# create index idx_test_not_distinct on test(a) where a is not
>> distinct from '';
>> CREATE INDEX
>> [ whereupon dump/restore fails with ]
>> CREATE INDEX idx_test_not_distinct ON public.test USING btree (a) WHERE
>> (NOT (a IS DISTINCT FROM ''::public.hstore));
>> psql:/tmp/demo_bug:73: ERROR: operator does not exist: public.hstore =
>> public.hstore
> Yeah, the core of the problem here is that there's no way to
> schema-qualify IS [NOT] DISTINCT FROM's choice of underlying operator.
> It was possible to ignore that as long as the operator you wanted
> was in the search path, but now that we've tightened up pg_dump's
> search path settings, we can't play fast and loose anymore.
>
> I think the most practical way to deal with this probably is to change
> the parser so that the lookup works by finding a default btree or hash
> opclass rather than by looking for "=" by name. We've made similar
> changes in the past to get rid of implicit dependencies on operator
> names, but those efforts never reached IS [NOT] DISTINCT FROM.
>
> I have a nasty feeling that there are still operator name dependencies
> elsewhere, notably in CASE expressions, but haven't researched it yet.
>
> Although this doesn't seem like an outlandish change to make in HEAD,
> back-patching it might cause some issues. On the other hand, I don't
> see what choice we have. Leaving things as they stand isn't very
> workable, and inventing some kind of schema-qualification syntax for
> IS [NOT] DISTINCT FROM is surely even worse from a backwards
> compatibility standpoint.
I agree with your approach, including backpatching. I guess we'll have to try to think of some scenario that backpatching would break. Maybe to minimize any effect it should fall back on a default opclass if the search for "=" fails? Dunno how practical that is.
cheers
andrew
--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com> |
Cc: | Marc Cousin <cousinmarc(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: cannot restore schema with is not distinct from on hstore since PG 9.6.8 |
Date: | 2018-07-13 20:54:15 |
Message-ID: | 10492.1531515255@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-bugs pgsql-hackers |
Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com> writes:
> On 07/09/2018 11:34 AM, Tom Lane wrote:
>> I think the most practical way to deal with this probably is to change
>> the parser so that the lookup works by finding a default btree or hash
>> opclass rather than by looking for "=" by name. We've made similar
>> changes in the past to get rid of implicit dependencies on operator
>> names, but those efforts never reached IS [NOT] DISTINCT FROM.
> I agree with your approach, including backpatching. I guess we'll have
> to try to think of some scenario that backpatching would break. Maybe to
> minimize any effect it should fall back on a default opclass if the
> search for "=" fails? Dunno how practical that is.
I poked into this area for awhile, and it turns out to be even a
worse can of worms than I thought. I looked through gram.y and
parse_expr.c, and identified several distinct classes of issue.
(I'm not promising that I found everything.)
1. LIKE, ILIKE, SIMILAR TO are converted to unqualified operator names
"~~", "~~*", etc. One might reflexively propose forcing these expansions
to be qualified with pg_catalog, but I'm afraid that would break use-cases
involving non-core data types, where the desired operator might well not
be in pg_catalog. However, I think that there's not actually any new
dump/restore hazard here, because of the fact that ruleutils.c simply
prints the results as the internal operator names without trying to
convert back to LIKE et al. If the operator isn't in the search path
it will get schema-qualified, and all's well. So my inclination in this
category is to do nothing. There's certainly a hazard for naive writers
of SQL, who may not realize that LIKE entails a search-path-dependent
lookup; but that hazard is no worse than it was before we decided to
lock down dump/restore search paths.
2. BETWEEN and related constructs are converted to boolean expressions
involving unqualified operator names ">=" etc. As above, we are saved
by the fact that what will be dumped out is the expanded expression,
providing room to schema-qualify the selected operators if needed.
So again I'm inclined to leave this alone. (This will be something
to think about if we ever get around to re-implementing BETWEEN as a
unitary construct, though.)
3. IN and NOT IN are converted to "= ANY" or "<> ALL" with unqualified
operator names. This is almost like the above cases, except that
ruleutils.c sometimes chooses to print the result using "IN" rather than
with the actual operator name. We might have to back off that readability
hack in some places (though it seems like it'd be OK if what would be
printed is exactly unqualified "= ANY" or "<> ALL").
4. As per original report, IS [NOT] DISTINCT FROM, as well as NULLIF,
are interpreted by doing a binary operator lookup using unqualified "="
as the operator name. My original thought of replacing that by using
default operator classes to identify the comparison semantics doesn't
really work, because there is no requirement that the two input values be
of the same datatype. For example, "int_var IS DISTINCT FROM numeric_var"
is handled perfectly well at present, but I see no principled way to
figure out what to do with that just by reference to operator classes.
We might have little choice but to invent a schema-qualifiable variant
syntax so we can represent these constructs unambiguously in dumps.
(Ugly as that is, it does have the advantage that we'd not be risking
subtle changes in the semantics of these operations.)
5. Row comparison constructs, for example ROW(a,b) < ROW(c,d), are handled
by doing column-by-column lookups using the specified operator name; this
example devolves to what "a < c" and "b < d" would be interpreted as.
Although we support OPERATOR() syntax in these constructs, that doesn't
get us far: if the original input was not a schema-qualified operator
then it's entirely possible that the per-column operators were found in
different schemas, which we can't handle by writing OPERATOR() in the
middle. (There's already a comment in ruleutils.c's handling of
RowCompareExpr complaining about this.)
One idea for resolving that is to extend the OPERATOR syntax to allow
multiple operator names for row comparisons, along the lines of
ROW(a,b) OPERATOR(pg_catalog.<, public.<) ROW(c,d)
This seems probably to be not terribly hard, although back-patching it
wouldn't be appetizing.
6. The row comparison problem also manifests for multiple-column cases
of ANY_SUBLINK and ALL_SUBLINK, for instance
WHERE (a, b) IN (SELECT x, y FROM ...)
Again, ruleutils.c contributes to the issue by printing "IN" even
though it knows that that isn't 100% accurate. But I think that here
again, a possible fix is to allow writing
WHERE (a, b) OPERATOR(pg_catalog.=, public.=) ANY (SELECT x, y FROM ...)
so that the operators to use can be locked down exactly.
7. The implicit-comparison form of CASE also resolves comparisons by
doing lookups with an assumed operator name of "=". For instance
CASE x WHEN 4 THEN ... WHEN 5.0 THEN ... END
might well choose different comparison operators for the two WHEN
clauses. We could imagine printing this as
CASE WHEN x = 4 THEN ... WHEN x = 5.0 THEN ... END
and schema-qualifying the operators as needed, but that's really
totally unsatisfactory if the test expression is expensive or volatile.
So I'm not sure what to do about this, but it seems like any real
answer will again involve introducing new syntax.
So this is all pretty messy, but on the bright side, fixing it would allow
cleaning up some ancient squishy coding in ruleutils.c. It wouldn't be
controversial as just a v12 addition, perhaps ... but do we have a choice
about back-patching? Dump/restore failures are not good.
regards, tom lane
From: | "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>, Marc Cousin <cousinmarc(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: cannot restore schema with is not distinct from on hstore since PG 9.6.8 |
Date: | 2018-07-13 21:17:30 |
Message-ID: | CAKFQuwYAY3hV9yE3D9SiaSfM9JJBu9DFOh0y3C6ZKE1s2OSY8w@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-bugs pgsql-hackers |
On Fri, Jul 13, 2018 at 1:54 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> So this is all pretty messy, but on the bright side, fixing it would allow
> cleaning up some ancient squishy coding in ruleutils.c. It wouldn't be
> controversial as just a v12 addition, perhaps ... but do we have a choice
> about back-patching? Dump/restore failures are not good.
>
I think serious consideration needs to be given to ways to allow the user
of pg_dump/pg_restore to choose the prior, less secure, mode of operation.
IMO the risk surface presented to support back-patching the behavioral
changes was not severe enough to do so in the first place. I'm presuming
undoing the back-patch will be shot down without mercy but at least
consider an escape hatch for unafflicted secure systems that just happen to
depend on search_path more than a super-hardened system would.
David J.
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com> |
Cc: | Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>, Marc Cousin <cousinmarc(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: cannot restore schema with is not distinct from on hstore since PG 9.6.8 |
Date: | 2018-07-13 21:23:40 |
Message-ID: | 13033.1531517020@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-bugs pgsql-hackers |
"David G. Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com> writes:
> I think serious consideration needs to be given to ways to allow the user
> of pg_dump/pg_restore to choose the prior, less secure, mode of operation.
> IMO the risk surface presented to support back-patching the behavioral
> changes was not severe enough to do so in the first place. I'm presuming
> undoing the back-patch will be shot down without mercy but at least
> consider an escape hatch for unafflicted secure systems that just happen to
> depend on search_path more than a super-hardened system would.
FWIW, in the security team's discussions of CVE-2018-1058, I argued
strenuously in favor of providing a way to run pg_dump/pg_restore with
the system's default search_path as before. I lost the argument;
but maybe the need for features like this shows that we are not really
ready to insist on unconditional security there.
regards, tom lane
From: | Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com> |
Cc: | Marc Cousin <cousinmarc(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: cannot restore schema with is not distinct from on hstore since PG 9.6.8 |
Date: | 2018-07-13 22:16:32 |
Message-ID: | 59f8f5e3-2b2d-36c4-c5bd-9e3606cd008b@2ndQuadrant.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-bugs pgsql-hackers |
On 07/13/2018 05:23 PM, Tom Lane wrote:
> "David G. Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com> writes:
>> I think serious consideration needs to be given to ways to allow the user
>> of pg_dump/pg_restore to choose the prior, less secure, mode of operation.
>> IMO the risk surface presented to support back-patching the behavioral
>> changes was not severe enough to do so in the first place. I'm presuming
>> undoing the back-patch will be shot down without mercy but at least
>> consider an escape hatch for unafflicted secure systems that just happen to
>> depend on search_path more than a super-hardened system would.
> FWIW, in the security team's discussions of CVE-2018-1058, I argued
> strenuously in favor of providing a way to run pg_dump/pg_restore with
> the system's default search_path as before. I lost the argument;
> but maybe the need for features like this shows that we are not really
> ready to insist on unconditional security there.
>
>
I don't remember that, TBH.
Certainly this problem seems nasty enough that we should possibly
revisit the issue.
cheers
andrew
--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com> |
Cc: | Marc Cousin <cousinmarc(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: cannot restore schema with is not distinct from on hstore since PG 9.6.8 |
Date: | 2020-07-02 16:51:24 |
Message-ID: | 857178.1593708684@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-bugs pgsql-hackers |
[ just to tie back to this old thread ]
I wrote:
> I poked into this area for awhile, and it turns out to be even a
> worse can of worms than I thought. I looked through gram.y and
> parse_expr.c, and identified several distinct classes of issue.
> (I'm not promising that I found everything.)
In fact, the thread at [1] identifies an essentially similar class
of issue that I missed: JOIN ... USING (x) also implicitly looks up
an equality operator by seeing what "tab1.x = tab2.x" resolves as.
This is much like the CASE situation, in that there's no
SQL-standard-compliant way to reverse-list a view containing
such a construct while showing how it was resolved. We'd need
to invent some new syntax if we want to make this safer.
regards, tom lane
[1] /message-id/flat/CAC35HNnNGavaZ%3DP%3DrUcwTwYEhfoyXDg32REXCRDgxBmC3No3nA%40mail.gmail.com