Re: BUG #14087: btree_gin index doesn't work on INT with POSITIVE constraint

Lists: pgsql-bugs
From: joris(dot)vandyck(at)promani(dot)be
To: pgsql-bugs(at)postgresql(dot)org
Subject: BUG #14087: btree_gin index doesn't work on INT with POSITIVE constraint
Date: 2016-04-15 18:59:02
Message-ID: 20160415185902.22924.77993@wrigleys.postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

The following bug has been logged on the website:

Bug reference: 14087
Logged by: Joris Van Dyck
Email address: joris(dot)vandyck(at)promani(dot)be
PostgreSQL version: 9.4.6
Operating system: Debian Jessie 64-bit
Description:

Multi-column GIN/BTREE index
-----------------------------
CREATE INDEX MyIndex
ON articles
USING gin
(to_tsvector('simple'::regconfig, tags), magazine_id, random_id);

Query
-----
SELECT "articles"."id", "articles"."magazine_id", "articles"."thumb_url",
"articles"."date", "articles"."lancode", "articles"."title",
"articles"."description", "articles"."slug", "articles"."tags" FROM
"articles" WHERE (( (to_tsvector('pg_catalog.simple', "articles"."tags")) @@
(plainto_tsquery('pg_catalog.simple', 'housing'))) AND
"articles"."magazine_id" = 16 AND "articles"."random_id" > 11422) ORDER BY
"articles"."random_id" ASC LIMIT 49

Execution plan with random_id defined as integer
------------------------------------------------
Limit (cost=36.12..36.14 rows=8 width=334) (actual time=23.650..23.662
rows=49 loops=1)
-> Sort (cost=36.12..36.14 rows=8 width=334) (actual time=23.649..23.653
rows=49 loops=1)
Sort Key: random_id
Sort Method: top-N heapsort Memory: 51kB
-> Bitmap Heap Scan on articles (cost=20.10..36.00 rows=8
width=334) (actual time=21.488..22.849 rows=629 loops=1)
Recheck Cond: ((to_tsvector('simple'::regconfig, tags) @@
'''housing'''::tsquery) AND (magazine_id = 16) AND (random_id > 11422))
Heap Blocks: exact=378
-> Bitmap Index Scan on MyIndex (cost=0.00..20.10 rows=8
width=0) (actual time=21.370..21.370 rows=629 loops=1)
Index Cond: ((to_tsvector('simple'::regconfig, tags) @@
'''housing'''::tsquery) AND (magazine_id = 16) AND (random_id > 11422))
Planning time: 2.668 ms
Execution time: 24.150 ms

Execution plan with random_id defined as smallint
-------------------------------------------------
Limit (cost=55.57..55.59 rows=8 width=332) (actual time=4.253..4.266
rows=49 loops=1)
-> Sort (cost=55.57..55.59 rows=8 width=332) (actual time=4.250..4.256
rows=49 loops=1)
Sort Key: random_id
Sort Method: top-N heapsort Memory: 51kB
-> Bitmap Heap Scan on articles (cost=10.23..55.45 rows=8
width=332) (actual time=1.407..3.492 rows=629 loops=1)
Recheck Cond: ((to_tsvector('simple'::regconfig, tags) @@
'''housing'''::tsquery) AND (magazine_id = 16))
Filter: (random_id > 11422)
Rows Removed by Filter: 337
Heap Blocks: exact=476
-> Bitmap Index Scan on MyIndex (cost=0.00..10.23 rows=23
width=0) (actual time=1.246..1.246 rows=966 loops=1)
Index Cond: ((to_tsvector('simple'::regconfig, tags) @@
'''housing'''::tsquery) AND (magazine_id = 16))
Planning time: 2.771 ms
Execution time: 4.509 ms

The problem
-----------
It seems like the choice between using the full index (first case with
integer) or using the index partly and then filtering manually (second case
with smallint) should not depend on whether the random_id is defined as
integer or smallint. It is not clear to me whether this is because the
implementation doesn't support the multi-column GIN index with the smallint,
or because the planner decides to do it this way.
If this is not supported, then probably the documentation should be
updated:
http://www.postgresql.org/docs/9.4/static/btree-gin.html


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: joris(dot)vandyck(at)promani(dot)be
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #14087: btree_gin index doesn't work on INT with POSITIVE constraint
Date: 2016-04-15 21:09:16
Message-ID: 30984.1460754556@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

joris(dot)vandyck(at)promani(dot)be writes:
> It seems like the choice between using the full index (first case with
> integer) or using the index partly and then filtering manually (second case
> with smallint) should not depend on whether the random_id is defined as
> integer or smallint.

The reason for the plan change is that if "random_id" is smallint, then
"random_id > 11422" is a cross-type operator (int2 > int4), and btree_gin
lacks support for such operators. Fixing that is probably just a Small
Matter Of Programming, but I doubt it's very high on anyone's to-do list.
In the meantime, you might consider casting the comparison constant to
int2 explicitly if you want to make "random_id" be smallint; that is,
"random_id > 11422::smallint" would be indexable.

regards, tom lane