BUG #15679: Partial HASH index takes too much space

Lists: pgsql-bugs
From: PG Bug reporting form <noreply(at)postgresql(dot)org>
To: pgsql-bugs(at)lists(dot)postgresql(dot)org
Cc: gentz(at)futuretap(dot)com
Subject: BUG #15679: Partial HASH index takes too much space
Date: 2019-03-08 20:06:45
Message-ID: 15679-5e1b6086b202047c@postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

The following bug has been logged on the website:

Bug reference: 15679
Logged by: Ortwin Gentz
Email address: gentz(at)futuretap(dot)com
PostgreSQL version: 10.6
Operating system: Amazon RDS
Description:

I created a partial HASH index for a sparsely populated column:

CREATE INDEX partial_hash ON mytable USING HASH(my_id) WHERE my_ID IS NOT
NULL;

Even though the my_id VARCHAR(255) column is populated (NON NULL) only for a
few hundred records, the index takes 256 MB of space (for a table with > 10m
records). Also, it doesn't make a difference if the index is created as a
partial index ("WHERE my_ID IS NOT NULL") or as a full index.

In contrast to that, a BTREE index differs considerably in space for full
and partial:

CREATE INDEX full_btree ON mytable (my_id); # 543 MB
CREATE INDEX partial_btree ON mytable (my_id) WHERE my_ID IS NOT NULL; # 16
KB

See also this StackExchange conversation where user jjanes considers the
behavior a bug in the hash index code:
https://dba.stackexchange.com/a/231660/25337


From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: gentz(at)futuretap(dot)com, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: BUG #15679: Partial HASH index takes too much space
Date: 2019-03-11 09:23:32
Message-ID: CAKJS1f9rg6sxWKg5mbB8P5w_wGbP2FxKDxgnPy20xOW63-4m+w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

On Sat, 9 Mar 2019 at 09:38, PG Bug reporting form
<noreply(at)postgresql(dot)org> wrote:
> I created a partial HASH index for a sparsely populated column:
>
> CREATE INDEX partial_hash ON mytable USING HASH(my_id) WHERE my_ID IS NOT
> NULL;
>
> Even though the my_id VARCHAR(255) column is populated (NON NULL) only for a
> few hundred records, the index takes 256 MB of space (for a table with > 10m
> records). Also, it doesn't make a difference if the index is created as a
> partial index ("WHERE my_ID IS NOT NULL") or as a full index.
>
> In contrast to that, a BTREE index differs considerably in space for full
> and partial:
>
> CREATE INDEX full_btree ON mytable (my_id); # 543 MB
> CREATE INDEX partial_btree ON mytable (my_id) WHERE my_ID IS NOT NULL; # 16
> KB

Thanks for the report. This issue is being discussed over in
/message-id/flat/CAMkU%3D1x0k%2BdRQHDUgp4BjFeSgxyLBBXyKNY5Pt1Yu6YHB0mhKA%40mail.gmail.com

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services