BUG #14054: "create index using gist ..." on large table never completes.

Lists: Postg토토SQL : Postg토토SQL 메일 링리스트 : 2016-03-30 이후 PGSQL-BUGS 08:38
From: tsingle(at)muddypaddock(dot)com
To: pgsql-bugs(at)postgresql(dot)org
Subject: BUG #14054: "create index using gist ..." on large table never completes.
Date: 2016-03-30 08:38:19
Message-ID: 20160330083819.8934.34370@wrigleys.postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: Postg토토SQL : Postg토토SQL 메일 링리스트 : 2016-03-30 이후 PGSQL-BUGS 08:38

The following bug has been logged on the website:

Bug reference: 14054
Logged by: Tim Singletary
Email address: tsingle(at)muddypaddock(dot)com
PostgreSQL version: 9.5.1
Operating system: osx 10.11.4
Description:

"create index using gist ..." ran for over three days on 100 million row
table; completes within a couple hours on a 10 million row table.

Had previously tried same experiment with 9.2.3 on a Linux box where the 100
million row table's index built within 5 hours.

The table looks like

CREATE TABLE mock (
handle INT NOT NULL
,lowest INT NOT NULL
,highest INT NOT NULL
,stuff TEXT
);

Data was randomly generated.

The command that never completes is:

create index on mock using gist (handle,(int4range(lowest,highest,'[]')));


From: Emre Hasegeli <emre(at)hasegeli(dot)com>
To: tsingle(at)muddypaddock(dot)com
Cc: PostgreSQL Bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: BUG #14054: "create index using gist ..." on large table never completes.
Date: 2016-04-05 07:18:22
Message-ID: CAE2gYzzWjQdMgOKAMtMhgjODMPCWNcK-0LbJvTtSUjSxfB1oBw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

> Had previously tried same experiment with 9.2.3 on a Linux box where the 100
> million row table's index built within 5 hours.

It finished in an hour on my laptop for serially generated 100 million
rows. I guess the regression is caused by too many overlapping
values. GiST doesn't tend to perform well in this case. I would give
SP-GiST a try.


From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: tsingle(at)muddypaddock(dot)com
Cc: Pg Bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: BUG #14054: "create index using gist ..." on large table never completes.
Date: 2016-04-07 02:21:03
Message-ID: CAMkU=1xnnJtCpj5WGka_Omz=FFnWJEYkJjkVgBoOKN7ivZpWjQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

On Wed, Mar 30, 2016 at 1:38 AM, <tsingle(at)muddypaddock(dot)com> wrote:
> The following bug has been logged on the website:
>
> Bug reference: 14054
> Logged by: Tim Singletary
> Email address: tsingle(at)muddypaddock(dot)com
> PostgreSQL version: 9.5.1
> Operating system: osx 10.11.4
> Description:
>
> "create index using gist ..." ran for over three days on 100 million row
> table; completes within a couple hours on a 10 million row table.
>
> Had previously tried same experiment with 9.2.3 on a Linux box where the 100
> million row table's index built within 5 hours.
>
> The table looks like
>
> CREATE TABLE mock (
> handle INT NOT NULL
> ,lowest INT NOT NULL
> ,highest INT NOT NULL
> ,stuff TEXT
> );
>
> Data was randomly generated.

"Randomly generated" covers a lot of territory and is not sufficient
to be reproducible. Please provide the random generator.

Thanks,

Jeff


From: Tim Singletary <tsingle(at)muddypaddock(dot)com>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: Tim Singletary <tsingle(at)muddypaddock(dot)com>, Pg Bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: BUG #14054: "create index using gist ..." on large table never completes.
Date: 2016-04-07 11:15:47
Message-ID: 045E8FF4-97EA-4D98-AA70-C0784874643F@muddypaddock.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

My data generator is at https://github.com/flightlesson/mockdata.git

Here’s how I load 100 million rows:

createdb mockdb
mockdatagenerator --create-table --nrows 100000000 | psql mockdb

After loading the data

CREATE INDEX ON mock USING gist (handle,(int4range(range_low,range_high,’[]')));

seems to take forever.

Thanks,

tim

> On Apr 6, 2016, at 10:21 PM, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:
>
> On Wed, Mar 30, 2016 at 1:38 AM, <tsingle(at)muddypaddock(dot)com> wrote:
>> The following bug has been logged on the website:
>>
>> Bug reference: 14054
>> Logged by: Tim Singletary
>> Email address: tsingle(at)muddypaddock(dot)com
>> PostgreSQL version: 9.5.1
>> Operating system: osx 10.11.4
>> Description:
>>
>> "create index using gist ..." ran for over three days on 100 million row
>> table; completes within a couple hours on a 10 million row table.
>>
>> Had previously tried same experiment with 9.2.3 on a Linux box where the 100
>> million row table's index built within 5 hours.
>>
>> The table looks like
>>
>> CREATE TABLE mock (
>> handle INT NOT NULL
>> ,lowest INT NOT NULL
>> ,highest INT NOT NULL
>> ,stuff TEXT
>> );
>>
>> Data was randomly generated.
>
> "Randomly generated" covers a lot of territory and is not sufficient
> to be reproducible. Please provide the random generator.
>
> Thanks,
>
> Jeff