Re: Long-running DELETE...WHERE...

Lists: pgsql-sql
From: jboes(at)nexcerpt(dot)com (Jeff Boes)
To: pgsql-sql(at)postgresql(dot)org
Subject: Long-running DELETE...WHERE...
Date: 2002-01-14 18:27:50
Message-ID: d40a65a1.0201141027.6b15bc97@posting.google.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-sql

Why would a delete involving a subselect run so much longer than the
individual delete commands?

My situation: table A has 200,000 rows. I've made up a temporary table
which holds the single-valued primary key for 80,000 rows which I want
to delete.

DELETE FROM a WHERE id IN (select ID from tmp LIMIT 800);

runs for several minutes. But if I do

\o tmpfile
\t
SELECT 'DELETE FROM a WHERE id = ' || id || ';' from tmp limit 800;
\o
\i tmpfile

this completes in about 15 seconds, or 1/50 of the time for the
single-statement delete above.

In trying to optimize this process, I disabled all the relational
integrity triggers (foreign keys) involving the table, and then I
dropped all the indexes EXCEPT that of the primary key. All the
experiments were done within a single transaction using BEGIN. The
database version is 7.1.3, and the table was vacuumed very recently.


From: Jochem van Dieten <nomail(at)dev(dot)null>
To: pgsql-sql(at)postgresql(dot)org
Subject: Re: Long-running DELETE...WHERE...
Date: 2002-01-14 22:47:37
Message-ID: 3C436009.40204@dev.null
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-sql

Jeff Boes wrote:

> Why would a delete involving a subselect run so much longer than the
> individual delete commands?

Some SQL statements are faster than others. Try to rewrite your query to
use EXISTS instead of IN and see if it makes any difference. I would
recommend using EXPLAIN to find out what PostgreSQL is really doing and
optimize from there.

jochem


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: jboes(at)nexcerpt(dot)com (Jeff Boes)
Cc: pgsql-sql(at)postgresql(dot)org
Subject: Re: Long-running DELETE...WHERE...
Date: 2002-01-15 01:21:10
Message-ID: 21141.1011057670@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-sql

jboes(at)nexcerpt(dot)com (Jeff Boes) writes:
> My situation: table A has 200,000 rows. I've made up a temporary table
> which holds the single-valued primary key for 80,000 rows which I want
> to delete.
> DELETE FROM a WHERE id IN (select ID from tmp LIMIT 800);
> runs for several minutes.

WHERE ... IN ... is notoriously inefficient. I'd try

DELETE FROM aa WHERE id = tmp.id;

which is not standard SQL but should be able to produce a decent plan.

You might find that a VACUUM ANALYZE on both tables beforehand would be
a good idea, too; never-vacuumed temp tables have some default
statistics assumed that are a lot less than 80k rows.

regards, tom lane


From: Chester Carlton Young <chestercyoung(at)yahoo(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-sql(at)postgresql(dot)org
Subject: Re: Long-running DELETE...WHERE...
Date: 2002-01-15 02:04:15
Message-ID: 20020115020415.45801.qmail@web12702.mail.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-sql

Nice syntax. Could not find in doc. Do you have any ideas where I
could find it?

--- Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> jboes(at)nexcerpt(dot)com (Jeff Boes) writes:
> > My situation: table A has 200,000 rows. I've made up a temporary
> table
> > which holds the single-valued primary key for 80,000 rows which I
> want
> > to delete.
> > DELETE FROM a WHERE id IN (select ID from tmp LIMIT 800);
> > runs for several minutes.
>
> WHERE ... IN ... is notoriously inefficient. I'd try
>
> DELETE FROM aa WHERE id = tmp.id;
>
> which is not standard SQL but should be able to produce a decent
> plan.
>
> You might find that a VACUUM ANALYZE on both tables beforehand would
> be
> a good idea, too; never-vacuumed temp tables have some default
> statistics assumed that are a lot less than 80k rows.
>
> regards, tom lane
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 3: if posting/reading through Usenet, please send an appropriate
> subscribe-nomail command to majordomo(at)postgresql(dot)org so that your
> message can get through to the mailing list cleanly

__________________________________________________
Do You Yahoo!?
Send FREE video emails in Yahoo! Mail!
http://promo.yahoo.com/videomail/


From: Frank Bax <fbax(at)sympatico(dot)ca>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: jboes(at)nexcerpt(dot)com (Jeff Boes), pgsql-sql(at)postgresql(dot)org
Subject: Re: Long-running DELETE...WHERE...
Date: 2002-01-15 19:22:50
Message-ID: 3.0.6.32.20020115142250.009b3970@pop6.sympatico.ca
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-sql

At 08:21 PM 1/14/02 -0500, Tom Lane wrote:
>jboes(at)nexcerpt(dot)com (Jeff Boes) writes:
>> My situation: table A has 200,000 rows. I've made up a temporary table
>> which holds the single-valued primary key for 80,000 rows which I want
>> to delete.
>> DELETE FROM a WHERE id IN (select ID from tmp LIMIT 800);
>> runs for several minutes.
>
>WHERE ... IN ... is notoriously inefficient. I'd try
>
> DELETE FROM aa WHERE id = tmp.id;
>
>which is not standard SQL but should be able to produce a decent plan.

Nice alternative. Is there an alternate format for this one:

DELETE FROM teamwork WHERE emp NOT IN
( SELECT DISTINCT emp FROM timesheet
WHERE lo_shift > (now()-'90days'::interval) )

Frank