Lists: | pgsql-sql |
---|
From: | jboes(at)nexcerpt(dot)com (Jeff Boes) |
---|---|
To: | pgsql-sql(at)postgresql(dot)org |
Subject: | Long-running DELETE...WHERE... |
Date: | 2002-01-14 18:27:50 |
Message-ID: | d40a65a1.0201141027.6b15bc97@posting.google.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-sql |
Why would a delete involving a subselect run so much longer than the
individual delete commands?
My situation: table A has 200,000 rows. I've made up a temporary table
which holds the single-valued primary key for 80,000 rows which I want
to delete.
DELETE FROM a WHERE id IN (select ID from tmp LIMIT 800);
runs for several minutes. But if I do
\o tmpfile
\t
SELECT 'DELETE FROM a WHERE id = ' || id || ';' from tmp limit 800;
\o
\i tmpfile
this completes in about 15 seconds, or 1/50 of the time for the
single-statement delete above.
In trying to optimize this process, I disabled all the relational
integrity triggers (foreign keys) involving the table, and then I
dropped all the indexes EXCEPT that of the primary key. All the
experiments were done within a single transaction using BEGIN. The
database version is 7.1.3, and the table was vacuumed very recently.
From: | Jochem van Dieten <nomail(at)dev(dot)null> |
---|---|
To: | pgsql-sql(at)postgresql(dot)org |
Subject: | Re: Long-running DELETE...WHERE... |
Date: | 2002-01-14 22:47:37 |
Message-ID: | 3C436009.40204@dev.null |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-sql |
Jeff Boes wrote:
> Why would a delete involving a subselect run so much longer than the
> individual delete commands?
Some SQL statements are faster than others. Try to rewrite your query to
use EXISTS instead of IN and see if it makes any difference. I would
recommend using EXPLAIN to find out what PostgreSQL is really doing and
optimize from there.
jochem
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | jboes(at)nexcerpt(dot)com (Jeff Boes) |
Cc: | pgsql-sql(at)postgresql(dot)org |
Subject: | Re: Long-running DELETE...WHERE... |
Date: | 2002-01-15 01:21:10 |
Message-ID: | 21141.1011057670@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-sql |
jboes(at)nexcerpt(dot)com (Jeff Boes) writes:
> My situation: table A has 200,000 rows. I've made up a temporary table
> which holds the single-valued primary key for 80,000 rows which I want
> to delete.
> DELETE FROM a WHERE id IN (select ID from tmp LIMIT 800);
> runs for several minutes.
WHERE ... IN ... is notoriously inefficient. I'd try
DELETE FROM aa WHERE id = tmp.id;
which is not standard SQL but should be able to produce a decent plan.
You might find that a VACUUM ANALYZE on both tables beforehand would be
a good idea, too; never-vacuumed temp tables have some default
statistics assumed that are a lot less than 80k rows.
regards, tom lane
From: | Chester Carlton Young <chestercyoung(at)yahoo(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-sql(at)postgresql(dot)org |
Subject: | Re: Long-running DELETE...WHERE... |
Date: | 2002-01-15 02:04:15 |
Message-ID: | 20020115020415.45801.qmail@web12702.mail.yahoo.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-sql |
Nice syntax. Could not find in doc. Do you have any ideas where I
could find it?
--- Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> jboes(at)nexcerpt(dot)com (Jeff Boes) writes:
> > My situation: table A has 200,000 rows. I've made up a temporary
> table
> > which holds the single-valued primary key for 80,000 rows which I
> want
> > to delete.
> > DELETE FROM a WHERE id IN (select ID from tmp LIMIT 800);
> > runs for several minutes.
>
> WHERE ... IN ... is notoriously inefficient. I'd try
>
> DELETE FROM aa WHERE id = tmp.id;
>
> which is not standard SQL but should be able to produce a decent
> plan.
>
> You might find that a VACUUM ANALYZE on both tables beforehand would
> be
> a good idea, too; never-vacuumed temp tables have some default
> statistics assumed that are a lot less than 80k rows.
>
> regards, tom lane
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 3: if posting/reading through Usenet, please send an appropriate
> subscribe-nomail command to majordomo(at)postgresql(dot)org so that your
> message can get through to the mailing list cleanly
__________________________________________________
Do You Yahoo!?
Send FREE video emails in Yahoo! Mail!
http://promo.yahoo.com/videomail/
From: | Frank Bax <fbax(at)sympatico(dot)ca> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | jboes(at)nexcerpt(dot)com (Jeff Boes), pgsql-sql(at)postgresql(dot)org |
Subject: | Re: Long-running DELETE...WHERE... |
Date: | 2002-01-15 19:22:50 |
Message-ID: | 3.0.6.32.20020115142250.009b3970@pop6.sympatico.ca |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-sql |
At 08:21 PM 1/14/02 -0500, Tom Lane wrote:
>jboes(at)nexcerpt(dot)com (Jeff Boes) writes:
>> My situation: table A has 200,000 rows. I've made up a temporary table
>> which holds the single-valued primary key for 80,000 rows which I want
>> to delete.
>> DELETE FROM a WHERE id IN (select ID from tmp LIMIT 800);
>> runs for several minutes.
>
>WHERE ... IN ... is notoriously inefficient. I'd try
>
> DELETE FROM aa WHERE id = tmp.id;
>
>which is not standard SQL but should be able to produce a decent plan.
Nice alternative. Is there an alternate format for this one:
DELETE FROM teamwork WHERE emp NOT IN
( SELECT DISTINCT emp FROM timesheet
WHERE lo_shift > (now()-'90days'::interval) )
Frank