Re: Performance question

Lists: Postg사설 토토SQL
From: Anil Menon <gakmenon(at)gmail(dot)com>
To: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Performance question
Date: 2014-11-19 16:26:23
Message-ID: CAHzbRKc_=d8HeJ=w7ymCRHqxMUHK6xWe57Jm0WPb=_XyKqb15Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Hello,

I would like to ask from your experience which would be the best "generic"
method for checking if row sets of a certain condition exists in a PLPGSQL
function.

I know of 4 methods so far (please feel free to add if I missed out any
others)

1) get a count (my previous experience with ORCL shaped this option)

select count(*) into vcnt
from table
where <<condition>>
if vcnt >0 then
do X
else
do y
end if
Cons : It seems doing a count(*) is not the best option for PG

2) Use a non-count option
select primary_key_Col into vcnt
from table
where <<condition>>
if found then
do X
else
do y
end if
Cons :Some people seems not to prefer this as (AFAIU) it causes a
plpgsql->sql->plpgsql switches

3) using perform
perform primary_key_Col into vcnt
from table
where <<condition>>
if found then
do X
else
do y
end if

Seems to remove the above (item 2) issues (if any)

4) using exists
if exists ( select 1 from table where <<condition>> ) then
do x
else
do y
end if

My question is what would be the best (in terms of performance) method to
use? My gut feeling is to use option 4 for PG. Am I right or is there any
other method?

Thanks in advance
Anil


From: Francisco Olarte <folarte(at)peoplecall(dot)com>
To: Anil Menon <gakmenon(at)gmail(dot)com>
Cc: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Performance question
Date: 2014-11-19 18:21:00
Message-ID: CA+bJJbwzmxM4o6NN9i7TiY1P+c4Gz3KaE9iHLvMLs5TL1pRFGQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Hi Anil:

On Wed, Nov 19, 2014 at 5:26 PM, Anil Menon <gakmenon(at)gmail(dot)com> wrote:

> Cons : It seems doing a count(*) is not the best option for PG
>

For this and some of the following options, if you are going to just test
for existence, I would consider adding a limit 1 somewehere on the query,
to let the optimizer know you only need one and it will abort the scan on
first hit. Probabley not needed if you are going to give a query which uses
an unique index, but it shouldn't hurt.

Francisco Olarte.


From: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
To: Anil Menon <gakmenon(at)gmail(dot)com>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Performance question
Date: 2014-11-19 19:46:58
Message-ID: 546CF3B2.5070808@aklaver.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

On 11/19/2014 08:26 AM, Anil Menon wrote:
> Hello,
>
> I would like to ask from your experience which would be the best
> "generic" method for checking if row sets of a certain condition exists
> in a PLPGSQL function.
>
> I know of 4 methods so far (please feel free to add if I missed out any
> others)
>
> 1) get a count (my previous experience with ORCL shaped this option)
>
> select count(*) into vcnt
> from table
> where <<condition>>
> if vcnt >0 then
> do X
> else
> do y
> end if
> Cons : It seems doing a count(*) is not the best option for PG

Well that would depend on the table size, whether it was 100 rows vs
1,000,000 rows

>
> 2) Use a non-count option
> select primary_key_Col into vcnt
> from table
> where <<condition>>
> if found then
> do X
> else
> do y
> end if
> Cons :Some people seems not to prefer this as (AFAIU) it causes a
> plpgsql->sql->plpgsql switches

plpgsql is fairly tightly coupled to SQL, so I have not really seen any
problems. But then I am not working on large datasets.

>
> 3) using perform
> perform primary_key_Col into vcnt
> from table
> where <<condition>>
> if found then
> do X
> else
> do y
> end if
>
> Seems to remove the above (item 2) issues (if any)

AFAIK, you cannot do the above as written. PERFORM does not return a result:

http://www.postgresql.org/docs/9.3/interactive/plpgsql-statements.html#PLPGSQL-STATEMENTS-SQL-NORESULT

It would have to be more like:

perform primary_key_Col from table where <<condition>>

>
> 4) using exists
> if exists ( select 1 from table where <<condition>> ) then
> do x
> else
> do y
> end if
>
>
> My question is what would be the best (in terms of performance) method
> to use? My gut feeling is to use option 4 for PG. Am I right or is there
> any other method?

All of the above is context specific. To know for sure you will need to
test on actual data.

>
> Thanks in advance
> Anil

--
Adrian Klaver
adrian(dot)klaver(at)aklaver(dot)com


From: Anil Menon <gakmenon(at)gmail(dot)com>
To: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
Cc: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Performance question
Date: 2014-11-20 04:08:47
Message-ID: CAHzbRKf8d4c4OVGkDhryMmzbKh4MeCX6PDsyWW9Ws93GahDXbQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: Postg사설 토토SQL

Thanks Adrian

On Thu, Nov 20, 2014 at 3:46 AM, Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
wrote:

> On 11/19/2014 08:26 AM, Anil Menon wrote:
>
>> Hello,
>>
>> I would like to ask from your experience which would be the best
>> "generic" method for checking if row sets of a certain condition exists
>> in a PLPGSQL function.
>>
>> I know of 4 methods so far (please feel free to add if I missed out any
>> others)
>>
>> 1) get a count (my previous experience with ORCL shaped this option)
>>
>> select count(*) into vcnt
>> from table
>> where <<condition>>
>> if vcnt >0 then
>> do X
>> else
>> do y
>> end if
>> Cons : It seems doing a count(*) is not the best option for PG
>>
>
>
> Well that would depend on the table size, whether it was 100 rows vs
> 1,000,000 rows
>
>

​The table is estimated/guesstimated to be ~900 million rows (~30Ma day​,
90 days history, though initially it would be ~30M), though the <<where>>
part of the query would return between 0 and ~2 rows

>
>> 2) Use a non-count option
>> select primary_key_Col into vcnt
>> from table
>> where <<condition>>
>> if found then
>> do X
>> else
>> do y
>> end if
>> Cons :Some people seems not to prefer this as (AFAIU) it causes a
>> plpgsql->sql->plpgsql switches
>>
>
> plpgsql is fairly tightly coupled to SQL, so I have not really seen any
> problems. But then I am not working on large datasets.
>

​I think that ~900M rows would constitute a large data set most likely

>
>
>> 3) using perform
>> perform primary_key_Col into vcnt
>> from table
>> where <<condition>>
>> if found then
>> do X
>> else
>> do y
>> end if
>>
>> Seems to remove the above (item 2) issues (if any)
>>
>
> AFAIK, you cannot do the above as written. PERFORM does not return a
> result:
>
> http://www.postgresql.org/docs/9.3/interactive/plpgsql-
> statements.html#PLPGSQL-STATEMENTS-SQL-NORESULT
>
> It would have to be more like:
>
> perform primary_key_Col from table where <<condition>>
>
>
​You are absolutely right - my bad​.

>
>> 4) using exists
>> if exists ( select 1 from table where <<condition>> ) then
>> do x
>> else
>> do y
>> end if
>>
>>
>> My question is what would be the best (in terms of performance) method
>> to use? My gut feeling is to use option 4 for PG. Am I right or is there
>> any other method?
>>
>
> All of the above is context specific. To know for sure you will need to
> test on actual data.
>

​Absolutely right, just that I want to ensure that I follow the most
optimal method before the DB goes into production, after which priorities
change on what needs to be changed.​

>
>
> --
> Adrian Klaver
> adrian(dot)klaver(at)aklaver(dot)com
>

​I guess the best answer would be "its very context specific​", but picking
the brains of experienced resources helps :-)

Thanks again
Anil​


From: Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>
To: "Anil Menon *EXTERN*" <gakmenon(at)gmail(dot)com>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Performance question
Date: 2014-11-21 09:16:40
Message-ID: A737B7A37273E048B164557ADEF4A58B17DA3923@ntex2010a.host.magwien.gv.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Anil Menon wrote:
> I would like to ask from your experience which would be the best "generic" method for checking if row
> sets of a certain condition exists in a PLPGSQL function.
>
> I know of 4 methods so far (please feel free to add if I missed out any others)
[...]

Are you aware that all of these methods have a race condition unless
you use isolation level READ STABILITY or better?

It may be that rows are added or removed between the check and the
corresponding action.

Yours,
Laurenz Albe


From: Anil Menon <gakmenon(at)gmail(dot)com>
To: Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>
Cc: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Performance question
Date: 2014-11-21 10:32:40
Message-ID: CAHzbRKecduY-_YpUHur8p2wWW4XBvsP_4eLkVn7fiEKvbmMXbg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Thanks Laurenz, very good point!

Luckily (phew!) the business scenario is such that race conditions cannot
occur (and the transaction table is append only). There is business
workflow to address duplicates but
1) it occurs extremely rarely (it would be a deliberate sabotage if it
occurs)
2) there is no impact on business

Yours
Anil

On Fri, Nov 21, 2014 at 5:16 PM, Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>
wrote:

> Anil Menon wrote:
> > I would like to ask from your experience which would be the best
> "generic" method for checking if row
> > sets of a certain condition exists in a PLPGSQL function.
> >
> > I know of 4 methods so far (please feel free to add if I missed out any
> others)
> [...]
>
> Are you aware that all of these methods have a race condition unless
> you use isolation level READ STABILITY or better?
>
> It may be that rows are added or removed between the check and the
> corresponding action.
>
> Yours,
> Laurenz Albe
>