From: | Brice André <brice(at)famille-andre(dot)be> |
---|---|
To: | Achilleas Mantzios <achill(at)matrix(dot)gatewaynet(dot)com> |
Cc: | "pgsql-sql(at)postgresql(dot)org" <pgsql-sql(at)postgresql(dot)org> |
Subject: | Re: join tables by nearest timestamp |
Date: | 2017-11-02 09:31:35 |
Message-ID: | CAOBG12ko-YJMJq=rEw7R1W+81n-Ev4P_OXWvGW+R3Ga-7hFh-A@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-sql |
I am not sure I understand what you have in mind. If I want to keep the
'JOIN' mechanism, but limit the number of rows used in second table of join
thanks to a subquery, I need to use parameters of selected row of first
table of join in subquery used in second part of join, which seems possible
only with a 'JOIN LATERAL' mechanism (which is not available in postgresql
9.1).
Or maybe am I missing something in your proposal ?
2017-11-02 8:43 GMT+01:00 Achilleas Mantzios <achill(at)matrix(dot)gatewaynet(dot)com>:
> On 01/11/2017 15:12, Brice André wrote:
>
> After some tests, I have some performance issues with this solution. It
> seems that for each row that satisfies first event condition, all possible
> results of second join table search are tested.
>
> On my first attempts, this was reasonable because I tested on only one
> day. But request time increases exponentially with size of searched
> interval.
>
> As I have multi-row index (event type and timestamp) on this table, for
> me, an ideal request could only check two entries of second event type for
> each first event type entry. But with left outer join request, optimizer
> does not seem able to do that.
>
> Any idea on how to improve perfs?
>
>
> You may try with a subselect. First just query from the left table, and in
> the select list, append a subselect, at first returning only one column.
>
>
>
> Regards,
> Brice
>
> Le mer. 1 nov. 2017 à 09:47, Brice André <brice(at)famille-andre(dot)be> a
> écrit :
>
>> Many thanks Achilleas. I did not think to use an outer join in
>> combination with Distnct and order by clauses, which seems to be the key to
>> my problem.
>>
>> I slighly adapted your proposal to match my DB schema, but also to select
>> the real nearest point (and not the nearest one after). I defined a
>> function 'abs' that computes the absolute value from a timestamp and the
>> query looks like :
>>
>> SELECT DISTINCT ON (l1."ID") (l1."Time"-l2."Time") as time_diff,
>> l1.*,l2.* from
>> "KnxBusAccess" l1
>> LEFT OUTER JOIN
>> "KnxBusAccess" l2
>> ON ('t')
>> where
>> l1."ToGroupAddress" = '2/0/1' AND
>> l1."Time" >= (now()-interval '1 day') AND
>> l2."ToGroupAddress" = '2/5/1' AND
>> l2."Time" >= (now()-interval '1 day')
>> order by l1."ID", abs(l2."Time"-l1."Time")
>>
>> the "l1."Time" >= (now()-interval '1 day')" and "l2."Time" >=
>> (now()-interval '1 day')" are thereto use an index in order to limit the
>> values to an acceptable range (I have years of records and with an outer
>> join and without this, the query never finishes).
>>
>> Thannks, this solves my issue.
>>
>> Regards,
>> Brice
>>
>>
>>
>> 2017-11-01 9:11 GMT+01:00 Achilleas Mantzios <
>> achill(at)matrix(dot)gatewaynet(dot)com>:
>>
>>> On 01/11/2017 10:06, Achilleas Mantzios wrote:
>>>
>>>> On 01/11/2017 07:53, Brice André wrote:
>>>>
>>>>> Dear all,
>>>>>
>>>>> I am running a postgresql 9.1 server and I have a table containing
>>>>> events information with, for each entry, an event type, a timestamp, and
>>>>> additional information.
>>>>>
>>>>> I would want to write a query that would return all events of type
>>>>> 'a', but each returned entry should be associated to the neraest event of
>>>>> type 'b' (ideally, the nearest, non taking into account if it happened
>>>>> before or after, but if not possible, it could be the first happening just
>>>>> after).
>>>>>
>>>>> By searching on the web, I found a solution base on a "LEFT JOIN
>>>>> LATERAL", but this is not supported by postgresql 9.1 (and I cannot update
>>>>> my server) :
>>>>>
>>>>> SELECT *
>>>>> FROM
>>>>> (SELECT * FROM events WHERE type = 'a' ) as t1
>>>>> LEFT JOIN LATERAL
>>>>> (SELECT * FROM events WHERE type = 'b' AND timestamp >= t1.timestamp
>>>>> ORDER BY timestamp LIMIT 1) as t2
>>>>> ON TRUE;
>>>>>
>>>>> Any idea on how to adapt this query so that it runs on 9.1 ? Or any
>>>>> other idea on how to perform my query ?
>>>>>
>>>> smth like :
>>>>
>>>> SELECT l1.*,l2.logtime,l2.category,l2.username from logging l1 LEFT
>>>> OUTER JOIN logging l2 ON ('t') where l1.category='vsl.login' AND
>>>> (l2.category IS NULL OR l2.category='vsl.SpareCases') AND (l2.logtime IS
>>>> NULL OR l2.logtime>=l1.logtime) order by l1.logtime;
>>>>
>>> oopss, sorry I forgot, you'll have to add a DISTINCT ON and order by
>>> l2.logtime in order to have what you want :
>>>
>>> SELECT DISTINCT ON (l1.logtime,l1.category,l1.username,l1.action)
>>> l1.*,l2.logtime,l2.category,l2.username from logging l1 LEFT OUTER
>>> JOIN logging l2 ON ('t') where l1.category='vsl.login' AND (l2.category IS
>>> NULL OR l2.category='vsl.SpareCases') AND (l2.logtime IS NULL OR
>>> l2.logtime>=l1.logtime) order by l1.logtime,l1.category,l1.username,l1.action,l2.logtime;
>>>
>>>
>>>
>>>
>>>
>>>>
>>>>> Thanks in advance,
>>>>> Brice
>>>>>
>>>>
>>>>
>>>>
>>> --
>>> Achilleas Mantzios
>>> IT DEV Lead
>>> IT DEPT
>>> Dynacom Tankers Mgmt
>>>
>>>
>>>
>>> --
>>> Sent via pgsql-sql mailing list (pgsql-sql(at)postgresql(dot)org)
>>> To make changes to your subscription:
>>> http://www.postgresql.org/mailpref/pgsql-sql
>>>
>>
>>
> --
> Achilleas Mantzios
> IT DEV Lead
> IT DEPT
> Dynacom Tankers Mgmt
>
>
From | Date | Subject | |
---|---|---|---|
Next Message | Achilleas Mantzios | 2017-11-02 14:49:17 | Re: join tables by nearest timestamp |
Previous Message | Achilleas Mantzios | 2017-11-02 07:43:28 | Re: join tables by nearest timestamp |