Re: [PATCH] few fts functions for jsonb

Lists: pgsql-hackers
From: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: [PATCH] few fts functions for jsonb
Date: 2017-02-28 10:59:52
Message-ID: CA+q6zcWm_1Ygg5QOq0gYbnB_=zq7G51uexQt3QEgDJa0qQnPKw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi all

I would like to propose patch with a set of new small functions for fts in
case of
jsonb data type:

* to_tsvector(config, jsonb) - make a tsvector from all string values and
elements of jsonb object. To prevent the situation, when tsquery can find
a
phrase consisting of lexemes from two different values/elements, this
function will add an increment to position of each lexeme from every new
value/element.

* ts_headline(config, jsonb, tsquery, options) - generate a headline
directly
from jsonb object

Here are the examples how they work:

```
=# select to_tsvector('{"a": "aaa bbb", "b": ["ccc ddd"], "c": {"d": "eee
fff"}}'::jsonb);
to_tsvector
-------------------------------------------------
'aaa':1 'bbb':2 'ccc':4 'ddd':5 'eee':7 'fff':8
(1 row)

=# select ts_headline('english', '{"a": "aaa bbb", "b": {"c": "ccc
ddd"}}'::jsonb, tsquery('bbb & ddd & hhh'), 'StartSel = <, StopSel = >');
ts_headline
----------------------
aaa <bbb> ccc <ddd>
(1 row)
```

Any comments or suggestions?

Attachment Content-Type Size
jsonb_fts_v1.patch text/x-patch 12.9 KB

From: Oleg Bartunov <obartunov(at)gmail(dot)com>
To: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] few fts functions for jsonb
Date: 2017-02-28 12:21:39
Message-ID: CAF4Au4wB3qGsXQs86Yb-n3MiqycvhGRHBqn+SAxH9TGDfCg6Ag@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

The proposed patch looks not very important, but I consider it as an
important feature, which Oracle and Microsoft already have, that's why I
asked Dmitry to work on this and made it before feature freeze. My comments
follows below the post.

On Tue, Feb 28, 2017 at 1:59 PM, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>
wrote:

> Hi all
>
> I would like to propose patch with a set of new small functions for fts in
> case of
> jsonb data type:
>
> * to_tsvector(config, jsonb) - make a tsvector from all string values and
> elements of jsonb object. To prevent the situation, when tsquery can
> find a
> phrase consisting of lexemes from two different values/elements, this
> function will add an increment to position of each lexeme from every new
> value/element.
>
> * ts_headline(config, jsonb, tsquery, options) - generate a headline
> directly
> from jsonb object
>
> Here are the examples how they work:
>
> ```
> =# select to_tsvector('{"a": "aaa bbb", "b": ["ccc ddd"], "c": {"d": "eee
> fff"}}'::jsonb);
> to_tsvector
> -------------------------------------------------
> 'aaa':1 'bbb':2 'ccc':4 'ddd':5 'eee':7 'fff':8
> (1 row)
>
>
> =# select ts_headline('english', '{"a": "aaa bbb", "b": {"c": "ccc
> ddd"}}'::jsonb, tsquery('bbb & ddd & hhh'), 'StartSel = <, StopSel = >');
> ts_headline
> ----------------------
> aaa <bbb> ccc <ddd>
> (1 row)
> ```
>

> Any comments or suggestions?
>

1. add json support
2. Its_headline should returns the original json with highlighting. As a
first try the proposed ts_headline could be ok, probably need special
option.

> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>
>


From: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>
To: Oleg Bartunov <obartunov(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] few fts functions for jsonb
Date: 2017-03-10 16:13:41
Message-ID: CA+q6zcX0KWMXKzm1ujQ74nO4W1=bqq3LsHQMOy9U3rdma9izhw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> On 28 February 2017 at 19:21, Oleg Bartunov <obartunov(at)gmail(dot)com> wrote:
> 1. add json support

I've added json support for all functions.

> Its_headline should returns the original json with highlighting

Yes, I see now. I don't think it's worth it to add a special option for that
purpose, so I've just changed the implementation to return the original
json(b).

Attachment Content-Type Size
jsonb_fts_v2.patch text/x-patch 32.1 KB

From: Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>
To: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Oleg Bartunov <obartunov(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] few fts functions for jsonb
Date: 2017-03-21 02:03:02
Message-ID: 3ed3a754-3bd0-1788-befc-f651dbd8cf64@2ndQuadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 03/10/2017 11:13 AM, Dmitry Dolgov wrote:
> > On 28 February 2017 at 19:21, Oleg Bartunov <obartunov(at)gmail(dot)com
> <mailto:obartunov(at)gmail(dot)com>> wrote:
> > 1. add json support
>
> I've added json support for all functions.
>
> > Its_headline should returns the original json with highlighting
>
> Yes, I see now. I don't think it's worth it to add a special option
> for that
> purpose, so I've just changed the implementation to return the
> original json(b).
>

This is a pretty good idea.

However, I think it should probably be broken up into a couple of pieces
- one for the generic json/jsonb transforms infrastructure (which
probably needs some more comments) and one for the FTS functions that
will use it.

cheers

andrew

--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


From: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>
To: Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>
Cc: Oleg Bartunov <obartunov(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] few fts functions for jsonb
Date: 2017-03-21 22:28:26
Message-ID: CA+q6zcV=K6pdc+n4A_9TWoEHTnxK0a7tK-eDA_Yqm10wDm7LuA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> On 21 March 2017 at 03:03, Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>
wrote:
>
> However, I think it should probably be broken up into a couple of pieces -
> one for the generic json/jsonb transforms infrastructure (which probably
> needs some more comments) and one for the FTS functions that will use it.

Sure, here are two patches with separated functionality and a bit more
commentaries for the transform functions.

Attachment Content-Type Size
jsonb_fts_support_v1.patch text/x-patch 9.1 KB
jsonb_fts_functions_v1.patch text/x-patch 23.8 KB

From: Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>
To: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>
Cc: Oleg Bartunov <obartunov(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] few fts functions for jsonb
Date: 2017-03-23 22:04:16
Message-ID: 9c97e73b-fa84-0711-b408-3a6e2d89ea96@2ndQuadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 03/21/2017 06:28 PM, Dmitry Dolgov wrote:
> > On 21 March 2017 at 03:03, Andrew Dunstan
> <andrew(dot)dunstan(at)2ndquadrant(dot)com
> <mailto:andrew(dot)dunstan(at)2ndquadrant(dot)com>> wrote:
> >
> > However, I think it should probably be broken up into a couple of
> pieces -
> > one for the generic json/jsonb transforms infrastructure (which probably
> > needs some more comments) and one for the FTS functions that will
> use it.
>
> Sure, here are two patches with separated functionality and a bit more
> commentaries for the transform functions.

I'm not through looking at this. However, here are a few preliminary
comments

* we might need to rationalize the header locations a bit
* iterate_json(b) and transform_json(b) are a bit too generally named.
Really what they do is iterate over or transform string values in
the json(b). They ignore / preserve the structure, keys, and
non-string scalar values in the json(b). A general iterate or
transform function would be called in effect with a stream of all
the elements in the json, not just scalar strings.
* Unless I'm missing something the iterate_json(b)_values return value
is ignored. Instead of returning the state it looks to me like it
should return nothing and be declared as void instead of void *
* transform_jsonb and transform_json are somewhat asymmetrical. The
latter should probably return a text* instead of a StringInfo, to be
consistent with the former.

cheers

andrew

--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


From: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>
To: Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>
Cc: Oleg Bartunov <obartunov(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] few fts functions for jsonb
Date: 2017-03-26 21:57:10
Message-ID: CA+q6zcWKqo46Z7n1AzRoT8U3vA6hJWQbu1_iO7SGcVvVArMSvw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> I'm not through looking at this. However, here are a few preliminary
comments

I've attached new versions of the patches with improvements related to
these commentaries.

Attachment Content-Type Size
jsonb_fts_functions_v2.patch text/x-patch 23.8 KB
jsonb_fts_support_v2.patch text/x-patch 9.2 KB

From: Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>
To: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>
Cc: Oleg Bartunov <obartunov(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] few fts functions for jsonb
Date: 2017-03-29 16:28:08
Message-ID: CAA8=A7_mGqeFmrkKLqV3VdaEy=QRY+PFg_n+3+KBoyRybXE=kw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 26 March 2017 at 17:57, Dmitry Dolgov <9erthalion6(at)gmail(dot)com> wrote:
>> I'm not through looking at this. However, here are a few preliminary
>> comments
>
> I've attached new versions of the patches with improvements related to these
> commentaries.

These patches seem fundamentally OK. But I'm still not happy with the
naming etc.

I think the header changes would probably be better placed in
jsonapi.h or in a new header file.

And the names still seem too general to me. e.g. transform_json_values
should probably be transform_json_string_values, and the static
support functions should be renamed to match. Also the
JsonIterateAction and JsonTransformAction funtion typedefs should
probably be renamed to match.

I'm not sure there is any great point in the is_jsonb_data macro,
which is only used in one spot. I would get rid of it and expand the
test in place.

I don't have much time this week to work on it, as there are one or
two other patches I also want to look at. If you clean these things
up I will commit it. The second patch looks fine.

cheers

andrew

--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


From: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>
To: Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>
Cc: Oleg Bartunov <obartunov(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] few fts functions for jsonb
Date: 2017-03-29 20:19:10
Message-ID: CA+q6zcVz-8Jkvu7Ob6hBU_ZzE=T_LqyxDrgmuAjDJy1n+7z0pQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> On 29 March 2017 at 18:28, Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>
wrote:
>
> These patches seem fundamentally OK. But I'm still not happy with the
> naming etc.

I've changed names for all functions and action definitions, moved out the
changes in header file to `jsonapi.h` and removed `is_jsonb_data` macro. So
it
should be better now.

Attachment Content-Type Size
jsonb_fts_support_v3.patch text/x-patch 9.6 KB
jsonb_fts_functions_v3.patch text/x-patch 23.9 KB

From: Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>
To: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>
Cc: Oleg Bartunov <obartunov(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] few fts functions for jsonb
Date: 2017-03-30 22:01:39
Message-ID: CAA8=A785XTOUkedF6j+fX7GPW4qXBr7D3_YV6Ge2tVpq5r4LkA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 29 March 2017 at 16:19, Dmitry Dolgov <9erthalion6(at)gmail(dot)com> wrote:
>> On 29 March 2017 at 18:28, Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>
>> wrote:
>>
>> These patches seem fundamentally OK. But I'm still not happy with the
>> naming etc.
>
> I've changed names for all functions and action definitions, moved out the
> changes in header file to `jsonapi.h` and removed `is_jsonb_data` macro. So
> it
> should be better now.

I have just noticed as I was writing/testing the non-existent docs for
this patch that it doesn't supply variants of to_tsvector that take a
regconfig as the first argument. Is there a reason for that? Why
should the json(b) versions be different from the text versions?

cheers

andrew

--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


From: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>
To: Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>
Cc: Oleg Bartunov <obartunov(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] few fts functions for jsonb
Date: 2017-03-31 03:43:01
Message-ID: CA+q6zcWd+tu5xuJP_guL0VB5nErohx8ttJbQJP_3jBwA398kyQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 31 March 2017 at 00:01, Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>
wrote:
>
> I have just noticed as I was writing/testing the non-existent docs for
> this patch that it doesn't supply variants of to_tsvector that take a
> regconfig as the first argument. Is there a reason for that? Why
> should the json(b) versions be different from the text versions?

No, there is no reason, I just missed that. Here is a new version of the
patch (only the functions part)
to add those variants.

Attachment Content-Type Size
jsonb_fts_functions_v4.patch text/x-patch 26.0 KB

From: Oleg Bartunov <obartunov(at)gmail(dot)com>
To: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>, Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>
Subject: Re: [PATCH] few fts functions for jsonb
Date: 2017-03-31 19:17:52
Message-ID: CAF4Au4wAxSOcUN41uVO03CAQ_P5bvM_ELMmi9BHoxBLQtiAw2Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 30 Mar 2017 23:43, "Dmitry Dolgov" <9erthalion6(at)gmail(dot)com> wrote:

On 31 March 2017 at 00:01, Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>
wrote:
>
> I have just noticed as I was writing/testing the non-existent docs for
> this patch that it doesn't supply variants of to_tsvector that take a
> regconfig as the first argument. Is there a reason for that? Why
> should the json(b) versions be different from the text versions?

No, there is no reason, I just missed that. Here is a new version of the
patch (only the functions part)
to add those variants.

Congratulations with patch committed, who will write an addition
documentation? I think we need to touch FTS and JSON parts.


From: Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>
To: obartunov(at)gmail(dot)com, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] few fts functions for jsonb
Date: 2017-04-01 20:20:46
Message-ID: 719f8817-14ba-a093-0cc3-2c9278023de3@2ndQuadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 03/31/2017 03:17 PM, Oleg Bartunov wrote:
>
>
> On 30 Mar 2017 23:43, "Dmitry Dolgov" <9erthalion6(at)gmail(dot)com
> <mailto:9erthalion6(at)gmail(dot)com>> wrote:
>
> On 31 March 2017 at 00:01, Andrew Dunstan
> <andrew(dot)dunstan(at)2ndquadrant(dot)com
> <mailto:andrew(dot)dunstan(at)2ndquadrant(dot)com>> wrote:
> >
> > I have just noticed as I was writing/testing the non-existent
> docs for
> > this patch that it doesn't supply variants of to_tsvector that
> take a
> > regconfig as the first argument. Is there a reason for that? Why
> > should the json(b) versions be different from the text versions?
>
> No, there is no reason, I just missed that. Here is a new version
> of the patch (only the functions part)
> to add those variants.
>
>
> Congratulations with patch committed, who will write an addition
> documentation? I think we need to touch FTS and JSON parts.

I added documentation when I committed it for the new functions, in the
FTS section. I'm not sure what we need to add to the JSON section if
anything.

cheers

andrew

--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


From: Andres Freund <andres(at)anarazel(dot)de>
To: Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>
Cc: obartunov(at)gmail(dot)com, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] few fts functions for jsonb
Date: 2017-04-03 18:22:33
Message-ID: 20170403182233.pesqcff4qa56wmj3@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2017-04-01 16:20:46 -0400, Andrew Dunstan wrote:
>
>
> On 03/31/2017 03:17 PM, Oleg Bartunov wrote:
> >
> >
> > On 30 Mar 2017 23:43, "Dmitry Dolgov" <9erthalion6(at)gmail(dot)com
> > <mailto:9erthalion6(at)gmail(dot)com>> wrote:
> >
> > On 31 March 2017 at 00:01, Andrew Dunstan
> > <andrew(dot)dunstan(at)2ndquadrant(dot)com
> > <mailto:andrew(dot)dunstan(at)2ndquadrant(dot)com>> wrote:
> > >
> > > I have just noticed as I was writing/testing the non-existent
> > docs for
> > > this patch that it doesn't supply variants of to_tsvector that
> > take a
> > > regconfig as the first argument. Is there a reason for that? Why
> > > should the json(b) versions be different from the text versions?
> >
> > No, there is no reason, I just missed that. Here is a new version
> > of the patch (only the functions part)
> > to add those variants.
> >
> >
> > Congratulations with patch committed, who will write an addition
> > documentation? I think we need to touch FTS and JSON parts.

> I added documentation when I committed it for the new functions, in the
> FTS section. I'm not sure what we need to add to the JSON section if
> anything.

I see that the CF entry for this hasn't been marked as committed:
https://commitfest.postgresql.org/13/1054/
Is there anything left here?

- Andres


From: "Sven R(dot) Kunze" <srkunze(at)mail(dot)de>
To: Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>, obartunov(at)gmail(dot)com, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] few fts functions for jsonb
Date: 2017-04-03 18:44:08
Message-ID: 98dfb391-d057-e78c-6248-0a3dc2f9e442@mail.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 01.04.2017 22:20, Andrew Dunstan wrote:
> I added documentation when I committed it for the new functions, in the
> FTS section. I'm not sure what we need to add to the JSON section if
> anything.

Not sure, if this is related but the formatting of
/docs/devel/static/functions-textsearch.html
looks a bit strange.

Just 2 questions/notes:
1) in what order are the values of the JSON extracted?

2) Regarding the additional line:
to_tsvector([ config regconfig , ] document json(b)) tsvector reduce
document text to tsvector to_tsvector('english', '{"a": "The Fat
Rats"}'::json) 'fat':2 'rat':3

Maybe change "reduce document text to tsvector" to "extracting JSON
values <in what order> and reduce to tsvector"?

Sven


From: Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: obartunov(at)gmail(dot)com, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] few fts functions for jsonb
Date: 2017-04-03 19:26:08
Message-ID: 1f15b7b5-bfb2-a878-8058-348d1b2e924a@2ndQuadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 04/03/2017 02:22 PM, Andres Freund wrote:
> On 2017-04-01 16:20:46 -0400, Andrew Dunstan wrote:
>>
>> On 03/31/2017 03:17 PM, Oleg Bartunov wrote:
>>>
>>> On 30 Mar 2017 23:43, "Dmitry Dolgov" <9erthalion6(at)gmail(dot)com
>>> <mailto:9erthalion6(at)gmail(dot)com>> wrote:
>>>
>>> On 31 March 2017 at 00:01, Andrew Dunstan
>>> <andrew(dot)dunstan(at)2ndquadrant(dot)com
>>> <mailto:andrew(dot)dunstan(at)2ndquadrant(dot)com>> wrote:
>>> >
>>> > I have just noticed as I was writing/testing the non-existent
>>> docs for
>>> > this patch that it doesn't supply variants of to_tsvector that
>>> take a
>>> > regconfig as the first argument. Is there a reason for that? Why
>>> > should the json(b) versions be different from the text versions?
>>>
>>> No, there is no reason, I just missed that. Here is a new version
>>> of the patch (only the functions part)
>>> to add those variants.
>>>
>>>
>>> Congratulations with patch committed, who will write an addition
>>> documentation? I think we need to touch FTS and JSON parts.
>> I added documentation when I committed it for the new functions, in the
>> FTS section. I'm not sure what we need to add to the JSON section if
>> anything.
> I see that the CF entry for this hasn't been marked as committed:
> https://commitfest.postgresql.org/13/1054/
> Is there anything left here?
>

Says "Status committed" for me. I fixed this in Sunday after Tom prodded me.

cheers

andrew

--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


From: Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>
To: "Sven R(dot) Kunze" <srkunze(at)mail(dot)de>, obartunov(at)gmail(dot)com, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] few fts functions for jsonb
Date: 2017-04-03 19:30:06
Message-ID: f95ca33b-aef6-60fc-ff10-f103c08228e6@2ndQuadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 04/03/2017 02:44 PM, Sven R. Kunze wrote:
> On 01.04.2017 22:20, Andrew Dunstan wrote:
>> I added documentation when I committed it for the new functions, in the
>> FTS section. I'm not sure what we need to add to the JSON section if
>> anything.
>
> Not sure, if this is related but the formatting of
> /docs/devel/static/functions-textsearch.html
> looks a bit strange.
>
> Just 2 questions/notes:
> 1) in what order are the values of the JSON extracted?

In the order they exist in the underlying document.

>
> 2) Regarding the additional line:
> to_tsvector([ config regconfig , ] document json(b)) tsvector
> reduce document text to tsvector to_tsvector('english', '{"a": "The
> Fat Rats"}'::json) 'fat':2 'rat':3
>
> Maybe change "reduce document text to tsvector" to "extracting JSON
> values <in what order> and reduce to tsvector"?
>
>

OK, I will do something along those lines.

cheers

andrew

--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


From: "Sven R(dot) Kunze" <srkunze(at)mail(dot)de>
To: Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>, obartunov(at)gmail(dot)com, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] few fts functions for jsonb
Date: 2017-04-03 19:41:54
Message-ID: eb944a89-a920-1449-1b4e-938a6dbb8675@mail.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 03.04.2017 21:30, Andrew Dunstan wrote:
> On 04/03/2017 02:44 PM, Sven R. Kunze wrote:
>> On 01.04.2017 22:20, Andrew Dunstan wrote:
>>> I added documentation when I committed it for the new functions, in the
>>> FTS section. I'm not sure what we need to add to the JSON section if
>>> anything.
>> Not sure, if this is related but the formatting of
>> /docs/devel/static/functions-textsearch.html
>> looks a bit strange.
>>
>> Just 2 questions/notes:
>> 1) in what order are the values of the JSON extracted?
> In the order they exist in the underlying document.

Just asking as the order can have implications for fulltext searches.
So, might be valuable for the docs.

Are these documents equally ordered in this sense?

srkunze=# select '{"a": "abc", "b": "def"}'::jsonb;
jsonb
--------------------------
{"a": "abc", "b": "def"}
(1 row)

srkunze=# select '{"b": "def", "a": "abc"}'::jsonb;
jsonb
--------------------------
{"a": "abc", "b": "def"}
(1 row)

Also what about non-ascii keys? Are they ordered by the default locale
of the PostgreSQL cluster (say de_DE.utf-8)?

>> 2) Regarding the additional line:
>> to_tsvector([ config regconfig , ] document json(b)) tsvector
>> reduce document text to tsvector to_tsvector('english', '{"a": "The
>> Fat Rats"}'::json) 'fat':2 'rat':3
>>
>> Maybe change "reduce document text to tsvector" to "extracting JSON
>> values <in what order> and reduce to tsvector"?
>>
>>
>
> OK, I will do something along those lines.
>
> cheers
>
> andrew
>


From: Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>
To: "Sven R(dot) Kunze" <srkunze(at)mail(dot)de>, obartunov(at)gmail(dot)com, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] few fts functions for jsonb
Date: 2017-04-03 19:52:25
Message-ID: b35782fb-5088-068e-9be0-f327e166052e@2ndQuadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 04/03/2017 03:41 PM, Sven R. Kunze wrote:
> On 03.04.2017 21:30, Andrew Dunstan wrote:
>> On 04/03/2017 02:44 PM, Sven R. Kunze wrote:
>>> On 01.04.2017 22:20, Andrew Dunstan wrote:
>>>> I added documentation when I committed it for the new functions, in
>>>> the
>>>> FTS section. I'm not sure what we need to add to the JSON section if
>>>> anything.
>>> Not sure, if this is related but the formatting of
>>> /docs/devel/static/functions-textsearch.html
>>> looks a bit strange.
>>>
>>> Just 2 questions/notes:
>>> 1) in what order are the values of the JSON extracted?
>> In the order they exist in the underlying document.
>
> Just asking as the order can have implications for fulltext searches.
> So, might be valuable for the docs.
>
>
> Are these documents equally ordered in this sense?
>
> srkunze=# select '{"a": "abc", "b": "def"}'::jsonb;
> jsonb
> --------------------------
> {"a": "abc", "b": "def"}
> (1 row)
>
> srkunze=# select '{"b": "def", "a": "abc"}'::jsonb;
> jsonb
> --------------------------
> {"a": "abc", "b": "def"}
> (1 row)
>

Yes, when converted to jsonb these two documents are identical.

>
> Also what about non-ascii keys? Are they ordered by the default locale
> of the PostgreSQL cluster (say de_DE.utf-8)?

Yes, I believe so.

cheers

andrew

--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services