Lists: | Postg토토 베이SQLPostg스포츠 토토 결과SQL |
---|
From: | "Matt Magoffin" <postgresql(dot)org(at)msqr(dot)us> |
---|---|
To: | pgsql-general(at)postgresql(dot)org |
Subject: | possible to create multivalued index from xpath() results in 8.3? |
Date: | 2007-11-19 07:52:22 |
Message-ID: | 52218.192.168.1.108.1195458742.squirrel@msqr.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-general pgsql-hackers |
I've working with XML in Postgres 8.3 and am trying to find a way to
create a text-based index from an XPath that returns multiple nodes. For
example, if I have an XPath like
/elem[(at)key="mykey"]/text()
which might return a few text nodes like
value1
value2
value3
I'd like 3 index values associated with this row's index key: {value1,
value2, value3). I was trying to figure out a way to define an index like
this but ran into a couple of issues:
1) The xpath() function returns an array of XML type, but in the above
example the text nodes are joined together into a single xml result node,
like {value1value2value3}. How can I get it to return 3 individual text
nodes, so an array of 3 values instead of 1?
2) Even if I could have an xpath() result return an array with multiple
values, like {value1,value2,value3} I wasn't able to define a GIN index
against the xml[] type. Should this be possible?
-- m@
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | "Matt Magoffin" <postgresql(dot)org(at)msqr(dot)us> |
Cc: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: possible to create multivalued index from xpath() results in 8.3? |
Date: | 2007-11-19 15:20:24 |
Message-ID: | 29692.1195485624@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-general pgsql-hackers |
"Matt Magoffin" <postgresql(dot)org(at)msqr(dot)us> writes:
> 2) Even if I could have an xpath() result return an array with multiple
> values, like {value1,value2,value3} I wasn't able to define a GIN index
> against the xml[] type. Should this be possible?
Dunno about your other questions, but the answer to this one is "no"
--- AFAICS there is no indexing support of any kind for the xml type
in 8.3. Try casting to text[] instead.
regards, tom lane
From: | "Matt Magoffin" <postgresql(dot)org(at)msqr(dot)us> |
---|---|
To: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: possible to create multivalued index from xpath() results in 8.3? |
Date: | 2007-11-19 19:00:31 |
Message-ID: | 53241.192.168.1.108.1195498831.squirrel@msqr.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-general pgsql-hackers |
> "Matt Magoffin" <postgresql(dot)org(at)msqr(dot)us> writes:
>> 2) Even if I could have an xpath() result return an array with multiple
>> values, like {value1,value2,value3} I wasn't able to define a GIN index
>> against the xml[] type. Should this be possible?
>
> Dunno about your other questions, but the answer to this one is "no"
> --- AFAICS there is no indexing support of any kind for the xml type
> in 8.3. Try casting to text[] instead.
I had tried that, but it does not actually work because of my first issue,
of a way for the XPath to return 3 individual text nodes. If I return 3
elements, like
/elem[(at)key="mykey"] => {
<elem key="mykey">value1</elem>,
<elem key="mykey">value2</elem>,
<elem key="mykey">value3</elem>
}
and cast that to text[], I get 3 XML strings, including the <elem></elem>.
I want only the element text content.
Should the xpath() function return 3 individual text nodes like this:
/elem[(at)key="mykey"]/text() => {
value1,
value2,
value3
}
rather than concatenating these into a single text node result? I also
tried something like
string(/elem[(at)key="mykey"])
but that throws an XPath error. It looks like the code is converting this
to /x/string(/elem[(at)key="mykey"] internally, which is not a valid XPath.
So if xpath() cannot return individual text node results, would a possible
solution be to write a trigger function that generates a tsvector from the
XML array, and then use text search to locate matches?
-- m@
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | "Matt Magoffin" <postgresql(dot)org(at)msqr(dot)us> |
Cc: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: possible to create multivalued index from xpath() results in 8.3? |
Date: | 2007-11-20 19:29:42 |
Message-ID: | 12829.1195586982@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-general pgsql-hackers |
"Matt Magoffin" <postgresql(dot)org(at)msqr(dot)us> writes:
> Should the xpath() function return 3 individual text nodes like this:
> /elem[(at)key="mykey"]/text() => {
> value1,
> value2,
> value3
> }
> rather than concatenating these into a single text node result?
AFAICT that's exactly what it does.
regression=# select xpath('//foo[(at)key="mykey"]/text()', '<value>ABC<foo key="mykey">XYZ</foo></value><foo key="mykey">RST</foo><foo>DEF</foo>');
xpath
-----------
{XYZ,RST}
(1 row)
regression=#
Of course this is of type xml[], but you can cast to text[] and then
index.
regards, tom lane
From: | "Matt Magoffin" <postgresql(dot)org(at)msqr(dot)us> |
---|---|
To: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: possible to create multivalued index from xpath() results in 8.3? |
Date: | 2007-11-21 03:19:15 |
Message-ID: | 51496.192.168.1.108.1195615155.squirrel@msqr.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-general pgsql-hackers |
> AFAICT that's exactly what it does.
>
> regression=# select xpath('//foo[(at)key="mykey"]/text()', '<value>ABC<foo
> key="mykey">XYZ</foo></value><foo key="mykey">RST</foo><foo>DEF</foo>');
> xpath
> -----------
> {XYZ,RST}
> (1 row)
>
> regression=#
>
> Of course this is of type xml[], but you can cast to text[] and then
> index.
Ugh, you're right of course! Somehow I had this wrong. So I tried to
create an index on the xml[] result by casting to text[] but I got the
"function must be immutable" error. Is there any reason the xml[] to
text[] cast is not immutable?
I worked around it by writing a function like
CREATE OR REPLACE FUNCTION xpath_to_text(xml_array xml[]) RETURNS text[] AS
$BODY$
BEGIN
RETURN xml_array::text[];
END;
$BODY$
LANGUAGE 'plpgsql' IMMUTABLE;
and wrapping my CREATE INDEX call with that, like:
create index type_flag_idx on lead using gin (
(xpath_to_text(xpath('/element[(at)key="foo"]/text()', xml)))
);
-- m@
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | "Matt Magoffin" <postgresql(dot)org(at)msqr(dot)us> |
Cc: | pgsql-general(at)postgresql(dot)org, Peter Eisentraut <peter_e(at)gmx(dot)net> |
Subject: | Re: possible to create multivalued index from xpath() results in 8.3? |
Date: | 2007-11-21 03:38:16 |
Message-ID: | 22363.1195616296@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | Postg토토 베이SQL Postg스포츠 토토 결과SQL |
"Matt Magoffin" <postgresql(dot)org(at)msqr(dot)us> writes:
> Ugh, you're right of course! Somehow I had this wrong. So I tried to
> create an index on the xml[] result by casting to text[] but I got the
> "function must be immutable" error. Is there any reason the xml[] to
> text[] cast is not immutable?
Hmm ... I see that xmltotext() is marked 'stable' in pg_proc.h,
but texttoxml() is marked 'immutable', which is at best inconsistent.
It looks to me like they both depend on the GUC setting "xmloption",
which would mean they should both be stable. Peter, is there a bug
there? Also, is there a way to get rid of the GUC dependency so that
there's a reasonably safe way to index XML values?
regards, tom lane
From: | Peter Eisentraut <peter_e(at)gmx(dot)net> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | "Matt Magoffin" <postgresql(dot)org(at)msqr(dot)us>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: [GENERAL] possible to create multivalued index from xpath() results in 8.3? |
Date: | 2007-11-21 12:20:07 |
Message-ID: | 200711211320.08713.peter_e@gmx.net |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-general pgsql-hackers |
Am Mittwoch, 21. November 2007 schrieb Tom Lane:
> "Matt Magoffin" <postgresql(dot)org(at)msqr(dot)us> writes:
> > Ugh, you're right of course! Somehow I had this wrong. So I tried to
> > create an index on the xml[] result by casting to text[] but I got the
> > "function must be immutable" error. Is there any reason the xml[] to
> > text[] cast is not immutable?
>
> Hmm ... I see that xmltotext() is marked 'stable' in pg_proc.h,
> but texttoxml() is marked 'immutable', which is at best inconsistent.
> It looks to me like they both depend on the GUC setting "xmloption",
> which would mean they should both be stable. Peter, is there a bug
> there?
Yeah, that doesn't look right.
> Also, is there a way to get rid of the GUC dependency so that
> there's a reasonably safe way to index XML values?
We could drop the dependency in xmltotext() with little loss of functionality.
The spec doesn't allow casts between xml and text (varchar) at all. The way
I appear to have derived the current behavior from the spec is that this is
interpreted as an implicit XMLSERIALIZE call in the context of a prepared
statement, which is defined to observe the XML option, as per clause 17.3
(part 14). This was the clostest piece of spec that described conversion
from xml to character types. Now with the xpath functionality, there is
certainly a strong use case for ignoring this altogether and just serializing
with the XML option set to "content".
--
Peter Eisentraut
http://developer.postgresql.org/~petere/
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Peter Eisentraut <peter_e(at)gmx(dot)net> |
Cc: | "Matt Magoffin" <postgresql(dot)org(at)msqr(dot)us>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: [GENERAL] possible to create multivalued index from xpath() results in 8.3? |
Date: | 2007-11-21 15:27:38 |
Message-ID: | 10361.1195658858@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-general pgsql-hackers |
Peter Eisentraut <peter_e(at)gmx(dot)net> writes:
> The spec doesn't allow casts between xml and text (varchar) at all. The way
> I appear to have derived the current behavior from the spec is that this is
> interpreted as an implicit XMLSERIALIZE call in the context of a prepared
> statement, which is defined to observe the XML option, as per clause 17.3
> (part 14). This was the clostest piece of spec that described conversion
> from xml to character types. Now with the xpath functionality, there is
> certainly a strong use case for ignoring this altogether and just serializing
> with the XML option set to "content".
Given the actual behavior of xmltotext_with_xmloption, it certainly
seems like a pretty useless error check. Also, xml_out doesn't behave
that way, so why should xmltotext?
The volatility markings of xml_in and texttoxml seem wrong too.
It looks to me like we need:
xml_in should be STABLE because it depends on xmloption
xml_recv ditto (OK already)
xml_out correctly(?) marked IMMUTABLE
xml_send is STABLE, OK because it depends on client_encoding
texttoxml should be STABLE because it depends on xmloption
xmltotext remove xmloption dependency, mark as IMMUTABLE
Should we force initdb to correct these pg_proc entries, or just quietly
change pg_proc.h?
regards, tom lane
From: | Peter Eisentraut <peter_e(at)gmx(dot)net> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | "Matt Magoffin" <postgresql(dot)org(at)msqr(dot)us>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: [GENERAL] possible to create multivalued index from xpath() results in 8.3? |
Date: | 2007-11-22 22:56:38 |
Message-ID: | 200711222356.38667.peter_e@gmx.net |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-general pgsql-hackers |
Tom Lane wrote:
> Given the actual behavior of xmltotext_with_xmloption, it certainly
> seems like a pretty useless error check. Also, xml_out doesn't behave
> that way, so why should xmltotext?
>
> The volatility markings of xml_in and texttoxml seem wrong too.
This is the patch that came out of it.
> Should we force initdb to correct these pg_proc entries, or just quietly
> change pg_proc.h?
Considering the extent of the changes, I'd be in favor of forcing an initdb.
--
Peter Eisentraut
http://developer.postgresql.org/~petere/
Attachment | Content-Type | Size |
---|---|---|
xml-casts.patch | text/x-diff | 5.2 KB |
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Peter Eisentraut <peter_e(at)gmx(dot)net> |
Cc: | "Matt Magoffin" <postgresql(dot)org(at)msqr(dot)us>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: [GENERAL] possible to create multivalued index from xpath() results in 8.3? |
Date: | 2007-11-23 06:19:05 |
Message-ID: | 22904.1195798745@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-general pgsql-hackers |
Peter Eisentraut <peter_e(at)gmx(dot)net> writes:
> Tom Lane wrote:
>> Should we force initdb to correct these pg_proc entries, or just quietly
>> change pg_proc.h?
> Considering the extent of the changes, I'd be in favor of forcing an initdb.
Well, if you're going to change the contents of pg_cast then there is
little choice. I was considering something less invasive ...
regards, tom lane
From: | Peter Eisentraut <peter_e(at)gmx(dot)net> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | "Matt Magoffin" <postgresql(dot)org(at)msqr(dot)us>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: [GENERAL] possible to create multivalued index from xpath() results in 8.3? |
Date: | 2007-11-26 12:53:41 |
Message-ID: | 200711261353.41889.peter_e@gmx.net |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-general pgsql-hackers |
Am Freitag, 23. November 2007 schrieb Tom Lane:
> Peter Eisentraut <peter_e(at)gmx(dot)net> writes:
> > Tom Lane wrote:
> >> Should we force initdb to correct these pg_proc entries, or just quietly
> >> change pg_proc.h?
> >
> > Considering the extent of the changes, I'd be in favor of forcing an
> > initdb.
>
> Well, if you're going to change the contents of pg_cast then there is
> little choice. I was considering something less invasive ...
I will hang on to this patch for a few more days to see if any invasive
catalog changes come out of the quote_literal/set_config discussion. If not,
I'll consider a less invasive solution.
--
Peter Eisentraut
http://developer.postgresql.org/~petere/
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Peter Eisentraut <peter_e(at)gmx(dot)net> |
Cc: | "Matt Magoffin" <postgresql(dot)org(at)msqr(dot)us>, pgsql-hackers(at)postgresql(dot)org |
Subject: | initdb for RC1 (was Re: [GENERAL] possible to create multivalued index from xpath() results in 8.3?) |
Date: | 2007-11-26 15:02:43 |
Message-ID: | 13773.1196089363@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-general pgsql-hackers |
Peter Eisentraut <peter_e(at)gmx(dot)net> writes:
>> Peter Eisentraut <peter_e(at)gmx(dot)net> writes:
> Tom Lane wrote:
>> Well, if you're going to change the contents of pg_cast then there is
>> little choice. I was considering something less invasive ...
> I will hang on to this patch for a few more days to see if any invasive
> catalog changes come out of the quote_literal/set_config discussion. If not,
> I'll consider a less invasive solution.
ATM it seems that consensus is to change quote_literal, so we may as
well adopt the cleaner solution for fixing the xml functions too.
Anyone out there who wants to argue against forcing initdb for RC1?
If so, better speak up now.
regards, tom lane