Re: RowDescription for a function does not include table OID

Lists: pgsql-generalpgsql-interfaces
From: Maxwell Dreytser <Maxwell(dot)Dreytser(at)assistek(dot)com>
To: "pgsql-interfaces(at)lists(dot)postgresql(dot)org" <pgsql-interfaces(at)lists(dot)postgresql(dot)org>
Subject: RowDescription for a function does not include table OID
Date: 2024-06-18 03:18:40
Message-ID: LV3PR16MB6001AF4D05893E9FF4B3C746E6CE2@LV3PR16MB6001.namprd16.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-interfaces

Hello,

I am working on a meta-programming use-case where I need to scrape some detailed information about the results of a function call that "RETURNS TABLE (LIKE physical_table)". Unfortunately RowDescription messages don't contain nearly enough information (no nullability). In pg_type the pg_proc.prorettype of this function shows up as a composite type with a valid typrelid. I am interested in getting this typrelid in the RowDescription field table OID field and the respective attribute number field. This would allow me to figure out all the necessary information by looking up in pg_attribute.

If this is something that might be accepted, I would be willing to work on a patch to implement this change.

Thank you,
Maxwell.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Maxwell Dreytser <Maxwell(dot)Dreytser(at)assistek(dot)com>
Cc: "pgsql-interfaces(at)lists(dot)postgresql(dot)org" <pgsql-interfaces(at)lists(dot)postgresql(dot)org>
Subject: Re: RowDescription for a function does not include table OID
Date: 2024-06-18 03:51:18
Message-ID: 3528232.1718682678@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-interfaces

Maxwell Dreytser <Maxwell(dot)Dreytser(at)assistek(dot)com> writes:
> I am working on a meta-programming use-case where I need to scrape
> some detailed information about the results of a function call that
> "RETURNS TABLE (LIKE physical_table)".

Hmm, I do not think that syntax means what you think it means ;-).
However, it seems to end up with prorettype =
'physical_table'::regtype anyway thanks to some special rules about
single-column output tables, so as far as I can see you should get
the table's composite type OID as the column type OID in the result
descriptor for "SELECT my_function(...)". Or is that not the case
you're concerned about?

> Unfortunately RowDescription messages don't contain nearly enough information (no nullability). In pg_type the pg_proc.prorettype of this function shows up as a composite type with a valid typrelid.

Right ...

> I am interested in getting this typrelid in the RowDescription field table OID field and the respective attribute number field. This would allow me to figure out all the necessary information by looking up in pg_attribute.

I'm confused about exactly what you're asking for, but (a) returning a
type OID where a relation OID is expected is absolutely not OK ---
there is no guarantee that those OID sets are distinct; (b) regardless
of that, you seem to be asking for a silent semantic change in the
wire protocol, which is going to be a very hard sell because it will
probably break more applications than it makes happy. Why can't you
get what you need from the composite type OID?

regards, tom lane


From: Maxwell Dreytser <Maxwell(dot)Dreytser(at)assistek(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "pgsql-interfaces(at)lists(dot)postgresql(dot)org" <pgsql-interfaces(at)lists(dot)postgresql(dot)org>
Subject: Re: RowDescription for a function does not include table OID
Date: 2024-06-18 04:28:34
Message-ID: LV3PR16MB6001CD31C5982777BA9BA25EE6CE2@LV3PR16MB6001.namprd16.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-interfaces

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:

>Hmm, I do not think that syntax means what you think it means ;-).

Its an interesting trick that I came across on DBA SE on a question named "How to use RETURNS TABLE with an existing table in PostgreSQL?".

>However, it seems to end up with prorettype =
>physical_table'::regtype anyway thanks to some special rules about
>single-column output tables, so as far as I can see you should get
>the table's composite type OID as the column type OID in the result
>descriptor for "SELECT my_function(...)".  Or is that not the case
>you're concerned about?

The query I am running is "SELECT * FROM my_function()". According to Wireshark I can see that the returned RowDescription shows 0 for Table OID and Column index:

PostgreSQL
Type: Row description
Length: 219
Field count: 7
Column name: table_id
Table OID: 0
Column index: 0
Type OID: 20
Column length: 8
Type modifier: -1
Format: Binary (1)
<snipped>

>I'm confused about exactly what you're asking for, but (a) returning a
>type OID where a relation OID is expected is absolutely not OK ---
>there is no guarantee that those OID sets are distinct; (b) regardless
>of that, you seem to be asking for a silent semantic change in the
>wire protocol, which is going to be a very hard sell because it will
>probably break more applications than it makes happy.  Why can't you
>get what you need from the composite type OID?

I would indeed like the relation OID to be returned for the composite type that is returned from the function. Maybe this can be simply considered a bug as it does seem like returning the relation OID that is clearly available would be the expected behaviour.

Regards,
Maxwell.


From: Maxwell Dreytser <Maxwell(dot)Dreytser(at)assistek(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "pgsql-interfaces(at)lists(dot)postgresql(dot)org" <pgsql-interfaces(at)lists(dot)postgresql(dot)org>
Subject: Re: RowDescription for a function does not include table OID
Date: 2024-06-20 01:22:23
Message-ID: LV3PR16MB6001216CE02B2D908F9D87ECE6C82@LV3PR16MB6001.namprd16.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-interfaces

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:

>Hmm, I do not think that syntax means what you think it means ;-).

Its an interesting trick that I came across on DBA SE on a question named "How to use RETURNS TABLE with an existing table in PostgreSQL?".

>However, it seems to end up with prorettype =
>physical_table'::regtype anyway thanks to some special rules about
>single-column output tables, so as far as I can see you should get
>the table's composite type OID as the column type OID in the result
>descriptor for "SELECT my_function(...)".  Or is that not the case
>you're concerned about?

The query I am running is "SELECT * FROM my_function()". According to Wireshark I can see that the returned RowDescription shows 0 for Table OID and Column index:

PostgreSQL
    Type: Row description
    Length: 219
    Field count: 7
        Column name: table_id
            Table OID: 0
            Column index: 0
            Type OID: 20
            Column length: 8
            Type modifier: -1
            Format: Binary (1)
<snipped>

>I'm confused about exactly what you're asking for, but (a) returning a
>type OID where a relation OID is expected is absolutely not OK ---
>there is no guarantee that those OID sets are distinct; (b) regardless
>of that, you seem to be asking for a silent semantic change in the
>wire protocol, which is going to be a very hard sell because it will
>probably break more applications than it makes happy.  Why can't you
>get what you need from the composite type OID?

I would like the relation OID to be returned for the composite type that is returned from the function.

Maybe this can be simply considered a bug as it does seem like returning the relation OID that is clearly available would be the expected behavior.

Regards,
Maxwell.


From: Maxwell Dreytser <Maxwell(dot)Dreytser(at)assistek(dot)com>
To: "pgsql-general(at)lists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: RowDescription for a function does not include table OID
Date: 2024-06-21 14:41:55
Message-ID: LV3PR16MB60014D610A726368182ACC7EE6C92@LV3PR16MB6001.namprd16.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-interfaces

Hello,

I am working on a meta-programming use-case where I need to scrape some detailed information about the results of a function that "RETURNS TABLE (LIKE physical_table)", which ends up with prorettype = 'physical_table'::regtype.
The problem is that for the query "SELECT * FROM my_function()" the RowDescription that is sent back shows 0 for Table OID and Column Index.

From Wireshark:
PostgreSQL
Type: Row description
Length: 219
Field count: 7
Column name: table_id
Table OID: 0
Column index: 0
Type OID: 20
Column length: 8
Type modifier: -1
Format: Binary (1)
<snipped>

I would expect that the Table OID contains the relation OID of this table, as it would do for a typical statement like "SELECT * FROM my_table". It would seem there is a bug here that is preventing PostgreSQL from connecting the dots.

Regards,
Maxwell.


From: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
To: Maxwell Dreytser <Maxwell(dot)Dreytser(at)assistek(dot)com>
Cc: "pgsql-general(at)lists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: RowDescription for a function does not include table OID
Date: 2024-06-21 14:48:16
Message-ID: CAKFQuwY76Jc=6JJOPzAt98RLuGCgCdNnS6syB0T4hF6iA=+uoQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-interfaces

On Fri, Jun 21, 2024 at 7:42 AM Maxwell Dreytser <
Maxwell(dot)Dreytser(at)assistek(dot)com> wrote:

> I am working on a meta-programming use-case where I need to scrape some
> detailed information about the results of a function that "RETURNS TABLE
> (LIKE physical_table)"
>

Yes, but the bug is yours. The definition you want is: RETURNS SETOF
physical_table (not tested though)

What you did was produce a one-column table whose column type is a
composite (and whose name is like - what with case-folding of unquoted
identifiers). Since that table doesn't exist anywhere in the catalogs it
has no TableOID.

David J.


From: Maxwell Dreytser <Maxwell(dot)Dreytser(at)assistek(dot)com>
To: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
Cc: "pgsql-general(at)lists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: RowDescription for a function does not include table OID
Date: 2024-06-21 15:03:58
Message-ID: LV3PR16MB6001FF31889B0B0DD4EF6476E6C92@LV3PR16MB6001.namprd16.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-interfaces

On Friday, June 21, 2024 10:48 AM David G. Johnston <david(dot)g(dot)johnston(at)gmail(dot)com>wrote:

>Yes, but the bug is yours.  The definition you want is:  RETURNS SETOF physical_table (not tested though)
>What you did was produce a one-column table whose column type is a composite (and whose name is like - what with case-folding of unquoted identifiers).  Since that table doesn't exist anywhere in the catalogs it has no TableOID.

SETOF also does not return correct RowDescription data. Table OID and column number are still both 0.
Both versions have the exact same pg_proc.prorettype. If I join this onto pg_type, the pg_type.typrelid = 'physical_table'::regclass.

Regards,
Maxwell


From: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
To: Maxwell Dreytser <Maxwell(dot)Dreytser(at)assistek(dot)com>
Cc: "pgsql-general(at)lists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: RowDescription for a function does not include table OID
Date: 2024-06-21 15:28:56
Message-ID: CAKFQuwbSJ3Muor4MELvASxmSa4JUhXjduaas1tMrpDKj1=QPGw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-interfaces

On Fri, Jun 21, 2024 at 8:04 AM Maxwell Dreytser <
Maxwell(dot)Dreytser(at)assistek(dot)com> wrote:

> On Friday, June 21, 2024 10:48 AM David G. Johnston <
> david(dot)g(dot)johnston(at)gmail(dot)com>wrote:
>
> >Yes, but the bug is yours. The definition you want is: RETURNS SETOF
> physical_table (not tested though)
> >What you did was produce a one-column table whose column type is a
> composite (and whose name is like - what with case-folding of unquoted
> identifiers). Since that table doesn't exist anywhere in the catalogs it
> has no TableOID.
>
> SETOF also does not return correct RowDescription data. Table OID and
> column number are still both 0.
> Both versions have the exact same pg_proc.prorettype. If I join this onto
> pg_type, the pg_type.typrelid = 'physical_table'::regclass.
>
>
Interesting, then I suppose it is semantics. There is no table involved -
you are referencing the type of that name, not the table - so no TableOID.
There is no guarantee the row you are holding came from a table - and I'd
interpret the current behavior as conveying that fact. Though the current
wording: "If the field can be identified as a column of a specific table,
the object ID of the table; otherwise zero."; and the observation that at
least a human "can identify" a related column, leads one to reasonably
infer the system should be able to make such an identification as well.

I would expect you'd be able to find the pg_type.oid value somewhere in the
RowDescription given those specifications, but not the pg_type.typrelid
value. But since the header has no allowance for a row type oid this
information does seem to be missing.

In short, the system doesn't generate the information you need, where you
need it, to tie these pieces together. Modifying existing elements of the
backend protocol is not presently in the cards.

David J.


From: Maxwell Dreytser <Maxwell(dot)Dreytser(at)assistek(dot)com>
To: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
Cc: "pgsql-general(at)lists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: RowDescription for a function does not include table OID
Date: 2024-06-21 15:41:09
Message-ID: LV3PR16MB60016206D56BF62B51D0568BE6C92@LV3PR16MB6001.namprd16.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: Postg배트맨 토토SQL Postg범퍼카 토토SQL :

On Friday, June 21, 2024 11:28 AM David G. Johnston <david(dot)g(dot)johnston(at)gmail(dot)com> wrote:

> Interesting, then I suppose it is semantics.  There is no table involved - you are referencing the type of that name, not the table - so no TableOID.  There is no guarantee the row you are holding came from a table - and I'd interpret the current behavior as conveying that fact.  Though the current wording: "If the field can be identified as a column of a specific table, the object ID of the table; otherwise zero."; and the observation that at least a human "can identify" a related column, leads one to reasonably infer the system should be able to make such an identification as well.

This is exactly my point. If the return type of the function is strongly linked (directly in the function schema) to the table according to pg_catalog, the field can obviously be tied to a specific column of that specific table. The RowDescription not having that value filled in is a violation of that promise.

> In short, the system doesn't generate the information you need, where you need it, to tie these pieces together.  Modifying existing elements of the backend protocol is not presently in the cards.

From my perspective this is clearly a bug as there is no way to define a function in a way that provides enough data to the reader.

Regards,
Maxwell.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Maxwell Dreytser <Maxwell(dot)Dreytser(at)assistek(dot)com>
Cc: "pgsql-general(at)lists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: RowDescription for a function does not include table OID
Date: 2024-06-21 15:51:18
Message-ID: 597261.1718985078@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-interfaces

Maxwell Dreytser <Maxwell(dot)Dreytser(at)assistek(dot)com> writes:
> I am working on a meta-programming use-case where I need to scrape some detailed information about the results of a function that "RETURNS TABLE (LIKE physical_table)", which ends up with prorettype = 'physical_table'::regtype.
> The problem is that for the query "SELECT * FROM my_function()" the RowDescription that is sent back shows 0 for Table OID and Column Index.

Yes, that's expected. You're selecting from a function, not a table.

> I would expect that the Table OID contains the relation OID of this
> table, as it would do for a typical statement like "SELECT * FROM
> my_table".

The PG wire protocol specification [1] defines these fields thus:

If the field can be identified as a column of a specific
table, the object ID of the table; otherwise zero.

If the field can be identified as a column of a specific
table, the attribute number of the column; otherwise zero.

My reading of that is that we should populate these fields only for
the case of direct selection from a table. If you go further than
that, then first off you have a ton of definitional issues (should it
"look through" views, for example?), and second you probably break
applications that are expecting the existing, longstanding definition.

regards, tom lane

[1] /docs/current/protocol-message-formats.html


From: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
To: Maxwell Dreytser <Maxwell(dot)Dreytser(at)assistek(dot)com>
Cc: "pgsql-general(at)lists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: RowDescription for a function does not include table OID
Date: 2024-06-21 15:52:00
Message-ID: CAKFQuwaoF+3VTmrxZ+xwifS6sSHbSpdLuSv2w=rRxiopWJUa_w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-interfaces

On Fri, Jun 21, 2024 at 8:41 AM Maxwell Dreytser <
Maxwell(dot)Dreytser(at)assistek(dot)com> wrote:

> On Friday, June 21, 2024 11:28 AM David G. Johnston <
> david(dot)g(dot)johnston(at)gmail(dot)com> wrote:
>
> > In short, the system doesn't generate the information you need, where
> you need it, to tie these pieces together. Modifying existing elements of
> the backend protocol is not presently in the cards.
>
> From my perspective this is clearly a bug as there is no way to define a
> function in a way that provides enough data to the reader.
>

Quick search turned up this prior thread:

/message-id/19323.1245720832%40sss.pgh.pa.us

Based upon that unargued point the only bug here is in the documentation,
leaving the reader to assume that some effort will be made to chain
together a function returns clause to a physical table through that table's
automatically-generated composite type. We don't and never will modify the
existing protocol message semantics in that respect.

David J.


From: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Maxwell Dreytser <Maxwell(dot)Dreytser(at)assistek(dot)com>, "pgsql-general(at)lists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: RowDescription for a function does not include table OID
Date: 2024-06-21 15:57:49
Message-ID: CAKFQuwZ+_Hmn2oq=DmKyi=h2RJdONLtqH_j_eUdH_oYML3idhA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-interfaces

On Fri, Jun 21, 2024 at 8:51 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

>
> The PG wire protocol specification [1] defines these fields thus:
>
> If the field can be identified as a column of a specific
> table, the object ID of the table; otherwise zero.
>
> If the field can be identified as a column of a specific
> table, the attribute number of the column; otherwise zero.
>
> My reading of that is that we should populate these fields only for
> the case of direct selection from a table.
>

s/can be identified as/is/g ?

Experience shows people are inferring a lot from "can be identified" so we
should remove it. "is" maybe over-simplifies a bit but in the correct
direction.

David J.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
Cc: Maxwell Dreytser <Maxwell(dot)Dreytser(at)assistek(dot)com>, "pgsql-general(at)lists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: RowDescription for a function does not include table OID
Date: 2024-06-21 16:12:43
Message-ID: 599780.1718986363@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-interfaces

"David G. Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com> writes:
> Based upon that unargued point the only bug here is in the documentation,
> leaving the reader to assume that some effort will be made to chain
> together a function returns clause to a physical table through that table's
> automatically-generated composite type.

Hmm, I read the documentation as making minimal promises about how
much effort will be expended, not maximal ones.

But in any case, I repeat the point that you can't open this can of
worms without having a lot of definitional slipperiness wriggle out.
Here is an example:

regression=# create table foo(a int, b int);
CREATE TABLE
regression=# create table bar(x int, y int, z int);
CREATE TABLE
regression=# create function f(int) returns setof foo stable
begin atomic select y, z from bar where x = $1; end;
CREATE FUNCTION

What labeling would you expect for "select * from f(...)",
and on what grounds? It is by no stretch of the imagination a
select from table foo. Moreover, the system has fully enough
information to perceive the query as a select from bar after
inlining the function call:

regression=# explain verbose select * from f(42);
QUERY PLAN
------------------------------------------------------------
Seq Scan on public.bar (cost=0.00..35.50 rows=10 width=8)
Output: bar.y, bar.z
Filter: (bar.x = 42)
(3 rows)

In fact, if we implemented this labeling at the tail end of
planning rather than early in parsing, it'd be fairly hard
to avoid labeling the output columns as bar.* rather than
foo.*. But we don't, and I'm not seeing an upside to
redefining how that works.

I've long forgotten the alleged JDBC connection that David
mentions, but it's surely just the tip of the iceberg of
client-side code that we could break if we change how this
works.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
Cc: Maxwell Dreytser <Maxwell(dot)Dreytser(at)assistek(dot)com>, "pgsql-general(at)lists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: RowDescription for a function does not include table OID
Date: 2024-06-21 16:17:46
Message-ID: 600217.1718986666@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general PostgreSQL : PostgreSQL

"David G. Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com> writes:
> On Fri, Jun 21, 2024 at 8:51 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> The PG wire protocol specification [1] defines these fields thus:
>> If the field can be identified as a column of a specific
>> table, the object ID of the table; otherwise zero.

> s/can be identified as/is/g ?

> Experience shows people are inferring a lot from "can be identified" so we
> should remove it. "is" maybe over-simplifies a bit but in the correct
> direction.

I dunno, that seems to me to be just as open to argument if not
more so. Perhaps some phrasing like "can be directly identified"?

The real point IMV is that it's based purely on parse analysis,
without looking into the behavior of views or functions (which
could change between parsing and execution, anyway).

regards, tom lane