Lists: | pgsql-bugs |
---|
From: | jitesh tiwari <jitesh120(at)gmail(dot)com> |
---|---|
To: | pgsql-bugs(at)postgresql(dot)org |
Subject: | fetching bytea (blob) data of 850 MB from psql client failed |
Date: | 2022-08-22 07:14:17 |
Message-ID: | CAOEdJKF4PFrHtH0OEn+Qmeqy-Z7UnNWUMAtrT+jVhTo3ozUjxw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-bugs |
*Hi all, *
*I have a use case in postgresql where I have inserted 500 rows in a table
with 2 columns as described below. *
*create table xyz (*
*id citext not null primary key, *
*col1 bytea*
*);*
*The table has 500 rows and each row has around 850 MB of data. The bytea
column data in the row has around 830 to 840 MB of data while citext column
data has 10 to to 15 bytes approximately.*
*When I try to fetch all rows with a select query or try to fetch a
single row with a select statement using the where clause for this
table, the Database throws the below error -*
*VERROR; invalid memory alloc request size 1764311653(File mcxt.c; Line
959;Routine palloc; )*
*The error above pointing the PostgreSQL database backend code.*
*May I know why I am getting this error? If I use array fetch using the
ODBC driver with fetchsize=1 (Assuming the application will fetch 1 record
at a time from the PostgreSQL database server) then also I get that error.
Is there any Server configuration which can control this **memory**
allocation error and allow me to fetch one record at time from the
PostgreSQL database table?*
*Please suggest if it is a known issue or limitation in postgresql backend
code. If so please point to the documentation link for the same. If there
is any workaround as well please update me.*
Regards,
Jitesh
From: | John Naylor <john(dot)naylor(at)enterprisedb(dot)com> |
---|---|
To: | jitesh tiwari <jitesh120(at)gmail(dot)com> |
Cc: | pgsql-bugs(at)postgresql(dot)org |
Subject: | Re: fetching bytea (blob) data of 850 MB from psql client failed |
Date: | 2022-08-22 10:52:29 |
Message-ID: | CAFBsxsExRqZjvkMNBLgS8D-5BUBG6JYhMsqngy=DB5cZOYBV=A@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-bugs |
On Mon, Aug 22, 2022 at 4:35 PM jitesh tiwari <jitesh120(at)gmail(dot)com> wrote:
>
> Hi all,
>
> I have a use case in postgresql where I have inserted 500 rows in a table with 2 columns as described below.
> create table xyz (
> id citext not null primary key,
> col1 bytea
> );
>
> The table has 500 rows and each row has around 850 MB of data. The bytea column data in the row has around 830 to 840 MB of data while citext column data has 10 to to 15 bytes approximately.
> When I try to fetch all rows with a select query or try to fetch a single row with a select statement using the where clause for this table, the Database throws the below error -
>
> VERROR; invalid memory alloc request size 1764311653(File mcxt.c; Line 959;
> Routine palloc; )
The request size is about 2x bigger than 880MB, which is what we would
expect when outputting using the default setting for "bytea_output",
which is "hex". Hex uses two bytes of text to represent each byte:
/docs/devel/datatype-binary.html
Unfortunately this retrieval problem has been encountered before, but
is not documented that I can see:
/message-id/flat/13120.1518793109%40sss.pgh.pa.us#81d92e9e7e15975e606bdb349ec573d4
If the data is largely printable ASCII, then one thing to try is
setting "bytea_output" to "escape", if your driver and client can
handle that:
/docs/devel/runtime-config-client.html#GUC-BYTEA-OUTPUT
If the data is binary and not printable ASCII, then there is no easy
workaround. For this, possibly the "large object" facility could be
useful, or storing the data in a regular file with the path stored in
the database.
--
John Naylor
EDB: http://www.enterprisedb.com
From: | jitesh tiwari <jitesh120(at)gmail(dot)com> |
---|---|
To: | John Naylor <john(dot)naylor(at)enterprisedb(dot)com> |
Cc: | pgsql-bugs(at)postgresql(dot)org |
Subject: | Re: fetching bytea (blob) data of 850 MB from psql client failed |
Date: | 2022-08-29 06:01:48 |
Message-ID: | CAOEdJKGWw1_f5HvN6YD=0XECRRjAw1Hp5hb=AmTxoVe04VTHKA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-bugs |
Hi John,
Thanks for those details.
The database table for bytea columns has binary data data and some are not
text. Hence, when it tried to fetch by changing the bytea_output to
'escape' it failed again with the below error -
VERROR; result of bytea output conversion is too large(File varlena.c; Line
415; Routine byteaout; ).
BTW, is there any plan to fix this issue in the PostgreSQL backend code in
near future?
Regards,
Jitesh
On Mon, Aug 22, 2022 at 4:22 PM John Naylor <john(dot)naylor(at)enterprisedb(dot)com>
wrote:
> On Mon, Aug 22, 2022 at 4:35 PM jitesh tiwari <jitesh120(at)gmail(dot)com> wrote:
> >
> > Hi all,
> >
> > I have a use case in postgresql where I have inserted 500 rows in a
> table with 2 columns as described below.
> > create table xyz (
> > id citext not null primary key,
> > col1 bytea
> > );
> >
> > The table has 500 rows and each row has around 850 MB of data. The bytea
> column data in the row has around 830 to 840 MB of data while citext column
> data has 10 to to 15 bytes approximately.
> > When I try to fetch all rows with a select query or try to fetch a
> single row with a select statement using the where clause for this
> table, the Database throws the below error -
> >
> > VERROR; invalid memory alloc request size 1764311653(File mcxt.c; Line
> 959;
> > Routine palloc; )
>
> The request size is about 2x bigger than 880MB, which is what we would
> expect when outputting using the default setting for "bytea_output",
> which is "hex". Hex uses two bytes of text to represent each byte:
>
> /docs/devel/datatype-binary.html
>
> Unfortunately this retrieval problem has been encountered before, but
> is not documented that I can see:
>
>
> /message-id/flat/13120.1518793109%40sss.pgh.pa.us#81d92e9e7e15975e606bdb349ec573d4
>
> If the data is largely printable ASCII, then one thing to try is
> setting "bytea_output" to "escape", if your driver and client can
> handle that:
>
>
> /docs/devel/runtime-config-client.html#GUC-BYTEA-OUTPUT
>
> If the data is binary and not printable ASCII, then there is no easy
> workaround. For this, possibly the "large object" facility could be
> useful, or storing the data in a regular file with the path stored in
> the database.
>
> --
> John Naylor
> EDB: http://www.enterprisedb.com
>
From: | John Naylor <john(dot)naylor(at)enterprisedb(dot)com> |
---|---|
To: | jitesh tiwari <jitesh120(at)gmail(dot)com> |
Cc: | pgsql-bugs(at)postgresql(dot)org |
Subject: | Re: fetching bytea (blob) data of 850 MB from psql client failed |
Date: | 2022-08-31 07:27:04 |
Message-ID: | CAFBsxsG=UcZo_pM2kLho3qPv67YBaWa7GuQ6iWb1g-5gLf+8BA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-bugs |
On Mon, Aug 29, 2022 at 1:02 PM jitesh tiwari <jitesh120(at)gmail(dot)com> wrote:
>
> Hi John,
> Thanks for those details.
> The database table for bytea columns has binary data data and some are not text. Hence, when it tried to fetch by changing the bytea_output to 'escape' it failed again with the below error -
> VERROR; result of bytea output conversion is too large(File varlena.c; Line 415; Routine byteaout; ).
> BTW, is there any plan to fix this issue in the PostgreSQL backend code in near future?
Unfortunately, there is no plan as there is no consensus on how to
approach the issue. It seems the possible actions are (easiest to
hardest):
1. Document that bytea is unsafe -- you can insert anything you like,
but you may be unable to retrieve it again.
2. Determine some maximum size that guarantees accessibility and
invent a GUC that by default prevents inserting larger values than
that.
3. Fix the issue properly.
#2 and #3 are challenging for reasons given in the thread I linked to above.
--
John Naylor
EDB: http://www.enterprisedb.com
From: | Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com> |
---|---|
To: | John Naylor <john(dot)naylor(at)enterprisedb(dot)com>, jitesh tiwari <jitesh120(at)gmail(dot)com> |
Cc: | pgsql-bugs(at)postgresql(dot)org |
Subject: | Re: fetching bytea (blob) data of 850 MB from psql client failed |
Date: | 2022-08-31 13:20:29 |
Message-ID: | e40f0c2d-68c7-dafd-2628-7eca2c62b425@enterprisedb.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-bugs |
On 8/31/22 09:27, John Naylor wrote:
> On Mon, Aug 29, 2022 at 1:02 PM jitesh tiwari <jitesh120(at)gmail(dot)com> wrote:
>>
>> Hi John,
>> Thanks for those details.
>> The database table for bytea columns has binary data data and some are not text. Hence, when it tried to fetch by changing the bytea_output to 'escape' it failed again with the below error -
>> VERROR; result of bytea output conversion is too large(File varlena.c; Line 415; Routine byteaout; ).
>> BTW, is there any plan to fix this issue in the PostgreSQL backend code in near future?
>
> Unfortunately, there is no plan as there is no consensus on how to
> approach the issue. It seems the possible actions are (easiest to
> hardest):
>
> 1. Document that bytea is unsafe -- you can insert anything you like,
> but you may be unable to retrieve it again.
> 2. Determine some maximum size that guarantees accessibility and
> invent a GUC that by default prevents inserting larger values than
> that.
> 3. Fix the issue properly.
>
> #2 and #3 are challenging for reasons given in the thread I linked to above.
>
I haven't tried, but wouldn't it be enough to fetch the data in smaller
chunks? The application would have to re-assemble that, of course.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
From: | Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> |
---|---|
To: | Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com> |
Cc: | John Naylor <john(dot)naylor(at)enterprisedb(dot)com>, jitesh tiwari <jitesh120(at)gmail(dot)com>, pgsql-bugs(at)postgresql(dot)org |
Subject: | Re: fetching bytea (blob) data of 850 MB from psql client failed |
Date: | 2022-08-31 15:40:22 |
Message-ID: | CAFj8pRBdHYSh5cGxs1NWd5yGAkb+E9gdBW3002DEs04y1VEGMg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-bugs |
st 31. 8. 2022 v 15:20 odesílatel Tomas Vondra <
tomas(dot)vondra(at)enterprisedb(dot)com> napsal:
>
>
> On 8/31/22 09:27, John Naylor wrote:
> > On Mon, Aug 29, 2022 at 1:02 PM jitesh tiwari <jitesh120(at)gmail(dot)com>
> wrote:
> >>
> >> Hi John,
> >> Thanks for those details.
> >> The database table for bytea columns has binary data data and some are
> not text. Hence, when it tried to fetch by changing the bytea_output to
> 'escape' it failed again with the below error -
> >> VERROR; result of bytea output conversion is too large(File varlena.c;
> Line 415; Routine byteaout; ).
> >> BTW, is there any plan to fix this issue in the PostgreSQL backend code
> in near future?
> >
> > Unfortunately, there is no plan as there is no consensus on how to
> > approach the issue. It seems the possible actions are (easiest to
> > hardest):
> >
> > 1. Document that bytea is unsafe -- you can insert anything you like,
> > but you may be unable to retrieve it again.
> > 2. Determine some maximum size that guarantees accessibility and
> > invent a GUC that by default prevents inserting larger values than
> > that.
> > 3. Fix the issue properly.
> >
> > #2 and #3 are challenging for reasons given in the thread I linked to
> above.
> >
>
> I haven't tried, but wouldn't it be enough to fetch the data in smaller
> chunks? The application would have to re-assemble that, of course.
>
>
yes - there are possible both direction conversions to LO. So anybody can
convert any bytea to temp LO, and then it can download to any client
Regards
Pavel
>
> regards
>
> --
> Tomas Vondra
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>
>
>
From: | Noah Misch <noah(at)leadboat(dot)com> |
---|---|
To: | jitesh tiwari <jitesh120(at)gmail(dot)com>, John Naylor <john(dot)naylor(at)enterprisedb(dot)com> |
Cc: | pgsql-bugs(at)postgresql(dot)org |
Subject: | Re: fetching bytea (blob) data of 850 MB from psql client failed |
Date: | 2022-09-01 04:39:13 |
Message-ID: | 20220901043913.GA1001056@rfd.leadboat.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-bugs |
On Mon, Aug 22, 2022 at 05:52:29PM +0700, John Naylor wrote:
> If the data is binary and not printable ASCII, then there is no easy
> workaround.
I would say the chief solution is to request binary transmission in your
application. This cuts your network traffic in half for retrieving a bytea,
so it's a good change even if the server improves someday. If you're using
libpq, search for show_binary_results in
/docs/devel/libpq-example.html to see an example of
doing this. I don't know of an equivalent option in ODBC; if there is none,
one could modify the ODBC driver to use the same protocol feature that libpq
uses.