Lists: | pgsql-bugs |
---|
From: | PG Bug reporting form <noreply(at)postgresql(dot)org> |
---|---|
To: | pgsql-bugs(at)lists(dot)postgresql(dot)org |
Cc: | ganguly(dot)04(at)gmail(dot)com |
Subject: | BUG #17450: SUBSTRING function extracting lesser characters than specified |
Date: | 2022-03-28 10:30:07 |
Message-ID: | 17450-69df48455c28fbea@postgresql.org |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-bugs |
The following bug has been logged on the website:
Bug reference: 17450
Logged by: Suman Ganguly
Email address: ganguly(dot)04(at)gmail(dot)com
PostgreSQL version: 10.17
Operating system: x86_64-pc-linux-gnu
Description:
select substring('123456', 0 , 5)
On running this, Postgres returns '1234'
Expecting '12345' to be returned as per the documentation
From: | hubert depesz lubaczewski <depesz(at)depesz(dot)com> |
---|---|
To: | ganguly(dot)04(at)gmail(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org |
Subject: | Re: BUG #17450: SUBSTRING function extracting lesser characters than specified |
Date: | 2022-03-28 11:01:20 |
Message-ID: | 20220328110119.GA28018@depesz.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-bugs |
On Mon, Mar 28, 2022 at 10:30:07AM +0000, PG Bug reporting form wrote:
> The following bug has been logged on the website:
>
> Bug reference: 17450
> Logged by: Suman Ganguly
> Email address: ganguly(dot)04(at)gmail(dot)com
> PostgreSQL version: 10.17
> Operating system: x86_64-pc-linux-gnu
> Description:
>
> select substring('123456', 0 , 5)
> On running this, Postgres returns '1234'
> Expecting '12345' to be returned as per the documentation
Well, the problem is that you're trying to provide character number 0,
which is leading to bad results.
substring works on base-1 numbering.
depesz
From: | Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com> |
---|---|
To: | depesz(at)depesz(dot)com, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | ganguly(dot)04(at)gmail(dot)com, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org> |
Subject: | Re: BUG #17450: SUBSTRING function extracting lesser characters than specified |
Date: | 2022-03-28 12:35:30 |
Message-ID: | CALT9ZEE-kh8KufDLsohpvOw5rG9eMQXYfMTdE9dt-gawe1F+5g@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-bugs |
пн, 28 мар. 2022 г. в 15:01, hubert depesz lubaczewski <depesz(at)depesz(dot)com>:
> On Mon, Mar 28, 2022 at 10:30:07AM +0000, PG Bug reporting form wrote:
> > The following bug has been logged on the website:
> >
> > Bug reference: 17450
> > Logged by: Suman Ganguly
> > Email address: ganguly(dot)04(at)gmail(dot)com
> > PostgreSQL version: 10.17
> > Operating system: x86_64-pc-linux-gnu
> > Description:
> >
> > select substring('123456', 0 , 5)
> > On running this, Postgres returns '1234'
> > Expecting '12345' to be returned as per the documentation
>
> Well, the problem is that you're trying to provide character number 0,
> which is leading to bad results.
>
> substring works on base-1 numbering.
>
Yes, the behavior expected in a report seems to be right. I've attached a
very small fix for that issue. Probably it should be backpatched into all
versions having 4bd3fad80e5c i.e. since v11.
--
Best regards,
Pavel Borisov
Postgres Professional: http://postgrespro.com <http://www.postgrespro.com>
Attachment | Content-Type | Size |
---|---|---|
0001-Fix-text_substring-for-correct-processing-of-zero-or.patch | application/octet-stream | 2.1 KB |
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com> |
Cc: | depesz(at)depesz(dot)com, ganguly(dot)04(at)gmail(dot)com, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org> |
Subject: | Re: BUG #17450: SUBSTRING function extracting lesser characters than specified |
Date: | 2022-03-28 13:55:21 |
Message-ID: | 703080.1648475721@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-bugs |
Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com> writes:
> Yes, the behavior expected in a report seems to be right. I've attached a
> very small fix for that issue. Probably it should be backpatched into all
> versions having 4bd3fad80e5c i.e. since v11.
The current calculation matches the SQL standard; this makes
it not match the standard. From SQL99:
3) If <character substring function> is specified, then:
a) Let C be the value of the <character value expression>, let
LC be the length of C, and let S be the value of the <start
position>.
b) If <string length> is specified, then let L be the value of
<string length> and let E be S+L. Otherwise, let E be the
larger of LC + 1 and S.
c) If either C, S, or L is the null value, then the result of
the <character substring function> is the null value.
d) If E is less than S, then an exception condition is raised:
data exception - substring error.
e) Case:
i) If S is greater than LC or if E is less than 1 (one), then
the result of the <character substring function> is a zero-
length string.
ii) Otherwise,
1) Let S1 be the larger of S and 1 (one). Let E1 be the
smaller of E and LC+1. Let L1 be E1-S1.
2) The result of the <character substring function> is
a character string containing the L1 characters of C
starting at character number S1 in the same order that
the characters appear in C.
Perhaps there's something to do here documentation-wise,
but there is no bug.
regards, tom lane
From: | Japin Li <japinli(at)hotmail(dot)com> |
---|---|
To: | Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com> |
Cc: | depesz(at)depesz(dot)com, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, ganguly(dot)04(at)gmail(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org |
Subject: | Re: BUG #17450: SUBSTRING function extracting lesser characters than specified |
Date: | 2022-03-28 14:00:06 |
Message-ID: | MEYP282MB1669BB4D28832DEB430F32C5B61D9@MEYP282MB1669.AUSP282.PROD.OUTLOOK.COM |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-bugs |
On Mon, 28 Mar 2022 at 20:35, Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com> wrote:
> пн, 28 мар. 2022 г. в 15:01, hubert depesz lubaczewski <depesz(at)depesz(dot)com>:
>
>> On Mon, Mar 28, 2022 at 10:30:07AM +0000, PG Bug reporting form wrote:
>> > The following bug has been logged on the website:
>> >
>> > Bug reference: 17450
>> > Logged by: Suman Ganguly
>> > Email address: ganguly(dot)04(at)gmail(dot)com
>> > PostgreSQL version: 10.17
>> > Operating system: x86_64-pc-linux-gnu
>> > Description:
>> >
>> > select substring('123456', 0 , 5)
>> > On running this, Postgres returns '1234'
>> > Expecting '12345' to be returned as per the documentation
>>
>> Well, the problem is that you're trying to provide character number 0,
>> which is leading to bad results.
>>
>> substring works on base-1 numbering.
>>
> Yes, the behavior expected in a report seems to be right. I've attached a
> very small fix for that issue. Probably it should be backpatched into all
> versions having 4bd3fad80e5c i.e. since v11.
It seems bytea_substring also has the problem.
diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c
index 22ab5a4329..a48f2fa2a7 100644
--- a/src/backend/utils/adt/varlena.c
+++ b/src/backend/utils/adt/varlena.c
@@ -3335,7 +3335,7 @@ bytea_substring(Datum str,
errmsg("negative substring length not allowed")));
L1 = -1; /* silence stupider compilers */
}
- else if (pg_add_s32_overflow(S, L, &E))
+ else if (pg_add_s32_overflow(S1, L, &E))
{
/*
* L could be large enough for S + L to overflow, in which case the
--
Regrads,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.
From: | Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | depesz(at)depesz(dot)com, ganguly(dot)04(at)gmail(dot)com, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org> |
Subject: | Re: BUG #17450: SUBSTRING function extracting lesser characters than specified |
Date: | 2022-03-28 14:00:15 |
Message-ID: | CALT9ZEHN6a+55vQM4p2NMjj0NJWgUFtCBVicEDufw-Zf=vLoUw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-bugs |
пн, 28 мар. 2022 г. в 17:55, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
> Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com> writes:
> > Yes, the behavior expected in a report seems to be right. I've attached a
> > very small fix for that issue. Probably it should be backpatched into all
> > versions having 4bd3fad80e5c i.e. since v11.
>
> The current calculation matches the SQL standard; this makes
> it not match the standard. From SQL99:
>
> 3) If <character substring function> is specified, then:
>
> a) Let C be the value of the <character value expression>, let
> LC be the length of C, and let S be the value of the <start
> position>.
>
> b) If <string length> is specified, then let L be the value of
> <string length> and let E be S+L. Otherwise, let E be the
> larger of LC + 1 and S.
>
> c) If either C, S, or L is the null value, then the result of
> the <character substring function> is the null value.
>
> d) If E is less than S, then an exception condition is raised:
> data exception - substring error.
>
> e) Case:
>
> i) If S is greater than LC or if E is less than 1 (one), then
> the result of the <character substring function> is a
> zero-
> length string.
>
> ii) Otherwise,
>
> 1) Let S1 be the larger of S and 1 (one). Let E1 be the
> smaller of E and LC+1. Let L1 be E1-S1.
>
> 2) The result of the <character substring function> is
> a character string containing the L1 characters of C
> starting at character number S1 in the same order that
> the characters appear in C.
>
> Perhaps there's something to do here documentation-wise,
> but there is no bug.
>
Ok, thanks for clarification!
--
Best regards,
Pavel Borisov
Postgres Professional: http://postgrespro.com <http://www.postgrespro.com>
From: | "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com> |
---|---|
To: | Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com> |
Cc: | Hubert Lubaczewski <depesz(at)depesz(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, ganguly(dot)04(at)gmail(dot)com, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org> |
Subject: | Re: BUG #17450: SUBSTRING function extracting lesser characters than specified |
Date: | 2022-03-28 14:09:23 |
Message-ID: | CAKFQuwZ9=nQojnCdk3nVqaPUH1CQGvg2+w_H+U_xp+1vzMoPJQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-bugs |
On Mon, Mar 28, 2022 at 5:35 AM Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com>
wrote:
> пн, 28 мар. 2022 г. в 15:01, hubert depesz lubaczewski <depesz(at)depesz(dot)com
> >:
>
>> On Mon, Mar 28, 2022 at 10:30:07AM +0000, PG Bug reporting form wrote:
>> > The following bug has been logged on the website:
>> >
>> > Bug reference: 17450
>> > Logged by: Suman Ganguly
>> > Email address: ganguly(dot)04(at)gmail(dot)com
>> > PostgreSQL version: 10.17
>> > Operating system: x86_64-pc-linux-gnu
>> > Description:
>> >
>> > select substring('123456', 0 , 5)
>> > On running this, Postgres returns '1234'
>> > Expecting '12345' to be returned as per the documentation
>>
>
You should reference the documentation you are basing your conclusion off
of for this kind of report.
This example in the documentation clearly demonstrates the 1-based nature
of the numbering:
substring('Thomas' from 2 for 3) → hom
as does this one:
substr('alphabet', 3, 2) → ph
/docs/current/functions-string.html
Oddly, I don't actually see a non-standard form of substring spelled that
way though indeed the example works.
Probably it should be backpatched into all versions having 4bd3fad80e5c
> i.e. since v11.
>
The behavior of the example command is identical both before and since v11
so I don't see how that commit has anything to do with this. Nor, as shown
above, does this contradict the documentation. The bug report is simply
wrong and you seem to have attempted to supply a fix without confirming it.
David J.