From: | Miles Elam <mileselam(at)gmail(dot)com> |
---|---|
To: | pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Format base - Code contribution |
Date: | 2018-04-23 00:23:42 |
Message-ID: | CAPVvHdNDttbpV9nAq-CPx+Ai7mEmHcagAC=DJ405sQp=zr3+NQ@mail.gmail.com |
Lists: | pgsql-hackers |
I would like to donate some code to the project, formatting numbers as any
base from 2 to 64. The FAQ describes contributions to the core code, but
it's possible contrib is a better target. This is, of course, all contingent
on how well received this extension code is. Code is available at
the following link.
https://github.com/ttfkam/pg_formatbase
I believe it follows the PostgreSQL project's C code formatting guidelines
and includes tests. Preliminary checks show it to be about as efficient as
the built-in hex formatting, except that it supports the full gamut of number
formatting: binary, ternary, octal, hex, base 36 (popular with JavaScript),
etc. I was scratching a personal itch, but hopefully this can scratch
others' as well.
Cheers,
Miles Elam
--
Quidquid latine dictum sit, altum sonatur.
- Whatever is said in Latin sounds profound.
From: | Craig Ringer <craig(at)2ndquadrant(dot)com> |
---|---|
To: | Miles Elam <mileselam(at)gmail(dot)com> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Format base - Code contribution |
Date: | 2018-04-26 03:03:55 |
Message-ID: | CAMsr+YHDXWqmnvy3JJBmuQfc+nPnmmWmc78qVU8O1As2pS=Q0g@mail.gmail.com |
Lists: | pgsql-hackers |
On 23 April 2018 at 08:23, Miles Elam <mileselam(at)gmail(dot)com> wrote:
> I would like to donate some code to the project, formatting numbers as any
> base from 2 to 64. The FAQ describes contributions to the core code, but
> it's possible contrib is a better target. This is, of course, all contingent
> on how well received this extension code is. Code is available at the
> following link.
>
> https://github.com/ttfkam/pg_formatbase
Personally, I think this is a better candidate for being incorporated
directly rather than as a contrib. This sort of utility is much less
useful if you cannot rely on it being present.
I'm not convinced by the wisdom of adding int8 overloads, etc, with a
second argument. I'd rather this be named as a separate function. I
realise that many programming languages do this, but it's IMO less
discoverable this way, and might make our life harder if we later need
to overload these functions in a different way.
We already have to_hex. So to_base seems a reasonable choice. Then
adding a from_hex, from_base seems natural. Bonus points if you add
to/from base64 and oct while you're at it.
We don't seem to have a "from_hex" or "int8_from_hex", which is a
bewildering oversight really, and we don't accept literals:
test=> select int8 0x1234;
ERROR: syntax error at or near "0"
LINE 1: select int8 0x1234;
^
test=> select int8 '0x1234';
ERROR: invalid input syntax for integer: "0x1234"
LINE 1: select int8 '0x1234';
... unless you abuse our nonstandard bitstring SQL extension:
test=> select x'1234';
?column?
------------------
0001001000110100
(1 row)
test=> select pg_typeof( x'1234' );
pg_typeof
-----------
bit
(1 row)
test=> select x'1234'::int8;
int8
------
4660
(1 row)
which won't work for anyone using bind parameters, so it's not that
handy really.
I'm also amused by
test=> select 0x1234;
x1234
-------
0
(1 row)
because of our willingness to accept a column label immediately after
a value, with no whitespace in between.
While I'm on that topic, I've never found anything that unquotes a
literal or identifier without going through the full parser, some sort
of unquote_literal. Guess I should find time to scratch that itch
myself soon.
--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
From: | Miles Elam <mileselam(at)gmail(dot)com> |
---|---|
To: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Format base - Code contribution |
Date: | 2018-05-01 21:29:29 |
Message-ID: | CAPVvHdMtkMJ-7+X7koio9i4rsvgKww6GwEyofkSrBOJd-jhFpQ@mail.gmail.com |
Lists: | pgsql-hackers |
Hi Craig, thanks for the reply.
On Wed, Apr 25, 2018 at 8:03 PM, Craig Ringer <craig(at)2ndquadrant(dot)com> wrote:
> Personally, I think this is a better candidate for being incorporated
> directly rather than as a contrib. This sort of utility is much less
> useful if you cannot rely on it being present.
>
I guess I've gotten used to the idea of contrib being a test bed for
newer functionality (e.g., tsearch) without committing to a final API in the
core. I also can't imagine using PostgreSQL without pgcrypto being available.
But this is a perception issue on my part. I'm looking into where to put
this into core now.
> I'm not convinced by the wisdom of adding int8 overloads, etc, with a
> second argument. I'd rather this be named as a separate function. I
> realise that many programming languages do this, but it's IMO less
> discoverable this way, and might make our life harder if we later need
> to overload these functions in a different way.
>
Totally fair observation. Easier for users in the short term, may be harder
in the long term.
> We already have to_hex. So to_base seems a reasonable choice. Then
> adding a from_hex, from_base seems natural.
I have some misgivings about the existing to_hex now that I've had a chance
to go over it. It follows the printf model with %x for integers. I feel
this was a mistake. Hexadecimal, while enormously useful for bitwise
analysis, is still an output for human eyes. The fact that a negative int
value could be substantially different from a negative bigint value is
problematic. I understand the underlying reason for it, but a cursory check
in the mailing list archives shows more than a couple folks who got tripped
up by it.
I do not think that base 10 output should be wildly different from base 16
(or base 8). I don't think anyone would consider it intuitive to print out,
for example, 4294967295 for to_base(-1, 10), yet that's exactly what's done
for base 16 with the current implementation of to_hex. I see these problems
as apples and oranges. To be more precise, I consider the current to_hex to
be wrong, but too late to fix. to_bitwise_hex, to_raw_hex, or similar would
be more appropriate. In C, it's clear at all times what the size may be.
Within an SQL query, things can become far more ambiguous.
Most modern, high-level languages will present 15 as hex F and -15 as hex
-F, which is uniform no matter the underlying type size. All numeric types
in PostgreSQL are signed. Getting a wildly different value because some
smallint got silently converted into an integer is non-intuitive to say the
least.
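To make the difference concrete, here is a minimal C sketch. It is not taken
from pg_formatbase; raw_hex32, raw_hex64, and signed_hex are hypothetical names
used only for illustration. The first two mimic the printf %x model at two
widths, and the third prints sign and magnitude, which is the behavior argued
for above.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helpers for illustration only; none of these names come
 * from PostgreSQL or pg_formatbase.
 */

/* printf-style hex of a 32-bit value: the raw two's-complement bits. */
void
raw_hex32(int32_t value, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    snprintf(buf, len, "%" PRIx32, (uint32_t) value);
}

/* The same value widened to 64 bits prints a different string. */
void
raw_hex64(int64_t value, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    snprintf(buf, len, "%" PRIx64, (uint64_t) value);
}

/* Sign-and-magnitude hex: independent of the width of the input type. */
void
signed_hex(int64_t value, char *buf, size_t len)
{
    uint64_t    mag;

    assert(buf != NULL && len > 0);
    /* Negate in unsigned arithmetic so INT64_MIN does not overflow. */
    mag = value < 0 ? (uint64_t) 0 - (uint64_t) value : (uint64_t) value;
    snprintf(buf, len, "%s%" PRIX64, value < 0 ? "-" : "", mag);
}
```

With these helpers, -15 comes out as fffffff1 at 32 bits and fffffffffffffff1
at 64 bits, while signed_hex gives -F no matter which width the value started
out as.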
So it would appear there should be a strict demarcation between to_hex and
the proposed to_base.
> Bonus points if you add
> to/from base64 and oct while you're at it.
>
I can happily do it, but again, I think from_hex and from_oct should follow
as inverses of to_hex and to_oct, not to_base/from_base for the reasons given
above. As for base64, that's another problematic one. To most folks, base64
means a binary encoding of data into ASCII. Again, solving a different
problem. I think it would be a good idea to avoid mixed messages to the
user here, even to the point of capping the base at 62
(0-9, A-Z, and a-z) and erroring out above that. I'd like to go to 64 if
for no other reason than the power of 2 affinity, but I don't think it
should be done lightly at the expense of user confusion. On the bright
side, encode/decode are both well-established within PostgreSQL and clearly
dealing with bytea values rather than integer values.
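For what it's worth, the base-62 cap could look roughly like the following C
sketch. It is not the pg_formatbase code: the names to_base and from_base
follow the suggestion earlier in this thread, but the signatures, the
BASE62_DIGITS table, and the convention of returning false on error are all
assumptions made for this example. A real backend function would report
errors through the usual mechanism instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed digit order: 0-9, then A-Z, then a-z (62 symbols in total). */
static const char BASE62_DIGITS[] =
    "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

/* Hypothetical signature. Returns false for a bad base or a short buffer. */
bool
to_base(int64_t value, int base, char *out, size_t outlen)
{
    char        tmp[66];        /* 64 binary digits + sign + NUL */
    size_t      pos = sizeof(tmp);
    uint64_t    mag;

    assert(out != NULL);
    if (base < 2 || base > 62)
        return false;           /* base 63 and above is rejected */

    /* Negate in unsigned arithmetic so INT64_MIN does not overflow. */
    mag = value < 0 ? (uint64_t) 0 - (uint64_t) value : (uint64_t) value;

    tmp[--pos] = '\0';
    do
    {
        tmp[--pos] = BASE62_DIGITS[mag % (uint64_t) base];
        mag /= (uint64_t) base;
    } while (mag != 0);
    if (value < 0)
        tmp[--pos] = '-';

    if (sizeof(tmp) - pos > outlen)
        return false;
    memcpy(out, tmp + pos, sizeof(tmp) - pos);
    return true;
}

/* Hypothetical inverse. Returns false on bad input or int64 overflow. */
bool
from_base(const char *text, int base, int64_t *result)
{
    bool        negative = false;
    uint64_t    acc = 0;
    uint64_t    limit;

    assert(result != NULL);
    if (text == NULL || base < 2 || base > 62)
        return false;
    if (*text == '-')
    {
        negative = true;
        text++;
    }
    if (*text == '\0')
        return false;           /* no digits at all */

    /* Largest magnitude that still fits: 2^63 if negative, else 2^63 - 1. */
    limit = negative ? (uint64_t) INT64_MAX + 1 : (uint64_t) INT64_MAX;

    for (; *text != '\0'; text++)
    {
        const char *p = strchr(BASE62_DIGITS, *text);
        uint64_t    digit;

        if (p == NULL)
            return false;       /* not one of the 62 symbols */
        digit = (uint64_t) (p - BASE62_DIGITS);
        if (digit >= (uint64_t) base)
            return false;       /* symbol is not valid in this base */
        if (acc > (limit - digit) / (uint64_t) base)
            return false;       /* would overflow int64 */
        acc = acc * (uint64_t) base + digit;
    }

    if (negative)
        *result = (acc == limit) ? INT64_MIN : -(int64_t) acc;
    else
        *result = (int64_t) acc;
    return true;
}
```

One consequence of this assumed digit order is that parsing is case-sensitive:
from_base("F", 16, ...) succeeds, but from_base("f", 16, ...) is rejected
because lowercase letters only become valid digits above base 36. Whether that
is acceptable, given that to_hex emits lowercase, is an open design question.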
> We don't seem to have a "from_hex" or "int8_from_hex", which is a
> bewildering oversight really, and we don't accept literals:
>
Thanks for the illustration of PostgreSQL parser behavior. The
flexibility of PostgreSQL can obviously be both a curse and a blessing.
Hoping I can add to the blessings.
--
Quidquid latine dictum sit, altum sonatur.
- Whatever is said in Latin sounds profound.