From: | Emre Hasegeli <emre(at)hasegeli(dot)com> |
---|---|
To: | Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> |
Cc: | Kevin Grittner <kgrittn(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andreas Karlsson <andreas(at)proxel(dot)se>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, Robert Haas <robertmhaas(at)gmail(dot)com>, Kevin Grittner <kgrittn(at)ymail(dot)com>, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, Joe Conway <mail(at)joeconway(dot)com> |
Subject: | Re: Floating point comparison inconsistencies of the geometric types |
Date: | 2016-11-13 21:41:01 |
Message-ID: | CAE2gYzymeQXGGmhU1Vc35DpugwfRd-QRK3BM-6TGg0rwHcDN_w@mail.gmail.com |
Lists: | pgsql-hackers |
> We can remove the fuzz factor altogether but I think we also
> should provide a means usable to do similar things. At least "is
> a point on a line" might be useless for most cases without any
> fuzzing feature. (Nevertheless, it is a problem only when it is
> being used to do that:) If we don't find reasonable policy on
> fuzzing operations, it would be the proof that we shouldn't
> change the behavior.
My initial idea was to keep the fuzzy comparison behaviour in some
places, but the more I got into it, the more I realised that it is
almost impossible to get this right. Instead, I re-implemented some
operators to keep as much precision as possible. The previous "is a
point on a line" operator would *never* return true without the
fuzzy comparison. The new implementation returns true when no
precision is lost. I think this is behaviour that people who work
with floating point numbers are prepared to deal with. By the way,
the "is a point on a line" operator is quite wrong with the fuzzy
comparison at the moment [1].
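To illustrate the difference, here is a rough sketch. It is not the
code from the patch: the Line struct and the function names are
hypothetical, and EPSILON is only assumed to match the fixed fuzz
factor the geometric types use. It contrasts an absolute-epsilon test
with an exact test on the line equation A*x + B*y + C = 0.

```c
#include <stdbool.h>

/* Assumption: same value as the fixed fuzz factor of the geometric types. */
#define EPSILON 1.0E-06

/* Hypothetical line type: all points (x, y) with A*x + B*y + C = 0. */
typedef struct
{
    double A, B, C;
} Line;

/* Evaluate the line equation at the point; 0 means "on the line". */
double
line_residual(Line l, double x, double y)
{
    return l.A * x + l.B * y + l.C;
}

/* Fuzzy test: any residual within the absolute EPSILON counts as zero. */
bool
on_line_fuzzy(Line l, double x, double y)
{
    double r = line_residual(l, x, y);

    if (r < 0)
        r = -r;
    return r <= EPSILON;
}

/* Exact test: true only when the arithmetic loses no precision. */
bool
on_line_exact(Line l, double x, double y)
{
    return line_residual(l, x, y) == 0.0;
}
```

With the line y = x (A = 1, B = -1, C = 0), the exact test accepts
(0.5, 0.5). The fuzzy test also accepts (1e-7, 5e-7), which is clearly
off the line at its own scale. With the line x = 3y, the exact test
rejects (0.3, 0.1) because 3 * 0.1 is not exactly 0.3 in binary
floating point.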
> The 0001 patch adds many FP comparison functions individually
> considering NaN. As the result the sort order logic involving NaN
> is scattered around into the functions, then, you implement
> generic comparison function using them. It seems inside-out to
> me. Defining ordering at one place, then comparison using it
> seems to be reasonable.
I agree that it would be simpler to use the comparison function for
implementing the other operators. I have done it the other way around
to make them more optimised, because they are called very often. I
don't think checking the return value of the comparison function
would be optimised the same way. I could leave the comparison
functions as they are, but re-implement them using the others to keep
the documentation of NaN comparison in a single place.
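The two approaches can be sketched as follows. The function names are
hypothetical and this is not the code from the 0001 patch. It assumes
the float8 ordering rule that NaN is equal to NaN and greater than
every other value.

```c
#include <math.h>
#include <stdbool.h>

/* One place that defines the ordering: NaN == NaN, NaN > everything else. */
int
float8_cmp_sketch(double a, double b)
{
    if (isnan(a))
        return isnan(b) ? 0 : 1;
    if (isnan(b))
        return -1;
    if (a < b)
        return -1;
    if (a > b)
        return 1;
    return 0;
}

/* "Less than" derived from the generic comparison function. */
bool
float8_lt_via_cmp(double a, double b)
{
    return float8_cmp_sketch(a, b) < 0;
}

/*
 * "Less than" written directly. a < b is already false when a is NaN,
 * so only a NaN on the right-hand side needs special handling.
 */
bool
float8_lt_direct(double a, double b)
{
    return isnan(b) ? !isnan(a) : a < b;
}
```

Both variants give the same answers. The direct one needs a single
NaN check on the common path, at the cost of repeating the NaN rule in
every operator.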
> If the center somehow goes extremely near to the origin, it could
> result in a false error.
>
>> =# select @@ box'(-8e-324, -8e-324), (4.9e-324, 4.9e-324)';
>> ERROR: value out of range: underflow
>
> I don't think this underflow is an error, and actually it is a
> change of the current behavior without a reasonable reason. More
> significant (and maybe unacceptable) side-effect is that it
> changes the behavior of ordinary operators. I don't think this is
> acceptable. More consideration is needed.
>
>> =# select ('-8e-324'::float8 + '4.9e-324'::float8) / 2.0;
>> ERROR: value out of range: underflow
This is the current behaviour of the float datatype. My patch doesn't
change that. This problem would probably also apply to multiplying
very small values. I agree that this is not the ideal behaviour,
though I am not sure whether we should go in a different direction
than the float datatypes.
I think there is value in making the geometric types compatible with
the float datatypes. Users are going to mix them anyway. For example,
users can calculate the center of a box manually, and be confused
when the built-in operator behaves differently.
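For reference, the kind of check that produces this error can be
sketched like this. The function names are hypothetical, and the rule
(a zero result from a non-zero numerator counts as underflow) is my
assumption about how the float8 division check behaves, not a copy of
the actual code.

```c
#include <stdbool.h>

/*
 * Assumed underflow rule for division: the quotient is zero although
 * the numerator was not.
 */
bool
div_underflows(double num, double den)
{
    double result = num / den;

    return result == 0.0 && num != 0.0;
}

/*
 * Hypothetical box-center coordinate check: the midpoint is computed
 * as (low + high) / 2.0, the same expression a user would write by
 * hand with float8 values.
 */
bool
center_underflows(double low, double high)
{
    return div_underflows(low + high, 2.0);
}
```

With the values from the example, -8e-324 and 4.9e-324 are both
rounded to subnormal numbers, their sum is the smallest negative
subnormal, and halving it rounds to zero. So
center_underflows(-8e-324, 4.9e-324) reports underflow for the box
center and for the manual calculation alike.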
> In regard to fuzzy operations, libgeos seems to have several
> types of this kind of feature. (I haven't looked closer into
> them). Other than reducing precision seems overkill or
> inapplicable for PostgreSQL builtins. As Jim said, can we replace
> the fixed scale fuzz factor by precision reduction? Maybe, with a
> GUC variable (I hear someone's roaring..) to specify the amount
> defaults to fit the current assumption.
I am disinclined to try to implement something complicated for the
geometric types. I think they are mostly useful for 2 purposes: uses
simple enough that it is not worth looking for better solutions, and
demonstrating our indexing capabilities. The inconsistencies harm
both of those.
[1] /message-id/flat/CAE2gYzw_-z%3DV2kh8QqFjenu%3D8MJXzOP44wRW%3DAzzeamrmTT1%3DQ%40mail.gmail.com