Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607

Lists: pgsql-bugs
From: "Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com>
To: "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Windows Application Issues | PostgreSQL | REF # 48475607
Date: 2024-05-02 17:26:50
Message-ID: PH8PR21MB3902F334A3174C54058F792CE5182@PH8PR21MB3902.namprd21.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

I'm sorry to bother you if you are not the right person. In that case, could you please help to point me in the right direction for the right contact? Thanks a lot!
Hi Team,

I'm a program manager in the Windows App Assure ISV Outreach Team at Microsoft. We work with Microsoft's test organization to notify developers when issues have been identified in their applications. We're reaching out to notify you of a potential issue in one of your applications.
The issue details are below, please review it when you have a moment. Our goal is to work with you to address this issue and to understand what your expected timeline to address this issue might be. If you have any questions about the details below or have already addressed this issue in a forthcoming update, please let me know.
Account
토토 사이트
Product
PostgreSQL
Reference #
48475607
Issue
Postgres.exe crash observed while installing the application.

Environment: Desktop
OS: Windows 11
App Version: 16.1

Repro Steps:

1. Deploy windows server 2019 Turkey Build.
2. Patch the machine till WU and enable roles & feature in server manager.
3. Restart the machine to configure updates & observe winver 17763.5329
4. Login as Administrator.
5. Download and install the application: /download/
6. Click on Setup and in setup launched click on next.
7. Click next on installation directory window.
8. In Select components window make sure all are selected .and click on next .
9. In data directory window click on next.
10. In Password window provide the password and click on next.
11. In port window click on next .
12. Click on next default in advanced settings ,summary & ready to install window.
13. While the application is installing ,
14. Observe..

Observations:
Postgres.exe crash is observed while installing the application.

Expected Results:
Should not observe any crash while installing the application
Resource
For any questions on app development (or) submission on windows, contact Windows Dev Center. https://developer.microsoft.com/en-us/windows/support

Thanks!
Haifang


From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: "Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com>
Cc: "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: Windows Application Issues | PostgreSQL | REF # 48475607
Date: 2024-05-02 23:50:58
Message-ID: CA+hUKG+b_XqMZ6xkFhMCEwLXbGSN1rDYYY-En8MgFe81auRAAg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

On Fri, May 3, 2024 at 7:05 AM Haifang Wang (Centific Technologies
Inc) <v-haiwang(at)microsoft(dot)com> wrote:
> Deploy windows server 2019 Turkey Build.
...
> Postgres.exe crash is observed while installing the application.

Hi,

"Crash" with no details is not a very useful report, but I guess that
this is the Turkey -> Türkiye issue caused by a recent operating
system upgrade, which is apparently now hitting the server edition of
Windows based on the new inflow of bug reports. I think we know
approximately how to fix it, and there are several possible
workarounds for a system that is already in this state, but interested
parties who know and care about the relevant OS need to get involved
to make progress. It's too late for Turkish but ideally we'll be able
to stop this from happening the next time a country changes its name.
I have written everything I know about the issue here:

/message-id/flat/18196-b10f93dfbde3d7db%40postgresql.org

Not being a Windows user, I have not been able to take that proposed
fix all the way to the finish line, so all I can do is post what I
know and hope that open source will happen.


From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: "Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com>
Cc: "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: Windows Application Issues | PostgreSQL | REF # 48475607
Date: 2024-05-02 23:53:23
Message-ID: CA+hUKG+v4_kY9mx8==QLrn8ZaJXzntN+Kc0Ra2mx7MEP2j-QMA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

On Fri, May 3, 2024 at 11:50 AM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
> /message-id/flat/18196-b10f93dfbde3d7db%40postgresql.org

And more here:

/message-id/flat/CA+hUKGJ=XThErgAQRoqfCy1bKPxXVuF0=2zDbB+SxDs59pv7Fw(at)mail(dot)gmail(dot)com


From: "Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: RE: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607
Date: 2024-05-13 18:51:53
Message-ID: PH8PR21MB39022E5B54CE6E03E194B30FE5E22@PH8PR21MB3902.namprd21.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Hi Thomas,

Thanks a lot for your previous reply. Still need to double check with you:

You said the Turkey -> Türkiye issue caused by a recent operating system upgrade, you mean this crash is caused by the changes made on Windows side? If that is the case, you prefer to leave the bug there? Any plan for the future?

Thanks!
Haifang

-----Original Message-----
From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Sent: Thursday, May 2, 2024 4:51 PM
To: Haifang Wang (Centific Technologies Inc) <v-haiwang(at)microsoft(dot)com>
Cc: pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607

[You don't often get email from thomas(dot)munro(at)gmail(dot)com(dot) Learn why this is important at https://aka.ms/LearnAboutSenderIdentification ]

On Fri, May 3, 2024 at 7:05 AM Haifang Wang (Centific Technologies
Inc) <v-haiwang(at)microsoft(dot)com> wrote:
> Deploy windows server 2019 Turkey Build.
...
> Postgres.exe crash is observed while installing the application.

Hi,

"Crash" with no details is not a very useful report, but I guess that this is the Turkey -> Türkiye issue caused by a recent operating system upgrade, which is apparently now hitting the server edition of Windows based on the new inflow of bug reports. I think we know approximately how to fix it, and there are several possible workarounds for a system that is already in this state, but interested parties who know and care about the relevant OS need to get involved to make progress. It's too late for Turkish but ideally we'll be able to stop this from happening the next time a country changes its name.
I have written everything I know about the issue here:

/message-id/flat/18196-b10f93dfbde3d7db%40postgresql.org

Not being a Windows user, I have not been able to take that proposed fix all the way to the finish line, so all I can do is post what I know and hope that open source will happen.


From: "Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: RE: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607
Date: 2024-05-13 21:50:15
Message-ID: PH8PR21MB3902FCA57600AF7CD5ADBA6DE5E22@PH8PR21MB3902.namprd21.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Hi Thomas and team,

Some suggestion from our engineers, please take a look and let me know if you have any question:

1. Could you please use _wsetlocale API in case of wide string locale parameter, this is what the recommended way as par Microsoft documentation as well. (Reference: setlocale: https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/setlocale-wsetlocale?view=msvc-170, wsetlocale: https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/setlocale-wsetlocale?view=msvc-170)
2. Additionally you can work around in case you don’t want to change the API to update their mapping table with Turkish_Tur in case of Turkish_Türkiye as you have done for others locale with same issue as well in your code
(reference PostgreSQL Source Code: src/port/win32setlocale.c Source File: https://doxygen.postgresql.org/win32setlocale_8c_source.html, Line number 66-67)

Please let me know if there is any other questions.

Regards!
Haifang

-----Original Message-----
From: Haifang Wang (Centific Technologies Inc)
Sent: Monday, May 13, 2024 11:52 AM
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: RE: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607

Hi Thomas,

Thanks a lot for your previous reply. Still need to double check with you:

You said the Turkey -> Türkiye issue caused by a recent operating system upgrade, you mean this crash is caused by the changes made on Windows side? If that is the case, you prefer to leave the bug there? Any plan for the future?

Thanks!
Haifang

-----Original Message-----
From: Thomas Munro <mailto:thomas(dot)munro(at)gmail(dot)com>
Sent: Thursday, May 2, 2024 4:51 PM
To: Haifang Wang (Centific Technologies Inc) <mailto:v-haiwang(at)microsoft(dot)com>
Cc: mailto:pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607

[You don't often get email from mailto:thomas(dot)munro(at)gmail(dot)com(dot) Learn why this is important at https://aka.ms/LearnAboutSenderIdentification ]

On Fri, May 3, 2024 at 7:05 AM Haifang Wang (Centific Technologies
Inc) <mailto:v-haiwang(at)microsoft(dot)com> wrote:
> Deploy windows server 2019 Turkey Build.
...
> Postgres.exe crash is observed while installing the application.

Hi,

"Crash" with no details is not a very useful report, but I guess that this is the Turkey -> Türkiye issue caused by a recent operating system upgrade, which is apparently now hitting the server edition of Windows based on the new inflow of bug reports. I think we know approximately how to fix it, and there are several possible workarounds for a system that is already in this state, but interested parties who know and care about the relevant OS need to get involved to make progress. It's too late for Turkish but ideally we'll be able to stop this from happening the next time a country changes its name.
I have written everything I know about the issue here:

/message-id/flat/18196-b10f93dfbde3d7db%40postgresql.org

Not being a Windows user, I have not been able to take that proposed fix all the way to the finish line, so all I can do is post what I know and hope that open source will happen.


From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: "Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com>
Cc: "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607
Date: 2024-05-13 22:11:53
Message-ID: CA+hUKGKrzGnCW3MoVC7gayHCtJQPEJKgpfQWD6c7eUvFp+ac+A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

On Tue, May 14, 2024 at 6:51 AM Haifang Wang (Centific Technologies
Inc) <v-haiwang(at)microsoft(dot)com> wrote:
> You said the Turkey -> Türkiye issue caused by a recent operating system upgrade, you mean this crash is caused by the changes made on Windows side? If that is the case, you prefer to leave the bug there? Any plan for the future?

First, let me restate the problem:

When you create a database cluster (= a PostgreSQL instance) with
"initdb", unless you request a default locale with --locale, initdb
uses setlocale("") to query the system/user default locale. It then
records that string in postgresql.conf, and also in the pg_database
catalog. On POSIX systems, that captures something like "tr-TR.utf-8"
or similar. On Windows, that captures something like
"Turkish_Turkey.1254". Later, PostgreSQL uses newlocale() or
setlocale() functions to access that locale again. In rare cases
where a country changes its name, a Windows update *renames* the
locale, and then those calls fail, because the old name is not
recognised anymore.

I proposed a partial solution that should help avoid the problem in future:

I think that initdb should instead call GetUserDefaultLocaleName() to
discover the user account's default locale, because it returns BCP47
names like "tr-TR". It is probably less likely for a country or
language to change its ISO code (but not impossible[1]), than for the
English-language names of them to change. Even if you reject this
idea because technically they can both change, there are other reasons
why we should not be storing this "display"-style names anywhere,
including that PostgreSQL needs to store them in a place where the
encoding must be ASCII, which Türkiye is not. And finally, the
Windows manual explicitly warns us about this[2]: "We don't recommend
this form for locale strings embedded in code or serialized to
storage: These strings are more likely to be changed by an operating
system update than the locale name form."

(Note: the BCP47 support in Windows did not exist when that PostgreSQL
code was written, so it did what it had to at the time.)

There are further problems to resolve:

1. We don't know if we should put encodings (AKA codepages?) on the
end of those strings or not. There is some confusion about what
exactly it does, and how it interacts with the "ACP". I'd be worried
that if you don't put the endings on, perhaps it can change under your
feet. (I suspect that part of the discussion on that other thread
took some wrong turns, based on citext_utf8 results that were actually
probably misleading.)
2. If we do decide to put the encoding suffixes on, which encoding
should we be suggesting? It appears from anecdotal reports that most
PostgreSQL-on-Windows users are stuck in the past, using the old
pre-utf-8 language-specific encodings. Should we be encouraging utf-8
use by default, if we can? Maybe that is a separate question.

Then there is the practical question of what to do with an
already-broken system. One idea would be to introduce a "locale
remapping" file, pgdata/pg_locale.map where you can write things
like "Turkish_Turkey.1254"="Turkish_Türkiye.1254".

I think there might be a GUI tool that allows you to duplicate,
rename, etc locales in Windows, so you can re-create the old name. I
believe that is how some people have fixed their broken databases. I
don't know if there is a good reference/blog/article on that, that we
should be pointing people towards if they show up with broken systems.

Patches, testing, research are welcome! Even though I put forward that
BCP47 idea, it was based on reading the manual, so the "unresolved"
questions may in fact be very easy to resolve by people who actually
use/know Windows. Even if I had been gung-ho about committing
that in 16 without feedback from Windows users, it would have been too
late to help Turkish users with existing databases.

[1] https://learn.microsoft.com/en-us/globalization/locale/standard-locale-names
[2] https://learn.microsoft.com/en-us/cpp/c-runtime-library/locale-names-languages-and-country-region-strings?view=msvc-160


From: "Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: RE: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607
Date: 2024-05-13 22:27:32
Message-ID: PH8PR21MB39027F1F4EAFEAE33EB4BA39E5E22@PH8PR21MB3902.namprd21.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Hi Thomas,

Thanks for your reply. But I'm not sure if there is any miscommunication. Let me make it clear again.

I’m a program manager in the Windows App Assure ISV Outreach Team at Microsoft. We work with Microsoft’s test organization to notify developers when issues have been identified in their applications. The issue I reported in this mail is an issue we found in our testing and I believe it also impact a lot of end users. Like you mentioned previously, it is caused by recent operating system upgrade.

The solution I shared in my last email was suggested our engineers. It would be great is you could use _wsetlocale API. this is what the recommended way as par Microsoft documentation as well. (Reference: setlocale wsetlocale: https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/setlocale-wsetlocale?view=msvc-170

If you don’t want to change the API to update their mapping table with Turkish_Tur in case of Turkish_Türkiye as you have done for others locale with same issue as well in your code (reference PostgreSQL Source Code: src/port/win32setlocale.c Source File: https://doxygen.postgresql.org/win32setlocale_8c_source.html, Line number 66-67).

Please let me know if there is any misunderstanding.

Thanks!
Haifang

-----Original Message-----
From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Sent: Monday, May 13, 2024 3:12 PM
To: Haifang Wang (Centific Technologies Inc) <v-haiwang(at)microsoft(dot)com>
Cc: pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607

On Tue, May 14, 2024 at 6:51 AM Haifang Wang (Centific Technologies
Inc) <v-haiwang(at)microsoft(dot)com> wrote:
> You said the Turkey -> Türkiye issue caused by a recent operating system upgrade, you mean this crash is caused by the changes made on Windows side? If that is the case, you prefer to leave the bug there? Any plan for the future?

First, let me restate the problem:

When you create a database cluster (= a PostgreSQL instance) with "initdb", unless you request a default locale with --locale, initdb uses setlocale("") to query the system/user default locale. It then records that string in postgresql.conf, and also in the pg_database catalog. On POSIX systems, that captures something like "tr-TR.utf-8"
or similar. On Windows, that captures something like "Turkish_Turkey.1254". Later, PostgreSQL uses newlocale() or
setlocale() functions to access that locale again. In rare cases where a country changes its name, a Windows update *renames* the locale, and then those calls fail, because the old name is not recognised anymore.

I proposed a partial solution that should help avoid the problem in future:

I think that initdb should instead call GetUserDefaultLocaleName() to discover the user account's default locale, because it returns BCP47 names like "tr-TR". It is probably less likely for a country or language to change its ISO code (but not impossible[1]), than for the English-language names of them to change. Even if you reject this idea because technically they can both change, there are other reasons why we should not be storing this "display"-style names anywhere, including that PostgreSQL needs to store them in a place where the encoding must be ASCII, which Türkiye is not. And finally, the Windows manual explicitly warns us about this[2]: "We don't recommend this form for locale strings embedded in code or serialized to
storage: These strings are more likely to be changed by an operating system update than the locale name form."

(Note: the BCP47 support in Windows did not exist when that PostgreSQL code was written, so it did what it had to at the time.)

There are further problems to resolve:

1. We don't know if we should put encodings (AKA codepages?) on the end of those strings or not. There is some confusion about what exactly it does, and how it interacts with the "ACP". I'd be worried that if you don't put the endings on, perhaps it can change under your feet. (I suspect that part of the discussion on that other thread took some wrong turns, based on citext_utf8 results that were actually probably misleading.) 2. If we do decide to put the encoding suffixes on, which encoding should we be suggesting? It appears from anecdotal reports that most PostgreSQL-on-Windows users are stuck in the past, using the old
pre-utf-8 language-specific encodings. Should we be encouraging utf-8 use by default, if we can? Maybe that is a separate question.

Then there is the practical question of what to do with an already-broken system. One idea would be to introduce a "locale remapping" file, pgdata/pg_locale.map where you can write things like "Turkish_Turkey.1254"="Turkish_Türkiye.1254".

I think there might be a GUI tool that allows you to duplicate, rename, etc locales in Windows, so you can re-create the old name. I believe that is how some people have fixed their broken databases. I don't know if there is a good reference/blog/article on that, that we should be pointing people towards if they show up with broken systems.

Patches, testing, research are welcome! Even though I put forward that
BCP47 idea, it was based on reading the manual, so the "unresolved"
questions may in fact be very easy to resolve by people who actually use/know Windows. Even if I had been gung-ho about committing that in 16 without feedback from Windows users, it would have been too late to help Turkish users with existing databases.

[1] https://learn.microsoft.com/en-us/globalization/locale/standard-locale-names
[2] https://learn.microsoft.com/en-us/cpp/c-runtime-library/locale-names-languages-and-country-region-strings?view=msvc-160


From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: "Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com>
Cc: "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607
Date: 2024-05-13 22:47:23
Message-ID: CA+hUKG+GUcSzk7UwvuLr4p84BQV8k2xvVAjToW=fQ5rCrciJ4A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

On Tue, May 14, 2024 at 10:27 AM Haifang Wang (Centific Technologies
Inc) <v-haiwang(at)microsoft(dot)com> wrote:
> Thanks for your reply. But I'm not sure if there is any miscommunication. Let me make it clear again.

(Sorry, it seems our emails crossed.)

> I’m a program manager in the Windows App Assure ISV Outreach Team at Microsoft. We work with Microsoft’s test organization to notify developers when issues have been identified in their applications. The issue I reported in this mail is an issue we found in our testing and I believe it also impact a lot of end users. Like you mentioned previously, it is caused by recent operating system upgrade.

Thanks for doing that, and yes, it affects a lot of users, and this is
not the first time. It is still possible for it to be the last...

> The solution I shared in my last email was suggested our engineers. It would be great is you could use _wsetlocale API. this is what the recommended way as par Microsoft documentation as well. (Reference: setlocale wsetlocale: https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/setlocale-wsetlocale?view=msvc-170

I don't understand. Can you explain why _wsetlocale() is better than
setlocale()? They behave identically except one takes wide
characters, which doesn't seem to solve any problem we have.

> If you don’t want to change the API to update their mapping table with Turkish_Tur in case of Turkish_Türkiye as you have done for others locale with same issue as well in your code (reference PostgreSQL Source Code: src/port/win32setlocale.c Source File: https://doxygen.postgresql.org/win32setlocale_8c_source.html, Line number 66-67).

Yeah, OK we could put more kludges into win32setlocale.c. I don't
mind committing a patch like that if it addresses the issue. I am not
in a position to confirm that myself... what we need is someone who
works with Windows to write the patch and test it across that upgrade.
Longer term I'm looking for something better than that though, because
it doesn't address the root cause (need for stable identifiers), and
will only ever allow us to fix problems with the old unstable names
*after* users complain that their database is dead, 3-6 months after
in fact due to release cycles. I think a dynamic mapping file might
be better? (Maybe win32locale.c should be able to read that kludge
table from a file that you can give it with an environment variable,
or something like that?)


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: "Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607
Date: 2024-05-13 23:07:18
Message-ID: 872239.1715641638@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Thomas Munro <thomas(dot)munro(at)gmail(dot)com> writes:
> Longer term I'm looking for something better than that though, because
> it doesn't address the root cause (need for stable identifiers), and
> will only ever allow us to fix problems with the old unstable names
> *after* users complain that their database is dead, 3-6 months after
> in fact due to release cycles. I think a dynamic mapping file might
> be better? (Maybe win32locale.c should be able to read that kludge
> table from a file that you can give it with an environment variable,
> or something like that?)

+1 for the long-term solution being more-stable locale identifiers.
However, we should try to build something that will let users get
out of these situations with the existing identifiers, so I like
your idea of a plain-text mapping file for Windows locale names.
I don't think an environment variable is necessary; just define
a fixed name "$PGDATA/locale_map.txt" or such. If that file
exists, just read it and map the pg_database field values with it.

Maybe this shouldn't even be Windows-specific? Are there any
cases where it'd save people's bacon on other platforms?

regards, tom lane


From: John McKown <john(dot)archie(dot)mckown(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, "Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com>, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607
Date: 2024-05-13 23:09:52
Message-ID: CAAJSdjiR_kP3-YDfUyKv0Yruv1K2Rn0LPgoyabagrJauJDCNTg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Just curious. Why a plain text file rather than a system table?

On Mon, May 13, 2024, 18:07 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Thomas Munro <thomas(dot)munro(at)gmail(dot)com> writes:
> > Longer term I'm looking for something better than that though, because
> > it doesn't address the root cause (need for stable identifiers), and
> > will only ever allow us to fix problems with the old unstable names
> > *after* users complain that their database is dead, 3-6 months after
> > in fact due to release cycles. I think a dynamic mapping file might
> > be better? (Maybe win32locale.c should be able to read that kludge
> > table from a file that you can give it with an environment variable,
> > or something like that?)
>
> +1 for the long-term solution being more-stable locale identifiers.
> However, we should try to build something that will let users get
> out of these situations with the existing identifiers, so I like
> your idea of a plain-text mapping file for Windows locale names.
> I don't think an environment variable is necessary; just define
> a fixed name "$PGDATA/locale_map.txt" or such. If that file
> exists, just read it and map the pg_database field values with it.
>
> Maybe this shouldn't even be Windows-specific? Are there any
> cases where it'd save people's bacon on other platforms?
>
> regards, tom lane
>
>
>


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: John McKown <john(dot)archie(dot)mckown(at)gmail(dot)com>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, "Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com>, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607
Date: 2024-05-13 23:15:00
Message-ID: 873254.1715642100@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

John McKown <john(dot)archie(dot)mckown(at)gmail(dot)com> writes:
> Just curious. Why a plain text file rather than a system table?

Because you'd have no way to update such a table, if you can't
start the database or connect to it. So that approach isn't
suitable for people whose database has been broken by one of
these system updates. (This is largely the same reason why,
eg, postgresql.conf isn't a table.)

regards, tom lane


From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607
Date: 2024-05-14 06:38:20
Message-ID: CA+hUKGJJgyc9xg+g_8_jPUV08Kr0NFdgq2_H9mm7-eZd7YVRQg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

On Tue, May 14, 2024 at 11:07 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> +1 for the long-term solution being more-stable locale identifiers.
> However, we should try to build something that will let users get
> out of these situations with the existing identifiers, so I like
> your idea of a plain-text mapping file for Windows locale names.
> I don't think an environment variable is necessary; just define
> a fixed name "$PGDATA/locale_map.txt" or such. If that file
> exists, just read it and map the pg_database field values with it.

OK, I tried that, first draft attached (with my standard proviso that
I don't do Windows, I just know that this passes CI and that the code
works the way I intended on my local Unix system if extracted into a
little harness). With this, you could in theory create a file
PGDATA/win32setlocale.map containing:

c Turkish_Turkey.1254=Turkish_Türkiye.1254

... or perhaps more likely:

c Turkish_Turkey.1254=tr-TR.1254

I also absorbed the pre-existing kludge table into the new system by
default (though they got a bit shorter 'cause I invented some
wildcards). Some problems came up while wondering how to fit Türkiye
into the defaults, and how to back-patch:

1. In the back-branches, we claim to support ancient Windows releases
as far back as "Windows 2000 SP4" (!), which obviously aren't getting
the Windows updates, so I guess "Turkish_Türkiye.1254" will fail there
and generally before Windows 10. And even if you exclude the
extremities of our support window somehow (how?), modern systems might
not have applied the update yet (IIUC they *have* to at some point
under the new world order, so there is a defined window of version
skew these days).

2. It's generally a terrible idea to be using "ü" in a locale name.
FWIW I assume setlocale() actually accepts and returns names encoded
in the current ACP ("active codepage", system-wide changeable setting
that controls char↔wchar_t conversion in system APIs), so the encoding
of that file (and the built-in default table) would need to match that
to work, as coded. Perhaps it would be possible to make the mapping
file utf-8 and transform that to ACP! But it feels a bit too loopy
for me, and on the PostgreSQL side it is undefined/illegal whatever
you choose in PostgreSQL due to being accessed from different
databases which are using potentially different encodings that are
only required to be a superset of ASCII. Avoid.

3. Therefore you'd probably want to prefer "tr-TR.1254" as the
replacement string. But what is the oldest Windows release that can
understand a BCP47 code like that?

4. Conversely, on modern systems, I'm still not entirely sure that
"tr-TR.1254" is exactly the same thing as "Turkish-XXX.1254" and that
it's OK to put ".1254" on the end like that. Is it, and is it? I
don't mean just "does it mean Turkish?", I mean "does it give exactly
the same answer for every conceivable pair of strings when compared
with strcoll_l(), and likewise for the ctype-based functions like
towlower() et al".

If the answers are not in our favour, I guess we could leave the
default behaviour unchanged, and let people set up a text file as
shown above to fix their database if they want, but that's also not
very nice and kinda weird (helping hypothetical users of museum-grade
systems by leaving real users' systems broken).

If the answer to 4 is yes, yes then we could also push ahead with the
plan to make initdb pick BCP47 names by default in PG18 (or even 17).

> Maybe this shouldn't even be Windows-specific? Are there any
> cases where it'd save people's bacon on other platforms?

Good question. Sometimes ISO code go away or countries split etc, so
it's no like POSIX locale names are set in stone under all
circumstances. But on Unixen it's all just files in practice, you can
always just symlink them, move them around, compile them yourself from
sources, etc, if you really have to, so I think I'd rather contain the
crazy in win32*.c.

Attachment Content-Type Size
v1-0001-Make-win32locale.c-s-kludge-table-dynamic.patch text/x-patch 15.0 KB

From: "Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>
Cc: "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: RE: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607
Date: 2024-05-15 23:46:33
Message-ID: PH8PR21MB3902402C3C3C20DD8CB40AFFE5EC2@PH8PR21MB3902.namprd21.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Thanks for your questions, Thomas and Tom. + @Vishwa to help with technical questions.

-----Original Message-----
From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Sent: Monday, May 13, 2024 11:38 PM
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Haifang Wang (Centific Technologies Inc) <v-haiwang(at)microsoft(dot)com>; pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607

On Tue, May 14, 2024 at 11:07 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> +1 for the long-term solution being more-stable locale identifiers.
> However, we should try to build something that will let users get out
> of these situations with the existing identifiers, so I like your idea
> of a plain-text mapping file for Windows locale names.
> I don't think an environment variable is necessary; just define a
> fixed name "$PGDATA/locale_map.txt" or such. If that file exists,
> just read it and map the pg_database field values with it.

OK, I tried that, first draft attached (with my standard proviso that I don't do Windows, I just know that this passes CI and that the code works the way I intended on my local Unix system if extracted into a little harness). With this, you could in theory create a file PGDATA/win32setlocale.map containing:

c Turkish_Turkey.1254=Turkish_Türkiye.1254

... or perhaps more likely:

c Turkish_Turkey.1254=tr-TR.1254

I also absorbed the pre-existing kludge table into the new system by default (though they got a bit shorter 'cause I invented some wildcards). Some problems came up while wondering how to fit Türkiye into the defaults, and how to back-patch:

1. In the back-branches, we claim to support ancient Windows releases as far back as "Windows 2000 SP4" (!), which obviously aren't getting the Windows updates, so I guess "Turkish_Türkiye.1254" will fail there and generally before Windows 10. And even if you exclude the extremities of our support window somehow (how?), modern systems might not have applied the update yet (IIUC they *have* to at some point under the new world order, so there is a defined window of version skew these days).

2. It's generally a terrible idea to be using "ü" in a locale name.
FWIW I assume setlocale() actually accepts and returns names encoded in the current ACP ("active codepage", system-wide changeable setting that controls char↔wchar_t conversion in system APIs), so the encoding of that file (and the built-in default table) would need to match that to work, as coded. Perhaps it would be possible to make the mapping file utf-8 and transform that to ACP! But it feels a bit too loopy for me, and on the PostgreSQL side it is undefined/illegal whatever you choose in PostgreSQL due to being accessed from different databases which are using potentially different encodings that are only required to be a superset of ASCII. Avoid.

3. Therefore you'd probably want to prefer "tr-TR.1254" as the replacement string. But what is the oldest Windows release that can understand a BCP47 code like that?

4. Conversely, on modern systems, I'm still not entirely sure that "tr-TR.1254" is exactly the same thing as "Turkish-XXX.1254" and that it's OK to put ".1254" on the end like that. Is it, and is it? I don't mean just "does it mean Turkish?", I mean "does it give exactly the same answer for every conceivable pair of strings when compared with strcoll_l(), and likewise for the ctype-based functions like
towlower() et al".

If the answers are not in our favour, I guess we could leave the default behaviour unchanged, and let people set up a text file as shown above to fix their database if they want, but that's also not very nice and kinda weird (helping hypothetical users of museum-grade systems by leaving real users' systems broken).

If the answer to 4 is yes, yes then we could also push ahead with the plan to make initdb pick BCP47 names by default in PG18 (or even 17).

> Maybe this shouldn't even be Windows-specific? Are there any cases
> where it'd save people's bacon on other platforms?

Good question. Sometimes ISO code go away or countries split etc, so it's no like POSIX locale names are set in stone under all circumstances. But on Unixen it's all just files in practice, you can always just symlink them, move them around, compile them yourself from sources, etc, if you really have to, so I think I'd rather contain the crazy in win32*.c.


From: "Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>
Cc: "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: RE: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607
Date: 2024-05-17 22:25:34
Message-ID: PH8PR21MB39027671C218DE448D8C091CE5EE2@PH8PR21MB3902.namprd21.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Hi Thomas and Tom,

Please let us know your question or concern about the bug we discuss in this mail.

Thanks!
Haifang

-----Original Message-----
From: Haifang Wang (Centific Technologies Inc)
Sent: Wednesday, May 15, 2024 4:47 PM
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>; Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>; Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>
Cc: pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: RE: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607

Thanks for your questions, Thomas and Tom. + @Vishwa to help with technical questions.

-----Original Message-----
From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Sent: Monday, May 13, 2024 11:38 PM
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Haifang Wang (Centific Technologies Inc) <v-haiwang(at)microsoft(dot)com>; pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607

On Tue, May 14, 2024 at 11:07 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> +1 for the long-term solution being more-stable locale identifiers.
> However, we should try to build something that will let users get out
> of these situations with the existing identifiers, so I like your idea
> of a plain-text mapping file for Windows locale names.
> I don't think an environment variable is necessary; just define a
> fixed name "$PGDATA/locale_map.txt" or such. If that file exists,
> just read it and map the pg_database field values with it.

OK, I tried that, first draft attached (with my standard proviso that I don't do Windows, I just know that this passes CI and that the code works the way I intended on my local Unix system if extracted into a little harness). With this, you could in theory create a file PGDATA/win32setlocale.map containing:

c Turkish_Turkey.1254=Turkish_Türkiye.1254

... or perhaps more likely:

c Turkish_Turkey.1254=tr-TR.1254

I also absorbed the pre-existing kludge table into the new system by default (though they got a bit shorter 'cause I invented some wildcards). Some problems came up while wondering how to fit Türkiye into the defaults, and how to back-patch:

1. In the back-branches, we claim to support ancient Windows releases as far back as "Windows 2000 SP4" (!), which obviously aren't getting the Windows updates, so I guess "Turkish_Türkiye.1254" will fail there and generally before Windows 10. And even if you exclude the extremities of our support window somehow (how?), modern systems might not have applied the update yet (IIUC they *have* to at some point under the new world order, so there is a defined window of version skew these days).

2. It's generally a terrible idea to be using "ü" in a locale name.
FWIW I assume setlocale() actually accepts and returns names encoded in the current ACP ("active codepage", system-wide changeable setting that controls char↔wchar_t conversion in system APIs), so the encoding of that file (and the built-in default table) would need to match that to work, as coded. Perhaps it would be possible to make the mapping file utf-8 and transform that to ACP! But it feels a bit too loopy for me, and on the PostgreSQL side it is undefined/illegal whatever you choose in PostgreSQL due to being accessed from different databases which are using potentially different encodings that are only required to be a superset of ASCII. Avoid.

3. Therefore you'd probably want to prefer "tr-TR.1254" as the replacement string. But what is the oldest Windows release that can understand a BCP47 code like that?

4. Conversely, on modern systems, I'm still not entirely sure that "tr-TR.1254" is exactly the same thing as "Turkish-XXX.1254" and that it's OK to put ".1254" on the end like that. Is it, and is it? I don't mean just "does it mean Turkish?", I mean "does it give exactly the same answer for every conceivable pair of strings when compared with strcoll_l(), and likewise for the ctype-based functions like
towlower() et al".

If the answers are not in our favour, I guess we could leave the default behaviour unchanged, and let people set up a text file as shown above to fix their database if they want, but that's also not very nice and kinda weird (helping hypothetical users of museum-grade systems by leaving real users' systems broken).

If the answer to 4 is yes, yes then we could also push ahead with the plan to make initdb pick BCP47 names by default in PG18 (or even 17).

> Maybe this shouldn't even be Windows-specific? Are there any cases
> where it'd save people's bacon on other platforms?

Good question. Sometimes ISO code go away or countries split etc, so it's no like POSIX locale names are set in stone under all circumstances. But on Unixen it's all just files in practice, you can always just symlink them, move them around, compile them yourself from sources, etc, if you really have to, so I think I'd rather contain the crazy in win32*.c.


From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: "Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607
Date: 2024-05-18 05:39:48
Message-ID: CA+hUKG+oS+rriPvML+j6mZgCkJDsGP9bP7qopVxSLeTWsAOtsA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

On Sat, May 18, 2024 at 10:25 AM Haifang Wang (Centific Technologies
Inc) <v-haiwang(at)microsoft(dot)com> wrote:
> Please let us know your question or concern about the bug we discuss in this mail.

Thanks for your help with this! Did I miss an email in between? Here
are some unanswered questions that seem to be stopping progress:

1. What is the oldest Windows release that can understand the "new"
BCP47 locale names, like "tr-TR" or "tr-TR.1452"?
2. If we translate to BCP47 automatically, should we put the ".1452"
on the end? What does it mean exactly? What does it mean if you
don't put it there?
3. Do the new BCP47 locale names give *exactly* the same results for
strcoll() and tolower() etc, as the old "Turkish*" style names?

With answers to those questions we might be able to ship some nice
built-in translations to get users out of this jam. If there are
issues on those points, we might have to face some questions about
what the encoding is of the "Turkish_Türkiye.1254" string itself,
which is tricky for us for technical reasons, among other problems...
not sure.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: "Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com>, Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607
Date: 2024-05-18 05:47:13
Message-ID: 2682258.1716011233@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Thomas Munro <thomas(dot)munro(at)gmail(dot)com> writes:
> With answers to those questions we might be able to ship some nice
> built-in translations to get users out of this jam. If there are
> issues on those points, we might have to face some questions about
> what the encoding is of the "Turkish_Türkiye.1254" string itself,
> which is tricky for us for technical reasons, among other problems...
> not sure.

TBH, my idea of how this should go was to *not* ship any built-in
translations, or indeed any translation file at all (so I didn't
like your moving some existing hacks into that file). If we
approach it like that, then individual users who hit this problem
are responsible for creating their own translation file, which
will automatically use whatever is the locally-appropriate
encoding. Sure, it's less transparent for affected users, but
it will work for them which other approaches might not; and
evidence so far is that there's not a huge number of affected
users.

regards, tom lane


From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com>, Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607
Date: 2024-05-18 05:56:48
Message-ID: CA+hUKG+j5YwGtoV0_q3NxjD3EygxdwYeedUAFKamcLzR+PrcpQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

On Sat, May 18, 2024 at 5:47 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Thomas Munro <thomas(dot)munro(at)gmail(dot)com> writes:
> > With answers to those questions we might be able to ship some nice
> > built-in translations to get users out of this jam. If there are
> > issues on those points, we might have to face some questions about
> > what the encoding is of the "Turkish_Türkiye.1254" string itself,
> > which is tricky for us for technical reasons, among other problems...
> > not sure.
>
> TBH, my idea of how this should go was to *not* ship any built-in
> translations, or indeed any translation file at all (so I didn't
> like your moving some existing hacks into that file). If we
> approach it like that, then individual users who hit this problem
> are responsible for creating their own translation file, which
> will automatically use whatever is the locally-appropriate
> encoding. Sure, it's less transparent for affected users, but
> it will work for them which other approaches might not; and
> evidence so far is that there's not a huge number of affected
> users.

Hmm, yeah I guess we could just ship a patch like what I posted
already, and let users figure out what works. I would still like to
know the answer to those questions so we can offer good advice. If
the answers are all yes then I think we can say "the file is in
encoded in ASCII; use wildcards to deal with any legacy non-ASCII
locale names on the left, and use BCP47 names on the right" and begin
expunging the bad old names from our universe.


From: "Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: RE: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607
Date: 2024-05-20 20:17:46
Message-ID: DS0PR21MB390450195D21634ABCA26F4FE5E92@DS0PR21MB3904.namprd21.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Hi Thomas,

What questions do you have? Could you please list them clearly so that Vishwa could help to answer?

NOTE: this email is plain text format, not easy to read. And also the previous email content usually is not included in the latest email, we only could see the most latest one.

Thanks!
Haifang

-----Original Message-----
From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Sent: Friday, May 17, 2024 10:57 PM
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Haifang Wang (Centific Technologies Inc) <v-haiwang(at)microsoft(dot)com>; Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>; pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607

On Sat, May 18, 2024 at 5:47 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Thomas Munro <thomas(dot)munro(at)gmail(dot)com> writes:
> > With answers to those questions we might be able to ship some nice
> > built-in translations to get users out of this jam. If there are
> > issues on those points, we might have to face some questions about
> > what the encoding is of the "Turkish_Türkiye.1254" string itself,
> > which is tricky for us for technical reasons, among other problems...
> > not sure.
>
> TBH, my idea of how this should go was to *not* ship any built-in
> translations, or indeed any translation file at all (so I didn't like
> your moving some existing hacks into that file). If we approach it
> like that, then individual users who hit this problem are responsible
> for creating their own translation file, which will automatically use
> whatever is the locally-appropriate encoding. Sure, it's less
> transparent for affected users, but it will work for them which other
> approaches might not; and evidence so far is that there's not a huge
> number of affected users.

Hmm, yeah I guess we could just ship a patch like what I posted already, and let users figure out what works. I would still like to know the answer to those questions so we can offer good advice. If the answers are all yes then I think we can say "the file is in encoded in ASCII; use wildcards to deal with any legacy non-ASCII locale names on the left, and use BCP47 names on the right" and begin expunging the bad old names from our universe.


From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: "Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607
Date: 2024-05-20 20:56:51
Message-ID: CA+hUKGJDv3pCnQXY13uYW1oe9CvFw8Y2j8mBMh2HGk_X0qYEcw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

On Tue, May 21, 2024 at 8:17 AM Haifang Wang (Centific Technologies
Inc) <v-haiwang(at)microsoft(dot)com> wrote:
> What questions do you have? Could you please list them clearly so that Vishwa could help to answer?

I already did, twice, but perhaps Vishwa or others can't see the whole
thread, so here is this whole thread in our project email archive:

/message-id/flat/PH8PR21MB3902F334A3174C54058F792CE5182%40PH8PR21MB3902.namprd21.prod.outlook.com

But let me ask the questions again, with some motivation/reason I want
to know in parentheses:

1. What is the oldest Windows release that can understand the "new"
BCP47 locale names, like "tr-TR" or "tr-TR.1452"? (Some PostgreSQL
versions, for example PostgreSQL 12, are expected to run on old
versions of Windows from long before Windows 10, so we might have to
consider this. However, if we go with Tom's idea that we do nothing
by default but just allow users to supply their own optional mapping
file, then this question becomes unimportant, users can figure out for
themselves whether it works, and presumably only 10+ got the update
that renamed Turkey to Türkiye. [And in reality, I hope/expect that
no one really does run old out-of-support OSes, because that's crazy,
but I'm not allowed to assume...])

2. If we translate to BCP47 locale names like "tr-TR" automatically,
should we put the ".1452" on the end? What does it mean exactly?
What does it mean if you don't put it there? (I could guess that if
you don't put it on, the encoding in "char"-based functions is the
"ACP". What I really want to know is, can it be different from the
"ACP", and if it is, which functions does it affect? For example if
the ACP is 1521 and I call _tolower_l() giving it a locale_t that I
opened with "en-US.utf-8", what happens? I am sure this is a simple
question but we are not Windows programmers, you are the first person
to show up offering to investigate, and I personally found the docs a
bit light on the topic.)

3. Do the new BCP47 locale names give *exactly* the same results for
strcoll() and tolower() etc, as the old "Turkish*" style names? (In
other words, is it *exactly the same code and driving data*, just
using different labels? Or is it a new locale implementation that
could differ arbitrarily in behaviour? If the answer is yes, it's
just a new naming scheme, then life will be much much simpler for our
users, but if not, then indexes might be corrupted if we tell people
to switch to the new BCP47 names, and so we'd better know about that,
so we can adjust our advice to users.)


From: "Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: RE: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607
Date: 2024-05-20 23:32:55
Message-ID: PH8PR21MB3902C3DB0535FD33CBF2E8C2E5E92@PH8PR21MB3902.namprd21.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Hi Vishwa,

Could you please help with the questions below?

Thanks!
Haifang

-----Original Message-----
From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Sent: Monday, May 20, 2024 1:57 PM
To: Haifang Wang (Centific Technologies Inc) <v-haiwang(at)microsoft(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>; Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>; pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607

On Tue, May 21, 2024 at 8:17 AM Haifang Wang (Centific Technologies
Inc) <v-haiwang(at)microsoft(dot)com> wrote:
> What questions do you have? Could you please list them clearly so that Vishwa could help to answer?

I already did, twice, but perhaps Vishwa or others can't see the whole thread, so here is this whole thread in our project email archive:

/message-id/flat/PH8PR21MB3902F334A3174C54058F792CE5182%40PH8PR21MB3902.namprd21.prod.outlook.com

But let me ask the questions again, with some motivation/reason I want to know in parentheses:

1. What is the oldest Windows release that can understand the "new"
BCP47 locale names, like "tr-TR" or "tr-TR.1452"? (Some PostgreSQL versions, for example PostgreSQL 12, are expected to run on old versions of Windows from long before Windows 10, so we might have to consider this. However, if we go with Tom's idea that we do nothing by default but just allow users to supply their own optional mapping file, then this question becomes unimportant, users can figure out for themselves whether it works, and presumably only 10+ got the update that renamed Turkey to Türkiye. [And in reality, I hope/expect that no one really does run old out-of-support OSes, because that's crazy, but I'm not allowed to assume...])

2. If we translate to BCP47 locale names like "tr-TR" automatically, should we put the ".1452" on the end? What does it mean exactly?
What does it mean if you don't put it there? (I could guess that if you don't put it on, the encoding in "char"-based functions is the "ACP". What I really want to know is, can it be different from the "ACP", and if it is, which functions does it affect? For example if the ACP is 1521 and I call _tolower_l() giving it a locale_t that I opened with "en-US.utf-8", what happens? I am sure this is a simple question but we are not Windows programmers, you are the first person to show up offering to investigate, and I personally found the docs a bit light on the topic.)

3. Do the new BCP47 locale names give *exactly* the same results for
strcoll() and tolower() etc, as the old "Turkish*" style names? (In other words, is it *exactly the same code and driving data*, just using different labels? Or is it a new locale implementation that could differ arbitrarily in behaviour? If the answer is yes, it's just a new naming scheme, then life will be much much simpler for our users, but if not, then indexes might be corrupted if we tell people to switch to the new BCP47 names, and so we'd better know about that, so we can adjust our advice to users.)


From: Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>
To: "Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Rahul Pandey <pandeyrah(at)microsoft(dot)com>, Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com>, Amy Wishnousky <amyw(at)microsoft(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Shweta Gulati <gulatishweta(at)microsoft(dot)com>, Ashish Nawal <nawalashish(at)microsoft(dot)com>
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607
Date: 2024-05-21 08:27:26
Message-ID: KL1P15301MB0387AF8D3B2FE878BF1B576AE3EA2@KL1P15301MB0387.APCP153.PROD.OUTLOOK.COM
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs


Hey @Haifang Wang (Centific Technologies Inc)<mailto:v-haiwang(at)microsoft(dot)com>

I am adding experts in this thread to address these queries. @Amy Wishnousky<mailto:amyw(at)microsoft(dot)com> @Shawn Steele<mailto:Shawn(dot)Steele(at)microsoft(dot)com> @Rahul Pandey<mailto:pandeyrah(at)microsoft(dot)com>
Adding my inline comment to best of my knowledge.

Thanks & Regards
Vishwa

________________________________
From: Haifang Wang (Centific Technologies Inc) <v-haiwang(at)microsoft(dot)com>
Sent: Tuesday, May 21, 2024 5:02 AM
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>; Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>; pgsql-bugs(at)lists(dot)postgresql(dot)org <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: RE: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607

Hi Vishwa,

Could you please help with the questions below?

Thanks!
Haifang

-----Original Message-----
From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Sent: Monday, May 20, 2024 1:57 PM
To: Haifang Wang (Centific Technologies Inc) <v-haiwang(at)microsoft(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>; Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>; pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607

On Tue, May 21, 2024 at 8:17 AM Haifang Wang (Centific Technologies
Inc) <v-haiwang(at)microsoft(dot)com> wrote:
> What questions do you have? Could you please list them clearly so that Vishwa could help to answer?

I already did, twice, but perhaps Vishwa or others can't see the whole thread, so here is this whole thread in our project email archive:

https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.postgresql.org%2Fmessage-id%2Fflat%2FPH8PR21MB3902F334A3174C54058F792CE5182%2540PH8PR21MB3902.namprd21.prod.outlook.com&data=05%7C02%7CVishwa.Deepak%40microsoft.com%7Cd5972d3ecb084308e18d08dc79252cea%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C638518447836539330%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C0%7C%7C%7C&sdata=ELfVuQ62kphneTt0ES5v2iPOhlXH2OtxU0EGBre6BG8%3D&reserved=0</message-id/flat/PH8PR21MB3902F334A3174C54058F792CE5182%40PH8PR21MB3902.namprd21.prod.outlook.com>

But let me ask the questions again, with some motivation/reason I want to know in parentheses:

1. What is the oldest Windows release that can understand the "new"
BCP47 locale names, like "tr-TR" or "tr-TR.1452"? (Some PostgreSQL versions, for example PostgreSQL 12, are expected to run on old versions of Windows from long before Windows 10, so we might have to consider this. However, if we go with Tom's idea that we do nothing by default but just allow users to supply their own optional mapping file, then this question becomes unimportant, users can figure out for themselves whether it works, and presumably only 10+ got the update that renamed Turkey to Türkiye. [And in reality, I hope/expect that no one really does run old out-of-support OSes, because that's crazy, but I'm not allowed to assume...])

Its difficult for me to answer this question with accuracy. I can see BCP47 related code in win8, so my assumption is that window 8 and above will support it.
May be added expert from feature team can validate this assumption.

2. If we translate to BCP47 locale names like "tr-TR" automatically, should we put the ".1452" on the end? What does it mean exactly?
What does it mean if you don't put it there? (I could guess that if you don't put it on, the encoding in "char"-based functions is the "ACP". What I really want to know is, can it be different from the "ACP", and if it is, which functions does it affect? For example, if the ACP is 1521 and I call _tolower_l() giving it a locale_t that I opened with "en-US.utf-8", what happens? I am sure this is a simple question but we are not Windows programmers, you are the first person to show up offering to investigate, and I personally found the docs a bit light on the topic.)

It tries to figure out the best match for the given input locale. Below link explain in more detail. As far as codepage part is concerned, may be expert add the details
https://learn.microsoft.com/en-us/windows/uwp/app-resources/how-rms-matches-lang-tags
[https://learn.microsoft.com/en-us/media/open-graph-image.png]<https://learn.microsoft.com/en-us/windows/uwp/app-resources/how-rms-matches-lang-tags>
How the Resource Management System matches language tags - UWP applications<https://learn.microsoft.com/en-us/windows/uwp/app-resources/how-rms-matches-lang-tags>
The previous topic (How the Resource Management System matches and chooses resources) looks at qualifier-matching in general. This topic focuses on language-tag-matching in more detail.
learn.microsoft.com

3. Do the new BCP47 locale names give *exactly* the same results for
strcoll() and tolower() etc, as the old "Turkish*" style names? (In other words, is it *exactly the same code and driving data*, just using different labels? Or is it a new locale implementation that could differ arbitrarily in behaviour? If the answer is yes, it's just a new naming scheme, then life will be much much simpler for our users, but if not, then indexes might be corrupted if we tell people to switch to the new BCP47 names, and so we'd better know about that, so we can adjust our advice to users.)
Yes, its exactly the same code and driving data , only spelling is changed which included replcement of u with ü

Please do proper due diligence at your end before proceeding with any kind of mapping.

Regards     
Vishwa


From: Rahul Pandey <pandeyrah(at)microsoft(dot)com>
To: Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>, "Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com>, Amy Wishnousky <amyw(at)microsoft(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Shweta Gulati <gulatishweta(at)microsoft(dot)com>, Ashish Nawal <nawalashish(at)microsoft(dot)com>
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607
Date: 2024-05-22 11:37:17
Message-ID: PSAP153MB0439B2D8386BDFF643A7B5A2BDEB2@PSAP153MB0439.APCP153.PROD.OUTLOOK.COM
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Thanks, Vishwa, for tagging me.
Adding my two cents,

Hello Thomas,

1. What is the oldest Windows release that can understand the "new" BCP47 locale names, like "tr-TR" or "tr-TR.1452"? (Some PostgreSQL versions, for example PostgreSQL 12, are expected to run on old versions of Windows from long before Windows 10, so we might have to consider this. However, if we go with Tom's idea that we do nothing by default but just allow users to supply their own optional mapping file, then this question becomes unimportant, users can figure out for themselves whether it works, and presumably only 10+ got the update that renamed Turkey to Türkiye. [And in reality, I hope/expect that no one really does run old out-of-support OSes, because that's crazy, but I'm not allowed to assume...])

Vishwa: Its difficult for me to answer this question with accuracy. I can see BCP47 related code in win8, so my assumption is that window 8 and above will support it.
May be added expert from feature team can validate this assumption.

Rahul: Locale names based on BCP 47 were first introduced in Windows Vista timeframe, so using them should be pretty safe for most modern and older versions (unless its XP or earlier).

2. If we translate to BCP47 locale names like "tr-TR" automatically, should we put the ".1452" on the end? What does it mean exactly?
What does it mean if you don't put it there? (I could guess that if you don't put it on, the encoding in "char"-based functions is the "ACP". What I really want to know is, can it be different from the "ACP", and if it is, which functions does it affect? For example, if the ACP is 1521 and I call _tolower_l() giving it a locale_t that I opened with "en-US.utf-8", what happens? I am sure this is a simple question but we are not Windows programmers, you are the first person to show up offering to investigate, and I personally found the docs a bit light on the topic.)

Vishwa: It tries to figure out the best match for the given input locale. Below link explain in more detail. As far as codepage part is concerned, may be expert add the details
https://learn.microsoft.com/en-us/windows/uwp/app-resources/how-rms-matches-lang-tags

Rahul: For the first part, your assumption is correct. The behaviour for "tr-TR" and "tr-TR.ACP" would be same and it would try to use the default ANSI Code Page for Turkish (which happens to be 1254<https://en.wikipedia.org/wiki/Windows-1254>). Using any other code page (for example"tr-TR.1252") would use that codepage (1252: English). For the second part (example), I am not sure if I understand the question completely, but mixing the encoding is almost never a good idea and could lead to mojibaked strings in the worst case to no change (if strings only contain ASCII chars) in the best-case scenario.

3. Do the new BCP47 locale names give *exactly* the same results for
strcoll() and tolower() etc, as the old "Turkish*" style names? (In other words, is it *exactly the same code and driving data*, just using different labels? Or is it a new locale implementation that could differ arbitrarily in behaviour? If the answer is yes, it's just a new naming scheme, then life will be much much simpler for our users, but if not, then indexes might be corrupted if we tell people to switch to the new BCP47 names, and so we'd better know about that, so we can adjust our advice to users.)

Vishwa: Yes, its exactly the same code and driving data , only spelling is changed which included replcement of u with ü

Rahul: I agree with Vishwa. the locale is the same, just the name of the country in English is changed. Rest all data is the same.

Thanks,
Rahul

________________________________
From: Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>
Sent: Tuesday, May 21, 2024 1:57 PM
To: Haifang Wang (Centific Technologies Inc) <v-haiwang(at)microsoft(dot)com>; Thomas Munro <thomas(dot)munro(at)gmail(dot)com>; Rahul Pandey <pandeyrah(at)microsoft(dot)com>; Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com>; Amy Wishnousky <amyw(at)microsoft(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>; pgsql-bugs(at)lists(dot)postgresql(dot)org <pgsql-bugs(at)lists(dot)postgresql(dot)org>; Shweta Gulati <gulatishweta(at)microsoft(dot)com>; Ashish Nawal <nawalashish(at)microsoft(dot)com>
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607

Hey @Haifang Wang (Centific Technologies Inc)<mailto:v-haiwang(at)microsoft(dot)com>

I am adding experts in this thread to address these queries. @Amy Wishnousky<mailto:amyw(at)microsoft(dot)com> @Shawn Steele<mailto:Shawn(dot)Steele(at)microsoft(dot)com> @Rahul Pandey<mailto:pandeyrah(at)microsoft(dot)com>
Adding my inline comment to best of my knowledge.

Thanks & Regards
Vishwa

________________________________
From: Haifang Wang (Centific Technologies Inc) <v-haiwang(at)microsoft(dot)com>
Sent: Tuesday, May 21, 2024 5:02 AM
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>; Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>; pgsql-bugs(at)lists(dot)postgresql(dot)org <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: RE: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607

Hi Vishwa,

Could you please help with the questions below?

Thanks!
Haifang

-----Original Message-----
From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Sent: Monday, May 20, 2024 1:57 PM
To: Haifang Wang (Centific Technologies Inc) <v-haiwang(at)microsoft(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>; Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>; pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607

On Tue, May 21, 2024 at 8:17 AM Haifang Wang (Centific Technologies
Inc) <v-haiwang(at)microsoft(dot)com> wrote:
> What questions do you have? Could you please list them clearly so that Vishwa could help to answer?

I already did, twice, but perhaps Vishwa or others can't see the whole thread, so here is this whole thread in our project email archive:

https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.postgresql.org%2Fmessage-id%2Fflat%2FPH8PR21MB3902F334A3174C54058F792CE5182%2540PH8PR21MB3902.namprd21.prod.outlook.com&data=05%7C02%7CVishwa.Deepak%40microsoft.com%7Cd5972d3ecb084308e18d08dc79252cea%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C638518447836539330%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C0%7C%7C%7C&sdata=ELfVuQ62kphneTt0ES5v2iPOhlXH2OtxU0EGBre6BG8%3D&reserved=0</message-id/flat/PH8PR21MB3902F334A3174C54058F792CE5182%40PH8PR21MB3902.namprd21.prod.outlook.com>

But let me ask the questions again, with some motivation/reason I want to know in parentheses:

1. What is the oldest Windows release that can understand the "new"
BCP47 locale names, like "tr-TR" or "tr-TR.1452"? (Some PostgreSQL versions, for example PostgreSQL 12, are expected to run on old versions of Windows from long before Windows 10, so we might have to consider this. However, if we go with Tom's idea that we do nothing by default but just allow users to supply their own optional mapping file, then this question becomes unimportant, users can figure out for themselves whether it works, and presumably only 10+ got the update that renamed Turkey to Türkiye. [And in reality, I hope/expect that no one really does run old out-of-support OSes, because that's crazy, but I'm not allowed to assume...])

Its difficult for me to answer this question with accuracy. I can see BCP47 related code in win8, so my assumption is that window 8 and above will support it.
May be added expert from feature team can validate this assumption.

2. If we translate to BCP47 locale names like "tr-TR" automatically, should we put the ".1452" on the end? What does it mean exactly?
What does it mean if you don't put it there? (I could guess that if you don't put it on, the encoding in "char"-based functions is the "ACP". What I really want to know is, can it be different from the "ACP", and if it is, which functions does it affect? For example, if the ACP is 1521 and I call _tolower_l() giving it a locale_t that I opened with "en-US.utf-8", what happens? I am sure this is a simple question but we are not Windows programmers, you are the first person to show up offering to investigate, and I personally found the docs a bit light on the topic.)

It tries to figure out the best match for the given input locale. Below link explain in more detail. As far as codepage part is concerned, may be expert add the details
https://learn.microsoft.com/en-us/windows/uwp/app-resources/how-rms-matches-lang-tags
[https://learn.microsoft.com/en-us/media/open-graph-image.png]<https://learn.microsoft.com/en-us/windows/uwp/app-resources/how-rms-matches-lang-tags>
How the Resource Management System matches language tags - UWP applications<https://learn.microsoft.com/en-us/windows/uwp/app-resources/how-rms-matches-lang-tags>
The previous topic (How the Resource Management System matches and chooses resources) looks at qualifier-matching in general. This topic focuses on language-tag-matching in more detail.
learn.microsoft.com

3. Do the new BCP47 locale names give *exactly* the same results for
strcoll() and tolower() etc, as the old "Turkish*" style names? (In other words, is it *exactly the same code and driving data*, just using different labels? Or is it a new locale implementation that could differ arbitrarily in behaviour? If the answer is yes, it's just a new naming scheme, then life will be much much simpler for our users, but if not, then indexes might be corrupted if we tell people to switch to the new BCP47 names, and so we'd better know about that, so we can adjust our advice to users.)
Yes, its exactly the same code and driving data , only spelling is changed which included replcement of u with ü

Please do proper due diligence at your end before proceeding with any kind of mapping.

Regards     
Vishwa


From: "Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com>
To: Rahul Pandey <pandeyrah(at)microsoft(dot)com>, Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com>, Amy Wishnousky <amyw(at)microsoft(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Shweta Gulati <gulatishweta(at)microsoft(dot)com>, Ashish Nawal <nawalashish(at)microsoft(dot)com>
Subject: RE: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607
Date: 2024-05-28 18:21:18
Message-ID: PH8PR21MB3902C9024782A8A3E26370ACE5F12@PH8PR21MB3902.namprd21.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Thanks Vishwa for all the clarification below.

Hi @Rahul<mailto:pandeyrah(at)microsoft(dot)com> and everyone,

Is there anything else not clear? Is there any solution for the issue?

Thanks!
Haifang

From: Rahul Pandey <pandeyrah(at)microsoft(dot)com>
Sent: Wednesday, May 22, 2024 4:37 AM
To: Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>; Haifang Wang (Centific Technologies Inc) <v-haiwang(at)microsoft(dot)com>; Thomas Munro <thomas(dot)munro(at)gmail(dot)com>; Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com>; Amy Wishnousky <amyw(at)microsoft(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>; pgsql-bugs(at)lists(dot)postgresql(dot)org; Shweta Gulati <gulatishweta(at)microsoft(dot)com>; Ashish Nawal <nawalashish(at)microsoft(dot)com>
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607

Thanks, Vishwa, for tagging me.
Adding my two cents,

Hello Thomas,

1. What is the oldest Windows release that can understand the "new" BCP47 locale names, like "tr-TR" or "tr-TR.1452"? (Some PostgreSQL versions, for example PostgreSQL 12, are expected to run on old versions of Windows from long before Windows 10, so we might have to consider this. However, if we go with Tom's idea that we do nothing by default but just allow users to supply their own optional mapping file, then this question becomes unimportant, users can figure out for themselves whether it works, and presumably only 10+ got the update that renamed Turkey to Türkiye. [And in reality, I hope/expect that no one really does run old out-of-support OSes, because that's crazy, but I'm not allowed to assume...])
Vishwa: Its difficult for me to answer this question with accuracy. I can see BCP47 related code in win8, so my assumption is that window 8 and above will support it.
May be added expert from feature team can validate this assumption.

Rahul: Locale names based on BCP 47 were first introduced in Windows Vista timeframe, so using them should be pretty safe for most modern and older versions (unless its XP or earlier).

2. If we translate to BCP47 locale names like "tr-TR" automatically, should we put the ".1452" on the end? What does it mean exactly?
What does it mean if you don't put it there? (I could guess that if you don't put it on, the encoding in "char"-based functions is the "ACP". What I really want to know is, can it be different from the "ACP", and if it is, which functions does it affect? For example, if the ACP is 1521 and I call _tolower_l() giving it a locale_t that I opened with "en-US.utf-8", what happens? I am sure this is a simple question but we are not Windows programmers, you are the first person to show up offering to investigate, and I personally found the docs a bit light on the topic.)

Vishwa: It tries to figure out the best match for the given input locale. Below link explain in more detail. As far as codepage part is concerned, may be expert add the details
https://learn.microsoft.com/en-us/windows/uwp/app-resources/how-rms-matches-lang-tags

Rahul: For the first part, your assumption is correct. The behaviour for "tr-TR" and "tr-TR.ACP" would be same and it would try to use the default ANSI Code Page for Turkish (which happens to be 1254<https://en.wikipedia.org/wiki/Windows-1254>). Using any other code page (for example"tr-TR.1252") would use that codepage (1252: English). For the second part (example), I am not sure if I understand the question completely, but mixing the encoding is almost never a good idea and could lead to mojibaked strings in the worst case to no change (if strings only contain ASCII chars) in the best-case scenario.

3. Do the new BCP47 locale names give *exactly* the same results for
strcoll() and tolower() etc, as the old "Turkish*" style names? (In other words, is it *exactly the same code and driving data*, just using different labels? Or is it a new locale implementation that could differ arbitrarily in behaviour? If the answer is yes, it's just a new naming scheme, then life will be much much simpler for our users, but if not, then indexes might be corrupted if we tell people to switch to the new BCP47 names, and so we'd better know about that, so we can adjust our advice to users.)

Vishwa: Yes, its exactly the same code and driving data , only spelling is changed which included replcement of u with ü

Rahul: I agree with Vishwa. the locale is the same, just the name of the country in English is changed. Rest all data is the same.

Thanks,
Rahul

________________________________
From: Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com<mailto:Vishwa(dot)Deepak(at)microsoft(dot)com>>
Sent: Tuesday, May 21, 2024 1:57 PM
To: Haifang Wang (Centific Technologies Inc) <v-haiwang(at)microsoft(dot)com<mailto:v-haiwang(at)microsoft(dot)com>>; Thomas Munro <thomas(dot)munro(at)gmail(dot)com<mailto:thomas(dot)munro(at)gmail(dot)com>>; Rahul Pandey <pandeyrah(at)microsoft(dot)com<mailto:pandeyrah(at)microsoft(dot)com>>; Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com<mailto:Shawn(dot)Steele(at)microsoft(dot)com>>; Amy Wishnousky <amyw(at)microsoft(dot)com<mailto:amyw(at)microsoft(dot)com>>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us<mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us>>; pgsql-bugs(at)lists(dot)postgresql(dot)org<mailto:pgsql-bugs(at)lists(dot)postgresql(dot)org> <pgsql-bugs(at)lists(dot)postgresql(dot)org<mailto:pgsql-bugs(at)lists(dot)postgresql(dot)org>>; Shweta Gulati <gulatishweta(at)microsoft(dot)com<mailto:gulatishweta(at)microsoft(dot)com>>; Ashish Nawal <nawalashish(at)microsoft(dot)com<mailto:nawalashish(at)microsoft(dot)com>>
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607

Hey @Haifang Wang (Centific Technologies Inc)<mailto:v-haiwang(at)microsoft(dot)com>
I am adding experts in this thread to address these queries. @Amy Wishnousky<mailto:amyw(at)microsoft(dot)com> @Shawn Steele<mailto:Shawn(dot)Steele(at)microsoft(dot)com> @Rahul Pandey<mailto:pandeyrah(at)microsoft(dot)com>
Adding my inline comment to best of my knowledge.

Thanks & Regards
Vishwa

________________________________
From: Haifang Wang (Centific Technologies Inc) <v-haiwang(at)microsoft(dot)com<mailto:v-haiwang(at)microsoft(dot)com>>
Sent: Tuesday, May 21, 2024 5:02 AM
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com<mailto:thomas(dot)munro(at)gmail(dot)com>>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us<mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us>>; Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com<mailto:Vishwa(dot)Deepak(at)microsoft(dot)com>>; pgsql-bugs(at)lists(dot)postgresql(dot)org<mailto:pgsql-bugs(at)lists(dot)postgresql(dot)org> <pgsql-bugs(at)lists(dot)postgresql(dot)org<mailto:pgsql-bugs(at)lists(dot)postgresql(dot)org>>
Subject: RE: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607

Hi Vishwa,

Could you please help with the questions below?

Thanks!
Haifang

-----Original Message-----
From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com<mailto:thomas(dot)munro(at)gmail(dot)com>>
Sent: Monday, May 20, 2024 1:57 PM
To: Haifang Wang (Centific Technologies Inc) <v-haiwang(at)microsoft(dot)com<mailto:v-haiwang(at)microsoft(dot)com>>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us<mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us>>; Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com<mailto:Vishwa(dot)Deepak(at)microsoft(dot)com>>; pgsql-bugs(at)lists(dot)postgresql(dot)org<mailto:pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607

On Tue, May 21, 2024 at 8:17 AM Haifang Wang (Centific Technologies
Inc) <v-haiwang(at)microsoft(dot)com<mailto:v-haiwang(at)microsoft(dot)com>> wrote:
> What questions do you have? Could you please list them clearly so that Vishwa could help to answer?

I already did, twice, but perhaps Vishwa or others can't see the whole thread, so here is this whole thread in our project email archive:

https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.postgresql.org%2Fmessage-id%2Fflat%2FPH8PR21MB3902F334A3174C54058F792CE5182%2540PH8PR21MB3902.namprd21.prod.outlook.com&data=05%7C02%7CVishwa.Deepak%40microsoft.com%7Cd5972d3ecb084308e18d08dc79252cea%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C638518447836539330%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C0%7C%7C%7C&sdata=ELfVuQ62kphneTt0ES5v2iPOhlXH2OtxU0EGBre6BG8%3D&reserved=0</message-id/flat/PH8PR21MB3902F334A3174C54058F792CE5182%40PH8PR21MB3902.namprd21.prod.outlook.com>

But let me ask the questions again, with some motivation/reason I want to know in parentheses:

1. What is the oldest Windows release that can understand the "new"
BCP47 locale names, like "tr-TR" or "tr-TR.1452"? (Some PostgreSQL versions, for example PostgreSQL 12, are expected to run on old versions of Windows from long before Windows 10, so we might have to consider this. However, if we go with Tom's idea that we do nothing by default but just allow users to supply their own optional mapping file, then this question becomes unimportant, users can figure out for themselves whether it works, and presumably only 10+ got the update that renamed Turkey to Türkiye. [And in reality, I hope/expect that no one really does run old out-of-support OSes, because that's crazy, but I'm not allowed to assume...])

Its difficult for me to answer this question with accuracy. I can see BCP47 related code in win8, so my assumption is that window 8 and above will support it.
May be added expert from feature team can validate this assumption.

2. If we translate to BCP47 locale names like "tr-TR" automatically, should we put the ".1452" on the end? What does it mean exactly?
What does it mean if you don't put it there? (I could guess that if you don't put it on, the encoding in "char"-based functions is the "ACP". What I really want to know is, can it be different from the "ACP", and if it is, which functions does it affect? For example, if the ACP is 1521 and I call _tolower_l() giving it a locale_t that I opened with "en-US.utf-8", what happens? I am sure this is a simple question but we are not Windows programmers, you are the first person to show up offering to investigate, and I personally found the docs a bit light on the topic.)

It tries to figure out the best match for the given input locale. Below link explain in more detail. As far as codepage part is concerned, may be expert add the details
https://learn.microsoft.com/en-us/windows/uwp/app-resources/how-rms-matches-lang-tags
[https://learn.microsoft.com/en-us/media/open-graph-image.png]<https://learn.microsoft.com/en-us/windows/uwp/app-resources/how-rms-matches-lang-tags>
How the Resource Management System matches language tags - UWP applications<https://learn.microsoft.com/en-us/windows/uwp/app-resources/how-rms-matches-lang-tags>
The previous topic (How the Resource Management System matches and chooses resources) looks at qualifier-matching in general. This topic focuses on language-tag-matching in more detail.
learn.microsoft.com

3. Do the new BCP47 locale names give *exactly* the same results for
strcoll() and tolower() etc, as the old "Turkish*" style names? (In other words, is it *exactly the same code and driving data*, just using different labels? Or is it a new locale implementation that could differ arbitrarily in behaviour? If the answer is yes, it's just a new naming scheme, then life will be much much simpler for our users, but if not, then indexes might be corrupted if we tell people to switch to the new BCP47 names, and so we'd better know about that, so we can adjust our advice to users.)
Yes, its exactly the same code and driving data , only spelling is changed which included replcement of u with ü

Please do proper due diligence at your end before proceeding with any kind of mapping.
Regards     
Vishwa


From: "Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com>
To: Rahul Pandey <pandeyrah(at)microsoft(dot)com>, Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com>, Amy Wishnousky <amyw(at)microsoft(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Shweta Gulati <gulatishweta(at)microsoft(dot)com>, Ashish Nawal <nawalashish(at)microsoft(dot)com>
Subject: RE: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607
Date: 2024-06-03 13:00:23
Message-ID: PH8PR21MB39022FCDB4806CD57694F50AE5FF2@PH8PR21MB3902.namprd21.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Hi Thomas and Team,

Just would like to follow up with you about this bug. Hope Rahul and Vishwa have answered all your questions below. Did you get chance to do investigations? Any other questions?

Thanks!
Haifang

From: Haifang Wang (Centific Technologies Inc)
Sent: Tuesday, May 28, 2024 11:21 AM
To: Rahul Pandey <pandeyrah(at)microsoft(dot)com>; Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>; Thomas Munro <thomas(dot)munro(at)gmail(dot)com>; Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com>; Amy Wishnousky <amyw(at)microsoft(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>; pgsql-bugs(at)lists(dot)postgresql(dot)org; Shweta Gulati <gulatishweta(at)microsoft(dot)com>; Ashish Nawal <nawalashish(at)microsoft(dot)com>
Subject: RE: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607

Thanks Vishwa for all the clarification below.

Hi @Rahul<mailto:pandeyrah(at)microsoft(dot)com> and everyone,

Is there anything else not clear? Is there any solution for the issue?

Thanks!
Haifang

From: Rahul Pandey <pandeyrah(at)microsoft(dot)com<mailto:pandeyrah(at)microsoft(dot)com>>
Sent: Wednesday, May 22, 2024 4:37 AM
To: Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com<mailto:Vishwa(dot)Deepak(at)microsoft(dot)com>>; Haifang Wang (Centific Technologies Inc) <v-haiwang(at)microsoft(dot)com<mailto:v-haiwang(at)microsoft(dot)com>>; Thomas Munro <thomas(dot)munro(at)gmail(dot)com<mailto:thomas(dot)munro(at)gmail(dot)com>>; Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com<mailto:Shawn(dot)Steele(at)microsoft(dot)com>>; Amy Wishnousky <amyw(at)microsoft(dot)com<mailto:amyw(at)microsoft(dot)com>>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us<mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us>>; pgsql-bugs(at)lists(dot)postgresql(dot)org<mailto:pgsql-bugs(at)lists(dot)postgresql(dot)org>; Shweta Gulati <gulatishweta(at)microsoft(dot)com<mailto:gulatishweta(at)microsoft(dot)com>>; Ashish Nawal <nawalashish(at)microsoft(dot)com<mailto:nawalashish(at)microsoft(dot)com>>
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607

Thanks, Vishwa, for tagging me.
Adding my two cents,

Hello Thomas,

1. What is the oldest Windows release that can understand the "new" BCP47 locale names, like "tr-TR" or "tr-TR.1452"? (Some PostgreSQL versions, for example PostgreSQL 12, are expected to run on old versions of Windows from long before Windows 10, so we might have to consider this. However, if we go with Tom's idea that we do nothing by default but just allow users to supply their own optional mapping file, then this question becomes unimportant, users can figure out for themselves whether it works, and presumably only 10+ got the update that renamed Turkey to Türkiye. [And in reality, I hope/expect that no one really does run old out-of-support OSes, because that's crazy, but I'm not allowed to assume...])
Vishwa: Its difficult for me to answer this question with accuracy. I can see BCP47 related code in win8, so my assumption is that window 8 and above will support it.
May be added expert from feature team can validate this assumption.

Rahul: Locale names based on BCP 47 were first introduced in Windows Vista timeframe, so using them should be pretty safe for most modern and older versions (unless its XP or earlier).

2. If we translate to BCP47 locale names like "tr-TR" automatically, should we put the ".1452" on the end? What does it mean exactly?
What does it mean if you don't put it there? (I could guess that if you don't put it on, the encoding in "char"-based functions is the "ACP". What I really want to know is, can it be different from the "ACP", and if it is, which functions does it affect? For example, if the ACP is 1521 and I call _tolower_l() giving it a locale_t that I opened with "en-US.utf-8", what happens? I am sure this is a simple question but we are not Windows programmers, you are the first person to show up offering to investigate, and I personally found the docs a bit light on the topic.)

Vishwa: It tries to figure out the best match for the given input locale. Below link explain in more detail. As far as codepage part is concerned, may be expert add the details
https://learn.microsoft.com/en-us/windows/uwp/app-resources/how-rms-matches-lang-tags

Rahul: For the first part, your assumption is correct. The behaviour for "tr-TR" and "tr-TR.ACP" would be same and it would try to use the default ANSI Code Page for Turkish (which happens to be 1254<https://en.wikipedia.org/wiki/Windows-1254>). Using any other code page (for example"tr-TR.1252") would use that codepage (1252: English). For the second part (example), I am not sure if I understand the question completely, but mixing the encoding is almost never a good idea and could lead to mojibaked strings in the worst case to no change (if strings only contain ASCII chars) in the best-case scenario.

3. Do the new BCP47 locale names give *exactly* the same results for
strcoll() and tolower() etc, as the old "Turkish*" style names? (In other words, is it *exactly the same code and driving data*, just using different labels? Or is it a new locale implementation that could differ arbitrarily in behaviour? If the answer is yes, it's just a new naming scheme, then life will be much much simpler for our users, but if not, then indexes might be corrupted if we tell people to switch to the new BCP47 names, and so we'd better know about that, so we can adjust our advice to users.)

Vishwa: Yes, its exactly the same code and driving data , only spelling is changed which included replcement of u with ü

Rahul: I agree with Vishwa. the locale is the same, just the name of the country in English is changed. Rest all data is the same.

Thanks,
Rahul

________________________________
From: Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com<mailto:Vishwa(dot)Deepak(at)microsoft(dot)com>>
Sent: Tuesday, May 21, 2024 1:57 PM
To: Haifang Wang (Centific Technologies Inc) <v-haiwang(at)microsoft(dot)com<mailto:v-haiwang(at)microsoft(dot)com>>; Thomas Munro <thomas(dot)munro(at)gmail(dot)com<mailto:thomas(dot)munro(at)gmail(dot)com>>; Rahul Pandey <pandeyrah(at)microsoft(dot)com<mailto:pandeyrah(at)microsoft(dot)com>>; Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com<mailto:Shawn(dot)Steele(at)microsoft(dot)com>>; Amy Wishnousky <amyw(at)microsoft(dot)com<mailto:amyw(at)microsoft(dot)com>>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us<mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us>>; pgsql-bugs(at)lists(dot)postgresql(dot)org<mailto:pgsql-bugs(at)lists(dot)postgresql(dot)org> <pgsql-bugs(at)lists(dot)postgresql(dot)org<mailto:pgsql-bugs(at)lists(dot)postgresql(dot)org>>; Shweta Gulati <gulatishweta(at)microsoft(dot)com<mailto:gulatishweta(at)microsoft(dot)com>>; Ashish Nawal <nawalashish(at)microsoft(dot)com<mailto:nawalashish(at)microsoft(dot)com>>
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607

Hey @Haifang Wang (Centific Technologies Inc)<mailto:v-haiwang(at)microsoft(dot)com>
I am adding experts in this thread to address these queries. @Amy Wishnousky<mailto:amyw(at)microsoft(dot)com> @Shawn Steele<mailto:Shawn(dot)Steele(at)microsoft(dot)com> @Rahul Pandey<mailto:pandeyrah(at)microsoft(dot)com>
Adding my inline comment to best of my knowledge.

Thanks & Regards
Vishwa

________________________________
From: Haifang Wang (Centific Technologies Inc) <v-haiwang(at)microsoft(dot)com<mailto:v-haiwang(at)microsoft(dot)com>>
Sent: Tuesday, May 21, 2024 5:02 AM
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com<mailto:thomas(dot)munro(at)gmail(dot)com>>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us<mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us>>; Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com<mailto:Vishwa(dot)Deepak(at)microsoft(dot)com>>; pgsql-bugs(at)lists(dot)postgresql(dot)org<mailto:pgsql-bugs(at)lists(dot)postgresql(dot)org> <pgsql-bugs(at)lists(dot)postgresql(dot)org<mailto:pgsql-bugs(at)lists(dot)postgresql(dot)org>>
Subject: RE: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607

Hi Vishwa,

Could you please help with the questions below?

Thanks!
Haifang

-----Original Message-----
From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com<mailto:thomas(dot)munro(at)gmail(dot)com>>
Sent: Monday, May 20, 2024 1:57 PM
To: Haifang Wang (Centific Technologies Inc) <v-haiwang(at)microsoft(dot)com<mailto:v-haiwang(at)microsoft(dot)com>>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us<mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us>>; Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com<mailto:Vishwa(dot)Deepak(at)microsoft(dot)com>>; pgsql-bugs(at)lists(dot)postgresql(dot)org<mailto:pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607

On Tue, May 21, 2024 at 8:17 AM Haifang Wang (Centific Technologies
Inc) <v-haiwang(at)microsoft(dot)com<mailto:v-haiwang(at)microsoft(dot)com>> wrote:
> What questions do you have? Could you please list them clearly so that Vishwa could help to answer?

I already did, twice, but perhaps Vishwa or others can't see the whole thread, so here is this whole thread in our project email archive:

https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.postgresql.org%2Fmessage-id%2Fflat%2FPH8PR21MB3902F334A3174C54058F792CE5182%2540PH8PR21MB3902.namprd21.prod.outlook.com&data=05%7C02%7CVishwa.Deepak%40microsoft.com%7Cd5972d3ecb084308e18d08dc79252cea%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C638518447836539330%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C0%7C%7C%7C&sdata=ELfVuQ62kphneTt0ES5v2iPOhlXH2OtxU0EGBre6BG8%3D&reserved=0</message-id/flat/PH8PR21MB3902F334A3174C54058F792CE5182%40PH8PR21MB3902.namprd21.prod.outlook.com>

But let me ask the questions again, with some motivation/reason I want to know in parentheses:

1. What is the oldest Windows release that can understand the "new"
BCP47 locale names, like "tr-TR" or "tr-TR.1452"? (Some PostgreSQL versions, for example PostgreSQL 12, are expected to run on old versions of Windows from long before Windows 10, so we might have to consider this. However, if we go with Tom's idea that we do nothing by default but just allow users to supply their own optional mapping file, then this question becomes unimportant, users can figure out for themselves whether it works, and presumably only 10+ got the update that renamed Turkey to Türkiye. [And in reality, I hope/expect that no one really does run old out-of-support OSes, because that's crazy, but I'm not allowed to assume...])

Its difficult for me to answer this question with accuracy. I can see BCP47 related code in win8, so my assumption is that window 8 and above will support it.
May be added expert from feature team can validate this assumption.

2. If we translate to BCP47 locale names like "tr-TR" automatically, should we put the ".1452" on the end? What does it mean exactly?
What does it mean if you don't put it there? (I could guess that if you don't put it on, the encoding in "char"-based functions is the "ACP". What I really want to know is, can it be different from the "ACP", and if it is, which functions does it affect? For example, if the ACP is 1521 and I call _tolower_l() giving it a locale_t that I opened with "en-US.utf-8", what happens? I am sure this is a simple question but we are not Windows programmers, you are the first person to show up offering to investigate, and I personally found the docs a bit light on the topic.)

It tries to figure out the best match for the given input locale. Below link explain in more detail. As far as codepage part is concerned, may be expert add the details
https://learn.microsoft.com/en-us/windows/uwp/app-resources/how-rms-matches-lang-tags
[https://learn.microsoft.com/en-us/media/open-graph-image.png]<https://learn.microsoft.com/en-us/windows/uwp/app-resources/how-rms-matches-lang-tags>
How the Resource Management System matches language tags - UWP applications<https://learn.microsoft.com/en-us/windows/uwp/app-resources/how-rms-matches-lang-tags>
The previous topic (How the Resource Management System matches and chooses resources) looks at qualifier-matching in general. This topic focuses on language-tag-matching in more detail.
learn.microsoft.com

3. Do the new BCP47 locale names give *exactly* the same results for
strcoll() and tolower() etc, as the old "Turkish*" style names? (In other words, is it *exactly the same code and driving data*, just using different labels? Or is it a new locale implementation that could differ arbitrarily in behaviour? If the answer is yes, it's just a new naming scheme, then life will be much much simpler for our users, but if not, then indexes might be corrupted if we tell people to switch to the new BCP47 names, and so we'd better know about that, so we can adjust our advice to users.)
Yes, its exactly the same code and driving data , only spelling is changed which included replcement of u with ü

Please do proper due diligence at your end before proceeding with any kind of mapping.
Regards     
Vishwa


From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: "Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com>
Cc: Rahul Pandey <pandeyrah(at)microsoft(dot)com>, Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>, Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com>, Amy Wishnousky <amyw(at)microsoft(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Shweta Gulati <gulatishweta(at)microsoft(dot)com>, Ashish Nawal <nawalashish(at)microsoft(dot)com>
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607
Date: 2024-06-03 21:22:22
Message-ID: CA+hUKGK-G=DB5MttZzzZf7mzTMxRVSACNg-WGK59Tk1Q9vY3cA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

On Tue, Jun 4, 2024 at 1:00 AM Haifang Wang (Centific Technologies
Inc) <v-haiwang(at)microsoft(dot)com> wrote:
> Just would like to follow up with you about this bug. Hope Rahul and Vishwa have answered all your questions below. Did you get chance to do investigations? Any other questions?

Hi,

Thanks for all your feedback! Sorry for my late replies, I was
travelling and at a conference last week (at which one of the topics
was "what are we going to do about the sorry state of PostgreSQL on
Windows", but that's a wider topic than this thread...)

The answers are all useful, thank you. There was just one thing I
wanted to clarify. Vishwa said:

"Yes, its exactly the same code and driving data , only spelling is
changed which included replcement of u with ü"

That sounds like it is answering the question 'Do
"Turkish_Turkey.1254" and "Turkish_Türkiye.1254" behave identically?',
which I had already assumed to be true, but I was actually asking 'Do
""Turkish_Turkey.1254", "Turkish_Türkiye.1254" and "tr-TR.1254" all
behave identically?'. It sounds like the answer is probably yes, but
the mention of "u" vs "ü" implied a narrower answer than I was looking
for... It's important, because we're proposing to translate to
"tr-TR.1254".

Since no one has come forward to test the patch I wrote on Windows, I
think my next move will be to try to make a build option that can also
do locale name renaming on Unix, so that I have something that I could
test myself and push for the next release of PostgreSQL which will be
in October.

One open question is whether we should ship a translation file
ourselves. I have heard two opinions: Tom Lane proposed in this
thread that there should be no file initially, we let the user figure
out what to put in there. Magnus Hagander proposed at the pgconf.dev
conference last week that we should ideally ship a complete
translation file and maintain it over time, and it should be installed
not in pgdata but rather in the install directory. I am not
personally able or willing to maintain such a file, I'm only offering
to supply the C code to read it and do what it says, but perhaps a
group such as the EDB Installer maintainer group, or at least someone
with a vested interest in PostgreSQL on Windows, might like to own
that job.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: "Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com>, Rahul Pandey <pandeyrah(at)microsoft(dot)com>, Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>, Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com>, Amy Wishnousky <amyw(at)microsoft(dot)com>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Shweta Gulati <gulatishweta(at)microsoft(dot)com>, Ashish Nawal <nawalashish(at)microsoft(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607
Date: 2024-06-03 22:54:51
Message-ID: 151229.1717455291@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Thomas Munro <thomas(dot)munro(at)gmail(dot)com> writes:
> One open question is whether we should ship a translation file
> ourselves. I have heard two opinions: Tom Lane proposed in this
> thread that there should be no file initially, we let the user figure
> out what to put in there. Magnus Hagander proposed at the pgconf.dev
> conference last week that we should ideally ship a complete
> translation file and maintain it over time, and it should be installed
> not in pgdata but rather in the install directory.

FWIW, I'm kind of down on the latter approach, because I don't think
it'll move the needle very far. Based on track record so far, there's
no chance that we will be aware of a Microsoft locale renaming before
it starts breaking users' databases. Therefore, "edit the translation
file" is going to have to be a documented process in any case, because
affected users are not going to want to wait around for our next
release for a fix. Also, if people do have to do that, it doesn't
seem like a great idea to tell them to modify an installed file rather
than a cluster-local configuration file. What if they do a minor
version update but the minor version doesn't (yet) contain the fix?

Admittedly, the installed-file approach could make it more transparent
for people who'd done a PG minor update before the relevant Windows
update. I'm not sure how large that set of people will be, though.

regards, tom lane


From: "Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Rahul Pandey <pandeyrah(at)microsoft(dot)com>, Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>, Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com>, Amy Wishnousky <amyw(at)microsoft(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Shweta Gulati <gulatishweta(at)microsoft(dot)com>, Ashish Nawal <nawalashish(at)microsoft(dot)com>
Subject: RE: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607
Date: 2024-06-04 00:09:17
Message-ID: PH8PR21MB3902C10CE3A1B07FE8C3EEF2E5F82@PH8PR21MB3902.namprd21.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Thanks, Thomas. I would leave the questions to Vishwa and Rahul to answer.

Regards!
Haifang

-----Original Message-----
From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Sent: Monday, June 3, 2024 2:22 PM
To: Haifang Wang (Centific Technologies Inc) <v-haiwang(at)microsoft(dot)com>
Cc: Rahul Pandey <pandeyrah(at)microsoft(dot)com>; Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>; Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com>; Amy Wishnousky <amyw(at)microsoft(dot)com>; Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>; pgsql-bugs(at)lists(dot)postgresql(dot)org; Shweta Gulati <gulatishweta(at)microsoft(dot)com>; Ashish Nawal <nawalashish(at)microsoft(dot)com>
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607

On Tue, Jun 4, 2024 at 1:00 AM Haifang Wang (Centific Technologies
Inc) <v-haiwang(at)microsoft(dot)com> wrote:
> Just would like to follow up with you about this bug. Hope Rahul and Vishwa have answered all your questions below. Did you get chance to do investigations? Any other questions?

Hi,

Thanks for all your feedback! Sorry for my late replies, I was travelling and at a conference last week (at which one of the topics was "what are we going to do about the sorry state of PostgreSQL on Windows", but that's a wider topic than this thread...)

The answers are all useful, thank you. There was just one thing I wanted to clarify. Vishwa said:

"Yes, its exactly the same code and driving data , only spelling is changed which included replcement of u with ü"

That sounds like it is answering the question 'Do "Turkish_Turkey.1254" and "Turkish_Türkiye.1254" behave identically?', which I had already assumed to be true, but I was actually asking 'Do ""Turkish_Turkey.1254", "Turkish_Türkiye.1254" and "tr-TR.1254" all behave identically?'. It sounds like the answer is probably yes, but the mention of "u" vs "ü" implied a narrower answer than I was looking for... It's important, because we're proposing to translate to "tr-TR.1254".

Since no one has come forward to test the patch I wrote on Windows, I think my next move will be to try to make a build option that can also do locale name renaming on Unix, so that I have something that I could test myself and push for the next release of PostgreSQL which will be in October.

One open question is whether we should ship a translation file ourselves. I have heard two opinions: Tom Lane proposed in this thread that there should be no file initially, we let the user figure out what to put in there. Magnus Hagander proposed at the pgconf.dev conference last week that we should ideally ship a complete translation file and maintain it over time, and it should be installed not in pgdata but rather in the install directory. I am not personally able or willing to maintain such a file, I'm only offering to supply the C code to read it and do what it says, but perhaps a group such as the EDB Installer maintainer group, or at least someone with a vested interest in PostgreSQL on Windows, might like to own that job.


From: "Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Rahul Pandey <pandeyrah(at)microsoft(dot)com>, Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>, Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com>, Amy Wishnousky <amyw(at)microsoft(dot)com>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Shweta Gulati <gulatishweta(at)microsoft(dot)com>, Ashish Nawal <nawalashish(at)microsoft(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>
Subject: RE: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607
Date: 2024-06-04 00:13:05
Message-ID: PH8PR21MB390299F7550829468654A77FE5F82@PH8PR21MB3902.namprd21.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Thanks, Tom. Is there any suggestion on how to track all the conversations in one thread? Seems like all our previous discussion are in different mail thread. It is not easy to track. 😊

Regards!
Haifang

-----Original Message-----
From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Sent: Monday, June 3, 2024 3:55 PM
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Haifang Wang (Centific Technologies Inc) <v-haiwang(at)microsoft(dot)com>; Rahul Pandey <pandeyrah(at)microsoft(dot)com>; Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>; Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com>; Amy Wishnousky <amyw(at)microsoft(dot)com>; pgsql-bugs(at)lists(dot)postgresql(dot)org; Shweta Gulati <gulatishweta(at)microsoft(dot)com>; Ashish Nawal <nawalashish(at)microsoft(dot)com>; Magnus Hagander <magnus(at)hagander(dot)net>
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607

Thomas Munro <thomas(dot)munro(at)gmail(dot)com> writes:
> One open question is whether we should ship a translation file
> ourselves. I have heard two opinions: Tom Lane proposed in this
> thread that there should be no file initially, we let the user figure
> out what to put in there. Magnus Hagander proposed at the pgconf.dev
> conference last week that we should ideally ship a complete
> translation file and maintain it over time, and it should be installed
> not in pgdata but rather in the install directory.

FWIW, I'm kind of down on the latter approach, because I don't think it'll move the needle very far. Based on track record so far, there's no chance that we will be aware of a Microsoft locale renaming before it starts breaking users' databases. Therefore, "edit the translation file" is going to have to be a documented process in any case, because affected users are not going to want to wait around for our next release for a fix. Also, if people do have to do that, it doesn't seem like a great idea to tell them to modify an installed file rather than a cluster-local configuration file. What if they do a minor version update but the minor version doesn't (yet) contain the fix?

Admittedly, the installed-file approach could make it more transparent for people who'd done a PG minor update before the relevant Windows update. I'm not sure how large that set of people will be, though.

regards, tom lane


From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: "Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Rahul Pandey <pandeyrah(at)microsoft(dot)com>, Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>, Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com>, Amy Wishnousky <amyw(at)microsoft(dot)com>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Shweta Gulati <gulatishweta(at)microsoft(dot)com>, Ashish Nawal <nawalashish(at)microsoft(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607
Date: 2024-06-05 06:20:26
Message-ID: CA+hUKGJOoVT1iTqgvt4xynTG=tFCfjzZAs-0hQ7Pv3veiNhwwQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

On Tue, Jun 4, 2024 at 12:13 PM Haifang Wang (Centific Technologies
Inc) <v-haiwang(at)microsoft(dot)com> wrote:
> Thanks, Tom. Is there any suggestion on how to track all the conversations in one thread? Seems like all our previous discussion are in different mail thread. It is not easy to track. 😊

FWIW this shows up as one thread in my email client, and in the
PostgreSQL archives website[1]. I don't know much about email and
which RFCs or conventions are at work here, but apparently different
clients are using different techniques to identify threads. I assume
it could be done with thread headers (as seen in this thread) or
reply-to chains or fuzzy recognition of subject etc. Given that
Outlook seems to ignore the "> " inline response quoting convention
used by the rest of the internet, it wouldn't surprise me to hear that
it also doesn't follow the conventions for recognising threads either,
being a sort of related topic.

[1] /message-id/flat/PH8PR21MB3902F334A3174C54058F792CE5182%40PH8PR21MB3902.namprd21.prod.outlook.com


From: Rahul Pandey <pandeyrah(at)microsoft(dot)com>
To: "Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>, Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com>, Amy Wishnousky <amyw(at)microsoft(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Shweta Gulati <gulatishweta(at)microsoft(dot)com>, Ashish Nawal <nawalashish(at)microsoft(dot)com>, 🎯dev <targetdev(at)microsoft(dot)com>
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607
Date: 2024-06-05 09:10:30
Message-ID: PSAP153MB0439B5E53B80F6D4EF37AD49BDF92@PSAP153MB0439.APCP153.PROD.OUTLOOK.COM
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Hi,

"Do "Turkish_Turkey.1254", "Turkish_Türkiye.1254" and "tr-TR.1254" all behave identically?"
You are correct, they all behave in exactly the same way.
I too would recommend using "tr-TR.1254" since BCP47 tags have a defined structure and this would make the implementation more future proof.

As for translation files, apologies I neither have a preference nor expertise to comment on this.

________________________________
From: Haifang Wang (Centific Technologies Inc) <v-haiwang(at)microsoft(dot)com>
Sent: Tuesday, June 4, 2024 5:39 AM
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Rahul Pandey <pandeyrah(at)microsoft(dot)com>; Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>; Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com>; Amy Wishnousky <amyw(at)microsoft(dot)com>; Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>; pgsql-bugs(at)lists(dot)postgresql(dot)org <pgsql-bugs(at)lists(dot)postgresql(dot)org>; Shweta Gulati <gulatishweta(at)microsoft(dot)com>; Ashish Nawal <nawalashish(at)microsoft(dot)com>
Subject: RE: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607

Thanks, Thomas. I would leave the questions to Vishwa and Rahul to answer.

Regards!
Haifang

-----Original Message-----
From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Sent: Monday, June 3, 2024 2:22 PM
To: Haifang Wang (Centific Technologies Inc) <v-haiwang(at)microsoft(dot)com>
Cc: Rahul Pandey <pandeyrah(at)microsoft(dot)com>; Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>; Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com>; Amy Wishnousky <amyw(at)microsoft(dot)com>; Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>; pgsql-bugs(at)lists(dot)postgresql(dot)org; Shweta Gulati <gulatishweta(at)microsoft(dot)com>; Ashish Nawal <nawalashish(at)microsoft(dot)com>
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607

On Tue, Jun 4, 2024 at 1:00 AM Haifang Wang (Centific Technologies
Inc) <v-haiwang(at)microsoft(dot)com> wrote:
> Just would like to follow up with you about this bug. Hope Rahul and Vishwa have answered all your questions below. Did you get chance to do investigations? Any other questions?

Hi,

Thanks for all your feedback! Sorry for my late replies, I was travelling and at a conference last week (at which one of the topics was "what are we going to do about the sorry state of PostgreSQL on Windows", but that's a wider topic than this thread...)

The answers are all useful, thank you. There was just one thing I wanted to clarify. Vishwa said:

"Yes, its exactly the same code and driving data , only spelling is changed which included replcement of u with ü"

That sounds like it is answering the question 'Do "Turkish_Turkey.1254" and "Turkish_Türkiye.1254" behave identically?', which I had already assumed to be true, but I was actually asking 'Do ""Turkish_Turkey.1254", "Turkish_Türkiye.1254" and "tr-TR.1254" all behave identically?'. It sounds like the answer is probably yes, but the mention of "u" vs "ü" implied a narrower answer than I was looking for... It's important, because we're proposing to translate to "tr-TR.1254".

Since no one has come forward to test the patch I wrote on Windows, I think my next move will be to try to make a build option that can also do locale name renaming on Unix, so that I have something that I could test myself and push for the next release of PostgreSQL which will be in October.

One open question is whether we should ship a translation file ourselves. I have heard two opinions: Tom Lane proposed in this thread that there should be no file initially, we let the user figure out what to put in there. Magnus Hagander proposed at the pgconf.dev conference last week that we should ideally ship a complete translation file and maintain it over time, and it should be installed not in pgdata but rather in the install directory. I am not personally able or willing to maintain such a file, I'm only offering to supply the C code to read it and do what it says, but perhaps a group such as the EDB Installer maintainer group, or at least someone with a vested interest in PostgreSQL on Windows, might like to own that job.


From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Rahul Pandey <pandeyrah(at)microsoft(dot)com>
Cc: "Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com>, Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>, Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com>, Amy Wishnousky <amyw(at)microsoft(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Shweta Gulati <gulatishweta(at)microsoft(dot)com>, Ashish Nawal <nawalashish(at)microsoft(dot)com>, 🎯dev <targetdev(at)microsoft(dot)com>
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607
Date: 2024-06-06 23:39:08
Message-ID: CA+hUKGLikGNKvSU8x+oaxx=PWVO6H9G-7PunuKu1tM0DrM7jfg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

On Wed, Jun 5, 2024 at 9:10 PM Rahul Pandey <pandeyrah(at)microsoft(dot)com> wrote:
> "Do "Turkish_Turkey.1254", "Turkish_Türkiye.1254" and "tr-TR.1254" all behave identically?"
> You are correct, they all behave in exactly the same way.
> I too would recommend using "tr-TR.1254" since BCP47 tags have a defined structure and this would make the implementation more future proof.

Thanks for confirming. More soon.


From: "Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com>
To: Rahul Pandey <pandeyrah(at)microsoft(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>, Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com>, Amy Wishnousky <amyw(at)microsoft(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Shweta Gulati <gulatishweta(at)microsoft(dot)com>, Ashish Nawal <nawalashish(at)microsoft(dot)com>, 🎯dev <targetdev(at)microsoft(dot)com>
Subject: RE: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607
Date: 2024-06-13 18:46:31
Message-ID: PH8PR21MB3902ED71EBC45EF204D5567CE5C12@PH8PR21MB3902.namprd21.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Hi Thomas and Tom,

Just would like to follow up you about the bug. Is there any update? Any other information needed?

Thanks!
Haifang

From: Rahul Pandey <pandeyrah(at)microsoft(dot)com>
Sent: Wednesday, June 5, 2024 2:11 AM
To: Haifang Wang (Centific Technologies Inc) <v-haiwang(at)microsoft(dot)com>; Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>; Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com>; Amy Wishnousky <amyw(at)microsoft(dot)com>; Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>; pgsql-bugs(at)lists(dot)postgresql(dot)org; Shweta Gulati <gulatishweta(at)microsoft(dot)com>; Ashish Nawal <nawalashish(at)microsoft(dot)com>; 🎯dev <targetdev(at)microsoft(dot)com>
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607

Hi,

"Do "Turkish_Turkey.1254", "Turkish_Türkiye.1254" and "tr-TR.1254" all behave identically?"
You are correct, they all behave in exactly the same way.
I too would recommend using "tr-TR.1254" since BCP47 tags have a defined structure and this would make the implementation more future proof.

As for translation files, apologies I neither have a preference nor expertise to comment on this.

________________________________
From: Haifang Wang (Centific Technologies Inc) <v-haiwang(at)microsoft(dot)com<mailto:v-haiwang(at)microsoft(dot)com>>
Sent: Tuesday, June 4, 2024 5:39 AM
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com<mailto:thomas(dot)munro(at)gmail(dot)com>>
Cc: Rahul Pandey <pandeyrah(at)microsoft(dot)com<mailto:pandeyrah(at)microsoft(dot)com>>; Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com<mailto:Vishwa(dot)Deepak(at)microsoft(dot)com>>; Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com<mailto:Shawn(dot)Steele(at)microsoft(dot)com>>; Amy Wishnousky <amyw(at)microsoft(dot)com<mailto:amyw(at)microsoft(dot)com>>; Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us<mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us>>; pgsql-bugs(at)lists(dot)postgresql(dot)org<mailto:pgsql-bugs(at)lists(dot)postgresql(dot)org> <pgsql-bugs(at)lists(dot)postgresql(dot)org<mailto:pgsql-bugs(at)lists(dot)postgresql(dot)org>>; Shweta Gulati <gulatishweta(at)microsoft(dot)com<mailto:gulatishweta(at)microsoft(dot)com>>; Ashish Nawal <nawalashish(at)microsoft(dot)com<mailto:nawalashish(at)microsoft(dot)com>>
Subject: RE: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607

Thanks, Thomas. I would leave the questions to Vishwa and Rahul to answer.

Regards!
Haifang

-----Original Message-----
From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com<mailto:thomas(dot)munro(at)gmail(dot)com>>
Sent: Monday, June 3, 2024 2:22 PM
To: Haifang Wang (Centific Technologies Inc) <v-haiwang(at)microsoft(dot)com<mailto:v-haiwang(at)microsoft(dot)com>>
Cc: Rahul Pandey <pandeyrah(at)microsoft(dot)com<mailto:pandeyrah(at)microsoft(dot)com>>; Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com<mailto:Vishwa(dot)Deepak(at)microsoft(dot)com>>; Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com<mailto:Shawn(dot)Steele(at)microsoft(dot)com>>; Amy Wishnousky <amyw(at)microsoft(dot)com<mailto:amyw(at)microsoft(dot)com>>; Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us<mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us>>; pgsql-bugs(at)lists(dot)postgresql(dot)org<mailto:pgsql-bugs(at)lists(dot)postgresql(dot)org>; Shweta Gulati <gulatishweta(at)microsoft(dot)com<mailto:gulatishweta(at)microsoft(dot)com>>; Ashish Nawal <nawalashish(at)microsoft(dot)com<mailto:nawalashish(at)microsoft(dot)com>>
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607

On Tue, Jun 4, 2024 at 1:00 AM Haifang Wang (Centific Technologies
Inc) <v-haiwang(at)microsoft(dot)com<mailto:v-haiwang(at)microsoft(dot)com>> wrote:
> Just would like to follow up with you about this bug. Hope Rahul and Vishwa have answered all your questions below. Did you get chance to do investigations? Any other questions?

Hi,

Thanks for all your feedback! Sorry for my late replies, I was travelling and at a conference last week (at which one of the topics was "what are we going to do about the sorry state of PostgreSQL on Windows", but that's a wider topic than this thread...)

The answers are all useful, thank you. There was just one thing I wanted to clarify. Vishwa said:

"Yes, its exactly the same code and driving data , only spelling is changed which included replcement of u with ü"

That sounds like it is answering the question 'Do "Turkish_Turkey.1254" and "Turkish_Türkiye.1254" behave identically?', which I had already assumed to be true, but I was actually asking 'Do ""Turkish_Turkey.1254", "Turkish_Türkiye.1254" and "tr-TR.1254" all behave identically?'. It sounds like the answer is probably yes, but the mention of "u" vs "ü" implied a narrower answer than I was looking for... It's important, because we're proposing to translate to "tr-TR.1254".

Since no one has come forward to test the patch I wrote on Windows, I think my next move will be to try to make a build option that can also do locale name renaming on Unix, so that I have something that I could test myself and push for the next release of PostgreSQL which will be in October.

One open question is whether we should ship a translation file ourselves. I have heard two opinions: Tom Lane proposed in this thread that there should be no file initially, we let the user figure out what to put in there. Magnus Hagander proposed at the pgconf.dev conference last week that we should ideally ship a complete translation file and maintain it over time, and it should be installed not in pgdata but rather in the install directory. I am not personally able or willing to maintain such a file, I'm only offering to supply the C code to read it and do what it says, but perhaps a group such as the EDB Installer maintainer group, or at least someone with a vested interest in PostgreSQL on Windows, might like to own that job.


From: "Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com>
To: Rahul Pandey <pandeyrah(at)microsoft(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>, Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com>, Amy Wishnousky <amyw(at)microsoft(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Shweta Gulati <gulatishweta(at)microsoft(dot)com>, Ashish Nawal <nawalashish(at)microsoft(dot)com>, 🎯dev <targetdev(at)microsoft(dot)com>
Subject: RE: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607
Date: 2024-06-17 21:06:20
Message-ID: PH8PR21MB390222476B744F5675989923E5CD2@PH8PR21MB3902.namprd21.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: Postg토토 사이트 추천SQL : Postg토토 사이트 추천SQL 메일 링리스트 : 2024-06-17 이후 PGSQL-BUGS 21:06

Hi all,

Still need to double check with you about this bug. How is it going? Any other information needed?

Thanks!
Haifang

From: Haifang Wang (Centific Technologies Inc)
Sent: Thursday, June 13, 2024 11:47 AM
To: Rahul Pandey <pandeyrah(at)microsoft(dot)com>; Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>; Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com>; Amy Wishnousky <amyw(at)microsoft(dot)com>; Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>; pgsql-bugs(at)lists(dot)postgresql(dot)org; Shweta Gulati <gulatishweta(at)microsoft(dot)com>; Ashish Nawal <nawalashish(at)microsoft(dot)com>; 🎯dev <targetdev(at)microsoft(dot)com>
Subject: RE: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607

Hi Thomas and Tom,

Just would like to follow up you about the bug. Is there any update? Any other information needed?

Thanks!
Haifang

From: Rahul Pandey <pandeyrah(at)microsoft(dot)com<mailto:pandeyrah(at)microsoft(dot)com>>
Sent: Wednesday, June 5, 2024 2:11 AM
To: Haifang Wang (Centific Technologies Inc) <v-haiwang(at)microsoft(dot)com<mailto:v-haiwang(at)microsoft(dot)com>>; Thomas Munro <thomas(dot)munro(at)gmail(dot)com<mailto:thomas(dot)munro(at)gmail(dot)com>>
Cc: Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com<mailto:Vishwa(dot)Deepak(at)microsoft(dot)com>>; Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com<mailto:Shawn(dot)Steele(at)microsoft(dot)com>>; Amy Wishnousky <amyw(at)microsoft(dot)com<mailto:amyw(at)microsoft(dot)com>>; Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us<mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us>>; pgsql-bugs(at)lists(dot)postgresql(dot)org<mailto:pgsql-bugs(at)lists(dot)postgresql(dot)org>; Shweta Gulati <gulatishweta(at)microsoft(dot)com<mailto:gulatishweta(at)microsoft(dot)com>>; Ashish Nawal <nawalashish(at)microsoft(dot)com<mailto:nawalashish(at)microsoft(dot)com>>; 🎯dev <targetdev(at)microsoft(dot)com<mailto:targetdev(at)microsoft(dot)com>>
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607

Hi,

"Do "Turkish_Turkey.1254", "Turkish_Türkiye.1254" and "tr-TR.1254" all behave identically?"
You are correct, they all behave in exactly the same way.
I too would recommend using "tr-TR.1254" since BCP47 tags have a defined structure and this would make the implementation more future proof.

As for translation files, apologies I neither have a preference nor expertise to comment on this.

________________________________
From: Haifang Wang (Centific Technologies Inc) <v-haiwang(at)microsoft(dot)com<mailto:v-haiwang(at)microsoft(dot)com>>
Sent: Tuesday, June 4, 2024 5:39 AM
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com<mailto:thomas(dot)munro(at)gmail(dot)com>>
Cc: Rahul Pandey <pandeyrah(at)microsoft(dot)com<mailto:pandeyrah(at)microsoft(dot)com>>; Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com<mailto:Vishwa(dot)Deepak(at)microsoft(dot)com>>; Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com<mailto:Shawn(dot)Steele(at)microsoft(dot)com>>; Amy Wishnousky <amyw(at)microsoft(dot)com<mailto:amyw(at)microsoft(dot)com>>; Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us<mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us>>; pgsql-bugs(at)lists(dot)postgresql(dot)org<mailto:pgsql-bugs(at)lists(dot)postgresql(dot)org> <pgsql-bugs(at)lists(dot)postgresql(dot)org<mailto:pgsql-bugs(at)lists(dot)postgresql(dot)org>>; Shweta Gulati <gulatishweta(at)microsoft(dot)com<mailto:gulatishweta(at)microsoft(dot)com>>; Ashish Nawal <nawalashish(at)microsoft(dot)com<mailto:nawalashish(at)microsoft(dot)com>>
Subject: RE: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607

Thanks, Thomas. I would leave the questions to Vishwa and Rahul to answer.

Regards!
Haifang

-----Original Message-----
From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com<mailto:thomas(dot)munro(at)gmail(dot)com>>
Sent: Monday, June 3, 2024 2:22 PM
To: Haifang Wang (Centific Technologies Inc) <v-haiwang(at)microsoft(dot)com<mailto:v-haiwang(at)microsoft(dot)com>>
Cc: Rahul Pandey <pandeyrah(at)microsoft(dot)com<mailto:pandeyrah(at)microsoft(dot)com>>; Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com<mailto:Vishwa(dot)Deepak(at)microsoft(dot)com>>; Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com<mailto:Shawn(dot)Steele(at)microsoft(dot)com>>; Amy Wishnousky <amyw(at)microsoft(dot)com<mailto:amyw(at)microsoft(dot)com>>; Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us<mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us>>; pgsql-bugs(at)lists(dot)postgresql(dot)org<mailto:pgsql-bugs(at)lists(dot)postgresql(dot)org>; Shweta Gulati <gulatishweta(at)microsoft(dot)com<mailto:gulatishweta(at)microsoft(dot)com>>; Ashish Nawal <nawalashish(at)microsoft(dot)com<mailto:nawalashish(at)microsoft(dot)com>>
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607

On Tue, Jun 4, 2024 at 1:00 AM Haifang Wang (Centific Technologies
Inc) <v-haiwang(at)microsoft(dot)com<mailto:v-haiwang(at)microsoft(dot)com>> wrote:
> Just would like to follow up with you about this bug. Hope Rahul and Vishwa have answered all your questions below. Did you get chance to do investigations? Any other questions?

Hi,

Thanks for all your feedback! Sorry for my late replies, I was travelling and at a conference last week (at which one of the topics was "what are we going to do about the sorry state of PostgreSQL on Windows", but that's a wider topic than this thread...)

The answers are all useful, thank you. There was just one thing I wanted to clarify. Vishwa said:

"Yes, its exactly the same code and driving data , only spelling is changed which included replcement of u with ü"

That sounds like it is answering the question 'Do "Turkish_Turkey.1254" and "Turkish_Türkiye.1254" behave identically?', which I had already assumed to be true, but I was actually asking 'Do ""Turkish_Turkey.1254", "Turkish_Türkiye.1254" and "tr-TR.1254" all behave identically?'. It sounds like the answer is probably yes, but the mention of "u" vs "ü" implied a narrower answer than I was looking for... It's important, because we're proposing to translate to "tr-TR.1254".

Since no one has come forward to test the patch I wrote on Windows, I think my next move will be to try to make a build option that can also do locale name renaming on Unix, so that I have something that I could test myself and push for the next release of PostgreSQL which will be in October.

One open question is whether we should ship a translation file ourselves. I have heard two opinions: Tom Lane proposed in this thread that there should be no file initially, we let the user figure out what to put in there. Magnus Hagander proposed at the pgconf.dev conference last week that we should ideally ship a complete translation file and maintain it over time, and it should be installed not in pgdata but rather in the install directory. I am not personally able or willing to maintain such a file, I'm only offering to supply the C code to read it and do what it says, but perhaps a group such as the EDB Installer maintainer group, or at least someone with a vested interest in PostgreSQL on Windows, might like to own that job.


From: "Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com>
To: Rahul Pandey <pandeyrah(at)microsoft(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>, Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com>, Amy Wishnousky <amyw(at)microsoft(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Shweta Gulati <gulatishweta(at)microsoft(dot)com>, Ashish Nawal <nawalashish(at)microsoft(dot)com>, 🎯dev <targetdev(at)microsoft(dot)com>
Subject: RE: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607
Date: 2024-06-24 22:07:30
Message-ID: DS0PR21MB390457CD865E95CB58165268E5D42@DS0PR21MB3904.namprd21.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs


Hi all,

Still need to double check with you about this bug. How is it going? Any other information needed?

Thanks!
Haifang

From: Haifang Wang (Centific Technologies Inc)
Sent: Monday, June 17, 2024 2:06 PM
To: Rahul Pandey <pandeyrah(at)microsoft(dot)com>; Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>; Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com>; Amy Wishnousky <amyw(at)microsoft(dot)com>; Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>; pgsql-bugs(at)lists(dot)postgresql(dot)org; Shweta Gulati <gulatishweta(at)microsoft(dot)com>; Ashish Nawal <nawalashish(at)microsoft(dot)com>; 🎯dev <targetdev(at)microsoft(dot)com>
Subject: RE: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607

Hi all,

Still need to double check with you about this bug. How is it going? Any other information needed?

Thanks!
Haifang

From: Haifang Wang (Centific Technologies Inc)
Sent: Thursday, June 13, 2024 11:47 AM
To: Rahul Pandey <pandeyrah(at)microsoft(dot)com<mailto:pandeyrah(at)microsoft(dot)com>>; Thomas Munro <thomas(dot)munro(at)gmail(dot)com<mailto:thomas(dot)munro(at)gmail(dot)com>>
Cc: Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com<mailto:Vishwa(dot)Deepak(at)microsoft(dot)com>>; Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com<mailto:Shawn(dot)Steele(at)microsoft(dot)com>>; Amy Wishnousky <amyw(at)microsoft(dot)com<mailto:amyw(at)microsoft(dot)com>>; Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us<mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us>>; pgsql-bugs(at)lists(dot)postgresql(dot)org<mailto:pgsql-bugs(at)lists(dot)postgresql(dot)org>; Shweta Gulati <gulatishweta(at)microsoft(dot)com<mailto:gulatishweta(at)microsoft(dot)com>>; Ashish Nawal <nawalashish(at)microsoft(dot)com<mailto:nawalashish(at)microsoft(dot)com>>; 🎯dev <targetdev(at)microsoft(dot)com<mailto:targetdev(at)microsoft(dot)com>>
Subject: RE: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607

Hi Thomas and Tom,

Just would like to follow up you about the bug. Is there any update? Any other information needed?

Thanks!
Haifang

From: Rahul Pandey <pandeyrah(at)microsoft(dot)com<mailto:pandeyrah(at)microsoft(dot)com>>
Sent: Wednesday, June 5, 2024 2:11 AM
To: Haifang Wang (Centific Technologies Inc) <v-haiwang(at)microsoft(dot)com<mailto:v-haiwang(at)microsoft(dot)com>>; Thomas Munro <thomas(dot)munro(at)gmail(dot)com<mailto:thomas(dot)munro(at)gmail(dot)com>>
Cc: Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com<mailto:Vishwa(dot)Deepak(at)microsoft(dot)com>>; Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com<mailto:Shawn(dot)Steele(at)microsoft(dot)com>>; Amy Wishnousky <amyw(at)microsoft(dot)com<mailto:amyw(at)microsoft(dot)com>>; Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us<mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us>>; pgsql-bugs(at)lists(dot)postgresql(dot)org<mailto:pgsql-bugs(at)lists(dot)postgresql(dot)org>; Shweta Gulati <gulatishweta(at)microsoft(dot)com<mailto:gulatishweta(at)microsoft(dot)com>>; Ashish Nawal <nawalashish(at)microsoft(dot)com<mailto:nawalashish(at)microsoft(dot)com>>; 🎯dev <targetdev(at)microsoft(dot)com<mailto:targetdev(at)microsoft(dot)com>>
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607

Hi,

"Do "Turkish_Turkey.1254", "Turkish_Türkiye.1254" and "tr-TR.1254" all behave identically?"
You are correct, they all behave in exactly the same way.
I too would recommend using "tr-TR.1254" since BCP47 tags have a defined structure and this would make the implementation more future proof.

As for translation files, apologies I neither have a preference nor expertise to comment on this.

________________________________
From: Haifang Wang (Centific Technologies Inc) <v-haiwang(at)microsoft(dot)com<mailto:v-haiwang(at)microsoft(dot)com>>
Sent: Tuesday, June 4, 2024 5:39 AM
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com<mailto:thomas(dot)munro(at)gmail(dot)com>>
Cc: Rahul Pandey <pandeyrah(at)microsoft(dot)com<mailto:pandeyrah(at)microsoft(dot)com>>; Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com<mailto:Vishwa(dot)Deepak(at)microsoft(dot)com>>; Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com<mailto:Shawn(dot)Steele(at)microsoft(dot)com>>; Amy Wishnousky <amyw(at)microsoft(dot)com<mailto:amyw(at)microsoft(dot)com>>; Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us<mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us>>; pgsql-bugs(at)lists(dot)postgresql(dot)org<mailto:pgsql-bugs(at)lists(dot)postgresql(dot)org> <pgsql-bugs(at)lists(dot)postgresql(dot)org<mailto:pgsql-bugs(at)lists(dot)postgresql(dot)org>>; Shweta Gulati <gulatishweta(at)microsoft(dot)com<mailto:gulatishweta(at)microsoft(dot)com>>; Ashish Nawal <nawalashish(at)microsoft(dot)com<mailto:nawalashish(at)microsoft(dot)com>>
Subject: RE: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607

Thanks, Thomas. I would leave the questions to Vishwa and Rahul to answer.

Regards!
Haifang

-----Original Message-----
From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com<mailto:thomas(dot)munro(at)gmail(dot)com>>
Sent: Monday, June 3, 2024 2:22 PM
To: Haifang Wang (Centific Technologies Inc) <v-haiwang(at)microsoft(dot)com<mailto:v-haiwang(at)microsoft(dot)com>>
Cc: Rahul Pandey <pandeyrah(at)microsoft(dot)com<mailto:pandeyrah(at)microsoft(dot)com>>; Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com<mailto:Vishwa(dot)Deepak(at)microsoft(dot)com>>; Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com<mailto:Shawn(dot)Steele(at)microsoft(dot)com>>; Amy Wishnousky <amyw(at)microsoft(dot)com<mailto:amyw(at)microsoft(dot)com>>; Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us<mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us>>; pgsql-bugs(at)lists(dot)postgresql(dot)org<mailto:pgsql-bugs(at)lists(dot)postgresql(dot)org>; Shweta Gulati <gulatishweta(at)microsoft(dot)com<mailto:gulatishweta(at)microsoft(dot)com>>; Ashish Nawal <nawalashish(at)microsoft(dot)com<mailto:nawalashish(at)microsoft(dot)com>>
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607

On Tue, Jun 4, 2024 at 1:00 AM Haifang Wang (Centific Technologies
Inc) <v-haiwang(at)microsoft(dot)com<mailto:v-haiwang(at)microsoft(dot)com>> wrote:
> Just would like to follow up with you about this bug. Hope Rahul and Vishwa have answered all your questions below. Did you get chance to do investigations? Any other questions?

Hi,

Thanks for all your feedback! Sorry for my late replies, I was travelling and at a conference last week (at which one of the topics was "what are we going to do about the sorry state of PostgreSQL on Windows", but that's a wider topic than this thread...)

The answers are all useful, thank you. There was just one thing I wanted to clarify. Vishwa said:

"Yes, its exactly the same code and driving data , only spelling is changed which included replcement of u with ü"

That sounds like it is answering the question 'Do "Turkish_Turkey.1254" and "Turkish_Türkiye.1254" behave identically?', which I had already assumed to be true, but I was actually asking 'Do ""Turkish_Turkey.1254", "Turkish_Türkiye.1254" and "tr-TR.1254" all behave identically?'. It sounds like the answer is probably yes, but the mention of "u" vs "ü" implied a narrower answer than I was looking for... It's important, because we're proposing to translate to "tr-TR.1254".

Since no one has come forward to test the patch I wrote on Windows, I think my next move will be to try to make a build option that can also do locale name renaming on Unix, so that I have something that I could test myself and push for the next release of PostgreSQL which will be in October.

One open question is whether we should ship a translation file ourselves. I have heard two opinions: Tom Lane proposed in this thread that there should be no file initially, we let the user figure out what to put in there. Magnus Hagander proposed at the pgconf.dev conference last week that we should ideally ship a complete translation file and maintain it over time, and it should be installed not in pgdata but rather in the install directory. I am not personally able or willing to maintain such a file, I'm only offering to supply the C code to read it and do what it says, but perhaps a group such as the EDB Installer maintainer group, or at least someone with a vested interest in PostgreSQL on Windows, might like to own that job.


From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: "Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com>
Cc: Rahul Pandey <pandeyrah(at)microsoft(dot)com>, Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>, Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com>, Amy Wishnousky <amyw(at)microsoft(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Shweta Gulati <gulatishweta(at)microsoft(dot)com>, Ashish Nawal <nawalashish(at)microsoft(dot)com>
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607
Date: 2024-06-25 11:51:38
Message-ID: CA+hUKGJO7Ma=30ayjwAhFPQ41Hej_mSsCUb3Kr22i-cr3qxM0w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

On Tue, Jun 4, 2024 at 9:22 AM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
> Since no one has come forward to test the patch I wrote on Windows, I
> think my next move will be to try to make a build option that can also
> do locale name renaming on Unix, so that I have something that I could
> test myself and push for the next release of PostgreSQL which will be
> in October.

Here is a such a patch. If you go into pg_config_manual.h and
uncomment this line:

/* #define DEBUG_SETLOCALE_MAP */

... then Unix systems will also be able to rename locales passed to
setlocale(). A map file can be provided either by putting its
absolute path into the environment variable PG_SETLOCALE_MAP, or by
installing it as $PREFIX/share/postgresql/setlocale.map. I couldn't
immediately think of a good way to find it in the data directory.

Here's an example of a line that should fix the Turkish problem
(though I haven't tested that, I am not a Windows user):

Turkish_T*.1254=tr-TR.1254

I added some documentation and showed that example.

If you wanted to check it's working on a Unix system, you might try
some lines like *.utf-8=does_not_exist or en_US.utf-8=fr_FR.utf-8 and
then somehow verify that it's using French.

I considered adding win32setlocale.c to the list of files to build for
the port libraries even on Unix, and then wrapping the contents in
#ifdef, but IIUC macOS squawks if you have an empty .c after
preprocessing, so I'd have to add a dummy symbol in there. Or maybe
that'd be better than what I did here, namely including
win32setlocale.c in chklocale.c in this case. Better ideas welcome.
Adding a meson/configure switch to enable it and make the whole .c
file optional seemed excessive.

Attachment Content-Type Size
v2-0001-Make-win32locale.c-s-kludge-table-dynamic.patch text/x-patch 19.4 KB

From: "Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Rahul Pandey <pandeyrah(at)microsoft(dot)com>, Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>, Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com>, Amy Wishnousky <amyw(at)microsoft(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Shweta Gulati <gulatishweta(at)microsoft(dot)com>, Ashish Nawal <nawalashish(at)microsoft(dot)com>
Subject: RE: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607
Date: 2024-06-25 22:24:21
Message-ID: PH8PR21MB390233F7BD9F7D8E6D4B06F3E5D52@PH8PR21MB3902.namprd21.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Thanks, Thomas. Will wait for the build option to do further testing.

Regards!
Haifang

-----Original Message-----
From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Sent: Tuesday, June 25, 2024 4:52 AM
To: Haifang Wang (Centific Technologies Inc) <v-haiwang(at)microsoft(dot)com>
Cc: Rahul Pandey <pandeyrah(at)microsoft(dot)com>; Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>; Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com>; Amy Wishnousky <amyw(at)microsoft(dot)com>; Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>; pgsql-bugs(at)lists(dot)postgresql(dot)org; Shweta Gulati <gulatishweta(at)microsoft(dot)com>; Ashish Nawal <nawalashish(at)microsoft(dot)com>
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607

On Tue, Jun 4, 2024 at 9:22 AM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
> Since no one has come forward to test the patch I wrote on Windows, I
> think my next move will be to try to make a build option that can also
> do locale name renaming on Unix, so that I have something that I could
> test myself and push for the next release of PostgreSQL which will be
> in October.

Here is a such a patch. If you go into pg_config_manual.h and uncomment this line:

/* #define DEBUG_SETLOCALE_MAP */

... then Unix systems will also be able to rename locales passed to setlocale(). A map file can be provided either by putting its absolute path into the environment variable PG_SETLOCALE_MAP, or by installing it as $PREFIX/share/postgresql/setlocale.map. I couldn't immediately think of a good way to find it in the data directory.

Here's an example of a line that should fix the Turkish problem (though I haven't tested that, I am not a Windows user):

Turkish_T*.1254=tr-TR.1254

I added some documentation and showed that example.

If you wanted to check it's working on a Unix system, you might try some lines like *.utf-8=does_not_exist or en_US.utf-8=fr_FR.utf-8 and then somehow verify that it's using French.

I considered adding win32setlocale.c to the list of files to build for the port libraries even on Unix, and then wrapping the contents in #ifdef, but IIUC macOS squawks if you have an empty .c after preprocessing, so I'd have to add a dummy symbol in there. Or maybe that'd be better than what I did here, namely including win32setlocale.c in chklocale.c in this case. Better ideas welcome.
Adding a meson/configure switch to enable it and make the whole .c file optional seemed excessive.


From: "Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Windows Application Issues | PostgreSQL | REF # 51961374
Date: 2024-07-05 22:00:11
Message-ID: PH8PR21MB3902AFBFF4D859BD631E9FA8E5DF2@PH8PR21MB3902.namprd21.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Hi Team,

I’m a program manager in the Windows App Assure ISV Outreach Team at Microsoft. We work with Microsoft’s test organization to notify developers when issues have been identified in their applications. We’re reaching out to notify you of a potential issue in one of your applications.

The issue details are below, please review it when you have a moment. Our goal is to work with you to address this issue and to understand what your expected timeline to address this issue might be. If you have any questions about the details below or have already addressed this issue in a forthcoming update, please let me know.

Account

토토 사이트

Product

PostgreSQL

Reference #

51961374

Issue

Fail to work with Find and Replace options within the application with latest version of app

Environment: Desktop
OS: Windows 11
Bamboo Paper: 16.3-2

Repro Steps:

1. Deploy OS.
2. Perform WU, enable roles & features.
3. Download the application: PostgreSQL from vendor link .
4. Run the setup file.
5. After the installation, Restart the machine.
6. Launch the PostgreSQL from the start menu.
7. Connect to the database.
8. Go to Tool menu ->click on query tool option.
9. Write the sql query.
10. Click edit option ->Try to work with find and replace options.
11. Observe.

Observations:

Fail to work with Find and Replace options within the application with latest version of app

Expected Results:

Should be able to work with Find and Replace options successfully without any issue.

Resource

For any questions on app development (or) submission on windows, contact Windows Dev Center. https://developer.microsoft.com/en-us/windows/support

Thanks!

Haifang


From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: "Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: Windows Application Issues | PostgreSQL | REF # 51961374
Date: 2024-07-07 07:16:56
Message-ID: CA+hUKGLcqTcWWZ8jpUEytysOhBvSFfiF+C1vnO5FL7WrP4oyxw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

On Sat, Jul 6, 2024 at 10:00 AM Haifang Wang (Centific Technologies
Inc) <v-haiwang(at)microsoft(dot)com> wrote:
> Launch the PostgreSQL from the start menu.
> Connect to the database.
> Go to Tool menu ->click on query tool option.
> Write the sql query.
> Click edit option ->Try to work with find and replace options.

Hi Haifang,

It sounds like this might be referring to "pgadmin", a GUI tool for
talking to PostgreSQL? PostgreSQL itself doesn't have a GUI, but the
associated pgadmin project does, and it has its own place for bug
reporting over here:

https://www.pgadmin.org/support/issues/


From: "Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: RE: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 51961374
Date: 2024-07-08 15:52:40
Message-ID: PH8PR21MB3902099523EACA724AF8E035E5DA2@PH8PR21MB3902.namprd21.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Thanks a lot for pointing this out, Thomas. I will report the issue to pgadmin.

Regards!
Haifang

-----Original Message-----
From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Sent: Sunday, July 7, 2024 12:17 AM
To: Haifang Wang (Centific Technologies Inc) <v-haiwang(at)microsoft(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>; pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 51961374

On Sat, Jul 6, 2024 at 10:00 AM Haifang Wang (Centific Technologies
Inc) <v-haiwang(at)microsoft(dot)com> wrote:
> Launch the PostgreSQL from the start menu.
> Connect to the database.
> Go to Tool menu ->click on query tool option.
> Write the sql query.
> Click edit option ->Try to work with find and replace options.

Hi Haifang,

It sounds like this might be referring to "pgadmin", a GUI tool for talking to PostgreSQL? PostgreSQL itself doesn't have a GUI, but the associated pgadmin project does, and it has its own place for bug reporting over here:

https://www.pgadmin.org/support/issues/


From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: "Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com>
Cc: Rahul Pandey <pandeyrah(at)microsoft(dot)com>, Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>, Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com>, Amy Wishnousky <amyw(at)microsoft(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Shweta Gulati <gulatishweta(at)microsoft(dot)com>, Ashish Nawal <nawalashish(at)microsoft(dot)com>
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607
Date: 2024-07-21 11:54:37
Message-ID: CA+hUKGJTOgnTzu4VD6Am0X6g67atkQHFVk+C-w5wkGrGiao-=Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

A minor release deadline is only a couple of weeks away and
unfortunately we still have no confirmation from a Windows user that
this technique can be used to unbreak a database cluster that actually
goes though that exact upgrade sequence. So I spent today figuring
out how to write a TAP test to try to convince myself of that in the
abstract. No C code change from v2. Does the test show a useful
enough pathway that is likely to help people in future cases at least?

I learned two new things:

1. With a remapped locale, you can log into your existing databases
again, but CREATE DATABASE doesn't work (see test). That must have
been the case for the historical transitions that were hard-coded
(Macau etc); this is no different, it's just that the mapping table is
now user-changeable. In other words, this just gives an administrator
a way to log into a cluster again after such a change, but then they
really need to update the names in catalogs, at least for any database
that might be used as a template, to get a 100% functional system. So
I have added a paragraph to the documentation piece to explain that.

2. Including the encoding suffix eg "en-US.1252" doesn't actually
work after all (discovered by CI). You have to use just "en-US". (I
guess ".utf-8" must be a special case.) So I have updated the
documentation example.

Attachment Content-Type Size
v3-0001-Make-win32locale.c-s-kludge-table-dynamic.patch text/x-patch 26.2 KB
v3-0002-ci-Enable-DEBUG_SETLOCALE_MAP-on-FreeBSD-task.patch text/x-patch 1.5 KB

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, "Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com>
Cc: Rahul Pandey <pandeyrah(at)microsoft(dot)com>, Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>, Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com>, Amy Wishnousky <amyw(at)microsoft(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Shweta Gulati <gulatishweta(at)microsoft(dot)com>, Ashish Nawal <nawalashish(at)microsoft(dot)com>
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607
Date: 2024-07-21 13:28:15
Message-ID: 0ba29a3c-9c1c-44c4-84a0-3405f8464075@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs


On 2024-07-21 Su 7:54 AM, Thomas Munro wrote:
> A minor release deadline is only a couple of weeks away and
> unfortunately we still have no confirmation from a Windows user that
> this technique can be used to unbreak a database cluster that actually
> goes though that exact upgrade sequence. So I spent today figuring
> out how to write a TAP test to try to convince myself of that in the
> abstract. No C code change from v2. Does the test show a useful
> enough pathway that is likely to help people in future cases at least?
>
> I learned two new things:
>
> 1. With a remapped locale, you can log into your existing databases
> again, but CREATE DATABASE doesn't work (see test). That must have
> been the case for the historical transitions that were hard-coded
> (Macau etc); this is no different, it's just that the mapping table is
> now user-changeable. In other words, this just gives an administrator
> a way to log into a cluster again after such a change, but then they
> really need to update the names in catalogs, at least for any database
> that might be used as a template, to get a 100% functional system. So
> I have added a paragraph to the documentation piece to explain that.
>
> 2. Including the encoding suffix eg "en-US.1252" doesn't actually
> work after all (discovered by CI). You have to use just "en-US". (I
> guess ".utf-8" must be a special case.) So I have updated the
> documentation example.

I don't have a general comment, at least yet, but a couple, mostly
stylistic, on your TAP test.

change

    if ($^O eq 'MSWin32')

to

    if ($windows_os)

Also your use of "is()" is unidiomatic. The expected value should come
after the received value, not before. While your use won't change the
success or failure of the result, the message on failure will be at best
confusing. (I ran into this the other day, so it's fresh in my memory).
See Test::More documentation.

Using all upper case for variables is not our usual perl style (or
anybody else's that I am aware of). Mostly we just use lower case with
underscores. If you use the "constant" module upper case letters are
usually used, but then the things defined are bare words, and come with
some caveats, and I don't think you want to use it here.

The substance of the test looks good.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com


From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: "Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com>, Rahul Pandey <pandeyrah(at)microsoft(dot)com>, Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>, Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com>, Amy Wishnousky <amyw(at)microsoft(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Shweta Gulati <gulatishweta(at)microsoft(dot)com>, Ashish Nawal <nawalashish(at)microsoft(dot)com>
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607
Date: 2024-07-21 22:00:28
Message-ID: CA+hUKGJasTBQ1THqDeeVdeRSFao3+T1LGnwJfqmT4sR-WmTebQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

On Mon, Jul 22, 2024 at 1:28 AM Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
> change
>
> if ($^O eq 'MSWin32')
>
> to
>
> if ($windows_os)

Done.

> Also your use of "is()" is unidiomatic. The expected value should come
> after the received value, not before. While your use won't change the
> success or failure of the result, the message on failure will be at best
> confusing. (I ran into this the other day, so it's fresh in my memory).
> See Test::More documentation.

Fixed.

> Using all upper case for variables is not our usual perl style (or
> anybody else's that I am aware of). Mostly we just use lower case with
> underscores. If you use the "constant" module upper case letters are
> usually used, but then the things defined are bare words, and come with
> some caveats, and I don't think you want to use it here.

Fixed.

> The substance of the test looks good.

Thanks for looking!

One new-to-me fact from the -general mailing list this morning[1] is
that the EDB installer is explicitly feeding these unstable locale
names into the world's database clusters, so that's something that's
beyond the power of us upstream PostgreSQL hackers to fix even if we
pulled the trigger on the initdb BCP47-by-default patch[2]. Something
for someone at EDB to look into?

[1] /message-id/flat/CA%2BhUKGJA4pGEeZtfUUEuQmNNO7x16wZ%2BRWOeXP%2Btq3Hf4iu8oQ%40mail.gmail.com#6426ff9fd44fbcd9a7f343a3f94ae364
[2] /message-id/flat/CA+hUKGJ=XThErgAQRoqfCy1bKPxXVuF0=2zDbB+SxDs59pv7Fw(at)mail(dot)gmail(dot)com

Attachment Content-Type Size
v4-0001-Make-win32locale.c-s-kludge-table-dynamic.patch text/x-patch 26.3 KB
v4-0002-ci-Enable-DEBUG_SETLOCALE_MAP-on-FreeBSD-task.patch text/x-patch 1.5 KB

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Sandeep Thakkar <sandeep(dot)thakkar(at)enterprisedb(dot)com>
Cc: "Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com>, Rahul Pandey <pandeyrah(at)microsoft(dot)com>, Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>, Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com>, Amy Wishnousky <amyw(at)microsoft(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Shweta Gulati <gulatishweta(at)microsoft(dot)com>, Ashish Nawal <nawalashish(at)microsoft(dot)com>
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607
Date: 2024-07-22 11:24:21
Message-ID: dd630b4a-e136-42d4-814e-954f9c03f90a@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs


On 2024-07-21 Su 6:00 PM, Thomas Munro wrote:
> One new-to-me fact from the -general mailing list this morning[1] is
> that the EDB installer is explicitly feeding these unstable locale
> names into the world's database clusters, so that's something that's
> beyond the power of us upstream PostgreSQL hackers to fix even if we
> pulled the trigger on the initdb BCP47-by-default patch[2]. Something
> for someone at EDB to look into?
>
> [1]/message-id/flat/CA%2BhUKGJA4pGEeZtfUUEuQmNNO7x16wZ%2BRWOeXP%2Btq3Hf4iu8oQ%40mail.gmail.com#6426ff9fd44fbcd9a7f343a3f94ae364
> [2]/message-id/flat/CA+hUKGJ=XThErgAQRoqfCy1bKPxXVuF0=2zDbB+SxDs59pv7Fw(at)mail(dot)gmail(dot)com

Sandeep, can you comment?

cheers

andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com


From: Sandeep Thakkar <sandeep(dot)thakkar(at)enterprisedb(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, "Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com>, Rahul Pandey <pandeyrah(at)microsoft(dot)com>, Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>, Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com>, Amy Wishnousky <amyw(at)microsoft(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Shweta Gulati <gulatishweta(at)microsoft(dot)com>, Ashish Nawal <nawalashish(at)microsoft(dot)com>
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607
Date: 2024-07-22 12:12:41
Message-ID: CANFyU963Cf1RK0AZ99muoKWCZKGB0+RTMVeZ5J57upksR1qgnQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Hi

On Mon, Jul 22, 2024 at 4:54 PM Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:

>
> On 2024-07-21 Su 6:00 PM, Thomas Munro wrote:
>
> One new-to-me fact from the -general mailing list this morning[1] is
> that the EDB installer is explicitly feeding these unstable locale
> names into the world's database clusters, so that's something that's
> beyond the power of us upstream PostgreSQL hackers to fix even if we
> pulled the trigger on the initdb BCP47-by-default patch[2]. Something
> for someone at EDB to look into?
>
> [1] /message-id/flat/CA%2BhUKGJA4pGEeZtfUUEuQmNNO7x16wZ%2BRWOeXP%2Btq3Hf4iu8oQ%40mail.gmail.com#6426ff9fd44fbcd9a7f343a3f94ae364
> [2] /message-id/flat/CA+hUKGJ=XThErgAQRoqfCy1bKPxXVuF0=2zDbB+SxDs59pv7Fw(at)mail(dot)gmail(dot)com
>
>
> Sandeep, can you comment?
>
>
> I responded to it at
/message-id/CANFyU95Le%2BtrZbtB1D4Y1FdUu4y_OJLxvJMi1Zb9WOWuf8KpqA%40mail.gmail.com
When I checked the installation log shared by Ertan at
https://github.com/EnterpriseDB/edb-installers/issues/127, I see installer
is using the correct locale name while doing initdb.

Executing C:\Windows\System32\cscript //NoLogo "C:\Program
Files\PostgreSQL\16/installer/server/initcluster.vbs" "NT
AUTHORITY\NetworkService" "postgres" "****"
"C:\Users\User1\AppData\Local\Temp/postgresql_installer_cd79fad8b7"
"C:\Program Files\PostgreSQL\16" "C:\DATA_PG16" 5432 "Turkish,Türkiye" 0

Thomas, I can generate the test installer with your patch and share here so
this issue can be tested before the minor release?

cheers
>
>
> andrew
>
> --
> Andrew Dunstan
> EDB: https://www.enterprisedb.com
>
>

--
Sandeep Thakkar


From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Sandeep Thakkar <sandeep(dot)thakkar(at)enterprisedb(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, "Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com>, Rahul Pandey <pandeyrah(at)microsoft(dot)com>, Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>, Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com>, Amy Wishnousky <amyw(at)microsoft(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Shweta Gulati <gulatishweta(at)microsoft(dot)com>, Ashish Nawal <nawalashish(at)microsoft(dot)com>
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607
Date: 2024-07-25 06:08:49
Message-ID: CA+hUKGKa7R23AqDFnLymLuLU0E7fjCnd0=SBe50VXonYLG0EiA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

On Tue, Jul 23, 2024 at 12:12 AM Sandeep Thakkar
<sandeep(dot)thakkar(at)enterprisedb(dot)com> wrote:
> Thomas, I can generate the test installer with your patch and share here so this issue can be tested before the minor release?

That might be useful, thanks! If someone wants to try it out on a
real Windows system.

. o O { I wonder if it could ever be possible to have that automated
via CI. Imagine if I/anyone could clone the EDB installer github
repo, and push a small patch into a private branch that tells it where
to pull PostgreSQL sources from, including any public git branch
including commitfest ones, and some options like with/without
assertions, and have CI spit out a new installer as an artifact?
Hopefully non-REL_XXX_STABLE branches would have a big warning on the
GUI so that users would never accidentally think it's a real release.
}


From: Sandeep Thakkar <sandeep(dot)thakkar(at)enterprisedb(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, "Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com>, Rahul Pandey <pandeyrah(at)microsoft(dot)com>, Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>, Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com>, Amy Wishnousky <amyw(at)microsoft(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Shweta Gulati <gulatishweta(at)microsoft(dot)com>, Ashish Nawal <nawalashish(at)microsoft(dot)com>
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607
Date: 2024-07-31 09:53:42
Message-ID: CANFyU94EO+_9Qj5CDK=Ei6pxXyZu9FMOw8rJ1zr-VqkY3fq+0g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

On Thu, Jul 25, 2024 at 11:39 AM Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
wrote:

> On Tue, Jul 23, 2024 at 12:12 AM Sandeep Thakkar
> <sandeep(dot)thakkar(at)enterprisedb(dot)com> wrote:
> > Thomas, I can generate the test installer with your patch and share here
> so this issue can be tested before the minor release?
>
> That might be useful, thanks! If someone wants to try it out on a
> real Windows system.
>

Sure, I can give it a try. Is your patch ready?

>
> . o O { I wonder if it could ever be possible to have that automated
> via CI. Imagine if I/anyone could clone the EDB installer github
> repo, and push a small patch into a private branch that tells it where
> to pull PostgreSQL sources from, including any public git branch
> including commitfest ones, and some options like with/without
> assertions, and have CI spit out a new installer as an artifact?
> Hopefully non-REL_XXX_STABLE branches would have a big warning on the
> GUI so that users would never accidentally think it's a real release.
> }
>

yeah, that'll be helpful but unfortunately it's not possible atm.
Hopefully, we will
have that available in future through the GitHub actions.

--
Sandeep Thakkar


From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Sandeep Thakkar <sandeep(dot)thakkar(at)enterprisedb(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, "Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com>, Rahul Pandey <pandeyrah(at)microsoft(dot)com>, Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>, Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com>, Amy Wishnousky <amyw(at)microsoft(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Shweta Gulati <gulatishweta(at)microsoft(dot)com>, Ashish Nawal <nawalashish(at)microsoft(dot)com>
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607
Date: 2024-08-08 02:35:11
Message-ID: CA+hUKGJtx=o0Td4JjJ9-P+HMuKZpf6cnmu=_dx9hr2THC08wsQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

I am going to withdraw this proposal. In another email thread, we
learned two very important things:

1. The problem of a locale names changing for an *existing* database
can apparently be fixed using "Locale Builder"[1]. Users in Turkey
have presumably already addressed the situation that way and there are
reports of articles about that in Turkish. We don't really need the
"setlocale.map" rename-on-the-fly patch in this thread, and I don't
want to maintain it if we don't need it.

2. The Windows runtime library is crashing when initdb's subprocess
postgres.exe calls it with a name encoded in an encoding it doesn't
like[2]. (IMHO it should return NULL instead! The C standard does
not have a demons-may-fly-out-of-your-nose clause.) We need to tell
users not to use those locale names. That string is actually coming
from the EDB installer, and we'll follow up on that. Perhaps
PostgreSQL should also consider stripping non-ASCII characters to
force it to fail nicely.

[1] /message-id/CA%2BhUKGLyfou2xa2fxGOs0WwkOrcKjwqn-bRNR01Hq5RAD16ueg%40mail.gmail.com
[2] /message-id/CA%2BhUKG%2BFxeRLURZ%3Dn8NPyLwgjFds_SqU_cQvE40ks6RQKUGbGg%40mail.gmail.com


From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Sandeep Thakkar <sandeep(dot)thakkar(at)enterprisedb(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607
Date: 2024-08-08 23:45:12
Message-ID: CA+hUKGJiEL-yFcpSRFkJLp-W4s7LMpAL7hhD02kLLV2cviHgCA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

On Wed, Jul 31, 2024 at 9:53 PM Sandeep Thakkar
<sandeep(dot)thakkar(at)enterprisedb(dot)com> wrote:
> On Thu, Jul 25, 2024 at 11:39 AM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
>> On Tue, Jul 23, 2024 at 12:12 AM Sandeep Thakkar
>> <sandeep(dot)thakkar(at)enterprisedb(dot)com> wrote:
>> . o O { I wonder if it could ever be possible to have that automated
>> via CI. Imagine if I/anyone could clone the EDB installer github
>> repo, and push a small patch into a private branch that tells it where
>> to pull PostgreSQL sources from, including any public git branch
>> including commitfest ones, and some options like with/without
>> assertions, and have CI spit out a new installer as an artifact?
>> Hopefully non-REL_XXX_STABLE branches would have a big warning on the
>> GUI so that users would never accidentally think it's a real release.
>> }
>
> yeah, that'll be helpful but unfortunately it's not possible atm. Hopefully, we will
> have that available in future through the GitHub actions.

Sounds great.

FWIW it's easy[1][2] to make the PostgreSQL CI spit out a zip file
with a snapshot build of PostgreSQL. I expect it doesn't actually run
on a real Windows system unless you also install DLLs it needs like
OpenSSL, ICU etc, and I'm not sure if that would be best done by
stuffing them too into the .zip file, or if the scripts[3] that the CI
images are prepared with should be used by someone who wants to test
it. But thinking about that leads me back to the thought that it'd be
better if people interested in trying pre-releases could get an EDB
installer for a snapshot build from any branch anywhere, so it all
works the way they will actually use it in real life. And of course
you guys already did all the hard work to deal with dependencies.

[1] https://github.com/macdice/postgres/commit/56e967f9cd35ef418a653d191a4d9ea85055d94b
[2] https://cirrus-ci.com/task/4831619775725568
[3] https://github.com/anarazel/pg-vm-images/tree/main/scripts


From: "Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Sandeep Thakkar <sandeep(dot)thakkar(at)enterprisedb(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Rahul Pandey <pandeyrah(at)microsoft(dot)com>, Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>, Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com>, Amy Wishnousky <amyw(at)microsoft(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Shweta Gulati <gulatishweta(at)microsoft(dot)com>, Ashish Nawal <nawalashish(at)microsoft(dot)com>
Subject: RE: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607
Date: 2024-08-26 21:29:37
Message-ID: PH8PR21MB3902C154DA07EC6206E9372BE58B2@PH8PR21MB3902.namprd21.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Hi Thomas and Andrew,

Just would like to follow up with you about this bug. How is it going with the investigation?

Thanks!
Haifang

-----Original Message-----
From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Sent: Wednesday, August 7, 2024 7:35 PM
To: Sandeep Thakkar <sandeep(dot)thakkar(at)enterprisedb(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>; Haifang Wang (Centific Technologies Inc) <v-haiwang(at)microsoft(dot)com>; Rahul Pandey <pandeyrah(at)microsoft(dot)com>; Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>; Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com>; Amy Wishnousky <amyw(at)microsoft(dot)com>; Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>; pgsql-bugs(at)lists(dot)postgresql(dot)org; Shweta Gulati <gulatishweta(at)microsoft(dot)com>; Ashish Nawal <nawalashish(at)microsoft(dot)com>
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607

I am going to withdraw this proposal. In another email thread, we learned two very important things:

1. The problem of a locale names changing for an *existing* database can apparently be fixed using "Locale Builder"[1]. Users in Turkey have presumably already addressed the situation that way and there are reports of articles about that in Turkish. We don't really need the "setlocale.map" rename-on-the-fly patch in this thread, and I don't want to maintain it if we don't need it.

2. The Windows runtime library is crashing when initdb's subprocess postgres.exe calls it with a name encoded in an encoding it doesn't like[2]. (IMHO it should return NULL instead! The C standard does not have a demons-may-fly-out-of-your-nose clause.) We need to tell users not to use those locale names. That string is actually coming from the EDB installer, and we'll follow up on that. Perhaps PostgreSQL should also consider stripping non-ASCII characters to force it to fail nicely.

[1] /message-id/CA%2BhUKGLyfou2xa2fxGOs0WwkOrcKjwqn-bRNR01Hq5RAD16ueg%40mail.gmail.com
[2] /message-id/CA%2BhUKG%2BFxeRLURZ%3Dn8NPyLwgjFds_SqU_cQvE40ks6RQKUGbGg%40mail.gmail.com


From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: "Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com>
Cc: Sandeep Thakkar <sandeep(dot)thakkar(at)enterprisedb(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Rahul Pandey <pandeyrah(at)microsoft(dot)com>, Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>, Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com>, Amy Wishnousky <amyw(at)microsoft(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Shweta Gulati <gulatishweta(at)microsoft(dot)com>, Ashish Nawal <nawalashish(at)microsoft(dot)com>
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607
Date: 2024-08-28 10:16:11
Message-ID: CA+hUKG+yLnVjc7qw1ac32GCzRUenXrK=3=nCvwP-LJBL8wVTEg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

On Tue, Aug 27, 2024 at 9:29 AM Haifang Wang (Centific Technologies
Inc) <v-haiwang(at)microsoft(dot)com> wrote:
> Just would like to follow up with you about this bug. How is it going with the investigation?

Well, we figured out that Windows is aborting inside setlocale(). The
C standard says "The pointer to string returned by the setlocale
function is such that a subsequent call with that string value and its
associated category will restore that part of the program’s locale."
Window does not conform, and crashes itself.

One practical change we could make would be to reject non-ASCII locale
names in initdb, as in the attached, and back-patch to 16. Then the
EDB installer (a separate project that is calling initdb and providing
the non-ASCII string containing "ü") would at least get a graceful
failure instead of a crash. It hardly makes a difference to the user,
but I've finally understood that these crashes in the Windows runtime
library might be generating a stream of crash reports to Microsoft,
which might be bringing you here?

For example:

$ initdb -D asdf --locale="liberté"
The files belonging to this database system will be owned by user "tmunro".
This user must also own the server process.

initdb: error: locale name "liberté" contains non-ASCII characters

Attachment Content-Type Size
0001-initdb-Reject-non-ASCII-locale-names.patch text/x-patch 1.7 KB

From: Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, "Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com>
Cc: Sandeep Thakkar <sandeep(dot)thakkar(at)enterprisedb(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Rahul Pandey <pandeyrah(at)microsoft(dot)com>, Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>, Amy Wishnousky <amyw(at)microsoft(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Shweta Gulati <gulatishweta(at)microsoft(dot)com>, Ashish Nawal <nawalashish(at)microsoft(dot)com>, Peter Constable <pconstable(at)microsoft(dot)com>
Subject: RE: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607
Date: 2024-08-28 19:34:22
Message-ID: SJ0PR00MB131822273FA789E742832C3A82952@SJ0PR00MB1318.namprd00.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

+Peter Constable, FYI

-----Original Message-----
From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Sent: Wednesday, August 28, 2024 3:16 AM
To: Haifang Wang (Centific Technologies Inc) <v-haiwang(at)microsoft(dot)com>
Cc: Sandeep Thakkar <sandeep(dot)thakkar(at)enterprisedb(dot)com>; Andrew Dunstan <andrew(at)dunslane(dot)net>; Rahul Pandey <pandeyrah(at)microsoft(dot)com>; Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>; Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com>; Amy Wishnousky <amyw(at)microsoft(dot)com>; Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>; pgsql-bugs(at)lists(dot)postgresql(dot)org; Shweta Gulati <gulatishweta(at)microsoft(dot)com>; Ashish Nawal <nawalashish(at)microsoft(dot)com>
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607

On Tue, Aug 27, 2024 at 9:29 AM Haifang Wang (Centific Technologies
Inc) <v-haiwang(at)microsoft(dot)com> wrote:
> Just would like to follow up with you about this bug. How is it going with the investigation?

Well, we figured out that Windows is aborting inside setlocale(). The C standard says "The pointer to string returned by the setlocale function is such that a subsequent call with that string value and its associated category will restore that part of the program’s locale."
Window does not conform, and crashes itself.

One practical change we could make would be to reject non-ASCII locale names in initdb, as in the attached, and back-patch to 16. Then the EDB installer (a separate project that is calling initdb and providing the non-ASCII string containing "ü") would at least get a graceful failure instead of a crash. It hardly makes a difference to the user, but I've finally understood that these crashes in the Windows runtime library might be generating a stream of crash reports to Microsoft, which might be bringing you here?

For example:

$ initdb -D asdf --locale="liberté"
The files belonging to this database system will be owned by user "tmunro".
This user must also own the server process.

initdb: error: locale name "liberté" contains non-ASCII characters


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, "Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com>
Cc: Sandeep Thakkar <sandeep(dot)thakkar(at)enterprisedb(dot)com>, Rahul Pandey <pandeyrah(at)microsoft(dot)com>, Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>, Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com>, Amy Wishnousky <amyw(at)microsoft(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Shweta Gulati <gulatishweta(at)microsoft(dot)com>, Ashish Nawal <nawalashish(at)microsoft(dot)com>
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607
Date: 2024-08-29 15:16:53
Message-ID: 9b3a9dce-c909-4da4-a3d8-daae0135714d@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs


On 2024-08-28 We 6:16 AM, Thomas Munro wrote:
> On Tue, Aug 27, 2024 at 9:29 AM Haifang Wang (Centific Technologies
> Inc) <v-haiwang(at)microsoft(dot)com> wrote:
>> Just would like to follow up with you about this bug. How is it going with the investigation?
> Well, we figured out that Windows is aborting inside setlocale(). The
> C standard says "The pointer to string returned by the setlocale
> function is such that a subsequent call with that string value and its
> associated category will restore that part of the program’s locale."
> Window does not conform, and crashes itself.
>
> One practical change we could make would be to reject non-ASCII locale
> names in initdb, as in the attached, and back-patch to 16. Then the
> EDB installer (a separate project that is calling initdb and providing
> the non-ASCII string containing "ü") would at least get a graceful
> failure instead of a crash. It hardly makes a difference to the user,
> but I've finally understood that these crashes in the Windows runtime
> library might be generating a stream of crash reports to Microsoft,
> which might be bringing you here?
>
> For example:
>
> $ initdb -D asdf --locale="liberté"
> The files belonging to this database system will be owned by user "tmunro".
> This user must also own the server process.
>
> initdb: error: locale name "liberté" contains non-ASCII characters

+1 for doing this.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com


From: "Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Rahul Pandey <pandeyrah(at)microsoft(dot)com>, Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com>
Cc: Sandeep Thakkar <sandeep(dot)thakkar(at)enterprisedb(dot)com>, Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>, Amy Wishnousky <amyw(at)microsoft(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Shweta Gulati <gulatishweta(at)microsoft(dot)com>, Ashish Nawal <nawalashish(at)microsoft(dot)com>
Subject: RE: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607
Date: 2024-08-29 17:53:51
Message-ID: PH8PR21MB3902B95DE48AD71D4168926CE5962@PH8PR21MB3902.namprd21.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Thanks for the reply, Andrew and Thomas. The issue is originally from user feedback. And our team verified and reproduced it too. And it is high priority from our end is because: 1. It is regression, it did not see it in the previous build. 2. After the crash happened, even the app could be installed, but cluster initialization failed. And due to which we cannot connect to the database and proceed further.

@Shawn @Rahul, do you have any comment or suggestions?

Regards!
Haifang

-----Original Message-----
From: Andrew Dunstan <andrew(at)dunslane(dot)net>
Sent: Thursday, August 29, 2024 8:17 AM
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>; Haifang Wang (Centific Technologies Inc) <v-haiwang(at)microsoft(dot)com>
Cc: Sandeep Thakkar <sandeep(dot)thakkar(at)enterprisedb(dot)com>; Rahul Pandey <pandeyrah(at)microsoft(dot)com>; Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>; Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com>; Amy Wishnousky <amyw(at)microsoft(dot)com>; Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>; pgsql-bugs(at)lists(dot)postgresql(dot)org; Shweta Gulati <gulatishweta(at)microsoft(dot)com>; Ashish Nawal <nawalashish(at)microsoft(dot)com>
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607

[You don't often get email from andrew(at)dunslane(dot)net(dot) Learn why this is important at https://aka.ms/LearnAboutSenderIdentification ]

On 2024-08-28 We 6:16 AM, Thomas Munro wrote:
> On Tue, Aug 27, 2024 at 9:29 AM Haifang Wang (Centific Technologies
> Inc) <v-haiwang(at)microsoft(dot)com> wrote:
>> Just would like to follow up with you about this bug. How is it going with the investigation?
> Well, we figured out that Windows is aborting inside setlocale(). The
> C standard says "The pointer to string returned by the setlocale
> function is such that a subsequent call with that string value and its
> associated category will restore that part of the program’s locale."
> Window does not conform, and crashes itself.
>
> One practical change we could make would be to reject non-ASCII locale
> names in initdb, as in the attached, and back-patch to 16. Then the
> EDB installer (a separate project that is calling initdb and providing
> the non-ASCII string containing "ü") would at least get a graceful
> failure instead of a crash. It hardly makes a difference to the user,
> but I've finally understood that these crashes in the Windows runtime
> library might be generating a stream of crash reports to Microsoft,
> which might be bringing you here?
>
> For example:
>
> $ initdb -D asdf --locale="liberté"
> The files belonging to this database system will be owned by user "tmunro".
> This user must also own the server process.
>
> initdb: error: locale name "liberté" contains non-ASCII characters

+1 for doing this.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com/


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: "Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Rahul Pandey <pandeyrah(at)microsoft(dot)com>, Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com>
Cc: Sandeep Thakkar <sandeep(dot)thakkar(at)enterprisedb(dot)com>, Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>, Amy Wishnousky <amyw(at)microsoft(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Shweta Gulati <gulatishweta(at)microsoft(dot)com>, Ashish Nawal <nawalashish(at)microsoft(dot)com>
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607
Date: 2024-08-29 21:09:37
Message-ID: 5599a382-77bf-453f-8257-beb694a3c279@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs


On 2024-08-29 Th 1:53 PM, Haifang Wang (Centific Technologies Inc) wrote:
> Thanks for the reply, Andrew and Thomas. The issue is originally from user feedback. And our team verified and reproduced it too. And it is high priority from our end is because: 1. It is regression, it did not see it in the previous build. 2. After the crash happened, even the app could be installed, but cluster initialization failed. And due to which we cannot connect to the database and proceed further.

Please don't top-post on PostgreSQL lists.

If I read Thomas correctly, the solution is to create a locale with an
acceptable (pure Ascii) name using Locale Builder, and then call initdb
using that name. It might be that you need to do some of this manually
if the installer fails. IIRC is it possible to run the installer without
having it create a cluster automatically. You can then set up the
cluster and create a windows service manually using initdb / pg_ctl.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com


From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: "Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com>, Sandeep Thakkar <sandeep(dot)thakkar(at)enterprisedb(dot)com>, Rahul Pandey <pandeyrah(at)microsoft(dot)com>, Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>, Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com>, Amy Wishnousky <amyw(at)microsoft(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Shweta Gulati <gulatishweta(at)microsoft(dot)com>, Ashish Nawal <nawalashish(at)microsoft(dot)com>
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607
Date: 2024-08-29 22:55:29
Message-ID: CA+hUKGK4UivYpFsBmNbpELvgQRkaVTHLn6vjkmU9MdQdfQ97FQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

On Fri, Aug 30, 2024 at 3:16 AM Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
> On 2024-08-28 We 6:16 AM, Thomas Munro wrote:
> > initdb: error: locale name "liberté" contains non-ASCII characters
>
> +1 for doing this.

Thanks for looking! OK, let's go with this approach. That two-line
demo patch doesn't work properly for the system default, and of course
it should be in sync with the backend function, and needs comments to
explain this mess, so I'll work on a serious patch.


From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: "Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com>, Sandeep Thakkar <sandeep(dot)thakkar(at)enterprisedb(dot)com>, Rahul Pandey <pandeyrah(at)microsoft(dot)com>, Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>, Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com>, Amy Wishnousky <amyw(at)microsoft(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Shweta Gulati <gulatishweta(at)microsoft(dot)com>, Ashish Nawal <nawalashish(at)microsoft(dot)com>
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607
Date: 2024-08-30 03:22:13
Message-ID: CA+hUKG+28zRyk8U=eFLn8O+17VrbpKykWPVrSzf57Ru32dQo8A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

On Fri, Aug 30, 2024 at 10:55 AM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
> On Fri, Aug 30, 2024 at 3:16 AM Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
> > On 2024-08-28 We 6:16 AM, Thomas Munro wrote:
> > > initdb: error: locale name "liberté" contains non-ASCII characters
> >
> > +1 for doing this.
>
> Thanks for looking! OK, let's go with this approach. That two-line
> demo patch doesn't work properly for the system default, and of course
> it should be in sync with the backend function, and needs comments to
> explain this mess, so I'll work on a serious patch.

Done. To develop this patch, I set up a cursed locale name on my
FreeBSD box like so:

$ sudo cp -r /usr/share/locale/fr_FR.utf-8 /usr/share/locale/fraternité.utf-8

Then I could exercise the setlocale("") path, approximating what I
expect a Turkish Windows system to do without explicit --locale:

$ LC_TIME="fraternité.utf-8" initdb -D pdata
...
initdb: error: locale name "fraternité.utf-8" contains non-ASCII characters

The explicit path is easier to test, like before:

$ initdb -D pgdata --locale "égalité.UTF-9"
...
initdb: error: locale name "égalité.UTF-9" contains non-ASCII characters

That's what the EDB installer (if not changed to provide ASCII-clean
names somehow or other) will start failing with, if it tries to feed
umlauts to initdb.

Working through that, I understood another problem: even if you
provide an acceptable string like --locale "tr-TR", we still need to
be able to save-and-restore the value that initdb inherited from the
environment. Using my simulation, what I mean is if you do:

$ export LANG="liberté.utf-8"

... then ...

$ initdb -D pgdata --locale "fr_FR.utf-8"

... it still needs to be able to do a safe save-and-restore back to
the environment's value without crashing, even though it's not really
using that value. I wondered if initdb really needs a locale other
than "C" for itself, but I'm trying to minimise the changes here so I
don't break something else... That seems to mean that we have to use
_wsetlocale() for that particular save/restore. This is essentially
what Haifang's colleagues were suggesting upthread. Done.

Once a server is up and running, we shouldn't be exposed to the server
environment (right?), but there are several more places that accept
user-supplied strings, and only CREATE COLLATION fails to run
check_locale():

postgres=# create collation foo (locale = 'fraternité.utf-8');
CREATE COLLATION

The other places I am aware of do, so putting the equivalent code into
check_locale() does the right thing:

postgres=# create database db locale = 'fraternité.utf-8';
WARNING: locale name "fraternité.utf-8" contains non-ASCII characters
ERROR: invalid LC_COLLATE locale name: "fraternité.utf-8"
HINT: If the locale name is specific to ICU, use ICU_LOCALE.

postgres=# set lc_time = 'fraternité.utf-8';
WARNING: locale name "fraternité.utf-8" contains non-ASCII characters
ERROR: invalid value for parameter "lc_time": "fraternité.utf-8"

postgres=# set lc_messages = 'fraternité.utf-8';
WARNING: locale name "fraternité.utf-8" contains non-ASCII characters
ERROR: invalid value for parameter "lc_messages": "fraternité.utf-8"

postgres=# set lc_monetary = 'fraternité.utf-8';
WARNING: locale name "fraternité.utf-8" contains non-ASCII characters
ERROR: invalid value for parameter "lc_monetary": "fraternité.utf-8"

Those last SET lc_* changes aren't strictly necessary, because we
never restore to those, we only switch to them and then restore to the
database default, but since they share code with CREATE DATABASE the
new check kicks in anyway.

I tried adding check_locale() checks into CREATE COLLATION, but that
made some tests fail (huh, I didn't know that the set of strings
acceptable to _newlocale() and setlocale() are different -- the former
seems to accept "POSIX" on that OS, which is exercised by some tests,
or maybe it's some path of ours, I didn't check). But, on reflection,
I don't believe that COLLATION objects actually have this problem
anyway, because we never call setlocale() to them, we just use the
modern thread-safe POSIX-2008-oid APIs. So I took that back out to
minimise the patch.

Please see attached, including a restatement of the problem and
solution in the commit message.

I suppose we could add some tests for this, but I was a bit unsure
about the rules for putting non-ASCII into perl scripts and command
lines, and would need to research that...

(Note: a lot of this old stuff is getting ripped out in master by the
ongoing multi-threading effort...)

Attachment Content-Type Size
v2-0001-Reject-non-ASCII-locale-names.patch text/x-patch 7.3 KB

From: "Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Sandeep Thakkar <sandeep(dot)thakkar(at)enterprisedb(dot)com>, Rahul Pandey <pandeyrah(at)microsoft(dot)com>, Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>, Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com>, Amy Wishnousky <amyw(at)microsoft(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Shweta Gulati <gulatishweta(at)microsoft(dot)com>, Ashish Nawal <nawalashish(at)microsoft(dot)com>
Subject: RE: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607
Date: 2024-09-04 21:28:22
Message-ID: PH8PR21MB3902991F916D365183F8AE61E59C2@PH8PR21MB3902.namprd21.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Hi Thomas,

How is it going after all the analysis below?

Thanks!
Haifang

-----Original Message-----
From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Sent: Thursday, August 29, 2024 8:22 PM
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Haifang Wang (Centific Technologies Inc) <v-haiwang(at)microsoft(dot)com>; Sandeep Thakkar <sandeep(dot)thakkar(at)enterprisedb(dot)com>; Rahul Pandey <pandeyrah(at)microsoft(dot)com>; Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>; Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com>; Amy Wishnousky <amyw(at)microsoft(dot)com>; Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>; pgsql-bugs(at)lists(dot)postgresql(dot)org; Shweta Gulati <gulatishweta(at)microsoft(dot)com>; Ashish Nawal <nawalashish(at)microsoft(dot)com>
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607

On Fri, Aug 30, 2024 at 10:55 AM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
> On Fri, Aug 30, 2024 at 3:16 AM Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
> > On 2024-08-28 We 6:16 AM, Thomas Munro wrote:
> > > initdb: error: locale name "liberté" contains non-ASCII characters
> >
> > +1 for doing this.
>
> Thanks for looking! OK, let's go with this approach. That two-line
> demo patch doesn't work properly for the system default, and of course
> it should be in sync with the backend function, and needs comments to
> explain this mess, so I'll work on a serious patch.

Done. To develop this patch, I set up a cursed locale name on my FreeBSD box like so:

$ sudo cp -r /usr/share/locale/fr_FR.utf-8 /usr/share/locale/fraternité.utf-8

Then I could exercise the setlocale("") path, approximating what I expect a Turkish Windows system to do without explicit --locale:

$ LC_TIME="fraternité.utf-8" initdb -D pdata ...
initdb: error: locale name "fraternité.utf-8" contains non-ASCII characters

The explicit path is easier to test, like before:

$ initdb -D pgdata --locale "égalité.UTF-9"
...
initdb: error: locale name "égalité.UTF-9" contains non-ASCII characters

That's what the EDB installer (if not changed to provide ASCII-clean names somehow or other) will start failing with, if it tries to feed umlauts to initdb.

Working through that, I understood another problem: even if you provide an acceptable string like --locale "tr-TR", we still need to be able to save-and-restore the value that initdb inherited from the environment. Using my simulation, what I mean is if you do:

$ export LANG="liberté.utf-8"

... then ...

$ initdb -D pgdata --locale "fr_FR.utf-8"

... it still needs to be able to do a safe save-and-restore back to the environment's value without crashing, even though it's not really using that value. I wondered if initdb really needs a locale other than "C" for itself, but I'm trying to minimise the changes here so I
don't break something else... That seems to mean that we have to use
_wsetlocale() for that particular save/restore. This is essentially what Haifang's colleagues were suggesting upthread. Done.

Once a server is up and running, we shouldn't be exposed to the server environment (right?), but there are several more places that accept user-supplied strings, and only CREATE COLLATION fails to run
check_locale():

postgres=# create collation foo (locale = 'fraternité.utf-8'); CREATE COLLATION

The other places I am aware of do, so putting the equivalent code into
check_locale() does the right thing:

postgres=# create database db locale = 'fraternité.utf-8';
WARNING: locale name "fraternité.utf-8" contains non-ASCII characters
ERROR: invalid LC_COLLATE locale name: "fraternité.utf-8"
HINT: If the locale name is specific to ICU, use ICU_LOCALE.

postgres=# set lc_time = 'fraternité.utf-8';
WARNING: locale name "fraternité.utf-8" contains non-ASCII characters
ERROR: invalid value for parameter "lc_time": "fraternité.utf-8"

postgres=# set lc_messages = 'fraternité.utf-8';
WARNING: locale name "fraternité.utf-8" contains non-ASCII characters
ERROR: invalid value for parameter "lc_messages": "fraternité.utf-8"

postgres=# set lc_monetary = 'fraternité.utf-8';
WARNING: locale name "fraternité.utf-8" contains non-ASCII characters
ERROR: invalid value for parameter "lc_monetary": "fraternité.utf-8"

Those last SET lc_* changes aren't strictly necessary, because we never restore to those, we only switch to them and then restore to the database default, but since they share code with CREATE DATABASE the new check kicks in anyway.

I tried adding check_locale() checks into CREATE COLLATION, but that made some tests fail (huh, I didn't know that the set of strings acceptable to _newlocale() and setlocale() are different -- the former seems to accept "POSIX" on that OS, which is exercised by some tests, or maybe it's some path of ours, I didn't check). But, on reflection, I don't believe that COLLATION objects actually have this problem anyway, because we never call setlocale() to them, we just use the modern thread-safe POSIX-2008-oid APIs. So I took that back out to minimise the patch.

Please see attached, including a restatement of the problem and solution in the commit message.

I suppose we could add some tests for this, but I was a bit unsure about the rules for putting non-ASCII into perl scripts and command lines, and would need to research that...

(Note: a lot of this old stuff is getting ripped out in master by the ongoing multi-threading effort...)


From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: "Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Sandeep Thakkar <sandeep(dot)thakkar(at)enterprisedb(dot)com>, Rahul Pandey <pandeyrah(at)microsoft(dot)com>, Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>, Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com>, Amy Wishnousky <amyw(at)microsoft(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Shweta Gulati <gulatishweta(at)microsoft(dot)com>, Ashish Nawal <nawalashish(at)microsoft(dot)com>
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607
Date: 2024-09-04 21:41:08
Message-ID: CA+hUKGKHXt63fxW-jBWLmzL0GmRsX6BckNvFk=Hazk6hgrjptw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

On Thu, Sep 5, 2024 at 9:28 AM Haifang Wang (Centific Technologies
Inc) <v-haiwang(at)microsoft(dot)com> wrote:
> How is it going after all the analysis below?

I proposed a patch. I am waiting for someone to try it out, review it
and write the magic words "looks good to me", before I commit it and
back-patch it. PostgreSQL will next make a release in early November.

Shortly after that, the EDB Windows installer team would begin
publishing a new Windows installer package, and then you should stop
hearing from Windows users whose runtime library exploded. Hopefully
they will also adjust their GUI that offers a list of locale names to
users, so that it doesn't even offer the ones that don't work, and
does offer some better ones, like (probably) "tr-TR.utf-8".


From: "Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Sandeep Thakkar <sandeep(dot)thakkar(at)enterprisedb(dot)com>, Rahul Pandey <pandeyrah(at)microsoft(dot)com>, Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>, Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com>, Amy Wishnousky <amyw(at)microsoft(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Shweta Gulati <gulatishweta(at)microsoft(dot)com>, Ashish Nawal <nawalashish(at)microsoft(dot)com>
Subject: RE: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607
Date: 2024-09-04 22:07:14
Message-ID: PH8PR21MB390228556041EB143267036FE59C2@PH8PR21MB3902.namprd21.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Cool. is anyone going to try it out? Looking forward to the back-patch and release.

Regards!
Haifang

-----Original Message-----
From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Sent: Wednesday, September 4, 2024 2:41 PM
To: Haifang Wang (Centific Technologies Inc) <v-haiwang(at)microsoft(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>; Sandeep Thakkar <sandeep(dot)thakkar(at)enterprisedb(dot)com>; Rahul Pandey <pandeyrah(at)microsoft(dot)com>; Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>; Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com>; Amy Wishnousky <amyw(at)microsoft(dot)com>; Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>; pgsql-bugs(at)lists(dot)postgresql(dot)org; Shweta Gulati <gulatishweta(at)microsoft(dot)com>; Ashish Nawal <nawalashish(at)microsoft(dot)com>
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607

On Thu, Sep 5, 2024 at 9:28 AM Haifang Wang (Centific Technologies
Inc) <v-haiwang(at)microsoft(dot)com> wrote:
> How is it going after all the analysis below?

I proposed a patch. I am waiting for someone to try it out, review it and write the magic words "looks good to me", before I commit it and back-patch it. PostgreSQL will next make a release in early November.

Shortly after that, the EDB Windows installer team would begin publishing a new Windows installer package, and then you should stop hearing from Windows users whose runtime library exploded. Hopefully they will also adjust their GUI that offers a list of locale names to users, so that it doesn't even offer the ones that don't work, and does offer some better ones, like (probably) "tr-TR.utf-8".


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Sandeep Thakkar <sandeep(dot)thakkar(at)enterprisedb(dot)com>, Rahul Pandey <pandeyrah(at)microsoft(dot)com>, Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>, Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com>, Amy Wishnousky <amyw(at)microsoft(dot)com>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Shweta Gulati <gulatishweta(at)microsoft(dot)com>, Ashish Nawal <nawalashish(at)microsoft(dot)com>
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607
Date: 2024-09-04 22:10:09
Message-ID: 1703734.1725487809@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

"Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com> writes:
> Cool. is anyone going to try it out? Looking forward to the back-patch and release.

I thought the implication was pretty clear that YOU ought to try it out.

regards, tom lane


From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Sandeep Thakkar <sandeep(dot)thakkar(at)enterprisedb(dot)com>, Rahul Pandey <pandeyrah(at)microsoft(dot)com>, Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>, Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com>, Amy Wishnousky <amyw(at)microsoft(dot)com>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Shweta Gulati <gulatishweta(at)microsoft(dot)com>, Ashish Nawal <nawalashish(at)microsoft(dot)com>
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607
Date: 2024-09-04 22:35:28
Message-ID: CA+hUKGLbGaMPBWgj4GPR_i=0s8J9XVZZ4_-0DP_ZpCoCZbaDig@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

On Thu, Sep 5, 2024 at 10:10 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> "Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com> writes:
> > Cool. is anyone going to try it out? Looking forward to the back-patch and release.
>
> I thought the implication was pretty clear that YOU ought to try it out.

It would also be good to hear from Sandeep and crew, if possible,
since they will be dealing with the consequences of this change.

But here is one thing Haifang might like to consider: the
documentation[1] for setlocale() could really use a note in the
remarks section to highlight this danger. Nothing in there seems to
alert the programmer to the fact that save = setlocale(NULL),
setlocale("something else"), setlocale(save) might fail to restore the
previous locale, and might even abort. Perhaps that's a weird, old,
non-thread-safe thing to be doing (and we're working on removing
that), but it is quite explicitly described in the standard. C99
7.11.1.1 paragraph 8: "The pointer to string returned by the setlocale
function is such that a subsequent call with that string value and its
associated category will restore that part of the program’s locale".
Aborting also fails paragraph 6: "If the selection cannot be honored,
the setlocale function returns a null pointer and the program’s locale
is not changed" (perhaps this has to do with the crt report mode
stuff, I don't know as I'm not a Windows programmer, I'm just trying
to maintain a C program; or maybe it's actually a bug?). Another
thing is that setlocale(NULL) continues to return the "old style"
names while everything else in the OS is using the modern BCP 47
system since Vista, and the locale documentation recommends that
applications prefer those[2]. It might be worth a note about that
too? (I guess stability and backwards compatibility are behind that
choice, but ... it's not really stable anyway.)

[1] https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/setlocale-wsetlocale?view=msvc-170
[2] https://learn.microsoft.com/en-us/cpp/c-runtime-library/locale-names-languages-and-country-region-strings?view=msvc-170


From: "Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Sandeep Thakkar <sandeep(dot)thakkar(at)enterprisedb(dot)com>, Rahul Pandey <pandeyrah(at)microsoft(dot)com>, Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>, Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com>, Amy Wishnousky <amyw(at)microsoft(dot)com>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Shweta Gulati <gulatishweta(at)microsoft(dot)com>, Ashish Nawal <nawalashish(at)microsoft(dot)com>
Subject: RE: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607
Date: 2024-09-04 23:24:28
Message-ID: PH8PR21MB3902E906C37A613A1F34C5B4E59C2@PH8PR21MB3902.namprd21.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Okay, I will ask our tester to try the patch. But have you tried it from your end? @Thomas Munro @Tom Lane

-----Original Message-----
From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Sent: Wednesday, September 4, 2024 3:35 PM
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Haifang Wang (Centific Technologies Inc) <v-haiwang(at)microsoft(dot)com>; Andrew Dunstan <andrew(at)dunslane(dot)net>; Sandeep Thakkar <sandeep(dot)thakkar(at)enterprisedb(dot)com>; Rahul Pandey <pandeyrah(at)microsoft(dot)com>; Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>; Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com>; Amy Wishnousky <amyw(at)microsoft(dot)com>; pgsql-bugs(at)lists(dot)postgresql(dot)org; Shweta Gulati <gulatishweta(at)microsoft(dot)com>; Ashish Nawal <nawalashish(at)microsoft(dot)com>
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607

On Thu, Sep 5, 2024 at 10:10 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> "Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com> writes:
> > Cool. is anyone going to try it out? Looking forward to the back-patch and release.
>
> I thought the implication was pretty clear that YOU ought to try it out.

It would also be good to hear from Sandeep and crew, if possible, since they will be dealing with the consequences of this change.

But here is one thing Haifang might like to consider: the documentation[1] for setlocale() could really use a note in the remarks section to highlight this danger. Nothing in there seems to alert the programmer to the fact that save = setlocale(NULL), setlocale("something else"), setlocale(save) might fail to restore the previous locale, and might even abort. Perhaps that's a weird, old, non-thread-safe thing to be doing (and we're working on removing that), but it is quite explicitly described in the standard. C99
7.11.1.1 paragraph 8: "The pointer to string returned by the setlocale function is such that a subsequent call with that string value and its associated category will restore that part of the program’s locale".
Aborting also fails paragraph 6: "If the selection cannot be honored, the setlocale function returns a null pointer and the program’s locale is not changed" (perhaps this has to do with the crt report mode stuff, I don't know as I'm not a Windows programmer, I'm just trying to maintain a C program; or maybe it's actually a bug?). Another thing is that setlocale(NULL) continues to return the "old style"
names while everything else in the OS is using the modern BCP 47 system since Vista, and the locale documentation recommends that applications prefer those[2]. It might be worth a note about that too? (I guess stability and backwards compatibility are behind that choice, but ... it's not really stable anyway.)

[1] https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/setlocale-wsetlocale?view=msvc-170
[2] https://learn.microsoft.com/en-us/cpp/c-runtime-library/locale-names-languages-and-country-region-strings?view=msvc-170


From: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
To: "Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Sandeep Thakkar <sandeep(dot)thakkar(at)enterprisedb(dot)com>, Rahul Pandey <pandeyrah(at)microsoft(dot)com>, Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>, Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com>, Amy Wishnousky <amyw(at)microsoft(dot)com>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Shweta Gulati <gulatishweta(at)microsoft(dot)com>, Ashish Nawal <nawalashish(at)microsoft(dot)com>
Subject: Windows Application Issues | PostgreSQL | REF # 48475607
Date: 2024-09-04 23:38:22
Message-ID: CAKFQuwY_-9B-rj0znkYO9_M5xpq7eNNwWoER2v2XdMtJR=_Atg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

On Wednesday, September 4, 2024, Haifang Wang (Centific Technologies Inc) <
v-haiwang(at)microsoft(dot)com> wrote:

> Okay, I will ask our tester to try the patch. But have you tried it from
> your end? @Thomas Munro @Tom Lane
>

Thomas wrote that he authored it so you can safely assume it was tested by
him. Tom Lane is not on the hook to be a reviewer here - especially given
his total involvement to date was only a single reply a while ago. Tom as
a committer could pick up the patch and commit it as a neutral party
committer once the others involved on this thread provide some reviews;
though given this is Windows-based I wouldn’t expect Tom to do that here.

David J.


From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: "Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Sandeep Thakkar <sandeep(dot)thakkar(at)enterprisedb(dot)com>, Rahul Pandey <pandeyrah(at)microsoft(dot)com>, Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>, Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com>, Amy Wishnousky <amyw(at)microsoft(dot)com>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Shweta Gulati <gulatishweta(at)microsoft(dot)com>, Ashish Nawal <nawalashish(at)microsoft(dot)com>
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607
Date: 2024-09-04 23:39:12
Message-ID: CA+hUKGLn3AnrNN3ytmOg0xhWF2LP7-N+26TCeC-Da=bK3R0Y=Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

On Thu, Sep 5, 2024 at 11:24 AM Haifang Wang (Centific Technologies
Inc) <v-haiwang(at)microsoft(dot)com> wrote:
> Okay, I will ask our tester to try the patch. But have you tried it from your end? @Thomas Munro @Tom Lane

I demonstrated it working on my computer in the email:

/message-id/CA%2BhUKG%2B28zRyk8U%3DeFLn8O%2B17VrbpKykWPVrSzf57Ru32dQo8A%40mail.gmail.com

But I can't test on Windows. For context, I have a "UNIX -- Live Free
or Die" license plate attached to the side of my workstation.


From: "Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Sandeep Thakkar <sandeep(dot)thakkar(at)enterprisedb(dot)com>, Rahul Pandey <pandeyrah(at)microsoft(dot)com>, Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>, Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com>, Amy Wishnousky <amyw(at)microsoft(dot)com>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Shweta Gulati <gulatishweta(at)microsoft(dot)com>, Ashish Nawal <nawalashish(at)microsoft(dot)com>
Subject: RE: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607
Date: 2024-09-05 00:07:49
Message-ID: PH8PR21MB3902D6EC9EF1647F97F60037E59D2@PH8PR21MB3902.namprd21.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Good to know that, Thomas. I'm asking our tester to try it. Will let you know.

Regards!
Haifang

-----Original Message-----
From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Sent: Wednesday, September 4, 2024 4:39 PM
To: Haifang Wang (Centific Technologies Inc) <v-haiwang(at)microsoft(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>; Andrew Dunstan <andrew(at)dunslane(dot)net>; Sandeep Thakkar <sandeep(dot)thakkar(at)enterprisedb(dot)com>; Rahul Pandey <pandeyrah(at)microsoft(dot)com>; Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>; Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com>; Amy Wishnousky <amyw(at)microsoft(dot)com>; pgsql-bugs(at)lists(dot)postgresql(dot)org; Shweta Gulati <gulatishweta(at)microsoft(dot)com>; Ashish Nawal <nawalashish(at)microsoft(dot)com>
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607

On Thu, Sep 5, 2024 at 11:24 AM Haifang Wang (Centific Technologies
Inc) <v-haiwang(at)microsoft(dot)com> wrote:
> Okay, I will ask our tester to try the patch. But have you tried it
> from your end? @Thomas Munro @Tom Lane

I demonstrated it working on my computer in the email:

/message-id/CA%2BhUKG%2B28zRyk8U%3DeFLn8O%2B17VrbpKykWPVrSzf57Ru32dQo8A%40mail.gmail.com

But I can't test on Windows. For context, I have a "UNIX -- Live Free or Die" license plate attached to the side of my workstation.


From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: "Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Sandeep Thakkar <sandeep(dot)thakkar(at)enterprisedb(dot)com>, Rahul Pandey <pandeyrah(at)microsoft(dot)com>, Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>, Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com>, Amy Wishnousky <amyw(at)microsoft(dot)com>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Shweta Gulati <gulatishweta(at)microsoft(dot)com>, Ashish Nawal <nawalashish(at)microsoft(dot)com>
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607
Date: 2024-09-05 00:15:24
Message-ID: CA+hUKGJHGMB_HKemrrFLw=D10Tbfmo_AiWKrzszDL4xud+XwUw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

On Thu, Sep 5, 2024 at 12:07 PM Haifang Wang (Centific Technologies
Inc) <v-haiwang(at)microsoft(dot)com> wrote:
> Good to know that, Thomas. I'm asking our tester to try it. Will let you know.

To give a more serious response, I don't really expect anything
"simple" about this patch not to work on Windows, as I know that it it
rejects non-ASCII names as I showed (on all OSes), and I know that
compiles and passes our tests on all our 4 CI operating systems:

https://cirrus-ci.com/build/4941642090676224

Really what I'm looking for is (1) feedback on the approach, code and
comments, and thoughts about more complex scenarios I may have failed
to think about, including say, pg_dump, pg_upgrade etc operational
issues, which probably involves lots of previous experience with
PostgreSQL, (2) opinions on whether we should add a test for these
cases and how to put the utf-8 into a script (I'm confused about the
encoding of command line arguments), and (3) a nod from the EDB people
involved in distributing this software on Windows.


From: Sandeep Thakkar <sandeep(dot)thakkar(at)enterprisedb(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: "Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Rahul Pandey <pandeyrah(at)microsoft(dot)com>, Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>, Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com>, Amy Wishnousky <amyw(at)microsoft(dot)com>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Shweta Gulati <gulatishweta(at)microsoft(dot)com>, Ashish Nawal <nawalashish(at)microsoft(dot)com>
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607
Date: 2024-09-05 11:46:40
Message-ID: CANFyU95DaWEVLOdhZPLdjRMAVQGMr1tnq30uWB37MjReyJUfCg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

On Thu, Sep 5, 2024 at 5:46 AM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:

> On Thu, Sep 5, 2024 at 12:07 PM Haifang Wang (Centific Technologies
> Inc) <v-haiwang(at)microsoft(dot)com> wrote:
> > Good to know that, Thomas. I'm asking our tester to try it. Will let you
> know.
>
> To give a more serious response, I don't really expect anything
> "simple" about this patch not to work on Windows, as I know that it it
> rejects non-ASCII names as I showed (on all OSes), and I know that
> compiles and passes our tests on all our 4 CI operating systems:
>
> https://cirrus-ci.com/build/4941642090676224
>
> Really what I'm looking for is (1) feedback on the approach, code and
> comments, and thoughts about more complex scenarios I may have failed
> to think about, including say, pg_dump, pg_upgrade etc operational
> issues, which probably involves lots of previous experience with
> PostgreSQL, (2) opinions on whether we should add a test for these
> cases and how to put the utf-8 into a script (I'm confused about the
> encoding of command line arguments), and (3) a nod from the EDB people
> involved in distributing this software on Windows.
>

We can help with producing the builds with the patches provided. You had
also
mentioned about the changes required in the installer script, will it still
be required?

--
Sandeep Thakkar


From: "Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com>
To: Sandeep Thakkar <sandeep(dot)thakkar(at)enterprisedb(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Rahul Pandey <pandeyrah(at)microsoft(dot)com>, Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>, Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com>, Amy Wishnousky <amyw(at)microsoft(dot)com>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Shweta Gulati <gulatishweta(at)microsoft(dot)com>, Ashish Nawal <nawalashish(at)microsoft(dot)com>
Subject: RE: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607
Date: 2024-09-05 17:18:22
Message-ID: PH8PR21MB390209222D8B0B73EDA5F90FE59D2@PH8PR21MB3902.namprd21.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

@Sandeep Thakkar<mailto:sandeep(dot)thakkar(at)enterprisedb(dot)com>, I believe your question is to Thomas, is that right?

@Thomas Munro<mailto:thomas(dot)munro(at)gmail(dot)com> regarding the patch you shared previously, could you give some instruction on how to use it?

Thanks!
Haifang

From: Sandeep Thakkar <sandeep(dot)thakkar(at)enterprisedb(dot)com>
Sent: Thursday, September 5, 2024 4:47 AM
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Haifang Wang (Centific Technologies Inc) <v-haiwang(at)microsoft(dot)com>; Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>; Andrew Dunstan <andrew(at)dunslane(dot)net>; Rahul Pandey <pandeyrah(at)microsoft(dot)com>; Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>; Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com>; Amy Wishnousky <amyw(at)microsoft(dot)com>; pgsql-bugs(at)lists(dot)postgresql(dot)org; Shweta Gulati <gulatishweta(at)microsoft(dot)com>; Ashish Nawal <nawalashish(at)microsoft(dot)com>
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607

On Thu, Sep 5, 2024 at 5:46 AM Thomas Munro <thomas(dot)munro(at)gmail(dot)com<mailto:thomas(dot)munro(at)gmail(dot)com>> wrote:
On Thu, Sep 5, 2024 at 12:07 PM Haifang Wang (Centific Technologies
Inc) <v-haiwang(at)microsoft(dot)com<mailto:v-haiwang(at)microsoft(dot)com>> wrote:
> Good to know that, Thomas. I'm asking our tester to try it. Will let you know.

To give a more serious response, I don't really expect anything
"simple" about this patch not to work on Windows, as I know that it it
rejects non-ASCII names as I showed (on all OSes), and I know that
compiles and passes our tests on all our 4 CI operating systems:

https://cirrus-ci.com/build/4941642090676224

Really what I'm looking for is (1) feedback on the approach, code and
comments, and thoughts about more complex scenarios I may have failed
to think about, including say, pg_dump, pg_upgrade etc operational
issues, which probably involves lots of previous experience with
PostgreSQL, (2) opinions on whether we should add a test for these
cases and how to put the utf-8 into a script (I'm confused about the
encoding of command line arguments), and (3) a nod from the EDB people
involved in distributing this software on Windows.

We can help with producing the builds with the patches provided. You had also
mentioned about the changes required in the installer script, will it still be required?

--
Sandeep Thakkar


From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Sandeep Thakkar <sandeep(dot)thakkar(at)enterprisedb(dot)com>
Cc: "Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Rahul Pandey <pandeyrah(at)microsoft(dot)com>, Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>, Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com>, Amy Wishnousky <amyw(at)microsoft(dot)com>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Shweta Gulati <gulatishweta(at)microsoft(dot)com>, Ashish Nawal <nawalashish(at)microsoft(dot)com>
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607
Date: 2024-09-19 23:37:37
Message-ID: CA+hUKGLfrK33XpFXsRcc97a1Qa5Vz1YFEn4GC1vie7yse=ffPA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

On Thu, Sep 5, 2024 at 11:46 PM Sandeep Thakkar
<sandeep(dot)thakkar(at)enterprisedb(dot)com> wrote:
> On Thu, Sep 5, 2024 at 5:46 AM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
>> Really what I'm looking for is (1) feedback on the approach, code and
>> comments, and thoughts about more complex scenarios I may have failed
>> to think about, including say, pg_dump, pg_upgrade etc operational
>> issues, which probably involves lots of previous experience with
>> PostgreSQL, (2) opinions on whether we should add a test for these
>> cases and how to put the utf-8 into a script (I'm confused about the
>> encoding of command line arguments), and (3) a nod from the EDB people
>> involved in distributing this software on Windows.

If I don't hear any objections to this plan soon, I'm going to commit
this and back-patch it into PostgreSQL 16 and PostgreSQL 17 after the
upcoming code freeze for the PostgreSQL 17 release ends. So it'll
probably be in 16.5 and 17.1.

> We can help with producing the builds with the patches provided. You had also
> mentioned about the changes required in the installer script, will it still be required?

If you don't change the installer script, then it will still fail if
someone selects "Türkiye" in your GUI, but now it will fail with an
ERROR rejecting non-ASCII characters, instead of crashing. So people
in Türkiye, Côte d'Ivoire, Curaçao etc will still have no way to
initialise a cluster with your GUI in PostgreSQL 16.5 and 17.1 unless
they follow the instructions on the web to create a "Turkey" (or
whatever non-ASCII string they want). Of course they could always use
initdb.exe directly from the command line with a BCP47 name. Maybe
that's OK, but I think you should consider changing the installer. A
conservative way to do it would be to show all the existing options
that you have now (so that someone who is happy using the old style
names when they don't contain non-ASCII can keep doing so), but also
have a second entry for each country that shows "Turkish, Türkiye
(tr-TR)" and/or perhaps "Turkish, Türkiye (tr-TR.utf-8)" or perhaps
both, and passes just that part in parentheses to initdb, to give
users all the options. Or perhaps you could have a checkbox "BCP 47
locales" that changes the list to show them.

No one has really reported any real world experience choosing between
the tr-TR vs tr-TR.utf-8 alternatives, and you might like to
experiment with that. The second option makes Windows' system
libraries use utf-8 encoding instead of the traditional encoding
associated with the language. As far as I can tell, it doesn't make
any difference at all to PostgreSQL yet, because your installer always
uses --encoding="UTF8" and, on Windows only, that makes PostgreSQL
ignore the locale's encoding and do a whole lot of internal
conversation to wchar_t because PostgreSQL doesn't yet know that
Windows 10+ can work with utf-8 directly.

The reason that I am interested in this .utf-8-or-not question is that
I'd like to consider *disallowing* non-matching encodings (see
commitfest entry #3772, reviewers wanted!), and teaching PostgreSQL
that Windows does in fact have utf-8, just so we can delete a lot of
slow special case code, harmonise with Unix, and generally catch up
with reality. So I figure we might as well start encouraging the
"xx-XX.utf-8" names when using --encoding="UTF8" if we can't find any
downside, because under that plan it would eventually become illegal
to use --locale="tr-TR" (no .utf-8) with --encoding="utf-8" if that
eventually goes in, so it seems sensible to stop creating new clusters
that way ASAP so that users have a better time upgrading in the
future. For example, a pg_upgrade from a PostgreSQL 17 cluster
initialised with --locale="tr-TR" --encoding="UTF8" to PostgreSQL 18
would proabbly require some extra step to rename "tr-TR" to
"tr-TR.UTF8" at some point (not sure exactly where), if PostgreSQL 18
starts rejecting the non-matching combination. I don't know where
that'll go, though -- it's not high priority work, it's just
incremental cleanup and modernisation that practically suggests itself
whenever looking at rejigging locale code for thread-safety and
reading all those comments about wchar_t that are not true.


From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Sandeep Thakkar <sandeep(dot)thakkar(at)enterprisedb(dot)com>
Cc: "Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Rahul Pandey <pandeyrah(at)microsoft(dot)com>, Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>, Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com>, Amy Wishnousky <amyw(at)microsoft(dot)com>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Shweta Gulati <gulatishweta(at)microsoft(dot)com>, Ashish Nawal <nawalashish(at)microsoft(dot)com>
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607
Date: 2024-10-05 01:16:31
Message-ID: CA+hUKGLqN3dhHF_Y6rEgKCVxy0f7uCtSEv7okAzz=D96qVj38Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

On Fri, Sep 20, 2024 at 11:37 AM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
> If I don't hear any objections to this plan soon, I'm going to commit
> this and back-patch it into PostgreSQL 16 and PostgreSQL 17 after the
> upcoming code freeze for the PostgreSQL 17 release ends. So it'll
> probably be in 16.5 and 17.1.

Done, after another round of polishing and testing.

The main change since v2 was: I realised that initdb's
locale_date_order() also contains a save-and-restore, which should
share code at least for consistency (I don't personally know which
categories really affect process-wide encoding assumptions, so this
might not technically be necessary). I moved the relevant code into
new helper functions, which I think came out tidier.

The wchar_t-based save-restore in initdb.exe is based on a theoretical
observation, as the report was about postgres.exe crashing and that
should now be unreachable with these new name checks, but we can see
the general phenomenon from a two-line standalone repro[1] and so I
suspect there must be environment/input combinations that could crash
initdb.exe itself without it. We don't have any concrete reports of
that, but I preferred to close the theoretical risk using the
recommended technique.

Sorry for taking so long to untangle this mess; I assumed for much of
the thread that we were talking about the failure of existing clusters
to come up (as complained about in numerous other threads), not this
new kind of crash, and was trying to fix that before the Locale
Builder solution was widely known. Fixing Windows bugs as a
non-Windows guy by reading the manual and testing patches on CI can
sometimes feel a little like trying to patch a Mars rover... Thanks
to David R for help with a crucial observation using a martian
debugger.

CREATE COLLATION should probably also start using check_locale()
eventually, and I may look into harmonising that and figure out how to
test it in nearby future work on locales vs threads.

[1] /message-id/CA+hUKG+FxeRLURZ=n8NPyLwgjFds_SqU_cQvE40ks6RQKUGbGg@mail.gmail.com


From: "Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Sandeep Thakkar <sandeep(dot)thakkar(at)enterprisedb(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Rahul Pandey <pandeyrah(at)microsoft(dot)com>, Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>, Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com>, Amy Wishnousky <amyw(at)microsoft(dot)com>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Shweta Gulati <gulatishweta(at)microsoft(dot)com>, Ashish Nawal <nawalashish(at)microsoft(dot)com>
Subject: RE: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607
Date: 2024-11-11 23:12:07
Message-ID: SJ0PR21MB2022111D2515DA973EA45BC0E5582@SJ0PR21MB2022.namprd21.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Hi Thomas and team,

Just would like to follow up with you about this bug. Is the bug fix available? We noticed that there is a new build 17.0-1. However the issue repro on that build too.

Thanks!
Haifang

-----Original Message-----
From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Sent: Friday, October 4, 2024 6:17 PM
To: Sandeep Thakkar <sandeep(dot)thakkar(at)enterprisedb(dot)com>
Cc: Haifang Wang (Centific Technologies Inc) <v-haiwang(at)microsoft(dot)com>; Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>; Andrew Dunstan <andrew(at)dunslane(dot)net>; Rahul Pandey <pandeyrah(at)microsoft(dot)com>; Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>; Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com>; Amy Wishnousky <amyw(at)microsoft(dot)com>; pgsql-bugs(at)lists(dot)postgresql(dot)org; Shweta Gulati <gulatishweta(at)microsoft(dot)com>; Ashish Nawal <nawalashish(at)microsoft(dot)com>
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607

On Fri, Sep 20, 2024 at 11:37 AM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
> If I don't hear any objections to this plan soon, I'm going to commit
> this and back-patch it into PostgreSQL 16 and PostgreSQL 17 after the
> upcoming code freeze for the PostgreSQL 17 release ends. So it'll
> probably be in 16.5 and 17.1.

Done, after another round of polishing and testing.

The main change since v2 was: I realised that initdb's
locale_date_order() also contains a save-and-restore, which should share code at least for consistency (I don't personally know which categories really affect process-wide encoding assumptions, so this might not technically be necessary). I moved the relevant code into new helper functions, which I think came out tidier.

The wchar_t-based save-restore in initdb.exe is based on a theoretical observation, as the report was about postgres.exe crashing and that should now be unreachable with these new name checks, but we can see the general phenomenon from a two-line standalone repro[1] and so I suspect there must be environment/input combinations that could crash initdb.exe itself without it. We don't have any concrete reports of that, but I preferred to close the theoretical risk using the recommended technique.

Sorry for taking so long to untangle this mess; I assumed for much of the thread that we were talking about the failure of existing clusters to come up (as complained about in numerous other threads), not this new kind of crash, and was trying to fix that before the Locale Builder solution was widely known. Fixing Windows bugs as a non-Windows guy by reading the manual and testing patches on CI can sometimes feel a little like trying to patch a Mars rover... Thanks to David R for help with a crucial observation using a martian debugger.

CREATE COLLATION should probably also start using check_locale() eventually, and I may look into harmonising that and figure out how to test it in nearby future work on locales vs threads.

[1] /message-id/CA+hUKG+FxeRLURZ=n8NPyLwgjFds_SqU_cQvE40ks6RQKUGbGg@mail.gmail.com


From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: "Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com>
Cc: Sandeep Thakkar <sandeep(dot)thakkar(at)enterprisedb(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Rahul Pandey <pandeyrah(at)microsoft(dot)com>, Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>, Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com>, Amy Wishnousky <amyw(at)microsoft(dot)com>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Shweta Gulati <gulatishweta(at)microsoft(dot)com>, Ashish Nawal <nawalashish(at)microsoft(dot)com>
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607
Date: 2024-11-12 00:42:40
Message-ID: CA+hUKGJy3nfnDnPjoJ78L_zBD+nitPTtuEbDd6q4=owNQicgCg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

On Tue, Nov 12, 2024 at 12:12 PM Haifang Wang (Centific Technologies
Inc) <v-haiwang(at)microsoft(dot)com> wrote:
> Just would like to follow up with you about this bug. Is the bug fix available? We noticed that there is a new build 17.0-1. However the issue repro on that build too.

Hi Haifang,
The change is in v17.1 and v16.5, but the release process for those is
in progress right now, getting them out to the packaging teams.
Hopefully new Windows installer packages will appear in a few days or
so?


From: "Haifang Wang (Centific Technologies Inc)" <v-haiwang(at)microsoft(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Sandeep Thakkar <sandeep(dot)thakkar(at)enterprisedb(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Rahul Pandey <pandeyrah(at)microsoft(dot)com>, Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>, Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com>, Amy Wishnousky <amyw(at)microsoft(dot)com>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Shweta Gulati <gulatishweta(at)microsoft(dot)com>, Ashish Nawal <nawalashish(at)microsoft(dot)com>
Subject: RE: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607
Date: 2024-11-12 00:50:06
Message-ID: SJ0PR21MB20226058B6104B28F7775274E5592@SJ0PR21MB2022.namprd21.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Great to hear that, Thomas. Looking forward to hearing from your once the new release is available.

Thanks!
Haifang

-----Original Message-----
From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Sent: Monday, November 11, 2024 4:43 PM
To: Haifang Wang (Centific Technologies Inc) <v-haiwang(at)microsoft(dot)com>
Cc: Sandeep Thakkar <sandeep(dot)thakkar(at)enterprisedb(dot)com>; Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>; Andrew Dunstan <andrew(at)dunslane(dot)net>; Rahul Pandey <pandeyrah(at)microsoft(dot)com>; Vishwa Deepak <Vishwa(dot)Deepak(at)microsoft(dot)com>; Shawn Steele <Shawn(dot)Steele(at)microsoft(dot)com>; Amy Wishnousky <amyw(at)microsoft(dot)com>; pgsql-bugs(at)lists(dot)postgresql(dot)org; Shweta Gulati <gulatishweta(at)microsoft(dot)com>; Ashish Nawal <nawalashish(at)microsoft(dot)com>
Subject: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607

On Tue, Nov 12, 2024 at 12:12 PM Haifang Wang (Centific Technologies
Inc) <v-haiwang(at)microsoft(dot)com> wrote:
> Just would like to follow up with you about this bug. Is the bug fix available? We noticed that there is a new build 17.0-1. However the issue repro on that build too.

Hi Haifang,
The change is in v17.1 and v16.5, but the release process for those is in progress right now, getting them out to the packaging teams.
Hopefully new Windows installer packages will appear in a few days or so?