Re: Fwd: 8.0 Beta3 worked, RC1 didn't!

Lists: pgsql-hackers-win32
From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers-win32(at)postgresql(dot)org
Subject: Fwd: 8.0 Beta3 worked, RC1 didn't!
Date: 2004-12-24 15:00:56
Message-ID: 8122.1103900456@sss.pgh.pa.us

Forwarding the attached in case anyone missed it on -general.

The shmem attach address shown in his messages (00DC0000) seems mighty
low. What I am suspecting is:
  1. Postmaster boots, creates shmem, and for some idiotic reason
     2003 Server creates the shmem segment just above the end of
     regular memory.
  2. When subprocesses launch and re-read GUC settings, for one
     reason or another they use up a little more RAM than the
     postmaster did.
  3. Subprocesses fail to attach to shmem because the target
     address is now in their regular RAM range.

I don't know why 2003 Server has such a brain-dead choice of shmem
address assignment, nor why listen_addresses might prompt a little extra
growth of RAM usage. But the theory seems to fit the available facts.

If this is correct then we have to do something to force a smarter
choice of shmem address on Windows. One brute-force way to do it
might be to malloc a couple hundred K just before the postmaster
attaches to shmem, and then release?
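
To make the theory and the brute-force remedy concrete, here is a minimal toy model, not PostgreSQL code. Everything in it is an assumption made for illustration: the `ToyProcess` class, the 64 KiB placement granularity, the starting heap end of 0x00DB8000, and the idea that the OS places a new segment at the first aligned address above private memory. Only the address 00DC0000 comes from the log excerpt in the forwarded message.

```python
from dataclasses import dataclass

ALLOC_GRANULARITY = 0x10000   # assumed 64 KiB placement granularity
INITIAL_HEAP_END = 0x00DB8000  # hypothetical end of private memory after startup


def round_up(addr: int, step: int = ALLOC_GRANULARITY) -> int:
    """Round addr up to the next multiple of step."""
    return (addr + step - 1) // step * step


@dataclass
class ToyProcess:
    """Toy address space: private memory occupies [0, heap_end)."""
    heap_end: int

    def use_private_memory(self, nbytes: int) -> None:
        # A negative value models releasing memory again.
        self.heap_end += nbytes

    def first_free_address(self) -> int:
        # Assumed OS policy: put a new segment just above private memory.
        return round_up(self.heap_end)

    def can_attach_at(self, addr: int) -> bool:
        # A fixed-address attach fails if private memory already covers addr.
        return addr >= self.heap_end


def child_attach_succeeds(child_extra: int, pad: int = 0) -> bool:
    """Model one postmaster start followed by one subprocess attach."""
    postmaster = ToyProcess(heap_end=INITIAL_HEAP_END)
    postmaster.use_private_memory(pad)        # optional temporary padding
    shmem_addr = postmaster.first_free_address()
    postmaster.use_private_memory(-pad)       # release the padding again

    child = ToyProcess(heap_end=INITIAL_HEAP_END)
    child.use_private_memory(child_extra)     # e.g. extra GUC-related growth
    return child.can_attach_at(shmem_addr)
```

In this model a child that grows by 0x1000 bytes still attaches, a child that grows by 0x9000 bytes does not, and padding the postmaster by 200 KiB before creating the segment moves the segment high enough for the larger child to attach. Whether 2003 Server really places segments this way is exactly the open question in this message.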

Theory B is that somehow UsedShmemSegAddr is not being passed down
accurately in this case, but that seems a mite improbable.

regards, tom lane

------- Forwarded Message

Date: 23 Dec 2004 08:33:12 -0800
From: nico(at)def2shoot(dot)com (Nicolas COUSSEMACQ)
To: pgsql-general(at)postgresql(dot)org
Subject: [GENERAL] 8.0 Beta3 worked, RC1 didn't!

I have the same problem!

When I set up Postgres 8.0 Beta 4 on Windows XP or 2003 Server, it works
perfectly with the listen_addresses parameter set to '*' or localhost.
I have been testing Beta5, RC1 and RC2 on my XP workstation and there is no
problem, even if I accept external connections (listen_addresses = '*').
Then I tried to set up Beta5, RC1 or RC2 on a machine running 2003 Server, and
I can only access the database when listen_addresses = localhost. If I set
listen_addresses = '*', I get a connection problem in PgAdmin saying "Could
not receive server response to SSL negotiation packet: Connection reset by
peer (0x00002746/10054)". It happens when I launch PgAdmin while logged on
directly at the machine, when I'm connected through remote access, and even
from my XP workstation.
The log file contains many lines such as these:
2004-12-23 16:55:17 FATAL: could not attach to proper memory at fixed
address: shmget(key=5432001, addr=00DC0000) failed: Invalid argument
2004-12-23 16:55:17 FATAL: could not attach to proper memory at fixed
address: shmget(key=5432001, addr=00DC0000) failed: Invalid argument
2004-12-23 16:55:17 LOG: background writer process (PID 680) exited with
exit code 0
2004-12-23 16:55:17 LOG: terminating any other active server processes
2004-12-23 16:55:17 LOG: all server processes terminated; reinitializing

If I switch the listen_addresses parameter back to localhost, I can connect
to the DB in PgAdmin from the server console or through remote access.

Does this information help you?

"A. Mous" <a(dot)mous(at)shaw(dot)ca> wrote in message
news:000801c4e7d1$058c5300$6500a8c0(at)PETER(dot)(dot)(dot)
> Hi all,
>
> I'm using psql 8.0.0 on a client's site who's running win server 2003.
> We've had him on beta 3 for some time, and no problems at all (yes, in a
> sense, he is a beta tester as well, but doesn't know it!). Today I tried to
> upgrade the db to RC1 and had some problems.
>
> Remote clients connect to this database, so I have to set
> listen_addresses = '*' in the postgresql.conf file. This is the only change
> to the config file. Doing this with RC1 and trying to connect locally
> through psql resulted in the following error message:
>
> "could not receive server response to SSL negotiation packet; connection
> reset by peer (0x00002746/10054)"
>
> Removing the modified line in the config file resolved the problem
> (locally), however, no clients can connect! Beta 3 does not seem to have
> this issue, so we had to revert back to it for now.
>
> I would appreciate any ideas that some of you may have. Much thanks,
>
> -Peter

------- End of Forwarded Message


From: Gary Doades <gpd(at)gpdnet(dot)co(dot)uk>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers-win32(at)postgresql(dot)org
Subject: Re: Fwd: 8.0 Beta3 worked, RC1 didn't!
Date: 2004-12-24 15:50:17
Message-ID: 41CC3AB9.5060805@gpdnet.co.uk
Lists: pgsql-hackers-win32

AFAIK Win32 does not care where in private process address space the
"shared memory" segment is. It can be mapped to different addresses in
different processes and still share the same physical address space.
This is why Win32 puts the private shared address anywhere in its own
address space, because it doesn't matter.

All that is needed is to create a *named* memory mapped segment of a
particular size and get other processes to map the same name with the
same segment size, and it automagically works.
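
The sharing behaviour described here can be sketched in a few lines. This is only an illustration under stated assumptions: a temporary file stands in for the named segment, and two independent mappings inside one process stand in for two separate processes. The function name `share_through_two_mappings` and the one-page segment size are made up for the example.

```python
import mmap
import tempfile

SEGMENT_SIZE = mmap.PAGESIZE  # hypothetical segment size: one page


def share_through_two_mappings(payload: bytes) -> bytes:
    """Write payload through one view and read it back through another."""
    with tempfile.TemporaryFile() as backing:
        backing.truncate(SEGMENT_SIZE)            # give the segment its size
        # Two separate mappings of the same object. The OS is free to
        # place each one at a different virtual address.
        view_a = mmap.mmap(backing.fileno(), SEGMENT_SIZE)
        view_b = mmap.mmap(backing.fileno(), SEGMENT_SIZE)
        try:
            view_a[:len(payload)] = payload       # "process A" writes
            return bytes(view_b[:len(payload)])   # "process B" reads
        finally:
            view_a.close()
            view_b.close()
```

The bytes written through the first view are visible through the second, because both views are backed by the same pages regardless of where each one was placed.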

If you try to force it to any particular private process address you may
fail as you don't always know where program code (DLLs etc.) may be loaded.

Cheers,
Gary.



From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Gary Doades <gpd(at)gpdnet(dot)co(dot)uk>
Cc: pgsql-hackers-win32(at)postgresql(dot)org
Subject: Re: Fwd: 8.0 Beta3 worked, RC1 didn't!
Date: 2004-12-24 16:03:31
Message-ID: 8690.1103904211@sss.pgh.pa.us
Lists: pgsql-hackers-win32

Gary Doades <gpd(at)gpdnet(dot)co(dot)uk> writes:
> AFAIK Win32 does not care where in private process address space the
> "shared memory" segment is. It can be mapped to different addresses in
> different processes and still share the same physical address space.
> This is why Win32 puts the private shared address anywhere in its own
> address space, because it doesn't matter.

Win32 may not care, but we do. The shared memory segment must be mapped
at the same address in every backend.

> If you try to force it to any particular private process address you may
> fail as you don't always know where program code (DLLs etc.) may be loaded.

This is (or ought to be) irrelevant, because we are only talking about
instances of a single executable.

regards, tom lane


From: Gary Doades <gpd(at)gpdnet(dot)co(dot)uk>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers-win32(at)postgresql(dot)org
Subject: Re: Fwd: 8.0 Beta3 worked, RC1 didn't!
Date: 2004-12-24 16:15:30
Message-ID: 41CC40A2.5020201@gpdnet.co.uk
Lists: pgsql-hackers-win32

Tom Lane wrote:
> Gary Doades <gpd(at)gpdnet(dot)co(dot)uk> writes:
>
>>AFAIK Win32 does not care where in private process address space the
>>"shared memory" segment is. It can be mapped to different addresses in
>>different processes and still share the same physical address space.
>>This is why Win32 puts the private shared address anywhere in its own
>>address space, because it doesn't matter.
>
>
> Win32 may not care, but we do. The shared memory segment must be mapped
> at the same address in every backend.

Forgive me for not knowing the internals of postgres, but why? As long
as all the shared memory is accessed from the same relative offsets from
the private starting address it will refer to the same physical shared
memory address and should work.

Is this to maintain compatibility with the other platforms' way of doing
things, or is it required by the postgres internal architecture?

If this is the case then your suggestion may be the only one, to
artificially bump up the first free address and hope that it is enough.
Seems a bit hit and miss though (probably more hit than miss) since it's
not easily known what the extra allocation for the subsequent backends
may be.

>>If you try to force it to any particular private process address you may
>>fail as you don't always know where program code (DLLs etc.) may be loaded.
>
>
> This is (or ought to be) irrelevant, because we are only talking about
> instances of a single executable.
>
Agreed, as long as you can't have code dynamically linked from one
backend, but not another.

Cheers,
Gary.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Gary Doades <gpd(at)gpdnet(dot)co(dot)uk>
Cc: pgsql-hackers-win32(at)postgresql(dot)org
Subject: Re: Fwd: 8.0 Beta3 worked, RC1 didn't!
Date: 2004-12-24 16:30:17
Message-ID: 8916.1103905817@sss.pgh.pa.us
Lists: pgsql-hackers-win32

Gary Doades <gpd(at)gpdnet(dot)co(dot)uk> writes:
> Tom Lane wrote:
>> Win32 may not care, but we do. The shared memory segment must be mapped
>> at the same address in every backend.

> Forgive me for not knowing the internals of postgres, but why? As long
> as all the shared memory is accessed from the same relative offsets from
> the private starting address it will refer to the same physical shared
> memory address and should work.

Because we use absolute addresses in many cases. There was once a
convention of making everything relative to ShmemBase, but we've
abandoned that for reasons of code simplicity (and to a lesser extent
performance). There are still some places using relative offsets but
they are gradually going away. We are not reversing that decision
just because some flavors of Windows have stupid algorithms for
assigning default shmem addresses.
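
The difference between the two conventions can be shown with a small sketch. It is an illustration only, not PostgreSQL code: a temporary file stands in for the shared segment, two mappings in one process stand in for two backends, and the names `compare_pointer_styles`, `RECORD_OFFSET` and `PAYLOAD` are hypothetical. A writer stores both an absolute address and a base-relative offset for the same record, and a reader with a different mapping address checks which of the two is still usable.

```python
import ctypes
import mmap
import struct
import tempfile

RECORD_OFFSET = 64          # hypothetical position of a record in the segment
PAYLOAD = b"shared-record"  # hypothetical record contents


def compare_pointer_styles():
    """Return (offset_ok, absolute_ok) as seen through the second mapping."""
    size = mmap.PAGESIZE
    with tempfile.TemporaryFile() as backing:
        backing.truncate(size)
        view_a = mmap.mmap(backing.fileno(), size)
        view_b = mmap.mmap(backing.fileno(), size)
        # ctypes anchors reveal the address at which each view is mapped.
        anchor_a = ctypes.c_char.from_buffer(view_a)
        anchor_b = ctypes.c_char.from_buffer(view_b)
        try:
            base_a = ctypes.addressof(anchor_a)
            base_b = ctypes.addressof(anchor_b)

            # Writer (view A): store the record plus two links to it.
            view_a[RECORD_OFFSET:RECORD_OFFSET + len(PAYLOAD)] = PAYLOAD
            view_a[0:8] = struct.pack("<Q", base_a + RECORD_OFFSET)  # absolute
            view_a[8:16] = struct.pack("<Q", RECORD_OFFSET)          # relative

            # Reader (view B): the segment is mapped at a different base.
            absolute, offset = struct.unpack("<QQ", view_b[0:16])
            offset_ok = ctypes.string_at(base_b + offset, len(PAYLOAD)) == PAYLOAD
            absolute_ok = base_b <= absolute < base_b + size
            return offset_ok, absolute_ok
        finally:
            # Release the buffer exports before closing the mappings.
            del anchor_a, anchor_b
            view_a.close()
            view_b.close()
```

The relative offset resolves correctly from the reader's base, while the stored absolute address falls outside the reader's mapping. Absolute addresses therefore only work if every process maps the segment at the same address, which is the requirement stated above.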

> If this is the case then your suggestion may be the only one, to
> artificially bump up the first free address and hope that it is enough.
> Seems a bit hit and miss though (probably more hit than miss) since it's
> not easily known what the extra allocation for the subsequent backends
> may be.

The needed extra allocation should really be *zero*. Keep in mind that
the intention of the EXEC_BACKEND code is to emulate the Unix case where
backends are spawned by fork(). Therefore the state of the backend at
the point where it needs to attach to shmem should really be hardly at
all different from the state of the postmaster. I'm moderately
interested to find out why changing listen_addresses seems to affect
this, but on the strength of the available evidence I'd suspect it's a
matter of just a few bytes that happens to exceed an allocation
boundary.

It might be that we could solve the problem by rethinking the order of
operations --- maybe we should reattach to shared memory during
restore_backend_variables, before the exec'd backend has had a chance to
do much of anything.
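
Why the order of operations could matter can be sketched with a toy first-fit model of a backend's address space. This is not PostgreSQL code and makes several assumptions: the allocator policy, the segment size, the initial memory layout, and the names `first_fit` and `backend_startup` are all hypothetical. Only the address 00DC0000 is taken from the log excerpt in this thread.

```python
SHMEM_ADDR = 0x00DC0000        # address from the log excerpt in this thread
SHMEM_SIZE = 0x00100000        # hypothetical segment size
INITIAL_HEAP_END = 0x00DB8000  # hypothetical memory in use right after exec


def first_fit(regions, size):
    """Lowest address where [addr, addr + size) overlaps no used region."""
    addr = 0
    for start, end in sorted(regions):
        if addr + size <= start:
            break
        addr = max(addr, end)
    return addr


def backend_startup(attach_first: bool, private_bytes: int) -> bool:
    """Return True if the fixed-address attach succeeds in this toy model."""
    regions = [(0, INITIAL_HEAP_END)]

    def attach() -> bool:
        lo, hi = SHMEM_ADDR, SHMEM_ADDR + SHMEM_SIZE
        free = all(hi <= start or end <= lo for start, end in regions)
        if free:
            regions.append((lo, hi))   # the range is now reserved
        return free

    def allocate() -> None:
        addr = first_fit(regions, private_bytes)
        regions.append((addr, addr + private_bytes))

    if attach_first:
        ok = attach()
        allocate()   # later allocations are steered around the segment
    else:
        allocate()   # may land on the range the segment needs
        ok = attach()
    return ok
```

In this model, allocating 0x9000 bytes before attaching occupies the target range and the attach fails, while attaching first succeeds and the same allocation simply lands above the segment.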

regards, tom lane


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers-win32(at)postgresql(dot)org
Subject: Re: Fwd: 8.0 Beta3 worked, RC1 didn't!
Date: 2004-12-24 17:37:49
Message-ID: 200412241737.iBOHbnO00708@candle.pha.pa.us
Lists: pgsql-hackers-win32

Tom Lane wrote:
> Forwarding the attached in case anyone missed it on -general.
>
> The shmem attach address shown in his messages (00DC0000) seems mighty
> low. What I am suspecting is:
> 1. Postmaster boots, creates shmem, and for some idiotic reason
> 2003 Server creates the shmem segment just above the end of
> regular memory.
> 2. When subprocesses launch and re-read GUC settings, for one
> reason or another they use up a little more RAM than the
> postmaster did.
> 3. Subprocesses fail to attach to shmem because the target
> address is now in their regular RAM range.
>
> I don't know why 2003 Server has such a brain-dead choice of shmem
> address assignment, nor why listen_addresses might prompt a little extra
> growth of RAM usage. But the theory seems to fit the available facts.
>
> If this is correct then we have to do something to force a smarter
> choice of shmem address on Windows. One brute-force way to do it
> might be to malloc a couple hundred K just before the postmaster
> attaches to shmem, and then release?
>
> Theory B is that somehow UsedShmemSegAddr is not being passed down
> accurately in this case, but that seems a mite improbable.

I am confused. I thought we used a hard-coded location for shared
memory on Win32.

I thought it was 00xDB0000 something but I can't find any mention of
that. Was it removed? Are we now starting the postgres.exe binary and
assuming we can map to the same shared memory address as postmaster.exe?

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: pgsql-hackers-win32(at)postgresql(dot)org
Subject: Re: Fwd: 8.0 Beta3 worked, RC1 didn't!
Date: 2004-12-24 18:28:03
Message-ID: 9865.1103912883@sss.pgh.pa.us
Lists: pgsql-hackers-win32

Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> writes:
> I thought it was 00xDB0000 something but I can't find any mention of
> that. Was it removed? Are we now starting the postgres.exe binary and
> assuming we can map to the same shared memory address as postmaster.exe?

Looks that way to me; and I think it considerably safer than using any
hard-wired address. My current feeling is that the problem stems from
waiting too long to reattach to shared memory, and that we ought to do
that as soon as we can read the shmem address info from the temp file.

Just had a thought ... is it possible that this problem was introduced
by the recent changes to pass backend variables in shared memory instead
of in a temp file? ISTM fairly possible that mapping that memory is
going to interfere with where we need to map the main shared memory
block. I see that it gets unmapped after being read, but maybe the
damage is already done.

regards, tom lane