Re: Premature timeout on git.postgresql.org?

Lists: pgsql-www
From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-www(at)lists(dot)postgresql(dot)org
Subject: Premature timeout on git.postgresql.org?
Date: 2022-02-19 18:36:58
Message-ID: 75991.1645295818@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: Postg메이저 토토 사이트SQL : Postg메이저 토토 사이트SQL 메일 링리스트 : 2022-02-19 18:36 이후 PGSQL www

Trying to check out from the git mirror on a machine with a
slow internet connection, I repeatably get

$ time git clone https://git.postgresql.org/git/postgresql.git pgsql
Cloning into 'pgsql'...
remote: Enumerating objects: 14523, done.
remote: Counting objects: 100% (14523/14523), done.
remote: Compressing objects: 100% (9482/9482), done.
error: RPC failed; curl 92 HTTP/2 stream 3 was not closed cleanly before end of the underlying stream
fetch-pack: unexpected disconnect while reading sideband packet
fatal: early EOF
fatal: fetch-pack: invalid index-pack output
364.36 real 0.02 user 0.08 sys

It gets almost but not quite to the end of the data-transfer stage
before failing. I think I've seen this symptom before, and IIRC
we traced it to a poorly-chosen timeout somewhere in the PG web
infrastructure. I can't find a record of that in the archives,
though; the closest thing is [1] which is a somewhat different
symptom.

regards, tom lane

[1] /message-id/flat/CAGuFTBUrqO8kK_J2URKpWwathQSunZU6VtqMSrJnEt8F0qZaTg%40mail.gmail.com


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-www(at)lists(dot)postgresql(dot)org
Subject: Re: Premature timeout on git.postgresql.org?
Date: 2022-02-20 14:52:48
Message-ID: YhJVwDSNE3q7oVRM@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-www

On Sat, Feb 19, 2022 at 01:36:58PM -0500, Tom Lane wrote:
> Trying to check out from the git mirror on a machine with a
> slow internet connection, I repeatably get
>
> $ time git clone https://git.postgresql.org/git/postgresql.git pgsql
> Cloning into 'pgsql'...
> remote: Enumerating objects: 14523, done.
> remote: Counting objects: 100% (14523/14523), done.
> remote: Compressing objects: 100% (9482/9482), done.
> error: RPC failed; curl 92 HTTP/2 stream 3 was not closed cleanly before end of the underlying stream
> fetch-pack: unexpected disconnect while reading sideband packet
> fatal: early EOF
> fatal: fetch-pack: invalid index-pack output
> 364.36 real 0.02 user 0.08 sys
>
> It gets almost but not quite to the end of the data-transfer stage
> before failing. I think I've seen this symptom before, and IIRC
> we traced it to a poorly-chosen timeout somewhere in the PG web
> infrastructure. I can't find a record of that in the archives,
> though; the closest thing is [1] which is a somewhat different
> symptom.

I did a git clone from https://git.postgresql.org/git/postgresql.git on
Friday and just now and was only seeing 1.15 MiB/s, which seemed low on
my 100Mb connection.

--
Bruce Momjian <bruce(at)momjian(dot)us> https://momjian.us
EDB https://enterprisedb.com

If only the physical world exists, free will is an illusion.


From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-www(at)lists(dot)postgresql(dot)org
Subject: Re: Premature timeout on git.postgresql.org?
Date: 2022-02-20 15:07:49
Message-ID: CABUevEwBXJHQo8BAnGSY3TxTqPwdm4POo+ypm8Qr5E=BhrnQWA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-www

On Sun, Feb 20, 2022 at 3:52 PM Bruce Momjian <bruce(at)momjian(dot)us> wrote:
>
> On Sat, Feb 19, 2022 at 01:36:58PM -0500, Tom Lane wrote:
> > Trying to check out from the git mirror on a machine with a
> > slow internet connection, I repeatably get
> >
> > $ time git clone https://git.postgresql.org/git/postgresql.git pgsql
> > Cloning into 'pgsql'...
> > remote: Enumerating objects: 14523, done.
> > remote: Counting objects: 100% (14523/14523), done.
> > remote: Compressing objects: 100% (9482/9482), done.
> > error: RPC failed; curl 92 HTTP/2 stream 3 was not closed cleanly before end of the underlying stream
> > fetch-pack: unexpected disconnect while reading sideband packet
> > fatal: early EOF
> > fatal: fetch-pack: invalid index-pack output
> > 364.36 real 0.02 user 0.08 sys
> >
> > It gets almost but not quite to the end of the data-transfer stage
> > before failing. I think I've seen this symptom before, and IIRC
> > we traced it to a poorly-chosen timeout somewhere in the PG web
> > infrastructure. I can't find a record of that in the archives,
> > though; the closest thing is [1] which is a somewhat different
> > symptom.

Hmm. That's interesting. Do you see it repeatable now? Or is it
possible it was a temporary network flakiness?

How long (ish) does it take from the start of the clone until it fails?

> I did a git clone from https://git.postgresql.org/git/postgresql.git on
> Friday and just now and was only seeing 1.15 MiB/s, which seemed low on
> my 100Mb connection.

I get about 50MB/500Mbit per second when I clone from it. The git
server is physically located in Amsterdam, maybe you have a limited
transit performance there (or it has to you).

Can you, as well, try again now and see if you see the same level of
performance now?

--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/


From: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
To: Magnus Hagander <magnus(at)hagander(dot)net>, Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-www(at)lists(dot)postgresql(dot)org
Subject: Re: Premature timeout on git.postgresql.org?
Date: 2022-02-20 16:31:23
Message-ID: 6d164f4c-b89d-5e70-e064-f49c697ba066@aklaver.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: Postg스포츠 토토 베트맨SQL : Postg스포츠 토토 베트맨SQL 메일 링리스트 : 2022-02-20 이후 PGSQL WWW 16:31

On 2/20/22 07:07, Magnus Hagander wrote:
> On Sun, Feb 20, 2022 at 3:52 PM Bruce Momjian <bruce(at)momjian(dot)us> wrote:
>>

>
> I get about 50MB/500Mbit per second when I clone from it. The git
> server is physically located in Amsterdam, maybe you have a limited
> transit performance there (or it has to you).
>
> Can you, as well, try again now and see if you see the same level of
> performance now?
>

For a point of reference I just cloned on a Linode instance in Fremont
CA. For the Postgres repo I got 2.25 MiB/s. Trying some other project
repos I got 25-27 MiB/s

--
Adrian Klaver
adrian(dot)klaver(at)aklaver(dot)com


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
Cc: Magnus Hagander <magnus(at)hagander(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-www(at)lists(dot)postgresql(dot)org
Subject: Re: Premature timeout on git.postgresql.org?
Date: 2022-02-20 16:54:26
Message-ID: YhJyQu47/I1j6D1f@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-www

On Sun, Feb 20, 2022 at 08:31:23AM -0800, Adrian Klaver wrote:
> On 2/20/22 07:07, Magnus Hagander wrote:
> > On Sun, Feb 20, 2022 at 3:52 PM Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> > >
>
> >
> > I get about 50MB/500Mbit per second when I clone from it. The git
> > server is physically located in Amsterdam, maybe you have a limited
> > transit performance there (or it has to you).
> >
> > Can you, as well, try again now and see if you see the same level of
> > performance now?
> >
>
> For a point of reference I just cloned on a Linode instance in Fremont CA.
> For the Postgres repo I got 2.25 MiB/s. Trying some other project repos I
> got 25-27 MiB/s

My transfer number fluctuated from 800kB to 2.2MB.

--
Bruce Momjian <bruce(at)momjian(dot)us> https://momjian.us
EDB https://enterprisedb.com

If only the physical world exists, free will is an illusion.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, pgsql-www(at)lists(dot)postgresql(dot)org
Subject: Re: Premature timeout on git.postgresql.org?
Date: 2022-02-20 18:34:56
Message-ID: 301685.1645382096@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-www

Bruce Momjian <bruce(at)momjian(dot)us> writes:
> On Sun, Feb 20, 2022 at 08:31:23AM -0800, Adrian Klaver wrote:
>> For a point of reference I just cloned on a Linode instance in Fremont CA.
>> For the Postgres repo I got 2.25 MiB/s. Trying some other project repos I
>> got 25-27 MiB/s

> My transfer number fluctuated from 800kB to 2.2MB.

Yeah, I'd supposed that the el-cheapo ethernet dongle I'm using on
florican's host was the source of the crummy performance, but it is not,
or at least not all of it: there is something rotten between here and
Amsterdam. Trying it now on my primary workstation, I can clone from
the github mirror at a more or less respectable speed:

$ time git clone https://github.com/postgres/postgres.git pgsql-github
Cloning into 'pgsql-github'...
remote: Enumerating objects: 879421, done.
remote: Counting objects: 100% (240/240), done.
remote: Compressing objects: 100% (185/185), done.
remote: Total 879421 (delta 111), reused 82 (delta 55), pack-reused 879181
Receiving objects: 100% (879421/879421), 616.24 MiB | 33.59 MiB/s, done.
Resolving deltas: 100% (721439/721439), done.

real 0m36.548s
user 0m59.798s
sys 0m2.830s

but not so much from our official repo:

$ time git clone https://git.postgresql.org/git/postgresql.git pgsql-pg
Cloning into 'pgsql-pg'...
remote: Enumerating objects: 14782, done.
remote: Counting objects: 100% (14782/14782), done.
remote: Compressing objects: 100% (9581/9581), done.
remote: Total 879421 (delta 9965), reused 6933 (delta 5125), pack-reused 864639
Receiving objects: 100% (879421/879421), 289.26 MiB | 1.40 MiB/s, done.
Resolving deltas: 100% (754313/754313), done.

real 3m48.432s
user 0m54.379s
sys 0m4.514s

The 1.4MB/s average seen here hides a very variable speed: like
Bruce, I saw a speed around 700/800 kB/s to start, and then it
gradually ramped up to something over 2MB/s.

Now in fairness, the ping time to github.com from here is circa 10ms
while the ping time to gothos.postgresql.org is circa 100ms, so
I'd not really expect equivalent performance ... but this seems
seriously bad.

Trying it again on the machine where I originally saw the problem,
cloning from github works:

$ time git clone https://github.com/postgres/postgres.git pgsql-github
Cloning into 'pgsql-github'...
remote: Enumerating objects: 879421, done.
remote: Counting objects: 100% (240/240), done.
remote: Compressing objects: 100% (129/129), done.
remote: Total 879421 (delta 111), reused 235 (delta 111), pack-reused 879181
Receiving objects: 100% (879421/879421), 616.37 MiB | 4.00 MiB/s, done.
Resolving deltas: 100% (721488/721488), done.
Updating files: 100% (6058/6058), done.
318.52 real 332.96 user 26.12 sys

It's slow, but I expected that. I then tried twice to clone from
git.postgresql.org. The first try succeeded:

$ time git clone https://git.postgresql.org/git/postgresql.git pgsql-pg
Cloning into 'pgsql-pg'...
remote: Enumerating objects: 14782, done.
remote: Counting objects: 100% (14782/14782), done.
remote: Compressing objects: 100% (9581/9581), done.
remote: Total 879421 (delta 9966), reused 6933 (delta 5125), pack-reused 864639
Receiving objects: 100% (879421/879421), 289.20 MiB | 862.00 KiB/s, done.
Resolving deltas: 100% (754314/754314), done.
Updating files: 100% (6058/6058), done.
502.96 real 307.15 user 15.66 sys

The second try got the same results as yesterday:

$ time git clone https://git.postgresql.org/git/postgresql.git pgsql-pg
Cloning into 'pgsql-pg'...
remote: Enumerating objects: 14782, done.
remote: Counting objects: 100% (14782/14782), done.
remote: Compressing objects: 100% (9581/9581), done.
[ I hit return a few extra times to capture state, last printout was this: ]
Receiving objects: 99% (876395/879421), 285.62 MiB | 1.19 MiB/s
error: RPC failed; curl 92 HTTP/2 stream 3 was not closed cleanly before end of the underlying stream
fetch-pack: unexpected disconnect while reading sideband packet
fatal: early EOF
fatal: fetch-pack: invalid index-pack output
323.67 real 0.00 user 0.11 sys

(The instantaneous transfer speed varied from circa 1.3MB/s to
as low as a couple hundred kB/s.)

I doubt there's much we can do at the project level about the
poor transfer speed of the transatlantic link. However, the
fact that it's failing outright looks to me to be due to a
timeout of 5 or possibly 6 minutes that's breaking the "sideband"
connection. That probably is within our control.

regards, tom lane


From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>, pgsql-www(at)lists(dot)postgresql(dot)org
Subject: Re: Premature timeout on git.postgresql.org?
Date: 2022-02-21 11:01:33
Message-ID: CABUevEwP4GqCOOf2bKgx5xguGj2wMO+0QN1bazz8Es9HN==Stw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-www

On Sun, Feb 20, 2022 at 7:35 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> Bruce Momjian <bruce(at)momjian(dot)us> writes:
> > On Sun, Feb 20, 2022 at 08:31:23AM -0800, Adrian Klaver wrote:
> >> For a point of reference I just cloned on a Linode instance in Fremont CA.
> >> For the Postgres repo I got 2.25 MiB/s. Trying some other project repos I
> >> got 25-27 MiB/s
>
> > My transfer number fluctuated from 800kB to 2.2MB.
>
> Yeah, I'd supposed that the el-cheapo ethernet dongle I'm using on
> florican's host was the source of the crummy performance, but it is not,
> or at least not all of it: there is something rotten between here and
> Amsterdam. Trying it now on my primary workstation, I can clone from
> the github mirror at a more or less respectable speed:

github does have a global CDN distributing things though, and it's
very likely the postgres entries were cached locally :)

<snip>

> The 1.4MB/s average seen here hides a very variable speed: like
> Bruce, I saw a speed around 700/800 kB/s to start, and then it
> gradually ramped up to something over 2MB/s.
>
> Now in fairness, the ping time to github.com from here is circa 10ms
> while the ping time to gothos.postgresql.org is circa 100ms, so
> I'd not really expect equivalent performance ... but this seems
> seriously bad.

It is definitely worse than I would've thought -- maybe I'm clouded by
it being fast enough while in Europe.

<snip>

> The second try got the same results as yesterday:

Actually, they don't seem to be the same results as yesterday AFAICT?

Yesterday it failed under "Compressing objects", which is before it
even tries to send anything across the network, which would indicate a
host issue on the git server (which is.. weird). This time it's
failing in Receiving objects, which is when it's actually, well,
receiving objects. It probably failed just as it started, so it is
likely to be the same underlying problem.

> $ time git clone https://git.postgresql.org/git/postgresql.git pgsql-pg
> Cloning into 'pgsql-pg'...
> remote: Enumerating objects: 14782, done.
> remote: Counting objects: 100% (14782/14782), done.
> remote: Compressing objects: 100% (9581/9581), done.
> [ I hit return a few extra times to capture state, last printout was this: ]
> Receiving objects: 99% (876395/879421), 285.62 MiB | 1.19 MiB/s
> error: RPC failed; curl 92 HTTP/2 stream 3 was not closed cleanly before end of the underlying stream
> fetch-pack: unexpected disconnect while reading sideband packet
> fatal: early EOF
> fatal: fetch-pack: invalid index-pack output
> 323.67 real 0.00 user 0.11 sys
>
> (The instantaneous transfer speed varied from circa 1.3MB/s to
> as low as a couple hundred kB/s.)
>
> I doubt there's much we can do at the project level about the
> poor transfer speed of the transatlantic link. However, the

We'll check in with our hosting provider on this topic and see if they
have something to say.

I did double check a run from a US server myself now, and it completes
in about 2 minutes so it's much faster than yours -- but much slower
than Europe. I also did some metrics during this time and there is
close to zero load on the server itself, so it must be somewhere in
the transit

> fact that it's failing outright looks to me to be due to a
> timeout of 5 or possibly 6 minutes that's breaking the "sideband"
> connection. That probably is within our control.

Yeah, that should be, if it is on our server.

However, I'm not really sure that's where the problem is. I can get
through an artificially slow clone from the US west coast:

$ time trickle -d 250 git clone https://git.postgresql.org/git/postgresql.git
trickle: Could not reach trickled, working independently: No such file
or directory
Cloning into 'postgresql'...
remote: Enumerating objects: 14820, done.
remote: Counting objects: 100% (14820/14820), done.
remote: Compressing objects: 100% (9619/9619), done.
remote: Total 879459 (delta 9993), reused 6929 (delta 5125), pack-reused 864639
Receiving objects: 100% (879459/879459), 289.21 MiB | 258.00 KiB/s, done.
Resolving deltas: 100% (754341/754341), done.

real 20m17.918s
user 2m22.406s
sys 0m4.184s

so it seems 20 minutes to transfer it is.. OK. (Per the server logs,
about 18 of those 20 minutes were in the single https call that timed
out for you)

Any chance there's a network device on your local network that might
have a 5 minute timeout?

And -- can you send me (offlist) exactly which IP these requests would
be coming in from, and run one more test today, and I'll see if I can
find some more details in some more verbose logging that's on now.

--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/