From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | pgsql-hackers(at)postgreSQL(dot)org |
Subject: | Reducing buildfarm disk usage: remove temp installs when done |
Date: | 2015-01-18 22:48:11 |
Message-ID: | 18617.1421621291@sss.pgh.pa.us |
Lists: | buildfarm-members pgsql-hackers |
One of the biggest causes of buildfarm run failures is "out of disk
space". That's not just because people are running buildfarm critters
on small slow machines; it's because "make check-world" is an enormous
space hog. Some numbers from current HEAD:
clean source tree: 120MB
built source tree: 400MB
tree after make check-world: 3GB
(This is excluding ~250MB for one's git repo.)
The reason for all the bloat is the temporary install trees that we
create, which tend to eat up about 100MB apiece, and there are dozens
of them (eg, one per testable contrib module). Those don't get removed
until the end of the test run, so the usage is cumulative.
The attached proposed patch removes each temp install tree as soon as
we're done with it, in the normal case where no error was detected.
This brings the peak space usage down from ~3GB to ~750MB.
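A minimal sketch of that cleanup policy, not the patch itself: the helper name `run_with_temp_install` and the `test` callable are hypothetical, and the real change lives in the regression test driver. The point is only that the tree is removed as soon as a run succeeds and kept when it fails, so the failure can still be inspected.

```python
import shutil
import tempfile
from pathlib import Path


def run_with_temp_install(test, keep_on_failure=True):
    """Create a temp install dir, run test(install_dir), clean up on success.

    `test` is a hypothetical callable that returns True when the run passed.
    Returns (ok, path); the path still exists only if the run failed.
    """
    install_dir = Path(tempfile.mkdtemp(prefix="tmp_check_"))
    ok = False
    try:
        ok = bool(test(install_dir))
    finally:
        # Remove the tree right away on success; keep it for debugging on failure.
        if ok or not keep_on_failure:
            shutil.rmtree(install_dir, ignore_errors=True)
    return ok, install_dir
```

Because each tree is gone before the next one is created, peak usage is bounded by one temp install rather than by the sum of all of them.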
To make things better in the buildfarm, we'd have to back-patch this into
all active branches, but I don't see any big problem with doing so.
Any objections?
regards, tom lane
Attachment | Content-Type | Size |
---|---|---|
clean-up-temp-installs-immediately.patch | text/x-diff | 844 bytes |
From: | Michael Paquier <michael(dot)paquier(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Reducing buildfarm disk usage: remove temp installs when done |
Date: | 2015-01-18 23:48:26 |
Message-ID: | CAB7nPqTQDdPxA_4NgDG--oFJvhBhZoDODr2gTYbg6hsu-mmE2g@mail.gmail.com |
On Mon, Jan 19, 2015 at 7:48 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> To make things better in the buildfarm, we'd have to back-patch this into
> all active branches, but I don't see any big problem with doing so.
> Any objections?
Back-patching sounds like a good idea to me. At least this will allow
hamster to build all the active branches.
--
Michael
From: | Andrew Dunstan <andrew(at)dunslane(dot)net> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgreSQL(dot)org |
Subject: | Re: Reducing buildfarm disk usage: remove temp installs when done |
Date: | 2015-01-19 00:39:59 |
Message-ID: | 54BC525F.1090405@dunslane.net |
On 01/18/2015 05:48 PM, Tom Lane wrote:
> One of the biggest causes of buildfarm run failures is "out of disk
> space". That's not just because people are running buildfarm critters
> on small slow machines; it's because "make check-world" is an enormous
> space hog. Some numbers from current HEAD:
>
> clean source tree: 120MB
> built source tree: 400MB
> tree after make check-world: 3GB
>
> (This is excluding ~250MB for one's git repo.)
>
> The reason for all the bloat is the temporary install trees that we
> create, which tend to eat up about 100MB apiece, and there are dozens
> of them (eg, one per testable contrib module). Those don't get removed
> until the end of the test run, so the usage is cumulative.
>
> The attached proposed patch removes each temp install tree as soon as
> we're done with it, in the normal case where no error was detected.
> This brings the peak space usage down from ~3GB to ~750MB.
>
> To make things better in the buildfarm, we'd have to back-patch this into
> all active branches, but I don't see any big problem with doing so.
>
> Any objections?
>
>
I don't have an issue, but you should be aware that the buildfarm
doesn't in fact run "make check-world", and it doesn't do a test install
for each contrib module, since it runs "installcheck", not "check" for
those. It also cleans up some data directories as it goes.
cheers
andrew
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Andrew Dunstan <andrew(at)dunslane(dot)net> |
Cc: | pgsql-hackers(at)postgreSQL(dot)org |
Subject: | Re: Reducing buildfarm disk usage: remove temp installs when done |
Date: | 2015-01-19 02:20:39 |
Message-ID: | 23766.1421634039@sss.pgh.pa.us |
Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
> On 01/18/2015 05:48 PM, Tom Lane wrote:
>> One of the biggest causes of buildfarm run failures is "out of disk
>> space". That's not just because people are running buildfarm critters
>> on small slow machines; it's because "make check-world" is an enormous
>> space hog. Some numbers from current HEAD:
> I don't have an issue, but you should be aware that the buildfarm
> doesn't in fact run "make check-world", and it doesn't do a test install
> for each contrib module, since it runs "installcheck", not "check" for
> those. It also cleans up some data directories as it goes.
Darn. I knew that it didn't use check-world per se, but I'd supposed
it was doing something morally equivalent. But I checked just now and
didn't see the space consumption of the pgsql.build + inst trees going
much above about 750MB, so it's clearly not as bad as "make check-world".
I think the patch I proposed is still worthwhile though, because it
looks like the buildfarm is doing this on a case-by-case basis and
missing some cases: I see the tmp_check directories for pg_upgrade and
test_decoding sticking around till the end of the run. That could
be fixed in the script of course, but why not have pg_regress do it?
Also, investigating space consumption on my actual buildfarm critters,
it seems like there might be some low hanging fruit in terms of git
checkout management. It looks to me like each branch has a git repo
that only shares objects that existed as of the initial cloning, so
that over time each branch eats more and more unshared space. Also
I wonder about the value of keeping around a checked-out tree per
branch and copying it each time rather than just checking out fresh.
What I see on dromedary, which has been around a bit less than a year,
is that the at-rest space consumption for all 6 active branches is
2.4G even though a single copy of the git repo is just over 400MB:
$ du -hsc pgmirror.git HEAD REL*
416M pgmirror.git
363M HEAD
345M REL9_0_STABLE
351M REL9_1_STABLE
354M REL9_2_STABLE
358M REL9_3_STABLE
274M REL9_4_STABLE
2.4G total
It'd presumably be worse on a critter that's existed longer.
Curious to know if you've looked into alternatives here. I realize
that the tradeoffs might be different with an external git repo,
but for one being managed by the buildfarm script, it seems like
we could do better than this space-wise, for (maybe) little time
penalty. I'd be willing to do some experimenting if you don't have
time for it.
regards, tom lane
From: | Andrew Dunstan <andrew(at)dunslane(dot)net> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-hackers(at)postgreSQL(dot)org |
Subject: | Re: Reducing buildfarm disk usage: remove temp installs when done |
Date: | 2015-01-19 04:10:27 |
Message-ID: | 54BC83B3.5030305@dunslane.net |
On 01/18/2015 09:20 PM, Tom Lane wrote:
> Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
>> On 01/18/2015 05:48 PM, Tom Lane wrote:
>>> One of the biggest causes of buildfarm run failures is "out of disk
>>> space". That's not just because people are running buildfarm critters
>>> on small slow machines; it's because "make check-world" is an enormous
>>> space hog. Some numbers from current HEAD:
>> I don't have an issue, but you should be aware that the buildfarm
>> doesn't in fact run "make check-world", and it doesn't do a test install
>> for each contrib module, since it runs "installcheck", not "check" for
>> those. It also cleans up some data directories as it goes.
> Darn. I knew that it didn't use check-world per se, but I'd supposed
> it was doing something morally equivalent. But I checked just now and
> didn't see the space consumption of the pgsql.build + inst trees going
> much above about 750MB, so it's clearly not as bad as "make check-world".
>
> I think the patch I proposed is still worthwhile though, because it
> looks like the buildfarm is doing this on a case-by-case basis and
> missing some cases: I see the tmp_check directories for pg_upgrade and
> test_decoding sticking around till the end of the run. That could
> be fixed in the script of course, but why not have pg_regress do it?
>
> Also, investigating space consumption on my actual buildfarm critters,
> it seems like there might be some low hanging fruit in terms of git
> checkout management. It looks to me like each branch has a git repo
> that only shares objects that existed as of the initial cloning, so
> that over time each branch eats more and more unshared space. Also
> I wonder about the value of keeping around a checked-out tree per
> branch and copying it each time rather than just checking out fresh.
> What I see on dromedary, which has been around a bit less than a year,
> is that the at-rest space consumption for all 6 active branches is
> 2.4G even though a single copy of the git repo is just over 400MB:
>
> $ du -hsc pgmirror.git HEAD REL*
> 416M pgmirror.git
> 363M HEAD
> 345M REL9_0_STABLE
> 351M REL9_1_STABLE
> 354M REL9_2_STABLE
> 358M REL9_3_STABLE
> 274M REL9_4_STABLE
> 2.4G total
>
> It'd presumably be worse on a critter that's existed longer.
>
> Curious to know if you've looked into alternatives here. I realize
> that the tradeoffs might be different with an external git repo,
> but for one being managed by the buildfarm script, it seems like
> we could do better than this space-wise, for (maybe) little time
> penalty. I'd be willing to do some experimenting if you don't have
> time for it.
This isn't happening for me. Here's crake:
[andrew(at)emma root]$ du -shc pgmirror.git/ [RH]*/pgsql
218M pgmirror.git/
149M HEAD/pgsql
134M REL9_0_STABLE/pgsql
138M REL9_1_STABLE/pgsql
140M REL9_2_STABLE/pgsql
143M REL9_3_STABLE/pgsql
146M REL9_4_STABLE/pgsql
1.1G total
Maybe you need some git garbage collection?
An alternative would be to remove the pgsql directory at the end of the
run and thus do a complete fresh checkout each run. As you say it would
cost some time but save some space. At least it would be doable as an
option, not sure I'd want to make it non-optional.
cheers
andrew
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Andrew Dunstan <andrew(at)dunslane(dot)net> |
Cc: | pgsql-hackers(at)postgreSQL(dot)org |
Subject: | Re: Reducing buildfarm disk usage: remove temp installs when done |
Date: | 2015-01-19 05:28:54 |
Message-ID: | 28310.1421645334@sss.pgh.pa.us |
Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
> On 01/18/2015 09:20 PM, Tom Lane wrote:
>> What I see on dromedary, which has been around a bit less than a year,
>> is that the at-rest space consumption for all 6 active branches is
>> 2.4G even though a single copy of the git repo is just over 400MB:
>> $ du -hsc pgmirror.git HEAD REL*
>> 416M pgmirror.git
>> 363M HEAD
>> 345M REL9_0_STABLE
>> 351M REL9_1_STABLE
>> 354M REL9_2_STABLE
>> 358M REL9_3_STABLE
>> 274M REL9_4_STABLE
>> 2.4G total
> This isn't happening for me. Here's crake:
> [andrew(at)emma root]$ du -shc pgmirror.git/ [RH]*/pgsql
> 218M pgmirror.git/
> 149M HEAD/pgsql
> 134M REL9_0_STABLE/pgsql
> 138M REL9_1_STABLE/pgsql
> 140M REL9_2_STABLE/pgsql
> 143M REL9_3_STABLE/pgsql
> 146M REL9_4_STABLE/pgsql
> 1.1G total
> Maybe you need some git garbage collection?
Weird ... for me, dromedary and prairiedog are both showing very similar
numbers. Shouldn't GC be automatic? These machines are not running
latest and greatest git (looks like 1.7.3.1 and 1.7.9.6 respectively),
maybe that has something to do with it?
A fresh clone from git://git.postgresql.org/git/postgresql.git right
now is 167MB (using dromedary's git version), so we're both showing
some bloat over the minimum possible repo size, but it's curious that
mine is so much worse.
But the larger point is that git fetch does not, AFAICT, have the same
kind of optimization that git clone does to do hard-linking when copying
an object from a local source repo. With or without GC, the resulting
duplicative storage is going to be the dominant effect after a while on a
machine tracking a full set of branches.
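To illustrate why hard-linking matters for the totals, here is a small sketch that does not involve git at all. It creates one file, hard-links it into a second directory, copies it into a third, and measures usage the way du does, counting each inode once. The directory names and the helpers `disk_usage` and `demo` are made up for the example.

```python
import os
import shutil
import tempfile
from pathlib import Path


def disk_usage(root):
    """Sum file sizes under root, counting each inode only once (as du does)."""
    seen = set()
    total = 0
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            st = os.lstat(os.path.join(dirpath, name))
            key = (st.st_dev, st.st_ino)
            if key not in seen:
                seen.add(key)
                total += st.st_size
    return total


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        mirror = Path(tmp, "mirror")
        linked = Path(tmp, "linked")
        mirror.mkdir()
        linked.mkdir()
        obj = mirror / "object"
        obj.write_bytes(b"x" * 4096)
        # A hard link shares storage with the original file.
        os.link(obj, linked / "object")
        with_link = disk_usage(tmp)
        # A plain copy gets its own inode and its own storage.
        copied = Path(tmp, "copied")
        copied.mkdir()
        shutil.copy(obj, copied / "object")
        with_copy = disk_usage(tmp)
    return with_link, with_copy
```

The linked file adds nothing to the total, while the copy doubles it. Objects that arrive in each branch repo as separate copies behave like the second case.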
> An alternative would be to remove the pgsql directory at the end of the
> run and thus do a complete fresh checkout each run. As you say it would
> cost some time but save some space. At least it would be doable as an
> option, not sure I'd want to make it non-optional.
What I was thinking is that a complete-fresh-checkout approach would
remove the need for the copy_source step that happens now, thus buying
back at least most of the I/O cost. But that's only considering the
working tree. The real issue here seems to be about having duplicative
git repos ... seems like we ought to be able to avoid that.
regards, tom lane
From: | Andrew Dunstan <andrew(at)dunslane(dot)net> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-hackers(at)postgreSQL(dot)org |
Subject: | Re: Reducing buildfarm disk usage: remove temp installs when done |
Date: | 2015-01-19 14:37:53 |
Message-ID: | 54BD16C1.6070304@dunslane.net |
On 01/19/2015 12:28 AM, Tom Lane wrote:
>> An alternative would be to remove the pgsql directory at the end of the
>> run and thus do a complete fresh checkout each run. As you say it would
>> cost some time but save some space. At least it would be doable as an
>> option, not sure I'd want to make it non-optional.
> What I was thinking is that a complete-fresh-checkout approach would
> remove the need for the copy_source step that happens now, thus buying
> back at least most of the I/O cost. But that's only considering the
> working tree. The real issue here seems to be about having duplicative
> git repos ... seems like we ought to be able to avoid that.
>
>
It won't save a copy in the case of a vpath build, because there's no
copying done then.
But I'm wondering if we should look at using the tricks git-new-workdir
uses, setting up symlinks instead of a full clone. Then we'd have one
clone with a bunch of different work dirs. That plus a bit of explicitly
done garbage collection and possibly a periodic re-clone might do the trick.
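A rough sketch of the symlink approach, under stated assumptions: the function name `new_workdir` is hypothetical, and the list of shared entries is an assumption modelled on what git-new-workdir links, not taken from any buildfarm patch. The new work dir gets a `.git` directory whose contents point back into the main clone, while `HEAD` is copied so each work dir can be on a different branch.

```python
import os
import shutil
from pathlib import Path

# Entries shared with the main repo. This list is an assumption modelled on
# git-new-workdir; adjust it to match the script shipped with your git.
SHARED = ["config", "refs", "logs/refs", "objects", "info", "hooks",
          "packed-refs", "remotes", "rr-cache", "svn"]


def new_workdir(main_repo, workdir):
    """Create workdir/.git whose contents are symlinks into main_repo/.git."""
    main_git = Path(main_repo, ".git").resolve()
    new_git = Path(workdir, ".git")
    # Creates workdir, workdir/.git and workdir/.git/logs in one go.
    (new_git / "logs").mkdir(parents=True)
    for entry in SHARED:
        source = main_git / entry
        if source.exists():
            os.symlink(source, new_git / entry)
    # HEAD is copied, not linked, so each work dir can sit on its own branch.
    shutil.copy(main_git / "HEAD", new_git / "HEAD")
    return new_git
```

After this, checking out the wanted branch inside the new work dir (for example with `git checkout -f <branch>`) populates the files. Since symlinks are involved, this would not carry over to Windows as written.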
cheers
andrew
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Andrew Dunstan <andrew(at)dunslane(dot)net> |
Cc: | pgsql-hackers(at)postgreSQL(dot)org |
Subject: | Re: Reducing buildfarm disk usage: remove temp installs when done |
Date: | 2015-01-19 14:53:46 |
Message-ID: | 25092.1421679226@sss.pgh.pa.us |
Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
> But I'm wondering if we should look at using the tricks git-new-workdir
> uses, setting up symlinks instead of a full clone. Then we'd have one
> clone with a bunch of different work dirs. That plus a bit of explicitly
> done garbage collection and possibly a periodic re-clone might do the trick.
Yeah, I was wondering whether it'd be okay to depend on git-new-workdir.
That would fix the problem pretty nicely. But in the installations I've
seen, that's not in PATH but squirreled away in some hard-to-guess library
directory ...
regards, tom lane
From: | Andrew Dunstan <andrew(at)dunslane(dot)net> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-hackers(at)postgreSQL(dot)org |
Subject: | Re: Reducing buildfarm disk usage: remove temp installs when done |
Date: | 2015-01-19 15:09:15 |
Message-ID: | 54BD1E1B.4050209@dunslane.net |
On 01/19/2015 09:53 AM, Tom Lane wrote:
> Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
>> But I'm wondering if we should look at using the tricks git-new-workdir
>> uses, setting up symlinks instead of a full clone. Then we'd have one
>> clone with a bunch of different work dirs. That plus a bit of explicitly
>> done garbage collection and possibly a periodic re-clone might do the trick.
> Yeah, I was wondering whether it'd be okay to depend on git-new-workdir.
> That would fix the problem pretty nicely. But in the installations I've
> seen, that's not in PATH but squirreled away in some hard-to-guess library
> directory ...
>
>
Yeah. Luckily, there are really only half a dozen or so lines of script
that do the actual work - the rest is sanity checks. I think we can
replicate that without requiring the script. I'll have a stab later in
the week.
cheers
andrew
From: | Andres Freund <andres(at)2ndquadrant(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-hackers(at)postgreSQL(dot)org |
Subject: | Re: Reducing buildfarm disk usage: remove temp installs when done |
Date: | 2015-01-19 19:07:54 |
Message-ID: | 20150119190754.GB24381@alap3.anarazel.de |
On 2015-01-18 17:48:11 -0500, Tom Lane wrote:
> One of the biggest causes of buildfarm run failures is "out of disk
> space". That's not just because people are running buildfarm critters
> on small slow machines; it's because "make check-world" is an enormous
> space hog. Some numbers from current HEAD:
>
> clean source tree: 120MB
> built source tree: 400MB
> tree after make check-world: 3GB
>
> (This is excluding ~250MB for one's git repo.)
>
> The reason for all the bloat is the temporary install trees that we
> create, which tend to eat up about 100MB apiece, and there are dozens
> of them (eg, one per testable contrib module). Those don't get removed
> until the end of the test run, so the usage is cumulative.
>
> The attached proposed patch removes each temp install tree as soon as
> we're done with it, in the normal case where no error was detected.
> This brings the peak space usage down from ~3GB to ~750MB.
I was wondering before if we couldn't always do the temp
installation into $top_builddir/tmp_install or something like it. With
an additional small ugly hack on top we could even avoid reinstalling
for every target in check-world.
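One way to picture the install-once idea is the sketch below. It is only an illustration: the marker file `.installed`, the function name `ensure_temp_install`, and the `install` callable are all hypothetical and do not correspond to anything in the PostgreSQL makefiles.

```python
from pathlib import Path


def ensure_temp_install(top_builddir, install):
    """Install into top_builddir/tmp_install once and reuse it afterwards.

    `install` is a hypothetical callable that populates the given directory.
    A marker file (an assumption of this sketch) records that it already ran.
    """
    target = Path(top_builddir, "tmp_install")
    marker = target / ".installed"
    if not marker.exists():
        target.mkdir(parents=True, exist_ok=True)
        install(target)
        marker.touch()
    return target
```

Every test target would call this first; only the first call pays for the install, and only one tree ever exists on disk.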
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
From: | Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com> |
---|---|
To: | Andres Freund <andres(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | <pgsql-hackers(at)postgreSQL(dot)org> |
Subject: | Re: Reducing buildfarm disk usage: remove temp installs when done |
Date: | 2015-01-20 02:33:24 |
Message-ID: | 54BDBE74.5050309@BlueTreble.com |
On 1/19/15 1:07 PM, Andres Freund wrote:
> On 2015-01-18 17:48:11 -0500, Tom Lane wrote:
>> One of the biggest causes of buildfarm run failures is "out of disk
>> space". That's not just because people are running buildfarm critters
>> on small slow machines; it's because "make check-world" is an enormous
>> space hog. Some numbers from current HEAD:
>>
>> clean source tree: 120MB
>> built source tree: 400MB
>> tree after make check-world: 3GB
>>
>> (This is excluding ~250MB for one's git repo.)
>>
>> The reason for all the bloat is the temporary install trees that we
>> create, which tend to eat up about 100MB apiece, and there are dozens
>> of them (eg, one per testable contrib module). Those don't get removed
>> until the end of the test run, so the usage is cumulative.
>>
>> The attached proposed patch removes each temp install tree as soon as
>> we're done with it, in the normal case where no error was detected.
>> This brings the peak space usage down from ~3GB to ~750MB.
>
> I was wondering before if we couldn't always do the temp
> installation into $top_builddir/tmp_install or something like it. With
> an additional small ugly hack on top we could even avoid reinstalling
> for every target in check-world.
FWIW, if anyone's going to do some serious tinkering in here, it'd be really nice to create a separate utility for managing temporary installs. That would make it trivial for PGXN modules to use something other than pg_regress for their test framework.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
From: | Andrew Dunstan <andrew(at)dunslane(dot)net> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-hackers(at)postgreSQL(dot)org |
Subject: | Re: Reducing buildfarm disk usage: remove temp installs when done |
Date: | 2015-01-20 18:42:53 |
Message-ID: | 54BEA1AD.7050507@dunslane.net |
On 01/19/2015 09:53 AM, Tom Lane wrote:
> Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
>> But I'm wondering if we should look at using the tricks git-new-workdir
>> uses, setting up symlinks instead of a full clone. Then we'd have one
>> clone with a bunch of different work dirs. That plus a bit of explicitly
>> done garbage collection and possibly a periodic re-clone might do the trick.
> Yeah, I was wondering whether it'd be okay to depend on git-new-workdir.
> That would fix the problem pretty nicely. But in the installations I've
> seen, that's not in PATH but squirreled away in some hard-to-guess library
> directory ...
>
>
We should move this discussion to the buildfarm members list.
I'll be publishing a patch there.
cheers
andrew
From: | Andrew Dunstan <andrew(at)dunslane(dot)net> |
---|---|
To: | PGBuildFarm <pgbuildfarm-members(at)pgfoundry(dot)org> |
Subject: | Re: [Pgbuildfarm-members] [HACKERS] Reducing buildfarm disk usage: remove temp installs when done |
Date: | 2015-01-20 18:49:20 |
Message-ID: | 54BEA330.6090308@dunslane.net |
On 01/19/2015 09:53 AM, Tom Lane wrote:
> Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
>> But I'm wondering if we should look at using the tricks git-new-workdir
>> uses, setting up symlinks instead of a full clone. Then we'd have one
>> clone with a bunch of different work dirs. That plus a bit of explicitly
>> done garbage collection and possibly a periodic re-clone might do the trick.
> Yeah, I was wondering whether it'd be okay to depend on git-new-workdir.
> That would fix the problem pretty nicely. But in the installations I've
> seen, that's not in PATH but squirreled away in some hard-to-guess library
> directory ...
>
>
Following some discussion on -hackers, here's a trial patch that reduces
the amount of space taken by symlinking the git repos on all non-HEAD
branches to the HEAD branch repo.
There are more steps we can possibly take to reduce space consumption,
but this is a start. If anyone is feeling brave they can apply this and
see how it goes.
For now at least, it doesn't work on Windows, nor on machines using the
git_reference config setting.
cheers
andrew
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Andrew Dunstan <andrew(at)dunslane(dot)net> |
Cc: | PGBuildFarm <pgbuildfarm-members(at)pgfoundry(dot)org> |
Subject: | Re: [Pgbuildfarm-members] [HACKERS] Reducing buildfarm disk usage: remove temp installs when done |
Date: | 2015-01-21 03:44:52 |
Message-ID: | 4622.1421811892@sss.pgh.pa.us |
Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
> Following some discussion on -hackers, here's a trial patch that reduces
> the amount of space taken by symlinking the git repos on all non-HEAD
> branches to the HEAD branch repo.
Erm ... no patch attached?
regards, tom lane
From: | Andrew Dunstan <andrew(at)dunslane(dot)net> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | PGBuildFarm <pgbuildfarm-members(at)pgfoundry(dot)org> |
Subject: | Re: [Pgbuildfarm-members] [HACKERS] Reducing buildfarm disk usage: remove temp installs when done |
Date: | 2015-01-21 03:50:52 |
Message-ID: | 54BF221C.4000205@dunslane.net |
On 01/20/2015 10:44 PM, Tom Lane wrote:
> Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
>> Following some discussion on -hackers, here's a trial patch that reduces
>> the amount of space taken by symlinking the git repos on all non-HEAD
>> branches to the HEAD branch repo.
> Erm ... no patch attached?
>
>
Oh, darn. Here it is.
To use it, add
git_use_workdirs => 1,
to your config, and remove the $buildroot/REL*/pgsql directories
cheers
andrew
From: | Andrew Dunstan <andrew(at)dunslane(dot)net> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | PGBuildFarm <pgbuildfarm-members(at)pgfoundry(dot)org> |
Subject: | Re: [Pgbuildfarm-members] [HACKERS] Reducing buildfarm disk usage: remove temp installs when done |
Date: | 2015-01-21 03:53:21 |
Message-ID: | 54BF22B1.7000506@dunslane.net |
On 01/20/2015 10:50 PM, Andrew Dunstan wrote:
>
> On 01/20/2015 10:44 PM, Tom Lane wrote:
>> Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
>>> Following some discussion on -hackers, here's a trial patch that
>>> reduces
>>> the amount of space taken by symlinking the git repos on all non-HEAD
>>> branches to the HEAD branch repo.
>> Erm ... no patch attached?
>>
>>
>
>
>
>
> Oh, darn. Here it is.
>
> To use it, add
>
> git_use_workdirs => 1,
>
> to your config, and remove the $buildroot/REL*/pgsql directories
>
>
Grr, looks like it's being stripped by mailman.
I will fix
cheers
andrew
From: | Andrew Dunstan <andrew(at)dunslane(dot)net> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | PGBuildFarm <pgbuildfarm-members(at)pgfoundry(dot)org> |
Subject: | Re: [Pgbuildfarm-members] [HACKERS] Reducing buildfarm disk usage: remove temp installs when done |
Date: | 2015-01-21 04:00:02 |
Message-ID: | 54BF2442.8000101@dunslane.net |
On 01/20/2015 10:53 PM, Andrew Dunstan wrote:
>
> On 01/20/2015 10:50 PM, Andrew Dunstan wrote:
>>
>> On 01/20/2015 10:44 PM, Tom Lane wrote:
>>> Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
>>>> Following some discussion on -hackers, here's a trial patch that
>>>> reduces
>>>> the amount of space taken by symlinking the git repos on all non-HEAD
>>>> branches to the HEAD branch repo.
>>> Erm ... no patch attached?
>>>
>>>
>>
>>
>>
>>
>> Oh, darn. Here it is.
>>
>> To use it, add
>>
>> git_use_workdirs => 1,
>>
>> to your config, and remove the $buildroot/REL*/pgsql directories
>>
>>
>
>
>
> Grr, looks like it's being stripped by mailman.
>
> I will fix
>
<bullwinkle-mode>This time for Sure!</>
cheers
andrew
Attachment | Content-Type | Size |
---|---|---|
use_workdirs1.patch | text/x-patch | 3.6 KB |
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Andrew Dunstan <andrew(at)dunslane(dot)net> |
Cc: | PGBuildFarm <pgbuildfarm-members(at)pgfoundry(dot)org> |
Subject: | Re: [Pgbuildfarm-members] [HACKERS] Reducing buildfarm disk usage: remove temp installs when done |
Date: | 2015-01-21 05:43:53 |
Message-ID: | 7521.1421819033@sss.pgh.pa.us |
Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
> On 01/20/2015 10:44 PM, Tom Lane wrote:
>> Erm ... no patch attached?
> Oh, darn. Here it is.
I've applied this on dromedary, and it definitely makes a nice dent
in the at-rest space consumption for a full set of branches. The old
data directory contents were
$ du -hsc pgmirror.git HEAD REL*
418M pgmirror.git
367M HEAD
348M REL9_0_STABLE
353M REL9_1_STABLE
356M REL9_2_STABLE
360M REL9_3_STABLE
277M REL9_4_STABLE
2.4G total
Post-patch, with a freshly created data directory (including a
fresh clone from the git server), I've got
$ du -hsc pgmirror.git HEAD REL*
167M pgmirror.git
107M HEAD
86M REL9_0_STABLE
91M REL9_1_STABLE
95M REL9_2_STABLE
100M REL9_3_STABLE
105M REL9_4_STABLE
753M total
It appears that the peak transient space consumption while building a
branch is about 500MB. In addition to these numbers, I've got a shade
under 1GB in ccache space (that's configurable of course, but by default
ccache will eat up to that much). So the total disk space to run a
buildfarm member with ccache was something close to 4GB with the old way,
2.25GB with this patch.
(These numbers aren't totally comparable of course, since the year-old
installation had acquired some git repo bloat which this one hasn't
had time to yet. But any way you slice it, I've saved well more than
1GB of space.)
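The totals quoted above can be checked with a little arithmetic. This sketch uses the reported du totals plus the roughly 500MB build peak and roughly 1GB of ccache, treating 1GB as 1000MB for simplicity. Applying the same 500MB peak to the old layout is an assumption of the sketch.

```python
def total_gb(at_rest_mb, peak_build_mb=500, ccache_mb=1000):
    """Approximate disk needed to run one buildfarm member, in GB."""
    return (at_rest_mb + peak_build_mb + ccache_mb) / 1000


old = total_gb(2400)   # at-rest 2.4G before the patch
new = total_gb(753)    # at-rest 753M after the patch
saved = old - new
```

This gives about 3.9GB before and about 2.25GB after, a saving of roughly 1.6GB, consistent with "well more than 1GB".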
regards, tom lane
From: | Andrew Dunstan <andrew(at)dunslane(dot)net> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | PGBuildFarm <pgbuildfarm-members(at)pgfoundry(dot)org> |
Subject: | Re: [Pgbuildfarm-members] [HACKERS] Reducing buildfarm disk usage: remove temp installs when done |
Date: | 2015-01-29 00:45:34 |
Message-ID: | 54C982AE.7000903@dunslane.net |
On 01/21/2015 12:43 AM, Tom Lane wrote:
> Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
>> On 01/20/2015 10:44 PM, Tom Lane wrote:
>>> Erm ... no patch attached?
>> Oh, darn. Here it is.
> I've applied this on dromedary, and it definitely makes a nice dent
> in the at-rest space consumption for a full set of branches. The old
> data directory contents were
>
> $ du -hsc pgmirror.git HEAD REL*
> 418M pgmirror.git
> 367M HEAD
> 348M REL9_0_STABLE
> 353M REL9_1_STABLE
> 356M REL9_2_STABLE
> 360M REL9_3_STABLE
> 277M REL9_4_STABLE
> 2.4G total
>
> Post-patch, with a freshly created data directory (including a
> fresh clone from the git server), I've got
>
> $ du -hsc pgmirror.git HEAD REL*
> 167M pgmirror.git
> 107M HEAD
> 86M REL9_0_STABLE
> 91M REL9_1_STABLE
> 95M REL9_2_STABLE
> 100M REL9_3_STABLE
> 105M REL9_4_STABLE
> 753M total
>
> It appears that the peak transient space consumption while building a
> branch is about 500MB. In addition to these numbers, I've got a shade
> under 1GB in ccache space (that's configurable of course, but by default
> ccache will eat up to that much). So the total disk space to run a
> buildfarm member with ccache was something close to 4GB with the old way,
> 2.25GB with this patch.
>
> (These numbers aren't totally comparable of course, since the year-old
> installation had acquired some git repo bloat which this one hasn't
> had time to yet. But any way you slice it, I've saved well more than
> 1GB of space.)
>
>
Some more thoughts about this:
* there is probably precious little virtue in keeping a local git
mirror for most use cases, once we do this
* regular running of "git gc" pays handsomely, by the look of it. Not
sure if we should have the buildfarm client do this or just make it
a cron job.
* I'm experimenting with removing the work tree on success and
checking it out again when we run. That reduces the static storage
per branch to a handful of megabytes, at the cost of a little IO and
a few seconds of processing time.
* vpath builds also save space and time, since we don't copy the
source in that case.
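The work-tree removal mentioned in the list above could look roughly like this. It is a sketch under assumptions, not the buildfarm client's code: the function name `remove_work_tree` is hypothetical, and it simply deletes everything in a checkout except the `.git` entry.

```python
import shutil
from pathlib import Path


def remove_work_tree(checkout):
    """Delete everything in `checkout` except its .git entry."""
    for entry in Path(checkout).iterdir():
        if entry.name == ".git":
            continue
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)
        else:
            # Regular files and symlinks are unlinked, never followed.
            entry.unlink()
```

Before the next run, the files can be restored from the repository with something like `git reset --hard`, which is where the extra IO and the few seconds of processing time come from.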
cheers
andrew