Re: Huge Data sets, simple queries

Lists: pgsql-performance
From: "Luke Lonergan" <LLonergan(at)greenplum(dot)com>
To: "hubert depesz lubaczewski" <depesz(at)gmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Huge Data sets, simple queries
Date: 2006-01-29 18:44:08
Message-ID: 3E37B936B592014B978C4415F90D662D023F28D9@MI8NYCMAIL06.Mi8.com

Depesz,

> [mailto:pgsql-performance-owner(at)postgresql(dot)org] On Behalf Of
> hubert depesz lubaczewski
> Sent: Sunday, January 29, 2006 3:25 AM
>
> hmm .. do i understand correctly that you're suggesting that
> using raid 10 and/or hardware raid adapter might hurt disc
> subsystem performance? could you elaborate on the reasons,
> please? it's not that i'm against the idea - i'm just curious
> as this is very "against-common-sense". and i always found it
> interesting when somebody states something that uncommon...

See previous postings on this list - often when someone reports a
performance problem with large data, the answer comes back that their
I/O setup is not performing well. Most of the time, people trust that
once they buy a hardware RAID adapter and set it up, the performance
will be what they expect and what is theoretically correct for the
number of disk drives.

In fact, in our testing of various host-based SCSI RAID adapters (LSI,
Dell PERC, Adaptec, HP SmartArray), we find that *all* of them
underperform, most of them severely. Some produce results slower than a
single disk drive. We've found that some external SCSI RAID adapters,
those built into the disk chassis, often perform better. I think this
might be due to the better drivers and perhaps a different marketplace
for the higher end solutions driving performance validation.

The important lesson we've learned is to always test the I/O subsystem
performance - you can do so with a simple test like:
    time bash -c "dd if=/dev/zero of=bigfile bs=8k count=4000000 && sync"
    time dd if=bigfile of=/dev/null bs=8k

If the result isn't something close to the theoretical rate, you are
likely limited by your RAID setup. You might be shocked to find a
severe performance problem. In either case, switching to software
RAID using a simple SCSI adapter will fix the problem.
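
As a rough sketch of how to turn the timings from the test above into a
number you can compare against the expected rate: the helper below is
hypothetical (the name throughput_mb_s and the 400-second timing are made
up for illustration), and it only does the arithmetic of bytes moved
divided by elapsed seconds. Note that the test file should be larger than
RAM, otherwise the read figure may mostly reflect the OS cache.

```shell
# Hypothetical helper: convert a dd run (block size in bytes, block count,
# elapsed seconds from `time`) into MB/s (1 MB = 1,000,000 bytes).
throughput_mb_s() {
    awk -v bs="$1" -v n="$2" -v s="$3" \
        'BEGIN { printf "%.1f\n", (bs * n) / s / 1000000 }'
}

# Example with an assumed timing: 4000000 blocks of 8 KiB in 400 seconds.
throughput_mb_s 8192 4000000 400   # prints 81.9
```

If that figure is far below the per-drive sequential rate times the number
of drives doing useful work, the RAID setup is the likely suspect.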

BTW - we've had very good experiences with the host-based SATA adapters
from 3Ware. The Areca controllers are also respected.

Oh - and about RAID 10 - for large data work it's more often a waste of
disk performance-wise compared to RAID 5 these days. RAID5 will almost
double the performance on a reasonable number of drives.

- Luke


From: "Jeffrey W(dot) Baker" <jwbaker(at)acm(dot)org>
To: Luke Lonergan <LLonergan(at)greenplum(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Huge Data sets, simple queries
Date: 2006-01-29 21:04:01
Message-ID: 1138568641.10923.6.camel@noodles

On Sun, 2006-01-29 at 13:44 -0500, Luke Lonergan wrote:
> Depesz,
>
> > [mailto:pgsql-performance-owner(at)postgresql(dot)org] On Behalf Of
> > hubert depesz lubaczewski
> > Sent: Sunday, January 29, 2006 3:25 AM
> >
> > hmm .. do i understand correctly that you're suggesting that
> > using raid 10 and/or hardware raid adapter might hurt disc
> > subsystem performance? could you elaborate on the reasons,
> > please? it's not that i'm against the idea - i'm just curious
> > as this is very "against-common-sense". and i always found it
> > interesting when somebody states something that uncommon...

> Oh - and about RAID 10 - for large data work it's more often a waste of
> disk performance-wise compared to RAID 5 these days. RAID5 will almost
> double the performance on a reasonable number of drives.

I think you might want to be more specific here. I would agree with you
for data warehousing, decision support, data mining, and similar
read-mostly non-transactional loads. For transactional loads RAID-5 is,
generally speaking, a disaster due to the read-before-write problem.
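
To illustrate the read-before-write cost with rough arithmetic: a small
write on RAID 5 typically needs four disk operations (read old data, read
old parity, write new data, write new parity), while RAID 10 needs two
(one write per mirror copy). The sketch below is a simplification that
ignores controller caches and full-stripe writes; the helper name and the
100 IOPS per drive are assumptions chosen only for illustration.

```shell
# Rough small-write IOPS estimate; every input here is an assumed value.
# Arguments: number of drives, random IOPS per drive, I/Os per logical write
# (4 for RAID 5, 2 for RAID 10).
write_iops() {
    awk -v n="$1" -v per_disk="$2" -v penalty="$3" \
        'BEGIN { printf "%d\n", n * per_disk / penalty }'
}

write_iops 8 100 4   # RAID 5,  8 drives: prints 200
write_iops 8 100 2   # RAID 10, 8 drives: prints 400
```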

While we're on the topic, I just installed another one of those Areca
ARC-1130 controllers with 1GB cache. It's ludicrously fast: 250MB/sec
burst writes, CPU-limited reads. I can't recommend them highly enough.

-jwb

PS: Could you look into fixing your mailer? Your messages sometimes
don't contain In-Reply-To headers, and therefore don't thread properly.


From: Charles Sprickman <spork(at)bway(dot)net>
To: Luke Lonergan <LLonergan(at)greenplum(dot)com>
Cc: hubert depesz lubaczewski <depesz(at)gmail(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Huge Data sets, simple queries
Date: 2006-01-30 05:35:12
Message-ID: Pine.OSX.4.61.0601300030200.372@spork-book.local

On Sun, 29 Jan 2006, Luke Lonergan wrote:

> In fact, in our testing of various host-based SCSI RAID adapters (LSI,
> Dell PERC, Adaptec, HP SmartArray), we find that *all* of them
> underperform, most of them severely.

[snip]

> The important lesson we've learned is to always test the I/O subsystem
> performance - you can do so with a simple test like:
> time bash -c "dd if=/dev/zero of=bigfile bs=8k count=4000000 && sync"
> time dd if=bigfile of=/dev/null bs=8k

I'm curious about this since we're shopping around for something new... I
do want to get some kind of baseline to compare new products to. Areca
sent me stats on their SCSI->SATA controller and it looks like it maxes
out around 10,000 IOPS.

I'd like to see how our existing stuff compares to this. I'd especially
like to see it in graph form such as the docs Areca sent (IOPS on one
axis, block size on the other, etc.). Looking at the venerable Bonnie, it
doesn't really seem to focus so much on the number of read/write
operations per second, but on big bulky transfers.

What are you folks using to measure your arrays?

I've been considering using some of our data and just basically
benchmarking postgres on various hardware with that, but I cannot compare
that to any manufacturer tests.

Sorry to meander a bit off topic, but I've been getting frustrated with
this little endeavour...

Thanks,

Charles



From: "Luke Lonergan" <llonergan(at)greenplum(dot)com>
To: "Charles Sprickman" <spork(at)bway(dot)net>
Cc: "hubert depesz lubaczewski" <depesz(at)gmail(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Huge Data sets, simple queries
Date: 2006-01-30 07:25:22
Message-ID: C002FF62.1B429%llonergan@greenplum.com

Charles,

On 1/29/06 9:35 PM, "Charles Sprickman" <spork(at)bway(dot)net> wrote:

> What are you folks using to measure your arrays?

Bonnie++ measures random I/Os. The numbers we find are typically in the
500/s range; the best I've seen is 1500/s on a large Fibre Channel RAID0 (at
http://www.wlug.org.nz/HarddiskBenchmarks).

- Luke


From: hubert depesz lubaczewski <depesz(at)gmail(dot)com>
To: Luke Lonergan <LLonergan(at)greenplum(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Huge Data sets, simple queries
Date: 2006-01-30 17:53:29
Message-ID: 9e4684ce0601300953kdb9f812h12820410cc3d5580@mail.gmail.com

On 1/29/06, Luke Lonergan <LLonergan(at)greenplum(dot)com> wrote:
> Oh - and about RAID 10 - for large data work it's more often a waste of
> disk performance-wise compared to RAID 5 these days. RAID5 will almost
> double the performance on a reasonable number of drives.

how many is reasonable?

depesz


From: "Luke Lonergan" <llonergan(at)greenplum(dot)com>
To: "hubert depesz lubaczewski" <depesz(at)gmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Huge Data sets, simple queries
Date: 2006-01-30 19:48:49
Message-ID: C003ADA1.1B4A0%llonergan@greenplum.com

Depesz,

On 1/30/06 9:53 AM, "hubert depesz lubaczewski" <depesz(at)gmail(dot)com> wrote:

>> double the performance on a reasonable number of drives.
>
> how many is reasonable?

What I mean by that is: given a set of N disks, the read performance of a
RAID array will be equal to the per-drive read rate A times the number of
drives used for reading by the RAID algorithm. In the case of RAID5, that
number is (N-1), so the read rate is A x (N-1). In the case of RAID10, that
number is N/2, so the read rate is A x (N/2). So, the ratio of read
performance RAID5/RAID10 is (N-1)/(N/2) = 2 x (N-1)/N. For various numbers
of drives, this ratio looks like this:
     N    RAID5/RAID10
     3    1.33
     6    1.67
     8    1.75
    14    1.86

So - I think reasonable would be 6-8, which are common disk configurations.
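
The ratios in the table can be reproduced with a few lines of shell. This
is only a sketch of the 2 x (N-1)/N formula above; the function name
raid5_vs_raid10 is made up for illustration, and the model assumes ideal
sequential reads with no controller overhead.

```shell
# Read-rate ratio RAID5/RAID10 for N drives: (N-1) / (N/2) = 2 * (N-1) / N
raid5_vs_raid10() {
    awk -v n="$1" 'BEGIN { printf "%.2f\n", 2 * (n - 1) / n }'
}

# Print the ratio for the drive counts listed in the table.
for n in 3 6 8 14; do
    printf '%2d  %s\n' "$n" "$(raid5_vs_raid10 "$n")"
done
```

The ratio approaches 2 as N grows, but never reaches it, which is where
"almost double" comes from.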

- Luke