Lists: | Postg사설 토토 사이트SQL |
---|
From: | ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp> |
---|---|
To: | pgsql-patches(at)postgresql(dot)org |
Subject: | Miscalculation in IsCheckpointOnSchedule() |
Date: | 2007-11-14 05:26:12 |
Message-ID: | 20071114132759.30E4.ITAGAKI.TAKAHIRO@oss.ntt.co.jp |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-patches |
I run long DBT-2 with 8.3beta2 and saw checkpoint spikes periodically.
The progress against WAL segments consumption *jumped up* in such cases.
It seems to be a miscalculation in IsCheckpointOnSchedule().
xrecoff is uint32, therefore the difference of two xrecoffs could
be between -4G and +4G. We should not cast it to int32, that domain
is [-2G, +2G).
Here is a patch to fix it, casting xrecoff to double directly.
Index: src/backend/postmaster/bgwriter.c
===================================================================
--- src/backend/postmaster/bgwriter.c (HEAD)
+++ src/backend/postmaster/bgwriter.c (working copy)
@@ -718,7 +718,7 @@
recptr = GetInsertRecPtr();
elapsed_xlogs =
(((double) (int32) (recptr.xlogid - ckpt_start_recptr.xlogid)) * XLogSegsPerFile +
- ((double) (int32) (recptr.xrecoff - ckpt_start_recptr.xrecoff)) / XLogSegSize) /
+ ((double) recptr.xrecoff - (double) ckpt_start_recptr.xrecoff) / XLogSegSize) /
CheckPointSegments;
if (progress < elapsed_xlogs)
Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp> |
Cc: | pgsql-patches(at)postgresql(dot)org |
Subject: | Re: Miscalculation in IsCheckpointOnSchedule() |
Date: | 2007-11-14 05:32:14 |
Message-ID: | 14221.1195018334@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | Postg토토SQL : Postg토토SQL |
ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp> writes:
> - ((double) (int32) (recptr.xrecoff - ckpt_start_recptr.xrecoff)) / XLogSegSize) /
> + ((double) recptr.xrecoff - (double) ckpt_start_recptr.xrecoff) / XLogSegSize) /
Surely this makes matters worse, not better. What happens near a segment
boundary crossing?
regards, tom lane
From: | ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-patches(at)postgresql(dot)org |
Subject: | Re: Miscalculation in IsCheckpointOnSchedule() |
Date: | 2007-11-14 06:09:55 |
Message-ID: | 20071114144003.30E7.ITAGAKI.TAKAHIRO@oss.ntt.co.jp |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-patches |
Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp> writes:
> > - ((double) (int32) (recptr.xrecoff - ckpt_start_recptr.xrecoff)) / XLogSegSize) /
> > + ((double) recptr.xrecoff - (double) ckpt_start_recptr.xrecoff) / XLogSegSize) /
>
> Surely this makes matters worse, not better. What happens near a segment
> boundary crossing?
Here is the dumped progres information by the attached patch
(only for debug purpose).
cur prog. xlog prog. time prog.
[-400ms] 0.030503 0.019247 0.031158 (diff xlogid=0, xrecoff=82665472)
[-200ms] 0.031176 0.019957 0.031839 (diff xlogid=0, xrecoff=85712896)
[*] 0.031860 1.020706 0.032521 (diff xlogid=1, xrecoff=105709568)
> recptr.xrecoff - ckpt_start_recptr.xrecoff
recptr.xrecoff is reset to 0 or so when xlogid is bumped up. At that time,
if ckpt_start_recptr.xrecoff is greater than 2G, we cannot represent
the difference with int32 because the value is less than -2G.
Casting double after int32 does not help us.
We use ( xlogid * 255 * 16MB + xrecoff ) as a base value for the calculation.
If we interprets xrecoff=105709568 at [*] as "minus uint32", we can calcurate
it correctly as below:
without fix with fix
[-400ms] 82665472 82665472 (same)
[-200ms] 85712896 85712896 (same)
[*] 4383899648 88932353 (= 1*255*16MB - (0xFFFFFFFF - 105709568) )
Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center
Attachment | Content-Type | Size |
---|---|---|
83xlogdbg.patch | application/octet-stream | 1.5 KB |
From: | Heikki Linnakangas <heikki(at)enterprisedb(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>, pgsql-patches(at)postgresql(dot)org |
Subject: | Re: Miscalculation in IsCheckpointOnSchedule() |
Date: | 2007-11-14 11:47:08 |
Message-ID: | 473AE03C.9040406@enterprisedb.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-patches |
Tom Lane wrote:
> ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp> writes:
>> - ((double) (int32) (recptr.xrecoff - ckpt_start_recptr.xrecoff)) / XLogSegSize) /
>> + ((double) recptr.xrecoff - (double) ckpt_start_recptr.xrecoff) / XLogSegSize) /
>
> Surely this makes matters worse, not better. What happens near a segment
> boundary crossing?
Hmm. There seems to be another little bug in there. XLogSegsPerFile is
defined as 0xffffffff/XLogSegSize, which is 255 with default settings.
It should be 256. That leads to negative elapsed_xlogs estimates at xlog
file boundaries. XLogCheckpointNeeded suffers from it too; the number of
segments consumed since last checkpoint is off by one per each xlogid,
so we trigger the checkpoint a little bit too late.
Otherwise, the patch looks good to me. I also tested the calculation
with a little C program (attached), and it seems to work on segment and
xlog file boundaries just fine.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Attachment | Content-Type | Size |
---|---|---|
xlog-arithmetic.c | text/x-csrc | 1.4 KB |
From: | Heikki Linnakangas <heikki(at)enterprisedb(dot)com> |
---|---|
To: | |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>, pgsql-patches(at)postgresql(dot)org |
Subject: | Re: Miscalculation in IsCheckpointOnSchedule() |
Date: | 2007-11-14 12:08:38 |
Message-ID: | 473AE546.6080703@enterprisedb.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | Postg사설 토토 사이트SQL |
Heikki Linnakangas wrote:
> Tom Lane wrote:
>> ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp> writes:
>>> - ((double) (int32) (recptr.xrecoff -
>>> ckpt_start_recptr.xrecoff)) / XLogSegSize) /
>>> + ((double) recptr.xrecoff - (double)
>>> ckpt_start_recptr.xrecoff) / XLogSegSize) /
>>
>> Surely this makes matters worse, not better. What happens near a segment
>> boundary crossing?
>
> Hmm. There seems to be another little bug in there. XLogSegsPerFile is
> defined as 0xffffffff/XLogSegSize, which is 255 with default settings.
> It should be 256. That leads to negative elapsed_xlogs estimates at xlog
> file boundaries. XLogCheckpointNeeded suffers from it too; the number of
> segments consumed since last checkpoint is off by one per each xlogid,
> so we trigger the checkpoint a little bit too late.
I'll take that back. We intentionally don't use the last possible
segment of each xlog file, to avoid overflows. Sorry for the noise.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp> |
Cc: | pgsql-patches(at)postgresql(dot)org |
Subject: | Re: Miscalculation in IsCheckpointOnSchedule() |
Date: | 2007-11-14 21:15:33 |
Message-ID: | 15711.1195074933@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-patches |
ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp> writes:
> Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Surely this makes matters worse, not better. What happens near a segment
>> boundary crossing?
> Here is the dumped progres information by the attached patch
> (only for debug purpose).
Oh, I take that back. I was thinking that conversion of unsigned to
double usually does the Wrong Thing, but in this context it seems to be
the right thing.
regards, tom lane