Strange error message

From: Adriaan Joubert <a(dot)joubert(at)albourne(dot)com>
To: Postgresql <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Strange error message
Date: 2000-09-29 14:46:44
Message-ID: 39D4AB54.99EA1A43@albourne.com
Lists: pgsql-hackers pgsql-ports

Hi,

we've suddenly started getting this error message out of postgres
(7.0.2). Does anybody know where it comes from?

ERROR: UNLockBuffer: buffer 0 is not locked

Any help appreciated,

Adriaan


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: a(dot)joubert(at)albourne(dot)com
Cc: Postgresql <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Re: Strange error message
Date: 2000-09-29 15:45:47
Message-ID: 17537.970242347@sss.pgh.pa.us

Adriaan Joubert <a(dot)joubert(at)albourne(dot)com> writes:
> we've suddenly started getting this error message out of postgres
> (7.0.2). Does anybody know where it comes from?

> ERROR: UNLockBuffer: buffer 0 is not locked

Evidently something is passing an invalid buffer number to LockBuffer
in src/backend/storage/buffer/bufmgr.c. (0 is InvalidBuffer, but
LockBuffer won't notice that unless you compiled with asserts enabled.)
Whatever the bug is, it's not directly LockBuffer's fault.

What exactly are you doing that provokes this message?

regards, tom lane
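
The point about asserts can be illustrated with a small, hypothetical C sketch. It is not the code in bufmgr.c: the names toy_lock_buffer, toy_unlock_buffer, TOY_ASSERT and TOY_ASSERT_CHECKING are invented for illustration, and only the fact that 0 is InvalidBuffer comes from the message above. With the check compiled out, an invalid buffer number slips through and surfaces later as a "not locked" complaint; with the check compiled in, the process aborts at the bad call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

typedef int Buffer;          /* hypothetical stand-in for the real type */
#define InvalidBuffer 0      /* per the thread: 0 means "no buffer" */
#define TOY_NBUFFERS 4       /* toy pool; valid buffers are 1..TOY_NBUFFERS */

/* The check is active only when TOY_ASSERT_CHECKING is defined at build time. */
#ifdef TOY_ASSERT_CHECKING
#define TOY_ASSERT(cond) \
    do { \
        if (!(cond)) { \
            fprintf(stderr, "assertion failed: %s\n", #cond); \
            abort(); \
        } \
    } while (0)
#else
#define TOY_ASSERT(cond) ((void) 0)
#endif

/* Slot 0 exists only so that buffer numbers can index the array directly. */
static bool toy_locked[TOY_NBUFFERS + 1];

static bool toy_buffer_is_valid(Buffer buf)
{
    return buf > InvalidBuffer && buf <= TOY_NBUFFERS;
}

/* Returns 0 on success, -1 if the buffer number is not valid. */
static int toy_lock_buffer(Buffer buf)
{
    TOY_ASSERT(toy_buffer_is_valid(buf));
    if (!toy_buffer_is_valid(buf))
        return -1;
    toy_locked[buf] = true;
    return 0;
}

/* Returns 0 on success, -1 on error. */
static int toy_unlock_buffer(Buffer buf)
{
    /* With checking enabled, a bad caller is caught right here. */
    TOY_ASSERT(toy_buffer_is_valid(buf));

    if (buf < 0 || buf > TOY_NBUFFERS)
        return -1;           /* keep the array access in bounds */

    /* Without the check, buffer 0 reaches this point and is merely
     * reported as "not locked", which hides the real caller bug. */
    if (!toy_locked[buf]) {
        fprintf(stderr, "toy_unlock_buffer: buffer %d is not locked\n", buf);
        return -1;
    }
    toy_locked[buf] = false;
    return 0;
}
```

Built without -DTOY_ASSERT_CHECKING, toy_unlock_buffer(InvalidBuffer) only prints the "not locked" message and returns -1. Built with it, the same call aborts, and a debugger backtrace then points at the caller that supplied the zero.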


From: Adriaan Joubert <a(dot)joubert(at)albourne(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Arrigo Triulzi <arrigo(at)albourne(dot)com>, pgsql-ports(at)postgresql(dot)org
Subject: Re: Alpha spinlock
Date: 2000-10-02 05:21:45
Message-ID: 39D81B69.207D5F0D@albourne.com

Tom Lane wrote:

> > For a while I thought it might be because we are using an alpha TAS in
> > the spinlock rather than the old semaphore. I replaced our spinlock
> > with the standard one and it made no difference. We have been running
> > with our spinlock implementation for nearly 2 months on a production
> > database now without a hitch, so I think it is ok. Did I ever submit
> > any patches for the Alpha spinlock?
>
> Not that I recall. We did get some advice from some Alpha gurus at DEC
> who seemed to think the existing TAS code is OK. What was it that you
> felt needed to be improved?

The current code uses semaphores, which has the advantage that it works
well even on multi-processor machines, but the disadvantage that it is not
the fastest way possible. Writing a spinlock on Alpha for SMP machines is
very difficult, as you need to deal with memory barriers. A real mess. But
then one of the people at Compaq pointed out to us that there is a
ready-made routine on Alpha. We implemented it with the two patches below.
I ran tests with lots of parallel back-ends and got around a 10% speed
increase. I include the two patches. Perhaps some of the other people
running Tru64 can have a look at these as well.

Cheers,

Adriaan

Attachment  Content-Type  Size
patch1      text/plain    654 bytes
patch2      text/plain    749 bytes
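
For readers unfamiliar with the technique under discussion, the following is a minimal, portable sketch of a test-and-set spinlock in C11. It is not the Alpha patch itself and does not use the Tru64 builtin: it swaps in the standard atomic_flag type, whose acquire and release orderings stand in for the memory barriers mentioned above. The names toy_slock_t, toy_tas, toy_spin_lock and toy_spin_unlock are hypothetical. As with the msem_lock-based TAS macro in s_lock.h, toy_tas is true when the lock could not be obtained.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock type; initialise with ATOMIC_FLAG_INIT. */
typedef atomic_flag toy_slock_t;

/*
 * Test-and-set: atomically set the flag and report its previous value.
 * Returns true if the lock was already held (the caller did NOT get it),
 * false if the caller now owns the lock.  Acquire ordering keeps later
 * reads and writes from moving ahead of the lock acquisition.
 */
static bool toy_tas(toy_slock_t *lock)
{
    return atomic_flag_test_and_set_explicit(lock, memory_order_acquire);
}

/*
 * Release the lock.  Release ordering makes writes performed inside the
 * critical section visible before the flag is seen as clear.
 */
static void toy_spin_unlock(toy_slock_t *lock)
{
    atomic_flag_clear_explicit(lock, memory_order_release);
}

/* Busy-wait until the lock is obtained. */
static void toy_spin_lock(toy_slock_t *lock)
{
    while (toy_tas(lock))
        ;   /* spin; production code would normally back off or sleep */
}
```

A caller would declare toy_slock_t lock = ATOMIC_FLAG_INIT; and bracket the critical section with toy_spin_lock(&lock) and toy_spin_unlock(&lock). The appeal over a semaphore is that the uncontended path is a single atomic operation with no system call.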

From: Adriaan Joubert <a(dot)joubert(at)albourne(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Postgresql <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Re: Strange error message
Date: 2000-10-02 05:42:33
Message-ID: 39D82049.C6A3CBE7@albourne.com

Tom Lane wrote:

> Adriaan Joubert <a(dot)joubert(at)albourne(dot)com> writes:
> > we've suddenly started getting this error message out of postgres
> > (7.0.2). Does anybody know where it comes from?
>
> > ERROR: UNLockBuffer: buffer 0 is not locked
>
> Evidently something is passing an invalid buffer number to LockBuffer
> in src/backend/storage/buffer/bufmgr.c. (0 is InvalidBuffer, but
> LockBuffer won't notice that unless you compiled with asserts enabled.)
> Whatever the bug is, it's not directly LockBuffer's fault.

Right, I've built a new database and everything seemed fine for a while and
now I've got this message back. It is due to the index on one of our
tables getting messed up - at least, if we drop and recreate the index
everything is fine. What should I do to track down what is happening?
Compile with asserts, or run with specific logging? Any advice
appreciated!

Adriaan


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: a(dot)joubert(at)albourne(dot)com
Cc: Postgresql <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Re: Strange error message
Date: 2000-10-02 06:22:49
Message-ID: 944.970467769@sss.pgh.pa.us

Adriaan Joubert <a(dot)joubert(at)albourne(dot)com> writes:
>>>> ERROR: UNLockBuffer: buffer 0 is not locked

> Right, I've built a new database and everything seemed fine for a while and
> now I've got this message back. It is due to the index on one of our
> tables getting messed up - at least, if we drop and recreate the index
> everything is fine. What should I do to track down what is happening?
> Compile with asserts, or run with specific logging?

Compile with asserts and -g, and get a backtrace from the ensuing
coredump. (LockBuffer *will* Assert when passed a zero. If we are
really lucky, we might see an earlier Assert failure that will give
more clue about where the zero comes from --- but if not, the backtrace
from the bogus LockBuffer call might still be useful.)

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: a(dot)joubert(at)albourne(dot)com
Cc: Arrigo Triulzi <arrigo(at)albourne(dot)com>, pgsql-ports(at)postgresql(dot)org
Subject: Re: Alpha spinlock
Date: 2000-10-02 06:26:40
Message-ID: 965.970468000@sss.pgh.pa.us

Adriaan Joubert <a(dot)joubert(at)albourne(dot)com> writes:
> Tom Lane wrote:
>> Not that I recall. We did get some advice from some Alpha gurus at DEC
>> who seemed to think the existing TAS code is OK. What was it that you
>> felt needed to be improved?

> The current code uses semaphores,

Oh, I'm sorry, I was thinking about the Linux-Alpha code, for which
there has been an inline-assembly version of tas() for a long time;
and it was that code that we asked the DEC people about. I had
forgotten that the OSF port uses a different semaphore method.

Is there a reason we couldn't use the Linux-Alpha code on OSF too?
I'd just as soon minimize the difference between the two ports ...

regards, tom lane


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: a(dot)joubert(at)albourne(dot)com
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Arrigo Triulzi <arrigo(at)albourne(dot)com>, pgsql-ports(at)postgresql(dot)org
Subject: Re: Re: Alpha spinlock
Date: 2000-10-08 04:25:29
Message-ID: 200010080425.AAA04041@candle.pha.pa.us

Tom, what do you want to do with this patch?

>
>
> Tom Lane wrote:
>
> > > For a while I thought it might be because we are using an alpha TAS in
> > > the spinlock rather than the old semaphore. I replaced our spinlock
> > > with the standard one and it made no difference. We have been running
> > > with our spinlock implementation for nearly 2 months on a production
> > > database now without a hitch, so I think it is ok. Did I ever submit
> > > any patches for the Alpha spinlock?
> >
> > Not that I recall. We did get some advice from some Alpha gurus at DEC
> > who seemed to think the existing TAS code is OK. What was it that you
> > felt needed to be improved?
>
> The current code uses semaphores, which has the advantage that it works
> well even on multi-processor machines, but the disadvantage that it is not
> the fastest way possible. Writing a spinlock on Alpha for SMP machines is
> very difficult, as you need to deal with memory barriers. A real mess. But
> then one of the people at Compaq pointed out to us that there is a
> ready-made routine on Alpha. We implemented it with the two patches below.
> I ran tests with lots of parallel back-ends and got around a 10% speed
> increase. I include the two patches. Perhaps some of the other people
> running Tru64 can have a look at these as well.
>
> Cheers,
>
> Adriaan

> *** src/include/storage/s_lock.h Sat Sep 30 12:13:42 2000
> --- src/include/storage/s_lock.h- Sat Sep 30 12:13:42 2000
> ***************
> *** 252,266 ****
> * Note that slock_t on the Alpha AXP is msemaphore instead of char
> * (see storage/ipc.h).
> */
> - #include <alpha/builtins.h>
> - #if 0
> #define TAS(lock) (msem_lock((lock), MSEM_IF_NOWAIT) < 0)
> #define S_UNLOCK(lock) msem_unlock((lock), 0)
> #define S_INIT_LOCK(lock) msem_init((lock), MSEM_UNLOCKED)
> #define S_LOCK_FREE(lock) (!(lock)->msem_state)
> - #else
> - #define TAS(lock) (__INTERLOCKED_TESTBITSS_QUAD((lock),0))
> - #endif
>
> #else /* i.e. not __osf__ */
>
> --- 252,261 ----

> *** src/include/port/alpha.h Sat Sep 30 12:13:21 2000
> --- src/include/port/alpha.h- Sat Sep 30 12:13:21 2000
> ***************
> *** 1,10 ****
> #define USE_POSIX_TIME
> #define DISABLE_XOPEN_NLS
> #define HAS_TEST_AND_SET
> ! /*#include <sys/mman.h>*/ /* for msemaphore */
> ! /*typedef msemaphore slock_t;*/
> ! #include <alpha/builtins.h>
> ! typedef volatile long slock_t;
>
> /* some platforms define __alpha, but not __alpha__ */
> #if defined(__alpha) && !defined(__alpha__)
> --- 1,8 ----
> #define USE_POSIX_TIME
> #define DISABLE_XOPEN_NLS
> #define HAS_TEST_AND_SET
> ! #include <sys/mman.h> /* for msemaphore */
> ! typedef msemaphore slock_t;
>
> /* some platforms define __alpha, but not __alpha__ */
> #if defined(__alpha) && !defined(__alpha__)

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: a(dot)joubert(at)albourne(dot)com, Arrigo Triulzi <arrigo(at)albourne(dot)com>, pgsql-ports(at)postgresql(dot)org
Subject: Re: Re: Alpha spinlock
Date: 2000-10-08 04:34:30
Message-ID: 19939.970979670@sss.pgh.pa.us

Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> writes:
> Tom, what do you want to do with this patch?

Apply away ...

regards, tom lane


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: a(dot)joubert(at)albourne(dot)com
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Arrigo Triulzi <arrigo(at)albourne(dot)com>, pgsql-ports(at)postgresql(dot)org
Subject: Re: Re: Alpha spinlock
Date: 2000-10-08 04:38:34
Message-ID: 200010080438.AAA04756@candle.pha.pa.us

Applied, but include/port/alpha.h is now include/port/osf.h.
