Re: BUG #17910: gcc-introduced load may cause concurrency bug

Lists: pgsql-bugs
From: PG Bug reporting form <noreply(at)postgresql(dot)org>
To: pgsql-bugs(at)lists(dot)postgresql(dot)org
Cc: absoler(at)smail(dot)nju(dot)edu(dot)cn
Subject: BUG #17910: gcc-introduced load may cause concurrency bug
Date: 2023-04-27 08:45:15
Message-ID: 17910-9050afa682d5dc56@postgresql.org

The following bug has been logged on the website:

Bug reference: 17910
Logged by: Kunbo Zhang
Email address: absoler(at)smail(dot)nju(dot)edu(dot)cn
PostgreSQL version: 15.2
Operating system: ubuntu 20.04
Description:

We found that postgres-release-15.2, built with gcc-12.1 on Ubuntu 20.04,
contains a compiler-introduced load in
"_bt_parallel_build_main(dsm_segment *seg, shm_toc *toc)".

Near src/backend/access/nbtree/nbtsort.c:1840, the source code is:

if (!btshared->isconcurrent)
{
    heapLockmode = ShareLock;
    indexLockmode = AccessExclusiveLock;
}
else
{
    heapLockmode = ShareUpdateExclusiveLock;
    indexLockmode = RowExclusiveLock;
}

In the source, `btshared->isconcurrent` is read only once, to choose between
the two branches.

The corresponding disassembly is:

/home/postgres-REL_15_2/src/backend/access/nbtree/nbtsort.c:1840
  56d415: 80 78 0a 01    cmpb   $0x1,0xa(%rax)    # btshared->isconcurrent
/home/postgres-REL_15_2/src/backend/access/nbtree/nbtsort.c:1834
  56d419: 48 89 c3       mov    %rax,%rbx
/home/postgres-REL_15_2/src/backend/access/nbtree/nbtsort.c:1840
  56d41c: 0f b6 40 0a    movzbl 0xa(%rax),%eax    # btshared->isconcurrent
  56d420: 45 19 e4       sbb    %r12d,%r12d
/home/postgres-REL_15_2/src/backend/access/nbtree/nbtsort.c:1849
  56d423: 8b 3b          mov    (%rbx),%edi
/home/postgres-REL_15_2/src/backend/access/nbtree/nbtsort.c:1840
  56d425: 41 83 e4 05    and    $0x5,%r12d
  56d429: 41 83 c4 03    add    $0x3,%r12d
  56d42d: f6 d8          neg    %al
  56d42f: 45 19 f6       sbb    %r14d,%r14d
  56d432: 41 83 c6 05    add    $0x5,%r14d
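
To make the branchless sequence easier to follow, here is a rough C rendering of what the instructions compute. It is only a sketch: `DemoShared` and both function names are hypothetical stand-ins, the lock mode numbers are the values from PostgreSQL's lockdefs.h, and the `volatile` qualifier is used here only to keep the two loads separate in the illustration.

```c
#include <stdbool.h>

/* Lock mode numbers as defined in PostgreSQL's storage/lockdefs.h. */
#define RowExclusiveLock          3
#define ShareUpdateExclusiveLock  4
#define ShareLock                 5
#define AccessExclusiveLock       8

/* Hypothetical stand-in for BTShared; only the flag matters here. */
typedef struct DemoShared
{
    bool isconcurrent;
} DemoShared;

/* Mirrors the disassembly: the flag is loaded twice, once per lock mode. */
static void
lockmodes_two_loads(const volatile DemoShared *s, int *heap, int *index)
{
    /* cmpb $0x1,0xa(%rax); sbb %r12d,%r12d; and $0x5; add $0x3 */
    int borrow1 = (s->isconcurrent < 1) ? -1 : 0;   /* first load */
    *index = (borrow1 & 5) + 3;                     /* 8 or 3 */

    /* movzbl 0xa(%rax),%eax; neg %al; sbb %r14d,%r14d; add $0x5 */
    int borrow2 = (s->isconcurrent != 0) ? -1 : 0;  /* second load */
    *heap = borrow2 + 5;                            /* 4 or 5 */
}

/* The source-level if/else, for comparison. */
static void
lockmodes_source(bool isconcurrent, int *heap, int *index)
{
    if (!isconcurrent)
    {
        *heap = ShareLock;
        *index = AccessExclusiveLock;
    }
    else
    {
        *heap = ShareUpdateExclusiveLock;
        *index = RowExclusiveLock;
    }
}
```

As long as the flag does not change between the two loads, both functions produce the same pair of lock modes.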

We can see that the compiled program loads `btshared->isconcurrent` twice, and
each loaded value is used for one of the two assignments.
`btshared->isconcurrent` appears to be a shared object, so if it is modified
concurrently elsewhere between the `cmpb` and `movzbl` instructions, the two
lock modes could be chosen from different values of the flag, which may lead
to concurrency bugs.
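
For illustration only: if a flag like this really could change concurrently (which this report does not establish), one common way to pin the read to a single load is to go through a volatile-qualified access and keep the result in a local. The sketch below assumes hypothetical names (`SnapShared`, `read_flag_once`, `choose_lockmodes`) and is not taken from the PostgreSQL source.

```c
#include <stdbool.h>

/* Lock mode numbers as defined in PostgreSQL's storage/lockdefs.h. */
#define RowExclusiveLock          3
#define ShareUpdateExclusiveLock  4
#define ShareLock                 5
#define AccessExclusiveLock       8

/* Hypothetical stand-in for the shared struct. */
typedef struct SnapShared
{
    bool isconcurrent;
} SnapShared;

/*
 * Read the flag through a volatile-qualified lvalue.  The compiler must
 * perform exactly one load for this access and may not repeat it later.
 */
static inline bool
read_flag_once(const bool *flag)
{
    return *(const volatile bool *) flag;
}

/* Both lock modes are derived from the same snapshot of the flag. */
static void
choose_lockmodes(const SnapShared *s, int *heap, int *index)
{
    bool isconcurrent = read_flag_once(&s->isconcurrent);

    if (!isconcurrent)
    {
        *heap = ShareLock;
        *index = AccessExclusiveLock;
    }
    else
    {
        *heap = ShareUpdateExclusiveLock;
        *index = RowExclusiveLock;
    }
}
```

This only guarantees that both assignments see the same value; it does not by itself order the read against writes from other processes.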


From: Daniel Gustafsson <daniel(at)yesql(dot)se>
To: absoler(at)smail(dot)nju(dot)edu(dot)cn, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #17910: gcc-introduced load may cause concurrency bug
Date: 2023-04-27 09:01:41
Message-ID: A34A108F-E4AA-42DD-9EAC-9754D003AF3E@yesql.se

> On 27 Apr 2023, at 10:45, PG Bug reporting form <noreply(at)postgresql(dot)org> wrote:

> We found that postgres-release-15.2, built with gcc-12.1 on Ubuntu 20.04,
> contains a compiler-introduced load in
> "_bt_parallel_build_main(dsm_segment *seg, shm_toc *toc)".

Thanks for your report!

> We can see that the compiled program loads `btshared->isconcurrent` twice, and
> each loaded value is used for one of the two assignments.
> `btshared->isconcurrent` appears to be a shared object, so if it is modified
> concurrently elsewhere between the `cmpb` and `movzbl` instructions, the two
> lock modes could be chosen from different values of the flag, which may lead
> to concurrency bugs.

The isconcurrent struct member is set to indicate if the operation is a CREATE
INDEX CONCURRENTLY or not, and should not be changed at any point during the
index creation. If you look at the definition of BTShared it has this comment:

    /*
     * These fields are not modified during the sort. They primarily exist
     * for the benefit of worker processes that need to create BTSpool state
     * corresponding to that used by the leader.
     */
    Oid         heaprelid;
    Oid         indexrelid;
    bool        isunique;
    bool        nulls_not_distinct;
    bool        isconcurrent;
    int         scantuplesortstates;

There should be no concurrent modifications of this field. Did you observe any
case where this caused an issue?

--
Daniel Gustafsson