Lists: | pgsql-bugs |
---|
From: | PG Bug reporting form <noreply(at)postgresql(dot)org> |
---|---|
To: | pgsql-bugs(at)lists(dot)postgresql(dot)org |
Cc: | exclusion(at)gmail(dot)com |
Subject: | BUG #18499: Reindexing spgist index concurrently triggers Assert("TransactionIdIsValid(state->myXid)") |
Date: | 2024-06-07 11:00:00 |
Message-ID: | 18499-8a519c280f956480@postgresql.org |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-bugs |
The following bug has been logged on the website:
Bug reference: 18499
Logged by: Alexander Lakhin
Email address: exclusion(at)gmail(dot)com
PostgreSQL version: 17beta1
Operating system: Ubuntu 22.04
Description:
The following script:
cat << EOF | psql &
CREATE TABLE t(i int, p point);
CREATE INDEX spgist_point_idx ON t USING spgist(p);
INSERT INTO t (i, p)
SELECT g, point(g*10+1, g*10+1) FROM generate_series(1, 100000) g;
SELECT pg_sleep(1);
INSERT INTO t (i, p)
SELECT g, point(g*10+1, g*10+1) FROM generate_series(1, 100000) g;
EOF
cat << EOF | psql
SELECT pg_sleep(1);
REINDEX INDEX CONCURRENTLY spgist_point_idx;
EOF
triggers an assertion failure with the following stack trace:
TRAP: failed Assert("TransactionIdIsValid(state->myXid)"), File:
"spgutils.c", Line: 1078, PID: 2975757
(gdb) bt
...
#5 0x00005569aefe0727 in ExceptionalCondition
(conditionName=conditionName(at)entry=0x5569af0562a0
"TransactionIdIsValid(state->myXid)", fileName=fileName(at)entry=0x5569af05636c
"spgutils.c", lineNumber=lineNumber(at)entry=1078)
at assert.c:66
#6 0x00005569aeb3c344 in spgFormDeadTuple
(state=state(at)entry=0x7ffc3a771760, tupstate=tupstate(at)entry=1,
blkno=blkno(at)entry=0, offnum=offnum(at)entry=1) at spgutils.c:1078
#7 0x00005569aeb3385b in spgPageIndexMultiDelete (...) at
spgdoinsert.c:166
#8 0x00005569aeb34d90 in doPickSplit (...) at spgdoinsert.c:1177
#9 0x00005569aeb35a57 in spgdoinsert (...) at spgdoinsert.c:2139
#10 0x00005569aeb3661f in spginsert (...) at spginsert.c:206
#11 0x00005569aeb11a6d in index_insert (...) at indexam.c:230
#12 0x00005569aeb035f5 in heapam_index_validate_scan (...) at
heapam_handler.c:1963
#13 0x00005569aeb9589b in table_index_validate_scan (...) at
../../../src/include/access/tableam.h:1855
#14 validate_index (...) at index.c:3392
#15 0x00005569aec4dbeb in ReindexRelationConcurrently (...) at
indexcmds.c:4036
#16 0x00005569aec4ec05 in ReindexIndex (...) at indexcmds.c:2814
#17 0x00005569aec4efd9 in ExecReindex (...) at indexcmds.c:2743
(gdb) frame 6
(gdb) p state->myXid
$3 = 0
Without asserts enabled, REINDEX executed with no error.
Reproduced on REL_12_STABLE .. master.
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | exclusion(at)gmail(dot)com |
Cc: | pgsql-bugs(at)lists(dot)postgresql(dot)org |
Subject: | Re: BUG #18499: Reindexing spgist index concurrently triggers Assert("TransactionIdIsValid(state->myXid)") |
Date: | 2024-06-08 22:52:53 |
Message-ID: | 428421.1717887173@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-bugs |
PG Bug reporting form <noreply(at)postgresql(dot)org> writes:
> The following script:
> ...
> triggers an assertion failure with the following stack trace:
> TRAP: failed Assert("TransactionIdIsValid(state->myXid)"), File:
> "spgutils.c", Line: 1078, PID: 2975757
Thanks for the report! The problem evidently starts in
initSpGistState, which does
state->myXid = GetTopTransactionIdIfAny();
Ordinarily this would produce a valid myXid, since if we're creating
an index or inserting a tuple into an existing index, there would
already be an XID. But REINDEX CONCURRENTLY reaches this code in a
new transaction that hasn't assigned an XID and probably never will.
So if we have to make a REDIRECT tuple, we have trouble.
> Without asserts enabled, REINDEX executed with no error.
It's still broken though, because creating a REDIRECT tuple with
an invalid XID is not OK; it'll cause trouble later on.
I experimented with
- state->myXid = GetTopTransactionIdIfAny();
+ state->myXid = GetTopTransactionId();
but that causes the parallel-vacuum tests to fall over with
ERROR: cannot assign XIDs during a parallel operation
Apparently, we never make REDIRECT tuples during VACUUM, or we'd
have seen this error reported before. We could maybe make this
code read something like "if not VACUUM then assign an XID", but
that might still fail in parallelized concurrent REINDEX.
In any case, it feels like a bad idea that initSpGistState doesn't
always make myXid valid: that seems like it's assuming too much
about when we can create REDIRECT tuples.
What seems like it should work is
state->myXid = GetTopTransactionIdIfAny();
+ if (!TransactionIdIsValid(state->myXid))
+ state->myXid = ReadNextTransactionId();
on the grounds that if we lack an XID, then the next-to-be-assigned
XID should be a safe expiry horizon for any REDIRECT we might need.
There are some comments to be updated, and in HEAD I'd be inclined to
rename "myXid" to something that doesn't imply that it's necessarily
our own XID. Maybe "redirectXid", since it's not used for anything
but that.
regards, tom lane
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | exclusion(at)gmail(dot)com |
Cc: | pgsql-bugs(at)lists(dot)postgresql(dot)org |
Subject: | Re: BUG #18499: Reindexing spgist index concurrently triggers Assert("TransactionIdIsValid(state->myXid)") |
Date: | 2024-06-16 20:34:07 |
Message-ID: | 3030961.1718570047@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-bugs |
I wrote:
> In any case, it feels like a bad idea that initSpGistState doesn't
> always make myXid valid: that seems like it's assuming too much
> about when we can create REDIRECT tuples.
> What seems like it should work is
> state->myXid = GetTopTransactionIdIfAny();
> + if (!TransactionIdIsValid(state->myXid))
> + state->myXid = ReadNextTransactionId();
> on the grounds that if we lack an XID, then the next-to-be-assigned
> XID should be a safe expiry horizon for any REDIRECT we might need.
I spent a good deal more time thinking about this, and concluded that
that idea doesn't help. The only case where this is problematic is
redirect tuples inserted during REINDEX CONCURRENTLY. As far as I can
see, using ReadNextTransactionId adds no safety because that XID could
be consumed and then fall below the global xmin horizon before REINDEX
finishes. (If REINDEX CONCURRENTLY held onto a single snapshot for
the duration, that snapshot's xmin would prevent this; but it doesn't,
so there could be times when there's no snapshot anywhere.)
Nonetheless, it seems like it'd mostly be okay to insert invalid XID
in redirects generated by REINDEX CONCURRENTLY. The reason that that
seems safe is that R.C. doesn't allow any scans to use the index at
all till it's done; therefore there could never be a scan "in flight"
to such a redirect. We do allow insertions to happen concurrently
with R.C., but those should never be in flight to a redirect at all,
as explained in src/backend/access/spgist/README. And furthermore,
R.C. also locks out VACUUM. This means that it should be safe for
VACUUM to clean up a redirect inserted by R.C. immediately when it
sees one.
The one problem is that VACUUM's test
GlobalVisTestIsRemovableXid(vistest, dt->xid))
would fail on invalid xid, mainly because FullXidRelativeTo doesn't
work on that. (It'd get an assertion failure, or produce the wrong
value without assertions.) We can fix that by adding an explicit
test for invalid xid, which we really need to do anyway because of
the possibility that existing indexes contain such redirects.
So I conclude that we basically just need to remove the failing
assertion in spgFormDeadTuple and instead add a guard for invalid
xid in vacuumRedirectAndPlaceholder. The attached draft patch
also adds some comments and performs the threatened rename of
myXid to redirectXid. (I wouldn't do that renaming in back
branches, only HEAD.)
Thoughts?
regards, tom lane
Attachment | Content-Type | Size |
---|---|---|
fix-spgist-redirect-in-reindex-concurrently.patch | text/x-diff | 5.2 KB |
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | exclusion(at)gmail(dot)com |
Cc: | pgsql-bugs(at)lists(dot)postgresql(dot)org |
Subject: | Re: BUG #18499: Reindexing spgist index concurrently triggers Assert("TransactionIdIsValid(state->myXid)") |
Date: | 2024-06-16 22:52:52 |
Message-ID: | 3045795.1718578372@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-bugs |
I wrote:
> So I conclude that we basically just need to remove the failing
> assertion in spgFormDeadTuple and instead add a guard for invalid
> xid in vacuumRedirectAndPlaceholder.
BTW, a different line of attack could be to not generate redirects
at all during REINDEX CONCURRENTLY: on the basis of this argument,
we don't need them. So that would look roughly similar to the tests
that skip making redirects when isBuild is true, and it'd allow
keeping the assertion in spgFormDeadTuple, and it'd save some
usually-trifling amount of work in the next VACUUM. However, I'm
not sure there's a nice way for spginsert() to know whether it's
being invoked in REINDEX CONCURRENTLY or a normal INSERT/UPDATE
query. Can we trust indexInfo->ii_Concurrent for that?
regards, tom lane
From: | Michael Paquier <michael(at)paquier(dot)xyz> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | exclusion(at)gmail(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org |
Subject: | Re: BUG #18499: Reindexing spgist index concurrently triggers Assert("TransactionIdIsValid(state->myXid)") |
Date: | 2024-06-16 23:24:10 |
Message-ID: | Zm90GkbGq16gjJy0@paquier.xyz |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-bugs |
On Sun, Jun 16, 2024 at 06:52:52PM -0400, Tom Lane wrote:
> BTW, a different line of attack could be to not generate redirects
> at all during REINDEX CONCURRENTLY: on the basis of this argument,
> we don't need them. So that would look roughly similar to the tests
> that skip making redirects when isBuild is true, and it'd allow
> keeping the assertion in spgFormDeadTuple, and it'd save some
> usually-trifling amount of work in the next VACUUM. However, I'm
> not sure there's a nice way for spginsert() to know whether it's
> being invoked in REINDEX CONCURRENTLY or a normal INSERT/UPDATE
> query. Can we trust indexInfo->ii_Concurrent for that?
I am not sure to understand the redirection part for spgist, but
except if I am missing something, we already rely on ii_Concurrent for
other index AMs like BRIN paths to check if we are dealing with a
concurrent build path or not. index_concurrently_build() is used
by both CIC and REINDEX CONCURRENTLY, where the flag is set after a
BuildIndexInfo().
--
Michael
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Michael Paquier <michael(at)paquier(dot)xyz> |
Cc: | exclusion(at)gmail(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org |
Subject: | Re: BUG #18499: Reindexing spgist index concurrently triggers Assert("TransactionIdIsValid(state->myXid)") |
Date: | 2024-06-16 23:30:46 |
Message-ID: | 3050563.1718580646@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-bugs |
Michael Paquier <michael(at)paquier(dot)xyz> writes:
> On Sun, Jun 16, 2024 at 06:52:52PM -0400, Tom Lane wrote:
>> ... not sure there's a nice way for spginsert() to know whether it's
>> being invoked in REINDEX CONCURRENTLY or a normal INSERT/UPDATE
>> query. Can we trust indexInfo->ii_Concurrent for that?
> I am not sure to understand the redirection part for spgist, but
> except if I am missing something, we already rely on ii_Concurrent for
> other index AMs like BRIN paths to check if we are dealing with a
> concurrent build path or not. index_concurrently_build() is used
> by both CIC and REINDEX CONCURRENTLY, where the flag is set after a
> BuildIndexInfo().
Right. I was thinking that CIC wouldn't reach spginsert(), rather
spgbuild(), but it does feel a bit rickety. A separate flag would
be better.
On the whole I'm inclined to stick with the patch as I have it.
Maybe somebody would like to investigate this idea as an improvement
for v18 or later.
regards, tom lane
From: | Michael Paquier <michael(at)paquier(dot)xyz> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | exclusion(at)gmail(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org |
Subject: | Re: BUG #18499: Reindexing spgist index concurrently triggers Assert("TransactionIdIsValid(state->myXid)") |
Date: | 2024-06-21 01:54:09 |
Message-ID: | ZnTdQW2ne9m10zXf@paquier.xyz |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-bugs |
On Sun, Jun 16, 2024 at 07:30:46PM -0400, Tom Lane wrote:
> Right. I was thinking that CIC wouldn't reach spginsert(), rather
> spgbuild(), but it does feel a bit rickety. A separate flag would
> be better.
Why does it feel rickety for you? We already rely on this flag so it
does not seem like a huge problem to rely on it again for the sake of
this index AM.
--
Michael