Re: [HACKERS] index fix report

Lists: pgsql-hackers
From: Bruce Momjian <maillist(at)candle(dot)pha(dot)pa(dot)us>
To: hackers(at)postgreSQL(dot)org (PostgreSQL-development)
Subject: index fix report
Date: 1998-09-03 04:21:50
Message-ID: 199809030421.AAA26089@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I am still waiting for the first report from my index fix.

Anyone?

--
Bruce Momjian | 830 Blythe Avenue
maillist(at)candle(dot)pha(dot)pa(dot)us | Drexel Hill, Pennsylvania 19026
+ If your life is a hard drive, | (610) 353-9879(w)
+ Christ can be your backup. | (610) 853-3000(h)


From: "Thomas G(dot) Lockhart" <lockhart(at)alumni(dot)caltech(dot)edu>
To: Bruce Momjian <maillist(at)candle(dot)pha(dot)pa(dot)us>
Cc: PostgreSQL-development <hackers(at)postgreSQL(dot)org>
Subject: Re: [HACKERS] index fix report
Date: 1998-09-03 04:59:33
Message-ID: 35EE2235.F6ACF1A1@alumni.caltech.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> I am still waiting for the first report from my index fix.
> Anyone?

OK, here is my interim report: everything is broken :)

Looks like David's patches busted several Makefiles. Will poke at it and
let you know.

In the meantime, I can't test your fixes; maybe next time wait a few
hours to let things settle down before adding new breakage? *wink*

- Tom

gcc -I../../../include -I../../../backend -O2 -m486 -Wall
-Wmissing-prototypes -I../.. -Wno-error -c geqo_ox2.c -o geqo_ox2.o
ld -r -o SUBSYS.o geqo_copy.o geqo_eval.o geqo_main.o geqo_misc.o
geqo_params.o geqo_paths.o geqo_pool.o geqo_recombination.o
geqo_selection.o geqo_erx.o geqo_pmx.o geqo_cx.o geqo_px.o geqo_ox1.o
geqo_ox2.o
make[3]: Leaving directory
`/opt/postgres/pgsql/src/backend/optimizer/geqo'
for i in path plan prep util geqo; do make -C $i prep/SUBSYS.o; done
make[3]: Entering directory
`/opt/postgres/pgsql/src/backend/optimizer/path'
make[3]: *** No rule to make target `prep/SUBSYS.o'. Stop.
make[3]: Leaving directory
`/opt/postgres/pgsql/src/backend/optimizer/path'
make[3]: Entering directory
`/opt/postgres/pgsql/src/backend/optimizer/plan'
make[3]: *** No rule to make target `prep/SUBSYS.o'. Stop.
make[3]: Leaving directory
`/opt/postgres/pgsql/src/backend/optimizer/plan'
make[3]: Entering directory
`/opt/postgres/pgsql/src/backend/optimizer/prep'
make[3]: *** No rule to make target `prep/SUBSYS.o'. Stop.
make[3]: Leaving directory
`/opt/postgres/pgsql/src/backend/optimizer/prep'
make[3]: Entering directory
`/opt/postgres/pgsql/src/backend/optimizer/util'
make[3]: *** No rule to make target `prep/SUBSYS.o'. Stop.
make[3]: Leaving directory
`/opt/postgres/pgsql/src/backend/optimizer/util'
make[3]: Entering directory
`/opt/postgres/pgsql/src/backend/optimizer/geqo'
make[3]: *** No rule to make target `prep/SUBSYS.o'. Stop.
make[3]: Leaving directory
`/opt/postgres/pgsql/src/backend/optimizer/geqo'
make[2]: *** [prep/SUBSYS.o] Error 2
make[2]: Leaving directory `/opt/postgres/pgsql/src/backend/optimizer'
make[1]: *** [optimizer.dir] Error 2
make[1]: Leaving directory `/opt/postgres/pgsql/src/backend'
make: *** [install] Error 2


From: Bruce Momjian <maillist(at)candle(dot)pha(dot)pa(dot)us>
To: lockhart(at)alumni(dot)caltech(dot)edu (Thomas G(dot) Lockhart)
Cc: hackers(at)postgreSQL(dot)org
Subject: Re: [HACKERS] index fix report
Date: 1998-09-03 05:08:12
Message-ID: 199809030508.BAA02789@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> > I am still waiting for the first report from my index fix.
> > Anyone?
>
> OK, here is my interim report: everything is broken :)
>
> Looks like David's patches busted several Makefiles. Will poke at it and
> let you know.
>
> In the meantime, I can't test your fixes; maybe next time wait a few
> hours to let things settle down before adding new breakage? *wink*

It is compiling here now. Strange. There are few problems with
libpgtcl and psql.c that I am fixing now.

--
Bruce Momjian | 830 Blythe Avenue
maillist(at)candle(dot)pha(dot)pa(dot)us | Drexel Hill, Pennsylvania 19026
+ If your life is a hard drive, | (610) 353-9879(w)
+ Christ can be your backup. | (610) 853-3000(h)


From: Bruce Momjian <maillist(at)candle(dot)pha(dot)pa(dot)us>
To: lockhart(at)alumni(dot)caltech(dot)edu (Thomas G(dot) Lockhart)
Cc: hackers(at)postgreSQL(dot)org
Subject: Re: [HACKERS] index fix report
Date: 1998-09-03 05:09:50
Message-ID: 199809030509.BAA02891@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> OK, it looks like there is a missing file in the tree. I had to modify
>
> ./backend/optimizer/prep/Makefile
>
> to remove a reference to a nonexistant object file prepkeyset.o
> presumably needing to be built from prepkeyset.c.
>
> That got things building until the next link step, when some routines
> were a missin'. Got that file anywhere? Seems useful...

No wonder it compiles here so nicely. Adding it now. I will also do a
make distclean and cvs to see if any other files are missing.
>
> - Tom
>
> commands/SUBSYS.o: In function `parse_ksqo':
> commands/SUBSYS.o(.text+0xdff4): undefined reference to
> `_use_keyset_query_optimizer'
> commands/SUBSYS.o(.text+0xe014): undefined reference to
> `_use_keyset_query_optimizer'
> commands/SUBSYS.o: In function `show_ksqo':
> commands/SUBSYS.o(.text+0xe045): undefined reference to
> `_use_keyset_query_optimizer'
> commands/SUBSYS.o: In function `reset_ksqo':
> commands/SUBSYS.o(.text+0xe085): undefined reference to
> `_use_keyset_query_optimizer'
> optimizer/SUBSYS.o: In function `planner':
> optimizer/SUBSYS.o(.text+0x564b): undefined reference to
> `transformKeySetQuery'
> make[1]: *** [postgres] Error 1
> make[1]: Leaving directory `/opt/postgres/pgsql/src/backend'
> make: *** [install] Error 2
>

--
Bruce Momjian | 830 Blythe Avenue
maillist(at)candle(dot)pha(dot)pa(dot)us | Drexel Hill, Pennsylvania 19026
+ If your life is a hard drive, | (610) 353-9879(w)
+ Christ can be your backup. | (610) 853-3000(h)


From: "Thomas G(dot) Lockhart" <lockhart(at)alumni(dot)caltech(dot)edu>
To: Bruce Momjian <maillist(at)candle(dot)pha(dot)pa(dot)us>
Cc: PostgreSQL-development <hackers(at)postgreSQL(dot)org>
Subject: Re: [HACKERS] index fix report
Date: 1998-09-03 05:11:32
Message-ID: 35EE2504.F30AEB30@alumni.caltech.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

OK, it looks like there is a missing file in the tree. I had to modify

./backend/optimizer/prep/Makefile

to remove a reference to a nonexistant object file prepkeyset.o
presumably needing to be built from prepkeyset.c.

That got things building until the next link step, when some routines
were a missin'. Got that file anywhere? Seems useful...

- Tom

commands/SUBSYS.o: In function `parse_ksqo':
commands/SUBSYS.o(.text+0xdff4): undefined reference to
`_use_keyset_query_optimizer'
commands/SUBSYS.o(.text+0xe014): undefined reference to
`_use_keyset_query_optimizer'
commands/SUBSYS.o: In function `show_ksqo':
commands/SUBSYS.o(.text+0xe045): undefined reference to
`_use_keyset_query_optimizer'
commands/SUBSYS.o: In function `reset_ksqo':
commands/SUBSYS.o(.text+0xe085): undefined reference to
`_use_keyset_query_optimizer'
optimizer/SUBSYS.o: In function `planner':
optimizer/SUBSYS.o(.text+0x564b): undefined reference to
`transformKeySetQuery'
make[1]: *** [postgres] Error 1
make[1]: Leaving directory `/opt/postgres/pgsql/src/backend'
make: *** [install] Error 2


From: Bruce Momjian <maillist(at)candle(dot)pha(dot)pa(dot)us>
To: lockhart(at)alumni(dot)caltech(dot)edu (Thomas G(dot) Lockhart)
Cc: hackers(at)postgreSQL(dot)org
Subject: Re: [HACKERS] index fix report
Date: 1998-09-03 05:18:05
Message-ID: 199809030518.BAA16184@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> > I am still waiting for the first report from my index fix.
> > Anyone?
>
> OK, here is my interim report: everything is broken :)
>
> Looks like David's patches busted several Makefiles. Will poke at it and
> let you know.
>
> In the meantime, I can't test your fixes; maybe next time wait a few
> hours to let things settle down before adding new breakage? *wink*

OK, everything is there now.

--
Bruce Momjian | 830 Blythe Avenue
maillist(at)candle(dot)pha(dot)pa(dot)us | Drexel Hill, Pennsylvania 19026
+ If your life is a hard drive, | (610) 353-9879(w)
+ Christ can be your backup. | (610) 853-3000(h)


From: "Thomas G(dot) Lockhart" <lockhart(at)alumni(dot)caltech(dot)edu>
To: Bruce Momjian <maillist(at)candle(dot)pha(dot)pa(dot)us>
Cc: hackers(at)postgreSQL(dot)org
Subject: Re: [HACKERS] index fix report
Date: 1998-09-03 05:30:25
Message-ID: 35EE2971.CCF4B974@alumni.caltech.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: Postg스포츠 토토 결과SQL

> make distclean and cvs to see if any other files are missing.

More breakage. Will disable the tcl stuff and see if I can proceed...

- Tom

gcc -I../../include -I../../backend -O2 -m486 -Wall
-Wmissing-prototypes -I../../backend -I../../include
-I../../interfaces/libpq -fpic -c pgtclCmds.c -o pgtclCmds.o
pgtclCmds.c: In function `Pg_connect':
pgtclCmds.c:366: parse error before `else'
pgtclCmds.c:373: warning: control reaches end of non-void function
pgtclCmds.c: In function `Pg_listen':
pgtclCmds.c:1460: parse error at end of input
make[2]: *** [pgtclCmds.o] Error 1
make[2]: Leaving directory `/opt/postgres/pgsql/src/interfaces/libpgtcl'
make[1]: *** [install] Error 2
make[1]: Leaving directory `/opt/postgres/pgsql/src/interfaces'
make: *** [install] Error 2


From: "Thomas G(dot) Lockhart" <lockhart(at)alumni(dot)caltech(dot)edu>
To: Bruce Momjian <maillist(at)candle(dot)pha(dot)pa(dot)us>
Cc: hackers(at)postgreSQL(dot)org
Subject: Re: [HACKERS] index fix report
Date: 1998-09-03 05:36:31
Message-ID: 35EE2ADF.42BFC8FA@alumni.caltech.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> make distclean and cvs to see if any other files are missing.

... And more breakage ... Fortunately I had a psql from yesterday lying
around, so will continue testing ...

- Tom

psql.c: In function `SendQuery':
psql.c:1139: warning: passing arg 1 of `handleCopyIn' from incompatible
pointer type
psql.c:1143: warning: passing arg 1 of `handleCopyIn' from incompatible
pointer type
psql.c: At top level:
psql.c:2957: conflicting types for `handleCopyIn'
psql.c:153: previous declaration of `handleCopyIn'
make[2]: *** [psql.o] Error 1
make[2]: Leaving directory `/opt/postgres/pgsql/src/bin/psql'
make[2]: Entering directory `/opt/postgres/pgsql/src/bin/pg_dump'


From: "Thomas G(dot) Lockhart" <lockhart(at)alumni(dot)caltech(dot)edu>
To: Bruce Momjian <maillist(at)candle(dot)pha(dot)pa(dot)us>
Cc: hackers(at)postgreSQL(dot)org
Subject: Re: [HACKERS] index fix report
Date: 1998-09-03 05:52:52
Message-ID: 35EE2EB4.1FA7EA64@alumni.caltech.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> OK, everything is there now.

The select_having test now passes. The other tests which were broken
yesterday are still broken:

constraints .. failed -- trouble finding a new table after dropping old
...
create_index .. failed -- trouble creating an index after creating first
sanity_check .. failed -- missing tables from previous failures, n/a
...
select_views .. failed -- old (one month?) problem with core dump
...
run_ruletest .. failed -- dba account name differs, n/a

I did a "make distclean" earlier this evening, and a "make clean" and
"initdb" during this recent build session, so things should be up to
date. Let's not worry about select_views until we've fixed constraints
and create_index, since that is an older problem.

I'm sure we are getting closer though...

- Tom


From: David Hartwig <daveh(at)insightdist(dot)com>
To: "Thomas G(dot) Lockhart" <lockhart(at)alumni(dot)caltech(dot)edu>
Cc: Bruce Momjian <maillist(at)candle(dot)pha(dot)pa(dot)us>, PostgreSQL-development <hackers(at)postgreSQL(dot)org>
Subject: Re: [HACKERS] index fix report
Date: 1998-09-03 13:33:17
Message-ID: 35EE9A9D.E6F37682@insightdist.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Thomas G. Lockhart wrote:

> > I am still waiting for the first report from my index fix.
> > Anyone?
>
> OK, here is my interim report: everything is broken :)
>
> Looks like David's patches busted several Makefiles. Will poke at it and
> let you know.
>

Did this get resolved? I reviewed the original patch, and the missing
items that are mentioned in this thread are in the patch.


From: "Thomas G(dot) Lockhart" <lockhart(at)alumni(dot)caltech(dot)edu>
To: David Hartwig <daveh(at)insightdist(dot)com>
Cc: Bruce Momjian <maillist(at)candle(dot)pha(dot)pa(dot)us>, PostgreSQL-development <hackers(at)postgreSQL(dot)org>
Subject: Re: [HACKERS] index fix report
Date: 1998-09-03 14:11:50
Message-ID: 35EEA3A6.8E9270F7@alumni.caltech.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> Did this get resolved?

Yes, a file got left out of the cvs commit; Bruce fixed it right away.

Update:

I'm still seeing problems with the regression tests, and it appears to
be the same symptom reported by someone else earlier: a couple of tables
(or indices) exist but something is munged in pg_class so that I can
only see the entry using a "like" query; an "=" equals query does not
return the row.

This may only happen with tables which have been destroyed and then
redefined??

I've done a full "cvs update -Pd", a "make distclean", and an initdb.
I'll try a clean checkout next. If that doesn't help, then will start
poking at it...

- Tom


From: David Hartwig <daveh(at)insightdist(dot)com>
To: "Thomas G(dot) Lockhart" <lockhart(at)alumni(dot)caltech(dot)edu>
Cc: Bruce Momjian <maillist(at)candle(dot)pha(dot)pa(dot)us>, PostgreSQL-development <hackers(at)postgreSQL(dot)org>
Subject: Re: [HACKERS] index fix report
Date: 1998-09-03 14:41:29
Message-ID: 35EEAA99.95D1132B@insightdist.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Thomas G. Lockhart wrote:

> > Did this get resolved?
>
> Yes, a file got left out of the cvs commit; Bruce fixed it right away.
>
> Update:
>
> I'm still seeing problems with the regression tests, and it appears to
> be the same symptom reported by someone else earlier: a couple of tables
> (or indices) exist but something is munged in pg_class so that I can
> only see the entry using a "like" query; an "=" equals query does not
> return the row.

I know why this was happening. (At least on the surface) In my case,
whenever a I added an index to a table pg_class_relname_index was getting
corrupted. The nature of the corruption was that any query that used the
pg_class_relname_index to find a table that was just indexed, could no
longer find it. The corruption must occur on the update of pg_class when
the index is added. This explains why:

CREATE TABLE foo (i int);
CREATE INDEX foo_idx ON foo USING btree(i);

SELECT * FROM pg_class;

Showed a complete correct list of tables.

SELECT * FROM pg_class WHERE = 'foo';
and
SELECT * FROM pg_class WHERE LIKE 'foo%';

Showed nothing. I did not know LIKE was using an index. It through
me. Nice job by sombody.

SELECT * FROM pg_class WHERE LIKE '%foo';

Showed my original table. But this query like the first does a full
scan.

In any case I will be doing some testing myself this afternoon.


From: "Thomas G(dot) Lockhart" <lockhart(at)alumni(dot)caltech(dot)edu>
To: Bruce Momjian <maillist(at)candle(dot)pha(dot)pa(dot)us>
Cc: hackers(at)postgreSQL(dot)org
Subject: Re: [HACKERS] index fix report
Date: 1998-09-03 15:06:43
Message-ID: 35EEB083.1054D521@alumni.caltech.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> OK, everything is there now.

I did a CVSup, a cvs checkout into a new tree, and still see the same
problems. I'll bet that most of the symptoms in the regression tests are
due to the same thing, which seems to involve table munging of pg_class.
It looks like some tuples end up with leading garbage of some sort in
the first field: see the example below.

I'll try to distill this down to a simpler test case. If anyone has some
hints on where to look...

- Tom

regression=> select * from primary_tbl;
ERROR: primary_tbl: Table does not exist.

regression=> select * from pg_class where relname = 'primary_tbl';
relname|reltype|relowner|relam|relpages|reltuples|relhasindex|relisshared|relkind|relnatts|relchecks|reltriggers|relhasrules|relacl
-------+-------+--------+-----+--------+---------+-----------+-----------+-------+--------+---------+-----------+-----------+------
(0 rows)

regression=> select * from pg_class where relname like '%primary_tbl%';
relname
|reltype|relowner|relam|relpages|reltuples|relhasindex|relisshared|relkind|relnatts|relchecks|reltriggers|relhasrules|relacl
----------------+-------+--------+-----+--------+---------+-----------+-----------+-------+--------+---------+-----------+-----------+------
primary_tbl | 0| 500| 0| 0| 0|t
|f |r | 2| 0| 0|f |
primary_tbl_pkey| 0| 500| 403| 2| 1024|f
|f |i | 2| 0| 0|f |
(2 rows)

regression=> select * from pg_class where relname like 'primary_tbl%';
relname|reltype|relowner|relam|relpages|reltuples|relhasindex|relisshared|relkind|relnatts|relchecks|reltriggers|relhasrules|relacl
-------+-------+--------+-----+--------+---------+-----------+-----------+-------+--------+---------+-----------+-----------+------
(0 rows)

regression=> select * from pg_class where relname like '%primary_tbl';
relname
|reltype|relowner|relam|relpages|reltuples|relhasindex|relisshared|relkind|relnatts|relchecks|reltriggers|relhasrules|relacl
-----------+-------+--------+-----+--------+---------+-----------+-----------+-------+--------+---------+-----------+-----------+------
primary_tbl| 0| 500| 0| 0| 0|t
|f |r | 2| 0| 0|f |
(1 row)


From: Bruce Momjian <maillist(at)candle(dot)pha(dot)pa(dot)us>
To: daveh(at)insightdist(dot)com (David Hartwig)
Cc: lockhart(at)alumni(dot)caltech(dot)edu, hackers(at)postgreSQL(dot)org
Subject: Re: [HACKERS] index fix report
Date: 1998-09-03 15:09:54
Message-ID: 199809031509.LAA06504@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

>
>
> Thomas G. Lockhart wrote:
>
> > > I am still waiting for the first report from my index fix.
> > > Anyone?
> >
> > OK, here is my interim report: everything is broken :)
> >
> > Looks like David's patches busted several Makefiles. Will poke at it and
> > let you know.
> >
>
> Did this get resolved? I reviewed the original patch, and the missing
> items that are mentioned in this thread are in the patch.

Resolved. I forgot to add the new file.

--
Bruce Momjian | 830 Blythe Avenue
maillist(at)candle(dot)pha(dot)pa(dot)us | Drexel Hill, Pennsylvania 19026
+ If your life is a hard drive, | (610) 353-9879(w)
+ Christ can be your backup. | (610) 853-3000(h)


From: David Hartwig <daveh(at)insightdist(dot)com>
To: "Thomas G(dot) Lockhart" <lockhart(at)alumni(dot)caltech(dot)edu>
Cc: Bruce Momjian <maillist(at)candle(dot)pha(dot)pa(dot)us>, hackers(at)postgreSQL(dot)org
Subject: Re: [HACKERS] index fix report
Date: 1998-09-03 15:13:43
Message-ID: 35EEB227.AE20ABE1@insightdist.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom, its in the index. Do an explain before each of your queries.

Thomas G. Lockhart wrote:

> > OK, everything is there now.
>
> I did a CVSup, a cvs checkout into a new tree, and still see the same
> problems. I'll bet that most of the symptoms in the regression tests are
> due to the same thing, which seems to involve table munging of pg_class.
> It looks like some tuples end up with leading garbage of some sort in
> the first field: see the example below.
>
> I'll try to distill this down to a simpler test case. If anyone has some
> hints on where to look...
>
> - Tom
>
> regression=> select * from primary_tbl;
> ERROR: primary_tbl: Table does not exist.
>
> regression=> select * from pg_class where relname = 'primary_tbl';
> relname|reltype|relowner|relam|relpages|reltuples|relhasindex|relisshared|relkind|relnatts|relchecks|reltriggers|relhasrules|relacl
> -------+-------+--------+-----+--------+---------+-----------+-----------+-------+--------+---------+-----------+-----------+------
> (0 rows)
>
> regression=> select * from pg_class where relname like '%primary_tbl%';
> relname
> |reltype|relowner|relam|relpages|reltuples|relhasindex|relisshared|relkind|relnatts|relchecks|reltriggers|relhasrules|relacl
> ----------------+-------+--------+-----+--------+---------+-----------+-----------+-------+--------+---------+-----------+-----------+------
> primary_tbl | 0| 500| 0| 0| 0|t
> |f |r | 2| 0| 0|f |
> primary_tbl_pkey| 0| 500| 403| 2| 1024|f
> |f |i | 2| 0| 0|f |
> (2 rows)
>
> regression=> select * from pg_class where relname like 'primary_tbl%';
> relname|reltype|relowner|relam|relpages|reltuples|relhasindex|relisshared|relkind|relnatts|relchecks|reltriggers|relhasrules|relacl
> -------+-------+--------+-----+--------+---------+-----------+-----------+-------+--------+---------+-----------+-----------+------
> (0 rows)
>
> regression=> select * from pg_class where relname like '%primary_tbl';
> relname
> |reltype|relowner|relam|relpages|reltuples|relhasindex|relisshared|relkind|relnatts|relchecks|reltriggers|relhasrules|relacl
> -----------+-------+--------+-----+--------+---------+-----------+-----------+-------+--------+---------+-----------+-----------+------
> primary_tbl| 0| 500| 0| 0| 0|t
> |f |r | 2| 0| 0|f |
> (1 row)


From: David Hartwig <daveh(at)insightdist(dot)com>
To: "Thomas G(dot) Lockhart" <lockhart(at)alumni(dot)caltech(dot)edu>
Cc: Bruce Momjian <maillist(at)candle(dot)pha(dot)pa(dot)us>, hackers(at)postgreSQL(dot)org
Subject: Re: [HACKERS] index fix report
Date: 1998-09-03 15:19:00
Message-ID: 35EEB364.37E4BA30@insightdist.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Thomas G. Lockhart wrote:

> > OK, everything is there now.
>
> I did a CVSup, a cvs checkout into a new tree, and still see the same
> problems.

Forgot to mention. I still have the problem also. Tom, what are you running on? Is platform still a factor in this problem?


From: "Thomas G(dot) Lockhart" <lockhart(at)alumni(dot)caltech(dot)edu>
To: David Hartwig <daveh(at)insightdist(dot)com>
Cc: Bruce Momjian <maillist(at)candle(dot)pha(dot)pa(dot)us>, hackers(at)postgreSQL(dot)org
Subject: Re: [HACKERS] index fix report
Date: 1998-09-04 01:22:30
Message-ID: 35EF40D6.65F4480A@alumni.caltech.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> Forgot to mention. I still have the problem also. Tom, what are
> you running on? Is platform still a factor in this problem?

Platform is a problem in that Bruce's FreeBSD/i686 machine does not show
the symptoms, but it's a pretty widespread problem beyond that afaik.
I'm running on Linux/i686. Tatsuo sees problems on Linux/PPC. Some Sparc
machines running Solaris and Linux apparently still show problems.

I'm guessing that it is a byte alignment difference in malloc behavior
between the systems which exposes misaligned structures on some
platforms. How's that for pure speculation, eh?

- Tom


From: Bruce Momjian <maillist(at)candle(dot)pha(dot)pa(dot)us>
To: lockhart(at)alumni(dot)caltech(dot)edu (Thomas G(dot) Lockhart)
Cc: daveh(at)insightdist(dot)com, hackers(at)postgreSQL(dot)org
Subject: Re: [HACKERS] index fix report
Date: 1998-09-04 01:24:48
Message-ID: 199809040124.VAA14658@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> > Forgot to mention. I still have the problem also. Tom, what are
> > you running on? Is platform still a factor in this problem?
>
> Platform is a problem in that Bruce's FreeBSD/i686 machine does not show
> the symptoms, but it's a pretty widespread problem beyond that afaik.
> I'm running on Linux/i686. Tatsuo sees problems on Linux/PPC. Some Sparc
> machines running Solaris and Linux apparently still show problems.
>
> I'm guessing that it is a byte alignment difference in malloc behavior
> between the systems which exposes misaligned structures on some
> platforms. How's that for pure speculation, eh?

Let me tell you what I need to help debug this.

It would help to know if it is the cache, or an index problem. It is
sometimes hard to determine because the cache often uses the indexes to
load the cache.

Can someone step through a bad entry, and tell me where it is failing?
If it is in the executor, it probably is an index. EXPLAIN does show
what indexes are involved. Are several indexes failing, or just one?

I like the malloc idea, but am unsure how the problem just started
happening with the multi-key system indexes.

--
Bruce Momjian | 830 Blythe Avenue
maillist(at)candle(dot)pha(dot)pa(dot)us | Drexel Hill, Pennsylvania 19026
+ If your life is a hard drive, | (610) 353-9879(w)
+ Christ can be your backup. | (610) 853-3000(h)


From: Vadim Mikheev <vadim(at)krs(dot)ru>
To: Bruce Momjian <maillist(at)candle(dot)pha(dot)pa(dot)us>
Cc: "Thomas G(dot) Lockhart" <lockhart(at)alumni(dot)caltech(dot)edu>, daveh(at)insightdist(dot)com, hackers(at)postgreSQL(dot)org
Subject: Re: [HACKERS] index fix report
Date: 1998-09-04 03:07:41
Message-ID: 35EF597D.77FEC1CD@krs.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Bruce Momjian wrote:
>
> > > Forgot to mention. I still have the problem also. Tom, what are
> > > you running on? Is platform still a factor in this problem?
> >
> > Platform is a problem in that Bruce's FreeBSD/i686 machine does not show
> > the symptoms, but it's a pretty widespread problem beyond that afaik.
> > I'm running on Linux/i686. Tatsuo sees problems on Linux/PPC. Some Sparc
^^^^^^^^^^
> > machines running Solaris and Linux apparently still show problems.
^^^^^^^^^^^^^^^^^^^^^^^^
This is what I have:

SunOS sunpine.krs.ru 5.5.1 Generic_103640-12 sun4u sparc SUNW,Ultra-4

I didn't install 6.4 on this machine but could to do this...
What are the problems ?

Vadim


From: David Hartwig <daybee(at)bellatlantic(dot)net>
To: Bruce Momjian <maillist(at)candle(dot)pha(dot)pa(dot)us>
Cc: "Thomas G(dot) Lockhart" <lockhart(at)alumni(dot)caltech(dot)edu>, daveh(at)insightdist(dot)com, hackers(at)postgreSQL(dot)org
Subject: Re: [HACKERS] index fix report
Date: 1998-09-04 03:22:43
Message-ID: 35EF5D02.812438CC@bellatlantic.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Bruce Momjian wrote:

> > > Forgot to mention. I still have the problem also. Tom, what are
> > > you running on? Is platform still a factor in this problem?
> >
> > Platform is a problem in that Bruce's FreeBSD/i686 machine does not show
> > the symptoms, but it's a pretty widespread problem beyond that afaik.
> > I'm running on Linux/i686. Tatsuo sees problems on Linux/PPC. Some Sparc
> > machines running Solaris and Linux apparently still show problems.
> >
> > I'm guessing that it is a byte alignment difference in malloc behavior
> > between the systems which exposes misaligned structures on some
> > platforms. How's that for pure speculation, eh?
>
> Let me tell you what I need to help debug this.
>
> It would help to know if it is the cache, or an index problem. It is
> sometimes hard to determine because the cache often uses the indexes to
> load the cache.
>
> Can someone step through a bad entry, and tell me where it is failing?
> If it is in the executor, it probably is an index. EXPLAIN does show
> what indexes are involved. Are several indexes failing, or just one?
>
> I like the malloc idea, but am unsure how the problem just started
> happening with the multi-key system indexes.

I will try to frame this as best I can. I would be good it other could
verify my statements or add to them.

When I run this simple scenario:

create table foo (i int);
-- everything is fine
select * from pg_class where relname = 'foo'
-- no problem
select * from pg_class where oid = {oid_num}
-- no problem
create index foo_x on foo using btree(i);
-- Looks ok but it is not
select * from pg_class where relname = 'foo'
-- no rows found
select * from pg_class where oid = {oid_num}
-- no rows found
-- The table and the index in pg_class cannot be found via ether index.
-- They look like single part indexes too.
select * from pg_class
-- shows foo and foo_x along w/ everything else.
-- I can use the UPDATE statement to rewrite the foo and foo_x rows into
pg_class
-- and all is well again.
-- INSERTing into foo does not seem to be a problem.
-- ALTER table has similar negative effects on the system tables, but I
have yet to sort them all out.

I have verified all this using the latest snapshot on an AIX 4.1.4 system.
Non-gcc compiler. I will let you know if the problem is on my Linux box. I
need to reboot and test. But to the best of my knowledge the problem in not
showing itself there.


From: Bruce Momjian <maillist(at)candle(dot)pha(dot)pa(dot)us>
To: daybee(at)bellatlantic(dot)net (David Hartwig)
Cc: lockhart(at)alumni(dot)caltech(dot)edu, daveh(at)insightdist(dot)com, hackers(at)postgreSQL(dot)org
Subject: Re: [HACKERS] index fix report
Date: 1998-09-04 14:59:59
Message-ID: 199809041500.LAA02186@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> When I run this simple scenario:
>
> create table foo (i int);
> -- everything is fine
> select * from pg_class where relname = 'foo'
> -- no problem
> select * from pg_class where oid = {oid_num}
> -- no problem
> create index foo_x on foo using btree(i);
> -- Looks ok but it is not
> select * from pg_class where relname = 'foo'
> -- no rows found
> select * from pg_class where oid = {oid_num}
> -- no rows found
> -- The table and the index in pg_class cannot be found via ether index.
> -- They look like single part indexes too.
> select * from pg_class
> -- shows foo and foo_x along w/ everything else.
> -- I can use the UPDATE statement to rewrite the foo and foo_x rows into
> pg_class
> -- and all is well again.
> -- INSERTing into foo does not seem to be a problem.
> -- ALTER table has similar negative effects on the system tables, but I
> have yet to sort them all out.

This does help. Can you check UpdateRelationRelation(), which is called
from create_index, and which calls CatalogIndexInsert()? Seems like the
problem must be in that area.

Looks like Tatsuo Ishii is on this already, as he has suggested some
good fixes to heap_addheader(), which is called from
UpdateRelationRelation().

Again, I am sorry to have broken this stuff so badly.

--
Bruce Momjian | 830 Blythe Avenue
maillist(at)candle(dot)pha(dot)pa(dot)us | Drexel Hill, Pennsylvania 19026
+ If your life is a hard drive, | (610) 353-9879(w)
+ Christ can be your backup. | (610) 853-3000(h)


From: "Thomas G(dot) Lockhart" <lockhart(at)alumni(dot)caltech(dot)edu>
To: Bruce Momjian <maillist(at)candle(dot)pha(dot)pa(dot)us>
Cc: daveh(at)insightdist(dot)com, hackers(at)postgreSQL(dot)org
Subject: Re: [HACKERS] index fix report
Date: 1998-09-04 15:08:02
Message-ID: 35F00252.E7B6BA5D@alumni.caltech.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> It would help to know if it is the cache, or an index problem. It is
> sometimes hard to determine because the cache often uses the indexes
> to load the cache.
> Can someone step through a bad entry, and tell me where it is failing?
> If it is in the executor, it probably is an index. EXPLAIN does show
> what indexes are involved. Are several indexes failing, or just one?

I'm not sure how to "step through a bad entry" for this case. Just
haven't done it before, and have never used gdb on the backend. That may
explain why I've got so many debugging print statements :)

I believe that in at least some cases the index itself is damaged. If it
were just problems _updating_ the cache, then stopping and restarting
all frontends and backends might fix the problem, at least for the first
query. That doesn't eliminate the possibility that it is a problem with
the cache as it is first built though.

regression=> select oid, relname from pg_class where relname =
'primary_tbl';
oid|relname
---+-------
(0 rows)

regression=> explain select relname
regression-> from pg_class where relname = 'primary_key';
NOTICE: QUERY PLAN:

Index Scan using pg_class_relname_index on pg_class
(cost=2.03 size=2 width=32)

EXPLAIN

regression=> select oid, relname from pg_class
regression-> where relname like '%primary%';
oid|relname
-----+----------------
19947|primary_tbl
19957|primary_tbl_pkey
(2 rows)

regression=> explain select oid, relname from pg_class
regression-> where oid = 19947;
NOTICE: QUERY PLAN:

Index Scan using pg_class_oid_index on pg_class
(cost=2.03 size=2 width=36)

EXPLAIN

So these indices appear damaged. Now here is another symptom from my
regression tests, which appears to illustrate cache damage, though since
it is after the fact perhaps a damaged index has just been changed
enough in the meantime to uncover the right nodes...

The regression result shows trouble finding a relation on which to
create the index, and once it has trouble it never finds the relation
_during the same session_:

QUERY: CREATE INDEX onek_unique1 ON onek USING btree(unique1 int4_ops);
QUERY: CREATE INDEX onek_unique2 ON onek USING btree(unique2 int4_ops);
ERROR: DefineIndex: onek relation not found
QUERY: CREATE INDEX onek_hundred ON onek USING btree(hundred int4_ops);
ERROR: DefineIndex: onek relation not found
QUERY: CREATE INDEX onek_stringu1 ON onek USING btree(stringu1
name_ops);
ERROR: DefineIndex: onek relation not found
QUERY: CREATE INDEX tenk1_unique1 ON tenk1 USING btree(unique1
int4_ops);
QUERY: CREATE INDEX tenk1_unique2 ON tenk1 USING btree(unique2
int4_ops);
QUERY: CREATE INDEX tenk1_hundred ON tenk1 USING btree(hundred
int4_ops);
QUERY: CREATE INDEX tenk2_unique1 ON tenk2 USING btree(unique1
int4_ops);
QUERY: CREATE INDEX tenk2_unique2 ON tenk2 USING btree(unique2
int4_ops);
ERROR: DefineIndex: tenk2 relation not found
QUERY: CREATE INDEX tenk2_hundred ON tenk2 USING btree(hundred
int4_ops);
ERROR: DefineIndex: tenk2 relation not found
<snip other tables' indices successfully created>

However, if I go back in after the regression test is over, the table is
found and the index created:

regression=> CREATE INDEX onek_unique2 ON onek
regression-> USING btree(unique2 int4_ops);
CREATE

btw, my linux box is not quite as sensitive to the problem(s) as David's
AIX box; his simpler test case does not fail on my machine :(

- Tom


From: "Thomas G(dot) Lockhart" <lockhart(at)alumni(dot)caltech(dot)edu>
To: Bruce Momjian <maillist(at)candle(dot)pha(dot)pa(dot)us>, daveh(at)insightdist(dot)com, hackers(at)postgreSQL(dot)org
Subject: Re: [HACKERS] index fix report
Date: 1998-09-04 15:52:54
Message-ID: 35F00CD6.A3376CF1@alumni.caltech.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> regression=> explain select oid, relname from pg_class
> regression-> where oid = 19947;
> NOTICE: QUERY PLAN:
>
> Index Scan using pg_class_oid_index on pg_class
> (cost=2.03 size=2 width=36)
>
> EXPLAIN
>
> So these indices appear damaged.

Sorry, left out one of the test cases which lead to the conclusion that
multiple indices are damaged on my machine:

regression=> select oid, relname from pg_class
regression-> where oid = 19947;
oid|relname
---+-------
(0 rows)

- Tom


From: Bruce Momjian <maillist(at)candle(dot)pha(dot)pa(dot)us>
To: lockhart(at)alumni(dot)caltech(dot)edu (Thomas G(dot) Lockhart)
Cc: daveh(at)insightdist(dot)com, hackers(at)postgreSQL(dot)org
Subject: Re: [HACKERS] index fix report
Date: 1998-09-04 16:13:44
Message-ID: 199809041613.MAA02080@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> > It would help to know if it is the cache, or an index problem. It is
> > sometimes hard to determine because the cache often uses the indexes
> > to load the cache.
> > Can someone step through a bad entry, and tell me where it is failing?
> > If it is in the executor, it probably is an index. EXPLAIN does show
> > what indexes are involved. Are several indexes failing, or just one?
>
> I'm not sure how to "step through a bad entry" for this case. Just
> haven't done it before, and have never used gdb on the backend. That may
> explain why I've got so many debugging print statements :)

Oh, you are missing so much fun. Just compile with -g, and run gdb as
the postgres user, and do 'run -D /usr/local/pgsql/data test' and you
get a nice prompt. You are not using the postmaster, and are not using
the shared buffer cache, but this is usually not a problem. Give you
the ability to do all sorts of things. 's' steps into functions, 'n'
steps over functions, 'break' stops at certain functions or line
numbers.

>
> I believe that in at least some cases the index itself is damaged. If it
> were just problems _updating_ the cache, then stopping and restarting
> all frontends and backends might fix the problem, at least for the first
> query. That doesn't eliminate the possibility that it is a problem with
> the cache as it is first built though.
>

OK, let me suggest something. Create a table, make it int4, stuff some
values in there, create an index, do a vacuum, and make sure the index
is being used using EXPLAIN. Then, see if you can retrieve the values
using the index.

This should show if the problem exists external to pg_class. I believe
you will find that it works fine, so it must be the system indexes that
are at fault. Are other system indexes affected, or only pg_class
indexes?

--
Bruce Momjian | 830 Blythe Avenue
maillist(at)candle(dot)pha(dot)pa(dot)us | Drexel Hill, Pennsylvania 19026
+ If your life is a hard drive, | (610) 353-9879(w)
+ Christ can be your backup. | (610) 853-3000(h)