pgsql-hackers_odd_bug

This is part of The Pile, a partial archive of some open source mailing lists and newsgroups.



To: "Postgres Hackers" <pgsql-hackers@postgresql.org>,
From: "Brian Hirt" <bhirt@mobygames.com>
Subject: [HACKERS] ACK table corrupted, unique index
violated.
Date: Wed, 12 Dec 2001 08:10:07 -0700

I'm having a very scary problem.

First, here's my system:  smp dual PIII800 512MB memory running redhat 6.2
kernel 2.2.18
PostgreSQL 7.1.3 on i686-pc-linux-gnu, compiled by GCC egcs-2.91.66

In a nutshell, my primary key index got a NOTICE to recreate when the
database was vacuumed.  I dropped the index and tried to recreate it.  I get
a key violation when i try to do this.    I find there are some 200 rows
with the exact same developer_id and oid.

This is a very serious problem.  1) the uinque index should have prevented
this from happening.  2) i looked at my code and there is absolutely no way
my code inserted multiple rows with the same id.

This leads me to believe that there is a big problem with postgres; possibly
in vacuum.  This has also underminded my confidence in postgres data
integrity that such a basic concept can be violated.

I want to help you guys find this problem because I have a lot invested in
postgres and overall have been very happy with it.  I've included
information that i think might be useful.  If there is more that i can
supply, let me know and I will provide it if I can.

My system runs a vacuum every day at 4AMCST.  I've checked all of my
application logs for the day before the NOTICE appeared and the day that it
appeared.  I see no SQL errors logged (and all SQL errors are logged by my
applications) for either day.  I've checked the postmaster logs for both
days and don't see any ERROR's logged.  There are some NOTICES the day
before that i don't know what they mean, but don't look good.

NOTICE:  Cannot rename init file
/moby/pgsql/base/156130/pg_internal.init.19833 to
/moby/pgsql/base/156130/pg_internal.init: No such file or directory
NOTICE:  Cannot rename init file
/moby/pgsql/base/156130/pg_internal.init.19839 to
/moby/pgsql/base/156130/pg_internal.init: No such file or directory
NOTICE:  Cannot rename init file
/moby/pgsql/base/156130/pg_internal.init.19835 to
/moby/pgsql/base/156130/pg_internal.init: No such file or directory
NOTICE:  Cannot rename init file
/moby/pgsql/base/156130/pg_internal.init.19834 to
/moby/pgsql/base/156130/pg_internal.init: No such file or directory
NOTICE:  Cannot rename init file
/moby/pgsql/base/156130/pg_internal.init.19837 to
/moby/pgsql/base/156130/pg_internal.init: No such file or direc

however, on further inspection, the error above appears fairly frequently.
In fact, the first occurance was over 6 weeks before the corruption.

Here is the DEBUG notices for vacuum the day before the corruption:

DEBUG:  --Relation developer--
DEBUG:  Pages 514: Changed 29, reaped 39, Empty 0, New 0; Tup 47971: Vac 52,
Keep/VTL 0/0, Crash 0, UnUsed 89, MinLen 65, MaxLen 133; Re-using:
Free/Avail. Space 6812/1768; EndEmpty/Avail. Pages 0/10. CPU 0.04s/0.00u
sec.
DEBUG:  Index developer_primary_key: Pages 120; Tuples 47971: Deleted 52.
CPU 0.00s/0.09u sec.
DEBUG:  Index developer_recent_mod_key: Pages 119; Tuples 47971: Deleted 52.
CPU 0.00s/0.05u sec.
DEBUG:  Index developer_approved: Pages 121; Tuples 47971: Deleted 52. CPU
0.03s/0.06u sec.
DEBUG:  Index developer_search_idx: Pages 204; Tuples 47971: Deleted 52. CPU
0.00s/0.04u sec.
DEBUG:  Rel developer: Pages: 514 --> 514; Tuple(s) moved: 15. CPU
0.01s/0.02u sec.
DEBUG:  Index developer_primary_key: Pages 120; Tuples 47971: Deleted 15.
CPU 0.00s/0.05u sec.
DEBUG:  Index developer_recent_mod_key: Pages 119; Tuples 47971: Deleted 15.
CPU 0.00s/0.04u sec.
DEBUG:  Index developer_approved: Pages 121; Tuples 47971: Deleted 15. CPU
0.00s/0.05u sec.
DEBUG:  Index developer_search_idx: Pages 204; Tuples 47971: Deleted 15. CPU
0.00s/0.05u sec.

Here is the vaccum DEBUG messages the day of the corruption:

DEBUG:  --Relation developer--
DEBUG:  Pages 515: Changed 25, reaped 39, Empty 0, New 0; Tup 48038: Vac 53,
Keep/VTL 0/0, Crash 0, UnUsed 89, MinLen 65, MaxLen 133; Re-using:
Free/Avail. Space 9144/3348; EndEmpty/Avail. Pages 0/9. CPU 0.02s/0.01u sec.
DEBUG:  Index developer_primary_key: Pages 120; Tuples 48023: Deleted 53.
CPU 0.00s/0.04u sec.
NOTICE:  Index developer_primary_key: NUMBER OF INDEX' TUPLES (48023) IS NOT
THE SAME AS HEAP' (48038).
        Recreate the index.
DEBUG:  Index developer_recent_mod_key: Pages 119; Tuples 48023: Deleted 53.
CPU 0.00s/0.04u sec.
NOTICE:  Index developer_recent_mod_key: NUMBER OF INDEX' TUPLES (48023) IS
NOT THE SAME AS HEAP' (48038).
        Recreate the index.
DEBUG:  Index developer_approved: Pages 121; Tuples 48023: Deleted 53. CPU
0.00s/0.04u sec.
NOTICE:  Index developer_approved: NUMBER OF INDEX' TUPLES (48023) IS NOT
THE SAME AS HEAP' (48038).
        Recreate the index.
DEBUG:  Index developer_search_idx: Pages 204; Tuples 48023: Deleted 53. CPU
0.00s/0.04u sec.
NOTICE:  Index developer_search_idx: NUMBER OF INDEX' TUPLES (48023) IS NOT
THE SAME AS HEAP' (48038).
        Recreate the index.
DEBUG:  Rel developer: Pages: 515 --> 515; Tuple(s) moved: 34. CPU
0.00s/0.03u sec.
DEBUG:  Index developer_primary_key: Pages 120; Tuples 48023: Deleted 34.
CPU 0.00s/0.04u sec.
NOTICE:  Index developer_primary_key: NUMBER OF INDEX' TUPLES (48023) IS NOT
THE SAME AS HEAP' (48038).
        Recreate the index.
DEBUG:  Index developer_recent_mod_key: Pages 119; Tuples 48023: Deleted 34.
CPU 0.00s/0.04u sec.
NOTICE:  Index developer_recent_mod_key: NUMBER OF INDEX' TUPLES (48023) IS
NOT THE SAME AS HEAP' (48038).
        Recreate the index.
DEBUG:  Index developer_approved: Pages 121; Tuples 48023: Deleted 34. CPU
0.00s/0.03u sec.
NOTICE:  Index developer_approved: NUMBER OF INDEX' TUPLES (48023) IS NOT
THE SAME AS HEAP' (48038).
        Recreate the index.
DEBUG:  Index developer_search_idx: Pages 204; Tuples 48023: Deleted 34. CPU
0.00s/0.06u sec.
NOTICE:  Index developer_search_idx: NUMBER OF INDEX' TUPLES (48023) IS NOT
THE SAME AS HEAP' (48038).
        Recreate the index.

I also appear to be getting this quite often:

NOTICE:  RegisterSharedInvalid: SI buffer overflow
NOTICE:  InvalidateSharedInvalid: cache state reset


Here is output from me trying to re-create an index.

basement=# drop index developer_primary_key;
DROP
basement=# create unique index developer_primary_key on
developer(developer_id);ERROR:  Cannot create unique index. Table contains
non-unique values
basement=# select developer_id,count(*) from developer group by developer_id
having count(*) > 1;
 developer_id | count
--------------+-------
        11107 |     2
        18493 |     2
        50983 |     2
        50984 |     2
        50985 |     2
        50986 |     2
        50987 |     2
        50988 |     2
        50989 |     2
        50990 |     2
        50991 |     2
        50992 |     2
        50993 |     2
        50994 |     2
        50995 |     2
        50996 |     2
        50997 |     2
        51020 |     2
        51021 |     2
        51022 |     2
        51023 |     2
        51024 |     2
        51025 |     2
        51026 |     2
        51029 |     2
        51030 |     2
        51031 |     2
        51032 |     2
        51033 |     2
        51034 |     2
        51035 |     2
        51036 |     2
        51037 |     2
        51038 |     2
        51039 |     2
        51040 |     2
        51041 |     2
        51042 |     2
        51043 |     3
        51044 |     3
        51045 |     3
        51046 |     3
        51047 |     3
        51048 |     3
        51049 |     3
        51050 |     3
        51051 |     3
        51052 |     3
        51053 |     3
        51054 |     3
        51055 |     3
        51056 |     3
        51057 |     3
        51058 |     3
        51059 |     3
        51060 |     3
        51061 |     3
        51062 |     2
        51063 |     2
        51064 |     2
        51065 |     2
        51066 |     2
        51067 |     2
        51068 |     2
        51069 |     2
        51070 |     2
        51071 |     2
        51072 |     2
        51073 |     2
        51074 |     2
        51075 |     2
        51076 |     2
        51152 |     2
        51614 |     2
        51615 |     2
        51616 |     2
        51617 |     2
        51618 |     2
        51619 |     2
        51620 |     2
        51621 |     2
        51622 |     2
        51623 |     2
        51678 |     2
        51680 |     2
        51681 |     2
        51682 |     2
        51683 |     2
        51768 |     2
        51862 |     2
        51863 |     2
        51864 |     2
        51950 |     2
        52094 |     2
        52095 |     2
        52096 |     2
        52097 |     2
        52098 |     2
        52099 |     2
        52100 |     2
        52101 |     2
        52103 |     2
        52104 |     2
        52105 |     2
        52106 |     2
        52107 |     2
        52108 |     2
        52109 |     2
        52110 |     2
        52111 |     2
        52112 |     2
        52113 |     2
        52114 |     2
        52115 |     2
        52116 |     2
        52117 |     2
        52118 |     2
        52119 |     2
        52120 |     2
        52121 |     2
        52122 |     2
        52123 |     2
        52124 |     2
        52125 |     2
        52126 |     2
        52127 |     2
        52128 |     2
        52129 |     2
        52130 |     2
        52131 |     2
        52132 |     2
        52133 |     2
        52134 |     2
        52135 |     2
        52136 |     2
        52137 |     2
        52167 |     2
        52168 |     2
        52169 |     2
        52170 |     2
        52171 |     2
        52172 |     2
        52173 |     2
        52174 |     2
        52175 |     2
        52180 |     2
        52181 |     2
        52182 |     2
        52222 |     2
        52223 |     2
        52224 |     2
        52225 |     2
        52226 |     2
        52227 |     2
        52228 |     2
        52229 |     2
        52230 |     2
        52232 |     2
        52233 |     2
        52234 |     2
        52235 |     2
        52236 |     2
        52237 |     2
        52238 |     2
        52239 |     2
        52240 |     2
        52241 |     2
        52242 |     2
        52243 |     2
        52244 |     2
        52245 |     2
        52246 |     2
        52247 |     2
        52248 |     2
        52249 |     2
        52250 |     2
        52251 |     2
        52465 |     2
        52466 |     2
        52467 |     2
        52468 |     2
        52469 |     2
        52470 |     2
        52471 |     2
        52472 |     2
        52473 |     2
        52474 |     2
        52475 |     2
        52476 |     2
        52477 |     2
        52478 |     2
        52479 |     2
        52480 |     2
        52481 |     2
        52482 |     2
        52483 |     2
        52484 |     2
        52485 |     2
        52486 |     2
        52487 |     2
(200 rows)

basement=# select oid,developer_id from developer where developer_id =52487;
    oid    | developer_id
-----------+--------------
 401440180 |        52487
 401440180 |        52487

===

To: "Brian Hirt" <bhirt@mobygames.com>
From: Tom Lane <tgl@sss.pgh.pa.us>
Subject: Re: [HACKERS] [GENERAL] ACK table corrupted, unique
index violated. 
Date: Wed, 12 Dec 2001 13:01:28 -0500

"Brian Hirt" <bhirt@mobygames.com> writes:
> In a nutshell, my primary key index got a NOTICE to recreate when the
> database was vacuumed.  I dropped the index and tried to recreate it.  I get
> a key violation when i try to do this.    I find there are some 200 rows
> with the exact same developer_id and oid.

You're the third person to have reported something like this, so there's
something strange going on.  Can you give access to your system to
someone who can poke into it (probably me or Vadim)?

> There are some NOTICES the day
> before that i don't know what they mean, but don't look good.

> NOTICE:  Cannot rename init file
> /moby/pgsql/base/156130/pg_internal.init.19833 to
> /moby/pgsql/base/156130/pg_internal.init: No such file or directory

These seem extremely bizarre.

			regards, tom lane

===

To: "Tom Lane" <tgl@sss.pgh.pa.us>
From: "Brian Hirt" <bhirt@mobygames.com>
Subject: Re: [HACKERS] problems with table corruption
continued 
Date: Tue, 18 Dec 2001 11:07:26 -0700


"Tom Lane" <tgl@sss.pgh.pa.us> wrote: 

> "Brian Hirt" <bhirt@mobygames.com> writes:

> > Do you think this might also explain the following errors i was seeing?
>
> > NOTICE:  Cannot rename init file
> > /moby/pgsql/base/156130/pg_internal.init.19833 to
> > /moby/pgsql/base/156130/pg_internal.init: No such file or directory
>
> No, that error is not coming from anywhere near VACUUM; it's from
> relcache startup (see write_irels in src/backend/utils/cache/relcache.c).
> The rename source file has just been created in the very same routine,
> so it's difficult to see how one would get a "No such file" failure from
> rename().
>
> It is possible that several backends create temp init files at the
> same time and all try to rename their own temp files into place as
> the pg_internal.init file.  However, that should work: the rename
> man page says
>
>      The rename() system call causes the source file to be renamed to
>      target.  If target exists, it is first removed.  Both source and
>      target must be of the same type (that is, either directories or
>      nondirectories), and must reside on the same file system.
>
>      If target can be created or if it existed before the call, rename()
>      guarantees that an instance of target will exist, even if the system
>      crashes in the midst of the operation.
>
> so we should end up with the extra copies deleted and just one init file
> remaining after the slowest backend renames its copy into place.
>
> Do you by chance have your database directory mounted via NFS, or some
> other strange filesystem where the normal semantics of concurrent rename
> might not work quite right?
>
> FWIW, I believe this condition is pretty harmless, but it would be nice
> to understand exactly why you're seeing the message.

Well, when my application starts, about 100 backend database connections
start up at the same time so this fits in with the circumstance you
describe.  However, I'm just using a standard ext2 filesystem on 2.2 linux
kernel.

It's good to know that this error should be harmless.



===

To: Mike Mascari <mascarm@mascari.com>
From: Doug McNaught <doug@wireboard.com>
Subject: Re: [HACKERS] Concerns about this release
Date: 18 Dec 2001 14:47:37 -0500

Mike Mascari <mascarm@mascari.com> writes:

> Tom Lane wrote:
> > 
> > FWIW, I trust lazy VACUUM a lot *more* than I trust the old VACUUM code.
> > Read the tuple-chain-moving logic in vacuum.c sometime, and then tell me
> > how confident you feel in it.  (My gut tells me that that logic is
> > responsible for the recent reports of duplicate tuples in 7.1.*, though
> > I can't yet back this up with any evidence.)
> 
> For all the various bugs which have been in PG over the years, the
> recent crop of "duplicate tuples" is the absolute scariest. Can a
> release really be made without knowing precisely the cause of those
> corruptions? The various theories offered by the posters (SMP
> machine, CREATE INDEX in pl/pgsql functions, etc.) aren't comforting
> either. To me, all other feature enhancements pale in comparison to
> pinning down this bug.

The one instance of the bug that has been definitely pinned down
involved the old VACUUM code in 7.1.3 (plus a don't-do-that-then index
function).


===

To: Mike Mascari <mascarm@mascari.com>
From: Tom Lane <tgl@sss.pgh.pa.us>
Subject: Re: [HACKERS] Concerns about this release 
Date: Tue, 18 Dec 2001 14:52:11 -0500

Mike Mascari <mascarm@mascari.com> writes:
> For all the various bugs which have been in PG over the years, the
> recent crop of "duplicate tuples" is the absolute scariest. Can a
> release really be made without knowing precisely the cause of those
> corruptions?

I'm not happy about it either, but the existence of an unrepaired bug
in 7.1 is hardly grounds for not releasing 7.2.  Finding a bug in 7.2
that doesn't exist in 7.1 would be such grounds, sure.  But we put out
releases when we think they are better than what's out there; holding
them in a (vain) quest for perfection isn't the way to make progress.

FWIW, we seem to understand the mechanism behind Brian Hirt's case,
at least.  Need to poll the other reporters and see if the explanation
fits their cases.
===

To: Don Baccus <dhogaza@pacifier.com>
From: Tom Lane <tgl@sss.pgh.pa.us>
Subject: Re: [HACKERS] Concerns about this release 
Date: Tue, 18 Dec 2001 15:27:00 -0500

Don Baccus <dhogaza@pacifier.com> writes:
> 5. The semantics of VACUUM have changed.  Silently (in the sense that
>     there's no notification or warning spewed out).

???  VACUUM has no semantics: it does not alter your data (or at least
it's not supposed to ;-)).  This change is transparent in the same
way that the WAL and function manager changes were.  If there is any
lack of transparency, it would show up as greater disk space usage
than you might expect --- which seems *exactly* parallel to WAL.
And you don't have the option to turn WAL off.  I don't think you
can consistently maintain that adding WAL is good and changing VACUUM
is bad.

> Or if the new one hoses your tables.   Tom's bright, but he's not 
> certified bug-free.

Certainly, but you are assuming that the old VACUUM is bug-free,
which is, um, overly optimistic.  The new VACUUM code is (mostly) a
subset of the old, and has removed all the most ticklish bits of the old
code.  So if you are looking for the fewest bugs you should prefer the
new to the old anyway.  Case in point: Brian Hirt's bug does not arise
under new-style VACUUM.  I had to say VACUUM FULL to make it happen
in current sources.

===

the rest of The Pile (a partial mailing list archive)

doom@kzsu.stanford.edu