pgsql-possible_problems_with_the_new_vaccuum

This is part of The Pile, a partial archive of some open source mailing lists and newsgroups.



To: Barry Lind <barry@xythos.com>
From: Andrew McMillan <andrew@catalyst.net.nz>
Subject: Re: [HACKERS] problems with new vacuum (??)
Date: 02 Jan 2002 21:29:59 +1300

On Wed, 2002-01-02 at 13:31, Barry Lind wrote:
> Over the last two days I have been struggling with running vacuum on a 
> 7.2b4 database that is running in a production environment.  This was 
> essentially the first time I ran vacuum on this database since it was 
> upgraded to 7.2.  This database is characterized by one large table that 
> has many inserts and deletes, however generally contains zero rows.  So 
> over the course of the last few weeks this table had grown in size to 
> about 2.5G (or more correctly the corresponding toast table grew that 
> large).
> 
> So the first problem I had was that the vacuum (regular vacuum, not full 
> vacuum) took a very long time on this table (2+ hours).  Now I would 
> expect it to take a while, so that in and of itself isn't a problem. 
> But while this vacuum was running the rest of the system was performing 
> very poorly.  Operations that usually are subsecond were taking 
> minutes to complete.  At first I thought there was some sort of locking 
> problem, but the operations did complete, just after a very long time.

Is it possible that you waited until a point where new transactions were
undoing vacuum's work faster than it could get it done?  This might be
complicated by the fact that (from your description) the table is
heavily toasted.

Also, as a suggestion, if you can know there are zero records in the
table very often, why not TRUNCATE it at those times?  That should be a
_lot_ quicker than vacuuming it!
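A minimal sketch of that approach, assuming the churned table is named queue_table and the database is named mydb (both hypothetical stand-ins for whatever Barry's application actually uses): check that the table is empty, then TRUNCATE it instead of waiting on a long vacuum.

```shell
#!/bin/sh
# Sketch only: reclaim the heavily-churned table when it is empty,
# instead of relying on a multi-hour vacuum.  "queue_table" and "mydb"
# are hypothetical names.
DB=mydb

COUNT=`psql -t -A -c "SELECT count(*) FROM queue_table;" $DB`
if [ "$COUNT" = "0" ]; then
    # TRUNCATE throws away the table's storage (including its TOAST
    # table) immediately.  It takes an exclusive lock on the table,
    # so run it during a quiet moment.
    psql -c "TRUNCATE queue_table;" $DB
fi
```

The count check narrows, but does not close, the race window: rows inserted between the check and the TRUNCATE would be lost, so this only makes sense when the application can tolerate that or is quiesced.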

===

To: Tom Lane <tgl@sss.pgh.pa.us>
From: Hannu Krosing <hannu@tm.ee>
Subject: Re: [HACKERS] problems with new vacuum (??)
Date: Wed, 02 Jan 2002 12:55:56 +0200

Tom Lane wrote:
> 
> Barry Lind <barry@xythos.com> writes:
> > But while this vacuum was running the rest of the system was performing
> > very poorly.  Operations that usually are subsecond were taking
> > minutes to complete.
> 
> Is this any different from the behavior of 7.1 vacuum?  Also, what
> platform are you on?
> 
> I've noticed on a Linux 2.4 box (RH 7.2, typical commodity-grade PC
> hardware) that vacuum, pgbench, or almost any I/O intensive operation
> drives interactive performance into the ground.

They drive each other into the ground too ;(

When I tried to run the new vacuum concurrently with pgbench, in the
hope of making it perform better for large numbers of updates (by
removing the need to scan large numbers of dead tuples), one concurrent
vacuum was able to make 128 pgbench backends more than twice as slow as
they were without vacuum.  And this is an extra slowdown on top of
another 2-3X slowdown due to dead tuples (obtained by comparing speed on
a freshly VACUUM FULLed db against the db after doing ~10k pgbench
transactions).

> I have not had an
> opportunity to try to characterize the problem, but I suspect Linux's
> disk I/O scheduler is not bright enough to prioritize interactive
> operations.

Have you any ideas how to distinguish between interactive and
non-interactive disk I/O coming from PostgreSQL backends?

Can I, for example, nice the vacuuming backend without getting the
"reverse priority" effects?

===

To: Barry Lind <barry@xythos.com>, Tom Lane <tgl@sss.pgh.pa.us>
From: "Matthew T. O'Connor" <matthew@zeut.net>
Subject: Re: [HACKERS] problems with new vacuum (??)
Date: Wed, 2 Jan 2002 08:49:58 -0600

On Tuesday 01 January 2002 11:23 pm, Barry Lind wrote:
> Tom,
>
> The platform is Redhat 7.0 with a 2.2.19 kernel.

Is this an IDE-based system?  If so, do you have the drives running in
DMA mode?

What are the results of "/sbin/hdparm /dev/hd(?)" (a, b, c, d ...
whichever drive you are running the database on)?

The 2.2 Linux kernel defaults to DMA off.  You can try to enable DMA by 
issuing "/sbin/hdparm -d1 /dev/hd(?)".  You can also test the disk speed 
with "/sbin/hdparm -tT /dev/hd(?)".
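Putting those commands together, the check-then-enable sequence might look like this (run as root; /dev/hda is a stand-in for whichever drive actually holds the database):

```shell
# Show current drive settings; look for "using_dma = 1 (on)".
/sbin/hdparm /dev/hda

# Benchmark cached and raw read speed before changing anything,
# so there is a baseline to compare against.
/sbin/hdparm -tT /dev/hda

# Enable DMA (2.2 kernels default to DMA off), then re-run the
# benchmark to see the difference.
/sbin/hdparm -d1 /dev/hda
/sbin/hdparm -tT /dev/hda
```

As Don notes later in the thread, some drive/controller combinations misbehave with DMA enabled, so test on an idle system before making the setting permanent in a boot script.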

In my experience enabling this feature can make a huge improvement in I/O 
intensive applications.  Other options can help also, but I find dma to have 
the largest impact.  I find linux almost unusable without it.

===

To: Hannu Krosing <hannu@tm.ee>
From: Tom Lane <tgl@sss.pgh.pa.us>
Subject: Re: [HACKERS] problems with new vacuum (??) 
Date: Wed, 02 Jan 2002 10:56:40 -0500

Hannu Krosing <hannu@tm.ee> writes:
> Have you any ideas how to distinguish between interactive and
> non-interactive disk I/O coming from PostgreSQL backends?

I don't see how.  For one thing, the backend that originally dirtied
a buffer is not necessarily the one that writes it out.  Even assuming
that we could assign a useful priority to different I/O requests,
how do we tell the kernel about it?  There's no portable API for that
AFAIK.

One thing that would likely help a great deal is to have the WAL files
on a separate disk spindle, but since what I've got is a one-disk
system, I can't test that on this PC.
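For anyone with a second disk who wants to try that, a sketch of the usual approach in this era is to relocate the WAL directory and leave a symlink behind; the paths here (PGDATA under /var/lib/pgsql/data, second disk mounted at /mnt/disk2) are assumptions, not anything from the thread:

```shell
# Sketch: move WAL onto a separate spindle.  Paths are hypothetical.
# The postmaster must be stopped first, or WAL will be corrupted.
pg_ctl stop -D /var/lib/pgsql/data

# Move the WAL directory to the second disk and symlink it back into
# place so the server finds it at the expected path.
mv /var/lib/pgsql/data/pg_xlog /mnt/disk2/pg_xlog
ln -s /mnt/disk2/pg_xlog /var/lib/pgsql/data/pg_xlog

pg_ctl start -D /var/lib/pgsql/data
```

The win comes from keeping WAL writes (sequential, fsync-heavy) from competing with data-file I/O for the same disk head.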

===

To: "Matthew T. O'Connor" <matthew@zeut.net>
From: Bruce Momjian <pgman@candle.pha.pa.us>
Subject: Re: [HACKERS] problems with new vacuum (??)
Date: Wed, 2 Jan 2002 13:13:32 -0500 (EST)

> In my experience enabling this feature can make a huge improvement in I/O 
> intensive applications.  Other options can help also, but I find dma to have 
> the largest impact.  I find linux almost unusable without it.

Oh, I should mention my BSD/OS data point is with one SCSI disk, soft
updates and tagged queuing enabled.

===

To: Tom Lane <tgl@sss.pgh.pa.us>
From: Bruce Momjian <pgman@candle.pha.pa.us>
Subject: Re: [HACKERS] problems with new vacuum (??)
Date: Wed, 2 Jan 2002 13:12:03 -0500 (EST)

Tom Lane wrote:
> Barry Lind <barry@xythos.com> writes:
> > But while this vacuum was running the rest of the system was performing 
> > very poorly.  Operations that usually are subsecond were taking 
> > minutes to complete.
> 
> Is this any different from the behavior of 7.1 vacuum?  Also, what
> platform are you on?
> 
> I've noticed on a Linux 2.4 box (RH 7.2, typical commodity-grade PC
> hardware) that vacuum, pgbench, or almost any I/O intensive operation
> drives interactive performance into the ground.  I have not had an
> opportunity to try to characterize the problem, but I suspect Linux's
> disk I/O scheduler is not bright enough to prioritize interactive
> operations.

Just as a data point, I have not seen pgbench dramatically affect
performance on BSD/OS.  Interactive sessions are just slightly slower
when they need to access the disk.

===

To: Bruce Momjian <pgman@candle.pha.pa.us>
From: Don Baccus <dhogaza@pacifier.com>
Subject: Re: [HACKERS] problems with new vacuum (??)
Date: Wed, 02 Jan 2002 10:34:35 -0800

Bruce Momjian wrote:

>>In my experience enabling this feature can make a huge improvement in I/O 
>>intensive applications.  Other options can help also, but I find dma to have 
>>the largest impact.  I find linux almost unusable without it.
>>
> 
> Oh, I should mention my BSD/OS data point is with one SCSI disk, soft
> updates and tagged queuing enabled.


If Tom's system is IDE-based and he hasn't explicitly enabled DMA, then 
this alone would explain the difference you two are seeing, just as the 
poster above is implying.  I have one system with an older 15GB disk 
that causes a kernel panic if I try to enable DMA, and I see the kind of 
system performance issues described by Tom on that system.

On my main server downtown (SCSI) and my normal desktop (two IDE drives 
that do work properly with DMA enabled) things run much, much better 
when there's a lot of disk I/O going on.  These are all Linux systems, 
not BSD...

===

To: Don Baccus <dhogaza@pacifier.com>
From: Tom Lane <tgl@sss.pgh.pa.us>
Subject: Re: [HACKERS] problems with new vacuum (??) 
Date: Wed, 02 Jan 2002 13:40:32 -0500

Don Baccus <dhogaza@pacifier.com> writes:
> If Tom's system is IDE-based and he's not explicitly enabled DMA then 
> this alone would explain the difference you two are seeing,

It is IDE, but DMA is on:

[root@rh1 root]# hdparm -v /dev/hda

/dev/hda:
 multcount    = 16 (on)
 I/O support  =  0 (default 16-bit)
 unmaskirq    =  0 (off)
 using_dma    =  1 (on)
 keepsettings =  0 (off)
 nowerr       =  0 (off)
 readonly     =  0 (off)
 readahead    =  8 (on)
 geometry     = 9729/255/63, sectors = 156301488, start = 0

[root@rh1 root]# hdparm -i /dev/hda

/dev/hda:

 Model=ST380021A, FwRev=3.10, SerialNo=3HV0CZ2L
 Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs RotSpdTol>.5% }
 RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=4
 BuffType=unknown, BuffSize=2048kB, MaxMultSect=16, MultSect=16
 CurCHS=16383/16/63, CurSects=-66060037, LBA=yes, LBAsects=156301488
 IORDY=on/off, tPIO={min:240,w/IORDY:120}, tDMA={min:120,rec:120}
 PIO modes: pio0 pio1 pio2 pio3 pio4
 DMA modes: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 *udma5
 AdvancedPM=no
 Drive Supports : Reserved : ATA-1 ATA-2 ATA-3 ATA-4 ATA-5

This is an out-of-the-box RH 7.2 install (kernel 2.4.7-10) on recent
Dell hardware.  If anyone can suggest further tuning of the hdparm
settings, I'm all ears.  Don't know a darn thing about disk tuning
for Linux.

===


the rest of The Pile (a partial mailing list archive)

doom@kzsu.stanford.edu