This is part of The Pile, a partial archive of some open source mailing lists and newsgroups.
Subject: Performance tuning
From: John Summerfield <summer@OS2.ami.com.au>
Date: Thu, 08 Apr 1999 18:55:44 +0800
A little while ago, someone asked here about tuning tools for Linux, and
some said words to the effect, "Don't be silly. Linux is so clever it
tunes itself."
Since then I've had a little think about the question. Here's one answer
about why Linux needs tuning tools.
Before I go further: I've been involved in the tuning of IBM mainframes
(with operating systems much more complicated than Linux), but my practice
on Linux is quite limited. Without decent tuning tools, tuning computers
is akin to witchcraft.
Here is a snapshot of a display from procinfo. It's showing rates: not
cumulative figures since my last reboot, and is updating every ten seconds.
I'm updating a PostgreSQL database with a Java program, both on the same
system. The CPU is a P133, and the system has three EIDE drives plus a
CD-ROM drive.
Bootup: Mon Apr  5 09:31:07 1999    Load average: 1.04 1.08 0.99 2/75 26321
user  :       0:00:00.31   3.1%  page in :       13  disk 1:        0r       0w
nice  :       0:00:00.47   4.7%  page out:     2429  disk 2:        3r     697w
system:       0:00:06.92  69.1%  swap in :        0  disk 3:        1r       1w
idle  :       0:00:02.32  23.2%  swap out:        0  disk 4:        0r       0w
uptime:    3d 7:59:02.30         context :     1793
irq  0:      1002 timer                 irq  6:         0
irq  1:         0 keyboard              irq  9:        53 SMC EPIC/100
irq  2:         0 cascade [4]           irq 10:         0 eth0
irq  3:         0 serial                irq 13:         0 fpu
irq  4:         0                       irq 14:      4880 ide0
irq  5:         0 parport0              irq 15:         4 ide1
I've clipped the memory figures as they're useless.
The Bootup line shows load average (number of processes ready to run) for
three intervals then the number of processes ready, the number of
processes in the system and the PID of the last-run process.
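The fields after "Load average:" appear to have the same layout as
/proc/loadavg on Linux, where the three averages cover the last 1, 5 and
15 minutes. A minimal Python sketch of pulling them apart, using the
figures from the snapshot above (the names LoadAvg and parse_loadavg are
made up for illustration):

```python
from typing import NamedTuple


class LoadAvg(NamedTuple):
    one_min: float      # load average over the last 1 minute
    five_min: float     # load average over the last 5 minutes
    fifteen_min: float  # load average over the last 15 minutes
    runnable: int       # processes ready to run right now
    total: int          # processes in the system
    last_pid: int       # PID field at the end of the line


def parse_loadavg(text: str) -> LoadAvg:
    """Parse a /proc/loadavg-style string such as '1.04 1.08 0.99 2/75 26321'."""
    one, five, fifteen, procs, last_pid = text.split()
    runnable, total = procs.split("/")
    return LoadAvg(float(one), float(five), float(fifteen),
                   int(runnable), int(total), int(last_pid))


print(parse_loadavg("1.04 1.08 0.99 2/75 26321"))
```

On a live Linux system the same function should accept the contents of
/proc/loadavg directly.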
First thing to say: If I sustained those figures for extended periods, I'd
be looking to upgrade something, wouldn't I? Since the CPU is flat out,
perhaps an upgrade to a PII-450 or PIII?
Look a little further: the I/O rate on disk 2 is a little high: about as
fast as it can go I imagine. It's paging heavily. Note: the number of
pages is NOT really pages - it's blocks. I think the number of writes is
better: presumably it's using multi-sector writes and writing 4K pages
together.
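A rough check of that guess, as a Python sketch. It rests on three
assumptions that the snapshot does not confirm: that the figures are
totals for one ten-second refresh interval, that "page out" counts 1 KiB
blocks, and that all of the page-out traffic goes to disk 2.

```python
INTERVAL_S = 10.0        # procinfo refresh interval stated above
BLOCK_BYTES = 1024       # ASSUMPTION: "page out" counts 1 KiB blocks

page_out_blocks = 2429   # "page out" figure from the snapshot
disk2_writes = 697       # disk 2 write requests from the snapshot

# ASSUMPTION: every block paged out was written to disk 2.
blocks_per_write = page_out_blocks / disk2_writes
kib_per_write = blocks_per_write * BLOCK_BYTES / 1024
writes_per_second = disk2_writes / INTERVAL_S

print(f"{blocks_per_write:.2f} blocks per write")
print(f"{kib_per_write:.2f} KiB per write")
print(f"{writes_per_second:.1f} writes per second")
```

Under those assumptions each write carries roughly three and a half
blocks, which is at least in the neighbourhood of one 4K page per write.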
I recall years ago the Department of Social Security (Australia) got some
nice new Amdahl computers with Storage Technology solid-state disks. One
was running at 100% CPU and 600 or so pages/sec paging. We were impressed.
Here, there's just me, and my computer has more RAM (96 MB) than that
Amdahl computer had.
Here's what the free command says about my RAM:
[summer@emu summer]$ free
             total       used       free     shared    buffers     cached
Mem:         95688      93364       2324      26276      12008      55532
-/+ buffers/cache:      25824      69864
Swap:       112888      13280      99608
[summer@emu summer]$
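The "-/+ buffers/cache" line can be reproduced from the "Mem:" line. A
small Python sketch using the figures from the free output above (the
variable names are only for illustration):

```python
# Figures in kilobytes, copied from the free output above.
total, used, free = 95688, 93364, 2324
buffers, cached = 12008, 55532

# Memory in use once the kernel's buffers and page cache are excluded.
used_minus_cache = used - buffers - cached

# Memory that is free or could be reclaimed from buffers and cache.
free_plus_cache = free + buffers + cached

print(used_minus_cache)   # 25824, the "used" column of -/+ buffers/cache
print(free_plus_cache)    # 69864, the "free" column of -/+ buffers/cache
```

So of the 93364 KB reported as used, only 25824 KB is neither buffers
nor cache.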
The main points are that Linux does indeed recognize the 96 MB (I tell it
explicitly, actually), and that I have a decent amount of swap space.
Indeed, it's not using much.
So now we see that:
1 The system's paging lots
2 The CPU is fully engaged.
I have another observation:
system: 0:00:06.92 69.1%
This is exceptional, and some would say it shows the need to optimize the
kernel. However, I suspect that PostgreSQL is stupid enough that it will
still flog the computer. The raw data for my database doesn't amount to as
much as 12 MB: it should be able to store the entire database in RAM
without using any swap space.
I suggest that this figure is so high because of the excessive paging
activity. Get rid of the paging and the CPU will be able to do something
useful.
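To put the 69.1% in context, the percentages can be recomputed from the
time column of the first snapshot. A minimal Python sketch, assuming the
four times are the CPU seconds accumulated during one refresh interval:

```python
# CPU seconds per category, taken from the first procinfo snapshot.
cpu_seconds = {"user": 0.31, "nice": 0.47, "system": 6.92, "idle": 2.32}

total = sum(cpu_seconds.values())   # close to the ten-second interval

# Share of the whole interval spent in each category, in percent.
shares = {name: round(100 * secs / total, 1)
          for name, secs in cpu_seconds.items()}

# Share of the non-idle time that was spent in the kernel.
busy = total - cpu_seconds["idle"]
system_share_of_busy = round(100 * cpu_seconds["system"] / busy, 1)

print(shares)
print(system_share_of_busy)
```

The recomputed shares match the percentages procinfo printed, and on
these figures the kernel accounts for roughly nine tenths of the time
the CPU was not idle.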
I will exonerate Java in this case because I have some other information:
I run the Java app on another computer on the LAN and still that P133
system gets flogged: here's a snapshot with the Java app on another
machine:
Bootup: Mon Apr  5 09:31:07 1999    Load average: 1.05 0.57 0.41 4/72 26456
user  :       0:00:01.05  10.4%  page in :        4  disk 1:        0r       0w
nice  :       0:00:00.00   0.0%  page out:     2236  disk 2:        0r     579w
system:       0:00:07.34  72.9%  swap in :        0  disk 3:        3r       1w
idle  :       0:00:01.68  16.7%  swap out:        0  disk 4:        0r       0w
uptime:    3d 8:57:20.38         context :     1756
irq  0:      1007 timer                 irq  6:         0
irq  1:         0 keyboard              irq  9:        44 SMC EPIC/100
irq  2:         0 cascade [4]           irq 10:         0 eth0
irq  3:         0 serial                irq 13:         0 fpu
irq  4:         0                       irq 14:      4470 ide0
irq  5:         0 parport0              irq 15:        10 ide1
Traffic's coming in on the SMC NIC: as you can see, it's not raising a
sweat. And that traffic includes getting the data from the server by NFS:
disk 2 above carries the database, disk 3 the source data.
Hands up those who think I should rush out and buy more RAM?
It happens this motherboard supports up to 256 MB, so I could certainly
add more.
My suspicion is that PostgreSQL is finding how much virtual memory is
available and using too much. I'd turn swap off altogether, but
something else I run runs out of memory if I don't have swap space.
I've not fixed the problem: for one thing, it won't last long enough in
use to cause a serious problem. However, before buying more RAM, I'd
explore options to limit the virtual memory available to PostgreSQL.
I reckon that if I can coax PostgreSQL into running in 48 MB (which
should be heaps) the entire system, including this program, will run
better.
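One generic way to cap the virtual memory of a process on a current
Linux system is the RLIMIT_AS resource limit. The Python sketch below
shows only the mechanism. It is an assumption that this suits
PostgreSQL at all: a server that hits the limit may fail allocations
rather than run leaner, and nothing above confirms how the 1999 kernel
treated this limit. The helper name run_with_vm_limit is hypothetical.

```python
import resource
import subprocess
import sys

MIB = 1024 * 1024


def run_with_vm_limit(argv, limit_bytes):
    """Run argv with its address-space soft limit (RLIMIT_AS) set to limit_bytes."""
    def apply_limit():
        # Runs in the child between fork() and exec(); the hard limit is kept.
        _, hard = resource.getrlimit(resource.RLIMIT_AS)
        resource.setrlimit(resource.RLIMIT_AS, (limit_bytes, hard))

    return subprocess.run(argv, preexec_fn=apply_limit,
                          capture_output=True, text=True)


if __name__ == "__main__":
    # Demonstration only: the child just reports the limit it inherited.
    child = [sys.executable, "-c",
             "import resource; print(resource.getrlimit(resource.RLIMIT_AS)[0])"]
    result = run_with_vm_limit(child, 1024 * MIB)
    print(result.stdout.strip())
```

The demonstration uses a generous 1024 MiB so that the child interpreter
can start. A 48 MB cap would be written as 48 * MIB and applied to
whatever command starts the database server.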
If PostgreSQL won't perform well on this system, best to ditch it and get
something that will. A commercial enterprise would do well to look at
mainline commercial RDBMS software such as ADABAS, ORACLE, DB2.
===
Subject: Re: Performance tuning
From: Matthew Kirkwood <weejock@ferret.lmh.ox.ac.uk>
Date: Thu, 8 Apr 1999 23:55:00 +0100 (GMT)
On Thu, 8 Apr 1999, John Summerfield wrote:
> A little while ago, someone asked here about tuning tools for Linux, and
> some said words to the effect, "Don't be silly. Linux is so clever it
> tunes itself."
>
> Since then I've had a little think about the question. Here's one answer
> about why Linux needs tuning tools.
> Here is a snapshot of a display from procinfo. It's showing rates: not
> cumulative figures since my last reboot, and is updating every ten seconds.
I wouldn't trust the information displayed by procinfo and top for this
sort of thing (though it provides a reasonable average over longer periods
of time).
> Bootup: Mon Apr 5 09:31:07 1999 Load average: 1.04 1.08 0.99 2/75 26321
> The Bootup line shows load average (number of processes ready to run)
> for three intervals then the number of processes ready, the number of
> processes in the system and the PID of the last-run process.
>
> First thing to say: If I sustained those figures for extended periods,
> I'd be looking to upgrade something, wouldn't I? Since the CPU is
> flat out, perhaps an upgrade to a PII-450 or PIII?
Not necessarily. You might often find processes which block after using
only a fraction of their quantum, but which are ready to run again
shortly afterwards.
We run a Quake server here which displays exactly these characteristics.
(This is one reason why Linux makes a good Quake server platform:
because the process looks a little like an interactive one, it often
responds quicker than it might if it was constantly bottlenecked on CPU.)
> Look a little further: the I/O rate on disk 2 is a little high: about as
> fast as it can go I imagine. It's paging heavily. Note: the number of
> pages is NOT really pages - it's blocks. I think the number of writes is
> better: presumably it's using multi-sector writes and writing 4K pages
> together.
So it's time to move some filesystems, or get an additional swap disk.
> So now we see that:
> 1 The system's paging lots
> 2 The CPU is fully engaged.
>
> I have another observation:
> system: 0:00:06.92 69.1%
>
> This is exceptional, and some would say it shows the need to optimize the
> kernel. However, I suspect that PostgreSQL is stupid enough that it will
> still flog the computer. The raw data for my database doesn't amount to as
> much as 12 MB: it should be able to store the entire database in RAM
> without using any swap space.
Two reasons here: firstly, as you say, PostgreSQL can really batter disk
and CPU (try running it with -F if you don't expect the power to go out)
and secondly Linux attributes a lot more time to "system" than other OSes,
even though it's not using more. Who is lying in this case is not clear,
but it's probably both :)
> I suggest that this figure is so high because of the excessive paging
> activity. Get rid of the paging and the CPU will be able to do something
> useful.
Indeed.
I fail to see where automatic tuning software would help any of this. If
you don't have enough memory, then you will swap. If you put swap and
some busy filesystems on the same disk, performance will suck.
Some decent performance monitoring software would help (and is, indeed, in
the works, I'm told).
On the other hand, if you find a case where you really think Linux is
doing a bad job then you may want to tell the guys at linux-mm@kvack.org
about it and see if they have any ideas.
===
Subject: Re: Performance tuning
From: jfm2@club-internet.fr
Date: 9 Apr 1999 05:54:41 -0000
>
> If PostgreSQL won't perform well on this system, best to ditch it and get
> something that will. A commercial enterprise would do well to look at
> mainline commercial RDBMS software such as ADABAS, ORACLE, DB2.
And Sybase, Inprise and Ingres II, all of which have Linux
versions. If you are a commercial enterprise you would do well to
use an RDBMS that can restore your data to the instant just before
you had a disk head crash. If you can only restore your data to the
point it was at the day before, that means you have people who will
never receive what they ordered, and they will not be happy about it.
Especially if they are companies and have to cancel orders _they_
accepted from their clients because you failed to deliver goods they
needed to manufacture their product.
Does Postgres support before and after journals, and saving without
stopping the database? If yes, all right. If no, then it is usable at
a university but not for handling $$$.
===