This is part of The Pile, a partial archive of some open source mailing lists and newsgroups.
Subject: Performance tuning
From: John Summerfield <summer@OS2.ami.com.au>
Date: Thu, 08 Apr 1999 18:55:44 +0800

A little while ago, someone asked here about tuning tools for Linux, and
some said words to the effect, "Don't be silly. Linux is so clever it
tunes itself."

Since then I've had a little think about the question. Here's one answer
about why Linux needs tuning tools.

Before I go further: I've been involved in the tuning of IBM mainframes
(with operating systems much more complicated than Linux), but my practice
on Linux is quite limited.

Without decent tuning tools, tuning computers is akin to witchcraft.

Here is a snapshot of a display from procinfo. It's showing rates, not
cumulative figures since my last reboot, and is updating every ten seconds.
I'm updating a PostgreSQL database with a Java program, both on the same
system. The CPU is a P133, and the system has three EIDE drives plus a CDD.

Bootup: Mon Apr  5 09:31:07 1999    Load average: 1.04 1.08 0.99 2/75 26321

user  :       0:00:00.31   3.1%  page in :       13  disk 1:        0r       0w
nice  :       0:00:00.47   4.7%  page out:     2429  disk 2:        3r     697w
system:       0:00:06.92  69.1%  swap in :        0  disk 3:        1r       1w
idle  :       0:00:02.32  23.2%  swap out:        0  disk 4:        0r       0w
uptime:    3d 7:59:02.30         context :     1793

irq  0:      1002 timer                 irq  6:         0
irq  1:         0 keyboard              irq  9:        53 SMC EPIC/100
irq  2:         0 cascade [4]           irq 10:         0 eth0
irq  3:         0 serial                irq 13:         0 fpu
irq  4:         0                       irq 14:      4880 ide0
irq  5:         0 parport0              irq 15:         4 ide1

I've clipped the memory figures as they're useless.

The Bootup line shows load average (number of processes ready to run)
for three intervals then the number of processes ready, the number of
processes in the system and the PID of the last-run process.

First thing to say: If I sustained those figures for extended periods,
I'd be looking to upgrade something, wouldn't I? Since the CPU is
flat out, perhaps an upgrade to a PI-450 or PIII?

Look a little further: the I/O rate on disk 2 is a little high: about as
fast as it can go I imagine. It's paging heavily.
Note: the number of pages is NOT really pages - it's blocks. I think the
number of writes is better: presumably it's using multi-sector writes and
writing 4K pages together.

I recall years ago the Department of Social Security (Australia) got some
nice new Amdahl computers with Storage Technology solid-state disks. One
was running at 100% CPU and 600 or so pages/sec paging. We were impressed.

Here, there's just me, and my computer has more RAM (96 MB) than that
Amdahl computer had.

Here's what the free command says about my RAM:

[summer@emu summer]$ free
             total       used       free     shared    buffers     cached
Mem:         95688      93364       2324      26276      12008      55532
-/+ buffers/cache:      25824      69864
Swap:       112888      13280      99608
[summer@emu summer]$

The main points are that Linux does indeed recognize the 96 MB (I tell it
actually), and that I have a decent amount of swap space. Indeed, it's not
using much.

So now we see that:
1 The system's paging lots
2 The CPU is fully engaged.

I have another observation:
system:       0:00:06.92  69.1%

This is exceptional, and some would say it shows the need to optimize the
kernel. However, I suspect that PostgreSQL is stupid enough that it will
still flog the computer. The raw data for my database doesn't amount to as
much as 12 MB: it should be able to store the entire database in RAM
without using any swap space.

I suggest that this figure is so high because of the excessive paging
activity. Get rid of the paging and the CPU will be able to do something
useful.
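
The arithmetic behind the free and procinfo figures can be checked
directly. Here is a small Python sketch that uses only the numbers quoted
above. The variable names are made up for illustration, and reading the
"-/+ buffers/cache" row as "used minus buffers and cache" is the usual
interpretation of free's output rather than anything stated in this post.

```python
# Figures copied from the `free` output above (all in kB).
mem_used, mem_free = 93364, 2324
buffers, cached = 12008, 55532

# The "-/+ buffers/cache" row: memory held by programs alone, and memory
# that would be free if buffers and page cache were handed back.
used_by_programs = mem_used - buffers - cached
free_if_cache_dropped = mem_free + buffers + cached

# CPU times from one procinfo interval (roughly ten seconds), in seconds.
cpu = {"user": 0.31, "nice": 0.47, "system": 6.92, "idle": 2.32}
interval = sum(cpu.values())

# Each state's share of the interval, as a percentage to one decimal place.
share = {state: round(100 * secs / interval, 1) for state, secs in cpu.items()}

print(used_by_programs, free_if_cache_dropped)  # 25824 69864
print(share["system"])                          # 69.1
```

Both results agree with what the tools printed: programs hold only about
25 MB of the 96 MB, while the kernel accounts for roughly 69% of the
interval.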

I will exonerate Java in this case because I have some other information:
when I run the Java app on another computer on the LAN, that P133 system
still gets flogged. Here's a snapshot with the Java app on another machine:

Bootup: Mon Apr  5 09:31:07 1999    Load average: 1.05 0.57 0.41 4/72 26456

user  :       0:00:01.05  10.4%  page in :        4  disk 1:        0r       0w
nice  :       0:00:00.00   0.0%  page out:     2236  disk 2:        0r     579w
system:       0:00:07.34  72.9%  swap in :        0  disk 3:        3r       1w
idle  :       0:00:01.68  16.7%  swap out:        0  disk 4:        0r       0w
uptime:    3d 8:57:20.38         context :     1756

irq  0:      1007 timer                 irq  6:         0
irq  1:         0 keyboard              irq  9:        44 SMC EPIC/100
irq  2:         0 cascade [4]           irq 10:         0 eth0
irq  3:         0 serial                irq 13:         0 fpu
irq  4:         0                       irq 14:      4470 ide0
irq  5:         0 parport0              irq 15:        10 ide1

Traffic's coming in on the SMC NIC: as you can see, it's not raising a
sweat. And that traffic includes getting the data from the server by NFS:
disk 2 above carries the database, disk 3 the source data.

Hands up those who think I should rush out and buy more RAM? It happens
this motherboard supports up to 256 MB, so I could certainly add more.

My suspicion is that PostgreSQL is finding how much virtual memory is
available and using too much. I'd cut out the use of swap altogether but
for something else I run that runs out of memory if I don't have swap
space.

I've not fixed the problem: for one thing, it won't last long enough in
use to cause a serious problem. However, before buying more RAM, I'd
explore options to limit the virtual memory available to PostgreSQL. I
reckon that if I can coax PostgreSQL into running in 48 MB (which should
be heaps) the entire system, including this program, will run better.

If PostgreSQL won't perform well on this system, best to ditch it and get
something that will. A commercial enterprise would do well to look at
mainline commercial RDBMS software such as ADABAS, ORACLE, DB2.
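
One way to limit the virtual memory available to a single program is a
per-process address-space limit. The Python sketch below is only an
illustration of that idea, not something tried in this post. It assumes a
Unix system where RLIMIT_AS is enforced, and the helper names mb and
run_capped are invented for the example. Whether PostgreSQL copes
gracefully once allocations start failing is a separate question.

```python
import resource
import subprocess


def mb(n):
    """Convert megabytes to bytes."""
    return n * 1024 * 1024


def run_capped(cmd, limit_mb):
    """Run cmd with its address space capped at limit_mb megabytes.

    Hypothetical helper: the cap is set in the child just before it
    executes cmd, so the calling process is not affected.
    """
    limit = mb(limit_mb)

    def set_cap():
        # RLIMIT_AS bounds the total virtual address space of the process;
        # shared memory segments it attaches count towards the limit too.
        resource.setrlimit(resource.RLIMIT_AS, (limit, limit))

    return subprocess.run(cmd, preexec_fn=set_cap)
```

A call such as run_capped(["some-server", "--its-options"], 48) would
start the placeholder command some-server with a 48 MB cap; the command
line is an assumption and would need replacing with the real one.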
===
Subject: Re: Performance tuning
From: Matthew Kirkwood <weejock@ferret.lmh.ox.ac.uk>
Date: Thu, 8 Apr 1999 23:55:00 +0100 (GMT)

On Thu, 8 Apr 1999, John Summerfield wrote:

> A little while ago, someone asked here about tuning tools for Linux, and
> some said words to the effect, "Don't be silly. Linux is so clever it
> tunes itself."
>
> Since then I've had a little think about the question. Here's one answer
> about why Linux needs tuning tools.

> Here is a snapshot of a display from procinfo. It's showing rates, not
> cumulative figures since my last reboot, and is updating every ten seconds.

I wouldn't trust the information displayed by procinfo and top for this
sort of thing (though it provides a reasonable average over longer periods
of time).

> Bootup: Mon Apr  5 09:31:07 1999    Load average: 1.04 1.08 0.99 2/75 26321

> The Bootup line shows load average (number of processes ready to run)
> for three intervals then the number of processes ready, the number of
> processes in the system and the PID of the last-run process.
>
> First thing to say: If I sustained those figures for extended periods,
> I'd be looking to upgrade something, wouldn't I? Since the CPU is
> flat out, perhaps an upgrade to a PI-450 or PIII?

Not necessarily. You might often find processes which block after using
only a fraction of their quantum, but which are ready to run again shortly
afterwards. We run a Quake server here which displays exactly these
characteristics.

(This is one reason why Linux makes a good Quake server platform: because
the process looks a little like an interactive one, it often responds
quicker than it might if it was constantly bottlenecked on CPU.)

> Look a little further: the I/O rate on disk 2 is a little high: about as
> fast as it can go I imagine. It's paging heavily. Note: the number of
> pages is NOT really pages - it's blocks. I think the number of writes is
> better: presumably it's using multi-sector writes and writing 4K pages
> together.

So it's time to move some filesystems, or get an additional swap disk.

> So now we see that:
> 1 The system's paging lots
> 2 The CPU is fully engaged.
>
> I have another observation:
> system:       0:00:06.92  69.1%
>
> This is exceptional, and some would say it shows the need to optimize the
> kernel. However, I suspect that PostgreSQL is stupid enough that it will
> still flog the computer. The raw data for my database doesn't amount to as
> much as 12 MB: it should be able to store the entire database in RAM
> without using any swap space.

Two reasons here: firstly, as you say, PostgreSQL can really batter disk
and CPU (try running it with -F if you don't expect the power to go out)
and secondly Linux attributes a lot more time to "system" than other OSes,
even though it's not using more. Who is lying in this case is not clear,
but it's probably both :)

> I suggest that this figure is so high because of the excessive paging
> activity. Get rid of the paging and the CPU will be able to do something
> useful.

Indeed.

I fail to see where automatic tuning software would help any of this. If
you don't have enough memory, then you will swap. If you put swap and
some busy filesystems on the same disk, performance will suck.

Some decent performance monitoring software would help (and is, indeed, in
the works, I'm told).

On the other hand, if you find a case where you really think Linux is
doing a bad job then you may want to tell the guys at linux-mm@kvack.org
about it and see if they have any ideas.

===
Subject: Re: Performance tuning
From: jfm2@club-internet.fr
Date: 9 Apr 1999 05:54:41 -0000

>
> If PostgreSQL won't perform well on this system, best to ditch it and get
> something that will. A commercial enterprise would do well to look at
> mainline commercial RDBMS software such as ADABAS, ORACLE, DB2.

And Sybase, Inprise and Ingres II, all of which have Linux versions.

If you are a commercial enterprise you would do well to use an RDBMS that
can restore your data to the instant just before you had a disk head
crash. If you can only restore your data to the state it was in the day
before, you have people who will never receive what they ordered, and they
will not be happy about it. Especially if they are companies that have to
cancel orders _they_ accepted from their clients because you failed to
deliver goods they needed to manufacture their product.

Does Postgres support before and after journals, and saving without
stopping the database? Yes? Right. No? Then it is usable at a university
but not for handling $$$.

===