modperl-swish_search_engine_again_huh

This is part of The Pile, a partial archive of some open source mailing lists and newsgroups.



Date: Thu, 05 Oct 2000 12:37:48 -0700
To: modperl@apache.org
From: Bill Moseley <moseley@hank.org>
Subject: Re: Forking in mod_perl? (benchmarking)

I'm working with the SWISH search engine (www.apache.org and the guide use
it).

Until this month, SWISH could only be called via a fork/exec.  Now there's
an early C library for SWISH that I've built into a Perl module for use
with mod_perl.

Yea! No forking!

I decided to do some quick benchmarking with ab.  I'm rotten at
benchmarking anything, so any suggestions are welcome.

My main question was this:  With the library version you first call a
routine to open the index files.  This reads in header info and gets ready
for the search.  Then you run the query, and then you call a routine to
close the index.

OR, you can open the index file, and do multiple queries without opening
and closing the index each time.  Somewhat like caching a DBI connection, I
suppose.

So I wanted to see how much faster it is to keep the index file open.

I decided to start Apache with only one child, so it would handle ALL the
requests.  I'm running ab on the same machine, and only doing 100 requests.

Running my mod_perl program without running a query, I can get almost 100
requests per second.  That's just writing from memory and logging to an
open file.

Now comparing the two methods of calling SWISH, I got about 7.7 requests
per second leaving the index file open between requests, and 6.5 per
second opening it each time.  My guess is Linux is helping buffer the file
contents quite a bit since this machine isn't doing anything else at the
time, so there might be a wider gap if the machine were busy.

Now, here's why this post is under this subject thread:

For fun I changed over to forking from Apache and exec'ing SWISH on each
request, and I got just over 6 requests per second.  I would have expected
much worse, but again, I think Linux is helping out quite a bit in the fork.

And for more fun, the "same" program under mod_cgi: 0.90 requests per second.
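
The rates above are easier to compare as time per request.  A small
Python sketch of that arithmetic follows.  It assumes requests were served
one at a time (a single Apache child, and ab presumably left at its
default concurrency of 1, which the figures above do not state), and it
rounds "almost 100" to 100 and "just over 6" to 6.

```python
# Requests per second as reported; the two rounded values are assumptions.
rates = {
    "no query (baseline)": 100.0,      # "almost 100", treated as 100
    "library, index kept open": 7.7,
    "library, index reopened": 6.5,
    "fork/exec": 6.0,                  # "just over 6", treated as 6
    "mod_cgi": 0.90,
}


def ms_per_request(rate):
    """Convert requests/second to milliseconds per request (serial case)."""
    return 1000.0 / rate


times = {name: ms_per_request(rate) for name, rate in rates.items()}

# Extra time spent per request when the index is opened and closed each time.
open_cost = times["library, index reopened"] - times["library, index kept open"]

for name, ms in times.items():
    print(f"{name:28s} {ms:8.1f} ms/request")
print(f"open/close overhead: about {open_cost:.0f} ms/request")
```

On these numbers, opening and closing the index costs roughly 24 ms per
request, against about 130 ms for the request as a whole when the index
stays open.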

