This is part of The Pile, a partial archive of some open source mailing lists and newsgroups.
To: modperl@apache.org
From: "Bruce W. Hoylman" <bhoylma@qwest.com>
Subject: Suggestions on an XML-RPC Service using modperl?
Date: Tue, 1 Jan 2002 12:19:00 -0700

Ciao!

I would like some input on an intranet web service I am currently in the process of designing, the core of which will be modperl on UN*X.

The service itself is to access a couple of back end data stores given parameters received in an XML-RPC request, then return the results in an XML-RPC formatted response. The data from the back end sources will be loaded into memory at service initialization, for fast access. The data is small enough and memory plentiful enough to allow this.

That's pretty much it in terms of the high level data flow. It has to be relatively fast, on the order of 5+ requests/sec. as a relative volumetric.

I'm going to use modperl due to the embedded perl interpreter characteristics it provides, allowing initialization overhead to be incurred at startup. I also wish to use an in-memory, read-only hash structure shared across all modperl processes for access to the cached back end data, rather than making expensive calls to these stores for each request. Again, throughput is critical.

I would like your thoughts on the cache management concept of the service. I'm looking at MLDBM::Sync as the mechanism for managing the filesystem representation of the in-memory hash content. What should manage the in-memory structure itself, in terms of accessing its content? Is a Tie structure too expensive? I want to end up with a single structure accessible to all of the modperl processes, loaded at service startup.

This service will ultimately be registered within a UDDI/SOAP framework, FYI. However, this will not be in the first incarnation of the service itself.

Thoughts and comments welcome. Obviously this is an early brainstorm (more like a drizzle), but I hope to get a few stimulating comments from this most excellent resource, the list.
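
A minimal sketch of the startup-loaded, read-only hash described in this message. The package name `My::BackendCache` and the loader callback are hypothetical stand-ins for the real back end stores. The sketch assumes `load` would be called once from the server startup file, before Apache forks, so that each child inherits the populated hash; whether the memory stays physically shared depends on the operating system's copy-on-write behaviour and on the data never being written to afterwards.

```perl
use strict;
use warnings;

package My::BackendCache;    # hypothetical module name

my %DATA;                    # filled once, treated as read-only afterwards
my $LOADED = 0;

# Populate the hash exactly once. $fetch_all is a hypothetical callback
# that returns a hash reference built from the back end stores.
sub load {
    my ($class, $fetch_all) = @_;
    return if $LOADED;
    %DATA   = %{ $fetch_all->() };
    $LOADED = 1;
    return scalar keys %DATA;
}

# Plain hash lookup; no back end call happens per request.
sub lookup {
    my ($class, $key) = @_;
    return $DATA{$key};
}

package main;

# In a real server this call would sit in the startup file (assumption).
My::BackendCache->load(sub { return { alpha => 1, beta => 2 } });
print My::BackendCache->lookup('alpha'), "\n";
```

Request handlers would then call only `lookup`, which keeps the per-request cost to a single hash access.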
===
To: bhoylma@qwest.com
From: Chip Turner <cturner@redhat.com>
Subject: Re: Suggestions on an XML-RPC Service using modperl?
Date: 02 Jan 2002 03:44:12 -0500

"Bruce W. Hoylman" <bhoylma@qwest.com> writes:

> Ciao!
>
> I would like some input on an intranet web service I am currently in the
> process of designing, the core of which will be modperl on UN*X.

Excellent choice. This works quite well. Of course, like others on this list, I might be a bit biased.

> The service itself is to access a couple of back end data stores given
> parameters received in an XML-RPC request, then return the results in an
> XML-RPC formatted response. The data from the back end sources will be
> loaded into memory at service initialization, for fast access. The data
> is small enough and memory plentiful enough to allow this.

How often does the data change? How is it stored on the back end? You may not need to cache anything if, say, you have a decent SQL database on the backend. Caching never hurts, but it isn't always necessary. The Cache::* modules may be of use for this, though, should you still need it. You also might want to consider not sharing the data across processes; weighing the complexity that sharing adds against the memory lost by storing a copy in each process, per-process storage may be a workable tradeoff. I probably would try it first with no cache, then a per-process on-demand cache, then finally a shared cache, in that order.

> That's pretty much it in terms of the high level data flow. It has to
> be relatively fast, on the order of 5+ requests/sec. as a relative
> volumetric.

This should be quite easy. I don't have the necessary setup handy to benchmark it, but I imagine you can easily achieve performance at that level using Frontier::RPC inside a mod_perl handler. We typically use custom code for interfacing the handler, but IIRC the Frontier module comes with a mod_perl handler that, if not entirely suitable, is easily modified to your needs.
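
The per-process on-demand cache suggested earlier in this reply can be sketched in a few lines. This is only an illustration under stated assumptions: `fetch_from_backend` is a hypothetical stand-in for the real (expensive) back end query, and the lexical `%cache` is assumed to live for the lifetime of one server child, so each process fills its own copy the first time a key is requested.

```perl
use strict;
use warnings;

my %cache;                 # one copy per process, filled lazily
my $backend_calls = 0;     # counter used only to show the effect

# Hypothetical stand-in for an expensive back end query.
sub fetch_from_backend {
    my ($key) = @_;
    $backend_calls++;
    return "value-for-$key";
}

# Return the cached value, going to the back end only on a miss.
sub lookup {
    my ($key) = @_;
    $cache{$key} = fetch_from_backend($key) unless exists $cache{$key};
    return $cache{$key};
}

lookup('widget') for 1 .. 3;
print "backend calls: $backend_calls\n";    # prints "backend calls: 1"
```

The `exists` test means that even an undefined back end result is cached, so a missing record is not fetched again on every request.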
> I'm going to use modperl due to the embedded perl interpreter
> characteristics it provides, allowing initialization overhead to be
> incurred at startup. I also wish to use an in-memory, read-only hash
> structure shared across all modperl processes for access to the cached
> back end data, rather than making expensive calls to these stores for
> each request. Again, throughput is critical.

Five hits/second should be absolutely no problem. If you expect slow clients, a mod_proxy in front of things (http://perl.apache.org/guide) can help.

> I would like your thoughts on the cache management concept of the
> service. I'm looking at MLDBM::Sync as the mechanism for managing the
> filesystem representation of the in-memory hash content. What should
> manage the in-memory structure itself, in terms of accessing its
> content? Is a Tie structure too expensive? I want to end up with a
> single structure accessible to all of the modperl processes, loaded at
> service startup.

There are a lot of options, but really, I would hold off on deciding on complicated caching schemes until you know what throughput you get without them. Even then, I'd avoid disk-based cache systems, instead preferring Cache::* if it must be shared, or just global variables if it doesn't need to be. Can you be more specific about what the data looks like, where it resides, and how expensive loading it is?

I wouldn't worry about optimization yet, unless you know beyond the shadow of a doubt that speed will be a problem. My hunch is you can do maybe 50-100 hits/second on decent Intel hardware via the Frontier modules, so I don't think performance will be a problem. This is unverified, though; I really need to benchmark it sometime. Maybe others have pushed Frontier to its speed limits?

===
To: Chip Turner <cturner@redhat.com>
From: Jon Robison <jrobison@uniphied.com>
Subject: Re: Suggestions on an XML-RPC Service using modperl?
Date: Wed, 02 Jan 2002 10:07:40 -0500

As far as the caching goes, we have had extremely good luck with IPC::ShareLite used to share info across mod_perl processes.

===
To: <bhoylma@qwest.com>, <modperl@apache.org>
From: "Perrin Harkins" <perrin@elem.com>
Subject: Re: Suggestions on an XML-RPC Service using modperl?
Date: Thu, 3 Jan 2002 12:24:12 -0500

> I would like your thoughts on the cache management concept of the
> service. I'm looking at MLDBM::Sync as the mechanism for managing the
> filesystem representation of the in-memory hash content. What should
> manage the in-memory structure itself, in terms of accessing its
> content?

MLDBM::Sync includes an option for using an in-memory cache in front of the dbm cache. This is fine for read-only data.

> Is a Tie structure too expensive?

Tied variables are slower than calling methods on objects, but if you have a class implemented as a TIE, you can always just call the STORE, FETCH, etc. methods directly.

===
To: <bhoylma@qwest.com>, "Chip Turner" <cturner@redhat.com>
From: "Perrin Harkins" <perrin@elem.com>
Subject: Re: Suggestions on an XML-RPC Service using modperl?
Date: Thu, 3 Jan 2002 12:28:01 -0500

> Even then, I'd avoid disk-based cache systems, instead
> preferring Cache::* if it must be shared, or just global variables if
> it doesn't need to be.

Cache::FileCache is disk-based, and it is the fastest of the Cache:: options for most data sets. There was a thread a little while back about data sharing that showed the top performers to be Cache::Mmap and IPC::MM. Cache::Cache and MLDBM::Sync should be more than fast enough for all but the most highly optimized systems.

===
To: "Jon Robison" <jrobison@uniphied.com>, "Chip Turner" <cturner@redhat.com>
From: "Perrin Harkins" <perrin@elem.com>
Subject: Re: Suggestions on an XML-RPC Service using modperl?
Date: Thu, 3 Jan 2002 12:29:00 -0500

> As far as the caching goes, we have had extremely good luck with
> IPC::ShareLite used to share info across mod_perl processes.

IPC::ShareLite is not as fast as some of the other options, especially when dealing with a large data set. The disk-based options tend to be faster.

===
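
Perrin's remark in the thread above, that a class implemented as a TIE can also be used by calling STORE and FETCH directly, can be illustrated with core Perl only. The class name `My::TiedCache` is hypothetical; it simply inherits the stock hash behaviour from `Tie::StdHash` (shipped in the core `Tie::Hash` module). The sketch shows the two access styles side by side and makes no claim about how large the speed difference is in practice.

```perl
use strict;
use warnings;

package My::TiedCache;           # hypothetical class for illustration
require Tie::Hash;               # core module that provides Tie::StdHash
our @ISA = ('Tie::StdHash');

package main;

my %h;
my $obj = tie %h, 'My::TiedCache';   # tie returns the underlying object

$h{color} = 'blue';                  # goes through the tie interface to STORE
$obj->STORE(shape => 'round');       # ordinary method call, same effect

print $h{shape}, "\n";               # tied read sees the directly stored value
print $obj->FETCH('color'), "\n";    # direct read sees the tied store
```

Keeping a reference to the object returned by `tie` (or retrieving it later with `tied %h`) lets hot code paths use method calls while the rest of the code keeps the hash syntax.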