This is part of The Pile, a partial archive of some open source mailing lists and newsgroups.
Subject: Re: ApacheCon report
From: Perrin Harkins <perrin@primenet.com>
Date: Mon, 30 Oct 2000 19:58:59 -0800 (PST)

On Mon, 30 Oct 2000, Tim Sweetman wrote:
> Matt Sergeant wrote:
> >
> > On Fri, 27 Oct 2000, Tim Sweetman wrote:
> >
> > > In no particular order, and splitting hairs some of the time...
> > >
> > > Sounded like mod_backhand was best used NOT in the same Apache as a phat
> > > application server (eg. mod_perl), because you don't want memory-heavy
> > > processes sitting waiting for responses. You'd be better off with a
> > > separate switching machine - or serve your static content from
> > > machine(s) that know to backhand dynamic requests to a phat machine. I
> > > think that's what Theo reckoned...
> >
> > Yes, but the backend mod_perl servers are running backhand. So you have:
> >
> >  B B B B
> >  \ | | /
> >   \ \/ /
> >    \|/
> >     F
> >
> > Where all the servers are running mod_backhand, but only F is publicly
> > accessible. There may also be >1 F. It's in his slides, and is prettier
> > than the above :-)
>
> Yeah. I know how it was set up in Theo's demo (like that) but I got the
> impression that this wouldn't be optimal for a mod_perl setup (or other
> big-footprinted configuration). You _can_ run mod_backhand on your
> content servers. You don't _have_ to.

Here's what I recall Theo saying (relative to mod_perl):

- Don't use a proxy server for doling out bytes to slow clients; just set
  the buffer on your sockets high enough to allow the server to dump the
  page and move on. This has been discussed here before, notably in this
  post:

  http://forum.swarthmore.edu/epigone/modperl/grerdbrerdwul/20000811200559.B1742@rajpur.iagora.es

  The conclusion was that you could end up paying dearly for the lingering
  close on the socket.
- If you use apache+mod_proxy as a load balancer in front of your back-end
  servers (as opposed to a commercial solution like big ip), use
  mod_backhand instead and your front-end server will be able to handle
  requests itself when it's not too busy.

- He has created a way for proxied requests to use keep-alive without
  enabling keep-alive for the whole server. The obvious problem - that
  each server will soon use up every other server's available clients -
  is somehow avoided by sharing open sockets to the other servers in some
  external daemon. This sounded cool, but fishy.

Ultimately, I don't see any way around the fact that proxying from one
server to another ties up two processes for that time rather than one, so
if your bottleneck is the number of processes you can run before running
out of RAM, this is not a good approach. If your bottleneck is CPU or disk
access, then it might be useful. I guess that means this is not so hot for
the folks who are mostly bottlenecked by an RDBMS, but might be good for
XML'ers running CPU-hungry transformations. (Yes, I saw Matt's talk on
AxKit's cache...)

> One thing I have my eye on (which doesn't mean I'll necessarily get it
> done :) ) is some sort of data-holding class that sits between an
> application and a template in a transparent way (eg. it could hold the
> method names & args that you were passing to a templating system, like a
> "command" design pattern IIRC).

In Perl, we call that a "variable". (Sorry, couldn't resist...)
Templating systems in Java jump through hoops to make generic data
structures, but Perl doesn't have to. Just freeze it with Storable if you
need to save it.

> This would potentially allow:
> + switching between different templating systems ...?

Nearly all of them use standard perl hash refs.

> + checking out template tweaks without rerunning the app - good for
>   "interactive" systems which keep chucking form data at users

This is easy to do with most systems.
I once wrote something that could turn arbitrary form inputs into a data
structure suitable for feeding to Template Toolkit or similar. Then I
could create a form for entering different kinds of test data, and save
the test data using Storable. It was used for building templates for a
system which ran the templating as a batch process and so had a terrible
turnaround time for testing changes.

> + (fairly transparent) conversion to XML/XSL/etc at an appropriate time,
>   as/when/if a site/project grows to an appropriate size

There are some modules out there that serialize perl data structures as
XML. Or you can just write a template for it.

===

Subject: Re: ApacheCon report
From: "Les Mikesell" <lesmikesell@home.com>
Date: Tue, 31 Oct 2000 01:38:06 -0600

----- Original Message -----
From: "Perrin Harkins" <perrin@primenet.com>

> Here's what I recall Theo saying (relative to mod_perl):
>
> - Don't use a proxy server for doling out bytes to slow clients; just set
>   the buffer on your sockets high enough to allow the server to dump the
>   page and move on. This has been discussed here before, notably in this
>   post:
>
>   http://forum.swarthmore.edu/epigone/modperl/grerdbrerdwul/20000811200559.B1742@rajpur.iagora.es
>
>   The conclusion was that you could end up paying dearly for the lingering
>   close on the socket.

In practice I see a fairly consistent ratio of 10 front-end proxies
running per one back end on a site where most hits end up being proxied,
so the lingering is a real problem.

> Ultimately, I don't see any way around the fact that proxying from one
> server to another ties up two processes for that time rather than one, so
> if your bottleneck is the number of processes you can run before running
> out of RAM, this is not a good approach.

The point is you only tie up the back end for the time it takes to deliver
to the proxy, then it moves on to another request while the proxy dribbles
the content back to the client.
Plus, of course, it doesn't have to be on the same machine.

> If your bottleneck is CPU or
> disk access, then it might be useful. I guess that means this is not so
> hot for the folks who are mostly bottlenecked by an RDBMS, but might be
> good for XML'ers running CPU hungry transformations. (Yes, I saw Matt's
> talk on AxKit's cache...)

Spreading requests over multiple backends is the fix for this. There is
some gain in efficiency if you dedicate certain backend servers to
certain tasks, since you will then tend to have the right things in the
cache buffers.

===

Subject: Re: ApacheCon report
From: Perrin Harkins <perrin@primenet.com>
Date: Tue, 31 Oct 2000 00:00:39 -0800 (PST)

On Tue, 31 Oct 2000, Les Mikesell wrote:
> > Ultimately, I don't see any way around the fact that proxying from one
> > server to another ties up two processes for that time rather than one, so
> > if your bottleneck is the number of processes you can run before running
> > out of RAM, this is not a good approach.
>
> The point is you only tie up the back end for the time it takes to deliver
> to the proxy, then it moves on to another request while the proxy
> dribbles the content back to the client. Plus, of course, it doesn't
> have to be on the same machine.

I was actually talking about doing this with no front-end proxy, just
mod_perl servers. That's what Theo was suggesting. I use a mod_proxy
front-end myself and it works very well.

===

Subject: Re: ApacheCon report
From: Gunther Birznieks <gunther@extropia.com>
Date: Tue, 31 Oct 2000 16:13:56 +0800

At 12:00 AM 10/31/2000 -0800, Perrin Harkins wrote:
>On Tue, 31 Oct 2000, Les Mikesell wrote:
> > > Ultimately, I don't see any way around the fact that proxying from one
> > > server to another ties up two processes for that time rather than one, so
> > > if your bottleneck is the number of processes you can run before running
> > > out of RAM, this is not a good approach.
> >
> > The point is you only tie up the back end for the time it takes to deliver
> > to the proxy, then it moves on to another request while the proxy
> > dribbles the content back to the client. Plus, of course, it doesn't
> > have to be on the same machine.
>
>I was actually talking about doing this with no front-end proxy, just
>mod_perl servers. That's what Theo was suggesting.

That might work. Although I think there are enough image files served by
front-end proxies that it still makes sense to have the front-end proxy
engines. As a bonus, if you write your app smart with cache directive
headers, some of the dynamic content can truly be cached by the front-end
server.

===

Subject: Re: ApacheCon report
From: Perrin Harkins <perrin@primenet.com>
Date: Tue, 31 Oct 2000 00:26:43 -0800 (PST)

On Tue, 31 Oct 2000, Gunther Birznieks wrote:
> As a bonus, if you write your app smart with cache directive
> headers, some of the dynamic content can truly be cached by the front-end
> server.

We're using this technique now and it really rocks. Great performance.

===

Subject: Re: ApacheCon report
From: Bill Moseley <moseley@hank.org>
Date: Tue, 31 Oct 2000 10:43:53 -0800

At 04:13 PM 10/31/00 +0800, Gunther Birznieks wrote:
>As a bonus, if you write your app smart with cache directive
>headers, some of the dynamic content can truly be cached by the front-end
>server.

Gunther,

Can you give some details? I have co-branded template-driven content that
is dynamically generated, but I allow caching. Is this an example of what
you mean, or are you describing something else?

===

Subject: Re: ApacheCon report
From: Gunther Birznieks <gunther@extropia.com>
Date: Wed, 01 Nov 2000 10:03:52 +0800

At 10:43 AM 10/31/2000 -0800, Bill Moseley wrote:
>At 04:13 PM 10/31/00 +0800, Gunther Birznieks wrote:
> >As a bonus, if you write your app smart with cache directive
> >headers, some of the dynamic content can truly be cached by the front-end
> >server.
>
>Gunther,
>
>Can you give some details? I have co-branded template driven content that
>is dynamically generated, but I allow caching. Is this an example of what
>you mean, or are you describing something else?

No, that should be all you need. If you don't turn off caching, mod_proxy
is a caching proxy even in reverse proxy mode. So if you support caching,
then that should be a bonus for you!

===

Subject: Re: ApacheCon report
From: Ask Bjoern Hansen <ask@apache.org>
Date: Tue, 31 Oct 2000 18:25:02 -0800 (PST)

On Mon, 30 Oct 2000, Perrin Harkins wrote:
[...]
> - Don't use a proxy server for doling out bytes to slow clients; just set
>   the buffer on your sockets high enough to allow the server to dump the
>   page and move on. This has been discussed here before, notably in this
>   post:
>
>   http://forum.swarthmore.edu/epigone/modperl/grerdbrerdwul/20000811200559.B1742@rajpur.iagora.es
>
> The conclusion was that you could end up paying dearly for the lingering
> close on the socket.

Mr. Llima must do something I don't, because with real world requests I
see a 15-20 to 1 ratio of mod_proxy/mod_perl processes at "my" site. And
that is serving <500byte stuff.

===

Subject: Re: ApacheCon report
From: Leslie Mikesell <les@Mcs.Net>
Date: Wed, 1 Nov 2000 17:05:29 -0600 (CST)

According to Michael Blakeley:
> > > I'm not following. Everyone agrees that we don't want to have big
> > > mod_perl processes waiting on slow clients. The question is whether
> > > tuning your socket buffer can provide the same benefits as a proxy server
> > > and the conclusion so far is that it can't because of the lingering close
> > > problem. Are you saying something different?
> >
> > A tcp close is supposed to require an acknowledgement from the
> > other end or a fairly long timeout. I don't see how a socket buffer
> > alone can change this. Likewise for any of the load balancer
> > front ends that work on the tcp connection level (but I'd like to
> > be proven wrong about this).
> > Solaris lets a user-level application close() a socket immediately
> > and go on to do other work. The sockets layer (the TCP/IP stack) will
> > continue to keep that socket open while it delivers any buffered
> > sends - but the user application doesn't need to know this (and
> > naturally won't be able to read any incoming data if it arrives).
> > When the tcp send buffer is empty, the socket will truly close, with
> > all the usual FIN et al. dialogue.
> >
> > Anyway, since the socket is closed from the mod_perl point of view,
> > the heavyweight mod_perl process is no longer tied up. I don't know
> > if this holds true for Linux as well, but if it doesn't, there's
> > always the source code.

I still like the idea of having mod_rewrite in a lightweight front end,
and if the request turns out to be static at that point there isn't much
point in dealing with proxying.

Has anyone tried putting software load balancing behind the front-end
proxy with something like eddieware, balance or ultra monkey? In that
scheme the front ends might use an IP-takeover failover and/or DNS load
balancing and would proxy to what they think is a single back-end server
- then this would hit a tcp-level balancer instead.

===

Subject: Re: ApacheCon report
From: Perrin Harkins <perrin@primenet.com>
Date: Wed, 1 Nov 2000 16:00:11 -0800 (PST)

On Wed, 1 Nov 2000, Leslie Mikesell wrote:
> I still like the idea of having mod_rewrite in a lightweight
> front end, and if the request turns out to be static at that
> point there isn't much point in dealing with proxying.

Or if the request is in the proxy cache...

> Has anyone tried putting software load balancing behind the front end
> proxy with something like eddieware, balance or ultra monkey? In that
> scheme the front ends might use an IP takeover failover and/or DNS
> load balancing and would proxy to what they think is a single back end
> server - then this would hit a tcp level balancer instead.

We use that setup with a hardware load balancer.
It works very well.

===

Subject: Re: proxy front-ends (was: Re: ApacheCon report)
From: Roger Espel Llima <espel@iagora.net>
Date: Thu, 2 Nov 2000 19:21:50 +0100

Ask Bjoern Hansen <ask@apache.org> wrote:
> Mr. Llima must do something I don't, because with real world
> requests I see a 15-20 to 1 ratio of mod_proxy/mod_perl processes at
> "my" site. And that is serving <500byte stuff.

and Michael Blakeley <mike@blakeley.com> later replied:
> Solaris lets a user-level application close() a socket immediately
> and go on to do other work. The sockets layer (the TCP/IP stack) will
> continue to keep that socket open while it delivers any buffered
> sends - but the user application doesn't need to know this [...]
> Anyway, since the socket is closed from the mod_perl point of view,
> the heavyweight mod_perl process is no longer tied up. I don't know
> if this holds true for Linux as well, but if it doesn't, there's
> always the source code.

This is exactly it. I did some tests with a real-world server, and the
conclusion was that, as long as the total write() size is less than the
kernel's max write buffer, then write() followed by close() doesn't
block. That was using Linux, where the kernel buffer size can be set by
echo'ing numbers into /proc/sys/net/core/wmem_{default,max}.

However, Apache doesn't use a plain close(), but instead calls a
function called lingering_close, which tries to make sure that the other
side has received all the data. That is done by select()ing on the
socket in 2-second increments, until the other side either closes the
connection too, or times out. And *THIS* is the reason why front-end
servers are good. An apache process spends an average of 0.8 seconds
(in my measurements) per request doing lingering close. This is
consistent with Ask's ratio of 15-20 to 1 frontend to backend servers.
Now, there's no reason in principle why lingering_close() couldn't be
done in the kernel, freeing the user process from the waiting job, and
making frontend servers unnecessary. There's even an interface for it,
namely SO_LINGER, and Apache knows how to use it. But SO_LINGER is
badly specified, and known to be broken in most tcp/ip stacks, so
currently it's kind of a bad idea to use it, and we're stuck with the
two-server model.

===

Subject: Re: proxy front-ends (was: Re: ApacheCon report)
From: Gunther Birznieks <gunther@extropia.com>
Date: Fri, 03 Nov 2000 10:15:06 +0800

Although I don't have much to add to the conversation, I just wanted to
say that this is one of the most absolutely technically enlightening
posts I've read on the mod_perl list in a while. It's really interesting
to finally clarify this once and for all.

Smells like a mod_perl guide addition. :)

At 07:21 PM 11/2/2000 +0100, Roger Espel Llima wrote:
>Ask Bjoern Hansen <ask@apache.org> wrote:
> > Mr. Llima must do something I don't, because with real world
> > requests I see a 15-20 to 1 ratio of mod_proxy/mod_perl processes at
> > "my" site. And that is serving <500byte stuff.
>
>and Michael Blakeley <mike@blakeley.com> later replied:
> > Solaris lets a user-level application close() a socket immediately
> > and go on to do other work. The sockets layer (the TCP/IP stack) will
> > continue to keep that socket open while it delivers any buffered
> > sends - but the user application doesn't need to know this [...]
> > Anyway, since the socket is closed from the mod_perl point of view,
> > the heavyweight mod_perl process is no longer tied up. I don't know
> > if this holds true for Linux as well, but if it doesn't, there's
> > always the source code.
>
>This is exactly it. I did some tests with a real-world server, and the
>conclusion was that, as long as the total write() size is less than the
>kernel's max write buffer, then write() followed by close() doesn't
>block.
>That was using Linux, where the kernel buffer size can be set by
>echo'ing numbers into /proc/sys/net/core/wmem_{default,max}.
>
>However, Apache doesn't use a plain close(), but instead calls a
>function called lingering_close, which tries to make sure that the other
>side has received all the data. That is done by select()ing on the
>socket in 2 second increments, until the other side either closes the
>connection too, or times out. And *THIS* is the reason why front-end
>servers are good. An apache process spends an average of 0.8 seconds
>(in my measurements) per request doing lingering close. This is
>consistent with Ask's ratio of 15-20 to 1 frontend to backend servers.
>
>Now, there's no reason in principle why lingering_close() couldn't be
>done in the kernel, freeing the user process from the waiting job, and
>making frontend servers unnecessary. There's even an interface for it,
>namely SO_LINGER, and Apache knows how to use it. But SO_LINGER is
>badly specified, and known to be broken in most tcp/ip stacks, so
>currently it's kind of a bad idea to use it, and we're stuck with the
>two server model.

===

Subject: Re: proxy front-ends (was: Re: ApacheCon report)
From: Joe Schaefer <joe@sunstarsys.com>
Date: 03 Nov 2000 10:00:23 -0500

Gunther Birznieks <gunther@extropia.com> writes:
> Although I don't have much to add to the conversation, I just wanted to say
> that this is one of the most absolutely technically enlightening posts I've
> read on the mod_perl list in a while. It's really interesting to finally
> clarify this once and for all.

You bet - brilliant detective/expository work going on here!

On a side note, a while back I was trying to coerce the TUX developers
to rework their server a little. I've included snippets of the email
correspondence below:

============

From: Joe Schaefer <joe+tux@sunstarsys.com>
Subject: Can tux 'proxy' for the user space daemon?
Date: 06 Oct 2000 13:32:10 -0400

It would be great if TUX is someday capable of replacing the "reverse
proxy" kludge for mod_perl. From skimming the docs, it seems that TUX on
port 80 + apache on 8080 fits this bill.

Question: In this setup, how does TUX behave wrt HTTP/1.1 keepalives
to/from apache? Say apache is configured with mod_perl, and keepalives
are disabled on apache. Is TUX capable of maintaining keepalives on the
browser <-> TUX connection, while maintaining a separate "pool" of
(closed) TUX <-> apache connections?

If I'm way off here on how TUX works (or will work), please correct me!
Thanks.

==========

From: Ingo Molnar <mingo@elte.hu>
Subject: Re: Can tux 'proxy' for the user space daemon?
Date: Sat, 7 Oct 2000 13:42:49 +0200 (CEST)

If TUX sees a request that is redirected to Apache, then all remaining
requests on the connection are redirected to Apache as well. TUX won't
ever see that connection again; the redirection works by 'trimming' all
previous input up to the request which goes to Apache, then the socket
itself is hung into Apache's listen socket, as if it came as a unique
request from the browser. This technique is completely transparent both
to Apache and to the browser. There is no mechanism to 'bounce back' a
connection from Apache to TUX. (While connections do get bounced back
and forth between the kernel and user-space TUX modules.)

So eg. if the first 2 requests within a single persistent HTTP/1.1
connection can be handled by TUX then they will be handled by TUX, and
the third (and all succeeding) requests will be redirected to Apache.
Logging will happen by TUX for the first 2 requests, and the remaining
requests will be logged by Apache.

==========

From: Joe Schaefer <joe+tux@sunstarsys.com>
Subject: Re: Can tux 'proxy' for the user space daemon?
Date: 07 Oct 2000 19:52:31 -0400

Too bad - this means that HTTP/1.1 pages generated by an apache module
won't benefit from TUX serving the images and stylesheet links contained
therein.
I guess disabling keepalives on the apache connection is (still) the
only way to go.

I still think it would be cool if there were some hack to make this work
- perhaps a TUX "gateway" module could do it? Instead of handing off a
request directly to apache, maybe a (user-space) TUX module could hand
it off and then return control back to TUX when the page has been
delivered. Is such a "gateway" TUX module viable?

=========

From: Ingo Molnar <mingo@elte.hu>
Subject: Re: Can tux 'proxy' for the user space daemon?
Date: Mon, 9 Oct 2000 11:42:57 +0200 (CEST)

It depends on the complexity of the module. If it's simple functionality
then it might be best to write a dedicated TUX module for it, without
Apache. But if it's too complex, then the same code that is used to hand
a TCP connection over to Apache can be used by Apache to send a
connection back to TUX as well. A new branch of the TUX system call
could handle this.

===