This is part of The Pile, a partial archive of some open source mailing lists and newsgroups.
Subject: Connection Pooling/TP Monitor From: Jeffrey Horn <horn@cs.wisc.edu> Date: Tue, 24 Oct 2000 15:13:28 -0500 (CDT) First let me say that I'm aware that this topic comes up with some frequency on the mod_perl and DBI-users list. I am aware of posts like this one: http://forum.swarthmore.edu/epigone/modperl/breetalwox/38B4DB3F.612476CE@acm.org which argue against the necessity of pooling. However, I am also aware of a _major_ ISP that implements their email system using a _major_ RDBMS that has had problems that are best solved via connection pooling. Essentially, the time it takes them to search through all the cached connections is nearly as long as the time it is taking to read/write to the database. Although, I'm not implementing email as this ISP is, I think that scalability in my case may definitely run into similar roadblocks. I am interested in hearing from anyone that has tried to implement true connection pooling either within Apache or as an external process. I'm particularly interested in hearing about implementations that could be made to work or are done using Perl and DBI/DBD. I am mostly interested in things that are Open Source or licensed like Perl itself. I am aware of a project called Gnu Transaction Server (GTS), but it doesn't seem like this is quite ready for prime time at the moment or is even under active development. I've seen posts that hint at using shared memory and IPC to implement this within Apache as well as posts that hint at possibilities of implementing this using DBI::Proxy. I basically want to do what the big TP monitors (Tuxedo/Encina/CICS) do with respect to condensing connections to a database, but I'm not in need of features like two-phase commit, cross database joins, heterogeneous database environment, etc. incorporated in these products. Even if you'd simply be interested in working on such a project, I'd like to hear from you. If you think such a project is plain stupid, I'd also be interested in hearing from you (but be gentle!). If you already have something sort of working along these lines, I'd DEFINITELY be interested in hearing from you! ==== Subject: Re: Connection Pooling / TP Monitor From: Tim Bunce <Tim.Bunce@ig.co.uk> Date: Tue, 24 Oct 2000 22:25:12 +0100 On Tue, Oct 24, 2000 at 03:09:47PM -0500, Jeff Horn wrote: > > I basically want to do what the big TP monitors (Tuxedo/Encina/CICS) do with respect to condensing connections to a database, but I'm not in need of features like two-phase commit, cross database joins, heterogeneous database environment, etc. incorporated in these products. I think there's lots more mileage to be had from developing DBI::ProxyServer further. Tim. p.s. Jeff, please keep me CC'd on any dialogue. Thanks. === Subject: Re: Connection Pooling / TP Monitor From: Matt Sergeant <matt@sergeant.org> Date: Thu, 26 Oct 2000 20:47:20 +0100 (BST) On Tue, 24 Oct 2000, Jeff Horn wrote: > However, I am also aware of a _major_ ISP that implements their email > system using a _major_ RDBMS that has had problems that are best > solved via connection pooling. Essentially, the time it takes them to > search through all the cached connections is nearly as long as the > time it is taking to read/write to the database. Although, I'm not > implementing email as this ISP is, I think that scalability in my case > may definitely run into similar roadblocks. > > I am interested in hearing from anyone that has tried to implement > true connection pooling either within Apache or as an external > process. I'm particularly interested in hearing about implementations > that could be made to work or are done using Perl and DBI/DBD. I am > mostly interested in things that are Open Source or licensed like Perl > itself. Having just returned from ApacheCon, I can honestly recommend looking at mod_backhand to simply have a few servers that run the DBI pool, and have database intensive requests go to those servers. It is a *very* cool solution to just these sorts of scalability problems. PS: I'll have an ApacheCon report "coming soon". === Subject: Re: Connection Pooling / TP Monitor From: Tim Bunce <Tim.Bunce@ig.co.uk> Date: Fri, 27 Oct 2000 12:04:44 +0100 On Thu, Oct 26, 2000 at 08:47:20PM +0100, Matt Sergeant wrote: > On Tue, 24 Oct 2000, Jeff Horn wrote: > > > However, I am also aware of a _major_ ISP that implements their email > > system using a _major_ RDBMS that has had problems that are best > > solved via connection pooling. Essentially, the time it takes them to > > search through all the cached connections is nearly as long as the > > time it is taking to read/write to the database. Although, I'm not > > implementing email as this ISP is, I think that scalability in my case > > may definitely run into similar roadblocks. > > > > I am interested in hearing from anyone that has tried to implement > > true connection pooling either within Apache or as an external > > process. I'm particularly interested in hearing about implementations > > that could be made to work or are done using Perl and DBI/DBD. I am > > mostly interested in things that are Open Source or licensed like Perl > > itself. > > Having just returned from ApacheCon, I can honestly recommend looking at > mod_backhand to simply have a few servers that run the DBI pool, and have > database intensive requests go to those servers. It is a *very* cool > solution to just these sorts of scalability problems. To redirect incoming url's that require database work to mod_perl 'heavy' servers? Just like a smarter and more dynamic mod_rewrite? Yes? Or, here's an odd thought that just crossed my mind... You could have a set of apache servers that are 'pure' DBI proxy servers. That is, they POST requests containing SQL (for prepare_cached) plus bind parameter values and return responses containing the results. Basically I'm proposing that apache be used as an alternative framework for DBI::ProxyServer. Almost all the marshaling code and higher level logic is already in DBI::ProxyServer and DBD::Proxy. Shouldn't be too hard to do and you'd gain in all sorts of ways. Anyone fancy having a go? Let me know so we can discuss it in more detail. === Subject: Re: Connection Pooling / TP Monitor From: Matt Sergeant <matt@sergeant.org> Date: Fri, 27 Oct 2000 12:26:44 +0100 (BST) On Fri, 27 Oct 2000, Tim Bunce wrote: > On Thu, Oct 26, 2000 at 08:47:20PM +0100, Matt Sergeant wrote: > > On Tue, 24 Oct 2000, Jeff Horn wrote: > > > > > However, I am also aware of a _major_ ISP that implements their email > > > system using a _major_ RDBMS that has had problems that are best > > > solved via connection pooling. Essentially, the time it takes them to > > > search through all the cached connections is nearly as long as the > > > time it is taking to read/write to the database. Although, I'm not > > > implementing email as this ISP is, I think that scalability in my case > > > may definitely run into similar roadblocks. > > > > > > I am interested in hearing from anyone that has tried to implement > > > true connection pooling either within Apache or as an external > > > process. I'm particularly interested in hearing about implementations > > > that could be made to work or are done using Perl and DBI/DBD. I am > > > mostly interested in things that are Open Source or licensed like Perl > > > itself. > > > > Having just returned from ApacheCon, I can honestly recommend looking at > > mod_backhand to simply have a few servers that run the DBI pool, and have > > database intensive requests go to those servers. It is a *very* cool > > solution to just these sorts of scalability problems. > > To redirect incoming url's that require database work to mod_perl 'heavy' > servers? Just like a smarter and more dynamic mod_rewrite? Yes? Yes basically, except its not a redirect. mod_backhand can use keep-alives to ensure that it never has to recreate a new connection to the heavy backend servers, unlike mod_rewrite or mod_proxy. And it can do it in a smart way so that remote connections don't use keepalives (because they are evil for mod_perl servers - see the mod_perl guide), but backhand connections do. Very very cool technology. > Or, here's an odd thought that just crossed my mind... > > You could have a set of apache servers that are 'pure' DBI proxy servers. > That is, they POST requests containing SQL (for prepare_cached) plus > bind parameter values and return responses containing the results. > > Basically I'm proposing that apache be used as an alternative framework for > DBI::ProxyServer. Almost all the marshaling code and higher level logic > is already in DBI::ProxyServer and DBD::Proxy. Shouldn't be too hard to do > and you'd gain in all sorts of ways. > > Anyone fancy having a go? Let me know so we can discuss it in more detail. Sounds like just a CORBA/RPC type thing. Wouldn't you be better off using CORBA::ORBit? === Subject: Re: Connection Pooling / TP Monitor From: Tim Bunce <Tim.Bunce@ig.co.uk> Date: Fri, 27 Oct 2000 12:51:27 +0100 On Fri, Oct 27, 2000 at 12:26:44PM +0100, Matt Sergeant wrote: > > > Or, here's an odd thought that just crossed my mind... > > > > You could have a set of apache servers that are 'pure' DBI proxy servers. > > That is, they POST requests containing SQL (for prepare_cached) plus > > bind parameter values and return responses containing the results. > > > > Basically I'm proposing that apache be used as an alternative framework for > > DBI::ProxyServer. Almost all the marshaling code and higher level logic > > is already in DBI::ProxyServer and DBD::Proxy. Shouldn't be too hard to do > > and you'd gain in all sorts of ways. > > > > Anyone fancy having a go? Let me know so we can discuss it in more detail. > > Sounds like just a CORBA/RPC type thing. Wouldn't you be better off using > CORBA::ORBit? Maybe. I dunno. I don't actually need this stuff, I just want there to be a solution out there for those that do. I'm waving my hands around and pointing in various directions hoping someone will _do_ something! === Subject: Re: Connection Pooling / TP Monitor From: Matt Sergeant <matt@sergeant.org> Date: Fri, 27 Oct 2000 13:02:02 +0100 (BST) On Fri, 27 Oct 2000, Tim Bunce wrote: > > Sounds like just a CORBA/RPC type thing. Wouldn't you be better off using > > CORBA::ORBit? > > Maybe. I dunno. I don't actually need this stuff, I just want there to > be a solution out there for those that do. I'm waving my hands around > and pointing in various directions hoping someone will _do_ something! Hehe... OK, lets think about exactly what is needed here then. I figure Doug's Apache::DBIPool module (for mod_perl 2.0) is exactly the right architecture: 2 pools of connections (Busy and Waiting) New connections always taken from the head of Waiting Finished connections always replaced on the head of Waiting Threaded architecture (DBI::Oracle handles don't survive a fork) One thread for management One thread per connection once a handle has been supplied Some sort of timeout mechanism for connections if the pool is fully allocated Anything I've missed? If we don't go the threaded route, we can't easily expand and contract the connection pool I don't think - but I'd love to be proved wrong. Also an entire Apache server for the connection pool would be too much - the pre-forking server from the cookbook would be better. And it should even work on Win32 now... === Subject: Re: Connection Pooling / TP Monitor From: merlyn@stonehenge.com (Randal L. Schwartz) Date: 27 Oct 2000 07:38:00 -0700 Tim" == Tim Bunce <Tim.Bunce@ig.co.uk> writes: Tim> You could have a set of apache servers that are 'pure' DBI proxy Tim> servers. That is, they POST requests containing SQL (for Tim> prepare_cached) plus bind parameter values and return responses Tim> containing the results. Tim> Basically I'm proposing that apache be used as an alternative Tim> framework for DBI::ProxyServer. Almost all the marshaling code Tim> and higher level logic is already in DBI::ProxyServer and Tim> DBD::Proxy. Shouldn't be too hard to do and you'd gain in all Tim> sorts of ways. You could also use SOAP or SOAP::Lite as the interface. Most of that code seems ready for this kind of application already. === Subject: Re: Connection Pooling / TP Monitor From: Matt Sergeant <matt@sergeant.org> Date: Fri, 27 Oct 2000 15:41:47 +0100 (BST) On 27 Oct 2000, (Randal L. Schwartz) wrote: > >>>>> "Tim" == Tim Bunce <Tim.Bunce@ig.co.uk> writes: > > Tim> You could have a set of apache servers that are 'pure' DBI proxy > Tim> servers. That is, they POST requests containing SQL (for > Tim> prepare_cached) plus bind parameter values and return responses > Tim> containing the results. > > Tim> Basically I'm proposing that apache be used as an alternative > Tim> framework for DBI::ProxyServer. Almost all the marshaling code > Tim> and higher level logic is already in DBI::ProxyServer and > Tim> DBD::Proxy. Shouldn't be too hard to do and you'd gain in all > Tim> sorts of ways. > > You could also use SOAP or SOAP::Lite as the interface. Most of that > code seems ready for this kind of application already. There are some issues still with this architecture, the primary one is that SOAP is too heavy weight for anything that seriously needs connection pooling for speed issues, especially in Perl (due to the XML parsing speed issues). === Subject: Re: Connection Pooling / TP Monitor From: Gunther Birznieks <gunther@extropia.com> Date: Sun, 29 Oct 2000 00:50:41 +0800 At 07:38 AM 10/27/00 -0700, Randal L. Schwartz wrote: > >>>>> "Tim" == Tim Bunce <Tim.Bunce@ig.co.uk> writes: > >Tim> You could have a set of apache servers that are 'pure' DBI proxy >Tim> servers. That is, they POST requests containing SQL (for >Tim> prepare_cached) plus bind parameter values and return responses >Tim> containing the results. > >Tim> Basically I'm proposing that apache be used as an alternative >Tim> framework for DBI::ProxyServer. Almost all the marshaling code >Tim> and higher level logic is already in DBI::ProxyServer and >Tim> DBD::Proxy. Shouldn't be too hard to do and you'd gain in all >Tim> sorts of ways. > >You could also use SOAP or SOAP::Lite as the interface. Most of that >code seems ready for this kind of application already. I would second that. We've done this using SOAP. We have a DataSource::SOAP driver that acts as a lightweight interface to a Jakarta TomCat server for the DB stuff. We get the benefits of Perl on the front-end and Java DB Connection pooling logic/proxying on the middle tier. Of course I guess you could do the SOAP Server in Perl too, but Java was a bit easier because we also get built in shared memory caching for frequently issued queries with the way our particular interfaces work. Anyway, with SOAP it doesn't matter what language you use for what -- in theory. === Subject: Re: Connection Pooling / TP Monitor From: Gunther Birznieks <gunther@extropia.com> Date: Sun, 29 Oct 2000 00:56:07 +0800 At 03:41 PM 10/27/00 +0100, Matt Sergeant wrote: >On 27 Oct 2000, (Randal L. Schwartz) wrote: > > > >>>>> "Tim" == Tim Bunce <Tim.Bunce@ig.co.uk> writes: > > > > Tim> You could have a set of apache servers that are 'pure' DBI proxy > > Tim> servers. That is, they POST requests containing SQL (for > > Tim> prepare_cached) plus bind parameter values and return responses > > Tim> containing the results. > > > > Tim> Basically I'm proposing that apache be used as an alternative > > Tim> framework for DBI::ProxyServer. Almost all the marshaling code > > Tim> and higher level logic is already in DBI::ProxyServer and > > Tim> DBD::Proxy. Shouldn't be too hard to do and you'd gain in all > > Tim> sorts of ways. > > > > You could also use SOAP or SOAP::Lite as the interface. Most of that > > code seems ready for this kind of application already. > >There are some issues still with this architecture, the primary one is >that SOAP is too heavy weight for anything that seriously needs connection >pooling for speed issues, especially in Perl (due to the XML parsing speed >issues). What we did for our SOAP objects is that we don't use XML Parsing exactly. The reality is that SOAP parsing with a generic object library is heavy weight. But if you are supporting only 7-8 method calls, it's really not bad to write regex that can do all the appropriate parsing with very little code. Our SOAP drivers have well-defined interfaces and are extremely fast using IO::Socket, Regex SOAP utility methods (to construct and strip things like envelope headers). You might argue that there are more things to go wrong this way and you would be right. However, method calls are usually quite well defined and always have the same basic parameter definitions. So as long as you stick to the well-defined interfaces, it's not bad in practice. === Subject: Re: Connection Pooling / TP Monitor From: "Les Mikesell" <lesmikesell@home.com> Date: Sat, 28 Oct 2000 18:36:23 -0500 Original Message ----- From: "Matt Sergeant" <matt@sergeant.org> . > > > > To redirect incoming url's that require database work to mod_perl 'heavy' > > servers? Just like a smarter and more dynamic mod_rewrite? Yes? > > Yes basically, except its not a redirect. mod_backhand can use keep-alives > to ensure that it never has to recreate a new connection to the heavy > backend servers, unlike mod_rewrite or mod_proxy. And it can do it in a > smart way so that remote connections don't use keepalives (because they > are evil for mod_perl servers - see the mod_perl guide), but backhand > connections do. Very very cool technology. Is there any way to tie proxy requests mapped by mod_rewrite to a balanced set of servers through mod_backhand (or anything similar)? Also, can mod_backhand (or any alternative) work with non-apache back end servers? I'm really looking for a way to let mod_rewrite do the first cut at deciding where (or whether) to send a request, but then be able to send to a load balanced, fail over set, preferably without having to interpose another physical proxy. ===