database_connection_pooling

This is part of The Pile, a partial archive of some open source mailing lists and newsgroups.



Subject: Connection Pooling/TP Monitor
From: Jeffrey Horn <horn@cs.wisc.edu>
Date: Tue, 24 Oct 2000 15:13:28 -0500 (CDT)

First let me say that I'm aware that this topic comes up
with some frequency on the mod_perl and DBI-users list.  I
am aware of posts like this one:
 
http://forum.swarthmore.edu/epigone/modperl/breetalwox/38B4DB3F.612476CE@acm.org
 
which argue against the necessity of pooling.
 
However, I am also aware of a _major_ ISP that implements
their email system using a _major_ RDBMS that has had
problems that are best solved via connection pooling.
Essentially, the time it takes them to search through all
the cached connections is nearly as long as the time it is
taking to read/write to the database.  Although, I'm not
implementing email as this ISP is, I think that scalability
in my case may definitely run into similar roadblocks.
 
I am interested in hearing from anyone that has tried to
implement true connection pooling either within Apache or as
an external process.  I'm particularly interested in hearing
about implementations that could be made to work or are done
using Perl and DBI/DBD.  I am mostly interested in things
that are Open Source or licensed like Perl itself.
 
I am aware of a project called Gnu Transaction Server (GTS),
but it doesn't seem like this is quite ready for prime time
at the moment or is even under active development.  I've
seen posts that hint at using shared memory and IPC to
implement this within Apache as well as posts that hint at
possibilities of implementing this using DBI::Proxy.
 
I basically want to do what the big TP monitors
(Tuxedo/Encina/CICS) do with respect to condensing
connections to a database, but I'm not in need of features
like two-phase commit, cross database joins, heterogeneous
database environment, etc. incorporated in these products.
 
Even if you'd simply be interested in working on such a
project, I'd like to hear from you.  If you think such a
project is plain stupid, I'd also be interested in hearing
from you (but be gentle!).  If you already have something
sort of working along these lines, I'd DEFINITELY be
interested in hearing from you!
 
====

Subject: Re: Connection Pooling / TP Monitor
From: Tim Bunce <Tim.Bunce@ig.co.uk>
Date: Tue, 24 Oct 2000 22:25:12 +0100

On Tue, Oct 24, 2000 at 03:09:47PM -0500, Jeff Horn wrote:
> 
> I basically want to do what the big TP monitors (Tuxedo/Encina/CICS) do with respect to condensing connections to a database, but I'm not in need of features like two-phase commit, cross database joins, heterogeneous database environment, etc. incorporated in these products.


I think there's lots more mileage to be had from developing
DBI::ProxyServer further.

Tim.

p.s. Jeff, please keep me CC'd on any dialogue. Thanks.

===

Subject: Re: Connection Pooling / TP Monitor
From: Matt Sergeant <matt@sergeant.org>
Date: Thu, 26 Oct 2000 20:47:20 +0100 (BST)

On Tue, 24 Oct 2000, Jeff Horn wrote:

> However, I am also aware of a _major_ ISP that implements their email
> system using a _major_ RDBMS that has had problems that are best
> solved via connection pooling.  Essentially, the time it takes them to
> search through all the cached connections is nearly as long as the
> time it is taking to read/write to the database.  Although, I'm not
> implementing email as this ISP is, I think that scalability in my case
> may definitely run into similar roadblocks.
> 
> I am interested in hearing from anyone that has tried to implement
> true connection pooling either within Apache or as an external
> process.  I'm particularly interested in hearing about implementations
> that could be made to work or are done using Perl and DBI/DBD.  I am
> mostly interested in things that are Open Source or licensed like Perl
> itself.

Having just returned from ApacheCon, I can honestly recommend looking at
mod_backhand to simply have a few servers that run the DBI pool, and have
database intensive requests go to those servers. It is a *very* cool
solution to just these sorts of scalability problems.

PS: I'll have an ApacheCon report "coming soon".

===

Subject: Re: Connection Pooling / TP Monitor
From: Tim Bunce <Tim.Bunce@ig.co.uk>
Date: Fri, 27 Oct 2000 12:04:44 +0100

On Thu, Oct 26, 2000 at 08:47:20PM +0100, Matt Sergeant wrote:
> On Tue, 24 Oct 2000, Jeff Horn wrote:
> 
> > However, I am also aware of a _major_ ISP that implements their email
> > system using a _major_ RDBMS that has had problems that are best
> > solved via connection pooling.  Essentially, the time it takes them to
> > search through all the cached connections is nearly as long as the
> > time it is taking to read/write to the database.  Although, I'm not
> > implementing email as this ISP is, I think that scalability in my case
> > may definitely run into similar roadblocks.
> > 
> > I am interested in hearing from anyone that has tried to implement
> > true connection pooling either within Apache or as an external
> > process.  I'm particularly interested in hearing about implementations
> > that could be made to work or are done using Perl and DBI/DBD.  I am
> > mostly interested in things that are Open Source or licensed like Perl
> > itself.
> 
> Having just returned from ApacheCon, I can honestly recommend looking at
> mod_backhand to simply have a few servers that run the DBI pool, and have
> database intensive requests go to those servers. It is a *very* cool
> solution to just these sorts of scalability problems.

To redirect incoming url's that require database work to mod_perl 'heavy'
servers? Just like a smarter and more dynamic mod_rewrite? Yes?

Or, here's an odd thought that just crossed my mind...

You could have a set of apache servers that are 'pure' DBI proxy servers.
That is, they POST requests containing SQL (for prepare_cached) plus
bind parameter values and return responses containing the results.

Basically I'm proposing that apache be used as an alternative framework for
DBI::ProxyServer. Almost all the marshaling code and higher level logic
is already in DBI::ProxyServer and DBD::Proxy. Shouldn't be too hard to do
and you'd gain in all sorts of ways.

Anyone fancy having a go? Let me know so we can discuss it in more detail.

===

Subject: Re: Connection Pooling / TP Monitor
From: Matt Sergeant <matt@sergeant.org>
Date: Fri, 27 Oct 2000 12:26:44 +0100 (BST)

On Fri, 27 Oct 2000, Tim Bunce wrote:

> On Thu, Oct 26, 2000 at 08:47:20PM +0100, Matt Sergeant wrote:
> > On Tue, 24 Oct 2000, Jeff Horn wrote:
> > 
> > > However, I am also aware of a _major_ ISP that implements their email
> > > system using a _major_ RDBMS that has had problems that are best
> > > solved via connection pooling.  Essentially, the time it takes them to
> > > search through all the cached connections is nearly as long as the
> > > time it is taking to read/write to the database.  Although, I'm not
> > > implementing email as this ISP is, I think that scalability in my case
> > > may definitely run into similar roadblocks.
> > > 
> > > I am interested in hearing from anyone that has tried to implement
> > > true connection pooling either within Apache or as an external
> > > process.  I'm particularly interested in hearing about implementations
> > > that could be made to work or are done using Perl and DBI/DBD.  I am
> > > mostly interested in things that are Open Source or licensed like Perl
> > > itself.
> > 
> > Having just returned from ApacheCon, I can honestly recommend looking at
> > mod_backhand to simply have a few servers that run the DBI pool, and have
> > database intensive requests go to those servers. It is a *very* cool
> > solution to just these sorts of scalability problems.
> 
> To redirect incoming url's that require database work to mod_perl 'heavy'
> servers? Just like a smarter and more dynamic mod_rewrite? Yes?

Yes basically, except its not a redirect. mod_backhand can use keep-alives
to ensure that it never has to recreate a new connection to the heavy
backend servers, unlike mod_rewrite or mod_proxy. And it can do it in a
smart way so that remote connections don't use keepalives (because they
are evil for mod_perl servers - see the mod_perl guide), but backhand
connections do. Very very cool technology.

> Or, here's an odd thought that just crossed my mind...
> 
> You could have a set of apache servers that are 'pure' DBI proxy servers.
> That is, they POST requests containing SQL (for prepare_cached) plus
> bind parameter values and return responses containing the results.
> 
> Basically I'm proposing that apache be used as an alternative framework for
> DBI::ProxyServer. Almost all the marshaling code and higher level logic
> is already in DBI::ProxyServer and DBD::Proxy. Shouldn't be too hard to do
> and you'd gain in all sorts of ways.
> 
> Anyone fancy having a go? Let me know so we can discuss it in more detail.

Sounds like just a CORBA/RPC type thing. Wouldn't you be better off using
CORBA::ORBit?

===

Subject: Re: Connection Pooling / TP Monitor
From: Tim Bunce <Tim.Bunce@ig.co.uk>
Date: Fri, 27 Oct 2000 12:51:27 +0100

On Fri, Oct 27, 2000 at 12:26:44PM +0100, Matt Sergeant wrote:
> 
> > Or, here's an odd thought that just crossed my mind...
> > 
> > You could have a set of apache servers that are 'pure' DBI proxy servers.
> > That is, they POST requests containing SQL (for prepare_cached) plus
> > bind parameter values and return responses containing the results.
> > 
> > Basically I'm proposing that apache be used as an alternative framework for
> > DBI::ProxyServer. Almost all the marshaling code and higher level logic
> > is already in DBI::ProxyServer and DBD::Proxy. Shouldn't be too hard to do
> > and you'd gain in all sorts of ways.
> > 
> > Anyone fancy having a go? Let me know so we can discuss it in more detail.
> 
> Sounds like just a CORBA/RPC type thing. Wouldn't you be better off using
> CORBA::ORBit?

Maybe. I dunno. I don't actually need this stuff, I just want there to
be a solution out there for those that do. I'm waving my hands around
and pointing in various directions hoping someone will _do_ something!

===

Subject: Re: Connection Pooling / TP Monitor
From: Matt Sergeant <matt@sergeant.org>
Date: Fri, 27 Oct 2000 13:02:02 +0100 (BST)

On Fri, 27 Oct 2000, Tim Bunce wrote:

> > Sounds like just a CORBA/RPC type thing. Wouldn't you be better off using
> > CORBA::ORBit?
> 
> Maybe. I dunno. I don't actually need this stuff, I just want there to
> be a solution out there for those that do. I'm waving my hands around
> and pointing in various directions hoping someone will _do_ something!

Hehe...

OK, lets think about exactly what is needed here then. I figure Doug's
Apache::DBIPool module (for mod_perl 2.0) is exactly the right
architecture:

	2 pools of connections (Busy and Waiting)
	New connections always taken from the head of Waiting
	Finished connections always replaced on the head of Waiting
	Threaded architecture (DBI::Oracle handles don't survive a fork)
		One thread for management
		One thread per connection once a handle has been supplied
	Some sort of timeout mechanism for connections if the pool is
fully allocated

Anything I've missed?

If we don't go the threaded route, we can't easily expand and contract the
connection pool I don't think - but I'd love to be proved wrong. Also an
entire Apache server for the connection pool would be too much - the
pre-forking server from the cookbook would be better. And it should even
work on Win32 now...

===

Subject: Re: Connection Pooling / TP Monitor
From: merlyn@stonehenge.com (Randal L. Schwartz)
Date: 27 Oct 2000 07:38:00 -0700

Tim" == Tim Bunce <Tim.Bunce@ig.co.uk> writes:

Tim> You could have a set of apache servers that are 'pure' DBI proxy
Tim> servers.  That is, they POST requests containing SQL (for
Tim> prepare_cached) plus bind parameter values and return responses
Tim> containing the results.

Tim> Basically I'm proposing that apache be used as an alternative
Tim> framework for DBI::ProxyServer. Almost all the marshaling code
Tim> and higher level logic is already in DBI::ProxyServer and
Tim> DBD::Proxy. Shouldn't be too hard to do and you'd gain in all
Tim> sorts of ways.

You could also use SOAP or SOAP::Lite as the interface.  Most of that
code seems ready for this kind of application already.

===

Subject: Re: Connection Pooling / TP Monitor
From: Matt Sergeant <matt@sergeant.org>
Date: Fri, 27 Oct 2000 15:41:47 +0100 (BST)

On 27 Oct 2000, (Randal L. Schwartz) wrote:

> >>>>> "Tim" == Tim Bunce <Tim.Bunce@ig.co.uk> writes:
> 
> Tim> You could have a set of apache servers that are 'pure' DBI proxy
> Tim> servers.  That is, they POST requests containing SQL (for
> Tim> prepare_cached) plus bind parameter values and return responses
> Tim> containing the results.
> 
> Tim> Basically I'm proposing that apache be used as an alternative
> Tim> framework for DBI::ProxyServer. Almost all the marshaling code
> Tim> and higher level logic is already in DBI::ProxyServer and
> Tim> DBD::Proxy. Shouldn't be too hard to do and you'd gain in all
> Tim> sorts of ways.
> 
> You could also use SOAP or SOAP::Lite as the interface.  Most of that
> code seems ready for this kind of application already.

There are some issues still with this architecture, the primary one is
that SOAP is too heavy weight for anything that seriously needs connection
pooling for speed issues, especially in Perl (due to the XML parsing speed
issues).

===

Subject: Re: Connection Pooling / TP Monitor
From: Gunther Birznieks <gunther@extropia.com>
Date: Sun, 29 Oct 2000 00:50:41 +0800

At 07:38 AM 10/27/00 -0700, Randal L. Schwartz wrote:
> >>>>> "Tim" == Tim Bunce <Tim.Bunce@ig.co.uk> writes:
>
>Tim> You could have a set of apache servers that are 'pure' DBI proxy
>Tim> servers.  That is, they POST requests containing SQL (for
>Tim> prepare_cached) plus bind parameter values and return responses
>Tim> containing the results.
>
>Tim> Basically I'm proposing that apache be used as an alternative
>Tim> framework for DBI::ProxyServer. Almost all the marshaling code
>Tim> and higher level logic is already in DBI::ProxyServer and
>Tim> DBD::Proxy. Shouldn't be too hard to do and you'd gain in all
>Tim> sorts of ways.
>
>You could also use SOAP or SOAP::Lite as the interface.  Most of that
>code seems ready for this kind of application already.

I would second that. We've done this using SOAP. We have a DataSource::SOAP 
driver that acts as a lightweight interface to a Jakarta TomCat server for 
the DB stuff. We get the benefits of Perl on the front-end and Java DB 
Connection pooling logic/proxying on the middle tier.

Of course I guess you could do the SOAP Server in Perl too, but Java was a 
bit easier because we also get built in shared memory caching for 
frequently issued queries with the way our particular interfaces work. 
Anyway, with SOAP it doesn't matter what language you use for what -- in 
theory.

===

Subject: Re: Connection Pooling / TP Monitor
From: Gunther Birznieks <gunther@extropia.com>
Date: Sun, 29 Oct 2000 00:56:07 +0800

At 03:41 PM 10/27/00 +0100, Matt Sergeant wrote:
>On 27 Oct 2000, (Randal L. Schwartz) wrote:
>
> > >>>>> "Tim" == Tim Bunce <Tim.Bunce@ig.co.uk> writes:
> >
> > Tim> You could have a set of apache servers that are 'pure' DBI proxy
> > Tim> servers.  That is, they POST requests containing SQL (for
> > Tim> prepare_cached) plus bind parameter values and return responses
> > Tim> containing the results.
> >
> > Tim> Basically I'm proposing that apache be used as an alternative
> > Tim> framework for DBI::ProxyServer. Almost all the marshaling code
> > Tim> and higher level logic is already in DBI::ProxyServer and
> > Tim> DBD::Proxy. Shouldn't be too hard to do and you'd gain in all
> > Tim> sorts of ways.
> >
> > You could also use SOAP or SOAP::Lite as the interface.  Most of that
> > code seems ready for this kind of application already.
>
>There are some issues still with this architecture, the primary one is
>that SOAP is too heavy weight for anything that seriously needs connection
>pooling for speed issues, especially in Perl (due to the XML parsing speed
>issues).

What we did for our SOAP objects is that we don't use XML Parsing exactly. 
The reality is that SOAP parsing with a generic object library is heavy 
weight. But if you are supporting only 7-8 method calls, it's really not 
bad to write regex that can do all the appropriate parsing with very little 
code. Our SOAP drivers have well-defined interfaces and are extremely fast 
using IO::Socket, Regex SOAP utility methods (to construct and strip things 
like envelope headers).

You might argue that there are more things to go wrong this way and you 
would be right. However, method calls are usually quite well defined and 
always have the same basic parameter definitions. So as long as you stick 
to the well-defined interfaces, it's not bad in practice.

===

Subject: Re: Connection Pooling / TP Monitor
From: "Les Mikesell" <lesmikesell@home.com>
Date: Sat, 28 Oct 2000 18:36:23 -0500

Original Message -----
From: "Matt Sergeant" <matt@sergeant.org>
.
> >
> > To redirect incoming url's that require database work to mod_perl
'heavy'
> > servers? Just like a smarter and more dynamic mod_rewrite? Yes?
>
> Yes basically, except its not a redirect. mod_backhand can use keep-alives
> to ensure that it never has to recreate a new connection to the heavy
> backend servers, unlike mod_rewrite or mod_proxy. And it can do it in a
> smart way so that remote connections don't use keepalives (because they
> are evil for mod_perl servers - see the mod_perl guide), but backhand
> connections do. Very very cool technology.

Is there any way to tie proxy requests mapped by mod_rewrite to
a balanced set of servers through mod_backhand (or anything
similar)?    Also, can mod_backhand (or any alternative) work
with non-apache back end servers?    I'm really looking for a way
to let mod_rewrite do the first cut at deciding where (or whether)
to send a request, but then be able to send to a load balanced, fail
over set, preferably without having to interpose another physical
proxy.

===


the rest of The Pile (a partial mailing list archive)

doom@kzsu.stanford.edu