modperl-cachecashe_sharedmemorycache

This is part of The Pile, a partial archive of some open source mailing lists and newsgroups.



To: "Perrin Harkins" <perrin@elem.com>
From: "Rob Bloodgood" <robb@empire2.com>
Subject: Shared memory caching revisited (was "it's supposed
to SHARE it, not make more!")
Date: Tue, 4 Sep 2001 12:14:52 -0700

> > One of the shiny golden nuggets I received from said slice was a
> > shared memory cache.  It was simple, it was elegant, it was
> > perfect.  It was also based on IPC::Shareable.  GREAT idea.  BAD
> > juju.

> Just use Cache::Cache.  It's faster and easier.

Now, ya see...
Once upon a time, not many moons ago, the issue of Cache::Cache came up with
the SharedMemory Cache and the fact that it has NO locking semantics.  When
I found this thread in searching for ways to implement my own locking scheme
to make up for this lack, I came upon YOUR comments that perhaps
Apache::Session::Lock::Semaphore could be used, without any of the rest of
the Apache::Session package.

That was a good enough lead for me.

So I went into the manpage, and I went into the module, and then I
misunderstood how the semaphore key was determined, and wasted a good hour
or two trying to patch it.  Then I reverted to my BASICS: Data::Dumper is
your FRIEND.  Print DEBUGGING messages.  Duh, of course, except for some
reason I didn't think to worry about it, at first, in somebody else's
module. <sigh> So, see what I did wrong, undo the patches, and:

A:S:L:S makes the ASSUMPTION that the argument passed to its locking methods
is an Apache::Session object.  Specifically, that it is a hashref of the
following (at least partial) structure:

{
  data => {
            _session_id => (something)
          }
}

The _session_id is used as the seed for the locking semaphore.  *IF* I
understood the requirements correctly, the _session_id has to be the same
FOR EVERY PROCESS in order for the locking to work as desired, for a given
shared data structure.
So my new caching code is at the end of this message.

***OH WOW!***  So, DURING the course of composing this message, I've
realized that the function expire_old_accounts() is now redundant!
Cache::Cache takes care of that, both with expires_in and max_size.  I'm
leaving it in for reference, just to show how it's improved. :-)

***OH WOW! v1.1*** :-) I've also just now realized that the call to
bind_accounts() could actually go right inside lookup_account(), if:
1) lookup_account() is the only function using the cache, or
2) lookup_account() is ALWAYS THE FIRST function to access the cache, or
3) every OTHER function accessing the cache has the same call,
   of the form "bind() unless defined $to_bind;"

I think for prudence I'll leave it outside for now.
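
Option 3 in the list above can be sketched as a small accessor that every cache-using function calls. This is only an illustration: cache(), build_cache() and store_account() are hypothetical names, build_cache() stands in for bind_accounts(), and a plain hash plays the part of the shared cache.

```perl
use strict;
use warnings;

my $cache;            # stays undef until first use in this process
my $bind_count = 0;   # only here to show that binding happens once

# Hypothetical stand-in for bind_accounts(); a plain hash plays the cache.
sub build_cache {
    $bind_count++;
    return {};
}

# Every function that touches the cache goes through this accessor,
# so whichever one runs first does the binding.
sub cache {
    $cache = build_cache() unless defined $cache;
    return $cache;
}

sub lookup_account {
    my ($id) = @_;
    return cache()->{$id};
}

sub store_account {
    my ($id, $info) = @_;
    cache()->{$id} = $info;
    return $info;
}

store_account('rob', { plan => 'basic' });
my $info = lookup_account('rob');
print "plan: $info->{plan}, binds: $bind_count\n";
```

With the accessor in place it no longer matters which function reaches the cache first.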

L8r,
Rob

====>%= snip =%<====

use Carp qw(croak);    # bind_accounts() calls croak()
use Apache::Session::Lock::Semaphore ();
use Cache::SizeAwareSharedMemoryCache ();

# this is used in %cache_options, as well as for locking
use constant SIGNATURE => 'EXIT';
use constant MAX_ACCOUNTS => 300;

# use vars qw/%ACCOUNTS/;
use vars qw/$ACCOUNTS $locker/;

my %cache_options = ( namespace => SIGNATURE,
                      default_expires_in =>
		          max_size => MAX_ACCOUNTS );

sub handler {

# ... init code here.  parse $account from the request, and then:

    bind_accounts() unless defined($ACCOUNTS);

# verify (access the cache)
    my $accountinfo = lookup_account($account)
      or $r->log_reason("no such account: $account"), return HTTP_NO_CONTENT;

# ... content here

}


# Bind the account variables to shared memory
sub bind_accounts {
    warn "bind_accounts: Binding shared memory" if $debug;

    $ACCOUNTS =
      Cache::SizeAwareSharedMemoryCache->new( \%cache_options ) or
	croak( "Couldn't instantiate SizeAwareSharedMemoryCache : $!" );

    # Shut up Apache::Session::Lock::Semaphore
    $ACCOUNTS->{data}->{_session_id} = join '', SIGNATURE, @INC;

    $locker = Apache::Session::Lock::Semaphore->new();

    # not quite ready to trust this yet. :-)
    # We'll keep it separate for now.
    #
    #$ACCOUNTS->set('locker', $locker);

    warn "bind_accounts: done" if $debug;
}

### DEPRECATED!  Cache::Cache does this FOR us!
# bring the current session to the front and
# get rid of any that haven't been used recently
sub expire_old_accounts {

    ### DEPRECATED!
    return;

    my $id = shift;
    warn "expire_old_accounts: entered\n" if $debug;

    $locker->acquire_write_lock($ACCOUNTS);
    #tied(%ACCOUNTS)->shlock;
    my @accounts = grep( $id ne $_, @{$ACCOUNTS->get('QUEUE') || []} );
    unshift @accounts, $id;
    if (@accounts > MAX_ACCOUNTS) {
	my $to_delete = pop @accounts;
	$ACCOUNTS->remove($to_delete);
    }
    $ACCOUNTS->set('QUEUE', \@accounts);
    $locker->release_write_lock($ACCOUNTS);
    #tied(%ACCOUNTS)->shunlock;

    warn "expire_old_accounts: done\n" if $debug;
}

sub lookup_account {
    my $id = shift;

    warn "lookup_account: begin" if $debug;
    expire_old_accounts($id);

    warn "lookup_account: Accessing \$ACCOUNTS{$id}" if $debug;

    my $s = $ACCOUNTS->get($id);

    if (defined $s) {
	# SUCCESSFUL CACHE HIT
	warn "lookup_account: Retrieved accountinfo from Cache (bypassing SQL)"
	  if $debug;
	return $s;
    }

    ## NOT IN CACHE... refreshing.

    warn "lookup_account: preparing SQL" if $debug;

	# ... do some SQL here.  Assign results to $s

    $locker->acquire_write_lock($ACCOUNTS);
    # tied(%ACCOUNTS)->shlock;

    warn "lookup_account: assigning \$s to shared mem" if $debug;
    $ACCOUNTS->set($id, $s);

    $locker->release_write_lock($ACCOUNTS);
    # tied(%ACCOUNTS)->shunlock;

    return $s;

}

====>%= snip =%<====

===

To: "Rob Bloodgood" <robb@empire2.com>
From: "Perrin Harkins" <perrin@elem.com>
Subject: Re: Shared memory caching revisited (was "it's
supposed to SHARE it, not make more!")
Date: Tue, 4 Sep 2001 15:37:56 -0400

> Once upon a time, not many moons ago, the issue of Cache::Cache came up
> with the SharedMemory Cache and the fact that it has NO locking semantics.

It does atomic updates.  Do you really need more than that?  The thread you
got this from was referring to checking out a piece of data, making a bunch
of changes on it, and preventing anyone else from reading it (at least for
update purposes) until the changes are checked in.  Most apps are fine with
a "last save wins" approach as long as the data is protected from
corruption.

> When
> I found this thread in searching for ways to implement my own locking
> scheme to make up for this lack, I came upon YOUR comments that perhaps
> Apache::Session::Lock::Semaphore could be used, without any of the rest of
> the Apache::Session package.

Or you could use Apache::Session::Lock::File, which is probably easier to
deal with.

> The _session_id is used as the seed for the locking semaphore.  *IF* I
> understood the requirements correctly, the _session_id has to be the same
> FOR EVERY PROCESS in order for the locking to work as desired, for a given
> shared data structure.

Only if you want to lock the whole thing, rather than a single record.
Cache::Cache typically updates just one record at a time, not the whole data
structure, so you should only need to lock that one record.

I had a quick look at your code and it seems redundant with Cache::Cache.
You're using the locking just to ensure safe updates, which is already done
for you.

- Perrin

===

To: "Perrin Harkins" <perrin@elem.com>
From: "Rob Bloodgood" <robb@empire2.com>
Subject: RE: Shared memory caching revisited (was "it's
supposed to SHARE it, not make more!")
Date: Tue, 4 Sep 2001 12:55:28 -0700

> > The _session_id is used as the seed for the locking semaphore.
> > *IF* I understood the requirements correctly, the _session_id has
> > to be the same FOR EVERY PROCESS in order for the locking to work
> > as desired, for a given shared data structure.
>
> Only if you want to lock the whole thing, rather than a single
> record.  Cache::Cache typically updates just one record at a time,
> not the whole data structure, so you should only need to lock that
> one record.

Uhh... good point, except that I don't trust the Cache code.  The AUTHOR
isn't ready to put his stamp of approval on the locking/updating.  I'm
running 10 hits/sec on this server, and "last write wins," which ELIMINATES
other writes, is not acceptable.

> I had a quick look at your code and it seems redundant with
> Cache::Cache.  You're using the locking just to ensure safe updates,
> which is already done for you.

Well, for a single, atomic lock, maybe.  My two points above are the why of
my hesitancy.  Additionally, what if I decide to add to my handler?  What if
I update more than one thing at once?  Now I've got the skeleton based on
something that somebody trusts (A:S:L:S), vs what somebody thinks is
alpha/beta (C:SASMC).

In other words....

TIMTOWTDI! :-)

===

To: "Rob Bloodgood" <robb@empire2.com>
From: "Perrin Harkins" <perrin@elem.com>
Subject: Re: Shared memory caching revisited (was "it's
supposed to SHARE it, not make more!")
Date: Tue, 4 Sep 2001 16:29:18 -0400

> Uhh... good point, except that I don't trust the Cache code.  The AUTHOR
> isn't ready to put his stamp of approval on the locking/updating.

That sort of hesitancy is typical of CPAN.  I wouldn't worry about it.  I
think I remember Randal saying he helped a bit with that part.  In my
opinion, there is no good reason to think that the Apache::Session locking
code is in better shape than the Cache::Cache locking, unless you've
personally reviewed the code in both modules.

> I'm
> running 10 hits/sec on this server, and "last write wins," which
> ELIMINATES other writes, is not acceptable.

As far as I can see, that's all that your code is doing.  You're simply
locking when you write, in order to prevent corruption.  You aren't
acquiring an exclusive lock when you read, so anyone could come in between
your read and write and make an update which would get overwritten when you
write, i.e. "last write wins."
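
The interleaving described above can be shown with a deliberately simplified sketch. A plain hash stands in for the shared cache and the two "clients" are just sequential steps in one script, so this only illustrates the ordering problem, not real concurrency.

```perl
use strict;
use warnings;

my %store = (hits => 10);   # plays the part of the shared cache

# Two clients each read the current value first ...
my $seen_by_a = $store{hits};
my $seen_by_b = $store{hits};

# ... then each writes back its own increment. Each write is atomic
# on its own, but B's write is based on a stale read.
$store{hits} = $seen_by_a + 1;
$store{hits} = $seen_by_b + 1;

print "hits after two increments: $store{hits}\n";   # 11, not 12
```

Locking only around the two writes would not change the result, because the stale read has already happened by then.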

You're more than welcome to roll your own solution based on your personal
preferences, but I don't want people to get the wrong idea about
Cache::Cache.  It handles the basic locking needed for safe updates.

===
To: "Perrin Harkins" <perrin@elem.com>
From: "Rob Bloodgood" <robb@empire2.com>
Subject: RE: Shared memory caching revisited (was "it's
supposed to SHARE it, not make more!")
Date: Tue, 4 Sep 2001 14:07:48 -0700

> > Uhh... good point, except that I don't trust the Cache code.  The AUTHOR
> > isn't ready to put his stamp of approval on the locking/updating.
>
> That sort of hesitancy is typical of CPAN.  I wouldn't worry about it.  I
> think I remember Randal saying he helped a bit with that part.  In my
> opinion, there is no good reason to think that the Apache::Session locking
> code is in better shape than the Cache::Cache locking, unless you've
> personally reviewed the code in both modules.

Well, the fact is, I respect your opinion.  And YES, it seems like I'm doing
more work than is probably necessary.  I've been screwed over SO MANY TIMES
by MYSELF not thinking of some little detail, that I've developed a tendency
to design in redundant design redundancy :-) so that if one thing fails, the
other will catch it.  This reduces downtime...

> > I'm running 10 hits/sec on this server, and "last write wins,"
> > which ELIMINATES other writes, is not acceptable.
>
> As far as I can see, that's all that your code is doing.  You're
> simply locking when you write, in order to prevent corruption.  You
> aren't acquiring an exclusive lock when you read, so anyone could
> come in between your read and write and make an update which would
> get overwritten when you write, i.e. "last write wins."

Again, good point... I'm coding as if the WHOLE cache structure will break
if any little thing gets out of line.  I was trying to think in terms of
data safety like one would with threading, because A) I was worried about
whether shared memory was as sensitive to locks/corruption as threading, and
B) I reviewed Apache::Session's lock code, but didn't review Cache::Cache's
(20/20 hindsight, ya know).

> You're more than welcome to roll your own solution based on your
> personal preferences, but I don't want people to get the wrong idea
> about Cache::Cache.  It handles the basic locking needed for safe
> updates.

Then my code just got waaaaaaay simpler, both in terms of data flow and
individual coding sections.  THANK YOU! :-)

===


To: Rob Bloodgood <robb@empire2.com>
From: DeWitt Clinton <dewitt@unto.net>
Subject: Re: Shared memory caching revisited (was "it's
supposed to SHARE it, not make more!")
Date: Tue, 4 Sep 2001 18:35:07 -0400

On Tue, Sep 04, 2001 at 12:14:52PM -0700, Rob Bloodgood wrote:

> ***OH WOW!***  So, DURING the course of composing this message, I've
> realized that the function expire_old_accounts() is now redundant!
> Cache::Cache takes care of that, both with expires_in and max_size.  I'm
> leaving it in for reference, just to show how it's improved. :-)
>
> [snip]
> 
> use Apache::Session::Lock::Semaphore ();
> use Cache::SizeAwareSharedMemoryCache ();
> 
> # this is used in %cache_options, as well as for locking
> use constant SIGNATURE => 'EXIT';
> use constant MAX_ACCOUNTS => 300;
> 
> # use vars qw/%ACCOUNTS/;
> use vars qw/$ACCOUNTS $locker/;
> 
> my %cache_options = ( namespace => SIGNATURE,
>                       default_expires_in =>
> 		          max_size => MAX_ACCOUNTS );


Very neat thought about how to use max_size to limit the
accounts!

Unfortunately, you demonstrated that I did a *terrible* job at
documenting what "size" means.

It means size in bytes, not items.

I will add max_items and limit_items to the TODO list.  In the
meantime, I will improve the documentation.
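
Until something like max_items exists, a limit counted in entries has to be kept by the caller. The following is a minimal sketch of that idea, modeled on the QUEUE logic in expire_old_accounts(); set_limited() and MAX_ITEMS are hypothetical names and a plain hash stands in for the cache, so none of this is Cache::Cache API.

```perl
use strict;
use warnings;

use constant MAX_ITEMS => 3;   # assumed limit, counted in entries

my %cache;   # key => value
my @queue;   # keys, most recently stored first

# Hypothetical helper: store a value, then evict from the tail of
# the queue until at most MAX_ITEMS entries remain.
sub set_limited {
    my ($key, $value) = @_;
    @queue = ($key, grep { $_ ne $key } @queue);
    $cache{$key} = $value;
    while (@queue > MAX_ITEMS) {
        my $oldest = pop @queue;
        delete $cache{$oldest};
    }
    return;
}

set_limited($_, "info for $_") for qw(a b c d);
print join(',', sort keys %cache), "\n";   # b,c,d
```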

===

To: "Perrin Harkins" <perrin@elem.com>
From: merlyn@stonehenge.com (Randal L. Schwartz)
Subject: Re: Shared memory caching revisited (was "it's
supposed to SHARE it, not make more!")
Date: 04 Sep 2001 16:13:48 -0700

>>>>> "Perrin" == Perrin Harkins <perrin@elem.com> writes:

>> Uhh... good point, except that I don't trust the Cache code.  The
>> AUTHOR isn't ready to put his stamp of approval on the
>> locking/updating.

Perrin> That sort of hesitancy is typical of CPAN.  I wouldn't worry
Perrin> about it.  I think I remember Randal saying he helped a bit
Perrin> with that part.

I helped with the code that ensures that *file* writes are atomic
updates.  I taught DeWitt the trick of writing to a temp file, then
renaming when ready, so that any readers see only the old file or the
new file, but never a partially written file.
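
A minimal sketch of the temp-file-then-rename trick, using only core modules. atomic_write() is a hypothetical helper written for illustration; it is not the code that ships in Cache::Cache.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile tempdir);
use File::Basename qw(dirname);

# Write $data to $path so readers see either the old or the new file.
sub atomic_write {
    my ($path, $data) = @_;

    # The temp file must live on the same filesystem as the target,
    # otherwise rename() cannot be atomic; use the same directory.
    my ($fh, $tmp) = tempfile('.tmpXXXXXX', DIR => dirname($path));
    binmode $fh;
    print {$fh} $data or die "write to $tmp failed: $!";
    close $fh         or die "close of $tmp failed: $!";

    rename $tmp, $path or die "rename $tmp -> $path failed: $!";
    return;
}

my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/account.cache";
atomic_write($file, "first version\n");
atomic_write($file, "second version\n");

open my $in, '<', $file or die "open $file failed: $!";
my $content = do { local $/; <$in> };
close $in;
print $content;
```

This protects readers from half-written files, but two writers can still overwrite each other's changes.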

I don't think Cache::Cache has enough logic for an "atomic
read-modify-write" in any of its modes to implement (for example) a
web hit counter.  It has only "atomic write".  The "last write wins"
strategy is fine for caching, but not for transacting, so I can see
why Rob is a bit puzzled.

It'd be nice if we could build a generic "atomic read-modify-write",
but now we're back to Apache::Session, which in spite of its name
works fine away from Apache. :)
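
For comparison, here is one way to get an atomic read-modify-write for the hit-counter case with plain flock() on a file. It is a sketch under the assumption that all writers go through the same function; bump_counter() is a hypothetical name and has nothing to do with Cache::Cache or Apache::Session.

```perl
use strict;
use warnings;
use Fcntl qw(:DEFAULT :flock SEEK_SET);
use File::Temp qw(tempdir);

# Increment the counter stored in $path and return the new value.
# The exclusive lock is held across the read AND the write.
sub bump_counter {
    my ($path) = @_;

    sysopen(my $fh, $path, O_RDWR | O_CREAT)
      or die "open $path failed: $!";
    flock($fh, LOCK_EX) or die "lock $path failed: $!";

    my $line  = <$fh>;
    my $count = (defined $line && $line =~ /^(\d+)/) ? $1 : 0;
    $count++;

    seek($fh, 0, SEEK_SET) or die "seek failed: $!";
    truncate($fh, 0)       or die "truncate failed: $!";
    print {$fh} "$count\n" or die "write failed: $!";
    close $fh              or die "close failed: $!";   # releases the lock
    return $count;
}

my $dir  = tempdir(CLEANUP => 1);
my $last = 0;
$last = bump_counter("$dir/hits") for 1 .. 3;
print "hits: $last\n";
```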

Caching.  An area of interest of mine, but I still don't seem to get
around to really writing the framework I want, so all I can do is keep
lobbing grenades into the parts I don't want. :) :) Sorry guys. :)

===

To: "Randal L. Schwartz" <merlyn@stonehenge.com>
From: "Perrin Harkins" <perrin@elem.com>
Subject: Re: Shared memory caching revisited (was "it's
supposed to SHARE it, not make more!")
Date: Tue, 4 Sep 2001 19:29:57 -0400

> I don't think Cache::Cache has enough logic for an "atomic
> read-modify-write" in any of its modes to implement (for example) a
> web hit counter.  It has only "atomic write".  The "last write wins"
> strategy is fine for caching, but not for transacting, so I can see
> why Rob is a bit puzzled.

In his example code he was only doing atomic writes as well, so it should
work at least as well for his app as what he had before.

> It'd be nice if we could build a generic "atomic read-modify-write"

Maybe a get_for_update() method is what's needed.  It would block any other
process from doing a set() or a get_for_update() until the set() for that
key has completed.  It's still just advisory locking though, so if you
forget and use a regular get() for some data you later plan to set(), you
will not be getting atomic read-modify-write.  Maybe get() could be re-named
to get_read_only(), or set a flag that prevents saving the fetched data.
Most caching apps are happy with "last save wins" though, so I guess
anything like that would need to be optional.
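
To make the idea concrete, here is a rough sketch of what a get_for_update() could look like on top of a file-per-key store. Everything in it is an assumption for illustration: LockingStore is a hypothetical class, keys are assumed to be safe as file names, values are plain strings, and the locking is advisory flock() only.

```perl
use strict;
use warnings;

package LockingStore;   # hypothetical, not part of Cache::Cache
use Fcntl qw(:flock);

sub new {
    my ($class, $dir) = @_;
    return bless { dir => $dir, held => {} }, $class;
}

sub _path {
    my ($self, $key, $ext) = @_;
    return "$self->{dir}/$key.$ext";
}

# Take an exclusive advisory lock on the key's lock file.
sub _lock {
    my ($self, $key) = @_;
    open my $fh, '>>', $self->_path($key, 'lock') or die "lock file: $!";
    flock($fh, LOCK_EX) or die "flock: $!";
    return $fh;
}

# Plain read: takes no lock, so it gives no update guarantee.
sub get {
    my ($self, $key) = @_;
    open my $fh, '<', $self->_path($key, 'val') or return;
    my $value = do { local $/; <$fh> };
    close $fh;
    return $value;
}

# Read the value and keep the lock until the matching set().
sub get_for_update {
    my ($self, $key) = @_;
    $self->{held}{$key} ||= $self->_lock($key);
    return $self->get($key);
}

# Write under the lock (reusing one from get_for_update if held),
# then release it.
sub set {
    my ($self, $key, $value) = @_;
    my $lock = delete($self->{held}{$key}) || $self->_lock($key);
    open my $fh, '>', $self->_path($key, 'val') or die "write: $!";
    print {$fh} $value or die "print: $!";
    close $fh or die "close: $!";
    close $lock;   # closing the handle drops the flock
    return $value;
}

package main;
use File::Temp qw(tempdir);

my $store = LockingStore->new(tempdir(CLEANUP => 1));
$store->set(hits => 0);
for (1 .. 3) {
    my $hits = $store->get_for_update('hits');
    $store->set(hits => $hits + 1);
}
print "hits: ", $store->get('hits'), "\n";
```

As noted above, a caller that uses plain get() before set() bypasses the lock, so the guarantee holds only when every updater uses get_for_update().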

===
To: merlyn@stonehenge.com (Randal L. Schwartz),
From: Christian Jaeger <christian.jaeger@sl.ethz.ch>
Subject: Re: Shared memory caching revisited (was "it's
supposed to SHARE
Date: Wed, 5 Sep 2001 02:15:54 +0100

At 16:13 Uhr -0700 4.9.2001, Randal L. Schwartz wrote:
>I don't think Cache::Cache has enough logic for an "atomic
>read-modify-write" in any of its modes to implement (for example) a
>web hit counter.  It has only "atomic write".  The "last write wins"
>strategy is fine for caching, but not for transacting, so I can see
>why Rob is a bit puzzled.
>
>It'd be nice if we could build a generic "atomic read-modify-write",
>but now we're back to Apache::Session, which in spite of its name
>works fine away from Apache. :)
>
>Caching.  An area of interest of mine, but I still don't seem to get
>around to really writing the framework I want, so all I can do is keep
>lobbing grenades into the parts I don't want. :) :) Sorry guys. :)


What about my IPC::FsSharevars? I've once mentioned it on this list, 
but I don't have the time to read all list mail, so maybe I've missed 
some conclusions following the discussion from last time.

I've still not used it under heavy traffic, but it's supposed to 
offer transaction safety, while allowing concurrent access to 
different variables from even the same session (it locks each 
variable independently, and allows both shared and exclusive locks).

Still, it has been written for my fastcgi framework, not mod_perl, 
but should work under the latter too I think (you'll have to load it 
in a startup.pl script since it registers the pid of the parent 
process so it can send signals to the whole process group (should be 
easy to change to use the ppid of the process instead)).

   http://www.eile.ethz.ch/download/

Write me if you have any issues with it (and I'm eager to hear about 
success stories =)).

===

To: "'Christian Jaeger '" <christian.jaeger@sl.ethz.ch>
From: Geoffrey Young <gyoung@laserlink.net>
Subject: RE: Shared memory caching revisited (was "it's
supposed to SHARE 
Date: Tue, 4 Sep 2001 20:37:16 -0400 

> What about my IPC::FsSharevars? I've once mentioned it on this list, 
> but I don't have the time to read all list mail, so maybe I've missed 
> some conclusions following the discussion from last time.

I remember the post and went to find IPC::FsSharevars a while ago and was
un-intrigued when I didn't find it on CPAN.  has there been any feedback
from the normal perl module forums?

===

To: Geoffrey Young <gyoung@laserlink.net>
From: Christian Jaeger <christian.jaeger@sl.ethz.ch>
Subject: RE: Shared memory caching revisited (was "it's
supposed to SHARE 
Date: Wed, 5 Sep 2001 02:51:17 +0100

At 20:37 Uhr -0400 4.9.2001, Geoffrey Young wrote:
>I remember the post and went to find IPC::FsSharevars a while ago and was
>un-intrigued when I didn't find it on CPAN.  has there been any feedback
>from the normal perl module forums?

I haven't announced it on other forums (yet). (I think it's more of a 
working version yet that needs feedback and some work to make it 
generally useable (i.e. under mod_perl). Which forum should I post 
on?)

===

To: Christian Jaeger <christian.jaeger@sl.ethz.ch>
From: merlyn@stonehenge.com (Randal L. Schwartz)
Subject: Re: Shared memory caching revisited (was "it's
supposed to SHARE  it, not make more!")
Date: 04 Sep 2001 23:18:20 -0700

>>>>> "Christian" == Christian Jaeger <christian.jaeger@sl.ethz.ch> writes:

Christian> I haven't announced it on other forums (yet). (I think it's
Christian> more of a working version yet that needs feedback and some
Christian> work to make it generally useable (i.e. under
Christian> mod_perl). Which forum should I post on?)

If you put it on the CPAN with a version number below 1, that's
usually a clue that it's still alpha or beta.  Then you can announce
it through the normal module announcement structures.

If you hide it, I'm sure not installing it.

===


doom@kzsu.stanford.edu