modperl-massive_website_tips_and_tricks_early_planning_stages

This is part of The Pile, a partial archive of some open source mailing lists and newsgroups.

To: modperl list <modperl@apache.org>,
From: Viljo Marrandi <vilts@bbz.ee>
Subject: Tips & tricks needed :)
Date: Wed, 19 Dec 2001 10:50:49 +0200

Hello,

We're going to build a web site for an insurance company (err, more like
a portal for several companies) and the problem is that (I think) it's
going to be the biggest and most complex site we've ever done AND we're
going to use some new stuff we've never used before. So I'd be very happy
if you could give me some pointers on what to look at, what the real
no-no's are and what the go-go's are.

1. We're going to switch from mysql to postgresql, because we need
transactions, triggers and all other stuff that mysql doesn't support.
What could be possible problems going from mysql to postgres, if any?

2. We will use Template-Toolkit and Apache/mod_perl. The problem is that 2
out of 3 people have never used TT or programmed mod_perl and OO Perl.
Only I have built sites this way; they've used Embperl until now. How can I
make this switch a little easier for them? I know I must spend a lot of
time teaching them, but maybe there are some kind of switchover tutorials
or something?

3. Authorization. Is cookie-based auth the most reasonable choice, or are
there other ways too? .htaccess will not do, I think, because all data is in
the same directory and authorized access/login is needed only on some
parts of the site. Which data should I send with the cookie? Only some random
key which is also stored in the database and is used to find the real data
there? (I guess I must reread that thread about cookies.)

4. What is the most reasonable way to store (and use) complex formulas and
coefficients? The problem is that there are 4 companies and each of them has
a different way to calculate the same thing, e.g. travel insurance, car
insurance etc. Unfortunately they are all quite different, because every
company even uses different inputs to calculate the final result. So if we
use a different formula for every company and insurance type we end up
with ~50 formulas and afterwards nobody understands which is which. Are
there any guidelines for generalizing formulas? OK, let's say we
somehow make these formulas general enough to use, but where should the
calculation take place? In Postgres stored procs, in Perl code/modules (I
think this), or even in TT? Constants will be in the db.

5. Any other things to look out for when creating a large site and/or running
it over SSL and/or using the configuration described above?

P.S. I hope that in a few months I can write this project up for the
success stories ;-)

Thanks for your attention,
Viljo

===
To: Viljo Marrandi <vilts@bbz.ee>
From: Jorge Godoy <godoy@conectiva.com>
Subject: Re: Tips & tricks needed :)
Date: Wed, 19 Dec 2001 09:19:05 -0200

 
Viljo Marrandi <vilts@bbz.ee> writes:


I'm answering what I can... :-)

> 3. Authorization. Is cookie-based auth the most reasonable choice, or are
> there other ways too? .htaccess will not do, I think, because all data is in
> the same directory and authorized access/login is needed only on some
> parts of the site. Which data should I send with the cookie? Only some random
> key which is also stored in the database and is used to find the real data
> there? (I guess I must reread that thread about cookies.)

First of all, why put everything in the same place? It would be
easier to maintain things if only the generic things are in the same
place and specific things are in their own directories.

Depending on how much data you're going to store, I'd use a cookie
holding a hash (e.g. MD5) that keys into some index in the database.
Don't send confidential information in it (people might use a public
internet terminal to view their information) and try not to send plain
text (people might be tempted to change values).
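
A minimal sketch of the opaque-key idea above, using only core Perl. The
%session_table hash is a hypothetical stand-in for a database table, and
the function names and field names are invented for illustration. The key
is built from time, PID and rand() purely as a placeholder; a real site
should take its randomness from a cryptographically strong source.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Hypothetical in-memory stand-in for a sessions table in the database.
my %session_table;

# Create an opaque key for the cookie; the real data stays on the server.
sub create_session {
    my (%data) = @_;
    # Assumption: placeholder entropy only, not suitable for production.
    my $key = md5_hex(join '|', time, $$, rand(), rand());
    $session_table{$key} = { %data };
    return $key;
}

# Map a cookie value back to the stored record, or undef if it is unknown.
sub lookup_session {
    my ($key) = @_;
    return unless defined $key && $key =~ /\A[0-9a-f]{32}\z/;
    return $session_table{$key};
}

my $cookie_value = create_session(user_id => 42, company => 'acme');
my $session      = lookup_session($cookie_value);
print "user_id: $session->{user_id}\n";
```

Because the cookie carries nothing but the key, a user who edits it simply
ends up with no session at all.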

> 4. What is the most reasonable way to store (and use) complex formulas and
> coefficients? The problem is that there are 4 companies and each of them has
> a different way to calculate the same thing, e.g. travel insurance, car
> insurance etc. Unfortunately they are all quite different, because every
> company even uses different inputs to calculate the final result. So if we
> use a different formula for every company and insurance type we end up
> with ~50 formulas and afterwards nobody understands which is which. Are
> there any guidelines for generalizing formulas? OK, let's say we
> somehow make these formulas general enough to use, but where should the
> calculation take place? In Postgres stored procs, in Perl code/modules (I
> think this), or even in TT? Constants will be in the db.

Create a module for each company. This way you'll have each company's
functions in its own module, differentiated by namespace. (And you can
use references to select the appropriate module; the information on
which company a client belongs to might be in a record in the
database ;-))

By using references, the only thing you'll have to worry about is giving
the same things the same names. You should also pass values by reference.

Using modules will also make it possible to change formulae without
worrying about which of them are shared with other companies and which
are not.

And, since you're already going to use OO Perl... ;-)
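
One way to read the suggestion above is a dispatch table of code
references, keyed by the company code stored with the client record. This
is only a sketch: the package names, the company codes and the rates are
all invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical company modules; in a real project each would be its own file.
package Insurance::CompanyA;
sub travel_premium { my ($args) = @_; return $args->{days} * 2.5 }

package Insurance::CompanyB;
sub travel_premium { my ($args) = @_; return 10 + $args->{days} * 1.75 }

package main;

# Same function name in every namespace, selected through a reference.
my %travel_premium_for = (
    A => \&Insurance::CompanyA::travel_premium,
    B => \&Insurance::CompanyB::travel_premium,
);

sub quote_travel {
    my ($company_code, $args) = @_;    # $args is passed by reference
    my $formula = $travel_premium_for{$company_code}
        or die "Unknown company code: $company_code";
    return $formula->($args);
}

print quote_travel('A', { days => 10 }), "\n";
print quote_travel('B', { days => 10 }), "\n";
```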


===

To: Viljo Marrandi <vilts@bbz.ee>
From: wsheldah@lexmark.com
Subject: Re: Tips & tricks needed :)
Date: Wed, 19 Dec 2001 09:00:29 -0500

1. Regarding the switch to postgresql, I think that's a good
choice. Just pay attention to postgresql's data types, and
try to get your field types and lengths correct the first
time if possible. It doesn't completely support the ALTER
TABLE command, so changing column types can be a pain,
although it's still possible. The other thing is that SQL
syntax might be slightly different in a few cases, though
it's been too long since I used MySQL to remember any
examples. Postgresql's web site has some tips for switching,
I think at http://techdocs.postgresql.org.

2.  Have them read some articles on the whole MVC approach,
since it sounds like you'll be using that. And of course
read Damian's book several times for OO perl.

4. You might put the formulas in a perl superclass, with one
method per formula.  Then create a subclass for each
different company that has that company's algorithm. All the
calling code has to worry about is which company it's
dealing with when it instantiates the object; after that all
the right formulas will get used automatically. This should
make it easy to add more companies, too. I guess the general
principle is that when you're faced with tons of complexity,
try breaking it down into smaller pieces and add an
abstraction layer or two, so you and the program can deal
with it.
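
A minimal sketch of that superclass/subclass layout. The class names,
method names and numbers are hypothetical, chosen only to show one company
overriding a coefficient and another replacing a whole formula.

```perl
use strict;
use warnings;

# Base class: one method per formula, plus default coefficients.
package Insurer;
sub new        { my ($class, %args) = @_; return bless { %args }, $class }
sub base_rate  { return 100 }
sub car_factor { return 0.02 }
sub car_premium {
    my ($self, $car_value) = @_;
    return $self->base_rate + $car_value * $self->car_factor;
}

# Company A keeps the shared formula but uses its own coefficient.
package Insurer::CompanyA;
our @ISA = ('Insurer');
sub car_factor { return 0.03 }

# Company B replaces the algorithm entirely.
package Insurer::CompanyB;
our @ISA = ('Insurer');
sub car_premium { my ($self, $car_value) = @_; return 150 + $car_value * 0.015 }

package main;

# The caller only chooses the class; the right formula follows automatically.
for my $class (qw(Insurer::CompanyA Insurer::CompanyB)) {
    my $insurer = $class->new;
    printf "%s: %.2f\n", $class, $insurer->car_premium(10_000);
}
```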

Hope this helps. I'll be watching for the success story!

 -- Wes Sheldahl
===

To: wsheldah@lexmark.com
From: fliptop <fliptop@peacecomputers.com>
Subject: Re: Tips & tricks needed :)
Date: Wed, 19 Dec 2001 09:25:48 -0500

wsheldah@lexmark.com wrote:

> 
> 1. Regarding the switch to postgresql, I think that's a good choice. Just pay
> attention to postgresql's data types, and try to get your field types and
> lengths correct the first time if possible. It doesn't completely support the
> ALTER TABLE command, so changing column types can be a pain, although it's still
> possible. The other thing is that SQL syntax might be slightly different in a
> few cases, though it's been too long since I used MySQL to remember any
> examples. Postgresql's web site has some tips for switching, I think at
> http://techdocs.postgresql.org.


something i'll add to that - if your new postgresql db will have foreign 
keys, and you previously didn't have any code written to guarantee your 
data's integrity in mysql, then you probably won't be able to import all 
your data without some massaging (unless you're sure your data's 
integrity is ok).  i wholeheartedly second wes' statement that switching 
to postgresql is a good choice.

===
To: modperl list <modperl@apache.org>
From: Jean-Michel Hiver <jhiver@mkdoc.com>
Subject: Re: Tips & tricks needed :)
Date: Wed, 19 Dec 2001 15:04:11 +0000

If you're developing a complex application, you'll probably want to
split it into a horde of specialized modules. A few things to remember:


==
You will probably feel the need to use static variables (i.e. variables
shared by all instances of a given class) at some point. For example,
if you have a singleton object you might have something like this:

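  package Your::Singleton;
  use strict;
  use 5.006;

  # Class-level ("static") variable, shared by every caller in this process.
  our $ETERNAL = undef;

  sub instance
  {
    my $class = shift;
    return $ETERNAL if (defined $ETERNAL);
    $ETERNAL = $class->new (@_);
    return $ETERNAL;
  }
  
  # Placeholder constructor; the real constructor code goes here.
  sub new { my $class = shift; return bless {}, $class; }

  1;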

ALWAYS reinitialize $Your::Singleton::ETERNAL on each query!
mod_perl will *NOT* do it for you.

You might think 'ah yeah but it would be nice if
$Your::Singleton::ETERNAL could be persistent across queries...' which
is sometimes desirable, but remember that if you have multiple instances
of your application running on the same apache,
$Your::Singleton::ETERNAL will be common to ALL of them.
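
The effect can be reproduced outside mod_perl with a loop standing in for
successive requests served by the same process. In this sketch
reset_instance() and handle_request() are hypothetical helpers that are
not part of the code above, and the constructor simply stores its
arguments.

```perl
use strict;
use warnings;

package Your::Singleton;

our $ETERNAL;

sub instance {
    my $class = shift;
    $ETERNAL = $class->new(@_) unless defined $ETERNAL;
    return $ETERNAL;
}

# Hypothetical helper: forget the cached object.
sub reset_instance { undef $ETERNAL; return }

sub new {
    my ($class, %args) = @_;
    return bless { %args }, $class;
}

package main;

# Hypothetical stand-in for a handler that runs once per request.
sub handle_request {
    my (%request) = @_;
    Your::Singleton->reset_instance if $request{reset_first};
    return Your::Singleton->instance(user => $request{user})->{user};
}

print handle_request(user => 'alice'), "\n";                  # alice
print handle_request(user => 'bob'), "\n";                    # alice again: stale object
print handle_request(user => 'bob', reset_first => 1), "\n";  # bob
```

The second call shows the trap: the object built for the first request is
still there, so the new arguments are silently ignored.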


==
Cyclic memory references are dangerous, try to avoid them as much as
possible! Perl's garbage collector fails miserably in the case of
cyclic refs.

If you have cyclic references that keep going out of scope, they will
never be garbage collected and your server might have some trouble :-)
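
When a cycle cannot be avoided, one common remedy (not mentioned in the
message above) is to weaken one of the links with Scalar::Util's weaken().
A small sketch with a hypothetical Node class that counts how many objects
were actually destroyed:

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken);

my $destroyed = 0;

package Node;
sub new     { my ($class, $name) = @_; return bless { name => $name }, $class }
sub DESTROY { $destroyed++ }

package main;

# Build a parent/child pair that point at each other, then let both go out of scope.
sub make_pair {
    my ($use_weaken) = @_;
    my $parent = Node->new('parent');
    my $child  = Node->new('child');
    $parent->{child} = $child;
    $child->{parent} = $parent;                 # this closes the cycle
    weaken($child->{parent}) if $use_weaken;    # back-reference no longer counts
    return;
}

make_pair(0);
print "strong cycle: $destroyed objects destroyed\n";
make_pair(1);
print "weakened cycle: $destroyed objects destroyed\n";
```

The first pair is still in memory when make_pair() returns; the second pair
is freed as soon as it goes out of scope.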


==
Beware of the regular expression /o modifier! The application I'm working
on has a cool feature that relies heavily on regular expressions: automagic
hyperlinking of text / html data when appropriate. I used to use the /o
modifier and got a few nasty surprises (until I discovered the traps
page of the mod_perl guide).
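
A sketch of the kind of surprise /o can cause: the pattern is compiled the
first time it is used and the interpolated variable is never looked at
again, which in a long-lived mod_perl process means for the lifetime of
that process. The function names are hypothetical; qr// is shown as one
alternative that stays in step with the variable.

```perl
use strict;
use warnings;

# With /o the pattern is compiled on first use and $word is never re-read.
sub contains_word_o {
    my ($text, $word) = @_;
    return $text =~ /\b\Q$word\E\b/o ? 1 : 0;
}

# Building the pattern with qr// on each call follows the current $word.
sub contains_word_qr {
    my ($text, $word) = @_;
    my $re = qr/\b\Q$word\E\b/;
    return $text =~ $re ? 1 : 0;
}

print contains_word_o('mod_perl guide', 'guide'), "\n";    # pattern is frozen here
print contains_word_o('apache server', 'apache'), "\n";    # still searching for "guide"
print contains_word_qr('mod_perl guide', 'guide'), "\n";
print contains_word_qr('apache server', 'apache'), "\n";
```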


==
Other than that, more generally speaking:

Always hide class implementation behind method calls! Not so long ago I
tended to write fewer method calls and access object attributes
directly, and now this is my #1 source of maintenance problems
and headaches.

If you think it's too slow then consider that it's better to buy a bigger CPU
than 3 tons of aspirin. Also avoid using package names inside functions
as much as possible, as it tends to screw up inheritance.
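
Both points can be sketched in a few lines. The Policy classes here are
hypothetical: premium() is an accessor that hides the hash key, and clone()
asks the object for its class instead of naming a package, so subclasses
get copies of their own type.

```perl
use strict;
use warnings;

package Policy;

sub new {
    my ($class, %args) = @_;
    my $self = bless {}, $class;
    $self->premium($args{premium});
    return $self;
}

# Accessor: callers never touch $self->{premium} directly.
sub premium {
    my $self = shift;
    $self->{premium} = shift if @_;
    return $self->{premium};
}

# ref($self) instead of a hard-coded "Policy" keeps inheritance intact.
sub clone {
    my $self  = shift;
    my $class = ref $self;
    return $class->new(premium => $self->premium);
}

package Policy::Travel;
our @ISA = ('Policy');

package main;

my $copy = Policy::Travel->new(premium => 120)->clone;
print ref($copy), " with premium ", $copy->premium, "\n";
```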


Finally my biggest piece of advice:

ENFORCE a coding style. ENFORCE using English for variable and function
names and comments (for example, although I'm French I really can't stand
code written with French variable names and comments! The Perl language
uses English keywords after all. Be consistent FFS). ENFORCE
commenting what every single method does.

Having said that I do naturally tend to write awful code that only I can
understand, but at least everything is properly commented :)


===

To: Matt Sergeant <msergeant@startechgroup.co.uk>
From: Tatsuhiko Miyagawa <miyagawa@edge.co.jp>
Subject: Re: Tips & tricks needed :)
Date: Thu, 20 Dec 2001 12:56:47 +0900

On Wed, 19 Dec 2001 16:01:22 -0000
Matt Sergeant <msergeant@startechgroup.co.uk> wrote:

> Actually I was wondering about writing an Apache::Singleton class, that
> works the same as Class::Singleton, but clears the singleton out on each
> request (by using pnotes). Would anyone be interested in that?

Like this? (using register_cleanup instead of pnotes)


package Apache::Singleton;

use strict;
use vars qw($VERSION);
$VERSION = '0.01';

use Apache;

sub instance {
    my $class = shift;

    # get a reference to the _instance variable in the $class package
    no strict 'refs';
    my $instance = "$class\::_instance";

    unless (defined $$instance) {
	$$instance = $class->_new_instance(@_);
	Apache->request->register_cleanup(sub { undef $$instance });
    }

    return $$instance;
}

sub _new_instance {
    bless {}, shift;
}

===

To: "'Tatsuhiko Miyagawa'" <miyagawa@edge.co.jp>
From: Matt Sergeant <msergeant@startechgroup.co.uk>
Subject: RE: Tips & tricks needed :)
Date: Thu, 20 Dec 2001 08:57:32 -0000

> -----Original Message-----
> From: Tatsuhiko Miyagawa [mailto:miyagawa@edge.co.jp]
> 
> On Wed, 19 Dec 2001 16:01:22 -0000
> Matt Sergeant <msergeant@startechgroup.co.uk> wrote:
> 
> > Actually I was wondering about writing an Apache::Singleton 
> class, that
> > works the same as Class::Singleton, but clears the 
> singleton out on each
> > request (by using pnotes). Would anyone be interested in that?
> 
> Like this? (using register_cleanup instead of pnotes)
> 
> 
> package Apache::Singleton;
> 
> use strict;
> use vars qw($VERSION);
> $VERSION = '0.01';
> 
> use Apache;
> 
> sub instance {
>     my $class = shift;
> 
>     # get a reference to the _instance variable in the $class package
>     no strict 'refs';
>     my $instance = "$class\::_instance";
> 
>     unless (defined $$instance) {
> 	$$instance = $class->_new_instance(@_);
> 	Apache->request->register_cleanup(sub { undef $$instance });
>     }
> 
>     return $$instance;
> }
> 
> sub _new_instance {
>     bless {}, shift;
> }

Yeah, just like that. Why don't you wrap it up and stick it on CPAN? Saves
me another module :-)

===

To: templates list <templates@template-toolkit.org>,
From: Mark Fowler <mark@twoshortplanks.com>
Subject: Re: Tips & tricks needed :)
Date: Thu, 20 Dec 2001 10:19:50 +0000 (GMT)

(sorry to break threading but I'm getting this from multiple lists)

> that IE 6 (beta at the time) considered my cookies to be third party
> because I used frame-based domain redirection and by default would not
> accept them.

You need to include a P3P header in your HTTP header that contains a
Compact Policy (CP) - a geek code of what your P3P xml privacy document
contains.  See http://www.w3c.org/P3P/.

Some research I did seems to indicate that current implementations of IE6
will accept cookies no matter what CP you use (rather than checking it
against your security settings and deciding if the CP represents a
privacy policy that violates your chosen level of disclosure).  I'd really
appreciate it if other people could check this and confirm that IE6 is not
offering any actual privacy level protection and is just discriminating
against people that don't have P3P headers.

My (Profero's) module for automagically converting a P3P document (the 
xml) into a CP (the geek-code version of that xml document) is in beta
here:

http://twoshortplanks.com/temp/P3P-ToCP-0.02.tar.gz

Please test, break and get back to me when it doesn't work.  It just 
follows the spec and uses XML::XPath to pull the stuff out.

Later

Mark.

===

To: Mark Fowler <mark@twoshortplanks.com>
From: Mark Maunder <mark@swiftcamel.com>
Subject: Re: [OT] Tips & tricks needed :)
Date: Thu, 20 Dec 2001 11:04:59 +0000

Mark Fowler wrote:

> I'd really appreciate it if other people could check this and confirm that
> IE6 is not offering any actual privacy level protection and is just
> discriminating against people that don't have P3P headers.
>

I tried a few header combinations before I got IE6 to send cookies in frames
where one frame is an external site, so it is parsing the header, not just
requiring its existence. I'm not sure if it actually looks at a user's settings
to determine if the policy is acceptable based on user prefs.

===

To: Mark Fowler <mark@twoshortplanks.com>
From: Igor Sysoev <is@rambler-co.ru>
Subject: Re: Tips & tricks needed :)
Date: Thu, 20 Dec 2001 16:16:27 +0300 (MSK)

On Thu, 20 Dec 2001, Mark Fowler wrote:

> (sorry to break threading but I'm getting this from multiple lists)
> 
> > that IE 6 (beta at the time) considered my cookies to be third party
> > because I used frame-based domain redirection and by default would not
> > accept them.
> 
> You need to include a P3P header in your HTTP header that contains a
> Compact Policy (CP) - a geek code of what your P3P xml privacy document
> contains.  See http://www.w3c.org/P3P/.
> 
> Some research I did seems to indicate that current implementations of IE6
> will accept cookies no matter what CP you use (rather than checking it
> against your security settings and deciding if the CP represents a
> privacy policy that violates your chosen level of disclosure).  I'd really
> appreciate it if other people could check this and confirm that IE6 is not
> offering any actual privacy level protection and is just discriminating
> against people that don't have P3P headers.

I found that IE6 requires a P3P header at medium and higher security
settings, but the CP content doesn't matter - it simply needs P3P: CP='anything'.
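
For reference, a sketch of what sending such a header looks like in a plain
CGI-style response. p3p_header() is a hypothetical helper, the cookie value
is made up, and the compact-policy tokens are placeholders; the real tokens
should be derived from the site's own P3P policy.

```perl
use strict;
use warnings;

# Build a P3P response header line from compact-policy tokens.
sub p3p_header {
    my (@tokens) = @_;
    return 'P3P: CP="' . join(' ', @tokens) . '"';
}

# Placeholder tokens and cookie value, for illustration only.
my @headers = (
    'Content-Type: text/html',
    p3p_header(qw(NOI DSP COR)),
    'Set-Cookie: session=0123456789abcdef; path=/',
);

print join("\r\n", @headers), "\r\n\r\n";
```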

Igor Sysoev

===

To: "Tatsuhiko Miyagawa" <miyagawa@edge.co.jp>,
From: "Perrin Harkins" <perrin@elem.com>
Subject: Re: Tips & tricks needed :)
Date: Thu, 20 Dec 2001 11:51:30 -0500

> Like this? (using register_cleanup instead of pnotes)

Better to use pnotes.  I started out doing this kind of thing with
register_cleanup and had problems like random segfaults.  I think it was
because other cleanup handlers sometimes needed access to these resources.
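
A sketch of the pnotes approach that runs without Apache. FakeRequest is a
hypothetical stand-in that mimics only the get/set behaviour of the request
object's pnotes() method; under mod_perl the real request object would be
used instead, and the notes are discarded together with the request.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the mod_perl request object; only pnotes() is mimicked.
package FakeRequest;
sub new { return bless { notes => {} }, shift }
sub pnotes {
    my ($self, $key, @value) = @_;
    $self->{notes}{$key} = $value[0] if @value;
    return $self->{notes}{$key};
}

package Printer;

# Keep the instance in the request's notes so it goes away with the request.
sub instance {
    my ($class, $r, @args) = @_;
    my $key      = "singleton:$class";
    my $instance = $r->pnotes($key);
    unless (defined $instance) {
        $instance = $class->_new_instance(@args);
        $r->pnotes($key => $instance);
    }
    return $instance;
}

sub _new_instance { return bless {}, shift }

package main;

my ($request1, $request2) = (FakeRequest->new, FakeRequest->new);
my $first  = Printer->instance($request1);
my $again  = Printer->instance($request1);
my $second = Printer->instance($request2);
print "same request: ", ($first == $again  ? "same object" : "different object"), "\n";
print "new request:  ", ($first == $second ? "same object" : "different object"), "\n";
```

No cleanup handler is needed here, because nothing outlives the request
that holds the notes.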

===

To: "Perrin Harkins" <perrin@elem.com>
From: Tatsuhiko Miyagawa <miyagawa@edge.co.jp>
Subject: Re: Tips & tricks needed :)
Date: Fri, 21 Dec 2001 12:31:49 +0900

On Thu, 20 Dec 2001 11:51:30 -0500
"Perrin Harkins" <perrin@elem.com> wrote:

> > Like this? (using register_cleanup instead of pnotes)
> 
> Better to use pnotes.  I started out doing this kind of thing with
> register_cleanup and had problems like random segfaults.  I think it was
> because other cleanup handlers sometimes needed access to these resources.

I'll take care of it. Thanks for the input.


===
To: "Perrin Harkins" <perrin@elem.com>
From: Tatsuhiko Miyagawa <miyagawa@edge.co.jp>
Subject: [ANNOUNCE] Apache::Singleton 0.02 (Re: Tips & tricks needed :))
Date: Sat, 22 Dec 2001 17:04:11 +0900

On Thu, 20 Dec 2001 11:51:30 -0500
"Perrin Harkins" <perrin@elem.com> wrote:

> > Like this? (using register_cleanup instead of pnotes)
> 
> Better to use pnotes.  I started out doing this kind of thing with
> register_cleanup and had problems like random segfaults.  I think it was
> because other cleanup handlers sometimes needed access to these resources.

Now it uses pnotes(). The todo is to add scope configuration for each
class.

The URL

    http://bulknews.net/lib/archives/Apache-Singleton-0.02.tar.gz

has entered CPAN as

  file: $CPAN/authors/id/M/MI/MIYAGAWA/Apache-Singleton-0.02.tar.gz
  size: 1621 bytes
   md5: 89a86023ea672f571860e91696ff03bb


0.02  Sat Dec 22 16:58:34 JST 2001
	- use pnotes instead of register_cleanup
	  (Thanks to Perrin Harkins <perrin@elem.com>)


===

To: modperl@apache.org
From: Tatsuhiko Miyagawa <miyagawa@edge.co.jp>
Subject: [ANNOUNCE] Apache::Singleton 0.03
Date: Sat, 22 Dec 2001 22:39:03 +0900

On Sat, 22 Dec 2001 17:04:11 +0900
Tatsuhiko Miyagawa <miyagawa@edge.co.jp> wrote:

> Now it uses pnotes(). The todo is to add scope configuration for each
> class.

Added subclasses with own object lifetime configuration.
I myself am just a little dubious about its implementation,
especially for "Server" scope. Any suggestions welcome.

The URL

    http://bulknews.net/lib/archives/Apache-Singleton-0.03.tar.gz

has entered CPAN as

  file: $CPAN/authors/id/M/MI/MIYAGAWA/Apache-Singleton-0.03.tar.gz
  size: 3415 bytes
   md5: ba59d1e0acfd6364b045ba869c6b799c


0.03  Sat Dec 22 22:29:50 JST 2001
	- Added test for multiple classes
	* Added Request, Process, Server subclasses

NAME
    Apache::Singleton - Singleton class for mod_perl

SYNOPSIS
      package Printer;
      use base qw(Apache::Singleton);

      # same: default is per Request
      package Printer::PerRequest;
      use base qw(Apache::Singleton::Request);

      package Printer::PerProcess;
      use base qw(Apache::Singleton::Process);

      package Printer::PerServer;
      use base qw(Apache::Singleton::Server);

DESCRIPTION
    Apache::Singleton works the same as Class::Singleton, but with various
    object lifetimes (scopes). See the Class::Singleton manpage first.

OBJECT LIFETIME
    By inheriting from one of the following subclasses of Apache::Singleton,
    you can change the scope of your object.

    Request
          use base qw(Apache::Singleton::Request);

        One instance for one request. Apache::Singleton will remove the
        instance on each request. Implemented using the mod_perl "pnotes" API.
        This is the default scope, so inheriting from Apache::Singleton
        has the same effect.

    Process
          use base qw(Apache::Singleton::Process);

        One instance for one httpd process. Implemented using a package
        global. Notice this is the same behaviour as Class::Singleton ;)

    Server
          use base qw(Apache::Singleton::Server);

        One instance for one server (across all httpd processes).
        Implemented using Cache::SharedMemoryCache (IPC).

        Note that multiple processes cannot share a blessed reference without
        serialization, so *One instance for one server* is just an idea.
        What it means is one instance per process, and multiple
        instances with shared data across one server. See t/05_server.t in
        this module distribution for exactly what it means.

AUTHOR
    Original idea by Matt Sergeant <matt@sergeant.org> and Perrin Harkins
    <perrin@elem.com>.

    Code by Tatsuhiko Miyagawa <miyagawa@bulknews.net>

    This library is free software; you can redistribute it and/or modify it
    under the same terms as Perl itself.

SEE ALSO
    the Apache::Singleton::Request manpage, the Apache::Singleton::Process
    manpage, the Apache::Singleton::Server manpage, the Class::Singleton
    manpage, the Cache::SharedMemoryCache manpage



===

To: <modperl@apache.org>
From: brian moseley <bcm@maz.org>
Subject: Re: [ANNOUNCE] Apache::Singleton 0.03
Date: Sat, 22 Dec 2001 06:04:28 -0800 (PST)

On Sat, 22 Dec 2001, Tatsuhiko Miyagawa wrote:

>         Note that multiple processes cannot share a blessed reference without
>         serialization, so *One instance for one server* is just an idea.
>         What it means is one instance per process, and multiple
>         instances with shared data across one server. See t/05_server.t in
>         this module distribution for exactly what it means.

hmm.. it looks like for the server-scoped singleton, the
process-cached version will be returned by _get_instance
even if the shared-memory-cached version was modified by
another process since the last time _set_instance was called
in the first process. is this really what you want?
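
The staleness described above can be simulated in a single script. This is
not the module's code: %shared_store is a hypothetical stand-in for the
shared-memory cache, the Worker class plays the role of one httpd child,
and get_instance()/set_instance() only imitate the behaviour as described
in this message.

```perl
use strict;
use warnings;
use Storable ();

# Hypothetical stand-in for the shared-memory cache: it holds serialized data only.
my %shared_store;

package Worker;    # simulates one httpd child process
sub new { my ($class) = @_; return bless { cached => undef }, $class }

# Prefer the process-local copy; fall back to the shared, serialized one.
sub get_instance {
    my ($self) = @_;
    $self->{cached} ||= Storable::thaw($shared_store{printer});
    return $self->{cached};
}

# Update the local copy and publish a serialized copy for other processes.
sub set_instance {
    my ($self, $obj) = @_;
    $self->{cached}        = $obj;
    $shared_store{printer} = Storable::freeze($obj);
    return;
}

package main;

my ($worker1, $worker2) = (Worker->new, Worker->new);
$worker1->set_instance({ jobs => 1 });
$worker2->set_instance({ jobs => 2 });    # another process updates the shared copy

print "worker1 still sees jobs = ", $worker1->get_instance->{jobs}, "\n";
print "a fresh worker sees jobs = ", Worker->new->get_instance->{jobs}, "\n";
```

Whether the stale read matters depends on the application; avoiding it
would mean consulting the shared copy on every access, at the cost of a
thaw each time.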
